Status and Prospects of LHC Experiments Data Acquisition. Niko Neufeld, CERN/PH


Acknowledgements & Disclaimer: I would like to thank Bernd Panzer, Pierre Vande Vyvre, David Francis, John-Erik Sloper, Frans Meijers, Christoph Schwick and of course my friends and colleagues in LHCb for answering my questions and sharing their ideas. Any misrepresentations or misunderstandings are my fault. Any value statements are my personal opinion. LHC DAQ CHEP09 - Niko Neufeld 2

Outline: Readout & Architecture, Online Farms, Run Control & Commissioning, Outlook & Status. LHC DAQ CHEP09 - Niko Neufeld 3

Readout Architectures: CMS DAQ (architecture diagram). LHC DAQ CHEP09 - Niko Neufeld 4

Trigger/DAQ parameters (number of levels; trigger rates; event size; readout bandwidth; HLT output):
ALICE: 4 levels; Pb-Pb 500 Hz, event size 5x10^7 B, readout 25 GB/s, HLT out 1250 MB/s (10^2 events/s); p-p 10^3 Hz, event size 2x10^6 B, HLT out 200 MB/s (10^2 events/s).
ATLAS: 3 levels; LV-1 10^5 Hz, LV-2 3x10^3 Hz; event size 1.5x10^6 B; readout 4.5 GB/s; HLT out 300 MB/s (2x10^2 events/s).
CMS: 2 levels; LV-1 10^5 Hz; event size 10^6 B; readout 100 GB/s; HLT out ~1000 MB/s (10^2 events/s).
LHCb: 2 levels; LV-0 10^6 Hz; event size 3.5x10^4 B; readout 35 GB/s; HLT out 70 MB/s (2x10^3 events/s).
LHC DAQ CHEP09 - Niko Neufeld 5
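As a quick sanity check on these numbers, the event-building bandwidth is roughly the rate at which full events are assembled times the event size; for ATLAS the relevant rate is the Level-2 accept rate because of the RoI-based partial readout. The short sketch below is purely illustrative, using only the approximate figures quoted in the table above.

```python
# Quick sanity check on the table above: event-building bandwidth ~ (rate at
# which full events are built) x (event size).  For ATLAS the relevant rate is
# the Level-2 accept rate (RoI-based partial readout); for the others it is the
# first-level accept rate.  Figures are the approximate table values only.

experiments = {
    # name: (event-building rate [Hz], event size [bytes])
    "ALICE (Pb-Pb)": (500,   5.0e7),
    "ATLAS":         (3e3,   1.5e6),   # LV-2 accept rate
    "CMS":           (1e5,   1.0e6),
    "LHCb":          (1e6,   3.5e4),
}

for name, (rate_hz, size_b) in experiments.items():
    print(f"{name:13s}: {rate_hz * size_b / 1e9:6.1f} GB/s")
# Prints roughly 25, 4.5, 100 and 35 GB/s, matching the table.
```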

Readout Links of LHC Experiments:
ALICE - DDL: optical, 200 MB/s, 400 links; full duplex: controls FE (commands, pedestals, calibration data); receiver card interfaces to PC; flow control: yes.
ATLAS - SLINK: optical, 160 MB/s, 1600 links; receiver card interfaces to PC; flow control: yes.
CMS - SLINK64: LVDS, 200 MB/s (max. 15 m), 500 links; peak throughput 400 MB/s to absorb fluctuations; receiver card interfaces to a commercial NIC (Myrinet); flow control: yes.
LHCb - Glink (GOL): optical, 200 MB/s, 400 links; receiver card interfaces to a custom-built Ethernet NIC (4 x 1 Gbit/s over copper); flow control: no.
LHC DAQ CHEP09 - Niko Neufeld 6

Readout Architecture: yesterday's discussions. Partial vs. full readout: LHCb & CMS read out everything on a first-level trigger; ALICE has an (optional) sparse readout; ATLAS has a partial, on-demand readout (Level-2), seeded by the Region of Interest (RoI) and followed by a full readout. Pull vs. push: push is used by everybody from the front-end (with backpressure, except LHCb); ATLAS & CMS pull in the global event building; ALICE pushes over TCP/IP (implicit rate control); LHCb uses push throughout (with a global backpressure signal and central control of FE buffers). LHC DAQ CHEP09 - Niko Neufeld 7
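To make the push-style event building concrete, here is a minimal, purely illustrative Python sketch (not any experiment's actual code; the source count and payloads are made up): readout units push fragments tagged with an event number, and a builder assembles an event once it has a fragment from every source.

```python
from collections import defaultdict

N_SOURCES = 4  # hypothetical number of readout units

class EventBuilder:
    """Toy push-style event builder: fragments arrive tagged with an event
    number and are assembled once every source has contributed."""

    def __init__(self, n_sources):
        self.n_sources = n_sources
        self.pending = defaultdict(dict)   # event_id -> {source_id: payload}

    def push(self, event_id, source_id, payload):
        frags = self.pending[event_id]
        frags[source_id] = payload
        if len(frags) == self.n_sources:   # event complete
            del self.pending[event_id]
            return b"".join(frags[s] for s in sorted(frags))
        return None                        # still incomplete

builder = EventBuilder(N_SOURCES)
for evt in range(3):                       # three toy events
    for src in range(N_SOURCES):           # each source pushes its fragment
        full_event = builder.push(evt, src, f"evt{evt}-src{src};".encode())
        if full_event:
            print(f"event {evt} built: {len(full_event)} bytes")
```

In a pull-style system the builder would instead request the fragments for a given event number from the readout units; the bookkeeping above stays essentially the same.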

Point-to-point: the demise of buses. All readout is on point-to-point links in Local Area Networks, except the sub-event building in readout-unit PC servers (the last stand of the buses). Many have been called forward, few have been chosen: SCI, Myrinet, ATM, Ethernet (100 Mbit), Ethernet (1000 Mbit), InfiniBand. LHC DAQ CHEP09 - Niko Neufeld 8

Convergence. Readout links of quite similar characteristics: can the new GBT become a universal standard? COTS hardware (as much as possible). LAN technology (actually, all core Ethernet in the LHC DAQs comes from the same vendor). Standard protocols: Ethernet, UDP, (TCP)/IP. Message coalescing: the message rate needs to be controlled (for LHCb this is an issue even for the data packets). LHC DAQ CHEP09 - Niko Neufeld 9
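As an illustration of message coalescing (a sketch only, with made-up sizes and a hypothetical destination, not any experiment's protocol): many small event fragments are packed into one UDP datagram, so the packet rate seen by the NICs and switches stays bounded. Real implementations would also prepend per-fragment headers so the receiver can split the batch again.

```python
import socket

MTU_PAYLOAD = 8500            # assumed jumbo-frame payload budget (illustrative)
DEST = ("127.0.0.1", 45000)   # hypothetical event-builder address

def send_coalesced(fragments, sock):
    """Pack consecutive small fragments into as few datagrams as possible."""
    batch = bytearray()
    for frag in fragments:
        if len(batch) + len(frag) > MTU_PAYLOAD and batch:
            sock.sendto(bytes(batch), DEST)   # flush the current datagram
            batch.clear()
        batch += frag
    if batch:
        sock.sendto(bytes(batch), DEST)

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
fragments = [b"x" * 120 for _ in range(1000)]  # 1000 small fragments
send_coalesced(fragments, sock)                # ~15 datagrams instead of 1000
```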

Heresy. We have seen a lot of similar technologies & ideas: 4 scalable systems. Could all 4 experiments have used the same DAQ system? I think the answer is probably: yes, up to the output of the filter farms, with suitable adaptations. On the other hand, by doing it differently we can learn from each other, and independent teams are the best to cater to the needs of the individual experiments. LHC DAQ CHEP09 - Niko Neufeld 10

A personal hit-list. ALICE: the most versatile, universal link; a comprehensive, easy to install, well documented software stack. ATLAS: the most economical system, using the physics signature (RoI) to read out a huge detector with a relatively small LAN. CMS: the optimum in scalability and elegance. LHCb: the leanest; the minimum number of different components, the lightest protocol. LHC DAQ CHEP09 - Niko Neufeld 11

High Level Trigger Farms: And that, in simple terms, is what we do in the High Level Trigger. LHC DAQ CHEP09 - Niko Neufeld 12

Online Trigger Farms 2009 (ALICE / ATLAS / CMS / LHCb / CERN IT):
# servers: 81 (1) / 837 / 900 / 550 / 5700
# cores: 324 / ~6400 / 7200 / 4400 / ~34600
Total available power: ~2000 kW (2), ~1000 kW, 550 kW; CERN IT: 2.9 MW
Currently used power: ~250 kW, 450 kW (3), ~145 kW; CERN IT: 2.0 MW
Total available cooling power: ~500 / ~820 / 800 (currently) / 525 / 2.9 MW
Total available rack space (U): ~2000 / 2449 / ~3600 / 2200 / n/a
CPU type(s): AMD Opteron / Intel Harpertown (mostly) / Intel Harpertown / Intel Harpertown / mixed (Intel)
(1) 4-U servers with powerful FPGA preprocessor cards (H-RORC); (2) available from transformer; (3) PSU rating
LHC DAQ CHEP09 - Niko Neufeld 13

Technologies. Operating system: Linux (SLC4 and SLC5), 32-bit and 64-bit; standard kernels, no (hard) real-time. (Windows is used only in parts of the detector control systems.) Hardware: PC servers (Intel and AMD), rack-mount and blades; network: core routers and aggregation switches. LHC DAQ CHEP09 - Niko Neufeld 14

Managing Online Farms. How to manage the software: Quattor (CMS & LHCb), RPMs + scripts (ALICE & ATLAS). We all *love* IPMI, in particular if it comes with console redirection! How to monitor the fabric: Lemon, FMC/PVSS, Nagios, ... Run them disk-less (ATLAS, LHCb) or with local OS installation (ALICE, CMS). How to use them during shutdowns: online use only (ALICE, ATLAS, CMS), or use as a Tier-2 (LHCb). LHC DAQ CHEP09 - Niko Neufeld 15
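For illustration, a minimal sketch (hypothetical hostnames and credentials, assuming the standard ipmitool utility is available) of how farm nodes can be queried and their consoles reached over IPMI from a management script:

```python
import subprocess

# Hypothetical BMC naming scheme and credentials -- adapt to the local farm.
NODES = [f"farm-node{i:03d}-ipmi" for i in range(1, 4)]
USER, PASSWORD = "admin", "secret"

def ipmi(host, *args):
    """Run an ipmitool command against one node's BMC over the LAN."""
    cmd = ["ipmitool", "-I", "lanplus", "-H", host, "-U", USER, "-P", PASSWORD, *args]
    return subprocess.run(cmd, capture_output=True, text=True).stdout.strip()

for node in NODES:
    print(node, ipmi(node, "chassis", "power", "status"))

# Serial-over-LAN console redirection for one node would be, interactively:
#   ipmitool -I lanplus -H farm-node001-ipmi -U admin -P secret sol activate
```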

Online farms: old problems & some new solutions. The problems are always the same: power, space & cooling. Space: e.g. twin-mainboard servers (Supermicro) bring 2 x 2 x 4 = 16 cores plus up to 64 GB of memory in 1 U; blades offer typically ~13 cores/U. Power: in-rush currents, harmonic distortions. Cooling: all experiments use heat exchangers mounted to the back of the racks ("rack-cooler doors") instead of room air-conditioning, a co-development of all 4 experiments with support from the CERN PH department. LHC DAQ CHEP09 - Niko Neufeld 16

Networks. Large Ethernet networks: thousands of Gigabit ports and hundreds of 10 Gigabit ports (e.g. ATLAS 200), hundreds of switches. Several separate but (partly) connected networks: experiment technical network, CERN technical network, CERN general purpose network, experiment DAQ network. DAQ networks are of course a critical part of the data flow, hence lots of monitoring: Nagios, (custom applications using) SNMP, PVSS, Spectrum. ALICE and LHCb have dedicated Storage Area Networks (SAN) based on Fibre Channel: 200 FC4 ports (ALICE), 64 FC4 / 8 FC8 (LHCb). LHC DAQ CHEP09 - Niko Neufeld 17
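A minimal sketch of the kind of SNMP-based port monitoring such custom applications perform (hypothetical switch name and community string, assuming the net-snmp snmpget command-line tool; real deployments poll many counters and feed them into Nagios/PVSS):

```python
import subprocess, time

SWITCH = "daq-core-switch-01"   # hypothetical switch hostname
COMMUNITY = "public"            # hypothetical read-only community string
IF_INDEX = 1                    # switch port to watch

def if_in_octets(host, ifindex):
    """Read the 64-bit input-octet counter of one switch port via net-snmp."""
    oid = f"IF-MIB::ifHCInOctets.{ifindex}"
    out = subprocess.run(
        ["snmpget", "-v2c", "-c", COMMUNITY, "-Oqv", host, oid],
        capture_output=True, text=True, check=True).stdout
    return int(out.strip())

# Two samples a few seconds apart give the average input rate on that port.
a = if_in_octets(SWITCH, IF_INDEX)
time.sleep(5)
b = if_in_octets(SWITCH, IF_INDEX)
print(f"port {IF_INDEX}: {(b - a) * 8 / 5 / 1e6:.1f} Mbit/s in")
```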

Problems. Quality of commodity hardware: memory, mainboards, disks, power supplies, switch ports, riser cards. Software stack: firmware issues (in BMCs, switches, routers), OS (e.g. Ethernet device numbering). Hardware obsolescence: PCI-X cards, KVM. Heterogeneity of the hardware: purchasing rules lead to many different vendors and warranty contracts over the years, hence manifold procedures and support contacts. LHC DAQ CHEP09 - Niko Neufeld 18

Gallery: ALICE storage system; ATLAS online network infrastructure; CMS online computing center. LHC DAQ CHEP09 - Niko Neufeld 19

Runcontrol (cartoon: Warner Bros.)

Runcontrol challenges. Start, configure and control O(10000) processes on farms of several 1000 nodes. Configure and monitor O(10000) front-end elements. Fast database access, caching, preloading, parallelization; and all of this 100% reliable! LHC DAQ CHEP09 - Niko Neufeld 21

Runcontrol technologies. Communication: CORBA (ATLAS), HTTP/SOAP (CMS), DIM (LHCb, ALICE). Behaviour & automation: SMI++ (ALICE), CLIPS (ATLAS), RCMS (CMS), SMI++ in PVSS (LHCb, used also in the DCS). Job/process control: based on XDAQ, CORBA, FMC/PVSS (the latter in LHCb, where it also does fabric monitoring). Logging: log4c, log4j, syslog, FMC (again), ... LHC DAQ CHEP09 - Niko Neufeld 22

How fast can we start it? Starting a run here means bringing the DAQ from the Unconfigured state to the Running state. This typically implies: configuring the detector front-ends, loading and/or configuring the trigger processes in the HLT farms, and configuring the L1 trigger.
Warm start time / limited by:
ALICE: ~5 min / detector FE configuration
ATLAS: ~7 min (*) / detector FE configuration
CMS: ~1.5 min (central DAQ) + 2 min / one subdetector
LHCb: ~4 min / one subdetector
All experiments are working hard to reduce this time. These times hold for the good case, i.e. all goes well (Y.M.M.V.). (*) measured 10/09/08
LHC DAQ CHEP09 - Niko Neufeld 23
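The state-machine idea behind this can be sketched as follows; a toy illustration in Python, not SMI++/RCMS/CLIPS code, with generic state and command names taken from the description above:

```python
# Toy run-control finite state machine: a 'configure' command fans out to the
# sub-systems (front-ends, HLT farm, L1 trigger) before 'start' is allowed.
# Purely illustrative -- real systems distribute this over thousands of nodes.

class RunControl:
    TRANSITIONS = {
        ("UNCONFIGURED", "configure"): "CONFIGURED",
        ("CONFIGURED",   "start"):     "RUNNING",
        ("RUNNING",      "stop"):      "CONFIGURED",
        ("CONFIGURED",   "reset"):     "UNCONFIGURED",
    }

    def __init__(self, subsystems):
        self.state = "UNCONFIGURED"
        self.subsystems = subsystems          # e.g. ["FE", "HLT", "L1"]

    def command(self, cmd):
        new_state = self.TRANSITIONS.get((self.state, cmd))
        if new_state is None:
            raise RuntimeError(f"'{cmd}' not allowed in state {self.state}")
        if cmd == "configure":
            for s in self.subsystems:         # in reality done in parallel
                print(f"configuring {s} ...")
        self.state = new_state
        print(f"-> {self.state}")

rc = RunControl(["detector front-ends", "HLT farm", "L1 trigger"])
rc.command("configure")
rc.command("start")
```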

Run Control GUI: main panel of the LHCb run control (PVSS II). LHC DAQ CHEP09 - Niko Neufeld 24

Databases. The online systems use a lot of databases: run database, configuration DB, conditions DB, DBs for logs and logbooks, histogram DB, inventory DB, ... Not to forget the archiving of data collected from PVSS (used by all detector control systems). All experiments run Oracle RAC infrastructures; some use MySQL in addition, and ATLAS uses an object database for its configuration (OKS). Administration of the Oracle DBs is largely outsourced to our good friends in the CERN IT/DM group. Exchange of conditions between offline and online uses Oracle Streams (like the replication to Tier-1s). LHC DAQ CHEP09 - Niko Neufeld 25

Upgrades. Farm upgrades are transparent within the power, space and cooling budgets. Higher L1 rate: the general feeling is that it is too early; first wait for data and see. All systems are scalable and will work a long way up. How much do we lose in the high-pT trigger? Extreme case LHCb: about 50%, so read out the entire detector at collision rate (trigger-free DAQ). Any upgrade in speed beyond the maximal L1 rate will require new front-end electronics and readout links. Upgrade considerations will start from the readout-link and TTC developments (GBT). LHC DAQ CHEP09 - Niko Neufeld 26

Are we ready? 2008: 10000 stable runs, 3 PB of data read out, 350 TB of data recorded, 515 days of data taking. Since 09/12: 400 k files of cosmics, 216 million events, 453 TB. No B-field: ~300 M cosmic events; nominal B-field 3.8 T: ~300 M cosmic events, ~100 TB of raw data. Cosmics since spring 2008: 1138 runs, 2459 files, 469041 events, 3.16 TB. LHC DAQ CHEP09 - Niko Neufeld 27

Status & Summary We are ready LHC DAQ CHEP09 - Niko Neufeld 28

LHC DAQ / Online talks (in-depth coverage of topics in this talk):
[40] Commissioning the ALICE Experiment, P. Vande Vyvre
[3] CMS Data Acquisition System Software, J. Gutleber
[150] The ATLAS Online High Level Trigger Framework: Experience Reusing Offline Software Components in the ATLAS Trigger, W. Wiedenmann
[38] The ALICE Online Data Storage System, R. Divia
[313] The LHCb Run Control, C. Gaspar
[540] SMI++ Object Oriented Framework Used for Automation and Error Recovery in the LHC Experiments, B. Franek (poster)
[138] Dynamic Configuration of the CMS Data Acquisition Cluster, H. Sakulin
[461] The ALICE Online-Offline Framework for the Extraction of Conditions Data, C. Zampolli
[178] The CMS Online Cluster: IT for a Large Data Acquisition and Control Cluster, J. A. Coarasa Perez
[47] Event Reconstruction in the LHCb Online Cluster, A. Puig Navarro
[94] Commissioning of the ATLAS High Level Trigger with Single Beam and Cosmic Rays, A. Di Mattia
LHC DAQ CHEP09 - Niko Neufeld 29