Prototype of the CMS Object Oriented Reconstruction and Analysis Framework for the Beam Test Data

CMS Collaboration, presented by Lucia Silvestris, CERN, Geneva, Switzerland and INFN, Bari, Italy

Abstract. CMS software requirements and computing resources will by far exceed those of any existing high energy physics experiment, not only because of the complexity of the detector and of the physics task, but also because of the size of the collaboration and the long time scale. Therefore, software should be developed keeping in mind not only performance but also modularity, flexibility, maintainability, quality assurance and documentation. Object Orientation has been identified as the enabling technology, since it directly addresses these problems. We report on the development of an Object Oriented Reconstruction and Analysis Framework for the CMS experiment and in particular on a prototype of a complete analysis chain for the CMS test-beam data. The analysis chain consists of three different components: data acquisition, reconstruction and analysis, and interactive analysis tools. In the online part the data, read from the VME, are stored into an Objectivity federated database. Later, using an automatic procedure, the database files are moved from the disk connected to the online computer system to the disks, and eventually tapes, connected to the off-line system. In the reconstruction and analysis step the data are retrieved from the database and an analysis of the detector performance can be carried out. The final step consists of the visualisation of the histograms (produced using the LHC++ HistOO package) and Tags which have been filled during the previous step. HEP-Explorer tools are used for the visualisation. Commercial as well as freely available class libraries have been used to build the functional prototype. A review of results and performance of the prototype, based on the data collected during the summer of 1998, will be presented.
INTRODUCTION

The CMS experiment [1] is one of the four approved LHC experiments. Data taking is scheduled to start in 2005 and will last at least ten years. The CMS software and computing task [2] will be 10-1000 times bigger than in current HEP experiments. Therefore, software should be developed keeping in mind not only performance but also modularity, flexibility, maintainability, quality assurance and documentation.
Object Orientation has been identified as the enabling technology, since it directly addresses these problems. The overall design of the CMS Software Architecture is motivated by the following underlying principles:
- Multiple environments: various software modules must be able to run in a variety of environments as different computing tasks are performed (examples of these environments are level-3 triggering, production reconstruction, program development, and individual analysis);
- Migration between environments: a particular software module may start out being developed for one environment and later be used in other, unforeseen environments as well;
- Distributed code development: the software will be developed by organisationally and geographically dispersed groups of part-time, non-professional programmers; only some portions of the code will be written by computing professionals;
- Flexibility: not all software requirements will be fully known in advance, therefore the software systems must be adaptable without requiring total rewrites;
- Ease of use: the software systems must be easily usable by collaboration physicists who are not computing experts and cannot devote large amounts of time to learning computing techniques.
These requirements on the software architecture result in the following overall structure for the CMS software:
- a customisable application framework for each of the computing environments;
- physics software modules with clearly defined interfaces that can be plugged into the framework;
- a service and utility toolkit that can be used by any of the physics modules.
The framework will provide the program control, module scheduling, and input/output. It will be tailored to the task in hand and will most likely be written by computing professionals (or physicists acting in that role) using disciplined software engineering methods. The physics and utility modules can be plugged into any of the application frameworks at run time.
One can easily choose between different versions of various modules. The physics modules will not communicate with each other directly but only through the data access protocols that are part of the service layer below. The service and utility toolkit will consist of two major categories of services: physics-type services (histogrammers, fitters, physics calculation routines) and computing services (data access, inter-module communication, event input and output, etc.).
CMS TESTBEAM SOFTWARE

The general goals of the CMS test-beam activities are to investigate and verify the functionality, quality, and performance of detector elements. These tests are carried out under conditions as similar to the real situation as possible, with various configurations of magnetic field, orientation, shielding, and incident beam energy and particle type. In addition to tests of individual subsystem modules, sets of several detectors are tested as a pseudo CMS slice to study the global performance of CMS. Test-beams also serve for calibration and for large-scale tests during the construction phase. Currently, there are five test-beams in use by CMS at CERN: two in the North Area (H2 and H4), two in the West Area (X5B and X5C) and one in the East Hall (T9). All test-beams used by CMS so far either originate from previous R&D projects or are used parasitically. The number of user groups varies from one to many, all with different backgrounds, history, equipment, requirements, and experience. This has led to a great diversity of DAQ systems which write the raw data to tape with different methods and data formats. Therefore, we are working towards more coherent and common on-line and off-line analysis environments, based on common data formats, utility programs and simulation frameworks. This will result in enhanced productivity and higher-quality components, and will allow the different user groups more mobility between test-beams. It will be an important test for the CMS DAQ system and for the CMS analysis and reconstruction framework. The complete analysis chain for the CMS test-beam data consists of three different components: data acquisition, reconstruction and analysis, and interactive analysis tools.

CMS TESTBEAM DATA ACQUISITION SOFTWARE

Data coming from the detectors, and corresponding to a particular spill, are read out using front-end digitisers.
The DAQ system then has to assemble the event from many buffers into a single buffer in the processing system of the Event Filter Unit. The CMS TestBeam Event Filter Unit [3] is a multi-threaded application: the first thread, called RawData Server, receives new events from an object scanning the VME memory; the second thread, called Data Store, pulls new events from the RawData Server and stores the event data into the database. The structure of event data may be arbitrarily complex. This requires structured storage, which can be obtained using databases. Furthermore, the data access pattern of user applications is such that often many events of a certain type, or a certain part of many events, will be retrieved. These two types of data access make
relational databases less appropriate, since many time-consuming join operations are required. Object-oriented databases provide better performance for these types of data access, match the programming paradigm more closely, and better support the storage of complex data types. In this implementation of the CMS TestBeam DAQ system, Objectivity/DB version 4.02 is used.

FIGURE 1. CMS TestBeam Event Model.

Fig. 1 shows the event data model that is implemented for the CMS TestBeams:
- Event (persistent class): the entry point to all information about an event;
- RawEvent (persistent class): an index of all RawData objects;
- DetUnit (persistent class): represents an elementary part of a detector; it is responsible for creating (online) and retrieving (off-line) the corresponding RawData from the event structure;
- RawData (persistent class): DetUnit is defined as a part of RawData.
This means that we have an instance of RawData only if an instance of DetUnit exists;
- Reconstructed Hit (transient class): performs some preliminary reconstruction to transform the RawData into reconstructed data;
- DetectorGeom (persistent class): encapsulates the geometrical properties of the different DetUnits.
The CMS TestBeam DAQ system stores the data into an Objectivity federated database. Later, using an automatic procedure called Central Data Recording (CDR), the database files are moved from the disks connected to the on-line computer system to the disks, and eventually tapes, connected to the off-line system. One of the major advantages of CDR is that it obviates the need for tape drives
and robotics to be installed with the DAQ system, and thus the need for operation and management of such devices at the test-beam site. Using this implementation of the CMS TestBeam DAQ system, data were acquired in three different test-beam areas:
- X5B test-beam. 1997: the DAQ was active for the October test-beam; more than 40 GB were stored in about 20 database files. 1998: the DAQ was active from 5 to 10 June 1998; more than 22 GB were acquired and stored in 79 database files.
- H2 test-beam. 1997: the DAQ was active from August 6 to September 29, 1997; more than 60 GB were acquired and stored in about 250 database files. 1998: the DAQ was active from 10 July to 10 September 1998; more than 30 GB were acquired and stored in 300 database files.
- T9 test-beam. 1998: the DAQ was active from 2 to 8 October 1998; more than 10 GB were acquired and stored in about 100 database files.

CMS TESTBEAM RECONSTRUCTION AND ANALYSIS FRAMEWORK

The CMS Analysis and Reconstruction Framework has the following components (Fig. 2):
- Persistent storage manager: the data are persistently stored; both event and non-event data have to be managed. It will definitely be an application based on an ODBMS, and it will incorporate several distinct mechanisms to efficiently retrieve the required objects in response to a user query.
- Reconstruction framework: the system which steers reconstruction according to the user request and the availability of already stored and/or still valid objects. It is intimately connected to the persistent storage manager, being its major client.
- Analysis framework: uses the reconstruction framework to obtain the required reconstruction objects.
- Event visualisation framework: uses the reconstruction framework to visualise the required reconstruction objects.
- Statistical analysis tools: mainly histogramming tools and the corresponding graphical packages.
FIGURE 2. Components of the CMS Reconstruction and Analysis Framework (reconstruction algorithms, event filters, event objects, physics analysis, calibration objects, visualisation system; common tools: LHC++, ODBMS, CLHEP, STL, (G)UI).

- Fitting tools: such as MINUIT and related routines.
- Presentation tools: whatever is required to represent physics results.
An important mechanism of the CMS Analysis and Reconstruction Framework is the notification mechanism. Its responsibility is to dispatch events to the different observers, i.e. to the different analysis classes. Using this mechanism, the user analysis classes inherit from abstract classes and no changes to the analysis framework are required. Fig. 3 shows the classes collaborating in this mechanism and their relationships:
- Dispatcher: knows its observers; any number of observer objects may observe a dispatcher. It provides an interface for attaching and detaching observer objects.
- Observer: defines an updating interface for objects that should be notified of changes in a Dispatcher. The Dispatcher-Observer relationship implements the well-known Observer pattern [4].
- DTBE: a dispatcher of TestBeam events.
- OTBE: an observer of TestBeam events. The user analysis classes (RawDumper, RawHistos, etc.) can inherit from the OTBE class.
- Analyzer: the generic analysis class that is provided to the user.
- Histogrammer: the class that encapsulates the HistOO package [5].
- SiliAnal: an example of a user analysis class; it uses multiple inheritance from the Analyzer and Histogrammer classes. Implementing this class
the user can carry out the analysis of the test-beam events and can store the results into HistOOgrams [5].

FIGURE 3. Notification mechanism. DTBE = Dispatcher of Test Beam Events; OTBE = Observer of Test Beam Events; OX5R = Observer of X5 Runs; TBEDumper = dump of the Test Beam Events.

Using the same notification mechanism, but with different types of events, such as Geant3 [7] or Geant4 [9] events, the user is able to analyse not only TestBeam events but also simulated events.

CMS TESTBEAM INTERACTIVE ANALYSIS TOOL

The final step consists of the visualisation of the histograms, produced using the LHC++ HistOO package [5], and of the Tags [6] which are produced in the analysis step. HEP-Explorer [8] is used for the visualisation. Fig. 4 shows first results for silicon detectors obtained from the complete CMS TestBeam analysis chain.

CONCLUSIONS

The complexity of the computing tasks of the CMS experiment requires its software to be developed using methods that ensure not only its correctness but also flexibility and ease of use. This is needed to cope with the inevitable changes in requirements and detector configuration in a large collaboration and over a long time scale. In the last years a series of studies and prototypes has shown that Object-Oriented technologies are the best candidate to satisfy those requirements. A CMS Framework for Reconstruction and Analysis, entirely based on Object-Oriented components, has been developed to prove several concepts to be used in the final implementation, such as:
FIGURE 4. X5B TestBeam results.

- the event model;
- the persistent storage manager (ODBMS);
- the notification mechanisms;
- the query mechanisms;
- the interface with the Geant3 and Geant4 simulations.
It has been successfully tested in the H2, X5B and T9 TestBeam data acquisition, simulation and analysis. In 1999, this prototype will evolve into a common software to be used in all CMS TestBeam areas and in the outside laboratories where the detectors will be assembled and tested.

REFERENCES

1. CMS - The Compact Muon Solenoid, Technical Proposal, CERN/LHCC 94-38, LHCC/P1, CERN, 1994.
2. The Compact Muon Solenoid, Computing Technical Proposal, CERN/LHCC 96-45, CERN, 1996.
3. CMS Online Event Filter Farm Software, talk 74, session B, presented at this conference.
4. E. Gamma et al., Design Patterns, Addison-Wesley, Reading, Massachusetts, 1994.
5. http://wwwinfo.cern.ch/asd/lhc++/histoo/toc.html, CERN, 1998.
6. http://wwwinfo.cern.ch/asd/lhc++/hepodbms/reference-manual/index.html, CERN, 1998.
7. http://wwwinfo.cern.ch/asd/geant/index.html, CERN, 1998.
8. http://wwwinfo.cern.ch/asd/lhc++/hepexplorer/index.html, CERN, 1998.
9. http://wwwinfo.cern.ch/asd/geant/geant4.html, CERN, 1998.