Object oriented data analysis in ALEPH

Similar documents
Atlas event data model optimization studies based on the use of segmented VArray in Objectivity/DB.

A Prototype of the CMS Object Oriented Reconstruction and Analysis Framework for the Beam Test Data

Precision Timing in High Pile-Up and Time-Based Vertex Reconstruction

FAMOS: A Dynamically Configurable System for Fast Simulation and Reconstruction for CMS

HepMC 2. a C++ Event Record for Monte Carlo Generators. User Manual Version 2.0 August 18, 2006.

Last updates on TAUOLA. Jakub Zaremba IFJ Kraków

LCIO: A Persistency Framework and Event Data Model for HEP. Steve Aplin, Jan Engels, Frank Gaede, Norman A. Graf, Tony Johnson, Jeremy McCormick

ObjectStore and Objectivity/DB. Application Development Model of Persistence Advanced Features

Short Notes of CS201

ALPHA++ User s manual. ALEPH OO Physics Analysis Package. release 3.4

CS201 - Introduction to Programming Glossary By

Andrea Sciabà CERN, Switzerland

Real-time Analysis with the ALICE High Level Trigger.

Monte Carlo programs

Clustering and Reclustering HEP Data in Object Databases

Tracking and Vertex reconstruction at LHCb for Run II

CDF/DOC/COMP UPG/PUBLIC/4522 AN INTERIM STATUS REPORT FROM THE OBJECTIVITY/DB WORKING GROUP

HLT Hadronic L0 Confirmation Matching VeLo tracks to L0 HCAL objects

Persistency. Author: Youhei Morita

A Scenic tour of C++ Dietrich Liko. Dietrich Liko

First LHCb measurement with data from the LHC Run 2

INTRODUCTION TO THE ANAPHE/LHC++ SOFTWARE SUITE

The COMPASS Event Store in 2002

Monte Carlo Tuning of LUARLW Mode

Adding timing to the VELO

Rekayasa Perangkat Lunak 2 (IN043): Pertemuan 8. Data Management Layer Design

Electron and Photon Reconstruction and Identification with the ATLAS Detector

BUILDING PETABYTE DATABASES WITH OBJECTIVITY/DB

Performance of the ATLAS Inner Detector at the LHC

BCS THE CHARTERED INSTITUTE FOR IT. BCS HIGHER EDUCATION QUALIFICATIONS BCS Level 5 Diploma in IT. Object Oriented Programming

LHCb Computing Strategy

The GAP project: GPU applications for High Level Trigger and Medical Imaging

A Geometrical Modeller for HEP

8.882 LHC Physics. Track Reconstruction and Fitting. [Lecture 8, March 2, 2009] Experimental Methods and Measurements

ATLAS Tracking Detector Upgrade studies using the Fast Simulation Engine

Geant4 simulation in a distributed computing environment

arxiv:hep-ph/ v1 11 Mar 2002

PoS(ACAT08)101. An Overview of the b-tagging Algorithms in the CMS Offline Software. Christophe Saout

LCIO - A persistency framework for linear collider simulation studies

HepMC 2. a C++ Event Record for Monte Carlo Generators. User Manual Version 2.06 May 17, 2010.

Introduction to Programming Using Java (98-388)

ALICE Simulation Architecture

Introduction to C++ Introduction to C++ Dr Alex Martin 2013 Slide 1

Monte Carlo Production on the Grid by the H1 Collaboration

An SQL-based approach to physics analysis

Framework for Collaborative Structural Analysis Software Development. Jun Peng and Kincho H. Law Stanford University

The performance of the ATLAS Inner Detector Trigger Algorithms in pp collisions at the LHC

Tracking and flavour tagging selection in the ATLAS High Level Trigger

Computing in High Energy and Nuclear Physics, March 2003, La Jolla, California

ATLAS NOTE. December 4, ATLAS offline reconstruction timing improvements for run-2. The ATLAS Collaboration. Abstract

PoS(High-pT physics09)036

COMP 250 Winter 2011 Reading: Java background January 5, 2011

Advanced Systems Programming

LHCb Topological Trigger Reoptimization

2 Databases for calibration and bookkeeping purposes

Data Analysis in Experimental Particle Physics

Object Oriented Programming

PoS(IHEP-LHC-2011)002

What are the characteristics of Object Oriented programming language?

The evolving role of Tier2s in ATLAS with the new Computing and Data Distribution model

The Cut and Thrust of CUDA

SRM ARTS AND SCIENCE COLLEGE SRM NAGAR, KATTANKULATHUR

Tracking and compression techniques

Applying STEP Principles to Product Models in High Energy Physics Research

A New Segment Building Algorithm for the Cathode Strip Chambers in the CMS Experiment

CS304 Object Oriented Programming Final Term

Monte Carlo Tuning of LUARLW Mode

STUDY NOTES UNIT 1 - INTRODUCTION TO OBJECT ORIENTED PROGRAMMING

PoS(EPS-HEP2017)492. Performance and recent developments of the real-time track reconstruction and alignment of the LHCb detector.

OBJECT ORIENTED PROGRAMMING USING C++ CSCI Object Oriented Analysis and Design By Manali Torpe

Contents 1 Introduction 2 2 Design and Implementation The BABAR Framework Requirements and Guidelines

Fast pattern recognition with the ATLAS L1Track trigger for the HL-LHC

Virtualizing a Batch. University Grid Center

CLAS Data Management

M.C.A DEGREE EXAMINATION,NOVEMBER/DECEMBER 2010 Second Semester MC 9222-OBJECT ORIENTED PROGRAMMING (Regulation 2009)

A new inclusive secondary vertex algorithm for b-jet tagging in ATLAS

Legacy code: lessons from NA61/SHINE offline software upgrade adventure

DATA STRUCTURES CHAPTER 1

LHCb Computing Resources: 2018 requests and preview of 2019 requests

The Error Reporting in the ATLAS TDAQ System

Histograms And N-tuples

CMS Conference Report

DESY at the LHC. Klaus Mőnig. On behalf of the ATLAS, CMS and the Grid/Tier2 communities

Lecturer: William W.Y. Hsu. Programming Languages

The BaBar Computing Model *

Lecture 10: building large projects, beginning C++, C++ and structs

PoS(EPS-HEP2017)523. The CMS trigger in Run 2. Mia Tosi CERN

Introduction to Information Systems

The ATLAS Trigger Simulation with Legacy Software

MIP Reconstruction Techniques and Minimum Spanning Tree Clustering

Frank Gaede, DESY, SLAC Simulation Meeting March 16/ Persistency and Data Model for ILC Software

Determination of the aperture of the LHCb VELO RF foil

Geant4 Computing Performance Benchmarking and Monitoring

Object Oriented Programming with c++ Question Bank

Tau ID systematics in the Z lh cross section measurement

Robustness Studies of the CMS Tracker for the LHC Upgrade Phase I

Online data storage service strategy for the CERN computer Centre G. Cancio, D. Duellmann, M. Lamanna, A. Pace CERN, Geneva, Switzerland

VALLIAMMAI ENGINEERING COLLEGE

Ministry of Higher Education Colleges of Applied Sciences Final Exam Academic Year 2009/2010. Student s Name

CERN openlab & IBM Research Workshop Trip Report

Transcription:

EPS-HEP99 Abstract # 5-707 Parallel sessions: 6 Plenary sessions: 5 ALEPH 99-056 CONF 99-03 June 30, 999 PRELIMINARY OPEN-99-250 30/06/99 Object oriented data analysis in ALEPH The ALEPH Collaboration and D. Düllmann a a RD45 Collaboration, CERN, Geneva, Switzerland Contact person : G. Dissertori, Guenther.Dissertori@cern.ch Abstract The project ALPHA++ of the ALEPH collaboration is presented. The ALEPH data have been converted from Fortran data structures (BOS banks) into C++ objects and stored in an object database (Objectivity/DB), using tools provided by the RD45 collaboration and the LHC++ software project at CERN. A description of the database setup and of a preliminary version of an object oriented analysis program (C++) is given. Contribution by ALEPH to the 999 summer conferences

Introduction Object oriented (OO) programming is becoming the new paradigm for software development in high energy physics, in particular in view of the upcoming, very demanding LHC project and its experiments. In order to provide an analysis platform and OO application software, the LHC++ project has been set up at CERN []. Furthermore, commercial products for the storage of persistent objects in so called object databases are being evaluated by the RD45 collaboration [2]. The current candidate for a large scale application at CERN is Objectivity/DB [3]. In ALEPH a project called ALPHA++ [4] has been set up, with the following goals : convert the ALEPH data from a Fortran (BOS [5]) bank style into persistent objects and write them to an object database (Objectivity/DB) rewrite a mini-version of the ALEPH analysis package ALPHA [6] in an object oriented computing language (C++), based on the object database. compare the standard and OO performance with regard to the efficient access of the data and their manipulation. test the software engineered by the RD45 and LHC++ projects. provide input and experience for a possible archiving of ALEPH data. give an opportunity to learn OO analysis, design and programming. In this report we present the structure of the ALEPH object database and a preliminary version of the analysis program. 2 The ALEPH data structure and its conversion to objects ALEPH uses the Fortran based BOS system for the memory management. The event data are kept in memory in a large array which is globally accessible as a common block. The data are organized in so called banks, and the data definition language (DDL) for these banks is provided by ADAMO [7]. The ADAMO package offers a conversion to C header files, which facilitates the automatic conversion of the ALEPH DDL to C++ classes, or to be more precise, to the ddl-files needed for the setup of an Objectivity/DB database. It has been decided to perform a one-to-one conversion of all the relevant banks stored on an ALEPH DST (data summary tape), which can be read by the analysis program ALPHA. A simple C++ tool has been developed which allows for the automatic conversion of the ADAMO DDL of all 73 relevant banks into C++ classes. An example of the correspondence between the ADAMO description of the track bank FRFT and its C++ implementation is given in Fig.. The general DDL structure as implemented in the Objectivity database is outlined in Fig. 2. AclassAlephBank serves as a base class for all the ALEPH banks. Persistency is obtained by inheritance from the Objectivity class ooobj. TheclassAlephBank stores the name of each bank, and defines the interface for the reading from and writing to memory of the bank contents. Each 2

ADAMO DDL FRFT : Global Geometrical track FiT NR=0.(JUL)\ Number of words/track\ Number of tracks STATIC = (InverseRadi = REAL [*,*], TanLambda = REAL [*,*], Phi0 = REAL [0.,6.3], D0 = REAL [-80.,80.], Z0 = REAL [-220.,220.], Alpha = REAL [-3.5,3.5], EcovarM(2) = REAL [*,*], Chis2 = REAL [0.,*], numdegfree = INTE [0,63], nopt = INTE [0,49] ); C++ CLASS class FRFT { public: // default constructor FRFT() {} float InverseRadi; float TanLambda; float Phi0; float D0; float Z0; float Alpha; float EcovarM[2]; float Chis2; int numdegfree; int nopt; }; Figure : Correspondence between the ADAMO description of the track bank FRFT and its C++ implementation. ALEPH bank is implemented as a class NAME Bank, wherename is the BOS name of the bank, such as FRFT. This class NAME Bank contains a variable length vector NAME Table, the elements of which are objects of the class NAME obtained from the conversion of the ADAMO DDL into a C++ class. For example, the elements of the vector FRFT Table are the FRFT objects (tracks) of a given event. NAME Bank also has the specific implementation of the read and write methods for the memory access. 3 The ALEPH object database The database structure of the ALEPH object database is reproduced in Fig. 3. At the basis there is a federated database called ALEPHDB, which contains several databases. These databases either contain real data from different data taking periods with or without pre-classification of events, or Monte Carlo events from fully simulated hadronic Z decays. Every database has a container holding Run objects. These objects store the run number and have bi-directional links to all the event objects of the corresponding run and to banks which store global run condition data. Furthermore, for every run there is a container with the event objects of this run and an additional container with all the bank objects per run. The event objects store the event number and the classification bit of the event, and have bi-directional links to all the banks of the event. At the moment about eight Gbyte of real and Monte Carlo data are stored in the federated database. In the following some examples are given. A database which contains 5 runs, 9773 events without preselection and a total of 374686 banks, has a size of 234.3 Mbytes. The operation of the Objectivity tool ootidy reduces this by 5% to 22.2 Mbytes. The size of the corresponding EPIO file where the BOS banks are stored is 86.2 Mbytes. The overhead arises mainly from the relationships which are stored in the object database. A high page density of 99.8% is achieved. 3

ooobj AlephBank void setname(char* name) int LoadFromMem(int* p) NAME=FRFT =FRTL =... char*4 _bankname NAME_Bank oovarray(name) NAME_Table int LoadFromMem(int* p) NAME float InverseRadi; float TanLambda; float phi0;. Figure 2: The DDL structure of the ALEPH object database Another database of 9 runs, 5000 events classified as hadronic events and 204362 banks, has a size of 763.4 Mbytes, whereas a database with 5000 simulated hadronic events is of similar size, containing 883047 banks. For the moment two simple C++ programs exist for the reading and writing of the databases. In the reading case, a loop over all run, event and bank objects is performed, using the iterators over the bi-directional links, and for every bank in the loop its method is called which copies the bank contents into memory in order to restore the BOS common. This allows for the application of the standard Fortran based analysis program ALPHA as well as for the event display program to be used. The banks stored in the memory are also the starting point for an object oriented analysis program which is described in the next section. 4 The Analysis Program The ALPHA++ analysis program is based on the structure of ALPHA. Two basic objects are defined: the Tracks and the Vertices. A Track can be a charged track reconstructed in the TPC, a photon identified in the ECAL or a general Energy Flow Object. The common attributes of a Track are: the total momentum with its components, the energy, the mass and the charge. A Track can also have more specific attributes depending if it is a charged track, a photon or an Energy Flow Object. The previously described structure has been implemented in ALPHA++ with an abstract class AlObject (Figure 4), with the common attributes of a Track defined as purely virtual member functions. The concrete transient classes Altrack (charged tracks), AlEflw (Energy Flow) and AlGamp (photons) are defined by inheritance from AlObject (Figure 5). The same principles apply also to the definition of the Vertices (Figure 4). There is an abstract class AlVertex with the basic attributes, and two concrete transient classes AlMainVertex, holding 4

Database Runs Container Container Events_<run#> Container Banks_<run#> Database 2 Federated Database Run Event Event Bank Bank Bank Figure 3: The database structure of the ALEPH object database. the position of the interaction point, and AlGenVertex for the secondary vertices found by the reconstruction program (Figure 6). The implementation of the previously mentioned transient classes in ALPHA++ makes use of the ALPHA internal banks QVEC (containing the Tracks) and QVRT (containing the Vertices). This is done in the following steps: For each event the relevant persistent classes from the Objectivity database are read and the corresponding BOS banks in the Fortran BOS common are filled; the Fortran subroutines which starting from the BOS banks construct the internal QVEC and QVRT data structures are called from C++; the concrete C++ classes (such as AlTrack, AlMainVertex etc.) are instantiated using the data contained in QVEC and QVRT; for each event the transient class AlphaBanks containing all the instances of the previous classes and giving access to them through member functions is instantiated. A complete class diagram of the analysis program is shown in Fig. 7. The choice to call Fortran from C++ in the analysis program is due to the fact that there are many algorithms in ALPHA which, on a short time scale, are not possible to develop in C++. Therefore our strategy is to have a Fortran wrapped analysis program already working and to use it as a basis to develop new C++ code and algorithms. 5 Conclusions A project called ALPHA++ has been set up within the ALEPH experiment in order to test new object oriented approaches to data storage and analysis. The data definition language, the structure of the object database as well as a preliminary version of an object oriented analysis program have been presented. Future work will concentrate on the further development of this analysis program as well as on detailed performance studies. 5

6 Acknowledgements We would like to thank the members of the RD45 collaboration and of the LHC++ project for their help with respect to conceptual as well as technical problems. References [] http://wwwinfo.cern.ch/asd/lhc++/index.html [2] http://wwwinfo.cern.ch/asd/rd45/index.html [3] http://www.objectivity.com/ [4] http://alephwww.cern.ch/~disserto/objty/welcome.html [5] http://www.cern.ch/light/examples/aleph/bos/bos.html [6] http://alephwww.cern.ch/alephgeneral/phy/doc/alguide/alguide.html [7] http://www.cern.ch:80/adamo/refmanual/document.html 6

class AlObject { public: ~AlObject(); virtual float QP() = 0; virtual float QX() = 0; virtual float QY() = 0; virtual float QZ() = 0; virtual float QE() = 0; virtual float QM() = 0; virtual float QCH() = 0; }; class AlVertex { public: ~AlVertex(); virtual float VXposition() = 0; virtual float VYposition() = 0; virtual float VZposition() = 0; virtual int Vertex_number() = 0; virtual int Vertex_type() = 0; virtual float ChiSquareFit() = 0; virtual float* CovMatrix() = 0; // pointer to covariance matrix }; Figure 4: The AlObject and AlVertex abstract classes interface AlObject AlTrack AlEflw AlGamp Figure 5: The AlObject inheritance structure AlVertex AlMainVertex AlGenVertex Figure 6: The AlVertex inheritance structure 7

AlVertex TYPE() QVX() QVY() QVZ() KVN() KVTYPE() QVCHIF() QVEM() QVRT_ALPHA qvr 0.. QVEC_ALPHA 0.. qve AlObject QP() QX() QY() QZ() QE() QM() QCH() TYPE() QvrtBase #q 0.. QVRT 0.. qvr 0.. QVEC #q 0.. qv QvecBase AlMainVertex 0.. AlGenVertex 0.. ge AlphaBanks AlphaBanks() ~AlphaBanks() Eflw() Gampec() MainVertex() GenVertex() NTrack() NEflw() NGampec() NGen_Vertex() Track() AlEflw ef 0.. 0.. tr AlTrack AlGamp g 0.. frtl 0.. FRTL_ALPHA Figure 7: ALPHA++ analysis program class diagram (reverse engineered). The vertex related classes are on the left side, the track related classes are on the right side. The class AlphaBanks acts as a container for the previous classes. 8