LCB Workshop, Marseille 1999


LCB Workshop, Marseille, 27/9-2/10 1999. Sessions: Event Filter Farms; Distributed Computing and Regional Centres; Architecture; Round Table on Software Process; Simulation; Persistency at LHC; Data Analysis; Technology Tracking.

H. Hoffmann, CERN: Planned Review of Computing. A review of the progress and planning of the computing efforts at CERN and in the experiments for the LHC start-up, covering the world-wide analysis/computing model, the software projects (design and development), and management & resources.

Event Filter Farms session (chair: François Touchard): Review of Existing Farms (STAR, PHENIX, HERA-B, CDF, DØ, BaBar); State of the Art Farm Computing; The LHC Situation.

Event Filter Farms (contd). Nobody really discussed event filter farms and the situation did not really change: the experiments presented their DAQ systems, while IT/PDP talked about generic, offline-oriented farms: monitoring, single-system illusion, process scheduling (NQS & co.), failure management, parallel computing, disk/file servers.

J. Jaen, IT/PDP: Key Issues
- Size scalability (physical & application)
- Enhanced availability (failure management)
- Single system image (look-and-feel of one system)
- Fast communication (networks & protocols)
- Load balancing (CPU, network, memory, disk)
- Security and encryption (farm of farms)
- Distributed environment (social issues)
- Manageability (administration and control)
- Programmability (offered API)
- Applicability (farm-aware and non-aware applications)

What I'd expected:
- Hardware installation topics: n*1000 PCs take up some space
- How to use the cheapest commodity items (slow control, ...)
- Level 2/3 parameter management (SCADA, ...)
- Trigger programs will run under some FSM: process control
- Filter programs must be in sync with the DAQ
- Some ideas on how to collect information, e.g. monitoring histograms

HERA-B reduced reconstruction time by O(10) by optimizing algorithms!

Distributed Computing and Regional Centres: Rapporteur talk, Harvey Newman; MONARC Working Groups status report, Iosif Legrand; DPSS (Distributed Parallel Storage System), Brian Tierney.

H. Newman, CIT: What is MONARC Doing? Similar data processing jobs are performed in several Regional Centres. Each centre has the TAG and AOD databases replicated; the main centre provides the ESD and RAW data. Each job processes AOD data, and also a fraction of the ESD and RAW.
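
A back-of-the-envelope sketch in Python of the job model described above, where a job reads all of its AOD plus only a fraction of the ESD and RAW; every number in it (event sizes, fractions, sample size) is a hypothetical placeholder, not a MONARC figure.

```python
# Hypothetical illustration of the MONARC-style job model: a job touches all
# AOD for its sample, plus a fraction of the ESD and RAW. Numbers are invented.

def job_input_volume(n_events, aod_kb, esd_kb, raw_kb, f_esd, f_raw):
    """Return the total data read by one job, in GB."""
    per_event_kb = aod_kb + f_esd * esd_kb + f_raw * raw_kb
    return n_events * per_event_kb / 1e6   # 1e6 kB per GB

# e.g. 10 million events, AOD 10 kB, ESD 100 kB, RAW 1000 kB per event,
# touching 5% of the ESD and 1% of the RAW:
print(job_input_volume(1e7, 10, 100, 1000, 0.05, 0.01))   # -> 250.0 GB
```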

J. Legrand, CIT: Process Modeling. Concurrently running tasks share resources (CPU, memory, I/O). Interrupt-driven scheme: for each new task, or when one task finishes, an interrupt is generated and all processing times are recomputed. It provides an efficient mechanism to simulate multitask processing, handling of concurrent jobs with different priorities, and an easy way to apply different load-balancing schemes.
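
A minimal Python sketch of the interrupt-driven scheme described above (not the MONARC code): tasks share one CPU equally, and at every arrival or completion the remaining processing times of all running tasks are recomputed. Function and variable names are illustrative.

```python
def simulate_processor_sharing(arrivals, cpu_speed=1.0):
    """Simulate tasks sharing one CPU equally (processor sharing).

    arrivals: list of (arrival_time, work) pairs; work is CPU seconds at full speed.
    Returns a list of (arrival_time, finish_time) in input order.
    """
    pending = sorted((t, i, w) for i, (t, w) in enumerate(arrivals))
    remaining = {}            # task index -> remaining work
    finish = {}
    now = 0.0
    k = 0                     # index of the next arrival to admit

    while k < len(pending) or remaining:
        # Rate seen by each running task under equal sharing.
        rate = cpu_speed / len(remaining) if remaining else 0.0

        # Time to the next "interrupt": a completion or an arrival.
        next_completion = min(remaining.values()) / rate if remaining else float('inf')
        next_arrival = pending[k][0] - now if k < len(pending) else float('inf')
        dt = min(next_completion, next_arrival)

        # Advance the clock and drain work from every running task;
        # this is where all processing times get recomputed.
        for i in remaining:
            remaining[i] -= rate * dt
        now += dt

        # Record completions.
        for i in [i for i, w in remaining.items() if w <= 1e-12]:
            finish[i] = now
            del remaining[i]

        # Admit the arrival if that is what triggered the interrupt.
        if k < len(pending) and abs(pending[k][0] - now) < 1e-12:
            _, i, w = pending[k]
            remaining[i] = w
            k += 1

    return [(arrivals[i][0], finish[i]) for i in range(len(arrivals))]

# Two 10-second jobs arriving together plus a 1-second job at t=5:
print(simulate_processor_sharing([(0.0, 10.0), (0.0, 10.0), (5.0, 1.0)]))
```

In the example, the two long jobs finish at t=21 rather than t=20 because the short job arriving at t=5 briefly shares the CPU with them before finishing at t=8.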

J. Legrand, CIT: Model Validation. [Plot: mean time per job (ms) versus number of concurrent jobs, comparing measurements on the local DB (atlobj, monarc) and via AMS with the simulated local DB and simulated AMS.]

J. Legrand, CIT: User Happiness Factors. Mean values: CERN 0.83, Caltech 0.42, INFN 0.57.

Objectivity Usage Statistics (would these parameters not also fit LHCb?):
- >20 sites using Objectivity (USA, UK, France, Italy, Germany)
- ~650 licensees (people who have signed the license agreement)
- ~400 users (people who have created a test federation)
- >100 simultaneous users
- Monitoring: distributed oolockmon statistics

Persistency at LHC: Persistency at LHC, Vincenzo Innocente; Recent Experience at BaBar, David Quarrie.

V. Innocente, CMS: Physical Model and Logical Model. The physical model may be changed to optimise performance; existing applications continue to work.

V. Innocente, CMS: Solution Space
- ODBMS (Objectivity/DB or an in-house build): quasi-random access to event data
- ROOT: ROOT alone is not sufficient
- Wrapped relational databases
- Hybrid solutions (CDF, D0, STAR, PHENIX): sequential files / ROOT files, with file management using relational databases (see the sketch below)
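
As an illustration of the last, "hybrid" option, here is a minimal Python sketch (with an invented schema and file names) in which event data lives in plain ROOT-style files and a relational database only records which file belongs to which dataset.

```python
import sqlite3

# Toy file catalogue: the relational DB tracks files, not events.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE file_catalog (
                 dataset   TEXT,
                 run       INTEGER,
                 filename  TEXT)""")

# Register a few files produced by some hypothetical reconstruction jobs.
db.executemany("INSERT INTO file_catalog VALUES (?, ?, ?)",
               [("dimuon", 1001, "dimuon_run1001.root"),
                ("dimuon", 1002, "dimuon_run1002.root"),
                ("minbias", 1001, "minbias_run1001.root")])

# An analysis job asks the catalogue which files to open.
rows = db.execute("SELECT filename FROM file_catalog WHERE dataset = ?",
                  ("dimuon",)).fetchall()
print([r[0] for r in rows])
```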

V. Innocente, CMS: Bulk HEP Data. Event data and user data; well structured, WORM(?); these are the PBytes. [Diagram: Event, Tracks, Electrons; Event Collection and Collection Meta-Data; User Tag (N-tuple); Tracker Alignment and ECAL calibration.]

Experiment Characteristics (D. Quarrie, BaBar):
- Number of detector subsystems: 7
- Number of electronic channels: ~250,000
- Raw event size: ~32 kBytes
- DAQ to Level 3 trigger: 2000 Hz, 50 MByte/s
- Level 3 to reconstruction: 100 Hz, 2.5 MByte/s
- Reconstruction: 100 Hz, 7.5 MByte/s
- Event rate: 10^9 events/year
- Storage requirements (real & simulated data): ~300 TByte/year
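
(Rough cross-check, assuming a canonical ~10^7-second running year: 100 Hz x 10^7 s ~ 10^9 events/year, and ~300 TByte/year spread over 10^9 events corresponds to roughly 300 kBytes of stored real plus simulated data per event.)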

First Results. [Plot: events/sec versus number of nodes (0-200) for the baseline and "noobjy" configurations, with the production set point and the asymptotic limit indicated.]

Results so far. [Plot: events/sec versus number of nodes (0-200) for a series of configurations: baseline, initnrpages, 2dbClusters, noobjy, 2AMSes, 2AMSes+2dbClusters(segr), 1Gb metadata, 4AMSes+4dbClusters(segr), fixedams (4 used), fixed AMS (1 used), FS defragm.]

Knobs to twiddle (tested): minimize catalog operations; separate conditions DB server; separate catalog server; tune the AMS server; client file descriptors; client cache sizes; initial container sizes; transaction lengths; TCP configuration; multiple AMS processes; database clustering; autonomous partitions; disable filters; singleton federations; Veritas filesystem optimization; decrease payload per event; LM starvation?; load-balance across data movers; more data movers; database pre-creation; gigabit lockserver; caching handles; local bootfile; unlock instead of mini-transaction; run OPR with no output; run on shire to bypass the AMS.

My Impressions: LHCb ~ BaBar x 2-3; ATLAS or CMS ~ BaBar x n (n < 10); ALICE is different. Objectivity/DB works, but BaBar has 6-12 people dealing with persistency. It's a nice tale that commercial SW does not need maintenance.

Simulation: Rapporteur talk, Matthias Schroeder; Report from the GEANT4 workshop, John Apostolakis; Base Classes for Simulation in ALICE: Generators and Segmentation, Andreas Morsch.

J. Apostolakis, Geant4/IT: Another Geant4 Overview. Geant4: brief history; overview of the Geant4 kernel's power and additional abilities; developments at the Geant4 workshop (20-24 Sept. 1999); status and plans. We have to put our hands on it ourselves.

A. Morsch, ALICE: Simulation Components. [Diagram linking feasibility studies, physics signals, fast simulation, detector simulation and physics performance; examples: fast simulation (MC event generators, particle stack, detector performance, segmentation, detector classes) and digitisation (cluster finder, tracking codes, hits, digits, reconstruction, reconstruction optimisation).]

A. Morsch, ALICE: New Tracking Schema. [Flow: a FLUKA or Geant4 step enters via GUSTEP / the Geant4 StepManager, is passed to AliRun::StepManager and on to the detector-version StepManager, which adds the hit; disk I/O goes to Objectivity or ROOT.]
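
A minimal Python sketch of the dispatch idea behind this schema (not AliRoot code): the transport engine calls a single step hook per step, which routes the step to the StepManager of the detector module owning the current volume. All class names and the volume/energy fields are illustrative.

```python
class DetectorModule:
    def __init__(self, name):
        self.name = name
        self.hits = []

    def step_manager(self, step):
        # Detector-specific response: here we simply record energy deposits.
        if step["edep"] > 0:
            self.hits.append((step["volume"], step["edep"]))

class Run:
    def __init__(self, detectors):
        self._by_volume = {d.name: d for d in detectors}

    def step_manager(self, step):
        # Called once per transport step by the engine (a GUSTEP-like hook);
        # route the step to the detector owning the current volume.
        detector = self._by_volume.get(step["volume"])
        if detector:
            detector.step_manager(step)

tpc, its = DetectorModule("TPC"), DetectorModule("ITS")
run = Run([tpc, its])
for step in [{"volume": "TPC", "edep": 0.1},
             {"volume": "ITS", "edep": 0.0},
             {"volume": "TPC", "edep": 0.3}]:
    run.step_manager(step)
print(tpc.hits, its.hits)
```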

Round Table on Software Process: Overview of Processes Currently Established, Hans-Peter Wellisch; Software Processes in BaBar, Gabriele Cosmo; Software Processes in Geant4, John Apostolakis; panel discussion with ALICE, ATLAS, CMS, LHCb, BaBar and Geant4.

Architecture: Rapporteur talk, RD Schaffer.

RD Schaffer, ATLAS: Architecture Styles. Control/data interaction issues: are control-flow and data-flow topologies isomorphic? If isomorphic, is the direction the same or opposite? Useful examples of architectural patterns: call-and-return styles (data abstraction, object-oriented); data-flow styles (pipe-and-filter systems); data-centered repository styles (blackboard); interacting-process styles (implicit invocation). Note: few systems are purely any one of these!

RD Schaffer, ATLAS. [Diagrams: the call-and-return style (method calls between computational objects) versus the data-centered repository style (computational objects around a blackboard, i.e. shared data in memory).]

RD Schaffer, ATLAS: control frameworks. [Diagrams: BaBar and CDF use a pipe-and-filter style, with an input module feeding a chain of application modules and an output module, passing intermediate data objects along the pipe; ATLAS uses an object network, with modules exchanging events between input and output modules.]
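
A minimal Python sketch of the pipe-and-filter style attributed above to BaBar and CDF (not their actual frameworks): an input module produces events, a chain of application modules transforms or filters them, and an output module consumes what survives. All names are illustrative.

```python
class Module:
    def process(self, event):
        """Return the (possibly modified) event, or None to filter it out."""
        return event

class InputModule:
    def events(self):
        # Hypothetical event source: three toy events.
        for i in range(3):
            yield {"id": i, "raw": list(range(i + 1))}

class TrackingModule(Module):
    def process(self, event):
        event["tracks"] = len(event["raw"])            # pretend reconstruction
        return event

class FilterModule(Module):
    def process(self, event):
        return event if event["tracks"] > 1 else None  # drop sparse events

class OutputModule:
    def write(self, event):
        print("writing", event)

def run(source, modules, sink):
    for event in source.events():
        for module in modules:
            event = module.process(event)
            if event is None:        # a filter rejected the event
                break
        else:
            sink.write(event)

run(InputModule(), [TrackingModule(), FilterModule()], OutputModule())
```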

RD Schaffer, ATLAS. [Diagram: CMS uses independent reactive objects, where event signals trigger method invocations.]
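
A minimal Python sketch of the implicit-invocation style attributed above to CMS (not CMS code): objects subscribe to named event signals, and emitting a signal invokes their callbacks without the emitter knowing who listens. Names are illustrative.

```python
class SignalBus:
    def __init__(self):
        self._subscribers = {}

    def subscribe(self, signal, callback):
        # Register a callback for a named signal.
        self._subscribers.setdefault(signal, []).append(callback)

    def emit(self, signal, payload):
        # Invoke every subscriber; the emitter does not know them.
        for callback in self._subscribers.get(signal, []):
            callback(payload)

bus = SignalBus()
bus.subscribe("newEvent", lambda ev: print("tracker sees", ev))
bus.subscribe("newEvent", lambda ev: print("calorimeter sees", ev))
bus.emit("newEvent", {"id": 42})
```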

RD Schaffer, ATLAS: Architectural Issues of Persistency. [Diagrams contrasting two approaches: in both, ODBMS knowledge is optimized for a specific purpose (disk to memory); in one, the persistent shape is converted to a transient shape and then to a converted shape used by the algorithms; in the other there is no transient shape and the persistent shape is converted directly for the algorithms.]
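
A minimal Python sketch of the persistent/transient separation at issue here (not any experiment's converter code): algorithms see only a transient shape, and a converter maps between it and the persistent shape, keeping ODBMS knowledge in one place. All class and field names are invented.

```python
from dataclasses import dataclass

@dataclass
class PersistentTrack:          # the shape stored on disk (flat, storage-friendly)
    px: float
    py: float
    pz: float

@dataclass
class TransientTrack:           # the shape the algorithms actually use
    momentum: tuple

class TrackConverter:
    def to_transient(self, p: PersistentTrack) -> TransientTrack:
        return TransientTrack(momentum=(p.px, p.py, p.pz))

    def to_persistent(self, t: TransientTrack) -> PersistentTrack:
        return PersistentTrack(*t.momentum)

# Algorithms ("algo 1", "algo 2") depend only on the transient shape:
def algo_pt(track: TransientTrack) -> float:
    px, py, _ = track.momentum
    return (px * px + py * py) ** 0.5

stored = PersistentTrack(1.0, 2.0, 3.0)        # what the ODBMS hands back
print(algo_pt(TrackConverter().to_transient(stored)))
```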

Data Analysis: Rapporteur talk, Guy Wormser; Migration from Fortran to C++ and OO, as seen by the physicist, Marek Kowalski; The WIRED experience, Julius Hrivnac; The JAS experience, Michael Ronan.

G. Wormser, LAL Orsay: Historical perspective, PAW. A very large productivity boost in the physicists' community came with the introduction of a universal analysis tool, PAW: very easy to use, available everywhere; ntuples, MINUIT, a presentation package, a Fortran interpreter, macros/scripts (KUIP, .kumac). No integration within the experiments' frameworks: no overhead, but no possibility to benefit from the infrastructure.

G. Wormser, LAL Orsay: The new environment. OO data structures (ROOT, Objectivity, etc.) and analysis codes and tools in an OO language: we want PAW_OO! Very large datasets and very powerful CPUs call for better integration within the framework and better interactivity.

G. Wormser, LAL Orsay: ROOT examples.

J. Hrivnac, ATLAS: Java is coming...

The Java Alliance. Analysis helpers in Java seem to be becoming usable and are easily extendable; the Java language is equivalent to scripting and better than kumac. Tony Johnson/SLAC, JAS: HBOOK & co. are just plugins to access data. Mark Donzelmann/CERN, WIRED: a client-server approach to event displays.

M. Ronan, LBL: JAVA@NLC. Started with Java, no legacy; full reconstruction for detector studies; the detector is a bigger ALEPH.

Technology Tracking: Pasta (processors, storage, architectures), Les Robertson; Local Area Networking, Jean-Michel Jouanigot; Wide Area Networking, Olivier Martin; Data Grid projects, Stewart Loken.

Technology Tracking, CERN/IT: Storage, CPU & Networks, estimated costs in 2005.
- Processors: $0.75-1.60 per CERN-Unit
- Disks: $2-4/GB in 2005; data rate increases only with the linear density
- Tapes: $0.50 per GB for raw tape; reliable drives unlikely to go below $10-20K; robotics >= $20 per slot (no improvement)
- LAN: 1000BT; NIC $200, switch port $500-1000
- WAN: unforeseeable; expected growth at constant cost: 15-50% per year
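
(For scale, at the quoted prices a petabyte of disk in 2005 would cost roughly $2-4M, and a petabyte of raw tape roughly $0.5M, before drives and robotics.)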