ATLAS Nightly Build System Upgrade

G Dimitrov 1, E Obreshkov 2, B Simmons 3 and A Undrus 4,5, on behalf of the ATLAS Collaboration

1 CERN, 1211 Geneva, Switzerland
2 University of Innsbruck, Innrain 52, 6020 Innsbruck, Austria
3 University College London, Gower Street, London WC1E 6BT, UK
4 Brookhaven National Laboratory, Upton, NY 11973, USA
5 To whom any correspondence should be addressed.

E-mail: undrus@bnl.gov

Abstract. The ATLAS Nightly Build System is a facility for the automatic production of software releases. As the major component of the ATLAS software infrastructure, it supports more than 50 multi-platform branches of nightly releases and provides ample opportunities for testing new packages, verifying patches to existing software, and migrating to new platforms and compilers. The Nightly System testing framework runs several hundred integration tests of different granularity and purpose. The nightly releases are distributed and validated, and some are transformed into the stable releases used for data processing worldwide. The activities of the first LHC Long Shutdown (2013-2015) will put an increased load on the Nightly System, as additional releases and builds are needed to exploit new programming techniques, languages, and profiling tools. This paper describes the plan for the Long Shutdown upgrade of the ATLAS Nightly Build System. The upgrade brings modern database and web technologies into the Nightly System, improves the monitoring of nightly build results, and provides new tools for offline release shifters. We also outline long-term plans for distributed nightly release builds and testing.

1. Introduction

ATLAS (A Toroidal LHC Apparatus) [1] is one of the largest collaborative efforts ever attempted in the physical sciences. The Phase I ATLAS upgrade [2] is designed to adapt the detector hardware and software to the increased LHC luminosity. The upgraded software and computing systems will deal with increased data volumes and event complexity, and the code must be modified to work efficiently with new vector-processing and multi-threading computing architectures. The upgrade provides opportunities to consider, make, and test changes to the collaborative software infrastructure. The ATLAS Nightly Build System upgrade is a central part of the ATLAS infrastructure update: it will provide improved monitoring of nightly build results, new tools for automating the tasks of offline release shifters, and modern database and web technologies for the Nightly System. This paper describes the plan of the ATLAS Nightly Build System Long Shutdown upgrade and outlines future development plans.

2. ATLAS Nightly Build System overview

The ATLAS Nightly System facilitates coordination between several hundred software developers working around the world and around the clock [3]. Its central component is the NICOS Nightly Control Tool [4]. The system supports more than 50 nightly release branches, described in table 1. An ATLAS software release [5] comprises a large number of packages with specific version tags, stored in the Tag Collector [6], a database application with a web interface through which developers interactively select package version tags from the ATLAS SVN code repository for the nightly releases.

Table 1. ATLAS nightly release branches.

- Major integration (3 branches, 2000-2300 packages): preparation of stable software releases.
- Validation (3 branches, 2000-2300 packages): testing new versions before submission to major integration branches.
- Experimental (5 branches, 2000-2300 packages): probing new systems and compilers.
- Migration (5 branches, 500-2300 packages): development of specific software domains or testing new versions of external tools and applications.
- New features (8 branches, 10-2300 packages): assigned to developers of new software.
- Patch (15 branches, 20-200 packages): amendments to stable software releases.
- Physics Analysis (17 branches, 3-50 packages): analysis software collections.
- ROOT-based Analysis (3 branches, 20-50 packages): light-weight analysis tool collections.

ATLAS nightly releases are rebuilt for each branch on 1 to 4 platforms every day (in some cases several times per day) with the CMT configuration management and build tool [7], on the ATLAS nightly computing farm at CERN equipped with ~50 powerful multi-core nodes. Builds are accelerated by the file- and package-level parallelism supported by CMT, by the distcc and ccache tools, and by running tests in parallel; the largest builds take up to 9 hours. ATLAS nightly releases are packaged with the PackDist tool [8] and installed on the AFS and CernVM-FS [9] distributed file systems for worldwide access. CernVM-FS is a FUSE-based, read-only file system delivered over HTTP that provides file de-duplication and on-demand file transfer with caching, scalability, and performance. Nightly releases are kept for 2 to 7 days. When certain development goals are achieved, a successful nightly release is transformed into a stable release by the team of ATLAS offline release shifters. Stable releases have unique numeric identifiers and an indefinite lifetime.

The Nightly System is connected with the ATN [10] and RTT [11] testing frameworks, which run tests of different granularity levels. The ATN test tool is embedded within the Nightly System and launches tests concurrently with compilations for faster delivery of results. As fast feedback to developers is one of the most important functions of a nightly system, NICOS automatically posts information about the progress of nightly builds and tests, identifies problems, and creates summary web pages reflecting the system status. Automatic e-mail notifications about problems are sent to the responsible developers; illustrative sketches of the notification and build-parallelism mechanisms follow below.
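The automatic notification mechanism just described can be illustrated with a short sketch using Python's standard library. This is a minimal, hypothetical example, not the actual NICOS implementation: the sender address and SMTP host are placeholders.

    # Illustrative sketch of automatic problem notifications (hypothetical
    # addresses and SMTP host; not the actual NICOS code).
    import smtplib
    from email.message import EmailMessage

    def notify(developer_email, branch, package, log_excerpt):
        """Send a failure notice for one package to its responsible developer."""
        msg = EmailMessage()
        msg["Subject"] = "Nightly build problem: %s in %s" % (package, branch)
        msg["From"] = "nightly-robot@example.org"   # placeholder sender
        msg["To"] = developer_email
        msg.set_content("Compilation or test failure detected:\n\n" + log_excerpt)
        with smtplib.SMTP("localhost") as server:   # placeholder SMTP host
            server.send_message(msg)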

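Returning to the build acceleration described earlier in this section, the following minimal Python sketch illustrates package-level parallelism by building independent packages concurrently. It is illustrative only: the package list and the make-style build command are placeholders, and CMT/NICOS use their own scheduling rather than this code.

    # Minimal sketch of package-level build parallelism (illustrative only).
    import subprocess
    from concurrent.futures import ThreadPoolExecutor, as_completed

    PACKAGES = ["Reconstruction/RecExample", "Trigger/TrigSteering"]  # hypothetical

    def build(package):
        """Build one package; distcc/ccache would be configured in the environment."""
        result = subprocess.run(
            ["make", "-C", package, "-j4"],   # placeholder build command
            capture_output=True, text=True)
        return package, result.returncode

    with ThreadPoolExecutor(max_workers=8) as pool:  # cap concurrent package builds
        futures = [pool.submit(build, p) for p in PACKAGES]
        for f in as_completed(futures):
            pkg, rc = f.result()
            print("%s: %s" % (pkg, "OK" if rc == 0 else "FAILED"))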
3. Key features of the ATLAS Nightly Build System upgrade

The objective of the upgrade is to provide ATLAS developers with an improved ATLAS Nightlies web user interface and to automate the tasks of ATLAS offline release coordinators and shifters, thus reducing their workload. The upgrade components are shown in figure 1. The new key system components are the Nightlies Oracle Database and the Nightlies Web Server.

Figure 1. Components of the ATLAS Nightly Build System upgrade.

The Nightlies Database stores nightly job data and serves as a mediator between the Nightly System and other ATLAS systems. The upgraded Nightly System uses the database data internally, in particular for job synchronization. Database-driven dynamic user interfaces replace the collection of web pages generated by previous NICOS versions.

The new ATLAS Nightly web server is an Apache server managed by CERN IT. It is powered by the PanDA Web Platform [12], which supports Python plugins capable of accessing data and of generating and publishing both web content and user interfaces. The Platform data layer provides the means to communicate with Oracle databases.

4. ATLAS Nightlies Database

The ATLAS Nightlies Database resides in the ATLAS database production cluster dedicated to offline analysis (ATLR) [13]. It relies on the Oracle RDBMS (Relational Database Management System) and is supported by the CERN IT-DB group. This operational database holds:

- the status of all stages of nightly jobs;
- the results of tests and compilations;
- statistical information (e.g. the number of packages in a release);
- properties of nightly releases (e.g. package tags).

The ATLAS Nightlies Database is the source of dynamic content for the Nightlies Web Server. The data retention period is 12 months, allowing access to historical information well beyond the lifetime of nightly releases on the distributed file systems.

The expected data volume is about 10-20 GB per year. The table partitioning is designed around the frequent access to recent nightly information: an active software developer can make several thousand nightly database queries daily when browsing the dynamic web pages. Parent tables are configured to use range partitioning, while reference partitioning is used for the child tables that host the nightly job attributes. Reference partitioning ensures that the tables are partitioned in a uniform way, and it enhances the manageability of the child tables because all partition maintenance operations on the parent table automatically cascade to the child tables.
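The range/reference partitioning scheme can be sketched as follows. This is a minimal illustration under stated assumptions: the table and column names are hypothetical, not the actual Nightlies Database schema, and it assumes the cx_Oracle client library with placeholder credentials.

    # Illustrative sketch of a range-partitioned parent table and a
    # reference-partitioned child table (hypothetical schema, not the
    # actual Nightlies Database DDL).
    import cx_Oracle

    conn = cx_Oracle.connect("nightly_admin/secret@ATLR")  # placeholder credentials
    cur = conn.cursor()

    # Parent table: range-partitioned by job start time.
    cur.execute("""
        CREATE TABLE nightly_jobs (
            job_id   NUMBER PRIMARY KEY,
            branch   VARCHAR2(64),
            started  DATE,
            status   VARCHAR2(16)
        )
        PARTITION BY RANGE (started) (
            PARTITION p2013q1 VALUES LESS THAN (DATE '2013-04-01'),
            PARTITION p2013q2 VALUES LESS THAN (DATE '2013-07-01')
        )""")

    # Child table: reference-partitioned through the NOT NULL foreign key, so
    # partition maintenance on nightly_jobs cascades to it automatically.
    cur.execute("""
        CREATE TABLE nightly_results (
            result_id NUMBER PRIMARY KEY,
            job_id    NUMBER NOT NULL,
            package   VARCHAR2(128),
            outcome   VARCHAR2(16),
            CONSTRAINT fk_job FOREIGN KEY (job_id) REFERENCES nightly_jobs (job_id)
        )
        PARTITION BY REFERENCE (fk_job)""")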

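A typical dynamic-content query for the Nightlies Web Server might then look like the following sketch, which pulls recent job records and serializes them to JSON for a web page. It reuses the hypothetical schema from the previous example.

    # Sketch of a dynamic-content query (hypothetical schema as above).
    import json
    import cx_Oracle

    def recent_jobs(days=7):
        """Return recent nightly job records as JSON for a dynamic web page."""
        conn = cx_Oracle.connect("nightly_reader/secret@ATLR")  # placeholder
        try:
            cur = conn.cursor()
            cur.execute(
                """SELECT branch, status, started
                   FROM nightly_jobs
                   WHERE started > SYSDATE - :days""",
                days=days)
            return json.dumps(
                [{"branch": b, "status": s, "started": str(t)}
                 for b, s, t in cur])
        finally:
            conn.close()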
5. ATLAS Nightlies web interfaces

The PanDA Web Platform [12] alleviates the maintenance of web servers, is powered by the jQuery library [14], is easily extensible, and integrates well with external monitoring tools and components. User interfaces are generated by the Platform from Python modules backed by JavaScript frontends. Each module:

- provides access to databases;
- publishes the generated content in JSON format;
- allows JavaScript rendering functions to be defined.

The jQuery-based ThemeRoller web application [15] provides web theme designs with a consistent look and feel. The PanDA Web Platform allows different interface designs, each customized for a different group of developers, to co-exist. A minimal plugin sketch is given at the end of this section.

The nightly information web interfaces provide comprehensive information about nightly jobs. Overview and detailed results views are available:

- the global page provides the Nightly System status at a glance and points at recent releases in all branches;
- branch summaries show information for a particular nightly branch (for single or multiple platforms);
- nightly release summaries show build, test, and installation data for the projects of a selected release;
- compilation results;
- test results;
- package tags of a particular nightly release (with a comparison to the previous release).

The nightly administrative interfaces require CERN Single Sign-On authentication [16] and use secured HTTPS connections. They facilitate system management and provide services for release coordination:

- stopping or restarting nightly jobs;
- for release coordinators: forms for stable release building requests;
- for offline release shifters: buttons to perform certain shift tasks (e.g. generating the instructions to fulfil a release coordinator request).
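To make the plugin structure concrete, here is a minimal sketch of a Python module in the spirit described above. The class name, the data-layer query interface, and the column names are assumptions chosen for illustration; the actual PanDA Web Platform plugin API is not reproduced here.

    # Minimal sketch of a web-platform plugin module (hypothetical API;
    # not the real PanDA Web Platform interface).
    import json

    class BranchSummary:
        """Publishes a branch summary as JSON for a JavaScript frontend to render."""

        def __init__(self, datalayer):
            # 'datalayer' stands in for the Platform's Oracle data layer.
            self.db = datalayer

        def content(self, branch):
            # Hypothetical data-layer call and schema.
            rows = self.db.query(
                "SELECT release_name, status, n_packages "
                "FROM nightly_jobs WHERE branch = :b", b=branch)
            return json.dumps(
                [{"release": r[0], "status": r[1], "packages": r[2]}
                 for r in rows])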

6. Upgrade phases and long-term plans

The ATLAS Nightly System upgrade is planned during LHC Long Shutdown I in two stages:

- delivery of the key components (the Nightlies Database and Web Server) and of the nightly information web interfaces, together with prototypes of the administrative interfaces (first quarter of 2014);
- delivery of the full range of administrative interfaces (first quarter of 2015).

Probing new nightly build types is also planned:

- Continuous nightlies: release builds are triggered by new submissions to a code repository. This type of nightly provides fast feedback to developers and accelerates development cycles.
- Nightlies on demand: these have no regular release build schedule; instead, a release coordinator is provided with tools to start and re-start nightly jobs as needed.

Long-term plans include the development of a distributed nightly system in which releases are built using GRID resources and then validated on the GRID sites where ATLAS production tasks run.

7. Conclusion

Over the last decade the ATLAS Nightly System has served as a major tool in the ATLAS collaborative software organization and management schemes. The upgraded Nightly System provides the ATLAS community with improved tools for coordinating the development of new software functionality, and paves the way for exploring new computing techniques, compilers, and platforms. It is capable of sustaining an increased number of developers and their testing demands.

8. Acknowledgements

The authors wish to thank the members of the ATLAS Software Infrastructure and Database teams for much valuable advice and many useful discussions. This work was supported by the US Department of Energy and the National Science Foundation.

References

[1] The ATLAS Collaboration 2008 JINST 3 S08003
[2] The ATLAS Collaboration 2011 Letter of Intent for the Phase I Upgrade of the ATLAS Experiment CERN-LHCC-2011-012, http://cdsweb.cern.ch/record/1402470
[3] Undrus A 2012 J. Phys.: Conf. Ser. 396 052070
[4] Undrus A 2003 Proc. Int. Conf. on Computing in High Energy and Nuclear Physics CHEP 03 (La Jolla, USA) eConf C0303241 TUJT006 (e-print hep-ex/0305087)
[5] Luehring F, Obreshkov E, Quarrie D, Rybkine G and Undrus A 2010 J. Phys.: Conf. Ser. 219 042045
[6] Albrand S, Collot J, Fulachier J and Lambert F 2004 The Tag Collector, a tool for ATLAS code release management Proc. Int. Conf. on Computing in High Energy and Nuclear Physics CHEP 04 (Interlaken, Switzerland); Obreshkov E et al 2008 Nucl. Instrum. Meth. A 584 244-51; http://atlastagcollector.in2p3.fr
[7] Arnault C 2000 Proc. Int. Conf. on Computing in High Energy and Nuclear Physics CHEP 00 (Padova, Italy) (Computer Physics Communications vol 140) pp 692-5; Arnault C 2001 Proc. Int. Conf. on Computing in High Energy and Nuclear Physics CHEP 01 (Beijing) pp 8-006; http://www.cmtsite.net
[8] Rybkine G 2012 J. Phys.: Conf. Ser. 396 052063
[9] De Salvo A et al 2012 J. Phys.: Conf. Ser. 396 032030
[10] Undrus A 2004 Proc. Int. Conf. on Computing in High Energy and Nuclear Physics CHEP 04 (Interlaken, Switzerland) CERN 2005-002 pp 521-3
[11] Simmons B, Sherwood P, Ciba K and Richards A 2010 J. Phys.: Conf. Ser. 219 042023
[12] https://twiki.cern.ch/twiki/bin/view/panda/pandaplatform
[13] Dimitrov G, Canali L, Blaszczyk M and Sorokoletov R 2012 J. Phys.: Conf. Ser. 396 052027
[14] http://jquery.com
[15] http://jqueryui.com/themeroller
[16] Ormancey E 2008 J. Phys.: Conf. Ser. 119 082008