AGIS: The ATLAS Grid Information System


Alexey Anisenkov 1, Sergey Belov 2, Alessandro Di Girolamo 3, Stavro Gayazov 1, Alexei Klimentov 4, Danila Oleynik 2, Alexander Senchenko 1, on behalf of the ATLAS Collaboration

1 Budker Institute of Nuclear Physics, Novosibirsk, Russia
2 Joint Institute for Nuclear Research, Laboratory of Information Technology, Dubna, Russia
3 CERN, ES Department, CH-1211 Geneva 23, Switzerland
4 Brookhaven National Laboratory, Upton, NY 11973, USA

Email: Alexey.Anisenkov@cern.ch

ATL-SOFT-PROC-2012-052
18 June 2012

Abstract. ATLAS is a particle physics experiment at the Large Hadron Collider at CERN. The experiment produces petabytes of data annually through simulation production and tens of petabytes of data per year from the detector itself. The ATLAS computing model embraces the Grid paradigm and a high degree of decentralization, with computing resources able to meet the ATLAS requirement of petabyte-scale data operations. In this paper we present the ATLAS Grid Information System (AGIS), designed to integrate configuration and status information about the resources, services and topology of the whole ATLAS Grid needed by ATLAS Distributed Computing applications and services.

1. Introduction

The ATLAS experiment [1] produces petabytes of data per year that need to be distributed to sites worldwide according to the ATLAS computing model [2]. The ATLAS computing model is based on a worldwide Grid computing infrastructure organised as a set of hierarchical tiers. It provides all members of the ATLAS Collaboration with speedy access to all reconstructed data for analysis, and appropriate access to raw data for organised monitoring, calibration and alignment activities.

The root node of this tiered topology is Tier-0, located at CERN itself. After primary data processing at the CERN Tier-0 facility, raw data from the detector and reconstructed data products are replicated to 10 Tier-1 centres (geographically spread across Europe, Asia and North America) according to the ATLAS data distribution policy. In addition to the data taken from the detector, simulated data is produced on Grid resources worldwide, and this data must also be replicated. A Tier-1 and its associated lower tiers are grouped into a cloud.

The ATLAS grid is composed of independent sub-grids:
- European Grid Infrastructure (EGI) [3]
- Open Science Grid (OSG) [4]

These two sub-grids form the Worldwide LHC Computing Grid (WLCG) infrastructure. Figure 1 shows a schematic view of the ATLAS Grid topology.
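For illustration only, the cloud grouping described above could be modelled with a few Python data classes; the class names, fields and example sites below are hypothetical and are not taken from the AGIS code base.

    # Illustrative sketch of the tiered topology: a cloud groups one Tier-1
    # with its associated lower-tier sites (hypothetical names and fields).
    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Site:
        name: str
        tier_level: int            # 0 = Tier-0, 1 = Tier-1, 2 = Tier-2, ...

    @dataclass
    class Cloud:
        name: str                  # e.g. "DE"
        tier1: Site                # the Tier-1 centre anchoring the cloud
        tier2s: List[Site] = field(default_factory=list)

    # Example: a cloud built around one Tier-1 and two associated Tier-2 sites.
    de_cloud = Cloud(name="DE",
                     tier1=Site("FZK-LCG2", 1),
                     tier2s=[Site("DESY-HH", 2), Site("MPPMU", 2)])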

Handling this immense volume of data requires computing services backed by an integrated information system. An information system helps users and high-level services discover the resources available in the grid, together with their static properties and characteristics, in order to decide how to use them. The ATLAS Grid Information System is a general-purpose store of the parameters, configuration and static and dynamic data needed to configure and operate the ATLAS Distributed Computing (ADC) [5,6] systems and services.

Figure 1. Tiers of ATLAS.

2. AGIS Requirements and Motivation

ADC applications and services require a variety of common information, configurations, parameters and quasi-static data originally stored in different sources. Sometimes such static data are simply hard-coded in application programs or spread over various configuration files. The difficulty faced by ADC applications is that ATLAS computing uses a variety of Grid infrastructures which have different information services, application interfaces, communication systems and policies. Therefore, each application has to know the proper information source, data structure formats, application interfaces, security infrastructure and other low-level technical details needed to retrieve specific data. Moreover, each application has to implement its own communication logic to retrieve data from external sources, which produces a lot of code duplication.

To satisfy ADC requirements, a central information system should be designed and implemented to provide a coherent approach to storing and exposing data. Such a central place should be a database-backed system with good scalability and convenient data access interfaces. A centralized information system helps to resolve duplication and inconsistencies between data stored in many external databases, in configuration files and even hard-coded in applications. A common information system also helps to solve synchronization issues, to reduce code duplication and to simplify application code logic. AGIS is designed to cover all these requirements and ATLAS needs.

By introducing its own caching mechanism, AGIS helps to improve the performance of data management. Acting as an intermediate middleware layer between a client and the external information sources, AGIS automatically collects and keeps data up to date, caching the information required by and specific to ATLAS, removing the source as a direct dependency for clients but without duplicating the source information system itself. Using AGIS as a single data repository also improves maintainability and allows new classes of information to be added centrally.

There are many different parameters that should be managed in this way: for example, site availability information, the general ATLAS topology, site downtime information, resource descriptions, data access permissions, and so forth.
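As a rough sketch of the intermediate caching idea described above (the class and parameter names are invented for illustration and do not correspond to the actual AGIS implementation), a client-facing layer might serve cached, ATLAS-specific values and fall back to the external source only when a cached entry has expired:

    # Minimal sketch: clients query one layer, which hides the external source
    # and refreshes its cache only when the stored copy is stale.
    import time

    class CachedInfoService:
        def __init__(self, fetch_from_source, ttl_seconds=3600):
            self._fetch = fetch_from_source   # callable wrapping the external source
            self._ttl = ttl_seconds
            self._cache = {}                   # key -> (timestamp, value)

        def get(self, key):
            entry = self._cache.get(key)
            if entry and time.time() - entry[0] < self._ttl:
                return entry[1]                # serve the cached, ATLAS-specific copy
            value = self._fetch(key)           # otherwise ask the external source
            self._cache[key] = (time.time(), value)
            return value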

AGIS is designed to integrate information about the resources, services and topology of the ATLAS grid infrastructure from various independent sources, including BDII, GOCDB, the ATLAS data management system [7], the ATLAS PanDA workload management system [8] and others.

3. AGIS Architecture

The AGIS architecture is based on the classic client-server computing model. Oracle DBMS was chosen as the database backend. Since the system provides various interfaces (API, web, CLI) to retrieve and manage data, no direct access to the database from the clients is required. AGIS uses the Django framework [9], a high-level web application framework written in Python. The object-relational mapping built into Django allows data to be accessed and manipulated in terms of high-level models, avoiding a direct dependence on the particular relational database system used. Figure 2 shows a schematic view of the AGIS architecture.

There is no full duplication of the external information sources: AGIS extracts only the information needed for ATLAS, stores data objects in a way convenient for ATLAS, and introduces additional object relations required by ADC applications. All interactions with the various information services are hidden. Synchronization of AGIS content with external sources is performed by agents which periodically communicate with the sources via standard interfaces and update the AGIS database content. For some types of information AGIS itself is the primary source, and clients are able to update such information through the API.

A Python API and command line tools further help end users and developers to use the system conveniently. The client API allows users to retrieve data in XML or JSON format; for instance, the whole ATLAS topology can be exported as an XML feed. The WebUI is designed for quickly viewing and changing most of the information stored in AGIS. The interface is aimed at ADC experts and does not require any special software development knowledge.

Figure 2. Schematic view of the AGIS client-server architecture.

To populate the database automatically, a set of cron jobs runs on the main AGIS server. Figure 3 illustrates typical data flow directions within the system while collecting data from external sources.
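As a usage sketch of the JSON retrieval path mentioned above, a client might fetch site descriptions over HTTP and decode them with the standard library; the URL, query parameters and record fields below are purely illustrative and are not the documented AGIS API:

    # Hypothetical client-side query for site information in JSON format.
    import json
    import urllib.request

    url = "https://agis-example.cern.ch/request/site/query/list/?json"  # illustrative endpoint
    with urllib.request.urlopen(url) as response:
        sites = json.load(response)            # assume a list of site records

    for site in sites:
        print(site.get("name"), site.get("cloud"), site.get("state"))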

One of the operational requirements recently addressed to the information system is the possibility to track changes and to fully reproduce the database content from a previously saved state. To meet this demand, the data population approach has been improved by introducing an intermediate level into the data collection procedure. In the new approach, data collected from an external source is saved into temporary storage before actually being inserted into the database. This makes it possible to overwrite specific values and to control the data coming into AGIS. JSON was chosen as the data interchange format. Figure 4 shows a simplified class diagram of the AGIS entities implemented.

Figure 3. Schematic view of data flow in AGIS.

4. Stored Information

There are two types of information stored in AGIS:
- data retrieved from external sources (TiersOfATLAS, GOCDB, BDII, MyOSG, PanDA databases, the GStat source, etc.)
- data managed within AGIS itself

In the current implementation, AGIS stores all information concerning the ATLAS topology: clouds, regional centres and site specifics such as geography, time zone and geographical coordinates. It also stores resource and service information for LFC, FTS, CE, SRM, Squid, Frontier and other services.

Key examples of information stored in AGIS:
- ATLAS topology (clouds, centres, sites and site specifics)
- a site's resources and services, their status and description
- site information and configuration
- data replication sharing and pairing policy
- the list of activities and their properties (Functional Tests, data distribution, etc.)
- global configuration parameters needed by ADC applications
- user-related information (privileges, roles, account info) for ADC applications
- downtime information about ATLAS sites and the list of affected services
- site blacklisting data
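To illustrate the staged population step described earlier in this section, the sketch below shows how a collector agent might write data pulled from an external source into a JSON snapshot before it is applied to the database; the record layout, field names and file name are hypothetical, not the actual AGIS schema:

    # Sketch of staged data collection: the snapshot can be reviewed, and
    # specific values overridden, before the database content is updated.
    import json

    snapshot = {
        "source": "GOCDB",
        "collected": "2012-06-18T12:00:00Z",
        "sites": [
            {
                "name": "EXAMPLE-SITE",
                "cloud": "DE",
                "tier_level": 2,
                "services": ["CE", "SRM", "Squid"],
                "downtimes": [],
                "blacklisted": False,
            }
        ],
    }

    with open("gocdb_snapshot.json", "w") as f:
        json.dump(snapshot, f, indent=2)   # staged copy kept before the DB insert

    # A later step would read the snapshot back, apply any manual overrides,
    # and only then update the AGIS database content.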

Figure 4. AGIS class diagram.

5. Conclusion

AGIS was developed to provide a single portal and information cache for application developers and interactive users, and it is currently in production for the ATLAS experiment. The AGIS project is now at the stage of extending its functionality and improving its reliability. The information stored in the TiersOfATLAS file has been migrated to the AGIS database. Many AGIS services, in particular the API functionality exposing ATLAS data in JSON format, are starting to be actively used in production applications. Web interfaces such as the site downtime calendar and the ATLAS topology viewers are widely used by shifters and data distribution experts.

AGIS is continuously extending its data structures to follow new ADC demands. The recently implemented functionality to handle Frontier and Squid service data and to manage software installation data are examples of the latest requests fulfilled for ADC users. New external information sources will be supported in the near future, so more ADC applications are expected to use AGIS as their main information system. The AGIS design and the basic principles built into its architecture allow its core part to be used by other HEP experiments. Russian Federation sites are considering AGIS as the information system for the RF cloud once two new Tier-1 facilities are commissioned.

References

[1] G. Aad et al. (ATLAS Collaboration), The ATLAS Experiment at the CERN Large Hadron Collider, JINST 3 (2008) S08003.
[2] D. Adams, D. Barberis, C. P. Bee, R. Hawkings, S. Jarp, R. Jones, D. Malon, L. Poggioli, G. Poulard, D. Quarrie et al. (ATLAS Collaboration), The ATLAS Computing Model, ATL-SOFT-2004-007; ATL-COM-SOFT-2004-009; CERN-ATL-COM-SOFT-2004-009; CERN-LHCC-2004-037-G-085, Geneva: CERN, 2005.
[3] European Grid Infrastructure (EGI), http://www.egi.eu/
[4] Open Science Grid (OSG) Web Page, http://www.opensciencegrid.org/
[5] ATLAS Distributed Computing, web page, https://twiki.cern.ch/twiki/bin/view/atlas/atlasdistributedcomputing
[6] A. Anisyonkov, A. Klimentov, D. Krivashin and M. Titov, ATLAS Distributed Computing, XXII International Symposium on Nuclear Electronics & Computing (NEC 2009), Varna, Bulgaria, 7-14 September 2009; Proceedings of the Symposium, E10,11-2010-22, Dubna, 2010, pp. 45-49.
[7] M. Branco et al., Managing ATLAS data on a petabyte-scale with DQ2, J. Phys. Conf. Ser. 119 (2008) 062017.
[8] T. Maeno, PanDA: Distributed production and distributed analysis system for ATLAS, J. Phys. Conf. Ser. 119 (2008) 062036.
[9] Django project web site, http://www.djangoproject.com