IV WORKSHOP ITALIANO SULLA FISICA DI ATLAS E CMS, BOLOGNA, 23-25/11/2006

The INFN Tier1

L. dell'Agnello 1), D. Bonacorsi 1), A. Chierici 1), M. Donatelli 1), A. Italiano 1), G. Lo Re 1), B. Martelli 1), M. Onofri 1), P. Ricci 1), F. Rosso 1), D. Salomoni 1), V. Sapunenko 1), C. Vistoli 1)

1) INFN-CNAF, Italy

Abstract

A description of the status of the INFN Tier1 is presented, with a particular focus on the services offered to the ATLAS and CMS experiments.

1 Introduction

The INFN Tier1, located at CNAF in Bologna, is a computing facility supporting various experiments besides ATLAS and CMS: among the HEP collaborations, BaBar, CDF, VIRGO and the other LHC experiments also use CNAF as a Tier1. In addition, other smaller HEP collaborations such as AMS, ARGO, PAMELA and MAGIC, as well as non-HEP VOs 1) (e.g. geophysics, biomedicine, etc.), have access to computing resources at CNAF. Access to the CNAF resources is granted on a per-VO basis according to the policies defined by the INFN management (e.g. the amount of disk and tape space and of computing power), while each VO manages its own users, granting (or revoking) their access. A fine-grained authorization scheme is also possible, allowing VOs to assign different priorities to users based on the user's role. In the following sections a brief description of the main Tier1 components (network, farm and storage) is presented.

Figure 1: The LAN infrastructure

2 The network

INFN-CNAF is connected with one 10 gigabit link to GARR, the general purpose network interconnecting all academic and research centers in Italy, and through GARR to the other research networks worldwide.

1) In Grid terminology an experiment is also called a Virtual Organization (VO).

CNAF is also part of the LHC Optical Private Network (OPN), a dedicated network for the LHC experiments, primarily interconnecting each Tier1 to the Tier0 (CERN) for raw data transfer, and possibly to the other Tier1s. In the framework of the OPN, two dedicated 10 gigabit links interconnect CNAF to CERN and to FZK respectively (the path from CNAF to Milano is presently shared between the two connections). It is foreseen that these links will evolve into optical paths.

3 The farm

As of March 2007, the computing farm is composed of nearly 700 machines (also called worker nodes, or WNs), for a total computing power of about 1600 KSI2k 2). All these WNs are dual-processor, 1 rack unit machines, configured, for efficiency reasons, to serve up to 4 concurrent running jobs. As a consequence, the number of defined CPU slots (and hence the maximum number of running jobs) is nearly 2600.

Figure 2: Partial view of the farm
Figure 3: Farm load

A tender for CPUs (for a nominal 800 KSI2k) has been completed and we expect to put the new WNs into operation during Q2 2007.

Job scheduling is centrally managed by a local batch system, LSF 3), in which several queues (at least one for each experiment) are defined. Concurrent access to the farm by different VOs is managed on the basis of a so-called fair-share policy: the share of each VO is statically configured, according to the amount of computing power assigned to the experiment, and is enforced over a (sliding) time window. As a consequence, when all VOs submit jobs to the farm each of them has, at any given time, a number of running jobs equal to its share; but each VO can profit from all the CPU slots of the farm if they are available.

In the near future we plan to upgrade the operating system of the farm to Scientific Linux 4.x, from the present 3.x version, as soon as the gLite middleware for the new version of the operating system (gLite 3.1) is certified. Another change will be the migration to 64-bit software for the WNs supporting it (e.g. Opterons and the newest Intel CPUs). All these changes will be made in agreement with the experiments and in the common framework of LCG/EGEE 4).

2) The quoted figures vary slightly due to hardware maintenance.
3) LSF (Load Sharing Facility) is a product by Platform Computing.
4) For LCG (LHC Computing Grid) see: http://lcg.web.cern.ch/lcg/; for EGEE (Enabling Grids for E-sciencE) see: http://www.eu-egee.org/
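
To illustrate the fair-share behaviour just described, the following Python sketch shows how a set of static per-VO shares can be turned into slot allocations: when every VO has work queued, each receives its proportional fraction of the roughly 2600 slots, while the slots of an idle VO are re-offered to the others. This is only a simplified illustration (the VO names and weights are hypothetical, and the real LSF fair-share also weighs recent usage over the sliding time window), not the scheduler configuration actually used at CNAF.

```python
# Purely illustrative sketch of the fair-share idea (not the real LSF algorithm):
# each VO gets a slice of the slots proportional to its static share; slots that
# a VO cannot use because it has little demand are re-offered to the other VOs.

def allocate_slots(total_slots, shares, demand):
    """shares: VO -> static weight; demand: VO -> number of runnable jobs."""
    allocation = dict.fromkeys(shares, 0)
    unsatisfied = [vo for vo in shares if demand.get(vo, 0) > 0]
    free = total_slots
    while free > 0 and unsatisfied:
        weight_sum = sum(shares[vo] for vo in unsatisfied)
        granted_this_round = 0
        for vo in list(unsatisfied):
            # proportional slice of the slots still free (at least 1, to guarantee progress)
            slice_ = max(1, int(free * shares[vo] / weight_sum))
            grant = min(slice_, demand[vo] - allocation[vo], free - granted_this_round)
            allocation[vo] += grant
            granted_this_round += grant
            if allocation[vo] >= demand[vo]:
                unsatisfied.remove(vo)
        free -= granted_this_round
        if granted_this_round == 0:
            break
    return allocation

if __name__ == "__main__":
    shares = {"atlas": 30, "cms": 30, "lhcb": 15, "alice": 15, "others": 10}  # hypothetical weights
    # All VOs busy: each gets (roughly) its configured fraction of the ~2600 slots.
    print(allocate_slots(2600, shares, {vo: 5000 for vo in shares}))
    # Only CMS has work queued: it can use all the slots.
    print(allocate_slots(2600, shares, {"cms": 5000}))
```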

4 The storage

The storage system at CNAF is based on a Storage Area Network (SAN).

Figure 4: The structure of the Storage Area Network

With the commissioning of the latest tender for disks at the beginning of 2007, the total amount of disk space currently installed at CNAF is about 1 PB raw, with a net usable disk space of about 800 TB 5). All the storage is of FC/SATA type and nearly all of it is organized in a SAN built on 3 interconnected Fibre Channel (FC) switches (see figure 4). The disk space is currently partitioned among 3 distinct storage systems: Xrootd 6), specifically for the BaBar experiment; CASTOR 7), for all other experiments requiring data archiving on tape; and GPFS 8), for pure disk access. A large expansion is foreseen, starting next year, to reach about 3 PB of raw disk space, which will be further increased during the subsequent years.

In the framework of LCG, CNAF, as a Tier-1 site, will have to provide up to three different access types (depending on the experiment) to the storage systems, called Storage Classes (SC):

Disk0-Tape1 (D0T1). The data must be saved on tape and the disk copy is considered only as a temporary buffer (called the staging area) managed by the system, i.e. data are automatically deleted from the staging area when the occupancy is higher than a configurable threshold and the data have been migrated to tape. The size of the staging area is normally of the order of 20% of the tape space.

Disk1-Tape1 (D1T1). The data must be saved both on tape and on disk. For this SC, the sizes of disk space and tape space are by definition identical.

Disk1-Tape0 (D1T0). In this case there is no guaranteed copy on tape and the management of the disk space is delegated to the Virtual Organization (VO) owning the data.

These SCs will be accessed through the Storage Resource Management (SRM) Grid layer 9), which provides an interface allowing applications to manage the underlying storage systems in a transparent way, without having to deal with their peculiarities. At present all the SRM interfaces in production (i.e. CASTOR, dCache, DPM) implement specification v. 1.1, while it is planned that the new generation (v. 2.2) will be deployed during Q3 2007. The natural candidates at CNAF for the implementation of these SCs are CASTOR for D0T1 and D1T1, and GPFS/StoRM 10) for D1T0. Since StoRM only implements SRM specification v. 2.2 (previously 2.1), it is not yet in production, but it will be deployed for pre-production at CNAF during Q2 2007.

5) The difference between raw and net disk space is due to parity disks and hot-spare disks in the RAID sets.
6) Xrootd is developed by SLAC. See: http://xrootd.slac.stanford.edu/
7) CASTOR (CERN Advanced STORage manager) is developed by CERN. See: http://castor.web.cern.ch/castor/
8) GPFS (General Parallel File System) is an IBM product. See: http://www-03.ibm.com/systems/clusters/software/gpfs.html
9) For SRM (Storage Resource Management) see: http://sdm.lbl.gov/srm-wg
10) StoRM (Storage Resource Manager) is an SRM interface for POSIX file systems developed at CNAF. See: http://storm.forge.cnaf.infn.it/
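
As an illustration of the D0T1 semantics described above, where the disk copy is only a cache in front of the tape system, the following Python toy model mimics the kind of clean-up a staging area performs: once the occupancy exceeds a configurable threshold, files already migrated to tape are removed, oldest first, until the occupancy falls back below a target. The watermark values and the StagedFile record are hypothetical; this is not the actual CASTOR garbage collector.

```python
# Toy model of D0T1 staging-area clean-up (not the real CASTOR garbage collector):
# when the buffer is too full, drop the oldest disk copies that are already on tape.

from dataclasses import dataclass

@dataclass
class StagedFile:           # hypothetical record for a file resident in the staging area
    name: str
    size_gb: float
    migrated_to_tape: bool  # only files already migrated may be removed from disk
    last_access: float      # smaller value = accessed longer ago

def clean_staging_area(files, capacity_gb, high_watermark=0.95, low_watermark=0.85):
    """Return the files to delete so that occupancy drops below low_watermark."""
    used = sum(f.size_gb for f in files)
    if used <= high_watermark * capacity_gb:
        return []                                    # below threshold: nothing to do
    target = low_watermark * capacity_gb
    candidates = sorted((f for f in files if f.migrated_to_tape),
                        key=lambda f: f.last_access)  # oldest first
    to_delete = []
    for f in candidates:
        if used <= target:
            break
        to_delete.append(f)
        used -= f.size_gb
    return to_delete

if __name__ == "__main__":
    area = [StagedFile("raw_001", 350, True, 1.0),
            StagedFile("raw_002", 400, True, 2.0),
            StagedFile("raw_003", 240, False, 3.0)]   # not yet on tape: must stay
    print([f.name for f in clean_staging_area(area, capacity_gb=1000)])
```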

4.1 CASTOR: present status and evolution

Presently at CNAF, following the advice of the CERN CASTOR support team, there is only one instance of the CASTOR stager in production for all experiments (LHC and non-LHC). Moreover, at the end of 2006 another stager instance was installed for testing purposes (e.g. tests of SRM v. 2.2 and of new CASTOR releases). A new major release of CASTOR is expected to be available next summer.

An issue still to be addressed is the load on the disk servers due to the contention between WAN ("gridftp") and LAN ("rfio") transfers. The plan is to add more disk servers, with more disk pools, even if the differentiation between the disk buffer for data import from CERN and the disk buffer for access from the WNs is not obvious. An essential task is to monitor performance clearly: for this, the monitoring tool presently in use, Nagios 11), does not seem to be enough. We are currently evaluating Lemon 12), the monitoring tool developed by CERN, which is more integrated with CASTOR. On the other hand, Nagios offers a powerful framework for alarm management, allowing actions to be triggered as a consequence of some event (e.g. disabling a server when it shows problematic behaviour, or sending an alarm SMS).

We are also working on the high availability of the CASTOR services: in this perspective we will move the databases of the stager and of the name service onto an Oracle cluster, while at present they are installed as single instances. Another issue is the support of SLC 4 for the disk servers, needed especially for newer hardware not supported by the currently used operating system (SLC 3.x). From this point of view the most critical point is the release of middleware compatible with the new OS.

Currently, since the migration to SRM v. 2.2 has not yet taken place, we support with CASTOR both the D0T1 and the D1T0 Storage Classes. The CASTOR implementation of the disk-only Storage Class (D1T0) is weak and prone to freezing (especially when the available disk space is low). This is the main reason that pushed us towards StoRM for the D1T0 Storage Class.

11) http://www.nagios.com
12) http://www.cern.ch/lemon/
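
The alarm-handling capability of Nagios mentioned above can be sketched as follows: Nagios can be configured to run an external event-handler command when a check changes state, passing it the host name and the new state. The script below is purely hypothetical (the disable_diskserver and send_sms commands stand for whatever site-specific actions would actually be taken) and is not a tool deployed at CNAF.

```python
#!/usr/bin/env python
# Hypothetical sketch of an event-handler script of the kind Nagios can invoke
# when a service check changes state; not an actual CNAF tool.

import sys
import syslog

def handle_event(host, state, state_type):
    # React only to confirmed ("HARD") failures, as described in the text:
    # take the problematic disk server out of production and raise an alarm.
    if state == "CRITICAL" and state_type == "HARD":
        syslog.syslog("disabling problematic disk server %s" % host)
        # Hypothetical site-specific actions (commands do not exist as such):
        # subprocess.call(["disable_diskserver", host])                        # drain it from the disk pool
        # subprocess.call(["send_sms", "on-call", "disk server %s disabled" % host])

if __name__ == "__main__":
    if len(sys.argv) >= 4:
        handle_event(sys.argv[1], sys.argv[2], sys.argv[3])
```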
During Q2 2007 we will perform SRM 2.2 tests with both CASTOR and StoRM, in order to prepare for the deployment of the production service.

4.2 Tape library evolution

With the present library, a StorageTek L5500 holding up to 1 PB of tape on line, it is not possible to fulfil the requests of the LHC (and also non-LHC) experiments for 2010 (i.e. about 10 PB of data on line). Given the basic constraint of compatibility of the library with CASTOR, we are considering two possible solutions: the Sun SL8500, with up to 10000 slots (currently for 500 GB tapes, 1 TB expected within 18 months) and up to 80 T10000 drives; and the IBM 3584, with up to 7000 slots (currently for 500 GB tapes, 750 GB expected in a few months).
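
A rough back-of-the-envelope estimate, based only on the figures quoted above (number of slots times native tape capacity, ignoring compression and partially filled slots), shows how the two candidates compare with the ~10 PB needed on line by 2010:

```python
# Rough capacity estimate for the two candidate libraries, using only the
# figures quoted in the text (slots x native tape capacity, no compression).

TB = 1.0          # work in terabytes
PB = 1000 * TB

libraries = {
    # name: (slots, current tape capacity, expected next-generation capacity)
    "Sun SL8500": (10000, 0.5 * TB, 1.0 * TB),
    "IBM 3584":   (7000,  0.5 * TB, 0.75 * TB),
}

for name, (slots, cap_now, cap_next) in libraries.items():
    print("%-10s  today: %4.1f PB   with next tapes: %4.1f PB"
          % (name, slots * cap_now / PB, slots * cap_next / PB))

# Sun SL8500:  5.0 PB today, 10.0 PB with 1 TB tapes -> reaches the ~10 PB target.
# IBM 3584:    3.5 PB today,  5.2 PB with 750 GB tapes.
```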

Figure 5: The SL5500 tape library

We plan to have the new library ready for production at the end of 2007. An issue to be solved is the maintenance of the present library after 2008: one possibility is to move the data on 9940B tapes to the newest tapes. Another option, in case the new library is the SL8500, is to move the 9940B drives to the new library.

Figure 6: CMS transfers using FTS

4.3 The File Transfer Service

The FTS 13) manages all data transfers among SRM end-points at the Tier0 (CERN), the Tier1s and the Tier2s in the framework of LCG. All transfers from the Tier0 are managed by the FTS server at CERN, while the FTS server at CNAF manages all transfers with the Tier2s and to the other Tier1s.

13) gLite FTS (File Transfer Service) is developed by EGEE. See: https://twiki.cern.ch/twiki/bin/view/egee/fts
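
The division of responsibilities between FTS instances described above can be summarized with a short sketch. It only restates the convention given in the text (the function and the site names are illustrative; the real FTS relies on explicitly configured transfer channels):

```python
# Simplified illustration of the channel-ownership convention described above
# (real FTS uses explicitly configured channels; site names are examples).

def responsible_fts(source, destination, tier):
    """tier: site -> 0, 1 or 2. Return which FTS instance handles the transfer."""
    if tier[source] == 0 or tier[destination] == 0:
        return "FTS at CERN"           # anything involving the Tier0
    if "CNAF" in (source, destination):
        return "FTS at CNAF"           # CNAF <-> Tier2s and CNAF -> other Tier1s
    return "FTS of the sites involved" # transfers not involving CNAF

if __name__ == "__main__":
    tier = {"CERN": 0, "CNAF": 1, "FZK": 1, "LNL": 2}
    print(responsible_fts("CERN", "CNAF", tier))  # FTS at CERN
    print(responsible_fts("LNL", "CNAF", tier))   # FTS at CNAF
    print(responsible_fts("CNAF", "FZK", tier))   # FTS at CNAF
```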

Presently the FTS service at CNAF consists of a single server (hosting both the web service and the agents), while the Oracle back-end is hosted on ad hoc hardware. This layout will evolve into a production-quality setup with 3 dedicated servers in load balancing/fail-over; the agents will be distributed over the 3 servers (but all agents will be configured on each server) and the Oracle back-end will be moved to an Oracle RAC. Moreover, the present version of FTS is only compatible with SRM v. 1.1: hence this service will also be upgraded when SRM v. 2.2 is put into production. The service has been used both by CMS (through PhEDEx) during CSA06 and SC4, and by ATLAS within DQ. In all these cases problems related to CASTOR have often been observed.

5 Conclusions

During 2007 both the farm power and the storage capacity will be increased, with the addition of about 1500 KSI2k, 200 TB of disk and a new tape library. Moreover, during Q3 2007 the migration to SRM v. 2.2 will be performed and the Storage Class implementation will be finalized. One problem we are facing is the shortage of manpower: presently about 16 FTEs are working at the Tier1, besides dedicated personnel from the experiments. Moreover, most of the crucial duties (e.g. CASTOR, Oracle) are assigned to temporary personnel with short-term contracts. For a better use of the Tier1 and to solve performance issues, a tighter coupling with the experiments is needed!