SPINOSO Vincenzo. Optimization of the job submission and data access in a LHC Tier2


EGI User Forum, Vilnius, 11-14 April 2011. SPINOSO Vincenzo: Optimization of the job submission and data access in a LHC Tier2

Overview
- User needs
- Administration issues
- INFN Bari farm design and deployment
- Storage access optimization
- File system performance
- Performance over the WAN link
- Interactive jobs


User needs
- Grid submission
- Local submission
- Interactive facilities: code development, debugging, analysis with ROOT
- Personal research data: backups, editing
- Efficient I/O when serving analysis jobs: jobs may read from storage at 12 MB/s
- Fast and reliable WAN transfers (SRM, GridFTP, Xrootd)
(The grid and local submission modes are sketched below.)
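A minimal sketch of the first two submission modes, assuming a gLite user interface and a local Torque client; the JDL and script names are illustrative, not the site's actual setup:

    # Grid submission (gLite WMS): describe the job in a JDL file and submit it to the grid
    glite-wms-job-submit -a myjob.jdl

    # Local submission: send the same executable directly to the site batch system (Torque)
    qsub analysis.sh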

Administration issues
- Improving the reliability and efficiency of the services provided
- Sharing and consolidation to avoid duplication of services
- Support for different VOs
- Support for different use cases
- Support for different types of access (grid, local, interactive)

Farm layout (diagram of the INFN Bari farm)


Storage access
- Lustre: POSIX parallel file system
- StoRM: SRM layer on top of Lustre (CMS)
- Xrootd: ALICE production instance, CMS test instance
- Different storage brands
- Different technologies (HW/SW RAIDs, RAID 5/6, FC, external SAS)
(The three access paths are sketched below.)
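A hedged sketch of the three access paths exposed by this stack; the hostnames, ports and paths are illustrative, not the actual Bari endpoints:

    # POSIX access through the Lustre mount (local batch and interactive jobs)
    ls /lustre/cms/store/user/someuser/

    # SRM access through StoRM, e.g. with the gLite data management client
    lcg-ls -b -D srmv2 "srm://storm.ba.infn.it:8444/srm/managerv2?SFN=/cms/store/user/someuser/"

    # Xrootd access, e.g. copying one file from the Xrootd server
    xrdcp root://xrootd.ba.infn.it//store/user/someuser/file.root /tmp/file.root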

Storage pre-production (Lustre)
- 5 disk servers, 4x 1 Gbps network each
- 190 TB
- ~600 concurrent jobs
- Result: 400 MB/s read/write

Storage in production
- CMS JobRobot efficiency is 95%

Storage in production
- 250 TB used, 10 servers
- 800 concurrent jobs, real ROOT analysis
- Result: up to 1.3 GB/s (max)

Storage in production
- 500 TB in production, 15 servers
- Real user activity
- Result: concurrent reads up to 2 GB/s (max)

Storage in production
- 650 TB in production, 20 servers
- Real user activity
- Result: concurrent reads up to 2 Gbps on average

CMS feedback from the grid
- I/O performance tests (L. Sala): CMS job walltime and CMSSW_CpuPercentage (UserTime/WallTime)
- Feedback: CPU efficiency greatly improved, total execution time decreased


2 Gbps WAN link at Bari (network diagram)

Download from T1/T2: 173 MB/s

Download from T2: 145 MB/s

Upload to FNAL (Bari → FNAL): 237 MB/s

Xrootd tests
- ~50 jobs running at Trieste (CMS T3) read data stored at Bari (remote access using Xrootd)
- Bari → Trieste traffic shows spikes up to 1 Gbps


Interactive jobs: why
Classic interactive cluster issues:
- maintenance (ad-hoc configuration, consistency)
- scalability
- performance degradation under heavy load
- different requirements from different use cases (even when they come from the same VO)
Interactive access through interactive jobs:
- The interactive submission is similar to the batch submission: the batch manager chooses one CPU to execute the job and returns an interactive shell
- The user keeps that CPU until releasing the interactive job (logout)
- Maintenance: a single cluster provides both batch and interactive access; the environment is the same, so there are no consistency issues
- Scalability: the interactive cluster can grow dynamically, depending on user requests
- Performance: one CPU per user, so users never share the same core

Interactive jobs: how
- Interactive jobs are provided natively by Torque (LSF offers them as well)
- The Maui configuration is tuned to give high priority to interactive jobs
- A simple custom daemon guarantees that the user waits at most 60 seconds to get interactive access
- Interactive jobs can be logged out and re-logged-in later, using screen
- No hard limit on the number of concurrent interactive sessions
- Multi-CPU interactive jobs are also possible: the user can ask for n nodes and m processors per node (see the sketch below)
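A minimal sketch of such a setup, assuming a Torque/Maui installation with a dedicated "interactive" queue; the queue name, priority value and node counts are illustrative, not the actual Bari configuration:

    # maui.cfg (illustrative): give jobs in the "interactive" class a large priority boost
    CLASSCFG[interactive] PRIORITY=100000

    # Single-core interactive job (Torque): the prompt comes back on a worker node
    qsub -I -q interactive -l nodes=1:ppn=1

    # Multi-CPU interactive job: n nodes with m processors per node (here 2 nodes, 4 cores each)
    qsub -I -q interactive -l nodes=2:ppn=4

    # Run the session inside screen so it survives a logout and can be re-attached later
    screen -S mysession    # start a named screen session (then run qsub -I inside it)
    screen -r mysession    # later: re-attach to the running interactive session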

Interactive jobs and the file system
GOAL: we wanted
- one file system for both user and global data, for all the VOs on the site
- a file system fast and POSIX compliant, in order to support interactive sessions just like a local filesystem
- a file system shared on all the nodes of the farm, so that both batch and interactive jobs can access the user home directories and the globally available data stored on site
- a solution allowing a warm upgrade of the disk space
CHOICE: a POSIX high-performance cluster file system was preferred: Lustre, with StoRM on top of Lustre to provide the SRM service (see the mount sketch below)
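A minimal sketch of what this choice looks like on a farm node, assuming a Lustre file system named "cmsfs" served by an MGS host mds01.ba.infn.it and mounted under /lustre; all names are illustrative:

    # Mount the shared Lustre file system on every node of the farm (worker nodes,
    # interactive nodes, StoRM/GridFTP servers), so one POSIX namespace is visible everywhere
    mount -t lustre mds01.ba.infn.it@tcp0:/cmsfs /lustre

    # Batch, interactive and grid jobs then all see the same paths, for example:
    ls /lustre/home/someuser      # user home directories
    ls /lustre/cms/store          # globally available experiment data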

Interactive jobs example
1. Access the frontend
2. Get a CPU
3. Use the CPU
4. Release the CPU
5. Release the frontend shell
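As a hedged illustration of the five steps above, assuming a frontend called frontend.ba.infn.it and the Torque "interactive" queue from the previous sketch (all names are illustrative):

    ssh someuser@frontend.ba.infn.it          # 1. access the frontend
    qsub -I -q interactive -l nodes=1:ppn=1   # 2. get a CPU: the shell moves to a worker node
    root -l myanalysis.C                      # 3. use the CPU (e.g. run a ROOT analysis)
    exit                                      # 4. release the CPU (leave the interactive job)
    exit                                      # 5. release the frontend shell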


People involved
- Donvito Giacinto, INFN, Università di Bari
- Spinoso Vincenzo, INFN, Università di Bari
- Maggi Giorgio Pietro, INFN, Politecnico di Bari

References
- Lustre Wiki: http://wiki.lustre.org/index.php/main_page
- StoRM: http://storm.forge.cnaf.infn.it
- Xrootd: http://xrootd.slac.stanford.edu/
- Interactive jobs using qsub: http://www.clusterresources.com