NAF & NUC reports. Y. Kemp for the NAF admin team, H. Stadie for the NUC. 4th annual Alliance Workshop, Dresden, 2.12.2010

NAF: The Overview Picture and current resources

ATLAS NAF CPU usage
> Current snapshot of ATLAS NAF batch usage (accounting interval 1.8.-29.11.2010): total CPU time 21362 days (~12% of NAF CPUs), total wall clock time 58228 days (~33% of NAF CPUs), nominal ATLAS share ~25%. ATLAS is working on the NAF CPU time fractions.
> Analysis-type jobs are becoming more prominent within the group of power users. The "other" category includes ~100 users. As an example, September 2010: 71 users in total, 28 from DESY/HUB.
(slide provided by M. Barisonzi & W. Ehrenfeld)

NAF usage by CMS
> CMS: Install CMSSW on NAF AFS; adapt submission frameworks to the local batch system; jobs access data on the Tier-2 dCache SE; interactive data analysis with PROOF and Lustre.
> CMS: Additional data sets (160 TB) at DESY. All data very well used by the community, often many users per dataset.
> Tasks performed: (prompt) data analysis, special MC sample production, development of analysis tools, calibration, alignment, CMS Physics Analysis Summaries.
(Extract from CHEP 2010 presentation by Kai Leffhalm)

NAF usage by LHCb & ILC/CALICE
> LHCb: e.g. study of CP violation in the B sector: requires complex maximum-likelihood fits; generates toy MC, very CPU intensive, fast turnaround, short jobs. Most users perform ntuple production. LHCb uses the resources as expected; the NAF is an important pillar of their analysis infrastructure.
> ILC: ILD LoI studies of the impact of machine background on track reconstruction efficiency; fast turn-around time for efficient prototyping; NAF: easy to manage jobs.
> CALICE: GEANT4 validation with AHCAL data; custom MC generation; NAF: work with scripts in a homogeneous environment and keep efficient access to Grid storage.
(Extract from CHEP 2010 presentation by Kai Leffhalm)

NAF Resources well used, need upgrade in 2011!
> Recommended limit: < 75%, peaks up to 90%.
> NAF well used by German institutes; 21% used by DESY scientists.

dCache storage & NAF
> Both ATLAS and CMS have substantially more space in dCache than the T2 MoU pledges: NAF and user space, plus other contributions, e.g. UniHH-CMS. E.g. ATLAS: T2 part 66% used, NAF part 303 TB of 441 TB used.
[Charts: used vs. free space for ATLAS (HH+ZN), T2 pledges 740 TB, and for CMS, T2 pledges 400 TB]
> After observing data taking for about one year now: 1) optimize dCache for speed, 2) optimize dCache for safety and availability of custodial data, 3) optimize dCache usage and data placement for non-T2 data.
> ATLAS-HH dCache on 29.11./30.11.: 1.5 GByte/s to Grid worker nodes, sustained over 6 hours. dCache is THE workhorse for data storage.

Hardware Status
> The NAF is three years old now. We have to start replacing the first hardware; the first replacement is currently ongoing. Newer hardware, more RAM per core, new network technology, and, at the end, more computing power. 10 Gbit infrastructure, and more to come in 2011.
> New additions to dCache storage (quantity & quality).
> Clear commitment from DESY to support the NAF.
> Future purchases planned together with the NUC, taking into account the findings of the Grid Center Review Task Force.

Problems and Issues
> AFS: Problems started around mid July: the whole AFS instance was unavailable for some minutes at a time. Debugging was difficult; we consulted the AFS developers. Main cause: SGE behavior with the NAF job type when starting many jobs at the same time. First countermeasures taken, more to come. User training will start this afternoon.
> Lustre: Many features still not working reliably (e.g. group quotas, ACLs, ...). Maintenance tools to make users' life easier are not yet available (deletion tools, ...). Overall stability has improved, but some hiccups are still seen. Performance reports are unclear; no end-to-end performance investigation has been done. The future of Lustre is unclear in general (Oracle) and at DESY: we are looking for alternatives. The need for such an easy-access large file store is indisputable.

NAF User Committee and User Meeting
> Monthly meetings of the NAF User Committee. Members:
ATLAS: Marcello Barisonzi & Wolfgang Ehrenfeld
CMS: Andreas Nowack & Hartmut Stadie (Chair)
LHCb: Johan Blouw & Alexey Zhelezov
ILC: Steve Aplin & Shaojun Lu
IT: Andreas Gellrich & Kai Leffhalm
> Status reports and discussions with the NAF technical coordinators.
> NAF Users Meeting: see you there!

Random comments collected by the NUC
> The currently available resources, especially CPU in the batch system, could provide good working conditions when all systems are working properly.
> Ongoing problems make an effective and timely data analysis almost impossible.
> dCache user directories are not reliable enough.
> Congested work group servers.
> Slow I/O with dCache (data placement); need more space.
> Add more Lustre space.

NUC: Some words on support
> Support approach, two different paths: problems with central NAF services go to the DESY helpdesk; problems with the experiment infrastructure go to the experiment mailing list (plus a second-level support structure, available to experiment experts directly).
> Challenges: dedicated manpower for central services? Dedicated manpower for experiment support (FSPs)? O(minutes) response time? Analysis with fast turn-around needs a very reliable system (better than the Tier-2 MoUs).


NAF introduction in one minute
> Access to experiment data on the Grid Storage Element.
> CPU cycles for analysis: interactive and local batch. Complements the Grid resources. New techniques like PROOF.
> Additional storage: Lustre parallel file system.
> Home directories on AFS, accessible from everywhere.
http://naf.desy.de/

NAF well used
[Plots: NAF running jobs (2010); NAF CMS tests 2010/11]
> Very peaky behavior: we try to keep overall utilization below 75% and peaks below 90%. Will add hardware in 2011 (starting now).
> Availability and reliability: one of the most important aspects for users and admins. Availability and reliability is ~97.5%, similar to the DESY Grid, but this does not tell the whole story: the 2.5% of failures affect you and your work much more than on the Grid! We want and need to get better! Have a look at the following slides.

Major problems in the past months
> Data on dCache not available, slow transfers. Some problems with dCache file server availability: under investigation / solved. User code sometimes causes a denial of service: e.g. not closing files after reading them keeps them open for the duration of the job; only a certain number of files can be kept open at the same time, so other jobs cannot open files. Slow data transfers can have many different causes: it is known that older ROOT files are written in a way that is bad for efficient reading, and sometimes a file server is simply overloaded. ROOT versions are to be changed by the experiments, and improvements on the dCache side are constantly being made.
> Lustre not working properly. Lustre does not like small files: keep your code / SVN / output files outside of Lustre! We provide AFS scratch volumes for such purposes (see the sketch after this slide). Other users might do harmful operations and affect your speed or even accessibility. To increase stability, Lustre data in HH now goes via TCP/IP instead of InfiniBand. In general, the future of Lustre is unclear (Oracle); DESY is looking into alternatives, but we recognize that there is a need for an easy-access large file store.
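A minimal sketch of the small-files advice, with hypothetical paths (the actual AFS scratch volume and Lustre mount point on the NAF may be named differently):

    # small files (code, SVN working copy, scripts) belong on an AFS scratch volume
    svn co https://svn.example.org/myanalysis/trunk /afs/<naf cell>/user/$USER/scratch/myanalysis
    # large output (ntuples, histograms) belongs on Lustre
    ./runAnalysis --output /scratch/lustre/<group>/$USER/ntuples/out_2010.root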

AFS problems in the past months
> AFS hangs, login impossible, shell frozen, jobs die, ...
> We had severe trouble with the NAF AFS cell in the past months.
> Investigation was very difficult and painful, even after asking the developers for help.
> Patched AFS kernel module: solved some problems.
> It turned out that the major problem is due to interference between SGE and AFS. Similar jobs (e.g. one user submitting many jobs): all STDOUT and STDERR end up in files in the same directory, and these files are created at job start time. If the cluster is rather empty, this can be several hundred jobs: files are created and read simultaneously in the same directory. The file server ensures consistency of the client caches through callbacks; a storm of callbacks between the AFS server and the AFS clients basically paralyzes both the file server and the clients when jobs read the directory with the .e/.o files (see the illustration after this slide).
> We think we finally have solutions / workarounds!
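For illustration only (the job script name and log directory are made up), this is roughly how the situation arises with SGE's default output handling:

    # one user submits a large array job; every task writes its STDOUT/STDERR
    # file into one and the same directory given with -o/-e (here on AFS)
    qsub -t 1-500 -o $HOME/joblogs -e $HOME/joblogs myjob.sh
    # -> $HOME/joblogs/myjob.sh.o<jobid>.<taskid> and myjob.sh.e<jobid>.<taskid>:
    #    hundreds of files, created and read simultaneously in a single AFS directory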

Solutions to the AFS problem: what the NAF can/will do
> Limit the number of jobs per user: ad hoc and drastic measure.
> Throttle the start of jobs: will be implemented soon.
> Possible long-term solution: change the STDOUT/STDERR files with prologue and epilogue methods; write into separate directories.
> ... and we now have a simple recipe for you to help us by defusing your jobs: see the next slide.

Solutions to the AFS problem: what YOU can do
> Change the submission command like this:
    qsub -j y -o /dev/null <other requirements> <your jobscript>
> Have as the very first lines in your job script something like:
    exec > "$TMPDIR/std.out" 2> "$TMPDIR/std.err"
  (this stores the files locally on the worker node; $TMPDIR is unique during job execution, and you can of course add $JOB_ID or $SGE_TASK_ID to the file names)
> At the very end of your job script, copy these files over to some location on AFS, preferably into a subdirectory.
> Any maintainers of CRAB / GANGA / ... here? Can you implement this for all users?
> We will prepare a web page and inform all users soon.
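Putting the recipe together, a minimal job-script sketch; the payload command and the AFS log directory are placeholders, to be adapted to your own setup:

    #!/bin/bash
    # Keep STDOUT/STDERR on the worker node's local disk instead of AFS
    # ($TMPDIR is provided by SGE and is unique per job).
    exec > "$TMPDIR/std.out" 2> "$TMPDIR/std.err"

    # ... your actual analysis payload goes here, e.g.:
    # ./runAnalysis input.root

    # At the very end, copy the logs back to AFS into a per-job subdirectory
    # (add $SGE_TASK_ID to the name for array jobs).
    LOGDIR="$HOME/joblogs/$JOB_ID"   # placeholder location
    mkdir -p "$LOGDIR"
    cp "$TMPDIR/std.out" "$TMPDIR/std.err" "$LOGDIR"/

Submitted as suggested above, e.g. with qsub -j y -o /dev/null myjob.sh, so that SGE itself does not create .o/.e files on AFS.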

Reminder of the NAF support channels
> Got a problem with your experiment setup? naf-[atlas,cms,ilc,lhcb]-support@desy.de
> Got a problem with the NAF fabric (or are not sure where the problem resides)? naf-helpdesk@desy.de
> Experiment supporters: you know the different system experts, and you can use them directly.
> If you think your job causes a problem: we need you to contact us and help us make the NAF better!