One Pool To Rule Them All: The CMS HTCondor/glideinWMS Global Pool. D. Mason for CMS Software & Computing


Going to try to give you a picture of the CMS HTCondor/glideinWMS global pool:
- What's the use case, what problem are we trying to solve
- How we're solving it, i.e. the global pool
- Building the thing, and how well it's working
- Obligatory prognostication
2

[Slide: aerial photo of the LHC ring near Geneva, with labels for Lake Geneva, the Geneva Airport, CERN Meyrin, the Alps, the CNGS neutrino beam toward Gran Sasso, and the CMS site.] CMS is a particle physics experiment at the CERN LHC. Millions of protons at very high energy collide every second here. Our detector records the collisions we think are interesting. These are events. 3

We keep billions of independent events like these. Disentangling what happened in EACH event takes ~minutes of CPU. 4

We keep billions of independent events like these. Disentangling what happened in EACH event takes ~minutes of CPU: a high throughput (HT) problem. 5

How We Handled This in the First LHC Run: a strictly tiered infrastructure, with tasks constrained to a class of resource. 6

And then subdivided depending on grid: Analysis Pool, Production Pool, Analysis & Production. 7

LHC Run 1: hard-partitioned resources with multiple submission methods. But in this upcoming run:
- Beam energy is two times higher
- We'll record two times the rate of data
- The machine will collide many more protons together at the same time
- We need many more resources than we did for Run 1: O(100k) cores
- Choose a submission method: HTCondor + glideinWMS (E. Fajardo's talk tomorrow AM)
- Need to pool resources and be flexible about where to run
- Need to be able to rebalance priorities globally
8

The Unifier: glideinWMS. Independent of the underlying batch system of a site, from the VO perspective glideinWMS constructs a uniform HTCondor pool, essential for making a global pool. 9
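As a concrete aside: a minimal sketch, assuming the HTCondor Python bindings, of what "a uniform HTCondor pool" buys you: every glidein shows up as an ordinary startd ad in one collector, so a single query counts slots per site. The collector host name and the GLIDEIN_CMSSite attribute are illustrative assumptions.

```python
import collections
import htcondor  # HTCondor Python bindings

# Hypothetical global pool collector; the real endpoint differs.
collector = htcondor.Collector("cmsgpool-collector.example.org")

# Each glidein pilot advertises a normal startd ad, regardless of the
# batch system running underneath at the site.
slots = collector.query(
    htcondor.AdTypes.Startd,
    projection=["Name", "State", "GLIDEIN_CMSSite"],  # assumed site attribute
)

per_site = collections.Counter(ad.get("GLIDEIN_CMSSite", "unknown") for ad in slots)
for site, count in per_site.most_common():
    print(site, count)
```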

CMS Drivers & Implementation. The analysis and central production use cases rely on agents (ultimately daemons written in Python) collecting, building and submitting jobs.
- CRAB3 collects user jobs and handles job submission, retries, and data stage-out.
- WMAgent handles requests from physics groups for simulation or data reprocessing.
- Agents sit with the schedds; all schedds talk to the common global pool Frontend.
- In the absence of other requirements (site whitelists, memory and number-of-core requirements), a global priority determines who runs first (a submission sketch follows below).
- The Frontend calls up factories to send pilots to the requisite sites. Pilots have a pilot or production role, and use glexec to take on the central production or analysis user's credentials.
10
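For flavor, a hedged sketch of the kind of job such an agent might hand to its schedd: CPU and memory requests plus a site whitelist expressed as a requirement. It assumes the classic HTCondor Python bindings; the DESIRED_Sites/GLIDEIN_CMSSite attribute names, the executable, and the site list are illustrative, not the actual CRAB3/WMAgent internals.

```python
import htcondor

schedd = htcondor.Schedd()  # the local schedd the agent sits next to

# Illustrative only: names and values are assumptions.
job = htcondor.Submit({
    "executable": "run_payload.sh",
    "arguments": "task_1234",
    "request_cpus": "4",
    "request_memory": "8000",  # MB
    "+DESIRED_Sites": '"T1_US_FNAL,T2_CH_CERN"',
    # Only match pilot slots started at one of the whitelisted sites.
    "requirements": "stringListMember(GLIDEIN_CMSSite, DESIRED_Sites)",
})

with schedd.transaction() as txn:
    cluster_id = job.queue(txn, count=10)
print("submitted cluster", cluster_id)
```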

Overall Design Characteristics
- We have agents/schedds distributed at CERN, FNAL, UCSD and UNL, about a dozen active at any given time.
- With the help of OSG Factory Ops we use factories at CERN, FNAL, UCSD and the GOC.
- We've tried as much as possible to configure the glideinWMS components as HA between CERN and FNAL. With the latest glideinWMS all components, Frontend and Collectors, can now run HA (see the failover sketch below). Important when you don't have a 24-hour team at your disposal (FNAL does, used as a last resort) but need 24-hour availability.
- Worked hard to move from the patched-up, custom-built infrastructure of a year ago to all release RPMs, deployed via Puppet. Much easier to scale up when needed, replace failed nodes, and make test instances!
11
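One practical consequence of HA collectors is that clients must tolerate either endpoint being the live one. In practice HTCondor handles this itself when COLLECTOR_HOST is a comma-separated list; the sketch below (made-up host names, assuming the Python bindings) just makes the failover idea explicit.

```python
import htcondor

# Hypothetical HA collector endpoints at CERN and FNAL.
COLLECTORS = ["cmsgpool-cern.example.org", "cmsgpool-fnal.example.org"]

def first_live_collector(hosts):
    """Return a Collector for the first endpoint that answers a query."""
    for host in hosts:
        try:
            coll = htcondor.Collector(host)
            coll.query(htcondor.AdTypes.Collector, projection=["Name"])
            return coll
        except Exception:
            continue  # unreachable or not the active collector; try the next
    raise RuntimeError("no collector reachable")

coll = first_live_collector(COLLECTORS)
ads = coll.query(htcondor.AdTypes.Collector, projection=["Name"])
print("using collector:", ads[0].get("Name") if ads else "unknown")
```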

Global Pool Timeline
~May 2014: Global Pool begins with analysis use case
June-July 2014: Analysis pool scale testing in the CSA14 exercise
Aug. 2014: Begin adding test production jobs to the mix
Sep 2014: 50k test production jobs reached in global pool
Nov 2014: Production officially joins analysis in global pool
Jan 2015: >100k analysis and production jobs reached
Mar 2015: CMS Tier 0 begins flocking to global pool
Production dives into the global pool.
A year ago we suffered from long negotiation cycles, schedd crashes, and Frontends shedding jobs. Certainly thanks to close cooperation with the HTCondor and glideinWMS developers, the global pool now reliably handles ~100k jobs. 12

Global Prioritization. The global priority now allows us to better control use of the resources depending on the need. 13
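Under the hood this is standard HTCondor fair-share machinery: each submitter has an effective priority, and operators can tilt the balance by adjusting priority factors. A hedged sketch, assuming the classic (pre-v10) Python bindings that expose a Negotiator object; the host and submitter names and factor values are invented.

```python
import htcondor

# Locate the global pool's negotiator via a (hypothetical) collector.
coll = htcondor.Collector("cmsgpool-collector.example.org")
neg_ad = coll.locate(htcondor.DaemonTypes.Negotiator)
neg = htcondor.Negotiator(neg_ad)

# A smaller priority factor means a better effective priority, so this
# tilts the pool toward production relative to analysis.
neg.setFactor("production@cms.example", 10.0)
neg.setFactor("analysis@cms.example", 100.0)
```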

Tier 0 and the global pool. In Run 1 the Tier 0, the first line of prompt processing after data leaves the detector, ran in the local LSF queue at CERN. This time around, with the move to the Agile Infrastructure cloud, CMS decided to build a dedicated HTCondor pool for the Tier 0 at CERN.
- The Tier 0 is tied to the data-taking stream, so reliability is key.
- Because of the length of time to provision OpenStack AI resources, we carve off VMs running month-long pilots. There were concerns that during machine down times, lower activity, etc., having large numbers of long-lived idle pilots would have adverse effects on the global pool.
- But also, given the high data rate times more complicated events, we know the CERN resources will not be sufficient to run all the Tier 0 workflows. Overflow processing to Tier 1s, i.e. flock to the global pool.
14

Tier 0 pool to Global Pool Flocking. Allows the Tier 0 to expand out to take advantage of the Tier 1 resources (a configuration sketch follows below).
- Inherently set a high priority for the flocked jobs; there is a ~few-day requirement for T0 job completion.
- Input data is already distributed to the Tier 1 sites by the time the jobs needing it as input run, though xrootd can be used as a fallback to read directly from CERN.
15
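Flocking itself is ordinary HTCondor machinery: the Tier 0 schedd lists the global pool in its FLOCK_TO configuration, and the global pool side must admit it (FLOCK_FROM plus authorization). A tiny, purely illustrative check of the local configuration, assuming the Python bindings; the collector host name is made up.

```python
import htcondor

# FLOCK_TO is the schedd-side knob listing extra pools to flock to.
try:
    flock_to = htcondor.param["FLOCK_TO"]
except KeyError:
    flock_to = ""

targets = [h.strip() for h in flock_to.split(",") if h.strip()]

# Hypothetical global pool collector; substitute the real one.
if any("cmsgpool-collector.example.org" in t for t in targets):
    print("this schedd will flock Tier 0 overflow to the global pool")
else:
    print("FLOCK_TO targets:", targets or "none configured")
```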

Successful Flocking of T0 Jobs. [Plot: T0 cores running at each T1 site versus time, with the CMS milestone marked.] 16

What Next? As the intensity of the LHC increases, the focus will be gaining access to and utilizing additional resources:
- Opportunistic use of spare cycles in OSG
- Making use of allocations at HPC centers, accessed via Bosco and Parrot
- The CMS High Level Trigger farm between LHC fills
- Gambling on getting cheap resources on commercial clouds
- All of this of course you saw in Tony T's talk this morning and Sanjay's talk just a bit ago
- Providing easy and appropriate access to local resources ("my campus cluster")
17

A picture is worth 100k jobs. The fuzzy thing at the left is what the global pool really looks like! http://condor.cse.nd.edu/condor_matrix.cgi (monitoring by Notre Dame, adapted to the global pool by B. Holzman and T. Tiradani). In time for the new physics run, CMS has converted its submission infrastructure into a single coherent global HTCondor/glideinWMS pool. It will allow us to be more flexible, use resources more efficiently, and be better able to exploit the science of LHC Run 2! As we move forward and the machine intensity increases, so will the need for more, and more varied, resources. HTCondor & glideinWMS have more than met the challenge so far! We look forward to continuing to work with the HTCondor & glideinWMS teams to meet the challenges to come! 18

Acknowledgements
- James Letts (UCSD), co CMS Submission Infrastructure lead
- Brian Bockelman (UNL), as Bockelman-at-Large
- FNAL and CERN operations teams: K. Larson, T. Tiradani, A. Malta, F. Khan, A. McCrea, M. Mascheroni, B. Holzman, M. Saiz-Santos, S. Belaforte, D. Hufnagel, V. Verguilov, J. Silva, J. Barcas
- Jeff Dost (UCSD) & the OSG Factory Ops Team
- HTCondor development team
- glideinWMS development team
19