Scientific Computing on Emerging Infrastructures using HTCondor

HTCondor Week, 20th May 2015, University of California, San Diego

Scientific Computing
- The LHC probes nature at 10^-17 cm (the weak scale).
- Scientific instruments are becoming more precise, driving processing and analysis needs.
- Software developments reduce processing time; complexity (pile-up) increases resource demand.
- We need large (sets of) resources and must use them efficiently.
- Emerging infrastructures (timing): decouple at the software layer; HTC & HPC.
- The time to process LHC events grows with the number of events and their complexity, so more resources are needed: Run 2 (2015-2018), Run 3 (2020-2023), Run 4 (2025-2028).
- Not one size fits all.

Challenges with LHC Evolutions
LHC at 8 TeV (4th April 2012 to 16th Dec. 2012 = 256 days), with ~10 hr/day of data taking.
Benchmark points (approximate; instantaneous luminosity in cm^-2 s^-1):
1. Run 1 (8 TeV, 2012); Lumi = 5 x 10^33; HLT rate: 350 (base) + 350 Hz; ~6.4 billion events
2. Run 2 (13-14 TeV, 2015); Lumi = 0.7 x 10^34; HLT rate: ?? Hz; ~3.0 billion events
3. Run 2 (13-14 TeV, 2016); Lumi = 1.5 x 10^34; HLT rate: 1084 ± 37 Hz; ~9.9 billion events
4. Run 2 (13-14 TeV, 2017); Lumi = 1.5 x 10^34; HLT rate: 1084 ± 37 Hz; ~9.9 billion events
5. Run 3 (14 TeV, 2019); Lumi = 2.0 x 10^34; HLT rate: 1431 ± 51 Hz; ~13.2 billion events
6. Run 4 (14 TeV, 2025); Lumi = 5.0 x 10^34; HLT rate: ~10 kHz; ~92 billion events

Evolution of Computing Model
GRID:
- Virtual organizations (VOs) of groups of users
- Trusted by sites; policies enforced by site admins
- Provisioning by middleware
- Resource pledges (volume: CPU, disks, etc.)
- Resource limitations (no elasticity)
- Velocity: days/weeks
CLOUD (commercial; see the talk by J. Kinney, AWS):
- Pay-as-you-go
- Security: you decide
- Provisioning by users; empowers users
- Elasticity; near-infinite capacity
- Variety: spot markets
- Velocity: ~3.5 min startup
HYBRIDS:
- We have to live with a combination of GRIDs and CLOUDs.
- We need homogeneity and advanced technologies at the harnessing level. HTCondor?
(GRID: distributed computing with user-level resource access across participating sites.)

Evolution of Computing Model: CDF at FNAL
- CDF used FNAL local resources for its processing and user activities.
- Around 2004, CDF started using distributed computing between FNAL and INFN.
- Homogeneity was achieved at the user access level using HTCondor.
- CDF operated until the end using the hybrid model for simulations and user activities: local and GRID.

Evolution of Late-binding Technology (HTCondor)
CRONUS: A Glide-in Based ATLAS Production Executor, CHEP 07, 2-9 Sept. 2007, Canada. Now: OSG Connect?

Evolution of Late-binding Technology (HTCondor)
Early 2006: a late-binding, glidein-based WMS. Location: Building 32 @ CERN.

Evolution of Late-binding Technology (HTCondor)
- USCMS glidein-based WMS development efforts started (independently) around Dec. 2006.
- Igor Sfiligoi (FNAL) and I worked together, joining CMS in April 2008.
- Integration of various production aspects from the previous-generation glidein WMS.
- First deployment of glideinWMS at UCSD (2008) for user analysis activities:
  - various glideinWMS developments
  - a user analysis framework for CMS using glideinWMS
  - integrated submission infrastructure to ARC and CREAM CEs in glideinWMS
- First production version co-developed and demonstrated during CCRC-08 (FERMILAB-CONF-10-258-CD).

Late-binding Technologies in Scientific Computing
- glideinWMS is currently used in almost all of the CMS production and analysis infrastructure.
- Currently: CMS (glideinWMS), ATLAS (PanDA), LHCb (DIRAC), ALICE (AliEn); all are pilot-based late-binding technologies and widely used.
- Late binding changed the course of LHC computing, a revolutionary new direction: homogeneity, a virtual batch system, and the separation of provisioning from scheduling.
- Wonderful! What is missing?
  - Fault tolerance (manual? pilot waste is not a waste, the rest is Condor's problem)
  - Elasticity (no; use opportunistic resources or ask WLCG for larger future pledges)
  - Multi-tenancy (absolutely, but all users/VOs must use the same Linux OS)
  - Virtualization (no; why do you need it?)
- GRID solutions have an architectural problem (everything is a job): no elastic services and no mechanism for running your own distributed applications.
A minimal sketch of the pilot idea follows.
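To make the late-binding idea concrete, here is a minimal, hypothetical Python sketch of a pilot: it lands on a worker node first, checks the environment, and only then pulls a real payload from the VO's queue. The helpers (environment_ok, fetch_payload) and the demo queue are illustrative placeholders, not glideinWMS, PanDA, or DIRAC APIs.

import subprocess
import time

# Illustrative stand-in for the VO's central task queue.
_demo_queue = [["sh", "-c", "echo payload running on $(hostname)"]]

def environment_ok():
    # A real pilot would validate CVMFS, scratch space, OS flavour, etc.
    return True

def fetch_payload():
    # A real pilot would contact the VO's workload management system here.
    return _demo_queue.pop(0) if _demo_queue else None

def run_pilot(max_idle_cycles=3, idle_sleep=1):
    idle = 0
    while idle < max_idle_cycles:
        if not environment_ok():
            return                      # bad node: only the pilot is wasted, not a user job
        payload = fetch_payload()
        if payload is None:
            idle += 1                   # nothing queued: wait a bit, then retire
            time.sleep(idle_sleep)
            continue
        idle = 0
        subprocess.run(payload)         # late binding: the user job is chosen only now

if __name__ == "__main__":
    run_pilot()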

Computing Challenges with LHC Evolutions

Event (Re)processing: Burst Usage at the 7/8 TeV LHC
- Recent evaluation of processing time, including optimizations in CMSSW.
- Run 1 (B1) = 6.4 billion events at Lumi = 5 x 10^33; time ~15 sec/event.
  - On a 15k-CPU site: ~2.5 months. SDSC (~3k cores) was used for the parked data.
- Without including GEN-SIM: total = 6.4 + 6.4 (MC) = 12.8 billion events.
  - On a 15k-CPU site: ~5 months.
A small arithmetic check follows.
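As a quick sanity check of the figures above, a short Python sketch of the back-of-the-envelope arithmetic; it assumes 100% CPU efficiency and 30-day months, which appears to be what the slide assumes.

def reprocessing_months(n_events, sec_per_event, cores):
    # wall time = events * time-per-event / cores, expressed in 30-day months
    return n_events * sec_per_event / cores / 86400 / 30

print(reprocessing_months(6.4e9, 15, 15_000))    # Run 1 data only: ~2.5 months
print(reprocessing_months(12.8e9, 15, 15_000))   # data + MC:       ~4.9 months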

CMS Resources
Using current (very optimistic) estimates:
- T1 (2015): 300 kHS06, ~37k cores
- T1 (2016): 400 kHS06, ~50k cores
- T1 (2017): 520 kHS06 (?), ~65k cores
B2: Run 2 (13-14 TeV, 2015); Lumi = 0.7 x 10^34; HLT rate: ?? Hz; ~3.0 billion events.
- Total: 3.0 + 4.0 (MC) ~7.0 billion events. Reprocessing time on a 37k-core site at 15 sec/event: ~1.4 months.
B3: Run 2 (13-14 TeV, 2016); Lumi = 1.5 x 10^34; HLT rate: 1084 ± 37 Hz; ~9.9 billion events.
- Total: 9.9 + 9.9 (MC) ~20 billion events. Reprocessing time on a 50k-core site at 30 sec/event: ~4.6 months.
- Total reprocessing time for B2 + B3: ~5.4 months.
B4: Run 2 (13-14 TeV, 2017); Lumi = 1.5 x 10^34; HLT rate: 1084 ± 37 Hz; ~9.9 billion events.
- Total reprocessing time on a 65k-core site for B2 + B3 + B4: ~7.7 months.
Assumptions: MC : data = 1 : 1 in number of events, and no other T1 usage.
In any case, handling such bursts will require being part of a huge pool of resources; Amazon EC2 is a good example.
Note: GPGPUs/co-processors will help, but CMSSW would need CUDA support (a huge software change), otherwise efficiency will be very low. Clustering of GPUs per CPU becomes essential in the workload management system.

Burst Modeling
[Figure: current modeling of burst usage; number of compute resources vs. time (days/months), with bursts from MC production, data reprocessing, simulations, etc.]
- The experiments have a finite number of resources (compute as well as storage).
- Burst usage is currently absorbed through delays (~months) set by the (re)processing capacity.
- Elasticity in the system is really essential.

Elasticity in Scientific Computing
Scalability and elasticity in scientific computing:
- Currently, scalability is defined in terms of owned or opportunistic resources.
- Purchasing/owning hardware involves delays and locks in the hardware technology of that date.
- LHC Run 1 experience strongly suggests the need for elasticity, due to burst (irregular, fast-turnaround) processing needs.
- On-demand provisioning of VMs using Amazon EC2 can help: if the batch system is busy but the cloud is available, expand the batch system into the cloud, and shrink it back when the load drops.
- A similar expansion/shrink into the cloud was demonstrated using custom-made auto-elasticity: S. Padhi, J. Kinney, "Scientific Computing using Amazon", CERN, Feb. 02, 2015.
A minimal sketch of such a policy follows.
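The sketch below illustrates the expand/shrink decision in Python; it is not the custom auto-elasticity implementation referenced above. The jobs-per-worker ratio and worker limits are made-up parameters, and scale-out/scale-in would be backed by the cloud provider's API.

def reconcile(idle_jobs, cloud_workers, jobs_per_worker=8,
              min_workers=0, max_workers=500):
    # Decide how many cloud workers we want for the current queue depth.
    wanted = min(max_workers, max(min_workers, -(-idle_jobs // jobs_per_worker)))
    if wanted > cloud_workers:
        return ("scale_out", wanted - cloud_workers)  # batch system busy: expand into the cloud
    if wanted < cloud_workers:
        return ("scale_in", cloud_workers - wanted)   # queue drained: shrink back
    return ("steady", 0)

print(reconcile(idle_jobs=120, cloud_workers=5))      # -> ('scale_out', 10)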

Multi-tenant Approach: Virtualization
- OS-level provisioning can not only satisfy other experiments' needs:
  - it adds elasticity to the computing system in case of burst needs;
  - one can also add private VMs to the whole central pool for enhanced usage.
- This is not possible via glideinWMS unless all experiments use exactly the same Linux flavor; resource clusterization/partitioning is complicated via glideinWMS.
- The kind of virtualization depends on the need:
  1. Lightweight, OS-level virtualization (vCPUs): containers sharing the host kernel, e.g. LXC (cgroups), Docker, etc.
  2. Hypervisor-based (heavy): emulates virtual hardware on the pCPU, e.g. Hyper-V, KVM, Xen, etc.

Transition from GRID to EC2 Cloud: Hypervisors
[Diagram: a central CONDOR_HOST pool fed both by GRID sites via glideinWMS, where the OS and other items are enforced by the sites, and by AWS EC2 cloud worker nodes.]

Transition from GRID to EC2 Cloud: Hypervisors
CMS analysis workflow using Amazon EC2:
- The scheduler was installed at a non-T1/T2 site (scheduler @ Brown University, thanks to M. Narain!); by default, 0 worker nodes are associated with the site.
- Auto-elasticity based on submitted jobs (computing on demand): a custom-made auto-elasticity implementation expands/shrinks the site to match the submitted jobs. A hedged sketch of this step is shown below.
- Input data reaches the CRAB3 jobs on the worker nodes via xrootd; CMSSW is delivered via CVMFS.
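A hedged sketch of the expand step assumed above: count idle HTCondor jobs with condor_q and start EC2 worker instances with boto3. The AMI ID, instance type, and jobs-per-worker ratio are placeholders, not values from this deployment.

import subprocess
import boto3

def idle_job_count():
    # JobStatus == 1 means Idle in HTCondor; -af prints one attribute per job.
    out = subprocess.run(
        ["condor_q", "-allusers", "-constraint", "JobStatus == 1", "-af", "ClusterId"],
        capture_output=True, text=True, check=True)
    return len(out.stdout.split())

def expand_into_ec2(jobs_per_worker=8, max_new_workers=20):
    n = min(max_new_workers, idle_job_count() // jobs_per_worker)
    if n > 0:
        boto3.client("ec2").run_instances(
            ImageId="ami-xxxxxxxx",       # placeholder worker image (condor startd + CVMFS)
            InstanceType="c4.2xlarge",    # placeholder instance type
            MinCount=n, MaxCount=n)
    return n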

Transition from GRID to EC2 Cloud: Containers
(See the talk by Greg Thain. GRID sites, where the OS and other items are enforced, are not covered today.)
[Diagram: the central CONDOR_HOST pool with containers running on Amazon EC2/ECS worker nodes, alongside the GRID sites.]
- The GRID user is NOT part of the docker group.
- People (Italians?) are interested in pushing this forward to the WLCG.
- HTCondor with CCB is not very docker/container friendly (this can be fixed).
- Google users have found ways to use it within their own containerized clusters.
- This can work: docker run --rm --privileged --net=host -ti spadhi/condorv2
- Amazon ECS works extremely well with user containers (see the talk by J. Kinney); a boto3 sketch follows.
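For the ECS route, a hedged boto3 sketch of launching a handful of containerized worker nodes; the cluster name and task definition are placeholders, and the task definition is assumed to wrap an HTCondor worker image such as the one in the docker command above.

import boto3

def launch_worker_containers(count=4):
    # Start `count` copies of a (placeholder) HTCondor worker task on ECS.
    ecs = boto3.client("ecs")
    return ecs.run_task(cluster="condor-workers",          # placeholder cluster name
                        taskDefinition="condor-worker:1",   # placeholder task definition
                        count=count)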

The New Reality
Applications need to be:
- fault tolerant (withstand failure)
- scalable (automatic load balancing)
- elastic (able to grow and shrink on demand)
- able to leverage modern kernel isolation (cgroups)
- suited to mixed workloads, multi-tenancy, etc.
- virtualized
Scheduling batch jobs is simple; can we schedule services (Hadoop, Squids, PhEDEx, etc.)?
Mesos, a platform for fine-grained resource sharing:
- overcomes static partitioning issues
- lets you build and run distributed systems
- multi-resource scheduling (memory, CPU, disk, etc.)
- supports Docker containers
- isolation between tasks and containers

Novel Architectures: Parallel and Co-processor Frameworks
- Rapid developments in industry toward high-performance GPUs (and also MICs).
- We are likely to be forced into new architectures by industry developments.
- For CMS this will be a very large change (impossible before LS2, hard to say for HL-LHC):
  - development of CMSSW in a dedicated environment such as CUDA or OpenCL;
  - even then we will need both CPUs and parallel processing units.
- Clusterization of compute resources will be important.

Virtualization and Clustering
- Cloud providers use VMs with direct access to the GPU.
- An IO memory management unit handles direct memory access.
- The hypervisor passes PCI devices through to guest VMs upon creation.
- A 1:1 or 1:N mapping of VMs to GPUs is possible.

Summary and Outlook
- Scientific computing is evolving into the cloud era; most likely we will stay with a hybrid model of GRIDs and CLOUDs.
- Enormous developments from public/commercial clouds, pioneered by AWS.
- HTCondor is still evolving with the fast-paced technological evolution.
- HPC is all about timing and parallelism (not part of this talk):
  - Mira at Argonne: 260,000 parallel threads producing 30M W+5-jet events/hour;
  - GeantV vector prototypes use the new VecGeom classes (dynamic threads).
- The next generation of supercomputers will probably be FPGA based:
  - performance, capacity, and optimized price/performance/watt (power);
  - embedded processors bring the FPGA design into the software design;
  - PetaLinux provides embedded Linux solutions on Xilinx (FPGA) processing systems;
  - for triggers or highly streaming computing: a hardware/software scheduler.

CMS AWS Proposal
- Submitted as a CMS IN note and approved by AWS.
- Thanks to the support from USCMS and AWS for this proposal.

CMS Proposal Phase-I: May 2015 - March 2016

CMS Proposal Phase-II: April 2016 - March 2017

Alarms
We need to create alarms for safety reasons.