New Features in PanDA. Tadashi Maeno (BNL)


Old Workflow (diagram): production managers define production jobs in the remote task/job repository (Production DB); the submitter (bamboo) gets the job definitions and submits jobs to the PanDA server; end-users submit analysis jobs plus source files through the client tools; pilots on worker nodes at each site get the jobs.

New Workflow (diagram): task requests are defined through ReqIF/DEFT and submitted as tasks to the PanDA server, where JEDI generates jobs from them; end-users submit an analysis task plus source files through the client tools; pilots on worker nodes at each site get the jobs.

Main Changes
- Tasks are submitted to the system instead of jobs (see the data-model sketch below)
  - Task = runs on input datasets to produce output datasets
  - Job = runs on input files to produce output files
- Many client functions (job submission, retry, kill) are moved to the server side
- Built-in retry mechanism at the same site and/or other sites
- Capability for task chaining
- Optimization based on job profiles measured by scout jobs
- Tasks, rather than jobs, are exposed to users
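A minimal data-model sketch of the task/job distinction above; the class and field names are illustrative assumptions, not PanDA's actual schema.

```python
# A minimal data-model sketch of the task/job distinction; class and field
# names are illustrative assumptions, not PanDA's actual schema.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Job:
    input_files: List[str]                                   # a job runs on files...
    output_files: List[str] = field(default_factory=list)    # ...and produces files
    status: str = "defined"

@dataclass
class Task:
    input_datasets: List[str]                                 # a task runs on datasets...
    output_datasets: List[str]                                # ...and produces datasets
    jobs: List[Job] = field(default_factory=list)             # generated server-side by JEDI
    status: str = "registered"
```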

Benefits
- The system works more coherently with the user's perspective: users are interested in tasks rather than jobs
- Task-level workload management: tasks can be retried, killed, or reassigned
- Simplification of client tools and centralization of user functions, giving better maintainability
- Optimal usage of computing resources without detailed knowledge of the grid, lowering the hurdle for users
- Optimized database access to get and provide task information

Main functions
- Scout jobs
- Dynamic job generation
- Automatic retry
- Task chaining
- Prestaging from TAPE
- Merging
- Network-aware brokerage

Scout jobs
- Scout jobs are introduced to measure job profiles before generating the bulk of a task's jobs (the scout-avalanche chain; a sketch follows below)
- If no scout job succeeds, the task is aborted, so users avoid filling up the system with problematic jobs
- Job parameters are optimized using real job profiles, which is more accurate than a priori estimates by users
  - Some users unintentionally submit short jobs to long queues because they don't know beforehand how long their jobs actually take
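A sketch of the scout-avalanche decision, assuming illustrative names (ScoutProfile, process_scout_results) and a simple dict-based task; the real JEDI logic is considerably more involved.

```python
# A sketch of the scout-avalanche decision, assuming illustrative names and a
# dict-based task; the real JEDI logic is considerably more involved.
from dataclasses import dataclass
from statistics import mean

@dataclass
class ScoutProfile:
    succeeded: bool
    wall_time_s: float      # measured wall time
    memory_mb: float        # measured peak memory

def process_scout_results(task, profiles):
    """Abort the task if no scout succeeded; otherwise refine the job
    parameters from measured profiles before releasing the bulk (avalanche)."""
    ok = [p for p in profiles if p.succeeded]
    if not ok:
        task["status"] = "aborted"         # no scout succeeded -> don't flood the system
        return task
    task["wall_time_s"] = mean(p.wall_time_s for p in ok)   # replaces the user's a-priori guess
    task["memory_mb"] = max(p.memory_mb for p in ok)
    task["status"] = "avalanche"           # remaining jobs are generated with refined values
    return task

# Usage: two successful scouts refine the estimates and release the avalanche.
scouts = [ScoutProfile(True, 5400, 1800), ScoutProfile(True, 6000, 2100), ScoutProfile(False, 0, 0)]
print(process_scout_results({"status": "scouting"}, scouts))
```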

Dynamic job generation
- The input for each job is optimized per site/queue, e.g., more input files per job where the scratch disk is larger and/or the walltime limit is longer (sketched below)
- The number of cores and the memory size are considered as well, which is good for MCORE and exotic resources
- The number of input files per job can therefore vary even within the same task
- Tasks can be configured so that all jobs have the same number of input files if necessary
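A toy splitting function illustrating the idea; the file size, seconds per event, and safety factor are assumed inputs for illustration, not values from the slide.

```python
# A toy version of per-queue input splitting; file size, seconds per event,
# and the safety factor are assumed inputs, not values from the slide.
def split_inputs(files, file_size_gb, events_per_file, sec_per_event,
                 queue_scratch_gb, queue_walltime_s, safety=0.8):
    """Pack as many input files per job as the queue's scratch disk and
    walltime limit allow; different queues therefore get different job sizes."""
    max_by_disk = int(queue_scratch_gb * safety / file_size_gb)
    max_by_time = int(queue_walltime_s * safety / (events_per_file * sec_per_event))
    per_job = max(1, min(max_by_disk, max_by_time))
    return [files[i:i + per_job] for i in range(0, len(files), per_job)]

# The same task yields fewer, bigger jobs on a roomier queue.
files = [f"EVNT.{i:06d}.root" for i in range(100)]
small_q = split_inputs(files, 2.0, 1000, 1.0, queue_scratch_gb=20, queue_walltime_s=6 * 3600)
big_q = split_inputs(files, 2.0, 1000, 1.0, queue_scratch_gb=100, queue_walltime_s=48 * 3600)
print(len(small_q), len(big_q))   # e.g. 13 jobs vs. 3 jobs
```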

Automatic retry
- JEDI can automatically retry jobs at the same site or at another site
- The input is dynamically split or combined based on the site/queue configuration; for example, if jobs are reassigned to a site with a longer walltime limit, they are reconfigured to run on more input files (see the sketch below)
- Jobs are therefore no longer atomic; PandaMon shows the ratio of successful to failed inputs and hides retried jobs
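A hedged sketch of retry with re-splitting; the per-file attempt counter and the max_files_per_job queue field are assumptions for illustration, not JEDI's actual bookkeeping.

```python
# A sketch of retry with re-splitting; the attempt counter and the
# "max_files_per_job" queue field are illustrative assumptions.
def replan_retry(failed_files, new_queue, attempts, max_attempts=3):
    """Drop inputs that exhausted their attempts, then re-split the rest for
    the queue chosen for the retry; a queue with a longer walltime limit gets
    more files per job, so job boundaries change between attempts."""
    retryable = []
    for f in failed_files:
        attempts[f] = attempts.get(f, 0) + 1
        if attempts[f] <= max_attempts:
            retryable.append(f)
    per_job = new_queue["max_files_per_job"]   # derived from walltime/disk limits in practice
    return [retryable[i:i + per_job] for i in range(0, len(retryable), per_job)]

# Usage: five failed inputs become two bigger jobs at a queue that allows 3 files per job.
print(replan_retry([f"file{i}" for i in range(5)], {"max_files_per_job": 3}, {}))
```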

Task Chaining
- Task chaining is available
- Example: completion of an evgen task can trigger a G4 simulation task
- The downstream task starts processing before or when the upstream task completes, according to the task definition; e.g., G4 jobs can be generated as soon as new EVNT files become available (illustrated below)
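A minimal sketch of the two chaining modes described above (incremental vs. on completion); the ChainedTask class and event names are hypothetical, not DEFT/JEDI interfaces.

```python
# A minimal sketch of the two chaining modes; ChainedTask and the event names
# are hypothetical, not actual DEFT/JEDI interfaces.
class ChainedTask:
    def __init__(self, name, incremental=False):
        self.name = name
        self.incremental = incremental   # start on new upstream files vs. on upstream completion
        self.inputs = []
        self.started = False

    def add_inputs(self, files):
        self.inputs.extend(files)
        self.started = True

def on_upstream_event(downstream, kind, files=()):
    """Feed the downstream task from upstream progress: incrementally as new
    output files appear, or all at once when the upstream task completes."""
    if downstream.incremental and kind == "new_output_files":
        downstream.add_inputs(files)     # e.g., G4 jobs generated per new EVNT file
    elif kind == "task_finished":
        downstream.add_inputs(files)

# Usage: an evgen task streams EVNT files into a G4 simulation task as they appear.
g4 = ChainedTask("G4 simulation", incremental=True)
on_upstream_event(g4, "new_output_files", ["EVNT.000001.root"])
print(g4.inputs, g4.started)
```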

Prestaging from TAPE
- When the inputs are available on TAPE, subscriptions/rules are made in DDM and the jobs are activated once callbacks are received (see the sketch below)
- The same machinery has been used for production and is now used for analysis as well, which means more TAPE usage
- Separate shares for production and analysis will be used in the future so that the two activities do not interfere with each other
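A hedged sketch of the subscription-and-callback handshake; the helper names and the mocked DDM rule call are assumptions, not the real DDM/Rucio API.

```python
# A sketch of the subscription-and-callback handshake; the helper names and
# the mocked DDM rule call are assumptions, not the real DDM/Rucio API.
pending = {}   # dataset -> jobs parked until the tape-staging callback arrives

def request_prestage(dataset, jobs, make_ddm_rule):
    """Create a DDM subscription/rule to stage the dataset off TAPE and park
    the jobs until the callback is received."""
    make_ddm_rule(dataset, destination="DATADISK")
    pending[dataset] = jobs

def on_ddm_callback(dataset):
    """Callback from DDM once the data is on disk: activate the waiting jobs."""
    for job in pending.pop(dataset, []):
        job["status"] = "activated"

# Usage with a mocked rule creator.
jobs = [{"id": 1, "status": "waiting"}, {"id": 2, "status": "waiting"}]
request_prestage("mc15.evgen.dataset", jobs, make_ddm_rule=lambda d, destination: None)
on_ddm_callback("mc15.evgen.dataset")
print(jobs)   # both jobs activated
```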

Merging
- JEDI can merge files within a task
  - To merge files at a T2 before they are transferred back to the T1
  - To merge users' output files
- Jobs go to the merging state; files actually start being merged once all jobs at the site have finished or failed (sketched below)
- It is also possible to have a separate task to merge files, e.g., super merge and log merge
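An illustrative check for when merge jobs can be released; the job-status names and the one-merge-job-per-site policy are simplifying assumptions.

```python
# An illustrative check for when merge jobs can be released; job-status names
# and the one-merge-job-per-site policy are simplifying assumptions.
def ready_to_merge(jobs_at_site):
    """Merging only starts once every job at the site is in a terminal state."""
    return all(j["status"] in ("finished", "failed") for j in jobs_at_site)

def release_merge_jobs(jobs_at_site, outputs_of):
    if not ready_to_merge(jobs_at_site):
        return []                                  # stay in the "merging" state and wait
    files = [f for j in jobs_at_site if j["status"] == "finished" for f in outputs_of(j)]
    return [{"type": "merge", "inputs": files}]    # one merge job per site in this toy model

# Usage with trivial per-job outputs.
site_jobs = [{"id": 1, "status": "finished"}, {"id": 2, "status": "failed"}]
print(release_merge_jobs(site_jobs, outputs_of=lambda j: [f"out.{j['id']}.root"]))
```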

Network-aware brokerage
- JEDI brokerage is aware of network performance as measured by the cost matrix (currently only for analysis)
- Example: when siteA has free CPUs and a good network connection to siteB, where the input data is available, jobs can be sent to siteA even though siteA does not hold the input data; the jobs run at siteA and remotely read the data from siteB
- Jobs can now go to both siteA and siteB, whereas in the old system they went only to siteB, so users can use more CPU resources (a toy selection function is sketched below)
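A toy site-selection function showing how a network cost threshold widens the candidate list; the cost values, threshold, and data structures are assumptions, not the actual cost-matrix format.

```python
# A toy sketch of network-aware site selection; cost values, threshold, and
# data structures are assumptions, not the actual cost-matrix format.
def candidate_sites(data_site, sites, cost_matrix, max_cost=0.5):
    """Keep sites that either hold the data or have a cheap enough network
    path to the site that does, then prefer the one with the most free CPUs."""
    good = [s for s in sites
            if s == data_site or cost_matrix.get((s, data_site), 1.0) <= max_cost]
    return sorted(good, key=lambda s: sites[s]["free_cpus"], reverse=True)

sites = {"siteA": {"free_cpus": 800}, "siteB": {"free_cpus": 50}}
cost = {("siteA", "siteB"): 0.2}                 # low cost = good connectivity A -> B
print(candidate_sites("siteB", sites, cost))     # ['siteA', 'siteB'] instead of only siteB
```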

Future Plans
- Improve the brokerage to take the site failure rate into account, e.g., to avoid problematic sites for reattempts
- Use pre-merged files as the final products when merge jobs fail (for analysis); currently pre-merged files are discarded when merge jobs fail