RELEASE OF GANGA. Basics and organisation What Ganga should do tomorrow Ganga design What Ganga will do today Next steps

Similar documents
GANGA: a user-grid interface for Atlas and LHCb

Ganga - a job management and optimisation tool. A. Maier CERN

Ganga The Job Submission Tool. WeiLong Ueng

DIRAC Distributed Infrastructure with Remote Agent Control

Grid services. Enabling Grids for E-sciencE. Dusan Vudragovic Scientific Computing Laboratory Institute of Physics Belgrade, Serbia

Recent developments in user-job management with Ganga

PoS(ACAT2010)039. First sights on a non-grid end-user analysis model on Grid Infrastructure. Roberto Santinelli. Fabrizio Furano.

The ARDA project: Grid analysis prototypes of the LHC experiments

DIRAC Distributed Infrastructure with Remote Agent Control

LHCb Computing Strategy

WMS overview and Proposal for Job Status

Andrea Sciabà CERN, Switzerland

Overview of LHCb applications and software environment. Bologna Tutorial, June 2006

Considerations for a grid-based Physics Analysis Facility. Dietrich Liko

Create quick link URLs for a candidate merge Turn off external ID links in candidate profiles... 4

Workload Management. Stefano Lacaprara. CMS Physics Week, FNAL, 12/16 April Department of Physics INFN and University of Padova

A Login Shell interface for INFN-GRID

Edinburgh (ECDF) Update

1. Create a customer account.

UW-ATLAS Experiences with Condor

verapdf Industry supported PDF/A validation

IllustraCve Example of Distributed Analysis in ATLAS Spanish Tier2 and Tier3

Data and Analysis preservation in LHCb

Scientific data processing at global scale The LHC Computing Grid. fabio hernandez

PROOF-Condor integration for ATLAS

Worldwide Production Distributed Data Management at the LHC. Brian Bockelman MSST 2010, 4 May 2010

Submitting your Work using GIT

LHCb Computing Status. Andrei Tsaregorodtsev CPPM

Batch Services at CERN: Status and Future Evolution

WP3 Final Activity Report

DQ2 - Data distribution with DQ2 in Atlas

Tests of PROOF-on-Demand with ATLAS Prodsys2 and first experience with HTTP federation

DIRAC pilot framework and the DIRAC Workload Management System

Belle II - Git migration

ALICE Grid/Analysis Tutorial Exercise-Solutions

AliEn Resource Brokers

Challenges and Evolution of the LHC Production Grid. April 13, 2011 Ian Fisk

pyframe Ryan Reece A light-weight Python framework for analyzing ROOT ntuples in ATLAS University of Pennsylvania

GRID COMPANION GUIDE


where the Web was born Experience of Adding New Architectures to the LCG Production Environment

Grid Compute Resources and Job Management

CHEP 2013 October Amsterdam K De D Golubkov A Klimentov M Potekhin A Vaniachine

The PanDA System in the ATLAS Experiment

Active Directory Synchronisation

Modeling and Simulation with SST and OCCAM

Large Scale Software Building with CMake in ATLAS

What s Next for OpenEdge

Summary of the LHC Computing Review

Benchmarking the ATLAS software through the Kit Validation engine

Protocol Governance Committee Meeting #19 28/06/ h00 Summary Minutes

Configuring the Oracle Network Environment. Copyright 2009, Oracle. All rights reserved.

Grid Code Planner EU Code Modifications GC0100/101/102/104

CMS Grid Computing at TAMU Performance, Monitoring and Current Status of the Brazos Cluster

GAUDI - The Software Architecture and Framework for building LHCb data processing applications. Marco Cattaneo, CERN February 2000

Implementing Problem Resolution Models in Remedy

Installing and running COMSOL 4.3a on a Linux cluster COMSOL. All rights reserved.

30 Nov Dec Advanced School in High Performance and GRID Computing Concepts and Applications, ICTP, Trieste, Italy

Lab 01 How to Survive & Introduction to Git. Web Programming DataLab, CS, NTHU

FRIENDS AND FAMILY TEST IN GENERAL PRACTICE

Technical Qualifications How to book assessments

Grid Computing in SAS 9.4

CSI Lab 02. Tuesday, January 21st

CVS Application. William Jiang

L3.4. Data Management Techniques. Frederic Desprez Benjamin Isnard Johan Montagnat

The evolving role of Tier2s in ATLAS with the new Computing and Data Distribution model

FRIENDS AND FAMILY TEST IN GENERAL PRACTICE

incontact Workforce Management v2 Planner Web Site User Manual

Volunteer Computing at CERN

[workshop welcome graphics]

Lab 08. Command Line and Git

HammerCloud: A Stress Testing System for Distributed Analysis

L25: Modern Compiler Design Exercises

Singularity in CMS. Over a million containers served

Scalable Computing: Practice and Experience Volume 10, Number 4, pp

The Problem of Grid Scheduling

ATLAS Production System in ATLAS Data Challenge 2 Luc Goossens (CERN/EP/ATC) Kaushik De (UTA)

LCG-2 and glite Architecture and components

doconv Documentation Release Jacob Mourelos

SCAP Security Guide Questions / Answers. Ján Lieskovský Contributor WorkShop November 2015

Scientist s User Interface (SUI)

INSPIRE and SPIRES Log File Analysis

Researcher User Manual

ICAT Job Portal. a generic job submission system built on a scientific data catalog. IWSG 2013 ETH, Zurich, Switzerland 3-5 June 2013

Web Portal and Connectivity Guide

What s new in HTCondor? What s coming? HTCondor Week 2018 Madison, WI -- May 22, 2018

hereby recognizes that Timotej Verbovsek has successfully completed the web course 3D Analysis of Surfaces and Features Using ArcGIS 10

CS110/CS119 Introduction to Computing (Java) Bob Wilson S-3-176

The Global Grid and the Local Analysis

eservice The simple way to manage your Ricoh products RICOH eservice User Guide

British Safety Council Centre Portal User Guide

WeBS Core Counts 3 Guide to WeBS Online

Testbed a walk-through

Distributed production managers meeting. Armando Fella on behalf of Italian distributed computing group

Report on the HEPiX Virtualisation Working Group

Grid Computing in SAS 9.2. Second Edition

G E T T I N G S TA R T E D W I T H G I T

AASPI SOFTWARE PARALLELIZATION

Advanced Scripting Techniques

Request Manager User's Guide

Transcription:

K. Harrison CERN, 12th June 2003 RELEASE OF GANGA Basics and organisation What Ganga should do tomorrow Ganga design What Ganga will do today Next steps

GANGA BASICS Ganga is an acronym for Gaudi/Athena and Grid Alliance Short-term aims: Deal with configuring and running Gaudi-based applications Deal with submitting and monitoring jobs to/on distributed (Grid) and local batch systems Longer-term aim: Develop Gaudi services for use in Grid enviroment (enable querying of replica catalogues, enable publication of information for Grid monitoring, etc.) Ganga is being developed as an ATLAS/LHCb common project, with support in UK from GridPP Good possibilities for contributing to LCG Physicist Interface (PI)

PROJECT ORGANISATION Current main contributors to Ganga are: Developers: K.Harrison, W.Lavrijsen, A.Soroko, C.L.Tan Technical direction: P.Mato, C.E.Tull GriPP coordination: N.Brook (handing over soon to G.Patrick), R.W.L.Jones Ganga-related information regularly updated on web site: http://ganga.web.cern.ch/ganga A mailing list has been active since November 2002: project-ganga@cern.ch Usually have telephone meeting at least once every two weeks: details of times are placed on web site, and are circulated to mailing list Presentations of ganga status and plans given at various other meetings of ATLAS, LHCb and GridPP

WHAT GANGA SHOULD DO TOMORROW Short-term plans: June-September 2003 Longer-term plans: October 2003 onwards 1) Simplify users lives by providing a single interface for working with all Gaudi-based (offline) applications Short term: graphical user interface (GUI) and command-line interface (CLI) for working with analysis jobs (in particular, DaVinci jobs) Incorporate features from ATLAS Athena Startup Kit, ASK (W.Lavrijsen) Longer term: allow for running of production-type jobs, integrating with existing production systems of LHCb and ATLAS: DIRAC (A.Tsaregorodtsev et al.) and AtCom (L.Goossens et al.)

2) Make workflows/data transformations easy to define, store and instantiate, and supply templates for common use cases Short term: Ganga will provide a simple workflow-definition mechanism of its own Adopt more sophisticated workflow definition procedure, for example based on tools developed in context of DIRAC (G.Kuznetsov et al.), or based on Chimera (R.Gardner et al.) 3) Help with application configuration by providing a job-options editor Short term: allow access to, and modification of, all job options, with possibilities for choosing options for a particular algorithm or user favorites Longer term: give guidance on meaningful values (with input from algorithm developers)

4) Provide a simple, flexible procedure for splitting and closing jobs Short term: introduce splitting/cloning procedure, deal with common use cases, take care of merging of outputs where appropriate/possible Longer term: dependent on user feedback 5) Help users keep track of what they ve done Short term: provide catalogue of jobs and their status, and allow access to settings for each Longer term: dependent on user feedback 6) Perform job monitoring tasks on local and distruted batch systems Short term: pull information from jobs, allowing automatic updates of status and user-initiated queries Longer term: move to system where jobs push information to a user-specified location; integrate with NetLogger, to have detailed information on progress of jobs on the Grid

7) Allow for user mobility Short term: provide a single procedure for submitting jobs to different types of batch systems (EDG, LSF, PBS, other), with the batch system accessible from the machine where Ganga is run Longer term: allow user to submit jobs from any machine with Ganga running, to batch queues on any machine (Gatekeeper) where user has an account or is in the Grid mapfile; take care of software installation at remote nodes, for example building on procedure used in DIRAC or using pacman (S.Youssef) 8) Other things, to be determined by user requests, but should consider possibilities for web-portal interface interactive analysis (based on ROOT) data-management services

Send script Gatekeeper Submit job Request software User running Ganga client Send job output Request software Receive software Send job output Receive software Worker nodes Software server

DESIGN DETAILS User Gaudi/Athena job definition Gaudi/Athena job-options editor Gaudi/Athena job splitting Gaudi/Athena output collection CLI GUI Software Bus PythonRoot GaudiPython Job definition Job registry Script generation Job submission File transfer Job monitoring User has access to functionality of Ganga components both through GUI, and through CLI, layered one over the other above the software bus Components used by Ganga can be divided into three categories: Ganga components of general applicability (to right in diagram) Ganga components providing specialised functionality (to left in diagram) External components (at bottom in diagram)

GANGA COMPONENTS OF GENERAL APPLICABILITY Components potentially have uses outside ATLAS and LHCb Could be of interest for LCG PI project and other escience applications Core component provides classes for job definition, where a job is characterised in terms of: name, workflow, required resources, status Workflow is represented as a sequence of elements (executables, parameters, input/output files, etc.) for which associated actions are implicitely defined Required resources are specified using a generic syntax

Other components perform operations on, for, or using job objects Job-registry component allows for storage and recovery of job information, and allows for job objects to be serialised Scrip-generation component translates a job s workflow into the set of instructions to be executed when the job is run Job-submission component submits workflow script to target batch system, creating JDL (job-description language) file if necessary and translating resource requests as required File-transfer component handles transfer between sites of input and output files, adding appropriate commands to workflow script at submission time Job-monitoring component performs queries of job status

GANGA COMPONENTS PROVIDING SPECIALISED FUNCTIONALITY FOR LHCB AND ATLAS Components incorporate knowledge of the Gaudi/Athena framework Component for Gaudi/Athena job definition adds classes for workflow elements not dealt with by general-purpose job-definition component, for example applications packaged using CMT; component also provides workflow templates covering common tasks Other components provide for job-option editing, job splitting, and output collection

EXTERNAL COMPONENTS Additional functionality obtained using components developed outside of Ganga: Modules of python standard library Non-python components for which appropriate interface has been written, for example Gaudi framework itself and ROOT, BoostPython

WHAT GANGA WILL DO TODAY (OR VERY SOON) Code for Ganga v1r0 is in Gaudi CVS repository Want to carry out a few more checks and do some repackaging, then release during week 16th-20th June Ganga v1r0 includes: GUI Command-line access to underlying tools (but not user oriented) Job-options editor (so far set up only for ATLAS fast simulation) Submission of some types of jobs, including DaVinci jobs, to different batch systems (LSF, PBS, EDG) Mechanism for splitting/cloning jobs Job catalogue Monitoring (made more system friendly since March software week)

Items on which work is in progress, but which aren t ready for v1r0, include: Software bus that adds to functionality of python interpreter User-oriented CLI Pure client submission Enhancement of features already present in v1r0 (generalised job-options editor, treatment of more types of job, etc.) Ganga v1r1 scheduled for release during week beginning 28th July, and should include some of above

GUI (A.Soroko) GUI developed using wxpython User presented with main window, job tree and python prompt Some simplifications made in response to comments at March software week

UNDERLYING TOOLS AND CLI (K.Harrison) Underlying tools currently use the following breakdown: Workflow: a series of WorkSteps, defining the sum of the actions to be performed when the job is run WorkStep: a set of elements providing all information necessary to run one instance of an executable WorkStep/Workflow elements: Command, InputFile, OutputFile, CMTPackage (produces a library), CMTApplication (has an executable associated with it) other elements to follow

Underlying tools can be accessed from Ganga prompt to submit DaVinci job: from GangaComponent import * davincisetup = GangaCommand( source /afs/cern.ch/lhcb/scripts/projectenv.sh DaVinci v8rl ) davinci = GangaCMTApplication( Phys/DaVinci, v8rl, DaVinci.exe, options/davinci.opts ) workstep1 = GangaWorkStep([daVinciSetup, davinci]) workflow = GangaWorkFlow([workStep1]) lsfjob = GangaLSFJob( davincitest,workflow) lsfjob.build() lsfjob.submit()

GUI and CLI simplify use of underlying tools CLI under development will reduce above, adding splitting into sub-jobs, as: job.create( davincitest, DaVinci v8r1 ) job.submit(lsf@cern.ch,20)

JOB-OPTIONS EDITOR (C.L.Tan) Job-options editor has been developed with hard-coded defaults appropriate to ATLAS fast simulation, but will be generalised so that defaults for any Gaudi/Athena application can be read from a file or database Prevents some errors (mis-spelling of options/values, incorrect syntax) Allows definition/manipulation of sequences and lists Editor is option-type aware Drop-down menus for discrete choices Arbitrary value entry for simple options Value append for list-type options

Preferred settings can be saved to file for subsequent reloading Future improvements should include: Expansion of include files Support for python job options Display of favorite options first Display of option values by algorithm

SOFTWARE BUS (W.Lavrijsen) Prototype software bus, PyBus, has been developed as user-level python module, with no privileges over modules Standard python modules corrently handled by PyBus PyBus components treated as modules by python interpreter Allows component to be loaded by logical, functional or actual name First two not necessarily unique: choose on basis of PyBus configuration, priority scheme or user input Performs complete unloading and reloading of components, and allows component replacement Allows components to register their parameters, permitting component configuration

CLONING AND SPLITTING OF JOBS (W.Lavrijsen, A.Soroko, C.E.Tull) Sub-jobs from cloning or splitting a Gaudi/Athena job are near copies of one another, but are distinguished by name and may have a different value for one or more of the job-option parameters Experimenting with generic approach, where a splitting function returns for all sub-jobs the job-option parameters that differ from those of the initial job Splitting functions for common cases should be supplied with Ganga; more specialised splitting functions can be added by the user In the typical DaVinci case, the splitting function should examine the list of input files associated with EventSelector.Input in the job options, and assign some group of files to each sub-job The Ganga job handler dispatches the sub-jobs, and stores information for each in separate directories

CONCLUSIONS A lot of work has been done on Ganga since March software week Code for Ganga v1r0 is in Gaudi CVS repository Release during week 16th-20th June, following tests and some repackaging Ganga v1r0 is not production quality, but is useful for giving a feel of how things should work Ganga team has a well-defined plan of additions and improvements, but would welcome user feedback on what is already implemented, and on priorities for the things that are missing