A Practical Approach for a Workflow Management System

Similar documents
Architecture of the WMS

Overview of WMS/LB API

Grid Programming: Concepts and Challenges. Michael Rokitka CSE510B 10/2007

DataGrid D EFINITION OF ARCHITECTURE, TECHNICAL PLAN AND EVALUATION CRITERIA FOR SCHEDULING, RESOURCE MANAGEMENT, SECURITY AND JOB DESCRIPTION

CREAM-WMS Integration

Multiple Broker Support by Grid Portals* Extended Abstract

Advanced Job Submission on the Grid

Gergely Sipos MTA SZTAKI

g-eclipse A Framework for Accessing Grid Infrastructures Nicholas Loulloudes Trainer, University of Cyprus (loulloudes.n_at_cs.ucy.ac.

Interconnect EGEE and CNGRID e-infrastructures

glite Grid Services Overview

LCG-2 and glite Architecture and components

Grid Scheduling Architectures with Globus

Three Fundamental Dimensions of Scientific Workflow Interoperability: Model of Computation, Language, and Execution Environment

Grid services. Enabling Grids for E-sciencE. Dusan Vudragovic Scientific Computing Laboratory Institute of Physics Belgrade, Serbia

WMS overview and Proposal for Job Status

International Collaboration to Extend and Advance Grid Education. glite WMS Workload Management System

User Tools and Languages for Graph-based Grid Workflows

Problemi di schedulazione distribuita su Grid

Grid Computing Middleware. Definitions & functions Middleware components Globus glite

Minutes of WP1 meeting Milan (20 th and 21 st March 2001).

Workflow, Planning and Performance Information, information, information Dr Andrew Stephen M c Gough

BENCHFLOW A FRAMEWORK FOR BENCHMARKING BPMN 2.0 WORKFLOW MANAGEMENT SYSTEMS

WMS Application Program Interface: How to integrate them in your code

DataGrid TECHNICAL PLAN AND EVALUATION CRITERIA FOR THE RESOURCE CO- ALLOCATION FRAMEWORK AND MECHANISMS FOR PARALLEL JOB PARTITIONING

Job submission and management through web services: the experience with the CREAM service

A Login Shell interface for INFN-GRID

Building E-Business Suite Interfaces using BPEL. Asif Hussain Innowave Technology

Grid Compute Resources and Job Management

Grid Compute Resources and Grid Job Management

1Z0-560 Oracle Unified Business Process Management Suite 11g Essentials

PoS(EGICF12-EMITC2)081

Workload Management. Stefano Lacaprara. CMS Physics Week, FNAL, 12/16 April Department of Physics INFN and University of Padova

Dynamic Workflows for Grid Applications

Lesson 5 Web Service Interface Definition (Part II)

The Problem of Grid Scheduling

ISTITUTO NAZIONALE DI FISICA NUCLEARE Sezione di Padova

DIRAC pilot framework and the DIRAC Workload Management System

30 Nov Dec Advanced School in High Performance and GRID Computing Concepts and Applications, ICTP, Trieste, Italy

DataGrid. Document identifier: Date: 24/11/2003. Work package: Partner: Document status. Deliverable identifier:

How to use computing resources at Grid

Petri Nets ~------~ R-ES-O---N-A-N-C-E-I--se-p-te-m--be-r Applications.

CloudBATCH: A Batch Job Queuing System on Clouds with Hadoop and HBase. Chen Zhang Hans De Sterck University of Waterloo

Advanced Topics in Software Engineering (02265) Ekkart Kindler

A Software Developing Environment for Earth System Modeling. Depei Qian Beihang University CScADS Workshop, Snowbird, Utah June 27, 2012

Generalized Document Data Model for Integrating Autonomous Applications

Eclipse Technology Project: g-eclipse

Telecooperation. Application of Subject-oriented Modeling in Automatic Service Composition. Erwin Aitenbichler. Technische Universität Darmstadt

Pegasus Workflow Management System. Gideon Juve. USC Informa3on Sciences Ins3tute

Automatic Parallelization of Sequential C Code

Enterprise Integration with Workflow Management

Understanding StoRM: from introduction to internals

AMGA metadata catalogue system

BOSCO Architecture. Derek Weitzel University of Nebraska Lincoln

Model Driven Engineering (MDE)

Grid Computing Fall 2005 Lecture 5: Grid Architecture and Globus. Gabrielle Allen

Generating Petri Nets from AADL descriptions. Thomas Vergnaud

Tutorial 4: Condor. John Watt, National e-science Centre

ACET s e-research Activities

PoS(ACAT)020. Status and evolution of CRAB. Fabio Farina University and INFN Milano-Bicocca S. Lacaprara INFN Legnaro

Introduction to Grid Infrastructures

Kestrel An XMPP-Based Framework for Many Task Computing Applications

Distributed systems. Distributed Systems Architectures. System types. Objectives. Distributed system characteristics.

GRID COMPANION GUIDE

A Composable Service-Oriented Architecture for Middleware-Independent and Interoperable Grid Job Management

Cloud Computing. Up until now

QosCosGrid Middleware

Bookkeeping and submission tools prototype. L. Tomassetti on behalf of distributed computing group

Dr. Giuliano Taffoni INAF - OATS

Chapter 2 Introduction to the WS-PGRADE/gUSE Science Gateway Framework

Tutorial for CMS Users: Data Analysis on the Grid with CRAB

L3.4. Data Management Techniques. Frederic Desprez Benjamin Isnard Johan Montagnat

glideinwms architecture by Igor Sfiligoi, Jeff Dost (UCSD)

Pegasus WMS Automated Data Management in Shared and Nonshared Environments

Geant4 on Azure using Docker containers

What s new in HTCondor? What s coming? HTCondor Week 2018 Madison, WI -- May 22, 2018

Integration of the guse/ws-pgrade and InSilicoLab portals with DIRAC

Topics of Discussion

The glite middleware. Ariel Garcia KIT

Implementing GRID interoperability

Deploying virtualisation in a production grid

Semantic-enabled CARE Resource Broker (SeCRB) for managing grid and cloud environment

DataGrid. Logging and Bookkeeping Service for the DataGrid. Date: October 8, Workpackage: Document status: Deliverable identifier:

CHAPTER 3 GRID MONITORING AND RESOURCE SELECTION

Overview. Distributed Systems. Distributed Software Architecture Using Middleware. Components of a system are not always held on the same host

High Throughput WAN Data Transfer with Hadoop-based Storage

Independent Software Vendors (ISV) Remote Computing Usage Primer

EGEE and Interoperation

Parallel Computing in EGI

Grid Infrastructure For Collaborative High Performance Scientific Computing

XSEDE High Throughput Computing Use Cases

Implementing Problem Resolution Models in Remedy

Topics in Object-Oriented Design Patterns

The Opal Toolkit. Wrapping Scientific Applications as Web Services

MONTE CARLO SIMULATION FOR RADIOTHERAPY IN A DISTRIBUTED COMPUTING ENVIRONMENT

Task Management Service

Resource Allocation in computational Grids

Testing an Open Source installation and server provisioning tool for the INFN CNAF Tier1 Storage system

Resource Discovery in IoT: Current Trends, Gap Analysis and Future Standardization Aspects

A Next Generation Hypervisor for the Embedded Market. Whitepaper

Transcription:

A Practical Approach for a Workflow Management System Simone Pellegrini, Francesco Giacomini, Antonia Ghiselli INFN Cnaf Viale B. Pichat, 6/2 40127 Bologna {simone.pellegrini francesco.giacomini antonia.ghiselli}@cnaf.infn.it

Outline Workflow Management Systems overview A practical approach for real workflows Implementation issues A case study: JDL to GWorkflowDL conversion

Outline Workflow Management Systems overview A practical approach for real workflows Implementation issues A case study: JDL to GWorkflowDL conversion

Workflow Management Systems Overview A lot of interest around WfMSs exist Thanks to workflows, the processes' business logic can be easily expressed using graphs Appealing for users with limited programming skills However, the lack of a recognized standard causes incompatibility among WfMSs: several languages for workflow description exist: Based on different modeling formalisms (DAGs, Petri Nets, Pi Calculus...);

Problems in Workflow Management The choice of a WfMS usually binds users to a specific workflow language Change of the WfMS has a high cost: Legacy workflows must be rewritten. Interoperability among WfMSs is still an open issue Difficulty to manage large, complex and real life scientific processes

Outline Workflow Management Systems overview A practical approach for real workflows Implementation issues A case study: JDL to GWorkflowDL conversion

A practical approach for real workflows Introducing a WfMS: Petri Nets based Formal semantics Turing complete (deals with Workflow Patterns) Build time analysis tools (reachability, boundedness...) Independent from the underlying Grid middleware Multi language: GWorkflowDL used as internal representation Deals with interoperability

The WfMS Architecture Aims at language independence Layered architecture Workflow Management System Workflow Gateway Workflow Engine Grid Abstraction Layer Grid Middleware/s Execution nodes Storage nodes Language Interoperability JDL, GworkflowDL, BPEL Engine Interoperability Aims at Grid middleware independence

Grid Abstraction Layer Abstracts the basic functions of a Grid providing portability of the Workflow engine over different Grid middlewares Grid Abstraction Layer Dispatcher Data Transfer Dispatcher: Job submission/cancellation Reservation Data Transfer: Move data between Grid nodes Observer: Monitor submitted job status Reservation: Resource reservation Observer

Workflow Gateway Provides language and model converters in order to achieve compatibility with legacy workflows and legacy WfMSs Parser: Extracts the model from a workflow description Workflow Gateway Dag Pi Calculus Model Translator Parser Compiler BPEL glite (classad) JSDL Compiler: Produces a workflow representation using a target language (JDL, GWorkflowDL, BPEL, )

Language Interoperability The Gateway solves some interoperability issues in workflow management: Part of a workflow (or a sub workflow) can be translated and successively delegated to a thirdparty WfMS Legacy workflows can be executed on our WfMS without being rewritten Every process can be expressed in terms of a Petri Net (Turing complete)

The Workflow Engine Our goal is to keep the engine as simple as possible and concentrate on interoperability issues Main characteristics Petri Net base Micro Kernel architecture: aims at modularity and extendibility Distributed

Outline Workflow Management Systems overview A practical approach for real workflows Implementation issues A case study: JDL to GWorkflowDL conversion

EGEE/gLite as a Grid Middleware Choice of EGEE/gLite middleware because of: Services maturity Reliability Large adoption Job management is done by the Workload Management System (WMS) Job monitoring is done by the Logging and Bookkeeping Service (LB)

Project Goal: A WfMS over the WMS The practical outcome of our work is to build a WfMS relying on the WMS + LB Both WMS and LB provide a Web Service interface simplify interaction with Grid services

WfMS Deploy (1/2) WfMS running on a dedicated server: Client sends the workflow description to the server; WfMS server manages workflows execution; Client Workflow running instance Monitoring WFMS Server Workflow Description Grid Submitted Jobs Interface to the Grid is provided by the WMS API.

WfMS Deploy (2/2) WfMS running as a Grid job: Client submits a workflow to the Grid via the WMS and monitors it via the LB The WfMS ends up running on a Grid node The WfMS instance submits workflow tasks to the Grid using the WMS and monitors them via the LB Client Monitoring Workflow Description submit Grid WFMS running instances Taking full advantage from the Grid resources and facilities (e.g. checkpointing)

Outline Workflow Management Systems overview A practical approach for real workflows Implementation issues A case study: JDL to GWorkflowDL conversion

The Job Description Language (JDL) The Job Description Language describes a job to be execution on the Grid The JDL adopted within the glite middleware is based upon Condor's Classified Advertisements (ClassAds) record like structure composed of a finite number of attributes separated by semi colon (;) Attribute = Value;

A JDL DAG Workflow JDL allows workflow (DAG based) definition: [ ] Type = dag ; [... ] nodes = [ father = [... ]; son1 = [... ]; son2 = [... ]; final = [... ]; dependencies = { {father, {son1, son2}}, ]; }; {son1, final}, {son2, final} Dependencies JDL parser + Model extractor son1 DAG Model father final son2 Job

DAGs in glite Many legacy workflows expressed in JDL exist; JDL workflows are managed by the Condor DAG Manager (DAGMan): Acts as a Meta Scheduler for Condor jobs Submits job respecting their inter dependencies In case of job failure, DAGMan continues until it can no longer make progress.

DAG -> Petri Net The current DAG model is very limited, e.g. lack of error handling lack of task types other than computation DAGs can be easily described using Petri Nets a DAG node can be represented by a Petri Net transition the flow of data among DAG nodes is modeled using data tokens

DAG -> Petri Net Petri Net Model DAG Model init father father P1 P2 son1 son2 son1 son2 final DAG to Petri Net Model Converter P3 final P4 end

Petri Net -> GWorkflowDL Petri Net Model init father P1 P2 son1 son2 P3 P4 final end <workflow [...]> <place ID= init /> <place ID= P1 /> <place ID= P2 /> [...] Compiler <place ID= end /> <transition ID= father > <inputplace placeid= init /> <outputplace placeid= P1 /> <outputplace placeid= P2 /> <operation /> </transition> [...] </workflow> DEMO

From abstract to concrete workflow (1/4) The workflow description needs a refinement process in order to perform concrete task operations In the case of the glite middleware task execution is asynchronous The WMS serves job submission request returning an ID that identifies the job in its task queue The ID is used to query (or register for notifications by) the LB service until task termination (or failure)

From abstract to concrete workflow (2/4) Task Execution P1 InputSandbox jdl jobid P1 P1 1 JobExecute P1 2 JobStart P1 3 JobRegister data movement JobExecute Recovery Strategy if(result.fail) result P1 4 P2 P2 P1 5 wait_for_termination data movement if(result.success)

From abstract to concrete workflow (3/4) Waits until the task is done (or failed) using polling: getjobstate() operation of the LB service. wait_for_termination (polling) if(s.job_done s.job_fail) P1 3 jobid N sec getjobstate P1 3 1 s if(s.job_running) P1 4

From abstract to concrete workflow (4/4) Same sub workflow implementation using notifications; wait_for_termination (notify) P1 3 P1 3 jobid result P1 4 N sec do_polling

Conclusions and Future Work Workflow Gateway as a standalone component Other WfMS can easily take advantage from languages conversions customizable target language depending on the underlying WfMS A lightweight WfMS with basic functionalities solves low level aspects of workflow management Investigate WfMSs engine interoperability

Thank You!