The GridWay Approach for Job Submission and Management on Grids

Eduardo Huedo, Rubén S. Montero, Ignacio M. Llorente

Laboratorio de Computación Avanzada, Centro de Astrobiología (INTA - CSIC), associated to NASA Astrobiology Institute
Distributed Systems Architecture & Security Group, Dpto. de Arquitectura de Computadores y Automática, Facultad de Informática (UCM)

Outline
  - Motivation
  - The GridWay Framework
  - Selection
  - Adaptive Job Execution
  - Example: Opportunistic Job Migration
  - Summary
Motivation I

Globus Toolkit
  - Enables secure multiple-domain operation with different resource managers and access policies
  - Globus components:
      - Resource Management (GRAM)
      - Data Management (GridFTP & Replica Catalog)
      - Grid Security Infrastructure (GSI)
      - Information Service (MDS)

User: the questions a user must answer, and the step each one corresponds to
  - Where do I execute my job? (resource selection)
  - What do I need (files, ...)? (resource preparation)
  - How do I execute my job? (job submission)
  - How is my job doing? (monitoring)
  - Can I move my job to a better host? (migration)
  - How do I retrieve job output? (termination)

Motivation II

A Grid is characterized by:
  - High fault rate
      - Network
  - Dynamic cost
      - Time of the day (working / nonworking hours)
      - Resource load (peak / off-peak)
  - Dynamic availability
      - Job cancellation by remote administrator
      - Addition and removal of resources
  - Dynamic load
      - Shared resources
      - Idle hosts become saturated, and vice versa

A job must be able to migrate among grid resources to obtain application performance and fault tolerance.
The GridWay Framework

Project Goal: easy and efficient execution of jobs on heterogeneous and dynamic grids in a submit & forget fashion

Design Guidelines:
  - Easily Adaptable (modular design)
  - Easily Scalable (decentralized architecture)
  - Easily Deployable (user privileges, standard services)
  - Easily Extensible (use of non-standard services)
  - Easily Applicable (ready to use for a wide range of applications)

User Interface:
  - gwps: displays job information and status

      JID  AID  TID  DM         SM        GSM  STIME  ETIME  EXETIME  EXIT  HOST     TEMPLATE
      0    --   --   submitted  prologue  --   --:--  --:--  --:--    --    columba  job_template
      1    --   --   zombie     done      --   27:37  28:07  00:30    0     ursa     job_template
      7    --   --   pending    done      --   --:--  --:--  --:--    --    draco    job_template

  - gwhistory: displays job execution history

      HOST                              RANK  STIME  ETIME  EXETIME  MIGRATION_REASON
      columba.dacya.ucm.es              100   --:--  --:--  --:--    --
      ursa.dacya.ucm.es/jobmanager-grd  50    27:41  27:52  00:11    discovery timeout

  - gwkill: signals a job (kill, stop, resume, reschedule)
  - gwsubmit: submits a job or an array job
  - gwwait: waits for the zombie state of a job (any, all, set)

Client API: allows interaction with each module (DRMAA subset)
The GridWay Architecture

[Figure: GridWay architecture, spanning a Client Host and an Execution Host. Labeled components: Request Manager, Dispatch Manager, Submission Manager, Performance Monitor, Selector (requirements, rank expression, MDS GIIS/GRIS), Submission Agent, Performance Degradation Evaluator, Performance Profile, Job Pool, Job Files (executable, I/O files, checkpoint), GateKeeper, JobManager, and the running jobs. Labeled interactions: GRAM request, GRAM callback, GASS, GridFTP.]

Selector I

The Dispatch Manager passes the job requirements and the rank expression to the selector, which returns the candidate hosts.

Requirements:
    (&(Mds-Computer-isa=sparc) (Mds-Memory-Ram-free256))

Selector output:

    FQDN          stage-rm    exec-rm         rank
    ursa.ucm.es   jobmanager  jobmanager-sge  50
    draco.ucm.es  jobmanager  jobmanager      25

Discovery
  - Globus Monitoring and Discovery Service (MDS)
  - Filtered LDAP search (GIIS)
  - LDAP filter built from:
      - Static information (OS, architecture)
      - User-provided requirements
  - Authorization test

Selection
  - LDAP queries (GRIS)
  - Dynamic information (CPU load, ...)
  - Rank expression:
      - User-provided executable
      - Characterizes the discovered hosts
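The two selector phases (filter by requirements, then order by rank) can be sketched in Python. This is only an illustration under stated assumptions: the `Host` record, its field values, the 256 MB bound, and the load-based rank are hypothetical stand-ins, not GridWay code or real MDS output. The sketch treats the lowest rank as best.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Host:
    """Hypothetical record of what discovery and monitoring might return."""
    fqdn: str
    arch: str           # static information
    free_ram_mb: int    # dynamic information
    cpu_load: float     # dynamic information


def select_host(
    hosts: Iterable[Host],
    requirements: Callable[[Host], bool],
    rank: Callable[[Host], float],
) -> Optional[Host]:
    """Filter hosts by requirements, then pick the one with the lowest rank."""
    candidates = [h for h in hosts if requirements(h)]
    if not candidates:
        return None
    return min(candidates, key=rank)


def sparc_with_memory(host: Host) -> bool:
    # Stand-in for the LDAP requirements filter; the 256 MB bound is an assumption.
    return host.arch == "sparc" and host.free_ram_mb >= 256


def load_based_rank(host: Host) -> float:
    # Stand-in for a user-provided rank executable: less loaded hosts rank lower.
    return host.cpu_load


# Illustrative values only; they are not measurements from the testbed.
HOSTS = [
    Host("ursa.ucm.es", "sparc", 300, 0.4),
    Host("draco.ucm.es", "sparc", 512, 0.7),
    Host("columba.dacya.ucm.es", "i686", 1024, 0.0),
]

best = select_host(HOSTS, sparc_with_memory, load_based_rank)
print(best.fqdn if best else "no suitable host")
```

Passing the requirements and the rank as callables mirrors the slide's split between a user-provided filter and a user-provided rank executable, so either can be swapped without touching the selection logic.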
Selector II

Estimated execution time (lowest is best):

$$\mathrm{Rank} = T_{exe}(h_n, t_n) = T_{cpu}(h_n, t_n) + T_{xfer}(h_n, t_n)$$

Estimated computational time:
  - Computational work already performed
  - Dynamic performance of the host

Estimated file transfer time between:
  - Client host and candidate execution host
      - Job submission and monitoring
      - File staging (executable, input/output files)
  - File server and candidate execution host
      - Input/output files
  - Candidate execution host and current execution host
      - Restart files

Adaptive Job Execution

Job adaptation is achieved by automatic job migration when:
  - A new, better resource is discovered (opportunistic migration)
  - The remote host or its network connection fails
  - The job is cancelled or suspended
  - A performance degradation is detected
  - The requirements of the application change (self-migration)

Migration gain (opportunistic migration and performance degradation):

$$G_m = \frac{\mathrm{Rank}(h_{n-1}, t_{n-1}) - \mathrm{Rank}(h_n, t_n)}{\mathrm{Rank}(h_{n-1}, t_{n-1})} > \text{user threshold}$$
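The rank and the migration gain can be illustrated with a short Python sketch. It assumes a very simple cost model (remaining work divided by host speed, bytes divided by bandwidth); the function names, units, and numbers are hypothetical and are not taken from GridWay.

```python
def estimated_rank(remaining_work: float, host_speed: float,
                   transfer_mb: float, bandwidth_mb_s: float) -> float:
    """Estimated execution time: CPU time plus file transfer time (lowest is best)."""
    t_cpu = remaining_work / host_speed       # assumed model for T_cpu
    t_xfer = transfer_mb / bandwidth_mb_s     # assumed model for T_xfer
    return t_cpu + t_xfer


def migration_gain(rank_current: float, rank_candidate: float) -> float:
    """Relative reduction in estimated execution time if the job migrates."""
    if rank_current <= 0:
        raise ValueError("rank of the current host must be positive")
    return (rank_current - rank_candidate) / rank_current


def should_migrate(rank_current: float, rank_candidate: float,
                   threshold: float) -> bool:
    """Migrate only when the gain exceeds the user-defined threshold."""
    return migration_gain(rank_current, rank_candidate) > threshold


# Early in the run: much work remains, so a faster host pays for the transfer.
early_current = estimated_rank(1000.0, 10.0, 0.0, 5.0)      # 100.0 s, nothing to move
early_candidate = estimated_rank(1000.0, 25.0, 100.0, 5.0)  # 40.0 s CPU + 20.0 s transfer

# Late in the run: little work remains, so the transfer dominates.
late_current = estimated_rank(100.0, 10.0, 0.0, 5.0)        # 10.0 s
late_candidate = estimated_rank(100.0, 25.0, 100.0, 5.0)    # 4.0 s CPU + 20.0 s transfer

print(should_migrate(early_current, early_candidate, threshold=0.2))
print(should_migrate(late_current, late_candidate, threshold=0.2))
```

With these made-up numbers the gain is positive early in the run and negative late in the run, because the restart-file transfer cost stays fixed while the remaining computation shrinks.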
Example: Opportunistic Migration

Experimental testbed:

    Host     Model                Speed   OS         Memory  Network
    ursa     Sun Blade 100        500MHz  Solaris 8  256MB   LAN
    draco    Sun Ultra 1          167MHz  Solaris 8  128MB   LAN
    columba  Intel Pentium MMX    233MHz  Linux 2.4  160MB   LAN
    cepheus  Intel Pentium Pro    200MHz  Linux 2.4  64MB    LAN
    solea    Sun Enterprise 250   296MHz  Solaris 8  256MB   MAN

    Roles in the testbed: client host, execution hosts, file server

Experiment:
  - CFD code that solves the 3D Navier-Stokes equations using an iterative multigrid method
  - The application is initially submitted to draco
  - The job is rescheduled when columba and solea become available at different iterations of the application running on draco

    #Job template
    EXECUTABLE = NS3D.$GW_ARCH
    ARGUMENTS = input
    INPUT= gsftp://cepheus/mesh input
    OUTPUT=profile
    STDOUT=stdout.$GW_JOB_ID
    RESTART_FILE=checkpoint
    REQUIREMENTS=host_req.ldif
    RANK=rank.sh

Example: Opportunistic Migration

[Figure: dynamic ranks of solea and columba at different execution points, with points 1, 2 and 3 marked.]
[Figure: measured execution profile of the application when it is actually migrated at different iterations.]

  1. Migration to solea is profitable until iteration 2 is reached.
  2. From the fourth iteration on, the best host is columba (the nearest one).
  3. From the fifth iteration on, the performance gain is not high enough to compensate for the file transfer overhead.
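The job template in the experiment uses variables such as `$GW_ARCH` and `$GW_JOB_ID`. The Python sketch below shows one way such a `KEY=VALUE` template could be read and its variables expanded. It is an assumption-laden illustration: `parse_template` is a hypothetical helper, and the values given for the two variables are invented for the example; the slides do not describe how GridWay itself processes the template.

```python
from string import Template

JOB_TEMPLATE = """\
#Job template
EXECUTABLE = NS3D.$GW_ARCH
ARGUMENTS = input
INPUT= gsftp://cepheus/mesh input
OUTPUT=profile
STDOUT=stdout.$GW_JOB_ID
RESTART_FILE=checkpoint
REQUIREMENTS=host_req.ldif
RANK=rank.sh
"""


def parse_template(text: str, variables: dict) -> dict:
    """Parse KEY=VALUE lines and expand $NAME variables in the values."""
    job = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        # substitute() raises KeyError if the template uses an unknown variable
        job[key.strip()] = Template(value.strip()).substitute(variables)
    return job


# Hypothetical values: the architecture of the selected host and a job id.
job = parse_template(JOB_TEMPLATE, {"GW_ARCH": "sparc", "GW_JOB_ID": "7"})
print(job["EXECUTABLE"])
print(job["STDOUT"])
```

Expanding the executable name per architecture is what would let the same template run on both the Solaris and the Linux hosts of the testbed, provided a matching binary exists for each.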
Summary

Related Work:
  - Job management within the same administration domain:
      - Condor
      - Load Sharing Facility (LSF)
      - Sun Grid Engine
      - Portable Batch System (PBS)
  - Job management for interconnection of multiple domains:
      - Sun Grid Engine Enterprise Edition
      - Condor Flocking
  - Globus middleware for job management:
      - Condor-G
      - AppLeS
      - Nimrod/G
  - Job adaptation:
      - Cactus Worm
      - GrADS
      - GridWay