QosCosGrid Middleware

PLGrid Plus: Domain-oriented services and resources of Polish Infrastructure for Supporting Computational Science in the European Research Space
QosCosGrid Middleware
Tools and Services for Advanced Job Management, Advance Reservation and Co-allocation of Computing Resources
Tomasz Piontek, Krzysztof Kurowski
Poznan Supercomputing and Networking Center
EGI Community Forum, Helsinki, 19-23.05.2014

Outline
- QCG general information
- QCG Architecture & Components
- Tools for end-users
- Success stories
- Integration with EGI
- Documentation & Repositories
- Contact
- Questions

QosCosGrid middleware
The QosCosGrid (QCG) middleware is an integrated system offering advanced job and resource management capabilities that deliver supercomputer-like performance and structure to end-users. By connecting many distributed computing resources, QCG offers highly efficient mapping, execution and monitoring capabilities for a variety of applications, such as parameter sweeps, workflows, MPI, or hybrid MPI-OpenMP.

QCG functionality
Automatic steering of various types of complex computing experiments:
- Simple tasks
- DAG workflows with recursive conditions and dependencies on any task status
- Multi-dimensional parameter sweep tasks
- Single-cluster parallel tasks (MPI/OpenMP)
- Cross-cluster parallel tasks
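To make the "multi-dimensional parameter sweep" item concrete, the sketch below shows how a sweep over two dimensions expands into one task per combination of values. It is only an illustration of the combinatorics and does not call QCG. The function name `expand_sweep`, the parameter names `temperature` and `pressure`, and their values are hypothetical and do not come from the slides.

```shell
#!/bin/bash
# Hypothetical illustration: expand a two-dimensional parameter sweep
# into one task description per combination of values.
# This does not call QCG; it only shows how the task count grows.

expand_sweep() {
    # $1: space-separated values of the first dimension (assumed name: temperature)
    # $2: space-separated values of the second dimension (assumed name: pressure)
    for t in $1; do
        for p in $2; do
            echo "task temperature=$t pressure=$p"
        done
    done
}

# 3 values x 2 values -> 6 independent tasks
expand_sweep "280 300 320" "1 2"
```

Each printed line stands for one independent task that a broker could schedule on its own. Adding a third dimension multiplies the task count again, which is why automatic steering of such experiments matters.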

QCG Functionality
- Advance reservation capabilities
- Quality of Service
- Co-allocation of resources and cross-cluster
- Support for interactive tasks
- Possibility to connect to a running task with an interactive session
- Task status and progress notifications

QCG Architecture

QCG-Computing
- Deployed on access nodes of the batch systems (SGE, Slurm, Torque/Maui, LoadLeveler, PBS Pro, Condor, Apple Xgrid)
- Provides remote access to task submission and advance reservation capabilities of the LRMS via the DRMAA interface
- Compatible with the OGF HPC Basic Profile specification (JSDL and BES)
- Offers basic file transfer mechanisms

QCG-Notification
- Supports the topic-based publish/subscribe pattern for asynchronous message exchange
- Serves as the main message bus between the services, applications and the end-user
- Can send notifications over a variety of transport mechanisms, including SOAP, SMTP and, as a unique feature, the XMPP protocol
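The topic-based publish/subscribe pattern mentioned above can be sketched as a small in-process model: subscribers register a handler for a topic, and a published message reaches only the handlers registered for that topic. This is not the QCG-Notification API. The topic names (`task.status`, `task.progress`) and the handler functions `send_mail` and `send_xmpp` are hypothetical stand-ins for real transports.

```shell
#!/bin/bash
# Hypothetical in-process model of topic-based publish/subscribe.
# It is not the QCG-Notification API; topic and handler names are made up.

SUBSCRIPTIONS=""   # one "topic handler" pair per line

subscribe() {      # usage: subscribe <topic> <handler-function>
    SUBSCRIPTIONS="${SUBSCRIPTIONS}$1 $2
"
}

publish() {        # usage: publish <topic> <message>
    while read -r sub_topic handler; do
        # Deliver only to handlers subscribed to exactly this topic.
        if [ -n "$sub_topic" ] && [ "$sub_topic" = "$1" ]; then
            "$handler" "$1" "$2"
        fi
    done <<EOF
$SUBSCRIPTIONS
EOF
}

# Two stand-in transports; a real service would send e-mail or XMPP messages.
send_mail() { echo "mail: [$1] $2"; }
send_xmpp() { echo "xmpp: [$1] $2"; }

subscribe task.status   send_mail
subscribe task.status   send_xmpp
subscribe task.progress send_xmpp

publish task.status   "task finished"
publish task.progress "50% done"
```

The publisher never names its receivers. That decoupling is what lets one message bus serve services, applications and end-users over different transports.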

QCG-Broker
- Offers job scheduling and brokering capabilities
- Controls the execution of whole experiments (including workflows and parameter sweep tasks)
- Provides the requested QoS and co-allocates resources
- Stages files and directories in and out

QCG Tools & Clients
- QCG-SimpleClient (command line)
- QCG-Icon (GUI)
- QCG-Science-Gateways (web)
- QCG-QoS-Access (web)
- QCG-Monitoring (web)
- QCG-Data (clone of iDrop, under development)
- QCG-Icon2 (under development)

QCG-SimpleClient
- Set of commands patterned on queuing system tools
- JSDL, QCG-Simple, QCG-XML description dialects
- Support for interactive tasks
- Automatic staging of files in and out
- Notifications about the status and progress of applications (mail, XMPP, QCG-Monitor)

QCG-SimpleClient
Submission and control of tasks:
- qcg-cancel - cancel task(s)
- qcg-clean - clean the working directories of given tasks
- qcg-connect - establish an interactive session to the task
- qcg-info - display detailed information about the given tasks
- qcg-list - list tasks in the system
- qcg-peek - display the end of the (stdout, stderr) streams
- qcg-proxy - create user proxy certificate
- qcg-refetch - retry/repeat the transfer of output files/directories
- qcg-refresh_proxy - refresh user proxy certificate for given tasks
- qcg-sub - submit batch or interactive tasks to be processed by QCG
Resource reservation and control:
- qcg-rcancel - cancel reservation(s)
- qcg-reserve - reserve resources
- qcg-rinfo - display information about the given reservation(s)
- qcg-rlist - list reservations in the system

qcg-offer command

    [plgpiontek@qcg ~]$ qcg-offer
    HYDRA:
      Summary:
        Metric Name            nodes/cores   share
        Total Resources:       282/5340      100%/100%
        Up Resources:          250/4668      88%/87%
        Used Resources:        114/1710      40%/32%
        Free Resources:        129/2628      45%/49%   (FreeNodes=87x12,17x16,18x48,7x64)
        PartFree Resources:    142/2810      50%/52%   (AvgFreeCoresPerNode=19)
        Reserved Resources:    2/24          0%/00%    (Utilization=0%)
    GALERA:
      Summary:
        Metric Name            nodes/cores   share
        Total Resources:       195/2700      100%/100%
        Up Resources:          191/2652      97%/98%
        Used Resources:        139/1426      71%/52%
        Free Resources:        38/456        19%/16%   (FreeNodes=38x12)
        PartFree Resources:    73/698        37%/25%   (AvgFreeCoresPerNode=9)
        Reserved Resources:    9/468         4%/17%    (Utilization=0%)
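Output like the above lends itself to simple post-processing, for example to pick the cluster with the most free cores before submitting. The sketch below extracts free nodes and cores per cluster with awk. It assumes the text layout shown on this slide (a cluster name followed by a colon, then `Free Resources: nodes/cores ...` lines). Real `qcg-offer` output may differ, and the function name `free_resources` is hypothetical.

```shell
#!/bin/bash
# Sketch: extract free nodes/cores per cluster from qcg-offer-style text.
# The input layout is assumed from the slide; real output may differ.

free_resources() {
    awk '
        # A cluster header is a single word ending in a colon, e.g. "HYDRA:".
        NF == 1 && $1 != "Summary:" && $1 ~ /:$/ {
            cluster = substr($1, 1, length($1) - 1)
        }
        # "Free Resources: 129/2628 45%/49% ..." -> third field is nodes/cores.
        $1 == "Free" && $2 == "Resources:" {
            split($3, parts, "/")
            print cluster, parts[1], parts[2]
        }
    '
}

# Sample input copied from the slide (abridged).
free_resources <<'EOF'
HYDRA:
Summary:
Total Resources: 282/5340 100%/100%
Free Resources: 129/2628 45%/49% (FreeNodes=87x12,17x16,18x48,7x64)
PartFree Resources: 142/2810 50%/52% (AvgFreeCoresPerNode=19)
GALERA:
Summary:
Total Resources: 195/2700 100%/100%
Free Resources: 38/456 19%/16% (FreeNodes=38x12)
EOF
```

Because awk splits on runs of whitespace, the same function works whether or not the lines are indented. The `PartFree Resources` lines are skipped because their first field is not exactly `Free`.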

QCG-SimpleClient
Commands:
qcg-cancel, qcg-clean, qcg-connect, qcg-info, qcg-list, qcg-peek, qcg-proxy, qcg-refetch, qcg-resub, qcg-sub, qcg-rcancel, qcg-reserve, qcg-rinfo, qcg-rlist
#QCG directives:
application, argument, environment, error/output, grant, host, memory, nodes / procs, note, notify / watch-output, preprocess / postprocess, queue, reservation, stage-in-dir/file, stage-out-dir/file, walltime, and more
Example job script:

    #!/bin/bash
    #QCG host=nova
    #QCG queue=plgrid
    #QCG note=naphthalene
    #QCG output=${job_id}.output
    #QCG error=${job_id}.error
    #QCG stage-in-file=naphthalene.gjf
    #QCG stage-in-file=gaussian.ntf
    #QCG stage-out-dir=.->result
    #QCG nodes=1:1
    #QCG walltime=pt10m
    #QCG notify=xmpp:tomasz.piontek@plgrid.pl
    #QCG watch-output=20,gaussian.ntf
    #QCG application=g09
    #QCG argument=naphthalene.gjf
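Because the `#QCG` directives are ordinary shell comments of the form `key=value`, a job script can be inspected with standard tools before submission. The sketch below lists the directives in a script and looks up the values for one key. It says nothing about how QCG itself parses these lines, which the slides do not describe, and the helper names `list_directives` and `directive_value` are hypothetical.

```shell
#!/bin/bash
# Sketch: inspect the "#QCG key=value" directives of a job script.
# How QCG itself parses these lines is not described in the slides.

list_directives() {   # usage: list_directives <script>
    # Keep only lines starting with "#QCG", then strip that prefix.
    sed -n 's/^#QCG[[:space:]]\{1,\}//p' "$1"
}

directive_value() {   # usage: directive_value <script> <key>; prints every match
    list_directives "$1" |
        awk -F= -v key="$2" '$1 == key { print substr($0, length(key) + 2) }'
}

# A shortened copy of the job script from the slide.
job=$(mktemp)
cat > "$job" <<'EOF'
#!/bin/bash
#QCG host=nova
#QCG queue=plgrid
#QCG stage-in-file=naphthalene.gjf
#QCG stage-in-file=gaussian.ntf
#QCG walltime=pt10m
#QCG application=g09
EOF

directive_value "$job" host            # the target host
directive_value "$job" stage-in-file   # one line per staged input file
rm -f "$job"
```

A directive such as `stage-in-file` may appear more than once, so the lookup prints every match instead of stopping at the first.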

XMPP Status Notification

QCG-Icon
- A lightweight, intuitive application for Windows, Mac OS X and Linux
- Provides transparent, unified access to applications installed on Grid resources and available via QosCosGrid services
- Automatically transfers input and output files between the Grid and the user's machine
- Gives the user the impression of working locally

QCG-Icon

QCG Science Gateways

QCG-Monitoring

QCG-Monitoring

QCG-QoS-Access

QCG Success Stories
- Infrastructural and research EU and national projects (PLGrid Plus, p-medicine, AirPROM, MAPPER, ACGT, BEinGRID, PL-Grid, QosCosGrid, BREIN)
- Production deployment on PL-Grid resources (the most popular middleware in Poland)
- EGI & PSNC/QCG MoU (2012)
- BCC-UNG & PSNC/QCG MoU (2013)
- Part of UMD 3.2.0
- Cooperation with PRACE

PLGrid Statistics (2013)

Integration with EGI
- QCG support unit in GGUS
- SAM Nagios probes & Dashboard alerts
- APEL / GRID-SAFE accounting
- Advertising GLUE2 schema information in BDII (in progress)
- Integration with the Virtual Organization Membership Service (VOMS)
- Part of UMD 3.2.0 (September 2013)
- QCG-Icon in AppDB

QCG Deployment
- Installation from packages
  - QCG software repository for SL5/SL6 and Debian (QCG-Comp, QCG-Notif)
  - UMD repository for SL5 (SL6 in the next update, with SHA2 support)
- Installation from sources
- Windows/Linux/MacOS QCG-Icon installer

www.qoscosgrid.org

Contact
- General questions: contact@qoscosgrid.org
- Technical questions: support@lists.qoscosgrid.org
- Tomasz Piontek: piontek@man.poznan.pl

Questions? Comments! Advice ;-)