NUSGRID: a computational grid at NUS

Grace Foo (SVU/Academic Computing, Computer Centre)

SVU is leading an initiative to set up a campus-wide computational grid prototype at NUS. The initiative arose out of a desire to enhance resource sharing and the overall utilization and efficiency of compute resources across NUS. Implementation of the grid prototype, called NUSgrid, is based on the popular grid middleware Globus Toolkit. NUSgrid will link up existing computational resources (connected over the campus network) from three organizations/entities; more resources will be added after the prototype is tested and proven. The organizations are SVU, the Computational Science Department (CZD) and the Engineering IT Unit (EITU). The contributed resources are a heterogeneous mix of parallel servers and Linux or other UNIX workstation clusters, shown in Table 1:

Entity   Resource Contribution
CZD      AMD Linux cluster
EITU     SUN BLADE 2000 workstation cluster
SVU      Compaq ES40 (4 CPUs) server; Intel Xeon (16 CPUs) Linux cluster

Table 1. NUSgrid resources

In addition to the compute servers, two other servers complete the NUSgrid infrastructure: a server hosting a certificate authority (CA) and a web portal server. By running its own CA, NUSgrid makes the digital certification process more convenient for development and testing. The infrastructure is summarized in Figure 1:

Figure 1. NUSgrid infrastructure

Grid Design

The grid middleware is the component that makes the grid possible. For NUSgrid, we used the Globus Toolkit (GT), commonly known simply as Globus (web site at http://www.globus.org). Since its release in the late 1990s, Globus has become the de facto grid middleware, with a very high rate of adoption in academia. Funding for the NUSgrid project is minimal (mainly for hardware: the CA and portal servers), so Globus was the obvious choice.

The Globus Toolkit has a command line interface which is not easy to use, so for greater accessibility and ease of use we developed a web interface to NUSgrid. Since this is a computational grid, the main focus is on compiling and running jobs: users are able to compile code and submit their jobs from the web interface. The Java Commodity Grid (CoG) kit (web site at http://www-unix.globus.org/cog/java), an open source Globus application development toolkit, was used to implement the portal.

Applications need to be enabled to run on the grid. We grid-enabled the common compilers (C/C++, Fortran, Java). Some of the resources on the grid are parallel servers and clusters, so we also grid-enabled the MPI (C, Fortran) compilers and the MPI run-time environment. Matlab, a general-purpose mathematical tool widely used across many research domains, was also enabled. We intend to grid-enable more applications in the future.
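
To give a flavour of what the portal does through the CoG kit, the sketch below submits a simple interactive job through the GT2-era GRAM API. It is an illustration only, not code from NUSgrid itself: the gatekeeper contact string and the RSL values are placeholders, and a valid grid proxy is assumed to be in place.

    import org.globus.gram.Gram;
    import org.globus.gram.GramJob;
    import org.globus.gram.GramJobListener;

    // Illustrative sketch: submit a simple job via GRAM and wait for it
    // to finish. Contact string and RSL are placeholders, not NUSgrid values.
    public class SubmitSketch {
        public static void main(String[] args) throws Exception {
            // GRAM RSL describing the job: what to run and where stdout goes.
            String rsl = "&(executable=/bin/hostname)(stdout=hostname.out)";
            GramJob job = new GramJob(rsl);

            // Print each state change (PENDING, ACTIVE, DONE, ...).
            job.addListener(new GramJobListener() {
                public void statusChanged(GramJob j) {
                    System.out.println("status code: " + j.getStatus());
                }
            });

            // Submit to the gatekeeper on a grid host (placeholder name).
            job.request("grid-host.nus.edu.sg");

            // Wait until GRAM reports a terminal state.
            while (job.getStatus() != GramJob.STATUS_DONE
                    && job.getStatus() != GramJob.STATUS_FAILED) {
                Thread.sleep(2000);
            }
            Gram.deactivateAllCallbackHandlers();
        }
    }

The portal wraps calls of this kind behind web forms, so the user never deals with RSL or gatekeeper contacts directly.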

Portal User Interface

The portal user interface is meant to facilitate the compile/run cycle of a developer. We describe some of its features briefly. After a registration and activation process, the user may log into the portal and see a page with the menu items shown in Figure 2.

Figure 2. Portal menus

The main menu items are Compile job and Run job. Before code can be compiled, however, it has to be uploaded to the portal server. Every user is allocated space on the portal to hold files and data. The upload is easily done through the Manage files link; through this link, the user may also view and delete files in his portal space.

When the Compile job link is clicked, the right side of the page expands to show host and compiler dropdown lists. The host dropdown allows the user to select the host (contributed by the organizations) for the compilation; the compiler dropdown lists only the compilers available on the selected host. After the compiler is selected, the page expands to show more items, as in Figure 3. The user enters the source code filename and may specify compiler arguments. Pressing the Update button updates the command line box, which shows the command that will be executed on the host. The user may further edit the command line if necessary.
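
As a hypothetical illustration (the class, method and field names below are invented, not taken from the NUSgrid source), the Compile job page might assemble the command line from the form fields along these lines:

    // Hypothetical sketch: assembling the editable command line shown in
    // the Compile job page from the form fields. All names are illustrative.
    public class CompileCommand {
        static String build(String compiler, String source,
                            String flags, String output) {
            StringBuilder cmd = new StringBuilder(compiler);
            if (flags != null && !flags.isEmpty()) {
                cmd.append(' ').append(flags);  // optional compiler arguments
            }
            cmd.append(" -o ").append(output);  // executable filename field
            cmd.append(' ').append(source);     // uploaded source file
            return cmd.toString();              // shown in the command box
        }

        public static void main(String[] args) {
            // Prints: /usr/bin/gcc -O2 -o hello hello.c
            System.out.println(build("/usr/bin/gcc", "hello.c", "-O2", "hello"));
        }
    }

The user can still edit the generated command before it is executed on the selected host.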

Figure 3. Compiling a job

There is an add/drop section for specifying any other files needed in the compilation, for example header files or special libraries. These are copied over to the execution host together with the source code. The executable filename may also be specified (the default name depends on the compiler), with an option to keep a copy on the execution host. The compilation status and result are shown in the portal window.

The Run job link provides a similar window for the user to fill in the host, application, arguments, input/output files, and other requirements for his job. As in Compile job, the page is customized for the selected host and application. Both interactive and batch job submissions are possible for most applications. An add/drop section allows the user to specify any other files needed in the execution, for example (Java) class files or data input files. For interactive jobs, the job is submitted directly to the host and the user waits for the results of the execution; this is similar to code compilation. Batch jobs run in the background and results may not be returned immediately. Furthermore, batch jobs are submitted to queues if a job scheduler (for example, LSF) exists on the execution host.
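
Behind the scenes, a batch run of an MPI job might look roughly like the sketch below, again via the CoG kit's GRAM API. This is a minimal sketch under stated assumptions: the RSL attributes, queue name and contact string are placeholders, and it assumes the GT2-era batch overload of request.

    import org.globus.gram.Gram;
    import org.globus.gram.GramJob;

    // Illustrative sketch: batch submission of a 4-way MPI job.
    public class BatchSketch {
        public static void main(String[] args) throws Exception {
            String rsl = "&(executable=a.out)"
                       + "(jobtype=mpi)(count=4)"       // 4-way MPI run
                       + "(queue=normal)"               // scheduler queue, if any
                       + "(stdout=run.out)(stderr=run.err)";
            GramJob job = new GramJob(rsl);

            // 'true' asks for batch mode: the call returns once the job is
            // handed to the local scheduler (e.g. LSF) instead of blocking
            // until the job completes.
            job.request("grid-host.nus.edu.sg", true);
            System.out.println("job handle: " + job.getIDAsString());

            // Later, the handle can be used to refresh the job state; this
            // is essentially what the Job status page does for each job.
            Gram.jobStatus(job);
            System.out.println("status code: " + job.getStatus());
        }
    }

The job handle is the piece of state a portal would keep in order to poll batch jobs and copy results back later, as described next.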

The Job status link allows the user to check the status of his submitted batch jobs. When the user clicks the link, the portal server checks the hosts for the user's uncompleted batch jobs and updates the status information in the listing. If a job has completed, its result and data files are copied back to the portal. Batch jobs which are queued are listed in a separate table from the ordinary, non-queued ones. For each job, the listings show information such as the time of submission, the host and the output files.

The Grid info link provides information about the hosts/servers and applications available. The host information includes status, operating system, number and speed of CPUs, and free and total memory.

Problems and Status

Many problems and issues were encountered in the design and implementation of NUSgrid. One of the main problem areas was the Globus Toolkit middleware itself. The Globus Toolkit is open source and under active development; there are many bugs and ongoing bug fixes, and documentation is poor. Installation of the Toolkit on some of the platforms ran into problems due to bugs in the Toolkit which were only later discovered and fixed. The open source Java Commodity Grid (CoG) kit used to develop the portal has similar stability issues; in addition, many features of Globus are not available through the Java CoG kit.

The portal user interface was not easy to design. It had to be quite general, as our computational grid was intended to support general computing needs, yet it was sometimes necessary to customize for specific applications and hosts. Each host environment was different, but these differences had to be hidden from the user as much as possible.

Testing of the grid and portal required substantial manpower and patience. There were two main types of testing: at the Globus command line, and integration testing through the portal. Successful execution at the Globus command line was necessary but did not guarantee an easy or successful portal integration, due to the peculiarities of the Java CoG kit.

Lack of manpower was a major problem in this project, and development and implementation of the system proceeded at a very slow pace. Nevertheless, the system implementation is now close to completion, and NUSgrid will be released for use by year end. After release, the system will be monitored, and users will be surveyed after a period of use. In the future, more hosts/servers will be added, and more applications will be enabled to run in our grid environment. System features will be improved based on the data collected during monitoring and on user feedback.

For questions or feedback on NUSgrid, please contact the author at ccefoog@nus.edu.sg.