Resource Sharing Problem


Resource Management

Resource Sharing Problem
  Locate resources
    What computers can I use? What are their IP addresses?
  Allocate resources
    Get a free computer now, reserve one for tomorrow at noon
  Authenticate and access control (authorization)
    Grid security lecture
  Prepare a resource for use
    Free 2 MB of disk space, ask the other users to log out
  Use the resource
    Run a job (execute an application)
  Monitor the resource
    Is the computer still connected to the Internet? Did my application finish?

Resource Management Stack
  5. Scheduler
    Optimum resource allocation
  4. Resource brokers
    Map high-level onto local requests
    Resource matchmaking and negotiation
  3. Resource co-allocators
    Allocate multiple resources simultaneously
  2. Grid site resource manager
    Bridge over local managers
    Common Grid resource management language
  1. Local resource managers
    Heterogeneous site autonomy
    Every local system administrator does what he wants and likes
  Stack, top to bottom: Scheduler, Resource Broker, Multiple Grid Site Resource Co-allocator, Grid Site Resource Manager, Local Resource Manager

1. Local Resource Management

What is a Resource?
  An entity that is to be shared
    E.g., computers, networks, storage, data, software, people
  Does not have to be a physical entity
  Defined in terms of interfaces, not devices
    ssh defines the access to a computer
    POSIX defines the interface to a computer's resources
    Open/close/read/write functions define access to a distributed file system
      o e.g., NFS, AFS, DFS
  Everything (or anything) is a resource

Compute Resources
  Single processor computers
    Home computers
  Parallel computers
    Symmetric multi-processors
      o Set of processors sharing a common main memory
    Distributed memory computers
      o Network of workstations
    CC-NUMA
      o Physically distributed, logically shared, cache coherent memory
  Batch queuing systems for controlled use

2. Grid Resource Management

Why is it Hard?
  Site autonomy
    No control over local administration
    If one site suddenly decides to suspend its daily production, it cannot announce this to the entire Grid
    At most it can publish a local Web page
  Heterogeneous substrate
    Many different platforms
    E.g., queuing systems: LoadLeveler, LSF, PBS, SGE
  Co-allocation
    Simultaneous access to independent resources
  Unreliability
    ZID sites should be available at 22:00, but diligent tutors shut them down

GRAM Bridge
  Diagram: GRAM sits on top of the local queuing systems LL, LSF, PBS, SGE
  GRAM = Grid Resource Allocation Manager
  Grid applications use GRAM to execute processes on computers
  GRAM uses the local job queuing system to manage resources
  Set up by the local system administrator

Grid Resource Allocation Manager (GRAM)
  Defines resource layer protocols and APIs that enable clients to securely instantiate a Grid computational task
    Remote job submissions
  Relies on local resource management interfaces
    LoadLeveler, LSF, PBS, SGE
  GSI enabled
    This is a new, required, and important Grid feature
  Wrapper over local resource management (job queuing) systems

GRAM Architecture
  Client
    MDS client API calls to locate resources
    GRAM client API calls to request resource allocation and process creation
  Site boundary
  Gatekeeper
    Authentication through the Globus Security Infrastructure
    Creates the Job Manager
  Job Manager
    Parses the request with the RSL Library
    Requests the Local Resource Manager to allocate and create processes
    Monitors and controls the processes
  GRAM Reporter
    Queries the current status of the resource
    Updates MDS with resource state information

Host Certificates
  Each Grid host runs a GRAM gatekeeper
  Each Grid host has an X.509 certificate
    Identity + public key
    /etc/grid-security/hostcert.pem
  The certificate must be for a machine which has a consistent name in DNS
    Not to be run on a computer using DHCP, where a different name could be assigned to your computer
  Used for mutual authentication with every user

Mutual Authentication Protocol
  User authenticates to GRAM with a proxy certificate
  GRAM authenticates to the user with a host certificate
  Handshaking protocol
    Generate a random message
    Encrypt it with the partner's public key (taken from the X.509 certificate)
    Send the encrypted message to the partner
    The partner decrypts it and sends it back
    If the decrypted message matches the original random one, the handshaking is completed
  Diagram: the user holds the proxy private key and the proxy X.509 certificate; the GRAM gatekeeper holds the host private key and the host X.509 certificate

GRAM Architecture (diagram)

globus-job-run
  radu@petzeck.dps.uibk.ac.at:~$ grid-proxy-init
  Your identity: /O=AustrianGrid/O=UIBK/OU=DPS/CN=Radu Prodan
  Enter GRID pass phrase for this identity:
  Creating proxy... Done
  Your proxy is valid until: Thu Feb 3 21:58:
  radu@petzeck:~$ globus-job-run gescher.vcpc.univie.ac.at /bin/echo "Hello Grid"
  Hello Grid
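The handshaking steps of the Mutual Authentication Protocol slide can be sketched with the standard Java crypto classes. This is only an illustration under stated assumptions: freshly generated RSA key pairs stand in for the host and proxy certificates, both sides run in one process, and the class and method names (HandshakeSketch, challenge, newKeyPair) are hypothetical. It is not how GSI itself is implemented.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;

// Illustrative sketch of the challenge/response idea; not the GSI implementation.
class HandshakeSketch {
    // Hypothetical stand-in for the key pair behind an X.509 certificate.
    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
            gen.initialize(2048);
            return gen.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // One direction of the handshake: challenge the partner and check the reply.
    static boolean challenge(KeyPair partner) {
        try {
            byte[] message = new byte[32];
            new SecureRandom().nextBytes(message);               // generate a random message

            Cipher enc = Cipher.getInstance("RSA/ECB/PKCS1Padding");
            enc.init(Cipher.ENCRYPT_MODE, partner.getPublic());  // public key from the certificate
            byte[] sealed = enc.doFinal(message);                // this is what gets sent

            Cipher dec = Cipher.getInstance("RSA/ECB/PKCS1Padding");
            dec.init(Cipher.DECRYPT_MODE, partner.getPrivate()); // happens on the partner's side
            byte[] reply = dec.doFinal(sealed);                  // partner sends this back

            return Arrays.equals(message, reply);                // match completes the handshake
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        KeyPair host = newKeyPair();   // plays the role of the host certificate
        KeyPair proxy = newKeyPair();  // plays the role of the user's proxy certificate
        boolean ok = challenge(host) && challenge(proxy);  // both directions
        System.out.println("mutual authentication " + (ok ? "completed" : "failed"));
    }
}
```

Only the holder of the matching private key can return the original message, which is why each side challenges the other.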

Limitation
  Full path to executable must be given
  Big drawback in a Grid environment
    How am I supposed to know the full home path on each remote site?
    I never logged in on most of the Grid machines
    Information service

  radu@petzeck:~$ globus-job-run gescher.vcpc.univie.ac.at echo "Hello Grid"
  GRAM Job failed because the executable does not exist (error code 5)

The globus-job-run Script
  radu@petzeck:~$ globus-job-run altix1.uibk.ac.at /bin/echo "Hello Grid"
  Hello Grid

  radu@petzeck:~$ time globus-job-run altix1.uibk.ac.at /bin/echo "Hello Grid"
  Hello Grid
  real 0m4.523s
  user 0m0.630s
  sys  0m0.280s

GRAM and PBS
  radu@petzeck:~$ time globus-job-run altix1.uibk.ac.at/jobmanager-pbs /bin/hostname
  altix1
  real 0m18.425s
  user 0m0.580s
  sys  0m0.220s

GRAM and SGE
  time globus-job-run hc-ma.uibk.ac.at/jobmanager-sge /bin/hostname
  real 0m33.651s
  user 0m0.620s
  sys  0m0.230s

Job Termination
  How to detect job termination?
  Push events
    Local manager automatically notifies GRAM through callbacks
    Not supported
  Pull events
    GRAM queries the local manager
      while (1) { getstate(); sleep(sec); }
    Compromise between accuracy and busy waiting (sec = 0)
  No way to know if the application crashed
  Diagram: master process, slave process, batch queuing system, and application, connected by pull and push events

Grid Resource Management Latency
  Executing jobs on the Grid is prone to HUGE latencies
    5 seconds mutual authentication
    15 seconds job queuing system
      o It cannot do busy waiting
      o Ask every 5-10 seconds or so
  Grid applications must have execution times an order of magnitude higher than 20 seconds
    Otherwise you waste most of your time waiting
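The latency argument above is simple arithmetic, and a small calculation makes the "order of magnitude" rule concrete. The sketch below takes the 5 second and 15 second figures from the slide as fixed overhead and reports which fraction of the total turnaround is spent waiting; the class and method names are hypothetical.

```java
// Fraction of total turnaround time lost to the fixed Grid overhead.
class GridLatency {
    static final double AUTH_SECONDS = 5.0;    // mutual authentication (from the slide)
    static final double QUEUE_SECONDS = 15.0;  // job queuing system (from the slide)

    static double overheadFraction(double runSeconds) {
        double overhead = AUTH_SECONDS + QUEUE_SECONDS;
        return overhead / (overhead + runSeconds);
    }

    public static void main(String[] args) {
        for (double run : new double[] {2, 20, 200, 2000}) {
            System.out.printf("run=%6.0fs  waiting=%5.1f%%%n",
                    run, 100 * overheadFraction(run));
        }
    }
}
```

A 20 second job spends half of its turnaround waiting, while a 200 second job loses about 9 percent, which is the reasoning behind the rule of thumb.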

Batch Job Submission
  Send a batch job to the Grid
    globus-job-submit altix1.uibk.ac.at/jobmanager-pbs /bin/hostname
  Receive a URL handle that is unique on the Grid / Internet
  Use the URL to query the status of the job
    globus-job-status job_url
    ACTIVE
    DONE

Batch Job Submission
  petzeck:/home/radu> globus-job-submit altix1.uibk.ac.at/jobmanager-pbs /home/cb56/cb561004/a.out
  petzeck:/home/radu> globus-job-status
  ACTIVE
  petzeck:/home/radu> globus-job-status
  DONE

Job Output
  globus-job-get-output job_id [-f number] -out -err [file]
    job_id      job URL
    -out        gets standard output
    -err        gets standard error
    -f number   gets the last number lines from stdout / stderr

Cancel a Job
  globus-job-cancel job_url
    Cancels (kills) a job
    Changes the state to FAILED
  globus-job-clean job_url
    Removes all the files associated with a job previously started using globus-job-submit
    All the information concerning the job is lost
    Must be used after globus-job-get-output

Job Cancellation and Cleanup
  radu@pc6163-c703:~$ globus-job-cancel
  Are you sure you want to cancel the job now (Y/N)? Y
  Job canceled.
  NOTE: You still need to clean files associated with the job by running globus-job-clean <jobid>
  radu@pc6163-c703:~$ globus-job-clean
  WARNING: Cleaning a job means:
  - Kill the job if it still running, and
  - Remove the cached output on the remote resource
  Are you sure you want to cleanup the job now (Y/N)? Y
  Cleanup successful.

Executable Staging
  Copy the file you want to execute to the remote machine
  Of course, the executable has to be compiled for the remote architecture
    Not for the local one
  globus-job-run altix1.uibk.ac.at -stage /bin/ls
  a.out altix1.jku.austriangrid.at

Cross Platform Executable Staging
  cb561004]$ globus-job-run agrid1.uibk.ac.at -stage /bin/ls
  /scratch/cb56/cb561004/.globus/.gass_cache/local/md5/cf/b fcf8d51c3cecf 89c864/md5/4b/fde7a8d6acbd9901f1e1211 d2ff114/data: /scratch/cb56/cb561004/.globus/.gass_cache/local/md5/cf/b fcf8d51c3ce cf89c864/md5/4b/fde7a8d6acbd9901f1e12 11d2ff114/data: cannot execute binary file

Resource Specification Language
  Grid job requests may be too complex to be specified using batch commands
  RSL = common language for specifying complex job requests
  GRAM service translates this common language into scheduler-specific language
    E.g., RSL into PBS
  RSL is based on a set of (attribute=value) pairs
    & (executable="/bin/echo") (arguments="Hello Grid")
  GRAM service understands a well defined set of attributes

Hello Grid RSL Example
  (directory = /home/radu/)
  (executable = /bin/echo)
  (arguments = "Hello Grid")
  (stdout = std.out)
  (stderr = std.err)
  Save this in run.rsl
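Because an RSL request is just a conjunction of (attribute=value) pairs, it can be assembled programmatically before being handed to GRAM. The sketch below is a hypothetical helper (RslBuilder is not part of Globus or the CoG Kit) that renders an ordered map in the same quoted form the Java examples later in this lecture use. It rejects values containing a double quote rather than assuming an escaping rule.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that renders attribute/value pairs as an RSL conjunction.
class RslBuilder {
    static String build(Map<String, String> attributes) {
        StringBuilder rsl = new StringBuilder("&");
        for (Map.Entry<String, String> e : attributes.entrySet()) {
            if (e.getValue().contains("\"")) {
                // Quoting rules for embedded quotes are deliberately not assumed here.
                throw new IllegalArgumentException("quote in value of " + e.getKey());
            }
            rsl.append('(').append(e.getKey()).append("=\"")
               .append(e.getValue()).append("\")");
        }
        return rsl.toString();
    }

    public static void main(String[] args) {
        // The Hello Grid request, with attributes kept in insertion order.
        Map<String, String> job = new LinkedHashMap<>();
        job.put("directory", "/home/radu/");
        job.put("executable", "/bin/echo");
        job.put("arguments", "Hello Grid");
        job.put("stdout", "std.out");
        job.put("stderr", "std.err");
        System.out.println(build(job));
    }
}
```

Quoting "Hello Grid" as one value passes it to the program as a single argument.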

The globusrun Script
  time globusrun -r altix1.jku.austriangrid.at/jobmanager-pbs -f run.rsl
  globus_gram_client_callback_allow successful
  GRAM Job submission successful
  GLOBUS_GRAM_PROTOCOL_JOB_STATE_PENDING
  GLOBUS_GRAM_PROTOCOL_JOB_STATE_ACTIVE
  GLOBUS_GRAM_PROTOCOL_JOB_STATE_DONE
  real 0m20.150s
  user 0m0.220s
  sys  0m0.010s

State Transition Diagram
  GRAM manages asynchronous state change callbacks

Execution Attributes
  (executable = string)
    Program to run
    A file path (absolute or relative)
    URL for file staging
      o Copy the input file from the remote URL to a local file
  (directory = string)
    Directory in which to run (default is $HOME)
  (arguments = arg1 arg2 arg3 ...)
    List of string arguments to the program
  (environment = (var1 val1) (var2 val2))
    List of environment variable name/value pairs

Input / Output Attributes
  (stdin = string)
    Stdin for program
    A file path (absolute or relative)
    URL for stdin staging
      o Automatically get the input from the given URL
  (stdout = string) (stderr = string)
    Stdout / stderr for program
    A file path (absolute or relative)
    URL for stdout / stderr staging

Sample RSL
  & (executable = my_prog)
    (directory = /home/radu/my_dir)
    (arguments = std1.in std2.in std3.in)
    (environment = (JAVA_HOME /usr/java)
                   (PATH /usr/java/bin)
                   (LD_LIBRARY_PATH /usr/lib) )
    (stdin = std.in)
    (stdout = std.out)
    (stderr = std.err)

Execution Time Attributes
  (maxtime = integer)
    Maximum wall clock or CPU runtime (GRAM's choice) in minutes
  (maxwalltime = integer)
    Maximum wall clock runtime in minutes
    Wall clock time = total execution time
      What you see when you look at the watch
  (maxcputime = integer)
    Maximum CPU runtime in minutes
    How much time the process spends on the CPU
  If the execution time is exceeded, the job is killed

Parallel Processing Attributes
  (count = integer)
    Number of processors to allocate on a parallel machine
    Default is 1
  (hostcount = integer)
    Number of nodes over which to distribute the count processes
    On SMP clusters
    count / hostcount processes per node
  Example: (count = 4) (hostcount = 2)
    2 distinct nodes
    2 threads per node

Memory Attributes
  (maxmemory = integer)
    Maximum amount of memory for each process in megabytes
  (minmemory = integer)
    Minimum amount of memory for each process in megabytes
  It would be interesting to see what happens when you try this

Job Type
  (jobtype = value)
    Predefined job types for easy use
  single
    Run a single instance of the program
    Let the program start the other count-1 processes
  multiple
    Start <count> instances of the program using the appropriate scheduling mechanism
  mpi
    Run the program using mpirun -np <count>
  condor (Condor is a very advanced job queuing system)
    Start <count> Condor processes running in the standard universe

Sequential Process
  & (jobtype = single)
    (count = 1)
    (directory = /home/radu/my_dir)
    (executable = my_prog)
    (arguments = std.in)
    (stdout = std.out)
    (stderr = std.err)
    (maxwalltime = 120)
    (maxcputime = 60)

OpenMP Applications
  & (jobtype = single)
    (count = 4)
    (directory = /home/radu/my_dir)
    (executable = my_prog)
    (arguments = std.in)
    (stdout = std.out)
    (stderr = std.err)
    (environment = (OMP_NUM_THREADS 4) )
    (maxwalltime = 120)
    (maxcputime = 120)
  Allocate count (4) processors
  Run a single instance of the program my_prog
  Let my_prog start the other count-1 (3) processes

MPI Applications
  & (jobtype = mpi)
    (count = 4)
    (directory = /home/radu/my_dir)
    (executable = my_prog)
    (arguments = std.in)
    (stdout = std.out)
    (stderr = std.err)
    (maxwalltime = 120)
    (maxcputime = 120)
  Automatically run all the count (4) my_prog instances using mpirun

Queuing System Attributes
  (project = string)
    Project (account) against which to charge
  (queue = string)
    Queue into which to submit the job
    short, express, long

RSL Substitutions
  RSL supports simple variable substitutions
  Substitutions are declared using a list of pairs
    (rslsubstitution = (sub1 val1) (sub2 val2))
  A substitution is invoked with $(SUB)
  Processing order
    Within a scope, processed left to right
    Outer scope processed before inner scope
  A variable definition can reference previously defined variables

File Staging
  & (executable = my_prog)
    (directory = /home/radu/my_dir)
    (arguments = std1.in std2.in std3.in)
    (rsl_substitution = URL )
    (file_stage_in = ( $(URL)/my_in_file my_in_file ) )
    (file_stage_out = ( $(URL)/my_out_file my_out_file ) )
    (file_clean_up = my_in_file)
    (stdin = std.in)
    (stdout = std.out)
    (stderr = std.err)
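The substitution rules above (left-to-right processing, with later definitions allowed to reference earlier ones) can be mimicked in a few lines. The sketch below is not the GRAM parser: RslSubstitution and its methods are hypothetical names, the host gridftp.example.org is a made-up placeholder, and scoping is ignored so that only the ordering rule is shown.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch of left-to-right $(NAME) substitution; scoping rules are not modeled.
class RslSubstitution {
    private static final Pattern REF = Pattern.compile("\\$\\((\\w+)\\)");

    // Replace every $(NAME) in text with its already defined value.
    static String expand(String text, Map<String, String> defined) {
        Matcher m = REF.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = defined.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("undefined substitution: " + m.group(1));
            }
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Process definitions in order, so each value may use the ones before it.
    static Map<String, String> define(String[][] pairs) {
        Map<String, String> defined = new LinkedHashMap<>();
        for (String[] pair : pairs) {
            defined.put(pair[0], expand(pair[1], defined));
        }
        return defined;
    }

    public static void main(String[] args) {
        Map<String, String> vars = define(new String[][] {
            {"HOST", "gridftp.example.org"},       // placeholder host, not from the lecture
            {"URL", "gsiftp://$(HOST)/data"},      // references the earlier definition
        });
        System.out.println(expand("$(URL)/my_in_file", vars));
    }
}
```

Reversing the two definitions would fail with an undefined substitution, which is the practical consequence of the left-to-right rule.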

Default RSL Substitutions
  GRAM defines a set of RSL substitutions before processing the job request
    Useful for portability
  Machine information
    GLOBUS_HOST_MANUFACTURER
    GLOBUS_HOST_CPUTYPE
    GLOBUS_HOST_OSNAME
    GLOBUS_HOST_OSVERSION
  Paths to Globus
    GLOBUS_LOCATION
  Miscellaneous
    HOME
    LOGNAME
    GLOBUS_ID

Other Attributes
  dryrun
    Simulate what would happen in terms of resource allocation (do not run the job)
    Useful for testing whether my interface to GRAM is correct
  scratch_dir
    Specifies the location in which to create a scratch subdirectory
    A SCRATCH_DIRECTORY substitution will be filled with the name of the directory which is created
  stdout_position / stderr_position
    Specifies where in the file remote standard output / error streaming should be restarted from

Fault Tolerance Attributes
  (save_state = yes)
    Saves job state information to a persistent file on disk
  (restart)
    Start a new job manager to manage an existing job
    Searches for a state file saved with save_state
    Recover from a GRAM crash
  proxy_timeout
    Specifies how long (in seconds) before the delegated X.509 proxy expires the job manager should exit

Two-Phase Commit
  Some applications need to be executed exactly once (no more, no less)
    E.g., bank transfers
  You ask GRAM to execute something, but after the mutual authentication the network crashes
    You get no PENDING / ACTIVE event
    Did GRAM receive the request or not?
  Two-phase commit protocol
    Preparation phase
    Commit phase
  (two_phase = <int>)
    <int> = seconds to wait before the job times out

2PC Protocol (diagram)

Fault Tolerant Example
  & (count = 8)
    (hostcount = "2:ppn=4")
    (jobtype = mpi)
    (directory = /home/radu/apps/lapw0/src)
    (executable = ../src/lapw0)
    (arguments = lapw0.def)
    (stdout = st.out)
    (stderr = st.err)
    (maxwalltime = 60)
    (maxcputime = 60)
    (save_state = yes)
    (two_phase = yes)
    (proxy_timeout = 10)

Commodity Grid Kits
  Diagram elements: Grid applications (biology, chemistry, high energy physics, earth science) and Grid portals; Java CoG Kit and Python CoG Kit; CoG portal services; advanced CoG services (file management, job management, security management); Grid middleware and Grid management services; SWIG; Grid primitives (execution & security); Globus Toolkit GT2, GT3 (OGSA), GT4; Grid prototyping; Web Services

CoG Abstraction Layers
  Diagram elements: applications (nano materials, bio-informatics, disaster management), portals, development support; CoG Gridfaces layer; CoG data and task management layer; CoG abstraction layer; CoG GridIDE; back ends GT2 (classic), GT3 (OGSI), GT4 (WS-RF), Condor, Unicore, SSH, Avaki, SETI, others

Java CoG Kit
  Java API to the Globus services
  Download from
  ~/.globus/cog.properties
    usercert=/home/radu/.globus/usercert.pem
    userkey=/home/radu/.globus/userkey.pem
    proxy=/tmp/x509up_u

Poll Event-based Job
  import org.globus.gram.*;
  import org.globus.gram.internal.*;

  class GramPoll {
      public static void main(String[] args) {
          String rsl = "&(executable=\"/bin/ls\")(stdout=\"std.out\")";
          GramJobRun job = new GramJobRun(rsl, "agrid1.uibk.ac.at");
          job.run();
          // Pull: ask for the job state once per second until it is final.
          while (job.getStatus() != GRAMConstants.STATUS_DONE
                  && job.getStatus() != GRAMConstants.STATUS_FAILED) {
              try {
                  Thread.sleep(1000);
                  System.out.println(job.getStatusAsString());
              } catch (InterruptedException ex) {
                  // Stop polling if the thread is interrupted.
                  Thread.currentThread().interrupt();
                  break;
              }
          }
      }
  }

Push Event-based Job
  import org.globus.gram.internal.*;
  import org.globus.gram.*;

  class GramPush implements GramJobListener {
      // volatile: written by the callback thread, read by main.
      private static volatile boolean done = false;

      public GramPush(String rsl) throws Exception {
          GramJob job = new GramJob(rsl);
          job.addListener(this);              // register for state change callbacks
          job.request("altix1.uibk.ac.at");
      }

      public static void main(String[] args) throws Exception {
          String rsl = "&(executable=\"/bin/ls\")(stdout=\"std.out\")";
          new GramPush(rsl);
          while (!done) {
              Thread.sleep(1000);
          }
      }

      // Push: GRAM calls this method whenever the job state changes.
      public void statusChanged(GramJob job) {
          if (job.getStatus() == GRAMConstants.STATUS_DONE
                  || job.getStatus() == GRAMConstants.STATUS_FAILED) {
              done = true;
          }
      }
  }

Multiple Grid Site Resource Co-allocation

Grid Resource Management Architecture
  Diagram: the application passes RSL to the broker + scheduler, which specialize it using queries to the information service; the resulting ground RSL goes to the co-allocator, which sends simple ground RSL to the Grid resource managers (GRAM); each GRAM drives a local resource manager (LSF, PBS, NQE)

Resource Co-allocation
  Run an application on two different computers in different administration domains
  The computers are not under the control of the same local batch queuing system (e.g., PBS and LSF)
  Problem:
    The job may start on one system immediately
    And stay in the queue for 1 day on the other
    In 1 day the first system may no longer be free

Dynamically Updated Request Online Co-allocator (DUROC)
  Run jobs on multiple sites
  Decomposes the job into multiple GRAM requests
  Rendezvous barrier with allocating agent
    What happens if one site is available but the other busy?
    One site waits at the barrier
  Extra RSL options to control
    barrier synchronization
    inter sub-job communication
  Diagram: DUROC splits the RSL into RSL 1 ... RSL n for GRAM 1 ... GRAM n; the job processes meet in globus_duroc_runtime_barrier
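The rendezvous barrier that DUROC relies on behaves like an ordinary thread barrier: whichever subjob leaves its local queue first simply waits until the others arrive. The sketch below illustrates that behaviour with java.util.concurrent.CyclicBarrier. Threads stand in for subjobs and sleep times stand in for queue waiting times; it does not call DUROC, and BarrierSketch and runSubjobs are hypothetical names.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

// Threads play the role of subjobs; the barrier plays the role of the rendezvous.
class BarrierSketch {
    // Returns how many subjobs passed the barrier.
    static int runSubjobs(long[] queueDelaysMillis) {
        int n = queueDelaysMillis.length;
        CyclicBarrier barrier = new CyclicBarrier(n);
        AtomicInteger released = new AtomicInteger();
        Thread[] subjobs = new Thread[n];
        for (int i = 0; i < n; i++) {
            final long delay = queueDelaysMillis[i];
            subjobs[i] = new Thread(() -> {
                try {
                    Thread.sleep(delay);        // time spent in the local queue
                    barrier.await();            // early subjobs block here
                    released.incrementAndGet(); // all subjobs continue together
                } catch (InterruptedException | BrokenBarrierException e) {
                    Thread.currentThread().interrupt();
                }
            });
            subjobs[i].start();
        }
        try {
            for (Thread t : subjobs) {
                t.join();
            }
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return released.get();
    }

    public static void main(String[] args) {
        // Site 1 starts at once, site 2 only after 500 ms in its queue.
        int released = runSubjobs(new long[] {0, 500});
        System.out.println(released + " subjobs passed the barrier");
    }
}
```

The first thread holds its processor idle for the full 500 ms, which is exactly the cost the co-allocation slide describes when one site is free and the other is busy.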

Co-allocation Example
  + ( & (resourcemanagercontact = "c703-pc450.uibk.ac.at/jobmanager-pbs")
        (jobtype = mpi)
        (label = "subjob 0")
        (environment = (GLOBUS_DUROC_SUBJOB_INDEX 0) )
        (count = 1)
        (executable = /afs/zid1.uibk.ac.at/home/c703/c703246/teaching/mpi/a.out)
        (stdout = mpi1.out)
        (stderr = mpi1.err) )
    ( & (resourcemanagercontact = "c703-pc421.uibk.ac.at/jobmanager-pbs")
        (jobtype = mpi)
        (label = "subjob 1")
        (environment = (GLOBUS_DUROC_SUBJOB_INDEX 1) )
        (count = 1)
        (executable = /afs/zid1.uibk.ac.at/home/c703/c703246/teaching/mpi/a.out)
        (stdout = mpi2.out)
        (stderr = mpi2.err) )

Resource Management Attributes
  (resourcemanagercontact = string) (resourcemanagername = string)
    Resource manager to which to submit a subjob
  (label = string)
    Identifier for this subjob
  (subjobstarttype = value)
    Alters the startup barrier mechanism
    Values are strict-barrier, loose-barrier, no-barrier
  (subjobcommstype = value)
    Values are blocking-join and independent
    If the value is set to independent, the subjob won't be seen from the other subjobs when doing inter-subjob communication

MPI on the Grid
  MPI is successful for parallel programming
  The Grid idea came from the HPC community
  MPI applications that require Grid computing exist
  Incremental approach to running Grid applications
  Full portability of existing MPI applications on the Grid

MPICH Architecture
  Layers, top to bottom
    User
    The MPI interface (defined by the MPI standards)
    The MPICH layer (implements the MPI interface)
    Abstract Device Interface (ADI)
    A particular platform: MPP, SMP, cluster

MPICH-G2
  Implemented as the globus2 device in MPICH
  GSI-based authentication and authorization
  Executable and I/O file staging
  Communication
    intra-machine: local MPI implementation
    inter-machine: TCP/IP socket communication (Globus I/O, GridFTP)
    limitation: requires public IP addresses for all parallel computer nodes
  DUROC for resource allocation
    barrier at the end of MPI_Init

MPICH-G2 Architecture (diagram)

2-Site MPI Latency Program
  + ( & (resourcemanagercontact = "c703-pc450.uibk.ac.at/jobmanager-pbs")
        (jobtype = mpi)
        (label = "subjob 0")
        (environment = (GLOBUS_DUROC_SUBJOB_INDEX 0) )
        (count = 1)
        (executable = /afs/zid1.uibk.ac.at/home/c703/c703246/teaching/mpi/a.out)
        (stdout = mpi1.out)
        (stderr = mpi1.err) )
    ( & (resourcemanagercontact = "c703-pc421.uibk.ac.at/jobmanager-pbs")
        (jobtype = mpi)
        (label = "subjob 1")
        (environment = (GLOBUS_DUROC_SUBJOB_INDEX 1) )
        (count = 1)
        (executable = /afs/zid1.uibk.ac.at/home/c703/c703246/teaching/mpi/a.out)
        (stdout = mpi2.out)
        (stderr = mpi2.err) )

Best Effort Resource Allocation
  GRAM does best effort resource allocation
    No request for resource allocation is rejected
    Consequently, no level of service is guaranteed
    E.g., IP networks, time-sharing CPU schedulers, job queuing systems
  Problem
    Run a Grid application on sites A and B
    A is immediately available and allocated
    B will be available only in 1 hour
  Advance reservation
    When and for how long a resource will be available and exclusively allocated
    Ensures that the application starts simultaneously on two sites
    Quality of service (QoS)

GARA
  General-purpose Architecture for Reservation and Allocation
  Integrates QoS for different types of resources
    networks, CPUs, batch job schedulers, disks
  Mechanisms for advance and immediate reservations of resources
    create, modify, cancel reservations
  RSL extensions
  Integration with DUROC for CPU reservations
  No Globus toolkit integration
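The A and B problem from the Best Effort Resource Allocation slide reduces to a small calculation: a simultaneous start is possible only at the latest availability time, and every site that is allocated earlier sits idle until then. The sketch below works through the slide's numbers (A free now, B free in one hour). It is a back-of-the-envelope illustration, not GARA's reservation algorithm, and ReservationSketch is a hypothetical name.

```java
// Earliest simultaneous start across sites, and the idle time best effort costs.
class ReservationSketch {
    // A co-allocated job can start only when the last site becomes available.
    static long earliestCommonStart(long[] availableFromSeconds) {
        long start = 0;
        for (long t : availableFromSeconds) {
            start = Math.max(start, t);
        }
        return start;
    }

    // Time a site is held but unused if it is allocated as soon as it is free.
    static long idleSeconds(long availableFrom, long commonStart) {
        return commonStart - availableFrom;
    }

    public static void main(String[] args) {
        long[] sites = {0, 3600};  // site A now, site B in 1 hour
        long start = earliestCommonStart(sites);
        System.out.println("reserve both sites from t=" + start + "s");
        System.out.println("site A idle without reservation: "
                + idleSeconds(sites[0], start) + "s");
    }
}
```

With an advance reservation both sites are booked from t=3600 s, so A stays usable by other jobs during the first hour instead of waiting at the barrier.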

GARA RSL Example
  & (reservation-type = compute)
    (start-time = )
    (duration = 3600)
    (percentcpu = 100)

  & (reservation-type = network)
    (start-time = )
    (duration = 3600)
    (endpoint-a = )
    (endpoint-b = )
    (directionality = unidirectional-ba)
    (bandwidth = 150)

GRAM Summary
  Client
    MDS client API calls to locate resources (MDS: Grid Index Info Server)
    MDS client API calls to get resource info (MDS: Grid Resource Info Server)
    GRAM client API calls to request resource allocation and process creation
    GRAM client API state change callbacks
  Site boundary
  Gatekeeper
    Globus Security Infrastructure
    Creates the Job Manager
  Job Manager
    Parses the request with the RSL Library
    Requests the Local Resource Manager to allocate and create processes
    Monitors and controls the processes
  MDS: Grid Resource Info Server
    Queries the current status of the resource

Resource Management Stack
  To come: Scheduler, Resource Broker
  Done: Multiple Grid Site Resource Co-allocator, Grid Site Resource Manager, Local Resource Manager



More information

New User Seminar: Part 2 (best practices)

New User Seminar: Part 2 (best practices) New User Seminar: Part 2 (best practices) General Interest Seminar January 2015 Hugh Merz merz@sharcnet.ca Session Outline Submitting Jobs Minimizing queue waits Investigating jobs Checkpointing Efficiency

More information

The University of Oxford campus grid, expansion and integrating new partners. Dr. David Wallom Technical Manager

The University of Oxford campus grid, expansion and integrating new partners. Dr. David Wallom Technical Manager The University of Oxford campus grid, expansion and integrating new partners Dr. David Wallom Technical Manager Outline Overview of OxGrid Self designed components Users Resources, adding new local or

More information

NorduGrid Tutorial. Client Installation and Job Examples

NorduGrid Tutorial. Client Installation and Job Examples NorduGrid Tutorial Client Installation and Job Examples Linux Clusters for Super Computing Conference Linköping, Sweden October 18, 2004 Arto Teräs arto.teras@csc.fi Steps to Start Using NorduGrid 1) Install

More information

ARC-XWCH bridge: Running ARC jobs on the XtremWeb-CH volunteer

ARC-XWCH bridge: Running ARC jobs on the XtremWeb-CH volunteer ARC-XWCH bridge: Running ARC jobs on the XtremWeb-CH volunteer computing platform Internal report Marko Niinimaki, Mohamed BenBelgacem, Nabil Abdennadher HEPIA, January 2010 1. Background and motivation

More information

UNIT IV PROGRAMMING MODEL. Open source grid middleware packages - Globus Toolkit (GT4) Architecture, Configuration - Usage of Globus

UNIT IV PROGRAMMING MODEL. Open source grid middleware packages - Globus Toolkit (GT4) Architecture, Configuration - Usage of Globus UNIT IV PROGRAMMING MODEL Open source grid middleware packages - Globus Toolkit (GT4) Architecture, Configuration - Usage of Globus Globus: One of the most influential Grid middleware projects is the Globus

More information

M. Roehrig, Sandia National Laboratories. Philipp Wieder, Research Centre Jülich Nov 2002

M. Roehrig, Sandia National Laboratories. Philipp Wieder, Research Centre Jülich Nov 2002 Category: INFORMATIONAL Grid Scheduling Dictionary WG (SD-WG) M. Roehrig, Sandia National Laboratories Wolfgang Ziegler, Fraunhofer-Institute for Algorithms and Scientific Computing Philipp Wieder, Research

More information

Extensible Job Managers for Grid Computing

Extensible Job Managers for Grid Computing Extensible Job Managers for Grid Computing Paul D. Coddington Lici Lu Darren Webb Andrew L. Wendelborn Department of Computer Science University of Adelaide Adelaide, SA 5005, Australia Email: {paulc,andrew,darren}@cs.adelaide.edu.au

More information

Grid Programming: Concepts and Challenges. Michael Rokitka CSE510B 10/2007

Grid Programming: Concepts and Challenges. Michael Rokitka CSE510B 10/2007 Grid Programming: Concepts and Challenges Michael Rokitka SUNY@Buffalo CSE510B 10/2007 Issues Due to Heterogeneous Hardware level Environment Different architectures, chipsets, execution speeds Software
