Batch Queuing and Resource Management for PVM Applications in a Network of Workstations

Ursula Maier, Georg Stellner, Ivan Zoraja
Lehrstuhl für Rechnertechnik und Rechnerorganisation (LRR-TUM)
Institut für Informatik, Technische Universität München
{maier,stellner,zoraja}@informatik.tu-muenchen.de

Abstract. A resource management system can effectively shorten the runtime of batch jobs in a network of workstations (NOW). This is achieved with load balancing mechanisms that distribute the load equally among the hosts. To avoid conflicts between interactive users and batch jobs, a resource management system must be able to migrate batch jobs from an interactive host to an idle host. Common resource management systems offer process migration only for sequential jobs but not for parallel jobs. Within the SEMPA project a resource management system with batch queuing functionalities, including checkpointing and migration, is designed and implemented. We focus on PVM applications because PVM offers dynamic task management and an interface to resource management systems.¹

1 Introduction

Parallel scientific computing applications, e.g. in computational fluid dynamics, require a large amount of CPU time and memory. Therefore, they are often run on massively parallel systems. However, networks of workstations (NOWs) often have computing capacity available that is sufficient for the computation of resource-intensive applications. Especially smaller companies or research institutes use their NOWs for parallel applications as a low-cost alternative to massively parallel systems. A resource management system makes the use of a NOW transparent to the user and guarantees that the computational power of a NOW is utilized in the best possible way. To take advantage of a resource management system, resource-intensive applications are executed as batch jobs. In the remainder of this paper, a parallel application is a PVM application submitted as a batch job to a resource management system.
Checkpointing and migration of applications are important functionalities of a resource management system, for reasons of fault tolerance and dynamic load balancing. Periodic checkpoints of long-running applications are written to avoid the loss of the results computed so far if the application aborts unexpectedly, e.g. because of a hardware error. Process migration is a way to equalize the load in a NOW if the load situation is unbalanced, or to relocate processes during runtime.

¹ This work has been funded by the German Federal Department of Education, Science, Research and Technology, BMBF (Bundesministerium für Bildung, Wissenschaft, Forschung und Technologie) within the research project SEMPA (Software Engineering Methods for Parallel Applications in Scientific Computing).

Primarily, a NOW is used for interactive work; batch jobs only utilize idle resources, and hence the interactive users have precedence over batch jobs. If an interactive user wants to work on a host running a process of a parallel application, the process must be migrated, because it probably claims such an enormous amount of resources that the interactive user would experience unacceptable response times on that host. Existing resource management systems, e.g. Condor [LTBL97] and LSF [Pla96], offer checkpointing and migration only for sequential applications. Merely initial process placement is supported for parallel applications, i.e. the processes of a parallel application are mapped to appropriate hosts. The processes are bound to their hosts and cannot be migrated to other hosts at runtime, because checkpointing mechanisms for parallel applications with communicating processes are rarely available [ZB96]. The reason why existing resource management systems hardly support parallel applications is the lack of control over the processes of a parallel application. Without control over the processes, a resource management system is unable to kill, checkpoint or migrate a running process of a parallel application, or to observe resource limitations. A major goal of the SEMPA project [LMRW96] is to design and implement a batch queuing and resource management system for sequential and parallel applications in a NOW. Available resources should always be utilized for the execution of batch jobs. A mechanism for checkpointing and migration of parallel applications must be provided to equalize the load in the NOW and to release hosts running processes of a parallel application if these hosts are needed by an interactive user. Our basic idea was to use existing batch queuing and resource management facilities and to add new features supporting the efficient computation of parallel applications in a NOW.
The SEMPA Resource Manager is based on the batch queuing and resource management system CODINE [GEN96] and on the checkpointing and migration capability for parallel applications of CoCheck [Ste95]. A resource manager is implemented to control the parallel applications and to join the components and functions of CODINE and CoCheck. The remainder of the paper is organized as follows. Section 2 describes the design concept of the SEMPA Resource Manager. The structure and functionalities of the basic components are explained in section 3. Section 4 shows some implementation details of the SEMPA Resource Manager. First performance measurements are presented in section 5. The paper closes with a brief summary and an outlook on further research.

2 The Design Concept of the SEMPA Resource Manager

An architectural design of a distributed resource management system for parallel applications in a NOW is introduced in [MS97]. The concept of the distributed resource management system comprises modular components for the main functionalities batch queuing, scheduling and load management, and includes defined interfaces between these components. The scheduling component is organized hierarchically, i.e. a global resource manager places a parallel application initially and then passes it to a local resource manager that is responsible for the parallel application until it has finished. The functions of the local resource manager are the management of hosts and processes and the remapping of the parallel application. The SEMPA Resource Manager is an implementation of the design concept presented in [MS97], based on CODINE, CoCheck and the PVM resource manager interface.

The architectural design of the SEMPA Resource Manager strongly depends on the structure and the components of CODINE and CoCheck, which should be retained as far as possible. An important issue in the design of the SEMPA Resource Manager is to define a communication model for the information exchange between the different components. One of the major functions of the SEMPA Resource Manager is to control the parallel applications, which means to control each of their processes. This is the basic assumption for further functions of the SEMPA Resource Manager that operate on single processes of a parallel application. Control over a parallel application is required to:

- suspend a running parallel application
- stop a running parallel application, e.g. on request of the job owner
- write periodic checkpoints of a parallel application
- migrate one or more processes of a parallel application
- observe resource limitations of a parallel application
- collect accounting information about a parallel application

3 Components of the SEMPA Resource Manager

Structure and functionalities of the main components of the SEMPA Resource Manager (CODINE, CoCheck and the resource manager interface) are explained in the following sections.

3.1 CODINE

CODINE is a batch queuing and resource management system for NOWs [GEN96]. Users submit their jobs to CODINE, which queues the jobs until the required resources are available. A batch job is composed of an application and resource requirements specified by the user, e.g. machine architecture or size of memory. CODINE maps sequential and parallel applications to idle or lightly loaded hosts. CODINE is built up of various components to queue and schedule jobs and to measure the load on the hosts in the NOW:

qmaster: The qmaster is the central component in CODINE and has control over all other components. It corresponds to a database server containing the information about hosts and jobs.

schedd: The schedd is the component that performs the scheduling algorithm.
It gets information about hosts and jobs from the qmaster and computes the job order list.

commd: A communication daemon is running on every host that is controlled by CODINE. The commd implements the communication between the CODINE components over TCP sockets. Some connections are permanent, e.g. between qmaster and schedd; other connections are set up on demand and closed when the transmission is over.

execd: An execution daemon is running on every host that executes batch jobs. The execd starts and controls jobs and measures the load on its host. When a job has finished, the execd returns the accounting information about the job to the qmaster.

shepherd: The shepherd process is started by the execd and builds up the execution environment for a job. The execd does not start a job immediately but starts a shepherd, and the shepherd starts the job by forking a process. When the job has finished, the shepherd collects the accounting information about the job.

Figure 1 shows the components of CODINE and their relationship. qmaster and schedd usually run on the same host to minimize the communication overhead. Jobs run on execution hosts, and for every job a shepherd exists that controls the job.

[Figure 1: The structure of CODINE]

When a parallel job is submitted, additional resource requirements must be specified compared to a sequential batch job, e.g. the parallel programming environment or the minimum and maximum number of hosts. Parallel CODINE jobs can use PVM, MPI or EXPRESS as parallel programming environment. A job in CODINE is not directly started by an execd but by a shepherd process that is started by the execd. The shepherd is the parent of the started job and has control over the job, e.g. to suspend or kill the job during runtime or to collect accounting information about the job. A shepherd can only start one job, whereas an execd can start several shepherd processes. In the current version of CODINE there is only a single shepherd for each parallel job, i.e. CODINE only has control over the process forked by the shepherd but not over processes that are created by parallel programming environments, e.g. spawned by PVM.
Thus, operations such as resource limitation and the collection of accounting information can only be performed for the master process forked by the shepherd, but not for the spawned processes. One of the aims of the SEMPA Resource Manager is to overcome this deficiency.
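The shepherd mechanism described above (the execd forks a shepherd, the shepherd forks the job and gathers accounting data when it finishes) can be sketched in a few lines. This is an illustrative Python model, not CODINE code; the field names of the accounting record are assumptions:

```python
import os
import sys
import subprocess

def shepherd(cmd):
    """Start a job as a child process, wait for it to finish, and
    collect accounting information -- the role CODINE's shepherd
    plays for each job (simplified sketch)."""
    job = subprocess.Popen(cmd)            # the shepherd forks the job
    pid, status, rusage = os.wait4(job.pid, 0)
    return {                               # accounting record (illustrative fields)
        "pid": pid,
        "exit_status": os.waitstatus_to_exitcode(status),
        "cpu_seconds": rusage.ru_utime + rusage.ru_stime,
    }

# The "job" here is a trivial Python process that exits immediately.
acct = shepherd([sys.executable, "-c", "pass"])
```

Because the shepherd is the job's parent, the accounting data comes directly from the kernel's resource usage for the child, which is exactly why a single shepherd cannot account for tasks spawned elsewhere by PVM.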

3.2 CoCheck

CoCheck (Consistent Checkpoints) is an extension to message-passing libraries that allows the creation of checkpoints of parallel applications and the migration of processes. Implementations of CoCheck for PVM [Ste95] and MPI [Ste96] exist. For the remainder of the paper we will refer to the PVM version of CoCheck. Before the application can actually be started, the user must relink the application with the CoCheck libraries to incorporate the code which implements checkpointing and migration. A resource manager is provided [GBD+94] that receives and handles requests to checkpoint or restart an application or to migrate processes. An API has been defined to send such requests to the resource manager. After the resource manager of CoCheck has received a request to checkpoint or migrate, it initiates the CoCheck checkpointing protocol. All processes of the currently executing application are informed about a pending checkpoint. In turn, all the processes start to exchange so-called "ready messages". These ready messages flush all communication channels between all the processes. Messages that were in transit at checkpoint time are thus forwarded to their destination and stored there. After restart, these messages are automatically retrieved from the buffers. When the processes are restarted, they get a new identifier. These identifiers are then sent to the CoCheck resource manager, which in turn sets up a mapping table from old to current identifiers. Within the wrappers for the communication calls, these current values are used to send and receive messages instead of the values that the application actually uses. Hence, checkpointing and migration are transparent to the application [Ste95].

3.3 The Resource Manager Interface

PVM offers a resource manager interface to define one's own host and task management and new scheduling strategies [GBD+94].
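The channel-flushing effect of the ready messages can be illustrated with a small model. This is a didactic Python sketch of the idea only, not CoCheck's protocol implementation; the process names and the READY marker are made up:

```python
from collections import deque

def flush_channels(channels, procs):
    """Model of the ready-message exchange: every process sends a
    READY marker on each outgoing channel; each receiver then drains
    every incoming channel up to the marker, buffering any messages
    that were still in transit so they can be replayed after restart."""
    buffered = {p: [] for p in procs}
    for sender in procs:                   # phase 1: send READY everywhere
        for receiver in procs:
            if sender != receiver:
                channels[(sender, receiver)].append("READY")
    for receiver in procs:                 # phase 2: drain up to READY
        for sender in procs:
            if sender != receiver:
                chan = channels[(sender, receiver)]
                while True:
                    msg = chan.popleft()
                    if msg == "READY":
                        break
                    buffered[receiver].append(msg)
    return buffered

chans = {(p, q): deque() for p in "AB" for q in "AB" if p != q}
chans[("A", "B")].append("m1")             # a message in transit at checkpoint time
saved = flush_channels(chans, "AB")        # "m1" ends up buffered at B
```

After the flush, every channel is provably empty, which is what makes the set of per-process checkpoints consistent.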
Usually, PVM calls are handled by the pvmds, but if there is a resource manager registered in the virtual machine, calls concerning hosts and tasks, e.g. pvm_addhosts or pvm_spawn, are redirected to the resource manager. The resource manager provides handler functions to execute the redirected calls. The handler functions in the resource manager are not part of PVM; they must be written explicitly by the user according to a given message framework. CoCheck uses the resource manager interface for the implementation of additional handler functions for checkpointing and migration. For the SEMPA Resource Manager, a complete resource manager has been implemented with handler functions for all affected calls; it joins the components of CODINE, CoCheck and PVM and realizes a local resource manager for every PVM application.

4 Implementation Aspects of the SEMPA Resource Manager

In the previous sections the architectural design and the components of the SEMPA Resource Manager have been introduced. This section explains some functionalities of the SEMPA Resource Manager and shows some implementation details.
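The pattern of redirected calls being executed by user-supplied handler functions can be sketched as a dispatch table. The message names and request fields below are illustrative inventions, not PVM's actual message framework:

```python
# Registry of handler functions, one per redirected call
# (a toy model of the resource manager interface).
handlers = {}

def handler(msg_name):
    """Register a handler function for a redirected call."""
    def register(fn):
        handlers[msg_name] = fn
        return fn
    return register

@handler("spawn")
def handle_spawn(request):
    # A real handler would pick a host and ask the tasker on that
    # host to start the task; here we take the first candidate.
    return {"task": request["task"], "host": request["candidates"][0]}

def dispatch(msg_name, request):
    """Route a redirected call to its registered handler."""
    return handlers[msg_name](request)

placement = dispatch("spawn",
                     {"task": "partfc", "candidates": ["host2", "host3"]})
```

The point of the design is the same as in the paper: the routing machinery is generic, while the policy lives entirely in the handler functions the user supplies.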

The main component of the SEMPA Resource Manager is the resource manager with its handler functions for host and task management, which initiate certain operations of CODINE, CoCheck or PVM. The data exchange between the CODINE and PVM components is realized by PVM calls and a signal interface.

4.1 Starting a Job by the SEMPA Resource Manager

Before a job can be started, hosts for the execution of the job must be selected and the parallel environment must be configured. In the SEMPA Resource Manager, the CODINE scheduler selects the hosts for the application, and the master host where the application is started, according to the load on the hosts. Then the execd on the master host starts a shepherd, called the master shepherd. The master shepherd starts the master PVM daemon (pvmd) and the resource manager. The resource manager sets up the virtual machine with the hosts selected by the schedd, i.e. it starts a slave pvmd and a tasker on each host belonging to the virtual machine. Due to implementation constraints, the resource manager must be started before hosts are added to the virtual machine. Now the virtual machine is built up completely, with the master pvmd and the resource manager running on the master host and a slave pvmd and a tasker on every other host in the virtual machine, as shown in Figure 2. As the next step, the application is started by the master shepherd, i.e. the first task is started, which usually spawns further tasks.

4.2 Spawning a Task

As mentioned above, CODINE is intended to have control over all tasks spawned by PVM. The tasker concept is used to implement the creation of a new task with its own strategy. The resource manager selects a host within the virtual machine where the new task is started. If no appropriate host is available in the virtual machine, the resource manager requests a new host, possibly with specific hardware requirements, from the CODINE qmaster.
The pvm_spawn call is sent to the resource manager, which selects a host and sends a message to the tasker on that host. The resource manager uses the round-robin strategy to map tasks to hosts. A strategy considering load information about the hosts will be implemented in the next phase of the project [SKS92]. It is not reasonable to specify a particular host in the pvm_spawn call, because the resource manager selects a host for the task. If the pvm_spawn call fails, a corresponding error is generated and the responsibility to handle the error message is passed back to the calling task. The tasker implements a procedure that prevents the tasker from forking the new task itself, but instead causes the execd to start a shepherd that finally creates the task (see Figure 3). The task is spawned on a host belonging to the virtual machine, i.e. a slave pvmd and a tasker are already running on that host. The spawned task is now under the control of CODINE and PVM.
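The current round-robin mapping of tasks to hosts can be sketched as follows (illustrative Python; the host names are made up):

```python
from itertools import cycle

def make_round_robin(hosts):
    """Return a selector that hands out hosts in round-robin order,
    mirroring the mapping strategy the resource manager currently
    uses for spawned tasks."""
    order = cycle(hosts)
    return lambda: next(order)

pick_host = make_round_robin(["host1", "host2", "host3"])
picks = [pick_host() for _ in range(4)]    # wraps around after host3
```

Round robin ignores host capacities and current load, which is precisely the limitation the planned load-aware strategy [SKS92] is meant to remove.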

[Figure 2: Starting a job by the SEMPA Resource Manager]

4.3 Exiting a Task

When a task exits, CODINE and the resource manager must be notified. An exiting task sends a SIGCHLD signal to its parent process, which is a shepherd process. After receiving the SIGCHLD, the shepherd writes the accounting information about the task to a temporary file and sends a SIGCHLD to the tasker to inform it that the task has exited. The shepherd then exits and sends a SIGCHLD to its parent process, the execd. When the resource manager recognizes that all tasks have terminated, it stops the virtual machine and exits.

5 Performance Measurements

Functionalities and performance of the SEMPA Resource Manager have been evaluated with ParTfC as a real-world test case. ParTfC is a computational fluid dynamics package to compute laminar and turbulent viscous flows in three-dimensional geometries. It has been parallelized within the SEMPA project according to the SPMD (single program, multiple data) paradigm [LMR+96]. The underlying grid is partitioned into smaller parts and every partition is computed by its own process.
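The SIGCHLD notification at the bottom of this chain can be demonstrated with a minimal parent/child example (a POSIX-only Python sketch; the real shepherd, tasker and execd are separate programs):

```python
import os
import signal
import time

reaped = []

def on_sigchld(signum, frame):
    # Like the shepherd, the parent learns of the exiting "task"
    # via SIGCHLD and can then record its exit status.
    pid, status = os.waitpid(-1, os.WNOHANG)
    if pid != 0:
        reaped.append((pid, os.waitstatus_to_exitcode(status)))

signal.signal(signal.SIGCHLD, on_sigchld)

pid = os.fork()
if pid == 0:
    os._exit(0)        # child: the task exits immediately

deadline = time.time() + 5.0
while not reaped and time.time() < deadline:
    time.sleep(0.01)   # parent: wait for the asynchronous notification
```

The asynchronous signal is what lets each layer (shepherd, tasker, execd) react to the exit without polling for it.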

[Figure 3: Spawning a task by the SEMPA Resource Manager]

The presented time measurements show the influence of a resource management system on the runtime of ParTfC. The following three measurement models have been examined:

(M1) ParTfC in interactive mode
(M2) ParTfC started as a CODINE batch job without a resource manager
(M3) ParTfC submitted as a batch job to the SEMPA Resource Manager

The time measurements were performed with two different grids:

(T1) A grid with 3150 grid nodes divided into 4 partitions.
(T2) A grid with grid nodes divided into 4 partitions.

The four processes of ParTfC were computed on two SGI Indigo 4400, so that two processes were running on one host. The two grids are relatively small, but they are sufficient to show that the overhead produced by CODINE or the SEMPA Resource Manager is negligible. Table 1 shows that the runtime of ParTfC hardly increases if ParTfC is started as a batch job in CODINE or the SEMPA Resource Manager, compared to the runtime of ParTfC in interactive mode. The times for the start and stop scripts in CODINE and the SEMPA Resource Manager, which are executed before starting and after finishing ParTfC, are shown in Table 2. Compared to the runtime of ParTfC these times can be neglected. The start script in CODINE starts PVM and sets up the virtual machine. The execution of the

start script of the SEMPA Resource Manager takes more time compared to the start script of CODINE, because the resource manager and the taskers must be started in addition. The stop script of CODINE performs a pvm halt to stop the virtual machine. The stop script of the SEMPA Resource Manager sends a signal to the resource manager to stop the virtual machine once all processes of the parallel application have finished.

        (M1)     (M2)     (M3)
(T1)    190 s    194 s    197 s
(T2)    389 s    395 s    396 s

Table 1: Runtime of ParTfC for the three measurement models

              (M2)      (M3)
start script  100 ms    4.2 s
stop script   100 ms    60 ms

Table 2: Time for start and stop scripts in CODINE and the SEMPA Resource Manager

6 Conclusion

The SEMPA Resource Manager provides batch queuing and resource management facilities for PVM applications in a NOW. Parallel applications are started as batch jobs, and each process of a parallel application is under the control of the SEMPA Resource Manager, so that e.g. resource limitation and migration of each process can be performed. The presented approach is restricted to PVM applications, because PVM offers dynamic task management and features to define one's own resource management services. The flexibility of this concept avoids changes in the PVM code. Modifications in CODINE and CoCheck are necessary but reduced to a minimum. The implementation of the SEMPA Resource Manager has almost been completed, except for the integration of the CoCheck handler functions into the resource manager. The next step after the integration of the migration facilities will be to improve the scheduling strategy of the resource manager, to decide about the mapping and remapping of processes more efficiently. Currently the round-robin method is used, which considers neither the different CPU and memory capacities of the hosts nor the actual load situation in the virtual machine and the NOW.
The interface between the resource manager and the CODINE qmaster must be extended to make scheduling information of CODINE available to the resource manager.

References

[GBD+94] Al Geist, Adam Beguelin, Jack Dongarra, Weicheng Jiang, Robert Manchek, and Vaidy Sunderam. PVM: Parallel Virtual Machine. A Users' Guide and Tutorial for Networked Parallel Computing. Scientific and Engineering Computation. The MIT Press, Cambridge, MA, 1994.

[GEN96] GENIAS Software GmbH, Erzgebirgstr. 2B, Neutraubling, Germany. CODINE Reference Manual, Version 4.0, 1996.

[LMR+96] Peter Luksch, Ursula Maier, Sabine Rathmayer, Friedemann Unger, and Matthias Weidmann. Parallelization of a State-of-the-Art Industrial CFD Package for Execution on Networks of Workstations and Massively Parallel Processors. In Third European PVM Users' Group Meeting, EuroPVM 96, München, October 1996.

[LMRW96] Peter Luksch, Ursula Maier, Sabine Rathmayer, and Matthias Weidmann. SEMPA: Software Engineering Methods for Parallel Scientific Applications. In International Software Engineering Week, First International Workshop on Software Engineering for Parallel and Distributed Systems, Berlin, March 1996.

[LTBL97] Michael Litzkow, Todd Tannenbaum, Jim Basney, and Miron Livny. Checkpoint and Migration of UNIX Processes in the Condor Distributed Environment. Technical Report 1346, University of Wisconsin-Madison, April 1997.

[MS97] Ursula Maier and Georg Stellner. Distributed Resource Management for Parallel Applications in Networks of Workstations. In HPCN Europe 1997, volume 1225 of Lecture Notes in Computer Science, pages 462-471. Springer-Verlag, 1997.

[Pla96] Platform Computing Corporation, North York, Ontario, Canada. LSF Documentation, December 1996.

[SKS92] Niranjan G. Shivaratri, Phillip Krueger, and Mukesh Singhal. Load Distributing for Locally Distributed Systems. Computer, 25(12):33-44, December 1992.

[Ste95] Georg Stellner. Checkpointing and Process Migration for PVM. In Arndt Bode, Thomas Ludwig, Vaidy Sunderam, and Roland Wismüller, editors, Workshop on PVM, MPI, Tools and Applications, number 342/18/95 A in SFB-Bericht, pages 44-48. Technische Universität München, Institut für Informatik, November 1995.

[Ste96] Georg Stellner. CoCheck: Checkpointing and Process Migration for MPI. In Proceedings of the International Parallel Processing Symposium, pages 526-531, Honolulu, HI, April 1996. IEEE Computer Society Press, Los Alamitos, CA.

[ZB96] Avi Ziv and Jehoshua Bruck. Checkpointing in Parallel and Distributed Systems. In Albert Zomaya, editor, Parallel and Distributed Computing Handbook, Series on Computer Engineering, chapter 10, pages 274-302. McGraw-Hill, 1996.


More information

A Distributed Load Sharing Batch System. Jingwen Wang, Songnian Zhou, Khalid Ahmed, and Weihong Long. Technical Report CSRI-286.

A Distributed Load Sharing Batch System. Jingwen Wang, Songnian Zhou, Khalid Ahmed, and Weihong Long. Technical Report CSRI-286. LSBATCH: A Distributed Load Sharing Batch System Jingwen Wang, Songnian Zhou, Khalid Ahmed, and Weihong Long Technical Report CSRI-286 April 1993 Computer Systems Research Institute University of Toronto

More information

A Hierarchical Approach to Workload. M. Calzarossa 1, G. Haring 2, G. Kotsis 2,A.Merlo 1,D.Tessera 1

A Hierarchical Approach to Workload. M. Calzarossa 1, G. Haring 2, G. Kotsis 2,A.Merlo 1,D.Tessera 1 A Hierarchical Approach to Workload Characterization for Parallel Systems? M. Calzarossa 1, G. Haring 2, G. Kotsis 2,A.Merlo 1,D.Tessera 1 1 Dipartimento di Informatica e Sistemistica, Universita dipavia,

More information

Adaptive load migration systems for PVM

Adaptive load migration systems for PVM Oregon Health & Science University OHSU Digital Commons CSETech March 1994 Adaptive load migration systems for PVM Jeremy Casas Ravi Konuru Steve W. Otto Robert Prouty Jonathan Walpole Follow this and

More information

should invest the time and eort to rewrite their existing PVM applications in MPI. In this paper we address these questions by comparing the features

should invest the time and eort to rewrite their existing PVM applications in MPI. In this paper we address these questions by comparing the features PVM and MPI: a Comparison of Features G. A. Geist J. A. Kohl P. M. Papadopoulos May 30, 1996 Abstract This paper compares PVM and MPI features, pointing out the situations where one may befavored over

More information

Enhancing Integrated Layer Processing using Common Case. Anticipation and Data Dependence Analysis. Extended Abstract

Enhancing Integrated Layer Processing using Common Case. Anticipation and Data Dependence Analysis. Extended Abstract Enhancing Integrated Layer Processing using Common Case Anticipation and Data Dependence Analysis Extended Abstract Philippe Oechslin Computer Networking Lab Swiss Federal Institute of Technology DI-LTI

More information

HARNESS. provides multi-level hot pluggability. virtual machines. split off mobile agents. merge multiple collaborating sites.

HARNESS. provides multi-level hot pluggability. virtual machines. split off mobile agents. merge multiple collaborating sites. HARNESS: Heterogeneous Adaptable Recongurable NEtworked SystemS Jack Dongarra { Oak Ridge National Laboratory and University of Tennessee, Knoxville Al Geist { Oak Ridge National Laboratory James Arthur

More information

Application Programm 1

Application Programm 1 A Concept of Datamigration in a Distributed, Object-Oriented Knowledge Base Oliver Schmid Research Institute for Robotic and Real-Time Systems, Department of Computer Science, Technical University of Munich,

More information

N1GE6 Checkpointing and Berkeley Lab Checkpoint/Restart. Liang PENG Lip Kian NG

N1GE6 Checkpointing and Berkeley Lab Checkpoint/Restart. Liang PENG Lip Kian NG N1GE6 Checkpointing and Berkeley Lab Checkpoint/Restart Liang PENG Lip Kian NG N1GE6 Checkpointing and Berkeley Lab Checkpoint/Restart Liang PENG Lip Kian NG APSTC-TB-2004-005 Abstract: N1GE6, formerly

More information

UTOPIA: A Load Sharing Facility for Large, Heterogeneous Distributed Computer Systems. Technical Report CSRI-257. April 1992

UTOPIA: A Load Sharing Facility for Large, Heterogeneous Distributed Computer Systems. Technical Report CSRI-257. April 1992 UTOPIA: A Load Sharing Facility for Large, Heterogeneous Distributed Computer Systems Songnian Zhou, Jingwen Wang, Xiaohu Zheng, and Pierre Delisle Technical Report CSRI-257 April 1992. (To appear in Software

More information

Jeremy Casas, Dan Clark, Ravi Konuru, Steve W. Otto, Robert Prouty, and Jonathan Walpole.

Jeremy Casas, Dan Clark, Ravi Konuru, Steve W. Otto, Robert Prouty, and Jonathan Walpole. MPVM: A Migration Transparent Version of PVM Jeremy Casas, Dan Clark, Ravi Konuru, Steve W. Otto, Robert Prouty, and Jonathan Walpole fcasas,dclark,konuru,otto,prouty,walpoleg@cse.ogi.edu Department of

More information

processes based on Message Passing Interface

processes based on Message Passing Interface Checkpointing and Migration of parallel processes based on Message Passing Interface Zhang Youhui, Wang Dongsheng, Zheng Weimin Department of Computer Science, Tsinghua University, China. Abstract This

More information

Making Workstations a Friendly Environment for Batch Jobs. Miron Livny Mike Litzkow

Making Workstations a Friendly Environment for Batch Jobs. Miron Livny Mike Litzkow Making Workstations a Friendly Environment for Batch Jobs Miron Livny Mike Litzkow Computer Sciences Department University of Wisconsin - Madison {miron,mike}@cs.wisc.edu 1. Introduction As time-sharing

More information

A Freely Congurable Audio-Mixing Engine. M. Rosenthal, M. Klebl, A. Gunzinger, G. Troster

A Freely Congurable Audio-Mixing Engine. M. Rosenthal, M. Klebl, A. Gunzinger, G. Troster A Freely Congurable Audio-Mixing Engine with Automatic Loadbalancing M. Rosenthal, M. Klebl, A. Gunzinger, G. Troster Electronics Laboratory, Swiss Federal Institute of Technology CH-8092 Zurich, Switzerland

More information

TIME WARP PARALLEL LOGIC SIMULATION ON A DISTRIBUTED MEMORY MULTIPROCESSOR. Peter Luksch, Holger Weitlich

TIME WARP PARALLEL LOGIC SIMULATION ON A DISTRIBUTED MEMORY MULTIPROCESSOR. Peter Luksch, Holger Weitlich TIME WARP PARALLEL LOGIC SIMULATION ON A DISTRIBUTED MEMORY MULTIPROCESSOR ABSTRACT Peter Luksch, Holger Weitlich Department of Computer Science, Munich University of Technology P.O. Box, D-W-8-Munchen,

More information

Egemen Tanin, Tahsin M. Kurc, Cevdet Aykanat, Bulent Ozguc. Abstract. Direct Volume Rendering (DVR) is a powerful technique for

Egemen Tanin, Tahsin M. Kurc, Cevdet Aykanat, Bulent Ozguc. Abstract. Direct Volume Rendering (DVR) is a powerful technique for Comparison of Two Image-Space Subdivision Algorithms for Direct Volume Rendering on Distributed-Memory Multicomputers Egemen Tanin, Tahsin M. Kurc, Cevdet Aykanat, Bulent Ozguc Dept. of Computer Eng. and

More information

Khoral Research, Inc. Khoros is a powerful, integrated system which allows users to perform a variety

Khoral Research, Inc. Khoros is a powerful, integrated system which allows users to perform a variety Data Parallel Programming with the Khoros Data Services Library Steve Kubica, Thomas Robey, Chris Moorman Khoral Research, Inc. 6200 Indian School Rd. NE Suite 200 Albuquerque, NM 87110 USA E-mail: info@khoral.com

More information

Monitoring the Usage of the ZEUS Analysis Grid

Monitoring the Usage of the ZEUS Analysis Grid Monitoring the Usage of the ZEUS Analysis Grid Stefanos Leontsinis September 9, 2006 Summer Student Programme 2006 DESY Hamburg Supervisor Dr. Hartmut Stadie National Technical

More information

CHAPTER 4 AN INTEGRATED APPROACH OF PERFORMANCE PREDICTION ON NETWORKS OF WORKSTATIONS. Xiaodong Zhang and Yongsheng Song

CHAPTER 4 AN INTEGRATED APPROACH OF PERFORMANCE PREDICTION ON NETWORKS OF WORKSTATIONS. Xiaodong Zhang and Yongsheng Song CHAPTER 4 AN INTEGRATED APPROACH OF PERFORMANCE PREDICTION ON NETWORKS OF WORKSTATIONS Xiaodong Zhang and Yongsheng Song 1. INTRODUCTION Networks of Workstations (NOW) have become important distributed

More information

Mechanisms for Just-in-Time Allocation of Resources to Adaptive Parallel Programs

Mechanisms for Just-in-Time Allocation of Resources to Adaptive Parallel Programs Mechanisms for Just-in-Time Allocation of Resources to Adaptive Parallel Programs Arash Baratloo Ayal Itzkovitz Zvi M. Kedem Yuanyuan Zhao fbaratloo,ayali,kedem,yuanyuang@cs.nyu.edu Department of Computer

More information

Mechanisms for Just-in-Time Allocation of Resources to Adaptive Parallel Programs

Mechanisms for Just-in-Time Allocation of Resources to Adaptive Parallel Programs Mechanisms for Just-in-Time Allocation of Resources to Adaptive Parallel Programs Arash Baratloo Ayal Itzkovitz Zvi M. Kedem Yuanyuan Zhao baratloo,ayali,kedem,yuanyuan @cs.nyu.edu Department of Computer

More information

Covering the Aztec Diamond with One-sided Tetrasticks Extended Version

Covering the Aztec Diamond with One-sided Tetrasticks Extended Version Covering the Aztec Diamond with One-sided Tetrasticks Extended Version Alfred Wassermann, University of Bayreuth, D-95440 Bayreuth, Germany Abstract There are 107 non-isomorphic coverings of the Aztec

More information

Network Computing Environment. Adam Beguelin, Jack Dongarra. Al Geist, Robert Manchek. Keith Moore. August, Rice University

Network Computing Environment. Adam Beguelin, Jack Dongarra. Al Geist, Robert Manchek. Keith Moore. August, Rice University HeNCE: A Heterogeneous Network Computing Environment Adam Beguelin, Jack Dongarra Al Geist, Robert Manchek Keith Moore CRPC-TR93425 August, 1993 Center for Research on Parallel Computation Rice University

More information

Storage System. Distributor. Network. Drive. Drive. Storage System. Controller. Controller. Disk. Disk

Storage System. Distributor. Network. Drive. Drive. Storage System. Controller. Controller. Disk. Disk HRaid: a Flexible Storage-system Simulator Toni Cortes Jesus Labarta Universitat Politecnica de Catalunya - Barcelona ftoni, jesusg@ac.upc.es - http://www.ac.upc.es/hpc Abstract Clusters of workstations

More information

Grid Compute Resources and Grid Job Management

Grid Compute Resources and Grid Job Management Grid Compute Resources and Job Management March 24-25, 2007 Grid Job Management 1 Job and compute resource management! This module is about running jobs on remote compute resources March 24-25, 2007 Grid

More information

Normal mode acoustic propagation models. E.A. Vavalis. the computer code to a network of heterogeneous workstations using the Parallel

Normal mode acoustic propagation models. E.A. Vavalis. the computer code to a network of heterogeneous workstations using the Parallel Normal mode acoustic propagation models on heterogeneous networks of workstations E.A. Vavalis University of Crete, Mathematics Department, 714 09 Heraklion, GREECE and IACM, FORTH, 711 10 Heraklion, GREECE.

More information

Job Management System Extension To Support SLAAC-1V Reconfigurable Hardware

Job Management System Extension To Support SLAAC-1V Reconfigurable Hardware Job Management System Extension To Support SLAAC-1V Reconfigurable Hardware Mohamed Taher 1, Kris Gaj 2, Tarek El-Ghazawi 1, and Nikitas Alexandridis 1 1 The George Washington University 2 George Mason

More information

Tutorial 4: Condor. John Watt, National e-science Centre

Tutorial 4: Condor. John Watt, National e-science Centre Tutorial 4: Condor John Watt, National e-science Centre Tutorials Timetable Week Day/Time Topic Staff 3 Fri 11am Introduction to Globus J.W. 4 Fri 11am Globus Development J.W. 5 Fri 11am Globus Development

More information

The driving motivation behind the design of the Janus framework is to provide application-oriented, easy-to-use and ecient abstractions for the above

The driving motivation behind the design of the Janus framework is to provide application-oriented, easy-to-use and ecient abstractions for the above Janus a C++ Template Library for Parallel Dynamic Mesh Applications Jens Gerlach, Mitsuhisa Sato, and Yutaka Ishikawa fjens,msato,ishikawag@trc.rwcp.or.jp Tsukuba Research Center of the Real World Computing

More information

NOW Based Parallel Reconstruction of Functional Images

NOW Based Parallel Reconstruction of Functional Images NOW Based Parallel Reconstruction of Functional Images F. Munz 1, T. Stephan 2, U. Maier 2, T. Ludwig 2,A.Bode 2, S. Ziegler 1,S.Nekolla 1, P. Bartenstein 1 and M. Schwaiger 1 1 Nuklearmedizinische Klinik

More information

UNICORE Globus: Interoperability of Grid Infrastructures

UNICORE Globus: Interoperability of Grid Infrastructures UNICORE : Interoperability of Grid Infrastructures Michael Rambadt Philipp Wieder Central Institute for Applied Mathematics (ZAM) Research Centre Juelich D 52425 Juelich, Germany Phone: +49 2461 612057

More information

Improving the Performance of Coordinated Checkpointers. on Networks of Workstations using RAID Techniques. University of Tennessee

Improving the Performance of Coordinated Checkpointers. on Networks of Workstations using RAID Techniques. University of Tennessee Improving the Performance of Coordinated Checkpointers on Networks of Workstations using RAID Techniques James S. Plank Department of Computer Science University of Tennessee Knoxville, TN 37996 plank@cs.utk.edu

More information

Supporting Heterogeneous Network Computing: PVM. Jack J. Dongarra. Oak Ridge National Laboratory and University of Tennessee. G. A.

Supporting Heterogeneous Network Computing: PVM. Jack J. Dongarra. Oak Ridge National Laboratory and University of Tennessee. G. A. Supporting Heterogeneous Network Computing: PVM Jack J. Dongarra Oak Ridge National Laboratory and University of Tennessee G. A. Geist Oak Ridge National Laboratory Robert Manchek University of Tennessee

More information

/98 $10.00 (c) 1998 IEEE

/98 $10.00 (c) 1998 IEEE CUMULVS: Extending a Generic Steering and Visualization Middleware for lication Fault-Tolerance Philip M. Papadopoulos, phil@msr.epm.ornl.gov James Arthur Kohl, kohl@msr.epm.ornl.gov B. David Semeraro,

More information

EUROPEAN ORGANIZATION FOR NUCLEAR RESEARCH PARALLEL IN-MEMORY DATABASE. Dept. Mathematics and Computing Science div. ECP

EUROPEAN ORGANIZATION FOR NUCLEAR RESEARCH PARALLEL IN-MEMORY DATABASE. Dept. Mathematics and Computing Science div. ECP EUROPEAN ORGANIZATION FOR NUCLEAR RESEARCH CERN/ECP 95-29 11 December 1995 ON-LINE EVENT RECONSTRUCTION USING A PARALLEL IN-MEMORY DATABASE E. Argante y;z,p. v.d. Stok y, I. Willers z y Eindhoven University

More information

The MPBench Report. Philip J. Mucci. Kevin London. March 1998

The MPBench Report. Philip J. Mucci. Kevin London.  March 1998 The MPBench Report Philip J. Mucci Kevin London mucci@cs.utk.edu london@cs.utk.edu March 1998 1 Introduction MPBench is a benchmark to evaluate the performance of MPI and PVM on MPP's and clusters of workstations.

More information

UNIVERSITY OF MINNESOTA. This is to certify that I have examined this copy of master s thesis by. Vishwas Raman

UNIVERSITY OF MINNESOTA. This is to certify that I have examined this copy of master s thesis by. Vishwas Raman UNIVERSITY OF MINNESOTA This is to certify that I have examined this copy of master s thesis by Vishwas Raman and have have found that it is complete and satisfactory in all respects, and that any and

More information

director executor user program user program signal, breakpoint function call communication channel client library directing server

director executor user program user program signal, breakpoint function call communication channel client library directing server (appeared in Computing Systems, Vol. 8, 2, pp.107-134, MIT Press, Spring 1995.) The Dynascope Directing Server: Design and Implementation 1 Rok Sosic School of Computing and Information Technology Grith

More information

Java Virtual Machine

Java Virtual Machine Evaluation of Java Thread Performance on Two Dierent Multithreaded Kernels Yan Gu B. S. Lee Wentong Cai School of Applied Science Nanyang Technological University Singapore 639798 guyan@cais.ntu.edu.sg,

More information

An Introduction to Parallel Processing with the Fork Transformation in SAS Data Integration Studio

An Introduction to Parallel Processing with the Fork Transformation in SAS Data Integration Studio Paper 2733-2018 An Introduction to Parallel Processing with the Fork Transformation in SAS Data Integration Studio Jeff Dyson, The Financial Risk Group ABSTRACT The SAS Data Integration Studio job is historically

More information

Efficiently building on-line tools for distributed heterogeneous environments

Efficiently building on-line tools for distributed heterogeneous environments Scientific Programming 10 (2002) 67 74 67 IOS Press Efficiently building on-line tools for distributed heterogeneous environments Günther Rackl, Thomas Ludwig, Markus Lindermeier and Alexandros Stamatakis

More information

(HT)Condor - Past and Future

(HT)Condor - Past and Future (HT)Condor - Past and Future Miron Livny John P. Morgridge Professor of Computer Science Wisconsin Institutes for Discovery University of Wisconsin-Madison חי has the value of 18 חי means alive Europe

More information

Towards ParadisEO-MO-GPU: a Framework for GPU-based Local Search Metaheuristics

Towards ParadisEO-MO-GPU: a Framework for GPU-based Local Search Metaheuristics Towards ParadisEO-MO-GPU: a Framework for GPU-based Local Search Metaheuristics N. Melab, T-V. Luong, K. Boufaras and E-G. Talbi Dolphin Project INRIA Lille Nord Europe - LIFL/CNRS UMR 8022 - Université

More information

Processes, PCB, Context Switch

Processes, PCB, Context Switch THE HONG KONG POLYTECHNIC UNIVERSITY Department of Electronic and Information Engineering EIE 272 CAOS Operating Systems Part II Processes, PCB, Context Switch Instructor Dr. M. Sakalli enmsaka@eie.polyu.edu.hk

More information

A MATLAB Toolbox for Distributed and Parallel Processing

A MATLAB Toolbox for Distributed and Parallel Processing A MATLAB Toolbox for Distributed and Parallel Processing S. Pawletta a, W. Drewelow a, P. Duenow a, T. Pawletta b and M. Suesse a a Institute of Automatic Control, Department of Electrical Engineering,

More information

Distributed Batch Controller. Department of Computer Science, University of Maryland, College Park, MD USA. Waterloo, ON N2L 3G1 Canada

Distributed Batch Controller. Department of Computer Science, University of Maryland, College Park, MD USA. Waterloo, ON N2L 3G1 Canada Processing TOVS Polar Pathnder Data Using the Distributed Batch Controller James Du a, Kenneth Salem b, Axel Schweiger c, and Miron Livny d a Department of Computer Science, University of Maryland, College

More information

CL/TB. An Allegro Common Lisp. J. Kempe, T. Lenz, B. Freitag, H. Schutz, G. Specht

CL/TB. An Allegro Common Lisp. J. Kempe, T. Lenz, B. Freitag, H. Schutz, G. Specht CL/TB An Allegro Common Lisp Programming Interface for TransBase J. Kempe, T. Lenz, B. Freitag, H. Schutz, G. Specht TECHNISCHE UNIVERSIT AT M UNCHEN Institut fur Informatik Orleansstrasse 34 D-8000 Munchen

More information

Grid Compute Resources and Job Management

Grid Compute Resources and Job Management Grid Compute Resources and Job Management How do we access the grid? Command line with tools that you'll use Specialised applications Ex: Write a program to process images that sends data to run on the

More information

What is checkpoint. Checkpoint libraries. Where to checkpoint? Why we need it? When to checkpoint? Who need checkpoint?

What is checkpoint. Checkpoint libraries. Where to checkpoint? Why we need it? When to checkpoint? Who need checkpoint? What is Checkpoint libraries Bosilca George bosilca@cs.utk.edu Saving the state of a program at a certain point so that it can be restarted from that point at a later time or on a different machine. interruption

More information

THE IMPLEMENTATION OF A DISTRIBUTED FILE SYSTEM SUPPORTING THE PARALLEL WORLD MODEL. Jun Sun, Yasushi Shinjo and Kozo Itano

THE IMPLEMENTATION OF A DISTRIBUTED FILE SYSTEM SUPPORTING THE PARALLEL WORLD MODEL. Jun Sun, Yasushi Shinjo and Kozo Itano THE IMPLEMENTATION OF A DISTRIBUTED FILE SYSTEM SUPPORTING THE PARALLEL WORLD MODEL Jun Sun, Yasushi Shinjo and Kozo Itano Institute of Information Sciences and Electronics University of Tsukuba Tsukuba,

More information

Applications PVM (Parallel Virtual Machine) Socket Interface. Unix Domain LLC/SNAP HIPPI-LE/FP/PH. HIPPI Networks

Applications PVM (Parallel Virtual Machine) Socket Interface. Unix Domain LLC/SNAP HIPPI-LE/FP/PH. HIPPI Networks Enhanced PVM Communications over a HIPPI Local Area Network Jenwei Hsieh, David H.C. Du, Norman J. Troullier 1 Distributed Multimedia Research Center 2 and Computer Science Department, University of Minnesota

More information

Application Programmer. Vienna Fortran Out-of-Core Program

Application Programmer. Vienna Fortran Out-of-Core Program Mass Storage Support for a Parallelizing Compilation System b a Peter Brezany a, Thomas A. Mueck b, Erich Schikuta c Institute for Software Technology and Parallel Systems, University of Vienna, Liechtensteinstrasse

More information

MOTION ESTIMATION IN MPEG-2 VIDEO ENCODING USING A PARALLEL BLOCK MATCHING ALGORITHM. Daniel Grosu, Honorius G^almeanu

MOTION ESTIMATION IN MPEG-2 VIDEO ENCODING USING A PARALLEL BLOCK MATCHING ALGORITHM. Daniel Grosu, Honorius G^almeanu MOTION ESTIMATION IN MPEG-2 VIDEO ENCODING USING A PARALLEL BLOCK MATCHING ALGORITHM Daniel Grosu, Honorius G^almeanu Multimedia Group - Department of Electronics and Computers Transilvania University

More information

Evaluating Personal High Performance Computing with PVM on Windows and LINUX Environments

Evaluating Personal High Performance Computing with PVM on Windows and LINUX Environments Evaluating Personal High Performance Computing with PVM on Windows and LINUX Environments Paulo S. Souza * Luciano J. Senger ** Marcos J. Santana ** Regina C. Santana ** e-mails: {pssouza, ljsenger, mjs,

More information

Comparing Centralized and Decentralized Distributed Execution Systems

Comparing Centralized and Decentralized Distributed Execution Systems Comparing Centralized and Decentralized Distributed Execution Systems Mustafa Paksoy mpaksoy@swarthmore.edu Javier Prado jprado@swarthmore.edu May 2, 2006 Abstract We implement two distributed execution

More information

Steering. Stream. User Interface. Stream. Manager. Interaction Managers. Snapshot. Stream

Steering. Stream. User Interface. Stream. Manager. Interaction Managers. Snapshot. Stream Agent Roles in Snapshot Assembly Delbert Hart Dept. of Computer Science Washington University in St. Louis St. Louis, MO 63130 hart@cs.wustl.edu Eileen Kraemer Dept. of Computer Science University of Georgia

More information

2 Fredrik Manne, Svein Olav Andersen where an error occurs. In order to automate the process most debuggers can set conditional breakpoints (watch-poi

2 Fredrik Manne, Svein Olav Andersen where an error occurs. In order to automate the process most debuggers can set conditional breakpoints (watch-poi This is page 1 Printer: Opaque this Automating the Debugging of Large Numerical Codes Fredrik Manne Svein Olav Andersen 1 ABSTRACT The development of large numerical codes is usually carried out in an

More information

160 M. Nadjarbashi, S.M. Fakhraie and A. Kaviani Figure 2. LUTB structure. each block-level track can be arbitrarily connected to each of 16 4-LUT inp

160 M. Nadjarbashi, S.M. Fakhraie and A. Kaviani Figure 2. LUTB structure. each block-level track can be arbitrarily connected to each of 16 4-LUT inp Scientia Iranica, Vol. 11, No. 3, pp 159{164 c Sharif University of Technology, July 2004 On Routing Architecture for Hybrid FPGA M. Nadjarbashi, S.M. Fakhraie 1 and A. Kaviani 2 In this paper, the routing

More information

Experiences in Managing Resources on a Large Origin3000 cluster

Experiences in Managing Resources on a Large Origin3000 cluster Experiences in Managing Resources on a Large Origin3000 cluster UG Summit 2002, Manchester, May 20 2002, Mark van de Sanden & Huub Stoffers http://www.sara.nl A oarse Outline of this Presentation Overview

More information

Process a program in execution; process execution must progress in sequential fashion. Operating Systems

Process a program in execution; process execution must progress in sequential fashion. Operating Systems Process Concept An operating system executes a variety of programs: Batch system jobs Time-shared systems user programs or tasks 1 Textbook uses the terms job and process almost interchangeably Process

More information

Design of it : an Aldor library to express parallel programs Extended Abstract Niklaus Mannhart Institute for Scientic Computing ETH-Zentrum CH-8092 Z

Design of it : an Aldor library to express parallel programs Extended Abstract Niklaus Mannhart Institute for Scientic Computing ETH-Zentrum CH-8092 Z Design of it : an Aldor library to express parallel programs Extended Abstract Niklaus Mannhart Institute for Scientic Computing ETH-Zentrum CH-8092 Zurich, Switzerland e-mail: mannhart@inf.ethz.ch url:

More information

AN ABSTRACT OF THE THESIS OF. December 6, Title: Optimization of Machine Allocation in Ring Leader.

AN ABSTRACT OF THE THESIS OF. December 6, Title: Optimization of Machine Allocation in Ring Leader. AN ABSTRACT OF THE THESIS OF Jonathan B. King for the degree of Master of Science in Computer Science presented on December 6, 1996. Title: Optimization of Machine Allocation in Ring Leader. Abstract approved

More information

Towards Energy Efficient Change Management in a Cloud Computing Environment

Towards Energy Efficient Change Management in a Cloud Computing Environment Towards Energy Efficient Change Management in a Cloud Computing Environment Hady AbdelSalam 1,KurtMaly 1,RaviMukkamala 1, Mohammad Zubair 1, and David Kaminsky 2 1 Computer Science Department, Old Dominion

More information

Univa Grid Engine Troubleshooting Quick Reference

Univa Grid Engine Troubleshooting Quick Reference Univa Corporation Grid Engine Documentation Univa Grid Engine Troubleshooting Quick Reference Author: Univa Engineering Version: 8.4.4 October 31, 2016 Copyright 2012 2016 Univa Corporation. All rights

More information

Mobile Computing An Browser. Grace Hai Yan Lo and Thomas Kunz fhylo, October, Abstract

Mobile Computing An  Browser. Grace Hai Yan Lo and Thomas Kunz fhylo, October, Abstract A Case Study of Dynamic Application Partitioning in Mobile Computing An E-mail Browser Grace Hai Yan Lo and Thomas Kunz fhylo, tkunzg@uwaterloo.ca University ofwaterloo, ON, Canada October, 1996 Abstract

More information

Chapter 3. Design of Grid Scheduler. 3.1 Introduction

Chapter 3. Design of Grid Scheduler. 3.1 Introduction Chapter 3 Design of Grid Scheduler The scheduler component of the grid is responsible to prepare the job ques for grid resources. The research in design of grid schedulers has given various topologies

More information

Policy-Based Context-Management for Mobile Solutions

Policy-Based Context-Management for Mobile Solutions Policy-Based Context-Management for Mobile Solutions Caroline Funk 1,Björn Schiemann 2 1 Ludwig-Maximilians-Universität München Oettingenstraße 67, 80538 München caroline.funk@nm.ifi.lmu.de 2 Siemens AG,

More information

n m-dimensional data points K Clusters KP Data Points (Cluster centers) K Clusters

n m-dimensional data points K Clusters KP Data Points (Cluster centers) K Clusters Clustering using a coarse-grained parallel Genetic Algorithm: A Preliminary Study Nalini K. Ratha Anil K. Jain Moon J. Chung Department of Computer Science Department of Computer Science Department of

More information

Distributed Computing: PVM, MPI, and MOSIX. Multiple Processor Systems. Dr. Shaaban. Judd E.N. Jenne

Distributed Computing: PVM, MPI, and MOSIX. Multiple Processor Systems. Dr. Shaaban. Judd E.N. Jenne Distributed Computing: PVM, MPI, and MOSIX Multiple Processor Systems Dr. Shaaban Judd E.N. Jenne May 21, 1999 Abstract: Distributed computing is emerging as the preferred means of supporting parallel

More information

The Architecture of a System for the Indexing of Images by. Content

The Architecture of a System for the Indexing of Images by. Content The Architecture of a System for the Indexing of s by Content S. Kostomanolakis, M. Lourakis, C. Chronaki, Y. Kavaklis, and S. C. Orphanoudakis Computer Vision and Robotics Laboratory Institute of Computer

More information

Providing Interoperability for Java-Oriented Monitoring Tools with JINEXT

Providing Interoperability for Java-Oriented Monitoring Tools with JINEXT Providing Interoperability for Java-Oriented Monitoring Tools with JINEXT W lodzimierz Funika and Arkadiusz Janik Institute of Computer Science, AGH, al. Mickiewicza 30, 30-059 Kraków, Poland funika@uci.agh.edu.pl

More information

Ecient Redo Processing in. Jun-Lin Lin. Xi Li. Southern Methodist University

Ecient Redo Processing in. Jun-Lin Lin. Xi Li. Southern Methodist University Technical Report 96-CSE-13 Ecient Redo Processing in Main Memory Databases by Jun-Lin Lin Margaret H. Dunham Xi Li Department of Computer Science and Engineering Southern Methodist University Dallas, Texas

More information

Space-Efficient Page-Level Incremental Checkpointing *

Space-Efficient Page-Level Incremental Checkpointing * JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 22, 237-246 (2006) Space-Efficient Page-Level Incremental Checkpointing * JUNYOUNG HEO, SANGHO YI, YOOKUN CHO AND JIMAN HONG + School of Computer Science

More information

Cross Cluster Migration using Dynamite

Cross Cluster Migration using Dynamite Cross Cluster Migration using Dynamite Remote File Access Support A thesis submitted in partial fulfilment of the requirements for the degree of Master of Science at the University of Amsterdam by Adianto

More information

Improving the Dynamic Creation of Processes in MPI-2

Improving the Dynamic Creation of Processes in MPI-2 Improving the Dynamic Creation of Processes in MPI-2 Márcia C. Cera, Guilherme P. Pezzi, Elton N. Mathias, Nicolas Maillard, and Philippe O. A. Navaux Universidade Federal do Rio Grande do Sul, Instituto

More information

PC cluster as a platform for parallel applications

PC cluster as a platform for parallel applications PC cluster as a platform for parallel applications AMANY ABD ELSAMEA, HESHAM ELDEEB, SALWA NASSAR Computer & System Department Electronic Research Institute National Research Center, Dokki, Giza Cairo,

More information

Using semantic causality graphs to validate MAS models

Using semantic causality graphs to validate MAS models Using semantic causality graphs to validate MAS models Guillermo Vigueras 1, Jorge J. Gómez 2, Juan A. Botía 1 and Juan Pavón 2 1 Facultad de Informática Universidad de Murcia Spain 2 Facultad de Informática

More information