Batch Queuing and Resource Management for PVM Applications in a Network of Workstations

Ursula Maier, Georg Stellner, Ivan Zoraja
Lehrstuhl für Rechnertechnik und Rechnerorganisation (LRR-TUM)
Institut für Informatik, Technische Universität München
{maier,stellner,zoraja}@informatik.tu-muenchen.de

Abstract

A resource management system can effectively shorten the runtime of batch jobs in a network of workstations (NOW). This is achieved with load balancing mechanisms that distribute the load equally among the hosts. To avoid conflicts between interactive users and batch jobs, a resource management system must be able to migrate batch jobs from an interactive host to an idle host. Common resource management systems offer process migration only for sequential jobs, not for parallel jobs. Within the SEMPA project, a resource management system with batch queuing functionalities including checkpointing and migration is designed and implemented. We focus on PVM applications because PVM offers dynamic task management and an interface to resource management systems.¹

1 Introduction

Parallel scientific computing applications, e.g. in computational fluid dynamics, require a large amount of CPU time and memory. Therefore, they are often run on massively parallel systems. However, networks of workstations (NOWs) often have computing capacities available that are sufficient for the computation of resource-intensive applications. Especially smaller companies or research institutes use their NOWs for parallel applications as a low-cost alternative to massively parallel systems. A resource management system makes the use of a NOW transparent to the user and guarantees that the computational power of a NOW is utilized in the best possible way. To take advantage of a resource management system, resource-intensive applications are executed as batch jobs. In the remainder of this paper, a parallel application is a PVM application submitted as a batch job to a resource management system.
Checkpointing and migration of applications are important functionalities of a resource management system, for reasons of fault tolerance and dynamic load balancing. Long-running applications write periodic checkpoints to avoid losing the results computed so far if the application aborts unexpectedly, e.g. because of a hardware error. Process migration is a way to equalize the load in a NOW if the load situation is unbalanced, or to relocate processes during runtime.

¹ This work has been funded by the German Federal Department of Education, Science, Research and Technology, BMBF (Bundesministerium für Bildung, Wissenschaft, Forschung und Technologie) within the research project SEMPA (Software Engineering Methods for Parallel Applications in Scientific Computing).
Primarily, a NOW is used for interactive work; batch jobs only utilize idle resources, and hence the interactive users have precedence over batch jobs. If an interactive user wants to work on a host running a process of a parallel application, the process must be migrated, because it probably claims such an enormous amount of resources that the interactive user would see unacceptable response times on the host. Existing resource management systems, e.g. Condor [LTBL97] and LSF [Pla96], offer checkpointing and migration only for sequential applications. Merely initial process placement is supported for parallel applications, i.e. the processes of a parallel application are mapped to appropriate hosts. The processes are bound to their hosts and cannot be migrated to other hosts at runtime, because checkpointing mechanisms for parallel applications with communicating processes are rarely available [ZB96]. The reason why existing resource management systems hardly support parallel applications is the lack of control over the processes of a parallel application. Without control over the processes, a resource management system is unable to kill, checkpoint or migrate a running process of a parallel application, or to observe resource limitations. A major goal of the SEMPA project [LMRW96] is to design and implement a batch queuing and resource management system for sequential and parallel applications in a NOW. Available resources should always be utilized for the execution of batch jobs. A mechanism for checkpointing and migration of parallel applications must be provided to equalize the load in the NOW and to release hosts running processes of a parallel application if the hosts are needed by an interactive user. Our basic idea was to use existing batch queuing and resource management facilities and add new features supporting the efficient computation of parallel applications in a NOW.
The SEMPA Resource Manager is based on the batch queuing and resource management system CODINE [GEN96] and on the checkpointing and migration capability for parallel applications of CoCheck [Ste95]. A resource manager is implemented to control the parallel applications and to join the components and functions of CODINE and CoCheck. The remainder of the paper is organized as follows. Section 2 describes the design concept of the SEMPA Resource Manager. The structure and functionalities of the basic components are explained in section 3. Section 4 shows some implementation details of the SEMPA Resource Manager. First performance measurements are presented in section 5. The paper closes with a brief summary and an outlook on further research.

2 The Design Concept of the SEMPA Resource Manager

An architectural design of a distributed resource management system for parallel applications in a NOW is introduced in [MS97]. The concept of the distributed resource management system comprises modular components for the main functionalities batch queuing, scheduling and load management, and includes defined interfaces between these components. The scheduling component is organized hierarchically, i.e. a global resource manager places a parallel application initially and then passes it to a local resource manager that is responsible for the parallel application until it has finished. The functions of the local resource manager are the management of hosts and processes and the remapping of the parallel application. The SEMPA Resource Manager is an implementation of the design concept presented in [MS97], based on CODINE, CoCheck and the PVM resource manager interface.
The architectural design of the SEMPA Resource Manager strongly depends on the structure and the components of CODINE and CoCheck, which should be retained as far as possible. An important issue in the design of the SEMPA Resource Manager is to define a communication model for the information exchange between the different components. One of the major functions of the SEMPA Resource Manager is to control the parallel applications, which means to control each of their processes. This is the basic assumption for further functions of the SEMPA Resource Manager that operate on single processes of a parallel application. Control over a parallel application is required to:

- suspend a running parallel application
- stop a running parallel application, e.g. by the job owner
- write periodic checkpoints of a parallel application
- migrate one or more processes of a parallel application
- observe resource limitations of a parallel application
- collect accounting information about a parallel application

3 Components of the SEMPA Resource Manager

Structure and functionalities of the main components of the SEMPA Resource Manager, CODINE, CoCheck and the PVM resource manager interface, are explained in the following sections.

3.1 CODINE

CODINE is a batch queuing and resource management system for NOWs [GEN96]. Users submit their jobs to CODINE, which queues the jobs until the required resources are available. A batch job is composed of an application and resource requirements specified by the user, e.g. machine architecture or size of memory. CODINE maps sequential and parallel applications to idle or low-loaded hosts. CODINE is built up of various components to queue and schedule jobs and to measure the load on the hosts in the NOW:

qmaster: The qmaster is the central component in CODINE and has control over all other components. It corresponds to a database server containing the information about hosts and jobs.

schedd: The schedd is the component that performs the scheduling algorithm.
It gets information about hosts and jobs from the qmaster and computes the job order list.

commd: A communication daemon is running on every host that is controlled by CODINE. The commd implements the communication between the CODINE components over TCP sockets. Some connections are permanent, e.g. between qmaster and schedd; other connections are set up on demand and closed when the transmission is over.
execd: An execution daemon is running on every host that executes batch jobs. The execd starts and controls jobs and measures the load on its host. When a job has finished, the execd returns the accounting information about the job to the qmaster.

shepherd: The shepherd process is started by the execd and builds up the execution environment for a job. The execd does not start a job immediately; it starts a shepherd, and the shepherd starts the job by forking a process. When the job has finished, the shepherd collects the accounting information about the job.

Figure 1 shows the components of CODINE and their relationship. qmaster and schedd usually run on the same host to minimize the communication overhead. Jobs are running on execution hosts, and for every job there is a shepherd that controls the job.

Figure 1: The structure of CODINE

When a parallel job is submitted, additional resource requirements must be specified compared to a sequential batch job, e.g. the parallel programming environment or the minimum and maximum number of hosts. Parallel CODINE jobs can use PVM, MPI or EXPRESS as parallel programming environment. A job in CODINE is not directly started by an execd but by a shepherd process that is started by the execd. The shepherd is the parent of the started job and has control over the job, e.g. to suspend or kill the job during runtime or to collect accounting information about the job. A shepherd can only start one job, whereas an execd can start several shepherd processes. In the current version of CODINE there is only a single shepherd for each parallel job, i.e. CODINE only has control over the process forked by the shepherd but not over processes that are created by the parallel programming environment, e.g. spawned by PVM.
Thus, operations such as resource limitation and the collection of accounting information can only be performed for the master process forked by the shepherd but not for the spawned processes. One of the aims of the SEMPA Resource Manager is to overcome this deficiency.
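The execd/shepherd idea described above can be sketched in a few lines: fork a child that becomes the job, wait for it, then collect accounting data. This is a minimal conceptual sketch in Python, assuming a POSIX system; the function and field names are illustrative and are not CODINE's actual code.

```python
import os
import resource

def shepherd(job_argv):
    """Fork the job, wait for it, and collect accounting information,
    in the spirit of CODINE's shepherd (names here are illustrative)."""
    pid = os.fork()
    if pid == 0:                      # child: becomes the job itself
        os.execvp(job_argv[0], job_argv)
    _, status = os.waitpid(pid, 0)    # parent: the shepherd controls the job
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return {
        "exit_status": os.waitstatus_to_exitcode(status),
        "cpu_time": usage.ru_utime + usage.ru_stime,   # accounting data
    }

acct = shepherd(["true"])             # run /bin/true as a trivial "job"
print(acct["exit_status"])            # 0
```

Because the shepherd is the job's parent, it can later suspend or kill the job by sending signals to the child's pid, which is exactly the control relationship the paper relies on.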
3.2 CoCheck

CoCheck (Consistent Checkpoints) is an extension to message-passing libraries that allows the creation of checkpoints of parallel applications and the migration of processes. Implementations of CoCheck for PVM [Ste95] and MPI [Ste96] exist. For the remainder of the paper we will refer to the PVM version of CoCheck. Before the application can actually be started, the user must relink the application with the CoCheck libraries to incorporate the code which implements checkpointing and migration. A resource manager is provided [GBD+94] that receives and handles requests to checkpoint or restart an application or to migrate processes. An API has been defined to send such requests to the resource manager. After the resource manager of CoCheck has received a request to checkpoint or migrate, it initiates the CoCheck checkpointing protocol. All processes of the currently executing application are informed about a pending checkpoint. In turn, all the processes start to exchange so-called "ready messages". These ready messages flush all communication channels between all the processes. Messages that were in transit at checkpoint time are thus forwarded to their destination and stored there. After restart these messages are automatically retrieved from the buffers. When the processes are restarted they get a new identifier. These identifiers are then sent to the CoCheck resource manager. It in turn sets up a mapping table from old to current identifiers. Within the wrappers for the communication calls these current values are used to send and receive messages instead of the values that the application actually uses. Hence, checkpointing and migration are transparent to the application [Ste95].

3.3 The Resource Manager Interface

PVM offers a resource manager interface to define custom host and task management and new scheduling strategies [GBD+94].
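The identifier remapping that CoCheck performs after a restart can be illustrated with a small, self-contained simulation. This is conceptual Python, not CoCheck's implementation; all names and identifier values are made up.

```python
class MappingTable:
    """Map the identifiers the application knows (old) to the identifiers
    tasks received after restart (current), as CoCheck's wrappers do."""
    def __init__(self):
        self._old_to_new = {}

    def register(self, old_id, new_id):
        self._old_to_new[old_id] = new_id

    def resolve(self, old_id):
        # ids not in the table are passed through unchanged
        return self._old_to_new.get(old_id, old_id)

table = MappingTable()
# before the checkpoint the tasks were 1, 2, 3; after restart they are 11, 12, 13
for old, new in [(1, 11), (2, 12), (3, 13)]:
    table.register(old, new)

network = {}
def send(dest_old_id, msg):
    # a communication wrapper translates the id on every send
    network.setdefault(table.resolve(dest_old_id), []).append(msg)

send(2, "hello")          # the application still addresses task 2
print(network)            # {12: ['hello']}
```

Because the translation happens inside the communication wrappers, the application keeps using its original task identifiers and never notices that its processes were restarted elsewhere, which is the transparency property claimed above.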
Usually, PVM calls are handled by the pvmd daemons, but if there is a resource manager registered in the virtual machine, calls concerning hosts and tasks, e.g. pvm_addhosts or pvm_spawn, are redirected to the resource manager. The resource manager provides handler functions to execute the redirected calls. The handler functions in the resource manager are not part of PVM; they must be explicitly written by the user corresponding to a given message framework. CoCheck uses the resource manager interface for the implementation of additional handler functions for checkpointing and migration. For the SEMPA Resource Manager, a complete resource manager has been implemented with handler functions for all affected calls; it joins the components of CODINE, CoCheck and PVM and realizes a local resource manager for every PVM application.

4 Implementation Aspects of the SEMPA Resource Manager

In the previous sections the architectural design and the components of the SEMPA Resource Manager have been introduced. This section explains some functionalities of the SEMPA Resource Manager and shows some implementation details.
The main component of the SEMPA Resource Manager is the resource manager with its handler functions for host and task management, which initiate certain operations of CODINE, CoCheck or PVM. The data exchange between the CODINE and PVM components is realized by PVM calls and a signal interface.

4.1 Starting a Job by the SEMPA Resource Manager

Before a job can be started, hosts for the execution of the job must be selected and the parallel environment must be configured. In the SEMPA Resource Manager, the CODINE scheduler selects the hosts for the application, and the master host where the application is started, corresponding to the load on the hosts. Then the execd on the master host starts a shepherd, called the master shepherd. The master shepherd starts the master PVM daemon (pvmd) and the resource manager. The resource manager sets up the virtual machine with the hosts selected by the schedd, i.e. it starts a slave pvmd and a tasker on each host belonging to the virtual machine. Due to implementation constraints, the resource manager must be started before hosts are added to the virtual machine. Now the virtual machine is built up completely, with the master pvmd and the resource manager running on the master host and a slave pvmd and a tasker on every other host in the virtual machine, as shown in Figure 2. As the next step, the application is started by the master shepherd, i.e. the first PVM task is started, which usually spawns further tasks.

4.2 Spawning a Task

As mentioned above, CODINE is intended to have control over all tasks spawned by PVM. The tasker concept is used to implement the creation of a new task with its own strategy. The resource manager selects a host within the virtual machine where the new task is started. If no appropriate host is available in the virtual machine, the resource manager requests a new host, possibly with specific hardware requirements, from the CODINE qmaster.
The pvm_spawn call is sent to the resource manager, which selects a host and sends a message to the tasker on that host. The resource manager uses a round-robin strategy to map tasks to hosts. A strategy considering load information about the hosts will be implemented in the next phase of the project [SKS92]. It is not reasonable to specify a particular host in the pvm_spawn call, because the resource manager selects a host for the task. If the pvm_spawn call fails, a corresponding error is generated and the responsibility to handle the error message is turned over to the calling task. The tasker implements a procedure that prevents the tasker from forking the new task itself, but instead causes the execd to start a shepherd that finally creates the task (see Figure 3). The task is spawned on a host belonging to the virtual machine, i.e. a slave pvmd and a tasker are already running on that host. The spawned task is now under the control of CODINE and PVM.
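The round-robin placement used for spawned tasks can be sketched as follows. This is a conceptual Python sketch under the assumption of a fixed host list; the class and host names are invented for illustration.

```python
from itertools import cycle

class RoundRobinPlacer:
    """Cycle through the hosts of the virtual machine in order; a
    load-aware strategy could replace this class later."""
    def __init__(self, hosts):
        self._next_host = cycle(hosts)

    def place(self, ntasks):
        # pick one host per task to spawn, in round-robin order
        return [next(self._next_host) for _ in range(ntasks)]

placer = RoundRobinPlacer(["host1", "host2", "host3"])
print(placer.place(4))    # ['host1', 'host2', 'host3', 'host1']
```

As the conclusion notes, such a strategy ignores host capacities and current load; a load-aware placer would simply replace the `place` method while keeping the same interface.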
Figure 2: Starting a job by the SEMPA Resource Manager

4.3 Exiting a Task

When a task exits, CODINE and the resource manager must be notified. An exiting task sends a signal SIGCHLD to its parent process, which is a shepherd process. After receiving the signal SIGCHLD, the shepherd writes the accounting information about the task to a temporary file and sends a signal SIGCHLD to the tasker to inform it that the task has exited. The shepherd exits and sends a signal SIGCHLD to its parent process, the execd. When the resource manager recognizes that all tasks have terminated, it stops PVM and exits.

5 Performance Measurements

Functionalities and performance of the SEMPA Resource Manager have been evaluated with ParTfC as a real-world test case. ParTfC is a computational fluid dynamics package to compute laminar and turbulent viscous flows in three-dimensional geometries. It has been parallelized within the SEMPA project according to the SPMD (single program, multiple data) paradigm [LMR+96]. The underlying grid is partitioned into smaller parts, and every partition is computed by a separate process.
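Returning to the task-exit handling of section 4.3: the first link of that SIGCHLD chain, where the shepherd learns of the task's exit, reaps it, and records accounting information in a temporary file, can be simulated on a POSIX system. This is an illustrative sketch, not CODINE code; the function name and file format are invented.

```python
import os
import signal
import tempfile

def run_task_under_shepherd():
    seen = {"sigchld": False}

    def on_sigchld(signum, frame):
        seen["sigchld"] = True          # the shepherd is notified of the exit

    signal.signal(signal.SIGCHLD, on_sigchld)
    pid = os.fork()
    if pid == 0:
        os._exit(7)                     # the task exits with status 7
    _, status = os.waitpid(pid, 0)      # the shepherd reaps the task
    exit_code = os.waitstatus_to_exitcode(status)
    # like the shepherd, record accounting information in a temporary file
    with tempfile.NamedTemporaryFile("w", suffix=".acct", delete=False) as f:
        f.write(f"exit_status={exit_code}\n")
        acct_file = f.name
    os.unlink(acct_file)                # clean up the example's file
    return exit_code

print(run_task_under_shepherd())        # 7
```

In the real system the chain continues upward: the shepherd would then signal the tasker and, on its own exit, deliver SIGCHLD to the execd.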
Figure 3: Spawning a task by the SEMPA Resource Manager

The presented time measurements show the influence of a resource management system on the runtime of ParTfC. The following three measurement models have been examined:

(M1) ParTfC in interactive mode
(M2) ParTfC started as a CODINE batch job without a resource manager
(M3) ParTfC submitted as a batch job to the SEMPA Resource Manager

The time measurements were performed with two different grids:

(T1) A grid with 3150 grid nodes divided into 4 partitions.
(T2) A grid with grid nodes divided into 4 partitions.

The four processes of ParTfC were computed on two SGI Indigo 4400, so that two processes were running on one host. The two grids are relatively small, but they are sufficient to show that the overhead produced by CODINE or the SEMPA Resource Manager is negligible. Table 1 shows that the runtime of ParTfC hardly increases if ParTfC is started as a batch job in CODINE or the SEMPA Resource Manager, compared to the runtime of ParTfC in interactive mode. The times for the start and stop scripts in CODINE and the SEMPA Resource Manager, which are performed before starting and after finishing ParTfC, are shown in Table 2. Compared to the runtime of ParTfC, these times can be neglected. The start script in CODINE starts PVM and sets up the virtual machine. The execution of the
start script of the SEMPA Resource Manager takes more time compared to the start script of CODINE, because the resource manager and the tasker must be started in addition. The stop script of CODINE performs a pvm_halt to stop the virtual machine. The stop script of the SEMPA Resource Manager sends a signal to the resource manager to stop the virtual machine when all processes of the parallel application have finished.

        (M1)    (M2)    (M3)
(T1)    190 s   194 s   197 s
(T2)    389 s   395 s   396 s

Table 1: Runtime of ParTfC for the three measurement models

              (M2)     (M3)
start script  100 ms   4.2 s
stop script   100 ms   60 ms

Table 2: Time for start and stop scripts in CODINE and the SEMPA Resource Manager

6 Conclusion

The SEMPA Resource Manager provides batch queuing and resource management facilities for PVM applications in a NOW. Parallel applications are started as batch jobs, and each process of a parallel application is under the control of the SEMPA Resource Manager, so that e.g. resource limitation and migration of each process can be performed. The presented approach is restricted to PVM applications, because PVM offers dynamic task management and features to define its own resource management services. The flexibility of the PVM concept avoids the need for changes in the PVM code. Modifications in CODINE and CoCheck are necessary but reduced to a minimum. The implementation of the SEMPA Resource Manager has almost been completed, except for the integration of the CoCheck handler functions into the resource manager. The next step after the integration of the migration facilities will be to improve the scheduling strategy of the resource manager to decide about the mapping and remapping of processes more efficiently. Currently the round-robin method is used, which considers neither the different CPU and memory capacities of the hosts nor the actual load situation in the virtual machine and the NOW.
The interface between the resource manager and the CODINE qmaster must be extended to make scheduling information of CODINE available to the resource manager.

References

[GBD+94] Al Geist, Adam Beguelin, Jack Dongarra, Weicheng Jiang, Robert Manchek, and Vaidy Sunderam. PVM: Parallel Virtual Machine. A Users' Guide and Tutorial for Networked Parallel Computing. Scientific and Engineering Computation. The MIT Press, Cambridge, MA, 1994.
[GEN96] GENIAS Software GmbH, Erzgebirgstr. 2B, Neutraubling, Germany. CODINE Reference Manual, Version 4.0, 1996.

[LMR+96] Peter Luksch, Ursula Maier, Sabine Rathmayer, Friedemann Unger, and Matthias Weidmann. Parallelization of a State-of-the-Art Industrial CFD Package for Execution on Networks of Workstations and Massively Parallel Processors. In Third European PVM Users' Group Meeting, EuroPVM 96, München, October 1996.

[LMRW96] Peter Luksch, Ursula Maier, Sabine Rathmayer, and Matthias Weidmann. SEMPA: Software Engineering Methods for Parallel Scientific Applications. In International Software Engineering Week, First International Workshop on Software Engineering for Parallel and Distributed Systems, Berlin, March 1996.

[LTBL97] Michael Litzkow, Todd Tannenbaum, Jim Basney, and Miron Livny. Checkpoint and Migration of UNIX Processes in the Condor Distributed Environment. Technical Report 1346, University of Wisconsin-Madison, April 1997.

[MS97] Ursula Maier and Georg Stellner. Distributed Resource Management for Parallel Applications in Networks of Workstations. In HPCN Europe 1997, volume 1225 of Lecture Notes in Computer Science, pages 462-471. Springer-Verlag, 1997.

[Pla96] Platform Computing Corporation, North York, Ontario, Canada. LSF Documentation, December 1996.

[SKS92] Niranjan G. Shivaratri, Phillip Krueger, and Mukesh Singhal. Load Distributing for Locally Distributed Systems. Computer, 25(12):33-44, December 1992.

[Ste95] Georg Stellner. Checkpointing and Process Migration for PVM. In Arndt Bode, Thomas Ludwig, Vaidy Sunderam, and Roland Wismüller, editors, Workshop on PVM, MPI Tools and Applications, number 342/18/95 A in SFB-Bericht, pages 44-48. Technische Universität München, Institut für Informatik, November 1995.

[Ste96] Georg Stellner. CoCheck: Checkpointing and Process Migration for MPI. In Proceedings of the International Parallel Processing Symposium, pages 526-531, Honolulu, HI, April 1996. IEEE Computer Society Press, Los Alamitos, CA.

[ZB96] Avi Ziv and Jehoshua Bruck. Checkpointing in Parallel and Distributed Systems. In Albert Zomaya, editor, Parallel and Distributed Computing Handbook, Series on Computer Engineering, chapter 10, pages 274-302. McGraw-Hill, 1996.
More informationUTOPIA: A Load Sharing Facility for Large, Heterogeneous Distributed Computer Systems. Technical Report CSRI-257. April 1992
UTOPIA: A Load Sharing Facility for Large, Heterogeneous Distributed Computer Systems Songnian Zhou, Jingwen Wang, Xiaohu Zheng, and Pierre Delisle Technical Report CSRI-257 April 1992. (To appear in Software
More informationJeremy Casas, Dan Clark, Ravi Konuru, Steve W. Otto, Robert Prouty, and Jonathan Walpole.
MPVM: A Migration Transparent Version of PVM Jeremy Casas, Dan Clark, Ravi Konuru, Steve W. Otto, Robert Prouty, and Jonathan Walpole fcasas,dclark,konuru,otto,prouty,walpoleg@cse.ogi.edu Department of
More informationprocesses based on Message Passing Interface
Checkpointing and Migration of parallel processes based on Message Passing Interface Zhang Youhui, Wang Dongsheng, Zheng Weimin Department of Computer Science, Tsinghua University, China. Abstract This
More informationMaking Workstations a Friendly Environment for Batch Jobs. Miron Livny Mike Litzkow
Making Workstations a Friendly Environment for Batch Jobs Miron Livny Mike Litzkow Computer Sciences Department University of Wisconsin - Madison {miron,mike}@cs.wisc.edu 1. Introduction As time-sharing
More informationA Freely Congurable Audio-Mixing Engine. M. Rosenthal, M. Klebl, A. Gunzinger, G. Troster
A Freely Congurable Audio-Mixing Engine with Automatic Loadbalancing M. Rosenthal, M. Klebl, A. Gunzinger, G. Troster Electronics Laboratory, Swiss Federal Institute of Technology CH-8092 Zurich, Switzerland
More informationTIME WARP PARALLEL LOGIC SIMULATION ON A DISTRIBUTED MEMORY MULTIPROCESSOR. Peter Luksch, Holger Weitlich
TIME WARP PARALLEL LOGIC SIMULATION ON A DISTRIBUTED MEMORY MULTIPROCESSOR ABSTRACT Peter Luksch, Holger Weitlich Department of Computer Science, Munich University of Technology P.O. Box, D-W-8-Munchen,
More informationEgemen Tanin, Tahsin M. Kurc, Cevdet Aykanat, Bulent Ozguc. Abstract. Direct Volume Rendering (DVR) is a powerful technique for
Comparison of Two Image-Space Subdivision Algorithms for Direct Volume Rendering on Distributed-Memory Multicomputers Egemen Tanin, Tahsin M. Kurc, Cevdet Aykanat, Bulent Ozguc Dept. of Computer Eng. and
More informationKhoral Research, Inc. Khoros is a powerful, integrated system which allows users to perform a variety
Data Parallel Programming with the Khoros Data Services Library Steve Kubica, Thomas Robey, Chris Moorman Khoral Research, Inc. 6200 Indian School Rd. NE Suite 200 Albuquerque, NM 87110 USA E-mail: info@khoral.com
More informationMonitoring the Usage of the ZEUS Analysis Grid
Monitoring the Usage of the ZEUS Analysis Grid Stefanos Leontsinis September 9, 2006 Summer Student Programme 2006 DESY Hamburg Supervisor Dr. Hartmut Stadie National Technical
More informationCHAPTER 4 AN INTEGRATED APPROACH OF PERFORMANCE PREDICTION ON NETWORKS OF WORKSTATIONS. Xiaodong Zhang and Yongsheng Song
CHAPTER 4 AN INTEGRATED APPROACH OF PERFORMANCE PREDICTION ON NETWORKS OF WORKSTATIONS Xiaodong Zhang and Yongsheng Song 1. INTRODUCTION Networks of Workstations (NOW) have become important distributed
More informationMechanisms for Just-in-Time Allocation of Resources to Adaptive Parallel Programs
Mechanisms for Just-in-Time Allocation of Resources to Adaptive Parallel Programs Arash Baratloo Ayal Itzkovitz Zvi M. Kedem Yuanyuan Zhao fbaratloo,ayali,kedem,yuanyuang@cs.nyu.edu Department of Computer
More informationMechanisms for Just-in-Time Allocation of Resources to Adaptive Parallel Programs
Mechanisms for Just-in-Time Allocation of Resources to Adaptive Parallel Programs Arash Baratloo Ayal Itzkovitz Zvi M. Kedem Yuanyuan Zhao baratloo,ayali,kedem,yuanyuan @cs.nyu.edu Department of Computer
More informationCovering the Aztec Diamond with One-sided Tetrasticks Extended Version
Covering the Aztec Diamond with One-sided Tetrasticks Extended Version Alfred Wassermann, University of Bayreuth, D-95440 Bayreuth, Germany Abstract There are 107 non-isomorphic coverings of the Aztec
More informationNetwork Computing Environment. Adam Beguelin, Jack Dongarra. Al Geist, Robert Manchek. Keith Moore. August, Rice University
HeNCE: A Heterogeneous Network Computing Environment Adam Beguelin, Jack Dongarra Al Geist, Robert Manchek Keith Moore CRPC-TR93425 August, 1993 Center for Research on Parallel Computation Rice University
More informationStorage System. Distributor. Network. Drive. Drive. Storage System. Controller. Controller. Disk. Disk
HRaid: a Flexible Storage-system Simulator Toni Cortes Jesus Labarta Universitat Politecnica de Catalunya - Barcelona ftoni, jesusg@ac.upc.es - http://www.ac.upc.es/hpc Abstract Clusters of workstations
More informationGrid Compute Resources and Grid Job Management
Grid Compute Resources and Job Management March 24-25, 2007 Grid Job Management 1 Job and compute resource management! This module is about running jobs on remote compute resources March 24-25, 2007 Grid
More informationNormal mode acoustic propagation models. E.A. Vavalis. the computer code to a network of heterogeneous workstations using the Parallel
Normal mode acoustic propagation models on heterogeneous networks of workstations E.A. Vavalis University of Crete, Mathematics Department, 714 09 Heraklion, GREECE and IACM, FORTH, 711 10 Heraklion, GREECE.
More informationJob Management System Extension To Support SLAAC-1V Reconfigurable Hardware
Job Management System Extension To Support SLAAC-1V Reconfigurable Hardware Mohamed Taher 1, Kris Gaj 2, Tarek El-Ghazawi 1, and Nikitas Alexandridis 1 1 The George Washington University 2 George Mason
More informationTutorial 4: Condor. John Watt, National e-science Centre
Tutorial 4: Condor John Watt, National e-science Centre Tutorials Timetable Week Day/Time Topic Staff 3 Fri 11am Introduction to Globus J.W. 4 Fri 11am Globus Development J.W. 5 Fri 11am Globus Development
More informationThe driving motivation behind the design of the Janus framework is to provide application-oriented, easy-to-use and ecient abstractions for the above
Janus a C++ Template Library for Parallel Dynamic Mesh Applications Jens Gerlach, Mitsuhisa Sato, and Yutaka Ishikawa fjens,msato,ishikawag@trc.rwcp.or.jp Tsukuba Research Center of the Real World Computing
More informationNOW Based Parallel Reconstruction of Functional Images
NOW Based Parallel Reconstruction of Functional Images F. Munz 1, T. Stephan 2, U. Maier 2, T. Ludwig 2,A.Bode 2, S. Ziegler 1,S.Nekolla 1, P. Bartenstein 1 and M. Schwaiger 1 1 Nuklearmedizinische Klinik
More informationUNICORE Globus: Interoperability of Grid Infrastructures
UNICORE : Interoperability of Grid Infrastructures Michael Rambadt Philipp Wieder Central Institute for Applied Mathematics (ZAM) Research Centre Juelich D 52425 Juelich, Germany Phone: +49 2461 612057
More informationImproving the Performance of Coordinated Checkpointers. on Networks of Workstations using RAID Techniques. University of Tennessee
Improving the Performance of Coordinated Checkpointers on Networks of Workstations using RAID Techniques James S. Plank Department of Computer Science University of Tennessee Knoxville, TN 37996 plank@cs.utk.edu
More informationSupporting Heterogeneous Network Computing: PVM. Jack J. Dongarra. Oak Ridge National Laboratory and University of Tennessee. G. A.
Supporting Heterogeneous Network Computing: PVM Jack J. Dongarra Oak Ridge National Laboratory and University of Tennessee G. A. Geist Oak Ridge National Laboratory Robert Manchek University of Tennessee
More information/98 $10.00 (c) 1998 IEEE
CUMULVS: Extending a Generic Steering and Visualization Middleware for lication Fault-Tolerance Philip M. Papadopoulos, phil@msr.epm.ornl.gov James Arthur Kohl, kohl@msr.epm.ornl.gov B. David Semeraro,
More informationEUROPEAN ORGANIZATION FOR NUCLEAR RESEARCH PARALLEL IN-MEMORY DATABASE. Dept. Mathematics and Computing Science div. ECP
EUROPEAN ORGANIZATION FOR NUCLEAR RESEARCH CERN/ECP 95-29 11 December 1995 ON-LINE EVENT RECONSTRUCTION USING A PARALLEL IN-MEMORY DATABASE E. Argante y;z,p. v.d. Stok y, I. Willers z y Eindhoven University
More informationThe MPBench Report. Philip J. Mucci. Kevin London. March 1998
The MPBench Report Philip J. Mucci Kevin London mucci@cs.utk.edu london@cs.utk.edu March 1998 1 Introduction MPBench is a benchmark to evaluate the performance of MPI and PVM on MPP's and clusters of workstations.
More informationUNIVERSITY OF MINNESOTA. This is to certify that I have examined this copy of master s thesis by. Vishwas Raman
UNIVERSITY OF MINNESOTA This is to certify that I have examined this copy of master s thesis by Vishwas Raman and have have found that it is complete and satisfactory in all respects, and that any and
More informationdirector executor user program user program signal, breakpoint function call communication channel client library directing server
(appeared in Computing Systems, Vol. 8, 2, pp.107-134, MIT Press, Spring 1995.) The Dynascope Directing Server: Design and Implementation 1 Rok Sosic School of Computing and Information Technology Grith
More informationJava Virtual Machine
Evaluation of Java Thread Performance on Two Dierent Multithreaded Kernels Yan Gu B. S. Lee Wentong Cai School of Applied Science Nanyang Technological University Singapore 639798 guyan@cais.ntu.edu.sg,
More informationAn Introduction to Parallel Processing with the Fork Transformation in SAS Data Integration Studio
Paper 2733-2018 An Introduction to Parallel Processing with the Fork Transformation in SAS Data Integration Studio Jeff Dyson, The Financial Risk Group ABSTRACT The SAS Data Integration Studio job is historically
More informationEfficiently building on-line tools for distributed heterogeneous environments
Scientific Programming 10 (2002) 67 74 67 IOS Press Efficiently building on-line tools for distributed heterogeneous environments Günther Rackl, Thomas Ludwig, Markus Lindermeier and Alexandros Stamatakis
More information(HT)Condor - Past and Future
(HT)Condor - Past and Future Miron Livny John P. Morgridge Professor of Computer Science Wisconsin Institutes for Discovery University of Wisconsin-Madison חי has the value of 18 חי means alive Europe
More informationTowards ParadisEO-MO-GPU: a Framework for GPU-based Local Search Metaheuristics
Towards ParadisEO-MO-GPU: a Framework for GPU-based Local Search Metaheuristics N. Melab, T-V. Luong, K. Boufaras and E-G. Talbi Dolphin Project INRIA Lille Nord Europe - LIFL/CNRS UMR 8022 - Université
More informationProcesses, PCB, Context Switch
THE HONG KONG POLYTECHNIC UNIVERSITY Department of Electronic and Information Engineering EIE 272 CAOS Operating Systems Part II Processes, PCB, Context Switch Instructor Dr. M. Sakalli enmsaka@eie.polyu.edu.hk
More informationA MATLAB Toolbox for Distributed and Parallel Processing
A MATLAB Toolbox for Distributed and Parallel Processing S. Pawletta a, W. Drewelow a, P. Duenow a, T. Pawletta b and M. Suesse a a Institute of Automatic Control, Department of Electrical Engineering,
More informationDistributed Batch Controller. Department of Computer Science, University of Maryland, College Park, MD USA. Waterloo, ON N2L 3G1 Canada
Processing TOVS Polar Pathnder Data Using the Distributed Batch Controller James Du a, Kenneth Salem b, Axel Schweiger c, and Miron Livny d a Department of Computer Science, University of Maryland, College
More informationCL/TB. An Allegro Common Lisp. J. Kempe, T. Lenz, B. Freitag, H. Schutz, G. Specht
CL/TB An Allegro Common Lisp Programming Interface for TransBase J. Kempe, T. Lenz, B. Freitag, H. Schutz, G. Specht TECHNISCHE UNIVERSIT AT M UNCHEN Institut fur Informatik Orleansstrasse 34 D-8000 Munchen
More informationGrid Compute Resources and Job Management
Grid Compute Resources and Job Management How do we access the grid? Command line with tools that you'll use Specialised applications Ex: Write a program to process images that sends data to run on the
More informationWhat is checkpoint. Checkpoint libraries. Where to checkpoint? Why we need it? When to checkpoint? Who need checkpoint?
What is Checkpoint libraries Bosilca George bosilca@cs.utk.edu Saving the state of a program at a certain point so that it can be restarted from that point at a later time or on a different machine. interruption
More informationTHE IMPLEMENTATION OF A DISTRIBUTED FILE SYSTEM SUPPORTING THE PARALLEL WORLD MODEL. Jun Sun, Yasushi Shinjo and Kozo Itano
THE IMPLEMENTATION OF A DISTRIBUTED FILE SYSTEM SUPPORTING THE PARALLEL WORLD MODEL Jun Sun, Yasushi Shinjo and Kozo Itano Institute of Information Sciences and Electronics University of Tsukuba Tsukuba,
More informationApplications PVM (Parallel Virtual Machine) Socket Interface. Unix Domain LLC/SNAP HIPPI-LE/FP/PH. HIPPI Networks
Enhanced PVM Communications over a HIPPI Local Area Network Jenwei Hsieh, David H.C. Du, Norman J. Troullier 1 Distributed Multimedia Research Center 2 and Computer Science Department, University of Minnesota
More informationApplication Programmer. Vienna Fortran Out-of-Core Program
Mass Storage Support for a Parallelizing Compilation System b a Peter Brezany a, Thomas A. Mueck b, Erich Schikuta c Institute for Software Technology and Parallel Systems, University of Vienna, Liechtensteinstrasse
More informationMOTION ESTIMATION IN MPEG-2 VIDEO ENCODING USING A PARALLEL BLOCK MATCHING ALGORITHM. Daniel Grosu, Honorius G^almeanu
MOTION ESTIMATION IN MPEG-2 VIDEO ENCODING USING A PARALLEL BLOCK MATCHING ALGORITHM Daniel Grosu, Honorius G^almeanu Multimedia Group - Department of Electronics and Computers Transilvania University
More informationEvaluating Personal High Performance Computing with PVM on Windows and LINUX Environments
Evaluating Personal High Performance Computing with PVM on Windows and LINUX Environments Paulo S. Souza * Luciano J. Senger ** Marcos J. Santana ** Regina C. Santana ** e-mails: {pssouza, ljsenger, mjs,
More informationComparing Centralized and Decentralized Distributed Execution Systems
Comparing Centralized and Decentralized Distributed Execution Systems Mustafa Paksoy mpaksoy@swarthmore.edu Javier Prado jprado@swarthmore.edu May 2, 2006 Abstract We implement two distributed execution
More informationSteering. Stream. User Interface. Stream. Manager. Interaction Managers. Snapshot. Stream
Agent Roles in Snapshot Assembly Delbert Hart Dept. of Computer Science Washington University in St. Louis St. Louis, MO 63130 hart@cs.wustl.edu Eileen Kraemer Dept. of Computer Science University of Georgia
More information2 Fredrik Manne, Svein Olav Andersen where an error occurs. In order to automate the process most debuggers can set conditional breakpoints (watch-poi
This is page 1 Printer: Opaque this Automating the Debugging of Large Numerical Codes Fredrik Manne Svein Olav Andersen 1 ABSTRACT The development of large numerical codes is usually carried out in an
More information160 M. Nadjarbashi, S.M. Fakhraie and A. Kaviani Figure 2. LUTB structure. each block-level track can be arbitrarily connected to each of 16 4-LUT inp
Scientia Iranica, Vol. 11, No. 3, pp 159{164 c Sharif University of Technology, July 2004 On Routing Architecture for Hybrid FPGA M. Nadjarbashi, S.M. Fakhraie 1 and A. Kaviani 2 In this paper, the routing
More informationExperiences in Managing Resources on a Large Origin3000 cluster
Experiences in Managing Resources on a Large Origin3000 cluster UG Summit 2002, Manchester, May 20 2002, Mark van de Sanden & Huub Stoffers http://www.sara.nl A oarse Outline of this Presentation Overview
More informationProcess a program in execution; process execution must progress in sequential fashion. Operating Systems
Process Concept An operating system executes a variety of programs: Batch system jobs Time-shared systems user programs or tasks 1 Textbook uses the terms job and process almost interchangeably Process
More informationDesign of it : an Aldor library to express parallel programs Extended Abstract Niklaus Mannhart Institute for Scientic Computing ETH-Zentrum CH-8092 Z
Design of it : an Aldor library to express parallel programs Extended Abstract Niklaus Mannhart Institute for Scientic Computing ETH-Zentrum CH-8092 Zurich, Switzerland e-mail: mannhart@inf.ethz.ch url:
More informationAN ABSTRACT OF THE THESIS OF. December 6, Title: Optimization of Machine Allocation in Ring Leader.
AN ABSTRACT OF THE THESIS OF Jonathan B. King for the degree of Master of Science in Computer Science presented on December 6, 1996. Title: Optimization of Machine Allocation in Ring Leader. Abstract approved
More informationTowards Energy Efficient Change Management in a Cloud Computing Environment
Towards Energy Efficient Change Management in a Cloud Computing Environment Hady AbdelSalam 1,KurtMaly 1,RaviMukkamala 1, Mohammad Zubair 1, and David Kaminsky 2 1 Computer Science Department, Old Dominion
More informationUniva Grid Engine Troubleshooting Quick Reference
Univa Corporation Grid Engine Documentation Univa Grid Engine Troubleshooting Quick Reference Author: Univa Engineering Version: 8.4.4 October 31, 2016 Copyright 2012 2016 Univa Corporation. All rights
More informationMobile Computing An Browser. Grace Hai Yan Lo and Thomas Kunz fhylo, October, Abstract
A Case Study of Dynamic Application Partitioning in Mobile Computing An E-mail Browser Grace Hai Yan Lo and Thomas Kunz fhylo, tkunzg@uwaterloo.ca University ofwaterloo, ON, Canada October, 1996 Abstract
More informationChapter 3. Design of Grid Scheduler. 3.1 Introduction
Chapter 3 Design of Grid Scheduler The scheduler component of the grid is responsible to prepare the job ques for grid resources. The research in design of grid schedulers has given various topologies
More informationPolicy-Based Context-Management for Mobile Solutions
Policy-Based Context-Management for Mobile Solutions Caroline Funk 1,Björn Schiemann 2 1 Ludwig-Maximilians-Universität München Oettingenstraße 67, 80538 München caroline.funk@nm.ifi.lmu.de 2 Siemens AG,
More informationn m-dimensional data points K Clusters KP Data Points (Cluster centers) K Clusters
Clustering using a coarse-grained parallel Genetic Algorithm: A Preliminary Study Nalini K. Ratha Anil K. Jain Moon J. Chung Department of Computer Science Department of Computer Science Department of
More informationDistributed Computing: PVM, MPI, and MOSIX. Multiple Processor Systems. Dr. Shaaban. Judd E.N. Jenne
Distributed Computing: PVM, MPI, and MOSIX Multiple Processor Systems Dr. Shaaban Judd E.N. Jenne May 21, 1999 Abstract: Distributed computing is emerging as the preferred means of supporting parallel
More informationThe Architecture of a System for the Indexing of Images by. Content
The Architecture of a System for the Indexing of s by Content S. Kostomanolakis, M. Lourakis, C. Chronaki, Y. Kavaklis, and S. C. Orphanoudakis Computer Vision and Robotics Laboratory Institute of Computer
More informationProviding Interoperability for Java-Oriented Monitoring Tools with JINEXT
Providing Interoperability for Java-Oriented Monitoring Tools with JINEXT W lodzimierz Funika and Arkadiusz Janik Institute of Computer Science, AGH, al. Mickiewicza 30, 30-059 Kraków, Poland funika@uci.agh.edu.pl
More informationEcient Redo Processing in. Jun-Lin Lin. Xi Li. Southern Methodist University
Technical Report 96-CSE-13 Ecient Redo Processing in Main Memory Databases by Jun-Lin Lin Margaret H. Dunham Xi Li Department of Computer Science and Engineering Southern Methodist University Dallas, Texas
More informationSpace-Efficient Page-Level Incremental Checkpointing *
JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 22, 237-246 (2006) Space-Efficient Page-Level Incremental Checkpointing * JUNYOUNG HEO, SANGHO YI, YOOKUN CHO AND JIMAN HONG + School of Computer Science
More informationCross Cluster Migration using Dynamite
Cross Cluster Migration using Dynamite Remote File Access Support A thesis submitted in partial fulfilment of the requirements for the degree of Master of Science at the University of Amsterdam by Adianto
More informationImproving the Dynamic Creation of Processes in MPI-2
Improving the Dynamic Creation of Processes in MPI-2 Márcia C. Cera, Guilherme P. Pezzi, Elton N. Mathias, Nicolas Maillard, and Philippe O. A. Navaux Universidade Federal do Rio Grande do Sul, Instituto
More informationPC cluster as a platform for parallel applications
PC cluster as a platform for parallel applications AMANY ABD ELSAMEA, HESHAM ELDEEB, SALWA NASSAR Computer & System Department Electronic Research Institute National Research Center, Dokki, Giza Cairo,
More informationUsing semantic causality graphs to validate MAS models
Using semantic causality graphs to validate MAS models Guillermo Vigueras 1, Jorge J. Gómez 2, Juan A. Botía 1 and Juan Pavón 2 1 Facultad de Informática Universidad de Murcia Spain 2 Facultad de Informática
More information