TRAPPER: A GRAPHICAL PROGRAMMING ENVIRONMENT FOR PARALLEL SYSTEMS

O. KRÄMER-FUHRMANN
German National Research Center for Computer Science (GMD), Schloss Birlinghoven, Sankt Augustin, Germany

L. SCHÄFERS, C. SCHEIDLER
Daimler-Benz Research, Alt Moabit 91b, Berlin, Germany

ABSTRACT

TRAPPER is a graphical programming environment which supports the development of parallel applications. The programming environment is based on the programming model of communicating sequential processes. TRAPPER contains tools for the design, mapping and visualization of parallel systems. The Designtool supports a hybrid program development, where the parallel process structure is described using a graphical representation and the sequential behavior is described by sequential program code. The configuration of the target hardware and the mapping of the application onto the hardware are supported by the Configtool. During run-time, the monitoring system records events which can be animated by the Vistool and the Perftool, which visualize the behavior of the hardware and software components. This paper describes the support of two machine-independent message passing interfaces: PVM (Parallel Virtual Machine) and PARMACS (Parallel Macros).

1. Introduction

Parallel computing is accepted as the only technology that offers a long-term chance of improving the performance of computer systems. Parallel processing has been widely accepted in the field of numerical computing, but it will also have a great impact on technical systems once the software problem is solved. In this paper we describe TRAPPER, a graphical programming environment that assists the programmer in developing application software for systems which use parallel processing as a key technology for high computing power.

The TRAPPER philosophy is to have the programmer explicitly specify the parallel structure of the application and to aid the programmer as much as possible in the various phases of the development cycle. TRAPPER is based on the programming model of communicating sequential processes, which is suited for a large class of applications. TRAPPER has been under development at the GMD since 1991 [14]. The programming environment supports different target systems like PVM [5] and PARMACS [7]. In cooperation with Daimler-Benz, embedded industrial real-time systems based on Transputer technology are supported [20]. A first release has been delivered to different research groups of the GMD, the Mercedes-Benz vehicle research center and other members of the Daimler-Benz corporation.

2. Related Work

Ever since parallel computers have existed, researchers have investigated the difficulties associated with programming them. Tools for parallel MIMD computers (multiple instruction, multiple data) can be classified into systems which support the data-parallel approach or explicit message passing. In the data-parallel concept, the programmer deals with arrays of data that are distributed over the parallel architecture by compiler technology [8]. In the message passing paradigm, the parallel program consists of an ensemble of communicating processes [12]. The benefit of the data-parallel programming model is the absence of multiple control flows, which leads to an easy but inflexible programming model. The process-parallel programming model is more flexible, but the existence of multiple control flows raises new problems in the various program development tasks. Several research activities deal with the design, mapping, debugging, animation and optimization of parallel programs. A good overview of these tools is given in [13].

During the design phase, the programmer has to describe the parallelism of the application. Text-oriented programming languages reflect the parallel structure of a program very poorly. Therefore various tools use graphical representations instead of textual representations to describe the parallelism. HeNCE [3], CODE [18] and Paralex [2] follow the data-flow approach, where nodes represent subroutines and directed arcs represent data dependencies between subroutines. Millipede [1], MP [16] and TRAPPER use process graphs, where nodes represent processes and arcs represent communication channels.

The mapping of the application onto the target hardware is a task that does not exist in the traditional development cycle of sequential programs. A lot of research has been done in this area; a survey is given in [6]. TRAPPER integrates a semi-automatic mapping algorithm into a graphical programming environment.

Animation is crucial for understanding the run-time behavior of a parallel program. Parallel activities are easier to understand when they are represented graphically. ParaGraph [11] is a very popular visualization tool which can display trace data in many different graphical views. Unfortunately ParaGraph works only in offline mode, has no backstepping facilities and cannot display user-defined data. PATOP [4] is part of the Topsys project and enables very detailed observation of hardware activity, including statistical evaluation. Pablo [19] is a very flexible animation tool in which the user can define and configure the views of his data interactively. Unfortunately Pablo has no stepping/backstepping facilities and offers no time diagrams. Furthermore, there exist many graphical visualization tools for special systems. Xab is a monitor and visualizer for PVM applications. Express has its own monitoring and visualization toolset. Prism [10] is a graphical tool for the Connection Machine, which is a SIMD architecture. Prism gives excellent debugging support and is well integrated into the system software, but extending it to message passing systems and porting it to other MIMD architectures is difficult.

3. TRAPPER Components

TRAPPER comprises different tools for all phases of the software development cycle. Figure 1 gives an overview of the TRAPPER toolset.

[Figure 1: TRAPPER overview]

With the Designtool the programmer specifies the process graph, where the nodes in the graph represent processes and the edges denote communication channels. The process graph describes the parallel structure of the application, independently of the target hardware. Each process is a sequential task with access to local memory only. Processes can communicate with each other by message passing constructs. The Configtool allows the user to specify the configuration of the hardware system and determines the mapping of the process graph onto the hardware. The Monitor collects run-time information about software events like process invocation, interprocess communication, computation and communication loads of the target hardware, and user-defined events. All events are time stamped automatically and are stored in an animation file. Two different tools allow the visualization of the program execution. The Vistool enables program animation, i.e. the graphical display of the software behavior, like execution phases, variable contents and application-specific information. The Perftool displays information about the hardware, i.e. load characteristics and scheduling information.

Both animation tools offer not only online animation but also offline animation, including stepping and backtracing facilities. TRAPPER supports the snapshot concept, which displays the system states of all components at a given point of time. Furthermore, TRAPPER can visualize the dependencies between different events on a time scale and draws various statistics. TRAPPER resides on the host (currently a Sun) and is implemented in C++, using the InterViews graphics library [15], which is based on the X window system.

3.1. Designtool

The Designtool supports a hybrid program development. The parallel structure of the application is described by a graphical representation, the process graph. The sequential components are described by textual representations, the traditional program code. The process graph consists of nodes and edges, where nodes represent processes and edges represent communication channels. Each process consists of a unique process identifier, a process type denoted by the process name, and dedicated communication interfaces called ports. Large process graphs can be designed hierarchically as a composition of subsystems. A subsystem is a graphical entity and can be considered a black box which contains a subgraph of the process graph. A subsystem itself can contain other subsystems. TRAPPER can create standard topologies (rings, grids, tori and trees) with a built-in net generator.
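To make the information carried by such a graph concrete, the following C sketch models a node and an edge of the process graph as just described (identifier, type, ports, optional subgraph, channels). It is only an illustration; the type and field names are ours and do not reflect TRAPPER's internal data structures.

/* Hypothetical model of a process-graph node: a unique identifier,
 * a process type (which selects the shared code file), a set of named
 * ports, and child nodes when the node is a subsystem.               */
#define MAX_PORTS    16
#define MAX_CHILDREN 32

typedef struct pg_node {
    int             id;                     /* unique process identifier   */
    const char     *type;                   /* process type -> code file   */
    const char     *ports[MAX_PORTS];       /* communication interfaces    */
    int             nports;
    struct pg_node *children[MAX_CHILDREN]; /* subgraph of a subsystem     */
    int             nchildren;              /* 0 for a plain process       */
} pg_node;

/* An edge connects a port of a source node with a port of a destination
 * node, i.e. it represents one communication channel.                    */
typedef struct {
    pg_node *src, *dst;
    int      src_port, dst_port;
} pg_edge;

In this model a subsystem is simply a node whose children array is non-empty, which mirrors the hierarchical composition described above.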

[Figure 2: TRAPPER Designtool]

A screen dump of the Designtool is given in Figure 2. The upper left window shows the main TRAPPER control panel. The window below the control panel shows the activated Designtool with the (simplified) process graph of an application, "autonomous vehicle guidance", being developed by Daimler-Benz. In this example, data in the process graph flow from left to right. Normal processes are represented by boxes with one frame, while subsystems are represented by boxes with double frames. The right window shows an opened subsystem consisting of two processes. Ports are represented by squares located at the frame of the subsystems.

The behavior of a process is described textually by the program code. The programmer selects a process in the process graph and activates a text editor with the associated C or Fortran code file. Such a program code is shown in the lower left window. The process code is associated with the process type, not with the process itself. In other words, processes having the same name share the same process code.

3.2. Configuration Tool

With the aid of the Configtool the application is mapped onto the target hardware. The mapping is done in two steps. First, the programmer has to specify the target hardware by a special design tool. With the aid of a graphic editor the user draws the configuration of the target hardware. Nodes represent processors and edges connecting nodes represent communication links. These links are not necessary for workstation clusters, because usually all processors are connected to a shared bus. Different processor types can be introduced by using different node names. Each type carries a relative speed in order to model the different computation speeds of the various workstations.

In the second step the application is mapped onto the hardware. This can be done either automatically or manually. In the automatic mode TRAPPER computes a mapping of the process graph. The TRAPPER mapping algorithm searches for a partitioning of the process graph with a well-distributed computation load and a small communication load between partitions. It takes into account the following criteria: computation time of each process, communication volume between the processes, and speed of the processors. The input data for the mapping algorithm can be defined in two ways. Either the programmer adds weights to the nodes and edges of the process graphs, or the data are extracted from a real test run. During program execution the monitor extracts the start/stop and communication events, which can be gathered by the TRAPPER utility trace2load. This program extracts the computation and communication amounts and adds these data to the software and hardware graphs.

In general, the mapping algorithm can find only sub-optimal solutions, because the underlying optimization problems (e.g. graph partitioning) belong to the class of NP-hard problems. Therefore a heuristic algorithm called iterated 2-Opt [17] is used to determine a good solution in a short time. In the first phase a valid solution is constructed at random. Then the 2-Opt algorithm tries to improve the solution by pairwise exchanges until no more improvements can be found. The solution is disturbed by a few random exchanges before the 2-Opt phase tries to improve this solution again. After every 2-Opt phase the actual solution is compared to the best solution so far in order to keep the best mapping. This iteration is repeated until a time limit (e.g. 5 seconds) is reached. A sketch of this scheme is given below.
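The following C sketch outlines the iterated 2-Opt scheme just described. It is a simplification under our own assumptions: the function cost() stands for the real objective (load balance plus inter-partition communication, weighted by processor speed), and all names are illustrative rather than TRAPPER's implementation.

#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Assumed objective to be minimized for a mapping map[process] = cpu. */
extern double cost(const int *map, int nproc);

static void swap(int *map, int a, int b)
{
    int t = map[a]; map[a] = map[b]; map[b] = t;
}

/* 2-Opt phase: try all pairwise exchanges until none improves the cost. */
static void two_opt(int *map, int nproc)
{
    int improved = 1, i, j;
    while (improved) {
        improved = 0;
        for (i = 0; i < nproc; i++)
            for (j = i + 1; j < nproc; j++) {
                double before = cost(map, nproc);
                swap(map, i, j);
                if (cost(map, nproc) < before) improved = 1;
                else swap(map, i, j);            /* undo the exchange */
            }
    }
}

/* map[i] = processor assigned to process i; returns the best mapping found
   within the given time limit (e.g. 5 seconds).                            */
void iterated_2opt(int *map, int nproc, int ncpu, double seconds)
{
    int    *best = malloc(nproc * sizeof(int));
    double  best_cost;
    clock_t start = clock();
    int     i;

    for (i = 0; i < nproc; i++)                  /* random valid solution  */
        map[i] = rand() % ncpu;
    memcpy(best, map, nproc * sizeof(int));
    best_cost = cost(best, nproc);

    while ((double)(clock() - start) / CLOCKS_PER_SEC < seconds) {
        two_opt(map, nproc);                     /* improve by exchanges   */
        if (cost(map, nproc) < best_cost) {      /* keep the best mapping  */
            memcpy(best, map, nproc * sizeof(int));
            best_cost = cost(best, nproc);
        }
        for (i = 0; i < 3; i++)                  /* disturb the solution   */
            swap(map, rand() % nproc, rand() % nproc);
    }
    memcpy(map, best, nproc * sizeof(int));
    free(best);
}

For brevity the sketch re-evaluates the full cost for every exchange; a production version would update the cost incrementally, but the control flow mirrors the description above.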

The process graphs form the user interface of the Configtool. The mapping solution is displayed by coloring the nodes of the software graph: processes mapped onto the same processor have the same color. The mapping computed by TRAPPER can be modified by the user. The programmer can select nodes interactively in the process graph and specify the desired CPU number. After finishing the previous steps, TRAPPER generates the configuration file needed by the TRAPPER utility startpvm to start the PVM application. This is explained in section 4.

3.3. Monitoring System

The monitoring system provides run-time information as input to the Vistool and the Perftool. The Vistool needs information about the application, and the Perftool needs information about the underlying hardware. The monitoring system is the only component running on the target hardware and is therefore not as portable as the other TRAPPER components. In the following sections we describe two different monitoring systems: section 4 describes a monitor for PVM applications, and in section 5 an offline monitoring approach for PARMACS applications is presented.

3.4. Visualization Tool

The Vistool aids the observation of the run-time behavior of the application. It is integrated with the Perftool in such a way that a consistent view of the states of the hardware and software components is offered. The Vistool supports the programmer in the analysis of the parallel algorithm by displaying run-time data of the application. This helps the user to understand the dynamics of distributed systems, gives debugging information, allows the detection of errors (e.g. deadlocks, through the visualization of cycles of incomplete communication requests) and gives important information for code optimization.

The animation tools have an online and an offline mode. For short applications, the offline mode is to be preferred. The events are collected in a monitoring file which is read by the animation system. This decoupling of the animation from the execution allows the observation of the software events at an individual speed. Additional features like single stepping and backtracing are offered by TRAPPER to enable the programmer to lead the animation to the interesting program phases.

The Vistool has three different views: one is based on the process graph, a second is based on the time scale, and the last is a Kiviat diagram. The process graph view is based on the graphical representation developed with the Designtool. The programmer can select among a variety of animation features, like coloring of nodes or edges, textures on nodes or edges, changing the line width or drawing arrows on the edges, and displaying plots, histograms or rastered squares in the process boxes. The process graph animation can be used to display the process state, variable values or interprocess communication.

The time scale animation shows the events on a time scale. This animation visualizes communication operations as arrows between processes. It gives detailed insight into the cooperation of distributed processes and therefore helps the programmer to debug and optimize his program. Statistics and a critical path analysis can be supplied, displaying the dependencies in the application which determine the overall execution time. The Kiviat diagram displays the load distribution.

[Figure 3: TRAPPER animation tools]

Figure 3 shows a typical session with the TRAPPER animation tools. The upper right window shows the animation controller, with an interface similar to a tape recorder. Each animation display is controlled by this component. The animation controller supports single-stepping, normal play and fast forward in both directions. The middle left window shows the software graph. The colors of the nodes indicate the state of a process: active, waiting, communicating or idle. This diagram shows a snapshot of the application, i.e. the state at a given time. The kind of data displayed in each node can be changed interactively by the user, so that a comprehensive analysis of the interesting points in the application is possible. The lower window shows a time diagram, where each process is represented by a horizontal line. State changes of the processes are displayed by different colors; communication is indicated by a black arrow between the time lines of the partners. With this diagram the analysis of the dynamic behavior of an application is possible. The lower right window is a Kiviat diagram showing the load distribution of the processors. Each CPU is represented by a spoke of the wheel, which gets colored according to the current load. A so-called high water mark indicates the maximal value ever reached.

3.5. Performance Analysis Tool

The Perftool supports the user in the optimization phase, which completes the software development after the design and debugging phases. The software developer receives hints on possible bottlenecks that are due to load imbalances. Use of the Perftool is tightly coupled to the use of the Vistool. An important purpose of the Perftool is to find a relation between the behavior of the hardware and the software, e.g. to relate an unsatisfactory CPU load to the specific code segment which causes it.

The Perftool offers three different views: the first is based on the hardware graph, the second is based on time-scaled charts and the third is a Kiviat diagram. Within the hardware graph the CPU load and link load are shown by coloring the nodes and links. Animation of the hardware graph shows the whole parallel machine and gives a first, rough impression of its behavior. A more detailed insight can be gained with animations based on time-scaled charts. Each chart shows the temporal behavior of the selected component. Included are visualizations of CPU load, link load and scheduling information. Performance statistics are also provided. They include run-time, CPU utilization, communication overhead, speedup and efficiency of the application (with T(p) the run-time on p processors, speedup is S(p) = T(1)/T(p) and efficiency is E(p) = S(p)/p). The Kiviat diagram displays the load of all processors together with a so-called high water mark. This Kiviat diagram is useful for the analysis of the load balance.

4. PVM

This section describes the cooperation between PVM [5] applications and the TRAPPER programming environment. The PVM support consists of two independent functionalities. In the program design phase, PVM applications can be specified and mapped by TRAPPER. Running PVM applications can be monitored, and their dynamic behavior can be visualized by the TRAPPER animation tools. Figure 4 gives an overview of the two mechanisms, which are explained in the following.

Parallel applications designed and mapped by TRAPPER can be started on PVM via the startpvm utility. For this purpose TRAPPER generates a configuration file (.pvm) which describes the application in terms of processes and their target processors. With the TRAPPER Designtool only static process nets can be specified. For the definition of interprocess communication two different paradigms are supported by TRAPPER. PVM and PARMACS allow communication between arbitrary process pairs. Addressing is done by globally unique process identifiers, so that an explicit connection between communicating partners is not necessary. The second model is based on port communication, as it is used in the Transputer world. Here processes must be connected explicitly by a communication channel. For this paradigm, the process topology can be defined by the TRAPPER Designtool and is instantiated by the startpvm utility.

[Figure 4: PVM integration]

Monitoring under PVM is implemented as a special process called tracepvm. This process is started automatically by startpvm and runs alongside the application. Its purpose is twofold. First, the hardware load is recorded by polling the PVM processors via the Unix command rstat. This is done cyclically at a given time interval (e.g. 5 seconds). Second, the monitoring process receives messages sent by the application processes: task activation events (creation and termination of processes), communication events (begin and end of a communication) and user events (coloring nodes or edges, writing textures to them, or displaying application data in the nodes). Each event must carry an individual time stamp, which can either be inserted by the application or, if this is impossible, automatically by the monitoring process.

TRAPPER animation can be either online or offline. In the online case the monitor writes the events to a named pipe which is read simultaneously by TRAPPER. This enables the observation of the distributed application during run-time. In the offline mode the events are written to a file (.anim). After this file has been sorted by time stamp, it can be read and visualized by the TRAPPER visualization tools.
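As an illustration of this reporting path, the following C sketch shows how an application process could pack a time-stamped user event and send it to the tracepvm monitor using standard PVM 3 calls. The message tag, event code and packing order are our own assumptions; the paper does not specify the actual trace message format.

#include <pvm3.h>
#include <sys/time.h>

#define TRACE_TAG 9999          /* hypothetical message tag for trace data */
#define EV_USER   3             /* hypothetical event code: user event     */

void report_event(int monitor_tid, int event_code, const char *info)
{
    struct timeval tv;
    int    tid = pvm_mytid();              /* id of the reporting process   */
    double stamp;

    gettimeofday(&tv, 0);                  /* time stamp taken on the       */
    stamp = tv.tv_sec + tv.tv_usec * 1e-6; /* application side              */

    pvm_initsend(PvmDataDefault);          /* start a new message buffer    */
    pvm_pkint(&tid, 1, 1);
    pvm_pkint(&event_code, 1, 1);
    pvm_pkdouble(&stamp, 1, 1);
    pvm_pkstr((char *)info);               /* e.g. node color or user data  */
    pvm_send(monitor_tid, TRACE_TAG);      /* deliver to tracepvm           */
}

On the receiving side, tracepvm would presumably use a matching pvm_recv on the same tag, add its own time stamp if none is supplied, and forward the event to the named pipe (online) or append it to the .anim file (offline).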

5. PARMACS

This section describes the cooperation between PARMACS [7] applications and the TRAPPER programming environment. We describe the offline version, but online animation is possible by introducing a tracer process similar to the PVM implementation in section 4.

PARMACS applications use a set of machine-independent functions to express parallelism in sequential C or Fortran programs. Most important are functions that allow process creation (CREATE), synchronization (BARRIER) and communication (SEND, RECV). Process nets are mapped automatically onto the actual hardware, and communication partners are addressed by global process identifiers. This implies that PARMACS applications cannot make direct use of the design and mapping functionality of TRAPPER. Therefore our work concentrates on the automatic instrumentation of PARMACS programs and their visualization. Automatic detection of process creation and termination, barriers and process communication is realized.

The instrumentation of a PARMACS program is done by preprocessing the program code, similar to the instrumentation for ParaGraph implemented by Pallas GmbH. This preprocessing is activated by an additional flag to the parmacs call, so no changes in the source code are needed. The preprocessing expands the PARMACS macros such that a library routine is called before and after each PARMACS call. The following program segment clarifies this "sandwich" technique for the example of an asynchronous send operation:

                                     sendbegin(...)
    SEND(target,data,length,type) -> SEND(target,data,length,type)
                                     sendend(...)

These newly included functions collect the event together with the actual time stamp in an internal event queue. Each PARMACS program executes an ENDNODE or ENDHOST command before termination. These macros have also been modified such that the internal event queues are sent to the host program, where the events are written to the animation file. TRAPPER can sort and read this event file and allows visualization with the animation tools.

To instrument a PARMACS program, only the makefile has to be changed in two places: parmacs must be called with an additional flag (-patools), and the linking step must include the trapperlib-offline, which implements the monitoring functionality.
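For illustration, the inserted instrumentation routines might look roughly like the following C sketch. Only the names sendbegin and sendend and the modified ENDNODE/ENDHOST behavior come from the paper; the event layout, the queue size and the flush routine are our own assumptions, not the actual trapperlib implementation.

#include <sys/time.h>

enum { EV_SEND_BEGIN, EV_SEND_END, EV_RECV_BEGIN, EV_RECV_END, EV_BARRIER };

typedef struct {
    double stamp;     /* local time stamp of the event   */
    int    kind;      /* one of the event codes above    */
    int    partner;   /* communication partner, if any   */
} trace_event;

#define MAX_EVENTS 4096
static trace_event queue[MAX_EVENTS];   /* internal event queue */
static int         nevents;

static void record(int kind, int partner)
{
    struct timeval tv;
    if (nevents >= MAX_EVENTS) return;
    gettimeofday(&tv, 0);
    queue[nevents].stamp   = tv.tv_sec + tv.tv_usec * 1e-6;
    queue[nevents].kind    = kind;
    queue[nevents].partner = partner;
    nevents++;
}

/* Routines inserted before and after the expanded SEND macro. */
void sendbegin(int target) { record(EV_SEND_BEGIN, target); }
void sendend(int target)   { record(EV_SEND_END,   target); }

/* Called from the modified ENDNODE/ENDHOST macros: the collected queue
   is shipped to the host program, which writes it to the animation file. */
void flush_events(void)
{
    /* ... send queue[0..nevents-1] to the host process ... */
    nevents = 0;
}

With this scheme the -patools preprocessing only has to emit the surrounding calls around each expanded macro; the application source itself remains unchanged, as stated above.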

6. Conclusion

We presented TRAPPER, a graphical programming environment for parallel applications. The TRAPPER philosophy is to have the programmer explicitly specify the parallel structure of the application. TRAPPER supports a hybrid program development, where the process structure is described using a graphical representation and the sequential behavior is described using textual representations. TRAPPER consists of tools which support the design, mapping, monitoring and animation of parallel applications. With the aid of the Designtool the programmer specifies the process graph. The Configtool allows the user to specify the configuration of the hardware system and determines the mapping of the process graph onto the hardware. The monitoring system collects run-time information. The Vistool enables program animation, i.e. the graphical display of execution phases, variable contents and application-specific information. The Perftool displays information about the hardware, i.e. load characteristics and scheduling information.

In this paper we described two mechanisms for automatic program instrumentation: one for online monitoring of PVM programs and another for offline monitoring of PARMACS programs. The aim was to collect run-time information about a program run automatically, without the need for source code manipulation. Important information about the program behavior, like process creation and termination and interprocess communication, can be monitored and displayed by the TRAPPER animation tools.

6.1. Project Status

TRAPPER is an active research project. Its first releases are in use in different research projects of the GMD: in the PEGASUS project a parallel genetic algorithm toolbox is being developed, and in the ROTOR project the air flow in jet propulsion is simulated. A first release has also been delivered to the Mercedes-Benz vehicle research center and other members of the Daimler-Benz corporation.

6.2. Future Work

Future activities may concern all parts of the programming environment. Other message passing interfaces like MPI will be supported by TRAPPER; the extensions mainly concern the monitoring system. Currently the graphical representation describes only the static process structure; extensions to support dynamic nets are planned. The debugging features of the visualization can be extended by integrating existing debuggers or by extending the online features of the Vistool. It is planned to use TRAPPER as a user interface for interactions during run-time, which can be useful for interactively steered simulation processes.

7. Acknowledgements

We thank the TRAPPER team, namely Beatrix Hornef, Hans-Christof Lenhard, Josef Roggenbuck and Angelika Weihermuller, who did excellent work implementing the TRAPPER toolset.

8. References

1. M. Aspnas, R. J. R. Back, and T. Langbacka. Millipede - A Programming Environment Providing Visual Support for Parallel Programming. In European Workshop on Parallel Computing, Barcelona, Spain.
2. O. Babaoglu, L. Alvisi, A. Amoroso, R. Davoli, and L. A. Giachini. Paralex: An Environment for Parallel Programming Distributed Systems. In International Conference on Supercomputing, pages 178-187, Washington, USA, July.
3. A. Beguelin, J. J. Dongarra, A. Geist, R. Manchek, and V. Sunderam. HeNCE: Graphical Development Tool for Network-Based Concurrent Supercomputing. In Proc. of Supercomputing, Albuquerque.
4. T. Bemmerl and A. Bode. An Integrated Environment for Programming Distributed Memory Multiprocessors. In Second European Distributed Memory Conference, Munich, Apr.
5. A. Beguelin, J. Dongarra, A. Geist, R. Manchek, and V. Sunderam. A Users' Guide to PVM - Parallel Virtual Machine. Technical Report ORNL/TM-11826, Oak Ridge National Laboratory, Sept.
6. F. Berman. Experience with an Automatic Solution to the Mapping Problem. In L. H. Jamieson, D. Gannon, and R. J. Douglas, editors, The Characteristics of Parallel Algorithms, Series in Scientific Computation, pages 307-334. MIT Press.
7. L. Bomans, D. Roose, and R. Hempel. The Argonne/GMD Macros in FORTRAN for Portable Parallel Programming and their Implementation on the Intel iPSC/2. Parallel Computing, 15:119-132.
8. T. Brandes. Compiling Data Parallel Programs to Message Passing Programs for Massively Parallel MIMD Systems. In Working Conference on Massively Parallel Programming Models, Berlin, Sept.
9. T. Braunl. Structured SIMD Programming in Parallaxis. Structured Programming, 10(3):121-132.
10. Thinking Machines Corporation. Prism User's Guide. Version 1.2, Thinking Machines Corporation, March.
11. M. T. Heath and J. A. Etheridge. ParaGraph: Visualizing the Performance of Parallel Programs. IEEE Software, 8(5):29-39, Sept. 1991.

12. C. A. R. Hoare. Communicating Sequential Processes. Commun. ACM, 21(8):666-677, Aug.
13. H.-C. Hoppe, T. Kentemich, O. Kramer-Fuhrmann, and W. Krotz-Vogel. Evaluation of Graphical Performance Analysis Tools for Local Memory Parallel Computers. Technical Report D6.2.b, Esprit Project PPPE: Portable Parallel Programming Environments, July.
14. O. Kramer-Fuhrmann and T. Brandes. GRACIA: A Software Environment for Graphical Specification, Automatic Configuration and Animation of Parallel Programs. In International Conference on Supercomputing, pages 67-74, June.
15. M. A. Linton, J. M. Vlissides, and P. R. Calder. Composing User Interfaces with InterViews. Computer, 22(2):8-22, Feb.
16. J. Magee and N. Dulay. MP: A Programming Environment for Multicomputers. In Proc. of the IFIP Working Group on Programming Environments for Parallel Computers, Edinburgh, Scotland, Apr.
17. C. S. R. Murthy and V. Rajaraman. Task Assignment in a Multiprocessor System. Microprocessing and Microprogramming, 26:63-71.
18. P. Newton and J. C. Browne. The CODE 2.0 Graphical Programming Environment. In International Conference on Supercomputing, pages 167-177, Washington, USA, July.
19. D. A. Reed, R. A. Aydt, T. M. Madhyasta, R. J. Noe, K. A. Shields, and B. W. Schwartz. An Overview of the Pablo Performance Analysis Environment. Technical Report, University of Illinois, Department of Computer Science, Nov.
20. L. Schafers, C. Scheidler, and O. Kramer-Fuhrmann. TRAPPER: A Graphical Programming Environment for Industrial High-Performance Applications. In PARLE, Parallel Architectures and Languages Europe, pages 403-413, Munich, June 1993.


More information

Commission of the European Communities **************** ESPRIT III PROJECT NB 6756 **************** CAMAS

Commission of the European Communities **************** ESPRIT III PROJECT NB 6756 **************** CAMAS Commission of the European Communities **************** ESPRIT III PROJECT NB 6756 **************** CAMAS COMPUTER AIDED MIGRATION OF APPLICATIONS SYSTEM **************** CAMAS-TR-2.3.4 Finalization Report

More information

Jukka Julku Multicore programming: Low-level libraries. Outline. Processes and threads TBB MPI UPC. Examples

Jukka Julku Multicore programming: Low-level libraries. Outline. Processes and threads TBB MPI UPC. Examples Multicore Jukka Julku 19.2.2009 1 2 3 4 5 6 Disclaimer There are several low-level, languages and directive based approaches But no silver bullets This presentation only covers some examples of them is

More information

The PVM 3.4 Tracing Facility and XPVM 1.1 *

The PVM 3.4 Tracing Facility and XPVM 1.1 * The PVM 3.4 Tracing Facility and XPVM 1.1 * James Arthur Kohl (kohl@msr.epm.ornl.gov) G. A. Geist (geist@msr.epm.ornl.gov) Computer Science & Mathematics Division Oak Ridge National Laboratory Oak Ridge,

More information

Introduction to Parallel Computing

Introduction to Parallel Computing Portland State University ECE 588/688 Introduction to Parallel Computing Reference: Lawrence Livermore National Lab Tutorial https://computing.llnl.gov/tutorials/parallel_comp/ Copyright by Alaa Alameldeen

More information

PARALLEL PERFORMANCE DIRECTED TECHNOLOGY MAPPING FOR FPGA. Laurent Lemarchand. Informatique. ea 2215, D pt. ubo University{ bp 809

PARALLEL PERFORMANCE DIRECTED TECHNOLOGY MAPPING FOR FPGA. Laurent Lemarchand. Informatique. ea 2215, D pt. ubo University{ bp 809 PARALLEL PERFORMANCE DIRECTED TECHNOLOGY MAPPING FOR FPGA Laurent Lemarchand Informatique ubo University{ bp 809 f-29285, Brest { France lemarch@univ-brest.fr ea 2215, D pt ABSTRACT An ecient distributed

More information

Transactions on Information and Communications Technologies vol 9, 1995 WIT Press, ISSN

Transactions on Information and Communications Technologies vol 9, 1995 WIT Press,  ISSN Finite difference and finite element analyses using a cluster of workstations K.P. Wang, J.C. Bruch, Jr. Department of Mechanical and Environmental Engineering, q/ca/z/brm'a, 5Wa jbw6wa CW 937% Abstract

More information

A New Theory of Deadlock-Free Adaptive Multicast Routing in. Wormhole Networks. J. Duato. Facultad de Informatica. Universidad Politecnica de Valencia

A New Theory of Deadlock-Free Adaptive Multicast Routing in. Wormhole Networks. J. Duato. Facultad de Informatica. Universidad Politecnica de Valencia A New Theory of Deadlock-Free Adaptive Multicast Routing in Wormhole Networks J. Duato Facultad de Informatica Universidad Politecnica de Valencia P.O.B. 22012, 46071 - Valencia, SPAIN E-mail: jduato@aii.upv.es

More information

Multiple Data Sources

Multiple Data Sources DATA EXCHANGE: HIGH PERFORMANCE COMMUNICATIONS IN DISTRIBUTED LABORATORIES GREG EISENHAUER BETH SCHROEDER KARSTEN SCHWAN VERNARD MARTIN JEFF VETTER College of Computing Georgia Institute of Technology

More information

Chapter 8 : Multiprocessors

Chapter 8 : Multiprocessors Chapter 8 Multiprocessors 8.1 Characteristics of multiprocessors A multiprocessor system is an interconnection of two or more CPUs with memory and input-output equipment. The term processor in multiprocessor

More information

Automatic Code Generation for Non-Functional Aspects in the CORBALC Component Model

Automatic Code Generation for Non-Functional Aspects in the CORBALC Component Model Automatic Code Generation for Non-Functional Aspects in the CORBALC Component Model Diego Sevilla 1, José M. García 1, Antonio Gómez 2 1 Department of Computer Engineering 2 Department of Information and

More information

messages from disque to parsim messages from parsim to disque

messages from disque to parsim messages from parsim to disque Extension to DISQUE - A trace facility to produce trace data for use by a monitoring tool for distributed simulators Gerd Meister Department of Computer Science, University of Kaiserslautern P.O.Box 3049,

More information

Design and Implementation of a Java-based Distributed Debugger Supporting PVM and MPI

Design and Implementation of a Java-based Distributed Debugger Supporting PVM and MPI Design and Implementation of a Java-based Distributed Debugger Supporting PVM and MPI Xingfu Wu 1, 2 Qingping Chen 3 Xian-He Sun 1 1 Department of Computer Science, Louisiana State University, Baton Rouge,

More information

task object task queue

task object task queue Optimizations for Parallel Computing Using Data Access Information Martin C. Rinard Department of Computer Science University of California, Santa Barbara Santa Barbara, California 9316 martin@cs.ucsb.edu

More information

EUROPEAN ORGANIZATION FOR NUCLEAR RESEARCH PARALLEL IN-MEMORY DATABASE. Dept. Mathematics and Computing Science div. ECP

EUROPEAN ORGANIZATION FOR NUCLEAR RESEARCH PARALLEL IN-MEMORY DATABASE. Dept. Mathematics and Computing Science div. ECP EUROPEAN ORGANIZATION FOR NUCLEAR RESEARCH CERN/ECP 95-29 11 December 1995 ON-LINE EVENT RECONSTRUCTION USING A PARALLEL IN-MEMORY DATABASE E. Argante y;z,p. v.d. Stok y, I. Willers z y Eindhoven University

More information

12th European Simulation Multiconference, Manchester, Uk, Discrete Event Simulation in Interactive Scientic and

12th European Simulation Multiconference, Manchester, Uk, Discrete Event Simulation in Interactive Scientic and 12th European Simulation Multiconference, Manchester, Uk, 1998 1 Discrete Event Simulation in Interactive Scientic and Technical Computing Environments T. Pawletta, Wismar University, Germany W. Drewelow,

More information

Dynamic Process Management in an MPI Setting. William Gropp. Ewing Lusk. Abstract

Dynamic Process Management in an MPI Setting. William Gropp. Ewing Lusk.  Abstract Dynamic Process Management in an MPI Setting William Gropp Ewing Lusk Mathematics and Computer Science Division Argonne National Laboratory gropp@mcs.anl.gov lusk@mcs.anl.gov Abstract We propose extensions

More information

User Machine. Other Machines. process. (main deamon) Central. debugger. User Tool. controller. front-end. controller. debugging library.

User Machine. Other Machines. process. (main deamon) Central. debugger. User Tool. controller. front-end. controller. debugging library. A Debugging Engine for a Parallel and Distributed Environment? Jose C. Cunha, Jo~ao Lourenco, Tiago Ant~ao Universidade Nova de Lisboa Faculdade de Ci^encias e Tecnologia Departamento de Informatica 2825

More information

Neuro-Remodeling via Backpropagation of Utility. ABSTRACT Backpropagation of utility is one of the many methods for neuro-control.

Neuro-Remodeling via Backpropagation of Utility. ABSTRACT Backpropagation of utility is one of the many methods for neuro-control. Neuro-Remodeling via Backpropagation of Utility K. Wendy Tang and Girish Pingle 1 Department of Electrical Engineering SUNY at Stony Brook, Stony Brook, NY 11794-2350. ABSTRACT Backpropagation of utility

More information

Client 1. Client 2. out. Tuple Space (CB400, $5400) (Z400, $4800) removed from tuple space (Z400, $4800) remains in tuple space (CB400, $5400)

Client 1. Client 2. out. Tuple Space (CB400, $5400) (Z400, $4800) removed from tuple space (Z400, $4800) remains in tuple space (CB400, $5400) VisuaLinda: A Framework and a System for Visualizing Parallel Linda Programs Hideki Koike 3 Graduate School of Information Systems University of Electro-Communications 1{5{1, Chofugaoka, Chofu, Tokyo 182,

More information

Parallel Algorithm Design. CS595, Fall 2010

Parallel Algorithm Design. CS595, Fall 2010 Parallel Algorithm Design CS595, Fall 2010 1 Programming Models The programming model o determines the basic concepts of the parallel implementation and o abstracts from the hardware as well as from the

More information

Parallel Computing. Slides credit: M. Quinn book (chapter 3 slides), A Grama book (chapter 3 slides)

Parallel Computing. Slides credit: M. Quinn book (chapter 3 slides), A Grama book (chapter 3 slides) Parallel Computing 2012 Slides credit: M. Quinn book (chapter 3 slides), A Grama book (chapter 3 slides) Parallel Algorithm Design Outline Computational Model Design Methodology Partitioning Communication

More information

LINUX. Benchmark problems have been calculated with dierent cluster con- gurations. The results obtained from these experiments are compared to those

LINUX. Benchmark problems have been calculated with dierent cluster con- gurations. The results obtained from these experiments are compared to those Parallel Computing on PC Clusters - An Alternative to Supercomputers for Industrial Applications Michael Eberl 1, Wolfgang Karl 1, Carsten Trinitis 1 and Andreas Blaszczyk 2 1 Technische Universitat Munchen

More information

Centre for Parallel Computing, University of Westminster, London, W1M 8JS

Centre for Parallel Computing, University of Westminster, London, W1M 8JS Graphical Construction of Parallel Programs G. R. Ribeiro Justo Centre for Parallel Computing, University of Westminster, London, WM 8JS e-mail: justog@wmin.ac.uk, Abstract Parallel programming is not

More information

Oracle Developer Studio 12.6

Oracle Developer Studio 12.6 Oracle Developer Studio 12.6 Oracle Developer Studio is the #1 development environment for building C, C++, Fortran and Java applications for Oracle Solaris and Linux operating systems running on premises

More information

2 Addressing the Inheritance Anomaly One of the major issues in correctly connecting task communication mechanisms and the object-oriented paradigm is

2 Addressing the Inheritance Anomaly One of the major issues in correctly connecting task communication mechanisms and the object-oriented paradigm is Extendable, Dispatchable Task Communication Mechanisms Stephen Michell Maurya Software 29 Maurya Court Ottawa Ontario, Canada K1G 5S3 steve@maurya.on.ca Kristina Lundqvist Dept. of Computer Systems Uppsala

More information

University of Malaga. Image Template Matching on Distributed Memory and Vector Multiprocessors

University of Malaga. Image Template Matching on Distributed Memory and Vector Multiprocessors Image Template Matching on Distributed Memory and Vector Multiprocessors V. Blanco M. Martin D.B. Heras O. Plata F.F. Rivera September 995 Technical Report No: UMA-DAC-95/20 Published in: 5th Int l. Conf.

More information