TRAPPER: A GRAPHICAL PROGRAMMING ENVIRONMENT FOR PARALLEL SYSTEMS

O. KRÄMER-FUHRMANN
German National Research Center for Computer Science (GMD), Schloss Birlinghoven, Sankt Augustin, Germany

L. SCHÄFERS, C. SCHEIDLER
Daimler-Benz Research, Alt Moabit 91b, Berlin, Germany

ABSTRACT

TRAPPER is a graphical programming environment which supports the development of parallel applications. The programming environment is based on the programming model of communicating sequential processes. TRAPPER contains tools for the design, mapping and visualization of parallel systems. The Designtool supports a hybrid program development, where the parallel process structure is described using a graphical representation and the sequential behavior is described by sequential program code. The configuration of the target hardware and the mapping of the application onto the hardware are supported by the Configtool. During run-time, the monitoring system records events which can be animated by the Vistool and the Perftool, which visualize the behavior of the hardware and software components. This paper describes the support of two machine-independent message passing interfaces: PVM (Parallel Virtual Machine) and PARMACS (Parallel Macros).

1. Introduction

Parallel computing is accepted as the only technology that offers a long-term chance of improving the performance of computer systems. Parallel processing has been widely accepted in the field of numerical computing, but it will also have a great impact on technical systems once the software problem is solved. In this paper we describe TRAPPER, a graphical programming environment that assists the programmer in developing application software for systems which use parallel processing as a key technology for high computing power.

The TRAPPER philosophy is to have the programmer explicitly specify the parallel structure of the application and to aid the programmer as much as possible in the various phases of the development cycle. TRAPPER is based on the programming model of communicating sequential processes, which is suited for a large class of applications. TRAPPER has been under development at the GMD since 1991 [14]. The programming environment supports different target systems like PVM [5] and PARMACS [7]. In cooperation with Daimler-Benz, embedded industrial real-time systems based on Transputer technology are supported [20]. A first release has been delivered to different research groups of the GMD, the Mercedes-Benz vehicle research center and other members of the Daimler-Benz corporation.

2. Related Work

Ever since parallel computers have existed, researchers have investigated the difficulties associated with programming them. Tools for parallel MIMD computers (multiple instruction, multiple data) can be classified into systems which support the data-parallel approach or explicit message passing. In the data-parallel concept, the programmer deals with arrays of data that are distributed over the parallel architecture by compiler technology [8]. In the message passing paradigm, the parallel program consists of an ensemble of communicating processes [12]. The benefit of the data-parallel programming model is the absence of multiple control flows, which leads to an easy but inflexible programming model. The process-parallel programming model is more flexible, but the existence of multiple control flows raises new problems in the various program development tasks. Several research activities deal with the design, mapping, debugging, animation and optimization of parallel programs. A good overview of these tools is given in [13].

During the design phase, the programmer has to describe the parallelism of the application. Text-oriented programming languages reflect the parallel structure of a program very poorly. Therefore various tools use graphical representations instead of textual representations to describe the parallelism. HeNCE [3], CODE [18] and Paralex [2] follow the data-flow approach, where nodes represent subroutines and directed arcs represent data dependencies between subroutines. Millipede [1], MP [16] and TRAPPER use process graphs, where nodes represent processes and arcs represent communication channels.

The mapping of the application onto the target hardware is a task that does not exist in the traditional development cycle of sequential programs. A lot of research has been done in this area; a survey is given in [6]. TRAPPER integrates a semi-automatic mapping algorithm into a graphical programming environment.

Animation is crucial for understanding the run-time behavior of a parallel program. Parallel activities are easier to understand when they are represented graphically. ParaGraph [11] is a very popular visualization tool which can display trace data in many different graphical views. Unfortunately ParaGraph works only in offline mode, has no backstepping facilities and cannot display user-defined data. PATOP [4] is part of the Topsys project and enables very detailed observation of hardware activity, including statistical evaluation. Pablo [19] is a very flexible animation tool in which the user can define and configure the views of his data interactively. Unfortunately Pablo has no stepping/backstepping facilities and offers no time diagrams. Furthermore, there exist many graphical visualization tools for special systems. Xab is a monitor and visualizer for PVM applications. Express has its own monitoring and visualization toolset. Prism [10] is a graphical tool for the Connection Machine, which is a SIMD architecture. Prism gives excellent debugging support and is well integrated into the system software, but extending it to message passing systems and porting it to other MIMD architectures is difficult.

3. TRAPPER Components

TRAPPER comprises different tools for all phases of the software development cycle. Figure 1 gives an overview of the TRAPPER toolset.

[Figure 1: TRAPPER overview]

With the Designtool the programmer specifies the process graph, where the nodes in the graph represent processes and the edges denote communication channels. The process graph describes the parallel structure of the application, independently of the target hardware. Each process is a sequential task with access to local memory only. Processes can communicate with each other by message passing constructs. The Configtool allows the user to specify the configuration of the hardware system and determines the mapping of the process graph onto the hardware. The Monitor collects run-time information about software events like process invocation, interprocess communication, computation and communication loads of the target hardware, and user-defined events. All events are time stamped automatically and are stored in an animation file. Two different tools allow the visualization of the program execution. The Vistool enables program animation, i.e. the graphical display of the software behavior, like execution phases, variable contents and application-specific information. The Perftool displays information about the hardware, i.e. load characteristics and scheduling information.

Both animation tools offer not only online animation but also offline animation, including stepping and backtracing facilities. TRAPPER supports the snapshot concept, which displays the system states of all components at a given point of time. Furthermore, TRAPPER can visualize the dependencies between different events on a time scale and draws various statistics. TRAPPER resides on the host (currently a Sun) and is implemented in C++, using the InterViews graphics library [15], which is based on the X window system.

3.1. Designtool

The Designtool supports a hybrid program development. The parallel structure of the application is described by a graphical representation, the process graph. The sequential components are described by textual representations, the traditional program code. The process graph consists of nodes and edges, where nodes represent processes and edges represent communication channels. Each process consists of a unique process identifier, a process type denoted by the process name, and dedicated communication interfaces called ports. Large process graphs can be designed hierarchically as a composition of subsystems. A subsystem is a graphical entity and can be considered a black box which contains a subgraph of the process graph. A subsystem itself can contain other subsystems. TRAPPER can create standard topologies (rings, grids, tori and trees) with a built-in net generator.
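To make the information carried by such a graph concrete, the following C sketch models a node and an edge of the process graph as just described (identifier, type, ports, optional subgraph, channels). It is only an illustration; the type and field names are ours and do not reflect TRAPPER's internal data structures.

/* Hypothetical model of a process-graph node: a unique identifier,
 * a process type (which selects the shared code file), a set of named
 * ports, and child nodes when the node is a subsystem.               */
#define MAX_PORTS    16
#define MAX_CHILDREN 32

typedef struct pg_node {
    int             id;                     /* unique process identifier   */
    const char     *type;                   /* process type -> code file   */
    const char     *ports[MAX_PORTS];       /* communication interfaces    */
    int             nports;
    struct pg_node *children[MAX_CHILDREN]; /* subgraph of a subsystem     */
    int             nchildren;              /* 0 for a plain process       */
} pg_node;

/* An edge connects a port of a source node with a port of a destination
 * node, i.e. it represents one communication channel.                    */
typedef struct {
    pg_node *src, *dst;
    int      src_port, dst_port;
} pg_edge;

In this model a subsystem is simply a node whose children array is non-empty, which mirrors the hierarchical composition described above.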

[Figure 2: TRAPPER Designtool]

A screen dump of the Designtool is given in Figure 2. The upper left window shows the main TRAPPER control panel. The window below the control panel shows the activated Designtool with the (simplified) process graph of an application, "autonomous vehicle guidance", being developed by Daimler-Benz. In this example, data in the process graph flow from left to right. Normal processes are represented by boxes with one frame, while subsystems are represented by boxes with double frames. The right window shows an opened subsystem consisting of two processes. Ports are represented by squares located at the frame of the subsystems.

The behavior of a process is described textually by the program code. The programmer selects a process in the process graph and activates a text editor with the associated C or Fortran code file. Such a program code is shown in the lower left window. The process code is associated with the process type, not with the process itself. In other words, processes having the same name share the same process code.

3.2. Configuration Tool

With the aid of the Configtool the application is mapped onto the target hardware. The mapping is done in two steps. First, the programmer has to specify the target hardware by a special design tool. With the aid of a graphic editor the user draws the configuration of the target hardware. Nodes represent processors and edges connecting nodes represent communication links. These links are not necessary for workstation clusters, because usually all processors are connected to a shared bus. Different processor types can be introduced by using different node names. Each type carries a relative speed in order to model the different computation speeds of the various workstations.

In the second step the application is mapped onto the hardware. This can be done either automatically or manually. In the automatic mode TRAPPER computes a mapping of the process graph. The TRAPPER mapping algorithm searches for a partitioning of the process graph with a well-distributed computation load and a small communication load between partitions. It takes into account the following criteria: computation time of each process, communication volume between the processes, and speed of the processors. The input data for the mapping algorithm can be defined in two ways. Either the programmer adds weights to the nodes and edges of the process graphs, or the data are extracted from a real test run. During program execution the monitor extracts the start/stop and communication events, which can be gathered by the TRAPPER utility trace2load. This program extracts the computation and communication amounts and adds these data to the software and hardware graphs.

In general, the mapping algorithm can find only sub-optimal solutions, because the underlying optimization problems (e.g. graph partitioning) belong to the class of NP-hard problems. Therefore a heuristic algorithm called iterated 2-Opt [17] is used to determine a good solution in a short time. In the first phase a valid solution is constructed at random. Then the 2-Opt algorithm tries to improve the solution by pairwise exchanges until no more improvements can be found. The solution is disturbed by a few random exchanges before the 2-Opt phase tries to improve this solution again. After every 2-Opt phase the actual solution is compared to the best solution so far in order to keep the best mapping. This iteration is repeated until a time limit (e.g. 5 seconds) is reached. A sketch of this scheme is given below.
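The following C sketch outlines the iterated 2-Opt scheme just described. It is a simplification under our own assumptions: the function cost() stands for the real objective (load balance plus inter-partition communication, weighted by processor speed), and all names are illustrative rather than TRAPPER's implementation.

#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Assumed objective to be minimized for a mapping map[process] = cpu. */
extern double cost(const int *map, int nproc);

static void swap(int *map, int a, int b)
{
    int t = map[a]; map[a] = map[b]; map[b] = t;
}

/* 2-Opt phase: try all pairwise exchanges until none improves the cost. */
static void two_opt(int *map, int nproc)
{
    int improved = 1, i, j;
    while (improved) {
        improved = 0;
        for (i = 0; i < nproc; i++)
            for (j = i + 1; j < nproc; j++) {
                double before = cost(map, nproc);
                swap(map, i, j);
                if (cost(map, nproc) < before) improved = 1;
                else swap(map, i, j);            /* undo the exchange */
            }
    }
}

/* map[i] = processor assigned to process i; returns the best mapping found
   within the given time limit (e.g. 5 seconds).                            */
void iterated_2opt(int *map, int nproc, int ncpu, double seconds)
{
    int    *best = malloc(nproc * sizeof(int));
    double  best_cost;
    clock_t start = clock();
    int     i;

    for (i = 0; i < nproc; i++)                  /* random valid solution  */
        map[i] = rand() % ncpu;
    memcpy(best, map, nproc * sizeof(int));
    best_cost = cost(best, nproc);

    while ((double)(clock() - start) / CLOCKS_PER_SEC < seconds) {
        two_opt(map, nproc);                     /* improve by exchanges   */
        if (cost(map, nproc) < best_cost) {      /* keep the best mapping  */
            memcpy(best, map, nproc * sizeof(int));
            best_cost = cost(best, nproc);
        }
        for (i = 0; i < 3; i++)                  /* disturb the solution   */
            swap(map, rand() % nproc, rand() % nproc);
    }
    memcpy(map, best, nproc * sizeof(int));
    free(best);
}

For brevity the sketch re-evaluates the full cost for every exchange; a production version would update the cost incrementally, but the control flow mirrors the description above.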

The process graphs form the user interface of the Configtool. The mapping solution is displayed by coloring the nodes of the software graph: processes mapped onto the same processor have the same color. The mapping computed by TRAPPER can be modified by the user. The programmer can select nodes interactively in the process graph and specify the desired CPU number. After finishing the previous steps, TRAPPER generates the configuration file needed by the TRAPPER utility startpvm to start the PVM application. This is explained in section 4.

3.3. Monitoring System

The monitoring system provides run-time information as input to the Vistool and the Perftool. The Vistool needs information about the application, and the Perftool needs information about the underlying hardware. The monitoring system is the only component running on the target hardware and is therefore not as portable as the other TRAPPER components. In the following sections we describe two different monitoring systems: section 4 describes a monitor for PVM applications, and in section 5 an offline monitoring approach for PARMACS applications is presented.

3.4. Visualization Tool

The Vistool aids the observation of the run-time behavior of the application. It is integrated with the Perftool in such a way that a consistent view of the states of the hardware and software components is offered. The Vistool supports the programmer in the analysis of the parallel algorithm by displaying run-time data of the application. This helps the user to understand the dynamics of distributed systems, gives debugging information, allows the detection of errors (e.g. deadlocks, through the visualization of cycles of incomplete communication requests) and gives important information for code optimization.

The animation tools have an online and an offline mode. For short applications, the offline mode is to be preferred. The events are collected in a monitoring file which is read by the animation system. This decoupling of the animation from the execution allows the observation of the software events at an individual speed. Additional features like single stepping and backtracing are offered by TRAPPER to enable the programmer to lead the animation to the interesting program phases.

The Vistool has three different views: one is based on the process graph, a second is based on the time scale, and the last is a Kiviat diagram. The process graph view is based on the graphical representation developed with the Designtool. The programmer can select among a variety of animation features, like coloring of nodes or edges, textures on nodes or edges, changing the line width or drawing arrows on the edges, and displaying plots, histograms or rastered squares in the process boxes. The process graph animation can be used to display the process state, variable values or interprocess communication.

The time scale animation shows the events on a time scale. This animation visualizes communication operations as arrows between processes. It gives detailed insight into the cooperation of distributed processes and therefore helps the programmer to debug and optimize his program. Statistics and a critical path analysis can be supplied, displaying the dependencies in the application which determine the overall execution time. The Kiviat diagram displays the load distribution.

[Figure 3: TRAPPER animation tools]

Figure 3 shows a typical session with the TRAPPER animation tools. The upper right window shows the animation controller, with an interface similar to a tape recorder. Each animation display is controlled by this component. The animation controller supports single-stepping, normal play and fast forward in both directions. The middle left window shows the software graph. The colors of the nodes indicate the state of a process: active, waiting, communicating or idle. This diagram shows a snapshot of the application, i.e. the state at a given time. The kind of data displayed in each node can be changed interactively by the user, so that a comprehensive analysis of the interesting points in the application is possible. The lower window shows a time diagram, where each process is represented by a horizontal line. State changes of the processes are displayed by different colors; communication is indicated by a black arrow between the time lines of the partners. With this diagram the analysis of the dynamic behavior of an application is possible. The lower right window is a Kiviat diagram showing the load distribution of the processors. Each CPU is represented by a spoke of the wheel, which gets colored according to the current load. A so-called high water mark indicates the maximal value ever reached.

3.5. Performance Analysis Tool

The Perftool supports the user in the optimization phase, which completes the software development after the design and debugging phases. The software developer receives hints on possible bottlenecks that are due to load imbalances. Use of the Perftool is tightly coupled to the use of the Vistool. An important purpose of the Perftool is to find a relation between the behavior of the hardware and the software, e.g. to relate an unsatisfactory CPU load to the specific code segment which causes it.

The Perftool offers three different views: the first is based on the hardware graph, the second is based on time-scaled charts and the third is a Kiviat diagram. Within the hardware graph the CPU load and link load are shown by coloring the nodes and links. Animation of the hardware graph shows the whole parallel machine and gives a first, rough impression of its behavior. A more detailed insight can be gained with animations based on time-scaled charts. Each chart shows the temporal behavior of the selected component. Included are visualizations of CPU load, link load and scheduling information. Performance statistics are also provided. They include run-time, CPU utilization, communication overhead, speedup and efficiency of the application (with T(p) the run-time on p processors, speedup is S(p) = T(1)/T(p) and efficiency is E(p) = S(p)/p). The Kiviat diagram displays the load of all processors together with a so-called high water mark. This Kiviat diagram is useful for the analysis of the load balance.

4. PVM

This section describes the cooperation between PVM [5] applications and the TRAPPER programming environment. The PVM support consists of two independent functionalities. In the program design phase, PVM applications can be specified and mapped by TRAPPER. Running PVM applications can be monitored, and their dynamic behavior can be visualized by the TRAPPER animation tools. Figure 4 gives an overview of the two mechanisms, which are explained in the following.

Parallel applications designed and mapped by TRAPPER can be started on PVM via the startpvm utility. For this purpose TRAPPER generates a configuration file (.pvm) which describes the application in terms of processes and their target processors. With the TRAPPER Designtool only static process nets can be specified. For the definition of interprocess communication two different paradigms are supported by TRAPPER. PVM and PARMACS allow communication between arbitrary process pairs. Addressing is done by globally unique process identifiers, so that an explicit connection between communicating partners is not necessary. The second model is based on port communication, as it is used in the Transputer world. Here processes must be connected explicitly by a communication channel. For this paradigm, the process topology can be defined by the TRAPPER Designtool and is instantiated by the startpvm utility.

[Figure 4: PVM integration]

Monitoring under PVM is implemented as a special process called tracepvm. This process is started automatically by startpvm and runs alongside the application. Its purpose is twofold. First, the hardware load is recorded by polling the PVM processors via the Unix command rstat. This is done cyclically at a given time interval (e.g. 5 seconds). Second, the monitoring process receives messages sent by the application processes: task activation events (creation and termination of processes), communication events (begin and end of a communication) and user events (coloring nodes or edges, writing textures to them, or displaying application data in the nodes). Each event must carry an individual time stamp, which can either be inserted by the application or, if this is impossible, automatically by the monitoring process.

TRAPPER animation can be either online or offline. In the online case the monitor writes the events to a named pipe which is read simultaneously by TRAPPER. This enables the observation of the distributed application during run-time. In the offline mode the events are written to a file (.anim). After this file has been sorted by time stamp, it can be read and visualized by the TRAPPER visualization tools.
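As an illustration of this reporting path, the following C sketch shows how an application process could pack a time-stamped user event and send it to the tracepvm monitor using standard PVM 3 calls. The message tag, event code and packing order are our own assumptions; the paper does not specify the actual trace message format.

#include <pvm3.h>
#include <sys/time.h>

#define TRACE_TAG 9999          /* hypothetical message tag for trace data */
#define EV_USER   3             /* hypothetical event code: user event     */

void report_event(int monitor_tid, int event_code, const char *info)
{
    struct timeval tv;
    int    tid = pvm_mytid();              /* id of the reporting process   */
    double stamp;

    gettimeofday(&tv, 0);                  /* time stamp taken on the       */
    stamp = tv.tv_sec + tv.tv_usec * 1e-6; /* application side              */

    pvm_initsend(PvmDataDefault);          /* start a new message buffer    */
    pvm_pkint(&tid, 1, 1);
    pvm_pkint(&event_code, 1, 1);
    pvm_pkdouble(&stamp, 1, 1);
    pvm_pkstr((char *)info);               /* e.g. node color or user data  */
    pvm_send(monitor_tid, TRACE_TAG);      /* deliver to tracepvm           */
}

On the receiving side, tracepvm would presumably use a matching pvm_recv on the same tag, add its own time stamp if none is supplied, and forward the event to the named pipe (online) or append it to the .anim file (offline).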

5. PARMACS

This section describes the cooperation between PARMACS [7] applications and the TRAPPER programming environment. We describe the offline version, but online animation is possible by introducing a tracer process similar to the PVM implementation in section 4.

PARMACS applications use a set of machine-independent functions to express parallelism in sequential C or Fortran programs. Most important are functions that allow process creation (CREATE), synchronization (BARRIER) and communication (SEND, RECV). Process nets are mapped automatically onto the actual hardware, and communication partners are addressed by global process identifiers. This implies that PARMACS applications cannot make direct use of the design and mapping functionality of TRAPPER. Therefore our work concentrates on the automatic instrumentation of PARMACS programs and their visualization. Automatic detection of process creation and termination, barriers and process communication is realized.

The instrumentation of a PARMACS program is done by preprocessing the program code, similar to the instrumentation for ParaGraph implemented by Pallas GmbH. This preprocessing is activated by an additional flag to the parmacs call, so no changes in the source code are needed. The preprocessing expands the PARMACS macros such that a library routine is called before and after each PARMACS call. The following program segment clarifies this "sandwich" technique for the example of an asynchronous send operation:

                                     sendbegin(...)
    SEND(target,data,length,type) -> SEND(target,data,length,type)
                                     sendend(...)

These newly included functions collect the event together with the actual time stamp in an internal event queue. Each PARMACS program executes an ENDNODE or ENDHOST command before termination. These macros have also been modified such that the internal event queues are sent to the host program, where the events are written to the animation file. TRAPPER can sort and read this event file and allows visualization with the animation tools.

To instrument a PARMACS program, only the makefile has to be changed in two places: parmacs must be called with an additional flag (-patools), and the linking step must include the trapperlib-offline, which implements the monitoring functionality.
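For illustration, the inserted instrumentation routines might look roughly like the following C sketch. Only the names sendbegin and sendend and the modified ENDNODE/ENDHOST behavior come from the paper; the event layout, the queue size and the flush routine are our own assumptions, not the actual trapperlib implementation.

#include <sys/time.h>

enum { EV_SEND_BEGIN, EV_SEND_END, EV_RECV_BEGIN, EV_RECV_END, EV_BARRIER };

typedef struct {
    double stamp;     /* local time stamp of the event   */
    int    kind;      /* one of the event codes above    */
    int    partner;   /* communication partner, if any   */
} trace_event;

#define MAX_EVENTS 4096
static trace_event queue[MAX_EVENTS];   /* internal event queue */
static int         nevents;

static void record(int kind, int partner)
{
    struct timeval tv;
    if (nevents >= MAX_EVENTS) return;
    gettimeofday(&tv, 0);
    queue[nevents].stamp   = tv.tv_sec + tv.tv_usec * 1e-6;
    queue[nevents].kind    = kind;
    queue[nevents].partner = partner;
    nevents++;
}

/* Routines inserted before and after the expanded SEND macro. */
void sendbegin(int target) { record(EV_SEND_BEGIN, target); }
void sendend(int target)   { record(EV_SEND_END,   target); }

/* Called from the modified ENDNODE/ENDHOST macros: the collected queue
   is shipped to the host program, which writes it to the animation file. */
void flush_events(void)
{
    /* ... send queue[0..nevents-1] to the host process ... */
    nevents = 0;
}

With this scheme the -patools preprocessing only has to emit the surrounding calls around each expanded macro; the application source itself remains unchanged, as stated above.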

6. Conclusion

We presented TRAPPER, a graphical programming environment for parallel applications. The TRAPPER philosophy is to have the programmer explicitly specify the parallel structure of the application. TRAPPER supports a hybrid program development, where the process structure is described using a graphical representation and the sequential behavior is described using textual representations. TRAPPER consists of tools which support the design, mapping, monitoring and animation of parallel applications. With the aid of the Designtool the programmer specifies the process graph. The Configtool allows the user to specify the configuration of the hardware system and determines the mapping of the process graph onto the hardware. The monitoring system collects run-time information. The Vistool enables program animation, i.e. the graphical display of execution phases, variable contents and application-specific information. The Perftool displays information about the hardware, i.e. load characteristics and scheduling information.

In this paper we described two mechanisms for automatic program instrumentation: one for online monitoring of PVM programs and another for offline monitoring of PARMACS programs. The aim was to collect run-time information about a program run automatically, without the need for source code manipulation. Important information about the program behavior, like process creation and termination and interprocess communication, can be monitored and displayed by the TRAPPER animation tools.

6.1. Project Status

TRAPPER is an active research project. Its first releases are in use in different research projects of the GMD: in the PEGASUS project a parallel genetic algorithm toolbox is being developed, and in the ROTOR project the air flow in jet propulsion is simulated. A first release has also been delivered to the Mercedes-Benz vehicle research center and other members of the Daimler-Benz corporation.

6.2. Future Work

Future activities may concern all parts of the programming environment. Other message passing interfaces like MPI will be supported by TRAPPER; the extensions mainly concern the monitoring system. Currently the graphical representation describes only the static process structure; extensions to support dynamic nets are planned. The debugging features of the visualization can be extended by integrating existing debuggers or by extending the online features of the Vistool. It is planned to use TRAPPER as a user interface for interactions during run-time, which can be useful for interactively steered simulation processes.

7. Acknowledgements

We thank the TRAPPER team, namely Beatrix Hornef, Hans-Christof Lenhard, Josef Roggenbuck and Angelika Weihermuller, who did excellent work implementing the TRAPPER toolset.

8. References

1. M. Aspnas, R. J. R. Back, and T. Langbacka. Millipede - A Programming Environment Providing Visual Support for Parallel Programming. In European Workshop on Parallel Computing, Barcelona, Spain.
2. O. Babaoglu, L. Alvisi, A. Amoroso, R. Davoli, and L. A. Giachini. Paralex: An Environment for Parallel Programming Distributed Systems. In International Conference on Supercomputing, pages 178-187, Washington, USA, July.
3. A. Beguelin, J. J. Dongarra, A. Geist, R. Manchek, and V. Sunderam. HeNCE: Graphical Development Tool for Network-Based Concurrent Supercomputing. In Proc. of Supercomputing, Albuquerque.
4. T. Bemmerl and A. Bode. An Integrated Environment for Programming Distributed Memory Multiprocessors. In Second European Distributed Memory Conference, Munich, Apr.
5. A. Beguelin, J. Dongarra, A. Geist, R. Manchek, and V. Sunderam. A Users' Guide to PVM - Parallel Virtual Machine. Technical Report ORNL/TM-11826, Oak Ridge National Laboratory, Sept.
6. F. Berman. Experience with an Automatic Solution to the Mapping Problem. In L. H. Jamieson, D. Gannon, and R. J. Douglas, editors, The Characteristics of Parallel Algorithms, Series in Scientific Computation, pages 307-334. MIT Press.
7. L. Bomans, D. Roose, and R. Hempel. The Argonne/GMD Macros in FORTRAN for Portable Parallel Programming and their Implementation on the Intel iPSC/2. Parallel Computing, 15:119-132.
8. T. Brandes. Compiling Data Parallel Programs to Message Passing Programs for Massively Parallel MIMD Systems. In Working Conference on Massively Parallel Programming Models, Berlin, Sept.
9. T. Braunl. Structured SIMD Programming in Parallaxis. Structured Programming, 10(3):121-132.
10. Thinking Machines Corporation. Prism User's Guide. Version 1.2, Thinking Machines Corporation, March.
11. M. T. Heath and J. A. Etheridge. ParaGraph: Visualizing the Performance of Parallel Programs. IEEE Software, 8(5):29-39, Sept. 1991.

12. C. A. R. Hoare. Communicating Sequential Processes. Commun. ACM, 21(8):666-677, Aug.
13. H.-C. Hoppe, T. Kentemich, O. Kramer-Fuhrmann, and W. Krotz-Vogel. Evaluation of Graphical Performance Analysis Tools for Local Memory Parallel Computers. Technical Report D6.2.b, Esprit Project PPPE: Portable Parallel Programming Environments, July.
14. O. Kramer-Fuhrmann and T. Brandes. GRACIA: A Software Environment for Graphical Specification, Automatic Configuration and Animation of Parallel Programs. In International Conference on Supercomputing, pages 67-74, June.
15. M. A. Linton, J. M. Vlissides, and P. R. Calder. Composing User Interfaces with InterViews. Computer, 22(2):8-22, Feb.
16. J. Magee and N. Dulay. MP: A Programming Environment for Multicomputers. In Proc. of the IFIP Working Group on Programming Environments for Parallel Computers, Edinburgh, Scotland, Apr.
17. C. S. R. Murthy and V. Rajaraman. Task Assignment in a Multiprocessor System. Microprocessing and Microprogramming, 26:63-71.
18. P. Newton and J. C. Browne. The CODE 2.0 Graphical Programming Environment. In International Conference on Supercomputing, pages 167-177, Washington, USA, July.
19. D. A. Reed, R. A. Aydt, T. M. Madhyasta, R. J. Noe, K. A. Shields, and B. W. Schwartz. An Overview of the Pablo Performance Analysis Environment. Technical Report, University of Illinois, Department of Computer Science, Nov.
20. L. Schafers, C. Scheidler, and O. Kramer-Fuhrmann. TRAPPER: A Graphical Programming Environment for Industrial High-Performance Applications. In PARLE, Parallel Architectures and Languages Europe, pages 403-413, Munich, June 1993.


More information

Commission of the European Communities **************** ESPRIT III PROJECT NB 6756 **************** CAMAS

Commission of the European Communities **************** ESPRIT III PROJECT NB 6756 **************** CAMAS Commission of the European Communities **************** ESPRIT III PROJECT NB 6756 **************** CAMAS COMPUTER AIDED MIGRATION OF APPLICATIONS SYSTEM **************** CAMAS-TR-2.3.4 Finalization Report

More information

Jukka Julku Multicore programming: Low-level libraries. Outline. Processes and threads TBB MPI UPC. Examples

Jukka Julku Multicore programming: Low-level libraries. Outline. Processes and threads TBB MPI UPC. Examples Multicore Jukka Julku 19.2.2009 1 2 3 4 5 6 Disclaimer There are several low-level, languages and directive based approaches But no silver bullets This presentation only covers some examples of them is

More information

The PVM 3.4 Tracing Facility and XPVM 1.1 *

The PVM 3.4 Tracing Facility and XPVM 1.1 * The PVM 3.4 Tracing Facility and XPVM 1.1 * James Arthur Kohl (kohl@msr.epm.ornl.gov) G. A. Geist (geist@msr.epm.ornl.gov) Computer Science & Mathematics Division Oak Ridge National Laboratory Oak Ridge,

More information

Introduction to Parallel Computing

Introduction to Parallel Computing Portland State University ECE 588/688 Introduction to Parallel Computing Reference: Lawrence Livermore National Lab Tutorial https://computing.llnl.gov/tutorials/parallel_comp/ Copyright by Alaa Alameldeen

More information

PARALLEL PERFORMANCE DIRECTED TECHNOLOGY MAPPING FOR FPGA. Laurent Lemarchand. Informatique. ea 2215, D pt. ubo University{ bp 809

PARALLEL PERFORMANCE DIRECTED TECHNOLOGY MAPPING FOR FPGA. Laurent Lemarchand. Informatique. ea 2215, D pt. ubo University{ bp 809 PARALLEL PERFORMANCE DIRECTED TECHNOLOGY MAPPING FOR FPGA Laurent Lemarchand Informatique ubo University{ bp 809 f-29285, Brest { France lemarch@univ-brest.fr ea 2215, D pt ABSTRACT An ecient distributed

More information

Transactions on Information and Communications Technologies vol 9, 1995 WIT Press, ISSN

Transactions on Information and Communications Technologies vol 9, 1995 WIT Press,  ISSN Finite difference and finite element analyses using a cluster of workstations K.P. Wang, J.C. Bruch, Jr. Department of Mechanical and Environmental Engineering, q/ca/z/brm'a, 5Wa jbw6wa CW 937% Abstract

More information

A New Theory of Deadlock-Free Adaptive Multicast Routing in. Wormhole Networks. J. Duato. Facultad de Informatica. Universidad Politecnica de Valencia

A New Theory of Deadlock-Free Adaptive Multicast Routing in. Wormhole Networks. J. Duato. Facultad de Informatica. Universidad Politecnica de Valencia A New Theory of Deadlock-Free Adaptive Multicast Routing in Wormhole Networks J. Duato Facultad de Informatica Universidad Politecnica de Valencia P.O.B. 22012, 46071 - Valencia, SPAIN E-mail: jduato@aii.upv.es

More information

Multiple Data Sources

Multiple Data Sources DATA EXCHANGE: HIGH PERFORMANCE COMMUNICATIONS IN DISTRIBUTED LABORATORIES GREG EISENHAUER BETH SCHROEDER KARSTEN SCHWAN VERNARD MARTIN JEFF VETTER College of Computing Georgia Institute of Technology

More information

Chapter 8 : Multiprocessors

Chapter 8 : Multiprocessors Chapter 8 Multiprocessors 8.1 Characteristics of multiprocessors A multiprocessor system is an interconnection of two or more CPUs with memory and input-output equipment. The term processor in multiprocessor

More information

Automatic Code Generation for Non-Functional Aspects in the CORBALC Component Model

Automatic Code Generation for Non-Functional Aspects in the CORBALC Component Model Automatic Code Generation for Non-Functional Aspects in the CORBALC Component Model Diego Sevilla 1, José M. García 1, Antonio Gómez 2 1 Department of Computer Engineering 2 Department of Information and

More information

messages from disque to parsim messages from parsim to disque

messages from disque to parsim messages from parsim to disque Extension to DISQUE - A trace facility to produce trace data for use by a monitoring tool for distributed simulators Gerd Meister Department of Computer Science, University of Kaiserslautern P.O.Box 3049,

More information

Design and Implementation of a Java-based Distributed Debugger Supporting PVM and MPI

Design and Implementation of a Java-based Distributed Debugger Supporting PVM and MPI Design and Implementation of a Java-based Distributed Debugger Supporting PVM and MPI Xingfu Wu 1, 2 Qingping Chen 3 Xian-He Sun 1 1 Department of Computer Science, Louisiana State University, Baton Rouge,

More information

task object task queue

task object task queue Optimizations for Parallel Computing Using Data Access Information Martin C. Rinard Department of Computer Science University of California, Santa Barbara Santa Barbara, California 9316 martin@cs.ucsb.edu

More information

EUROPEAN ORGANIZATION FOR NUCLEAR RESEARCH PARALLEL IN-MEMORY DATABASE. Dept. Mathematics and Computing Science div. ECP

EUROPEAN ORGANIZATION FOR NUCLEAR RESEARCH PARALLEL IN-MEMORY DATABASE. Dept. Mathematics and Computing Science div. ECP EUROPEAN ORGANIZATION FOR NUCLEAR RESEARCH CERN/ECP 95-29 11 December 1995 ON-LINE EVENT RECONSTRUCTION USING A PARALLEL IN-MEMORY DATABASE E. Argante y;z,p. v.d. Stok y, I. Willers z y Eindhoven University

More information

12th European Simulation Multiconference, Manchester, Uk, Discrete Event Simulation in Interactive Scientic and

12th European Simulation Multiconference, Manchester, Uk, Discrete Event Simulation in Interactive Scientic and 12th European Simulation Multiconference, Manchester, Uk, 1998 1 Discrete Event Simulation in Interactive Scientic and Technical Computing Environments T. Pawletta, Wismar University, Germany W. Drewelow,

More information

Dynamic Process Management in an MPI Setting. William Gropp. Ewing Lusk. Abstract

Dynamic Process Management in an MPI Setting. William Gropp. Ewing Lusk.  Abstract Dynamic Process Management in an MPI Setting William Gropp Ewing Lusk Mathematics and Computer Science Division Argonne National Laboratory gropp@mcs.anl.gov lusk@mcs.anl.gov Abstract We propose extensions

More information

User Machine. Other Machines. process. (main deamon) Central. debugger. User Tool. controller. front-end. controller. debugging library.

User Machine. Other Machines. process. (main deamon) Central. debugger. User Tool. controller. front-end. controller. debugging library. A Debugging Engine for a Parallel and Distributed Environment? Jose C. Cunha, Jo~ao Lourenco, Tiago Ant~ao Universidade Nova de Lisboa Faculdade de Ci^encias e Tecnologia Departamento de Informatica 2825

More information

Neuro-Remodeling via Backpropagation of Utility. ABSTRACT Backpropagation of utility is one of the many methods for neuro-control.

Neuro-Remodeling via Backpropagation of Utility. ABSTRACT Backpropagation of utility is one of the many methods for neuro-control. Neuro-Remodeling via Backpropagation of Utility K. Wendy Tang and Girish Pingle 1 Department of Electrical Engineering SUNY at Stony Brook, Stony Brook, NY 11794-2350. ABSTRACT Backpropagation of utility

More information

Client 1. Client 2. out. Tuple Space (CB400, $5400) (Z400, $4800) removed from tuple space (Z400, $4800) remains in tuple space (CB400, $5400)

Client 1. Client 2. out. Tuple Space (CB400, $5400) (Z400, $4800) removed from tuple space (Z400, $4800) remains in tuple space (CB400, $5400) VisuaLinda: A Framework and a System for Visualizing Parallel Linda Programs Hideki Koike 3 Graduate School of Information Systems University of Electro-Communications 1{5{1, Chofugaoka, Chofu, Tokyo 182,

More information

Parallel Algorithm Design. CS595, Fall 2010

Parallel Algorithm Design. CS595, Fall 2010 Parallel Algorithm Design CS595, Fall 2010 1 Programming Models The programming model o determines the basic concepts of the parallel implementation and o abstracts from the hardware as well as from the

More information

Parallel Computing. Slides credit: M. Quinn book (chapter 3 slides), A Grama book (chapter 3 slides)

Parallel Computing. Slides credit: M. Quinn book (chapter 3 slides), A Grama book (chapter 3 slides) Parallel Computing 2012 Slides credit: M. Quinn book (chapter 3 slides), A Grama book (chapter 3 slides) Parallel Algorithm Design Outline Computational Model Design Methodology Partitioning Communication

More information

LINUX. Benchmark problems have been calculated with dierent cluster con- gurations. The results obtained from these experiments are compared to those

LINUX. Benchmark problems have been calculated with dierent cluster con- gurations. The results obtained from these experiments are compared to those Parallel Computing on PC Clusters - An Alternative to Supercomputers for Industrial Applications Michael Eberl 1, Wolfgang Karl 1, Carsten Trinitis 1 and Andreas Blaszczyk 2 1 Technische Universitat Munchen

More information

Centre for Parallel Computing, University of Westminster, London, W1M 8JS

Centre for Parallel Computing, University of Westminster, London, W1M 8JS Graphical Construction of Parallel Programs G. R. Ribeiro Justo Centre for Parallel Computing, University of Westminster, London, WM 8JS e-mail: justog@wmin.ac.uk, Abstract Parallel programming is not

More information

Oracle Developer Studio 12.6

Oracle Developer Studio 12.6 Oracle Developer Studio 12.6 Oracle Developer Studio is the #1 development environment for building C, C++, Fortran and Java applications for Oracle Solaris and Linux operating systems running on premises

More information

2 Addressing the Inheritance Anomaly One of the major issues in correctly connecting task communication mechanisms and the object-oriented paradigm is

2 Addressing the Inheritance Anomaly One of the major issues in correctly connecting task communication mechanisms and the object-oriented paradigm is Extendable, Dispatchable Task Communication Mechanisms Stephen Michell Maurya Software 29 Maurya Court Ottawa Ontario, Canada K1G 5S3 steve@maurya.on.ca Kristina Lundqvist Dept. of Computer Systems Uppsala

More information

University of Malaga. Image Template Matching on Distributed Memory and Vector Multiprocessors

University of Malaga. Image Template Matching on Distributed Memory and Vector Multiprocessors Image Template Matching on Distributed Memory and Vector Multiprocessors V. Blanco M. Martin D.B. Heras O. Plata F.F. Rivera September 995 Technical Report No: UMA-DAC-95/20 Published in: 5th Int l. Conf.

More information