Parallel Arch. & Lang. (PARLE 94), Lect. Notes in Comp. Sci., Vol 817, pp , July 1994


A Formal Approach to Modeling Expected Behavior in Parallel Program Visualizations*

Joseph L. Sharnowski and Betty H.C. Cheng**

Department of Computer Science, Michigan State University, A714 Wells Hall, East Lansing, Michigan (ph: ; fax: )
{sharnows,chengb}@cps.msu.edu

* This work is supported in part by the NSF Grant CCR
** Please address all correspondence to this author.

Abstract. Visualizations of program execution are useful for debugging the complex behavior of parallel programs. However, the effectiveness of the visualizations is limited by how well their representations match the programmer's conceptual model of the expected program behavior. In this paper, we show that the LOTOS specification of a parallel program may be used to model expected behavior in the visualizations of the program's execution. We developed a prototype debugging environment, Panorama, to provide a framework for the modeling of expected behavior, collection of trace data, and the generation of the corresponding visualizations. We illustrate, by example, that debugging of incorrect message-passing communication is facilitated by this visualization strategy.

1 Introduction

Debugging a sequential program is a difficult task, as it relies on insight in order to know where to look for the cause of an error. Such insight is often only achieved after years of experience in program development [1]. Debugging a parallel program presents an even more difficult challenge [2], as communication between processing elements complicates the task of locating the cause of an error. In order to simplify the development of parallel programs, support tools must be devised for handling this additional complexity.

An indirect solution for addressing the difficulty of parallel debugging is to minimize the number of errors that must be detected and corrected. The application of formal methods to the process of software development provides a means for achieving this solution. Formal methods are mathematically-based techniques that are used to describe and reason about properties of software systems. The properties are described using a notation called a formal specification language, and the document in which the properties are described is called a formal specification [3]. The formal specification is an abstraction of the software system, where the implementation details are intentionally omitted. Using formal specifications facilitates the early evaluation of a software design through the use of formal reasoning techniques.

The use of formal specifications is unfortunately unable to entirely eliminate the possibility of errors in the implementation. For example, even if the specification accurately represents a problem, the process of constructing an implementation for the specification is subject to coding errors. In particular, the expected behavior, as described by the formal specification, may be inconsistent with the actual behavior, as revealed during the program's execution.

Visualization has been shown to be an effective approach for representing the complex behavior of a parallel program [4, 5]. Large quantities of event data from the program execution can be encapsulated in compact graphical representations. These visualizations reveal patterns and discrepancies in the event data more readily than the corresponding textual output. A major difficulty in the use of visualization is finding a graphical representation for the event data that fits the programmer's conceptual model of the problem at hand [6]. Many visualizations lack any use of abstraction to model the low-level events in terms of high-level behavior, such as stages in the algorithm. In order to debug any errors, the programmer must manually establish a relationship between the graphically-displayed events and the patterns of expected behavior for the program.

This paper presents an approach for visualizing the execution of a parallel program in the context of the program's formal specification, written in the specification language LOTOS (Language of Temporal Ordering Specifications) [7, 8]. A LOTOS specification is used to represent an abstraction of the program, thus providing a model of the program's expected behavior.
We specifically consider the case of programs written for distributed-memory systems, where communication between nodes is via message-passing. We present the prototype visualization environment, Panorama, which implements our approach for modeling expected behavior in program visualizations.

The remainder of this paper is organized as follows. Section 2 describes how LOTOS is used for the specification of parallel programs. Section 3 discusses how Panorama supports the collection of data, and Section 4 discusses how it supports the modeling of expected behavior in program visualizations. Section 5 considers a practical example where we demonstrate how Panorama can be used to detect message-passing errors. Finally, conclusions are discussed in Section 6.

2 Parallel Program Specification Using LOTOS

LOTOS [7, 8] is a formal specification language that has been specifically designed to specify protocols and services. The concepts of LOTOS are general in nature, however, thus making the language useful for a wide variety of other tasks, including the specification of parallel programs. In this section, we present a brief discussion of how the language may be used to specify a parallel program written for a distributed-memory system, where communication channels interconnect the nodes to provide a medium for message-passing. For a general introduction to LOTOS, the reader may refer to the tutorials provided in [8, 9]. Also, additional discussion regarding the LOTOS specification of parallel computing environments may be found in [10, 11].

A simple example of a parallel computing environment consists of two processing elements P1 and P2 connected via a unidirectional channel C1, as shown in Figure 1. The synchronization points between the processing elements and the channels are the gates labeled send and recv. An action that sends a message Mesg from a processing element labeled Sender to a processing element labeled Rcvr may be formatted as [10]:

    send!sender!rcvr!mesg

Similarly, the action that receives the message may be formatted as:

    recv!sender!rcvr?mesg:message

where, in this case, Mesg is a variable whose value is set through synchronization.

    P1 --send--> C1 --recv--> P2

    Fig. 1. Simple parallel computing environment

Consider a program for the computing environment given in Figure 1, whose purpose is to send a natural number from P1 to P2. The specification for this program is shown in Figure 2. The behavior expression for the overall specification states that the actions of the processing elements (P1 and P2) may interleave, but the actions between the processing elements and the channel C1 must fully synchronize (i.e., synchronize on both the send and recv gates). By examining the behavior expressions in the three process definitions of the specification, we observe that the first synchronization that occurs is between P1 and C1 at the send gate. This synchronization causes the variable Msg in the action "send!p1!p2?msg:nat" to accept the value SomeNum offered by the action "send!p1!p2!somenum", effectively representing the passing of the message from P1 to C1.
Similarly, an additional synchronization then occurs at the recv gate, representing the passing of the message from C1 to P2.

    specification SimpleSend(SomeNum:Nat): exit
      library NaturalNumber endlib
    behavior
      ( P1[send,recv](SomeNum) ||| P2[send,recv] ) || C1[send,recv]
    where
      process P1[send,recv](SomeNum:Nat): exit :=
        send!p1!p2!somenum ; exit
      endproc
      process P2[send,recv]: exit :=
        recv!p1!p2?receivednum:nat ; exit
      endproc
      process C1[send,recv]: exit :=
        send!p1!p2?msg:nat ; recv!p1!p2!msg ; exit
      endproc
    endspec

    Fig. 2. Specification for sending a number from P1 to P2

We developed a model for a general parallel computing environment that supports an unlimited number of processing elements, where each pair of processing elements is connected by two unidirectional channels, one in each direction. We assign each processing element a natural number as its name. The number zero is assigned to a special processing element called the host node, whose purpose is to handle management duties for the set of processing elements, such as assigning tasks or data sets to the other processing elements. We refer to the other processing elements in the environment as worker nodes, assigning them the names 1, ..., W, where W refers to the number of worker nodes in the environment. The purpose of the worker nodes is to collectively complete the main computation for a problem by each performing a portion of it. The LOTOS specification for this type of computing environment may be constructed by using recursive LOTOS processes to create both the individual instances of worker node processes as well as the processes for the channels that interconnect the processing elements [10, 11].

A simple example of a message-passing parallel program is one in which the host node sends each worker node a natural number equal to the value of the worker node's name, where the worker node then doubles the received number and returns the result to the host. The specifications for the host and worker node processes of this number-doubling program are shown in Figure 3. The behavior expression for the host node illustrates the use of process decomposition, where the subprocesses Send_Numbers and Receive_Replies perform the tasks of sending messages to the worker nodes and receiving the results, respectively. The definitions for each of these subprocesses illustrate the use of recursion, where the NodeCtr variable is used to recursively count in descending order through all the "names" of the worker nodes, from W to 1.

3 Data Collection Step

Panorama incorporates a post-mortem visualization strategy, where trace data of important events are collected during program execution, while the graphical depiction of the data takes place off-line after execution is complete.
In this section, we discuss the collection of the two types of data used by Panorama's graphical depiction step: the expected behavior data and the trace data.

    process Host_Node[send,recv](W:Nat): exit :=
      Send_Numbers[send,recv](W);
      Receive_Replies[send,recv](W);
      exit
    where
      (* Use recursion to send a number to each worker node *)
      process Send_Numbers[send,recv](NodeCtr:Nat): exit :=
        [NodeCtr > 0] ->
          send!0!nodectr!nodectr ;
          Send_Numbers[send,recv](NodeCtr - 1)
        []
        [NodeCtr = 0] -> exit
      endproc (* Send_Numbers *)

      (* Use recursion to receive a number from each worker node *)
      process Receive_Replies[send,recv](NodeCtr:Nat): exit :=
        [NodeCtr > 0] ->
          (* Since Sender is a variable, receive the replies in any order *)
          recv?sender:nat!0?reply:nat ;
          Receive_Replies[send,recv](NodeCtr - 1)
        []
        [NodeCtr = 0] -> exit
      endproc (* Receive_Replies *)
    endproc (* Host_Node *)

    process Worker_Node[send,recv](W,MyNum:Nat): exit :=
      (* Receive a number and return twice its value *)
      recv!0!mynum?value:nat ;
      send!mynum!0!(value+value) ; exit
    endproc (* Worker_Node *)

    Fig. 3. Specification of host and worker nodes for the number-doubling program

A LOTOS specification is an abstraction of a program, from which a model of the program's expected behavior may be derived. In order to facilitate the task of debugging message-passing errors, Panorama's expected behavior model focuses on capturing the occurrences of the message-passing actions according to where they appear in the process hierarchy of the LOTOS specification. Panorama generates the expected behavior model by calculating the tree-like hierarchy, where the name of the specification is placed at the root of the tree. As each process is added to the tree, the message-passing actions that occur in the behavior expression of that process are recorded with it. For example, Figure 4 shows the subtree of processes for the specification of the host node in the number-doubling program from Section 2, where the message-passing actions that occur in the specification are listed with their corresponding processes. After finishing the calculation of the expected behavior model, Panorama stores the model, where the stored version is known as expected behavior data.
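As a rough illustration of the process hierarchy described above, the following Python sketch builds the host-node subtree of Figure 4 by hand and prints it with each message-passing action listed under its process. The `ProcessNode` class and the `render` and `build_host_subtree` helpers are hypothetical names introduced only for this sketch; they are not part of Panorama, and the paper does not describe how Panorama represents the model internally.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessNode:
    """One LOTOS process in the tree (hypothetical structure, not Panorama's)."""
    name: str
    # Message-passing actions found in this process's own behavior expression.
    actions: list = field(default_factory=list)
    # Subprocesses declared under this process.
    children: list = field(default_factory=list)

    def add(self, child):
        self.children.append(child)
        return child


def render(node, depth=0):
    """Return the tree as indented lines, with actions listed under their process."""
    lines = ["  " * depth + node.name]
    for action in node.actions:
        lines.append("  " * (depth + 1) + action)
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


def build_host_subtree():
    """Hand-built subtree for Host_Node of the number-doubling program."""
    host = ProcessNode("Host_Node")
    host.add(ProcessNode("Send_Numbers", actions=["send!0!nodectr!nodectr"]))
    host.add(ProcessNode("Receive_Replies", actions=["recv?sender:nat!0?reply:nat"]))
    return host


if __name__ == "__main__":
    print("\n".join(render(build_host_subtree())))
```

A real tool would derive the tree by parsing the specification; here the tree is written out directly so that the recorded structure (process names plus their actions) stays visible.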
    Host_Node
      Send_Numbers
        send!0!nodectr!nodectr
      Receive_Replies
        recv?sender:nat!0?reply:nat

    Fig. 4. Subtree of processes for the host node in the number-doubling program

Panorama uses a software instrumentation approach for the collection of trace data, where appropriate statements are added to the source code in order to generate the relevant trace data during program execution. The method for adding these data collection statements is based directly on the expected behavior model, where the programmer maps items from the expected behavior model to their corresponding instrumentation points in the source code via a graphical interface [11]. After the programmer has performed the mapping procedure, Panorama handles the underlying details of adding the data collection statements to the source code. Currently, Panorama provides software instrumentation support for both C and Fortran programs that use PVM (Parallel Virtual Machine) version 2.4 [12] message-passing primitives, where PVM is a parallel computing environment for heterogeneous networks of parallel and serial computers. An advantage to using a software instrumentation approach is that the data collection statements are capable of generating auxiliary information as part of a traced event, where the auxiliary information is used to map the event to its corresponding location in the source code and formal specification.

The instrumented version of the source code may be compiled and executed, where each processing element produces a file of trace data during the program execution. A post-processing stage is then used to perform a time-ordered merge of these trace files. Since distributed-memory systems generally lack a synchronized global clock, additional analysis is then performed to adjust the ordering to be consistent with the happened-before relation [13], such that any event E1 that can affect an event E2 is placed in the global ordering before E2. In the Panorama framework, it is guaranteed that no message-passing events are listed in the time-ordering as being received before they are sent.

4 Graphical Depiction Step

The graphical depiction step of Panorama uses both the expected behavior data and the trace data to render a visualization of the program execution that models the expected behavior of the program.
In this section, we discuss the visualization strategy used by Panorama, including a description of the major features available for generating visualizations.

Space-time diagrams [13] display communication events between processing elements across time. One axis of the diagram represents the processing elements, while the other axis represents time. Arcs are drawn between appropriate points in the diagram to represent message-passing events. Panorama uses an enhanced version of a space-time diagram for graphically depicting program execution, where the enhancement is an overlay of the active LOTOS processes onto the diagram. The shaded portion of a rectangle is used to represent the interval between the entry and exit times of a corresponding process. Different shading patterns are used to distinguish between active processes. The graphical depiction of the events is performed by a playback strategy, where the programmer may choose either to sequence through the events in a step mode or have Panorama provide a simulated replay.

By graphically depicting the trace data in terms of items from the expected behavior model, the diagram facilitates a comparison between the actual behavior and expected behavior of the program. Thus, we call this diagram a Behavior Comparison graph, or BC-graph for short. A BC-graph for the execution of the number-doubling program is illustrated in Figure 5.¹ This visualization shows that six processing elements were involved in the computation, consisting of one host (labeled "0") and five worker nodes (labeled "1-5"). The message-passing events are depicted in the model of active processes, thus facilitating a visual mapping of the message-passing events to their corresponding location in the expected behavior model.

Fig. 5. BC-graph of the execution of the number-doubling program

Panorama's visualization strategy offers several features that facilitate the debugging of message-passing errors. First, the graphical elements that represent the items from the expected behavior model may be selected using the cursor, at which time windows are activated that display the portions of the source code and LOTOS specification that correspond to the selected element. Second, in order to avoid congestion, Panorama can perform selective filtering of the events to be depicted, where the processes in the expected behavior model are used as the basis for the selection. Finally, a BC-graph provides an abstraction (clustering) mechanism for displaying a subtree of active processes by the parent (root) process of the subtree, thus reducing the congestion that may be caused by displaying the individual active processes (as depicted by the shaded rectangles). Examples of several of these debugging features are given in the following section.

¹ The trace data for this visualization was generated using a PVM 2.4 [12] implementation of the program, with the execution occurring on a cluster of six identical ethernet-connected SUN SPARCstation 1 workstations.

5 Debugging Example: Cholesky Factorization

This section illustrates an example in which Panorama facilitates the task of debugging a message-passing error. The application we use is the Cholesky factorization program supplied with the PVM 2.4 [12] distribution. In the discussion below, we present an informal description of the program, followed by LOTOS specifications of relevant portions of the program. (We omit the full specification of the program due to space limitations, but the interested reader may refer to [11] for the complete specification.) We then illustrate the use of Panorama for debugging a message-passing error in the program. The trace data for this example was obtained by running the program on a cluster of eight identical ethernet-connected SUN SPARCstation 1 workstations.

5.1 Informal Description of the Program

Cholesky factorization considers the special case in which a matrix A is both symmetric and positive definite. In this case, matrix A has a factorization of the form A = LL^T, where L is a lower triangular matrix. This factorization is known as the Cholesky factorization.

The program that we use to compute the Cholesky factorization is a Column-Cholesky [14] implementation, in which the worker nodes are each assigned an approximately equal number of columns for the computational tasks, although not necessarily consecutive. (The host node process participates in the computation only during initialization.) The implementation consists of three phases: synchronous Cholesky factorization, forward substitution, and backward substitution.
A full discussion of these phases is beyond the scope of this paper, although the interested reader may refer to [14]. In the following presentation, we focus on the patterns of the message-passing events, and do not consider the contents of the messages.

The matrix we consider is of size n × n, where the columns are numbered 0, ..., n - 1. The number of available worker nodes is represented by W. All three phases of the program contain message-passing events within loops that use the column number as the index variable. The error we consider below is located within the forward substitution phase, and so we focus our discussion on that phase. That particular phase iterates in order of increasing column number, from 1 to n - 1 (column zero is skipped). If the relevant processing element determines that it is assigned the column corresponding to the value of the index variable, then it waits to receive values sent from each of the other processing elements. Otherwise, it sends a message to the processing element that is assigned the column.
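The message pattern of the forward substitution phase can be enumerated with a short Python sketch. It is only an illustration: the `owner` function uses a wrap (round-robin) assignment of columns to workers 1, ..., W, which is an assumption made here, since the paper says only that columns are spread roughly evenly and not necessarily consecutively. The function names are likewise hypothetical.

```python
def owner(col, num_workers):
    """Hypothetical wrap mapping of a column to a worker named 1..W (assumption)."""
    return (col % num_workers) + 1


def forward_substitution_messages(n, num_workers):
    """List (column, sender, receiver) for every message sent in the phase."""
    messages = []
    for col in range(1, n):  # columns 1 .. n-1; column zero is skipped
        receiver = owner(col, num_workers)
        for worker in range(1, num_workers + 1):
            if worker != receiver:  # non-owners send, the owner only receives
                messages.append((col, worker, receiver))
    return messages


if __name__ == "__main__":
    for col, sender, receiver in forward_substitution_messages(4, 3):
        print(f"column {col}: worker {sender} -> worker {receiver}")
```

Under any assignment, each of the n - 1 iterations produces exactly W - 1 messages, all addressed to the worker that owns the current column.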

5.2 LOTOS Specifications for the Program

The behavior expression for the Forward_Substitution process consists of a set of initialization operations, followed by a main loop. The header and behavior expression for the process that specifies the main loop are:

    process For_Sub_Main_Loop[send,recv](CurrentCol:Nat, MAX_COL:Nat,
                                         <Local Variables>): exit :=
      [CurrentCol <= MAX_COL] ->
        For_Sub_Pre-communication[send,recv](<Local Variables>);
        For_Sub_Communication[send,recv](CurrentCol, <Local Variables>);
        For_Sub_Post-communication[send,recv](<Local Variables>);
        For_Sub_Main_Loop[send,recv](CurrentCol+1, MAX_COL, <Local Variables>)
      []
      [CurrentCol > MAX_COL] -> exit

As shown in the behavior expression above, the CurrentCol variable is used to iterate in order of increasing column number through recursive calls to the For_Sub_Main_Loop process. The specification of the For_Sub_Communication process is as follows:

    process For_Sub_Communication[send,recv](CurrentCol:Nat,
                                             <Local Variables>): exit :=
      [CurrentCol ∈ MY_COL_SET] ->
        For_Sub_Receive[send,recv](0,W,MyNum);
      []
      [CurrentCol ∉ MY_COL_SET] ->
        send!mynum!owner(currentcol)!msg:message ; exit
    where
      process For_Sub_Receive[send,recv](NodeCtr:Nat, W:Nat, MyNum:Nat): exit :=
        [NodeCtr < (W - 1)] ->
          (* Since Sender is a variable, receive the messages in any order *)
          recv?sender:nat!mynum?msg:message ;
          For_Sub_Receive[send,recv](NodeCtr + 1,W,MyNum)
        []
        [NodeCtr = (W - 1)] -> exit
      endproc (* For_Sub_Receive *)
    endproc (* For_Sub_Communication *)

The behavior expression for this process is divided into two choices that determine whether the worker node should participate in send or receive actions, based on whether the CurrentCol variable belongs to the set of columns assigned to the worker node that has invoked the process. In the case where the column is assigned to a worker node, the recursive For_Sub_Receive process is used to receive W - 1 messages sent by the other worker nodes.
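To make the counting in For_Sub_Receive concrete, the sketch below transliterates its two guarded choices into a recursive Python function. This is an informal reading of the specification rather than the program's C or Fortran source: the `inbox` queue of `(sender, destination, message)` tuples is a stand-in, assumed here, for synchronization on the recv gate.

```python
from collections import deque


def for_sub_receive(node_ctr, num_workers, my_num, inbox, received=None):
    """Consume messages until NodeCtr reaches W - 1, mirroring For_Sub_Receive."""
    if received is None:
        received = []
    if node_ctr < num_workers - 1:
        # Corresponds to recv?sender:nat!mynum?msg:message (any sender accepted).
        sender, dest, msg = inbox.popleft()
        if dest != my_num:
            raise ValueError("message addressed to a different worker")
        received.append((sender, msg))
        return for_sub_receive(node_ctr + 1, num_workers, my_num, inbox, received)
    # Corresponds to [NodeCtr = (W - 1)] -> exit
    return received


if __name__ == "__main__":
    # Worker 2 of W = 4 receives one message from each other worker, in any order.
    pending = deque([(3, 2, "c"), (1, 2, "a"), (4, 2, "d")])
    print(for_sub_receive(0, 4, 2, pending))
```

Starting the counter at 0 and stopping at W - 1 yields exactly W - 1 receives, one per other worker, regardless of arrival order.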

5.3 Debugging the Message-Passing Error

The error that we investigate is located in the communication step of the forward substitution phase. Specifically, the processing element that is assigned the column corresponding to the current value of the index variable must wait to receive messages from W - 1 worker node processes, where a looping construct is used to implement the receipt of multiple messages. We consider the case in which the looping construct is implemented erroneously, such that it waits for messages from W worker node processes instead of W - 1. Since only W - 1 worker nodes send messages, the processing element that is assigned the column enters a deadlock state where it is waiting for a message that will never arrive. The other worker nodes are able to proceed to the next iteration, but each one eventually enters the deadlock state upon executing an iteration in which it is assigned the current column. Since the processing elements enter deadlock states, our discussion below considers partial trace files that were generated by the worker nodes before they entered the deadlock state.

The complexity of the message-passing patterns complicates the task of locating the error in the forward substitution phase. By using the clustering mechanism, a BC-graph may be used to display the message-passing events in a model of the high-level stages of the program. For example, Figure 6 shows the message-passing behavior in the synchronous Cholesky factorization and forward substitution phases of the program, where phases are distinguished by the level of shading. (The host node, represented by processing element "0", does not communicate with the worker nodes after initialization, and, thus, there are no message-passing events shown for processing element "0".)

Fig. 6. BC-graph of both active processes and message-passing events
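The effect of the erroneous loop bound can be reproduced with a small deterministic simulation in Python. This is a sketch under stated assumptions, not the PVM program: sends are modeled as non-blocking, columns are assigned by a hypothetical wrap mapping (`owner`), and `receive_bound` is a parameter introduced here to switch between the specified bound W - 1 and the erroneous bound W.

```python
def owner(col, num_workers):
    """Hypothetical wrap mapping of a column to a worker named 1..W (assumption)."""
    return (col % num_workers) + 1


def simulate_forward_substitution(n, num_workers, receive_bound):
    """Return, per worker, the column where it is stuck, or None if it finished."""
    inbox = {}  # (receiver, column) -> number of messages delivered so far
    position = {w: 1 for w in range(1, num_workers + 1)}  # next column to process
    progressed = True
    while progressed:  # keep sweeping until no worker can advance
        progressed = False
        for w in position:
            while position[w] < n:
                col = position[w]
                if owner(col, num_workers) == w:
                    if inbox.get((w, col), 0) < receive_bound:
                        break  # owner is still waiting for messages
                else:
                    key = (owner(col, num_workers), col)
                    inbox[key] = inbox.get(key, 0) + 1  # non-blocking send
                position[w] += 1
                progressed = True
    return {w: (position[w] if position[w] < n else None) for w in position}


if __name__ == "__main__":
    W = 3
    print("bound W - 1:", simulate_forward_substitution(7, W, W - 1))
    print("bound W:    ", simulate_forward_substitution(7, W, W))
```

With the bound W - 1 every worker runs to completion. With the bound W, each worker stops at the first column it owns, because only W - 1 messages can ever arrive for that column, which matches the deadlock behavior described above.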

In order to gain a better understanding of the erroneous message-passing behavior in the forward substitution phase, filtering may be used to depict key events. For example, the BC-graph in the left side of Figure 7 shows the case in which filters have been applied to depict only the active For_Sub_Receive processes along with the relevant message-passing events. This visualization illustrates that the processing elements are each waiting in the For_Sub_Receive process at the point where the information in the partial trace files is exhausted. (The depicted events are the last events recorded for each of the processing elements, at which point progress apparently stops since none of the processing elements completed the execution of the program.) The behavior of processing element "1" is of particular interest, as we see that it received one message from each of the other processing elements (i.e., the expected behavior, as defined by the For_Sub_Receive process for a worker node whose CurrentCol ∈ MY_COL_SET), yet it failed to exit the For_Sub_Receive process.

Upon locating this questionable behavior, the programmer may use the cursor to select any of the graphical elements that represent the process, at which time windows are displayed showing the specification and source code corresponding to the questionable event, as is shown in Figure 7. Through a comparison of the contents of the two windows, an inconsistency may be detected in the bound for the number of messages to be received, where the specification states a bound equal to W - 1, but the source code states a bound equal to W. This inconsistency explains the cause of the deadlock problem.

Fig. 7. Specification and source code corresponding to questionable process

6 Conclusions

This paper has discussed a strategy for using the LOTOS specification of a parallel program to form an expected behavior model in which to visualize program execution. We have described an approach where expected behavior data and trace data are first collected and then used to generate visualizations. Each visualization is represented using a BC-graph, where the actual behavior of the program (as represented by trace data) is depicted in terms of the expected behavior. This approach to program visualization, implemented in the tool Panorama, has been demonstrated for debugging message-passing errors. Future work will include extending the expected behavior model to include other aspects of the LOTOS specification besides message-passing actions, such as data operations.

Acknowledgments

The authors wish to thank the anonymous reviewers for their helpful comments, and Enoch Wang for his assistance with the graphics programming.

References

1. Keijiro Araki, Zengo Furukawa, and Jingde Cheng. A general framework for debugging. IEEE Software, pages 14–20, May.
2. Charles E. McDowell and David P. Helmbold. Debugging concurrent programs. ACM Computing Surveys, 21(4):593–622, December.
3. Jeannette M. Wing. A specifier's introduction to formal methods. IEEE Computer, pages 8–24, September.
4. Eileen Kraemer and John T. Stasko. The visualization of parallel systems: An overview. Journal of Parallel and Distributed Computing, 18:105–117.
5. Mark V. LaPolla, Joseph L. Sharnowski, Betty H. C. Cheng, and Kevin Anderson. Data parallel program visualizations from formal specifications. Journal of Parallel and Distributed Computing, 18:252–257.
6. Cherri M. Pancake and Sue Utter. Models for visualization in parallel debuggers. In Proceedings of 1989 Supercomputing Conference, pages 627–636.
7. International Organization for Standardization. IS LOTOS: A formal description technique based on the temporal ordering of observational behavior.
8. Tommaso Bolognesi and Ed Brinksma. Introduction to the ISO specification language LOTOS. Computer Networks and ISDN Systems, 14(1):25–59.
9. Luigi Logrippo, Mohammed Faci, and Mazen Haj-Hussein. An introduction to LOTOS: learning by examples. Computer Networks and ISDN Systems, 23:325–342.
10. Mazen Haj-Hussein and Luigi Logrippo. Specifying distributed algorithms in LOTOS. To appear in Revue reseaux et informatique repartie.
11. Joseph L. Sharnowski and Betty H. C. Cheng. A formal approach to modeling expected behavior in parallel program visualizations. Technical Report MSU-CPS, Michigan State University, November.
12. Adam Beguelin, Jack Dongarra, Al Geist, Robert Manchek, and Vaidy Sunderam. A users' guide to PVM: Parallel Virtual Machine. Technical Report ORNL/TM, Oak Ridge National Laboratory, July.
13. Leslie Lamport. Time, clocks, and the ordering of events in a distributed system. Communications of the ACM, 21(7):558–565, July.
14. Alan George, Michael T. Heath, and Joseph Liu. Parallel Cholesky factorization on a shared-memory multiprocessor. Linear Algebra and Its Applications, 77:165–187, 1986.

