A MATLAB Toolbox for Distributed and Parallel Processing

S. Pawletta a, W. Drewelow a, P. Duenow a, T. Pawletta b and M. Suesse a

a Institute of Automatic Control, Department of Electrical Engineering, Univ. of Rostock, D-18051 Rostock, Germany *
b Chair of Applied Computer Science, Dept. of Mechanical Engineering, FH Wismar, P.O. Box 1210, D-23952 Wismar, Germany

Multiprocessing systems, especially networked computers, are available to a growing number of users, but exploiting them for distributed and parallel processing is often a painful task, because MATLAB and similar packages do not support such concepts at the application layer. To fill this gap, a MATLAB toolbox is presented which allows interactive development, validation and execution of distributed and parallel applications based on multiple MATLAB instances and other non-MATLAB software. Typical fields of application of the toolbox are illustrated by examples.

1. INTRODUCTION

Up to now, no parallel version of MATLAB or of a similar package is available for practical use. The main reasons are the fine granularity of most MATLAB functions, which cannot be parallelised effectively on distributed memory computers, the small number of processors in shared memory computers, conflicts with MATLAB's sophisticated memory model and architecture and, last but not least, the limited number of customers with parallel machines [1]. However, many scientific and engineering problems coded as MATLAB applications are characterized by considerable run time on the one hand and a medium- or high-grained logical problem structure (e.g. Monte Carlo studies, complex simulations, training of neural networks) on the other. This situation is aggravated by the increasing complexity of the algorithms and methods used. Therefore, there is a need to exploit the problem-specific concurrency by means of structural parallelisation.
A suitable hardware basis in the form of networked computers is available to a growing number of users.

Another class of problems creates a need for distributed processing: some tasks cannot be solved in a single, isolated MATLAB environment. The reasons are diverse, but they usually fall into two general categories. Tasks of the first category are problems where data have to be processed or provided on-line (e.g. measurement processing, hardware-in-the-loop simulations). The second category results from the inevitably limited set of methods in any real software package. Here, the integration of autonomous software tools with MATLAB in a distributed system offers a solution.

* email: sven.pawletta@etechnik.uni-rostock.de
This situation motivated the authors to start the development of the DP (distributed and parallel application) toolbox in 1992. The DP toolbox provides an easy-to-use high-level communication interface. It enables the user to interactively develop distributed and parallel applications which run on multiple MATLAB instances and which may include autonomous non-MATLAB components. The interactions between the tasks of an application are mapped onto the primitives of a programming library for interprocess communication and control. The main application areas of the DP toolbox are the parallelisation of runtime-expensive, medium- and high-grained calculation problems, distributed processing in heterogeneous environments and the integration of different software tools with MATLAB.

2. THE DP TOOLBOX

Figure 1 shows in principle how the DP toolbox extends MATLAB to a distributed and parallel environment. The prerequisite is MATLAB's programming interface to external software (MEX) [4]. It opens the way to linking MATLAB with a communication library, which is usually implemented in C. The supported platforms range from standard to real-time operating systems, depending on the communication library used. At the moment the DP toolbox can use the PVM [2] and the PSI [3] libraries.

[Figure 1: layer diagram. Labels: MATLAB; high-level communication interface; DP Toolbox; external interface (MEX); communication library (PVM, PSI, ...); other autonomous components; standard operating systems (Unix, MS-Windows, ...); real-time operating systems (OS9, LynxOS, ...); heterogeneous multiprocessing systems (networked computers, multicomputers)]

Figure 1. Extension of MATLAB to a distributed and parallel environment

The DP toolbox actually realizes a two-layer interface (see fig. 2). The first layer makes the complete function set of the underlying communication library available in MATLAB. The level of abstraction is not raised at this layer, so the full flexibility of the communication library is retained.
For convenient interactive use, a second layer provides a much more abstract interface consisting of only a few but very powerful functions.
[Figure 2: layer diagram. Labels: MATLAB; DP Toolbox (high-level interface, low-level interface); MEX-Interface; PVM or PSI library]

Figure 2. Two layer interface of the DP toolbox

2.1. Parallel paradigm for interactive working

Generally, a communication interface can be realized as a message passing or a shared memory interface. The message passing approach models interactions by explicit sending and receiving of messages. In the shared memory model an abstract global tuple space is used [5]. In the following, only the message passing model is considered.

The most important programming paradigms for distribution and parallelization based on message passing are Master/Slave (see fig. 3) and SPMD (single program multiple data). The basic constructs of the Master/Slave paradigm are spawn to create slaves, and send and receive to handle messages.

[Figure 3: flow diagram. Labels: MASTER (start, spawn slaves, send id's, send data, receive data); SLAVES (start, receive id's, receive data, process data, send data)]

Figure 3. Master/Slave paradigm

[Figure 4: flow diagram. Labels: startup, input data, eval(*.m), output data, quit]

Figure 4. Interactive working with MATLAB (principle)

[Figure 5: flow diagram. Labels: MASTER (startup, spawn, put(m), aeval(*.m), M=putback('M'), quit); SLAVES (startup); interactive]

Figure 5. Interactive Master/Slave (principle)
Because both approaches are pure programming paradigms and do not consider interactive working, it is not useful to build a message passing interface for an interactive environment like MATLAB from those constructs alone. A modified Master/Slave paradigm is shown in figure 5. For comparison, the usual sequential work with the MATLAB interpreter is shown in figure 4. eval(*.m) symbolizes the call or execution of a built-in function, M-function, MEX-function or M-script. aeval(*.m) is the parallel equivalent of eval. Unlike eval, aeval works asynchronously: it does not return a result, but returns immediately after the function has been called in the slave instance.

2.2. The high-level communication interface

The DP toolbox provides a high-level message passing interface which supports application development according to the interactive Master/Slave principle introduced above and to the conventional paradigms Master/Slave and SPMD. All functions of the message passing interface support matrix and vector arguments, optional and default arguments and alternative argument signatures. This makes the functions very flexible and powerful. The fundamental functions and arguments are briefly described in figure 6.

  I = spawn(n)        creates n MATLAB instances or other processes on the desired node(s) and returns their identifiers
  put(i,m)            delivers one or more matrices to one or more receivers
  M = putback(I,'M')  requests one or more matrices from one or more processes
  M = get()           receives any matrices, or desired matrices, or matrices from desired senders etc.
  aeval(i,'*.m')      asynchronous function call in MATLAB instances or other processes
  i = myid            returns the identifier of the calling MATLAB instance
  b = parent          returns TRUE for the father instance and FALSE for a son instance

Figure 6. Message passing interface provided by the DP toolbox (extract).

To give an impression of the usage of the DP toolbox, a simple example is discussed in greater detail.
Suppose that the simulation of the step response of a damped mass-spring system is coded as the MATLAB function [xmean]=sim_mss(dd). The function argument dd is the damping factor, which serves as the model parameter. If dd is a vector, a simulation is run for every component of dd and the average step response xmean is calculated. In the traditional way, a Monte Carlo study with, e.g., 1000 randomly chosen damping factors is carried out in MATLAB by the following commands:

    dd=rand(1000,1);
    xmean=sim_mss(dd);
    plot(xmean);

This can be done interactively, but it is impractical because the study takes too much time. In the parallel version, multiple MATLAB instances are used (e.g. 20):

    nslaves=20;
    slaves=spawn(nslaves);
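The source of sim_mss is not listed here. For readers who want to experiment with the commands of this section, a stand-in might look like the following minimal sketch. Everything inside it is an assumption made for illustration only: the model x'' + d*x' + x = 1 with unit mass and unit stiffness, zero initial conditions, the step size dt, the number of steps and the semi-implicit Euler integration are not taken from the original function.

```matlab
function xmean = sim_mss(dd)
%SIM_MSS  Average unit-step response of a damped mass-spring system.
%   Hypothetical stand-in for the function of the same name in the text.
%   Assumed model: x'' + d*x' + x = 1 with x(0) = x'(0) = 0.
%   dd    : vector of damping factors, one simulation per component
%   xmean : column vector, step response averaged over all components

    dt     = 0.01;               % assumed integration step
    nsteps = 1000;               % assumed horizon (10 time units)

    dd = dd(:).';                % row vector: one entry per simulation
    x  = zeros(1, numel(dd));    % positions of all simulations
    v  = zeros(1, numel(dd));    % velocities of all simulations
    xmean = zeros(nsteps, 1);

    for k = 1:nsteps
        % semi-implicit Euler: update velocities first, then positions
        v = v + dt * (1 - dd .* v - x);
        x = x + dt * v;
        xmean(k) = mean(x);      % average over all damping factors
    end
end
```

Because every component of dd is simulated independently and the blocks sent to the MATLAB instances all have the same size, averaging the per-block results gives the same curve as a single call sim_mss(dd), up to rounding. The collection step of the parallel version relies on exactly this property.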
The damping factors are placed in a matrix and each column is sent to a different MATLAB instance:

    dd=rand(1000/nslaves,nslaves);
    for i=1:nslaves
      put(slaves(i),dd(:,i))    % one column of damping factors per instance
    end

By calling aeval, each MATLAB instance is requested to process its part of the whole task. Afterwards the results are collected and the overall step response is calculated:

    aeval(slaves,'xmean=sim_mss(dd);')    % returns immediately
    xmean=0;
    for i=1:nslaves
      xmean=xmean+putback(slaves(i),'xmean')/nslaves;
    end
    plot(xmean);

If necessary, the parallel Monte Carlo study can be coded as a MATLAB program in the sense of the Master/Slave programming paradigm.

sim_master.m:

    nslaves=20;
    slaves=spawn(nslaves,'sim_slave');
    dd=rand(1000/nslaves,nslaves);
    for i=1:nslaves
      put(slaves(i),dd(:,i))
    end
    xmean=0;
    for i=1:nslaves
      xmean=xmean+get(slaves(i))/nslaves;
    end
    plot(xmean)

sim_slave.m:

    dd=get;
    put(-1,sim_mss(dd))

Sometimes the SPMD (single program multiple data) programming paradigm is preferred. An SPMD example program coded with the DP toolbox is published in [7].

3. APPLICATIONS

3.1. Parallelization of runtime expensive calculations

For the purpose of comparing parallel simulation techniques, three simulation tasks are published in [6]. The problem granularity of the tasks ranges from a high-grained Monte Carlo study via a medium-grained calculation of a PDE to a fine-grained simulation of coupled predator-prey populations. All test problems were solved in MATLAB with the DP toolbox on a cluster of 20 SUN Classic workstations under Solaris 2 connected via Ethernet (10 Mbit/s). The performance results are published in [7].

Another application example is the parallelization of a stability test for linear uncertain systems by the method of convex decomposition [8], which is also applicable to controller design. The
method is based on the repeated solution of Lyapunov equations and provides a graphical representation of stability domains in the space spanned by the uncertain parameters (see fig. 7). With respect to parallelization, the method represents a difficult load balancing problem, because the repeated division of the original cube into subdomains leads to a decomposition tree with branches of different complexity (see fig. 8). Therefore, static load balancing will produce poor results. In [9] a dynamic load balancing algorithm is described which has been implemented using the DP toolbox.

[Figure 7: plot of the stability domain; axis tick labels range from -2 to 2]

Figure 7. Well-known stability triangle of a discrete second order system

[Figure 8: tree diagram rooted at (-,-) with children (0,0), (0,1), (1,0), (1,1) and deeper nodes such as (00,01), (00,11), (10,01), (10,11), (000,011), (000,111), (100,011), (100,111)]

Figure 8. Decomposition tree

3.2. Distributed processing in heterogeneous environments

Figure 9 shows a typical distributed application in the area of automatic control. The data sampling of a plant runs on a real-time system and the collected data are delivered to
MATLAB on a standard system. There, the user can perform plant identification, controller design and simulation of the closed loop. Afterwards, the parameters found are downloaded to an actual controller. The realization of this application is described in [10].

[Figure 9: block diagram. Labels: Data Sampling (OS9, VME-BUS); Modelling, Controller Design (MATLAB, Unix); Controller (OS9, VME-BUS)]

Figure 9. Distributed data sampling and process control

3.3. Integration of autonomous software with MATLAB

Investigations in the field of agricultural systems [11] have shown that some simulation problems cannot be solved effectively within MATLAB. One example is models which are described by differential equations and logical rules. In order to solve such tasks, MATLAB and a rule-based simulator (a PROLOG application) were coupled via the DP toolbox [12]. Another example is structure variable systems [13], which are not supported by MATLAB. The connection to a special simulator, which is available as a C++ application [14], allows the user to simulate that system class in the familiar MATLAB environment. Conversely, all methods necessary for data pre- and post-processing and for visualization are supported by MATLAB and need not be provided by the special simulator.

4. CONCLUSIONS

The presented DP toolbox provides an easy approach to distributed and parallel processing for a wide range of applications. The time needed to implement and test a distributed and parallel version of a problem decreases significantly compared with other techniques [15], because the usual fast prototyping of MATLAB is retained. Principles of a high-level shared memory interface for MATLAB and experiences with distributed and parallel SIMULINK applications will be published in the near future.

REFERENCES

1. Moler, C.: Why there isn't a parallel MATLAB. MATLAB News and Notes, spring, 1995
2. Geist, A. et al.: PVM 3 User's Guide and Reference Manual.
Technical Report ORNL/TM-12187, Oak Ridge National Laboratory, Oak Ridge, May 1994
3. Pawletta, S.: Design and Implementation of a Distributed and Parallel Simulation Environment for Complex Experiments. Pre-print CS-03-94, University of Rostock, 1994
4. MATLAB - External Interface Guide. The Mathworks Inc., Natick, January 1993
5. Carriero, F. C. and Gelernter, D.: How to write parallel programs. The MIT Press, Cambridge, MA, 1991
6. Comparison of Parallel Simulation Techniques. Simulation News Europe, (10), 1994
7. Pawletta, S., Pawletta, T. and Drewelow, W.: Comparison of Parallel Simulation Techniques - MATLAB/PSI. Simulation News Europe, (13), 1995, pp. 38-39
8. Kiendl, H. and Ossadnik, H.: Robustness Analysis and Synthesis of Linear Uncertain Systems by the Method of Convex Decomposition. Workshop on Control of Uncertain Systems, Bremen, Germany, 1989
9. Suesse, M.: Parallel Processing in the Interactive Environment MATLAB. Diploma Thesis, Institute of Automatic Control, University of Rostock, 1995
10. Krueger, T.: Distributed Data Sampling and Process Control under OS9. Diploma Thesis, Institute of Automatic Control, University of Rostock, 1994
11. Pawletta, T.: Design and implementation of a modular hierarchical simulation runtime system for ecological models. Pre-print CS-02-94, University of Rostock, 1994
12. Nekien, T.: Coupling of rule based and equation based model descriptions. Diploma Thesis, Department of Computer Science, University of Rostock, 1993
13. Pawletta, T. and Pawletta, S.: CAST based simulation of structure variable systems. Rostocker Informatik Berichte, University of Rostock, (1) 1995
14. Pawletta, T. and Pawletta, S.: Design of a simulator for structure variable systems. Proc. of the 5th International IMACS-Symposium on System Analysis and Simulation, Berlin, Gordon & Breach Publishing House
15. Breitenecker, F.: Comparison of Parallel Simulation Techniques. Proc. of the European Simulation Congress, EUROSIM 95, Vienna, Elsevier Science B.V., 1995