Modalis. A First Step to the Evaluation of SimGrid in the Context of a Real Application. Abdou Guermouche and Hélène Renard, May 5, 2010

1 A First Step to the Evaluation of SimGrid in the Context of a Real Application. Abdou Guermouche and Hélène Renard (LaBRI/Univ. Bordeaux 1; I3S/École polytechnique universitaire de Nice-Sophia Antipolis). May 5, 2010. Modalis.

2 Hélène Renard, SimGrid vs Real-Life. Plan of presentation: 1. Framework (Data redistribution algorithms; Heat propagation). 2. Real-life and simulation (Scheduling & heat; Wrekavoc). 3. Experimental results. 4. Conclusion and future works.

3 Framework

4 Framework: Data redistribution algorithms

5 Data redistribution algorithms: context. Target platforms: distributed heterogeneous platforms (networks of workstations, clusters of clusters, grids, etc.). 1. Load imbalance has various sources: the application requirements or the platform. 2. The data must be redistributed to achieve better load balancing. 3. The load-balancing mechanism itself is not discussed; we consider it as given.

6 Data redistribution algorithms: context. The algorithm operates on a large rectangular array of data samples. The array is split into vertical slices. This geometric constraint suggests organizing the processors as a virtual ring: each processor only communicates twice (once with each neighbor). Figure: Communication scheme (element \(x_{i,j}\) and its four neighbors \(x_{i-1,j}\), \(x_{i+1,j}\), \(x_{i,j-1}\), \(x_{i,j+1}\); processors \(P_{i-1}\), \(P_i\), \(P_{i+1}\)).
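The slicing and the ring organization described above can be illustrated with a minimal sketch. It assumes an even split of the columns, which the slides do not state (the redistribution algorithms exist precisely because an even split is not always balanced); the function names `split_columns` and `ring_neighbors` are hypothetical.

```python
def split_columns(n_columns, n_procs):
    """Assign contiguous column ranges (vertical slices) to processors.

    Assumption: the columns are split as evenly as possible.
    Returns one (start, stop) pair per processor, stop being exclusive.
    """
    base, extra = divmod(n_columns, n_procs)
    slices = []
    start = 0
    for p in range(n_procs):
        width = base + (1 if p < extra else 0)
        slices.append((start, start + width))
        start += width
    return slices


def ring_neighbors(p, n_procs):
    """Return the (left, right) neighbors of processor p on the virtual ring."""
    return ((p - 1) % n_procs, (p + 1) % n_procs)


slices = split_columns(10, 4)
neighbors = [ring_neighbors(p, 4) for p in range(4)]
print(slices)     # column range owned by each processor
print(neighbors)  # the only two processors each one communicates with
```

Each processor appears in exactly two neighbor pairs, which is the "communicates twice" property of the ring.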

7 Redistribution problem for heterogeneous bidirectional rings. Definition: a redistribution is light if each processor initially owns all the data that it will send during the execution of the algorithm.

\[
\begin{aligned}
\text{Minimize } \tau \text{ subject to} \quad
& S_{i,i+1} \ge 0, && 1 \le i \le n \\
& S_{i,i-1} \ge 0, && 1 \le i \le n \\
& S_{i,i+1} + S_{i,i-1} - S_{i+1,i} - S_{i-1,i} = \delta_i, && 1 \le i \le n \\
& S_{i,i+1}\, c_{i,i+1} + S_{i,i-1}\, c_{i,i-1} \le \tau, && 1 \le i \le n \\
& S_{i+1,i}\, c_{i+1,i} + S_{i-1,i}\, c_{i-1,i} \le \tau, && 1 \le i \le n
\end{aligned}
\tag{1}
\]

To lead to... We can use the solution of System 1 safely.
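To make the constraints of System 1 concrete, here is a minimal sketch that checks a given set of transfers and reports the smallest \(\tau\) they allow. It does not solve the linear program. The interpretation of the symbols is an assumption read off the constraints: \(S_{i,j}\) is the amount of data sent from processor \(i\) to its neighbor \(j\), \(c_{i,j}\) the time to send one unit over that link, and \(\delta_i\) the data sent minus the data received by processor \(i\). Processors are numbered from 0 for convenience, and the example values are hypothetical.

```python
def redistribution_time(sends, cost, delta):
    """Check the constraints of System (1); return the smallest feasible tau.

    sends[(i, j)]: amount S_{i,j} sent from processor i to neighbor j
    cost[(i, j)]:  time c_{i,j} to send one data unit from i to j
    delta[i]:      data sent minus data received by processor i
    Processors are numbered 0..n-1 on a bidirectional ring (n >= 3).
    """
    n = len(delta)
    tau = 0.0
    for i in range(n):
        left, right = (i - 1) % n, (i + 1) % n
        out_r, out_l = sends.get((i, right), 0), sends.get((i, left), 0)
        in_r, in_l = sends.get((right, i), 0), sends.get((left, i), 0)
        if min(out_r, out_l, in_r, in_l) < 0:
            raise ValueError("negative amount of data")
        if out_r + out_l - in_r - in_l != delta[i]:
            raise ValueError(f"conservation violated at processor {i}")
        send_time = out_r * cost[(i, right)] + out_l * cost[(i, left)]
        recv_time = in_r * cost[(right, i)] + in_l * cost[(left, i)]
        tau = max(tau, send_time, recv_time)
    return tau


# Hypothetical 4-processor ring: processor 0 holds 4 units too many.
delta = [4, -1, -2, -1]
sends = {(0, 1): 2, (0, 3): 2, (1, 2): 1, (3, 2): 1}
cost = {(i, j): 1.0 for i in range(4) for j in ((i - 1) % 4, (i + 1) % 4)}
cost[(0, 1)] = 2.0  # one slower link
tau = redistribution_time(sends, cost, delta)
print(tau)
```

Here the bound is set by processor 0, which spends 2 x 2.0 on its slow link and 2 x 1.0 on the other one.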

8 Framework: Heat propagation

9 Heat propagation: context. Heat is applied to a metal plate from its edges and spreads within the plate. Because the temperature at the edges is kept constant, the heat distribution in the plate tends to a stationary state. Laplace equation: \(\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} = 0\). (Figure: plate with two heat sources.)

10 Heat propagation: resolution of the Laplace equation \(\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} = 0\). 1. Approximate the solution on a discretization grid of \(n^2\) points. 2. Using finite differences on the Laplace equation, this is equivalent to iteratively solving the following equation: \(4x_{i,j} - (x_{i-1,j} + x_{i+1,j} + x_{i,j-1} + x_{i,j+1}) = 0\).
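The finite-difference equation says that, at the stationary state, each grid point equals the mean of its four neighbors. A minimal sketch of the resulting iteration follows, assuming a small hypothetical plate whose top and bottom rows are held at 100 and whose side edges are held at 0; the tolerance and the function names are illustrative choices, not taken from the slides.

```python
def jacobi_step(grid):
    """One sweep: each interior point becomes the mean of its 4 neighbors.

    Boundary rows and columns are left untouched (fixed temperatures).
    """
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            new[i][j] = (grid[i - 1][j] + grid[i + 1][j]
                         + grid[i][j - 1] + grid[i][j + 1]) / 4.0
    return new


def solve(grid, tol=1e-6, max_iter=10_000):
    """Iterate until the largest change between two sweeps drops below tol."""
    for iteration in range(1, max_iter + 1):
        new = jacobi_step(grid)
        change = max(abs(a - b)
                     for row_new, row_old in zip(new, grid)
                     for a, b in zip(row_new, row_old))
        grid = new
        if change < tol:
            break
    return grid, iteration


# Hypothetical 5 x 5 plate: top and bottom rows at 100, everything else at 0.
plate = [[100.0] * 5] + [[0.0] * 5 for _ in range(3)] + [[100.0] * 5]
steady, iterations = solve(plate)
print(iterations, steady[2][2])
```

At convergence, the interior values satisfy the equation above up to the chosen tolerance.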

11 Heat propagation: resolution of the Laplace equation (continued). The communication pattern is the same as for the ring of processors: communication only with immediate neighbors. Figure: Communication scheme (element \(x_{i,j}\) and its four neighbors \(x_{i-1,j}\), \(x_{i+1,j}\), \(x_{i,j-1}\), \(x_{i,j+1}\); processors \(P_{i-1}\), \(P_i\), \(P_{i+1}\)). 3. Solve the linear system with the Jacobi method, since it is of the form \(Ax = b\), with \(A\) the matrix of the finite-difference coefficients and \(x = (x_{1,1}, \dots, x_{n,n-1}, x_{n,n})^T\).
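The claim that only immediate neighbors need to communicate can be checked with a small sketch: each vertical slice is updated using its own columns plus one "ghost" column copied from each adjacent slice, and the result matches a sweep over the whole grid. This is a single-process illustration under assumptions (it does not model the sockets/XDR layer, and it omits any wrap-around link between the first and last slice); `sliced_jacobi_step` and the test grid are hypothetical.

```python
def sliced_jacobi_step(grid, bounds):
    """One Jacobi sweep computed slice by slice.

    bounds: list of (start, stop) column ranges, one per processor.
    Each slice only reads its own columns plus one ghost column on each
    side, i.e. the data it would receive from its two neighbors.
    """
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]
    for start, stop in bounds:
        lo, hi = max(start - 1, 0), min(stop + 1, cols)
        local = [row[lo:hi] for row in grid]  # own columns + ghost columns
        for i in range(1, rows - 1):
            for j in range(max(start, 1), min(stop, cols - 1)):
                k = j - lo  # column index inside the local copy
                new[i][j] = (local[i - 1][k] + local[i + 1][k]
                             + local[i][k - 1] + local[i][k + 1]) / 4.0
    return new


# Hypothetical 5 x 8 grid with arbitrary values.
grid = [[float((3 * i + 7 * j) % 11) for j in range(8)] for i in range(5)]
whole = sliced_jacobi_step(grid, [(0, 8)])                   # one processor
sliced = sliced_jacobi_step(grid, [(0, 3), (3, 6), (6, 8)])  # three processors
print(sliced == whole)
```

Treating the whole grid as a single slice gives the reference sweep, so the comparison needs no second implementation.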

12 Heat propagation: resolution of the Laplace equation (continued). Enrichment of the matrix: the vector \(b\) is zero except at the points neighboring the heat sources (lower and upper). The system reads \(A x = b\), with \(x = (x_{1,1}, x_{1,2}, x_{1,3}, \dots, x_{5,4}, x_{5,5})^T\); \(b\) holds the source terms \(t_1, t_2, \dots, t_9, t_{10}\) and zeros elsewhere.
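As an illustration of this structure, the sketch below assembles \(A\) and \(b\) for a 5 x 5 grid of unknowns. The layout is an assumption: unknowns are numbered row by row, the heat sources sit just above the first row and just below the last row, and the left and right edges are at temperature 0. Under these assumptions \(b\) has exactly ten non-zero entries, which is consistent with the ten terms \(t_1, \dots, t_{10}\) on the slide. The name `build_system` is hypothetical.

```python
def build_system(n, top, bottom):
    """Build A and b of A x = b for an n x n grid of unknowns (row-major).

    top[j] / bottom[j]: source temperatures above row 0 / below row n-1.
    Assumption: the left and right edges are held at temperature 0.
    """
    size = n * n
    A = [[0.0] * size for _ in range(size)]
    b = [0.0] * size
    for i in range(n):
        for j in range(n):
            row = i * n + j
            A[row][row] = 4.0
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < n and 0 <= nj < n:
                    A[row][ni * n + nj] = -1.0  # neighbor is an unknown
                elif ni < 0:
                    b[row] += top[j]            # neighbor is the upper source
                elif ni >= n:
                    b[row] += bottom[j]         # neighbor is the lower source
    return A, b


A, b = build_system(5, [100.0] * 5, [100.0] * 5)
nonzero_b = sum(1 for value in b if value != 0.0)
print(nonzero_b)
```

Each row of \(A\) is the finite-difference equation of one grid point, with the known boundary terms moved to the right-hand side.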

13 Real-life and simulation

14 Real-life and simulation: Scheduling & heat

15 Scheduling & heat. Figure: Data redistribution (ring of processors P1 to P5). Figure: Heat propagation.

16 Scheduling & heat: ring history. The plate is too large, so we split it among a set of processors. The communication pattern follows a ring organization. (Figure: plate with two heat sources, split among the processors; label: Re-inject.)

17 Real-life and simulation

18 Real-life and simulation. Goal: compare the behavior of load-balancing and data-redistribution algorithms on two different platforms: Grid 5000 and SimGrid. Figure: SimGrid. Figure: Grid 5000.

19 The corresponding code and two algorithms. The corresponding code: written in C; standard UNIX sockets for communication; the XDR layer for interoperable communication between heterogeneous machines; no MPI for the communication layer.
Algorithm 1: Iterative scheme.
  while end not detected do
    if I am master then
      if modulo(current iteration number, interval) == 0 then
        Wait for state information from all workers;
        Use the algorithm to build redistribution information;
        Send redistribution information to each worker.
      end
    else
      Exchange data with neighbors;
      Update local data and process current iteration;
      if modulo(current iteration number, interval) == 0 then
        Perform benchmarks to get the new characteristics of my processor and my network links;
        Send my new state to master;
        Wait for redistribution information from master;
        Apply the redistribution algorithm according to the decision of master.
      end
    end
  end
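The effect of the periodic redistribution in the iterative scheme can be sketched in a single process. This is not the authors' C/sockets code: the master and the workers are collapsed into one loop, the benchmark is replaced by a hypothetical `speeds_at` callback, and a simple "load proportional to speed" rule stands in for the redistribution algorithm, which the slides treat as given.

```python
def run(loads, speeds_at, n_iterations, interval):
    """Single-process stand-in for the iterative scheme.

    loads:      initial amount of data held by each worker
    speeds_at:  function returning the measured speed of each worker at a
                given iteration (stand-in for the benchmark step)
    Returns the time of each iteration (the slowest worker sets the pace).
    """
    loads = list(loads)
    times = []
    for iteration in range(1, n_iterations + 1):
        speeds = speeds_at(iteration)
        # Workers: update local data and process the current iteration.
        times.append(max(load / speed for load, speed in zip(loads, speeds)))
        if iteration % interval == 0:
            # Workers report their state; the master builds and sends the
            # redistribution. Placeholder rule: load proportional to speed.
            total, total_speed = sum(loads), sum(speeds)
            loads = [total * speed / total_speed for speed in speeds]
    return times


# Hypothetical platform: two workers, the second one three times faster.
times = run([100.0, 100.0], lambda iteration: [1.0, 3.0],
            n_iterations=4, interval=2)
print(times)
```

In this toy run the iteration time halves once the first redistribution has been applied, after the second iteration.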

20 The corresponding code and two algorithms (continued).
Algorithm 2: Monitor scheme.
  while end not detected do
    if I need to modify the platform then
      Pick a random number of resources to degrade;
      for each selected resource do
        Pick a random degradation factor from the interval [40; 100]
      end
      Apply the modification of the characteristics of the platform using wrekavoc;
      Generate the corresponding simulated platform.
    end
  end
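The random selection step of the monitor scheme might look like the sketch below. Several details are assumptions not stated on the slide: at least one resource is degraded, the degradation factor is an integer, and the resource names are hypothetical. Applying the plan with wrekavoc and generating the simulated platform are deliberately left out.

```python
import random


def pick_degradations(resources, rng):
    """Choose which resources to degrade and by how much.

    Picks a random number of resources (assumed to be at least one), then a
    random degradation factor in [40, 100] (assumed integer) for each.
    """
    count = rng.randint(1, len(resources))
    chosen = rng.sample(resources, count)
    return {name: rng.randint(40, 100) for name in chosen}


resources = ["node-1", "node-2", "node-3", "node-4"]  # hypothetical names
rng = random.Random(0)  # seeded so that a run can be replayed
plan = pick_degradations(resources, rng)
print(plan)
```

Seeding the generator makes it possible to replay the same sequence of platform modifications in the real and the simulated runs, which matches the deck's stated aim of using the same platform characteristics over time in both contexts.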

21 The master and the workers. Figure: Experimental scheme: the master and the workers (Clusters 1 to 3, the monitor, and the master node (scheduler); links: regular communications, redistribution/state information, monitoring). This organization is used in both the simulated and the real-life contexts. The difference lies in the monitor, which is provided by SimGrid in the simulated context.

22 The master and the workers. Master: gathers the results of the measurements; calls the redistribution algorithms when needed.

23 The master and the workers. Monitor: modifies (using wrekavoc) the characteristics of the platform.

24 The master and the workers. Workers: do all the computations and communications; exchange data for redistribution according to the decisions of the master.

25 Real-life and simulation: Wrekavoc

26 Wrekavoc, at the center of both platforms. In our context, Wrekavoc is used to control the CPU and network capabilities of randomly chosen resources, in order to study the behavior of the application. Figure: Wrekavoc in pictures (a daemon on each node modifies the CPU speed, the available memory, and the network bandwidth and latency).

27 Wrekavoc. Real and simulated execution: both retrieve, through measurements, the processor speed, the network latency, and the inbound bandwidth. Differences: in the real execution, the characteristics of the platform are modified using wrekavoc; in the simulated execution, modifying the characteristics of the platform is a built-in functionality of SimGrid.

28 Experimental results

29 Experimental results. Table: Description of the experimental platforms (columns: Bordeaux, Grenoble, Lille, Lyon, Nancy, Orsay, Rennes, Sophia, Toulouse, total; rows: 1 site, ... sites, ... sites). Grid 5000: a highly reconfigurable, controllable and monitorable experimental platform; three different sets of results; neither the master nor the monitor is counted. SimGrid: version 3.3.2; Simulacrum tool to generate the XML description of the platform; provides the theoretical characteristics of the platform.

30 Experimental results. Figure: Time needed (in seconds) for each iteration on the real-life and the simulated platform: one site platform. Panels: (a) no platform variation; (b) with platform variation (3 platform variations, once every 29 iterations). Axes: iteration number, time for iteration (in seconds). Curves: Real Life, SimGrid.

31 Experimental results. Figure: Time needed (in seconds) for each iteration on the real-life and the simulated platform: two sites platform. Panels: (a) no platform variation; (b) with platform variation (3 platform variations, once every 29 iterations). Axes: iteration number, time for iteration (in seconds). Curves: Real Life, SimGrid.

32 Experimental results. Figure: Time needed (in seconds) for each iteration on the real-life and the simulated platform: five sites platform. Panels: (a) no platform variation; (b) with platform variation (3 platform variations, once every 29 iterations). Axes: iteration number, time for iteration (in seconds). Curves: Real Life, SimGrid.

33 Experimental results. Figure: Time needed (in seconds) for each iteration on the real-life and the simulated platform: two sites platform. Panels: (a) no platform variation; (b) with platform variation (3 platform variations, once every 29 iterations). Axes: iteration number, time for iteration (in seconds). Curves: Real Life, SimGrid. Each iteration is three times more costly than a regular one.

34 Conclusion and future works

35 Conclusion. 1. Two versions of the same application (the propagation of heat): a simulated implementation on top of SimGrid, and a real-life implementation running on the Grid 5000 platform, using wrekavoc to control the characteristics of the platform. The same platform characteristics over time are used in the two contexts. 2. The observed behavior in the simulated case is very close to that of a real execution. 3. A first step toward the validation of SimGrid in the context of complex applications.

36 Future works. 1. SimGrid vs Grid 5000: tightly coupled application where network models have to be, in general, accurate (1 paper). See below: 2. Local update: check whether it is profitable to replace a processor in the ring with a processor that does not belong to it (2 papers or more). 3. Global change: new solution from scratch (2 papers or more).

A First Step to the Evaluation of SimGrid in the Context of a Real Application Abdou Guermouche Hélène Renard 19th International Heterogeneity in Computing Workshop April 19, 2010 École polytechnique universitaire