A Genetic Algorithm Method for Sender-based Dynamic Load Balancing Algorithm in Distributed Systems

1997 First International Conference on Knowledge-Based Intelligent Electronic Systems, 21-23 May 1997, Adelaide, Australia. Editor, L.C. Jain

Seong-hoon Lee, Tae-won Kang, Myung-sook Ko, Gwang-sik Chung, Joon-min Gil and Chong-sun Hwang
Department of Computer Science, Korea University
1, 5-ka, Anam-dong, Seoul, 136-701, Korea
Phone: +82-02-924-0547, Fax: +82-02-953-0771
E-mail: shlee@disys1.korea.ac.kr
0-7803-3755-7/97/$5.00 © 1997 IEEE

Abstract - In sender-initiated load balancing algorithms, an overloaded processor keeps sending unnecessary request messages for load transfer until an underloaded processor is found, even while the system load is heavy. This yields inefficient inter-processor communication, low CPU utilization, and low system throughput. To solve these problems, we propose an improved genetic algorithm method for sender-initiated load balancing in distributed systems and define a suitable fitness function. In this scheme, the processors to which the request messages are transferred are determined by the genetic algorithm, which reduces unnecessary request messages. We show that the proposed algorithm performs better than conventional sender-initiated algorithms.

1. Introduction

Distributed systems consist of a collection of autonomous computers connected by a network. The primary advantages of these systems are high performance, availability, and extensibility at low cost. To improve a distributed system's performance, it is essential to keep the load on each processor equal. An objective of load balancing in distributed systems is to allocate tasks among the processors so as to maximize processor utilization and minimize the mean response time. Load balancing algorithms can be broadly classified into three classes: static, dynamic, and adaptive. Our approach is based on the dynamic load balancing algorithm.

In a dynamic scheme, an overloaded processor (sender) sends excess tasks to an underloaded processor (receiver) during execution. Dynamic load balancing algorithms are divided into three methods: sender-initiated, receiver-initiated, and symmetrically-initiated. Our approach is basically a sender-initiated algorithm. Under sender-initiated algorithms, load balancing activity is initiated by a sender trying to send a task to a receiver [1, 2].

In a sender-initiated algorithm, each processor decides on task transfer independently. A request message for the task transfer is first issued from the sender to another, randomly selected processor. If the selected processor is a receiver, it returns an accept message and is ready to receive the additional task from the sender. Otherwise, it returns a reject message, and the sender tries other processors until it receives an accept message. If all the request messages are rejected, no task transfer takes place.

While the system load remains light, the sender-initiated algorithm performs well. But when the system load becomes heavy, it is difficult to find a suitable receiver because most processors have additional tasks to send. Many request and reject messages are therefore sent back and forth repeatedly, and a lot of time is consumed before execution. As a result, much of the task processing time is wasted, which causes low system throughput and low CPU utilization.

To solve these problems, we propose a GA-based dynamic load balancing scheme with a local improvement operation, which evolves a strategy for determining a destination processor to receive a task. We also define a suitable fitness function. In this scheme, the number of request messages issued before a task is accepted is determined through the proposed genetic algorithm. The proposed genetic algorithm is applied to a population of binary strings. Each gene in the string corresponds to a processor to which a request message may be sent.

The rest of the paper is organized as follows. Section 2 presents the GA-based method. Section 3 presents several experiments that compare it with the conventional method. Finally, the conclusions are presented in Section 4.

2. GA-based Method

2.1 Load Measure

We employ the CPU queue length as the load index because this measure is known to be the most suitable index [5]. It is the number of tasks in the CPU queue of a processor. We use a 3-level scheme to represent the load state of a processor based on its own CPU queue length. Table 1 shows the 3-level load measurement scheme. $T_{up}$ and $T_{low}$ are algorithm design parameters and are called the upper and lower thresholds, respectively.

Table 1 3-level load measurement scheme

Load state    Meaning        Criteria
L-load        light-load     $CQL \le T_{low}$
N-load        normal-load    $T_{low} < CQL \le T_{up}$
H-load        heavy-load     $CQL > T_{up}$
(CQL: CPU Queue Length)

The transfer policy is a threshold policy that makes decisions based on the CPU queue length. It is triggered when a task arrives. A node identifies itself as a sender if a new task originating at the node makes the CPU queue length exceed $T_{up}$. A node identifies itself as a suitable receiver for a task acquisition if accepting the task will not cause the node's CPU queue length to exceed $T_{low}$.
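The 3-level load measure and the threshold transfer policy of Section 2.1 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the threshold values used in any call are up to the reader, and treating a queue length equal to the lower threshold as light-load is an assumption about the boundary case.

```python
from enum import Enum


class LoadState(Enum):
    """The three load states of the 3-level scheme."""
    L_LOAD = "light-load"
    N_LOAD = "normal-load"
    H_LOAD = "heavy-load"


def load_state(cql: int, t_low: int, t_up: int) -> LoadState:
    """Classify a CPU queue length (CQL) against the two thresholds.

    Assumption: CQL == t_low counts as light-load and CQL == t_up
    counts as normal-load.
    """
    if cql > t_up:
        return LoadState.H_LOAD
    if cql > t_low:
        return LoadState.N_LOAD
    return LoadState.L_LOAD


def is_sender(cql_before_arrival: int, t_up: int) -> bool:
    """A node is a sender if a newly arrived local task pushes CQL above t_up."""
    return cql_before_arrival + 1 > t_up


def is_receiver(cql: int, t_low: int) -> bool:
    """A node is a receiver if accepting one more task keeps CQL within t_low."""
    return cql + 1 <= t_low
```

With hypothetical thresholds `t_low=2` and `t_up=5`, a node holding five tasks becomes a sender when a sixth arrives, while a node holding one task can still act as a receiver.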
2.2 Overview

Each processor in the distributed system has its own population, to which the genetic operators are applied. A string in the population is defined as a binary-coded vector $\langle v_0, v_1, \ldots, v_{n-1} \rangle$ that indicates the set of processors to which the request messages are sent. If a request message is transferred to processor $P_i$ (where $0 \le i \le n-1$ and $n$ is the total number of processors), then $v_i = 1$; otherwise $v_i = 0$. Each string has its own fitness value. We select a string with a probability proportional to its fitness value and transfer the request messages to the processors indicated by the string. A processor that receives a request message from the sender returns an accept message or a reject message depending on its own CPU queue length. If more than one accept message is returned, one is selected at random.

Suppose that there are 10 processors in the distributed system and that processor P0 is a sender. The genetic algorithm is then performed to decide a suitable receiver. We select a string with a probability proportional to its fitness value. Suppose the selected string is $\langle -, 1, 0, 1, 0, 0, 1, 1, 0, 0 \rangle$; then the sender P0 sends request messages to P1, P3, P6, and P7. After each of these processors receives a request message from P0, it checks its load state. If processor P3 is in a light-load state, P0 transfers a task to P3.

2.3 Fitness Function

Each string in a population is evaluated by the fitness function using the following formula.

$$F_i = \frac{1}{\alpha \times \mathrm{TMP} + \beta \times \mathrm{TMT} + \gamma \times \mathrm{TTP}}$$

$\alpha$, $\beta$, and $\gamma$ in the formula above are the weights for the parameters TMP, TMT, and TTP.

First, TMP (Total Message Processing time) is the sum of the processing times for the request messages to be transferred. It is defined by the following formula, where ReMN is the number of messages to be transferred, that is, the number of bits set to '1' in the selected string. The objective of this parameter is to select a string with the fewest messages to be transferred.

$$\mathrm{TMP} = \sum_{k \in x} (\mathrm{ReMN} \times \text{Time Unit}), \quad x = \{\, i \mid v_i = 1 \text{ for } 0 \le i \le n-1 \,\}$$

Second, TMT (Total Message Transfer time) is the sum of the message transfer times (EMTT) from the sender to the processors corresponding to the bits set to '1' in the selected string. The objective of this parameter is to select a string with the shortest total distance. We define TMT by the following formula.

$$\mathrm{TMT} = \sum_{k \in x} \mathrm{EMTT}_k, \quad x = \{\, i \mid v_i = 1 \text{ for } 0 \le i \le n-1 \,\}$$

Last, TTP (Total Task Processing time) is the sum of the times needed to perform a task at each processor corresponding to the bits set to '1' in the selected string. The objective of this parameter is to select a string whose processors hold the fewest tasks. Load in TTP is the CPU queue length of the processor. TTP is defined by the following formula.

$$\mathrm{TTP} = \sum_{k \in x} \mathrm{Load}_k, \quad x = \{\, i \mid v_i = 1 \text{ for } 0 \le i \le n-1 \,\}$$

To obtain the largest fitness value, each of TMP, TMT, and TTP must be as small as possible. That is, TMP calls for fewer request messages, TMT for the shortest distance, and TTP for fewer tasks. Eventually, the string with the largest fitness value in the population is selected. After the genetic operation is performed, the request messages are transferred to the processors corresponding to the bits set to '1' in the selected string.
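The fitness evaluation of Section 2.3 and the fitness-proportional choice of a string can be sketched as follows. This is a hedged illustration under stated assumptions, not the authors' code: the function names, the `emtt` and `load` values, and the `time_unit` default are hypothetical; TMP is read here as one Time Unit per request message (ReMN × Time Unit); and the sender's own position in the string is encoded as 0.

```python
import random
from typing import List, Optional, Sequence


def fitness(
    string: Sequence[int],
    emtt: Sequence[float],
    load: Sequence[int],
    alpha: float,
    beta: float,
    gamma: float,
    time_unit: float = 1.0,
) -> float:
    """Return 1 / (alpha*TMP + beta*TMT + gamma*TTP) for one binary string."""
    # x = indices of processors that would receive a request message
    x = [i for i, v in enumerate(string) if v == 1]
    if not x:
        raise ValueError("string selects no processor")
    tmp = len(x) * time_unit        # assumed reading: ReMN * Time Unit
    tmt = sum(emtt[k] for k in x)   # message transfer times to selected processors
    ttp = sum(load[k] for k in x)   # CPU queue lengths of selected processors
    return 1.0 / (alpha * tmp + beta * tmt + gamma * ttp)


def select_string(
    population: List[Sequence[int]],
    fitnesses: Sequence[float],
    rng: Optional[random.Random] = None,
) -> Sequence[int]:
    """Pick one string with probability proportional to its fitness."""
    rng = rng or random.Random()
    return rng.choices(population, weights=fitnesses, k=1)[0]


if __name__ == "__main__":
    # String from the example in Section 2.2, sender position encoded as 0.
    selected = [0, 1, 0, 1, 0, 0, 1, 1, 0, 0]
    emtt = [0.0, 1.0, 2.0, 1.0, 3.0, 2.0, 1.0, 2.0, 3.0, 1.0]  # hypothetical
    load = [6, 1, 4, 0, 5, 3, 2, 1, 4, 5]                      # hypothetical
    print(fitness(selected, emtt, load, alpha=0.025, beta=0.01, gamma=0.02))
```

Because every term in the denominator grows with the number of set bits, a string that probes fewer, closer, and less loaded processors scores higher, which matches the stated goal of reducing unnecessary request messages.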
2.4 Algorithm

This algorithm consists of five modules: Initialization, Check-load, String-evaluation, Genetic-operation, and Message-evaluation. The Genetic-operation module consists of three sub-modules: Local improvement, Reproduction, and Crossover. These modules are executed at each processor in the distributed system. The flow diagram of the proposed algorithm for dynamic load balancing is presented in Fig 1.

Fig 1 Flow diagram of the proposed algorithm

The Initialization module is executed in each processor. A population of strings is randomly generated without duplication.

The Check-load module observes the processor's own load by checking the CPU queue length whenever a task arrives at the processor. If the observed load is heavy, the load balancing algorithm performs the following modules.

The String-evaluation module calculates the fitness value of the strings in the population.

The Genetic-operation module, which consists of Local improvement, Reproduction, and Crossover, is executed on the population as follows. Distributed systems consist of groups of autonomous computers. When each group consists of many processors, we can suppose that a string has p parts corresponding to the groups. The following genetic operations are applied to each string, and new populations of strings are generated:

(1) Local Improvement Operation
String 1 is chosen. A copy of string 1 is generated, and part 1 of the newly generated clone is mutated. This new string is evaluated by the proposed fitness function. If the evaluated value of the new string is higher than that of the original string, the original string is replaced with the new string. After this, the local improvement of part 2 of string 1 is performed in the same way. The local improvement is applied to each part one by one. When the local improvement of all the parts is finished, the new string 1 is generated. String 2 is then chosen, and the same local improvement is performed. This local improvement operation is applied to all the strings in the population.

(2) Reproduction
The reproduction operation is applied to the newly generated strings. We use the "wheel of fortune" technique [4].

(3) Crossover
The crossover operation is applied to the newly generated strings, and the resulting strings are evaluated. We use the "one-point" crossover operator in this paper [4].

The Genetic-operation module selects a string from the population with a probability proportional to its fitness and then sends the request messages according to the contents of the selected string.

The Message-evaluation module is used whenever a processor receives a message from other processors. When a processor $P_i$ receives a request message, it sends back an accept or reject message depending on its CPU queue length.

3. Experiments

We ran several experiments to compare the proposed genetic algorithm for dynamic load balancing with a conventional sender-initiated algorithm. Our experiments make the following assumptions. First, each task has the same size.
Second, the load rating over the system is about 60 percent. In the genetic algorithm, the crossover probability ($P_c$) is 0.7 and the mutation probability ($P_m$) is 0.05. These values of $P_c$ and $P_m$ are known to be the most suitable in various applications [3]. Table 2 shows the parameters used in our experiments.

Table 2 Contents of parameters

Parameter                      Value
number of processors           1 - -
number of strings              30
number of parts in a string    5
weight for TMP                 0.025
weight for TMT                 0.01
weight for TTP                 0.02

[Experiment 1] We compared the performance of the proposed method with that of a conventional method using the parameters in Table 2. The experiment observes the change in response time when the number of tasks to be performed is 3000.

Fig 2 Result of response time (horizontal axis: number of tasks)

Fig 2 shows the result of Experiment 1. In the conventional method, when the sender determines a suitable receiver, it selects a processor in the distributed system at random and receives the load state information from the selected processor. The algorithm takes the selected processor as the receiver if its load is within $T_{low}$ (light-load). These steps are repeated until a suitable receiver is found. As a result, the response time fluctuates severely. The proposed algorithm shows a low response time because the load balancing activity performs the genetic operation, which takes the load state into account when it determines a receiver.

[Experiment 2] This experiment observes the convergence of the fitness function for the best string in a specific processor in the distributed system.

Fig 3 Fitness value of the processor P6

In this experiment, we observed that processor P6 performs about three hundred of the 3000 tasks and that the proposed algorithm generally converges within 50 generations. The small fluctuations displayed in this result come from the change of the fitness value of the best string selected in each generation. An interesting fact is that these fluctuations appear only above a specific level; that is, the fitness values in each generation do not drop below a specific lower bound.

[Experiment 3] This experiment observes the performance when the probability of crossover is changed.

Fig 4 Result depending on the changes of $P_c$ (horizontal axis: number of tasks)

Fig 4 shows the response time depending on the changes of $P_c$ when $P_m$ is 0.05. Performance is high when $P_c$ is 0.7, with a mean response time of 21.104667 time units. Performance is low when $P_c$ is 0.6, with a mean response time of 22.380000 time units.

[Experiment 4] This experiment observes the performance when the probability of mutation is changed.

Fig 5 Result depending on the changes of $P_m$ (horizontal axis: number of tasks)

Fig 5 shows the response time depending on the changes of $P_m$ when $P_c$ is 0.7. Performance is high when $P_m$ is 0.1, with a mean response time of 19.784000 time units. Performance is low when $P_m$ is 0.05, with a mean response time of 21.104667 time units.

Through these experiments, we verify that the performance of the proposed scheme is better than that of the conventional scheme. The performance under changes of the parameters $P_c$ and $P_m$ is also better than that of the conventional method.

4. Conclusions

We propose a new dynamic load balancing scheme for distributed systems that is based on a genetic algorithm with a local improvement operation. The genetic algorithm is used to decide the suitable candidate receivers to which task transfer request messages should be sent. Several experiments were carried out to compare the proposed scheme with a conventional algorithm. In these experiments, the proposed scheme performed better than the conventional scheme in terms of response time and mean response time, which shows the effectiveness of the proposed algorithm for dynamic load balancing in distributed systems. The performance of the proposed algorithm under changes of the mutation probability ($P_m$) and the crossover probability ($P_c$) is also better than that of the conventional scheme.

References

[1] D.L. Eager, E.D. Lazowska, J. Zahorjan, "Adaptive Load Sharing in Homogeneous Distributed Systems," IEEE Trans on Software Engineering, vol.12, no.5, pp.662-675, May 1986.
[2] N.G. Shivaratri, P. Krueger, and M. Singhal, "Load Distributing for Locally Distributed Systems," IEEE COMPUTER, vol.25, no.12, pp.33-44, December 1992.
[3] J. Grefenstette, "Optimization of Control Parameters for Genetic Algorithms," IEEE Trans on SMC, vol.SMC-16, no.1, pp.122-128, January 1986.
[4] J.R. Filho and P.C. Treleaven, "Genetic-Algorithm Programming Environments," IEEE COMPUTER, pp.28-43, June 1994.
[5] T. Kunz, "The Influence of Different Workload Descriptions on a Heuristic Load Balancing Scheme," IEEE Trans on Software Engineering, vol.17, no.7, pp.725-730, July 1991.
[6] T. Furuhashi, K. Nakaoka, Y. Uchikawa, "A New Approach to Genetic Based Machine Learning and an Efficient Finding of Fuzzy Rules," Proc. WWW94, pp.114-122, 1994.
[7] J.A. Miller, W.D. Potter, R.V. Gondham, C.N. Lapena, "An Evaluation of Local Improvement Operators for Genetic Algorithms," IEEE Trans on SMC, vol.23, no.5, pp.1340-1351, Sept 1993.
[8] N.G. Shivaratri and P. Krueger, "Two Adaptive Location Policies for Global Scheduling Algorithms," Proc. 10th International Conference on Distributed Computing Systems, pp.502-509, May 1990.
[9] Terence C. Fogarty, Frank Vavak, and Phillip Cheng, "Use of the Genetic Algorithm for Load Balancing of Sugar Beet Presses," Proc. Sixth International Conference on Genetic Algorithms, pp.617-624, 1995.
[10] Garrison W. Greenwood, Christian Lang and Steve Hurley, "Scheduling Tasks in Real-Time Systems using Evolutionary Strategies," Proc. Third Workshop on Parallel and Distributed Real-Time Systems, pp.195-196, 1995.
[11] Gilbert Syswerda, Jeff Palmucci, "The Application of Genetic Algorithms to Resource Scheduling," Proc. Fourth International Conference on Genetic Algorithms, pp.502-508, 1991.