ACONM: A Hybrid of Ant Colony Optimization and Nelder-Mead Simplex Search
N. Arun & V. Ravi*
*Assistant Professor
Institute for Development and Research in Banking Technology (IDRBT), Castle Hills Road #1, Masab Tank, Hyderabad 500 067 (AP), India
rav_padma@yahoo.com
Jan 24, 2009
Outline
- Review of existing hybrid algorithms
- Exploration vs. exploitation
- Motivation for the current hybrid
- ACO_R
- Nelder-Mead simplex search
- Hybrid
- Key issues
- Results
- Conclusion
Existing Hybrid Algorithms
There are several algorithms that are hybrids of meta-heuristics and local search techniques:
- INESA (non-equilibrium simulated annealing + simplex-like heuristic) [1]
- GA + simplex search [2]
- SA + TS + simplex search [3]
- TS + simplex search [4]
- DE + simplex search [5]
- PSO + simplex search [6]
- HCIAC (CIAC + simplex search) [7]
- DHCIAC [9]
Exploration vs. Exploitation
A common theme across these hybrid algorithms is the balance between exploration and exploitation. Exploration is the process of identifying promising regions in the search space. Exploitation is the process of using a promising region to arrive at the global optimum.
Exploration vs. Exploitation contd.
Meta-heuristics, by their very nature, are very good at exploration: they are good at avoiding local optima. Local search techniques, on the other hand, are not very good at exploration and tend to get trapped in local optima. However, meta-heuristics are not as fast as local search techniques when it comes to exploitation.
Exploration vs. Exploitation contd.
It is these complementary strengths that inspired the development of several hybrid algorithms. The hybrids employ the meta-heuristic for exploration and the local search technique for exploitation. By combining the strengths of both, the hybrids perform better than the individual algorithms.
Motivation for the Current Hybrid
When we look at the hybrid algorithms that make use of Ant Colony Optimization, we find that they use ACO algorithms that are modified forms of the ACO metaphor. The HCIAC and DHCIAC algorithms make use of the CIAC [8] algorithm; CIAC uses the notion of heterarchy (direct communication between ants). The ACO_R [10] algorithm, on the other hand, is an elegant extension of the original ACO algorithm [12] to the realm of continuous optimization. This motivated us to try out a hybrid of ACO_R and simplex search.
ACO_R
Proposed by Socha and Dorigo (2008) [10], ACO_R is an extension of the ACO algorithm to continuous optimization. The ACO algorithm constructs a solution as a sequence of solution components. The number of solution components for each dimension is finite in the case of combinatorial optimization; in the case of continuous optimization, however, it is infinite. The ACO_R algorithm was proposed to take care of this key difference without making any changes to the ACO metaphor (solution construction using pheromone trail values).
ACO_R Algorithm
The algorithm makes use of a set of solutions called the solution archive (T). The solution archive is used to construct probability density functions which model the fitness of solutions in various regions of the search space. New solutions are constructed by sampling these probability density functions.
ACO_R Algorithm contd.
The solution archive contains k solutions s_1, s_2, ..., s_k. These solutions are sorted in descending order of their fitness value (ties are broken randomly). f(s_1), f(s_2), ..., f(s_k) represent the objective function values of the solutions. A weight is assigned to the solution with rank l according to the following formula, where q is a parameter of the algorithm:

    w_l = 1 / (q·k·√(2π)) · exp( −(l − 1)² / (2·q²·k²) )    (1)
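As a quick illustration, equation (1) can be evaluated for all ranks at once. Below is a minimal sketch in Python/NumPy; the function name and the vectorized layout are our own choices, not part of ACO_R itself:

    import numpy as np

    def archive_weights(k, q):
        """Weights w_l for ranks l = 1..k according to equation (1)."""
        l = np.arange(1, k + 1)
        return np.exp(-(l - 1) ** 2 / (2 * q ** 2 * k ** 2)) / (q * k * np.sqrt(2 * np.pi))

For example, archive_weights(50, 0.1) concentrates almost all the weight on the top-ranked solutions, while q closer to 1 spreads the weights nearly uniformly over the archive.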
ACO_R Algorithm contd.
[Figure: effect of the parameter q on the weights w_l]
ACO_R Algorithm contd.
For each dimension i, a Gaussian kernel function is constructed. A Gaussian kernel function is a weighted sum of k Gaussian functions:

    G^i(x) = Σ_{l=1..k} w_l · g_l^i(x),   g_l^i(x) = 1 / (σ_l^i·√(2π)) · exp( −(x − μ_l^i)² / (2·(σ_l^i)²) )

The means are taken from the archive, μ_l^i = s_l^i, and the standard deviations are

    σ_l^i = ξ · Σ_{e=1..k} |s_e^i − s_l^i| / (k − 1)    (2)
ACO_R Algorithm contd.
ξ is a parameter of the algorithm that influences the manner in which the solution archive is used in the search process. If the value of ξ is small, new solutions will be close to the solutions in the archive, which increases the speed of convergence. If the value of ξ is large, there will be greater diversification and convergence may be slow.
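Assuming the archive is stored as a (k × n) NumPy array of solutions, equation (2) can be sketched as follows (the helper name and the array layout are our own):

    import numpy as np

    def sigmas(archive, l, xi):
        """sigma_l^i for every dimension i at once, per equation (2):
        the mean absolute distance from solution l to the other archive
        members, scaled by xi."""
        k = archive.shape[0]
        return xi * np.abs(archive - archive[l]).sum(axis=0) / (k - 1)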
ACO_R Algorithm contd.
A new solution is created by the algorithm by making use of the probability density functions. Since Gaussian kernel functions are used as the probability density functions, creating a new solution involves: (i) selecting a component (Gaussian function) of the Gaussian kernel, and (ii) creating a new solution using the chosen Gaussian function. The Gaussian function l is chosen with probability

    p_l = w_l / Σ_{r=1..k} w_r    (3)
ACO_R Algorithm contd.
Once a particular component of the Gaussian kernel is selected, the same component is used across all the dimensions. The new solution is constructed by sampling the chosen Gaussian function in each dimension.
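Putting equations (1)-(3) together, the construction of one new solution could look like the sketch below (Python/NumPy; the function name and the rng plumbing are our own assumptions):

    import numpy as np

    def construct_solution(archive, weights, xi, rng):
        """One ant's solution: choose a single Gaussian component l via
        equation (3), then sample every dimension i from N(s_l^i, sigma_l^i)."""
        k, n = archive.shape
        l = rng.choice(k, p=weights / weights.sum())                      # equation (3)
        sigma = xi * np.abs(archive - archive[l]).sum(axis=0) / (k - 1)   # equation (2)
        return archive[l] + sigma * rng.standard_normal(n)

Here rng would be a NumPy generator, e.g. rng = np.random.default_rng().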
ACO_R Algorithm contd.
Pseudo-code for the ACO_R algorithm:
1: initialize the solution archive T with random solutions
2: repeat
3:   sort the solutions in the archive in descending order of fitness value (break ties randomly); compute the weights according to (1)
4:   for each ant m in NUMANTS do
5:     select a solution s_l probabilistically according to (3)
6:     for each dimension i do
7:       generate a random number Z having the standard normal distribution
8:       calculate σ_l^i according to (2)
9:       s_m^i = s_l^i + σ_l^i · Z
10:    end for
11:  end for
12: until the termination criterion is met
Nelder-Mead Simplex Search
Proposed by Nelder and Mead [11], it is a local search technique. It uses a simplex which moves towards the local optimum using four operations: reflection, expansion, contraction and shrinkage.
Pseudo-Code for Simplex Search
1: REFLECTION: Let P_h, P_s and P_l denote the points with the highest, second highest and lowest objective function values, and let f_h, f_s and f_l be the corresponding objective function values. Calculate the centroid P_c of the simplex by excluding P_h, and reflect the highest point about the centroid:

    P_r = P_c + α·(P_c − P_h)

α is the reflection coefficient (α > 0); we used α = 1 as suggested in [11]. If f_l ≤ f_r < f_s, replace P_h by P_r and repeat step 1. If f_r < f_l, go to step 2; otherwise go to step 3.

2: EXPANSION: Calculate the point of expansion P_e by searching further in the direction of P_r:

    P_e = P_c + γ·(P_r − P_c)

γ is the expansion coefficient (γ > 1); we used γ = 2 as suggested in [11]. If f_e < f_r, replace P_h by P_e; otherwise replace P_h by P_r. Go to step 1.

3: CONTRACTION: If f_r < f_h, P_r replaces P_h before the point of contraction P_ct is calculated; otherwise P_r is discarded and P_ct is calculated directly:

    P_ct = P_c + β·(P_h − P_c)

β is the contraction coefficient (0 < β < 1); we used β = 0.5 as suggested in [11]. If f_ct ≤ f_h, P_ct replaces P_h; go to step 1. Otherwise, go to step 4.

4: SHRINKAGE: If f_ct > f_h, shrink the simplex towards P_l (i.e., each point except P_l is moved towards P_l):

    P_i = P_l + δ·(P_i − P_l)

δ is the shrinkage coefficient (0 < δ < 1); we used δ = 0.5 as suggested in [11]. Go to step 1.
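The four operations above translate directly into code. Below is a minimal, self-contained Python sketch for minimization; the coefficient defaults follow the slides, while the function name, the NumPy representation of the simplex, and the stopping test on the spread of function values are our own assumptions:

    import numpy as np

    def nelder_mead(f, simplex, alpha=1.0, gamma=2.0, beta=0.5, delta=0.5,
                    tol=1e-8, max_iter=500):
        # Minimize f starting from an initial simplex of n+1 points (rows).
        simplex = np.asarray(simplex, dtype=float)
        fvals = np.array([f(p) for p in simplex])
        for _ in range(max_iter):
            order = np.argsort(fvals)                 # ascending: best point first
            simplex, fvals = simplex[order], fvals[order]
            p_l, f_l = simplex[0], fvals[0]           # lowest (best)
            f_s = fvals[-2]                           # second highest
            p_h, f_h = simplex[-1], fvals[-1]         # highest (worst)
            if fvals.std() < tol:                     # our own stopping test
                break
            p_c = simplex[:-1].mean(axis=0)           # centroid excluding P_h
            p_r = p_c + alpha * (p_c - p_h)           # 1: REFLECTION
            f_r = f(p_r)
            if f_r < f_l:                             # 2: EXPANSION
                p_e = p_c + gamma * (p_r - p_c)
                f_e = f(p_e)
                simplex[-1], fvals[-1] = (p_e, f_e) if f_e < f_r else (p_r, f_r)
            elif f_r < f_s:                           # accept the reflected point
                simplex[-1], fvals[-1] = p_r, f_r
            else:                                     # 3: CONTRACTION
                if f_r < f_h:
                    simplex[-1], fvals[-1] = p_r, f_r
                    p_h, f_h = p_r, f_r
                p_ct = p_c + beta * (p_h - p_c)
                f_ct = f(p_ct)
                if f_ct <= f_h:
                    simplex[-1], fvals[-1] = p_ct, f_ct
                else:                                 # 4: SHRINKAGE towards P_l
                    simplex = p_l + delta * (simplex - p_l)
                    fvals = np.array([f(p) for p in simplex])
        best = fvals.argmin()
        return simplex[best], fvals[best]

For example, nelder_mead(lambda p: (p ** 2).sum(), np.array([[1.0, 1.0], [1.2, 1.0], [1.0, 1.2]])) converges to the origin.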
Hybrid
The key idea of the hybrid is to let the ACO_R algorithm do the exploration and, once it identifies a promising region, to invoke the simplex search to quickly arrive at the optimum.
Key Issues of the Hybrid
The two important issues in the hybrid are:
(i) the changeover from ACO_R to simplex search, and
(ii) the creation of the initial simplex for the simplex search algorithm.
Key Issues contd.
Changeover from ACO_R to NM: the point where the algorithm shifts from ACO_R to NM is a crucial parameter of the algorithm. The standard deviation between solutions in the decision space is used for choosing the point of changeover. The reason for choosing the standard deviation is that when the ACO_R algorithm begins to converge on the global optimum, the solutions lie in the vicinity of the global optimum; they form a neighborhood which can then be used by NM simplex search.
Key Issues contd.
After each iteration, the standard deviation between solutions is calculated. If it becomes less than or equal to a user-specified value η_c, the algorithm changes over to NM simplex search. If a large value of η_c is chosen, the changeover occurs early; this results in fewer function evaluations, but at the same time a lower success rate. If a small value of η_c is chosen, this results in a higher number of function evaluations, but also a higher success rate.
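A sketch of the changeover test (Python/NumPy, names ours): the slides say only "standard deviation between solutions", so aggregating the per-dimension standard deviations by their mean is our assumption; any similar scalar measure of spread would fit the description:

    import numpy as np

    def changeover_reached(archive, eta_c):
        """True once the archive has collapsed into a small neighborhood:
        the per-dimension standard deviation, averaged over dimensions,
        has fallen to the user-specified threshold eta_c."""
        return archive.std(axis=0).mean() <= eta_c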
Key Issues contd.
Construction of the initial simplex: the simplex search algorithm is extremely sensitive to the initial simplex. The initial simplex should give the algorithm sufficient information about the landscape of the function being optimized. If the initial simplex is such that the difference in the objective function values of its points is small, the algorithm will take a large number of iterations.
Key Issues contd.
To give the simplex algorithm sufficient information about the landscape of the function being optimized, we divide the solutions in the archive into (n+1) chunks (n is the number of dimensions) and pick the first solution from each chunk. This gives the algorithm better information about the function, because the objective function values will be spread out.
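A sketch of this selection, assuming the archive is a (k × n) NumPy array already sorted by fitness so that the chunks span the whole range of objective function values (the helper name is ours; np.array_split handles uneven chunk sizes):

    import numpy as np

    def initial_simplex(archive, n):
        """Pick n+1 spread-out solutions from the sorted archive:
        split it into n+1 chunks and take the first solution of each.
        Requires k >= n + 1."""
        chunks = np.array_split(np.arange(archive.shape[0]), n + 1)
        return archive[[c[0] for c in chunks]]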
Results
Table 3. Results (srate: success rate over 100 runs; fevals: mean number of function evaluations)

Problem                      ACONM srate   ACONM fevals   ACO_R srate   ACO_R fevals
Ackley [16]                      0.79          712.62         0.81         1251.97
Bohachevsky [13]                 0.83          300.74         1.00          709.42
Branin [14]                      1.00          539.23         1.00          901.72
De Jong Function 1 [15]          1.00          184.21         1.00          565.30
Easom [17]                       1.00          271.79         1.00          950.80
Goldstein-Price [14]             0.98          161.40         1.00          553.74
Griewank 10 [18]                 0.01         2102.00         0.28         2679.85
Hartman 3 [14]                   1.00          179.90         1.00          698.18
Hartman 6 [14]                   0.58          423.43         0.57         1183.54
Rosenbrock [19]                  1.00          653.11         1.00         1314.38
Schwefel [20]                    0.55         1112.76         0.59         1633.45
Shekel 5 [14]                    0.54          569.59         0.61         1161.34
Shekel 7 [14]                    0.67          502.11         0.59         1059.49
Shekel 10 [14]                   0.66          516.96         0.66         1067.33
Shubert [21]                     0.84         1559.64         0.85         1991.94
Rastrigin [22]                   0.56          807.16         0.63         1388.50
Modified Himmelblau [23]         0.80          416.02         0.77          844.33
Zakharov [2]                     1.00          226.75         1.00          631.46
Results contd.
100 independent runs were conducted for both algorithms on all the problems. The success rate over the 100 independent runs is reported, along with the mean number of function evaluations. To enable a fair comparison, the bounds fixed on the variables were the same for both algorithms. We did not resort to fine-tuning the algorithms, since our main aim was to compare the performance of the two algorithms.
Results contd.
The results show that the hybrid is able to outperform the ACO_R algorithm for several of the test functions.
References
[1] V. Ravi, B. S. N. Murty and P. J. Reddy, "Nonequilibrium simulated annealing algorithm applied to reliability optimization of complex systems," IEEE Transactions on Reliability, vol. 46, pp. 233-239, 1997.
[2] R. Chelouah and P. Siarry, "Genetic and Nelder-Mead algorithms hybridized for a more accurate global optimization of continuous multiminima functions," European Journal of Operational Research, vol. 148, pp. 335-348, 2003.
[3] S. Salhi and N. M. Queen, "A hybrid algorithm for identifying global and local minima when optimizing functions with many minima," European Journal of Operational Research, vol. 155, pp. 51-67, 2004.
[4] R. Chelouah and P. Siarry, "A hybrid method combining continuous tabu search and Nelder-Mead simplex algorithms for the global optimization of multiminima functions," European Journal of Operational Research, vol. 161, pp. 636-654, 2005.
[5] T. R. Bhat, D. Venkataramani, V. Ravi and C. V. S. Murty, "An improved differential evolution method for efficient parameter estimation in biofilter modeling," Biochemical Engineering Journal, vol. 28, pp. 167-176, 2006.
[6] S.-K. S. Fan and E. Zahara, "A hybrid simplex search and particle swarm optimization for unconstrained optimization," European Journal of Operational Research, vol. 181, pp. 527-548, 2007.
[7] J. Dreo and P. Siarry, "Hybrid continuous interacting ant colony aimed at enhanced global optimization," Algorithmic Operations Research, vol. 2, pp. 52-64, 2007.
[8] J. Dreo and P. Siarry, "Continuous interacting ant colony algorithm based on dense heterarchy," Future Generation Computer Systems, vol. 20, pp. 841-856, 2004.
[9] J. Dreo and P. Siarry, "An ant colony algorithm aimed at dynamic continuous optimization," Applied Mathematics and Computation, vol. 181, pp. 457-467, 2006.
[10] K. Socha and M. Dorigo, "Ant colony optimization for continuous domains," European Journal of Operational Research, vol. 185, pp. 1155-1173, 2008.
[11] J. A. Nelder and R. Mead, "A simplex method for function minimization," The Computer Journal, vol. 7, pp. 308-313, 1965.
[12] M. Dorigo and T. Stützle, Ant Colony Optimization. Cambridge, MA: MIT Press, 2004.