Comprehensive and Automatic Fitness Landscape Analysis using HeuristicLab

Erik Pitzer, Michael Affenzeller, Andreas Beham, and Stefan Wagner

Heuristic and Evolutionary Algorithms Lab
School of Informatics, Communications and Media
Upper Austria University of Applied Sciences
Softwarepark 11, 4232 Hagenberg, Austria
{erik.pitzer, michael.affenzeller, andreas.beham, stefan.wagner}@fh-hagenberg.at

Abstract. Many different techniques have been developed for fitness landscape analysis. We present a consolidated and uniform implementation inside the heuristic optimization platform HeuristicLab. On top of these analysis methods, a new approach to the empirical measurement of isotropy is presented that can be used to extend existing methods. Results are shown using the existing implementation within HeuristicLab 3.3.

Keywords: Fitness Landscape Analysis, Isotropy, HeuristicLab

1 Introduction

The original idea of a fitness landscape stems from the description of adaptive populations in [16]. Since then, this metaphor has been used not only in biology but also in genetic algorithms and other heuristic methods. It provides a vivid metaphor of the search space as perceived by an optimization process. Hence, it is the basis for understanding typical problems that are subject to heuristic problem solving.

This paper is structured as follows: First, the purpose of fitness landscape analysis is reiterated, followed by a formal definition. Next, fitness landscape analysis methods are introduced, followed by the presentation of a new method for isotropy measurement as an extension to existing analysis methods. Finally, several fitness landscapes are analyzed using standard methods, followed by a measurement of isotropy using the newly introduced technique.

1.1 Purpose

In the course of fitness landscape analysis many different methods have been proposed. These methods aim to provide complementary views of the fitness landscape and enable the researcher to gain a better understanding of the problem at hand.

At first glance, fitness landscape analysis might seem like the holy grail of heuristic problem solving: understanding the fitness landscape of a given problem should yield immediate insight that enables us to easily find the global optimum. In reality, it is bound by the same limits as any other heuristic optimization method, even though it has a different focus. While typical metaheuristics provide an effort versus quality trade-off, giving you the choice between a more sophisticated analysis and, in return, a solution of higher quality, fitness landscape analysis provides an effort versus insight trade-off. Here, the aim is not to solve a given problem as efficiently as possible but to gain maximum insight with feasible effort.

1.2 Definition

Informally, a fitness landscape is often assumed to be given merely by the fitness function f : S → R alone, which assigns a real value to every solution candidate in the solution space S. However, this does not yet give rise to a cohesive structure. To fully capture the landscape, a notion of connectivity is needed [7, 3]. This can range from direct connectivity through neighbors or mutation operators to distance measures or hypergraphs for modeling crossover [11]. In summary, a fitness landscape is therefore defined by the triple

F := \{S, f, \mathcal{X}\}    (1)

given by the solution space S, a fitness function f, and a notion of connectivity \mathcal{X}. A more detailed explanation and a comprehensive survey of existing methods can be found in [6].

2 Existing Fitness Landscape Analysis Methods

2.1 Local Methods

The first form of fitness landscape analysis is based on move or manipulation operators and closely examines the immediate structure that is encountered during an optimization process. These local measures examine the immediate fitness changes that occur when applying a certain operator to a fitness landscape and are very often based on trajectories through the fitness landscape created by a random walk.

Ruggedness. One of the first measures for fitness landscape analysis was a measure of ruggedness. Several more or less equivalent measures are derived from the autocorrelation function shown in Eq. (2), where \{f_i\}_{i=0}^{n} is a series of neighboring fitness values, typically obtained through a random walk in the landscape, \bar{f} and \sigma_f^2 are the mean and variance of the fitness values in the random walk, and E(\cdot) denotes the expected value.

R(\tau) := \frac{E[(f_i - \bar{f})(f_{i+\tau} - \bar{f})]}{\sigma_f^2}    (2)

Typically, R(1), the autocorrelation after one step, is used as an indicator of the ruggedness of a fitness landscape [15]. Alternatively, the average maximum distance to a statistically significantly correlated point, called the autocorrelation length, can be used.
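To make the procedure concrete, the following sketch (not part of the HeuristicLab plugin) estimates R(1) and a simple autocorrelation length from the fitness series of a random walk. The single bit-flip neighborhood, the toy OneMax-style fitness function, and the 2/sqrt(n) significance bound are assumptions chosen for this example.

```python
import random

def random_walk_fitness(fitness, start, neighbor, steps):
    """Record the fitness values encountered along a random walk."""
    x, values = start, [fitness(start)]
    for _ in range(steps):
        x = neighbor(x)
        values.append(fitness(x))
    return values

def autocorrelation(values, tau):
    """Empirical autocorrelation R(tau) of a fitness series, cf. Eq. (2)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    if var == 0:
        return 0.0
    cov = sum((values[i] - mean) * (values[i + tau] - mean)
              for i in range(n - tau)) / (n - tau)
    return cov / var

def autocorrelation_length(values):
    """First lag at which the correlation drops below a simple significance
    bound (assumption: approximate 95% band for uncorrelated data)."""
    bound = 2.0 / len(values) ** 0.5
    tau = 1
    while tau < len(values) - 1 and abs(autocorrelation(values, tau)) > bound:
        tau += 1
    return tau

# Example: single bit-flip walk on a toy OneMax-style bit-string landscape.
def flip_one(bits):
    i = random.randrange(len(bits))
    return bits[:i] + [bits[i] ^ 1] + bits[i + 1:]

walk = random_walk_fitness(sum, [0] * 64, flip_one, steps=1000)
print(autocorrelation(walk, 1), autocorrelation_length(walk))
```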

Information Analysis. Similar to the notion of ruggedness, information analysis as described in [12] can be used to explore local properties of fitness landscapes. First, the series of fitness values obtained from, e.g., a random walk is transformed into a series of slope directions d_i = sign(f_i - f_{i-1}). Instead of using a strict sign function, a certain interval [-ε, ε] around zero can also be regarded as being equal to zero, which allows zooming in and out and varying the amount of detail being analyzed. Next, several entropic measures are derived. While the information content H(ε) measures the number of slope shapes in the random walk, the density basin information h(ε) analyzes the smooth regions, as shown in Eq. (3), where P_{[pq]} is the frequency of the consecutive occurrence of the two slopes p and q.

H(\varepsilon) := -\sum_{p \neq q} P_{[pq]} \log_6 P_{[pq]}  \qquad  h(\varepsilon) := -\sum_{p = q} P_{[pq]} \log_3 P_{[pq]}    (3)

In addition, the partial information content M(ε) measures the number of slope direction changes and hence also helps in determining the ruggedness. Finally, the smallest ε for which the landscape becomes completely flat, i.e. the maximum fitness difference between consecutive steps, is called the information stability.

2.2 Global Methods

As a complement to local methods, which primarily analyze the immediate effect on optimization algorithms as they proceed step by step, the following global methods can help to obtain an overview of the fitness landscape. It has to be emphasized, however, that neither view is superior to the other in general.

Fitness Distance Correlation. A very insightful global analysis method is the fitness distance correlation. It visualizes the relation between the fitness of a solution candidate and its distance to the global optimum. While this can give very good insight into whether the fitness function, together with the selected stepping algorithm, yields a meaningful succession towards the global optimum, it has one severe drawback: one needs to know the global optimum in advance. The fitness distance correlation coefficient is defined in Eq. (4). This single number provides a quick indication of whether increased fitness actually leads the search process closer to the global optimum. However, like any correlation coefficient, it does not work properly for non-linear correlations.

FDC := \frac{E[(f_i - \bar{f})(d_i - \bar{d})]}{\sigma_f \, \sigma_d}    (4)

On the other hand, looking at a fitness versus distance plot can provide some more insight into the global structure of the fitness landscape and help in determining certain trends.
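A minimal sketch of the FDC coefficient is given below; the toy minimization landscape with a known optimum at zero is an assumption of the example, not part of the HeuristicLab implementation.

```python
def fitness_distance_correlation(fitnesses, distances):
    """FDC coefficient, cf. Eq. (4): correlation between fitness values
    and distances to the known global optimum."""
    n = len(fitnesses)
    f_mean = sum(fitnesses) / n
    d_mean = sum(distances) / n
    cov = sum((f - f_mean) * (d - d_mean)
              for f, d in zip(fitnesses, distances)) / n
    f_sd = (sum((f - f_mean) ** 2 for f in fitnesses) / n) ** 0.5
    d_sd = (sum((d - d_mean) ** 2 for d in distances) / n) ** 0.5
    return cov / (f_sd * d_sd) if f_sd > 0 and d_sd > 0 else 0.0

# Example on a toy minimization landscape with known optimum at x = 0:
# fitness f(x) = x^2, distance to the optimum d(x) = |x|.
xs = [x / 10.0 for x in range(-50, 51)]
fdc = fitness_distance_correlation([x * x for x in xs], [abs(x) for x in xs])
print(fdc)  # close to +1: smaller distance coincides with better (lower) fitness
```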

Evolvability. Another approach to a more global analysis is to take the neighboring fitness values but regroup them according to their base fitness. This is like looking at the individual contour lines in a contour plot of the landscape. In [9], a selection of different measures of evolvability is presented, called evolvability portraits. We start with a basic definition of evolvability itself in Eq. (5), where N(x) is the set of all neighbors of x and N^{*}(x) is the set of all better neighbors of x; in the continuous definition, the integral runs over all neighbors with higher fitness (f(n) ≥ f(x)), weighted by the probability of selecting a certain neighbor p(n | x).

E(x) := \frac{|N^{*}(x)|}{|N(x)|}  \quad\text{or}\quad  E(x) := \int_{f(n) \geq f(x)} p(n \mid x)\, dn    (5)

Based on this definition, Smith and coworkers define four measures for every solution candidate x that together form the so-called evolvability portraits [9]: First, E_a is simply the probability of a non-deleterious mutation. Second, E_b is the average expected offspring fitness \int p(n \mid x) f(n)\, dn. Finally, the two measures E_c and E_d are a certain top and bottom percentile of the expected offspring fitness or, in our implementation, simply the minimum and maximum expected offspring fitness.
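A sampling-based sketch of these per-candidate measures is shown below. The single bit-flip mutation, the OneMax-style fitness, and the use of simple sampling instead of enumerating the whole neighborhood are assumptions made for this example.

```python
import random

def evolvability_measures(x, fitness, mutate, samples=1000, maximize=True):
    """Sample-based estimates in the spirit of the evolvability portraits:
    probability of a non-deleterious mutation (E_a), average offspring
    fitness (E_b), and minimum/maximum offspring fitness (E_c, E_d)."""
    base = fitness(x)
    offspring = [fitness(mutate(x)) for _ in range(samples)]
    non_deleterious = [f for f in offspring
                       if (f >= base if maximize else f <= base)]
    e_a = len(non_deleterious) / samples       # non-deleterious mutations
    e_b = sum(offspring) / samples             # average offspring fitness
    e_c, e_d = min(offspring), max(offspring)  # simple min/max variant
    return e_a, e_b, e_c, e_d

# Example: bit string under single bit-flip mutation on a OneMax landscape.
def flip_one(bits):
    i = random.randrange(len(bits))
    return bits[:i] + [bits[i] ^ 1] + bits[i + 1:]

x = [random.randint(0, 1) for _ in range(64)]
print(evolvability_measures(x, sum, flip_one, samples=500))
```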

3 Isotropy

It is very convenient to assume that the fitness landscape looks exactly the same everywhere. This assumption is called isotropy. Many local analysis methods implicitly or explicitly assume isotropy [12, 2]. While many real-world fitness landscapes seem to be almost isotropic, this assumption has to be verified. One obvious formulation of a measure of isotropy has been described in [10] as empirical isotropy, as shown in Eq. (6), where \langle \cdot \rangle_d is the average over all pairs x, y with distance d(x, y) = d and \langle \cdot \rangle^{A}_{d} is the restriction of the former to an arbitrary subset A.

\langle (f(x) - f(y))^2 \rangle^{A}_{d} = \langle (f(x) - f(y))^2 \rangle_{d}    (6)

This notion of empirical isotropy is defined over fitness differences at certain distances in the solution space. We propose to extend this idea to any measure by requiring that the result of a measurement on a subspace has to be approximately the result of the measurement on the whole solution space, within reasonable statistical bounds, as shown in Eq. (7), where M is any fitness landscape measure and A is an arbitrary subset of the solution space S.

M(A) \approx M(S)    (7)

As an extension, Eq. (8) also has to hold in an isotropic landscape, where A_i and A_j are suitable subsets of the solution space S.

M(A_i) \approx M(A_j)    (8)

By repeatedly comparing different subsets A_i and A_j, the amount of isotropy can be estimated. An important trade-off to consider in this scenario is the choice of the number and sizes of the subsets A_i. While a larger subset gives a statistically more robust estimate of the local measure, a smaller subset elucidates a much more local view of the landscape. Usually these subsets will be selected as the trajectory of a random walk from a random starting point. As a random walk introduces a large amount of variability, especially if it is short, the random walks are concentrated around the sample points by repeatedly restarting from the same point several times and averaging over these repeats. This gives a much more robust analysis while still enabling a local view of the fitness landscape.

4 Implementation

The paradigm-invariant open-source optimization framework HeuristicLab [13, 14] provides an optimal starting point for the extension with a fitness landscape analysis toolkit. An additional plugin has been implemented containing the following features. Most analysis methods have been implemented as analyzers that can be attached to existing algorithms and provide information about fitness landscape characteristics during execution. In addition, new algorithm-like drivers have been implemented that enable the use of custom trajectories for analysis, such as random walks or adaptive up-down walks, which are frequently used for analysis. Moreover, the functionality of the analyzers is available as operators that can be reused in user-defined algorithms, together with additional operators aimed at fitness landscape analysis. Additionally, an implementation of the NK fitness landscapes [4], which are the de facto playground problem for fitness landscape analysis methods, using the more general block model described in [1], is also included. A HeuristicLab version with the new plugin, together with the demonstration runs, is available at http://dev.heuristiclab.com/additionalmaterial.
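To illustrate the subset-based estimation of Eq. (8), the following sketch compares a localized measure M(A_i), here the step-one autocorrelation R(1) averaged over several walks restarted from the same point, across different sample points; the spread of these estimates is then used as a simple anisotropy indicator. The choice of R(1) as the underlying measure, the walk lengths, and the spread statistic are assumptions of this example rather than details of the HeuristicLab implementation.

```python
import random
import statistics

def r1(values):
    """Step-one autocorrelation of a fitness series, cf. Eq. (2)."""
    n, mean = len(values), sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / n
    if var == 0:
        return 0.0
    cov = sum((values[i] - mean) * (values[i + 1] - mean)
              for i in range(n - 1)) / (n - 1)
    return cov / var

def local_measure(fitness, start, neighbor, steps, restarts):
    """M(A_i): average R(1) over several short walks restarted from the
    same starting point, i.e. a localized view of the landscape."""
    estimates = []
    for _ in range(restarts):
        x, values = start, [fitness(start)]
        for _ in range(steps):
            x = neighbor(x)
            values.append(fitness(x))
        estimates.append(r1(values))
    return sum(estimates) / restarts

def isotropy_spread(fitness, neighbor, sample_points, steps=50, restarts=10):
    """Compare M(A_i) across subsets anchored at different sample points;
    a wide spread of the localized estimates indicates anisotropy."""
    estimates = [local_measure(fitness, p, neighbor, steps, restarts)
                 for p in sample_points]
    return statistics.pstdev(estimates), estimates

# Example: single bit-flip walks on a OneMax landscape from random points.
def flip_one(bits):
    i = random.randrange(len(bits))
    return bits[:i] + [bits[i] ^ 1] + bits[i + 1:]

points = [[random.randint(0, 1) for _ in range(64)] for _ in range(20)]
spread, _ = isotropy_spread(sum, flip_one, points)
print(spread)  # a small spread: the localized R(1) values hardly differ
```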

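The NK landscapes mentioned in the implementation section also serve as benchmark problems in the following results. The sketch below shows a common textbook construction of an NK fitness function with adjacent interaction neighborhoods and random contribution tables; it is a simplified stand-in and does not reproduce the generalized block model of [1] used in HeuristicLab.

```python
import random

def make_nk_landscape(n, k, seed=0):
    """Random NK landscape: each bit i contributes a value that depends on
    itself and its k cyclically adjacent neighbors."""
    rng = random.Random(seed)
    tables = [[rng.random() for _ in range(2 ** (k + 1))] for _ in range(n)]

    def fitness(bits):
        total = 0.0
        for i in range(n):
            # Build the index into bit i's contribution table from the bit
            # itself and its k neighbors.
            idx = 0
            for j in range(k + 1):
                idx = (idx << 1) | bits[(i + j) % n]
            total += tables[i][idx]
        return total / n

    return fitness

# Example: a 100-bit landscape with k = 3 interacting neighbors per bit.
f = make_nk_landscape(100, 3)
print(f([random.randint(0, 1) for _ in range(100)]))
```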
5 Results

Figure 1 shows a selection of scalar fitness landscape analysis results as obtained with HeuristicLab. These were derived using the local analysis algorithm configured to perform a random walk on different NK fitness landscapes [4] using a bit-flip operator (Figure 1(a)) and on a traveling salesman problem [5] instance (ch130 from the TSPLIB [8]) using different permutation mutation operators (Figure 1(b)).

Fig. 1. Fitness landscape analysis results using HeuristicLab: (a) key figures of different NK landscapes, (b) evolvability analysis of different mutation operators on a TSP landscape.

It is easy to see how increasing complexity of the NK landscapes increases measures for ruggedness, (partial) information content, and regularity, while at the same time it decreases measures for smoothness such as autocorrelation and density basin information. In the TSP example (Figure 1(b)), the average evolvability for different parent quality levels is plotted. The probability of producing fitter offspring starts at around 80% while still 1000% away from the best known solution and drops to almost zero when approaching the best known solution. The initially superior scramble operator quickly becomes worse as higher-quality base fitness values are approached. However, closer to the optimum it again performs better than the others (2% vs. 0.5% and 0.03% for insertion and inversion, respectively).

While the previous analysis gives a quick indication of which mutation operators might be beneficial and which landscape variant is more difficult, the information analysis in Figure 2 can provide more insight into the structure of the landscape. These plots show several information analysis values for the previously analyzed NK and TSP landscapes. We can observe how larger quality jumps occur for the more rugged landscapes, as described previously. Moreover, this time we can also see the fundamental difference in the more global structures of their fitness landscapes by looking at the landscape from a more coarse-grained perspective. This shows that the TSP landscape is much more structured, as it exhibits a smooth transition between different zoom levels, while the NK landscape immediately jumps to very information-rich structures. Moreover, the NK landscape reaches higher levels of information content and lower levels of density basin information, hinting at a richer formation of rugged areas and less prevalent smooth areas in the landscape.

Figure 3 shows an example of isotropy estimation. The pronounced difference between the autocorrelation curves throughout the landscape for the sine of the exponential function, sin(x^y), demonstrates the much higher anisotropy in this case.

Fig. 2. Information analysis of (a) TSP inversion, (b) TSP insertion, (c) TSP scramble, (d) NK 100-3, (e) NK 100-50, and (f) NK 100-99. The x-axis shows the ε-threshold used to smooth the quality jumps in the random walk, while the y-axis shows the information content (solid black), partial information content (dashed), and density basin information (solid gray).

Fig. 3. Autocorrelation analysis of 50 random walks of length 50 at 10,000 points for (a) sin(x^y) and (b) the Ackley function, with the corresponding variances in (c) and (d). The solid lines are the minimum, median, and maximum values, the gray lines are the quartiles, and the dashed line is the average. The wider spread of the autocorrelation families in (a) and (c) shows the anisotropy of sin(x^y).

6 Conclusions

We have created a simple toolkit that allows plugging typical fitness landscape analysis methods into standard algorithms and provides new algorithm-like trajectory generators for more traditional fitness landscape analysis using random or partly adaptive walks.

Isotropy is a convenient assumption in most cases, but its extent has only rarely been tested in the past. We have demonstrated a simple method to empirically measure isotropy by extending the well-established autocorrelation analysis. Based on this method it is possible to extend other analysis methods and obtain further perspectives on isotropy. We argue that isotropy cannot be determined universally but has to be linked to a particular underlying measure.

Acknowledgments

This work was made possible through a sabbatical grant from the Upper Austria University of Applied Sciences to EP.

References

1. Altenberg, L.: NK fitness landscapes. In: The Handbook of Evolutionary Computation, chap. B2.7.2, pp. 1–11. Oxford University Press (1997)
2. Czech, Z.: Statistical measures of a fitness landscape for the vehicle routing problem. In: IEEE International Symposium on Parallel and Distributed Processing, pp. 1–8 (2008)
3. Jones, T.: Evolutionary Algorithms, Fitness Landscapes and Search. Ph.D. thesis, University of New Mexico, Albuquerque, New Mexico (1995)
4. Kauffman, S.A.: The Origins of Order. Oxford University Press (1993)
5. Lawler, E.L., Lenstra, J.K., Rinnooy Kan, A.H.G., Shmoys, D.B. (eds.): The Traveling Salesman Problem: A Guided Tour of Combinatorial Optimization. Wiley (1985)
6. Pitzer, E., Affenzeller, M.: A comprehensive survey on fitness landscape analysis. In: Recent Advances in Intelligent Engineering Systems, pp. 167–196. Springer (2011)
7. Reeves, C.R., Rowe, J.E.: Genetic Algorithms. Kluwer (2003)
8. Reinelt, G.: TSPLIB - a traveling salesman problem library. ORSA Journal on Computing 3(4), 376–384 (1991)
9. Smith, T., Husbands, P., Layzell, P., O'Shea, M.: Fitness landscapes and evolvability. Evolutionary Computation 10(1), 1–34 (2002)
10. Stadler, P.F., Grüner, W.: Anisotropy in fitness landscapes. Journal of Theoretical Biology 165(3), 373–388 (1993)
11. Stadler, P., Wagner, G.: The algebraic theory of recombination spaces. Evolutionary Computation 5, 241–275 (1998)
12. Vassilev, V.K., Fogarty, T.C., Miller, J.F.: Information characteristics and the structure of landscapes. Evolutionary Computation 8(1), 31–60 (2000)
13. Wagner, S.: Heuristic Optimization Software Systems - Modeling of Heuristic Optimization Algorithms in the HeuristicLab Software Environment. Ph.D. thesis, Johannes Kepler University, Linz, Austria (2009)
14. Wagner, S., Kronberger, G., Beham, A., Winkler, S., Affenzeller, M.: Modeling of heuristic optimization algorithms. In: Proceedings of the 20th European Modeling and Simulation Symposium, pp. 106–111. DIPTEM University of Genova (2008)
15. Weinberger, E.: Correlated and uncorrelated fitness landscapes and how to tell the difference. Biological Cybernetics 63(5), 325–336 (1990)
16. Wright, S.: The roles of mutation, inbreeding, crossbreeding and selection in evolution. In: Proceedings of the Sixth International Congress of Genetics, vol. 1, pp. 356–366 (1932)