MAXIMUM LIKELIHOOD ESTIMATION USING ACCELERATED GENETIC ALGORITHMS


In: Journal of Applied Statistical Science, Volume 18, Number 3, pp. 1-7, ISSN: 1067-5817, (c) 2011 Nova Science Publishers, Inc.

Füsun Akman (1), Olcay Akman (1), and Joshua W. Hallam (2)
(1) Department of Mathematics, Illinois State University, USA
(2) Department of Mathematics, Michigan State University, USA

Abstract

We implement genetic algorithms, accelerated by population reduction, in maximum likelihood estimation of the parameters of statistical distributions.

Introduction

A genetic algorithm (GA) is an optimization technique inspired by biological evolution operating under natural selection. First popularized by John Holland [6] and extensively studied by Goldberg [3], the technique has been shown to be robust and capable of dealing with highly multimodal and discontinuous search landscapes where traditional optimization techniques fail. Traditional methods, such as hill climbing and derivative-based methods, can often find optimal points in well-behaved landscapes; on multimodal landscapes, however, they may get stuck in local optima, a problem that the structure of genetic algorithms helps avoid. In particular, minimization of the highly nonlinear error surfaces of artificial neural networks remains one of the hurdles in high-density multivariate modeling. For instance, producing a Kohonen map in a typical clustering problem using Self-Organizing Feature Maps (SOFM) requires a significantly long computing time due to the ruggedness of the high-dimensional error surface. Genetic algorithms are routinely employed in these types of optimization problems.

In a genetic algorithm, a group of possible solutions, i.e., a population of chromosomes, is substituted into a fitness function (the function being optimized) and thereby assigned fitness values. The chromosomes with desirable fitness values are allowed to mate with other chromosomes, mutate, and move on to the next generation.
This process is repeated until either a fixed number of generations is reached or the best solution found does not change for many generations. At the end of the algorithm, the chromosome with the highest fitness value is taken as the solution.

To exploit the analogy with natural biological processes, the chromosomes are encoded as binary strings. Let l denote the length of a string. Typically, if the fitness function has K independent variables, then l is an integer multiple of K. The binary string is broken into K parts of equal length, each representing one of the K variables, and each part is converted into a real number based on the range of possible values of the corresponding variable. The fitness associated with the chromosome is calculated by evaluating the fitness function at the K real values obtained from the chromosome. More formally, if f : R^K → R denotes the fitness function and g : {0,1}^l → R^K denotes the transformation from binary strings to real values, then the fitness of a chromosome is

calculated as fitness = f(g(chrom)), and the chromosomes with optimal fitness are chosen for the next generation. This choice, however, is not deterministic. Usually, two chromosomes are selected at random, and the one with the higher fitness is kept; this technique is called binary tournament. Chromosomes can be chosen with replacement in a tournament. The chosen chromosomes are then put into a mating pool, and the process continues until the mating pool reaches the desired population size for the next generation.

To begin creating the next generation, two chromosomes are chosen from the pool and mated. Mating in a GA is analogous to genetic recombination, in which segments of the code are swapped between the two chromosomes. The number of crossover points is up to the user; in our work we used three. After mating, the two new chromosomes are mutated: with a certain small probability, each bit may be flipped from 0 to 1 or from 1 to 0. Crossover and mutation thus create two new chromosomes, which are placed into the next generation's population. These steps are repeated until all pairs in the mating pool have mated. The cycle of selection, mating, and mutation continues from generation to generation, producing better solutions as time progresses.

In a typical GA, the population size remains constant throughout the entire algorithm. We believe this is a poor allocation of resources. Instead, we propose a GA that reduces the population size at every time step, allowing for a larger initial population. Starting with a larger population, the algorithm has a better chance of capturing parts of the optimal solution early. In short, we believe that our method enables the algorithm to find the optimal solution more efficiently.

Accelerated Genetic Algorithms

We have developed three different methods of population reduction for genetic algorithms.
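The decoding, binary-tournament selection, three-point crossover, and bit-flip mutation steps described above can be sketched as follows (a minimal sketch in Python; the function names and default parameters are our own illustration, not from the paper):

```python
import random

def decode(chrom, K, lo, hi):
    """Map a binary string (list of 0/1 bits) to K reals in [lo, hi]."""
    seg = len(chrom) // K
    vals = []
    for k in range(K):
        bits = chrom[k * seg:(k + 1) * seg]
        n = int("".join(map(str, bits)), 2)        # segment as an integer
        vals.append(lo + (hi - lo) * n / (2 ** seg - 1))
    return vals

def binary_tournament(pop, fitness):
    """Pick two chromosomes at random (with replacement); keep the fitter."""
    a, b = random.choice(pop), random.choice(pop)
    return a if fitness(a) >= fitness(b) else b

def crossover(p1, p2, n_points=3):
    """Swap segments between two parents at n_points random cut points."""
    cuts = sorted(random.sample(range(1, len(p1)), n_points))
    c1, c2 = p1[:], p2[:]
    swap, prev = False, 0
    for cut in cuts + [len(p1)]:
        if swap:
            c1[prev:cut], c2[prev:cut] = p2[prev:cut], p1[prev:cut]
        swap, prev = not swap, cut
    return c1, c2

def mutate(chrom, rate=0.01):
    """Flip each bit independently with a small probability."""
    return [1 - b if random.random() < rate else b for b in chrom]
```

One generation is then built by repeatedly filling a mating pool with binary_tournament winners and mating pairs with crossover followed by mutate, where fitness(c) evaluates f at decode(c, K, lo, hi).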
The first is an adaptive measure; the other two are based on a predetermined pattern. We describe the three methods in detail below.

1. Adaptive Population Reduction. Adaptive population sizing means continually changing the population size based on parameters within the algorithm; here, the changes in population size depend on the changes in average fitness and genetic variance. This contrasts with predetermined sizing methods, in which the population size at each generation is independent of the changes in the population and in the fitness values generated. Adaptive measures have been proposed by several authors, and a review of current methods can be found in Lima and Lobo [7]. In this paper, we present a new method based on the change in best fitness. (For convenience, we will assume from now on that "best fitness" means the largest fitness value in the current pool.) A variant of our approach was used earlier by Eiben et al. [2]: their method was to increase the population size if the best fitness increased, decrease it after a short-term lack of fitness increase, and increase it if no change occurred for a long period of time. This approach may have several problems. For example, for the population to be increased, new chromosomes must obviously be created; but if they are created merely by cloning the best existing ones, as in the Eiben et al. study [2], then there is no increase in genetic diversity. Hence it would be more beneficial, in theory, to generate additional random individuals to simulate natural gene flow. Another problem is that fitness typically increases fastest early in a genetic algorithm, which would imply that the population size should grow early in the algorithm. If the individuals are obtained

only by cloning, then the population will lose genetic diversity even faster because of the dominance of the numerous clones with large fitness. If the population size is likely to increase early in the algorithm anyway, simply starting with a larger population would contribute higher genetic diversity while using the same amount of computation.

Our approach takes a very different stand from [2]. We believe that as the best fitness increases, we may reduce the population size and still obtain reasonable results with less computation than a typical genetic algorithm would use. A key point of this approach is that the population size is reduced only when the best fitness improves (the method never increases the population size). In justification, suppose that we wish to optimize over a multimodal fitness landscape. If we start with a large population, then we can hit more points on the rugged landscape. As time passes, however, the solutions will start aggregating around certain areas (hopefully near the optimum), so we can reduce the population size without much loss of accuracy. In other words, the reduced complexity of the remaining search allows a smaller population to optimize the problem with the same or better results than a larger one. Since the chromosomes with the best fitness will be allowed to mate often, the solutions will continue to concentrate around the desired solution; thus, the change in best fitness is a good indicator of how well the algorithm is performing. The small population size, with the implementation of elitism, allows genetic drift to fine-tune the solution without losing the best solution in the process: suppose that the population has aggregated in a small partition of the search space such that there are only slight changes in fitness.
At this point, it is more economical to have a small population, because a chromosome whose fitness differs only slightly from that of the optimal solution has a better chance of being chosen to participate in a tournament. (Although the choice to participate in the tournament is random, with a smaller population every chromosome has a better chance of being chosen.) Thus, those with slightly better fitness can participate and be chosen for the mating pool. At the same time, this phase of the algorithm is merely choosing between solutions that differ little, and it is less important than the phase making large jumps in fitness.

We have developed a formula to quantify the amount of reduction, based on the idea that the population size should be reduced proportionally to the change in best fitness. Let N_t be the population size at generation t, and denote the relative change in best fitness at generation t by

    Δf_t^best = | (f_{t-1}^best − f_{t-2}^best) / f_{t-2}^best |.

We use the absolute value to handle fitness values that can be both positive and negative. We then choose a threshold parameter Δf^best and set

    N_{t+1} = (1 − Δf_t^best) N_t,   if Δf_t^best ≤ Δf^best,
    N_{t+1} = (1 − Δf^best) N_t,     if Δf_t^best > Δf^best,          (1)
    N_{t+1} = MIN_POPSIZE,           if the value above would fall below MIN_POPSIZE.

When this type of decrease is used, we implement elitism, allowing the best chromosome to continue to the next generation unchanged, so that the change in best fitness is always nonnegative. Clearly Δf^best < 1; based on empirical evidence, this value should be chosen in the interval [0.05, 0.2]. As a side note, the typical genetic algorithm is a special case of our method, with Δf^best = 0. The choice of minimum population size is arbitrary; however, to avoid the negative effects of extremely small populations, we set MIN_POPSIZE to 20 based on work by Reeves [9].
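The adaptive update rule can be sketched in a few lines (a minimal sketch; the function and variable names are ours, and rounding to an integer population size is our assumption):

```python
MIN_POPSIZE = 20  # minimum population size, following Reeves [9]

def next_pop_size(n_t, f_prev, f_prev2, threshold=0.1):
    """Adaptive population reduction, eq. (1).

    n_t: current population size N_t; f_prev, f_prev2: best fitness at
    generations t-1 and t-2; threshold: the parameter Δf^best.
    """
    delta = abs((f_prev - f_prev2) / f_prev2)  # relative change in best fitness
    if delta <= threshold:
        n_next = (1 - delta) * n_t
    else:
        n_next = (1 - threshold) * n_t
    return max(round(n_next), MIN_POPSIZE)     # never drop below MIN_POPSIZE
```

For example, with N_t = 100 and a 5% improvement in best fitness, the next generation has 95 chromosomes; any improvement beyond the threshold caps the reduction at (1 − Δf^best) N_t.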
As can be seen from the formula, the shape of the population curve is an exponential decrease, followed by a steady section, followed by another exponential decrease,

and this pattern continues.

2. Predetermined Exponential Decrease. Although the adaptive method produces a population curve with segments of exponential decrease, it requires computing Δf_t^best at every generation, as well as choosing Δf^best. We now present a method that requires neither and reduces the population exponentially. The Schema Theorem [6] shows that, on average, the number of highly fit schemata increases exponentially. Based on this fact, we believe that we can reduce the population size exponentially and obtain results comparable to those of an algorithm with no reduction. The reduction uses the formula

    N(t) = N_0 e^{ct},  where c = ln(N_END / N_0) / (number of generations),   (2)

and N_END denotes the population size at the end of the algorithm. We set N_END = 20, in agreement with the minimum population size used in the adaptive method.

3. Predetermined Linear Decrease. It is not possible to predict the shape of the exponential increase of schemata without direct and complicated calculations during the algorithm. Therefore, we have also developed a reduction method that is not exponential but instead decreases the population size linearly. This avoids reducing the population too quickly while still cutting the number of computations relative to a traditional genetic algorithm. The population size at each generation is given by

    N(t) = N_0 − mt,  where m = (N_0 − N_END) / (number of generations).   (3)

A discussion of the performance of the methods described above can be found in [1].

Diversifying

In addition to reducing the population size over time, we have developed another method to increase the efficiency of a genetic algorithm. According to Fisher's Fundamental Theorem of Natural Selection [5], the increase in the mean fitness of a population is equal to the variance in fitness.
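Before moving on, the two predetermined reduction schedules, eqs. (2) and (3), can be sketched as follows (names and integer rounding are our own illustration):

```python
import math

def exponential_schedule(n0, n_end, generations):
    """Population sizes N(t) = N_0 e^{ct}, c = ln(N_end/N_0)/generations (eq. 2)."""
    c = math.log(n_end / n0) / generations
    return [max(round(n0 * math.exp(c * t)), n_end) for t in range(generations + 1)]

def linear_schedule(n0, n_end, generations):
    """Population sizes N(t) = N_0 - m t, m = (N_0 - N_end)/generations (eq. 3)."""
    m = (n0 - n_end) / generations
    return [max(round(n0 - m * t), n_end) for t in range(generations + 1)]
```

Both schedules start at N_0 and reach N_END at the final generation; the exponential schedule spends more of its budget on the earliest generations, where the Schema Theorem suggests the gains are largest.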
In genetic algorithms, the easiest way to increase the variance in fitness would be to allow every possible solution to be represented in the population; of course, this is equivalent to an exhaustive search. We believe the next best procedure is to force the population to start with the highest possible variance at each position of the chromosome. Since each position is either 0 or 1, this means that, ideally, each position carries the same number of 0's and 1's across the entire population. To implement this, half of the initial population is randomly generated; the other half is then generated by taking each chromosome in the first half and flipping each of its bits from 1 to 0 or from 0 to 1. We call this process diversifying. Besides increasing the variance at each position, the procedure guarantees that, within one generation, recombination alone could generate the optimal solution. This does not make mutation unnecessary, since selection acts on the entire string and not on individual positions; because selection reduces the variance at each position, mutation is still required to maintain some variance.
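The diversifying initialization can be sketched as follows (a minimal sketch; the function name is ours, and an even population size is assumed):

```python
import random

def diversified_population(pop_size, chrom_len, rng=random):
    """Initial population in which every bit position holds an equal number
    of 0's and 1's: half random chromosomes, half their bitwise complements.
    Assumes pop_size is even."""
    half = [[rng.randint(0, 1) for _ in range(chrom_len)]
            for _ in range(pop_size // 2)]
    complements = [[1 - bit for bit in chrom] for chrom in half]
    return half + complements
```

Since each random chromosome is paired with its complement, every column of the population sums to exactly pop_size/2, giving the maximum per-position variance described above.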

Maximum Likelihood Estimation

Choosing the correct parameters for given data, which one believes come from a specific distribution, is a global optimization problem. In many cases, the search space of the parameters is multimodal, and finding the correct solution may be difficult for classic search and optimization techniques. Genetic algorithms provide a useful tool for exploring possible solutions. Let f(x | θ) be a p.d.f. with x = (x_1, x_2, ..., x_n) and parameter vector θ = (θ_1, θ_2, ..., θ_k), and let L(θ | x) be the likelihood function that we wish to optimize with respect to θ. As is commonplace, ln L is optimized instead of L. This transformation is helpful when using a GA: the fitness values for L would lie in the interval [0, 1], which makes the fitness space flat and the convergence of the solution slow, whereas the log likelihood expands the fitness space to the interval (−∞, 0) and gives better contrast between possible solutions.

To use genetic algorithms for maximum likelihood estimation, the following procedure should be used. First, generate a bootstrap sample of size n from x. Using this sample, maximize the likelihood (or log likelihood) function with the genetic algorithm to obtain a parameter vector θ̂ that reflects the maximum for the bootstrap sample. Repeat this process m times to generate θ̂_1, θ̂_2, ..., θ̂_m. From these m solutions, construct a covariance matrix for the parameter vectors; the overall best fit θ̂ is the average of the best fits found in the m runs of the genetic algorithm.

Table 1. Covariance Matrix for Airplane 7907.

          p             µ              λ
p     0.1386936      2.663881       20.01815
µ     2.6638812    304.957664     1189.76515
λ    20.0181523   1189.765153     9304.16663

Table 2. Covariance Matrix for Airplane 7909.

          p             µ              λ
p     0.09958262     2.705328       3.141576
µ     2.70532883   130.961146     181.918055
λ     3.14157588   181.918055     501.227123

An Application

To illustrate an application, we consider the aircraft data given by Proschan in [8], to which Gupta and Akman fitted the mixture inverse Gaussian distribution in [4]. If x is inverse Gaussian, then the p.d.f. is given by

    f(x | µ, λ, p) = (λ / (2πx³))^{1/2} exp( −λ(x − µ)² / (2µ²x) ) (1 − p + px/µ),   (4)

with parameters λ > 0, µ > 0, and 0 ≤ p ≤ 1. This produces the following log likelihood, given a random sample X_1, X_2, ..., X_n (all sums running over i = 1, ..., n):

    L = (n/2) ln λ − (λ/(2µ²)) Σ x_i − (n/2) ln(2π) − (3/2) Σ ln x_i + λn/µ − (λ/2) Σ (1/x_i) + Σ ln(1 − p + p x_i/µ).   (5)

We used the data for airplane numbers 7907 and 7909 to find the optimal parameter vector (p, µ, λ), following the procedure outlined above with 100 bootstrap samples and a GA with adaptive reduction and diversification. The covariance matrices are given in Table 1 (Airplane 7907) and Table 2 (Airplane 7909). Averaging over the 100 runs gives the optimal parameter vectors for 7907 and 7909, shown in Table 3.

Table 3. Optimal values of p, µ, λ.

            p            µ             λ
7907    0.3113695   48.9713807   112.2475586
7909    0.5970175   56.6431133    71.147073

Conclusion

Using a GA in maximum likelihood estimation is a viable method of obtaining parameter estimates, especially considering the small variance attained as a result. We believe that using a GA becomes even more worthwhile when the likelihood surface is more rugged (and has more dimensions) than those of the usual distributions.

Acknowledgment

We would like to thank Dr. Nader Ebrahimi of Northern Illinois University for his suggestions and guidance.

References

[1] Akman, O., Hallam, J., and Akman, F. (2010). Genetic algorithms with shrinking population size. Computational Statistics, 25, 691-705.

[2] Eiben, A. E., Marchiori, E., and Valko, V. A. (2004). Evolutionary algorithms with on-the-fly population size adjustment. In Parallel Problem Solving from Nature, PPSN VIII, volume 3242 of Lecture Notes in Computer Science, 41-50. Springer.

[3] Goldberg, D. E. (1989). Genetic Algorithms in Search, Optimization, and Machine Learning. Addison-Wesley, Reading, MA.

[4] Gupta, R. and Akman, O. (1995). On the reliability studies of a weighted inverse Gaussian model. Journal of Statistical Planning and Inference, 69-83.

[5] Fisher, R. A. (1930). The Genetical Theory of Natural Selection. Clarendon Press, Oxford.

[6] Holland, J. (1975). Adaptation in Natural and Artificial Systems. The MIT Press.

[7] Lima, C. F. and Lobo, F. G. (2005). A review of adaptive population sizing schemes in genetic algorithms. In Proceedings of the 2005 Workshops on Genetic and Evolutionary Computation, 228-234. ACM.

[8] Proschan, F. (1963). Theoretical explanation of observed decreasing failure rate. Technometrics, 5, 375.

[9] Reeves, C. R. (1993). Using genetic algorithms with small populations. In Proceedings of the 5th International Conference on Genetic Algorithms, 92-97. Morgan Kaufmann.