Genetic Algorithm
9/7/2017

Outline
Motivation
Genetic algorithms
An illustrative example
Hypothesis space search

Motivation
Evolution is known to be a successful, robust method for adaptation within biological systems.
GAs can search spaces of hypotheses containing complex interacting parts.
GAs are easily parallelized and can take advantage of the decreasing costs of powerful computer hardware.

Introduction of GAs
A genetic algorithm (or GA) is a search technique used in computing to find true or approximate solutions to optimization and search problems.
Genetic algorithms are categorized as global search heuristics.
Genetic algorithms are a particular class of evolutionary algorithms that use techniques inspired by evolutionary biology such as inheritance, mutation, selection, and crossover (also called recombination).

Introduction of GAs, cont.
Genetic algorithms are implemented as a computer simulation in which a population of abstract representations (called chromosomes, the genotype, or the genome) of candidate solutions (called individuals) to an optimization problem evolves toward better solutions.
Traditionally, solutions are represented in binary as strings of 0s and 1s, but other encodings are also possible.

Introduction of GAs, cont.
The evolution usually starts from a population of randomly generated individuals and happens in generations.
In each generation, the fitness of every individual in the population is evaluated, multiple individuals are selected from the current population (based on their fitness), and modified (recombined and possibly mutated) to form a new population.
The new population is then used in the next iteration of the algorithm.
Commonly, the algorithm terminates when either a maximum number of generations has been produced, or a satisfactory fitness level has been reached for the population.
If the algorithm has terminated due to a maximum number of generations, a satisfactory solution may or may not have been reached.

Biological Background: Chromosomes
Genetic information is stored in the chromosomes.
Each chromosome is built of DNA.
Chromosomes in humans form pairs. There are 23 pairs.
The chromosome is divided into parts: genes.
Genes code for properties.
Every gene has a unique position on the chromosome: its locus.

Antigenic Shift
2009 H1N1 Influenza Virus

Antigenic Drift
Antigenic distance changes between mutant influenza viruses and their respective wild-type influenza viruses.
RNA replication (transcriptase/RNA polymerase) -> RNA error (no proofreading) -> mutant protein -> vaccine loses effectiveness

Biological Background: Reproduction
During reproduction, errors occur.
Due to these errors, genetic variation exists.
The most important errors are:
  Recombination (crossover)
  Mutation

Prototype of GA
Fitness: a predefined numerical measure for the problem at hand.
Population: the algorithm operates by iteratively updating a pool of hypotheses, called the population.

Genetic Algorithm
GA(Fitness, Fitness_threshold, p, r, m)
  Fitness: a function that assigns an evaluation score, given a hypothesis
  Fitness_threshold: a threshold specifying the termination criterion
  p: the number of hypotheses to be included in the population
  r: the fraction of the population to be replaced by Crossover at each step
  m: the mutation rate

Genetic Algorithm, cont.
Initialize population: P ← generate p hypotheses at random
Evaluate: for each h in P, compute Fitness(h)
While [max_h Fitness(h)] < Fitness_threshold do
  Create a new generation P_S:
  1. Select: (1-r)p members from P to P_S
  2. Crossover: r·p/2 pairs
  3. Mutate
  4. Update: P ← P_S
  5. Evaluate: update Fitness(h)
Return the hypothesis from P that has the highest fitness
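The prototype above can be sketched in Python roughly as follows. This is a minimal illustration, not a reference implementation: the bit-string encoding, the `length`, `max_generations` and `seed` parameters, the single-point crossover, the reading of `m` as a fraction between 0 and 1, and the `count_ones` fitness are all assumptions made for the example. Fitness values are assumed to be non-negative and `length` at least 2.

```python
import random

def ga(fitness, fitness_threshold, p, r, m, length, max_generations=200, seed=None):
    """Sketch of GA(Fitness, Fitness_threshold, p, r, m) over bit strings."""
    rng = random.Random(seed)
    # Initialize population: p random bit strings.
    population = ["".join(rng.choice("01") for _ in range(length)) for _ in range(p)]
    for _ in range(max_generations):  # extra safety cap (assumption)
        # Evaluate
        scores = [fitness(h) for h in population]
        if max(scores) >= fitness_threshold:
            break
        # Fitness-proportionate weights; uniform if every score is zero.
        weights = scores if sum(scores) > 0 else [1] * p
        n_pairs = int(r * p / 2)
        # 1. Select: carry over about (1 - r) * p members.
        new_population = rng.choices(population, weights=weights, k=p - 2 * n_pairs)
        # 2. Crossover: each selected pair yields two offspring.
        for _ in range(n_pairs):
            a, b = rng.choices(population, weights=weights, k=2)
            point = rng.randint(1, length - 1)
            new_population += [a[:point] + b[point:], b[:point] + a[point:]]
        # 3. Mutate: flip one random bit in a fraction m of the new members.
        for i in rng.sample(range(p), int(m * p)):
            j = rng.randrange(length)
            h = new_population[i]
            new_population[i] = h[:j] + ("0" if h[j] == "1" else "1") + h[j + 1:]
        # 4. Update (the next loop iteration re-evaluates, step 5).
        population = new_population
    return max(population, key=fitness)

def count_ones(h):
    """Hypothetical fitness for the demo: number of 1 bits."""
    return h.count("1")

best = ga(count_ones, fitness_threshold=8, p=20, r=0.6, m=0.1, length=8, seed=0)
print(best, count_ones(best))
```

Because of the generation cap, the returned string is the best one found and may fall short of the threshold.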

Genetic Algorithm, cont.
Select: probabilistically select (1-r)p members of P to add to P_S, where the probability of selecting hypothesis h_i is
\[ \Pr(h_i) = \frac{Fitness(h_i)}{\sum_{j=1}^{p} Fitness(h_j)} \]
Crossover: probabilistically select r·p/2 pairs of hypotheses from P, according to Pr(h_i). Each pair produces two offspring, which are added to P_S.
Mutate: choose m percent of the members of P_S with uniform probability. For each, invert one randomly selected bit in its representation.

Bit String
Hypotheses in GAs are often represented by bit strings, so that they can be easily manipulated by genetic operators.
Rule precondition and postcondition: IF Wind = Strong THEN PlayTennis = yes

Bit String, cont.
Outlook: Sunny, Overcast or Rain
Wind: Strong or Weak

Rule: (Outlook = Overcast ∨ Rain) ∧ (Wind = Strong)
Representation:  Outlook  Wind
                 011      10

Rule: IF Wind = Strong THEN PlayTennis = yes
Representation:  Outlook  Wind  PlayTennis
                 111      10    10

Note that the string representing the rule contains a substring for each attribute in the hypothesis space.
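One way to produce such rule strings is to give each attribute one bit per possible value. The sketch below assumes that convention. It further assumes that an unconstrained attribute is written as all 1s and that values are ordered as listed in `ATTRIBUTES`. The `ATTRIBUTES` table and the `encode` helper are hypothetical names for this illustration.

```python
# Attribute values in a fixed order; one bit per value (assumed convention).
ATTRIBUTES = {
    "Outlook": ["Sunny", "Overcast", "Rain"],
    "Wind": ["Strong", "Weak"],
    "PlayTennis": ["yes", "no"],
}

def encode(constraints, attribute_names):
    """Encode a rule as one substring per attribute.

    A bit is 1 when the rule allows that value; an attribute that the
    rule does not mention allows every value, so it becomes all 1s.
    """
    parts = []
    for name in attribute_names:
        values = ATTRIBUTES[name]
        allowed = constraints.get(name, values)
        parts.append("".join("1" if v in allowed else "0" for v in values))
    return " ".join(parts)

# (Outlook = Overcast or Rain) and (Wind = Strong)
precondition = encode({"Outlook": ["Overcast", "Rain"], "Wind": ["Strong"]},
                      ["Outlook", "Wind"])

# IF Wind = Strong THEN PlayTennis = yes
rule = encode({"Wind": ["Strong"], "PlayTennis": ["yes"]},
              ["Outlook", "Wind", "PlayTennis"])

print(precondition)
print(rule)
```

With a fixed-length substring per attribute, crossover and mutation can act on any bit position and the result is still a syntactically valid rule string.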

Genetic Operators
Crossover
  Produces two new offspring from two parent strings by copying selected bits from each parent.
  The crossover mask determines which parent contributes the bit at each position.
  In single-point crossover, the crossover point n is chosen at random.
  In the case of uniform crossover, the crossover mask is generated as a random bit string.
Mutation
  Produces small random changes to a parent string by choosing a single bit at random, then changing its value.

[Figure: single-point crossover, two-point crossover, uniform crossover, and point mutation, each shown as initial strings, crossover mask, and offspring]

Population Evolution and the Schema Theorem
Each schema represents the set of bit strings containing the indicated 0s and 1s, with each * matching either bit.
0*10 represents the set of bit strings that includes exactly 0010 and 0110.
An individual bit string can be viewed as a representative of each of the different schemas that it matches.
0010 can be thought of as a representative of 2^4 = 16 distinct schemas, including 00**, 0*10, and ****.
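The crossover and mutation operators can be sketched with an explicit crossover mask, as below. The function names and the two 11-bit parent strings are illustrative choices for this sketch. A 1 in the mask is assumed to mean "the first offspring copies this bit from the first parent".

```python
import random

def crossover(parent_a, parent_b, mask):
    """Return two offspring built from the parents according to the mask.

    Offspring 1 copies parent_a where the mask bit is 1 and parent_b
    elsewhere; offspring 2 is the mirror image.
    """
    triples = list(zip(parent_a, parent_b, mask))
    child_1 = "".join(a if m == "1" else b for a, b, m in triples)
    child_2 = "".join(b if m == "1" else a for a, b, m in triples)
    return child_1, child_2

def single_point_mask(length, n):
    """n leading 1s followed by 0s; n is the crossover point."""
    return "1" * n + "0" * (length - n)

def two_point_mask(length, start, end):
    """A run of 1s from position start up to (not including) end."""
    return "0" * start + "1" * (end - start) + "0" * (length - end)

def uniform_mask(length, rng):
    """Every mask bit is drawn independently at random."""
    return "".join(rng.choice("01") for _ in range(length))

def mutate(bits, rng):
    """Flip one randomly chosen bit."""
    i = rng.randrange(len(bits))
    return bits[:i] + ("0" if bits[i] == "1" else "1") + bits[i + 1:]

rng = random.Random(0)
a, b = "11101001000", "00001010101"  # illustrative parent strings

print(crossover(a, b, single_point_mask(len(a), 5)))
print(crossover(a, b, two_point_mask(len(a), 2, 7)))
print(crossover(a, b, uniform_mask(len(a), rng)))
print(mutate(a, rng))
```

Expressing all three crossover variants through one mask-driven function keeps the operators interchangeable; only the mask generator differs.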

Schema Theorem
Let m(s,t) denote the number of instances of schema s in the population at time t.
The schema theorem describes the expected value of m(s,t+1).
Let us start by considering just the effect of the selection step. Notation:
  f(h): the fitness of the individual bit string h
  \bar{f}(t): the average fitness of all individuals at time t
  n: the total number of individuals
  h \in s \cap p_t: h is both a representative of schema s and a member of the population at time t

Schema Theorem, cont.
The probability of selecting hypothesis h is
\[ \Pr(h) = \frac{f(h)}{\sum_{i=1}^{n} f(h_i)} = \frac{f(h)}{n \bar{f}(t)} \]

Schema Theorem, cont.
Let \hat{u}(s,t) denote the average fitness of instances of schema s in the population at time t:
\[ \hat{u}(s,t) = \frac{\sum_{h \in s \cap p_t} f(h)}{m(s,t)} \]
The probability that we will select a representative of schema s is
\[ \Pr(h \in s) = \sum_{h \in s \cap p_t} \frac{f(h)}{n \bar{f}(t)} = \frac{\hat{u}(s,t)}{n \bar{f}(t)} \, m(s,t) \]

Schema Theorem, cont.
Thus, the expected number of instances of s resulting from the n independent selection steps is n times this probability:
\[ E[m(s,t+1)] = \frac{\hat{u}(s,t)}{\bar{f}(t)} \, m(s,t) \]
If crossover and mutation are also considered:
\[ E[m(s,t+1)] \ge \frac{\hat{u}(s,t)}{\bar{f}(t)} \, m(s,t) \left(1 - p_c \frac{d(s)}{l-1}\right) (1 - p_m)^{o(s)} \]
where
  p_c is the probability that the single-point crossover operator will be applied to an arbitrary individual
  p_m is the probability that an arbitrary bit of an arbitrary individual will be mutated
  o(s) is the number of defined bits in schema s
  d(s) is the distance between the leftmost and rightmost defined bits in s
  l is the length of the individual bit strings
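The quantities in the schema theorem can be checked numerically on a small population. The sketch below is an illustration under assumptions: the four 5-bit strings, the squared-integer fitness `f`, and the helper names are choices made for the example. Leaving `p_c` and `p_m` at 0 gives the selection-only expectation.

```python
def matches(schema, bits):
    """True if bits is an instance of schema ('*' matches either bit)."""
    return len(schema) == len(bits) and all(c in ("*", b) for c, b in zip(schema, bits))

def order(schema):
    """o(s): number of defined (non-*) positions."""
    return sum(c != "*" for c in schema)

def defining_length(schema):
    """d(s): distance between the leftmost and rightmost defined positions."""
    defined = [i for i, c in enumerate(schema) if c != "*"]
    return defined[-1] - defined[0] if defined else 0

def expected_instances(schema, population, f, p_c=0.0, p_m=0.0):
    """Lower bound on E[m(s, t+1)]; exact for selection alone (p_c = p_m = 0)."""
    instances = [h for h in population if matches(schema, h)]
    m_st = len(instances)
    if m_st == 0:
        return 0.0
    u_hat = sum(f(h) for h in instances) / m_st               # average schema fitness
    f_bar = sum(f(h) for h in population) / len(population)   # average population fitness
    l = len(schema)                                           # assumes l >= 2
    survive = (1 - p_c * defining_length(schema) / (l - 1)) * (1 - p_m) ** order(schema)
    return u_hat / f_bar * m_st * survive

def f(bits):
    """Illustrative fitness: the square of the integer the bits encode."""
    return int(bits, 2) ** 2

population = ["01101", "11000", "01000", "10011"]

print(expected_instances("1****", population, f))
print(expected_instances("1***0", population, f, p_c=0.7, p_m=0.01))
```

A schema whose instances are fitter than average and whose defined bits sit close together (small d(s), small o(s)) keeps most of its expected growth after crossover and mutation.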

An illustrative example
f(s) = s^2 (s < 32)

Initial population:
  s1 = 13 (01101)
  s2 = 24 (11000)
  s3 = 8 (01000)
  s4 = 19 (10011)

Fitness:
  f(s1) = f(13) = 13^2 = 169
  f(s2) = f(24) = 24^2 = 576
  f(s3) = f(8) = 8^2 = 64
  f(s4) = f(19) = 19^2 = 361

Four random values: r1 = 0.450, r2 = 0.110, r3 = 0.57, r4 =

Index  S      Fitness  Percent  # selection
1      01101  169      14.4%    1
2      11000  576      49.2%    2
3      01000  64       5.5%     0
4      10011  361      30.9%    1

After selection:
  s1 = 11000 (24), s2 = 01101 (13)
  s3 = 11000 (24), s4 = 10011 (19)
Crossover (last two positions):
  s1 = 11001 (25), s2 = 01100 (12)
  s3 = 11011 (27), s4 = 10000 (16)
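The first generation of this example can be reproduced with a short script. It is a sketch under assumptions: roulette-wheel selection is implemented by comparing each random value against cumulative fitness, and `spin` and `swap_tail` are hypothetical helper names. Only the three random values given in the text are used.

```python
from bisect import bisect_left
from itertools import accumulate

population = ["01101", "11000", "01000", "10011"]
fitness = [int(s, 2) ** 2 for s in population]   # f(s) = s^2
total = sum(fitness)
percent = [100 * f / total for f in fitness]
cumulative = list(accumulate(fitness))           # roulette-wheel boundaries

def spin(r):
    """Pick the individual whose cumulative-fitness slot contains r (0 <= r < 1)."""
    return population[bisect_left(cumulative, r * total)]

def swap_tail(a, b, k):
    """Single-point crossover that exchanges the last k bits of a and b."""
    return a[:-k] + b[-k:], b[:-k] + a[-k:]

for s, f, pct in zip(population, fitness, percent):
    print(s, f, f"{pct:.1f}%")

selected = [spin(r) for r in (0.450, 0.110, 0.57)]
print(selected)

print(swap_tail("11000", "01101", 2))
print(swap_tail("11000", "10011", 2))
```

Working with integer cumulative fitness rather than cumulative percentages avoids floating-point drift at the slot boundaries.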

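Second generation

Index  S      Fitness  Percent  # selection
1      11001  625      35.6%    1
2      01100  144      8.2%     1
3      11011  729      41.6%    1
4      10000  256      14.6%    1

After selection:
  s1 = 11001 (25), s2 = 01100 (12)
  s3 = 11011 (27), s4 = 10000 (16)
Crossover (last three positions):
  s1 = 11100 (28), s2 = 01001 (9)
  s3 = 11000 (24), s4 = 10011 (19)

Third generation

Index  S      Fitness  Percent  # selection
1      11100  784      43.5%    2
2      01001  81       4.5%     0
3      11000  576      32.0%    1
4      10011  361      20.0%    1

After selection:
  s1 = 11100 (28), s2 = 11100 (28)
  s3 = 11000 (24), s4 = 10011 (19)
Crossover (last two positions):
  s1 = 11111 (31), s2 = 11100 (28)
  s3 = 11000 (24), s4 = 10000 (16)

A two-variable example
max f(x_1, x_2)
s.t. x_1 ∈ {1, 2, 3, 4, 5, 6, 7}
     x_2 ∈ {1, 2, 3, 4, 5, 6, 7}
Using 3 bits to represent one variable. Therefore, 6 bits are used for the two variables. For example, 110001 means x_1 = 6 and x_2 = 1.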
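The 6-bit encoding of the two variables can be sketched as a pair of helper functions. The names `encode` and `decode` are hypothetical, and the layout (first three bits for the first variable, last three for the second, plain unsigned binary) is the assumption being illustrated.

```python
def encode(x1, x2):
    """Pack two integers in 0..7 into a 6-bit chromosome, 3 bits each."""
    return format(x1, "03b") + format(x2, "03b")

def decode(chromosome):
    """Split a 6-bit chromosome back into its two integer variables."""
    return int(chromosome[:3], 2), int(chromosome[3:], 2)

print(encode(6, 1))
print(decode("110001"))
```

Note that three bits can also spell 000, which decodes to 0 and lies outside the stated range 1 to 7; a full implementation would need to reject or repair such chromosomes.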

Initialize population
P(0): ..., 101011, 011100, ...

Fitness function
f(x_1, x_2) = x_1^2 + x_2^2

[Table: Index, P(0), x_1, x_2, Fitness, Percent, # selection; with a Sum row]

[Table: Index, selection, Pair, Crossover position, Crossover results]

[Table: Index, Crossover results, Mutation site, Mutation results]

[Table: Index, P(0), x_1, x_2, Fitness, Percent; with a Sum row]

