Running Programs Backwards: Instruction Inversion for Effective Search in Semantic Spaces


Bartosz Wieloch, Krzysztof Krawiec
Institute of Computing Science, Poznan University of Technology
Piotrowo 2, Poznań, Poland

ABSTRACT

The instructions used for solving typical genetic programming tasks have strong mathematical properties. In this study, we leverage one such property: invertibility. We propose a search operator that performs an approximate reverse execution of program fragments, trying to determine in this way the desired semantics (partial outcome) at intermediate stages of program execution. The desired semantics determined in this way guides the choice of a subprogram that replaces the old program fragment. An extensive computational experiment on 20 symbolic regression and Boolean domain problems provides statistically significant evidence that the proposed Random Desired Operator outperforms all typical combinations of conventional mutation and crossover operators.

Categories and Subject Descriptors: I.2.2 [Artificial Intelligence]: Automatic Programming; I.2.8 [Artificial Intelligence]: Problem Solving, Control Methods, and Search: Heuristic methods

Keywords: genetic programming, program semantics, desired semantics, search operators, instruction inversion

1. INTRODUCTION

The conventional search operators used in genetic programming (GP) make few or no assumptions about the properties of the instructions in the programming language under consideration. Essentially, the only attribute of an instruction that such operators have to be aware of is its arity (and the types of the instruction's inputs and outputs, if types are considered). A common argument for this attitude is generality: by abstracting from the internals of instructions, operators like tree-swapping crossover or subtree-replacing mutation can indeed be regarded as universal.

This argument is, however, strikingly inconsistent with GP practice, which to a great extent focuses on symbolic regression, Boolean function synthesis, and other programming domains with instruction sets strongly rooted in mathematical foundations. These foundations bestow the instructions with formal properties that can potentially be exploited. Following this observation, this paper proposes a straightforward method for exploiting the property of (partial or complete) invertibility of instructions. This property, shared by most arithmetic and logic instructions, enables us to determine the desired semantics at any arbitrary location (locus) in a program. This ability is leveraged in a new GP search operator that is the main contribution of this paper, and which turns out to significantly outperform the conventional operators.

2. GENERAL IDEA

In this paper, for simplicity, we concentrate on the canonical tree-based GP as proposed by Koza [6]. We assume that subprograms (subtrees) in programs can be replaced by independently generated subtrees (procedures). This assumption is not constraining, as it is also required by the standard GP search operators.
Procedures can be executed independently of the rest of a program. Moreover, the execution of a subprogram returns a certain result, in exactly the same manner as the execution of the entire program does. Let us emphasize, however, that our approach can likewise be adopted for other variants of GP, like Linear GP [1] or Cartesian GP [10]. Moreover, the proposed general idea may be used even in non-evolutionary metaheuristics; for example, it may be applied as an operator for generating a neighbor solution in local search.

In the following, we limit our considerations to GP tasks in which fitness calculation is based on fitness cases, meant as a list of pairs composed of input data and the corresponding desired (correct) output. In such a case, following [8], we may define the program semantics as the list of outputs that are actually produced by the program for all fitness cases. Additionally, by the target semantics we mean the semantics of an ideal solution (i.e., the target semantics equals the list of desired outputs defined by the fitness cases). Obviously, the closer the semantics of a program is to the target semantics, the better the program. Therefore, the distance between the semantics of an individual and the target semantics can be treated as a minimized fitness value. In practice, most benchmark problems considered in the literature define the fitness value in this way, even if the term program semantics is not used explicitly.
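To make the semantics-as-a-list view concrete, here is a minimal sketch (not the authors' code; the absolute-difference distance is an illustrative assumption, as no particular metric is fixed at this point):

def semantics(program, fitness_cases):
    # The semantics of a program is the list of its outputs on all fitness cases.
    return [program(x) for x, _ in fitness_cases]

def fitness(program, fitness_cases):
    # Minimized fitness: distance between the program's and the target semantics.
    outputs = semantics(program, fitness_cases)
    targets = [y for _, y in fitness_cases]
    return sum(abs(o - t) for o, t in zip(outputs, targets))

# Three fitness cases of the target x**2 - x used as an example later on.
cases = [(-1.0, 2.0), (0.0, 0.0), (1.0, 0.0)]
print(fitness(lambda x: x * x - x, cases))  # 0.0, i.e., an ideal solution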

Now, let us imagine that we have an oracle that can tell us what any given fragment (called a part in the following) of a program should return for a given fitness case to make the entire program yield the correct output. Because the oracle's verdicts do not depend on the actual part in question, only the rest of the program (called the context) is relevant. In the case of tree-based GP, we will identify a context with an incomplete tree that misses a single branch (part). Thus, an entire program can be assembled by combining a context with a part (subtree). We will say that a context accepts a part if the resulting program returns the correct output. For a given context, the list of values obtained from the oracle for each fitness case constitutes the desired semantics of the context (desired semantics for short).

The rather obvious yet important observation for this study is that finding a part (subprogram) with semantics equal to the desired semantics of some context is equivalent to solving the entire GP task, as such a part combined with the context forms the ultimate, optimal solution. In this way, the ideal solution could be created in just a single step. Because it is not always possible (or technically feasible) to find a part with semantics equal to the desired semantics, we are interested here in minimizing this discrepancy. Our hypothesis is that decreasing the distance between the desired semantics and the semantics of a part can at least bring us closer to solving the problem.

To apply this idea in practice, we need two things: (1) the oracle (i.e., a computationally feasible method of calculating the desired semantics), and (2) a source of parts to be matched to a given context. In the next section, we present an algorithm for calculating the approximate desired semantics for two different problem domains: real-valued functions (symbolic regression problems) and the Boolean domain (synthesis of logic functions). In relation to the second requirement, there are many possible ways to provide the set of parts to match the context (called the library in the following). It may be a library of intentionally designed subprograms, a sample of random subprograms, or even an exhaustive set of all programs within certain constraints (e.g., maximal size [7]). In this paper we present an efficient and uncontroversial form of library, suitable for population-based approaches: the library of parts is constructed from all parts of the individuals in the current population. Apart from computational efficiency, this approach avoids, among other things, human bias: the library contains only code fragments that have already evolved.

3. DESIRED SEMANTICS

In this section we describe the concept of desired semantics in more detail. Firstly, we show the possible types of the oracle's answers. These considerations dictate to some extent the representation of desired semantics used in this paper, which we describe subsequently. Finally, we present a simple method for calculating the desired semantics for so-called partially invertible instructions.
3.1 Possible Situations

In general, it is convenient to assume that the oracle, when queried with a specific context, returns a set of desired values. The reason for this choice is that we have to consider the following four possible situations:

1. There exists exactly one value that causes the context to return the correct output.
2. There exists more than one such value (either a finite or an infinite number of them).
3. Any value fed into the context causes the semantics of the entire program to reach the target. In other words, the missing part in this context is an intron and has no influence on the final behavior of the program.
4. No matter what is fed into the context, the resulting program will not attain the target. In such a case, the context itself has to be changed.

Figure 1: Examples of the four situations concerning the oracle's answer for a zero-valued target (i.e., t = 0): (a) one value, 1 − (□ · 1); (b) multiple values (two: 1 − □^2; infinitely many: 1 − cos(□)); (c) any value (the value is insignificant), 0 · sin(□); (d) none (inconsistent context), 1 + exp(□). The node □ represents the missing subtree of the context.

To illustrate these situations, let us assume that the task is to evolve a real-valued mathematical expression that identically equals zero, i.e., its target semantics is t = 0. For this problem, we can easily design contexts that represent the above categories, shown in Figure 1. The first context encodes the expression 1 − (□ · 1), where □ is the missing subtree, and accepts exactly one semantics: only by replacing □ with a subtree returning 1 can we obtain an expression that equals 0. The next two contexts accept more than one value: the context 1 − □^2 accepts only two possible semantics, −1 and 1, whereas the context 1 − cos(□) accepts any subtree returning 2πn, n ∈ Z (thus, infinitely many values are accepted in this case). In the next example, where the context is 0 · sin(□), whatever subtree is pasted in place of □, the whole expression returns zero anyway. Finally, the last example, 1 + exp(□), shows a situation in which the resulting expression is always greater than 1, regardless of the semantics substituted in place of □; that context cannot be used to construct a solution to our toy problem.

Because the desired semantics contains the oracle's answers for each fitness case, every one of its elements should be able to express all four of the considered situations.
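The following sketch (hypothetical helper functions, not taken from the paper) shows how inverting a single instruction reproduces the four situations of Figure 1 for the target t = 0:

import math

INSIGNIFICANT = "?"  # any value is acceptable ("don't care")
INCONSISTENT = "!"   # no value is acceptable

def invert_square(desired):
    # Solve box**2 = desired; zero, one, or two real solutions.
    if desired < 0.0:
        return INCONSISTENT
    root = math.sqrt(desired)
    return {root, -root}

def invert_mul(desired, known_factor):
    # Solve known_factor * box = desired for box.
    if known_factor != 0.0:
        return desired / known_factor
    return INSIGNIFICANT if desired == 0.0 else INCONSISTENT

def invert_exp(desired):
    # Solve exp(box) = desired for box.
    return math.log(desired) if desired > 0.0 else INCONSISTENT

print(invert_mul(1.0, 1.0))   # (a) 1 - (box * 1) = 0  ->  exactly one value, 1.0
print(invert_square(1.0))     # (b) 1 - box**2 = 0     ->  two values, {-1.0, 1.0}
print(invert_mul(0.0, 0.0))   # (c) 0 * sin(box) = 0   ->  '?': box is an intron
print(invert_exp(-1.0))       # (d) 1 + exp(box) = 0   ->  '!': inconsistent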

Algorithm 1 Calculating the desired semantics of a context.
1: procedure DesiredSemantics(c, t)    ▷ for context c and target semantics t
2:   L ← list of nodes on the path from the root of c to the subtree missing in c (to the □ node)
3:   d ← t    ▷ desired semantics of an empty context
4:   for all n ∈ L do
5:     S ← semantics of all child subtrees x of node n such that x ∉ L
6:     d ← n⁻¹(d, S)    ▷ calculate the desired values using the inverse of instruction n
7:   return d
8: end procedure

Figure 2: Exemplary context (□ · x − x) with the calculated desired semantics (in bold). The target semantics of the task is [2, 0, 0], the semantics of the independent variable x is [−1, 0, 1], and the context's desired semantics is [−1, ?, 1].

However, the situation where a context accepts many values (in particular, infinitely many) is not convenient to encode. Moreover, it may lead to exponential growth of the memory demand and of the complexity of the computations performed by a method calculating the desired semantics. To avoid such problems, from now on we adopt a technically feasible but formally incomplete solution in which the desired semantics stores, for each fitness case, at most one arbitrarily chosen value from the set returned by the oracle. The elements of such simplified desired semantics are allowed to express one of three situations (instead of four):

- Only one concrete value is acceptable (possibly an arbitrarily chosen one).
- Any value is acceptable (i.e., "don't care"); such elements will be called insignificant in the following.
- No value is acceptable; such an element is inconsistent.

The simplified desired semantics can be expressed as a list (similarly to conventional program semantics as introduced in Section 2), where each element of the list is either a concrete value or one of two special values used to encode the undefined elements: insignificant ("don't care") and inconsistent. In the following we identify desired semantics with this simplified version, which does not require encoding alternative values.

It is important to notice that switching to the simplified desired semantics can introduce a certain bias and omit potentially good parts. To illustrate this, let us continue the example presented earlier, where the goal was to evolve an expression returning zero (see Figure 1b). For the context 1 − □^2, our simplified desired semantics will contain either −1 or 1 (the other value is discarded). Similarly, for the context 1 − cos(□), the simplified desired semantics will contain one specific multiple of 2π (e.g., 0), and a part returning a different multiple of 2π (e.g., 2π) will be treated as committing some error.

3.2 Calculating the Desired Semantics

Let us assume for a while that a method calculating the desired input value of a single instruction is given. In such a case, it becomes straightforward to design an algorithm that calculates the desired semantics of an entire context. The algorithm starts from the root node and proceeds along the path to the missing subtree (□) of the context. In each step, the desired semantics of the subsequent node on the path is calculated. Thus, at the beginning, the desired argument of the instruction located at the root node is calculated (the given target semantics is simultaneously the desired semantics of the root). Then, recursively, all consecutive nodes on the path are processed. For each node, the semantics of all its other subtrees (arguments) are known, so the only unknown for each node is the sought desired semantics. The last calculated value forms the desired semantics of the context.
Algorithm 1 presents the pseudocode of this procedure. In this process of semantic backpropagation, the special insignificant and inconsistent values of the calculated desired semantics are propagated directly. To calculate the desired semantics of a context, all instructions on the abovementioned path should be invertible. For a fully invertible instruction, its inversion has the same properties as an inverse function in mathematics: it must be possible to calculate the desired input of each of the used functions (instructions), given the values of all the other inputs (the remaining arguments of the function) and the expected output (the result of the function). For functions that are only partially invertible, some elements of the calculated desired semantics can be ambiguous or inconsistent (cf. the example in Fig. 1). It should also be noticed that the invertibility requirement means that the used functions cannot be black boxes: we must know how to calculate the desired argument to obtain an expected function value (i.e., we must know the inverse functions).

As an example of the calculations conducted by Algorithm 1, let us consider the symbolic regression task of evolving the expression x^2 − x (or, to be precise, of evolving an expression with semantics equivalent to that of x^2 − x). The only input variable is x, and there are three fitness cases, for which x assumes the values −1, 0, and 1, respectively. Thus, the semantics of the terminal node x equals [−1, 0, 1] and the target semantics is [2, 0, 0]. Figure 2 shows an exemplary context (□ · x − x) with the semantics of all subtrees denoted in plain text (here only terminals). The desired semantics computed for the consecutive nodes on the path from the root node to the missing part of the context are marked in bold. Starting from the root node, [2, 0, 0] is both the desired semantics of the empty context (empty program) and the target semantics of the problem. The desired semantics for the context (□ − x) is [1, 0, 1]. Finally, the desired semantics of the context (□ · x − x) is [−1, ?, 1], with the question mark denoting an insignificant value. It does not matter what the second element of the semantics of the missing subtree is, because it is multiplied by zero and always yields zero.
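A compact sketch of this calculation (our illustrative code, not the authors'; the tree is encoded only as the root-to-□ path, with the sibling semantics assumed known at each step):

INSIGNIFICANT, INCONSISTENT = "?", "!"

def inv_sub_left(d, right):
    # Solve box - right = d for the missing left argument of '-'.
    return d + right

def inv_mul(d, factor):
    # Solve factor * box = d for the missing argument of '*'.
    if factor != 0.0:
        return d / factor
    return INSIGNIFICANT if d == 0.0 else INCONSISTENT

def desired_semantics(path, target):
    # Algorithm 1: invert one instruction per path node, per fitness case;
    # the special values '?' and '!' are propagated unchanged.
    desired = list(target)
    for invert, sibling_semantics in path:
        desired = [d if d in (INSIGNIFICANT, INCONSISTENT) else invert(d, s)
                   for d, s in zip(desired, sibling_semantics)]
    return desired

# The Figure 2 example: context (box * x) - x, target [2, 0, 0], x = [-1, 0, 1].
x = [-1.0, 0.0, 1.0]
path = [(inv_sub_left, x),  # root '-': its right argument x is known
        (inv_mul, x)]       # node '*': its factor x is known
print(desired_semantics(path, [2.0, 0.0, 0.0]))  # [-1.0, '?', 1.0]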

Table 1: Symbolic regression tasks used in the experiment. The columns present the target expression, the number of independent variables, and their domains.

      Target program (expression)        Variables  Range
F03   x^5 + x^4 + x^3 + x^2 + x          1          [-1; 1]
F04   x^6 + x^5 + x^4 + x^3 + x^2 + x    1          [-1; 1]
F05   sin(x^2) cos(x) - 1                1          [-1; 1]
F06   sin(x) + sin(x + x^2)              1          [-1; 1]
F07   log(x + 1) + log(x^2 + 1)          1          [0; 2]
F08   sqrt(x)                            1          [0; 4]
F09   sin(x) + sin(y^2)                  2          [0.01; 0.99]
F10   2 sin(x) cos(y)                    2          [0.01; 0.99]
F11   x^y                                2          [0.01; 0.99]
F12   x^4 - x^3 + y^2/2 - y              2          [0.01; 0.99]

4. RANDOM DESIRED OPERATOR

The previous sections presented the conceptual framework of desired semantics. Here, we embed it into the evolutionary context by designing a concrete search operator, called the Random Desired Operator (RDO) in the following. Desired semantics determines the preferred semantic properties at a specific location in a program, but is incapable of synthesizing a suitable part (subtree). Rather than synthesizing such a part, RDO relies on a dynamically changing repository of ready-to-use parts, which we call the library, containing subtrees extracted from the individuals in the current population. Technically, in every generation, all subtrees of all current individuals are first collected. Next, semantic duplicates are eliminated: if two or more subtrees have the same semantics, only the one with the smallest subtree depth¹ remains in the set, while the others are discarded. This reduction to minimal subtrees with unique behaviors drastically decreases the library size, for two reasons. Firstly, the majority of program fragments in the evolved population exist in many copies. Secondly, different genotypes often map to equivalent phenotypes, i.e., syntactically different programs can have the same semantics.

To create a new solution, RDO combines a context extracted from a single parent individual with the best-matching subtree from the library, which is built anew in every generation. RDO is thus somewhat similar to the standard subtree-replacing mutation operator [6]. Specifically, RDO removes a randomly chosen subtree from the parent, but instead of generating a new random subtree in place of the old one, it looks for a subtree in the library. From the parts available there, it chooses the one whose semantics is most similar to the desired semantics of the context that arises from removing the old subtree (see Algorithm 2). The undefined (i.e., both insignificant and inconsistent) elements of the desired semantics are ignored when calculating the semantic distance. Note that, given the way the library is built, RDO is most likely to insert a subtree that comes from another individual. Therefore, RDO may be seen as a specialized crossover operator (one that discards the second child) performing a certain form of mate selection with respect to the semantic utility of the partner's subtrees.

¹ We use the subtree depth criterion because we apply the same type of constraint to the evolutionary process, i.e., we limit the maximal tree depth. Other measures, e.g., the maximal number of nodes, might be equally appropriate.

Algorithm 2 Random Desired Operator (RDO)
1: procedure RDO(p)
2:   r ← random node in program p
3:   c ← Context(p, r)    ▷ extract the context by removing subtree r from p
4:   s ← DesiredSemantics(c, t)
5:   r′ ← SearchLibrary(s)    ▷ find a subtree that best matches semantics s
6:   return the tree obtained from p by replacing subtree r with r′
7: end procedure
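A sketch of the library construction and lookup (illustrative code; the subtree objects with .semantics and .depth attributes, and the absolute-difference distance, are our assumptions):

def build_library(subtrees):
    # Keep exactly one subtree per distinct semantics: the shallowest one.
    library = {}
    for sub in subtrees:
        key = tuple(sub.semantics)
        if key not in library or sub.depth < library[key].depth:
            library[key] = sub
    return list(library.values())

def search_library(library, desired):
    # Return the subtree whose semantics best matches the desired semantics,
    # ignoring insignificant ('?') and inconsistent ('!') elements.
    def distance(sub):
        return sum(abs(s - d) for s, d in zip(sub.semantics, desired)
                   if d not in ("?", "!"))
    return min(library, key=distance)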
5. THE EXPERIMENT

5.1 The Benchmarks

In the following experiment, we aim at comparing RDO with the standard GP search operators. To this aim, we test them on problems from two different domains: real-valued functions and logic functions. In each domain, we have ten problem instances (tasks).

Symbolic Regression Problems. The set of symbolic regression problems is presented in Table 1. Problems F03–F12 are taken from the paper by Nguyen et al. [12], half of which originate from [6, 3, 5, 4]. The table shows the hidden expression to discover (Target program), the number of independent variables (Variables), and the range from which their values are drawn (Range). The number of fitness cases (points) is 20 for the univariate problems (F03–F08) and 100 for the bivariate ones (F09–F12). A program is considered an optimal solution if it returns the correct (target) values for each fitness case within a tolerance. This tolerance threshold is necessary to handle floating-point imprecision: without it, even an expression mathematically equivalent to the target program could be found non-optimal (i.e., would have a non-zero value of our minimized fitness function).

The fitness cases are evenly distributed in the variable domain(s). More precisely, the values are evenly spaced in the closed interval given in Table 1, with the extreme values placed on the interval boundaries. For univariate problems, this implies that the span between any two consecutive points equals (b − a)/(k − 1) for k points in the range [a; b]. In the case of bivariate functions, the values of both variables lie on an evenly spaced square grid. This, however, may cause problems. For instance, if the variable range were [0; 1] for F11, then a substantial number of fitness cases (nearly 40%) would fall on values that constitute special cases, i.e., 0^y, 1^y, x^0, or x^1. This may render evolution unable to escape even very simple local optima. Therefore, we slightly narrowed the original [0; 1] interval to [0.01; 0.99]. This problem does not occur in the original formulation with the [0; 1] interval in [12], because Nguyen et al. (like most researchers) used randomly selected points uniformly distributed over this range. However, we have strong evidence that such a selection of fitness cases is not good practice, because GP is highly sensitive to the choice of fitness cases. In other words, the precise values of the fitness cases should always be considered part of a GP task.
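A small sketch of this fitness-case layout (the 10 x 10 grid for the bivariate problems is our inference from the 100 reported cases):

def grid_1d(a, b, k):
    # k evenly spaced points with both endpoints on the interval boundaries.
    step = (b - a) / (k - 1)
    return [a + i * step for i in range(k)]

def grid_2d(a, b, k):
    # Square grid: every combination of the 1-D points for both variables.
    points = grid_1d(a, b, k)
    return [(x, y) for x in points for y in points]

print(grid_1d(-1.0, 1.0, 20)[:2])    # univariate, e.g. F03: 20 points in [-1; 1]
print(len(grid_2d(0.01, 0.99, 10)))  # bivariate F09-F12: 10 * 10 = 100 cases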

Table 2: Boolean tasks used in the experiment.

Problem       Instance  Bits  Fitness cases
even parity   PAR4      4     16
              PAR5      5     32
              PAR6      6     64
multiplexer   MUX6      6     64
              MUX11     11    2048
majority      MAJ5      5     32
              MAJ6      6     64
              MAJ7      7     128
comparator    CMP6      6     64
              CMP8      8     256

For univariate problems, the terminal set contains two elements: x, the independent variable, and the constant 1.0. For bivariate problems there are two terminals, x and y, the independent variables, without the constant 1.0. Though the lack of constants for bivariate problems may seem surprising, let us note that it has been shown many times that GP fares pretty well without any constants at all, as evolution can easily come up with the idea of filling in for them using subexpressions like x/x or x − x. The set of non-terminal instructions consists of eight functions: +, −, ×, / (protected), sin, cos, exp, and log (protected). The protected version of division returns 1.0 if the denominator equals zero, irrespective of the numerator; log(x) returns 0.0 if x = 0, and log(|x|) otherwise. Let us note that the provided set of instructions allows expressing all target functions presented in Table 1. In other words, for every benchmark problem, an optimal solution is present in the considered solution space.

Boolean Problems. In the Boolean domain, four different problems are studied: even parity, multiplexer, majority, and comparator. The first three come from [6], and the last is a simplified version of the digital comparator proposed by Walker and Miller [13]. In the following, we use the terms argument and bit in the same meaning as independent variable was used in the symbolic regression context.

The objective of the even parity (PAR) problem is to synthesize a function that returns true if and only if an even number of its arguments are true. PAR can alternatively be seen as a generalization of the Not-Exclusive-Or function to more than two arguments. We consider instances with 4, 5, and 6 bits (i.e., with 4, 5, or 6 input arguments), denoted PAR4, PAR5, and PAR6, respectively. In the multiplexer problem (MUX), the program arguments are divided into two blocks: address bits and data bits. The goal is to interpret the address bits as a binary number and use that number to index and return the value of the appropriate data bit. We consider two variants of this problem: 6-bit (MUX6) and 11-bit (MUX11). In the former we have 2 address bits and 4 data bits; in the latter, 3 and 8 bits, respectively. The task in the majority (MAJ) problem is to create a function that returns true if more than half of the input arguments are true. Note that for an even number of arguments, the function should return false if exactly half (or fewer) of them are true. We consider three variants of this problem: with 5 bits (MAJ5), 6 bits (MAJ6), and 7 bits (MAJ7). The last Boolean problem used in this paper is the comparator. The objective here is to interpret the input bits as two binary integers and return true only if the first number is greater than the second one. We use the six-bit (CMP6) and eight-bit (CMP8) variants, which means that we compare 3-bit numbers (CMP6) or 4-bit numbers (CMP8).

Table 2 lists all ten problem instances together with the number of fitness cases on which solutions are evaluated. A solution is considered ideal only if it returns the correct result for all fitness cases. The set of terminals used in the Boolean-domain experiments comprises one terminal per input bit (D1 ... D11, depending on the number of bits). The non-terminal instructions are: AND, OR, NAND, and NOR.
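For illustration, minimal implementations of the four target functions, evaluated on all 2^n input combinations (a sketch; the bit ordering within the tuples is our assumption):

from itertools import product

def parity(bits):
    # PAR: true iff an even number of the inputs are true.
    return sum(bits) % 2 == 0

def multiplexer(bits, addr_bits):
    # MUX: the address bits, read as a binary number, select a data bit.
    address = int("".join(map(str, bits[:addr_bits])), 2)
    return bool(bits[addr_bits + address])

def majority(bits):
    # MAJ: true iff more than half of the inputs are true.
    return sum(bits) > len(bits) / 2

def comparator(bits):
    # CMP: true iff the first half, as a binary number, exceeds the second.
    half = len(bits) // 2
    value = lambda bs: int("".join(map(str, bs)), 2)
    return value(bits[:half]) > value(bits[half:])

cases = list(product((0, 1), repeat=6))       # e.g., CMP6: all 64 fitness cases
target_semantics = [comparator(c) for c in cases]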
Similarly to the regression problems, also in this domain solutions to all the aforementioned problems can be found in the assumed solution space.

5.2 The Setups

To verify the performance of RDO, we conduct a series of experiments involving RDO, standard crossover (X), and standard mutation (M) individually, as well as every combination of two of them. The setup that uses standard crossover and standard mutation serves as the control experiment. For each pair of operators, we test different proportions of their probabilities, varying from 0.0 to 1.0 with step 0.1, which results in ten different setups for each pair involving RDO and eleven control setups. Setups are denoted O1+O2 β, where O1 and O2 are the symbols of the operators used, and β is the probability of operator O2 (i.e., Pr(O1) = 1 − β). For example, in the setup X+RDO 0.2, crossover is applied with probability 0.8 and RDO with probability 0.2. Whenever β = 0 or β = 1, the notation simplifies to O1 or O2 1.0, respectively (e.g., X 1.0 or RDO 1.0). In total, we have 30 setups, 19 of them involving RDO (i.e., X+RDO β and M+RDO β with β = 0.1, ..., 0.9, plus RDO 1.0) and 11 control setups (X+M β with β = 0.1, ..., 0.9, plus X 1.0 and M 1.0).

The parameters of the evolutionary algorithm used in our experiments, shown in Table 3, are based on Nguyen's work [11]. The parameters not mentioned there are taken from the ECJ package [9] and are based on the values originally used by Koza [6]. Every setup was tested on all 20 problems. To arrive at statistically significant results, every setup was run independently 200 times with different seeds of a pseudo-random number generator, which gives 30 × 20 × 200 = 120,000 evolutionary runs in total.

6. RESULTS

Rather than presenting detailed results for every combination of method and benchmark, we provide a global, ranking-based perspective. For statistical validation, we perform the Friedman test, comparing the success ratios of all setups on our benchmark problems. The null hypothesis of this nonparametric test is that the medians of all samples (corresponding to setups in our case) are equal; the alternative hypothesis is that this is not true. Tables 4a–c present the computed ranks for the success-rate performance measure on all 20 problems, and for problems from the two domains separately.
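Such a validation can be reproduced with standard tooling; below is a sketch assuming a success-ratio matrix of shape problems x setups (Holm's post-hoc correction, used later in the text, needs an extra step not shown here):

import numpy as np
from scipy.stats import friedmanchisquare, rankdata

rng = np.random.default_rng(0)
success_ratio = rng.random((20, 30))   # placeholder data: 20 problems, 30 setups

# Friedman test: do all setups perform the same across the problems?
stat, p_value = friedmanchisquare(*success_ratio.T)
print(f"chi2 = {stat:.2f}, p = {p_value:.4f}")

# Per-setup average ranks (rank 1 = best, i.e., highest success ratio).
average_ranks = rankdata(-success_ratio, axis=1).mean(axis=0)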

Table 3: Evolutionary parameters.

Parameter                Value
Generations              100
Population size          500
Initialization method    Ramped Half-and-Half
Initial minimal depth    2
Initial maximal depth    6
Duplicate retries        100 (before accepting a syntactically duplicated individual)
Selection method         Tournament
Tournament size          3
Operator probabilities   Varying from 0 to 1 with step 0.1
Maximal program depth    15
Node selection           Probability of terminal nodes: 10%; of non-terminal nodes: 90%
Mutation method          Subtree mutation
Subtree builder          Grow algorithm
Subtree depth            5
Crossover method         Subtree swapping
Instructions             Symbolic regression: x, either 1 or y, +, −, ×, / (protected), sin, cos, exp, log (protected); Boolean domain: D1 ... D11 (inputs depend on the problem instance), AND, OR, NAND, NOR
Success                  Symbolic regression: error on each fitness case below the tolerance threshold; Boolean domain: perfect reproduction of all fitness cases
Number of runs           200

Table 4: Friedman ranks of the success-ratio performance measure: (a) on all 20 problems, (b) on 10 symbolic regression problems, (c) on 10 Boolean domain problems.

Table 5: Friedman ranks of the median error: (a) on all 20 problems, (b) on 10 symbolic regression problems, (c) on 10 Boolean domain problems.

In each of these tables, of all setups that use the same combination of operators, the one marked in bold is the best (although not necessarily in a statistically significant manner). The tables show that the M+RDO 0.7 setup fares best. Moreover, Holm's post-hoc analysis (as suggested by Derrac et al. [2]) reveals that the best control setup (X 1.0) is statistically worse (p-value < 0.01) than the 13 best setups in this ranking; only X+RDO 0.2 and the setups with RDO probability 0.1, 0.9, and 1.0 are not statistically better than it.

However, there is a clear difference in the performance of the setups when the two domains are considered separately (Tables 4b–c). The best setups for symbolic regression employ RDO with a lower probability than in the case of the Boolean problems, which may suggest that RDO is not as advantageous in the former domain. The differences are even more visible in the qualitative comparison: for the symbolic regression problems, the best control setup (also X 1.0) is not statistically worse than any other setup, and the best setup, X+RDO 0.4, is statistically better only than X+M 0.3 and the setups following it (the 8 worst setups in the ranking). For the Boolean problems, however, the best control setup (X+M 0.1) is statistically worse than the 12 best setups.

In Table 5 we show the rankings of the median error committed by each setup. When considering the regression and Boolean domains together, the best setup is the same as when comparing success ratios, M+RDO 0.7. For symbolic regression, it seems more beneficial to use a slightly higher probability of RDO than when maximizing the success ratio. For the Boolean problems little can be concluded, as they are almost always solved by RDO, so the median error is zero and does not differentiate the setups (16 of the setups rank equally well at the top).

Table 6 presents more detailed results for the globally best setup, M+RDO 0.7 (the best success rate and the lowest median error over all problems), and the best control setup (X 1.0). The table shows the achieved success rate, the mean generation in which the ideal solution was found (calculated only over the successful runs), the median error of the best-of-run individuals, and the time required by a single evolutionary run (averaged over 200 runs). The last column says how many successes are to be expected if an algorithm were allowed to run for one hour, starting a new evolutionary run every time the previous one completes (with success or not). We find this efficiency measure convenient because a method that executes very fast does not necessarily score many successes, yet it can be run more times within a fixed computational budget; and if a method is actually bad, successive runs do not help much, and this performance indicator will show that. Last but not least, fixing the time budget allows us to take into account the overhead of searching the library for procedures, which makes RDO substantially slower in absolute terms. To provide another reference point, the rows named best present the best value of each objective achieved by any setup; therefore, each value in a best row may come from a different setup.

Table 6 confirms our earlier finding that RDO is very beneficial for all Boolean problems. For instance, only 1% of the runs for the quite challenging PAR6 problem fail; all other runs of M+RDO 0.7 succeeded, while standard GP failed to solve PAR6 even once. For this reason, the efficiency measured in successes per hour can be several orders of magnitude higher for the Boolean domain.
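The successes-per-hour measure follows directly from the two reported quantities (a sketch; the numbers below are made up for illustration):

def successes_per_hour(success_rate, mean_run_time_ms):
    # Expected successes when fresh runs are restarted back-to-back for 1 hour.
    runs_per_hour = 3_600_000 / mean_run_time_ms
    return success_rate * runs_per_hour

print(successes_per_hour(0.50, 12_000))  # 50% success, 12 s per run -> 150.0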
For the symbolic regression problems, the results are not as firm. For seven out of ten problems (F05–F11), the success ratio achieved by M+RDO 0.7 is better. However, in terms of the number of successes per hour, this setup can perform up to two orders of magnitude worse; only for 5 problems (F05, F08–F11) is RDO more efficient than the control setup. One possible reason for the inferior performance of RDO in the regression domain is the way the errors committed by the inserted procedures (with respect to the desired semantics) propagate through the programs. In the Boolean domain, when a perfectly matching procedure cannot be found (which is most often the case), a mismatch on a single element of the desired semantics causes a unitary loss of fitness (one incorrect output bit). In the regression domain, however, even a relatively small discrepancy between the semantics of the inserted procedure and the desired semantics may translate into an arbitrarily large error of the entire program (consider, e.g., the context 1/□). Apart from that, we anticipate that the RDO performance for symbolic regression could be substantially improved by a more efficient implementation of the algorithm that searches the library for the best-matching subtree.

7. CONCLUSIONS

The speedup and the overall improvement in quality provided by the presented approach result from exploiting certain properties of a given problem, more specifically, the properties of the instructions that form the programming language under consideration. We demonstrated that taking such implicit, but often easily available, features into account can help to overcome some weaknesses of genetic programming, at least in certain domains and applications. As we already mentioned in the Introduction, it is surprising that such supplementary² and easily exploitable properties are rarely considered in genetic programming. The one-size-fits-all attitude prevails, particularly when it comes to operator design, which is puzzling in light of the heritage of the No Free Lunch theorem. We therefore believe that this approach is an important contribution to GP and that other methods developed in this spirit will lead to essential breakthroughs in the field.

There are several directions in which this research can develop. In the variant of RDO presented here, we used the desired semantics in a very strict way, searching the library for the part that exactly matches the desired values (i.e., the values that make the entire program return the correct output). However, an omniscient oracle is not necessary to exploit the presented idea: it may be enough to merely narrow the space of considered parts so that the genetic operators can choose the part to be inserted into the parent individual more quickly. Hence, our approach can also be applied to problems in which the instructions are not easily invertible, which we plan to investigate in future research.

Acknowledgment. Work supported by grant no. DEC-2011/01/B/ST6/

² Supplementary in the sense that the invertibility of, e.g., the multiplication operator is not required for program execution.

Table 6: Detailed comparison of the best setup with RDO (M+RDO 0.7) and the best control setup (X 1.0) on (a) the symbolic regression problems (F03–F12) and (b) the Boolean domain problems (CMP6, CMP8, MAJ5–MAJ7, MUX6, MUX11, PAR4–PAR6). For each problem, the rows report, for M+RDO 0.7, X 1.0, and best, the success rate, the mean success generation, the median error, the time per run [ms], and the successes per hour. The best values (in italics) for each problem and each column may come from different setups.

8. REFERENCES

[1] W. Banzhaf et al. Genetic Programming: An Introduction; On the Automatic Evolution of Computer Programs and its Applications. Morgan Kaufmann, San Francisco, CA, USA, 1998.
[2] J. Derrac et al. A practical tutorial on the use of nonparametric statistical tests as a methodology for comparing evolutionary and swarm intelligence algorithms. Swarm and Evolutionary Computation, 1(1):3–18, 2011.
[3] N. X. Hoai et al. Solving the symbolic regression problem with tree-adjunct grammar guided genetic programming: The comparative results. In D. B. Fogel et al., editors, Proceedings of the 2002 Congress on Evolutionary Computation (CEC 2002). IEEE Press, May 2002.
[4] C. Johnson. Genetic programming crossover: Does it cross over? In Proceedings of the 12th European Conference on Genetic Programming, EuroGP 2009, volume 5481 of LNCS, Tuebingen, April 2009. Springer.
[5] M. Keijzer. Improving symbolic regression with interval arithmetic and linear scaling. In C. Ryan et al., editors, Genetic Programming, Proceedings of EuroGP 2003, volume 2610 of LNCS, pages 70–82, Essex, April 2003. Springer-Verlag.
[6] J. R. Koza. Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press, Cambridge, MA, USA, 1992.
[7] K. Krawiec and T. Pawlak. Approximating geometric crossover by semantic backpropagation. In C. Blum et al., editors, GECCO '13: Proceedings of the 15th Annual Conference on Genetic and Evolutionary Computation, Amsterdam, The Netherlands, 2013. ACM.
[8] K. Krawiec and B. Wieloch. Analysis of semantic modularity for genetic programming. Foundations of Computing and Decision Sciences, 34(4), 2009.
[9] S. Luke. The ECJ Owner's Manual: A User Manual for the ECJ Evolutionary Computation Library. Zeroth edition, online version 0.2.
[10] J. F. Miller. An empirical study of the efficiency of learning Boolean functions using a Cartesian genetic programming approach. In Proceedings of the Genetic and Evolutionary Computation Conference, volume 2, Orlando, Florida, USA, July 1999. Morgan Kaufmann.
[11] Q. U. Nguyen et al. Semantic aware crossover for genetic programming: The case for real-valued function regression. In Proceedings of the 12th European Conference on Genetic Programming, EuroGP 2009, volume 5481 of LNCS, Tuebingen, April 2009. Springer.
[12] Q. U. Nguyen et al. Self-adapting semantic sensitivities for semantic similarity based crossover. In 2010 IEEE World Congress on Computational Intelligence, Barcelona, Spain, July 2010. IEEE Computational Intelligence Society, IEEE Press.
[13] J. A. Walker and J. F. Miller. Investigating the performance of module acquisition in Cartesian genetic programming. In H.-G. Beyer et al., editors, GECCO 2005: Proceedings of the 2005 Conference on Genetic and Evolutionary Computation, volume 2, Washington DC, USA, June 2005. ACM Press.


More information

Evolving SQL Queries for Data Mining

Evolving SQL Queries for Data Mining Evolving SQL Queries for Data Mining Majid Salim and Xin Yao School of Computer Science, The University of Birmingham Edgbaston, Birmingham B15 2TT, UK {msc30mms,x.yao}@cs.bham.ac.uk Abstract. This paper

More information

A Fitness Function to Find Feasible Sequences of Method Calls for Evolutionary Testing of Object-Oriented Programs

A Fitness Function to Find Feasible Sequences of Method Calls for Evolutionary Testing of Object-Oriented Programs A Fitness Function to Find Feasible Sequences of Method Calls for Evolutionary Testing of Object-Oriented Programs Myoung Yee Kim and Yoonsik Cheon TR #7-57 November 7; revised January Keywords: fitness

More information

2 Computation with Floating-Point Numbers

2 Computation with Floating-Point Numbers 2 Computation with Floating-Point Numbers 2.1 Floating-Point Representation The notion of real numbers in mathematics is convenient for hand computations and formula manipulations. However, real numbers

More information

Genetic Algorithms and Genetic Programming. Lecture 9: (23/10/09)

Genetic Algorithms and Genetic Programming. Lecture 9: (23/10/09) Genetic Algorithms and Genetic Programming Lecture 9: (23/10/09) Genetic programming II Michael Herrmann michael.herrmann@ed.ac.uk, phone: 0131 6 517177, Informatics Forum 1.42 Overview 1. Introduction:

More information

Evolved Multi-resolution Transforms for Optimized Image Compression and Reconstruction under Quantization

Evolved Multi-resolution Transforms for Optimized Image Compression and Reconstruction under Quantization Evolved Multi-resolution Transforms for Optimized Image Compression and Reconstruction under Quantization FRANK W. MOORE Mathematical Sciences Department University of Alaska Anchorage CAS 154, 3211 Providence

More information

An Attempt to Identify Weakest and Strongest Queries

An Attempt to Identify Weakest and Strongest Queries An Attempt to Identify Weakest and Strongest Queries K. L. Kwok Queens College, City University of NY 65-30 Kissena Boulevard Flushing, NY 11367, USA kwok@ir.cs.qc.edu ABSTRACT We explore some term statistics

More information

Foundations of Computing

Foundations of Computing Foundations of Computing Darmstadt University of Technology Dept. Computer Science Winter Term 2005 / 2006 Copyright c 2004 by Matthias Müller-Hannemann and Karsten Weihe All rights reserved http://www.algo.informatik.tu-darmstadt.de/

More information

Leveraging Transitive Relations for Crowdsourced Joins*

Leveraging Transitive Relations for Crowdsourced Joins* Leveraging Transitive Relations for Crowdsourced Joins* Jiannan Wang #, Guoliang Li #, Tim Kraska, Michael J. Franklin, Jianhua Feng # # Department of Computer Science, Tsinghua University, Brown University,

More information

Welfare Navigation Using Genetic Algorithm

Welfare Navigation Using Genetic Algorithm Welfare Navigation Using Genetic Algorithm David Erukhimovich and Yoel Zeldes Hebrew University of Jerusalem AI course final project Abstract Using standard navigation algorithms and applications (such

More information

An Information-Theoretic Approach to the Prepruning of Classification Rules

An Information-Theoretic Approach to the Prepruning of Classification Rules An Information-Theoretic Approach to the Prepruning of Classification Rules Max Bramer University of Portsmouth, Portsmouth, UK Abstract: Keywords: The automatic induction of classification rules from

More information

MODELLING DOCUMENT CATEGORIES BY EVOLUTIONARY LEARNING OF TEXT CENTROIDS

MODELLING DOCUMENT CATEGORIES BY EVOLUTIONARY LEARNING OF TEXT CENTROIDS MODELLING DOCUMENT CATEGORIES BY EVOLUTIONARY LEARNING OF TEXT CENTROIDS J.I. Serrano M.D. Del Castillo Instituto de Automática Industrial CSIC. Ctra. Campo Real km.0 200. La Poveda. Arganda del Rey. 28500

More information

Random Search Report An objective look at random search performance for 4 problem sets

Random Search Report An objective look at random search performance for 4 problem sets Random Search Report An objective look at random search performance for 4 problem sets Dudon Wai Georgia Institute of Technology CS 7641: Machine Learning Atlanta, GA dwai3@gatech.edu Abstract: This report

More information

3 No-Wait Job Shops with Variable Processing Times

3 No-Wait Job Shops with Variable Processing Times 3 No-Wait Job Shops with Variable Processing Times In this chapter we assume that, on top of the classical no-wait job shop setting, we are given a set of processing times for each operation. We may select

More information

5.4 Pure Minimal Cost Flow

5.4 Pure Minimal Cost Flow Pure Minimal Cost Flow Problem. Pure Minimal Cost Flow Networks are especially convenient for modeling because of their simple nonmathematical structure that can be easily portrayed with a graph. This

More information

Review Questions 26 CHAPTER 1. SCIENTIFIC COMPUTING

Review Questions 26 CHAPTER 1. SCIENTIFIC COMPUTING 26 CHAPTER 1. SCIENTIFIC COMPUTING amples. The IEEE floating-point standard can be found in [131]. A useful tutorial on floating-point arithmetic and the IEEE standard is [97]. Although it is no substitute

More information

Some Applications of Graph Bandwidth to Constraint Satisfaction Problems

Some Applications of Graph Bandwidth to Constraint Satisfaction Problems Some Applications of Graph Bandwidth to Constraint Satisfaction Problems Ramin Zabih Computer Science Department Stanford University Stanford, California 94305 Abstract Bandwidth is a fundamental concept

More information

2 Computation with Floating-Point Numbers

2 Computation with Floating-Point Numbers 2 Computation with Floating-Point Numbers 2.1 Floating-Point Representation The notion of real numbers in mathematics is convenient for hand computations and formula manipulations. However, real numbers

More information

Chapter 2: Number Systems

Chapter 2: Number Systems Chapter 2: Number Systems Logic circuits are used to generate and transmit 1s and 0s to compute and convey information. This two-valued number system is called binary. As presented earlier, there are many

More information

Constructing Hidden Units using Examples and Queries

Constructing Hidden Units using Examples and Queries Constructing Hidden Units using Examples and Queries Eric B. Baum Kevin J. Lang NEC Research Institute 4 Independence Way Princeton, NJ 08540 ABSTRACT While the network loading problem for 2-layer threshold

More information

Floating-Point Numbers in Digital Computers

Floating-Point Numbers in Digital Computers POLYTECHNIC UNIVERSITY Department of Computer and Information Science Floating-Point Numbers in Digital Computers K. Ming Leung Abstract: We explain how floating-point numbers are represented and stored

More information

Bits, Words, and Integers

Bits, Words, and Integers Computer Science 52 Bits, Words, and Integers Spring Semester, 2017 In this document, we look at how bits are organized into meaningful data. In particular, we will see the details of how integers are

More information

Optimized Implementation of Logic Functions

Optimized Implementation of Logic Functions June 25, 22 9:7 vra235_ch4 Sheet number Page number 49 black chapter 4 Optimized Implementation of Logic Functions 4. Nc3xe4, Nb8 d7 49 June 25, 22 9:7 vra235_ch4 Sheet number 2 Page number 5 black 5 CHAPTER

More information

A SIMPLE APPROXIMATION ALGORITHM FOR NONOVERLAPPING LOCAL ALIGNMENTS (WEIGHTED INDEPENDENT SETS OF AXIS PARALLEL RECTANGLES)

A SIMPLE APPROXIMATION ALGORITHM FOR NONOVERLAPPING LOCAL ALIGNMENTS (WEIGHTED INDEPENDENT SETS OF AXIS PARALLEL RECTANGLES) Chapter 1 A SIMPLE APPROXIMATION ALGORITHM FOR NONOVERLAPPING LOCAL ALIGNMENTS (WEIGHTED INDEPENDENT SETS OF AXIS PARALLEL RECTANGLES) Piotr Berman Department of Computer Science & Engineering Pennsylvania

More information

Ranking Clustered Data with Pairwise Comparisons

Ranking Clustered Data with Pairwise Comparisons Ranking Clustered Data with Pairwise Comparisons Alisa Maas ajmaas@cs.wisc.edu 1. INTRODUCTION 1.1 Background Machine learning often relies heavily on being able to rank the relative fitness of instances

More information

Semantics via Syntax. f (4) = if define f (x) =2 x + 55.

Semantics via Syntax. f (4) = if define f (x) =2 x + 55. 1 Semantics via Syntax The specification of a programming language starts with its syntax. As every programmer knows, the syntax of a language comes in the shape of a variant of a BNF (Backus-Naur Form)

More information

Machine Evolution. Machine Evolution. Let s look at. Machine Evolution. Machine Evolution. Machine Evolution. Machine Evolution

Machine Evolution. Machine Evolution. Let s look at. Machine Evolution. Machine Evolution. Machine Evolution. Machine Evolution Let s look at As you will see later in this course, neural networks can learn, that is, adapt to given constraints. For example, NNs can approximate a given function. In biology, such learning corresponds

More information

AN EVOLUTIONARY APPROACH TO DISTANCE VECTOR ROUTING

AN EVOLUTIONARY APPROACH TO DISTANCE VECTOR ROUTING International Journal of Latest Research in Science and Technology Volume 3, Issue 3: Page No. 201-205, May-June 2014 http://www.mnkjournals.com/ijlrst.htm ISSN (Online):2278-5299 AN EVOLUTIONARY APPROACH

More information

Test Case Generation for Classes in Objects-Oriented Programming Using Grammatical Evolution

Test Case Generation for Classes in Objects-Oriented Programming Using Grammatical Evolution Test Case Generation for Classes in Objects-Oriented Programming Using Grammatical Evolution Jirawat Chaiareerat, Peraphon Sophatsathit and Chidchanok Lursinsap Abstract This paper proposes a dynamic test

More information

6.2 DATA DISTRIBUTION AND EXPERIMENT DETAILS

6.2 DATA DISTRIBUTION AND EXPERIMENT DETAILS Chapter 6 Indexing Results 6. INTRODUCTION The generation of inverted indexes for text databases is a computationally intensive process that requires the exclusive use of processing resources for long

More information

Handout 9: Imperative Programs and State

Handout 9: Imperative Programs and State 06-02552 Princ. of Progr. Languages (and Extended ) The University of Birmingham Spring Semester 2016-17 School of Computer Science c Uday Reddy2016-17 Handout 9: Imperative Programs and State Imperative

More information

6.034 Notes: Section 3.1

6.034 Notes: Section 3.1 6.034 Notes: Section 3.1 Slide 3.1.1 In this presentation, we'll take a look at the class of problems called Constraint Satisfaction Problems (CSPs). CSPs arise in many application areas: they can be used

More information

Multi Expression Programming. Mihai Oltean

Multi Expression Programming. Mihai Oltean Multi Expression Programming Mihai Oltean Department of Computer Science, Faculty of Mathematics and Computer Science, Babeş-Bolyai University, Kogălniceanu 1, Cluj-Napoca, 3400, Romania. email: mihai.oltean@gmail.com

More information

Comparison of Evolutionary Multiobjective Optimization with Reference Solution-Based Single-Objective Approach

Comparison of Evolutionary Multiobjective Optimization with Reference Solution-Based Single-Objective Approach Comparison of Evolutionary Multiobjective Optimization with Reference Solution-Based Single-Objective Approach Hisao Ishibuchi Graduate School of Engineering Osaka Prefecture University Sakai, Osaka 599-853,

More information

Fundamental Concepts. Chapter 1

Fundamental Concepts. Chapter 1 Chapter 1 Fundamental Concepts This book is about the mathematical foundations of programming, with a special attention on computing with infinite objects. How can mathematics help in programming? There

More information

Approximating Square Roots

Approximating Square Roots Math 560 Fall 04 Approximating Square Roots Dallas Foster University of Utah u0809485 December 04 The history of approximating square roots is one of the oldest examples of numerical approximations found.

More information

Genetic Programming. and its use for learning Concepts in Description Logics

Genetic Programming. and its use for learning Concepts in Description Logics Concepts in Description Artificial Intelligence Institute Computer Science Department Dresden Technical University May 29, 2006 Outline Outline: brief introduction to explanation of the workings of a algorithm

More information

6. Relational Algebra (Part II)

6. Relational Algebra (Part II) 6. Relational Algebra (Part II) 6.1. Introduction In the previous chapter, we introduced relational algebra as a fundamental model of relational database manipulation. In particular, we defined and discussed

More information

MERL { A MITSUBISHI ELECTRIC RESEARCH LABORATORY. Empirical Testing of Algorithms for. Variable-Sized Label Placement.

MERL { A MITSUBISHI ELECTRIC RESEARCH LABORATORY. Empirical Testing of Algorithms for. Variable-Sized Label Placement. MERL { A MITSUBISHI ELECTRIC RESEARCH LABORATORY http://www.merl.com Empirical Testing of Algorithms for Variable-Sized Placement Jon Christensen Painted Word, Inc. Joe Marks MERL Stacy Friedman Oracle

More information

CHAPTER 6 MODIFIED FUZZY TECHNIQUES BASED IMAGE SEGMENTATION

CHAPTER 6 MODIFIED FUZZY TECHNIQUES BASED IMAGE SEGMENTATION CHAPTER 6 MODIFIED FUZZY TECHNIQUES BASED IMAGE SEGMENTATION 6.1 INTRODUCTION Fuzzy logic based computational techniques are becoming increasingly important in the medical image analysis arena. The significant

More information

On Meaning Preservation of a Calculus of Records

On Meaning Preservation of a Calculus of Records On Meaning Preservation of a Calculus of Records Emily Christiansen and Elena Machkasova Computer Science Discipline University of Minnesota, Morris Morris, MN 56267 chri1101, elenam@morris.umn.edu Abstract

More information

EC121 Mathematical Techniques A Revision Notes

EC121 Mathematical Techniques A Revision Notes EC Mathematical Techniques A Revision Notes EC Mathematical Techniques A Revision Notes Mathematical Techniques A begins with two weeks of intensive revision of basic arithmetic and algebra, to the level

More information

Applied Cloning Techniques for a Genetic Algorithm Used in Evolvable Hardware Design

Applied Cloning Techniques for a Genetic Algorithm Used in Evolvable Hardware Design Applied Cloning Techniques for a Genetic Algorithm Used in Evolvable Hardware Design Viet C. Trinh vtrinh@isl.ucf.edu Gregory A. Holifield greg.holifield@us.army.mil School of Electrical Engineering and

More information

Data Structure. IBPS SO (IT- Officer) Exam 2017

Data Structure. IBPS SO (IT- Officer) Exam 2017 Data Structure IBPS SO (IT- Officer) Exam 2017 Data Structure: In computer science, a data structure is a way of storing and organizing data in a computer s memory so that it can be used efficiently. Data

More information

Homework 2: Search and Optimization

Homework 2: Search and Optimization Scott Chow ROB 537: Learning Based Control October 16, 2017 Homework 2: Search and Optimization 1 Introduction The Traveling Salesman Problem is a well-explored problem that has been shown to be NP-Complete.

More information

Subset sum problem and dynamic programming

Subset sum problem and dynamic programming Lecture Notes: Dynamic programming We will discuss the subset sum problem (introduced last time), and introduce the main idea of dynamic programming. We illustrate it further using a variant of the so-called

More information

BBN Technical Report #7866: David J. Montana. Bolt Beranek and Newman, Inc. 10 Moulton Street. March 25, Abstract

BBN Technical Report #7866: David J. Montana. Bolt Beranek and Newman, Inc. 10 Moulton Street. March 25, Abstract BBN Technical Report #7866: Strongly Typed Genetic Programming David J. Montana Bolt Beranek and Newman, Inc. 10 Moulton Street Cambridge, MA 02138 March 25, 1994 Abstract Genetic programming is a powerful

More information