Genetic Programming in the Wild:

Similar documents
Genetic Programming in the Wild: Evolving Unrestricted Bytecode

DRAFT. IN A recent comprehensive monograph surveying the field. Flight of the FINCH through the Java Wilderness. Michael Orlov and Moshe Sipper

Draft. IN A RECENT comprehensive monograph surveying the. Flight of the FINCH through the Java Wilderness Michael Orlov and Moshe Sipper

Static Program Analysis

Exercise 7 Bytecode Verification self-study exercise sheet

Java Class Loading and Bytecode Verification

Genetic Programming Part 1

Java byte code verification

Java Security. Compiler. Compiler. Hardware. Interpreter. The virtual machine principle: Abstract Machine Code. Source Code

Course Overview. Levels of Programming Languages. Compilers and other translators. Tombstone Diagrams. Syntax Specification

Evolving Software Building Blocks with FINCH. Michael Orlov, SCE, Israel GECCO / Genetic Improvement 2017, July 16

Compiler construction 2009

Oak Intermediate Bytecodes

Code Generation Introduction

Java: framework overview and in-the-small features

JVM. What This Topic is About. Course Overview. Recap: Interpretive Compilers. Abstract Machines. Abstract Machines. Class Files and Class File Format

Compiler construction 2009

Previous Lecture Genetic Programming

Compiling Techniques

CSCE 314 Programming Languages

CMPSC 497: Java Security

Translating JVM Code to MIPS Code 1 / 43

JAM 16: The Instruction Set & Sample Programs

Run-time Program Management. Hwansoo Han

A New Crossover Technique for Cartesian Genetic Programming

Towards Byte Code Genetic Programming 1

JVML Instruction Set. How to get more than 256 local variables! Method Calls. Example. Method Calls

A Comparison of Cartesian Genetic Programming and Linear Genetic Programming

Genetic Programming of Autonomous Agents. Functional Requirements List and Performance Specifi cations. Scott O'Dell

Investigating the Application of Genetic Programming to Function Approximation

Course Overview. PART I: overview material. PART II: inside a compiler. PART III: conclusion

CSc 453 Interpreters & Interpretation

An Introduction to Multicodes. Ben Stephenson Department of Computer Science University of Western Ontario

Recap: Printing Trees into Bytecodes

An empirical study of the efficiency of learning boolean functions using a Cartesian Genetic Programming approach

Automatic Programming with Ant Colony Optimization

Project. there are a couple of 3 person teams. a new drop with new type checking is coming. regroup or see me or forever hold your peace

Copyright 2012, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 16

A New Crossover Technique for Cartesian Genetic Programming

Genetic Programming Prof. Thomas Bäck Nat Evur ol al ut ic o om nar put y Aling go rg it roup hms Genetic Programming 1

Topics. Structured Computer Organization. Assembly language. IJVM instruction set. Mic-1 simulator programming

The Java Virtual Machine

Code Generation: Introduction

SOFTWARE ARCHITECTURE 7. JAVA VIRTUAL MACHINE

A Tour of Language Implementation

A Quantitative Analysis of Java Bytecode Sequences

Evolving Hierarchical and Recursive Teleo-reactive Programs through Genetic Programming

EXAMINATIONS 2014 TRIMESTER 1 SWEN 430. Compiler Engineering. This examination will be marked out of 180 marks.

Run-Time Data Structures

Graph Structured Program Evolution

The Microarchitecture Level

JoeQ Framework CS243, Winter 20156

A Comparison of Several Linear Genetic Programming Techniques

The calculator problem and the evolutionary synthesis of arbitrary software

Genetic Programming for Data Classification: Partitioning the Search Space

Genetic programming. Lecture Genetic Programming. LISP as a GP language. LISP structure. S-expressions

COMP3131/9102: Programming Languages and Compilers

Code Generation. Frédéric Haziza Spring Department of Computer Systems Uppsala University

Building a Compiler with. JoeQ. Outline of this lecture. Building a compiler: what pieces we need? AKA, how to solve Homework 2

Programming Language Systems

Java and C II. CSE 351 Spring Instructor: Ruth Anderson

CONCEPT FORMATION AND DECISION TREE INDUCTION USING THE GENETIC PROGRAMMING PARADIGM

Evolution Evolves with Autoconstruction

Genetic Algorithms and Genetic Programming. Lecture 9: (23/10/09)

A brief overview of COSTA

CSE 5317 Midterm Examination 4 March Solutions

Joeq Analysis Framework. CS 243, Winter

High-Level Language VMs

Plan for Today. Safe Programming Languages. What is a secure programming language?

Approaches to Capturing Java Threads State

The Java Virtual Machine. CSc 553. Principles of Compilation. 3 : The Java VM. Department of Computer Science University of Arizona

Learning Programs in Different Paradigms using Genetic Programming

CSC 4181 Handout : JVM

Java Instrumentation for Dynamic Analysis

Just In Time Compilation

Evolutionary Computation. Chao Lan

Final Exam. 12 December 2018, 120 minutes, 26 questions, 100 points

Field Report: Applying Monte Carlo Tree Search for Program Synthesis

Principles of Computer Architecture. Chapter 5: Languages and the Machine

Evolutionary Art with Cartesian Genetic Programming

Mechanized Operational Semantics

CMSC430 Spring 2014 Midterm 2 Solutions

COMP 520 Fall 2009 Virtual machines (1) Virtual machines

The calculator problem and the evolutionary synthesis of arbitrary software

Instruction Set Architecture

Santa Fe Trail Problem Solution Using Grammatical Evolution

Chapter 5. A Closer Look at Instruction Set Architectures. Chapter 5 Objectives. 5.1 Introduction. 5.2 Instruction Formats

H.-S. Oh, B.-J. Kim, H.-K. Choi, S.-M. Moon. School of Electrical Engineering and Computer Science Seoul National University, Korea

Evolving Turing-Complete Programs for a Register Machine with. Self-modifying Code. 1 Introduction

Evolution of the Discrete Cosine Transform Using Genetic Programming

Chapter 5. A Closer Look at Instruction Set Architectures

Chapter 5. A Closer Look at Instruction Set Architectures

Chapter 5. A Closer Look at Instruction Set Architectures. Chapter 5 Objectives. 5.1 Introduction. 5.2 Instruction Formats

Tutorial 3: Code Generation

Termination Analysis of Java Bytecode

Over-view. CSc Java programs. Java programs. Logging on, and logging o. Slides by Michael Weeks Copyright Unix basics. javac.

Using Java Pathfinder to Reason about Agent Systems

Announcements. Lab Friday, 1-2:30 and 3-4:30 in Boot your laptop and start Forte, if you brought your laptop

Administration CS 412/413. Why build a compiler? Compilers. Architectural independence. Source-to-source translator

Interpreting a genetic programming population on an nvidia Tesla

Transcription:

Genetic Programming in the Wild: and orlovm, sipper@cs.bgu.ac.il Department of Computer Science Ben-Gurion University, Israel GECCO 2009, July 8 12 Montréal, Québec, Canada 1 / 46

GP: Programs or Representations? While it is common to describe GP as evolving programs, GP is not typically used to evolve programs in the familiar Turing-complete languages humans normally use for software development. Goals A Field Guide to Genetic Programming [Poli, Langdon, and McPhee, 2008] 2 / 46

GP: Programs or Representations? While it is common to describe GP as evolving programs, GP is not typically used to evolve programs in the familiar Turing-complete languages humans normally use for software development. It is instead more common to evolve programs (or expressions or formulae) in a more constrained and often domain-specific language. Goals A Field Guide to Genetic Programming [Poli, Langdon, and McPhee, 2008] 3 / 46

Our Goals From programs Evolve actual programs written in Java to software! Improve (existing) software written in unrestricted Java Goals 4 / 46

Our Goals From programs Evolve actual programs written in Java to software! Improve (existing) software written in unrestricted Java Goals Extending prior work Existing work uses restricted subsets of Java bytecode as representation language for GP individuals We evolve unrestricted bytecode 5 / 46

Let s Evolve Java Source Code Rely on the building blocks in the initial population Defining genetic operators is problematic How to define good source code crossover? Factorial (recursive) class F { int fact(int n) { int ans = 1; Factorial (iterative) class F { int fact(int n) { int ans = 1; if (n > 0) ans = n * fact( n-1); for (; n > 0; n--) ans = ans * n; } } return ans; } } return ans; 6 / 46

Stupid Example Source-level crossover typically produces garbage Factorial (recursive iterative) class F { int fact(int n) { int ans = 1; } } if (n > = 1; for (; n > 0; n--) ans = ans * n; n-1); return ans; 7 / 46

Stupid Example Source-level crossover typically produces garbage Factorial (recursive iterative) class F { int fact(int n) { int ans = 1; } } if (n > = 1; for (; n > 0; n--) ans = ans * n; n-1); return ans; 8 / 46

Parse Trees Maybe we can design better genetic operators? 9 / 46

Parse Trees Maybe we can design better genetic operators? Maybe but too much harsh syntax Possibly use parse tree? 10 / 46

Parse Trees Maybe we can design better genetic operators? Maybe but too much harsh syntax Possibly use parse tree? Just one BNF rule (of many) method_declaration ::= modifier type identifier ( parameter_list? ) [ ] statement_block ; method_declaration modifier type identifier ( parameter_list ) ] [ statement_block ; 11 / 46

Better than parse trees: Let s use bytecode! Java Virtual Machine (JVM) is compiled to platform-neutral bytecode is executed with fast just-in-time compiler High-order, simple yet powerful architecture Stack-based, supports hierarchical object types Not limited to Java! (Scala, Groovy, Jython, Kawa, Clojure, ) 12 / 46

(cont d) Some basic bytecode instructions Stack Local variables iconst 1 pushes int 1 onto operand stack aload 5 pushes object in local variable 5 onto stack (object type is deduced when class is loaded) dstore 6 pops two-word double to local variables 6 7 13 / 46

(cont d) Some basic bytecode instructions Stack Local variables iconst 1 pushes int 1 onto operand stack aload 5 pushes object in local variable 5 onto stack (object type is deduced when class is loaded) dstore 6 pops two-word double to local variables 6 7 Arithmetic instructions (affect operand stack) imul pops two ints from stack, pushes multiplication result 14 / 46

(cont d) Some basic bytecode instructions Stack Local variables iconst 1 pushes int 1 onto operand stack aload 5 pushes object in local variable 5 onto stack (object type is deduced when class is loaded) dstore 6 pops two-word double to local variables 6 7 Arithmetic instructions (affect operand stack) imul pops two ints from stack, pushes multiplication result Control flow (uses operand stack) ifle +13 pops int, jumps +13 bytes if value 0 lreturn pops two-word long, returns to caller s stack 15 / 46

(cont d) ary operators Java bytecode is less fragile than source code 16 / 46

(cont d) ary operators Java bytecode is less fragile than source code But, bytecode must be correct in order to run correctly Correct bytecode requirements Stack use is type-consistent (e.g., can t multiply an int by an Object) Local variables use is type-consistent (e.g., can t read an int after storing an Object) No stack underflow No reading from uninitialized variables 17 / 46

(cont d) ary operators Java bytecode is less fragile than source code But, bytecode must be correct in order to run correctly Correct bytecode requirements Stack use is type-consistent (e.g., can t multiply an int by an Object) Local variables use is type-consistent (e.g., can t read an int after storing an Object) No stack underflow No reading from uninitialized variables So, genetic operators are still delicate 18 / 46

(cont d) ary operators Java bytecode is less fragile than source code But, bytecode must be correct in order to run correctly Correct bytecode requirements Stack use is type-consistent (e.g., can t multiply an int by an Object) Local variables use is type-consistent (e.g., can t read an int after storing an Object) No stack underflow No reading from uninitialized variables So, genetic operators are still delicate Need good genetic operators to produce correct offspring 19 / 46

(cont d) ary operators Java bytecode is less fragile than source code But, bytecode must be correct in order to run correctly Correct bytecode requirements Stack use is type-consistent (e.g., can t multiply an int by an Object) Local variables use is type-consistent (e.g., can t read an int after storing an Object) No stack underflow No reading from uninitialized variables So, genetic operators are still delicate Need good genetic operators to produce correct offspring Conclusion: Avoid bad crossover and mutation 20 / 46

ary Unidirectional bytecode crossover: A Section α B Section β 21 / 46

ary Unidirectional bytecode crossover: A Section α B Section β 22 / 46

ary Good and bad crossovers Parent A: Factorial (recursive) class F { int fact(int n) { int ans = 1; } } if (n > 0) ans = n * fact(n-1); return ans; Compiled bytecode 0 iconst_1 1 istore_2 2 iload_1 3 ifle 16 6 iload_1 7 aload_0 8 iload_1 9 iconst_1 10 isub 11 invokevirtual #2 14 imul 15 istore_2 16 iload_2 17 ireturn 23 / 46

ary Good and bad crossovers Parent B: Factorial (iterative) class F { int fact(int n) { int ans = 1; } } for (; n > 0; n--) ans = ans * n; return ans; Compiled bytecode 0 iconst_1 1 istore_2 2 iload_1 3 ifle 16 6 iload_2 7 iload_1 8 imul 9 istore_2 10 iinc 1, -1 13 goto 2 16 iload_2 17 ireturn 24 / 46

ary Good and bad crossovers Replace a section in A with section from B A 0 iconst_1 1 istore_2 2 iload_1 3 ifle 16 6 iload_1 7 aload_0 8 iload_1 9 iconst_1 10 isub 11 invokevirtual #2 14 imul 15 istore_2 16 iload_2 17 ireturn B 0 iconst_1 1 istore_2 2 iload_1 3 ifle 16 6 iload_2 7 iload_1 8 imul 9 istore_2 10 iinc 1, -1 13 goto 2 16 iload_2 17 ireturn 25 / 46

ary Good crossover example Stack use is depth- and type-consistent, variables are initialized. A 0 iconst_1 1 istore_2 2 iload_1 3 ifle 16 6 iload_1 7 aload_0 8 iload_1 9 iconst_1 10 isub 11 invokevirtual #2 14 imul 15 istore_2 16 iload_2 17 ireturn B 0 iconst_1 1 istore_2 2 iload_1 3 ifle 16 6 iload_2 7 iload_1 8 imul 9 istore_2 10 iinc 1, -1 13 goto 2 16 iload_2 17 ireturn 26 / 46

ary Good crossover example Stack use is depth- and type-consistent, variables are initialized. (A B) 0 iconst_1 1 istore_2 2 iload_1 3 ifle 12 6 iload_1 7 iload_2 8 iload_1 9 imul 10 imul 11 istore_2 12 iload_2 13 ireturn Decompiled source class F { int fact(int n) { int ans = 1; } } if (n > 0) ans = n * (ans * n); return ans; 27 / 46

ary Bad crossover example Stack use is depth- and type-inconsistent. A 0 iconst_1 1 istore_2 2 iload_1 3 ifle 16 6 iload_1 7 aload_0 8 iload_1 9 iconst_1 10 isub 11 invokevirtual #2 14 imul 15 istore_2 16 iload_2 17 ireturn B 0 iconst_1 1 istore_2 2 iload_1 3 ifle 16 6 iload_2 7 iload_1 8 imul 9 istore_2 10 iinc 1, -1 13 goto 2 16 iload_2 17 ireturn 28 / 46

ary Bad crossover example Stack use is depth- and type-inconsistent. (A B) 0 iconst_1 1 istore_2 2 iload_1 3 ifle 13 6 iload_2 7 iload_1 8 invokevirtual #2 11 imul 12 istore_2 13 iload_2 14 ireturn Decompiled source class F { int fact(int n) { int ans = 1; } } if (n > 0) ans = ans.fact(n) *? ; return ans; 29 / 46

Compatible Constraints of unidirectional crossover A B Good crossover is achieved by respecting bytecode constraints: ( α is target section in A, β is source section in B) Operand stack e.g., β doesn t pop values with types incompatible to those popped by α Compatible XO Formal Definition 30 / 46

Compatible Constraints of unidirectional crossover A B Good crossover is achieved by respecting bytecode constraints: ( α is target section in A, β is source section in B) Operand stack e.g., β doesn t pop values with types incompatible to those popped by α Local variables e.g., variables read by β in B must be written before α in A with compatible types Compatible XO Formal Definition 31 / 46

Compatible Constraints of unidirectional crossover A B Good crossover is achieved by respecting bytecode constraints: ( α is target section in A, β is source section in B) Operand stack e.g., β doesn t pop values with types incompatible to those popped by α Local variables e.g., variables read by β in B must be written before α in A with compatible types Compatible XO Formal Definition Control flow e.g., branch instructions in β have no outside destinations 32 / 46

Formal Definition (Example of operand stack requirement) α and β have compatible stack frames up to stack depth of β: pops of α have identical or narrower types as pops of β; pushes of β have identical or narrower types as pushes of α Good crossover α β pre-stack **AB **AA post-stack **B **C depth 3 2 Types hierarchy: C B A Stack pops AB (2 stop tack frames) are narrower than AA, whereas stack push C is narrower than B Compatible XO Formal Definition (see [Orlov and Sipper, 2009] for full formal definitions) 33 / 46

Formal Definition (Example of operand stack requirement) α and β have compatible stack frames up to stack depth of β: pops of α have identical or narrower types as pops of β; pushes of β have identical or narrower types as pushes of α Bad crossover α β pre-stack **AB **Af post-stack **B **A depth 3 2 Types hierarchy: B A; Stack pops AB are not narrower than Af (B and f are incompatible); stack push A is not narrower than B f is a float Compatible XO Formal Definition (see [Orlov and Sipper, 2009] for full formal definitions) 34 / 46

Symbolic Regression As an evolutionary example Parameters Objective: symbolic regression, x 4 + x 3 + x 2 + x Fitness: sum of errors on 20 random data points in [ 1, 1] Input: Number num (a Java type) Symbolic Regression Seeding Statistics Results 35 / 46

Symbolic Regression As an evolutionary example Parameters Objective: symbolic regression, x 4 + x 3 + x 2 + x Fitness: sum of errors on 20 random data points in [ 1, 1] Input: Number num (a Java type) Seeding Population initialized using seeding [Langdon and Nordin, 2000] Seed population with clones of Koza s original worst-of-generation-0 [Koza, 1992] Symbolic Regression Seeding Statistics Results 36 / 46

Symbolic Regression Seeding with Koza s worst-of-generation-0 Original Lisp individual and its tree representation: (EXP (- (% X (- X (SIN X))) (RLOG (RLOG (* X X))))) EXP - % RLOG Symbolic Regression Seeding Statistics Results X - RLOG X SIN * X X X 37 / 46

Symbolic Regression Seeding with Koza s worst-of-generation-0 Translation to unrestricted Java class Gecco { Number simpleregression(number num) { double x = num.doublevalue(); double llsq = Math.log(Math.log(x*x)); double dv = x / (x - Math.sin(x)); double worst = Math.exp(dv - llsq); return Double.valueOf(worst + Math.cos(1)); } Symbolic Regression Seeding Statistics Results } /* Rest of class omitted */ We added a couple of building blocks in the last line 38 / 46

Symbolic Regression Setup and Statistics Setup (similar to Koza s) Population: 500 individuals Generations: 51 (or less) Probabilities: p cross = 0.9 ( α and β segments are uniform over segment sizes) Selection: binary tournament Symbolic Regression Seeding Statistics Results 39 / 46

Symbolic Regression Setup and Statistics Setup (similar to Koza s) Population: 500 individuals Generations: 51 (or less) Probabilities: p cross = 0.9 ( α and β segments are uniform over segment sizes) Selection: binary tournament Statistics Yield: 99% of runs successful (out of 100) Runtime: 30 60 s on dual-core 2.6 GHz Opteron Memory limits: insignificant w.r.t. runtime Symbolic Regression Seeding Statistics Results 40 / 46

Symbolic Regression Evolved perfect individuals A perfect solution easily evolves: (beware of decompiler quirks!) class Gecco_0_7199 { Number simpleregression(number num) { double d = num.doublevalue(); d = num.doublevalue(); double d1 = d; d = Double.valueOf(d + d * d * num.doublevalue()).doublevalue(); return Double.valueOf(d + (d = num.doublevalue()) * num.doublevalue()); } Symbolic Regression Seeding Statistics Results } /* Rest of class unchanged */ Computes (x + x x x) + (x + x x x) x = x(1 + x)(1 + x 2 ) 41 / 46

Symbolic Regression Evolved perfect individuals Another solution: class Gecco_0_2720 { Number simpleregression(number num) { double d = num.doublevalue(); d = d; d = d; double d1 = Math.exp(d - d); return Double.valueOf(num.doubleValue() * (num.doublevalue() * (d * d + d) + d) + d); } } /* Rest of class unchanged */ Symbolic Regression Seeding Statistics Results Computes x (x (x x + x) + x) + x = x(1 + x(1 + x(1 + x))) 42 / 46

Completely unrestricted Java programs can be evolved (via bytecode) Extant (bad) Java programs can be improved (e.g., initial regression seed) Future Work 43 / 46

Future Work Exhibit viability on other problems We currentlty have results for: complex regression, artificial ant, intertwined spirals, Future Work 44 / 46

Future Work Exhibit viability on other problems We currentlty have results for: complex regression, artificial ant, intertwined spirals, Future Work Loops and recursion are not a problem! 45 / 46

J. R. Koza. Genetic Programming: On the Programming of Computers by Means of Natural Selection. The MIT Press, Cambridge, MA, USA, Dec. 1992. ISBN 0-262-11170-5. W. B. Langdon and P. Nordin. Seeding genetic programming populations. In R. Poli, W. Banzhaf, W. B. Langdon, J. Miller, P. Nordin, and T. C. Fogarty, editors, Genetic Programming: European Conference, EuroGP 2000, Edinburgh, Scotland, UK, April 15 16, 2000, volume 1802/2004 of Lecture Notes in Computer Science, pages 304 315, Berlin / Heidelberg, 2000. Springer-Verlag. ISBN 978-3-540-67339-2. doi:10.1007/b75085. M. Orlov and M. Sipper. Genetic programming in the wild: unrestricted bytecode. In Proceedings of the 11th Annual Conference on Genetic and ary Computation, July 8 12, 2009, Montréal Québec, Canada. ACM Press, July 2009. To appear. R. Poli, W. B. Langdon, and N. F. McPhee. A Field Guide to Genetic Programming. Lulu Enterprises, UK, Mar. 2008. ISBN 978-1-4092-0073-4. URL http://www.gp-field-guide.org.uk. (With contributions by J. R. Koza). 46 / 46