Genetic Programming. and its use for learning Concepts in Description Logics

Similar documents
Genetic Programming Prof. Thomas Bäck Nat Evur ol al ut ic o om nar put y Aling go rg it roup hms Genetic Programming 1

Genetic programming. Lecture Genetic Programming. LISP as a GP language. LISP structure. S-expressions

Advanced Search Genetic algorithm

Previous Lecture Genetic Programming

ARTIFICIAL INTELLIGENCE (CSCU9YE ) LECTURE 5: EVOLUTIONARY ALGORITHMS

GENETIC ALGORITHM with Hands-On exercise

Introduction to Genetic Algorithms

Genetic Algorithms Variations and Implementation Issues

Computational Intelligence

Genetic Programming Part 1

Genetic Algorithms and Genetic Programming. Lecture 9: (23/10/09)

Investigating the Application of Genetic Programming to Function Approximation

Genetic Programming: A study on Computer Language

Gen := 0. Create Initial Random Population. Termination Criterion Satisfied? Yes. Evaluate fitness of each individual in population.

Evolutionary Algorithms. CS Evolutionary Algorithms 1

CONCEPT FORMATION AND DECISION TREE INDUCTION USING THE GENETIC PROGRAMMING PARADIGM

Genetic Programming. Modern optimization methods 1

The Parallel Software Design Process. Parallel Software Design

Evolving SQL Queries for Data Mining

Heuristic Optimisation

Evolutionary Computation. Chao Lan

Search Algorithms for Regression Test Suite Minimisation

Genetic Programming for Multiclass Object Classification

Mutations for Permutations

Genetic Algorithms. Kang Zheng Karl Schober

Escaping Local Optima: Genetic Algorithm

Genetic Programming. Czech Institute of Informatics, Robotics and Cybernetics CTU Prague.

Genetic Programming of Autonomous Agents. Functional Requirements List and Performance Specifi cations. Scott O'Dell

Evolution Evolves with Autoconstruction

CHAPTER 4 GENETIC ALGORITHM

Machine Evolution. Machine Evolution. Let s look at. Machine Evolution. Machine Evolution. Machine Evolution. Machine Evolution

CS5401 FS2015 Exam 1 Key

Artificial Intelligence Application (Genetic Algorithm)

Classification Using Genetic Programming. Patrick Kellogg General Assembly Data Science Course (8/23/15-11/12/15)

Multi-Objective Optimization Using Genetic Algorithms

Administrative. Local Search!

N-Queens problem. Administrative. Local Search

Automatic Programming of Agents by Genetic Programming

CHAPTER 6 REAL-VALUED GENETIC ALGORITHMS

Evolutionary Computation Part 2

Review: Final Exam CPSC Artificial Intelligence Michael M. Richter

Hybridization EVOLUTIONARY COMPUTING. Reasons for Hybridization - 1. Naming. Reasons for Hybridization - 3. Reasons for Hybridization - 2

Evolutionary Computation Algorithms for Cryptanalysis: A Study

Lecture 8: Genetic Algorithms

Planning and Search. Genetic algorithms. Genetic algorithms 1

CHAPTER 2 CONVENTIONAL AND NON-CONVENTIONAL TECHNIQUES TO SOLVE ORPD PROBLEM

Suppose you have a problem You don t know how to solve it What can you do? Can you use a computer to somehow find a solution for you?

A Genetic Algorithm for Graph Matching using Graph Node Characteristics 1 2

Introduction to Evolutionary Computation

Research Article Path Planning Using a Hybrid Evolutionary Algorithm Based on Tree Structure Encoding

Genetic Algorithm for Finding Shortest Path in a Network

What is GOSET? GOSET stands for Genetic Optimization System Engineering Tool

The Genetic Algorithm for finding the maxima of single-variable functions

A Genetic Programming Approach for Distributed Queries

Genetic Algorithm for Dynamic Capacitated Minimum Spanning Tree

Evolving Hierarchical and Recursive Teleo-reactive Programs through Genetic Programming

Job Shop Scheduling Problem (JSSP) Genetic Algorithms Critical Block and DG distance Neighbourhood Search

The k-means Algorithm and Genetic Algorithm

Local Search. CS 486/686: Introduction to Artificial Intelligence Winter 2016

Genetic Algorithms. Aims 09s1: COMP9417 Machine Learning and Data Mining. Biological Evolution. Evolutionary Computation

MDL-based Genetic Programming for Object Detection

IMPROVING A GREEDY DNA MOTIF SEARCH USING A MULTIPLE GENOMIC SELF-ADAPTATING GENETIC ALGORITHM

Lecture 6: Genetic Algorithm. An Introduction to Meta-Heuristics, Produced by Qiangfu Zhao (Since 2012), All rights reserved

ISSN: [Keswani* et al., 7(1): January, 2018] Impact Factor: 4.116

4/22/2014. Genetic Algorithms. Diwakar Yagyasen Department of Computer Science BBDNITM. Introduction

Usage of of Genetic Algorithm for Lattice Drawing

Basic Data Mining Technique

Selection Based on the Pareto Nondomination Criterion for Controlling Code Growth in Genetic Programming

MINIMAL EDGE-ORDERED SPANNING TREES USING A SELF-ADAPTING GENETIC ALGORITHM WITH MULTIPLE GENOMIC REPRESENTATIONS

Hardware Neuronale Netzwerke - Lernen durch künstliche Evolution (?)

Lecture 4. Convexity Robust cost functions Optimizing non-convex functions. 3B1B Optimization Michaelmas 2017 A. Zisserman

CHAPTER 5 ENERGY MANAGEMENT USING FUZZY GENETIC APPROACH IN WSN

Structural Optimizations of a 12/8 Switched Reluctance Motor using a Genetic Algorithm

Introduction to Optimization

Geometric Semantic Genetic Programming ~ Theory & Practice ~

Evolutionary origins of modularity

Evolutionary Decision Trees and Software Metrics for Module Defects Identification

(Conceptual) Clustering methods for the Semantic Web: issues and applications

DERIVATIVE-FREE OPTIMIZATION

Information Fusion Dr. B. K. Panigrahi

Genetic Algorithms for Vision and Pattern Recognition

1. Introduction. 2. Motivation and Problem Definition. Volume 8 Issue 2, February Susmita Mohapatra

Local Search. CS 486/686: Introduction to Artificial Intelligence

Polyhedron Evolver Evolution of 3D Shapes with Evolvica

Genetic Programming. Charles Chilaka. Department of Computational Science Memorial University of Newfoundland

Correspondence. Object Detection via Feature Synthesis Using MDL-Based Genetic Programming

Evolutionary Search in Machine Learning. Lutz Hamel Dept. of Computer Science & Statistics University of Rhode Island

Reducing Graphic Conflict In Scale Reduced Maps Using A Genetic Algorithm

Genetic Programming for Julia: fast performance and parallel island model implementation

A New Crossover Technique for Cartesian Genetic Programming

A More Stable Approach To LISP Tree GP

March 19, Heuristics for Optimization. Outline. Problem formulation. Genetic algorithms

Clustering Analysis of Simple K Means Algorithm for Various Data Sets in Function Optimization Problem (Fop) of Evolutionary Programming

Evolution of the Discrete Cosine Transform Using Genetic Programming

Kyrre Glette INF3490 Evolvable Hardware Cartesian Genetic Programming

3DUDOOHO*UDPPDWLFDO(YROXWLRQ

Using Genetic Programming to Evolve a General Purpose Sorting Network for Comparable Data Sets

Logik für Informatiker Logic for computer scientists. Ontologies: Description Logics

JHPCSN: Volume 4, Number 1, 2012, pp. 1-7

ADAPTATION OF REPRESENTATION IN GP

Transcription:

Concepts in Description Artificial Intelligence Institute Computer Science Department Dresden Technical University May 29, 2006

Outline Outline: brief introduction to explanation of the workings of a algorithm application of to Description Logic learning including experimental results

- Machine Learning algorithm inspired by biological evolution in the real world uses Darwin s theory of evolution (natural selection, survival of the fittest) nature individual fitness crossover, mutation natural selection evolutionary computing problem solution quality of a solution genetic search operators reuse of good solutions

- Generic Algorithm Generic Evolutionary Algorithm: create population while the termination criterion is not met: select a subset of the population based on their fitness produce offspring using genetic operators on selected individuals create a new population from the old one and the offspring

GP - is a special kind of Evolutionary Computing algorithm distinctive feature: individuals are represented as variable length programs programs can have a linear (like code in imperative languages) or tree structure (like LISP) we focus on tree structures

and Alphabet alphabet consists of terminal set T and function set F terminal set: set of all leaf nodes, which can occur function set: set of all internal nodes, which can occur * - + 2 y 3 * x 2 Figure: possible tree for T = {0, 1, 2, 3, x, y} and F = {+,, }

all methods have as parameter a maximum depth d methods to create a single tree: full method: if d > 0 randomly select a function symbol and call the method recursively for all children, which need to be generated, with parameter d 1 if d = 0 return a random terminal symbol grow method: modify full method to select an alphabet symbol not necessarily a function symbol in the d > 0 case full method creates complete trees (i.e. all leaf nodes have equal depth)

ramped half and half: divide population in d 1 parts, where each part has its own depth parameter (from 2 to d) half of the trees in each part are generated using the full and the other half using the grow method with the part-specific depth parameter grow and full method can be used to create the population, but often ramped half and half is used to increase diversity

it is unlikely that tree initialisation alone yields a good solution genetic operators aim at finding and combining pieces of a good solution simplest genetic operator is reproduction: copies one individual in the next generation

Crossover Crossover: selects two parents randomly selects a node in each parent, the crossover point subtrees rooted by these two points are swapped + 1 x parents * 2 * x 1 + offspring * * x 1 x 2 1

Mutation Mutation: selects one individual from the population randomly chooses a node in the inidividual, the mutation point subtree rooted by this node is replaced by another tree (e.g. using the grow method) + 1 x + * x x 1

Editing Editing: selects one individual from the population applies simplification rules to the individual * 1 - x 0 x

selection methods in GPs simulate natural selection ("survival of the fittest") fitness of an individual is usually encoded in a single number measuring process can be very complex and take multiple objectives into account

FPS Fitness Proportionate (FPS): probability of an individual being selected is proportional to its fitness stagnation problem can arise stagnation: in a population of individuals with similar fitness, there is almost no selective pressure, because all individuals have approximately the same probability of being selected modified versions of FPS (e.g. sigma truncation) avoid the stagnation problem

Rank and Tournament Rank : individuals are sorted according to their fitness, i.e. each individual has a rank within the population probability of being selected is a function of the rank (any sensible function can be used) Tournament : takes m individuals from the population and selects the best one (m is typically a number between 2 and 8)

Criteria: when optimal solution is found when solution is within an ɛ-range of an optimal solution manual abort fixed number of generations no better solution has been found for a fixed number of generations population does not have sufficient genetic diversity to produce new solutions

create initial population while the termination criterion is not met: evaluate the fitness of all individuals in the population while the number of individuals to generate is not reached: choose a genetic operator select (by one of the selection methods) the appropriate numbers of individuals for this operator and generate new individuals copy these individuals into the new population

Human competitive GP Results John Koza, a respectable GP researcher, has created a list of 36 instances of "human competitive intelligence" produced by genetic programming some of these are: creation of quantum algorithms (better than previously published results) creation of soccer playing programs for Robo Cup synthesis of analog circuits

Learning in Description

goal: learn a concept definition from positive examples + negative examples + background knowledge Why is it useful to learn in DLs? may have similar applications like ILP (Inductive Logic Programming) approaches for learning horn clauses e.g. in biology and medicine incremental ontology learning in context of OWL and the Semantic Web (can help to build ontologies)

Using GP for Why use GP? flexible: can (probably) learn in any description language robust: can handle noise GP has shown promising results in practise GP usually performs good when approximate solutions are acceptable in description languages, which allow all boolean operations it becomes difficult to apply standard ILP appraches makes use of computational resources (the more resources the better the result) can be fine-tuned in various ways to improve results Why not? too general: there may be better special purpose methods, especially for less expressive description languages stochastic outcome: makes analysis difficult and good solution quality cannot be guaranteed positive effect of natural selection and standard genetic operators is not obvious when learning DLs

ALC ALC concepts: C, D A C C D C D r.c r.c Example TBox T : Man Woman Mother Woman haschild. Example ABox A : Man(STEPHEN). Man(MONICA). Woman(JESSICA). haschild(stephen, JESSICA).

Learning Problem we have a target concept name Target and a knowledge base K as background knowledge examples are of the form Target(a) (positive) or Target(a) (negative), where a is an object let E + be the set of positive examples, E the set of negative examples and E = E + E we want to find an acyclic definition Def of the form Target C such that for K = K {Def } we have K = E

of ALC concepts alphabet let NC be the set of concept names and N R the set of role names terminal set: T = NC {, } function set: F = {,, } { r r N R } { r r N R } example of a tree representation of the ALC concept Male haschild.female: Male haschild Female

Fitness Measurement How to measure fitness? an example a E is in one of the three following classes: positively classified: K = C(a) negatively classified: K = C(a) neutrally classified: K = C(a) and K = C(a) high penalties for classification errors (positive example classified as negative or negative example classified as positive) smaller penalty if classification is not provably (in)correct (positive or negative example classified as neutral)

Fitness Measurement primary learning goal is correct classification, secondary goal is small concept length (smaller concepts usually generalize better to unseen examples) length C of a concept C is defined inductively in the obvious way: A = = = 1 C = C + 1 C D = C D = 1 + C + D r.c = r.c = 2 + C

Uncle Problem STEPHEN JESSICA TYLER TOM JASON BOB JOHN LAUREN GRAIG MARIA TIM AMBER THOMAS married PETER sibling child Uncle Male ( hassibling. haschild. married. hassibling. haschild. )

Uncle Problem GP: 10 generations crossover probability: 90%, mutation probability: 3% selection: tournament (tournament-size 3) Random Guesser: uses grow method to generate trees, which amounts to guessing a solution for technical reasons a depth limit of 10 is used (otherwise trees are too big to be processed in reasonable time) Fitness Function: error penalty: 3 points, neutral classification penalty: 1 point, length penalty: 1 10 points

4 fitness 5 6 7 8 9 guesser GP 1 10 0 5000 10000 15000 20000 25000 30000 tests

Hill Climbing Operator introduce a new genetic operator, which takes one individual and produces a new one: given a concept C we can look at the "neighbourhood" N(C) of C, i.e. concepts of the form C A, C A, r.c, r.c from these concepts we can compute a subset, which are better classifiers than C and for which there is no better classifier in N(C) if this set is empty we return C (= reproduction) otherwise we randomly select one element of this set and return it operator can be seen as one-step-hill-climbing evaluation of neighbour concepts can be approximated efficiently using the fact that we already evaluated C (details omitted) new algorithm is a combination of GP and hill climbing to test it on the uncle problem we use 15% hill climbing, 80% crossover (see next slide)

4 fitness 5 6 7 8 9 guesser GP 1 GP 2 10 0 5000 10000 15000 20000 25000 30000 tests

Automatically Defined Concepts (ADC) idea: inventing new concepts, which are useful for defining the target concept instead of one tree, we evolve two trees alphabet of main tree is extended by a terminal symbol "ADC" ADCs are a modification of the idea of automatically defined functions (ADFs) already known in genetic programming advantage: solution more modular, easier to read disadvantage: larger search space hassibling Male haschild ADC married ADC