Variable-Branch Decision Tree

Variable-Branch Decision Tree based on Genetic Algorithm
楊雄彬

Contents
- Decision tree for compression and recognition
- K-means algorithm
- Binary decision tree
- Greedy decision tree
- Two problems of the greedy decision tree
- Genetic algorithm for solving Problem 1
- Classification points for solving Problem 2
- Conclusions

Decision tree for compression
[Figure: codebook tree rooted at S with three codewords, coded C1 = 00, C2 = 01, C3 = 10]

Encode: the input x descends the codebook tree from the root S to its nearest codeword (here C2) and is transmitted as the code 01.
Decode: the received code is looked up in the codebook (00 -> C1, 01 -> C2, 10 -> C3), so the code 01 is reconstructed as C2.
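A rough sketch of the encode/decode step in Python (the Node class, its fields and the nearest-child descent rule are assumptions; the slide only shows the 3-leaf codebook tree above):

import numpy as np

class Node:
    def __init__(self, centroid, children=None, code=None):
        self.centroid = np.asarray(centroid, dtype=float)
        self.children = children or []   # an empty list marks a leaf codeword
        self.code = code                 # bit string assigned to a leaf, e.g. "01"

def encode(x, root):
    # Descend from the root, always moving to the child nearest to x,
    # and return the code of the leaf codeword that is reached.
    node = root
    while node.children:
        node = min(node.children, key=lambda c: np.linalg.norm(x - c.centroid))
    return node.code

def decode(code, codebook):
    # Decoding is a table lookup, e.g. {"00": C1, "01": C2, "10": C3}.
    return codebook[code]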

Decision tree for recognition
[Figure: classification tree rooted at S; codewords C1, C2, C3 lead to class labels A, B, C]

Recognition: the input x descends the classification tree from the root S; it falls into the C2 branch, so it is assigned class label B.

K-means (C-means) algorithm
Divide the data set into k clusters.
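A minimal k-means sketch (the iteration count, random initialization and empty-cluster handling are assumptions):

import numpy as np

def k_means(data, k, iters=20, seed=0):
    # Assign every vector to its nearest center, then move each center
    # to the mean of its cluster; repeat for a fixed number of iterations.
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    centers = data[rng.choice(len(data), size=k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):        # keep the old center if a cluster empties
                centers[j] = data[labels == j].mean(axis=0)
    return centers, labels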

Decision tree for compression: binary decision tree (k = 2)
Input X is encoded with tree T1. X is compared with three codewords: C1, C2 and C3.
[Figure: binary tree T1 rooted at S with codewords C1, C2 and C3]

Decision tree for compression: 3-ary decision tree (k = 3)
Input X is encoded with tree T2. X is compared with three codewords: C1, C2 and C3.
[Figure: 3-ary tree T2 whose root splits directly into C1, C2 and C3]

T1 and T2: which one is better? Is either optimal?
It is hard for users to determine which one is better; they usually have no idea what value of k to choose. Thus, T1 and T2 are not guaranteed to be optimal.
Compression performance depends on the coding quality and the bit rate: the coding quality should be as high as possible and the bit rate as low as possible.

Greedy decision tree (k = 2)
The growing method selects the node with the maximum value of λ to split during the design of the decision tree, where
λ = ΔD / ΔR  (decrease in distortion / increase in bit rate).
Splitting C1 into C2 and C3 satisfies:
Distortion(C1) > Distortion(C2) + Distortion(C3)
Bit rate(C1) < Bit rate(C2) + Bit rate(C3)
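A sketch of this greedy growing rule, where split_node is a hypothetical helper that performs a trial k = 2 split of a leaf and reports the distortion decrease ΔD, the bit-rate increase ΔR and the two children:

def grow_greedy_tree(leaves, split_node, max_leaves):
    # Repeatedly split the leaf with the largest lambda = dD / dR.
    while len(leaves) < max_leaves:
        best_leaf, best_children, best_lam = None, None, -1.0
        for leaf in leaves:
            d_drop, r_gain, children = split_node(leaf)   # hypothetical helper
            lam = d_drop / r_gain if r_gain > 0 else 0.0
            if lam > best_lam:
                best_leaf, best_children, best_lam = leaf, children, lam
        leaves.remove(best_leaf)
        leaves.extend(best_children)
    return leaves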

[Figure: two candidate splits of the training vectors x1, x2 grow tree T into T1 and T2; T2 yields a larger distortion decrease ΔD per bit-rate increase ΔR]
T2 is better than T1.

Problem 1: The greedy decision tree is a fixed-branch decision tree, so it is still not an optimal decision tree. Which branching factor is better?

Solution for Problem 1
A variable-branch decision tree is proposed to replace the fixed-branch decision tree.
How do we determine the proper number of branches of a node? A nearest-neighbor algorithm combined with a genetic clustering algorithm (NN + GA) automatically determines the proper number of branches of each node X.

Reduce the training data set of node X using the nearest-neighbor algorithm: from size n down to size m (n >> m).
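The slides do not spell out the reduction step, so the sketch below is only one plausible reading: the closest pair of samples (a point and its nearest neighbor) is repeatedly merged into its mean until m representatives remain.

import numpy as np

def reduce_training_set(data, m):
    # Merge the globally closest pair of points into their mean until
    # only m representatives are left (O(n^3); fine for a sketch).
    pts = [np.asarray(p, dtype=float) for p in data]
    while len(pts) > m:
        best_i, best_j, best_d = 0, 1, np.inf
        for i in range(len(pts)):
            for j in range(i + 1, len(pts)):
                d = np.linalg.norm(pts[i] - pts[j])
                if d < best_d:
                    best_i, best_j, best_d = i, j, d
        merged = (pts[best_i] + pts[best_j]) / 2.0
        pts = [p for k, p in enumerate(pts) if k not in (best_i, best_j)]
        pts.append(merged)
    return np.array(pts)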

Genetic algorithm
- Initialize: define the bit string and set the population size.
- Reproduction: calculate the fitness of each bit string.
- Crossover: interchange partial solutions among the bit strings.
- Mutation: change bits in the bit strings.
- Repeat until the stopping test (End?), then output the result.
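A skeleton of this loop (the helper functions, population handling and generation count are placeholders, not the authors' exact procedure):

def genetic_algorithm(init_population, fitness, select, crossover, mutate,
                      generations=100):
    # Initialize -> evaluate fitness -> reproduction -> crossover -> mutation,
    # repeated until the stopping test, then output the best chromosome.
    population = init_population()
    for _ in range(generations):
        scored = [(fitness(r), r) for r in population]
        population = select(scored)                   # reproduction
        population = crossover(population)            # exchange partial solutions
        population = [mutate(r) for r in population]  # change bits
    return max(population, key=fitness)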

Initialize
Each chromosome is a bit string of length m; a population of such strings is generated (e.g. 01001…, 10110…, 00110…).
Ex: m = 8, R = 0 1 0 0 1 1 0 0 over bits b1 b2 b3 b4 b5 b6 b7 b8.
Three initial seeds, b2, b5 and b6 (the bits set to 1), generate three clusters X1, X2 and X3 of the node's data X.

Reproduction (1)
Fitness(R) = Σ over all x_i in the data set of [ D_inter(x_i) · w − D_intra(x_i) ]
- D_inter(x_i) denotes the minimal distance between the sample x_i and its nearest other cluster.
- D_intra(x_i) denotes the distance between the sample x_i and its own cluster center.
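A sketch of this fitness, assuming the chromosome's 1-bits mark seed samples and every sample joins its nearest seed, as on the initialization slide; the weight w and the Euclidean distance are assumptions:

import numpy as np

def clusters_from_chromosome(R, data):
    # 1-bits in R select the seed samples; each sample joins its nearest seed.
    data = np.asarray(data, dtype=float)
    seeds = data[np.flatnonzero(R)]
    dists = np.linalg.norm(data[:, None, :] - seeds[None, :, :], axis=2)
    return seeds, dists.argmin(axis=1)

def fitness_clustering(R, data, w=1.0):
    # Fitness(R) = sum_i [ w * D_inter(x_i) - D_intra(x_i) ]
    seeds, labels = clusters_from_chromosome(R, data)
    data = np.asarray(data, dtype=float)
    centers = np.array([data[labels == j].mean(axis=0) for j in range(len(seeds))])
    total = 0.0
    for x, lab in zip(data, labels):
        d = np.linalg.norm(centers - x, axis=1)
        d_intra = d[lab]                                            # own center
        d_inter = np.min(np.delete(d, lab)) if len(d) > 1 else 0.0  # nearest other cluster
        total += w * d_inter - d_intra
    return total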

If w is large, the fitness is dominated by D_inter: clusters are pushed far apart (D_inter is large).
If w is small, the fitness is dominated by D_intra: clusters are kept compact (D_intra is small).

Ex (bits b1 … b8): R1 = 1 0 0 0 1 1 0 0 gives 3 clusters; R2 = 0 0 1 0 0 1 0 0 gives 2 clusters.
[Figure: the data set X = {x1, …, x7} partitioned into X1, X2, X3 under R1 and into X1, X2 under R2]

Does a good clustering result imply a good decision tree for compression? Not necessarily.

Reproduction (2)
Fitness(R) = λ = ΔD / ΔR, where ΔD is the distortion decrease and ΔR the bit-rate increase obtained when tree T grows into tree Tx by splitting node X into the clusters X1, X2, X3 encoded by R.

Ex (bits b1 … b8):
R1 = 0 1 0 0 1 1 0 0  (3 branches)
R2 = 0 0 1 0 0 1 0 0  (2 branches)
R3 = 1 1 1 0 0 1 0 0  (4 branches)
R4 = 0 0 0 0 1 1 1 1  (4 branches)
R5 = 0 1 0 0 1 0 0 0  (2 branches)
R6 = 0 1 0 1 0 1 0 0  (3 branches)
Fitness(R5) > Fitness(R1) > Fitness(R6) > Fitness(R2) > Fitness(R4) > Fitness(R3)

Prob(R5) > Prob(R1) > Prob(R6) > Prob(R2) > Prob(R4) > Prob(R3)
[Figure: roulette wheel with slices proportional to the selection probabilities of R1 … R6]
Ex: R5, R1, R5, R6, R5, R4 are selected to form the next population.
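A roulette-wheel (fitness-proportional) selection sketch consistent with the probability ordering above (the fitness shift and the RNG seed are assumptions):

import numpy as np

def roulette_select(scored, size, seed=0):
    # scored: list of (fitness, chromosome) pairs; higher fitness gets a
    # proportionally larger slice of the wheel, hence a higher Prob(R).
    rng = np.random.default_rng(seed)
    fits = np.array([f for f, _ in scored], dtype=float)
    fits = fits - fits.min() + 1e-9        # shift so every slice is positive
    probs = fits / fits.sum()
    picks = rng.choice(len(scored), size=size, p=probs)
    return [scored[i][1] for i in picks]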

Crossover
R1 = 0 0 1 0 0 1 0 0   ->   R1' = 0 1 1 0 0 1 0 0
R2 = 1 1 1 0 0 1 0 0   ->   R2' = 1 0 1 0 0 1 0 0
(the two parents exchange the bit in position b2)
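The example above exchanges a single bit between the two parents; a uniform-crossover sketch in the same spirit (the per-position swap probability is an assumption):

import numpy as np

def uniform_crossover(r1, r2, p_swap=0.5, seed=0):
    # Swap bits between the parents position by position with probability p_swap.
    rng = np.random.default_rng(seed)
    r1, r2 = np.asarray(r1), np.asarray(r2)
    mask = rng.random(len(r1)) < p_swap
    c1, c2 = r1.copy(), r2.copy()
    c1[mask], c2[mask] = r2[mask], r1[mask]
    return c1, c2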

Mutation
R1 = 0 0 1 0 0 1 0 0   ->   R1' = 0 0 0 0 0 1 0 0
(bit b3 is flipped from 1 to 0; a mutation can flip a bit either 0 -> 1 or 1 -> 0)
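A bit-flip mutation sketch (the per-bit flip probability is an assumption):

import numpy as np

def mutate(r, p_flip=0.05, seed=0):
    # Flip each bit independently with a small probability.
    rng = np.random.default_rng(seed)
    r = np.asarray(r)
    flips = rng.random(len(r)) < p_flip
    return np.where(flips, 1 - r, r)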

Problem 2: The encoding codeword is not always the closest codeword to the input.
Ex: input O is encoded by C2 in the decision tree, even though O is actually closer to C5.

Solution for Problem 2
The cluster center alone is not a proper basis for classifying the input vector in the decision tree.
Ex (1): a large cluster
[Figure: input O lying between a large cluster C1 and cluster C2]

Ex (2): non-spherical clusters
[Figure: input o near an elongated, non-spherical cluster c1]

Danger region among the clusters
[Figure: shaded danger region between neighboring clusters]

Classification points are defined to classify the input vectors in the decision tree.
[Figure: clusters C1 and C2 with classification points P1 and P2 near their boundary; the input O is routed by its nearest classification point]

How do we find the classification points in a cluster?
[Figure: node X with children X1 … X4 (centers C1 … C4) and classification points p1 … p6]
The input O is compared with p1, p2, p3, p4, p5 and p6.
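A sketch of routing an input by its nearest classification point (the (point, child) pairing used below is an assumption about how p1 … p6 map to the children):

import numpy as np

def classify_by_points(x, class_points):
    # class_points: list of (point, child) pairs, e.g. [(p1, X2), (p2, X3), ...].
    # The input is sent to the child whose classification point is nearest.
    x = np.asarray(x, dtype=float)
    dists = [np.linalg.norm(x - np.asarray(p, dtype=float)) for p, _ in class_points]
    return class_points[int(np.argmin(dists))][1]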


Conclusions
- The variable-branch decision tree can also be applied to recognition applications.
- The traditional NCUT tree can be improved by a genetic algorithm.
- An adaptive variable-branch decision tree can be proposed in the future.

Thank you