Variable-Branch Decision Tree Based on Genetic Algorithm
楊雄彬
Contents
- Decision tree for compression and recognition
- K-means algorithm
- Binary decision tree
- Greedy decision tree
- Two problems of the greedy decision tree
- Genetic algorithm for solving Problem 1
- Classification points for solving Problem 2
- Conclusions
Decision tree - compression
A codebook tree: the root S holds three codewords C1, C2, C3, assigned the codes C1 = 00, C2 = 01, C3 = 10.
Encode: the input x descends the codebook tree to its nearest codeword (here C2), and the code 01 is output.
Decode: the code 01 is looked up in the codebook (00 → C1, 01 → C2, 10 → C3) to recover C2.
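A minimal sketch of this encode/decode round trip, assuming 2-D codewords and the fixed-length codes from the slide (the codeword vectors themselves are illustrative):

```python
import numpy as np

# Codebook from the slide: C1 -> 00, C2 -> 01, C3 -> 10 (vectors are made up).
codebook = {"00": np.array([0.0, 0.0]),   # C1
            "01": np.array([1.0, 0.0]),   # C2
            "10": np.array([0.0, 1.0])}   # C3

def encode(x):
    """Emit the code of the codeword nearest to x."""
    return min(codebook, key=lambda c: np.linalg.norm(x - codebook[c]))

def decode(code):
    """Look the code up in the codebook to recover the codeword."""
    return codebook[code]

# Round trip: a vector near C2 encodes to "01" and decodes back to C2.
print(encode(np.array([0.9, 0.1])), decode("01"))
```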
Decision tree - recognition
A classification tree: the root S holds three codewords C1, C2, C3, whose leaves carry the class labels A, B, C.
Recognition: the input x descends the classification tree; reaching the leaf of C2 assigns x the class label B.
K-means (C-means) algorithm: divide the data set into k clusters.
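A minimal k-means sketch (Lloyd's algorithm), under the usual assumptions of Euclidean distance and random initial centers:

```python
import numpy as np

def kmeans(data, k, iters=100, seed=0):
    """Divide `data` (an n x d array) into k clusters."""
    rng = np.random.default_rng(seed)
    centers = data[rng.choice(len(data), k, replace=False)]
    for _ in range(iters):
        # Assign each sample to its nearest center.
        labels = np.argmin(np.linalg.norm(data[:, None] - centers[None], axis=2), axis=1)
        # Recompute each center as the mean of its cluster (keep empty clusters put).
        new = np.array([data[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return centers, labels
```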
Decision tree ----- Compression
Binary decision tree T1 (k = 2): the root S splits into two children, one of which splits again, giving the codewords C1, C2, C3. An input X encoded in T1 is compared with the three codewords C1, C2, and C3.
Decision tree ----- Compression
3-ary decision tree T2 (k = 3): the root splits directly into C1, C2, C3. An input X encoded in T2 is compared with the three codewords C1, C2, and C3.
T1 and T2: which one is better? Is either optimal?
It is hard for users to determine which is better, because they usually have no idea about the proper value of k; thus T1 and T2 are not optimal.
Compression performance depends on coding quality and bit rate: the coding quality should be as high as possible, and the bit rate as low as possible.
Greedy decision tree (k = 2)
The growing method selects the node with the maximum value of λ to split during the design of the decision tree:
λ = ΔD / ΔR, where ΔD is the decrease in distortion and ΔR is the increase in bit rate.
Splitting node C1 into children C2 and C3:
Distortion(C1) > Distortion(C2) + Distortion(C3)
Bit rate(C1) < Bit rate(C2) + Bit rate(C3)
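A sketch of how this criterion might be computed: a candidate k-way split of a leaf is scored by λ = ΔD/ΔR, and the grower repeatedly splits the leaf with the largest score. The child clusters come from a small k-means pass, and the bit-rate model (log2 k extra bits per sample in the leaf) is my assumption, not from the slides:

```python
import numpy as np

def distortion(samples):
    """Sum of squared distances from the samples to their centroid."""
    c = samples.mean(axis=0)
    return float(((samples - c) ** 2).sum())

def split_gain(samples, k=2, iters=50, seed=0):
    """lambda = (distortion drop) / (bit-rate rise) for a k-way split of a leaf."""
    rng = np.random.default_rng(seed)
    centers = samples[rng.choice(len(samples), k, replace=False)]
    for _ in range(iters):  # tiny k-means to form the child clusters
        labels = np.argmin(np.linalg.norm(samples[:, None] - centers[None], axis=2), axis=1)
        centers = np.array([samples[labels == j].mean(axis=0) if np.any(labels == j)
                            else centers[j] for j in range(k)])
    d_children = sum(distortion(samples[labels == j])
                     for j in range(k) if np.any(labels == j))
    delta_d = distortion(samples) - d_children   # > 0: children fit the data better
    delta_r = len(samples) * np.log2(k)          # assumed extra bits spent on this split
    return delta_d / delta_r

# The greedy grower would evaluate split_gain for every leaf and split the argmax.
```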
Example: two candidate splits of tree T, at x1 and at x2, yield trees T1 and T2. In the distortion versus bit-rate plane, T2 gains more distortion reduction ΔD per added bit rate ΔR than T1, so T2 is better than T1.
Problem 1: the greedy decision tree is a fixed-branch decision tree, so it is still not an optimal decision tree. Which branch number is better?
Solution for Problem 1
A variable-branch decision tree is proposed to replace the fixed-branch decision tree. How do we determine the proper number of branches of a node? The nearest-neighbor algorithm plus a genetic clustering algorithm (NN + GA) automatically searches for the proper number of branches of node X.
Step 1: reduce the training data set of node X from size n to size m (n >> m) using the nearest-neighbor algorithm.
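The slides do not spell out the reduction procedure; one plausible reading is a greedy condensation in which a sample is kept only when no already-kept sample lies within a radius eps of it. A sketch under that assumption (eps is a tuning knob I introduce):

```python
import numpy as np

def condense(data, eps):
    """Greedy reduction: keep a sample only if every kept sample is farther than eps."""
    kept = [data[0]]
    for x in data[1:]:
        if min(np.linalg.norm(x - y) for y in kept) > eps:
            kept.append(x)
    return np.array(kept)   # size m << n for a suitable eps
```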
Genetic algorithm (flow):
1. Initialize: define the bit string and set the population size.
2. Reproduction: calculate the fitness of each bit string.
3. Crossover: interchange partial solutions among the bit strings.
4. Mutation: change bits in the bit strings.
5. If the end condition is not met, return to step 2; otherwise output the result.
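A generic skeleton of this loop, assuming roulette-wheel reproduction, one-point crossover, and bit-flip mutation; all rates and sizes are illustrative, and the deck's own operators (shown on the following slides) can be swapped in:

```python
import numpy as np

def genetic_algorithm(fitness, m, pop_size=20, generations=50,
                      crossover_rate=0.8, mutation_rate=0.05, seed=0):
    """Generic GA over length-m bit strings; `fitness` maps a bit string to a score."""
    rng = np.random.default_rng(seed)
    pop = rng.integers(0, 2, size=(pop_size, m))            # Initialize
    for _ in range(generations):                            # End? loop
        scores = np.array([fitness(r) for r in pop])        # Evaluate fitness
        probs = scores - scores.min() + 1e-9                # shift to positive weights
        probs /= probs.sum()
        idx = rng.choice(pop_size, size=pop_size, p=probs)  # Reproduction (roulette)
        pop = pop[idx]
        for i in range(0, pop_size - 1, 2):                 # Crossover (one-point)
            if rng.random() < crossover_rate:
                cut = rng.integers(1, m)
                pop[i, cut:], pop[i + 1, cut:] = (pop[i + 1, cut:].copy(),
                                                  pop[i, cut:].copy())
        flip = rng.random(pop.shape) < mutation_rate        # Mutation (bit flips)
        pop[flip] ^= 1
    return max(pop, key=fitness)                            # Output the best string
```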
Initialize: each chromosome R is a bit string of length m, one bit per sample of the reduced data set; the population holds several such strings (e.g., 01001..., 10110..., 00110...).
Example (m = 8): R = 01001100 over bits b1..b8. The three set bits, b2, b5, and b6, are the initial seeds and generate three clusters X1, X2, X3 of node X.
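How a chromosome maps to a clustering, as this slide describes it: the set bits select seed samples and every sample joins its nearest seed. A minimal sketch (the function name is mine):

```python
import numpy as np

def decode_chromosome(R, data):
    """Set bits of R pick seed samples; every sample joins its nearest seed."""
    seeds = data[np.flatnonzero(R)]   # e.g. R = 01001100 -> seeds b2, b5, b6
    assert len(seeds) > 0, "a chromosome must set at least one bit"
    labels = np.argmin(np.linalg.norm(data[:, None] - seeds[None], axis=2), axis=1)
    return seeds, labels              # len(seeds) clusters of node X
```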
Reproduction (1)
Fitness(R) = Σ_{xi ∈ data set} [ w · Dinter(xi) − Dintra(xi) ]
Dinter(xi) denotes the minimal distance between the sample xi and its nearest other cluster; Dintra(xi) denotes the distance between xi and its own cluster center.
If w is large, the fitness is dominated by Dinter, favoring widely separated clusters; if w is small, the fitness is dominated by Dintra, favoring compact clusters.
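A sketch of this fitness, with two assumptions of mine: cluster centers are taken to be the seed samples themselves, and Dinter(xi) is approximated by the distance from xi to its second-nearest seed:

```python
import numpy as np

def fitness_separation(R, data, w=1.0):
    """Fitness(R) = sum_i [ w * Dinter(x_i) - Dintra(x_i) ] over the data set."""
    seeds = data[np.flatnonzero(R)]
    if len(seeds) < 2:
        return -np.inf                                    # need at least two clusters
    dists = np.linalg.norm(data[:, None] - seeds[None], axis=2)
    order = np.argsort(dists, axis=1)
    rows = np.arange(len(data))
    d_intra = dists[rows, order[:, 0]]   # distance to own (nearest) center
    d_inter = dists[rows, order[:, 1]]   # distance to the nearest other cluster
    return float((w * d_inter - d_intra).sum())
```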
Example (on samples x1..x7 of node X):
R1 = 10001100 has three set bits, so X is divided into three clusters X1, X2, X3.
R2 = 00100100 has two set bits, so X is divided into two clusters X1, X2.
Is a good clustering result the same as a good decision tree for compression? Not necessarily.
Reproduction (2)
Fitness(R) = λ = ΔD / ΔR
The tree T (before splitting X) is compared with the tree Tx (after splitting X into X1, X2, X3 according to R) in the distortion versus bit-rate plane: ΔD is the distortion decrease and ΔR is the bit-rate increase.
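A sketch of this coding-oriented fitness: split node X into the clusters selected by R and score the split by λ = ΔD/ΔR. Child centers are cluster means, and modeling the bit-rate increase as log2(k) bits per sample is my assumption:

```python
import numpy as np

def fitness_coding(R, data):
    """Fitness(R) = lambda = (distortion drop) / (bit-rate rise) when X splits per R."""
    seeds = data[np.flatnonzero(R)]
    k = len(seeds)
    if k < 2:
        return -np.inf
    labels = np.argmin(np.linalg.norm(data[:, None] - seeds[None], axis=2), axis=1)
    d_parent = float(((data - data.mean(axis=0)) ** 2).sum())
    d_children = sum(
        float(((data[labels == j] - data[labels == j].mean(axis=0)) ** 2).sum())
        for j in range(k) if np.any(labels == j))
    delta_d = d_parent - d_children     # distortion decrease from T to Tx
    delta_r = len(data) * np.log2(k)    # assumed bit-rate increase
    return delta_d / delta_r
```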
Example:
R1 = 01001100 (3 branches)
R2 = 00100100 (2 branches)
R3 = 11100100 (4 branches)
R4 = 00001111 (4 branches)
R5 = 01001000 (2 branches)
R6 = 01010100 (3 branches)
Fitness(R5) > Fitness(R1) > Fitness(R6) > Fitness(R2) > Fitness(R4) > Fitness(R3)
Selection probability is proportional to fitness (roulette wheel):
Prob(R5) > Prob(R1) > Prob(R6) > Prob(R2) > Prob(R4) > Prob(R3)
Example: R5, R1, R5, R6, R5, R4 are selected to be the next population.
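A standard roulette-wheel selection sketch (the shift to positive weights is my choice, needed because the fitness above can be negative):

```python
import numpy as np

def roulette_select(population, fitnesses, rng):
    """Select the next population with probability proportional to fitness."""
    f = np.asarray(fitnesses, dtype=float)
    f = f - f.min() + 1e-9                 # shift so every weight is positive
    idx = rng.choice(len(population), size=len(population), p=f / f.sum())
    return [population[i] for i in idx]
```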
Crossover: two chromosomes exchange the bit at one chosen position.
R1 = 00100100, R2 = 11100100  →  R1' = 01100100, R2' = 10100100 (bit b2 is exchanged)
Mutation: one randomly chosen bit is flipped.
R1 = 00100100  →  R1' = 00000100 (bit b3 is flipped)
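The two operators as the slides draw them, a single-position bit exchange and a single-bit flip, in a short sketch:

```python
import numpy as np

def crossover(r1, r2, rng):
    """Exchange the bit at one random position between two chromosomes."""
    i = rng.integers(len(r1))
    r1, r2 = r1.copy(), r2.copy()
    r1[i], r2[i] = r2[i], r1[i]
    return r1, r2

def mutate(r, rng):
    """Flip one randomly chosen bit."""
    r = r.copy()
    r[rng.integers(len(r))] ^= 1
    return r
```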
Problem 2: the encoding codeword is not the closest codeword to the input. Input O is encoded by C2 in the decision tree; however, O is closer to C5.
Solution for Problem 2
The cluster center alone is not a proper basis for classifying the input vector in the decision tree.
Example (1): a large cluster. A sample O belonging to the large cluster C1 can be closer to the center of the neighboring cluster C2.
Example (2): non-spherical clusters. A sample o belonging to cluster c1 can be far from c1's center.
The danger region among the clusters: the region near cluster boundaries, where inputs are easily routed to the wrong cluster.
Classification points are defined to classify the input vectors in the decision tree. Example: classification points P1 and P2 are placed between clusters C1 and C2, and the input O is classified by comparing it with P1 and P2 rather than with the cluster centers alone.
How do we find the classification points in a cluster? Example: cluster X splits into children X1..X4 with centers C1..C4 and classification points p1..p6; an input O is compared with p1, p2, p3, p4, p5, and p6.
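A sketch of routing with classification points, on the assumption that an input is sent to the cluster owning the classification point nearest to it (the procedure for finding the points themselves is the paper's):

```python
import numpy as np

def route(x, children_points):
    """Send x to the child whose classification point lies nearest to x.

    `children_points` maps a child id to its list of classification points,
    e.g. {1: [p1, p6], 2: [p2, p3], 3: [p4], 4: [p5]} (a made-up layout).
    """
    best_child, best_dist = None, np.inf
    for child, pts in children_points.items():
        d = min(np.linalg.norm(x - p) for p in pts)
        if d < best_dist:
            best_child, best_dist = child, d
    return best_child
```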
Conclusions
- The variable-branch decision tree can also be applied to recognition applications.
- The traditional NCUT tree can be improved by the genetic algorithm.
- An adaptive variable-branch decision tree can be proposed in the future.
Thank you.