CHAPTER 6

HYBRID AI BASED IMAGE CLASSIFICATION TECHNIQUES

6.1 INTRODUCTION

The exploration of ANN applications for image classification has yielded satisfactory results, but there is always scope for improving the performance of these techniques. One of the several ways of improving classification performance is the use of an appropriate feature set. Based on this fact, a feature selection stage is inserted between feature extraction and image classification in the automated system. Feature selection techniques retain the significant features and eliminate the insignificant ones. Evolutionary algorithms such as Genetic Algorithms (GA) and Particle Swarm Optimization (PSO) are widely used for feature selection. Using these techniques together with ANN yields a new category of methods called hybrid approaches. In this work, four such hybrid approaches are used: GA+BPN, GA+SOM, PSO+BPN and PSO+SOM. The BPN and SOM are selected because they are prime representatives of supervised and unsupervised neural networks, respectively. Additionally, two other combinations of AI approaches, fuzzy classifier+GA and fuzzy classifier+PSO, are also tested. Thus, several possible combinations of ANN, fuzzy theory and optimization techniques are used in this work to enhance the performance of the automated image classification system.

6.2 BLOCK DIAGRAM OF THE PROPOSED SYSTEM

The block diagram of the proposed system is shown in Figure 6.1.

[Figure 6.1: Block diagram of the proposed system. Retinal image database, image pre-processing, feature extraction, feature selection (GA or PSO), classification with BPN, SOM or fuzzy classifiers, and comparative analysis.]

The first three blocks of Figure 6.1 are discussed in detail in sections 3.3 and 3.4. The feature selection process is included in this work to distinguish the significant features from the non-significant ones, since using all the features does not guarantee high accuracy. Two prime optimization algorithms, GA and PSO, are employed. The innovative aspect of this work is the use of these optimization algorithms in conjunction with the neural classifiers and the fuzzy classifier. Two neural classifiers and one fuzzy classifier are employed with the two optimization algorithms. Thus, six hybrid AI approaches are implemented for the pattern recognition application, with the objective of enhancing the performance of the automated classification system.

6.3 FEATURE SELECTION FOR IMAGE CLASSIFICATION SYSTEM

The different input images are usually classified according to a set of measured features. The features extracted from the retinal image database form the initial, high-dimensional feature space.

Not all features contribute to a high figure of merit, and even features providing useful information may reduce classification accuracy when the number of training points is limited. Moreover, the presence of insignificant features indirectly increases the computational complexity and the training time of the classifiers. Hence, the removal of insignificant features is essential to ensure high classification accuracy within a short time, and this is exactly what the feature selection process performs. The objective of feature selection is twofold: (a) enhancing the classification accuracy and (b) reducing the training time of the classifiers. The criterion of these feature selection algorithms is to choose an optimal subset of the original features that still contains the information essential for the classification task. In this work, GA and PSO are proposed for feature selection. The optimal features selected by these algorithms are used to train the different neural and fuzzy classifiers, which ultimately improves the performance measures of the classifiers. A detailed explanation of these algorithms is given in the subsequent sections.

6.4 HYBRID AI TECHNIQUES WITH GA FOR IMAGE CLASSIFICATION

In this work, two neural classifiers and a single fuzzy classifier are used in conjunction with GA for retinal image classification.

6.4.1 Hybrid GA optimized BPN based Image Classification

The features extracted from the raw images are given as input to the Genetic Algorithm. Only the optimal features are given as output, and these are then fed to the BPN for retinal image classification.

6.4.1.1 Genetic Algorithm for feature selection

The Genetic Algorithm can be viewed as a general-purpose search or optimization method based on biological evolution, and it is a widely preferred optimization algorithm for many engineering applications.

Since its operations are based on the natural theory of evolution, GA falls under the category of evolutionary algorithms. GA maintains a set of candidate solutions, called a population, and repeatedly modifies them through several mathematical operations. At each step, individuals from the current population are selected as parents, and GA uses them to produce the children of the next generation. Candidate solutions are represented as fixed-length bit strings called chromosomes, and each bit in a string is called a gene. A random initial population is generated, and a fitness function is used to reflect the goodness of each member of the population. The fitness function may be a maximization or a minimization objective function. The computational flowchart of GA is shown in Figure 6.2.

[Figure 6.2: Flow diagram of the genetic algorithm. Initialize a population of 16-bit chromosomes; calculate the fitness of each chromosome; apply the genetic operators to form a new population; replace the old population with the new one; repeat until the maximum number of generations is reached; finally, estimate the optimal feature set from the fittest chromosome.]

In this work, each candidate solution is represented by a chromosome (a string of bits) with 16 genes corresponding to the number of features.

The positions of the features within each string are, in order: mean, standard deviation, circularity, area, skewness, kurtosis, energy, entropy, contrast, inverse difference moment, correlation, variance, perimeter, cluster shade, cluster prominence and homogeneity. The order of positions is chosen randomly; its effect on accuracy is negligible, since equal importance is given to all features. An initial random population of 100 chromosomes is formed to initiate the genetic optimization, and a fitness value is estimated for each individual. The fittest individuals are selected, and the crossover and mutation operations are performed on them to generate the new population. This process is continued for a specified number of generations, and finally the fittest chromosome is identified using the fitness function. Features with a bit value of 1 are accepted, and features with a bit value of 0 are rejected. The fitness function used in this work is given by

Fitness = α·γ + β·(c − r)/c    (6.1)

where γ is the classification accuracy, c is the total number of features, r is the number of selected features (the number of 1s in the chromosome), α ∈ [0, 1] and β = 1 − α. The classification accuracy for each chromosome is determined by training the SOM classifier with the features corresponding to the 1 positions in that chromosome: the training error (in %) is determined, and the accuracy is obtained by subtracting it from 100%. The goodness of each candidate feature subset is evaluated by this fitness function, and the criterion is to maximize the fitness value.
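As a concrete illustration, the fitness evaluation of Eq. (6.1) can be sketched in Python as follows. This is a minimal sketch, assuming the weights α = 0.8 and β = 0.2 fixed later in section 6.6.1; train_and_score is a hypothetical callback that trains the classifier on the selected feature columns and returns the accuracy γ in [0, 1].

```python
import numpy as np

ALPHA, BETA = 0.8, 0.2  # weights used in this work (beta = 1 - alpha)

def fitness(chromosome, train_and_score):
    """Fitness of Eq. (6.1): alpha*gamma + beta*(c - r)/c.

    chromosome      : binary array of length c (1 = feature selected)
    train_and_score : hypothetical callback; trains the classifier on the
                      selected feature columns and returns accuracy gamma
    """
    c = len(chromosome)           # total number of features (16 here)
    r = int(np.sum(chromosome))   # number of selected features (1s)
    if r == 0:
        return 0.0                # an empty feature subset is useless
    gamma = train_and_score(np.flatnonzero(chromosome))
    return ALPHA * gamma + BETA * (c - r) / c
```

Since this value is recomputed for every chromosome in every generation, the classifier training hidden inside train_and_score dominates the cost of the feature selection stage.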

The final optimal chromosome is determined, and all the features with a bit value of 1 are stored as the optimal feature set. After feature selection, these optimal features are supplied to the BPN for the classification process. The BPN network discussed in section 5.3.1 is used for the implementation. The only difference is that the number of neurons in the input layer of the conventional BPN classifier is 16, whereas the number of input layer neurons of the GA optimized BPN is fewer than 16. Thus, only the significant features are used for image classification.

6.4.2 Hybrid GA optimized SOM based Image Classification

The optimal features extracted by GA, as discussed in section 6.4.1, are used as input to the SOM classifier. The size of the SOM architecture is slightly different from that of the network discussed in section 5.3.3. Since the number of input features for the GA optimized SOM is smaller, the number of neurons in the input layer is also considerably reduced compared with the conventional SOM classifier. Thus, the SOM network is trained with only the optimal features, and hence an improvement in the performance measures of the GA optimized SOM classifier over the conventional SOM classifier is expected.

6.4.3 Hybrid GA optimized fuzzy classifier based Image Classification

After the implementation of the neural classifiers, a different category of AI technique, the fuzzy classifier, is tested along with GA for pattern recognition. The optimal features obtained from GA are given as input to the fuzzy nearest neighbor classifier discussed in section 5.4.1.

6.5 HYBRID AI TECHNIQUES WITH PSO FOR IMAGE CLASSIFICATION

In this work, two neural classifiers and a single fuzzy classifier are used in conjunction with PSO for retinal image classification.

6.5.1 Hybrid PSO optimized BPN based Image Classification

The features extracted from the raw images are given as input to the PSO algorithm. Only the optimal features are given as output, and these are then fed to the BPN for retinal image classification.

6.5.1.1 Particle Swarm Optimization

PSO is one of the swarm intelligence methods used for solving optimization problems. It is a population-based search algorithm in which each individual is referred to as a particle and represents a candidate solution. Each candidate solution is viewed as an individual bird of the flock, i.e., a particle in the search space. Each particle makes use of its individual memory and knowledge to find the best solution. Every particle has its own fitness value, evaluated by the fitness function, and a velocity that directs its movement. The particles move through the problem space by following the current optimum particles. The initial swarm is generally created in such a way that the particles are distributed randomly over the search space. At every iteration, each particle is updated by following two best values, called pbest and gbest. Each particle keeps track of the coordinates in the problem space associated with the best solution (fitness value) it has achieved so far; this value is called pbest. When a particle takes the whole population as its topological neighbors, the best value over the population is the global best, called gbest. The detailed algorithm is given below:

Step 1: The constants q_max, c_1, c_2, r_1, r_2 and w are fixed for the process. The particle positions x_0(i) for i = 1, 2, ..., p are randomly initialized, as are the particle velocities v_0(i) for i = 1, 2, ..., p.

Step 2: The process is initiated with q = 1.

Step 3: The function value f_q is evaluated using the design space coordinates x_q(i), and the best values are updated (for the maximization criterion used here):

If f_q ≥ f_pbest, then pbest(i) = x_q(i)    (6.5)

If f_q ≥ f_gbest, then gbest = x_q(i)    (6.6)

Step 4: The particle velocity is adjusted using

v_{q+1}(i) = w·v_q(i) + c_1 r_1·(pbest_q(i) − x_q(i)) + c_2 r_2·(gbest_q − x_q(i))    (6.7)

and the particle position vector is adjusted using

x_{q+1}(i) = x_q(i) + v_{q+1}(i)    (6.8)

Step 5: The value of i is incremented by 1. If i > p, then q is incremented and i is reset to 1.

Step 6: Steps 3 to 5 are repeated until q_max is reached.

Here q_max is the maximum iteration number, w is the inertia weight factor, c_1 and c_2 are the cognitive and social acceleration factors, and r_1 and r_2 are random numbers in the range (0, 1).

In this work, each candidate solution is represented by a particle (a string of bits) with 16 bits corresponding to the number of features. An initial random population of 100 particles is formed to initiate the optimization, and the initial coding of each particle is randomly generated. A fitness value is estimated for each individual. This process continues for a specified number of iterations, and finally the fittest particle is identified using the fitness function. Features with a bit value of 1 are accepted, and features with a bit value of 0 are rejected. The fitness function used in this work is given by

Fitness = α·γ + β·(c − r)/c    (6.9)

where γ is the classification accuracy, c is the total number of features, r is the number of selected features (the number of 1s in the particle), α ∈ [0, 1] and β = 1 − α.
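A compact sketch of this loop, assuming a Python/NumPy environment, is given below. The inertia and acceleration values are illustrative defaults rather than values specified in this work, and the objective f stands in for the fitness of Eq. (6.9) evaluated on a (decoded) particle position.

```python
import numpy as np

rng = np.random.default_rng(0)

def pso_maximize(f, dim, p=100, q_max=520, w=0.7, c1=2.0, c2=2.0):
    """Basic PSO following Eqs. (6.5)-(6.8); the objective f is maximized."""
    x = rng.uniform(0.0, 1.0, (p, dim))    # particle positions x_0(i)
    v = rng.uniform(-0.1, 0.1, (p, dim))   # particle velocities v_0(i)
    pbest = x.copy()
    f_pbest = np.array([f(xi) for xi in x])
    g = int(np.argmax(f_pbest))
    gbest, f_gbest = pbest[g].copy(), f_pbest[g]
    for q in range(q_max):
        r1, r2 = rng.random((p, 1)), rng.random((p, 1))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)  # Eq. (6.7)
        x = x + v                                                  # Eq. (6.8)
        f_q = np.array([f(xi) for xi in x])
        better = f_q >= f_pbest               # Eq. (6.5)
        pbest[better], f_pbest[better] = x[better], f_q[better]
        g = int(np.argmax(f_pbest))           # Eq. (6.6)
        if f_pbest[g] > f_gbest:
            gbest, f_gbest = pbest[g].copy(), f_pbest[g]
    return gbest, f_gbest
```

The whole swarm is updated vectorially in each pass, which is equivalent to sweeping i = 1, 2, ..., p through Steps 3 to 5.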

The goodness of each position is evaluated by this fitness function, and the criterion is to maximize the fitness value. An optimal solution is obtained at the end of the maximum number of iterations. This value is binary coded with sixteen bits: a bit value of 1 represents a selected feature, whereas a bit value of 0 represents a rejected feature. Thus, an optimal set of features is selected by the PSO technique.

Though the size of the input feature vector is not very large, feature selection is still necessary, because the presence of even a single insignificant feature may reduce the classification accuracy. Also, for retinal image applications the time factor is extremely important, and hence it is advisable to use a feature set that is as small as possible. Feature selection removes the insignificant features and reduces the training time of the classifiers irrespective of the size of the input feature set.

This set of optimal features is then given as input to the BPN network for the classification process. The number of input neurons used in this architecture differs from the number used in the conventional BPN and the GA optimized BPN. The other details, such as the training algorithm and the parameter settings, remain the same as those of the conventional BPN and the GA optimized BPN.

6.5.2 Hybrid PSO optimized SOM based Image Classification

The optimal features extracted by the PSO algorithm are used for training the SOM classifier, unlike the conventional SOM classifier, where the entire feature set is used for the classification process. Hence, the number of input layer neurons used in the PSO optimized SOM differs from that of the conventional SOM classifier. The PSO algorithm is dealt with in detail in the previous section, and the architectural and algorithmic concepts of SOM are covered in section 5.3.3.

6.5.3 Hybrid PSO optimized fuzzy classifier based Image Classification

In this work, the PSO optimized features are used to train the fuzzy nearest neighbor classifier discussed in section 5.4.1. Since the features selected by GA differ from those selected by the PSO algorithm, a performance difference of the fuzzy classifier under GA and PSO is possible, and this can be verified through this experiment. Thus, six different hybrid approaches are implemented in this work with the objective of achieving a performance enhancement over the conventional neural classifier based automated image classification system.

6.6 IMPLEMENTATION

The experiments with these hybrid classifiers are carried out using the MATLAB software. The procedural flow and the various parameters used for the implementation are discussed in this section, with emphasis on the practical execution of the feature selection algorithms. The procedural flow of the classifiers themselves has already been discussed in section 5.4.

6.6.1 Implementation of the GA optimized AI techniques

The step-by-step procedure for the implementation of the GA based techniques is as follows:

1) The complete feature set extracted from the raw images is given as input to the Genetic Algorithm.

2) An initial population of 100 chromosomes is used; each chromosome is a candidate solution, made up of 16 genes (bits) corresponding to the input features.

3) The fitness value of each chromosome is estimated using the fitness function, which is a maximization function.

4) The 30 least fit chromosomes are removed from the population, and 30 new offspring are generated using the crossover and mutation operations.

5) The crossover operation is performed between two parent chromosomes, and the mutation operation is performed within a single chromosome. In both cases, swapping of bits is performed to generate the new offspring.

6) A newly generated offspring may be one of the already existing chromosomes or a new chromosome, since 2^16 chromosome combinations are possible.

7) This process is repeated for 1000 iterations, and the chromosome that survives for the maximum number of iterations is selected as the optimal chromosome.

8) The features corresponding to bit positions with the value 1 are selected as the optimal feature set.

9) This optimal feature set is then given as input to the two neural classifiers and the fuzzy classifier.

10) The neural classifiers are trained and tested using this feature set to estimate their performance measures.

11) In the case of the fuzzy classifier, the FCM algorithm is applied to these optimal features, and the output centroid values of the FCM algorithm are recorded.

12) The distances between the centroid value of an unknown testing input and the stored centroid values of the categories are then determined, and the input is assigned to the class for which the distance is minimum (a sketch of this step follows the list).
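Steps 11 and 12 amount to fuzzy c-means clustering followed by minimum-distance classification. The sketch below is an illustration of that idea, not the thesis implementation: it assumes Euclidean distances, fuzzifier m = 2 and random membership initialization, uses the error tolerance (0.01) and cluster count (3) quoted in the following paragraph, and treats the feature vector of a test image as its representative when comparing against the stored class centroids.

```python
import numpy as np

def fcm_centroids(X, n_clusters=3, m=2.0, tol=0.01, max_iter=1000, seed=0):
    """Fuzzy c-means: return the cluster centroids of data X, shape (n, d)."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), n_clusters))
    U /= U.sum(axis=1, keepdims=True)             # memberships sum to 1
    for _ in range(max_iter):
        Um = U ** m
        V = (Um.T @ X) / Um.sum(axis=0)[:, None]  # centroid update
        D = np.linalg.norm(X[:, None, :] - V[None, :, :], axis=2) + 1e-12
        U_new = 1.0 / ((D[:, :, None] / D[:, None, :]) ** (2 / (m - 1))).sum(axis=2)
        if np.abs(U_new - U).max() < tol:         # error tolerance (0.01 here)
            return V
        U = U_new
    return V

def classify(x, class_centroids):
    """Step 12: assign x to the class whose stored centroid lies nearest."""
    return min(class_centroids,
               key=lambda k: np.linalg.norm(class_centroids[k] - x, axis=1).min())
```

Here class_centroids would be a dictionary mapping each category (CNVM, CRVO, CSR, NPDR) to the centroid matrix returned by fcm_centroids on that category's training features.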

The various parameters involved in this algorithm must be properly fixed to enhance the success rate of the subsequent steps. The size of the complete feature set is 16, and an initial population size of 100 is used. Though 2^16 combinations are available, the size of the initial population is limited to avoid computational complexity; an unused combination may still be generated as an offspring in the subsequent generations. The number of genes in each chromosome is 16, corresponding to the number of features, and each bit position is associated with a fixed feature, which is essential for estimating the optimal feature set at the end of the iterations. A binary representation is adopted for each chromosome. The fitness function used in this work is a maximization objective, with higher fitness values indicating better chromosomes. The classification accuracy term in the fitness function is estimated for each chromosome by training the classifier with the features holding the value 1 in that chromosome. The value of α used in this work is 0.8 and the value of β is 0.2; a higher value of α is used because more importance has to be given to the classification accuracy. The number of rejected chromosomes per iteration is 30, which is roughly one-third of the total population size; thus the survival probability of each chromosome is 0.7, which is sufficiently fair for the entire population. The crossover rate used for generating the offspring is 0.7, the mutation rate is 0.3, and the number of iterations is 1000. The optimal features are then used for training the different classifiers. The number of input neurons in the neural classifiers is set by the number of optimal features; the other parameters are the same as those used in the implementation of the conventional neural classifiers, discussed in section 5.4. The error tolerance value (ε) used for FCM clustering is 0.01, the number of clusters is 3, and the average number of iterations required for convergence is 700. These parameter settings are used in the implementation of the hybrid techniques with the objective of enhancing the performance of the classifiers.
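Putting these settings together, the generation loop of steps 2 to 8 could be sketched as follows. This is a hedged sketch under stated assumptions: single-point crossover and a single-bit flip are one plausible reading of the bit-swapping operators, the final selection simply takes the fittest member of the last generation (a simplification of step 7), and fitness is the function of Eq. (6.1).

```python
import numpy as np

rng = np.random.default_rng(1)
POP, GENES, KILL, GENS = 100, 16, 30, 1000   # parameters stated above
P_CROSS, P_MUT = 0.7, 0.3                    # crossover and mutation rates

def evolve(fitness):
    """GA loop: drop the 30 least fit, breed 30 offspring, repeat."""
    pop = rng.integers(0, 2, (POP, GENES))   # random initial population
    for _ in range(GENS):
        scores = np.array([fitness(ch) for ch in pop])
        survivors = pop[np.argsort(scores)[KILL:]]   # keep the 70 fittest
        children = []
        while len(children) < KILL:
            a, b = survivors[rng.integers(len(survivors), size=2)]
            if rng.random() < P_CROSS:       # single-point crossover
                cut = int(rng.integers(1, GENES))
                a = np.concatenate([a[:cut], b[cut:]])
            if rng.random() < P_MUT:         # flip one random gene
                a = a.copy()
                a[rng.integers(GENES)] ^= 1
            children.append(a)
        pop = np.vstack([survivors, children])
    scores = np.array([fitness(ch) for ch in pop])
    return pop[int(np.argmax(scores))]       # fittest chromosome
```

With fitness bound to a concrete classifier, evolve(fitness) returns a 16-bit chromosome whose 1 positions identify the optimal feature set.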

6.6.2 Implementation of the PSO optimized AI techniques

The step-by-step procedure for the implementation of the PSO based techniques is as follows:

1) The various parameters are initialized, and the population is formed using the same procedure as in the GA technique.

2) The particle position and the particle velocity are the two important quantities of the PSO algorithm; the candidate solution is represented by the particle position.

3) The particle positions and velocities are updated iteratively, and the values of pbest and gbest are noted at each iteration based on the fitness function.

4) The binary representation of the population is converted to decimal form for the position updates, and finally the decimal values of the particle positions are converted back into binary values (a sketch of one possible mapping follows below).

5) The optimal particle position (gbest) is determined at the end of the maximum number of iterations.

6) The optimal features are then identified from the bit values (one or zero) in the gbest particle position.

7) These features are used to train the three AI classifiers, and the performance measures are estimated.

The size of the initial population used in this work is 100, and each member of the population is represented by 16 bits. The average number of iterations used in this work is 520, which is much smaller than the number required for GA; convergence of the particle positions is achieved with fewer iterations than GA requires. The parameter values used for the classifiers are the same as those used in the implementation of the GA. The only difference is that the number of optimal features yielded by GA differs from that of PSO, which ultimately changes the number of input layer neurons of the neural classifiers.
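The exact binary-to-decimal mapping of step 4 is not specified in this work. One plausible realization, shown here purely as an assumption, is to let PSO update continuous positions and threshold each component to recover the 16-bit feature mask:

```python
import numpy as np

def to_mask(position, threshold=0.5):
    """Map a continuous particle position in [0, 1]^16 to a binary
    feature-selection mask; the 0.5 threshold is a hypothetical choice."""
    return (np.asarray(position) > threshold).astype(int)

def to_decimal(mask):
    """Pack a 16-bit mask into its decimal value, e.g. for bookkeeping."""
    return int("".join(map(str, mask)), 2)

mask = to_mask([0.9, 0.2, 0.7, 0.1] * 4)  # toy 16-component position
print(mask, to_decimal(mask))
```

An alternative found in the literature is the binary PSO of Kennedy and Eberhart, in which a sigmoid of each velocity component gives the probability of the corresponding bit being 1.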

6.7 EXPERIMENTAL RESULTS AND DISCUSSIONS

The experiments are performed using the MATLAB software on a Pentium processor with 1.66 GHz processing speed and 1 GB RAM. The results of the feature selection are discussed first, followed by an extensive analysis of the hybrid AI classifiers.

6.7.1 Results of feature selection with GA and PSO

The features selected by the optimization algorithms, and other issues related to these feature selection techniques, are shown in Table 6.1.

Table 6.1 Analysis of the feature selection algorithms

Technique | Optimal features | No. of features | Average no. of iterations
GA | Standard deviation, circularity, area, skewness, kurtosis, energy, entropy, contrast, inverse difference moment, correlation, cluster shade and perimeter | 12 | 1000
PSO | Area, skewness, energy, entropy, contrast, inverse difference moment, correlation and cluster shade | 8 | 520

The characteristics of the GA and PSO algorithms are clearly displayed in Table 6.1. The optimal features selected by GA differ from those selected by the PSO algorithm. GA has selected 8 textural features and 4 features obtained from the segmented anatomical structures of the input image, whereas PSO has selected 6 textural features and 2 structure based features. Some features are rejected by both techniques, and some are accepted by both. The primitive statistical feature mean is rejected by both techniques, since it is based only on the intensity of the input image, which is not sufficient for classification. It is also observed that features yielding similar values for different categories are rejected by the optimization algorithms.

Some features, such as entropy and correlation, are preferred by both techniques, since they represent the amount of textural information, which is highly essential for image classification. Features such as standard deviation and kurtosis are accepted by GA but rejected by PSO. The justification for these rejections is verified by the experimental results of the classifiers.

The number of features selected by PSO is smaller than that of GA, which is a strong indication of the superior nature of the PSO algorithm. Since the number of input layer neurons depends on the number of input features, a smaller number of input features is always preferable to reduce the computational complexity of the automated system. In the case of the fuzzy classifier, the number of mathematical operations is reduced because of the reduction in the number of input features.

Another criterion for comparing the two algorithms is the convergence rate. The convergence time depends on the number of iterations, and hence fewer iterations are desirable for an efficient algorithm. The number of iterations required for PSO is almost half of that required for GA, and the results remain unchanged even if the number of iterations of the PSO algorithm is increased further. The convergence time is significantly higher for GA, since the entire process depends purely on the number of iterations, unlike PSO, where a standard convergence condition is available. Even though the number of random parameters and mathematical operations is higher for PSO, its time requirement is low.

On the other hand, the number of parameters to be initialized for the PSO algorithm is higher than for GA. Parameters such as the inertia weight factor, the cognitive and social acceleration factors and the particle velocities have to be initialized in PSO, whereas the crossover rate and the mutation rate are the only parameters that need attention in GA. The fitness function parameters are common to both algorithms. These parameters have to be properly fixed to ensure the success of the subsequent techniques. In this respect, GA has an edge over PSO, but the exact advantages can be verified only through the classification accuracy results of the classifiers.

The number of computational operations is also higher for PSO than for GA. The increase in the mathematical operations of PSO is mainly due to the parameter adjustment equations, which are not present in GA; the swapping operation is the main computational operation of GA. But, since fewer iterations are used in the PSO algorithm, the effect of these operations on the convergence rate is very low. Thus, the characteristics of both algorithms have been discussed in detail. Since classification accuracy is the main objective of this work, the exact merits of these techniques can be judged mainly from the experimental results of the GA optimized and PSO optimized AI classifiers.

6.7.2 Results of GA optimized AI classifiers

The optimal features obtained from the GA are used to train and test the two neural classifiers and the fuzzy classifier. The performance measures used for the analysis are classification accuracy, sensitivity, specificity, PLR and NLR.

6.7.2.1 Results of GA optimized BPN classifier

The confusion matrix of the GA optimized BPN classifier is shown in Table 6.2.

Table 6.2 Confusion matrix of the GA optimized BPN classifier

Category | Class 1 | Class 2 | Class 3 | Class 4
CNVM | 59 | 1 | 2 | 2
CRVO | 1 | 55 | 2 | 2
CSR | 1 | 2 | 67 | 3
NPDR | 1 | 3 | 1 | 78

The misclassification rate of the GA optimized BPN classifier is lower than that of the conventional BPN classifier. The various performance measures of the classifier are estimated from the TP, TN, FP and FN values observed in Table 6.2.
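The measures reported in the following tables follow directly from the per-class TP, TN, FP and FN counts: sensitivity = TP/(TP + FN), specificity = TN/(TN + FP), PLR = sensitivity/(1 − specificity) and NLR = (1 − sensitivity)/specificity. A small script illustrating this bookkeeping on Table 6.2 (reading rows as true classes and columns as predicted classes, which is consistent with the counts of Table 6.3):

```python
import numpy as np

# Confusion matrix of Table 6.2 (rows = true class, columns = predicted class)
classes = ["CNVM", "CRVO", "CSR", "NPDR"]
M = np.array([[59, 1, 2, 2],
              [1, 55, 2, 2],
              [1, 2, 67, 3],
              [1, 3, 1, 78]])

total = M.sum()
for i, name in enumerate(classes):
    tp = M[i, i]
    fn = M[i].sum() - tp            # true class i, predicted elsewhere
    fp = M[:, i].sum() - tp         # predicted class i, actually elsewhere
    tn = total - tp - fn - fp
    sens = tp / (tp + fn)           # sensitivity (recall)
    spec = tn / (tn + fp)           # specificity
    acc = 100 * (tp + tn) / total   # per-class accuracy in %
    plr = sens / (1 - spec)         # positive likelihood ratio
    nlr = (1 - sens) / spec         # negative likelihood ratio
    print(f"{name}: sens={sens:.2f} spec={spec:.2f} "
          f"acc={acc:.0f}% PLR={plr:.0f} NLR={nlr:.2f}")
```

The same computation applies unchanged to the confusion matrices of the other five hybrid classifiers.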

The performance analysis of the GA optimized BPN classifier is shown in Table 6.3.

Table 6.3 Performance measures of the GA optimized BPN classifier

Category | TP | TN | FP | FN | Sensitivity | Specificity | Accuracy (%) | PLR | NLR
CNVM | 59 | 213 | 3 | 5 | 0.92 | 0.98 | 97 | 66 | 0.06
CRVO | 55 | 214 | 6 | 5 | 0.92 | 0.97 | 96 | 33 | 0.08
CSR | 67 | 202 | 5 | 6 | 0.91 | 0.97 | 96 | 38 | 0.08
NPDR | 78 | 190 | 7 | 5 | 0.94 | 0.96 | 95 | 26 | 0.06
Average | | | | | 0.91 | 0.97 | 96 | 41 | 0.07

The enhancement of the performance measures of the GA based BPN classifier over the conventional BPN classifier is verified by the results in Table 6.3. However, the improvement in classification accuracy is not very significant, since the conventional BPN classifier had already yielded sufficiently accurate results. On the other hand, the time taken by the GA optimized BPN classifier is higher than that of the conventional BPN classifier, since the additional GA stage consumes a significant amount of time. The reduction in the number of input layer neurons from 16 to 12 reduces this impact to some extent, but the time requirement of the GA optimized BPN classifier still exceeds that of the conventional BPN classifier.

6.7.2.2 Results of GA optimized SOM classifier

The confusion matrix of the GA optimized SOM classifier is shown in Table 6.4.

Table 6.4 Confusion matrix of the GA optimized SOM classifier

Category | Class 1 | Class 2 | Class 3 | Class 4
CNVM | 44 | 3 | 7 | 10
CRVO | 6 | 43 | 6 | 5
CSR | 5 | 9 | 51 | 8
NPDR | 5 | 7 | 11 | 60

From Table 6.4, it is evident that the GA optimized SOM classifier yields better classification rates than the conventional SOM classifier. The various performance measures calculated from the confusion matrix are shown in Table 6.5.

Table 6.5 Performance measures of the GA optimized SOM classifier

Category | TP | TN | FP | FN | Sensitivity | Specificity | Accuracy (%) | PLR | NLR
CNVM | 44 | 200 | 16 | 20 | 0.69 | 0.93 | 87 | 9.3 | 0.33
CRVO | 43 | 201 | 19 | 17 | 0.71 | 0.91 | 87 | 8.2 | 0.31
CSR | 51 | 183 | 24 | 22 | 0.70 | 0.88 | 84 | 6.0 | 0.34
NPDR | 60 | 174 | 23 | 23 | 0.73 | 0.88 | 84 | 6.2 | 0.31
Average | | | | | 0.71 | 0.90 | 86 | 7.4 | 0.32

The necessity of GA for improving the accuracy of the conventional SOM classifier is confirmed by Table 6.5: a significant increase in accuracy is achieved by the GA optimized SOM classifier. However, the time requirement of the GA optimized SOM classifier is exceedingly high. A short convergence time is the only significant merit of the conventional SOM classifier, and this advantage is cancelled by the inclusion of the GA. The probability of low quality results is high when the number of GA iterations is reduced, so a compromise between accuracy and convergence time is required for the GA optimized SOM classifier.

6.7.2.3 Results of GA optimized fuzzy classifier

The confusion matrix of the GA optimized fuzzy classifier is shown in Table 6.6.

Table 6.6 Confusion matrix of the GA optimized fuzzy classifier

Category | Class 1 | Class 2 | Class 3 | Class 4
CNVM | 38 | 5 | 8 | 13
CRVO | 8 | 37 | 9 | 6
CSR | 9 | 12 | 44 | 8
NPDR | 6 | 8 | 16 | 53

The rate of correct classification is slightly improved in comparison with the conventional fuzzy classifier, but the performance is still inferior to that of the optimized neural classifiers. The performance measures are shown in Table 6.7.

Table 6.7 Performance measures of the GA optimized fuzzy classifier

Category | TP | TN | FP | FN | Sensitivity | Specificity | Accuracy (%) | PLR | NLR
CNVM | 38 | 193 | 23 | 26 | 0.59 | 0.90 | 83 | 5.6 | 0.45
CRVO | 37 | 195 | 25 | 23 | 0.61 | 0.89 | 83 | 5.4 | 0.43
CSR | 44 | 174 | 33 | 29 | 0.60 | 0.84 | 78 | 3.8 | 0.47
NPDR | 53 | 170 | 27 | 30 | 0.64 | 0.86 | 80 | 4.7 | 0.42
Average | | | | | 0.61 | 0.87 | 81 | 4.9 | 0.44

The inferiority of the GA optimized fuzzy classifier to the GA optimized neural classifiers is evident from Table 6.7. The low quality results are due to two important factors. First, the number of randomly initialized parameters of the fuzzy classifier is higher than in the neural classifiers, where only the weights are initialized randomly. Second, the classification is performed with the centroid values obtained from the FCM algorithm; since the abnormality is widely spread throughout the image, the success rate of the clustering process is very low, which yields incorrect centroid values, and these incorrect centroids are responsible for the inaccurate results of the fuzzy classifier. Though the inclusion of GA has enhanced the performance, the results are still not sufficient for practical applications. The large time requirement is another drawback of this approach.

6.7.3 Results of PSO optimized AI classifiers

The optimal features obtained from the PSO algorithm are used to train and test the classifiers. The performance of the PSO optimized classifiers is then compared with that of the GA optimized classifiers and the conventional AI classifiers.

6.7.3.1 Results of PSO optimized BPN classifier

The confusion matrix of the PSO optimized BPN classifier is shown in Table 6.8.

Table 6.8 Confusion matrix of the PSO optimized BPN classifier

Category | Class 1 | Class 2 | Class 3 | Class 4
CNVM | 61 | 1 | 1 | 1
CRVO | 1 | 56 | 1 | 2
CSR | 1 | 2 | 69 | 1
NPDR | 1 | 1 | 1 | 80

The misclassification rate of the PSO optimized BPN classifier is lower than those of the conventional BPN and GA optimized BPN classifiers. The various performance measures of the classifier are estimated from the TP, TN, FP and FN values of Table 6.8, and the performance analysis of the PSO optimized BPN classifier is shown in Table 6.9.

Table 6.9 Performance measures of the PSO optimized BPN classifier

Category | TP | TN | FP | FN | Sensitivity | Specificity | Accuracy (%) | PLR | NLR
CNVM | 61 | 213 | 3 | 3 | 0.95 | 0.98 | 97 | 69 | 0.04
CRVO | 56 | 216 | 4 | 4 | 0.93 | 0.98 | 97 | 51 | 0.06
CSR | 69 | 204 | 3 | 4 | 0.95 | 0.98 | 98 | 65 | 0.05
NPDR | 80 | 193 | 4 | 3 | 0.96 | 0.97 | 98 | 47 | 0.04
Average | | | | | 0.95 | 0.98 | 98 | 58 | 0.05

From these results, it is evident that the performance measures of the PSO based BPN classifier are better than those of the conventional and GA optimized classifiers. This also indirectly suggests that the features rejected by PSO but accepted by GA are sub-optimal; the accuracy of the PSO based approach increases because these insignificant features are sidelined by the algorithm. Another advantage is the reduction in computational complexity, since only 8 input layer neurons are used; this reduction also shortens the convergence time of the classifier. Furthermore, the number of iterations used for the PSO algorithm is only 520, which is quite moderate.

Though the inclusion of the PSO algorithm increases the time requirement of the automated system, the PSO algorithm is still preferred because of its superior classification accuracy results.

6.7.3.2 Results of PSO optimized SOM classifier

The confusion matrix of the PSO optimized SOM classifier is shown in Table 6.10.

Table 6.10 Confusion matrix of the PSO optimized SOM classifier

Category | Class 1 | Class 2 | Class 3 | Class 4
CNVM | 48 | 3 | 5 | 8
CRVO | 4 | 46 | 4 | 6
CSR | 5 | 6 | 55 | 7
NPDR | 5 | 5 | 9 | 64

From Table 6.10, it is evident that the PSO optimized SOM classifier yields better classification rates than the conventional SOM classifier and the GA optimized SOM classifier. The various performance measures calculated from the confusion matrix are shown in Table 6.11.

Table 6.11 Performance measures of the PSO optimized SOM classifier

Category | TP | TN | FP | FN | Sensitivity | Specificity | Accuracy (%) | PLR | NLR
CNVM | 48 | 202 | 14 | 16 | 0.75 | 0.93 | 89 | 11.5 | 0.27
CRVO | 46 | 206 | 14 | 14 | 0.77 | 0.94 | 90 | 12.1 | 0.25
CSR | 55 | 189 | 18 | 18 | 0.75 | 0.91 | 87 | 8.7 | 0.27
NPDR | 64 | 176 | 21 | 19 | 0.77 | 0.89 | 86 | 7.2 | 0.26
Average | | | | | 0.76 | 0.92 | 88 | 9.8 | 0.26

The superiority of the PSO algorithm over the GA is verified by the experimental results of Table 6.11. The accuracy is also considerably higher than that of the conventional SOM classifier, and the time requirement of the PSO algorithm is significantly lower than that of the GA. Thus, among the conventional SOM classifier, the GA optimized SOM classifier and the PSO based SOM classifier, the last proves to be the optimal classifier in terms of accuracy and convergence rate.

6.7.3.3 Results of PSO optimized fuzzy classifier

The confusion matrix of the PSO optimized fuzzy classifier is shown in Table 6.12.

Table 6.12 Confusion matrix of the PSO optimized fuzzy classifier

Category | Class 1 | Class 2 | Class 3 | Class 4
CNVM | 41 | 5 | 7 | 11
CRVO | 7 | 40 | 8 | 5
CSR | 8 | 10 | 48 | 7
NPDR | 6 | 6 | 14 | 57

The relative weakness of the fuzzy classifier for pattern recognition is again apparent in Table 6.12. The performance measures of this classifier are shown in Table 6.13.

Table 6.13 Performance measures of the PSO optimized fuzzy classifier

Category | TP | TN | FP | FN | Sensitivity | Specificity | Accuracy (%) | PLR | NLR
CNVM | 41 | 195 | 21 | 23 | 0.64 | 0.90 | 84 | 6.6 | 0.40
CRVO | 40 | 199 | 21 | 20 | 0.67 | 0.90 | 85 | 7.0 | 0.37
CSR | 48 | 178 | 29 | 25 | 0.66 | 0.86 | 81 | 4.7 | 0.40
NPDR | 57 | 174 | 23 | 26 | 0.69 | 0.88 | 82 | 5.9 | 0.35
Average | | | | | 0.67 | 0.89 | 83 | 6.0 | 0.38

The improvement in the performance of the classifier brought by the inclusion of the PSO algorithm is illustrated in Table 6.13. The accuracy of the PSO optimized fuzzy classifier is better than that of the conventional fuzzy classifier and the GA optimized fuzzy classifier, but the results are still considerably inferior to those of the neural classifiers. The difficulty of clustering the abnormal input image is the main reason for the low quality results. The time requirement of this automated system is also increased, and hence this approach is disadvantageous in terms of both accuracy and time. Though fuzzy systems are said to be accurate, they yield low quality results for pattern recognition applications such as retinal image classification. Thus, fuzzy techniques are better suited to segmentation (pixel-based classification) applications than to classification (image-based) applications.

Thus, an extensive analysis has been performed on the performance of the classifiers with the GA and PSO algorithms. A comparative analysis is now performed to highlight the merits and demerits of the various approaches.

6.7.4 Comparative analysis of the various approaches

Initially, an analysis is carried out to show the necessity of optimization algorithms for performance enhancement of the automated pattern recognition system. A comparison between the optimized and the un-optimized classifiers is shown in Table 6.14.

Table 6.14 Comparative analysis of optimized and un-optimized classifiers

Parameter | Un-optimized classifiers | GA optimized classifiers | PSO optimized classifiers
Number of input features | 16 | 12 | 8
Size of the input layer weight matrix of the neural classifiers | 16 × 20 | 12 × 20 | 8 × 20
Average classification accuracy (%) | 87 | 91 | 95

From Table 6.14, the significance of optimization algorithms for image classification is verified: the optimized classifiers yield considerably more accurate results than the un-optimized classifiers. Since the insignificant features are not removed in the conventional classifiers, the accuracy of such automated systems is usually low. The presence of non-relevant features also results in high computational complexity of the neural systems, since the size of the weight matrices depends on the number of input features.

These merits are obtained at the cost of a high computational time, a drawback that can be minimized to some extent by proper selection of the optimization algorithm. Nevertheless, optimization algorithms have proved to enhance the performance of conventional automated image classification systems.

Further, a comparative analysis between the GA and PSO algorithms is performed to determine the better optimization algorithm. The number of optimal features supplied by GA is 12, whereas the number of optimal features from PSO is 8. The accuracy of the PSO approach with 8 features is much better than that of GA with 12 features, as is evident from the experimental results. The computational complexity is also greatly reduced for the PSO optimized neural classifiers, since only 8 input layer neurons are used.

Another important factor in favour of the PSO algorithm is the availability of a standard convergence condition, whose objective is to obtain stabilized values of the particle velocities and positions; the algorithm is repeated until these criteria are satisfied. GA, on the other hand, depends purely on the number of generations (iterations), which does not guarantee optimal convergence. Hence, the performance of PSO is better than that of GA in terms of accuracy. Since a goal is set for PSO, the number of iterations required for convergence is smaller than for GA, where the algorithm is executed for more iterations with the objective of achieving higher accuracy. This accounts for the increased time requirement of the GA based classifiers, which is not practically feasible. Even though the number of mathematical operations per iteration is higher for PSO, these operations are executed for fewer iterations, whereas the limited set of swapping operations of GA is repeated over many more iterations. Hence, PSO is also superior to GA in terms of convergence rate. This analysis therefore suggests the use of PSO rather than GA for practical applications such as retinal image classification.

6.8 CONCLUSION

The necessity of optimization algorithms for performance enhancement of the automated image classification system has been verified in this research work. The accuracy of the optimized classifiers increases substantially over the un-optimized classifiers. Since several optimization algorithms are available, proper selection of the optimization algorithm is important for the overall efficiency of the system. In this work, GA and PSO are used as the optimization algorithms, and PSO is found to be more efficient than GA in terms of accuracy and convergence rate. However, the time requirement of the PSO optimized classifiers is higher than that of the un-optimized classifiers, which shows that scope for improvement still remains. Even though higher accuracy is guaranteed by the PSO optimized classifiers, their time requirement can be minimized by performing suitable modifications to the classifiers, either in the training algorithm or in the architecture. Thus, an improvement in convergence rate can be achieved without compromising accuracy through these PSO based modified classifiers.