CHAPTER 4 FUZZY LOGIC, K-MEANS, FUZZY C-MEANS AND BAYESIAN METHODS

4.1. INTRODUCTION

This chapter covers the implementation and testing of student academic performance evaluation, in order to meet the research objectives proposed in Chapter 1, using classical Fuzzy Logic, Bayesian, K-Means and Fuzzy C-Means methods.

4.2. RESULT OF FUZZY EXPERT SYSTEM

Academic performance evaluation with the Fuzzy Expert System comprises three steps:

1. Fuzzification of the inputs (semester examination results) and the output performance value.
2. Determination of the application rules and inference method.
3. Defuzzification of the performance value.

4.2.1. Fuzzification

The semester examination results are fuzzified using fuzzy input variables and their membership functions. For each student, the three semester results form the input variables of the fuzzy logic based expert system. Each input variable has five triangular membership functions (Table 4.1).

Table 4.1: Fuzzy Set of Input and Output Variables

Input Linguistic Variable   Interval              Output Linguistic Variable   Interval
Very Low                    (0, 0, 0.25)          Very Low                     (0, 0, 0.25)
Low                         (0, 0.25, 0.50)       Low                          (0, 0.25, 0.5)
Average                     (0.25, 0.50, 0.75)    Average                      (0.25, 0.5, 0.75)
High                        (0.50, 0.75, 1.00)    High                         (0.5, 0.75, 1.00)
Very High                   (0.75, 1.00, 1.00)    Very High                    (0.75, 1.00, 1.00)
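The fuzzification step can be illustrated with a minimal Python sketch. It assumes that each triple (a, b, c) in Table 4.1 gives the left foot, peak and right foot of a triangle; the names `triangular`, `FUZZY_SETS` and `fuzzify` are hypothetical and are not taken from the MATLAB implementation used in this work.

```python
# Minimal sketch of triangular fuzzification (assumed reading of Table 4.1:
# each triple is (left foot a, peak b, right foot c)).

FUZZY_SETS = {
    "Very Low": (0.0, 0.0, 0.25),
    "Low": (0.0, 0.25, 0.50),
    "Average": (0.25, 0.50, 0.75),
    "High": (0.50, 0.75, 1.00),
    "Very High": (0.75, 1.00, 1.00),
}


def triangular(x, a, b, c):
    """Degree of membership of x in the triangular fuzzy set (a, b, c)."""
    if x == b:
        return 1.0  # peak; also covers the shoulder sets where a == b or b == c
    if x <= a or x >= c:
        return 0.0  # outside the support of the triangle
    if x < b:
        return (x - a) / (b - a)  # rising edge
    return (c - x) / (c - b)  # falling edge


def fuzzify(score):
    """Return the membership degree of a score in each linguistic set."""
    return {name: triangular(score, *abc) for name, abc in FUZZY_SETS.items()}


if __name__ == "__main__":
    for name, degree in fuzzify(0.35).items():
        print(f"{name:10s} {degree:.2f}")
```

For a score of 0.35, only the Low and Average sets are active, so only the rules whose antecedents involve these sets can fire for that input.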

4.2.2. Rules and Inference Generation

The rules determine which input and output membership functions are used in the inference process. These rules are linguistic and are called IF-THEN rules:

1. If Sem-1 is VL and Sem-2 is VL and Sem-3 is VL then Performance is VU.
...
75. If Sem-1 is VS and Sem-2 is VS and Sem-3 is VS then Performance is VS.

When multiple rules are active for the same output membership function, only one membership value must be chosen. This process is called fuzzy decision or fuzzy inference [1]. A range of techniques for fuzzy decision-making and fuzzy inference has been developed. The method proposed by Mamdani [2-5] has been used in the present work:

\mu(y) = \max\left[\min\left(\mu(\mathrm{input}(i)),\ \mu(\mathrm{input}(j))\right)\right]    (4.1)

This expression determines an output membership function value for each active rule. When a rule is active, an AND operation is applied between the inputs: the smaller input membership value is chosen and is taken as the membership value of the output for that rule. This is repeated to determine the output membership function for each rule. Overall, AND (min) operations are applied between the inputs and OR (max) operations between the outputs.

4.2.3. Calculation of Performance Value

At the final stage, after the fuzzy decision process is complete, the resulting fuzzy number must be converted to a crisp value by defuzzification. Among the methods available, the centre of area (centroid) technique [1] was applied in the present work. The crisp value is calculated as follows:

z^{*} = \frac{\int \mu_{C}(z)\, z\, dz}{\int \mu_{C}(z)\, dz}    (4.2)

4.2.4. Results and Discussion

The proposed Fuzzy Expert System for student academic performance evaluation has been implemented in MATLAB. The marks, their associated original grade and level of achievement (i.e. very high, high, average, low and very low) are

shown in Table 4.2. The data set used for training and testing contains the marks of 2050 students for Semester-1 (Sem-1), Semester-2 (Sem-2) and Semester-3 (Sem-3) (Tables 4.3 and 4.4). Of these, 2000 records have been used for training and the remaining 50 for testing.

Table 4.2: Student's Marks, Original Grade and Level of Achievement

S.No.   Marks        Grade   Level of Achievement
1.      0.76-1.00    A       Very High
2.      0.56-0.75    B       High
3.      0.46-0.55    C       Average
4.      0.26-0.45    D       Low
5.      0.00-0.25    E       Very Low

Table 4.3: Student's Training Data Set

S.No.   Sem-1   Sem-2   Sem-3   Final Marks   Observed Output   Grade
1.      0.05    0.37    0.18    0.2000        0.25              E
2.      0.10    0.23    0.16    0.1633        0.25              E
3.      0.15    0.13    0.06    0.1133        0.25              E
4.      0.40    0.13    0.20    0.2433        0.25              E
5.      0.25    0.31    0.14    0.2333        0.25              E
6.      0.15    0.10    0.26    0.1700        0.25              E
7.      0.10    0.13    0.30    0.1767        0.25              E
8.      0.10    0.17    0.08    0.1167        0.25              E
9.      0.25    0.23    0.04    0.1733        0.25              E
10.     0.05    0.17    0.12    0.1133        0.25              E
11.     0.12    0.32    0.34    0.2600        0.45              D
12.     0.25    0.33    0.30    0.2933        0.45              D
13.     0.30    0.30    0.34    0.3133        0.45              D
14.     0.40    0.20    0.38    0.3267        0.45              D
15.     0.50    0.40    0.30    0.4000        0.45              D
16.     0.65    0.17    0.38    0.4000        0.45              D
17.     0.50    0.26    0.38    0.3800        0.45              D
...
1998.   0.95    0.97    0.98    0.9667        1.00              A
1999.   0.90    0.93    0.94    0.9233        1.00              A
2000.   1.00    0.83    0.98    0.9367        1.00              A

Table 4.4: Student's Testing Data Set

S.No.   Sem-1   Sem-2   Sem-3   Final Marks   Observed Output   Grade
1.      0.05    0.34    0.16    0.1833        0.25              E
2.      0.02    0.45    0.46    0.3100        0.45              D
3.      0.23    0.45    0.19    0.2900        0.45              D
4.      0.34    0.43    0.46    0.4100        0.45              D
5.      0.05    0.23    0.11    0.1300        0.25              E
6.      0.17    0.96    0.48    0.5367        0.55              C
7.      0.61    0.98    0.94    0.8433        1.00              A
8.      0.29    0.97    0.57    0.6100        0.75              B
9.      0.74    0.90    0.93    0.8567        1.00              A
10.     0.52    0.34    0.69    0.5167        0.55              C
...
48.     0.39    0.21    0.12    0.2400        0.25              E
49.     0.37    0.59    0.57    0.5100        0.55              C
50.     0.06    0.45    0.03    0.1800        0.25              E

The proposed model was then tested with the marks of 17 new students (Table 4.5).

Table 4.5: Semester Scores of 17 New Students

S.No.   Sem-1   Sem-2    Sem-3    Classical Method Output   Grade
1.      0.10    0.2333   0.2000   0.178                     E
2.      0.05    0.1667   0.1200   0.112                     E
3.      0.15    0.1333   0.1800   0.154                     E
4.      0.45    0.2667   0.4000   0.372                     D
5.      0.35    0.3333   0.3000   0.328                     D
6.      0.35    0.5000   0.3800   0.410                     D
7.      0.45    0.4333   0.5400   0.474                     C
8.      0.50    0.4000   0.5000   0.467                     C
9.      0.45    0.5000   0.5800   0.510                     C
10.     0.50    0.7000   0.6200   0.607                     B
11.     0.65    0.7000   0.7400   0.697                     B
12.     0.85    0.6000   0.7600   0.737                     B
13.     0.95    0.7667   0.8600   0.859                     A
14.     0.85    0.8333   0.9600   0.881                     A
15.     0.90    0.9000   0.9800   0.927                     A
16.     0.35    0.4500   0.7500   0.520                     C
17.     0.75    0.4500   0.3500   0.520                     C
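The classical output in Table 4.5 can be approximated with the short Python sketch below. Two points are assumptions rather than statements from the text: the classical method is taken to be the arithmetic mean of the three semester scores (which agrees with the tabulated values to within rounding), and the score is rounded to two decimals before the grade bands of Table 4.2 are applied, because those bands are specified to two decimals. The function names are hypothetical.

```python
# Sketch of the classical (average) evaluation and the grade bands of Table 4.2.
# Assumptions: classical output = arithmetic mean; scores are rounded to two
# decimals before grading because the bands are given to two decimals.

GRADE_BANDS = [  # (lower bound, grade, level of achievement)
    (0.76, "A", "Very High"),
    (0.56, "B", "High"),
    (0.46, "C", "Average"),
    (0.26, "D", "Low"),
    (0.00, "E", "Very Low"),
]


def classical_score(sem1, sem2, sem3):
    """Arithmetic mean of the three semester scores (all in [0, 1])."""
    return (sem1 + sem2 + sem3) / 3.0


def grade(score):
    """Map a score in [0, 1] to (grade, level) using Table 4.2."""
    rounded = round(score, 2)
    for lower, letter, level in GRADE_BANDS:
        if rounded >= lower:
            return letter, level
    raise ValueError("score must be non-negative")


if __name__ == "__main__":
    score = classical_score(0.45, 0.2667, 0.4000)  # student 4 in Table 4.5
    print(round(score, 3), grade(score))
```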

For each student, the three semester scores were fuzzified using the triangular membership functions, and the active membership functions were evaluated according to the Mamdani fuzzy decision technique. The output (performance value) was then defuzzified by calculating the centre (centroid) of the resulting geometric shape. The same procedure was repeated for each student. The performance values yielded by the Fuzzy-1 method are shown in Table 4.6.

Table 4.6: Semester Scores and Calculated Performance Value (Fuzzy-1)

S.No.   Sem-1   Sem-2    Sem-3    Fuzzy-1 Output   Grade
1.      0.10    0.2333   0.2000   0.253            D
2.      0.05    0.1667   0.1200   0.190            E
3.      0.15    0.1333   0.1800   0.200            E
4.      0.45    0.2667   0.4000   0.443            D
5.      0.35    0.3333   0.3000   0.390            D
6.      0.35    0.5000   0.3800   0.483            C
7.      0.45    0.4333   0.5400   0.481            C
8.      0.50    0.4000   0.5000   0.520            C
9.      0.45    0.5000   0.5800   0.574            B
10.     0.50    0.7000   0.6200   0.640            B
11.     0.65    0.7000   0.7400   0.720            B
12.     0.85    0.6000   0.7600   0.770            A
13.     0.95    0.7667   0.8600   0.870            A
14.     0.85    0.8333   0.9600   0.890            A
15.     0.90    0.9000   0.9800   0.950            A
16.     0.35    0.4500   0.7500   0.575            B
17.     0.75    0.4500   0.3500   0.575            B

All three inputs have the same triangular membership functions, so exchanging the Sem-1 and Sem-3 scores does not change the performance value: the score sets (0.35, 0.45, 0.75) and (0.75, 0.45, 0.35) both yield 0.575. If the symmetry or value range of the membership functions differs between inputs, one semester has a greater influence on the performance value than the others. For example, let us change the membership functions of Sem-3 (Table 4.7) while keeping the original criteria for the Sem-1 and Sem-2 examinations. The aim of this arrangement for the Sem-3 examination is to penalize scores below 0.50 and to reward scores above 0.50. The effect can be seen in Table 4.8: for scores below 0.50 the performance value decreases, and for scores above 0.50 it increases. There is no change for scores of 0.50,

because this is the boundary of the limit values. The active rules and performance value for the examination scores, and the surface viewer of the proposed fuzzy expert system, are shown in Fig. 4.1 and Fig. 4.2, respectively.

Fig. 4.1: Active Rules and Performance Value for Examination Scores

Fig. 4.2: Surface Viewer of Academic Performance Evaluation
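The Mamdani min-max inference of Section 4.2.2 and the centroid defuzzification of Section 4.2.3 can be sketched in Python as follows. This is only an illustration under stated assumptions: the 75-rule base of this work is not listed in full, so the sketch uses a hypothetical rule base in which the output set of a rule is the rounded mean of the input set indices; the centroid integral is approximated on a discrete grid; and the function names are invented. Its numerical outputs are therefore not expected to reproduce Table 4.6.

```python
import itertools

import numpy as np

# Triangular sets of Table 4.1, ordered Very Low ... Very High,
# used here for all three inputs and for the output.
SETS = [
    (0.0, 0.0, 0.25),
    (0.0, 0.25, 0.50),
    (0.25, 0.50, 0.75),
    (0.50, 0.75, 1.00),
    (0.75, 1.00, 1.00),
]


def tri(x, a, b, c):
    """Triangular membership of x (scalar or array) in the set (a, b, c)."""
    x = np.asarray(x, dtype=float)
    rising = np.ones_like(x) if a == b else (x - a) / (b - a)
    falling = np.ones_like(x) if b == c else (c - x) / (c - b)
    return np.clip(np.minimum(rising, falling), 0.0, 1.0)


def mamdani(scores, n_points=1001):
    """Crisp performance value for a tuple of semester scores in [0, 1]."""
    z = np.linspace(0.0, 1.0, n_points)  # discretised output universe
    memberships = [[float(tri(s, *abc)) for abc in SETS] for s in scores]
    aggregated = np.zeros_like(z)
    for labels in itertools.product(range(len(SETS)), repeat=len(scores)):
        # AND between the inputs: firing strength is the minimum membership.
        strength = min(memberships[i][lab] for i, lab in enumerate(labels))
        if strength == 0.0:
            continue  # rule is not active
        # HYPOTHETICAL consequent: rounded mean of the input set indices.
        out = int(round(sum(labels) / len(labels)))
        clipped = np.minimum(strength, tri(z, *SETS[out]))
        # OR between the rule outputs: pointwise maximum.
        aggregated = np.maximum(aggregated, clipped)
    # Centroid of the aggregated output set (discrete form of Eq. 4.2).
    return float(np.sum(aggregated * z) / np.sum(aggregated))


if __name__ == "__main__":
    print(mamdani((0.35, 0.45, 0.75)))
    print(mamdani((0.75, 0.45, 0.35)))  # same value: the inputs are symmetric
```

Because the same membership functions are used for every input and the hypothetical rule base treats the inputs identically, swapping Sem-1 and Sem-3 leaves the result unchanged. Giving one input its own sets, such as those in Table 4.7, would break this symmetry.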

Table 4.7: Fuzzy Set of Input (Sem-3)

Input Linguistic Variable   Interval
Very Low                    (0, 0, 0.40)
Low                         (0, 0.20, 0.50)
Average                     (0.40, 0.50, 0.60)
High                        (0.50, 0.80, 1.00)
Very High                   (0.60, 1.00, 1.00)

Table 4.8: Variation in Performance According to Semester-3 (Fuzzy-2)

S.No.   Sem-1   Sem-2    Sem-3    Fuzzy-2 Output   Grade
1.      0.10    0.2333   0.2000   0.151            E
2.      0.05    0.1667   0.1200   0.101            E
3.      0.15    0.1333   0.1800   0.120            E
4.      0.45    0.2667   0.4000   0.351            D
5.      0.35    0.3333   0.3000   0.312            D
6.      0.35    0.5000   0.3800   0.382            C
7.      0.45    0.4333   0.5400   0.521            C
8.      0.50    0.4000   0.5000   0.467            C
9.      0.45    0.5000   0.5800   0.491            B
10.     0.50    0.7000   0.6200   0.752            A
11.     0.65    0.7000   0.7400   0.780            A
12.     0.85    0.6000   0.7600   0.790            A
13.     0.95    0.7667   0.8600   0.880            A
14.     0.85    0.8333   0.9600   0.940            A
15.     0.90    0.9000   0.9800   0.990            A
16.     0.35    0.4500   0.7500   0.685            B
17.     0.75    0.4500   0.3500   0.440            C

4.2.5. Comparison of Classical, Fuzzy-1 and Fuzzy-2 Methods

The classical, Fuzzy-1 and Fuzzy-2 methods for student academic performance evaluation are compared in Table 4.9. A student who is successful under the classical assessment method is also successful under Fuzzy-1. Comparison of the classical method with Fuzzy-2 reveals differences in the performance values. For scores below 0.50, the performance value of Fuzzy-2 is smaller than that of the classical method; for scores above 0.50, it is greater. For example, a student scoring 0.35 in Sem-1, 0.50 in Sem-2 and 0.38 in Sem-3, who was unsuccessful in the classical and Fuzzy-1 methods, got

success in the Fuzzy-2 method. The linear relationship among the classical, Fuzzy-1 and Fuzzy-2 outputs can be seen in Fig. 4.3.

Table 4.9: Comparison of Performance Evaluation Methods

                                   Classical Method    Fuzzy-1            Fuzzy-2
S.No.   Sem-1   Sem-2    Sem-3     Output   Grade      Output   Grade     Output   Grade
1.      0.10    0.2333   0.2000    0.178    E          0.253    D*        0.151    E*
2.      0.05    0.1667   0.1200    0.112    E          0.190    E         0.101    E
3.      0.15    0.1333   0.1800    0.154    E          0.200    E         0.120    E
4.      0.45    0.2667   0.4000    0.372    D          0.443    D         0.351    D
5.      0.35    0.3333   0.3000    0.328    D          0.390    D         0.312    D
6.      0.35    0.5000   0.3800    0.410    D          0.483    C*        0.382    C
7.      0.45    0.4333   0.5400    0.474    C          0.481    C         0.521    C
8.      0.50    0.4000   0.5000    0.467    C          0.520    C         0.467    C
9.      0.45    0.5000   0.5800    0.510    C          0.574    B*        0.491    B
10.     0.50    0.7000   0.6200    0.607    B          0.640    B         0.752    A*
11.     0.65    0.7000   0.7400    0.697    B          0.720    B         0.780    A*
12.     0.85    0.6000   0.7600    0.737    B          0.770    A*        0.790    A
13.     0.95    0.7667   0.8600    0.859    A          0.870    A         0.880    A
14.     0.85    0.8333   0.9600    0.881    A          0.890    A         0.940    A
15.     0.90    0.9000   0.9800    0.927    A          0.950    A         0.990    A
16.     0.35    0.4500   0.7500    0.520    C          0.575    B*        0.685    B
17.     0.75    0.4500   0.3500    0.520    C          0.575    B*        0.440    C*

*Improved value

The accuracy of the proposed model on both the training and testing data has been measured by the Root Mean Square Error (RMSE) (Table 4.10). For the training data, the RMSE of the Fuzzy-2 model (0.1217) is lower than that of the Fuzzy-1 model (0.1312). For the testing data, the RMSE of the Fuzzy-2 model (0.1182) is again lower than that of the Fuzzy-1 model (0.1401), which further demonstrates the benefit of Fuzzy-2.

Table 4.10: RMSE of Training and Testing Data Sets

S.No.   Data Set          Fuzzy-1   Fuzzy-2
1.      Training (RMSE)   0.1312    0.1217
2.      Testing (RMSE)    0.1401    0.1182
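For reference, the RMSE used in Table 4.10 follows the standard definition, the square root of the mean squared difference between observed and predicted outputs. A minimal Python sketch is given below; the function name and the sample values in the usage lines are illustrative only and are not taken from the data set.

```python
import math


def rmse(observed, predicted):
    """Root Mean Square Error between two equally long sequences."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Illustrative values only: observed outputs against model outputs.
    observed = [0.25, 0.45, 0.55]
    predicted = [0.20, 0.40, 0.60]
    print(round(rmse(observed, predicted), 4))
```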

Fig. 4.3: Performance Value by Classical, Fuzzy-1 & Fuzzy-2 Methods

4.3. RESULT OF FUZZY K-MEANS

The data sets (Table 4.3 and Table 4.4) were divided into clusters using the K-Means clustering method in MATLAB. The students have been classified into five groups (clusters): very high, high, average, low and very low. The K-Means clustering method finds the cluster centers by minimizing an objective function. It alternates between updating the membership matrix and updating the cluster centers until no further improvement in the objective function is possible. Since the algorithm initializes the cluster centers randomly, its performance is affected by the initial cluster centers.

Fig. 4.4: Objective Function Values of the K-Means Method
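The alternation described above can be sketched with NumPy as follows. This is a generic K-Means (Lloyd) iteration written for illustration, not the MATLAB routine used in this work; the function names, the fixed random seed and the stopping test are assumptions.

```python
import numpy as np


def nearest(data, centers):
    """Index of the closest center (Euclidean distance) for each row of data."""
    dist = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
    return dist.argmin(axis=1)


def kmeans(data, k, n_iter=100, seed=0):
    """Return (centers, labels, objective) for k clusters."""
    data = np.asarray(data, dtype=float)
    rng = np.random.default_rng(seed)
    # Random initialisation: k distinct data points become the first centers.
    centers = data[rng.choice(len(data), size=k, replace=False)]
    for _ in range(n_iter):
        labels = nearest(data, centers)  # update the (crisp) memberships
        new_centers = np.array([
            data[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])  # update the centers; an empty cluster keeps its old center
        if np.allclose(new_centers, centers):
            break  # no further improvement is possible
        centers = new_centers
    labels = nearest(data, centers)
    # Objective: sum of squared distances of the points to their centers.
    objective = float(((data - centers[labels]) ** 2).sum())
    return centers, labels, objective


if __name__ == "__main__":
    scores = [[0.10, 0.233, 0.20], [0.15, 0.133, 0.18],
              [0.95, 0.767, 0.86], [0.85, 0.833, 0.96]]
    centers, labels, objective = kmeans(scores, k=2)
    print(labels, objective)
```

Running the function with several seeds and keeping the run with the smallest objective is a common way to reduce the sensitivity to the initial centers.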

After the cluster centers are determined, the evaluation data vectors are assigned to their respective clusters according to the distance between each vector and each of the cluster centers. An error measure is then calculated; the root mean square error (RMSE) is used for this purpose. The results of this method are given in Table 4.11 and the objective function values are shown in Fig. 4.4. It may be noted that 03 students belong to cluster (very high), 03 students belong to cluster (high), 07 students belong to cluster (average), 02 students belong to cluster (low) and 02 students belong to cluster (very low) (Table 4.11). The drawback of the K-Means clustering method is that it cannot calculate the fuzzy membership value and the total mark of a student. This problem may be solved by FCM.

Table 4.11: Student's Academic Performance Results Using K-Means

S.No.   Sem-1   Sem-2   Sem-3   Grade based on K-Means
1.      0.100   0.233   0.200   D
2.      0.500   0.167   0.120   E
3.      0.150   0.133   0.180   D
4.      0.450   0.267   0.400   C
5.      0.350   0.333   0.300   C
6.      0.350   0.500   0.380   C
7.      0.450   0.433   0.540   C
8.      0.500   0.400   0.500   C
9.      0.450   0.500   0.580   C
10.     0.500   0.700   0.620   B
11.     0.650   0.700   0.740   B
12.     0.850   0.600   0.760   B
13.     0.950   0.767   0.860   A
14.     0.850   0.833   0.960   A
15.     0.900   0.900   0.980   A
16.     0.35    0.450   0.750   C
17.     0.75    0.450   0.350   D

4.4. RESULT OF FUZZY C-MEANS

The rule based Fuzzy Expert System proposed for the first time in this work for academic performance evaluation creates fuzzy classes of the input data set by using the Fuzzy C-Means clustering algorithm. The steps of the proposed method are given below:

Step-1 (Fuzzification): Using the Fuzzy C-Means clustering algorithm in MATLAB, the student's score data was classified into five

classes or clusters, namely Very High, High, Average, Low, and Very Low for student's academic performance evaluation. The application of the Fuzzy C-Means (FCM) algorithm is illustrated with the data set of students' marks shown in Table 4.12, which gives the membership value of each student in each cluster. As an illustration, the values in the 16th row of Table 4.12 can be interpreted as: Very High = 0.0106, High = 0.0518, Average = 0.8862, Low = 0.0323, Very Low = 0.0191, so that max(0.0106, 0.0518, 0.8862, 0.0323, 0.0191) = 0.8862.

Table 4.12: The Membership Function Values of Fuzzy C-Means

S.No.   Sem-1   Sem-2   Sem-3   Very High   High    Average   Low     Very Low
1.      0.10    0.233   0.200   0.012       0.024   0.056     0.052   0.857
2.      0.50    0.167   0.120   0.018       0.114   0.055     0.789   0.025
3.      0.15    0.133   0.180   0.012       0.024   0.056     0.052   0.857
4.      0.45    0.267   0.400   0.018       0.061   0.078     0.776   0.067
5.      0.35    0.333   0.300   0.004       0.009   0.029     0.940   0.019
6.      0.35    0.500   0.380   0.011       0.052   0.886     0.032   0.019
7.      0.45    0.433   0.540   0.012       0.049   0.771     0.116   0.053
8.      0.50    0.400   0.500   0.014       0.857   0.055     0.041   0.032
9.      0.45    0.500   0.580   0.005       0.916   0.026     0.038   0.015
10.     0.50    0.700   0.620   0.036       0.652   0.200     0.088   0.024
11.     0.65    0.700   0.740   0.561       0.299   0.066     0.054   0.020
12.     0.85    0.600   0.760   0.015       0.949   0.018     0.014   0.004
13.     0.95    0.767   0.860   0.971       0.018   0.005     0.004   0.002
14.     0.85    0.833   0.960   0.952       0.027   0.009     0.008   0.004
15.     0.90    0.900   0.980   0.960       0.037   0.008     0.028   0.014
16.     0.35    0.450   0.750   0.011       0.052   0.886     0.032   0.020
17.     0.75    0.450   0.350   0.019       0.089   0.120     0.741   0.031

From those five values, the 16th student is most suitably placed in the class or cluster (average), since that is where he/she has the highest degree of membership. The 17th student is most suitably placed in the class or cluster (low), since that is where he/she has the highest degree of membership. Thus, it may be concluded that the 16th student has improved consistently while the 17th student has deteriorated consistently. Similarly, the

following classes or clusters have been obtained for students partitioning in Semester-1 and Semester-2 examinations:

1. The first class or cluster (Very High) consists of 4 students.
2. The second class or cluster (High) consists of 4 students.
3. The third class or cluster (Average) consists of 3 students.
4. The fourth class or cluster (Low) consists of 4 students.
5. The fifth class or cluster (Very Low) consists of 2 students.

Step-2 (Output Estimation): Regression problems deal with the estimation of an output value based on input values. When used for classification, the input values are values from the database and the output values represent the classes. Regression takes a set of data and fits the data to a formula. The linear regression formula in two-dimensional space is given below:

y = a + bx    (4.3)

where a and b are constants determined by the normal equations for the best fit of a linear relationship between input and output. This model estimates the actual relationship between input and output, and the fitted model is used to predict an output value for a given input value. In this research work, a linear regression model fitted in MATLAB has been used to estimate the output of the rule based Fuzzy Expert System for academic performance evaluation. The outputs of cluster (Very High), cluster (High), cluster (Average), cluster (Low) and cluster (Very Low) are given below:

Very High: Y = 1.0000
High:      Y = 0.8938 - 0.1897 X
Average:   Y = 0.7110 - 0.2925 X
Low:       Y = 0.5750 + 0.0150 X
Very Low:  Y = 0.1684 + 0.5943 X

where X is the student's mark in Sem-1.

Step-3 (Rule Generation): The FCM method provided faster convergence and higher accuracy for student's academic performance evaluation based on the following five rules:

1. If student belongs to cluster (very high) then student performance is very high.

2. If student belongs to cluster (high) then student performance is high.
3. If student belongs to cluster (average) then student performance is average.
4. If student belongs to cluster (low) then student performance is low.
5. If student belongs to cluster (very low) then student performance is very low.

Step-4 (Defuzzification) Calculation of Academic Performance: The final academic performance of a student is the membership-weighted average of the cluster outputs, determined by the following formula:

Y = \frac{\mu_{VH}(x) Y_{VH} + \mu_{H}(x) Y_{H} + \mu_{A}(x) Y_{A} + \mu_{L}(x) Y_{L} + \mu_{VL}(x) Y_{VL}}{\mu_{VH}(x) + \mu_{H}(x) + \mu_{A}(x) + \mu_{L}(x) + \mu_{VL}(x)}

Y = ... = 0.5826

Similarly, the academic performance of the other students has been calculated (Table 4.13). Fig. 4.5 shows the evolution of the objective function for student's academic performance evaluation. The evolution of the objective function suggests that the FCM method is better than the K-Means method.

Fig. 4.5: Performance of Objective Function
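Steps 1 and 4 can be sketched in Python with NumPy. The sketch uses the standard Fuzzy C-Means update equations with fuzzifier m = 2, which is an assumption because the settings of the MATLAB run are not stated here, followed by the membership-weighted average of the cluster outputs. The function names and the illustrative cluster outputs in the usage lines are hypothetical.

```python
import numpy as np


def fcm(data, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard Fuzzy C-Means. Returns (centers, membership matrix u)."""
    data = np.asarray(data, dtype=float)
    rng = np.random.default_rng(seed)
    u = rng.random((len(data), c))
    u /= u.sum(axis=1, keepdims=True)  # each row of memberships sums to 1
    for _ in range(n_iter):
        w = u ** m
        centers = (w.T @ data) / w.sum(axis=0)[:, None]  # weighted means
        dist = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)  # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        new_u = inv / inv.sum(axis=1, keepdims=True)  # membership update
        converged = np.abs(new_u - u).max() < tol
        u = new_u
        if converged:
            break
    return centers, u


def defuzzify(memberships, cluster_outputs):
    """Membership-weighted average of the per-cluster outputs (Step-4)."""
    mu = np.asarray(memberships, dtype=float)
    y = np.asarray(cluster_outputs, dtype=float)
    return float((mu * y).sum() / mu.sum())


if __name__ == "__main__":
    scores = [[0.10, 0.233, 0.20], [0.15, 0.133, 0.18],
              [0.95, 0.767, 0.86], [0.85, 0.833, 0.96]]
    centers, u = fcm(scores, c=2)
    print(np.round(u, 3))
    # Illustrative cluster outputs only, one per column of u.
    print(defuzzify(u[0], [0.2, 0.9]))
```

Unlike K-Means, every student receives a degree of membership in every cluster, which is what allows the weighted-average step to produce a continuous performance value.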

Consider the 16th student, who scored 0.35 in Sem-1, 0.45 in Sem-2 and 0.75 in Sem-3 and was assigned a performance index of 0.52 by the statistical method (Table 4.13). Similarly, the 17th student, who scored 0.75 in Sem-1, 0.45 in Sem-2 and 0.35 in Sem-3, was also assigned a performance index of 0.52. These two students were assigned performance indices of 0.49 and 0.42 by the Fuzzy-2 method and 0.68 and 0.42 by the FCM method, respectively. It may be concluded that the 16th student has improved consistently, while the 17th student has deteriorated consistently. Therefore, the Fuzzy C-Means clustering technique is more suitable than the statistical, classical fuzzy and K-Means methods for academic performance evaluation. In this model, the number of fuzzy rules is much smaller than in the existing classical Fuzzy Expert System.

Table 4.13: Comparison of Statistical, Fuzzy-2, K-Means and FCM

                                  Statistical Method   Fuzzy-2            K-Means   FCM
S.No.   Sem-1   Sem-2   Sem-3     Output   Grade       Output   Grade     Grade     Output   Grade
1.      0.10    0.233   0.200     0.178    E           0.223    E*        D*        0.214    E
2.      0.05    0.167   0.120     0.112    E           0.170    E         E         0.161    E
3.      0.15    0.133   0.180     0.154    E           0.190    E         D*        0.201    E
4.      0.45    0.267   0.400     0.372    D           0.453    C*        C         0.431    D*
5.      0.35    0.333   0.300     0.328    D           0.410    D         C*        0.294    D
6.      0.35    0.500   0.380     0.410    D           0.530    C         C         0.528    C
7.      0.45    0.433   0.540     0.474    C           0.572    B         C*        0.532    C*
8.      0.50    0.400   0.500     0.467    C           0.540    C         C         0.583    B*
9.      0.45    0.500   0.580     0.510    C           0.594    B         C*        0.561    B
10.     0.50    0.700   0.620     0.607    B           0.752    A*        B*        0.692    B*
11.     0.65    0.700   0.740     0.697    B           0.780    A*        B*        0.792    A
12.     0.85    0.600   0.760     0.737    B           0.790    A         B*        0.749    B*
13.     0.95    0.766   0.860     0.859    A           0.880    A         A         0.960    A
14.     0.85    0.833   0.960     0.881    A           0.940    A         A         0.985    A
15.     0.90    0.900   0.980     0.927    A           0.990    A         A         0.995    A
16.     0.35    0.450   0.750     0.520    C           0.540    C         C         0.681    B*
17.     0.75    0.450   0.350     0.520    C           0.450    C         D*        0.424    D*

The Root Mean Square Error (RMSE) is employed to evaluate the accuracy of the models. For the training data sets, the RMSE is 0.1012 for FCM, 0.1191 for K-Means, 0.1217 for Fuzzy-2 and 0.1312 for Fuzzy-1 (Table 4.14). For the testing data sets, the RMSE is 0.1073 for FCM, 0.1099 for K-Means, 0.1182 for Fuzzy-2 and 0.1401 for Fuzzy-1. In both cases the FCM model has the lowest error, which demonstrates its benefit.

Table 4.14: RMSE of Training and Testing Data Sets

S.No.   Data Set          Fuzzy-1   Fuzzy-2   K-Means   FCM
1.      Training (RMSE)   0.1312    0.1217    0.1191    0.1012
2.      Testing (RMSE)    0.1401    0.1182    0.1099    0.1073

4.5. RESULTS OF BAYESIAN METHOD

The class labels of students are predicted using Bayesian classification with the help of the training data given in Table 4.15. There are 14 data sets belonging to class first, 14 data sets belonging to class second, 14 data sets belonging to class third and 8 data sets belonging to class fail. The training data sets are described by the attributes end semester marks, class test grade, seminar performance, assignment, general proficiency, attendance, and lab work. The class label attribute, class type, has four distinct values, namely first, second, third and fail. The data for a new student to be classified are shown in Table 4.16. We need to maximize P(X | C_i) P(C_i) for i = 1, 2, 3, 4. P(C_i), the prior probability of each class, can be computed from the training data set:

P(Class Type = First)  = 14/50 = 0.280
P(Class Type = Second) = 14/50 = 0.280
P(Class Type = Third)  = 14/50 = 0.280
P(Class Type = Fail)   = 8/50  = 0.160
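The prior computation above, and the naive Bayesian decision rule it feeds, can be sketched in Python. The class counts (14, 14, 14, 8) are those stated in the text. The small training list in the sketch contains only the eight tuples that are printed in Table 4.15 and is used purely as toy data; the function names are hypothetical, and the scores it prints are not the values obtained from the full 50-tuple data set.

```python
from collections import Counter


def priors(class_counts):
    """Prior probability of each class from its number of training tuples."""
    total = sum(class_counts.values())
    return {c: n / total for c, n in class_counts.items()}


def naive_bayes_scores(rows, labels, x):
    """Return P(x | C) * P(C) for every class C, assuming independent attributes."""
    class_counts = Counter(labels)
    prior = priors(class_counts)
    scores = {}
    for c, n_c in class_counts.items():
        likelihood = 1.0
        for k, value in enumerate(x):
            # Relative frequency of attribute k taking this value within class c.
            matches = sum(1 for row, lab in zip(rows, labels)
                          if lab == c and row[k] == value)
            likelihood *= matches / n_c
        scores[c] = likelihood * prior[c]
    return scores


# Toy data: the eight tuples printed in Table 4.15 (rows 1-5 and 48-50).
ROWS = [
    ("first", "good", "good", "yes", "yes", "good", "yes"),
    ("first", "good", "average", "yes", "no", "good", "yes"),
    ("first", "good", "average", "no", "no", "average", "no"),
    ("first", "average", "good", "no", "no", "good", "yes"),
    ("first", "average", "average", "no", "yes", "good", "yes"),
    ("fail", "poor", "poor", "no", "no", "poor", "yes"),
    ("fail", "average", "average", "yes", "yes", "good", "yes"),
    ("fail", "poor", "good", "no", "no", "poor", "no"),
]
LABELS = ["first", "first", "first", "first", "first", "fail", "second", "fail"]
NEW_STUDENT = ("first", "good", "good", "yes", "no", "good", "yes")  # Table 4.16

if __name__ == "__main__":
    print(priors({"first": 14, "second": 14, "third": 14, "fail": 8}))
    scores = naive_bayes_scores(ROWS, LABELS, NEW_STUDENT)
    print(max(scores, key=scores.get), scores)
```

With so few tuples some class-conditional frequencies are zero, which drives the whole product for that class to zero. The sketch applies no smoothing, in line with the plain frequency estimates used in this section.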

Table 4.15: Class-Labeled Training Tuples from the Students Data Set

S.No.   End Semester   Class Test   Seminar       Assignment   General       Attendance   Lab Work   Class Type
        Marks          Grade        Performance                Proficiency
1.      first          good         good          yes          yes           good         yes        first
2.      first          good         average       yes          no            good         yes        first
3.      first          good         average       no           no            average      no         first
4.      first          average      good          no           no            good         yes        first
5.      first          average      average       no           yes           good         yes        first
...
48.     fail           poor         poor          no           no            poor         yes        fail
49.     fail           average      average       yes          yes           good         yes        second
50.     fail           poor         good          no           no            poor         no         fail

Table 4.16: Data Set for a New Student

S.No.   End Semester   Class Test   Seminar       Assignment   General       Attendance   Lab Work
        Marks          Grade        Performance                Proficiency
1.      first          good         good          yes          no            good         yes

To compute P(X | C_i) for i = 1, 2, 3, 4, the following probabilities have been calculated:

P(End Semester Marks = First | Class Type = First)  = 9/18 = 0.5000
P(End Semester Marks = First | Class Type = Second) = 2/18 = 0.1111
P(End Semester Marks = First | Class Type = Third)  = 2/18 = 0.1111
P(End Semester Marks = First | Class Type = Fail)   = 1/12 = 0.0833
P(Class Test = Good | Class Type = First)  = 8/14 = 0.5714

P(Class Test = Good | Class Type = Second) = 6/14 = 0.4286
P(Class Test = Good | Class Type = Third)  = 1/14 = 0.0714
P(Class Test = Good | Class Type = Fail)   = 1/8  = 0.1250
P(Seminar Performance = Good | Class Type = First)  = 5/14 = 0.3571
P(Seminar Performance = Good | Class Type = Second) = 5/14 = 0.3571
P(Seminar Performance = Good | Class Type = Third)  = 2/14 = 0.1429
P(Seminar Performance = Good | Class Type = Fail)   = 1/8  = 0.1250
P(Assignment = Yes | Class Type = First)  = 10/14 = 0.7143
P(Assignment = Yes | Class Type = Second) = 11/14 = 0.7857
P(Assignment = Yes | Class Type = Third)  = 5/14  = 0.3571
P(Assignment = Yes | Class Type = Fail)   = 1/8   = 0.1250
P(General Proficiency = No | Class Type = First)  = 5/14 = 0.3571
P(General Proficiency = No | Class Type = Second) = 3/14 = 0.2143
P(General Proficiency = No | Class Type = Third)  = 6/14 = 0.4286
P(General Proficiency = No | Class Type = Fail)   = 4/8  = 0.5000
P(Attendance = Good | Class Type = First)  = 11/17 = 0.6471

P(Attendance = Good | Class Type = Second) = 9/17 = 0.5294
P(Attendance = Good | Class Type = Third)  = 4/17 = 0.2353
P(Attendance = Good | Class Type = Fail)   = 1/11 = 0.0909
P(Lab Work = Yes | Class Type = First)  = 9/14  = 0.6429
P(Lab Work = Yes | Class Type = Second) = 13/14 = 0.9286
P(Lab Work = Yes | Class Type = Third)  = 11/14 = 0.7857
P(Lab Work = Yes | Class Type = Fail)   = 5/8   = 0.3571

With the help of the above probabilities, the following has been obtained:

P(New Student | Class Type = First)
  = P(End Semester Marks = First | Class Type = First)
    × P(Class Test = Good | Class Type = First)
    × P(Seminar Performance = Good | Class Type = First)
    × P(Assignment = Yes | Class Type = First)
    × P(General Proficiency = No | Class Type = First)
    × P(Attendance = Good | Class Type = First)
    × P(Lab Work = Yes | Class Type = First)
  = 0.5 × 0.5714 × 0.3571 × 0.7174 × 0.3571 × 0.6471 × 0.6429 = 0.01087

P(New Student | Class Type = Second) = 0.1111 × 0.4286 × 0.3570 × 0.7857 × 0.2143 × 0.5294 × 0.9286 = 0.00140

P(New Student | Class Type = Third) = 0.1111 × 0.0714 × 0.1429 × 0.3571 × 0.4286 × 0.2353 × 0.7857 = 0.000032

P(New Student | Class Type = Fail) = 0.0625 × 0.125 × 0.125 × 0.125 × 0.5 × 0.0833 × 0.3571 = 0.000001816

To find the class C_i that maximizes P(X | C_i) P(C_i), we compute

P(New Student | Class Type = First) P(Class Type = First)   = 0.01087 × 0.28 = 0.0030436
P(New Student | Class Type = Second) P(Class Type = Second) = 0.00140 × 0.28 = 0.000392
P(New Student | Class Type = Third) P(Class Type = Third)   = 0.000032 × 0.28 = 0.00000896
P(New Student | Class Type = Fail) P(Class Type = Fail)     = 0.000001816 × 0.16 = 0.00000029

Since the product for class first is the largest, the Bayesian classifier predicts that the new student belongs to class first. In the same manner, any new student can be assigned to the appropriate class based on performance.

4.6. CONCLUSION

This research work focuses on the development of a fuzzy logic based expert system and a Fuzzy C-Means based fuzzy expert system for academic performance evaluation. A new method for the allocation of new students based on the Bayesian approach is also presented. A difference in outcomes is seen between the classical method and the proposed fuzzy logic based expert system when the results are evaluated. While the classical method adheres to a constant mathematical rule, evaluation with fuzzy logic offers great flexibility and reliability. The proposed FCM based Fuzzy Expert System automatically converts the crisp data into fuzzy sets and also calculates the total marks of a student who appeared in the Semester-1, Semester-2 and Semester-3 examinations. A simple and qualitative methodology for comparing the predictive power of the clustering algorithm and the Euclidean distance is evident as a result of this work. The FCM clustering models have improved on some limitations of the existing traditional methods, such as the average method and the statistical method. The FCM method is a better model than classical fuzzy logic for modeling academic performance in the educational domain. However, due to multiple iterations and various eigenvectors, the FCM method suffers from a heavy computational burden and is time-consuming. Apart from this, it is also

highly sensitive to the initialization treatment, which usually requires a priori knowledge of the number of clusters to form the initial cluster centers. Such limitations can be mitigated by the Subtractive clustering based Takagi-Sugeno (T-S) fuzzy model [6], and by combining Subtractive clustering with FCM and with ANFIS, called the hybrid SC-FCM and SC-ANFIS methods, respectively (described in the next chapter).

REFERENCES

[1]. Padhy, N. P. Artificial Intelligence and Intelligent System. Oxford University Press, (2005).
[2]. Zadeh, L. A. Fuzzy Logic: Advanced Concepts and Structures. IEEE, Piscataway, New York, (1992).
[3]. Zadeh, L. A. Soft Computing and Fuzzy Logic. IEEE Software, Vol. 11, No. 6, (1994): 48-56.
[4]. Takagi, T., and M. Sugeno. Fuzzy Identification of Systems and Its Applications to Modeling and Control. IEEE Transaction, Systems, Man and Cybernetics, Vol. 15, (1985): 116-132.
[5]. Mamdani, E. H., and S. Assilian. An Experiment in Linguistic Synthesis with A Fuzzy Logic Controller. International Journal of Man-Machine Studies, Vol. 7, No. 1, (1975): 1-13.
[6]. Huaguang, Z., L. Jianhong, and C. Laijiu. The Theory and Application Research on Predictive Generalized Control. Acta Automatica Sinica, Vol. 19, No. 3, (1993): 9-17.