DESIGN AND EVALUATION OF MACHINE LEARNING MODELS WITH STATISTICAL FEATURES


EXPERIMENTAL WORK PART I

CHAPTER 6

DESIGN AND EVALUATION OF MACHINE LEARNING MODELS WITH STATISTICAL FEATURES

The evaluation of models built using statistical features in conjunction with various feature subset selection algorithms, data discretisers and data transforms, along with various classifiers, is reported. The result obtained at the end of the evaluation is an expert system or machine learning model that uses the most preferred feature subset selection algorithm, data discretiser or data transform along with the best classifier, built on statistical features.

CHAPTER 6

DESIGN AND EVALUATION OF MACHINE LEARNING MODELS WITH STATISTICAL FEATURES

6.1 INTRODUCTION

Misfire detection using the signals acquired from the engine under different operating conditions has to undergo a few processes before relevant information can be extracted for classifier training. The fault descriptors basically contain a set of parameters computed from the signal, called features, which are used for building the machine learning model capable of fault identification. A wide variety of features can be extracted from the acquired signals, but the suitability of the feature set has to be evaluated thoroughly before building the model. Since the signature or signal from each device is unique, a comprehensive decision on the type of feature to be used is often not possible without evaluating a few possibilities. The simplest and computationally cheapest feature set is often chosen as the first choice. In this section, the use of statistical features as the basic building block for the machine learning model is evaluated in combination with the various data pre-processing techniques discussed in Chapter 3 and the different learning algorithms presented in Chapter 4. The main focus is to build a misfire detection system by the synthesis of the best performing: a) discretisation technique, b) feature subset selection and/or feature transform and c) machine learning algorithm. This combination has to be evaluated in detail at each and every stage before crystallising the final machine learning model, with an aim to develop a robust system capable of consistently high performance and inherent tolerance to variations in noise and signal conditions. The statistical analysis of the acquired signal yields a set of features which are used for fault diagnosis as described earlier. The next logical step after feature extraction (refer to Figure 3.6 of Chapter 3) is discretisation of the data, followed by data transformation and/or feature subset selection. Building and evaluating the model after each process, with an aim to identify the possibility of achieving good model performance by eliminating any one or all of the data pre-processing steps, is attempted. Finally, combinations of a) data

discretisation and feature subset selection, and b) data discretisation and data transformation techniques are also evaluated. These probable combinations are built into the various machine learning models discussed in Section 3.4 and evaluated.

6.2 STATISTICAL FEATURES

The eleven statistical features extracted from the engine block vibration signals were selected as the basis for the study. The features are mean, standard error, median, standard deviation, sample variance, kurtosis, skewness, range, minimum, maximum, and sum. The important logic behind data preprocessing, apart from noise and outlier detection, is that all the features may not contain exclusive information required for machine learning. It is generally observed that some features may yield more information than others, and a few can be highly correlated, containing similar information. The correlated data do not add information for building the classifier and are considered an overload due to the repetition of the same information in more than one feature. The process of identifying and selecting good features which reveal more information for classification is called feature subset selection. This process is usually preceded by dimensionality reduction, where the volume of information is reduced by granulating or aggregating data for ease of computing, as explained in Section 3.4. This process is analogous to palletisation or the bundling of components for improving data processing efficiency.

6.3 ESTIMATION OF THE PREFERRED FSS TECHNIQUE

In this work, decision tree based FSS and CFS based FSS are chosen for building the classifier model. A detailed evaluation to identify the feature subset with the minimum number of features has to be performed to validate the FSS procedure and to ensure a classification model with minimum complexity.

6.3.1 Effect of number of features

A decision tree based FSS and a CFS based FSS are inducted as data preprocessing techniques in building the classifier model. The effect of the number of features on classification efficiency is investigated using these techniques. The feature subset

identified through this process will be used for developing and evaluating the various algorithm based models. The significant part of the decision tree in Figure 6.1, obtained by using all the features, shows that the root node is sample variance, which is the feature with the maximum discriminating capacity, closely followed by standard deviation and standard error. Percolating down the tree branches, it is evident that the list continues with skewness, range, kurtosis, minimum, mean and median. The statistical features sum and maximum do not find a place in the tree, indicating that they do not have any additional discriminating power to augment the classifier. The decision tree is capable of listing the features in the order of importance for use in a classifier, but it is not capable of suggesting a crisp subset of features or the minimum number of features that would be most suitable for building the classifier. Hence optimising the number of discriminant features (features which carry information for classification) is essential. This process is accomplished by evaluating the classifier performance starting with the root node, cumulatively adding features in their order of importance and evaluating the classifier at each stage of feature addition. A similar exercise has to be followed for CFS also. The application of the CFS evaluator recommends the following seven statistical features: standard error, standard deviation, kurtosis, skewness, range, minimum and maximum. CFS is capable of identifying features based on correlation avoidance, but is not capable of listing the features based on their importance. One proposed method for validating the recommendation of CFS is to build a classification tree using the decision tree algorithm and list the features from the root node in descending order of importance, as described in the previous paragraph; the significant part of this tree is represented in Figure 6.2. It is noted that CFS validation requires the assistance of an additional algorithm if the evaluation has to be done thoroughly. In this work, the decision tree algorithm is used for feature ranking. The features, listed in descending order of their importance, are standard deviation at the root node, followed by standard error, skewness, range, kurtosis, minimum and maximum.
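As a rough illustration of this ranking step, the sketch below (not the thesis code; the feature names and the feature table X with label vector y are assumed) computes the eleven statistical features for a signal segment and ranks them with a decision tree, using scikit-learn's impurity based importances as a stand-in for the decision tree evaluator described above:

import numpy as np
import pandas as pd
from scipy import stats
from sklearn.tree import DecisionTreeClassifier

FEATURE_NAMES = ["mean", "standard_error", "median", "standard_deviation",
                 "sample_variance", "kurtosis", "skewness", "range",
                 "minimum", "maximum", "sum"]

def statistical_features(segment):
    # Eleven statistical features of one vibration signal segment (1-D array).
    n = len(segment)
    sd = segment.std(ddof=1)
    return {
        "mean": segment.mean(),
        "standard_error": sd / np.sqrt(n),
        "median": np.median(segment),
        "standard_deviation": sd,
        "sample_variance": segment.var(ddof=1),
        "kurtosis": stats.kurtosis(segment),
        "skewness": stats.skew(segment),
        "range": segment.max() - segment.min(),
        "minimum": segment.min(),
        "maximum": segment.max(),
        "sum": segment.sum(),
    }

def rank_features(X, y):
    # Fit a decision tree on the feature table X (pandas DataFrame) and rank the
    # features by impurity based importance; the root-node feature comes first.
    tree = DecisionTreeClassifier(random_state=0).fit(X[FEATURE_NAMES], y)
    importances = pd.Series(tree.feature_importances_, index=FEATURE_NAMES)
    return importances.sort_values(ascending=False)

CFS itself is not part of scikit-learn, so the correlation based subset reported above would in practice come from a separate tool.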

Figure 6.1 Decision tree with all features considered

Figure 6.2 Decision tree with CFS identified features

6.3.2 Effect of number of features on classification efficiency

The CFS evaluation suggests a subset of seven features, while the decision tree identified nine features, as reported in Section 6.3.1. An analysis of the effect of the number of features on classification efficiency was performed and is reported here. The features are taken in the order of importance, from one feature to the maximum number of features, and their cumulative classification efficiencies using decision tree and random forest are calculated. The decision tree is a benchmark classification algorithm generally used to compare results, as demonstrated by Hall (2000). Random forest is an extension of decision trees and hence is considered for the evaluation of the FSS methodology. All the eleven statistical features are given as input for identifying the feature subset containing the minimum number of features without appreciable loss in classification efficiency. CFS subset evaluation and the decision tree give an orderly representation of the features showing their relative importance in classification. The learning algorithm was evaluated for classification efficiency using the most prominent single feature and the result was recorded. Additional features, in the order of importance, were added to the set one by one, and at every stage the cumulative classification efficiency of the selected features was recorded. The number of features in the subset can be decided based on two alternatives:
a. the feature subset that offers the maximum classification efficiency
b. the feature subset that achieves very close to the maximum classification efficiency with a minimal number of features and minimum computational complexity.
The second option is a good alternative where serious deviation in performance is avoided and the system will be more robust, since over fitting the model to all the available data is avoided due to the use of a minimum number of features. An added advantage is that the model has a reduced computational load, thereby saving on system resource requirements during implementation. Variations in classification efficiency with the number of features in a subset, using decision tree based FSS and CFS based FSS, are presented in Figures 6.3 a and 6.3 b respectively.
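A minimal sketch of this cumulative feature-addition study is given below (the feature table X, labels y and the ranking from the earlier sketch are assumptions, not the thesis code); the 10-fold stratified cross-validation mirrors the evaluation setup used throughout this chapter:

import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

def cumulative_accuracy(X, y, ranked_features, classifier):
    # Cross-validated accuracy (%) for the top-1, top-2, ... feature subsets.
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
    results = {}
    for k in range(1, len(ranked_features) + 1):
        subset = list(ranked_features[:k])
        scores = cross_val_score(classifier, X[subset], y, cv=cv)
        results[k] = 100.0 * scores.mean()
    return pd.Series(results, name="accuracy_%")

# Example usage with the two classifiers used for the FSS study (hypothetical X, y):
# ranked = rank_features(X, y).index   # decision tree ranking from the earlier sketch
# dt_curve = cumulative_accuracy(X, y, ranked, DecisionTreeClassifier(random_state=0))
# rf_curve = cumulative_accuracy(X, y, ranked, RandomForestClassifier(n_estimators=10, random_state=0))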

Figure 6.3 Classification efficiency of the statistical feature subsets using decision tree (classification accuracy % versus number of features considered, cumulative): a) FSS using decision tree (features added in the order SV, SD, SE, SKE, RAN, KUR, MIN, MED, MEN, ALL), b) FSS using CFS (features added in the order SD, SE, SKE, RAN, KUR, MIN, MAX)

Evaluating the effectiveness of FSS is done using the decision tree as the classifier in the first phase. Figure 6.3a depicts FSS done using the decision tree for both identification of the best feature subset and classification. Analysis of the graph shows that the use of the first five features (as ranked by the decision tree) gives the maximum classification efficiency of 89.1%, and increasing or decreasing the number of features in the subset becomes progressively counterproductive. The classifier is never able to achieve the peak performance of 89.1% when more or fewer than five features are used. A similar observation on Figure 6.3b, representing the FSS using CFS with features ranked by the decision tree, shows that the classifier performance peaks at 89.2% when the first four features identified by CFS and ranked by the

decision tree are used. Any decrease in the number of features drastically reduces the classifier performance, while an increase in features marginally reduces the classifier performance from 89.2% to 88.4%. The use of all the features yields only 88.5%. Random forest, a tree based ensemble, is used to check for consistency in the predicted results. Figure 6.4a represents the random forest evaluation of FSS using the decision tree algorithm. The figure shows that considering the first four or five features returns a maximum classification accuracy of 87.9%, and any increase in the number of features induces oscillations in the classification accuracy but never reaches the peak performance. Any reduction below four features is marked by a sharp decrease in classification accuracy from 87.9% to a flat 78%. The overall results tally well with the findings obtained from the FSS evaluation using decision tree presented in Figure 6.3a. A similar observation can be made on FSS done using CFS and evaluated using random forest. The results presented in Figure 6.4b clearly show the FSS with the first four features to have a maximum performance of 88.2%, oscillating between 86.5% and 87.8% when the subset is increased beyond four. Any reduction is equally unfavorable, with the classifier returning 87.1% for three features and a flat 78% for two and below. Comparing the peak classification accuracies obtained from both the classifiers, it is clearly evident that FSS using CFS is the better tool. The first four features, namely standard deviation, standard error, skewness and range, form the best possible feature subset for use in any classifier.

Figure 6.4 Classification efficiency of the statistical feature subsets using random forest (classification accuracy % versus number of features considered, cumulative): a) FSS using decision tree, b) FSS using CFS

6.4 MODEL BUILDING AND EVALUATION

The machine learning model development involves two phases, called training and testing. Training is the process where the classifier learns to classify the faults based on the supplied training samples. In event identification like misfire detection, the supervised form of training is followed, where the features along with the class they represent are provided to the learning algorithm. The important point to be considered here is that the model has to be evaluated with each and every data preprocessing technique and their possible combinations. The process of testing is to check how well the classifier has learnt to label unseen samples. The summarized testing results of all the classifiers used here are presented in the form of a square matrix called the confusion matrix. All the confusion matrices depicted are for results with CFS and Konenenko discretisation. The interpretation of the confusion matrix is as follows: referring to Table 6.5, attention is focused on the last row for explanation, since it has a higher spread of misclassifications into other conditions. The last row of the confusion matrix represents misfire in cylinder four. The first element in the last row, i.e. location (5,1), with value 0, represents the number of data points that belong to the misfire in cylinder four condition but have been misclassified as good. The second element in the last row, i.e. location (5,2), with value 11, depicts how many of the misfire in cylinder four condition data points have been misclassified as misfire in cylinder one. The third element represents the number of

data points that have been misclassified as misfire in cylinder two. The fourth element in the last row, i.e. location (5,4), with value 32, depicts how many of the misfire in cylinder four condition data points have been misclassified as misfire in cylinder three. The last element in the last row, i.e. location (5,5), represents the number of data points that have been correctly classified as misfire in cylinder four. Similarly, the second row represents the misfire in cylinder one condition. The second element in the second row represents the correctly classified instances for the misfire in cylinder one condition, and the rest of them are misclassifications as explained earlier. A similar interpretation can be given for the other elements as well. To summarize, the diagonal elements shown in the confusion matrix represent the correctly classified points and the non-diagonal elements are the misclassified ones. The evaluation results of the various classifiers with a diverse set of data pre-processing techniques are presented in the following sub sections. The process of detecting misfire is of greater importance than identifying exactly in which cylinder it has occurred. Hence all the classifiers were also evaluated in a two class mode, where misfire in any cylinder is considered as one class and no misfire as another class. However, the overall focus is to design a system which is capable of identifying misfire and accurately determining the cylinder in which it has occurred. This will enable faster fault detection and rectification for the automobile.

6.4.1 Decision tree algorithm

The decision tree algorithm is a versatile classifier and is also capable of identifying features for FSS. The classifier parameters and classification results of the decision tree algorithm, using various data preprocessing techniques and FSS options, are presented in Tables 6.1 to 6.5. The effectiveness of applying MDL correction to the rules generated by the decision tree is also evaluated under the above mentioned conditions. Comparing Tables 6.2 and 6.3, it is evident that large variations in classification accuracy are not induced by MDL, but the benefit of achieving the same classification accuracy with rules pruned to the minimum possible size is advantageous. MDL helps in formulating more generalized rules, thus making the model robust against misclassifications due to the

effects of minor variations in the engine vibration signature. The main advantage of MDL is that the rules are shortened to the minimum level possible and hence over fitting of the model to the data is avoided. The evaluation in two class mode is done using the MDL enabled decision tree algorithm and the results are presented in Table 6.4. The confusion matrix showing the misclassification details is presented in Table 6.5, which shows that misclassification between good and misfire is nil, since all entries in the first row and column except (1,1) are zero.

Table 6.1 Classifier parameters for decision tree
Model performance evaluation: 10-fold stratified cross-validation
Model building time: 0.5 s
Total number of instances: 1000
Correctly classified instances: 892
Incorrectly classified instances: 108
Mean absolute error: 0.06
Root mean squared error:
MDL correction: incorporated
Number of leaves: 98
Size of the tree: 107

Table 6.2 Decision tree results in multiclass mode without MDL (classification accuracy of statistical features with CFS based FSS and decision tree based FSS, under no preprocessing, discretisation using 10 bins and entropy based discretisation)

Table 6.3 Decision tree results in multiclass mode with MDL (same FSS and preprocessing options as Table 6.2)

Table 6.4 Decision tree results in two class mode with MDL (same FSS and preprocessing options as Table 6.2)

Table 6.5 Decision tree confusion matrix (rows and columns: Good, C1m, C2m, C3m, C4m)
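For readers reproducing these results, the sketch below (assumed data and label names, not the thesis code) shows how a confusion matrix such as Table 6.5 can be generated from cross-validated predictions and how its rows and columns are read:

from sklearn.model_selection import cross_val_predict, StratifiedKFold
from sklearn.metrics import confusion_matrix
from sklearn.tree import DecisionTreeClassifier

CLASSES = ["Good", "C1m", "C2m", "C3m", "C4m"]   # good condition + misfire in cylinders 1 to 4

def cv_confusion_matrix(X, y):
    # Confusion matrix from 10-fold cross-validated predictions:
    # rows are the true conditions, columns the predicted conditions.
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
    y_pred = cross_val_predict(DecisionTreeClassifier(random_state=0), X, y, cv=cv)
    cm = confusion_matrix(y, y_pred, labels=CLASSES)
    # Example reading: cm[4, 3] is the number of cylinder-four-misfire (C4m)
    # samples predicted as cylinder-three misfire (C3m); cm[4, 4] is the number
    # of C4m samples classified correctly.
    return cm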

From the values tabulated in Table 6.4, it is inferred that none of the data transforms could produce 100% misfire detection, hence their use in misfire detection with the decision tree is not favorable. It is observed that EWD and entropy based discretisation achieve 100% classification accuracy in two class mode. From the values tabulated in Table 6.3, it is observed that entropy based discretisation achieves the maximum multi-class classification accuracy, and from among these, the Konenenko based discretiser with CFS has the maximum classification accuracy of 89.2%. Additionally, the classification results do not change with or without MDL, indicating that the system is capable of learning a reduced rule set even without MDL implementation.

6.4.2 Random forest

The random forest algorithm uses multiple decision trees with a voting system to build the classification model. The classifier parameters and classification results using various data preprocessing techniques and FSS options are presented in Tables 6.6 to 6.9. The optimum number of trees to be used for model building is presented in Figure 6.5. The optimum number of trees for building the tree ensemble based classifier has to be evaluated to achieve the maximum possible results. The number of trees is varied from 1 to 15 and the results are recorded. The Konenenko discretised data set is used for the analysis. From the results presented in Figure 6.5, it is evident that a maximum classification accuracy of 89.6% is achieved when the number of trees used is 10.

Table 6.6 Classifier parameters for random forest
Model performance evaluation: 10-fold stratified cross-validation
Model building time: 1.1 s
Total number of instances: 1000
Correctly classified instances: 896
Incorrectly classified instances: 104
Mean absolute error:
Root mean squared error:
Number of trees used: 10
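The tree-count sweep described above can be sketched as follows (assumed data X, y; scikit-learn's RandomForestClassifier is used as a stand-in for the random forest implementation evaluated here):

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

def sweep_forest_size(X, y, max_trees=15):
    # Vary the number of trees from 1 to max_trees and record the 10-fold
    # cross-validated accuracy (%), then report the best-performing size.
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
    accuracies = {}
    for n_trees in range(1, max_trees + 1):
        rf = RandomForestClassifier(n_estimators=n_trees, random_state=0)
        accuracies[n_trees] = 100.0 * cross_val_score(rf, X, y, cv=cv).mean()
    best = max(accuracies, key=accuracies.get)   # 10 trees in the study reported above
    return accuracies, best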

Figure 6.5 Number of trees vs classification accuracy (classification accuracy %)

Table 6.7 Random forest results in multiclass mode (same FSS and preprocessing options as Table 6.2)

Table 6.8 Random forest results in two class mode (same FSS and preprocessing options as Table 6.2)

From the values tabulated in Table 6.8, it is inferred that none of the data transforms could produce 100% misfire detection, hence their use in misfire detection with random forest is not favorable. It is observed that only entropy based discretisation achieves 100% classification accuracy in two class mode. From the values tabulated in Table 6.7, it is observed that using all the features with discretisation achieves a maximum of 90.6%, followed by CFS with the Konenenko based discretiser with a maximum classification accuracy of 89.6%.

Table 6.9 Random forest confusion matrix (rows and columns: Good, C1m, C2m, C3m, C4m)

The use of FSS and a discretiser enhances the robustness of the model by avoiding over fitting of the data, which is mandatory for ensuring the future operability of the model under slightly varying signal conditions. Hence a model using CFS with the Konenenko based discretiser, performing at 89.6%, is preferred. The misclassification details are comparable to those obtained with the decision tree and are presented in Table 6.9.

6.4.3 Fuzzy classifiers (FURIA and FRRC)

The classifier parameters and classification results of the fuzzy classifiers are presented in Tables 6.10 to 6.17. The effectiveness of using various data preprocessing techniques is evaluated with two different fuzzy formulations, FURIA and FRRC. The misclassification details are presented in Table 6.13. The FURIA classification results presented in Tables 6.11 and 6.12 clearly show that the use of FURIA with any data model is not a suitable choice, since the classification accuracy in two class mode never reaches 100%, which is mandatory for the system to be of any practical use. Hence further analysis of the FURIA results is not done.

Table 6.10 Classifier parameters for FURIA
Model performance evaluation: 10-fold stratified cross-validation
Model building time: 16 s
Total number of instances: 1000
Correctly classified instances: 885
Incorrectly classified instances: 115
Mean absolute error:
Root mean squared error:
Number of rules generated: 18

Table 6.11 FURIA results in multiclass mode (same FSS and preprocessing options as Table 6.2)

Table 6.12 FURIA results in two class mode (same FSS and preprocessing options as Table 6.2)

Table 6.13 FURIA confusion matrix with CFS + transform (rows and columns: Good, C1m, C2m, C3m, C4m)

The evaluation of FRRC is presented in Tables 6.14 to 6.17. The misclassification details are presented in Table 6.17. As observed earlier, any higher performance obtained with all the features is consciously avoided to negate over-fitting of the model to the data.

Table 6.14 Classifier parameters for FRRC
Model performance evaluation: 10-fold stratified cross-validation
Model building time: 7.5 s
Total number of instances: 1000
Correctly classified instances: 890
Incorrectly classified instances: 110
Mean absolute error:
Root mean squared error:
Number of rules: 83

Table 6.15 FRRC results in multi class mode (same FSS and preprocessing options as Table 6.2)

Table 6.16 FRRC results in two class mode (same FSS and preprocessing options as Table 6.2)

Table 6.17 FRRC confusion matrix (rows and columns: Good, C1m, C2m, C3m, C4m)

From the values tabulated in Tables 6.12 and 6.16, it is inferred that none of the data transforms could produce 100% misfire detection in two class mode, hence their use in misfire detection with either of the fuzzy systems is not favorable. The model performance with FRRC presented in Table 6.15 is similar to that of FURIA in multi-class mode, reaching a maximum of 89%, but in two class mode it achieves the mandatory 100% when CFS with entropy based discretisation is used for data preprocessing.

6.4.4 Naïve Bayes classifier

The Naïve Bayes classifier makes use of conditional probability for its classification. Its classification performance with statistical features is presented in Tables 6.18 to 6.21. Table 6.18 shows the test parameters and classification efficiency, while the details of misclassifications are presented in Table 6.21 as a confusion matrix.
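A minimal sketch of a comparable Naïve Bayes evaluation is given below (assumed data X, y; not the thesis code). Since the Konenenko and Fayyad-Irani discretisers are not available in scikit-learn, an equal-width KBinsDiscretizer with one-hot output feeding a multinomial Naïve Bayes is used as a stand-in for the discretised case:

from sklearn.naive_bayes import GaussianNB, MultinomialNB
from sklearn.preprocessing import KBinsDiscretizer
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import StratifiedKFold, cross_val_score

def naive_bayes_accuracies(X, y):
    # 10-fold stratified cross-validated accuracy (%) with and without discretisation.
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
    raw = GaussianNB()
    binned = make_pipeline(
        KBinsDiscretizer(n_bins=10, encode="onehot-dense", strategy="uniform"),
        MultinomialNB(),
    )
    return {
        "no preprocessing": 100.0 * cross_val_score(raw, X, y, cv=cv).mean(),
        "discretised (10 equal-width bins)": 100.0 * cross_val_score(binned, X, y, cv=cv).mean(),
    }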

Table 6.18 Classifier parameters for Naïve Bayes classifier
Model performance evaluation: 10-fold stratified cross-validation
Model building time: 0.2 s
Total number of instances: 1000
Correctly classified instances: 846
Incorrectly classified instances: 154
Mean absolute error:
Root mean squared error:

Table 6.19 Naïve Bayes results in multi class mode (same FSS and preprocessing options as Table 6.2)

Table 6.20 Naïve Bayes results in two class mode (same FSS and preprocessing options as Table 6.2)

Table 6.21 Naïve Bayes confusion matrix (rows and columns: Good, C1m, C2m, C3m, C4m)

The Naïve Bayes is a simple yet efficient classifier, achieving 100% classification efficiency in two class mode with entropy based discretisation, as observed from Table 6.20. From the values tabulated in Table 6.20, it is inferred that none of the data transforms could produce 100% misfire detection in two class mode, hence their use in misfire detection with Naïve Bayes is not favorable. In multi-class mode, both the entropy based discretisation methods achieve the same classification accuracy of 84.6% with FSS, as noted from Table 6.19.

6.4.5 Bayes net

Bayes net, or the Bayesian belief network classifier, is a slightly improved model as compared to the Naïve Bayes model. Its classification performance is presented in Tables 6.22 to 6.25. Table 6.22 shows the test parameters and classification efficiency, while the details of misclassifications are presented in Table 6.25 as a confusion matrix.

Table 6.22 Classifier parameters for Bayes net classifier
Model performance evaluation: 10-fold stratified cross-validation
Model building time: 0.2 s
Total number of instances: 1000
Correctly classified instances: 846
Incorrectly classified instances: 154
Mean absolute error:
Root mean squared error:
Estimator: Simple Bayes
Optimisation algorithm used: Hill climber

Table 6.23 Bayes net results in multi class mode (same FSS and preprocessing options as Table 6.2)

From the values tabulated in Table 6.24, it is inferred that none of the data transforms except the random transform with FSS could produce 100% misfire detection in two class mode, hence their use in misfire detection with Bayes net is not favorable. The Bayes net classifier achieves a classification efficiency of 100% in two class mode with both the entropy based discretisation methods, as seen from Table 6.24. In multi-class mode, the performance is 84.6% with FSS, as noted from Table 6.23. It is observed that both Naïve Bayes and Bayes net accomplish the same classification results in both two class and multi class mode.

Table 6.24 Bayes net results in two class mode (same FSS and preprocessing options as Table 6.2)

Table 6.25 Bayes net confusion matrix (rows and columns: Good, C1m, C2m, C3m, C4m)

6.4.6 Support Vector Machines

The SVM is a complex and computationally taxing algorithm which is capable of returning considerably higher classification accuracies. There are various kernels that can be used to build the classifier, as mentioned earlier. The evaluation results are summarized in Tables 6.26 to 6.31. The SVM classifier parameters for c-SVM with the transform, using a linear kernel, are presented in Table 6.26, followed by the confusion matrix for the same in Table 6.29.

Table 6.26 Classifier parameters for c-SVM (linear kernel) with the transform
Model performance evaluation: 10-fold stratified cross-validation
Model building time: 1.1 s
Total number of instances: 1000
Correctly classified instances: 865
Incorrectly classified instances: 135
Mean absolute error:
Root mean squared error:
Kernel used: Linear
SVM type: c-SVM

Analysing the results presented in Tables 6.27 and 6.28, it is observed that the overall performance of SVM is very good in many instances in two class mode, but the major setback is the processing time required, which is considerably high. Making an initial choice based on two class performance and time, only the transform remains as the

option, since PCA is comparable in time but fails to achieve 100% in two class mode.

Table 6.27 SVM results in multi class mode (c-SVM, linear kernel) (classification accuracy and time taken in seconds for the same FSS and preprocessing options as Table 6.2, with an additional entry for the RBF kernel)

Table 6.28 SVM results in two class mode (c-SVM, linear kernel) (same FSS and preprocessing options as Table 6.2)

The evaluation of various kernels under the c-SVM and nu-SVM formulations is carried out using the transform, and the results are presented in Tables 6.30 and 6.31.

Table 6.29 c-SVM confusion matrix with CFS and the transform (rows and columns: Good, C1m, C2m, C3m, C4m)

Table 6.30 c-SVM with CFS and the transform (multi class mode and two class mode accuracy for the RBF, polynomial, sigmoid and linear kernels)

Table 6.31 nu-SVM with CFS and the transform (multi class mode and two class mode accuracy for the RBF, polynomial, sigmoid and linear kernels)

The results of c-SVM with CFS and the transform presented in Table 6.30 imply that the linear kernel performs with the highest multi-class and two class classification accuracies of 86.5% and 100% respectively, followed by the RBF kernel with 82.5%. The results in Table 6.31 clearly indicate that nu-SVM cannot be a choice, since it is not able to reach 100% classification accuracy in two class mode with any of the combinations considered.
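The kernel comparison of Tables 6.30 and 6.31 can be sketched with scikit-learn's SVC (C-SVM) and NuSVC (nu-SVM) as follows (assumed data X and multi class labels y_multi; the feature scaling step is an added assumption, not something stated in this chapter):

import numpy as np
from sklearn.svm import SVC, NuSVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import StratifiedKFold, cross_val_score

KERNELS = ["rbf", "poly", "sigmoid", "linear"]

def kernel_comparison(X, y_multi):
    # Accuracy (%) per SVM formulation and kernel, in multi class mode and in
    # two class mode (misfire in any cylinder versus good).
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
    y_two = np.where(np.asarray(y_multi) == "Good", "no_misfire", "misfire")
    results = {}
    for name, svm_cls in [("c-SVM", SVC), ("nu-SVM", NuSVC)]:
        for kernel in KERNELS:
            model = make_pipeline(StandardScaler(), svm_cls(kernel=kernel))
            results[(name, kernel, "multi class")] = 100.0 * cross_val_score(model, X, y_multi, cv=cv).mean()
            results[(name, kernel, "two class")] = 100.0 * cross_val_score(model, X, y_two, cv=cv).mean()
    return results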

6.5 SUMMARY

The detailed analyses of statistical features have led to the formulation of the following conclusions:

Feature subset selection (FSS): The effects of FSS using CFS and the decision tree were analysed, and the resulting feature reduction was implemented for further analysis. FSS using CFS proved to be marginally better than decision tree based FSS.

Effect of feature transforms: The effect of feature transforms such as PCA and the random transform on classification accuracy was analysed by evaluating the classification accuracy using all the algorithms considered. The results show that the use of features with data transforms lagged in performance with almost all algorithms except SVM.

Effect of discretisation: The effects of EWD and EFD were not very prominent compared to entropy based discretisation, which gave good results with almost all classifiers except SVM, where it increased the processing time required for arriving at a decision. Among the entropy based discretisers, the Konenenko based discretiser is found to perform better than the Fayyad and Irani model.

Analysis of the best feature-classifier combination: Diverse families of classifiers were evaluated, and at each stage the best classifier-feature combination was chosen based on the classification accuracy and the time taken for building the model. All the selected combinations were evaluated and the best options for building the model are analysed and presented below.

Table 6.32 Summary of classification efficiencies using statistical features (multi class mode accuracy, two class mode accuracy and time taken in seconds for the decision tree, random forest, fuzzy (FRRC), Naïve Bayes, Bayes net and c-SVM with linear kernel classifiers)

A compilation of the multi-class and two class classification results, along with the computation time taken using statistical features, is presented in Table 6.32. Analysing the results, it is evident that all the classifiers achieve 100% classification accuracy in two class mode, but only random forest and decision tree share the highest and the next highest multi class accuracies of 89.6% and 89.2% respectively. The choice between random forest and decision tree can be decided based on a compromise between classification accuracy and the time taken for classification. Here the decision tree model (DT-CFS-KD) requires only 0.1 seconds, whereas the random forest takes 0.3 seconds for arriving at the same decisions but performs better. Since the time saving under consideration is not very large, the random forest with CFS feature selection followed by Konenenko discretisation of the data is the model of choice (RF-CFS-KD).
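As a closing illustration, the preferred RF-CFS-KD configuration can be sketched as follows (assumed data: a pandas DataFrame X of the statistical features and multi class labels y_multi; KBinsDiscretizer is only a stand-in for the Konenenko MDL discretiser, which scikit-learn does not provide):

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import KBinsDiscretizer
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import StratifiedKFold, cross_val_score

SELECTED = ["standard_deviation", "standard_error", "skewness", "range"]   # CFS-selected subset

def build_and_score(X, y_multi):
    # Discretise the four selected features and classify with a 10-tree forest,
    # scoring both the multi class and the two class (misfire vs good) tasks.
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
    model = make_pipeline(
        KBinsDiscretizer(n_bins=10, encode="ordinal", strategy="quantile"),
        RandomForestClassifier(n_estimators=10, random_state=0),
    )
    multi = 100.0 * cross_val_score(model, X[SELECTED], y_multi, cv=cv).mean()
    y_two = np.where(np.asarray(y_multi) == "Good", "no_misfire", "misfire")
    two = 100.0 * cross_val_score(model, X[SELECTED], y_two, cv=cv).mean()
    return {"multi class %": multi, "two class %": two}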


More information

Assignment 1: CS Machine Learning

Assignment 1: CS Machine Learning Assignment 1: CS7641 - Machine Learning Saad Khan September 18, 2015 1 Introduction I intend to apply supervised learning algorithms to classify the quality of wine samples as being of high or low quality

More information

Information Fusion Dr. B. K. Panigrahi

Information Fusion Dr. B. K. Panigrahi Information Fusion By Dr. B. K. Panigrahi Asst. Professor Department of Electrical Engineering IIT Delhi, New Delhi-110016 01/12/2007 1 Introduction Classification OUTLINE K-fold cross Validation Feature

More information

Contents. Foreword to Second Edition. Acknowledgments About the Authors

Contents. Foreword to Second Edition. Acknowledgments About the Authors Contents Foreword xix Foreword to Second Edition xxi Preface xxiii Acknowledgments About the Authors xxxi xxxv Chapter 1 Introduction 1 1.1 Why Data Mining? 1 1.1.1 Moving toward the Information Age 1

More information

Chapter 4. The Classification of Species and Colors of Finished Wooden Parts Using RBFNs

Chapter 4. The Classification of Species and Colors of Finished Wooden Parts Using RBFNs Chapter 4. The Classification of Species and Colors of Finished Wooden Parts Using RBFNs 4.1 Introduction In Chapter 1, an introduction was given to the species and color classification problem of kitchen

More information

Data Mining: Data. Lecture Notes for Chapter 2. Introduction to Data Mining

Data Mining: Data. Lecture Notes for Chapter 2. Introduction to Data Mining Data Mining: Data Lecture Notes for Chapter 2 Introduction to Data Mining by Tan, Steinbach, Kumar Data Preprocessing Aggregation Sampling Dimensionality Reduction Feature subset selection Feature creation

More information

Study on Classifiers using Genetic Algorithm and Class based Rules Generation

Study on Classifiers using Genetic Algorithm and Class based Rules Generation 2012 International Conference on Software and Computer Applications (ICSCA 2012) IPCSIT vol. 41 (2012) (2012) IACSIT Press, Singapore Study on Classifiers using Genetic Algorithm and Class based Rules

More information

7. Decision or classification trees

7. Decision or classification trees 7. Decision or classification trees Next we are going to consider a rather different approach from those presented so far to machine learning that use one of the most common and important data structure,

More information

Feature Selection for Image Retrieval and Object Recognition

Feature Selection for Image Retrieval and Object Recognition Feature Selection for Image Retrieval and Object Recognition Nuno Vasconcelos et al. Statistical Visual Computing Lab ECE, UCSD Presented by Dashan Gao Scalable Discriminant Feature Selection for Image

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 12 Combining

More information

University of Florida CISE department Gator Engineering. Data Preprocessing. Dr. Sanjay Ranka

University of Florida CISE department Gator Engineering. Data Preprocessing. Dr. Sanjay Ranka Data Preprocessing Dr. Sanjay Ranka Professor Computer and Information Science and Engineering University of Florida, Gainesville ranka@cise.ufl.edu Data Preprocessing What preprocessing step can or should

More information

Topics In Feature Selection

Topics In Feature Selection Topics In Feature Selection CSI 5388 Theme Presentation Joe Burpee 2005/2/16 Feature Selection (FS) aka Attribute Selection Witten and Frank book Section 7.1 Liu site http://athena.csee.umbc.edu/idm02/

More information

Feature Selection. CE-725: Statistical Pattern Recognition Sharif University of Technology Spring Soleymani

Feature Selection. CE-725: Statistical Pattern Recognition Sharif University of Technology Spring Soleymani Feature Selection CE-725: Statistical Pattern Recognition Sharif University of Technology Spring 2013 Soleymani Outline Dimensionality reduction Feature selection vs. feature extraction Filter univariate

More information

An Unsupervised Approach for Combining Scores of Outlier Detection Techniques, Based on Similarity Measures

An Unsupervised Approach for Combining Scores of Outlier Detection Techniques, Based on Similarity Measures An Unsupervised Approach for Combining Scores of Outlier Detection Techniques, Based on Similarity Measures José Ramón Pasillas-Díaz, Sylvie Ratté Presenter: Christoforos Leventis 1 Basic concepts Outlier

More information

Response to API 1163 and Its Impact on Pipeline Integrity Management

Response to API 1163 and Its Impact on Pipeline Integrity Management ECNDT 2 - Tu.2.7.1 Response to API 3 and Its Impact on Pipeline Integrity Management Munendra S TOMAR, Martin FINGERHUT; RTD Quality Services, USA Abstract. Knowing the accuracy and reliability of ILI

More information

Mapping of Hierarchical Activation in the Visual Cortex Suman Chakravartula, Denise Jones, Guillaume Leseur CS229 Final Project Report. Autumn 2008.

Mapping of Hierarchical Activation in the Visual Cortex Suman Chakravartula, Denise Jones, Guillaume Leseur CS229 Final Project Report. Autumn 2008. Mapping of Hierarchical Activation in the Visual Cortex Suman Chakravartula, Denise Jones, Guillaume Leseur CS229 Final Project Report. Autumn 2008. Introduction There is much that is unknown regarding

More information

4. Feedforward neural networks. 4.1 Feedforward neural network structure

4. Feedforward neural networks. 4.1 Feedforward neural network structure 4. Feedforward neural networks 4.1 Feedforward neural network structure Feedforward neural network is one of the most common network architectures. Its structure and some basic preprocessing issues required

More information

10/14/2017. Dejan Sarka. Anomaly Detection. Sponsors

10/14/2017. Dejan Sarka. Anomaly Detection. Sponsors Dejan Sarka Anomaly Detection Sponsors About me SQL Server MVP (17 years) and MCT (20 years) 25 years working with SQL Server Authoring 16 th book Authoring many courses, articles Agenda Introduction Simple

More information