Feature Extraction and Selection for Automatic Fault Diagnosis of Rotating Machinery


Francisco de A. Boldt 1,2, Thomas W. Rauber 1, Flávio M. Varejão 1

1 Departamento de Informática, Universidade Federal do Espírito Santo (UFES), Av. Fernando Ferrari, 514, Goiabeiras, Vitória, ES, Brazil
2 Instituto Federal de Educação, Ciência e Tecnologia do Espírito Santo (IFES), Rodovia ES, Km 6,5, Manguinhos, Serra, ES, Brazil

{fboldt,thomas,fvarejao}@inf.ufes.br

Abstract. In this work we present three feature extraction models applied to vibration data from rotating machinery for bearing fault diagnosis. Vibration signals acquired by accelerometers are submitted to different feature extraction modules. Our tests suggest that pooling heterogeneous feature sets achieves better results than using a single extraction model. In addition, different classifiers are compared for performance optimization, the K-Nearest-Neighbor and the Support Vector Machine.

1. Introduction

Automatic fault diagnosis of complex machinery has economic and safety-related advantages. Identifying a fault in its initial stage allows the early replacement of damaged parts. This type of predictive maintenance is preferable to its preventive counterpart, which replaces parts that are not necessarily defective. Pattern recognition techniques have been widely used in automatic fault diagnosis of rotating machinery [Wandekokem et al. 2011, Xia et al. 2012, Liu 2012, Wu et al. 2012]. We use the supervised learning paradigm to diagnose bearing failures. The study of bearings is motivated by the fact that they play an important role in a wide range of rotating machines, and a sophisticated fault diagnosis system can be built with the help of computational intelligence techniques. Experimental results are shown for the Case Western Reserve University (CWRU) Bearing Data [CWRU 2013]. Model-free diagnosis needs to extract relevant data from the problem domain in order to train the diagnosis system algorithms.
We use three basic feature extraction models, generating one dataset for each model and assembling a global feature pool. This approach is motivated by the high plausibility that several feature sets contain more discriminative information than a single feature set. A subsequent, necessary step to filter out the most discriminative information is feature selection, which in general increases accuracy and reduces the computational cost of the fault classifier. We use a greedy heuristic, Sequential Forward Selection, as the search algorithm. As the classifier paradigm, we compare the K-Nearest Neighbor algorithm (K-NN) [Cover and Hart 1967], as a representative of a less sophisticated method, to the Support Vector Machine (SVM) [Burges 1998]. Leave-one-out and 10-fold cross-validation are used for performance estimation. The main contribution of this work is the variety of diagnosed faults and feature extraction models applied, together with the appropriate use of machine learning and feature selection techniques for the considered CWRU bearing data. In the reviewed literature, frequently only a few classes are considered and no cross-validation is employed in the tests, splitting the data

only once into a training and a test set, e.g. [Xia et al. 2012, Wu et al. 2012]. No classifier at all is used in [Wu et al. 2012, Luo et al. 2013]; only the visual inspection of peaks in frequency graphs suggests the discriminative power of the method. Our system is capable of detecting 21 classes of bearing conditions, employing two classifier models and two cross-validation techniques. To the best of our knowledge, no paper uses such a variety of feature extraction models for the CWRU bearing data, nor has any used multivariate feature selection techniques. The rest of this paper is organized as follows: In section 2 we present the feature extraction models used to describe the bearing faults. Section 3 presents the machine learning techniques used. Experimental results are shown in section 4 and section 5 presents our conclusions and future work.

2. Feature Extraction

Vibration signals, collected by accelerometers, are widely used in automatic rotating machine failure diagnosis [Wandekokem et al. 2011, Xia et al. 2012, Liu 2012, Wu et al. 2012]. The signals collected from the machinery are not directly usable for diagnosis, so it is necessary to extract static features. We use three basic models of feature extraction techniques. Statistical models are applied in the time and frequency domains, while wavelet package analysis represents an extraction in the time-frequency domain [Xia et al. 2012]. Complex envelope analysis completes the methods used in the frequency domain.

2.1. Statistical Model

We used ten statistical features in the time domain and three in the frequency domain. As a representative set we chose the features proposed in [Xia et al. 2012], c.f. Table 1 and Table 2.
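As an illustration, the time domain statistics of Table 1 can be computed directly from a signal segment. The sketch below implements a representative subset in plain Python; the function name and the returned keys are hypothetical, chosen here for readability.

```python
import math

def time_domain_features(x):
    """Representative subset of the Table 1 statistics for one signal segment x.
    Illustrative helper; the paper's full set has ten time domain features."""
    n = len(x)
    mean = sum(v for v in x) / n
    sigma = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    rms = math.sqrt(sum(v * v for v in x) / n)          # root mean square (RMS)
    sra = (sum(math.sqrt(abs(v)) for v in x) / n) ** 2  # square root of the amplitude (SRA)
    kv = sum(((v - mean) / sigma) ** 4 for v in x) / n  # kurtosis value (KV)
    sv = sum(((v - mean) / sigma) ** 3 for v in x) / n  # skewness value (SV)
    ppv = max(x) - min(x)                               # peak-peak value (PPV)
    cf = max(abs(v) for v in x) / rms                   # crest factor (CF)
    return {"rms": rms, "sra": sra, "kv": kv, "sv": sv, "ppv": ppv, "cf": cf}
```

In the experiments each vibration record is split into segments first, and one such feature vector is computed per segment.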
Table 1 presents the definitions of the statistical features in the time domain: root mean square (RMS), square root of the amplitude (SRA), kurtosis value (KV), skewness value (SV), peak-peak value (PPV), crest factor (CF), impulse factor (IF), margin factor (MF), shape factor (SF) and kurtosis factor (KF). Table 2 presents the definitions of the statistical features in the frequency domain: frequency center (FC), RMS frequency (RMSF) and root variance frequency (RVF).

2.2. Wavelet Package Analysis

The wavelet package analysis is a time-frequency domain method which permits the level-by-level decomposition of a signal using a wavelet function. The decomposition results in 2^l signals, where l is the number of desired levels. We follow the procedure proposed in [Xia et al. 2012], which uses Daubechies 4 as the mother wavelet and refines down to the fourth decomposition level. The energy calculated in each leaf node is used as a final feature.

2.3. Complex Envelope Spectrum

The complex envelope spectrum allows the calculation of the energy in the frequency bands where the faults manifest themselves. There are four characteristic frequencies at which faults can occur. Knowing the shaft rotational frequency, they are the fundamental cage frequency, the ball pass inner raceway frequency, the ball pass outer raceway frequency and the ball spin frequency [McInerny and Dai 2003]. For the complex envelope analysis, first a high-pass filter was applied, in order to eliminate the influence of the low frequency vibrations caused by noise, unbalance and misalignment. Subsequently, an analytical signal was calculated by applying the Hilbert transform to the original signal and adding it in quadrature to it. The magnitude

of the Fourier transform of the analytical signal translates the characteristic bearing fault frequencies to the low frequency band. The final features are the narrow band energy around the expected fault frequencies and their harmonics.

Table 1. Time domain statistical feature set of the vibration signal

$X_{rms} = \left(\frac{1}{N}\sum_{i=1}^{N} x_i^2\right)^{1/2}$
$X_{sra} = \left(\frac{1}{N}\sum_{i=1}^{N} \sqrt{|x_i|}\right)^2$
$X_{kv} = \frac{1}{N}\sum_{i=1}^{N} \left(\frac{x_i - \bar{x}}{\sigma}\right)^4$
$X_{sv} = \frac{1}{N}\sum_{i=1}^{N} \left(\frac{x_i - \bar{x}}{\sigma}\right)^3$
$X_{ppv} = \max(x_i) - \min(x_i)$
$X_{cf} = \frac{\max |x_i|}{\left(\frac{1}{N}\sum_{i=1}^{N} x_i^2\right)^{1/2}}$
$X_{if} = \frac{\max |x_i|}{\frac{1}{N}\sum_{i=1}^{N} |x_i|}$
$X_{mf} = \frac{\max |x_i|}{\left(\frac{1}{N}\sum_{i=1}^{N} \sqrt{|x_i|}\right)^2}$
$X_{sf} = \frac{\left(\frac{1}{N}\sum_{i=1}^{N} x_i^2\right)^{1/2}}{\frac{1}{N}\sum_{i=1}^{N} |x_i|}$
$X_{kf} = \frac{X_{kv}}{\left(\frac{1}{N}\sum_{i=1}^{N} x_i^2\right)^2}$

Table 2. Frequency domain statistical feature set of the vibration signal

$X_{fc} = \frac{1}{N}\sum_{i=1}^{N} f_i$
$X_{rmsf} = \left(\frac{1}{N}\sum_{i=1}^{N} f_i^2\right)^{1/2}$
$X_{rvf} = \left(\frac{1}{N}\sum_{i=1}^{N} (f_i - X_{fc})^2\right)^{1/2}$

3. Machine Learning Methods

The supervised learning paradigm [Bishop et al. 2006] can be used in automatic fault diagnosis. For classification, this approach needs a dataset with labeled patterns to train a classifier. The classifier performance can be estimated using labeled patterns not used during training. We present two well-known classifier algorithms, two performance estimation methods, and one feature selection search algorithm with two different selection criteria. We consider the chosen set of pattern recognition techniques an appropriate toolbox to approach optimality with respect to the compactness and accuracy of the proposed diagnosis system. The K-Nearest Neighbor algorithm (K-NN) [Cover and Hart 1967] classifies a new pattern according to the majority vote of its closest neighbors, usually using the Euclidean distance. The benefit of this architecture is its simplicity and its theoretical properties with

respect to the error bound. The Support Vector Machine (SVM) [Burges 1998] training algorithm creates a maximum-margin separation hyperplane between two classes. In order to enhance the linear separability beyond the original Euclidean space, the SVM maps the input vectors into a high-dimensional feature space through some nonlinear mapping [Vapnik 1999], using a kernel function. To classify more than two classes, one can use a one-against-all approach. We use the C-SVM classification architecture with the Radial Basis Function (RBF) kernel.

K-fold cross-validation performance estimation splits the data D into k approximately equal parts D_1, ..., D_k, and learns with the reduced data set D \ D_i, 1 <= i <= k, with one part left out. The part D_i left out is used as the test set [Bouckaert 2004]. A special case of cross-validation is leave-one-out, where the number of folds equals the number of samples. We use the estimated accuracy as the performance criterion. The global accuracy of cross-validation is estimated as the mean over the k folds, $ACC_{global} = \frac{1}{k}\sum_{i=1}^{k} ACC_i$.

In this study we use a large number of features, as shown in section 2. Feature selection generally improves prediction performance and simultaneously reduces the problem dimensionality, providing faster and more cost-effective predictors and allowing a better understanding of the underlying processes that generate the data [Guyon and Elisseeff 2003]. We use Sequential Forward Selection (SFS), which is a good compromise between exploring the search space sufficiently and computational cost. In order to select k from a total of Q features, SFS initializes with an empty feature set Y. Features are iteratively added to Y, according to some selection criterion. The algorithm stops and returns Y when |Y| = k. We employ two selection criteria. Interclass distance expresses the separation of classes according to some distance measure, mainly the Euclidean distance.
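A minimal sketch of SFS with the interclass distance criterion, in plain Python, is shown below. The helper names are hypothetical, and the ICD measure is assumed here to be the mean pairwise Euclidean distance between class centroids, since the text only specifies a distance-based measure.

```python
import math

def interclass_distance(X, y, subset):
    """ICD criterion (assumed form): mean pairwise Euclidean distance between
    class centroids, computed on the candidate feature subset only."""
    classes = sorted(set(y))
    centroids = []
    for c in classes:
        rows = [[x[j] for j in subset] for x, label in zip(X, y) if label == c]
        centroids.append([sum(col) / len(rows) for col in zip(*rows)])
    dists = [math.dist(a, b)
             for i, a in enumerate(centroids) for b in centroids[i + 1:]]
    return sum(dists) / len(dists)

def sfs(X, y, k, criterion):
    """Greedy Sequential Forward Selection: grow the set Y one feature at a
    time, always adding the feature that maximizes the selection criterion."""
    selected, remaining = [], list(range(len(X[0])))
    while len(selected) < k:
        best = max(remaining, key=lambda j: criterion(X, y, selected + [j]))
        selected.append(best)
        remaining.remove(best)
    return selected
```

The EMEP criterion fits the same interface: `criterion` would instead run a full cross-validated error estimation on the candidate subset, which is why it is far more expensive than ICD.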
SFS with the estimated mean error probability runs a complete performance estimation for each available feature $f_j \in X$, $f_j \notin Y$, with the candidate set $\{f_j\} \cup Y$, $j \in \{1, ..., Q\}$, where X is the complete set of all features. The feature that most increases, or least decreases, the performance criterion is joined to Y. Ties are resolved arbitrarily.

4. Experimental Results

We used as a benchmark for our methods the bearing dataset provided by the Bearing Data Center of Case Western Reserve University [CWRU 2013]. This publicly available benchmark allows an objective comparison of the proposed method to other research work. This dataset is composed of vibration signals of normal and faulty bearings extracted from a 2 hp Reliance Electric motor. The faults were introduced at specific positions of the bearings, using electrical-discharge machining, with fault diameters of 0.007, 0.014, 0.021 and 0.028 inches. A dynamometer induced loads of 0, 1, 2 and 3 hp, changing the shaft rotation from 1797 to 1720 rpm. One model of bearing was used on the drive end and another on the fan end. Three accelerometers collected the vibration data, placed on the drive end, the fan end and the base of the motor. Not all data files contain the base plate data, so we did not use this sensor in our experiments. As done in other work [Xia et al. 2012, Liu 2012, Wu et al. 2012], we split the signals into several parts before the feature extraction, aiming at a better classification performance estimation. We split the signals into 15 parts, resulting in a total of 2415 samples. Preliminary

experiments showed that this was the maximum possible division without considerable loss of accuracy.

4.1. Identified Classes

The identified classes can be labeled according to the bearing, the bearing state (normal or defective), the fault severity (depth) and the motor load. Another main contribution of our work is the large number of machine condition classes: normal, plus three faults (ball, inner race and outer race) times three severities (0.007, 0.014, 0.021 in) times two bearing models, plus two faults (ball, inner race) times one severity (0.028 in) times one bearing model (drive end), resulting in 1 + 18 + 2 = 21 classes. Table 3 presents the distribution and description of the classes used in our experiments. We are able to identify not only the fault class, but within the same class also the severity of the fault. [Xia et al. 2012, Liu 2012, Wu et al. 2012] identify only a small number of classes, from one sensor position (drive end). [Xia et al. 2012] uses four classes, normal, ball, inner race and outer race, or fixes a fault class and then distinguishes among its severities.

Table 3. Class distribution and description

Class  Name          Samples  Distribution  Description
1      Ball DE       -        -             Ball fault in the drive end bearing.
2      Ball FE       -        -             Ball fault in the fan end bearing.
3      Ball DE       -        -             Ball fault in the drive end bearing.
4      Ball FE       -        -             Ball fault in the fan end bearing.
5      Ball DE       -        -             Ball fault in the drive end bearing.
6      Ball FE       -        -             Ball fault in the fan end bearing.
7      Ball DE       -        -             Ball fault in the drive end bearing.
8      InnerRace DE  -        -             Inner race fault in the drive end bearing.
9      InnerRace FE  -        -             Inner race fault in the fan end bearing.
10     InnerRace DE  -        -             Inner race fault in the drive end bearing.
11     InnerRace FE  -        -             Inner race fault in the fan end bearing.
12     InnerRace DE  -        -             Inner race fault in the drive end bearing.
13     InnerRace FE  -        -             Inner race fault in the fan end bearing.
14     InnerRace DE  -        -             Inner race fault in the drive end bearing.
15     Normal        -        -             Normal bearing.
16     OuterRace DE  -        -             Outer race fault in the drive end bearing.
17     OuterRace FE  -        -             Outer race fault in the fan end bearing.
18     OuterRace FE  -        -             Outer race fault in the fan end bearing.
19     OuterRace DE  -        -             Outer race fault in the drive end bearing.
20     OuterRace DE  -        -             Outer race fault in the drive end bearing.
21     OuterRace FE  -        -             Outer race fault in the fan end bearing.

4.2. Feature Extraction Models

We used three basic feature extraction models: the complex envelope spectrum, statistical features extracted from the time and frequency domains, and wavelet package analysis in the time-frequency domain. Table 4 shows the number of features extracted by each model. We employed the K-NN classifier, with 1, 3, 5 and 7 as the values of K, to compare the diagnostic quality of the feature extraction models. We used the leave-one-out performance estimator with accuracy as the quality criterion. Table 5 compares the accuracy of the different feature extraction techniques.
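The leave-one-out protocol with K-NN used for Table 5 can be sketched as follows. This is a didactic pure-Python version with hypothetical function names; the actual experiments involve 2415 samples and 130 features, where an optimized implementation would be preferable.

```python
import math

def knn_predict(train_X, train_y, x, k=1):
    """Majority vote among the k nearest training patterns (Euclidean distance)."""
    neighbors = sorted(zip(train_X, train_y), key=lambda p: math.dist(p[0], x))[:k]
    votes = {}
    for _, label in neighbors:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)

def leave_one_out_accuracy(X, y, k=1):
    """Each sample is classified by a model trained on all remaining samples."""
    hits = 0
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        hits += knn_predict(rest_X, rest_y, X[i], k) == y[i]
    return hits / len(X)
```

Since K-NN has no training phase beyond storing the data, leave-one-out is affordable here; for the SVM experiments below, the cheaper 10-fold protocol is used instead.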

Table 4. Number of features of the feature extraction models used

Feature Extraction Model   Number of Features
Complex Envelope Spectrum  72
Statistical Features       26
Wavelet Package Analysis   32
All together               130

Table 5. Estimated K-NN accuracy for each feature extraction model

Extraction Model     K=1      K=3      K=5      K=7
Envelope             97.76%   97.64%   97.43%   97.35%
Statistical          96.44%   96.36%   95.94%   95.69%
Wavelet              99.83%   99.88%   99.79%   99.63%
Global Feature Pool  99.96%   99.92%   99.92%   99.92%

4.3. Feature Selection

Sequential Forward Selection (SFS) was used with two selection criteria: the estimated mean error probability (EMEP) and the interclass distance (ICD). The 1-NN algorithm generally showed higher accuracy than other values of K. We increased the number of SFS-selected features in steps of three, to reduce the computational complexity, until reaching the final number of 90 features. The maximum number of 90 selected features provided a sufficiently expressive evolution of the selection criterion. Fig. 1 shows the relation between the estimated accuracy and the number of selected features, up to 30 features, for each of the criteria used. From feature number 27 to 90 we estimated a 0% error, suggesting the benefit of feature selection. The experiment illustrates that, for fewer than nine features, the EMEP selection criterion performs better than ICD, but for nine or more features the performance is equal or very similar. A collateral conclusion is that the CWRU dataset is relatively easy to classify, which in a certain way contradicts the low scores reported in [Wang et al. 2012], where an auto-regressive model of the time-domain signal, combined with an SVD decomposition, is proposed.

Figure 1. Accuracy by selection criterion (EMEP and ICD) versus number of features selected

Fig. 2 compares the EMEP to the ICD feature selection criterion for the three different feature models (envelope, statistical and wavelet package). The quality of each of these three models reflects itself in the particular number of selected features of each model.
The horizontal axis shows the total number of features selected during the SFS search. The vertical axis shows the number of selected features belonging to each of the three feature models. For instance, with the EMEP criterion, after having reached six selected features, zero are wavelet features, two are statistical features and four are envelope features. The main difference between the EMEP

and ICD selection criteria is the preference for a certain feature extraction model. While EMEP chooses the models in a balanced manner, the ICD criterion prefers the envelope model.

Figure 2. Number of selected features per feature model (envelope, statistical, wavelet) versus total number of features selected, for the EMEP and ICD criteria

4.4. Experiments with the Support Vector Machine

For the SVM experiments we used 10-fold cross-validation, due to the high computational cost of SVM training. The SVM type used was the C-SVM, with the radial basis kernel function. The RBF intrinsic parameter γ and the regularization parameter C = 1 were set based on preliminary experiments. As in the case of K-NN, the feature increment step was set to three, until reaching 60 selected features. For the SVM we used only the EMEP selection criterion, because the experiments with K-NN suggested a considerable inferiority of the ICD criterion; since the computational cost of the performance estimation with the SVM classifier is high, we discarded the ICD criterion. The classifiers that used the wavelet package analysis feature model exhibited an estimated accuracy considerably higher than those trained with the other two feature models, and higher than the global feature pool. After selecting 9 features from the global pool, the accuracy was even higher than for the wavelet features alone. The results are shown in Table 6.

Table 6. Estimated SVM accuracy for different feature models

Extraction Model                      SVM
Envelope                              85.47%
Statistical                           92.88%
Wavelet                               99.30%
Global Feature Pool                   90.56%
9 Selected from Global Feature Pool   99.38%
18 Selected from Global Feature Pool  99.96%

5. Conclusions and Future Work

The K-NN classifier trained with the global feature pool showed a higher estimated accuracy than each of the isolated feature models. This behavior did not occur when the SVM was used as the classifier.
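The 10-fold estimation loop of section 4.4 can be written generically. The sketch below uses hypothetical names and plain Python; a C-SVM from a library such as LIBSVM or scikit-learn could be plugged in where the placeholder classifier stands.

```python
def k_fold_accuracy(X, y, train_and_predict, k=10):
    """Mean accuracy over k held-out folds (interleaved index split)."""
    n = len(X)
    folds = [list(range(i, n, k)) for i in range(k)]
    accuracies = []
    for fold in folds:
        held_out = set(fold)
        train_X = [X[i] for i in range(n) if i not in held_out]
        train_y = [y[i] for i in range(n) if i not in held_out]
        predictions = train_and_predict(train_X, train_y, [X[i] for i in fold])
        correct = sum(p == y[i] for p, i in zip(predictions, fold))
        accuracies.append(correct / len(fold))
    return sum(accuracies) / len(accuracies)

def majority_class_classifier(train_X, train_y, test_X):
    """Placeholder for the C-SVM: always predicts the most frequent training label."""
    most_common = max(set(train_y), key=train_y.count)
    return [most_common] * len(test_X)
```

Leave-one-out, used for the K-NN experiments, is the special case k = n of the same loop.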
The feature selection by the SFS search increased the accuracy of both classifiers, K-NN and SVM. When the SFS used the EMEP criterion to select the features, it chose among the feature models in a more balanced manner than with the ICD criterion, which preferred the envelope features over the other two. For nine features or more, the SVM

classifier achieved a higher estimated accuracy than when it used any specific feature extraction model, including when the three feature extraction techniques were used together. In future work we will test Artificial Neural Networks, additional feature models, more feature selection techniques, ensembles of classifiers, other application domains and other performance estimation metrics to optimize the global quality of the fault diagnosis system.

References

Bishop, C. M. et al. (2006). Pattern Recognition and Machine Learning, volume 1. Springer, New York.

Bouckaert, R. R. (2004). Estimating replicability of classifier learning experiments. In Proceedings of the Twenty-First International Conference on Machine Learning, page 15. ACM.

Burges, C. J. (1998). A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery, 2(2).

Cover, T. and Hart, P. (1967). Nearest neighbor pattern classification. IEEE Transactions on Information Theory, 13(1).

CWRU (2013). Case Western Reserve University, Bearing Data Center. eecs.cwru.edu/laboratory/bearing.

Guyon, I. and Elisseeff, A. (2003). An introduction to variable and feature selection. The Journal of Machine Learning Research, 3.

Liu, J. (2012). Shannon wavelet spectrum analysis on truncated vibration signals for machine incipient fault detection. Measurement Science and Technology, 23(5).

Luo, J., Yu, D., and Liang, M. (2013). A kurtosis-guided adaptive demodulation technique for bearing fault detection based on tunable-Q wavelet transform. Measurement Science and Technology, 24(5).

McInerny, S. A. and Dai, Y. (2003). Basic vibration signal processing for bearing fault detection. IEEE Transactions on Education, 46(1).

Vapnik, V. (1999). The Nature of Statistical Learning Theory. Springer-Verlag, New York.

Wandekokem, E. D., Mendel, E., Fabris, F., Valentim, M., Batista, R. J., Varejão, F. M., and Rauber, T. W. (2011).
Diagnosing multiple faults in oil rig motor pumps using support vector machine classifier ensembles. Integrated Computer-Aided Engineering, 18(1).

Wang, Y., Kang, S., Jiang, Y., Yang, G., Song, L., and Mikulovich, V. (2012). Classification of fault location and the degree of performance degradation of a rolling bearing based on an improved hyper-sphere-structured multi-class support vector machine. Mechanical Systems and Signal Processing, 29.

Wu, S.-D., Wu, P.-H., Wu, C.-W., Ding, J.-J., and Wang, C.-C. (2012). Bearing fault diagnosis based on multiscale permutation entropy and support vector machine. Entropy, 14(8).

Xia, Z., Xia, S., Wan, L., and Cai, S. (2012). Spectral regression based fault feature extraction for bearing accelerometer sensor signals. Sensors, 12(10).


More information

Bias-Variance Analysis of Ensemble Learning

Bias-Variance Analysis of Ensemble Learning Bias-Variance Analysis of Ensemble Learning Thomas G. Dietterich Department of Computer Science Oregon State University Corvallis, Oregon 97331 http://www.cs.orst.edu/~tgd Outline Bias-Variance Decomposition

More information

Applying Supervised Learning

Applying Supervised Learning Applying Supervised Learning When to Consider Supervised Learning A supervised learning algorithm takes a known set of input data (the training set) and known responses to the data (output), and trains

More information

Machine Learning and Pervasive Computing

Machine Learning and Pervasive Computing Stephan Sigg Georg-August-University Goettingen, Computer Networks 17.12.2014 Overview and Structure 22.10.2014 Organisation 22.10.3014 Introduction (Def.: Machine learning, Supervised/Unsupervised, Examples)

More information

Input and Structure Selection for k-nn Approximator

Input and Structure Selection for k-nn Approximator Input and Structure Selection for k- Approximator Antti Soramaa ima Reyhani and Amaury Lendasse eural etwork Research Centre Helsinki University of Technology P.O. Box 5400 005 spoo Finland {asorama nreyhani

More information

A Novel Fault Identifying Method with Supervised Classification and Unsupervised Clustering

A Novel Fault Identifying Method with Supervised Classification and Unsupervised Clustering A Novel Fault Identifying Method with Supervised Classification and Unsupervised Clustering Tao Xu Department of Automation Shenyang Aerospace University China xutao@sau.edu.cn Journal of Digital Information

More information

Classification of Subject Motion for Improved Reconstruction of Dynamic Magnetic Resonance Imaging

Classification of Subject Motion for Improved Reconstruction of Dynamic Magnetic Resonance Imaging 1 CS 9 Final Project Classification of Subject Motion for Improved Reconstruction of Dynamic Magnetic Resonance Imaging Feiyu Chen Department of Electrical Engineering ABSTRACT Subject motion is a significant

More information

Department of Electromechanical Engineering, University of Burgos, Burgos 09006, Spain;

Department of Electromechanical Engineering, University of Burgos, Burgos 09006, Spain; Sensors 2014, 14, 20713-20735; doi:10.3390/s141120713 Article OPEN ACCESS sensors ISSN 1424-8220 www.mdpi.com/journal/sensors An SVM-Based Classifier for Estimating the State of Various Rotating Components

More information

Detecting Bearing Defects under High Noise Levels: A Classifier Fusion Approach

Detecting Bearing Defects under High Noise Levels: A Classifier Fusion Approach Detecting Bearing Defects under High Noise Levels: A Classifier Fusion Approach Luana Batista, Bechir Badri, Robert Sabourin, Marc Thomas lbatista@livia.etsmtl.ca, bechirbadri@yahoo.fr, {robert.sabourin,

More information

TWRBF Transductive RBF Neural Network with Weighted Data Normalization

TWRBF Transductive RBF Neural Network with Weighted Data Normalization TWRBF Transductive RBF eural etwork with Weighted Data ormalization Qun Song and ikola Kasabov Knowledge Engineering & Discovery Research Institute Auckland University of Technology Private Bag 9006, Auckland

More information

CLASSIFICATION WITH RADIAL BASIS AND PROBABILISTIC NEURAL NETWORKS

CLASSIFICATION WITH RADIAL BASIS AND PROBABILISTIC NEURAL NETWORKS CLASSIFICATION WITH RADIAL BASIS AND PROBABILISTIC NEURAL NETWORKS CHAPTER 4 CLASSIFICATION WITH RADIAL BASIS AND PROBABILISTIC NEURAL NETWORKS 4.1 Introduction Optical character recognition is one of

More information

Louis Fourrier Fabien Gaie Thomas Rolf

Louis Fourrier Fabien Gaie Thomas Rolf CS 229 Stay Alert! The Ford Challenge Louis Fourrier Fabien Gaie Thomas Rolf Louis Fourrier Fabien Gaie Thomas Rolf 1. Problem description a. Goal Our final project is a recent Kaggle competition submitted

More information

Genetic Algorithm Based Discriminant Feature Selection for Improved Fault Diagnosis of Induction Motor

Genetic Algorithm Based Discriminant Feature Selection for Improved Fault Diagnosis of Induction Motor Int'l Conf. Artificial Intelligence ICAI'7 2 Genetic Algorithm Based Discriminant Feature Selection for Improved Fault Diagnosis of Induction Motor Young-Hun Kim, M M Manjurul Islam, Rashedul Islam 2,

More information

Contents. Preface to the Second Edition

Contents. Preface to the Second Edition Preface to the Second Edition v 1 Introduction 1 1.1 What Is Data Mining?....................... 4 1.2 Motivating Challenges....................... 5 1.3 The Origins of Data Mining....................

More information

Advanced Video Content Analysis and Video Compression (5LSH0), Module 8B

Advanced Video Content Analysis and Video Compression (5LSH0), Module 8B Advanced Video Content Analysis and Video Compression (5LSH0), Module 8B 1 Supervised learning Catogarized / labeled data Objects in a picture: chair, desk, person, 2 Classification Fons van der Sommen

More information

Efficient Tuning of SVM Hyperparameters Using Radius/Margin Bound and Iterative Algorithms

Efficient Tuning of SVM Hyperparameters Using Radius/Margin Bound and Iterative Algorithms IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 13, NO. 5, SEPTEMBER 2002 1225 Efficient Tuning of SVM Hyperparameters Using Radius/Margin Bound and Iterative Algorithms S. Sathiya Keerthi Abstract This paper

More information

Random Forest A. Fornaser

Random Forest A. Fornaser Random Forest A. Fornaser alberto.fornaser@unitn.it Sources Lecture 15: decision trees, information theory and random forests, Dr. Richard E. Turner Trees and Random Forests, Adele Cutler, Utah State University

More information

Content Based Image Retrieval system with a combination of Rough Set and Support Vector Machine

Content Based Image Retrieval system with a combination of Rough Set and Support Vector Machine Shahabi Lotfabadi, M., Shiratuddin, M.F. and Wong, K.W. (2013) Content Based Image Retrieval system with a combination of rough set and support vector machine. In: 9th Annual International Joint Conferences

More information

Instance-based Learning CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2015

Instance-based Learning CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2015 Instance-based Learning CE-717: Machine Learning Sharif University of Technology M. Soleymani Fall 2015 Outline Non-parametric approach Unsupervised: Non-parametric density estimation Parzen Windows K-Nearest

More information

Naïve Bayes for text classification

Naïve Bayes for text classification Road Map Basic concepts Decision tree induction Evaluation of classifiers Rule induction Classification using association rules Naïve Bayesian classification Naïve Bayes for text classification Support

More information

CS6716 Pattern Recognition

CS6716 Pattern Recognition CS6716 Pattern Recognition Prototype Methods Aaron Bobick School of Interactive Computing Administrivia Problem 2b was extended to March 25. Done? PS3 will be out this real soon (tonight) due April 10.

More information

Forward Feature Selection Using Residual Mutual Information

Forward Feature Selection Using Residual Mutual Information Forward Feature Selection Using Residual Mutual Information Erik Schaffernicht, Christoph Möller, Klaus Debes and Horst-Michael Gross Ilmenau University of Technology - Neuroinformatics and Cognitive Robotics

More information

Unsupervised Feature Selection for Sparse Data

Unsupervised Feature Selection for Sparse Data Unsupervised Feature Selection for Sparse Data Artur Ferreira 1,3 Mário Figueiredo 2,3 1- Instituto Superior de Engenharia de Lisboa, Lisboa, PORTUGAL 2- Instituto Superior Técnico, Lisboa, PORTUGAL 3-

More information

NUMERICAL ANALYSIS OF ROLLER BEARING

NUMERICAL ANALYSIS OF ROLLER BEARING Applied Computer Science, vol. 12, no. 1, pp. 5 16 Submitted: 2016-02-09 Revised: 2016-03-03 Accepted: 2016-03-11 tapered roller bearing, dynamic simulation, axial load force Róbert KOHÁR *, Frantisek

More information

Measurement 46 (2013) Contents lists available at SciVerse ScienceDirect. Measurement

Measurement 46 (2013) Contents lists available at SciVerse ScienceDirect. Measurement Measurement 6 () 55 56 Contents lists available at SciVerse ScienceDirect Measurement journal homepage: www.elsevier.com/locate/measurement Fault diagnosis of rotating machinery based on the statistical

More information

Influence of geometric imperfections on tapered roller bearings life and performance

Influence of geometric imperfections on tapered roller bearings life and performance Influence of geometric imperfections on tapered roller bearings life and performance Rodríguez R a, Calvo S a, Nadal I b and Santo Domingo S c a Computational Simulation Centre, Instituto Tecnológico de

More information

Discriminative classifiers for image recognition

Discriminative classifiers for image recognition Discriminative classifiers for image recognition May 26 th, 2015 Yong Jae Lee UC Davis Outline Last time: window-based generic object detection basic pipeline face detection with boosting as case study

More information

Machine Learning Techniques for Data Mining

Machine Learning Techniques for Data Mining Machine Learning Techniques for Data Mining Eibe Frank University of Waikato New Zealand 10/25/2000 1 PART VII Moving on: Engineering the input and output 10/25/2000 2 Applying a learner is not all Already

More information

FAULT DETECTION AND ISOLATION USING SPECTRAL ANALYSIS. Eugen Iancu

FAULT DETECTION AND ISOLATION USING SPECTRAL ANALYSIS. Eugen Iancu FAULT DETECTION AND ISOLATION USING SPECTRAL ANALYSIS Eugen Iancu Automation and Mechatronics Department University of Craiova Eugen.Iancu@automation.ucv.ro Abstract: In this work, spectral signal analyses

More information

Univariate Margin Tree

Univariate Margin Tree Univariate Margin Tree Olcay Taner Yıldız Department of Computer Engineering, Işık University, TR-34980, Şile, Istanbul, Turkey, olcaytaner@isikun.edu.tr Abstract. In many pattern recognition applications,

More information

Decision Trees Dr. G. Bharadwaja Kumar VIT Chennai

Decision Trees Dr. G. Bharadwaja Kumar VIT Chennai Decision Trees Decision Tree Decision Trees (DTs) are a nonparametric supervised learning method used for classification and regression. The goal is to create a model that predicts the value of a target

More information

Kernel Methods and Visualization for Interval Data Mining

Kernel Methods and Visualization for Interval Data Mining Kernel Methods and Visualization for Interval Data Mining Thanh-Nghi Do 1 and François Poulet 2 1 College of Information Technology, Can Tho University, 1 Ly Tu Trong Street, Can Tho, VietNam (e-mail:

More information

Neural Networks. CE-725: Statistical Pattern Recognition Sharif University of Technology Spring Soleymani

Neural Networks. CE-725: Statistical Pattern Recognition Sharif University of Technology Spring Soleymani Neural Networks CE-725: Statistical Pattern Recognition Sharif University of Technology Spring 2013 Soleymani Outline Biological and artificial neural networks Feed-forward neural networks Single layer

More information

ECG782: Multidimensional Digital Signal Processing

ECG782: Multidimensional Digital Signal Processing ECG782: Multidimensional Digital Signal Processing Object Recognition http://www.ee.unlv.edu/~b1morris/ecg782/ 2 Outline Knowledge Representation Statistical Pattern Recognition Neural Networks Boosting

More information

Empirical Evaluation of Feature Subset Selection based on a Real-World Data Set

Empirical Evaluation of Feature Subset Selection based on a Real-World Data Set P. Perner and C. Apte, Empirical Evaluation of Feature Subset Selection Based on a Real World Data Set, In: D.A. Zighed, J. Komorowski, and J. Zytkow, Principles of Data Mining and Knowledge Discovery,

More information

A Feature Selection Method to Handle Imbalanced Data in Text Classification

A Feature Selection Method to Handle Imbalanced Data in Text Classification A Feature Selection Method to Handle Imbalanced Data in Text Classification Fengxiang Chang 1*, Jun Guo 1, Weiran Xu 1, Kejun Yao 2 1 School of Information and Communication Engineering Beijing University

More information

Lecture 10 September 19, 2007

Lecture 10 September 19, 2007 CS 6604: Data Mining Fall 2007 Lecture 10 September 19, 2007 Lecture: Naren Ramakrishnan Scribe: Seungwon Yang 1 Overview In the previous lecture we examined the decision tree classifier and choices for

More information

Basis Functions. Volker Tresp Summer 2017

Basis Functions. Volker Tresp Summer 2017 Basis Functions Volker Tresp Summer 2017 1 Nonlinear Mappings and Nonlinear Classifiers Regression: Linearity is often a good assumption when many inputs influence the output Some natural laws are (approximately)

More information

Rule extraction from support vector machines

Rule extraction from support vector machines Rule extraction from support vector machines Haydemar Núñez 1,3 Cecilio Angulo 1,2 Andreu Català 1,2 1 Dept. of Systems Engineering, Polytechnical University of Catalonia Avda. Victor Balaguer s/n E-08800

More information

Scale-Invariance of Support Vector Machines based on the Triangular Kernel. Abstract

Scale-Invariance of Support Vector Machines based on the Triangular Kernel. Abstract Scale-Invariance of Support Vector Machines based on the Triangular Kernel François Fleuret Hichem Sahbi IMEDIA Research Group INRIA Domaine de Voluceau 78150 Le Chesnay, France Abstract This paper focuses

More information

Bearing fault detection using multi-scale fractal dimensions based on morphological covers

Bearing fault detection using multi-scale fractal dimensions based on morphological covers Shock and Vibration 19 (2012) 1373 1383 1373 DOI 10.3233/SAV-2012-0679 IOS Press Bearing fault detection using multi-scale fractal dimensions based on morphological covers Pei-Lin Zhang a,bingli a,b,,

More information

Performance Evaluation of Various Classification Algorithms

Performance Evaluation of Various Classification Algorithms Performance Evaluation of Various Classification Algorithms Shafali Deora Amritsar College of Engineering & Technology, Punjab Technical University -----------------------------------------------------------***----------------------------------------------------------

More information

The Fault Diagnosis of Wind Turbine Gearbox Based on Improved KNN

The Fault Diagnosis of Wind Turbine Gearbox Based on Improved KNN www.seipub.org/aee Advances in Energy Engineering (AEE) Volume 3, 2015 doi: 10.14355/aee.2015.03.002 The Fault Diagnosis of Wind Turbine Gearbox Based on Improved KNN Long Peng 1, Bin Jiao 2, Hai Liu 3,

More information

Fuzzy Entropy based feature selection for classification of hyperspectral data

Fuzzy Entropy based feature selection for classification of hyperspectral data Fuzzy Entropy based feature selection for classification of hyperspectral data Mahesh Pal Department of Civil Engineering NIT Kurukshetra, 136119 mpce_pal@yahoo.co.uk Abstract: This paper proposes to use

More information

The Machine Part Tool in Observer 9

The Machine Part Tool in Observer 9 Application Note The Machine Part Tool in SKF @ptitude Observer 9 Introduction The machine part tool in SKF @ptitude Observer is an important part of the setup. By defining the machine parts, it is possible

More information

Support Vector Machines: Brief Overview" November 2011 CPSC 352

Support Vector Machines: Brief Overview November 2011 CPSC 352 Support Vector Machines: Brief Overview" Outline Microarray Example Support Vector Machines (SVMs) Software: libsvm A Baseball Example with libsvm Classifying Cancer Tissue: The ALL/AML Dataset Golub et

More information

Sensor-based Semantic-level Human Activity Recognition using Temporal Classification

Sensor-based Semantic-level Human Activity Recognition using Temporal Classification Sensor-based Semantic-level Human Activity Recognition using Temporal Classification Weixuan Gao gaow@stanford.edu Chuanwei Ruan chuanwei@stanford.edu Rui Xu ray1993@stanford.edu I. INTRODUCTION Human

More information

Face Recognition Using Vector Quantization Histogram and Support Vector Machine Classifier Rong-sheng LI, Fei-fei LEE *, Yan YAN and Qiu CHEN

Face Recognition Using Vector Quantization Histogram and Support Vector Machine Classifier Rong-sheng LI, Fei-fei LEE *, Yan YAN and Qiu CHEN 2016 International Conference on Artificial Intelligence: Techniques and Applications (AITA 2016) ISBN: 978-1-60595-389-2 Face Recognition Using Vector Quantization Histogram and Support Vector Machine

More information

Performance Analysis of Data Mining Classification Techniques

Performance Analysis of Data Mining Classification Techniques Performance Analysis of Data Mining Classification Techniques Tejas Mehta 1, Dr. Dhaval Kathiriya 2 Ph.D. Student, School of Computer Science, Dr. Babasaheb Ambedkar Open University, Gujarat, India 1 Principal

More information

CHAPTER 3. Preprocessing and Feature Extraction. Techniques

CHAPTER 3. Preprocessing and Feature Extraction. Techniques CHAPTER 3 Preprocessing and Feature Extraction Techniques CHAPTER 3 Preprocessing and Feature Extraction Techniques 3.1 Need for Preprocessing and Feature Extraction schemes for Pattern Recognition and

More information

Time Series Classification in Dissimilarity Spaces

Time Series Classification in Dissimilarity Spaces Proceedings 1st International Workshop on Advanced Analytics and Learning on Temporal Data AALTD 2015 Time Series Classification in Dissimilarity Spaces Brijnesh J. Jain and Stephan Spiegel Berlin Institute

More information

Introduction to Support Vector Machines

Introduction to Support Vector Machines Introduction to Support Vector Machines CS 536: Machine Learning Littman (Wu, TA) Administration Slides borrowed from Martin Law (from the web). 1 Outline History of support vector machines (SVM) Two classes,

More information

Stability of Feature Selection Algorithms

Stability of Feature Selection Algorithms Stability of Feature Selection Algorithms Alexandros Kalousis, Jullien Prados, Phong Nguyen Melanie Hilario Artificial Intelligence Group Department of Computer Science University of Geneva Stability of

More information

Enhancing Forecasting Performance of Naïve-Bayes Classifiers with Discretization Techniques

Enhancing Forecasting Performance of Naïve-Bayes Classifiers with Discretization Techniques 24 Enhancing Forecasting Performance of Naïve-Bayes Classifiers with Discretization Techniques Enhancing Forecasting Performance of Naïve-Bayes Classifiers with Discretization Techniques Ruxandra PETRE

More information

CS570: Introduction to Data Mining

CS570: Introduction to Data Mining CS570: Introduction to Data Mining Classification Advanced Reading: Chapter 8 & 9 Han, Chapters 4 & 5 Tan Anca Doloc-Mihu, Ph.D. Slides courtesy of Li Xiong, Ph.D., 2011 Han, Kamber & Pei. Data Mining.

More information

Cover Page. The handle holds various files of this Leiden University dissertation.

Cover Page. The handle   holds various files of this Leiden University dissertation. Cover Page The handle http://hdl.handle.net/1887/22055 holds various files of this Leiden University dissertation. Author: Koch, Patrick Title: Efficient tuning in supervised machine learning Issue Date:

More information

MIT 801. Machine Learning I. [Presented by Anna Bosman] 16 February 2018

MIT 801. Machine Learning I. [Presented by Anna Bosman] 16 February 2018 MIT 801 [Presented by Anna Bosman] 16 February 2018 Machine Learning What is machine learning? Artificial Intelligence? Yes as we know it. What is intelligence? The ability to acquire and apply knowledge

More information

Hybrid Approach for Classification using Support Vector Machine and Decision Tree

Hybrid Approach for Classification using Support Vector Machine and Decision Tree Hybrid Approach for Classification using Support Vector Machine and Decision Tree Anshu Bharadwaj Indian Agricultural Statistics research Institute New Delhi, India anshu@iasri.res.in Sonajharia Minz Jawaharlal

More information

6. Dicretization methods 6.1 The purpose of discretization

6. Dicretization methods 6.1 The purpose of discretization 6. Dicretization methods 6.1 The purpose of discretization Often data are given in the form of continuous values. If their number is huge, model building for such data can be difficult. Moreover, many

More information

Slides for Data Mining by I. H. Witten and E. Frank

Slides for Data Mining by I. H. Witten and E. Frank Slides for Data Mining by I. H. Witten and E. Frank 7 Engineering the input and output Attribute selection Scheme-independent, scheme-specific Attribute discretization Unsupervised, supervised, error-

More information

Pattern Classification Using Neuro Fuzzy and Support Vector Machine (SVM) - A Comparative Study

Pattern Classification Using Neuro Fuzzy and Support Vector Machine (SVM) - A Comparative Study Pattern Classification Using Neuro Fuzzy and Support Vector Machine (SVM) - A Comparative Study Dr. Maya Nayak 1 and Er. Jnana Ranjan Tripathy 2 Department of Information Technology, Biju Pattnaik University

More information

A Boosting-Based Framework for Self-Similar and Non-linear Internet Traffic Prediction

A Boosting-Based Framework for Self-Similar and Non-linear Internet Traffic Prediction A Boosting-Based Framework for Self-Similar and Non-linear Internet Traffic Prediction Hanghang Tong 1, Chongrong Li 2, and Jingrui He 1 1 Department of Automation, Tsinghua University, Beijing 100084,

More information

Supervised Learning Classification Algorithms Comparison

Supervised Learning Classification Algorithms Comparison Supervised Learning Classification Algorithms Comparison Aditya Singh Rathore B.Tech, J.K. Lakshmipat University -------------------------------------------------------------***---------------------------------------------------------

More information

Support Vector Machines

Support Vector Machines Support Vector Machines RBF-networks Support Vector Machines Good Decision Boundary Optimization Problem Soft margin Hyperplane Non-linear Decision Boundary Kernel-Trick Approximation Accurancy Overtraining

More information

CS 229 Midterm Review

CS 229 Midterm Review CS 229 Midterm Review Course Staff Fall 2018 11/2/2018 Outline Today: SVMs Kernels Tree Ensembles EM Algorithm / Mixture Models [ Focus on building intuition, less so on solving specific problems. Ask

More information

Clustering Analysis based on Data Mining Applications Xuedong Fan

Clustering Analysis based on Data Mining Applications Xuedong Fan Applied Mechanics and Materials Online: 203-02-3 ISSN: 662-7482, Vols. 303-306, pp 026-029 doi:0.4028/www.scientific.net/amm.303-306.026 203 Trans Tech Publications, Switzerland Clustering Analysis based

More information

Using PageRank in Feature Selection

Using PageRank in Feature Selection Using PageRank in Feature Selection Dino Ienco, Rosa Meo, and Marco Botta Dipartimento di Informatica, Università di Torino, Italy {ienco,meo,botta}@di.unito.it Abstract. Feature selection is an important

More information

Minimal Test Cost Feature Selection with Positive Region Constraint

Minimal Test Cost Feature Selection with Positive Region Constraint Minimal Test Cost Feature Selection with Positive Region Constraint Jiabin Liu 1,2,FanMin 2,, Shujiao Liao 2, and William Zhu 2 1 Department of Computer Science, Sichuan University for Nationalities, Kangding

More information

BENCHMARKING ATTRIBUTE SELECTION TECHNIQUES FOR MICROARRAY DATA

BENCHMARKING ATTRIBUTE SELECTION TECHNIQUES FOR MICROARRAY DATA BENCHMARKING ATTRIBUTE SELECTION TECHNIQUES FOR MICROARRAY DATA S. DeepaLakshmi 1 and T. Velmurugan 2 1 Bharathiar University, Coimbatore, India 2 Department of Computer Science, D. G. Vaishnav College,

More information

Contributions to the diagnosis of kinematic chain components operation by analyzing the electric current and temperature of the driving engine

Contributions to the diagnosis of kinematic chain components operation by analyzing the electric current and temperature of the driving engine Fourth International Conference Modelling and Development of Intelligent Systems October 28 - November 1, 2015 Lucian Blaga University Sibiu - Romania Contributions to the diagnosis of kinematic chain

More information