A New Implementation of Recursive Feature Elimination Algorithm for Gene Selection from Microarray Data


2009 World Congress on Computer Science and Information Engineering

Sihua Peng 1, Xiaoping Liu 2, Jiyang Yu 1, Zhizhen Wan 3, and Xiaoning Peng 4*

1 Department of Pathology, School of Medicine, Zhejiang University; 2 College of Life Science and Technology, Xinjiang University; 3 College of Computer Science and Engineering, Zhejiang University; 4 School of Medicine, Hunan Normal University.

Abstract

We propose a new approach for gene selection and multi-cancer classification based on step-by-step improvement of classification performance (SSiCP). The SSiCP gene selection algorithm was evaluated on the NCI60 and GCM benchmark datasets, achieving accuracies of 96.6% and 95.5%, respectively, in 10-fold cross-validation. Furthermore, SSiCP outperformed recently published algorithms when applied to another two multi-cancer data sets. Computational evidence indicated that SSiCP can avoid overfitting effectively. Compared with various gene selection algorithms, the implementation of SSiCP is very simple, and all the computational experiments are repeatable.

1. Introduction

Cancer classification is a very important step in the diagnosis and treatment of cancers. Without correct identification of the cancer type, it is almost impossible to achieve a good therapeutic effect. Many in-depth studies of cancer identification and classification based on cDNA microarray technology have been carried out [1, 2]. For binary classification problems, such as tumour versus normal tissue [3], or one subtype of a tumour versus another [4], molecular classification using gene expression profiles has achieved a very high degree of accuracy. For classification of multiple tumour types, however, the accuracy has yet to be improved [5-10].
Because of the high dimensionality, the excessive noise, and the relatively small sample sizes in DNA microarray data, this issue has become a focus of research in the data mining of gene expression profiles. Many conventional classification methods perform very poorly [11], especially on data with a large number of cancer types, such as the NCI60 data set (9 types of cancer) [5] and the GCM data set (14 types of cancer) [6]. Recently, to face the challenge of multi-cancer classification, investigators have proposed many new approaches. Xu et al. used semi-supervised ellipsoid ARTMAP and particle swarm optimization, with competitive performance [12]. Cai et al. proposed a new algorithm, which introduced a new measurement to quantify the difference in class discrimination strength between two genes [13]. Zhou et al. [14] recently put forward the MSVM-RFE algorithms, which are four extensions of the well-known SVM-RFE algorithm [15]. However, more powerful data mining algorithms can achieve higher classification accuracy while selecting fewer genes.

In this paper, we propose a new approach to gene selection and multi-cancer classification based on step-by-step improvement of classification performance (SSiCP). SSiCP, which is neither SVM-RFE nor an extension of SVM-RFE [15], is a new SVM-based implementation of the RFE feature selection methodology. The results show that our strategy is very effective, with a fast calculation procedure.

* To whom correspondence should be addressed: Xiaoning Peng, PhD, Hunan Normal University School of Medicine, No. 81 Jiatongjie, Changsha, Hunan Province, P.R. China (pxiaoning@hunnu.edu.cn)

2. Materials and Methods

2.1 Data sets

NCI60 dataset [5]

The NCI60 data set was described by Ross et al., and can be downloaded from ( wi.mit.edu/mpr/nci60/nci_60.expression.scfrs.txt). There are 60 samples in this data set, which express 7129 genes in nine types.

GCM dataset [6,7]

The original GCM dataset contains 198 samples with genes from 14 classes of cancers [6]. A subset of the original GCM dataset is employed in this study, which was downloaded from the web site ( view&paper_id=114).

Human Carcinomas Dataset (HCD174) [8]

The HCD174 dataset contains 174 samples in 11 classes. Each sample contains genes. The dataset was obtained from ( ).

Central Nervous System Embryonal Tumors dataset (CNS) [9]

The CNS dataset contains 42 samples with 7129 gene probes, and can be downloaded from ( ).

2.2 Gene pre-selection

Without gene pre-selection, computation becomes a time-consuming task because of the very high dimensionality of the feature space. After gene pre-selection, we obtain a few dozen to a few hundred differentially expressed genes. The second step of gene selection is then carried out on this reduced gene subset, with the computational burden greatly reduced. As our algorithm is based on the Weka platform, we tested several of the feature selection methods available in Weka. After calculation and comparison, we chose the chi-squared test-based feature selection algorithm, named "ChiSquaredAttributeEval" in Weka, as our gene pre-selection algorithm. The chi-squared (χ²) method evaluates features individually by measuring their χ² statistic with respect to the classes. After calculating the χ² value of all considered features, we sorted the values in descending order, since the larger the χ² value, the more important the feature [16].

2.3 RFE: Recursive Feature Elimination

RFE is an iterative procedure, which can be described as follows.

1. Train the classifier.
2. Compute the ranking criterion for all features.
3. Remove the feature with the smallest ranking criterion.
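The three RFE steps above can be sketched as a short loop. This is only an illustrative sketch, not the authors' code: the function `rfe`, the `rank` callback, and the toy weights are hypothetical names introduced here, and a real `rank` function would retrain the classifier on the surviving features at every iteration.

```python
from typing import Callable, List, Sequence


def rfe(features: Sequence[str],
        rank: Callable[[List[str]], List[float]],
        n_keep: int) -> List[str]:
    """Generic RFE loop: score the surviving features, drop the weakest, repeat."""
    remaining = list(features)
    while len(remaining) > n_keep:
        # Steps 1-2: (re)train and compute one ranking score per remaining feature.
        scores = rank(remaining)
        # Step 3: remove the feature with the smallest ranking criterion.
        worst = min(range(len(remaining)), key=scores.__getitem__)
        del remaining[worst]
    return remaining


# Toy stand-in for a trained model: fixed, made-up weights per gene (assumption).
toy_weights = {"g1": 0.9, "g2": -0.1, "g3": 0.4, "g4": -0.7}


def toy_rank(subset: List[str]) -> List[float]:
    # Squared weight as the score; a real criterion would come from the classifier.
    return [toy_weights[g] ** 2 for g in subset]


selected = rfe(list(toy_weights), toy_rank, n_keep=2)
print(selected)
```

Passing the ranking criterion in as a function keeps the loop independent of any particular classifier, which is the sense in which RFE is a methodology rather than a single algorithm.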
In the SVM-RFE algorithm proposed by Guyon et al., the main steps are as follows [15].

1. Train the classifier: $\alpha = \text{SVM-train}(X, y)$
2. Compute the weight vector of dimension length($s$): $w = \sum_k \alpha_k y_k x_k$
3. Compute the ranking criteria: $c_i = (w_i)^2$ for all $i$
4. Find the feature with the smallest ranking criterion: $f = \arg\min(c)$
5. Eliminate the feature with the smallest ranking criterion, $f$.

2.4 Feature selection methodology

Step by step feature reduction

The SSiCP algorithm is not a kind of wrapper algorithm [17]. In SSiCP, we do not use a search method, but we do employ an evaluation function to guide the step-by-step elimination of features. To some extent, SSiCP is similar to SVM-RFE in two respects: both algorithms are SVM-based, and both employ the recursive feature elimination (RFE) methodology. Nevertheless, they are completely different algorithms. The innovation of our algorithm is the feature elimination criterion. Briefly, we eliminate one feature at a time. If the classification accuracy without this feature increases (or equals the original value), we remove the feature permanently; otherwise, we restore it. Thus SSiCP does not rank the features by any ranking criterion. The key steps of the proposed algorithm are as follows:

Step 1. Train the classifier with n features (genes), and compute the accuracy with m-fold cross-validation.
Step 2. Eliminate a feature f temporarily, and compute the accuracy with m-fold cross-validation.
Step 3. If the accuracy without f is not lower than the accuracy with f, remove the feature f; if the accuracy with f is higher, restore the feature f. If all the retained features have been restored once without an increase, a local maximum of the accuracy has been obtained. In this case, we make = .
Step 4. If n = 2, stop the calculation. If n > 2, go to Step 2.

The above steps are the key points of our algorithm, and the details are shown in Fig. 1.
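The elimination rule in Steps 1 to 4 can be illustrated with a minimal sketch. It rests on several assumptions that are not from the paper: the `evaluate` callback stands in for m-fold cross-validation accuracy of the SVM classifier, the toy accuracy function and gene names are invented, and the loop simply stops once a full pass removes nothing, because the paper's exact handling of a local maximum is not recoverable from the text.

```python
from typing import Callable, List, Sequence


def ssicp(features: Sequence[str],
          evaluate: Callable[[List[str]], float],
          min_features: int = 2) -> List[str]:
    """Drop features one at a time, keeping a removal only if accuracy does not fall."""
    kept = list(features)
    best = evaluate(kept)                        # Step 1: accuracy with all n features
    changed = True
    while changed and len(kept) > min_features:
        changed = False
        for f in list(kept):                     # snapshot: each feature tried once per pass
            if len(kept) <= min_features:        # Step 4: stop when n reaches the minimum
                break
            trial = [g for g in kept if g != f]  # Step 2: remove f temporarily
            acc = evaluate(trial)
            if acc >= best:                      # Step 3: accuracy not lower, drop f for good
                kept, best, changed = trial, acc, True
            # otherwise f is "restored" simply by leaving `kept` unchanged
    return kept


# Hypothetical accuracy function: two informative genes help, the others add noise.
INFORMATIVE = {"g1", "g3"}


def toy_accuracy(subset: List[str]) -> float:
    hits = len(INFORMATIVE & set(subset))
    noise = len(subset) - hits
    return 0.5 + 0.2 * hits - 0.01 * noise


chosen = ssicp(["g1", "g2", "g3", "g4", "g5"], toy_accuracy)
print(chosen)
```

Unlike the ranking-based loop of SVM-RFE, no per-feature score is computed here; the only signal is whether the evaluated accuracy drops when a feature is left out.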

Fig. 1 Schematic map of the feature reduction algorithm.

Overfitting evaluation of SSiCP algorithm

As with any machine learning algorithm, the overfitting issue must be addressed. Of the four datasets, HCD174 has the most instances (174), more than GCM, NCI60, and CNS. Therefore, to evaluate the overfitting status of the SSiCP algorithm, the HCD174 dataset is partitioned into two parts: a training set and a test set. A classifier model is obtained by running the SSiCP algorithm on the training dataset, and its accuracy is measured by ten-fold cross-validation. The classifier model is then tested on the independent test dataset, giving a second accuracy. If there is little difference between the cross-validation accuracy and the test accuracy, we conclude that SSiCP can avoid overfitting effectively.

2.5 Confirmation of classification algorithm in the second step of feature selection

By comparing seven classification algorithms, namely the Naive Bayes classifier, the BayesNet classifier, SMO (sequential minimal optimization algorithm for training a support vector classifier), KStar, LMT (logistic model trees), J48, and SL (classifier for building linear logistic regression models) [Weka], we determined the classification algorithm which provided the best performance. The seven classification algorithms were applied to the GCM and NCI60 data sets, and the optimal algorithm was selected. Subsequent calculation results showed that SMO outperformed all of the other six algorithms.

2.6 Parameter selection on Weka

When SVM is used for the classification task, the choice of the kernel function is a key factor in obtaining better performance. For the classification of microarray datasets, relatively better classification performance has been achieved using the polynomial kernel function [10]. After testing the four kernel functions (NormalizedPolyKernel, PolyKernel, RBFKernel, and StringKernel) on Weka, it was also clear that the best results were achieved using PolyKernel.

3. Results

3.1 Initial noise removal and comparison of classification algorithms

The NCI60 and GCM datasets are generally considered benchmark datasets in the microarray data mining problem, so they are routinely used to test the performance of a new algorithm. Therefore, seven classification algorithms which are commonly used in data mining were applied to these two datasets. First, we obtained the computational results with and without feature pre-selection (using the χ² test-based feature selection algorithm). The results suggested that after initial pre-selection of the features, the classification performance improved considerably, indicating that the noise in the microarray datasets was removed to a certain extent. The results also indicated that on both the NCI60 data and the GCM data, the SMO algorithm was superior to the other algorithms.

3.2 Gene selection based on step-by-step improvement of classification performance

After feature (gene) pre-selection, 208 genes were selected from the NCI60 data set and 150 genes from the GCM data set. By calling the main package of Weka to run our algorithm, the computations were carried out on the NCI60 and GCM datasets, and the gene selection results of the above seven algorithms were obtained (Fig. 2 and Fig. 3). Clearly, the SMO algorithm again outperformed the other six algorithms.

Fig. 2 Classification performance comparisons of the seven algorithms using the NCI60 data set. The maximal accuracy of 96.6% was obtained using the SMO algorithm with 24 genes (red).

Fig. 3 Classification performance comparisons of the seven algorithms using the GCM data set. The maximal accuracy of 95.5% was obtained using the SMO algorithm with 28 genes (red).

3.3 Comparison of computational results using four data sets

Through the above comparisons, the SMO algorithm was selected as the classifier embedded in our algorithm. This SMO-based algorithm was then applied to the other two datasets, CNS and HCD174. In the calculation process, we generally chose the following parameters: ten-fold cross-validation, the PolyKernel kernel function, and the standardization data filter type, with the remaining parameters set to their default values. The results are shown in Table 1.

Table 1 - Accuracy comparison of multi-class classification using the four data sets (%)

Data sets (columns): NCI60, GCM, CNS, HCD174
Methods (rows): Su, Pomeroy, Yeang, Peng, Lin, Xu, Cai, Zhou, This study

Overfitting evaluation

The HCD174 dataset was divided into a training dataset with 142 instances and a test dataset with 32 instances. Running SSiCP on the HCD174 training set yielded a classifier model with 49 features and an accuracy of 95.8% by ten-fold cross-validation. The independent test dataset from HCD174 was then used to test the classifier model, giving an accuracy of 93.8%. The accuracy declined only slightly, from 95.8% to 93.8%, suggesting that SSiCP avoids overfitting effectively.

4. Discussions

In the comparison of the results obtained from the four datasets, our algorithm was superior to all other algorithms in classification accuracy except for the algorithm of Cai et al., which achieved slightly higher accuracy than ours (97.3% versus 97.1%, Table 1). However, the number of genes we selected was far smaller than theirs (their 80 versus our 37, Table 1).

The advantages of wrapper-based techniques for feature selection are well established [17], so a comparison should be made between the wrapper-based approaches and the SSiCP algorithm. First, it has recently been recognized that wrapper-based techniques have the potential to overfit the training data [18], while SSiCP has shown the ability to overcome overfitting in computational experiments. Second, wrapper-based techniques must employ a heuristic search method to explore feature subsets in a large state space, which places a heavy computational burden on the computer. SSiCP, instead of searching a huge state space, reduces the feature space through step-by-step improvement of classification accuracy, resulting in fast computation and a simple implementation.

5. References

[1] Golub, T.R., Slonim, D.K., Tamayo, P., et al. Molecular classification of cancer: Class discovery and class prediction by gene expression monitoring, Science 286, 1999.
[2] Bittner, M., et al. Molecular classification of cutaneous malignant melanoma by gene expression profiling, Nature 406, 2000.
[3] Furey, T.S., et al. Support vector machine classification and validation of cancer tissue samples using microarray expression data, Bioinformatics 16, 2000.
[4] Alizadeh, A.A., et al. Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling, Nature 403, 2000.
[5] Ross, D.T., et al. Systematic variation in gene expression patterns in human cancer cell lines, Nature Genetics 24, 2000.
[6] Ramaswamy, S., Tamayo, P., Rifkin, R., et al. Multiclass cancer diagnosis using tumor gene expression signatures, Proc. Natl. Acad. Sci. USA 98, 2001.
[7] Lu, J., Getz, G., Miska, E.A., et al. MicroRNA expression profiles classify human cancers, Nature 435, 2005.
[8] Su, A.I., et al. Molecular classification of human carcinomas by use of gene expression signatures, Cancer Research 61, 2001.
[9] Pomeroy, S.L., et al. Prediction of central nervous system embryonal tumour outcome based on gene expression, Nature 415, 2002.
[10] Peng, S.H., Xu, Q.H., Ling, X.B., et al. Molecular classification of cancer types from microarray data using the combination of genetic algorithms and support vector machines, FEBS Letters 555, 2003.
[11] Li, T., et al. A comparative study of feature selection and multiclass classification methods for tissue classification based on gene expression, Bioinformatics 20, 2004.
[12] Xu, R., et al. Multiclass cancer classification using semisupervised ellipsoid ARTMAP and particle swarm optimization with gene expression data, IEEE/ACM Transactions on Computational Biology and Bioinformatics 4, 2007.
[13] Cai, Z.P., et al. Selecting dissimilar genes for multi-class classification, an application in cancer subtyping, BMC Bioinformatics 8, 2007, Art. No. 206.
[14] Zhou, X. and Tuck, D.P. MSVM-RFE: extensions of SVM-RFE for multiclass gene selection on DNA microarray data, Bioinformatics 23, 2007.
[15] Guyon, I., et al. Gene selection for cancer classification using support vector machines, Machine Learning 46, 2002.
[16] Liu, H. and Setiono, R. Chi2: Feature selection and discretization of numeric attributes. In: Proceedings of the IEEE 7th International Conference on Tools with Artificial Intelligence.
[17] Kohavi, R. and John, G.H. Wrappers for feature subset selection, Artificial Intelligence 97, 1997.
[18] Reunanen, J. Overfitting in making comparisons between variable selection methods, Journal of Machine Learning Research 3, 2003.


More information

Feature-weighted k-nearest Neighbor Classifier

Feature-weighted k-nearest Neighbor Classifier Proceedings of the 27 IEEE Symposium on Foundations of Computational Intelligence (FOCI 27) Feature-weighted k-nearest Neighbor Classifier Diego P. Vivencio vivencio@comp.uf scar.br Estevam R. Hruschka

More information

Multi-label classification using rule-based classifier systems

Multi-label classification using rule-based classifier systems Multi-label classification using rule-based classifier systems Shabnam Nazmi (PhD candidate) Department of electrical and computer engineering North Carolina A&T state university Advisor: Dr. A. Homaifar

More information

Application of Support Vector Machine Algorithm in Spam Filtering

Application of Support Vector Machine Algorithm in  Spam Filtering Application of Support Vector Machine Algorithm in E-Mail Spam Filtering Julia Bluszcz, Daria Fitisova, Alexander Hamann, Alexey Trifonov, Advisor: Patrick Jähnichen Abstract The problem of spam classification

More information

Feature Selection for Multi-Class Imbalanced Data Sets Based on Genetic Algorithm

Feature Selection for Multi-Class Imbalanced Data Sets Based on Genetic Algorithm Ann. Data. Sci. (2015) 2(3):293 300 DOI 10.1007/s40745-015-0060-x Feature Selection for Multi-Class Imbalanced Data Sets Based on Genetic Algorithm Li-min Du 1,2 Yang Xu 1 Hua Zhu 1 Received: 30 November

More information

Exploratory data analysis for microarrays

Exploratory data analysis for microarrays Exploratory data analysis for microarrays Adrian Alexa Computational Biology and Applied Algorithmics Max Planck Institute for Informatics D-66123 Saarbrücken slides by Jörg Rahnenführer NGFN - Courses

More information
