Feature Selection for Multi-Class Problems Using Support Vector Machines
Guo-Zheng Li, Jie Yang, Guo-Ping Liu, Li Xue
Institute of Image Processing & Pattern Recognition, Shanghai Jiao Tong University, Shanghai, China

Abstract. Since feature selection can remove irrelevant and even noisy features and thereby improve the performance of learning systems, it is a crucial step in machine learning. Feature selection methods using support vector machines have obtained satisfactory results, but previous works are usually designed for binary classification and need auxiliary techniques to be extended to multi-class classification. In this paper, we propose a prediction risk based feature selection method that uses multi-class support vector machines directly. The performance of the proposed method is compared with that of the previous optimal brain damage based feature selection methods using binary support vector machines. Experiments on UCI data sets show that the prediction risk based method obtains better results than the previous methods on multi-class problems.

1 Introduction

Feature selection is one of the key topics in machine learning and related fields [1-3]: it can remove irrelevant and even noisy features and hence improve the quality of the data set and the performance of learning systems. Many feature selection algorithms have been developed in recent years, but no single algorithm is optimal for all problems. Since neural computing makes no assumptions about the distribution of the data, feature selection based on neural computing can perform well and genuinely improve the performance of neural learning machines whenever training data are available [4, 5]. Support vector machines (SVMs), proposed in the 1990s, have exhibited excellent performance in many applications and have become standard tools in neural computing [6, 7].
Compared with other neural computing methods such as multi-layer perceptron networks trained by back-propagation, SVMs realize the data-dependent principle of structural risk minimization, have better generalization ability, and obtain a globally optimal solution [8]. Although SVMs are powerful algorithms, too many irrelevant features can reduce their performance, so feature selection for SVMs has been proposed [5, 3]. Weston et al. proposed to use the leave-one-out error bound as the selection criterion [9]; Guyon et al. used the second derivative of the objective function as the criterion [5]. Rakotomamonjy studied the zero order and first order forms of the above criteria and proved that the optimal brain damage measure used by Guyon et al. is better than the others [10].

It is worth noting that all the above algorithms are based on binary classification SVMs. Since SVM classification algorithms are designed for binary problems, techniques such as one-against-one or one-against-all are needed to build multi-class SVMs, and the capability of feature selection methods using binary SVMs is accordingly limited. Weston et al. computed, for each feature, the sum of its measures over the binary SVMs that make up the multi-class machine in order to evaluate features in multi-class problems [11]. To use the multi-class SVM itself as the learning machine and select features effectively, we propose to use the prediction risk based feature selection method [12].

The rest of this paper is arranged as follows: the prediction risk based feature selection method using multi-class SVMs is described in Section 2; Section 3 reports experiments on multi-class UCI data sets; and Section 4 gives some discussion.

2 Prediction risk based feature selection method

We use feature selection to improve the accuracy of multi-class support vector machines, which we first introduce briefly. Support vector machines (SVMs), proposed in the 1990s, have become state-of-the-art methods in machine learning [8, 7] and have exhibited excellent performance in many applications such as digit recognition [13], text categorization [14], computer vision [15], biological data mining [5], and medical diagnosis [16]. In this paper, the 2-norm soft margin SVM [7] is used as the binary classification machine; it minimizes the training error as well as the 2-norm of the slack variables, in accordance with statistical learning theory [8].
The dual objective function is defined as

    L = \sum_{i=1}^{l} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{l} y_i y_j \alpha_i \alpha_j K(x_i, x_j) - \frac{1}{2C} \alpha^T \alpha,

where C is the parameter controlling the trade-off between the training error and the norm of the slack variables, \alpha is the vector of Lagrange multipliers, and K(x_i, x_j) is the kernel function [17] introduced into SVMs to handle nonlinear problems. The radial basis function (RBF) kernel is considered a superior choice [18]:

    K(x, z) = \exp(-\|x - z\|^2 / \sigma^2),

where x, z are input examples and \sigma is the radius. There are several methods to construct multi-class machines from binary classification SVMs; among them, the one-against-one method is recommended [19]. If there are k classes in the data set, k(k-1)/2 binary SVMs are trained, one on each pair of class labels. In this work, the one-against-one method is used
to build the multi-class machine, and voting with the maximum-win strategy is used to predict the labels of the test examples.

2.1 The previous work

Several embedded feature selection methods using binary classification SVMs have been proposed. Guyon et al. proposed optimal brain damage as the selection criterion [5], and Rakotomamonjy studied optimal brain damage further and proved it better than the other measures proposed before [10]. Optimal brain damage (OBD), proposed by LeCun et al. [20], uses the change of the objective function as the selection criterion, defined as the second order term of the Taylor series of the objective function:

    S_i = \frac{1}{2} \frac{\partial^2 L}{\partial w_i^2} (D w_i)^2,

in which L is the objective function of the learning machine and w_i is the weight of feature i. OBD has been used for feature selection in artificial neural networks and has obtained satisfactory results [21]; in binary classification SVMs, OBD has performed well on gene analysis problems [5]. For binary classification SVMs, the OBD measure is defined [5] as

    S_i = \frac{1}{2} \alpha^T K(x_k, x_h) \alpha - \frac{1}{2} \alpha^T K(x_k^{(-i)}, x_h^{(-i)}) \alpha,

where \alpha is the vector of Lagrange multipliers of the SVM and the superscript (-i) in K(x_k^{(-i)}, x_h^{(-i)}) means that component i has been removed. The feature with the smallest S_i is removed.

The methods proposed in the previous works are based on binary classification SVMs. To extend them to multi-class SVMs, the measures of each individual binary SVM must be computed; one way is to sum the measures of the individual SVMs for the corresponding features and remove the features with the smallest sums. However, all these methods are based on the individual binary SVMs rather than on the multi-class machine itself, so we propose the prediction risk based feature selection method for multi-class problems, which uses the multi-class SVM directly.
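As a concrete illustration, the OBD measure above can be sketched against a trained binary SVM with scikit-learn, which stores the products y_k alpha_k for the support vectors in `dual_coef_`. This is a hedged sketch: the helper names are ours, and scikit-learn's SVC implements the standard 1-norm soft margin rather than the 2-norm variant above, so the scores are illustrative rather than the paper's exact values.

```python
import numpy as np
from sklearn.svm import SVC

def rbf_kernel(A, B, sigma=0.5):
    """K(x, z) = exp(-||x - z||^2 / sigma^2), as in the text."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / sigma ** 2)

def obd_scores(clf, X, sigma=0.5):
    """OBD measure per feature for a fitted binary SVC:
    S_i = 1/2 a^T K a - 1/2 a^T K^(-i) a, where K^(-i) is the Gram
    matrix of the support vectors with feature i removed and
    a = y_k * alpha_k (exposed by scikit-learn as clf.dual_coef_)."""
    sv = X[clf.support_]                  # support vectors
    a = clf.dual_coef_.ravel()            # y_k * alpha_k
    full = 0.5 * a @ rbf_kernel(sv, sv, sigma) @ a
    scores = np.empty(X.shape[1])
    for i in range(X.shape[1]):
        reduced = np.delete(sv, i, axis=1)  # drop feature i
        scores[i] = full - 0.5 * a @ rbf_kernel(reduced, reduced, sigma) @ a
    return scores
```

Summing such per-feature scores over the k(k-1)/2 binary machines gives the multi-class extension in the style of Weston et al. [11]; the feature with the smallest sum is removed.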
2.2 Prediction risk based feature selection method

The prediction risk based feature selection method, proposed by Moody et al. [12], evaluates the features by computing the change of the training error when each feature is replaced by its mean value:

    S_i = ERR(\bar{x}^i) - ERR,
where ERR is the training error and ERR(\bar{x}^i) is the error on the training sample with feature i replaced by its mean, defined as

    ERR(\bar{x}^i) = \frac{1}{N} \sum_{j=1}^{N} \big( \tilde{y}(x_j^1, \ldots, \bar{x}^i, \ldots, x_j^M) \neq y_j \big),

in which M and N are the numbers of features and instances respectively, \bar{x}^i is the mean value of the ith feature, and \tilde{y}(\cdot) is the predicted value of the jth example with its ith feature replaced by that mean. The feature with the smallest S_i is removed, because changing it causes the least error, which indicates that it is the least important one. This measure has been used for feature selection in regularized feedforward neural networks and obtained better results than other measures such as fuzzy gain and output sensitivity [22].

To remove features effectively, we use the sequential backward search algorithm [23], which removes one feature per step according to the measures. The algorithm used in this paper is named SVM-SBS in the following. The best feature subset is the one with the least test error on the test sample.

Algorithm SVM-SBS. The surviving feature subset u = [1, 2, ..., M], the discarded feature list r = [ ], and the test error list e = [ ] are initialized first. Then the training sample x_r0 = [x_r^1, ..., x_r^i, ..., x_r^M]^T with target values y_r and the test sample x_s0 with target values y_s are input into SVM-SBS.
Step 1: Restrict the training sample to the surviving feature indices, x_r = x_r(:, u); in the first iteration, x_r = x_r0.
Step 2: Train the multi-class machine to get M-SVM(x_r, y_r).
Step 3: Test the model on the test sample, compute the classification error rate e_t = M-SVM(x_s(:, u), y_s), and update the error list e = [e_t, e].
Step 4: Compute the selection criterion S_i for all i on the training sample, using one of the evaluation methods of the two subsections above.
Step 5: Find the feature with the smallest selection criterion, h = arg min(S).
Step 6: Update the discarded feature list r = [u(h), r] and eliminate the feature with the smallest criterion, u = u(1 : h-1, h+1 : length(u)). If length(u) > 1, go to Step 1.
Step 7: Output the test error list e on the test sample and the discarded feature list r.

3 Experiments on the UCI data sets

3.1 The UCI data sets used

To compare the different feature selection methods for multi-class problems using support vector machines, we use twelve of the multi-class
data sets from the UCI data repository [24]. The data sets selected for comparison are listed in Table 1. For all the data sets, we first replace symbolic values with numerical values; then all attributes are transformed into the interval [-1, 1] by an affine transformation. Finally, we split each data set equally into two parts according to the number of instances of each class: one part is used as the training sample and the other as the test sample. This operation is performed 100 times.

Table 1. The UCI data sets used for comparison, with their numbers of instances, attributes, and classes: all-bp, all-hyper, all-hypo, backup, fisher, glass, lung, processed-cl, processed-va, soybean-l, soybean-s, stepp-order.

3.2 Experimental methods and results

To compare the two feature selection methods, we choose the same parameters C = 100 and \sigma = 0.5 for the SVMs on all data sets; although these are not the optimal parameters, we consider them reasonable. The OBD based and prediction risk based feature selection methods are run on the data sets with the SVM-SBS algorithm: each evaluation method is applied within SVM-SBS to select features on the training sample, and the test error of the selected feature subset is computed on the corresponding test sample. The test error is defined as the classification error rate

    ERR(x_s) = \frac{1}{N} \sum_{j=1}^{N} \big( \tilde{y}_j(x_{sj}) \neq y_j \big),

where N is the number of test instances and \tilde{y}_j is the predicted value of x_{sj}. This calculation is performed 100 times, and the average error and its standard deviation are computed for each size of feature subset. The least average error of each data set and its corresponding standard deviation are listed in Table 2.
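The procedure evaluated in these experiments, the prediction risk criterion inside the SVM-SBS loop with a one-against-one RBF SVM, can be sketched as follows. This is a hedged sketch: the function names are ours, the data are assumed, and \sigma = 0.5 corresponds to gamma = 1/\sigma^2 = 4 in scikit-learn's RBF convention.

```python
import numpy as np
from sklearn.svm import SVC

def prediction_risk(clf, X, y):
    """S_i = ERR(feature i replaced by its mean) - ERR on the training
    sample; the smallest S_i marks the least important feature."""
    base = np.mean(clf.predict(X) != y)
    scores = np.empty(X.shape[1])
    for i in range(X.shape[1]):
        Xm = X.copy()
        Xm[:, i] = X[:, i].mean()           # replace feature i by its mean
        scores[i] = np.mean(clf.predict(Xm) != y) - base
    return scores

def svm_sbs(Xr, yr, Xs, ys):
    """SVM-SBS: train a one-against-one RBF SVM on the surviving
    features, record the test error, drop the feature with the
    smallest criterion, and repeat until one feature remains."""
    u = list(range(Xr.shape[1]))            # surviving feature indices
    r, e = [], []                           # discarded features, test errors
    while len(u) > 1:
        clf = SVC(C=100.0, kernel="rbf", gamma=4.0,
                  decision_function_shape="ovo").fit(Xr[:, u], yr)
        e.insert(0, np.mean(clf.predict(Xs[:, u]) != ys))
        s = prediction_risk(clf, Xr[:, u], yr)
        r.insert(0, u.pop(int(np.argmin(s))))
    return e, r
```

The best feature subset is then read off as the one with the least test error in `e`, as described for SVM-SBS above.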
Since prediction risk based feature selection is an embedded method, the computation is efficient and consists mainly of the training of the SVMs. The CPU time of each selection run of the SVM-SBS algorithm varies across data sets and is no more than one second on a computer with an Intel PIV 1.2 GHz CPU and 512 MB of memory.

Table 2. Statistical results of the test error on the UCI data sets for the prediction risk based method, the OBD based method, and the full feature set (mean ± standard deviation for each data set, with the average over all data sets).

From Table 2 we can see that: 1) compared with the total feature set, both methods significantly reduce the classification error rate on all but one data set; 2) on seven of the twelve data sets the prediction risk based method obtains better results than the OBD based method, and on four data sets it performs worse; 3) on average, the prediction risk based method is about 2 percent better than OBD in the average error and about 3 percent better in the standard deviation.

A typical selection process, on the backup data set, is plotted in Figure 1, which shows the average error and the corresponding standard deviation. From Figure 1 we can see that the test error first decreases and then increases as more features are eliminated.

4 Discussions

A prediction risk based feature selection method for multi-class problems using support vector machines has been proposed; it obtains better results than the optimal brain damage based feature selection method on twelve multi-class data sets from the UCI data repository.
We think two factors may account for the better performance of the proposed method. One is that the two feature selection methods are based on two different measures, prediction risk and optimal brain damage; the former behaves like a wrapper method [2], which can obtain the least error for a specific learning machine. The second is that the prediction risk based method uses the whole multi-class support vector machine to evaluate the features, whereas the optimal brain damage method is based on binary SVMs and needs auxiliary techniques to evaluate the features; perhaps a different auxiliary method could help it obtain better results.

We can also see that both feature selection methods greatly reduce the test error on the data sets used; the reduction is about 13 percent of the test error on the total feature set. This indicates that almost all the data sets contain redundant or even noisy features that hurt the performance of the learning machine, so feature selection should be performed on all the data sets.

Feature selection using support vector machines is a general method that makes no assumption about the data distribution. However, one or several outlier examples may cause unexpected results, so outlier detection should be considered before feature selection is performed in real-world applications. In addition, how the proposed method compares with other feature selection methods, such as spectral clustering and mutual information based methods, is still an open issue that needs thorough investigation.

Acknowledgments. This work is financially supported by the Natural Science Foundation of China. Thanks also go to the anonymous reviewers for their valuable advice.

References

1. Dash, M., Liu, H.: Feature selection for classification.
Intelligent Data Analysis 1 (1997)
2. Kohavi, R., George, J.H.: Wrappers for feature subset selection. Artificial Intelligence 97 (1997)
3. Guyon, I., Elisseeff, A.: An introduction to variable and feature selection. Journal of Machine Learning Research 3 (2003)
4. Reed, R.: Pruning algorithms - a survey. IEEE Transactions on Neural Networks 4 (1993)
5. Guyon, I., Weston, J., Barnhill, S., Vapnik, V.: Gene selection for cancer classification using support vector machines. Machine Learning 46 (2002)
6. Haykin, S.: Neural Networks: A Comprehensive Foundation. 2nd edn. Prentice Hall, New Jersey (1999)
7. Cristianini, N., Shawe-Taylor, J.: An Introduction to Support Vector Machines. Cambridge University Press, Cambridge (2000)
8. Vapnik, V.: Statistical Learning Theory. Wiley, New York (1998)
9. Weston, J., Mukherjee, S., Chapelle, O., Pontil, M., Poggio, T., Vapnik, V.: Feature selection for SVMs. In: Advances in Neural Information Processing Systems. Volume 13. (2001)
10. Rakotomamonjy, A.: Variable selection using SVM-based criteria. Journal of Machine Learning Research 3 (2003)
11. Weston, J., Elisseeff, A., Bakir, G., Sinz, F.: The Spider. (2004)
12. Moody, J., Utans, J.: Principled architecture selection for neural networks: Application to corporate bond rating prediction. In Moody, J.E., Hanson, S.J., Lippmann, R.P., eds.: Advances in Neural Information Processing Systems. Volume 4. Morgan Kaufmann (1992)
13. LeCun, Y., Jackel, L.D., Bottou, L., Brunot, A., Cortes, C., Denker, J.S., Drucker, H., Guyon, I., Müller, U.A., Säckinger, E., Simard, P., Vapnik, V.: Comparison of learning algorithms for handwritten digit recognition. In Fogelman-Soulié, F., Gallinari, P., eds.: Proceedings of ICANN'95, International Conference on Artificial Neural Networks. Volume II. (1995)
14. Joachims, T.: Text categorization with support vector machines. In: Proceedings of the European Conference on Machine Learning (ECML). (1998)
15. Pontil, M., Verri, A.: Object recognition with support vector machines. IEEE Transactions on PAMI 20 (1998)
16. El-Naqa, I., Yang, Y., Wernick, M.N., Galatsanos, N.P., Nishikawa, R.: Support vector machine learning for detection of microcalcifications in mammograms. In: Proceedings of the IEEE International Symposium on Biomedical Imaging. (2002)
17. Mercer, J.: Functions of positive and negative type and their connection with the theory of integral equations. Philosophical Transactions of the Royal Society of London A 209 (1909)
18. Keerthi, S.S., Lin, C.J.: Asymptotic behaviors of support vector machines with Gaussian kernel. Neural Computation 15 (2003)
19. Hsu, C.W., Lin, C.J.: A comparison of methods for multi-class support vector machines. IEEE Transactions on Neural Networks 13 (2002)
20. LeCun, Y., Denker, J.S., Solla, S.A.: Optimal brain damage.
In Touretzky, D., ed.: Advances in Neural Information Processing Systems. Morgan Kaufmann (1990)
21. Cibas, T., Soulié, F., Gallinari, P.: Variable selection with neural networks. Neurocomputing 12 (1996)
22. Verikas, A., Bacauskiene, M.: Feature selection with neural networks. Pattern Recognition Letters 23 (2002)
23. Marill, T., Green, D.M.: On the effectiveness of receptors in recognition systems. IEEE Transactions on Information Theory 9 (1963)
24. Blake, C., Keogh, E., Merz, C.J.: UCI repository of machine learning databases. Technical report, Department of Information and Computer Science, University of California, Irvine, CA (1998). mlearn/mlrepository.htm
Fig. 1. The feature selection process of the embedded algorithms on the backup data set: the average error (left) and the standard deviation (right) versus the number of eliminated features, for the prediction risk based and the optimal brain damage based methods.
Data Mining Introduction Hamid Beigy Sharif University of Technology Fall 1394 Hamid Beigy (Sharif University of Technology) Data Mining Fall 1394 1 / 20 Table of contents 1 Introduction 2 Data mining
More informationSVM in Analysis of Cross-Sectional Epidemiological Data Dmitriy Fradkin. April 4, 2005 Dmitriy Fradkin, Rutgers University Page 1
SVM in Analysis of Cross-Sectional Epidemiological Data Dmitriy Fradkin April 4, 2005 Dmitriy Fradkin, Rutgers University Page 1 Overview The goals of analyzing cross-sectional data Standard methods used
More informationFace Recognition Using Vector Quantization Histogram and Support Vector Machine Classifier Rong-sheng LI, Fei-fei LEE *, Yan YAN and Qiu CHEN
2016 International Conference on Artificial Intelligence: Techniques and Applications (AITA 2016) ISBN: 978-1-60595-389-2 Face Recognition Using Vector Quantization Histogram and Support Vector Machine
More informationOptimal Brain Damage. Yann Le Cun, John S. Denker and Sara A. Solla. presented by Chaitanya Polumetla
Optimal Brain Damage Yann Le Cun, John S. Denker and Sara A. Solla presented by Chaitanya Polumetla Overview Introduction Need for OBD The Idea Authors Proposal Why OBD could work? Experiments Results
More informationInduction of Multivariate Decision Trees by Using Dipolar Criteria
Induction of Multivariate Decision Trees by Using Dipolar Criteria Leon Bobrowski 1,2 and Marek Krȩtowski 1 1 Institute of Computer Science, Technical University of Bia lystok, Poland 2 Institute of Biocybernetics
More informationScale-Invariance of Support Vector Machines based on the Triangular Kernel. Abstract
Scale-Invariance of Support Vector Machines based on the Triangular Kernel François Fleuret Hichem Sahbi IMEDIA Research Group INRIA Domaine de Voluceau 78150 Le Chesnay, France Abstract This paper focuses
More informationEvaluation of Performance Measures for SVR Hyperparameter Selection
Evaluation of Performance Measures for SVR Hyperparameter Selection Koen Smets, Brigitte Verdonk, Elsa M. Jordaan Abstract To obtain accurate modeling results, it is of primal importance to find optimal
More informationNovel Initialisation and Updating Mechanisms in PSO for Feature Selection in Classification
Novel Initialisation and Updating Mechanisms in PSO for Feature Selection in Classification Bing Xue, Mengjie Zhang, and Will N. Browne School of Engineering and Computer Science Victoria University of
More informationFlexible-Hybrid Sequential Floating Search in Statistical Feature Selection
Flexible-Hybrid Sequential Floating Search in Statistical Feature Selection Petr Somol 1,2, Jana Novovičová 1,2, and Pavel Pudil 2,1 1 Dept. of Pattern Recognition, Institute of Information Theory and
More informationUnsupervised Feature Selection for Sparse Data
Unsupervised Feature Selection for Sparse Data Artur Ferreira 1,3 Mário Figueiredo 2,3 1- Instituto Superior de Engenharia de Lisboa, Lisboa, PORTUGAL 2- Instituto Superior Técnico, Lisboa, PORTUGAL 3-
More informationFeature scaling in support vector data description
Feature scaling in support vector data description P. Juszczak, D.M.J. Tax, R.P.W. Duin Pattern Recognition Group, Department of Applied Physics, Faculty of Applied Sciences, Delft University of Technology,
More informationCS570: Introduction to Data Mining
CS570: Introduction to Data Mining Classification Advanced Reading: Chapter 8 & 9 Han, Chapters 4 & 5 Tan Anca Doloc-Mihu, Ph.D. Slides courtesy of Li Xiong, Ph.D., 2011 Han, Kamber & Pei. Data Mining.
More informationSupport Vector Machines
Support Vector Machines RBF-networks Support Vector Machines Good Decision Boundary Optimization Problem Soft margin Hyperplane Non-linear Decision Boundary Kernel-Trick Approximation Accurancy Overtraining
More informationEfficient Pruning Method for Ensemble Self-Generating Neural Networks
Efficient Pruning Method for Ensemble Self-Generating Neural Networks Hirotaka INOUE Department of Electrical Engineering & Information Science, Kure National College of Technology -- Agaminami, Kure-shi,
More informationIntroduction The problem of cancer classication has clear implications on cancer treatment. Additionally, the advent of DNA microarrays introduces a w
MASSACHUSETTS INSTITUTE OF TECHNOLOGY ARTIFICIAL INTELLIGENCE LABORATORY and CENTER FOR BIOLOGICAL AND COMPUTATIONAL LEARNING DEPARTMENT OF BRAIN AND COGNITIVE SCIENCES A.I. Memo No.677 C.B.C.L Paper No.8
More informationFisher Score Dimensionality Reduction for Svm Classification Arunasakthi. K, KamatchiPriya.L, Askerunisa.A
ISSN (Online) : 2319-8753 ISSN (Print) : 2347-6710 International Journal of Innovative Research in Science, Engineering and Technology Volume 3, Special Issue 3, March 2014 2014 International Conference
More informationMachine Learning: Think Big and Parallel
Day 1 Inderjit S. Dhillon Dept of Computer Science UT Austin CS395T: Topics in Multicore Programming Oct 1, 2013 Outline Scikit-learn: Machine Learning in Python Supervised Learning day1 Regression: Least
More informationTable of Contents. Recognition of Facial Gestures... 1 Attila Fazekas
Table of Contents Recognition of Facial Gestures...................................... 1 Attila Fazekas II Recognition of Facial Gestures Attila Fazekas University of Debrecen, Institute of Informatics
More informationModule 4. Non-linear machine learning econometrics: Support Vector Machine
Module 4. Non-linear machine learning econometrics: Support Vector Machine THE CONTRACTOR IS ACTING UNDER A FRAMEWORK CONTRACT CONCLUDED WITH THE COMMISSION Introduction When the assumption of linearity
More informationSupport Vector Machines for Face Recognition
Chapter 8 Support Vector Machines for Face Recognition 8.1 Introduction In chapter 7 we have investigated the credibility of different parameters introduced in the present work, viz., SSPD and ALR Feature
More informationRadial Basis Function Neural Network Classifier
Recognition of Unconstrained Handwritten Numerals by a Radial Basis Function Neural Network Classifier Hwang, Young-Sup and Bang, Sung-Yang Department of Computer Science & Engineering Pohang University
More informationClass-Specific Feature Selection for One-Against-All Multiclass SVMs
Class-Specific Feature Selection for One-Against-All Multiclass SVMs Gaël de Lannoy and Damien François and Michel Verleysen Université catholique de Louvain Institute of Information and Communication
More informationDECISION-TREE-BASED MULTICLASS SUPPORT VECTOR MACHINES. Fumitake Takahashi, Shigeo Abe
DECISION-TREE-BASED MULTICLASS SUPPORT VECTOR MACHINES Fumitake Takahashi, Shigeo Abe Graduate School of Science and Technology, Kobe University, Kobe, Japan (E-mail: abe@eedept.kobe-u.ac.jp) ABSTRACT
More informationFeature Ranking Using Linear SVM
JMLR: Workshop and Conference Proceedings 3: 53-64 WCCI2008 workshop on causality Feature Ranking Using Linear SVM Yin-Wen Chang Chih-Jen Lin Department of Computer Science, National Taiwan University
More informationFeature Selection for Supervised Classification: A Kolmogorov- Smirnov Class Correlation-Based Filter
Feature Selection for Supervised Classification: A Kolmogorov- Smirnov Class Correlation-Based Filter Marcin Blachnik 1), Włodzisław Duch 2), Adam Kachel 1), Jacek Biesiada 1,3) 1) Silesian University
More informationA Support Vector Method for Hierarchical Clustering
A Support Vector Method for Hierarchical Clustering Asa Ben-Hur Faculty of IE and Management Technion, Haifa 32, Israel David Horn School of Physics and Astronomy Tel Aviv University, Tel Aviv 69978, Israel
More informationSSV Criterion Based Discretization for Naive Bayes Classifiers
SSV Criterion Based Discretization for Naive Bayes Classifiers Krzysztof Grąbczewski kgrabcze@phys.uni.torun.pl Department of Informatics, Nicolaus Copernicus University, ul. Grudziądzka 5, 87-100 Toruń,
More informationClassification by Support Vector Machines
Classification by Support Vector Machines Florian Markowetz Max-Planck-Institute for Molecular Genetics Computational Molecular Biology Berlin Practical DNA Microarray Analysis 2003 1 Overview I II III
More informationNelder-Mead Enhanced Extreme Learning Machine
Philip Reiner, Bogdan M. Wilamowski, "Nelder-Mead Enhanced Extreme Learning Machine", 7-th IEEE Intelligent Engineering Systems Conference, INES 23, Costa Rica, June 9-2., 29, pp. 225-23 Nelder-Mead Enhanced
More informationKernel Combination Versus Classifier Combination
Kernel Combination Versus Classifier Combination Wan-Jui Lee 1, Sergey Verzakov 2, and Robert P.W. Duin 2 1 EE Department, National Sun Yat-Sen University, Kaohsiung, Taiwan wrlee@water.ee.nsysu.edu.tw
More informationSupport Vector Machines (a brief introduction) Adrian Bevan.
Support Vector Machines (a brief introduction) Adrian Bevan email: a.j.bevan@qmul.ac.uk Outline! Overview:! Introduce the problem and review the various aspects that underpin the SVM concept.! Hard margin
More informationVariable Selection 6.783, Biomedical Decision Support
6.783, Biomedical Decision Support (lrosasco@mit.edu) Department of Brain and Cognitive Science- MIT November 2, 2009 About this class Why selecting variables Approaches to variable selection Sparsity-based
More informationAnalysis of SAGE Results with Combined Learning Techniques
Analysis of SAGE Results with Combined Learning Techniques Hsuan-Tien Lin and Ling Li Learning Systems Group, California Institute of Technology, USA htlin@caltech.edu, ling@caltech.edu Abstract. Serial
More informationA Feature Selection Method to Handle Imbalanced Data in Text Classification
A Feature Selection Method to Handle Imbalanced Data in Text Classification Fengxiang Chang 1*, Jun Guo 1, Weiran Xu 1, Kejun Yao 2 1 School of Information and Communication Engineering Beijing University
More informationA Hybrid Feature Selection Algorithm Based on Information Gain and Sequential Forward Floating Search
A Hybrid Feature Selection Algorithm Based on Information Gain and Sequential Forward Floating Search Jianli Ding, Liyang Fu School of Computer Science and Technology Civil Aviation University of China
More informationChapter 22 Information Gain, Correlation and Support Vector Machines
Chapter 22 Information Gain, Correlation and Support Vector Machines Danny Roobaert, Grigoris Karakoulas, and Nitesh V. Chawla Customer Behavior Analytics Retail Risk Management Canadian Imperial Bank
More informationJ. Weston, A. Gammerman, M. Stitson, V. Vapnik, V. Vovk, C. Watkins. Technical Report. February 5, 1998
Density Estimation using Support Vector Machines J. Weston, A. Gammerman, M. Stitson, V. Vapnik, V. Vovk, C. Watkins. Technical Report CSD-TR-97-3 February 5, 998!()+, -./ 3456 Department of Computer Science
More informationClassification by Support Vector Machines
Classification by Support Vector Machines Florian Markowetz Max-Planck-Institute for Molecular Genetics Computational Molecular Biology Berlin Practical DNA Microarray Analysis 2003 1 Overview I II III
More informationFeature selection in environmental data mining combining Simulated Annealing and Extreme Learning Machine
Feature selection in environmental data mining combining Simulated Annealing and Extreme Learning Machine Michael Leuenberger and Mikhail Kanevski University of Lausanne - Institute of Earth Surface Dynamics
More informationEvaluating the SVM Component in Oracle 10g Beta
Evaluating the SVM Component in Oracle 10g Beta Dept. of Computer Science and Statistics University of Rhode Island Technical Report TR04-299 Lutz Hamel and Angela Uvarov Department of Computer Science
More informationNeural Networks. CE-725: Statistical Pattern Recognition Sharif University of Technology Spring Soleymani
Neural Networks CE-725: Statistical Pattern Recognition Sharif University of Technology Spring 2013 Soleymani Outline Biological and artificial neural networks Feed-forward neural networks Single layer
More informationA Practical Guide to Support Vector Classification
A Practical Guide to Support Vector Classification Chih-Wei Hsu, Chih-Chung Chang, and Chih-Jen Lin Department of Computer Science and Information Engineering National Taiwan University Taipei 106, Taiwan
More informationLecture 9: Support Vector Machines
Lecture 9: Support Vector Machines William Webber (william@williamwebber.com) COMP90042, 2014, Semester 1, Lecture 8 What we ll learn in this lecture Support Vector Machines (SVMs) a highly robust and
More informationFuzzy-Kernel Learning Vector Quantization
Fuzzy-Kernel Learning Vector Quantization Daoqiang Zhang 1, Songcan Chen 1 and Zhi-Hua Zhou 2 1 Department of Computer Science and Engineering Nanjing University of Aeronautics and Astronautics Nanjing
More informationGene Expression Based Classification using Iterative Transductive Support Vector Machine
Gene Expression Based Classification using Iterative Transductive Support Vector Machine Hossein Tajari and Hamid Beigy Abstract Support Vector Machine (SVM) is a powerful and flexible learning machine.
More informationCluster homogeneity as a semi-supervised principle for feature selection using mutual information
Cluster homogeneity as a semi-supervised principle for feature selection using mutual information Frederico Coelho 1 and Antonio Padua Braga 1 andmichelverleysen 2 1- Universidade Federal de Minas Gerais
More information5 Learning hypothesis classes (16 points)
5 Learning hypothesis classes (16 points) Consider a classification problem with two real valued inputs. For each of the following algorithms, specify all of the separators below that it could have generated
More informationWrapper Feature Selection using Discrete Cuckoo Optimization Algorithm Abstract S.J. Mousavirad and H. Ebrahimpour-Komleh* 1 Department of Computer and Electrical Engineering, University of Kashan, Kashan,
More informationSupport Vector Machines and their Applications
Purushottam Kar Department of Computer Science and Engineering, Indian Institute of Technology Kanpur. Summer School on Expert Systems And Their Applications, Indian Institute of Information Technology
More information