Proceedings of International Joint Conference on Neural Networks, Dallas, Texas, USA, August 4-9, 2013

Maximal Margin Learning Vector Quantisation

Trung Le, Dat Tran, Van Nguyen, and Wanli Ma

Trung Le and Van Nguyen are with the Faculty of Information Technology, HCMC University of Pedagogy, Ho Chi Minh City, Vietnam ({trunglm, vannk}@hcmup.edu.vn). Dat Tran and Wanli Ma are with the Faculty of Education, Science, Technology and Mathematics, University of Canberra, Australia ({dat.tran, wanli.ma}@canberra.edu.au).

Abstract: Kernel Generalised Learning Vector Quantisation (KGLVQ) was proposed to extend Generalised Learning Vector Quantisation into the kernel feature space in order to deal with complex class boundaries, and it has yielded promising performance for complex classification tasks in pattern recognition. However, KGLVQ does not follow the maximal margin principle, which is crucial for kernel-based learning methods. In this paper we propose a maximal margin approach (MLVQ) to the KGLVQ algorithm. MLVQ inherits the merits of KGLVQ and also follows the maximal margin principle to improve the generalisation capability. Experiments performed on well-known data sets from the UCI repository show promising classification results for the proposed method.

I. INTRODUCTION

SELF-ORGANIZING methods such as the Self-Organizing Map (SOM) or Learning Vector Quantisation (LVQ) introduced by Kohonen [8] provide a successful and intuitive way of processing data for easy access [6]. LVQ aims at generating prototypes, or reference vectors, which represent the data of the classes [7]. Although LVQ is a fast and simple learning algorithm, its prototypes sometimes diverge and, as a result, degrade its recognition ability [12]. To address this problem, Generalised Learning Vector Quantisation (GLVQ) [12] was proposed. It is a generalisation of the original model proposed by Kohonen, in which the prototypes are updated by the steepest descent method to minimise a cost function. GLVQ has been widely applied and has shown good performance in many applications [9], [11], [12]. However, its performance deteriorates on complex data sets, since pattern classes with nonlinear class boundaries usually need a large number of prototypes. It thus becomes very difficult to determine a reasonable number of prototypes and their positions in order to achieve good generalisation performance [10].

To overcome this drawback, Kernel Generalised Learning Vector Quantisation (KGLVQ) was proposed in [10]; it learns the prototypes of the data in the feature space. Like LVQ and GLVQ, KGLVQ can be used for two-class and multi-class classification problems. In the two-class case, the entire feature space is divided into subspaces induced by two core prototypes, and in each subspace the mid-perpendicular hyperplane of the two core prototypes is employed to classify the data. However, the hyperplanes of KGLVQ do not guarantee maximal margins, which are crucial for kernel methods [13], [14], [15].

In this paper, we propose a maximal margin approach to KGLVQ, and we name it MLVQ. It takes advantage of maximising margins to improve the generalisation capability, as in the Support Vector Machine [3], [1]. MLVQ differs from the approach in [4], which maximises the hypothesis margin rather than the real margin. In our approach, finite numbers of prototypes m and n are used to represent the positive and the negative classes, respectively, in a binary data set.
The entire feature space is divided into m x n subspaces, each induced by a pair of prototypes, one from each class. In each subspace, the mid-perpendicular hyperplane of the two corresponding prototypes is employed to classify the data. The cost function in our approach takes into account maximising the margins of these hyperplanes in order to boost the generalisation capability. Experiments performed on 9 data sets from the UCI repository show a promising performance of the proposed method.

II. MAXIMAL MARGIN KERNEL GENERALISED LEARNING VECTOR QUANTISATION

A. Introduction

Consider a binary training set $X = \{(x_1, y_1), (x_2, y_2), \ldots, (x_l, y_l)\}$, where $x_1, x_2, \ldots, x_l \in \mathbb{R}^d$ are data points and $y_1, y_2, \ldots, y_l \in \{-1, 1\}$ are labels. This training set is mapped into a high-dimensional space, namely the feature space, through a function $\phi(\cdot)$. Based on the idea of Vector Quantisation (VQ), $m$ prototypes $A_1, A_2, \ldots, A_m$ of the positive class and $n$ prototypes $B_1, B_2, \ldots, B_n$ of the negative class will be discovered in the feature space. The classification is based on the minimum distance to the prototypes in each class. More precisely, given a new vector $x$, the decision function is as follows:

$$f(x) = \mathrm{sign}\left(\|\phi(x) - b_{j_0}\|^2 - \|\phi(x) - a_{i_0}\|^2\right) \quad (1)$$

where $i_0 = \arg\min_{1 \le i \le m} \|\phi(x) - a_i\|^2$, $j_0 = \arg\min_{1 \le j \le n} \|\phi(x) - b_j\|^2$, and $a_i$, $b_j$ are the coordinates of $A_i$, $B_j$, $i = 1, \ldots, m$; $j = 1, \ldots, n$, respectively.
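The decision rule (1) is a nearest-prototype comparison between the two classes. As a minimal sketch (ours, not the authors' implementation), the input-space version, where $\phi$ is the identity and the prototypes are ordinary vectors in $\mathbb{R}^d$, can be written as follows; the kernelised distances of Section II-E replace the Euclidean ones in the feature-space case, and all names are illustrative.

```python
import numpy as np

def mlvq_decide(x, A, B):
    """Decision rule of Eq. (1) in the input space.

    x : (d,) query point
    A : (m, d) prototypes of the positive class
    B : (n, d) prototypes of the negative class
    Returns +1 or -1.
    """
    d_pos = np.sum((A - x) ** 2, axis=1)   # ||x - a_i||^2, i = 1..m
    d_neg = np.sum((B - x) ** 2, axis=1)   # ||x - b_j||^2, j = 1..n
    # f(x) = sign(||x - b_{j0}||^2 - ||x - a_{i0}||^2)
    return int(np.sign(d_neg.min() - d_pos.min()))

# Toy usage: one prototype per class.
A = np.array([[1.0, 1.0]])
B = np.array([[-1.0, -1.0]])
print(mlvq_decide(np.array([0.8, 0.5]), A, B))   # -> 1 (closer to the positive prototype)
```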

B. Optimisation Problem

Given a labeled training vector $(x, y)$, let us denote by $a$ and $b$ the two prototypes of the positive class and the negative class, respectively, that are closest to $\phi(x)$. Let $\mu(x, a, b)$ be a function which satisfies the following criterion: if $x$ is correctly classified then $\mu(x, a, b) < 0$; otherwise $\mu(x, a, b) \ge 0$. Let $g$ be a monotonically increasing function. To improve the error rate, $\mu(x, a, b)$ should decrease for all training vectors. Therefore, the criterion is formulated as minimising the following function:

$$\min_{\{A\},\{B\}} \sum_{i=1}^{l} g\left(\mu\left(x_i, a^{(i)}, b^{(i)}\right)\right) \quad (2)$$

where $\{A\}$ and $\{B\}$ are the sequences $\{A_1, A_2, \ldots, A_m\}$ and $\{B_1, B_2, \ldots, B_n\}$, respectively, and $a^{(i)}$ and $b^{(i)}$ are the two prototypes of the two classes which are closest to $\phi(x_i)$.

C. Solution

Assuming that the prototypes are linear expansions of the vectors $\phi(x_1), \phi(x_2), \ldots, \phi(x_l)$, let us denote by $a_i$, $i = 1, \ldots, m$ and $b_j$, $j = 1, \ldots, n$ the coordinates of the prototypes:

$$a_i = \sum_{k=1}^{l} u_{ik}\,\phi(x_k), \quad i = 1, \ldots, m \qquad\qquad b_j = \sum_{k=1}^{l} v_{jk}\,\phi(x_k), \quad j = 1, \ldots, n \quad (3)$$

For convenience, if $c = \sum_{i=1}^{l} u_i\,\phi(x_i)$, we rewrite $c$ as $c = [u_1, u_2, \ldots, u_l] = [u_i]_{i=1,\ldots,l}$.

Given a labeled training vector $(x, y)$, we first determine the two closest prototypes $A$ and $B$ of the two classes with respect to $x$, and then use the gradient descent method to update the coordinates $a$ and $b$ of $A$ and $B$, respectively, as follows:

$$a = a - \alpha \frac{\partial g}{\partial a}, \qquad b = b - \alpha \frac{\partial g}{\partial b} \quad (4)$$

We now introduce the algorithm for Vector Quantisation Support Vector Machine.

ALGORITHM FOR VECTOR QUANTISATION SUPPORT VECTOR MACHINE
Initialise: use C-Means or Fuzzy C-Means clustering to find $m$ prototypes for the positive class and $n$ prototypes for the negative class in the input space; set $t = 0$ and $i = 0$.
Repeat
  $t = t + 1$
  $i = (i + 1) \bmod l$
  $A_t = A_{i_0}$ where $i_0 = \arg\min_{1 \le k \le m} \|\phi(x_i) - a_k\|^2$
  $B_t = B_{j_0}$ where $j_0 = \arg\min_{1 \le k \le n} \|\phi(x_i) - b_k\|^2$
  Update $a_{i_0} = a_{i_0} - \alpha\,\partial g / \partial a_{i_0}$
  Update $b_{j_0} = b_{j_0} - \alpha\,\partial g / \partial b_{j_0}$
Until convergence is reached

where the function $g = g(\mu, t)$ depends on the learning time $t$. The sigmoid function $g(\mu, t) = \frac{1}{1 + e^{-\mu t}}$ is a good candidate for $g$. If this sigmoid function is applied then $\frac{\partial g}{\partial \mu} = t\,g(\mu, t)\,(1 - g(\mu, t))$.
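Read as code, one step of the loop above picks the closest prototype of each class for the current training vector and moves both by a gradient step on $g(\mu, t)$. The sketch below is an illustrative input-space rendering, not the paper's implementation: prototypes are assumed to have been initialised beforehand (e.g. by c-means, as the algorithm prescribes), the sigmoid $g(\mu, t) = 1/(1 + e^{-\mu t})$ is used, and the concrete $\mu$ and its partial derivatives are supplied as callbacks taken from the candidates of Section II-D below. All names are ours.

```python
import numpy as np

def train_lvq(X, y, A, B, mu, grad_mu, alpha=0.01, epochs=20):
    """Online update loop of Section II-C (input-space sketch).

    X, y    : training vectors and labels in {-1, +1}
    A, B    : (m, d) and (n, d) arrays of positive / negative prototypes, updated in place
    mu      : callable mu(x, y_i, a, b) -> float, one of the candidates of Section II-D
    grad_mu : callable grad_mu(x, y_i, a, b) -> (dmu_da, dmu_db)
    """
    t = 0
    for _ in range(epochs):
        for x, yi in zip(X, y):
            t += 1
            i0 = np.argmin(np.sum((A - x) ** 2, axis=1))   # closest positive prototype
            j0 = np.argmin(np.sum((B - x) ** 2, axis=1))   # closest negative prototype
            m_val = mu(x, yi, A[i0], B[j0])
            # g(mu, t) = 1 / (1 + exp(-mu * t)); clip the exponent for numerical safety
            g = 1.0 / (1.0 + np.exp(-np.clip(m_val * t, -50.0, 50.0)))
            dg_dmu = t * g * (1.0 - g)                     # dg/dmu for the sigmoid
            dmu_da, dmu_db = grad_mu(x, yi, A[i0], B[j0])
            A[i0] -= alpha * dg_dmu * dmu_da               # a <- a - alpha * dg/da
            B[j0] -= alpha * dg_dmu * dmu_db               # b <- b - alpha * dg/db
    return A, B
```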
D. Selection of the µ-function

We now introduce some candidates for the µ-function. Let $(x, y)$ be a labeled training vector, and let $a$ and $b$ be the two prototypes, one in each class, that are closest to this vector.

CANDIDATE 1 FOR THE µ-FUNCTION [8] (LVQ)

$$\mu(x, a, b) = y\left(\|\phi(x) - a\|^2 - \|\phi(x) - b\|^2\right) = y(d_1 - d_2) = \eta(d_1, d_2) \quad (5)$$

CANDIDATE 2 FOR THE µ-FUNCTION [12] (GLVQ)

$$\mu(x, a, b) = \frac{y\left(\|\phi(x) - a\|^2 - \|\phi(x) - b\|^2\right)}{\|\phi(x) - a\|^2 + \|\phi(x) - b\|^2} = \frac{y(d_1 - d_2)}{d_1 + d_2} = \eta(d_1, d_2) \quad (6)$$

where $d_1$ and $d_2$ in (5) and (6) are the squared distances from $\phi(x)$ to the two prototypes $a$ and $b$, respectively. These functions depend primarily on $d_1$ and $d_2$, and the adaptation of the prototypes in (4) can be rewritten as follows:

$$a = a - 2\alpha \frac{\partial g}{\partial \eta} \frac{\partial \eta}{\partial d_1}\,(a - \phi(x)), \qquad b = b - 2\alpha \frac{\partial g}{\partial \eta} \frac{\partial \eta}{\partial d_2}\,(b - \phi(x)) \quad (7)$$

If $\mu(x, a, b) = \eta(d_1, d_2) = y(d_1 - d_2)$, the equations in (7) become:

$$a = a - 2\alpha \frac{\partial g}{\partial \eta}\,y\,(a - \phi(x)), \qquad b = b + 2\alpha \frac{\partial g}{\partial \eta}\,y\,(b - \phi(x)) \quad (8)$$

If $\mu(x, a, b) = \eta(d_1, d_2) = \frac{y(d_1 - d_2)}{d_1 + d_2}$, the equations in (7) become:

$$a = a - \alpha \frac{\partial g}{\partial \eta} \frac{4 y d_2}{(d_1 + d_2)^2}\,(a - \phi(x)), \qquad b = b + \alpha \frac{\partial g}{\partial \eta} \frac{4 y d_1}{(d_1 + d_2)^2}\,(b - \phi(x)) \quad (9)$$

CANDIDATE 3 FOR THE µ-FUNCTION [4] (HMLVQ)

$$\mu(x, a, b) = \frac{1}{2}\,y\left(\|\phi(x) - a\| - \|\phi(x) - b\|\right) \quad (10)$$

This µ-function refers to the hypothesis margin in [4] and is used in AdaBoost [5]. The hypothesis margin measures how much the hypothesis can travel before it hits an instance, as shown in Figure 1. The partial derivatives of µ with respect to $a$ and $b$ are:

$$\frac{\partial \mu}{\partial a} = -\frac{y}{2\,\|\phi(x) - a\|}\,(\phi(x) - a), \qquad \frac{\partial \mu}{\partial b} = \frac{y}{2\,\|\phi(x) - b\|}\,(\phi(x) - b) \quad (11)$$

CANDIDATE 4 FOR THE µ-FUNCTION (MLVQ)

This is our proposed maximal margin approach MLVQ. The µ-function is of the form

$$\mu(x, a, b) = \frac{y\left(\|\phi(x) - a\|^2 - \|\phi(x) - b\|^2\right)}{\|a - b\|} \quad (12)$$

It is noted that, by referring to Theorem 1 in Appendix A, the absolute value of the µ-function in Candidate 4 is the sample margin at $\phi(x)$ in Figure 1, which is also the distance from $\phi(x)$ to the mid-perpendicular hyperplane of the prototypes $a$ and $b$. When $x$ is correctly classified, $\mu(x, a, b)$ equals the negative of the sample margin at $x$, so minimising $\mu(x, a, b)$ motivates maximising the sample margin at $x$. The partial derivatives of µ with respect to $a$ and $b$ are:

$$\frac{\partial \mu}{\partial a} = -\frac{2y}{\|a - b\|}\,(\phi(x) - a) - \frac{y\left(\|\phi(x) - a\|^2 - \|\phi(x) - b\|^2\right)}{\|a - b\|^3}\,(a - b)$$
$$\frac{\partial \mu}{\partial b} = \frac{2y}{\|a - b\|}\,(\phi(x) - b) + \frac{y\left(\|\phi(x) - a\|^2 - \|\phi(x) - b\|^2\right)}{\|a - b\|^3}\,(a - b) \quad (13)$$

Fig. 1. (a) Hypothesis margin, (b) sample margin.
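For concreteness, Candidate 4 and its gradients (12) and (13) can be written down directly for the input-space case. This is a sketch with our own function names; the two callables match the `mu` and `grad_mu` arguments expected by the training sketch in Section II-C above.

```python
import numpy as np

def mu_mlvq(x, y, a, b):
    """Candidate 4 (MLVQ), Eq. (12): y * (||x - a||^2 - ||x - b||^2) / ||a - b||."""
    d1 = np.sum((x - a) ** 2)
    d2 = np.sum((x - b) ** 2)
    return y * (d1 - d2) / np.linalg.norm(a - b)

def grad_mu_mlvq(x, y, a, b):
    """Partial derivatives of Eq. (13) with respect to a and b."""
    d1 = np.sum((x - a) ** 2)
    d2 = np.sum((x - b) ** 2)
    ab = a - b
    nrm = np.linalg.norm(ab)
    da = -2.0 * y / nrm * (x - a) - y * (d1 - d2) / nrm ** 3 * ab
    db =  2.0 * y / nrm * (x - b) + y * (d1 - d2) / nrm ** 3 * ab
    return da, db
```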

E. Decision Function

When convergence is reached, we obtain the final prototypes $a_i = [u_{ik}]_{k=1,\ldots,l}$, $i = 1, \ldots, m$ and $b_j = [v_{jk}]_{k=1,\ldots,l}$, $j = 1, \ldots, n$. For a new vector $x$, we can calculate the distances from $\phi(x)$ to the prototypes as follows:

$$d(\phi(x), a_i) = \|\phi(x) - a_i\|^2 = K(x, x) - 2\sum_{p=1}^{l} u_{ip} K(x_p, x) + \|a_i\|^2, \quad i = 1, \ldots, m$$
$$d(\phi(x), b_j) = \|\phi(x) - b_j\|^2 = K(x, x) - 2\sum_{p=1}^{l} v_{jp} K(x_p, x) + \|b_j\|^2, \quad j = 1, \ldots, n \quad (14)$$

The two closest prototypes to $\phi(x)$ and the decision function are then determined as follows:

$$i_0 = \arg\min_{1 \le i \le m}\{d(\phi(x), a_i)\}, \qquad j_0 = \arg\min_{1 \le j \le n}\{d(\phi(x), b_j)\}$$
$$f(x) = \mathrm{sign}\left(d(\phi(x), b_{j_0}) - d(\phi(x), a_{i_0})\right) \quad (15)$$
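The distances (14) and the decision rule (15) can be evaluated directly from the expansion coefficients $u_{ik}$, $v_{jk}$ and precomputed kernel values. The sketch below is an illustration rather than the authors' code; `U`, `V`, `k_x` and `K_train` are our own names.

```python
import numpy as np

def kernel_distance_sq(Kxx, k_x, U, K_train):
    """||phi(x) - a_i||^2 for every prototype, Eq. (14).

    Kxx     : scalar K(x, x)
    k_x     : (l,) vector of K(x_p, x) over the training points
    U       : (m, l) expansion coefficients, a_i = sum_p U[i, p] * phi(x_p)
    K_train : (l, l) kernel matrix K(x_p, x_q) of the training points
    """
    # ||a_i||^2 = sum_{p,q} U[i,p] * U[i,q] * K(x_p, x_q)
    norms = np.einsum('ip,pq,iq->i', U, K_train, U)
    return Kxx - 2.0 * U @ k_x + norms

def kernel_decide(Kxx, k_x, U, V, K_train):
    """Decision function of Eq. (15): sign of the difference of the closest distances."""
    d_pos = kernel_distance_sq(Kxx, k_x, U, K_train)   # distances to positive prototypes
    d_neg = kernel_distance_sq(Kxx, k_x, V, K_train)   # distances to negative prototypes
    return int(np.sign(d_neg.min() - d_pos.min()))
```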

F. The Rationale of the Proposed Margin Approach

In this section, we discuss the rationale and the advantage of our proposed method. First, we consider the case where the number of prototypes for each of the positive and negative classes is set to 1, as shown in Figure 2. Assuming that $g(\mu, t) = \mu$ is applied and referring to Theorem 1 in Appendix A, the objective function in (2) becomes $\sum_{i=1}^{l} y_i\, \mathrm{dis}(\phi(x_i), H)\,\mathrm{sign}(d_i)$, where $H$ is the hyperplane induced by the positive prototype $a_1$ and the negative prototype $b_1$, $\mathrm{dis}(\phi(x_i), H)$ stands for the distance from $\phi(x_i)$ to $H$, i.e. the sample margin at $\phi(x_i)$, and $d_i = \|\phi(x_i) - a_1\|^2 - \|\phi(x_i) - b_1\|^2$. To make it simple, let us consider a separable case, i.e., all vectors $\phi(x_i)$ are correctly classified by the hyperplane $H$. The objective function then becomes $\sum_{i=1}^{l} \left(-\mathrm{dis}(\phi(x_i), H)\right)$, and the optimisation problem is transformed into the following:

$$\min_{a_1, b_1} \sum_{i=1}^{l} \left(-\mathrm{dis}(\phi(x_i), H)\right) \quad \text{or} \quad \max_{a_1, b_1} \sum_{i=1}^{l} \mathrm{dis}(\phi(x_i), H) \quad (16)$$

This objective function is the sum of the sample margins at all vectors, not the margin of the original Support Vector Machine (SVM); however, the same quantity was considered in a variation of SVM [2]. Therefore, when using two prototypes, one for the positive and one for the negative class, the objective function of the proposed model is closely related to the margin of SVM. Furthermore, with $m$ prototypes representing the positive class and $n$ representing the negative class, the entire space is divided into $m \times n$ subspaces (receptive fields), and in each receptive field the hyperplane induced by the two corresponding prototypes is used to classify the data, as shown in Figure 3. Since the objective function (2) is minimised, the margins in the corresponding receptive fields tend to be maximised.

Fig. 2. One positive and one negative prototype are used to classify the data set.

Fig. 3. Two positive and two negative prototypes are used to classify the data set.

III. EXPERIMENTAL RESULTS

A. Data sets

We conducted experiments on 9 data sets from the UCI repository; details are shown in Table I. The LVQ algorithms with the different µ-functions mentioned above were run in both the input and the feature space, in order to compare LVQ, GLVQ and HMLVQ with our proposed MLVQ in the input space, and to compare kernel LVQ, kernel GLVQ and kernel HMLVQ with kernel MLVQ in the kernel feature space. We also compare MLVQ with SVM.

TABLE I
NUMBER OF SAMPLES IN THE 9 DATA SETS. #positive: number of positive samples, #negative: number of negative samples, d: dimension.
Data sets: Astroparticle, Australian, Breast Cancer, Fourclass, Ionosphere, Liver Disorders, SvmGuide3, USPS, Wine.

B. Parameter Settings

In our experiments we did not use the sigmoid function $g(\mu, t) = \frac{1}{1 + e^{-\mu t}}$, whose derivative is $\frac{\partial g}{\partial \mu} = t\,g\,(1 - g)$: this derivative rapidly decreases to 0 as the time $t$ approaches $+\infty$; for example, when $t = 100$ the derivative is nearly equal to 0 whenever $|\mu| > 0.1$. Instead, we applied $g(\mu, t) = \frac{1}{1 + e^{-\mu \sqrt{t}}}$, whose derivative is $\frac{\partial g}{\partial \mu} = \sqrt{t}\,g\,(1 - g)$. This function has two good features, as seen in Figures 4 and 5: 1) its derivative approaches 0 more slowly than that of the sigmoid function; 2) given $t$, if $\mu$ of a vector exceeds a predefined threshold then the derivative, i.e. the adaptation rate at this vector, is very small and the adaptation is minor.

Fig. 4. The graph of the derivative of the sigmoid function.

Fig. 5. The graph of the derivative of the new sigmoid function.

To evaluate accuracies, 5-fold cross-validation was used. The learning rate α was set to a fixed value. The numbers of positive and negative prototypes were both searched in the grid {1, 2, 3}. For the kernel LVQs and SVM, the popular RBF kernel $K(x, x') = e^{-\gamma \|x - x'\|^2}$ was used. The parameter γ was searched in the grid $\{2^k : k = 2l + 1,\ l = -8, -6, \ldots, 1\}$, and for SVM the trade-off parameter C was searched in the grid $\{2^k : k = 2l + 1,\ l = -8, -6, \ldots, 2\}$ (a sketch of this search protocol appears after Table III below).

Experimental results are displayed in Tables II and III and in Figures 6 and 7. They show that our MLVQ method performs very well in the input space. The kernel models also consistently outperform the corresponding models in the input space, which is reasonable since the data tend to be more compact in the feature space, so a few prototypes are sufficient to represent each class. The experiments also show that MLVQ in the kernel feature space and SVM are comparable; however, MLVQ is preferable because it is simpler and does not require searching over as large a range of parameters as SVM.

TABLE II
CLASSIFICATION RESULTS (IN %) ON 9 DATA SETS FOR THE 4 INPUT SPACE MODELS LVQ, GLVQ, HMLVQ AND MLVQ.

Data set          LVQ   GLVQ  HMLVQ  MLVQ
Astroparticle     66    68    70     84
Australian        82    82    83     85
Breast Cancer     94    95    95     96
Fourclass         90    90    93     88
Ionosphere        70    69    71     84
Liver Disorders   60    61    62     64
SvmGuide3         74    76    74     65
USPS              74    74    73     95
Wine              91    94    90     93

TABLE III
CLASSIFICATION RESULTS (IN %) ON 9 DATA SETS FOR THE 4 KERNEL FEATURE SPACE MODELS (KERNEL LVQ, KERNEL GLVQ, KERNEL HMLVQ, KERNEL MLVQ) AND SVM.

Data set          LVQ   GLVQ  HMLVQ  MLVQ  SVM
Astroparticle     86    89    89     95    96
Australian        83    82    84     88    86
Breast Cancer     96    97    97     97    96
Fourclass         98    99    100    99    98
Ionosphere        92    93    92     95    93
Liver Disorders   62    62    64     66    60
SvmGuide3         75    77    76     79    76
USPS              82    83    82     98    98
Wine              96    98    95     99    99
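The evaluation protocol of Section III-B (RBF kernel, grids over the prototype counts and the kernel width, 5-fold cross-validation) is straightforward to reproduce. The sketch below sets it up around a generic `fit_predict` callback that can stand in for any of the compared models; the helper names, the toy classifier and the exact grid values are ours, stated as assumptions rather than taken verbatim from the paper.

```python
import numpy as np

def rbf_kernel(X1, X2, gamma):
    """K(x, x') = exp(-gamma * ||x - x'||^2) for all pairs of rows of X1 and X2."""
    sq = (np.sum(X1 ** 2, axis=1)[:, None]
          + np.sum(X2 ** 2, axis=1)[None, :]
          - 2.0 * X1 @ X2.T)
    return np.exp(-gamma * sq)

def five_fold_accuracy(X, y, fit_predict, seed=0):
    """Average accuracy of fit_predict(X_tr, y_tr, X_te) over 5 random folds."""
    idx = np.random.default_rng(seed).permutation(len(X))
    folds = np.array_split(idx, 5)
    accs = []
    for k in range(5):
        te = folds[k]
        tr = np.concatenate([folds[j] for j in range(5) if j != k])
        accs.append(np.mean(fit_predict(X[tr], y[tr], X[te]) == y[te]))
    return float(np.mean(accs))

def nearest_mean_fit_predict(X_tr, y_tr, X_te):
    """Toy stand-in classifier so the helper can be exercised end to end."""
    mu_pos, mu_neg = X_tr[y_tr == 1].mean(axis=0), X_tr[y_tr == -1].mean(axis=0)
    d_pos = np.sum((X_te - mu_pos) ** 2, axis=1)
    d_neg = np.sum((X_te - mu_neg) ** 2, axis=1)
    return np.where(d_pos < d_neg, 1, -1)

# Illustrative grids in the spirit of Section III-B (the exponents are assumptions).
proto_grid = [1, 2, 3]                              # prototypes per class
gamma_grid = [2.0 ** k for k in range(-15, 4, 2)]   # RBF widths 2^-15 .. 2^3

# Tiny demo on synthetic two-class data.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(1.0, 1.0, (50, 2)), rng.normal(-1.0, 1.0, (50, 2))])
y = np.array([1] * 50 + [-1] * 50)
print(five_fold_accuracy(X, y, nearest_mean_fit_predict))
```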
IV. CONCLUSION

In this paper we have introduced MLVQ, a new maximal margin approach to Kernel Generalised Learning Vector Quantisation. MLVQ maximises the real margin, which is crucial for kernel methods, and can be applied in both the input space and the feature space. The experiments conducted on 9 data sets from the UCI repository demonstrate the good performance of MLVQ in both the input space and the feature space.

Fig. 6. Classification results (in %) on 9 data sets for the 4 input space models LVQ, GLVQ, HMLVQ (HM) and MLVQ (SM).

Fig. 7. Classification results (in %) on 9 data sets for the 4 kernel feature space models kernel LVQ (KLVQ), kernel GLVQ (KGLVQ), kernel HMLVQ (KHM), kernel MLVQ (KSM), and SVM.

APPENDIX

Theorem 1. Let $M$, $A$ and $B$ be points in the affine space $\mathbb{R}^d$, and let $(H): w^T x + b = 0$ be the mid-perpendicular hyperplane of the segment $AB$. The following equality holds:

$$\mathrm{Margin}(M, H) = \frac{MA^2 - MB^2}{2\,AB}$$

where the sample margin $\mathrm{Margin}(M, H)$ is the distance from the point $M$ to the hyperplane $(H)$ (see Figure 8).

Proof.

$$MA^2 - MB^2 = \vec{MA}^2 - \vec{MB}^2 = (\vec{MA} - \vec{MB}) \cdot (\vec{MA} + \vec{MB}) = 2\,\vec{BA} \cdot \vec{MI} = 2\,\vec{BA} \cdot (\vec{MH} + \vec{HI}) \quad (17)$$

where $I$ is the midpoint of the segment $AB$ and $H$ is the projection of $M$ onto the hyperplane $(H)$, as in Figure 8. Since $\vec{HI}$ is orthogonal to $\vec{BA}$ and $\vec{MH}$ is parallel to $\vec{BA}$, we have:

$$MA^2 - MB^2 = 2\,\vec{BA} \cdot \vec{MH} = 2\,BA \cdot MH = 2\,BA \cdot \mathrm{Margin}(M, H)$$

Fig. 8. The formula to evaluate the margin.

Corollary 1. If

$$\mu(x, a, b) = \frac{y\left(\|\phi(x) - a\|^2 - \|\phi(x) - b\|^2\right)}{\|a - b\|} \quad (18)$$

then the partial derivatives of µ with respect to $a$ and $b$ are

$$\frac{\partial \mu}{\partial a} = -\frac{2y}{\|a - b\|}\,(\phi(x) - a) - \frac{y\left(\|\phi(x) - a\|^2 - \|\phi(x) - b\|^2\right)}{\|a - b\|^3}\,(a - b)$$
$$\frac{\partial \mu}{\partial b} = \frac{2y}{\|a - b\|}\,(\phi(x) - b) + \frac{y\left(\|\phi(x) - a\|^2 - \|\phi(x) - b\|^2\right)}{\|a - b\|^3}\,(a - b) \quad (19)$$

Proof.

$$\frac{\partial \mu}{\partial a} = \frac{2y\,(a - \phi(x))\,\|a - b\| - y\left(\|\phi(x) - a\|^2 - \|\phi(x) - b\|^2\right)\dfrac{2(a - b)}{2\,\|a - b\|}}{\|a - b\|^2} = -\frac{2y}{\|a - b\|}\,(\phi(x) - a) - \frac{y\left(\|\phi(x) - a\|^2 - \|\phi(x) - b\|^2\right)}{\|a - b\|^3}\,(a - b) \quad (20)$$

$$\frac{\partial \mu}{\partial b} = \frac{-2y\,(b - \phi(x))\,\|a - b\| - y\left(\|\phi(x) - a\|^2 - \|\phi(x) - b\|^2\right)\dfrac{2(b - a)}{2\,\|a - b\|}}{\|a - b\|^2} = \frac{2y}{\|a - b\|}\,(\phi(x) - b) + \frac{y\left(\|\phi(x) - a\|^2 - \|\phi(x) - b\|^2\right)}{\|a - b\|^3}\,(a - b) \quad (21)$$
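A quick numerical sanity check of Theorem 1 (ours, for illustration): the distance from a random point $M$ to the mid-perpendicular hyperplane of a random segment $AB$ matches $|MA^2 - MB^2| / (2\,AB)$, the absolute value being taken because a distance is non-negative.

```python
import numpy as np

rng = np.random.default_rng(0)
A, B, M = rng.normal(size=(3, 4))          # three random points in R^4

# Mid-perpendicular hyperplane of AB: w^T x + c = 0 with w = A - B,
# passing through the midpoint of the segment AB.
w = A - B
c = -w @ (A + B) / 2.0
dist = abs(w @ M + c) / np.linalg.norm(w)  # distance from M to the hyperplane

MA2 = np.sum((M - A) ** 2)
MB2 = np.sum((M - B) ** 2)
AB = np.linalg.norm(A - B)
print(dist, abs(MA2 - MB2) / (2.0 * AB))   # the two numbers agree (Theorem 1)
```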

REFERENCES

[1] C. J. C. Burges. A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery, 2.
[2] C. Campbell and K. P. Bennett. A linear programming approach to novelty detection.
[3] C. Cortes and V. Vapnik. Support-vector networks. Machine Learning.
[4] K. Crammer, R. Gilad-Bachrach, A. Navot, and N. Tishby. Margin analysis of the LVQ algorithm. In Advances in Neural Information Processing Systems, 2002.
[5] Y. Freund and R. E. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55(1).
[6] B. Hammer and T. Villmann. Generalized relevance learning vector quantization. Neural Networks, 15.
[7] T. Kohonen. Self-Organization and Associative Memory, 3rd edition. Springer-Verlag.
[8] T. Kohonen. Learning vector quantization. In The Handbook of Brain Theory and Neural Networks.
[9] C.-L. Liu and M. Nakagawa. Evaluation of prototype learning algorithms for nearest-neighbor classifier in application to handwritten character recognition. Pattern Recognition, 34(3).
[10] A. K. Qin and P. N. Suganthan. A novel kernel prototype-based learning algorithm. In ICPR.
[11] A. Sato. Discriminative dimensionality reduction based on generalized LVQ. In ICANN, pages 65-72.
[12] A. Sato and K. Yamada. Generalized learning vector quantization. In NIPS.
[13] B. Schölkopf and A. J. Smola. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. The MIT Press, 2nd edition.
[14] V. Vapnik. The Nature of Statistical Learning Theory. Springer-Verlag, New York.
[15] V. Vapnik. The Nature of Statistical Learning Theory. Springer, 2nd edition.
