Training Data Selection for Support Vector Machines
Jigang Wang, Predrag Neskovic, and Leon N Cooper
Institute for Brain and Neural Systems, Physics Department, Brown University, Providence, RI 02912, USA
jigang@brown.edu, pedja@brown.edu, Leon_Cooper@brown.edu

Abstract. In recent years, support vector machines (SVMs) have become a popular tool for pattern recognition and machine learning. Training an SVM involves solving a constrained quadratic programming problem, which requires large memory and enormous amounts of training time for large-scale problems. In contrast, the SVM decision function is fully determined by a small subset of the training data, called support vectors. It is therefore desirable to remove from the training set the data that are irrelevant to the final decision function. In this paper we propose two new methods that select a subset of data for SVM training. Using real-world datasets, we compare the effectiveness of the proposed data selection strategies in terms of their ability to reduce the training set size while maintaining the generalization performance of the resulting SVM classifiers. Our experimental results show that a significant amount of training data can be removed by our proposed methods without degrading the performance of the resulting SVM classifiers.

1 Introduction

Support vector machines (SVMs), introduced by Vapnik and coworkers in the structural risk minimization (SRM) framework [1-3], have gained wide acceptance due to their solid statistical foundation and the good generalization performance that has been demonstrated in a wide range of applications. Training an SVM involves solving a constrained quadratic programming (QP) problem, which requires large memory and takes enormous amounts of training time for large-scale applications [4]. On the other hand, the SVM decision function depends only on a small subset of the training data, called support vectors. Therefore, if one knew in advance which patterns correspond to the support vectors, the same solution could be obtained by solving a much smaller QP problem that involves only the support vectors. The problem is then how to select training examples that are likely to be support vectors.

(* This work is partially supported by ARO under grant W911NF. Jigang Wang is supported by a dissertation fellowship from Brown University.)
Recently, there has been considerable research on data selection for SVM training. For example, Shin and Cho proposed a method that selects patterns near the decision boundary based on neighborhood properties [5]. In [6-8], k-means clustering is employed to select patterns from the training set. In [9], Zhang and King proposed a β-skeleton algorithm to identify support vectors. In [10], Abe and Inoue used the Mahalanobis distance to estimate boundary points. In the reduced SVM (RSVM) setting, Lee and Mangasarian chose a subset of training examples using random sampling [11]. In [12], it was shown that uniform random sampling is the optimal robust selection scheme in terms of several statistical criteria.

In this paper, we introduce two new data selection methods for SVM training. The first method selects training data based on a statistical confidence measure that we will describe later. The second method uses the minimal distance from a training example to the training examples of a different class as a criterion to select patterns near the decision boundary; it is motivated by the geometrical interpretation of SVMs based on the (reduced) convex hulls. To understand how effective these strategies are in terms of their ability to reduce the training set size while maintaining the generalization performance, we compare the results obtained by SVM classifiers trained with data selected by these two new methods, by random sampling, and by the data selection method based on the distance from a training example to the desired optimal separating hyperplane. Our comparative study shows that a significant amount of training data can be removed from the training set by our methods without degrading the performance of the resulting SVM classifier. We also find that, despite its simplicity, random sampling performs well and often provides results comparable to those obtained by the method based on the desired SVM outputs. Furthermore, in our experiments, we find that incorporating the class distribution information of the training set often improves the efficiency of the data selection methods.

The remainder of the paper is organized as follows. In Section 2, we give a brief overview of support vector machines for classification and the corresponding training problem. In Section 3, we present the two new methods that select subsets of training examples for training SVMs. In Section 4 we report experimental results on several real-world datasets. Concluding remarks are provided in Section 5.

2 Related Background

Given a set of training data $\{(x_1, y_1), \dots, (x_n, y_n)\}$, where $x_i \in \mathbb{R}^d$ and $y_i \in \{-1, 1\}$, support vector machines seek to construct an optimal separating hyperplane by solving the following quadratic optimization problem:

$$\min_{w,b}\; \frac{1}{2}\langle w, w\rangle + C\sum_{i=1}^{n}\xi_i \qquad (1)$$

subject to the constraints

$$y_i(\langle w, x_i\rangle + b) \ge 1 - \xi_i, \quad i = 1, \dots, n, \qquad (2)$$
where $\xi_i \ge 0$, $i = 1, \dots, n$, are slack variables introduced to handle the non-separable case [2]. The constant $C > 0$ is a parameter that controls the trade-off between the separation margin and the number of training errors. Using the Lagrange multiplier method, one can easily obtain the following Wolfe dual form of the primal quadratic programming problem:

$$\min_{\alpha}\; \frac{1}{2}\sum_{i,j=1}^{n}\alpha_i\alpha_j y_i y_j \langle x_i, x_j\rangle - \sum_{i=1}^{n}\alpha_i \qquad (3)$$

subject to

$$0 \le \alpha_i \le C, \quad i = 1, \dots, n, \qquad \text{and} \qquad \sum_{i=1}^{n}\alpha_i y_i = 0. \qquad (4)$$

Solving the dual problem, one obtains the multipliers $\alpha_i$, $i = 1, \dots, n$, which give $w$ as an expansion

$$w = \sum_{i=1}^{n}\alpha_i y_i x_i. \qquad (5)$$

According to the Karush-Kuhn-Tucker (KKT) optimality conditions, we have

$$\alpha_i = 0 \;\Rightarrow\; y_i(\langle w, x_i\rangle + b) \ge 1 \text{ and } \xi_i = 0,$$
$$0 < \alpha_i < C \;\Rightarrow\; y_i(\langle w, x_i\rangle + b) = 1 \text{ and } \xi_i = 0,$$
$$\alpha_i = C \;\Rightarrow\; y_i(\langle w, x_i\rangle + b) \le 1 \text{ and } \xi_i \ge 0.$$

Therefore, only the $\alpha_i$ that correspond to training examples $x_i$ lying either on the margin or inside the margin area are non-zero. All the remaining $\alpha_i$ are zero, and the corresponding training examples are irrelevant to the final solution. Knowing the normal vector $w$, the bias term $b$ can be determined from the KKT conditions $y_i(\langle w, x_i\rangle + b) = 1$ for $0 < \alpha_i < C$. This leads to the linear decision function $f(x) = \mathrm{sgn}\left(\sum_{i=1}^{n}\alpha_i y_i \langle x, x_i\rangle + b\right)$.

In practice, linear decision functions are generally not rich enough for pattern separation. To allow for more general decision surfaces, one can apply the kernel trick, replacing the inner products $\langle x_i, x_j\rangle$ in the dual problem with suitable kernel functions $k(x_i, x_j)$. Effectively, support vector machines implicitly map training vectors $x_i \in \mathbb{R}^d$ to feature vectors $\Phi(x_i)$ in some high-dimensional feature space $\mathbb{F}$ such that inner products in $\mathbb{F}$ are given by $\langle \Phi(x_i), \Phi(x_j)\rangle = k(x_i, x_j)$. Consequently, the optimal hyperplane in the feature space $\mathbb{F}$ represents a nonlinear decision function of the form

$$f(x) = \mathrm{sgn}\left(\sum_{i=1}^{n}\alpha_i y_i k(x, x_i) + b\right). \qquad (6)$$

To train an SVM classifier, one therefore needs to solve the dual quadratic programming problem (3) under the constraints (4). For a small training set, standard QP solvers, such as CPLEX, LOQO, MINOS, and the Matlab QP routines, can readily be used to obtain the solution. However, for a large training set, they quickly become intractable because of the large memory requirements and the enormous amount of training time involved.
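As a concrete illustration of Eq. (6), the decision function can be evaluated directly from the multipliers once the dual problem has been solved. The following is a minimal sketch, not code from the paper; the Gaussian kernel and the parameter `gamma` are illustrative assumptions:

```python
import numpy as np

def rbf_kernel(x, xi, gamma=1.0):
    # Gaussian kernel k(x, x_i) = exp(-gamma * ||x - x_i||^2)
    return np.exp(-gamma * np.sum((x - xi) ** 2))

def svm_decision(x, X_sv, y_sv, alpha_sv, b, gamma=1.0):
    # Eq. (6): f(x) = sgn(sum_i alpha_i * y_i * k(x, x_i) + b).
    # Only support vectors (alpha_i > 0) need to be kept, since all
    # other terms vanish.
    s = sum(a * y * rbf_kernel(x, xi, gamma)
            for a, y, xi in zip(alpha_sv, y_sv, X_sv))
    return np.sign(s + b)
```

Because every zero multiplier drops out of the sum, both the stored model and the evaluation cost depend only on the support vectors; the training-set reduction studied in this paper exploits the same sparsity on the training side.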
To alleviate the problem, a number of solutions have been proposed that exploit the sparsity of the SVM solution and the KKT conditions. The first such solution, known as chunking [13], uses the fact that only the support vectors are relevant for the final solution. At each step, chunking solves a QP problem that consists of all non-zero Lagrange multipliers $\alpha_i$ from the last step and some of the $\alpha_i$ that violate the KKT conditions. The size of the QP problem varies but finally equals the number of non-zero Lagrange multipliers. At the last step, the entire set of non-zero Lagrange multipliers is identified and the QP problem is solved. Another solution, proposed in [14], solves the large QP problem by breaking it down into a series of smaller QP sub-problems. This decomposition method is justified by the observation that solving a sequence of QP sub-problems that always contain at least one training example violating the KKT conditions will eventually lead to the optimal solution. More recently, a method called sequential minimal optimization (SMO) was proposed by Platt [15], which approaches the problem by iteratively solving QP sub-problems of size two. The key idea is that a QP sub-problem of size two can be solved analytically, without invoking a quadratic optimizer. This method has been reported to be several orders of magnitude faster than the classical chunking algorithm.

All of the above training methods make use of the whole training set. However, according to the KKT optimality conditions, the final separating hyperplane is fully determined by the support vectors. In many real-world applications, the number of support vectors is expected to be much smaller than the total number of training examples. Therefore, the speed of SVM training would be significantly improved if only the set of support vectors were used for training, and the solution would be exactly the same as if the whole training set were used. In theory, one has to solve the full QP problem in order to identify the support vectors. However, it is easy to see that the support vectors are training examples that lie close to decision boundaries. Therefore, if there exists a computationally efficient way to find a small set of training data such that, with high probability, it contains the desired support vectors, the speed of SVM training can be improved without degrading the generalization performance. The size of the reduced training set may still be larger than the set of desired support vectors. However, as long as it is much smaller than the total training set, SVM training will be significantly faster, because most SVM training algorithms scale roughly quadratically with the number of training examples on many problems [4]. In the next section, we propose two new data selection strategies to explore this possibility.

3 Training Data Selection for Support Vector Machines

3.1 Data Selection based on Confidence Measure

A good heuristic for identifying boundary points is the number of training examples contained in the largest sphere centered at a training example that does not cover an example of a different class.
Centered at each training example $x_i$, let us draw a sphere that is as large as possible without covering a training example of a different class, and count the number of training examples that fall inside the sphere. We denote this number by $N(x_i)$. Obviously, the larger the number $N(x_i)$, the more training examples (of the same class as $x_i$) are scattered around $x_i$, the less likely $x_i$ is to be close to the decision boundary, and the less likely $x_i$ is to be a support vector. Hence, this number can be used as a criterion to decide which training examples should belong to the reduced training set. For each training example $x_i$, we compute the number $N(x_i)$, sort the training data according to the corresponding values of $N(x_i)$, and choose a subset of the data with the smallest numbers $N(x_i)$ as the reduced training set. It can be shown that $N(x_i)$ is related to the statistical confidence that can be associated with the class label $y_i$ of the training example $x_i$. For this reason, we call this data selection scheme the confidence measure-based training set selection.

3.2 Data Selection based on Hausdorff Distance

Our second data selection strategy is based on the Hausdorff distance. In the separable case, it has been shown that the optimal SVM separating hyperplane is identical to the hyperplane that bisects the line segment connecting the two closest points of the convex hulls of the positive and of the negative training examples [16, 17]. The problem of finding the two closest points in the convex hulls can be formulated as

$$\min_{z_+, z_-} \|z_+ - z_-\|^2 \qquad (7)$$

subject to

$$z_+ = \sum_{i:\, y_i = 1}\alpha_i x_i \qquad \text{and} \qquad z_- = \sum_{i:\, y_i = -1}\alpha_i x_i, \qquad (8)$$

where the $\alpha_i \ge 0$ satisfy the constraints $\sum_{i:\, y_i = 1}\alpha_i = 1$ and $\sum_{i:\, y_i = -1}\alpha_i = 1$. Based on this geometrical interpretation, the support vectors are the vertices of each convex hull that are closest to the convex hull of the training examples of the opposite class. For the non-separable case, a similar result holds with the convex hulls replaced by the reduced convex hulls [16, 17]. Therefore, a good heuristic for determining whether a training example is likely to be a support vector is its distance to the convex hull of the training examples of the opposite class.

Computing the distance from a training example $x_i$ to the convex hull of the training examples of the opposite class involves solving a smaller quadratic programming problem. To simplify the computation, the distance from a training example to the closest training example of the opposite class can be used as an approximation. We denote this minimal distance by

$$d(x_i) = \min_{j:\, y_j \neq y_i} \|x_i - x_j\|, \qquad (9)$$

which is also the Hausdorff distance between the training example $x_i$ and the set of training examples belonging to a different class.
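Both criteria can be computed from the pairwise distances alone. The following is a small sketch of my own, not code from the paper, using a brute-force $O(n^2)$ distance computation and illustrative function names:

```python
import numpy as np

def selection_scores(X, y):
    """Compute N(x_i) (confidence measure) and d(x_i) (distance to the
    nearest opposite-class example, Eq. (9)) for every training example."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    n = len(X)
    N = np.empty(n, dtype=int)
    d = np.empty(n)
    for i in range(n):
        # Radius of the largest sphere centered at x_i that contains
        # no example of the other class.
        d[i] = D[i, y != y[i]].min()
        # Number of training examples strictly inside that sphere
        # (all of which share the class of x_i).
        N[i] = np.sum(D[i] < d[i])
    return N, d

def reduced_training_set(X, y, frac=0.2, criterion="confidence"):
    # Keep the fraction of examples most likely to be support vectors:
    # smallest N(x_i), or smallest d(x_i) for the Hausdorff variant.
    N, d = selection_scores(X, y)
    scores = N if criterion == "confidence" else d
    keep = np.argsort(scores)[: int(frac * len(X))]
    return X[keep], y[keep]
```

The brute-force distance matrix keeps the sketch short; for large $n$, a k-d tree or ball tree nearest-neighbor query over the opposite class would avoid materializing all $n^2$ distances.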
To select a subset of training examples, we sort the training set according to $d(x_i)$ and select the examples with the smallest Hausdorff distances $d(x_i)$ as the reduced training set. This method will be referred to as the Hausdorff distance-based selection method.

3.3 Data Selection based on Random Sampling and Desired SVM Outputs

To study the effectiveness of the proposed data selection strategies, we compare them to two other strategies. One is random sampling, and the other is a data selection strategy based on the distance from the training examples to the desired separating hyperplane.

The random sampling strategy simply selects a small portion of the training data uniformly at random to form the reduced training set. This method is straightforward to implement and requires no extra computation.

The other data selection strategy is implemented as follows. Given the training set and the parameter setting, we solve the full QP problem to obtain the desired separating hyperplane. Then, for each training example $x_i$, we compute its distance to the desired separating hyperplane as

$$f(x_i) = y_i\left(\sum_{j=1}^{n}\alpha_j y_j k(x_i, x_j) + b\right). \qquad (10)$$

Note that Eq. (10) takes the class information into account: training examples that are misclassified by the desired separating hyperplane have negative distances. According to the KKT conditions, support vectors are training examples with relatively small values of the distance $f(x_i)$. We sort the training examples according to their distances to the separating hyperplane and select a subset of training examples with the smallest distances as the reduced training set. This strategy, although impractical because one needs to solve the full QP problem first, is ideal for comparison purposes, as the distance from a training example to the desired separating hyperplane provides the optimal criterion for selecting the support vectors.
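The oracle criterion of Eq. (10) can be reproduced with any SVM implementation. Below is a hedged sketch using scikit-learn (my choice of library, not the authors', who used their own SMO implementation; the hyperparameter values are placeholders):

```python
import numpy as np
from sklearn.svm import SVC

def svm_output_selection(X, y, frac=0.2, C=1.0, gamma=1.0):
    """Rank examples by Eq. (10) and keep those nearest the margin.
    Assumes labels y are coded as -1/+1."""
    # First pass: solve the full QP to obtain the desired hyperplane.
    clf = SVC(C=C, kernel="rbf", gamma=gamma).fit(X, y)
    # y_i * (sum_j alpha_j y_j k(x_i, x_j) + b); negative values mean
    # the example is misclassified by the desired hyperplane.
    f = y * clf.decision_function(X)
    keep = np.argsort(f)[: int(frac * len(X))]
    return X[keep], y[keep]
```

Since `decision_function` already returns the signed output of the separating hyperplane, folding in the label is all that Eq. (10) requires.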
4 Results and Discussion

In this section we report experimental results on several real-world datasets from the UCI Machine Learning Repository [18]. The SVM training algorithm was implemented based on the SMO method. For all datasets, Gaussian kernels were used, and the generalization error of the SVMs was estimated using 5-fold cross-validation. For each training set, according to the data selection method used, a portion of the training set (ranging from 10 to 100 percent) was selected as the reduced training set and used to train the SVM classifier. The error rate reported is the average error rate of the resulting SVM classifiers on the test sets over the 5 iterations. Due to the space limit, only results on three datasets are presented.

Note that when the data selection method is based on the desired SVM outputs, the SVM training procedure has to be run twice in each iteration. The first time, an SVM classifier is trained on the full training set to obtain the desired separating hyperplane. A portion of the training examples is then selected to form the reduced training set based on their distances to the desired separating hyperplane (see Eq. (10)). The second time, an SVM classifier is trained on the reduced training set.

Given a training set and a particular data selection criterion, there are two ways to form the reduced training set. One can either select training examples regardless of which classes they belong to, or select training examples from each class separately while maintaining the class distribution, as in the sketch below. We found in our experiments that selecting training examples from each class separately often improves the classification accuracy of the resulting SVM classifiers. Therefore, we only report results for this case.
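Class-distribution-preserving selection is a thin wrapper around any of the scoring criteria. A sketch of my own, reusing the hypothetical `selection_scores` helper from above:

```python
import numpy as np

def stratified_selection(X, y, scores, frac=0.2):
    """Pick the lowest-scoring examples within each class separately,
    so the reduced set keeps the original class proportions."""
    keep = []
    for label in np.unique(y):
        idx = np.flatnonzero(y == label)
        n_keep = int(frac * len(idx))           # same fraction per class
        ranked = idx[np.argsort(scores[idx])]   # most boundary-like first
        keep.extend(ranked[:n_keep])
    keep = np.array(keep)
    return X[keep], y[keep]
```

Any score vector works here: $N(x_i)$, $d(x_i)$, the oracle distances of Eq. (10), or random keys for the stratified random-sampling baseline.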
Table 1 shows the error rates of SVMs on the Wisconsin Breast Cancer dataset when trained with reduced training sets of various sizes selected by the four different data selection methods. This dataset consists of 683 examples from two classes (excluding the 16 examples with missing attribute values). Each example has 8 attributes. The size of the training set in each iteration is 547 and the size of the test set is 136. The average number of support vectors is 238.6, which is 43.62% of the training set size.

Table 1. Error rates of SVMs on the Breast Cancer dataset when trained with reduced training sets of various sizes
Percent | Confidence | Hausdorff | Random | SVM

From Table 1, one can see that a significant amount of data can be removed from the training set without degrading the performance of the resulting SVM classifier. When more than 10% of the training data is selected, the confidence-based data selection method outperforms the other two methods; its performance is actually as good as that of the method based on the desired SVM outputs. The method based on the Hausdorff distance gives the worst results. When the data reduction rate is high, e.g., when less than 10 percent of the training data is selected, the results obtained by the Hausdorff distance-based method and random sampling are much better than those based on the confidence measure and the desired SVM outputs.

Table 2 shows the corresponding results obtained on the BUPA Liver dataset, which consists of 345 examples, each with 6 attributes. The sizes of the training and test sets in each iteration are 276 and 69, respectively. The average number of support vectors is 222.2, which is 80.51% of the size of the training sets. Interestingly, the method based on the desired SVM outputs has the worst overall results. When less than 80% of the data is selected for training, the Hausdorff distance-based method and random sampling perform similarly and outperform the methods based on the confidence measure and the desired SVM outputs.

Table 2. Results on the BUPA Liver dataset
Percent | Confidence | Hausdorff | Random | SVM

Table 3 provides the results on the Ionosphere dataset, which has a total of 351 examples, each with 34 attributes. The sizes of the training and test sets in each iteration are 281 and 70, respectively. The average number of support vectors is 159.8, which is 56.87% of the size of the training sets. From Table 3 we see that the data selection method based on the desired SVM outputs gives the best results when more than 20% of the data is selected. When more than 50% of the data is selected, the results of the confidence-based method are very close to the best achievable results. However, when the reduction rate is high, random sampling performs best. The Hausdorff distance-based method has the worst overall results.

Table 3. Results on the Ionosphere dataset
Percent | Confidence | Hausdorff | Random | SVM

An interesting finding of the experiments is that the performance of the SVM classifiers deteriorates significantly when the reduction rate is high, e.g., when the size of the reduced training set is much smaller than the number of desired support vectors. This is especially true for the data selection strategies based on the desired SVM outputs and on the proposed heuristics. The effect is less significant for random sampling; as we have seen, random sampling usually has better relative performance at higher data reduction rates.
From a theoretical point of view, this is not surprising: when only a subset of the support vectors is chosen as the reduced training set, there is no guarantee that the solution of the reduced QP problem will remain the same. In fact, if the reduction rate is high and the criterion is based on the desired SVM outputs or on the proposed heuristics, the reduced training set is likely to be dominated by outliers, leading to worse classification performance. To overcome this problem, we can remove those training examples that lie far inside the margin area, since they are likely to be outliers. For the data selection strategy based on the desired SVM outputs, this means discarding the part of the training data with extremely small values of the distance to the desired separating hyperplane (see Eq. (10)). For the methods based on the confidence measure and the Hausdorff distance, we can similarly discard the part of the training data with extremely small values of $N(x_i)$ and of the Hausdorff distance, respectively.

In Table 4 we show the results of the proposed solution on the Breast Cancer dataset.

Table 4. Results on the Breast Cancer dataset
Percent | Confidence | Hausdorff | Random | SVM

Comparing Tables 1 and 4, it is easy to see that, when only a very small subset of the training data (compared to the number of desired support vectors) is selected for SVM training, removing training patterns that are extremely close to the decision boundary, according to the confidence measure or according to the underlying SVM outputs, significantly improves the performance of the resulting SVM classifiers. The effect is less obvious for the methods based on the Hausdorff measure and random sampling. Similar results have been observed on other datasets but are not reported here due to the space limit.
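The refinement just described amounts to trimming the lowest-ranked examples before selecting. A minimal sketch of my own interpretation (the trimming fraction `trim` is an assumed hyperparameter, not a value reported in the paper):

```python
import numpy as np

def trimmed_selection(scores, frac=0.05, trim=0.01):
    """Select the examples with the smallest scores (N(x_i), d(x_i),
    or f(x_i)), after first discarding the very smallest ones, which
    are likely to be outliers lying far inside the margin area."""
    order = np.argsort(scores)           # ascending: most boundary-like first
    n_trim = int(trim * len(scores))     # drop the suspected outliers
    n_keep = int(frac * len(scores))
    return order[n_trim : n_trim + n_keep]
```

For example, passing the $N(x_i)$ values as `scores` yields the trimmed confidence-based selection; the returned indices identify the reduced training set.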
5 Conclusion

In this paper we presented two new data selection methods for SVM training. To analyze their effectiveness in terms of their ability to reduce the training data while maintaining the generalization performance of the resulting SVM classifiers, we conducted a comparative study using several real-world datasets. More specifically, we compared the results obtained by these two new methods with the results of the simple random sampling scheme and with the results obtained by the selection method based on the desired SVM outputs. Through our experiments, several important observations were made: (1) In many applications, significant data reduction can be achieved without degrading the performance of the SVM classifiers; for this purpose, the performance of the confidence measure-based selection method is often comparable to or better than that of the method based on the desired SVM outputs. (2) When the reduction rate is high, some of the training examples that are extremely close to the decision boundary have to be removed in order to maintain the generalization performance of the resulting SVM classifiers. (3) In spite of its simplicity, random sampling performs consistently well, especially when the reduction rate is high; at low reduction rates, however, it performs noticeably worse than the confidence measure-based method. (4) When conducting training data selection, sampling training data from each class separately according to the class distribution often improves the performance of the resulting SVM classifiers.

By directly comparing various data selection schemes with the scheme based on the desired SVM outputs, we are able to conclude that the confidence measure provides a criterion for training data selection that is almost as good as the optimal criterion based on the desired SVM outputs. At high reduction rates, removing training data that are likely to be outliers boosts the performance of the resulting SVM classifiers. Random sampling performs consistently well in our experiments, which is consistent with the results obtained by Syed et al. in [19] and with the theoretical analysis of Huang and Lee in [12]. The robustness of random sampling at high reduction rates suggests that, although an SVM classifier is fully determined by the support vectors, the generalization performance of an SVM is less reliant on the choice of training data than it appears to be.
References

1. Boser, B. E., Guyon, I. M., Vapnik, V. N.: A training algorithm for optimal margin classifiers. In: Haussler, D. (ed.): Proceedings of the 5th Annual ACM Workshop on Computational Learning Theory (1992)
2. Cortes, C., Vapnik, V. N.: Support vector networks. Machine Learning 20 (1995)
3. Vapnik, V. N.: Statistical Learning Theory. Wiley, New York, NY (1998)
4. Joachims, T.: Making large-scale SVM learning practical. In: Schölkopf, B., Burges, C. J. C., Smola, A. J. (eds.): Advances in Kernel Methods - Support Vector Learning. MIT Press, Cambridge, MA (1999)
5. Shin, H. J., Cho, S. Z.: Fast pattern selection for support vector classifiers. In: Proceedings of the 7th Pacific-Asia Conference on Knowledge Discovery and Data Mining. Lecture Notes in Artificial Intelligence (LNAI 2637) (2003)
6. Almeida, M. B., Braga, A. P., Braga, J. P.: SVM-KM: speeding SVMs learning with a priori cluster selection and k-means. In: Proceedings of the 6th Brazilian Symposium on Neural Networks (2000)
7. Zheng, S. F., Lu, X. F., Zheng, N. N., Xu, W. P.: Unsupervised clustering based reduced support vector machines. In: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Vol. 2 (2003)
8. Koggalage, R., Halgamuge, S.: Reducing the number of training samples for fast support vector machine classification. Neural Information Processing - Letters and Reviews 2(3) (2004)
9. Zhang, W., King, I.: Locating support vectors via β-skeleton technique. In: Proceedings of the International Conference on Neural Information Processing (ICONIP) (2002)
10. Abe, S., Inoue, T.: Fast training of support vector machines by extracting boundary data. In: Proceedings of the International Conference on Artificial Neural Networks (ICANN) (2001)
11. Lee, Y. J., Mangasarian, O. L.: RSVM: Reduced support vector machines. In: Proceedings of the First SIAM International Conference on Data Mining (2001)
12. Huang, S. Y., Lee, Y. J.: Reduced support vector machines: a statistical theory. Technical report, Institute of Statistical Science, Academia Sinica, Taiwan (2004)
13. Vapnik, V. N.: Estimation of Dependences Based on Empirical Data. Springer-Verlag, Berlin (1982)
14. Osuna, E., Freund, R., Girosi, F.: Support vector machines: training and applications. A.I. Memo, MIT A.I. Lab. (1996)
15. Platt, J.: Fast training of support vector machines using sequential minimal optimization. In: Schölkopf, B., Burges, C. J. C., Smola, A. J. (eds.): Advances in Kernel Methods - Support Vector Learning. MIT Press, Cambridge, MA (1999)
16. Bennett, K. P., Bredensteiner, E. J.: Duality and geometry in SVM classifiers. In: Proceedings of the 17th International Conference on Machine Learning (2000)
17. Crisp, D. J., Burges, C. J. C.: A geometric interpretation of ν-SVM classifiers. Advances in Neural Information Processing Systems 12 (1999)
18. Blake, C. L., Merz, C. J.: UCI Repository of machine learning databases. http://www.ics.uci.edu/~mlearn/MLRepository.html (1998)
19. Syed, N. A., Liu, H., Sung, K. K.: A study of support vectors on model independent example selection. In: Proceedings of the Workshop on Support Vector Machines at the International Joint Conference on Artificial Intelligence (1999)