Kernel Methods in Machine Learning
|
|
- Evelyn O’Brien’
- 6 years ago
- Views:
Transcription
1 Outline Department of Computer Science and Engineering Hong Kong University of Science and Technology Hong Kong Joint work with Ivor Tsang, Pakming Cheung, Andras Kocsor, Jacek Zurada, Kimo Lai November 2006, Nanjing
2 Outline Outline 1 Kernel Methods: An Introduction 2 3 Multi-Instance Learning Constrained Concave-Convex Procedure Loss Function Optimization Problem Experiments 4
3 Popularity of Kernel Methods Supervised learning Classification: Support vector machines (SVM) Regression: Support vector regression Unsupervised learning Novelty detection: One-class SVM / Support vector domain description Clustering: Kernel clustering Principal component analysis: Kernel PCA Other learning scenarios Semi-supervised learning, transductive learning, etc. Applications Text classification, speaker adaptation, image fusion, texture classification...
4 Basic Idea in Kernel Methods Map the data from input space to feature space F using ϕ Apply a linear procedure in F hyperplane classifier, linear regression, PCA, etc. Only inner products in F are needed Kernel trick: ϕ(x) ϕ(y) = k(x, y)
5 Support Vector Machines Classification problem: {(x i, y i )} N i=1, x i R m, y i {±1} min 1 2 w 2 (primal) s.t. y i (w ϕ(x i ) + b) 1 max s.t. N α i 1 2 i=1 N α i α j y i y j ϕ(x i ) ϕ(x j ) i,j=1 N α i y i = 0, α i 0 (dual) i=1 Quadratic programming (QP) problem
6 Problem 1 Need O(m 2 ) memory just to write down K (m training examples) If m = 20, 000 and it takes 4 bytes to represent a kernel entry, we would need 1.6Gbytes to store the kernel matrix Problem 2 Involves inverting the kernel matrix K m m = [k(x i, x j )] m i,j=1 Requires O(m 3 ) time Existing methods sampling, low-rank approximations, decomposition methods in practice, time complexities O(m) O(m 2.3 ) empirical observations and not theoretical guarantees
7 Observation SVM implementations only approximate the optimal solution by an iterative strategy 1 Pick a subset of Lagrange multipliers 2 Optimize the reduced optimization problem 3 Repeat until all the Lagrange multipliers are accurate enough (loose KKT condition) These near-optimal solutions are often good enough in practical applications
8 Approximation Algorithm Approximation algorithms have been extensively used theoretical computer science E.g., for NP-complete problems such as vertex-cover problem, traveling-salesman problem, set-covering problem,... Denote C : cost of the optimal solution C: cost of the solution returned by approximation algorithm Performance guarantee: Approximation ratio ρ(n) for input size n max ( C C, C C ) ρ(n) large ρ(n): solution is much worse than the optimal solution small ρ(n): solution is more or less optimal If the ratio does not depend on n, we may just write ρ and call the algorithm an ρ-approximation algorithm
9 The Minimum Enclosing Ball Problem Problem in Computational Geometry Given: S = {x 1,..., x m }, where each x i R d Minimum enclosing ball of S (MEB(S)): the smallest ball that contains all the points in S Finding exact MEBs is inefficient for large d
10 (1 + ɛ)-approximation Given an ɛ > 0, a ball B(c, (1 + ɛ)r) is an (1 + ɛ)-approximation of MEB(S) if R r MEB(S) and S B(c, (1 + ɛ)r)
11 Approximate MEB Algorithm Proposed by Bădoiu and Clarkson (2002) A simple iterative scheme: At the tth iteration, the current estimate B(c t, r t ) is expanded incrementally by including the furthest point outside the (1 + ɛ)-ball B(c t, (1 + ɛ)r t ) Repeat until all the points in S are covered by B(c t, (1 + ɛ)r t ) Surprising property Number of iterations (and hence the size of the final core-set) depends only on ɛ but not on d or m
12 MEB Problems and Kernel Methods What is obvious MEB is equivalent to the hard-margin support vector data description (SVDD) The MEB problem can also be used to find the radius component of the radius-margin bound SVM parameter tuning What is not so obvious Other kernel-related problems can also be viewed as MEB problems soft-margin one-class SVM, multi-class SVM, ranking SVM, SVR, Laplacian SVM, etc.
13 Hard-Margin SVDD Denote: Kernel k; feature map ϕ MEB in the feature space: B(c, R) (primal) : min R,c R2 : c ϕ(x i ) 2 R 2, i = 1,..., m (dual) max α α diag(k) α Kα : α 0, α 1 = 1 α = [α i,..., α m ] : Lagrange multipliers K m m = [k(x i, x j )]: kernel matrix 0 = [0,..., 0], 1 = [1,..., 1]
14 Kernel Methods as MEB Problems Holds for Assume k(x, x) = κ, a constant (1) 1 isotropic kernel k(x, y) = K( x y ) (e.g., Gaussian) 2 dot product kernel k(x, y) = K(x y) (e.g., polynomial) with normalized inputs 3 any normalized kernel k(x, y) = K(x,y) K(x,x) K(y,y) Combine with α 1 = 1, we have α diag(k) = κ max α α Kα : α 0, α 1 = 1 (2) Conversely, whenever the kernel k satisfies (1), Any QP of the form in (2) a MEB problem
15 Two-Class SVM {z i = (x i, y i )} m i=1 with y i { 1, 1} (primal) min w,b,ρ,ξ i (dual) max α K = α w 2 +b 2 2ρ+C [ y i y j k(x i, x j ) + y i y j + δ ij C m i=1 ξ i 2 : y i (w ϕ(x i )+b) ρ ξ i (K yy + yy + 1C I ) α : α 0, α 1 = 1 ], with k(z, z) = κ+1+ 1 C (constant)
16 Core Vector Machine (CVM) At the tth iteration, denote S t : core-set; c t : ball s center; R t : ball s radius Given an ɛ > 0 1 Initialize S 0, c 0 and R 0 2 Terminate if there is no training point z such that ϕ(z) falls outside the (1 + ɛ)-ball B(c t, (1 + ɛ)r t ) 3 Find (core vector) z such that ϕ(z) is furthest away from c t. Set S t+1 = S t {z} 4 Find the new MEB(S t+1 ) and set c t+1 = c MEB(St+1 ) and R t+1 = r MEB(St+1 ) 5 Increment t by 1 and go back to Step 2
17 Convergence to (Approximate) Optimality When ɛ = 0 CVM outputs the exact solution of the kernel problem When ɛ > 0 CVM is an (1 + ɛ) 2 -approximation algorithm
18 Time Complexity CVM converges in at most 2/ɛ iterations [Bădoiu and Clarkson, 2002] No probabilistic speedup: Overall time for τ = O(1/ɛ) iterations: O ( m + 1 ) ɛ 2 ɛ 4 linear in m for a fixed ɛ With probabilistic speedup: Overall time: O ( ) 1 ɛ 4 independent of m for a fixed ɛ
19 Space Complexity Space complexity for the for the whole procedure: O(1/ɛ 2 ) independent of m for a fixed ɛ
20 Forest Cover Type Data (522,911 patterns) L2 SVM (CVM) L2 SVM (LIBSVM) L1 SVM (LIBSVM) 10 6 L2 SVM (CVM) core set size L2 SVM (LIBSVM) L1 SVM (LIBSVM) CPU time (in seconds) number of SV s size of training set x size of training set x 10 L2 SVM (CVM) L2 SVM (LIBSVM) L1 SVM (LIBSVM) error rate (in %) size of training set x 10
21 Extended MIT Face Data training set # faces # nonfaces total original 2,429 4,548 6,977 set A 2, , ,343 set B 19,432 (blur+flip) 481, ,346 set C 408,072 (rotate) 481, ,986 CPU time (in seconds) L2 SVM (CVM) L2 SVM (LIBSVM) L1 SVM (LIBSVM) number of support vectors L2 SVM (CVM) core set size L2 SVM (LIBSVM) L1 SVM (LIBSVM) original set A set B set C training set 10 2 original set A set B set C training set L2 SVM (CVM) L2 SVM (LIBSVM) L1 SVM (LIBSVM) L2 SVM (CVM) L2 SVM (LIBSVM) L1 SVM (LIBSVM) AUC balanced loss (in %) original set A set B set C training set 15 original set A set B set C training set
22 KDDCUP-99 Intrusion Detection (4,898,431 patterns) Used in KDD-99 s Knowledge Discovery and Data Mining Tools Competition: Separate normal connections from attacks method # train patns # test SVM training other proc input to SVM errors time (in sec) time (in sec) 0.001% 47 25, random 0.01% , sampling 0.1% 4,917 25, % 49,204 25, % 245,364 25, active learning , CB-SVM (KDD 03) 4,090 20, CVM 4,898,431 19, AUC l bal # core vectors # support vectors
23 Limitations 1 k(x, x) = constant for any pattern x 2 The QP problem is of the form max α Kα s.t. α 1 = 1, α 0 Condition 1 holds for kernels, including Isotropic kernel (e.g., Gaussian kernel) Dot product kernel (e.g., polynomial kernel) with normalized input Any normalized kernel Condition 2 holds for kernel methods including the one-class and two-class SVMs there are still some popular kernel methods that violate these conditions and so cannot be used
24 Motivating Example Example (L2-support vector regression (SVR)) Training set: {z i = (x i, y i )} m i=1 with x i R d and y i R Find f (x) = w ϕ(x) + b in F that minimizes ε-insensitive loss Primal min w 2 + b 2 + C m (ξi 2 + ξi 2 ) + 2C ε µm i=1 s.t. y i (w ϕ(x i ) + b) ε + ξ i, (w ϕ(x i ) + b) y i ε + ξ i Dual [ 2 max [λ λ ] C y 2 C y ] [λ [ λ λ ] K λ s.t. [λ λ ]1 = 1, λ, λ 0 [ ] K + 11 K = [ k(z i, z j )] = + µm C I (K + 11 ) (K + 11 ) K µm C I ]
25 Center-Constrained MEB Problem Modifications to the original MEB problem: [ ] ϕ(xi ) 1 Augment an extra i R to each ϕ(x i ) 2 Constrain the last coordinate of the ball s center to zero Finding the center-constrained MEB Primal: [ ] [ ] min R 2 s.t. c ϕ(xi ) 2 R 2 0 i where = [ 2 1,..., 2 m] 0 Dual: max α (diag(k)+ ) α Kα s.t. α 1 = 1, α 0 Goal: Transform the dual of SVR to this form i [ c 0 ]
26 SVR as a Center-Constrained MEB Problem SVR s dual: max α {}}{ [ 2 [λ λ ] C y 2 C y ] [λ [ λ λ ] K λ s.t. [λ λ ]1 = 1, λ, λ 0 [ ] Define = diag( K) + η1 + 2 y C for η large enough such y that 0 max α (diag( K) + η1) α K α : α 1=1, α 0 Using the constraint α 1 = 1 max α (diag( K) + ) α K α : α 1 = 1, α 0 which is thus of the required form! ]
27 Advantages 1 Allows a more general QP formulation 2 Can be used with any linear/nonlinear kernels no longer require k(x, x) = constant on the kernel
28 Friedman (200,000 Patterns) CPU time (in seconds) L2 SVR (CVR) L1 SVR (LIBSVM) L1 SVR (SVM Light) number of SV s 14 x L2 SVR (CVR) L1 SVR (LIBSVM) L1 SVR (SVM Light) size of training set x L2 SVR (CVR) L1 SVR (LIBSVM) L1 SVR (SVM Light) size of training set x L2 SVR (CVR) L1 SVR (LIBSVM) L1 SVR (SVM Light) RMSE MRE size of training set x size of training set x 10 5
29 Semi-Supervised Learning Labeled patterns are rare, expensive and time consuming to collect supervised learning can have poor performance when only very few labeled patterns are available Unlabeled data are abundant and readily available without any cost e.g., unlabeled webpages on the internet often has a manifold structure
30 Laplacian SVM Incorporate a manifold regularizer [Belkin et al 2005]: min 1 l l i=1 ξ i + λ 2 f 2 H k + λ G 2 G f 2 y i f (x i ) 1 ξ i, ξ i 0 Sparse Laplacian SVM min w 2 + b 2 + C l ξi 2 +2Cɛ + Cθ (ζe 2 + ζe 2 ) }{{} lµ uµ i=1 e ε margin }{{}}{{} error manifold regularizer s.t. y i (w ϕ(x i ) + b) 1 ɛ ξ i, w ψ e ɛ + ζ e, w ψ e ɛ + ζ e, e ε. Dual: center-constrained MEB problem
31 Two Moons (l = 2; u = 1, 000, 000)
32 Extended USPS: 0-vs-1 (l = 2; u = 266, 077)
33 Extended MIT Face (l = 10; u = 100, 000)
34 Multi-Instance Learning Constrained Concave-Convex Procedure Loss Function Optimization Problem Experiments Multi-Instance Learning: Motivating Example Content-based image retrieval: Classify/retrieve images based on content each image is a bag and each local image patch an instance an image is labeled positive when at least one of its segments is positive Weak label information of the training data only the bags (but not the individual instances) have known labels
35 Kernel-Based MI Learning Methods Multi-Instance Learning Constrained Concave-Convex Procedure Loss Function Optimization Problem Experiments Design MI kernels that operate on bags the underlying quadratic programming (QP) problem only involves variables corresponding to the bags, but not instances (More direct approach) Associate the variables with instances, but not with bags bag label information still used implicitly bag B i : instances {x ij } n i j=1 f (B i ) = max j=1,...,ni f (x ij )
36 Problems Multi-Instance Learning Constrained Concave-Convex Procedure Loss Function Optimization Problem Experiments Mixed integer problem MI-SVM uses a simple optimization heuristic convergence properties unclear Only the sign is important in classification sign(f (B i )) = sign(max j=1,...,ni f (x ij )) f (B i ) = max j=1,...,ni f (x ij ) may be too restrictive Cannot utilize both the bag and instance information simultaneously MI kernels: variables correspond only to the bags, but not instances MI-SVM: variables correspond only to the instances, but not bags
37 Proposed Approach Multi-Instance Learning Constrained Concave-Convex Procedure Loss Function Optimization Problem Experiments Introduce a loss function between f (B i ) and the associated f (x ij ) s allows both the bags and instances to directly participate in the optimization process the learned function is smooth over both bags and instances Optimization technique MI-SVM uses an optimization heuristic we use constrained concave-convex procedure
38 Multi-Instance Learning Constrained Concave-Convex Procedure Loss Function Optimization Problem Experiments Constrained Concave-Convex Procedure An optimization tool for nonlinear optimization problems whose objective function can be expressed as a difference of convex functions
39 Multi-Instance Learning Constrained Concave-Convex Procedure Loss Function Optimization Problem Experiments Constrained Concave-Convex Procedure (CCCP) min x f 0 (x) g 0 (x) s.t. f i (x) g i (x) c i, i = 1,..., m, f i, g i (i = 0,..., m) are real-valued, convex and differentiable functions on R n ; c i R Procedure: 1 start with an initial x (0) 2 replace g i (x) with its first-order Taylor expansion at x (t) 3 set x (t+1) to the solution of the relaxed optimization problem: [ min f 0 (x) g 0 (x (t) ) + g 0 (x (t) ) ] (x x (t) ) x [ s.t. f i (x) g i (x (t) ) + g i (x (t) ) ] (x x (t) ) c i Converges to a local minimum solution
40 Regularization Framework Multi-Instance Learning Constrained Concave-Convex Procedure Loss Function Optimization Problem Experiments A set of training bags: {(B 1, y 1 ),..., (B m, y m )} B i = {x i1, x i2,..., x ini }: ith bag containing instances x ij s y i {±1} Define a loss function that depends on both the training bags and training instances: ( ) V {B i, y i, f (B i )} m i=1, {f (x ij )} n i j=1 m i=1 Split the loss function V into two parts 1 between each bag ( label and its bag prediction ) V {B i, y i, f (B i )} m i=1, {f (x ij)} n i j=1 m i=1 2 between the predictions of each bag and its constituent instances ( ) V {B i, y i, f (B i )} m i=1, {f (x ij)} n i j=1 m i=1
41 Loss Function V : 1st Part Multi-Instance Learning Constrained Concave-Convex Procedure Loss Function Optimization Problem Experiments Between each bag label y i and its corresponding prediction f (B i ) hinge loss (1 y i f (B i )) + where (z) + = max(0, z) 0 1
42 Loss Function V : 2nd Part Multi-Instance Learning Constrained Concave-Convex Procedure Loss Function Optimization Problem Experiments Between the predictions of each bag f (B i ) and its constituent instances {f (x ij ) j = 1,..., n i } l(f (B i ), max j f (x ij )) { 0 if v1 = v l(v 1, v 2 ) = 2, otherwise. L1 loss: l(v 1, v 2 ) = v 1 v 2 L2 loss: l(v 1, v 2 ) = (v 1 v 2 ) 2
43 Combining Multi-Instance Learning Constrained Concave-Convex Procedure Loss Function Optimization Problem Experiments V = 1 m m i=1 (1 y if (B i )) + + λ m m i=1 l(f (B i), max j f (x ij )) λ: trades off the two components Special cases: 1 1 Only the first part: m m i=1 (1 y if (B i )) + the same { as that with the MI kernel 2 0 if v1 = v l(v 1, v 2 ) = 2, otherwise. same as the MI-SVM
44 Optimization Problem Multi-Instance Learning Constrained Concave-Convex Procedure Loss Function Optimization Problem Experiments Introduce ξ = [ξ 1,..., ξ m ] : slack variables for the errors on bags γ, λ: tradeoff parameters m min f H,ξ 1 2 f 2 H + γ m ξ 1 + γλ m s.t. y i f (B i ) 1 ξ i, ξ 0 i=1 l(f (B i ), max f (x ij )) j=1,...,n i Representer Theorem m n i f (x) = α i0 k(x, B i ) + α ij k(x, x ij ), α i0, α ij R i=1 j=1 α: vector for all the α i0 s and α ij s
45 Using the L1 Loss for l(, ) Multi-Instance Learning Constrained Concave-Convex Procedure Loss Function Optimization Problem Experiments K: kernel matrix; k i : ith column of K min α,ξ,δ,b 1 2 α Kα + γ m ξ 1 + γλ m δ 1 s.t. y i (k I(B i ) α + b) 1 ξ i, ξ 0, k I(x ij ) α δ i k I(B i ) α, k I(B i ) α max (k I(x j=1,...,n ij ) α) δ i i Objective: quadratic; First three constraints: linear Last constraint: nonlinear, but is a difference of two convex functions
46 Optimization using CCCP Iterative procedure: 1 obtain α from this QP 2 use this as α (t+1) and iterate α (t) : estimate of α at the tth iteration : estimate of β ij β (t) ij min α,ξ,δ,b Multi-Instance Learning Constrained Concave-Convex Procedure Loss Function Optimization Problem Experiments 1 2 α Kα + γ m ξ 1 + γλ m δ 1 s.t. y i (k I(B i ) α + b) 1 ξ i, ξ 0, k I(x ij ) α δ i k I(B i ) α, n i k I(B i ) α β (t) ij k I(x ij ) α δ i j=1
47 Using the Loss Function in MI-SVM Multi-Instance Learning Constrained Concave-Convex Procedure Loss Function Optimization Problem Experiments With a particular choice of the subgradient identical to the optimization heuristic in MI-SVM MI-SVM: no convergence proof CCCP: guaranteed convergence
48 Multi-Instance Learning Constrained Concave-Convex Procedure Loss Function Optimization Problem Experiments Classification: Image Categorization on Corel Images Data set Used in Chen and Wang (JMLR 2004) 10 classes (beach, flowers, horses, etc.), with each class containing 100 images Each image: bag; Image segments: instance Procedure Same as in (Chen and Wang) Randomly divided into a training and test set, each containing 50 images of each category Repeated 5 times, and report the average accuracy Model parameters selected by a validation set
49 Results Multi-Instance Learning Constrained Concave-Convex Procedure Loss Function Optimization Problem Experiments accuracy (%) DD-SVM (Chen and Wang 2004) 81.5 ± 3.0 Hist-SVM (Chapelle et al. 1999) 66.7 ± 2.2 MI-SVM (Andrews et al. 2003) 74.7 ± 0.6 SVM (MI kernel) (Gärtner et al. 2002) 84.1 ± 0.90 Our method 84.4 ± 1.38 Results on DD-SVM, Hist-SVM and MI-SVM are from (Chen and Wang 2004) MI kernel used: normalized set kernel k κ(b 1, B 2 ) = set(b 1,B 2 ) kset(b1,b 1 ) k set(b 2,B 2 ) k set (B 1, B 2 ) = x B 1,z B 2 k(x, z), k: Gaussian kernel Our method: use the L1 loss significant at the 0.01 level of significance
50 Regression: Synthetic Musk Molecules Multi-Instance Learning Constrained Concave-Convex Procedure Loss Function Optimization Problem Experiments Predict the real-valued binding energies of musk molecules Synthetic data sets generated by Dooly et al. (JMLR 2002) based on using an affinity model between the musk molecules and receptors LJ , LJ and LJ LJ : # relevant features: 16; total # features: 30; # scale factors: 2 Make it more challenging created three more data sets (LJ , LJ and LJ ) by adding irrelevant features e.g., LJ is generated by adding 20 more irrelevant features to LJ while keeping its real-valued outputs intact
51 Results Multi-Instance Learning Constrained Concave-Convex Procedure Loss Function Optimization Problem Experiments data set DD citation-knn SVM (MI kernel) our method %err MSE %err MSE %err MSE %err MSE LJ LJ (not available) LJ LJ LJ LJ Results of DD, citation-knn on the first three data sets are from (Dooly et al. 2002) DD: does not perform well Easier data sets: ours has comparable/better performance More challenging data sets nearest neighbor-based and DD algorithms degrade with more irrelevant features our SVM-based approach is consistently the best
52 Kernel methods can now be used on massive data sets: novelty detection (unsupervised learning) classification/regression (supervised learning) manifold regularization (semi-supervised learning) maximum margin discriminant analysis (feature extraction) Kernel methods can also be used for multi-instance learning in a disciplined manner allows a loss function between the outputs of a bag and its associated instances both bags and instances can now directly participate in the optimization process by using CCCP, no need to use optimization heuristics how to design MI kernels? marginalized kernel
53 Recent Research ensemble learning multi instance learning IJCAI 2007b KDD 2006 ICML 2006a IJCAI 2007a feature extraction Kernel methods JMLR 2005 ICML 2005 TNN 2006a large datasets NIPS 2006a machine learning ICML 2006b NIPS 2006b semi supervised learning kernel learning ICML 2003 IJCAI 2003 ICML 2004 MLJ 2006 TNN 2006b NIPS 2003 TSAP 2005 TASLP 2006 applications TNN 2004 IJCAI 2005 TKDE 2006 GLOBECOM 2005 speech image / recognition vision pervasive computing
54 jamesk Google: james kwok
Very Large SVM Training using Core Vector Machines
Very Large SVM Training using Core Vector Machines Ivor W. Tsang James T. Kwok Department of Computer Science The Hong Kong University of Science and Technology Clear Water Bay Hong Kong Pak-Ming Cheung
More informationAll lecture slides will be available at CSC2515_Winter15.html
CSC2515 Fall 2015 Introduc3on to Machine Learning Lecture 9: Support Vector Machines All lecture slides will be available at http://www.cs.toronto.edu/~urtasun/courses/csc2515/ CSC2515_Winter15.html Many
More informationThe Pre-Image Problem in Kernel Methods
The Pre-Image Problem in Kernel Methods James Kwok Ivor Tsang Department of Computer Science Hong Kong University of Science and Technology Hong Kong The Pre-Image Problem in Kernel Methods ICML-2003 1
More informationSupport Vector Machines.
Support Vector Machines srihari@buffalo.edu SVM Discussion Overview 1. Overview of SVMs 2. Margin Geometry 3. SVM Optimization 4. Overlapping Distributions 5. Relationship to Logistic Regression 6. Dealing
More informationSUPPORT VECTOR MACHINES
SUPPORT VECTOR MACHINES Today Reading AIMA 18.9 Goals (Naïve Bayes classifiers) Support vector machines 1 Support Vector Machines (SVMs) SVMs are probably the most popular off-the-shelf classifier! Software
More informationOverview Citation. ML Introduction. Overview Schedule. ML Intro Dataset. Introduction to Semi-Supervised Learning Review 10/4/2010
INFORMATICS SEMINAR SEPT. 27 & OCT. 4, 2010 Introduction to Semi-Supervised Learning Review 2 Overview Citation X. Zhu and A.B. Goldberg, Introduction to Semi- Supervised Learning, Morgan & Claypool Publishers,
More informationKernel Methods & Support Vector Machines
& Support Vector Machines & Support Vector Machines Arvind Visvanathan CSCE 970 Pattern Recognition 1 & Support Vector Machines Question? Draw a single line to separate two classes? 2 & Support Vector
More informationSUPPORT VECTOR MACHINES
SUPPORT VECTOR MACHINES Today Reading AIMA 8.9 (SVMs) Goals Finish Backpropagation Support vector machines Backpropagation. Begin with randomly initialized weights 2. Apply the neural network to each training
More informationData Mining: Concepts and Techniques. Chapter 9 Classification: Support Vector Machines. Support Vector Machines (SVMs)
Data Mining: Concepts and Techniques Chapter 9 Classification: Support Vector Machines 1 Support Vector Machines (SVMs) SVMs are a set of related supervised learning methods used for classification Based
More informationLinear methods for supervised learning
Linear methods for supervised learning LDA Logistic regression Naïve Bayes PLA Maximum margin hyperplanes Soft-margin hyperplanes Least squares resgression Ridge regression Nonlinear feature maps Sometimes
More informationSupport Vector Machines
Support Vector Machines RBF-networks Support Vector Machines Good Decision Boundary Optimization Problem Soft margin Hyperplane Non-linear Decision Boundary Kernel-Trick Approximation Accurancy Overtraining
More informationConstrained optimization
Constrained optimization A general constrained optimization problem has the form where The Lagrangian function is given by Primal and dual optimization problems Primal: Dual: Weak duality: Strong duality:
More informationSupport Vector Machines and their Applications
Purushottam Kar Department of Computer Science and Engineering, Indian Institute of Technology Kanpur. Summer School on Expert Systems And Their Applications, Indian Institute of Information Technology
More informationKernels + K-Means Introduction to Machine Learning. Matt Gormley Lecture 29 April 25, 2018
10-601 Introduction to Machine Learning Machine Learning Department School of Computer Science Carnegie Mellon University Kernels + K-Means Matt Gormley Lecture 29 April 25, 2018 1 Reminders Homework 8:
More informationSupport Vector Machines
Support Vector Machines . Importance of SVM SVM is a discriminative method that brings together:. computational learning theory. previously known methods in linear discriminant functions 3. optimization
More informationOne-class Problems and Outlier Detection. 陶卿 中国科学院自动化研究所
One-class Problems and Outlier Detection 陶卿 Qing.tao@mail.ia.ac.cn 中国科学院自动化研究所 Application-driven Various kinds of detection problems: unexpected conditions in engineering; abnormalities in medical data,
More informationDM6 Support Vector Machines
DM6 Support Vector Machines Outline Large margin linear classifier Linear separable Nonlinear separable Creating nonlinear classifiers: kernel trick Discussion on SVM Conclusion SVM: LARGE MARGIN LINEAR
More informationPart 5: Structured Support Vector Machines
Part 5: Structured Support Vector Machines Sebastian Nowozin and Christoph H. Lampert Providence, 21st June 2012 1 / 34 Problem (Loss-Minimizing Parameter Learning) Let d(x, y) be the (unknown) true data
More informationA Taxonomy of Semi-Supervised Learning Algorithms
A Taxonomy of Semi-Supervised Learning Algorithms Olivier Chapelle Max Planck Institute for Biological Cybernetics December 2005 Outline 1 Introduction 2 Generative models 3 Low density separation 4 Graph
More information9. Support Vector Machines. The linearly separable case: hard-margin SVMs. The linearly separable case: hard-margin SVMs. Learning objectives
Foundations of Machine Learning École Centrale Paris Fall 25 9. Support Vector Machines Chloé-Agathe Azencot Centre for Computational Biology, Mines ParisTech Learning objectives chloe agathe.azencott@mines
More informationInstance-level Semi-supervised Multiple Instance Learning
Instance-level Semi-supervised Multiple Instance Learning Yangqing Jia and Changshui Zhang State Key Laboratory on Intelligent Technology and Systems Tsinghua National Laboratory for Information Science
More informationStanford University. A Distributed Solver for Kernalized SVM
Stanford University CME 323 Final Project A Distributed Solver for Kernalized SVM Haoming Li Bangzheng He haoming@stanford.edu bzhe@stanford.edu GitHub Repository https://github.com/cme323project/spark_kernel_svm.git
More informationMachine Learning: Think Big and Parallel
Day 1 Inderjit S. Dhillon Dept of Computer Science UT Austin CS395T: Topics in Multicore Programming Oct 1, 2013 Outline Scikit-learn: Machine Learning in Python Supervised Learning day1 Regression: Least
More informationCS 229 Midterm Review
CS 229 Midterm Review Course Staff Fall 2018 11/2/2018 Outline Today: SVMs Kernels Tree Ensembles EM Algorithm / Mixture Models [ Focus on building intuition, less so on solving specific problems. Ask
More informationSecond Order SMO Improves SVM Online and Active Learning
Second Order SMO Improves SVM Online and Active Learning Tobias Glasmachers and Christian Igel Institut für Neuroinformatik, Ruhr-Universität Bochum 4478 Bochum, Germany Abstract Iterative learning algorithms
More informationSemi-Supervised Clustering with Partial Background Information
Semi-Supervised Clustering with Partial Background Information Jing Gao Pang-Ning Tan Haibin Cheng Abstract Incorporating background knowledge into unsupervised clustering algorithms has been the subject
More informationECS289: Scalable Machine Learning
ECS289: Scalable Machine Learning Cho-Jui Hsieh UC Davis Sept 22, 2016 Course Information Website: http://www.stat.ucdavis.edu/~chohsieh/teaching/ ECS289G_Fall2016/main.html My office: Mathematical Sciences
More informationCSE 417T: Introduction to Machine Learning. Lecture 22: The Kernel Trick. Henry Chai 11/15/18
CSE 417T: Introduction to Machine Learning Lecture 22: The Kernel Trick Henry Chai 11/15/18 Linearly Inseparable Data What can we do if the data is not linearly separable? Accept some non-zero in-sample
More informationData Analysis 3. Support Vector Machines. Jan Platoš October 30, 2017
Data Analysis 3 Support Vector Machines Jan Platoš October 30, 2017 Department of Computer Science Faculty of Electrical Engineering and Computer Science VŠB - Technical University of Ostrava Table of
More informationModule 4. Non-linear machine learning econometrics: Support Vector Machine
Module 4. Non-linear machine learning econometrics: Support Vector Machine THE CONTRACTOR IS ACTING UNDER A FRAMEWORK CONTRACT CONCLUDED WITH THE COMMISSION Introduction When the assumption of linearity
More informationBagging for One-Class Learning
Bagging for One-Class Learning David Kamm December 13, 2008 1 Introduction Consider the following outlier detection problem: suppose you are given an unlabeled data set and make the assumptions that one
More informationGenerative and discriminative classification techniques
Generative and discriminative classification techniques Machine Learning and Category Representation 013-014 Jakob Verbeek, December 13+0, 013 Course website: http://lear.inrialpes.fr/~verbeek/mlcr.13.14
More informationCOMS 4771 Support Vector Machines. Nakul Verma
COMS 4771 Support Vector Machines Nakul Verma Last time Decision boundaries for classification Linear decision boundary (linear classification) The Perceptron algorithm Mistake bound for the perceptron
More informationIntroduction to Support Vector Machines
Introduction to Support Vector Machines CS 536: Machine Learning Littman (Wu, TA) Administration Slides borrowed from Martin Law (from the web). 1 Outline History of support vector machines (SVM) Two classes,
More informationSupport Vector Machines
Support Vector Machines About the Name... A Support Vector A training sample used to define classification boundaries in SVMs located near class boundaries Support Vector Machines Binary classifiers whose
More informationThorsten Joachims Then: Universität Dortmund, Germany Now: Cornell University, USA
Retrospective ICML99 Transductive Inference for Text Classification using Support Vector Machines Thorsten Joachims Then: Universität Dortmund, Germany Now: Cornell University, USA Outline The paper in
More informationLecture 9: Support Vector Machines
Lecture 9: Support Vector Machines William Webber (william@williamwebber.com) COMP90042, 2014, Semester 1, Lecture 8 What we ll learn in this lecture Support Vector Machines (SVMs) a highly robust and
More informationPart 5: Structured Support Vector Machines
Part 5: Structured Support Vector Machines Sebastian Noozin and Christoph H. Lampert Colorado Springs, 25th June 2011 1 / 56 Problem (Loss-Minimizing Parameter Learning) Let d(x, y) be the (unknon) true
More informationSupport Vector Machines
Support Vector Machines RBF-networks Support Vector Machines Good Decision Boundary Optimization Problem Soft margin Hyperplane Non-linear Decision Boundary Kernel-Trick Approximation Accurancy Overtraining
More informationSemi-supervised learning and active learning
Semi-supervised learning and active learning Le Song Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012 Combining classifiers Ensemble learning: a machine learning paradigm where multiple learners
More informationSupport vector machines
Support vector machines When the data is linearly separable, which of the many possible solutions should we prefer? SVM criterion: maximize the margin, or distance between the hyperplane and the closest
More informationarxiv: v1 [cs.lg] 4 Dec 2012
Training Support Vector Machines Using Frank-Wolfe Optimization Methods arxiv:1212.0695v1 [cs.lg] 4 Dec 2012 Emanuele Frandi 1, Ricardo Ñanculef2, Maria Grazia Gasparo 3, Stefano Lodi 4, and Claudio Sartori
More informationProgramming, numerics and optimization
Programming, numerics and optimization Lecture C-4: Constrained optimization Łukasz Jankowski ljank@ippt.pan.pl Institute of Fundamental Technological Research Room 4.32, Phone +22.8261281 ext. 428 June
More informationPerceptron Learning Algorithm (PLA)
Review: Lecture 4 Perceptron Learning Algorithm (PLA) Learning algorithm for linear threshold functions (LTF) (iterative) Energy function: PLA implements a stochastic gradient algorithm Novikoff s theorem
More informationBehavioral Data Mining. Lecture 10 Kernel methods and SVMs
Behavioral Data Mining Lecture 10 Kernel methods and SVMs Outline SVMs as large-margin linear classifiers Kernel methods SVM algorithms SVMs as large-margin classifiers margin The separating plane maximizes
More informationSVM in Analysis of Cross-Sectional Epidemiological Data Dmitriy Fradkin. April 4, 2005 Dmitriy Fradkin, Rutgers University Page 1
SVM in Analysis of Cross-Sectional Epidemiological Data Dmitriy Fradkin April 4, 2005 Dmitriy Fradkin, Rutgers University Page 1 Overview The goals of analyzing cross-sectional data Standard methods used
More informationLarge Scale Manifold Transduction
Large Scale Manifold Transduction Michael Karlen, Jason Weston, Ayse Erkan & Ronan Collobert NEC Labs America, Princeton, USA Ećole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland New York University,
More informationNearest Neighbor with KD Trees
Case Study 2: Document Retrieval Finding Similar Documents Using Nearest Neighbors Machine Learning/Statistics for Big Data CSE599C1/STAT592, University of Washington Emily Fox January 22 nd, 2013 1 Nearest
More informationSVMs for Structured Output. Andrea Vedaldi
SVMs for Structured Output Andrea Vedaldi SVM struct Tsochantaridis Hofmann Joachims Altun 04 Extending SVMs 3 Extending SVMs SVM = parametric function arbitrary input binary output 3 Extending SVMs SVM
More informationMachine Learning Lecture 9
Course Outline Machine Learning Lecture 9 Fundamentals ( weeks) Bayes Decision Theory Probability Density Estimation Nonlinear SVMs 30.05.016 Discriminative Approaches (5 weeks) Linear Discriminant Functions
More informationSupport Vector Machines
Support Vector Machines SVM Discussion Overview. Importance of SVMs. Overview of Mathematical Techniques Employed 3. Margin Geometry 4. SVM Training Methodology 5. Overlapping Distributions 6. Dealing
More informationLECTURE 5: DUAL PROBLEMS AND KERNELS. * Most of the slides in this lecture are from
LECTURE 5: DUAL PROBLEMS AND KERNELS * Most of the slides in this lecture are from http://www.robots.ox.ac.uk/~az/lectures/ml Optimization Loss function Loss functions SVM review PRIMAL-DUAL PROBLEM Max-min
More informationInformation Management course
Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 20: 10/12/2015 Data Mining: Concepts and Techniques (3 rd ed.) Chapter
More informationIntroduction to Machine Learning
Introduction to Machine Learning Maximum Margin Methods Varun Chandola Computer Science & Engineering State University of New York at Buffalo Buffalo, NY, USA chandola@buffalo.edu Chandola@UB CSE 474/574
More informationNearest Neighbor with KD Trees
Case Study 2: Document Retrieval Finding Similar Documents Using Nearest Neighbors Machine Learning/Statistics for Big Data CSE599C1/STAT592, University of Washington Emily Fox January 22 nd, 2013 1 Nearest
More informationNon-Bayesian Classifiers Part II: Linear Discriminants and Support Vector Machines
Non-Bayesian Classifiers Part II: Linear Discriminants and Support Vector Machines Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Spring 2007 c 2007,
More informationSupport Vector Machines (a brief introduction) Adrian Bevan.
Support Vector Machines (a brief introduction) Adrian Bevan email: a.j.bevan@qmul.ac.uk Outline! Overview:! Introduce the problem and review the various aspects that underpin the SVM concept.! Hard margin
More information10. Support Vector Machines
Foundations of Machine Learning CentraleSupélec Fall 2017 10. Support Vector Machines Chloé-Agathe Azencot Centre for Computational Biology, Mines ParisTech chloe-agathe.azencott@mines-paristech.fr Learning
More informationMachine Learning Lecture 9
Course Outline Machine Learning Lecture 9 Fundamentals ( weeks) Bayes Decision Theory Probability Density Estimation Nonlinear SVMs 19.05.013 Discriminative Approaches (5 weeks) Linear Discriminant Functions
More informationChoosing the kernel parameters for SVMs by the inter-cluster distance in the feature space Authors: Kuo-Ping Wu, Sheng-De Wang Published 2008
Choosing the kernel parameters for SVMs by the inter-cluster distance in the feature space Authors: Kuo-Ping Wu, Sheng-De Wang Published 2008 Presented by: Nandini Deka UH Mathematics Spring 2014 Workshop
More informationLecture 7: Support Vector Machine
Lecture 7: Support Vector Machine Hien Van Nguyen University of Houston 9/28/2017 Separating hyperplane Red and green dots can be separated by a separating hyperplane Two classes are separable, i.e., each
More informationCombine the PA Algorithm with a Proximal Classifier
Combine the Passive and Aggressive Algorithm with a Proximal Classifier Yuh-Jye Lee Joint work with Y.-C. Tseng Dept. of Computer Science & Information Engineering TaiwanTech. Dept. of Statistics@NCKU
More informationCSE 573: Artificial Intelligence Autumn 2010
CSE 573: Artificial Intelligence Autumn 2010 Lecture 16: Machine Learning Topics 12/7/2010 Luke Zettlemoyer Most slides over the course adapted from Dan Klein. 1 Announcements Syllabus revised Machine
More informationMI2LS: Multi-Instance Learning from Multiple Information Sources
MI2LS: Multi-Instance Learning from Multiple Information Sources Dan Zhang 1, Jingrui He 2, Richard Lawrence 3 1 Facebook Incorporation, Menlo Park, CA 2 Stevens Institute of Technology Hoboken, NJ 3 IBM
More informationChap.12 Kernel methods [Book, Chap.7]
Chap.12 Kernel methods [Book, Chap.7] Neural network methods became popular in the mid to late 1980s, but by the mid to late 1990s, kernel methods have also become popular in machine learning. The first
More informationStreamed Learning: One-Pass SVMs
Proceedings of the Twenty-First International Joint Conference on Artificial Intelligence (IJCAI-09) Streamed Learning: One-Pass SVMs Piyush Rai, Hal Daumé III, Suresh Venkatasubramanian University of
More informationSupport Vector Machines.
Support Vector Machines srihari@buffalo.edu SVM Discussion Overview. Importance of SVMs. Overview of Mathematical Techniques Employed 3. Margin Geometry 4. SVM Training Methodology 5. Overlapping Distributions
More informationAdapting SVM Classifiers to Data with Shifted Distributions
Adapting SVM Classifiers to Data with Shifted Distributions Jun Yang School of Computer Science Carnegie Mellon University Pittsburgh, PA 523 juny@cs.cmu.edu Rong Yan IBM T.J.Watson Research Center 9 Skyline
More informationContent-based image and video analysis. Machine learning
Content-based image and video analysis Machine learning for multimedia retrieval 04.05.2009 What is machine learning? Some problems are very hard to solve by writing a computer program by hand Almost all
More information12 Classification using Support Vector Machines
160 Bioinformatics I, WS 14/15, D. Huson, January 28, 2015 12 Classification using Support Vector Machines This lecture is based on the following sources, which are all recommended reading: F. Markowetz.
More informationLearning with infinitely many features
Learning with infinitely many features R. Flamary, Joint work with A. Rakotomamonjy F. Yger, M. Volpi, M. Dalla Mura, D. Tuia Laboratoire Lagrange, Université de Nice Sophia Antipolis December 2012 Example
More informationA Dendrogram. Bioinformatics (Lec 17)
A Dendrogram 3/15/05 1 Hierarchical Clustering [Johnson, SC, 1967] Given n points in R d, compute the distance between every pair of points While (not done) Pick closest pair of points s i and s j and
More informationBagging and Boosting Algorithms for Support Vector Machine Classifiers
Bagging and Boosting Algorithms for Support Vector Machine Classifiers Noritaka SHIGEI and Hiromi MIYAJIMA Dept. of Electrical and Electronics Engineering, Kagoshima University 1-21-40, Korimoto, Kagoshima
More informationMathematical Programming and Research Methods (Part II)
Mathematical Programming and Research Methods (Part II) 4. Convexity and Optimization Massimiliano Pontil (based on previous lecture by Andreas Argyriou) 1 Today s Plan Convex sets and functions Types
More informationEfficient Iterative Semi-supervised Classification on Manifold
. Efficient Iterative Semi-supervised Classification on Manifold... M. Farajtabar, H. R. Rabiee, A. Shaban, A. Soltani-Farani Sharif University of Technology, Tehran, Iran. Presented by Pooria Joulani
More informationIntroduction to object recognition. Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and others
Introduction to object recognition Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and others Overview Basic recognition tasks A statistical learning approach Traditional or shallow recognition
More informationHW2 due on Thursday. Face Recognition: Dimensionality Reduction. Biometrics CSE 190 Lecture 11. Perceptron Revisited: Linear Separators
HW due on Thursday Face Recognition: Dimensionality Reduction Biometrics CSE 190 Lecture 11 CSE190, Winter 010 CSE190, Winter 010 Perceptron Revisited: Linear Separators Binary classification can be viewed
More informationKernel spectral clustering: model representations, sparsity and out-of-sample extensions
Kernel spectral clustering: model representations, sparsity and out-of-sample extensions Johan Suykens and Carlos Alzate K.U. Leuven, ESAT-SCD/SISTA Kasteelpark Arenberg B-3 Leuven (Heverlee), Belgium
More informationKernels and representation
Kernels and representation Corso di AA, anno 2017/18, Padova Fabio Aiolli 20 Dicembre 2017 Fabio Aiolli Kernels and representation 20 Dicembre 2017 1 / 19 (Hierarchical) Representation Learning Hierarchical
More informationEvaluation of Performance Measures for SVR Hyperparameter Selection
Evaluation of Performance Measures for SVR Hyperparameter Selection Koen Smets, Brigitte Verdonk, Elsa M. Jordaan Abstract To obtain accurate modeling results, it is of primal importance to find optimal
More informationKernels for Structured Data
T-122.102 Special Course in Information Science VI: Co-occurence methods in analysis of discrete data Kernels for Structured Data Based on article: A Survey of Kernels for Structured Data by Thomas Gärtner
More informationLarge synthetic data sets to compare different data mining methods
Large synthetic data sets to compare different data mining methods Victoria Ivanova, Yaroslav Nalivajko Superviser: David Pfander, IPVS ivanova.informatics@gmail.com yaroslav.nalivayko@gmail.com June 3,
More informationSupport Vector Machines
Support Vector Machines VL Algorithmisches Lernen, Teil 3a Norman Hendrich & Jianwei Zhang University of Hamburg, Dept. of Informatics Vogt-Kölln-Str. 30, D-22527 Hamburg hendrich@informatik.uni-hamburg.de
More informationKernels and Clustering
Kernels and Clustering Robert Platt Northeastern University All slides in this file are adapted from CS188 UC Berkeley Case-Based Learning Non-Separable Data Case-Based Reasoning Classification from similarity
More informationMultiple Instance Learning for Sparse Positive Bags
Razvan C. Bunescu razvan@cs.utexas.edu Raymond J. Mooney mooney@cs.utexas.edu Department of Computer Sciences, University of Texas at Austin, 1 University Station C0500, TX 78712 USA Abstract We present
More informationFast Support Vector Machine Classification of Very Large Datasets
Fast Support Vector Machine Classification of Very Large Datasets Janis Fehr 1, Karina Zapién Arreola 2 and Hans Burkhardt 1 1 University of Freiburg, Chair of Pattern Recognition and Image Processing
More informationClassification. 1 o Semestre 2007/2008
Classification Departamento de Engenharia Informática Instituto Superior Técnico 1 o Semestre 2007/2008 Slides baseados nos slides oficiais do livro Mining the Web c Soumen Chakrabarti. Outline 1 2 3 Single-Class
More informationFeature Extractors. CS 188: Artificial Intelligence Fall Nearest-Neighbor Classification. The Perceptron Update Rule.
CS 188: Artificial Intelligence Fall 2007 Lecture 26: Kernels 11/29/2007 Dan Klein UC Berkeley Feature Extractors A feature extractor maps inputs to feature vectors Dear Sir. First, I must solicit your
More informationClassification by Support Vector Machines
Classification by Support Vector Machines Florian Markowetz Max-Planck-Institute for Molecular Genetics Computational Molecular Biology Berlin Practical DNA Microarray Analysis 2003 1 Overview I II III
More informationTransductive Learning: Motivation, Model, Algorithms
Transductive Learning: Motivation, Model, Algorithms Olivier Bousquet Centre de Mathématiques Appliquées Ecole Polytechnique, FRANCE olivier.bousquet@m4x.org University of New Mexico, January 2002 Goal
More informationSVM in Oracle Database 10g: Removing the Barriers to Widespread Adoption of Support Vector Machines
SVM in Oracle Database 10g: Removing the Barriers to Widespread Adoption of Support Vector Machines Boriana Milenova, Joseph Yarmus, Marcos Campos Data Mining Technologies Oracle Overview Support Vector
More informationDivide and Conquer Kernel Ridge Regression
Divide and Conquer Kernel Ridge Regression Yuchen Zhang John Duchi Martin Wainwright University of California, Berkeley COLT 2013 Yuchen Zhang (UC Berkeley) Divide and Conquer KRR COLT 2013 1 / 15 Problem
More informationIteratively Re-weighted Least Squares for Sums of Convex Functions
Iteratively Re-weighted Least Squares for Sums of Convex Functions James Burke University of Washington Jiashan Wang LinkedIn Frank Curtis Lehigh University Hao Wang Shanghai Tech University Daiwei He
More informationProblem 1: Complexity of Update Rules for Logistic Regression
Case Study 1: Estimating Click Probabilities Tackling an Unknown Number of Features with Sketching Machine Learning for Big Data CSE547/STAT548, University of Washington Emily Fox January 16 th, 2014 1
More informationClassification by Support Vector Machines
Classification by Support Vector Machines Florian Markowetz Max-Planck-Institute for Molecular Genetics Computational Molecular Biology Berlin Practical DNA Microarray Analysis 2003 1 Overview I II III
More informationA (somewhat) Unified Approach to Semisupervised and Unsupervised Learning
A (somewhat) Unified Approach to Semisupervised and Unsupervised Learning Ben Recht Center for the Mathematics of Information Caltech April 11, 2007 Joint work with Ali Rahimi (Intel Research) Overview
More informationSemi-supervised Learning
Semi-supervised Learning Piyush Rai CS5350/6350: Machine Learning November 8, 2011 Semi-supervised Learning Supervised Learning models require labeled data Learning a reliable model usually requires plenty
More informationData-driven Kernels for Support Vector Machines
Data-driven Kernels for Support Vector Machines by Xin Yao A research paper presented to the University of Waterloo in partial fulfillment of the requirement for the degree of Master of Mathematics in
More informationSemi-supervised Methods
Semi-supervised Methods Irwin King Joint work with Zenglin Xu Department of Computer Science & Engineering The Chinese University of Hong Kong Academia Sinica 2009 Irwin King (CUHK) Semi-supervised Methods
More informationA New Fuzzy Membership Computation Method for Fuzzy Support Vector Machines
A New Fuzzy Membership Computation Method for Fuzzy Support Vector Machines Trung Le, Dat Tran, Wanli Ma and Dharmendra Sharma Faculty of Information Sciences and Engineering University of Canberra, Australia
More information