Kernel spectral clustering: model representations, sparsity and out-of-sample extensions
1 Kernel spectral clustering: model representations, sparsity and out-of-sample extensions
Johan Suykens and Carlos Alzate
K.U. Leuven, ESAT-SCD/SISTA, Kasteelpark Arenberg 10, B-3001 Leuven (Heverlee), Belgium
4th Int. Conf. on Computational Harmonic Analysis (ICCHA4), May 2011, Hong Kong
2 Overview
- Introduction
- Kernel PCA: primal and dual model representations
- Spectral clustering
- Kernel spectral clustering
- Model selection
- Sparsity
- Incorporating prior knowledge
3 Introduction
Application studies related to spectral clustering:
- complex networks: clustering time series in power grid networks
- clustering of scientific journals
- clustering time series in the CoRoT exoplanet database
- prognostics tools for predicting maintenance of machines
- image segmentation
[Alzate et al., 2007, 2009; Varon et al.]
4 Power grid networks
[Figure: normalized load versus hour, winter and summer.]
- Network: 45 substations in the Belgian grid
- Data: hourly load with seasonal, weekly and intra-daily patterns
- Aim: short-term load forecasting, important for power generation decisions
- Clustering: identifying customer profiles from time series (over 5 years)
[Espinoza et al., IEEE CSM 2007; Alzate et al., 2009]
5 Kernel-based models and learning theory
Classification and regression:
- the use of kernel-based models is well established
- aim: a good predictive model (achieving good generalization)
- model selection (kernel parameters, regularization constants): cross-validation (CV)
Clustering:
- underlying kernel-based models?
- out-of-sample extensions?
- learning and generalization?
- training, validation, test data?
- tuning parameters: how to determine them?
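8 Kernel principal component analysis
[Figure: linear PCA versus kernel PCA (RBF kernel) in the $(x^{(1)}, x^{(2)})$ plane.]
Kernel PCA [Schölkopf et al., 1998]: eigenvalue decomposition of the kernel matrix
$$\begin{bmatrix} K(x_1,x_1) & \cdots & K(x_1,x_N) \\ \vdots & \ddots & \vdots \\ K(x_N,x_1) & \cdots & K(x_N,x_N) \end{bmatrix}$$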
9 Kernel PCA: primal and dual problem
Underlying primal problem [Suykens et al., 2003]:
$$\min_{w,b,e}\; -\tfrac{1}{2} w^T w + \tfrac{1}{2}\gamma \sum_{i=1}^{N} e_i^2 \quad \text{s.t.} \quad e_i = w^T \varphi(x_i) + b, \quad i = 1,\dots,N.$$
The (Lagrange) dual problem is kernel PCA:
$$\Omega_c \alpha = \lambda \alpha \quad \text{with} \quad \lambda = 1/\gamma,$$
where $\Omega_{c,ij} = (\varphi(x_i) - \hat{\mu}_\varphi)^T (\varphi(x_j) - \hat{\mu}_\varphi)$ is the centered kernel matrix.
Interpretation:
1. pool of candidate components (objective function equals zero)
2. select relevant components
10 Kernel PCA: model representations
The model $\mathcal{M}$ has a primal (P) and a dual (D) representation:
(P): $\hat{e}_* = \sum_j w_j \varphi_j(x_*) + b = w^T \varphi(x_*) + b$
(D): $\hat{e}_* = \sum_i \alpha_i K(x_*, x_i) + b$
Both can be evaluated at any point $x_* \in \mathbb{R}^d$, where $K(x_*, x_i) = \varphi(x_*)^T \varphi(x_i)$, with $K(\cdot,\cdot)$ a positive definite kernel and feature map $\varphi(\cdot): \mathbb{R}^d \to \mathbb{R}^{n_h}$.
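To make the dual representation concrete, here is a minimal sketch that extracts the leading kernel PCA component from the centered kernel matrix and evaluates the score at arbitrary points. The function names, the RBF bandwidth convention $\exp(-\|x-z\|^2/(2\sigma^2))$ and the bias formula $b = -\frac{1}{N} 1_N^T \Omega \alpha$ (obtained from $\sum_i \alpha_i = 0$, which forces zero-mean training scores) are assumptions of this example rather than something stated on the slides.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian RBF kernel matrix; the 2*sigma^2 scaling is an assumed convention."""
    sq = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def kpca_fit(X, sigma):
    """Leading eigenpair of the centered kernel matrix, plus the bias term b."""
    N = X.shape[0]
    Omega = rbf_kernel(X, X, sigma)
    Mc = np.eye(N) - np.ones((N, N)) / N       # centering matrix
    Omega_c = Mc @ Omega @ Mc                  # centered kernel matrix
    lam, vecs = np.linalg.eigh(Omega_c)        # eigenvalues in ascending order
    alpha = vecs[:, -1]                        # leading dual vector
    b = -np.mean(Omega @ alpha)                # makes the training scores zero-mean
    return alpha, b, lam[-1]

def kpca_score(X_new, X, alpha, b, sigma):
    """Dual model: e(x) = sum_i alpha_i K(x, x_i) + b, valid at any x."""
    return rbf_kernel(X_new, X, sigma) @ alpha + b
```

On the training points the score reduces to $\lambda \alpha_i$, which gives a quick sanity check for an implementation along these lines.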
11 Sparse and robust versions
Iteratively weighted $L_2$ loss (to reduce the influence of outliers):
$$\min_{w,b,e}\; -\tfrac{1}{2} w^T w + \tfrac{1}{2}\gamma \sum_{i=1}^{N} v_i e_i^2 \quad \text{subject to} \quad e_i = w^T \varphi(x_i) + b, \quad i = 1,\dots,N.$$
Other loss functions, e.g. Huber loss for robustness, $\epsilon$-insensitive loss for sparsity:
$$\min_{w,b,e}\; -\tfrac{1}{2} w^T w + \tfrac{1}{2}\gamma \sum_{i=1}^{N} L(e_i) \quad \text{subject to} \quad e_i = w^T \varphi(x_i) + b, \quad i = 1,\dots,N.$$
[Alzate & Suykens, IEEE-TNN, 2008]
12-13 Robustness: Kernel Component Analysis
[Figure: original image, corrupted image, KPCA reconstruction, KCA reconstruction.]
Weighted LS-SVM: robustness and sparsity [Alzate & Suykens, IEEE-TNN 2008]
15 Spectral graph clustering
Minimal cut: given the graph $G = (V, E)$, find clusters $A_1, A_2$:
$$\min_{q_i \in \{-1,+1\}}\; \tfrac{1}{2} \sum_{i,j} w_{ij} (q_i - q_j)^2$$
with cluster membership indicator $q_i$ ($q_i = 1$ if $i \in A_1$, $q_i = -1$ if $i \in A_2$) and $W = [w_{ij}]$ the weighted adjacency matrix.
[Figure: example graph with two cuts of different size, one of them the minimal cut.]
16 Spectral graph clustering
Min-cut spectral clustering problem:
$$\min_{\tilde{q}^T \tilde{q} = 1}\; \tilde{q}^T L \tilde{q}$$
with $L = D - W$ the unnormalized graph Laplacian and degree matrix $D = \operatorname{diag}(d_1,\dots,d_N)$, $d_i = \sum_j w_{ij}$, giving $L \tilde{q} = \lambda \tilde{q}$.
Cluster member indicators: $\hat{q}_i = \operatorname{sign}(\tilde{q}_i - \theta)$ with threshold $\theta$.
Normalized cut: $L \tilde{q} = \lambda D \tilde{q}$
[Fiedler, 1973; Shi & Malik, 2000; Ng et al., 2002; Chung, 1997; von Luxburg, 2007]
From the discrete version to the continuous problem (Laplace operator)
[Belkin & Niyogi, 2003; von Luxburg et al., 2008; Smale & Zhou, 2007]
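The relaxed min-cut problem can be illustrated with a few lines of NumPy. The sketch below assumes a symmetric weight matrix and takes the eigenvector of the second-smallest eigenvalue of $L$ (the smallest one belongs to the constant vector and carries no cluster information); the function names and the default threshold $\theta = 0$ are choices made for this example only.

```python
import numpy as np

def laplacian(W):
    """Unnormalized graph Laplacian L = D - W."""
    return np.diag(W.sum(axis=1)) - W

def cut_objective(W, q):
    """0.5 * sum_ij w_ij (q_i - q_j)^2, which equals q^T L q."""
    diff = q[:, None] - q[None, :]
    return 0.5 * np.sum(W * diff**2)

def fiedler_partition(W, theta=0.0):
    """Threshold the second eigenvector of L to get +/-1 cluster indicators."""
    lam, Q = np.linalg.eigh(laplacian(W))      # ascending eigenvalues
    q_tilde = Q[:, 1]                          # skip the constant eigenvector
    labels = np.where(q_tilde > theta, 1, -1)
    return labels, lam[1]
```

For a graph made of two tightly connected groups joined by a weak edge, the returned labels are constant within each group and differ between them.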
17 Spectral clustering + K-means
[Figure.]
19 Kernel spectral clustering: case of two clusters
Underlying model (primal representation): $\hat{e}_* = w^T \varphi(x_*) + b$, with $\hat{q}_* = \operatorname{sign}[\hat{e}_*]$ the estimated cluster indicator at any $x_* \in \mathbb{R}^d$.
Primal problem (training on given data $\{x_i\}_{i=1}^{N}$):
$$\min_{w,b,e}\; -\tfrac{1}{2} w^T w + \tfrac{1}{2}\gamma \sum_{i=1}^{N} v_i e_i^2 \quad \text{subject to} \quad e_i = w^T \varphi(x_i) + b, \quad i = 1,\dots,N$$
with positive weights $v_i$ (these will be related to the inverse degree matrix).
[Alzate & Suykens, IEEE-PAMI, 2010]
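20 Lagrangian and conditions for optimality
Lagrangian:
$$\mathcal{L}(w,b,e;\alpha) = -\tfrac{1}{2} w^T w + \tfrac{1}{2}\gamma \sum_{i=1}^{N} v_i e_i^2 - \sum_{i=1}^{N} \alpha_i \left(e_i - w^T \varphi(x_i) - b\right)$$
Conditions for optimality:
$$\frac{\partial \mathcal{L}}{\partial w} = 0 \;\Rightarrow\; w = \sum_i \alpha_i \varphi(x_i)$$
$$\frac{\partial \mathcal{L}}{\partial b} = 0 \;\Rightarrow\; \sum_i \alpha_i = 0$$
$$\frac{\partial \mathcal{L}}{\partial e_i} = 0 \;\Rightarrow\; \alpha_i = \gamma v_i e_i, \quad i = 1,\dots,N$$
$$\frac{\partial \mathcal{L}}{\partial \alpha_i} = 0 \;\Rightarrow\; e_i = w^T \varphi(x_i) + b, \quad i = 1,\dots,N$$
Eliminate $w, b, e$ and write the solution in $\alpha$.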
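21 Kernel-based model representation
Dual problem:
$$V M_V \Omega \alpha = \lambda \alpha \quad \text{with} \quad \lambda = 1/\gamma,$$
where $M_V = I_N - \frac{1}{1_N^T V 1_N} 1_N 1_N^T V$ is the weighted centering matrix and $\Omega = [\Omega_{ij}]$ is the kernel matrix with $\Omega_{ij} = \varphi(x_i)^T \varphi(x_j) = K(x_i, x_j)$.
Dual model representation:
$$\hat{e}_* = \sum_{i=1}^{N} \alpha_i K(x_i, x_*) + b$$
with $K(x_i, x_*) = \varphi(x_i)^T \varphi(x_*)$.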
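The elimination of $w$, $b$ and $e$ is not spelled out on the slides; one way to carry it out is sketched below. It uses only the optimality conditions $w = \sum_i \alpha_i \varphi(x_i)$, $\sum_i \alpha_i = 0$, $\alpha_i = \gamma v_i e_i$ and $e_i = w^T \varphi(x_i) + b$, written in vector form with $V = \operatorname{diag}(v_1,\dots,v_N)$. It also yields an explicit expression for the bias term, which is needed to evaluate the dual model.

```latex
\begin{align*}
e &= \Omega \alpha + b\, 1_N
  && \text{(substitute } w = \textstyle\sum_i \alpha_i \varphi(x_i)\text{)} \\
1_N^T V e &= 0
  && \text{(from } \alpha = \gamma V e \text{ and } 1_N^T \alpha = 0\text{)} \\
b &= -\frac{1_N^T V \Omega \alpha}{1_N^T V 1_N}
  && \text{(solve the previous line for } b\text{)} \\
e &= \Big(I_N - \frac{1}{1_N^T V 1_N} 1_N 1_N^T V\Big) \Omega \alpha = M_V \Omega \alpha \\
\alpha &= \gamma V M_V \Omega \alpha
  \;\;\Longleftrightarrow\;\;
  V M_V \Omega \alpha = \lambda \alpha, \quad \lambda = 1/\gamma .
\end{align*}
```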
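22 Choice of weights $v_i$
Take $V = D^{-1}$, where $D = \operatorname{diag}\{d_1,\dots,d_N\}$ and $d_i = \sum_{j=1}^{N} \Omega_{ij}$. This gives the generalized eigenvalue problem
$$M_D \Omega \alpha = \lambda D \alpha \quad \text{with} \quad M_D = I_N - \frac{1}{1_N^T D^{-1} 1_N} 1_N 1_N^T D^{-1}.$$
This is a modified version of random walks spectral clustering.
Note that $\operatorname{sign}[e_i] = \operatorname{sign}[\alpha_i]$ on the training data, whereas $\operatorname{sign}[\hat{e}_*]$ also applies beyond the training data.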
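A minimal two-cluster sketch of this eigenvalue problem follows. It solves $D^{-1} M_D \Omega \alpha = \lambda \alpha$ with a general (non-symmetric) eigensolver and keeps the eigenvector with the largest eigenvalue. The RBF kernel convention, the function names and the bias expression $b = -\,1_N^T D^{-1} \Omega \alpha / (1_N^T D^{-1} 1_N)$ are assumptions of the example; the bias follows from $\sum_i \alpha_i = 0$ with $V = D^{-1}$.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian RBF kernel matrix; the 2*sigma^2 scaling is an assumed convention."""
    sq = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def ksc_fit_binary(X, sigma):
    """Leading solution of D^{-1} M_D Omega alpha = lambda alpha."""
    Omega = rbf_kernel(X, X, sigma)
    d = Omega.sum(axis=1)                       # degrees d_i = sum_j Omega_ij
    d_inv = 1.0 / d
    s = d_inv.sum()                             # 1^T D^{-1} 1
    N = len(d)
    M_D = np.eye(N) - np.outer(np.ones(N), d_inv) / s
    vals, vecs = np.linalg.eig(d_inv[:, None] * (M_D @ Omega))
    top = np.argmax(vals.real)                  # eigenvalues are real in theory
    alpha, lam = vecs[:, top].real, vals[top].real
    b = -(d_inv @ (Omega @ alpha)) / s          # bias term
    return alpha, b, lam, d

def ksc_score(X_new, X, alpha, b, sigma):
    """Out-of-sample score e(x) = sum_i alpha_i K(x, x_i) + b; its sign is the cluster."""
    return rbf_kernel(X_new, X, sigma) @ alpha + b
```

On training data the scores satisfy $e = \lambda D \alpha$, so their signs agree with those of $\alpha$; the same `ksc_score` function then assigns unseen points.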
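23 Kernel spectral clustering: more clusters
Case of $k$ clusters: additional sets of constraints
$$\min_{w^{(l)}, e^{(l)}, b_l}\; -\tfrac{1}{2} \sum_{l=1}^{k-1} w^{(l)T} w^{(l)} + \tfrac{1}{2} \sum_{l=1}^{k-1} \gamma_l\, e^{(l)T} D^{-1} e^{(l)}$$
subject to
$$e^{(1)} = \Phi_{N \times n_h} w^{(1)} + b_1 1_N, \quad e^{(2)} = \Phi_{N \times n_h} w^{(2)} + b_2 1_N, \quad \dots, \quad e^{(k-1)} = \Phi_{N \times n_h} w^{(k-1)} + b_{k-1} 1_N$$
where $e^{(l)} = [e^{(l)}_1; \dots; e^{(l)}_N]$ and $\Phi_{N \times n_h} = [\varphi(x_1)^T; \dots; \varphi(x_N)^T] \in \mathbb{R}^{N \times n_h}$.
Dual problem: $M_D \Omega \alpha^{(l)} = \lambda D \alpha^{(l)}$, $l = 1,\dots,k-1$.
[Alzate & Suykens, IEEE-PAMI, 2010]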
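24 Primal and dual model representations
$k$ clusters require $k-1$ sets of constraints (index $l = 1,\dots,k-1$). The model $\mathcal{M}$ again has a primal and a dual representation:
(P): $\operatorname{sign}[\hat{e}^{(l)}_*] = \operatorname{sign}[w^{(l)T} \varphi(x_*) + b_l]$
(D): $\operatorname{sign}[\hat{e}^{(l)}_*] = \operatorname{sign}[\sum_j \alpha^{(l)}_j K(x_*, x_j) + b_l]$
Note: additional sets of constraints also appear in multi-class and vector-valued output LS-SVMs [Suykens et al., 1999]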
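The $k-1$ sign outputs can be read as a binary codeword per point. A common way to turn them into cluster labels, sketched here under that assumption, is to collect the $k$ most frequent sign patterns on the training scores into a codebook and to assign any new point to the codeword at the smallest Hamming distance. The function names and the tie-breaking (first match wins) are choices of this example.

```python
import numpy as np
from collections import Counter

def build_codebook(E_train, k):
    """Keep the k most frequent sign patterns of the (N, k-1) training scores."""
    signs = np.where(E_train >= 0, 1, -1)
    counts = Counter(map(tuple, signs))
    return np.array([codeword for codeword, _ in counts.most_common(k)])

def assign_clusters(E, codebook):
    """Assign each row of scores to the codeword with minimal Hamming distance."""
    signs = np.where(E >= 0, 1, -1)
    hamming = (signs[:, None, :] != codebook[None, :, :]).sum(axis=2)
    return hamming.argmin(axis=1)
```

With $k = 3$ and two score columns, for instance, a point whose sign pattern matches no codeword exactly is still assigned to the nearest one.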
25-26 Out-of-sample extension and coding
[Figures: cluster assignments extended to the whole $(x^{(1)}, x^{(2)})$ plane.]
28 Piecewise constant eigenvectors and extension
Definition [Meila & Shi, 2001]. A vector $\alpha$ is called piecewise constant relative to a partition $(A_1,\dots,A_k)$ iff $\alpha_i = \alpha_j$ for all $x_i, x_j \in A_p$, $p = 1,\dots,k$.
Proposition [Alzate & Suykens, 2010]. Assume
(i) a training set $\mathcal{D} = \{x_i\}_{i=1}^{N}$ and a validation set $\mathcal{D}^v = \{x^v_m\}_{m=1}^{N_v}$, i.i.d. sampled from the same underlying distribution;
(ii) a set of $k$ clusters $\{A_1,\dots,A_k\}$ with $k > 2$;
(iii) an isotropic kernel function such that $K(x, z) = 0$ when $x$ and $z$ belong to different clusters;
(iv) the eigenvectors $\alpha^{(l)}$ for $l = 1,\dots,k-1$ are piecewise constant.
Then validation set points belonging to the same cluster are collinear in the $(k-1)$-dimensional subspace spanned by the columns of $E^v \in \mathbb{R}^{N_v \times (k-1)}$, where
$$E^v_{ml} = e^{(l)}_m = \sum_{i=1}^{N} \alpha^{(l)}_i K(x_i, x^v_m) + b_l.$$
29 Piecewise constant eigenvectors and extension
Key aspect of the proof: for $x_*$ in cluster $A_p$, where $\alpha^{(l)}_i = c^{(l)}_p$ for all $i \in A_p$, one has
$$e^{(l)}_* = \sum_{i=1}^{N} \alpha^{(l)}_i K(x_i, x_*) + b_l = c^{(l)}_p \sum_{i \in A_p} K(x_i, x_*) + \sum_{i \notin A_p} \alpha^{(l)}_i K(x_i, x_*) + b_l = c^{(l)}_p \sum_{i \in A_p} K(x_i, x_*) + b_l,$$
since the sum over $i \notin A_p$ vanishes by assumption (iii).
Model selection to determine the kernel parameters and $k$:
- look for line structures in the space $(e^{(1)}_i, e^{(2)}_i, \dots, e^{(k-1)}_i)$, evaluated on validation data (aiming for good generalization)
Choice of kernel:
- Gaussian RBF kernel
- $\chi^2$-kernel for images
30-31 Model selection (looking for lines): toy problems
[Figures: validation-set scores $e^{(1)}_{i,val}$ versus $e^{(2)}_{i,val}$ for different values of $\sigma$, each with its BLF value, shown next to the train + validation + test data in input space.]
32 Example: image segmentation (looking for lines)
[Figure: validation-set scores $e^{(1)}_{i,val}$, $e^{(2)}_{i,val}$, $e^{(3)}_{i,val}$.]
33 [Figure: segmentation comparison with columns Image ID, Image, Proposed method, Nyström method, Human.]
34 Example: power grid networks - identifying customer profiles
- Power load: 45 substations, hourly data (5 years), so each time series is a high-dimensional vector
- Periodic AR modelling: dimensionality reduction
- k-means clustering applied after dimensionality reduction
[Figure: normalized load versus hour for each cluster.]
35 Clustering time-series: kernel spectral clustering
- Application of kernel spectral clustering directly on the high-dimensional time series
- Model selection on the kernel parameter and the number of clusters
[Alzate, Espinoza, De Moor, Suykens, 2009]
[Figure: normalized load versus hour for each cluster.]
36 Clustering time-series: kernel spectral clustering
[Figure: normalized load versus hour for three clusters.]
- Electricity load: 45 substations in the Belgian grid (split into training and validation sets)
- $x_i \in \mathbb{R}^d$: spectral clustering on high-dimensional data (5 years)
- 3 of 7 detected clusters:
  1. Residential profile: morning and evening peaks
  2. Business profile: peaked around noon
  3. Industrial profile: increasing in the morning, oscillating in the afternoon and evening
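37 Out-of-sample eigenvectors
From the conditions for optimality, an eigenvector $\alpha$ satisfies $\|\alpha\|_2 = 1$ and $1_N^T \alpha = 0$, with $\alpha = \gamma D^{-1} e$.
By defining $\deg(x) = \sum_{j=1}^{N} K(x, x_j)$, the notion of eigenvector is extended to a validation set as follows:
$$\alpha_{val} = \frac{\left[I_{N_v} - \frac{1}{N_v} 1_{N_v} 1_{N_v}^T\right] \gamma D_{val}^{-1} e_{val}}{\left\| \left[I_{N_v} - \frac{1}{N_v} 1_{N_v} 1_{N_v}^T\right] \gamma D_{val}^{-1} e_{val} \right\|_2}$$
satisfying $\|\alpha_{val}\|_2 = 1$ and $1_{N_v}^T \alpha_{val} = 0$. Here $N_v$ denotes the validation set size.
[Alzate & Suykens, IJCNN 2011]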
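The extension to validation data amounts to three steps: divide the scores by the degrees, center, and normalize. The sketch below assumes the validation-to-training kernel evaluations and the validation scores are already available; the function and argument names are hypothetical.

```python
import numpy as np

def out_of_sample_eigenvector(K_val, e_val, gamma=1.0):
    """Out-of-sample eigenvector from validation scores.

    K_val : (N_v, N) array with entries K(x_m^v, x_j)
    e_val : (N_v,) array of validation scores
    """
    deg = K_val.sum(axis=1)                    # deg(x) = sum_j K(x, x_j)
    raw = gamma * e_val / deg                  # gamma * D_val^{-1} e_val
    centered = raw - raw.mean()                # apply I - (1/N_v) 1 1^T
    return centered / np.linalg.norm(centered)
```

Because of the final normalization, the value of a positive $\gamma$ does not affect the result.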
38 Model selection (looking for dots): toy problem
[Figure: validation-set out-of-sample eigenvector entries $\alpha^{(1)}_{i,val}$ versus $\alpha^{(2)}_{i,val}$ with the Fisher criterion, for a badly tuned and a well tuned $\sigma$, next to the data in input space.]
39 Example: image segmentation (looking for dots)
[Figure: Fisher criterion versus number of clusters $k$, and $\alpha^{(1)}_{i,val}$ versus $\alpha^{(2)}_{i,val}$.]
41-42 Kernel spectral clustering: sparse kernel models
[Figures: original image, binary clustering, sparse kernel model.]
Incomplete Cholesky decomposition: $\|\Omega - G G^T\|_2 \le \eta$ with $G \in \mathbb{R}^{N \times R}$ and $R \ll N$.
Image (Berkeley image dataset): $321 \times 481$ (154,401 pixels), 75 SV
$$e^{(l)}_* = \sum_{i \in S_{SV}} \alpha^{(l)}_i K(x_i, x_*) + b_l$$
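For readers unfamiliar with the incomplete Cholesky decomposition, a standard pivoted variant is sketched below. It stops once the trace of the residual $\Omega - G G^T$ drops below $\eta$; since the residual is positive semidefinite, its 2-norm is bounded by its trace, so this stopping rule implies $\|\Omega - G G^T\|_2 \le \eta$. This is a generic textbook version under those assumptions, not necessarily the variant used for the results on the slides.

```python
import numpy as np

def incomplete_cholesky(K, eta):
    """Pivoted incomplete Cholesky: returns G (N x R) and the pivot indices."""
    N = K.shape[0]
    d = np.diag(K).astype(float).copy()        # diagonal of the residual
    G = np.zeros((N, 0))
    pivots = []
    while d.sum() > eta and len(pivots) < N:
        j = int(np.argmax(d))                  # pivot on the largest residual
        col = (K[:, j] - G @ G[j, :]) / np.sqrt(d[j])
        G = np.column_stack([G, col])
        d = np.maximum(d - col**2, 0.0)        # update the residual diagonal
        d[j] = 0.0                             # pivot is now represented exactly
        pivots.append(j)
    return G, pivots
```

For smooth kernels such as the RBF kernel the number of columns $R$ is typically far smaller than $N$, and the pivot indices give a natural candidate set of representative points.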
43 Highly sparse kernel models on images
- Application on images: $x_i \in \mathbb{R}^3$ (r, g, b values per pixel), $i = 1,\dots,N$
- Pre-processed into $z_i \in \mathbb{R}^8$ (quantization to 8 colors)
- $\chi^2$-kernel to compare two local color histograms ($5 \times 5$ pixels window)
- For large $N$, select a subset of size $M \ll N$ based on the quadratic Renyi entropy, as in the fixed-size method [Suykens et al., 2002]
- Highly sparse representations: #SV = $3k$
- Completion of the cluster indicators based on out-of-sample extensions, applied to the full image:
$$\operatorname{sign}[\hat{e}^{(l)}_*] = \operatorname{sign}\Big[\sum_{j \in S_{SV}} \alpha^{(l)}_j K(x_*, x_j) + b_l\Big]$$
[Alzate & Suykens, Neurocomputing 2011]
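The entropy-based subset selection can be sketched as a simple random swap search: start from a random subset, propose replacing one member by an outside point, and keep the swap only if the quadratic Renyi entropy estimate increases. The estimate $-\log\big(\frac{1}{M^2} \sum_{i,j} K(x_i, x_j)\big)$ ignores the kernel normalization constant, and the function names, iteration count and seed handling are assumptions of this example.

```python
import numpy as np

def renyi_entropy(K_sub):
    """Quadratic Renyi entropy estimate of a subset from its kernel matrix."""
    M = K_sub.shape[0]
    return -np.log(K_sub.sum() / M**2)

def select_subset(K, M, n_iter=500, seed=0):
    """Random swap search for a size-M subset with high quadratic Renyi entropy."""
    rng = np.random.default_rng(seed)
    N = K.shape[0]
    idx = rng.choice(N, size=M, replace=False)
    best = renyi_entropy(K[np.ix_(idx, idx)])
    for _ in range(n_iter):
        new = int(rng.integers(N))
        if new in idx:                          # candidate already selected
            continue
        cand = idx.copy()
        cand[rng.integers(M)] = new             # swap one member for the candidate
        h = renyi_entropy(K[np.ix_(cand, cand)])
        if h > best:                            # keep only improving swaps
            idx, best = cand, h
    return idx, best
```

Maximizing this entropy spreads the selected points over the data, which is why the resulting subset is a reasonable stand-in for the full set when $N$ is large.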
44-47 Highly sparse kernel models: toy examples
[Figures: data in the $(x^{(1)}, x^{(2)})$ plane and the corresponding scores $\hat{e}^{(l)}_i$. The first example uses only $3k = 9$ support vectors; the second example also uses only $3k$ support vectors.]
48-49 Highly sparse kernel models: image segmentation
[Figures: scores $e^{(1)}_i$, $e^{(2)}_i$, $e^{(3)}_i$, obtained with only $3k$ support vectors.]
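51 Kernel spectral clustering: adding prior knowledge
Pair of points $x_\dagger, x_\ddagger$: $c = 1$ must-link, $c = -1$ cannot-link.
Primal problem [Alzate & Suykens, IJCNN 2009]:
$$\min_{w^{(l)}, e^{(l)}, b_l}\; -\tfrac{1}{2} \sum_{l=1}^{k-1} w^{(l)T} w^{(l)} + \tfrac{1}{2} \sum_{l=1}^{k-1} \gamma_l\, e^{(l)T} D^{-1} e^{(l)}$$
subject to
$$e^{(1)} = \Phi_{N \times n_h} w^{(1)} + b_1 1_N, \quad \dots, \quad e^{(k-1)} = \Phi_{N \times n_h} w^{(k-1)} + b_{k-1} 1_N,$$
$$w^{(1)T} \varphi(x_\dagger) = c\, w^{(1)T} \varphi(x_\ddagger), \quad \dots, \quad w^{(k-1)T} \varphi(x_\dagger) = c\, w^{(k-1)T} \varphi(x_\ddagger).$$
Dual problem: yields a rank-one downdate of the kernel matrix.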
52-53 Kernel spectral clustering: example
[Figures: original image, segmentation without constraints, segmentation with constraints.]
54 Conclusions
- Spectral clustering within a kernel-based learning framework
- Training problem: characterization in terms of primal and dual problems
- Out-of-sample extensions: primal and dual model representations
- The desirable piecewise constant property is extended to the validation level
- New model selection criteria (learning and generalization aspects)
- (Highly) sparse kernel models
- Suitable for adding prior knowledge through constraints
55 References
- Alzate C., Suykens J.A.K., Multiway Spectral Clustering with Out-of-Sample Extensions through Weighted Kernel PCA, IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(2), 335-347, 2010.
- Alzate C., Suykens J.A.K., Sparse Kernel Spectral Clustering Models for Large-Scale Data Analysis, Neurocomputing, 74(9), 1382-1390, 2011.
- Alzate C., Suykens J.A.K., A Regularized Formulation for Spectral Clustering with Pairwise Constraints, International Joint Conference on Neural Networks (IJCNN 2009), Atlanta, US, 2009, 141-148.
- Alzate C., Suykens J.A.K., Out-of-Sample Eigenvectors in Kernel Spectral Clustering, to appear, International Joint Conference on Neural Networks (IJCNN 2011).
- Suykens J.A.K., Data Visualization and Dimensionality Reduction using Kernel Maps with a Reference Point, IEEE Transactions on Neural Networks, 19(9), 1501-1517, 2008.
- Suykens J.A.K., Alzate C., Pelckmans K., Primal and dual model representations in kernel-based learning, Statistics Surveys, 4, 148-183, 2010.
More informationMachine Learning for OR & FE
Machine Learning for OR & FE Unsupervised Learning: Clustering Martin Haugh Department of Industrial Engineering and Operations Research Columbia University Email: martin.b.haugh@gmail.com (Some material
More informationLecture 7: Support Vector Machine
Lecture 7: Support Vector Machine Hien Van Nguyen University of Houston 9/28/2017 Separating hyperplane Red and green dots can be separated by a separating hyperplane Two classes are separable, i.e., each
More informationMachine Learning for Signal Processing Clustering. Bhiksha Raj Class Oct 2016
Machine Learning for Signal Processing Clustering Bhiksha Raj Class 11. 13 Oct 2016 1 Statistical Modelling and Latent Structure Much of statistical modelling attempts to identify latent structure in the
More informationKernels + K-Means Introduction to Machine Learning. Matt Gormley Lecture 29 April 25, 2018
10-601 Introduction to Machine Learning Machine Learning Department School of Computer Science Carnegie Mellon University Kernels + K-Means Matt Gormley Lecture 29 April 25, 2018 1 Reminders Homework 8:
More informationELEG Compressive Sensing and Sparse Signal Representations
ELEG 867 - Compressive Sensing and Sparse Signal Representations Gonzalo R. Arce Depart. of Electrical and Computer Engineering University of Delaware Fall 211 Compressive Sensing G. Arce Fall, 211 1 /
More informationBilevel Sparse Coding
Adobe Research 345 Park Ave, San Jose, CA Mar 15, 2013 Outline 1 2 The learning model The learning algorithm 3 4 Sparse Modeling Many types of sensory data, e.g., images and audio, are in high-dimensional
More informationAdvanced Machine Learning Practical 1: Manifold Learning (PCA and Kernel PCA)
Advanced Machine Learning Practical : Manifold Learning (PCA and Kernel PCA) Professor: Aude Billard Assistants: Nadia Figueroa, Ilaria Lauzana and Brice Platerrier E-mails: aude.billard@epfl.ch, nadia.figueroafernandez@epfl.ch
More informationrandom fourier features for kernel ridge regression: approximation bounds and statistical guarantees
random fourier features for kernel ridge regression: approximation bounds and statistical guarantees Haim Avron, Michael Kapralov, Cameron Musco, Christopher Musco, Ameya Velingker, and Amir Zandieh Tel
More informationClustering. Department Biosysteme Karsten Borgwardt Data Mining Course Basel Fall Semester / 238
Clustering Department Biosysteme Karsten Borgwardt Data Mining Course Basel Fall Semester 2015 163 / 238 What is Clustering? Department Biosysteme Karsten Borgwardt Data Mining Course Basel Fall Semester
More informationThe Pre-Image Problem and Kernel PCA for Speech Enhancement
The Pre-Image Problem and Kernel PCA for Speech Enhancement Christina Leitner and Franz Pernkopf Signal Processing and Speech Communication Laboratory, Graz University of Technology, Inffeldgasse 6c, 8
More informationData Preprocessing. Javier Béjar AMLT /2017 CS - MAI. (CS - MAI) Data Preprocessing AMLT / / 71 BY: $\
Data Preprocessing S - MAI AMLT - 2016/2017 (S - MAI) Data Preprocessing AMLT - 2016/2017 1 / 71 Outline 1 Introduction Data Representation 2 Data Preprocessing Outliers Missing Values Normalization Discretization
More informationCS 534: Computer Vision Segmentation and Perceptual Grouping
CS 534: Computer Vision Segmentation and Perceptual Grouping Ahmed Elgammal Dept of Computer Science CS 534 Segmentation - 1 Outlines Mid-level vision What is segmentation Perceptual Grouping Segmentation
More informationThe Constrained Laplacian Rank Algorithm for Graph-Based Clustering
The Constrained Laplacian Rank Algorithm for Graph-Based Clustering Feiping Nie, Xiaoqian Wang, Michael I. Jordan, Heng Huang Department of Computer Science and Engineering, University of Texas, Arlington
More informationBagging for One-Class Learning
Bagging for One-Class Learning David Kamm December 13, 2008 1 Introduction Consider the following outlier detection problem: suppose you are given an unlabeled data set and make the assumptions that one
More informationClustering. SC4/SM4 Data Mining and Machine Learning, Hilary Term 2017 Dino Sejdinovic
Clustering SC4/SM4 Data Mining and Machine Learning, Hilary Term 2017 Dino Sejdinovic Clustering is one of the fundamental and ubiquitous tasks in exploratory data analysis a first intuition about the
More informationLecture 10: SVM Lecture Overview Support Vector Machines The binary classification problem
Computational Learning Theory Fall Semester, 2012/13 Lecture 10: SVM Lecturer: Yishay Mansour Scribe: Gitit Kehat, Yogev Vaknin and Ezra Levin 1 10.1 Lecture Overview In this lecture we present in detail
More informationApplying the Possibilistic C-Means Algorithm in Kernel-Induced Spaces
1 Applying the Possibilistic C-Means Algorithm in Kernel-Induced Spaces Maurizio Filippone, Francesco Masulli, and Stefano Rovetta M. Filippone is with the Department of Computer Science of the University
More informationData Mining in Bioinformatics Day 1: Classification
Data Mining in Bioinformatics Day 1: Classification Karsten Borgwardt February 18 to March 1, 2013 Machine Learning & Computational Biology Research Group Max Planck Institute Tübingen and Eberhard Karls
More informationClustering via Random Walk Hitting Time on Directed Graphs
Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence (8) Clustering via Random Walk Hitting Time on Directed Graphs Mo Chen Jianzhuang Liu Xiaoou Tang, Dept. of Information Engineering
More informationAdaptive Sparse Kernel Principal Component Analysis for Computation and Store Space Constrained-based Feature Extraction
Journal of Information Hiding and Multimedia Signal Processing c 2015 ISSN 2073-4212 Ubiquitous International Volume 6, Number 4, July 2015 Adaptive Sparse Kernel Principal Component Analysis for Computation
More informationFunction approximation using RBF network. 10 basis functions and 25 data points.
1 Function approximation using RBF network F (x j ) = m 1 w i ϕ( x j t i ) i=1 j = 1... N, m 1 = 10, N = 25 10 basis functions and 25 data points. Basis function centers are plotted with circles and data
More informationSoftware Documentation of the Potential Support Vector Machine
Software Documentation of the Potential Support Vector Machine Tilman Knebel and Sepp Hochreiter Department of Electrical Engineering and Computer Science Technische Universität Berlin 10587 Berlin, Germany
More informationLaplacian Eigenmaps and Bayesian Clustering Based Layout Pattern Sampling and Its Applications to Hotspot Detection and OPC
Laplacian Eigenmaps and Bayesian Clustering Based Layout Pattern Sampling and Its Applications to Hotspot Detection and OPC Tetsuaki Matsunawa 1, Bei Yu 2 and David Z. Pan 3 1 Toshiba Corporation 2 The
More informationScalable Clustering of Signed Networks Using Balance Normalized Cut
Scalable Clustering of Signed Networks Using Balance Normalized Cut Kai-Yang Chiang,, Inderjit S. Dhillon The 21st ACM International Conference on Information and Knowledge Management (CIKM 2012) Oct.
More informationKernels and representation
Kernels and representation Corso di AA, anno 2017/18, Padova Fabio Aiolli 20 Dicembre 2017 Fabio Aiolli Kernels and representation 20 Dicembre 2017 1 / 19 (Hierarchical) Representation Learning Hierarchical
More informationCLASSIFICATION AND CHANGE DETECTION
IMAGE ANALYSIS, CLASSIFICATION AND CHANGE DETECTION IN REMOTE SENSING With Algorithms for ENVI/IDL and Python THIRD EDITION Morton J. Canty CRC Press Taylor & Francis Group Boca Raton London NewYork CRC
More informationLecture 10 CNNs on Graphs
Lecture 10 CNNs on Graphs CMSC 35246: Deep Learning Shubhendu Trivedi & Risi Kondor University of Chicago April 26, 2017 Two Scenarios For CNNs on graphs, we have two distinct scenarios: Scenario 1: Each
More informationSupport vector machines
Support vector machines When the data is linearly separable, which of the many possible solutions should we prefer? SVM criterion: maximize the margin, or distance between the hyperplane and the closest
More informationKernel Methods in Machine Learning
Outline Department of Computer Science and Engineering Hong Kong University of Science and Technology Hong Kong Joint work with Ivor Tsang, Pakming Cheung, Andras Kocsor, Jacek Zurada, Kimo Lai November
More informationKernel Methods and Visualization for Interval Data Mining
Kernel Methods and Visualization for Interval Data Mining Thanh-Nghi Do 1 and François Poulet 2 1 College of Information Technology, Can Tho University, 1 Ly Tu Trong Street, Can Tho, VietNam (e-mail:
More informationMSA220 - Statistical Learning for Big Data
MSA220 - Statistical Learning for Big Data Lecture 13 Rebecka Jörnsten Mathematical Sciences University of Gothenburg and Chalmers University of Technology Clustering Explorative analysis - finding groups
More informationLocality Preserving Projections (LPP) Abstract
Locality Preserving Projections (LPP) Xiaofei He Partha Niyogi Computer Science Department Computer Science Department The University of Chicago The University of Chicago Chicago, IL 60615 Chicago, IL
More information10-701/15-781, Fall 2006, Final
-7/-78, Fall 6, Final Dec, :pm-8:pm There are 9 questions in this exam ( pages including this cover sheet). If you need more room to work out your answer to a question, use the back of the page and clearly
More informationA WEIGHTED SUPPORT VECTOR MACHINE FOR DATA CLASSIFICATION
International Journal of Pattern Recognition and Artificial Intelligence Vol. 2, No. 5 (2007) 96 976 c World Scientific Publishing Company A WEIGHTED SUPPORT VECTOR MACHINE FOR DATA CLASSIFICATION XULEI
More informationNon-exhaustive, Overlapping k-means
Non-exhaustive, Overlapping k-means J. J. Whang, I. S. Dhilon, and D. F. Gleich Teresa Lebair University of Maryland, Baltimore County October 29th, 2015 Teresa Lebair UMBC 1/38 Outline Introduction NEO-K-Means
More information