Spectral Clustering on Handwritten Digits Database

Spectral Clustering on Handwritten Digits Database
Danielle, dmiddle1@math.umd.edu
Advisor: Kasso Okoudjou, kasso@umd.edu
Department of Mathematics, University of Maryland, College Park
Advanced Scientific Computing I
October 6, 2015

Background Information
Spectral clustering is a clustering technique that makes use of the spectrum of the similarity matrix derived from the data set. It implements a clustering algorithm on a reduced-dimension representation of the data.
Advantages: The algorithm is simple to implement and uses standard linear algebra methods to solve the problem efficiently.
Motivation: Implement an algorithm that groups the objects in a data set together with other objects that behave similarly.

Definitions
A graph $G = (V, E)$ where $V = \{v_1, \dots, v_n\}$.
$W$: adjacency matrix, with
\[ w_{ij} = \begin{cases} 1, & \text{if } v_i, v_j \text{ are connected by an edge} \\ 0, & \text{otherwise} \end{cases} \]
The degree of a vertex: $d_i = \sum_{j=1}^{n} w_{ij}$.
The degree matrix, denoted $D$, has $d_1, \dots, d_n$ on its diagonal.
For a subset of vertices $A \subseteq V$, denote its complement $\bar{A} = V \setminus A$.
$|A|$ = number of vertices in $A$.
$\mathrm{vol}(A) = \sum_{i \in A} d_i$.
$W(A, B) = \sum_{i \in A, \, j \in B} w_{ij}$.

Example
$G = (V, E)$ with $V = \{1, \dots, 7\}$: vertices $\{1, 2, 3\}$ form a triangle and vertices $\{4, 5, 6, 7\}$ form a complete graph, with no edges between the two groups.
\[ W = \begin{pmatrix} 0&1&1&0&0&0&0 \\ 1&0&1&0&0&0&0 \\ 1&1&0&0&0&0&0 \\ 0&0&0&0&1&1&1 \\ 0&0&0&1&0&1&1 \\ 0&0&0&1&1&0&1 \\ 0&0&0&1&1&1&0 \end{pmatrix} \qquad D = \begin{pmatrix} 2&0&0&0&0&0&0 \\ 0&2&0&0&0&0&0 \\ 0&0&2&0&0&0&0 \\ 0&0&0&3&0&0&0 \\ 0&0&0&0&3&0&0 \\ 0&0&0&0&0&3&0 \\ 0&0&0&0&0&0&3 \end{pmatrix} \]
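To make the example concrete, here is a minimal MATLAB sketch (illustrative only, not part of the project code) that builds W, D, and the unnormalized Laplacian L = D - W for this graph and checks its spectrum:

% Example graph: a triangle {1,2,3} and a complete graph {4,5,6,7},
% with no edges between the two groups.
W = zeros(7);
W(1:3,1:3) = 1 - eye(3);      % triangle on vertices 1-3
W(4:7,4:7) = 1 - eye(4);      % complete graph on vertices 4-7
D = diag(sum(W, 2));          % degree matrix
L = D - W;                    % unnormalized Laplacian
% The eigenvalue 0 appears once per connected component (twice here),
% and its eigenvectors are constant on each component.
disp(sort(eig(L))');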

Definitions
Similarity graph: Given a data set $x_1, \dots, x_n$ and a notion of similarity, a similarity graph is a graph in which $x_i$ and $x_j$ have an edge between them if they are considered similar. Some ways to determine whether data points are similar:
the $\varepsilon$-neighborhood graph,
the $k$-nearest neighbor graph,
a similarity function.
Unnormalized Laplacian matrix: $L = D - W$
Normalized Laplacian matrix: $L_{\mathrm{sym}} = D^{-1/2} L D^{-1/2} = I - D^{-1/2} W D^{-1/2}$

Why this works?
Spectral clustering is motivated by approximating the RatioCut or NCut of a given graph. Given a similarity graph, constructing a partition amounts to solving the mincut problem, that is,
\[ \min \; \mathrm{cut}(A_1, \dots, A_k) := \frac{1}{2} \sum_{i=1}^{k} W(A_i, \bar{A}_i). \]
In order to insist that each part of the partition is reasonably large, use RatioCut or NCut, in which the size of each part is measured by its number of vertices or by the weights of its edges, respectively.

Why this works?
Thus
\[ \mathrm{RatioCut}(A_1, \dots, A_k) := \frac{1}{2} \sum_{i=1}^{k} \frac{W(A_i, \bar{A}_i)}{|A_i|}, \qquad \mathrm{NCut}(A_1, \dots, A_k) := \frac{1}{2} \sum_{i=1}^{k} \frac{W(A_i, \bar{A}_i)}{\mathrm{vol}(A_i)}. \]
Adding these balancing terms makes the problem NP-hard. Spectral clustering solves relaxed versions of these problems. [2.]

Why this works?
Case $k = 2$. Given a subset $A$,
\[ \mathrm{NCut}(A, \bar{A}) = \frac{W(A, \bar{A})}{\mathrm{vol}(A)} + \frac{W(A, \bar{A})}{\mathrm{vol}(\bar{A})}. \]
Define the cluster indicator vector $f$ by
\[ f_i = f(v_i) = \begin{cases} \dfrac{1}{\mathrm{vol}(A)}, & \text{if } v_i \in A \\[2mm] -\dfrac{1}{\mathrm{vol}(\bar{A})}, & \text{if } v_i \in \bar{A} \end{cases} \]
Then
\[ f^T L f = \frac{1}{2} \sum_{i,j} w_{ij} (f_i - f_j)^2 = W(A, \bar{A}) \left( \frac{1}{\mathrm{vol}(A)} + \frac{1}{\mathrm{vol}(\bar{A})} \right)^2, \qquad f^T D f = \sum_i d_i f_i^2 = \frac{1}{\mathrm{vol}(A)} + \frac{1}{\mathrm{vol}(\bar{A})}. \]

Why this works?
Thus minimizing the NCut problem is equivalent to
\[ \min_{A} \mathrm{NCut}(A, \bar{A}) = \min_{f} \frac{f^T L f}{f^T D f} \]
over indicator vectors $f$ of the above form. The relaxed problem is
\[ \underset{f \in \mathbb{R}^n}{\text{minimize}} \quad \frac{f^T L f}{f^T D f} \qquad \text{subject to} \quad f^T D \mathbf{1} = 0. \]

Why this works?
It can be shown that the relaxed problem is a form of the Rayleigh-Ritz quotient. The Rayleigh-Ritz theorem states: given a Hermitian matrix $A$,
\[ \lambda_{\min} = \min_{x \neq 0} \frac{x^T A x}{x^T x}. \]
Thus in the relaxed problem, the solution $f$ is the second eigenvector of the generalized eigenvalue problem $L f = \lambda D f$.
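One way to make the connection to the normalized Laplacian explicit (a standard step in the tutorial [1.], stated here for completeness) is the substitution $g = D^{1/2} f$:
\[ \frac{f^T L f}{f^T D f} = \frac{g^T D^{-1/2} L D^{-1/2} g}{g^T g} = \frac{g^T L_{\mathrm{sym}} g}{g^T g}, \qquad f^T D \mathbf{1} = 0 \iff g \perp D^{1/2} \mathbf{1}. \]
So the relaxed NCut problem is solved by the eigenvector $g$ of $L_{\mathrm{sym}}$ for its second-smallest eigenvalue (the smallest belongs to $D^{1/2}\mathbf{1}$), with $f = D^{-1/2} g$; this is why the procedure below works with $L_{\mathrm{sym}}$.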

Procedure
Database → Similarity Graph → Normalized Laplacian → Compute the Eigenvectors → Put the eigenvectors in a matrix and normalize → Perform dimension reduction → Cluster the points

Databases
The database I will be using is the MNIST handwritten digits database. It contains 1000 images of each digit 0-9, and each image is 28x28 pixels.
Figure: Test images. (Simon A. J. Winder)

Similarity Graph
Gaussian similarity function:
\[ s(x_i, x_j) = e^{-\frac{\|x_i - x_j\|^2}{2\sigma^2}} \]
where $\sigma$ is a parameter. If $s(x_i, x_j) > \varepsilon$, connect $x_i$ and $x_j$ by an edge.
Each $x_i \in \mathbb{R}^{28 \times 28}$ corresponds to an image. Thus
\[ \|x_i - x_j\|_2^2 = \sum_{k=1}^{28} \sum_{l=1}^{28} \left( x^{i}_{kl} - x^{j}_{kl} \right)^2. \]
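As an illustration of this step, here is a hedged MATLAB sketch; the variable X and the values of sigma and epsilon are assumptions for the example, not choices made in the project:

% Gaussian similarity graph from images stored as columns of X (784 x n),
% each column a flattened 28x28 image. sigma and epsilon are illustrative.
sigma = 5; epsilon = 0.5;
n = size(X, 2);
sqdist = zeros(n);                    % pairwise squared distances
for i = 1:n
    for j = 1:n
        sqdist(i, j) = sum((X(:, i) - X(:, j)).^2);
    end
end
S = exp(-sqdist / (2 * sigma^2));     % Gaussian similarity
W = double(S > epsilon);              % edge when similarity exceeds the threshold
W(1:n+1:end) = 0;                     % remove self-loops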

Laplacian Matrix
$W$: adjacency matrix, with
\[ w_{ij} = \begin{cases} 1, & \text{if } s(x_i, x_j) > \varepsilon \\ 0, & \text{otherwise} \end{cases} \]
$D$: degree matrix.
$L = D - W$
$L_{\mathrm{sym}} = D^{-1/2} L D^{-1/2} = I - D^{-1/2} W D^{-1/2}$
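Continuing the sketch above (and assuming every vertex has at least one edge, so no degree is zero):

% Degree matrix and Laplacians built from the thresholded adjacency matrix W.
d = sum(W, 2);                         % vertex degrees
D = diag(d);
L = D - W;                             % unnormalized Laplacian
Dinvsqrt = diag(1 ./ sqrt(d));         % D^(-1/2); assumes d > 0 for every vertex
Lsym = eye(n) - Dinvsqrt * W * Dinvsqrt;   % normalized Laplacian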

Computing Eigenvectors
Use an iterative method, the power method, to find the first $k$ eigenvectors of $L_{\mathrm{sym}} = D^{-1/2} L D^{-1/2} = I - D^{-1/2} W D^{-1/2}$.
Start with an initial nonzero vector $v_0$ for the eigenvector. Let $B = D^{-1/2} W D^{-1/2}$. Form the sequence given by:
for $i = 1, \dots, l$
    $x_i = B v_{i-1}$
    $v_i = x_i / \|x_i\|$
end

Computing Eigenvectors (Con t) For large values of l we will obtain a good approximation of the dominant eigenvector of B. This will give us the eigenvector corresponding the the largest eigenvalue of B which corresponds to the smallest eigenvalue of L sym. To find the next eigenvector, after selecting the random initial vector v 0, subtract the component of v 0 that is parallel to the eigenvector of the largest eigenvalue.

Computing Eigenvectors (Cont.)
We put the first $k$ eigenvectors into a matrix and normalize its rows. Let $V \in \mathbb{R}^{n \times k}$ be the eigenvector matrix and let $T \in \mathbb{R}^{n \times k}$ be the matrix whose rows have norm 1, obtained by setting
\[ t_{ij} = \frac{v_{ij}}{\left( \sum_{m=1}^{k} v_{im}^2 \right)^{1/2}}, \]
so that $(v_{ij})$ is mapped row by row to $(t_{ij})$.

Dimension Reduction
Project the data points into the new space: let $y_i \in \mathbb{R}^k$ be the vector given by the $i$-th row of $T$,
\[ y_i = (t_{i1}, t_{i2}, \dots, t_{ik})^T, \]
so each original image $x_i$ is now represented by the $k$-dimensional vector $y_i$.
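The row normalization and embedding can be sketched in MATLAB as follows, assuming Vk is the eigenvector matrix from the previous sketch (repmat is used instead of implicit expansion since the project targets R2014b):

% Row-normalize the eigenvector matrix; row i of Y is the embedded point y_i.
rowNorms = sqrt(sum(Vk.^2, 2));
T = Vk ./ repmat(rowNorms, 1, k);      % each row scaled to unit length
Y = T;                                 % Y(i, :) represents image x_i in R^k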

Clustering
Perform a k-means algorithm on the set of vectors:
Randomly select $k$ cluster centroids $z_j$.
Calculate the distance between each $y_i$ and each $z_j$, and assign each data point to the closest centroid.
Recalculate the centroids and the distances from the data points to the new centroids.
If no data point was reassigned then stop; otherwise reassign the data points and repeat.
Finally, assign the original point $x_i$ to cluster $j$ if and only if row $i$ of the matrix $Y$ was assigned to cluster $j$.
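A hedged MATLAB sketch of these k-means steps on the embedded rows of Y follows; the built-in kmeans function from the Statistics Toolbox could be used instead, and the iteration cap here is an illustrative choice:

% Lloyd's k-means on the n x k matrix Y; idx(i) is the cluster of y_i (and of x_i).
idx = zeros(n, 1);
Z = Y(randperm(n, k), :);              % k random rows as initial centroids
for iter = 1:100
    dist = zeros(n, k);                % squared distance from each y_i to each centroid
    for j = 1:k
        dist(:, j) = sum((Y - repmat(Z(j, :), n, 1)).^2, 2);
    end
    [~, newIdx] = min(dist, [], 2);    % assign each point to its nearest centroid
    if isequal(newIdx, idx), break; end    % stop when no point changes cluster
    idx = newIdx;
    for j = 1:k                        % recompute centroids as cluster means
        if any(idx == j)
            Z(j, :) = mean(Y(idx == j, :), 1);
        end
    end
end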

Validation
Validate the k-means clustering on a well-known clustered data set and see if the results match.
Validate the computation of the eigenvectors by comparing against Matlab's built-in routines.
Validate the clustering solution by checking whether similar images are grouped together.

Hardware and Software
Personal laptop: MacBook Pro. I will be using Matlab R2014b for the coding.

Timeline
End of October / early November: Construct the similarity graph and the normalized Laplacian matrix.
End of November / early December: Compute the first k eigenvectors and validate this step.
February: Normalize the rows of the matrix of eigenvectors and perform dimension reduction.
March/April: Cluster the points using k-means and validate this step.
End of Spring semester: Implement the entire algorithm, optimize, and obtain final results.

Results
By the end of the project, I will deliver:
Code that loads the database
Code that implements the algorithm
A final report covering the algorithm outline, testing on the database, and results
A final presentation

[1.] von Luxburg, U. A Tutorial on Spectral Clustering. Statistics and Computing, 17 (2007), no. 4.
[2.] Shi, J. and Malik, J. Normalized Cuts and Image Segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22 (2000), no. 8.
[3.] Chung, Fan. Spectral Graph Theory. American Mathematical Society, CBMS Regional Conference Series in Mathematics, no. 92, 1997.
[4.] Vishnoi, Nisheeth K. Lx = b: Laplacian Solvers and their Algorithmic Applications. Foundations and Trends in Theoretical Computer Science, 2012.

Thank you