Big Data Analytics
Special Topics for Computer Science: CSE 4095-001 / CSE 5095-005
Feb 11
Fei Wang, Associate Professor
Department of Computer Science and Engineering
fei_wang@uconn.edu

Clustering II

Spectral Clustering
- Algorithms that cluster points using eigenvectors of matrices derived from the data.
- They obtain a data representation in a low-dimensional space that can be easily clustered.
- There is a variety of methods that use the eigenvectors differently, which can make the family of algorithms difficult to understand at first.

Graph Theory Basics
- A graph G = (V, E) consists of a vertex set V and an edge set E.
- If G is a directed graph, each edge is an ordered pair of vertices.
- A bipartite graph is one in which the vertices can be divided into two groups, so that all edges join vertices in different groups.

Similarity Graph
- As distance decreases, similarity increases.
- Represent the dataset as a weighted graph G(V, E):
  V = {x_i}: a set of n vertices representing the data points.
  E = {W_ij}: a set of weighted edges indicating pairwise similarity between points.
- W_ij = 0 means the two points have no similarity; W_ii = 0.
[Figure: the running example, a similarity graph on six vertices x_1, ..., x_6 with edge weights such as 0.1, 0.6, and 0.7.]

Graph Partitioning
- Clustering can be viewed as partitioning a similarity graph.
- Bi-partitioning task: divide the vertices into two disjoint groups (A, B).
[Figure: the example graph divided into two groups A and B.]

Partitioning Criterion
Traditional definition of a good clustering:
- Points assigned to the same cluster should be highly similar.
- Points assigned to different clusters should be highly dissimilar.
[Figure: the example similarity graph.]

Graph Cut
- Express the partitioning objective as a function of the edge cut of the partition.
- Cut: the set of edges with exactly one endpoint in each group. We want to find the minimal cut between groups; the partition is the pair of groups achieving that minimal cut.
[Figure: a cut through the example graph.]
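In symbols, for a bi-partition (A, B) the cut is the total weight of the edges crossing between the groups:

\mathrm{cut}(A, B) = \sum_{i \in A,\; j \in B} w_{ij}

With the edge weights of the running example, cutting between {x_1, x_2, x_3} and {x_4, x_5, x_6} severs the edges (x_1, x_5) and (x_3, x_4), so cut(A, B) = 0.1 + 0.2 = 0.3.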

Min-Cut
- Minimize the weight of the connections between groups: min cut(A, B).
- Problem: the minimum cut tends to slice off small sets of weakly connected vertices, so the minimum cut is not necessarily the optimal cut.
[Figure: an optimal cut vs. a degenerate minimum cut.]

Normalized Cut
- Consider the connectivity between groups relative to the density of each group: normalize the association between groups by volume.
- Vol(A): the total weight of the edges originating from group A.
- Why use this criterion? Minimizing the normalized cut is equivalent to maximizing the normalized association, and it produces more balanced partitions.
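In symbols (Shi and Malik's definition, which the slide states in words):

\mathrm{NCut}(A, B) = \frac{\mathrm{cut}(A, B)}{\mathrm{vol}(A)} + \frac{\mathrm{cut}(A, B)}{\mathrm{vol}(B)}, \qquad \mathrm{vol}(A) = \sum_{i \in A} d_i

A cut that isolates a single low-degree vertex makes vol(A) tiny and the first term large, so NCut penalizes exactly the unbalanced partitions that plain min-cut prefers.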

Spectral Graph Theory
- Possible approach: represent the similarity graph as a matrix and apply knowledge from linear algebra.
- The eigenvalues and eigenvectors of a matrix provide global information about its structure.
- Spectral graph theory: analyze the spectrum of the matrix representing a graph.
- Spectrum: the eigenvectors of a graph, ordered by the magnitude (strength) of their corresponding eigenvalues.

Matrix Representation
Similarity (adjacency) matrix W of the example graph:

      x1    x2    x3    x4    x5    x6
x1     0   0.8   0.6     0   0.1     0
x2   0.8     0   0.8     0     0     0
x3   0.6   0.8     0   0.2     0     0
x4     0     0   0.2     0   0.8   0.7
x5   0.1     0     0   0.8     0   0.8
x6     0     0     0   0.7   0.8     0

Degree matrix D (diagonal, D_ii = sum_j W_ij):

      x1    x2    x3    x4    x5    x6
x1   1.5     0     0     0     0     0
x2     0   1.6     0     0     0     0
x3     0     0   1.6     0     0     0
x4     0     0     0   1.7     0     0
x5     0     0     0     0   1.7     0
x6     0     0     0     0     0   1.5

Laplacian Matrix
L = D - W:

      x1    x2    x3    x4    x5    x6
x1   1.5  -0.8  -0.6     0  -0.1     0
x2  -0.8   1.6  -0.8     0     0     0
x3  -0.6  -0.8   1.6  -0.2     0     0
x4     0     0  -0.2   1.7  -0.8  -0.7
x5  -0.1     0     0  -0.8   1.7  -0.8
x6     0     0     0  -0.7  -0.8   1.5
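A minimal numpy sketch that rebuilds W, D, and L for the running example and checks the defining properties of the unnormalized Laplacian (symmetric, with rows summing to zero):

import numpy as np

# Similarity matrix W of the six-vertex running example.
W = np.array([
    [0.0, 0.8, 0.6, 0.0, 0.1, 0.0],
    [0.8, 0.0, 0.8, 0.0, 0.0, 0.0],
    [0.6, 0.8, 0.0, 0.2, 0.0, 0.0],
    [0.0, 0.0, 0.2, 0.0, 0.8, 0.7],
    [0.1, 0.0, 0.0, 0.8, 0.0, 0.8],
    [0.0, 0.0, 0.0, 0.7, 0.8, 0.0],
])

D = np.diag(W.sum(axis=1))   # degree matrix: D_ii = sum_j W_ij
L = D - W                    # unnormalized graph Laplacian

print(np.diag(D))                     # [1.5 1.6 1.6 1.7 1.7 1.5], matching the slide
assert np.allclose(L, L.T)            # L is symmetric
assert np.allclose(L.sum(axis=1), 0)  # each row of L sums to zero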

Normalized Laplacian
L_sym = D^{-1/2} L D^{-1/2}: the diagonal entries are 1, and the off-diagonal entries are -W_ij / sqrt(d_i d_j). For the example graph:

       x1     x2     x3     x4     x5     x6
x1   1.00  -0.52  -0.39     0  -0.06     0
x2  -0.52   1.00  -0.50     0     0      0
x3  -0.39  -0.50   1.00  -0.12     0     0
x4      0      0  -0.12   1.00  -0.47  -0.44
x5  -0.06      0      0  -0.47   1.00  -0.50
x6      0      0      0  -0.44  -0.50   1.00

Spectral Clustering
Three basic stages:
1. Pre-processing: construct a matrix representation of the dataset.
2. Decomposition: compute the eigenvalues and eigenvectors of the matrix, and map each point to a lower-dimensional representation based on one or more eigenvectors.
3. Grouping: assign points to two or more clusters based on the new representation.

Min-cut
Eigendecompose the example Laplacian, L = XΛX^T, with the eigenvalues on the diagonal of Λ and the corresponding eigenvectors as the columns of X. The smallest eigenvalue is 0.0 and its eigenvector is constant (every entry ≈ 0.4 = 1/√6). The entries of the second eigenvector carry one sign on {x_1, x_2, x_3} and the opposite sign on {x_4, x_5, x_6}, so thresholding that eigenvector at zero recovers the min-cut bi-partition.
[Slide: the numerical Λ and X for the example graph.]
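Continuing the numpy sketch above, we can verify the sign structure of the second eigenvector (the Fiedler vector) of the example Laplacian:

# Eigendecomposition of the unnormalized Laplacian L from the previous snippet.
eigvals, eigvecs = np.linalg.eigh(L)   # eigh: L is symmetric, eigenvalues ascending

fiedler = eigvecs[:, 1]                # eigenvector of the second-smallest eigenvalue
labels = (fiedler > 0).astype(int)     # threshold at zero -> bi-partition
print(eigvals.round(2))                # smallest eigenvalue is 0.0
print(labels)                          # x1, x2, x3 on one side; x4, x5, x6 on the other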

k-way Partitioning: Recursive Bi-partitioning
- Partition using only one eigenvector at a time.
- Apply the procedure recursively.
- Example: image segmentation. Use the eigenvector of the second-smallest eigenvalue to define the optimal cut, and recursively generate two clusters with each cut.

k-way Partitioning: Using k Eigenvectors
- Use k eigenvectors (k chosen by the user) and directly compute the k-way partitioning.
- Experimentally, this has been seen to work better than recursive bi-partitioning.

Spectral Clustering with Data Vectors
- Given a set of data points, form the pairwise affinity matrix.
- Construct the normalized Laplacian matrix.
- Stack the eigenvectors of the k smallest eigenvalues into a matrix.
- Renormalize each row of that matrix to unit norm, then run k-means on the rows.
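A compact sketch of this pipeline in Python, in the style of the Ng-Jordan-Weiss algorithm; the Gaussian affinity, the fixed sigma, and the use of scikit-learn's KMeans for the grouping step are illustrative choices, not prescribed by the slides:

import numpy as np
from sklearn.cluster import KMeans

def spectral_clustering(X, k, sigma=1.0):
    """Cluster the rows of X into k groups via the normalized Laplacian."""
    # Pre-processing: Gaussian affinity matrix with zero diagonal.
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-sq_dists / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # Normalized Laplacian: L_sym = I - D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L_sym = np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]

    # Decomposition: eigenvectors of the k smallest eigenvalues, stacked as columns.
    _, eigvecs = np.linalg.eigh(L_sym)
    U = eigvecs[:, :k]

    # Grouping: renormalize each row to unit norm, then k-means on the rows.
    U = U / np.linalg.norm(U, axis=1, keepdims=True)
    return KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(U)

For two well-separated blobs of points, spectral_clustering(X, k=2) recovers the blobs for any reasonable sigma.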

Gaussian Affinity
The standard choice of pairwise affinity: W_ij = exp(-||x_i - x_j||^2 / (2σ^2)). Nearby points get affinity close to 1, and the scale parameter σ controls how quickly affinity decays with distance.

Magic Sigma
[Figure: clustering results for σ = 0.015625, σ = 0.041235, σ = 0.35355, and σ = 1; the quality of the clustering depends strongly on the choice of σ.]

Local Scaling
Instead of selecting a single scaling parameter σ, calculate a local scaling parameter σ_i for each data point.

[Figure 2: The effect of local scaling. (a) Input data points: a tight cluster inside a background cluster. (b) The affinity between each point and its surrounding neighbors, indicated by the thickness of the connecting line; the affinities across clusters are larger than the affinities within the background cluster. (c) The corresponding affinities after local scaling: the affinities across clusters are now smaller than the affinities within any single cluster.]
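A sketch of the self-tuning affinity in the spirit of Zelnik-Manor and Perona's method (demoed at the link below): σ_i is the distance from x_i to its K-th nearest neighbor, and W_ij = exp(-d(x_i, x_j)^2 / (σ_i σ_j)). The choice K = 7 follows their paper; the helper name is ours:

import numpy as np

def local_scaling_affinity(X, K=7):
    """W_ij = exp(-||x_i - x_j||^2 / (sigma_i * sigma_j)) with per-point scales."""
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    dists = np.sqrt(sq_dists)
    # sigma_i = distance from x_i to its K-th nearest neighbor
    # (column 0 of the sorted rows is the zero self-distance).
    sigma = np.sort(dists, axis=1)[:, K]
    W = np.exp(-sq_dists / (sigma[:, None] * sigma[None, :]))
    np.fill_diagonal(W, 0.0)
    return W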

http://www.vision.caltech.edu/lihi/demos/selftuningclustering.html

How to determine k
Eigengap: the difference between two consecutive eigenvalues. The most stable clustering is generally given by the value of k that maximizes the eigengap Δ_k = |λ_k - λ_{k-1}|.
[Plot: the 20 smallest eigenvalues of a graph Laplacian vs. k; a pronounced eigengap marks the natural number of clusters.]
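A small numpy sketch of the heuristic (the helper name is ours):

import numpy as np

def choose_k_by_eigengap(L, k_max=20):
    """Pick k that maximizes the gap between consecutive Laplacian eigenvalues."""
    eigvals = np.linalg.eigvalsh(L)[:k_max]  # smallest eigenvalues, ascending
    gaps = np.diff(eigvals)                  # gaps[i] = lambda_{i+2} - lambda_{i+1}
    return int(np.argmax(gaps)) + 1          # k with the largest gap just above it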