
CSCI-B609: A Theorist's Toolkit, Fall 2016 Sept. 6, 2016
Lecture 03: The Sparsest Cut Problem and Cheeger's Inequality
Lecturer: Yuan Zhou Scribe: Xuan Dong

We will continue studying spectral graph theory in this lecture. We start by introducing the sparsest cut problem and Cheeger's inequality. The ideas that we develop here will be used crucially later.

1 Sparsest Cut Problem

First, let us consider a real-world problem: community detection.

1.1 Community detection

In a social network graph, each person is a vertex, and if person u and person v know each other, then (u, v) is an edge. Intuitively, a community is a set S of people with the following two properties: (1) people in S are likely to know each other, and (2) people in S are less likely to know people outside of S (we call the complement set S̄). So communities are groups of vertices which probably share common properties or play similar roles within the graph. Figure 1 shows a schematic example of a graph with communities. The next question is: how do we formalize the mathematical definition of a community?

First try: find a set S such that the cut across the boundary of S and S̄, i.e. the number of crossing edges edges(S, S̄), is very small. This definition reflects property (2), but consider the extreme case where S contains only one vertex, say S = {v_1}: small sets usually have small cuts, yet we also need to capture large communities.

Second try: define Vol(S) = Σ_{v ∈ S} deg(v), the sum of the degrees of the vertices in S. If G is a d-regular graph, then Vol(S) = d|S|. The intuition is that Vol(S) is the total number of friends of the people in S, and we would like the following ratio to be as small as possible:

    edges(S, S̄) / Vol(S) = (#friends out of S) / (total #friends of S).

This definition makes more sense than the first one, but again consider the extreme case: what if S = V, i.e. all vertices are in S? Then the ratio is 0, so the whole graph is always the "best" community.
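These two failure modes are easy to see on a toy example. The sketch below (plain Python; the little friendship graph is made up for illustration, not from the notes) computes the cut and the cut/Vol ratio for a few candidate communities.

```python
# Two triangle communities joined by a bridge, plus a pendant "loner" vertex 6.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3), (0, 6)]
vertices = {v for e in edges for v in e}

def cut(S):
    """Number of edges crossing between S and its complement."""
    return sum((u in S) != (v in S) for u, v in edges)

def vol(S):
    """Vol(S): total degree of the vertices in S."""
    return sum((u in S) + (v in S) for u, v in edges)

# First try (minimize the cut alone): the singleton {6} ties with the real
# community {0, 1, 2, 6} -- both have cut 1, so tiny sets win trivially.
print(cut({6}), cut({0, 1, 2, 6}))            # -> 1 1

# Second try (minimize cut/Vol): better, but taking S = V drives the ratio
# to 0, so the "best" community would always be the whole graph.
print(cut({0, 1, 2, 6}) / vol({0, 1, 2, 6}))  # -> 0.111... (= 1/9)
print(cut(vertices) / vol(vertices))          # -> 0.0
```

Restricting attention to sets containing at most half the total volume, as the conductance definition below does, rules out the degenerate S = V case.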

Figure 1: A simple graph with three communities, enclosed by the dashed circles [1].

Definition 1. For ∅ ≠ S ⊊ V, the conductance of S is

    Φ(S) := edges(S, S̄) / min{Vol(S), Vol(S̄)}.

Third try: the definition of conductance enables us to find good balanced separations, that is, a set S with small Φ(S) while S itself is not too big; usually we require Vol(S) ≤ (1/2)Vol(V) = m, where m = |E|. If G is regular, this means |S| ≤ n/2 (n is the number of vertices).

Sparsest cut problem. The goal of the sparsest cut problem is to find a cut with minimum sparsity (the sparsest cut): given a graph G, find a set S ⊆ V with Vol(S) ≤ m that minimizes Φ(S). We can relate the sparsest cut in the graph to the conductance of a graph. Let

    Φ_G = min_{S : 0 < Vol(S) ≤ m} Φ(S)

be the conductance of the graph G. Intuitively, Φ_G is the smallest conductance among all sets with at most half of the total volume.

Remark 1. If G is a d-regular graph, a one-step random walk from u ends in S̄ with probability

    Pr_{v ∼ u}[v ∈ S̄] = edges(u, S̄) / d.

If we pick a uniformly random u ∈ S as the starting point, a one-step random walk ends in S̄ with probability

    Pr_{u ∈ S, v ∼ u}[v ∈ S̄] = Σ_{u ∈ S} edges(u, S̄) / Σ_{u ∈ S} deg(u) = edges(S, S̄) / (d|S|) = Φ(S)

(assuming Vol(S) ≤ m). In other words, if we pick a random vertex of S and take a one-step random walk, the probability of leaving S is exactly the conductance of S.

Now let us see an application of sparsest cut in computer vision.
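Remark 1 can be checked exactly on a small regular graph. The sketch below (Python; the 3-dimensional hypercube and the set S are illustrative choices, not from the notes) computes Φ(S) and the exact probability that a one-step walk from a uniformly random u ∈ S leaves S.

```python
# Verify Remark 1 on the 3-cube Q3 (3-regular, 8 vertices): the one-step
# escape probability from a uniformly random vertex of S equals Phi(S).
n, d = 8, 3
edges = [(u, u ^ (1 << b)) for u in range(n) for b in range(3)
         if u < u ^ (1 << b)]          # one edge per flipped coordinate bit

adj = {u: [] for u in range(n)}
for u, v in edges:
    adj[u].append(v)
    adj[v].append(u)

def conductance(S):
    """Phi(S) = edges(S, S-bar) / min{Vol(S), Vol(S-bar)}, regular case."""
    crossing = sum((u in S) != (v in S) for u, v in edges)
    vol = d * len(S)                   # Vol(S) = d|S| in a d-regular graph
    return crossing / min(vol, d * n - vol)

S = {0, 1, 2, 3}                       # one face of the cube
# Exact escape probability: average over u in S of edges(u, S-bar)/d.
escape = sum(sum(v not in S for v in adj[u]) for u in S) / (d * len(S))
assert abs(escape - conductance(S)) < 1e-12   # both equal 1/3 here
```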

1.2 Image segmentation

Image segmentation is the process of partitioning a digital image into multiple segments. The pixels within a region are similar with respect to some characteristic, such as color, intensity, or texture, while adjacent regions differ significantly with respect to the same characteristic [2]. The goal of segmentation is to simplify or change the representation of an image into regions that are more meaningful and easier to analyze, e.g. sky, people, animals, etc.

Figure 2: A simple sample of graph-based image segmentation [4].

We can treat an image as a pixel graph: every pixel with (red, green, blue) color channels is a vertex, every pair of neighboring pixels (u, v) is an edge, and the edge weight is defined as

    w(u, v) = exp(−((r − r′)^2 + (g − g′)^2 + (b − b′)^2) / θ^2),

where θ is a parameter: the weight is heavy when the colors are similar and light when they are not. See Figure 2. We can then use the idea of sparsest cut as a graph-based image segmentation algorithm.

[Shi-Malik '00] proposes a normalized cut approach. It consists of the following main steps [3]:

1. Given an image, set up a weighted graph G = (V, E) and set the weight W on the edge connecting two vertices to be a measure of the similarity between the two vertices.
2. Solve (D − W)x = λDx for the eigenvectors with the smallest eigenvalues.
3. Use the eigenvector with the second smallest eigenvalue to bipartition the graph.
4. Decide if the current partition should be subdivided, and recursively repartition the segmented parts if necessary.
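The spectral step above can be sketched in a few lines. In the following Python/numpy snippet (not from the notes: a 1-D six-pixel grayscale "image" stands in for RGB, and θ = 0.3 is an arbitrary illustrative choice), the generalized problem (D − W)x = λDx is solved via the substitution x = D^{-1/2}y, which turns it into an ordinary eigenproblem for the normalized Laplacian.

```python
import numpy as np

# A 1-D "image": three dark pixels followed by three bright pixels.
intensity = np.array([0.10, 0.12, 0.11, 0.90, 0.88, 0.92])
theta = 0.3
n = len(intensity)

# Pixel graph: neighboring pixels are joined; similar colors => heavy edge.
W = np.zeros((n, n))
for u in range(n - 1):
    w = np.exp(-(intensity[u] - intensity[u + 1]) ** 2 / theta ** 2)
    W[u, u + 1] = W[u + 1, u] = w

# Solve (D - W) x = lambda D x via x = D^{-1/2} y: eigenvectors of the
# normalized Laplacian I - D^{-1/2} W D^{-1/2}.
Dm = np.diag(1.0 / np.sqrt(W.sum(axis=1)))
vals, vecs = np.linalg.eigh(np.eye(n) - Dm @ W @ Dm)
fiedler = Dm @ vecs[:, 1]       # eigenvector for the 2nd smallest eigenvalue

# Bipartition by the sign of this vector: dark vs. bright pixels.
segment = fiedler > 0
```

Because the dark-bright boundary edge carries an exponentially small weight, the sign pattern of the second eigenvector cleanly separates the two pixel groups.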

For further reading, see [3].

2 Cheeger's Inequality

The sparsest cut problem is NP-hard and its approximability is quite open. We will explore a relation between a spectral property of a graph and its conductance; this gives a nice approximation to the conductance. The following theorem is called Cheeger's inequality.

Theorem 2 (Cheeger's Inequality). For a regular graph G, let λ_2 be the second smallest eigenvalue of the normalized Laplacian L_G. Then

    (1/2)λ_2 ≤ Φ_G ≤ √(2λ_2).

Cheeger's inequality is perhaps one of the most fundamental inequalities in discrete optimization and spectral graph theory [7]. It relates the eigenvalue λ_2 of the normalized Laplacian matrix L_G to the conductance Φ_G. As noted above, the second smallest eigenvalue of the Laplacian of a graph tells us how connected the graph is: for example, Φ_G is close to zero and G has a natural 2-clustering if and only if λ_2 is close to zero. Thus, it is important to find bounds for this value.

Before proving this theorem, we need some definitions and facts.

Definition 3. For a given symmetric matrix M ∈ R^{n×n} and a non-zero vector x ∈ R^n, the Rayleigh quotient R(M, x) is defined as

    R(M, x) := (x^T M x) / (x^T x).

Write M = Σ_{i=1}^n λ_i ψ_i ψ_i^T with λ_1 ≤ λ_2 ≤ · · · ≤ λ_n. Since any x ∈ R^n can be represented as x = Σ_{i=1}^n y_i ψ_i via an orthogonal transformation, the Rayleigh quotient can be written as

    R(M, x) = (Σ_i λ_i y_i^2) / (Σ_i y_i^2) ∈ [λ_1, λ_n],

with R(M, ψ_1) = λ_1 and R(M, ψ_n) = λ_n.

Theorem 4 (Courant-Fischer). Let M be an n × n symmetric matrix and let λ_1 ≤ λ_2 ≤ · · · ≤

λ_n be the eigenvalues of M, with ψ_1, ψ_2, ..., ψ_n the corresponding eigenvectors. Then

    λ_1 = min_{x ≠ 0} (x^T M x)/(x^T x),
    λ_n = max_{x ≠ 0} (x^T M x)/(x^T x),
    λ_2 = min_{x ≠ 0, x ⊥ ψ_1} (x^T M x)/(x^T x).

Observe that R(M, ψ_2) = λ_2, and also that any x ⊥ ψ_1 can be written as x = Σ_{i=2}^n y_i ψ_i, therefore

    R(M, x) = (Σ_{i=2}^n λ_i y_i^2) / (Σ_{i=2}^n y_i^2) ≥ λ_2.

The Courant-Fischer characterization of the eigenvalues of a symmetric matrix M in terms of the maximizers and minimizers of the Rayleigh quotient plays a fundamental role in spectral graph theory. For further reading, see [5] and [6].

The left side of Cheeger's inequality is usually referred to as the easy direction and the right side is known as the hard direction. Now let us prove the LHS of Cheeger's inequality ((1/2)λ_2 ≤ Φ_G).

Proof. Let S ⊆ V with |S| ≤ n/2 be a set such that

    Φ_G = Φ(S) = |δ(S)| / (d|S|),

where δ(S) denotes the set of edges crossing between S and S̄. Consider the vector x with

    x(u) = 1 − |S|/n for u ∈ S,    x(u) = −|S|/n for u ∉ S.

Then, since x(u) − x(v) = ±1 on crossing edges and 0 otherwise,

    x^T L_G x = (1/d) Σ_{(u,v) ∈ E} (x(u) − x(v))^2 = |δ(S)| / d,

and

    x^T x = |S|(1 − |S|/n)^2 + (n − |S|)(|S|/n)^2
          = (n − |S|)[(|S|/n)(1 − |S|/n) + (|S|/n)^2]
          = (n − |S|)|S|/n
          ≥ (1/2)|S|,

where the last inequality uses |S| ≤ n/2, so that n − |S| ≥ n/2. Recall the Courant-Fischer theorem (note that Σ_u x(u) = 0, so x ⊥ ψ_1, the all-ones eigenvector):

    λ_2 ≤ R(L_G, x) = (x^T L_G x) / (x^T x) ≤ (|δ(S)|/d) · (2/|S|) = 2 · |δ(S)|/(d|S|) = 2Φ_G,

and hence (1/2)λ_2 ≤ Φ_G.

References

[1] Fortunato, Santo. "Community detection in graphs." Physics Reports 486.3 (2010): 75-174.
[2] Shapiro, Linda, and George C. Stockman. Computer Vision. Prentice Hall, 2001.
[3] Shi, Jianbo, and Jitendra Malik. "Normalized cuts and image segmentation." IEEE Transactions on Pattern Analysis and Machine Intelligence 22.8 (2000): 888-905.
[4] http://cs.brown.edu/~pff/segment/
[5] Chung, Fan R. K. Spectral Graph Theory. Vol. 92. American Mathematical Soc., 1997.
[6] Spielman, Daniel. Spectral Graph Theory. Lecture Notes, Yale University (2009).
[7] Oveis Gharan, Shayan. Cheeger's Inequality and the Sparsest Cut Problem. Lecture Notes, University of Washington (2015).
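As a closing numerical sanity check (Python/numpy, not part of the notes), one can verify both sides of Theorem 2 on a small regular graph, here the 8-cycle, by brute-forcing Φ_G over all candidate sets.

```python
import numpy as np
from itertools import combinations

# Check lambda_2/2 <= Phi_G <= sqrt(2*lambda_2) on the 8-cycle (2-regular).
n, d = 8, 2
A = np.zeros((n, n))
for u in range(n):
    A[u, (u + 1) % n] = A[(u + 1) % n, u] = 1

# Normalized Laplacian of a d-regular graph: L_G = I - A/d.
lam2 = np.sort(np.linalg.eigvalsh(np.eye(n) - A / d))[1]

def phi(S):
    """Phi(S) = |edges crossing S| / (d|S|) for a regular graph."""
    S = set(S)
    crossing = sum(1 for u in S for v in range(n) if A[u, v] and v not in S)
    return crossing / (d * len(S))

# Brute force over all S with Vol(S) at most half the total volume.
phi_G = min(phi(S) for k in range(1, n // 2 + 1)
                   for S in combinations(range(n), k))

assert lam2 / 2 <= phi_G + 1e-12          # easy direction
assert phi_G <= (2 * lam2) ** 0.5 + 1e-12 # hard direction
```

On the 8-cycle the best cut is an arc of 4 consecutive vertices, giving Φ_G = 2/8 = 1/4, safely between λ_2/2 ≈ 0.146 and √(2λ_2) ≈ 0.765.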