Exact Recovery in Stochastic Block Models


Ben Eysenbach
May 11,

1 Introduction

Graphs are a natural way of capturing interactions between objects, such as phone calls between friends, binding between proteins, and co-authoring between researchers. While the space of graphs is exponentially large in the number of vertices, most graphs of interest are sparse and exhibit some underlying structure. A number of algorithms attempt to recover this underlying structure. In this survey, we study when recovery is possible and algorithms for recovery. We begin by describing a generative model for unstructured, random graphs in Section 2. We then extend this vanilla model to structured graphs in Section 3. Section 4 discusses why recovering the structure is difficult. We discuss two algorithms for recovery in Sections 5 and 6. We prove that recovery is impossible in some cases in Section 7.

Symbol        Meaning
$A_u$         $u$-th column of matrix $A$
$B$           ($k \times k$) symmetric matrix storing the probabilities that partitions interact
$G$           Matrix of probabilities that each pair of vertices has an edge
$\hat{G}$     Adjacency matrix
$G(p, n)$     Erdős-Rényi graph on $n$ vertices, including each edge with probability $p$
$H$           Estimate of $G$ computed from $\hat{G}$
$i, j$        Particular partitions
$k$           Number of partitions
$n$           Number of vertices
$p$           Probability of including an edge
$s_m$         Minimum number of vertices in any partition
$T_u$         Preliminary cluster of vertices, used in the spectral algorithm
$\tau$        Threshold parameter for clustering in the spectral algorithm
$u, v$        Particular vertices

Figure 1: Variable conventions

2 Erdős-Rényi Models

The Erdős-Rényi model [3] is a generative model for graphs. Specifically, it defines a distribution over all graphs. The generated graphs do not (in general) exhibit any special structure, but studying Erdős-Rényi graphs introduces a language that extends to structured graph models.
Formally, we define an Erdős-Rényi generative model $G(p, n)$ as a distribution over undirected graphs with $n$ vertices, where each edge $(u, v)$ is included independently with probability $p$. By linearity of expectation, the expected number of edges is

$$\mathbb{E}_{G(p,n)}[\text{num. edges}] = (\text{prob. pair of vertices has an edge}) \cdot (\text{num. pairs of vertices}) = p \cdot \frac{n(n-1)}{2} = O(pn^2)$$

Most real-world graphs are sparse, meaning the number of edges scales linearly with the number of vertices. For example, the graph of LinkedIn connections has 7 million vertices but only 30 million edges [7]. This number of edges is several orders of magnitude smaller than the tens of trillions of edges we would expect if the number of edges grew as $O(n^2)$. Thus, we are particularly interested in graphs where $p = O(1/n)$.

3 Stochastic Block Model

The stochastic block model, sometimes referred to as the planted partition model, is a cousin of the Erdős-Rényi model which allows the (undirected) graph to exhibit some structure. In the stochastic block model, each vertex is assigned a partition, and the probability of including an edge between vertices $u$ and $v$ depends only on the partitions of $u$ and $v$. We will use the $k \times k$ symmetric matrix $B$ to store the probability of edges between two partitions.

The stochastic block model can be interpreted as the concatenation of many Erdős-Rényi graphs. The edges between two partitions exactly follow the Erdős-Rényi model, as do the edges within a single partition.

Consider the amount of information required to represent the generative model. While the Erdős-Rényi model only stores a single value $p$, the stochastic block model stores $\frac{k(k+1)}{2} = O(k^2)$ values in the symmetric matrix $B$ and $n$ values to store the partitions. We usually consider models where $k \ll n$, so the stochastic block model requires much less information than a completely general model, which would use $O(n^2)$ space to store an interaction probability for each individual pair of vertices.

Two interesting graph structures are special cases of the stochastic block model. In the planted clique model, a graph is first generated from an Erdős-Rényi model. Then $k$ vertices are chosen randomly, and all edges between pairs of these $k$ vertices are added. This imposes a $k$-clique on the graph. In the second model, the graph coloring model, a graph is also initially generated from an Erdős-Rényi model. Each node is assigned a color, and all edges between nodes of the same color are deleted.
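The generative process described in this section can be sketched in a few lines of Python. This is a minimal illustration rather than code from the survey: the function names `sample_sbm` and `sample_erdos_renyi`, the edge-set representation, and the example values of `B` are all assumptions made for the sketch.

```python
import random


def sample_sbm(labels, B, seed=None):
    """Sample an undirected graph from a stochastic block model.

    labels[u] is the partition (0..k-1) of vertex u, and B[i][j] is the
    probability of an edge between partitions i and j (B is symmetric).
    Returns the edges as a set of pairs (u, v) with u < v.
    """
    rng = random.Random(seed)
    n = len(labels)
    edges = set()
    for u in range(n):
        for v in range(u + 1, n):
            # The edge probability depends only on the two partitions.
            if rng.random() < B[labels[u]][labels[v]]:
                edges.add((u, v))
    return edges


def sample_erdos_renyi(n, p, seed=None):
    """G(p, n) is the special case with a single partition."""
    return sample_sbm([0] * n, [[p]], seed)


# Hypothetical example: two partitions of 50 vertices each, with dense
# within-partition edges and sparse between-partition edges.
labels = [0] * 50 + [1] * 50
B = [[0.6, 0.05], [0.05, 0.6]]
edges = sample_sbm(labels, B, seed=0)
```

The double loop visits all $n(n-1)/2$ vertex pairs, which is fine for small examples but would need a sparser sampling scheme for graphs with $p = O(1/n)$ and large $n$.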
Given a sample from a stochastic block model, the goal is to recover the assignment of nodes to partitions and the partition interaction matrix $B$.

4 Why are these problems hard?

Figure 2: Graph and corresponding adjacency matrix

Recovering the underlying partitions and interaction matrix is deceptively hard. Graphs drawn from stochastic block models are often depicted as in Fig. 2. The matrix on the right is the adjacency matrix for the graph on the left. Seeing these representations, it is easy to directly read off the partitions. You can empirically estimate an entry $(i, j)$ of the partition interaction matrix $B$ by counting the number of edges between nodes in partitions $i$ and $j$ and dividing by the number of such pairs of nodes.
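The counting estimate of $B$ can be written out as follows, assuming the partitions are already known. This is a sketch only: the function name `estimate_B` and the small hand-built example graph are hypothetical and not part of the survey.

```python
def estimate_B(edges, labels, k):
    """Estimate B[i][j] as (observed edges) / (available vertex pairs)."""
    n = len(labels)
    edge_set = {(min(u, v), max(u, v)) for u, v in edges}
    edge_count = [[0] * k for _ in range(k)]
    pair_count = [[0] * k for _ in range(k)]
    for u in range(n):
        for v in range(u + 1, n):
            # Store counts in the upper triangle (i <= j) only.
            i, j = sorted((labels[u], labels[v]))
            pair_count[i][j] += 1
            if (u, v) in edge_set:
                edge_count[i][j] += 1
    B_hat = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i, k):
            if pair_count[i][j] > 0:
                B_hat[i][j] = B_hat[j][i] = edge_count[i][j] / pair_count[i][j]
    return B_hat


# Hypothetical example: vertices 0-2 in partition 0, vertices 3-4 in partition 1.
example_labels = [0, 0, 0, 1, 1]
example_edges = {(0, 1), (1, 2), (0, 3)}
B_hat = estimate_B(example_edges, example_labels, k=2)
```

In this example, two of the three pairs inside partition 0 are edges, one of the six cross-partition pairs is an edge, and the single pair inside partition 1 is not, so the estimate is 2/3, 1/6, and 0 for the corresponding entries.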

Figure 3: Scrambled graph and permuted adjacency matrix

In Fig. 3, we have taken Fig. 2, rearranged the vertices, and permuted the rows/cols of the adjacency matrix. This does not change the graph, but the structure of the graph is no longer apparent. There are exponentially many ways of permuting the rows/cols of the adjacency matrix, and only a vanishingly small fraction will look like the one shown in Fig. 2.

The two special graph models described in Section 3, the planted clique model and the graph coloring model, correspond to famous hard problems. Finding the largest clique in a graph and computing its chromatic number are both NP-hard problems (both appeared on Richard Karp's 1972 list of 21 NP-complete problems).

Another challenge with recovering the underlying structure is that the Erdős-Rényi model places non-zero probability on all graphs. If the Erdős-Rényi model generated a complete graph, we would be unable to determine which $k$ vertices the planted clique model chose as part of the clique. Similarly, if the Erdős-Rényi model generated an empty graph, we would be unable to determine how the graph coloring model colored the vertices.

These two challenges indicate that we cannot hope to accurately recover the underlying graph structure in every case. However, many hard problems, including finding the largest clique and graph coloring, are much easier in the average case. In the next two sections, we introduce two algorithms for recovering the underlying structure.

5 Spectral Algorithm

5.1 Intuition

The key insight into how spectral algorithms work is that the adjacency matrix, $\hat{G}$, is a noisy measurement of another matrix $G$. Each entry in $G$ contains the probability that two nodes interact. We can construct this matrix for the stochastic block model by randomly assigning vertices to partitions, and then filling the entries of $G$ by looking up the corresponding entries in $B$. The stochastic block model samples a graph by including each edge $(u, v)$ with probability given by $G(u, v)$. Matrix $G$ is low rank; it contains at most $k$ distinct rows/cols.

If we had access to $G$, we could identify the $k$ distinct types of rows/cols. The type of row/col $u$ would indicate to which partition vertex $u$ belongs. Deleting all but one row/col of each type will leave a $k \times k$ matrix, which equals the partition interaction matrix $B$. Unfortunately, we only have access to the adjacency matrix $\hat{G}$, not $G$.

5.2 Algorithm

We now present a spectral algorithm for recovering the stochastic block model based on McSherry [6]. We will build up to the final algorithm over the course of two failed (but conceptually useful) approaches.

5.2.1 Approach 1

In the edge probability matrix $G$, vertices $u$ and $v$ belonging to the same partition will have identical columns $G_u$ and $G_v$. We expect that the corresponding columns of the adjacency matrix, $\hat{G}_u$ and $\hat{G}_v$, are also close. Our first approach greedily clusters vertices based on these columns, where $\tau$ is a threshold parameter:

1. While some vertex $u$ has not been assigned to a partition:
   (a) Create a new partition and assign to it $u$ and every close vertex: $\{v \in V : \|\hat{G}_u - \hat{G}_v\|_2 < \tau\}$

Unfortunately, the distance between $\hat{G}_u$ and $\hat{G}_v$ may be large, even if $u$ and $v$ belong to the same cluster (so $G_u = G_v$).

5.2.2 Approach 2

Our next approach attempts to smooth $\hat{G}$ to form another matrix $H$, and then applies Approach 1 to $H$. We can do this smoothing by first partially clustering the vertices. The clustering step uses the variable $s_m$, the number of vertices in the smallest partition. Then we can create the matrix $H$ by representing each vertex as a combination of these initial clusters:

1. Initialize each vertex as not assigned.
2. While at least $\frac{1}{2} s_m$ vertices have not been assigned to a cluster:
   (a) Choose a random unassigned vertex $u$ and initialize a new cluster $T_u = \emptyset$.
   (b) For each vertex $v \in V$:
       i. Compute $\hat{G}_u - \hat{G}_v$, and project the difference onto $\hat{G}$.
       ii. If the projection has length less than $\tau$, add $v$ to $T_u$ and mark $v$ as assigned.
3. For each unassigned vertex $u$:
   (a) Assign $u$ to the cluster $T_v$ for which $\|u - v\|_2$, the distance between the corresponding columns, is minimized.
4. Each cluster $T_u$ can be represented as a length-$|V|$ binary vector indicating cluster membership. Stack these indicator vectors into a matrix $C$.
5. Let $H$ be the projection of $\hat{G}$ onto $C$.
6. Apply Approach 1 to $H$.

This approach almost works. The one caveat is that we use $\hat{G}$ twice: once to compute the indicator matrix $C$, and again when we project onto $C$. Using $\hat{G}$ multiple times makes the analysis tricky.

5.2.3 Approach 3

The solution given by McSherry [6] splits $\hat{G}$ into two matrices of size (roughly) $n \times \frac{1}{2}n$. The proof of correctness requires that we split $\hat{G}$. An alternate proof technique might not require splitting $\hat{G}$.

5.3 Analysis

McSherry [6] shows that Approach 3 recovers the stochastic block model if the partitions are distinct enough.
Formally, we require that for any pair of vertices $u$ and $v$ belonging to different partitions, the $L_2$ distance between $G_u$ and $G_v$ is large. When this requirement holds, Approach 3 succeeds with probability $1 - \delta$, which can be boosted by repetition. At a high level, the proof requires three steps. First, McSherry [6] shows that $G$ and $\hat{G}$ are not too far apart. Next, he proves that our smoothed version of $\hat{G}$, $H$, is close to $G$. Finally, he shows that the algorithm succeeds when $H$ is close to $G$.

The proof sketch above only shows that the algorithm recovers the partitions, not the partition interaction matrix $B$. $B$ can be empirically estimated after the fact using a simple average. However, it is impossible to recover $B$ exactly. This negative result can be shown using information theory. Each entry in $B$ is a number of arbitrary precision, potentially requiring infinitely many bits to express exactly. The graph we are given as input stores only a finite number of bits, one for each pair of vertices. By the pigeonhole principle, there will be different matrices $B$ between which our algorithm cannot distinguish.

Subsequent work by Vu [9] showed how the smoothing step in Approach 2 can be replaced by a simple SVD.
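As a concrete reference point for this section, the greedy thresholding of Approach 1 can be sketched as below. This is an illustrative sketch, not McSherry's algorithm: the function name `greedy_cluster` is hypothetical, and restricting each new cluster to still-unassigned vertices is an assumption the text leaves open. The example runs on the noise-free matrix $G$, where columns within a partition are identical, so any small threshold recovers the partitions; on a sampled adjacency matrix $\hat{G}$ the columns are noisy, which is the failure mode the text describes.

```python
import math


def greedy_cluster(columns, tau):
    """Greedily group vertices whose columns are within distance tau.

    columns[u] is the column of vertex u, given as a list of numbers.
    Returns a list `assignment` where assignment[u] is a cluster index.
    Assumption: a new cluster only absorbs vertices not yet assigned.
    """
    n = len(columns)
    assignment = [None] * n
    next_cluster = 0
    for u in range(n):
        if assignment[u] is not None:
            continue
        assignment[u] = next_cluster
        for v in range(n):
            close = math.dist(columns[u], columns[v]) < tau
            if assignment[v] is None and close:
                assignment[v] = next_cluster
        next_cluster += 1
    return assignment


# Hypothetical example: build the noise-free matrix G from labels and B.
labels = [0, 1, 0, 1, 1, 0]
B = [[0.8, 0.1], [0.1, 0.7]]
G = [[B[labels[u]][labels[v]] for v in range(6)] for u in range(6)]
# G is symmetric, so its rows double as its columns.
assignment = greedy_cluster(G, tau=0.5)
```

Here columns from different partitions are about 1.6 apart in $L_2$ distance while columns from the same partition coincide, so the threshold 0.5 separates them cleanly.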

6 Semidefinite Programming Algorithm

We now consider an alternate approach based on semidefinite programming by Abbe, Bandeira, and Hall [1]. This approach applies only to stochastic block models with two partitions of equal size, where the edge probabilities within and between the two partitions are $p$ and $q$ respectively. We assume $p > q$.

6.1 Defining the Semidefinite Program

The general approach is to maximize the number of edges within each partition while minimizing the number of edges between partitions:

$$\max_{\text{partition}} \; (\text{num. edges within partitions}) - (\text{num. edges between partitions})$$

We first define this objective algebraically. Define $x_u$ as a $\pm 1$ variable indicating to which partition vertex $u$ belongs. We can write our objective as

$$\max_x \sum_{(u,v) \in E} x_u x_v - \sum_{(u,v) \notin E} x_u x_v$$

We can write this in matrix form as:

$$\max_x \; x^T D x$$

where $D$ is similar to an adjacency matrix for the graph:

$$D[u, v] = \begin{cases} 1, & \text{if } (u, v) \in E \\ -1, & \text{if } (u, v) \notin E \end{cases}$$

Note that the number of within-partition pairs and the number of between-partition pairs are fixed at $2\left(\frac{1}{2}\cdot\frac{n}{2}\left(\frac{n}{2} - 1\right)\right) = \frac{1}{2} n \left(\frac{n}{2} - 1\right)$ and $\left(\frac{n}{2}\right)^2 = \frac{1}{4} n^2$ respectively. The new objective has the same optimal solution as the old objective:

$$\begin{aligned}
\max_x \; x^T D x &= \max_x \sum_{u,v} D[u, v] \, x_u x_v \\
&= \max_x \sum_{(u,v) \in E} x_u x_v - \sum_{(u,v) \notin E} x_u x_v \\
&= \max \; (\text{num. edges within partitions}) - (\text{num. edges between partitions}) \\
&\qquad - (\text{num. non-edges within partitions}) + (\text{num. non-edges between partitions}) \\
&= \max \; (\text{num. edges within partitions}) - (\text{num. edges between partitions}) \\
&\qquad - \big((\text{num. pairs within partitions}) - (\text{num. edges within partitions})\big) \\
&\qquad + \big((\text{num. pairs between partitions}) - (\text{num. edges between partitions})\big) \\
&= \max \; 2(\text{num. edges within partitions}) - 2(\text{num. edges between partitions}) - \tfrac{1}{2} n \left(\tfrac{n}{2} - 1\right) + \tfrac{1}{4} n^2 \\
&= \max \; (\text{num. edges within partitions}) - (\text{num. edges between partitions})
\end{aligned}$$

The last step holds because adding a constant and scaling by a positive factor do not change the maximizer.

The $\pm 1$ constraints on $x$ make this optimization problem hard. We instead formulate it as an SDP without integer constraints. A challenge is converting the objective $x^T D x$ into a Frobenius product of two matrices. Recalling that the trace is invariant under cyclic permutations, we have

$$\mathrm{tr}(x^T D x) = \mathrm{tr}(D (x x^T)) = \langle D, x x^T \rangle_F$$

The second equality comes from the fact that the trace of a product of two matrices equals their Frobenius product if at least one of the matrices is symmetric. Our SDP will solve for a matrix $X$ that stands in for $x x^T$:

$$\max_X \; \langle D, X \rangle_F \quad \text{s.t.} \quad X_{ii} = 1, \quad X \succeq 0$$

If $p$ is sufficiently larger than $q$, then with high probability our relaxed SDP above will have a rank-1 solution $g g^T$ with $g \in \{\pm 1\}^n$. The coordinates of $g$ will indicate to which partition each vertex belongs.

6.2 Analysis

We want to show that $g g^T$ will be the unique solution to the SDP. In this subsection, following the notation of [1], $B$ denotes the $\pm 1$ matrix called $D$ in Section 6.1 (not the partition interaction matrix), and $D^+$, $D^-$ denote the degree matrices defined below. To show that $g g^T$ is an optimal solution, Abbe et al. [1] show that the diagonal of $B g g^T$ equals $Y$, some feasible solution to the dual of the SDP:

$$\min_Y \; \mathrm{tr}(Y) \quad \text{s.t.} \quad Y \succeq B, \quad Y \text{ diagonal}$$

Next, define two diagonal matrices storing the within-partition and between-partition degrees of each vertex:

$D^+_{uu}$ = num. edges from $u$ to another vertex in the same partition
$D^-_{uu}$ = num. edges from $u$ to another vertex in a different partition

We can compute $Y$ directly by rewriting $B g g^T$ in terms of $D^+$ and $D^-$, where all counts are over pairs that include $u$:

$$\begin{aligned}
(B g g^T)_{uu} &= (\text{num. edges within partitions}) + (\text{num. non-edges between partitions}) && (1) \\
&\qquad - (\text{num. edges between partitions}) - (\text{num. non-edges within partitions}) && (2) \\
&= D^+_{uu} + \left(\tfrac{n}{2} - D^-_{uu}\right) - \left(\tfrac{n}{2} - 1 - D^+_{uu}\right) - D^-_{uu} && (3) \\
&= 2(D^+_{uu} - D^-_{uu}) + 1 && (4)
\end{aligned}$$

Setting $Y = 2(D^+ - D^-) + I_n$ gives a feasible solution to the dual program. To show that $g g^T$ is the unique solution, Abbe et al. [1] show that the second-smallest eigenvalue of $Y - B$ is not too small.

7 Lower Bounds for Recovery

We now study when these two algorithms will succeed with high probability and compare those bounds to information-theoretic lower bounds on recovery. First, it is convenient to define the normalized parameters $a = pn / \log n$ and $b = qn / \log n$, following [1]. The SDP algorithm succeeds in the regime where $(a - b)^2 > 8(a + b) + \frac{8}{3}(a - b)$. This is substantially better than the spectral algorithm presented in Section 5, which succeeds only when $(a - b)^2 > 64(a + b)$.

It is important to note that all algorithms fail when $a$ and $b$ are too close. Specifically, no algorithm can recover the partitions when $(a - b)^2 < 4(a + b) - 4$ and $a + b \geq 2$ [1]. Note that neither of the algorithms presented achieves this lower bound. This lower bound can be shown via an information-theoretic argument.
Consider the problem of distinguishing the above stochastic block graph from an Erdős-Rényi graph with edge probability $\frac{1}{2}(p + q)$. If $p - q$ is small, the graph will not contain enough bits of information to distinguish these two models.

8 Conclusion

In this survey, we introduced two random graph models, Erdős-Rényi models and stochastic block models. We discussed why uncovering the structure of graphs generated by the second model is difficult. We then presented two algorithms for recovering this structure and sketched correctness proofs. We finished by showing that recovery is not always possible and by discussing lower bounds.

A number of open problems remain in this area. First, how many real-world graphs have distinct enough partitions to be recovered using the presented algorithms? How many fall into the range between the lower bound of Abbe et al. [1] and the thresholds required by the presented algorithms? Second, we can extend the stochastic block model to allow vertices to belong to multiple partitions, creating a mixed membership stochastic block model [2]. When is exact recovery possible in this setting, and what are the best algorithms for recovery? Third, what if we only want to partially recover the underlying partitions? Is there a unified approach to analyzing both the exact recovery and partial recovery settings?

References

[1] Emmanuel Abbe, Afonso S. Bandeira, and Georgina Hall. Exact recovery in the stochastic block model. arXiv preprint.
[2] Edo M. Airoldi, David M. Blei, Stephen E. Fienberg, and Eric P. Xing. Mixed membership stochastic blockmodels. In Advances in Neural Information Processing Systems, pages 33-40.
[3] Paul Erdős and Alfréd Rényi. On random graphs. Publicationes Mathematicae Debrecen, 6.
[4] Anna Goldenberg, Alice X. Zheng, Stephen E. Fienberg, and Edoardo M. Airoldi. A survey of statistical network models. Foundations and Trends in Machine Learning, 2(2).
[5] Bruce Hajek, Yihong Wu, and Jiaming Xu. Computational lower bounds for community detection on random graphs. arXiv preprint.
[6] Frank McSherry. Spectral partitioning of random graphs. In Proceedings of the 42nd IEEE Symposium on Foundations of Computer Science. IEEE.
[7] Elchanan Mossel, Joe Neeman, and Allan Sly. Stochastic block models and reconstruction. arXiv preprint.
[8] Tiago P. Peixoto. Hierarchical block structures and high-resolution model selection in large networks. Physical Review X, 4(1):011047.
[9] Van Vu. A simple SVD algorithm for finding hidden partitions. arXiv preprint.

Spectral Clustering and Community Detection in Labeled Graphs

Spectral Clustering and Community Detection in Labeled Graphs Spectral Clustering and Community Detection in Labeled Graphs Brandon Fain, Stavros Sintos, Nisarg Raval Machine Learning (CompSci 571D / STA 561D) December 7, 2015 {btfain, nisarg, ssintos} at cs.duke.edu

More information

Introduction to Graph Theory

Introduction to Graph Theory Introduction to Graph Theory Tandy Warnow January 20, 2017 Graphs Tandy Warnow Graphs A graph G = (V, E) is an object that contains a vertex set V and an edge set E. We also write V (G) to denote the vertex

More information

The Surprising Power of Belief Propagation

The Surprising Power of Belief Propagation The Surprising Power of Belief Propagation Elchanan Mossel June 12, 2015 Why do you want to know about BP It s a popular algorithm. We will talk abut its analysis. Many open problems. Connections to: Random

More information

Mathematical and Algorithmic Foundations Linear Programming and Matchings

Mathematical and Algorithmic Foundations Linear Programming and Matchings Adavnced Algorithms Lectures Mathematical and Algorithmic Foundations Linear Programming and Matchings Paul G. Spirakis Department of Computer Science University of Patras and Liverpool Paul G. Spirakis

More information

Semidefinite Programs for Exact Recovery of a Hidden Community (and Many Communities)

Semidefinite Programs for Exact Recovery of a Hidden Community (and Many Communities) Semidefinite Programs for Exact Recovery of a Hidden Community (and Many Communities) Bruce Hajek 1 Yihong Wu 1 Jiaming Xu 2 1 University of Illinois at Urbana-Champaign 2 Simons Insitute, UC Berkeley

More information

Phase Transitions in Semidefinite Relaxations

Phase Transitions in Semidefinite Relaxations Phase Transitions in Semidefinite Relaxations Andrea Montanari [with Adel Javanmard, Federico Ricci-Tersenghi, Subhabrata Sen] Stanford University April 5, 2016 Andrea Montanari (Stanford) SDP Phase Transitions

More information

Notes for Lecture 24

Notes for Lecture 24 U.C. Berkeley CS170: Intro to CS Theory Handout N24 Professor Luca Trevisan December 4, 2001 Notes for Lecture 24 1 Some NP-complete Numerical Problems 1.1 Subset Sum The Subset Sum problem is defined

More information

Part II. Graph Theory. Year

Part II. Graph Theory. Year Part II Year 2017 2016 2015 2014 2013 2012 2011 2010 2009 2008 2007 2006 2005 2017 53 Paper 3, Section II 15H Define the Ramsey numbers R(s, t) for integers s, t 2. Show that R(s, t) exists for all s,

More information

6 Randomized rounding of semidefinite programs

6 Randomized rounding of semidefinite programs 6 Randomized rounding of semidefinite programs We now turn to a new tool which gives substantially improved performance guarantees for some problems We now show how nonlinear programming relaxations can

More information

4 Linear Programming (LP) E. Amaldi -- Foundations of Operations Research -- Politecnico di Milano 1

4 Linear Programming (LP) E. Amaldi -- Foundations of Operations Research -- Politecnico di Milano 1 4 Linear Programming (LP) E. Amaldi -- Foundations of Operations Research -- Politecnico di Milano 1 Definition: A Linear Programming (LP) problem is an optimization problem: where min f () s.t. X n the

More information

Lecture 11: Clustering and the Spectral Partitioning Algorithm A note on randomized algorithm, Unbiased estimates

Lecture 11: Clustering and the Spectral Partitioning Algorithm A note on randomized algorithm, Unbiased estimates CSE 51: Design and Analysis of Algorithms I Spring 016 Lecture 11: Clustering and the Spectral Partitioning Algorithm Lecturer: Shayan Oveis Gharan May nd Scribe: Yueqi Sheng Disclaimer: These notes have

More information

We show that the composite function h, h(x) = g(f(x)) is a reduction h: A m C.

We show that the composite function h, h(x) = g(f(x)) is a reduction h: A m C. 219 Lemma J For all languages A, B, C the following hold i. A m A, (reflexive) ii. if A m B and B m C, then A m C, (transitive) iii. if A m B and B is Turing-recognizable, then so is A, and iv. if A m

More information

Coloring 3-Colorable Graphs

Coloring 3-Colorable Graphs Coloring -Colorable Graphs Charles Jin April, 015 1 Introduction Graph coloring in general is an etremely easy-to-understand yet powerful tool. It has wide-ranging applications from register allocation

More information

CME307/MS&E311 Optimization Theory Summary

CME307/MS&E311 Optimization Theory Summary CME307/MS&E311 Optimization Theory Summary Yinyu Ye Department of Management Science and Engineering Stanford University Stanford, CA 94305, U.S.A. http://www.stanford.edu/~yyye http://www.stanford.edu/class/msande311/

More information

Hierarchical Clustering: Objectives & Algorithms. École normale supérieure & CNRS

Hierarchical Clustering: Objectives & Algorithms. École normale supérieure & CNRS Hierarchical Clustering: Objectives & Algorithms Vincent Cohen-Addad Paris Sorbonne & CNRS Frederik Mallmann-Trenn MIT Varun Kanade University of Oxford Claire Mathieu École normale supérieure & CNRS Clustering

More information

CPSC 536N: Randomized Algorithms Term 2. Lecture 10

CPSC 536N: Randomized Algorithms Term 2. Lecture 10 CPSC 536N: Randomized Algorithms 011-1 Term Prof. Nick Harvey Lecture 10 University of British Columbia In the first lecture we discussed the Max Cut problem, which is NP-complete, and we presented a very

More information

On the Approximability of Modularity Clustering

On the Approximability of Modularity Clustering On the Approximability of Modularity Clustering Newman s Community Finding Approach for Social Nets Bhaskar DasGupta Department of Computer Science University of Illinois at Chicago Chicago, IL 60607,

More information

Cumulative Review Problems Packet # 1

Cumulative Review Problems Packet # 1 April 15, 009 Cumulative Review Problems Packet #1 page 1 Cumulative Review Problems Packet # 1 This set of review problems will help you prepare for the cumulative test on Friday, April 17. The test will

More information

Zhibin Huang 07. Juni Zufällige Graphen

Zhibin Huang 07. Juni Zufällige Graphen Zhibin Huang 07. Juni 2010 Seite 2 Contents The Basic Method The Probabilistic Method The Ramsey Number R( k, l) Linearity of Expectation Basics Splitting Graphs The Probabilistic Lens: High Girth and

More information

Math 443/543 Graph Theory Notes 10: Small world phenomenon and decentralized search

Math 443/543 Graph Theory Notes 10: Small world phenomenon and decentralized search Math 443/543 Graph Theory Notes 0: Small world phenomenon and decentralized search David Glickenstein November 0, 008 Small world phenomenon The small world phenomenon is the principle that all people

More information

Outline. Advanced Digital Image Processing and Others. Importance of Segmentation (Cont.) Importance of Segmentation

Outline. Advanced Digital Image Processing and Others. Importance of Segmentation (Cont.) Importance of Segmentation Advanced Digital Image Processing and Others Xiaojun Qi -- REU Site Program in CVIP (7 Summer) Outline Segmentation Strategies and Data Structures Algorithms Overview K-Means Algorithm Hidden Markov Model

More information

Clustering. SC4/SM4 Data Mining and Machine Learning, Hilary Term 2017 Dino Sejdinovic

Clustering. SC4/SM4 Data Mining and Machine Learning, Hilary Term 2017 Dino Sejdinovic Clustering SC4/SM4 Data Mining and Machine Learning, Hilary Term 2017 Dino Sejdinovic Clustering is one of the fundamental and ubiquitous tasks in exploratory data analysis a first intuition about the

More information

1 Counting triangles and cliques

1 Counting triangles and cliques ITCSC-INC Winter School 2015 26 January 2014 notes by Andrej Bogdanov Today we will talk about randomness and some of the surprising roles it plays in the theory of computing and in coding theory. Let

More information

Subspace Clustering with Global Dimension Minimization And Application to Motion Segmentation

Subspace Clustering with Global Dimension Minimization And Application to Motion Segmentation Subspace Clustering with Global Dimension Minimization And Application to Motion Segmentation Bryan Poling University of Minnesota Joint work with Gilad Lerman University of Minnesota The Problem of Subspace

More information

arxiv: v1 [stat.ml] 2 Nov 2010

arxiv: v1 [stat.ml] 2 Nov 2010 Community Detection in Networks: The Leader-Follower Algorithm arxiv:111.774v1 [stat.ml] 2 Nov 21 Devavrat Shah and Tauhid Zaman* devavrat@mit.edu, zlisto@mit.edu November 4, 21 Abstract Traditional spectral

More information

CS 664 Slides #11 Image Segmentation. Prof. Dan Huttenlocher Fall 2003

CS 664 Slides #11 Image Segmentation. Prof. Dan Huttenlocher Fall 2003 CS 664 Slides #11 Image Segmentation Prof. Dan Huttenlocher Fall 2003 Image Segmentation Find regions of image that are coherent Dual of edge detection Regions vs. boundaries Related to clustering problems

More information

Epigraph proximal algorithms for general convex programming

Epigraph proximal algorithms for general convex programming Epigraph proimal algorithms for general conve programming Matt Wytock, Po-Wei Wang and J. Zico Kolter Machine Learning Department Carnegie Mellon University mwytock@cs.cmu.edu Abstract This work aims at

More information

2 The Fractional Chromatic Gap

2 The Fractional Chromatic Gap C 1 11 2 The Fractional Chromatic Gap As previously noted, for any finite graph. This result follows from the strong duality of linear programs. Since there is no such duality result for infinite linear

More information

CS224W: Analysis of Networks Jure Leskovec, Stanford University

CS224W: Analysis of Networks Jure Leskovec, Stanford University CS224W: Analysis of Networks Jure Leskovec, Stanford University http://cs224w.stanford.edu 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, http://cs224w.stanford.edu 2 Observations Models

More information

A GRAPH FROM THE VIEWPOINT OF ALGEBRAIC TOPOLOGY

A GRAPH FROM THE VIEWPOINT OF ALGEBRAIC TOPOLOGY A GRAPH FROM THE VIEWPOINT OF ALGEBRAIC TOPOLOGY KARL L. STRATOS Abstract. The conventional method of describing a graph as a pair (V, E), where V and E repectively denote the sets of vertices and edges,

More information

CMSC Honors Discrete Mathematics

CMSC Honors Discrete Mathematics CMSC 27130 Honors Discrete Mathematics Lectures by Alexander Razborov Notes by Justin Lubin The University of Chicago, Autumn 2017 1 Contents I Number Theory 4 1 The Euclidean Algorithm 4 2 Mathematical

More information

1.1 What is Microeconomics?

1.1 What is Microeconomics? 1.1 What is Microeconomics? Economics is the study of allocating limited resources to satisfy unlimited wants. Such a tension implies tradeoffs among competing goals. The analysis can be carried out at

More information

1 Introduction. Colouring of generalized signed planar graphs arxiv: v1 [math.co] 21 Nov Ligang Jin Tsai-Lien Wong Xuding Zhu

1 Introduction. Colouring of generalized signed planar graphs arxiv: v1 [math.co] 21 Nov Ligang Jin Tsai-Lien Wong Xuding Zhu Colouring of generalized signed planar graphs arxiv:1811.08584v1 [math.co] 21 Nov 2018 Ligang Jin Tsai-Lien Wong Xuding Zhu November 22, 2018 Abstract Assume G is a graph. We view G as a symmetric digraph,

More information

Shannon capacity and related problems in Information Theory and Ramsey Theory

Shannon capacity and related problems in Information Theory and Ramsey Theory Shannon capacity and related problems in Information Theory and Ramsey Theory Eyal Lubetzky Based on Joint work with Noga Alon and Uri Stav May 2007 1 Outline of talk Shannon Capacity of of a graph: graph:

More information

Lecture 2 September 3

Lecture 2 September 3 EE 381V: Large Scale Optimization Fall 2012 Lecture 2 September 3 Lecturer: Caramanis & Sanghavi Scribe: Hongbo Si, Qiaoyang Ye 2.1 Overview of the last Lecture The focus of the last lecture was to give

More information

How and what do we see? Segmentation and Grouping. Fundamental Problems. Polyhedral objects. Reducing the combinatorics of pose estimation

How and what do we see? Segmentation and Grouping. Fundamental Problems. Polyhedral objects. Reducing the combinatorics of pose estimation Segmentation and Grouping Fundamental Problems ' Focus of attention, or grouping ' What subsets of piels do we consider as possible objects? ' All connected subsets? ' Representation ' How do we model

More information
