On Clusterings Good, Bad and Spectral


Ravi Kannan* (Computer Science, Yale University), Santosh Vempala† (Mathematics, M.I.T.), Adrian Vetta‡ (Mathematics, M.I.T.)

Abstract

We propose a new measure for assessing the quality of a clustering. A simple heuristic is shown to give worst-case guarantees under the new measure. Then we present two results regarding the quality of the clustering found by a popular spectral algorithm. One proffers worst-case guarantees whilst the other shows that if there exists a good clustering then the spectral algorithm will find one close to it.

1 Introduction

Clustering, or partitioning into similar groups, is a problem with many variants in mathematics as well as in the applied sciences. The availability of vast amounts of data has revitalized research on the problem. Over the years, several clever heuristics have been invented for clustering. While many of them are special-purpose, the heuristic known as spectral clustering stands out in that it has been used in a variety of situations, often successfully. This method is described in section 2. Roughly speaking, spectral clustering is the technique of partitioning the rows of a matrix according to their components in the top few singular vectors of the matrix.

The main motivation of this paper was to analyze the performance of spectral clustering. However, this is inextricably linked to the question of how to measure the quality of a clustering. The heuristics used in practice are usually defended empirically. Theoreticians, meanwhile, have been busy analyzing quality measures that are seductively simple to define (e.g. k-median, min sum, min diameter, etc.). Both approaches have obvious shortcomings: the justification provided by practitioners is typically case-by-case and experimental ("it works well on my data"); the measures thus far analyzed by theoreticians are easy to fool, i.e. there are simple examples where it is clear what the right clustering is, but optimizing the measures defined produces undesirable solutions (see section 3).

In this paper we propose a new bicriteria measure of the quality of a clustering, based on expansion-like properties of the underlying pairwise similarity graph. Roughly speaking, the quality of a clustering is given by two parameters: α, the minimum conductance of the clusters, and ε, the ratio of the weight of inter-cluster edges to the total weight of all edges. The objective is to find an (α, ε)-clustering that maximizes α and minimizes ε. Hence, imposing a lower bound on α, we strive to minimize ε, or vice versa. In section 3, we argue that although the new measure is more complex, it does not have the drawbacks of earlier quality measures. In section 4 we study an algorithm for finding nearly optimal solutions under the new measure and prove that it yields simultaneous poly-logarithmic approximations to both parameters. Then, in section 5 we turn our attention to analyzing spectral clustering under the new measure. We show that it too has worst-case guarantees, and also show that if the given data has a rather good clustering (i.e. α is large and ε is small), then the spectral algorithm will find a clustering close to the optimal one. Spectral clustering has another attractive feature: using the randomized algorithms of [7] and [5] it can be implemented extremely quickly (nearly linear time). We discuss this in section 6.

(* Supported in part by NSF grant CCR. † Supported by NSF CAREER award CCR. ‡ Supported in part by NSF CAREER award CCR and in part by a Wolfe fellowship.)

2 Spectral Clustering Algorithms

As stated, the primary motivation for this paper was to provide a theoretical analysis of spectral clustering. Spectral clustering refers to the general technique of partitioning the rows of a matrix according to their components in the top few singular vectors of the matrix. The underlying problem, that of clustering the rows of a matrix, is ubiquitous. We mention three special cases that are all of independent interest:

- The matrix encodes the pairwise similarities of the vertices of a graph.
- The rows of the matrix are points in a d-dimensional Euclidean space, and the columns are the coordinates.
- The rows of the matrix are the documents of a corpus and the columns are terms; the (i, j) entry encodes information about the occurrence of the jth term in the ith document.

Given a matrix A, the spectral algorithm for clustering the rows of A is given below.

SPECTRAL ALGORITHM I
Find the top k left singular vectors u_1, …, u_k of A, with singular values σ_1 ≥ ⋯ ≥ σ_k.
Let C be the matrix whose jth column is σ_j u_j.
Place row i in cluster j if C_ij is the largest entry in the ith row of C.

The algorithm has the following interpretation. Suppose the rows of A are points in a high-dimensional space. Then we first find the rank-k subspace that best approximates A, and project all the points onto this subspace. This subspace is spanned by the top k right singular vectors of A. Each singular vector represents a cluster; to get a clustering we map each projected point to the (cluster defined by the) singular vector that is closest to it in angle. In section 5, we describe a recursive variant of this algorithm.
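To make the procedure concrete, here is a minimal sketch of Spectral Algorithm I in Python/NumPy. This is our own illustration, not code from the paper; the function name and the toy data are invented, and numpy.linalg.svd stands in for whatever SVD routine one prefers.

```python
import numpy as np

def spectral_cluster_rows(A, k):
    """Sketch of Spectral Algorithm I: project the rows of A onto the
    top-k singular subspace and assign each row to the singular
    direction on which it has the largest component."""
    # A = U diag(s) V^T, singular values in s (decreasing).
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # C has jth column s_j * u_j, i.e. C = A V_k: the rows of A
    # expressed in the top-k right singular directions.
    C = U[:, :k] * s[:k]
    # Row i goes to the cluster j maximizing C_ij.
    return np.argmax(C, axis=1)

# Toy usage: two noisy groups of similar rows.
rng = np.random.default_rng(0)
A = np.vstack([rng.normal(1.0, 0.1, (5, 4)),
               rng.normal(-1.0, 0.1, (5, 4))])
print(spectral_cluster_rows(A, 2))
```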

3 What is a good clustering?

How good is the spectral algorithm? Intuitively, the algorithm performs well if points that are similar are assigned to the same cluster and points that are different are assigned to different clusters. Of course, this may not be possible for every pair of points, and so we compare the clustering found by the algorithm to the optimal one for the given matrix. This leads to the question of what exactly is an optimal clustering. To provide a quantitative answer, we first need to define a measure of the quality of a clustering. In recent years several combinatorial measures of clustering quality have been investigated in detail. These include minimum diameter, k-center, k-median, and minimum sum (for example: [3], [4], [6], [8], [9], etc.).

All these measures, although mathematically attractive due to their simple descriptions, are easy to fool. That is, one can construct examples with the property that the best clustering is obvious, yet an algorithm that optimizes one of these measures finds a clustering that is substantially different and therefore unsatisfactory. Such examples for the diameter and k-median measures are presented in Figs. 1 and 2. In the figures, nearby points represent highly similar points, and points that are farther apart are less similar.

Figure 1. Optimizing the diameter produces B while A is clearly more desirable.

Figure 2. The inferior clustering B is found by optimizing the k-median or min sum measure.

Consider a clustering that minimizes the maximum diameter of the clusters, the diameter of a cluster being the largest distance (say) between two points in the cluster. It is NP-hard to find such a clustering, but this is not our main concern. What is worrisome about the example shown in Fig. 1 is that the optimal solution (B) produces a cluster which contains points that should have been separated. The reason such a poor cluster is produced is that, although we have minimized the maximum dissimilarity between points in a cluster, this was at the expense of creating a cluster with many dissimilar points. The clustering (A), on the other hand, although it leads to a larger maximum diameter, is desirable since it better satisfies the goal of keeping similar points together and dissimilar points apart. This problem also arises for the k-median and minimum sum measures (see, for example, the case shown in Fig. 2); they may produce clusters of poor quality.

For the purpose of motivating our proposed measures, let us model the clustering problem via an edge-weighted complete graph whose vertices need to be partitioned. The weight a_ij of an edge represents the similarity of the vertices (points) i and j. Thus, the graph for points in space would have high edge weights for points that are close together and low edge weights for points that are far apart. Note that associated with the graph is an n × n symmetric matrix A with entries a_ij; here we assume that the a_ij are nonnegative.

A cluster will be a subgraph. The quality of a cluster should be determined by how similar the points within it are. In particular, if there is a cut of small weight that divides the cluster into two pieces of comparable size, then the cluster has lots of pairs of vertices that are dissimilar and hence it is of low quality. This might suggest that the quality of a subgraph as a cluster is the minimum cut of the subgraph. However, this is misleading, as illustrated in Fig. 3.

Figure 3. The second subgraph is of higher quality as a cluster although it has a smaller mincut.

In this example edges represent high-similarity pairs and non-edges represent pairs that are highly dissimilar. The minimum cut of the first subgraph is larger than that of the second subgraph. This is because the second subgraph has low-degree vertices. Nevertheless, the second subgraph is a higher quality cluster. This can be attributed to the fact that in the first subgraph there is a cut whose weight is small relative to the sizes of the pieces it creates.

A quantity that measures the relative cut size is the expansion. The expansion of a cut is the ratio of the total weight of the cut edges to the number of vertices in the smaller part created by the cut. Formally, we denote the expansion of a cut (S, V∖S) by

    ψ(S) = w(S, V∖S) / min(|S|, |V∖S|),   where w(S, V∖S) = Σ_{i∈S, j∉S} a_ij.

We say that the expansion of a graph is the minimum expansion over all the cuts of the graph. Our first measure of the quality of a cluster is the expansion of the subgraph corresponding to it. The expansion of a clustering is the minimum expansion of one of its clusters.

The measure defined above gives equal importance to all the vertices of the given graph. This, though, may lead to a rather taxing requirement. For example, in order to accommodate a vertex i with very little similarity to all the other vertices combined (i.e., Σ_j a_ij is small), the expansion will have to be very low. Arguably it is more prudent to give greater importance to vertices that have many similar neighbours and lesser importance to vertices that have few similar neighbours. This can be done by a direct generalization of the expansion, called the conductance, in which subsets of vertices are weighted to reflect their importance. The conductance of a cut (S, V∖S) in G is denoted by

    φ(S) = w(S, V∖S) / min(a(S), a(V∖S)),

where a(S) = a(S, V) = Σ_{i∈S} Σ_{j∈V} a_ij. The conductance of a graph is the minimum conductance over all the cuts of the graph: φ(G) = min_{S⊆V} φ(S). In order to quantify the quality of a clustering we generalise the definition of conductance further. Take a cluster C ⊆ V and a cut (S, C∖S) within C, where S ⊆ C. Then we say that the conductance of S in C is

    φ(S, C) = w(S, C∖S) / min(a(S), a(C∖S)).

The conductance of a cluster, φ(C), will then be the smallest conductance of a cut within the cluster. The conductance of a clustering is the minimum conductance of its clusters. This leads to the following optimization problem: given a graph and an integer k, find a k-clustering of maximum conductance. Notice that optimizing the expansion/conductance gives the right clustering in the examples of Figs. 1 and 2.

There is still a problem with the above clustering measure. The graph might consist mostly of clusters of high quality and a few points that create clusters of very poor quality, so that any clustering necessarily has a poor overall quality (since we have defined the quality of a clustering to be the minimum over all its clusters).
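The quantities defined above are straightforward to compute for a cut given as a set of vertex indices. The following Python sketch (our own; all names are assumptions) follows the definitions literally, with a(S) measured with respect to the whole graph:

```python
import numpy as np

def cut_weight(A, S):
    """w(S, V \\ S): total weight of edges crossing the cut."""
    mask = np.zeros(A.shape[0], dtype=bool)
    mask[np.asarray(S)] = True
    return A[mask][:, ~mask].sum()

def expansion(A, S):
    """psi(S) = w(S, V \\ S) / min(|S|, |V \\ S|)."""
    n = A.shape[0]
    return cut_weight(A, S) / min(len(S), n - len(S))

def conductance(A, S):
    """phi(S) = w(S, V \\ S) / min(a(S), a(V \\ S)), where
    a(S) = sum of all edge weights incident to S in the whole graph."""
    mask = np.zeros(A.shape[0], dtype=bool)
    mask[np.asarray(S)] = True
    a_S, a_rest = A[mask].sum(), A[~mask].sum()
    return cut_weight(A, S) / min(a_S, a_rest)
```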

In fact, to boost the overall quality, the best clustering might create many clusters of relatively low quality so that the minimum is as large as possible. Such an example is shown in Fig. 4.

Figure 4. Assigning the outliers leads to poor quality clusters.

One way to handle the problem might be to avoid restricting the number of clusters. But this could lead to a situation where many points are in singleton clusters. Instead we measure the quality of a clustering using two criteria: the first is the minimum quality of the clusters (called α), and the second is the fraction of the total weight of edges that are not covered by the clusters (called ε).

Definition 1. We call a partition {C_1, C_2, …, C_l} of V an (α, ε)-partition if:
1. the conductance of each C_i is at least α;
2. the total weight of inter-cluster edges is at most an ε fraction of the total edge weight.

The associated bicriteria optimization problem is the following.

Problem 1. Find an (α, ε)-partition that simultaneously maximizes α and minimizes ε.

Another way to pose the problem is: given α, find an (α, ε)-partition that minimizes ε (or vice versa). Note that the number of clusters is not restricted. For this measure of cluster quality, we are not aware of any bad examples for the (α, ε)-clustering problem, i.e. examples where solving this bicriteria optimization problem gives a clustering that is clearly inferior to the best clustering. An important question is whether this observation holds up under a systematic experimental study. The focus of the rest of this paper, however, is to consider the measure from a theoretical standpoint and to examine in detail the performance of spectral clustering algorithms.

4 An approximation algorithm

Observe that Problem 1 is NP-hard: consider maximising α whilst setting ε to zero; this problem is equivalent to finding the conductance of a given graph. As a result, it is appropriate to develop approximation guarantees for the problem. Hence, here we present a simple heuristic, outlined below, and show that it is a poly-logarithmic approximation algorithm for our clustering problem.

APPROXIMATE-CUT ALGORITHM
Find an approximate sparsest cut in G.
If the conductance of the cut is at least α*, then output G as a cluster.
Else recurse on the two induced pieces.

The idea behind our algorithm is simple. Given G, find a cut (S, V∖S) of minimum expansion (or conductance, depending on the quality measure). If the cut has quality, say, at least α* then stop. Otherwise, recurse on the subgraphs induced by S and V∖S. Clearly the algorithm produces clusters with quality at least α*. Since finding a cut of minimum conductance is hard, we cannot implement this algorithm in polynomial time in its current form. However, a simple modification allows for a polynomial implementation. Leighton and Rao [10] gave a polynomial-time algorithm that finds a cut with conductance at most O(log n) times the minimum. So, instead of finding a cut of minimum conductance at each step, we apply the Leighton-Rao procedure and find an approximate sparsest cut. This modified algorithm, therefore, produces clusters with conductance at least α*/O(log n).
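The recursion itself is a few lines; the heavy lifting is in the sparsest-cut subroutine. A Python skeleton under our assumptions (the subroutine is passed in as a black box standing in for the Leighton-Rao procedure; its interface is invented):

```python
def approximate_cut(G, alpha_star, find_approx_sparsest_cut):
    """Recursive skeleton of the Approximate-Cut algorithm. G is a
    collection of vertices; find_approx_sparsest_cut(G) is assumed to
    return (S, T, phi): a cut of G together with its conductance phi."""
    if len(G) <= 1:
        return [G]                  # nothing left to cut
    S, T, phi = find_approx_sparsest_cut(G)
    if phi >= alpha_star:
        return [G]                  # G is output as a finished cluster
    # Otherwise recurse on the two induced pieces.
    return (approximate_cut(S, alpha_star, find_approx_sparsest_cut) +
            approximate_cut(T, alpha_star, find_approx_sparsest_cut))
```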
Suppose now that the graph contains an (α, ε)-partition. We obtain poly-logarithmic guarantees for our problem. First we consider the situation in which the quality of a cluster is given via the expansion measure. Implementing the algorithm with an appropriate threshold α*, we obtain the following theorem.

Theorem 1. If there is an (α, ε)-partition, where the quality of a cluster is measured with respect to expansion, then the Approximate-Cut algorithm will find an (α/O(log n), O(ε log² n))-partition.

Proof: At any point in time, the algorithm induces a partition of the vertex set V of the graph into V_1, V_2, …, V_s, where V = V_1 ∪ ⋯ ∪ V_s. Clearly, upon termination, each cluster has the desired expansion. Let us call the cuts that the algorithm partitions along (S_1, T_1), (S_2, T_2), …, where |S_i| ≤ |T_i|. Note that S_i ∪ T_i is just the subset of V being partitioned at the ith step. Let {C_1, …, C_l} be the optimal (α, ε)-clustering. Observe that a(V) is twice the total edge weight W in the graph, and so εW is the weight of the inter-cluster edges in the optimal solution. Now we divide the cuts into two groups: the first group H consists of the ones with high expansion within clusters, and the second group consists of the remaining cuts. The precise definitions follow.

We write w(S_i, T_i) = Σ_{u∈S_i, v∈T_i} a_uv for the weight of the ith cut, and we denote by w_I(S_i, T_i) the sum of the weights of the intra-cluster edges of the cut, i.e. w_I(S_i, T_i) = Σ_{j≤l} w(S_i ∩ C_j, T_i ∩ C_j). Let

    H = { i : w_I(S_i, T_i) ≤ (α / (2 log n)) · |S_i| }.

We first deal with the i ∈ H. Since each optimal cluster C_j has expansion at least α, we have w(S_i ∩ C_j, T_i ∩ C_j) ≥ α · min(|S_i ∩ C_j|, |T_i ∩ C_j|), and so, for i ∈ H,

    Σ_{j≤l} min(|S_i ∩ C_j|, |T_i ∩ C_j|) ≤ w_I(S_i, T_i)/α ≤ |S_i|/2.    (1)

For each i ∈ H, we define another cut (S'_i, T'_i) of S_i ∪ T_i in the following manner: for each j ≤ l, if |S_i ∩ C_j| ≥ |T_i ∩ C_j|, then put all of (S_i ∪ T_i) ∩ C_j into S'_i; otherwise, put all of (S_i ∪ T_i) ∩ C_j into T'_i. It follows from (1) that min(|S'_i|, |T'_i|) ≥ |S_i|/2. Now, using the fact that (S_i, T_i) is an approximately sparsest cut in S_i ∪ T_i, we observe that

    w(S_i, T_i)/|S_i| ≤ O(log n) · w(S'_i, T'_i)/min(|S'_i|, |T'_i|),

and hence w(S_i, T_i) ≤ O(log n) · w(S'_i, T'_i). For any subset S of V, let ∂(S) denote the set of inter-cluster edges incident to a vertex in S. Then w(S'_i, T'_i) ≤ w(∂(S'_i)), since every edge in the cut (S'_i, T'_i) is an inter-cluster edge. So we have

    w(S_i, T_i) ≤ O(log n) · w(∂(S'_i)).

Claim 1. For each vertex u ∈ V, there are at most 2 log n values of i such that u belongs to S'_i.

Proof of Claim: Fix u ∈ V, and let I = { i : u ∈ S'_i ∩ S_i } and J = { i : u ∈ S'_i ∖ S_i }. Clearly |I| ≤ log n: if u ∈ S_i ∩ S_k with i < k, then (S_k, T_k) partitions a subset of S_i, and so |S_k| ≤ |S_i|/2; that is, the size of the S-set containing u at least halves between two successive indices in I. Now suppose i, k ∈ J with i < k. Then u ∈ T_i, and later T_i (or a subset of T_i) is partitioned into (S_k, T_k) with u ∈ T_k; using the construction of the cuts (S'_i, T'_i), one checks that |T_k| ≤ |T_i|/2. Thus the size of the T-set containing u halves between two successive indices in J, and so |J| ≤ log n as well. ∎

By Claim 1, each inter-cluster edge is incident to S'_i for at most 4 log n values of i, and the total weight of the inter-cluster edges is εW. Thus we have

    Σ_{i∈H} w(S_i, T_i) ≤ O(log n) · Σ_{i∈H} w(∂(S'_i)) = O(εW log² n).    (2)

Now we deal with the i not in H. First, suppose that all the cuts together induce a partition of C_j into P_{j,1}, P_{j,2}, …, P_{j,r_j}; we order them so that P_{j,1} is the largest. Every edge between two vertices in C_j which belong to different sets of the partition must be cut by some cut (S_i, T_i), and conversely, every intra-cluster edge of every cut (S_i, T_i) must have its two endpoints in different sets of some such partition. So, given that C_j has expansion at least α,

    Σ_{all i} w_I(S_i, T_i) ≥ (α/2) · Σ_j Σ_{s=2}^{r_j} min(|P_{j,s}|, |C_j ∖ P_{j,s}|).

For each vertex u in C_j, there can be at most log n values of i such that u belongs to the smaller of the two sets S_i ∩ C_j and T_i ∩ C_j; furthermore, if u ∈ P_{j,1}, then u never belongs to the smaller of the two sets. So we have

    Σ_i Σ_j min(|S_i ∩ C_j|, |T_i ∩ C_j|) ≤ log n · Σ_j Σ_{s=2}^{r_j} |P_{j,s}|.

Combining the last two inequalities with the definition of H, and with the fact that every cut made by the algorithm has small expansion, we obtain

    Σ_{i∉H} w(S_i, T_i) = O(εW log n).    (3)

Also, we have

    Σ_i ( w(S_i, T_i) − w_I(S_i, T_i) ) ≤ εW,    (4)

since each inter-cluster edge appears in at most one cut. Adding (2), (3) and (4), we get the theorem. ∎

Consider now the general case, in which the quality of a cluster is measured with respect to conductance.

Observe that the function a(·) is an additive function on the subsets of V: for disjoint U_1 and U_2 we have a(U_1 ∪ U_2) = a(U_1) + a(U_2), which is akin to the cardinality function. With this observation, we may prove, using methods similar to those of Theorem 1, the following guarantee for the general case.

Theorem 2. If there exists an (α, ε)-partition, where the quality of a cluster is measured in terms of conductance, then the Approximate-Cut algorithm will produce an (α/O(log(n/ε)), O(ε log²(n/ε)))-partition.

We now assess the running times of our algorithms. These are based upon implementing an approximate sparsest cut procedure. The fastest implementation of this procedure, due to Benczur and Karger [2], runs in Õ(n²) time. In addition, note that the algorithms make fewer than n cuts. Hence our heuristics have total running time Õ(n³). This, though, may be too slow for many real-world applications. We discuss a faster, more practical, algorithm in the next section.

5 Performance guarantees for spectral clustering

In this section we describe and analyse a one-at-a-time variant of the spectral algorithm. This algorithm, outlined below, has been used in the field of computer vision [12] and also in the field of web search engines [16]. Observe that the algorithm is similar in flavour to the Approximate-Cut algorithm described in the previous section.

SPECTRAL ALGORITHM II
Normalise A and find its 2nd right eigenvector v.
Find the best ratio cut with respect to v.
If the conductance of the cut is at least α*, then output the cluster.
Else recurse on the induced pieces.

We now elaborate upon this basic description of the algorithm. Initially we normalise our matrix A so that the row sums are equal to one. At any stage in the algorithm we have a clustering {C_1, …, C_s}. For each C_t we consider the |C_t| × |C_t| submatrix B of A restricted to C_t. We renormalise B by setting b_ii = 1 − Σ_{j∈C_t, j≠i} b_ij; hence B is also nonnegative with row sums equal to one. We then find the second eigenvector of B. This is the right eigenvector v corresponding to the second largest eigenvalue λ_2, i.e. Bv = λ_2 v. We then order the elements (rows) of C_t decreasingly with respect to their component in the direction of v. Given this ordering, say u_1, u_2, …, u_r, find the minimum ratio cut in C_t: this is the prefix cut ({u_1, …, u_j}, C_t ∖ {u_1, …, u_j}), 1 ≤ j < r, of minimum conductance. If this cut has conductance at least α*, then we output the cluster C_t; otherwise, we recurse on the pieces {u_1, …, u_j} and C_t ∖ {u_1, …, u_j}.

For simplicity of analysis, we consider a slightly modified algorithm. If at any point we have a cluster C_t with the property that a(C_t) ≤ (ε/n)·a(V), then we split C_t into singletons and output these vertices. Notice that this latter operation may be performed at small cost, since such singletons are of little importance. Observe, also, that our algorithm performs at least as well as this modified version.

We are now ready to analyse the performance of our spectral algorithms. We do this in two ways. First we give worst-case performance guarantees for the one-at-a-time variant. Then we show that both spectral algorithms have the property that, given the existence of a good clustering, the algorithm will find one close to it.
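A sketch of one recursive step of Spectral Algorithm II in Python/NumPy, under our assumptions (the paper gives no code; numpy.linalg.eig stands in for an eigensolver, the diagonal padding implements the renormalisation described above, and a(·) is measured with respect to the whole graph):

```python
import numpy as np

def best_ratio_cut(W, idx):
    """One step of Spectral Algorithm II (a sketch; names are ours).
    W is the raw symmetric nonnegative similarity matrix; idx lists
    the rows of the current cluster C_t."""
    idx = list(idx)
    d = W.sum(axis=1)                      # a(u) w.r.t. the whole graph
    sub = W[np.ix_(idx, idx)]
    B = sub / d[idx][:, None]              # rows of the normalised matrix
    # Pad the diagonal so B is nonnegative with row sums exactly 1.
    np.fill_diagonal(B, B.diagonal() + (1.0 - B.sum(axis=1)))
    vals, vecs = np.linalg.eig(B)
    v = vecs[:, np.argsort(-vals.real)[1]].real   # 2nd right eigenvector
    perm = np.argsort(-v)                  # rows by decreasing v-component
    a = d[idx][perm]
    best_j, best_phi = 1, float("inf")
    for j in range(1, len(idx)):           # sweep all prefix cuts
        w_cut = sub[np.ix_(perm[:j], perm[j:])].sum()
        phi = w_cut / min(a[:j].sum(), a[j:].sum())
        if phi < best_phi:
            best_j, best_phi = j, phi
    S = [idx[p] for p in perm[:best_j]]
    T = [idx[p] for p in perm[best_j:]]
    return S, T, best_phi
```

If best_phi is at least the threshold, the caller outputs idx as a cluster; otherwise it recurses on S and T.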
5.1 Worst-Case Guarantees

We will require the following theorem. This result was essentially proved by Sinclair and Jerrum (in their proof of Lemma 3.3 in [13], although it is not mentioned in the statement of the lemma). For completeness, and because Theorem 3 is usually not explicitly stated in the Markov chain literature (and usually includes some other conditions which are not relevant here), we include a proof of this result in the appendix.

Theorem 3. Suppose B is an N × N matrix with nonnegative entries with each row sum equal to 1, and suppose there are positive real numbers π_1, …, π_N summing to 1 such that π_i b_ij = π_j b_ji for all i, j. If v is the right eigenvector of B corresponding to the second largest eigenvalue λ_2, and i_1, i_2, …, i_N is an ordering of 1, 2, …, N so that v_{i_1} ≥ v_{i_2} ≥ ⋯ ≥ v_{i_N}, then

    min_{1≤l<N} ( Σ_{u≤l<t} π_{i_u} b_{i_u i_t} ) / min( Σ_{u≤l} π_{i_u}, Σ_{t>l} π_{i_t} )
        ≤ 2 √( min_{S⊆{1,…,N}} ( Σ_{i∈S, j∉S} π_i b_ij ) / min( π(S), 1 − π(S) ) ).

In words: the best prefix (contiguous) cut in the second-eigenvector ordering has conductance at most 2√φ, where φ is the minimum conductance over all cuts.

Now, implementing the algorithm with an appropriate threshold α*, we may obtain worst-case guarantees in a similar manner to those given by Theorems 1 and 2.

Theorem 4. If A has an (α, ε)-partition, then Spectral Algorithm II will find an (α²/O(log(n/ε)), O(√ε · log(n/ε)))-partition.

Proof: Let us call the cuts the algorithm partitions along (S_1, T_1), (S_2, T_2), …, where we adopt the convention that S_i is the smaller side, i.e., a(S_i) ≤ a(T_i).

Let {C_1, …, C_l} be the optimal (α, ε)-clustering. As in the proof of Theorem 1, let w_I(S_i, T_i) = Σ_{j≤l} w(S_i ∩ C_j, T_i ∩ C_j) denote the intra-cluster weight of a cut, and divide the cuts into two groups, the first group being

    H = { i : w_I(S_i, T_i) ≤ (α / (2 log(n/ε))) · a(S_i) },

and the second group the rest. We first deal with the i ∈ H. For all i ∈ H we have, since each C_j has conductance at least α,

    Σ_{j≤l} min( a(S_i ∩ C_j), a(T_i ∩ C_j) ) ≤ w_I(S_i, T_i)/α ≤ a(S_i) / (2 log(n/ε)).

For each i ∈ H, we define another cut (S'_i, T'_i) of S_i ∪ T_i in the following manner: for each j ≤ l, if a(S_i ∩ C_j) ≥ a(T_i ∩ C_j), then put all of (S_i ∪ T_i) ∩ C_j into S'_i; otherwise, put all of (S_i ∪ T_i) ∩ C_j into T'_i. Notice that then min(a(S'_i), a(T'_i)) ≥ a(S_i)/2.

Now we use Theorem 3 to get an upper bound on w(S_i, T_i) in terms of w(S'_i, T'_i). To this end, first note that if we let π_u = a(u)/Σ_{v∈C_t} a(v) for all u ∈ C_t, then π_u b_uv = π_v b_vu for all u, v ∈ C_t for the matrix B in the algorithm. (This follows from the symmetry of A.) Note also that the algorithm orders the elements of C_t according to their components in the second right eigenvector of B and takes (S_i, T_i) to be the contiguous cut of minimum conductance in this order. So, applying Theorem 3, we obtain

    w(S_i, T_i) / min( a(S_i), a(T_i) ) ≤ 2 √( w(S'_i, T'_i) / min( a(S'_i), a(T'_i) ) ).    (5)

For any subset S of V, let ∂(S) denote the set of inter-cluster edges incident to a vertex in S. Then w(S'_i, T'_i) ≤ w(∂(S'_i)), since every edge in the cut (S'_i, T'_i) is an inter-cluster edge. So we have, from (5),

    w(S_i, T_i) ≤ 2√2 · √( w(∂(S'_i)) · a(S_i) ).    (6)

Claim 2. For each vertex u ∈ V, there are at most log(n/ε) values of i such that u belongs to S_i. Further, there are at most 2 log(n/ε) values of i such that u belongs to S'_i.

Proof of Claim: Fix u ∈ V. If u ∈ S_i ∩ S_k with i < k, then (S_k, T_k) must be a partition of S_i or of a subset of S_i, and we have a(S_k) ≤ a(S_k ∪ T_k)/2 ≤ a(S_i)/2. So a(S_i) reduces by a factor of 2 or greater between two successive times that u belongs to S_i. The maximum value of a(S_i) is at most a(V) and the minimum value is at least (ε/n)·a(V), so the first statement of the claim follows. Now let J = { i : u ∈ S'_i ∖ S_i } and suppose i, k ∈ J with i < k. Then u ∈ T_i, and later T_i (or a subset of T_i) is partitioned into (S_k, T_k); using the construction of the cuts (S'_i, T'_i), one checks that a(T_k) ≤ a(T_i)/2. Thus a(T_i) halves between two successive times in J, and so |J| ≤ log(n/ε). This proves the second statement, since u ∈ S'_i implies that u ∈ S_i or u ∈ S'_i ∖ S_i. ∎

Thus, using the Cauchy-Schwarz inequality together with Claim 2 (which gives Σ_i w(∂(S'_i)) ≤ 4 log(n/ε) · εW and Σ_i a(S_i) ≤ log(n/ε) · a(V)), we have

    Σ_{i∈H} w(S_i, T_i) ≤ 2√2 · Σ_{i∈H} √( w(∂(S'_i)) · a(S_i) )    (7)
        ≤ 2√2 · √( Σ_i w(∂(S'_i)) ) · √( Σ_i a(S_i) ) = O( √ε · log(n/ε) ) · a(V).    (8)

Now, we deal with the i not in H. First, suppose that all the cuts together induce a partition of C_j into P_{j,1}, P_{j,2}, …, P_{j,r_j}, ordered so that a(P_{j,1}) is largest. Every edge between two vertices in C_j which belong to different sets of the partition must be cut by some cut (S_i, T_i), and conversely, every intra-cluster edge of every cut (S_i, T_i) must have its two endpoints in different sets of such a partition. So, given that C_j has conductance at least α, we obtain

    Σ_{all i} w_I(S_i, T_i) ≥ (α/2) · Σ_j Σ_{s=2}^{r_j} min( a(P_{j,s}), a(C_j ∖ P_{j,s}) ).

For each vertex u there can be at most log(n/ε) values of i such that u belongs to the smaller (according to a(·)) of the two sets S_i and T_i. So, combining the inequality above with the definition of H and with the fact that every cut made by the algorithm has conductance below the threshold, we obtain

    Σ_{i∉H} w(S_i, T_i) = O(√ε · log(n/ε)) · a(V).    (9)

Also, we have

    Σ_i ( w(S_i, T_i) − w_I(S_i, T_i) ) ≤ ε · a(V),    (10)

since each inter-cluster edge belongs to at most one cut (S_i, T_i). We add up (8), (9) and (10). Also note that splitting up all the C_t with a(C_t) ≤ (ε/n)·a(V) into singletons costs us at most ε·a(V) on the whole. Recalling that a(V) equals twice the total sum of edge weights, the bound regarding the cost of the inter-cluster edge weights follows. Finally, note that at the end each part has conductance at least α²/O(log(n/ε)), where this is trivially true of singleton sets. The theorem follows. ∎

5.2 In the Presence of a Good Clustering

Suppose a given input matrix has a particularly good clustering, i.e. it can be partitioned into blocks such that the conductance of each block as a cluster is high and the total weight of inter-cluster edges is small. We model this situation as follows. We assume that A is normalized so that each row sum is 1. The matrix is written as A = B + E, where B is a block-diagonal matrix whose blocks B_1, …, B_k correspond to the clusters of the optimal clustering, and E captures the edges that run between clusters. Rather than conductance, it will be easier to state the theorem in terms of the minimum eigenvalue gap of the blocks. The eigenvalue gap of a block B_i is the difference between its two largest eigenvalues. The eigenvalue gap is closely related to the conductance (φ²/2 ≤ 1 − λ_2 ≤ 2φ). Ideally we would like to show that if the eigenvalue gap is large for each block of B and the total weight of the entries of E is small, then the spectral algorithm works. Theorem 5 shows that if the gaps are large and the 2-norm of E is small, then the spectral algorithm finds a nearly optimal clustering. The 2-norm of an n × m matrix M is

    ‖M‖_2 = max_{x ∈ R^m, x ≠ 0} ‖Mx‖ / ‖x‖.

We are now ready to state and prove the following theorem. Analyses in a similar spirit have been conducted by Papadimitriou et al. [11] and Fiat et al. [1].

Theorem 5. Let the eigenvalue gap of each cluster B_i of the optimal clustering be at least λ (0 < λ ≤ 1). In addition, let the difference between the kth and (k+1)th eigenvalues of B also be at least λ. Suppose E has 2-norm at most δ, where δ ≤ cλ for some constant c. Then, if the optimal cluster sizes are within a bounded ratio, the spectral algorithm applied to A finds a clustering that differs from the optimal one in O((δ/λ)²·n) rows.
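Before the proof, a small numerical illustration of this setup (entirely our own construction, not an experiment from the paper): perturb a block-diagonal row-stochastic matrix B by a matrix E of small 2-norm and check that the top-k invariant subspace barely moves.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 60, 3                              # three equal blocks of 20 rows
B = np.zeros((n, n))
for b in range(k):                        # block-diagonal stochastic B
    s = slice(20 * b, 20 * (b + 1))
    blk = rng.random((20, 20)) + 0.5
    B[s, s] = blk / blk.sum(axis=1, keepdims=True)
E = rng.normal(0, 1, (n, n))
E *= 0.01 / np.linalg.norm(E, 2)          # force the 2-norm of E to 0.01
A = B + E
# Compare top-k subspaces of B and A via principal angles.
Ub = np.linalg.svd(B)[0][:, :k]
Ua = np.linalg.svd(A)[0][:, :k]
cosines = np.linalg.svd(Ub.T @ Ua)[1]     # cosines of principal angles
print("smallest cosine:", cosines.min())  # close to 1: subspaces align
```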

Proof: The matrix A can be viewed as a perturbation of B by the matrix E. Let the first k eigenvectors of B form the columns of F_1, and the last n − k the columns of F_2, and let Λ_1 = F_1ᵀ B F_1 and Λ_2 = F_2ᵀ B F_2. Then F_2ᵀ B F_1 = 0, and Λ_1 is the diagonal matrix of the top k eigenvalues of B. A theorem of Stewart [15] (Theorem 4.11, page 745) tells us that, for some matrix P with 2-norm at most O(δ/λ), the columns of (F_1 + F_2 P)(I + PᵀP)^(−1/2) form an orthonormal basis for a subspace that is invariant for A. Hence we may take this basis, call it F̂_1, to be an orthonormal basis for a subspace invariant for A, and it follows that F̂_1 = F_1 + G for a matrix G with 2-norm O(δ/λ).

Recall that we partition our rows with respect to the n × k matrix F̂_1. Hence, we are interested in the difference between the clusterings induced by F_1 and F̂_1. From the previous paragraph, we have that F_1 − F̂_1 has 2-norm O(δ/λ). By a suitable rotation, we can take F_1 to be the matrix whose ith column has its support on the rows of B_i. Further, the entries of the ith column are all close to 1/√(n_i), where n_i is the number of rows in the ith block. Now consider a row that is misclassified by F̂_1, and suppose it belongs to block i in the optimal clustering. Then, in the matrix F̂_1, in some column l other than i this row has an entry higher than its entry in column i. This implies a difference of at least 1/(2√(n_i)) between F_1 and F̂_1 in that row, in either column i or column l. By our assumption that the block sizes are roughly equal, and by the bound of O(δ/λ) on the 2-norm of the difference between F_1 and F̂_1, we have that in total at most O((δ/λ)²·n) rows are misclassified, i.e. the clustering produced differs from the optimal one in O((δ/λ)²·n) rows. ∎

Theorem 5 can be generalized to the case of nonsymmetric matrices, using a corresponding theorem of Stewart for the perturbation of left and right singular subspaces.

6 Conclusion

There are two aspects to analyzing a clustering algorithm: (i) quality: how good is the clustering produced? and (ii) speed: how fast can it be found? In this paper we have mostly dealt with (i), while taking care that the algorithms run in polynomial time. The spectral algorithms depend on the time it takes to find the top (or top k) singular vector(s). While this can be done exactly in polynomial time, it is still too expensive for applications such as information retrieval. The recent work of [7] and [5] on randomized algorithms for low-rank approximation tackles this problem. The running time of the first algorithm depends only on the quality of the desired approximation and not on the size of the matrix, but it assumes that the entries of the matrix can be sampled in a specific manner. The second algorithm needs no assumptions and has a running time that is linear in the number of non-zero entries.

References

[1] Y. Azar, A. Fiat, A. Karlin, F. McSherry and J. Saia. Spectral analysis of data. Manuscript.
[2] A. Benczur and D. Karger. Approximate s-t min-cuts in Õ(n²) time. In Proc. of 28th STOC, pp. 47-55, 1996.
[3] M. Charikar, C. Chekuri, T. Feder and R. Motwani. Incremental clustering and dynamic information retrieval. In Proc. of 29th STOC, 1997.
[4] M. Charikar, S. Guha, D. Shmoys and E. Tardos. A constant-factor approximation for the k-median problem. In Proc. of 31st STOC, pp. 1-10, 1999.
[5] P. Drineas, A. Frieze, R. Kannan, S. Vempala and V. Vinay. Clustering in large graphs and matrices. In Proc. of 10th SODA, 1999.
[6] M.E. Dyer and A. Frieze. A simple heuristic for the p-center problem. Oper. Res. Lett., 3, pp. 285-288, 1985.
[7] A. Frieze, R. Kannan and S. Vempala. Fast Monte-Carlo algorithms for finding low-rank approximations. In Proc. of 39th FOCS, 1998.
[8] P. Indyk. A sublinear time approximation scheme for clustering in metric spaces. In Proc. of 40th FOCS, pp. 154-159, 1999.
[9] K. Jain and V. Vazirani. Primal-dual approximation algorithms for metric facility location and k-median problems. In Proc. of 40th FOCS, pp. 2-13, 1999.
[10] T. Leighton and S. Rao. An approximate max-flow min-cut theorem for uniform multicommodity flow problems with applications to approximation algorithms. In Proc. of 28th FOCS, pp. 256-269.
[11] C. Papadimitriou, P. Raghavan, H. Tamaki and S. Vempala. Latent semantic indexing: a probabilistic analysis. In Proc. of 17th Symposium on Principles of Database Systems, 1998.
[12] J. Shi and J. Malik. Normalized cuts and image segmentation. In IEEE Conf. on Computer Vision and Pattern Recognition, 1997.
[13] A. Sinclair and M. Jerrum. Approximate counting, uniform generation and rapidly mixing Markov chains. Information and Computation, 82, pp. 93-133, 1989.
[14] D. Spielman and S. Teng. Spectral partitioning works: planar graphs and finite element meshes. In Proc. of 37th FOCS, 1996.
[15] G. W. Stewart. Error and perturbation bounds for subspaces associated with certain eigenvalue problems. SIAM Review, 15(4), 1973.
[16]

A Proof of the Conductance Inequality

Theorem 6. Suppose B is an N × N matrix with nonnegative entries with each row sum equal to 1, and suppose there are positive real numbers π_1, …, π_N summing to 1 such that π_i b_ij = π_j b_ji for all i, j. If v is the right eigenvector of B corresponding to the second largest eigenvalue λ_2, and i_1, i_2, …, i_N is an ordering of 1, 2, …, N so that v_{i_1} ≥ v_{i_2} ≥ ⋯ ≥ v_{i_N}, then

    min_{1≤l<N} ( Σ_{u≤l<t} π_{i_u} b_{i_u i_t} ) / min( Σ_{u≤l} π_{i_u}, Σ_{t>l} π_{i_t} )
        ≤ 2 √( min_{S⊆{1,…,N}} ( Σ_{i∈S, j∉S} π_i b_ij ) / min( π(S), 1 − π(S) ) ).

Proof: Let D = diag(√π_1, …, √π_N) and Q = D B D^(−1). Then q_ij = √(π_i/π_j)·b_ij = √(π_j/π_i)·b_ji = q_ji, by the condition π_i b_ij = π_j b_ji, and so Q is symmetric. The eigenvalues of Q are the same as those of B, and the largest eigenvalue is 1. Also, (√π_1, …, √π_N) Q = (√π_1, …, √π_N), and so (√π_1, …, √π_N)ᵀ is the eigenvector of Q corresponding to the eigenvalue 1. So we have

    λ_2 = max_{x ⊥ (√π_1,…,√π_N)ᵀ} (xᵀ Q x) / (xᵀ x),   and thus   1 − λ_2 = min_{x ⊥ (√π_1,…,√π_N)ᵀ} ( xᵀ (I − Q) x ) / ( xᵀ x ).

Putting y = D^(−1) x, and using the row sums of B together with the symmetry condition, we get

    1 − λ_2 = min_{Σ_i π_i y_i = 0} ( Σ_{i<j} π_i b_ij (y_i − y_j)² ) / ( Σ_i π_i y_i² ).

If y = v attains the minimum above, then Dv is the eigenvector of Q corresponding to the eigenvalue λ_2, and so v is the right eigenvector of B corresponding to λ_2. Our ordering is according to v, as in the statement of the theorem; assume, for simplicity of notation, that the indices (of the rows and corresponding columns of B) have been reordered so that v_1 ≥ v_2 ≥ ⋯ ≥ v_N. Define r to satisfy Σ_{i<r} π_i ≤ 1/2 < Σ_{i≤r} π_i, and let z_i = v_i − v_r for 1 ≤ i ≤ N. Then z_1 ≥ z_2 ≥ ⋯ ≥ z_r = 0 ≥ z_{r+1} ≥ ⋯ ≥ z_N, and since Σ_i π_i v_i = 0 we have Σ_i π_i v_i² ≤ Σ_i π_i z_i², so that

    1 − λ_2 ≥ ( Σ_{i<j} π_i b_ij (z_i − z_j)² ) / ( Σ_i π_i z_i² ).

Now, by Cauchy-Schwarz,

    ( Σ_{i<j} π_i b_ij (z_i − z_j)² ) · ( Σ_{i<j} π_i b_ij (|z_i| + |z_j|)² ) ≥ ( Σ_{i<j} π_i b_ij |z_i − z_j| (|z_i| + |z_j|) )².    (11)

Observe that, for i < j, we have |z_i − z_j|·(|z_i| + |z_j|) ≥ Σ_{k=i}^{j−1} |z_k² − z_{k+1}²|: if z_i and z_j have the same sign then the left-hand side equals |z_i² − z_j²|, which telescopes exactly along the monotone stretch; otherwise the left-hand side equals (|z_i| + |z_j|)² ≥ z_i² + z_j², which (since z_r = 0) again equals the telescoping sum. Also,

    Σ_{i<j} π_i b_ij (|z_i| + |z_j|)² ≤ 2 Σ_{i<j} π_i b_ij (z_i² + z_j²) ≤ 2 Σ_i π_i z_i².

Now let S_k = {1, 2, …, k} and C_k = Σ_{i≤k<j} π_i b_ij, and let

    φ̂ = min_{1≤k<N} C_k / min( π(S_k), 1 − π(S_k) ).

Then, using the observation above,

    Σ_{i<j} π_i b_ij |z_i − z_j| (|z_i| + |z_j|) ≥ Σ_{k=1}^{N−1} C_k |z_k² − z_{k+1}²|
        ≥ φ̂ Σ_{k=1}^{N−1} min( π(S_k), 1 − π(S_k) ) |z_k² − z_{k+1}²| ≥ φ̂ Σ_i π_i z_i²,

where the last step is partial summation: for k < r the minimum is π(S_k) and the terms sum to Σ_{i<r} π_i z_i², and symmetrically for k ≥ r (recall z_r = 0). Combining this with (11) and the bound on the second factor,

    1 − λ_2 ≥ ( φ̂ Σ_i π_i z_i² )² / ( 2 (Σ_i π_i z_i²)² ) = φ̂² / 2.

Finally, for any cut (S, S̄), substituting the test vector y with y_i = 1 − π(S) for i ∈ S and y_i = −π(S) otherwise into the characterization of 1 − λ_2 above gives

    1 − λ_2 ≤ ( Σ_{i∈S, j∉S} π_i b_ij ) / ( π(S)(1 − π(S)) ) ≤ 2 φ(S).

Combining the two bounds on 1 − λ_2 yields φ̂ ≤ 2√(φ(S)) for every S, which is the theorem. ∎


More information

CS675: Convex and Combinatorial Optimization Spring 2018 The Simplex Algorithm. Instructor: Shaddin Dughmi

CS675: Convex and Combinatorial Optimization Spring 2018 The Simplex Algorithm. Instructor: Shaddin Dughmi CS675: Convex and Combinatorial Optimization Spring 2018 The Simplex Algorithm Instructor: Shaddin Dughmi Algorithms for Convex Optimization We will look at 2 algorithms in detail: Simplex and Ellipsoid.

More information

Byzantine Consensus in Directed Graphs

Byzantine Consensus in Directed Graphs Byzantine Consensus in Directed Graphs Lewis Tseng 1,3, and Nitin Vaidya 2,3 1 Department of Computer Science, 2 Department of Electrical and Computer Engineering, and 3 Coordinated Science Laboratory

More information

Clustering Data: Does Theory Help?

Clustering Data: Does Theory Help? Clustering Data: Does Theory Help? Ravi Kannan December 10, 2013 Ravi Kannan Clustering Data: Does Theory Help? December 10, 2013 1 / 27 Clustering Given n objects - divide (partition) them into k clusters.

More information

Clustering Using Graph Connectivity

Clustering Using Graph Connectivity Clustering Using Graph Connectivity Patrick Williams June 3, 010 1 Introduction It is often desirable to group elements of a set into disjoint subsets, based on the similarity between the elements in the

More information

Online Scheduling for Sorting Buffers

Online Scheduling for Sorting Buffers Online Scheduling for Sorting Buffers Harald Räcke ½, Christian Sohler ½, and Matthias Westermann ¾ ½ Heinz Nixdorf Institute and Department of Mathematics and Computer Science Paderborn University, D-33102

More information

Faster parameterized algorithms for Minimum Fill-In

Faster parameterized algorithms for Minimum Fill-In Faster parameterized algorithms for Minimum Fill-In Hans L. Bodlaender Pinar Heggernes Yngve Villanger Technical Report UU-CS-2008-042 December 2008 Department of Information and Computing Sciences Utrecht

More information

Coloring 3-Colorable Graphs

Coloring 3-Colorable Graphs Coloring -Colorable Graphs Charles Jin April, 015 1 Introduction Graph coloring in general is an etremely easy-to-understand yet powerful tool. It has wide-ranging applications from register allocation

More information

PCP and Hardness of Approximation

PCP and Hardness of Approximation PCP and Hardness of Approximation January 30, 2009 Our goal herein is to define and prove basic concepts regarding hardness of approximation. We will state but obviously not prove a PCP theorem as a starting

More information

Some Applications of Graph Bandwidth to Constraint Satisfaction Problems

Some Applications of Graph Bandwidth to Constraint Satisfaction Problems Some Applications of Graph Bandwidth to Constraint Satisfaction Problems Ramin Zabih Computer Science Department Stanford University Stanford, California 94305 Abstract Bandwidth is a fundamental concept

More information

Faster parameterized algorithms for Minimum Fill-In

Faster parameterized algorithms for Minimum Fill-In Faster parameterized algorithms for Minimum Fill-In Hans L. Bodlaender Pinar Heggernes Yngve Villanger Abstract We present two parameterized algorithms for the Minimum Fill-In problem, also known as Chordal

More information

Spectral Clustering X I AO ZE N G + E L HA M TA BA S SI CS E CL A S S P R ESENTATION MA RCH 1 6,

Spectral Clustering X I AO ZE N G + E L HA M TA BA S SI CS E CL A S S P R ESENTATION MA RCH 1 6, Spectral Clustering XIAO ZENG + ELHAM TABASSI CSE 902 CLASS PRESENTATION MARCH 16, 2017 1 Presentation based on 1. Von Luxburg, Ulrike. "A tutorial on spectral clustering." Statistics and computing 17.4

More information

Planar graphs, negative weight edges, shortest paths, and near linear time

Planar graphs, negative weight edges, shortest paths, and near linear time Planar graphs, negative weight edges, shortest paths, and near linear time Jittat Fakcharoenphol Satish Rao Ý Abstract In this paper, we present an Ç Ò ÐÓ Òµ time algorithm for finding shortest paths in

More information

REGULAR GRAPHS OF GIVEN GIRTH. Contents

REGULAR GRAPHS OF GIVEN GIRTH. Contents REGULAR GRAPHS OF GIVEN GIRTH BROOKE ULLERY Contents 1. Introduction This paper gives an introduction to the area of graph theory dealing with properties of regular graphs of given girth. A large portion

More information

Recitation 4: Elimination algorithm, reconstituted graph, triangulation

Recitation 4: Elimination algorithm, reconstituted graph, triangulation Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.438 Algorithms For Inference Fall 2014 Recitation 4: Elimination algorithm, reconstituted graph, triangulation

More information

Exponentiated Gradient Algorithms for Large-margin Structured Classification

Exponentiated Gradient Algorithms for Large-margin Structured Classification Exponentiated Gradient Algorithms for Large-margin Structured Classification Peter L. Bartlett U.C.Berkeley bartlett@stat.berkeley.edu Ben Taskar Stanford University btaskar@cs.stanford.edu Michael Collins

More information

1. Lecture notes on bipartite matching

1. Lecture notes on bipartite matching Massachusetts Institute of Technology 18.453: Combinatorial Optimization Michel X. Goemans February 5, 2017 1. Lecture notes on bipartite matching Matching problems are among the fundamental problems in

More information

Expected Approximation Guarantees for the Demand Matching Problem

Expected Approximation Guarantees for the Demand Matching Problem Expected Approximation Guarantees for the Demand Matching Problem C. Boucher D. Loker September 2006 Abstract The objective of the demand matching problem is to obtain the subset M of edges which is feasible

More information

Expected Runtimes of Evolutionary Algorithms for the Eulerian Cycle Problem

Expected Runtimes of Evolutionary Algorithms for the Eulerian Cycle Problem Expected Runtimes of Evolutionary Algorithms for the Eulerian Cycle Problem Frank Neumann Institut für Informatik und Praktische Mathematik Christian-Albrechts-Universität zu Kiel 24098 Kiel, Germany fne@informatik.uni-kiel.de

More information

Combinatorial Problems on Strings with Applications to Protein Folding

Combinatorial Problems on Strings with Applications to Protein Folding Combinatorial Problems on Strings with Applications to Protein Folding Alantha Newman 1 and Matthias Ruhl 2 1 MIT Laboratory for Computer Science Cambridge, MA 02139 alantha@theory.lcs.mit.edu 2 IBM Almaden

More information

Information-Theoretic Private Information Retrieval: A Unified Construction (Extended Abstract)

Information-Theoretic Private Information Retrieval: A Unified Construction (Extended Abstract) Information-Theoretic Private Information Retrieval: A Unified Construction (Extended Abstract) Amos Beimel ½ and Yuval Ishai ¾ ¾ ½ Ben-Gurion University, Israel. beimel@cs.bgu.ac.il. DIMACS and AT&T Labs

More information

Lecture 5: Duality Theory

Lecture 5: Duality Theory Lecture 5: Duality Theory Rajat Mittal IIT Kanpur The objective of this lecture note will be to learn duality theory of linear programming. We are planning to answer following questions. What are hyperplane

More information

A Modality for Recursion

A Modality for Recursion A Modality for Recursion (Technical Report) March 31, 2001 Hiroshi Nakano Ryukoku University, Japan nakano@mathryukokuacjp Abstract We propose a modal logic that enables us to handle self-referential formulae,

More information

Approximation Algorithms for Item Pricing

Approximation Algorithms for Item Pricing Approximation Algorithms for Item Pricing Maria-Florina Balcan July 2005 CMU-CS-05-176 Avrim Blum School of Computer Science Carnegie Mellon University Pittsburgh, PA 15213 School of Computer Science,

More information

High Dimensional Indexing by Clustering

High Dimensional Indexing by Clustering Yufei Tao ITEE University of Queensland Recall that, our discussion so far has assumed that the dimensionality d is moderately high, such that it can be regarded as a constant. This means that d should

More information