One-Dimensional Graph Drawing: Part I Drawing Graphs by Axis Separation


Yehuda Koren and David Harel
Dept. of Computer Science and Applied Mathematics
The Weizmann Institute of Science, Rehovot, Israel
{yehuda,dharel}@wisdom.weizmann.ac.il

Abstract. In this paper we discuss a useful family of graph drawing algorithms, characterized by their ability to draw graphs in one dimension. The most important application of this family seems to be achieving graph drawing by axis separation, where each axis of the drawing addresses different aspects of aesthetics. We define the special requirements of such algorithms and show how several graph drawing algorithms can be generalized to handle this task.

1 Introduction

A graph G(V,E) is an abstract structure that is used to model a relation E over a set V of entities. Graph drawing is a standard means for the visualization of relational information, and its ultimate usefulness depends on the readability of the resulting layout; that is, on the drawing algorithm's capability of conveying the meaning of the diagram quickly and clearly. Consequently, many approaches to graph drawing have been developed [6, 15]. We concentrate on the problem of drawing graphs so as to convey pictorially the proximity relations between the nodes. The most popular approaches to this appear to be force-directed algorithms, which define a cost function (or a force model) whose minimization determines the optimal drawing.

Graph drawing research traditionally deals with drawing graphs in two or three dimensions. In this paper, we identify and discuss a new family of graph drawing algorithms whose goal is to draw the graph in one dimension. This family has some interesting applications, to which we now turn.

Axis separation. The most common use of 1-D drawing algorithms is graph drawing by axis separation. Here, we would like to build a multidimensional layout axis-by-axis, so that each axis can be computed using a different algorithm, perhaps accounting for different aesthetic considerations. This facilitates an appealing divide-and-conquer approach to graph drawing. A well-known example is the problem of drawing directed graphs, where the y-coordinates are commonly used to reflect the hierarchy, while the separately computed x-coordinates take care of additional characteristics of the graph, such as preserving proximity.

As another example, we have recently worked on visualizing clustered data using axis separation. There, the x-coordinates guarantee the visual separation between clusters, whereas the y-coordinates address additional aesthetics, ignoring the clustering structure; see [18] for a detailed description. Fig. 1 shows a sample result of this, containing a hierarchically clustered biological dataset (modeled by a weighted graph). The hierarchy structure of the data is represented by the traditional dendrogram, a full binary tree in which each subtree is a cluster and the leaves are individual elements. Consequently, the x-axis was computed so as to adhere to the dendrogram structure,

while maximizing the expression of similarities between the nodes. This was done by reordering the dendrogram and adjusting the gaps between consecutive leaves. The y-axis, which should not consider the hierarchy structure at all, was computed by a 1-D graph drawing algorithm (using the classical-MDS method, as described in Subsection 3.3).

Fig. 1. (taken from [18]) Using axis separation to draw hierarchically clustered fibroblast gene expression data. We convey both the similarities between the nodes and their clustering decomposition, using an ordered dendrogram coupled with a 2-D layout that adheres to its structure. Six salient clusters are colored and clearly visible.

Sometimes a single dataset can be modeled by different graphs. Consequently, it might be instructive to draw the data by assigning each of the axes to a different graph, and then simultaneously examine and compare the characteristics of the two models. For example, proximity relationships between web pages can be modeled either by connecting pages that have similar content, or by relying on their link structure. Using axis separation, we can draw the web pages as points in the plane according to these two models, thus making it possible to see at a glance which elements are related by each of them.

Another tightly related case is when we already have one coordinate for each node. Such a coordinate might be a numeric attribute of the nodes that we want to convey spatially. In order to reflect proximity relationships, we would like to add another coordinate computed by a 1-D graph drawing algorithm. A nice example of this appears in [2]. There, a link structure (like the WWW) is visualized by associating one axis with a ranking of the nodes (some measure of node prominence), while the other axis is computed by 1-D graph drawing (using the Eigen-projection method described in Subsection 3.1); see Fig. 2.

Fig. 2. Authority and PageRank visualization of the results of a "java" query, taken (with permission) from [2]. Each web page is given two numerical values that measure its importance (Authority and PageRank). These values determine the x-coordinates of the drawing. The y-coordinates, which reflect the similarity between the web pages, are computed by a graph drawing algorithm.

Linear arrangement. So far, we have described situations where the 1-D graph drawing is used to construct a multidimensional drawing. However, in some cases additional axes are not necessary, and we simply need an algorithm for drawing the graph on a line; see, e.g., Fig. 3. In this case, the problem is called linear arrangement [10, 17],

and we want to order the nodes such that similar nodes are placed close to each other. In the graph drawing context, such a problem arises in code and data layout applications [1], and in laying out software diagrams [22].

Fig. 3. A linear arrangement.

Fig. 4 shows how such a linear arrangement can be used to visualize a (weighted) adjacency matrix. The figure shows the relations between odor patterns measured by an electronic nose, modeled as a complete weighted graph; see [5]. As seen in part (a) of the figure, the raw adjacency matrix does not show any structure. However, the same matrix, shown in part (b) after permuting its rows and columns according to a linear arrangement of the graph, reveals much of the structure of the data.

Ordering problems are naturally formulated as discrete optimization problems, where the coordinates are a permutation of {1,...,n}. However, such formulations lead to NP-hard problems that are difficult to solve. One way to eliminate part of this difficulty is to allow the nodes to take on noninteger coordinates. The resulting continuous problems can be solved efficiently, and their solution is used as an approximation of the optimal discrete ordering, by taking a sorted ordering of the nodes' coordinates; see [13, 17]. In this way, the continuous formulations given in this paper can be used for discrete linear arrangement problems too.
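To make the relaxation concrete, here is a minimal Python/NumPy sketch (ours, not from the paper; the function name is illustrative) that turns continuous 1-D coordinates into a discrete arrangement by sorting, and uses the resulting permutation to reorder a similarity matrix in the spirit of Fig. 4:

```python
import numpy as np

def linear_arrangement(y):
    """Turn continuous 1-D coordinates into a discrete ordering.

    y[i] is the continuous coordinate of node i. Returns the node order
    (nodes sorted by coordinate) and the rank of each node, a permutation
    of 0..n-1 that approximates the optimal discrete arrangement.
    """
    order = np.argsort(y)
    ranks = np.empty_like(order)
    ranks[order] = np.arange(len(y))
    return order, ranks

# Reordering a similarity matrix by the arrangement, as in Fig. 4(b):
rng = np.random.default_rng(0)
y = rng.normal(size=6)                     # stand-in for computed 1-D coordinates
S = rng.random((6, 6)); S = (S + S.T) / 2  # stand-in symmetric similarity matrix
order, _ = linear_arrangement(y)
S_reordered = S[np.ix_(order, order)]      # permute rows and columns together
```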

Fig. 4. Using linear arrangement for matrix visualization. (a) A similarity matrix of odor patterns as measured by an electronic nose; more similar patterns get higher (= brighter) similarity values. (b) The structure of the data is better visualized after reordering the rows and columns of the matrix.

2 Basic Notions

Throughout the paper, we assume that we are given an n-node connected graph G(V,E), with V = {1,...,n}. A key object that describes the relations between nodes is the Laplacian, an n x n symmetric positive-semidefinite matrix denoted by L, where

L_{ij} = \begin{cases} -1 & \{i,j\} \in E \\ 0 & \{i,j\} \notin E,\ i \neq j \\ \deg(i) & i = j \end{cases} \qquad i,j = 1,\dots,n.

Here, \deg(i) \stackrel{\mathrm{def}}{=} |\{j : \{i,j\} \in E\}|. It is easy to check that 1_n = (1,\dots,1) \in R^n is an eigenvector of L with associated eigenvalue 0. When the graph is connected, all other eigenvalues are strictly positive. The usefulness of the Laplacian stems from the fact that its associated quadratic form is just the sum of squared edge lengths. We formulate this for a 1-D layout:

Lemma 1. Let L be an n x n Laplacian, and let x \in R^n. Then x^T L x = \sum_{\{i,j\} \in E} (x_i - x_j)^2.

The proof of the lemma is straightforward, and it can be extended to multidimensional layouts too.

We now recall some basic statistical notions. The mean of a vector x \in R^n, denoted by \bar{x}, is defined as \frac{1}{n}\sum_{i=1}^n x_i. The variance of x, denoted by Var(x), is defined as \frac{1}{n}\sum_{i=1}^n (x_i - \bar{x})^2. The covariance between two vectors x, y \in R^n is defined as Cov(x,y) = \frac{1}{n}\sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y}). The correlation coefficient between x and y is defined as Cov(x,y) / \sqrt{Var(x)\,Var(y)}; it measures the collinearity between the two vectors. If the correlation coefficient is 0, x and y are uncorrelated. If x and y are independent, then they are uncorrelated. We denote the normalized value of x by \hat{x} = x / \|x\|.

As explained earlier, 1-D drawing algorithms are often used in the context of multidimensional drawings. Henceforth, for simplicity, we assume that we have to compute the y-coordinates, while (possibly) being given precomputed x-coordinates.
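As a quick illustration of Lemma 1, the following NumPy sketch (our own; the helper name is illustrative) builds the Laplacian of a small graph and checks that the quadratic form equals the sum of squared edge lengths:

```python
import numpy as np

def laplacian(n, edges):
    """Build the (unweighted) graph Laplacian: L[i,i] = deg(i), L[i,j] = -1 for edges."""
    L = np.zeros((n, n))
    for i, j in edges:
        L[i, j] -= 1
        L[j, i] -= 1
        L[i, i] += 1
        L[j, j] += 1
    return L

# Sanity check of Lemma 1: x^T L x equals the sum of squared edge lengths.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]  # a small example graph
L = laplacian(4, edges)
x = np.array([0.3, -1.2, 0.7, 0.2])
lhs = x @ L @ x
rhs = sum((x[i] - x[j]) ** 2 for i, j in edges)
assert np.isclose(lhs, rhs)
```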

Thus, the layout is characterized by two vectors x, y \in R^n, with the x-coordinates being x_1,\dots,x_n and the y-coordinates y_1,\dots,y_n. Other cases, where we have more than one precomputed axis or where we want to produce several dimensions, can be addressed by small changes to our techniques. Moreover, for convenience, we assume, without loss of generality, that the x- and y-coordinates are centered, so their means are 0; in symbols, \sum_{i=1}^n x_i = \sum_{i=1}^n y_i = 0. This can be achieved by a simple translation.

3 Algorithms for One-Dimensional Graph Drawing

In principle, we could have used a classical force-directed algorithm for computing the 1-D layout. However, when we tried to adapt the customary two-dimensional optimization algorithms to our one-dimensional case, convergence was rarely achieved. Traditionally, node-by-node optimization is performed, moving each node to a point that decreases the cost function; common methods for this are gradient descent and Newton-Raphson. However, these methods tend to get stuck in bad local minima when used for 1-D drawing [4, 23]. Interestingly, 2-D drawing is much easier for such methods. The reason is probably that there is less room for maneuvering in one dimension when seeking a nice layout, which prevents convergence to a good optimum. Indeed, several works use even a 3-D layout to avoid local minima; see, e.g., [3, 8, 23].

Another possible approach could be to use algorithms for computing (approximate) minimum linear arrangements (MinLA). These set the coordinates to be a permutation of {1,...,n} in a way that minimizes the sum of edge lengths. Although the limitation that the coordinates be distinct integers may seem unnatural in the graph drawing context, we have found that MinLA has some merits when drawing digraphs by axis separation; see [4]. However, a major disadvantage of MinLA is that it cannot take precomputed coordinates into account. Note that a careless computation that ignores such precomputed coordinates can be very problematic: it might yield y-coordinates that are very similar to the x-coordinates, resulting in a drawing whose intrinsic dimensionality is really 1, meaning that one axis is wasted.

In the rest of this section, we describe four different methods that appear to be well suited to our task. A common characteristic of these methods, which makes them suitable for 1-D optimization, is that they compute the layout axis-by-axis, instead of using the node-by-node optimization mechanism of force-directed methods. Furthermore, when these methods are used to produce a multidimensional layout, the different axes are uncorrelated. This suggests a very effective way to generalize the methods so that they can deal with precomputed coordinates: we simply require no correlation between the x-coordinates and the y-coordinates, so that the latter provide as much new information as possible. Technically, since we have assumed x and y to be centered, the no-correlation requirement can be formulated simply as y^T x = 0, which states that x and y are orthogonal. We now survey the methods and explain how they can be extended to handle the case of predefined x-coordinates.
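The centering and no-correlation requirements are easy to state in code. The following small sketch (ours, with illustrative names) centers two coordinate vectors and removes from y its component along x, after which y^T x = 0:

```python
import numpy as np

def center(v):
    """Translate a coordinate vector so that its mean is 0."""
    return v - v.mean()

def decorrelate(y, x):
    """Remove from y its component along x (both assumed centered),
    i.e., apply the orthogonalization operator (I - x_hat x_hat^T)."""
    x_hat = x / np.linalg.norm(x)
    return y - (y @ x_hat) * x_hat

rng = np.random.default_rng(1)
x = center(rng.normal(size=8))      # stand-in for precomputed x-coordinates
y = center(rng.normal(size=8))      # stand-in for candidate y-coordinates
y_perp = decorrelate(y, x)
print(np.isclose(y_perp @ x, 0.0))  # uncorrelated: y^T x = 0
```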
3.1 Eigen-projection

The Eigen-projection [11, 19] computes the layout of a graph using low eigenvectors of the related Laplacian. Some important advantages of this approach are its ability to compute optimal layouts (according to specific requirements) and its very short computation time [16]. As we will see, this method is a natural choice for 1-D layouts, and it has already been used for such tasks in [1, 2, 4, 13]. In [19] we give several explanations for the ability of the Eigen-projection to draw graphs nicely.

Here, we provide a new derivation, which shows the tight relationship of this method to force-directed graph drawing. We define the Eigen-projection 1-D layout y \in R^n as the solution of:

\min_y \frac{\sum_{\{i,j\} \in E} (y_i - y_j)^2}{\sum_{\{i,j\} \notin E} (y_i - y_j)^2} \qquad (1)

In (1), the numerator calls for shortening the edge lengths (the "attractive forces"), while the denominator calls for placing all nonadjacent pairs further apart (the "repulsive forces"). This is a reasonable energy minimization approach that resembles force-directed algorithms. Since \sum_{\{i,j\} \in E} (y_i - y_j)^2 + \sum_{\{i,j\} \notin E} (y_i - y_j)^2 = \sum_{i<j} (y_i - y_j)^2, an equivalent problem is:

\min_y \frac{\sum_{\{i,j\} \in E} (y_i - y_j)^2}{\sum_{i<j} (y_i - y_j)^2} \qquad (2)

It is easy to see that the energy to be minimized is invariant under translation of the data. Thus, for convenience, we eliminate this degree of freedom by requiring that y be centered; that is, y^T 1_n = 0. We can now simplify (2) by using the following lemma:

Lemma 2. Let y \in R^n be such that y^T 1_n = 0. Then
\sum_{i<j} (y_i - y_j)^2 = n \sum_{i=1}^n y_i^2 \ (= n\, y^T y).

Proof.
\sum_{i<j} (y_i - y_j)^2 = \frac{1}{2} \sum_{i,j=1}^n (y_i - y_j)^2 = n \sum_{i=1}^n y_i^2 - \sum_{i,j=1}^n y_i y_j = n \sum_{i=1}^n y_i^2 - \sum_{i=1}^n y_i \sum_{j=1}^n y_j = n \sum_{i=1}^n y_i^2.
The last step stems from the fact that y is centered, so that \sum_{j=1}^n y_j = 0.

Therefore, we can replace \sum_{i<j} (y_i - y_j)^2 with n\, y^T y (the constant factor n does not change the minimizer). Moreover, using Lemma 1, we can write \sum_{\{i,j\} \in E} (y_i - y_j)^2 as the quadratic form y^T L y. Consequently, we once again reformulate our minimization problem in the equivalent form:

\min_y \frac{y^T L y}{y^T y} \quad \text{in the subspace } y^T 1_n = 0 \qquad (3)

By substituting \hat{x} = 0 in the Proposition below, we obtain that the optimal 1-D layout is the eigenvector of L with the smallest positive eigenvalue. In this way, the Eigen-projection method provides us with an efficient way to calculate optimal 1-D layouts. We still have to show how the Eigen-projection can be extended to deal with the uncorrelation requirement, that is, the case where we already have a

coordinate vector x, and we require that y be orthogonal to x. Now, the optimal layout is the solution of:

\min_y \frac{y^T L y}{y^T y} \quad \text{in the subspace } y^T 1_n = 0,\ y^T x = 0 \qquad (4)

Fortunately, the optimal layout is still the solution of a related eigen-equation:

Proposition 1. The solution of (4) is the eigenvector of (I - \hat{x}\hat{x}^T) L (I - \hat{x}\hat{x}^T) with the smallest positive eigenvalue.

(Note that I - \hat{x}\hat{x}^T is a symmetric n x n matrix. Henceforth, we will use it extensively thanks to its property of being an orthogonalization operator: for any vector y \in R^n, the result of orthogonalizing y against x is (I - \hat{x}\hat{x}^T) y.)

Proof. Observe that we can assume, without loss of generality, that y^T y = 1. This is because rescaling preserves optimality: if y_0 satisfying y_0^T y_0 = 1 attains y_0^T L y_0 / y_0^T y_0 = \lambda, then y^T L y / y^T y = \lambda for every y = c\, y_0 (c \neq 0). Thus, the new form of the optimization problem is:

\min_y y^T L y \quad \text{given: } y^T y = 1, \quad \text{in the subspace: } y^T 1_n = 0,\ y^T x = 0 \qquad (5)

The matrix (I - \hat{x}\hat{x}^T) L (I - \hat{x}\hat{x}^T) is symmetric, so it has n orthogonal eigenvectors spanning R^n. We use the convention \lambda_1 \le \lambda_2 \le \dots \le \lambda_n for the eigenvalues of (I - \hat{x}\hat{x}^T) L (I - \hat{x}\hat{x}^T), and denote the corresponding real orthonormal eigenvectors by u_1, u_2, \dots, u_n. Clearly, (I - \hat{x}\hat{x}^T) L (I - \hat{x}\hat{x}^T)\, \hat{x} = 0. Utilizing the fact that x^T 1_n = 0 and that 1_n is the only zero eigenvector of L, we obtain \lambda_1 = \lambda_2 = 0, u_1 = (1/\|1_n\|)\, 1_n, u_2 = \hat{x}, and \lambda_3 > 0.

We can now decompose every y \in R^n as a linear combination y = \sum_{i=1}^n \alpha_i u_i. Moreover, since the solution is constrained to be orthogonal to u_1 and u_2, we can restrict ourselves to linear combinations of the form y = \sum_{i=3}^n \alpha_i u_i. Using the constraint y^T y = 1, we obtain \sum_{i=3}^n \alpha_i^2 = 1 (a generalization of the Pythagorean law). Similarly, y^T (I - \hat{x}\hat{x}^T) L (I - \hat{x}\hat{x}^T) y = \sum_{i=3}^n \alpha_i^2 \lambda_i. Note that since y^T \hat{x} = \hat{x}^T y = 0, we get

y^T (I - \hat{x}\hat{x}^T) L (I - \hat{x}\hat{x}^T) y = y^T L y - y^T \hat{x}\hat{x}^T L (I - \hat{x}\hat{x}^T) y - y^T L \hat{x}\hat{x}^T y = y^T L y.

So the target value is

y^T L y = y^T (I - \hat{x}\hat{x}^T) L (I - \hat{x}\hat{x}^T) y = \sum_{i=3}^n \alpha_i^2 \lambda_i \ge \sum_{i=3}^n \alpha_i^2 \lambda_3 = \lambda_3.

Thus, for any y that satisfies the constraints, we get y^T L y \ge \lambda_3. Since u_3^T L u_3 = u_3^T (I - \hat{x}\hat{x}^T) L (I - \hat{x}\hat{x}^T) u_3 = \lambda_3, we deduce that the minimizer is u_3, the lowest positive eigenvector.
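For small graphs, Proposition 1 can be applied directly with a dense eigendecomposition. The sketch below is our own illustration, not the paper's implementation (for large sparse graphs one would instead use the operator trick discussed below); it returns the eigenvector of (I - \hat{x}\hat{x}^T) L (I - \hat{x}\hat{x}^T) with the smallest positive eigenvalue:

```python
import numpy as np

def constrained_eigen_projection(L, x):
    """1-D Eigen-projection layout uncorrelated with x (Proposition 1):
    the eigenvector of (I - x_hat x_hat^T) L (I - x_hat x_hat^T) with the
    smallest positive eigenvalue. Dense version, suitable for small graphs."""
    n = L.shape[0]
    x = x - x.mean()                        # center, so x^T 1_n = 0
    x_hat = x / np.linalg.norm(x)
    P = np.eye(n) - np.outer(x_hat, x_hat)  # orthogonalization operator
    M = P @ L @ P
    vals, vecs = np.linalg.eigh(M)          # eigenvalues in ascending order
    # skip the (numerically) zero eigenvalues belonging to 1_n and x_hat
    k = next(i for i, v in enumerate(vals) if v > 1e-9)
    return vecs[:, k]

# Usage with the hypothetical helper sketched earlier:
# y = constrained_eigen_projection(laplacian(n, edges), x_coords)
```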

Interestingly, posing the problem as in (4) and solving it as in Proposition 1 constitutes a smooth generalization of the Eigen-projection method: when x is the lowest positive eigenvector of L, the solution y is the second lowest positive eigenvector of L. This coincides with the way the Eigen-projection computes 2-D layouts; see [19]. However, we allow the more general case of arbitrary x-coordinates.

As to computational complexity, the space requirement of the algorithm is O(|E|) when using a sparse representation of the Laplacian. The computation can be done using iterative algorithms, such as power iteration or Lanczos; see [9]. The time complexity of a single iteration is O(|E|). When working with a sparse L, we can use a much faster multi-scale algorithm that can deal with millions of elements in reasonable time; see [16]. However, caution is needed, since an explicit calculation of (I - \hat{x}\hat{x}^T) L (I - \hat{x}\hat{x}^T) would destroy the sparsity of L. To get around this, we utilize the fact that the iterative algorithms for computing eigenvectors use the matrix as an operator, i.e., they access it only via multiplication with a vector. This settles the issue, since carrying out the product (I - \hat{x}\hat{x}^T) L (I - \hat{x}\hat{x}^T)\, v is equivalent to orthogonalizing v against x, multiplying the result by the sparse matrix L, and then orthogonalizing the result against x once more.

3.2 Principal component analysis and high-dimensional embedding

Principal component analysis (PCA) computes a projection of multidimensional data that optimally preserves its variance; see [7]. The fact that PCA uses the data coordinates apparently renders it useless for graph drawing. However, in [12] we show that it is possible to generate artificial k-dimensional coordinates of the nodes that preserve some of the graph structure, thus making it possible to use PCA. We call these coordinates the high-dimensional embedding in [12], and denote them by an n x k coordinate matrix X, so that the k coordinates of node i constitute the i-th row of X. We assume each of the columns of X is centered, something that can be achieved by translating the data.

In order to compute a 1-D projection, PCA computes a unit vector d \in R^k, which is the direction of the projection. The vector d is the top eigenvector of the covariance matrix \frac{1}{n} X^T X. The projection itself is Xd, and, as mentioned, it is the best 1-D projection in terms of variance preservation. When given x-coordinates, we are interested only in the component of the projection that is orthogonal to them. This component is exactly (I - \hat{x}\hat{x}^T)(Xd), and we want to maximize its variance. However, (I - \hat{x}\hat{x}^T)(Xd) = ((I - \hat{x}\hat{x}^T) X) d, so our problem reduces to finding the most variance-preserving projection of the coordinates (I - \hat{x}\hat{x}^T) X. The optimal solution is obtained by performing PCA on (I - \hat{x}\hat{x}^T) X, which is equivalent to orthogonalizing each of X's columns against x and then performing PCA on the resulting matrix. Again, this is a smooth generalization of PCA that enables it to deal with predefined x-coordinates: if x was also computed by PCA, one would obtain the regular 2-D PCA projection. One of the advantages of the PCA approach is its excellent time and space complexity; see [12].
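A possible NumPy rendering of this constrained PCA step (our own sketch; it assumes a high-dimensional embedding matrix X has already been computed, e.g., as in [12], and the function name is illustrative):

```python
import numpy as np

def pca_projection_uncorrelated(X, x):
    """1-D PCA projection of the high-dimensional embedding X (n x k),
    restricted to the subspace uncorrelated with the given x-coordinates."""
    X = X - X.mean(axis=0)                 # center each column
    x = x - x.mean()
    x_hat = x / np.linalg.norm(x)
    Xp = X - np.outer(x_hat, x_hat @ X)    # orthogonalize every column against x
    C = (Xp.T @ Xp) / Xp.shape[0]          # k x k covariance matrix
    vals, vecs = np.linalg.eigh(C)
    d = vecs[:, -1]                        # top principal direction
    return Xp @ d                          # the resulting y-coordinates
```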
3.3 Classical multidimensional scaling

Multidimensional scaling (MDS) is a general term for techniques that generate coordinates of points from information about pairwise distances. Therefore, arguably, force-directed graph drawing can be considered a form of MDS. Here we are interested in a technique called classical MDS (CMDS) [7], which produces (multidimensional) coordinates that preserve the given pairwise distances perfectly; i.e., the pairwise Euclidean

distances in the generated space are equal to the given distances. The graph drawing application of CMDS was suggested long ago, in [21]. The distance between nodes i and j is defined as d_{ij}, the graph-theoretic distance between the nodes. Therefore, CMDS can be used to find a Euclidean embedding of the graph that preserves the graph-theoretic distances.

We now provide a short technical description of the method. Given points in Euclidean space, it is possible to construct a matrix X of centered coordinates if we know the pairwise distances among the points. The way to do this is to construct the n x n inner-product matrix B = XX^T, which can be computed using the cosine law, as follows:

B_{ij} = -\frac{1}{2}\left( d_{ij}^2 - \frac{1}{n}\sum_{k=1}^n d_{ik}^2 - \frac{1}{n}\sum_{k=1}^n d_{kj}^2 + \frac{1}{n^2}\sum_{k,l=1}^n d_{lk}^2 \right) \qquad (6)

Note that B is invariant under orthogonal transformations of X. That is, given some orthogonal matrix Q (i.e., QQ^T = I), we can replace X with XQ without changing the inner-product matrix: XQ(XQ)^T = XQQ^TX^T = XX^T = B. Therefore, B determines the coordinates up to an orthogonal transformation. This is reasonable, since such a transformation does not alter pairwise distances. There is always an orthogonal transformation that makes the axes orthogonal (given by the singular value decomposition), which allows us to restrict ourselves to a coordinate matrix with orthogonal columns. Such a matrix can be obtained by factoring B using the eigenvalue decomposition B = U \Lambda U^T (U is orthogonal and \Lambda is diagonal), which enables defining the coordinates of the points as X = U \Lambda^{1/2}. This way, the columns of X are centered and mutually orthogonal.

In practice, we do not want all the coordinates, but only a low-dimensional projection of the points, and here only a 1-D embedding is needed. Thus, as in PCA, we seek the 1-D projection of X having the maximal variance. Since the columns of X are uncorrelated, we simply have to take the column with the maximal variance, which corresponds to the column of U with the highest associated eigenvalue. Technically, we are interested in the top eigenvector u_1 of B and the corresponding eigenvalue \lambda_1. After computing this eigenpair, we can define the embedding of the data as \sqrt{\lambda_1}\, u_1. Additional coordinates can be obtained using the subsequent eigenpairs.

It appears that CMDS is closely related to PCA. In fact, CMDS is a way of performing PCA without explicitly defining the coordinate matrix. Thus, if the pairwise distances are Euclidean distances based on a coordinate matrix, the results of CMDS are identical to those of PCA. Consequently, in our case, when we want the embedding to be orthogonal to x, we can use the same technique we used for PCA. Once again, we would like to perform PCA on (I - \hat{x}\hat{x}^T) X, and of course we do not have this matrix explicitly. However, it is possible to compute the inner-product matrix (I - \hat{x}\hat{x}^T) X X^T (I - \hat{x}\hat{x}^T), since this matrix is simply (I - \hat{x}\hat{x}^T) B (I - \hat{x}\hat{x}^T). Using the same reasoning as above, the first principal component of (I - \hat{x}\hat{x}^T) X can be found by computing the top eigenpair of (I - \hat{x}\hat{x}^T) B (I - \hat{x}\hat{x}^T).

However, there is one theoretical flaw in applying CMDS to graph drawing. Computing a coordinate matrix X that preserves pairwise distances is not always possible, and will fail when the graph-theoretic metric is not Euclidean. Technically, the matrix B might have some negative eigenvalues, preventing the square-root operation from being carried out.
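The constrained CMDS computation can be sketched as follows (our own illustration; it assumes the matrix D of pairwise graph-theoretic distances has already been computed, e.g., by a BFS from every node, and the function name is ours):

```python
import numpy as np

def cmds_axis(D, x):
    """1-D classical-MDS coordinate uncorrelated with x.
    D is the n x n matrix of pairwise graph-theoretic distances."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # double-centering operator
    B = -0.5 * J @ (D ** 2) @ J                # inner-product matrix, as in Eq. (6)
    x = x - x.mean()
    x_hat = x / np.linalg.norm(x)
    P = np.eye(n) - np.outer(x_hat, x_hat)
    Bp = P @ B @ P                             # (I - x_hat x_hat^T) B (I - x_hat x_hat^T)
    vals, vecs = np.linalg.eigh(Bp)
    lam, u = vals[-1], vecs[:, -1]             # top eigenpair
    return np.sqrt(max(lam, 0.0)) * u          # the resulting y-coordinates
```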

However, in practice this is not a serious problem, since we are not interested in recovering the full multidimensional coordinates, but only the few leading ones. When the given x-coordinates are also the result of CMDS, our method produces the same y-coordinates as CMDS. Therefore, we have a smooth generalization of CMDS that allows it to deal with predefined coordinates. One note on complexity: when performing CMDS, we have to store the matrix B, which requires O(n^2) space, much worse than in the Eigen-projection or PCA cases.

3.4 One-dimensional drawing of digraphs

When edges are directed, we may want the layout to show the overall directionality of the graph and its hierarchical structure. The previously described techniques, which ignore the direction of the edges, might thus not be suitable. An adequate method for dealing with the 1-D layout of digraphs was described in [4]. There, we looked for a layout that minimizes the hierarchy energy:

\sum_{(i \to j) \in E} (y_i - y_j - 1)^2 \qquad (7)

Define the balance vector b \in R^n by b_i = outdeg(i) - indeg(i), where outdeg(i) and indeg(i) denote the number of outgoing and incoming edges adjacent to i, respectively. We showed in [4] that, up to a constant additive term, the hierarchy energy (7) can be written in the compact form:

y^T L y - 2 y^T b \qquad (8)

Consequently, the optimal 1-D layout is the solution of Ly = b. This formulation is flexible enough to allow y to be uncorrelated with a given coordinate vector x. In this case, we want to minimize (8) in the subspace orthogonal to x. Equivalently, we can take only the component of y that is orthogonal to x, which is (I - \hat{x}\hat{x}^T) y. This way, we seek the minimizer of:

y^T (I - \hat{x}\hat{x}^T) L (I - \hat{x}\hat{x}^T) y - 2 y^T (I - \hat{x}\hat{x}^T) b \qquad (9)

Hence, the optimal 1-D layout is the solution of:

(I - \hat{x}\hat{x}^T) L (I - \hat{x}\hat{x}^T)\, y = (I - \hat{x}\hat{x}^T)\, b

After finding the minimizer, we orthogonalize it against x. It is easy to see that this does not affect the value of (9), so we remain with an optimal solution that is uncorrelated with x. As a consequence, when we have a 1-D layout of a digraph, we can add an additional uncorrelated dimension that shows the hierarchical structure of the graph. Moreover, we can compute two coordinate vectors that provide two uncorrelated descriptions of the graph's directionality. This might be useful for digraphs whose hierarchical structure is explained by several independent factors.

When L is sparse, we recommended in [4] that the equation be solved using the Conjugate-Gradient method, which accesses L only via matrix-vector multiplication. In this case, as in the Eigen-projection case, the product (I - \hat{x}\hat{x}^T) L (I - \hat{x}\hat{x}^T)\, v is carried out by orthogonalizing v against x, multiplying the result by the sparse matrix L, and then orthogonalizing the result against x once more.
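A matrix-free rendering of this with SciPy's conjugate-gradient solver might look as follows (a sketch under our own naming, not the implementation of [4]; the system is singular, but the right-hand side lies in the operator's range, so CG typically behaves well in practice):

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

def digraph_axis(L, b, x):
    """Solve (I - x_hat x_hat^T) L (I - x_hat x_hat^T) y = (I - x_hat x_hat^T) b
    without ever forming the dense product, so the sparsity of L is preserved.
    L may be any (sparse) matrix supporting the @ operator; b is the balance vector."""
    n = L.shape[0]
    x = x - x.mean()
    x_hat = x / np.linalg.norm(x)

    def orth(v):
        # orthogonalize v against x
        return v - (v @ x_hat) * x_hat

    # orthogonalize, multiply by the sparse L, orthogonalize again
    A = LinearOperator((n, n), matvec=lambda v: orth(L @ orth(v)))
    y, info = cg(A, orth(b))
    return orth(y)                 # final orthogonalization against x
```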

4 Discussion

We have explored one-dimensional graph drawing algorithms and have studied their special features and applications. One important application of this family is graph drawing by axis separation, where each axis is computed separately, so as to address specific aesthetic considerations. Since point-by-point local optimization in one dimension is a poor strategy, traditional force-directed algorithms are not suitable for the 1-D drawing task, while less traditional algorithms are. We generalized four such algorithms using the unified paradigm of computing the layout axis-by-axis, while maintaining noncorrelation with the precomputed coordinates.

This unified framework allows for an interesting integration of the algorithms: we can use one of the algorithms for laying out the x-coordinates and another for computing uncorrelated y-coordinates. For example, consider the 4970 graph that was previously drawn by the Eigen-projection in [16], as shown in Fig. 5(a), and by PCA projection in [12], as shown in Fig. 5(b). In Fig. 5(c) we show a layout where the x-coordinates were computed by Eigen-projection and the y-coordinates by CMDS. Another combined layout is given in Fig. 5(d), where the x-coordinates were computed by PCA projection and the y-coordinates by CMDS.

Fig. 5. Layouts of the 4970 graph: (a) by Eigen-projection (taken from [16]); (b) by PCA (taken from [12]); (c,d) two combined layouts: (c) Eigen-projection + CMDS; (d) PCA + CMDS.

In a subsequent paper [20] we describe a rather sophisticated optimization process that allows incorporating the popular model of Kamada and Kawai [14] for one-dimensional graph layout.

References

1. B. Beckman, Theory of Spectral Graph Layout, Technical Report MSR-TR-94-04, Microsoft Research, 1994.

2. U. Brandes and S. Cornelsen, Visual Ranking of Link Structures, Proc. 7th Workshop on Algorithms and Data Structures (WADS '01), LNCS 2125, Springer-Verlag. To appear in Journal of Graph Algorithms and Applications.
3. I. Bruss and A. Frick, Fast Interactive 3-D Graph Visualization, Proc. 3rd Inter. Symposium on Graph Drawing (GD '95), LNCS 1027, Springer-Verlag.
4. L. Carmel, D. Harel and Y. Koren, Drawing Directed Graphs Using One-Dimensional Optimization, Proc. 10th Inter. Symposium on Graph Drawing (GD '02), LNCS 2528, Springer-Verlag.
5. L. Carmel, Y. Koren and D. Harel, Visualizing and Classifying Odors Using a Similarity Matrix, Proc. 9th International Symposium on Olfaction and Electronic Nose (ISOEN '02), Aracne.
6. G. Di Battista, P. Eades, R. Tamassia and I. G. Tollis, Graph Drawing: Algorithms for the Visualization of Graphs, Prentice-Hall.
7. B. S. Everitt and G. Dunn, Applied Multivariate Data Analysis, Arnold.
8. P. Gajer, M. T. Goodrich and S. G. Kobourov, A Multi-dimensional Approach to Force-Directed Layouts of Large Graphs, Proc. 8th Inter. Symposium on Graph Drawing (GD '00), LNCS 1984, Springer-Verlag.
9. G. H. Golub and C. F. Van Loan, Matrix Computations, Johns Hopkins University Press.
10. J. Diaz, J. Petit and M. Serna, A Survey on Graph Layout Problems, ACM Computing Surveys 34 (2002).
11. K. M. Hall, An r-dimensional Quadratic Placement Algorithm, Management Science 17 (1970).
12. D. Harel and Y. Koren, Graph Drawing by High-Dimensional Embedding, Proc. 10th Inter. Symposium on Graph Drawing (GD '02), LNCS 2528, Springer-Verlag.
13. M. Juvan and B. Mohar, Optimal Linear Labelings and Eigenvalues of Graphs, Discrete Applied Math. 36 (1992).
14. T. Kamada and S. Kawai, An Algorithm for Drawing General Undirected Graphs, Information Processing Letters 31 (1989).
15. M. Kaufmann and D. Wagner (Eds.), Drawing Graphs: Methods and Models, LNCS 2025, Springer-Verlag.
16. Y. Koren, L. Carmel and D. Harel, ACE: A Fast Multiscale Eigenvectors Computation for Drawing Huge Graphs, Proc. IEEE Information Visualization (InfoVis '02), IEEE.
17. Y. Koren and D. Harel, A Multi-Scale Algorithm for the Linear Arrangement Problem, Proc. 28th Inter. Workshop on Graph-Theoretic Concepts in Computer Science (WG '02), LNCS 2573, Springer-Verlag.
18. Y. Koren and D. Harel, A Two-Way Visualization Method for Clustered Data, Proc. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '03), ACM Press, 2003, to appear.
19. Y. Koren, On Spectral Graph Drawing, Proc. 9th Inter. Computing and Combinatorics Conference (COCOON '03), Springer-Verlag, 2003, to appear.
20. Y. Koren, One-Dimensional Graph Drawing: Part II Axis-by-Axis Stress Minimization, submitted. Available at: yehuda/pubs/1d stress.pdf
21. J. Kruskal and J. Seery, Designing Network Diagrams, Proc. First General Conference on Social Graphics, pp. 22-50, U.S. Department of the Census.
22. A. J. McAllister, A New Heuristic Algorithm for the Linear Arrangement Problem, Technical Report, Faculty of Computer Science, University of New Brunswick.
23. D. Tunkelang, A Numerical Optimization Approach to General Graph Drawing, Ph.D. Thesis, Carnegie Mellon University, 1999.


Data Mining Chapter 3: Visualizing and Exploring Data Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University Data Mining Chapter 3: Visualizing and Exploring Data Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University Exploratory data analysis tasks Examine the data, in search of structures

More information

A Fully Animated Interactive System for Clustering and Navigating Huge Graphs

A Fully Animated Interactive System for Clustering and Navigating Huge Graphs A Fully Animated Interactive System for Clustering and Navigating Huge Graphs Mao Lin Huang and Peter Eades Department of Computer Science and Software Engineering The University of Newcastle, NSW 2308,

More information

Treewidth and graph minors

Treewidth and graph minors Treewidth and graph minors Lectures 9 and 10, December 29, 2011, January 5, 2012 We shall touch upon the theory of Graph Minors by Robertson and Seymour. This theory gives a very general condition under

More information

Advanced Topics In Machine Learning Project Report : Low Dimensional Embedding of a Pose Collection Fabian Prada

Advanced Topics In Machine Learning Project Report : Low Dimensional Embedding of a Pose Collection Fabian Prada Advanced Topics In Machine Learning Project Report : Low Dimensional Embedding of a Pose Collection Fabian Prada 1 Introduction In this project we present an overview of (1) low dimensional embedding,

More information

Discrete Optimization. Lecture Notes 2

Discrete Optimization. Lecture Notes 2 Discrete Optimization. Lecture Notes 2 Disjunctive Constraints Defining variables and formulating linear constraints can be straightforward or more sophisticated, depending on the problem structure. The

More information

A Fast and Simple Heuristic for Constrained Two-Level Crossing Reduction

A Fast and Simple Heuristic for Constrained Two-Level Crossing Reduction A Fast and Simple Heuristic for Constrained Two-Level Crossing Reduction Michael Forster University of Passau, 94030 Passau, Germany forster@fmi.uni-passau.de Abstract. The one-sided two-level crossing

More information

Community Detection. Community

Community Detection. Community Community Detection Community In social sciences: Community is formed by individuals such that those within a group interact with each other more frequently than with those outside the group a.k.a. group,

More information

Wireless Sensor Networks Localization Methods: Multidimensional Scaling vs. Semidefinite Programming Approach

Wireless Sensor Networks Localization Methods: Multidimensional Scaling vs. Semidefinite Programming Approach Wireless Sensor Networks Localization Methods: Multidimensional Scaling vs. Semidefinite Programming Approach Biljana Stojkoska, Ilinka Ivanoska, Danco Davcev, 1 Faculty of Electrical Engineering and Information

More information

Planar Graphs with Many Perfect Matchings and Forests

Planar Graphs with Many Perfect Matchings and Forests Planar Graphs with Many Perfect Matchings and Forests Michael Biro Abstract We determine the number of perfect matchings and forests in a family T r,3 of triangulated prism graphs. These results show that

More information

Facial Expression Detection Using Implemented (PCA) Algorithm

Facial Expression Detection Using Implemented (PCA) Algorithm Facial Expression Detection Using Implemented (PCA) Algorithm Dileep Gautam (M.Tech Cse) Iftm University Moradabad Up India Abstract: Facial expression plays very important role in the communication with

More information

Today. Gradient descent for minimization of functions of real variables. Multi-dimensional scaling. Self-organizing maps

Today. Gradient descent for minimization of functions of real variables. Multi-dimensional scaling. Self-organizing maps Today Gradient descent for minimization of functions of real variables. Multi-dimensional scaling Self-organizing maps Gradient Descent Derivatives Consider function f(x) : R R. The derivative w.r.t. x

More information

A New Heuristic Layout Algorithm for Directed Acyclic Graphs *

A New Heuristic Layout Algorithm for Directed Acyclic Graphs * A New Heuristic Layout Algorithm for Directed Acyclic Graphs * by Stefan Dresbach Lehrstuhl für Wirtschaftsinformatik und Operations Research Universität zu Köln Pohligstr. 1, 50969 Köln revised August

More information

Extracting Information from Complex Networks

Extracting Information from Complex Networks Extracting Information from Complex Networks 1 Complex Networks Networks that arise from modeling complex systems: relationships Social networks Biological networks Distinguish from random networks uniform

More information

Generalized trace ratio optimization and applications

Generalized trace ratio optimization and applications Generalized trace ratio optimization and applications Mohammed Bellalij, Saïd Hanafi, Rita Macedo and Raca Todosijevic University of Valenciennes, France PGMO Days, 2-4 October 2013 ENSTA ParisTech PGMO

More information

EXERCISES SHORTEST PATHS: APPLICATIONS, OPTIMIZATION, VARIATIONS, AND SOLVING THE CONSTRAINED SHORTEST PATH PROBLEM. 1 Applications and Modelling

EXERCISES SHORTEST PATHS: APPLICATIONS, OPTIMIZATION, VARIATIONS, AND SOLVING THE CONSTRAINED SHORTEST PATH PROBLEM. 1 Applications and Modelling SHORTEST PATHS: APPLICATIONS, OPTIMIZATION, VARIATIONS, AND SOLVING THE CONSTRAINED SHORTEST PATH PROBLEM EXERCISES Prepared by Natashia Boland 1 and Irina Dumitrescu 2 1 Applications and Modelling 1.1

More information

Unsupervised Learning

Unsupervised Learning Unsupervised Learning Learning without Class Labels (or correct outputs) Density Estimation Learn P(X) given training data for X Clustering Partition data into clusters Dimensionality Reduction Discover

More information

Graph Theory for Modelling a Survey Questionnaire Pierpaolo Massoli, ISTAT via Adolfo Ravà 150, Roma, Italy

Graph Theory for Modelling a Survey Questionnaire Pierpaolo Massoli, ISTAT via Adolfo Ravà 150, Roma, Italy Graph Theory for Modelling a Survey Questionnaire Pierpaolo Massoli, ISTAT via Adolfo Ravà 150, 00142 Roma, Italy e-mail: pimassol@istat.it 1. Introduction Questions can be usually asked following specific

More information

Orthogonal representations, minimum rank, and graph complements

Orthogonal representations, minimum rank, and graph complements Orthogonal representations, minimum rank, and graph complements Leslie Hogben March 30, 2007 Abstract Orthogonal representations are used to show that complements of certain sparse graphs have (positive

More information