My favorite application using eigenvalues: partitioning and community detection in social networks

Will Hobbs
February 17, 2013

Abstract

Social networks are often organized into families, friendship groups, and institutional bodies. However, this organization, especially functional groups of friends and acquaintances, may not always be apparent from the characteristics of the individuals or from global measures of network structure. Community detection methods provide a means to identify cohesive and/or interdependent clusters of individuals. In this paper, I review traditional methods to cluster and partition vertices, along with a recent community detection method, modularity maximization, that is more intuitively appealing in many research areas. I focus in particular on the spectral solutions to these problems that have advanced and/or expanded their use in real-world applications.

Introduction

Social networks, and many other real-world networks, display a high level of order [3]. Globally, empirical degree sequences often follow a heavy-tailed distribution, and this distribution can be used in measures of network centrality. At the meso-scale, vertices tend to organize into groups with a high concentration of within-group edges and a low concentration of between-group edges. This organization is termed community structure [4], or clustering.¹

Community structure is practically interesting because the clusters often share characteristics or functions. For example, social networks organize into families, friendship groups, and institutional bodies. Further, the development of community structure may imply an underlying preference (and advantage) in forming these memberships that predicts the function of the community and the network as a whole. Membership in such a group can be an indicator of local cohesion and interdependency.

Partitioning and community detection can be readily conceptualized in terms of flows and random walks on a network [3]. A small number of between-group edges forces a random walk to spend more time within a community and to cross into another one relatively infrequently. In a social network, we might connect this to the flow of social support within a group of very close friends or family members (who infrequently extend this support to out-groups). This spectral approach has a long history in mathematics [14] and, in recent years, has been adapted to quickly and accurately identify communities in social networks [7].

¹ Much of this paper draws from reviews and explanations of community detection methods by Fortunato [3] and Newman [9].

History

Community detection has parallel and, recently, mutually reinforcing histories in mathematics and in social science. The mathematical approach identifies communities based on network characteristics (edges and edge strengths) [3], while traditional social science research identifies communities based on the shared characteristics and functions of individuals.

The current spectral methods for detecting communities in networks are rooted in early work by Fiedler [2] and Donath and Hoffman [1]. This research used the eigenvectors of matrices (Donath and Hoffman) and, specifically, the eigenvector corresponding to the second-smallest eigenvalue of the graph Laplacian (Fiedler), also called the spectral gap, to partition graphs. This eigenvector yields an approximately minimal bisection of a graph because, in the relaxed problem, the cut size associated with each eigenvector is proportional to its eigenvalue. This method was first practically applied in a 1990 paper by Pothen, Simon, and Liou [12].

The most recent applications of spectral methods to community detection use the largest eigenvalue to maximize the modularity of a network [10], that is, the difference between realized and expected connections within clusters. Specifically, the modularity approach maximizes the value $Q$:

$$Q = \frac{1}{2m} \sum_{ij} B_{ij}\, \delta(c_i, c_j) \tag{1}$$

$$B_{ij} = A_{ij} - P_{ij} \tag{2}$$

$$P_{ij} = \frac{k_i k_j}{2m} \tag{3}$$

where $B_{ij}$ is an element of the modularity matrix (or the B matrix), $A_{ij}$ is an element of the adjacency matrix, $P_{ij}$ is an element of the null model matrix, $k_i$ is the degree of vertex $i$, $m$ is the edge count, and $\delta(c_i, c_j)$ is the Kronecker delta, equal to 1 when vertices $i$ and $j$ share a community and 0 otherwise [11].

The most common choice of null model is $P_{ij} = k_i k_j / 2m$, which is closely related to the configuration model (of expected connections based on a given degree sequence) [6, 11]. This method is appealing because, in most applications, a community detection method should not cut a graph based on local degree.
Connections in excess of what we would expect at random given a degree sequence reflect some other (non-degree-based) social characteristic or function. Further, other null models can be easily adapted to this method. These methods are usually validated against labels assigned by the actors themselves, such as party membership in the US Congress, along with qualitative assessments, such as factions within parties during the Civil Rights Era.

Algorithms

The following methods are split by whether they minimize cuts (using the eigenvectors for the two or more smallest eigenvalues of the graph Laplacian) or maximize modularity (using the one or more leading eigenvectors of the modularity matrix). The modularity maximization methods have achieved greater success in real-world applications.
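To make the quantity being maximized concrete, the modularity $Q$ of equations (1) to (3) can be computed directly from an edge list. The following is a minimal sketch, not code from the paper: the function name `modularity`, the edge-list input, and the toy graph (two triangles joined by one edge) are all illustrative assumptions, and the graph is assumed to be simple and undirected with each edge listed once.

```python
from collections import defaultdict


def modularity(edges, community):
    """Q = (1/2m) * sum_ij (A_ij - k_i k_j / 2m) * delta(c_i, c_j).

    Assumes a simple undirected graph: no self-loops, each edge listed once.
    `community` maps every vertex to a community label.
    """
    m = len(edges)
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1

    # Each within-community edge contributes A_uv + A_vu = 2 to the sum.
    within = sum(2 for u, v in edges if community[u] == community[v])

    # Summing k_i * k_j over all same-community pairs (i, j) equals the
    # squared total degree of each community.
    degree_sum = defaultdict(int)
    for node, k in degree.items():
        degree_sum[community[node]] += k
    expected = sum(d * d for d in degree_sum.values()) / (2 * m)

    return (within - expected) / (2 * m)


# Hypothetical toy graph: two triangles joined by the edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
print(round(modularity(edges, split), 4))  # 0.3571, i.e. 5/14
```

Placing every vertex in a single community gives $Q = 0$ for this null model, which is why a positive $Q$ signals more within-group edges than the degree sequence alone would predict.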

Cut minimization

Some traditional graph partitioning and clustering methods use the graph Laplacian to minimize the cuts between clusters. In many of these methods, the size of the cut is directly related to the probability that a random walk will transfer from one cluster to another. These methods originally arose to address problems in parallel computing [12, 11] and require ex-ante specification of group sizes (community detection does not).

Greedy method: Kernighan-Lin algorithm [5]

One of the first methods in computer science to calculate minimum cut sizes is a greedy algorithm that swaps pairs of vertices until a (locally) minimal cut size is achieved. The pair swapping ensures that the final partition assigns a specific number of nodes to each group.

1. Divide nodes into two groups.
2. Find the pair of vertices, one from each group, whose swap reduces the cut size by the largest amount.
3. Swap such pairs, moving each vertex only once, until every vertex has been swapped.
4. Go back to the state with the smallest cut size and begin swapping pairs again (with the preceding, already swapped pairs set in place).
5. Repeat until there is no improvement in cut size.

Spectral partitioning [12]

Spectral partitioning tends to achieve cut sizes similar to those of the Kernighan-Lin algorithm, but achieves them much more quickly [9]. Results can be improved by running the Kernighan-Lin algorithm after finding the spectral cuts.

1. Calculate the eigenvector corresponding to the spectral gap (the second-smallest eigenvalue) of the Laplacian.
2. Assign the vertices with the $n_1$ largest elements to one group and the rest to a second group (where $n_1$ is the user-specified size of one partition).
3. Assign the vertices with the $n_1$ smallest elements to one group and the rest to a second group (because we do not know whether $n_1$ should correspond to the largest or smallest elements).
4. Select the division with the smaller cut size.

Spectral clustering [13]

1. Calculate the $k$ eigenvectors corresponding to the $k$ smallest eigenvalues of the normalized graph Laplacian (the unnormalized graph Laplacian will achieve similar results when the vertex degrees are similar [3]).
2. Create an $n \times k$ matrix (where the $k$ columns are the eigenvectors).
3. Perform k-means clustering on the $n$ vertices in $k$-dimensional Euclidean space.

Modularity maximization [4]

There are two prominent algorithms to optimize the modularity value introduced in the history section of this paper. The algorithms do not substantially differ in speed or accuracy [9].

Greedy method: vertex-moving algorithm (Kernighan-Lin analog) [9]

1. Divide nodes into two groups.
2. Move the vertex that will achieve the greatest increase in modularity (but move each vertex only once).
3. After each vertex has been reassigned, go back to the state with the largest modularity value and begin moving vertices again (with the preceding, already moved vertices set in place).
4. Repeat until there is no improvement in modularity.
5. Repeat this on each bipartition (until there is no increase in modularity) to find an arbitrary number of communities.

Spectral algorithm [7]

1. Find the leading eigenvector of the modularity matrix (using the power method).
2. Assign nodes corresponding to the positive values of the leading eigenvector to one community and the remainder to another (if there are no positive values, do not split the network).
3. Repeat this on each bipartition (until there are no positive values in the leading eigenvector) to find an arbitrary number of communities.

Other eigenvectors

This method throws away useful information from the other eigenvectors. In theory, the eigenvectors of the $k$ largest eigenvalues can be used to find more than two communities at once, but there is currently no implementation of this (the problem is too high-dimensional).
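A single bisection step of the spectral algorithm above might look like the following sketch. It is an illustration under stated assumptions rather than the implementation from [7]: it builds a dense modularity matrix for clarity, the function names are hypothetical, and the shift of `2 * max(degree)` added to the power iteration is my own assumption, chosen so that the iteration converges to the most positive eigenvalue rather than to a large negative one. Repeated division of the resulting groups is not covered here.

```python
import random


def modularity_matrix(n, edges):
    """Dense B_ij = A_ij - k_i k_j / (2m) for a simple undirected graph."""
    A = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        A[u][v] = 1.0
        A[v][u] = 1.0
    k = [sum(row) for row in A]
    two_m = sum(k)
    B = [[A[i][j] - k[i] * k[j] / two_m for j in range(n)] for i in range(n)]
    return B, k


def leading_eigenvector(B, shift, iterations=500, seed=0):
    """Power iteration on B + shift * I (the shift is an assumption, see above)."""
    rng = random.Random(seed)
    n = len(B)
    x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for _ in range(iterations):
        y = [sum(B[i][j] * x[j] for j in range(n)) + shift * x[i]
             for i in range(n)]
        norm = sum(v * v for v in y) ** 0.5
        x = [v / norm for v in y]
    return x


def spectral_bisect(n, edges, tol=1e-9):
    """Split vertices by the sign of the leading eigenvector of B.

    Returns one group when all elements share a sign; an eigenvector's
    overall sign is arbitrary, so "all negative" is treated like "no
    positive values" in step 2 above.
    """
    B, k = modularity_matrix(n, edges)
    x = leading_eigenvector(B, shift=2.0 * max(k))
    positive = [i for i in range(n) if x[i] > tol]
    rest = [i for i in range(n) if x[i] <= tol]
    if not positive or not rest:
        return [list(range(n))]
    return [positive, rest]


# Hypothetical toy graph: two triangles joined by the edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
print(sorted(spectral_bisect(6, edges)))  # [[0, 1, 2], [3, 4, 5]]
```

On a single triangle the leading eigenvector is constant, so the sketch leaves the network unsplit, matching the stopping rule in step 2.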
Ideas behind optimization techniques

In this section, I review two techniques that enable the practical implementation of spectral partitioning and modularity maximization.

Spectral approximation

The matrix formulation of the graph partitioning problem is:

$$R = \frac{1}{4} s^T L s \tag{4}$$

The community assignment $s$ (with each $s_i$ equal to either +1 or -1) that minimizes the cut size $R$ (alternatively, $R = \frac{1}{4} \sum_{ij} A_{ij} (1 - s_i s_j)$) is difficult to find in practice. A useful approximation permits the elements of $s$ to take on any real values subject to $\lVert s \rVert = \sqrt{n}$. This relaxation lets $s$ lie anywhere on the hypersphere circumscribing the hypercube whose $2^n$ corners are the valid assignments [9]. Given this relaxation, we can differentiate with respect to $s_i$ and find an approximate, continuous solution in terms of the graph Laplacian and its eigenvectors.

Spectral modularity maximization in $O(n^2)$ time

The modularity matrix $B$ is not a sparse matrix, and finding the leading eigenvector of this matrix using the power method will take $O(n^3)$ time. This is not practical on many real-world networks. However, the structure of the modularity matrix allows the computation to be completed more quickly (in $O(n^2)$) [9]. In practice, each multiplication in the power method is computed as:

$$Bx = Ax - \frac{k (k^T x)}{2m} \tag{5}$$

where $B$ is the modularity matrix, $A$ is the adjacency matrix, $x$ is an arbitrary vector, $k$ is the vector of vertex degrees, and $2m$ is a scaling term. $Ax$ takes $O(m + n)$ time to compute, given the sparse matrix $A$ ($m$ is the number of edges and $n$ is the number of vertices), and the second term can be evaluated in $O(n)$. Because the power method converges to the leading eigenvector in $O(n)$ multiplications and most real-world networks (especially social networks) are sparse, the algorithm is completed in $O(n^2)$ time [8].
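The product in equation (5) can be sketched without ever forming the dense matrix $B$. The snippet below is illustrative only: the adjacency-list input and the name `modularity_matvec` are assumptions, not part of the original paper, and the graph is assumed to be simple and undirected.

```python
def modularity_matvec(neighbors, x):
    """Return B x via B x = A x - k (k^T x) / (2m), without building B.

    `neighbors[i]` lists the vertices adjacent to vertex i (assumed
    symmetric, no self-loops), so the degree k_i is len(neighbors[i]).
    """
    k = [len(nbrs) for nbrs in neighbors]
    two_m = sum(k)

    # k^T x is a single dot product: O(n).
    k_dot_x = sum(ki * xi for ki, xi in zip(k, x))

    # A x touches each adjacency-list entry once: O(m + n).
    Ax = [sum(x[j] for j in nbrs) for nbrs in neighbors]

    return [Ax[i] - k[i] * k_dot_x / two_m for i in range(len(x))]


# Hypothetical toy graph: two triangles joined by the edge (2, 3).
neighbors = [[1, 2], [0, 2], [0, 1, 3], [2, 4, 5], [3, 5], [3, 4]]

# Rows of B sum to zero, so the all-ones vector is mapped to zero.
print(modularity_matvec(neighbors, [1.0] * 6))
```

Each call costs $O(m + n)$, so a power iteration built on this product stays within the $O(n^2)$ bound described above when the network is sparse.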

References

[1] W. E. Donath and A. J. Hoffman, Lower bounds for the partitioning of graphs, IBM Journal of Research and Development, 17 (1973).
[2] M. Fiedler, Algebraic connectivity of graphs, Czechoslovak Mathematical Journal, 23 (1973).
[3] S. Fortunato, Community detection in graphs, Physics Reports, 486 (2010).
[4] M. Girvan and M. Newman, Community structure in social and biological networks, Proceedings of the National Academy of Sciences of the United States of America, 99 (2002).
[5] B. W. Kernighan and S. Lin, An efficient heuristic procedure for partitioning graphs, Bell System Technical Journal, (1970).
[6] M. Molloy and B. Reed, A critical point for random graphs with a given degree sequence, Random Structures & Algorithms, 6 (1995).
[7] M. Newman, Finding community structure in networks using the eigenvectors of matrices, Physical Review E, 74 (2006).
[8] M. Newman, Modularity and community structure in networks, Proceedings of the National Academy of Sciences of the United States of America, 103 (2006).
[9] M. Newman, Networks: An Introduction, Oxford University Press, Inc., New York, NY, USA.
[10] M. Newman and M. Girvan, Finding and evaluating community structure in networks, Physical Review E, 69 (2004).
[11] M. A. Porter, J.-P. Onnela, and P. J. Mucha, Communities in networks, Notices of the AMS, 56 (2009).
[12] A. Pothen, H. D. Simon, and K.-P. Liou, Partitioning sparse matrices with eigenvectors of graphs, SIAM Journal on Matrix Analysis and Applications, 11 (1990).
[13] J. Shi and J. Malik, Normalized cuts and image segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence, 22 (2000).
[14] D. A. Spielman and S.-H. Teng, Spectral partitioning works, Foundations of Computer Science, (1996).


More information

arxiv: v1 [stat.ml] 2 Nov 2010

arxiv: v1 [stat.ml] 2 Nov 2010 Community Detection in Networks: The Leader-Follower Algorithm arxiv:111.774v1 [stat.ml] 2 Nov 21 Devavrat Shah and Tauhid Zaman* devavrat@mit.edu, zlisto@mit.edu November 4, 21 Abstract Traditional spectral

More information

Detecting community structure in networks

Detecting community structure in networks Eur. Phys. J. B 38, 321 330 (2004) DOI: 10.1140/epjb/e2004-00124-y THE EUROPEAN PHYSICAL JOURNAL B Detecting community structure in networks M.E.J. Newman a Department of Physics and Center for the Study

More information

Oh Pott, Oh Pott! or how to detect community structure in complex networks

Oh Pott, Oh Pott! or how to detect community structure in complex networks Oh Pott, Oh Pott! or how to detect community structure in complex networks Jörg Reichardt Interdisciplinary Centre for Bioinformatics, Leipzig, Germany (Host of the 2012 Olympics) Questions to start from

More information

Community Identification in International Weblogs

Community Identification in International Weblogs Technische Universität Kaiserslautern Fernstudium Software Engineering for Embedded Systems Masterarbeit Community Identification in International Weblogs Autor: Christoph Rueger Betreuer: Prof. Dr. Prof.

More information

Extracting Information from Complex Networks

Extracting Information from Complex Networks Extracting Information from Complex Networks 1 Complex Networks Networks that arise from modeling complex systems: relationships Social networks Biological networks Distinguish from random networks uniform

More information

Spectral Graph Multisection Through Orthogonality

Spectral Graph Multisection Through Orthogonality Spectral Graph Multisection Through Orthogonality Huanyang Zheng and Jie Wu Department of Computer and Information Sciences Temple University, Philadelphia, PA 922 {huanyang.zheng, jiewu}@temple.edu ABSTRACT

More information

Scalable Clustering of Signed Networks Using Balance Normalized Cut

Scalable Clustering of Signed Networks Using Balance Normalized Cut Scalable Clustering of Signed Networks Using Balance Normalized Cut Kai-Yang Chiang,, Inderjit S. Dhillon The 21st ACM International Conference on Information and Knowledge Management (CIKM 2012) Oct.

More information

Planar Graphs 2, the Colin de Verdière Number

Planar Graphs 2, the Colin de Verdière Number Spectral Graph Theory Lecture 26 Planar Graphs 2, the Colin de Verdière Number Daniel A. Spielman December 4, 2009 26.1 Introduction In this lecture, I will introduce the Colin de Verdière number of a

More information

1 More stochastic block model. Pr(G θ) G=(V, E) 1.1 Model definition. 1.2 Fitting the model to data. Prof. Aaron Clauset 7 November 2013

1 More stochastic block model. Pr(G θ) G=(V, E) 1.1 Model definition. 1.2 Fitting the model to data. Prof. Aaron Clauset 7 November 2013 1 More stochastic block model Recall that the stochastic block model (SBM is a generative model for network structure and thus defines a probability distribution over networks Pr(G θ, where θ represents

More information

Q. Wang National Key Laboratory of Antenna and Microwave Technology Xidian University No. 2 South Taiba Road, Xi an, Shaanxi , P. R.

Q. Wang National Key Laboratory of Antenna and Microwave Technology Xidian University No. 2 South Taiba Road, Xi an, Shaanxi , P. R. Progress In Electromagnetics Research Letters, Vol. 9, 29 38, 2009 AN IMPROVED ALGORITHM FOR MATRIX BANDWIDTH AND PROFILE REDUCTION IN FINITE ELEMENT ANALYSIS Q. Wang National Key Laboratory of Antenna

More information

Application of Spectral Clustering Algorithm

Application of Spectral Clustering Algorithm 1/27 Application of Spectral Clustering Algorithm Danielle Middlebrooks dmiddle1@math.umd.edu Advisor: Kasso Okoudjou kasso@umd.edu Department of Mathematics University of Maryland- College Park Advance

More information

Social Network Analysis

Social Network Analysis Social Network Analysis Mathematics of Networks Manar Mohaisen Department of EEC Engineering Adjacency matrix Network types Edge list Adjacency list Graph representation 2 Adjacency matrix Adjacency matrix

More information

Graph Theory Review. January 30, Network Science Analytics Graph Theory Review 1

Graph Theory Review. January 30, Network Science Analytics Graph Theory Review 1 Graph Theory Review Gonzalo Mateos Dept. of ECE and Goergen Institute for Data Science University of Rochester gmateosb@ece.rochester.edu http://www.ece.rochester.edu/~gmateosb/ January 30, 2018 Network

More information

Research Article Performance Evaluation of Modularity Based Community Detection Algorithms in Large Scale Networks

Research Article Performance Evaluation of Modularity Based Community Detection Algorithms in Large Scale Networks Mathematical Problems in Engineering, Article ID 502809, 15 pages http://dx.doi.org/10.1155/2014/502809 Research Article Performance Evaluation of Modularity Based Community Detection Algorithms in Large

More information

Multilevel Graph Partitioning

Multilevel Graph Partitioning Multilevel Graph Partitioning George Karypis and Vipin Kumar Adapted from Jmes Demmel s slide (UC-Berkely 2009) and Wasim Mohiuddin (2011) Cover image from: Wang, Wanyi, et al. "Polygonal Clustering Analysis

More information

Analysis of Internet Topologies

Analysis of Internet Topologies Feature XIAOFAN LIU Ljiljana Trajković Analysis of Internet Topologies Abstract The discovery of power-laws and spectral properties of the Internet topology illustrates a complex underlying network infrastructure

More information

Partitioning and Partitioning Tools. Tim Barth NASA Ames Research Center Moffett Field, California USA

Partitioning and Partitioning Tools. Tim Barth NASA Ames Research Center Moffett Field, California USA Partitioning and Partitioning Tools Tim Barth NASA Ames Research Center Moffett Field, California 94035-00 USA 1 Graph/Mesh Partitioning Why do it? The graph bisection problem What are the standard heuristic

More information

MCL. (and other clustering algorithms) 858L

MCL. (and other clustering algorithms) 858L MCL (and other clustering algorithms) 858L Comparing Clustering Algorithms Brohee and van Helden (2006) compared 4 graph clustering algorithms for the task of finding protein complexes: MCODE RNSC Restricted

More information

Generalized trace ratio optimization and applications

Generalized trace ratio optimization and applications Generalized trace ratio optimization and applications Mohammed Bellalij, Saïd Hanafi, Rita Macedo and Raca Todosijevic University of Valenciennes, France PGMO Days, 2-4 October 2013 ENSTA ParisTech PGMO

More information

Parallel Algorithm for Multilevel Graph Partitioning and Sparse Matrix Ordering

Parallel Algorithm for Multilevel Graph Partitioning and Sparse Matrix Ordering Parallel Algorithm for Multilevel Graph Partitioning and Sparse Matrix Ordering George Karypis and Vipin Kumar Brian Shi CSci 8314 03/09/2017 Outline Introduction Graph Partitioning Problem Multilevel

More information

Clustering Algorithms for general similarity measures

Clustering Algorithms for general similarity measures Types of general clustering methods Clustering Algorithms for general similarity measures general similarity measure: specified by object X object similarity matrix 1 constructive algorithms agglomerative

More information

Statistical Physics of Community Detection

Statistical Physics of Community Detection Statistical Physics of Community Detection Keegan Go (keegango), Kenji Hata (khata) December 8, 2015 1 Introduction Community detection is a key problem in network science. Identifying communities, defined

More information

A Comparison of Pattern-Based Spectral Clustering Algorithms in Directed Weighted Network

A Comparison of Pattern-Based Spectral Clustering Algorithms in Directed Weighted Network A Comparison of Pattern-Based Spectral Clustering Algorithms in Directed Weighted Network Sumuya Borjigin 1. School of Economics and Management, Inner Mongolia University, No.235 West College Road, Hohhot,

More information

Clustering on networks by modularity maximization

Clustering on networks by modularity maximization Clustering on networks by modularity maximization Sonia Cafieri ENAC Ecole Nationale de l Aviation Civile Toulouse, France thanks to: Pierre Hansen, Sylvain Perron, Gilles Caporossi (GERAD, HEC Montréal,

More information

Finding and Visualizing Graph Clusters Using PageRank Optimization. Fan Chung and Alexander Tsiatas, UCSD WAW 2010

Finding and Visualizing Graph Clusters Using PageRank Optimization. Fan Chung and Alexander Tsiatas, UCSD WAW 2010 Finding and Visualizing Graph Clusters Using PageRank Optimization Fan Chung and Alexander Tsiatas, UCSD WAW 2010 What is graph clustering? The division of a graph into several partitions. Clusters should

More information

Feature Selection for fmri Classification

Feature Selection for fmri Classification Feature Selection for fmri Classification Chuang Wu Program of Computational Biology Carnegie Mellon University Pittsburgh, PA 15213 chuangw@andrew.cmu.edu Abstract The functional Magnetic Resonance Imaging

More information

UML CS Algorithms Qualifying Exam Fall, 2003 ALGORITHMS QUALIFYING EXAM

UML CS Algorithms Qualifying Exam Fall, 2003 ALGORITHMS QUALIFYING EXAM NAME: This exam is open: - books - notes and closed: - neighbors - calculators ALGORITHMS QUALIFYING EXAM The upper bound on exam time is 3 hours. Please put all your work on the exam paper. (Partial credit

More information

Social & Information Network Analysis CS 224W

Social & Information Network Analysis CS 224W Social & Information Network Analysis CS 224W Final Report Alexandre Becker Jordane Giuly Sébastien Robaszkiewicz Stanford University December 2011 1 Introduction The microblogging service Twitter today

More information

Aarti Singh. Machine Learning / Slides Courtesy: Eric Xing, M. Hein & U.V. Luxburg

Aarti Singh. Machine Learning / Slides Courtesy: Eric Xing, M. Hein & U.V. Luxburg Spectral Clustering Aarti Singh Machine Learning 10-701/15-781 Apr 7, 2010 Slides Courtesy: Eric Xing, M. Hein & U.V. Luxburg 1 Data Clustering Graph Clustering Goal: Given data points X1,, Xn and similarities

More information

An Empirical Analysis of Communities in Real-World Networks

An Empirical Analysis of Communities in Real-World Networks An Empirical Analysis of Communities in Real-World Networks Chuan Sheng Foo Computer Science Department Stanford University csfoo@cs.stanford.edu ABSTRACT Little work has been done on the characterization

More information

Clustering: Classic Methods and Modern Views

Clustering: Classic Methods and Modern Views Clustering: Classic Methods and Modern Views Marina Meilă University of Washington mmp@stat.washington.edu June 22, 2015 Lorentz Center Workshop on Clusters, Games and Axioms Outline Paradigms for clustering

More information

Lecture 9 - Matrix Multiplication Equivalences and Spectral Graph Theory 1

Lecture 9 - Matrix Multiplication Equivalences and Spectral Graph Theory 1 CME 305: Discrete Mathematics and Algorithms Instructor: Professor Aaron Sidford (sidford@stanfordedu) February 6, 2018 Lecture 9 - Matrix Multiplication Equivalences and Spectral Graph Theory 1 In the

More information

Unit 2: Graphs and Matrices. ICPSR University of Michigan, Ann Arbor Summer 2015 Instructor: Ann McCranie

Unit 2: Graphs and Matrices. ICPSR University of Michigan, Ann Arbor Summer 2015 Instructor: Ann McCranie Unit 2: Graphs and Matrices ICPSR University of Michigan, Ann Arbor Summer 2015 Instructor: Ann McCranie Four main ways to notate a social network There are a variety of ways to mathematize a social network,

More information

Explore Co-clustering on Job Applications. Qingyun Wan SUNet ID:qywan

Explore Co-clustering on Job Applications. Qingyun Wan SUNet ID:qywan Explore Co-clustering on Job Applications Qingyun Wan SUNet ID:qywan 1 Introduction In the job marketplace, the supply side represents the job postings posted by job posters and the demand side presents

More information

Lecture 5: Graphs. Rajat Mittal. IIT Kanpur

Lecture 5: Graphs. Rajat Mittal. IIT Kanpur Lecture : Graphs Rajat Mittal IIT Kanpur Combinatorial graphs provide a natural way to model connections between different objects. They are very useful in depicting communication networks, social networks

More information

Community Detection: Comparison of State of the Art Algorithms

Community Detection: Comparison of State of the Art Algorithms Community Detection: Comparison of State of the Art Algorithms Josiane Mothe IRIT, UMR5505 CNRS & ESPE, Univ. de Toulouse Toulouse, France e-mail: josiane.mothe@irit.fr Karen Mkhitaryan Institute for Informatics

More information

Efficient Semi-supervised Spectral Co-clustering with Constraints

Efficient Semi-supervised Spectral Co-clustering with Constraints 2010 IEEE International Conference on Data Mining Efficient Semi-supervised Spectral Co-clustering with Constraints Xiaoxiao Shi, Wei Fan, Philip S. Yu Department of Computer Science, University of Illinois

More information

Extracting Communities from Networks

Extracting Communities from Networks Extracting Communities from Networks Ji Zhu Department of Statistics, University of Michigan Joint work with Yunpeng Zhao and Elizaveta Levina Outline Review of community detection Community extraction

More information

Networks in economics and finance. Lecture 1 - Measuring networks

Networks in economics and finance. Lecture 1 - Measuring networks Networks in economics and finance Lecture 1 - Measuring networks What are networks and why study them? A network is a set of items (nodes) connected by edges or links. Units (nodes) Individuals Firms Banks

More information