My favorite application using eigenvalues: partitioning and community detection in social networks
- Hugo May
- 6 years ago
Will Hobbs
February 17, 2013

Abstract

Social networks are often organized into families, friendship groups, and institutional bodies. However, this organization, especially into functional groups of friends and acquaintances, may not always be apparent from the characteristics of the individuals or from global measures of network structure. Community detection methods provide a means to identify cohesive and/or inter-dependent clusters of individuals. In this paper, I review traditional methods to cluster and partition vertices, along with a recent community detection method, modularity maximization, that is more intuitively appealing in many research areas. I focus in particular on the spectral solutions to these problems that have advanced and/or expanded their use in real world applications.

Introduction

Social networks, and many other real world networks, display a high level of order [3]. Globally, empirical degree sequences often follow a heavy-tailed distribution, and this distribution can be used in measures of network centrality. At meso-scale, vertices tend to organize in groups with a high concentration of within-group edges and a low concentration of between-group edges. This organization is termed community structure [4], or clustering.¹

Community structure is practically interesting because the clusters often share characteristics or functions. For example, social networks organize into families, friendship groups, and institutional bodies. Further, the development of community structure may imply an underlying preference (and advantage) in forming these memberships that predicts the function of the community and the network as a whole. Membership in such a group can be an indicator of local cohesion and interdependency.

Partitioning and community detection can be readily conceptualized in terms of flows and random walks on a network [3]. A small number of between-group edges forces a random walk to spend more time within a community and to cross into another one relatively infrequently. In a social network, we might connect this to the flow of social support within a group of very close friends or family members (who infrequently extend this support to out-groups). The behavior of such random walks is governed by the eigenvalues and eigenvectors of matrices describing the network, which is what makes spectral solutions possible. The spectral approach has a long history in mathematics [14] and, in recent years, has been adapted to quickly and accurately identify communities in social networks [7].

¹ Much of this paper draws from reviews and explanations of community detection methods by Fortunato [3] and Newman [9].
History

Community detection has parallel and, recently, mutually reinforcing histories in mathematics and in social science. The mathematical approach identifies communities based on network characteristics (edges and edge strengths) [3], while traditional social science research identifies communities based on the shared characteristics and functions of individuals.

The current spectral methods for detecting communities in networks are rooted in early work by Fiedler [2] and Donath and Hoffman [1]. This research used the eigenvectors of matrices (Donath and Hoffman) and, specifically, the eigenvector corresponding to the second smallest eigenvalue of the graph Laplacian (Fiedler), or the spectral gap, to partition graphs. The eigenvector corresponding to the spectral gap gives an approximately minimal bisection of a graph because, in the relaxed form of the problem, the cut size associated with each eigenvector is proportional to its eigenvalue. This method was first practically applied in a 1990 paper by Pothen, Simon, and Liou [12].

The most recent applications of spectral methods to community detection use the eigenvector of the largest eigenvalue of the modularity matrix to maximize the modularity of a network [10], that is, the difference between realized and expected connections within clusters. Specifically, the modularity approach maximizes the value Q:

$$Q = \frac{1}{2m} \sum_{ij} B_{ij}\, \delta(c_i, c_j) \qquad (1)$$

$$B_{ij} = A_{ij} - P_{ij} \qquad (2)$$

$$P_{ij} = \frac{k_i k_j}{2m} \qquad (3)$$

where $B_{ij}$ is the modularity matrix (or the B matrix), $A_{ij}$ is the adjacency matrix, $P_{ij}$ is the null model matrix, $k_i$ is the degree of vertex $i$, $m$ is the edge count, and $\delta$ is the Kronecker delta symbol indicating shared community membership of two vertices [11]. The most common choice of null model is $P_{ij} = k_i k_j / 2m$, closely related to the configuration model (of expected connections based on a given degree sequence) [6, 11].

This method is appealing because, in most applications, a community detection method should not cut a graph based on local degree. Connections in excess of what we should see at random given a degree sequence will reflect some other (non-degree-based) social characteristic or function. Further, other null models can be easily adapted to this method.

These methods are usually validated against labels assigned by the actors themselves, such as party membership in the US Congress, along with qualitative assessments, such as factions within parties during the Civil Rights Era.

Algorithms

The following methods are split by whether they minimize cuts (using the eigenvectors of the two, or more, smallest eigenvalues of the graph Laplacian) or maximize modularity (using the one, or more, leading eigenvectors of the modularity matrix). The modularity maximization methods have achieved greater success in real world applications.
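To make the modularity definition in equations (1)-(3) concrete, here is a minimal sketch in Python with NumPy. The function name `modularity` and the toy graph (two triangles joined by a single edge) are illustrative assumptions, not part of the original paper; the null model is the $k_i k_j / 2m$ choice described above.

```python
import numpy as np

def modularity(A, communities):
    """Modularity Q of a partition, following equations (1)-(3).

    A is a symmetric adjacency matrix; communities[i] is the label of vertex i.
    (Hypothetical helper written for illustration.)
    """
    A = np.asarray(A, dtype=float)
    c = np.asarray(communities)
    k = A.sum(axis=1)                    # vertex degrees k_i
    two_m = k.sum()                      # 2m: every edge is counted twice
    P = np.outer(k, k) / two_m           # null model P_ij = k_i k_j / 2m
    B = A - P                            # modularity matrix B_ij = A_ij - P_ij
    delta = c[:, None] == c[None, :]     # Kronecker delta on community labels
    return float((B * delta).sum() / two_m)

# Toy graph (an assumption for illustration): two triangles joined by one edge.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

q_good = modularity(A, [0, 0, 0, 1, 1, 1])   # split along the bridge edge
q_bad = modularity(A, [0, 1, 0, 1, 0, 1])    # split that ignores the triangles
print(q_good, q_bad)
```

Placing every vertex in a single community gives Q = 0 under this null model, which is why only positive values of Q indicate more within-group edges than the degree sequence alone would predict.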
Cut minimization

Some traditional graph partitioning and clustering methods use the graph Laplacian to minimize the cuts between clusters. In many of these methods, the size of the cut is directly related to the probability that a random walk will transfer from one cluster to another. These methods originally arose to address problems in parallel computing [12, 11] and require ex-ante specification of group sizes (community detection does not).

Greedy method: Kernighan-Lin algorithm [5]

One of the first methods in computer science to calculate minimum cut sizes is a greedy algorithm that swaps pairs of vertices until a (locally) minimal cut size is achieved. The pair swapping ensures that the final partition assigns a specific number of nodes to each group.

1. Divide nodes into two groups.
2. Find the pair of vertices, one from each group, whose swap reduces the cut size by the largest amount (or increases it by the smallest amount).
3. Swap this pair, and continue swapping pairs (moving each vertex only once) until no unswapped pairs remain.
4. Go back to the state with the smallest cut size and begin swapping pairs again (with the preceding, already swapped pairs set in place).
5. Repeat until there is no improvement in cut size.

Spectral partitioning [12]

Spectral partitioning tends to achieve cut sizes similar to the Kernighan-Lin algorithm, but achieves them much more quickly [9]. Results can be improved by using the Kernighan-Lin algorithm after finding the spectral cuts.

1. Calculate the eigenvector corresponding to the spectral gap of the Laplacian (the eigenvector of its second smallest eigenvalue).
2. Sort the $n_1$ largest elements into one group and the rest into a second group (where $n_1$ is the user-specified size of one part of the partition).
3. Sort the $n_1$ smallest elements into one group and the rest into a second group (because we do not know whether $n_1$ should correspond to the largest or smallest elements).
4. Select the division with the smaller cut size.

Spectral clustering [13]
1. Calculate the k eigenvectors corresponding to the k smallest eigenvalues of the normalized graph Laplacian (the unnormalized graph Laplacian will achieve similar results when the vertex degrees are similar [3]).
2. Create an n × k matrix (where the k columns are eigenvectors).
3. Perform k-means clustering on the n vertices in k-dimensional Euclidean space.

Modularity maximization [4]

There are two prominent algorithms to optimize the modularity value introduced in the history section of this paper. The algorithms do not substantially differ in speed or accuracy [9].

Greedy method: vertex-moving algorithm (Kernighan-Lin analog) [9]

1. Divide nodes into two groups.
2. Move the vertex that will achieve the greatest increase in modularity (but move vertices only once).
3. After each vertex has been reassigned, go back to the state with the largest modularity value and begin moving vertices again (with the preceding, already moved vertices set in place).
4. Repeat until there is no improvement in modularity.
5. Repeat this on each bipartition (until there is no increase in modularity) to find an arbitrary number of communities.

Spectral algorithm [7]

1. Find the leading eigenvector of the modularity matrix (using the power method).
2. Assign nodes corresponding to the positive values of the leading eigenvector to one community and the remainder to another (if all elements have the same sign, do not split the network).
3. Repeat this on each bipartition (until the leading eigenvector no longer yields a split) to find an arbitrary number of communities.

Other eigenvectors

This method throws away useful information from other eigenvectors. In theory, the eigenvectors of the k largest eigenvalues can be used to find more than two communities at once, but there is currently no implementation of this (the problem is too high-dimensional).
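A single bisection step of the spectral algorithm can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: it uses NumPy's dense symmetric eigensolver `np.linalg.eigh` rather than the power method, the function name `leading_eigenvector_split` is hypothetical, and "do not split" is implemented as "all entries of the leading eigenvector share one sign", since the overall sign of an eigenvector is arbitrary.

```python
import numpy as np

def leading_eigenvector_split(A):
    """One spectral bisection step on the modularity matrix.

    Returns a boolean array marking one community, or None when the
    leading eigenvector does not suggest a split.
    (Hypothetical helper; dense eigensolver used in place of the power method.)
    """
    A = np.asarray(A, dtype=float)
    k = A.sum(axis=1)                       # vertex degrees
    two_m = k.sum()                         # 2m
    B = A - np.outer(k, k) / two_m          # modularity matrix
    eigvals, eigvecs = np.linalg.eigh(B)    # eigenvalues in ascending order
    v = eigvecs[:, -1]                      # eigenvector of the largest eigenvalue
    positive = v > 0
    if positive.all() or not positive.any():
        return None                         # every entry has the same sign: no split
    return positive

# Toy graph (an assumption for illustration): two triangles joined by one edge.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0
labels = leading_eigenvector_split(A)

# A complete graph on four vertices has no community structure to find.
K4 = np.ones((4, 4)) - np.eye(4)
no_split = leading_eigenvector_split(K4)
print(labels, no_split)
```

On the toy graph the two triangles end up on opposite sides of the split; which triangle is labeled `True` depends on the arbitrary sign returned by the eigensolver.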
Ideas behind optimization techniques

In this section, I review two techniques that enable the practical implementation of spectral partitioning and modularity maximization.
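As a reference point for the first of these techniques, the spectral partitioning steps listed in the cut minimization section can be sketched in a few lines. This is a minimal sketch under assumptions: the helper names `cut_size` and `spectral_partition` are hypothetical, the unnormalized Laplacian L = D - A is used, and a dense eigensolver stands in for whatever solver a production implementation would choose.

```python
import numpy as np

def cut_size(A, in_group):
    """Number of edges (or total edge weight) crossing between the two groups."""
    return float(A[np.ix_(in_group, ~in_group)].sum())

def spectral_partition(A, n1):
    """Split vertices into groups of size n1 and n - n1 using the eigenvector
    of the second smallest Laplacian eigenvalue.
    (Hypothetical helper written for illustration.)
    """
    A = np.asarray(A, dtype=float)
    L = np.diag(A.sum(axis=1)) - A          # graph Laplacian L = D - A
    eigvals, eigvecs = np.linalg.eigh(L)    # eigenvalues in ascending order
    v = eigvecs[:, 1]                       # eigenvector of the second smallest eigenvalue
    order = np.argsort(v)
    candidates = []
    # Try both the n1 smallest and the n1 largest elements as the first group.
    for chosen in (order[:n1], order[-n1:]):
        in_group = np.zeros(len(A), dtype=bool)
        in_group[chosen] = True
        candidates.append((cut_size(A, in_group), in_group))
    # Keep the division with the smaller cut size.
    return min(candidates, key=lambda pair: pair[0])

# Toy graph (an assumption for illustration): two triangles joined by one edge.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0
best_cut, group = spectral_partition(A, 3)
print(best_cut, group)
```

With `n1 = 3` on this graph, both candidate divisions separate the two triangles, so only the single bridging edge is cut.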
Spectral approximation

The matrix formulation of the graph partitioning problem is:

$$R = \frac{1}{4}\, \mathbf{s}^T L\, \mathbf{s} \qquad (4)$$

The community assignment $\mathbf{s}$ (here each $s_i$ in $\mathbf{s}$ is either +1 or -1) that minimizes the cut size R (alternatively, $R = \frac{1}{4} \sum_{ij} A_{ij} (1 - s_i s_j)$) is difficult to find in practice. A useful approximation permits the elements of $\mathbf{s}$ to take on any real value subject to $\mathbf{s}^T \mathbf{s} = n$ (that is, $|\mathbf{s}| = \sqrt{n}$). This relaxation allows $\mathbf{s}$ to point to any position on the hypersphere circumscribing the n-dimensional hypercube whose $2^n$ corners are the original ±1 assignments [9]. Given this relaxation, we can differentiate with respect to $s_i$ and find an approximate, continuous solution in terms of the graph Laplacian and its eigenvectors.

Spectral modularity maximization in $O(n^2)$ time

The modularity matrix B is not a sparse matrix, so each multiplication of B by a vector takes $O(n^2)$ time and finding the leading eigenvector of this matrix, using the power method, will take $O(n^3)$. This is not practical on many real world networks. However, the structure of the modularity matrix allows the computation to be completed more quickly (in $O(n^2)$) [9]. In practice, each multiplication in the power method is computed as:

$$B\mathbf{x} = A\mathbf{x} - \frac{\mathbf{k}\,(\mathbf{k}^T \mathbf{x})}{2m} \qquad (5)$$

where B is the modularity matrix, A is the adjacency matrix, $\mathbf{x}$ is an arbitrary vector, $\mathbf{k}$ is the vector of vertex degrees, and 2m is a scaling term. $A\mathbf{x}$ takes $O(m + n)$ to complete, given the sparse matrix A (m is the number of edges and n is the number of vertices), and the second term can be evaluated in $O(n)$. Because the power method converges to the leading eigenvector in $O(n)$ multiplications and most real world networks (especially social networks) are sparse, so that m grows roughly in proportion to n, the algorithm is completed in $O(n^2)$ [8].
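The product in equation (5) can be sketched without ever forming the dense matrix B. The code below is a minimal illustration under assumptions: the graph is given as an edge list, the names `modularity_matvec` and `leading_eigenvector` are hypothetical, and the fixed iteration count is arbitrary. It also adds a diagonal shift of twice the maximum degree, which is not described in the text above; the shift is my assumption, included so that plain power iteration converges to the most positive eigenvalue of B rather than to a negative eigenvalue of larger magnitude.

```python
import numpy as np

def modularity_matvec(rows, cols, k, two_m, x):
    """Return Bx = Ax - k (k^T x) / 2m using only the edge list.

    rows/cols hold every undirected edge in both directions, so the
    first term costs O(m + n) and the second term costs O(n).
    """
    Ax = np.bincount(rows, weights=x[cols], minlength=len(x))
    return Ax - k * (k @ x) / two_m

def leading_eigenvector(edges, n, iters=500, seed=0):
    """Shifted power iteration for the leading eigenpair of B.
    (Hypothetical helper; the shift and iteration count are assumptions.)
    """
    e = np.asarray(edges)
    rows = np.concatenate([e[:, 0], e[:, 1]])
    cols = np.concatenate([e[:, 1], e[:, 0]])
    k = np.bincount(rows, minlength=n).astype(float)   # vertex degrees
    two_m = k.sum()
    shift = 2.0 * k.max()        # makes B + shift*I positive semidefinite
    x = np.random.default_rng(seed).standard_normal(n)
    x /= np.linalg.norm(x)
    for _ in range(iters):
        y = modularity_matvec(rows, cols, k, two_m, x) + shift * x
        x = y / np.linalg.norm(y)
    eigenvalue = float(x @ modularity_matvec(rows, cols, k, two_m, x))
    return eigenvalue, x

# Toy graph (an assumption for illustration): two triangles joined by one edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
eigenvalue, vector = leading_eigenvector(edges, 6)
print(eigenvalue, vector > 0)
```

The signs of the returned vector give the same two-community split as a dense eigensolver would on this graph, while each iteration touches only the edges and the degree vector.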
References

[1] W. E. Donath and A. J. Hoffman, Lower bounds for the partitioning of graphs, IBM Journal of Research and Development, 17 (1973).
[2] M. Fiedler, Algebraic connectivity of graphs, Czechoslovak Mathematical Journal, 23 (1973).
[3] S. Fortunato, Community detection in graphs, Physics Reports, 486 (2010).
[4] M. Girvan and M. Newman, Community structure in social and biological networks, Proceedings of the National Academy of Sciences of the United States of America, 99 (2002).
[5] B. W. Kernighan and S. Lin, An Efficient Heuristic Procedure for Partitioning Graphs, Bell System Technical Journal, (1970).
[6] M. Molloy and B. Reed, A critical point for random graphs with a given degree sequence, Random Structures & Algorithms, 6 (1995).
[7] M. Newman, Finding community structure in networks using the eigenvectors of matrices, Physical Review E, 74 (2006).
[8] M. Newman, Modularity and community structure in networks, Proceedings of the National Academy of Sciences of the United States of America, 103 (2006).
[9] M. Newman, Networks: An Introduction, Oxford University Press, Inc., New York, NY, USA.
[10] M. Newman and M. Girvan, Finding and evaluating community structure in networks, Physical Review E, 69 (2004).
[11] M. A. Porter, J.-P. Onnela, and P. J. Mucha, Communities in networks, Notices of the AMS, 56 (2009).
[12] A. Pothen, H. D. Simon, and K.-P. Liou, Partitioning Sparse Matrices with Eigenvectors of Graphs, SIAM Journal on Matrix Analysis and Applications, 11 (1990).
[13] J. Shi and J. Malik, Normalized cuts and image segmentation, Pattern Analysis and Machine Intelligence, IEEE Transactions on, 22 (2000).
[14] D. A. Spielman and S.-H. Teng, Spectral Partitioning Works, Foundations of Computer Science, (1996).
More informationAnalysis of Internet Topologies
Feature XIAOFAN LIU Ljiljana Trajković Analysis of Internet Topologies Abstract The discovery of power-laws and spectral properties of the Internet topology illustrates a complex underlying network infrastructure
More informationPartitioning and Partitioning Tools. Tim Barth NASA Ames Research Center Moffett Field, California USA
Partitioning and Partitioning Tools Tim Barth NASA Ames Research Center Moffett Field, California 94035-00 USA 1 Graph/Mesh Partitioning Why do it? The graph bisection problem What are the standard heuristic
More informationMCL. (and other clustering algorithms) 858L
MCL (and other clustering algorithms) 858L Comparing Clustering Algorithms Brohee and van Helden (2006) compared 4 graph clustering algorithms for the task of finding protein complexes: MCODE RNSC Restricted
More informationGeneralized trace ratio optimization and applications
Generalized trace ratio optimization and applications Mohammed Bellalij, Saïd Hanafi, Rita Macedo and Raca Todosijevic University of Valenciennes, France PGMO Days, 2-4 October 2013 ENSTA ParisTech PGMO
More informationParallel Algorithm for Multilevel Graph Partitioning and Sparse Matrix Ordering
Parallel Algorithm for Multilevel Graph Partitioning and Sparse Matrix Ordering George Karypis and Vipin Kumar Brian Shi CSci 8314 03/09/2017 Outline Introduction Graph Partitioning Problem Multilevel
More informationClustering Algorithms for general similarity measures
Types of general clustering methods Clustering Algorithms for general similarity measures general similarity measure: specified by object X object similarity matrix 1 constructive algorithms agglomerative
More informationStatistical Physics of Community Detection
Statistical Physics of Community Detection Keegan Go (keegango), Kenji Hata (khata) December 8, 2015 1 Introduction Community detection is a key problem in network science. Identifying communities, defined
More informationA Comparison of Pattern-Based Spectral Clustering Algorithms in Directed Weighted Network
A Comparison of Pattern-Based Spectral Clustering Algorithms in Directed Weighted Network Sumuya Borjigin 1. School of Economics and Management, Inner Mongolia University, No.235 West College Road, Hohhot,
More informationClustering on networks by modularity maximization
Clustering on networks by modularity maximization Sonia Cafieri ENAC Ecole Nationale de l Aviation Civile Toulouse, France thanks to: Pierre Hansen, Sylvain Perron, Gilles Caporossi (GERAD, HEC Montréal,
More informationFinding and Visualizing Graph Clusters Using PageRank Optimization. Fan Chung and Alexander Tsiatas, UCSD WAW 2010
Finding and Visualizing Graph Clusters Using PageRank Optimization Fan Chung and Alexander Tsiatas, UCSD WAW 2010 What is graph clustering? The division of a graph into several partitions. Clusters should
More informationFeature Selection for fmri Classification
Feature Selection for fmri Classification Chuang Wu Program of Computational Biology Carnegie Mellon University Pittsburgh, PA 15213 chuangw@andrew.cmu.edu Abstract The functional Magnetic Resonance Imaging
More informationUML CS Algorithms Qualifying Exam Fall, 2003 ALGORITHMS QUALIFYING EXAM
NAME: This exam is open: - books - notes and closed: - neighbors - calculators ALGORITHMS QUALIFYING EXAM The upper bound on exam time is 3 hours. Please put all your work on the exam paper. (Partial credit
More informationSocial & Information Network Analysis CS 224W
Social & Information Network Analysis CS 224W Final Report Alexandre Becker Jordane Giuly Sébastien Robaszkiewicz Stanford University December 2011 1 Introduction The microblogging service Twitter today
More informationAarti Singh. Machine Learning / Slides Courtesy: Eric Xing, M. Hein & U.V. Luxburg
Spectral Clustering Aarti Singh Machine Learning 10-701/15-781 Apr 7, 2010 Slides Courtesy: Eric Xing, M. Hein & U.V. Luxburg 1 Data Clustering Graph Clustering Goal: Given data points X1,, Xn and similarities
More informationAn Empirical Analysis of Communities in Real-World Networks
An Empirical Analysis of Communities in Real-World Networks Chuan Sheng Foo Computer Science Department Stanford University csfoo@cs.stanford.edu ABSTRACT Little work has been done on the characterization
More informationClustering: Classic Methods and Modern Views
Clustering: Classic Methods and Modern Views Marina Meilă University of Washington mmp@stat.washington.edu June 22, 2015 Lorentz Center Workshop on Clusters, Games and Axioms Outline Paradigms for clustering
More informationLecture 9 - Matrix Multiplication Equivalences and Spectral Graph Theory 1
CME 305: Discrete Mathematics and Algorithms Instructor: Professor Aaron Sidford (sidford@stanfordedu) February 6, 2018 Lecture 9 - Matrix Multiplication Equivalences and Spectral Graph Theory 1 In the
More informationUnit 2: Graphs and Matrices. ICPSR University of Michigan, Ann Arbor Summer 2015 Instructor: Ann McCranie
Unit 2: Graphs and Matrices ICPSR University of Michigan, Ann Arbor Summer 2015 Instructor: Ann McCranie Four main ways to notate a social network There are a variety of ways to mathematize a social network,
More informationExplore Co-clustering on Job Applications. Qingyun Wan SUNet ID:qywan
Explore Co-clustering on Job Applications Qingyun Wan SUNet ID:qywan 1 Introduction In the job marketplace, the supply side represents the job postings posted by job posters and the demand side presents
More informationLecture 5: Graphs. Rajat Mittal. IIT Kanpur
Lecture : Graphs Rajat Mittal IIT Kanpur Combinatorial graphs provide a natural way to model connections between different objects. They are very useful in depicting communication networks, social networks
More informationCommunity Detection: Comparison of State of the Art Algorithms
Community Detection: Comparison of State of the Art Algorithms Josiane Mothe IRIT, UMR5505 CNRS & ESPE, Univ. de Toulouse Toulouse, France e-mail: josiane.mothe@irit.fr Karen Mkhitaryan Institute for Informatics
More informationEfficient Semi-supervised Spectral Co-clustering with Constraints
2010 IEEE International Conference on Data Mining Efficient Semi-supervised Spectral Co-clustering with Constraints Xiaoxiao Shi, Wei Fan, Philip S. Yu Department of Computer Science, University of Illinois
More informationExtracting Communities from Networks
Extracting Communities from Networks Ji Zhu Department of Statistics, University of Michigan Joint work with Yunpeng Zhao and Elizaveta Levina Outline Review of community detection Community extraction
More informationNetworks in economics and finance. Lecture 1 - Measuring networks
Networks in economics and finance Lecture 1 - Measuring networks What are networks and why study them? A network is a set of items (nodes) connected by edges or links. Units (nodes) Individuals Firms Banks
More information