Chapter 11. Network Community Detection

Wei Pan
Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN 55455
Email: weip@biostat.umn.edu
PubH 7475/8475
© Wei Pan

Outline
- Introduction
- Spectral clustering
- Hierarchical clustering
- Modularity-based methods
- Model-based methods

Key refs:
1. Newman MEJ
2. Zhao Y, Levina E, Zhu J (2012, Ann Statist 40:2266-2292).
3. Fortunato S (2010, Physics Reports 486:75-174).

R package igraph: drawing networks, calculating some network statistics, some community detection algorithms, ...

Introduction
- Given a binary (undirected) network/graph $G = (V, E)$: $V = \{1, 2, \ldots, n\}$ is the set of nodes; $E$ is the set of edges.
- Adjacency matrix $A = (A_{ij})$: $A_{ij} = 1$ if there is an edge/link between nodes $i$ and $j$; $A_{ij} = 0$ otherwise ($A_{ii} = 0$).
- Goal: assign the nodes into $K$ "homogeneous" groups. Homogeneous often means dense connections within groups but sparse connections between groups.
- Why? See Figs 1-4 in Fortunato (2010).

Spectral clustering
- Laplacian $L = D - A$, or ..., where $D = \mathrm{Diag}(D_{11}, \ldots, D_{nn})$ and $D_{ii} = \sum_j A_{ij}$.
- Intuition: if a network separates perfectly into $K$ communities, then $L$ (or $A$) is block diagonal (after some re-ordering of the rows/columns). If the separation is not perfect but nearly so, then the eigenvectors of $L$ are (nearly) linear combinations of the community indicator vectors.
- Apply K-means (or ...) to a few ($K$) eigenvectors corresponding to the smallest eigenvalues of $L$. (Note: the smallest eigenvalue is 0, corresponding to the eigenvector $\mathbf{1}$.)
- Widely used; some theory (e.g. consistency).
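The steps above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `spectral_clustering`, the toy network (two 4-node cliques joined by one edge), and the small hand-written K-means with farthest-point initialization are not from the slides, and any standard K-means routine could be used instead.

```python
import numpy as np

def spectral_clustering(A, K, n_iter=50):
    """K-means on the K eigenvectors of L = D - A with the smallest eigenvalues."""
    A = np.asarray(A, dtype=float)
    L = np.diag(A.sum(axis=1)) - A        # unnormalized Laplacian L = D - A
    _, vecs = np.linalg.eigh(L)           # eigh returns eigenvalues in ascending order
    X = vecs[:, :K]                       # n x K embedding of the nodes

    # Assumption: a plain Lloyd K-means with farthest-point initialization.
    centers = [X[0]]
    for _ in range(1, K):
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for k in range(K):
            if np.any(labels == k):
                centers[k] = X[labels == k].mean(axis=0)
    return labels

# Toy network (hypothetical): two 4-node cliques joined by the single edge 3-4.
A = np.kron(np.eye(2), np.ones((4, 4))) - np.eye(8)
A[3, 4] = A[4, 3] = 1
labels = spectral_clustering(A, K=2)
print(labels)
```

On this toy network the second eigenvector has opposite signs on the two cliques, so the K-means step should recover them as the two groups.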

Modified spectral clustering
- SC may not work well for sparse networks.
- Regularized SC (Qin and Rohe): replace $D$ with $D_\tau = D + \tau I$ for a small $\tau > 0$.
- SC with perturbations (Amini, Chen, Bickel, Levina, 2013, Ann Statist 41:2097-2122): regularize $A$ by adding a small positive number to a random subset of the off-diagonal entries of $A$.
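A small sketch of the degree regularization follows. For the unnormalized $L = D - A$, adding $\tau I$ to $D$ only shifts every eigenvalue by $\tau$ and leaves the eigenvectors unchanged, so the sketch assumes the degree-normalized matrix $D_\tau^{-1/2} A D_\tau^{-1/2}$ is the one being regularized. That choice, the function name `regularized_laplacian`, and the toy network are assumptions, not something the slides state.

```python
import numpy as np

def regularized_laplacian(A, tau):
    """Return D_tau^{-1/2} A D_tau^{-1/2} with D_tau = D + tau * I (assumed form)."""
    A = np.asarray(A, dtype=float)
    d_tau = A.sum(axis=1) + tau           # regularized degrees D_ii + tau
    inv_sqrt = 1.0 / np.sqrt(d_tau)
    return inv_sqrt[:, None] * A * inv_sqrt[None, :]

# Toy sparse network (hypothetical): one edge 0-1 and an isolated node 2.
A = np.array([[0, 1, 0],
              [1, 0, 0],
              [0, 0, 0]])
L_tau = regularized_laplacian(A, tau=1.0)
print(L_tau)
```

Without the regularization, the isolated node has degree 0 and $D^{-1/2}$ is undefined; with $\tau > 0$ every entry stays finite. With this normalized form one would work with the eigenvectors of the largest eigenvalues rather than the smallest.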

Hierarchical clustering
- Need to define some similarity or distance between nodes. Let $A_{i\cdot} = (A_{i1}, A_{i2}, \ldots, A_{in})$ be row $i$ of $A$.
- Euclidean distance: $x_{ij} = \|A_{i\cdot} - A_{j\cdot}\|_2$.
- Or, Pearson's correlation: $x_{ij} = \mathrm{corr}(A_{i\cdot}, A_{j\cdot})$.
- Then apply a hierarchical clustering algorithm. The result can be used to re-arrange the rows/columns of $A$ to get a nearly block-diagonal $A$.
- Fig 3 in Newman. Fig 2 in Meunier et al (2010).
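As a rough illustration, the two node-level measures can be computed directly from the rows of $A$ and fed to an agglomerative procedure. The average-linkage rule, the function names, and the toy network below are assumptions made for the sketch; the slides do not fix a linkage.

```python
import numpy as np

def row_distances(A):
    """Euclidean distance x_ij = ||A_i. - A_j.||_2 between rows of A."""
    A = np.asarray(A, dtype=float)
    diff = A[:, None, :] - A[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def average_linkage(dist, K):
    """Naive agglomerative clustering (average linkage) down to K clusters."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > K:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = dist[np.ix_(clusters[a], clusters[b])].mean()
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]   # merge the closest pair
        del clusters[b]
    return clusters

# Toy network (hypothetical): two 4-node cliques joined by the edge 3-4.
A = np.kron(np.eye(2), np.ones((4, 4))) - np.eye(8)
A[3, 4] = A[4, 3] = 1

dist = row_distances(A)
corr = np.corrcoef(A)                 # Pearson correlation between rows
clusters = average_linkage(dist, K=2)
order = [i for c in clusters for i in c]
A_reordered = A[np.ix_(order, order)] # rows/columns grouped by cluster
print(sorted(sorted(c) for c in clusters))
```

Re-ordering the rows and columns by cluster membership, as in `A_reordered`, is what produces the nearly block-diagonal picture mentioned above.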

Algorithms based on edge removal
- Divisive: edges are progressively removed.
- Which edges? The "bottleneck" ones.
- The edge betweenness of an edge is defined to be the number of shortest paths, over all pairs of nodes, that run along that edge (i.e. through its two end nodes).
- Algorithm (Girvan and Newman 2002, PNAS):
  1) calculate the edge betweenness for each remaining edge in the network;
  2) remove the edge with the highest edge betweenness;
  3) repeat the above until ...
- A possible stopping criterion: modularity, to be discussed.
- Fig 4 in Newman.
- Remarks: slow; some modifications exist, e.g. a Monte Carlo version that calculates edge betweenness using only a random subset of all pairs, or the use of a different criterion.
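The loop above can be sketched in pure Python. Several choices here are assumptions for illustration only: the stopping rule (stop once the network has split into a requested number `K` of connected components, rather than using modularity), the convention that a pair with several shortest paths splits its unit count equally among them, and all function names.

```python
from collections import deque

def edge_betweenness(adj):
    """Edge betweenness for an unweighted, undirected graph (dict: node -> set)."""
    bet = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        dist, order = {s: 0}, []
        sigma = {v: 0.0 for v in adj}
        sigma[s] = 1.0
        preds = {v: [] for v in adj}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate path counts back from the farthest nodes.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                c = sigma[v] / sigma[w] * (1.0 + delta[w])
                bet[frozenset((v, w))] += c
                delta[v] += c
    return {e: b / 2.0 for e, b in bet.items()}  # each pair was visited twice

def components(adj):
    """Connected components as a list of node sets."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = set(), [s]
        while stack:
            v = stack.pop()
            if v not in comp:
                comp.add(v)
                stack.extend(adj[v] - comp)
        seen |= comp
        comps.append(comp)
    return comps

def girvan_newman(adj, K):
    """Remove highest-betweenness edges until K components remain (assumed rule)."""
    adj = {v: set(nb) for v, nb in adj.items()}   # work on a copy
    while len(components(adj)) < K:
        bet = edge_betweenness(adj)               # recomputed after each removal
        u, v = max(bet, key=bet.get)
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)

# Toy network (hypothetical): two 4-node cliques joined by the edge 3-4.
adj = {v: set() for v in range(8)}
edges = [(i, j) for b in (range(4), range(4, 8)) for i in b for j in b if i < j]
for u, v in edges + [(3, 4)]:
    adj[u].add(v)
    adj[v].add(u)

bridge = edge_betweenness(adj)[frozenset((3, 4))]
communities = girvan_newman(adj, K=2)
print(bridge, sorted(sorted(c) for c in communities))
```

In this toy network every shortest path between the two cliques uses the edge 3-4, so that edge is the bottleneck and is removed first. Recomputing the betweenness after every removal is what makes the procedure slow on large networks.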

Modularity-based methods
- Notation:
  degree of node $i$: $d_i = D_{ii} = \sum_{j=1}^n A_{ij}$;
  (twice) the total number of edges: $m = \sum_{i=1}^n d_i$;
  community assignment: $C = (C_1, C_2, \ldots, C_n)$, unknown, with $C_i \in \{1, 2, \ldots, K\}$ the community containing node $i$.
- Modularity:
  $$Q = Q(C) = \frac{1}{2m} \sum_{i,j} \left( A_{ij} - \frac{d_i d_j}{m} \right) I(C_i = C_j).$$
- Intuition: observed number of edges minus expected number of edges (within communities).
- Goal: $\hat{C} = \arg\max_C Q(C)$.
- Assumption: it is good to maximize $Q$, but ...
- Key: a combinatorial optimization problem! Seeking the exact solution will be too slow, hence many approximate algorithms, such as greedy or stochastic searches (e.g. genetic algorithms, simulated annealing), relaxed algorithms, ...
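To make the formula concrete, here is a minimal sketch that evaluates $Q(C)$ exactly as written above, with $m$ equal to the sum of the degrees. The function name `modularity` and the toy network are assumptions for illustration; community labels are 0-based in the code.

```python
import numpy as np

def modularity(A, C):
    """Q(C) = (1 / 2m) * sum_ij (A_ij - d_i d_j / m) * I(C_i = C_j)."""
    A = np.asarray(A, dtype=float)
    C = np.asarray(C)
    d = A.sum(axis=1)                   # node degrees d_i
    m = d.sum()                         # sum of degrees = twice the number of edges
    same = C[:, None] == C[None, :]     # indicator I(C_i = C_j)
    return ((A - np.outer(d, d) / m) * same).sum() / (2 * m)

# Toy network (hypothetical): two 4-node cliques joined by the edge 3-4.
A = np.kron(np.eye(2), np.ones((4, 4))) - np.eye(8)
A[3, 4] = A[4, 3] = 1

q_cliques = modularity(A, [0, 0, 0, 0, 1, 1, 1, 1])   # the two cliques
q_single = modularity(A, [0] * 8)                     # everything in one group
q_mixed = modularity(A, [0, 1, 0, 1, 0, 1, 0, 1])     # an arbitrary split
print(q_cliques, q_single, q_mixed)
```

Here $m = 26$, the clique partition gives $Q = (24 - 13)/52 = 11/52$, and putting all nodes in one community gives $Q = 0$, which matches the "observed minus expected" intuition.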

Very nonparametric?!
- Problems: too many local solutions; a resolution limit, i.e. it cannot detect small communities. Why? An implicit null model.

Model-based methods
- Stochastic block model (SBM) (Holland et al 1983):
  1) a $K \times K$ probability matrix $P$;
  2) $A_{ij} \sim \mathrm{Bin}(1, P_{C_i C_j})$ independently.
- Simple; can model dense/weak within-/between-community edges.
- But it treats all nodes/edges in a community equally; it cannot model "hub" nodes!
- Scale-free network: the node degree distribution $\Pr(k)$ is heavy-tailed; a power law.
- SBM with $K = 1$: Erdős-Rényi random graph.
- Degree-corrected SBM (DCSBM) (Karrer and Newman 2011):
  1) $P$ as above; each node $i$ has a degree parameter $\theta_i$ (with some constraints for identifiability);
  2) $A_{ij} \sim \mathrm{Bin}(1, \theta_i \theta_j P_{C_i C_j})$ independently.
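A short simulation may help to see the two models side by side. This is a sketch under assumptions: the function name `simulate_sbm`, the 0-based labels, the example values of $P$ and $\theta$, and the clipping of $\theta_i \theta_j P_{C_i C_j}$ to $[0, 1]$ are all choices made here, not part of the slides.

```python
import numpy as np

def simulate_sbm(C, P, theta=None, seed=0):
    """Draw a symmetric 0/1 adjacency matrix from an SBM, or a DCSBM if theta is given."""
    rng = np.random.default_rng(seed)
    C = np.asarray(C)
    P = np.asarray(P, dtype=float)
    n = len(C)
    prob = P[np.ix_(C, C)]                      # prob[i, j] = P[C_i, C_j]
    if theta is not None:
        theta = np.asarray(theta, dtype=float)
        # Assumption: clip so that theta_i * theta_j * P stays a valid probability.
        prob = np.clip(np.outer(theta, theta) * prob, 0.0, 1.0)
    upper = np.triu(rng.random((n, n)) < prob, k=1)   # independent draws for i < j
    return (upper | upper.T).astype(int)              # symmetric, zero diagonal

C = np.repeat([0, 1], 4)                        # two communities of 4 nodes
P = np.array([[0.8, 0.1],
              [0.1, 0.8]])
A_sbm = simulate_sbm(C, P)

# DCSBM: node 0 is given a larger degree parameter (a "hub").
theta = np.array([1.2, 0.8, 0.8, 0.8, 1.0, 1.0, 1.0, 1.0])
A_dc = simulate_sbm(C, P, theta=theta)

# Degenerate check: P = identity yields two disconnected cliques with certainty.
A_det = simulate_sbm(C, np.eye(2))
print(A_sbm.sum() // 2, A_dc.sum() // 2, A_det.sum() // 2)
```

Setting `theta` to all ones recovers the plain SBM, which is one way to see the DCSBM as an extension that lets expected degrees vary within a community.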

More notation:
- $n_k(C) = \sum_{i=1}^n I(C_i = k)$, number of nodes in community $k$;
- $O_{kl} = \sum_{i,j=1}^n A_{ij} I(C_i = k, C_j = l)$, number of edges between communities $k \ne l$;
- $O_{kk} = \sum_{i,j=1}^n A_{ij} I(C_i = k, C_j = k)$, (twice) the number of edges within community $k$;
- $O_k = \sum_{l=1}^K O_{kl}$, sum of node degrees in community $k$;
- $m = \sum_{i=1}^n d_i$, (twice) the number of edges in the network.

Objective function: a profile likelihood (profiling out the nuisance parameters $P$ and the $\theta$'s, based on a Poisson approximation to the binomial).
Given a likelihood $L(C, P)$, the profile likelihood is $L(C) = \max_P L(C, P) = L(C, \hat{P}(C))$.
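These block counts are easy to obtain with an indicator matrix. The sketch below assumes 0-based community labels and a hypothetical helper name `block_counts`; with $Z$ the $n \times K$ membership matrix, $O = Z^\top A Z$.

```python
import numpy as np

def block_counts(A, C, K):
    """Return n_k, the K x K matrix O = (O_kl), and O_k for assignment C."""
    A = np.asarray(A, dtype=float)
    Z = np.eye(K)[np.asarray(C)]   # Z[i, k] = I(C_i = k)
    n_k = Z.sum(axis=0)            # number of nodes per community
    O = Z.T @ A @ Z                # O_kl; diagonal entries count edges twice
    O_k = O.sum(axis=1)            # sum of node degrees per community
    return n_k, O, O_k

# Toy network (hypothetical): two 4-node cliques joined by the edge 3-4.
A = np.kron(np.eye(2), np.ones((4, 4))) - np.eye(8)
A[3, 4] = A[4, 3] = 1
n_k, O, O_k = block_counts(A, [0, 0, 0, 0, 1, 1, 1, 1], K=2)
m = O.sum()                        # equals the sum of all degrees
print(n_k, O, O_k, m)
```

For this network each clique has 6 internal edges, so $O_{kk} = 12$, the single bridge gives $O_{12} = 1$, and $m = 26$.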

SBM:
$$Q_{SB}(C) = \sum_{k,l=1}^K O_{kl} \log \frac{O_{kl}}{n_k n_l}.$$
DCSBM:
$$Q_{DC}(C) = \sum_{k,l=1}^K O_{kl} \log \frac{O_{kl}}{O_k O_l}.$$
Newman-Girvan modularity:
$$Q_{NG}(C) = \frac{1}{2m} \sum_k \left( O_{kk} - \frac{O_k^2}{m} \right).$$
Remarks: still a combinatorial optimization problem; better theoretical properties.
Numerical examples in Zhao et al (2012).
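The three criteria can be evaluated for any candidate assignment from the block counts alone. The sketch below is illustrative only: the function name `criteria`, the 0-based labels, the toy network, and the convention $0 \log 0 = 0$ for empty blocks are assumptions.

```python
import numpy as np

def criteria(A, C, K):
    """Return (Q_SB, Q_DC, Q_NG) for community assignment C."""
    A = np.asarray(A, dtype=float)
    Z = np.eye(K)[np.asarray(C)]          # membership indicators
    n_k = Z.sum(axis=0)
    O = Z.T @ A @ Z
    O_k = O.sum(axis=1)
    m = O.sum()                           # sum of degrees

    def sum_o_log_ratio(denom):
        # sum_kl O_kl * log(O_kl / denom_kl), using 0 * log 0 = 0 (assumption)
        pos = O > 0
        return float((O[pos] * np.log(O[pos] / denom[pos])).sum())

    q_sb = sum_o_log_ratio(np.outer(n_k, n_k))
    q_dc = sum_o_log_ratio(np.outer(O_k, O_k))
    q_ng = float((np.trace(O) - (O_k ** 2).sum() / m) / (2 * m))
    return q_sb, q_dc, q_ng

# Toy network (hypothetical): two 4-node cliques joined by the edge 3-4.
A = np.kron(np.eye(2), np.ones((4, 4))) - np.eye(8)
A[3, 4] = A[4, 3] = 1

good = criteria(A, [0, 0, 0, 0, 1, 1, 1, 1], K=2)   # the two cliques
bad = criteria(A, [0, 1, 0, 1, 0, 1, 0, 1], K=2)    # an arbitrary split
print(good, bad)
```

On this toy network all three criteria rank the clique partition above the arbitrary split. Finding the maximizer over all assignments still requires a search, which is the combinatorial problem noted in the remarks.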

Other topics
- Summary statistics for networks; e.g. clustering coefficient, ...
- Weighted networks; with or without negative weights (e.g. Pearson's correlations).
- Overlapping communities.
- Time-varying (dynamic) networks.
- With covariates. How to model covariates?
- Fast (approximate) algorithms; theory.