Community detection. Leonid E. Zhukov

Similar documents
Graph Partitioning Algorithms

Social Data Management Communities

CEIL: A Scalable, Resolution Limit Free Approach for Detecting Communities in Large Networks

Basics of Network Analysis

CAIM: Cerca i Anàlisi d Informació Massiva

Community Detection: Comparison of State of the Art Algorithms

A new Pre-processing Strategy for Improving Community Detection Algorithms

Community Detection based on Structural and Attribute Similarities

CUT: Community Update and Tracking in Dynamic Social Networks

A Simple Acceleration Method for the Louvain Algorithm

On the Permanence of Vertices in Network Communities. Tanmoy Chakraborty Google India PhD Fellow IIT Kharagpur, India

Demystifying movie ratings 224W Project Report. Amritha Raghunath Vignesh Ganapathi Subramanian

Dynamic Clustering in Social Networks using Louvain and Infomap Method

Brief description of the base clustering algorithms

Detecting Community Structure for Undirected Big Graphs Based on Random Walks

Computing Communities in Large Networks Using Random Walks

Community Detection in Bipartite Networks:

Non Overlapping Communities

Community detection algorithms survey and overlapping communities. Presented by Sai Ravi Kiran Mallampati

Static community detection algorithms for evolving networks

Supplementary material to Epidemic spreading on complex networks with community structures

Jure Leskovec, Cornell/Stanford University. Joint work with Kevin Lang, Anirban Dasgupta and Michael Mahoney, Yahoo! Research

MCL. (and other clustering algorithms) 858L

Community Detection in Directed Weighted Function-call Networks

My favorite application using eigenvalues: partitioning and community detection in social networks

Hierarchical Graph Clustering: Quality Metrics & Algorithms

Community Detection. Community

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

Social and Technological Network Analysis. Lecture 4: Community Detec=on and Overlapping Communi=es. Prof. Cecilia Mascolo

A Comparison of Community Detection Algorithms on Artificial Networks

Modularity CMSC 858L

Expected Nodes: a quality function for the detection of link communities

Crawling and Detecting Community Structure in Online Social Networks using Local Information

Communities and Balance in Signed Networks: A Spectral Approach

Hierarchical Overlapping Community Discovery Algorithm Based on Node purity

Web Structure Mining Community Detection and Evaluation

Cluster Analysis. Angela Montanari and Laura Anderlucci

TELCOM2125: Network Science and Analysis

arxiv: v2 [physics.soc-ph] 24 Jul 2009

Overview Of Various Overlapping Community Detection Approaches

1 Homophily and assortative mixing

Networks in economics and finance. Lecture 1 - Measuring networks

Overlapping Communities

The Small Community Phenomenon in Networks: Models, Algorithms and Applications

A Fast Algorithm to Find Overlapping Communities in Networks

Single link clustering: 11/7: Lecture 18. Clustering Heuristics 1

Generalized Modularity for Community Detection

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

Research on Community Structure in Bus Transport Networks

Relative Centrality and Local Community Detection

TELCOM2125: Network Science and Analysis

Community Detection in Social Networks

Complex-Network Modelling and Inference

Hierarchical Clustering

Keywords: dynamic Social Network, Community detection, Centrality measures, Modularity function.

CSCI-B609: A Theorist s Toolkit, Fall 2016 Sept. 6, Firstly let s consider a real world problem: community detection.

Online Social Networks and Media. Community detection

Strength of Co-authorship Ties in Clusters: a Comparative Analysis

Social and Technological Network Analysis. Lecture 4: Community Detec=on and Overlapping Communi=es. Dr. Cecilia Mascolo

Community Detection in Networks using Node Attributes and Modularity

Efficient Mining Algorithms for Large-scale Graphs

Local higher-order graph clustering

Introduction to Parallel & Distributed Computing Parallel Graph Algorithms

Detecting and Analyzing Communities in Social Network Graphs for Targeted Marketing

Algorithms and Applications in Social Networks. 2017/2018, Semester B Slava Novgorodov

CSE 255 Lecture 6. Data Mining and Predictive Analytics. Community Detection

Understanding complex networks with community-finding algorithms

Section 7.12: Similarity. By: Ralucca Gera, NPS

Review on Different Methods of Community Structure of a Complex Software Network

Lecture 11: Clustering and the Spectral Partitioning Algorithm A note on randomized algorithm, Unbiased estimates

Extracting Information from Complex Networks

Using Stable Communities for Maximizing Modularity

Generalized Louvain method for community detection in large networks

Unsupervised discovery of category and object models. The task

Persistent Homology in Complex Network Analysis

V2: Measures and Metrics (II)

Node Similarity. Ralucca Gera, Applied Mathematics Dept. Naval Postgraduate School Monterey, California

Cycles in Random Graphs

Introduction to network metrics

An Empirical Analysis of Communities in Real-World Networks

Overlapping Community Detection in Social Networks Using Parliamentary Optimization Algorithm

Cluster Editing with Locally Bounded Modifications Revisited

EECS730: Introduction to Bioinformatics

Community Structure and Beyond

1 Large-scale structure in networks

Some Graph Theory for Network Analysis. CS 249B: Science of Networks Week 01: Thursday, 01/31/08 Daniel Bilar Wellesley College Spring 2008

Lecture Note: Computation problems in social. network analysis

Chapters 11 and 13, Graph Data Mining

Impact of Clustering on Epidemics in Random Networks

CSE 158 Lecture 6. Web Mining and Recommender Systems. Community Detection

Lecture 20: Clustering and Evolution

Computational Complexity CSC Professor: Tom Altman. Capacitated Problem

IDLE: A Novel Approach to Improving Overlapping Community Detection in Complex Networks

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

A Novel Parallel Hierarchical Community Detection Method for Large Networks

Lesson 4. Random graphs. Sergio Barbarossa. UPC - Barcelona - July 2008

TELCOM2125: Network Science and Analysis

11/17/2009 Comp 590/Comp Fall

Graph Theory S 1 I 2 I 1 S 2 I 1 I 2

Artificial Intelligence

Transcription:

Community detection Leonid E. Zhukov School of Data Analysis and Artificial Intelligence Department of Computer Science National Research University Higher School of Economics Network Science Leonid E. Zhukov (HSE) Lecture 10 18.03.2017 1 / 26

Lecture outline 1 Overlapping commmunities Clique percolation method 2 Multi-level optimization Fast community unfolding 3 Random walk methods Walktrap Leonid E. Zhukov (HSE) Lecture 10 18.03.2017 2 / 26

Community detection image from W. Liu, 2014 Leonid E. Zhukov (HSE) Lecture 10 18.03.2017 3 / 26

Overlapping communities Palla, 2005 Leonid E. Zhukov (HSE) Lecture 10 18.03.2017 4 / 26

Overlapping communities Palla, 2005 Leonid E. Zhukov (HSE) Lecture 10 18.03.2017 5 / 26

k-clique community k-clique is a clique (complete subgraph) with k nodes k-clique community a union of all k-cliques that can be reached from each other through a series of adjacent k-cliques two k-cliques are said to be adjacent if they share k 1 nodes. Adjacent 4-cliques Leonid E. Zhukov (HSE) Lecture 10 18.03.2017 6 / 26

k-clique percolation Find all maximal cliques Create clique overlap matrix Threshold matrix at value k 1 Communities = connected components Palla, 2005 Leonid E. Zhukov (HSE) Lecture 10 18.03.2017 7 / 26

k-clique percolation Palla, 2005 Leonid E. Zhukov (HSE) Lecture 10 18.03.2017 8 / 26

k-clique percolation k = 4 k = 5 Palla, 2005 Leonid E. Zhukov (HSE) Lecture 10 18.03.2017 9 / 26

k-clique percolation Leonid E. Zhukov (HSE) Lecture 10 18.03.2017 10 / 26

Fast community unfolding Multi-resolution scalable method 2 mln mobile phone network V. Blondel et.al., 2008 Leonid E. Zhukov (HSE) Lecture 10 18.03.2017 11 / 26

Fast community unfolding The Louvain method Heuristic method for greedy modularity optimization Find partitions with high modularity Multi-level (multi-resolution) hierarchical scheme Scalable Modularity: Q = 1 2m i,j ( A ij k ) ik j δ(c i, c j ) 2m V. Blondel et.al., 2008 Leonid E. Zhukov (HSE) Lecture 10 18.03.2017 12 / 26

Fast community unfolding Algorithm Assign every node to its own community Phase I For every node evaluate modularity gain from removing node from its community and placing it in the community of its neighbor Place node in the community maximizing modularity gain repeat until no more improvement (local max of modularity) Phase II Nodes from communities merged into super nodes Weight on the links added up Repeat until no more changes (max modularity) V. Blondel et.al., 2008 Leonid E. Zhukov (HSE) Lecture 10 18.03.2017 13 / 26

Fast community unfolding V. Blondel et.al., 2008 Leonid E. Zhukov (HSE) Lecture 10 18.03.2017 14 / 26

Fast community unfolding V. Blondel et.al., 2008 Leonid E. Zhukov (HSE) Lecture 10 18.03.2017 15 / 26

Communities and random walks Random walks on a graph tend to get trapped into densely connected parts corresponding to communities. Leonid E. Zhukov (HSE) Lecture 10 18.03.2017 16 / 26

Walktrap community Walktrap Consider random walk on graph At each time step walk moves to NN uniformly at random P ij = A ij d(i), P = D 1 A, D ii = diag(d(i)) P t ij - probability to get from i to j in t steps, t t mixing Assumptions: for two i and j in the same community P t ij is high if i and j are in the same community, then k, Pik t Pt jk Distance between nodes: r ij (t) = n (Pik t Pt jk )2 = D 1/2 Pi t D 1/2 Pj t d(k) k=1 P. Pons and M. Latapy, 2006 Leonid E. Zhukov (HSE) Lecture 10 18.03.2017 17 / 26

Walktrap Computing node distance r ij Direct (exact) computation: Pij t = (Pt ) ij or Pi t = P t pi 0, p0 i (k) = δ ik Approximate computation (simulation): Compute K random walks of length t starting form node i Approximate Pik t N ik K, number of walks end up on k Distance between communities: k=1 P t Cj = 1 C r C1 C 2 (t) = n (PC t 1 k Pt C 2 k )2 = D 1/2 PC t d(k) 1 D 1/2 PC t 2 P. Pons and M. Latapy, 2006 i C P t ij Leonid E. Zhukov (HSE) Lecture 10 18.03.2017 18 / 26

Walktrap Algorithm (hierarchical clustering) Assign each vertex to its own community S 1 = {{v}, v V } Compute distance between all adjacent communities r Ci C j Choose two closest communities that minimizes (Ward s methods): σ(c 1, C 2 ) = 1 n i C3 r 2 ic 3 i C 1 r 2 ic 1 i C 2 r 2 ic 2 and merge them S k+1 = (S k \{C 1, C 2 }) C 3, C 3 = C 1 C 2 update distance between communities After n 1 steps finish with one community S n = {V } P. Pons and M. Latapy, 2006 Leonid E. Zhukov (HSE) Lecture 10 18.03.2017 19 / 26

Walktrap P. Pons and M. Latapy, 2006 Leonid E. Zhukov (HSE) Lecture 10 18.03.2017 20 / 26

Real world communities Best conductance of a vertex set S of size k: Φ(k) = min φ(s), φ(s) = cut(s, V \S) S V, S =k min(vol(s), vol(s\v )) where vol(s) = i S k i - sum of all node degrees in the set J. Leskovec, K. Lang, 2010 Leonid E. Zhukov (HSE) Lecture 10 18.03.2017 21 / 26

Community detection algorithms Fortunato, 2010 Leonid E. Zhukov (HSE) Lecture 10 18.03.2017 22 / 26

References G. Palla, I. Derenyi, I. Farkas, T. Vicsek, Uncovering the overlapping community structure of complex networks in nature and society, Nature 435 (2005) 814?818. P. Pons and M. Latapy, Computing communities in large networks using random walks, Journal of Graph Algorithms and Applications, 10 (2006), 191-218. V.D. Blondel, J.-L. Guillaume, R. Lambiotte, E. Lefebvre, Fast unfolding of communities in large networks, J. Stat. Mech. P10008 (2008). J. Leskovec, K.J. Lang, A. Dasgupta, and M.W. Mahoney. Statistical properties of community structure in large social and information networks. In WWW 08: Procs. of the 17th Int. Conf. on World Wide Web, pages 695-704, 2008. Leonid E. Zhukov (HSE) Lecture 10 18.03.2017 23 / 26

References M.A Porter, J-P Onella, P.J. Mucha. Communities in Networks, Notices of the American Mathematical Society, Vol. 56, No. 9, 2009 S. E. Schaeffer. Graph clustering. Computer Science Review, 1(1), pp 27-64, 2007. S. Fortunato. Community detection in graphs, Physics Reports, Vol. 486, Iss. 3-5, pp 75-174, 2010 Leonid E. Zhukov (HSE) Lecture 10 18.03.2017 24 / 26

Summary Lectures 1-10 Network characteristics: Power law node degree distribution Small diameter High clustering coefficient (transitivity) Network models: Random graphs Preferential attachement Small world Centrality measures: Degree centrality Closeness centrality Betweenness centrality Link analysis: Page rank HITS Leonid E. Zhukov (HSE) Lecture 10 18.03.2017 25 / 26

Summary Lectures 1-10 Structural equivalence Vertex equivalence Vertex similarity Assortative mixing Assortative and disassortative networks Mixing by node degree Modularity Network structures: Cliques k-cores Network communities: Graph partitioning Overlapping communities Heuristic methods Random walk based methods Leonid E. Zhukov (HSE) Lecture 10 18.03.2017 26 / 26