L1-graph based community detection in online social networks

Similar documents
Social Networks. of Science and Technology, Wuhan , China

A Novel Parallel Hierarchical Community Detection Method for Large Networks

Mining Social Network Graphs

Community Detection. Community

Community Structure and Beyond

Statistical Physics of Community Detection

Non Overlapping Communities

Diffusion and Clustering on Large Graphs

Social-Network Graphs

Social Data Management Communities

Online Social Networks and Media. Community detection

TELCOM2125: Network Science and Analysis

Social Network Recommendation Algorithm based on ICIP

An Efficient Algorithm for Community Detection in Complex Networks

MATH 567: Mathematical Techniques in Data

Persistent Homology in Complex Network Analysis

Spectral Methods for Network Community Detection and Graph Partitioning

Supplementary material to Epidemic spreading on complex networks with community structures

Working with Unlabeled Data Clustering Analysis. Hsiao-Lung Chan Dept Electrical Engineering Chang Gung University, Taiwan

Modeling and Detecting Community Hierarchies

V4 Matrix algorithms and graph partitioning

Fast Snippet Generation. Hybrid System

CS224W: Analysis of Networks Jure Leskovec, Stanford University

Finding and Visualizing Graph Clusters Using PageRank Optimization. Fan Chung and Alexander Tsiatas, UCSD WAW 2010

CS224W: Social and Information Network Analysis Project Report: Edge Detection in Review Networks

My favorite application using eigenvalues: partitioning and community detection in social networks

SPECTRAL SPARSIFICATION IN SPECTRAL CLUSTERING

Community Detection: Comparison of State of the Art Algorithms

Spectral Graph Multisection Through Orthogonality. Huanyang Zheng and Jie Wu CIS Department, Temple University

Detecting Community Structure for Undirected Big Graphs Based on Random Walks

SCALABLE LOCAL COMMUNITY DETECTION WITH MAPREDUCE FOR LARGE NETWORKS

SSD-based Information Retrieval Systems

A new Pre-processing Strategy for Improving Community Detection Algorithms

Hierarchical Clustering 4/5/17

Overlapping Communities

SSD-based Information Retrieval Systems

Community Structure Detection. Amar Chandole Ameya Kabre Atishay Aggarwal

Cluster Analysis. Ying Shen, SSE, Tongji University

Figure (5) Kohonen Self-Organized Map

SOMSN: An Effective Self Organizing Map for Clustering of Social Networks

Cluster Analysis. Mu-Chun Su. Department of Computer Science and Information Engineering National Central University 2003/3/11 1

Types of general clustering methods. Clustering Algorithms for general similarity measures. Similarity between clusters

Discovery of Community Structure in Complex Networks Based on Resistance Distance and Center Nodes

Clustering. Informal goal. General types of clustering. Applications: Clustering in information search and analysis. Example applications in search

Clusters and Communities

Clustering Algorithms for general similarity measures

Community Mining in Signed Networks: A Multiobjective Approach

Digital Image Fundamentals II

Chapter 11. Network Community Detection

Detecting community structure in networks

Introduction to Machine Learning

AN ANT-BASED ALGORITHM WITH LOCAL OPTIMIZATION FOR COMMUNITY DETECTION IN LARGE-SCALE NETWORKS

Extracting Information from Complex Networks

CAIM: Cerca i Anàlisi d Informació Massiva

A Parallel Community Detection Algorithm for Big Social Networks

Generalized Transitive Distance with Minimum Spanning Random Forest

Network community detection with edge classifiers trained on LFR graphs

Web Structure Mining Community Detection and Evaluation

Hierarchical Graph Clustering: Quality Metrics & Algorithms

Graph Laplacian Kernels for Object Classification from a Single Example

Clustering CS 550: Machine Learning

Basics of Network Analysis

Local Fiedler Vector Centrality for Detection of Deep and Overlapping Communities in Networks

On community detection in very large networks

A Brief Review of Representation Learning in Recommender 赵鑫 RUC

Edge Weight Method for Community Detection in Scale-Free Networks

An Exploratory Journey Into Network Analysis A Gentle Introduction to Network Science and Graph Visualization

Spectral Clustering. Presented by Eldad Rubinstein Based on a Tutorial by Ulrike von Luxburg TAU Big Data Processing Seminar December 14, 2014

Robust Spectral Learning for Unsupervised Feature Selection

Chapter 2 Detecting the Overlapping and Hierarchical Community Structure in Networks 2.1 Introduction

Clustering Lecture 4: Density-based Methods

IMAGE SUPER-RESOLUTION BASED ON DICTIONARY LEARNING AND ANCHORED NEIGHBORHOOD REGRESSION WITH MUTUAL INCOHERENCE

Voronoi Region. K-means method for Signal Compression: Vector Quantization. Compression Formula 11/20/2013

CSE 255 Lecture 6. Data Mining and Predictive Analytics. Community Detection

Clustering. So far in the course. Clustering. Clustering. Subhransu Maji. CMPSCI 689: Machine Learning. dist(x, y) = x y 2 2

What was Monet seeing while painting? Translating artworks to photo-realistic images M. Tomei, L. Baraldi, M. Cornia, R. Cucchiara

Finding and evaluating community structure in networks

IMAGE DENOISING USING NL-MEANS VIA SMOOTH PATCH ORDERING

SYDE Winter 2011 Introduction to Pattern Recognition. Clustering

EXTREMAL OPTIMIZATION AND NETWORK COMMUNITY STRUCTURE

Cluster Analysis. Prof. Thomas B. Fomby Department of Economics Southern Methodist University Dallas, TX April 2008 April 2010

CS 5614: (Big) Data Management Systems. B. Aditya Prakash Lecture #21: Graph Mining 2

Community detection using boundary nodes in complex networks

The Generalized Topological Overlap Matrix in Biological Network Analysis

A Clustering Approach to the Bounded Diameter Minimum Spanning Tree Problem Using Ants. Outline. Tyler Derr. Thesis Adviser: Dr. Thang N.

Demystifying movie ratings 224W Project Report. Amritha Raghunath Vignesh Ganapathi Subramanian

SLPA: Uncovering Overlapping Communities in Social Networks via A Speaker-listener Interaction Dynamic Process

A Comparison of Document Clustering Techniques

Bridging the Gap Between Local and Global Approaches for 3D Object Recognition. Isma Hadji G. N. DeSouza

ECS 234: Data Analysis: Clustering ECS 234

Data Clustering Hierarchical Clustering, Density based clustering Grid based clustering

Extracting Communities from Networks

Community Detection Using Cooperative Co-evolutionary Differential Evolution

Clustering. Subhransu Maji. CMPSCI 689: Machine Learning. 2 April April 2015

Automatic Group-Outlier Detection

A new Graph constructor for Semi-supervised Discriminant Analysis via Group Sparsity

Modularity CMSC 858L

Geometric Registration for Deformable Shapes 3.3 Advanced Global Matching

Community Detection in Social Networks

Social Network Analysis With igraph & R. Ofrit Lesser December 11 th, 2014

Transcription:

L1-graph based community detection in online social networks Liang Huang 1, Ruixuan Li 1, Kunmei Wen 1, Xiwu Gu 1, Yuhua Li 1 and Zhiyong Xu 2 1 Huazhong University of Science and Technology 2 Suffork University

Online social network Online social networks have attracted more and more attention in recent years. In online social networks, a common feature is the community structure. Detecting community structures in online social network is a challenging job for traditional algorithms. 2

Features of online social network The scale of online social networks are becoming very huge due to the scale of the internet. t The relationships between people are becoming more complicated. 3

Related Work Newman proposed an iterative, divisive method based on the progressive removal of links with the largest betweenness. Santo Fortunato developed an algorithm of hierarchical clustering that consists of finding and removing the edges iteratively ti with the highest h information centrality. Newman and Girvan proposed a quantitative method called modularity to identify the network communities. 4

Problems of current algorithms Uncompetitive for the large social network. The computation ti overhead of the current algorithms is becoming very high when handling the large-scale online social networks. Not consider the condition of directed social networks for normal spectral clustering algorithms. 5

Our contributions We use L1-graph to utilize the overall contextual information instead of only pairwise i Euclidean distance as conventionally. We use the laplacian regularizer to choose relatively small sample sets of the given social network, which largely reduces the computational cost. We combine the L1-graph and laplacian regularizer to detect the communities in both directed and undirected d social networks. 6

Typical social network N 0 1 1 0 0 1 0 N 1 0 1 0 0 1 0 1 1 0 0 0 1 0 0 1 1 0 0 1 0 G = <V, E> 7

Spectral clustering algorithm (SCA) 8

Spectral clustering algorithm (SCA) 9

Pairwise similarity graph for SCA Pairwise similarity graph for SCA The -neighborhood graph k-nearest neighbor graph The fully connected graph Shortcomings for pairwise similarity graph approaches Pairwise similarity graph doesn t integrate the overall contextual information of the social networks. 10

L1-graph for SCA Advantages for L1-graph Robustness to data noise Good at represent sparsity Datum-adaptive neighborhood Advantage for using L1-graph in SCA Using the L1-graph, the SCA algorithm can automatically select the neighbors for each datum, and the similarity matrix is automatically derived from the calculation of these sparse representations. 11

L-1 graph construction Inputs: The sample data set denoted as the matrix X= x x,... where xi Robust sparse representation: m 1, 1 x N B x 1, x1,..., xn, i m N 1 Where Graph weight setting: I 12

Regularized spectral clustering algorithm (RSCA) 13

Our algorithm RSCA with L1-graph 14

Experiments Implementation using Matlab. Data Sets: Zachary Dolphin network Arxiv HEP-PH PH collaboration network Metric: Modularity 15

Modularity We define a quantity function of the community detection and detect the good communities through optimizing the quantity function Q. 2 Q ( e ii a i ) j The more links within the communities and fewer links between communities, the more Q quantities. The maximum of Q is 1. We use modularity to quantify the results of our algorithm. 16

Results in two benchmark social networks (Zachary and Dolphin network) RSCA with L1-graph and full samples has the best performance as displayed in the figure. 17

Result in Arxiv HEP-PH PH collaboration network Arxiv with 12,008 nodes and 237,010 edges. RSCA with 2000 samples has the best performance. SCA outperformed the RSCA with 800 samples. 18

Conclusion We proposed to integrate L1-graph and RSCA to detect community structures in complex online social networks in an efficient way. The social networks adopted in our paper only consist of ftens of fthousands nodes. We plan to employ some parallel and distributed computing tools, such as MapReduce, to improve the computing performance. 19

Thanks! 20