Label propagation with dams on large graphs using Apache Hadoop and Apache Spark
1 Label propagation with dams on large graphs using Apache Hadoop and Apache Spark. ATTAL Jean-Philippe, MALEK Maria. October 19, 2015
2 Contents
1. Real-World Graphs
2. Community Detection Problem
   - Community detection algorithms
   - Supervised and unsupervised measures
3. Benchmarks to test our community detection algorithms
4. Proposed community detection algorithms
5. Perspectives: a Hadoop and Spark implementation, first experiments on Amazon
6. Conclusion
3 What is a graph?
A simple definition: a graph is a set of vertices (nodes) connected by edges, where an edge expresses a relationship between two nodes (for example, a certain level of similarity).
A mathematical definition: a graph is an ordered pair $G = (V, E)$ where $V$ is the set of vertices (or nodes) and $E$ is the set of edges of the graph. We write:
- $|V| = n$, the number of vertices of $G$
- $|E| = m$, the number of edges of $G$
Figure 1: weighted graph
Figure 2: unweighted graph
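To make the notation concrete, here is a minimal sketch of an undirected, unweighted graph stored as an adjacency dictionary, with $n$ and $m$ read off it. The function names `build_graph` and `graph_size` are hypothetical and chosen for illustration only; the sketch assumes there are no self-loops.

```python
def build_graph(edges):
    """Build an undirected adjacency dict {node: set(neighbours)} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def graph_size(adj):
    """Return (n, m): number of vertices and number of undirected edges."""
    n = len(adj)
    # Each undirected edge appears in two neighbour sets (assumes no self-loops).
    m = sum(len(neighbours) for neighbours in adj.values()) // 2
    return n, m


adj = build_graph([(0, 1), (1, 2), (2, 0), (2, 3)])
n, m = graph_size(adj)
```

Using sets for the neighbours means a repeated edge in the input is counted once.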
5 The community detection problem
The community detection problem is to find a partition of the nodes such that there is a high density of edges within each group and a low density of edges between groups. It amounts to detecting the natural groups of vertices in a network. There is no absolute definition in the literature.
Figure 3: A toy example with 3 communities of different scales
6 Analysis of community detection methods
There are three main families of community detection methods:
- Local methods: based on a node and its neighbourhood; propagate an information or an operation; not deterministic and very unstable; low complexity, $O(m)$.
- Global methods: based on the whole topology of the graph; high complexity (often $O(n^3)$); propagate an information or an operation; deterministic, but subject to error propagation.
- Hybrid ("glocal") methods: based on a local method that is led by one or several global metrics; complexity depends on the global metrics used; better-quality results than local methods; not deterministic.
7 Example of the three methodologies Figure 4: Agglomerative, spectral and Leader driven methods
8 Unsupervised measures
To evaluate the quality of our community detection algorithms, we use:
- Unsupervised measures: modularity and conductance.
- Supervised measures: adjusted Rand index, normalised mutual information and purity.
Let $f(S)$ be a metric that captures the quality of a cluster $S$. An optimized value of $f(S)$ signifies a more community-like set of nodes:
- The conductance, $f(S) = \frac{c_S}{N_S (2 m_S + c_S)}$, measures the fraction of total edge volume that points outside the cluster.
- The modularity, $f(S) = \frac{1}{4}(m_S - E(m_S))$, is the difference between $m_S$, the number of edges between nodes in $S$, and $E(m_S)$, the expected number of such edges in a random graph with identical degree sequence.
Here $N_S$ is the number of nodes in $S$ and $c_S$ is the number of outgoing links.
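Both measures are built from the same three counts on a cluster $S$: the number of nodes $N_S$, the number of internal edges $m_S$ and the number of outgoing links $c_S$. The sketch below only computes these counts for an undirected graph given as an adjacency dictionary; the function name `cluster_counts` and the graph representation are assumptions made for illustration, not taken from the slides.

```python
def cluster_counts(adj, cluster):
    """Return (N_S, m_S, c_S) for the node set `cluster` in an undirected graph.

    adj: dict mapping each node to an iterable of its neighbours.
    """
    s = set(cluster)
    internal_endpoints = 0  # every internal edge is seen from both of its endpoints
    outgoing = 0            # edges with exactly one endpoint inside S
    for u in s:
        for v in adj[u]:
            if v in s:
                internal_endpoints += 1
            else:
                outgoing += 1
    return len(s), internal_endpoints // 2, outgoing


# Two triangles joined by the single edge 2-3.
adj = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
n_s, m_s, c_s = cluster_counts(adj, {0, 1, 2})
```

On this toy graph the first triangle has three nodes, three internal edges and one outgoing link.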
9 Supervised measures: Adjusted Rand Index (1971)
Let $S$ be a set of $N$ data items, and let $U = \{U_1, U_2, \ldots, U_R\}$ and $V = \{V_1, V_2, \ldots, V_C\}$ be two partitions of $S$. The overlap between $U$ and $V$ can be summarized in an $R \times C$ contingency table $M = [n_{ij}]_{i=1 \ldots R}^{j=1 \ldots C}$, where $n_{ij}$ denotes the number of objects that are common to clusters $U_i$ and $V_j$. Counting over the $\binom{N}{2}$ pairs of items, we have:
- $N_{11}$: the number of pairs that are in the same cluster in both $U$ and $V$
- $N_{00}$: the number of pairs that are in different clusters in both $U$ and $V$
- $N_{01}$: the number of pairs that are in the same cluster in $U$ but in different clusters in $V$
- $N_{10}$: the number of pairs that are in different clusters in $U$ but in the same cluster in $V$
$$\mathrm{ARI} = \frac{2(N_{00} N_{11} - N_{01} N_{10})}{(N_{00} + N_{01})(N_{01} + N_{11}) + (N_{00} + N_{10})(N_{10} + N_{11})}$$
Figure 5: College football club network
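The pair-counting formula can be checked on small examples with a direct implementation. The following is a minimal sketch, assuming each partition is given as a list of cluster labels (one per item); the function names are hypothetical, and returning 1.0 when the denominator is zero is a convention chosen here, not something stated in the slides. It enumerates all pairs, so it is only suitable for small inputs.

```python
from itertools import combinations


def pair_counts(u, v):
    """Return (N11, N00, N01, N10) for two label lists of equal length."""
    n11 = n00 = n01 = n10 = 0
    for i, j in combinations(range(len(u)), 2):
        same_u = u[i] == u[j]
        same_v = v[i] == v[j]
        if same_u and same_v:
            n11 += 1
        elif not same_u and not same_v:
            n00 += 1
        elif same_u:
            n01 += 1  # same cluster in U, different clusters in V
        else:
            n10 += 1  # different clusters in U, same cluster in V
    return n11, n00, n01, n10


def adjusted_rand_index(u, v):
    """ARI computed from the four pair counts."""
    n11, n00, n01, n10 = pair_counts(u, v)
    numerator = 2 * (n00 * n11 - n01 * n10)
    denominator = (n00 + n01) * (n01 + n11) + (n00 + n10) * (n10 + n11)
    if denominator == 0:
        # Assumption: both partitions are trivial and identical
        # (one single cluster, or only singletons).
        return 1.0
    return numerator / denominator


same = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])   # identical up to renaming
cross = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])  # partitions that cut across each other
```

The index depends only on which items are grouped together, so renaming the clusters does not change it.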
10 Supervised measures: Normalised Mutual Information (1987)
$$H(U) = -\sum_{i=1}^{R} \frac{a_i}{N} \log\frac{a_i}{N}$$
$$H(U, V) = -\sum_{i=1}^{R} \sum_{j=1}^{C} \frac{n_{ij}}{N} \log\frac{n_{ij}}{N}$$
$$MI(U, V) = H(U) + H(V) - H(U, V)$$
where $a_i = \sum_{j} n_{ij}$ is the number of items in cluster $U_i$.
Values of the NMI: both the MI and the NMI
- measure the information that partitions $U$ and $V$ share;
- tell how much knowing one of these clusterings reduces our uncertainty about the other.
The NMI can be considered as a function $f_{NMI}(U, V) \in [0, 1]$.
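The entropies and the mutual information can be computed directly from label counts. The sketch below assumes each partition is a list of cluster labels. The slide does not say how the MI is normalised, so the division by $\sqrt{H(U) H(V)}$ in `normalised_mutual_information` is an assumption (one common choice among several), as are all function names.

```python
from collections import Counter
from math import log, sqrt


def entropy(labels):
    """H(U) = -sum_i (a_i / N) log(a_i / N), with a_i the size of cluster i."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def joint_entropy(u, v):
    """H(U, V) over the non-empty cells n_ij of the contingency table."""
    n = len(u)
    return -sum((c / n) * log(c / n) for c in Counter(zip(u, v)).values())


def mutual_information(u, v):
    """MI(U, V) = H(U) + H(V) - H(U, V)."""
    return entropy(u) + entropy(v) - joint_entropy(u, v)


def normalised_mutual_information(u, v):
    """Assumed normalisation: MI / sqrt(H(U) * H(V)), which lies in [0, 1]."""
    h_u, h_v = entropy(u), entropy(v)
    if h_u == 0 or h_v == 0:
        # Assumption: a single-cluster partition carries no information.
        return 0.0
    return mutual_information(u, v) / sqrt(h_u * h_v)


shared = normalised_mutual_information([0, 0, 1, 1], [1, 1, 0, 0])
independent = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```

Two partitions that agree up to renaming share all their information, while two partitions that cut across each other evenly share none.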
12 Networks characteristics Figure 6: Network Characteristics
14 Proposed community detection algorithm
Label propagation is a non-deterministic algorithm based on propagating labels between neighbouring nodes.
1: Initialize the nodes with unique labels: $\forall n \in N,\ c_n = l_n$.
2: Update each node's label to the label shared by most of its neighbours: $\forall n \in N,\ c_n = \arg\max_{l \in L} |\{v \in \Gamma(n) : c_v = l\}|$, where $\Gamma(n)$ denotes the neighbours of $n$.
3: If every node has a label that the maximum number of its neighbours have, stop the algorithm; otherwise go back to the previous step.
Figure 7: The label propagation
This algorithm is a local method:
- low complexity, $O(m)$
- non-deterministic
- very unstable (without an order or a preferential order)
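The three steps above can be sketched as follows. This is a minimal illustration of standard label propagation, not the authors' implementation: the asynchronous updates in a shuffled order, the random tie-breaking, the rule that a node keeps its label when it is already one of the most frequent, and the `max_sweeps` and `seed` parameters are all assumptions made here.

```python
import random
from collections import Counter


def label_propagation(adj, max_sweeps=100, seed=0):
    """Return {node: label} for an undirected graph given as an adjacency dict."""
    rng = random.Random(seed)
    labels = {n: n for n in adj}  # step 1: every node starts with a unique label
    nodes = list(adj)
    for _ in range(max_sweeps):
        rng.shuffle(nodes)        # random update order: the source of instability
        changed = False
        for n in nodes:
            if not adj[n]:
                continue          # isolated nodes keep their own label
            counts = Counter(labels[v] for v in adj[n])
            best = max(counts.values())
            candidates = [l for l, c in counts.items() if c == best]
            # step 2: adopt a most frequent neighbour label (ties broken at random)
            if labels[n] not in candidates:
                labels[n] = rng.choice(candidates)
                changed = True
        if not changed:           # step 3: every node already holds a majority label
            break
    return labels


# Two disconnected triangles: labels cannot cross between them.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4, 5], 4: [3, 5], 5: [3, 4]}
communities = label_propagation(adj)
```

Changing `seed` changes the update order and the tie-breaking, which on less trivial graphs can lead to different partitions; this is the instability the slide refers to.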
15 Community detection algorithms: problems of label propagation
Label propagation has three major problems:
- It can produce a bad propagation.
- It can produce "monster" communities.
- It is highly unstable (when no update order is imposed).
Figure 8: The label propagation
16 Community detection algorithms: other label propagation algorithms in the literature
- Adding a score to each label, which decreases when the geodesic distance from the label source node is too high (Leung et al. 2009)
- Multistep greedy agglomerative label propagation algorithm using modularity (Liu and Murata 2009)
- Offensive, defensive and hop attenuation label propagation (node propagation strength) (Lovro et al. 2013)
- COPRA, finding overlapping communities (Steve Gregory 2009)
- "Community Detection Using A Neighborhood Strength Driven Label Propagation Algorithm" (Xie and B.K. Szymanski 2011)
- Maximum overlap label core detection using MapReduce (Ovelgonne 2013)
- "Robust network community detection using balanced propagation" (Šubelj and Bajec 2013)
- "Controlled Label Propagation: Preventing Over Propagation through Gradual Expansion" (Rezaei and Soleymani 2015)
The list is not exhaustive.
17 Community detection algorithms: proposed method, a stabilized label propagation with dams
Objective: find community structures without the problem of bad propagation, with a method that is easily parallelizable and produces good-quality community detection.
Figure 9: The label propagation with dams
We denote by β the percentage of edges with dams.
18 Community detection algorithms: proposed method, a hybrid method, a stabilized label propagation with dams
The edge betweenness centrality
Let $G = (V, E)$ be a graph, where $V$ is the set of nodes and $E$ the set of edges of $G$. Let $w$ be the weight function on edges; for an unweighted graph, $w(e) = 1$. Consider paths starting at $s \in V$ and finishing at $t \in V$, and let $\sigma_{st}$ denote the total number of shortest paths between vertices $s$ and $t$. The notion of edge betweenness is based on the number of shortest paths that pass through a given edge. The edge betweenness $BC(e)$ of an edge $e \in E$ is given by:
$$BC(e) = \sum_{s,t \in V,\ s \neq t} \frac{\sigma_{st}(e)}{\sigma_{st}}$$
where $\sigma_{st}(e)$ is the number of shortest paths from $s$ to $t$ that pass through the edge $e$.
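For an unweighted graph, the sum above can be evaluated with one breadth-first search per source node followed by a backward accumulation (the technique known as Brandes' algorithm). The sketch below is an illustration under stated assumptions rather than the authors' code: the function name is hypothetical, the graph is an undirected adjacency dictionary with comparable node identifiers, and the sum runs over ordered pairs $(s, t)$ exactly as the formula is written, so each unordered pair is counted twice.

```python
from collections import deque


def edge_betweenness(adj):
    """Return {(u, v): BC} with u < v for an unweighted, undirected graph."""
    bc = {}
    for u in adj:
        for v in adj[u]:
            bc[(min(u, v), max(u, v))] = 0.0
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        dist = {s: 0}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v precedes w on a shortest path
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Backward pass: push path fractions from the farthest nodes towards s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1.0 + delta[w])
                bc[(min(v, w), max(v, w))] += share
                delta[v] += share
    return bc


path = {0: [1], 1: [0, 2], 2: [1]}
scores = edge_betweenness(path)
```

Each search costs $O(m)$, so the whole computation is $O(nm)$ on an unweighted graph, which is why computing it on very large graphs is a concern.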
19 Community detection algorithms: proposed method, the LPWD
Figure 10: The LPWD
Complexity: O(n^2) + k O(m - βm)
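The LPWD procedure itself is only given as a figure, so the following is a sketch under stated assumptions rather than the authors' algorithm: it assumes that dams are placed on the fraction β of edges with the highest edge betweenness, and that a dam simply prevents labels from crossing that edge, leaving $m - \beta m$ edges for propagation. The function names `select_dams` and `apply_dams` are hypothetical, and the edge betweenness scores are taken as a precomputed dictionary keyed by `(u, v)` with `u < v`.

```python
def select_dams(bc, beta):
    """Return the set of edges that receive a dam.

    bc:   {(u, v): edge betweenness} with u < v
    beta: fraction of edges to dam, between 0 and 1
    """
    k = int(beta * len(bc))
    # Highest betweenness first; ties broken by edge identifier for determinism.
    ranked = sorted(bc, key=lambda e: (-bc[e], e))
    return set(ranked[:k])


def apply_dams(adj, dams):
    """Return a copy of the adjacency dict in which dammed edges are not traversable."""
    return {
        u: [v for v in neighbours if (min(u, v), max(u, v)) not in dams]
        for u, neighbours in adj.items()
    }


adj = {0: [1, 2], 1: [0, 2], 2: [1, 3, 0], 3: [2]}
bc = {(0, 1): 4.0, (1, 2): 9.0, (2, 3): 4.0, (0, 2): 1.0}
dams = select_dams(bc, 0.25)
dammed = apply_dams(adj, dams)
```

Under these assumptions, a label propagation routine would then be run on `dammed` instead of `adj`.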
20 Community detection algorithms : proposed Method, the LPWD Figure 11: The LPWD on Football Club
21 Community detection algorithms : a core label propagation with dams Figure 12: The core detection of Seifi et al. (2011)
22 Community detection algorithms: LPWDUS
Figure 13: The LPWDUS
Complexity: O(n^2) + O(N k (m - βm))
23 Community detection algorithms: LPWDUSWS
Figure 14: The LPWDUSWS
The modularity can be used as the score metric.
Complexity: O(n^2) + O( 1 N k (m - βm))
24 Community detection algorithms: ECDLPWD
Figure 15: The ECDLPWD
Complexity: O(n^2) + O( 1 N k (m - βm))
25 Comparative analysis
Table 1: Experiments with some community detection algorithms
Columns: Algorithms, Q, Φ, NMI, ARI, Purity, #
Network: Zachary (#2)
Algorithms: Louvain, Seifi, GN, Spin, Spectral, WalkTrap, Leung, LPA *, DPA, Infomap, LICOD, ECDLPWD, LPWDUS, LPWDUSWM, LPWODUS
26 Comparative analysis
Table 2: Experiments with some community detection algorithms
Columns: Algorithms, Q, Φ, NMI, ARI, Purity, #
Network: Football (#11)
Algorithms: Louvain, Seifi, GN, Spin, Spectral, WalkTrap, Leung, LPA *, DPA, Infomap, LICOD, ECDLPWD, LPWDUS, LPWDUSWM, LPWODUS
27 Comparative analysis
Table 3: Experiments with some community detection algorithms
Columns: Algorithms, Q, Φ, NMI, ARI, Purity, #
Network: Dolphins (#2)
Algorithms: Louvain, Seifi, GN, Spin, Spectral, WalkTrap, Leung, LPA *, DPA, Infomap, LICOD, ECDLPWD, LPWDUS, LPWDUSWM, LPWODUS
28 Comparative analysis
Table 4: Experiments with some community detection algorithms
Columns: Algorithms, Q, Φ, NMI, ARI, Purity, #
Network: Political (#3)
Algorithms: Louvain, Seifi, GN, Spin, Spectral, WalkTrap, Leung, LPA *, DPA, Infomap, LICOD, ECDLPWD, LPWDUS, LPWDUSWM, LPWODUS
30 Perspectives and current work: Parallel graph processing systems Figure 16: PGPS
31 Perspectives : graph partitioning with Mizan Figure 17: Dynamic graph partitioning
32 Perspectives and current work: HADOOP and MapReduce Figure 18: Hadoop architecture
33 Perspectives and current work: Spark and RDD Figure 19: Spark architecture
34 Perspectives and current work: label propagation on large graphs
Current work:
- Work on large graphs with billions of edges.
- A Hadoop version for large-scale graphs.
- How to compute the edge betweenness on large graphs.
- How to develop an in-memory solution using Spark.
- Study the parametrization of LPWDUS with α and β.
- Propose an improved version of label propagation with core detection.
- DBLP, YouTube, LiveJournal.
35 Perspectives: Amazon graph
The Amazon graph represents a network of products, where each vertex is a product and an edge exists between two products if they have frequently been co-purchased.
Figure 20: Amazon network characteristics
36 Perspectives: a label propagation on Hadoop Figure 21: Simple Label propagation on Hadoop
37 Perspectives: a core label propagation on Hadoop Figure 22: Community size distribution
39 Conclusion
- Placing dams improves the quality of community detection with a local method.
- ECDLPWD gives better results than LPWODUS.
- LPWDUSWM seems, in specific cases, to give better results than ECDLPWD.
- For scale-free graphs, placing dams on the 15% to 20% of edges with the highest edge betweenness gives better-quality results.
- Combining dams with core detection makes it possible to find cores, but produces a larger number of communities than the standard LPA.
- The number of communities in LPWODUS depends on α.
40 Do you have any questions?
41 Please cite the following articles:
- Jean-Philippe Attal, Maria Malek, A new label propagation with dams, IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), Paris, August
- Jean-Philippe Attal, Maria Malek, Un nouvel algorithme de propagation de labels avec barrages, Journée Réseaux Sociaux et Intelligence Artificielle (Atelier PFIA), Rennes, June 29
- Jean-Philippe Attal, Maria Malek, Propagation de labels avec barrages sur de grands graphes en utilisant Apache Hadoop et Apache Spark (GraphX), Journée thématique : Fouille de grands graphes (JFGG15), Nîmes, October 2015
More informationCS224W: Analysis of Networks Jure Leskovec, Stanford University
CS224W: Analysis of Networks Jure Leskovec, Stanford University http://cs224w.stanford.edu 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, http://cs224w.stanford.edu 2 Observations Models
More informationNear Linear-Time Community Detection in Networks with Hardly Detectable Community Structure
Near Linear-Time Community Detection in Networks with Hardly Detectable Community Structure Aria Rezaei Department of Computer Engineering Sharif University of Technology Email: arezaei@ce.sharif.edu Saeed
More informationTopological Centrality and Its Applications. Hai Zhuge, Senior Member, IEEE, and Junsheng Zhang
1 Topological Centrality and Its Applications Hai Zhuge, Senior Member, IEEE, and Junsheng Zhang Abstract Recent development of network structure analysis shows that it plays an important role in characterizing
More informationAlgorithms for Grid Graphs in the MapReduce Model
University of Nebraska - Lincoln DigitalCommons@University of Nebraska - Lincoln Computer Science and Engineering: Theses, Dissertations, and Student Research Computer Science and Engineering, Department
More informationCommunity Detection Algorithm based on Centrality and Node Closeness in Scale-Free Networks
234 29 2 SP-B 2014 Community Detection Algorithm based on Centrality and Node Closeness in Scale-Free Networks Sorn Jarukasemratana Tsuyoshi Murata Xin Liu 1 Tokyo Institute of Technology sorn.jaru@ai.cs.titech.ac.jp
More informationEXTREMAL OPTIMIZATION AND NETWORK COMMUNITY STRUCTURE
EXTREMAL OPTIMIZATION AND NETWORK COMMUNITY STRUCTURE Noémi Gaskó Department of Computer Science, Babeş-Bolyai University, Cluj-Napoca, Romania gaskonomi@cs.ubbcluj.ro Rodica Ioana Lung Department of Statistics,
More informationKeywords: dynamic Social Network, Community detection, Centrality measures, Modularity function.
Volume 6, Issue 1, January 2016 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com An Efficient
More informationGeneralized Measures for the Evaluation of Community Detection Methods
Edit 13/05/2016: the R source code for the measures described in this article is now publicly available online on GitHub: https://github.com/compnet/topomeasures Generalized Measures for the Evaluation
More informationChapter 7: Competitive learning, clustering, and self-organizing maps
Chapter 7: Competitive learning, clustering, and self-organizing maps António R. C. Paiva EEL 6814 Spring 2008 Outline Competitive learning Clustering Self-Organizing Maps What is competition in neural
More informationarxiv: v1 [cs.si] 17 Sep 2016
Understanding Stability of Noisy Networks through Centrality Measures and Local Connections arxiv:69.542v [cs.si] 7 Sep 26 ABSTRACT Vladimir Ufimtsev Dept. of CS Univ. of Nebraska at Omaha NE 6882, USA
More informationIntroduction to Information Retrieval
Introduction to Information Retrieval http://informationretrieval.org IIR 6: Flat Clustering Wiltrud Kessler & Hinrich Schütze Institute for Natural Language Processing, University of Stuttgart 0-- / 83
More informationCommunity Structure Detection. Amar Chandole Ameya Kabre Atishay Aggarwal
Community Structure Detection Amar Chandole Ameya Kabre Atishay Aggarwal What is a network? Group or system of interconnected people or things Ways to represent a network: Matrices Sets Sequences Time
More informationUnderstanding Clustering Supervising the unsupervised
Understanding Clustering Supervising the unsupervised Janu Verma IBM T.J. Watson Research Center, New York http://jverma.github.io/ jverma@us.ibm.com @januverma Clustering Grouping together similar data
More informationECG782: Multidimensional Digital Signal Processing
ECG782: Multidimensional Digital Signal Processing Object Recognition http://www.ee.unlv.edu/~b1morris/ecg782/ 2 Outline Knowledge Representation Statistical Pattern Recognition Neural Networks Boosting
More informationCommunity Mining in Signed Networks: A Multiobjective Approach
Community Mining in Signed Networks: A Multiobjective Approach Alessia Amelio National Research Council of Italy (CNR) Inst. for High Perf. Computing and Networking (ICAR) Via Pietro Bucci, 41C 87036 Rende
More informationModularity CMSC 858L
Modularity CMSC 858L Module-detection for Function Prediction Biological networks generally modular (Hartwell+, 1999) We can try to find the modules within a network. Once we find modules, we can look
More informationAn overview of Graph Categories and Graph Primitives
An overview of Graph Categories and Graph Primitives Dino Ienco (dino.ienco@irstea.fr) https://sites.google.com/site/dinoienco/ Topics I m interested in: Graph Database and Graph Data Mining Social Network
More informationWSI using Graphs of Collocations. Paper by: Ioannis P. Klapaftis and Suresh Manandhar Presented by: Ahmad R. Shahid
WSI using Graphs of Collocations Paper by: Ioannis P. Klapaftis and Suresh Manandhar Presented by: Ahmad R. Shahid Word Sense Induction (WSI) Identifying different senses (uses) of a word Finds applications
More informationA Review on Overlapping Community Detection Algorithms
Review Paper A Review on Overlapping Community Detection Algorithms Authors 1 G.T.Prabavathi*, 2 Dr. V. Thiagarasu Address For correspondence: 1 Asst Professor in Computer Science, Gobi Arts & Science
More informationCommunity detection. Leonid E. Zhukov
Community detection Leonid E. Zhukov School of Data Analysis and Artificial Intelligence Department of Computer Science National Research University Higher School of Economics Network Science Leonid E.
More informationOutlier edge detection using random graph generation models and applications
Tampere University of Technology Outlier edge detection using random graph generation models and applications Citation Zhang, H., Kiranyaz, S., & Gabbouj, M. (2017). Outlier edge detection using random
More informationCommunity Analysis. Chapter 6
This chapter is from Social Media Mining: An Introduction. By Reza Zafarani, Mohammad Ali Abbasi, and Huan Liu. Cambridge University Press, 2014. Draft version: April 20, 2014. Complete Draft and Slides
More informationPV211: Introduction to Information Retrieval https://www.fi.muni.cz/~sojka/pv211
PV: Introduction to Information Retrieval https://www.fi.muni.cz/~sojka/pv IIR 6: Flat Clustering Handout version Petr Sojka, Hinrich Schütze et al. Faculty of Informatics, Masaryk University, Brno Center
More informationGraph analytics approach to analyse Enterprise Architecture models
Nikhitha Rajashekar nikhita.rajashekar@rwth-aachen.de Graph analytics approach to analyse Enterprise Architecture models Master Thesis Proposal Supervisor: Simon Hacks Overview 1. Enterprise Architecture
More informationClustering. Robert M. Haralick. Computer Science, Graduate Center City University of New York
Clustering Robert M. Haralick Computer Science, Graduate Center City University of New York Outline K-means 1 K-means 2 3 4 5 Clustering K-means The purpose of clustering is to determine the similarity
More informationRouting Outline. EECS 122, Lecture 15
Fall & Walrand Lecture 5 Outline EECS, Lecture 5 Kevin Fall kfall@cs.berkeley.edu Jean Walrand wlr@eecs.berkeley.edu Definition/Key Questions Distance Vector Link State Comparison Variations EECS - Fall
More informationV4 Matrix algorithms and graph partitioning
V4 Matrix algorithms and graph partitioning - Community detection - Simple modularity maximization - Spectral modularity maximization - Division into more than two groups - Other algorithms for community
More information