Extracting Communities from Networks
1 Extracting Communities from Networks. Ji Zhu, Department of Statistics, University of Michigan. Joint work with Yunpeng Zhao and Elizaveta Levina.
2 Outline: Review of community detection; Community extraction; Asymptotic consistency; Simulation study; Real data analysis; Future work
3 Network data: Network analysis has been a focus of attention in different fields. Social science: friendship networks. Internet: computer networks, hyperlinks. Biology: food webs, gene regulatory networks.
4 The mathematical representation of a network: Given a network $N = (V, E)$, where $V$ is the set of nodes and $E$ is the set of edges, we can represent $N$ by an adjacency matrix $A$ with $A_{ij} = 1$ if there is an edge from node $i$ to node $j$ and $A_{ij} = 0$ otherwise. $A$ can be either symmetric (for an undirected network) or asymmetric (for a directed network).
5 Community detection. Communities: Networks consist of communities, or clusters, with many connections within a community and few connections between communities. Community detection problem: For an undirected network $N = (V, E)$, the community detection problem is typically formulated as finding a partition $V = V_1 \cup \cdots \cup V_K$ which gives tight communities in some suitable sense.
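As a minimal sketch of the adjacency-matrix representation, the following builds $A$ from an edge list. The function name `adjacency_matrix`, the 0-based node labels, and the toy edge list are illustrative assumptions, not part of the slides.

```python
import numpy as np

def adjacency_matrix(n, edges, directed=False):
    """Build the n x n 0/1 adjacency matrix A from a list of (i, j) edges."""
    A = np.zeros((n, n), dtype=int)
    for i, j in edges:
        A[i, j] = 1          # edge from node i to node j
        if not directed:
            A[j, i] = 1      # undirected network: mirror the entry
    return A

# Hypothetical toy network on 4 nodes (labels 0..3)
edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
A_undirected = adjacency_matrix(4, edges)
A_directed = adjacency_matrix(4, edges, directed=True)
```

With `directed=False` the result is symmetric; with `directed=True` only the entry for the stated direction is set.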
6 Example: a school friendship network (colors represent grades)
7 Community detection problem. Existing community detection methods minimize links between communities while maximizing links within communities (Newman, 2004). For simplicity, we consider the case of partitioning the network into two communities $V_1$ and $V_2$.
8 Min-cut: To minimize $R = \sum_{i \in V_1, j \in V_2} A_{ij}$. However, min-cut always yields a trivial solution of $V_1 = V$ or $V_2 = V$, for which $R = 0$.
9 Ratio-cut (Wei and Cheng, 1989): To minimize $\frac{R}{|V_1| |V_2|}$, where $|V_1|$ and $|V_2|$ are the sizes of the two communities. Ratio-cut can avoid trivial solutions because $|V_1| |V_2|$ is maximized at $|V_1| = |V_2| = |V|/2$.
10 Normalized-cut (Shi and Malik, 2000): To minimize $\frac{R}{\mathrm{assoc}(V_1, V)} + \frac{R}{\mathrm{assoc}(V_2, V)}$, where $\mathrm{assoc}(V_k, V) = \sum_{i \in V_k, j \in V} A_{ij}$ for $k = 1, 2$. Normalized-cut can avoid trivial solutions because an extremely small group $V_k$ may have a large ratio $R / \mathrm{assoc}(V_k, V)$.
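The three cut criteria can be compared on a small example. This is a sketch under assumptions: the function name `cut_criteria` and the toy graph (two triangles joined by a single edge) are hypothetical, and the function does not handle the trivial split, where the ratios are undefined.

```python
import numpy as np

def cut_criteria(A, in_v1):
    """Return (R, ratio_cut, normalized_cut) for the split V1 / V2.

    A     : symmetric 0/1 adjacency matrix
    in_v1 : boolean sequence, True for nodes placed in V1
    """
    in_v1 = np.asarray(in_v1, dtype=bool)
    in_v2 = ~in_v1
    R = A[np.ix_(in_v1, in_v2)].sum()      # links between V1 and V2
    n1, n2 = in_v1.sum(), in_v2.sum()      # community sizes
    assoc1 = A[in_v1, :].sum()             # assoc(V1, V): total degree of V1
    assoc2 = A[in_v2, :].sum()             # assoc(V2, V): total degree of V2
    return R, R / (n1 * n2), R / assoc1 + R / assoc2

# Hypothetical toy graph: triangles {0,1,2} and {3,4,5} joined by edge (2,3)
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6), dtype=int)
for i, j in edges:
    A[i, j] = A[j, i] = 1

balanced = cut_criteria(A, [True, True, True, False, False, False])
lopsided = cut_criteria(A, [True, False, False, False, False, False])
```

On this graph the balanced split has a smaller ratio-cut and normalized-cut value than the split that isolates node 0, which is the behavior the size and degree normalizations are meant to produce.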
11 Modularity (Newman and Girvan, 2004): To maximize $Q = \sum_{k=1}^{2} \left[ \frac{O_{kk}}{L} - \left( \frac{D_k}{L} \right)^2 \right]$, where $O_{kk} = \sum_{i \in V_k, j \in V_k} A_{ij}$, $D_k = \sum_{i \in V_k, j \in V} A_{ij}$, and $L = \sum_{k=1}^{2} D_k$. $Q$ represents the fraction of edges that fall within communities, minus the expected value of the same quantity if edges fall at random given the degree of each node.
12 Limitation of traditional community detection methods: Many real-world networks contain background nodes that do not belong to any community. Traditional graph partitioning methods have difficulty in this situation, because a partition must assign every node to some community.
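A short sketch of the two-community modularity computation, assuming the normalized form $Q = \sum_k [O_{kk}/L - (D_k/L)^2]$. The function name `modularity_two_groups` and the toy graph are illustrative assumptions.

```python
import numpy as np

def modularity_two_groups(A, in_v1):
    """Modularity Q of the split V1 / V2 for a symmetric 0/1 matrix A."""
    in_v1 = np.asarray(in_v1, dtype=bool)
    L = A.sum()                                # L = D_1 + D_2, the total degree
    Q = 0.0
    for mask in (in_v1, ~in_v1):
        O_kk = A[np.ix_(mask, mask)].sum()     # links inside community k
        D_k = A[mask, :].sum()                 # total degree of community k
        Q += O_kk / L - (D_k / L) ** 2
    return Q

# Hypothetical toy graph: triangles {0,1,2} and {3,4,5} joined by edge (2,3)
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6), dtype=int)
for i, j in edges:
    A[i, j] = A[j, i] = 1

q_balanced = modularity_two_groups(A, [True, True, True, False, False, False])
q_lopsided = modularity_two_groups(A, [True, False, False, False, False, False])
```

Here the split along the bridge gives $Q = 5/14$, while isolating node 0 gives a negative value.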
13 Example: a school friendship network
14 Outline: Review of community detection; Community extraction; Asymptotic consistency; Simulation study; Real data analysis; Future work
15 Community extraction: Most networks consist of a number (not known a priori) of communities, with relatively tight links within each community and sparse links to the outside, and background nodes that only have sparse links to other nodes. We propose a method that extracts communities sequentially: at each step, the tightest community is extracted from the network, until no more meaningful communities exist.
16 Criterion: Extract one community at a time by looking for a set of nodes with a large number of links within itself and a small number of links to the rest of the network. The links within the complement of this set do not matter. To maximize $W(S) = \frac{I(S)}{m^2} - \frac{B(S)}{m(n-m)}$, where $I(S) = \sum_{i, j \in S} A_{ij}$, $B(S) = \sum_{i \in S, j \in S^c} A_{ij}$, $m = |S|$, and $n$ is the number of nodes.
17 Adjusted criterion: Empirically, the previous criterion performs well for dense networks. However, it tends to find very small communities for sparse networks. To avoid small communities, we also propose to maximize $W_a(S) = m(n-m) \left( \frac{I(S)}{m^2} - \frac{B(S)}{m(n-m)} \right)$. The factor $m(n-m)$ penalizes communities with $m$ close to 1 or $n$ and encourages more balanced solutions.
18 Eigen-decomposition approximation: Let $D = \mathrm{diag}\left( \sum_j A_{ij} \right)$ and $H = nA - mD$. Then, for fixed community size $m$, the adjusted criterion is equivalent to $\max_{s^T s = m,\ s_i \in \{0,1\}} s^T H s$, where $s_i = 1$ if node $i$ belongs to $S$ and $s_i = 0$ otherwise. Perform eigen-decomposition by relaxing $s$ to a real vector.
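The unadjusted and adjusted extraction criteria can be evaluated directly from the adjacency matrix. The sketch below assumes an undirected 0/1 matrix, so that $I(S)$ counts each within-set edge twice, as in the double sum over $i, j \in S$. The function name `extraction_criteria` and the toy graph (a triangle attached to a sparse path) are hypothetical.

```python
import numpy as np

def extraction_criteria(A, in_s):
    """Return (W, W_a) for the candidate community S.

    A    : symmetric 0/1 adjacency matrix
    in_s : boolean sequence, True for nodes in S (S must be nonempty and
           must not contain every node)
    """
    in_s = np.asarray(in_s, dtype=bool)
    n = A.shape[0]
    m = int(in_s.sum())
    I_S = A[np.ix_(in_s, in_s)].sum()      # links within S
    B_S = A[np.ix_(in_s, ~in_s)].sum()     # links from S to its complement
    W = I_S / m**2 - B_S / (m * (n - m))   # unadjusted criterion
    W_a = m * (n - m) * W                  # adjusted criterion
    return W, W_a

# Hypothetical toy graph: triangle {0,1,2} attached to the path 2-3-4-5
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (4, 5)]
A = np.zeros((6, 6), dtype=int)
for i, j in edges:
    A[i, j] = A[j, i] = 1

W_tri, Wa_tri = extraction_criteria(A, [1, 1, 1, 0, 0, 0])    # S = {0, 1, 2}
W_pair, Wa_pair = extraction_criteria(A, [1, 1, 0, 0, 0, 0])  # S = {0, 1}
```

On this example the full triangle scores higher than the pair $\{0, 1\}$ under both criteria.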
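The equivalence stated on this slide can be checked in a few lines. The following derivation is a reconstruction, assuming an undirected network and writing $d_i = \sum_j A_{ij}$ for the degree of node $i$ (a symbol introduced here for convenience).

```latex
% Because s_i \in \{0,1\}, the quadratic forms count links:
%   s^T A s = I(S),   s^T D s = \sum_{i \in S} d_i = I(S) + B(S).
\begin{align*}
W_a(S) &= m(n-m)\left(\frac{I(S)}{m^2} - \frac{B(S)}{m(n-m)}\right)
        = \frac{n-m}{m}\, I(S) - B(S) \\
       &= \frac{n}{m}\, I(S) - \bigl(I(S) + B(S)\bigr)
        = \frac{1}{m}\left(n\, s^T A s - m\, s^T D s\right)
        = \frac{1}{m}\, s^T H s .
\end{align*}
```

For fixed $m$ the factor $1/m$ is constant, so maximizing $W_a(S)$ over sets of size $m$ is the same as maximizing $s^T H s$ subject to $s^T s = m$ and $s_i \in \{0, 1\}$.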
19 Algorithm: Tabu Search (Glover, 1986; Glover and Laguna, 1997), a local optimization technique based on label switching. Use the eigen-decomposition result as an initial value. Run the algorithm over many random orderings of the nodes.
20 Outline: Review of community detection; Community extraction; Asymptotic consistency; Simulation study; Real data analysis; Future work
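The following is a simplified sketch of a tabu-style label-switching search started from the eigenvector of $H = nA - mD$; it is not the authors' implementation. The function names, the `tenure` and `n_iter` parameters, and the toy graph are assumptions, and nodes are scanned in a fixed order rather than in the random orders mentioned on the slide.

```python
import numpy as np

def adjusted_criterion(A, in_s):
    """W_a(S) = m(n-m) * (I(S)/m^2 - B(S)/(m(n-m))); -inf for trivial sets."""
    n = A.shape[0]
    m = int(in_s.sum())
    if m == 0 or m == n:
        return -np.inf
    I_S = A[np.ix_(in_s, in_s)].sum()
    B_S = A[np.ix_(in_s, ~in_s)].sum()
    return m * (n - m) * (I_S / m**2 - B_S / (m * (n - m)))

def spectral_start(A, m):
    """Round the leading eigenvector of H = nA - mD to a set of m nodes."""
    n = A.shape[0]
    H = (n * A - m * np.diag(A.sum(axis=1))).astype(float)
    _, eigvecs = np.linalg.eigh(H)          # eigenvalues are returned in ascending order
    v = eigvecs[:, -1]                      # eigenvector of the largest eigenvalue
    candidates = []
    for sign in (1.0, -1.0):                # the eigenvector's sign is arbitrary
        in_s = np.zeros(n, dtype=bool)
        in_s[np.argsort(sign * v)[-m:]] = True
        candidates.append(in_s)
    return max(candidates, key=lambda s: adjusted_criterion(A, s))

def tabu_extract(A, m_init, n_iter=50, tenure=2):
    """Label-switching search; a switched node is frozen for `tenure` steps."""
    n = A.shape[0]
    current = spectral_start(A, m_init)
    best, best_val = current.copy(), adjusted_criterion(A, current)
    tabu_until = np.zeros(n, dtype=int)
    for it in range(1, n_iter + 1):
        move, move_val = None, -np.inf
        for i in range(n):
            if tabu_until[i] >= it:
                continue                    # node i is currently tabu
            current[i] = not current[i]     # tentatively switch node i's label
            val = adjusted_criterion(A, current)
            current[i] = not current[i]     # undo the tentative switch
            if val > move_val:
                move, move_val = i, val
        if move is None:
            continue
        current[move] = not current[move]   # accept the best allowed switch, even if worse
        tabu_until[move] = it + tenure
        if move_val > best_val:
            best, best_val = current.copy(), move_val
    return best, best_val

# Hypothetical toy graph: triangle {0,1,2} attached to the path 2-3-4-5
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (4, 5)]
A = np.zeros((6, 6), dtype=int)
for i, j in edges:
    A[i, j] = A[j, i] = 1

best, best_val = tabu_extract(A, m_init=3)
start_val = adjusted_criterion(A, spectral_start(A, 3))
```

Accepting the best non-tabu switch even when it lowers the criterion is what lets the search leave a local optimum; the best set seen so far is tracked separately and returned.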
21 Block models: Asymptotic consistency can be established under the assumption of block models. General block models: (1) Each node is assigned to a block independently of other nodes, with probability $\pi_k$ for block $k$, $1 \le k \le K$, $\sum_{k=1}^{K} \pi_k = 1$. (2) Given that node $i$ belongs to block $a$ and node $j$ belongs to block $b$, $P[A_{ij} = 1] = p_{ab}$, and all edges are independent. Block models for networks with background: We can define the last block as background by assuming $p_{aK} < p_{bb}$ for all $a = 1, \ldots, K$ and all $b = 1, \ldots, K - 1$.
22 Asymptotic consistency: For simplicity, assume there is only one community and background in the network ($K = 2$ with parameters $p_{11}, p_{12}, p_{22}, \pi$ and $1 - \pi$). Let $c$ denote the true community labels and $\hat{c}^{(n)}$ the estimated labels. Theorem: For any $0 < \pi < 1$, if $p_{11} > p_{12}$, $p_{11} > p_{22}$ and $p_{11} + p_{22} > 2 p_{12}$, the maximizer $\hat{c}^{(n)}$ of both the unadjusted and adjusted criteria satisfies $P[\hat{c}^{(n)} = c] \to 1$ as $n \to \infty$.
23 Key component of the proof: Based on the framework established by Bickel and Chen (2009). Given a proposed label assignment $s$, let $R$ be the confusion matrix with $R_{ab}(s, c) = \frac{1}{n} \sum_{i=1}^{n} I(s_i = a, c_i = b)$. The population version of the criterion can be written as a function of the confusion matrix. Key condition: The population version of the criterion is maximized by the correct confusion matrix $\mathrm{diag}(\pi, 1 - \pi)$.
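A block model with background can be simulated in a few lines. This sketch assumes an undirected network without self-loops (the slide does not say how the diagonal is treated); the function name `simulate_block_model`, the seed, and the parameter values are illustrative.

```python
import numpy as np

def simulate_block_model(n, pi, P, seed=0):
    """Draw block labels c and a symmetric 0/1 adjacency matrix A.

    pi : block probabilities (sum to 1); the last block plays the role of background
    P  : K x K symmetric matrix with P[a, b] = p_ab
    """
    rng = np.random.default_rng(seed)
    pi, P = np.asarray(pi), np.asarray(P)
    c = rng.choice(len(pi), size=n, p=pi)             # independent block assignment
    prob = P[np.ix_(c, c)]                            # prob[i, j] = p_{c_i c_j}
    upper = np.triu(rng.random((n, n)) < prob, k=1)   # independent edges for i < j
    A = (upper | upper.T).astype(int)                 # symmetrize, zero diagonal
    return A, c

# One community (block 0) plus background (block 1): p_12 and p_22 are below p_11
A, c = simulate_block_model(300, pi=[0.2, 0.8], P=[[0.5, 0.05], [0.05, 0.05]])
dens_comm = A[np.ix_(c == 0, c == 0)].mean()   # empirical density inside the community
dens_back = A[np.ix_(c == 1, c == 1)].mean()   # empirical density inside the background
```

The two empirical densities give a quick check that the simulated community is visibly tighter than the background.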
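The confusion matrix $R_{ab}(s, c)$ is straightforward to compute from two label vectors. In this sketch labels are 0-based (0 for the community, 1 for the background), which is an assumption made for array indexing; the slides number the blocks from 1.

```python
import numpy as np

def confusion_matrix(s, c, K=2):
    """R[a, b] = fraction of nodes with proposed label a and true label b."""
    s, c = np.asarray(s), np.asarray(c)
    R = np.zeros((K, K))
    for a in range(K):
        for b in range(K):
            R[a, b] = np.mean((s == a) & (c == b))
    return R

c_true = [0, 0, 1, 1, 1]        # hypothetical true labels: 2 community nodes, 3 background
s_good = [0, 0, 1, 1, 1]        # proposed labels that agree with the truth
s_bad = [0, 1, 1, 1, 1]         # one community node mislabeled as background

R_good = confusion_matrix(s_good, c_true)   # diagonal: diag(0.4, 0.6)
R_bad = confusion_matrix(s_bad, c_true)     # mass moves off the diagonal
```

A perfect assignment puts all mass on the diagonal, matching the form $\mathrm{diag}(\pi, 1 - \pi)$ with the empirical proportion in place of $\pi$.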
24 Outline: Review of community detection; Community extraction; Asymptotic consistency; Simulation study; Real data analysis; Future work
25 Simulation I: Two pure communities (no background). $n = 1000$; $n_1 = 100, 200, 300$; $p_{11} = 0.5$, $p_{22} = 0.4$, $p_{12} = 0.05$.
26 Evaluation: Let $S$ be the extracted set and $C_S$ be the true community that best matches $S$. PPV (purity): $\mathrm{PPV} = \frac{|C_S \cap S|}{|S|}$. NPV (completeness): $\mathrm{NPV} = 1 - \frac{|C_S \cap S^c|}{|S^c|}$.
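A small sketch of the two evaluation measures, taking $C_S$ as given rather than searching for the best-matching community. The function name `ppv_npv` and the example sets are hypothetical.

```python
def ppv_npv(extracted, true_community, n):
    """Return (PPV, NPV) for an extracted set S against a true community C_S.

    extracted, true_community : iterables of node labels in 0..n-1
    n                         : total number of nodes
    """
    S = set(extracted)
    C = set(true_community)
    S_c = set(range(n)) - S                 # complement of the extracted set
    ppv = len(C & S) / len(S)               # share of S that belongs to C_S
    npv = 1 - len(C & S_c) / len(S_c)       # share of S^c that lies outside C_S
    return ppv, npv

# Hypothetical example: 10 nodes, true community {0,1,2,3}, extracted set {0,1,2,4,5}
ppv, npv = ppv_npv({0, 1, 2, 4, 5}, {0, 1, 2, 3}, n=10)
```

In this example three of the five extracted nodes are in the true community (PPV = 0.6), and one of the five remaining nodes is a missed community member (NPV = 0.8).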
27 Results for simulation I [Table: only the parenthesized entries are preserved]
Method      n1 = 100           n1 = 200           n1 = 300
            PPV      NPV       PPV      NPV       PPV    NPV
Modularity  (0.032)  (0)       (0.000)  (0)       (0)    (0)
Unadjusted  (0.000)  (0)       (0.000)  (0)       (0)    (0)
Adjusted    (0.058)  (0)       (0.003)  (0)       (0)    (0)
28 Simulation II: One community with background. $n = 1000$; $n_1 = 100, 200, 300$; $p_{12} = 0.05$, $p_{22} = 0.05$; $p_{11} = 0.1, 0.15, 0.2$.
29 Results of simulation II [Table: PPV and NPV for $p_{11} = 0.1, 0.15, 0.2$, by community size $n_1$; numeric entries not preserved]
30 Simulation III: Two communities with background. $n = 1000$; $n_1 = 100, 300$; $n_2 = 100, 300$; $p_{12} = p_{23} = p_{13} = p_{33} = 0.05$; $p_{11} = 0.1, 0.15, 0.2$; $p_{22} = 0.08, 0.12, 0.16$.
31 Results for simulation III [Table: PPV and NPV for $(p_{11}, p_{22}) = (0.1, 0.08), (0.15, 0.12), (0.2, 0.16)$, with rows grouped by $n_1 = 100$ and $n_1 = 300$; numeric entries not preserved]
32 Outline: Review of community detection; Community extraction; Asymptotic consistency; Simulation study; Real data analysis; Future work
33 Karate club network: Friendships between 34 members of a karate club (Zachary, 1977). The club subsequently split into two parts following a disagreement between an instructor (node 0) and an administrator (node 33).
34 Karate club network [Figure panels: Community extraction; Modularity]
35 Political books network: Links in the political books network (Newman, 2006) represent pairs of books frequently bought together on amazon.com. Blue: liberal; Red: conservative.
36 Political books network [Figure panels: Community extraction; Modularity]
37 School friendship network: The school friendship network is compiled from the National Longitudinal Study of Adolescent Health (AddHealth). Grade 7: red; Grade 8: blue; Grade 9: green; Grade 10: yellow; Grade 11: purple; Grade 12: orange.
38 School friendship network [Figure panels: Grades; Modularity with 6 communities]
39 School friendship network [Figure panels: Extracting 6 communities; Extracting 7 communities]
40 Future work: Stopping criterion. Adjusted criterion: $W_a(S) = [m(n-m)]^{\alpha} \left( \frac{I(S)}{m^2} - \frac{B(S)}{m(n-m)} \right)$.
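The exponent $\alpha$ interpolates between the two criteria used earlier: $\alpha = 0$ gives the unadjusted $W(S)$ and $\alpha = 1$ gives the adjusted $W_a(S)$. A minimal sketch, with a hypothetical function name and toy graph:

```python
import numpy as np

def generalized_criterion(A, in_s, alpha):
    """[m(n-m)]^alpha * (I(S)/m^2 - B(S)/(m(n-m))) for a nontrivial set S."""
    in_s = np.asarray(in_s, dtype=bool)
    n = A.shape[0]
    m = int(in_s.sum())
    I_S = A[np.ix_(in_s, in_s)].sum()      # links within S
    B_S = A[np.ix_(in_s, ~in_s)].sum()     # links from S to its complement
    return (m * (n - m)) ** alpha * (I_S / m**2 - B_S / (m * (n - m)))

# Hypothetical toy graph: triangle {0,1,2} attached to the path 2-3-4-5
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (4, 5)]
A = np.zeros((6, 6), dtype=int)
for i, j in edges:
    A[i, j] = A[j, i] = 1

S = [1, 1, 1, 0, 0, 0]
values = {alpha: generalized_criterion(A, S, alpha) for alpha in (0.0, 0.5, 1.0)}
```

Intermediate values of $\alpha$ apply a weaker size penalty than the fully adjusted criterion; how to choose $\alpha$ is left open on the slide.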
More informationStatistical Physics of Community Detection
Statistical Physics of Community Detection Keegan Go (keegango), Kenji Hata (khata) December 8, 2015 1 Introduction Community detection is a key problem in network science. Identifying communities, defined
More informationMOURAD BAÏOU AND FRANCISCO BARAHONA
THE p-median POLYTOPE OF RESTRICTED Y-GRAPHS MOURAD BAÏOU AND FRANCISCO BARAHONA Abstract We further study the effect of odd cycle inequalities in the description of the polytopes associated with the p-median
More informationUnderstanding complex networks with community-finding algorithms
Understanding complex networks with community-finding algorithms Eric D. Kelsic 1 SURF 25 Final Report 1 California Institute of Technology, Pasadena, CA 91126, USA (Dated: November 1, 25) In a complex
More informationPaths, Circuits, and Connected Graphs
Paths, Circuits, and Connected Graphs Paths and Circuits Definition: Let G = (V, E) be an undirected graph, vertices u, v V A path of length n from u to v is a sequence of edges e i = {u i 1, u i} E for
More informationXLVI Pesquisa Operacional na Gestão da Segurança Pública
Setembro de 014 Approximation algorithms for simple maxcut of split graphs Rubens Sucupira 1 Luerbio Faria 1 Sulamita Klein 1- IME/UERJ UERJ, Rio de JaneiroRJ, Brasil rasucupira@oi.com.br, luerbio@cos.ufrj.br
More informationTargil 12 : Image Segmentation. Image segmentation. Why do we need it? Image segmentation
Targil : Image Segmentation Image segmentation Many slides from Steve Seitz Segment region of the image which: elongs to a single object. Looks uniform (gray levels, color ) Have the same attributes (texture
More informationDiscrete mathematics , Fall Instructor: prof. János Pach
Discrete mathematics 2016-2017, Fall Instructor: prof. János Pach - covered material - Lecture 1. Counting problems To read: [Lov]: 1.2. Sets, 1.3. Number of subsets, 1.5. Sequences, 1.6. Permutations,
More informationApproximability Results for the p-center Problem
Approximability Results for the p-center Problem Stefan Buettcher Course Project Algorithm Design and Analysis Prof. Timothy Chan University of Waterloo, Spring 2004 The p-center
More informationIMO Training 2010 Double Counting Victoria Krakovna. Double Counting. Victoria Krakovna
Double Counting Victoria Krakovna vkrakovna@gmail.com 1 Introduction In many combinatorics problems, it is useful to count a quantity in two ways. Let s start with a simple example. Example 1. (Iran 2010
More informationCommunity Mining in Signed Networks: A Multiobjective Approach
Community Mining in Signed Networks: A Multiobjective Approach Alessia Amelio National Research Council of Italy (CNR) Inst. for High Perf. Computing and Networking (ICAR) Via Pietro Bucci, 41C 87036 Rende
More informationGraphs and Network Flows IE411. Lecture 21. Dr. Ted Ralphs
Graphs and Network Flows IE411 Lecture 21 Dr. Ted Ralphs IE411 Lecture 21 1 Combinatorial Optimization and Network Flows In general, most combinatorial optimization and integer programming problems are
More information