CS 5614: (Big) Data Management Systems. B. Aditya Prakash Lecture #21: Graph Mining 2

Size: px
Start display at page:

Download "CS 5614: (Big) Data Management Systems. B. Aditya Prakash Lecture #21: Graph Mining 2"

Transcription

1 CS 5614: (Big) Data Management Systems B. Aditya Prakash Lecture #21: Graph Mining 2

2 Networks & Communi>es We think of networks being organized into modules, cluster, communi>es: VT CS

3 Goal: Find Densely Linked Clusters VT CS

4 Micro-Markets in Sponsored Search Find micro-markets by par>>oning the query-toadver>ser graph: query advertiser [Andersen, Lang: Communities from seed sets, 2006] VT CS

5 Movies and Actors Clusters in Movies-to-Actors graph: [Andersen, Lang: Communities from seed sets, 2006] VT CS

6 TwiRer & Facebook Discovering social circles, circles of trust: [McAuley, Leskovec: Discovering social circles in ego networks, 2012] VT CS

7 We will work with undirected (unweighted) networks How to find communi>es? COMMUNITY DETECTION VT CS

8 Method 1: Strength of Weak Ties Edge betweenness: Number of shortest paths passing over the edge Intui>on: b=16 b=7.5 Edge strengths (call volume) in a real network Edge betweenness in a real network VT CS

9 [Girvan-Newman 02] Method 1: Girvan-Newman Divisive hierarchical clustering based on the nohon of edge betweenness: Number of shortest paths passing through the edge Girvan-Newman Algorithm: Undirected unweighted networks Repeat un>l no edges are Calculate betweenness of edges Remove edges with highest betweenness Connected components are communihes Gives a hierarchical decomposihon of the network VT CS

10 Girvan-Newman: Example Need to re-compute betweenness at every step VT CS

11 Girvan-Newman: Example Step 1: Step 2: Step 3: Hierarchical network decomposition: VT CS

12 Girvan-Newman: Results CommuniHes in physics collaborahons VT CS

13 Girvan-Newman: Results Zachary s Karate club: Hierarchical decomposihon VT CS

14 WE NEED TO RESOLVE 2 QUESTIONS 1. How to compute betweenness? 2. How to select the number of clusters? VT CS

15 How to Compute Betweenness? Want to compute betweenness of paths star>ng at node A Breath first search star>ng from A: VT CS

16 How to Compute Betweenness? Count the number of shortest paths from A to all other nodes of the network: VT CS

17 How to Compute Betweenness? Compute betweenness by working up the tree: If there are mulhple paths count them frachonally The algorithm: Add edge flows: -- node flow = 1+ child edges -- split the flow up based on the parent value Repeat the BFS procedure for each starting node U 1 path to K. Split evenly paths to J Split 1:2 1+1 paths to H Split evenly VT CS

18 How to Compute Betweenness? Compute betweenness by working up the tree: If there are mulhple paths count them frachonally The algorithm: Add edge flows: -- node flow = 1+ child edges -- split the flow up based on the parent value Repeat the BFS procedure for each starting node U 1 path to K. Split evenly paths to J Split 1:2 1+1 paths to H Split evenly VT CS

19 WE NEED TO RESOLVE 2 QUESTIONS 1. How to compute betweenness? 2. How to select the number of clusters? VT CS

20 Network Communi>es Communi>es: sets of >ghtly connected nodes Define: Modularity Q A measure of how well a network is parhhoned into communihes Given a parhhoning of the network into groups s S: Q s S [ (# edges within group s) (expected # edges within group s) ] VT CS 5614 Need a null model! 20

21 Null Model: Configura>on Model VT CS

22 Modularity VT CS

23 Modularity: Number of clusters Modularity is useful for selec>ng the number of clusters: Q VT CS

24 SPECTRAL CLUSTERING VT CS

25 Graph Par>>oning Undirected graph Bi-par>>oning task: Divide verhces into two disjoint groups A Ques>ons: How can we define a good parhhon of? How can we efficiently idenhfy such a parhhon? B VT CS

26 Graph Par>>oning What makes a good par>>on? Maximize the number of within-group connechons Minimize the number of between-group connechons A B VT CS

27 Graph Cuts VT CS

28 Graph Cut Criterion Criterion: Minimum-cut Minimize weight of connechons between groups arg min A,B cut(a,b) Degenerate case: Optimal cut Minimum cut Problem: Only considers external cluster connechons Does not consider internal cluster connechvity VT CS

29 [Shi-Malik] Graph Cut Criteria Criterion: Normalized-cut [Shi-Malik, 97] ConnecHvity between groups relahve to the density of each group cut( A, B) ncut ( A, B) = + vol( A) cut( A, B) vol( B) : total weight of the edges with at least one endpoint in : n Why use this criterion? n Produces more balanced parhhons How do we efficiently find a good par>>on? Problem: CompuHng ophmal cut is NP-hard VT CS

30 Spectral Graph Par>>oning A: adjacency matrix of undirected G A ij =1 if is an edge, else 0 x is a vector in R n with components Think of it as a label/value of each node of What is the meaning of A x? Entry y i is a sum of labels x j of neighbors of i VT CS = n n nn n n y y x x a a a a!!!! = = = E j i j n j ij i x xj A y ), ( 1

31 What is the meaning of Ax? j th coordinate of A x : Sum of the x-values of neighbors of j Make this a new value at node j Spectral Graph Theory: a! a Analyze the spectrum of matrix represenhng x! x Spectrum: Eigenvectors of a graph, ordered by the magnitude (strength) of their corresponding eigenvalues : Λ = 11 n1 a1 n 1 1 { 1 2 n VT CS a! λ, λ,..., λ } λ λ λ n nn A x=λ x n = x λ! x n

32 Example: d-regular graph VT CS

33 d is the largest eigenvalue of A Details! VT CS

34 Example: Graph on 2 components VT CS

35 More Intui>on VT CS

36 Matrix Representa>ons Adjacency matrix (A): n n matrix A=[a ij ], a ij =1 if edge between node i and j Important proper>es: Symmetric matrix Eigenvectors are real and orthogonal VT CS

37 Matrix Representa>ons Degree matrix (D): n n diagonal matrix D=[d ii ], d ii = degree of node i VT CS

38 Matrix Representa>ons Laplacian matrix (L): n n symmetric matrix What is trivial eigenpair? then and so Important proper>es: Eigenvalues are non-negahve real numbers Eigenvectors are real and orthogonal L = D A VT CS

39 Details! Facts about the Laplacian L VT CS

40 λ 2 as op>miza>on problem λ 2 = min x x T x M x T x VT CS

41 Proof VT CS

42 λ 2 as op>miza>on problem VT CS

43 Find Op>mal Cut [Fiedler 73] VT CS

44 Rayleigh Theorem VT CS

45 Details! Approx. Guarantee of Spectral VT CS

46 Details! Approx. Guarantee of Spectral VT CS

47 Details! Approx. Guarantee of Spectral VT CS

48 So far How to define a good par>>on of a graph? Minimize a given graph cut criterion How to efficiently iden>fy such a par>>on? Approximate using informahon provided by the eigenvalues and eigenvectors of a graph Spectral Clustering VT CS

49 Spectral Clustering Algorithms Three basic stages: 1) Pre-processing Construct a matrix representahon of the graph 2) Decomposi>on Compute eigenvalues and eigenvectors of the matrix Map each point to a lower-dimensional representahon based on one or more eigenvectors 3) Grouping Assign points to two or more clusters, based on the new representahon VT CS

50 Spectral Par>>oning Algorithm 1) Pre-processing: Build Laplacian matrix L of the graph ) Decomposi>on: Find eigenvalues λ and eigenvectors x of the matrix L λ= X = Map verhces to How do we now corresponding find the clusters? components of λ VT CS

51 Spectral Par>>oning 3) Grouping: Sort components of reduced 1-dimensional vector IdenHfy clusters by splihng the sorted vector in two How to choose a splisng point? Naïve approaches: Split at 0 or median value More expensive approaches: Ajempt to minimize normalized cut in 1-dimension (sweep over ordering of nodes induced by the eigenvector) Split at 0: Cluster A: PosiHve points Cluster B: NegaHve points A B VT CS

52 Example: Spectral Par>>oning Value of x 2 Rank in x 2 VT CS

53 Example: Spectral Par>>oning Components of x 2 Value of x 2 Rank in x 2 VT CS

54 Example: Spectral par>>oning Components of x 1 Components of x 3 VT CS

55 k-way Spectral Clustering How do we par>>on a graph into k clusters? Two basic approaches: Recursive bi-par>>oning [Hagen et al., 92] Recursively apply bi-parhhoning algorithm in a hierarchical divisive manner Disadvantages: Inefficient, unstable Cluster mul>ple eigenvectors [Shi-Malik, 00] Build a reduced space from mulhple eigenvectors Commonly used in recent papers A preferable approach VT CS

56 Why use mul>ple eigenvectors? Approximates the op>mal cut [Shi-Malik, 00] Can be used to approximate ophmal k-way normalized cut Emphasizes cohesive clusters Increases the unevenness in the distribuhon of the data AssociaHons between similar points are amplified, associahons between dissimilar points are ajenuated The data begins to approximate a clustering Well-separated space Transforms data to a new embedded space, consishng of k orthogonal basis vectors MulHple eigenvectors prevent instability due to informahon loss VT CS

57 ANALYSIS OF LARGE GRAPHS: TRAWLING VT CS

58 [Kumar et al. 99] Trawling Searching for small communi>es in the Web graph What is the signature of a community / discussion in a Web graph? Use this to define topics : What the same people on the left talk about on the right Remember HITS! Dense 2-layer graph Intuition: Many people all talking about the same things VT CS

59 Searching for Small Communi>es A more well-defined problem: Enumerate complete biparhte subgraphs K s,t Where K s,t : s nodes on the lem where each links to the same t other nodes on the right X Fully connected K 3,4 Y X = s = 3 Y = t = 4 VT CS

60 [Agrawal-Srikant 99] Frequent Itemset Enumera>on Market basket analysis. Sehng: Market: Universe U of n items Baskets: m subsets of U: S 1, S 2,, S m U (S i is a set of items one person bought) Support: Frequency threshold f Goal: Find all subsets T s.t. T S i of at least f sets S i (items in T were bought together at least f Hmes) What s the connec>on between the itemsets and complete bipar>te graphs? VT CS

61 [Kumar et al. 99] From Itemsets to Bipar>te K s,t Frequent itemsets = complete bipar>te graphs! How? View each node i as a set S i of nodes i points to K s,t = a set Y of size t that occurs in s sets S i S i ={a,b,c,d} Looking for K s,t à set of frequency threshold to s and look at layer t all frequent sets of size t X Y s minimum support ( X =s) t itemset size ( Y =t) VT CS

62 [Kumar et al. 99] From Itemsets to Bipar>te K s,t View each node i as a set S i of nodes i points to Say we find a frequent itemset Y={a,b,c} of supp s So, there are s nodes that link to all of {a,b,c}: S i ={a,b,c,d} Find frequent itemsets: s minimum support t itemset size We found K s,t! K s,t = a set Y of size t that occurs in s sets S i X VT CS Y

63 a c e f Itemsets: a = {b,c,d} b = {d} c = {b,d,e,f} d = {e,f} e = {b,d} f = {} d b Example (1) Support threshold s=2 {b,d}: support 3 {e,f}: support 2 And we just found 2 bipar>te subgraphs: a c e d b VT CS c e f d

64 Example (2) Example of a community from a web graph Nodes on the right Nodes on the left [Kumar, Raghavan, Rajagopalan, Tomkins: Trawling the Web for emerging cyber-communities 1999] VT CS

Mining Social Network Graphs

Mining Social Network Graphs Mining Social Network Graphs Analysis of Large Graphs: Community Detection Rafael Ferreira da Silva rafsilva@isi.edu http://rafaelsilva.com Note to other teachers and users of these slides: We would be

More information

Non Overlapping Communities

Non Overlapping Communities Non Overlapping Communities Davide Mottin, Konstantina Lazaridou HassoPlattner Institute Graph Mining course Winter Semester 2016 Acknowledgements Most of this lecture is taken from: http://web.stanford.edu/class/cs224w/slides

More information

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

CS246: Mining Massive Datasets Jure Leskovec, Stanford University CS246: Mining Massive Datasets Jure Leskovec, Stanford University http://cs246.stanford.edu HITS (Hypertext Induced Topic Selection) Is a measure of importance of pages or documents, similar to PageRank

More information

CS224W: Analysis of Networks Jure Leskovec, Stanford University

CS224W: Analysis of Networks Jure Leskovec, Stanford University CS224W: Analysis of Networks Jure Leskovec, Stanford University http://cs224w.stanford.edu 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, http://cs224w.stanford.edu 2 Observations Models

More information

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

CS246: Mining Massive Datasets Jure Leskovec, Stanford University CS246: Mining Massive Datasets Jure Leskovec, Stanford University http://cs246.stanford.edu SPAM FARMING 2/11/2013 Jure Leskovec, Stanford C246: Mining Massive Datasets 2 2/11/2013 Jure Leskovec, Stanford

More information

CS 6140: Machine Learning Spring 2017

CS 6140: Machine Learning Spring 2017 CS 6140: Machine Learning Spring 2017 Instructor: Lu Wang College of Computer and Informa@on Science Northeastern University Webpage: www.ccs.neu.edu/home/luwang Email: luwang@ccs.neu.edu Logis@cs Grades

More information

Community Detection. Community

Community Detection. Community Community Detection Community In social sciences: Community is formed by individuals such that those within a group interact with each other more frequently than with those outside the group a.k.a. group,

More information

Social-Network Graphs

Social-Network Graphs Social-Network Graphs Mining Social Networks Facebook, Google+, Twitter Email Networks, Collaboration Networks Identify communities Similar to clustering Communities usually overlap Identify similarities

More information

Web Structure Mining Community Detection and Evaluation

Web Structure Mining Community Detection and Evaluation Web Structure Mining Community Detection and Evaluation 1 Community Community. It is formed by individuals such that those within a group interact with each other more frequently than with those outside

More information

CS 534: Computer Vision Segmentation II Graph Cuts and Image Segmentation

CS 534: Computer Vision Segmentation II Graph Cuts and Image Segmentation CS 534: Computer Vision Segmentation II Graph Cuts and Image Segmentation Spring 2005 Ahmed Elgammal Dept of Computer Science CS 534 Segmentation II - 1 Outlines What is Graph cuts Graph-based clustering

More information

Clusters and Communities

Clusters and Communities Clusters and Communities Lecture 7 CSCI 4974/6971 22 Sep 2016 1 / 14 Today s Biz 1. Reminders 2. Review 3. Communities 4. Betweenness and Graph Partitioning 5. Label Propagation 2 / 14 Today s Biz 1. Reminders

More information

CS 534: Computer Vision Segmentation and Perceptual Grouping

CS 534: Computer Vision Segmentation and Perceptual Grouping CS 534: Computer Vision Segmentation and Perceptual Grouping Ahmed Elgammal Dept of Computer Science CS 534 Segmentation - 1 Outlines Mid-level vision What is segmentation Perceptual Grouping Segmentation

More information

Visual Representations for Machine Learning

Visual Representations for Machine Learning Visual Representations for Machine Learning Spectral Clustering and Channel Representations Lecture 1 Spectral Clustering: introduction and confusion Michael Felsberg Klas Nordberg The Spectral Clustering

More information

Extracting Information from Complex Networks

Extracting Information from Complex Networks Extracting Information from Complex Networks 1 Complex Networks Networks that arise from modeling complex systems: relationships Social networks Biological networks Distinguish from random networks uniform

More information

Big Data Analytics. Special Topics for Computer Science CSE CSE Feb 11

Big Data Analytics. Special Topics for Computer Science CSE CSE Feb 11 Big Data Analytics Special Topics for Computer Science CSE 4095-001 CSE 5095-005 Feb 11 Fei Wang Associate Professor Department of Computer Science and Engineering fei_wang@uconn.edu Clustering II Spectral

More information

Modularity CMSC 858L

Modularity CMSC 858L Modularity CMSC 858L Module-detection for Function Prediction Biological networks generally modular (Hartwell+, 1999) We can try to find the modules within a network. Once we find modules, we can look

More information

Lesson 2 7 Graph Partitioning

Lesson 2 7 Graph Partitioning Lesson 2 7 Graph Partitioning The Graph Partitioning Problem Look at the problem from a different angle: Let s multiply a sparse matrix A by a vector X. Recall the duality between matrices and graphs:

More information

Spectral Clustering X I AO ZE N G + E L HA M TA BA S SI CS E CL A S S P R ESENTATION MA RCH 1 6,

Spectral Clustering X I AO ZE N G + E L HA M TA BA S SI CS E CL A S S P R ESENTATION MA RCH 1 6, Spectral Clustering XIAO ZENG + ELHAM TABASSI CSE 902 CLASS PRESENTATION MARCH 16, 2017 1 Presentation based on 1. Von Luxburg, Ulrike. "A tutorial on spectral clustering." Statistics and computing 17.4

More information

Online Social Networks and Media. Community detection

Online Social Networks and Media. Community detection Online Social Networks and Media Community detection 1 Notes on Homework 1 1. You should write your own code for generating the graphs. You may use SNAP graph primitives (e.g., add node/edge) 2. For the

More information

CSE 258 Lecture 6. Web Mining and Recommender Systems. Community Detection

CSE 258 Lecture 6. Web Mining and Recommender Systems. Community Detection CSE 258 Lecture 6 Web Mining and Recommender Systems Community Detection Dimensionality reduction Goal: take high-dimensional data, and describe it compactly using a small number of dimensions Assumption:

More information

Introduction to spectral clustering

Introduction to spectral clustering Introduction to spectral clustering Vasileios Zografos zografos@isy.liu.se Klas Nordberg klas@isy.liu.se What this course is Basic introduction into the core ideas of spectral clustering Sufficient to

More information

Jure Leskovec, Cornell/Stanford University. Joint work with Kevin Lang, Anirban Dasgupta and Michael Mahoney, Yahoo! Research

Jure Leskovec, Cornell/Stanford University. Joint work with Kevin Lang, Anirban Dasgupta and Michael Mahoney, Yahoo! Research Jure Leskovec, Cornell/Stanford University Joint work with Kevin Lang, Anirban Dasgupta and Michael Mahoney, Yahoo! Research Network: an interaction graph: Nodes represent entities Edges represent interaction

More information

CSE 158 Lecture 6. Web Mining and Recommender Systems. Community Detection

CSE 158 Lecture 6. Web Mining and Recommender Systems. Community Detection CSE 158 Lecture 6 Web Mining and Recommender Systems Community Detection Dimensionality reduction Goal: take high-dimensional data, and describe it compactly using a small number of dimensions Assumption:

More information

CS 140: Sparse Matrix-Vector Multiplication and Graph Partitioning

CS 140: Sparse Matrix-Vector Multiplication and Graph Partitioning CS 140: Sparse Matrix-Vector Multiplication and Graph Partitioning Parallel sparse matrix-vector product Lay out matrix and vectors by rows y(i) = sum(a(i,j)*x(j)) Only compute terms with A(i,j) 0 P0 P1

More information

Mining Social-Network Graphs

Mining Social-Network Graphs 342 Chapter 10 Mining Social-Network Graphs There is much information to be gained by analyzing the large-scale data that is derived from social networks. The best-known example of a social network is

More information

Segmentation and Grouping

Segmentation and Grouping Segmentation and Grouping How and what do we see? Fundamental Problems ' Focus of attention, or grouping ' What subsets of pixels do we consider as possible objects? ' All connected subsets? ' Representation

More information

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

CS246: Mining Massive Datasets Jure Leskovec, Stanford University CS246: Mining Massive Datasets Jure Leskovec, Stanford University http://cs246.stanford.edu [Kumar et al. 99] 2/13/2013 Jure Leskovec, Stanford CS246: Mining Massive Datasets, http://cs246.stanford.edu

More information

Spectral Methods for Network Community Detection and Graph Partitioning

Spectral Methods for Network Community Detection and Graph Partitioning Spectral Methods for Network Community Detection and Graph Partitioning M. E. J. Newman Department of Physics, University of Michigan Presenters: Yunqi Guo Xueyin Yu Yuanqi Li 1 Outline: Community Detection

More information

My favorite application using eigenvalues: partitioning and community detection in social networks

My favorite application using eigenvalues: partitioning and community detection in social networks My favorite application using eigenvalues: partitioning and community detection in social networks Will Hobbs February 17, 2013 Abstract Social networks are often organized into families, friendship groups,

More information

Extracting Communities from Networks

Extracting Communities from Networks Extracting Communities from Networks Ji Zhu Department of Statistics, University of Michigan Joint work with Yunpeng Zhao and Elizaveta Levina Outline Review of community detection Community extraction

More information

Spectral Clustering. Presented by Eldad Rubinstein Based on a Tutorial by Ulrike von Luxburg TAU Big Data Processing Seminar December 14, 2014

Spectral Clustering. Presented by Eldad Rubinstein Based on a Tutorial by Ulrike von Luxburg TAU Big Data Processing Seminar December 14, 2014 Spectral Clustering Presented by Eldad Rubinstein Based on a Tutorial by Ulrike von Luxburg TAU Big Data Processing Seminar December 14, 2014 What are we going to talk about? Introduction Clustering and

More information

Basics of Network Analysis

Basics of Network Analysis Basics of Network Analysis Hiroki Sayama sayama@binghamton.edu Graph = Network G(V, E): graph (network) V: vertices (nodes), E: edges (links) 1 Nodes = 1, 2, 3, 4, 5 2 3 Links = 12, 13, 15, 23,

More information

Clustering Lecture 8. David Sontag New York University. Slides adapted from Luke Zettlemoyer, Vibhav Gogate, Carlos Guestrin, Andrew Moore, Dan Klein

Clustering Lecture 8. David Sontag New York University. Slides adapted from Luke Zettlemoyer, Vibhav Gogate, Carlos Guestrin, Andrew Moore, Dan Klein Clustering Lecture 8 David Sontag New York University Slides adapted from Luke Zettlemoyer, Vibhav Gogate, Carlos Guestrin, Andrew Moore, Dan Klein Clustering: Unsupervised learning Clustering Requires

More information

CSE 255 Lecture 6. Data Mining and Predictive Analytics. Community Detection

CSE 255 Lecture 6. Data Mining and Predictive Analytics. Community Detection CSE 255 Lecture 6 Data Mining and Predictive Analytics Community Detection Dimensionality reduction Goal: take high-dimensional data, and describe it compactly using a small number of dimensions Assumption:

More information

Lecture 6: Unsupervised Machine Learning Dagmar Gromann International Center For Computational Logic

Lecture 6: Unsupervised Machine Learning Dagmar Gromann International Center For Computational Logic SEMANTIC COMPUTING Lecture 6: Unsupervised Machine Learning Dagmar Gromann International Center For Computational Logic TU Dresden, 23 November 2018 Overview Unsupervised Machine Learning overview Association

More information

Lecture 11: Clustering and the Spectral Partitioning Algorithm A note on randomized algorithm, Unbiased estimates

Lecture 11: Clustering and the Spectral Partitioning Algorithm A note on randomized algorithm, Unbiased estimates CSE 51: Design and Analysis of Algorithms I Spring 016 Lecture 11: Clustering and the Spectral Partitioning Algorithm Lecturer: Shayan Oveis Gharan May nd Scribe: Yueqi Sheng Disclaimer: These notes have

More information

Machine Learning for Data Science (CS4786) Lecture 11

Machine Learning for Data Science (CS4786) Lecture 11 Machine Learning for Data Science (CS4786) Lecture 11 Spectral Clustering Course Webpage : http://www.cs.cornell.edu/courses/cs4786/2016fa/ Survey Survey Survey Competition I Out! Preliminary report of

More information

Clustering Algorithms on Graphs Community Detection 6CCS3WSN-7CCSMWAL

Clustering Algorithms on Graphs Community Detection 6CCS3WSN-7CCSMWAL Clustering Algorithms on Graphs Community Detection 6CCS3WSN-7CCSMWAL Contents Zachary s famous example Community structure Modularity The Girvan-Newman edge betweenness algorithm In the beginning: Zachary

More information

Image Segmentation continued Graph Based Methods

Image Segmentation continued Graph Based Methods Image Segmentation continued Graph Based Methods Previously Images as graphs Fully-connected graph node (vertex) for every pixel link between every pair of pixels, p,q affinity weight w pq for each link

More information

MCL. (and other clustering algorithms) 858L

MCL. (and other clustering algorithms) 858L MCL (and other clustering algorithms) 858L Comparing Clustering Algorithms Brohee and van Helden (2006) compared 4 graph clustering algorithms for the task of finding protein complexes: MCODE RNSC Restricted

More information

How and what do we see? Segmentation and Grouping. Fundamental Problems. Polyhedral objects. Reducing the combinatorics of pose estimation

How and what do we see? Segmentation and Grouping. Fundamental Problems. Polyhedral objects. Reducing the combinatorics of pose estimation Segmentation and Grouping Fundamental Problems ' Focus of attention, or grouping ' What subsets of piels do we consider as possible objects? ' All connected subsets? ' Representation ' How do we model

More information

CSCI-B609: A Theorist s Toolkit, Fall 2016 Sept. 6, Firstly let s consider a real world problem: community detection.

CSCI-B609: A Theorist s Toolkit, Fall 2016 Sept. 6, Firstly let s consider a real world problem: community detection. CSCI-B609: A Theorist s Toolkit, Fall 016 Sept. 6, 016 Lecture 03: The Sparsest Cut Problem and Cheeger s Inequality Lecturer: Yuan Zhou Scribe: Xuan Dong We will continue studying the spectral graph theory

More information

Graph drawing in spectral layout

Graph drawing in spectral layout Graph drawing in spectral layout Maureen Gallagher Colleen Tygh John Urschel Ludmil Zikatanov Beginning: July 8, 203; Today is: October 2, 203 Introduction Our research focuses on the use of spectral graph

More information

Lecture Note: Computation problems in social. network analysis

Lecture Note: Computation problems in social. network analysis Lecture Note: Computation problems in social network analysis Bang Ye Wu CSIE, Chung Cheng University, Taiwan September 29, 2008 In this lecture note, several computational problems are listed, including

More information

Spectral Clustering on Handwritten Digits Database

Spectral Clustering on Handwritten Digits Database October 6, 2015 Spectral Clustering on Handwritten Digits Database Danielle dmiddle1@math.umd.edu Advisor: Kasso Okoudjou kasso@umd.edu Department of Mathematics University of Maryland- College Park Advance

More information

6.801/866. Segmentation and Line Fitting. T. Darrell

6.801/866. Segmentation and Line Fitting. T. Darrell 6.801/866 Segmentation and Line Fitting T. Darrell Segmentation and Line Fitting Gestalt grouping Background subtraction K-Means Graph cuts Hough transform Iterative fitting (Next time: Probabilistic segmentation)

More information

Social Data Management Communities

Social Data Management Communities Social Data Management Communities Antoine Amarilli 1, Silviu Maniu 2 January 9th, 2018 1 Télécom ParisTech 2 Université Paris-Sud 1/20 Table of contents Communities in Graphs 2/20 Graph Communities Communities

More information

Introduction to Machine Learning

Introduction to Machine Learning Introduction to Machine Learning Clustering Varun Chandola Computer Science & Engineering State University of New York at Buffalo Buffalo, NY, USA chandola@buffalo.edu Chandola@UB CSE 474/574 1 / 19 Outline

More information

Community Analysis. Chapter 6

Community Analysis. Chapter 6 This chapter is from Social Media Mining: An Introduction. By Reza Zafarani, Mohammad Ali Abbasi, and Huan Liu. Cambridge University Press, 2014. Draft version: April 20, 2014. Complete Draft and Slides

More information

Centralities (4) By: Ralucca Gera, NPS. Excellence Through Knowledge

Centralities (4) By: Ralucca Gera, NPS. Excellence Through Knowledge Centralities (4) By: Ralucca Gera, NPS Excellence Through Knowledge Some slide from last week that we didn t talk about in class: 2 PageRank algorithm Eigenvector centrality: i s Rank score is the sum

More information

An introduction to graph analysis and modeling Descriptive Analysis of Network Data

An introduction to graph analysis and modeling Descriptive Analysis of Network Data An introduction to graph analysis and modeling Descriptive Analysis of Network Data MSc in Statistics for Smart Data ENSAI Autumn semester 2017 http://julien.cremeriefamily.info 1 / 68 Outline 1 Basic

More information

Segmentation Computer Vision Spring 2018, Lecture 27

Segmentation Computer Vision Spring 2018, Lecture 27 Segmentation http://www.cs.cmu.edu/~16385/ 16-385 Computer Vision Spring 218, Lecture 27 Course announcements Homework 7 is due on Sunday 6 th. - Any questions about homework 7? - How many of you have

More information

Discovery of Community Structure in Complex Networks Based on Resistance Distance and Center Nodes

Discovery of Community Structure in Complex Networks Based on Resistance Distance and Center Nodes Journal of Computational Information Systems 8: 23 (2012) 9807 9814 Available at http://www.jofcis.com Discovery of Community Structure in Complex Networks Based on Resistance Distance and Center Nodes

More information

Social Network Analysis

Social Network Analysis Social Network Analysis Mathematics of Networks Manar Mohaisen Department of EEC Engineering Adjacency matrix Network types Edge list Adjacency list Graph representation 2 Adjacency matrix Adjacency matrix

More information

EE 701 ROBOT VISION. Segmentation

EE 701 ROBOT VISION. Segmentation EE 701 ROBOT VISION Regions and Image Segmentation Histogram-based Segmentation Automatic Thresholding K-means Clustering Spatial Coherence Merging and Splitting Graph Theoretic Segmentation Region Growing

More information

Social and Technological Network Analysis. Lecture 4: Community Detec=on and Overlapping Communi=es. Dr. Cecilia Mascolo

Social and Technological Network Analysis. Lecture 4: Community Detec=on and Overlapping Communi=es. Dr. Cecilia Mascolo Social and Technological Network Analysis Lecture 4: Community Detec=on and Overlapping Communi=es Dr. Cecilia Mascolo Communi=es Weak =es (Lecture 2) seemed to bridge groups of =ghtly coupled nodes (communi=es)

More information

Normalized Graph cuts. by Gopalkrishna Veni School of Computing University of Utah

Normalized Graph cuts. by Gopalkrishna Veni School of Computing University of Utah Normalized Graph cuts by Gopalkrishna Veni School of Computing University of Utah Image segmentation Image segmentation is a grouping technique used for image. It is a way of dividing an image into different

More information

SGN (4 cr) Chapter 11

SGN (4 cr) Chapter 11 SGN-41006 (4 cr) Chapter 11 Clustering Jussi Tohka & Jari Niemi Department of Signal Processing Tampere University of Technology February 25, 2014 J. Tohka & J. Niemi (TUT-SGN) SGN-41006 (4 cr) Chapter

More information

Association Pattern Mining. Lijun Zhang

Association Pattern Mining. Lijun Zhang Association Pattern Mining Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj Outline Introduction The Frequent Pattern Mining Model Association Rule Generation Framework Frequent Itemset Mining Algorithms

More information

Clustering. Informal goal. General types of clustering. Applications: Clustering in information search and analysis. Example applications in search

Clustering. Informal goal. General types of clustering. Applications: Clustering in information search and analysis. Example applications in search Informal goal Clustering Given set of objects and measure of similarity between them, group similar objects together What mean by similar? What is good grouping? Computation time / quality tradeoff 1 2

More information

Lecture 2 Data Cube Basics

Lecture 2 Data Cube Basics CompSci 590.6 Understanding Data: Theory and Applica>ons Lecture 2 Data Cube Basics Instructor: Sudeepa Roy Email: sudeepa@cs.duke.edu 1 Today s Papers 1. Gray- Chaudhuri- Bosworth- Layman- Reichart- Venkatrao-

More information

Overlapping Communities

Overlapping Communities Yangyang Hou, Mu Wang, Yongyang Yu Purdue Univiersity Department of Computer Science April 25, 2013 Overview Datasets Algorithm I Algorithm II Algorithm III Evaluation Overview Graph models of many real

More information

Lecture 19: Graph Partitioning

Lecture 19: Graph Partitioning Lecture 19: Graph Partitioning David Bindel 3 Nov 2011 Logistics Please finish your project 2. Please start your project 3. Graph partitioning Given: Graph G = (V, E) Possibly weights (W V, W E ). Possibly

More information

MSA220 - Statistical Learning for Big Data

MSA220 - Statistical Learning for Big Data MSA220 - Statistical Learning for Big Data Lecture 13 Rebecka Jörnsten Mathematical Sciences University of Gothenburg and Chalmers University of Technology Clustering Explorative analysis - finding groups

More information

arxiv: v1 [stat.ml] 2 Nov 2010

arxiv: v1 [stat.ml] 2 Nov 2010 Community Detection in Networks: The Leader-Follower Algorithm arxiv:111.774v1 [stat.ml] 2 Nov 21 Devavrat Shah and Tauhid Zaman* devavrat@mit.edu, zlisto@mit.edu November 4, 21 Abstract Traditional spectral

More information

Community Structure Detection. Amar Chandole Ameya Kabre Atishay Aggarwal

Community Structure Detection. Amar Chandole Ameya Kabre Atishay Aggarwal Community Structure Detection Amar Chandole Ameya Kabre Atishay Aggarwal What is a network? Group or system of interconnected people or things Ways to represent a network: Matrices Sets Sequences Time

More information

Community Identification in International Weblogs

Community Identification in International Weblogs Technische Universität Kaiserslautern Fernstudium Software Engineering for Embedded Systems Masterarbeit Community Identification in International Weblogs Autor: Christoph Rueger Betreuer: Prof. Dr. Prof.

More information

Clustering Algorithms for general similarity measures

Clustering Algorithms for general similarity measures Types of general clustering methods Clustering Algorithms for general similarity measures general similarity measure: specified by object X object similarity matrix 1 constructive algorithms agglomerative

More information

Chapter 11. Network Community Detection

Chapter 11. Network Community Detection Chapter 11. Network Community Detection Wei Pan Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN 55455 Email: weip@biostat.umn.edu PubH 7475/8475 c Wei Pan Outline

More information

Graph Theory Review. January 30, Network Science Analytics Graph Theory Review 1

Graph Theory Review. January 30, Network Science Analytics Graph Theory Review 1 Graph Theory Review Gonzalo Mateos Dept. of ECE and Goergen Institute for Data Science University of Rochester gmateosb@ece.rochester.edu http://www.ece.rochester.edu/~gmateosb/ January 30, 2018 Network

More information

Kernighan/Lin - Preliminary Definitions. Comments on Kernighan/Lin Algorithm. Partitioning Without Nodal Coordinates Kernighan/Lin

Kernighan/Lin - Preliminary Definitions. Comments on Kernighan/Lin Algorithm. Partitioning Without Nodal Coordinates Kernighan/Lin Partitioning Without Nodal Coordinates Kernighan/Lin Given G = (N,E,W E ) and a partitioning N = A U B, where A = B. T = cost(a,b) = edge cut of A and B partitions. Find subsets X of A and Y of B with

More information

Content-based Image and Video Retrieval. Image Segmentation

Content-based Image and Video Retrieval. Image Segmentation Content-based Image and Video Retrieval Vorlesung, SS 2011 Image Segmentation 2.5.2011 / 9.5.2011 Image Segmentation One of the key problem in computer vision Identification of homogenous region in the

More information

Performance and Scalability: Apriori Implementa6on

Performance and Scalability: Apriori Implementa6on Performance and Scalability: Apriori Implementa6on Apriori R. Agrawal and R. Srikant. Fast algorithms for mining associa6on rules. VLDB, 487 499, 1994 Reducing Number of Comparisons Candidate coun6ng:

More information

CS 220: Discrete Structures and their Applications. graphs zybooks chapter 10

CS 220: Discrete Structures and their Applications. graphs zybooks chapter 10 CS 220: Discrete Structures and their Applications graphs zybooks chapter 10 directed graphs A collection of vertices and directed edges What can this represent? undirected graphs A collection of vertices

More information

Finding and Visualizing Graph Clusters Using PageRank Optimization. Fan Chung and Alexander Tsiatas, UCSD WAW 2010

Finding and Visualizing Graph Clusters Using PageRank Optimization. Fan Chung and Alexander Tsiatas, UCSD WAW 2010 Finding and Visualizing Graph Clusters Using PageRank Optimization Fan Chung and Alexander Tsiatas, UCSD WAW 2010 What is graph clustering? The division of a graph into several partitions. Clusters should

More information

V4 Matrix algorithms and graph partitioning

V4 Matrix algorithms and graph partitioning V4 Matrix algorithms and graph partitioning - Community detection - Simple modularity maximization - Spectral modularity maximization - Division into more than two groups - Other algorithms for community

More information

Data Mining Techniques

Data Mining Techniques Data Mining Techniques CS 6220 - Section 3 - Fall 2016 Lecture 16: Association Rules Jan-Willem van de Meent (credit: Yijun Zhao, Yi Wang, Tan et al., Leskovec et al.) Apriori: Summary All items Count

More information

Image Segmentation continued Graph Based Methods. Some slides: courtesy of O. Capms, Penn State, J.Ponce and D. Fortsyth, Computer Vision Book

Image Segmentation continued Graph Based Methods. Some slides: courtesy of O. Capms, Penn State, J.Ponce and D. Fortsyth, Computer Vision Book Image Segmentation continued Graph Based Methods Some slides: courtesy of O. Capms, Penn State, J.Ponce and D. Fortsyth, Computer Vision Book Previously Binary segmentation Segmentation by thresholding

More information

Image Segmentation. Selim Aksoy. Bilkent University

Image Segmentation. Selim Aksoy. Bilkent University Image Segmentation Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr Examples of grouping in vision [http://poseidon.csd.auth.gr/lab_research/latest/imgs/s peakdepvidindex_img2.jpg]

More information

Oh Pott, Oh Pott! or how to detect community structure in complex networks

Oh Pott, Oh Pott! or how to detect community structure in complex networks Oh Pott, Oh Pott! or how to detect community structure in complex networks Jörg Reichardt Interdisciplinary Centre for Bioinformatics, Leipzig, Germany (Host of the 2012 Olympics) Questions to start from

More information

Introduction to spectral clustering

Introduction to spectral clustering Introduction to spectral clustering Denis Hamad LASL ULCO Denis.Hamad@lasl.univ-littoral.fr Philippe Biela HEI LAGIS Philippe.Biela@hei.fr Data Clustering Data clustering Data clustering is an important

More information

TELCOM2125: Network Science and Analysis

TELCOM2125: Network Science and Analysis School of Information Sciences University of Pittsburgh TELCOM2125: Network Science and Analysis Konstantinos Pelechrinis Spring 2015 2 Part 4: Dividing Networks into Clusters The problem l Graph partitioning

More information

SEMINAR: GRAPH-BASED METHODS FOR NLP

SEMINAR: GRAPH-BASED METHODS FOR NLP SEMINAR: GRAPH-BASED METHODS FOR NLP Organisatorisches: Seminar findet komplett im Mai statt Seminarausarbeitungen bis 15. Juli (?) Hilfen Seminarvortrag / Ausarbeitung auf der Webseite Tucan number for

More information

BACKGROUND: A BRIEF INTRODUCTION TO GRAPH THEORY

BACKGROUND: A BRIEF INTRODUCTION TO GRAPH THEORY BACKGROUND: A BRIEF INTRODUCTION TO GRAPH THEORY General definitions; Representations; Graph Traversals; Topological sort; Graphs definitions & representations Graph theory is a fundamental tool in sparse

More information

Aarti Singh. Machine Learning / Slides Courtesy: Eric Xing, M. Hein & U.V. Luxburg

Aarti Singh. Machine Learning / Slides Courtesy: Eric Xing, M. Hein & U.V. Luxburg Spectral Clustering Aarti Singh Machine Learning 10-701/15-781 Apr 7, 2010 Slides Courtesy: Eric Xing, M. Hein & U.V. Luxburg 1 Data Clustering Graph Clustering Goal: Given data points X1,, Xn and similarities

More information

CS 664 Slides #11 Image Segmentation. Prof. Dan Huttenlocher Fall 2003

CS 664 Slides #11 Image Segmentation. Prof. Dan Huttenlocher Fall 2003 CS 664 Slides #11 Image Segmentation Prof. Dan Huttenlocher Fall 2003 Image Segmentation Find regions of image that are coherent Dual of edge detection Regions vs. boundaries Related to clustering problems

More information

Math 778S Spectral Graph Theory Handout #2: Basic graph theory

Math 778S Spectral Graph Theory Handout #2: Basic graph theory Math 778S Spectral Graph Theory Handout #: Basic graph theory Graph theory was founded by the great Swiss mathematician Leonhard Euler (1707-178) after he solved the Königsberg Bridge problem: Is it possible

More information

Approximation slides 1. An optimal polynomial algorithm for the Vertex Cover and matching in Bipartite graphs

Approximation slides 1. An optimal polynomial algorithm for the Vertex Cover and matching in Bipartite graphs Approximation slides 1 An optimal polynomial algorithm for the Vertex Cover and matching in Bipartite graphs Approximation slides 2 Linear independence A collection of row vectors {v T i } are independent

More information

16 - Networks and PageRank

16 - Networks and PageRank - Networks and PageRank ST 9 - Fall 0 Contents Network Intro. Required R Packages................................ Example: Zachary s karate club network.................... Basic Network Concepts. Basic

More information

Behavioral Data Mining. Lecture 18 Clustering

Behavioral Data Mining. Lecture 18 Clustering Behavioral Data Mining Lecture 18 Clustering Outline Why? Cluster quality K-means Spectral clustering Generative Models Rationale Given a set {X i } for i = 1,,n, a clustering is a partition of the X i

More information

Graph Cuts and Normalized Cuts

Graph Cuts and Normalized Cuts Definition: University of Alicante (Spain) Matrix Computing (subject 3168 Degree in Maths) 30 hours (theory)) + 15 hours (practical assignment) Contents 1. Graph Partitiond, Cuts and Normalization 1. Graph

More information

The clustering in general is the task of grouping a set of objects in such a way that objects

The clustering in general is the task of grouping a set of objects in such a way that objects Spectral Clustering: A Graph Partitioning Point of View Yangzihao Wang Computer Science Department, University of California, Davis yzhwang@ucdavis.edu Abstract This course project provide the basic theory

More information

Big Data Management and NoSQL Databases

Big Data Management and NoSQL Databases NDBI040 Big Data Management and NoSQL Databases Lecture 10. Graph databases Doc. RNDr. Irena Holubova, Ph.D. holubova@ksi.mff.cuni.cz http://www.ksi.mff.cuni.cz/~holubova/ndbi040/ Graph Databases Basic

More information

Lecture 11: E-M and MeanShift. CAP 5415 Fall 2007

Lecture 11: E-M and MeanShift. CAP 5415 Fall 2007 Lecture 11: E-M and MeanShift CAP 5415 Fall 2007 Review on Segmentation by Clustering Each Pixel Data Vector Example (From Comanciu and Meer) Review of k-means Let's find three clusters in this data These

More information

Sergei Silvestrov, Christopher Engström. January 29, 2013

Sergei Silvestrov, Christopher Engström. January 29, 2013 Sergei Silvestrov, January 29, 2013 L2: t Todays lecture:. L2: t Todays lecture:. measurements. L2: t Todays lecture:. measurements. s. L2: t Todays lecture:. measurements. s.. First we would like to define

More information

Image Segmentation. Selim Aksoy. Bilkent University

Image Segmentation. Selim Aksoy. Bilkent University Image Segmentation Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr Examples of grouping in vision [http://poseidon.csd.auth.gr/lab_research/latest/imgs/s peakdepvidindex_img2.jpg]

More information

Local Fiedler Vector Centrality for Detection of Deep and Overlapping Communities in Networks

Local Fiedler Vector Centrality for Detection of Deep and Overlapping Communities in Networks Local Fiedler Vector Centrality for Detection of Deep and Overlapping Communities in Networks Pin-Yu Chen and Alfred O. Hero III, Fellow, IEEE Department of Electrical Engineering and Computer Science,

More information

EXPLOITING THE STRUCTURE OF BIPARTITE GRAPHS FOR ALGEBRAIC AND SPECTRAL GRAPH THEORY APPLICATIONS

EXPLOITING THE STRUCTURE OF BIPARTITE GRAPHS FOR ALGEBRAIC AND SPECTRAL GRAPH THEORY APPLICATIONS Internet Mathematics, 11:201 231, 2015 Copyright Taylor & Francis Group, LLC ISSN: 1542-7951 print/1944-9488 online DOI: 10.1080/15427951.2014.958250 EXPLOITING THE STRUCTURE OF BIPARTITE GRAPHS FOR ALGEBRAIC

More information

Scalable Clustering of Signed Networks Using Balance Normalized Cut

Scalable Clustering of Signed Networks Using Balance Normalized Cut Scalable Clustering of Signed Networks Using Balance Normalized Cut Kai-Yang Chiang,, Inderjit S. Dhillon The 21st ACM International Conference on Information and Knowledge Management (CIKM 2012) Oct.

More information

Community Detection Methods using Eigenvectors of Matrices

Community Detection Methods using Eigenvectors of Matrices Community Detection Methods using Eigenvectors of Matrices Yan Zhang Abstract In this paper we investigate the problem of detecting communities in graphs. We use the eigenvectors of the graph Laplacian

More information