CS224W: Analysis of Networks Jure Leskovec, Stanford University

Size: px
Start display at page:

Download "CS224W: Analysis of Networks Jure Leskovec, Stanford University"

Transcription

1 CS224W: Analysis of Networks Jure Leskovec, Stanford University

2 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 2 Observations Models Algorithms Small diameter, Edge clustering Patterns of signed edge creation Viral Marketing, Blogosphere, Memetracking Scale-Free Densification power law, Shrinking diameters Strength of weak ties, Core-periphery Erdös-Renyi model, Small-world model Structural balance, Theory of status Independent cascade model, Game theoretic model Preferential attachment, Copying model Microscopic model of evolving networks Kronecker Graphs Decentralized search Models for predicting edge signs Influence maximization, Outbreak detection, LIM PageRank, Hubs and authorities Link prediction, Supervised random walks Community detection: Girvan-Newman, Modularity

3 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 3 We often think of networks looking like this: What led to such a conceptual picture?

4 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 4 How does information flow through the network? What structurally distinct roles do nodes play? What roles do different links (short vs. long) play? How do people find out about new jobs? Mark Granovetter, part of his PhD in 1960s People find the information through personal contacts But: Contacts were often acquaintances rather than close friends This is surprising: One would expect your friends to help you out more than casual acquaintances Why is it that acquaintances are most helpful?

5 [Granovetter 73] 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 5 Two perspectives on friendships: Structural: Friendships span different parts of the network Interpersonal: Friendship between two people is either strong or weak Structural role: Triadic Closure a b Which edge is more likely, a-b or a-c? c If two people in a network have a friend in common, then there is an increased likelihood they will become friends themselves.

6 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 6 Granovetter makes a connection between social and structural role of an edge First point: Structure Structurally embedded edges are also socially strong Long-range edges spanning different parts of the network are socially weak Second point: Information Long-range edges allow you to gather information from different parts of the network and get a job Structurally embedded edges are heavily redundant in terms of information access S S S a Weak W b S Strong

7 Triadic closure == High clustering coefficient Reasons for triadic closure: If B and C have a friend A in common, then: B is more likely to meet C (since they both spend time with A) B and C trust each other (since they have a friend in common) A has incentive to bring B and C together (as it is hard for A to maintain two disjoint relationships) Empirical study by Bearman and Moody: Teenage girls with low clustering coefficient are more likely to contemplate suicide 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 7 B A C

8 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 8 Define: Bridge edge If removed, it disconnects the graph Define: Local bridge Edge of Span > 2 (Span of an edge is the distance of the edge endpoints if the edge is deleted. Local bridges with long span are like real bridges) Define: Two types of edges: Strong (friend), Weak (acquaintance) Define: Strong triadic closure: Two strong ties imply a third edge Fact: If strong triadic closure is satisfied then local bridges are weak ties! S Bridge a a b Local bridge S S a S W W b Edge: W or S b S S S

9 Claim: If node A satisfies Strong Triadic Closure and is involved in at least two strong ties, then any local bridge adjacent to A must be a weak tie. Proof by contradiction: Assume A satisfies Strong Triadic Closure and has 2 strong ties Let A B be local bridge and a strong tie Then B C must exist because of Strong Triadic Closure But then A B is not a bridge! (since B-C must be connected due to Strong Triadic Closure property) 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 9 C S S S A A S S B

10 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 10 For many years Granovetter s theory was not tested But, today we have large who-talks-to-whom graphs: , Messenger, Cell phones, Facebook Onnela et al. 2007: Cell-phone network of 20% of country s population Edge strength: # phone calls

11 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 11 Edge overlap: O &' = N(i) N j N(i) N j N(i) a set of neighbors of node i Overlap = 0 when an edge is a local bridge

12 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 12 Cell-phone network Observation: Highly used links have high overlap! Legend: True: The data Permuted strengths: Keep the network structure but randomly reassign edge strengths Neighborhood overlap Permuted strengths True Edge strength (#calls)

13 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 13 Real edge strengths in mobile call graph Strong ties are more embedded (have higher overlap)

14 Same network, same set of edge strengths but now strengths are randomly shuffled 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 14

15 Size of largest component Low disconnects the network sooner Removing links by strength (#calls) Low to high High to low Fraction of removed links Conceptual picture of network structure 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 15

16 Size of largest component Low disconnects the network sooner Removing links based on overlap Low to high High to low Fraction of removed links Conceptual picture of network structure 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 16

17 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 17 Granovetter s theory leads to the following conceptual picture of networks Strong ties Weak ties

18

19 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 19 Granovetter s theory suggest that networks are composed of tightly connected sets of nodes Network communities: Communities, clusters, groups, modules Sets of nodes with lots of connections inside and few to outside (the rest of the network)

20 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 20 How to automatically find such densely connected groups of nodes? Ideally such automatically detected clusters would then correspond to real groups For example: Communities, clusters, groups, modules

21 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 21 Zachary s Karate club network: Observe social ties and rivalries in a university karate club During his observation, conflicts led the group to split Split could be explained by a minimum cut in the network

22 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 22 Find micro-markets by partitioning the query-to-advertiser graph in web search: query advertiser

23 Can we identify node groups? (communities, modules, clusters) 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, Nodes: Teams Edges: Games played 23

24 11/13/17 Nodes: Teams Edges: Games played Jure Leskovec, Stanford CS224W: Analysis of Networks, 24 NCAA conferences

25 Can we identify social communities? 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, Nodes: Users Edges: Friendships 25

26 High school Company Stanford (Squash) Stanford (Basketball) Social communities 11/13/17 Nodes: Users Edges: Friendships Jure Leskovec, Stanford CS224W: Analysis of Networks, 26

27 Can we identify functional modules? 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, Nodes: Proteins Edges: Interactions 27

28 11/13/17 Nodes: Proteins Edges: Interactions Jure Leskovec, Stanford CS224W: Analysis of Networks, 28 Functional modules

29 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 29 How to find communities? We will work with undirected (unweighted) networks

30 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 30 Edge betweenness: Number of shortest paths passing over the edge Intuition: b=16 b=7.5 Edge strengths (call volume) in a real network Edge betweenness in a real network

31 [Girvan-Newman 02] 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 31 Divisive hierarchical clustering based on the notion of edge betweenness: Number of shortest paths passing through the edge Girvan-Newman Algorithm: Undirected unweighted networks Repeat until no edges are left: Calculate betweennessof edges Remove edges with highest betweenness Connected components are communities Gives a hierarchical decomposition of the network

32 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, Need to re-compute betweenness at every step

33 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 33 Step 1: Step 2: Step 3: Hierarchical network decomposition:

34 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, How to compute betweenness? 2. How to select the number of clusters?

35 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 35 Want to compute betweenness of paths starting at node A Breadth first search starting from A:

36 Forward step: Count the number of shortest paths from A to all other nodes of the network 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 36

37 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 37 Backward step: Compute betweenness: If there are multiple paths count them fractionally The algorithm: Add edge flows: -- node flow = 1+ child edges -- split the flow up based on the parent value Repeat the BFS procedure for each starting node U 1 path to K. Split evenly paths to J Split 1:2 1+1 paths to H Split evenly

38 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 38 Backward step: Compute betweenness: If there are multiple paths count them fractionally The algorithm: Add edge flows: -- node flow = 1+ child edges -- split the flow up based on the parent value Repeat the BFS procedure for each starting node U 1 path to K. Split evenly paths to J Split 1:2 1+1 paths to H Split evenly

39 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, How to compute betweenness? 2. How to select the number of clusters?

40 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 40 Communities: sets of tightly connected nodes Define: Modularity Q A measure of how well a network is partitioned into communities Given a partitioning of the network into groups s S: Q µ sî S [ (# edges within group s) (expected # edges within group s) ] Need a null model!

41 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 41 Given real G on n nodes and m edges, construct rewired network G Same degree distribution but random connections Consider G as a multigraph The expected number of edges between nodes i and j of degrees k i and k j equals to: k i k j = k ik j 2m 2m The expected number of edges in (multigraph) G : = 1 k i k j 2 i N j N = 1 1 k 2m 2 2m i j N k j = 1 2m 2m = m 4m i N = i j Note: F k H H I = 2m

42 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 42 Modularity of partitioning S of graph G: Q µ sî S [ (# edges within group s) (expected # edges within group s) ] Q G, S = 1 A 2m ij k ik j s S i s j s 2m Normalizing cost.: -1<Q<1 A ij = 1 if i j, 0 else Modularity values take range [ 1,1] It is positive if the number of edges within groups exceeds the expected number <Q means significant community structure

43 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 43 Modularity is useful for selecting the number of clusters: Q Why not optimize Modularity directly?

44

45 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 45 Let s split the graph into 2 communities! Want to directly optimize modularity! arg max R Q G, S = V A WX &' Z [Z \ ] R & ] ' ] WX Community membership vector s: s i = 1 if node i is in community 1-1 if node i is in community -1 Q G, s = V A WX &' Z [Z \ & I ' I WX = V A `X &' Z [Z \ &,' I s WX &s ' s & s ' ] [ ] \ _V W = 1.. if s i=s j 0.. else

46 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 46 Define: Modularity matrix: B ij = A ij k ik j 2m Membership: s = { 1, +1} Then: Q G, s = V A `X &' Z [Z \ & I ' I WX = V B `X &,' I &' s & s ' = V s `X & & ' B &' s ' = V `X sf Bs = B i s Note: each row/col of B sums to 0: A ij = k i j, k ik j j = k 2m i k j j Task: Find sî{-1,+1} n that maximizes Q(G,s) 2m = k i s &s '

47 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 47 Symmetric matrix A That is positive semi-definite: A = U U T Then solutions λ, x to equation A x = λ x : Eigenvectors x i ordered by the magnitude of their corresponding eigenvalues λ & (λ V λ W λ n ) x i are orthonormal (orthogonal and unit length) In other words, x i form a coordinate system (basis) If A is positive-semidefinite: λ & 0 (and they always exist) Eigen Decomposition theorem: Can rewrite matrix A in terms of its eigenvectors and eigenvalues: A = T i x i λ & x i

48 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 48 Rewrite: Q G, s = V `X sq Bs in terms of its eigenvectors and eigenvalues: n = s q F x & λ & x & f &tv n s = F s f x & λ & x & f s &tv n = F s f x & W λ & &tv So, if there would be no other constraints on s then to maximize Q, we make s = x n x Why? Because λ n λ nu1 2 Remember s has fixed euclidean length! Assigns all weight in the sum to λ n (largest eigenvalue) All other s T x i terms are zero because of orthonormality s x 1

49 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 49 Let s consider only the first term in the summation (because λ n is the largest): n arg max Q G, s = &tv s f x W & λ & s f x W n λ n ] Let s maximize: s T x n where s j Î{-1,+1} To do this, we set: s j = x +1 1 if x n,j 0 (j th coordinate of x n 0) if x n,j < 0 (j th coordinate of x n < 0) Continue the bisection hierarchically

50 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 50 Fast Modularity Optimization Algorithm: Find leading eigenvector x n of modularity matrix B Divide the nodes by the signs of the elements of x n Repeat hierarchically until: If a proposed split does not cause modularity to increase, declare community indivisible and do not split it If all communities are indivisible, stop How to find x n? Power method! Start with random v (0), repeat : When converged (v (t) v (t+1) ), set x n = v (t) ( t+ 1) v = Bv Bv ( t) ( t)

51 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 51 Girvan-Newman: Based on the strength of weak ties Remove edge of highest betweenness Modularity: Overall quality of the partitioning of a graph Use to determine the number of communities Fast modularity optimization: Transform the modularity optimization to a eigenvalue problem

52

53 [Ron Burt] 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 53 Who is better off, Robert or James?

54 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 54 Few structural holes Many structural holes Structural Holes provide ego with access to novel information, power, freedom

55 The network constraint measure [Burt]: c To what extent are person s contacts redundant Low: disconnected contacts High: contacts that are close or strongly tied = å i c p ik = å k p ij é ê p ë + å( p ) ik pkj i ij ij j j k p uv prop. of u s energy invested in relationship with v j p uv = 1/d u 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 55 ú û ù 2 2 p 12 =¼ 2 i 1 4 k p 15 =¼ p 25 =½ p uv j

56 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 56 Network constraint: James: c J = Robert: c R = Constraint: To what extent are person s contacts redundant Low: disconnected contacts High: contacts that are close or strongly tied

57 [Ron Burt] 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, 57

Non Overlapping Communities

Non Overlapping Communities Non Overlapping Communities Davide Mottin, Konstantina Lazaridou HassoPlattner Institute Graph Mining course Winter Semester 2016 Acknowledgements Most of this lecture is taken from: http://web.stanford.edu/class/cs224w/slides

More information

Mining Social Network Graphs

Mining Social Network Graphs Mining Social Network Graphs Analysis of Large Graphs: Community Detection Rafael Ferreira da Silva rafsilva@isi.edu http://rafaelsilva.com Note to other teachers and users of these slides: We would be

More information

Tie strength, social capital, betweenness and homophily. Rik Sarkar

Tie strength, social capital, betweenness and homophily. Rik Sarkar Tie strength, social capital, betweenness and homophily Rik Sarkar Networks Position of a node in a network determines its role/importance Structure of a network determines its properties 2 Today Notion

More information

CSE 258 Lecture 12. Web Mining and Recommender Systems. Social networks

CSE 258 Lecture 12. Web Mining and Recommender Systems. Social networks CSE 258 Lecture 12 Web Mining and Recommender Systems Social networks Social networks We ve already seen networks (a little bit) in week 3 i.e., we ve studied inference problems defined on graphs, and

More information

CSE 158 Lecture 11. Web Mining and Recommender Systems. Social networks

CSE 158 Lecture 11. Web Mining and Recommender Systems. Social networks CSE 158 Lecture 11 Web Mining and Recommender Systems Social networks Assignment 1 Due 5pm next Monday! (Kaggle shows UTC time, but the due date is 5pm, Monday, PST) Assignment 1 Assignment 1 Social networks

More information

CS 5614: (Big) Data Management Systems. B. Aditya Prakash Lecture #21: Graph Mining 2

CS 5614: (Big) Data Management Systems. B. Aditya Prakash Lecture #21: Graph Mining 2 CS 5614: (Big) Data Management Systems B. Aditya Prakash Lecture #21: Graph Mining 2 Networks & Communi>es We o@en think of networks being organized into modules, cluster, communi>es: VT CS 5614 2 Goal:

More information

Network Community Detection

Network Community Detection Network Community Detection Gonzalo Mateos Dept. of ECE and Goergen Institute for Data Science University of Rochester gmateosb@ece.rochester.edu http://www.ece.rochester.edu/~gmateosb/ March 20, 2018

More information

Structure of Social Networks

Structure of Social Networks Structure of Social Networks Outline Structure of social networks Applications of structural analysis Social *networks* Twitter Facebook Linked-in IMs Email Real life Address books... Who Twitter #numbers

More information

CSE 255 Lecture 13. Data Mining and Predictive Analytics. Triadic closure; strong & weak ties

CSE 255 Lecture 13. Data Mining and Predictive Analytics. Triadic closure; strong & weak ties CSE 255 Lecture 13 Data Mining and Predictive Analytics Triadic closure; strong & weak ties Monday Random models of networks: Erdos Renyi random graphs (picture from Wikipedia http://en.wikipedia.org/wiki/erd%c5%91s%e2%80%93r%c3%a9nyi_model)

More information

Economics of Information Networks

Economics of Information Networks Economics of Information Networks Stephen Turnbull Division of Policy and Planning Sciences Lecture 4: December 7, 2017 Abstract We continue discussion of the modern economics of networks, which considers

More information

Online Social Networks and Media. Community detection

Online Social Networks and Media. Community detection Online Social Networks and Media Community detection 1 Notes on Homework 1 1. You should write your own code for generating the graphs. You may use SNAP graph primitives (e.g., add node/edge) 2. For the

More information

CSE 158 Lecture 13. Web Mining and Recommender Systems. Triadic closure; strong & weak ties

CSE 158 Lecture 13. Web Mining and Recommender Systems. Triadic closure; strong & weak ties CSE 158 Lecture 13 Web Mining and Recommender Systems Triadic closure; strong & weak ties Monday Random models of networks: Erdos Renyi random graphs (picture from Wikipedia http://en.wikipedia.org/wiki/erd%c5%91s%e2%80%93r%c3%a9nyi_model)

More information

CSE 158 Lecture 11. Web Mining and Recommender Systems. Triadic closure; strong & weak ties

CSE 158 Lecture 11. Web Mining and Recommender Systems. Triadic closure; strong & weak ties CSE 158 Lecture 11 Web Mining and Recommender Systems Triadic closure; strong & weak ties Triangles So far we ve seen (a little about) how networks can be characterized by their connectivity patterns What

More information

Web 2.0 Social Data Analysis

Web 2.0 Social Data Analysis Web 2.0 Social Data Analysis Ing. Jaroslav Kuchař jaroslav.kuchar@fit.cvut.cz Structure (2) Czech Technical University in Prague, Faculty of Information Technologies Software and Web Engineering 2 Contents

More information

Extracting Information from Complex Networks

Extracting Information from Complex Networks Extracting Information from Complex Networks 1 Complex Networks Networks that arise from modeling complex systems: relationships Social networks Biological networks Distinguish from random networks uniform

More information

THE KNOWLEDGE MANAGEMENT STRATEGY IN ORGANIZATIONS. Summer semester, 2016/2017

THE KNOWLEDGE MANAGEMENT STRATEGY IN ORGANIZATIONS. Summer semester, 2016/2017 THE KNOWLEDGE MANAGEMENT STRATEGY IN ORGANIZATIONS Summer semester, 2016/2017 SOCIAL NETWORK ANALYSIS: THEORY AND APPLICATIONS 1. A FEW THINGS ABOUT NETWORKS NETWORKS IN THE REAL WORLD There are four categories

More information

Social-Network Graphs

Social-Network Graphs Social-Network Graphs Mining Social Networks Facebook, Google+, Twitter Email Networks, Collaboration Networks Identify communities Similar to clustering Communities usually overlap Identify similarities

More information

My favorite application using eigenvalues: partitioning and community detection in social networks

My favorite application using eigenvalues: partitioning and community detection in social networks My favorite application using eigenvalues: partitioning and community detection in social networks Will Hobbs February 17, 2013 Abstract Social networks are often organized into families, friendship groups,

More information

Distribution-Free Models of Social and Information Networks

Distribution-Free Models of Social and Information Networks Distribution-Free Models of Social and Information Networks Tim Roughgarden (Stanford CS) joint work with Jacob Fox (Stanford Math), Rishi Gupta (Stanford CS), C. Seshadhri (UC Santa Cruz), Fan Wei (Stanford

More information

Social Networks, Social Interaction, and Outcomes

Social Networks, Social Interaction, and Outcomes Social Networks, Social Interaction, and Outcomes Social Networks and Social Interaction Not all links are created equal: Some links are stronger than others Some links are more useful than others Links

More information

Recap! CMSC 498J: Social Media Computing. Department of Computer Science University of Maryland Spring Hadi Amiri

Recap! CMSC 498J: Social Media Computing. Department of Computer Science University of Maryland Spring Hadi Amiri Recap! CMSC 498J: Social Media Computing Department of Computer Science University of Maryland Spring 2015 Hadi Amiri hadi@umd.edu Announcement CourseEvalUM https://www.courseevalum.umd.edu/ 2 Announcement

More information

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

CS246: Mining Massive Datasets Jure Leskovec, Stanford University CS246: Mining Massive Datasets Jure Leskovec, Stanford University http://cs246.stanford.edu SPAM FARMING 2/11/2013 Jure Leskovec, Stanford C246: Mining Massive Datasets 2 2/11/2013 Jure Leskovec, Stanford

More information

Clusters and Communities

Clusters and Communities Clusters and Communities Lecture 7 CSCI 4974/6971 22 Sep 2016 1 / 14 Today s Biz 1. Reminders 2. Review 3. Communities 4. Betweenness and Graph Partitioning 5. Label Propagation 2 / 14 Today s Biz 1. Reminders

More information

Web 2.0 Social Data Analysis

Web 2.0 Social Data Analysis Web 2.0 Social Data Analysis Ing. Jaroslav Kuchař jaroslav.kuchar@fit.cvut.cz Structure(1) Czech Technical University in Prague, Faculty of Information Technologies Software and Web Engineering 2 Contents

More information

CSE 190 Lecture 16. Data Mining and Predictive Analytics. Small-world phenomena

CSE 190 Lecture 16. Data Mining and Predictive Analytics. Small-world phenomena CSE 190 Lecture 16 Data Mining and Predictive Analytics Small-world phenomena Another famous study Stanley Milgram wanted to test the (already popular) hypothesis that people in social networks are separated

More information

CSE 258 Lecture 6. Web Mining and Recommender Systems. Community Detection

CSE 258 Lecture 6. Web Mining and Recommender Systems. Community Detection CSE 258 Lecture 6 Web Mining and Recommender Systems Community Detection Dimensionality reduction Goal: take high-dimensional data, and describe it compactly using a small number of dimensions Assumption:

More information

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

CS246: Mining Massive Datasets Jure Leskovec, Stanford University CS246: Mining Massive Datasets Jure Leskovec, Stanford University http://cs246.stanford.edu HITS (Hypertext Induced Topic Selection) Is a measure of importance of pages or documents, similar to PageRank

More information

CSE 158 Lecture 6. Web Mining and Recommender Systems. Community Detection

CSE 158 Lecture 6. Web Mining and Recommender Systems. Community Detection CSE 158 Lecture 6 Web Mining and Recommender Systems Community Detection Dimensionality reduction Goal: take high-dimensional data, and describe it compactly using a small number of dimensions Assumption:

More information

Algorithms and Applications in Social Networks. 2017/2018, Semester B Slava Novgorodov

Algorithms and Applications in Social Networks. 2017/2018, Semester B Slava Novgorodov Algorithms and Applications in Social Networks 2017/2018, Semester B Slava Novgorodov 1 Lesson #1 Administrative questions Course overview Introduction to Social Networks Basic definitions Network properties

More information

CSE 255 Lecture 6. Data Mining and Predictive Analytics. Community Detection

CSE 255 Lecture 6. Data Mining and Predictive Analytics. Community Detection CSE 255 Lecture 6 Data Mining and Predictive Analytics Community Detection Dimensionality reduction Goal: take high-dimensional data, and describe it compactly using a small number of dimensions Assumption:

More information

MCL. (and other clustering algorithms) 858L

MCL. (and other clustering algorithms) 858L MCL (and other clustering algorithms) 858L Comparing Clustering Algorithms Brohee and van Helden (2006) compared 4 graph clustering algorithms for the task of finding protein complexes: MCODE RNSC Restricted

More information

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University http://cs224w.stanford.edu 10/4/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

More information

Network Science: Principles and Applications

Network Science: Principles and Applications Network Science: Principles and Applications CS 695 - Fall 2016 Amarda Shehu,Fei Li [amarda, lifei](at)gmu.edu Department of Computer Science George Mason University 1 Outline of Today s Class 2 Communities

More information

TELCOM2125: Network Science and Analysis

TELCOM2125: Network Science and Analysis School of Information Sciences University of Pittsburgh TELCOM2125: Network Science and Analysis Konstantinos Pelechrinis Spring 2015 2 Part 4: Dividing Networks into Clusters The problem l Graph partitioning

More information

Modularity CMSC 858L

Modularity CMSC 858L Modularity CMSC 858L Module-detection for Function Prediction Biological networks generally modular (Hartwell+, 1999) We can try to find the modules within a network. Once we find modules, we can look

More information

Graph Theory Review. January 30, Network Science Analytics Graph Theory Review 1

Graph Theory Review. January 30, Network Science Analytics Graph Theory Review 1 Graph Theory Review Gonzalo Mateos Dept. of ECE and Goergen Institute for Data Science University of Rochester gmateosb@ece.rochester.edu http://www.ece.rochester.edu/~gmateosb/ January 30, 2018 Network

More information

CAIM: Cerca i Anàlisi d Informació Massiva

CAIM: Cerca i Anàlisi d Informació Massiva 1 / 72 CAIM: Cerca i Anàlisi d Informació Massiva FIB, Grau en Enginyeria Informàtica Slides by Marta Arias, José Balcázar, Ricard Gavaldá Department of Computer Science, UPC Fall 2016 http://www.cs.upc.edu/~caim

More information

Jure Leskovec, Cornell/Stanford University. Joint work with Kevin Lang, Anirban Dasgupta and Michael Mahoney, Yahoo! Research

Jure Leskovec, Cornell/Stanford University. Joint work with Kevin Lang, Anirban Dasgupta and Michael Mahoney, Yahoo! Research Jure Leskovec, Cornell/Stanford University Joint work with Kevin Lang, Anirban Dasgupta and Michael Mahoney, Yahoo! Research Network: an interaction graph: Nodes represent entities Edges represent interaction

More information

Networks in economics and finance. Lecture 1 - Measuring networks

Networks in economics and finance. Lecture 1 - Measuring networks Networks in economics and finance Lecture 1 - Measuring networks What are networks and why study them? A network is a set of items (nodes) connected by edges or links. Units (nodes) Individuals Firms Banks

More information

An Exploratory Journey Into Network Analysis A Gentle Introduction to Network Science and Graph Visualization

An Exploratory Journey Into Network Analysis A Gentle Introduction to Network Science and Graph Visualization An Exploratory Journey Into Network Analysis A Gentle Introduction to Network Science and Graph Visualization Pedro Ribeiro (DCC/FCUP & CRACS/INESC-TEC) Part 1 Motivation and emergence of Network Science

More information

CS224W: Analysis of Network Jure Leskovec, Stanford University

CS224W: Analysis of Network Jure Leskovec, Stanford University CS224W: Analysis of Network Jure Leskovec, Stanford University http://cs224w.stanford.edu 9/25/17 Jure Leskovec, Stanford CS224W: Analysis of Networks 2 Why Networks? Networks are a general language for

More information

Social & Information Network Analysis CS 224W

Social & Information Network Analysis CS 224W Social & Information Network Analysis CS 224W Final Report Alexandre Becker Jordane Giuly Sébastien Robaszkiewicz Stanford University December 2011 1 Introduction The microblogging service Twitter today

More information

Social Network Analysis

Social Network Analysis Social Network Analysis Mathematics of Networks Manar Mohaisen Department of EEC Engineering Adjacency matrix Network types Edge list Adjacency list Graph representation 2 Adjacency matrix Adjacency matrix

More information

Lecture 5: Graphs. Rajat Mittal. IIT Kanpur

Lecture 5: Graphs. Rajat Mittal. IIT Kanpur Lecture : Graphs Rajat Mittal IIT Kanpur Combinatorial graphs provide a natural way to model connections between different objects. They are very useful in depicting communication networks, social networks

More information

CS 220: Discrete Structures and their Applications. graphs zybooks chapter 10

CS 220: Discrete Structures and their Applications. graphs zybooks chapter 10 CS 220: Discrete Structures and their Applications graphs zybooks chapter 10 directed graphs A collection of vertices and directed edges What can this represent? undirected graphs A collection of vertices

More information

Algorithmic and Economic Aspects of Networks. Nicole Immorlica

Algorithmic and Economic Aspects of Networks. Nicole Immorlica Algorithmic and Economic Aspects of Networks Nicole Immorlica Syllabus 1. Jan. 8 th (today): Graph theory, network structure 2. Jan. 15 th : Random graphs, probabilistic network formation 3. Jan. 20 th

More information

Mining of Massive Datasets Jure Leskovec, AnandRajaraman, Jeff Ullman Stanford University

Mining of Massive Datasets Jure Leskovec, AnandRajaraman, Jeff Ullman Stanford University Note to other teachers and users of these slides: We would be delighted if you found this our material useful in giving your own lectures. Feel free to use these slides verbatim, or to modify them to fit

More information

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

CS246: Mining Massive Datasets Jure Leskovec, Stanford University CS246: Mining Massive Datasets Jure Leskovec, Stanford University http://cs246.stanford.edu Can we identify node groups? (communities, modules, clusters) 2/13/2014 Jure Leskovec, Stanford C246: Mining

More information

Alessandro Del Ponte, Weijia Ran PAD 637 Week 3 Summary January 31, Wasserman and Faust, Chapter 3: Notation for Social Network Data

Alessandro Del Ponte, Weijia Ran PAD 637 Week 3 Summary January 31, Wasserman and Faust, Chapter 3: Notation for Social Network Data Wasserman and Faust, Chapter 3: Notation for Social Network Data Three different network notational schemes Graph theoretic: the most useful for centrality and prestige methods, cohesive subgroup ideas,

More information

Social Data Management Communities

Social Data Management Communities Social Data Management Communities Antoine Amarilli 1, Silviu Maniu 2 January 9th, 2018 1 Télécom ParisTech 2 Université Paris-Sud 1/20 Table of contents Communities in Graphs 2/20 Graph Communities Communities

More information

Introduction to network metrics

Introduction to network metrics Universitat Politècnica de Catalunya Version 0.5 Complex and Social Networks (2018-2019) Master in Innovation and Research in Informatics (MIRI) Instructors Argimiro Arratia, argimiro@cs.upc.edu, http://www.cs.upc.edu/~argimiro/

More information

Clustering Algorithms on Graphs Community Detection 6CCS3WSN-7CCSMWAL

Clustering Algorithms on Graphs Community Detection 6CCS3WSN-7CCSMWAL Clustering Algorithms on Graphs Community Detection 6CCS3WSN-7CCSMWAL Contents Zachary s famous example Community structure Modularity The Girvan-Newman edge betweenness algorithm In the beginning: Zachary

More information

Online Social Networks and Media. Positive and Negative Edges Strong and Weak Edges Strong Triadic Closure

Online Social Networks and Media. Positive and Negative Edges Strong and Weak Edges Strong Triadic Closure Online Social Networks and Media Positive and Negative Edges Strong and Weak Edges Strong Triadic Closure POSITIVE AND NEGATIVE TIES Structural Balance What about negative edges? Initially, a complete

More information

Graph Theory and Network Measurment

Graph Theory and Network Measurment Graph Theory and Network Measurment Social and Economic Networks MohammadAmin Fazli Social and Economic Networks 1 ToC Network Representation Basic Graph Theory Definitions (SE) Network Statistics and

More information

Topic mash II: assortativity, resilience, link prediction CS224W

Topic mash II: assortativity, resilience, link prediction CS224W Topic mash II: assortativity, resilience, link prediction CS224W Outline Node vs. edge percolation Resilience of randomly vs. preferentially grown networks Resilience in real-world networks network resilience

More information

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University http://cs224w.stanford.edu How to organize the Web? First try: Human curated Web directories Yahoo, DMOZ, LookSmart Second

More information

Social Networks. Slides by : I. Koutsopoulos (AUEB), Source:L. Adamic, SN Analysis, Coursera course

Social Networks. Slides by : I. Koutsopoulos (AUEB), Source:L. Adamic, SN Analysis, Coursera course Social Networks Slides by : I. Koutsopoulos (AUEB), Source:L. Adamic, SN Analysis, Coursera course Introduction Political blogs Organizations Facebook networks Ingredient networks SN representation Networks

More information

Graph Theory for Network Science

Graph Theory for Network Science Graph Theory for Network Science Dr. Natarajan Meghanathan Professor Department of Computer Science Jackson State University, Jackson, MS E-mail: natarajan.meghanathan@jsums.edu Networks or Graphs We typically

More information

Basics of Network Analysis

Basics of Network Analysis Basics of Network Analysis Hiroki Sayama sayama@binghamton.edu Graph = Network G(V, E): graph (network) V: vertices (nodes), E: edges (links) 1 Nodes = 1, 2, 3, 4, 5 2 3 Links = 12, 13, 15, 23,

More information

CS224W: Analysis of Networks Jure Leskovec, Stanford University

CS224W: Analysis of Networks Jure Leskovec, Stanford University CS224W: Analysis of Networks Jure Leskovec, Stanford University http://cs224w.stanford.edu Jure Leskovec, Stanford CS224W: Analysis of Networks, http://cs224w.stanford.edu 2????? Machine Learning Node

More information

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University http://cs224w.stanford.edu 10/10/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

More information

Signal Processing for Big Data

Signal Processing for Big Data Signal Processing for Big Data Sergio Barbarossa 1 Summary 1. Networks 2.Algebraic graph theory 3. Random graph models 4. OperaGons on graphs 2 Networks The simplest way to represent the interaction between

More information

Lecture 11: Clustering and the Spectral Partitioning Algorithm A note on randomized algorithm, Unbiased estimates

Lecture 11: Clustering and the Spectral Partitioning Algorithm A note on randomized algorithm, Unbiased estimates CSE 51: Design and Analysis of Algorithms I Spring 016 Lecture 11: Clustering and the Spectral Partitioning Algorithm Lecturer: Shayan Oveis Gharan May nd Scribe: Yueqi Sheng Disclaimer: These notes have

More information

Online Social Networks and Media

Online Social Networks and Media Online Social Networks and Media Absorbing Random Walks Link Prediction Why does the Power Method work? If a matrix R is real and symmetric, it has real eigenvalues and eigenvectors: λ, w, λ 2, w 2,, (λ

More information

Social, Information, and Routing Networks: Models, Algorithms, and Strategic Behavior

Social, Information, and Routing Networks: Models, Algorithms, and Strategic Behavior Social, Information, and Routing Networks: Models, Algorithms, and Strategic Behavior Who? Prof. Aris Anagnostopoulos Prof. Luciana S. Buriol Prof. Guido Schäfer What will We Cover? Topics: Network properties

More information

Basic Network Concepts

Basic Network Concepts Basic Network Concepts Basic Vocabulary Alice Graph Network Edges Links Nodes Vertices Chuck Bob Edges Alice Chuck Bob Edge Weights Alice Chuck Bob Apollo 13 Movie Network Main Actors in Apollo 13 the

More information

Social and Information Network Analysis Positive and Negative Relationships

Social and Information Network Analysis Positive and Negative Relationships Social and Information Network Analysis Positive and Negative Relationships Cornelia Caragea Department of Computer Science and Engineering University of North Texas June 14, 2016 To Recap Triadic Closure

More information

Extracting Communities from Networks

Extracting Communities from Networks Extracting Communities from Networks Ji Zhu Department of Statistics, University of Michigan Joint work with Yunpeng Zhao and Elizaveta Levina Outline Review of community detection Community extraction

More information

1 Homophily and assortative mixing

1 Homophily and assortative mixing 1 Homophily and assortative mixing Networks, and particularly social networks, often exhibit a property called homophily or assortative mixing, which simply means that the attributes of vertices correlate

More information

Community Structure Detection. Amar Chandole Ameya Kabre Atishay Aggarwal

Community Structure Detection. Amar Chandole Ameya Kabre Atishay Aggarwal Community Structure Detection Amar Chandole Ameya Kabre Atishay Aggarwal What is a network? Group or system of interconnected people or things Ways to represent a network: Matrices Sets Sequences Time

More information

V4 Matrix algorithms and graph partitioning

V4 Matrix algorithms and graph partitioning V4 Matrix algorithms and graph partitioning - Community detection - Simple modularity maximization - Spectral modularity maximization - Division into more than two groups - Other algorithms for community

More information

Betweenness Measures and Graph Partitioning

Betweenness Measures and Graph Partitioning Betweenness Measures and Graph Partitioning Objectives Define densely connected regions of a network Graph partitioning Algorithm to identify densely connected regions breaking a network into a set of

More information

Analysis of Biological Networks. 1. Clustering 2. Random Walks 3. Finding paths

Analysis of Biological Networks. 1. Clustering 2. Random Walks 3. Finding paths Analysis of Biological Networks 1. Clustering 2. Random Walks 3. Finding paths Problem 1: Graph Clustering Finding dense subgraphs Applications Identification of novel pathways, complexes, other modules?

More information

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University CS224W: Social and Information Network Analysis Jure Leskovec Stanford University Jure Leskovec, Stanford University http://cs224w.stanford.edu [Morris 2000] Based on 2 player coordination game 2 players

More information

Graph Theory S 1 I 2 I 1 S 2 I 1 I 2

Graph Theory S 1 I 2 I 1 S 2 I 1 I 2 Graph Theory S I I S S I I S Graphs Definition A graph G is a pair consisting of a vertex set V (G), and an edge set E(G) ( ) V (G). x and y are the endpoints of edge e = {x, y}. They are called adjacent

More information

Oh Pott, Oh Pott! or how to detect community structure in complex networks

Oh Pott, Oh Pott! or how to detect community structure in complex networks Oh Pott, Oh Pott! or how to detect community structure in complex networks Jörg Reichardt Interdisciplinary Centre for Bioinformatics, Leipzig, Germany (Host of the 2012 Olympics) Questions to start from

More information

CS 124/LINGUIST 180 From Languages to Information. Social Networks: Small Worlds, Weak Ties, and Power Laws. Dan Jurafsky Stanford University

CS 124/LINGUIST 180 From Languages to Information. Social Networks: Small Worlds, Weak Ties, and Power Laws. Dan Jurafsky Stanford University CS 124/LINGUIST 180 From Languages to Information Dan Jurafsky Stanford University Social Networks: Small Worlds, Weak Ties, and Power Laws Slides from Jure Leskovec, Lada Adamic, James Moody, Bing Liu,

More information

Introduction Types of Social Network Analysis Social Networks in the Online Age Data Mining for Social Network Analysis Applications Conclusion

Introduction Types of Social Network Analysis Social Networks in the Online Age Data Mining for Social Network Analysis Applications Conclusion Introduction Types of Social Network Analysis Social Networks in the Online Age Data Mining for Social Network Analysis Applications Conclusion References Social Network Social Network Analysis Sociocentric

More information

TELCOM2125: Network Science and Analysis

TELCOM2125: Network Science and Analysis School of Information Sciences University of Pittsburgh TELCOM2125: Network Science and Analysis Konstantinos Pelechrinis Spring 2015 Figures are taken from: M.E.J. Newman, Networks: An Introduction 2

More information

Spectral Methods for Network Community Detection and Graph Partitioning

Spectral Methods for Network Community Detection and Graph Partitioning Spectral Methods for Network Community Detection and Graph Partitioning M. E. J. Newman Department of Physics, University of Michigan Presenters: Yunqi Guo Xueyin Yu Yuanqi Li 1 Outline: Community Detection

More information

RANDOM-REAL NETWORKS

RANDOM-REAL NETWORKS RANDOM-REAL NETWORKS 1 Random networks: model A random graph is a graph of N nodes where each pair of nodes is connected by probability p: G(N,p) Random networks: model p=1/6 N=12 L=8 L=10 L=7 The number

More information

An Empirical Analysis of Communities in Real-World Networks

An Empirical Analysis of Communities in Real-World Networks An Empirical Analysis of Communities in Real-World Networks Chuan Sheng Foo Computer Science Department Stanford University csfoo@cs.stanford.edu ABSTRACT Little work has been done on the characterization

More information

Chapter 11. Network Community Detection

Chapter 11. Network Community Detection Chapter 11. Network Community Detection Wei Pan Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN 55455 Email: weip@biostat.umn.edu PubH 7475/8475 c Wei Pan Outline

More information

Overlapping Communities

Overlapping Communities Yangyang Hou, Mu Wang, Yongyang Yu Purdue Univiersity Department of Computer Science April 25, 2013 Overview Datasets Algorithm I Algorithm II Algorithm III Evaluation Overview Graph models of many real

More information

Web Structure Mining Community Detection and Evaluation

Web Structure Mining Community Detection and Evaluation Web Structure Mining Community Detection and Evaluation 1 Community Community. It is formed by individuals such that those within a group interact with each other more frequently than with those outside

More information

An Optimal Allocation Approach to Influence Maximization Problem on Modular Social Network. Tianyu Cao, Xindong Wu, Song Wang, Xiaohua Hu

An Optimal Allocation Approach to Influence Maximization Problem on Modular Social Network. Tianyu Cao, Xindong Wu, Song Wang, Xiaohua Hu An Optimal Allocation Approach to Influence Maximization Problem on Modular Social Network Tianyu Cao, Xindong Wu, Song Wang, Xiaohua Hu ACM SAC 2010 outline Social network Definition and properties Social

More information

Centralities (4) By: Ralucca Gera, NPS. Excellence Through Knowledge

Centralities (4) By: Ralucca Gera, NPS. Excellence Through Knowledge Centralities (4) By: Ralucca Gera, NPS Excellence Through Knowledge Some slide from last week that we didn t talk about in class: 2 PageRank algorithm Eigenvector centrality: i s Rank score is the sum

More information

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University http://cs224w.stanford.edu How to organize the Web? First try: Human curated Web directories Yahoo, DMOZ, LookSmart Second

More information

Centrality Book. cohesion.

Centrality Book. cohesion. Cohesion The graph-theoretic terms discussed in the previous chapter have very specific and concrete meanings which are highly shared across the field of graph theory and other fields like social network

More information

Community Structure and Beyond

Community Structure and Beyond Community Structure and Beyond Elizabeth A. Leicht MAE: 298 April 9, 2009 Why do we care about community structure? Large Networks Discussion Outline Overview of past work on community structure. How to

More information

beyond social networks

beyond social networks beyond social networks Small world phenomenon: high clustering C network >> C random graph low average shortest path l network ln( N)! neural network of C. elegans,! semantic networks of languages,! actor

More information

Lecture 2: From Structured Data to Graphs and Spectral Analysis

Lecture 2: From Structured Data to Graphs and Spectral Analysis Lecture 2: From Structured Data to Graphs and Spectral Analysis Radu Balan February 9, 2017 Data Sets Today we discuss type of data sets and graphs. The overarching problem is the following: Main Problem

More information

Graph Theory. Network Science: Graph theory. Graph theory Terminology and notation. Graph theory Graph visualization

Graph Theory. Network Science: Graph theory. Graph theory Terminology and notation. Graph theory Graph visualization Network Science: Graph Theory Ozalp abaoglu ipartimento di Informatica Scienza e Ingegneria Università di ologna www.cs.unibo.it/babaoglu/ ranch of mathematics for the study of structures called graphs

More information

Community Detection. Community

Community Detection. Community Community Detection Community In social sciences: Community is formed by individuals such that those within a group interact with each other more frequently than with those outside the group a.k.a. group,

More information

Clustering CS 550: Machine Learning

Clustering CS 550: Machine Learning Clustering CS 550: Machine Learning This slide set mainly uses the slides given in the following links: http://www-users.cs.umn.edu/~kumar/dmbook/ch8.pdf http://www-users.cs.umn.edu/~kumar/dmbook/dmslides/chap8_basic_cluster_analysis.pdf

More information

Supplementary material to Epidemic spreading on complex networks with community structures

Supplementary material to Epidemic spreading on complex networks with community structures Supplementary material to Epidemic spreading on complex networks with community structures Clara Stegehuis, Remco van der Hofstad, Johan S. H. van Leeuwaarden Supplementary otes Supplementary ote etwork

More information

2. CONNECTIVITY Connectivity

2. CONNECTIVITY Connectivity 2. CONNECTIVITY 70 2. Connectivity 2.1. Connectivity. Definition 2.1.1. (1) A path in a graph G = (V, E) is a sequence of vertices v 0, v 1, v 2,..., v n such that {v i 1, v i } is an edge of G for i =

More information

Visual Representations for Machine Learning

Visual Representations for Machine Learning Visual Representations for Machine Learning Spectral Clustering and Channel Representations Lecture 1 Spectral Clustering: introduction and confusion Michael Felsberg Klas Nordberg The Spectral Clustering

More information

Community Detection in Social Networks

Community Detection in Social Networks San Jose State University SJSU ScholarWorks Master's Projects Master's Theses and Graduate Research Spring 5-24-2017 Community Detection in Social Networks Ketki Kulkarni San Jose State University Follow

More information

ALTERNATIVES TO BETWEENNESS CENTRALITY: A MEASURE OF CORRELATION COEFFICIENT

ALTERNATIVES TO BETWEENNESS CENTRALITY: A MEASURE OF CORRELATION COEFFICIENT ALTERNATIVES TO BETWEENNESS CENTRALITY: A MEASURE OF CORRELATION COEFFICIENT Xiaojia He 1 and Natarajan Meghanathan 2 1 University of Georgia, GA, USA, 2 Jackson State University, MS, USA 2 natarajan.meghanathan@jsums.edu

More information