Network Science: Principles and Applications

1 Network Science: Principles and Applications CS Fall 2016 Amarda Shehu,Fei Li [amarda, lifei](at)gmu.edu Department of Computer Science George Mason University

2 Outline of Today's Class. (1) Outline of Today's Class. (2) Communities: Basics of Communities: Community Structure in Real Networks; Examples of Network Communities; Network Community Detection Algorithms (via Hierarchical Clustering); Clustering, Community Structure, and Optimality in Real Networks; Hierarchical Network Model; Clustering to Optimal Partition: Modularity; Modularity Maximization; Overlapping Communities; Comparison of Community Detection Algorithms. (3) Advanced (more Theoretical) Topics/Treatments: Modularity Maximization; Spectral Graph Partitioning. Amarda Shehu,Fei Li () Outline of Today's Class 2

3 Communities within Networks Networks play the powerful role of bridging the local and the global Explain how processes at node/link level ripple to a population We often think of (social) networks as having the following structure Q: Can we gain insights behind this conceptualization? Amarda Shehu,Fei Li () Communities 3

4 Context. In the 1960s, M. Granovetter interviewed people who had changed jobs and asked how they discovered their new jobs. Many learned about opportunities through personal contacts. Surprisingly, these contacts were often acquaintances rather than close friends, even though close friends likely have the most motivation to help you out. Q: Why do distant acquaintances convey the crucial information? M. Granovetter, Getting a job: A study of contacts and careers. University of Chicago Press, 1974

5 Granovetter's Answer and Impact. Linked two different perspectives on distant friendships. Structural: focus on how friendships span the network. Interpersonal: local consequences of a friendship being strong or weak. The structural and informational roles of an edge are intertwined: (1) Structurally embedded edges within a community tend to be socially strong and highly redundant in terms of information access. (2) Long-range edges spanning different parts of the network tend to be socially weak and offer access to useful information (e.g., on a new job!). This is a general way of thinking about the architecture of social networks; the answer transcends the specific setting of job-seeking.

6 Central to Emergence of Communities: Triadic Closure. A basic principle of network formation is triadic closure: if two people have a friend in common, there is an increased likelihood that they will become friends in the future. Emergent edges in a social network are likely to close triangles. Figure: More likely to see the red edge than the blue one. The prevalence of triadic closure is measured by the clustering coefficient: $cl(v) = \frac{\#\,\text{pairs of friends of } v \text{ that are connected}}{\#\,\text{pairs of friends of } v}$
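
The clustering coefficient cl(v) defined on this slide can be computed directly from an adjacency structure. The following is a minimal sketch, assuming an undirected graph stored as a dict of neighbor sets; the toy graph `adj` and the convention of returning 0 for nodes with fewer than two friends are illustrative assumptions, not part of the original slides.

```python
from itertools import combinations

def clustering_coefficient(adj, v):
    """cl(v): fraction of pairs of v's friends that are themselves connected."""
    friends = adj[v]
    if len(friends) < 2:
        return 0.0  # assumed convention: no pairs of friends -> 0
    pairs = list(combinations(friends, 2))
    closed = sum(1 for a, b in pairs if b in adj[a])
    return closed / len(pairs)

# Hypothetical toy graph: triangle A-B-C plus a pendant node D attached to A.
adj = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}

for node in sorted(adj):
    print(node, clustering_coefficient(adj, node))
```

Here A has three pairs of friends, of which only B-C is connected, so cl(A) = 1/3, while B and C each have a single, connected pair of friends.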

7 Reasons for Triadic Closure Triadic closure is intuitively very natural. Reasons why it operates: Increased opportunity for B and C to meet Both spend time with A There is a basis for mutual trust among B and C Both have A as a common friend A may have an incentive to bring B and C together Lack of friendship may become a source of latent stress Premise based on theories dating to early work in social psychology F. Heider, The Psychology of Interpersonal Relations. Wiley, 1958 Amarda Shehu,Fei Li () Communities 7

8 Bridges. E.g.: Consider the simple social network here. A's links to C, D, and E connect her to a tightly knit group; A, C, D, and E are likely exposed to similar opinions. A's link to B seems to reach into a different part of the network and offers her access to views she would otherwise not hear about. The A-B edge is called a bridge: its removal disconnects the network. Giant components suggest that bridges are quite rare.

9 Local Bridges. E.g.: In reality, the social network is larger and may look as in the figure. Figure: Without A and B knowing it, a longer path may connect them. Defn: The span of (u, v) is the u-v distance when the edge (u, v) is removed. Defn: A local bridge is an edge with span > 2. Ex: Edge (A, B) is a local bridge with span 3. Local bridges with large spans behave like bridges, but less extremely. Link with triadic closure: local bridges are not part of triangles.
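
The two definitions above (span and local bridge) translate into a breadth-first search that ignores the edge in question. This is a sketch under the assumption of an undirected, unweighted graph stored as a dict of neighbor sets; the toy graph is hypothetical and is not the network in the slide's figure.

```python
from collections import deque

def span(adj, u, v):
    """u-v distance once edge (u, v) is ignored; infinity if u and v become disconnected."""
    dist = {u: 0}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if {x, y} == {u, v}:
                continue  # skip the removed edge itself
            if y not in dist:
                dist[y] = dist[x] + 1
                if y == v:
                    return dist[y]
                queue.append(y)
    return float("inf")

def is_local_bridge(adj, u, v):
    """A local bridge is an edge with span > 2 (a true bridge has infinite span)."""
    return span(adj, u, v) > 2

# Hypothetical toy graph: cycle A-C-D-B-A, triangle A-C-E, pendant node F on B.
adj = {
    "A": {"B", "C", "E"},
    "B": {"A", "D", "F"},
    "C": {"A", "D", "E"},
    "D": {"B", "C"},
    "E": {"A", "C"},
    "F": {"B"},
}

print(span(adj, "A", "B"), is_local_bridge(adj, "A", "B"))
print(span(adj, "A", "C"), is_local_bridge(adj, "A", "C"))
```

Edge A-C lies on the triangle A-C-E, so its span is 2 and it is not a local bridge, consistent with the observation that local bridges are never part of triangles.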

10 Strong Triadic Closure Property Categorize all edges in the network according to their strength Strong ties corresponding to friendship Weak ties corresponding to acquaintances Opportunity, trust, incentive act more powerfully for strong ties Suggests qualitative assumption termed strong triadic closure Two strong ties implies a third edge exists closing the triangle Abstraction to reason about consequences of strong/weak ties Amarda Shehu,Fei Li () Communities 10

11 Local Bridges and Weak Ties (a) Local, interpersonal distinction between edges = strong/weak ties (b) Global, structural notion = local bridges present or absent Theorem: If a node satisfies the strong triadic closure property and is involved in at least two strong ties, then any local bridge incident to it is a weak tie. Links structural and interpersonal perspectives on friendships Back to job-seeking, local bridges connect to new information Conceptual span is related to their weakness as social ties Surprising dual role suggests a strength of weak ties Amarda Shehu,Fei Li () Communities 11

12 Proof By Contradiction. Proof: We argue by contradiction. Suppose node A satisfies the strong triadic closure property and has at least 2 strong ties. Suppose further that A-B is a local bridge as well as a strong tie. Since A has at least two strong ties, it has another strong tie, say A-C. Figure: Edge B-C must exist by strong triadic closure. Then A-B is part of the triangle A-B-C, which contradicts A-B being a local bridge (a local bridge cannot be part of a triangle).

13 Tie Strength and Structure in Large-Scale Data. Q: Can one test Granovetter's theory with real network data? This was hard for decades because large-scale social interaction surveys were lacking. Now we have who-calls-whom networks with both key ingredients: the network structure of communication among pairs of people, and total talking time, i.e., a proxy for tie strength. E.g.: Cell-phone network spanning 20% of a country's population. J.-P. Onnela et al., Structure and tie strengths in mobile communication networks, PNAS, vol. 104, 2007. How can we generalize weak ties and local bridges?

15 Generalizing Weak Ties and Local Bridges. The model described so far imposes sharp dichotomies on the network: edges are either strong or weak, local bridges or not. It is convenient to have proxies exhibiting smoother gradations. Numerical tie strength = minutes spent in phone conversations. Order edges by strength and report their percentile occupancy. Generalize local bridges with the neighborhood overlap $O_{ij}$ of edge $(i, j)$: $O_{ij} = \frac{|n(i) \cap n(j)|}{|n(i) \cup n(j)|}$, where $n(i) = \{j \in V : (i, j) \in E\}$. Desirable property: $O_{ij} = 0$ if $(i, j)$ is a local bridge.
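
A short sketch of the neighborhood overlap, following the formula on this slide literally (n(i) is the full neighbor set of i, so the endpoints i and j appear in the union). Some textbook variants exclude the endpoints from the denominator; that choice, the dict-of-sets representation, and the toy graph are assumptions made here for illustration.

```python
def neighborhood_overlap(adj, i, j):
    """O_ij = |n(i) & n(j)| / |n(i) | n(j)| with n(x) the neighbor set of x."""
    union = adj[i] | adj[j]
    if not union:
        return 0.0  # assumed convention for two isolated nodes
    return len(adj[i] & adj[j]) / len(union)

# Hypothetical toy graph: cycle A-C-D-B-A, triangle A-C-E, pendant node F on B.
adj = {
    "A": {"B", "C", "E"},
    "B": {"A", "D", "F"},
    "C": {"A", "D", "E"},
    "D": {"B", "C"},
    "E": {"A", "C"},
    "F": {"B"},
}

print(neighborhood_overlap(adj, "A", "B"))  # A-B is a local bridge: no common neighbors
print(neighborhood_overlap(adj, "A", "C"))  # A and C share the neighbor E
```

Because the endpoints of a local bridge have no common neighbors, the numerator vanishes and the overlap is 0, which is the desirable property stated above.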

16 Empirical Results. Strength of weak ties prediction: $O_{ij}$ grows with tie strength. The dependence is borne out very cleanly by the data (o points). Control: randomly permuted tie strengths on the same, fixed network structure (second set of points), which effectively removes the coupling between $O_{ij}$ and tie strength.

17 Phone Network and Tie Strength Cell-phone network with color-coded tie strengths: (1) Stronger ties more structurally-embedded (within communities) (2) Weaker ties correlate with long-range edges joining communities Amarda Shehu,Fei Li () Communities 16

18 Approximating Link Weights. If the link weights are driven by the need to transport information or materials, as is often the case in technological and biological systems, the weights are well approximated by betweenness centrality. Figure: links connecting communities have high betweenness (red), whereas links within communities have low betweenness (green).

19 Weak Ties Linking Communities. Strength of weak ties prediction: long-range, weak ties bridge communities. Delete edges one at a time, from the weakest (smallest overlap) to the strongest: the giant component shrinks rapidly and eventually disappears. Repeat with strong-to-weak tie deletions: slower shrinkage is observed.

20 Closing the Loop. We often think of (social) networks as having the following structure. This conceptual picture is supported by Granovetter's strength of weak ties.

21 Communities. Nodes in real-world networks organize into communities. E.g.: families, clubs, political organizations, proteins by function, ... Supported by Granovetter's strength of weak ties theory. Community (a.k.a. group, cluster, module) members are: well connected among themselves; relatively well separated from the rest; highly cohesive w.r.t. the underlying relational patterns.

22 Zachary's Karate Club. Social interactions among members of a karate club in the 1970s. Zachary witnessed the club split in two during his study. A toy network, yet a canonical benchmark for community detection algorithms. Offers ground-truth community membership (a rare luxury).

24 Political Blogs The political blogosphere for the US 2004 presidential election Community structure of liberal and conservative blogs is apparent People have a stronger tendency to interact with equals Amarda Shehu,Fei Li () Communities 22

25 Electrical Power Grid Split power network into areas with minimum inter-area interactions Applications: Decide control areas for distributed power system state estimation Parallel computation of power flow Controlled islanding to prevent spreading of blackouts Amarda Shehu,Fei Li () Communities 23

26 Scientific Collaboration Network E.g.: Coauthorship network of scientists at the Santa Fe Institute Communities found can be traced to different disciplines Amarda Shehu,Fei Li () Communities 24

27 Belgium Population Call pattern of 2M consumers of largest Belgian mobile phone company The color of each community on a red-green scale represents the language spoken in the particular community, red for French and green for Dutch Bicultural society structure is apparent Model for peaceful coexistence? Minimal cross-community contact Amarda Shehu,Fei Li () Communities 25

28 High-school Social Network Network of social interactions among high-school students Strong assortative mixing, with race as latent characteristic Amarda Shehu,Fei Li () Communities 26

29 Co-authorship Network Coauthorship network of physicists publishing in network science Tightly-knit subgroups are evident from the network structure Amarda Shehu,Fei Li () Communities 27

30 College Football Vertices are NCAA football teams, edges are games during Fall 2000 Communities are the NCAA conferences and independent teams Amarda Shehu,Fei Li () Communities 28

31 Social Network Facebook egonet with 744 vertices and 30K edges Asked ego to identify social circles to which friends belong Company, high-school, basketball club, squash club, family Amarda Shehu,Fei Li () Communities 29

32 Biological Network Community structure of biological systems central to understanding and treating human disease community: group of molecules that form functional modules to carry out a specific cellular function E.g.: Proteins involved in same disease tend to interact with one another Disease module hypothesis: Each disease can be linked to a well-defined neighborhood of the cellular network Amarda Shehu,Fei Li () Communities 30

33 Unveiling Network Communities. Nodes in real-world networks organize into communities. E.g.: families, clubs, political organizations, proteins by function, ... Supported by Granovetter's strength of weak ties theory. Community (a.k.a. group, cluster, module) members are: well connected among themselves; relatively well separated from the rest; highly cohesive w.r.t. the underlying relational patterns. Q: How can we automatically identify such cohesive subgroups?

35 Community Detection: What Makes a Community Useful for exploratory analysis of network data E.g.: clues about social interactions, content-related web pages, disease, collective phenomena Community detection is a challenging clustering problem (1) No consensus on the structural definition of community (2) Node subset selection often intractable (3) Lack of ground-truth for validation Connectedness and Density Hypothesis: Connectedness: All members of a community must be reached through other members of the same community Locally dense: Nodes that belong to a community have a higher probability to link to others in the same community than to nodes outside the community. A community is a locally dense connected subgraph in the network Amarda Shehu,Fei Li () Communities 32

37 Communities as Special Subgraphs. Luce & Perry, A method of matrix analysis of group structure, Psychometrika, 1949, defined a community as a complete subgraph, i.e., a clique. A clique automatically satisfies the criterion of a locally dense connected subgraph, but this definition misses legitimate, imperfect communities. Consider a node $i$ in a connected subgraph $C \subseteq G$ that has $N_c \leq N$ nodes. Let $k_i^{int}$ be the number of links that connect $i$ to nodes in $C$. Let $k_i^{ext}$ be the number of links that connect $i$ to nodes in $G \setminus C$. $C$ is a strong community if each node within $C$ has more links within $C$ than outside: $\forall i \in C,\; k_i^{int} > k_i^{ext}$. $C$ is a weak community if the inequality above does not hold for all nodes in $C$, but the total internal degree exceeds the total external degree: $\sum_{i \in C} k_i^{int} > \sum_{i \in C} k_i^{ext}$. How many communities are there? To answer this, pose community detection as graph partitioning and, as a simple version of it, graph bisection (next).
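
The strong and weak community conditions can be checked mechanically once the internal and external degrees of each member are counted. The sketch below assumes a dict-of-sets graph; the function names and the toy graph are hypothetical and only illustrate the two inequalities.

```python
def internal_external_degrees(adj, community):
    """Map each node i in C to (k_int, k_ext): links to C and links to the rest of G."""
    C = set(community)
    return {i: (len(adj[i] & C), len(adj[i] - C)) for i in C}

def is_strong_community(adj, community):
    """Every node has more links inside C than outside."""
    degs = internal_external_degrees(adj, community)
    return all(k_int > k_ext for k_int, k_ext in degs.values())

def satisfies_weak_condition(adj, community):
    """Total internal degree of C exceeds its total external degree."""
    degs = internal_external_degrees(adj, community)
    total_int = sum(k_int for k_int, _ in degs.values())
    total_ext = sum(k_ext for _, k_ext in degs.values())
    return total_int > total_ext

# Hypothetical toy graph: triangle 1-2-3, with node 3 also tied to the pair 4-5.
adj = {
    1: {2, 3},
    2: {1, 3},
    3: {1, 2, 4, 5},
    4: {3, 5},
    5: {3, 4},
}

C = {1, 2, 3}
print(is_strong_community(adj, C), satisfies_weak_condition(adj, C))
```

For C = {1, 2, 3}, node 3 has two internal and two external links, so the strong condition fails, while the totals (6 internal versus 2 external) satisfy the weak condition.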

38 Community Detection and Graph Partitioning Graph Partitioning Split V into given number of non-overlapping groups of given sizes Criterion: number of edges between groups is minimized This is known as minimizing cut E.g.: task-processor assignment for load balancing, chip design Number and sizes of groups unspecified in community detection Identify the natural fault lines along which a network separates A special version of this problem is graph bisection: Partition the graph G into two non-overlapping subgraphs, such that the number of edges/links between subgraphs, called cut size is minimized. Amarda Shehu,Fei Li () Communities 34

39 Graph Partitioning is Hard. E.g.: Graph bisection problem, i.e., partition $V$ into two groups ($|V| = N$). Suppose the groups $V_1$ and $V_2$ are non-overlapping and have equal size, i.e., $|V_1| = |V_2| = N/2$. Minimize the number of edges running between vertices in different groups. Simple problem to describe, but hard to solve. Number of ways to partition $V$: $\binom{N}{N/2} \sim \frac{2^{N+1}}{\sqrt{N}}$, using Stirling's formula $N! \approx \sqrt{2\pi N}\,(N/e)^N$. In this case, $\binom{N}{N/2} \sim e^{(N+1)\ln 2 - \frac{1}{2}\ln N}$. Since the number of bisections increases exponentially with the size of the network, exhaustive (brute-force) search is intractable beyond small toy networks. No polynomial-time algorithm is known (the problem is NP-hard). Seek good heuristics, e.g., relaxations of natural criteria.

40 Community Detection and Graph Partitioning. In community detection, the size and number of communities are not known a priori. Let us call a partition a division of a network into an arbitrary number of groups, such that each node belongs to one and only one group (non-overlapping). The number of possible partitions (the Bell number) is $B_N = \frac{1}{e} \sum_{j=0}^{\infty} \frac{j^N}{j!}$. We need to identify communities without the luxury of inspecting all possible partitions, but with some freedom in the definition of a community.
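
To get a feel for how quickly brute-force search becomes hopeless, the counts discussed above (equal-size bisections and Bell numbers) can be evaluated for small N. This is an illustrative sketch using only the Python standard library (`math.comb` requires Python 3.8 or later); the Bell numbers are computed with the Bell triangle recurrence rather than the infinite sum, which is a substitution made here for exactness.

```python
import math

def num_bisections(n):
    """Number of ways to choose N/2 of N nodes for the first group."""
    return math.comb(n, n // 2)

def bell_number(n):
    """B_n via the Bell triangle: each row starts with the last entry of the previous row."""
    row = [1]
    for _ in range(n):
        new_row = [row[-1]]
        for x in row:
            new_row.append(new_row[-1] + x)
        row = new_row
    return row[0]

for n in (10, 20, 40):
    print(n, num_bisections(n), bell_number(n))

# Dobinski's formula, truncated, should agree with the recurrence for small N.
approx_b5 = sum(j ** 5 / math.factorial(j) for j in range(60)) / math.e
print(bell_number(5), round(approx_b5))
```

Both counts grow far faster than any polynomial in N, which is why the slides turn to heuristics that avoid enumerating partitions.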

41 Community Detection and Graph Partitioning Need algorithms whose running time grows polynomially with respect to N Clustering algorithms address this need They identify communities by either growing communities or dividing the graph Many exploit the strength of weak ties Short detour... Amarda Shehu,Fei Li () Communities 37

42 Strength of Weak Ties Motivation. Local bridges connect weakly interacting parts of the network. Q: What about removing those to reveal communities? Challenges: There may be multiple local bridges. Are some better than others? Which one should be removed first? There might be no local bridge, yet an apparent natural division.

43 Edge Betweenness Centrality. Idea: use high edge betweenness centrality to identify weak ties. Edges with high $c_{Be}(e)$ carry a large traffic volume over shortest paths and sit at the interface between tightly knit groups. E.g.: cell-phone network with colored edge strength and betweenness.

44 Girvan-Newman's method. Girvan-Newman's method is conceptually simple: find and remove spanning links between cohesive subgroups. Algorithm: Repeat until there are no edges left: calculate the betweenness centrality $c_{Be}(e)$ of all edges; remove the edge(s) with the highest $c_{Be}(e)$. The connected components are the communities identified. Divisive method: the network falls apart into pieces as we go. Nested partition: larger communities potentially host denser groups. Edge betweenness is recomputed after each removal step. M. Girvan and M. Newman, Community structure in social and biological networks, PNAS, vol. 99, 2002

45 Girvan-Newman s Algorithm in Action Amarda Shehu,Fei Li () Communities 41

46 Hierarchical Clustering: Common Framework. Another way to look at Girvan-Newman's algorithm is as a divisive hierarchical clustering algorithm. We will cover two hierarchical clustering algorithms (the domain is very rich): the Ravasz algorithm (agglomerative) and Girvan-Newman's algorithm (divisive). In principle, any clustering algorithm can be adapted and applied to detect communities in a network. Common framework: exploitation of a similarity or dissimilarity matrix whose elements $x_{ij}$ indicate the similarity or distance of node $i$ from node $j$. E.g., similarity extracted from the relative positions of $i$ and $j$ within the network. Agglomerative: merge nodes with high similarity into the same community. Divisive: isolate communities by removing low-similarity (high-dissimilarity), cross-community links.

47 Agglomerative Approach to Hierarchical Clustering. Take the Ravasz algorithm as an example; it was designed to detect functional modules in metabolic networks [E. Ravasz and A.-L. Barabasi. Hierarchical organization in complex networks. Physical Review E 2003]. Key idea: similarity should be high for nodes that belong to the same community and low for nodes residing in different communities. Proposal: nodes that connect to each other and share neighbors belong to the same community. Topological similarity: $x_{ij} = \frac{J(i,j)}{\min(k_i, k_j) + 1 - \theta(A_{ij})}$, where $\theta(x) = 0$ for $x \leq 0$ and $1$ for $x > 0$ (Heaviside step function), $J(i, j)$ is the number of shared neighbors (+1 if $(i, j) \in E$), and $k_i$, $k_j$ are degrees. So $x_{ij} = 1$ if nodes $i$ and $j$ are linked and share the same neighbors; $x_{ij} = 0$ if $i$ and $j$ are not linked and do not share any neighbors; otherwise $x_{ij}$ takes a value in $[0, 1]$.
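
The topological similarity on this slide is easy to evaluate for a pair of nodes. The sketch below assumes an undirected simple graph as a dict of neighbor sets; the function name and toy graph are hypothetical, and the formula is implemented as reconstructed from the slide (shared neighbors, plus one if the pair is linked, over the smaller degree plus one minus the link indicator).

```python
def topological_similarity(adj, i, j):
    """x_ij = J(i, j) / (min(k_i, k_j) + 1 - theta(A_ij))."""
    linked = 1 if j in adj[i] else 0            # theta(A_ij)
    shared = len(adj[i] & adj[j]) + linked      # J(i, j)
    denom = min(len(adj[i]), len(adj[j])) + 1 - linked
    return shared / denom

# Hypothetical toy graph: triangle 1-2-3, pendant node 4 on 3, separate pair 5-6.
adj = {
    1: {2, 3},
    2: {1, 3},
    3: {1, 2, 4},
    4: {3},
    5: {6},
    6: {5},
}

print(topological_similarity(adj, 1, 2))  # linked, same other neighbor
print(topological_similarity(adj, 1, 4))  # not linked, one shared neighbor
print(topological_similarity(adj, 1, 5))  # not linked, nothing shared
```

Nodes 1 and 2 are linked and share neighbor 3, giving similarity 1; nodes 1 and 5 are unlinked with no common neighbors, giving 0; the pair (1, 4) falls in between.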

49 Ravasz-Barabasi Hierarchical Clustering Algorithm What to do with this similarity matrix? = Neighbor Joining Amarda Shehu,Fei Li () Communities 44

50 Ravasz-Barabasi Hierarchical Clustering Algorithm S0: Assign each node i to its own community {i} S1: Evaluate x ij for all pairs of communities S2: Find the community pair with the highest similarity and join/merge them into a single community S3: If there is more than one community, go to S1 How do we determine similarity between communities of more than one node? = Several options: Single, complete, or average group/cluster similarity Single-link clustering: similarity of two clusters is the similarity of their most similar members Complete-link clustering: the similarity of two clusters is the similarity of their most dissimilar members In Ravasz-Barabasi, average cluster similarity is used: similarity between all pairs of nodes residing in two different clusters is computed and averaged to report the similarity between two groups/clusters Amarda Shehu,Fei Li () Communities 45
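
Steps S0 to S3 with average cluster similarity can be sketched in a few lines. This is a deliberately naive version that recomputes all cluster similarities at every step (so it is slower than the running time quoted in these slides); the similarity values in `S` are made-up numbers for illustration, not data from the lecture.

```python
def average_linkage(nodes, sim):
    """Agglomerative clustering; returns the merge history [(cluster_a, cluster_b, similarity)]."""
    clusters = [frozenset([n]) for n in nodes]          # S0: one community per node

    def cluster_sim(a, b):
        # Average similarity over all node pairs residing in the two clusters
        return sum(sim(i, j) for i in a for j in b) / (len(a) * len(b))

    merges = []
    while len(clusters) > 1:                            # S3: repeat while more than one community
        candidates = (
            (cluster_sim(a, b), x, y)                   # S1: evaluate all community pairs
            for x, a in enumerate(clusters)
            for y, b in enumerate(clusters)
            if x < y
        )
        s, x, y = max(candidates, key=lambda t: t[0])   # S2: most similar pair
        a, b = clusters[x], clusters[y]
        merges.append((a, b, s))
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)] + [a | b]
    return merges

# Hypothetical symmetric similarity values for four nodes.
S = {("A", "B"): 0.9, ("A", "C"): 0.2, ("A", "D"): 0.1,
     ("B", "C"): 0.3, ("B", "D"): 0.2, ("C", "D"): 0.8}

def sim(i, j):
    return S[(i, j)] if (i, j) in S else S[(j, i)]

history = average_linkage(["A", "B", "C", "D"], sim)
for a, b, s in history:
    print(sorted(a), sorted(b), round(s, 3))
```

The merge history is exactly the information a dendrogram displays: here A and B join first, then C and D, and finally the two pairs merge at the average similarity of their four cross pairs.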

51 Cluster Similarity Figure: (a) Pairwise similarity matrix. (b) Single-link(age) clustering. (c) Complete-link clustering. (d) Average link clustering. Amarda Shehu,Fei Li () Communities 46

52 Ravasz-Barabasi Hierarchical Clustering Algorithm: Time Complexity. S0: Assign each node $i$ to its own community $\{i\}$: $O(N)$ worst-case running time. S1: Evaluate $x_{ij}$ for all pairs of communities: $O(N^2)$ the first time around. S2: Find the community pair with the highest similarity and merge them into a single community. S1 + S2 are repeated at most $N$ times, each time determining the distance of the new cluster to the others: $O(N^2)$ time. Total time: $O(N^2)$. Where are the communities? In what order were smaller communities joined into larger ones? Answering this requires maintaining the order of joining events in a tree, which adds $O(N \log N)$ to the running time.

54 Ravasz-Barabasi Hierarchical Clustering Algorithm: Partitions. The order in which communities are merged can be visualized via a dendrogram. Can you infer the order of events? Where are the communities? (more later)

55 Girvan-Newman Hierarchical Clustering Algorithm. A divisive algorithm that relies on a dissimilarity matrix $x_{ij}$. Proposal: exploit $x_{ij}$ to select node pairs that are in different communities; $x_{ij}$ should be high for nodes in different communities and low for nodes in the same community. Reasonable options: link betweenness and random-walk betweenness; link betweenness is the fastest to calculate. In Girvan-Newman, the link betweenness $x_{ij}$ is defined as the number of shortest paths that go through the link $(i, j)$. If I am to remove links one at a time based on their high-to-low $x_{ij}$ value, does link betweenness make sense? What values of $x_{ij}$ do I expect for links that connect communities?

56 Girvan-Newman Hierarchical Clustering Algorithm. Algorithm: S1: Compute the centrality of each link. S2: Remove the link with the highest centrality. S3: Recalculate the centrality of each link in the altered network. S4: Repeat S2-S3 until all links are removed. Computational complexity: The most expensive step is the recalculation of centrality. If link betweenness is used, one calculation takes $O(LN)$ worst-case running time. Repeating S3 after each link removal adds a factor of $L$. Total time: $O(L^2 N)$.
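
Steps S1 to S4 can be sketched with a breadth-first computation of link betweenness (in the style of Brandes' accumulation scheme, swapped in here as one standard way to obtain shortest-path counts) followed by repeated removal of the top link. The graph representation, function names, and toy graph are assumptions for illustration; ties for the highest betweenness are broken arbitrarily.

```python
from collections import deque

def edge_betweenness(adj):
    """Shortest-path link betweenness for an unweighted, undirected graph."""
    bet = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        dist, sigma, preds, order = {s: 0}, {s: 1.0}, {s: []}, []
        queue = deque([s])
        while queue:                       # BFS counting shortest paths from s
            x = queue.popleft()
            order.append(x)
            for y in adj[x]:
                if y not in dist:
                    dist[y], sigma[y], preds[y] = dist[x] + 1, 0.0, []
                    queue.append(y)
                if dist[y] == dist[x] + 1:
                    sigma[y] += sigma[x]
                    preds[y].append(x)
        delta = {x: 0.0 for x in order}
        for y in reversed(order):          # accumulate path shares, farthest nodes first
            for x in preds[y]:
                share = sigma[x] / sigma[y] * (1.0 + delta[y])
                bet[frozenset((x, y))] += share
                delta[x] += share
    return {e: b / 2.0 for e, b in bet.items()}  # each node pair was counted twice

def components(adj):
    seen, comps = set(), []
    for s in adj:
        if s not in seen:
            comp, stack = set(), [s]
            while stack:
                x = stack.pop()
                if x not in comp:
                    comp.add(x)
                    stack.extend(adj[x] - comp)
            seen |= comp
            comps.append(frozenset(comp))
    return comps

def girvan_newman(adj):
    """Yield the partition into components each time a removal splits the network."""
    adj = {u: set(vs) for u, vs in adj.items()}   # work on a copy
    n_comps = len(components(adj))
    while any(adj.values()):                      # S4: until all links are removed
        bet = edge_betweenness(adj)               # S1/S3: (re)compute centrality
        u, v = max(bet, key=bet.get)              # S2: link with highest centrality
        adj[u].discard(v)
        adj[v].discard(u)
        comps = components(adj)
        if len(comps) > n_comps:
            n_comps = len(comps)
            yield comps

# Hypothetical toy graph: two triangles joined by the link 3-4.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
levels = list(girvan_newman(adj))
print(sorted(sorted(c) for c in levels[0]))
```

The successive partitions in `levels` correspond to the levels of the dendrogram; choosing among them is the question that modularity addresses later in the lecture.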

57 Girvan-Newman Hierarchical Clustering Algorithm. Figure: (a)-(d) sequence of link removals. (e) Accompanying dendrogram. (f) Which partition is optimal? Modularity helps us determine that.

58 Partitions on the Dendrogram. Figure: (a) dendrogram. (b)-(d) different partitions.

59 Dendrograms Show Partitions, Not Final Partition! Hierarchical partitions are often represented with a dendrogram, which shows the groups found in the network at all algorithmic steps and splits the network at different resolutions. E.g.: Girvan-Newman's algorithm for Zachary's karate club. Q: Which of the divisions/partitions is the most useful/optimal in some sense? A: Need to define metrics to compare partitions!

62 Clustering, Optimality, and Community Structure in Real Networks. Hierarchical clustering assumes nested communities, as captured by the dendrograms these algorithms produce. The density hypothesis states that there are locally dense subgraphs weakly linked to other locally dense subgraphs. Q1: How do we know such organization exists in real networks? Hubs connect communities, so is there a conflict between the scale-free property and community structure? Need to develop a model that reconciles the two. Q2: If community structure exists, then how can hierarchical clustering algorithms be used to reveal the optimal organization/partition? Need a way to rank/score partitions and report the optimal one.

63 Reconciling Scale-free Property and Community Structure Left: Start with a module of 5 nodes, one at center of a square. Connect all of them to one another. Middle: Create four identical replicas and connect the peripheral nodes of each module to the central node of original module. Right: Create four replicas of the 25-node module and connect the peripheral nodes again to central node of original module, obtaining a 125-node network. Process can be repeated ad infinitum. Amarda Shehu,Fei Li () Communities 55

64 Hierarchical Network Model. The hierarchical network model generates a scale-free network with degree exponent $\gamma = 1 + \frac{\ln 5}{\ln 4} \approx 2.161$. For the Erdos-Renyi and Barabasi-Albert models, the clustering coefficient $C$ decreases with $N$; for the hierarchical network model, $C$ is independent of $N$. An $N$-independent clustering coefficient is observed in metabolic networks. The quantitative signature of nested communities (hierarchical modularity) is the dependence of a node's clustering coefficient on its degree: $C(k) \sim k^{-1}$. The higher a node's degree, the smaller its clustering coefficient. Small-degree nodes have high $C$ because they reside in dense communities. High-degree nodes have low $C$ because they connect to different communities. Inspecting $C(k)$ allows determining whether a network is hierarchical. The power grid lacks hierarchical modularity ($C(k)$ independent of $k$). In many real networks, $C(k)$ decreases with $k$. More detailed models predict $C(k) \sim k^{-\beta}$ with $\beta \in [0, 2]$.

65 Modularity. The size of communities is typically unknown, so it must be identified automatically. Devise a function that measures how well a network is partitioned into communities. Intuition: the density of edges in communities is higher than expected at random. Hypothesis: Randomly wired networks lack an inherent community structure. By comparing the link density of a subgraph with the link density obtained for the same group of nodes when randomly wired, one can determine whether the found subgraph is a community. Systematic deviations from a random configuration allow defining a quantity called modularity to measure the quality of a particular partition. Modularity offers a way to determine the optimal partition from the dendrogram and so predict a partition. Modularity optimization also offers a novel approach to community detection via optimization algorithms.

66 Modularity of a Proposed Community. Suppose a subgraph $s$ has been identified and proposed to be a community. One can measure its modularity as: $Q(s) \propto (\#\text{ of edges within group } s) - E[\#\text{ of such edges}]$. This operationalizes the hypothesis that there should be more edges within a true community than what would be obtained in a null model. Proposal: $Q(s) = \frac{1}{2L} \sum_{(i,j) \in s} (A_{ij} - p_{ij})$, where $L$ is the number of edges in the entire graph/network and $p_{ij}$ is the probability of an edge by chance, obtained by randomizing the original network $G$ while preserving the expected degree of each node. Under the degree-preserving null model: $p_{ij} = \frac{k_i k_j}{2L}$. Why?

68 Expected Connectivity Among Nodes. Null model: randomize edges preserving the degree distribution in $G$. Random variable $A_{ij} := I\{(i, j) \in E\}$. Its expectation is $E[A_{ij}] = P((i, j) \in E)$. Suppose node $i$ has degree $k_i$ and node $j$ has degree $k_j$. Degree is the number of spokes per node; there are $2L$ spokes in $G$. The probability that spoke $i_k$ of node $i$ is connected to $j$ is $\frac{k_j}{2L - 1} \approx \frac{k_j}{2L}$. Treating the spoke events as disjoint:
$P((i, j) \in E) = P\left(\bigcup_{i_k = 1}^{k_i} \{\text{spoke } i_k \text{ connected to node } j\}\right) = \sum_{i_k = 1}^{k_i} P(\text{spoke } i_k \text{ connected to node } j) = \frac{k_i k_j}{2L}$
Back to modularity of a subgraph and modularity of a partition.

69 From Modularity of a Subgraph to Modularity of a Partition
So, the modularity Q(s) of a subgraph s is: $Q(s) = \frac{L_s}{L} - \left(\frac{k_s}{2L}\right)^2$, where $L_s$ is the total number of edges in s and $k_s$ is the total degree of the nodes in s.
Now consider a partition S of a graph G into groups/subgraphs $s \in S$.
Summing over all groups $s \in S$ gives the modularity score of a particular partition S: $Q(G, S) = \sum_{s \in S} Q(s)$
Formally, after normalization, $Q(G, S) \in [-1, 1]$.
Can evaluate the modularity of each partition in a dendrogram, rank the partitions, and pick the best one.
The maximum value gives the best community structure.
How sensitive is modularity to changes to a partition?
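The partition score can be computed directly from $L_s$ and $k_s$. A minimal sketch, assuming an edge list for a simple undirected graph and a partition given as a list of node sets (the function name and the toy graph are illustrative):

```python
def partition_modularity(edges, partition):
    """Q(G, S) = sum over groups s of [ L_s / L - (k_s / (2L))^2 ]."""
    L = len(edges)
    label = {node: idx for idx, group in enumerate(partition) for node in group}
    edges_within = [0] * len(partition)   # L_s for each group
    degree_total = [0] * len(partition)   # k_s for each group
    for u, v in edges:
        degree_total[label[u]] += 1
        degree_total[label[v]] += 1
        if label[u] == label[v]:
            edges_within[label[u]] += 1
    return sum(Ls / L - (ks / (2 * L)) ** 2
               for Ls, ks in zip(edges_within, degree_total))

# Two triangles joined by one bridge edge.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
q_two_triangles = partition_modularity(edges, [{0, 1, 2}, {3, 4, 5}])   # 5/14
q_single_group = partition_modularity(edges, [{0, 1, 2, 3, 4, 5}])     # 0
q_singletons = partition_modularity(edges, [{i} for i in range(6)])    # -34/196
```

Putting every node in one group gives zero, singletons give a negative score, and the natural two-triangle split scores highest of the three.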

70 Examples of Modularity Values (figure)

72 Using Modularity to Assess Clustering Quality
E.g.: the Girvan-Newman algorithm applied to Zachary's karate club network.
Q: Why not optimize Q(G, S) directly over possible partitions S?
A: How complex is the modularity landscape of the partition space?

73 Modularity Landscape
A greedy algorithm seeking the maximally modular partition runs in $O(N \log^2 N)$ time.
Modularity maximization forces small, weakly connected communities into larger ones.
The optimal partition is difficult to identify among a large number of close-to-optimal partitions, so the modularity landscape is complex.
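The greedy idea can be illustrated with a deliberately naive sketch: start from singleton communities and repeatedly apply the merge that increases modularity the most, stopping when no merge helps. This version recomputes Q from scratch for every candidate merge, so it does not attain the $O(N \log^2 N)$ bound quoted above; the function names and the early-stopping rule are assumptions made for illustration.

```python
def modularity(edges, groups):
    """Q = sum over groups of [ L_s / L - (k_s / (2L))^2 ]."""
    L = len(edges)
    label = {n: g for g, members in enumerate(groups) for n in members}
    q = 0.0
    for g in range(len(groups)):
        inside = sum(1 for u, v in edges if label[u] == g and label[v] == g)
        deg = sum(1 for u, v in edges for n in (u, v) if label[n] == g)
        q += inside / L - (deg / (2 * L)) ** 2
    return q

def greedy_merge(edges, nodes):
    """Agglomerate singletons, always taking the merge with the largest Q."""
    groups = [frozenset([n]) for n in nodes]
    best_q = modularity(edges, groups)
    while len(groups) > 1:
        candidates = []
        for a in range(len(groups)):
            for b in range(a + 1, len(groups)):
                merged = [g for k, g in enumerate(groups) if k not in (a, b)]
                merged.append(groups[a] | groups[b])
                candidates.append((modularity(edges, merged), merged))
        q, merged = max(candidates, key=lambda c: c[0])
        if q <= best_q:  # no merge improves modularity: stop
            break
        groups, best_q = merged, q
    return groups, best_q

edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
groups, best_q = greedy_merge(edges, range(6))
```

On the two-triangle toy graph the procedure stops at the two triangles with Q = 5/14; on larger graphs a greedy run can end at one of the many near-optimal partitions rather than the global optimum.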

74 Modularity Maximization Algorithms
If the goal is to find the partition with maximal modularity, this is an optimization problem.
Searching for local maxima in a complex search space (fitness landscape) is a task to which evolutionary algorithms (EAs) are well suited:
Shi, Wy, Zhong, A New Genetic Algorithm for Community Detection, Complex Sciences, Springer 2009
Gong, Ma, Zhang, Jiao, Community detection in networks by using multiobjective evolutionary algorithm with decomposition, Physica A: Statistical Mechanics and Its Applications 2012
Key concept: evolving a population of partitions under selection for higher modularity.
EAs do not rely on a similarity or dissimilarity matrix and can therefore scale to very large networks.
Multi-objective EAs address two conflicting objectives: negative ratio association (divides into small communities) and ratio cut (divides into large communities).
Various reviews exist.

75 Overlapping Communities
Vicsek and collaborators proposed the concept of overlapping communities.
The clique percolation algorithm views a community as a union of overlapping cliques, based on the knowledge that k-cliques naturally emerge in sufficiently dense networks.
Two k-cliques are adjacent if they share $k-1$ nodes.
A k-clique community is the largest connected subgraph obtained by the union of all adjacent k-cliques.
k-cliques that cannot be reached from a particular k-clique belong to other k-clique communities.
Polynomial running time when cliques are small.
The link clustering algorithm depends on the definition of link similarity.
The similarity of two links is determined by the neighborhoods of the nodes they connect.
Hierarchical clustering is then applied to group links into communities.
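The clique percolation definition translates almost literally into code. The sketch below is a brute-force illustration for small graphs only (it enumerates all k-subsets of nodes); the function name `k_clique_communities` and the example graph are assumptions, and published implementations are far more efficient.

```python
from itertools import combinations

def k_clique_communities(edges, k):
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    # Enumerate all k-cliques by brute force.
    cliques = [frozenset(c) for c in combinations(sorted(neighbors), k)
               if all(b in neighbors[a] for a, b in combinations(c, 2))]
    # Union-find over cliques: two k-cliques are adjacent if they share k-1 nodes.
    parent = list(range(len(cliques)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(range(len(cliques)), 2):
        if len(cliques[a] & cliques[b]) == k - 1:
            parent[find(a)] = find(b)
    # A community is the union of the nodes of all mutually reachable cliques.
    communities = {}
    for idx, clique in enumerate(cliques):
        communities.setdefault(find(idx), set()).update(clique)
    return sorted(sorted(c) for c in communities.values())

# Triangles {0,1,2} and {1,2,3} share an edge; triangle {3,4,5} shares only node 3.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (3, 4), (3, 5), (4, 5)]
communities = k_clique_communities(edges, 3)  # [[0, 1, 2, 3], [3, 4, 5]]
```

Node 3 belongs to both communities, which is exactly the overlap that partition-based methods cannot express.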

76 Testing Community Detection Algorithms
In the absence of ground truth, how is one to compare community detection algorithms?
Modularity-based comparison only goes so far.
Other measures, defined to compare the quality of clusters produced by clustering algorithms (e.g., information gain), are also used.
Still other measures compare the behavior of algorithms at various community sizes [Leskovec, Lang, Mahoney, Empirical Comparison of Algorithms for Network Community Detection, IW3C ].
Figure: Network community profile (NCP) characterizes the quality of network communities as a function of their size. The universal V shape of the NCP summarizes the large-scale community structure of real networks.

77 Advanced (more Theoretical) Topics/Treatments
Modularity maximization can be proved to be NP-hard.
Relaxing modularity maximization via the Lagrange technique provides one approach to it.
The relaxation uses an eigendecomposition and specific properties of the dominant eigenvector.
Stochastic optimization algorithms (such as EAs) are more generally applicable and scalable on real networks.
Graph partitioning can be recast as a graph cut problem.
The minimum-cut treatment opens the way to other algorithms that use the Laplacian to determine the connected components of the graph.

78 Modularity Revisited
Recall our definition of modularity, now writing $N_e$ for the number of edges (L above) and $d_i$ for the degree of vertex i ($k_i$ above):
$Q(G, S) = \frac{1}{2N_e} \sum_{s \in S} \sum_{i,j \in s} \left[ A_{ij} - \frac{d_i d_j}{2N_e} \right]$
Let $g_i$ be the group membership of vertex i and rewrite:
$Q(G, S) = \frac{1}{2N_e} \sum_{i,j \in V} \left[ A_{ij} - \frac{d_i d_j}{2N_e} \right] \mathbb{I}\{g_i = g_j\}$
Define for convenience the summands $B_{ij} = A_{ij} - \frac{d_i d_j}{2N_e}$.
Both marginal sums of $B_{ij}$ vanish, since e.g.:
$\sum_j B_{ij} = \sum_j A_{ij} - \frac{d_i}{2N_e} \sum_j d_j = d_i - \frac{d_i}{2N_e} 2N_e = 0$
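The vanishing marginal sums are easy to confirm numerically. A minimal sketch, assuming a simple undirected graph with nodes numbered 0..n-1 (the helper name `modularity_matrix` and the toy graph are illustrative):

```python
def modularity_matrix(edges, n):
    """B_ij = A_ij - d_i * d_j / (2 * N_e)."""
    degree = [0] * n
    A = [[0] * n for _ in range(n)]
    for u, v in edges:
        A[u][v] = A[v][u] = 1
        degree[u] += 1
        degree[v] += 1
    two_Ne = 2 * len(edges)
    return [[A[i][j] - degree[i] * degree[j] / two_Ne for j in range(n)]
            for i in range(n)]

edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
B = modularity_matrix(edges, 6)
row_sums = [sum(row) for row in B]
col_sums = [sum(B[i][j] for i in range(6)) for j in range(6)]
```

All row and column sums are zero up to floating-point rounding, so the all-ones vector is an eigenvector of B with eigenvalue 0.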

79 Graph Bisection
Consider (for simplicity) dividing the network into two groups.
Binary community membership variables per vertex: $s_i = +1$ if vertex i belongs to group 1, $s_i = -1$ if vertex i belongs to group 2.
Using the identity $\frac{1}{2}(s_i s_j + 1) = \mathbb{I}\{g_i = g_j\}$, the modularity is:
$Q(G, S) = \frac{1}{2N_e} \sum_{i,j \in V} \left[ A_{ij} - \frac{d_i d_j}{2N_e} \right] \mathbb{I}\{g_i = g_j\} = \frac{1}{4N_e} \sum_{i,j \in V} B_{ij} (s_i s_j + 1)$
Recall $\sum_j B_{ij} = 0$ to obtain the simpler expression:
$Q(G, S) = \frac{1}{4N_e} \sum_{i,j \in V} B_{ij} s_i s_j$

80 Optimizing Modularity
Let $B \in \mathbb{R}^{N_v \times N_v}$ be the modularity matrix with entries $B_{ij} = A_{ij} - \frac{d_i d_j}{2N_e}$.
Any partition S is defined by the vector $s = [s_1, \ldots, s_{N_v}]^T$.
Modularity is a quadratic form: $Q(G, S) = \frac{1}{4N_e} \sum_{i,j \in V} B_{ij} s_i s_j = \frac{1}{4N_e} s^T B s$
Modularity as a criterion for graph bisection yields the formulation: $\hat{s} = \arg\max_{s \in \{-1,+1\}^{N_v}} s^T B s$
Nasty binary constraints $s \in \{-1,+1\}^{N_v}$ (hypercube vertices).
Modularity optimization is NP-hard [Brandes et al 2006].
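For a tiny graph the binary problem can still be solved by exhaustive search over the hypercube vertices, which makes the quadratic form concrete. The sketch below is illustrative only (helper names and the toy graph are assumptions); the search visits all $2^{N_v}$ labelings, which is exactly why it does not scale.

```python
from itertools import product

def modularity_matrix(edges, n):
    degree = [0] * n
    A = [[0] * n for _ in range(n)]
    for u, v in edges:
        A[u][v] = A[v][u] = 1
        degree[u] += 1
        degree[v] += 1
    two_Ne = 2 * len(edges)
    return [[A[i][j] - degree[i] * degree[j] / two_Ne for j in range(n)]
            for i in range(n)]

def bisection_modularity(B, s, n_edges):
    """Q = s^T B s / (4 * N_e) for a label vector s with entries in {-1, +1}."""
    n = len(s)
    quad = sum(B[i][j] * s[i] * s[j] for i in range(n) for j in range(n))
    return quad / (4 * n_edges)

edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
B = modularity_matrix(edges, 6)
# Exhaustive search over all 2^6 label vectors.
best_q, best_s = max((bisection_modularity(B, s, len(edges)), s)
                     for s in product((-1, 1), repeat=6))
```

The maximizer separates the two triangles (up to a global sign flip of s) and attains Q = 5/14, matching the value obtained from the per-group formula.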

81 Relaxation
Relax the constraint $s \in \{-1,+1\}^{N_v}$ to $s \in \mathbb{R}^{N_v}$, $\|s\|_2 = 1$:
$\hat{s} = \arg\max_s s^T B s$ s.t. $s^T s = 1$
Associate a Lagrange multiplier $\lambda$ to the constraint $s^T s = 1$.
Optimality conditions yield: $\nabla_s \left[ s^T B s + \lambda (1 - s^T s) \right] = 0 \Rightarrow Bs = \lambda s$
Conclusion: s is an eigenvector of B with eigenvalue $\lambda$.
Q: Which eigenvector should we pick?
At the optimum, $Bs = \lambda s$, so the objective becomes: $s^T B s = \lambda s^T s = \lambda$
A: To maximize modularity, pick the dominant eigenvector of B (the one with the largest eigenvalue).

82 Spectral Modularity Maximization
Let $u_1$ be the dominant eigenvector of B, with i-th entry $[u_1]_i$.
Cannot just set $s = u_1$ because $u_1 \notin \{-1,+1\}^{N_v}$.
Best effort: maximize their similarity $s^T u_1$, which gives $s_i = \mathrm{sign}([u_1]_i)$, i.e., $+1$ when $[u_1]_i > 0$ and $-1$ otherwise.
Spectral modularity maximization algorithm:
S1: Compute the modularity matrix B with entries $B_{ij} = A_{ij} - \frac{d_i d_j}{2N_e}$
S2: Find a dominant eigenvector $u_1$ of B (e.g., power method)
S3: Cluster membership of vertex i is $s_i = \mathrm{sign}([u_1]_i)$
Multiple (> 2) communities through, e.g., repeated graph bisection.
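Steps S1 to S3 can be sketched in plain Python with the power method. One detail is an assumption added here, not taken from the slides: because B can have negative eigenvalues of large magnitude, the iteration is run on B + cI, with c equal to the largest absolute row sum of B, so that the power method converges to the eigenvector of the largest (most positive) eigenvalue. Helper names, the starting vector, and the iteration count are illustrative.

```python
def modularity_matrix(edges, n):
    degree = [0] * n
    A = [[0] * n for _ in range(n)]
    for u, v in edges:
        A[u][v] = A[v][u] = 1
        degree[u] += 1
        degree[v] += 1
    two_Ne = 2 * len(edges)
    return [[A[i][j] - degree[i] * degree[j] / two_Ne for j in range(n)]
            for i in range(n)]

def dominant_eigenvector(M, iterations=500):
    """Power method on M + shift*I; the shift makes all eigenvalues non-negative."""
    n = len(M)
    shift = max(sum(abs(v) for v in row) for row in M)
    x = [float(i + 1) for i in range(n)]  # arbitrary non-symmetric start
    for _ in range(iterations):
        y = [sum(m * xj for m, xj in zip(row, x)) + shift * xi
             for row, xi in zip(M, x)]
        norm = sum(v * v for v in y) ** 0.5
        x = [v / norm for v in y]
    return x

edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
B = modularity_matrix(edges, 6)                       # S1
u1 = dominant_eigenvector(B)                          # S2
s = [1 if value > 0 else -1 for value in u1]          # S3
group_a = frozenset(i for i, si in enumerate(s) if si == 1)
group_b = frozenset(i for i, si in enumerate(s) if si == -1)
```

The overall sign of an eigenvector is arbitrary, so which triangle receives the label +1 depends on the starting vector; only the grouping is meaningful.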

83 Example: Zachary's Karate Club
Spectral modularity maximization.
Shapes of vertices indicate community membership.
Dotted line indicates the partition found by the algorithm.
Vertex colors indicate the strength of their membership.

84 Graph Bisection
Consider an undirected graph G = (V, E).
Graph bisection problem: partition V into two groups.
Groups $V_1$ and $V_2 = V_1^C$ are non-overlapping.
Groups have given sizes, i.e., $|V_1| = N_1$ and $|V_2| = N_2$.
Q: What is a good criterion to partition the graph?
A: We have already seen modularity. Let's see a different one.

85 Graph Cut
Desiderata: community members should be well-connected among themselves and relatively well-separated from the rest of the nodes.
Def: A cut C is the number of edges between groups 1 and 2: $C = \mathrm{cut}(V_1, V_2) = \sum_{i \in V_1, j \in V_2} A_{ij}$
Natural criterion: minimize the cut, i.e., the edges across groups 1 and 2.

86 From Graph Cuts
Binary community membership variable per vertex: $s_i = +1$ if vertex i belongs to group 1, $s_i = -1$ otherwise.
Let $g_i$ be the group membership of vertex i, s.t.: $\mathbb{I}\{g_i \neq g_j\} = \frac{1}{2}(1 - s_i s_j)$, which is 1 if i and j are in different groups and 0 otherwise.
Cut expressible in terms of the variables $s_i$ as (the sum over ordered pairs counts each crossing edge twice, hence the factor 1/4):
$C = \sum_{i \in V_1, j \in V_2} A_{ij} = \frac{1}{4} \sum_{i,j \in V} A_{ij} (1 - s_i s_j)$

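87 to the Graph Laplacian Matrix
First summand in $C = \frac{1}{4} \sum_{i,j} A_{ij} (1 - s_i s_j)$ is:
$\sum_{i,j \in V} A_{ij} = \sum_{i \in V} d_i = \sum_{i \in V} d_i s_i^2 = \sum_{i,j \in V} d_i s_i s_j \mathbb{I}\{i = j\}$
Used $s_i^2 = 1$ since $s_i \in \{-1,+1\}$. The cut becomes:
$C = \frac{1}{4} \sum_{i,j \in V} (d_i \mathbb{I}\{i = j\} - A_{ij}) s_i s_j = \frac{1}{4} \sum_{i,j \in V} L_{ij} s_i s_j$
Cut in terms of $L_{ij}$, the entries of the graph Laplacian $L = D - A$, i.e.: $C(s) = \frac{1}{4} s^T L s$, $s = [s_1, \ldots, s_{N_v}]^T$
The factor 1/4 arises because the double sum over ordered pairs counts each crossing edge twice and each such pair contributes $1 - s_i s_j = 2$.
Maximize modularity $Q(s) \propto s^T B s$ vs. minimize cut $C(s) \propto s^T L s$.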

88 Laplacian Matrix Properties Revisited
Smoothness: for any vector $x \in \mathbb{R}^{N_v}$ of vertex values, one has $x^T L x = \sum_{i,j \in V} L_{ij} x_i x_j = \sum_{(i,j) \in E} (x_i - x_j)^2$, which can be minimized to enforce smoothness of functions on G.
Positive semi-definiteness: follows since $x^T L x \geq 0$ for all $x \in \mathbb{R}^{N_v}$.
Spectrum: all eigenvalues of L are real and non-negative; the eigenvectors form an orthonormal basis of $\mathbb{R}^{N_v}$.
Rank deficiency: since $L\mathbf{1} = 0$, L is rank deficient.
Spectrum and connectivity: the smallest eigenvalue $\lambda_1$ of L is 0.
If the second-smallest eigenvalue $\lambda_2 \neq 0$, then G is connected.
If L has n zero eigenvalues, G has n connected components.
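Several of these properties can be checked without an eigensolver. The sketch below, with illustrative helper names and a hypothetical two-component graph, verifies the smoothness identity on an integer vector and shows that the indicator vector of each connected component lies in the null space of L, which is where the "n zero eigenvalues, n components" statement comes from.

```python
def laplacian(edges, n):
    """L = D - A for a simple undirected graph on nodes 0..n-1."""
    L = [[0] * n for _ in range(n)]
    for u, v in edges:
        L[u][u] += 1
        L[v][v] += 1
        L[u][v] -= 1
        L[v][u] -= 1
    return L

def matvec(M, x):
    return [sum(m * xj for m, xj in zip(row, x)) for row in M]

def connected_components(edges, n):
    neighbors = [[] for _ in range(n)]
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    seen, result = set(), []
    for start in range(n):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            if node not in comp:
                comp.add(node)
                stack.extend(neighbors[node])
        seen |= comp
        result.append(sorted(comp))
    return result

# A triangle (0,1,2) plus a separate edge (3,4): two connected components.
edges = [(0, 1), (0, 2), (1, 2), (3, 4)]
L = laplacian(edges, 5)
x = [3, -1, 4, 1, -5]
quad_form = sum(xi * yi for xi, yi in zip(x, matvec(L, x)))
edge_sum = sum((x[u] - x[v]) ** 2 for u, v in edges)

components = connected_components(edges, 5)
null_vectors = [[1 if i in comp else 0 for i in range(5)] for comp in components]
images = [matvec(L, v) for v in null_vectors]  # each should be the zero vector
```

Both `quad_form` and `edge_sum` evaluate to 78 for this x, and each component indicator is mapped to zero by L.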

89 Graph Cut Minimization
Since $|V_1| = N_1$ and $|V_2| = N_2 = N_v - N_1$, we have the constraint:
$\sum_{i \in V} s_i = \sum_{i \in V_1} (+1) + \sum_{i \in V_2} (-1) = N_1 - N_2 \Rightarrow \mathbf{1}^T s = N_1 - N_2$
Minimum-cut criterion for graph bisection yields the formulation:
$\hat{s} = \arg\min_{s \in \{-1,+1\}^{N_v}} s^T L s$ s.t. $\mathbf{1}^T s = N_1 - N_2$
Again, the binary constraints $s \in \{-1,+1\}^{N_v}$ render cut minimization hard.
Relax the binary constraints, as with modularity maximization.

90 Further Intuition
Since $s^T L s = \sum_{(i,j) \in E} (s_i - s_j)^2$, the minimum-cut formulation is:
$\hat{s} = \arg\min_{s \in \{-1,+1\}^{N_v}} \sum_{(i,j) \in E} (s_i - s_j)^2$ s.t. $\mathbf{1}^T s = N_1 - N_2$
Q: Does this equivalent cost function make sense? A: Absolutely!
Edges joining vertices in the same group do not add to the sum.
Edges joining vertices in different groups add 4 to the sum.
Minimize cut: assign values $s_i$ to nodes i such that few edges cross 0, i.e., few edges join nodes whose values have opposite signs.
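The "add 4 to the sum" observation means the edge sum equals four times the number of crossing edges for every binary labeling. A small exhaustive check, followed by a brute-force balanced minimum cut ($N_1 = N_2 = 3$), is sketched below on an illustrative two-triangle graph; the function names are assumptions.

```python
from itertools import product

def edge_sum(edges, s):
    """Sum over edges of (s_i - s_j)^2, which equals s^T L s."""
    return sum((s[u] - s[v]) ** 2 for u, v in edges)

def cut_size(edges, s):
    """Number of edges whose endpoints carry different labels."""
    return sum(1 for u, v in edges if s[u] != s[v])

edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
labelings = list(product((-1, 1), repeat=6))

# Every crossing edge contributes exactly 4, every other edge 0.
identity_holds = all(edge_sum(edges, s) == 4 * cut_size(edges, s)
                     for s in labelings)

# Balanced bisection: enforce 1^T s = N_1 - N_2 = 0, then minimize the sum.
balanced = [s for s in labelings if sum(s) == 0]
best = min(balanced, key=lambda s: edge_sum(edges, s))
```

Without the size constraint the minimum is the trivial labeling that puts all nodes in one group (cut 0), which is why the constraint on $\mathbf{1}^T s$ is part of the formulation.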

92 Minimum-Cut Relaxation
Relax the constraint $s \in \{-1,+1\}^{N_v}$ to $s \in \mathbb{R}^{N_v}$, $\|s\|_2 = 1$:
$\hat{s} = \arg\min_s s^T L s$ s.t. $\mathbf{1}^T s = N_1 - N_2$, $s^T s = 1$
Straightforward to solve using Lagrange multipliers.
Characterization of the solution $\hat{s}$ [Fiedler 1973]: $\hat{s} = v_2 + \frac{N_1 - N_2}{N_v} \mathbf{1}$
The eigenvector $v_2$ of L associated with the second-smallest eigenvalue $\lambda_2$ satisfies $\mathbf{1}^T v_2 = 0$.
Minimum cut: $C(\hat{s}) \propto \hat{s}^T L \hat{s} = v_2^T L v_2 \propto \lambda_2$
If the graph G is disconnected, then we know $\lambda_2 = 0 = C(\hat{s})$.
If G is amenable to bisection, the cut is small and so is $\lambda_2$.

93 Theoretical Guarantee
Consider a partition of G into $V_1$ and $V_2$, where $|V_1| \leq |V_2|$.
If G is connected, then the Cheeger inequality asserts: $\frac{\alpha^2}{2 d_{\max}} \leq \lambda_2 \leq 2\alpha$, where $\alpha = \frac{C}{|V_1|}$ (minimized over all such partitions) and $d_{\max}$ is the maximum node degree.
Certifies that $\lambda_2$ gives a useful bound.
F. Chung, Four proofs for the Cheeger inequality and graph partition algorithms, Proc. of ICCM, 2007

94 Spectral Graph Bisection
Q: How to obtain the binary cluster labels $s \in \{-1,+1\}^{N_v}$ from $\hat{s} \in \mathbb{R}^{N_v}$?
Again, maximize the similarity measure $s^T \hat{s}$, where $s_i = \mathrm{sign}([v_2]_i)$ ($+1$ if $[v_2]_i > 0$ and $-1$ otherwise).
Spectral graph bisection algorithm:
S1: Compute the Laplacian matrix L with entries $L_{ij} = D_{ij} - A_{ij}$
S2: Find the eigenvector $v_2$ of L associated with the second-smallest eigenvalue
S3: Cluster membership of vertex i is $s_i = \mathrm{sign}([v_2]_i)$
Complexity: an efficient Lanczos algorithm variant runs in $O\left(\frac{N_e}{\lambda_3 - \lambda_2}\right)$.
Nomenclature: $v_2$ is known as the Fiedler vector; the eigenvalue $\lambda_2$ is the Fiedler value, or algebraic connectivity of G.
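A dependency-free sketch of S1 to S3 follows. It does not use the Lanczos variant mentioned above; instead it swaps in plain power iteration on the shifted matrix cI - L with c = 2 d_max (an upper bound on the largest Laplacian eigenvalue), projecting out the all-ones vector at every step so that the iteration converges to $v_2$ for a connected graph. The function name, starting vector, and iteration count are illustrative assumptions.

```python
def fiedler_vector(edges, n, iterations=500):
    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    shift = 2 * max(degree)  # c >= largest eigenvalue of L
    x = [float(i) for i in range(n)]
    for _ in range(iterations):
        mean = sum(x) / n
        x = [v - mean for v in x]  # remove the component along the all-ones vector
        # y = (c*I - L) x = c*x - D x + A x, computed edge by edge
        y = [(shift - degree[i]) * x[i] for i in range(n)]
        for u, v in edges:
            y[u] += x[v]
            y[v] += x[u]
        norm = sum(v * v for v in y) ** 0.5
        x = [v / norm for v in y]
    return x

edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
v2 = fiedler_vector(edges, 6)
s = [1 if value > 0 else -1 for value in v2]
group_a = frozenset(i for i, si in enumerate(s) if si == 1)
group_b = frozenset(i for i, si in enumerate(s) if si == -1)
```

For larger or sparse graphs a library eigensolver would be the usual choice; the convergence rate of this simple iteration degrades as $\lambda_3$ approaches $\lambda_2$, in line with the complexity remark above.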

95 Spectral Gap in Fiedler Vector Entries
Suppose G is disconnected and has two connected components.
L is block diagonal, and the eigenvectors of the two smallest eigenvalues indicate the groups, i.e., $v_1 = [1, 1, \ldots, 1, 0, \ldots, 0]^T$ and $v_2 = [0, 0, \ldots, 0, 1, \ldots, 1]^T$.
If G is connected but amenable to bisection, $v_1 = \mathbf{1}$ and $\lambda_2 \approx 0$.
Also, $\mathbf{1}^T v_2 = \sum_i [v_2]_i = 0 \Rightarrow$ positive and negative entries in $v_2$.

96 Unknown Community Sizes
Consider the graph bisection problem with unknown group sizes.
Minimizing the graph cut may no longer be meaningful!
Figure: Cost $C = \sum_{i \in V_1, j \in V_2} A_{ij}$ is agnostic to the groups' internal structures.
A better criterion is the ratio cut R, defined as $R = \frac{C}{|V_1|} + \frac{C}{|V_2|}$.
Balanced partitions: a small community is penalized by the cost.
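The difference between the two criteria shows up on a small hypothetical graph: two 4-cliques joined by two edges, plus one pendant node. The brute-force sketch below (illustrative names, exhaustive search, small graphs only) finds the best bisection under each score and reports its smaller side.

```python
from itertools import combinations

def cut_size(edges, side):
    return sum(1 for u, v in edges if (u in side) != (v in side))

def best_bisection(edges, n, score):
    """Exhaustive search; returns (score, smaller side) of the best bisection."""
    best = None
    for size in range(1, n // 2 + 1):
        for nodes in combinations(range(n), size):
            side = frozenset(nodes)
            value = score(cut_size(edges, side), len(side), n - len(side))
            if best is None or value < best[0]:
                best = (value, side)
    return best

clique_a = list(combinations(range(0, 4), 2))   # 4-clique on nodes 0..3
clique_b = list(combinations(range(4, 8), 2))   # 4-clique on nodes 4..7
edges = clique_a + clique_b + [(3, 4), (2, 5), (7, 8)]  # two bridges, one pendant

min_cut = best_bisection(edges, 9, lambda c, n1, n2: c)
ratio_cut = best_bisection(edges, 9, lambda c, n1, n2: c / n1 + c / n2)
```

Plain cut minimization isolates the pendant node 8 (C = 1), whereas the ratio cut prefers separating the two cliques (C = 2, R = 2/4 + 2/5 = 0.9), since isolating node 8 costs R = 1 + 1/8.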

97 Ratio-Cut Minimization
Fix a bisection S of G into groups $V_1$ and $V_2$.
Define $f : f(S) = [f_1, \ldots, f_{N_v}]^T \in \mathbb{R}^{N_v}$ with entries:
$f_i = \sqrt{|V_2| / |V_1|}$ if vertex i belongs to $V_1$, and $f_i = -\sqrt{|V_1| / |V_2|}$ if vertex i belongs to $V_2$.
One can establish the following properties:
P1: $f^T L f = N_v R(S)$
P2: $\sum_i f_i = 0$, i.e., $\mathbf{1}^T f = 0$
P3: $\|f\|^2 = N_v$
From P1-P3 it follows that ratio-cut minimization is equivalent to:
$\min_f f^T L f$ s.t. $\mathbf{1}^T f = 0$ and $f^T f = N_v$
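Properties P1 to P3 can be confirmed numerically for any particular bisection. The sketch below uses an unbalanced split of an illustrative two-triangle graph; the helper name `ratio_cut_vector` is an assumption, and $f^T L f$ is evaluated through the edge-sum identity $\sum_{(i,j) \in E} (f_i - f_j)^2$.

```python
import math

def ratio_cut_vector(n, group1):
    """f_i = sqrt(N2/N1) on V_1 and -sqrt(N1/N2) on V_2."""
    n1 = len(group1)
    n2 = n - n1
    return [math.sqrt(n2 / n1) if i in group1 else -math.sqrt(n1 / n2)
            for i in range(n)]

edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
n = 6
group1 = {0, 1}                      # deliberately unbalanced: N1 = 2, N2 = 4
f = ratio_cut_vector(n, group1)

cut = sum(1 for u, v in edges if (u in group1) != (v in group1))
ratio = cut / len(group1) + cut / (n - len(group1))      # R(S)

f_L_f = sum((f[u] - f[v]) ** 2 for u, v in edges)        # P1: equals N_v * R(S)
f_sum = sum(f)                                           # P2: equals 0
f_norm_sq = sum(value * value for value in f)            # P3: equals N_v
```

Here the cut is 2, R(S) = 1.5, and $f^T L f = 9 = 6 \times 1.5$, with the other two properties holding up to floating-point rounding.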

98 Ratio Cut and Spectral Graph Bisection
Ratio-cut minimization is also NP-hard. Relax to obtain:
$\hat{s} = \arg\min_{s \in \mathbb{R}^{N_v}} s^T L s$ s.t. $\mathbf{1}^T s = 0$ and $s^T s = N_v$
The partition $\hat{S}$ is also given by the spectral graph bisection algorithm:
S1: Compute the Laplacian matrix L with entries $L_{ij} = D_{ij} - A_{ij}$
S2: Find the eigenvector $v_2$ of L associated with the second-smallest eigenvalue
S3: Cluster membership of vertex i is $s_i = \mathrm{sign}([v_2]_i)$
An alternative criterion is the normalized cut NC, defined as: $NC = \frac{C}{\mathrm{vol}(V_1)} + \frac{C}{\mathrm{vol}(V_2)}$, where $\mathrm{vol}(V_i) = \sum_{v \in V_i} d_v$, $i \in \{1, 2\}$.
Corresponds to using the normalized Laplacian $D^{-1} L$.


More information

Machine Learning (BSMC-GA 4439) Wenke Liu

Machine Learning (BSMC-GA 4439) Wenke Liu Machine Learning (BSMC-GA 4439) Wenke Liu 01-31-017 Outline Background Defining proximity Clustering methods Determining number of clusters Comparing two solutions Cluster analysis as unsupervised Learning

More information

A Course in Machine Learning

A Course in Machine Learning A Course in Machine Learning Hal Daumé III 13 UNSUPERVISED LEARNING If you have access to labeled training data, you know what to do. This is the supervised setting, in which you have a teacher telling

More information

Small Survey on Perfect Graphs

Small Survey on Perfect Graphs Small Survey on Perfect Graphs Michele Alberti ENS Lyon December 8, 2010 Abstract This is a small survey on the exciting world of Perfect Graphs. We will see when a graph is perfect and which are families

More information

BBS654 Data Mining. Pinar Duygulu. Slides are adapted from Nazli Ikizler

BBS654 Data Mining. Pinar Duygulu. Slides are adapted from Nazli Ikizler BBS654 Data Mining Pinar Duygulu Slides are adapted from Nazli Ikizler 1 Classification Classification systems: Supervised learning Make a rational prediction given evidence There are several methods for

More information

Graph Theory Review. January 30, Network Science Analytics Graph Theory Review 1

Graph Theory Review. January 30, Network Science Analytics Graph Theory Review 1 Graph Theory Review Gonzalo Mateos Dept. of ECE and Goergen Institute for Data Science University of Rochester gmateosb@ece.rochester.edu http://www.ece.rochester.edu/~gmateosb/ January 30, 2018 Network

More information

Mathematical and Algorithmic Foundations Linear Programming and Matchings

Mathematical and Algorithmic Foundations Linear Programming and Matchings Adavnced Algorithms Lectures Mathematical and Algorithmic Foundations Linear Programming and Matchings Paul G. Spirakis Department of Computer Science University of Patras and Liverpool Paul G. Spirakis

More information

Statistical Physics of Community Detection

Statistical Physics of Community Detection Statistical Physics of Community Detection Keegan Go (keegango), Kenji Hata (khata) December 8, 2015 1 Introduction Community detection is a key problem in network science. Identifying communities, defined

More information

Cluster Analysis. Ying Shen, SSE, Tongji University

Cluster Analysis. Ying Shen, SSE, Tongji University Cluster Analysis Ying Shen, SSE, Tongji University Cluster analysis Cluster analysis groups data objects based only on the attributes in the data. The main objective is that The objects within a group

More information

CSE 158 Lecture 13. Web Mining and Recommender Systems. Triadic closure; strong & weak ties

CSE 158 Lecture 13. Web Mining and Recommender Systems. Triadic closure; strong & weak ties CSE 158 Lecture 13 Web Mining and Recommender Systems Triadic closure; strong & weak ties Monday Random models of networks: Erdos Renyi random graphs (picture from Wikipedia http://en.wikipedia.org/wiki/erd%c5%91s%e2%80%93r%c3%a9nyi_model)

More information

Social Networks, Social Interaction, and Outcomes

Social Networks, Social Interaction, and Outcomes Social Networks, Social Interaction, and Outcomes Social Networks and Social Interaction Not all links are created equal: Some links are stronger than others Some links are more useful than others Links

More information

SYDE Winter 2011 Introduction to Pattern Recognition. Clustering

SYDE Winter 2011 Introduction to Pattern Recognition. Clustering SYDE 372 - Winter 2011 Introduction to Pattern Recognition Clustering Alexander Wong Department of Systems Design Engineering University of Waterloo Outline 1 2 3 4 5 All the approaches we have learned

More information

CSE 255 Lecture 13. Data Mining and Predictive Analytics. Triadic closure; strong & weak ties

CSE 255 Lecture 13. Data Mining and Predictive Analytics. Triadic closure; strong & weak ties CSE 255 Lecture 13 Data Mining and Predictive Analytics Triadic closure; strong & weak ties Monday Random models of networks: Erdos Renyi random graphs (picture from Wikipedia http://en.wikipedia.org/wiki/erd%c5%91s%e2%80%93r%c3%a9nyi_model)

More information

Week 7 Picturing Network. Vahe and Bethany

Week 7 Picturing Network. Vahe and Bethany Week 7 Picturing Network Vahe and Bethany Freeman (2005) - Graphic Techniques for Exploring Social Network Data The two main goals of analyzing social network data are identification of cohesive groups

More information

Unsupervised Learning and Clustering

Unsupervised Learning and Clustering Unsupervised Learning and Clustering Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Spring 2008 CS 551, Spring 2008 c 2008, Selim Aksoy (Bilkent University)

More information

Basics of Network Analysis

Basics of Network Analysis Basics of Network Analysis Hiroki Sayama sayama@binghamton.edu Graph = Network G(V, E): graph (network) V: vertices (nodes), E: edges (links) 1 Nodes = 1, 2, 3, 4, 5 2 3 Links = 12, 13, 15, 23,

More information

Clustering. SC4/SM4 Data Mining and Machine Learning, Hilary Term 2017 Dino Sejdinovic

Clustering. SC4/SM4 Data Mining and Machine Learning, Hilary Term 2017 Dino Sejdinovic Clustering SC4/SM4 Data Mining and Machine Learning, Hilary Term 2017 Dino Sejdinovic Clustering is one of the fundamental and ubiquitous tasks in exploratory data analysis a first intuition about the

More information

Visual Representations for Machine Learning

Visual Representations for Machine Learning Visual Representations for Machine Learning Spectral Clustering and Channel Representations Lecture 1 Spectral Clustering: introduction and confusion Michael Felsberg Klas Nordberg The Spectral Clustering

More information

Jure Leskovec, Cornell/Stanford University. Joint work with Kevin Lang, Anirban Dasgupta and Michael Mahoney, Yahoo! Research

Jure Leskovec, Cornell/Stanford University. Joint work with Kevin Lang, Anirban Dasgupta and Michael Mahoney, Yahoo! Research Jure Leskovec, Cornell/Stanford University Joint work with Kevin Lang, Anirban Dasgupta and Michael Mahoney, Yahoo! Research Network: an interaction graph: Nodes represent entities Edges represent interaction

More information

Social and Information Network Analysis Positive and Negative Relationships

Social and Information Network Analysis Positive and Negative Relationships Social and Information Network Analysis Positive and Negative Relationships Cornelia Caragea Department of Computer Science and Engineering University of North Texas June 14, 2016 To Recap Triadic Closure

More information

Gene Clustering & Classification

Gene Clustering & Classification BINF, Introduction to Computational Biology Gene Clustering & Classification Young-Rae Cho Associate Professor Department of Computer Science Baylor University Overview Introduction to Gene Clustering

More information

Constructing a G(N, p) Network

Constructing a G(N, p) Network Random Graph Theory Dr. Natarajan Meghanathan Associate Professor Department of Computer Science Jackson State University, Jackson, MS E-mail: natarajan.meghanathan@jsums.edu Introduction At first inspection,

More information

Types of general clustering methods. Clustering Algorithms for general similarity measures. Similarity between clusters

Types of general clustering methods. Clustering Algorithms for general similarity measures. Similarity between clusters Types of general clustering methods Clustering Algorithms for general similarity measures agglomerative versus divisive algorithms agglomerative = bottom-up build up clusters from single objects divisive

More information

Alessandro Del Ponte, Weijia Ran PAD 637 Week 3 Summary January 31, Wasserman and Faust, Chapter 3: Notation for Social Network Data

Alessandro Del Ponte, Weijia Ran PAD 637 Week 3 Summary January 31, Wasserman and Faust, Chapter 3: Notation for Social Network Data Wasserman and Faust, Chapter 3: Notation for Social Network Data Three different network notational schemes Graph theoretic: the most useful for centrality and prestige methods, cohesive subgroup ideas,

More information

Network Clustering. Balabhaskar Balasundaram, Sergiy Butenko

Network Clustering. Balabhaskar Balasundaram, Sergiy Butenko Network Clustering Balabhaskar Balasundaram, Sergiy Butenko Department of Industrial & Systems Engineering Texas A&M University College Station, Texas 77843, USA. Introduction Clustering can be loosely

More information

This is not the chapter you re looking for [handwave]

This is not the chapter you re looking for [handwave] Chapter 4 This is not the chapter you re looking for [handwave] The plan of this course was to have 4 parts, and the fourth part would be half on analyzing random graphs, and looking at different models,

More information

Notes for Lecture 24

Notes for Lecture 24 U.C. Berkeley CS170: Intro to CS Theory Handout N24 Professor Luca Trevisan December 4, 2001 Notes for Lecture 24 1 Some NP-complete Numerical Problems 1.1 Subset Sum The Subset Sum problem is defined

More information

CSE 190 Lecture 16. Data Mining and Predictive Analytics. Small-world phenomena

CSE 190 Lecture 16. Data Mining and Predictive Analytics. Small-world phenomena CSE 190 Lecture 16 Data Mining and Predictive Analytics Small-world phenomena Another famous study Stanley Milgram wanted to test the (already popular) hypothesis that people in social networks are separated

More information

Lesson 4. Random graphs. Sergio Barbarossa. UPC - Barcelona - July 2008

Lesson 4. Random graphs. Sergio Barbarossa. UPC - Barcelona - July 2008 Lesson 4 Random graphs Sergio Barbarossa Graph models 1. Uncorrelated random graph (Erdős, Rényi) N nodes are connected through n edges which are chosen randomly from the possible configurations 2. Binomial

More information

Finding and Visualizing Graph Clusters Using PageRank Optimization. Fan Chung and Alexander Tsiatas, UCSD WAW 2010

Finding and Visualizing Graph Clusters Using PageRank Optimization. Fan Chung and Alexander Tsiatas, UCSD WAW 2010 Finding and Visualizing Graph Clusters Using PageRank Optimization Fan Chung and Alexander Tsiatas, UCSD WAW 2010 What is graph clustering? The division of a graph into several partitions. Clusters should

More information

V10 Metabolic networks - Graph connectivity

V10 Metabolic networks - Graph connectivity V10 Metabolic networks - Graph connectivity Graph connectivity is related to analyzing biological networks for - finding cliques - edge betweenness - modular decomposition that have been or will be covered

More information

CS 1675 Introduction to Machine Learning Lecture 18. Clustering. Clustering. Groups together similar instances in the data sample

CS 1675 Introduction to Machine Learning Lecture 18. Clustering. Clustering. Groups together similar instances in the data sample CS 1675 Introduction to Machine Learning Lecture 18 Clustering Milos Hauskrecht milos@cs.pitt.edu 539 Sennott Square Clustering Groups together similar instances in the data sample Basic clustering problem:

More information

MSA220 - Statistical Learning for Big Data

MSA220 - Statistical Learning for Big Data MSA220 - Statistical Learning for Big Data Lecture 13 Rebecka Jörnsten Mathematical Sciences University of Gothenburg and Chalmers University of Technology Clustering Explorative analysis - finding groups

More information

TELCOM2125: Network Science and Analysis

TELCOM2125: Network Science and Analysis School of Information Sciences University of Pittsburgh TELCOM2125: Network Science and Analysis Konstantinos Pelechrinis Spring 2015 Figures are taken from: M.E.J. Newman, Networks: An Introduction 2

More information

Scalable Clustering of Signed Networks Using Balance Normalized Cut

Scalable Clustering of Signed Networks Using Balance Normalized Cut Scalable Clustering of Signed Networks Using Balance Normalized Cut Kai-Yang Chiang,, Inderjit S. Dhillon The 21st ACM International Conference on Information and Knowledge Management (CIKM 2012) Oct.

More information

On the Approximability of Modularity Clustering

On the Approximability of Modularity Clustering On the Approximability of Modularity Clustering Newman s Community Finding Approach for Social Nets Bhaskar DasGupta Department of Computer Science University of Illinois at Chicago Chicago, IL 60607,

More information

1 Large-scale structure in networks

1 Large-scale structure in networks 1 Large-scale structure in networks Most of the network measures we have considered so far can tell us a great deal about the shape of a network. However, these measures generally describe patterns as

More information

Example for calculation of clustering coefficient Node N 1 has 8 neighbors (red arrows) There are 12 connectivities among neighbors (blue arrows)

Example for calculation of clustering coefficient Node N 1 has 8 neighbors (red arrows) There are 12 connectivities among neighbors (blue arrows) Example for calculation of clustering coefficient Node N 1 has 8 neighbors (red arrows) There are 12 connectivities among neighbors (blue arrows) Average clustering coefficient of a graph Overall measure

More information

Machine Learning (BSMC-GA 4439) Wenke Liu

Machine Learning (BSMC-GA 4439) Wenke Liu Machine Learning (BSMC-GA 4439) Wenke Liu 01-25-2018 Outline Background Defining proximity Clustering methods Determining number of clusters Other approaches Cluster analysis as unsupervised Learning Unsupervised

More information

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

CS246: Mining Massive Datasets Jure Leskovec, Stanford University CS246: Mining Massive Datasets Jure Leskovec, Stanford University http://cs246.stanford.edu SPAM FARMING 2/11/2013 Jure Leskovec, Stanford C246: Mining Massive Datasets 2 2/11/2013 Jure Leskovec, Stanford

More information

V2: Measures and Metrics (II)

V2: Measures and Metrics (II) - Betweenness Centrality V2: Measures and Metrics (II) - Groups of Vertices - Transitivity - Reciprocity - Signed Edges and Structural Balance - Similarity - Homophily and Assortative Mixing 1 Betweenness

More information

Paths, Circuits, and Connected Graphs

Paths, Circuits, and Connected Graphs Paths, Circuits, and Connected Graphs Paths and Circuits Definition: Let G = (V, E) be an undirected graph, vertices u, v V A path of length n from u to v is a sequence of edges e i = {u i 1, u i} E for

More information

Graph Theory. Graph Theory. COURSE: Introduction to Biological Networks. Euler s Solution LECTURE 1: INTRODUCTION TO NETWORKS.

Graph Theory. Graph Theory. COURSE: Introduction to Biological Networks. Euler s Solution LECTURE 1: INTRODUCTION TO NETWORKS. Graph Theory COURSE: Introduction to Biological Networks LECTURE 1: INTRODUCTION TO NETWORKS Arun Krishnan Koenigsberg, Russia Is it possible to walk with a route that crosses each bridge exactly once,

More information

Community detection. Leonid E. Zhukov

Community detection. Leonid E. Zhukov Community detection Leonid E. Zhukov School of Data Analysis and Artificial Intelligence Department of Computer Science National Research University Higher School of Economics Network Science Leonid E.

More information

CS-E5740. Complex Networks. Network analysis: key measures and characteristics

CS-E5740. Complex Networks. Network analysis: key measures and characteristics CS-E5740 Complex Networks Network analysis: key measures and characteristics Course outline 1. Introduction (motivation, definitions, etc. ) 2. Static network models: random and small-world networks 3.

More information