Basics of Network Analysis


Basics of Network Analysis Hiroki Sayama sayama@binghamton.edu

Graph = Network
G(V, E): graph (network)
V: vertices (nodes), E: edges (links)
Example: Nodes = {1, 2, 3, 4, 5}; Links = {1<->2, 1<->3, 1<->5, 2<->3, 2<->4, 2<->5, 3<->4, 3<->5, 4<->5}
(Nodes may have states; links may have directions and weights)

Representation of a network
Adjacency matrix: a matrix with rows and columns labeled by nodes, where element $a_{ij}$ is the number of links going from node i to node j (the matrix is symmetric for an undirected graph)
Adjacency list: a list of links whose element i -> j denotes a link going from node i to node j (also written as $i \to \{j_1, j_2, j_3, \ldots\}$)
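As a minimal sketch of both representations, the snippet below encodes the five-node example network from the "Graph = Network" slide. The variable names (`nodes`, `links`, `adj_matrix`, `adj_list`) are illustrative choices, not part of the original slides.

```python
# Example graph from the "Graph = Network" slide (undirected, unweighted).
nodes = [1, 2, 3, 4, 5]
links = [(1, 2), (1, 3), (1, 5), (2, 3), (2, 4),
         (2, 5), (3, 4), (3, 5), (4, 5)]

# Adjacency matrix: entry [i][j] counts the links from node i to node j.
index = {node: pos for pos, node in enumerate(nodes)}
adj_matrix = [[0] * len(nodes) for _ in nodes]
for i, j in links:
    adj_matrix[index[i]][index[j]] += 1
    adj_matrix[index[j]][index[i]] += 1  # undirected: mirror the entry

# Adjacency list: node i -> sorted list of its neighbors.
adj_list = {node: [] for node in nodes}
for i, j in links:
    adj_list[i].append(j)
    adj_list[j].append(i)
adj_list = {node: sorted(nbrs) for node, nbrs in adj_list.items()}

for row in adj_matrix:
    print(row)
print(adj_list)
```

The matrix costs memory proportional to the square of the number of nodes, while the list grows with the number of links, which is why the list form is usually preferred for sparse networks.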

Exercise
Represent the following network as:
An adjacency matrix
An adjacency list
[Figure: network with nodes 1 to 5]

Degree of a node
The degree of node u, deg(u), is the number of links connected to u
[Figure: example with $\deg(u_1) = 4$ and $\deg(u_2) = 2$]

Connected graph
A graph in which there is a path between any pair of nodes

Connected components
[Figure: a network with two connected components labeled; number of connected components = 2]
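Counting connected components can be sketched with a breadth-first search that is restarted from every node not yet reached. The graph below is a hypothetical two-component example (the slide's figure is not available), and the function name is an illustrative choice.

```python
from collections import deque

def connected_components(adj):
    """Return a list of components, each a set of nodes, found by BFS."""
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        component = {start}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in component:
                    component.add(v)
                    queue.append(v)
        seen |= component
        components.append(component)
    return components

# Hypothetical undirected graph with two components: {a, b, c} and {d, e}.
adj = {"a": ["b", "c"], "b": ["a"], "c": ["a"], "d": ["e"], "e": ["d"]}
components = connected_components(adj)
is_connected = len(components) == 1
print(len(components), is_connected)
```

A graph is connected exactly when this procedure returns a single component.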

Complete graph
A graph in which every pair of nodes is connected (often written as $K_1, K_2, \ldots$)

Regular graph
A graph in which all nodes have the same degree (often called a k-regular graph, with degree k)

Bipartite graph
A graph whose nodes can be divided into two subsets so that no link connects nodes within the same subset
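One way to test this definition is to try to split the nodes into two subsets with a breadth-first two-coloring: if a link ever joins two nodes of the same color, no valid split exists. This is a sketch under the assumption of an undirected graph stored as an adjacency list; the function name and the two test graphs are hypothetical.

```python
from collections import deque

def bipartition(adj):
    """Return (side_a, side_b) if the graph is bipartite, else None."""
    color = {}
    for start in adj:
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in color:
                    color[v] = 1 - color[u]  # put v in the other subset
                    queue.append(v)
                elif color[v] == color[u]:
                    return None  # a link inside one subset: not bipartite
    side_a = {n for n, c in color.items() if c == 0}
    side_b = {n for n, c in color.items() if c == 1}
    return side_a, side_b

square = {1: [2, 4], 2: [1, 3], 3: [2, 4], 4: [1, 3]}  # 4-cycle
triangle = {1: [2, 3], 2: [1, 3], 3: [1, 2]}           # 3-cycle
print(bipartition(square))
print(bipartition(triangle))
```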

Directed graph
Each link is directed
Direction represents either the order of a relationship or accessibility between nodes
E.g., genealogy

Weighted directed graph
Most general version of graphs
Both a weight and a direction are assigned to each link
E.g., traffic network

Measuring Topological Properties of Networks (1): Macroscopic Properties

Network density
The ratio of the # of actual links to the # of possible links
For an undirected graph: $d = \frac{|E|}{|V|(|V|-1)/2}$
For a directed graph: $d = \frac{|E|}{|V|(|V|-1)}$
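The two density formulas differ only in the count of possible links, which a short sketch makes explicit. The function name and signature are illustrative; the numbers come from the five-node, nine-link example network on the "Graph = Network" slide.

```python
def density(num_nodes, num_links, directed=False):
    """Ratio of actual links to possible links (no self-loops assumed)."""
    if num_nodes < 2:
        raise ValueError("density needs at least two nodes")
    possible = num_nodes * (num_nodes - 1)  # ordered pairs of distinct nodes
    if not directed:
        possible /= 2                       # unordered pairs
    return num_links / possible

d_undirected = density(5, 9)               # 9 of 10 possible links
d_directed = density(5, 9, directed=True)  # 9 of 20 possible links
print(d_undirected, d_directed)
```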

Characteristic path length
In graph theory: maximum of shortest path lengths between pairs of nodes (a.k.a. network diameter)
In complex network science: average of shortest path lengths
Characterizes how large the world being modeled is
A small length implies that the network is well connected globally
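Both versions can be computed from the same all-pairs shortest path lengths. The following sketch assumes an unweighted, undirected, connected graph and uses breadth-first search on the five-node example network from the "Graph = Network" slide; the helper name `bfs_distances` is an illustrative choice.

```python
from collections import deque
from itertools import combinations

def bfs_distances(adj, source):
    """Shortest path length (in links) from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

adj = {1: [2, 3, 5], 2: [1, 3, 4, 5], 3: [1, 2, 4, 5],
       4: [2, 3, 5], 5: [1, 2, 3, 4]}

all_dist = {u: bfs_distances(adj, u) for u in adj}
pair_lengths = [all_dist[u][v] for u, v in combinations(adj, 2)]

diameter = max(pair_lengths)                            # graph-theory version
average_length = sum(pair_lengths) / len(pair_lengths)  # network-science version
print(diameter, average_length)
```

In this network only nodes 1 and 4 are not directly linked, so nine pairs are at distance 1 and one pair is at distance 2.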

Clustering coefficient
For each node:
Let n be the number of its neighbor nodes
Let m be the number of links among the n neighbors
Calculate $c = m / \binom{n}{2}$
Then $C = \langle c \rangle$ (the average of c over all nodes)
C indicates the average probability that two of one's friends are friends too
A large C implies that the network is well connected locally to form clusters
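The per-node recipe translates almost line by line into code. This sketch uses the five-node example network from the "Graph = Network" slide; treating c as 0 for nodes with fewer than two neighbors is a convention assumed here, since the slide leaves that case undefined.

```python
from itertools import combinations

def local_clustering(adj, node):
    """c = m / (n choose 2) for one node; adj maps node -> set of neighbors."""
    neighbors = adj[node]
    n = len(neighbors)
    if n < 2:
        return 0.0  # assumed convention: c is undefined, treated as 0
    m = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
    return m / (n * (n - 1) / 2)

adj = {1: {2, 3, 5}, 2: {1, 3, 4, 5}, 3: {1, 2, 4, 5},
       4: {2, 3, 5}, 5: {1, 2, 3, 4}}

c = {v: local_clustering(adj, v) for v in adj}
C = sum(c.values()) / len(c)  # average over all nodes
print(c, C)
```

Nodes 1 and 4 have three mutually linked neighbors (c = 1), while nodes 2, 3, and 5 each miss only the 1-4 link among their neighbors (c = 5/6).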

Degree distribution
P(k) = Prob. (or #) of nodes with degree k
Gives a rough profile of how the connectivity is distributed within the network
$\sum_k P(k) = 1$ (or the total # of nodes)
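A degree distribution is a histogram of node degrees, optionally divided by the number of nodes. The sketch below computes both the count and the probability version for the five-node example network from the "Graph = Network" slide; variable names are illustrative.

```python
from collections import Counter

adj = {1: [2, 3, 5], 2: [1, 3, 4, 5], 3: [1, 2, 4, 5],
       4: [2, 3, 5], 5: [1, 2, 3, 4]}

degrees = {v: len(nbrs) for v, nbrs in adj.items()}
counts = Counter(degrees.values())                       # # of nodes with degree k
P = {k: n / len(adj) for k, n in sorted(counts.items())}  # probability version
print(dict(counts), P)
```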

Power law degree distribution
$P(k) \sim k^{-\gamma}$
A few well-connected nodes, a lot of poorly connected nodes
Linear in a log-log plot -> no characteristic scale (scale-free network)
[Figure: P(k) vs. k, and log P(k) vs. log k, for a scale-free network]
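The two claims on this slide follow from one line of algebra each. In the sketch below, the normalization constant $C$ and the rescaling factor $a$ are symbols introduced here for illustration; they do not appear in the original slides.

```latex
% Taking logarithms turns the power law into a straight line with slope -\gamma:
P(k) = C\,k^{-\gamma}
\quad\Longrightarrow\quad
\log P(k) = \log C - \gamma \log k

% Rescaling k by any factor a changes P only by a constant factor,
% so the shape of the distribution has no characteristic scale:
P(a k) = C\,(a k)^{-\gamma} = a^{-\gamma}\,P(k)
```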

How it appears
[Figure: two example networks, labeled "Random" and "Scale-free"]

Degree Distributions of Real-World Complex Networks
[Figure panels: Actors, WWW, Power grid]
A. Barabási, R. Albert, Science 1999; 286: 509-512

Degree distribution of FB
[Figure: P(k) and CCDF]
http://www.facebook.com/note.php?note_id=10150388519243859
http://arxiv.org/abs/1111.4503

Measuring Topological Properties of Networks (2): Centralities

Centrality measures (B, C, D, E)
Degree centrality: how many connections the node has
Betweenness centrality: how many shortest paths go through the node
Closeness centrality: how close the node is to other nodes
Eigenvector centrality

Degree centrality
Simply, the # of links attached to a node
$C_D(v) = \deg(v)$
or sometimes defined as $C_D(v) = \deg(v) / (N-1)$
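Both definitions can be sketched in a few lines; the normalized form divides by the largest degree a node could have, N - 1. The example reuses the five-node network from the "Graph = Network" slide, and the function name and `normalized` flag are illustrative.

```python
def degree_centrality(adj, normalized=False):
    """C_D(v) = deg(v), or deg(v) / (N - 1) when normalized is True."""
    n = len(adj)
    scale = (n - 1) if normalized else 1
    return {v: len(nbrs) / scale for v, nbrs in adj.items()}

adj = {1: [2, 3, 5], 2: [1, 3, 4, 5], 3: [1, 2, 4, 5],
       4: [2, 3, 5], 5: [1, 2, 3, 4]}

raw = degree_centrality(adj)
norm = degree_centrality(adj, normalized=True)
print(raw, norm)
```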

Betweenness centrality
Prob. for a node to be on shortest paths between two other nodes
$C_B(v) = \sum_{s \neq v,\, e \neq v} \frac{\#\mathrm{sp}(s,e,v)}{\#\mathrm{sp}(s,e)}$
s: start node, e: end node
#sp(s,e,v): # of shortest paths from s to e that go through node v
#sp(s,e): total # of shortest paths from s to e
Easily generalizable to group betweenness
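The formula can be evaluated directly by counting shortest paths with breadth-first search. The sketch below is a brute-force version for small, unweighted, undirected graphs, not the faster Brandes algorithm used by most libraries. It counts each unordered pair (s, e) once, which is an assumption made here; summing over ordered pairs would double every value. Function names are illustrative, and the graph is the five-node example from the "Graph = Network" slide.

```python
from collections import deque
from itertools import combinations

def shortest_path_counts(adj, source):
    """BFS from source: (distance, # of shortest paths) for each reachable node."""
    dist = {source: 0}
    count = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                count[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                count[v] += count[u]  # every shortest path to u extends to v
    return dist, count

def betweenness(adj):
    info = {s: shortest_path_counts(adj, s) for s in adj}
    result = {v: 0.0 for v in adj}
    for s, e in combinations(adj, 2):
        dist_s, count_s = info[s]
        dist_e, count_e = info[e]
        if e not in dist_s:
            continue  # s and e are in different components
        for v in adj:
            if v in (s, e) or v not in dist_s or v not in dist_e:
                continue
            # v lies on a shortest s-e path iff d(s,v) + d(v,e) = d(s,e)
            if dist_s[v] + dist_e[v] == dist_s[e]:
                result[v] += count_s[v] * count_e[v] / count_s[e]
    return result

adj = {1: [2, 3, 5], 2: [1, 3, 4, 5], 3: [1, 2, 4, 5],
       4: [2, 3, 5], 5: [1, 2, 3, 4]}
bc = betweenness(adj)
print(bc)
```

In this network the only pair that is not directly linked is (1, 4), which has three shortest paths (through 2, 3, and 5), so each of those three nodes receives 1/3.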

Closeness centrality
Inverse of the average distance from a node to all the other nodes
$C_C(v) = \frac{n-1}{\sum_{w \neq v} d(v,w)}$
d(v,w): length of the shortest path from v to w
Its inverse is called farness
Sometimes Σ is moved out of the fraction, i.e., the reciprocals 1/d(v,w) are summed (this also works for networks that are not strongly connected)
NetworkX calculates closeness within each connected component
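The sketch below computes the closeness formula and, for comparison, the variant with the sum moved outside the fraction. It assumes an unweighted, undirected graph; the first function additionally assumes the graph is connected. The variant is written here as the average of 1/d(v,w), which is one reading of the slide's remark, and all function names are illustrative. The graph is the five-node example from the "Graph = Network" slide.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest path length (in links) from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def closeness(adj, v):
    """(n - 1) / sum of distances; assumes every node is reachable from v."""
    dist = bfs_distances(adj, v)
    return (len(adj) - 1) / sum(d for w, d in dist.items() if w != v)

def harmonic_closeness(adj, v):
    """Average of 1 / d(v, w); unreachable nodes simply contribute 0."""
    dist = bfs_distances(adj, v)
    return sum(1 / d for w, d in dist.items() if w != v) / (len(adj) - 1)

adj = {1: [2, 3, 5], 2: [1, 3, 4, 5], 3: [1, 2, 4, 5],
       4: [2, 3, 5], 5: [1, 2, 3, 4]}
cc = {v: closeness(adj, v) for v in adj}
hc = {v: harmonic_closeness(adj, v) for v in adj}
print(cc, hc)
```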

Eigenvector centrality
Eigenvector of the largest eigenvalue of the adjacency matrix of a network
$C_E(v) = (v\text{-th element of } x)$
$Ax = \lambda x$
$\lambda$: dominant eigenvalue
x is often normalized ($|x| = 1$)
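One common way to obtain the dominant eigenvector is power iteration: multiply a start vector by A repeatedly and renormalize. The sketch below is a plain-Python version under the assumptions that the graph is undirected, connected, not bipartite, and has at least one link (otherwise the iteration may fail to converge or divide by zero). The function name and the fixed iteration count are illustrative choices. The graph is the five-node example from the "Graph = Network" slide.

```python
import math

def eigenvector_centrality(adj, iterations=200):
    """Power iteration on the adjacency list; returns (x, dominant eigenvalue)."""
    n = len(adj)
    x = {v: 1.0 / math.sqrt(n) for v in adj}  # start vector with |x| = 1
    eigenvalue = 0.0
    for _ in range(iterations):
        # One multiplication by A: each node sums its neighbors' scores.
        ax = {v: sum(x[w] for w in adj[v]) for v in adj}
        # Since |x| = 1, the norm of Ax converges to the dominant eigenvalue.
        eigenvalue = math.sqrt(sum(val * val for val in ax.values()))
        x = {v: val / eigenvalue for v, val in ax.items()}
    return x, eigenvalue

adj = {1: [2, 3, 5], 2: [1, 3, 4, 5], 3: [1, 2, 4, 5],
       4: [2, 3, 5], 5: [1, 2, 3, 4]}
x, lam = eigenvector_centrality(adj)
print(x, lam)
```

For this network the symmetry between nodes 1 and 4, and among nodes 2, 3, and 5, reduces $Ax = \lambda x$ to $\lambda^2 - 2\lambda - 6 = 0$, so the dominant eigenvalue is $1 + \sqrt{7}$.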

Exercise
Who is most central by degree, betweenness, closeness, and eigenvector centrality?

Which centrality to use?
To find the most popular person
To find the most efficient person to collect information from the entire organization
To find the most powerful person to control information flow within an organization
To find the most important person (?)

Measuring Topological Properties of Networks (3): Mesoscopic Properties

Degree correlation (assortativity)
Pearson's correlation coefficient of node degrees across links
$r = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}$
X: degree of start node (in / out)
Y: degree of end node (in / out)
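For an undirected network the coefficient can be sketched by listing the degree pair at the two ends of every link, counting each link in both directions so that X and Y are treated symmetrically (an assumption made here for the undirected case). The function name is illustrative; returning NaN when all degrees are equal is also a choice made here, since the correlation is undefined in that case.

```python
import math

def degree_assortativity(links):
    """Pearson correlation of the degrees at the two ends of each link."""
    degree = {}
    for u, v in links:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    xs, ys = [], []
    for u, v in links:            # each undirected link is used in both directions
        xs += [degree[u], degree[v]]
        ys += [degree[v], degree[u]]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    if sd_x == 0 or sd_y == 0:
        return float("nan")       # regular graph: correlation undefined
    return cov / (sd_x * sd_y)

# Hypothetical star: one hub linked to three leaves (maximally disassortative).
star = [("hub", "a"), ("hub", "b"), ("hub", "c")]
# Five-node example network from the "Graph = Network" slide.
example = [(1, 2), (1, 3), (1, 5), (2, 3), (2, 4),
           (2, 5), (3, 4), (3, 5), (4, 5)]
r_star = degree_assortativity(star)
r_example = degree_assortativity(example)
print(r_star, r_example)
```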

Assortative/disassortative networks
Social networks are assortative
Engineered / biological networks are disassortative
(from Newman, M. E. J., Phys. Rev. Lett. 89: 208701, 2002)

K-cores
A connected component of a network obtained by repeatedly deleting all the nodes whose degree is less than k until no more such nodes exist
Helps identify where the core cluster is
All nodes of a k-core have degree at least k
The largest value of k for which a k-core exists is called the degeneracy of the network
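The repeated-deletion procedure can be sketched as follows. Note that this function returns all surviving nodes together; under the slide's definition, each connected component of that remainder is a k-core. The function name and the test graph (a complete graph on a, b, c, d with a two-node tail e, f) are hypothetical.

```python
def k_core_nodes(adj, k):
    """Nodes that survive repeated deletion of nodes with degree < k."""
    remaining = {v: set(nbrs) for v, nbrs in adj.items()}
    while True:
        low = [v for v, nbrs in remaining.items() if len(nbrs) < k]
        if not low:
            return set(remaining)
        for v in low:
            for w in remaining.pop(v):
                if w in remaining:
                    remaining[w].discard(v)  # deleting v lowers w's degree

# Hypothetical graph: K4 on a, b, c, d, plus a tail a - e - f.
adj = {"a": ["b", "c", "d", "e"], "b": ["a", "c", "d"], "c": ["a", "b", "d"],
       "d": ["a", "b", "c"], "e": ["a", "f"], "f": ["e"]}

core_2 = k_core_nodes(adj, 2)
core_3 = k_core_nodes(adj, 3)
core_4 = k_core_nodes(adj, 4)
# Degeneracy: the largest k for which some node survives.
degeneracy = max(k for k in range(len(adj)) if k_core_nodes(adj, k))
print(core_3, degeneracy)
```

Deleting f for k = 2 drops e to degree 1, so e is deleted in the next round; this cascade is why the deletion has to be repeated until nothing changes.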

Exercise
Find the k-core (with the largest k) of the following network

Coreness (core number)
A node's coreness (core number) is c if it belongs to a c-core but not to a (c+1)-core
Indicates how strongly the node is connected to the network
Classifies nodes into several layers
Useful for visualization
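Because every (k+1)-core is contained in a k-core, coreness can be sketched by pruning with k = 1, 2, 3, ... on the same shrinking graph and recording the last k each node survives. This is a simple illustration, not the linear-time algorithm used by libraries; the function name and the test graph (a complete graph on a, b, c, d with a tail e, f) are hypothetical.

```python
def core_numbers(adj):
    """Coreness of every node, by pruning with increasing k."""
    coreness = {v: 0 for v in adj}
    remaining = {v: set(nbrs) for v, nbrs in adj.items()}
    k = 1
    while remaining:
        # Prune the current graph down to its k-core.
        while True:
            low = [v for v in remaining if len(remaining[v]) < k]
            if not low:
                break
            for v in low:
                for w in remaining.pop(v):
                    if w in remaining:
                        remaining[w].discard(v)
        for v in remaining:
            coreness[v] = k  # v belongs to the k-core
        k += 1
    return coreness

# Hypothetical graph: K4 on a, b, c, d, plus a tail a - e - f.
adj = {"a": ["b", "c", "d", "e"], "b": ["a", "c", "d"], "c": ["a", "b", "d"],
       "d": ["a", "b", "c"], "e": ["a", "f"], "f": ["e"]}
coreness = core_numbers(adj)
print(coreness)
```

Node a has degree 4 but coreness 3, which shows that coreness reflects how deeply a node is embedded rather than how many links it has.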

Community
A subgraph of a network within which nodes are connected to each other more densely than to the outside
Still defined vaguely
Various detection algorithms proposed:
K-clique percolation
Hierarchical clustering
Girvan-Newman algorithm
Modularity maximization (e.g., Louvain method)
[Figure: diagram from Wikipedia]

Modularity
A quantity that characterizes how good a given community structure is in dividing the network
$Q = \frac{|E_{in}| - |E_{in-r}|}{|E|}$
$|E_{in}|$: # of links connecting nodes that belong to the same community
$|E_{in-r}|$: estimated $|E_{in}|$ if links were random
$|E|$: total # of links
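The slide does not say how the random baseline is estimated. The sketch below assumes the usual choice (Newman's configuration-model null, in which links are rewired at random while node degrees are kept), which gives an expected within-community link count of the sum over communities of (total degree in the community)^2 / (4|E|). The function name and the two-triangle test graph are hypothetical.

```python
def modularity(links, community_of):
    """Q = (links inside communities - expected such links) / total links."""
    total_links = len(links)
    links_inside = sum(1 for u, v in links
                       if community_of[u] == community_of[v])
    # Total degree of each community (every link adds 1 at each end).
    degree_sum = {}
    for u, v in links:
        for node in (u, v):
            c = community_of[node]
            degree_sum[c] = degree_sum.get(c, 0) + 1
    # Assumed null model: degree-preserving random rewiring.
    expected_inside = sum(d * d for d in degree_sum.values()) / (4 * total_links)
    return (links_inside - expected_inside) / total_links

# Hypothetical graph: two triangles joined by the single link 3-4.
links = [(1, 2), (1, 3), (2, 3), (4, 5), (4, 6), (5, 6), (3, 4)]
two_groups = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}
one_group = {v: "A" for v in range(1, 7)}

q_split = modularity(links, two_groups)
q_single = modularity(links, one_group)
print(q_split, q_single)
```

Splitting along the bridge gives Q = (6 - 3.5) / 7 = 5/14, while lumping everything into one community gives Q = 0, since the baseline then expects all links to be internal.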

Community detection based on modularity
The Louvain method
Heuristic algorithm to construct communities that optimize modularity
Blondel et al. J. Stat. Mech. 2008 (10): P10008
Python implementation by Thomas Aynaud available at: https://bitbucket.org/taynaud/pythonlouvain/