Network Analysis. 1. Large and Complex Networks. Scale-free Networks


COMP4048 Information Visualisation, 2011, 2nd semester
Seokhee Hong

Network Analysis

1. Large and Complex Networks

Scale-free Networks
Albert Barabasi, http://www.nd.edu/~networks/

Society
Nodes: individuals
Links: social relationships (family, work, friendship, etc.)
S. Milgram (1967); John Guare, "Six Degrees of Separation"
[Figure: Kevin Bacon Number example linking actors through shared films. Labels: Austin Powers: The Spy Who Shagged Me; Robert Wagner; Wild Things; Let's Make It Legal; What Price Glory; Barry Norton; A Few Good Men; Monsieur Verdoux]
Social networks: many individuals with diverse social interactions between them.

Communication networks
The Earth is developing an electronic nervous system, a network with diverse nodes and links.
Nodes:
- computers
- routers
- satellites
Links:
- phone lines
- TV cables
- EM waves
Communication networks: many non-identical components with diverse connections between them.

Complex systems
Made of many non-identical elements connected by diverse interactions.
NETWORK

Erdős-Rényi model (1960)
Pál Erdős (1913-1996)
- Democratic
- Random
Connect each pair of nodes with probability p.
Example: p = 1/6, N = 10, <k> ~ 1.5
Degree distribution: Poisson distribution

Cluster Coefficient
Clustering: My friends will likely know each other!
For a node with n neighbors:
C = (# of links between the 1, 2, ..., n neighbors) / (n(n-1)/2)
In a random graph, C ≈ p, the probability to be connected.
Networks are clustered [large C(p)] but have a small characteristic path length [small L(p)].

Network        C          C_rand    L          N
WWW            0.1078     0.00023   3.1        153127
Internet       0.18-0.3   0.001     3.7-3.76   3015-6209
Actor          0.79       0.00027   3.65       225226
Coauthorship   0.43       0.00018   5.9        52909
Metabolic      0.32       0.026     2.9        282
Foodweb        0.22       0.06      2.43       134
C. elegans     0.28       0.05      2.65       282

Watts-Strogatz Model: Small World Networks
C(p): clustering coeff.
L(p): average path length
(Watts and Strogatz, Nature 393, 440 (1998))

World Wide Web: scale-free networks
Nodes: WWW documents
Links: URL links
800 million documents (S. Lawrence, 1999)
ROBOT: collects all URLs found in a document and follows them recursively
R. Albert, H. Jeong, A-L Barabasi, Nature, 401 130 (1999)

What did they expect?
<k> ~ 6, P(k=500) ~ 10^(-99), N_WWW ~ 10^9, so N(k=500) ~ 10^(-90)
They find:
P_out(k) ~ k^(-γ_out), γ_out = 2.45
P_in(k) ~ k^(-γ_in), γ_in = 2.1
P(k=500) ~ 10^(-6), N_WWW ~ 10^9, so N(k=500) ~ 10^3

19 degrees of separation
[Figure: example graph (nd.edu) with nodes 1 to 7. l_15 = 2, path [1 2 5]; l_17 = 4, path [1 3 4 6 7]; <l> = ?]
Finite size scaling: create a network with N nodes with P_in(k) and P_out(k)
<l> = 0.35 + 2.06 log(N)
R. Albert et al Nature (99), based on 800 million webpages [S. Lawrence et al Nature (99)]
IBM: A. Broder et al WWW9 (00)
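The clustering coefficient formula and the G(N, p) construction on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the lecturer's code: the function names (`erdos_renyi`, `local_clustering`, `average_clustering`, `mean_degree`), the adjacency-set representation and the toy graph are all assumptions made for the example.

```python
import random
from itertools import combinations


def erdos_renyi(n, p, seed=None):
    """Return an undirected G(n, p) graph as {node: set of neighbors}."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < p:  # connect each pair independently with probability p
            adj[u].add(v)
            adj[v].add(u)
    return adj


def local_clustering(adj, v):
    """C = (# links among the n neighbors of v) / (n(n-1)/2); 0 if n < 2."""
    nbrs = adj[v]
    n = len(nbrs)
    if n < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return links / (n * (n - 1) / 2)


def average_clustering(adj):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adj, v) for v in adj) / len(adj)


def mean_degree(adj):
    """Average number of links per node, <k>."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)


# Hypothetical toy graph: a triangle 0-1-2 with a pendant node 3 attached to node 0.
toy = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}

if __name__ == "__main__":
    print("toy C(0) =", local_clustering(toy, 0))
    g = erdos_renyi(200, 0.05, seed=1)
    print("random graph: <k> =", mean_degree(g), " C =", average_clustering(g))
```

For the random graph the printed average clustering should come out close to p, which is the C_rand column of the table; real networks in the same table have a much larger C.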

What does it mean?
Poisson distribution: Exponential Network
Power-law distribution: Scale-free Network

INTERNET BACKBONE
Nodes: computers, routers
Links: physical lines
(Faloutsos, Faloutsos and Faloutsos, 1999)

ACTOR CONNECTIVITIES
Nodes: actors
Links: cast jointly
[Figure: example films Days of Thunder (1990), Far and Away (1992), Eyes Wide Shut (1999)]
N = 212,250 actors
<k> = 28.78
P(k) ~ k^(-γ), γ = 2.3

SCIENCE CITATION INDEX
Nodes: papers
Links: citations
[Figure annotations: 25; Witten-Sander PRL 1981; 1736 PRL papers (1988); 2212]
P(k) ~ k^(-γ) (γ = 3)
(S. Redner, 1998)

SCIENCE COAUTHORSHIP
Nodes: scientists (authors)
Links: write paper together
(Newman, 2000; H. Jeong et al 2001)

SCALE-FREE NETWORKS
(1) The number of nodes (N) is NOT fixed.
Networks continuously expand by the addition of new nodes.
Examples:
- WWW: addition of new documents
- Citation: publication of new papers
(2) The attachment is NOT uniform.
A node is linked with higher probability to a node that already has a large number of links.
Examples:
- WWW: new documents link to well known sites (CNN, YAHOO, New York Times, etc.)
- Citation: well cited papers are more likely to be cited again

Scale-free model
(1) GROWTH: At every timestep we add a new node with m edges (connected to the nodes already present in the system).
(2) PREFERENTIAL ATTACHMENT: The probability Π that a new node will be connected to node i depends on the connectivity k_i of that node:
Π(k_i) = k_i / Σ_j k_j
Resulting degree distribution: P(k) ~ k^(-3)
A.-L. Barabási, R. Albert, Science 286, 509 (1999)

GENOME: protein-gene interactions
PROTEOME: protein-protein interactions
METABOLISM: bio-chemical reactions
[Figure: Citrate Cycle]

Metabolic Network
Nodes: chemicals (substrates)
Links: bio-chemical reactions
[Figure: metabolic networks of Archaea, Bacteria, Eukaryotes]
Organisms from all three domains of life are scale-free networks!
H. Jeong, B. Tombor, R. Albert, Z.N. Oltvai, and A.L. Barabasi, Nature, 407 651 (2000)
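The two ingredients of the scale-free model on this slide, growth and preferential attachment, can be simulated directly. The sketch below is one possible implementation under stated assumptions: the function names, the choice of seed network (the first new node links to m initial nodes) and the rule that a new node's m targets are distinct are all choices made for the example, not details given on the slide.

```python
import random
from collections import Counter


def barabasi_albert(n, m, seed=None):
    """Grow a network to n nodes; every new node arrives with m edges.

    Returns a list of undirected edges (new_node, target).
    """
    if m < 1 or n <= m:
        raise ValueError("need 1 <= m < n")
    rng = random.Random(seed)
    edges = []
    # 'stubs' lists every edge endpoint once, so a uniform draw from it
    # selects node i with probability k_i / sum_j k_j.
    stubs = []
    # Assumption: nodes 0..m-1 form the seed and the first new node links to all of them.
    targets = list(range(m))
    for new in range(m, n):
        for t in targets:
            edges.append((new, t))
            stubs.extend((new, t))
        # Preferential attachment: pick m distinct targets for the next node.
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(stubs))
        targets = sorted(chosen)
    return edges


def degrees(edges):
    """Count how many edges touch each node."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


if __name__ == "__main__":
    deg = degrees(barabasi_albert(10000, 3, seed=42))
    histogram = Counter(deg.values())
    for k in sorted(histogram)[:10]:
        print(k, histogram[k] / len(deg))  # empirical P(k) for small k
```

Plotting the full histogram on log-log axes should show a roughly straight line for large k, in line with the power law stated on the slide; a few early nodes end up as hubs with degree far above the average.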

Yeast protein network
Nodes: proteins
Links: physical interactions (binding)
P. Uetz, et al. Nature 403, 623-7 (2000).

Topology of the protein network
P(k) ~ (k + k_0)^(-γ) exp(-(k + k_0) / k_τ)
H. Jeong, S.P. Mason, A.-L. Barabasi, Z.N. Oltvai, Nature 411, 41-42 (2001)

p53 network (mammals)
Nature 408 307 (2000)
One way to understand the p53 network is to compare it to the Internet. The cell, like the Internet, appears to be a scale-free network.

[Figure: Complexity, Network, Scale-free network. Examples shown: Science collaboration, WWW, Food Web, Citation pattern, Internet, Cell. Caption: UNCOVERING ORDER HIDDEN WITHIN COMPLEX SYSTEMS]

Traditional modeling: Network as a static graph
Given a network with N nodes and L links
Create a graph with statistically identical topology
RESULT: model the static network topology
PROBLEM: Real networks are dynamical systems!

Evolving networks
OBJECTIVE: capture the network dynamics
METHOD:
- identify the processes that contribute to the network topology
- develop dynamical models that capture these processes
BONUS: get the topology correctly.

Large and Complex Network
Scale-free networks
(a) Network modeling
- Exponential growth
- Preferential attachment
(b) Properties:
- Power-law degree distribution
- High clustering coefficient
- Ultra-small avg. path length: O(log log n)
Other networks
- Small-world networks
- Power-law random sparse networks (graphs): Fan Chung

2. Social Network Analysis

Basic Concepts and Terminology
- Social Network Analysis: Methods and Applications [Wasserman and Faust 94]
- Network Analysis: Methodological Foundations, LNCS 3418 Tutorial [Brandes and Erlebach eds. 04]

Social Network Analysis
Methodological approach using graph theoretic concepts to describe, understand, explain social structure
Network Model
- node, link, attribute
- weighted graph
- directed graph
Purpose/level of interest
(1) Centrality: important actors/crucial links
(2) Cohesive subgroups: components, cores, cliques
(3) Structural roles: positions, roles, clusters
(4) Network measures/statistics

(1) Centrality
Degree: local measure
Distance measures (global)
- Betweenness [Freeman 79]: a node of low degree can be important (broker, gatekeeper, intermediary); proportion of shortest paths connecting each pair X and Z which pass thru Y
- Closeness: sum of shortest paths to all the other vertices
- Eccentricity: length of the longest shortest path
Feedback measures
- Status/Hub/authority/eigenvector
[Figure: example graph with labelled nodes A, B, C, G, M. Degree: A, B, C. Betweenness: B]
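The centrality measures listed on this slide can be computed with breadth-first search on an unweighted graph. The sketch below is an illustration only: the function names and the small "broker" graph are assumptions, the graph is assumed to be connected and undirected, and betweenness uses Brandes' accumulation scheme, which the slide does not mention.

```python
from collections import deque


def bfs_distances(adj, s):
    """Shortest-path lengths (in hops) from s to every reachable node."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def degree(adj):
    """Local measure: number of neighbors."""
    return {v: len(adj[v]) for v in adj}


def closeness_sum(adj):
    """Sum of shortest paths to all other vertices (smaller means more central)."""
    return {v: sum(bfs_distances(adj, v).values()) for v in adj}


def eccentricity(adj):
    """Length of the longest shortest path starting at each vertex."""
    return {v: max(bfs_distances(adj, v).values()) for v in adj}


def betweenness(adj):
    """For each Y: sum over pairs X, Z of the share of shortest X-Z paths through Y."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        dist = {s: 0}
        sigma = {v: 0 for v in adj}  # number of shortest paths from s
        sigma[s] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):  # accumulate dependencies from the farthest nodes back
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return {v: b / 2 for v, b in bc.items()}  # each unordered pair was counted twice


# Hypothetical example: two triangles joined through a low-degree broker "b".
broker_graph = {
    "a1": {"a2", "a3"}, "a2": {"a1", "a3"}, "a3": {"a1", "a2", "b"},
    "b": {"a3", "c1"},
    "c1": {"b", "c2", "c3"}, "c2": {"c1", "c3"}, "c3": {"c1", "c2"},
}

if __name__ == "__main__":
    print("degree:", degree(broker_graph))
    print("betweenness:", betweenness(broker_graph))
    print("closeness (sum of distances):", closeness_sum(broker_graph))
    print("eccentricity:", eccentricity(broker_graph))
```

In this toy graph the broker has the lowest degree among the inner nodes but the highest betweenness, which is the situation the slide describes as "low degree can be important".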

Displaying Centralities
Node size, edge weight
[Figure: radial drawing and hierarchical drawing]

(2) Cohesive Subgroup
Meaningful social group
Component: strong component/weak component
Cycles/cyclic component
Connected (k-connectivity)/isolated
Cut vertex/separation pair
Core
- k-core: maximal subgraph in which each vertex is adjacent to at least k other vertices (vertex degree >= k)
[Figure: 3-core]
Clique: complete subgraph (strong/weak clique)

n-clique
- extension of clique
- n: maximum path length between members of the clique
- two limitations:
  1. n > 2: sociologically difficult to interpret
  2. path may go thru other non-member vertices
[Figure labels: 1-clique, 2-clique, 3-clique; 2-clique, Diameter: 3]

n-clan
- extension of n-clique, more useful concept
- also requires that the diameter of the clique be no greater than n
[Figure labels: 2-clique; no 2-clan; 2-clans]

k-plex
- set of n vertices in which each vertex is adjacent to all except k of the other vertices (connected to n-k vertices)
- 1-plex = 1-clique: each vertex is connected to n-1 vertices
[Figure labels: 3-clique, not 3-plex; 2-clique, 3-plex]
n-clique/n-clan: reachability (path length)
k-core/k-plex: degree

(3) Network Positions and Structural Equivalence
Positions/Roles
Structural equivalence
Block model (or image matrix): reduction of complex network
Clusters: cliques, distance, similarity
[Figure: matrix with rows 1..n grouped into n1, n2 and columns 1..m grouped into m1, m2, reduced to an image matrix]
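The degree-based subgroup definitions from the Cohesive Subgroup part of this slide (k-core and k-plex) translate into short checks. The sketch below is illustrative: the function names and the example graph are assumptions, and the k-core is found by the usual peeling procedure (repeatedly deleting vertices of degree below k), which the slide does not spell out.

```python
def k_core(adj, k):
    """Vertex set of the k-core: the maximal subgraph with all degrees >= k."""
    alive = set(adj)
    deg = {v: len(adj[v]) for v in adj}
    stack = [v for v in adj if deg[v] < k]
    while stack:
        v = stack.pop()
        if v not in alive:
            continue
        alive.remove(v)  # v cannot belong to the k-core
        for w in adj[v]:
            if w in alive:
                deg[w] -= 1
                if deg[w] < k:
                    stack.append(w)
    return alive


def is_k_plex(adj, nodes, k):
    """True if every vertex in 'nodes' is adjacent to at least len(nodes) - k of them."""
    nodes = set(nodes)
    return all(len(adj[v] & nodes) >= len(nodes) - k for v in nodes)


# Hypothetical example: a complete graph on 1..4 with a tail 4-5-6.
example = {
    1: {2, 3, 4}, 2: {1, 3, 4}, 3: {1, 2, 4},
    4: {1, 2, 3, 5}, 5: {4, 6}, 6: {5},
}

if __name__ == "__main__":
    for k in range(1, 5):
        print(k, "core:", sorted(k_core(example, k)))
    print("{1,2,3,4} is a 1-plex (clique):", is_k_plex(example, {1, 2, 3, 4}, 1))
```

Peeling works because removing a vertex of degree below k can only lower the degrees of its neighbors, so any vertex that survives every round has at least k surviving neighbors.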

Network Positions
Conceptualising similarity of social positions:
- Structural equivalence
- Automorphic equivalence
- Regular equivalence
- Outdegree and indegree equivalence
Blockmodelling
Generalised blockmodelling

Structural Equivalence
The two green nodes are connected to exactly the same alters and are said to be structurally equivalent. They hold identical positions in the network.
Formally, nodes a and b are structurally equivalent if, whenever (a,x) is an edge in G, then so is (b,x), and conversely (for x ≠ a, b).
We can allocate nodes that are structurally equivalent to a (structural equivalence) class, or block, represented by colour in the diagram.

Blockmodel
For a structural equivalence relation: if a node in block x is connected to a node in block y, then every node in block x is connected to every node in block y.
In this case we define an edge (x,y) between block x and block y, and the result is a reduced graph or blockmodel for the network.
[Figure: graph and its blockmodel]

Structural Equivalence for Directed graphs
In a directed graph, nodes a and b are structurally equivalent if
(a) whenever (a,x) is an arc in G then so is (b,x), and conversely, and
(b) whenever (x,a) is an arc in G then so is (x,b), and conversely.
[Figure: directed graph and its blockmodel]

Automorphic Equivalence
A permutation α of the node set V of a graph is a re-labelling of the nodes: we write α(a) for the node to which the label a is attached by the relabelling.
A permutation α is an automorphism of the graph G if, whenever (a,x) is an edge in G, then (α(a),α(x)) is an edge in G, and conversely.
Nodes a and b are automorphically equivalent if b = α(a) for some automorphism α.
Example: nodes with the same colour in the graph are automorphically equivalent.

Why is automorphic equivalence interesting?
Automorphically equivalent nodes have the same position in a network in a more abstract sense than structurally equivalent nodes: they are not connected to the exact same nodes, but to nodes that play analogous roles in the network.
This gives a potential representation for roles: leader, principal, broker, loner, clown, etc.
Note: if nodes are structurally equivalent, then they are also automorphically equivalent, but the converse does not hold.
[Figure: graph and its blockmodel]
Generalisation to directed and multiple graphs is straightforward, as for structural equivalence.
Note: thick edge on a node indicates a loop (self-tie).
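The definition of structural equivalence and the blockmodel construction from this slide can be sketched for an undirected graph as follows. The function names and the example graph are assumptions made for illustration; ties inside a block are ignored in the reduced graph, which is a simplification chosen here.

```python
def structurally_equivalent(adj, a, b):
    """True if a and b are linked to exactly the same other nodes x (x != a, b)."""
    return adj[a] - {a, b} == adj[b] - {a, b}


def equivalence_classes(adj):
    """Partition the nodes into structural equivalence classes (blocks)."""
    classes = []
    for v in sorted(adj):
        for block in classes:
            # Comparing with one representative is enough for an equivalence relation.
            if structurally_equivalent(adj, block[0], v):
                block.append(v)
                break
        else:
            classes.append([v])
    return classes


def blockmodel(adj, classes):
    """Reduced graph: blocks x and y are joined if a node in x is linked to a node in y."""
    block_of = {v: i for i, block in enumerate(classes) for v in block}
    edges = set()
    for v in adj:
        for w in adj[v]:
            if block_of[v] != block_of[w]:  # assumption: within-block ties are dropped
                edges.add(tuple(sorted((block_of[v], block_of[w]))))
    return edges


# Hypothetical example: hub "h" with three leaves, and a second hub "m" with two leaves.
example = {
    "h": {"l1", "l2", "l3", "m"},
    "l1": {"h"}, "l2": {"h"}, "l3": {"h"},
    "m": {"h", "n1", "n2"},
    "n1": {"m"}, "n2": {"m"},
}

if __name__ == "__main__":
    blocks = equivalence_classes(example)
    print("blocks:", blocks)
    print("block-level edges:", sorted(blockmodel(example, blocks)))
```

The leaves of each hub share exactly the same neighbor and so fall into one block each, while the two hubs stay in separate blocks because their neighbor sets differ.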

Regular Equivalence
Two nodes a and b are regularly equivalent if, whenever (a,x) is an edge in G, there is some node y that is regularly equivalent to x for which (b,y) is an edge in G, and conversely.
In other words, if regularly equivalent nodes have the same colour, then each node in a class of regularly equivalent nodes is connected to other nodes of exactly the same set of colours (e.g. red to yellow, green).
[Figure: graph (regular equivalence) and its blockmodel]

Why are these forms of equivalence of interest?
All of these forms of equivalence and several others [Pattison 1993] have the property that there is a path at the block level (assuming a block-to-block tie whenever there is at least one node-level tie) if and only if there is at least one path at the node level.

Blockmodels and Generalised Blockmodels
A blockmodel represents the relations among social positions, and comprises:
- an assignment of nodes to blocks, or positions (classes of equivalent nodes), and
- a specification of relations between blocks
Different forms of equivalence are associated with different requirements for submatrix patterns of inter-block relations.
The most common practical applications to date have involved structural equivalence (vast majority) and regular equivalence (niche applications), but software availability has played an important part.
In a generalised blockmodel (Doreian et al., 2004, 2005), each submatrix may correspond to a different form of equivalence.

(4) Network Measures/Statistics
- Degree distribution
- Clustering coefficient
- Diameter
- Average path length
- Connected component
- Density

3. Network Analysis using Pajek
Vladimir Batagelj
http://vlado.fmf.uni-lj.si/pub/networks/pajek/
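Several of the statistics listed under (4) Network Measures/Statistics can be computed with plain breadth-first search, independently of any tool such as Pajek. The sketch below is an illustration under assumptions: the function names and the example graphs are invented for the example, the graph is undirected and unweighted, and diameter and average path length are taken over connected pairs only.

```python
from collections import Counter, deque


def bfs_distances(adj, s):
    """Shortest-path lengths (in hops) from s to every reachable node."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def density(adj):
    """Fraction of the n(n-1)/2 possible links that are present."""
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) / 2
    return 2 * m / (n * (n - 1))


def degree_distribution(adj):
    """P(k): fraction of nodes with degree k."""
    counts = Counter(len(nbrs) for nbrs in adj.values())
    return {k: c / len(adj) for k, c in counts.items()}


def connected_components(adj):
    """List of node sets, one per connected component."""
    seen, components = set(), []
    for v in adj:
        if v not in seen:
            comp = set(bfs_distances(adj, v))
            seen |= comp
            components.append(comp)
    return components


def path_stats(adj):
    """Return (diameter, average path length) over all connected pairs."""
    total = pairs = longest = 0
    for s in adj:
        for t, d in bfs_distances(adj, s).items():
            if t != s:
                total += d
                pairs += 1
                longest = max(longest, d)
    return longest, (total / pairs if pairs else 0.0)


# Hypothetical examples: a path 0-1-2-3, and the same path plus an isolated node 4.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
path_plus_isolate = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}, 4: set()}

if __name__ == "__main__":
    print("density:", density(path))
    print("P(k):", degree_distribution(path))
    print("diameter, average path length:", path_stats(path))
    print("components:", connected_components(path_plus_isolate))
```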

References
Scale-free Networks: http://www.nd.edu/~networks/
Social Network Analysis: INSNA http://www.insna.org
SNA Tools
- UCINET
- Pajek: http://vlado.fmf.uni-lj.si/pub/networks/pajek/
- NetMiner
- Visone
Visual analysis tool
- GEOMI (GEOmetry for Maximum Insight): http://www.cs.usyd.edu.au/~visual/valacon/geomi/
  Network Analysis Plug-ins, Graph Layout Plug-ins, Interaction Plug-ins
- Wilmascope

Analysis Methods
Graph theory
- Tree/planar/directed graph algorithms
- Graph partitioning
- Graph clustering
- Graph decomposition
Social Network Analysis
- Centrality
- Cohesive subgroups
- Structural position/role
- Block modeling
- ..

Graph/Network Models
Graph models
- Tree/planar graphs
- Clustered graphs
- Hierarchical graphs
- Directed graphs
- Hyper-Graphs
- Hi-graphs
Network models
- Scale-free networks
- Evolution networks
- Dynamic networks
- Temporal networks
- Random sparse networks

Application Domains
Social networks
- Citation network
- Collaboration network
- Telephone call network
- Policy network
Biological networks
- Phylogenetic networks
- Metabolic pathways
- Protein-Protein Interaction networks
- Gene regulatory networks
- Signaling networks
Webgraphs/AS graphs
Software engineering

2.5D Graph/Network Layouts
Graph models
- Trees
- Planar graphs
- Clustered graphs
- Hierarchical graphs
- Directed graphs
Network models
- Scale-free networks
- Evolution networks
- Dynamic networks
- Temporal networks
- Overlapping networks

Assignment 2: Programming assignment
1. Form a group of 2 people
2. Send me an email: group name and members
* Flexibility:
  - 1 person: fewer requirements
  - no exam option: more requirements

Homework
Find visualisations of social networks displaying:
- Centrality analysis
- k-core analysis
- Structural equivalence
- Network Motifs
- Scale-free networks
- Small world networks