Sergei Silvestrov, Christopher Engström. January 29, 2013



Today's lecture: distance and similarity measurements between vertices, centrality measures, and hierarchical clustering.

First we would like to define what we mean by the distance between two vertices. We want our distance function, and the distance between two vertices in a graph, to behave in the same way as the distance between two points in space under some metric. We start by defining a metric, which we will then use to construct distance functions between vertices in a graph as well.

Definition. A function f(x, y) is a metric if the following properties are satisfied:
f(x, y) = 0 iff x = y
f(x, y) = f(y, x) (symmetry)
f(x, y) ≥ 0 (non-negativity)
f(x, y) + f(y, z) ≥ f(x, z) (triangle inequality)

A set with a metric is called a metric space. Some examples are:
R^3 together with the Euclidean distance: d(a, b) = √((a_1 − b_1)^2 + (a_2 − b_2)^2 + (a_3 − b_3)^2)
R^2 together with the Manhattan distance: d(a, b) = |a_1 − b_1| + |a_2 − b_2|
Any set together with the discrete metric: d(a, b) = 1 if a ≠ b, and d(a, b) = 0 if a = b
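As a quick illustration, here is a minimal Python sketch of these three metrics (the function names and the example points are our own, not part of the lecture):

import math

def euclidean(a, b):
    # Euclidean distance in R^n: square root of the summed squared coordinate differences.
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def manhattan(a, b):
    # Manhattan distance in R^n: sum of absolute coordinate differences.
    return sum(abs(ai - bi) for ai, bi in zip(a, b))

def discrete(a, b):
    # Discrete metric on any set: 0 if the elements are equal, 1 otherwise.
    return 0 if a == b else 1

print(euclidean((0, 0, 0), (1, 2, 2)))  # 3.0
print(manhattan((0, 0), (1, 2)))        # 3
print(discrete("x", "y"))               # 1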

However, the set of vertices V in a graph can also be considered a metric space:
We obviously cannot use the Euclidean or Manhattan distance though, since our vertices don't have coordinates.
The discrete metric obviously works, but doesn't tell us much.

Let us consider a connected, undirected, unweighted graph G with vertices V.
Theorem. The set of vertices V of a connected undirected simple graph is a metric space together with the metric defined as the shortest path distance between vertices.
Proof. We check that the shortest path distance fulfills the required properties: The shortest path from a vertex to itself is zero, hence d(x, x) = 0, and for all other pairs we must have d(x, y) > 0 since there must be at least one edge on any path between them.

Since the graph is undirected, any path from x to y is also a path from y to x by reversing the order. We therefore have d(x, y) = d(y, x), since otherwise we could reverse the order of the shorter path to create one of the same length in the other direction.
The shortest path is obviously non-negative since we can't have a negative number of edges between two vertices.

Last to check we have the triangle inequality: d(x, y) + d(y, z) ≥ d(x, z). If we assume that this is not true for the shortest path distance, then there are vertices x, y, z such that d(x, y) + d(y, z) < d(x, z). However, this would mean that we get a path from x to z going through y which is shorter than the shortest path from x to z. We have a contradiction, and the triangle inequality must be true.
All four properties hold and we see that the shortest path distance is a metric.
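Since the proof only uses shortest path lengths in an unweighted graph, a breadth-first search is enough to compute them. A minimal sketch, assuming an adjacency-list representation (the example graph is our own illustration):

from collections import deque

def bfs_distances(adj, source):
    # Shortest path length (number of edges) from source to every reachable vertex.
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

# Small undirected example graph as an adjacency list.
adj = {1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3, 5], 5: [4]}
print(bfs_distances(adj, 1))  # {1: 0, 2: 1, 3: 1, 4: 2, 5: 3}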

If we have a weighted graph with positive weights, it can be shown that the shortest path, where the length of a path is the sum of the weights on the edges of the path, is also a metric.
If we have a directed graph, then the shortest path is no longer a metric since it might not be symmetric. Functions that fulfill all the properties except symmetry are sometimes called quasi-metrics.
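For the weighted case the shortest path distances can be computed with, for example, Dijkstra's algorithm. A short sketch assuming a dictionary-of-dictionaries weight representation (our own choice, not the lecture's):

import heapq

def dijkstra(weights, source):
    # weights[u][v] is the positive weight of the edge (u, v).
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry, a shorter path to u was already found
        for v, w in weights[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

weights = {1: {2: 2.0, 3: 5.0}, 2: {1: 2.0, 3: 1.0}, 3: {1: 5.0, 2: 1.0}}
print(dijkstra(weights, 1))  # {1: 0, 2: 2.0, 3: 3.0}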

Another type of metric, used for trees, is the Zhong metric defined as:
d(x, y) = 2·m(ccp(x, y)) − m(x) − m(y)
m(c) = 1 / k^(depth(c)+1)
where m(c), called the milestone, depends on the depth of node c in the tree, ccp(x, y) is the closest common parent node, and k is a positive constant.
This metric is useful when you want the depth of the nodes to influence the distance between them, for example if you want the distance between nodes higher up in the tree to be larger.
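A small sketch of this tree metric, assuming the reconstructed milestone formula m(c) = 1/k^(depth(c)+1) and a tree given by parent pointers (both the representation and the example tree are our own illustration):

def ancestors(parent, x):
    # Path from x up to the root, x included.
    path = [x]
    while parent[x] is not None:
        x = parent[x]
        path.append(x)
    return path

def zhong_distance(parent, depth, x, y, k=2.0):
    m = lambda c: 1.0 / k ** (depth[c] + 1)  # milestone of node c
    anc_x = ancestors(parent, x)
    ccp = next(a for a in ancestors(parent, y) if a in anc_x)  # closest common parent
    return 2 * m(ccp) - m(x) - m(y)

# Example tree: root r with children a and b; a has a child c.
parent = {"r": None, "a": "r", "b": "r", "c": "a"}
depth = {"r": 0, "a": 1, "b": 1, "c": 2}
print(zhong_distance(parent, depth, "a", "b"))  # 2*(1/2) - 1/4 - 1/4 = 0.5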

An alternative to the distance between nodes is to instead use a similarity measurement between them. If two nodes are close the similarity is large, and if they are far apart the similarity is close to zero.
Similarity between nodes is not as rigorously defined; it is however often easy to go from a similarity to a metric or the other way around: if we have the distance d(x, y) between nodes, we can define the similarity as sim(x, y) = max(d) − d(x, y), where max(d) is the maximum distance between any pair in the graph.
Similarly, we can transform a similarity measurement into a metric (possibly not fulfilling all the properties).
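A minimal sketch of this distance-to-similarity conversion, given a dictionary of pairwise distances (the helper name and example values are our own):

def similarity_from_distance(dist):
    # dist[(x, y)] holds the distance for every pair; sim(x, y) = max(d) - d(x, y).
    max_d = max(dist.values())
    return {pair: max_d - d for pair, d in dist.items()}

dist = {("a", "b"): 1, ("a", "c"): 3, ("b", "c"): 2}
print(similarity_from_distance(dist))  # {('a', 'b'): 2, ('a', 'c'): 0, ('b', 'c'): 1}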

Examples of similarity measurements are:
Covariance or correlation.
LCH, defined on trees as: sim(x, y) = −log( (sp(x, y) + 1) / (2M) ), where sp(x, y) is the shortest path and M is the maximum depth in the graph.
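A quick sketch of the LCH similarity under the reconstructed formula above (the minus sign and the placement of the parentheses are our reading of the garbled slide):

import math

def lch_similarity(sp, max_depth):
    # LCH-style similarity: -log of (shortest path + 1) scaled by twice the maximum depth.
    return -math.log((sp + 1) / (2 * max_depth))

print(lch_similarity(sp=0, max_depth=5))  # identical nodes: log(10) = 2.30...
print(lch_similarity(sp=4, max_depth=5))  # nodes farther apart: log(2) = 0.69...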

The choice of similarity or distance often comes down to the choice of method, or what the possible weights on the edges represent. If weights on edges represent, for example, length or travel time, distance is probably more appropriate.
If weights represent correlation, or the number of times two objects were found or used together, similarity is probably better.

Centrality of a node in a network is a measure of how important the node is within the graph. This could for example be:
How often a certain road is used in a road network?
How influential someone is in a social network?
How important a server is for the overall network stability; which are the major potential bottlenecks?

Degree centrality
The simplest commonly used centrality measure is degree centrality:
Definition. Degree centrality C_D(v) of a vertex v in a graph G with vertices V and edges E is defined as C_D(v) = deg(v), where the degree deg(v) is the number of edges connected to v; loops (an edge from v to v) are counted twice. If the graph is directed we let the indegree be the number of edges pointing towards v, and the outdegree be the number of edges starting in v.
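A minimal sketch of degree centrality from an edge list, counting loops twice as in the definition (the edge-list representation and example edges are our own):

from collections import Counter

def degree_centrality(edges):
    # Each undirected edge (u, v) contributes 1 to deg(u) and 1 to deg(v);
    # a loop (v, v) therefore contributes 2 to deg(v).
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)

edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 4)]
print(degree_centrality(edges))  # {1: 2, 2: 2, 3: 3, 4: 3}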

Degree centrality Figure: Example of degree centrality of a graph

Closeness centrality
Definition. Closeness centrality C_C(v) of a vertex v in a graph G with vertices V and edges E is defined as:
C_C(v_i) = 1 / ( Σ_{j=1}^{|V|} sp(v_i, v_j) )
where sp(v_i, v_j) is the length of the shortest path between v_i and v_j.

Closeness centrality
The shorter the paths are from a vertex to all other vertices, the larger is the closeness.
The closeness can be seen as a measure of how fast information can spread from the vertex to all others in, for example, a broadcast network.
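Combining breadth-first search with the definition gives closeness centrality directly. A short sketch for a connected undirected graph (our own implementation, repeating the BFS helper so the snippet is self-contained):

from collections import deque

def bfs_distances(adj, source):
    dist, queue = {source: 0}, deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def closeness_centrality(adj):
    # C_C(v) = 1 / (sum of shortest path lengths from v to all other vertices).
    return {v: 1.0 / sum(bfs_distances(adj, v).values()) for v in adj}

adj = {1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3, 5], 5: [4]}
print(closeness_centrality(adj))
# vertex 4 is the most central: {1: 0.142..., 2: 0.166..., 3: 0.166..., 4: 0.2, 5: 0.125}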

Closeness centrality Figure: Example of closeness centrality of a graph

Betweenness centrality
Definition. Betweenness centrality C_B(v) of a vertex v in a graph G with vertices V and edges E is defined as:
C_B(v) = Σ_{s ≠ v ≠ t ∈ V} σ_st(v) / σ_st
where σ_st is the total number of shortest paths from s to t and σ_st(v) is the total number of shortest paths from s to t that pass through v.

Betweenness centrality
Betweenness was initially used to measure the control one person has over the communication of others in social networks.
It's a good way to identify possible bottlenecks in a network.
Sometimes the betweenness is normalized by dividing by the total number of pairs of vertices not including v.
Calculating either the betweenness or the closeness described earlier is quite expensive, since we need to find the shortest paths between every pair of vertices.
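A brute-force sketch of the definition for small connected unweighted graphs: count shortest paths with breadth-first search, then accumulate σ_st(v)/σ_st over all ordered pairs. This is our own straightforward implementation; efficient algorithms such as Brandes' exist but are not covered here.

from collections import deque

def bfs_paths(adj, s):
    # Shortest path lengths and numbers of shortest paths from s to every vertex.
    dist, sigma = {s: 0}, {s: 1}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]
    return dist, sigma

def betweenness_centrality(adj):
    dist, sigma = {}, {}
    for s in adj:
        dist[s], sigma[s] = bfs_paths(adj, s)
    cb = {v: 0.0 for v in adj}
    for s in adj:
        for t in adj:
            for v in adj:
                if len({s, t, v}) < 3:
                    continue
                # v lies on a shortest s-t path iff the two sub-distances add up.
                if dist[s][v] + dist[v][t] == dist[s][t]:
                    cb[v] += sigma[s][v] * sigma[v][t] / sigma[s][t]
    return cb

adj = {1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3, 5], 5: [4]}
print(betweenness_centrality(adj))  # {1: 1.0, 2: 2.0, 3: 2.0, 4: 7.0, 5: 0.0}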

Betweenness centrality Figure: Example of betweenness centrality of a graph

Eigenvector centrality
Definition. Given a connected graph G with vertices V, edges E and adjacency matrix A, consider the eigenvalue equation Ax = λx. We get the eigenvector centrality of vertex v_i as the i-th element of x, where x is the eigenvector corresponding to the dominant eigenvalue of A.

Eigenvector centrality
x is a positive vector (given by the Perron-Frobenius theorem for non-negative irreducible matrices).
Eigenvector centrality gives a measure of the influence of nodes in a network.
Eigenvector centrality can be computed efficiently even for extremely large sparse graphs through the use of the power method.
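A minimal power-method sketch with numpy; the adjacency matrix (a triangle 1-2-3 with a pendant vertex 4) and the number of iterations are our own example values:

import numpy as np

def eigenvector_centrality(A, iterations=100):
    # Power method: repeatedly apply A and renormalize; for a connected
    # non-bipartite graph the iterate converges to the dominant eigenvector.
    x = np.ones(A.shape[0])
    for _ in range(iterations):
        x = A @ x
        x = x / np.linalg.norm(x)
    return x

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
print(eigenvector_centrality(A))  # vertex 3 (third entry) gets the largest score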

Eigenvector centrality Figure: Example of eigenvector centrality of a graph

Other centrality measures
Two other centrality measures are Katz centrality and PageRank. Katz centrality can be written as:
x_i = Σ_{k=1}^{∞} Σ_{j=1}^{N} α^k (A^k)_{ij},  α ∈ (0, 1)
This can be seen as a generalized version of degree centrality where vertices further away are also counted, but with a discounting factor α.
PageRank is another centrality measure, which we will look at more closely during one of the seminars.
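Assuming the reconstructed infinite sum above, the geometric series gives the closed form x = ((I − αA)^(-1) − I)·1 whenever α is smaller than 1 over the dominant eigenvalue of A. A numpy sketch under that assumption (the example matrix reuses the triangle-plus-pendant graph):

import numpy as np

def katz_centrality(A, alpha=0.1):
    # Closed form of sum_{k>=1} alpha^k A^k applied to the all-ones vector;
    # valid only when alpha < 1 / (dominant eigenvalue of A).
    n = A.shape[0]
    M = np.linalg.inv(np.eye(n) - alpha * A) - np.eye(n)
    return M @ np.ones(n)

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
print(katz_centrality(A, alpha=0.1))  # vertex 3 again receives the largest score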

Centralization
The centralization of a network is a measure of how central the most central node is in relation to how central all the other nodes are.
Definition. Centralization C_x of a graph G with vertices V and edges E is defined as:
C_x = ( Σ_{i=1}^{N} [C_x(p*) − C_x(p_i)] ) / ( max Σ_{i=1}^{N} [C_x(p*) − C_x(p_i)] )
where C_x(p_i) is the centrality of point p_i, C_x(p*) is the largest centrality in the network, and max Σ_{i=1}^{N} [C_x(p*) − C_x(p_i)] is the largest theoretical sum of differences in centrality for any graph with the same number of vertices.
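A sketch of the numerator given any set of centrality scores; the theoretical maximum in the denominator depends on which centrality is used, so it is passed in as a parameter here. For degree centrality on a simple graph the maximum is known to be (N − 1)(N − 2), achieved by the star graph; that value is used below as an assumption, it is not derived in the lecture.

def centralization(scores, theoretical_max):
    # scores: centrality value per vertex; theoretical_max: largest possible
    # sum of differences for any graph with the same number of vertices.
    c_star = max(scores.values())
    return sum(c_star - c for c in scores.values()) / theoretical_max

# Degree centralization of the star graph on 5 vertices (vertex 0 is the center).
deg = {0: 4, 1: 1, 2: 1, 3: 1, 4: 1}
n = len(deg)
print(centralization(deg, theoretical_max=(n - 1) * (n - 2)))  # 1.0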

Hierarchical clustering is a method which doesn't try to create one single set of clusters; instead it aims to create a hierarchy of clusters. There are two primary types of hierarchical clustering:
Agglomerative, in which each vertex starts in its own cluster; clusters are then merged to create larger and larger clusters.
Divisive, in which all vertices start in a single cluster; clusters are then split to create smaller and smaller clusters.
Because of computational limits, new clusters are generally decided using a greedy algorithm, in which the best new merge/split is chosen every time based on the old clusters.

Most forms of hierarchical clustering use some metric in order to define the distance between points in space or vertices in a graph. After a metric is chosen we also need some way to determine the distance between two sets of points or vertices.
We call this a linkage criterion.

Some common linkage criteria between two different sets A and B are:
Maximum: max{d(a, b) : a ∈ A, b ∈ B}
Minimum: min{d(a, b) : a ∈ A, b ∈ B}
Mean: (1 / (|A| |B|)) Σ_{a ∈ A} Σ_{b ∈ B} d(a, b)
Probability that the clusters come from the same distribution.
A small agglomerative sketch using the distance-based criteria is given below.
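A minimal sketch of the three distance-based linkage criteria together with a greedy agglomerative loop that repeatedly merges the two closest clusters; the pairwise-distance dictionary and the stopping rule (merge down to a target number of clusters) are our own illustrative choices:

import itertools

def linkage(d, A, B, kind="min"):
    # Distance between clusters A and B given pairwise distances d[(x, y)].
    values = [d[tuple(sorted((a, b)))] for a in A for b in B]
    if kind == "max":
        return max(values)
    if kind == "min":
        return min(values)
    return sum(values) / len(values)  # mean linkage

def agglomerative(points, d, target=2, kind="min"):
    clusters = [{p} for p in points]  # every point starts in its own cluster
    while len(clusters) > target:
        # Greedy step: find and merge the pair of clusters with the smallest linkage.
        i, j = min(itertools.combinations(range(len(clusters)), 2),
                   key=lambda ij: linkage(d, clusters[ij[0]], clusters[ij[1]], kind))
        clusters[i] |= clusters[j]
        del clusters[j]
    return clusters

points = ["a", "b", "c", "d"]
d = {("a", "b"): 1, ("a", "c"): 4, ("a", "d"): 5,
     ("b", "c"): 3, ("b", "d"): 4, ("c", "d"): 1}
print(agglomerative(points, d))  # [{'a', 'b'}, {'c', 'd'}] (set print order may vary)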