Section 7.12: Similarity. By: Ralucca Gera, NPS

Similar documents
Node Similarity. Ralucca Gera, Applied Mathematics Dept. Naval Postgraduate School Monterey, California

Centralities (4) By: Ralucca Gera, NPS. Excellence Through Knowledge

Section 7.13: Homophily (or Assortativity) By: Ralucca Gera, NPS

Centralities (4) Ralucca Gera, Applied Mathematics Dept. Naval Postgraduate School Monterey, California Excellence Through Knowledge

Spectral Clustering. Presented by Eldad Rubinstein Based on a Tutorial by Ulrike von Luxburg TAU Big Data Processing Seminar December 14, 2014

V2: Measures and Metrics (II)

Basics of Network Analysis

General Instructions. Questions

CSE 547: Machine Learning for Big Data Spring Problem Set 2. Please read the homework submission policies.

Erdös-Rényi Graphs, Part 2

Sergei Silvestrov, Christopher Engström. January 29, 2013

Information Retrieval. (M&S Ch 15)

Complex-Network Modelling and Inference

TELCOM2125: Network Science and Analysis

Homework 4: Clustering, Recommenders, Dim. Reduction, ML and Graph Mining (due November 19 th, 2014, 2:30pm, in class hard-copy please)

Clustering. Distance Measures Hierarchical Clustering. k -Means Algorithms

Math 778S Spectral Graph Theory Handout #2: Basic graph theory

Document Clustering: Comparison of Similarity Measures

MATH 567: Mathematical Techniques in Data

Training-Free, Generic Object Detection Using Locally Adaptive Regression Kernels

CPSC 340: Machine Learning and Data Mining. Kernel Trick Fall 2017

Modularity CMSC 858L

Ma/CS 6b Class 13: Counting Spanning Trees

REGULAR GRAPHS OF GIVEN GIRTH. Contents

Gene Clustering & Classification

Clustering. Informal goal. General types of clustering. Applications: Clustering in information search and analysis. Example applications in search

(Type your answer in radians. Round to the nearest hundredth as needed.)

Near Neighbor Search in High Dimensional Data (1) Dr. Anwar Alhenshiri

Lecture 6: Unsupervised Machine Learning Dagmar Gromann International Center For Computational Logic

Recommender System. What is it? How to build it? Challenges. R package: recommenderlab

Based on Raymond J. Mooney s slides

Clustering Algorithms for general similarity measures

Community detection. Leonid E. Zhukov

V4 Matrix algorithms and graph partitioning

Cluster Analysis. Angela Montanari and Laura Anderlucci

A Comparison of Document Clustering Techniques

Matching and Covering

Transductive Learning: Motivation, Model, Algorithms

1. The Normal Distribution, continued

Graphs. Terminology. Graphs & Breadth First Search (BFS) Extremely useful tool in modeling problems. Consist of: Vertices Edges

Robustness of Centrality Measures for Small-World Networks Containing Systematic Error

Community Detection. Community

Diffusion Maps and Topological Data Analysis

Solving Linear Recurrence Relations (8.2)

University of Florida CISE department Gator Engineering. Clustering Part 2

CS1114 Section: SIFT April 3, 2013

Graph Theory S 1 I 2 I 1 S 2 I 1 I 2

CSI 445/660 Part 10 (Link Analysis and Web Search)

Math 776 Graph Theory Lecture Note 1 Basic concepts

Centrality in Large Networks

Section 4.1: Introduction to Trigonometry

Overview of Clustering

Introduction to Mixed Models: Multivariate Regression

Spectral Clustering. Xiao Zeng + Elham Tabassi. CSE Class Presentation, March 16,

1 More configuration model

Basic Combinatorics. Math 40210, Section 01 Fall Homework 4 Solutions

CS473: Course Review CS-473. Luo Si Department of Computer Science Purdue University

Information Retrieval. Lecture 7

Link Prediction and Anomoly Detection

UNIVERSITY OF OSLO. Faculty of Mathematics and Natural Sciences

Linear Regression and K-Nearest Neighbors 3/28/18

Lecture 11: E-M and MeanShift. CAP 5415 Fall 2007

Introduction to Machine Learning

PCP and Hardness of Approximation

Lecture 11: Clustering and the Spectral Partitioning Algorithm A note on randomized algorithm, Unbiased estimates

Measure of Distance. We wish to define the distance between two objects Distance metric between points:

Mining Web Data. Lijun Zhang

CPSC 340: Machine Learning and Data Mining. Finding Similar Items Fall 2017

Tag-based Social Interest Discovery

Spectral Methods for Network Community Detection and Graph Partitioning

Solving Systems of Equations Using Matrices With the TI-83 or TI-84

Sum of Exterior Angles of Polygons TEACHER NOTES

Network Traffic Measurements and Analysis

Table of Laplace Transforms

Cluster algebras and infinite associahedra

Reflexive Regular Equivalence for Bipartite Data

vector space retrieval many slides courtesy James Amherst

CISC 4631 Data Mining

Semi-Supervised Clustering with Partial Background Information

Recommendation Algorithms: Collaborative Filtering. CSE 6111 Presentation Advanced Algorithms Fall Presented by: Farzana Yasmeen

Mining Social Network Graphs

TELCOM2125: Network Science and Analysis

PRINCIPAL COMPONENT ANALYSIS. Eigenvectors of the covariance matrix are the principal components

Types of general clustering methods. Clustering Algorithms for general similarity measures. Similarity between clusters

Relative Constraints as Features

CAIM: Cerca i Anàlisi d Informació Massiva

Distance Methods. "PRINCIPLES OF PHYLOGENETICS" Spring 2006

Data Preprocessing. Javier Béjar. URL - Spring 2018 CS - MAI 1/78

Information Retrieval: Retrieval Models

Defns An angle is in standard position if its vertex is at the origin and its initial side is on the -axis.

5 Machine Learning Abstractions and Numerical Optimization

CSE 5243 INTRO. TO DATA MINING

Data Mining. Lecture 03: Nearest Neighbor Learning

Lecture 17 November 7

Two-graphs revisited. Peter J. Cameron University of St Andrews Modern Trends in Algebraic Graph Theory Villanova, June 2014

Clustering CS 550: Machine Learning

CS 229 Final Project - Using machine learning to enhance a collaborative filtering recommendation system for Yelp

Dynamic Collision Detection

CS 140: Sparse Matrix-Vector Multiplication and Graph Partitioning

Package dissutils. August 29, 2016

Transcription:

Section 7.12: Similarity By: Ralucca Gera, NPS

Motivation. We talked about global properties: average degree, average clustering, average path length. We talked about local properties: some node centralities. Next we'll look at pairwise properties of nodes: node equivalence and similarity, and correlation between pairs of nodes. Why care? How do items get suggested to users? Based on the past history of the user, and based on the similarity of the user with other people.

Similarity. In a complex network, one can measure similarity between vertices or between networks. We consider the similarity between vertices in the same network. How can this similarity be quantified? Differentiating vertices helps tease apart the types and relationships of vertices. Useful for "click here for pages/movies/books similar to this one." We determine similarities based on the network structure. There are two types of similarities: structural equivalence (the main measure being the Pearson correlation coefficient) and regular equivalence.

Structural vs. Regular Equivalent Nodes. Defn: Two vertices are structurally equivalent if they share many neighbors, such as two math students i and j who know the same professors. Defn: Two vertices are regularly equivalent if they have many neighbors that are also equivalent, such as two deans i and j who have similar ties: provost, president, department chairs (some could actually be in common).

Structural Equivalence

Structurally Equivalent Nodes. Two vertices are structurally equivalent if they share many neighbors. How can we measure this similarity?

Structural similarity. 1. Count of common neighbors: n_ij = Σ_k A_ik A_kj. 2. Normalized common neighbors (divide by a constant): by the number of vertices (n_ij / n), or by the maximum possible number of common neighbors (n_ij / min(deg i, deg j)). Jaccard similarity is the quotient of the number of common neighbors to the number of all neighbors in the union: J_ij = |N(i) ∩ N(j)| / |N(i) ∪ N(j)|. Fast to compute (thus popular), as it just counts neighbors.
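
As a rough illustration, here is a minimal Python sketch of these counts using NetworkX and NumPy; the small graph and the vertex pair (i, j) are made up purely for the example.

```python
import networkx as nx
import numpy as np

# Toy graph (made up for illustration): vertices 0..4
G = nx.Graph([(0, 2), (0, 3), (1, 2), (1, 3), (2, 4), (3, 4)])
A = nx.to_numpy_array(G)   # adjacency matrix
n = G.number_of_nodes()
i, j = 0, 1                # the pair of vertices being compared

# 1. Count of common neighbors: n_ij = sum_k A_ik * A_kj
n_ij = A[i] @ A[:, j]
print("common neighbors:", n_ij)   # same as len(list(nx.common_neighbors(G, i, j)))

# 2. Normalizations
print("divided by n:", n_ij / n)
print("divided by max possible:", n_ij / min(G.degree[i], G.degree[j]))

# Jaccard similarity: |N(i) & N(j)| / |N(i) | N(j)|
Ni, Nj = set(G[i]), set(G[j])
print("Jaccard:", len(Ni & Nj) / len(Ni | Nj))
# NetworkX also provides this directly: nx.jaccard_coefficient(G, [(i, j)])
```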

Structural similarity (2). 3. Cosine similarity (divide by a variable): let x be the row corresponding to vertex i in A, and let y be the column corresponding to vertex j in A. Then σ_ij = cos θ = (x · y) / (|x| |y|) = Σ_k A_ik A_kj / (√(Σ_k A_ik²) · √(Σ_k A_kj²)), which simplifies to σ_ij = n_ij / √(deg i · deg j). Convention: if deg i = 0 (the denominator is 0), then cos θ = 0. Range: 0 ≤ σ_ij ≤ 1.
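
A sketch of the cosine measure computed straight from the adjacency matrix (same made-up toy graph as above; for a simple undirected graph the formula reduces to n_ij / √(deg i · deg j)):

```python
import numpy as np
import networkx as nx

G = nx.Graph([(0, 2), (0, 3), (1, 2), (1, 3), (2, 4), (3, 4)])
A = nx.to_numpy_array(G)

def cosine_similarity(A, i, j):
    """Cosine of the angle between row i and row j of the adjacency matrix."""
    num = A[i] @ A[j]
    denom = np.sqrt(A[i] @ A[i]) * np.sqrt(A[j] @ A[j])
    return 0.0 if denom == 0 else num / denom   # convention: 0 for isolated vertices

deg = A.sum(axis=1)
i, j = 0, 1
print(cosine_similarity(A, i, j))                   # via the definition
print((A[i] @ A[j]) / np.sqrt(deg[i] * deg[j]))     # n_ij / sqrt(deg i * deg j), same value
```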

Cosine similarity. Cosine similarity is an example of a technique used in information retrieval, text analysis, or any comparison of x to y, where each of x and y can be vectorized based on its components. For example: to find the closest document(s)/website(s) to another document or to a query.

Technique used in text analysis. A collection of n documents (D_1, D_2, ..., D_n) can be represented in the vector space model by a term-document matrix, given a list of t terms of interest (T_1, T_2, ..., T_t). An entry in the matrix corresponds to the weight of a term in a document; zero means the term has no significance in the document or simply doesn't occur in the document.

        T_1    T_2    ...    T_t
D_1     w_11   w_21   ...    w_t1
D_2     w_12   w_22   ...    w_t2
 :       :      :             :
D_n     w_1n   w_2n   ...    w_tn

Source: Raymond J. Mooney, University of Texas at Austin

Graphic Representation. Example: D_1 = 2T_1 + 3T_2 + 5T_3, D_2 = 3T_1 + 7T_2 + T_3, Q = 0T_1 + 0T_2 + 2T_3. [Figure: the three vectors plotted in the three-dimensional term space with axes T_1, T_2, T_3.] Is D_1 or D_2 more similar to Q? How to measure the degree of similarity? Distance? Angle? Projection? Source: Raymond J. Mooney, University of Texas at Austin

Cosine Similarity Measure. Cosine similarity measures the cosine of the angle between two vectors: the inner product normalized by the vector lengths.

CosSim(d_j, q) = (d_j · q) / (|d_j| |q|) = Σ_{i=1..t} (w_ij · w_iq) / √(Σ_{i=1..t} w_ij² · Σ_{i=1..t} w_iq²)

D_1 = 2T_1 + 3T_2 + 5T_3:  CosSim(D_1, Q) = 10 / √((4+9+25)(0+0+4)) = 0.81
D_2 = 3T_1 + 7T_2 + 1T_3:  CosSim(D_2, Q) = 2 / √((9+49+1)(0+0+4)) = 0.13
Q = 0T_1 + 0T_2 + 2T_3

D_1 is about 6 times more similar to Q than D_2 is, using cosine similarity. Source: Raymond J. Mooney, University of Texas at Austin
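
The worked numbers can be checked with a few lines of NumPy; the vectors are exactly those on the slide.

```python
import numpy as np

def cos_sim(d, q):
    return (d @ q) / (np.linalg.norm(d) * np.linalg.norm(q))

D1 = np.array([2, 3, 5])   # 2T1 + 3T2 + 5T3
D2 = np.array([3, 7, 1])   # 3T1 + 7T2 + 1T3
Q  = np.array([0, 0, 2])   # 0T1 + 0T2 + 2T3

print(round(cos_sim(D1, Q), 2))   # 0.81
print(round(cos_sim(D2, Q), 2))   # 0.13
```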

Illustration of 3 Nearest Neighbors for Text. [Figure: a query vector and the three document vectors nearest to it.] It works the same with the vectors (rows) of the adjacency matrix, where the terms are just the neighbors, and the similarity is how many common neighbors there are. Source: Raymond J. Mooney, University of Texas at Austin

So far: Structural similarity. 1. Count of common neighbors: n_ij = Σ_k A_ik A_kj. 2. Normalized common neighbors (divide by the): number of vertices; maximum possible count of common neighbors; count of all neighbors in the union (Jaccard similarity). 3. Cosine similarity: σ_ij = n_ij / √(deg i · deg j).

Pearson Correlation Coeff (1). 4. Pearson Correlation Coefficient (compare to the expected number of common neighbors in a network in which vertices choose their neighbors at random): Let i and j be two vertices of degree deg i and deg j. The probability that j chooses one of i's neighbors at random is deg i / n (or deg i / (n−1) if there are no loops in G). Thus the expected number of j's neighbors that are also neighbors of i is (deg i · deg j) / n.

Pearson Correlation Coeff (2). 4. Pearson Correlation Coefficient (continued): take the actual number of common neighbors minus the expected number of common neighbors (it will be normalized next):

Σ_k A_ik A_jk − (deg i · deg j)/n = Σ_k A_ik A_jk − n<A_i><A_j> = Σ_k [A_ik A_jk − <A_i><A_j>] = Σ_k (A_ik − <A_i>)(A_jk − <A_j>) = cov(A_i, A_j)

(FOIL and simplify, using the fact that the mean of row i is <A_i> = (1/n) Σ_k A_ik = deg i / n, and that <A_i>, <A_j> are constants with respect to k.)

Pearson Correlation Coeff (3). 4. Pearson Correlation Coefficient (continued): the actual number of common neighbors minus the expected number of common neighbors (still to be normalized) is

cov(A_i, A_j) = Σ_k (A_ik − <A_i>)(A_jk − <A_j>),

which is: = 0 if the # of common neighbors is what is expected by chance; < 0 if the # of common neighbors is less than what is expected by chance; > 0 if the # of common neighbors is more than what is expected by chance.

Pearson Correlation Coeff (4). 4. Pearson Correlation Coefficient (continued): the actual number of common neighbors minus the expected number of common neighbors, normalized (by the standard deviations σ_i and σ_j of rows i and j):

r_ij = cov(A_i, A_j) / (σ_i σ_j) = Σ_k (A_ik − <A_i>)(A_jk − <A_j>) / ( √(Σ_k (A_ik − <A_i>)²) · √(Σ_k (A_jk − <A_j>)²) )

r_ij = 0 if the # of common neighbors is what is expected by chance; −1 ≤ r_ij < 0 if the # of common neighbors is less than what is expected by chance; 0 < r_ij ≤ 1 if the # of common neighbors is more than what is expected by chance. r_ij shows more or less vertex similarity in the network compared to what is expected when vertices choose their neighbors at random.
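
NumPy's corrcoef applies exactly this formula to every pair of rows of the adjacency matrix, so the whole similarity matrix comes out in one call; the toy graph below is again made up for illustration.

```python
import numpy as np
import networkx as nx

G = nx.Graph([(0, 2), (0, 3), (1, 2), (1, 3), (2, 4), (3, 4)])
A = nx.to_numpy_array(G)

# Pearson correlation between every pair of rows of A
# (a row belonging to an isolated vertex is constant and would give NaN)
R = np.corrcoef(A)
print(R[0, 1])   # > 0: vertices 0 and 1 share more neighbors than expected by chance

# The same value straight from the definition
i, j = 0, 1
xi, xj = A[i] - A[i].mean(), A[j] - A[j].mean()
print((xi @ xj) / np.sqrt((xi @ xi) * (xj @ xj)))
```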

More measures. There are more measures of structural equivalence; the Pearson correlation coefficient is commonly used. Another one is the Euclidean (Hamming) distance d_ij = Σ_k (A_ik − A_jk)², which equals the cardinality of the symmetric difference between the neighbors of i and the neighbors of j. All of these are measures of structural equivalence in a network: they measure different ways of counting common neighbors.
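
A short sketch showing that the two views of this distance agree on an unweighted simple graph (same made-up toy graph as before):

```python
import numpy as np
import networkx as nx

G = nx.Graph([(0, 2), (0, 3), (1, 2), (1, 3), (2, 4), (3, 4)])
A = nx.to_numpy_array(G)

i, j = 0, 2
d_euclid = np.sum((A[i] - A[j]) ** 2)     # sum_k (A_ik - A_jk)^2
sym_diff = set(G[i]) ^ set(G[j])          # vertices adjacent to exactly one of i, j
print(d_euclid, len(sym_diff))            # both equal 5 for this pair
```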

Overview updated: Structural similarity. 1. Count of common neighbors: n_ij = Σ_k A_ik A_kj. 2. Normalized common neighbors (divide by the): number of vertices; maximum possible count of common neighbors; count of all neighbors in the union (Jaccard similarity). 3. Cosine similarity: σ_ij = n_ij / √(deg i · deg j). 4. Pearson Correlation Coefficient (exists in NetworkX): r_ij = Σ_k (A_ik − <A_i>)(A_jk − <A_j>) / (σ_i σ_j).

Regular Equivalence

Structural vs. Regular Equivalent Nodes. Defn: Two vertices are structurally equivalent if they share many neighbors, such as two math students i and j who know the same professors. Defn: Two vertices are regularly equivalent if they have many neighbors that are also equivalent, such as two deans i and j who have similar ties: provost, president, department chairs (some could actually be in common).

Defn of Regular Equivalence. Defn: Two nodes are regularly equivalent if they are equally related to equivalent nodes (i.e., their neighbors need to be equivalent as well). We capture equivalence through the color pattern of adjacencies: to have similar roles, these nodes need different colors because of their adjacencies (referring to the slide's example figure). (White and Reitz, 1983; Everett and Borgatti, 1991)

Structural vs. Regular equivalence color classes. Structural: nodes with connectivity to the same nodes (common information about the network) vs. Regular: nodes with similar roles in the network (similar positions in the network). Source: Leonid E. Zhukov

Regular Equivalence (1). Regular equivalence: nodes don't have to be adjacent to be regularly equivalent. REGE, an algorithm developed in 1970 to discover regular equivalences, exists, but its operations are very involved and it is not easy to interpret (Newman). Newer methods. METHOD 1: develop a similarity σ_ij that is high if the neighbors k and l of i and j are (regularly) equivalent: σ_ij = α Σ_{k,l} A_ik A_jl σ_kl. σ_ij counts how many common neighbors there are, times their neighbors' regular similarity σ_kl. This is a recurrence relation in terms of σ, much like eigenvector centrality.

Regular Equivalence (2). In matrix form this is equivalent to σ = α A σ A, where α is the leading eigenvalue of A, and σ is the eigenvector. But: it may not give high self-similarity σ_ii, as it heavily depends on the similarity of the neighbors of i. It doesn't give high similarity to pairs of vertices with lots of common neighbors (because it looks at the values of σ_kl and σ_ll, including σ_ii). It needs to be altered, and we'll see two methods next.

Regular Equivalence (3). Fix: add a diagonal term, σ_ij = α Σ_{k,l} A_ik A_jl σ_kl + δ_ij, which is equivalent to σ = α A σ A + I, where δ_ij is the Kronecker delta (δ_ij = 1 if i = j, and 0 otherwise). Calculate σ by iterating, starting with the initial value σ^(0) = 0: σ^(1) = I, σ^(2) = α A² + I, σ^(3) = α² A⁴ + α A² + I, ... Note: σ is a weighted sum of the even powers of A (walks of even length), but we want all the lengths.

Regular Equivalence (4). METHOD 2: make i similar to j if i has a neighbor k that is itself similar to j: σ_ij = α Σ_k A_ik σ_kj + δ_ij, which is equivalent to σ = α A σ + I. Iterating from σ^(0) = 0 gives σ^(1) = I, σ^(2) = α A + I, σ^(3) = α² A² + α A + I, ..., so σ = I + α A + α² A² + α³ A³ + ⋯ = (I − α A)^(−1). σ_ij is a weighted count of all walks between i and j (with α < 1 so longer walks weigh less). Leicht, Holme, and Newman, 2006
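
A sketch of METHOD 2 in NumPy: build σ = (I − αA)^(−1) for a small graph. The graph (NetworkX's built-in karate club example) and the choice of α are illustrative; α is set below 1/λ_max so the series of walk counts converges.

```python
import numpy as np
import networkx as nx

G = nx.karate_club_graph()            # small example graph shipped with NetworkX
A = nx.to_numpy_array(G)
n = A.shape[0]

lam_max = max(np.linalg.eigvalsh(A))  # leading eigenvalue of A
alpha = 0.5 / lam_max                 # must have alpha < 1/lam_max for convergence

# sigma = (I - alpha*A)^(-1) = I + alpha*A + alpha^2*A^2 + ...
sigma = np.linalg.inv(np.eye(n) - alpha * A)

# sigma[i, j] is a weighted count of all walks between i and j
print(sigma[0, 1], sigma[0, 33])
```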

Regular Equivalence (5). METHOD 2: σ_ij = α Σ_k A_ik σ_kj + δ_ij, i.e., σ = α A σ + I. This looks a lot like Katz centrality (recall x_i = α Σ_j A_ij x_j + β, where β is a constant initial weight given to each vertex so that its out-degree matters). The Katz centrality of vertex i is then the sum of the similarities σ_ij over all j.
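
The Katz connection can be checked numerically: with β = 1, the Katz vector solves x = αAx + 1, i.e., x = (I − αA)^(−1)·1, which is just the row sums of the σ matrix above (a sketch; NetworkX also offers katz_centrality, which returns a normalized version of the same quantity).

```python
import numpy as np
import networkx as nx

G = nx.karate_club_graph()
A = nx.to_numpy_array(G)
n = A.shape[0]

alpha = 0.5 / max(np.linalg.eigvalsh(A))
sigma = np.linalg.inv(np.eye(n) - alpha * A)

# Katz centrality with beta = 1: x = (I - alpha*A)^(-1) @ 1
x_katz = np.linalg.solve(np.eye(n) - alpha * A, np.ones(n))
print(np.allclose(sigma.sum(axis=1), x_katz))   # True: row sums of sigma give Katz
```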

Regular Equivalence (6). The regular equivalence σ_ij tends to be high for vertices of high degree (more chances of their neighbors being similar). However, some low-degree vertices could be similar as well, so the formula can be modified if desired: σ_ij = (α / deg i) Σ_k A_ik σ_kj + δ_ij, which is equivalent to σ = α D^(−1) A σ + I (where D is the diagonal matrix of degrees).
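
A sketch of this degree-normalized variant under the reconstruction above (σ = (I − α D^(−1) A)^(−1)); treat the matrix form and the value of α as assumptions made for illustration.

```python
import numpy as np
import networkx as nx

G = nx.karate_club_graph()
A = nx.to_numpy_array(G)
n = A.shape[0]

deg = A.sum(axis=1)
Dinv_A = A / deg[:, None]        # D^{-1} A: each row of A divided by deg(i)

alpha = 0.9                      # D^{-1} A has leading eigenvalue 1, so any alpha < 1 converges
sigma_norm = np.linalg.inv(np.eye(n) - alpha * Dinv_A)
print(sigma_norm[0, 1])
```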

The connection between structural and regular equivalence

Equivalence (1). Regular equivalence is a weighted count of all walks between i and j, i.e., σ = I + α A + α² A² + α³ A³ + ⋯. Recall that the structural equivalence measures attempt to count the # of common neighbors of i and j in different ways. What is the correlation between the number of common neighbors of 4 and 5, and the number of paths of length 2 between 4 and 5 (in the slide's example graph)?

Equivalence (2). Regular equivalence is a weighted count of all walks between i and j, i.e., σ = I + α A + α² A² + α³ A³ + ⋯. Recall that the structural equivalence measures attempt to count the # of common neighbors in different ways. So regular equivalence is a generalization of structural equivalence (n_ij = Σ_k A_ik A_kj counts just the paths of length 2).