Online Social Networks and Media

Size: px
Start display at page:

Download "Online Social Networks and Media"

Transcription

1 Online Social Networks and Media Absorbing Random Walks Link Prediction

2 Why does the Power Method work? If a matrix R is real and symmetric, it has real eigenvalues and eigenvectors: λ, w, λ 2, w 2,, (λ r, w r ) r is the rank of the matrix λ λ 2 λ r The vector space of R is the set of vectors that can be written as a linear combination of its rows (or columns) The eigenvectors w, w 2,, w r of R define a basis of the vector space For any vector x, Rx = α w + a 2 w a r w r After t multiplications we have: R t x = λ t α w + λ 2 t a 2 w λ 2 t a r w r Normalizing leaves only the term w.

3 ABSORBING RANDOM WALKS LABEL PROPAGATION OPINION FORMATION ON SOCIAL NETWORKS

4 Random walk with absorbing nodes What happens if we do a random walk on this graph? What is the stationary distribution? All the probability mass on the red sink node: The red node is an absorbing node

5 Random walk with absorbing nodes What happens if we do a random walk on this graph? What is the stationary distribution? There are two absorbing nodes: the red and the blue. The probability mass will be divided between the two

6 Absorption probability If there are more than one absorbing nodes in the graph a random walk that starts from a non-absorbing node will be absorbed in one of them with some probability The probability of absorption gives an estimate of how close the node is to red or blue

7 Absorption probability Computing the probability of being absorbed: The absorbing nodes have probability of being absorbed in themselves and zero of being absorbed in another node. For the non-absorbing nodes, take the (weighted) average of the absorption probabilities of your neighbors if one of the neighbors is the absorbing node, it has probability Repeat until convergence (= very small change in probs) P Red Pink = 2 3 P Red Yellow + 3 P(Red Green) 2 P Red Green = 4 P Red Yellow + 4 P Red Yellow =

8 Absorption probability Computing the probability of being absorbed: The absorbing nodes have probability of being absorbed in themselves and zero of being absorbed in another node. For the non-absorbing nodes, take the (weighted) average of the absorption probabilities of your neighbors if one of the neighbors is the absorbing node, it has probability Repeat until convergence (= very small change in probs) P Blue Pink = 2 3 P Blue Yellow + 3 P(Blue Green) 2 P Blue Green = 4 P Blue Yellow + 2 P Blue Yellow = 3 2 2

9 Why do we care? Why do we care to compute the absorbtion probability to sink nodes? Given a graph (directed or undirected) we can choose to make some nodes absorbing. Simply direct all edges incident on the chosen nodes towards them. The absorbing random walk provides a measure of proximity of non-absorbing nodes to the chosen nodes. Useful for understanding proximity in graphs Useful for propagation in the graph E.g, on a social network some nodes have high income, some have low income, to which income class is a non-absorbing node closer?

10 Example In this undirected graph we want to learn the proximity of nodes to the red and blue nodes 2 2 2

11 Example Make the nodes absorbing 2 2 2

12 Absorption probability Compute the absorbtion probabilities for red and blue P Red Pink = 2 3 P Red Yellow + 3 P(Red Green) P Red Green = 5 P Red Yellow + 5 P Red Pink + 5 P Red Yellow = 6 P Red Green + 3 P Red Pink P Blue Pink = P Red Pink P Blue Green = P Red Green 2 2 P Blue Yellow = P Red Yellow

13 Penalizing long paths The orange node has the same probability of reaching red and blue as the yellow one P Red Orange = P Red Yellow P Blue Orange = P Blue Yellow Intuitively though it is further away

14 Penalizing long paths Add an universal absorbing node to which each node gets absorbed with probability α. With probability α the random walk dies α With probability (-α) the random walk continues as before The longer the path from a node to an absorbing node the more likely the random walk dies along the way, the lower the absorbtion probability α -α α -α -α -α α P Red Green = ( α) 5 P Red Yellow + 5 P Red Pink + 5

15 Propagating values Assume that Red has a positive value and Blue a negative value Positive/Negative class, Positive/Negative opinion We can compute a value for all the other nodes in the same way This is the expected value for the node V(Pink) = 2 3 V(Yellow) + 3 V(Green) V Green = 5 V(Yellow) + 5 V(Pink) V Yellow = 6 V Green + 3 V(Pink)

16 Electrical networks and random walks Our graph corresponds to an electrical network There is a positive voltage of + at the Red node, and a negative voltage - at the Blue node There are resistances on the edges inversely proportional to the weights (or conductance proportional to the weights) The computed values are the voltages at the nodes V(Pink) = 2 3 V(Yellow) + 3 V(Green) V Green = 5 V(Yellow) + 5 V(Pink) V Yellow = 6 V Green + 3 V(Pink)

17 Opinion formation The value propagation can be used as a model of opinion formation. Model: Opinions are values in [-,] Every user u has an internal opinion s u, and expressed opinion z u. The expressed opinion minimizes the personal cost of user u: c z u = s u z u 2 + v:v is a friend of u w u z u z v 2 Minimize deviation from your beliefs and conflicts with the society If every user tries independently (selfishly) to minimize their personal cost then the best thing to do is to set z u to the average of all opinions: z u = s u + v:v is a friend of u w u z u + v:v is a friend of u w u This is the same as the value propagation we described before!

18 Example Social network with internal opinions 2 s = +0.5 s = s = -0.3 s = +0.2 s = -0.

19 Example One absorbing node per user with value the internal opinion of the user s = +0.5 One non-absorbing node per user that links to the corresponding absorbing node s = +0.8 The external opinion for each node is computed using the value propagation we described before Repeated averaging z = z = z = z = 0.04 z = -0.0 s = -0.3 Intuitive model: my opinion is a combination of what I believe and what my social network believes. s = -0.5 s = -0.

20 Transductive learning If we have a graph of relationships and some labels on some nodes we can propagate them to the remaining nodes Make the labeled nodes to be absorbing and compute the probability for the rest of the graph E.g., a social network where some people are tagged as spammers E.g., the movie-actor graph where some movies are tagged as action or comedy. This is a form of semi-supervised learning We make use of the unlabeled data, and the relationships It is also called transductive learning because it does not produce a model, but just labels the unlabeled data that is at hand. Contrast to inductive learning that learns a model and can label any new example

21 Implementation details Implementation is in many ways similar to the PageRank implementation For an edge (u, v)instead of updating the value of v we update the value of u. The value of a node is the average of its neighbors We need to check for the case that a node u is absorbing, in which case the value of the node is not updated. Repeat the updates until the change in values is very small.

22 LINK PREDICTION

23 The Problem Link prediction problem: Given the links in a social network at time t, predict which edges that will be added to the network Which features to use? User characteristics (profile), network interactions, topology Different from the problem of inferring missing (hidden) links (there is a temporal aspect, uses a static snapshot) To save experimental effort in the laboratory or in the field

24 Applications Recommending new friends on online social networks. Predicting the participants or actors in events Suggesting interactions between the members of a company/organization Predicting connections between members of terrorist organizations who have not been directly observed to work together Suggesting collaborations between researchers based on co-authorship. Network evolution model

25 Link Prediction Unsupervised (usually, assign scores based on similarity of endpoints) Supervised (given some positive (created edges) and negative examples (nonexistent edges) Classification Problem Problem: Class imbalance Instead of 0/, rank each edge by its probability to appear in the network

26 D. Liben-Nowell, D. and J. Kleinberg, The link-prediction problem for social networks. Journal of the American Society for Information Science and Technology, 58(7) (2007)

27 The Problem Link prediction problem: Given the links in a social network at time t, predict the edges that will be added to the network during the time interval from time t to a given future time t Which features to use? Based solely on the topology of the network (social proximity) (the more general problem also considers attributes of the nodes and links)

28 Problem Formulation I Consider a social network G = (V, E) where each edge e = <y, v> E represents an interaction between u and v that took place at a particular time t(e) (multiple interactions between two nodes as parallel edges with different timestamps) G[t, t ]: subgraph of G consisting of all edges with a timestamp between t and t, t < t, For four times, t 0 < t 0 < t < t, Given G[t 0, t 0 ], we wish to output a list of edges not in G[t 0, t 0 ] that are predicted to appear in G[t, t ] [t 0, t 0 ] training interval [t, t ] test interval

29 Problem Formulation II What about new nodes (node not in the training interval)? Two parameters: κ training and κ test Core: all nodes that are incident to at least κ training edges in G[t 0, t 0 ], and at least κ test edges in G[t, t ] Predict new edges between the nodes in Core

30 Example Dataset: co-authorship t 0 = 994, t 0 = 996: training interval -> [994, 996] t = 997, t = 999: test interval -> [997, 999] - G collab = <A, E old > = G[994, 996] - E new : authors in A that co-author a paper during the test interval but not during the training interval κ training = 3, κ test = 3: Core consists of all authors who have written at least 3 papers during the training period and at least 3 papers during the test period Predict E new

31 How to Evaluate the Prediction Each link predictor p outputs a ranked list L p of pairs in A A E old : predicted new collaborations in decreasing order of confidence Actual edges: E new = E new (Core Core), n = E new Evaluation method: Size of the intersection of the first n edge predictions from L p that are in Core Core (predicted) and the set E new (actual) How many of the top-n predictions are correct (precision?)

32 Methods for Link Prediction Assign a connection weight score(x, y) to each pair of nodes <x, y> based on the input graph (G collab ) and produce a ranked list of decreasing order of score How to assign the score between two nodes x and y? Some form of similarity or node proximity Most measures focus on the giant component

33 Methods for Link Prediction: Shortest Path For x, y A A E old, score(x, y) = (negated) length of shortest path between x and y If there are more than n pairs of nodes tied for the shortest path length, order them at random. Geodesic distance: number of edges in the shortest path

34 Methods for Link Prediction: Neighborhood-based The larger the overlap of the neighbors of two nodes, the more likely to be linked in the future Let Γ(x) denote the set of neighbors of x in G collab Common neighbors: A adjacency matrix -> A x,y 2 Number of different paths of length 2 Jaccard coefficient: The probability that both x and y have a feature f, for a randomly selected feature that either x or y has

35 Methods for Link Prediction: Neighborhood-based Adamic/Adar: Assigns large weights to common neighbors z of x and y which themselves have few neighbors (weight rare features more heavily) connections to unpopular nodes are more relevant

36 Methods for Link Prediction: Neighborhood-based Preferential attachment: the probability that a new edge has node x as its endpoint is proportional to Γ(x), i.e., nodes like to form ties with popular nodes Researchers found empirical evidence to suggest that co-authorship is correlated with the product of the neighborhood sizes

37 Methods for Link Prediction: based on the ensemble of all paths Not just the shortest, but all paths between two nodes

38 Methods for Link Prediction: based on the ensemble of all paths Katz β measure: Sum over all paths of length l, β > 0 is a parameter of the predictor, exponentially damped to count short paths more heavily Small β predictions much like common neighbors. Unweighted version, in which path () x,y =, if x and y have collaborated, 0 otherwise 2. Weighted version, in which path () x,y = #times x and y have collaborated

39 Methods for Link Prediction: based on the ensemble of all paths Consider a random walk on G collab that starts at x and iteratively moves to a neighbor of x chosen uniformly at random from Γ(x). The Hitting Time H x,y from x to y is the expected number of steps it takes for the random walk starting at x to reach y. score(x, y) = H x,y (symmetric version) The Commute Time C x,y from x to y is the expected number of steps to travel from x to y and from y to x score(x, y) = (H x,y + H y,x ) Can also consider stationary-normed versions: score(x, y) = H x,y π y score(x, y) = (H x,y π y + H y,x π x )

40 Methods for Link Prediction: based on the ensemble of all paths The hitting time and commute time measures are sensitive to parts of the graph far away from x and y -> periodically reset the walk Random walk on G collab that starts at x and has a probability of α of returning to x at each step. Rooted (Personalized) Page Rank: Starts from x, with probability ( a) moves to a random neighbor and with probability a returns to x score(x, y) = stationary probability of y in a rooted PageRank

41 Methods for Link Prediction: based on the ensemble of all paths SimRank score(x, y) = similarity(x, y) The expected value of γ l where l is a random variable giving the time at which random walks started from x and y first meet

42 Methods for Link Prediction: High-level approaches Low rank approximations A adjacency matrix Apply SVD (singular value decomposition) The rank-k matrix that best approximates A

43 Methods for Link Prediction: High-level approaches Unseen Bigrams Unseen bigrams: pairs of word that co-occur in a test corpus, but not in the corresponding training corpus Not just score(x, y) but score(z, y) for nodes z that are similar to x S x (δ) the δ nodes most related to x

44 Methods for Link Prediction: High-level approaches Clustering Compute score(x, y) for al edges in E old Delete the (-p) fraction of these edges for which the score is the lowest, for some parameter p Recompute score(x, y) for all pairs in the subgraph

45 Evaluation: baseline Baseline: random predictor Randomly select pairs of authors who did not collaborate in the training interval Probability that a random prediction is correct, Number of possible predictions Correct predictions In the datasets, from 0.5% (cond-mat) to 0.48% (astro-ph)

46 Evaluation: Factor improvement over random

47 Evaluation: Factor improvement over random

48 Evaluation: Average relevance performance (random) average ratio over the five datasets of the given predictor's performance versus a baseline predictor's performance. The error bars indicate the minimum and maximum of this ratio over the five datasets. The parameters for the starred predictors are as follows: () for weighted Katz, β= 0.005; (2) for Katz clustering, β = 0.00; ρ = 0.5; β2 = 0.; (3) for low-rank inner product, rank = 256; (4) for rooted Pagerank, α = 0.5; (5) for unseen bigrams, unweighted common neighbors with δ = 8; and (6) for SimRank, γ = 0.8.

49 Evaluation: Average relevance performance (distance)

50 Evaluation: Average relevance performance (neighbors)

51 Evaluation: prediction overlap How much similar are the predictions made by the different methods (common predictions)? Why? correct

52 Evaluation: datasets How much does the performance of the different methods depends on the dataset? (rank) On 4 of the 5 datasets best at an intermediate rank On qr-qc, best at rank, does it have a simpler structure? On hep-ph, preferential attachment the best Why is astro-ph difficult? The culture of physicists and physics collaboration

53 Evaluation: small world The shortest path even in unrelated disciplines is often very short

54 Evaluation: restricting to distance three Many pairs of authors separated by a graph distance of 2 who will not collaborate and many pairs who collaborate at distance greater than 2 Disregard all distance 2 pairs

55 Evaluation: the breadth of data Three additional datasets. Proceedings of STOC and FOCS 2. Papers for Citeseer 3. All five of the arxiv sections Common neighbors vs Random

56 Future Directions Improve performance. Even the best (Katz clustering on gr-qc) correct on only about 6% of its prediction Improve efficiency on very large networks (approximation of distances) Treat more recent collaborations as more important Additional information (paper titles, author institutions, etc) To some extent latently present in the graph

57 Future Directions Consider bipartite graph (e.g., some form of an affiliation network) author paper Coauthor graph Apply classification techniques from machine learning A simple binary classification problem: given two nodes x and y predict whether <x, y> is or 0

58

59 Aaron Clauset, Cristopher Moore & M. E. J. Newman. Hierarchical structure and the prediction of missing links in network, Nature, 453, 98-0 (2008)

60 Hierarchical Random Graphs Graph G with n nodes Dendrogram D a binary tree with n leaves Each internal node corresponds to the group of nodes that descend from it Each internal node r of the dendrogram is associated with a probability p r that a pair of vertices in the left and right subtrees of that node are connected Given two nodes i and j of G the probability p ij that they are connected by an edge is equal to p r where r is their lowest common ancestor

61 Hierarchical Random Graphs Example graph Possible dendrogram Assortativity (dense connections within groups of nodes and sparse between them) -> probabilities p r decrease as we move up the tree Given D and the probabilities p r, we can generate a graph, called a hierarchical random graph D: topological structure and parameters {p r }

62 Hierarchical Random Graphs Use to predict missing interactions in the network Given an observed but incomplete network, generate a set of hierarchical random graphs (i.e., a dendrogam and the associated probabilities) that fit the network (using statistical inference) Then look for pair of nodes that have a high probability of connection Is this better than link prediction? Experiments show that link prediction works well for strongly assortative networks (e.g, collaboration, citation) but not for networks that exhibit more general structure (e.g., food webs)

63 A rough idea of how to generate the model r a node in dendrogram D E r the number of edges in G whose endpoints have r as their lowest common ancestor in D, L r and R r the numbers of leaves in the left and right subtrees rooted at r Then the likelihood of the hierarchical random graph is If we fix the dendrogram D, it is easy to find the probabilities {p r } that maximize L(D, {p r }). For each r, they are given by the fraction of potential edges between the two subtrees of r that actually appear in the graph G. L(D) = (/9)(8/9) 8 = L(D2) = (/3)(2/3) 2 (/4) 2 (3/4) 6 =

64 A rough idea of how to generate the model Sample dendrograms D with probability proportional to their likelihood Choose an internal node uniformly at random and consider one of the two ways to reshuffle Always accept the transition if it increases the likelihood else accept with some probability

65 How to Evaluate the Prediction (other) An undirected network G(V, E) Predict Missing links (links not in E) To test, randomly divide E into a training set E T and a probe (test) set E P Apply standard techniques (k-fold cross validation) Each time we randomly pick a missing link and a nonexistent link to compare their scores If among n independent comparisons, there are n times the missing link having a higher score and n times they have the same score, the AUC value is the probability that a randomly chosen missing link is given a higher score than a randomly chosen nonexistent link If all the scores are generated from an independent and identical distribution, the AUC value should be about 0.5.

66 How to Evaluate the Prediction (other) Algorithm assigns scores of all non-observed links as s2 = 0.4, s3 = 0.5, s4 = 0.6, s34 = 0.5 and s45 = 0.6. To calculate AUC, compare the scores of a probe (missing) link and a nonexistent link. (n=) 6 pairs: s3 > s2, s3 < s4, s3 = s34, s45 > s2, s45 = s4, s45 > s34. AUC = ( )/

The link prediction problem for social networks

The link prediction problem for social networks The link prediction problem for social networks Alexandra Chouldechova STATS 319, February 1, 2011 Motivation Recommending new friends in in online social networks. Suggesting interactions between the

More information

Absorbing Random walks Coverage

Absorbing Random walks Coverage DATA MINING LECTURE 3 Absorbing Random walks Coverage Random Walks on Graphs Random walk: Start from a node chosen uniformly at random with probability. n Pick one of the outgoing edges uniformly at random

More information

Absorbing Random walks Coverage

Absorbing Random walks Coverage DATA MINING LECTURE 3 Absorbing Random walks Coverage Random Walks on Graphs Random walk: Start from a node chosen uniformly at random with probability. n Pick one of the outgoing edges uniformly at random

More information

CS224W: Social and Information Network Analysis Project Report: Edge Detection in Review Networks

CS224W: Social and Information Network Analysis Project Report: Edge Detection in Review Networks CS224W: Social and Information Network Analysis Project Report: Edge Detection in Review Networks Archana Sulebele, Usha Prabhu, William Yang (Group 29) Keywords: Link Prediction, Review Networks, Adamic/Adar,

More information

Link Prediction for Social Network

Link Prediction for Social Network Link Prediction for Social Network Ning Lin Computer Science and Engineering University of California, San Diego Email: nil016@eng.ucsd.edu Abstract Friendship recommendation has become an important issue

More information

Link Prediction and Anomoly Detection

Link Prediction and Anomoly Detection Graphs and Networks Lecture 23 Link Prediction and Anomoly Detection Daniel A. Spielman November 19, 2013 23.1 Disclaimer These notes are not necessarily an accurate representation of what happened in

More information

Topic mash II: assortativity, resilience, link prediction CS224W

Topic mash II: assortativity, resilience, link prediction CS224W Topic mash II: assortativity, resilience, link prediction CS224W Outline Node vs. edge percolation Resilience of randomly vs. preferentially grown networks Resilience in real-world networks network resilience

More information

Link Prediction Benchmarks

Link Prediction Benchmarks Link Prediction Benchmarks Haifeng Qian IBM T. J. Watson Research Center Yorktown Heights, NY qianhaifeng@us.ibm.com October 13, 2016 This document describes two temporal link prediction benchmarks that

More information

Social Network Analysis

Social Network Analysis Social Network Analysis Mathematics of Networks Manar Mohaisen Department of EEC Engineering Adjacency matrix Network types Edge list Adjacency list Graph representation 2 Adjacency matrix Adjacency matrix

More information

Centralities (4) By: Ralucca Gera, NPS. Excellence Through Knowledge

Centralities (4) By: Ralucca Gera, NPS. Excellence Through Knowledge Centralities (4) By: Ralucca Gera, NPS Excellence Through Knowledge Some slide from last week that we didn t talk about in class: 2 PageRank algorithm Eigenvector centrality: i s Rank score is the sum

More information

Basics of Network Analysis

Basics of Network Analysis Basics of Network Analysis Hiroki Sayama sayama@binghamton.edu Graph = Network G(V, E): graph (network) V: vertices (nodes), E: edges (links) 1 Nodes = 1, 2, 3, 4, 5 2 3 Links = 12, 13, 15, 23,

More information

CS224W: Analysis of Networks Jure Leskovec, Stanford University

CS224W: Analysis of Networks Jure Leskovec, Stanford University CS224W: Analysis of Networks Jure Leskovec, Stanford University http://cs224w.stanford.edu 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, http://cs224w.stanford.edu 2 Observations Models

More information

Measuring scholarly impact: Methods and practice Link prediction with the linkpred tool

Measuring scholarly impact: Methods and practice Link prediction with the linkpred tool Measuring scholarly impact: Methods and practice Link prediction with the linkpred tool Raf Guns University of Antwerp raf.guns@uantwerpen.be If you want to follow along Download and install Anaconda Python

More information

Social-Network Graphs

Social-Network Graphs Social-Network Graphs Mining Social Networks Facebook, Google+, Twitter Email Networks, Collaboration Networks Identify communities Similar to clustering Communities usually overlap Identify similarities

More information

Extracting Information from Complex Networks

Extracting Information from Complex Networks Extracting Information from Complex Networks 1 Complex Networks Networks that arise from modeling complex systems: relationships Social networks Biological networks Distinguish from random networks uniform

More information

Mining Web Data. Lijun Zhang

Mining Web Data. Lijun Zhang Mining Web Data Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj Outline Introduction Web Crawling and Resource Discovery Search Engine Indexing and Query Processing Ranking Algorithms Recommender Systems

More information

An Exploratory Journey Into Network Analysis A Gentle Introduction to Network Science and Graph Visualization

An Exploratory Journey Into Network Analysis A Gentle Introduction to Network Science and Graph Visualization An Exploratory Journey Into Network Analysis A Gentle Introduction to Network Science and Graph Visualization Pedro Ribeiro (DCC/FCUP & CRACS/INESC-TEC) Part 1 Motivation and emergence of Network Science

More information

Degree Distribution: The case of Citation Networks

Degree Distribution: The case of Citation Networks Network Analysis Degree Distribution: The case of Citation Networks Papers (in almost all fields) refer to works done earlier on same/related topics Citations A network can be defined as Each node is a

More information

Part I: Data Mining Foundations

Part I: Data Mining Foundations Table of Contents 1. Introduction 1 1.1. What is the World Wide Web? 1 1.2. A Brief History of the Web and the Internet 2 1.3. Web Data Mining 4 1.3.1. What is Data Mining? 6 1.3.2. What is Web Mining?

More information

Using network evolution theory and singular value decomposition method to improve accuracy of link prediction in social networks

Using network evolution theory and singular value decomposition method to improve accuracy of link prediction in social networks Proceedings of the Tenth Australasian Data Mining Conference (AusDM 2012), Sydney, Australia Using network evolution theory and singular value decomposition method to improve accuracy of link prediction

More information

Introduction p. 1 What is the World Wide Web? p. 1 A Brief History of the Web and the Internet p. 2 Web Data Mining p. 4 What is Data Mining? p.

Introduction p. 1 What is the World Wide Web? p. 1 A Brief History of the Web and the Internet p. 2 Web Data Mining p. 4 What is Data Mining? p. Introduction p. 1 What is the World Wide Web? p. 1 A Brief History of the Web and the Internet p. 2 Web Data Mining p. 4 What is Data Mining? p. 6 What is Web Mining? p. 6 Summary of Chapters p. 8 How

More information

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

CS246: Mining Massive Datasets Jure Leskovec, Stanford University CS246: Mining Massive Datasets Jure Leskovec, Stanford University http://cs246.stanford.edu SPAM FARMING 2/11/2013 Jure Leskovec, Stanford C246: Mining Massive Datasets 2 2/11/2013 Jure Leskovec, Stanford

More information

Graph Theory Review. January 30, Network Science Analytics Graph Theory Review 1

Graph Theory Review. January 30, Network Science Analytics Graph Theory Review 1 Graph Theory Review Gonzalo Mateos Dept. of ECE and Goergen Institute for Data Science University of Rochester gmateosb@ece.rochester.edu http://www.ece.rochester.edu/~gmateosb/ January 30, 2018 Network

More information

Graph Theory for Network Science

Graph Theory for Network Science Graph Theory for Network Science Dr. Natarajan Meghanathan Professor Department of Computer Science Jackson State University, Jackson, MS E-mail: natarajan.meghanathan@jsums.edu Networks or Graphs We typically

More information

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

CS246: Mining Massive Datasets Jure Leskovec, Stanford University CS246: Mining Massive Datasets Jure Leskovec, Stanford University http://cs246.stanford.edu HITS (Hypertext Induced Topic Selection) Is a measure of importance of pages or documents, similar to PageRank

More information

Collaborative filtering based on a random walk model on a graph

Collaborative filtering based on a random walk model on a graph Collaborative filtering based on a random walk model on a graph Marco Saerens, Francois Fouss, Alain Pirotte, Luh Yen, Pierre Dupont (UCL) Jean-Michel Renders (Xerox Research Europe) Some recent methods:

More information

CS 224W Final Report Group 37

CS 224W Final Report Group 37 1 Introduction CS 224W Final Report Group 37 Aaron B. Adcock Milinda Lakkam Justin Meyer Much of the current research is being done on social networks, where the cost of an edge is almost nothing; the

More information

arxiv: v3 [cs.ni] 3 May 2017

arxiv: v3 [cs.ni] 3 May 2017 Modeling Request Patterns in VoD Services with Recommendation Systems Samarth Gupta and Sharayu Moharir arxiv:1609.02391v3 [cs.ni] 3 May 2017 Department of Electrical Engineering, Indian Institute of Technology

More information

Graph Theory and Network Measurment

Graph Theory and Network Measurment Graph Theory and Network Measurment Social and Economic Networks MohammadAmin Fazli Social and Economic Networks 1 ToC Network Representation Basic Graph Theory Definitions (SE) Network Statistics and

More information

Introduction to Complex Networks Analysis

Introduction to Complex Networks Analysis Introduction to Complex Networks Analysis Miloš Savić Department of Mathematics and Informatics, Faculty of Sciences, University of Novi Sad, Serbia Complex systems and networks System - a set of interrelated

More information

Spectral Clustering and Community Detection in Labeled Graphs

Spectral Clustering and Community Detection in Labeled Graphs Spectral Clustering and Community Detection in Labeled Graphs Brandon Fain, Stavros Sintos, Nisarg Raval Machine Learning (CompSci 571D / STA 561D) December 7, 2015 {btfain, nisarg, ssintos} at cs.duke.edu

More information

Web consists of web pages and hyperlinks between pages. A page receiving many links from other pages may be a hint of the authority of the page

Web consists of web pages and hyperlinks between pages. A page receiving many links from other pages may be a hint of the authority of the page Link Analysis Links Web consists of web pages and hyperlinks between pages A page receiving many links from other pages may be a hint of the authority of the page Links are also popular in some other information

More information

Reddit Recommendation System Daniel Poon, Yu Wu, David (Qifan) Zhang CS229, Stanford University December 11 th, 2011

Reddit Recommendation System Daniel Poon, Yu Wu, David (Qifan) Zhang CS229, Stanford University December 11 th, 2011 Reddit Recommendation System Daniel Poon, Yu Wu, David (Qifan) Zhang CS229, Stanford University December 11 th, 2011 1. Introduction Reddit is one of the most popular online social news websites with millions

More information

Geometric Registration for Deformable Shapes 3.3 Advanced Global Matching

Geometric Registration for Deformable Shapes 3.3 Advanced Global Matching Geometric Registration for Deformable Shapes 3.3 Advanced Global Matching Correlated Correspondences [ASP*04] A Complete Registration System [HAW*08] In this session Advanced Global Matching Some practical

More information

PageRank and related algorithms

PageRank and related algorithms PageRank and related algorithms PageRank and HITS Jacob Kogan Department of Mathematics and Statistics University of Maryland, Baltimore County Baltimore, Maryland 21250 kogan@umbc.edu May 15, 2006 Basic

More information

Clustering: Classic Methods and Modern Views

Clustering: Classic Methods and Modern Views Clustering: Classic Methods and Modern Views Marina Meilă University of Washington mmp@stat.washington.edu June 22, 2015 Lorentz Center Workshop on Clusters, Games and Axioms Outline Paradigms for clustering

More information

Lecture 3: Descriptive Statistics

Lecture 3: Descriptive Statistics Lecture 3: Descriptive Statistics Scribe: Neil Spencer September 7th, 2016 1 Review The following terminology is used throughout the lecture Network: A real pattern of binary relations Graph: G = (V, E)

More information

Mining Web Data. Lijun Zhang

Mining Web Data. Lijun Zhang Mining Web Data Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj Outline Introduction Web Crawling and Resource Discovery Search Engine Indexing and Query Processing Ranking Algorithms Recommender Systems

More information

A Tutorial on the Probabilistic Framework for Centrality Analysis, Community Detection and Clustering

A Tutorial on the Probabilistic Framework for Centrality Analysis, Community Detection and Clustering A Tutorial on the Probabilistic Framework for Centrality Analysis, Community Detection and Clustering 1 Cheng-Shang Chang All rights reserved Abstract Analysis of networked data has received a lot of attention

More information

Supervised Random Walks

Supervised Random Walks Supervised Random Walks Pawan Goyal CSE, IITKGP September 8, 2014 Pawan Goyal (IIT Kharagpur) Supervised Random Walks September 8, 2014 1 / 17 Correlation Discovery by random walk Problem definition Estimate

More information

Using Machine Learning to Optimize Storage Systems

Using Machine Learning to Optimize Storage Systems Using Machine Learning to Optimize Storage Systems Dr. Kiran Gunnam 1 Outline 1. Overview 2. Building Flash Models using Logistic Regression. 3. Storage Object classification 4. Storage Allocation recommendation

More information

Proximity Prestige using Incremental Iteration in Page Rank Algorithm

Proximity Prestige using Incremental Iteration in Page Rank Algorithm Indian Journal of Science and Technology, Vol 9(48), DOI: 10.17485/ijst/2016/v9i48/107962, December 2016 ISSN (Print) : 0974-6846 ISSN (Online) : 0974-5645 Proximity Prestige using Incremental Iteration

More information

Graph Mining and Social Network Analysis

Graph Mining and Social Network Analysis Graph Mining and Social Network Analysis Data Mining and Text Mining (UIC 583 @ Politecnico di Milano) References q Jiawei Han and Micheline Kamber, "Data Mining: Concepts and Techniques", The Morgan Kaufmann

More information

Supervised Link Prediction with Path Scores

Supervised Link Prediction with Path Scores Supervised Link Prediction with Path Scores Wanzi Zhou Stanford University wanziz@stanford.edu Yangxin Zhong Stanford University yangxin@stanford.edu Yang Yuan Stanford University yyuan16@stanford.edu

More information

Department of Computer Science, Rensselaer Polytechnic Institute Troy, NY 12180

Department of Computer Science, Rensselaer Polytechnic Institute Troy, NY 12180 Chapter 9 A SURVEY OF LINK PREDICTION IN SOCIAL NETWORKS Mohammad Al Hasan Department of Computer and Information Science Indiana University- Purdue University Indianapolis, IN 46202 alhasan@cs.iupui.edu

More information

CENTRALITIES. Carlo PICCARDI. DEIB - Department of Electronics, Information and Bioengineering Politecnico di Milano, Italy

CENTRALITIES. Carlo PICCARDI. DEIB - Department of Electronics, Information and Bioengineering Politecnico di Milano, Italy CENTRALITIES Carlo PICCARDI DEIB - Department of Electronics, Information and Bioengineering Politecnico di Milano, Italy email carlo.piccardi@polimi.it http://home.deib.polimi.it/piccardi Carlo Piccardi

More information

An Empirical Analysis of Communities in Real-World Networks

An Empirical Analysis of Communities in Real-World Networks An Empirical Analysis of Communities in Real-World Networks Chuan Sheng Foo Computer Science Department Stanford University csfoo@cs.stanford.edu ABSTRACT Little work has been done on the characterization

More information

General Instructions. Questions

General Instructions. Questions CS246: Mining Massive Data Sets Winter 2018 Problem Set 2 Due 11:59pm February 8, 2018 Only one late period is allowed for this homework (11:59pm 2/13). General Instructions Submission instructions: These

More information

CSE 190 Lecture 16. Data Mining and Predictive Analytics. Small-world phenomena

CSE 190 Lecture 16. Data Mining and Predictive Analytics. Small-world phenomena CSE 190 Lecture 16 Data Mining and Predictive Analytics Small-world phenomena Another famous study Stanley Milgram wanted to test the (already popular) hypothesis that people in social networks are separated

More information

Information Networks: PageRank

Information Networks: PageRank Information Networks: PageRank Web Science (VU) (706.716) Elisabeth Lex ISDS, TU Graz June 18, 2018 Elisabeth Lex (ISDS, TU Graz) Links June 18, 2018 1 / 38 Repetition Information Networks Shape of the

More information

QUINT: On Query-Specific Optimal Networks

QUINT: On Query-Specific Optimal Networks QUINT: On Query-Specific Optimal Networks Presenter: Liangyue Li Joint work with Yuan Yao (NJU) -1- Jie Tang (Tsinghua) Wei Fan (Baidu) Hanghang Tong (ASU) Node Proximity: What? Node proximity: the closeness

More information

Bing Liu. Web Data Mining. Exploring Hyperlinks, Contents, and Usage Data. With 177 Figures. Springer

Bing Liu. Web Data Mining. Exploring Hyperlinks, Contents, and Usage Data. With 177 Figures. Springer Bing Liu Web Data Mining Exploring Hyperlinks, Contents, and Usage Data With 177 Figures Springer Table of Contents 1. Introduction 1 1.1. What is the World Wide Web? 1 1.2. A Brief History of the Web

More information

Clustering. Robert M. Haralick. Computer Science, Graduate Center City University of New York

Clustering. Robert M. Haralick. Computer Science, Graduate Center City University of New York Clustering Robert M. Haralick Computer Science, Graduate Center City University of New York Outline K-means 1 K-means 2 3 4 5 Clustering K-means The purpose of clustering is to determine the similarity

More information

CAIM: Cerca i Anàlisi d Informació Massiva

CAIM: Cerca i Anàlisi d Informació Massiva 1 / 72 CAIM: Cerca i Anàlisi d Informació Massiva FIB, Grau en Enginyeria Informàtica Slides by Marta Arias, José Balcázar, Ricard Gavaldá Department of Computer Science, UPC Fall 2016 http://www.cs.upc.edu/~caim

More information

Using PageRank in Feature Selection

Using PageRank in Feature Selection Using PageRank in Feature Selection Dino Ienco, Rosa Meo, and Marco Botta Dipartimento di Informatica, Università di Torino, Italy fienco,meo,bottag@di.unito.it Abstract. Feature selection is an important

More information

Clustering analysis of gene expression data

Clustering analysis of gene expression data Clustering analysis of gene expression data Chapter 11 in Jonathan Pevsner, Bioinformatics and Functional Genomics, 3 rd edition (Chapter 9 in 2 nd edition) Human T cell expression data The matrix contains

More information

1 Training/Validation/Testing

1 Training/Validation/Testing CPSC 340 Final (Fall 2015) Name: Student Number: Please enter your information above, turn off cellphones, space yourselves out throughout the room, and wait until the official start of the exam to begin.

More information

V2: Measures and Metrics (II)

V2: Measures and Metrics (II) - Betweenness Centrality V2: Measures and Metrics (II) - Groups of Vertices - Transitivity - Reciprocity - Signed Edges and Structural Balance - Similarity - Homophily and Assortative Mixing 1 Betweenness

More information

Personalized Information Retrieval

Personalized Information Retrieval Personalized Information Retrieval Shihn Yuarn Chen Traditional Information Retrieval Content based approaches Statistical and natural language techniques Results that contain a specific set of words or

More information

Scott Philips, Edward Kao, Michael Yee and Christian Anderson. Graph Exploitation Symposium August 9 th 2011

Scott Philips, Edward Kao, Michael Yee and Christian Anderson. Graph Exploitation Symposium August 9 th 2011 Activity-Based Community Detection Scott Philips, Edward Kao, Michael Yee and Christian Anderson Graph Exploitation Symposium August 9 th 2011 23-1 This work is sponsored by the Office of Naval Research

More information

CS249: SPECIAL TOPICS MINING INFORMATION/SOCIAL NETWORKS

CS249: SPECIAL TOPICS MINING INFORMATION/SOCIAL NETWORKS CS249: SPECIAL TOPICS MINING INFORMATION/SOCIAL NETWORKS Overview of Networks Instructor: Yizhou Sun yzsun@cs.ucla.edu January 10, 2017 Overview of Information Network Analysis Network Representation Network

More information

Graph Theory for Network Science

Graph Theory for Network Science Graph Theory for Network Science Dr. Natarajan Meghanathan Professor Department of Computer Science Jackson State University, Jackson, MS E-mail: natarajan.meghanathan@jsums.edu Networks or Graphs We typically

More information

University of Maryland. Tuesday, March 2, 2010

University of Maryland. Tuesday, March 2, 2010 Data-Intensive Information Processing Applications Session #5 Graph Algorithms Jimmy Lin University of Maryland Tuesday, March 2, 2010 This work is licensed under a Creative Commons Attribution-Noncommercial-Share

More information

Community detection algorithms survey and overlapping communities. Presented by Sai Ravi Kiran Mallampati

Community detection algorithms survey and overlapping communities. Presented by Sai Ravi Kiran Mallampati Community detection algorithms survey and overlapping communities Presented by Sai Ravi Kiran Mallampati (sairavi5@vt.edu) 1 Outline Various community detection algorithms: Intuition * Evaluation of the

More information

Networks in economics and finance. Lecture 1 - Measuring networks

Networks in economics and finance. Lecture 1 - Measuring networks Networks in economics and finance Lecture 1 - Measuring networks What are networks and why study them? A network is a set of items (nodes) connected by edges or links. Units (nodes) Individuals Firms Banks

More information

Lecture 25: Review I

Lecture 25: Review I Lecture 25: Review I Reading: Up to chapter 5 in ISLR. STATS 202: Data mining and analysis Jonathan Taylor 1 / 18 Unsupervised learning In unsupervised learning, all the variables are on equal standing,

More information

Automatic Summarization

Automatic Summarization Automatic Summarization CS 769 Guest Lecture Andrew B. Goldberg goldberg@cs.wisc.edu Department of Computer Sciences University of Wisconsin, Madison February 22, 2008 Andrew B. Goldberg (CS Dept) Summarization

More information

Web 2.0 Social Data Analysis

Web 2.0 Social Data Analysis Web 2.0 Social Data Analysis Ing. Jaroslav Kuchař jaroslav.kuchar@fit.cvut.cz Structure (2) Czech Technical University in Prague, Faculty of Information Technologies Software and Web Engineering 2 Contents

More information

Link Prediction in Heterogeneous Collaboration. networks

Link Prediction in Heterogeneous Collaboration. networks Link Prediction in Heterogeneous Collaboration Networks Xi Wang and Gita Sukthankar Department of EECS University of Central Florida {xiwang,gitars}@eecs.ucf.edu Abstract. Traditional link prediction techniques

More information

Non Overlapping Communities

Non Overlapping Communities Non Overlapping Communities Davide Mottin, Konstantina Lazaridou HassoPlattner Institute Graph Mining course Winter Semester 2016 Acknowledgements Most of this lecture is taken from: http://web.stanford.edu/class/cs224w/slides

More information

Link Prediction in Fuzzy Social Networks using Distributed Learning Automata

Link Prediction in Fuzzy Social Networks using Distributed Learning Automata Link Prediction in Fuzzy Social Networks using Distributed Learning Automata Behnaz Moradabadi 1 and Mohammad Reza Meybodi 2 1,2 Department of Computer Engineering, Amirkabir University of Technology,

More information

16 - Networks and PageRank

16 - Networks and PageRank - Networks and PageRank ST 9 - Fall 0 Contents Network Intro. Required R Packages................................ Example: Zachary s karate club network.................... Basic Network Concepts. Basic

More information

Learning with Hypergraphs

Learning with Hypergraphs Learning with Hypergraphs B. Ravindran Joint Work with Sai Nageshwar S. Reconfigurable and Intelligent Systems Engineering (RISE) Group Department of Computer Science and Engineering Indian InsEtute of

More information

CS6200 Information Retreival. The WebGraph. July 13, 2015

CS6200 Information Retreival. The WebGraph. July 13, 2015 CS6200 Information Retreival The WebGraph The WebGraph July 13, 2015 1 Web Graph: pages and links The WebGraph describes the directed links between pages of the World Wide Web. A directed edge connects

More information

Big Data Analytics CSCI 4030

Big Data Analytics CSCI 4030 High dim. data Graph data Infinite data Machine learning Apps Locality sensitive hashing PageRank, SimRank Filtering data streams SVM Recommen der systems Clustering Community Detection Web advertising

More information

Supplementary material to Epidemic spreading on complex networks with community structures

Supplementary material to Epidemic spreading on complex networks with community structures Supplementary material to Epidemic spreading on complex networks with community structures Clara Stegehuis, Remco van der Hofstad, Johan S. H. van Leeuwaarden Supplementary otes Supplementary ote etwork

More information

Web Spam Challenge 2008

Web Spam Challenge 2008 Web Spam Challenge 2008 Data Analysis School, Moscow, Russia K. Bauman, A. Brodskiy, S. Kacher, E. Kalimulina, R. Kovalev, M. Lebedev, D. Orlov, P. Sushin, P. Zryumov, D. Leshchiner, I. Muchnik The Data

More information

ROBERTO BATTITI, MAURO BRUNATO. The LION Way: Machine Learning plus Intelligent Optimization. LIONlab, University of Trento, Italy, Apr 2015

ROBERTO BATTITI, MAURO BRUNATO. The LION Way: Machine Learning plus Intelligent Optimization. LIONlab, University of Trento, Italy, Apr 2015 ROBERTO BATTITI, MAURO BRUNATO. The LION Way: Machine Learning plus Intelligent Optimization. LIONlab, University of Trento, Italy, Apr 2015 http://intelligentoptimization.org/lionbook Roberto Battiti

More information

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University http://cs224w.stanford.edu How to organize the Web? First try: Human curated Web directories Yahoo, DMOZ, LookSmart Second

More information

1 Homophily and assortative mixing

1 Homophily and assortative mixing 1 Homophily and assortative mixing Networks, and particularly social networks, often exhibit a property called homophily or assortative mixing, which simply means that the attributes of vertices correlate

More information

Generating Useful Network-based Features for Analyzing Social Networks

Generating Useful Network-based Features for Analyzing Social Networks Generating Useful Network-based Features for Analyzing Social Networks Abstract Recently, many Web services such as social networking services, blogs, and collaborative tagging have become widely popular.

More information

3.1 Basic Definitions and Applications. Chapter 3. Graphs. Undirected Graphs. Some Graph Applications

3.1 Basic Definitions and Applications. Chapter 3. Graphs. Undirected Graphs. Some Graph Applications Chapter 3 31 Basic Definitions and Applications Graphs Slides by Kevin Wayne Copyright 2005 Pearson-Addison Wesley All rights reserved 1 Undirected Graphs Some Graph Applications Undirected graph G = (V,

More information

Community Structure Detection. Amar Chandole Ameya Kabre Atishay Aggarwal

Community Structure Detection. Amar Chandole Ameya Kabre Atishay Aggarwal Community Structure Detection Amar Chandole Ameya Kabre Atishay Aggarwal What is a network? Group or system of interconnected people or things Ways to represent a network: Matrices Sets Sequences Time

More information

Semi-Supervised Clustering with Partial Background Information

Semi-Supervised Clustering with Partial Background Information Semi-Supervised Clustering with Partial Background Information Jing Gao Pang-Ning Tan Haibin Cheng Abstract Incorporating background knowledge into unsupervised clustering algorithms has been the subject

More information

Structure of Social Networks

Structure of Social Networks Structure of Social Networks Outline Structure of social networks Applications of structural analysis Social *networks* Twitter Facebook Linked-in IMs Email Real life Address books... Who Twitter #numbers

More information

Using PageRank in Feature Selection

Using PageRank in Feature Selection Using PageRank in Feature Selection Dino Ienco, Rosa Meo, and Marco Botta Dipartimento di Informatica, Università di Torino, Italy {ienco,meo,botta}@di.unito.it Abstract. Feature selection is an important

More information

Social & Information Network Analysis CS 224W

Social & Information Network Analysis CS 224W Social & Information Network Analysis CS 224W Final Report Alexandre Becker Jordane Giuly Sébastien Robaszkiewicz Stanford University December 2011 1 Introduction The microblogging service Twitter today

More information

Algorithms and Applications in Social Networks. 2017/2018, Semester B Slava Novgorodov

Algorithms and Applications in Social Networks. 2017/2018, Semester B Slava Novgorodov Algorithms and Applications in Social Networks 2017/2018, Semester B Slava Novgorodov 1 Lesson #1 Administrative questions Course overview Introduction to Social Networks Basic definitions Network properties

More information

Cycles in Random Graphs

Cycles in Random Graphs Cycles in Random Graphs Valery Van Kerrebroeck Enzo Marinari, Guilhem Semerjian [Phys. Rev. E 75, 066708 (2007)] [J. Phys. Conf. Series 95, 012014 (2008)] Outline Introduction Statistical Mechanics Approach

More information

Link Analysis in the Cloud

Link Analysis in the Cloud Cloud Computing Link Analysis in the Cloud Dell Zhang Birkbeck, University of London 2017/18 Graph Problems & Representations What is a Graph? G = (V,E), where V represents the set of vertices (nodes)

More information

MAE 298, Lecture 9 April 30, Web search and decentralized search on small-worlds

MAE 298, Lecture 9 April 30, Web search and decentralized search on small-worlds MAE 298, Lecture 9 April 30, 2007 Web search and decentralized search on small-worlds Search for information Assume some resource of interest is stored at the vertices of a network: Web pages Files in

More information

Pregel. Ali Shah

Pregel. Ali Shah Pregel Ali Shah s9alshah@stud.uni-saarland.de 2 Outline Introduction Model of Computation Fundamentals of Pregel Program Implementation Applications Experiments Issues with Pregel 3 Outline Costs of Computation

More information

Demystifying movie ratings 224W Project Report. Amritha Raghunath Vignesh Ganapathi Subramanian

Demystifying movie ratings 224W Project Report. Amritha Raghunath Vignesh Ganapathi Subramanian Demystifying movie ratings 224W Project Report Amritha Raghunath (amrithar@stanford.edu) Vignesh Ganapathi Subramanian (vigansub@stanford.edu) 9 December, 2014 Introduction The past decade or so has seen

More information

CSE 255 Lecture 6. Data Mining and Predictive Analytics. Community Detection

CSE 255 Lecture 6. Data Mining and Predictive Analytics. Community Detection CSE 255 Lecture 6 Data Mining and Predictive Analytics Community Detection Dimensionality reduction Goal: take high-dimensional data, and describe it compactly using a small number of dimensions Assumption:

More information

Link Sign Prediction and Ranking in Signed Directed Social Networks

Link Sign Prediction and Ranking in Signed Directed Social Networks Noname manuscript No. (will be inserted by the editor) Link Sign Prediction and Ranking in Signed Directed Social Networks Dongjin Song David A. Meyer Received: date / Accepted: date Abstract Signed directed

More information

Stable Statistics of the Blogograph

Stable Statistics of the Blogograph Stable Statistics of the Blogograph Mark Goldberg, Malik Magdon-Ismail, Stephen Kelley, Konstantin Mertsalov Rensselaer Polytechnic Institute Department of Computer Science Abstract. The primary focus

More information

Analysis of Biological Networks. 1. Clustering 2. Random Walks 3. Finding paths

Analysis of Biological Networks. 1. Clustering 2. Random Walks 3. Finding paths Analysis of Biological Networks 1. Clustering 2. Random Walks 3. Finding paths Problem 1: Graph Clustering Finding dense subgraphs Applications Identification of novel pathways, complexes, other modules?

More information

Mining Social Network Graphs

Mining Social Network Graphs Mining Social Network Graphs Analysis of Large Graphs: Community Detection Rafael Ferreira da Silva rafsilva@isi.edu http://rafaelsilva.com Note to other teachers and users of these slides: We would be

More information

Homework 4: Clustering, Recommenders, Dim. Reduction, ML and Graph Mining (due November 19 th, 2014, 2:30pm, in class hard-copy please)

Homework 4: Clustering, Recommenders, Dim. Reduction, ML and Graph Mining (due November 19 th, 2014, 2:30pm, in class hard-copy please) Virginia Tech. Computer Science CS 5614 (Big) Data Management Systems Fall 2014, Prakash Homework 4: Clustering, Recommenders, Dim. Reduction, ML and Graph Mining (due November 19 th, 2014, 2:30pm, in

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 5 Inference

More information