CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University

Size: px
Start display at page:

Download "CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University"

Transcription

1 CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University

2 10/10/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 2 Networks with positive and negative relationships Consider an undirected complete graph Label each edge as either: Positive: friendship, trust, positive sentiment, Negative: enemy, distrust, negative sentiment, Examine triples of connected nodes A, B, C

3 Start with the intuition [Heider 46]: Friend of my friend is my friend Enemy of enemy is my friend Enemy of friend is my enemy Look at connected triples of nodes: Balanced Consistent with friend of a friend or enemy of the enemy intuition 10/10/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 3 Unbalanced Inconsistent with the friend of a friend or enemy of the enemy intuition

4 Graph is balanced if every connected triple of nodes has: all 3 edges labeled, or exactly 1 edge labeled Unbalanced Balanced 10/10/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 4

5 10/10/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 5 Balance implies global coalitions [CartwrightHarary] If all triangles are balanced, then either: The network contains only positive edges, or Nodes can be split into 2 sets where negative edges only point between the sets L R

6 B Every node in L is enemy of R D Any 2 nodes in L are friends A Any 2 nodes in R are friends C L Friends of A Enemies of A 10/10/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 6 E R

7 International relations: Positive edge: alliance Negative edge: animosity Separation of Bangladesh from Pakistan in 1971: US supports Pakistan. Why? USSR was enemy of China China was enemy of India India was enemy of Pakistan US was friendly with China China vetoed Bangladesh from U.N.? 10/10/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 7 P U B? C I R

8 10/10/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 8

9 10/10/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 9

10 10/10/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 10

11 10/10/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 11

12 10/10/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 12

13 10/10/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 13

14 10/10/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 14 Balanced? Def 1: Local view Fill in the missing edges to achieve balance Def 2: Global view Divide the graph into two coalitions The 2 defs. are equivalent!

15 10/10/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 15 Graph is balanced if and only if it contains no cycle with an odd number of negative edges. How to compute this? Find connected components on edges For each component create a supernode Connect components A and B if there is a negative edge between the members Assign supernodes to sides using BFS

16 10/10/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 16

17 10/10/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 17

18 10/10/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 18

19 10/10/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 19 Using BFS assign each node a side Graph is unbalanced if any two supernodes are assigned the same side L R R L L L Unbalanced! R

20 [CHI 10] 10/10/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 20 Each link A B is explicitly tagged with a sign: Epinions: Trust/Distrust Does A trust B s product reviews? (only positive links are visible) Wikipedia: Support/Oppose Does A support B to become Wikipedia administrator? Slashdot: Friend/Foe Does A like B s comments? Other examples: Online multiplayer games

21 [CHI 10] Does structural balance hold? Triad Epinions Wikipedia P(T) P 0 (T) P(T) P 0 (T) Balance P(T) probability of a triad P 0 (T) triad probability if the signs would be random Real data Shuffled data 10/10/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 21 x x x x x x x

22 22 Intuitive picture of social network in terms of densely linked clusters How does structure interact with links? Embeddedness of link (A,B): Number of shared neighbors

23 [CHI 10] 10/10/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 23 Embeddedness of ties: Positive ties tend to be more embedded Positive ties tend to be more clumped together Public display of signs (votes) in Wikipedia further attenuates this Epinions Wikipedia

24 10/10/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, [CHI 10] 24 Clustering: net: More clustering than baseline net: Less clustering than baseline Size of connected component: /net: Smaller than the baseline

25 [CHI 10] 10/10/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 25 New setting: Links are directed and created over time X A B How many are now explained by balance? Only half (8 out of 16) Is there a better explanation? Yes. Status. 16 *2 signed directed triads

26 [CHI 10] 10/10/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 26 Links are directed and created over time Status theory [DavisLeinhardt 68, Guha et al. 04, Leskovec et al. 10] Link A B means: B has higher status than A Link A B means: B has lower status than A Status and balance give different predictions: X X A B A B Balance: Status: Balance: Status:

27 [CHI 10] 10/10/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 27 Edges are directed Edges are created over time X has links to A and B Now, A links to B (triad ABX) How does sign of AB depend signs of X? We need to formalize: Links are embedded in triads: Provides context for signs Users are heterogeneous in their linking behavior A A X? X B B

28 10/10/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, [CHI 10] 28 Link (A,B) appears in the context (A,B; X) 16 different contextualized links:

29 [CHI 10] Surprise: How much behavior of user deviates from baseline in context t: (A 1, B 1 ; X 1 ),, (A n, B n ; X n ) instances of contextualized link t k of them closed with a plus p g (A i ) generative baseline of A i empirical prob. of A i giving a plus Then: generative surprise of n triad type t: k pg ( Ai ) s i= 1 g ( t) = n i p g ( A )(1 p i g ( A )) 10/10/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 29 i A A X Vs. X B B

30 10/10/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 30 Two basic examples: X X A B A B Gen. surprise of A: Rec. surprise of B: Gen. surprise of A: Rec. surprise of B:

31 [CHI 10] 10/10/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 35 Determine node status: Assign X status 0 Based on signs and directions of edges set status of A and B 1 A B Surprise is statusconsistent, if: Gen. surprise is statusconsistent if it has same sign as status of B Rec. surprise is statusconsistent if it has the opposite sign from the status of A Surprise is balanceconsistent, if: If it completes a balanced triad 1 X 0 Statusconsistent if: Gen. surprise > 0 Rec. surprise < 0

32 [CHI 10] 10/10/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 36 Predictions: S g (t i ) S r (t i ) B g B r S g S r t 3 t 15 t 2 t 14 t 16

33 [WWW 10] 10/10/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 37 Both theories make predictions about the global structure of the network Structural balance Factions Find coalitions Status theory Global Status Flip direction and sign of minus edges Assign each node a unique status so that edges point from low to high 3 1 2

34 [WWW 10] 10/10/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 38 Fraction of edges of the network that satisfy Balance and Status? Observations: No evidence for global balance beyond the random baselines Real data is 80% consistent vs. 80% consistency under random baseline Evidence for global status beyond the random baselines Real data is 80% consistent, but 50% consistency under random baseline

35 [WWW 10] Edge sign prediction problem Given a network and signs on all but one edge, predict the missing sign Machine Learning Formulation: Predict sign of edge (u,v) Class label: Dataset: 1: positive edge 1: negative edge Learning method: Logistic regression u Original: 80% edges Balanced: 50% edges Evaluation: Accuracy and ROC curves Features for learning: Next slide? v 10/10/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 39

36 [WWW 10] 10/10/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 40 For each edge (u,v) create features: Triad counts (16): Counts of signed triads u edge u v takes part in Node degree (7 features): Signed degree: d out(u), d out(u), d in(v), d in(v) Total degree: d out (u), d in (v) Embeddedness of edge (u,v) v

37 [WWW 10] 10/10/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 41 Classification Accuracy: Epinions: 93.5% Slashdot: 94.4% Wikipedia: 81% Signs can be modeled from local network structure alone Trust propagation model of [Guha et al. 04] has 14% error on Epinions Triad features perform less well for less embedded edges Wikipedia is harder to model: Votes are publicly visible Epin Slash Wiki

38 42

39 10/10/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, 43 Do people use these very different linking systems by obeying the same principles? How generalizable are the results across the datasets? Train on row dataset, predict on column Nearly perfect generalization of the models even though networks come from very different applications

40 44 Signed networks provide insight into how social computing systems are used: Status vs. Balance Different role of reciprocated links Role of embeddedness and public display Sign of relationship can be reliably predicted from the local network context ~90% accuracy sign of the edge

41 Status difference (AB) More evidence that networks are globally organized based on status People use signed edges consistently regardless of particular application Near perfect generalization of models across datasets Many further directions: Status difference of nodes A and B [ICWSM 10]: A<B A=B A>B

CS224W: Analysis of Networks Jure Leskovec, Stanford University

CS224W: Analysis of Networks Jure Leskovec, Stanford University CS224W: Analysis of Networks Jure Leskovec, Stanford University http://cs224w.stanford.edu Start with the intuition [Heider 46]: Friend of my friend is my friend Enemy of enemy is my friend Enemy of friend

More information

Positive and Negative Links

Positive and Negative Links Positive and Negative Links Web Science (VU) (707.000) Elisabeth Lex KTI, TU Graz May 4, 2015 Elisabeth Lex (KTI, TU Graz) Networks May 4, 2015 1 / 66 Outline 1 Repetition 2 Motivation 3 Structural Balance

More information

Social and Information Network Analysis Positive and Negative Relationships

Social and Information Network Analysis Positive and Negative Relationships Social and Information Network Analysis Positive and Negative Relationships Cornelia Caragea Department of Computer Science and Engineering University of North Texas June 14, 2016 To Recap Triadic Closure

More information

Positive and Negative Relations

Positive and Negative Relations Positive and Negative Relations CMSC 498J: Social Media Computing Department of Computer Science University of Maryland Spring 2016 Hadi Amiri hadi@umd.edu Lecture Topics Balanced and Unbalanced Networks

More information

Online Social Networks and Media. Positive and Negative Edges Strong and Weak Edges Strong Triadic Closure

Online Social Networks and Media. Positive and Negative Edges Strong and Weak Edges Strong Triadic Closure Online Social Networks and Media Positive and Negative Edges Strong and Weak Edges Strong Triadic Closure POSITIVE AND NEGATIVE TIES Structural Balance What about negative edges? Initially, a complete

More information

Link Sign Prediction and Ranking in Signed Directed Social Networks

Link Sign Prediction and Ranking in Signed Directed Social Networks Noname manuscript No. (will be inserted by the editor) Link Sign Prediction and Ranking in Signed Directed Social Networks Dongjin Song David A. Meyer Received: date / Accepted: date Abstract Signed directed

More information

CS224W: Analysis of Networks Jure Leskovec, Stanford University

CS224W: Analysis of Networks Jure Leskovec, Stanford University CS224W: Analysis of Networks Jure Leskovec, Stanford University http://cs224w.stanford.edu Chile Angola Gabon Ecuador Colombia Venezuela Canada Denmark Sweden Belgium-Lux Argentina Brazil Kuwait Untd Arab

More information

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University CS224W: Social and Information Network Analysis Jure Leskovec Stanford University Jure Leskovec, Stanford University http://cs224w.stanford.edu [Morris 2000] Based on 2 player coordination game 2 players

More information

CS224W: Analysis of Networks Jure Leskovec, Stanford University

CS224W: Analysis of Networks Jure Leskovec, Stanford University CS224W: Analysis of Networks Jure Leskovec, Stanford University http://cs224w.stanford.edu 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, http://cs224w.stanford.edu 2 Observations Models

More information

Problem Set 2. General Instructions

Problem Set 2. General Instructions CS224W: Analysis of Networks Fall 2017 Problem Set 2 General Instructions Due 11:59pm PDT October 26, 2017 These questions require thought, but do not require long answers. Please be as concise as possible.

More information

Problem Set 2. General Instructions

Problem Set 2. General Instructions CS224W: Analysis of Networks Fall 2017 Problem Set 2 General Instructions Due 11:59pm PDT October 26, 2017 These questions require thought, but do not require long answers. Please be as concise as possible.

More information

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University, y http://cs224w.stanford.edu Due in 1 week: Oct 4 in class! The idea of the reaction papers is: To familiarize yourselves

More information

Graph Mining: Overview of different graph models

Graph Mining: Overview of different graph models Graph Mining: Overview of different graph models Davide Mottin, Konstantina Lazaridou Hasso Plattner Institute Graph Mining course Winter Semester 2016 Lecture road Anomaly detection (previous lecture)

More information

CS224W Final Report Emergence of Global Status Hierarchy in Social Networks

CS224W Final Report Emergence of Global Status Hierarchy in Social Networks CS224W Final Report Emergence of Global Status Hierarchy in Social Networks Group 0: Yue Chen, Jia Ji, Yizheng Liao December 0, 202 Introduction Social network analysis provides insights into a wide range

More information

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University http://cs224w.stanford.edu 10/4/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

More information

Extracting Information from Complex Networks

Extracting Information from Complex Networks Extracting Information from Complex Networks 1 Complex Networks Networks that arise from modeling complex systems: relationships Social networks Biological networks Distinguish from random networks uniform

More information

Network Lasso: Clustering and Optimization in Large Graphs

Network Lasso: Clustering and Optimization in Large Graphs Network Lasso: Clustering and Optimization in Large Graphs David Hallac, Jure Leskovec, Stephen Boyd Stanford University September 28, 2015 Convex optimization Convex optimization is everywhere Introduction

More information

Mining of Massive Datasets Jure Leskovec, AnandRajaraman, Jeff Ullman Stanford University

Mining of Massive Datasets Jure Leskovec, AnandRajaraman, Jeff Ullman Stanford University Note to other teachers and users of these slides: We would be delighted if you found this our material useful in giving your own lectures. Feel free to use these slides verbatim, or to modify them to fit

More information

Edge Classification in Networks

Edge Classification in Networks Charu C. Aggarwal, Peixiang Zhao, and Gewen He Florida State University IBM T J Watson Research Center Edge Classification in Networks ICDE Conference, 2016 Introduction We consider in this paper the edge

More information

Statistical Methods for Network Analysis: Exponential Random Graph Models

Statistical Methods for Network Analysis: Exponential Random Graph Models Day 2: Network Modeling Statistical Methods for Network Analysis: Exponential Random Graph Models NMID workshop September 17 21, 2012 Prof. Martina Morris Prof. Steven Goodreau Supported by the US National

More information

CSE 258 Lecture 12. Web Mining and Recommender Systems. Social networks

CSE 258 Lecture 12. Web Mining and Recommender Systems. Social networks CSE 258 Lecture 12 Web Mining and Recommender Systems Social networks Social networks We ve already seen networks (a little bit) in week 3 i.e., we ve studied inference problems defined on graphs, and

More information

Exploiting Social Network Structure for Person-to-Person Sentiment Analysis (Supplementary Material to [WPLP14])

Exploiting Social Network Structure for Person-to-Person Sentiment Analysis (Supplementary Material to [WPLP14]) Exploiting Social Network Structure for Person-to-Person Sentiment Analysis (Supplementary Material to [WPLP14]) Robert West, Hristo S. Paskov, Jure Leskovec, Christopher Potts Stanford University west@cs.stanford.edu,

More information

node2vec: Scalable Feature Learning for Networks

node2vec: Scalable Feature Learning for Networks node2vec: Scalable Feature Learning for Networks A paper by Aditya Grover and Jure Leskovec, presented at Knowledge Discovery and Data Mining 16. 11/27/2018 Presented by: Dharvi Verma CS 848: Graph Database

More information

Algorithms and Applications in Social Networks. 2017/2018, Semester B Slava Novgorodov

Algorithms and Applications in Social Networks. 2017/2018, Semester B Slava Novgorodov Algorithms and Applications in Social Networks 2017/2018, Semester B Slava Novgorodov 1 Lesson #1 Administrative questions Course overview Introduction to Social Networks Basic definitions Network properties

More information

V2: Measures and Metrics (II)

V2: Measures and Metrics (II) - Betweenness Centrality V2: Measures and Metrics (II) - Groups of Vertices - Transitivity - Reciprocity - Signed Edges and Structural Balance - Similarity - Homophily and Assortative Mixing 1 Betweenness

More information

Structural Balance and Transitivity. Social Network Analysis, Chapter 6 Wasserman and Faust

Structural Balance and Transitivity. Social Network Analysis, Chapter 6 Wasserman and Faust Structural Balance and Transitivity Social Network Analysis, Chapter 6 Wasserman and Faust Balance Theory Concerned with how an individual's attitudes or opinions coincide with those of others in a network

More information

Scalable Network Analysis

Scalable Network Analysis Inderjit S. Dhillon University of Texas at Austin COMAD, Ahmedabad, India Dec 20, 2013 Outline Unstructured Data - Scale & Diversity Evolving Networks Machine Learning Problems arising in Networks Recommender

More information

Tie strength, social capital, betweenness and homophily. Rik Sarkar

Tie strength, social capital, betweenness and homophily. Rik Sarkar Tie strength, social capital, betweenness and homophily Rik Sarkar Networks Position of a node in a network determines its role/importance Structure of a network determines its properties 2 Today Notion

More information

CSE 158 Lecture 11. Web Mining and Recommender Systems. Social networks

CSE 158 Lecture 11. Web Mining and Recommender Systems. Social networks CSE 158 Lecture 11 Web Mining and Recommender Systems Social networks Assignment 1 Due 5pm next Monday! (Kaggle shows UTC time, but the due date is 5pm, Monday, PST) Assignment 1 Assignment 1 Social networks

More information

Link Label Prediction in Signed Citation Network. Thesis by Uchenna Akujuobi. In Partial Fulfillment of the Requirements.

Link Label Prediction in Signed Citation Network. Thesis by Uchenna Akujuobi. In Partial Fulfillment of the Requirements. Link Label Prediction in Signed Citation Network Thesis by Uchenna Akujuobi In Partial Fulfillment of the Requirements For the Degree of Masters of Science King Abdullah University of Science and Technology,

More information

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

CS246: Mining Massive Datasets Jure Leskovec, Stanford University CS246: Mining Massive Datasets Jure Leskovec, Stanford University http://cs246.stanford.edu Can we identify node groups? (communities, modules, clusters) 2/13/2014 Jure Leskovec, Stanford C246: Mining

More information

1 Homophily and assortative mixing

1 Homophily and assortative mixing 1 Homophily and assortative mixing Networks, and particularly social networks, often exhibit a property called homophily or assortative mixing, which simply means that the attributes of vertices correlate

More information

Using Machine Learning to Optimize Storage Systems

Using Machine Learning to Optimize Storage Systems Using Machine Learning to Optimize Storage Systems Dr. Kiran Gunnam 1 Outline 1. Overview 2. Building Flash Models using Logistic Regression. 3. Storage Object classification 4. Storage Allocation recommendation

More information

Web 2.0 Social Data Analysis

Web 2.0 Social Data Analysis Web 2.0 Social Data Analysis Ing. Jaroslav Kuchař jaroslav.kuchar@fit.cvut.cz Structure (2) Czech Technical University in Prague, Faculty of Information Technologies Software and Web Engineering 2 Contents

More information

CS224W: Analysis of Networks Jure Leskovec, Stanford University

CS224W: Analysis of Networks Jure Leskovec, Stanford University CS224W: Analysis of Networks Jure Leskovec, Stanford University http://cs224w.stanford.edu Jure Leskovec, Stanford CS224W: Analysis of Networks, http://cs224w.stanford.edu 2????? Machine Learning Node

More information

Characterization and Edge Sign Prediction in Signed Networks

Characterization and Edge Sign Prediction in Signed Networks Characterization and Edge Sign Prediction in Signed Networks Tongda Zhang, Haomiao Jiang, and Zhouxiao Bao Electrical Engineering Department, Stanford University, Stanford, CA 94305 USA Email: {tdzhang,

More information

CHARACTERIZATION AND EDGE SIGN PREDICTION IN SIGNED NETWORKS. Tongda Zhang, Haomiao Jiang, Zhouxiao Bao

CHARACTERIZATION AND EDGE SIGN PREDICTION IN SIGNED NETWORKS. Tongda Zhang, Haomiao Jiang, Zhouxiao Bao CHARACTERIZATION AND EDGE SIGN PREDICTION IN SIGNED NETWORKS Tongda Zhang, Haomiao Jiang, Zhouxiao Bao {tdzhang, hjiang36, zhouxiao}@stanford.edu ABSTRACT In this project, we perform an intensive study

More information

Scalable Clustering of Signed Networks Using Balance Normalized Cut

Scalable Clustering of Signed Networks Using Balance Normalized Cut Scalable Clustering of Signed Networks Using Balance Normalized Cut Kai-Yang Chiang,, Inderjit S. Dhillon The 21st ACM International Conference on Information and Knowledge Management (CIKM 2012) Oct.

More information

CSE 190 Lecture 16. Data Mining and Predictive Analytics. Small-world phenomena

CSE 190 Lecture 16. Data Mining and Predictive Analytics. Small-world phenomena CSE 190 Lecture 16 Data Mining and Predictive Analytics Small-world phenomena Another famous study Stanley Milgram wanted to test the (already popular) hypothesis that people in social networks are separated

More information

Supervised Link Prediction with Path Scores

Supervised Link Prediction with Path Scores Supervised Link Prediction with Path Scores Wanzi Zhou Stanford University wanziz@stanford.edu Yangxin Zhong Stanford University yangxin@stanford.edu Yang Yuan Stanford University yyuan16@stanford.edu

More information

CSE 255 Lecture 13. Data Mining and Predictive Analytics. Triadic closure; strong & weak ties

CSE 255 Lecture 13. Data Mining and Predictive Analytics. Triadic closure; strong & weak ties CSE 255 Lecture 13 Data Mining and Predictive Analytics Triadic closure; strong & weak ties Monday Random models of networks: Erdos Renyi random graphs (picture from Wikipedia http://en.wikipedia.org/wiki/erd%c5%91s%e2%80%93r%c3%a9nyi_model)

More information

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

CS246: Mining Massive Datasets Jure Leskovec, Stanford University CS46: Mining Massive Datasets Jure Leskovec, Stanford University http://cs46.stanford.edu /7/ Jure Leskovec, Stanford C46: Mining Massive Datasets Many real-world problems Web Search and Text Mining Billions

More information

An Empirical Analysis of Communities in Real-World Networks

An Empirical Analysis of Communities in Real-World Networks An Empirical Analysis of Communities in Real-World Networks Chuan Sheng Foo Computer Science Department Stanford University csfoo@cs.stanford.edu ABSTRACT Little work has been done on the characterization

More information

Ranking Clustered Data with Pairwise Comparisons

Ranking Clustered Data with Pairwise Comparisons Ranking Clustered Data with Pairwise Comparisons Alisa Maas ajmaas@cs.wisc.edu 1. INTRODUCTION 1.1 Background Machine learning often relies heavily on being able to rank the relative fitness of instances

More information

Mining of Massive Datasets Jure Leskovec, Anand Rajaraman, Jeff Ullman Stanford University Infinite data. Filtering data streams

Mining of Massive Datasets Jure Leskovec, Anand Rajaraman, Jeff Ullman Stanford University  Infinite data. Filtering data streams /9/7 Note to other teachers and users of these slides: We would be delighted if you found this our material useful in giving your own lectures. Feel free to use these slides verbatim, or to modify them

More information

Finding the Bias and Prestige of Nodes in Networks based on Trust Scores

Finding the Bias and Prestige of Nodes in Networks based on Trust Scores WWW 2 Session: Social Network Analysis March 28 April, 2, Hyderabad, India Finding the Bias and Prestige of Nodes in Networks based on Trust Scores ABSTRACT Abhinav Mishra Dept. of Computer Science and

More information

MIDTERM EXAMINATION Networked Life (NETS 112) November 21, 2013 Prof. Michael Kearns

MIDTERM EXAMINATION Networked Life (NETS 112) November 21, 2013 Prof. Michael Kearns MIDTERM EXAMINATION Networked Life (NETS 112) November 21, 2013 Prof. Michael Kearns This is a closed-book exam. You should have no material on your desk other than the exam itself and a pencil or pen.

More information

Graph Structure Over Time

Graph Structure Over Time Graph Structure Over Time Observing how time alters the structure of the IEEE data set Priti Kumar Computer Science Rensselaer Polytechnic Institute Troy, NY Kumarp3@rpi.edu Abstract This paper examines

More information

Signed Network Modeling Based on Structural Balance Theory

Signed Network Modeling Based on Structural Balance Theory Signed Network Modeling Based on Structural Balance Theory Tyler Derr Data Science and Engineering Lab Michigan State University derrtyle@msu.edu Charu Aggarwal IBM T.J. Watson Research Center charu@us.ibm.com

More information

Thanks to Jure Leskovec, Anand Rajaraman, Jeff Ullman

Thanks to Jure Leskovec, Anand Rajaraman, Jeff Ullman Thanks to Jure Leskovec, Anand Rajaraman, Jeff Ullman http://www.mmds.org Overview of Recommender Systems Content-based Systems Collaborative Filtering J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive

More information

Establishing Virtual Private Network Bandwidth Requirement at the University of Wisconsin Foundation

Establishing Virtual Private Network Bandwidth Requirement at the University of Wisconsin Foundation Establishing Virtual Private Network Bandwidth Requirement at the University of Wisconsin Foundation by Joe Madden In conjunction with ECE 39 Introduction to Artificial Neural Networks and Fuzzy Systems

More information

Predicting User Ratings Using Status Models on Amazon.com

Predicting User Ratings Using Status Models on Amazon.com Predicting User Ratings Using Status Models on Amazon.com Borui Wang Stanford University borui@stanford.edu Guan (Bell) Wang Stanford University guanw@stanford.edu Group 19 Zhemin Li Stanford University

More information

Non Overlapping Communities

Non Overlapping Communities Non Overlapping Communities Davide Mottin, Konstantina Lazaridou HassoPlattner Institute Graph Mining course Winter Semester 2016 Acknowledgements Most of this lecture is taken from: http://web.stanford.edu/class/cs224w/slides

More information

CSE 316: SOCIAL NETWORK ANALYSIS INTRODUCTION. Fall 2017 Marion Neumann

CSE 316: SOCIAL NETWORK ANALYSIS INTRODUCTION. Fall 2017 Marion Neumann CSE 316: SOCIAL NETWORK ANALYSIS Fall 2017 Marion Neumann INTRODUCTION Contents in these slides may be subject to copyright. Some materials are adopted from: http://www.cs.cornell.edu/home /kleinber/ networks-book,

More information

Analytical model A structure and process for analyzing a dataset. For example, a decision tree is a model for the classification of a dataset.

Analytical model A structure and process for analyzing a dataset. For example, a decision tree is a model for the classification of a dataset. Glossary of data mining terms: Accuracy Accuracy is an important factor in assessing the success of data mining. When applied to data, accuracy refers to the rate of correct values in the data. When applied

More information

CSE 258 Lecture 6. Web Mining and Recommender Systems. Community Detection

CSE 258 Lecture 6. Web Mining and Recommender Systems. Community Detection CSE 258 Lecture 6 Web Mining and Recommender Systems Community Detection Dimensionality reduction Goal: take high-dimensional data, and describe it compactly using a small number of dimensions Assumption:

More information

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

CS246: Mining Massive Datasets Jure Leskovec, Stanford University CS246: Mining Massive Datasets Jure Leskovec, Stanford University http://cs246.stanford.edu SPAM FARMING 2/11/2013 Jure Leskovec, Stanford C246: Mining Massive Datasets 2 2/11/2013 Jure Leskovec, Stanford

More information

Local higher-order graph clustering

Local higher-order graph clustering Local higher-order graph clustering Hao Yin Stanford University yinh@stanford.edu Austin R. Benson Cornell University arb@cornell.edu Jure Leskovec Stanford University jure@cs.stanford.edu David F. Gleich

More information

CSE 158 Lecture 6. Web Mining and Recommender Systems. Community Detection

CSE 158 Lecture 6. Web Mining and Recommender Systems. Community Detection CSE 158 Lecture 6 Web Mining and Recommender Systems Community Detection Dimensionality reduction Goal: take high-dimensional data, and describe it compactly using a small number of dimensions Assumption:

More information

CSE 158 Lecture 13. Web Mining and Recommender Systems. Triadic closure; strong & weak ties

CSE 158 Lecture 13. Web Mining and Recommender Systems. Triadic closure; strong & weak ties CSE 158 Lecture 13 Web Mining and Recommender Systems Triadic closure; strong & weak ties Monday Random models of networks: Erdos Renyi random graphs (picture from Wikipedia http://en.wikipedia.org/wiki/erd%c5%91s%e2%80%93r%c3%a9nyi_model)

More information

Behavioral Data Mining. Lecture 9 Modeling People

Behavioral Data Mining. Lecture 9 Modeling People Behavioral Data Mining Lecture 9 Modeling People Outline Power Laws Big-5 Personality Factors Social Network Structure Power Laws Y-axis = frequency of word, X-axis = rank in decreasing order Power Laws

More information

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

CS246: Mining Massive Datasets Jure Leskovec, Stanford University CS246: Mining Massive Datasets Jure Leskovec, Stanford University http://cs246.stanford.edu HITS (Hypertext Induced Topic Selection) Is a measure of importance of pages or documents, similar to PageRank

More information

Classify My Social Contacts into Circles Stanford University CS224W Fall 2014

Classify My Social Contacts into Circles Stanford University CS224W Fall 2014 Classify My Social Contacts into Circles Stanford University CS224W Fall 2014 Amer Hammudi (SUNet ID: ahammudi) ahammudi@stanford.edu Darren Koh (SUNet: dtkoh) dtkoh@stanford.edu Jia Li (SUNet: jli14)

More information

An Exploratory Journey Into Network Analysis A Gentle Introduction to Network Science and Graph Visualization

An Exploratory Journey Into Network Analysis A Gentle Introduction to Network Science and Graph Visualization An Exploratory Journey Into Network Analysis A Gentle Introduction to Network Science and Graph Visualization Pedro Ribeiro (DCC/FCUP & CRACS/INESC-TEC) Part 1 Motivation and emergence of Network Science

More information

Link Prediction for Social Network

Link Prediction for Social Network Link Prediction for Social Network Ning Lin Computer Science and Engineering University of California, San Diego Email: nil016@eng.ucsd.edu Abstract Friendship recommendation has become an important issue

More information

Influence Maximization in Location-Based Social Networks Ivan Suarez, Sudarshan Seshadri, Patrick Cho CS224W Final Project Report

Influence Maximization in Location-Based Social Networks Ivan Suarez, Sudarshan Seshadri, Patrick Cho CS224W Final Project Report Influence Maximization in Location-Based Social Networks Ivan Suarez, Sudarshan Seshadri, Patrick Cho CS224W Final Project Report Abstract The goal of influence maximization has led to research into different

More information

External influence on Bitcoin trust network structure

External influence on Bitcoin trust network structure External influence on Bitcoin trust network structure Edison Alejandro García, garcial@stanford.edu SUNetID: 6675 CS4W - Analysis of Networks Abstract Networks can express how much people trust or distrust

More information

Predict Topic Trend in Blogosphere

Predict Topic Trend in Blogosphere Predict Topic Trend in Blogosphere Jack Guo 05596882 jackguo@stanford.edu Abstract Graphical relationship among web pages has been used to rank their relative importance. In this paper, we introduce a

More information

CSE 158 Lecture 11. Web Mining and Recommender Systems. Triadic closure; strong & weak ties

CSE 158 Lecture 11. Web Mining and Recommender Systems. Triadic closure; strong & weak ties CSE 158 Lecture 11 Web Mining and Recommender Systems Triadic closure; strong & weak ties Triangles So far we ve seen (a little about) how networks can be characterized by their connectivity patterns What

More information

How to organize the Web?

How to organize the Web? How to organize the Web? First try: Human curated Web directories Yahoo, DMOZ, LookSmart Second try: Web Search Information Retrieval attempts to find relevant docs in a small and trusted set Newspaper

More information

Modeling Movie Character Networks with Random Graphs

Modeling Movie Character Networks with Random Graphs Modeling Movie Character Networks with Random Graphs Andy Chen December 2017 Introduction For a novel or film, a character network is a graph whose nodes represent individual characters and whose edges

More information

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University CS224W: Social and Information Network Analysis Jure Leskovec Stanford University Jure Leskovec, Stanford University http://cs224w.stanford.edu Teams of 2 3 students (1 is also ok) Project: Experimental

More information

Implementation of Network Community Profile using Local Spectral algorithm and its application in Community Networking

Implementation of Network Community Profile using Local Spectral algorithm and its application in Community Networking Implementation of Network Community Profile using Local Spectral algorithm and its application in Community Networking Vaibhav VPrakash Department of Computer Science and Engineering, Sri Jayachamarajendra

More information

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

CS246: Mining Massive Datasets Jure Leskovec, Stanford University CS246: Mining Massive Datasets Jure Leskovec, Stanford University http://cs246.stanford.edu 3/6/2012 Jure Leskovec, Stanford CS246: Mining Massive Datasets, http://cs246.stanford.edu 2 In many data mining

More information

Graaasp: a Web 2.0 Research Platform for Contextual Recommendation with Aggregated Data

Graaasp: a Web 2.0 Research Platform for Contextual Recommendation with Aggregated Data Graaasp: a Web 2.0 Research Platform for Contextual Recommendation with Aggregated Data Evgeny Bogdanov evgeny.bogdanov@epfl.ch Sandy El Helou Sandy.elhelou@epfl.ch Denis Gillet denis.gillet@epfl.ch Christophe

More information

AspEm: Embedding Learning by Aspects in Heterogeneous Information Networks

AspEm: Embedding Learning by Aspects in Heterogeneous Information Networks AspEm: Embedding Learning by Aspects in Heterogeneous Information Networks Yu Shi, Huan Gui, Qi Zhu, Lance Kaplan, Jiawei Han University of Illinois at Urbana-Champaign (UIUC) Facebook Inc. U.S. Army Research

More information

I211: Information infrastructure II

I211: Information infrastructure II Data Mining: Classifier Evaluation I211: Information infrastructure II 3-nearest neighbor labeled data find class labels for the 4 data points 1 0 0 6 0 0 0 5 17 1.7 1 1 4 1 7.1 1 1 1 0.4 1 2 1 3.0 0 0.1

More information

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University http://cs224w.stanford.edu How to organize the Web? First try: Human curated Web directories Yahoo, DMOZ, LookSmart Second

More information

CSE 255 Lecture 6. Data Mining and Predictive Analytics. Community Detection

CSE 255 Lecture 6. Data Mining and Predictive Analytics. Community Detection CSE 255 Lecture 6 Data Mining and Predictive Analytics Community Detection Dimensionality reduction Goal: take high-dimensional data, and describe it compactly using a small number of dimensions Assumption:

More information

Supervised Random Walks

Supervised Random Walks Supervised Random Walks Pawan Goyal CSE, IITKGP September 8, 2014 Pawan Goyal (IIT Kharagpur) Supervised Random Walks September 8, 2014 1 / 17 Correlation Discovery by random walk Problem definition Estimate

More information

Recap! CMSC 498J: Social Media Computing. Department of Computer Science University of Maryland Spring Hadi Amiri

Recap! CMSC 498J: Social Media Computing. Department of Computer Science University of Maryland Spring Hadi Amiri Recap! CMSC 498J: Social Media Computing Department of Computer Science University of Maryland Spring 2015 Hadi Amiri hadi@umd.edu Announcement CourseEvalUM https://www.courseevalum.umd.edu/ 2 Announcement

More information

Coloring Signed Graphs

Coloring Signed Graphs Coloring Signed Graphs Lynn Takeshita May 12, 2016 Abstract This survey paper provides an introduction to signed graphs, focusing on coloring. We shall introduce the concept of signed graphs, a proper

More information

Social Networks, Social Interaction, and Outcomes

Social Networks, Social Interaction, and Outcomes Social Networks, Social Interaction, and Outcomes Social Networks and Social Interaction Not all links are created equal: Some links are stronger than others Some links are more useful than others Links

More information

Multi-Class Logistic Regression and Perceptron

Multi-Class Logistic Regression and Perceptron Multi-Class Logistic Regression and Perceptron Instructor: Wei Xu Some slides adapted from Dan Jurfasky, Brendan O Connor and Marine Carpuat MultiClass Classification Q: what if we have more than 2 categories?

More information

GraphGAN: Graph Representation Learning with Generative Adversarial Nets

GraphGAN: Graph Representation Learning with Generative Adversarial Nets The 32 nd AAAI Conference on Artificial Intelligence (AAAI 2018) New Orleans, Louisiana, USA GraphGAN: Graph Representation Learning with Generative Adversarial Nets Hongwei Wang 1,2, Jia Wang 3, Jialin

More information

Topic mash II: assortativity, resilience, link prediction CS224W

Topic mash II: assortativity, resilience, link prediction CS224W Topic mash II: assortativity, resilience, link prediction CS224W Outline Node vs. edge percolation Resilience of randomly vs. preferentially grown networks Resilience in real-world networks network resilience

More information

ECS 289 / MAE 298, Lecture 15 Mar 2, Diffusion, Cascades and Influence, Part II

ECS 289 / MAE 298, Lecture 15 Mar 2, Diffusion, Cascades and Influence, Part II ECS 289 / MAE 298, Lecture 15 Mar 2, 2011 Diffusion, Cascades and Influence, Part II Diffusion and cascades in networks (Nodes in one of two states) Viruses (human and computer) contact processes epidemic

More information

Network community detection with edge classifiers trained on LFR graphs

Network community detection with edge classifiers trained on LFR graphs Network community detection with edge classifiers trained on LFR graphs Twan van Laarhoven and Elena Marchiori Department of Computer Science, Radboud University Nijmegen, The Netherlands Abstract. Graphs

More information

Basic Network Concepts

Basic Network Concepts Basic Network Concepts Basic Vocabulary Alice Graph Network Edges Links Nodes Vertices Chuck Bob Edges Alice Chuck Bob Edge Weights Alice Chuck Bob Apollo 13 Movie Network Main Actors in Apollo 13 the

More information

Link Prediction in Graph Streams

Link Prediction in Graph Streams Peixiang Zhao, Charu C. Aggarwal, and Gewen He Florida State University IBM T J Watson Research Center Link Prediction in Graph Streams ICDE Conference, 2016 Graph Streams Graph Streams arise in a wide

More information

Balanced and partitionable signed graphs

Balanced and partitionable signed graphs A. Mrvar: Balanced and partitionable signed graphs 1 Balanced and partitionable signed graphs Signed graphs Signed graph is an ordered pair (G,σ), where: G = (V,L) is a graph with a set of vertices V and

More information

Preliminary results from an agent-based adaptation of friendship games

Preliminary results from an agent-based adaptation of friendship games Preliminary results from an agent-based adaptation of friendship games David S. Dixon June 29, 2011 This paper presents agent-based model (ABM) equivalents of friendshipbased games and compares the results

More information

Exponential Random Graph Models for Social Networks

Exponential Random Graph Models for Social Networks Exponential Random Graph Models for Social Networks ERGM Introduction Martina Morris Departments of Sociology, Statistics University of Washington Departments of Sociology, Statistics, and EECS, and Institute

More information

On Compressing Social Networks. Ravi Kumar. Yahoo! Research, Sunnyvale, CA. Jun 30, 2009 KDD 1

On Compressing Social Networks. Ravi Kumar. Yahoo! Research, Sunnyvale, CA. Jun 30, 2009 KDD 1 On Compressing Social Networks Ravi Kumar Yahoo! Research, Sunnyvale, CA KDD 1 Joint work with Flavio Chierichetti, University of Rome Silvio Lattanzi, University of Rome Michael Mitzenmacher, Harvard

More information

Applied Statistics for Neuroscientists Part IIa: Machine Learning

Applied Statistics for Neuroscientists Part IIa: Machine Learning Applied Statistics for Neuroscientists Part IIa: Machine Learning Dr. Seyed-Ahmad Ahmadi 04.04.2017 16.11.2017 Outline Machine Learning Difference between statistics and machine learning Modeling the problem

More information

Problems 1 and 5 were graded by Amin Sorkhei, Problems 2 and 3 by Johannes Verwijnen and Problem 4 by Jyrki Kivinen. Entropy(D) = Gini(D) = 1

Problems 1 and 5 were graded by Amin Sorkhei, Problems 2 and 3 by Johannes Verwijnen and Problem 4 by Jyrki Kivinen. Entropy(D) = Gini(D) = 1 Problems and were graded by Amin Sorkhei, Problems and 3 by Johannes Verwijnen and Problem by Jyrki Kivinen.. [ points] (a) Gini index and Entropy are impurity measures which can be used in order to measure

More information

Machine Learning in Python. Rohith Mohan GradQuant Spring 2018

Machine Learning in Python. Rohith Mohan GradQuant Spring 2018 Machine Learning in Python Rohith Mohan GradQuant Spring 2018 What is Machine Learning? https://twitter.com/myusuf3/status/995425049170489344 Traditional Programming Data Computer Program Output Getting

More information

Recommendation System for Location-based Social Network CS224W Project Report

Recommendation System for Location-based Social Network CS224W Project Report Recommendation System for Location-based Social Network CS224W Project Report Group 42, Yiying Cheng, Yangru Fang, Yongqing Yuan 1 Introduction With the rapid development of mobile devices and wireless

More information

This research aims to present a new way of visualizing multi-dimensional data using generalized scatterplots by sensitivity coefficients to highlight

This research aims to present a new way of visualizing multi-dimensional data using generalized scatterplots by sensitivity coefficients to highlight This research aims to present a new way of visualizing multi-dimensional data using generalized scatterplots by sensitivity coefficients to highlight local variation of one variable with respect to another.

More information

Density estimation. In density estimation problems, we are given a random from an unknown density. Our objective is to estimate

Density estimation. In density estimation problems, we are given a random from an unknown density. Our objective is to estimate Density estimation In density estimation problems, we are given a random sample from an unknown density Our objective is to estimate? Applications Classification If we estimate the density for each class,

More information