Structure of Social Networks

Similar documents
CSE 158 Lecture 11. Web Mining and Recommender Systems. Triadic closure; strong & weak ties

CS224W: Analysis of Networks Jure Leskovec, Stanford University

CSE 255 Lecture 13. Data Mining and Predictive Analytics. Triadic closure; strong & weak ties

CSE 190 Lecture 16. Data Mining and Predictive Analytics. Small-world phenomena

CSE 158 Lecture 13. Web Mining and Recommender Systems. Triadic closure; strong & weak ties

Social Networks, Social Interaction, and Outcomes

THE KNOWLEDGE MANAGEMENT STRATEGY IN ORGANIZATIONS. Summer semester, 2016/2017

Economics of Information Networks

Web 2.0 Social Data Analysis

Behavioral Data Mining. Lecture 9 Modeling People

Graph and Link Mining

How to explore big networks? Question: Perform a random walk on G. What is the average node degree among visited nodes, if avg degree in G is 200?

Math 443/543 Graph Theory Notes 10: Small world phenomenon and decentralized search

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University

Recap! CMSC 498J: Social Media Computing. Department of Computer Science University of Maryland Spring Hadi Amiri

Degree Distribution: The case of Citation Networks

Distribution-Free Models of Social and Information Networks

RANDOM-REAL NETWORKS

Extracting Information from Complex Networks

Navigation in Networks. Networked Life NETS 112 Fall 2017 Prof. Michael Kearns

Overview of the Stateof-the-Art. Networks. Evolution of social network studies

Tie strength, social capital, betweenness and homophily. Rik Sarkar

Example 1: An algorithmic view of the small world phenomenon

Tie-Strength and Strategies in Social Capital Management

Algorithms and Applications in Social Networks. 2017/2018, Semester B Slava Novgorodov

CSE 258 Lecture 12. Web Mining and Recommender Systems. Social networks

Basic Network Concepts

CSE 158 Lecture 11. Web Mining and Recommender Systems. Social networks

Graph Theory and Network Measurment

Information Networks and Semantics. Srinath Srinivasa IIIT Bangalore

Lecture 22 Tuesday, April 10

Network Mathematics - Why is it a Small World? Oskar Sandberg

This is not the chapter you re looking for [handwave]

Algorithms, Games, and Networks February 21, Lecture 12

MAE 298, Lecture 9 April 30, Web search and decentralized search on small-worlds

Trust Models for Expertise Discovery in Social Networks. September 01, Table of Contents

Social, Information, and Routing Networks: Models, Algorithms, and Strategic Behavior

1 Degree Distributions

Lecture #3: PageRank Algorithm The Mathematics of Google Search

Big Data Analytics CSCI 4030

Random graph models with fixed degree sequences: choices, consequences and irreducibilty proofs for sampling

TELCOM2125: Network Science and Analysis

Network Thinking. Complexity: A Guided Tour, Chapters 15-16

CS 322: (Social and Information) Network Analysis Jure Leskovec Stanford University

Ma/CS 6b Class 10: Ramsey Theory

Non Overlapping Communities

CIP- OSN Online Social Networks as Graphs. Dr. Thanassis Tiropanis -

Network Community Detection

Lecture 6: Graph Properties

Graph Theory S 1 I 2 I 1 S 2 I 1 I 2

Ma/CS 6b Class 12: Ramsey Theory

Transitivity and Triads

Social Network Analysis

Chapter 1. Social Media and Social Computing. October 2012 Youn-Hee Han

CS 124/LINGUIST 180 From Languages to Information. Social Networks: Small Worlds, Weak Ties, and Power Laws. Dan Jurafsky Stanford University

Sample questions with solutions Ekaterina Kochmar

Notebook Assignments

Demystifying movie ratings 224W Project Report. Amritha Raghunath Vignesh Ganapathi Subramanian

(Social) Networks Analysis III. Prof. Dr. Daning Hu Department of Informatics University of Zurich


Introduction to network metrics

CS6200 Information Retreival. The WebGraph. July 13, 2015

Social and Information Network Analysis Positive and Negative Relationships

Today. Types of graphs. Complete Graphs. Trees. Hypercubes.

5 Graphs

Online Social Networks and Media. Positive and Negative Edges Strong and Weak Edges Strong Triadic Closure

Lesson 18. Laura Ricci 08/05/2017

Copyright 2000, Kevin Wayne 1

TELCOM2125: Network Science and Analysis

Exponential Random Graph Models for Social Networks

1 Counting triangles and cliques

Social-Network Graphs

CS103 Handout 13 Fall 2012 May 4, 2012 Problem Set 5

Math 170- Graph Theory Notes

A Tutorial on the Probabilistic Framework for Centrality Analysis, Community Detection and Clustering

Graph Algorithms. Revised based on the slides by Ruoming Kent State

Lecture 19 Thursday, March 29. Examples of isomorphic, and non-isomorphic graphs will be given in class.

Social Networks. Slides by : I. Koutsopoulos (AUEB), Source:L. Adamic, SN Analysis, Coursera course

Getting started with social media and comping

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

Graph Theory Part Two

Centralities (4) By: Ralucca Gera, NPS. Excellence Through Knowledge

Grades 7 & 8, Math Circles 31 October/1/2 November, Graph Theory

Graphs / Networks. CSE 6242/ CX 4242 Feb 18, Centrality measures, algorithms, interactive applications. Duen Horng (Polo) Chau Georgia Tech

Lecture 17 November 7

Recitation 4: Elimination algorithm, reconstituted graph, triangulation

M.E.J. Newman: Models of the Small World

CENTRALITIES. Carlo PICCARDI. DEIB - Department of Electronics, Information and Bioengineering Politecnico di Milano, Italy

Algorithmic and Economic Aspects of Networks. Nicole Immorlica

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

Introduction to Networks and Business Intelligence

Distributed Network Routing Algorithms Table for Small World Networks

An Introduction to Graph Theory

1. a graph G = (V (G), E(G)) consists of a set V (G) of vertices, and a set E(G) of edges (edges are pairs of elements of V (G))

Absorbing Random walks Coverage

Instant Insanity Instructor s Guide Make-it and Take-it Kit for AMTNYS 2006

Computational Complexity and Implications for Security DRAFT Notes on Infeasible Computation for MA/CS 109 Leo Reyzin with the help of Nick Benes

The Role of Homophily and Popularity in Informed Decentralized Search

Link Prediction for Social Network

Absorbing Random walks Coverage

Transcription:

Structure of Social Networks

Outline Structure of social networks Applications of structural analysis

Social *networks*

Twitter Facebook Linked-in IMs Email Real life Address books... Who

Twitter #numbers 140 characters 140 million active users 340 million tweets per day > 15 billion connections

The Twitter Graph Note: NOT (just) a social network Asymmetric, follow relationship VERY skewed graph But very valuable interest graph

Part I: Network Structure

What can networks tell us? The strength of weak ties [Granovetter 73] How do people find new jobs? Friends and acquaintances Surprising fact: discovery is enabled by weak ties

Strength of weak ties Definition: a bridge in a graph is an edge whose removal disconnects the endpoints.

Strength of weak ties Definition: a local bridge in a graph is an edge whose endpoints have no common neighbor.

Triadic closure

Strong Triadic closure Strong Triadic Closure Property: if the node has strong ties to two neighbors, then these neighbors must have at least a weak tie between them.

Strength of Weak Ties Claim: If a node A in a network satisfies the Strong Triadic Closure Property and is involved in at least two strong ties, then any local bridge it is involved in must be a weak tie. Consequence: all local bridges are weak ties!

Strength of Weak Ties Claim: If a node A in a network satisfies the Strong Triadic Closure Property and is involved in at least two strong ties, then any local bridge it is involved in must be a weak tie. Proof: by contradiction

Strength of Weak Ties Discovery is enabled by weak ties Surprising strength of weak ties! Simple structural model explains this cleanly Similar observations for Twitter/Facebook

More network insights Zachary s karate club and graph partitioning

Six degrees of separation Milgram s experiment: How are people connected? Letters given to people in Omaha Target is a person in Boston Rule: can only forward letters to people *you know* (<== social network) How many hops did it take?

Milgram s experiment

Six degrees of separation Bacon number game Short paths exist, and abound!

How? The world seems to be small But how does this happen?

How?

Watts-Strogatz model Each node forms 2 links: All nodes within r hops k nodes picked uniformly at random

Watts-Strogatz model Property: short paths exist between everyone! Intuition: random links exponentiate due to lack of triangles

Decentralized search Is Milgram s experiment now explained? Watts-Strogatz: short paths exist But how can we possibly navigate?

Milgram s experiment

Watts-Strogatz revisited Navigate using only local information Impossible to navigate in this world! So does not explain Milgram s experiment Intuition: random links are too random

Kleinberg s model Slight change: given exponent q, for nodes u, v create a link with probability (, )

Kleinberg s model

Kleinberg s model Some intuition:

Network structure Understanding structure affords deep insights Interplay between sociology and graph theory

Part II: Applications

Part II: Applications Recommendations

Graph Engine: Cassovary In-memory computation No compression! Adjacency list format Open source: https://github.com/twitter/cassovary

Link Prediction General algorithms: Triadic closure SALSA Most popular in a country Popular movie stars, pop stars, etc Personalized algorithms Personalized pagerank Data: Social graph + usage (relationship strength, interests)

The Long Tail

Pagerank Start with equal weights on every node Distribute weight equally among outgoing edges

Pagerank 1/8 1/8 1/8 1/8 1/8 1/8 1/8 1/8

Pagerank 1/16 + 1/16 + 1/8 + 1/8 + 1/8 = 1/2 1/16 1/16 1/16 1/16 1/16 1/16 1/8

Pagerank

Pagerank Alternate matrix formulation: Adjacency matrix A

Pagerank Alternate matrix formulation: Adjacency matrix A (normalized)

Pagerank Matrix view: Convergence via Eigenvectors:

Personalized-pagerank A personalized version of pagerank Reset to source u with probability

Personalized Pagerank Equivalent random walk view Easier to simulate via Monte-carlo

HITS and SALSA

More link prediction Pagerank -> personalized pagerank HITS -> SALSA Many other variants Similar users Simrank

Scale A cautionary tale about scaling Doing random walks Weighted random walks Map-Reduce

Uniform Random Walk The key sampling routine: For a user, find a random following/follower, i.e. pick uniformly at random from [,,... ] Budget per call: 20ms / 10K steps = 2000 ns /step Main memory reference: 100 ns Needs a call to random(), so just within budget

Weighted Random Walk But what if there are weights: For a user, find a random following/follower from with weights [,,... ] [,,... ] n could be very large!

Weighted Random Walk Many simple ideas don t work: Pre-process + binary search: [ > >...> ] Pay (log ) for every step Cost for Obama: 20 comparisons + lookups Recall: lookup is 100 ns, so out of budget :(

Weighted Random Walk We can do a lot better! In fact, we can do it in ( ) lookups 2 random calls + 2 memory lookups! Walker, Alastair J. (1974a), Fast Generation of Uniformly Distributed Pseudorandom Numbers With Floating Point Representation, Electronics Letters, 10, 553-554.

The Walker Method

The Walker Method =

The Walker Method Central idea: Equalize weights to

The Walker Method Central idea: Equalize weights to (can be transitive)

The Walker Method Key observation: Never need to borrow weight from more than one

The Walker Method Lemma: Can equalize weights among any set of n elements such that each bin has at most two elements. Proof by induction: - Easy for two - Suppose true for any set of n elements - Pick an element below average and transfer weight from someone above average (now the latter can be below average) - Now we are back to the n case!

The Walker Method For each i, we have: (,, ( ))

The Walker Method Algorithm: 1. Go to a random 2. Pick a random 3. If ( < ) else return return ( )

The Walker Method 2 random calls + 2 memory lookups! Are we done? Engineering: (,, ( )) (Int, Double, Int) = (4+8+4)*15*10^9 ~ 240GB Oops!

Summary Network structure is information rich Lots of practical applications Many algorithmic challenges

Thanks for listening!