Maximizing the Spread of Influence through a Social Network

Maximizing the Spread of Influence through a Social Network
By David Kempe, Jon Kleinberg, Eva Tardos
Report by Joe Abrams

Social Networks

Infectious disease networks

Viral Marketing

Viral Marketing Example: Hotmail
- Included the service's URL in every email sent by users
- Grew from zero to 12 million users in 18 months with a small advertising budget

Domingos and Richardson (2001, 2002)
- Introduction to maximization of influence over social networks
- Intrinsic value vs. network value
- Expected Lift in Profit (ELP)
- Data: Epinions "web of trust", 75,000 users and 500,000 edges

Domingos and Richardson (2001, 2002)
- Viral marketing (using a greedy hill-climbing strategy) worked very well compared with direct marketing
- Robust: 69% of the total lift was achieved knowing only 5% of the edges

Diffusion Model: Linear Threshold Model
- Each node (consumer) is influenced by its set of neighbors and has a threshold θ drawn from the uniform distribution on [0, 1]
- Edges are weighted
- When the combined influence of a node's active neighbors reaches its threshold, the node becomes active
- An active node can then influence its own neighbors
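The activation rule above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the function name `linear_threshold`, the `in_weights` dictionary layout, and the optional `thresholds` argument (useful for reproducible runs) are assumptions made for this example.

```python
import random

def linear_threshold(in_weights, seeds, thresholds=None, rng=None):
    """Run the Linear Threshold process once and return the final active set.

    in_weights[v] maps each in-neighbor u of v to the weight w(u, v).
    Every node must appear as a key; the weights into a node are assumed
    to sum to at most 1 (hypothetical input format for this sketch).
    """
    rng = rng or random.Random()
    if thresholds is None:
        # Each node draws its threshold uniformly at random from [0, 1).
        thresholds = {v: rng.random() for v in in_weights}
    active = set(seeds)
    changed = True
    while changed:  # repeat until no further node can be activated
        changed = False
        for v, nbrs in in_weights.items():
            if v in active:
                continue
            # Combined influence of the neighbors that are already active.
            influence = sum(w for u, w in nbrs.items() if u in active)
            if influence >= thresholds[v]:
                active.add(v)
                changed = True
    return active
```

For example, with weights `{"a": {}, "b": {"a": 1.0}, "c": {"b": 0.5}}` and seed set `{"a"}`, node `b` always activates, while `c` activates only if its threshold is at most 0.5.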

Diffusion Model: Independent Cascade Model
- Each active node has a probability p of activating each inactive neighbor
- At time t+1, all nodes newly activated at time t try to activate their neighbors
- Each node gets only one attempt per target
- Akin to a turn-based strategy game?
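One round-by-round reading of the cascade rule can be sketched as follows. The function name `independent_cascade` and the adjacency-dictionary input are assumptions for illustration; a single probability `p` is used for every edge, as on the slide.

```python
import random

def independent_cascade(graph, seeds, p, rng=None):
    """Run the Independent Cascade process once and return the active set.

    graph[u] lists the neighbors that u may try to activate
    (hypothetical input format for this sketch).
    """
    rng = rng or random.Random()
    active = set(seeds)
    frontier = list(active)  # nodes activated in the previous round
    while frontier:
        next_frontier = []
        for u in frontier:
            # u sits in the frontier exactly once, so it gets a single
            # attempt on each neighbor that is inactive at that moment.
            for v in graph.get(u, ()):
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active
```

With `p = 1.0` the result is simply the set of nodes reachable from the seeds, and with `p = 0.0` it is the seed set itself, which makes two convenient sanity checks.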

Influence Maximization
- Using a greedy hill-climbing strategy, can approximate the optimum to within a factor of (1 - 1/e - ε), or ~63%
- Proven using the theory of submodular functions (diminishing returns)
- Applies to both diffusion models
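A minimal sketch of greedy hill-climbing, assuming the expected spread is estimated by averaging repeated Independent Cascade simulations: at each step, add the node with the largest estimated marginal gain. The helper names (`simulate_ic`, `estimate_spread`, `greedy_seeds`) and the default parameters are hypothetical and chosen only for this example; the slides do not give the procedure in this form.

```python
import random

def simulate_ic(graph, seeds, p, rng):
    """One Independent Cascade run; returns the number of active nodes."""
    active, frontier = set(seeds), list(seeds)
    while frontier:
        nxt = []
        for u in frontier:
            for v in graph.get(u, ()):
                if v not in active and rng.random() < p:
                    active.add(v)
                    nxt.append(v)
        frontier = nxt
    return len(active)

def estimate_spread(graph, seeds, p, runs, rng):
    """Monte Carlo estimate of the expected spread of a seed list."""
    return sum(simulate_ic(graph, seeds, p, rng) for _ in range(runs)) / runs

def greedy_seeds(graph, k, p=0.1, runs=200, seed=0):
    """Pick up to k seeds, each time adding the best marginal gain."""
    rng = random.Random(seed)
    nodes = set(graph) | {v for nbrs in graph.values() for v in nbrs}
    chosen = []
    for _ in range(min(k, len(nodes))):
        base = estimate_spread(graph, chosen, p, runs, rng) if chosen else 0.0
        best_node, best_gain = None, float("-inf")
        # Sorted order makes ties deterministic (nodes must be comparable).
        for v in sorted(nodes - set(chosen)):
            gain = estimate_spread(graph, chosen + [v], p, runs, rng) - base
            if gain > best_gain:
                best_node, best_gain = v, gain
        chosen.append(best_node)
    return chosen
```

The ε in the guarantee reflects that the spread is only estimated; more simulation runs give a tighter estimate at a higher cost.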

Testing on network data
- Co-authorship network
- High-energy physics theory section of www.arxiv.org
- 10,748 nodes (authors) and ~53,000 edges
- Multiple co-authored papers are listed as parallel edges (greater weight)

Testing on network data
- Linear Threshold: influence weighted by the number of parallel edges and inversely weighted by the degree of the target node: w(u,v) = c_{u,v} / d_v
- Independent Cascade: p set at 1% and 10%; total probability for u → v is 1 - (1 - p)^{c_{u,v}}
- Weighted Cascade: p = 1 / d_v
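The three parameter settings above can be computed directly from a list of co-authorship pairs. The sketch below assumes one list entry per co-authored paper (so repeated pairs are parallel edges) and that the degree d_v counts parallel edges; the function name `edge_parameters` and the dictionary keys are hypothetical.

```python
from collections import Counter

def edge_parameters(coauthor_pairs, p=0.01):
    """Return per-direction parameters for every connected pair (u, v)."""
    counts = Counter()  # c[u, v]: number of parallel edges between u and v
    degree = Counter()  # d[v]: degree of v, counting parallel edges
    for u, v in coauthor_pairs:
        counts[(u, v)] += 1
        counts[(v, u)] += 1
        degree[u] += 1
        degree[v] += 1
    params = {}
    for (u, v), c in counts.items():
        params[(u, v)] = {
            # Linear Threshold weight: w(u,v) = c_{u,v} / d_v
            "lt_weight": c / degree[v],
            # Independent Cascade: chance that at least one of the
            # c parallel edges succeeds, 1 - (1 - p)^c
            "ic_prob": 1 - (1 - p) ** c,
            # Weighted Cascade: per-edge probability 1 / d_v
            "wc_prob": 1 / degree[v],
        }
    return params
```

With this choice of degree, the Linear Threshold weights into any node sum to 1, which is what the threshold rule needs.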

Algorithms
- Greedy hill-climbing
- High degree: nodes with the greatest number of edges
- Distance centrality: nodes with the lowest average distance to other nodes
- Random
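The two structural baselines can be sketched as below, assuming an undirected graph stored as an adjacency dictionary. The function names are hypothetical, and the choice to charge unreachable nodes a distance of n (the number of nodes) is an assumption of this sketch.

```python
from collections import deque

def high_degree_seeds(adj, k):
    """The k nodes with the most edges (ties broken by node name)."""
    return sorted(adj, key=lambda v: (-len(adj[v]), v))[:k]

def average_distance(adj, source):
    """Average shortest-path distance from source to all other nodes."""
    n = len(adj)
    dist = {source: 0}
    queue = deque([source])
    while queue:  # breadth-first search from the source
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    # Assumption: each unreachable node counts as being at distance n.
    total = sum(dist.values()) + (n - len(dist)) * n
    return total / (n - 1) if n > 1 else 0.0

def distance_central_seeds(adj, k):
    """The k nodes with the lowest average distance to other nodes."""
    return sorted(adj, key=lambda v: (average_distance(adj, v), v))[:k]
```

On a five-node path a-b-c-d-e, the distance-centrality baseline picks the middle node `c`, while the high-degree baseline cannot distinguish `b`, `c`, and `d`.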

Results: Linear Threshold Model
- Greedy: ~40% better than distance centrality, ~18% better than high degree

Results: Weighted Cascade Model

Results: Independent Cascade, p = 1%

Results: Independent Cascade, p = 10%

Advantages of Random Selection

Generalized models
- Generalized Linear Threshold: for node v, the influence of its neighbors is not necessarily the sum of their individual influences
- Generalized Independent Cascade: for node v, the probability p depends on the set of v's neighbors that have previously tried to activate v
- The two models are computationally equivalent; an approximation cannot be guaranteed

Non-Progressive Threshold Model
- Active nodes can become inactive
- Similar concept: at each time t, whether v becomes or stays active depends on whether the influence on it meets its threshold
- Can intervene at different times; need not perform all interventions at t = 0
- The non-progressive model on graph G is equivalent to the progressive model on the layered graph G^τ
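One way to picture the layered graph G^τ is to make a copy of every node for each time step and connect each copy to its neighbors' copies in the next step. The sketch below assumes exactly that construction, with time steps numbered 0 to τ; the function name `layered_graph` and the `(node, t)` labelling are hypothetical.

```python
def layered_graph(edges, tau):
    """Build the edges of a time-layered copy of a directed graph.

    edges is a list of (u, v) pairs meaning "u influences v".
    Each returned edge links the copy of u at step t - 1 to the copy
    of v at step t, for t = 1 .. tau.
    """
    layered = []
    for t in range(1, tau + 1):
        for u, v in edges:
            layered.append(((u, t - 1), (v, t)))
    return layered
```

For a single edge `("a", "b")` and `tau = 2`, this yields the edges `(a, 0) → (b, 1)` and `(a, 1) → (b, 2)`. Seeding the copy `(v, t)` then corresponds to intervening on v at time t.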

General Marketing Strategies
- Can divide the total budget κ into equal increments of size δ
- For the greedy hill-climbing strategy, can guarantee performance within a factor of 1 - e^{-(κ·γ)/(κ + δ·n)}
- As δ decreases relative to κ, the result approaches 1 - e^{-1} ≈ 63%
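A quick numeric check shows how the bound behaves as the increment δ shrinks. The function name `marketing_guarantee` is hypothetical. The slides do not define γ and n, so this sketch treats them as plain inputs, assuming n is the number of nodes and defaulting γ to 1.

```python
import math

def marketing_guarantee(kappa, delta, n, gamma=1.0):
    """Evaluate 1 - exp(-(kappa * gamma) / (kappa + delta * n))."""
    return 1 - math.exp(-(kappa * gamma) / (kappa + delta * n))

# Budget kappa = 10 on n = 100 nodes, with ever smaller increments delta.
for delta in (0.1, 0.01, 0.001, 0.0):
    print(delta, round(marketing_guarantee(10, delta, 100), 4))
```

With δ = 0.1 the exponent is -10/20, so the factor is 1 - e^{-0.5}, roughly 0.39. With δ = 0 it is exactly 1 - e^{-1}, roughly 0.63.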

Strengths of paper
- Showed results in two complementary fashions: theoretical models and test results using a real dataset
- Demonstrated that the greedy hill-climbing strategy can guarantee results of at least ~63% of the optimum
- Used specific and generalized versions of two different diffusion models

Weaknesses of paper
- Doesn't fully explain the methodology of the greedy hill-climbing strategy
- Much of the work is not shown; the paper simply refers to work done in other papers
- Threshold value uniformly distributed?
- Influence inversely weighted by degree of target?

Questions?