Sampling Large Graphs: Algorithms and Applications
|
|
- Beatrix Patterson
- 5 years ago
- Views:
Transcription
1 Sampling Large Graphs: Algorithms and Applications Don Towsley Umass - Amherst Joint work with P.H. Wang, J.Z. Zhou, J.C.S. Lui, X. Guan
2 Measuring, Analyzing Large Networks - large networks can be represented by graphs - Facebook - WWW - Twitter - Ebay 300 million 1 Billion 50 Billion 233 Million Curse of data dimensionality!!!
3 Challenges in measurement: Information Distortion World Map in 1459 incomplete (Columbus et al. 1492) (Australia 17 th century) wrong proportions (Africa & Asia)
4 Why do we want to understand these networks? High school friendship Want to understand or find out network how did these networks evolve? who are the influential users? how does influence propagate? distribution of preference of users? communities in these networks?.etc.
5 Goals and Challenges Goals generate statistically valid characterization of network structure node pairs in this work Challenges large network sizes correcting for biases
6 How to measure: sampling algorithms Random sampling (uniform & independent) Vertex sampling Crawling Breadth First sampling (BFS) Edge sampling Random walk sampling
7 How to measure: sampling algorithms Random sampling (uniform & independent) Vertex sampling Crawling Breadth First sampling (BFS) Edge sampling Random walk sampling
8 How to measure: sampling algorithms Random sampling (uniform & independent) Vertex sampling Crawling Breadth First sampling (BFS) Edge sampling Random walk sampling
9 How to measure: sampling algorithms Random sampling (uniform & independent) Vertex sampling Crawling Breadth First sampling (BFS) Edge sampling Random walk sampling
10 How to measure: sampling algorithms Random sampling (uniform & independent) Vertex sampling Crawling Breadth First sampling (BFS) Edge sampling Random walk sampling
11 Focus of this talk Measure node pair statistics: important for many applications! 12
12 Classification of Node Pairs Classify node pair [u, v] using shortest path 1-hop node pair class if distance(u,v) = 1 2-hop node pair class if distance(u,v) = 2 13
13 Similarity Function: Homophily Homophily: tendency of users to connect to others with common interests. P. Singla and M. Richardson. Yes, there is a correlation: from social networks to personal behavior on the web. In WWW 2008 (MSN) Can infer characteristics and make recommendations Compare homophily(u,v) between different node pair classes 14
14 Similarity Function: Proximity Proximity(u,v): closeness of u and v; defined as functions such as number of common neighbors of u and v u v knowing proximity distribution of node pairs important for friendship prediction interest recommendation 15
15 Similarity Function: Distance Distance(u,v): length of shortest path between u and v in graph measure distance distribution of all node pairs to calculate average distance Twitter: 4.1 MSN: 6.6 effective diameter (the 90th percentile of all distances) small world 16
16 Problem Formulation undirected graph G = (V, E) measure characteristics of node pairs in following sets: all pairs - S = { u, v : u, v V, u v} one-hop pairs - pairs of connected nodes S (1) = u, v : u, v E two-hop pairs - pairs of nodes with at least one common neighbor S (2) = { u, v : u, v V, u v; x V st x, u, x, v E} 17
17 Problem Formulation F(u, v) - value of node pair property under study, e.g., # of common neighbors of u, v {a 1,, a K } - range of F u, v distribution of F u, v S: (ω 1,, ω K ) S (1) : (ω 1 (1),, ωk (1) ) S (2) : (ω 1 2,, ω K 2 ) ω k, ω k (1), ωk (2) - fractions of node pairs in S, S 1, S (2) with property F u, v = a k 18
18 Challenges OSNs large Facebook, Google+, Twitter, Renren, Sina Weibo,, V > 500 million users huge number of node pairs, V 2 > topology not available sampling required UVS (Uniform Vertex Sampling): sampling bias fors (1), S (2). Sometimes UVS not allowed crawling methods, such as breadth first search, and random walk (RW): sampling bias need to construct unbiased estimator 19
19 Node Pair Sampling Based on UVS Basic sampling techniques UVS: sample nodes from V uniformly weighted vertex sampling (WVS): sample nodes from V with desired probability distribution (π x : x V) independent WVS (IWVS (if we know topology) Metropolis-Hastings WVS (MHWVS): At each step, MHWVS selects a node v using UVS and then accepts the move with probability min(π v /π u, 1), otherwise remains at u. 20
20 Node Pair Sampling Based on UVS All pairs S Sampling method: use UVS to select two different nodes u and v uniformly at random Estimator: given sampled node pairs u i, v i, i = 1,, n n ω k = 1 1(F u n i, v i i=1 = a k ), k = 1,, K Accuracy (unbiased) E ω k = ω k and Var ω k = ω k ω k 2 n, k = 1,, K 21
21 Node Pair Sampling Based on UVS One hop pairs S (1) Sampling node pair [u, v] 1) sample node u according to probability distribution (π u 1 : u V), where d u π u 1 - degree of node u = d u 2 E 2) select neighbor v at random Each [u, v] sampled uniformly from S (1) 22
22 Node Pair Sampling Based on UVS One hop pairs S (1) Estimator: given sampled node pairs u i, v i, i = 1,, n ω k (1) = 1 n n i=1 Accuracy (unbiased) 1(F u i, v i = a k ), k = 1,, K E ω k (1) = ω k (1) and Var ω k (1) = ω (1) 1 2 k ωk n, k = 1,, K 23
23 Node Pair Sampling Based on UVS Two hop pairs S (2) Sampling node pair u, v 1) sample a node x according to probability distribution (π x 2 : x V) π x 2 = d x d x 1 M M - normalization constant 2) then select two neighbors u and v of node x at random 24
24 Node Pair Sampling Based on UVS Two hop pairs S (2) Sample (u, v) in S (2) with probability (2) π [u,v] = m(u, v)/m m(u, v) - number of neighbors common to u, v Estimator: given node pairs u i, v i, i = 1,, K ω k (2 ) = 1 H n 1(F u i, v i = a k ) i=1 m(u i, v i ), k = 1,, K H normalization constant 25
25 Node Pair Sampling Based on UVS Two hop pairs S (2) Accuracy: ω k (2) - asymptotically unbiased estimator of ω k (2), k = 1,, K. Meanwhile, P ω k 2 ω k 2 2εω 2 k 1 ε 1 1 nε 2 m ω k 2 + m 2 where 0 < ε < 1 and m = M/ S 2 26
26 Node Pair Sampling Based on RW Why? UVS not available, too costly. API not provided user IDs sparsely distributed Only crawling techniques can be used Breadth first search: unknown sampling bias Random walk: Walker moves to random neighbor, samples its information. For connected and non-bipartite graph, probability of RW being at node v converges to π v = d v 2 E, v V 27
27 Node Pair Sampling Based on RW All pairs S Sample node pair u i, v i by two independent RWs, where u i, v i are nodes sampled by two RWs at step i Node pair [u,v] sampled according to stationary distribution π [u,v] = d ud v, u, v V 4 E 2 28
28 Node Pair Sampling Based on RW All pairs S Estimator: given sampled node pairs u i, v i, i = 1,, n ω k = 1 J n 1(F u i, v i = a k )1(u i v i ) i=1 d ui d vi, k = 1,, K J normalization constant Accuracy: ω k, k = 1,, K is asymptotically unbiased estimate of ω k 29
29 Node Pair Sampling Based on RW One hop pairs S (1) Sampling method: random node pair [u i, v i ] sampled by RW u i, v i are nodes sampled at steps i and i + 1 Probability of RW sampling node pairs in S (1) are equal when RW reaches the steady state. 30
30 Node Pair Sampling Based on RW One hop pairs S (1) Estimator: given sampled node pairs u i, v i, i = 1,, K ω k (1 ) = 1 n n i=1 1(F u i, v i = a k ), k = 1,, K Accuracy: ω k (1 ) asymptotically unbiased estimate of ω k (1), 1 k K 31
31 Node Pair Sampling Based on RW Two hop pairs S (2) Neighborhood RW (NRW) current edge (u, v) next edge: select a random edge from edges connected to u or v, except edge (u, v) 32
32 Node Pair Sampling Based on RW Two hop pairs S (2) Select node pair by excluding same node in two edges consequently sampled by NRW. For example, output node pair [u, v] for two consequently sampled edges (u, w) and (w, v) Probability NRW samples node pair [u,v] in S (2) converges to (2) π [u,v] = m(u, v)/m m(u, v) - number neighbors common to u, v 33
33 Node Pair Sampling Based on RW Two hop pairs S (2) Estimator: given sampled node pairs u i, v i, i = 1,, n ω k (2 ) = 1 H n 1(F u i, v i = a k ) i=1 m(u i, v i ) where H = n i=1 1/m(u i, v i ) Accuracy: ω k (2 ) asymptotically unbiased estimate of ω k (2), 1 k K 34
34 Simulations: Distance Distribution Estimation B - number of sampled node pairs S - total number of node pairs error metric - NMSE ω k = E ω k ω k 2 ω k, k = 1,, K Gnutella - V 6300 B >.005 S, NMSE < 1 NMSE proportional to inverse of sqr() 35
35 Simulations: Mutual Neighbor Count Distribution in S (1) B = 0.01 S 1 node pairs sampled from S 1 36
36 Simulations: Mutual Neighbor Count Distribution in S (2) B = 0.01 S 2 node pairs sampled from S 2 37
37 Simulations: Interest Distribution Scheme Distribute interests over graph: 10 5 distinct interests: number nodes per interest ~ truncated Pareto distribution over {1,,10 3 } To distribute an interest possessed by k different nodes 1. select a random node v that can reach at least k- 1 different nodes 2. distribute interest to node v and closest k-1 nodes connected to v 38
38 Simulations: Common Interest Count Distribution for Generated Content CCDF of common interest count distribution for S, S (1), and S (2). Wiki-vote P2P-Gnutella 39
39 Simulations: Common Interest Count Distribution in S B = 0.01 S node pairs sampled from S Wiki-vote P2P-Gnutella 40
40 Simulations: Common Interest Count Distribution in S (1) B = 0.01 S (1) node pairs sampled from S (1) Wiki-vote P2P-Gnutella RW ~ IWVS > MHWVS => no topology 41
41 Simulations: Common Interest Count Distribution in S (2) B = 0.01 S 2 node pairs sampled from S 2 Wiki-vote P2P-Gnutella 42
42 Conclusions use sampling to estimate pair characteristics in sets S, S (1), and S (2). sampling methods based on independent vertex sampling and random walk produce asymptotically unbiased estimates validated approaches on wide range of graphs 43
43 Conclusions Markov Chain Mixing Times Other more powerful & elegant sampling methods: Frontier Sampling Efficiently Estimating Motif Statistics of Large Networks in the Dark. TKDD 2014 Design of Efficient Sampling Methods on Hybrid Social- Affiliation Networks. Accepted in IEEE ICDE 15 measuring, maximizing group closeness centrality over diskresident graphs. WWW 14 44
44 Thanks! Slides (will be) at r-sampling.pdf
Sampling Large Graphs: Algorithms and Applications
Sampling Large Graphs: Algorithms and Applications Don Towsley College of Information & Computer Science Umass - Amherst Collaborators: P.H. Wang, J.C.S. Lui, J.Z. Zhou, X. Guan Measuring, analyzing large
More informationHow to explore big networks? Question: Perform a random walk on G. What is the average node degree among visited nodes, if avg degree in G is 200?
How to explore big networks? Question: Perform a random walk on G. What is the average node degree among visited nodes, if avg degree in G is 200? Questions from last time Avg. FB degree is 200 (suppose).
More informationDS504/CS586: Big Data Analytics Graph Mining Prof. Yanhua Li
Welcome to DS504/CS586: Big Data Analytics Graph Mining Prof. Yanhua Li Time: 6:00pm 8:50pm R Location: AK232 Fall 2016 Graph Data: Social Networks Facebook social graph 4-degrees of separation [Backstrom-Boldi-Rosa-Ugander-Vigna,
More informationEmpirical Characterization of P2P Systems
Empirical Characterization of P2P Systems Reza Rejaie Mirage Research Group Department of Computer & Information Science University of Oregon http://mirage.cs.uoregon.edu/ Collaborators: Daniel Stutzbach
More informationFast Low-Cost Estimation of Network Properties Using Random Walks
Fast Low-Cost Estimation of Network Properties Using Random Walks Colin Cooper, Tomasz Radzik, and Yiannis Siantos Department of Informatics, King s College London, WC2R 2LS, UK Abstract. We study the
More informationDS504/CS586: Big Data Analytics Graph Mining Prof. Yanhua Li
Welcome to DS504/CS586: Big Data Analytics Graph Mining Prof. Yanhua Li Time: 6:00pm 8:50pm R Location: KH 116 Fall 2017 Reiews/Critiques I will choose one reiew to grade this week. Graph Data: Social
More informationDS504/CS586: Big Data Analytics Graph Mining Prof. Yanhua Li
Welcome to DS504/CS586: Big Data Analytics Graph Mining Prof. Yanhua Li Time: 6:00pm 8:50pm R Location: AK 233 Spring 2018 Service Providing Improve urban planning, Ease Traffic Congestion, Save Energy,
More informationA Walk in Facebook: Uniform Sampling of Users in Online Social Networks
A Walk in Facebook: Uniform Sampling of Users in Online Social Networks Minas Gjoka, Maciej Kurant, Carter T. Butts, Athina Markopoulou California Institute for Telecommunications and Information Technology
More informationModeling and Estimating the Structure of D2D-Based Mobile Social Networks
Modeling and Estimating the Structure of DD-Based Mobile Social Networks Sigit Aryo Pambudi Wenye Wang Department of Electrical and Computer Engineering North Carolina State University, Raleigh, NC 7606
More informationOptimizing Capacity-Heterogeneous Unstructured P2P Networks for Random-Walk Traffic
Optimizing Capacity-Heterogeneous Unstructured P2P Networks for Random-Walk Traffic Chandan Rama Reddy Microsoft Joint work with Derek Leonard and Dmitri Loguinov Internet Research Lab Department of Computer
More informationMining Web Data. Lijun Zhang
Mining Web Data Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj Outline Introduction Web Crawling and Resource Discovery Search Engine Indexing and Query Processing Ranking Algorithms Recommender Systems
More informationOutsourcing Privacy-Preserving Social Networks to a Cloud
IEEE INFOCOM 2013, April 14-19, Turin, Italy Outsourcing Privacy-Preserving Social Networks to a Cloud Guojun Wang a, Qin Liu a, Feng Li c, Shuhui Yang d, and Jie Wu b a Central South University, China
More informationAbsorbing Random walks Coverage
DATA MINING LECTURE 3 Absorbing Random walks Coverage Random Walks on Graphs Random walk: Start from a node chosen uniformly at random with probability. n Pick one of the outgoing edges uniformly at random
More informationA Walk in Facebook: Uniform Sampling of Users in Online Social Networks
A Walk in Facebook: Uniform Sampling of Users in Online Social Networks Minas Gjoka CalIT2 UC Irvine mgjoka@uci.edu Maciej Kurant CalIT2 UC Irvine maciej.kurant@gmail.com Carter T. Butts Sociology Dept
More informationExtracting Information from Complex Networks
Extracting Information from Complex Networks 1 Complex Networks Networks that arise from modeling complex systems: relationships Social networks Biological networks Distinguish from random networks uniform
More informationAbsorbing Random walks Coverage
DATA MINING LECTURE 3 Absorbing Random walks Coverage Random Walks on Graphs Random walk: Start from a node chosen uniformly at random with probability. n Pick one of the outgoing edges uniformly at random
More informationCENTRALITIES. Carlo PICCARDI. DEIB - Department of Electronics, Information and Bioengineering Politecnico di Milano, Italy
CENTRALITIES Carlo PICCARDI DEIB - Department of Electronics, Information and Bioengineering Politecnico di Milano, Italy email carlo.piccardi@polimi.it http://home.deib.polimi.it/piccardi Carlo Piccardi
More informationMining Web Data. Lijun Zhang
Mining Web Data Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj Outline Introduction Web Crawling and Resource Discovery Search Engine Indexing and Query Processing Ranking Algorithms Recommender Systems
More informationCS224W Project Write-up Static Crawling on Social Graph Chantat Eksombatchai Norases Vesdapunt Phumchanit Watanaprakornkul
1 CS224W Project Write-up Static Crawling on Social Graph Chantat Eksombatchai Norases Vesdapunt Phumchanit Watanaprakornkul Introduction Our problem is crawling a static social graph (snapshot). Given
More informationUnbiased Sampling of Facebook
Unbiased Sampling of Facebook Minas Gjoka Networked Systems UC Irvine mgjoka@uci.edu Maciej Kurant School of ICS EPFL, Lausanne maciej.kurant@epfl.ch Carter T. Butts Sociology Dept UC Irvine buttsc@uci.edu
More informationSupervised Random Walks
Supervised Random Walks Pawan Goyal CSE, IITKGP September 8, 2014 Pawan Goyal (IIT Kharagpur) Supervised Random Walks September 8, 2014 1 / 17 Correlation Discovery by random walk Problem definition Estimate
More informationLink Analysis and Web Search
Link Analysis and Web Search Moreno Marzolla Dip. di Informatica Scienza e Ingegneria (DISI) Università di Bologna http://www.moreno.marzolla.name/ based on material by prof. Bing Liu http://www.cs.uic.edu/~liub/webminingbook.html
More informationCS6200 Information Retreival. The WebGraph. July 13, 2015
CS6200 Information Retreival The WebGraph The WebGraph July 13, 2015 1 Web Graph: pages and links The WebGraph describes the directed links between pages of the World Wide Web. A directed edge connects
More informationLink Prediction for Social Network
Link Prediction for Social Network Ning Lin Computer Science and Engineering University of California, San Diego Email: nil016@eng.ucsd.edu Abstract Friendship recommendation has become an important issue
More informationAnalysis of Biological Networks. 1. Clustering 2. Random Walks 3. Finding paths
Analysis of Biological Networks 1. Clustering 2. Random Walks 3. Finding paths Problem 1: Graph Clustering Finding dense subgraphs Applications Identification of novel pathways, complexes, other modules?
More informationNeural Network Weight Selection Using Genetic Algorithms
Neural Network Weight Selection Using Genetic Algorithms David Montana presented by: Carl Fink, Hongyi Chen, Jack Cheng, Xinglong Li, Bruce Lin, Chongjie Zhang April 12, 2005 1 Neural Networks Neural networks
More informationEstimating and Sampling Graphs with Multidimensional Random Walks. Group 2: Mingyan Zhao, Chengle Zhang, Biao Yin, Yuchen Liu
Estimating and Sampling Graphs with Multidimensional Random Walks Group 2: Mingyan Zhao, Chengle Zhang, Biao Yin, Yuchen Liu Complex Network Motivation Social Network Biological Network erence: www.forbe.com
More informationComplex-Network Modelling and Inference
Complex-Network Modelling and Inference Lecture 8: Graph features (2) Matthew Roughan http://www.maths.adelaide.edu.au/matthew.roughan/notes/ Network_Modelling/ School
More informationCS 5220: Parallel Graph Algorithms. David Bindel
CS 5220: Parallel Graph Algorithms David Bindel 2017-11-14 1 Graphs Mathematically: G = (V, E) where E V V Convention: V = n and E = m May be directed or undirected May have weights w V : V R or w E :
More informationLink Farming in Twitter
Link Farming in Twitter Pawan Goyal CSE, IITKGP Nov 11, 2016 Pawan Goyal (IIT Kharagpur) Link Farming in Twitter Nov 11, 2016 1 / 1 Reference Saptarshi Ghosh, Bimal Viswanath, Farshad Kooti, Naveen Kumar
More informationK Nearest Neighbor Wrap Up K- Means Clustering. Slides adapted from Prof. Carpuat
K Nearest Neighbor Wrap Up K- Means Clustering Slides adapted from Prof. Carpuat K Nearest Neighbor classification Classification is based on Test instance with Training Data K: number of neighbors that
More informationGraph Algorithms using Map-Reduce. Graphs are ubiquitous in modern society. Some examples: The hyperlink structure of the web
Graph Algorithms using Map-Reduce Graphs are ubiquitous in modern society. Some examples: The hyperlink structure of the web Graph Algorithms using Map-Reduce Graphs are ubiquitous in modern society. Some
More informationData-Intensive Distributed Computing
Data-Intensive Distributed Computing CS 451/651 (Fall 2018) Part 4: Analyzing Graphs (1/2) October 4, 2018 Jimmy Lin David R. Cheriton School of Computer Science University of Waterloo These slides are
More informationEstimating Sizes of Social Networks via Biased Sampling
Estimating Sizes of Social Networks via Biased Sampling Liran Katzir, Edo Liberty, and Oren Somekh Yahoo! Labs, Haifa, Israel International World Wide Web Conference, 28th March - 1st April 2011, Hyderabad,
More informationLecture 6: Spectral Graph Theory I
A Theorist s Toolkit (CMU 18-859T, Fall 013) Lecture 6: Spectral Graph Theory I September 5, 013 Lecturer: Ryan O Donnell Scribe: Jennifer Iglesias 1 Graph Theory For this course we will be working on
More informationUsing PageRank in Feature Selection
Using PageRank in Feature Selection Dino Ienco, Rosa Meo, and Marco Botta Dipartimento di Informatica, Università di Torino, Italy {ienco,meo,botta}@di.unito.it Abstract. Feature selection is an important
More informationOnline Social Networks and Media
Online Social Networks and Media Absorbing Random Walks Link Prediction Why does the Power Method work? If a matrix R is real and symmetric, it has real eigenvalues and eigenvectors: λ, w, λ 2, w 2,, (λ
More informationOverlay (and P2P) Networks
Overlay (and P2P) Networks Part II Recap (Small World, Erdös Rényi model, Duncan Watts Model) Graph Properties Scale Free Networks Preferential Attachment Evolving Copying Navigation in Small World Samu
More informationMaking social interactions accessible in online social networks
Information Services & Use 33 (2013) 113 117 113 DOI 10.3233/ISU-130702 IOS Press Making social interactions accessible in online social networks Fredrik Erlandsson a,, Roozbeh Nia a, Henric Johnson b
More informationNonparametric Importance Sampling for Big Data
Nonparametric Importance Sampling for Big Data Abigael C. Nachtsheim Research Training Group Spring 2018 Advisor: Dr. Stufken SCHOOL OF MATHEMATICAL AND STATISTICAL SCIENCES Motivation Goal: build a model
More informationTopic mash II: assortativity, resilience, link prediction CS224W
Topic mash II: assortativity, resilience, link prediction CS224W Outline Node vs. edge percolation Resilience of randomly vs. preferentially grown networks Resilience in real-world networks network resilience
More informationDiffusion and Clustering on Large Graphs
Diffusion and Clustering on Large Graphs Alexander Tsiatas Thesis Proposal / Advancement Exam 8 December 2011 Introduction Graphs are omnipresent in the real world both natural and man-made Examples of
More informationUsing PageRank in Feature Selection
Using PageRank in Feature Selection Dino Ienco, Rosa Meo, and Marco Botta Dipartimento di Informatica, Università di Torino, Italy fienco,meo,bottag@di.unito.it Abstract. Feature selection is an important
More informationSwitching for a Small World. Vilhelm Verendel. Master s Thesis in Complex Adaptive Systems
Switching for a Small World Vilhelm Verendel Master s Thesis in Complex Adaptive Systems Division of Computing Science Department of Computer Science and Engineering Chalmers University of Technology Göteborg,
More informationnode2vec: Scalable Feature Learning for Networks
node2vec: Scalable Feature Learning for Networks A paper by Aditya Grover and Jure Leskovec, presented at Knowledge Discovery and Data Mining 16. 11/27/2018 Presented by: Dharvi Verma CS 848: Graph Database
More informationA New Parallel Algorithm for Connected Components in Dynamic Graphs. Robert McColl Oded Green David Bader
A New Parallel Algorithm for Connected Components in Dynamic Graphs Robert McColl Oded Green David Bader Overview The Problem Target Datasets Prior Work Parent-Neighbor Subgraph Results Conclusions Problem
More informationChapter 1. Social Media and Social Computing. October 2012 Youn-Hee Han
Chapter 1. Social Media and Social Computing October 2012 Youn-Hee Han http://link.koreatech.ac.kr 1.1 Social Media A rapid development and change of the Web and the Internet Participatory web application
More informationGraph Data Management
Graph Data Management Analysis and Optimization of Graph Data Frameworks presented by Fynn Leitow Overview 1) Introduction a) Motivation b) Application for big data 2) Choice of algorithms 3) Choice of
More informationCS 220: Discrete Structures and their Applications. graphs zybooks chapter 10
CS 220: Discrete Structures and their Applications graphs zybooks chapter 10 directed graphs A collection of vertices and directed edges What can this represent? undirected graphs A collection of vertices
More informationAlgorithms and Applications in Social Networks. 2017/2018, Semester B Slava Novgorodov
Algorithms and Applications in Social Networks 2017/2018, Semester B Slava Novgorodov 1 Lesson #1 Administrative questions Course overview Introduction to Social Networks Basic definitions Network properties
More informationBruno Martins. 1 st Semester 2012/2013
Link Analysis Departamento de Engenharia Informática Instituto Superior Técnico 1 st Semester 2012/2013 Slides baseados nos slides oficiais do livro Mining the Web c Soumen Chakrabarti. Outline 1 2 3 4
More informationLocal Algorithms for Sparse Spanning Graphs
Local Algorithms for Sparse Spanning Graphs Reut Levi Dana Ron Ronitt Rubinfeld Intro slides based on a talk given by Reut Levi Minimum Spanning Graph (Spanning Tree) Local Access to a Minimum Spanning
More informationResearch Article Link Prediction in Directed Network and Its Application in Microblog
Mathematical Problems in Engineering, Article ID 509282, 8 pages http://dx.doi.org/10.1155/2014/509282 Research Article Link Prediction in Directed Network and Its Application in Microblog Yan Yu 1,2 and
More informationA large-scale community structure analysis in Facebook
Ferrara EPJ Data Science 2012, 1:9 REGULAR ARTICLE OpenAccess A large-scale community structure analysis in Facebook Emilio Ferrara * * Correspondence: ferrarae@indiana.edu Center for Complex Networks
More informationDS504/CS586: Big Data Analytics Data acquisition and measurement Prof. Yanhua Li
Welcome to DS504/CS586: Big Data Analytics Data acquisition and measurement Prof. Yanhua Li Time: 6:00pm 8:50pm THURSDAY Location: AK 232 Fall 2016 Data acquisition and measurement ia Sampling and Estimation
More informationInferring Coarse Views of Connectivity in Very Large Graphs
Inferring Coarse Views of Connectivity in Very Large Graphs Reza Motamedi, Reza Rejaie, Walter Willinger, Daniel Lowd, Roberto Gonzalez http://onrg.cs.uoregon.edu/walkabout 10/8/14 1 Introduction! Large-scale
More informationMining and Analyzing Online Social Networks
The 5th EuroSys Doctoral Workshop (EuroDW 2011) Salzburg, Austria, Sunday 10 April 2011 Mining and Analyzing Online Social Networks Emilio Ferrara eferrara@unime.it Advisor: Prof. Giacomo Fiumara PhD School
More informationCS224W: Social and Information Network Analysis Jure Leskovec, Stanford University
CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University http://cs224w.stanford.edu How to organize the Web? First try: Human curated Web directories Yahoo, DMOZ, LookSmart Second
More informationCycles in Random Graphs
Cycles in Random Graphs Valery Van Kerrebroeck Enzo Marinari, Guilhem Semerjian [Phys. Rev. E 75, 066708 (2007)] [J. Phys. Conf. Series 95, 012014 (2008)] Outline Introduction Statistical Mechanics Approach
More informationNetwork embedding. Cheng Zheng
Network embedding Cheng Zheng Outline Problem definition Factorization based algorithms --- Laplacian Eigenmaps(NIPS, 2001) Random walk based algorithms ---DeepWalk(KDD, 2014), node2vec(kdd, 2016) Deep
More informationMultigraph Sampling of Online Social Networks
Multigraph Sampling of Online Social Networks Minas Gjoka CalIT2 UC Irvine mgjoka@uci.edu Carter T. Butts Sociology Dept UC Irvine buttsc@uci.edu Maciej Kurant CalIT2 UC Irvine maciej.kurant@gmail.com
More informationDetecting and Analyzing Communities in Social Network Graphs for Targeted Marketing
Detecting and Analyzing Communities in Social Network Graphs for Targeted Marketing Gautam Bhat, Rajeev Kumar Singh Department of Computer Science and Engineering Shiv Nadar University Gautam Buddh Nagar,
More informationIntroduction to Graph Theory
Introduction to Graph Theory Tandy Warnow January 20, 2017 Graphs Tandy Warnow Graphs A graph G = (V, E) is an object that contains a vertex set V and an edge set E. We also write V (G) to denote the vertex
More informationGraphs / Networks CSE 6242/ CX Centrality measures, algorithms, interactive applications. Duen Horng (Polo) Chau Georgia Tech
CSE 6242/ CX 4242 Graphs / Networks Centrality measures, algorithms, interactive applications Duen Horng (Polo) Chau Georgia Tech Partly based on materials by Professors Guy Lebanon, Jeffrey Heer, John
More informationGraph traversal and BFS
Graph traversal and BFS Fundamental building block Graph traversal is part of many important tasks Connected components Tree/Cycle detection Articulation vertex finding Real-world applications Peer-to-peer
More informationECE 7650 Scalable and Secure Internet Services and Architecture ---- A Systems Perspective
ECE 60 Scalable and Secure Internet Services and Architecture ---- A Systems Perspective Part II: Data Center Software Architecture: Topic 3: Programming Models Pregel: A System for Large-Scale Graph Processing
More informationSI Networks: Theory and Application, Fall 2008
University of Michigan Deep Blue deepblue.lib.umich.edu 2008-09 SI 508 - Networks: Theory and Application, Fall 2008 Adamic, Lada Adamic, L. (2008, November 12). Networks: Theory and Application. Retrieved
More informationLecture Note: Computation problems in social. network analysis
Lecture Note: Computation problems in social network analysis Bang Ye Wu CSIE, Chung Cheng University, Taiwan September 29, 2008 In this lecture note, several computational problems are listed, including
More informationGraphs / Networks CSE 6242/ CX Centrality measures, algorithms, interactive applications. Duen Horng (Polo) Chau Georgia Tech
CSE 6242/ CX 4242 Graphs / Networks Centrality measures, algorithms, interactive applications Duen Horng (Polo) Chau Georgia Tech Partly based on materials by Professors Guy Lebanon, Jeffrey Heer, John
More informationSequential Algorithms for Generating Random Graphs. Amin Saberi Stanford University
Sequential Algorithms for Generating Random Graphs Amin Saberi Stanford University Outline Generating random graphs with given degrees (joint work with M Bayati and JH Kim) Generating random graphs with
More informationAnalysis of Large Graphs: TrustRank and WebSpam
Note to other teachers and users of these slides: We would be delighted if you found this our material useful in giving your own lectures. Feel free to use these slides verbatim, or to modify them to fit
More informationSocial Network Analysis
Social Network Analysis Mathematics of Networks Manar Mohaisen Department of EEC Engineering Adjacency matrix Network types Edge list Adjacency list Graph representation 2 Adjacency matrix Adjacency matrix
More informationWedge A New Frontier for Pull-based Graph Processing. Samuel Grossman and Christos Kozyrakis Platform Lab Retreat June 8, 2018
Wedge A New Frontier for Pull-based Graph Processing Samuel Grossman and Christos Kozyrakis Platform Lab Retreat June 8, 2018 Graph Processing Problems modelled as objects (vertices) and connections between
More informationBasics of Network Analysis
Basics of Network Analysis Hiroki Sayama sayama@binghamton.edu Graph = Network G(V, E): graph (network) V: vertices (nodes), E: edges (links) 1 Nodes = 1, 2, 3, 4, 5 2 3 Links = 12, 13, 15, 23,
More informationPULP: Fast and Simple Complex Network Partitioning
PULP: Fast and Simple Complex Network Partitioning George Slota #,* Kamesh Madduri # Siva Rajamanickam * # The Pennsylvania State University *Sandia National Laboratories Dagstuhl Seminar 14461 November
More informationShort-Cut MCMC: An Alternative to Adaptation
Short-Cut MCMC: An Alternative to Adaptation Radford M. Neal Dept. of Statistics and Dept. of Computer Science University of Toronto http://www.cs.utoronto.ca/ radford/ Third Workshop on Monte Carlo Methods,
More informationPractical Session No. 12 Graphs, BFS, DFS, Topological sort
Practical Session No. 12 Graphs, BFS, DFS, Topological sort Graphs and BFS Graph G = (V, E) Graph Representations (V G ) v1 v n V(G) = V - Set of all vertices in G E(G) = E - Set of all edges (u,v) in
More informationMultidimensional Arrays & Graphs. CMSC 420: Lecture 3
Multidimensional Arrays & Graphs CMSC 420: Lecture 3 Mini-Review Abstract Data Types: List Stack Queue Deque Dictionary Set Implementations: Linked Lists Circularly linked lists Doubly linked lists XOR
More informationEconomics of Information Networks
Economics of Information Networks Stephen Turnbull Division of Policy and Planning Sciences Lecture 4: December 7, 2017 Abstract We continue discussion of the modern economics of networks, which considers
More informationCopyright 2000, Kevin Wayne 1
Chapter 3 - Graphs Undirected Graphs Undirected graph. G = (V, E) V = nodes. E = edges between pairs of nodes. Captures pairwise relationship between objects. Graph size parameters: n = V, m = E. Directed
More informationG(B)enchmark GraphBench: Towards a Universal Graph Benchmark. Khaled Ammar M. Tamer Özsu
G(B)enchmark GraphBench: Towards a Universal Graph Benchmark Khaled Ammar M. Tamer Özsu Bioinformatics Software Engineering Social Network Gene Co-expression Protein Structure Program Flow Big Graphs o
More informationCS249: SPECIAL TOPICS MINING INFORMATION/SOCIAL NETWORKS
CS249: SPECIAL TOPICS MINING INFORMATION/SOCIAL NETWORKS Overview of Networks Instructor: Yizhou Sun yzsun@cs.ucla.edu January 10, 2017 Overview of Information Network Analysis Network Representation Network
More informationDe#anonymizing,Social,Networks, and,inferring,private,attributes, Using,Knowledge,Graphs,
De#anonymizing,Social,Networks, and,inferring,private,attributes, Using,Knowledge,Graphs, Jianwei Qian Illinois Tech Chunhong Zhang BUPT Xiang#Yang Li USTC,/Illinois Tech Linlin Chen Illinois Tech Outline
More informationGraph Exploration: How to do better than the random walk? Adrian Kosowski. INRIA Bordeaux Sud-Ouest.
Graph Exploration: How to do better than the random walk? Adrian Kosowski INRIA Bordeaux Sud-Ouest kosowski@labri.fr Réunion Displexity La Rochelle, April 4, 2013 Talk outline Introduction to network exploration
More informationCS224W: Social and Information Network Analysis Jure Leskovec, Stanford University
CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University http://cs224w.stanford.edu How to organize the Web? First try: Human curated Web directories Yahoo, DMOZ, LookSmart Second
More information1 Starting around 1996, researchers began to work on. 2 In Feb, 1997, Yanhong Li (Scotch Plains, NJ) filed a
!"#$ %#& ' Introduction ' Social network analysis ' Co-citation and bibliographic coupling ' PageRank ' HIS ' Summary ()*+,-/*,) Early search engines mainly compare content similarity of the query and
More informationOutline. Last 3 Weeks. Today. General background. web characterization ( web archaeology ) size and shape of the web
Web Structures Outline Last 3 Weeks General background Today web characterization ( web archaeology ) size and shape of the web What is the size of the web? Issues The web is really infinite Dynamic content,
More informationCS224W: Social and Information Network Analysis Jure Leskovec, Stanford University
CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University http://cs224w.stanford.edu How to organize the Web? First try: Human curated Web directories Yahoo, DMOZ, LookSmart Second
More informationCentralities (4) By: Ralucca Gera, NPS. Excellence Through Knowledge
Centralities (4) By: Ralucca Gera, NPS Excellence Through Knowledge Some slide from last week that we didn t talk about in class: 2 PageRank algorithm Eigenvector centrality: i s Rank score is the sum
More informationAlgorithms Activity 6: Applications of BFS
Algorithms Activity 6: Applications of BFS Suppose we have a graph G = (V, E). A given graph could have zero edges, or it could have lots of edges, or anything in between. Let s think about the range of
More informationLink Analysis in the Cloud
Cloud Computing Link Analysis in the Cloud Dell Zhang Birkbeck, University of London 2017/18 Graph Problems & Representations What is a Graph? G = (V,E), where V represents the set of vertices (nodes)
More informationCollecting social media data based on open APIs
Collecting social media data based on open APIs Ye Li With Qunyan Zhang, Haixin Ma, Weining Qian, and Aoying Zhou http://database.ecnu.edu.cn/ Outline Social Media Data Set Data Feature Data Model Data
More informationSlides based on those in:
Spyros Kontogiannis & Christos Zaroliagis Slides based on those in: http://www.mmds.org A 3.3 B 38.4 C 34.3 D 3.9 E 8.1 F 3.9 1.6 1.6 1.6 1.6 1.6 2 y 0.8 ½+0.2 ⅓ M 1/2 1/2 0 0.8 1/2 0 0 + 0.2 0 1/2 1 [1/N]
More informationCS224W: Social and Information Network Analysis Jure Leskovec, Stanford University
CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University http://cs224w.stanford.edu 10/4/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu
More informationCS425: Algorithms for Web Scale Data
CS425: Algorithms for Web Scale Data Most of the slides are from the Mining of Massive Datasets book. These slides have been modified for CS425. The original slides can be accessed at: www.mmds.org J.
More informationInformation Retrieval (IR) Introduction to Information Retrieval. Lecture Overview. Why do we need IR? Basics of an IR system.
Introduction to Information Retrieval Ethan Phelps-Goodman Some slides taken from http://www.cs.utexas.edu/users/mooney/ir-course/ Information Retrieval (IR) The indexing and retrieval of textual documents.
More informationLink Analysis from Bing Liu. Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data, Springer and other material.
Link Analysis from Bing Liu. Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data, Springer and other material. 1 Contents Introduction Network properties Social network analysis Co-citation
More informationThe Role of Homophily and Popularity in Informed Decentralized Search
The Role of Homophily and Popularity in Informed Decentralized Search Florian Geigl and Denis Helic Knowledge Technologies Institute In eldgasse 13/5. floor, 8010 Graz, Austria {florian.geigl,dhelic}@tugraz.at
More informationEvolution of Attribute-Augmented Social Networks: Measurements, Modeling, and Implications using Google+
Evolution of Attribute-Augmented Social Networks: Measurements, Modeling, and Implications using Google+ Neil Zhenqiang Gong, Wenchang Xu, Ling Huang, Prateek Mittal, Emil Stefanov, Vyas Sekar, Dawn Song
More informationRandomized Rumor Spreading in Social Networks
Randomized Rumor Spreading in Social Networks Benjamin Doerr (MPI Informatics / Saarland U) Summary: We study how fast rumors spread in social networks. For the preferential attachment network model and
More information