A Topology-aware Random Walk

Inkwan Yu, Richard Newman
Dept. of CISE, University of Florida, Gainesville, Florida, USA

Abstract

When a graph can be decomposed into clusters of well connected subgraphs, it is possible to speed up random walks by taking advantage of the topology of the graph. In this work, a new random walk scheme is introduced, and a condition is given under which the new random walk performs better than the Metropolis algorithm.

Key words: Topology-aware, Random walk, Markov chain, Cover time, Conductance

1 Introduction

With the explosive growth of P2P (Peer-to-Peer) traffic and popularity, it is reasonable to assume that connections between peers in different ISPs (Internet Service Providers) suffer more from congestion and poor connectivity than connections between peers within the same ISP. This observation also calls for P2P systems to be aware of inter-ISP congestion [1]. On the other hand, for unstructured P2P systems, it has been observed that k random walkers can be efficient for locating desired resources in the network [10].

In this work, we design and analyze a simple but efficient hierarchical random walk scheme that takes the above observations into account. Our scheme assumes that peers tend to form well connected clusters in the network and that these clusters can be identified. These assumptions are reasonable, since peers are more likely to form a cluster within an ISP than across ISPs. Once we identify clusters in the network, our random walk scheme makes different moves depending on whether or not a node is connected to other clusters. This awareness of topology in the random walk usually results in a faster mixing time. By analysis, we find conditions under which our scheme performs better.

Preprint submitted to Elsevier, April 2008

2 Related Work

The random walk has been actively studied for the past few decades. Some good surveys are found in [9, 2]. As related to P2P systems, Law et al. [8] show how to build expander graphs using a distributed algorithm. Pandurangan et al. [7] introduce a distributed algorithm and analytically show that their algorithm builds a low-diameter graph which is well connected, similar to an expander graph. Gkantsidis et al. [4, 6] compile useful theories of random walks related to P2P systems and introduce a scheme to generate expander graphs.

Random walks can be formalized as Markov chains. When a Markov chain is represented by its transition matrix, the mixing time is dominated by the second largest eigenvalue of the matrix [5, 2]. Madras and Randall [11] show that when a Markov chain is hard to analyze as a whole, it is possible to decompose the Markov chain into intersecting subgraphs and to bound the convergence time using the second largest eigenvalues of the subgraphs. Their idea differs from ours in that they use the decomposition only to bound the convergence time, whereas we design a hierarchical random walk and analytically show when the new random walk scheme performs better.

The cover time is the number of steps a random walker takes to visit every node in a graph. Friedman [5] shows that the cover time is bounded by $O(n \log n)$, where $n$ is the number of nodes in the graph.

3 Definitions

Before we present our results, it is necessary to introduce some definitions. Define a graph $G = (V, E)$, where $V$ is a set of nodes and $E$ is a set of edges in the graph. If a node $v$ belongs to the graph, we write $v \in V$, and when there is an edge between two nodes $u$ and $v$ in $G$, we write $(u, v) \in E$. We also write $V(G) = \{v : v \in G\}$ and $E(G) = \{(u, v) : (u, v) \in G\}$. For brevity, we write $|V(G)| = |G|$.

Here, we think of a peer-to-peer network as represented by a graph. We also assume that the graph is composed of clusters of well connected subgraphs and that no cluster is isolated.
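To make the graph notation and the notion of cover time from Section 2 concrete, the following is a minimal Python sketch and not part of the original paper. It assumes the graph is stored as an adjacency dictionary; the names `cover_time` and `ring`, and the 8-node ring itself, are illustrative choices. The walk is the plain simple random walk (move to a uniformly chosen neighbor), used here only to show what is being counted.

```python
import random

def cover_time(adj, start, rng):
    """Count the steps a simple random walk needs to visit every node of adj."""
    visited = {start}
    u = start
    steps = 0
    while len(visited) < len(adj):
        u = rng.choice(adj[u])  # move to a uniformly chosen neighbor
        visited.add(u)
        steps += 1
    return steps

# Hypothetical example graph G = (V, E): a ring of n nodes.
n = 8
ring = {v: [(v - 1) % n, (v + 1) % n] for v in range(n)}

rng = random.Random(0)  # fixed seed so repeated runs give the same samples
samples = [cover_time(ring, 0, rng) for _ in range(200)]
mean_cover = sum(samples) / len(samples)
print(f"estimated cover time on the {n}-node ring: {mean_cover:.1f} steps")
```

Averaging over many runs gives an empirical estimate of the expected cover time. Any connected adjacency dictionary can be substituted for `ring`.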
A cluster (or subgraph) contains a subset of nodes that have outgoing edges from that cluster to other clusters. Formally, we can decompose the connected graph as follows. For $i, j \in \mathbb{N}$, $V(S) = \bigcup_i V(S_i)$, $V(S_i) \cap V(S_j) = \emptyset$ if $i \neq j$, and $E(S_i) = \{(u, v) : u, v \in S_i, (u, v) \in S\}$. Here, $S$ can be interpreted as the whole P2P network, while $S_i$ is a cluster in it with its outgoing edges removed. In each $S_i$, we can find a set of nodes that originally have outgoing edges in $S$. Define a graph $K$, where $V(K) = \bigcup_i V(K_i)$, $V(K_i) \subseteq V(S_i)$, $V(K_i) \neq \emptyset$, and $E(K) = \{(u, v) : u \in K_i, v \in K_j, i \neq j, (u, v) \in S\}$. Also, no $u \in K$ is isolated.

The connectivity of a graph can be quantified by its conductance [4]. To define the conductance, first define the cutset of $U$, $C(U)$, as the set of edges with one endpoint in $U$ and the other endpoint in $\bar{U}$, where $U \cup \bar{U} = V$. The degree of a node $v$ is denoted by $d(v)$. We then define the volume of $U$ as $\mathrm{vol}(U) = \sum_{v \in U} d(v)$. Define the conductance of the graph as

\[ \Phi(G) = \min_{U \subset V,\ \mathrm{vol}(U) < \frac{1}{2}\mathrm{vol}(V)} \frac{|C(U)|}{\mathrm{vol}(U)}, \]

where $C(U)$ is the set of edges between $U$ and $\bar{U}$. The second largest eigenvalue of the Markov chain transition matrix for a graph $G$ is denoted by $\lambda_2(G)$.

4 Topology Aware Random Walk

To take advantage of the well connected subgraphs of a graph, a slight change is necessary in the random walk based on the Metropolis algorithm. Before we explain our random walk, the following facts are useful in analyzing our algorithm.

Fact 1 In a connected graph, the Metropolis algorithm creates an irreducible, time-reversible Markov chain for the graph [3].

From Fact 1, when the stationary distribution is $\pi(u) = \pi$ (a constant) for every $u$ in a connected graph $G$, the transition matrix created by the Metropolis algorithm for the graph is symmetric and irreducible.

Fact 2 In a graph with an irreducible and time-reversible Markov chain, the expected hitting time from a node $u \in G$ back to itself is $1/\pi(u)$ [3].

Fact 3 A connected graph $G$ can be covered in an expected $O\left(\frac{|G| \log |G|}{1 - \lambda_2(G)}\right)$ steps using a symmetric, irreducible Markov chain [3].

Algorithm 1 shows how to take advantage of topology information. Our random walk algorithm works with a hierarchy. The first level random walk moves on the nodes in $K$ using the Metropolis algorithm with the stationary distribution $1/|K|$ for each node in $K$. At each node in $K_i$, the second level random walk for $S_i$ begins and runs until the random walk comes back to that node. The second level random walk also uses the Metropolis algorithm, but with the stationary distribution $1/|S_i|$ for each node in $S_i$.

Algorithm 1 TOPOLOGY AWARE RANDOM WALK
  Pick $u \in S$ with equal probability for all $u \in S$
  while true do
    if $u.m = \emptyset$ and $u \in K$ then
      $u.m \leftarrow m$ {Mark the node}
      $u.c \leftarrow c$ {Node is covered}
      Select a neighbor $v \in S_i$ of $u$, where $u \in S_i$, using the Metropolis algorithm on $S_i$ with stationary distribution $1/|S_i|$ {Random walk on $S_i$}
      $u \leftarrow v$ {Move to $v$}
    end if
    while true do
      if $u.m = m$ then
        $u.m \leftarrow \emptyset$
        Select a neighbor $v \in K$ of $u$ using the Metropolis algorithm on $K$ with stationary distribution $1/|K|$ {Random walk on $K$}
        $u \leftarrow v$
        break
      else
        $u.c \leftarrow c$
        Select a neighbor $v \in S_i$ of $u$, where $u \in S_i$, using the Metropolis algorithm on $S_i$ with stationary distribution $1/|S_i|$ {Random walk on $S_i$}
        $u \leftarrow v$
      end if
    end while
  end while

Without knowledge of the topology, a random walk based on the Metropolis algorithm with the stationary distribution $\pi(u) = 1/|S|$ for all $u \in S$ has cover time $O\left(\frac{|S| \log |S|}{1 - \lambda_2(S)}\right)$ by Fact 3. We can now analytically compare the cover time of our random walk with that of the random walk without topological information.

Theorem 4 The topology aware random walk has a faster expected cover time when

\[ O\left(\frac{|S| \log |S|}{1 - \lambda_2(S)}\right) \geq O\left(|K|\, W_K\, \frac{|S_i| \log |S_i|}{1 - \lambda_2(S_i)}\right), \]

where $W_K$ is the number of steps the random walk takes in the graph $K$, with

\[ \frac{|K_i|}{|K|}\, W_K \geq \frac{\log |S_i|}{1 - \lambda_2(S_i)} \]

for all $i$,

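and

\[ W_K \geq \frac{|K| \log |K|}{1 - \lambda_2(K)}. \]

PROOF. Combining the two conditions on $W_K$ above gives

\[ W_K \geq \max\left\{ \frac{|K| \log |K|}{1 - \lambda_2(K)},\ \max_i \left\{ \frac{|K|}{|K_i|} \cdot \frac{\log |S_i|}{1 - \lambda_2(S_i)} \right\} \right\}. \]

At each $u \in K_i$, where $u \in S_i$, the second level random walk takes an expected $|S_i|$ steps to return to $u$ by Fact 2, assuming the stationary distribution is $1/|S_i|$. Then the cover time of $S$ would be

\[ O\left(|K|\, W_K\, \frac{|S_i| \log |S_i|}{1 - \lambda_2(S_i)}\right). \]

As the random walk without topological information would take

\[ O\left(\frac{|S| \log |S|}{1 - \lambda_2(S)}\right) \]

steps, our random walk scheme performs better when

\[ O\left(\frac{|S| \log |S|}{1 - \lambda_2(S)}\right) \geq O\left(|K|\, W_K\, \frac{|S_i| \log |S_i|}{1 - \lambda_2(S_i)}\right). \]

Eigenvalues are not easy to obtain when the transition matrix of the graph is not known. Instead, the conductance can be used to bound the second largest eigenvalue [?] as follows.

Lemma 5 For a connected graph $G$,

\[ 1 - 2\Phi(G) \leq \lambda_2(G) \leq 1 - \frac{\Phi^2(G)}{2}. \]

The following corollary is then immediate.

Corollary 6 The topology aware random walk has a better expected cover time when

\[ O\left(\frac{|S| \log |S|}{2\Phi(S)}\right) \geq O\left(|K|\, W_K\, \frac{2 |S_i| \log |S_i|}{\Phi^2(S_i)}\right), \]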

where $W_K$ is the number of steps the random walk takes in the graph $K$, with

\[ \frac{|K_i|}{|K|}\, W_K \geq \frac{2 \log |S_i|}{\Phi^2(S_i)} \]

for all $i$, and

\[ W_K \geq \frac{2 |K| \log |K|}{\Phi^2(K)}. \]

5 Conclusion

In this work, we introduced a random walk scheme that takes the topology of the graph into account to reduce the cover time relative to a walk that ignores topology. Intuitively, the more information we have about the structure of the graph, the better the walk we can perform. We considered only a random walk with the same stationary distribution for all nodes. Hence, it would be interesting to investigate what kind of information and what type of random walk can help to cover the graph faster.

References

[1] Ruchir Bindal, Pei Cao, William Chan, Jan Medved, George Suwala, Tony Bates, and Amy Zhang. Improving traffic locality in BitTorrent via biased neighbor selection. In ICDCS, 2006.
[2] S. Boyd, P. Diaconis, and L. Xiao. Fastest mixing Markov chain on a graph. SIAM Review, 46(4):667-689, 2004.
[3] A. Broder and A. Karlin. Bounds on the cover time. Journal of Theoretical Probability, 2(1):101-120, 1989.
[4] C. Gkantsidis, M. Mihail, and A. Saberi. Random walks in peer-to-peer networks. In IEEE INFOCOM, 2004.
[5] Joel Friedman. On the second eigenvalue and random walks in random d-regular graphs. Combinatorica, 11(4), 1991.
[6] C. Gkantsidis, M. Mihail, and A. Saberi. Conductance and congestion in power law graphs. In ACM SIGMETRICS, 2003.
[7] Gopal Pandurangan, Prabhakar Raghavan, and Eli Upfal. Building low-diameter P2P networks. In IEEE Symposium on Foundations of Computer Science, pages 492-499, 2001.
[8] C. Law and K. Siu. Distributed construction of random expander graphs. In IEEE INFOCOM, 2003.
[9] L. Lovász. Random walks on graphs: A survey.
[10] C. Lv, P. Cao, E. Cohen, K. Li, and S. Shenker. Search and replication in unstructured peer-to-peer networks. In Proceedings of the 16th International Conference on Supercomputing, 2002.
[11] N. Madras and D. Randall. Markov chain decomposition for convergence rate analysis. Annals of Applied Probability, 12(2):581-606, 2002.
[12] D. Randall. Rapidly mixing Markov chains with applications in computer science and physics. Computing in Science and Engineering, 2006.
[13] Ronald Wolff. Stochastic Modeling and the Theory of Queues. Prentice Hall, 1989.
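Appendix: an illustrative sketch of Algorithm 1

The two-level walk of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The adjacency dictionaries `intra` (edges inside each cluster $S_i$) and `inter` (the graph $K$ of inter-cluster edges), the helper names, and the three-clique example are all hypothetical. The sketch also assumes that the walk starts on a node of $K$, tracks the mark with a single variable instead of a per-node field `u.m`, and records coverage in a set instead of `u.c`.

```python
import random

def metropolis_step(adj, u, rng):
    """One Metropolis move on adj whose stationary distribution is uniform."""
    nbrs = adj[u]
    if not nbrs:
        return u                      # no neighbor in this subgraph: stay put
    v = rng.choice(nbrs)              # propose a uniformly chosen neighbor
    accept = min(1.0, len(nbrs) / len(adj[v]))
    return v if rng.random() < accept else u

def topology_aware_cover(intra, inter, start, rng, max_steps=1_000_000):
    """Run the two-level walk until every node is covered; return (steps, covered)."""
    if start not in inter:
        raise ValueError("this sketch assumes the walk starts on a node of K")
    covered, steps = set(), 0
    u = start
    while len(covered) < len(intra) and steps < max_steps:
        # First level: u is a node of K. Mark it and step into its cluster.
        mark = u
        covered.add(u)
        u = metropolis_step(intra, u, rng)
        steps += 1
        # Second level: walk inside the cluster until the marked node recurs.
        while u != mark and steps < max_steps:
            covered.add(u)
            u = metropolis_step(intra, u, rng)
            steps += 1
        if u != mark:
            break                     # step budget exhausted mid-excursion
        # Back at the marked node: drop the mark and take one step on K.
        u = metropolis_step(inter, u, rng)
        steps += 1
    return steps, covered

def clique(nodes):
    """Adjacency dictionary of a complete graph on the given nodes."""
    return {u: [v for v in nodes if v != u] for u in nodes}

# Hypothetical network: three 4-node cliques joined through nodes 0, 4 and 8.
intra = {}
for block in ([0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]):
    intra.update(clique(block))
inter = {0: [4, 8], 4: [0, 8], 8: [0, 4]}

steps, covered = topology_aware_cover(intra, inter, start=0, rng=random.Random(1))
print(f"covered {len(covered)} nodes in {steps} steps")
```

Each call to `metropolis_step` counts as one step, whether or not the proposed move is accepted, so `steps` counts both first level and second level moves. Replacing `intra` and `inter` with the decomposition of another graph allows a rough empirical comparison against a single-level Metropolis walk on the whole graph.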