Distributed Line Graphs: A Universal Technique for Designing DHTs Based on Arbitrary Regular Graphs

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, MANUSCRIPT ID

Yiming Zhang and Ling Liu, Senior Member, IEEE

Abstract: Most proposed DHTs engage certain topology maintenance mechanisms specific to the static graphs on which they are based. The designs of these mechanisms are complicated and repeated with graph-relevant concerns. In this paper we propose the distributed line graphs (DLG), a universal technique for designing DHTs based on arbitrary regular graphs. Using DLG, the main features of the initial graphs are preserved, and thus people can design a new DHT by simply choosing the graph with desirable features and applying DLG to it. We demonstrate the power of DLG by illustrating four DLG-enabled DHTs based on different graphs, namely, Kautz, de Bruijn, butterfly and hypertree graphs. The effectiveness of our proposals is demonstrated through analysis, simulation and implementation.

Index Terms: Distributed networks, network topology, distributed hash tables, regular graphs.

Y. Zhang is with the School of Computer, National University of Defense Technology, Changsha 410073, China (e-mail: ymzhang@nudt.edu.cn). L. Liu is with the College of Computing, Georgia Institute of Technology, Atlanta, GA 30332 USA (e-mail: lingliu@cc.gatech.edu). Manuscript received in December.

1 INTRODUCTION

Distributed hash tables (DHTs) [1], [2] have become increasingly popular in recent years. DHTs have two important properties, namely, the node degree (i.e., the number of neighbors of each node) and the network diameter (i.e., the largest distance over all pairs of nodes). Constant-degree [3] DHTs have an exact or asymptotic constant node degree irrespective of the network size, thus ensuring scalability as the network evolves. As a result, constant-degree DHTs have recently attracted significant attention from the academic community.

Constant-degree DHTs are usually designed based on a specific type of regular graph, in which all nodes have the same number of edges. E.g., CAN [2] is based on the d-torus; Viceroy [4] is based on butterfly graphs; D2B [5] and Koorde [6] are based on de Bruijn graphs; FissionE [7] and Moore [8] are based on Kautz graphs; Cycloid [9] is based on CCC graphs. To deal with network dynamics such as node joins/leaves, DHTs engage certain topology maintenance mechanisms, the design of which is usually complicated and repeated. By now, most DHTs have their own clean-slate designs of these mechanisms, which are tightly coupled with the static graphs on which they are based.

In this paper we present the distributed line graphs (DLG), a universal technique for designing different DHTs based on arbitrary regular graphs. DLG's novelty is that it preserves the main features (e.g., the degree, diameter, and routing algorithm) of the underlying graphs without knowing them. Suppose that in the future a novel (and currently unknown) graph X with desirable features is proposed. Then, people will be able to design a new, X-based DHT by simply applying DLG to it.

We prove that in a DLG-enabled, $N$-node DHT, the out-degree is $d$, the in-degree is between $1$ and $2d$, and the diameter is less than $2(\log_d N - \log_d N_0 + D_0 + 1)$, where $d$, $D_0$ and $N_0$ represent the base, diameter and number of nodes of the initial graph, respectively. This diameter reaches the theoretical lower bound $\Omega(\log N)$ [3] of constant-degree DHTs. The message cost caused by each join/leave is $O(\log N)$, and the maximum difference between node identifier lengths is no more than $\log_d N - \log_d N_0 + D_0$. We also provide procedures that deal with network complexity such as ungraceful leaves, concurrency, and robust routing under churn.

Our proposed DLG technique is inspired by a novel technique called line graph (LG) iteration [10]. The LG iteration, as well as its variations [11-15], has been extensively studied and applied as a universal approach to designing interconnection networks, e.g., multiprocessor networks. However, these techniques require global topology information and centralized control, and thus cannot be applied to DHTs. The challenge we face is the absence of knowledge of what graphs we are to deal with, together with the characteristics of DHTs such as decentralized control and lack of global information. To address these problems, the key idea behind DLG lies in a simple local edge-node transition mechanism for routing table construction, which preserves the main features of the initial graphs.

To our knowledge, we are the first to propose a universal, feature-preserving technique for designing different DHTs based on arbitrary regular graphs. Moreover, our DLG technique provides a number of methodological advantages, some of which are listed as follows.

First, our DLG technique remarkably simplifies the design of new DHTs. We demonstrate the power of DLG by illustrating four new, DLG-enabled DHTs based on different graphs, namely, Kautz, de Bruijn, butterfly and hypertree graphs.

Second, the universal technique allows an in-depth study of common design choices of DHTs. To address the topology imbalance problem, for example, in existing DHTs all joining nodes first route to a random node, the cost of which is $O(\log N)$ and might be relatively expensive for highly dynamic networks. In DLG-enabled DHTs, however, we show (in Section 4.4) that this cost can be avoided while still bounding the imbalance to the same range as before.

DLG-Kautz (DK), a DLG-enabled, Kautz graph-based DHT, is proved to have the lowest diameter among all the DLG-enabled DHTs. So, we use DK as a model to evaluate the effectiveness of DLG through extensive simulations. We have also implemented a prototype [16] of DK.

Note, however, that the goal of this paper is to provide a universal topology maintenance mechanism and simplify the design of a DHT's backbone (mainly including handling node join/leave events and message routing), but not to design a particular DHT. Thus, after getting the backbone of a new DHT by using our DLG technique, people still need to consider more details (e.g., load balancing for resource discovery) to achieve a complete and practical DHT.

The rest of this paper is organized as follows. Section 2 presents the basic design and analysis of DLG. Section 3 discusses the problem of how to support arbitrary base in DLG. In Section 4 we utilize the DLG technique to design different DHTs, which are evaluated by extensive simulations in Section 5. Section 6 discusses related work and Section 7 concludes the paper.

2 BASIC DISTRIBUTED LINE GRAPHS

At the beginning the DHT topology is referred to as the initial graph $G_0$. Then it evolves into a family of graphs as nodes join/leave the network. Basically, all topology maintenance mechanisms can be abstracted as addressing the problem of how to add/delete a node to/from a graph $G$ while preserving the main features of $G$.
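For simplicity this section focuses on the problem of adding a node, and we will discuss the inverse deletion operation in Section 4.

2.1 Notation

For convenience we first present the following notation:

Node and edge: The DHT topology is modeled by a graph $G = (V, E)$ whose nodes $V = V(G)$ and edges $E = E(G)$ represent, respectively, the processing elements (nodes in the network) and the links between them. We use $e = [x, y]$ to denote an edge $e$ from node $x$ to node $y$.

Out-neighbor, in-neighbor, out-edge and in-edge: Let $e = [x, y]$. We say that $y$ is an out-neighbor of $x$, and that $x$ is an in-neighbor of $y$. We use $\Gamma^{+}_{G}(x)$ and $\Gamma^{-}_{G}(x)$ to denote the sets of out-neighbors and in-neighbors of $x$, respectively. We also say that $e$ is an out-edge (resp. in-edge) of $x$ (resp. $y$).

Order of a graph: The number of nodes in graph $G$ is called the order of $G$.

Out-degree, in-degree and degree: The number of out-neighbors (resp. in-neighbors) of node $x$ is called the out-degree (resp. in-degree) of $x$. The degree of $x$ is the sum of its out-degree and in-degree.

$d$-out-regular, $d$-in-regular and $d$-regular: Graph $G$ is $d$-out-regular (resp. $d$-in-regular) if $x$'s out-degree is $d$ (resp. $x$'s in-degree is $d$) for all $x \in V(G)$. Graph $G$ is $d$-regular (regular for short) if it is $d$-out-regular and $d$-in-regular.

Distance and diameter: The distance from $x$ to $y$, denoted as $\delta(x, y)$, is the length of a shortest path from $x$ to $y$. The diameter of $G$, denoted as $D(G)$, is the largest distance over all the pairs of nodes.

$|u|$: The identifier length of node $u$.

2.2 Unified Description of Initial Graphs

Regular graphs usually have their own notation for defining nodes and edges. To simplify the description of DLG's design and analysis, in this section we unify the way of describing a $d$-regular, $N_0$-node initial graph $G_0$.

Let $|u|$ denote the length of the identifier of node $u$, e.g., if $u = u_1 u_2 \cdots u_k$ then $|u| = k$. Let $X$ be an alphabet of $N_0$ letters, where every $x \in X$ satisfies $|x| = 1$. In our unified description mechanism, the initial graph $G_0$ in turn names its $N_0$ nodes with the $N_0$ letters $x \in X$, and for each node $\alpha$ there is an in-letter set $\Omega^{-}(\alpha)$ and an out-letter set $\Omega^{+}(\alpha)$, which are defined as

$$\Omega^{-}(\alpha) = \Gamma^{-}_{G_0}(\alpha), \tag{1a}$$
$$\Omega^{+}(\alpha) = \Gamma^{+}_{G_0}(\alpha). \tag{1b}$$

The letters in $X$ could be any symbols such as $\{a, b, c, \ldots\}$, $\{\alpha, \beta, \gamma, \ldots\}$ or $\{0, 1, 2, \ldots\}$. For the sake of convenience, we simply name nodes in an initial graph with identifiers $0, 1, 2, \ldots, N_0 - 1$. This alphabet approach enables us to describe all $d$-regular graphs in a unified way. For example, Fig. 1(a) shows the original de Bruijn graph $B(2,2)$, and Fig. 1(b) shows its unified description.

Fig. 1. Example of unified description. (a) shows a standard de Bruijn graph $B(2,2)$ and (b) shows its unified description. In (b), $\Omega^{-}(0) = \{0,3\}$, $\Omega^{+}(0) = \{0,1\}$, $\Omega^{-}(1) = \{0,3\}$, $\Omega^{+}(1) = \{2,3\}$, $\Omega^{-}(2) = \{1,2\}$, $\Omega^{+}(2) = \{2,3\}$, $\Omega^{-}(3) = \{1,2\}$, $\Omega^{+}(3) = \{0,1\}$.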
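The unified description lends itself to a mechanical check. The sketch below is illustrative only: it stores the out-letter sets of a small 4-node, 2-regular initial graph of the $B(2,2)$ type in a Python dict, derives the in-letter sets from them, and tests regularity. The names `OUT_LETTERS`, `in_letter_sets` and `is_d_regular` are hypothetical and do not come from the paper.

```python
# Out-letter sets of a 4-node, 2-regular initial graph (B(2,2) under one naming).
OUT_LETTERS = {0: {0, 1}, 1: {2, 3}, 2: {2, 3}, 3: {0, 1}}


def in_letter_sets(out_letters):
    """Derive each node's in-letter set from the out-letter sets."""
    ins = {x: set() for x in out_letters}
    for x, outs in out_letters.items():
        for y in outs:
            ins[y].add(x)
    return ins


def is_d_regular(out_letters, d):
    """True if every node has exactly d out-letters and d in-letters."""
    ins = in_letter_sets(out_letters)
    return all(len(out_letters[x]) == d and len(ins[x]) == d
               for x in out_letters)


IN_LETTERS = in_letter_sets(OUT_LETTERS)
```

Only the out-letter sets need to be stored, since the in-letter sets follow from them; keeping both, as the unified description does, saves each node from having to scan the whole graph.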
2.3 Formal Definition of Distributed Line Graphs

The basic idea of distributed line graphs (DLG) is simple: with a special node $v$ (which will be formally defined in Definition 1), DLG (i) deletes $v$, (ii) turns $v$'s in-edges into new nodes, and (iii) generates new in-/out-edges.

Before we present the definition of DLG, we first introduce an operator for turning an edge into a node. Let $u = u_1 u_2 \cdots u_m$, $v = v_1 v_2 \cdots v_n$, and $m \ge n$. Define the conjunction operator $\otimes$ as

$$u \otimes v = u_{m-n+1} v_1 v_2 \cdots v_n = u_{m-n+1} v. \tag{2}$$

For example, $02 \otimes 12 = 012$ and $012 \otimes 23 = 123$. Clearly $u \otimes v$ contains the information of $v$. On the other hand, as discussed in Theorem 1 (see Section 2.5), for an edge $[u, v] = [u_1 u_2 \cdots u_m, v_1 v_2 \cdots v_n]$ satisfying $m \ge n$ in distributed line graphs, $u = \beta v_1 v_2 \cdots v_{n-1}$ if $m = n$, or $u = \beta' \beta v_1 v_2 \cdots v_{n-1}$ if $m > n$ (in this case we have $m = n + 1$). Consequently, we have $u \otimes v = \beta v_1 v_2 \cdots v_n = u_1 u_2 \cdots u_m v_n$ if $m = n$, or $u \otimes v = \beta v_1 v_2 \cdots v_n = u_2 u_3 \cdots u_m v_n$ if $m > n$. Thus $u \otimes v$ also contains the information of $u$. So, we can say that $u \otimes v$ contains the information of edge $[u, v]$.
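The conjunction operator prepends the letter $u_{m-n+1}$ of $u$ to $v$, which is a one-line string operation. A minimal sketch, assuming identifiers are Python strings with one character per letter (the function name `conj` is hypothetical):

```python
def conj(u: str, v: str) -> str:
    """Conjunction of identifiers u = u_1...u_m and v = v_1...v_n with m >= n.

    Returns the letter u_{m-n+1} followed by v (one character per letter).
    """
    m, n = len(u), len(v)
    if m < n:
        raise ValueError("conjunction requires |u| >= |v|")
    return u[m - n] + v  # u[m - n] is u_{m-n+1} in 1-based indexing


same_length = conj("02", "12")    # m = n: prepends u_1, giving "012"
one_longer = conj("012", "23")    # m = n + 1: prepends u_2, giving "123"
```

In both cases the result equals a suffix of $u$ extended by the last letter of $v$, which is why the new identifier records both endpoints of the edge.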

Next we present the definition of distributed line graphs in an iterative manner; that is, we define the nodes and edges of $G_{i+1}$, the $(i+1)$th graph of a series of distributed line graphs, by describing how to obtain $G_{i+1}$ from $G_i$, $i = 0, 1, 2, \ldots$

Definition 1. Let the initial graph $G_0 = (V_0, E_0)$ be a $d$-regular graph. A series of graphs $G_{i+1} = DLG(G_i, v^{(i)})$ with $i = 0, 1, 2, \ldots$, where node $v^{(i)} \in V(G_i)$ satisfies

$$\forall u \in \Gamma^{-}_{G_i}(v^{(i)}) \cup \Gamma^{+}_{G_i}(v^{(i)}),\ |v^{(i)}| \le |u|, \tag{3a}$$

is said to be a family of distributed line (DLG) graphs with base $d$, if the following conditions hold:

$$V(G_{i+1}) = \big(V(G_i) - \{v^{(i)}\}\big) \cup \{u \otimes v^{(i)} \mid u \in \Gamma^{-}_{G_i}(v^{(i)})\}, \tag{3b}$$

$$E(G_{i+1}) = \big(E(G_i) - \{[x, v^{(i)}] \mid x \in \Gamma^{-}_{G_i}(v^{(i)})\} - \{[v^{(i)}, y] \mid y \in \Gamma^{+}_{G_i}(v^{(i)})\}\big) \cup \{[u, u \otimes v^{(i)}] \mid u \in \Gamma^{-}_{G_i}(v^{(i)})\} \cup \{[u \otimes v^{(i)}, w] \mid u \in \Gamma^{-}_{G_i}(v^{(i)}),\ w \in \Gamma^{+}_{G_i}(v^{(i)})\}. \tag{3c}$$

The transition from $G_i$ to $G_{i+1}$ is called a distributed line (DLG) iteration, and node $v^{(i)}$ is called the responsible node. (We defer the discussion on how to find $v^{(i)}$ to Section 4.) We say that the series of DLG graphs is derived from the initial graph $G_0$.

In Definition 1: (3a) puts restrictions on the responsible node of each DLG iteration for balance purposes (which will be used in the analysis in Section 2.5), i.e., the identifier length of $v^{(i)}$ is no greater than that of any of its direct neighbors; (3b) gives the new nodes generated by old edges; and (3c) presents the rules for generating new edges.

Let us take Fig. 2(a)~(e) as an example to illustrate the decomposed procedure of the DLG iteration $G_1 = DLG(G_0, v^{(0)})$:

1) Let $G_1 = G_0$, as shown in Fig. 2(a);
2) Delete node $v^{(0)}$ and all edges adjacent to/from $v^{(0)}$ in the new graph $G_1$, as shown in Fig. 2(b);
3) For each in-edge $[u, v^{(0)}]$ adjacent to $v^{(0)}$ in $G_0$, add a new node $u \otimes v^{(0)}$ to $G_1$, as shown in Fig. 2(c);
4) Add in-edges of the form $[u, u \otimes v^{(0)}]$ to the new nodes $u \otimes v^{(0)}$ in $G_1$, as shown in Fig. 2(d); and
5) Add out-edges of the form $[u \otimes v^{(0)}, w]$ to $u \otimes v^{(0)}$ in $G_1$ with $w \in \Gamma^{+}_{G_0}(v^{(0)})$, as shown in Fig. 2(e).

Fig. 2. Examples of DLG iterations.

Fig. 2(f)~(j) show another five consecutive DLG iterations, which produce $G_2$ through $G_6$, beginning with $G_2 = DLG(G_1, 4)$ and $G_3 = DLG(G_2, 3)$. From these examples we can see that each new node $u \otimes v^{(i)} \in V(G_{i+1})$ corresponds to an edge $[u, v^{(i)}] \in E(G_i)$, and that all the new nodes in the new DLG graph $G_{i+1}$ have the same out-neighbors $w \in \Gamma^{+}_{G_i}(v^{(i)})$ as the responsible node $v^{(i)} \in V(G_i)$.

Note that a series of DLG iterations from $G_0$ is not guaranteed to reach the line graphs [10] of $G_0$. This is decided by the selection of responsible nodes in each DLG iteration. For example, the iteration that produces $G_5$ from $G_4$, shown in Fig. 2(i), uses a node that satisfies the requirements for being a responsible node. After the next iteration, shown in Fig. 2(j), $G_6$ contains two nodes with length 3 and a node with length 1 (node 5). Clearly this series of graphs will never reach the line graph of $G_0$. Nor can $G_6$ be reached by any series of PLG iterations from $G_0$.
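The five steps of a DLG iteration can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the graph is a dict mapping each identifier (a string, one character per letter) to the set of its out-neighbors, the responsible node has no self-loop, and the initial graph is the 3-node complete digraph $K(2,1)$ chosen here for brevity rather than the graph of Fig. 2. The names `conj`, `dlg_iteration`, `K21`, `G1` and `G2` are hypothetical.

```python
def conj(u, v):
    """u (x) v: the letter u_{m-n+1} of u followed by v, for |u| = m >= |v| = n."""
    return u[len(u) - len(v)] + v


def dlg_iteration(out, v):
    """Return the out-neighbor map after one DLG iteration at responsible node v.

    `out` maps each identifier to the set of its out-neighbors.
    """
    ins = {u for u, nbrs in out.items() if v in nbrs}  # in-neighbors of v
    outs = set(out[v])                                  # out-neighbors of v
    if v in outs:
        raise ValueError("this sketch assumes v has no self-loop")
    if any(len(u) < len(v) for u in ins | outs):
        raise ValueError("v violates the responsible-node condition (3a)")
    # Steps 1-2: copy the graph without v and without v's out-edges.
    new = {x: set(nbrs) for x, nbrs in out.items() if x != v}
    for u in ins:
        uv = conj(u, v)
        new[u].discard(v)    # step 2: drop the in-edge [u, v]
        new[u].add(uv)       # steps 3-4: new node u (x) v and edge [u, u (x) v]
        new[uv] = set(outs)  # step 5: out-edges [u (x) v, w] for w in out(v)
    return new


# Hypothetical initial graph: Kautz graph K(2,1), base d = 2.
K21 = {"0": {"1", "2"}, "1": {"0", "2"}, "2": {"0", "1"}}
G1 = dlg_iteration(K21, "0")  # nodes 1, 2, 10, 20
G2 = dlg_iteration(G1, "1")   # nodes 2, 10, 20, 21, 01
```

In this run each iteration adds one node and every node keeps out-degree 2. In the second iteration the in-neighbors `10` and `20` both yield the new node `01`, so two different in-edges can map to the same new identifier.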

2.4 Routing

The DLG graph starts with the initial graph $G_0$ and then evolves through a series of DLG iterations. Consequently, the routing path from $u = u_1 u_2 \cdots u_m$ to $v = v_1 v_2 \cdots v_n$ in a DLG graph $G$ includes two subpaths. The first has a counterpart in $G_0$, and the second is from the node with its last letter being $v_1$ to destination $v$, corresponding to the letters appended by DLG iterations.

(i) The first subpath: Clearly we have $u_i \in V(G_0)$ and $v_k \in V(G_0)$ for $i = 1, 2, \ldots, m$ and $k = 1, 2, \ldots, n$. Suppose that the diameter of $G_0$ is $D_0$; then there must be a path $u_m \to x_1 \to \cdots \to x_t \to v_1$ in $G_0$ with $t < D_0$. Let $x = x_1 x_2 \cdots x_t$ and $w = uxv = u_1 \cdots u_m x_1 \cdots x_t v_1 \cdots v_n = w_1 w_2 \cdots w_{m+t+n}$. Then there must be a path from $u$ in graph $G$: $u = s^{(0)} \to s^{(1)} \to \cdots \to s^{(t+n-1)} \to s^{(t+n)}$, where the $k$th intermediate node $s^{(k)}$ in the path is of the form $s^{(k)} = w_i w_{i+1} \cdots w_{m+k}$, i.e., a substring of $w$. Theorem 1 (see Section 2.5) assures the existence of each $s^{(k)}$.

(ii) The second subpath: The last character of $s^{(k)}$ is $w_{m+k}$. Then, at the $(t+n)$th step the last character of node $s^{(t+n)}$ is $w_{m+t+n} = v_n$, and the length of $s^{(t+n)}$ may be one of three cases: $|s^{(t+n)}| < n$, $|s^{(t+n)}| = n$, and $|s^{(t+n)}| > n$. Note that if $|s^{(t+n)}| > n$ we have $s^{(t+n)} = \cdots v_1 v_2 \cdots v_n = \cdots v$, so that $v$ would be a suffix of $s^{(t+n)}$; likewise, if $|s^{(t+n)}| < n$ then $s^{(t+n)}$ would be a suffix of $v$. Since there already exists node $v$ in $G$, and the identifier of one node cannot be a suffix of another, only the case $|s^{(t+n)}| = n$ is possible. Thus $s^{(t+n)} = v$ and the routing is over.

The above DLG routing procedure is summarized as follows. Note that this algorithm is decentralized and runs on individual nodes, i.e., the next hop is decided by the node where the message arrives. This algorithm would be invoked on each intermediate node along the routing path.

Procedure w.DLG_Routing (Source u, Destination v, Hops k)
    // w is the current node where the message arrives
    // k means how many hops the message has already traversed
    1  if (w == v) return;  // routing is over
    2  Let u = u_1 u_2 ... u_m and v = v_1 v_2 ... v_n;
    3  Suppose in G_0 the path from u_m to v_1 is: u_m -> x_1 -> ... -> x_t -> v_1;
    4  if (k < t) {
    5      next_hop = the neighbor with last character being x_{k+1}; }
    6  else {
    7      next_hop = the neighbor with last character being v_{k-t+1}; }
    8  next_hop.DLG_Routing (u, v, k+1);

For example, the routing path in Fig. 2(h) from node 5 to a node with a one-letter identifier takes three hops, where the first two hops are chosen by Line 5 and the last by Line 7.
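The routing procedure can be simulated on a small DLG graph. The sketch below is a centralized illustration of the hop-by-hop rule, not the decentralized procedure itself. It makes several assumptions that are not in the paper: identifiers are strings with one character per letter, the path in $G_0$ is found by breadth-first search, the appended letter $v_1$ is skipped when $u$ already ends with it, and the example graph `G2` is the 5-node graph obtained from $K(2,1)$ by DLG iterations at nodes `0` and `1`. All names are hypothetical.

```python
from collections import deque

# Initial graph K(2,1) and a DLG graph derived from it (out-neighbor sets).
G0 = {"0": {"1", "2"}, "1": {"0", "2"}, "2": {"0", "1"}}
G2 = {"2": {"20", "21"}, "10": {"2", "01"}, "20": {"2", "01"},
      "21": {"2", "10"}, "01": {"2", "10"}}


def g0_path(g0, a, b):
    """Shortest path of letters from a to b in the initial graph (BFS)."""
    prev, queue = {a: None}, deque([a])
    while queue:
        x = queue.popleft()
        if x == b:
            break
        for y in sorted(g0[x]):
            if y not in prev:
                prev[y] = x
                queue.append(y)
    path = [b]
    while prev[path[-1]] is not None:
        path.append(prev[path[-1]])
    return path[::-1]


def dlg_route(graph, g0, u, v):
    """Return the list of nodes visited when routing from u to v."""
    # Letters appended hop by hop: x_1..x_t, then v_1..v_n
    # (v_1 is skipped when u already ends with it).
    letters = g0_path(g0, u[-1], v[0])[1:] + list(v[1:])
    hops = [u]
    for a in letters:
        if hops[-1] == v:
            break
        # The out-neighbors of a node end with distinct letters,
        # so at most one of them matches.
        hops.append(next(z for z in graph[hops[-1]] if z[-1] == a))
    return hops


example = dlg_route(G2, G0, "21", "01")  # ["21", "10", "01"]
```

Each node only inspects the last letters of its own out-neighbors, so no global topology information is needed beyond the initial graph $G_0$.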
2.5 Analysis

This subsection analyzes the properties of the DLG iteration and DLG graphs. Note that for the sake of clarity only outline proofs are presented in this paper; formal proofs are included in our technical report [28].

The following Theorem 1 describes the in-/out-neighbors of a node in DLG graphs.

Theorem 1. Let graph $G$ be a DLG graph with base $d$. Let $x = x_1 x_2 \cdots x_n \in V(G)$.

(i) If there is a node $y \in \Gamma^{-}_{G}(x)$ satisfying $|y| < |x|$, then

$$\Gamma^{-}_{G}(x) = \{y \mid y = x_1 x_2 \cdots x_{n-1} \in V(G)\}. \tag{4a}$$

Otherwise, for each $\beta \in \Omega^{-}(x_1)$, either there is an in-neighbor $y \in \Gamma^{-}_{G}(x)$ satisfying

$$y = \beta x_1 x_2 \cdots x_{n-1}, \tag{4b}$$

or there are $d$ in-neighbors $y \in \Gamma^{-}_{G}(x)$ satisfying

$$y = \beta' \beta x_1 x_2 \cdots x_{n-1} \tag{4c}$$

with $\beta' \in \Omega^{-}(\beta)$. And $x$ has no other in-neighbors in $G$.

(ii) For each $\alpha \in \Omega^{+}(x_n)$, there is one out-neighbor $z \in \Gamma^{+}_{G}(x)$ satisfying

$$z = x_t x_{t+1} \cdots x_n \alpha \tag{5}$$

with $1 \le t \le 3$ ($x_m \cdots x_n$ represents a null string if $m > n$). And $x$ has no other out-neighbors in $G$.

This theorem is proved by utilizing two obvious properties of DLG graphs: (i) the identifier of one node cannot be a suffix of another; and (ii) if a node has an in-neighbor with a shorter identifier, then it has no other in-neighbors.

Proof outline. We prove Theorem 1 by applying mathematical induction to all DLG graphs $G_i$ with $i = 0, 1, 2, \ldots$ Clearly Theorem 1 holds initially for $G_0$. Suppose that it holds for $G_i$. Let $G_{i+1} = DLG(G_i, v)$ and $v = v_1 v_2 \cdots v_m$. To prove (4), we first consider the case that $x$ is a new node. The new node $x$ is of the form $\beta v_1 v_2 \cdots v_m$. If $x$ has some in-neighbor $u$ satisfying $|u| < |x|$, then by (3a) we have $|u| = |v|$ and $u = \beta v_1 v_2 \cdots v_{m-1}$. We could easily prove that this node is the only in-neighbor of $x$, and thus (4a) holds. Otherwise, similar to the above case, by (2), (3a), (4a) and (3c), node $v$ has $d$ distinct in-neighbors of the form $y = \beta' \beta v_1 v_2 \cdots v_{m-1}$ with $\beta' \in \Omega^{-}(\beta)$, for each $\beta \in \Omega^{-}(v_1)$ in $G_i$. Thus (4) holds when $x$ is a new node. Similarly, (4) still holds for $G_{i+1}$ when $x$ is an out-neighbor of $v$, and thus (4) holds. Similarly, we could prove that (5) holds.

This theorem shows that in DLG graphs (i) the in-neighbors are restricted to the three forms of (4a), (4b) and (4c), and (ii) the out-neighbors are also restricted to three forms, corresponding to $t = 1, 2, 3$ in (5). Moreover, we can easily infer that for a DLG graph $G$:

1) The difference between the identifier lengths of any two neighbors is no more than 1;
2) After each DLG iteration there would be $d - 1$ more nodes in the new graph; and
3) For any node $x = x_1 x_2 \cdots x_n$, we have $x_i \in \Omega^{-}(x_{i+1})$ and $x_{i+1} \in \Omega^{+}(x_i)$ for $i = 1, \ldots, n-1$, where $\Omega^{-}(\alpha)$ and $\Omega^{+}(\alpha)$ denote respectively the in-letter set and out-letter set of $\alpha$ in the initial graph $G_0$.

The second inference shows that DLG iteration with base $d > 2$ cannot be directly applied to building DHTs, because nodes join a DHT one at a time. This problem will be addressed in the next section.

The following Theorem 2 describes the in-degree and out-degree of a node in DLG graphs.

Theorem 2. Let graph $G$ be a DLG graph with base $d$. For all nodes $x \in V(G)$, the out-degree and in-degree, denoted as $\deg^{+}(x)$ and $\deg^{-}(x)$, satisfy respectively

$$\deg^{+}(x) = d, \tag{6a}$$
$$1 \le \deg^{-}(x) \le d^2. \tag{6b}$$

And the average in-degree of all nodes in $G$ is $d$.

Proof outline. Clearly (6a) holds initially for $G_0$. Suppose that it holds for $G_i$ with $i \ge 0$. From Definition 1, the new nodes in $G_{i+1} = DLG(G_i, v)$ all have an out-degree of $d$. Thus (6a) holds. If $x$ has an in-neighbor $y$ satisfying $|y| < |x|$, then $y$ is the only in-neighbor of $x$. Otherwise each in-neighbor $y$ satisfies $|y| \ge |x|$. Then by Theorem 1, $\deg^{-}(x) \le d^2$ and (6b) holds. Clearly, the sum of in-degrees of all nodes is equal to that of out-degrees, so the average in-degree is $d$. Thus Theorem 2 holds.

This theorem shows that the DLG graphs are always $d$-out-regular. And although the in-degree upper bound is relatively large ($d^2$) in DLG graphs, we will show (in Theorem 4 in Section 3.4) that in practice a node in DLG-enabled DHTs has at most $2d$ in-neighbors.

The following Theorem 3 gives the upper bound for the diameter of DLG graphs.

Theorem 3. Let graph $G$ be a DLG graph with base $d$ and order $N$. Let the order and diameter of the initial graph $G_0$ be $N_0$ and $D_0$, respectively. Then the diameter of $G$ satisfies

$$D(G) < 2(\log_d N - \log_d N_0 + D_0 + 1). \tag{7}$$

The proof includes two steps: (i) deriving the relationship between the diameter and the shortest node identifier length; and (ii) calculating the maximum difference between all pairs of node identifier lengths.

Proof outline. Let $u$ and $v$ be nodes with the shortest and longest identifiers in $G$, respectively. Let $|u| = m$ and $|v| = n$. Then from the DLG_Routing algorithm we can infer that the diameter of $G$ satisfies $D(G) \le D_0 + n - 1$. The number of nodes $N$ in $G$ satisfies $N \ge N_0 d^{m-1}$, thus we have $m \le \log_d N - \log_d N_0 + 1$. The distance from $v$ to $u$, $\delta(v, u)$, satisfies $\delta(v, u) \le D_0 + m - 1$. If two nodes $x$ and $y$ are neighbors in $G$, then $\big||x| - |y|\big| \le 1$. Since the identifier lengths satisfy $n \le m + \delta(v, u)$, we have $n \le D_0 + 2m - 1$. So $D(G) \le 2D_0 + 2m - 2 < 2(\log_d N - \log_d N_0 + D_0 + 1)$.

From the proof of Theorem 3 we can infer the following Corollary 1.

Corollary 1. The diameter of a DLG graph is no more than that of the initial graph plus the largest length of identifiers among all nodes minus 1.

From Theorem 3 we can get an interesting property of DLG graphs: the maximum difference between node identifier lengths is no more than $\log_d N - \log_d N_0 + D_0$, which will be used to discuss the topology balancing issue of different stabilization mechanisms in Section 4.4.

3 SUPPORTING ARBITRARY BASE IN DLG

As discussed in the above section, we infer by Theorem 1 that after each DLG iteration there would be $d - 1$ more nodes in the new graph than in the old one. For example, suppose that $G$ is a 3-regular graph $K(3, 2)$ as shown in Fig. 3(a). After a DLG iteration $G' = DLG(G, v)$, as shown in Fig. 3(b), there are two more nodes in $G'$ than in $G$. Therefore, DLG iteration with base $d > 2$ cannot be directly applied to building DHTs. This section extends DLG iteration to support arbitrary base by proposing the following merge/split operations.

Fig. 3. Example of merge operation. Note that nodes 8, 9, 10, 11, which have nothing to do with this iteration, are omitted here for clarity.

3.1 Node Merge

Let $G$ be a DLG graph with base $d$ and let $G' = DLG(G, v)$ with $v = v_1 v_2 \cdots v_m$. As discussed above, there are $d$ new nodes in $G'$, namely, $\beta_i v_1 v_2 \cdots v_m$ with $\beta_i \in \Omega^{-}(v_1)$ and $i = 1, 2, \ldots, d$. For simplicity the new nodes are sorted on their first letters in ascending order. Let $h = \lceil d/2 \rceil$. In the merge operation, the first half, i.e., nodes of the form $\beta_i v_1 v_2 \cdots v_m$ with $1 \le i \le h$, merge into one node $s = \beta_1 v_1 v_2 \cdots v_m$; and the second half, i.e., nodes of the form $\beta_i v_1 v_2 \cdots v_m$ with $h < i \le d$, merge into one node $t = \beta_{h+1} v_1 v_2 \cdots v_m$.

The in-edges (resp. out-edges) of $s$ and $t$ in $G''$ are the union of the in-edges (resp. out-edges) of their component nodes in $G'$, i.e., $\Gamma^{-}_{G''}(s) = \bigcup_{1 \le i \le h} \Gamma^{-}_{G'}(\beta_i v_1 v_2 \cdots v_m)$ and $\Gamma^{+}_{G''}(s) = \bigcup_{1 \le i \le h} \Gamma^{+}_{G'}(\beta_i v_1 v_2 \cdots v_m)$, and likewise for $t$ with $h < i \le d$.

For the sake of clarity, nodes $s$ and $t$ will be referred to as physical nodes, and the component nodes held by $s$ and $t$ are referred to as logical nodes (nodes for short). The number of logical nodes of $s$ in $G''$ is denoted as $\|s\|$. We denote this merge operation as $G'' = Merge(G', v)$. The two merged physical nodes are called sib-neighbors of each other.

Fig. 3 shows an example of the merge operation with $d = 3$, where two logical nodes in Fig. 3(b) merge into one physical node in Fig. 3(c), and the third logical node merges to itself since the second half contains a single node. After the merge operation, the two physical nodes in Fig. 3(c) are sib-neighbors of each other, holding 2 and 1 logical nodes, respectively.

3.2 Node Split

Suppose a physical node $v = \beta_k v_1 v_2 \cdots v_m \in V(G)$ satisfies $\|v\| > 1$. Then a split operation $G' = Split(G, v)$ divides the logical nodes and edges held by $v$ into two shares. Physical node $v$ holds the first share, i.e., the logical nodes of the form $\beta_i v_1 v_2 \cdots v_m$ with $k \le i < k + h$ ($h = \lceil \|v\|/2 \rceil$); and a new physical node holds the second share, i.e., the logical nodes of the form $\beta_i v_1 v_2 \cdots v_m$ with $k + h \le i < k + \|v\|$. The identifier of the new physical node is $\beta_{k+h} v_1 v_2 \cdots v_m$.
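The bookkeeping behind merge and split can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it tracks only which logical identifiers each physical node holds and omits the union of edges, it assumes $d \ge 2$ and one character per letter, and it rounds both half sizes up, which is an assumption where the source is unclear. The names `merge`, `split`, `merged` and `after_split`, and the identifiers `10`, `20`, `30`, are hypothetical.

```python
from math import ceil


def merge(new_logical):
    """Merge the d logical nodes created by one DLG iteration into two physical nodes.

    Returns {physical identifier: list of logical identifiers held}.
    """
    nodes = sorted(new_logical)  # ascending order of first letters
    h = ceil(len(nodes) / 2)     # size of the first half (assumed to round up)
    return {nodes[0]: nodes[:h], nodes[h]: nodes[h:]}


def split(holdings, v):
    """Split physical node v, which must hold more than one logical node."""
    logical = holdings[v]
    if len(logical) < 2:
        raise ValueError("only a node holding several logical nodes can be split")
    h = ceil(len(logical) / 2)        # assumed rounding, as in merge
    result = dict(holdings)
    result[v] = logical[:h]           # v keeps the first share
    result[logical[h]] = logical[h:]  # the new physical node takes the rest
    return result


# Base d = 3: three new logical nodes become two physical nodes, then three.
merged = merge(["20", "10", "30"])
after_split = split(merged, "10")
```

A DLG iteration followed by a merge replaces one physical node with two, and a later split adds one more, so the network grows by exactly one physical node per join regardless of the base.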

6 IEEE TRANSACTIONS ON KNOWLEDE AND DATA ENINEERIN, MANUSCRIPT ID 33 DL+ graphs The basic DL iteration an merge/split operations are generally calle DL technique, an the series of graphs generate by DL are referre to as DL+ graphs Both the DL iteration an the merge/split operations nee a responsible noe v satisfying u ( v) ( v), v u or ( v u ) ( v u ) (8) A DL+ graph is compose of physical noes, an a DL graph is compose of logical noes At any time the topology of a DL-enable DHT is a DL+ graph, which can be mappe to a corresponing DL graph * by replacing all the physical noes with their logical noes Clearly, if every physical noe hols only one logical noe, then is equal to * We say that the initial graph, DL graphs an DL+ graphs have the same base We refer to DHTs that are built base on DL as DLenable DHTs, an the processing of DL can be summarize as follows Proceure DL_transition (Ol raph, New raph ') Choose a physical noe v V ( ) satisfying for any u ( v) ( v), v u or ( v u ) ( v u ) ; 2 if ( v ) { ' Split(, v) ; } * 3 else { Let be the corresponing DL graph of ; * 4 ' DL(, v) ; ' Merge( ', v) ; } In the above proceure, the split operation (Line 2) an the DL iteration followe by the merge operation (Line 4) both nee only irect neighbor information of responsible noe v Note that in Line 3 noe v oes not know the overall mapping from DL graph to DL+ graph, it only nees to know the mapping of local physical neighbors It is clear that there are one more noe in ' than in Thus, the DL technique can be applie to builing DHTs base on arbitrary regular graphs with any base Since DL+ graphs all have their corresponing DL graphs, routing in a DL+ graph is straightforwar: Let * be the corresponing DL graph of The path P between any two physical noes u, v always has a counterpart path P* between u*, v* *, where u * an v* are logical noes of u an v, respectively Thus the routing in can be irectly conucte by emulating the routing in the corresponing DL graph * The routing 
procedure in DLG+ graphs is summarized as follows.

Procedure w.DLG+_Routing (Source u, Destination v, Hops k)
// w is the current physical node where the message arrives
// k means how many hops the message has already traversed
1 if (w == v) return; // routing is over
2 Arbitrarily choose one of w's logical nodes w';
// Send the message to the logical next hop, which invokes the
// corresponding physical next hop's DLG+_Routing
3 w'.DLG_Routing (u, v, k);

As shown in the discussion of Fig. 2, all the new nodes in the new DLG graph have the same out-neighbors. Thus all the logical nodes of a physical node have the same out-neighbors. So, when a physical node receives a message, it can arbitrarily choose one of its logical nodes to run the DLG_Routing algorithm and route to the logical next hop and, correspondingly, the physical next hop. Clearly, the physical destination $v$ (in the DLG+ graph) is merged from one or more logical nodes (in the DLG graph), of which there must be one with the same identifier $v$. So, when the message arrives at the logical node $v$ in the DLG graph, it arrives at the physical destination $v$ and the routing is over.

3.4 Analysis

The following Theorem 4 describes the in-degree and out-degree of a node in DLG+ graphs.

Theorem 4. Let graph $G$ be a DLG+ graph with base $d$. For all physical nodes $x \in V(G)$, the out-degree and in-degree, denoted as $\deg^+(x)$ and $\deg^-(x)$, satisfy respectively

$\deg^+(x) = d$, (9a)
$\deg^-(x) \le 2d$. (9b)

And the average in-degree of all nodes in $G$ is $d$.

Proof outline. (9a) can be directly derived from Theorem 2, and thus the average in-degree of all nodes in $G$ is $d$. Next we prove the correctness of (9b) by analyzing the in-neighbors of each logical node held by $x$ in the corresponding DLG graph of $G$. We consider the following cases.

Case 1. There is some physical node $v$ among the in- and out-neighbors of $x$ that satisfies the identifier-length condition of this case with respect to $x$. Let $v^* \in G^*$ be the logical node having the same identifier as $v$. Suppose $v^*$ is generated by $G_{i+1} = \mathrm{DLG}(G_i, v')$. We use $v'$ and $x$ to denote both the logical node and the physical node. Let $x = x_1 x_2 \cdots x_m$. Given any letter $\alpha$ in the in-letter set of $x_1$, by Theorem 1 in $G^*$ there is one
logical node $y^* \in G^*$ among the in-neighbors of $x$ of the form $y^* = \alpha x_1 x_2 \cdots x_{m-1}$, or there are $d$ logical nodes $y^*$ among the in-neighbors of $x$ of the form $y^* = \alpha' \alpha x_1 x_2 \cdots x_{m-1}$ with $\alpha'$ in the in-letter set of $\alpha$. In the first case, each $\alpha$ has one corresponding physical in-neighbor $y$ of $x$. In the second case, each $\alpha$ has two corresponding physical in-neighbors $y_1$ and $y_2$ of $x$. Thus (9b) holds.

Case 2. There are no physical nodes among the in- and out-neighbors of $x$ satisfying the condition of Case 1. Similar to Case 1, we can prove that (9b) still holds for this case, and this completes the proof.

Compared with Theorem 2, this theorem not only guarantees the DLG+ graphs to be always $d$-out-regular, but also assures that the in-degrees are bounded by twice the base. This in-degree upper bound is smaller than that in DLG graphs ($d^2$) since different in-neighbors in DLG graphs might be merged into one in-neighbor in DLG+ graphs.

The following Theorem 5 describes the upper bound for the diameter of a DLG+ graph.

Theorem 5. Let graph $G$ be a DLG+ graph with base $d$ and order $N$. Let the order and diameter of the initial graph $G_0$ be $N_0$ and $D_0$, respectively. Then the diameter of $G$ satisfies

$D(G) \le 2(\log_d N - \log_d N_0 + D_0)$. (10)

Since all the paths in a DLG+ graph have counterpart paths of equal length in the corresponding DLG graph, this theorem can be directly derived by mapping a DLG+ routing onto the corresponding DLG routing. Since the topologies of DLG-enabled DHTs are always DLG+ graphs, by Theorem 5 we see that the diameters of DLG-enabled DHTs reach the theoretical lower bound $\Omega(\log N)$ of constant-degree DHTs [3]. Moreover, we show in the case study on DK (at the end of Section 4.6) that w.h.p. its de facto diameter is much lower than the analytical result ($2\log_d N + 2$).
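The relation between a DLG graph and its DLG+ graph, and the degree properties stated in Theorem 4, can be illustrated with a small sketch. It is only a toy under stated assumptions: the Kautz graph $K(2, 2)$ stands in for a DLG graph $G^*$, and the physical node "P" that holds the two logical nodes 01 and 21 is hypothetical (these two logical nodes were chosen because they share the same out-neighbors). The helper names `kautz_graph`, `collapse` and `in_degrees` are not from the paper.

```python
from itertools import product

def kautz_graph(d, k):
    """Out-neighbour lists of the Kautz graph K(d, k) over the letters 0..d."""
    letters = range(d + 1)
    nodes = [s for s in product(letters, repeat=k)
             if all(a != b for a, b in zip(s, s[1:]))]
    # A node x1..xk points to x2..xk c for every letter c different from xk.
    return {s: [s[1:] + (c,) for c in letters if c != s[-1]] for s in nodes}

def collapse(logical, owner):
    """Physical graph obtained by replacing each logical node with its owner."""
    physical = {p: set() for p in owner.values()}
    for u, successors in logical.items():
        for v in successors:
            physical[owner[u]].add(owner[v])
    return physical

def in_degrees(graph):
    """Count, for every node, how many distinct nodes point to it."""
    degree = {p: 0 for p in graph}
    for successors in graph.values():
        for q in successors:
            degree[q] += 1
    return degree

d = 2
logical = kautz_graph(d, 2)                     # toy stand-in for G*
owner = {s: "".join(map(str, s)) for s in logical}
owner[(0, 1)] = owner[(2, 1)] = "P"             # hypothetical merged physical node

physical = collapse(logical, owner)             # toy stand-in for G
out_degrees = {p: len(q) for p, q in physical.items()}
indeg = in_degrees(physical)
average_in = sum(indeg.values()) / len(indeg)
```

In this toy instance every physical node keeps out-degree $d = 2$, the merged node "P" has in-degree $4 = 2d$, and the average in-degree stays at $d$, which matches the shape of (9a) and (9b).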

4 BUILDING DHTS WITH DLG

In this section we introduce how to utilize our DLG technique to build DHTs based on different initial graphs.

4.1 Self-Stabilization on Node Joins

This subsection discusses the self-stabilization mechanism on node joins, which follows the traditional manner in many existing DHTs like [2], [7], [4], i.e., it first routes to a random node to balance the topology, and then routes to the responsible node to finish the DLG transition. At the beginning, the topology of a DLG-enabled DHT is a $d$-regular initial graph $G_0$ with order $N_0$ and diameter $D_0$. Then it evolves into a family of DLG+ graphs as nodes join/leave. Similar to other DHT stabilization mechanisms, the processing for node joins in DLG-enabled DHTs can be divided into two phases, namely, (i) looking for a responsible node satisfying (8); and (ii) updating the routing tables of relevant neighbors.

1) Looking for the responsible node. Suppose that a new node $p$ joins the network. This phase performs two tasks to find a responsible node $v$. First, node $p$ sends a message targeted to a surrogate node $u$ chosen uniformly at random in the network. This step is to achieve a balanced DHT topology. The discovery of surrogates is orthogonal to our DLG technique and will be discussed in the later case studies (Section 4.6). Second, node $p$ invokes a JOIN message from $u$. During the routing of the JOIN message, suppose that it arrives at an intermediate node $u'$. If $u'$ has a neighbor $w$ satisfying $|w| < |u'|$, then the JOIN message will be forwarded to $w$; otherwise, if $u'$ has a neighbor $w'$ with $|w'| = |u'|$ that is preferred over $u'$ by the tie-breaking comparison of (8), then the JOIN message will be forwarded to $w'$. This procedure will continue until the JOIN message reaches a physical node $v$ satisfying (8), which is the responsible node for this join event.

2) Updating the routing tables. This phase can be easily accomplished by performing a DLG transition ($G' = \mathrm{Split}(G, v)$, or $G' = \mathrm{DLG}(G, v)$ followed by $G'' = \mathrm{Merge}(G', v)$), where $v$ is the responsible node that
is found in the first phase. After the transition, there will be two new nodes in the DHT topology (i.e., the DLG+ graph), one corresponding to $v$, and the other corresponding to the new node $p$.

4.2 Self-Stabilization on Node Leaves

In a dynamic network, nodes can leave at any time. Ideally the leaving node would notify its relevant neighbors, which is referred to as a graceful leave. In contrast, sometimes nodes might leave without notifying any neighbors. This case is referred to as an ungraceful leave.

1) Graceful leaves. When a node $p$ gracefully leaves the network, it will actively transfer the neighbor information in its routing table to other nodes, the procedure of which can be viewed as an inverse operation of node joins.

First, node $p$ invokes a DEPART message. During the routing of the DEPART message, suppose that it arrives at an intermediate node $u$. If $u$ has a neighbor $w$ satisfying $|w| > |u|$, then the DEPART message is forwarded to $w$; otherwise, if $u$ has a neighbor $w'$ with $|w'| = |u|$ and a smaller order than $u$, then the message is forwarded to $w'$. This procedure will continue until the DEPART message reaches the responsible node $v$, which has locally longest identifier or minimum order, i.e., for every in- or out-neighbor $v'$ of $v$,

$|v| > |v'|$, or $|v| = |v'|$ and the order of $v$ is no larger than that of $v'$. (11)

Second, node $v$ hands over its logical nodes, as well as their in-/out-edges (i.e., routing table entries), to its sib-neighbor $x$. Note that the merge/split operations assure the existence of such sib-neighbors. Then node $p$ hands over its logical nodes and their in-/out-edges to $v$, and the change of ownership of the logical nodes is notified to the relevant neighbors. Then $p$ leaves.

Third, if the physical node $x$ holds $d$ logical nodes $x_i = \alpha_i v_1 v_2 \cdots v_k$ with $i = 1, 2, \ldots, d$ after the above handover operation, then the $d$ logical nodes will turn into one logical node $v_1 v_2 \cdots v_k$. Afterwards, the physical node $x$ holds a single logical node, and it has the same identifier as its logical node. Let the two DLG graphs before and after this transition be $G$ and $G'$ respectively. Then the in-neighbors of the new logical node in $G'$ are the union of the in-neighbors of the $x_i$ in $G$, and its out-neighbors in $G'$ are the common out-neighbors of the $x_i$ in $G$.

Clearly the DHT topology would not be affected if each
graceful leave in a succession occurs after its previous leave has already been handled. If two neighbors, say $u$ and $v$, leave gracefully within a very small time slot, then the first leave, say node $v$'s leave, can be handled normally, and the processing of $u$'s leave will be blocked until $v$'s leave has been processed. At this time $u$ can still leave at will, but it first needs to choose a live in-neighbor to temporarily act on its behalf until its leave processing is finished, in case relevant event-processing messages are routed across $u$. By this means we can handle the succession of graceful leaves of any two nodes. Similarly, using neighbors' neighbors, we can handle that of any specific number of nodes (at the price of more backups).

2) Ungraceful leaves. Nodes in DHTs might abruptly stop communicating due to a host of reasons. We classify as ungraceful those leaves that fail to complete the above-mentioned operations required for graceful leaves. To handle ungraceful leaves, as in traditional approaches [3], [7], a simple backup mechanism is as follows. A node $u$ stores a backup of its routing table at each of its direct in-neighbors. These in-neighbors are sorted alphabetically and are in turn called the primary secretary, the second secretary, and so on, of $u$. Node $u$ periodically sends ALIVE messages to its secretaries. Once messages have not been received for a certain period of time, the secretaries will consider that $u$ has ungracefully left the DHT. The primary secretary is responsible for this situation and initiates and carries through all the operations on behalf of $u$. In case the primary secretary fails unexpectedly (e.g., it crashes shortly after $u$'s crash), the second secretary periodically polls the primary secretary so that it is able to detect the failure of the primary secretary. Then, the second secretary takes over the responsibility for $u$'s failure recovery. By the same reasoning, the DHT can survive if at least one in-neighbor is alive.

Theorem 6 describes the processing cost per node join/leave in DLG-enabled DHTs (measured in terms of the number of hops the message traverses).
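Returning briefly to the backup mechanism for ungraceful leaves described above, the secretary ranking and takeover rule can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names, the timeout parameter and the example identifiers are assumptions introduced here.

```python
def secretaries(in_neighbors):
    """Rank the in-neighbours of a node alphabetically: primary secretary first."""
    return sorted(in_neighbors)

def acting_secretary(in_neighbors, alive):
    """Return the highest-ranked secretary that is still alive, or None."""
    for candidate in secretaries(in_neighbors):
        if candidate in alive:
            return candidate
    return None  # no in-neighbour survived, so nobody can recover the node

def suspects_leave(last_alive_at, now, timeout):
    """A secretary suspects an ungraceful leave after `timeout` of silence."""
    return now - last_alive_at > timeout

# Hypothetical in-neighbours of a failed node u.
in_neighbors_of_u = {"201", "021", "121"}
ranking = secretaries(in_neighbors_of_u)

# The primary secretary "021" has crashed too, so the second one takes over.
recoverer = acting_secretary(in_neighbors_of_u, alive={"121", "201"})
```

The recovery survives as long as `acting_secretary` finds at least one live in-neighbor, which mirrors the survival condition stated in the text.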

Theorem 6. Let the order of a DLG-enabled DHT be $N$, and let the order and diameter of its initial $d$-regular graph be $N_0$ and $D_0$, respectively. The processing cost per join/leave event, denoted as $T$ and $S$ respectively, satisfies

$T \le 3(\log_d N - \log_d N_0 + D_0)$, (12a)
$S \le \log_d N - \log_d N_0 + D_0$. (12b)

And at most $3d$ nodes need to update their routing tables.

Proof outline. We divide the joining processing into two parts: routing to the surrogate (taking $T_1$ hops) and routing to the responsible node (taking $T_2$ hops). Clearly $T = T_1 + T_2$. By Theorem 5, $T_1 \le 2(\log_d N - \log_d N_0 + D_0)$. Clearly, $T_2$ is bounded by the difference between the identifier lengths of the surrogate $u$ and the responsible node $v$, and this difference is at most $\log_d N - \log_d N_0 + D_0$; thus $T_2 \le \log_d N - \log_d N_0 + D_0$. Therefore (12a) holds. Similar to the join procedure, it can be inferred that (12b) holds for the leaving processing. If a DLG operation $G' = \mathrm{DLG}(G, v)$ followed by a merge operation $G'' = \mathrm{Merge}(G', v)$ occurs, then the responsible node $v$ holds a single logical node. Then at most $3d$ nodes need to update their routing tables. Similarly, if a split operation takes place, at most $3d$ nodes need to update their routing tables. The case for node leaves is similar to that for node joins. Thus Theorem 6 holds.

From this theorem we can see that the processing cost for a node leave is remarkably less than that for a node join. This is mainly because the first part of join processing (routing to a surrogate), which serves to achieve a more balanced DHT topology, takes a relatively high cost. We will show in Section 4.4 that this cost can be avoided while still bounding the imbalance to the same range as before.

4.3 Concurrency

Although concurrency is orthogonal to DLG, this subsection briefly discusses this issue for completeness. In DHTs many nodes might join/leave simultaneously, which causes temporarily inaccurate information in routing tables and in turn causes errors in the routing and join/leave processing. To address this problem, a simple atomic update mechanism is adopted as follows. When a node joins/leaves the DHT, the routing tables of relevant nodes should be updated. Only when all updates are complete are the new
routing tables allowed to be used. During this period, the relevant JOIN/DEPART messages are suspended. Messages are resent after the update is finished.

4.4 Stabilization Cost vs. Topology Imbalance

Topology imbalance [3], [7] can be measured by the maximum difference of identifier lengths over all pairs of nodes. The imbalance of identifiers is closely related to load balancing issues such as node/edge congestion and hot spots. As discussed in Section 4.1, in DLG-enabled DHTs a joining node first routes to a random node, in order to randomize the responsible node for the forthcoming DLG transition and to alleviate the topology imbalance. By Theorem 6, however, the cost of such balance-oriented processing is $O(\log N)$, which might be too expensive for highly dynamic environments where nodes frequently join/leave. Similar problems exist in almost all DHTs. These problems are actually a tradeoff between the stabilization cost and topology imbalance, where a higher cost results in a more balanced topology and vice versa.

In the rest of this paper, the join algorithm proposed in Section 4.1 will be referred to as balanced join, and this subsection presents a related but different algorithm named fast join, which achieves a lower processing cost at the price of inducing some degree of imbalance. Generally speaking, the fast join algorithm can be viewed as a reduced version of the balanced one, simply omitting the first phase of routing to a random node. That is, the new node directly invokes the JOIN message from a nearby gateway node, and the following processing is the same as the second phase. Note that the first phase is not a part of DLG transitions and it is not indispensable for DLG+ graphs. So, the DLG transition can always be conducted on the topologies with fast join, and the resulting topologies are assured to be DLG+ graphs. Thus the routing is assured (by Theorem 5) to be correct.

The following Theorem 7 describes the processing cost per node join/leave in the fast join algorithm (measured in terms of the number of hops the message traverses).

Theorem
7. Let the order of a DLG-enabled DHT be $N$, and let the order and diameter of its initial $d$-regular graph be $N_0$ and $D_0$, respectively. Then the processing cost per node join in the fast join algorithm, denoted as $T$, satisfies

$T \le \log_d N - \log_d N_0 + D_0$. (13)

And at most $3d$ nodes need to update their routing tables.

Since the processing of fast join is equal to that of the second phase of balanced join, the correctness of this theorem is straightforward. This theorem shows that the bound for the cost in fast join is about 1/3 of that in the balanced one. On the other hand, although in practice the fast join algorithm brings more imbalance to the DHT topology, by Theorem 3 we can infer that the difference between the identifier lengths of any two nodes $u$ and $v$ satisfies $|v| - |u| \le \log_d N - \log_d N_0 + D_0$, which shows that fast and balanced joins do have the same imbalance upper bound. This is because the set of possible topologies for the balanced join algorithm is in theory a superset of that for fast join. For example, it is possible that the randomly chosen responsible node in the balanced join algorithm is always the gateway node used in fast join, resulting in exactly the same processing in both algorithms. We will further evaluate the tradeoff between stabilization cost and topology imbalance through simulations in Section 5.2.

Topology imbalance affects the load balance property. Load balance is much more complicated and is related to many aspects, e.g., the distribution of user queries, the resource distribution in the resource space, and the mapping from resources onto the network topology, which can be further divided into sub-mappings from resources onto keys and from keys onto the topology. Generally speaking, an imbalanced topology leads to an imbalanced load distribution. However, as illustrated by the design of keys in DLG-enabled DHTs starting from $K(d, 1)$ (in Section IV.F of the tech report [28]), the mapping from keys onto the topology is tightly coupled with the particular underlying graph, which is not the concern of DLG. We will further study the impact of topology imbalance on load balance in our future work.
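The two quantities traded off in this subsection can be made concrete with a short sketch. It rests on an assumption: the operators in the bounds are illegible in this copy, so the code takes the common term of Theorems 6 and 7 to be $\log_d N - \log_d N_0 + D_0$, with the balanced-join bound three times that value. The function names and the sample identifiers are hypothetical.

```python
import math

def imbalance(identifiers):
    """Topology imbalance: maximum difference of identifier lengths."""
    lengths = [len(x) for x in identifiers]
    return max(lengths) - min(lengths)

def hop_bound(n, d, n0, d0):
    """Assumed reading of the common bound term: log_d N - log_d N0 + D0."""
    return math.log(n, d) - math.log(n0, d) + d0

# Initial graph K(4, 1): base d = 4, order N0 = d + 1 = 5, diameter D0 = 1.
d, n0, d0 = 4, 5, 1
n = 10**6

fast_join_bound = hop_bound(n, d, n0, d0)   # assumed form of the fast join bound
balanced_join_bound = 3 * fast_join_bound   # assumed form of the balanced join bound

# Hypothetical identifiers of a small, slightly imbalanced topology.
sample_imbalance = imbalance(["01", "012", "0121", "20"])
```

Under this reading, a one-million-node DK with $d = 4$ has a fast-join bound of just under 10 hops, against roughly 29 hops for balanced join, while both variants share the same worst-case imbalance bound.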

Fig. 4. Robust routing.

4.5 Routing under Churn

Frequent node joins/leaves might induce many suspensions and consequently churn [25]. As discussed in Section 4.2, we could use traditional backup techniques for routing tables to support routing during stabilization. If the processing of a join/leave event is not affected by other events, then this event will not hurt the topology. This depends on how many events occur in the processing path and how many backups are held. Clearly, if backups are insufficient, then churn might hurt the DHT topology and consequently affect the routing. To address this problem, this subsection presents a novel DLG-based mechanism that supports efficient routing during stabilization and effectively reduces the effects of churn. Specifically, we add a detour-around mechanism to the basic DLG routing, handling the situation where a neighbor has been identified as the next hop but does not respond since it has failed or has been blocked due to other nodes' joins/leaves.

Let the current DHT topology, its corresponding DLG graph, and the initial $d$-regular graph be $G$, $G'$, and $G_0$, respectively. The basic idea of our detour-around mechanism is to bypass the faulty neighbor and select an alternative node in the DLG graph $G'$ to forward the message. As discussed in Section 2, the routing between any two nodes in a DLG graph contains two phases. In the first phase it is determined by the initial $d$-regular graph $G_0$, where there are $d$ paths between any pair of nodes since the connectivity is $d$. Based on the knowledge about $G_0$, the current node can directly compute the alternative node.

Next we focus on the second phase, where the routing is determined by the target. Suppose that the message arrives at node $c$, and the normal routing path without node failure should be $c \to v \to x \to \cdots$, where $x = x_1 x_2 \cdots x_n$. Suppose that at this moment node $v$ fails. Then two cases need to be considered.

If $|v| \ge |x|$, then by Theorem 1, given any letter $\alpha$ in the in-letter set of $x_1$, node $x$ either has an
in-neighbor of the form $\alpha x_1 x_2 \cdots x_{n-1}$, or has in-neighbors of the form $\alpha' \alpha x_1 x_2 \cdots x_{n-1}$ with $\alpha'$ in the in-letter set of $\alpha$. In this case, node $c$ can bypass $v$ by selecting any two alternate nodes of the form $y = \alpha x_1 \cdots x_{n-1}$ and $y' = \alpha' \alpha x_1 \cdots x_{n-1}$, where $\alpha$ is in the in-letter set of $x_1$, $\alpha'$ is in the in-letter set of $\alpha$, and $y$ and $y'$ are not held by the faulty (physical) node. Note that node $c$ has no direct links to $y$ or $y'$, thus it would not know which of them exists in the network. However, Theorem 1 assures that one of them must exist.

Otherwise, if $|v| < |x|$, then the faulty node $v$ is the only in-neighbor of $x$. In this case, after a DLG iteration $\mathrm{DLG}(G, v)$ we can add to each new node $u_i v$ a temporary link of the form $[w, u_i v]$, where $u_i$ and $w$ are in-neighbors of $v$ and $w \ne u_i$. The temporary links will be deleted when the in-degree of $u_i v$ exceeds one. Note that by this means the out-degree of $w$ may temporarily exceed $d$. The following procedure is the same as in the first case.

Fig. 4 shows an example of robust routing between two nodes. In (a) there is no failure and the routing can be conducted as usual, while in (b) the path between the two nodes fails and the detour-around mechanism is taken.

Fig. 5. Examples of line graph iterations: (a) $K(2, 1)$; (b) $K(2, 2)$; (c) $K(2, 3)$.

4.6 Case Study

Given an arbitrary initial $d$-regular graph, a DHT backbone can be easily obtained by adopting our DLG technique. As discussed before, the remaining tasks are to 1) describe the initial graph $G_0$, i.e., to define the in-letter set and the out-letter set for each letter; 2) design the mechanism to find surrogate nodes for joining events (see Section 4.1); and 3) design the mapping policy from resources onto nodes. Next we introduce four DLG-enabled DHTs based on different graphs, namely, Kautz, de Bruijn, butterfly and hypertree graphs.

1) DLG-Kautz (DK)

We first show how to build DLG-Kautz (DK for short), a DLG-enabled, Kautz graph-based DHT. Let $Z_d = \{0, 1, 2, \ldots, d\}$ be an alphabet of $d + 1$ letters. The Kautz string space $KS(d, k)$ is defined as

$KS(d, k) = \{x_1 x_2 \cdots x_k \mid x_i \in Z_d,\ x_i \ne x_{i+1},\ 1 \le i \le k - 1\}$.

A Kautz graph [7] with diameter $D$ and base $d$, denoted as $K(d, D)$, is defined by its node set and edge set:

$V(K(d, D)) = KS(d, D)$,
$E(K(d, D)) = \{[x_1 x_2 \cdots x_D,\ x_2 \cdots x_D \alpha] \mid \alpha \in Z_d,\ \alpha \ne x_D\}$.

Examples of Kautz graphs $K(2, 1)$, $K(2, 2)$, and $K(2, 3)$ are shown in Fig. 5(a), (b), and (c), respectively. For the sake of simplicity, here we only consider the case of the initial graph $K(d, 1)$.

(i) The in-letter set and out-letter set of each letter $\alpha \in Z_d$ are both $\{\beta \in Z_d \mid \beta \ne \alpha\}$.

(ii) To find a surrogate, we use the KHash algorithm [7] on the joining node's IP address to get a target string $s$; the surrogate is the node onto which $s$ is mapped.

(iii) Mapping from resources onto nodes can be divided into mappings from resources onto keys and from keys onto nodes. Any resource can be consistently and

uniformly mapped to a Kautz string by using the KHash algorithm [7]. However, the mapping from a Kautz string to a node cannot adopt prefix (or suffix) matching policies [7], [2] since they cannot guarantee consistency. The description of the mapping from keys onto nodes is omitted here due to lack of space; see our tech report [28].

We prove (in our tech report [28]) that for appropriate numbers of nodes the DK topology can be approximated by the Kautz graph $K(d, m)$. Thus we can say that the main features of the initial Kautz graph are preserved after a series of DLG transitions. We will further discuss the properties of DLG-Kautz through analysis and evaluations in Sections 4.6.5 and 5, respectively.

2) DLG-de Bruijn (DB)

A de Bruijn graph [6] with diameter $D$ and base $d$, denoted as $B(d, D)$, is defined by its node set and edge set:

$V(B(d, D)) = Z_{d-1}^{D}$,
$E(B(d, D)) = \{[x_1 x_2 \cdots x_D,\ x_2 \cdots x_D \alpha] \mid \alpha \in Z_{d-1}\}$,

where $Z_{d-1} = \{0, 1, \ldots, d - 1\}$ is an alphabet of $d$ letters. Similar to DLG-Kautz, next we show how to build DLG-de Bruijn (DB for short), a DLG-enabled, de Bruijn graph-based DHT, by describing the initial graph $B(d, 1)$. The in-letter set and out-letter set of each letter are both the whole alphabet. The mechanisms of surrogate node discovery and consistent resource mapping are orthogonal to our DLG technique and can be designed in a way similar to those in DK. Here we omit these mechanisms due to lack of space. Fig. 6 shows three examples of the DB topologies as nodes join/leave the network.

3) DLG-Butterfly (DBF)

A $(k, r)$-butterfly [4] is a graph with $n = k r^k$ nodes, where $k$ and $r$ are the diameter and degree of the graph, respectively. Each node is of the form $(x_0 x_1 \cdots x_{k-1}; i)$ with $0 \le i \le k - 1$ ($i$ is called the level) and $0 \le x_0, x_1, \ldots, x_{k-1} \le r - 1$. For each node $(x_0 x_1 \cdots x_{k-1}; i)$, there is an edge to all nodes of the form $(x_0 \cdots x_i\, y\, x_{i+2} \cdots x_{k-1}; i + 1)$ when $i \ne k - 1$, and of the form $(y\, x_1 \cdots x_{k-1}; 0)$ when $i = k - 1$. An example of the (2,2)-butterfly graph is shown in Fig. 7(a). In the basic definition, the diameter $k$ decides both the number of levels and the length of node identifiers. Since the number of levels is hard to extend, as in other butterfly-based DHTs such as Ulysses
[4], we decouple the number of levels ($k$) and the length of node identifiers ($m$), and extend the length $m$ only. By this means, a generalized butterfly graph can be represented as $BF(k, r, m)$, where $k$, $r$, $m$ are the number of levels, the degree, and the length of node identifiers, respectively. An example of the generalized butterfly $BF(2, 2, 1)$ is shown in Fig. 7(b), which is isomorphic to Fig. 7(a). Note that the identifiers 0, 1, 2, 3 of level 0 and level 1 in Fig. 7(b) are mapped from 00, 01, 10, 11 of the corresponding levels in Fig. 7(a) according to the unified description mechanism.

Next we show how to build DLG-Butterfly (DBF for short), a DLG-enabled, butterfly graph-based DHT, by describing the initial graph $BF(2, 2, 1)$. The in-letter sets and out-letter sets of each letter are specified level by level, in terms of the parity of the letter and of arithmetic modulo 4. Again we omit the design of surrogate node discovery and consistent resource mapping due to lack of space. Fig. 7(b) and (c) show two examples of the DBF topologies.

Fig. 6. Examples of DB topologies: (a) $B(2, 1)$; (b) $B(2, 2)$.

Fig. 7. Example of DBF topologies.

4) DLG-Hypertree (DH)

A hypertree graph [26] has $n$ leaves and depth $\log_d N$, where $d$ and $N$ are the base and the maximum number of nodes. The Owlet Research Group [27] implements a DLG-enabled, hypertree-based DHT, DLG-Hypertree (DH for short), by using our DLG technique. Space precludes further discussion, and interested readers are referred to their homepage [27].

5) Comparison

As discussed above, the DLG technique can be used to design many different DHTs. The following Theorem 8 compares the diameters of all the DLG-enabled DHTs.

Theorem 8. Given the base $d$, DK has the lowest diameter upper bound among all DLG-enabled DHTs.

Proof outline. The diameter of DLG-enabled DHTs satisfies $D(G) \le 2(\log_d N - \log_d N_0 + D_0)$, where $d$, $N_0$ and $D_0$ are the base, order and diameter of the initial graph, and $N$ is the current network size. By the Moore bound [], $N_0$ satisfies

$N_0 \le 1 + d + d^2 + \cdots + d^{D_0} = (d^{D_0+1} - 1)/(d - 1)$.

Therefore, $D_0 - \log_d N_0 \ge D_0 - \log_d\big((d^{D_0+1} - 1)/(d - 1)\big)$. The minimum value of $D_0 - \log_d N_0$ can be
attained ONLY by the complete symmetric graph where each node is adjacent to all others, i.e., the Kautz graph $K(d, 1)$. Since we use $K(d, 1)$ as the initial graph, it is easy to infer that DK's diameter is bounded by $2\log_d N + 2$.

However, for DK with balanced join, the KHash algorithm [7] assures a uniform distribution of surrogate nodes $u$ in the network, which consequently assures a uniform distribution of responsible nodes $v$ (see Section 4.1). Thus it is very likely that DK's topology is always relatively balanced [], i.e., the difference between the maximum and minimum identifier lengths is small. If the difference is no more than 1, then the topology is equivalent to a PL

Kautz graph [] with diameter $\log_d N$ (see the technical report [28]). Evaluations in Section 5 show that the de facto DK diameter is no more than $\log_d N + 1$. It is well known that Kautz graphs have the lowest diameter $\log_d N - \log_d(1 + 1/d)$, achieving the Moore bound []; while to our knowledge, with the same routing table size $2d$, DK has the lowest rigorous (as opposed to asymptotic) diameter upper bound $2\log_d N + 2$ as well as the lowest de facto diameter $\log_d N$. So, we conclude that our DLG technique preserves the topology feature of Kautz graphs when it is applied to designing DK. Note that although DK has the optimal diameter, this does not mean that DK is always the best choice among all the DLG-enabled DHTs. This is because there are many different concerns for other topology properties such as congestion [7], robustness [4] and churn [25].

5 EVALUATION

In this section, we choose DK as a model to evaluate DLG and compare DK with other constant-degree DHTs. Note, however, that since the main purpose of this section is to demonstrate the effectiveness of DLG, we omit the evaluations of DK-specific properties. In Section 5.1 we show that with the same base $d$ (meaning the same average routing table size $2d$), the maximum and average routing path lengths are remarkably less for DK than for other constant-degree DHTs. Section 5.2 evaluates the stabilization cost and topology imbalance of DK with the balanced and fast join algorithms, which are correspondingly referred to as balanced DK (B-DK) and fast DK (F-DK). In Section 5.3, we show that the DK routing is robust under churn, even when a certain percentage of nodes become faulty. By following the detour-around mechanism, when as many as 10% of nodes are faulty, most routing messages targeted to healthy nodes can still reach the destination without failure. The evaluations are conducted by modifying the Tapestry simulator [5]. In our evaluations, the number of nodes is varied from 256 up to 1M,
and the experiment for each property is repeated at least a thousand times.

5.1 Routing Path Length

We evaluate the maximum and average routing path lengths (measured in terms of overlay hops) of balanced DK, as a function of the number of nodes, for the bases $d = 4$ and $d = 6$. In each experiment we randomly choose two nodes, and a message is routed by using the proposed routing algorithm. The results are plotted in Fig. 8, where the average and maximum routing path lengths of DK are denoted as DK(avg) and DK(max), respectively. We compare them with other high-performance DHTs including CAN, Koorde and FissionE. Note that although FissionE can only have a fixed base $d = 2$, it is also included in Fig. 8(a) for comparison. Curves for all DHTs except CAN look linear since the x-axis is in log scale. (By [3], the average path length $L$ of CAN with order $N$ and base $d$ satisfies $L = (d/4)N^{1/d}$.)

Fig. 8. Maximum and average routing path length: (a) $d = 4$ (CAN, Koorde, FissionE, DK(avg), DK(max)); (b) $d = 6$ (CAN, Koorde, DK(avg), DK(max)). x-axis: number of nodes, from 256 to 1M in log scale; y-axis: number of hops.

From Fig. 8 we conclude that both the maximum and average routing path lengths of DK are remarkably less than those of the others. Koorde performs worse since its imaginary nodes [6] consume more hops. Note that although we cannot include all existing DHTs in Fig. 8, to the best of our knowledge DK does have the lowest routing path lengths in the literature (under the condition of the same routing table size). Basically, this holds because DLG enables DK to inherit the optimal routing path length of Kautz graphs, whose diameters achieve the Moore bound [].

We evaluate the probability distribution of path lengths of DK with bases $d = 4$ and $d = 6$, for a fixed network size of 1 million nodes. The results are shown in Fig. 9. We see that DK has a low variance both for $d = 4$ and for $d = 6$, indicating that the lengths of most routing paths are very close to the mean value.

5.2 Stabilization Cost and Topology Imbalance

We evaluate the
average message cost per join/leave event in the balanced and fast join algorithms, which are correspondingly referred to as balanced DK (B-DK) and fast DK (F-DK), for a fixed network size of 1 million nodes, and compare them with Koorde, which has the lowest maintenance cost among other DHTs according to its authors. There are a number of gateway nodes, and a new node randomly chooses a gateway to start the processing. The base is set to $d = 4$ and $d = 6$ respectively. Since the results obtained with the two values of $d$ have similar characteristics, we show results in Fig. 10 corresponding to $d = 6$ only. From this figure we can see that the average cost of