Efficient Recovery from False State in Distributed Routing Algorithms

Daniel Gyllstrom, Sudarshan Vasudevan, Jim Kurose, Gerome Miklau
Department of Computer Science, University of Massachusetts Amherst
{dpg, svasu, kurose, miklau}@cs.umass.edu

Abstract

Malicious and misconfigured nodes can inject incorrect state into a distributed system, which can then be propagated system-wide as a result of normal network operation. Such false state can degrade the performance of a distributed system or render it unusable. In the case of network routing algorithms, for example, false state corresponding to a node incorrectly declaring a cost of 0 to all destinations (maliciously or due to misconfiguration) can quickly spread through the network, causing other nodes to (incorrectly) route via the misconfigured node, resulting in suboptimal routing and network congestion. We propose three algorithms for efficient recovery in such scenarios and prove the correctness of each of these algorithms. Through simulation, we evaluate our algorithms when applied to removing false state in distance vector routing, in terms of message and time overhead. Our analysis shows that over topologies where link costs remain fixed, a recovery algorithm based on system-wide checkpoints and a rollback mechanism yields superior performance. We find that a different algorithm, one that selectively purges false routing state network-wide, yields the best performance in scenarios where link costs change.

I. INTRODUCTION

Malicious and misconfigured nodes can degrade the performance of a distributed system by injecting incorrect state information. Such false state can then be further propagated through the system either directly in its original form or indirectly, e.g., as a result of diffusing computations initially using this false state. In this paper, we consider the problem of removing such false state from a distributed system. In order to make the false-state-removal problem concrete, we investigate distance vector routing as an instance of this problem.
Distance vector forms the basis for many routing algorithms widely used in the Internet (e.g., BGP, a path-vector algorithm) and in multi-hop wireless networks (e.g., AODV, diffusion routing). However, distance vector is vulnerable to compromised nodes that can potentially flood a network with false routing information, resulting in erroneous least cost paths, packet loss, and congestion. Such scenarios have occurred in practice. For example, in 1997 a significant portion of Internet traffic was routed through a single misconfigured router, rendering a large part of the Internet inoperable for several hours [19]. More recently [1], a routing error forced Google to redirect its traffic through Asia, causing congestion that left many Google services unreachable. Distance vector currently has no mechanism to recover from such scenarios. Instead, human operators are left to manually reconfigure routers. It is in this context that we propose and evaluate automated solutions for recovery.

In this paper, we design, develop, and evaluate three different approaches for correctly recovering from the injection of false routing state (e.g., a compromised node incorrectly claiming a distance of 0 to all destinations). Such false state, in turn, may propagate to other routers through the normal execution of distance vector routing, making this a network-wide problem. Recovery is correct if the routing tables in all nodes have converged to a global state in which all nodes have removed each compromised node as a destination, and no node bears a least cost path to any destination that routes through a compromised node.

Specifically, we develop three novel distributed recovery algorithms: 2nd best, purge, and cpr. 2nd best performs localized state invalidation, followed by network-wide recovery. Nodes directly adjacent to a compromised node locally select alternate paths that avoid the compromised node; the traditional distributed distance vector algorithm is then executed to remove remaining false state using these new distance vectors.
The purge algorithm performs global false state invalidation by using diffusing computations to invalidate distance vector entries (network-wide) that route through a compromised node. As in 2nd best, traditional distance vector routing is then used to recompute distance vectors. cpr uses local snapshots and a rollback mechanism to implement recovery. Although our solutions are tailored to distance vector routing, we believe they represent approaches that are applicable to other instances of this problem.

We prove the correctness of each algorithm and evaluate its efficiency in terms of message overhead and convergence time via simulation. Our simulations show that when considering topologies in which link costs remain fixed, cpr outperforms both purge and 2nd best (at the cost of checkpoint memory). This is because cpr can efficiently remove all false state by simply rolling back to a checkpoint immediately preceding the injection of false routing state. In scenarios where link costs can change, purge outperforms cpr and 2nd best. cpr performs poorly because, following rollback, it must process the valid link cost changes that occurred since the false routing state was injected; 2nd best and purge, however, can make use of computations subsequent to the injection of false routing state that did not depend on the false routing state. We will see, however, that 2nd best performance suffers because of the so-called count-to-∞ problem.

Recovery from false routing state is closely related to the problem of recovering from malicious transactions [15], [4] in distributed databases. Our problem is also similar to that of rollback in optimistic parallel simulation [13]. However, we are unaware of any existing solutions to the problem of recovering from false routing state. A closely related problem to the one considered in this paper is that of discovering misconfigured nodes. In Section II, we discuss existing solutions to this problem. In fact, the output of these algorithms serves as input to the recovery algorithms proposed in this paper.

This paper has seven sections. In Section II we define the problem and state our assumptions. We present our three recovery algorithms in Section III. Then, in Section IV, we present a qualitative evaluation of our recovery algorithms. Section V describes our simulation study. We detail related work in Section VI and finally we conclude and comment on directions for future work in Section VII.

II. PROBLEM FORMULATION

We consider distance vector routing [5] over arbitrary network topologies. We model a network as an undirected graph, $G = (V, E)$, with a link weight function $w : E \to \mathbb{N}$. Each node, $v$, maintains the following state as part of distance vector: a vector of all adjacent nodes ($adj(v)$), a vector of least cost distances to all nodes in $G$ ($min_v$), and a distance matrix that contains distances to every node in the network via each adjacent node ($matrix_v$).

We assume that the identity of the compromised node is provided by a different algorithm, and thus do not consider this problem in this paper. Examples of such algorithms include [7], [8], [9] in the context of wired networks and [21] in the wireless setting. Specifically, we assume that at time $t$, this algorithm is used to notify all neighbors of the compromised node(s) that a node was compromised. Let $t'$ be the time the node was compromised.
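As an illustration of the per-node state just described, the following Python sketch keeps the adjacent link costs, the distance matrix, and the least cost vector for one node and folds in a neighbor's advertised vector. The class name `NodeState`, its method names, and the link cost of 50 used in the usage note are assumptions made for this sketch; they are not taken from the paper's specification.

```python
from dataclasses import dataclass, field


@dataclass
class NodeState:
    """Distance vector state kept by one node (all names are hypothetical)."""
    node_id: str
    link_cost: dict                               # adj(v): neighbor -> direct link cost
    matrix: dict = field(default_factory=dict)    # matrix_v: dest -> {neighbor: distance via neighbor}
    min_vec: dict = field(default_factory=dict)   # min_v: dest -> (least cost, next hop)

    def receive_vector(self, neighbor, neighbor_min):
        """Fold a neighbor's least cost vector into matrix_v, then refresh min_v."""
        for dest, cost in neighbor_min.items():
            if dest == self.node_id:
                continue  # a node does not keep a route to itself
            self.matrix.setdefault(dest, {})[neighbor] = self.link_cost[neighbor] + cost
        return self.recompute_min()

    def recompute_min(self):
        """Recompute min_v from matrix_v; return True if any entry changed."""
        changed = False
        for dest, via in self.matrix.items():
            hop = min(via, key=via.get)           # cheapest first hop for this destination
            best = (via[hop], hop)
            if self.min_vec.get(dest) != best:
                self.min_vec[dest] = best
                changed = True
        return changed

    def advertised_vector(self):
        """The least cost vector this node would send to its neighbors."""
        vec = {self.node_id: 0}
        vec.update({dest: cost for dest, (cost, _) in self.min_vec.items()})
        return vec
```

For example, a node `i` with links of cost 50 to `j` and `k` that receives `{"j": 0, "l": 50}` from `j` records a distance of 100 to `l` via `j`. A second, identical message leaves the state unchanged, so no new advertisement would be needed.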
For each of our algorithms, the goal is for all nodes to recover correctly: all nodes should remove the compromised node as a destination and find new least cost distances that do not use the compromised node. If the network becomes disconnected as a result of removing the compromised node, all nodes need only compute new least cost distances to all other nodes within their connected component.

For simplicity, let $\bar{v}$ denote the compromised node, let $old$ refer to $min_{\bar{v}}$ before $\bar{v}$ was compromised, and let $bad$ denote $min_{\bar{v}}$ after $\bar{v}$ has been compromised. Table I summarizes the notation used in this document.

TABLE I
TABLE OF ABBREVIATIONS.

Abbreviation      Meaning
$min_i$           node $i$'s least cost vector
$matrix_i$        node $i$'s distance matrix
lc                link cost change event
$t$               time the oracle detects the compromised node
$t'$              time the compromised node was compromised
$bad$             compromised node's least cost vector at and after $t'$
$old$             compromised node's least cost vector at and before $t'$
$\bar{v}$         compromised node
$adj(\bar{v})$    nodes adjacent to $\bar{v}$

III. RECOVERY ALGORITHMS

In this section we propose three new recovery algorithms: 2nd best, purge, and cpr. With one exception, the input and output of each algorithm is the same.¹

Input: Undirected graph, $G = (V, E)$, with weight function $w : E \to \mathbb{N}$. $\forall v \in V$, $min_v$ and $matrix_v$ are computed (using distance vector). Also, each $v \in adj(\bar{v})$ is notified that $\bar{v}$ was compromised.

Output: Undirected graph, $G' = (V', E')$, where $V' = V - \{\bar{v}\}$, $E' = E - \{(\bar{v}, v_i) \mid v_i \in adj(\bar{v})\}$, and link weight function $w : E' \to \mathbb{N}$. $min_v$ and $matrix_v$ are computed via the algorithms discussed below $\forall v \in V'$.

¹ Additionally, as input cpr requires that each $v \in adj(\bar{v})$ is notified of the time, $t'$, at which $\bar{v}$ was compromised.

First we describe a preprocessing procedure common to all three recovery algorithms. Then we describe each recovery algorithm.

A. Preprocessing

All three recovery algorithms share a common preprocessing procedure. The procedure removes $\bar{v}$ as a destination and finds the node IDs in each connected component. This could be implemented (as we have done here) using diffusing computations [6] initiated at each $v \in adj(\bar{v})$.
A diffusing computation is a distributed algorithm started at a source node which grows by sending queries along a spanning tree, constructed simultaneously as the queries propagate through the network. When the computation reaches the leaves of the spanning tree, replies travel back along the tree towards the source, causing the tree to shrink. The computation eventually terminates when the source receives replies from each of its children in the tree.

In our case, each diffusing computation message contains a vector of node IDs. When a node receives a diffusing computation message, the node adds its ID to the vector and removes $\bar{v}$ as a destination. At the end of the diffusing computation, each $v \in adj(\bar{v})$ has a vector that includes all nodes in $v$'s connected component. Finally, each $v \in adj(\bar{v})$ broadcasts the vector of node IDs to all nodes in its connected component. In the case where removing $\bar{v}$ partitions the network, each node will only compute shortest paths to nodes in the vector.

Consider the example in Figure 1 where $\bar{v}$ is the compromised node. When i receives the notification that $\bar{v}$ has been compromised, i removes $\bar{v}$ as a destination and then initiates a diffusing computation. i creates a vector and adds its node ID to the vector. i sends a message containing this vector to j and k. Upon receiving i's message, j and k both remove $\bar{v}$ as a destination and add their own ID to the message's vector. Finally, l and d receive a message from j and k, respectively. l and d add their own node ID to the message's vector and remove $\bar{v}$ as a destination. Then, l and d send an ACK message back to j and k, respectively, with the complete list of node IDs. Eventually, when i receives the ACKs from j and k, i has a complete list of nodes in its connected component. Finally, i broadcasts the vector of node IDs in its connected component.

B. The 2nd best Algorithm

2nd best invalidates state locally and then uses distance vector to implement network-wide recovery. Following the preprocessing described in Section III-A, each neighbor of the compromised node locally invalidates state by selecting the least cost pre-existing alternate path that does not use the compromised node as the first hop. The resulting distance vectors trigger the execution of traditional distance vector to remove the remaining false state. Algorithm 1 in the Appendix gives a complete specification of 2nd best.

We trace the execution of 2nd best using the example in Figure 1. At the time shown in Figure 1(b), i uses $\bar{v}$ to reach nodes l and d. j uses i to reach all nodes except l. Notice that when j uses i to reach d, it transitively uses $bad$ (e.g., uses path $j \to i \to \bar{v} \to d$ to d). After the preprocessing completes, i selects a new neighbor to route through to reach l and d by finding its new smallest distance in $matrix_i$ to these destinations: i selects the route via j to l with a cost of 100 and i picks the route via k to reach d with a cost of 100. (No changes are required to route to j and k because i uses its direct link to these two nodes.) Then, using traditional distance vector, i sends $min_i$ to j and k. When j receives $min_i$, j must modify its distance to d because $min_i$ indicates that i's least cost to d is now 100. j's new distance value to d becomes 150, using a path through i. j then sends a message sharing $min_j$ with its neighbors. From this point, recovery proceeds using traditional distance vector.

2nd best is simple and makes no synchronization assumptions. However, 2nd best is vulnerable to the count-to-∞ problem. Because each node only has local information, the new shortest paths may continue to use $\bar{v}$. For example, if $w(k, d) = 400$ in Figure 1, a count-to-∞ scenario would arise.
After notification of $\bar{v}$'s compromise, i would select the route via j to reach d with cost 151 (by consulting $matrix_i$), using a path that does not actually exist in $G'$ ($i \to j \to i \to \bar{v} \to d$), since j has removed $\bar{v}$ as a neighbor. When i sends $min_i$ to j, j selects the route via i to d with cost 201. Again, the path $j \to i \to j \to i \to \bar{v} \to d$ does not exist. In the next iteration, i picks the route via j having a cost of 251. This process continues until each node finds its correct least cost to d. We will see in our simulation study that the count-to-∞ problem can incur significant message and time costs.

C. The purge Algorithm

purge globally invalidates all false state using a diffusing computation and then uses distance vector to compute new distance values that avoid all invalidated paths. The diffusing computation is initiated at the neighbors of $\bar{v}$ because only these nodes are aware if $\bar{v}$ is used as an intermediary node. The diffusing computations spread from $\bar{v}$'s neighbors to the network edge, invalidating false state at each node along the way. Then ACKs travel back from the network edge to the neighbors of $\bar{v}$, indicating that the diffusing computation is complete. See Algorithms 2 and 3 in the Appendix for a complete specification of this diffusing computation. Next, purge uses distance vector to recompute least cost paths invalidated by the diffusing computations.

In Figure 1, the diffusing computation executes as follows. First, i sets its distance to l and d to ∞ (thereby invalidating i's path to l and d) because i uses $\bar{v}$ to route to these nodes. Then, i sends a message to j and k containing l and d as invalidated destinations. When j receives i's message, j checks if it routes via i to reach l or d. Because j uses i to reach d, j sets its distance estimate to d to ∞. j does not modify its least cost to l because j does not route via i to reach l. Next, j sends a message that includes d as an invalidated destination. l performs the same steps as j. After this point, the diffusing computation ACKs travel back towards i. When i receives an ACK, the diffusing computation is complete.
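The invalidation walk-through above can be sketched in Python as a message queue over the surviving graph. This is a minimal illustration, not the Appendix specification: the function name `purge_invalidate`, the dictionary layout, and the choice of `"v"` as the compromised node's ID are assumptions, and the ACK messages that signal termination are omitted.

```python
import math
from collections import deque


def purge_invalidate(next_hop, dist, adj, compromised):
    """Set to infinity every least cost entry that routes through `compromised`.

    next_hop[n][d]: neighbor that n forwards to for destination d.
    dist[n][d]:     n's current least cost to d (mutated in place).
    adj[n]:         neighbors of n once the compromised node is removed.
    ACK messages are omitted from this sketch.
    """
    queue = deque()
    # Nodes that use the compromised node as first hop start the computation.
    for node, hops in next_hop.items():
        bad_dests = {d for d, hop in hops.items() if hop == compromised}
        if bad_dests:
            for d in bad_dests:
                dist[node][d] = math.inf
            queue.extend((node, nb, frozenset(bad_dests)) for nb in adj[node])

    while queue:
        sender, receiver, dests = queue.popleft()
        # Invalidate only destinations the receiver reaches via the sender.
        hit = {d for d in dests
               if next_hop[receiver].get(d) == sender
               and dist[receiver][d] != math.inf}
        for d in hit:
            dist[receiver][d] = math.inf
        if hit:
            # Forward newly invalidated destinations away from the sender.
            queue.extend((receiver, nb, frozenset(hit))
                         for nb in adj[receiver] if nb != sender)
    return dist
```

Because a node forwards only destinations it has newly invalidated, each (node, destination) entry is invalidated at most once, so the sketch terminates on any finite graph.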
At this point, i needs to compute new least costs to nodes l and d because i's distance estimates to these destinations are ∞. i uses $matrix_i$ to select its new route to l (which is via j) and uses $matrix_i$ to find i's new route to d (which is via k). Both new paths have cost 100. Finally, i sends $min_i$ to its neighbors, triggering the execution of distance vector to recompute the remaining distance vectors.

Note that a consequence of the diffusing computation is that not only is all $bad$ state deleted, but all $old$ state as well. Consider the case when $\bar{v}$ is detected before node i receives $bad$. It is possible that i uses $old$ to reach a destination, d. In this case, the diffusing computation will set i's distance to d to ∞.

An advantage of purge is that it makes no synchronization assumptions. Also, the diffusing computations ensure that the count-to-∞ problem does not occur by removing false state from the entire network. However, globally invalidating false state can be wasteful if valid alternate paths are locally available.

D. The cpr Algorithm

cpr² is our third and final recovery algorithm. Unlike 2nd best and purge, cpr requires that clocks across different nodes be loosely synchronized, i.e., the maximum clock offset between any two nodes is assumed to be δ. For ease of explanation, we describe cpr as if the clocks at different nodes are perfectly synchronized. Extensions to handle loosely synchronized clocks should be clear. Accordingly, we assume that all neighbors of $\bar{v}$ are notified of the time, $t'$, at which $\bar{v}$ was compromised.

For each node, $i \in G$, cpr adds a time dimension to $min_i$ and $matrix_i$, which cpr then uses to locally archive a complete history of values. Once the compromised node is discovered, the archive allows the system to roll back to a system snapshot from a time before $\bar{v}$ was compromised. From this point, cpr needs to remove $\bar{v}$ and $old$ and update stale distance values resulting from link cost changes. We describe each algorithm step in detail.

² The name is an abbreviation for CheckPoint and Rollback.

Fig. 1. Three snapshots of a graph, $G$, where $\bar{v}$ is the compromised node: (a) $G$ before $\bar{v}$ goes bad, (b) $G$ after $bad$ has finished propagating but before recovery has started, and (c) $G$ after recovery. The dashed lines in (b) indicate paths using $bad$. $matrix_i$ and $matrix_j$, at the time of the snapshot, are displayed to the right of each sub-figure. The least cost values are underlined.

Step 1: Create a min and matrix archive. We define a snapshot of a data structure to be a copy of all current distance values along with a timestamp.³ The timestamp marks the time at which that set of distance values starts being used. min and matrix are the only data structures that need to be archived. This approach is similar to ones used in temporal databases [16], [14].

Our distributed archive algorithm is quite simple. Each node has a choice of archiving at a given frequency (e.g., every m timesteps) or after some number of distance value changes (e.g., each time a distance value changes). Each node must choose the same option, which is specified as an input parameter to cpr. A node archives independently of all other nodes. A side effect of independent archiving is that, even with perfectly synchronized clocks, the union of all snapshots may not constitute a globally consistent snapshot. For example, a link cost change event may only have propagated through part of the network, in which case the snapshot for some nodes will reflect this link cost change (i.e., among nodes that have learned of the event) while for other nodes no local snapshot will reflect the occurrence of this event. We will see that a globally consistent snapshot is not required for correctness.
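A minimal Python sketch of the local archive in Step 1 follows. The class name `SnapshotArchive`, the `every_m` parameter, and the use of integer timesteps are assumptions for illustration. The sketch copies whole vectors on each snapshot rather than archiving only the values that changed.

```python
import copy


class SnapshotArchive:
    """Per-node archive of (timestamp, min, matrix) snapshots; names are hypothetical."""

    def __init__(self, every_m=None):
        # every_m=None  -> archive whenever a distance value changes
        # every_m=m > 0 -> archive every m timesteps
        self.every_m = every_m
        self.snapshots = []  # oldest first

    def maybe_archive(self, now, min_vec, matrix, changed):
        """Archive the current state if the configured policy says it is due."""
        if self.every_m:
            due = now % self.every_m == 0
        else:
            due = changed
        if due:
            # Deep copies keep later updates from altering the stored snapshot.
            self.snapshots.append(
                (now, copy.deepcopy(min_vec), copy.deepcopy(matrix)))
        return due
```

Each node would own one such archive and call `maybe_archive` once per timestep, independently of every other node, which is why the union of local snapshots need not be globally consistent.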
Step 2: Rolling back to a valid snapshot. Rollback is implemented using diffusing computations. Neighbors of the compromised node independently select a snapshot to roll back to, such that the snapshot is the most recent one taken before $t'$. Each such node, i, rolls back to this snapshot by restoring the $min_i$ and $matrix_i$ values from the snapshot. Then, i initiates a diffusing computation to inform all other nodes to do the same. If a node has already rolled back and receives an additional rollback message, it is ignored. (Note that this rollback algorithm ensures that no reinstated distance value uses $bad$ because every node rolls back to a snapshot with a timestamp less than $t'$.) Algorithm 4 in the Appendix gives the pseudo-code for the rollback algorithm.

Step 3: Steps after rollback. After Step 2 completes, the algorithm in Section III-A is executed. There are two issues to address. First, some nodes may be using $old$. Second, some nodes may have stale state as a result of link cost changes that occurred during $[t', t]$ and consequently are not reflected in the snapshot. To resolve these issues, each neighbor, i, of $\bar{v}$ sets its distance to $\bar{v}$ to ∞ and then selects new least cost values that avoid the compromised node, triggering the execution of distance vector to update the remaining distance vectors. That is, for any destination, d, where i routes via $\bar{v}$ to reach d, i uses $matrix_i$ to find a new least cost to d. If a new least cost value is used, i sends a distance vector message to its neighbors. Otherwise, i sends no message. Messages sent trigger the execution of distance vector.

During the execution of distance vector, each node uses the most recent link weights of its adjacent links. Thus, if the same link changes cost multiple times during $[t', t]$, we ignore all changes but the most recent one. Algorithm 5 specifies Step 3 of cpr.
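The snapshot choice in Step 2 and the local reselection in Step 3 can be sketched as two small Python functions. The names `select_snapshot` and `reselect_without`, and the representation of an archive as a timestamp-sorted list, are assumptions for this sketch. The diffusing computation that spreads the rollback is not shown.

```python
import bisect
import math


def select_snapshot(snapshots, t_compromised):
    """Return the most recent (timestamp, state) pair taken strictly before t_compromised.

    `snapshots` must be sorted by timestamp, oldest first.
    """
    times = [ts for ts, _ in snapshots]
    idx = bisect.bisect_left(times, t_compromised)  # first index with ts >= t'
    if idx == 0:
        raise LookupError("no snapshot predates the compromise")
    return snapshots[idx - 1]


def reselect_without(matrix, compromised):
    """Recompute least costs from a distance matrix, ignoring the compromised node.

    matrix[dest][neighbor] is the distance to dest via neighbor. The compromised
    node is dropped both as a destination and as a first hop.
    """
    new_min = {}
    for dest, via in matrix.items():
        if dest == compromised:
            continue
        usable = {hop: cost for hop, cost in via.items() if hop != compromised}
        if usable:
            hop = min(usable, key=usable.get)
            new_min[dest] = (usable[hop], hop)
        else:
            new_min[dest] = (math.inf, None)  # no alternate path is known locally
    return new_min
```

A neighbor of the compromised node would call `select_snapshot` on its own archive, restore that state, and then compare the result of `reselect_without` against its restored least cost vector to decide whether a distance vector message needs to be sent.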
In the example from Figure 1, the global state after rolling back is nearly the same as the snapshot depicted in Figure 1(c): the only difference between the actual system state and that depicted in Figure 1(c) is that in the former $(i, \bar{v}) = 50$ rather than ∞. Step 3 in cpr makes this change. Because no nodes use $old$, no other changes take place.

Rather than using an iterative process to remove false state (as in 2nd best and purge), cpr does so in one diffusing computation. However, cpr incurs storage overhead resulting from periodic snapshots of min and matrix. Also, after rolling back, stale state may exist if link cost changes occur during $[t', t]$. This can be expensive to update. Finally, unlike purge and 2nd best, cpr requires loosely synchronized clocks because without a bound on the clock offset, nodes may roll back to highly inconsistent local snapshots. Although correct, this would severely degrade performance.

³ In practice, we only archive distance values that have changed. Thus each distance value is associated with its own timestamp.

IV. ANALYSIS OF ALGORITHMS

In Section IV-A, we prove the correctness of our three recovery algorithms. Then, we prove specific properties of these recovery algorithms in Section IV-B, which help to better understand our simulation results.

A. Correctness of Recovery Algorithms

We make the following assumptions in our proofs. All the initial matrix values are nonnegative. Furthermore, all min values periodically exchanged between neighboring nodes are nonnegative. All $v \in V$ know their adjacent link costs. All link weights in $G$ (and therefore $G'$ as well) are nonnegative and do not change.⁴ $G$ is finite and connected. Finally, we assume reliable communication.

Definition 1. An algorithm is correct if the following two conditions are satisfied. One, $\forall v \in V'$, $v$ has the least cost and knows the next hop to all destinations $v' \in V'$. Two, the least cost is computed in finite time.

Theorem 1. Distance vector is correct.

Proof. Bertsekas and Gallager [5] prove correctness for distributed Bellman-Ford for arbitrary nonnegative matrix values. Their distributed Bellman-Ford algorithm is the same as the distance vector algorithm used in this paper.

Corollary 1. 2nd best is correct.

Proof. As per the preprocessing step, each node receiving a diffusing computation message removes $\bar{v}$ as a destination. Each node is guaranteed to receive a diffusing computation message (by our reliable communication and finite graph assumptions). Further, the diffusing computation terminates in finite time. Thus, we conclude that each $v \in V'$ removes $\bar{v}$ in finite time. Following the diffusing computation, each $v \in adj(\bar{v})$ uses distance vector to determine new least cost paths. Because all $matrix_v$ are nonnegative for all $v \in V'$, by Theorem 1 we conclude 2nd best is correct.

Corollary 2. purge is correct.

Proof. The diffusing computation starts with each $v \in adj(\bar{v})$ finding every destination, d, to which $v$'s least cost path uses $\bar{v}$ as the first-hop node. $v$ sets its least cost to each such d to ∞, thereby invalidating its path to d. $v$ then initiates a diffusing computation.
When an arbitrary node, i, receives a diffusing computation message from j, i iterates through each d specified in the message. If i routes via j to reach d, i sets its least cost to d to ∞, therefore invalidating any path to d with j and $\bar{v}$ as intermediate nodes. By our assumptions, each node receives a diffusing computation message and the diffusing computation terminates in finite time. Thus, we conclude that all paths using $\bar{v}$ as an intermediary node are invalidated in finite time. Following the preprocessing, each $v \in adj(\bar{v})$ uses distance vector to determine new least cost paths. Because all $matrix_v$ are nonnegative for all $v \in V'$, by Theorem 1 we conclude that purge is correct.

⁴ We use the definitions of $G$ and $G'$ described in Section III.

Corollary 3. cpr is correct.

Proof. cpr rolls back using a diffusing computation. Each node that receives a diffusing computation message rolls back to a snapshot with timestamp less than $t'$. By our assumptions, all nodes receive a message and the diffusing computation terminates in finite time. Thus, we conclude that each node $v \in V'$ rolls back to a snapshot with timestamp less than $t'$ in finite time. cpr then runs the preprocessing algorithm described in Section III-A, which removes $\bar{v}$ as a destination in finite time (as shown in Corollary 1). Because each node rolls back to a snapshot in which all least costs are nonnegative and then uses distance vector to compute new least costs, by Theorem 1 we conclude that cpr is correct.

B. Properties of Recovery Algorithms

In this section we formally characterize how min values change during recovery. The properties established in this section will aid in understanding the simulation results presented in Section V. Our proofs assume that link costs remain fixed during recovery (i.e., during $[t', t]$).

We prove properties about min in order to provide a precise characterization of recovery trends. In particular, our proofs establish that:
• The least cost between two nodes at the start of recovery is less than or equal to the least cost when recovery has completed. (Theorem 2)
• Before recovery begins, if the least cost between two nodes is less than its cost when recovery is complete, the path must be using $bad$ or $old$ either directly or transitively. (Corollary 4)
• During 2nd best and purge recovery, if the least cost between two nodes is less than its distance when recovery is complete, the path must be using $bad$ or $old$ either directly or transitively. (Corollary 5)
The first two statements apply to any recovery algorithm because they make no claims about min values during recovery.

Notation. We use the definitions of $G$ and $G'$ described in Section III. Let $n, d \in V'$. Let $p_s(n, d)$ be the least cost path from node $n$ to $d$ at the start of recovery and $\delta_s(n, d)$ the cost of this path; $p_i(n, d)$ is a path from $n$ to $d$ used during recovery and $\delta_i(n, d)$ the cost of this path⁵; and $p_f(n, d)$ is the least cost path from $n$ to $d$ when recovery is finished, with cost $\delta_f(n, d)$.

Theorem 2. $\forall n, d \in V'$, $\delta_s(n, d) \le \delta_f(n, d)$.

Proof: Assume $\exists n_i, d_i \in V'$ such that $\delta_s(n_i, d_i) > \delta_f(n_i, d_i)$. The paths available at the start of recovery are a superset of those available when recovery is complete. This means $p_f(n_i, d_i)$ is available before recovery begins. Distance vector would use this path rather than $p_s(n_i, d_i)$, implying that $\delta_s(n_i, d_i) = \delta_f(n_i, d_i)$, a contradiction.

⁵ $p_i(n, d)$ and $\delta_i(n, d)$ can change over time during recovery.

Corollary 4. $\forall n, d \in V'$, if $\delta_s(n, d) < \delta_f(n, d)$, then $p_s(n, d)$ is using $bad$ or $old$ either directly or transitively.

Proof: Assume $\exists n_i, d_i \in V'$ such that a path $p_s(n_i, d_i)$ with cost $\delta_s(n_i, d_i)$ is used before recovery begins, where $\delta_s(n_i, d_i) < \delta_f(n_i, d_i)$ and $p_s(n_i, d_i)$ does not use $bad$ or $old$. The only paths available before recovery begins, which do not exist when recovery completes, are ones using $bad$ or $old$. Therefore, $p_s(n_i, d_i)$ must be available after recovery completes since we have assumed that $p_s(n_i, d_i)$ does not use $bad$ or $old$. Distance vector would use $p_s(n_i, d_i)$ instead of $p_f(n_i, d_i)$ because $\delta_s(n_i, d_i) < \delta_f(n_i, d_i)$. However, this would imply that $\delta_s(n_i, d_i) = \delta_f(n_i, d_i)$, a contradiction.

Corollary 5. For 2nd best and purge, $\forall n, d \in V'$, if $\delta_i(n, d) < \delta_f(n, d)$, then $p_i(n, d)$ must be using $bad$ or $old$ either directly or transitively.⁶

Proof: We can use the same proof as for Corollary 4 if we substitute $\delta_i(n, d)$ for $\delta_s(n, d)$ and $p_i(n, d)$ for $p_s(n, d)$.

V. EVALUATION

In this section, we use simulations to characterize the performance of each of our three recovery algorithms in terms of message and time overhead. Our goal is to illustrate the relative performance of our recovery algorithms over different topology types (e.g., Erdős-Rényi graphs, Internet-like graphs) and different network conditions (e.g., fixed link costs, changing link costs). We evaluate recovery after a single compromised node has distributed false routing state.

We build a custom simulator with a synchronous communication model: nodes send and receive messages at fixed epochs. In each epoch, a node receives a message from all its neighbors and performs its local computation. In the next epoch, the node sends a message (if needed). All algorithms are deterministic under this communication model. The synchronous communication model, although simple, yields interesting insights into the performance of each of the recovery algorithms.
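The synchronous epoch model can be illustrated with a small Python sketch that runs plain distance vector to convergence and counts messages and epochs. This is not the simulator used for the experiments: the function name `run_distance_vector`, the simplified update rule (each node keeps only its least costs, not the full distance matrix), and the three-node example in the usage note are assumptions.

```python
import math


def run_distance_vector(links, max_epochs=1000):
    """Run synchronous distance vector over undirected `links` {(a, b): cost}.

    Returns (dist, messages, epochs). In each epoch every pending vector is
    delivered to all neighbors, then each node does its local computation;
    nodes whose vector changed send again in the next epoch.
    """
    adj = {}
    for (a, b), cost in links.items():
        adj.setdefault(a, {})[b] = cost
        adj.setdefault(b, {})[a] = cost

    dist = {n: {m: (0 if m == n else math.inf) for m in adj} for n in adj}
    outbox = {n: dict(dist[n]) for n in adj}   # vectors to send in the next epoch
    messages = epochs = 0

    while outbox and epochs < max_epochs:
        epochs += 1
        inbox = {n: [] for n in adj}
        for sender, vec in outbox.items():      # send phase
            for nb in adj[sender]:
                inbox[nb].append((sender, vec))
                messages += 1
        outbox = {}
        for node, received in inbox.items():    # receive phase and local computation
            changed = False
            for sender, vec in received:
                for dest, cost in vec.items():
                    candidate = adj[node][sender] + cost
                    if candidate < dist[node][dest]:
                        dist[node][dest] = candidate
                        changed = True
            if changed:
                outbox[node] = dict(dist[node])
    return dist, messages, epochs
```

On the path a-b-c with link costs 1 and 2, this sketch converges after 3 epochs and 10 messages, with a least cost of 3 between a and c. Message and epoch counts of this kind correspond to the message and time overhead metrics used in this section.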
Evaluation of our algorithms using a more general asynchronous communication model is currently under investigation. However, we believe an asynchronous implementation will demonstrate similar trends.

We simulate the following scenario:
1) Before $t'$, $\forall v \in V$, $min_v$ and $matrix_v$ are correctly computed.
2) At time $t'$, $\bar{v}$ is compromised and advertises a $bad$ (a vector with a cost of 1 to every node in the network) to its neighboring nodes.
3) $bad$ spreads for a specified number of hops (this varies by experiment). Variable $k$ refers to the number of hops that $bad$ has spread.
4) At time $t$, some node in $V$ notifies all $v \in adj(\bar{v})$ that $\bar{v}$ was compromised.⁷

The message and time overhead are measured in step (4) above. The pre-computation common to all three recovery algorithms, described in Section III-A, is not counted towards message and time overhead.

⁶ Corollary 5 does not apply to cpr recovery because the $\delta_i(n, d) < \delta_f(n, d)$ condition is not always satisfied.

A. Fixed Link Weight Experiments

In the next three experiments, we evaluate our recovery algorithms over different topology types in the case of fixed link costs.

1) Experiment 1 - Erdős-Rényi Graphs with Fixed Unit Link Weights: We start with a simplified setting and consider Erdős-Rényi graphs with parameters $n$ and $p$. $n$ is the number of graph nodes and $p$ is the probability that link $(i, j)$ exists, where $i, j \in V$. The link weight of each edge in the graph is set to 50. We iterate over different values of $k$. For each $k$, we generate an Erdős-Rényi graph, $G = (V, E)$, with parameters $n$ and $p$. Then we select a $\bar{v} \in V$ uniformly at random and simulate the scenario described above, using $\bar{v}$ as the compromised node. In total we sample 20 unique nodes for each $G$. We set $n = 100$, $p = \{0.05, 0.15, 0.25, 0.50\}$, and let $k = \{1, 2, \ldots, 10\}$. Each data point is an average over 600 runs (20 runs over 30 topologies). We then plot the 90% confidence interval.

For each of our recovery algorithms, Figure 2 shows the message overhead for different values of $k$. We conclude that cpr outperforms purge and 2nd best across all topologies.
cpr performs well because $bad$ is removed using a single diffusing computation, while the other algorithms remove $bad$ state through the iterative process of the distance vector algorithm. cpr's global state after rolling back is almost the same as the final recovered state.

2nd best recovery can be understood as follows. By Corollaries 4 and 5 in Section IV-B, distance values increase from their initial value until they reach their final (correct) value. Any intermediate, non-final, distance value uses $bad$ or $old$. Because $bad$ and $old$ no longer exist during recovery, these intermediate values must correspond to routing loops. Table II shows that there are few pairwise routing loops during 2nd best recovery in the network scenarios generated in Experiment 1, indicating that 2nd best distance values quickly count up to their final value.⁸

Although no pairwise routing loops exist during purge recovery, purge incurs overhead in its purge phase. Roughly 50% of purge's messages come from the purge phase. For these reasons, purge has higher message overhead than 2nd best.

Figure 3 shows the time overhead for the same $p$ values. The trends for time overhead match the trends we observed for message overhead.⁹

⁷ For cpr this node also indicates the time, $t'$, at which $\bar{v}$ was compromised.
⁸ We compute this metric as follows. After each simulation timestep, we count all pairwise routing loops over all source-destination pairs and then sum all of these values.
⁹ For the remaining experiments, we omit time overhead plots because time overhead follows the same trends as message overhead.
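One way to compute the pairwise routing loop metric described in footnote 8 is sketched below in Python. The sketch assumes that a pairwise routing loop means two nodes that each use the other as next hop toward the same destination. That reading, the function names, and the dictionary layout are assumptions, not the paper's definition.

```python
def count_pairwise_loops(next_hop):
    """Count two-node forwarding loops in one timestep's routing state.

    next_hop[n][d] is the neighbor that n forwards to for destination d.
    Each unordered pair of nodes is counted once per destination.
    """
    loops = 0
    for a, hops in next_hop.items():
        for dest, b in hops.items():
            # `a < b` visits each unordered pair {a, b} only once.
            if b is not None and a < b and next_hop.get(b, {}).get(dest) == a:
                loops += 1
    return loops


def total_pairwise_loops(states):
    """Sum the per-timestep loop counts over a whole simulation run."""
    return sum(count_pairwise_loops(state) for state in states)
```

Under this reading, the state in which i forwards to j and j forwards back to i for destination d contributes one loop per timestep for as long as it persists.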

[Figure 2: four panels. (a) p = 0.05, diameter = 6.14; (b) p = 0.15, diameter = 3.1; (c) p = 0.25, diameter = 2.99; (d) p = 0.50, diameter = 2.]
Fig. 2. Experiment 1: message overhead for Erdős-Rényi Graphs with Fixed Unit Link Weights generated over different $p$ values. Note the y-axes have different scales.

TABLE II
AVERAGE NUMBER OF PAIRWISE ROUTING LOOPS FOR 2nd best IN EXPERIMENT 1.
Columns: k = 1, k = 2, k = 3, k = 4-10.
p = 0.05: 14 87 92
p = 0.15: 7 8 9
p = 0.25:
p = 0.50:

TABLE III
AVERAGE NUMBER OF PAIRWISE ROUTING LOOPS FOR 2nd best IN EXPERIMENT 2.

            k = 1    k = 2    k = 3    k = 4-10
p = 0.05    554      133      9239     12641
p = 0.15    319      698      5514     7935
p = 0.25    28       446      351      544
p = 0.50    114      234      263      2892

purge and 2nd best message overhead increases with larger $k$. Larger $k$ implies that false state has propagated further in the network, implying more paths to repair, and therefore increased messaging. For values of $k$ greater than a graph's diameter, the message overhead remains constant, as expected.

2) Experiment 2 - Erdős-Rényi Graphs with Fixed but Randomly Chosen Link Weights: The experimental setup is identical to Experiment 1 with one exception: link weights are selected uniformly at random between $[1, n]$ (rather than using a fixed link weight of 50). Figure 4 shows the message overhead for different $k$ where $p = \{0.05, 0.15, 0.25, 0.50\}$. In striking contrast to Experiment 1, purge outperforms 2nd best for most values of $k$. 2nd best performs poorly because of the count-to-∞ problem: Table III shows the large average number of pairwise routing loops in this experiment, an indicator of the occurrence of the count-to-∞ problem. In the few cases (e.g., $k = 1$ for $p = 0.15$, $p = 0.25$ and $p = 0.50$) that 2nd best performs better than purge, 2nd best has few routing loops. No routing loops are found with purge. cpr performs well for the same reasons described in Section V-A1.

In addition, we counted the number of epochs in which at least one pairwise routing loop existed.
For 2nd best (across all topologies), on average, all but the last three timesteps had at least one routing loop. This suggests that the count-to-∞ problem dominates the cost for 2nd best.

3) Experiment 3 - Internet-like Topologies: Thus far, we have studied the performance of our recovery algorithms over Erdős-Rényi graphs, which have provided us with useful intuition about the performance of each algorithm. In this experiment, we simulate our algorithms over Internet-like topologies downloaded from the Rocketfuel website [3] and generated using GT-ITM [2]. The Rocketfuel topologies have inferred edge weights. For each Rocketfuel topology, we let each node be the compromised node and average over all of these cases for each value of k. For GT-ITM, we use the parameters specified in Heckmann et al. [11] for the 154-node AT&T topology described in Section 4 of [11]. For the GT-ITM topologies, we use the same criteria specified in Experiment 1 to generate each data point. The results, shown in Figure 5, follow the same pattern as in

Experiment 2. In the cases where 2nd best performs poorly, the count-to-∞ problem dominates the cost, as evidenced by the number of pairwise routing loops. In the few cases in which 2nd best performs better than purge, there are few pairwise routing loops.

[Figure 3: four plots, y-axis "Timesteps to Converge". Panels: (a) p = .05, diameter = 6.14; (b) p = .15, diameter = 3.1; (c) p = .25, diameter = 2.99; (d) p = .5, diameter = 2.]
Fig. 3. Experiment 1: time overhead for Erdős-Rényi graphs with fixed unit link weights, generated over different p values.

B. Link Weight Change Experiments

So far, we have evaluated our algorithms over different topologies with fixed link costs. We found that cpr outperforms the other algorithms because cpr removes false routing state with a single diffusing computation, rather than using an iterative distance vector process as in 2nd best and purge. In the next two experiments we evaluate our algorithms over graphs with changing link costs. We introduce link cost changes between the time v is compromised and the time v is discovered (i.e., during [t', t_b]). In particular, let there be λ link cost changes per timestep, where λ is deterministic. To create a link cost change event, we choose the link whose cost will change equiprobably among all links (except for all (v, v) links). The new link cost is selected uniformly at random from [1, n].

1) Experiment 4: Except for λ, our experimental setup is identical to the one in Experiment 2. We let λ = {1, 4, 8}. In order to isolate the effects of link cost changes, we assume that cpr checkpoints at each timestep. Figure 6 shows that purge yields the lowest message overhead for p = .05, but only slightly lower than cpr. cpr's message overhead increases with larger k because there are more link cost change events to process. After cpr rolls back, it must process all link cost changes that occurred in [t', t_b].
In contrast, 2nd best and purge process some of the link cost change events during the interval [t', t_b] as part of normal distance vector execution. In our experimental setup, these messages are not counted because they do not occur in Step 4 (i.e., as part of the recovery process) of our simulation scenario described in Section V. Our analysis further indicates that 2nd best performance suffers because of the count-to-∞ problem. The gap between 2nd best and the other algorithms shrinks as λ increases because, as λ increases, link cost changes have a larger effect on message overhead.

With larger p values, λ has a smaller effect on message complexity because more alternate paths are available. Thus when p = .15 and λ = 1, most of purge's recovery effort goes toward removing bad state rather than processing link cost changes. Because cpr removes bad using a single diffusing computation and there are few link cost changes, cpr has lower message overhead than purge in this case. As λ increases, cpr has higher message overhead than purge: there are more link cost changes to process, and cpr must process all such link cost changes, while purge processes some link cost changes during the interval [t', t_b] as part of normal distance vector execution.

2) Experiment 5: In this experiment we study the trade-off between message overhead and storage overhead for cpr. To this end, we vary the frequency at which cpr checkpoints and fix the interval [t', t_b]. Otherwise, our experimental setup is the same as Experiment 4. Figure 7 shows the results for an Erdős-Rényi graph with link weights selected uniformly at random from [1, n], n = 1, p = .5, λ = {1, 4, 8}, and k = 2. We plot message overhead against the number of timesteps cpr must

roll back, z. cpr's message overhead increases with larger z because, as z increases, there are more link cost change events to process. 2nd best and purge have constant message overhead because they operate independently of z. We conclude that as the frequency of cpr snapshots decreases, cpr incurs higher message overhead. Therefore, when choosing the frequency of checkpoints, the trade-off between storage and message overhead must be carefully considered.

[Figure 4: four plots. Panels: (a) p = .05, diameter = 6.14; (b) p = .15, diameter = 3.1; (c) p = .25, diameter = 2.99; (d) p = .5, diameter = 2.]
Fig. 4. Experiment 2: message overhead for Erdős-Rényi graphs with link weights selected uniformly at random from [1, 1]. Note that the y-axes have different scales.

[Figure 5: three plots. Panels: (a) GT-ITM, n = 156, diameter = 14.133; (b) Rocketfuel 6461, n = 141, diameter = 8; (c) Rocketfuel 3867, n = 79, diameter = 1.]
Fig. 5. Experiment 3: message overhead for Internet-like graphs.

C. Summary

Our results show that for graphs with fixed link costs, cpr yields the lowest message and time overhead. cpr benefits from removing false state with a single diffusing computation. However, cpr has storage overhead, requires loosely synchronized clocks, and requires that the time v was compromised be identified. 2nd best's performance is determined by the count-to-∞ problem. In the case of Erdős-Rényi graphs with fixed unit link weights, the count-to-∞ problem was minimal, helping 2nd best perform better than purge. purge avoids the count-to-∞ problem by first globally invalidating false state. Therefore, in cases where the count-to-∞ problem is significant, purge outperforms 2nd best.

When considering graphs with changing link costs, cpr's performance suffers because it must process all valid link cost changes that occurred since v was compromised. Meanwhile, 2nd best and purge make use of computations that followed the injection of false state and that do not depend on false routing state. However, 2nd best's performance degrades because of the count-to-∞ problem.
purge eliminates the count-to-∞ problem and therefore yields the best performance over topologies with changing link costs.

Finally, we found that an additional challenge with cpr is setting the parameter that determines the checkpoint frequency. More frequent checkpointing yields lower message and time overhead at the cost of more storage overhead. Ultimately, application-specific factors must be considered when setting this parameter.

[Figure 6: six plots. Panels: (a) p = .05, diameter = 6.14, λ = 1; (b) p = .05, diameter = 6.14, λ = 4; (c) p = .05, diameter = 6.14, λ = 8; (d) p = .15, diameter = 3.1, λ = 1; (e) p = .15, diameter = 3.1, λ = 4; (f) p = .15, diameter = 3.1, λ = 8.]
Fig. 6. Experiment 4: message overhead for p = {.05, .15} Erdős-Rényi graphs with link weights selected uniformly at random, for different λ values.

[Figure 7: three plots, x-axis z. Panels: (a) p = .5, k = 2, λ = 1; (b) p = .5, k = 2, λ = 4; (c) p = .5, k = 2, λ = 8.]
Fig. 7. Experiment 5: message overhead for p = .5 Erdős-Rényi graphs with link weights selected uniformly at random, for different λ values. z refers to the number of timesteps cpr must roll back. Note that the y-axes have different scales.

VI. RELATED WORK

There is a rich body of research on securing routing protocols [12], [20], [23]. However, preventative measures sometimes fail, requiring automated techniques (like ours) to provide recovery. Previous approaches to recovery from router faults [18], [22] focus on allowing a router to continue forwarding packets while new routes are computed. We focus on a different problem: recomputing new paths following the detection of a malicious node that may have injected false routing state into the network.

Our problem is similar to that of recovering from malicious but committed database transactions. Liu [4] and Ammann [15] develop algorithms to restore a database to a valid state after a malicious transaction has been identified. purge's algorithm to globally invalidate false state can be interpreted as a distributed implementation of the dependency graph approach in [15].
Database crash recovery [17] and message passing systems [7] both use snapshots to restore the system in the event of a failure. In both problem domains, the snapshot algorithms are careful to ensure snapshots are globally consistent. In our setting, consistent global snapshots are not required for cpr, since distance vector routing only requires that all initial distance estimates be nonnegative.
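Because globally consistent snapshots are not needed, each node can choose its rollback point locally. The sketch below illustrates one way a node might do this, following the selection rule in the appendix rollback algorithm: pick the most recent local snapshot taken before the compromise time minus the maximum clock skew. The dictionary layout and the names `select_rollback_snapshot`, `t_compromised`, and `max_skew` are illustrative assumptions, not part of the paper.

```python
def select_rollback_snapshot(snapshots, t_compromised, max_skew):
    """Return the latest snapshot safely older than the compromise time.

    snapshots: iterable of dicts with a "timestamp" key (assumed layout).
    t_compromised: time at which the bad node was compromised.
    max_skew: maximum clock skew between loosely synchronized nodes.
    Returns None when no snapshot is old enough.
    """
    cutoff = t_compromised - max_skew
    # Only snapshots strictly before the cutoff are guaranteed clean.
    eligible = [s for s in snapshots if s["timestamp"] < cutoff]
    if not eligible:
        return None
    return max(eligible, key=lambda s: s["timestamp"])


# Hypothetical usage: snapshots at times 1, 4, 7, 9; compromise at time 8.
history = [{"timestamp": t, "min": {}} for t in (1, 4, 7, 9)]
chosen = select_rollback_snapshot(history, t_compromised=8, max_skew=1)
```

Subtracting the skew is what lets loosely synchronized clocks suffice: a snapshot whose local timestamp is within the skew of the compromise time might already contain false state, so it is skipped.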

Garcia-Luna-Aceves's DUAL algorithm [10] uses diffusing computations to coordinate least cost updates in order to prevent routing loops. In our case, cpr and the preprocessing procedure (Section III-A) use diffusing computations for purposes other than updating least costs (e.g., rolling back to a checkpoint in the case of cpr, and removing v as a destination during preprocessing). Like DUAL, the purpose of purge's diffusing computations is to prevent routing loops. However, purge's diffusing computations do not verify that new least costs preserve loop-free routing (as with DUAL) but instead globally invalidate false routing state.

Jefferson [13] proposes a solution for synchronizing distributed systems called Time Warp. Time Warp is a form of optimistic concurrency control and, as such, occasionally requires rolling back to a checkpoint. Time Warp does so by unsending each message sent after the time the checkpoint was taken. With our cpr algorithm, a node does not need to explicitly unsend messages after rolling back. Instead, each node sends its min taken at the time of the snapshot, which implicitly undoes the effects of any messages sent after the snapshot timestamp.

VII. CONCLUSIONS AND FUTURE WORK

In this paper, we develop methods for recovery in scenarios where a malicious node injects false state into a distributed system. We study an instance of this problem in distance vector routing. We present and evaluate three new algorithms for recovery in such scenarios.

Among our three algorithms, our results show that a checkpoint-rollback based algorithm (cpr) yields the lowest message and time overhead over topologies with fixed link costs. However, cpr has storage overhead and requires loosely synchronized clocks. In the case of topologies with changing link costs, purge performs best by avoiding the problems that plague cpr and 2nd best. Unlike cpr, purge has no stale state to update because purge does not roll back in time. The count-to-∞ problem results in high message overhead for 2nd best, while purge eliminates the count-to-∞ problem by globally purging false state before finding new least cost paths.
As future work, we are interested in finding the worst possible false state a compromised node can inject. Some options include the minimum distance to all nodes (e.g., our choice for false state used in this paper), state that maximizes the effect of the count-to-∞ problem, and false state that contaminates a bottleneck link. We would also like to evaluate the effects of multiple compromised nodes on our recovery algorithms.

VIII. ACKNOWLEDGMENTS

The authors greatly appreciate discussions with Dr. Brian DeCleene of BAE Systems, who initially suggested this problem area.

REFERENCES

[1] Google Embarrassed and Apologetic After Crash. http://www.computerweekly.com/articles/29/5/15/2366/googleembarrassed-and-apologetic-after-crash.htm.
[2] GT-ITM. http://www.cc.gatech.edu/projects/gtitm/.
[3] Rocketfuel. http://www.cs.washington.edu/research/networking/rocketfuel/maps/weights/weights-dist.tar.gz.
[4] P. Ammann, S. Jajodia, and Peng Liu. Recovery from Malicious Transactions. IEEE Trans. on Knowl. and Data Eng., 14(5):1167-1185, 2002.
[5] D. Bertsekas and R. Gallager. Data Networks. Prentice-Hall, Inc., Upper Saddle River, NJ, USA, 1987.
[6] E. Dijkstra and C. Scholten. Termination Detection for Diffusing Computations. Information Processing Letters, (11), 1980.
[7] K. El-Arini and K. Killourhy. Bayesian Detection of Router Configuration Anomalies. In MineNet '05: Proceedings of the 2005 ACM SIGCOMM workshop on Mining network data, pages 221-222, New York, NY, USA, 2005. ACM.
[8] N. Feamster and H. Balakrishnan. Detecting BGP Configuration Faults with Static Analysis. In 2nd Symp. on Networked Systems Design and Implementation (NSDI), Boston, MA, May 2005.
[9] A. Feldmann and J. Rexford. IP Network Configuration for Intradomain Traffic Engineering. IEEE Network Magazine, 15:46-57, 2001.
[10] J. J. Garcia-Luna-Aceves. Loop-free Routing using Diffusing Computations. IEEE/ACM Trans. Netw., 1(1):130-141, 1993.
[11] O. Heckmann, M. Piringer, J. Schmitt, and R. Steinmetz. On Realistic Network Topologies for Simulation.
In MoMeTools '03: Proceedings of the ACM SIGCOMM workshop on Models, methods and tools for reproducible network research, pages 28-32, New York, NY, USA, 2003. ACM.
[12] YC Hu, D.B. Johnson, and A. Perrig. SEAD: Secure Efficient Distance Vector Routing for Mobile Wireless Ad Hoc Networks. In Mobile Computing Systems and Applications, 2002. Proceedings Fourth IEEE Workshop on, pages 3-13, 2002.
[13] D. Jefferson. Virtual Time. ACM Trans. Program. Lang. Syst., 7(3):404-425, 1985.
[14] C. Jensen, L. Mark, and N. Roussopoulos. Incremental Implementation Model for Relational Databases with Transaction Time. IEEE Trans. on Knowl. and Data Eng., 3(4):461-473, 1991.
[15] P. Liu, P. Ammann, and S. Jajodia. Rewriting Histories: Recovering from Malicious Transactions. Distributed and Parallel Databases, 8(1):7-40, 2000.
[16] D. Lomet, R. Barga, M. Mokbel, and G. Shegalov. Transaction Time Support Inside a Database Engine. In ICDE '06: Proceedings of the 22nd International Conference on Data Engineering, page 35, Washington, DC, USA, 2006. IEEE Computer Society.
[17] C. Mohan, D. Haderle, B. Lindsay, H. Pirahesh, and P. Schwarz. ARIES: A Transaction Recovery Method Supporting Fine-Granularity Locking and Partial Rollbacks Using Write-Ahead Logging. ACM Trans. Database Syst., 17(1):94-162, 1992.
[18] J. Moy. Hitless OSPF Restart. In Work in progress, Internet Draft, 2001.
[19] R. Neumann. Internet routing black hole. The Risks Digest: Forum on Risks to the Public in Computers and Related Systems, 19(12), May 1997.
[20] D. Pei, D. Massey, and L. Zhang. Detection of Invalid Routing Announcements in RIP Protocol. In Global Telecommunications Conference, 2003. GLOBECOM '03. IEEE, volume 3, pages 1450-1455 vol.3, Dec. 2003.
[21] K. School and D. Westhoff. Context Aware Detection of Selfish Nodes in DSR based Ad-hoc Networks. In Proceedings of IEEE GLOBECOM, pages 178-182, 2002.
[22] A. Shaikh, R. Dube, and A. Varma. Avoiding Instability During Graceful Shutdown of OSPF. Technical report, In Proc. IEEE INFOCOM, 2002.
[23] B. Smith, S. Murthy, and J.J.
Garcia-Luna-Aceves. Securing Distance-Vector Routing Protocols. Network and Distributed System Security, Symposium on, :85, 1997.

IX. APPENDIX

Notation. Let msg refer to a message sent during purge's diffusing computation (to globally remove false routing state). msg includes:
1) a field, src, which contains the node ID of the sending node;
2) a vector, dests, of all destinations that include v as an intermediary node.
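As an illustration of how a node might represent and process such a message, the sketch below models msg as a small dataclass and applies the per-node check used by the diffusing computation that globally invalidates false state (Algorithm 3 in this appendix). The class name `PurgeMsg`, the function `handle_purge_msg`, and the dictionary-based routing tables are hypothetical. Setting an invalidated cost to infinity is an assumption, inferred from the later step that looks for destinations whose least cost is infinite.

```python
from dataclasses import dataclass, field

INF = float("inf")


@dataclass
class PurgeMsg:
    src: str                                    # node ID of the sending node
    dests: list = field(default_factory=list)   # destinations routed via the bad node


def handle_purge_msg(msg, next_hop, min_cost):
    """Invalidate destinations that this node reaches through msg.src.

    next_hop[d]: current next-hop router toward destination d.
    min_cost[d]: current least cost toward destination d.
    Returns the set S of invalidated destinations. Per the pseudocode, a
    non-empty S is forwarded down the spanning tree; an empty S means the
    node replies with an ACK to msg.src.
    """
    invalidated = []
    for d in msg.dests:
        if next_hop.get(d) == msg.src:
            min_cost[d] = INF      # assumed: false state is marked unreachable
            next_hop[d] = None
            invalidated.append(d)
    return invalidated


# Hypothetical usage at a node that routes to x via b and to y via c.
hops = {"x": "b", "y": "c"}
costs = {"x": 3, "y": 2}
s = handle_purge_msg(PurgeMsg(src="b", dests=["x", "y"]), hops, costs)
```

Only destinations whose next hop is the sender are affected, which is what keeps valid routing state (here, the route to y) intact during the purge.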

Let δ refer to the maximum clock skew for cpr. All other notation is specified in Table I.

Algorithm 1: 2nd best, run at each i ∈ adj(v)
    flag ← FALSE
    set all path costs to v to ∞
    for each destination d do
        if v is first-hop router in least cost path to d then
            c ← least cost to d using a path which does not use v as first-hop router
            update min_i and dmatrix_i with c
            flag ← TRUE
        end if
    end for
    if flag = TRUE then
        send min_i to each j ∈ adj(i) where j ≠ v
    end if

Algorithm 2: purge's diffusing computation, run at each i ∈ adj(v)
    set all path costs to v to ∞
    S ← ∅
    for each destination d do
        if v is first-hop router in least cost path to d then
            S ← S ∪ {d}
        end if
    end for
    if S ≠ ∅ then
        send S to each j ∈ adj(i) where j ≠ v
    end if

Algorithm 3: purge's diffusing computation, run at each i ∉ adj(v)
Input: msg containing src, dests fields.
    S ← ∅
    for each d ∈ msg.dests do
        if msg.src is next-hop router in least cost path to d then
            S ← S ∪ {d}
        end if
    end for
    if S ≠ ∅ then
        send S to spanning tree child
    else
        send ACK to msg.src
    end if

Algorithm 4: cpr rollback
    if already rolled back then
        exit
    end if
    t̂ ← 0
    for each snapshot, S, do
        t ← S.timestamp
        if t < (t' − δ) and t > t̂ then
            t̂ ← t
        end if
    end for
    roll back to snapshot taken at t̂
    if not spanning tree leaf node then
        send rollback request to spanning tree child
    else
        send ACK to spanning tree parent node
    end if

Algorithm 5: cpr steps after rollback, run at each i ∈ adj(v)
    flag ← FALSE
    for each destination d do
        if min_i[d] = ∞ then
            find least cost to d in dmatrix_i and set in min_i
            flag ← TRUE
        end if
    end for
    if flag = TRUE or adjacent link weight changed during [t', t_b] then
        send min_i to each j ∈ adj(i) where j ≠ v
    end if
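The local step of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the table layout (`link_cost`, `advertised`, `min_cost`, `next_hop`) and the function name `second_best_step` are hypothetical, and the cost through a neighbor is assumed to be the link cost plus that neighbor's advertised distance.

```python
INF = float("inf")


def second_best_step(bad, link_cost, advertised, min_cost, next_hop):
    """Switch every route whose first hop is the bad node to the next-best path.

    link_cost[j]: cost of the link from this node to neighbor j.
    advertised[j][d]: neighbor j's last advertised distance to destination d.
    min_cost / next_hop: this node's least-cost vector and first hops
    (both are updated in place).
    Returns True if any destination was rerouted, meaning the updated
    vector should be sent to every neighbor other than the bad node.
    """
    changed = False
    min_cost[bad] = INF            # all path costs to the bad node become infinite
    next_hop[bad] = None
    for d in list(min_cost):
        if d == bad or next_hop.get(d) != bad:
            continue
        best_cost, best_hop = INF, None
        for j, w in link_cost.items():
            if j == bad:
                continue           # never use the bad node as first hop
            c = w + advertised[j].get(d, INF)
            if c < best_cost:
                best_cost, best_hop = c, j
        min_cost[d], next_hop[d] = best_cost, best_hop
        changed = True
    return changed


# Hypothetical usage: node with neighbors v (bad), a, and b.
links = {"v": 1, "a": 2, "b": 5}
adv = {"v": {"x": 0}, "a": {"x": 4}, "b": {"x": 2}}
costs = {"v": 1, "x": 1, "a": 2}
hops = {"v": "v", "x": "v", "a": "a"}
must_send = second_best_step("v", links, adv, costs, hops)
```

In this example the route to x moves from the bad node to neighbor a at cost 6. Note that the neighbors' advertised distances may themselves still rest on false state, which is why this approach can trigger count-to-∞ behavior during the subsequent distance vector iterations.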