1 From Routing to Traffic Engineering Robert Soulé Advanced Networking Fall 2016
2 In the beginning
- Goal: pair-wise connectivity (get packets from A to B)
- Approach: configure static rules in routers that look at the header and make a decision
- Problems: topology changes; human error
3 ARPANet Routing in 1969
- Goal: automate the static table computation
- Shortest-path routing based on link metrics
- Metrics based on instantaneous queue length + constant
- Used a distance-vector algorithm
4 Distance-Vector Routing
- Local view of the network, global computation of least-cost paths
- Each router keeps a distance vector with the best known distance to each vertex, and the next hop on that path
- Routers exchange distance vectors and update an entry when a neighbor offers a better path
- After some time, the state converges such that every router has a minimum distance vector
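The exchange-and-update loop described on this slide can be sketched in a few lines of Python. This is a minimal, synchronous simulation, not a model of the ARPANet protocol: the function name `distance_vector`, the `links` input format, and the round-based exchange are assumptions made for illustration.

```python
INF = float("inf")

def distance_vector(links):
    """links: dict mapping an undirected link (u, v) to its positive cost."""
    nodes = {n for edge in links for n in edge}
    neighbors = {n: {} for n in nodes}
    for (u, v), cost in links.items():
        neighbors[u][v] = cost
        neighbors[v][u] = cost

    # table[router][dest] = (best known cost, next hop); a router initially knows only itself.
    table = {r: {d: (0 if d == r else INF, None) for d in nodes} for r in nodes}

    changed = True
    while changed:  # keep exchanging vectors until no entry improves
        changed = False
        # Snapshot of the vectors each router advertises in this round.
        advertised = {r: {d: c for d, (c, _) in table[r].items()} for r in nodes}
        for r in nodes:
            for nbr, link_cost in neighbors[r].items():
                for dest, nbr_cost in advertised[nbr].items():
                    candidate = link_cost + nbr_cost
                    if candidate < table[r][dest][0]:  # update only if better
                        table[r][dest] = (candidate, nbr)
                        changed = True
    return table

table = distance_vector({("A", "B"): 1, ("B", "C"): 2, ("A", "C"): 5})
print(table["A"]["C"])  # (3, 'B'): A reaches C via B at cost 1 + 2
```

Each router only ever reads its neighbors' advertised vectors, which is the "local view, global computation" property named above.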
5 Problems
- Transient loops during slow convergence
- Instantaneous queue length does not indicate expected delay; it fluctuates wildly and causes routing oscillations
- Protocol requires dissemination of link state, which can be high overhead in large networks
6 ARPANet Routing in 1979
- Shortest-path routing based on link metrics
- Use the average of queue length over time to reduce fluctuations
- Use a link-state algorithm
- Only send updates on state if the change exceeds a threshold
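A rough sketch of the averaging and threshold ideas on this slide follows. The class name `LinkMetric`, the sliding-window average, and the default `window` and `threshold` values are hypothetical choices for illustration; the slide does not say how the 1979 scheme computed its average or what threshold it used.

```python
from collections import deque

class LinkMetric:
    """Smooths queue-length samples and suppresses small changes (illustrative only)."""

    def __init__(self, window=10, threshold=2.0):
        self.samples = deque(maxlen=window)  # keeps only the most recent samples
        self.threshold = threshold           # hypothetical update threshold
        self.advertised = None               # last cost sent to other routers

    def observe(self, queue_length):
        """Record a sample; return the new cost if it should be advertised, else None."""
        self.samples.append(queue_length)
        cost = sum(self.samples) / len(self.samples)
        if self.advertised is None or abs(cost - self.advertised) > self.threshold:
            self.advertised = cost
            return cost
        return None

metric = LinkMetric(window=4, threshold=2.0)
for q in (10, 12, 20):
    print(q, metric.observe(q))
```

Averaging damps the wild fluctuations of the instantaneous queue length, and the threshold keeps routers from flooding an update for every small change.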
7 Link-State Routing
- Global view of the network, local computation of least-cost paths
- Each router maintains a complete view of the network (i.e., all links and costs)
- Each router advertises its adjacent links and costs via link-state advertisements (LSAs)
- Each router collects LSAs and maintains its own view of the network
- Each router computes the least-cost paths based on its own local representation of the network
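The local computation step can be illustrated with Dijkstra's algorithm run over a router's own copy of the topology. This is a sketch under assumptions: the name `shortest_paths` and the `lsdb` dictionary format (node to neighbor-cost map, as if assembled from received LSAs) are illustrative, and link costs are assumed non-negative.

```python
import heapq

def shortest_paths(lsdb, source):
    """Return (dist, next_hop) from `source` over the local link-state database."""
    dist = {source: 0}
    next_hop = {}             # first hop to use from `source` toward each destination
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue          # stale heap entry; a shorter path was already found
        for nbr, cost in lsdb.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                # Leaving the source, the first hop is the neighbor itself;
                # otherwise inherit the first hop used to reach `node`.
                next_hop[nbr] = nbr if node == source else next_hop[node]
                heapq.heappush(heap, (nd, nbr))
    return dist, next_hop

lsdb = {"A": {"B": 1, "C": 5}, "B": {"A": 1, "C": 2}, "C": {"A": 5, "B": 2}}
print(shortest_paths(lsdb, "A"))  # ({'A': 0, 'B': 1, 'C': 3}, {'B': 'B', 'C': 'B'})
```

Every router runs the same computation on its own copy of `lsdb`, so forwarding is consistent once all routers hold the same set of LSAs.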
8 Example: Open Shortest Path First (OSPF)
- Link-state routing algorithm
- Cost of a link is based on various metrics (round-trip time, throughput on a link, reliability)
- Computes a shortest-path tree for each route using Dijkstra's algorithm
- Routing is done at the IP layer (OSPF runs directly over IP, not over a transport protocol)
9 Problems
- Congestion on shortest paths
- Congested links look bad to all routers
- All routers avoid the busy links, choosing the same paths
- Routing oscillations
10 Network Management
- Protocols we've seen so far:
  - Routing protocols adapt to topology changes
  - TCP sends less traffic during congestion
- Is this efficient/enough?
  - Can we use an empty path if there is congestion?
  - Do we send traffic on high-delay paths when other paths exist?
- How should routing adapt to traffic?
11 Traffic Engineering
- Don't just route on the shortest path
- Try to avoid congested links
- Attempt to distribute load
- Satisfy application requirements
  - Is all traffic the same? (e.g., customer-facing traffic vs. backups)
- For managers, try to improve operational efficiency
12 Routing today
- Link-state routing with a shortest-path algorithm
- Operators tune link weights
- How to configure the weights?
  - Idea 1: based on link metrics (delay, capacity)
  - Idea 2: based on demands, i.e., as an optimization problem
13 Measure, Model, Control Loop
[Figure: control loop between the network and a network model. Topology changes and traffic statistics feed the model; changes to network configuration are applied to the network.]
14 Multi-Commodity Flow
1. Continuously monitor flows, build a traffic matrix
2. Encode as a constraint problem
3. Map the solution to physical paths
4. Update forwarding tables
5. Repeat
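As an illustration of step 2, one common way to encode the traffic matrix as an optimization problem is a linear program that minimizes the maximum link utilization. The slide does not name an objective, so this choice is an assumption, as are the symbols: $D_{st}$ is the measured demand from $s$ to $t$, $c_{uv}$ the capacity of link $(u,v)$, $f^{st}_{uv}$ the amount of the $s \to t$ demand placed on link $(u,v)$, and $\alpha$ the maximum utilization.

```latex
\begin{aligned}
\min_{f,\,\alpha}\quad & \alpha \\
\text{s.t.}\quad
& \sum_{v:(u,v)\in E} f^{st}_{uv} - \sum_{v:(v,u)\in E} f^{st}_{vu} =
  \begin{cases} D_{st} & u = s \\ -D_{st} & u = t \\ 0 & \text{otherwise} \end{cases}
  && \forall u \in V,\ \forall (s,t) \\
& \sum_{(s,t)} f^{st}_{uv} \le \alpha\, c_{uv} && \forall (u,v) \in E \\
& f^{st}_{uv} \ge 0 && \forall (u,v) \in E,\ \forall (s,t)
\end{aligned}
```

The first constraint is flow conservation for each commodity, and the second ties the total load on every link to its capacity. The per-link values $f^{st}_{uv}$ in the solution are what step 3 would decompose into physical paths.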
15 Problem 1: Expensive Optimization Problem
- Solving MCF is computationally expensive.
16 Problem 2: Constant State Changes
- With optimal MCF, forwarding state must constantly be changed.
17 Randomized Routing
- Idea: rather than measuring/predicting traffic demands and recomputing, use randomization to achieve good performance
- We will look at a technique called Valiant Load Balancing
- Developed by Leslie Valiant in 1982 as a technique for parallel computers, i.e., parallel random-access machines (PRAM)
18 Routing in Sparse Graph
- Assume each edge can carry only one packet at each time step, and no packet can traverse more than one edge per time step
- The routing algorithm specifies, for each pair of nodes, a route connecting the pair in the network
- Each vertex can buffer packets, and the queuing policy specifies the ordering of packets in the queue (e.g., FIFO)
- The metric we will use to compare algorithms is time: how many steps it takes to deliver all packets
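19 Routing in Sparse Graph
[Figure: hypercubes for n = 1 (vertices 0, 1), n = 2 (vertices 00, 01, 10, 11), and n = 3 (vertices 000 through 111)]
- Graph H: vertices $V = \{0,1\}^n$; edges allow one bit flip (e.g., 10 to 11): $E = \{(x, x \oplus e_i) : x \in \{0,1\}^n,\ i \in [n]\}$, where $e_i$ has a 1 in position $i$ and 0 elsewhere
- The graph is an n-dimensional hypercube with $N = 2^n$ vertices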
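The edge definition on this slide translates directly into code: the neighbors of a vertex are the labels obtained by XOR-ing in each single-bit mask. A small sketch, with the helper name `hypercube_neighbors` and the integer encoding of labels chosen here for illustration:

```python
def hypercube_neighbors(x, n):
    """Neighbors of vertex x (an n-bit integer): flip exactly one of the n bits."""
    return [x ^ (1 << i) for i in range(n)]

n = 3
for x in range(2 ** n):
    label = format(x, f"0{n}b")
    print(label, [format(y, f"0{n}b") for y in hypercube_neighbors(x, n)])
```

Every vertex has exactly n neighbors, so the graph has $N = 2^n$ vertices but only $n = \log_2 N$ edges per vertex, which is what makes it sparse.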
20 Deterministic Routing
- A simple algorithm for routing is a bit-fixing scheme
- To find a path from v to u, simply examine the bits from left to right, flipping as necessary, e.g., 10110 → 00110 → 00100 → 00101
- Note that this scheme is oblivious: you only need to look at the end points to determine the path, and no other information
- Is this a good algorithm?
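The bit-fixing scheme is short enough to write out in full. The sketch below represents vertices as bit strings; the function name `bit_fixing_path` is an illustrative choice.

```python
def bit_fixing_path(src, dst):
    """Return the list of vertices visited when fixing bits left to right."""
    if len(src) != len(dst):
        raise ValueError("source and destination must have the same length")
    path = [src]
    cur = list(src)
    for i, bit in enumerate(dst):  # scan positions from left to right
        if cur[i] != bit:          # flip only the bits that differ
            cur[i] = bit
            path.append("".join(cur))
    return path

print(bit_fixing_path("10110", "00101"))
# ['10110', '00110', '00100', '00101']
```

The path depends only on `src` and `dst`, never on what other packets are doing, which is exactly what "oblivious" means here.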
21 Deterministic Algorithm
- Theorem 1: There are permutations for which the bit-fixing scheme requires at least $2^{n/2}/n$ time steps to transfer all messages.
- Proof: Assume n is even, and write $x \in \{0,1\}^n$ as $x = (x', x'')$, with $x', x'' \in \{0,1\}^{n/2}$. Consider any permutation $\pi : \{0,1\}^n \to \{0,1\}^n$ that maps $(x', 0)$ to $(0, x')$ for all $x' \in \{0,1\}^{n/2}$. Bit-fixing clears the left half before touching the right half, so these $2^{n/2}$ paths all pass through the single vertex $(0, 0)$. As it has only n outgoing edges, we need at least $2^{n/2}/n$ time steps to transfer all messages.
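The bottleneck in this proof can be checked by brute force for small n. The sketch below enumerates the packets of the form $(x', 0) \to (0, x')$ and counts how many bit-fixing paths touch the all-zero vertex; the helper names are illustrative, and `bit_fixing_path` is the left-to-right scheme from the previous slide.

```python
from itertools import product

def bit_fixing_path(src, dst):
    """Vertices visited when fixing differing bits from left to right."""
    path, cur = [src], list(src)
    for i, bit in enumerate(dst):
        if cur[i] != bit:
            cur[i] = bit
            path.append("".join(cur))
    return path

def paths_through_origin(n):
    """Count packets (x', 0) -> (0, x') whose path visits the all-zero vertex."""
    half = n // 2
    zero_half, origin = "0" * half, "0" * n
    count = 0
    for bits in product("01", repeat=half):
        x = "".join(bits)
        if origin in bit_fixing_path(x + zero_half, zero_half + x):
            count += 1  # includes the trivial packet that starts at the origin
    return count

for n in (4, 6, 8):
    print(n, paths_through_origin(n), 2 ** (n // 2) / n)
```

For each n the count equals $2^{n/2}$, so with only n outgoing edges at the all-zero vertex the packets need on the order of $2^{n/2}/n$ steps to drain.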
22 More Generally
- Theorem 2: For any deterministic oblivious permutation routing algorithm on a network of N nodes, each of out-degree d, there is an instance of permutation routing requiring $\Omega(\sqrt{N/d})$ steps.
- We won't prove this.
23 Randomized Algorithm
- Phase 1: Pick a random intermediate node. The packet travels first to the intermediate node, t(v).
- Phase 2: The packet travels from t(v) to its destination, d(v).
- Theorem 3: With high probability, all packets will be routed in at most O(log N) time steps!
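A minimal sketch of the two phases for a single packet, assuming bit-fixing is used within each phase (as in the hypercube analysis that follows). The function name `valiant_path` and the `rng` parameter are illustrative; queuing and contention between packets are not modeled.

```python
import random

def bit_fixing_path(src, dst):
    """Vertices visited when fixing differing bits from left to right."""
    path, cur = [src], list(src)
    for i, bit in enumerate(dst):
        if cur[i] != bit:
            cur[i] = bit
            path.append("".join(cur))
    return path

def valiant_path(src, dst, rng=random):
    """Route src -> random intermediate t(v) -> dst; returns (intermediate, path)."""
    mid = "".join(rng.choice("01") for _ in range(len(src)))  # uniform random vertex
    phase1 = bit_fixing_path(src, mid)
    phase2 = bit_fixing_path(mid, dst)
    return mid, phase1 + phase2[1:]  # drop the duplicated intermediate vertex

mid, path = valiant_path("10110", "00101", random.Random(0))
print(mid, path)
```

The path is at most 2n hops long, so the cost of randomization is at most a doubling of path length; what it buys is that no fixed permutation can force many packets through the same vertex.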
24 Analysis
- First analyze Phase 1: for a packet M, let $T_1(M)$ be the number of steps for M to finish Phase 1.
- For an edge e, let $X_1(e)$ denote the number of packets that traverse e during Phase 1.
- Note that in each time step, M is either traversing an edge or waiting in a queue.
- Let $e_1, \dots, e_m$ be the $m \le n$ edges travelled by a packet M in Phase 1. Then:
  $T_1(M) \le \sum_{i=1}^{m} X_1(e_i)$
25 Analysis
- Let us call any path $P = (e_1, \dots, e_m)$ of $m \le n$ edges a possible packet path.
- Following the definition of $T_1(M)$, for any possible packet path P we define:
  $T_1(P) = \sum_{i=1}^{m} X_1(e_i)$
- So, the probability that Phase 1 takes more than T time steps is bounded by the probability that some possible packet path P has $T_1(P) \ge T$.
26 Analysis
- To prove Theorem 3, we need a high-probability bound on $T_1(P)$.
- This is difficult because the $X_1(e_i)$ are not independent random variables (if a packet traverses an edge, it is likely to traverse an adjacent edge).
- Proof sketch: first, prove that with high probability no more than 6n packets cross any edge of P. Then, condition on this event, and prove a high-probability bound on the total number of transitions these packets make through edges of the path P.
- Result: With probability $1 - O(N^{-1})$, no packet takes more than 30n time steps in Phase 1.
27 Analysis
- Phase 2 is like Phase 1, but backwards. Instead of starting at an origin and going to a random destination, you start at a random origin and go to a destination.
- Result: With probability $1 - O(N^{-1})$, no packet takes more than 30n time steps in Phase 2.
- So, with probability $1 - O(N^{-1})$, no packet takes more than 60n time steps in Phases 1 and 2 combined.
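The final step combines the two per-phase results with a union bound and then rewrites 60n in terms of N. The derivation below spells this out, under the simplifying assumption (not stated on the slide) that no packet starts Phase 2 until every packet has finished Phase 1:

```latex
\begin{aligned}
\Pr[\text{some packet needs more than } 60n \text{ steps}]
  &\le \Pr[\text{Phase 1 exceeds } 30n] + \Pr[\text{Phase 2 exceeds } 30n] \\
  &= O(N^{-1}) + O(N^{-1}) = O(N^{-1}), \\
60n &= 60 \log_2 N = O(\log N) \quad \text{since } N = 2^n .
\end{aligned}
```

This is the statement of Theorem 3: with probability $1 - O(N^{-1})$, all packets are delivered within $O(\log N)$ steps.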
28 Randomized Algorithms
- Routing in a sparse network is one example where a randomized algorithm performs better than a deterministic algorithm
- Other examples include:
  - Randomized quicksort
  - Min-cut
29 Valiant Load Balancing In Practice?
- Need to generalize to arbitrary topologies (not just hypercubes).
- Need to balance on flows, not packets.
- Congestion is bounded, but latency can increase.
- Several research systems, but not often deployed in practice.
30 Valiant Load Balancing
1. Choose a random intermediate node
2. Route from source to intermediate node
3. Route from intermediate node to destination
31 Valiant Load Balancing
[Figure with labels "West" and "East"]
32 Equal-Cost Multi-Path Routing
- Compute a set of best paths, e.g., shortest paths
- Identify flows by hashing packet header fields
- Randomly forward along least-cost paths
- Used in many real-world systems
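The hashing step can be sketched as follows. This is an illustration only: the function name `ecmp_next_hop`, the 5-tuple flow key, and the use of SHA-256 are assumptions made for the example, and real routers typically use much cheaper hash functions in hardware.

```python
import hashlib

def ecmp_next_hop(flow, next_hops):
    """flow: (src_ip, dst_ip, protocol, src_port, dst_port); next_hops: equal-cost choices."""
    key = "|".join(map(str, flow)).encode()
    digest = hashlib.sha256(key).digest()
    # Same header fields -> same hash -> same next hop, so a flow is never reordered.
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]

hops = ["r1", "r2", "r3"]
for port in range(5):
    flow = ("10.0.0.1", "10.0.0.2", 6, 40000 + port, 80)
    print(flow, "->", ecmp_next_hop(flow, hops))
```

Hashing gives a per-flow choice that looks random across flows but is deterministic for any one flow, which keeps all packets of a TCP connection on a single path.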
33 ECMP Problems
1. What if there aren't n best paths?
2. What if the best paths are not distinct?
3. Doesn't balance elephant flows well.
34 Google's B4
- Uses a semi-oblivious scheme
- Maintain a global view of the network
- Select a set of k shortest paths
- Send flows along those paths according to a probability
- Compute probabilities by solving MCF using an approximation algorithm
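The "send flows along those paths according to a probability" step can be sketched as weighted, hash-based path selection. This is not B4's implementation: the function name `pick_path`, the flow key, the hash, and the example weights are all hypothetical, and the weights stand in for whatever the MCF solver would produce.

```python
import hashlib
from bisect import bisect_right
from itertools import accumulate

def pick_path(flow, paths, weights):
    """Map a flow to one of `paths` with probability proportional to `weights`."""
    cumulative = list(accumulate(weights))
    digest = hashlib.sha256(repr(flow).encode()).digest()
    # Point in [0, total) derived from the flow hash, so a flow sticks to its path.
    point = int.from_bytes(digest[:4], "big") / 2**32 * cumulative[-1]
    index = min(bisect_right(cumulative, point), len(paths) - 1)
    return paths[index]

paths = ["path-1", "path-2", "path-3"]  # e.g., the k = 3 shortest paths
weights = [0.5, 0.3, 0.2]               # hypothetical split from the MCF solution
counts = {p: 0 for p in paths}
for port in range(1000):
    counts[pick_path(("10.0.0.1", "10.0.0.2", 6, port, 443), paths, weights)] += 1
print(counts)
```

Because only the weights change when demands shift, the set of installed paths can stay fixed, which addresses the constant-state-change problem raised for optimal MCF.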