Communication Networks
Prof. Laurent Vanbever

Communication Networks, Spring 2018
Roland Meier / Thomas Holterbach
Slides: Laurent Vanbever
nsg.ee.ethz.ch
ETH Zürich (D-ITET)
April 9 2018
Materials inspired from Scott Shenker & Jennifer Rexford

Last week on Communication Networks

Internet Protocol and Forwarding
  IP addresses: use, structure, allocation
  IP forwarding: longest prefix match rule
  IP header: IPv4 and IPv6, wire format
  [Figure. Source: Boardwatch Magazine]

Internet Protocol and Forwarding
  IP addresses: use, structure, allocation

IPv4 addresses are unique 32-bit numbers associated with a network interface (on a host, a router, ...)

IP addresses are usually written using dotted-quad notation
  [Figure: an example address in dotted-quad notation and in binary]

IP addressing is hierarchical, composed of a prefix (network address) and a suffix (host address)

Each prefix has a given length, usually written using a slash notation
  [Figure: an example IP prefix written as address / prefix length (in bits), with the address shown in binary]
  The prefix identifies the network
  The suffix identifies the hosts in the network

Communication Networks, Mon 9 Apr 2018
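The prefix/suffix split described above can be sketched with Python's standard ipaddress module. This is a minimal illustration rather than material from the slides: the helper name split_address is hypothetical, and the address 192.0.2.77/24 is an assumption taken from a range reserved for documentation.

```python
import ipaddress


def split_address(cidr):
    """Split 'a.b.c.d/len' into its network prefix and its host suffix."""
    iface = ipaddress.ip_interface(cidr)
    network = iface.network  # prefix: identifies the network
    # suffix: the bits of the address that are not covered by the prefix
    suffix = int(iface.ip) & int(network.hostmask)
    return network, suffix


# Hypothetical example address (documentation range, not from the slides)
network, suffix = split_address("192.0.2.77/24")
print(network)            # the prefix in slash notation
print(network.prefixlen)  # the prefix length in bits
print(network.netmask)    # the same prefix length expressed as a mask
print(suffix)             # the host part as an integer
```

The same prefix can thus be written either with a slash and a length or with an address and a mask; both carry identical information.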
Prefixes are also sometimes specified using an address and a mask
  [Figure: an address and the mask 255.255.255.0, each shown in dotted-quad notation and in binary]

Internet Protocol and Forwarding
  IP forwarding: longest prefix match rule

Routers forward packets to their destination according to the network part, not the host part
  Doing so allows the forwarding tables to scale
  [Figure: hosts on two LANs connected through IP routers across a WAN, with a forwarding table listing the LAN prefixes]
  LAN: Local Area Network
  WAN: Wide Area Network

Routers maintain forwarding entries for each Internet prefix
  [Figure: a provider's forwarding table mapping IP prefixes to output interfaces (IF#)]

To resolve ambiguity, forwarding is done along the most specific prefix (i.e., the longer one)

Internet Protocol and Forwarding
  IP header: IPv4 and IPv6, wire format
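The longest prefix match rule can be sketched as follows. This is a deliberately naive linear scan meant only to show the rule; real routers use specialised data structures. The table contents, interface names and the function name longest_prefix_match are assumptions made for the example.

```python
import ipaddress


def longest_prefix_match(table, destination):
    """Return the output interface of the most specific matching prefix."""
    dst = ipaddress.ip_address(destination)
    best = None  # (matching network, output interface)
    for prefix, interface in table.items():
        net = ipaddress.ip_network(prefix)
        if dst in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, interface)
    return None if best is None else best[1]


# Hypothetical forwarding table with nested prefixes
table = {
    "10.0.0.0/8": "IF#1",
    "10.1.0.0/16": "IF#2",
    "10.1.2.0/24": "IF#3",
}

print(longest_prefix_match(table, "10.1.2.3"))    # matches all three, /24 wins
print(longest_prefix_match(table, "10.1.9.9"))    # matches /8 and /16, /16 wins
print(longest_prefix_match(table, "10.200.0.1"))  # matches only the /8
```

A destination that matches no prefix returns None here; a real router would typically fall back to a default route or drop the packet.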
[Figure: IPv4 header, 32 bits wide]
  version, header length, Type of Service, Total Length
  Identification, Flags, Fragment offset
  Time To Live, Protocol, Header checksum
  Source IP address
  Destination IP address
  Options (if any)
  Payload

This week on Communication Networks

Internet routing
  [Figure: map of the Internet. Source: http://www.opte.org]

traceroute www.google.ch
   1  rou-etx--ee-tik-etx-dock- (8.0.0.)
   2  rou-ref-rz-bb-ref-rz-etx (0.0.0.)
   3  rou-fw-rz-ee-tik (0...9)
   4  rou-fw-rz-gw-rz (9..9.70)
   5  swiix-0ge--.switch.ch (0.59.6.)
   6  swiez (9..9.)
   7  swiix-p.switch.ch (0.59.6.)
   8  equinix-zurich.net.google.com (9..8.58)
   9  66.9.9.57 (66.9.9.57)
  10  zrh0s06-in-f.e00.net (7.9.0.88)

Internet routing comes in two flavors: intra- and inter-domain routing
  inter-domain routing: find paths between networks
  intra-domain routing: find paths within a network

Inter-domain routing: find paths between networks
  [Figure: backbone map with nodes SEAT, SALT, LOSA, KANS, HOUS, CHIC, ATLA, WASH, NEWY, showing traffic from SEAT to Google]
Inter-domain routing: find paths between networks
  Google can be reached via NEWY, WASH, ATLA, HOUS
  The best exit point is chosen based on money, performance, ...
  [Figure: backbone map, traffic from SEAT to Google]

Intra-domain routing: find paths within a network
  NEWY can be reached via SALT
  [Figure: backbone map, traffic from SEAT to Google]

traceroute www.google.ch, annotated
  intra-domain routing
    rou-etx--ee-tik-etx-dock-
    rou-ref-rz-bb-ref-rz-etx
    rou-fw-rz-ee-tik
    rou-fw-rz-gw-rz
  inter-domain routing
  intra-domain routing
    swiix-0ge--.switch.ch
    swiez
    swiix-p.switch.ch
  inter-domain routing
  intra-domain routing
    equinix-zurich.net.google.com
    66.9.9.57
    zrh0s06-in-f.e00.net

Internet routing: from here to there, and back
  Intra-domain routing
    Link-state protocols
    Distance-vector protocols
  Inter-domain routing
    Path-vector protocols
Intra-domain routing enables routers to compute forwarding paths to any internal subnet
  What kind of paths?

Network operators don't want arbitrary paths, they want good paths
  Definition: a good path is a path that minimizes some network-wide metric, typically delay, load, loss, cost
  Approach: assign to each link a weight (usually static), then compute the shortest path to each destination

When weights are assigned proportionally to the distance, shortest paths will minimize the end-to-end delay
  if traffic is such that there is no congestion
  [Figure: Internet2, the US based research network]

When weights are assigned inversely proportionally to each link capacity, throughput is maximized
  if traffic is such that there is no congestion

Internet routing: from here to there, and back
  Intra-domain routing
    Link-state protocols
    Distance-vector protocols
  Inter-domain routing
    Path-vector protocols

In Link-State routing, routers build a precise map of the network by flooding local views to everyone
  Each router keeps track of its incident links and their cost, as well as whether each link is up or down
  Each router broadcasts its own link states to give every router a complete view of the graph

Flooding is performed as in L2 learning, except that it is reliable
  A node sends its link-state on all its links
  The next node does the same, except on the link where the information arrived

Routers run Dijkstra on the corresponding graph to compute their shortest paths and forwarding tables
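The last step, running Dijkstra on the flooded graph to obtain both shortest-path costs and a forwarding table, can be sketched as below. The four-node topology and its link weights are assumptions chosen for illustration, and the forwarding table is reduced to a destination-to-first-hop mapping.

```python
import heapq


def dijkstra(graph, source):
    """Return (distance to each node, first hop toward each node)."""
    dist = {source: 0}
    first_hop = {}
    done = set()
    # heap entries: (path cost, node, first hop used to reach the node)
    heap = [(0, source, None)]
    while heap:
        d, node, hop = heapq.heappop(heap)
        if node in done:
            continue  # stale entry, a shorter path was already settled
        done.add(node)
        if hop is not None:
            first_hop[node] = hop
        for neigh, weight in graph[node].items():
            nd = d + weight
            if neigh not in done and nd < dist.get(neigh, float("inf")):
                dist[neigh] = nd
                # leaving the source, the first hop is the neighbor itself
                next_hop = neigh if node == source else hop
                heapq.heappush(heap, (nd, neigh, next_hop))
    return dist, first_hop


# Hypothetical topology: node -> {neighbor: link weight}
graph = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"B": 5, "C": 1},
}

dist, forwarding_table = dijkstra(graph, "A")
print(dist)              # cost of the shortest path from A to every node
print(forwarding_table)  # first hop A uses toward every other node
```

Every router runs the same computation with itself as the source, which is why all routers need an identical view of the graph.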
Flooding is performed as in L2 learning, except that it is reliable
  A node sends its link-state on all its links
  The next node does the same, except on the link where the information arrived
  All nodes are ensured to receive the latest version of all link-states

  challenges               solutions
  packet loss              ACK & retransmissions
  out-of-order arrival     sequence number
                           time-to-live for each link-state

A link-state node initiates flooding in three conditions
  Topology change: link or node failure/recovery
  Configuration change: link cost change
  Periodically: refresh the link-state information at regular intervals (on the order of tens of minutes) to account for possible data corruption

Once a node knows the entire topology, it can compute shortest paths using Dijkstra's algorithm

By default, Link-State protocols detect topology changes using software-based beaconing
  Routers periodically exchange Hellos in both directions
  A failure is triggered after a few missed Hellos
  [Figure: two OSPF routers exchanging Hello messages]
  Tradeoffs between:
    detection speed
    bandwidth and CPU overhead
    false positives/negatives

During network changes, the link-state database of each node might differ
  Control-plane consistency: all nodes have the same link-state database
  Forwarding validity: the global forwarding state directs packets to their destination
  Control-plane consistency is necessary for forwarding validity

Inconsistencies lead to transient disruptions in the form of blackholes or forwarding loops

Blackholes appear due to detection delay, as nodes do not immediately detect failures
  The delay depends on the timeout for detecting lost Hellos
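The flooding rule, together with the sequence-number check that copes with duplicates and out-of-order arrival, can be sketched as follows. This is a simplified model under stated assumptions: links are lossless, so ACKs and retransmissions are not modelled, delivery is a direct function call, and the Router class and connect helper are hypothetical names.

```python
class Router:
    def __init__(self, name):
        self.name = name
        self.neighbors = []  # directly connected routers
        self.lsdb = {}       # link-state database: origin -> (seq, links)

    def originate(self, seq, links):
        """Flood this router's own link-state on all its links."""
        self.lsdb[self.name] = (seq, links)
        for n in self.neighbors:
            n.receive(self.name, seq, links, sender=self)

    def receive(self, origin, seq, links, sender):
        known = self.lsdb.get(origin)
        if known is not None and known[0] >= seq:
            return  # old or duplicate link-state: ignore, do not re-flood
        self.lsdb[origin] = (seq, links)
        for n in self.neighbors:
            if n is not sender:  # every link except the incoming one
                n.receive(origin, seq, links, sender=self)


def connect(a, b):
    a.neighbors.append(b)
    b.neighbors.append(a)


# Hypothetical topology: a triangle A-B-C with D attached to C
a, b, c, d = (Router(name) for name in "ABCD")
connect(a, b)
connect(b, c)
connect(a, c)
connect(c, d)

a.originate(seq=1, links={"B": 1, "C": 1})
print({r.name: r.lsdb["A"][0] for r in (a, b, c, d)})
```

Because a router re-floods a link-state only the first time it sees a given sequence number, the flood terminates even though the topology contains a cycle.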
Transient loops appear due to inconsistent link-state databases
  [Figure: initial forwarding state in an example topology]
  One node learns about the failure and immediately reroutes
  A loop appears because the neighbor it now forwards to isn't yet aware of the failure
  The loop disappears as soon as that neighbor updates its forwarding table

Convergence is the process during which the routers seek to actively regain a consistent view of the network

Network convergence time depends on four main factors

  factor          time the routers take for
  detection       realizing that a link or a neighbor is down
  flooding        flooding the news to the entire network
  computation     recomputing shortest paths using Dijkstra
  table update    updating their forwarding table

In practice, network convergence time is mostly driven by table updates

  factor          time                     improvements
  detection       few ms                   smaller timers
  flooding        few ms                   high-priority flooding
  computation     few ms                   incremental algorithms
  table update    potentially, minutes!    better table design
[Figure: a router holding a large table of IP prefixes, connected to two neighboring routers. One neighbor leads to a cheap provider ($), the other to a more expensive provider ($$); each provider is identified by an IP and a MAC address]

All forwarding entries point to the neighbor toward the cheaper provider
  [Figure: the router's forwarding table; every prefix has the same next hop, written as a (MAC address, output port) pair]

Upon failure of that neighbor, all forwarding entries have to be updated
  [Figure: the same forwarding table once the neighbor toward the cheaper provider has failed]
[Figure sequence: after the failure, the forwarding table is rewritten one entry at a time; each next hop changes from the (MAC address, output port) pair of the cheaper provider to that of the more expensive provider ($$), until every entry has been updated]

How long does it take for ETH routers to converge?
  [Figure: convergence time (s) versus number of prefixes, measured on Cisco Nexus 9k routers recently deployed at ETH; worst case and median case]
Traffic can be lost for several minutes
  [Figure: worst-case convergence time (s) versus number of prefixes]

The problem is that forwarding tables are flat
  Entries do not share any information, even if they are identical
  Upon failure, all of them have to be updated: inefficient, but also unnecessary

Two universal tricks you can apply to any computer science problem
  When you need more flexibility, you add a layer of indirection
  When you need more scalability, you add a hierarchical structure

Replace this with that
  this: a flat router forwarding table, in which each prefix maps directly to its next hop, a (MAC address, output port) pair
  that: a mapping table, in which each prefix maps to a pointer, plus a pointer table, in which each pointer maps to a next hop (NH)

Upon failures, we update the pointer table
  Here, we only need to do one update
  [Figure: mapping table and pointer table before and after the failure; only the next hop stored in the pointer table changes]
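The difference between the flat and the hierarchical table can be sketched by counting how many entries each design rewrites when a next hop fails. The class names, the 1000 generated prefixes and the two next-hop tuples are assumptions made for the example, and lookup is by exact prefix rather than by longest prefix match to keep the sketch short.

```python
class FlatFib:
    """Flat table: every prefix stores its own copy of the next hop."""

    def __init__(self, prefixes, next_hop):
        self.entries = {p: next_hop for p in prefixes}

    def reroute(self, failed, backup):
        updates = 0
        for prefix, nh in self.entries.items():
            if nh == failed:
                self.entries[prefix] = backup
                updates += 1
        return updates


class HierarchicalFib:
    """Mapping table (prefix -> pointer) plus pointer table (pointer -> NH)."""

    def __init__(self, prefixes, next_hop):
        self.pointers = {0: next_hop}
        self.mapping = {p: 0 for p in prefixes}

    def lookup(self, prefix):
        return self.pointers[self.mapping[prefix]]

    def reroute(self, failed, backup):
        updates = 0
        for pointer, nh in self.pointers.items():
            if nh == failed:
                self.pointers[pointer] = backup
                updates += 1
        return updates


# Hypothetical next hops, written as (MAC address, output port)
PRIMARY = ("01:aa", 0)
BACKUP = ("02:bb", 1)
prefixes = [f"10.{i // 256}.{i % 256}.0/24" for i in range(1000)]

flat = FlatFib(prefixes, PRIMARY)
hier = HierarchicalFib(prefixes, PRIMARY)

flat_updates = flat.reroute(PRIMARY, BACKUP)
hier_updates = hier.reroute(PRIMARY, BACKUP)
print(flat_updates, hier_updates)
print(hier.lookup(prefixes[0]))
```

The flat table performs one write per prefix, while the hierarchical one performs a single write regardless of how many prefixes share the failed next hop.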
A hierarchical table enables convergence within a number of milliseconds that does not depend on the number of prefixes
  [Figure: convergence time (s) versus number of prefixes, with the hierarchical table]

Today, two Link-State protocols are widely used: OSPF and IS-IS

  OSPF                                IS-IS
  Open Shortest Path First            Intermediate System to Intermediate System
  used in many enterprises & ISPs     used mostly in large ISPs
  works on top of IP                  works on top of the link layer
  only routes IPv4 by default         network-protocol agnostic

Internet routing: from here to there, and back
  Intra-domain routing
    Link-state protocols
    Distance-vector protocols
  Inter-domain routing
    Path-vector protocols

Distance-vector protocols are based on the Bellman-Ford algorithm
  Let $d_x(y)$ be the cost of the least-cost path known by x to reach y
  Until convergence: each node bundles these distances into one message (called a vector) that it repeatedly sends to all its neighbors
Let $d_x(y)$ be the cost of the least-cost path known by x to reach y
  Until convergence:
    each node bundles these distances into one message (called a vector) that it repeatedly sends to all its neighbors
    each node updates its distances based on its neighbors' vectors:
      $d_x(y) = \min_{v} \{\, c(x,v) + d_v(y) \,\}$ over all neighbors v, where $c(x,v)$ is the cost of the link between x and v

Similarly to Link-State, three situations cause nodes to send new distance vectors (DVs)
  Topology change: link or node failure/recovery
  Configuration change: link cost change
  Periodically: refresh the routing information at regular intervals (on the order of tens of minutes) to account for possible data corruption

[Figure sequence: for an example network, tables listing for each destination the distance (Dst) and first hop (1st Hop) of the optimum path using at most one hop, then two hops, then three hops]

Let's consider the convergence process after a link cost change
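The update rule d_x(y) = min over neighbors v of c(x,v) + d_v(y) can be sketched as a synchronous simulation in which all nodes exchange vectors in rounds. The three-node topology and its costs are assumptions chosen for the example, and real protocols exchange vectors asynchronously.

```python
INF = float("inf")


def distance_vector(links):
    """Iterate the Bellman-Ford update until no distance changes."""
    nodes = sorted({n for link in links for n in link})
    cost = {x: {} for x in nodes}
    for (x, y), c in links.items():
        cost[x][y] = c
        cost[y][x] = c
    # d[x][y]: cost of the least-cost path known by x to reach y
    d = {x: {y: (0 if x == y else INF) for y in nodes} for x in nodes}
    changed = True
    while changed:
        changed = False
        # snapshot of the vectors every node sends in this round
        received = {x: dict(d[x]) for x in nodes}
        for x in nodes:
            for y in nodes:
                if x == y:
                    continue
                best = min(cost[x][v] + received[v][y] for v in cost[x])
                if best != d[x][y]:
                    d[x][y] = best
                    changed = True
    return d


# Hypothetical topology: (node, node) -> link cost
links = {("A", "B"): 1, ("B", "C"): 2, ("A", "C"): 5}

d = distance_vector(links)
for node, vector in d.items():
    print(node, vector)
```

In this example A first learns a cost of 5 to C over the direct link, then improves it once B's vector reveals the cheaper two-hop path.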
Consider the following network, leading to the following vectors
  [Figure: an example network and the distance vector of each node at t = 0; one node reaches the destination directly, another reaches it through a neighbor]

At t = 0, the weight of one link changes
  The nodes that detect the local cost change update their vectors, and notify their neighbors if their vector has changed

At each following time step, a node that received a new vector updates its own vector and sends it to its neighbors
  [Figure sequence: the vectors of each node at successive time steps]

Eventually no one moves anymore: the network has converged!
The algorithm terminates after only a few iterations
  Good news travels fast!

What about bad news?

At t = 0, the weight of a link changes again, this time for the worse

At each following time step, a node updates its vector and sends it to its neighbors
  [Figure sequence: the vectors of each node at successive time steps; the advertised distances keep growing, and the exchange is still going on many iterations later]
The algorithm terminates only after many iterations!
  Bad news travels slowly!

This problem is known as count-to-infinity, a type of routing loop
  Count-to-infinity leads to very slow convergence
  What if the cost had changed to 9999 instead?

Routers don't know when neighbors use them
  In the example, one node does not know that its neighbor has switched to using it
  Let's try to fix that

Whenever a router uses another one, it will announce an infinite cost to it
  The technique is known as poisoned reverse
  In the example, as one node uses its neighbor to reach the destination, it announces to this neighbor an infinite cost for that destination

[Figure sequence: the same link weight change replayed with poisoned reverse; at each time step a node updates its vector and sends it to its neighbors]
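Count-to-infinity and the effect of poisoned reverse can be reproduced with a small synchronous simulation toward a single destination. Everything specific here is an assumption made for the sketch, not the example from the slides: the node names X, Y, Z, the link costs, and the cost change on link X-Y from 4 to 60.

```python
INF = float("inf")


def converge(cost, dest, state, poisoned_reverse, max_rounds=1000):
    """Run synchronous distance-vector rounds toward `dest` until stable.

    cost:  {node: {neighbor: link cost}}
    state: {node: (distance to dest, next hop)} before the first round
    Returns (final state, number of rounds in which something changed).
    """
    state = dict(state)
    for rounds in range(max_rounds):
        new = {}
        for x in cost:
            if x == dest:
                new[x] = (0, None)
                continue
            options = []
            for v, c in cost[x].items():
                dv, via = state[v]
                if poisoned_reverse and via == x:
                    dv = INF  # v routes through x, so it tells x "infinity"
                options.append((c + dv, v))
            new[x] = min(options)
        if new == state:
            return state, rounds
        state = new
    return state, max_rounds


# Hypothetical scenario: link X-Y used to cost 4 and now costs 60
cost = {
    "X": {"Y": 60, "Z": 50},
    "Y": {"X": 60, "Z": 1},
    "Z": {"X": 50, "Y": 1},
}
# Converged state from before the change: Z reaches X through Y
before = {"X": (0, None), "Y": (4, "X"), "Z": (5, "Y")}

plain, plain_rounds = converge(cost, "X", before, poisoned_reverse=False)
poisoned, poisoned_rounds = converge(cost, "X", before, poisoned_reverse=True)

print(plain, plain_rounds)
print(poisoned, poisoned_rounds)
```

Both runs end in the same state, but without poisoned reverse Y and Z keep raising each other's estimates in small increments until the path through Z's direct link finally becomes cheaper. With poisoned reverse, Y never considers the route that Z learned from Y in the first place.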
With poisoned reverse, after a few more updates no one moves: the network has converged!

While poisoned reverse solved this case, it does not solve loops involving three or more nodes
  Actual distance-vector protocols mitigate this issue by using a small "infinity", e.g. 16
  (see exercise session)

Link-State vs Distance-Vector routing

                     Message complexity          Convergence speed    Robustness
  Link-State         O(nE) messages sent         relatively fast      a node can advertise an incorrect link cost;
                     (n: #nodes, E: #links)                           nodes compute their own table
  Distance-Vector    between neighbors only      slow                 a node can advertise an incorrect path cost;
                                                                      errors propagate

Internet routing: from here to there, and back
  Intra-domain routing
    Link-state protocols
    Distance-vector protocols
  Inter-domain routing
    Path-vector protocols
Internet: a network of networks
  Border Gateway Protocol (BGP)

The Internet is a network of networks, referred to as Autonomous Systems (AS)
  Each AS has a number (encoded on 16 bits) which identifies it
  [Figure: interconnected ASes, each labeled with its AS number]

BGP is the routing protocol gluing the entire Internet together

Using BGP, ASes exchange information about the IP prefixes they can reach, directly or indirectly
  [Figure: BGP sessions between ASes; one AS originates 129.132.0.0/16 (ETH/UNIZH Camp Net)]

BGP needs to solve three key challenges: scalability, privacy and policy enforcement
  Scalability: there is a huge number of networks and prefixes (700k prefixes, tens of thousands of networks, millions (!) of routers)
  Privacy: networks don't want to divulge internal topologies or their business relationships
  Policy enforcement: networks need to control where to send and receive traffic without an Internet-wide notion of a link cost metric

Link-State routing does not solve these challenges
  It floods topology information: high processing overhead
  It requires each node to compute the entire path: high processing overhead
  It minimizes some notion of total distance: works only if the policy is shared and uniform!
Distance-vector routing is on the right track, but not really there yet
  pros
    It hides details of the network topology: nodes determine only the next hop for each destination
  cons
    It still minimizes some common distance: impossible to achieve in an inter-domain setting
    It converges slowly: counting-to-infinity problem

BGP relies on path-vector routing to support flexible routing policies and avoid count-to-infinity
  key idea: advertise the entire path instead of distances

BGP announcements carry complete path information instead of distances
  [Figure: an announcement for 129.132.0.0/16 (ETH/UNIZH Camp Net) carrying its path between ASes]

Each AS appends itself to the path when it propagates announcements
  [Figure: the path of the announcement growing by one AS number at each AS it crosses]

Complete path information enables ASes to easily detect a loop
  ETH sees itself in the path and discards the route

The life of a BGP router is made of three consecutive steps
  while true:
    receive routes from my neighbors
    select one best route for each prefix
    export the best route to my neighbors
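The three steps of the loop above, together with path-based loop detection, can be sketched as follows. This is a toy model, not BGP itself: the AS numbers and the prefix come from ranges reserved for documentation, the function names are hypothetical, and picking the shortest path as the best route is an assumption standing in for a real, policy-driven decision process. The sketch places the newest AS at the front of the path.

```python
def receive(local_as, announcement):
    """Step 1: accept a route unless the local AS is already in its path."""
    prefix, path = announcement
    if local_as in path:
        return None  # loop detected: discard the route
    return prefix, path


def select_best(routes):
    """Step 2: pick one best route (here, simply the shortest path)."""
    return min(routes, key=lambda route: len(route[1])) if routes else None


def export(local_as, route):
    """Step 3: add the local AS to the path before propagating the route."""
    prefix, path = route
    return prefix, [local_as] + path


# Hypothetical chain of ASes: 64500 originates, 64501 and 64502 propagate
origin = ("192.0.2.0/24", [64500])

at_64501 = receive(64501, origin)
sent_by_64501 = export(64501, select_best([at_64501]))

at_64502 = receive(64502, sent_by_64501)
sent_by_64502 = export(64502, select_best([at_64502]))

# The announcement comes back to the originating AS, which finds itself in it
back_at_origin = receive(64500, sent_by_64502)

print(sent_by_64502)
print(back_at_origin)
```

Because the whole path travels with the announcement, the originating AS can reject the looping route immediately instead of slowly counting upward as a distance-vector protocol would.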
Each AS can apply local routing policies
  Each AS is free to
    select and use any path, preferably the cheapest one
      e.g., always prefer Deutsche Telekom routes over AT&T
    decide which path to export (if any) to which neighbor, preferably none, to minimize carried traffic
      e.g., do not export ETH routes to AT&T
  [Figure: announcements for the ETH prefix and the resulting IP traffic under these policies]

Commercial break

armasuisse Science and Technology: Open Internship Positions
  Thun, Switzerland (a minimum duration in months applies)
    Network / Cyber Security
    Big Data / Data Science
    Security and Privacy in Digital Avionics
  Tallinn, Estonia (min. 6 months)
    Network / Cyber Security
    Digital Forensics
    Network Monitoring
  Interested students are encouraged to apply at: vincent.lenders@armasuisse.ch

Next week on Communication Networks
  Internet routing policies