Tuning Core Routing Protocols Николай Милованов/Nikolay Milovanov
Contents
- IS-IS overview
- IS-IS tuning
http://niau.org

IS-IS overview
IP routing design requirements
Requirements for the IGP protocols:
- Scalability
- Stability
- Security
- Fast convergence

IS-IS overview
- IS-IS is a link-state protocol with a hierarchical concept of areas.
- It propagates topological information to the associated areas (reliable flooding) and uses the Dijkstra algorithm to compute loop-free least-cost paths to every node in the connected areas.
- IS-IS is based on CLNS addressing. It was originally designed within the ISO primarily for the CLNS protocol; IP prefix propagation was added later (at the moment IS-IS has its own working group in the IETF).
- IP, CLNS and other address prefixes are external to the link-state protocol: the pure SPF computation does not deal with any IP addresses, only with nodes and links.
- Address prefix reachability in IS-IS is computed on top of the Dijkstra computation.

What IS-IS and OSPF have in common
- Both maintain a link state database from which a Dijkstra-based SPF algorithm computes a shortest-path tree.
- Both use Hello packets to form and maintain adjacencies.
- Both use areas to form a two-level hierarchical topology.
- Both can provide address summarization between areas.
- Both are classless protocols (support VLSM).
- Both elect a designated router to represent a broadcast network.
Even if you have broadcast media in a service provider core network, stick to point-to-point links.

Hierarchy of IS-IS
OSI IS-IS routing makes use of two-level hierarchical routing:
- The backbone is called level 2 (L2)
- Areas are called level 1 (L1)

IS-IS Hierarchy: L1 routers
- Perform intra-area routing.
- Have neighbors only in the same area.
- A routing domain is partitioned into areas, and L1 routers have information only about their own area.
- Do not know the identity of routers or destinations outside of their area.
- Level 1-only routers look at the attached bit in level 1 LSPs to find the closest L1/L2 router in the area.
- Use the closest L1/L2 router to exit the area.

IS-IS Hierarchy: L2 routers
- Perform inter-area routing.
- May have neighbors in other areas.
- Know the level 2 topology.
- Know which addresses are reachable via each level 2 router.
- Do not need to know the topology within any level 1 area, except to the extent that a level 2 router may also be a level 1 router within a single area (such a router is called an L1/L2 router and has both L1 and L2 LSDBs).
- Can exchange data packets or routing information directly with external routers located outside of their area.

IS-IS Hierarchy
IS-IS vs OSPF area design: OSPF
- The area border is inside routers (ABRs).
- Each link belongs to one area.

IS-IS vs OSPF area design: IS-IS
- Each IS-IS router belongs to exactly one area.
- IS-IS is more flexible when extending the backbone.

IS-IS areas
- Area borders are on links, not on routers.
- All the routers are completely within an area.
- Routers that connect areas are Level 2 routers, and routers that have no direct connectivity to another area are Level 1 routers.
- L1 routers are analogous to OSPF nonbackbone internal routers, L2 routers are analogous to OSPF backbone routers, and L1L2 routers are analogous to OSPF ABRs.
- L1L2 routers must maintain both a level 1 link state database and a level 2 link state database.
- L1L2 routers do not advertise L2 routes to L1 routers, so L1 routers have no knowledge of destinations outside of their own area. In this respect L1 routers are similar to routers in an OSPF totally stubby area.
- Because L1L2 routers maintain separate level 1 and level 2 link state databases, they calculate separate SPF trees for the level 1 and level 2 topologies.

IS-IS routing logic
L1 router: for a destination address, compare the area ID to this area.
- If not equal, pass to the nearest L1/L2 router.
- If equal, use the L1 database to route by system ID.
L1/L2 router: for a destination address, compare the area ID to this area.
- If not equal, use the L2 database to route by area ID.
- If equal, use the L1 database to route by system ID.

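The decision rules above can be sketched as a small function. This is only an illustration of the comparison logic; the function name, the level strings and the returned descriptions are assumptions made for the example and do not correspond to any router implementation.

```python
def routing_decision(router_level: str, own_area: str, dest_area: str) -> str:
    """Return the routing step for a destination, following the slide's rules.

    router_level is "L1" or "L1/L2" (hypothetical labels for this sketch).
    """
    if router_level not in ("L1", "L1/L2"):
        raise ValueError("router_level must be 'L1' or 'L1/L2'")
    if dest_area == own_area:
        # Same area: both router types route on the system ID with the L1 database.
        return "use L1 database, route by system ID"
    if router_level == "L1":
        # Different area: an L1 router only knows how to leave the area.
        return "pass to nearest L1/L2 router"
    # Different area on an L1/L2 router: route on the area ID with the L2 database.
    return "use L2 database, route by area ID"


print(routing_decision("L1", "49.0001", "49.0002"))
print(routing_decision("L1/L2", "49.0001", "49.0002"))
```
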
IS-IS packets
- IS-IS Hello packets (used for maintaining adjacencies): a hello packet is sent every 10 sec and the dead interval is 30 sec. Types: ESH, ISH, IIH.
- Link State Packets (LSPs): contain all information about one router, such as connected IP prefixes, area addresses, etc. One LSP per router.
- PSNP: used for requesting and confirming link state information.
- CSNP: used to distribute a description of the complete link state database.

IS-IS Hellos
IS-IS operation
- Send Hellos and build adjacencies.
- Create an LSP and flood it to neighbors.
- Receive all LSPs from neighbors.
- Run the SPF algorithm to calculate the topology.
- Run PRC (Partial Route Calculation) to calculate IP routing information.
- Newly received LSPs trigger SPF and/or PRC calculations.

SPF algorithm - overview
- For the SPF algorithm to work, all routers in the OSPF/IS-IS network must know about the links and all the other routers in the same network.
- OSPF encodes its link-state information in Link State Advertisements (LSAs) and floods them. IS-IS encodes its information in Link State Packets (LSPs).
- When the initial data collection is completed, the OSPF/IS-IS process runs the Dijkstra Shortest Path First algorithm to find the shortest path from itself to all the other routers in the network.
- The same process happens on each router in the network. When the algorithm has completed, all the routers have a consistent view and consistent routing can start.

SPF runtime operation
- The Dijkstra algorithm puts the router at the root of a tree and calculates the shortest path to each destination.
- While the overall picture on all routers is the same (they all see the same routers and links), each router sees the result from its own point of view. It is like sharing a room with 3 other people, each standing in a different corner: asked to describe an object, everyone describes the same object, but it looks a bit different from different angles.
- When any change is noticed (a link state change), SPF starts the calculation all over again and rebuilds the map.
- Splitting the network into many areas reduces these frequent recalculations, because there are fewer routers per area. This is a major consideration when using a link-state protocol.

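The idea of each router being the root of its own tree can be illustrated with a textbook Dijkstra implementation. This is a minimal sketch, not how any router actually implements SPF; the three-node topology, its metrics and the function name `spf` are assumptions made for the example.

```python
import heapq

# Hypothetical topology: node -> {neighbor: metric}. Links are listed in both directions.
TOPOLOGY = {
    "A": {"B": 10, "C": 30},
    "B": {"A": 10, "C": 10},
    "C": {"A": 30, "B": 10},
}


def spf(graph, root):
    """Dijkstra shortest-path-first with `root` at the root of the tree.

    Returns (cost, parent): least cost to every reachable node, and each
    node's parent in the shortest-path tree.
    """
    cost = {root: 0}
    parent = {root: None}
    done = set()
    heap = [(0, root)]
    while heap:
        node_cost, node = heapq.heappop(heap)
        if node in done:
            continue  # stale heap entry, a cheaper path was already processed
        done.add(node)
        for neighbor, metric in graph.get(node, {}).items():
            new_cost = node_cost + metric
            if neighbor not in cost or new_cost < cost[neighbor]:
                cost[neighbor] = new_cost
                parent[neighbor] = node
                heapq.heappush(heap, (new_cost, neighbor))
    return cost, parent


# Same database, different roots: each router gets its own view.
for router in TOPOLOGY:
    print(router, spf(TOPOLOGY, router)[0])
```

Every router runs the same computation on the same database, but with itself as the root, which is why the resulting trees differ per router while remaining mutually consistent.
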
CLNS addressing
- OSI network-layer addressing is implemented with NSAP addresses. The NSAP address identifies any system in the OSI network.
- Various NSAP formats are used in various systems, as different protocols may use different representations of the NSAP.
- Example: 49.0001.aaaa.bbbb.cccc.00
  Area = 49.0001, SysID = aaaa.bbbb.cccc, Nsel = 00
- The system ID is 6 bytes (three groups of four hex digits). A common way to derive it from a loopback address is to pad each octet to three digits and regroup: if your loopback is 100.100.100.100, you may configure SysID = 1001.0010.0100.

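The address layout in the example can be checked with a short script. This is a sketch under the assumption of a 6-byte system ID and a 1-byte NSEL at the end of the address, as in the example above; the helper names `parse_nsap` and `loopback_to_sysid` are made up for the illustration.

```python
def _group(digits: str, size: int) -> str:
    """Join a digit string into dot-separated groups of `size` characters."""
    return ".".join(digits[i:i + size] for i in range(0, len(digits), size))


def parse_nsap(nsap: str) -> dict:
    """Split an NSAP/NET into area, system ID and NSEL.

    Assumes a 1-byte NSEL and a 6-byte system ID at the end of the address;
    everything before them is treated as the area address.
    """
    digits = nsap.replace(".", "").lower()
    if len(digits) < 16 or len(digits) % 2:
        raise ValueError("expected at least 8 whole bytes of hex digits")
    area, sysid, nsel = digits[:-14], digits[-14:-2], digits[-2:]
    area_text = area[:2] + ("." + _group(area[2:], 4) if len(area) > 2 else "")
    return {"area": area_text, "sysid": _group(sysid, 4), "nsel": nsel}


def loopback_to_sysid(ipv4: str) -> str:
    """Pad each IPv4 octet to three digits and regroup into a system ID."""
    octets = ipv4.split(".")
    if len(octets) != 4 or not all(o.isdigit() and int(o) <= 255 for o in octets):
        raise ValueError("expected a dotted-quad IPv4 address")
    return _group("".join(f"{int(o):03d}" for o in octets), 4)


print(parse_nsap("49.0001.aaaa.bbbb.cccc.00"))
print(loopback_to_sysid("100.100.100.100"))
```
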
Why IS-IS
- IS-IS is reliable and scalable. In the context of the MetroE design it can have hundreds of nodes in a single IS-IS area, and it lets the network grow in the future without scalability concerns.
- TE requires a link-state protocol, and IS-IS is a link-state protocol. New TE features are typically introduced earlier for IS-IS than for OSPF.
- Supports a 2-level hierarchy: level-1 (areas) and level-2 (backbone).
- All the prefixes are in a single LSP. This helps in tracing all routing information announced by a particular router.
- IS-IS has been successfully deployed by many large ISPs.
- IS-IS is an OSI protocol; Integrated IS-IS carries both CLNS and IP prefixes.
- Less resource usage: IS-IS databases contain one LSP per router in the routing domain.

IS-IS limitations
- Narrow metrics are 6 bits wide (0 to 63), and the default interface metric is 10 unless manually specified. Therefore use wide metrics.
- All areas are stub areas, which might result in suboptimal routing between areas.
- No filtering is allowed: all ISs must have the same view of an area (CLNS).
- IP address space summarization is possible.

IS-IS tuning
Questions
- The physical layer: how fast can a down link be detected?
- Routing protocol convergence: how fast can a routing protocol react to the topology change?
- Forwarding: how fast can the forwarding engine on each router in the network adjust to the new paths that the routing protocol calculates?

Determine IS-IS: single/multiple area routing, Level-1/Level-2 routing policy
- Put everything that needs something other than a default route in L2.
- Put hosts that need only a default route in L1.
- The routers in between shall be L1/L2.
Simple config example:
  router isis
   is-type level-2-only
  !
  interface loopback 1
   ip router isis
   isis circuit-type level-2-only
  !
  interface loopback 10
   ip router isis
   isis circuit-type level-2-only
  !
  interface GigabitEthernet x/y
   ip router isis
   isis circuit-type level-2-only
   isis network point-to-point
  !
  ! this is the link into the core
  interface POS w/e.r
   ip router isis
   isis circuit-type level-2-only
  !
  ! this is the link between PE routers (on the crossbar)
  interface VLAN z
   ip router isis
   isis circuit-type level-2-only
  !
  router isis
   passive-interface loopback 1
   passive-interface loopback 10

Define IS-IS metrics
Use wide metrics.
Config example:
  router isis
   is-type level-1-2
   metric-style wide level-1-2
  !
  interface GigabitEthernet x/y
   ip router isis
   isis metric <metric> level-2

IS-IS: Timers and general parameters
- The main reasons to change default timers in a network are to improve convergence time and to reduce the amount of bandwidth used by regular update and keepalive traffic such as LSPs, PSNPs and IIHs.
- Decreasing hello and hold timers can reduce re-convergence times, but in a network with unstable links this is a delicate trade-off: there is always a compromise between fast convergence and network stability.
- Unstable networks with reduced timers will not only spend bandwidth on unnecessary updates but also increase router CPU and memory usage, because the SPF and PRC computations have to be run more often.

LSP refresh
- Lsp-refresh-interval specifies the time in seconds the router will wait before refreshing and retransmitting its own LSPs. It is important that the lsp-refresh-interval is lower than the max-lsp-lifetime so that the LSP never ages out.
- Max-lsp-lifetime specifies the maximum lifetime in seconds carried in the LSP header. Routers use this timer to age out and purge old LSPs. The recommendation is to increase this timer to the maximum of 65535 seconds (~18.2 hours). This decreases the amount of unnecessary LSP reflooding.
Config example:
  router isis
   max-lsp-lifetime 65535
   lsp-refresh-interval 65000

Exponential backoff timers
Exponential backoff timers have been implemented in IS-IS to control SPF computation, PRC computation and LSP generation.
- Prc-interval: specifies the number of seconds between two consecutive PRC calculations.
- Spf-interval: specifies the number of seconds between two consecutive SPF calculations.
- Lsp-gen-interval: specifies the number of seconds between creating new versions of the same LSP.

Exponential backoff timers (cont.)
Backoff timer syntax: xxx-interval <MaxInt> [<InitWait> <Inc>]
- <MaxInt>: maximum time in seconds between SPF runs, PRC runs or LSP generations.
- <InitWait>: time in msec between the first trigger and the SPF run, PRC run or LSP generation.
- <Inc>: time in msec between the first and second SPF run, PRC run or LSP generation.
The exponential backoff algorithm operates as follows:
- An initial event triggers SPF, PRC or LSP generation. <InitWait> determines the time in msec between this event and the start of SPF, PRC or LSP generation.
- <Inc> determines the time in msec the router will wait between consecutive SPF/PRC executions or LSP generations. This interval is <Inc> between the first and second event, 2 x <Inc> between the second and third event, 4 x <Inc> between the third and fourth event, 8 x <Inc> between the fourth and fifth event and so on, until <MaxInt> has been reached.
- <MaxInt> determines the maximum time in seconds the router will wait between consecutive SPF/PRC executions or LSP generations.
- After 2 x <MaxInt> seconds without a trigger, all timers are reset to their initial value of <InitWait>.
Configuration example:
  router isis
   spf-interval 5 1 50
   prc-interval 5 1 50
   lsp-gen-interval 5 1 50

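The doubling behaviour described above can be made concrete with a few lines of code. This is a sketch of the description on this slide only, not of any vendor implementation; the function name `backoff_waits` and its `runs` parameter are assumptions made for the example.

```python
def backoff_waits(max_int_s: int, init_wait_ms: int, inc_ms: int, runs: int) -> list:
    """Wait in msec before each of `runs` consecutive runs during a burst of triggers.

    Run 1 waits <InitWait>; run 2 waits <Inc>; every later run doubles the
    previous wait, and no wait exceeds <MaxInt> (given in seconds).
    """
    cap_ms = max_int_s * 1000
    waits = []
    for n in range(runs):
        wait = init_wait_ms if n == 0 else inc_ms * 2 ** (n - 1)
        waits.append(min(wait, cap_ms))
    return waits


# Values from the configuration example: spf-interval 5 1 50
print(backoff_waits(5, 1, 50, 10))
```

With the values from the configuration example, the first run starts after 1 msec, the waits then grow as 50, 100, 200, 400, 800, 1600 and 3200 msec, and from the ninth run onward they stay at the 5-second ceiling.
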
Interface dampening
- IP Event Dampening introduces a configurable exponential decay mechanism to suppress the effects of excessive interface flapping on routing protocols and routing tables in the network.
- This feature allows the network operator to configure a router to automatically identify and selectively dampen a local interface that is flapping.
The following routes are affected by interface dampening:
- Connected routes: the connected routes of dampened interfaces are not installed into the routing table. When a dampened interface is unsuppressed, the connected routes are installed into the routing table if the interface is up.
- Static routes: static routes assigned to a dampened interface are not installed into the routing table. When a dampened interface is unsuppressed, the static route is installed into the routing table if the interface is up.
Config example:
  interface GigabitEthernet 1/0
   carrier-delay msec 16
   isis network point-to-point
   dampening
  !
  interface POS 2/0
   carrier-delay msec 0
   dampening

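The exponential decay mechanism can be sketched as a penalty that grows with every flap and halves once per half-life. The class below is an illustrative model only: the class name, the penalty of 1000 per flap, the 5-second half-life and the suppress/reuse thresholds of 2000/1000 are assumed values chosen for the example, so check the platform documentation for the real defaults.

```python
class DampenedInterface:
    """Toy model of flap dampening with an exponentially decaying penalty."""

    def __init__(self, half_life_s=5.0, reuse=1000.0, suppress=2000.0, flap_penalty=1000.0):
        self.half_life_s = half_life_s
        self.reuse = reuse
        self.suppress = suppress
        self.flap_penalty = flap_penalty
        self.penalty = 0.0
        self.last_update_s = 0.0
        self.suppressed = False

    def _decay(self, now_s):
        # The penalty halves every half_life_s seconds.
        elapsed = now_s - self.last_update_s
        self.penalty *= 0.5 ** (elapsed / self.half_life_s)
        self.last_update_s = now_s
        if self.suppressed and self.penalty < self.reuse:
            self.suppressed = False  # decayed below the reuse threshold

    def flap(self, now_s):
        """Record one flap at time now_s (seconds, non-decreasing)."""
        self._decay(now_s)
        self.penalty += self.flap_penalty
        if self.penalty >= self.suppress:
            self.suppressed = True

    def is_suppressed(self, now_s):
        self._decay(now_s)
        return self.suppressed


iface = DampenedInterface()
for t in (0.0, 1.0, 2.0):  # three flaps, one second apart
    iface.flap(t)
print(iface.is_suppressed(2.0), iface.is_suppressed(12.0))
```

In this model a single flap is tolerated, while a quick burst of flaps pushes the penalty over the suppress threshold and the interface stays dampened until the penalty has decayed below the reuse threshold.
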
IS-IS fast convergence
- IGP convergence in an MPLS core has an impact on the traffic of all the services supported by the MPLS core.
- It affects only the traffic between routers in the core network. For example, in the context of MPLS L2/L3 VPNs, IGP convergence affects the traffic flow from one PE to another PE.
- Other factors (PE-CE routing, Multiprotocol BGP convergence, etc.) also influence convergence of the traffic between VPN sites.

IS-IS fast convergence (cont.): fast convergence at adjacency setup
- When a router reloads, packets forwarded to the router could be lost, and the router would effectively behave like a black hole for a short while.
- This can happen because I/IS-IS considers an adjacency to be established and valid before CSNP packets have been exchanged, and thus before the LSDBs of the neighbors have been fully synchronized.
- In a recent I/IS-IS implementation, a router immediately floods its own LSP, even before sending CSNP packets. This does not eliminate the black-holing problem described above. However, because the LSP is flooded immediately, the overload bit can be set to advise the rest of the network not to route transit traffic through the newly reloaded router.
Example configuration:
  router isis
   set-overload-bit on-startup 180

IS-IS fast convergence (cont.): fast down detection
Fast hellos
- Suppose that a certain protocol transmits a keepalive or hello packet every 10 seconds and declares a neighbor down after not hearing hello packets for 30 seconds.
- Those 30 seconds are the maximum time it takes for a neighbor failure to be detected, not the average time.

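The difference between the maximum and the average detection time can be worked out with simple arithmetic. The sketch below assumes that hellos arrive exactly on schedule (no jitter or loss), that the hold timer restarts with every received hello, and that the failure is equally likely at any moment between two hellos; the function names are made up for the illustration.

```python
def detection_delay(hold_time_s: float, failure_offset_s: float) -> float:
    """Seconds from the failure until the neighbor is declared down.

    failure_offset_s is how long after the last received hello the neighbor failed.
    """
    return hold_time_s - failure_offset_s


def detection_bounds(hello_interval_s: float, hold_time_s: float):
    """Return (best, average, worst) detection delay in seconds."""
    worst = detection_delay(hold_time_s, 0.0)              # fails right after a hello
    best = detection_delay(hold_time_s, hello_interval_s)  # fails just before the next hello
    average = (best + worst) / 2                           # failure time uniformly distributed
    return best, average, worst


# Hello every 10 s, neighbor declared down after 30 s of silence.
print(detection_bounds(10, 30))
```

Under these assumptions the detection time for the 10/30-second example lies between 20 and 30 seconds, with an average of 25 seconds, which is why shorter hello and hold timers (or BFD) are needed for fast down detection.
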
IS-IS fast convergence (cont.): Bidirectional Forwarding Detection (BFD)
- BFD is a form of fast hello at Layer 2.5.
- BFD is a standards-based approach that works with IS-IS, OSPF and BGP.

IS-IS fast convergence (cont.): incremental SPF
- The topology tree (SPT) is used to populate the routing table with routes to IP networks. Normally, when changes occur, the entire SPT is recomputed.
- In many cases the entire SPT need not be recomputed, because most of the tree remains unchanged. Incremental SPF allows the system to recompute only the affected part of the tree.
- Recomputing only a portion of the tree rather than the entire tree results in faster IS-IS convergence and saves CPU resources.
- Incremental SPF performs only the steps needed to apply the changes to the network topology. This requires the router to keep more information about the topology, but it reduces the CPU demand.
Configuration example:
  router isis
   ispf level-2

IS-IS fast convergence (cont.): priority-driven IP prefixes
- The IS-IS Support for Priority-Driven IP Prefix RIB Installation feature allows customers to designate a subset of IP prefixes for faster processing and installation in the global routing table, as one way to achieve faster convergence.
- For example, loopback addresses may need to be processed first so that BGP next-hop reachability is updated faster than other prefixes.
Configuration example:
  Router(config)# int lo 1
  Router(config-if)# isis tag 100
  Router(config)# int lo 10
  Router(config-if)# isis tag 200
  Router(config)# router isis
  Router(config-router)# ip route priority high tag 100
  Router(config-router)# ip route priority medium tag 200

IS-IS fast convergence (cont.): carrier delay
- When an interface driver on a router detects a physical link failure (i.e. AIS, LOS or by other means), it signals this failure to the upper layers of the IOS operating system. This produces a log message reporting the link or line protocol going down.
- As soon as the router OS signals the link or line protocol going down, IS-IS starts to calculate the new topology by running SPF.
- After detecting a link failure, the interface driver waits a certain time before signaling the failure to the upper layers. This interval is configurable per interface with the interface configuration command carrier-delay.
Configuration example:
  interface GigE <X/Y>
   carrier-delay msec 16

Other IS-IS parameters
- Log-adjacency-changes: causes I/IS-IS to generate a log message when an I/IS-IS adjacency changes state (up or down).
- Ignore-lsp-errors: allows the router to ignore I/IS-IS link-state packets that are received with internal checksum errors rather than purging them. This avoids purge and flood storms caused by LSPs with bad checksums. This will be enabled by default.
- Passive-interface: allows I/IS-IS to include the IP prefix of an interface in its own LSP as internal, but no I/IS-IS packets (IIHs or LSPs) are sent over the interface.
- Hello padding: I/IS-IS Hello PDUs (IIH) are padded to the full MTU size by default. This can have a negative impact on time-sensitive application traffic across low-bandwidth interfaces, or on interface buffer resources if fast hellos are being used. Hello padding will be disabled globally.

MPLS core networks
Eng. Николай Милованов/Nikolay Milovanov
http://niau.org email: nmil@niau.org