External Routing BGP Jean Yves Le Boudec Fall 2012


ÉCOLE POLYTECHNIQUE FÉDÉRALE DE LAUSANNE
External Routing BGP
Jean-Yves Le Boudec, Fall 2012
Self Organization

Contents
A. What Inter-Domain Routing does
   1. Inter-Domain Routing
   2. Policy Routing
B. How BGP works
   1. How it works
   2. Aggregation
   3. Interaction BGP, IGP, Packet Forwarding
   4. Other Attributes
   5. Bells and Whistles
C. Illustrations and Statistics
Textbook: Section 5.1.1, The control plane

A. 1. Inter-Domain Routing

Why invented?
- The Internet is too large to be run by one routing protocol.
- Hierarchical routing is used: the Internet is split into Domains, or Autonomous Systems.
- With OSPF, large domains are split into Areas.
- Routing protocols are said to be
  - interior (Interior Gateway Protocols, IGPs), inside ASs: RIP, OSPF (standard), IGRP (Cisco)
  - exterior, between ASs: EGP (old), BGP-1 to BGP-4 (today), IDRP (tomorrow, maybe)

What is an ARD? An AS?
- ARD = Autonomous Routing Domain = routing domain under one single administration
  - one or more border routers
  - all subnetworks inside an ARD should be connected
  - should learn about other subnetwork prefixes: the routing tables of internal routers should contain entries for all destinations of the Internet
- AS = Autonomous System = ARD with a number ("AS number")
  - the AS number is 32 bits, denoted with dotted integer notation, e.g. 23.3456
  - 0.559 means the same as 559
  - private AS numbers: 0.64512 to 0.65535
- ARDs that do not need a number are typically those with a default route to the rest of the world.
- Examples: AS1942 CICG GRENOBLE, AS2200 Renater, AS559 SWITCH Teleinformatics Services, AS5511 OPENTRANSIT. EPFL: one ARD, no number.
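The dotted notation above can be illustrated with a minimal Python sketch. The function names are hypothetical, and the private range is simply the one quoted on the slide.

```python
def asdot_to_asplain(text):
    """Convert 'high.low' dotted notation (or a plain integer string)
    into a single 32-bit AS number."""
    if "." not in text:
        value = int(text)
        if not 0 <= value < 2**32:
            raise ValueError("AS number must fit in 32 bits")
        return value
    high, low = (int(part) for part in text.split("."))
    if not (0 <= high < 65536 and 0 <= low < 65536):
        raise ValueError("each half must fit in 16 bits")
    return high * 65536 + low


def is_private(asn):
    """Private range as quoted on the slide: 0.64512 to 0.65535."""
    return 64512 <= asn <= 65535


print(asdot_to_asplain("0.559"))    # same AS as plain 559
print(asdot_to_asplain("23.3456"))  # 23 * 65536 + 3456
```

The high half is the upper 16 bits, so any AS number below 65536 reads the same in both notations once the leading "0." is dropped.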

BGP and IGP
- ARDs can be transit (B and D), stub (A) or multihomed (C). Only non-stub domains need an AS number.
- An IGP is used inside a domain; BGP is used between domains.
[Figure: ARDs A (routers A1 to A4; RIP, RIPng), B (B1 to B4), C (C1 to C4; EIGRP) and D (D1 to D6; OSPF with areas 0, 1 and 2), interconnected by BGP-4.]

What does BGP do?
- BGP is a routing protocol between ARDs. It is used to compute paths from one router in one ARD to any network prefix anywhere in the world.
- BGP can handle both IPv4 and IPv6 addresses in a single process.
- The method of routing is
  - path vector
  - with policy

Path Vector Routing
- What is the requirement? Find best routes, in a sense that can be decided by every ARD using its own criteria.
- How does it work?
  - a route between neighbours is (path: dest), where path is a sequence of AS numbers and dest is an IP prefix; example: B A: n1
  - every AS uses its own rules for deciding which path is better
  - the BGP table keeps a record of best paths to all destinations
  - an AS announces only the best paths it knows
Q. Explain how E can choose the best paths to n1 and n2.
Q. How can loops be avoided?
[Figure: ASs A (n1, n2), B (n5), C (n3), D (n4) and E. Announcements shown: A: n1,n2; B A: n1,n2; B: n5; C A: n1,n2; C: n3; D C A: n1,n2; D C: n3; D: n4.]
BGP table in E:
  dest   AS path
  n1     B A
  n2     B A
  n3     D C
  n4     D
  n5     B

Border Gateways, E-BGP and I-BGP
- A router that runs BGP is called a BGP speaker.
- At the boundary between 2 ARDs there are 2 BGP speakers, one in each domain. Q: compare to OSPF.
- Inside one ARD there are usually several BGP speakers.
  - They all talk to each other, to exchange what they have learnt,
  - using Internal BGP (I-BGP),
  - over TCP connections, in a full mesh called the BGP mesh.
- I-BGP is the same as E-BGP except for one rule: routes learned from a node in the mesh are not repeated inside the mesh.
[Figure: domains D1 to D5 with routers A to H and R; legend: BGP sessions over TCP connections, physical links; the announcement X: n1 is passed on over I-BGP.]
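One common answer to the loop question is that an AS refuses any path that already contains its own number. The sketch below illustrates that idea only; the function names and the use of letters as AS identifiers are assumptions made for this example.

```python
def accept(own_as, as_path):
    """Loop avoidance: reject any path that already contains our own AS."""
    return own_as not in as_path


def announce(own_as, as_path):
    """Prepend our own AS before passing the route on to a neighbouring AS."""
    return (own_as,) + tuple(as_path)


# E hears "B A: n1" from B, accepts it, and would re-announce "E B A: n1".
path_from_b = ("B", "A")
if accept("E", path_from_b):
    print(announce("E", path_from_b))

# If that announcement ever came back to A, A would find itself in the path.
print(accept("A", ("E", "B", "A")))
```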

Which BGP updates may be sent?
[Figure: three candidate updates, numbered 1 to 3.]
1. 1
2. 2
3. 3
4. 1 and 2
5. 1 and 3
6. 2 and 3
7. All
8. None
9. I don't know

Say what is always true
1. 1
2. 2
3. 1 and 2
4. None
5. I don't know

AS A is a stub; A2 has a default route to D6. Could AS A avoid using BGP?
1. No, because interdomain routing requires an interdomain routing protocol
2. Yes, if A2 injects the destination 0/0 into RIP and RIPng
3. Yes, but using another method
4. I don't know

Solution
- AS A is connected to the rest of the world at A2.
- If A2 injects 0/0 into the IGP (i.e. RIP and RIPng), then all routers in AS A will compute the shortest path to 0/0 as the shortest path to A2.
- All traffic to the outside of AS A will be forwarded to A2.
- A2 forwards all traffic to the outside of AS A to D6.
- A stub domain does not need BGP.

2. Policy Routing

Why invented?
- Interconnection of ASs (= peering) is self-organized
  - point-to-point links between networks, e.g. EPFL to Switch, Switch to Telianet
  - interconnection points: NAP (Network Access Point), MAE (Metropolitan Area Ethernet), CIX (Commercial Internet eXchange), GIX (Global Internet eXchange), IXP, SFINX, LINX
- Mainly 3 types of relations
  - Customer-provider: EPFL is a customer of Switch; EPFL pays Switch
  - Shared-cost peers: EPFL and CERN are peers; the cost of interconnection is shared
  - Siblings: EPFL Ecublens and EPFL IMT are inside the same organization

What is the Goal of Policy Routing?
[Figure: ISP1, ISP2 and ISP3 with customers C1, C2 (network n2) and C3; legend: shared cost, provider-customer.]
Example:
- ISP3-ISP2 is a transatlantic link, with cost shared between ISP2 and ISP3
- ISP3-ISP1 is a local, inexpensive link
- Ci is a customer of ISPi; the ISPs are peers
- It is advantageous for ISP3 to send traffic to n2 via ISP1
- ISP1 does not agree to carry traffic from C3 to C2
- ISP1 offers a transit service to C1 and a non-transit service to ISP2 and ISP3
The goal of policy routing is to support this and other similar requirements.

Policy Routing is implemented by means of Import and Export filters
[Figure: ISP1, ISP2 and ISP3 with customers C1, C2 (network n2) and C3.]
Examples
Import filter for ISP1:
  from C1 accept C1
  from ISP2 accept ISP2
  from ISP2 accept ISP2 C2
  from ISP3 accept ISP3
  from ISP3 accept ISP3 C3
Export filter for ISP1:
  to C1 announce ANY
  to ISP2 announce ISP1
  to ISP2 announce ISP1 C1
  to ISP3 announce ISP1
  to ISP3 announce ISP1 C1
The Route Selection Process (see later) also influences the choice of routes.

Typical Policies
[Figure: ISP1, ISP2 and ISP3 with customers C1, C2 (n2) and C3 (n3).]
- Customer to provider: announce all routes of customer; accept all routes
- Provider to customer: announce all routes; accept all routes of customer
- Shared-cost peers: announce/accept all internal routes and those of customers
- Siblings: announce/accept all routes
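The ISP1 filters above can be read as lookup tables keyed by neighbour. The following Python sketch is one possible encoding, not a real router configuration: the table layout, the use of None for ANY, and the function names are all assumptions made for illustration.

```python
# AS paths are tuples of AS names; None stands for "ANY" in an export rule.
IMPORT_ISP1 = {
    "C1": {("C1",)},
    "ISP2": {("ISP2",), ("ISP2", "C2")},
    "ISP3": {("ISP3",), ("ISP3", "C3")},
}
EXPORT_ISP1 = {
    "C1": None,
    "ISP2": {("ISP1",), ("ISP1", "C1")},
    "ISP3": {("ISP1",), ("ISP1", "C1")},
}


def import_ok(neighbour, as_path):
    """Accept a route only if its AS path is listed for that neighbour."""
    return tuple(as_path) in IMPORT_ISP1.get(neighbour, set())


def export_ok(neighbour, as_path):
    """Check the path as it would be announced, i.e. with ISP1 prepended."""
    allowed = EXPORT_ISP1.get(neighbour, set())
    announced = ("ISP1",) + tuple(as_path)
    return allowed is None or announced in allowed


# ISP1 does not offer transit between its peers: the route to C3 learnt
# from ISP3 is announced to the customer C1 but not to ISP2.
print(export_ok("C1", ("ISP3", "C3")))
print(export_ok("ISP2", ("ISP3", "C3")))
```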

B. BGP (Border Gateway Protocol)
1. How it works, Fundamental Examples
- BGP-4, RFC 1771 (since obsoleted by RFC 4271)
- BGP routers talk to each other over TCP connections
- BGP messages: OPEN, NOTIFICATION (= RESET), KEEPALIVE, UPDATE
- UPDATE messages contain modifications: additions and withdrawals
- In steady state, a BGP router transmits only modifications

A BGP Router
- receives routes from neighbours
  - stores them in Adj-RIB-In (one per BGP peer, internal or external)
  - accepts / rejects them (import policy)
  - processes attributes
  - stores the results in the Loc-RIB
- applies the decision process
  - for every subnetwork prefix, at most 1 route is selected
  - the winning routes are marked in the Loc-RIB
  - updates the routing table with the winning routes
- sends to neighbours
  - decides whether to send or not (export policy)
  - aggregates multiple routes into one, if applicable
  - stores the result in Adj-RIB-Out (one per BGP peer) and sends it to the neighbour
  - only routes learnt from E-BGP are sent to an I-BGP neighbour
  - sends updates when Adj-RIB-Out changes (addition or deletion)

Model of a BGP Router
[Figure: model of a BGP router, including its routing table.]

Routes, RIBs, Routing Table
- The records sent in BGP messages are called routes. Routes and their attributes are stored in the Adj-RIB-In, Loc-RIB and Adj-RIB-Out.
- A route is made of:
  - destination (subnetwork prefix)
  - path to the destination (AS-PATH)
  - attributes
    - Well-known mandatory: ORIGIN (route learnt from IGP, BGP or static), AS-PATH, NEXT-HOP
    - Well-known discretionary: LOCAL-PREF (see later), ATOMIC-AGGREGATE (= route cannot be dis-aggregated)
    - Optional transitive: AGGREGATOR (who aggregated this route)
    - Optional non-transitive: MULTI-EXIT-DISC (MED) (see later)
    - Local to the router, never sent in announcements: WEIGHT (see later)
- In addition, like any IP host or router, a BGP router also has a routing table (= IP forwarding table), used for packet forwarding, in real time.

The Decision Process
- The decision process decides which route is selected.
- At most one best route to exactly the same prefix is chosen
  - only one route to 2.2/16 can be chosen
  - but there can be different routes to 2.2.2/24 and 2.2/16
- A route can be selected only if its next hop is reachable.
- Routes are compared against each other using a sequence of criteria, until only one route remains. The default sequence is
  0. Highest weight (Cisco proprietary)
  1. Highest LOCAL-PREF
  2. Shortest AS-PATH
  3. Lowest MED, if taken seriously by this network
  4. E-BGP > I-BGP
  5. Shortest path to NEXT-HOP, according to IGP
  6. Lowest BGP identifier

Fundamental Example
In this simple example there are 4 BGP routers. They communicate directly or indirectly via E-BGP or I-BGP, as shown on the figure. There are 2 ASs, x and y. We do not show the details of the internals of y. R3 and R4 send the BGP messages shown. We show next only a subset of the route attributes (such as destination, path, NEXT-HOP).
[Figure: routers R1, R2, R3 and R4.]
We focus on R1 and show its BGP information.
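The default sequence of criteria can be sketched as a lexicographic sort key. This is an illustrative Python model only: the Route class, its field names and its default values are assumptions, MED is always compared (the "taken seriously" case), and the last tie-break uses a hypothetical identifier of the peer that sent the route.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    as_path: tuple
    next_hop: str
    learned_via: str        # "ebgp" or "ibgp"
    weight: int = 0
    local_pref: int = 100
    med: int = 0
    igp_distance: int = 0   # IGP distance to next_hop
    peer_id: int = 0        # BGP identifier of the sending peer


def preference_key(route):
    """Smaller key = preferred route; fields follow criteria 0 to 6."""
    return (
        -route.weight,                              # 0. highest weight
        -route.local_pref,                          # 1. highest LOCAL-PREF
        len(route.as_path),                         # 2. shortest AS-PATH
        route.med,                                  # 3. lowest MED
        0 if route.learned_via == "ebgp" else 1,    # 4. E-BGP > I-BGP
        route.igp_distance,                         # 5. closest NEXT-HOP
        route.peer_id,                              # 6. lowest identifier
    )


def best_route(candidates, reachable=lambda hop: True):
    """Pick the best route among candidates for one and the same prefix."""
    usable = [r for r in candidates if reachable(r.next_hop)]
    return min(usable, key=preference_key) if usable else None


via_ebgp = Route("10.1/16", ("y",), "1.1.1.2", "ebgp")
via_ibgp = Route("10.1/16", ("y",), "2.2.2.1", "ibgp")
print(best_route([via_ibgp, via_ebgp]).next_hop)
```

Encoding the sequence as a tuple means that a later criterion is consulted only when all earlier ones tie, which is how "until only one route remains" is usually read.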

Step 1
Updates received by R1:
  10.1/16 AS=y
  10.2/16 AS=y
BGP information at R1:
  From R3: 10.1/16 AS=y NEXT-HOP=1.1.1.2 Best
  From R3: 10.2/16 AS=y NEXT-HOP=1.1.1.2 Best
  To R2:   10.1/16 AS=y NEXT-HOP=1.1.1.2
  To R2:   10.2/16 AS=y NEXT-HOP=1.1.1.2
- (import filters) R1 accepts the updates and stores them in Adj-RIB-In
- (decision process) R1 designates these routes as best routes
- (export filters) R1 puts the updates into Adj-RIB-Out, which will cause them to be sent to BGP neighbours

Step 2
Updates received by R1:
  10.1/16 AS=y NEXT-HOP=2.2.2.1
  10.2/16 AS=y NEXT-HOP=2.2.2.1
BGP information at R1:
  From R3: 10.1/16 AS=y NEXT-HOP=1.1.1.2 Best
  From R2: 10.1/16 AS=y NEXT-HOP=2.2.2.1
  From R3: 10.2/16 AS=y NEXT-HOP=1.1.1.2 Best
  From R2: 10.2/16 AS=y NEXT-HOP=2.2.2.1
Will the decision process promote the new routes to best routes?
1. Yes
2. No
3. I don't know

Step 2 (solution)
- R1 applies its decision process again. Now it has several possible routes to each prefix. The first applicable rule in the decision process (slide "The Decision Process") says that a route learnt from E-BGP has precedence over a route learnt from I-BGP. Since all routes in Adj-RIB-In from R2 are learnt from I-BGP, and all routes in Adj-RIB-In from R3 are learnt from E-BGP, the winners are the latter, so there is no change.
- Since there is no change in Loc-RIB, there is no change in Adj-RIB-Out and therefore no message is sent by R1.

Fundamental Example, Continued
There are now 3 BGP routers in AS x. Note that the 3 BGP routers in AS x must have TCP connections with each other (same in AS y, but not shown on the figure). An IGP (for example OSPF) also runs on R1, R21 and R22. All link costs are equal to 1. The announcements made by R3 and R4 are different, as shown on the figure.
[Figure: routers R1, R21, R22, R3 and R4.]
We focus on R1 and show its BGP information.

Step 1
Update received by R1:
  10.1/16 AS=y
BGP information at R1:
  From R3: 10.1/16 AS=y NEXT-HOP=1.1.1.2 Best
  To R21:  10.1/16 AS=y NEXT-HOP=1.1.1.2
  To R22:  10.1/16 AS=y NEXT-HOP=1.1.1.2
- R1 accepts the update and stores it in Adj-RIB-In
- R1 designates this route as best route
- R1 puts the route into Adj-RIB-Out, which will cause it to be sent to BGP neighbours R21 and R22

Step 2
Update received by R1:
  10.2/16 AS=y NEXT-HOP=2.2.2.1
BGP information at R1:
  From R3:  10.1/16 AS=y NEXT-HOP=1.1.1.2 Best
  From R22: 10.2/16 AS=y NEXT-HOP=2.2.2.1 Best
  To R21:   10.1/16 AS=y NEXT-HOP=1.1.1.2
  To R22:   10.1/16 AS=y NEXT-HOP=1.1.1.2
- R1 accepts the update and stores it in Adj-RIB-In
- R1 designates this route as best route
- R1 does not put the route into Adj-RIB-Out to R21, because I-BGP is not repeated over I-BGP
- R1 does not put the route into Adj-RIB-Out to R3: this would create an AS path loop
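The two reasons given in Step 2 for not re-announcing a route can be written as a small export check. This is a simplified sketch under stated assumptions: the function name and arguments are hypothetical, and real export policies involve many more conditions.

```python
def may_export(learned_via, as_path, peer_kind, peer_as):
    """Return True if a best route may be placed in Adj-RIB-Out for a peer.

    learned_via and peer_kind are "ebgp" or "ibgp"; peer_as is the AS of
    the neighbour the route would be sent to.
    """
    # Routes learnt over I-BGP are not repeated over I-BGP.
    if learned_via == "ibgp" and peer_kind == "ibgp":
        return False
    # Sending a route to an AS already on its path would create a loop.
    if peer_kind == "ebgp" and peer_as in as_path:
        return False
    return True


# Step 2 at R1: 10.2/16 with AS path (y) was learnt from R22 over I-BGP.
print(may_export("ibgp", ("y",), "ibgp", "x"))  # towards R21
print(may_export("ibgp", ("y",), "ebgp", "y"))  # towards R3 in AS y
# Step 1 at R1: 10.1/16 was learnt from R3 over E-BGP and goes to R21, R22.
print(may_export("ebgp", ("y",), "ibgp", "x"))
```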

Step 3
Update received by R1:
  10.2/16 AS=y NEXT-HOP=3.3.3.1
BGP information at R1:
  From R3:  10.1/16 AS=y NEXT-HOP=1.1.1.2 Best
  From R22: 10.2/16 AS=y NEXT-HOP=2.2.2.1 Best
  From R21: 10.2/16 AS=y NEXT-HOP=3.3.3.1
1. Yes
2. No
3. I don't know

Step 3 (solution)
BGP information at R1:
  From R3:  10.1/16 AS=y NEXT-HOP=1.1.1.2 Best
  From R22: 10.2/16 AS=y NEXT-HOP=2.2.2.1 (no longer best)
  From R21: 10.2/16 AS=y NEXT-HOP=3.3.3.1 Best
- The decision process now has to choose between two routes with the same destination prefix 10.2/16. Both were learnt from I-BGP, so we apply criterion 5 in slide "The Decision Process". The distance, computed by the IGP, to 2.2.2.1 is 3 and the distance to 3.3.3.1 is 2. Thus the route that has NEXT-HOP=3.3.3.1 is preferred by the decision process, i.e. the new route is designated as best.
- The new route is not put into Adj-RIB-Out, for the same reasons as at step 2.

ISP1 and ISP2 are shared-cost peers. Which path will be used by packets Customer 1 -> Customer 2?
1. R12 R11 R21
2. R12 R22 R21
3. I don't know

Solution: Which path will be used by packets Customer 1 -> Customer 2?
[Figure: ISP1 with routers R11 and R12, ISP2 with routers R21 and R22, Customer 1 and Customer 2.]
- If the default configuration is used, the decision process in ISP1 selects the closest next hop, i.e. Customer 1 -> Customer 2 uses R12 R22 R21 ("hot potato routing").
- Customer 2 -> Customer 1 uses R21 R11 R12: routing in the global Internet is asymmetric!

How are routes injected into BGP = How are routes originated?
- BGP propagates route information, but how is this bootstrapped? To bootstrap a route is called to originate a route.
- Two methods
  - Static configuration: tell this BGP router which prefixes to originate (network command)
  - Redistribute from IGP: tell this router to copy all the prefixes that the IGP has learnt
    - assumes the IGP either does not propagate external prefixes, or has a way to differentiate them
    - such routes are sent to E-BGP neighbours only, with ORIGIN=IGP

2. Aggregation
- Domains that do not have a default route (i.e. all transit ISPs) must know all routes in the world (several hundreds of thousands of prefixes)
  - in IP routing tables, unless default routes are used
  - in BGP announcements
- Aggregation is a way to reduce the number of routes.
- Aggregation is expected to be very frequent with IPv6, less so with IPv4.

Can AS3 aggregate these routes into a single one?
[Figure: AS1 (2001:baba:bebe/48) announces 2001:baba:bebe/48, AS-PATH = 1 to AS3; AS2 (2001:baba:bebf/48) announces 2001:baba:bebf/48, AS-PATH = 2 to AS3; AS3 is connected to AS4.]
1. Yes and the aggregated prefix is 2001:baba:bebe/47
2. Yes and the aggregated prefix is 2001:baba:bebf/48
3. Yes but the aggregated prefix is none of the above
4. No
5. I don't know

Solution
- The two prefixes are contiguous and can be aggregated as 2001:baba:bebe/47.
- AS3 sends to AS4 the UPDATE 2001:baba:bebe/47, AS-PATH = 3 {1 2}.
- AS4 sends the UPDATE 2001:baba:bebe/47, AS-PATH = 4 3 {1 2}.
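The aggregation in the solution can be checked with Python's standard ipaddress module. In this sketch the representation of the AS path as a tuple ending in a set is an assumption made to mirror the "3 {1 2}" notation of the slide.

```python
import ipaddress

# The two routes known to AS3, as (prefix, AS path) pairs.
routes = [
    (ipaddress.ip_network("2001:baba:bebe::/48"), (1,)),
    (ipaddress.ip_network("2001:baba:bebf::/48"), (2,)),
]

# collapse_addresses merges adjacent networks into the smallest covering set.
aggregated = list(ipaddress.collapse_addresses(net for net, _ in routes))
print(aggregated)

# AS3 announces the aggregate with its own number followed by the set of
# ASs that the component routes came through.
origin_set = frozenset(asn for _, path in routes for asn in path)
as_path = (3, origin_set)
print(aggregated[0], as_path)
```

The two /48 prefixes differ only in bit 48 (bebe versus bebf), and the lower one is aligned on a /47 boundary, which is why they merge into a single /47.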

Can the decision process in AS4 designate both routes as best?
[Figure: as before, with AS4 now also connected to AS2. AS4 receives 2001:baba:bebe/47, AS-PATH = 3 {1 2} from AS3 and 2001:baba:bebf/48, AS-PATH = 2 from AS2.]
1. Yes
2. No
3. I don't know

Solution: Can the decision process in AS4 designate both routes as best?
- The decision process in AS4 may select both routes because they are different.
- Overlapping routes are considered different.

Assume the decision process in AS4 designates both routes as best. Which path does a packet from AS4 to 2001:baba:bebf/48 follow?
[Figure: same topology; AS4 holds 2001:baba:bebe/47, AS-PATH = 3 {1 2} and 2001:baba:bebf/48, AS-PATH = 2.]
1. AS4 AS3 AS2
2. AS4 AS2
3. I don't know

Solution
- Longest prefix match: the packet goes AS4 AS2.
- A packet to 2001:baba:bebe/48 will go AS4 AS3 AS1.

Assume the link AS2-AS4 breaks
[Figure: same topology; AS4 holds 2001:baba:bebe/47, AS-PATH = 3 {1 2} and 2001:baba:bebf/48, AS-PATH = 2.]
At AS4:
- keepalive detects the loss of AS2
- the Adj-RIB-In routes from AS2 are declared invalid
- the decision process recomputes the best route to 2001:baba:bebf/48; there is none
- the routing table entry 2001:baba:bebf/48 is removed
- a packet to 2001:baba:bebf/48 now matches the route 2001:baba:bebe/47 and goes via AS3

3. How Forwarding Entries learnt by BGP are written into Routing Tables
So far, we have seen how BGP routers learn about all the prefixes in the world. It remains to see how they write the corresponding entries in the forwarding table (i.e. routing table). There are two possible ways for this, usually mutually exclusive:
- Redistribution: routes learnt by BGP are passed to the IGP (e.g. RIP)
  - called redistribution of BGP into IGP
  - the IGP propagates the routes to all routers in the domain
  - not found much in practice (table too large for the IGP)
- Injection: routes learnt by BGP are directly written into the forwarding table of this router
  - routes do not propagate; this helps only this router
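The behaviour of AS4 before and after the failure can be reproduced with a small longest-prefix-match sketch. The table layout (a dict from prefix to neighbouring AS) and the function name are assumptions for illustration, not how a real router stores its forwarding table.

```python
import ipaddress


def lookup(table, destination):
    """Return the entry of the longest prefix containing destination."""
    addr = ipaddress.ip_address(destination)
    matches = [net for net in table if addr in net]
    if not matches:
        return None
    return table[max(matches, key=lambda net: net.prefixlen)]


aggregate = ipaddress.ip_network("2001:baba:bebe::/47")
specific = ipaddress.ip_network("2001:baba:bebf::/48")
table_as4 = {aggregate: "AS3", specific: "AS2"}

print(lookup(table_as4, "2001:baba:bebf::1"))  # the /48 wins: via AS2
print(lookup(table_as4, "2001:baba:bebe::1"))  # only the /47 matches: via AS3

# The session with AS2 is lost and the /48 entry is removed.
del table_as4[specific]
print(lookup(table_as4, "2001:baba:bebf::1"))  # falls back to the /47: via AS3
```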

Redistribution Example
[Figure: routers R5, R6, R1, R2 and R4 across AS x, AS z and AS y; R5 has address 2.2.2.2 and announces 18.1/16; R6, R1 and R2 run an IGP (OSPF); I-BGP runs between R6 and R2.]
- R5 advertises 18.1/16 to R6 via E-BGP
- R6 transmits it to R2 via I-BGP (TCP connection between R6 and R2)
- (redistribute BGP into IGP) R6 injects 18.1/16 into the IGP
- The IGP propagates 18.1/16 (AS-external LSA, type 5) and updates forwarding tables
- After the IGP stabilizes, R1 and R2 have a route to 18.1/16
- R2 advertises the route to R4 via E-BGP
- A packet to 18.1/16 from AS y finds forwarding table entries in R2, R1 and R6

Redistribution is Considered Harmful
- In practice, operators avoid redistribution of BGP into IGP
  - large number of routing entries in the IGP
  - convergence time after failures is large if the IGP has many routing table entries
- Therefore, injection is used instead
- Injection is usually combined with recursive table lookup
  - when an IP packet is submitted to a router, the forwarding table may indicate a NEXT-HOP which is not on-link with the router
  - a second table lookup needs to be done to resolve the next hop into an on-link neighbour
  - in practice, the second lookup may be done in advance, not in real time, by preprocessing the routing table

Example of Recursive Table Lookup
- At R1, a data packet to 10.1.x.y is received.
- The forwarding table at R1 is looked up:
  - first, the next hop 2.2.2.63 is found
  - a second lookup for 2.2.2.63 is done
  - the packet is sent to 2.2.2.33 over eth0
Forwarding table at R1:
  To         NEXT-HOP   interface
  10.1/16    2.2.2.63   N/A
  2.2.2.63   2.2.2.33   eth0
[Figure: R1, R2 (2.2.2.33) and R3 (2.2.2.63, 2.2.2.93); 10.1/16 lies beyond R3.]

Injection Example
[Figure: routers R5, R6, R1, R2 and R4 across AS x, AS z and AS y; R5 has address 2.2.2.2 and announces 18.1/16; R6, R1 and R2 run an IGP (OSPF); I-BGP runs between R6 and R2.]
- R5 advertises 18.1/16, NEXT-HOP = 2.2.2.2 to R6 via E-BGP
- R6 injects 18.1/16, NEXT-HOP = 2.2.2.2 into its local forwarding table (does not redistribute into OSPF)
- R2 learns the route from R6 via I-BGP
- R2 injects 18.1/16, NEXT-HOP = 2.2.2.2 into its local forwarding table
- A data packet to 18.1.2.3 is received by R2
- Recursive table lookup at R2 can be used
- The packet is sent to R1
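The recursive lookup at R1 can be sketched as a loop that repeats the lookup until it reaches an entry with an interface. The table below copies the two entries of the slide; treating 2.2.2.63 as a /32 host route, the tuple layout, and the depth limit are assumptions made for this example.

```python
import ipaddress

# (destination prefix, next hop, interface); None means "not on-link".
TABLE_R1 = [
    (ipaddress.ip_network("10.1.0.0/16"), "2.2.2.63", None),
    (ipaddress.ip_network("2.2.2.63/32"), "2.2.2.33", "eth0"),
]


def longest_match(table, destination):
    addr = ipaddress.ip_address(destination)
    hits = [entry for entry in table if addr in entry[0]]
    return max(hits, key=lambda e: e[0].prefixlen) if hits else None


def resolve(table, destination, max_depth=4):
    """Follow next hops until an entry with an outgoing interface is found."""
    target = destination
    for _ in range(max_depth):
        entry = longest_match(table, target)
        if entry is None:
            return None
        _, next_hop, interface = entry
        if interface is not None:
            return next_hop, interface
        target = next_hop  # second lookup: resolve the next hop itself
    return None


print(resolve(TABLE_R1, "10.1.7.9"))
```

A router that preprocesses its table, as mentioned above, would in effect store the result of resolve for each prefix instead of computing it per packet.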

Injection: What happens to this IP packet at R1?
1. It is forwarded to R6 because R1 does recursive table lookup
2. It is forwarded to R6 because R1 runs an IGP
3. It cannot be forwarded by R6 because R6 does not run BGP
4. None of the above
5. I don't know

Injection Example
[Figure: same topology as in the previous injection example.]
- R5 advertises 18.1/16, NEXT-HOP = 2.2.2.2 to R6 via E-BGP
- R6 injects 18.1/16, NEXT-HOP = 2.2.2.2 into its local forwarding table (does not redistribute into OSPF)
- R2 learns the route from R6 via I-BGP
- R2 injects 18.1/16, NEXT-HOP = 2.2.2.2 into its local forwarding table
- A data packet to 18.1.2.3 is received by R2
- Recursive table lookup at R2 can be used
- There is a problem at R1: how can we solve it?

Injection in Practice Requires all Routers to Run BGP
[Figure: same topology, with I-BGP sessions between R6, R1 and R2.]
- All routers also run I-BGP (in addition to the IGP)
- R5 advertises 18.1/16, NEXT-HOP = 2.2.2.2 to R6 via E-BGP
- R6 transmits 18.1/16, NEXT-HOP = 2.2.2.2 to R1 and R2 via I-BGP
- R6 injects 18.1/16, NEXT-HOP = 2.2.2.2 into its local forwarding table
- R2 injects 18.1/16, NEXT-HOP = 2.2.2.2 into its local forwarding table
- Independently, the IGP finds that, at R2, packets to 2.2.10.1 should be sent to R1
- A data packet to 18.1.2.3 is received by R2
- At R2, recursive table lookup determines that the packet should be forwarded to R1
- At R1, recursive table lookup determines that the packet should be forwarded to R6
- At R6, recursive table lookup determines that the packet should be forwarded to 2.2.2.2

Injection in Practice Requires all Routers to Run BGP (continued)
Practical solution often deployed:
- All routers also run I-BGP (in addition to the IGP), even if connected to no external router (like R1)
- Recursive table lookup is done at all routers
- Potential problem: size of the I-BGP mesh => use reflectors
- The IGP is still needed to discover paths to next hops, but it handles only internal networks, which are very few
- Other solutions: encapsulation, MPLS

4. Other Route Attributes

LOCAL-PREF
[Figure: AS x with R6 (pref=100), R1 and R2 (pref=10), connected by I-BGP.]
- Used inside an AS to select a best AS path
- Assigned by a BGP router when receiving a route over E-BGP; propagated without change over I-BGP
- Example: R6 associates pref=100, R2 pref=10; R1 chooses the largest preference
- Command: bgp default local-preference pref-value

LOCAL-PREF Example: Link AS2-AS4 is expensive
- AS4 sets LOCAL-PREF to 100 for all routes received from AS3 and to 50 for all routes received from AS2
[Figure: AS1 (10.1/16) announces AS1: 10.1/16 to AS2 and AS3; AS4 contains R1, R2 and R3; AS5 is connected to AS4.]
- R1 receives the route AS2 AS1 10.1/16 over E-BGP; sets LOCAL-PREF to 50
- R2 receives the route AS3 AS1 10.1/16 over E-BGP; sets LOCAL-PREF to 100
- R3 receives AS2 AS1 10.1/16, LOCAL-PREF=50 from R1 over I-BGP and AS3 AS1 10.1/16, LOCAL-PREF=100 from R2 over I-BGP
- R3 selects AS3 AS1 10.1/16, LOCAL-PREF=100 and installs it into its routing table
- R3 announces only AS3 AS1 10.1/16 to AS5

What does R3 announce to AS5?
1. 10.1/16 AS-PATH=AS4 AS2 AS1
2. 10.1/16 AS-PATH=AS4 AS3 AS1
3. Both
4. None
5. I don't know

Which way does a packet go from R1 to 10.1.1.1?
1. Via the direct link to AS2
2. Via R2
3. I don't know

Solution
- R1, R2 and R3 all select the route via AS3 as best route to 10.1/16, because of the LOCAL-PREF attribute.
- R3 advertises only its best route to AS5, i.e. 10.1/16 AS-PATH=AS4 AS3 AS1.
- R1 installs in its forwarding table the next hop corresponding to the R2-AS3 link, and therefore the packet to 10.1.1.1 goes via AS3.

Weight
- This is a route attribute given by a Cisco or Zebra router.
- It remains local to this router.
- It is never propagated to other routers, even in the same cloud.
- Therefore there is no weight attribute in route announcements.

MULTI-EXIT-DISC (MED)
[Figure: AS x with R1 and R2; AS y (10.1/16, 10.2/16) with R3 and R4; link addresses 1.1.1.1 and 2.2.2.2. Announced towards R1: 10.1/16 MED=10, 10.2/16 MED=50. Announced towards R2: 10.1/16 MED=50, 10.2/16 MED=10.]
- One AS is connected to another over several links, e.g. a multinational company connected to a worldwide ISP
- AS y advertises its prefixes with different MEDs (low = preferred)
- If AS x accepts to use the MEDs put by AS y, traffic goes on the preferred link

R1 has 2 routes to 10.2/16, one learnt from R3 by E-BGP (MED=50), one learnt from R2 by I-BGP (MED=10). The decision process at R1 prefers
1. The route via R2
2. The route via R3
3. I don't know

Solution
[Figure: same topology, showing a packet to 10.1.2.3 and a packet to 10.2.3.4.]
- For 10.2/16, R1 prefers the route via R2, because the decision process tests MED before E-BGP > I-BGP.
- R1 also has 2 routes to 10.1/16; R1 prefers the route via R3.
- Traffic from AS x to 10.1/16 flows via R3; traffic from AS x to 10.2/16 flows via R4.

Router R3 crashes
[Figure: same topology, showing a packet to 10.1.2.3.]
- R1 clears the routes to AS y learnt from R3 (keepalive mechanism)
- R2 is informed of the route suppression by I-BGP
- R2 now has only 1 route to 10.1/16 and 1 route to 10.2/16; traffic to 10.1/16 now goes to R2
- MED allows AS y to be dual-homed and use the closest link; other links are used as backup

LOCAL-PREF vs MED
- MED is used between ASs (i.e. over E-BGP); LOCAL-PREF is used inside one AS (over I-BGP).
- MED is used to tell one provider AS which entry link to prefer; LOCAL-PREF is used to tell the rest of the world which AS path we want to use, by not announcing the other ones.

Communities
- Other attributes can be associated with routes in order to simplify rules. They are called "communities".
- Pre-defined, for example NO-EXPORT (a well-known, pre-defined attribute); see later for an example
- Defined by one AS: a label of the form ASN:x, where ASN = AS number and x = a 2-byte number

NO-EXPORT
- Written on E-BGP by one AS, transmitted on I-BGP by the accepting AS, not forwarded
- Example: AS2 has different routes to AS1, but AS2 sends only one aggregate route to AS3; this simplifies the aggregation rules at AS2
[Figure: routers R1 to R5; the routes 2.2.0/17 and 2.2.128/17 carry NO-EXPORT, and only the aggregate 2.2/16 is passed on.]

5. Bells and Whistles: Route Flap Dampening
- Route modification propagates everywhere
- Sometimes routes are flapping: successive UPDATE and WITHDRAW, caused for example by a BGP speaker that often crashes and reboots
- Solution: the decision process eliminates flapping routes
- How:
  - withdrawn routes are kept in Adj-RIB-In
  - if a route comes up again soon (i.e. flaps), it receives a penalty
  - if penalty > suppress-limit, the route is not selected
  - the penalty fades out exponentially (see next slide)
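The exponential fade-out and the two thresholds can be modelled in a few lines of Python. Everything numeric here (initial penalty, half-life, limits) is a hypothetical value chosen for the example, and the function names are not taken from any router implementation.

```python
def penalty_at(initial_penalty, elapsed, half_life):
    """Penalty after `elapsed` time units of exponential decay."""
    return initial_penalty * 0.5 ** (elapsed / half_life)


def update_suppressed(suppressed, penalty, suppress_limit, reuse_limit):
    """Suppress above suppress_limit, restore below reuse_limit,
    otherwise keep the current state (hysteresis)."""
    if penalty > suppress_limit:
        return True
    if penalty < reuse_limit:
        return False
    return suppressed


# Hypothetical values: penalty 2500 right after a flap, half-life 15 minutes.
state = False
for minutes in (0, 15, 30, 45):
    p = penalty_at(2500, minutes, 15)
    state = update_suppressed(state, p, suppress_limit=2000, reuse_limit=750)
    print(minutes, round(p), "suppressed" if state else "usable")
```

Using two different limits means a suppressed route stays suppressed while its penalty decays from the suppress-limit down to the reuse-limit, which corresponds to the interval between t1 and t2 in the dampening figure.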

Route Flap Dampening
[Figure: penalty as a function of time, with the suppress-limit and reuse-limit thresholds.]
The route is suppressed at t1 and restored at t2.

Private AS Number
[Figure: provider AS x contains R1 and R2; client AS y contains R3 and R4 and owns 10.1/16 and 10.2/16. Announcements on the R1-R3 link: 10.1/16 MED=10, 10.2/16 MED=50; on the R2-R4 link: 10.1/16 MED=50, 10.2/16 MED=10.]
The client uses BGP with MED to control flows of traffic (e.g. the provider should use R1-R3 for all traffic to 10.1/16).
The client can use a private AS number: not usable in the global Internet, used only between client and provider.
The provider translates this number to its own when exporting routes to the outside world.
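The translation done by the provider can be sketched as below. This is a simplified illustration, not a router implementation: `export_path` is a hypothetical helper, the range 64512 to 65534 is the 16-bit private-use range of RFC 6996 (an assumption beyond the slides), and the AS numbers in the example are made up (64500 comes from the documentation range, 65001 from the private range).

```python
PRIVATE_ASNS = range(64512, 65535)  # 64512..65534 inclusive (RFC 6996, 16-bit)


def is_private(asn: int) -> bool:
    return asn in PRIVATE_ASNS


def export_path(as_path, provider_asn):
    """AS path announced by the provider to the outside world.

    Private AS numbers are removed, then the provider prepends its own
    number, so the client's prefixes appear to originate at the provider.
    """
    public = [asn for asn in as_path if not is_private(asn)]
    return [provider_asn] + public


# Hypothetical numbers: the client uses private AS 65001, the provider is AS 64500.
print(export_path([65001], provider_asn=64500))
```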

Which way will packets from z to 2.1.1.1 go?
1. Via x
2. Via y
3. Both
4. I don't know

Which way will packets from z to 2.1.1.1 go?
[Figure: client AS 100 (prefixes 2.0/16 and 2.1/16, routers R1 to R4 at the client and provider borders) is dual-homed to providers AS x and AS y. AS 100 announces 2.1/16 with AS-path 100 100 100 100 towards x and with AS-path 100 towards y. AS z receives 2.1/16 with AS-path x 100 100 100 100 and with AS-path y 100. Other label in the figure: 100 100 100 100 : 2.0/16.]
The artificially inflated path causes other networks to prefer the alternate path via y.
Used by a dual-homed client to force traffic to reach the client at the point closest to the final destination.
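The effect of the inflated path on AS z can be sketched as follows. The sketch assumes that all attributes tested before the AS-path length (such as LOCAL-PREF) are equal, so that z simply picks the shortest AS path; `preferred_neighbor` is a hypothetical helper name.

```python
def preferred_neighbor(routes):
    """Return the neighbour whose announced AS path is shortest.

    routes maps a neighbour AS to the AS path received from it.
    """
    return min(routes, key=lambda neighbor: len(routes[neighbor]))


# AS paths for 2.1/16 as received by AS z in the example.
routes_at_z = {
    "x": ["x", "100", "100", "100", "100"],
    "y": ["y", "100"],
}

print(preferred_neighbor(routes_at_z))
```

Here the path via y has length 2 against length 5 via x, so packets from z to 2.1.1.1 leave through y.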

Avoid I-BGP Mesh: Confederations
[Figure: AS z is split into sub-ASs P1, P2 and P3, each running I-BGP internally.]
The AS is decomposed into sub-ASs, each with a private AS number; similar to OSPF areas.
I-BGP inside each sub-AS (full interconnection).
E-BGP between sub-ASs.

Avoid I-BGP Mesh: Route Reflectors
[Figure: AS z with three clusters (cluster 1, cluster 2, cluster 3), each with a route reflector (RR); I-BGP sessions between clients and their RR and between the RRs.]
Cluster of routers: one I-BGP session between each client and its RR; each cluster has a CLUSTER_ID.
The route reflector re-advertises a route learnt via I-BGP.
To avoid loops: an ORIGINATOR_ID attribute is associated with the advertisement.
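The loop-avoidance checks used with route reflectors can be sketched as follows. This is a simplified model based on RFC 4456 rather than on the slides: routes are plain dictionaries, `reflect` and `accept_reflected` are hypothetical helpers, and the CLUSTER_LIST attribute (the list of CLUSTER_IDs a route has crossed) is an addition that the slide does not mention.

```python
def reflect(route, source_router_id, cluster_id):
    """Return the copy of a route that a route reflector re-advertises."""
    reflected = dict(route)
    # ORIGINATOR_ID is set once, to the router that injected the route in the AS.
    reflected.setdefault("originator_id", source_router_id)
    # The reflector prepends its own CLUSTER_ID to the CLUSTER_LIST.
    reflected["cluster_list"] = [cluster_id] + list(route.get("cluster_list", []))
    return reflected


def accept_reflected(route, my_router_id, my_cluster_id=None):
    """Loop checks applied by a router that receives a reflected route."""
    if route.get("originator_id") == my_router_id:
        return False  # the route came back to the router that originated it
    if my_cluster_id is not None and my_cluster_id in route.get("cluster_list", []):
        return False  # the route already went through this reflector's cluster
    return True


# Router identifiers and cluster names below are made up for the example.
r = reflect({"prefix": "10.1/16"}, source_router_id="1.1.1.1", cluster_id="cluster-1")
print(r)
print(accept_reflected(r, my_router_id="1.1.1.1"))
print(accept_reflected(r, my_router_id="2.2.2.2"))
```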

I-BGP configuration
[Figure: two routers in AS z with an I-BGP session between their lo0 interfaces.]
I-BGP is configured on the loopback interface (lo0):
the interface is always up;
an IP address is associated with the interface;
IGP routing guarantees packet forwarding to the interface.

Avoid E-BGP mesh: Route server
At an interconnection point.
Instead of n(n-1)/2 peer-to-peer E-BGP connections, n connections to the route server.
To avoid loops, an ADVERTISER attribute indicates which router in the AS generated the route.
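The saving in the number of sessions is easy to quantify. The short sketch below compares the two counts for a few values of n; the function names are hypothetical.

```python
def full_mesh_sessions(n: int) -> int:
    # Every pair of routers needs its own E-BGP session: n(n-1)/2.
    return n * (n - 1) // 2


def route_server_sessions(n: int) -> int:
    # Each router keeps a single session, to the route server.
    return n


for n in (5, 10, 50):
    print(n, full_mesh_sessions(n), route_server_sessions(n))
```

The full mesh grows quadratically with n, while the route server keeps the count linear.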

C. Illustrations: The Switch Network
www.switch.ch

An Interconnection Point

From www.ris.ripe.net: all routes to 128.178.0.0/15 on RIPE Route Collectors

Type | Prefix | Time | Peer | Next HOP | MED | Origin | AS path | Community | RRC ID
A | 128.178.0.0/15 | 2003-10-02 05:05:49Z | 129.250.0.232 | 129.250.0.232 | 9 | Not defined | 2914 1299 559 | 2914:420 2914:2000 2914:3000 | RIPE NCC
A | 128.178.0.0/15 | 2003-10-02 06:16:00Z | 193.10.252.5 | 193.10.252.5 | 0 | IGP | 2603 3356 1299 559 | 2603:666 3356:2 3356:86 3356:507 3356:666 3356:2076 | Netnod
A | 128.178.0.0/15 | 2003-10-02 06:16:17Z | 194.68.48.1 | 194.68.48.1 | 0 | IGP | 12381 1653 2603 20965 559 | 12381:1653 | Netnod
A | 128.178.0.0/15 | 2003-10-02 06:16:37Z | 194.68.48.1 | 194.68.48.1 | 0 | IGP | 12381 1653 2603 3356 1299 559 | 12381:1653 | Netnod
A | 128.178.0.0/15 | 2003-10-02 06:21:08Z | 193.10.252.5 | 193.10.252.5 | 0 | IGP | 2603 20965 559 | 2603:222 2603:666 20965:155 | Netnod
A | 128.178.0.0/15 | 2003-10-02 06:21:17Z | 194.68.48.1 | 194.68.48.1 | 0 | IGP | 12381 1653 2603 20965 559 | 12381:1653 | Netnod
A | 128.178.0.0/15 | 2003-10-02 07:24:06Z | 129.250.0.232 | 129.250.0.232 | 9 | Not defined | 2914 3549 559 | 2914:420 | RIPE NCC

The World seen from EPFL
http://www.ris.ripe.net/bgpviz/
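The AS paths listed for 128.178.0.0/15 can be inspected with a few lines of Python. The sketch below only re-uses the AS paths printed on the slide; the variable and function names are hypothetical. It checks that every path ends at the same origin AS and that no AS number appears twice in a path, which is what path-vector loop avoidance would lead one to expect.

```python
# AS paths for 128.178.0.0/15, copied from the route collector listing.
as_paths = [
    "2914 1299 559",
    "2603 3356 1299 559",
    "12381 1653 2603 20965 559",
    "12381 1653 2603 3356 1299 559",
    "2603 20965 559",
    "12381 1653 2603 20965 559",
    "2914 3549 559",
]

paths = [[int(asn) for asn in p.split()] for p in as_paths]


def origin_ases(paths):
    # The origin AS is the last (rightmost) AS of each path.
    return {p[-1] for p in paths}


def is_loop_free(path):
    # A loop-free path never contains the same AS number twice.
    return len(set(path)) == len(path)


print(origin_ases(paths))
print(all(is_loop_free(p) for p in paths))
print(min(len(p) for p in paths), max(len(p) for p in paths))
```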

Some statistics
Source: http://www.cidr-report.org
[Figures: number of address prefixes; number of ASs.]

Further Reading

Slow convergence after route suppression: BGP path exploration is similar to (but worse than) distance vector slow convergence. It is in the nature of path vector routing with explicit suppression.
Craig Labovitz, Abha Ahuja, Abhijit Bose, Farnam Jahanian: Delayed Internet routing convergence. IEEE/ACM Trans. Netw. 9(3): 293-306 (2001)

Route flap dampening slows down convergence.
Zhuoqing Morley Mao, Ramesh Govindan, George Varghese, Randy H. Katz: Route flap damping exacerbates internet routing convergence. SIGCOMM 2002: 221-233

Path vector + policy may suffer from incompatibilities (loops).
Griffin, T.G.; Shepherd, F.B.; Wilfong, G.: The stable paths problem and interdomain routing. ACM/IEEE ToN, April 2002, pages 232-243

Other References
Timothy Griffin's home page at Intel
RFC 1771 (BGP-4)
C. Huitema, Le Routage dans l'Internet
John W. Stewart III, BGP-4
www.ris.ripe.net: AS paths
www.cidr-report.org: aggregation statistics
www.caida.org: map of Internet
rpsl.info.ucl.ac.be: relations between ASs