Chapter 4: network layer

chapter goals:
  understand principles behind network layer services:
    network layer service models
    forwarding versus routing
    how a router works
    generalized forwarding
  instantiation, implementation in the Internet

Network layer
  transport segment from sending to receiving host
  on sending side, encapsulates segments into datagrams
  on receiving side, delivers segments to transport layer
  network layer protocols in every host, router
  router examines header fields in all IP datagrams passing through it
  [figure: protocol stacks (application, transport, network) at the end hosts and at routers along the path]

Two key network-layer functions
  forwarding: move packets from router's input to appropriate router output
  routing: determine route taken by packets from source to destination
    routing algorithms
  analogy: taking a trip
    forwarding: process of getting through single interchange
    routing: process of planning trip from source to destination

Network service model
  Q: What service model for "channel" transporting datagrams from sender to receiver?
  example services for individual datagrams:
    guaranteed delivery
    guaranteed delivery with less than 40 msec delay
  example services for a flow of datagrams:
    in-order datagram delivery
    guaranteed minimum bandwidth to flow
    restrictions on changes in inter-packet spacing

Router architecture overview
  high-level view of generic router architecture:
  [figure: router input ports, high-speed switching fabric, routing processor, router output ports]
  routing, management: control plane (software), operates in millisecond time frame
  forwarding: data plane (hardware), operates in nanosecond time frame

Input port functions
  physical layer: bit-level reception (line termination)
  data link layer: link layer protocol (receive), e.g., Ethernet (see chapter 5)
  lookup, forwarding, queueing
  decentralized switching:
    using header field values, lookup output port using forwarding table in input port memory ("match plus action")
    goal: complete input port processing at "line speed"
    queuing: if datagrams arrive faster than forwarding rate into switch fabric
Input port functions (continued)
  destination-based forwarding: forward based only on destination IP address (traditional)
  generalized forwarding: forward based on any set of header field values

Destination-based forwarding
  forwarding table:

  Destination Address Range                          Link Interface
  11001000 00010111 00010000 00000000
    through                                          0
  11001000 00010111 00010111 11111111
  11001000 00010111 00011000 00000000
    through                                          1
  11001000 00010111 00011000 11111111
  11001000 00010111 00011001 00000000
    through                                          2
  11001000 00010111 00011111 11111111
  otherwise                                          3

  Q: but what happens if ranges don't divide up so nicely?

Longest prefix matching
  longest prefix matching: when looking for forwarding table entry for given destination address, use longest address prefix that matches destination address.

  Destination Address Range                 Link interface
  11001000 00010111 00010*** ********       0
  11001000 00010111 00011000 ********       1
  11001000 00010111 00011*** ********       2
  otherwise                                 3

Longest prefix matching
  we'll see why longest prefix matching is used shortly, when we study addressing
  longest prefix matching: often performed using ternary content addressable memories (TCAMs)
    content addressable: present address to TCAM: retrieve address in one clock cycle, regardless of table size
    Cisco Catalyst: can hold up to ~1M routing table entries in TCAM
  examples:
    DA: 11001000 00010111 00010110 10100001   which interface?
    DA: 11001000 00010111 00011000 10101010   which interface?
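
The lookup on the longest prefix matching slide can be sketched in a few lines of Python. This is a minimal illustration, not how a router does it: real routers use TCAMs or tries rather than a linear scan. The table entries are the prefixes from the slide, and the names `FORWARDING_TABLE` and `lookup` are made up for this example.

```python
# Forwarding table from the slide: (prefix bits, outgoing link interface).
# Spaces inside the bit strings are only there for readability.
FORWARDING_TABLE = [
    ("11001000 00010111 00010", 0),
    ("11001000 00010111 00011000", 1),
    ("11001000 00010111 00011", 2),
]
DEFAULT_INTERFACE = 3  # the "otherwise" row


def lookup(dest_bits: str) -> int:
    """Return the interface whose prefix is the longest match for dest_bits."""
    dest = dest_bits.replace(" ", "")
    best_len, best_iface = -1, DEFAULT_INTERFACE
    for prefix, iface in FORWARDING_TABLE:
        bits = prefix.replace(" ", "")
        # Keep this entry only if it matches and is longer than the best so far.
        if dest.startswith(bits) and len(bits) > best_len:
            best_len, best_iface = len(bits), iface
    return best_iface


print(lookup("11001000 00010111 00010110 10100001"))  # 0
print(lookup("11001000 00010111 00011000 10101010"))  # 1
```

The second address matches both the 21-bit prefix for interface 2 and the 24-bit prefix for interface 1; the longer match wins, so it is forwarded on interface 1.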
Switching fabrics
  transfer packet from input buffer to appropriate output buffer
  switching rate: rate at which packets can be transferred from inputs to outputs
    often measured as multiple of input/output line rate
    N inputs: switching rate N times line rate desirable
  three types of switching fabrics: memory, bus, crossbar

Switching via memory
  first generation routers:
    traditional computers with switching under direct control of CPU
    packet copied to system's memory
    speed limited by memory bandwidth (2 bus crossings per datagram)
  [figure: input port (e.g., Ethernet) -> memory -> output port (e.g., Ethernet), connected by the system bus]
Switching via a bus
  datagram from input port memory to output port memory via a shared bus
  bus contention: switching speed limited by bus bandwidth
  32 Gbps bus, Cisco 5600: sufficient speed for access and enterprise routers

Switching via interconnection network
  overcome bus bandwidth limitations
  banyan networks, crossbar, other interconnection nets initially developed to connect processors in multiprocessor
  advanced design: fragmenting datagram into fixed length cells, switch cells through the fabric
  Cisco 12000: switches 60 Gbps through the interconnection network

Input port queuing
  fabric slower than input ports combined -> queueing may occur at input queues
    queueing delay and loss due to input buffer overflow!
  Head-of-the-Line (HOL) blocking: queued datagram at front of queue prevents others in queue from moving forward
  [figure, left: output port contention: only one red datagram can be transferred; lower red packet is blocked]
  [figure, right: one packet time later: green packet experiences HOL blocking]

Output ports
  This slide is HUGELY important!
  [figure: switch fabric -> datagram buffer (queueing) -> link layer protocol (send) -> line termination]
  buffering required when datagrams arrive from fabric faster than the transmission rate
    datagrams (packets) can be lost due to congestion, lack of buffers
  scheduling discipline chooses among queued datagrams for transmission
    priority scheduling: who gets best performance, network neutrality

Output port queueing
  [figure: at t, packets move from input to output; one packet time later]
  buffering when arrival rate via switch exceeds output line speed
  queueing (delay) and loss due to output port buffer overflow!

How much buffering?
  RFC 3439 rule of thumb: average buffering equal to "typical" RTT (say 250 msec) times link capacity C
    e.g., C = 10 Gbps link: 2.5 Gbit buffer
  recent recommendation: with N flows, buffering equal to RTT · C / √N
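
The two buffer-sizing rules above are simple arithmetic, sketched here in Python. The 250 msec RTT and 10 Gbps link are the slide's example values; the flow count of 10,000 and the function name `buffer_bits` are assumptions chosen only for illustration.

```python
import math


def buffer_bits(rtt_seconds: float, capacity_bps: float, n_flows: int = 1) -> float:
    """Buffer size in bits: RTT * C / sqrt(N).

    With n_flows = 1 this reduces to the RFC 3439 rule of thumb, RTT * C.
    """
    return rtt_seconds * capacity_bps / math.sqrt(n_flows)


rtt = 0.250      # 250 msec, the slide's "typical" RTT
capacity = 10e9  # 10 Gbps link

classic = buffer_bits(rtt, capacity)                     # rule of thumb
many_flows = buffer_bits(rtt, capacity, n_flows=10_000)  # hypothetical N

print(f"RTT * C            = {classic / 1e9:.1f} Gbit")
print(f"RTT * C / sqrt(N)  = {many_flows / 1e6:.1f} Mbit")
```

With these numbers the rule of thumb gives 2.5 Gbit, and dividing by √10,000 = 100 shrinks the requirement to 25 Mbit, which is why the newer recommendation matters for links carrying many flows.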
Scheduling mechanisms
  scheduling: choose next packet to send on link
  FIFO (first in first out) scheduling: send in order of arrival to queue
    real-world example?
    discard policy: if packet arrives to full queue: who to discard?
      tail drop: drop arriving packet
      priority: drop/remove on priority basis
      random: drop/remove randomly
  [figure: packet arrivals -> queue (waiting area) -> link (server) -> packet departures]

Scheduling policies: priority
  priority scheduling: send highest priority queued packet
  multiple classes, with different priorities
    class may depend on marking or other header info, e.g. IP source/dest, port numbers, etc.
    real world example?
  [figure: arrivals -> classify -> high priority queue (waiting area) and low priority queue (waiting area) -> link (server) -> departures; timeline of arrivals, packet in service, departures]

Scheduling policies: still more
  Round Robin (RR) scheduling:
    multiple classes
    cyclically scan class queues, sending one complete packet from each class (if available)
    real world example?
  [figure: timeline of arrivals, packet in service, departures]

Scheduling policies: still more
  Weighted Fair Queuing (WFQ):
    generalized Round Robin
    each class gets weighted amount of service in each cycle
    real-world example?

Chapter 4: outline
  4.1 Overview of Network layer
    data plane
    control plane
  4.2 What's inside a router
  4.3 IP: Internet Protocol
    datagram format
    fragmentation
    IPv4 addressing
    network address translation
    IPv6
  4.4 Generalized Forward and SDN
    match
    action
    OpenFlow examples of match-plus-action in action

The Internet network layer
  host, router network layer functions:
  transport layer: TCP, UDP
  network layer:
    routing protocols
      path selection
      RIP, OSPF, BGP
    forwarding table
    IP protocol
      addressing conventions
      datagram format
      packet handling conventions
    ICMP protocol
      error reporting
      router "signaling"
IP datagram format
  header layout (32 bits per row):

  ver | head. len | type of service | length
  16-bit identifier | flgs | fragment offset
  time to live | upper layer | header checksum
  32 bit source IP address
  32 bit destination IP address
  options (if any)
  data (variable length, typically a TCP or UDP segment)

  field notes:
    ver: IP protocol version number
    head. len: header length (bytes)
    type of service: "type" of data
    length: total datagram length (bytes)
    16-bit identifier, flgs, fragment offset: for fragmentation/reassembly
    time to live: max number remaining hops (decremented at each router)
    upper layer: upper layer protocol to deliver payload to
    options: e.g. timestamp, record route taken, specify list of routers to visit

  how much overhead?
    20 bytes of TCP
    20 bytes of IP
    = 40 bytes + app layer overhead

IP fragmentation, reassembly
  network links have MTU (max. transfer size): largest possible link-level frame
    different link types, different MTUs
  large IP datagram divided ("fragmented") within net
    one datagram becomes several datagrams
    "reassembled" only at final destination
    IP header bits used to identify, order related fragments
  [figure: fragmentation: in: one large datagram, out: 3 smaller datagrams; reassembly at the destination]

IP fragmentation, reassembly
  example:
    4000 byte datagram
    MTU = 1500 bytes
  one large datagram becomes several smaller datagrams:

  length   ID   fragflag   offset
  4000     x    0          0        (original datagram)
  1500     x    1          0
  1500     x    1          185
  1040     x    0          370

  1480 bytes in data field of each full fragment; offset = 1480/8 = 185

IP addressing: introduction
  IP address: 32-bit identifier for host, router interface
  interface: connection between host/router and physical link
    routers typically have multiple interfaces
    host typically has one or two interfaces (e.g., wired Ethernet, wireless 802.11)
  IP addresses associated with each interface
  223.1.1.1 = 11011111 00000001 00000001 00000001
                 223        1        1        1
  [figure: example network of hosts and one router, with an IP address on every interface]

IP addressing: introduction
  Q: how are interfaces actually connected?
  A: we'll learn about that in chapter 5, 6.
  A: wired Ethernet interfaces connected by Ethernet switches
  A: wireless WiFi interfaces connected by WiFi base station
  For now: don't need to worry about how one interface is connected to another (with no intervening router)
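
To make the "32-bit identifier" concrete, the dotted-decimal form can be converted to its bit pattern with Python's standard `ipaddress` module. This is a small sketch; the helper name `to_bits` is made up for this example, and 223.1.1.1 is the address used on the addressing slide.

```python
import ipaddress


def to_bits(dotted: str) -> str:
    """Render a dotted-decimal IPv4 address as four space-separated octets of bits."""
    value = int(ipaddress.IPv4Address(dotted))  # the address as one 32-bit integer
    bits = format(value, "032b")                # zero-padded to 32 binary digits
    return " ".join(bits[i:i + 8] for i in range(0, 32, 8))


print(to_bits("223.1.1.1"))  # 11011111 00000001 00000001 00000001
```

Each decimal number in the dotted form is just one 8-bit slice of the same 32-bit value, which is why prefixes and masks are easiest to reason about in binary.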
Subnets
  IP address:
    subnet part: high order bits
    host part: low order bits
  what's a subnet?
    device interfaces with same subnet part of IP address
    can physically reach each other without intervening router
  [figure: network consisting of 3 subnets]

Subnets
  recipe
    to determine the subnets, detach each interface from its host or router, creating islands of isolated networks
    each isolated network is called a subnet
  [figure: the three subnets 223.1.1.0/24, 223.1.2.0/24 and 223.1.3.0/24; subnet mask: /24]

IP addresses: how to get one?
  Q: How does a host get IP address?
    hard-coded by system admin in a file
      Windows: control-panel->network->configuration->tcp/ip->properties
      UNIX: /etc/rc.config
    DHCP: Dynamic Host Configuration Protocol: dynamically get address from a server
      "plug-and-play"

DHCP: Dynamic Host Configuration Protocol
  goal: allow host to dynamically obtain its IP address from network server when it joins network
    can renew its lease on address in use
    allows reuse of addresses (only hold address while connected/"on")
    support for mobile users who want to join network (more shortly)
  DHCP overview:
    host broadcasts "DHCP discover" msg [optional]
    DHCP server responds with "DHCP offer" msg [optional]
    host requests IP address: "DHCP request" msg
    DHCP server sends address: "DHCP ack" msg

DHCP client-server scenario
  [figure: DHCP server at 223.1.2.5 on subnet 223.1.2.0/24; arriving DHCP client needs address in this network]

DHCP client-server scenario
  DHCP server: 223.1.2.5; arriving client

  DHCP discover ("Broadcast: is there a DHCP server out there?")
    src: 0.0.0.0, 68
    dest: 255.255.255.255, 67
    yiaddr: 0.0.0.0
    transaction ID: 654

  DHCP offer ("Broadcast: I'm a DHCP server! Here's an IP address you can use")
    src: 223.1.2.5, 67
    dest: 255.255.255.255, 68
    yiaddr: 223.1.2.4
    transaction ID: 654
    lifetime: 3600 secs

  DHCP request ("Broadcast: OK. I'll take that IP address!")
    src: 0.0.0.0, 68
    dest: 255.255.255.255, 67
    yiaddr: 223.1.2.4
    transaction ID: 655
    lifetime: 3600 secs

  DHCP ACK ("Broadcast: OK. You've got that IP address!")
    src: 223.1.2.5, 67
    dest: 255.255.255.255, 68
    yiaddr: 223.1.2.4
    transaction ID: 655
    lifetime: 3600 secs
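
The four-message exchange in the client-server scenario can be modelled as plain data to see which fields change from step to step. This is a toy sketch only: it does not follow the real DHCP wire format, and the names `DhcpMessage` and `dhcp_exchange` are invented for this example. Addresses, ports, transaction IDs and lifetime follow the scenario's values.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

BROADCAST = "255.255.255.255"
SERVER_PORT, CLIENT_PORT = 67, 68


@dataclass
class DhcpMessage:
    kind: str                       # "discover", "offer", "request" or "ack"
    src: Tuple[str, int]            # (IP address, UDP port)
    dest: Tuple[str, int]
    yiaddr: str                     # "your" address being offered or requested
    transaction_id: int
    lifetime: Optional[int] = None  # lease lifetime in seconds, if any


def dhcp_exchange(server_ip: str, offered_ip: str, first_id: int, lifetime: int = 3600):
    """Build the discover/offer/request/ack sequence as a list of messages."""
    client = ("0.0.0.0", CLIENT_PORT)  # client has no address yet
    server = (server_ip, SERVER_PORT)
    to_servers = (BROADCAST, SERVER_PORT)
    to_clients = (BROADCAST, CLIENT_PORT)

    discover = DhcpMessage("discover", client, to_servers, "0.0.0.0", first_id)
    offer = DhcpMessage("offer", server, to_clients, offered_ip,
                        discover.transaction_id, lifetime)
    # The scenario uses a new transaction ID for the request/ack pair.
    request = DhcpMessage("request", client, to_servers, offer.yiaddr,
                          first_id + 1, lifetime)
    ack = DhcpMessage("ack", server, to_clients, request.yiaddr,
                      request.transaction_id, lifetime)
    return [discover, offer, request, ack]


for msg in dhcp_exchange("223.1.2.5", "223.1.2.4", first_id=654):
    print(msg.kind, msg.src, "->", msg.dest,
          "yiaddr:", msg.yiaddr, "id:", msg.transaction_id)
```

Note that every message is sent to the broadcast address: the client cannot be addressed directly until the exchange completes.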
DHCP: more than IP addresses
  DHCP can return more than just allocated IP address on subnet:
    address of first-hop router for client
    name and IP address of DNS server
    network mask (indicating network versus host portion of address)

DHCP: example
  [figure: protocol stacks (DHCP, UDP, IP, Eth, Phy) at the laptop and at the router; router with DHCP server built into router, 168.1.1.1]
  connecting laptop needs its IP address, addr of first-hop router, addr of DNS server: use DHCP
  DHCP request encapsulated in UDP, encapsulated in IP, encapsulated in 802.3 Ethernet
  Ethernet frame broadcast (dest: FFFFFFFFFFFF) on LAN, received at router running DHCP server
  Ethernet demuxed to IP, IP demuxed to UDP, UDP demuxed to DHCP

DHCP: example
  DHCP server formulates DHCP ACK containing client's IP address, IP address of first-hop router for client, name & IP address of DNS server
  DHCP server's reply is encapsulated, frame forwarded to client, demuxing up to DHCP at client
  client now knows its IP address, name and IP address of DNS server, IP address of its first-hop router

NAT: network address translation
  [figure: rest of Internet <-> NAT router (138.76.29.7 on the Internet side, 10.0.0.4 on the local side) <-> local network (e.g., home network) 10.0.0/24 with hosts 10.0.0.1, 10.0.0.2, 10.0.0.3]
  all datagrams leaving local network have same single source NAT IP address: 138.76.29.7, different source port numbers
  datagrams with source or destination in this network have 10.0.0/24 address for source, destination (as usual)

NAT: network address translation
  motivation: local network uses just one IP address as far as outside world is concerned:
    range of addresses not needed from ISP: just one IP address for all devices
    can change addresses of devices in local network without notifying outside world
    can change ISP without changing addresses of devices in local network
    devices inside local net not explicitly addressable, visible by outside world (a security plus)

NAT: network address translation
  implementation: NAT router must:
    outgoing datagrams: replace (source IP address, port #) of every outgoing datagram to (NAT IP address, new port #)
      ... remote clients/servers will respond using (NAT IP address, new port #) as destination addr
    remember (in NAT translation table) every (source IP address, port #) to (NAT IP address, new port #) translation pair
    incoming datagrams: replace (NAT IP address, new port #) in dest fields of every incoming datagram with corresponding (source IP address, port #) stored in NAT table
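
The three duties listed above (rewrite outgoing, remember the pair, rewrite incoming) map directly onto a pair of dictionaries. The following Python sketch is illustrative only: the class name `NatRouter`, its method names, the starting port 5001 and the addresses used below are assumptions for the example, and a real NAT also tracks protocol, timeouts and port exhaustion.

```python
class NatRouter:
    """Toy NAT translation table keyed on (IP address, port) pairs."""

    def __init__(self, wan_ip: str, first_port: int = 5001):
        self.wan_ip = wan_ip
        self.next_port = first_port
        self.lan_to_wan = {}  # (LAN IP, LAN port) -> WAN port
        self.wan_to_lan = {}  # WAN port -> (LAN IP, LAN port)

    def outgoing(self, src, dst):
        """Rewrite the source of an outgoing datagram and remember the mapping."""
        if src not in self.lan_to_wan:
            self.lan_to_wan[src] = self.next_port
            self.wan_to_lan[self.next_port] = src
            self.next_port += 1
        return (self.wan_ip, self.lan_to_wan[src]), dst

    def incoming(self, src, dst):
        """Rewrite the destination of an incoming datagram, or return None to drop it."""
        ip, port = dst
        if ip != self.wan_ip or port not in self.wan_to_lan:
            return None  # no translation pair stored for this destination
        return src, self.wan_to_lan[port]


nat = NatRouter("138.76.29.7")
host, server = ("10.0.0.1", 3345), ("128.119.40.186", 80)

print(nat.outgoing(host, server))                   # source becomes the NAT address
print(nat.incoming(server, ("138.76.29.7", 5001)))  # destination restored to the host
print(nat.incoming(server, ("138.76.29.7", 6000)))  # None: unsolicited, no table entry
```

The last call shows why hosts behind a NAT are not directly reachable from outside: without a prior outgoing datagram there is no table entry to translate against.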
NAT: network address translation
  example:
  1: host 10.0.0.1 sends datagram to 128.119.40.186, 80
       S: 10.0.0.1, 3345         D: 128.119.40.186, 80
  2: NAT router changes datagram source addr from 10.0.0.1, 3345 to 138.76.29.7, 5001, updates table
       S: 138.76.29.7, 5001      D: 128.119.40.186, 80
  3: reply arrives, dest. address: 138.76.29.7, 5001
       S: 128.119.40.186, 80     D: 138.76.29.7, 5001
  4: NAT router changes datagram dest addr from 138.76.29.7, 5001 to 10.0.0.1, 3345
       S: 128.119.40.186, 80     D: 10.0.0.1, 3345

  NAT translation table
  WAN side addr          LAN side addr
  138.76.29.7, 5001      10.0.0.1, 3345

  [figure: NAT router with WAN address 138.76.29.7 and LAN address 10.0.0.4; local hosts 10.0.0.1, 10.0.0.2, 10.0.0.3]

  * Check out the online interactive exercises for more examples: http://gaia.cs.umass.edu/kurose_ross/interactive/

NAT: network address translation
  16-bit port-number field:
    more than 60,000 simultaneous connections with a single LAN-side address (2^16 = 65,536 port numbers)
  NAT is controversial:
    routers should only process up to layer 3
    address shortage should be solved by IPv6
    violates end-to-end argument
      NAT possibility must be taken into account by app designers, e.g., P2P applications
    NAT traversal: what if client wants to connect to server behind NAT?