Internet protocol stack application: supporting network applications FTP, SMTP, HTTP transport: process-process data transfer TCP, UDP network: routing of datagrams from source to destination IP, routing protocols link: data transfer between neighboring network elements Ethernet, 802.111 (WiFi), PPP physical: bits on the wire application transport network link physical Introduction 1-1
source Encapsulation segment datagram frame message H l H t H n H t H n H t M M M M application transport network link physical link physical switch H l H n H n H t H t H t M M M M destination application transport network link physical H l H n H n H t H t M M network link physical H n H t M router Introduction 1-2
Chapter 4 4.1 Overview of Network layer data plane control plane 4.2 What s inside a router 4.3 IP: Internet Protocol datagram format fragmentation IPv4 addressing network address translation IPv6 4.4 Generalized Forward and SDN match action OpenFlow examples of match-plus-action in action Network Layer: Data Plane 4-3
Two key network-layer functions network-layer functions: forwarding: move packets from router s input to appropriate router output routing: determine route taken by packets from source to destination routing algorithms analogy: taking a trip forwarding: process of getting through single interchange routing: process of planning trip from source to destination Network Layer: Data Plane 4-4
Network layer: data plane, control plane Data plane local, per-router function determines how datagram arriving on router input port is forwarded to router output port forwarding function values in arriving packet header 0111 3 1 2 Control plane network-wide logic determines how datagram is routed among routers along end-end path from source host to destination host two control-plane approaches: traditional routing algorithms: implemented in routers software-defined networking (SDN): implemented in (remote) servers Network Layer: Data Plane 4-5
Network service model Service model for channel transporting datagrams from sender to receiver: For individual datagrams: guaranteed delivery guaranteed delivery with less than 40 msec delay For a flow of datagrams: in-order datagram delivery guaranteed minimum bandwidth to flow restrictions on changes in inter-packet spacing Network Architecture Service Model Bandwidth Guarantees? Loss Order Timing Congestion feedback Internet best effort none no no no no (inferred via loss) Network Layer: Data Plane 4-6
Router architecture overview high-level view of generic router architecture: routing processor high-seed switching fabric routing, management control plane (software) operates in millisecond time frame forwarding data plane (hardware) operates in nanosecond timeframe router input ports router output ports Network Layer: Data Plane 4-7
Input port queuing fabric slower than input ports combined -> queueing may occur at input queues queueing delay and loss due to input buffer overflow! Head-of-the-Line (HOL) blocking: queued datagram at front of queue prevents others in queue from moving forward switch fabric switch fabric output port contention: only one red datagram can be transferred. lower red packet is blocked one packet time later: green packet experiences HOL blocking Network Layer: Data Plane 4-8
Output ports This slide in HUGELY important! switch fabric datagram buffer queueing link layer protocol (send) line termination buffering required when datagrams arrive from fabric faster than the transmission rate scheduling discipline chooses among queued datagrams for transmission performance, network neutrality Datagram (packets) can be lost due to congestion, lack of buffers Priority scheduling who gets best Network Layer: Data Plane 4-9
Output port queueing switch fabric switch fabric at t, packets more from input to output one packet time later buffering when arrival rate via switch exceeds output line speed queueing (delay) and loss due to output port buffer overflow! Network Layer: Data Plane 4-10
Scheduling mechanisms scheduling: choose next packet to send on link FIFO (first in first out) scheduling: send in order of arrival to queue packet arrivals queue (waiting area) link (server) packet departures arrivals priority scheduling: send highest priority queued packet classify high priority queue (waiting area) low priority queue (waiting area) link (server) departures arrivals packet in service departures 2 1 3 5 1 3 2 4 5 1 4 3 2 4 5 Network Layer: Data Plane 4-11
Scheduling policies: still more Round Robin (RR) scheduling: multiple classes arrivals 2 1 3 4 5 packet in service 1 3 2 4 5 departures Weighted Fair Queuing (WFQ): generalized Round Robin 1 3 3 4 5 Network Layer: Data Plane 4-12
IP datagram format IP protocol version number header length (bytes) type of data max number remaining hops (decremented at each router) upper layer protocol to deliver payload to how much overhead? v 20 bytes of TCP v 20 bytes of IP v = 40 bytes + app layer overhead ver head. len 16-bit identifier time to live type of service upper layer 32 bits flgs length fragment offset header checksum 32 bit source IP address 32 bit destination IP address options (if any) data (variable length, typically a TCP or UDP segment) total datagram length (bytes) for fragmentation/ reassembly e.g. timestamp, record route taken, specify list of routers to visit. Network Layer: Data Plane 4-13
IP fragmentation, reassembly network links have MTU (max.transfer size) - largest possible link-level frame different link types, different MTUs large IP datagram divided ( fragmented ) within net one datagram becomes several datagrams reassembled only at final destination IP header bits used to identify, order related fragments reassembly fragmentation: in: one large datagram out: 3 smaller datagrams Network Layer: Data Plane 4-14
Subnets IP address: subnet part - high order bits host part - low order bits what s a subnet? device interfaces with same subnet part of IP address can physically reach each other without intervening router 223.1.1.1 223.1.1.2 223.1.2.1 223.1.1.4 223.1.2.9 223.1.2.2 223.1.1.3 223.1.3.27 subnet 223.1.3.1 223.1.3.2 network consisting of 3 subnets Network Layer: Data Plane 4-15
Transition from IPv4 to IPv6 not all routers can be upgraded simultaneously no flag days how will network operate with mixed IPv4 and IPv6 routers? tunneling: IPv6 datagram carried as payload in IPv4 datagram among IPv4 routers IPv4 header fields IPv4 source, dest addr IPv6 header fields IPv6 source dest addr UDP/TCP payload IPv4 payload IPv6 datagram IPv4 datagram Network Layer: Data Plane 4-16
Tunneling logical view: A IPv6 B IPv6 IPv4 tunnel connecting IPv6 routers E IPv6 F IPv6 physical view: A B C D E F IPv6 IPv6 IPv4 IPv4 IPv6 IPv6 Network Layer: Data Plane 4-17
Tunneling logical view: A IPv6 B IPv6 IPv4 tunnel connecting IPv6 routers E IPv6 F IPv6 physical view: A B C D E F IPv6 IPv6 IPv4 IPv4 IPv6 IPv6 flow: X src: A dest: F data src:b dest: E Flow: X Src: A Dest: F src:b dest: E Flow: X Src: A Dest: F flow: X src: A dest: F data data data A-to-B: IPv6 B-to-C: IPv6 inside IPv4 B-to-C: IPv6 inside IPv4 E-to-F: IPv6 Network Layer: Data Plane 4-18
IPv6: adoption Google: 8% of clients access services via IPv6 NIST: 1/3 of all US government domains are IPv6 capable Long (long!) time for deployment, use 20 years and counting! think of application-level changes in last 20 years: WWW, Facebook, streaming media, Skype, Why? Network Layer: Data Plane 4-19
Generalized Forwarding and SDN Each router contains a flow table that is computed and distributed by a logically centralized routing controller logically-centralized routing controller control plane data plane local flow table headers counters actions 0100 1101 3 2 1 values in arriving packet s header Network Layer: Data Plane 4-20
A link-state routing algorithm Dijkstra s algorithm net topology, link costs known to all nodes accomplished via link state broadcast all nodes have same info computes least cost paths from one node ( source ) to all other nodes gives forwarding table for that node iterative: after k iterations, know least cost path to k dest. s notation: c(x,y): link cost from node x to y; = if not direct neighbors D(v): current value of cost of path from source to dest. v p(v): predecessor node along path from source to v N': set of nodes whose least cost path definitively known Network Layer: Control Plane 5-21
Distance vector algorithm Bellman-Ford equation (dynamic programming) let d x (y) := cost of least-cost path from x to y then d x (y) = min {c(x,v) + d v (y) } v cost from neighbor v to destination y cost to neighbor v min taken over all neighbors v of x Network Layer: Control Plane 5-22
Internet approach to scalable routing aggregate routers into regions known as autonomous systems (AS) (a.k.a. domains ) intra-as routing routing among hosts, routers in same AS ( network ) all routers in AS must run same intra-domain protocol routers in different AS can run different intra-domain routing protocol gateway router: at edge of its own AS, has link(s) to router(s) in other AS es inter-as routing routing among AS es gateways perform interdomain routing (as well as intra-domain routing) Network Layer: Control Plane 5-23
Interconnected ASes 3c 3a 3b AS3 1a 1c 1d 1b Intra-AS Routing algorithm AS1 Forwarding table Inter-AS Routing algorithm 2a 2c AS2 2b forwarding table configured by both intraand inter-as routing algorithm intra-as routing determine entries for destinations within AS inter-as & intra-as determine entries for external destinations Network Layer: Control Plane 5-24
Inter-AS tasks suppose router in AS1 receives datagram destined outside of AS1: router should forward packet to gateway router, but which one? AS1 must: 1. learn which dests are reachable through AS2, which through AS3 2. propagate this reachability info to all routers in AS1 job of inter-as routing! 3c other networks 3b 3a AS3 1a AS1 1c 1d 1b 2a AS2 2c 2b other networks Network Layer: Control Plane 5-25
Internet inter-as routing: BGP BGP (Border Gateway Protocol): the de facto inter-domain routing protocol glue that holds the Internet together BGP provides each AS a means to: ebgp: obtain subnet reachability information from neighboring ASes ibgp: propagate reachability information to all ASinternal routers. determine good routes to other networks based on reachability information and policy allows subnet to advertise its existence to rest of Internet: I am here Network Layer: Control Plane 5-26
BGP path advertisement AS1 1b AS3 3b 1a 1c 1d AS2,AS3,X AS2 2a 2b 2c 3a AS3,X 3d 3c X 2d AS2 router 2c receives path advertisement AS3,X (via ebgp) from AS3 router 3a Based on AS2 policy, AS2 router 2c accepts path AS3,X, propagates (via ibgp) to all AS2 routers Based on AS2 policy, AS2 router 2a advertises (via ebgp) path AS2, AS3, X to AS1 router 1c Network Layer: Control Plane 5-27
BGP path advertisement AS1 1b AS3 3b 1a 1c 1d AS2,AS3,X AS2 2a 2b 2c 3a AS3,X 3d 3c X 2d gateway router may learn about multiple paths to destination: AS1 gateway router 1c learns path AS2,AS3,X from 2a AS1 gateway router 1c learns path AS3,X from 3a Based on policy, AS1 gateway router 1c chooses path AS3,X, and advertises path within AS1 via ibgp Network Layer: Control Plane 5-28
BGP: achieving policy via advertisements W A B C X legend: provider network customer network: Y Suppose an ISP only wants to route traffic to/from its customer networks (does not want to carry transit traffic between other ISPs) A advertises path Aw to B and to C B chooses not to advertise BAw to C: B gets no revenue for routing CBAw, since none of C, A, w are B s customers C does not learn about CBAw path C will route CAw (not using B) to get to w Network Layer: Control Plane 5-29
BGP: achieving policy via advertisements W A B C X legend: provider network customer network: Y Suppose an ISP only wants to route traffic to/from its customer networks (does not want to carry transit traffic between other ISPs) A,B,C are provider networks X,W,Y are customer (of provider networks) X is dual-homed: attached to two networks policy to enforce: X does not want to route from B to C via X.. so X will not advertise to B a route to C Network Layer: Control Plane 5-30
Why different Intra-, Inter-AS routing? policy: inter-as: admin wants control over how its traffic routed, who routes through its net. intra-as: single admin, so no policy decisions needed scale: hierarchical routing saves table size, reduced update traffic performance: intra-as: can focus on performance inter-as: policy may dominate over performance Network Layer: Control Plane 5-31
Link layer, LANs: outline 6.1 introduction, services 6.2 error detection, correction 6.3 multiple access protocols 6.4 LANs addressing, ARP Ethernet switches VLANS 6.5 link virtualization: MPLS 6.6 data center networking 6.7 a day in the life of a web request Link Layer and LANs 6-32
Slotted ALOHA assumptions: all frames same size time divided into equal size slots (time to transmit 1 frame) nodes start to transmit only slot beginning nodes are synchronized if 2 or more nodes transmit in slot, all nodes detect collision operation: when node obtains fresh frame, transmits in next slot if no collision: node can send new frame in next slot if collision: node retransmits frame in each subsequent slot with prob. p until success Link Layer and LANs 6-33
Slotted ALOHA node 1 1 1 1 1 node 2 2 2 2 node 3 3 3 3 C E C S E C E S S Pros: single active node can continuously transmit at full rate of channel highly decentralized: only slots in nodes need to be in sync simple Cons: collisions, wasting slots idle slots nodes may be able to detect collision in less than time to transmit packet clock synchronization Link Layer and LANs 6-34
Slotted ALOHA: efficiency efficiency: long-run fraction of successful slots (many nodes, all with many frames to send) suppose: N nodes with many frames to send, each transmits in slot with probability p prob that given node has success in a slot = p(1- p) N-1 prob that any node has a success = Np(1-p) N-1 max efficiency: find p* that maximizes Np(1-p) N-1 for many nodes, take limit of Np*(1-p*) N-1 as N goes to infinity, gives: max efficiency = 1/e =.37 at best: channel used for useful transmissions 37% of time!! Link Layer and LANs 6-35
Pure (unslotted) ALOHA unslotted Aloha: simpler, no synchronization when frame first arrives transmit immediately collision probability increases: frame sent at t 0 collides with other frames sent in [t 0-1,t 0 +1] Link Layer and LANs 6-36
Pure ALOHA efficiency P(success by given node) = P(node transmits). P(no other node transmits in [t 0-1,t 0 ]. P(no other node transmits in [t 0-1,t 0 ] = p. (1-p) N-1. (1-p) N-1 = p. (1-p) 2(N-1) choosing optimum p and then letting n = 1/(2e) =.18 even worse than slotted Aloha! Link Layer and LANs 6-37
CSMA (carrier sense multiple access) CSMA: listen before transmit: if channel sensed idle: transmit entire frame if channel sensed busy, defer transmission human analogy: don t interrupt others! Link Layer and LANs 6-38
CSMA collisions spatial layout of nodes collisions can still occur: propagation delay means two nodes may not hear each other s transmission collision: entire packet transmission time wasted distance & propagation delay play role in in determining collision probability Link Layer and LANs 6-39
CSMA/CD (collision detection) CSMA/CD: carrier sensing, deferral as in CSMA collisions detected within short time colliding transmissions aborted, reducing channel wastage collision detection: easy in wired LANs: measure signal strengths, compare transmitted, received signals difficult in wireless LANs: received signal strength overwhelmed by local transmission strength human analogy: the polite conversationalist Link Layer and LANs 6-40
CSMA/CD (collision detection) spatial layout of nodes Link Layer and LANs 6-41
CSMA/CD efficiency T prop = max prop delay between 2 nodes in LAN t trans = time to transmit max-size frame 1 efficiency = 1+ 5 t prop /t trans efficiency goes to 1 as t prop goes to 0 as t trans goes to infinity better performance than ALOHA: and simple, cheap, decentralized! Link Layer and LANs 6-42
Taking turns MAC protocols channel partitioning MAC protocols: share channel efficiently and fairly at high load inefficient at low load: delay in channel access, 1/N bandwidth allocated even if only 1 active node! random access MAC protocols efficient at low load: single node can fully utilize channel high load: collision overhead taking turns protocols look for best of both worlds! Link Layer and LANs 6-43
Taking turns MAC protocols polling: master node invites slave nodes to transmit in turn typically used with dumb slave devices concerns: polling overhead latency single point of failure (master) data slaves data poll master Link Layer and LANs 6-44
Taking turns MAC protocols token passing: control token passed from one node to next sequentially. token message concerns: token overhead latency single point of failure (token) (nothing to send) T T data Link Layer and LANs 6-45
Synthesis: a day in the life of a web request journey down protocol stack complete! application, transport, network, link putting-it-all-together: synthesis! goal: identify, review, understand protocols (at all layers) involved in seemingly simple scenario: requesting www page scenario: student attaches laptop to campus network, requests/receives www.google.com Link Layer and LANs 6-46
A day in the life: scenario browser DNS server Comcast network 68.80.0.0/13 school network 68.80.2.0/24 web page web server 64.233.169.105 Google s network 64.233.160.0/19 Link Layer and LANs 6-47
A day in the life connecting to the Internet DHCP DHCP DHCP DHCP DHCP UDP IP Eth Phy connecting laptop needs to get its own IP address, addr of first-hop router, addr of DNS server: use DHCP DHCP DHCP DHCP DHCP DHCP DHCP UDP IP Eth Phy router (runs DHCP) DHCP request encapsulated in UDP, encapsulated in IP, encapsulated in 802.3 Ethernet Ethernet frame broadcast (dest: FFFFFFFFFFFF) on LAN, received at router running DHCP server Ethernet demuxed to IP demuxed, UDP demuxed to DHCP Link Layer and LANs 6-48
A day in the life connecting to the Internet DHCP DHCP DHCP DHCP DHCP DHCP DHCP DHCP DHCP DHCP UDP IP Eth Phy DHCP UDP IP Eth Phy router (runs DHCP) DHCP server formulates DHCP ACK containing client s IP address, IP address of first-hop router for client, name & IP address of DNS server encapsulation at DHCP server, frame forwarded (switch learning) through LAN, demultiplexing at client DHCP client receives DHCP ACK reply Client now has IP address, knows name & addr of DNS server, IP address of its first-hop router Link Layer and LANs 6-49
A day in the life ARP (before DNS, before HTTP) DNS DNS DNS ARP query DNS UDP IP Eth Phy ARP ARP reply ARP Eth Phy router (runs DHCP) before sending HTTP request, need IP address of www.google.com: DNS DNS query created, encapsulated in UDP, encapsulated in IP, encapsulated in Eth. To send frame to router, need MAC address of router interface: ARP ARP query broadcast, received by router, which replies with ARP reply giving MAC address of router interface client now knows MAC address of first hop router, so can now send frame containing DNS query Link Layer and LANs 6-50
A day in the life using DNS DNS DNS DNS DNS DNS DNS UDP IP Eth Phy DNS DNS DNS DNS DNS UDP IP Eth Phy Comcast network 68.80.0.0/13 DNS server router (runs DHCP) IP datagram containing DNS query forwarded via LAN switch from client to 1 st hop router IP datagram forwarded from campus network into Comcast network, routed (tables created by RIP, OSPF, IS-IS and/or BGP routing protocols) to DNS server demuxed to DNS server DNS server replies to client with IP address of www.google.com Link Layer and LANs 6-51
A day in the life TCP connection carrying HTTP HTTP SYNACK SYNACK SYNACK HTTP TCP IP Eth Phy SYNACK SYNACK SYNACK TCP IP Eth Phy web server 64.233.169.105 router (runs DHCP) to send HTTP request, client first opens TCP socket to web server TCP SYN segment (step 1 in 3- way handshake) inter-domain routed to web server web server responds with TCP SYNACK (step 2 in 3-way handshake) TCP connection established! Link Layer and LANs 6-52
A day in the life HTTP request/reply HTTP HTTP HTTP HTTP HTTP HTTP TCP IP Eth Phy web page finally (!!!) displayed HTTP request sent into TCP socket HTTP HTTP HTTP HTTP HTTP TCP IP Eth Phy web server 64.233.169.105 router (runs DHCP) IP datagram containing HTTP request routed to www.google.com web server responds with HTTP reply (containing web page) IP datagram containing HTTP reply routed back to client Link Layer and LANs 6-53