Lecture 16
TCP Congestion Control
Homework 6 Due Today

Tuesday, November 13 CS 475 Networks - Lecture 16

Outline
Chapter 6 - Congestion Control and Resource Allocation
6.1 Issues in Resource Allocation
6.2 Queuing Disciplines
6.3 TCP Congestion Control
6.4 Congestion-Avoidance Mechanisms
6.5 Quality of Service
6.6 Summary

TCP Congestion Control
Congestion control was not added to TCP until about 8 years after the birth of the Internet. At that time the Internet was on the verge of collapse due to congestion.

TCP uses feedback from the receiver to determine how fast to transmit packets. TCP congestion control will work with either FIFO or fair queuing.

TCP uses ACK arrival as a signal to transmit a new packet. Since connections come and go, TCP congestion control must be adaptive.

TCP congestion control consists of three separate mechanisms: additive increase/multiplicative decrease, slow start, and fast retransmit and fast recovery.

Additive Increase/Multiplicative Decrease
For congestion control TCP maintains a CongestionWindow variable that is similar to the AdvertisedWindow variable used in flow control. The sender is limited to sending no more than EffectiveWindow bytes:

    MaxWindow = min(CongestionWindow, AdvertisedWindow)
    EffectiveWindow = MaxWindow - (LastByteSent - LastByteAcked)

It is assumed that the network is congested when a timeout occurs (a packet is not ACKed). Each time there is a timeout, CongestionWindow is reduced by one-half (multiplicative decrease).

If CongestionWindow was equal to 8 packets, successive timeouts would reduce it to 4, then 2, and then 1 (it is not allowed to fall below 1 packet, or rather 1 MSS).
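The window arithmetic and the multiplicative-decrease rule above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the window is counted in whole packets for the timeout example, and clamping the effective window at zero is an assumption that the slides do not state.

```python
def effective_window(congestion_window, advertised_window,
                     last_byte_sent, last_byte_acked):
    """Bytes the sender may still transmit right now."""
    max_window = min(congestion_window, advertised_window)
    in_flight = last_byte_sent - last_byte_acked
    # Assumption: never report a negative allowance.
    return max(max_window - in_flight, 0)


def on_timeout(congestion_window, mss=1):
    """Multiplicative decrease: halve the window, but never below one MSS."""
    return max(congestion_window // 2, mss)


# Successive timeouts starting from a window of 8 packets.
window = 8
history = [window]
for _ in range(4):
    window = on_timeout(window)
    history.append(window)
print(history)  # [8, 4, 2, 1, 1]
```

For example, with a congestion window of 8000 bytes, an advertised window of 16000 bytes and 3000 bytes in flight, `effective_window(8000, 16000, 5000, 2000)` returns 5000.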
For each ACK that arrives, CongestionWindow is incremented slightly (additive increase) via:

    Increment = MSS x (MSS/CongestionWindow)
    CongestionWindow += Increment

This increases CongestionWindow by about one MSS after a full CongestionWindow of packets has been ACKed (that is, roughly once per RTT).

AIMD results in a sawtooth pattern for the congestion window size, as shown in the accompanying graph. Timeouts due to dropped packets have occurred at the instants at which there is a sharp decrease (by ½) of the window size.

When a connection is first started we want to increase CongestionWindow as rapidly as possible to near its steady-state value(s). Instead of increasing it linearly at start-up, it is doubled each time all the packets sent in an RTT are ACKed.

Slow start increases the window more rapidly than additive increase. It is called slow start because it is slower than the original TCP behavior, which was to send out a burst of data the size of the AdvertisedWindow. This burst would often overwhelm Internet routers.

Slow start is also used immediately after a timeout: the window is reset to 1 packet and slow start runs until the window reaches ½ its size prior to the timeout. The sizing algorithm then switches to AIMD.

Notice the slow start intervals after a timeout in the graph of congestion window size. Bullets indicate timeout events. Horizontal lines at top correspond to packet transmission times. Vertical lines are times at which a packet that eventually times out is first transmitted.

Notice the gaps in the horizontal lines on the graph on the previous slide. These represent intervals in which the sender has sent a window of data and is waiting for an ACK (a timeout eventually occurs).

Fast retransmit allows a sender to retransmit a lost packet before it times out.
Fast retransmit and fast recovery were not in the original TCP congestion control algorithm, but were added later.
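The window-growth rules described earlier in the lecture (slow start and additive increase) can be sketched per ACK as follows. This is a minimal sketch under stated assumptions: the names `on_ack`, `on_timeout`, `one_rtt` and `ssthresh` are hypothetical (the slides only describe "½ its size prior to the timeout" as the point where slow start ends), slow start is written in the usual per-ACK form of one MSS per ACK, and losses are not modeled.

```python
def on_ack(cwnd, ssthresh, mss):
    """Return the new congestion window (bytes) after one ACK arrives."""
    if cwnd < ssthresh:
        # Slow start: one MSS per ACK, which doubles the window each RTT.
        return cwnd + mss
    # Additive increase: MSS * (MSS / cwnd) per ACK, about one MSS per RTT.
    return cwnd + mss * mss / cwnd


def on_timeout(cwnd, mss):
    """Return (new_cwnd, new_ssthresh): restart at 1 MSS, target half the old window."""
    return mss, max(cwnd / 2, mss)


def one_rtt(cwnd, ssthresh, mss):
    """Apply on_ack once for every packet in the current window."""
    for _ in range(int(cwnd // mss)):
        cwnd = on_ack(cwnd, ssthresh, mss)
    return cwnd


MSS = 1000
cwnd, ssthresh = MSS, 8 * MSS
trace = [cwnd]
for _ in range(4):
    cwnd = one_rtt(cwnd, ssthresh, MSS)
    trace.append(cwnd)
print(trace)
```

In this trace the window doubles each RTT (1000, 2000, 4000, 8000 bytes) until it reaches `ssthresh`, after which one further RTT adds slightly less than one MSS. A timeout at a window of 8000 bytes gives `on_timeout(8000, 1000) == (1000, 4000.0)`, so slow start resumes from one packet and ends at half the previous window.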
When an out-of-order packet is received, the receiver sends a duplicate ACK of the last in-order packet. When the sender sees three duplicate ACKs it resends the (presumed) dropped packet.

Fast retransmit results in about a 20% increase in throughput. This is shown in the accompanying graph. The congestion window is now decreased when duplicate ACKs are received (as well as when a timeout occurs).

With fast recovery the window size is decreased by ½ after a fast retransmit and the algorithm continues in AIMD mode instead of falling back to slow start.

Slow start is used only at the start of a connection and after a timeout. After a timeout the window size is reset to 1 packet and slow start is used until the window size reaches ½ its size before the timeout. At all other times the congestion window follows a pure AIMD pattern.

Congestion-Avoidance Mechanisms
TCP uses congestion control. It relies on timeouts and duplicate ACKs to detect when congestion is occurring and then decreases the window size.

An alternative strategy would be to use congestion avoidance, in which the sender predicts when congestion will occur and reduces the rate before packets are discarded. Ideally packets would never be lost due to congestion.

DECbit
In the Digital Network Architecture (DNA) network, DNA routers would set a bit (the DECbit) when they were busy (the average queue length exceeded a threshold). The receiver would copy the bit into its ACK.

At the sender, if more than 50% of the packets in the previous congestion window had the DECbit set, the window would be reduced to 0.875 times its previous size. If not, the window would be increased by one packet.

Random Early Detection (RED)
RED also requires special routers, but can be used with TCP.
RED routers monitor their average queue length and implicitly notify the sender of congestion by dropping packets before congestion actually occurs.

If the average router queue length is below MinThreshold, no packets are dropped. If the length is above MaxThreshold, all incoming packets are dropped. If the length is between the threshold values, a packet is dropped with probability P.
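The three-way RED decision above can be sketched as a small function. This is an illustration under assumptions: the slides only say that P is a function of the queue length, so the linear ramp between the thresholds and the `max_p` parameter are assumed here, and the dependence of P on the time since the last drop is left out. All names are hypothetical.

```python
import random


def red_drop_probability(avg_len, min_threshold, max_threshold, max_p=0.02):
    """Drop probability for an arriving packet, given the average queue length."""
    if avg_len < min_threshold:
        return 0.0  # queue is short: never drop
    if avg_len >= max_threshold:
        return 1.0  # queue is long: drop every arrival
    # Assumption: P rises linearly from 0 to max_p between the thresholds.
    return max_p * (avg_len - min_threshold) / (max_threshold - min_threshold)


def red_should_drop(avg_len, min_threshold, max_threshold, max_p=0.02,
                    rng=random.random):
    """Randomly decide whether to drop; rng is injectable for testing."""
    p = red_drop_probability(avg_len, min_threshold, max_threshold, max_p)
    return rng() < p


for avg in (4, 10, 15):
    print(avg, red_drop_probability(avg, min_threshold=5, max_threshold=15))
```

Passing `rng` as a parameter is a design choice that makes the random decision reproducible in tests; a router implementation would use its own random source.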
The probability P is a function of the average queue length, as shown in the accompanying figure. It is also a function of the time since the last packet was dropped. P increases with time, which prevents clusters of drops by distributing the drops over time.

There is a collection of congestion avoidance algorithms that do not require special routers. They look for signs that some router's queue is building up and then throttle back to avoid congestion.

One algorithm looks for an increase in RTT as a signal. AIMD is normally used, but every two round-trip times it compares the current RTT to the average of the min and max RTTs seen so far. If the RTT is greater, the window is reduced by 1/8th.

A second algorithm looks for changes in both the RTT and the window size. If

    (CurrentWindow - OldWindow) x (CurrentRTT - OldRTT)

is positive the window is reduced by 1/8th. Otherwise, it is increased by one maximum-size packet.

A third algorithm looks for a flattening in throughput as a signal that congestion is occurring. Every RTT the congestion window is increased by one packet. If the resulting gain in throughput (compared with the throughput when the window was one packet smaller) falls below a threshold, the window size is decreased by one packet. (This algorithm effectively looks for a change in the slope of the throughput.)

A fourth algorithm, known as TCP Vegas, compares the measured throughput to an expected value.

The design of TCP Vegas is based on the observation (illustrated in the graphs on the following slide) that, prior to congestion, the window size increases while the sending rate remains flat.
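Returning to the first of the source-based algorithms above (the one that watches for an increase in RTT), its check can be sketched as follows. This is only a sketch: the class and method names are hypothetical, the caller is assumed to invoke `adjust` once every two round-trip times, and the normal AIMD growth between checks is not shown.

```python
class RttWatcher:
    """Tracks the minimum and maximum RTT seen so far on a connection."""

    def __init__(self):
        self.min_rtt = None
        self.max_rtt = None

    def observe(self, rtt):
        """Record one RTT sample."""
        self.min_rtt = rtt if self.min_rtt is None else min(self.min_rtt, rtt)
        self.max_rtt = rtt if self.max_rtt is None else max(self.max_rtt, rtt)

    def adjust(self, window, current_rtt):
        """Reduce the window by 1/8th if the current RTT exceeds the min/max average."""
        threshold = (self.min_rtt + self.max_rtt) / 2
        if current_rtt > threshold:
            return window - window / 8
        return window


watcher = RttWatcher()
for sample in (100, 140, 120):  # RTT samples in milliseconds
    watcher.observe(sample)

print(watcher.adjust(16, current_rtt=130))  # 130 > (100 + 140) / 2, so 14.0
print(watcher.adjust(16, current_rtt=110))  # below the threshold, so 16
```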
TCP Vegas calculates an expected sending rate from:

    ExpectedRate = CongestionWindow/BaseRTT

where BaseRTT is the minimum of all measured round-trip times.

The difference between the expected and actual rates,

    Diff = ExpectedRate - ActualRate

is compared to α and β thresholds (α < β). If Diff < α the window is increased linearly. If Diff > β the window is decreased linearly. Otherwise the window is left unchanged.

Figure: TCP Vegas congestion control. Top: congestion window. Bottom: expected (black) and actual throughput.
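The Vegas comparison above can be sketched as a single decision function. This is a minimal sketch, not the Vegas implementation: the function name, the one-packet `step`, and the choice of packets and packets per second as units are assumptions, and `actual_rate` is taken as an input rather than measured.

```python
def vegas_adjust(cwnd, base_rtt, actual_rate, alpha, beta, step=1):
    """Return the new congestion window after one Vegas comparison.

    cwnd is in packets, base_rtt in seconds, rates and thresholds in
    packets per second (units are an assumption of this sketch).
    """
    expected_rate = cwnd / base_rtt
    diff = expected_rate - actual_rate
    if diff < alpha:
        return cwnd + step  # spare capacity: increase linearly
    if diff > beta:
        return cwnd - step  # queues building up: decrease linearly
    return cwnd             # between the thresholds: leave unchanged


# A 100-packet window with BaseRTT = 0.5 s gives ExpectedRate = 200 packets/s.
for actual in (180, 150, 120):
    print(actual, vegas_adjust(100, 0.5, actual, alpha=30, beta=60))
```

With these numbers Diff is 20, 50 and 80 packets per second, so the window moves to 101, stays at 100, and drops to 99 respectively.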