TCP Adaptive Retransmission Algorithm - Original TCP
  Theory
    Estimate RTT
    Multiply by 2 to allow for variations
  Practice
    Use exponential moving average (A = 0.1 to 0.2)
    Estimate = A * Measurement + (1 - A) * Estimate
  Problem
    Did not handle variations well
    Ambiguity for retransmitted packets
      Was the ACK in response to the first, second, etc. transmission?

11/11/06 CS/ECE 438 - UIUC, Fall 2006

TCP Adaptive Retransmission Algorithm - Karn-Partridge
  Algorithm
    Exclude retransmitted packets from RTT estimate
    For each retransmission
      Double RTT estimate
      Exponential backoff from congestion
  Problem
    Still did not handle variations well
    Did not solve network congestion problems as well as desired

TCP Adaptive Retransmission Algorithm - Jacobson
  Algorithm
    Estimate variance of RTT
      Calculate mean interpacket RTT deviation to approximate variance
      Use second exponential moving average
      Deviation = B * |RTT_Estimate - Measurement| + (1 - B) * Deviation
      B = 0.25, A = 0.125 for RTT_Estimate
    Use variance estimate as component of RTT estimate
      Next_RTT = RTT_Estimate + 4 * Deviation
    Protects against high jitter
  Notes
    Algorithm is only as good as the granularity of the clock
    Accurate timeout mechanism is important for congestion control

TCP Connection Establishment
  3-Way Handshake
    Sequence numbers J, K
  Message Types
    Synchronize (SYN)
    Acknowledge (ACK)
  Passive Open
    Server listens for connection from client
  Active Open
    Client initiates connection to server
  [Figure: time flows down. Server listens; client sends SYN J; server replies SYN K, ACK J+1; client sends ACK K+1.]

TCP Connection Termination
  Message Types
    Finished (FIN)
    Acknowledge (ACK)
  Active close
    Sends no more data
  Passive close
    Accepts no more data
  [Figure: time flows down. Client sends FIN J; server replies ACK J+1, then sends FIN K; client replies ACK K+1.]
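The Karn-Partridge and Jacobson rules from the adaptive retransmission slides above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `RttEstimator`, its method names, and the choice of seconds as the unit are hypothetical, and the initial deviation of 0 is an assumption the slides do not specify. The gains A = 0.125 and B = 0.25 and the `RTT_Estimate + 4 * Deviation` timeout are taken from the slides.

```python
from dataclasses import dataclass


@dataclass
class RttEstimator:
    """Jacobson-style RTT estimator (hypothetical helper, not a real TCP API)."""

    estimate: float          # RTT_Estimate, in seconds
    deviation: float = 0.0   # Deviation; starting at 0 is an assumption
    a: float = 0.125         # gain A for the RTT moving average
    b: float = 0.25          # gain B for the deviation moving average

    def update(self, measurement: float, retransmitted: bool = False) -> None:
        # Karn-Partridge: an ACK for a retransmitted segment is ambiguous,
        # so its sample is excluded from the estimate.
        if retransmitted:
            return
        # Deviation = B * |RTT_Estimate - Measurement| + (1 - B) * Deviation
        error = abs(self.estimate - measurement)
        self.deviation = self.b * error + (1 - self.b) * self.deviation
        # Estimate = A * Measurement + (1 - A) * Estimate
        self.estimate = self.a * measurement + (1 - self.a) * self.estimate

    def timeout(self) -> float:
        # Next_RTT = RTT_Estimate + 4 * Deviation
        return self.estimate + 4 * self.deviation


est = RttEstimator(estimate=0.100)
est.update(0.200)                      # one clean sample of 200 ms
est.update(0.900, retransmitted=True)  # ignored under Karn-Partridge
print(est.estimate, est.deviation, est.timeout())
```

Because the deviation term is weighted by 4, a single late sample raises the timeout far more than it raises the mean, which is the jitter protection the Jacobson slide refers to.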
TCP Connection Management (cont)
  [Figure: TCP client lifecycle and TCP server lifecycle]

TCP State Descriptions
  CLOSED        Disconnected
  LISTEN        Waiting for incoming connection
  SYN_RCVD      Connection request received
  SYN_SENT      Connection request sent
  ESTABLISHED   Connection ready for data transport
  CLOSE_WAIT    Connection closed by peer
  LAST_ACK      Connection closed by peer, closed locally, await ACK
  FIN_WAIT_1    Connection closed locally
  FIN_WAIT_2    Connection closed locally and ACK'd
  CLOSING       Connection closed by both sides simultaneously
  TIME_WAIT     Wait for network to discard related packets

[Figure: TCP state transition diagram over the states listed above. Edge labels (event/action): Active open/SYN, Passive open, SYN/SYN + ACK, Send/SYN, SYN + ACK/ACK, FIN/ACK, FIN + ACK/ACK, Timeout.]

Questions
  State transitions
    Describe the path taken by a server under normal conditions
    Describe the path taken by a client under normal conditions
    Describe the path taken assuming the client closes the connection first
  TIME_WAIT state
    What purpose does this state serve?
    Prove that at least one side of a connection enters this state
    Explain how both sides might enter this state

[Figure: connection-establishment portion of the state transition diagram]

     TCP A                                                 TCP B
  1. CLOSED                                                LISTEN
  2. SYN-SENT    --> <SEQ=100><CTL=SYN>                --> SYN-RECEIVED
  3. ESTABLISHED <-- <SEQ=300><ACK=101><CTL=SYN,ACK>   <-- SYN-RECEIVED
  4. ESTABLISHED --> <SEQ=101><ACK=301><CTL=ACK>       --> ESTABLISHED
  5. ESTABLISHED --> <SEQ=101><ACK=301><CTL=ACK><DATA> --> ESTABLISHED

[Figure: full TCP state transition diagram, repeated]
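One way to experiment with the state transition diagram above is to encode it as a lookup table and walk it with a sequence of events. The sketch below is a minimal illustration, not a TCP implementation: the event names (`active_open`, `recv_syn_ack`, and so on) and the `walk` helper are hypothetical labels chosen for this example, and the segments sent on each transition are noted only in comments.

```python
# (current state, event) -> next state. Event names are hypothetical labels.
TRANSITIONS = {
    ("CLOSED", "passive_open"): "LISTEN",
    ("CLOSED", "active_open"): "SYN_SENT",         # send SYN
    ("LISTEN", "recv_syn"): "SYN_RCVD",            # send SYN + ACK
    ("LISTEN", "send"): "SYN_SENT",                # send SYN
    ("SYN_SENT", "recv_syn"): "SYN_RCVD",          # send SYN + ACK
    ("SYN_SENT", "recv_syn_ack"): "ESTABLISHED",   # send ACK
    ("SYN_RCVD", "recv_ack"): "ESTABLISHED",
    ("ESTABLISHED", "close"): "FIN_WAIT_1",        # send FIN
    ("ESTABLISHED", "recv_fin"): "CLOSE_WAIT",     # send ACK
    ("FIN_WAIT_1", "recv_ack"): "FIN_WAIT_2",
    ("FIN_WAIT_1", "recv_fin"): "CLOSING",         # send ACK
    ("FIN_WAIT_1", "recv_fin_ack"): "TIME_WAIT",   # send ACK
    ("FIN_WAIT_2", "recv_fin"): "TIME_WAIT",       # send ACK
    ("CLOSING", "recv_ack"): "TIME_WAIT",
    ("TIME_WAIT", "timeout"): "CLOSED",
    ("CLOSE_WAIT", "close"): "LAST_ACK",           # send FIN
    ("LAST_ACK", "recv_ack"): "CLOSED",
}


def walk(start, events):
    """Return the list of states visited, starting from `start`."""
    path = [start]
    for event in events:
        path.append(TRANSITIONS[(path[-1], event)])
    return path


# A client that opens actively and closes first.
client = walk("CLOSED", ["active_open", "recv_syn_ack", "close",
                         "recv_ack", "recv_fin", "timeout"])
# A server that opens passively and closes after the client's FIN.
server = walk("CLOSED", ["passive_open", "recv_syn", "recv_ack",
                         "recv_fin", "close", "recv_ack"])
print(" -> ".join(client))
print(" -> ".join(server))
```

Feeding the table a simultaneous close, `walk("ESTABLISHED", ["close", "recv_fin", "recv_ack"])`, passes through CLOSING and ends in TIME_WAIT, which is one way to explore the TIME_WAIT questions above.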
TCP Sliding Window Protocol
  Sequence numbers
    Indices into byte stream
  ACK sequence number
    Actually next byte expected as opposed to last byte received
  Advertised window
    Enables dynamic receive window size
  Receive buffers
    Data ready for delivery to application until requested
    Out-of-order data out to maximum buffer capacity
  Sender buffers
    Unacknowledged data
    Unsent data out to maximum buffer capacity

TCP Sliding Window Protocol - Sender Side
  LastByteAcked <= LastByteSent
  LastByteSent <= LastByteWritten
  Buffer bytes between LastByteAcked and LastByteWritten
  [Figure: sender buffer showing the first unacknowledged byte, the last byte sent, the advertised window, and data available but outside the window]

TCP Sliding Window Protocol - Receiver Side
  LastByteRead < NextByteExpected
  NextByteExpected <= LastByteRcvd + 1
  Buffer bytes between NextByteRead and LastByteRcvd
  [Figure: receiver buffer showing the advertised window and buffered, out-of-order data]

TCP ACK generation - 1
  Event: arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed
  Action: delayed ACK. Wait up to 500 ms for next segment.

TCP ACK generation - 2
  Event: arrival of in-order segment with expected seq #. One other segment has ACK pending
  Action: immediately send single cumulative ACK, ACKing both in-order segments

TCP ACK generation - 3
  Event: arrival of out-of-order segment with higher-than-expected seq #. Gap detected
  Action: immediately send duplicate ACK, indicating seq # of next expected byte
TCP ACK generation - 4
  Event: arrival of segment that partially or completely fills gap
  Action: immediately send ACK, provided that segment starts at lower end of gap

TCP ACK generation [RFC 1122, RFC 2581]
  Event at receiver: arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed
  TCP receiver action: delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK

  Event at receiver: arrival of in-order segment with expected seq #. One other segment has ACK pending
  TCP receiver action: immediately send single cumulative ACK, ACKing both in-order segments

  Event at receiver: arrival of out-of-order segment with higher-than-expected seq #. Gap detected
  TCP receiver action: immediately send duplicate ACK, indicating seq # of next expected byte

  Event at receiver: arrival of segment that partially or completely fills gap
  TCP receiver action: immediately send ACK, provided that segment starts at lower end of gap

Fast Retransmit
  What's the problem with time-out?
    Time-out period often relatively long
  Detect lost segments via duplicate ACKs
    Sender often sends many segments back-to-back
    If segment is lost, there will likely be many duplicate ACKs
  If sender receives 3 ACKs for the same data, it supposes that the segment after the ACKed data was lost
    Fast retransmit: resend segment before timer expires
  Why 3?

Fast retransmit algorithm:
  event: ACK received, with ACK field value of y
    if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
            start timer
    }
    else {    /* a duplicate ACK for an already ACKed segment */
        increment count of dup ACKs received for y
        if (count of dup ACKs received for y == 3) {
            resend segment with sequence number y    /* fast retransmit */
        }
    }

A 4th situation?
  [Figure: sender sliding window (first unacknowledged byte, last byte sent, data available but outside window) and receiver side (next byte to be read by application, buffered out-of-order data). Prompts on the slide: What if? Avoid?]
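To see why a single lost segment produces the duplicate ACKs that fast retransmit relies on, the receiver's cumulative ACK rule and the sender's duplicate-ACK counter from the slides above can be simulated together. This is a simplified sketch under stated assumptions: the names `cumulative_acks` and `FastRetransmitSender` are hypothetical, every segment has the same size, the receiver ACKs each arrival immediately (no delayed ACKs), and timers are omitted.

```python
def cumulative_acks(arrivals, seg_size, start=0):
    """Return the cumulative ACK the receiver sends after each arrival.

    `arrivals` lists the starting sequence number of each segment in the
    order it arrives. Assumes immediate ACKs (no delayed ACK timer).
    """
    received = set()
    next_expected = start
    acks = []
    for seq in arrivals:
        received.add(seq)
        # Advance past every in-order segment now buffered.
        while next_expected in received:
            next_expected += seg_size
        acks.append(next_expected)  # ACK carries the next byte expected
    return acks


class FastRetransmitSender:
    """Sender-side duplicate ACK counter following the slide pseudocode."""

    def __init__(self, send_base=0):
        self.send_base = send_base
        self.dup_count = 0
        self.resent = []  # sequence numbers resent by fast retransmit

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y      # new data acknowledged
            self.dup_count = 0
        else:
            self.dup_count += 1     # duplicate ACK for already ACKed data
            if self.dup_count == 3:
                self.resent.append(y)  # fast retransmit


# Segment 100 is lost; segments 0, 200, 300, 400 arrive.
acks = cumulative_acks([0, 200, 300, 400], seg_size=100)
sender = FastRetransmitSender(send_base=0)
for y in acks:
    sender.on_ack(y)
print(acks, sender.resent)
```

In this run the first ACK for byte 100 is new and the next three are duplicates, so the third duplicate triggers the resend of the segment starting at 100 without waiting for the timer.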
Flow Control vs. Congestion Control
  Flow control
    Preventing senders from overrunning the capacity of the receivers
  Congestion control
    Preventing too much data from being injected into the network, causing switches or links to become overloaded
  TCP provides both
    Flow control based on advertised window
    Congestion control discussed later in class

Receiving side
  Receive buffer size = MaxRcvBuffer
  LastByteRcvd - LastByteRead <= MaxRcvBuffer
  AdvertisedWindow = MaxRcvBuffer - (NextByteExpected - NextByteRead)
  Shrinks as data arrives and grows as the application consumes data
Sending side
  Send buffer size = MaxSendBuffer
  LastByteSent - LastByteAcked <= AdvertisedWindow
  EffectiveWindow = AdvertisedWindow - (LastByteSent - LastByteAcked)
  EffectiveWindow > 0 to send data
  LastByteWritten - LastByteAcked <= MaxSendBuffer
  Block sender if (LastByteWritten - LastByteAcked) + y > MaxSendBuffer, where y is the number of bytes the application is trying to write

Problem: Slow receiver application
  Advertised window goes to 0
  Sender cannot send more data
  Non-data packets used to update window
  Receiver may not spontaneously generate update or update may be lost
Solution
  Sender periodically sends 1-byte segment, ignoring advertised window of 0
  Eventually window opens
  Sender learns of opening from next ACK of 1-byte segment

Problem: Application delivers tiny pieces of data to TCP
  Example: telnet in character mode
  Each piece sent as a segment, returned as ACK
  Very inefficient
Solution
  Delay transmission to accumulate more data
  Nagle's algorithm
    Send first piece of data
    Accumulate data until first piece ACK'd
    Send accumulated data and restart accumulation
    Not ideal for some traffic (e.g. mouse motion)

Problem: Slow application reads data in tiny pieces
  Receiver advertises tiny window
  Sender fills tiny window
  Known as silly window syndrome
Solution
  Advertise window opening only when MSS or 1/2 of buffer is available
  Sender delays sending until window is MSS or 1/2 of receiver's buffer (estimated)

TCP Bit Allocation Limitations
  Sequence numbers vs. packet lifetime
    Assumed that IP packets live less than 60 seconds
    Can we send 2^32 bytes in 60 seconds?
    Requires about 573 Mbps, less than an STS-12 line (622 Mbps)
  Advertised window vs. delay-bandwidth
    Only 16 bits for advertised window
    Cross-country RTT = 100 ms
    Adequate for only 5.24 Mbps!
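The receiving-side and sending-side window relations from the flow control slides above translate directly into code. The sketch below is illustrative only: the function names are hypothetical, the arguments mirror the slide variables, and the sample byte counts are made up for the example.

```python
def advertised_window(max_rcv_buffer, next_byte_expected, next_byte_read):
    """AdvertisedWindow = MaxRcvBuffer - (NextByteExpected - NextByteRead)."""
    return max_rcv_buffer - (next_byte_expected - next_byte_read)


def effective_window(advertised, last_byte_sent, last_byte_acked):
    """EffectiveWindow = AdvertisedWindow - (LastByteSent - LastByteAcked)."""
    return advertised - (last_byte_sent - last_byte_acked)


def must_block_writer(max_send_buffer, last_byte_written, last_byte_acked, y):
    """True if writing y more bytes would overflow the send buffer."""
    return (last_byte_written - last_byte_acked) + y > max_send_buffer


# Receiver with a 4096-byte buffer holding 2000 unread in-order bytes.
adv = advertised_window(4096, next_byte_expected=3001, next_byte_read=1001)
# Sender with 1000 bytes in flight against that advertised window.
eff = effective_window(adv, last_byte_sent=5000, last_byte_acked=4000)
print(adv, eff)  # the sender may send only while eff > 0
```

When the application stops reading, `next_byte_expected - next_byte_read` grows until the advertised window reaches 0, which is the slow-receiver case that the 1-byte probe segment addresses.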
TCP Sequence Numbers (32-bit)
  Link       Speed      Time until wrap around
  T1         1.5 Mbps   6.4 hours
  Ethernet   10 Mbps    57 minutes
  T3         45 Mbps    13 minutes
  FDDI       100 Mbps   6 minutes
  STS-3      155 Mbps   4 minutes
  STS-12     622 Mbps   55 seconds
  STS-24     1.2 Gbps   28 seconds

TCP Advertised Window (16-bit)
  Link       Speed      Max RTT
  T1         1.5 Mbps   350 ms
  Ethernet   10 Mbps    52.4 ms
  T3         45 Mbps    11.6 ms
  FDDI       100 Mbps   5.2 ms
  STS-3      155 Mbps   3.4 ms
  STS-12     622 Mbps   843 µs
  STS-24     1.2 Gbps   437 µs
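The entries in both tables follow from two constants: a 32-bit sequence space of 2^32 bytes and a 16-bit advertised window of 2^16 bytes. The short sketch below recomputes a few of them; the function names are hypothetical, and link speeds are taken as the nominal values shown in the tables.

```python
SEQ_SPACE_BYTES = 2 ** 32    # 32-bit sequence numbers
MAX_WINDOW_BYTES = 2 ** 16   # 16-bit advertised window


def wrap_time_seconds(bandwidth_bps):
    """Time to send 2^32 bytes, i.e. until sequence numbers wrap around."""
    return SEQ_SPACE_BYTES * 8 / bandwidth_bps


def max_rtt_seconds(bandwidth_bps):
    """Largest RTT at which one full window still keeps the link busy."""
    return MAX_WINDOW_BYTES * 8 / bandwidth_bps


def max_throughput_bps(rtt_seconds):
    """Best throughput when at most one window is sent per RTT."""
    return MAX_WINDOW_BYTES * 8 / rtt_seconds


print(wrap_time_seconds(10e6) / 60)     # Ethernet: about 57 minutes
print(max_rtt_seconds(10e6) * 1000)     # Ethernet: about 52.4 ms
print(max_throughput_bps(0.100) / 1e6)  # 100 ms RTT: about 5.24 Mbps
```

The same window-per-RTT bound explains the 5.24 Mbps figure quoted for a 100 ms cross-country RTT.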