Impact of transmission errors on TCP performance 1 Outline Impact of transmission errors on TCP performance Approaches to improve TCP performance Classification Discussion of selected approaches 2 Random Errors If number of errors is small, they may be corrected by an error correcting code Excessive bit errors result in a packet being discarded, possibly before it reaches the transport layer 3 1
Random Errors May Cause Fast Retransmit Fast retransmit results in retransmission of lost packet reduction in congestion window Reducing congestion window in response to errors is unnecessary Reduction in congestion window reduces the throughput 4 Congestion Response Appropriate? On a CDMA channel, errors occur due to interference from other user, and due to noise Interference due to other users is an indication of congestion. If such interference causes transmission errors, it is appropriate to reduce congestion window If noise causes errors, it is not appropriate to reduce window When a channel is in a bad state for a long duration, it might be better to let TCP backoff, so that it does not unnecessarily attempt retransmissions while the channel remains in the bad state 5 Burst Errors May Cause Timeouts If wireless link remains unavailable for extended duration, a window worth of data may be lost driving through a tunnel passing a truck Timeout results in slow start Slow start reduces congestion window to 1 MSS, reducing throughput Reduction in window in response to errors unnecessary 6 2
Random Errors May Also Cause Timeout Multiple packet losses in a window can result in timeout when using TCP-Reno (and to a lesser extent when using SACK) 7 Impact of Transmission Errors TCP cannot distinguish between packet losses due to congestion and transmission errors Unnecessarily reduces congestion window Throughput suffers 8 Outline Impact of transmission errors on TCP performance Approaches to improve TCP performance Classification Discussion of selected approaches 9 3
Classification of Schemes to Improve Performance of TCP in Presence of Transmission Errors 10 Improving TCP Performance in Presence of Errors Classification 1 Classification based on nature of actions taken to improve performance Hide error losses from the sender if sender is unaware of the packet losses due to errors, it will not reduce congestion window Let sender know, or determine, cause of packet loss if sender knows that a packet loss is due to errors, it will not reduce congestion window 11 Improving TCP Performance in Presence of Errors Classification 2 Classification based on where modifications are needed At the sender node only At the receiver node only At intermediate node(s) only Combinations of the above 12 4
Ideal Behavior Ideal TCP behavior: Ideally, the TCP sender should simply retransmit a packet lost due to transmission errors, without taking any congestion control actions Such a TCP referred to as Ideal TCP Ideal TCP typically not realizable Ideal network behavior: Transmission errors should be hidden from the sender -- the errors should be recovered transparently and efficiently Proposed schemes attempt to approximate one of the above two ideals 13 Outline Impact of transmission errors on TCP performance Approaches to improve TCP performance Classification Discussion of selected approaches 14 Selected Schemes to Improve Performance of TCP in Presence of Transmission Errors 15 5
Caveat When describing various schemes, only the major features are presented Often, some additional features are present in these schemes, to optimize their performance We will not cover all the details, only the most relevant ones 16 Various Schemes Link level mechanisms Split connection approach TCP-Aware link layer TCP-Unaware approximation of TCP-aware link layer Explicit notification Receiver-based discrimination Sender-based discrimination 17 Link Level Mechanisms 18 6
Link Layer Mechanisms Forward Error Correction Forward Error Correction (FEC) can be used to correct small number of errors Correctable errors hidden from the TCP sender FEC incurs overhead even when errors do not occur Adaptive FEC schemes can reduce the overhead by choosing appropriate FEC dynamically 19 Link Layer Mechanisms Link Level Retransmissions Link level retransmission schemes retransmit a packet at the link layer, if errors are detected Retransmission overhead incurred only if errors occur unlike FEC overhead 20 Link Layer Mechanisms In general Use FEC to correct a small number of errors Use link level retransmission when FEC capability is exceeded 21 7
Link Level Retransmissions Link layer state TCP connection application application application transport transport transport network link network link rxmt network link physical physical physical wireless 22 Link Level Retransmissions Issues How many times to retransmit at the link level before giving up? What triggers link level retransmissions? Link layer timeout mechanism Link level acks (negative acks, dupacks, ) Other mechanisms (e.g., Snoop, as discussed later) How much time is required for a link layer retransmission? Small fraction of end-to-end TCP RTT Large fraction/multiple of end-to-end TCP RTT 23 Link Level Retransmissions Issues Should the link layer deliver packets as they arrive, or deliver them in-order? Link layer may need to buffer packets and reorder if necessary so as to deliver packets inorder 24 8
Link Level Retransmissions Issues Retransmissions can cause head-of-the-line blocking Receiver 1 Base station Receiver 2 Although link to receiver 1 may be in a bad state, the link to receiver 2 may be in a good state Retransmissions to receiver 1 are lost, and also block a packet from being sent to receiver 2 25 Link Level Retransmissions Issues Retransmissions can cause congestion losses Receiver 1 Base station Receiver 2 Attempting to retransmit a packet at the front of the queue, effectively reduces the available bandwidth, potentially making the queue at base station longer If the queue gets full, packets may be lost, indicating congestion to the sender Is this desirable or not? 26 Large TCP Retransmission Timeout Intervals Timeout interval may actually be larger than RTO Retransmission timer reset on an ack If the ack d packet and next packet were transmitted in a burst, next packet gets an additional RTT before the timer will go off data 1 2 Timeout = RTO ack Reset, Timeout = RTO Effectively, Timeout = RTT of packet 1 + RTO 27 9
Large TCP Retransmission Timeout Intervals Good for reducing interference with link level retransmits Bad for recovery from congestion losses Need a timeout mechanism that responds appropriately for both types of losses Open problem 28 Link Level Retransmissions Selective repeat protocols can deliver packets out of order Significantly out-of-order delivery can trigger TCP fast retransmit Redundant retransmission from TCP sender Reduction in congestion window Example: Receipt of packets 3,4,5 triggers dupacks Lost packet Retransmitted packet 6 2 5 4 3 2 1 29 Link Level Retransmissions In-order delivery To avoid unnecessary fast retransmit, link layer using retransmission should attempt to deliver packets almost in-order 6 5 4 3 2 2 1 6 5 2 4 3 2 1 30 10
Link Level Retransmissions In-order delivery Not all connections benefit from retransmissions or ordered delivery audio Need to be able to specify requirements on a perpacket basis Should the packet be retransmitted? How many times? Enforce in-order delivery? Need a standard mechanism to specify the requirements open issue (IETF PILC working group) 31 Adaptive Link Layer Strategies Adaptive protocols attempt to dynamically choose: FEC code retransmission limit frame size 32 Link Layer Schemes: Summary When is a reliable link layer beneficial to TCP performance? if it provides almost in-order delivery and TCP retransmission timeout large enough to tolerate additional delays due to link level retransmits 33 11
Link Layer Schemes: Classification Hide wireless losses from TCP sender Link layer modifications needed at both ends of wireless link TCP need not be modified 34 Various Schemes Link level mechanisms Split connection approach TCP-Aware link layer TCP-Unaware approximation of TCP-aware link layer Explicit notification Receiver-based discrimination Sender-based discrimination 35 Split Connection Approach 12
Split Connection Approach End-to-end TCP connection is broken into one connection on the wired part of route and one over wireless part of the route A single TCP connection split into two TCP connections if wireless link is not last on route, then more than two TCP connections may be needed 37 Split Connection Approach Connection between wireless host MH and fixed host FH goes through base station BS FH-MH = FH-BS + BS-MH FH BS MH Fixed Host Base Station Mobile Host 38 Split Connection Approach Split connection results in independent flow control for the two parts Flow/error control protocols, packet size, time-outs, may be different for each part FH BS MH Fixed Host Base Station Mobile Host 39 13
Split Connection Approach Per-TCP connection state TCP connection TCP connection application transport application transport rxmt application transport network network network link link link physical physical physical wireless Split Connection Approach : Classification Hides transmission errors from sender Primary responsibility at base station If specialized transport protocol used on wireless, then wireless host also needs modification 41 Split Connection Approach : Advantages BS-MH connection can be optimized independent of FH-BS connection Different flow / error control on the two connections Local recovery of errors Faster recovery due to relatively shorter RTT on wireless link Good performance achievable using appropriate BS-MH protocol Standard TCP on BS-MH performs poorly when multiple packet losses occur per window (timeouts can occur on the BS-MH connection, stalling during the timeout interval) Selective acks improve performance for such cases 42 14
Split Connection Approach : Disadvantages End-to-end semantics violated ack may be delivered to sender, before data delivered to the receiver May not be a problem for applications that do not rely on TCP for the end-to-end semantics 39 FH BS 38 37 MH 43 Split Connection Approach : Disadvantages BS retains hard state BS failure can result in loss of data (unreliability) If BS fails, packet will be lost Because it is ack d to sender, the sender does not buffer 39 FH BS 38 37 MH 44 Split Connection Approach : Disadvantages BS retains hard state. Hand-off latency increases due to state transfer Data that has been ack d to sender, must be moved to new base station FH 39 BS 38 37 MH 39 MH Hand-off New base station 45 15
Split Connection Approach : Disadvantages Buffer space needed at BS for each TCP connection BS buffers tend to get full, when wireless link slower (one window worth of data on wired connection could be stored at the base station, for each split connection) Window on BS-MH connection reduced in response to errors may not be an issue for wireless links with small delay-bw product 46 Split Connection Approach : Disadvantages Extra copying of data at BS copying from FH-BS socket buffer to BS-MH socket buffer increases end-to-end latency May not be useful if data and acks traverse different paths (both do not go through the base station) Example: data on a satellite wireless hop, acks on a dial-up channel data FH MH ack 47 Various Schemes Link layer mechanisms Split connection approach TCP-Aware link layer TCP-Unaware approximation of TCP-aware link layer Explicit notification Receiver-based discrimination Sender-based discrimination 48 16
TCP-Aware Link Layer 49 Snoop Protocol Retains local recovery of Split Connection approach and link level retransmission schemes Improves on split connection end-to-end semantics retained soft state at base station, instead of hard state 50 Snoop Protocol TCP connection Per TCP-connection state application application application transport transport transport network link network link rxmt network link physical physical physical FH BS wireless MH 51 17
Snoop Protocol Buffers data packets at the base station BS to allow link layer retransmission When dupacks received by BS from MH, retransmit on wireless link, if packet present in buffer Prevents fast retransmit at TCP sender FH by dropping the dupacks at BS FH BS MH 52 Snoop : Example 35 TCP state maintained at link layer 37 38 FH 39 38 37 BS 34 MH Example assumes delayed ack - every other packet ack d 53 Snoop : Example 35 39 37 38 41 39 38 34 54 18
Snoop : Example 37 38 39 42 41 39 dupack Duplicate acks are not delayed 55 Snoop : Example 37 38 41 39 43 42 41 Duplicate acks 56 Snoop : Example 37 38 39 41 42 FH 44 43 BS Discard dupack 37 41 MH Dupack triggers retransmission of packet 37 from base station BS needs to be TCP-aware to be able to interpret TCP headers 57 19
Snoop : Example 37 38 39 41 42 43 45 44 42 37 58 Snoop : Example 37 38 39 41 42 43 44 46 45 43 42 41 TCP sender does not fast retransmit 59 Snoop : Example 37 38 39 41 42 43 44 45 47 46 44 43 41 TCP sender does not fast retransmit 60 20
Snoop : Example 42 43 45 46 44 FH 48 47 41 BS 45 44 43 MH 61 Snoop 2000000 1600000 bits/sec 1200000 800000 0000 0 base TCP Snoop no error 256K 128K 64K 32K 16K 1/error rate (in bytes) 2 Mbps Wireless link 62 Snoop Protocol When Beneficial? Snoop prevents fast retransmit from sender despite transmission errors, and out-of-order delivery on the wireless link OOO delivery causes fast retransmit only if it results in at least 3 dupacks If wireless link level delay-bandwidth product is less than 4 packets, a simple (TCP-unaware) link level retransmission scheme can suffice Since delay-bandwidth product is small, the retransmission scheme can deliver the lost packet without resulting in 3 dupacks from the TCP receiver 63 21
Snoop Protocol : Classification Hides wireless losses from the sender Requires modification to only BS (networkcentric approach) 64 Snoop Protocol : Advantages High throughput can be achieved performance further improved using selective acks Local recovery from wireless losses Fast retransmit not triggered at sender despite out-of-order link layer delivery End-to-end semantics retained Soft state at base station loss of the soft state affects performance, but not correctness 65 Snoop Protocol : Disadvantages Link layer at base station needs to be TCPaware Not useful if TCP headers are encrypted (IPsec) Cannot be used if TCP data and TCP acks traverse different paths (both do not go through the base station) 66 22
Various Schemes Link layer mechanisms Split connection approach TCP-Aware link layer TCP-Unaware approximation of TCP-aware link layer Explicit notification Receiver-based discrimination Sender-based discrimination 67 Sender-Based Discrimination Scheme 68 Sender-Based Discrimination Scheme Sender can attempt to determine cause of a packet loss If packet loss determined to be due to errors, do not reduce congestion window Sender can only use statistics based on round-trip times, window sizes, and loss pattern unless network provides more information (example: explicit loss notification) 69 23
Heuristics for Congestion Avoidance throughput knee cliff RTT load load 70 Heuristics for Congestion Avoidance Define condition C as a function of congestion window size and observed RTTs Condition C evaluated when a new RTT is calculated condition C typically evaluates to 2 or 3 possible values for now assume 2 values: TRUE or FALSE If (C == True) reduce congestion window Several proposals for condition C 71 Heuristics for Congestion Avoidance TCP Vegas TCP Vegas expected throughput ET = W(i) / RTTmin actual throughput AT = W(i) / RTT(i) Condition C = ( ET-AT > beta) 72 24
Heuristics for Congestion Avoidance TCP Vegas Record latest value evaluated for condition C When a packet loss is detected if last evaluation of C is TRUE, assume packet loss is due to congestion else assume that packet loss is due to transmission errors If packet loss determined to be due to errors, do not reduce congestion window 73 Heuristics for Congestion Avoidance TCP Westwood Sender-side only modification of TCP stack for enhanced congestion control Basic mechanisms: 1) end-to-end estimation of available bandwidth 2) faster recovery 74 E2E bandwidth estimation packets SERVER BWE Filter Network CLIENT acks acks The rate of returning ACKS is used to estimate the available bandwidth 75 25
E2E bandwidth estimation d j bandwidth sample b j = t j t j 1 filtered value 2τ f j b j + b j bˆ j = bˆ 1 j 1 + j 2τ f + j 2τ f + j j = t j t j 1 1/τ F =Cut-off frequency 76 TCP Reno cwnd ssthresh Fast recovery Timeout Slow start Congestion Avoidance Slow start time 77 TCP Westwood cwnd BWE*RTTmin. ssthresh. Faster Recovery Timeout BWE*RTTmin. Slow Congestion start Avoidance Slow start time Faster recovery sets cwnd=ssthr=bwe*rttmin 78 26
Faster recovery if (3 DUPACKs are received) ssthresh= (BWE*RTTmin)/ seg_size; if (cwin > ssthresh) /* congestion avoid.*/ cwin = ssthresh; endif endif 79 Sender-Based Heuristics : Disadvantage Does not work quite well enough as yet!! Reason Statistics collected by the sender garbled by other traffic on the network Not much correlation between observed shortterm statistics, and onset of congestion 80 Sender-Based Heuristics : Advantages Only sender needs to be modified Needs further investigation to develop better heuristics investigate longer-term heuristics 81 27
Why do Statistical Technique Perform Poorly? The techniques we evaluated use simple statistics on RTT and window size W to draw conclusions about state of the network Unfortunately, correlation between RTT and W is often weak 82 Statistical Techniques Future Work Other statistical measures? Mechanisms that achieve good TCP throughput despite not-too-good diagnostic accuracy 83 TCP in Presence of Transmission Errors Summary Many techniques have been proposed, and several approaches perform well in many environments Recommendation: Prefer end-to-end techniques End-to-end techniques are those which do not require TCP-Specific help from lower layers Lower layers may help improve TCP performance without taking TCP-specific actions. Examples: Semi-reliable link level retransmission schemes Explicit notification 84 28