TCP/IP Performance ITL
Protocol Overview E-Mail HTTP (WWW) Remote Login File Transfer TCP UDP IP ICMP ARP RARP (Auxiliary Services) Ethernet, X.25, HDLC etc. ATM 4/30/2002 Hans Kruse & Shawn Ostermann, Ohio University 2
Connection Types in TCP/IP
  Transport layer
    TCP: connection-oriented
    UDP: connectionless
  Network layer: connectionless
  Data link layer and physical network: depends on the network
Real Networks
  Include many different types of circuits
    Different speeds
    Some LAN, some wide-area connections
  Rely on routers to connect the different subnetworks
  Routers are not expected to have detailed knowledge about the traffic flows they are handling
Network Knowledge and Lack Thereof: End Systems
  Know the applications they are running
  Often know the network capacity they would like to have
  Do not know the actual network capacity available
  Do not know the competition, i.e. other network users' traffic
Network Knowledge and Lack Thereof: Routers
  Know the capacity of the links they are attached to
  Do not know much about the network farther away from them
  Do not know the complete path taken by the packets handled in the router
  Do not know (from the network traffic itself) what the applications' needs are
Routers
  Must cope with packet flows that may exceed the available capacity on their outbound route
    Short-term, this indicates randomness in the traffic, and we need to deal with it
    If the overload persists long-term, we call it congestion, and we would like it to go away
  Routers use queues to handle the short-term variations
  Long-term overload??
Applications
  Should, and often can, adapt to the available capacity
  Should be fair in their use of resources, or
  Should identify themselves as high-capacity users (and compensate the network operator accordingly)
  Need information about the network and the capacity it can deliver
In the Ideal Case
  Use a control protocol to communicate this information between applications and the network
  Standard procedure in circuit-switched and virtual-circuit networks
    Telephone network
    Frame Relay and ATM
  Increases overall complexity
  Can provide a wide range of services really well
The Case of the Internet
  Successful because
    A transparent network encourages application development and deployment
    The network elements are simple
  Reasonably low complexity
  Great flexibility
  Not much capability to communicate network information to applications
Performance Issues
  Long term
    Increase complexity and add QoS protocol layers
    Throw capacity at the network faster than applications require it (good luck...)
  Short term
    Implicit communication of congestion in the TCP protocol
    Network performs many different functions, some better than others
Application View
  Network attachment over which I dispatch my packets: known
  Intermediate network
    Contains many links and queues
    Application sees an overall latency, or delay between packet dispatch and receipt
    More precisely, applications can discover round trip times
Frame Transmission Time
  Depends on
    Frame length
    Channel bit rate
  $t_{\text{frame}} = \frac{L}{R}$
    L = frame length in bits
    R = channel data rate in bits per second
Frame Propagation Time
  Depends on
    Physical distance between stations
    Signal propagation speed: $2 \times 10^8$ to $3 \times 10^8$ meters/second
  $t_{\text{prop}} = \frac{d}{v}$
    d = distance between stations in meters
    v = speed of signal propagation in meters per second
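The two delay formulas, frame length over bit rate and distance over propagation speed, can be checked with a few lines of Python. This is a minimal sketch: the function names, the 1500-byte frame, and the 1000 km link are illustrative assumptions, not values from the slides.

```python
def transmission_time(frame_bits: float, rate_bps: float) -> float:
    """Time to clock a frame onto the channel: L / R."""
    return frame_bits / rate_bps


def propagation_time(distance_m: float, speed_mps: float = 2e8) -> float:
    """Time for the signal to cross the link: d / v.

    The default speed is the low end of the 2e8 to 3e8 m/s range.
    """
    return distance_m / speed_mps


# Assumed example: a 1500-byte (12,000-bit) frame on a 10 Mbps channel,
# sent over a 1000 km link.
t_frame = transmission_time(12_000, 10e6)
t_prop = propagation_time(1_000_000)
print(f"t_frame = {t_frame * 1e3:.2f} ms, t_prop = {t_prop * 1e3:.2f} ms")
```

On a long link the propagation term dominates, which is why window size matters more than frame size for throughput over such paths.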
Round Trip Time
  Combination of
    Frame transmission times on intermediate links
    Frame propagation times on all links
    Queue wait times in all intermediate nodes
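As a rough illustration of how these three components add up, the sketch below sums them per hop in each direction. The `Hop` class, its field names, and the assumption that the acknowledgement returns over the same hops with the same queue waits are simplifications made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hop:
    rate_bps: float            # channel data rate of this link
    distance_m: float          # physical length of this link
    queue_wait_s: float = 0.0  # time spent queued before this link


def one_way_delay(hops: List[Hop], frame_bits: float, speed_mps: float = 2e8) -> float:
    """Sum transmission, propagation, and queue wait over every hop."""
    return sum(
        frame_bits / h.rate_bps + h.distance_m / speed_mps + h.queue_wait_s
        for h in hops
    )


def round_trip_time(hops: List[Hop], data_bits: float, ack_bits: float) -> float:
    """Data frame out plus acknowledgement back (symmetric path assumed)."""
    return one_way_delay(hops, data_bits) + one_way_delay(hops, ack_bits)


# Assumed path: a short 10 Mbps LAN hop followed by a 1000 km, 1.5 Mbps hop
# with 2 ms of queueing.
path = [Hop(10e6, 100), Hop(1.5e6, 1_000_000, queue_wait_s=0.002)]
print(f"RTT = {round_trip_time(path, 12_000, 320) * 1e3:.2f} ms")
```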
Sliding Window
  [Figure: sliding-window timing diagram showing frames 1, 2, 3, ..., M; labels: Idle Time, One Cycle]
In Practice
  How is the sliding window mechanism used in TCP?
  What control do we have over performance parameters?
  Starting with a quick TCP review...
UDP Header
  Fields: Source Port, Destination Port, Length, Checksum
TCP Header
  Fields: Source Port, Destination Port, Sequence Number, Acknowledgement Number, misc, Flags, Window (flow control), Checksum, Urgent, Options
TCP Connection Setup
  Three-way handshake
    Send SYN packet
    Wait for peer to return a SYN/ACK packet
    Acknowledge the SYN/ACK packet
TCP Connection Termination
  Send a FIN packet
  Wait to receive acknowledgement of FIN
TCP Data Exchange
  Sequence numbers (sliding window)
    Arbitrary initial setting
    Label the first byte of the segment
  Acknowledgements
    Indicate the next byte the receiver is looking for; all previous bytes have been received
TCP Segment Size
  Originally unlimited
    IP fragments segments that are too large
    Turned out to be very inefficient
  SYN packet can carry the MSS (Maximum Segment Size) option
    Must be approved in the SYN/ACK
    Default used if the option is not present
TCP Sliding Window Operation
  Sender
    Window runs from snd.una to snd.una + snd.wnd, with snd.nxt in between
    snd.wnd is local to the sender
  Receiver
    Window runs from rcv.nxt to rcv.nxt + rcv.wnd
    rcv.wnd: must tell the sender this value
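The sender-side variables determine how much more data may be dispatched at any moment. The sketch below computes that amount; the function name is hypothetical, and sequence-number wraparound is deliberately ignored to keep the arithmetic visible.

```python
def usable_window(snd_una: int, snd_nxt: int, snd_wnd: int) -> int:
    """Bytes the sender may still send before reaching snd.una + snd.wnd.

    snd_una: oldest unacknowledged sequence number
    snd_nxt: next sequence number to be sent
    snd_wnd: current send window in bytes
    Sequence-number wraparound is ignored in this sketch.
    """
    in_flight = snd_nxt - snd_una
    return max(0, snd_wnd - in_flight)


# Assumed state: 1000 bytes in flight against a 4000-byte window.
print(usable_window(snd_una=1001, snd_nxt=2001, snd_wnd=4000))  # 3000
```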
Slow Start Congestion Control
  [Figure: slow-start timing diagram; labels: segment numbers 1 to 4, Idle Time]
  Note: recent TCP amendments permit more than 1 initial segment
  Window doubles in each cycle
The Congestion Collapse Problem
  Original TCP specs used the window for flow control, and retransmission after 2 round trip times
  Congestion of a link causes the timers to go off before an ack can be returned
  The network goes into steady-state congestion where every segment is transmitted about three times
Congestion Issues
  Slow start, new connection
    Set send window to n*MSS (n <= 4)
    Increase the window by MSS for each ack received
    Exponential increase in send window size
  What is the limit?
    Window size reached before full utilization
    Path is overloaded and an intermediate router discards one or more packets
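To see why adding one MSS per ack yields exponential growth, the sketch below steps through slow start one round trip at a time. It assumes one ack per segment (no delayed acks) and no loss; the function and parameter names are illustrative.

```python
from typing import List


def slow_start_rounds(mss: int, initial_segments: int, limit_bytes: int) -> List[int]:
    """Return the send window at the start of each round trip.

    Every segment in flight is assumed to produce exactly one ack, and each
    ack grows the window by one MSS, so the window doubles per round trip.
    """
    window = initial_segments * mss
    history = [window]
    while window < limit_bytes:
        acks = window // mss   # one ack per segment sent this round
        window += acks * mss   # +1 MSS per ack
        history.append(window)
    return history


print(slow_start_rounds(mss=1000, initial_segments=1, limit_bytes=8000))
# [1000, 2000, 4000, 8000]
```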
Congestion Issues, cont.
  Packet loss may occur due to actual errors or congestion
  TCP equates loss with congestion
  Congestion avoidance, timer back-off
    Reduce send window to 1/2 of previous size for each retransmit (exponential back-off)
    After a segment is retransmitted, set the new RTO timer for that segment to 2*RTO, up to a hard upper bound (2*MSL, Maximum Segment Lifetime)
    (RTO = Retransmit Time-Out)
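The two back-off rules can be written down directly. In this sketch the one-MSS floor on the window and the 240-second ceiling (2*MSL with an assumed MSL of 120 seconds) are assumptions chosen for illustration, as are the function names.

```python
def halve_window(window_bytes: int, mss: int) -> int:
    """Cut the send window in half after a retransmit.

    The floor of one MSS is an assumption of this sketch.
    """
    return max(window_bytes // 2, mss)


def backed_off_rto(rto_s: float, retransmits: int, max_rto_s: float = 240.0) -> float:
    """Double the RTO once per retransmission of the same segment.

    max_rto_s stands in for the hard upper bound; 240 s assumes MSL = 120 s.
    """
    return min(rto_s * 2 ** retransmits, max_rto_s)


print(halve_window(8000, 1000))  # 4000
print(backed_off_rto(1.0, 3))    # 8.0
print(backed_off_rto(1.0, 10))   # 240.0 (capped)
```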
Congestion Issues, cont.
  Slow start, after retransmission
    Exponential slow start up to 1/2 of the original window size
    Then increase the window by MSS for each send window acked without loss
    Linear increase in send window size
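The switch from exponential to linear growth can be sketched round by round. The name `threshold` for the half-window point, the restart from a single MSS, and the per-round granularity are assumptions of this example rather than details given on the slide.

```python
from typing import List


def window_after_loss(mss: int, window_at_loss: int, rounds: int) -> List[int]:
    """Window per round trip after a retransmission.

    Grows exponentially up to half the window in use when the loss occurred,
    then by one MSS for each full window acknowledged without loss.
    """
    threshold = max(window_at_loss // 2, mss)
    window = mss  # assumed restart point
    history = [window]
    for _ in range(rounds):
        if window < threshold:
            window = min(window * 2, threshold)  # exponential phase
        else:
            window += mss                        # linear phase
        history.append(window)
    return history


print(window_after_loss(mss=1000, window_at_loss=16000, rounds=5))
# [1000, 2000, 4000, 8000, 9000, 10000]
```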
What Can We Control?
  Vendors
    TCP implementation needs to follow the most recent guidelines
    TCP window size should be configurable
  Users
    Control the TCP window
      For each application (rare)
      For the entire workstation (more likely)
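For the per-application case, one common handle is the socket buffer size, which on most systems bounds the window a connection can advertise. The sketch below uses the standard `SO_RCVBUF` and `SO_SNDBUF` socket options; the helper name and the 94,000-byte request are assumptions, and the operating system may round or clamp the value it actually grants.

```python
import socket


def open_socket_with_buffers(buffer_bytes: int) -> socket.socket:
    """Create a TCP socket and request larger send and receive buffers.

    Set the options before connecting so they can influence what is
    negotiated at connection setup.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, buffer_bytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, buffer_bytes)
    return sock


sock = open_socket_with_buffers(94_000)
# Read back what the OS granted; it may differ from the request.
granted = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
print(f"requested 94000, granted {granted}")
sock.close()
```

Workstation-wide defaults are set through operating-system configuration instead and are not shown here.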
Tuning, cont.
  Network administrator
    Router queues
  [Figure: router queues between In and Out ports]
Optional Slides on TCP window operation
Example...
  Sender (sequence numbers 1001, 2001, 2501, 5001)
    1001 to 2001: sent but no ack received yet
    2001 to 2501: next segment to send
    2501 to 5001: available window for further sends
  Receiver (sequence numbers 1001, 1501, 5001)
    1001 to 1501: received and acked; not yet picked up by client
    1501 to 5001: available receive window space
Segment Dispatch
  Sender (sequence numbers 1001, 2001, 2501, 5001)
    1001 to 2001: sent but no ack received yet
    2001 to 2501: next segment to send
    2501 to 5001: available window for further sends
  Dispatch segment to IP
  Set RTO (Retransmit Time Out) timer
    Proportional to the Round Trip Time (RTT)
Segment Receipt with Pickup
  Receiver (sequence numbers 1001, 1501, 2001, 5001, 6001)
    Diagram labels: received and picked up by client; available receive window space
  Send Ack segment with Ack = 2001, Window = 4000
Segment Receipt w/o Pickup
  Receiver (sequence numbers 1001, 1501, 2001, 5001, 5501)
    Diagram labels: received and picked up by client; received but not picked up by client; available receive window space
  Send Ack packet with Ack = 2001, Window = 3500
Acknowledgement Receipt
  Sender before: 1001, 2001, 2501, 5001
  Sender after: 2001, 2501, 5001, 5501
  Segment received with Ack=2001, Win=3500
    Left window edge to 2001
    Right window edge to 5501
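The edge movement on this slide follows from two additions, sketched below. The function name is hypothetical, and the sketch ignores sequence-number wraparound and the handling of stale window updates.

```python
from typing import Tuple


def on_ack(snd_una: int, ack: int, win: int) -> Tuple[int, int]:
    """Return the new (left edge, right edge) of the send window.

    The left edge only moves forward, to the acknowledged byte; the right
    edge is the left edge plus the advertised window.
    """
    left = max(snd_una, ack)
    right = left + win
    return left, right


# Values from the slide: Ack=2001, Win=3500 arrives while snd.una is 1001.
print(on_ack(snd_una=1001, ack=2001, win=3500))  # (2001, 5501)
```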
Segment Receipt After Segment Loss
  Receiver (sequence numbers 1001, 1501, 2001, 5001, 5501)
    Diagram labels: received and picked up by client; received but not picked up by client; missing segment; last segment received
  Send a duplicate acknowledgement
    Send Ack packet with Ack = 2001, Window = 3500
Retransmission
  Sender (sequence numbers 2001, 2501, 5001, 5501)
  Highest Ack number received is 2001
    Duplicate Ack=2001 may have been received
  RTO timer for segment 2001 expires and 2001 is retransmitted
  Trigger congestion avoidance algorithm
  We really want to avoid this because RTO is large
Retransmit Timing and Window Size: Single Error
  BDP (Bandwidth Delay Product)
    Ethernet: 1 ms * 10 Mbps = 1250 bytes
    Satcom T1: 500 ms * 1.5 Mbps = 94 kbytes
  Assume window size = BDP
  RTO > 2*RTT
  Recovery ack after retransmit needs 1 RTT
  Channel idles for length of RTO (drained pipe)
  [Figure: sender window with sequence numbers 2001, 2501, 5000, 5501]
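The two bandwidth delay products quoted on this slide can be reproduced with one line of arithmetic: multiply the delay by the rate and divide by 8 to convert bits to bytes. The function name below is an assumption of this sketch.

```python
def bdp_bytes(delay_s: float, rate_bps: float) -> float:
    """Bandwidth delay product in bytes: delay times rate, divided by 8."""
    return delay_s * rate_bps / 8


ethernet = bdp_bytes(0.001, 10e6)  # about 1250 bytes
satcom_t1 = bdp_bytes(0.5, 1.5e6)  # about 93,750 bytes, i.e. roughly 94 kbytes
print(f"Ethernet: {ethernet:.0f} bytes, Satcom T1: {satcom_t1:.0f} bytes")
```

A window smaller than the BDP leaves the channel idle for part of every round trip, which is why the satellite link needs a far larger window than the LAN.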
Retransmission Timer Implementation
  Running estimate (based on acks) of
    Average RTT
    RTT variance factor
    Exclude retransmissions
  Set RTO to RTT times RTT variance factor (with a hard upper bound)
    Around 2 RTT for lightly loaded links
    As high as 16 RTT for congested links
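One simple way to keep such a running estimate is an exponentially weighted moving average, sketched below. The class name, the smoothing weight of 0.125, the fixed variance factor of 2, and the 240-second cap are all assumptions for illustration; real implementations derive the variance term from the measured samples rather than fixing it.

```python
class RtoEstimator:
    """Running RTT average with a multiplicative variance factor."""

    def __init__(self, first_rtt_s: float, alpha: float = 0.125,
                 variance_factor: float = 2.0, max_rto_s: float = 240.0):
        self.srtt = first_rtt_s              # smoothed (average) RTT
        self.alpha = alpha                   # weight given to each new sample
        self.variance_factor = variance_factor
        self.max_rto_s = max_rto_s           # hard upper bound

    def sample(self, rtt_s: float, retransmitted: bool = False) -> None:
        """Fold in a new RTT measurement taken from an ack.

        Samples for retransmitted segments are excluded because the ack
        cannot be matched to a specific transmission.
        """
        if retransmitted:
            return
        self.srtt = (1 - self.alpha) * self.srtt + self.alpha * rtt_s

    def rto(self) -> float:
        return min(self.variance_factor * self.srtt, self.max_rto_s)


est = RtoEstimator(first_rtt_s=0.1)
est.sample(0.2)
est.sample(5.0, retransmitted=True)  # ignored
print(f"srtt = {est.srtt:.4f} s, RTO = {est.rto():.4f} s")
```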
Window Scaling
  16-bit window field in the TCP header allows a maximum of 64 kbytes for the window
  RFC 1323 defines the window scaling option
    SYN segment suggests a scaling factor
    SYN/ACK approves
    All window advertisements are scaled by that factor prior to use in TCP
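In RFC 1323 the scaling factor is carried as a shift count of at most 14, so the effective window is the 16-bit field shifted left by that count. The sketch below shows the arithmetic; the function name is illustrative.

```python
def effective_window(window_field: int, shift_count: int) -> int:
    """Scale a 16-bit advertised window by a power-of-two shift count."""
    if not 0 <= window_field <= 0xFFFF:
        raise ValueError("window field must fit in 16 bits")
    if not 0 <= shift_count <= 14:
        raise ValueError("shift count must be between 0 and 14")
    return window_field << shift_count


print(effective_window(0xFFFF, 0))   # 65535, the unscaled maximum
print(effective_window(0xFFFF, 14))  # 1073725440, about 1 Gbyte
```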
Window Scaling, cont.
  Large windows cause an adjunct problem: sequence number reuse
  RFC 1323 limits the window to about 1 Gbyte to fit within the sequence number space
  OC-12 will use all sequence numbers in about 28 sec
  Segments can live in the network for 120 sec
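The 28-second figure can be reproduced approximately. The sketch below assumes an OC-12 line rate of 622.08 Mbps and counts half of the 32-bit sequence space (2^31 bytes) as the point of wraparound, which is the convention that yields a number close to the one on the slide; both choices are assumptions of this example.

```python
def wrap_time_s(rate_bps: float, sequence_bytes: int = 2 ** 31) -> float:
    """Seconds needed to send sequence_bytes at the given line rate."""
    return sequence_bytes * 8 / rate_bps


oc12 = wrap_time_s(622.08e6)
print(f"OC-12 wraps in about {oc12:.1f} s")
# Compare with the 120 s a segment can live in the network.
print(oc12 < 120)
```

Because the wrap time is shorter than the segment lifetime, an old segment could still arrive after its sequence number has been reused.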
Fast Retransmit
  Sender (sequence numbers 2001, 2501, 5000, 5501)
  Duplicate acks with Ack=2001 have been received
  Re-send segment 2001 before RTO expires
    Guess that 2001 was lost
    Wait for >= 3 dup acks (segments could just have arrived out of order)
  Enter congestion avoidance with allowance for duplicate acks
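The duplicate-ack rule amounts to a small counter on the sender. The sketch below scans a sequence of received ack numbers and reports which segment to re-send; the function name and the list-based input are assumptions made to keep the example self-contained.

```python
from typing import Iterable, Optional


def fast_retransmit_target(acks: Iterable[int], threshold: int = 3) -> Optional[int]:
    """Return the ack number to retransmit from, or None.

    An ack counts as a duplicate when it repeats the previous ack number.
    The segment is re-sent once `threshold` duplicates have been seen.
    """
    last_ack = None
    duplicates = 0
    for ack in acks:
        if ack == last_ack:
            duplicates += 1
            if duplicates >= threshold:
                return ack
        else:
            last_ack = ack
            duplicates = 0
    return None


# The first Ack=2001 is the original; the next three are duplicates.
print(fast_retransmit_target([1501, 2001, 2001, 2001, 2001]))  # 2001
# Only one duplicate: probably just reordering, so keep waiting.
print(fast_retransmit_target([1501, 2001, 2001, 2501]))        # None
```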
Selective Acknowledgement
  Receiver (sequence numbers 1001, 1501, 2001, 2501, 2601, 5000, 5501)
    Diagram labels: last segment received; missing segment
  Enabled during SYN and SYN/ACK
  Receiver sends segment with Ack = 2001, Window = 3500
    SACK option: block start=2501, end=2600