CS 356: Introduction to Computer Networks. Lecture 16: Transmission Control Protocol (TCP) Chap. 5.2, 6.3. Xiaowei Yang

Similar documents
TCP congestion control:

11/24/2009. Fundamentals of Computer Networks ECE 478/578. Flow Control in TCP

TCP. CSU CS557, Spring 2018 Instructor: Lorenzo De Carli (Slides by Christos Papadopoulos, remixed by Lorenzo De Carli)

Flow and Congestion Control Marcos Vieira

Internet Protocols Fall Outline

CSCI-1680 Transport Layer II Data over TCP Rodrigo Fonseca

Fast Retransmit. Problem: coarsegrain. timeouts lead to idle periods Fast retransmit: use duplicate ACKs to trigger retransmission

Flow and Congestion Control (Hosts)

Lecture 4: Congestion Control

TCP/IP Networking. Part 4: Network and Transport Layer Protocols

End-to-End Protocols. Transport Protocols. User Datagram Protocol (UDP) Application Layer Expectations

EE 122: Transport Protocols. Kevin Lai October 16, 2002

TCP Overview. Connection-oriented Byte-stream

Chapter 24. Transport-Layer Protocols

Problem. Chapter Outline. Chapter Goal. End-to-end Protocols. End-to-end Protocols. Chapter 5. End-to-End Protocols

Islamic University of Gaza Faculty of Engineering Department of Computer Engineering ECOM 4021: Networks Discussion. Chapter 5 - Part 2

COMP/ELEC 429/556 Introduction to Computer Networks

8. TCP Congestion Control

CS3600 SYSTEMS AND NETWORKS

TCP Adaptive Retransmission Algorithm - Original TCP. TCP Adaptive Retransmission Algorithm Jacobson

Overview. TCP congestion control Computer Networking. TCP modern loss recovery. TCP modeling. TCP Congestion Control AIMD

Outline Computer Networking. TCP slow start. TCP modeling. TCP details AIMD. Congestion Avoidance. Lecture 18 TCP Performance Peter Steenkiste

CS 43: Computer Networks. 19: TCP Flow and Congestion Control October 31, Nov 2, 2018

Reliable Transport II: TCP and Congestion Control

Detecting half-open connections. Observed TCP problems

EE 122: Transport Protocols: UDP and TCP

Reliable Byte-Stream (TCP)

Transmission Control Protocol. ITS 413 Internet Technologies and Applications

Chapter 3 outline. 3.5 Connection-oriented transport: TCP. 3.6 Principles of congestion control 3.7 TCP congestion control

Outline. User Datagram Protocol (UDP) Transmission Control Protocol (TCP) Transport layer (cont.) Transport layer. Background UDP.

CS4700/CS5700 Fundamentals of Computer Networks

Reliable Transport II: TCP and Congestion Control

Congestion Control in TCP

Transport Protocols and TCP

Networked Systems and Services, Fall 2017 Reliability with TCP

Transport Protocols CS 640 1

CS 356: Computer Network Architectures Lecture 19: Congestion Avoidance Chap. 6.4 and related papers. Xiaowei Yang

Internet Transport Protocols UDP and TCP

Computer Networking Introduction

TCP Basics : Computer Networking. Overview. What s Different From Link Layers? Introduction to TCP. TCP reliability Assigned reading

Fundamentals of Computer Networks ECE 478/578. Transport Layer. End- to- End Protocols 4/16/13. Spring Application. Application.

Networked Systems and Services, Fall 2018 Chapter 3

CSCD 330 Network Programming Winter 2015

Chapter III: Transport Layer

End-to-End Protocols. End-to-End Protocols

CNT 6885 Network Review on Transport Layer

Outline. TCP: Overview RFCs: 793, 1122, 1323, 2018, steam: r Development of reliable protocol r Sliding window protocols

Congestion / Flow Control in TCP

Transport layer. UDP: User Datagram Protocol [RFC 768] Review principles: Instantiation in the Internet UDP TCP

User Datagram Protocol (UDP):

TCP Performance. EE 122: Intro to Communication Networks. Fall 2006 (MW 4-5:30 in Donner 155) Vern Paxson TAs: Dilip Antony Joseph and Sukun Kim

Transport layer. Review principles: Instantiation in the Internet UDP TCP. Reliable data transfer Flow control Congestion control

CSE/EE 461 Lecture 12 TCP. A brief Internet history...

Outline. TCP: Overview RFCs: 793, 1122, 1323, 2018, Development of reliable protocol Sliding window protocols

CMSC 417. Computer Networks Prof. Ashok K Agrawala Ashok Agrawala. October 25, 2018

TCP Congestion Control

TCP Congestion Control

TCP Congestion Control

Transport Protocols. CSCI 363 Computer Networks Department of Computer Science

image 3.8 KB Figure 1.6: Example Web Page

Flow Control, and Congestion Control

COMPUTER NETWORKS CS CS 55201

COMPUTER NETWORKS CS CS 55201

ADVANCED COMPUTER NETWORKS

Transport Layer: Outline

Fall 2012: FCM 708 Bridge Foundation I

CE693 Advanced Computer Networks

Computer Network Fundamentals Spring Week 10 Congestion Control Andreas Terzis

Congestion Control. Brighten Godfrey CS 538 January Based in part on slides by Ion Stoica

Lecture 8. TCP/IP Transport Layer (2)

Transport Protocols & TCP TCP

Q23-5 In a network, the size of the receive window is 1 packet. Which of the follow-ing protocols is being used by the network?

Transport Layer: outline

Chapter 3- parte B outline

TCP = Transmission Control Protocol Connection-oriented protocol Provides a reliable unicast end-to-end byte stream over an unreliable internetwork.

TCP so far Computer Networking Outline. How Was TCP Able to Evolve

Chapter 3 Transport Layer

Congestion Control. Daniel Zappala. CS 460 Computer Networking Brigham Young University

Transport Over IP. CSCI 690 Michael Hutt New York Institute of Technology

CMPE 150/L : Introduction to Computer Networks. Chen Qian Computer Engineering UCSC Baskin Engineering Lecture 10

TCP and Congestion Control (Day 1) Yoshifumi Nishida Sony Computer Science Labs, Inc. Today's Lecture

ECE 650 Systems Programming & Engineering. Spring 2018

ECE 461 Internetworking. Problem Sheet 6

Reliable Transport I: Concepts and TCP Protocol

Principles of congestion control

CS Networks and Distributed Systems. Lecture 10: Congestion Control

cs/ee 143 Communication Networks

Chapter 3 Transport Layer

CSE 461. TCP and network congestion

Overview. TCP & router queuing Computer Networking. TCP details. Workloads. TCP Performance. TCP Performance. Lecture 10 TCP & Routers

Internet transport protocols

Lecture 15: Transport Layer Congestion Control

Transport Layer. Application / Transport Interface. Transport Layer Services. Transport Layer Connections

Transport Layer PREPARED BY AHMED ABDEL-RAOUF

Application Service Models

Correcting mistakes. TCP: Overview RFCs: 793, 1122, 1323, 2018, TCP seq. # s and ACKs. GBN in action. TCP segment structure

Chapter 7. The Transport Layer

ECE 435 Network Engineering Lecture 10

Network Protocols. Transmission Control Protocol (TCP) TDC375 Autumn 2009/10 John Kristoff DePaul University 1

Page 1. Goals for Today" Placing Network Functionality" Basic Observation" CS162 Operating Systems and Systems Programming Lecture 15

Transcription:

CS 356: Introduction to Computer Networks Lecture 16: Transmission Control Protocol (TCP) Chap. 5.2, 6.3 Xiaowei Yang xwy@cs.duke.edu

Overview TCP Connection management Flow control When to transmit a segment Adaptive retransmission TCP options Modern extensions Congestion Control

Transmission Control Protocol Connection-oriented protocol Provides a reliable unicast end-to-end byte stream over an unreliable internetwork Byte Stream Byte Stream TCP TCP IP Internetwork

TCP performance is critical to business Source: http://www.webperformancetoday.com/2011/11/23/case-studyslow-page-load-mobile-business-metrics/

Source: http://www.webperformancetoday.com/2012/02/28/4-awesome-slidesshowing-how-page-speed-correlates-to-business-metrics-at-walmart-com/

Flow control

Sliding window revisited Sender Window Size Invariants LastByteAcked LastByteSent LastByteSent LastByteWritten LastByteRead < NextByteExpected NextByteExpected LastByteRcvd + 1 Limited sending buffer and Receiving buffer Receiver Window Size

Buffer Sizes vs Window Sizes Maximum SWS MaxSndBuf Maximum RWS MaxRcvBuf ((NextByteExpected-1) LastByteRead)

TCP Flow Control IP header TCP header TCP data 20 bytes 20 bytes 0 15 16 31 Source Port Number Destination Port Number Sequence number (32 bits) header length Acknowledgement number (32 bits) 0 Flags TCP checksum window size urgent pointer 20 bytes Options (if any) DATA Q: how does a receiver prevent a sender from overrunning its buffer? A: use AdvertisedWindow

Invariants for flow control Receiver side: LastByteRcvd LastByteRead MaxRcvBuf AdvertisedWindow = MaxRcvBuf ((NextByteExpected - 1) LastByteRead)

Invariants for flow control Sender side: MaxSWS = LastByteSent LastByteAcked AdvertisedWindow LastByteWritten LastByteAcked MaxSndBuf Sender process would be blocked if send buffer is full

Window probes What if a receiver advertises a window size of zero? Problem: Receiver can t send more ACKs as sender stops sending more data Design choices Receivers send duplicate ACKs when window opens Sender sends periodic 1 byte probes Why? Keeping the receive side simple à Smart sender/dumb receiver

When to send a segment? App writes bytes to a TCP socket TCP decides when to send a segment Design choices when window opens: Send whenever data available Send when collected Maximum Segment Size data Why?

Push flag What if App is interactive, e.g. ssh? App sets the PUSH flag Flush the sent buffer

Silly Window Syndrome Now considers flow control Window opens, but does not have MSS bytes Design choice 1: send all it has E.g., sender sends 1 byte, receiver acks 1, acks opens the window by 1 byte, sender sends another 1 byte, and so on

Silly Window Syndrome

How to avoid Silly Window Syndrome Receiver side Do not advertise small window sizes Min(MSS, MaxRecBuf/2) Sender side Wait until it has a large segment to send Q: How long should a sender wait?

Sender-Side Silly Window Syndrome avoidance Nagle s Algorithm Self-clocking Interactive applications may turn off Nagle s algorithm using the TCP_NODELAY socket option When app has data to send if data and window >= MSS send a full segment else if there is unacked data buffer new data until ACK else send all the new data now

TCP window management summary Receiver uses AdvertisedWindow for flow control Sender sends probes when AdvertisedWindow reaches zero Silly Window Syndrome avoidance Receiver: do not advertise small windows Sender: Nagle s algorithm

Overview TCP Connection management Flow control When to transmit a segment Adaptive retransmission TCP options Modern extensions Congestion Control

TCP Retransmission A TCP sender retransmits a segment when it assumes that the segment has been lost How does a TCP sender detect a segment loss? Timeout Duplicate ACKs (later)

How to set the timer Challenge: RTT unknown and variable Too small Results in unnecessary retransmissions Too large Long waiting time

Adaptive retransmission Estimate a RTO value based on round-trip time (RTT) measurements Implementation: one timer per connection Q: Retransmitted segments? RTT #1 RTT #2 RTT #3 ACK for Segment 4 ACK for Segment 5 Segment 1 ACK for Segment 1 Segment 2 Segment 3 ACK for Segment 2 + 3 Segment 5 Segment 4

Karn s Algorithm Ambiguity RTT? RTT? Timeout! segment retransmission of segment ACK Solution: Karn s Algorithm: Don t update RTT on any segments that have been retransmitted

Setting the RTO value Uses an exponential moving average (a low-pass filter) to estimate RTT (srtt) and variance of RTT (rttvar) The influence of past samples decrease exponentially The RTT measurements are smoothed by the following estimators srtt and rttvar: srtt n+1 = a RTT + (1- a ) srtt n rttvar n+1 = b ( RTT srtt n ) + (1- b ) rttvar n RTO n+1 = srtt n+1 + 4 rttvar n+1 The gains are set to a =1/4 and b =1/8 Negative power of 2 makes it efficient for implementation

Setting the RTO value (cont d) Initial value for RTO: Sender should set the initial value of RTO to RTO 0 = 3 seconds RTO calculation after first RTT measurements arrived srtt 1 = RTT rttvar 1 = RTT / 2 RTO 1 = srtt 1 + 4 rttvar n+1 When a timeout occurs, the RTO value is doubled RTO n+1 = max ( 2 RTO n, 64) seconds This is called an exponential backoff

Overview TCP Connection management Flow control When to transmit a segment Adaptive retransmission TCP options Modern extensions Congestion Control

TCP header fields Options: (type, length, value) TCP hdrlen field tells how long options are End of Options kind=0 1 byte NOP (no operation) kind=1 1 byte Maximum Segment Size kind=2 len=4 maximum segment size 1 byte 1 byte 2 bytes Window Scale Factor kind=3 len=3 shift count 1 byte 1 byte 1 byte Timestamp kind=8 len=10 timestamp value timestamp echo reply 1 byte 1 byte 4 bytes 4 bytes

TCP header fields Options: NOP is used to pad TCP header to multiples of 4 bytes Maximum Segment Size Window Scale Options Increases the TCP window from 16 to 32 bits, i.e., the window size is interpreted differently This option can only be used in the SYN segment (first segment) during connection establishment time Timestamp Option Can be used for roundtrip measurements

Modern TCP extensions Timestamp Window scaling factor Protection Against Wrapped Sequence Numbers (PAWS) Selective Acknowledgement (SACK) References http://www.ietf.org/rfc/rfc1323.txt http://www.ietf.org/rfc/rfc2018.txt

Improving RTT estimate TCP timestamp option Old design One sample per RTT Using host timer More samples to estimate Timestamp option Current TS, echo TS

Increase TCP window size IP header TCP header TCP data 20 bytes 20 bytes 0 15 16 31 Source Port Number Destination Port Number header length Sequence number (32 bits) Acknowledgement number (32 bits) 0 Flags TCP checksum window size urgent pointer 20 bytes Options (if any) DATA 16-bit window size Maximum send window <= 65535B Suppose a RTT is 100ms Max TCP throughput = 65KB/100ms = 5Mbps Not good enough for modern high speed links!

Protecting against Wraparound Time until 32-bit sequence number space wraps around.

Solution: Window scaling option Kind = 3 Length = 3 Shift.cnt Three bytes All windows are treated as 32-bit Negotiating shift.cnt in SYN packets Ignore if SYN flag not set Sending TCP Real available buffer >> self.shift.cnt à AdvertisedWindow Receiving TCP: stores other.shift.cnt AdvertisedWindow << other.shift.cnt à Maximum Sending Window

Protect Against Wrapped Sequence Number 32-bit sequence number space Why sequence numbers may wrap around? High speed link On an OC-45 (2.5Gbps), it takes 14 seconds < 2MSL Solution: compare timestamps Receiver keeps recent timestamp Discard old timestamps

Selective Acknowledgement More when we discuss congestion control If there are holes, ack the contiguous received blocks to improve performance

Overview Nitty-gritty details about TCP Connection management Flow control When to transmit a segment Adaptive retransmission TCP options Modern extensions Congestion Control How does TCP keeps the pipe full?

TCP Congestion Control

History The original TCP/IP design did not include congestion control and avoidance Receiver uses advertised window to do flow control No exponential backoff after a timeout It led to congestion collapse in October 1986 The NSFnet phase-i backbone dropped three orders of magnitude from its capacity of 32 kbit/s to 40 bit/s, and continued until end nodes started implementing Van Jacobson's congestion control between 1987 and 1988. TCP retransmits too early, wasting the network s bandwidth to retransmit packets already in transit and reducing useful throughput (goodput)

Design Goals Congestion avoidance: making the system operate around the knee to obtain low latency and high throughput Congestion control: making the system operate left to the cliff to avoid congestion collapse Congestion avoidance: making the system operate around the knee to obtain low latency and high throughput Congestion control: making the system operate left to the cliff to avoid congestion collapse

Key Improvements RTT variance estimate Old design: RTT n+1 = a RTT + (1- a ) RTT n RTO = β RTTn+1 Exponential backoff Slow-start Dynamic window sizing Fast retransmit

Challenge Send at the right speed Fast enough to keep the pipe full But not to overrun the pipe Drawback? Share nicely with other senders

Key insight: packet conservation principle and self-clocking When pipe is full, the speed of ACK returns equals to the speed new packets should be injected into the network

Solution: Dynamic window sizing Sending speed: SWS / RTT à Adjusting SWS based on available bandwidth The sender has two internal parameters: Congestion Window (cwnd) Slow-start threshold Value (ssthresh) SWS is set to the minimum of (cwnd, receiver advertised win)

Two Modes of Congestion Control 1. Probing for the available bandwidth slow start (cwnd < ssthresh) 2. Avoid overloading the network congestion avoidance (cwnd >= ssthresh)

Initial value: Slow Start Set cwnd = 1 MSS Modern TCP implementation may set initial cwnd to 2 When receiving an ACK, cwnd+= 1 MSS If an ACK acknowledges two segments, cwnd is still increased by only 1 segment. Even if ACK acknowledges a segment that is smaller than MSS bytes long, cwnd is increased by 1. Question: how can you accelerate your TCP download?

Congestion Avoidance If cwnd >= ssthresh then each time an ACK is received, increment cwnd as follows: cwnd += MSS * (MSS / cwnd) (cwnd measured in bytes) So cwnd is increased by one MSS only if all cwnd/mss segments have been acknowledged.

Example of Slow Start/Congestion Avoidance Assume ssthresh = 8 MSS cwnd = 1 cwnd = 2 cwnd = 4 14 Cwnd (in segments) 12 10 8 6 4 2 ssthresh cwnd = 8 cwnd = 9 0 t=0 t=2 t=4 t=6 Roundtrip times cwnd = 10

Congestion detection What would happen if a sender keeps increasing cwnd? Packet loss TCP uses packet loss as a congestion signal Loss detection 1. Receipt of a duplicate ACK (cumulative ACK) 2. Timeout of a retransmission timer

Reaction to Congestion Reduce cwnd Timeout: severe congestion cwnd is reset to one MSS: cwnd = 1 MSS ssthresh is set to half of the current size of the congestion window: ssthressh = cwnd / 2 entering slow-start

Reaction to Congestion Duplicate ACKs: not so congested (why?) Fast retransmit Three duplicate ACKs indicate a packet loss Retransmit without timeout

Duplicate ACK example 1K SeqNo=0 AckNo=1024 1K SeqNo=1024 1K SeqNo=2048 1. duplicate AckNo=1024 1K SeqNo=3072 2. duplicate AckNo=1024 1K SeqNo=4096 3. duplicate AckNo=1024 1K 1K SeqNo=1024 SeqNo=5120 54

Reaction to congestion: Fast Recovery Avoiding slow start ssthresh = cwnd/2 cwnd = cwnd+3mss Increase cwnd by one MSS for each additional duplicate ACK When ACK arrives that acknowledges new data, set: cwnd=ssthresh enter congestion avoidance

Flavors of TCP Congestion Control TCP Tahoe (1988, FreeBSD 4.3 Tahoe) Slow Start Congestion Avoidance Fast Retransmit TCP Reno (1990, FreeBSD 4.3 Reno) Fast Recovery Modern TCP implementation New Reno (1996) SACK (1996)

TCP Tahoe

TCP Reno TCP saw tooth SS CA Fast retransmission/fast recovery

Summary TCP Connection management Flow control When to transmit a segment Adaptive retransmission TCP options Modern extensions Congestion Control Next: network resource management

Why does it work? [Chiu-Jain] A feedback control system The network uses feedback y to adjust users load åx_i

Goals of Congestion Avoidance Efficiency: the closeness of the total load on the resource ot its knee Fairness: When all x_i s are equal, F(x) = 1 When all x_i s are zero but x_j = 1, F(x) = 1/n Distributedness A centralized scheme requires complete knowledge of the state of the system Convergence The system approach the goal state from any starting state

Metrics to measure convergence Responsiveness Smoothness

Model the system as a linear control system Four sample types of controls AIAD, AIMD, MIAD, MIMD

Phase plot x 2 x 1

Summary TCP Congestion Control Slow start: cwnd +=1 for every ack received Congestion avoidance (cwnd > ssthresh): cwnd += MSS/cwnd After three duplicate ACKs ssthressh = cwnd / 2 cwnd = ssthresh Control Algorithm is Additive Increase and Multiplicative Decrease (AIMD)