Yanghee Choi, School of Computer Science and Engineering, Seoul National University

- Connection-oriented (virtual circuit)
- Reliable transfer
- Buffered transfer
- Unstructured stream
- Full duplex
- Point-to-point connection
- End-to-end service

2004 Yanghee Choi

- Addressing: application-to-application addressing
- Reliable delivery: the receiving application should receive the same data stream that the source puts on the network
- Segment order maintenance: data segments should reach the application in the same order they left the sender
- Flow control: the sending rate should adapt to the receiver's speed
- Congestion control: the transmission rate cannot exceed the speed of the slowest link on the connection path
- Segmentation: data is sent in segments sized to give the highest throughput

[Figure: positive acknowledgment. Columns: Sender, Network Message, Receiver. Sender: Send Packet 1, receive ACK 1, Send Packet 2, receive ACK 2. Receiver: receive Packet 1, Send ACK 1, receive Packet 2, Send ACK 2.]
Timeout and Retransmission

[Figure: columns Sender, Network Message, Receiver. Sender: Send Packet 1, Start Timer. Network: Packet lost. Sender: Timer Expires, Retransmit Packet 1, Start Timer. Receiver: receive Packet 1, Send ACK 1. Sender: receive ACK 1, Cancel Timer.]

[Figure: timeout estimation. Labels: estimation 1, estimation 2, Timeout, Packet lost.]

[Figure: sliding window. Initial window over packets 1 2 3 4 5 6 7 8 9 10 ...; the window slides as ACKs arrive. Sender: Send Packet 1, Send Packet 2, Send Packet 3, then receive ACK 1, ACK 2, ACK 3. Receiver: receive Packet 1, Send ACK 1, receive Packet 2, Send ACK 2, receive Packet 3, Send ACK 3.]

- TCP is connection-oriented and full duplex
- The maximum segment size (MSS) is set during connection establishment
- Reliability is achieved using acknowledgments, round trip delay estimation, and data retransmission
- TCP uses a variable window mechanism for flow control
- Congestion control and avoidance are achieved using the slow start and congestion avoidance schemes
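The timeout-and-retransmission exchange above can be sketched as a small simulation. This is only an illustration under stated assumptions: the function name `stop_and_wait` and the `drop_attempts` parameter (the set of transmission attempts the network loses) are hypothetical, and a real sender would run an actual timer rather than a loss schedule.

```python
def stop_and_wait(packets, drop_attempts):
    """Send packets one at a time, retransmitting after each (simulated) timeout.

    packets       -- list of payloads to deliver in order
    drop_attempts -- hypothetical set of 0-based transmission attempt numbers
                     that the network loses (stands in for a real lossy link)
    Returns a log of ("timeout", seq) and ("acked", seq) events.
    """
    log = []
    attempt = 0
    for seq, _payload in enumerate(packets, start=1):
        while True:
            lost = attempt in drop_attempts
            attempt += 1
            if lost:
                # No ACK arrives, so the timer expires and we retransmit.
                log.append(("timeout", seq))
                continue
            # The packet arrived and the ACK came back: cancel the timer.
            log.append(("acked", seq))
            break
    return log


if __name__ == "__main__":
    for event in stop_and_wait(["a", "b", "c"], drop_attempts={0, 2}):
        print(event)
```

The sender never moves to the next packet until the current one is acknowledged, which is what makes this scheme reliable but slow; the sliding window removes that restriction by keeping several packets in flight.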
Conceptual layering of UDP and TCP

  Application
  Reliable Stream (TCP) | User Datagram (UDP)
  Internet (IP)
  Network Interface

Ports
- Endpoints: (host, port), e.g., (147.46.114.112, 21)

Connections
- A connection is identified by a pair of endpoints, e.g., (147.46.114.112, 21) and (147.46.114.128, 1500)
- TCP uses the connection, not the protocol port, as its fundamental abstraction
- Because TCP identifies a connection by a pair of endpoints, a given TCP port number can be shared by multiple connections on the same machine
- An application can provide concurrent service to multiple connections simultaneously without needing a unique local port for each connection

- TCP views the data stream as a sequence of octets that it divides into segments for transmission
- TCP uses a sliding window mechanism to adjust the sender's transmission speed to that of the receiver
- The sliding window permits the sending of multiple segments before waiting for an ACK -> efficient transmission
- ACK segments indicate the last correctly received byte and the number of bytes the receiver is still willing to accept
- A sender keeps three pointers associated with every connection

[Figure: sliding window over octets 1 2 3 4 5 6 7 8 9 10 11 ...]
[Figure label: current window]

- TCP allows the window size to vary over time
- Each ACK contains a window advertisement that specifies how many additional octets of data the receiver is prepared to accept (the receiver's buffer size)
- In response to an increased (decreased) window advertisement, the sender increases (decreases) the size of its sliding window
- A variable-size window provides flow control as well as reliable transfer
- A flow control mechanism is essential in an internet environment, where machines of various speeds and sizes communicate through networks and routers of various speeds and capacities
- End-to-end flow control: sliding window scheme
- Congestion control: no explicit mechanism, implementation dependent

TCP segment format (bit positions 0, 4, 10, 16, 24, 31 across a 32-bit word):

  SOURCE PORT (bits 0-15) | DESTINATION PORT (bits 16-31)
  SEQUENCE NUMBER
  ACKNOWLEDGEMENT NUMBER
  HLEN (bits 0-3) | RESERVED (bits 4-9) | CODE BITS (bits 10-15) | WINDOW (bits 16-31)
  CHECKSUM (bits 0-15) | URGENT POINTER (bits 16-31)
  OPTIONS (IF ANY) | PADDING
  DATA
  ...
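The fixed 20-octet part of the segment format above can be unpacked with Python's standard `struct` module. The following is a minimal sketch, not a full parser: the function name `parse_tcp_header` and the dictionary keys are hypothetical choices for this illustration, and options and data are ignored.

```python
import struct

# Code bit names in left-to-right order within the 6-bit CODE BITS field.
CODE_BIT_NAMES = ["URG", "ACK", "PSH", "RST", "SYN", "FIN"]


def parse_tcp_header(data: bytes) -> dict:
    """Unpack the fixed 20-octet TCP header (options are not decoded)."""
    if len(data) < 20:
        raise ValueError("a TCP header is at least 20 octets long")
    # Network byte order: two 16-bit ports, two 32-bit numbers,
    # then four 16-bit fields.
    (src_port, dst_port, seq, ack,
     hlen_and_bits, window, checksum, urgent) = struct.unpack("!HHIIHHHH", data[:20])
    hlen_words = hlen_and_bits >> 12          # HLEN counts 32-bit words
    code_bits = hlen_and_bits & 0x3F          # low 6 bits are the CODE BITS
    flags = [name for i, name in enumerate(CODE_BIT_NAMES)
             if code_bits & (0x20 >> i)]
    return {
        "source_port": src_port,
        "destination_port": dst_port,
        "sequence_number": seq,
        "acknowledgement_number": ack,
        "header_length_bytes": hlen_words * 4,
        "flags": flags,
        "window": window,
        "checksum": checksum,
        "urgent_pointer": urgent,
    }


if __name__ == "__main__":
    # A hand-built SYN segment header from port 1500 to port 21.
    raw = struct.pack("!HHIIHHHH", 1500, 21, 100, 0, (5 << 12) | 0x02, 4096, 0, 0)
    print(parse_tcp_header(raw))
```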
Segments are exchanged to
- establish connections
- transfer data
- send ACKs
- advertise windows
- close connections

CODE BITS: determine the purpose and contents of the segment

  Bit (left to right)   Meaning if bit set to 1
  URG                   Urgent pointer field is valid
  ACK                   Acknowledgement field is valid
  PSH                   This segment requests a push
  RST                   Reset the connection
  SYN                   Synchronize sequence numbers
  FIN                   Sender has reached end of its byte stream

- It is sometimes important for the program at one end of a connection to send data out of band, without waiting for the program at the other end of the connection to consume octets already in the stream
  e.g., an interrupt or abort keyboard sequence in a remote login session
- TCP allows the sender to specify data as urgent, meaning that the receiving program should be notified of its arrival as quickly as possible, regardless of its position in the stream
- Urgent mode vs. normal mode
- When the URG code bit is set, the Urgent Pointer specifies the position in the segment where the urgent data ends

Maximum segment size (MSS) option
- The most common option in a TCP segment
- Supports heterogeneous buffer capacities
- Makes good use of the bandwidth in a high speed LAN: MSS == minimum MTU
- In a general internet environment, choosing a good MSS can be difficult because performance can be poor for either extremely large or extremely small segment sizes
  - Extremely small MSS: makes network utilization low
  - Extremely large MSS: decreases throughput because of fragmentation
- The optimum MSS occurs when the IP datagrams carrying the segments are as large as possible without requiring fragmentation anywhere along the path from the source to the destination.
=> However, finding the optimum MSS is a difficult problem for several reasons
- Default MSS (536 bytes) = default size of IP datagram (576 bytes) - 40

Checksum
- A 16-bit integer checksum is used to verify the integrity of the data as well as the TCP header
- TCP prepends a pseudo header to the segment, appends enough zero bits to make the segment a multiple of 16 bits, and computes the 16-bit checksum over the entire result
- TCP does not count the pseudo header or padding in the segment length, nor does it transmit them
- The pseudo header allows the receiver to verify that the segment has reached its correct destination
- At the receiver, IP must pass to TCP the source and destination IP addresses from the datagram as well as the segment itself

Pseudo header (bit positions 0, 8, 16, 31):

  SOURCE IP ADDRESS
  DESTINATION IP ADDRESS
  ZERO (bits 0-7) | PROTOCOL (bits 8-15) | TCP LENGTH (bits 16-31)
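The checksum procedure described above can be sketched as follows. This is a minimal illustration assuming IPv4 addresses and the 16-bit one's complement sum that TCP uses; the function names `ones_complement_sum16` and `tcp_checksum` are hypothetical, and the protocol value 6 is the IP protocol number for TCP.

```python
import socket
import struct


def ones_complement_sum16(data: bytes) -> int:
    """Add up 16-bit words with end-around carry (one's complement sum)."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a multiple of 16 bits
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)   # fold the carry back in
    return total


def tcp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Checksum over pseudo header + segment; neither the pseudo header nor
    the padding is transmitted or counted in the TCP length."""
    pseudo_header = (
        socket.inet_aton(src_ip)                     # SOURCE IP ADDRESS
        + socket.inet_aton(dst_ip)                   # DESTINATION IP ADDRESS
        + struct.pack("!BBH", 0, 6, len(segment))    # ZERO, PROTOCOL, TCP LENGTH
    )
    return ~ones_complement_sum16(pseudo_header + segment) & 0xFFFF


if __name__ == "__main__":
    # Header with the CHECKSUM field (octets 16-17) set to zero, plus 5 data octets.
    header = struct.pack("!HHIIHHHH", 1500, 21, 100, 0, (5 << 12) | 0x02, 4096, 0, 0)
    segment = header + b"hello"
    value = tcp_checksum("147.46.114.128", "147.46.114.112", segment)
    filled = segment[:16] + struct.pack("!H", value) + segment[18:]
    # A receiver recomputing over the filled-in segment should get zero.
    print(hex(value), tcp_checksum("147.46.114.128", "147.46.114.112", filled))
```

Because the pseudo header contains the destination address, a segment delivered to the wrong host fails this check even if its contents are intact.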
- A TCP receiver always acknowledges the last correctly received byte -> cumulative ACK
- After sending a segment, the sender starts a timer
- If the timer expires before an ACK for the sent segment is received, the segment is considered lost and must be retransmitted
- In an internet environment, it is impossible to know a priori how quickly ACKs will return to the source
- The timeout value is calculated dynamically from the measured round trip time (RTT): adaptive retransmission algorithm
- Estimated round trip time (RTT)
  $RTT = (\alpha \cdot \text{Old\_RTT}) + ((1 - \alpha) \cdot \text{New\_Round\_Trip\_Sample}), \quad 0 \le \alpha < 1$
- Timeout value
  $Timeout = \beta \cdot RTT, \quad \beta > 1$

Acknowledgment ambiguity
- Because both datagrams carry exactly the same data, the sender has no way of knowing whether an ACK corresponds to the original or the retransmitted datagram.
- Associating the ACK with either the original transmission or the most recent transmission can give an inaccurate round trip time

[Figure: time line with t1, t2, t3; events timeout, retransmit, ACK.]
Round_Trip_Sample = t3 - t2 or t3 - t1?

Karn's Algorithm
- When computing the round trip estimate, ignore samples that correspond to retransmitted segments, but use a backoff strategy, and retain the timeout value from a retransmitted packet for subsequent packets until a valid sample is obtained
- Timer backoff strategy: if the timer expires and causes a retransmission, TCP increases the timeout
  $\text{new\_timeout} = \gamma \cdot \text{timeout}$, typically $\gamma = 2$
- When an internet misbehaves, Karn's algorithm separates computation of the timeout value from the current round trip estimate

- To adapt to a wide range of variation in delay
- Queueing theory suggests that the variation in RTT, σ, varies in proportion to 1/(1-L), where L is the current network load.
  The network load L satisfies $0 \le L \le 1$
- The 1989 spec for TCP requires implementations to estimate both the average round trip time and the variance, and to use the estimated variance in place of the constant β

  $DIFF = SAMPLE - \text{Old\_RTT}$
  $\text{Smoothed\_RTT} = \text{Old\_RTT} + \delta \cdot DIFF$
  $DEV = \text{Old\_DEV} + \rho \, (|DIFF| - \text{Old\_DEV})$
  $Timeout = \text{Smoothed\_RTT} + \eta \cdot DEV$

  DEV: the estimated mean deviation
  δ: controls how quickly the new sample affects the weighted average
  ρ: controls how quickly the new sample affects the mean deviation
  η: controls how much the deviation affects the round trip timeout
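The four update equations above, combined with Karn's rule of ignoring samples from retransmitted segments and backing off the timer, can be sketched as a small class. This is an illustration under assumptions: the class name `RttEstimator`, the parameter defaults (δ = 1/8, ρ = 1/4, η = 4, γ = 2, values commonly used in practice), and the initial deviation of half the first sample are choices made for this sketch and are not taken from the slides.

```python
class RttEstimator:
    """Smoothed RTT and mean deviation with Karn-style timer backoff."""

    def __init__(self, first_sample, delta=0.125, rho=0.25, eta=4.0, gamma=2.0):
        self.delta, self.rho, self.eta, self.gamma = delta, rho, eta, gamma
        self.srtt = first_sample          # Smoothed_RTT
        self.dev = first_sample / 2       # DEV (assumed starting value)
        self.timeout = self.srtt + self.eta * self.dev

    def on_ack(self, sample, retransmitted=False):
        """Feed one round trip sample measured when an ACK arrives."""
        if retransmitted:
            # Karn's algorithm: the sample is ambiguous, so ignore it and
            # keep the (possibly backed-off) timeout.
            return self.timeout
        diff = sample - self.srtt                          # DIFF
        self.srtt += self.delta * diff                     # Smoothed_RTT
        self.dev += self.rho * (abs(diff) - self.dev)      # DEV
        self.timeout = self.srtt + self.eta * self.dev     # Timeout
        return self.timeout

    def on_timeout(self):
        """Timer expired and caused a retransmission: back off the timer."""
        self.timeout *= self.gamma
        return self.timeout


if __name__ == "__main__":
    est = RttEstimator(first_sample=1.0)
    for sample in (2.0, 1.0, 1.5):
        print(sample, est.on_ack(sample))
    print("after timeout:", est.on_timeout())
```

The backed-off timeout stays in force until a sample from a segment that was not retransmitted arrives, at which point the timeout is recomputed from the smoothed estimate.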
Three way handshake

  Event At Site 1            Network Message        Event At Site 2
  Send SYN seq=x         ->  SYN segment        ->  Receive SYN segment
  Receive SYN + ACK      <-  SYN + ACK segment  <-  Send SYN seq=y, ACK x+1
  Send ACK y+1           ->  ACK segment        ->  Receive ACK segment

- A sends a SYN segment with an initial sequence number (ISN) and the maximum segment size (MSS) it is willing to receive
- B replies with a SYN segment acknowledging A's ISN and announcing its own MSS
- The MSS can be at most as large as the interface segment size minus 40 bytes

Closing a connection (modified three way handshake)

  Event At Site 1                   Network Message        Event At Site 2
  (application closes connection)
  Send FIN seq=x                ->  FIN segment        ->  Receive FIN segment
  Receive ACK segment           <-  ACK segment        <-  Send ACK x+1 (inform application)
                                                           (application closes connection)
  Receive FIN + ACK segment     <-  FIN + ACK segment  <-  Send FIN seq=y, ACK x+1
  Send ACK y+1                  ->  ACK segment        ->  Receive ACK segment

- A sender terminates its part of the connection by sending a FIN segment
- After acknowledging the FIN, the receiver can still send data on its part of the connection (half close)
- A connection can be aborted with an RST segment if abnormal conditions arise

TCP state transition diagram (transitions labeled input / output; begin in CLOSED)

  From any state:  anything / reset -> CLOSED
  CLOSED:          passive open -> LISTEN; active open / syn -> SYN SENT
  LISTEN:          syn / syn+ack -> SYN RCVD; send / syn -> SYN SENT; close -> CLOSED
  SYN SENT:        syn+ack / ack -> ESTABLISHED; syn / syn+ack -> SYN RCVD; close or timeout / reset -> CLOSED
  SYN RCVD:        ack -> ESTABLISHED; close / fin -> FIN WAIT-1; reset -> LISTEN
  ESTABLISHED:     close / fin -> FIN WAIT-1; fin / ack -> CLOSE WAIT
  FIN WAIT-1:      ack -> FIN WAIT-2; fin / ack -> CLOSING; fin-ack / ack -> TIMED WAIT
  FIN WAIT-2:      fin / ack -> TIMED WAIT
  CLOSING:         ack -> TIMED WAIT
  CLOSE WAIT:      close / fin -> LAST ACK
  LAST ACK:        ack -> CLOSED
  TIMED WAIT:      timeout after 2 segment lifetimes -> CLOSED

TCP states

  CLOSED        No connection is active or pending
  LISTEN        The server is waiting for an incoming call
  SYN RCVD      A connection request has arrived; wait for ACK
  SYN SENT      The application has started to open a connection
  ESTABLISHED   The normal data transfer state
  FIN WAIT-1    The application has said it is finished
  FIN WAIT-2    The other side has agreed to release
  TIMED WAIT    Wait for all packets to die off
  CLOSING       Both sides have tried to close simultaneously
  CLOSE WAIT    The other side has initiated a release
  LAST ACK      Wait for all packets to die off
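The TCP states listed above lend themselves to a table-driven sketch. The example below covers only the common open and close paths and tracks state names alone, not the segments sent in response; the dictionary `TRANSITIONS`, the function `run`, and the event names are hypothetical choices for this illustration.

```python
# (current state, input event) -> next state, for the common paths only.
TRANSITIONS = {
    ("CLOSED", "passive open"): "LISTEN",
    ("CLOSED", "active open"): "SYN SENT",
    ("LISTEN", "syn"): "SYN RCVD",
    ("SYN SENT", "syn+ack"): "ESTABLISHED",
    ("SYN RCVD", "ack"): "ESTABLISHED",
    ("ESTABLISHED", "close"): "FIN WAIT-1",
    ("ESTABLISHED", "fin"): "CLOSE WAIT",
    ("FIN WAIT-1", "ack"): "FIN WAIT-2",
    ("FIN WAIT-1", "fin"): "CLOSING",
    ("FIN WAIT-2", "fin"): "TIMED WAIT",
    ("CLOSING", "ack"): "TIMED WAIT",
    ("CLOSE WAIT", "close"): "LAST ACK",
    ("LAST ACK", "ack"): "CLOSED",
    ("TIMED WAIT", "timeout"): "CLOSED",
}


def run(events, state="CLOSED"):
    """Apply a sequence of input events and return the final state."""
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"no transition for {event!r} in state {state}")
        state = TRANSITIONS[key]
    return state


if __name__ == "__main__":
    # Client side: active open, then the application closes first.
    print(run(["active open", "syn+ack", "close", "ack", "fin", "timeout"]))
    # Server side: passive open, then the peer closes first.
    print(run(["passive open", "syn", "ack", "fin", "close", "ack"]))
```

Keeping the transitions in a table rather than in nested conditionals makes it easy to compare the code against the list of states one row at a time.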
  Decimal  Keyword     UNIX keyword  Description
  0        -           -             Reserved
  1        TCPMUX      -             TCP Multiplexer
  5        RJE         -             Remote Job Entry
  7        ECHO        echo          Echo
  9        DISCARD     discard       Discard
  11       USERS       systat        Active Users
  13       DAYTIME     daytime       Daytime
  15       -           netstat       Network status program
  17       QUOTE       qotd          Quote of the day
  19       CHARGEN     chargen       Character Generator
  20       FTP-DATA    ftp-data      File Transfer Protocol
  21       FTP         ftp           File Transfer Protocol
  23       TELNET      telnet        Terminal Connection
  25       SMTP        smtp          Simple Mail Transport Protocol
  37       TIME        time          Time
  42       NAMESERVER  name          Host Name Server
  43       NICNAME     whois         Who Is
  53       DOMAIN      nameserver    Domain Name Server
  77       -           rje           any private RJE service
  79       FINGER      finger        Finger
  93       DCP         -             Device Control Protocol
  95       SUPDUP      supdup        SUPDUP Protocol