Transport Over IP CSCI 690 Michael Hutt New York Institute of Technology
Transport Over IP What is a transport protocol? Choosing to use a transport protocol Ports and Addresses Datagrams UDP
What is a Transport Protocol? Provides common services between applications and network layer Rules for exchanging control messages and data End-to-End Different levels of services, TCP vs. UDP TCP, UDP, SCTP
Choosing to Use a Transport Protocol Network may provide little or no error detection Network may not retransmit data even if errors are detected Network may not provide end-to-end connectivity Network may not provide flow control
Ports and Addresses IP address identifies an end point IP address alone cannot distinguish between multiple applications using the network services Port is a 16-bit number 65536 port numbers (0-65535) available for every IP address Socket => {IP address, port} Socket pair {source socket, destination socket} identifies a unique connection between two end points
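The socket and socket-pair idea above can be sketched as plain tuples. This is only an illustration: the class names and the `connections` table are hypothetical, the addresses are borrowed from the tcpdump slide later in the deck, and port 1329 is made up.

```python
from typing import NamedTuple


class Socket(NamedTuple):
    """One end point: an IP address plus a 16-bit port."""
    ip: str
    port: int


class SocketPair(NamedTuple):
    """Two sockets together identify one connection."""
    local: Socket
    remote: Socket


# Hypothetical demultiplexing table keyed by socket pair.
connections = {}
server = Socket("192.168.1.1", 80)
first = SocketPair(server, Socket("192.168.1.7", 1328))
second = SocketPair(server, Socket("192.168.1.7", 1329))  # same hosts, different client port
connections[first] = "connection A"
connections[second] = "connection B"

print(len(connections))  # two distinct connections to the same server socket
```

Both connections share the server socket, yet they remain distinct because the client ports differ.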
Ports and Addresses IANA defines port ranges http://www.iana.org/assignments/port-numbers well-known ports: 0-1023 registered ports: 1024-49151 dynamic (ephemeral) ports: 49152-65535 On Unix-like systems, root or superuser privileges are traditionally required to bind to ports below 1024.
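A small sketch of the IANA ranges listed above. The function name `port_range` is hypothetical; the boundaries are the ones given on the slide.

```python
def port_range(port: int) -> str:
    """Return the IANA range name for a 16-bit port number."""
    if not 0 <= port <= 65535:
        raise ValueError("port must fit in 16 bits (0-65535)")
    if port <= 1023:
        return "well-known"
    if port <= 49151:
        return "registered"
    return "dynamic"


for p in (80, 8080, 50000):
    print(p, port_range(p))
```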
Datagrams Connectionless delivery no connection setup required Not reliable no indication that data was not received Up to the application to deal with retransmission and re-sequencing of data
User Datagram Protocol (UDP) Minimum protocol overhead Connectionless datagram service Destination port identifies application Reassembles fragmented data for application
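To make the connectionless service concrete, here is a minimal sketch using Python's standard `socket` module over the loopback interface. The function name `send_one_datagram` and the 2-second timeout are assumptions made for illustration. No connection is set up, and the sender gets no indication of whether the datagram arrived.

```python
import socket


def send_one_datagram(message: bytes) -> bytes:
    """Send one UDP datagram to a local receiver and return what arrived."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    receiver.settimeout(2.0)          # do not block forever if the datagram is lost
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # No connect/accept step: the destination is named on every send.
        sender.sendto(message, receiver.getsockname())
        data, _source = receiver.recvfrom(2048)
        return data
    finally:
        sender.close()
        receiver.close()


print(send_one_datagram(b"hello"))
```

On a real network the `recvfrom` call could time out, and it would be up to the application to retransmit.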
UDP Message Format

     0      7 8     15 16    23 24    31
    +--------+--------+--------+--------+
    |     Source      |   Destination   |
    |      Port       |      Port       |
    +--------+--------+--------+--------+
    |                 |                 |
    |     Length      |    Checksum     |
    +--------+--------+--------+--------+
    |
    |          data octets ...
    +---------------- ...
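The four 16-bit header fields map directly onto a `struct` format string. The sketch below packs and parses the 8-byte header; the helper names are hypothetical and the checksum is left at zero here.

```python
import struct

# Network byte order: source port, destination port, length, checksum.
UDP_HEADER = struct.Struct("!HHHH")


def build_udp_header(src_port: int, dst_port: int, payload: bytes, checksum: int = 0) -> bytes:
    """Pack a UDP header; Length covers the header plus the data."""
    length = UDP_HEADER.size + len(payload)
    return UDP_HEADER.pack(src_port, dst_port, length, checksum)


def parse_udp_header(datagram: bytes) -> dict:
    """Unpack the first 8 bytes of a UDP datagram."""
    src, dst, length, checksum = UDP_HEADER.unpack_from(datagram)
    return {"src_port": src, "dst_port": dst, "length": length, "checksum": checksum}


payload = b"abcd"
datagram = build_udp_header(1328, 53, payload) + payload
print(parse_udp_header(datagram))
```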
UDP Checksum Checksum is the 16-bit one's complement of the one's complement sum of a pseudo header of information from the IP header, the UDP header, and the data, padded with zero octets at the end (if necessary) to make a multiple of two octets. Same algorithm as the IP header checksum, but it also covers the pseudo header and the data
UDP Checksum The pseudo header conceptually prefixed to the UDP header contains the source address, the destination address, the protocol, and the UDP length. This information gives protection against misrouted datagrams. This checksum procedure is the same as is used in TCP.

     0      7 8     15 16    23 24    31
    +--------+--------+--------+--------+
    |          source address           |
    +--------+--------+--------+--------+
    |        destination address        |
    +--------+--------+--------+--------+
    |  zero  |protocol|   UDP length    |
    +--------+--------+--------+--------+
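The checksum procedure described on these two slides can be sketched as follows for IPv4. The function names are hypothetical. The sketch assumes the checksum field inside `udp_segment` is zero when computing, and it sends an all-zero result as 0xFFFF, since zero in the field means "no checksum" in UDP.

```python
import socket
import struct

UDP_PROTOCOL = 17


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's complement sum, padding with a zero octet if needed."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:                      # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_header(src_ip: str, dst_ip: str, udp_length: int) -> bytes:
    """Source address, destination address, zero, protocol, UDP length."""
    return struct.pack("!4s4sBBH", socket.inet_aton(src_ip),
                       socket.inet_aton(dst_ip), 0, UDP_PROTOCOL, udp_length)


def udp_checksum(src_ip: str, dst_ip: str, udp_segment: bytes) -> int:
    """Checksum over pseudo header + UDP header + data (checksum field zeroed)."""
    covered = pseudo_header(src_ip, dst_ip, len(udp_segment)) + udp_segment
    checksum = ~ones_complement_sum(covered) & 0xFFFF
    return checksum or 0xFFFF


def udp_checksum_ok(src_ip: str, dst_ip: str, udp_segment: bytes) -> bool:
    """A segment carrying a correct checksum sums to 0xFFFF."""
    covered = pseudo_header(src_ip, dst_ip, len(udp_segment)) + udp_segment
    return ones_complement_sum(covered) == 0xFFFF


segment = struct.pack("!HHHH", 1328, 53, 13, 0) + b"hello"   # odd length: exercises padding
value = udp_checksum("192.168.1.7", "192.168.1.1", segment)
filled = segment[:6] + struct.pack("!H", value) + segment[8:]
print(udp_checksum_ok("192.168.1.7", "192.168.1.1", filled))
```

Verifying against a different destination address fails, which is the protection against misrouted datagrams mentioned above.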
Raw IP vs UDP Raw IP uses the IP header's protocol field to identify the application: 1 ICMP, 2 IGMP, 4 IP, 6 TCP, 17 UDP, 89 OSPFIGP Protocol field is only 8 bits http://www.iana.org/assignments/protocol-numbers/ TFTP/BOOTP use UDP datagram services available from TCP/IP stack
Transmission Control Protocol (TCP) Connection-oriented protocol Capabilities are negotiated, e.g., MSS Reliable transport protocol Connections are closed when no longer needed Control and data info can be mixed in the same message Uses IP protocol identifier: 6
TCP Header

TCP Header Format

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |          Source Port          |       Destination Port        |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                        Sequence Number                        |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                    Acknowledgment Number                      |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |  Data |           |U|A|P|R|S|F|                               |
     | Offset| Reserved  |R|C|S|S|Y|I|            Window             |
     |       |           |G|K|H|T|N|N|                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |           Checksum            |         Urgent Pointer        |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                    Options                    |    Padding    |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                             data                              |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
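As a rough sketch of how the fixed 20-byte part of this header can be decoded, the code below unpacks the fields and lists the six flag bits. The function name and the sample values are illustrative (the ports, sequence number, and window echo the SYN in the tcpdump slide, but options are omitted, so the data offset is 5).

```python
import struct

# Flag bits from least significant upward: FIN, SYN, RST, PSH, ACK, URG.
FLAG_NAMES = ("FIN", "SYN", "RST", "PSH", "ACK", "URG")


def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed part of a TCP header (options are not parsed)."""
    (src, dst, seq, ack, offset_flags,
     window, checksum, urgent) = struct.unpack_from("!HHIIHHHH", segment)
    header_len = (offset_flags >> 12) * 4      # data offset is in 32-bit words
    flags = [name for bit, name in enumerate(FLAG_NAMES) if offset_flags & (1 << bit)]
    return {"src_port": src, "dst_port": dst, "seq": seq, "ack": ack,
            "header_len": header_len, "flags": flags, "window": window,
            "checksum": checksum, "urgent": urgent}


# A hand-built SYN: data offset 5 (no options), SYN bit set.
syn = struct.pack("!HHIIHHHH", 1328, 80, 1620878225, 0, (5 << 12) | 0x02, 16384, 0, 0)
print(parse_tcp_header(syn))
```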
Connection Establishment Server must be listening: issues a Listen request via the Sockets API Client initiates connection by sending a TCP segment with the SYN flag set Local port selected from ephemeral port range (this range is usually OS dependent) Server replies with SYN and ACK flags set Client replies with ACK flag set and connection is now open
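From the application's point of view, the handshake is hidden behind `listen`, `connect`, and `accept`. The sketch below runs both ends on the loopback interface in one process; the function name and the timeouts are assumptions made for illustration. The kernel completes the three-way handshake during `connect`, and the client's local port is chosen by the OS.

```python
import socket


def open_loopback_connection():
    """Open one TCP connection to a local listener and report the end points."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
    listener.listen(1)                   # server is now listening
    listener.settimeout(2.0)
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.settimeout(2.0)
    server_side = None
    try:
        client.connect(listener.getsockname())   # SYN, SYN+ACK, ACK happen here
        server_side, peer = listener.accept()
        return client.getsockname(), client.getpeername(), peer
    finally:
        if server_side is not None:
            server_side.close()
        client.close()
        listener.close()


local, remote, seen_by_server = open_loopback_connection()
print("client socket:", local, "server socket:", remote)
```

The address the server sees as its peer is the client's socket, so the two sides agree on the socket pair.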
TCP 3-Way Handshake
tcpdump of 3-Way Handshake

    14:19:36.446464 arp who-has 192.168.1.1 tell 192.168.1.7
    14:19:36.446826 arp reply 192.168.1.1 is-at 0:0:88:5:e:1
    14:19:36.447423 192.168.1.7.1328 > 192.168.1.1.www: S 1620878225:1620878225(0) win 16384 <mss 1460,nop,nop,sackOK> (DF) (ttl 128, id 29498)
    14:19:36.448085 192.168.1.1.www > 192.168.1.7.1328: S 595341843:595341843(0) ack 1620878226 win 16060 <mss 1460,nop,nop,sackOK> (DF) (ttl 64, id 63234)
    14:19:36.448696 192.168.1.7.1328 > 192.168.1.1.www: . ack 1 win 17520 (DF) (ttl 128, id 29500)
Data Transfer TCP maximum segment size (MSS) is determined by the MTU of the local link: MTU minus 40 bytes of IP and TCP headers, 1460 for Ethernet IP reassembles fragments; TCP reorders segments before passing the data to the application
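The 1460-byte figure for Ethernet follows from subtracting the headers from the MTU. A minimal sketch, assuming a 20-byte IP header and a 20-byte TCP header with no options (the function names are hypothetical):

```python
import math

IP_HEADER_BYTES = 20    # IPv4 header without options
TCP_HEADER_BYTES = 20   # TCP header without options


def mss_for_mtu(mtu: int) -> int:
    """Largest TCP payload that fits in one unfragmented IP datagram."""
    return mtu - IP_HEADER_BYTES - TCP_HEADER_BYTES


def segments_needed(data_bytes: int, mtu: int) -> int:
    """How many full-or-partial segments carry `data_bytes` of application data."""
    return math.ceil(data_bytes / mss_for_mtu(mtu))


print(mss_for_mtu(1500))            # Ethernet MTU of 1500 gives 1460
print(segments_needed(10000, 1500))
```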
Acknowledgements and Flow Control TCP uses cumulative ACKs and specifies the next sequence number expected ACK can be sent alone or as a piggy-backed acknowledgement Window size tells sender how much data the receiver can currently accept Window size of zero means stop sending
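A toy receiver shows what "cumulative" means: the ACK always names the next byte expected, so an out-of-order arrival produces a repeated ACK until the gap is filled. The function is a hypothetical sketch that ignores sequence-number wraparound and overlapping segments.

```python
def cumulative_acks(initial_seq: int, arrivals: list) -> list:
    """Return the ACK number sent after each arriving (seq, length) segment."""
    next_expected = initial_seq
    buffered = {}                      # out-of-order segments: seq -> length
    acks = []
    for seq, length in arrivals:
        if seq >= next_expected:
            buffered[seq] = length
        # Deliver every segment that is now contiguous with what we have.
        while next_expected in buffered:
            next_expected += buffered.pop(next_expected)
        acks.append(next_expected)
    return acks


# The second segment (bytes 101-200) arrives last.
print(cumulative_acks(1, [(1, 100), (201, 100), (101, 100)]))
# -> [101, 101, 301]
```

The repeated 101 is a duplicate ACK; the final ACK of 301 acknowledges all three segments at once.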
TCP Close
TCP Delayed Acknowledgments TCP will try to piggyback ACKs with data ACKs may not be sent immediately The delay for TCP must be less than 500 ms
Nagle Algorithm RFC 896 Some apps generate data 1 byte at a time (telnet, rlogin) 41-byte packets are generated to send 1 byte (20-byte IP header + 20-byte TCP header + 1 byte of data) If there is unacknowledged data, do not send small segments until outstanding data has been acknowledged When to disable Nagle: X Window System - mouse movements Otherwise we may encounter up to a 500 ms delay
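The send rule can be written as a one-line decision, and disabling Nagle on a real socket is done with the standard `TCP_NODELAY` option. The helper names below are hypothetical; the decision function is a simplified reading of the rule on the slide, not a full implementation of a TCP sender.

```python
import socket


def nagle_allows_send(pending_bytes: int, mss: int, unacked_bytes: int) -> bool:
    """Full-sized segments always go; small ones wait for outstanding ACKs."""
    if pending_bytes >= mss:
        return True
    return unacked_bytes == 0


def make_nodelay_socket() -> socket.socket:
    """Create a TCP socket with the Nagle algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


print(nagle_allows_send(pending_bytes=1, mss=1460, unacked_bytes=0))   # True
print(nagle_allows_send(pending_bytes=1, mss=1460, unacked_bytes=1))   # False

sock = make_nodelay_socket()
print(sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0)
sock.close()
```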
Sliding Windows
Slow Start Proposed by Van Jacobson Used for congestion control Packets sent depending on the rate at which ACKs are received Uses a congestion window: cwnd cwnd is initialized to 1 segment Each time an ACK is received cwnd increases by one segment Sender transmits min(cwnd, advertised win) cwnd - congestion control imposed by sender advertised win - flow control imposed by receiver
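Because every ACK adds one segment to cwnd, the window roughly doubles each round trip until the advertised window becomes the limit. The sketch below counts in segments and assumes every segment sent in a round is acknowledged individually and nothing is lost; the function name is hypothetical.

```python
def slow_start(rounds: int, advertised_window: int, initial_cwnd: int = 1) -> list:
    """Segments sent in each round trip during loss-free slow start."""
    cwnd = initial_cwnd
    sent_per_round = []
    for _ in range(rounds):
        in_flight = min(cwnd, advertised_window)   # sender transmits min(cwnd, advertised win)
        sent_per_round.append(in_flight)
        cwnd += in_flight                          # one segment per ACK received
    return sent_per_round


print(slow_start(rounds=6, advertised_window=16))
# -> [1, 2, 4, 8, 16, 16]
```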
Bandwidth Delay Product capacity (bits) = bandwidth (bits/s) x round-trip time (sec) RTT across the US ~ 50 ms T1 link: 1.544 Mbps capacity = 1.544 Mbps x 50 ms = 77200 bits = 9650 bytes Minimum window size necessary to fully utilize the link capacity T3 link: 45 Mbps, capacity = 281250 bytes, but the TCP window-size field is only 16 bits (65535 bytes max)!
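The arithmetic on this slide can be checked with a few lines. The function name is hypothetical; the RTT is taken in milliseconds so the example stays in exact integer arithmetic before the final division.

```python
MAX_UNSCALED_WINDOW = 65535   # largest value a 16-bit window field can hold


def bdp_bytes(bandwidth_bps: int, rtt_ms: int) -> float:
    """Bandwidth-delay product in bytes: bits/s x seconds, divided by 8."""
    return bandwidth_bps * rtt_ms / 8000


t1 = bdp_bytes(1_544_000, 50)
t3 = bdp_bytes(45_000_000, 50)
print(t1, t1 <= MAX_UNSCALED_WINDOW)   # 9650.0 True
print(t3, t3 <= MAX_UNSCALED_WINDOW)   # 281250.0 False
```

The T1 pipe fits in an unscaled window; the T3 pipe does not, which motivates the window scale option.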
Window Scale Factor Window size = advertised window x 2^F, where F max = 14 Option negotiated during connection establishment; only appears in SYN segments Extends window size to ~1 GB
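Scaling is a left shift of the 16-bit window field by F bits. The sketch below computes the effective window and the smallest shift that covers a given bandwidth-delay product; both function names are hypothetical.

```python
MAX_SHIFT = 14
MAX_WINDOW_FIELD = 65535


def effective_window(window_field: int, shift: int) -> int:
    """Window in bytes after applying the scale factor 2**shift."""
    if not 0 <= shift <= MAX_SHIFT:
        raise ValueError("scale factor must be between 0 and 14")
    return window_field << shift


def min_shift_for(required_bytes: int) -> int:
    """Smallest shift whose maximum window covers `required_bytes`."""
    for shift in range(MAX_SHIFT + 1):
        if effective_window(MAX_WINDOW_FIELD, shift) >= required_bytes:
            return shift
    raise ValueError("requirement exceeds the largest scalable window")


print(effective_window(MAX_WINDOW_FIELD, MAX_SHIFT))   # 1073725440, about 1 GB
print(min_shift_for(281250))                           # T3 example: shift of 3
```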
Congestion Avoidance Packet loss indicated by timeout or receipt of duplicate ACKs Implemented along with slow start cwnd: congestion window ssthresh: slow start threshold
Congestion Avoidance When congestion occurs (timeout or reception of duplicate ACKs), 1/2 the current window size is saved in ssthresh If it was a timeout, cwnd is set to 1 segment (slow start) If cwnd <= ssthresh: slow start, otherwise congestion avoidance
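The rules above can be sketched as two small update functions, counting in segments. This is a simplified model rather than any particular stack's implementation: the names are hypothetical, the floor of 2 segments on ssthresh and the 1/cwnd increase per ACK in congestion avoidance are assumptions taken from the classic description of the algorithm, and cwnd is left unchanged on duplicate ACKs because the slide only specifies the timeout case.

```python
def on_ack(cwnd: float, ssthresh: float) -> float:
    """Grow cwnd: +1 per ACK in slow start, +1/cwnd per ACK otherwise."""
    if cwnd <= ssthresh:
        return cwnd + 1.0          # slow start: exponential growth per round trip
    return cwnd + 1.0 / cwnd       # congestion avoidance: about +1 per round trip


def on_congestion(window: float, cwnd: float, timeout: bool) -> tuple:
    """Return (cwnd, ssthresh) after a timeout or duplicate ACKs."""
    ssthresh = max(window / 2.0, 2.0)   # half the current window, at least 2 (assumed floor)
    if timeout:
        cwnd = 1.0                      # timeout: restart from slow start
    return cwnd, ssthresh


cwnd, ssthresh = 1.0, 4.0
for _ in range(10):
    cwnd = on_ack(cwnd, ssthresh)
print(round(cwnd, 2))
cwnd, ssthresh = on_congestion(window=cwnd, cwnd=cwnd, timeout=True)
print(cwnd, round(ssthresh, 2))
```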
Congestion Avoidance
Real-Time Transport Protocol (RTP) Runs on top of UDP Supports multicast transmission Used for real-time applications: voice, video Timestamp field allows applications to deal with jitter (jitter: variation in delay)
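One way a receiver can use the timestamp field is to track how much the transit time changes from packet to packet. The sketch below uses the running estimator defined in the RTP specification (RFC 3550, which the slide does not cite), where each new sample moves the estimate by 1/16 of the difference. The function name and the input format, a list of (timestamp, arrival time) pairs in the same units, are assumptions.

```python
def interarrival_jitter(packets: list) -> float:
    """Running jitter estimate from (rtp_timestamp, arrival_time) pairs."""
    jitter = 0.0
    previous_transit = None
    for timestamp, arrival in packets:
        transit = arrival - timestamp               # relative one-way delay
        if previous_transit is not None:
            change = abs(transit - previous_transit)
            jitter += (change - jitter) / 16.0      # smooth with gain 1/16
        previous_transit = transit
    return jitter


# Constant delay: no jitter.
print(interarrival_jitter([(0, 5), (20, 25), (40, 45)]))   # 0.0
# Second packet delayed by 16 extra units.
print(interarrival_jitter([(0, 5), (20, 41)]))             # 1.0
```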
RTP Control Protocol (RTCP) Provides feedback via multicast concerning the quality of the transmission Allows the source to reduce the transmission rate if necessary to improve QoS