Introduction

We have seen in previous lectures that the physical layer is responsible for the transmission of raw bits (ones and zeros) over the channel. It handles line coding, modulation and demodulation of the transmitted signal, the actual radiation and reception of electromagnetic signals at the antennas, and so on. However, the physical layer is not responsible in any way for whether the transmitted bits were received correctly, received with errors, or not received at all. These and other issues are dealt with by the upper layers, including the Data Link Layer. One of the protocols that deals with them, discussed here, is the Automatic Repeat Request (ARQ) protocol.

Automatic Repeat Request (ARQ)

The ARQ protocol is responsible for verifying that all frames transmitted from a source machine reach the destination machine. If some frames are received in error or are lost completely, the source machine is requested to transmit these frames again. There are three forms of ARQ:

1. Stop and Wait
2. Go Back N
3. Selective Repeat

The operation of the ARQ methods is based on acknowledgment messages (ACK and NAK messages). Each of the above ARQ methods has its own features, advantages and disadvantages. Next, we will discuss each of the three ARQ methods and compare their performance.

1. Stop and Wait ARQ

As its name suggests, the transmitter stops transmitting after a complete frame has been sent and waits for a response from the receiving machine confirming the correct reception of the frame. This ARQ method can be described as follows:

1. The source machine starts transmitting a new frame.
2. After transmitting the frame, the source machine starts a timer that expires shortly after the expected arrival time of the ACK (if the transmitting machine expects an ACK response after x seconds, a timer with a duration slightly more than x is set).
3. Once the frame reaches the destination machine, the destination machine responds with an ACK message indicating the reception of an error-free frame.
4. Once the ACK message is received by the source machine, Step 1 is repeated.

There are several issues that must be considered with the Stop and Wait ARQ method. These include the possibility of late or lost frames, and the possibility of late or lost ACK messages.

Consider, for example, the communication link shown below, where the Data Link Layer of Machine A wants to send a series of frames to the Data Link Layer of Machine B. We will assume that data is transmitted only from Machine A to Machine B, with ACK frames travelling in the other direction (an extension to information flowing in both directions is also possible), and that the error detection algorithm used is strong enough to detect ALL errors.

Note: The ARQ protocols running in the Data Link Layers of different machines exchange two types of frames:

A) Information Frame (I Frame), which is usually several thousand bits long and contains the following components:
1. Header containing addressing and other information
2. Information Packet (the PDU sent by the Network Layer down to the Data Link Layer)
3. Trailer containing CRC bits for error detection (computed over both the Header and the Information Packet)

B) Control Frame (ACK frame), which is usually tens to hundreds of bits long and contains the following components:
1. Header containing addressing and other information
2. Trailer containing CRC bits (computed over the Header)

A CRC in the control frame is important because the decision about whether a frame was received correctly should never be based on erroneous data.

Let us consider the following possible scenarios and see how the Stop and Wait algorithm performs in each case.

1. Lost Frame

Consider the following scenario:
1. Machine A transmits Frame 0 and starts Timer 0.
2. Frame 0 reaches Machine B and is received without errors.
3. Machine B transmits an ACK.
4. The ACK frame is received by Machine A in time (before Timer 0 expires).
5. Machine A transmits Frame 1 and starts Timer 1.
6. Assume that Frame 1 gets lost or is received with errors.
7. Machine B will NOT send an ACK because it did not receive Frame 1 correctly.
8. Timer 1 times out (expires), and Machine A decides to retransmit Frame 1.
9. This time, Machine B receives Frame 1 correctly and sends an ACK.
10. Machine A receives the ACK before Timer 1 expires and decides to transmit Frame 2, and so on.

Conclusion: The transmission recovers from the lost frame.

2. Lost ACK

Consider the following scenario:
1. Machine A transmits Frame 0 and starts Timer 0.
2. Frame 0 reaches Machine B and is received without errors.
3. Machine B transmits an ACK.
4. The ACK frame is received by Machine A in time (before Timer 0 expires).
5. Machine A transmits Frame 1 and starts Timer 1.
6. Frame 1 reaches Machine B (again, assume that the frame is checked for errors and no errors are found). Machine B transmits an ACK.
7. Assume that the ACK is lost or is received with errors.
8. Timer 1 times out (expires), and Machine A decides to retransmit Frame 1.
9. Machine B receives the retransmitted Frame 1 correctly and sends an ACK.
10. Machine A receives the ACK before Timer 1 expires and decides to transmit Frame 2, and so on.

Conclusion: The transmission recovers from the lost ACK.
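The timeout-and-retransmit behaviour in the two scenarios above can be sketched as a small simulation. This is only an illustrative model, not part of the protocol definition: the function name stop_and_wait and the lost_frames / lost_acks arguments are assumptions made for this example, and the timer is represented by a log entry rather than a real clock.

```python
def stop_and_wait(num_frames, lost_frames=(), lost_acks=()):
    """Simulate Stop and Wait ARQ with a timeout and return the event log.

    lost_frames and lost_acks hold (frame_number, attempt) pairs that the
    hypothetical channel drops; attempts are counted from 1.
    """
    log = []
    for frame in range(num_frames):
        attempt = 0
        while True:
            attempt += 1
            log.append(f"A sends Frame {frame} (attempt {attempt}), "
                       f"starts Timer {frame}")
            if (frame, attempt) in lost_frames:
                # B never sees the frame, so no ACK is sent.
                log.append(f"Frame {frame} lost; Timer {frame} expires")
                continue  # timeout: retransmit the same frame
            log.append(f"B receives Frame {frame}, sends ACK")
            if (frame, attempt) in lost_acks:
                log.append(f"ACK lost; Timer {frame} expires")
                continue  # timeout: retransmit the same frame
            log.append(f"A receives ACK before Timer {frame} expires")
            break  # frame acknowledged, move on to the next one
    return log


# Frame 1 is lost on its first attempt; the ACK for Frame 2 is lost once.
for event in stop_and_wait(3, lost_frames={(1, 1)}, lost_acks={(2, 1)}):
    print(event)
```

In both cases the expiring timer is what triggers the retransmission, which is why the sender recovers without being told explicitly what went wrong.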
3. Delayed ACK

Consider the following scenario:
1. Machine A transmits Frame 0 and starts Timer 0.
2. Frame 0 reaches Machine B and is received without errors.
3. Machine B transmits an ACK.
4. The ACK frame is received by Machine A in time (before Timer 0 expires).
5. Machine A transmits Frame 1 and starts Timer 1.
6. Frame 1 reaches Machine B in time.
7. Machine B transmits an ACK to acknowledge Frame 1, but the ACK is delayed in the network, so it arrives at Machine A late, after Timer 1 has expired.
8. Timer 1 times out (expires), and Machine A decides to retransmit Frame 1.
9. After the retransmission of Frame 1, the delayed old ACK is received. Machine A assumes that it is the ACK for the recently retransmitted Frame 1.
10. Machine A transmits Frame 2 and sets Timer 2.
11. Assume that Frame 2 gets lost or is corrupted by errors.
12. Assume that the ACK for the retransmission of Frame 1 is received during the period in which an ACK for Frame 2 is expected. Machine A assumes that this is the ACK for Frame 2.
13. Machine A decides to transmit Frame 3, and so on.
14. Frame 2 is lost and never gets retransmitted.

Conclusion: The transmission DOES NOT recover from the lost frame.
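The failure in this scenario can be reproduced with a minimal sketch of the sender's decision logic. The function sender_without_seq and the event names "ack" and "timeout" are hypothetical, chosen only for this illustration; the list of events is what Machine A observes, in order, in the delayed ACK scenario.

```python
def sender_without_seq(events):
    """Sender that treats ANY arriving ACK as confirming the outstanding frame.

    events is the ordered list of things Machine A observes: "ack" or
    "timeout". Returns the frame numbers A puts on the link, in order.
    """
    outstanding = 0
    sent = [outstanding]  # Frame 0 is transmitted first
    for event in events:
        if event == "timeout":
            sent.append(outstanding)  # retransmit the outstanding frame
        elif event == "ack":
            outstanding += 1          # assume the outstanding frame arrived
            sent.append(outstanding)  # transmit the next frame
    return sent


# ACK for Frame 0, Timer 1 expires, delayed ACK for the first copy of
# Frame 1, then the ACK for the retransmitted copy of Frame 1.
print(sender_without_seq(["ack", "timeout", "ack", "ack"]))
# -> [0, 1, 1, 2, 3]
```

Frame 2 appears only once in the result even though it was lost on the link: the second ACK for Frame 1 was mistaken for an ACK of Frame 2, so no retransmission was ever triggered.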
Need for a Sequence Number in the ACK

As seen in the last case (3. Delayed ACK), the ACKs received by Machine A can be ambiguous. This occurs because Machine A has no way of determining (as described above) which frame a received ACK belongs to. To avoid this ambiguity, a sequence number is needed in the ACKs to inform the transmitting machine (Machine A) which frame is actually being acknowledged. The question that we would like to answer is: what is the size (in number of bits) of the sequence number needed to remove this ambiguity?

First, let us assume that any delay that a frame experiences is relatively small (compared to the time it takes a frame to propagate through the channel). This assumption is valid because frames are handled by the Data Link Layer, which is responsible for the transmission of frames over single links and not over an entire network or over the internet, so delays are relatively small.

Because of the structure of the Stop and Wait ARQ algorithm, only one frame may be outstanding (that is, not yet acknowledged) at any time. The next frame is not transmitted until the current frame is acknowledged. So, any possible ambiguity as in the above figure will occur between a frame and the frame immediately before or after it (for example, Frame 1 with Frame 2, or in general Frame X with Frame X+1). There is never any possibility of ambiguity between Frame X and Frame X+2, for example. So, the sequence number that we add to the ACK only needs to resolve the difference between a frame and the frame before or after it. Therefore, a ONE BIT SEQUENCE NUMBER is needed in the ACK, which simply distinguishes between the ACKs of EVEN and ODD frames.

The use of a 1 bit sequence number to resolve possible ambiguity is illustrated in the following scenario and figure:
1. Machine A transmits Frame 0, starts Timer 0, and waits for ACK(Even).
2. Frame 0 reaches Machine B and is received without errors.
3. Machine B transmits ACK(Even).
4. The ACK(Even) frame is received by Machine A in time (before Timer 0 expires).
5. Machine A transmits Frame 1, starts Timer 1, and waits for ACK(Odd).
6. Frame 1 reaches Machine B in time.
7. Machine B transmits ACK(Odd) to acknowledge Frame 1, but the ACK(Odd) is delayed in the network, so it arrives at Machine A late, after Timer 1 has expired.
8. Timer 1 times out (expires), and Machine A decides to retransmit Frame 1.
9. After the retransmission of Frame 1, the delayed old ACK(Odd) is received. Machine A assumes that it is the ACK(Odd) for the recently retransmitted Frame 1.
10. Machine A transmits Frame 2 and sets Timer 2.
11. Assume that Frame 2 gets lost or is corrupted by errors.
12. Assume that the ACK(Odd) for the retransmission of Frame 1 is received during the period in which an ACK(Even) for Frame 2 is expected. Machine A will ignore it because it is expecting an ACK for an even frame but receives an ACK for an odd frame. Timer 2 will not be reset and will remain active, waiting for ACK(Even).
13. Timer 2 expires after a period of time because no ACK(Even) is received.
14. Machine A decides to retransmit Frame 2, and repeats the process of waiting for ACK(Even).

Conclusion: The transmission recovers from the lost frame.
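The effect of the 1 bit sequence number on the sender can be sketched as follows. This is a minimal model under stated assumptions: the function name sender_with_seq_bit and the event encoding ("timeout", or a tuple ("ack", bit) with bit 0 for ACK(Even) and bit 1 for ACK(Odd)) are hypothetical and serve only to mirror the scenario above.

```python
def sender_with_seq_bit(events):
    """Sender that accepts an ACK only if its 1 bit sequence number
    matches the parity of the outstanding frame.

    events is the ordered list of things Machine A observes: "timeout"
    or ("ack", bit), where bit is 0 for ACK(Even) and 1 for ACK(Odd).
    Returns the frame numbers A puts on the link, in order.
    """
    outstanding = 0
    sent = [outstanding]  # Frame 0 is transmitted first
    for event in events:
        if event == "timeout":
            sent.append(outstanding)  # retransmit the outstanding frame
        elif event == ("ack", outstanding % 2):
            outstanding += 1          # the expected ACK arrived
            sent.append(outstanding)  # transmit the next frame
        # An ACK with the wrong parity is ignored; the timer keeps running.
    return sent


events = [
    ("ack", 0),  # ACK(Even) for Frame 0
    "timeout",   # Timer 1 expires, Frame 1 is retransmitted
    ("ack", 1),  # delayed ACK(Odd) for the first copy of Frame 1
    ("ack", 1),  # ACK(Odd) for the retransmitted Frame 1: ignored
    "timeout",   # Timer 2 expires, Frame 2 is retransmitted
    ("ack", 0),  # ACK(Even) for Frame 2
]
print(sender_with_seq_bit(events))
# -> [0, 1, 1, 2, 2, 3]
```

Frame 2 now appears twice: the stray ACK(Odd) is discarded, Timer 2 expires, and the lost frame is retransmitted.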