MultiPath TCP: From Theory to Practice

Size: px
Start display at page:

Download "MultiPath TCP: From Theory to Practice"

Transcription

1 MultiPath TCP: From Theory to Practice Sébastien Barré, Christoph Paasch, and Olivier Bonaventure ICTEAM, Université catholique de Louvain, B-1348 Louvain-la-Neuve, Belgium Abstract. The IETF is developing a new transport layer solution, MultiPath TCP (MPTCP), which allows to efficiently exploit several Internet paths between a pair of hosts, while presenting a single TCP connection to the application layer. From an implementation viewpoint, multiplexing flows at the transport layer raises several challenges. We first explain how this major TCP extension affects the Linux TCP/IP stack when considering the establishment of TCP connections and the transmission and reception of data over multiple paths. Then, based on our implementation of MultiPath TCP in the Linux kernel, we explain how such an implementation can be optimized to achieve high performance and report measurements showing the performance of receive buffer tuning and coupled congestion control. Keywords: TCP, multipath, implementation, measurements. 1 Introduction The Internet is changing. When TCP/IP was designed, hosts had a single interface and only routers were equipped with several physical interfaces. Today, most hosts have more than one interface and the proliferation of smart-phones equipped with both 3G and WiFi will bring a growing number of multihomed hosts on the Internet. End-users often expect that using multihomed hosts will increase both redundancy and performance. Unfortunately, in practice this is not always the case as more than 95% of the total Internet traffic is still driven by TCP and TCP binds each connection to a single interface. This implies that TCP by itself is not capable of efficiently and transparently using the interfaces available on a multihomed host. The multihoming problem has received a lot of attention in the research community and within the IETF during the last years. Network layer solutions such as shim6 [15] or the Host Identity Protocol (HIP) [14] have been proposed and implemented. However, they remain experimental and it is unlikely that they will be widely deployed. Transport layer solutions have also been developed, first as extensions to TCP [13,7,18,23]. However, to our knowledge these extensions have never been implemented nor deployed. The Stream Control Transmission Protocol (SCTP) [19] protocol was designed with multihoming in mind and supports This work is supported by the European Commission via the 7th Framework Programme Integrated Project TRILOGY. J. Domingo-Pascual et al. (Eds.): NETWORKING 2011, Part I, LNCS 6640, pp , c IFIP International Federation for Information Processing 2011

2 MultiPath TCP: From Theory to Practice 445 fail-over. Several SCTP extensions [8,12] enable hosts to use multiple paths at the same time. Although implemented in several operating systems [8], SCTP is still not widely used besides specific applications. The main drawbacks of SCTP on the global Internet are first that application developers need to change their application to use SCTP. Second, various types of middle-boxes such as NATs or firewalls do not understand SCTP and block all SCTP packets. During the last two years, the MPTCP working group of the IETF has been developing multipath extensions to TCP [6] that enable hosts to use several paths possibly through multiple interfaces, to carry the packets that belong to a single connection. This is probably the most ambitious extension to TCP to be standardized within the IETF. As for all Internet protocols, its success will not only depend on the protocol specification but also on the availability of a reference implementation that can be used by real applications. In this paper we explain the challenges in implementing MPTCP in the Linux kernel and evaluate its performance based on lab measurements. This is the first implementation of this major TCP extensioninanoperatingsystemkernel. The paper is organized as follows. In section 2, we briefly describe MPTCP and compare it to regular TCP. Then we describe the architecture of the implementation. In section 4, we describe a set of general problems that must be solved by any protocol wanting to simultaneously use several paths, together with the chosen solutions. Next, we measure the performance of the implementation in different scenarios to show how the implementation-choices taken do influence the results. 2 MultiPath TCP MultiPath TCP [6] is different from existing TCP extensions like the large windows, timestamps or selective acknowledgement extensions. These older extensions defined new options that slightly change the reaction of hosts when they receive segments. MultiPath TCP allows a pair of hosts to use several paths to exchange the segments that carry the data from a single connection. To understand the operation of MultiPath TCP, let us consider a very simple case where a client having two addresses, A.1 and A.2 establishes a connection with a single homed server. The client first sends a SYN segment from address A.1 that contains the MP_CAPABLE option [6]. This option indicates that the client supports MultiPath TCP and contains a token that identifies the Multi- Path TCP connection. The server replies with a SYN+ACK segment containing the same option and its own token. The client concludes the three-way handshake. Once the TCP connection has been established, the client can advertise its other addresses by sending TCP segments with the ADD_ADDR option. It can also open a second TCP connection, called a subflow, that will be linked to the first one by sending a SYN segment with the MP_JOIN option that contains the token sent by the server in the initial subflow. The server replies by sending a SYN+ACK segment with the MP_JOIN option containing the token chosen by the client in the initial subflow and the client terminates the three-way handshake. These two

3 446 S. Barré, C. Paasch, and O. Bonaventure subflows are linked together inside a single MultiPath TCP connection and both can be used to send data. Subflows may fail and be added during the lifetime of a MultiPath TCP connection. Additional details about the establishment of MultiPath TCP connections and subflows may be found in [6]. The data produced by the client and the server can be sent over any of the subflows that compose a MultiPath TCP connection, and if a subflow fails, data may need to be retransmitted over another subflow. For this, MultiPath TCP relies on two principles. First, each subflow is equivalent to a normal TCP connection with its own 32-bits sequence numbering space. This is important to allow MultiPath TCP to traverse complex middle-boxes like transparent proxies or traffic normalizers. Second, MultiPath TCP maintains a 64-bits data sequence numbering space. Two TCP options use these data sequence numbers : DSN_MAP and DSN_ACK. When a host sends a TCP segment over one subflow, it indicates inside the segment, by using the DSN_MAP option, the mapping between the 64- bits data sequence number and the 32-bits sequence number used by the subflow. Thanks to this mapping, the receiving host can reorder the data received, possibly out-of-sequence over the different subflows. In MultiPath TCP, a received segment is acknowledged at two different levels. First, the TCP cumulative or selective acknowledgements are used to acknowledge the reception of the segments on each subflow. Second, the DSN_ACK option is returned by the receiving host to provide cumulative acknowledgements at the data sequence level. When a segment is lost, the receiver detects the gap in the received 32-bits sequence number and traditional TCP retransmission mechanisms are triggered to recover from the loss. When a subflow fails, MultiPath TCP detects the failure and retransmits the unacknowledged data over another subflow that is still active. Another important difference between MultiPath TCP and regular TCP is the congestion control scheme. MultiPath TCP cannot use the standard TCP control scheme without being unfair to normal TCP flows. Consider two hosts sharing a single bottleneck link. If both hosts use regular TCP and open one TCP connection, they should achieve almost the same throughput. If one host opens several subflows for a single MultiPath TCP connection that all pass through the bottleneck link, it should not be able to use more than its fair share of the link. This is achieved by the coupled congestion control scheme that is discussed in details in [17,22]. The standard TCP congestion control [1] increases and decreases the congestion window and slow-start threshold upon reception of acknowledgments and detection of losses. The coupled congestion control scheme also relies on a congestion window, but it is updated according to the following principle: For each non-duplicate ack on subflow i, increase the congestion window of the subflow i by min(α/cwnd tot, 1/cwnd i ) (where cwnd tot is the total max i( cwndi mss 2 i RT T congestion window of all the subflows and α = cwnd 2 ) i tot ( ). cwnd i mss i i RT T ) 2 i Upon detection of a loss on subflow i, decrease the subflow congestion window by cwnd i /2.

4 3 Linux MultiPath TCP Architecture MultiPath TCP: From Theory to Practice 447 Our Multipath architecture for the Linux kernel is based on an important rethinking of the building blocks that make the current Linux TCP stack (the current Linux patch has more than lines 1 ). Of major importance is the separation between user context and interrupt context. A piece of kernel code runs in user context if it is performing a task for a particular application, and can be interrupted by the OS scheduler. All system calls (like send, rcv) are run in user context. Being under control of the scheduler, user context activities are fair to other applications. On the other hand, software interrupts are run with higher priority than any process, and follow an event (e.g. timer expiry, packet reception). They can be interrupted only by hardware interrupts, and must thus be as short as possible. Unfortunately the handling of incoming packets happens in a software interrupt, and the TCP reordering is normally done there as well, because it is needed to acknowledge as fast as possible. To allow spending less time in software interrupts, and hence gaining in overall system responsiveness, Van Jacobson proposed to force reordering in the user context when the application is waiting in a receive system call [10]. Instead of reordering itself, the interrupt then puts the segments in a so-called prequeue, and wakes up the process, that performs reordering and sends acknowledgements. More details on the Linux networking architecture can be found in [20]. As shown in figure 1, our proposed architecture for multipath is made of three elements. The first element is the master subsocket. If MPTCP is not supported by the peer, only that element is used and regular TCP is executed. The master subsocket is a standard socket structure that provides the interface between the application and the kernel for TCP communication. The second building block is the Multi-path control bock (mpcb). This building block supervises the multiple flows used in a given connection, it runs the decision algorithm for starting or stopping subflows (based on local and remote information), the scheduling algorithm for feeding new application data to a particular subflow, and the reordering algorithm that allows providing an in-order stream of incoming bytes to the application. Note that reordering at the subflow-level is performed in the subflows themselves, this service being implemented already as part of the underlying TCP. The third building block is the slave subsocket. Slave subsockets are not visible to the application. They are initiated, managed and closed by the multipath control block. They otherwise share functionality with the master subsocket. The master subsocket and the slave subsockets together form a pool of subflows that the mpcb can use for scheduling outgoing data, and reordering incoming data. Should the master subsocket fail, it keeps its role of contact point between the application and the kernel, but simply it stops being used as an eligible subflow by the MPTCP scheduler, until it recovers. Connection establishment: The initiator of a TCP connection uses the initial subflow to learn from the peer its set of addresses, as defined in [6]. It then combines those addresses to try establishing subflows on every known path to 1 Available at

5 448 S. Barré, C. Paasch, and O. Bonaventure Applications socket src : A1 dst : B1 sp:x dp : y Networking stack multipath control block master subsock src : A1 dst : B1 sp: x dp : y slave subsock slave subsock slave subsock src : A2 dst : B1 sp: x dp : y src : A1 dst : B2 sp: x dp : y src : A2 dst : B2 sp: x dp : y DATA DATA DATA DATA Fig. 1. Overview of the multipath architecture the peer. The case of the receiver is more tricky: We want to be able to create new subsockets without relying on the usual listen. The reason is that the key to retrieve a listener is no longer the usual TCP 5-tuple, but instead a token attached in a TCP option. We solve this by attaching an accept queue to the multipath control block. The mpcb is found thanks to a token-based hash table lookup, and a new half-open socket is appended in the accept queue. Only when the new subsocket becomes established is it added to the list of active subsockets. Scheduling and sending data: If more than one subflow is in established state, MPTCP must decide on which subflow to send data. The current policy implemented in the scheduler tries to fill all subflows, as described later in this paper. However, the scheduler is modular and other policies like preferring one interface over the other could be implemented in the future. The scheduler must deal with the granularity of the allocations, that is, the number of contiguous bytes that are sent over the same subflow before deciding to select another subflow. This granularity influences the performance of a MultiPath TCP implementation with high bandwidth flows. The optimal use of all subflows would be an argument in favor of small allocation units. On the other hand, to spare CPU cycles and memory accesses, one would tend to allocate in large units (because that would result in fewer calls to the scheduler, and would facilitate segmentation offload). In this paper, we favor an optimal allocation of segments, deferring a full study of the performance trade-offs for another paper. We have examined two major design options for the scheduler. In the first design, whenever an application performs a sendmsg() system call or equivalent, the scheduler is invoked and data is immediately pushed to a specific subflow. This implies per-subflow send buffers, and has the advantage of preformatting segments (with correct MSS and sequence numbers) so that they can be sent very quickly each time an acknowledgement opens up more space in the congestion window. However, there may be several hundreds of milliseconds between the time when a segment is enqueued on a subflow, and its actual transmission on the wire. The path properties may change during that time and what was a good choice when running the scheduler may reveal a very wrong choice when the data is put on the wire. Even worse is the case of a failing subflow, because a full buffer of segments needs to be moved to another subflow when this happens.

6 MultiPath TCP: From Theory to Practice 449 These major drawbacks lead us to a second design, that solves those problems. In the current implementation, the data pushed by the applications is not scheduled anymore, but instead stored in a connection-level send buffer. Subflows pull data from the shared send buffer whenever they receive an acknowledgement. This design is similar to the one proposed for ptcp in [7], but ptcp has only been simulated and not implemented. This effectively solves the problem of fluctuating path properties and allows to run the scheduler when the segment is sent on the wire. Another problem that needs to be considered by an implementation is that different subflows may use different MSS. In this case, a byte-based connectionlevel send buffer would probably be needed. However, this would be inefficient in the Linux kernel that is optimized to handle queues of packets. To solve this problem, our implementation uses the minimum MSS over all subflows to send segments. This is slightly less efficient from an overhead viewpoint if one interface uses a much smaller MSS than the other, but in practice this is rarely the case. To evaluate the impact of using the same MSS over all subflows, we performed a simple test by contacting the most popular web servers according to the Alexa top list. We sent a SYN segment advertising an MSS value of 1460, 4096 or 9148 bytes to each of these servers and analyzed the returned MSS. 97% of these servers returned an MSS of 1380 bytes without any influence of the MSS option that we included in our SYN segment. Most of the other web servers returned an MSS that was very close to 1380 bytes. Thus, using the same MSS for all subflows appears reasonable on today s Internet. Receiving data: Data reception is performed in two steps. The first step is to receive data at the subflow level, and reorder it according to the 32-bits subflow sequence numbers. The second step is to reorder the data at the connection level by using the data sequence numbers, and finally deliver it to the application. As in regular TCP, each subflow maintains a COPIED_SEQ and a RCV.NXT [16] pointer, resp. to track the next byte to give to the upper layer (now the upper layer becomes the mpcb) and the next expected subflow sequence number. Likewise, the multipath control block maintains a connection level COPIED_SEQ and a RCV.NXT pointer, resp. to track the next byte to deliver to the application and the next expected data sequence number that is used when returning a DATA_ACK option. To store incoming data until the application asks for it, we use a single connection-level receive queue. All subflow-receive queues are always empty, because as soon as a segment becomes in order at the subflow-level, it is enqueued in the connection-level receive queue, or out-of-order queue. The problem of charging the application for this additional processing time (to ensure fairness with other running processes) can be solved by using Van Jacobson s prequeues [10], just like regular TCP does. Enabled by default in current kernels, that option is even more important for MPTCP, because MPTCP involves more processing, especially with regards to reordering, as we will show in section 5.

7 450 S. Barré, C. Paasch, and O. Bonaventure 4 Reaching Full Path Utilization One obvious goal of MPTCP is to be able to consider all available paths as a shared resource, just behaving as the sum of the individual resources [21]. However, from an implementation viewpoint, reaching that goal requires to deal with several new constraints. We first study the implications of MPTCP on the receive buffers and then the coupled congestion control scheme. 4.1 Receive Buffer Constraints To optimize the size of the receive buffer, Fisk et al. have proposed [4] to dynamically tune the TCP receive buffer to ensure that twice the Bandwidth-Delay Product (BDP) of the path can be stored in the buffers of the receiver. That value allows supporting reordering by the network (this requires one BDP), as well as fast retransmissions (one other BDP). This buffer tuning [4] is integrated in the Linux kernel. In the context of a multipath transfer, additional constraints appear. The problem, described in details in [9] for SCTP multipath transfer, is that a fast path can be blocked due to the receive buffer becoming full. In single-path TCP the receive buffer can become full only if the application stops consuming data. In MultiPath TCP, it can become full as well if some data coming over a path with a high delay is needed for connection-level reordering, before to be able to deliver the bytes to the application. In the worst case, each path i is continuously transmitting at a BW i rate, while the slowest path (that is, the path with highest RTT) is fast retransmitting. To accommodate for such a situation, the receive buffer must be dimensioned as [7,5]: rbuf =2 i subflows BW i RT T max Our implementation uses the technique proposed in [4] to estimate the RTT at the receiver by using the TCP timestamps, and computes the contribution of each path to the overall receive buffer. Whenever any of the contributions changes, the global receive buffer limit is updated in the mpcb. In practice, this dynamic tuning may reach the maximum allowed receive buffer configured on the system. This should be used as a hint to indicate that a subflow is underperforming and disable the slowest path. 4.2 Implementing the Coupled Congestion Control When implementing in the Linux kernel the coupled congestion control described in section 2, several issues need to be solved. The main problem is that the Linux kernel does not support floating point numbers. This implies that increasing the congestion window by 1/cwnd i is not possible, because 1/cwnd i is not an integer. Our implementation counts the number of acknowledged packets and maintains this information inside a subflow-variable (cwnd_cnt i ), as it is already done for other congestion control algorithms in the Linux kernel like New-Reno. Then, the congestion window is increased by one as soon as cwnd_cnt i >tot cwnd /α [17]. The coupled congestion control involves the calculation of the α-factor described in section 2. Our implementation computes α as shown in equation 1.

8 MultiPath TCP: From Theory to Practice 451 As our implementation uses the same mss for all the subflows, the formula does not need the mss anymore. rtt max and cwnd max are precalculated from the numerator, but rtt max has been put into the denominator to reduce the number of divisions in the formula. As the Linux kernel only handles fixed-point calculations, we need to scale the divisions, to reduce the error due to integer-underflow (scale num and scale den ). Later, the resulting scaling factor of scale num /scale 2 den has to be taken into account. 5 Evaluation cwnd max scale num α = cwnd tot ( rtt max cwnd i scale den (1) i rtt i ) 2 To evaluate the performance of our implementation, we performed lab measurements in the HEN testbed at University College London ( ac.uk/). We used two different scenarios. The first scenario is used to evaluate the coupled congestion control scheme while the second analyses the factors that influence the performance of MultiPath TCP. 5.1 Measuring the Coupled Congestion Control The first scenario is the congestion testbed shown in figure 2. In this scenario, we used four Linux workstations. The two workstations on the left use Intel R Xeon CPUs (2.66GHz, 8GB RAM) and Intel R 82571EB Gigabit Ethernet Controllers. They are both connected with one 1 Gbps link to a similar workstation that is configured as a router. The router is connected with a 100 Mbps link to a server. The upper workstation in figure 2 uses a standard TCP implementation while the bottom host uses our MultiPath TCP implementation. Fig. 2. Congestion testbed Fig. 3. Performance testbed Detailed simulations analyzed by Wischik et. al in [22] show that the coupled congestion control fulfills its goals. In this section we illustrate that our Linux kernel implementation of the coupled congestion control achieves the desired effects, even if it is facing additional complexity due to fixed-point calculations compared to the user-space implementation used in [22]. In the congestion testbed shown in figure 2, the coupled congestion control should be fair to TCP. This means that an MPTCP connection should allow other TCP sessions to take

9 452 S. Barré, C. Paasch, and O. Bonaventure over the bandwidth on a shared bottleneck. Furthermore, an MPTCP connection that uses several subflows should not slow down regular TCP connections. To measure the fairness of MPTCP, we use the bandwidth measurement software iperf 2 to establish sessions between the hosts on the left and the server on the right part of the figure. We vary the number of regular TCP connections from the upper host and the number of MPTCP subflows used by the bottom host. We ran the iperf sessions for a duration of 10 minutes, to allow the TCP-fairness over the bottleneck link to converge [11]. Each measurement runs five times and we report the average throughput. Fig. 4. MultiPath TCP with coupled congestion control behaves like a single TCP connection over a shared bottleneck with respect to regular TCP Fig. 5. With coupled congestion control on the subflows, MultiPath TCP is not unfair to TCP when increasing the number of subflows Thanks to the flexibility of the Linux congestion control implementation, we perform tests by using the standard Reno congestion control scheme with regular TCP and either Reno or the coupled congestion control scheme with MPTCP. The measurements show that MPTCP with the coupled congestion control is fair to regular TCP. When an MPTCP connection with two subflows is sharing a bottleneck link with a TCP connection, the coupled congestion control behaves as if the MPTCP session was just one single TCP connection. However, when Reno congestion control is used on the subflows, MPTCP gets more bandwidth because in that case two TCP subflows are really competing against one regular TCP flow (Fig. 4). The second scenario that we evaluate with the coupled congestion control is the impact of the number of MPTCP subflows on the throughput achieved by a single TCP connection over the shared bottleneck. We perform measurements by using one regular TCP connection running the Reno congestion control scheme and one MPTCP connection using one, two or three subflows. The MPTCP connection uses either the Reno congestion control scheme or the coupled congestion control scheme. Figure 5 shows first that when there is a single MPTCP 2

10 MultiPath TCP: From Theory to Practice 453 subflow, both Reno and the coupled congestion control scheme are fair as there is no difference between the regular TCP and the MPTCP connection. When there are two subflows in the MPTCP connection, figure 5 shows that Reno favors the MPTCP connection over the regular single-flow TCP connection. The unfairness of the Reno congestion control scheme is even more important when the MPTCP connection is composed of three subflows. In contrast, the measurements indicate that the coupled congestion control provides the same fairness when the MPTCP connection is composed of one, two or three subflows. 5.2 MPTCP Performance Our second scenario, depicted in figure 3, allows us to evaluate the factors that influence the performance of our MultiPath TCP implementation. It is composed of three workstations. Two of them act as source and destination while the third one serves as a router. The source and destination are equipped with AMD Opteron TM Processor 248 single-core 2.2 GHz CPUs, 2GB RAM and two Intel R 82546GB Gigabit Ethernet controllers. The links and the router are configured to ensure that the router does not cross-route traffic, i.e. the packets that arrive from the solid-shaped link in figure 3 are always forwarded over the solid-shaped link to reach the destination. This implies that the network has two completely disjoint paths between the source and the destination. We configure the bandwidth on Link A and Link B by changing their Ethernet configuration iperf goodput (Mbps) MPTCP (0 msec, 0 msec) MPTCP (0 msec, 10 msec) MPTCP (0 msec, 100 msec) MPTCP (0 msec, 500 msec) TCP (0 msec) rcvbuf (MB) Fig. 6. Impact of the maximum receive buffer size Fig. 7. Impact of the packet loss ratio As explained in section 4.1, one performance issue that affects MultiPath TCP is that MultiPath TCP may require large receive buffers when subflows have different delays. To evaluate this impact, we configured Link A and Link B with a bandwidth of 100 Mbps. Figure 6 Shows the impact of the maximum receive buffer on the performance achieved by MPTCP with different delays. For these measurements, we use two MPTCP subflows, one is routed over Link A while the second is routed over Link B. The router is configured to insert an additional delay of 0, 10, 100 and 500 milliseconds on Link B. Nodelayis

11 454 S. Barré, C. Paasch, and O. Bonaventure 2000 iperf goodput (Mbps) Gbps interface 2 Gbps interfaces MSS (bytes) Fig. 8. Impact of the MSS on the performance inserted on Link A. This allows us to consider an asymmetric scenario where the two subflows are routed over very different paths. Such asymmetric delays force the MPTCP receiver to store many packets in its receive buffer to be able to deliver all the received data in sequence. As a reference, we show in figure 6 the throughput achieved by iperf with a regular TCP connection over Link A. When the two subflows have the same delay, they are able to saturate the two 100 Mbps links with a receive buffer of 2 MBytes or more. When the delay difference is of only 10 millisecond, the goodput measured by iperf is not affected. With a difference of 100 milliseconds in delay between the two subflows, there is a small performance decrease. The performance is slightly better with 4 MBytes which is the default maximum receive buffer in the Linux kernel. When the delay difference reaches 500 milliseconds, an extreme case that would probably involve satellite links in the current Internet, the goodput achieved by MultiPath TCP is much more affected. This is mainly because the router drops packets and these losses cause timeout expirations and force MPTCP to slowly increases its congestion window due to the large round-trip-time. A second factor that affects the performance is the loss ratio. To evaluate whether losses on one subflow can affect the performance of the other subflow due to head-of-line blocking in the receive buffer, we configured the router to drop a percentage of the packets on Link B but no packets on Link A and set the delays of these links to 0 milliseconds. Fig. 7 shows the impact of the packet loss ratio on the performance achieved by the MPTCP connection. The figure shows the two subflows that compose the connection. The subflow shown in white passes through Link A while the subflow shown in gray passes through Link B. When there are no losses, the MPTCP connection is able to saturate the two 100 Mbps links. As expected, the gray subflow that passes through Link B is affected by the packet losses and its goodput decreases with the packet loss ratio. However, the goodput of the other subflow remains stable with packet loss ratios of 1, 2 or 3 %. It is only when the packet loss ratio reaches 4% or 5% that the goodput of the white subflow decreases slightly. The last factor that we analyze is the impact of the Maximum Segment Size (MSS) on the achieved goodput. TCP implementors know that a large MSS enables higher goodput [2] since it reduces the number of interrupts that needs

12 MultiPath TCP: From Theory to Practice 455 to be processed. To evaluate the impact of the MSS, we do not introduce delays nor losses on the router and use either one or two Gigabit Ethernet interfaces on the source and the destination. Figure 8 shows the goodput in function of the MSS size. With a standard 1400 bytes Ethernet MSS, MPTCP can fill one 1Gbps link, and partly use a second one. It is able to saturate two Gigabit Ethernet links with an MSS of 4500 bytes. Note that we do not use TSO (TCP Segmentation Offload) [3] for these measurements. With TSO the segment size handled by the system could have grown virtually to 64KB and allowed it to reach the same goodput with a lower CPU utilization. TSO support will be added later in our MPTCP implementation. Finally, we have looked at the CPU consumption in the system. Given that the MPTCP scheduler runs for every transmitted segment, we were afraid that increasing the number of subflows (hence the cost of running the scheduler) could significantly impact the performance. This is not true. We have run MPTCP connections containing 1 to 8 subflows on a shared 1 Gbps bottleneck with various MSS. Increasing the number of concurrent subflows from 1 to 8 has no significant impact on the overall system charge. We also found that the server is globally more busy than the sender, which can be explained by the cost of reordering the received segments. MPTCP currently uses the same algorithm as TCP to reorder segments, but while TCP typically reorders only a few segments at a time, MPTCP may need to handle hundreds of them. A better reordering algorithm could probably improve the receiver performance. Regarding the repartition of the load between the software interrupts and the user context, we found that the majority of the processing was done in the software interrupt context in the receiver: around 50% of the CPU time is spent in soft interrupt, 8% in the user context with a 1400 bytes MSS and a single Gigabit Ethernet link. This can be explained by the fact that prequeues are not in use. The sender spends around 17% in soft interrupt and 25% in user context under the same conditions. Although more work is performed in user context, which is good, there is still some amount of work performed in soft interrupt because the majority of the segments are sent when an incoming acknowledgement opens more space in the congestion or sending window. This event happens in interrupt context. 6 Conclusion and Future Work MultiPath TCP is a major extension to TCP that is being developed within the IETF. Its success will depend on the availability of a reference implementation. From an implementation viewpoint, MultiPath TCP raises several important challenges. We have analyzed several of them based on our experience in implementing the first MultiPath TCP implementation in the Linux kernel. In particular, we have shown how such an implementation can be structured and discussed how buffer management must be adapted due to the utilization of multiple subflows. We have analyzed the performance of our implementation in the HEN testbed and shown that the coupled congestion control scheme is more fair than the standard TCP congestion control scheme. We have also evaluated the impact of the delay on the receive buffers and the throughput and showed

13 456 S. Barré, C. Paasch, and O. Bonaventure that losses on one subflow had limited impact on the performance of another subflow from the same MultiPath TCP connection. Finally, we have shown that our implementation was able to efficiently utilize Gigabit Ethernet links when using large packets. Our implementation in the Linux kernel is a first step towards the adoption and the deployment of MultiPath TCP. There are however several research and implementation issues that remain open. Firstly, the MultiPath TCP protocol is not yet finalized and for example the security issues are still being developed. Secondly, MultiPath TCP currently uses the standard TCP retransmission mechanisms on each subflow while multipath-aware retransmission mechanisms could probably improve the performance in some cases. Thirdly, our current implementation uses all available subflows while better performance would probably be possible by adding and removing subflows based on their measured performance. Fourthly, the MultiPath TCP implementation should better interact with the network interfaces to benefit from the TCP segment offload mechanisms that some of them include. Finally, the performance of MultiPath TCP in the global Internet and its interactions with real middle-boxes should be evaluated. We expect that our publicly available implementation will allow other researchers to also contribute to this work. Acknowledgements We thank the members of the Trilogy project for their help and in particular Costin Raiciu for his help with the coupled congestion control, and Adam Greenhalgh for maintaining the HEN-Testbed and giving us the ability to execute the performance measurements. References 1. Allman, M., Paxson, V., Blanton, E.: TCP Congestion Control. RFC 5681 (Draft Standard) (September 2009), 2. Borman, D.A.: Implementing TCP/IP on a cray computer. SIGCOMM Comput. Commun. Rev. 19, (1989), 3. Currid, A.: TCP Offload to the Rescue. Queue 2, (2004), org/ / Fisk, M., chun Feng, W.: Dynamic right-sizing in TCP. In: Proceedings of the Los Alamos Computer Science Institute Symposium, pp (2001) 5. Ford, A., Raiciu, C., Barré, S., Iyengar, J.: Architectural Guidelines for Multipath TCP Development, internet draft, draft-ietf-mptcp-architecture-03.txt, work in progress (December 2010) 6. Ford, A., Raiciu, C., Handley, M.: TCP Extensions for Multipath Operation with Multiple Addresses, internet draft, draft-ietf-mptcp-multiaddressed-02.txt, work in progress (October 2010) 7. Hsieh, H.Y., Sivakumar, R.: ptcp: An End-to-End Transport Layer Protocol for Striped Connections. In: ICNP, pp IEEE Computer Society, Los Alamitos (2002)

14 MultiPath TCP: From Theory to Practice Iyengar, J., Amer, P.D., Stewart, R.R.: Concurrent multipath transfer using SCTP multihoming over independent end-to-end paths. IEEE/ACM Trans. Netw. 14(5), (2006) 9. Iyengar, J., Amer, P., Stewart, R.: Receive buffer blocking in concurrent multipath transfer. In: IEEE Global Telecommunications Conference, GLOBECOM 2005, p. 6 (2005) 10. Jacobson, V.: Re: query about tcp header on tcp-ip (September 1993), ftp://ftp. ee.lbl.gov/ /vanj.93sep07.txt 11. Li, Y.T., Leith, D., Shorten, R.N.: Experimental evaluation of TCP Protocols for High-Speed Networks. IEEE/ACM Trans. Netw. 15, (2007), dx.doi.org/ /tnet Liao, J., Wang, J., Zhu, X.: cmpsctp: An extension of SCTP to support concurrent multi-path transfer. Communications (2008) 13. Magalhaes, L., Kravets, R.: Transport Level Mechanisms for Bandwidth Aggregation on Mobile Hosts. In: ICNP, pp IEEE Computer Society, Los Alamitos (2001) 14. Moskowitz, R., Nikander, P.: Host Identity Protocol (HIP) Architecture. RFC 4423 (May 2006), Nordmark, E., Bagnulo, M.: Shim6: Level 3 Multihoming Shim Protocol for IPv6. RFC 5533 (June 2009), Postel, J.: Transmission Control Protocol. RFC 793 (Standard), updated by RFCs 1122, 3168 (September 1981) 17. Raiciu, C., Handley, M., Wischik, D.: Coupled Multipath-Aware Congestion Control. Internet draft (work in progress), Internet Engineering Task Force (July 2010), Rojviboonchai, K., Osuga, T., Aida, H.: R-M/TCP: Protocol for Reliable Multi- Path Transport over the Internet. In: AINA, pp IEEE Computer Society, Los Alamitos (2005) 19. Stewart, R.: Stream Control Transmission Protocol. RFC 4960 (September 2007) 20. Wehrle, K., Pahlke, F., Ritter, H., Muller, D., Bechler, M.: The Linux networking architecture: design and implementation of network protocols in the Linux kernel. Prentice Hall, Englewood Cliffs (2004) 21. Wischik, D., Handley, M., Braun, M.B.: The Resource Pooling Principle. SIGCOMM Comput. Commun. Rev. 38(5), (2008) 22. Wischik, D., Raiciu, C., Greenhalgh, A., Handley, M.: Design, implementation and evaluation of congestion control for multipath TCP, USENIX NSDI (April 2011) 23. Zhang, M., Lai, J., Krishnamurthy, A.: A transport layer approach for improving end-to-end performance and robustness using redundant paths. In: USENIX 2004, pp (2004)

Implementation and Evaluation of Coupled Congestion Control for Multipath TCP

Implementation and Evaluation of Coupled Congestion Control for Multipath TCP Implementation and Evaluation of Coupled Congestion Control for Multipath TCP Régel González Usach and Mirja Kühlewind Institute of Communication Networks and Computer Engineering (IKR), University of

More information

Analysis on MPTCP combining Congestion Window Adaptation and Packet Scheduling for Multi-Homed Device

Analysis on MPTCP combining Congestion Window Adaptation and Packet Scheduling for Multi-Homed Device * RESEARCH ARTICLE Analysis on MPTCP combining Congestion Window Adaptation and Packet Scheduling for Multi-Homed Device Mr. Prathmesh A. Bhat, Prof. Girish Talmale Dept. of Computer Science and Technology,

More information

MultiPath TCP : Linux Kernel Implementation

MultiPath TCP : Linux Kernel Implementation MultiPath : Linux Kernel Implementation Presenter: Christoph Paasch IP Networking Lab Université catholique de Louvain February 3, 2012 http://mptcp.info.ucl.ac.be Presenter: Christoph Paasch - IP Networking

More information

MPTCP : Linux Kernel implementation status

MPTCP : Linux Kernel implementation status MPTCP : Linux Kernel implementation status Presenter : Christoph Paasch IP Networking Lab Université catholique de Louvain 28 mars 2012 http ://mptcp.info.ucl.ac.be Presenter : Christoph Paasch - IP Networking

More information

Multipath TCP. Prof. Mark Handley Dr. Damon Wischik Costin Raiciu University College London

Multipath TCP. Prof. Mark Handley Dr. Damon Wischik Costin Raiciu University College London Multipath TCP How one little change can make: Google more robust your iphone service cheaper your home broadband quicker prevent the Internet from melting down enable remote brain surgery cure hyperbole

More information

Fast Retransmit. Problem: coarsegrain. timeouts lead to idle periods Fast retransmit: use duplicate ACKs to trigger retransmission

Fast Retransmit. Problem: coarsegrain. timeouts lead to idle periods Fast retransmit: use duplicate ACKs to trigger retransmission Fast Retransmit Problem: coarsegrain TCP timeouts lead to idle periods Fast retransmit: use duplicate ACKs to trigger retransmission Packet 1 Packet 2 Packet 3 Packet 4 Packet 5 Packet 6 Sender Receiver

More information

Multipath TCP. Decoupled from IP, TCP is at last able to support multihomed hosts

Multipath TCP. Decoupled from IP, TCP is at last able to support multihomed hosts Multipath TCP Decoupled from IP, TCP is at last able to support multihomed hosts Christoph Paasch and Olivier Bonaventure, UCL The Internet relies heavily on two protocols. In the network layer, IP (Internet

More information

Improving Multipath TCP. PhD Thesis - Christoph Paasch

Improving Multipath TCP. PhD Thesis - Christoph Paasch Improving Multipath TCP PhD Thesis - Christoph Paasch The Internet is like a map... highly connected Communicating over the Internet A 1 A A 2 A 3 A 4 A 5 Multipath communication A 1 A 2 A 3 A 4 A 5 A

More information

RD-TCP: Reorder Detecting TCP

RD-TCP: Reorder Detecting TCP RD-TCP: Reorder Detecting TCP Arjuna Sathiaseelan and Tomasz Radzik Department of Computer Science, King s College London, Strand, London WC2R 2LS {arjuna,radzik}@dcs.kcl.ac.uk Abstract. Numerous studies

More information

Recap. TCP connection setup/teardown Sliding window, flow control Retransmission timeouts Fairness, max-min fairness AIMD achieves max-min fairness

Recap. TCP connection setup/teardown Sliding window, flow control Retransmission timeouts Fairness, max-min fairness AIMD achieves max-min fairness Recap TCP connection setup/teardown Sliding window, flow control Retransmission timeouts Fairness, max-min fairness AIMD achieves max-min fairness 81 Feedback Signals Several possible signals, with different

More information

Multipath TCP. Prof. Mark Handley Dr. Damon Wischik Costin Raiciu University College London

Multipath TCP. Prof. Mark Handley Dr. Damon Wischik Costin Raiciu University College London Multipath TCP How one little change can make: YouTube more robust your iphone service cheaper your home broadband quicker prevent the Internet from melting down enable remote brain surgery cure hyperbole

More information

Communication Networks

Communication Networks Communication Networks Spring 2018 Laurent Vanbever nsg.ee.ethz.ch ETH Zürich (D-ITET) April 30 2018 Materials inspired from Scott Shenker & Jennifer Rexford Last week on Communication Networks We started

More information

Internet Engineering Task Force Internet-Draft Obsoletes: 6824 (if approved)

Internet Engineering Task Force Internet-Draft Obsoletes: 6824 (if approved) Internet Engineering Task Force Internet-Draft Obsoletes: 6824 (if approved) Intended status: Standards Track Expires: April 6, 2019 A. Ford Pexip C. Raiciu U. Politechnica of Bucharest M. Handley U. College

More information

Reasons not to Parallelize TCP Connections for Fast Long-Distance Networks

Reasons not to Parallelize TCP Connections for Fast Long-Distance Networks Reasons not to Parallelize TCP Connections for Fast Long-Distance Networks Zongsheng Zhang Go Hasegawa Masayuki Murata Osaka University Contents Introduction Analysis of parallel TCP mechanism Numerical

More information

6.1 Internet Transport Layer Architecture 6.2 UDP (User Datagram Protocol) 6.3 TCP (Transmission Control Protocol) 6. Transport Layer 6-1

6.1 Internet Transport Layer Architecture 6.2 UDP (User Datagram Protocol) 6.3 TCP (Transmission Control Protocol) 6. Transport Layer 6-1 6. Transport Layer 6.1 Internet Transport Layer Architecture 6.2 UDP (User Datagram Protocol) 6.3 TCP (Transmission Control Protocol) 6. Transport Layer 6-1 6.1 Internet Transport Layer Architecture The

More information

CS268: Beyond TCP Congestion Control

CS268: Beyond TCP Congestion Control TCP Problems CS68: Beyond TCP Congestion Control Ion Stoica February 9, 004 When TCP congestion control was originally designed in 1988: - Key applications: FTP, E-mail - Maximum link bandwidth: 10Mb/s

More information

Transmission Control Protocol. ITS 413 Internet Technologies and Applications

Transmission Control Protocol. ITS 413 Internet Technologies and Applications Transmission Control Protocol ITS 413 Internet Technologies and Applications Contents Overview of TCP (Review) TCP and Congestion Control The Causes of Congestion Approaches to Congestion Control TCP Congestion

More information

TCP. CSU CS557, Spring 2018 Instructor: Lorenzo De Carli (Slides by Christos Papadopoulos, remixed by Lorenzo De Carli)

TCP. CSU CS557, Spring 2018 Instructor: Lorenzo De Carli (Slides by Christos Papadopoulos, remixed by Lorenzo De Carli) TCP CSU CS557, Spring 2018 Instructor: Lorenzo De Carli (Slides by Christos Papadopoulos, remixed by Lorenzo De Carli) 1 Sources Fall and Stevens, TCP/IP Illustrated Vol. 1, 2nd edition Congestion Avoidance

More information

Transport Layer. -UDP (User Datagram Protocol) -TCP (Transport Control Protocol)

Transport Layer. -UDP (User Datagram Protocol) -TCP (Transport Control Protocol) Transport Layer -UDP (User Datagram Protocol) -TCP (Transport Control Protocol) 1 Transport Services The transport layer has the duty to set up logical connections between two applications running on remote

More information

Transmission Control Protocol (TCP)

Transmission Control Protocol (TCP) TETCOS Transmission Control Protocol (TCP) Comparison of TCP Congestion Control Algorithms using NetSim @2017 Tetcos. This document is protected by copyright, all rights reserved Table of Contents 1. Abstract....

More information

BLEST: Blocking Estimation-based MPTCP Scheduler for Heterogeneous Networks. Simone Ferlin, Ozgu Alay, Olivier Mehani and Roksana Boreli

BLEST: Blocking Estimation-based MPTCP Scheduler for Heterogeneous Networks. Simone Ferlin, Ozgu Alay, Olivier Mehani and Roksana Boreli BLEST: Blocking Estimation-based MPTCP Scheduler for Heterogeneous Networks Simone Ferlin, Ozgu Alay, Olivier Mehani and Roksana Boreli Multipath TCP: What is it? TCP/IP is built around the notion of a

More information

Internet Engineering Task Force. Intended status: Informational Expires: August 20, 2012 Technical University Berlin February 17, 2012

Internet Engineering Task Force. Intended status: Informational Expires: August 20, 2012 Technical University Berlin February 17, 2012 Internet Engineering Task Force Internet-Draft Intended status: Informational Expires: August 20, 2012 T. Ayar B. Rathke L. Budzisz A. Wolisz Technical University Berlin February 17, 2012 A Transparent

More information

Report on Transport Protocols over Mismatched-rate Layer-1 Circuits with 802.3x Flow Control

Report on Transport Protocols over Mismatched-rate Layer-1 Circuits with 802.3x Flow Control Report on Transport Protocols over Mismatched-rate Layer-1 Circuits with 82.3x Flow Control Helali Bhuiyan, Mark McGinley, Tao Li, Malathi Veeraraghavan University of Virginia Email: {helali, mem5qf, taoli,

More information

Transport Over IP. CSCI 690 Michael Hutt New York Institute of Technology

Transport Over IP. CSCI 690 Michael Hutt New York Institute of Technology Transport Over IP CSCI 690 Michael Hutt New York Institute of Technology Transport Over IP What is a transport protocol? Choosing to use a transport protocol Ports and Addresses Datagrams UDP What is a

More information

Analyzing the Receiver Window Modification Scheme of TCP Queues

Analyzing the Receiver Window Modification Scheme of TCP Queues Analyzing the Receiver Window Modification Scheme of TCP Queues Visvasuresh Victor Govindaswamy University of Texas at Arlington Texas, USA victor@uta.edu Gergely Záruba University of Texas at Arlington

More information

RED behavior with different packet sizes

RED behavior with different packet sizes RED behavior with different packet sizes Stefaan De Cnodder, Omar Elloumi *, Kenny Pauwels Traffic and Routing Technologies project Alcatel Corporate Research Center, Francis Wellesplein, 1-18 Antwerp,

More information

Chapter 3 outline. 3.5 Connection-oriented transport: TCP. 3.6 Principles of congestion control 3.7 TCP congestion control

Chapter 3 outline. 3.5 Connection-oriented transport: TCP. 3.6 Principles of congestion control 3.7 TCP congestion control Chapter 3 outline 3.1 Transport-layer services 3.2 Multiplexing and demultiplexing 3.3 Connectionless transport: UDP 3.4 Principles of reliable data transfer 3.5 Connection-oriented transport: TCP segment

More information

PLEASE READ CAREFULLY BEFORE YOU START

PLEASE READ CAREFULLY BEFORE YOU START MIDTERM EXAMINATION #2 NETWORKING CONCEPTS 03-60-367-01 U N I V E R S I T Y O F W I N D S O R - S c h o o l o f C o m p u t e r S c i e n c e Fall 2011 Question Paper NOTE: Students may take this question

More information

Achieving Efficient Parallelism in Transport Layer using P-TCP

Achieving Efficient Parallelism in Transport Layer using P-TCP International Journal of Computer Applications in Engineering Sciences [VOL II, ISSUE I, MARCH 2012] [ISSN: 2231-4946] Achieving Efficient Parallelism in Transport Layer using P-TCP Kamalakshi N 1, H Naganna

More information

MultiPath TCP : Linux Kernel implementation

MultiPath TCP : Linux Kernel implementation MultiPath TCP : Linux Kernel implementation Presenter: Christoph Paasch IP Networking Lab UCLouvain, Belgium August 28, 2012 http://multipath-tcp.org IP Networking Lab http://multipath-tcp.org 1/ 29 Networks

More information

A Scheduler for Multipath TCP

A Scheduler for Multipath TCP A Scheduler for Multipath TCP Fan Yang CIS Department University of Delaware Newark, Delaware, USA 19716 yangfan@udel.edu Paul Amer CIS Department University of Delaware Newark, Delaware, USA 19716 amer@udel.edu

More information

Computer Communication Networks Midterm Review

Computer Communication Networks Midterm Review Computer Communication Networks Midterm Review ICEN/ICSI 416 Fall 2018 Prof. Aveek Dutta 1 Instructions The exam is closed book, notes, computers, phones. You can use calculator, but not one from your

More information

Transport Layer. Application / Transport Interface. Transport Layer Services. Transport Layer Connections

Transport Layer. Application / Transport Interface. Transport Layer Services. Transport Layer Connections Application / Transport Interface Application requests service from transport layer Transport Layer Application Layer Prepare Transport service requirements Data for transport Local endpoint node address

More information

CSE 461. TCP and network congestion

CSE 461. TCP and network congestion CSE 461 TCP and network congestion This Lecture Focus How should senders pace themselves to avoid stressing the network? Topics Application Presentation Session Transport Network congestion collapse Data

More information

Exercises TCP/IP Networking With Solutions

Exercises TCP/IP Networking With Solutions Exercises TCP/IP Networking With Solutions Jean-Yves Le Boudec Fall 2009 3 Module 3: Congestion Control Exercise 3.2 1. Assume that a TCP sender, called S, does not implement fast retransmit, but does

More information

image 3.8 KB Figure 1.6: Example Web Page

image 3.8 KB Figure 1.6: Example Web Page image. KB image 1 KB Figure 1.: Example Web Page and is buffered at a router, it must wait for all previously queued packets to be transmitted first. The longer the queue (i.e., the more packets in the

More information

8. TCP Congestion Control

8. TCP Congestion Control 8. TCP Congestion Control 1 TCP Congestion Control Slow-start increase Multiplicative decrease Congestion avoidance Measurement of variation Exponential timer backoff 2002 Yanghee Choi 2 Congestion Control

More information

Two approaches to Flow Control. Cranking up to speed. Sliding windows in action

Two approaches to Flow Control. Cranking up to speed. Sliding windows in action CS314-27 TCP: Transmission Control Protocol IP is an unreliable datagram protocol congestion or transmission errors cause lost packets multiple routes may lead to out-of-order delivery If senders send

More information

Utilizing Datacenter Networks: Centralized or Distributed Solutions?

Utilizing Datacenter Networks: Centralized or Distributed Solutions? Utilizing Datacenter Networks: Centralized or Distributed Solutions? Costin Raiciu Department of Computer Science University Politehnica of Bucharest We ve gotten used to great applications Enabling Such

More information

CS457 Transport Protocols. CS 457 Fall 2014

CS457 Transport Protocols. CS 457 Fall 2014 CS457 Transport Protocols CS 457 Fall 2014 Topics Principles underlying transport-layer services Demultiplexing Detecting corruption Reliable delivery Flow control Transport-layer protocols User Datagram

More information

Chapter 4. Routers with Tiny Buffers: Experiments. 4.1 Testbed experiments Setup

Chapter 4. Routers with Tiny Buffers: Experiments. 4.1 Testbed experiments Setup Chapter 4 Routers with Tiny Buffers: Experiments This chapter describes two sets of experiments with tiny buffers in networks: one in a testbed and the other in a real network over the Internet2 1 backbone.

More information

TCP/IP Performance ITL

TCP/IP Performance ITL TCP/IP Performance ITL Protocol Overview E-Mail HTTP (WWW) Remote Login File Transfer TCP UDP IP ICMP ARP RARP (Auxiliary Services) Ethernet, X.25, HDLC etc. ATM 4/30/2002 Hans Kruse & Shawn Ostermann,

More information

Lecture 3: The Transport Layer: UDP and TCP

Lecture 3: The Transport Layer: UDP and TCP Lecture 3: The Transport Layer: UDP and TCP Prof. Shervin Shirmohammadi SITE, University of Ottawa Prof. Shervin Shirmohammadi CEG 4395 3-1 The Transport Layer Provides efficient and robust end-to-end

More information

Topics. TCP sliding window protocol TCP PUSH flag TCP slow start Bulk data throughput

Topics. TCP sliding window protocol TCP PUSH flag TCP slow start Bulk data throughput Topics TCP sliding window protocol TCP PUSH flag TCP slow start Bulk data throughput 2 Introduction In this chapter we will discuss TCP s form of flow control called a sliding window protocol It allows

More information

Performance Consequences of Partial RED Deployment

Performance Consequences of Partial RED Deployment Performance Consequences of Partial RED Deployment Brian Bowers and Nathan C. Burnett CS740 - Advanced Networks University of Wisconsin - Madison ABSTRACT The Internet is slowly adopting routers utilizing

More information

Programming Assignment 3: Transmission Control Protocol

Programming Assignment 3: Transmission Control Protocol CS 640 Introduction to Computer Networks Spring 2005 http://www.cs.wisc.edu/ suman/courses/640/s05 Programming Assignment 3: Transmission Control Protocol Assigned: March 28,2005 Due: April 15, 2005, 11:59pm

More information

TCP Accelerator OVERVIEW

TCP Accelerator OVERVIEW OVERVIEW TCP Accelerator: Take your network to the next level by taking control of TCP Sandvine s TCP Accelerator allows communications service providers (CSPs) to dramatically improve subscriber quality

More information

Reliable Transport II: TCP and Congestion Control

Reliable Transport II: TCP and Congestion Control Reliable Transport II: TCP and Congestion Control Stefano Vissicchio UCL Computer Science COMP0023 Recap: Last Lecture Transport Concepts Layering context Transport goals Transport mechanisms and design

More information

Outline Computer Networking. TCP slow start. TCP modeling. TCP details AIMD. Congestion Avoidance. Lecture 18 TCP Performance Peter Steenkiste

Outline Computer Networking. TCP slow start. TCP modeling. TCP details AIMD. Congestion Avoidance. Lecture 18 TCP Performance Peter Steenkiste Outline 15-441 Computer Networking Lecture 18 TCP Performance Peter Steenkiste Fall 2010 www.cs.cmu.edu/~prs/15-441-f10 TCP congestion avoidance TCP slow start TCP modeling TCP details 2 AIMD Distributed,

More information

CS 640 Introduction to Computer Networks Spring 2009

CS 640 Introduction to Computer Networks Spring 2009 CS 640 Introduction to Computer Networks Spring 2009 http://pages.cs.wisc.edu/~suman/courses/wiki/doku.php?id=640-spring2009 Programming Assignment 3: Transmission Control Protocol Assigned: March 26,

More information

Designing a Resource Pooling Transport Protocol

Designing a Resource Pooling Transport Protocol Designing a Resource Pooling Transport Protocol Michio Honda, Keio University Elena Balandina, Nokia Research Center Pasi Sarolahti, Nokia Research Center Lars Eggert, Nokia Research Center Global Internet

More information

Implementation and Analysis of Large Receive Offload in a Virtualized System

Implementation and Analysis of Large Receive Offload in a Virtualized System Implementation and Analysis of Large Receive Offload in a Virtualized System Takayuki Hatori and Hitoshi Oi The University of Aizu, Aizu Wakamatsu, JAPAN {s1110173,hitoshi}@u-aizu.ac.jp Abstract System

More information

TCP so far Computer Networking Outline. How Was TCP Able to Evolve

TCP so far Computer Networking Outline. How Was TCP Able to Evolve TCP so far 15-441 15-441 Computer Networking 15-641 Lecture 14: TCP Performance & Future Peter Steenkiste Fall 2016 www.cs.cmu.edu/~prs/15-441-f16 Reliable byte stream protocol Connection establishments

More information

Multiple unconnected networks

Multiple unconnected networks TCP/IP Life in the Early 1970s Multiple unconnected networks ARPAnet Data-over-cable Packet satellite (Aloha) Packet radio ARPAnet satellite net Differences Across Packet-Switched Networks Addressing Maximum

More information

IS370 Data Communications and Computer Networks. Chapter 5 : Transport Layer

IS370 Data Communications and Computer Networks. Chapter 5 : Transport Layer IS370 Data Communications and Computer Networks Chapter 5 : Transport Layer Instructor : Mr Mourad Benchikh Introduction Transport layer is responsible on process-to-process delivery of the entire message.

More information

Chapter 6. What happens at the Transport Layer? Services provided Transport protocols UDP TCP Flow control Congestion control

Chapter 6. What happens at the Transport Layer? Services provided Transport protocols UDP TCP Flow control Congestion control Chapter 6 What happens at the Transport Layer? Services provided Transport protocols UDP TCP Flow control Congestion control OSI Model Hybrid Model Software outside the operating system Software inside

More information

Making Multipath TCP friendlier to Load Balancers and Anycast

Making Multipath TCP friendlier to Load Balancers and Anycast Making Multipath TCP friendlier to Load Balancers and Anycast Fabien Duchêne Olivier Bonaventure Université Catholique de Louvain ICNP 2017

More information

Multipath TCP: Overview, Design, and Use-Cases

Multipath TCP: Overview, Design, and Use-Cases Multipath TCP: Overview, Design, and Use-Cases Benno Overeinder FOR MULTIPATH TCP MPTCP slides by courtesy of Olivier Bonaventure (UCL) The TCP Byte Stream Model Client ABCDEF...111232 0988989... XYZZ

More information

MULTIPATH COMMUNICATIONS USING NAMES

MULTIPATH COMMUNICATIONS USING NAMES MULTIPATH COMMUNICATIONS USING NAMES by Rashmi Purushothama, Master of Science Thesis in Communication systems Communication Networks and Systems laboratory (NETS), Swedish Institute of Computer Science

More information

ECE 650 Systems Programming & Engineering. Spring 2018

ECE 650 Systems Programming & Engineering. Spring 2018 ECE 650 Systems Programming & Engineering Spring 2018 Networking Transport Layer Tyler Bletsch Duke University Slides are adapted from Brian Rogers (Duke) TCP/IP Model 2 Transport Layer Problem solved:

More information

Chapter III. congestion situation in Highspeed Networks

Chapter III. congestion situation in Highspeed Networks Chapter III Proposed model for improving the congestion situation in Highspeed Networks TCP has been the most used transport protocol for the Internet for over two decades. The scale of the Internet and

More information

Bandwidth Allocation & TCP

Bandwidth Allocation & TCP Bandwidth Allocation & TCP The Transport Layer Focus Application Presentation How do we share bandwidth? Session Topics Transport Network Congestion control & fairness Data Link TCP Additive Increase/Multiplicative

More information

Performance Analysis of TCP Variants

Performance Analysis of TCP Variants 102 Performance Analysis of TCP Variants Abhishek Sawarkar Northeastern University, MA 02115 Himanshu Saraswat PES MCOE,Pune-411005 Abstract The widely used TCP protocol was developed to provide reliable

More information

Impact of Path Characteristics and Scheduling Policies on MPTCP Performance

Impact of Path Characteristics and Scheduling Policies on MPTCP Performance Impact of Path Characteristics and Scheduling Policies on MPTCP Performance Behnaz Arzani, Alexander Gurney, Shuotian Cheng, Roch Guerin 2 and Boon Thau Loo University Of Pennsylvania 2 University of Washington

More information

Master s Thesis. TCP Congestion Control Mechanisms for Achieving Predictable Throughput

Master s Thesis. TCP Congestion Control Mechanisms for Achieving Predictable Throughput Master s Thesis Title TCP Congestion Control Mechanisms for Achieving Predictable Throughput Supervisor Prof. Hirotaka Nakano Author Kana Yamanegi February 14th, 2007 Department of Information Networking

More information

Experimental Evaluation of Multipath TCP Schedulers

Experimental Evaluation of Multipath TCP Schedulers Experimental Evaluation of Multipath TCP Schedulers Christoph Paasch ICTEAM, UCLouvain Belgium Ozgu Alay Simula Research Laboratory Fornebu, Norway Simone Ferlin Simula Research Laboratory Fornebu, Norway

More information

Congestion / Flow Control in TCP

Congestion / Flow Control in TCP Congestion and Flow Control in 1 Flow Control and Congestion Control Flow control Sender avoids overflow of receiver buffer Congestion control All senders avoid overflow of intermediate network buffers

More information

Outline Computer Networking. Functionality Split. Transport Protocols

Outline Computer Networking. Functionality Split. Transport Protocols Outline 15-441 15 441 Computer Networking 15-641 Lecture 10: Transport Protocols Justine Sherry Peter Steenkiste Fall 2017 www.cs.cmu.edu/~prs/15 441 F17 Transport introduction TCP connection establishment

More information

TCP based Receiver Assistant Congestion Control

TCP based Receiver Assistant Congestion Control International Conference on Multidisciplinary Research & Practice P a g e 219 TCP based Receiver Assistant Congestion Control Hardik K. Molia Master of Computer Engineering, Department of Computer Engineering

More information

UNIT IV -- TRANSPORT LAYER

UNIT IV -- TRANSPORT LAYER UNIT IV -- TRANSPORT LAYER TABLE OF CONTENTS 4.1. Transport layer. 02 4.2. Reliable delivery service. 03 4.3. Congestion control. 05 4.4. Connection establishment.. 07 4.5. Flow control 09 4.6. Transmission

More information

Chapter 3- parte B outline

Chapter 3- parte B outline Chapter 3- parte B outline 3.1 transport-layer services 3.2 multiplexing and demultiplexing 3.3 connectionless transport: UDP 3.4 principles of reliable data transfer 3.5 connection-oriented transport:

More information

Experimental Evaluation of Multipath TCP Schedulers

Experimental Evaluation of Multipath TCP Schedulers Experimental Evaluation of Multipath TCP Schedulers Christoph Paasch 1, Simone Ferlin 2, Özgü Alay 2 and Olivier Bonaventure 1 1 ICTEAM, UCLouvain, Belgium 2 Simula Research Laboratory, Fornebu, Norway

More information

Congestion Control In The Internet Part 2: How it is implemented in TCP. JY Le Boudec 2014

Congestion Control In The Internet Part 2: How it is implemented in TCP. JY Le Boudec 2014 1 Congestion Control In The Internet Part 2: How it is implemented in TCP JY Le Boudec 2014 Contents 1. Congestion control in TCP 2. The fairness of TCP 3. The loss throughput formula 4. Explicit Congestion

More information

CC-SCTP: Chunk Checksum of SCTP for Enhancement of Throughput in Wireless Network Environments

CC-SCTP: Chunk Checksum of SCTP for Enhancement of Throughput in Wireless Network Environments CC-SCTP: Chunk Checksum of SCTP for Enhancement of Throughput in Wireless Network Environments Stream Control Transmission Protocol (SCTP) uses the 32-bit checksum in the common header, by which a corrupted

More information

Transport layer. UDP: User Datagram Protocol [RFC 768] Review principles: Instantiation in the Internet UDP TCP

Transport layer. UDP: User Datagram Protocol [RFC 768] Review principles: Instantiation in the Internet UDP TCP Transport layer Review principles: Reliable data transfer Flow control Congestion control Instantiation in the Internet UDP TCP 1 UDP: User Datagram Protocol [RFC 768] No frills, bare bones Internet transport

More information

MPTCP: Design and Deployment. Day 11

MPTCP: Design and Deployment. Day 11 MPTCP: Design and Deployment Day 11 Use of Multipath TCP in ios 7 Multipath TCP in ios 7 Primary TCP connection over WiFi Backup TCP connection over cellular data Enables fail-over Improves performance

More information

Transport layer. Review principles: Instantiation in the Internet UDP TCP. Reliable data transfer Flow control Congestion control

Transport layer. Review principles: Instantiation in the Internet UDP TCP. Reliable data transfer Flow control Congestion control Transport layer Review principles: Reliable data transfer Flow control Congestion control Instantiation in the Internet UDP TCP 1 UDP: User Datagram Protocol [RFC 768] No frills, bare bones Internet transport

More information

Connection-oriented (virtual circuit) Reliable Transfer Buffered Transfer Unstructured Stream Full Duplex Point-to-point Connection End-to-end service

Connection-oriented (virtual circuit) Reliable Transfer Buffered Transfer Unstructured Stream Full Duplex Point-to-point Connection End-to-end service 최양희서울대학교컴퓨터공학부 Connection-oriented (virtual circuit) Reliable Transfer Buffered Transfer Unstructured Stream Full Duplex Point-to-point Connection End-to-end service 1 2004 Yanghee Choi 2 Addressing: application

More information

cs/ee 143 Communication Networks

cs/ee 143 Communication Networks cs/ee 143 Communication Networks Chapter 4 Transport Text: Walrand & Parakh, 2010 Steven Low CMS, EE, Caltech Recap: Internet overview Some basic mechanisms n Packet switching n Addressing n Routing o

More information

bitcoin allnet exam review: transport layer TCP basics congestion control project 2 Computer Networks ICS 651

bitcoin allnet exam review: transport layer TCP basics congestion control project 2 Computer Networks ICS 651 bitcoin allnet exam review: transport layer TCP basics congestion control project 2 Computer Networks ICS 651 Bitcoin distributed, reliable ("hard to falsify") time-stamping network each time-stamp record

More information

Congestion Control In The Internet Part 2: How it is implemented in TCP. JY Le Boudec 2014

Congestion Control In The Internet Part 2: How it is implemented in TCP. JY Le Boudec 2014 1 Congestion Control In The Internet Part 2: How it is implemented in TCP JY Le Boudec 2014 Contents 1. Congestion control in TCP 2. The fairness of TCP 3. The loss throughput formula 4. Explicit Congestion

More information

Congestion Control In The Internet Part 2: How it is implemented in TCP. JY Le Boudec 2015

Congestion Control In The Internet Part 2: How it is implemented in TCP. JY Le Boudec 2015 1 Congestion Control In The Internet Part 2: How it is implemented in TCP JY Le Boudec 2015 Contents 1. Congestion control in TCP 2. The fairness of TCP 3. The loss throughput formula 4. Explicit Congestion

More information

User Datagram Protocol (UDP):

User Datagram Protocol (UDP): SFWR 4C03: Computer Networks and Computer Security Feb 2-5 2004 Lecturer: Kartik Krishnan Lectures 13-15 User Datagram Protocol (UDP): UDP is a connectionless transport layer protocol: each output operation

More information

7. TCP 최양희서울대학교컴퓨터공학부

7. TCP 최양희서울대학교컴퓨터공학부 7. TCP 최양희서울대학교컴퓨터공학부 1 TCP Basics Connection-oriented (virtual circuit) Reliable Transfer Buffered Transfer Unstructured Stream Full Duplex Point-to-point Connection End-to-end service 2009 Yanghee Choi

More information

PACC: A Path Associativity Congestion Control and Throughput Model For Multi-path TCP

PACC: A Path Associativity Congestion Control and Throughput Model For Multi-path TCP Available online at www.sciencedirect.com Procedia Computer Science 4 (211) 1278 1287 International Conference on Computational Science, ICCS 211 PACC: A Path Associativity Congestion Control and Throughput

More information

Fall 2012: FCM 708 Bridge Foundation I

Fall 2012: FCM 708 Bridge Foundation I Fall 2012: FCM 708 Bridge Foundation I Prof. Shamik Sengupta Instructor s Website: http://jjcweb.jjay.cuny.edu/ssengupta/ Blackboard Website: https://bbhosted.cuny.edu/ Intro to Computer Networking Transport

More information

TCP Over SoNIC. Xuke Fang Cornell University. XingXiang Lao Cornell University

TCP Over SoNIC. Xuke Fang Cornell University. XingXiang Lao Cornell University TCP Over SoNIC Xuke Fang Cornell University XingXiang Lao Cornell University ABSTRACT SoNIC [1], a Software-defined Network Interface Card, which provides the access to the physical layer and data link

More information

A proposal for MPTCP Robust session Establishment (MPTCP RobE) enable full multipath capability for MPTCP Markus Amend, Eckard Bogenfeld, Andreas

A proposal for MPTCP Robust session Establishment (MPTCP RobE) enable full multipath capability for MPTCP Markus Amend, Eckard Bogenfeld, Andreas A proposal for MPTCP Robust session Establishment (MPTCP RobE) enable full multipath capability for MPTCP Markus Amend, Eckard Bogenfeld, Andreas Matz 18 July 2017 Agenda 01 The role of the initial flow

More information

SCTP Congestion Window Overgrowth During Changeover

SCTP Congestion Window Overgrowth During Changeover SCTP Congestion Window Overgrowth During Changeover Janardhan R. Iyengar, Armando L. Caro, Jr., Paul D. Amer, Gerard J. Heinz Computer and Information Sciences University of Delaware iyengar, acaro, amer,

More information

Networked Systems and Services, Fall 2018 Chapter 3

Networked Systems and Services, Fall 2018 Chapter 3 Networked Systems and Services, Fall 2018 Chapter 3 Jussi Kangasharju Markku Kojo Lea Kutvonen 4. Transport Layer Reliability with TCP Transmission Control Protocol (TCP) RFC 793 + more than hundred other

More information

6.033 Spring 2015 Lecture #11: Transport Layer Congestion Control Hari Balakrishnan Scribed by Qian Long

6.033 Spring 2015 Lecture #11: Transport Layer Congestion Control Hari Balakrishnan Scribed by Qian Long 6.033 Spring 2015 Lecture #11: Transport Layer Congestion Control Hari Balakrishnan Scribed by Qian Long Please read Chapter 19 of the 6.02 book for background, especially on acknowledgments (ACKs), timers,

More information

Outline 9.2. TCP for 2.5G/3G wireless

Outline 9.2. TCP for 2.5G/3G wireless Transport layer 9.1 Outline Motivation, TCP-mechanisms Classical approaches (Indirect TCP, Snooping TCP, Mobile TCP) PEPs in general Additional optimizations (Fast retransmit/recovery, Transmission freezing,

More information

Tuning, Tweaking and TCP

Tuning, Tweaking and TCP Tuning, Tweaking and TCP (and other things happening at the Hamilton Institute) David Malone and Doug Leith 16 August 2010 The Plan Intro to TCP (Congestion Control). Standard tuning of TCP. Some more

More information

Transport Protocols and TCP: Review

Transport Protocols and TCP: Review Transport Protocols and TCP: Review CSE 6590 Fall 2010 Department of Computer Science & Engineering York University 1 19 September 2010 1 Connection Establishment and Termination 2 2 1 Connection Establishment

More information

CSE 473 Introduction to Computer Networks. Exam 2. Your name here: 11/7/2012

CSE 473 Introduction to Computer Networks. Exam 2. Your name here: 11/7/2012 CSE 473 Introduction to Computer Networks Jon Turner Exam 2 Your name here: 11/7/2012 1. (10 points). The diagram at right shows a DHT with 16 nodes. Each node is labeled with the first value in its range

More information

arxiv: v1 [cs.ni] 28 Sep 2015

arxiv: v1 [cs.ni] 28 Sep 2015 Stream-based aggregation of unreliable heterogeneous network links arxiv:19.8222v1 [cs.ni] 28 Sep 21 Abstract Last mile link is often a bottleneck for end user. However, users typically have multiple ways

More information

CS Transport. Outline. Window Flow Control. Window Flow Control

CS Transport. Outline. Window Flow Control. Window Flow Control CS 54 Outline indow Flow Control (Very brief) Review of TCP TCP throughput modeling TCP variants/enhancements Transport Dr. Chan Mun Choon School of Computing, National University of Singapore Oct 6, 005

More information

User Datagram Protocol

User Datagram Protocol Topics Transport Layer TCP s three-way handshake TCP s connection termination sequence TCP s TIME_WAIT state TCP and UDP buffering by the socket layer 2 Introduction UDP is a simple, unreliable datagram

More information

Cross-layer TCP Performance Analysis in IEEE Vehicular Environments

Cross-layer TCP Performance Analysis in IEEE Vehicular Environments 24 Telfor Journal, Vol. 6, No. 1, 214. Cross-layer TCP Performance Analysis in IEEE 82.11 Vehicular Environments Toni Janevski, Senior Member, IEEE, and Ivan Petrov 1 Abstract In this paper we provide

More information

CONCURRENT MULTIPATH TRANSFER USING TRANSPORT LAYER MULTIHOMING: PERFORMANCE UNDER VARYING BANDWIDTH PROPORTIONS

CONCURRENT MULTIPATH TRANSFER USING TRANSPORT LAYER MULTIHOMING: PERFORMANCE UNDER VARYING BANDWIDTH PROPORTIONS CONCURRENT MULTIPATH TRANSFER USING TRANSPORT LAYER MULTIHOMING: PERFORMANCE UNDER VARYING BANDWIDTH PROPORTIONS Janardhan R. Iyengar, Paul D. Amer Protocol Engineering Lab, Computer and Information Sciences,

More information