Delay-Constrained Optimized Packet Aggregation in High-Speed Wireless Networks

Teymoori P, Yazdani N. Delay-constrained optimized packet aggregation in high-speed wireless networks. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY 28(3): 525-539 May 2013. DOI 10.1007/s11390-013-1353-1

Peyman Teymoori 1 and Nasser Yazdani 1,2

1 School of Electrical and Computer Engineering, University College of Engineering, University of Tehran, Tehran, Iran
2 School of Computer Science, Institute for Research in Fundamental Sciences, Tehran, Iran

E-mail: {p.teymoori, yazdani}@ut.ac.ir

Received July 31, 2012; revised January 23, 2013.

Abstract High-speed wireless networks such as IEEE 802.11n have been introduced based on IEEE 802.11 to meet the growing demand for high-throughput and multimedia applications. It is known that the medium access control (MAC) efficiency of IEEE 802.11 decreases as the physical rate increases. To improve efficiency, a few solutions have been proposed, such as aggregation, which concatenates a number of packets into a larger frame and sends it at once to reduce the protocol overhead. Since transmitting larger frames leads to a dramatic increase in delay and jitter at other nodes, bounding the maximum aggregated frame size is important for satisfying delay requirements, especially those of multimedia applications. In this paper, we propose a scheme called Optimized Packet Aggregation (OPA), which models the network as a constrained convex optimization problem to obtain the optimal aggregation size of each node with respect to the delay constraints of the other nodes. OPA attains proportionally fair sharing of the channel while satisfying delay constraints. Furthermore, reaching the optimal point is guaranteed in OPA with low complexity. Simulation results show that OPA can successfully bound delay and meet the requirements of nodes, even in dynamic conditions, with only an insignificant throughput penalty due to limiting the aggregation size.
Keywords high-speed wireless network, delay requirement, aggregation, convex optimization

1 Introduction

Recently, interest in wireless technologies and their applications has grown steadily. To support rich multimedia applications such as high-definition television (HDTV) and DVD, the trend is to provide higher bandwidths in the network [1]. Consequently, the recently approved standard, IEEE 802.11n, supports physical rates of up to 600 Mbps [2]. IEEE 802.11 [3], as a de facto WLAN technology, introduces the Distributed Coordination Function (DCF), which is a distributed channel access mechanism based on Carrier Sense Multiple Access with Collision Avoidance (CSMA/CA). Unfortunately, higher physical rates reduce the MAC layer efficiency of the IEEE 802.11 protocol [4-5]. However, throughput at the MAC layer can be improved by aggregating several frames before transmission, which is the main enhancement in IEEE 802.11n. Frame aggregation not only reduces the transmission time for preambles and headers, but also reduces the waiting time during the random backoff period for successive frame transmissions [2].

One of the main requirements of high-speed WLANs is supporting real-time and delay-sensitive applications [6]. For real-time applications and delay-sensitive traffic, bounding delay and keeping delay predictable are of significant importance in WLANs [7]. Delay-sensitive applications over WLANs require that the network transmits packets in a timely manner. If packets do not arrive on time at their destinations and some of them experience delay, the network quality of service degrades [8]. Although frame aggregation can increase throughput at the MAC layer under ideal channel conditions, a larger aggregated frame causes each of the other stations to wait longer before it can access the channel [9]. Thus, the throughput increase is achieved at the expense of an increase in total delay at the MAC layer.
The problem is intensified in error-prone channels, since larger frames are more prone to corruption and may take even more time to be transmitted, leading to longer delay [9]. Moreover, delay might be even longer if nodes are saturated. Prior research has focused on proposing effective aggregation schemes, e.g., [1-2, 4-5, 10], while the effect of aggregation on delay and the problem of bounding that delay have received little attention. To the best of our knowledge, this is the first work on analyzing and bounding delay caused by aggregation for WLANs.

Regular Paper. 2013 Springer Science + Business Media, LLC & Science Press, China

To achieve high efficiency at the MAC layer in high-speed WLANs and get the best out of aggregation while bounding delay, we propose an analytical optimization model for aggregation called Optimized Packet Aggregation (OPA). As stated above, the main approach to increasing throughput is to aggregate more packets. Since increasing the aggregation size results in longer transmission times and, therefore, longer delay at the other nodes, the trade-off between throughput and delay is calculated using an analytical model. This model maximizes throughput while preventing delay from exceeding its bound. In other words, OPA finds the maximum aggregation size that nodes can use such that the delay requirements of nodes are not violated.

We analytically evaluate OPA in terms of fairness, convergence, and complexity. We show that OPA can attain weighted proportional fairness among nodes while satisfying their delay requirements. Convergence and uniqueness of the optimal solution are also guaranteed by OPA. Besides, OPA does not impose significant complexity on nodes. We implement our model in NS-2 and compare it with the method proposed in [1]. Simulation results show that OPA dramatically decreases average delay while maximizing throughput, especially in saturated situations with a large number of nodes transmitting high-rate traffic. Furthermore, OPA remains effective under dynamic conditions and for various traffic types.

The rest of the paper is organized as follows. Section 2 reviews related work. We introduce the analytical model of OPA in Section 3, and discuss its implementation issues in Section 4. Section 5 evaluates algorithmic issues of our model, while Section 6 presents detailed simulation results. Finally, we summarize our conclusions in Section 7.
2 Related Work

The efficiency of high-speed wireless networks can be improved by incorporating features such as aggregation, which is the main contribution of methods like burst acknowledgement (Burst ACK) [10] and block acknowledgement (Block ACK), e.g., [1, 5, 11]. Burst ACK transmits a burst of data frames and receives their corresponding ACK frames by performing only one backoff process. Block ACK enhances Burst ACK by transmitting only a single ACK frame for a series of data frames instead of separate ACK frames and, consequently, results in more overhead reduction. Concatenation (CM) and piggyback (PM) methods [5] further enhance efficiency. CM concatenates multiple frames into a single frame, while with PM the receiver is allowed to piggyback a data frame to the sender if it has data destined for the sender. To improve efficiency further, the AFR scheme [1] introduces a new delimitation mechanism with less overhead than [5], which is achieved by a fragmentation technique.

A similar enhancement in IEEE 802.11e [12] is the transmission opportunity (TXOP). Obtained through a successful contention, a TXOP defines a period of time during which a station may transmit multiple data frames without entering the backoff procedure. It reduces overhead by eliminating contention and backoff. To respond to the increasing demand for real-time and multimedia applications over wireless, IEEE 802.11n was standardized [11]. This standard defines two types of frame aggregation: aggregate MAC service data unit (A-MSDU) and aggregate MAC protocol data unit (A-MPDU), with a maximum size of 64 KB. Although the simulation results of [13] demonstrate the effectiveness of the IEEE 802.11n MAC layer, the standard does not specify exactly how many packets should be aggregated or how delay requirements are treated.

Along with the schemes above, some analytical models have been proposed to evaluate the performance of aggregation.
In [14], a theoretical model is presented to evaluate the saturation throughput of the burst transmission and acknowledgment (BTA) scheme under error-prone channel conditions in the ad hoc mode. Through simulation, the authors tried to find the optimal block size leading to the best performance. The authors of [9] proposed an analytical model to study the performance of IEEE 802.11n under uni-directional and bi-directional data transfer. They also numerically derived an optimal frame size adaptation algorithm with A-MSDU under error-prone channels.

Only a few studies have targeted limiting the aggregation size to solve problems of wireless networks. The authors of [15] tried to solve the performance anomaly of IEEE 802.11 multi-rate wireless networks. They introduced a variable transmission time which is similar to TXOP.

It is worth mentioning that a similar technique called Data Aggregation [16] is employed in wireless sensor networks with the aim of reducing the size of data transmitted along the aggregation tree towards the sink to save more energy. Size reduction can be performed by methods such as digesting data or compression, and the content of the aggregated packet is not necessarily the concatenation of the received packets forming the aggregated packet. Moreover, nodes may hold packets before transmission for as long as possible to increase aggregation. Studies such as [17] present an analytical method for determining how long an aggregation packet can be held to increase energy saving subject to message delay constraints. However, our focus in this paper is optimizing Packet Aggregation, where packets are only concatenated, not held, and there is no aggregation tree.

Network simulator (NS). http://www.isi.edu/nsnam/ns/, July 2012.

The above-mentioned studies focus only on how to concatenate packets into a larger frame to reduce the protocol overhead. Although throughput increases due to the elimination of some protocol headers and inter-frame spaces, the optimal concatenated frame length has not been investigated in these studies. In addition, the length is not calculated based on the requirements of nodes, such as delay. These two considerations are the bases of our optimization model.

3 Optimized Packet Aggregation

3.1 Network Model

We assume that there are $n$ nodes in a single-hop network which are contending for the wireless channel. As the worst case, all nodes are assumed to be saturated. The set of nodes is denoted by $N = \{1, \ldots, n\}$, and $r_i$ denotes the physical data rate of node $i$. The goal is to maximize the channel throughput by increasing the aggregation sizes of nodes without violating their delay constraints. Upon accessing the channel, node $i$ aggregates packets up to $x_i$ bytes, where $x_i \in [0, \mathit{max}_x]$ and $\mathit{max}_x$ denotes the maximum aggregation size, which depends on the protocol capabilities. This assumption is not tied to a particular aggregation method and, therefore, any of the methods above can be used as the underlying aggregation mechanism. Any aggregation method has an overhead caused by headers and frame checksums. In addition, there is an extra overhead imposed by the MAC and physical layers to transmit a packet.
Overheads stem from the channel time during which no data is transmitted, such as frame headers, checksums, inter-frame spaces such as SIFS and DIFS, control frames such as RTS and CTS, and physical layer preambles [4]. For generality, we consider only the total protocol overhead, denoted by $P_{OH}$ and measured in seconds.

Now, we can define the utility function of nodes. In the network context, the more a node accesses the channel, the more profitable the situation is for that node. Here, profit refers to the transmission time of a node. Therefore, the number of packets a node aggregates specifies its duration of accessing the channel and also its profit. The throughput of a node is directly related to its number of aggregated packets [1]. In other words, the utility function of node $i$ is an increasing function of its aggregation size, $x_i$. We use a logarithmic function, which is strictly increasing, as the utility function. The behavior of a logarithmic function with respect to its input is similar to that of a real throughput function of $x_i$, and the simulation results of [1] confirm this. For node $i$, it is defined as

$$U_i(x_i) = \omega_i \log x_i, \tag{1}$$

where $\omega_i$ is the weight of node $i$. We assume that nodes can be assigned different weights according to their channel access requirements and the level of quality of service determined by the upper layers. Our goal is to maximize the network utility. We define the network utility as

$$U(x) = \sum_{i=1}^{n} U_i(x_i), \tag{2}$$

which is the sum of the utilities of all nodes, where $x$ denotes the vector whose elements are $x_i$, $i \in N$.

We define the delay constraint of a node as the maximum duration that the node can wait to access the channel. The length of this duration is usually specified by the traffic profile of the node, and we denote it by $d_i$ for node $i$.
Since access to the channel of a wireless network is stochastic, nodes may access the channel unevenly in the short term, but we assume that the channel access mechanism is fair and that the expected number of channel accesses is the same for all nodes. This is the long-term behavior of IEEE 802.11. Analytically, for each node $i$ we should have $B_i(x) \le d_i$, where

$$B_i(x) = (n-1)P_{OH} + \sum_{j \ne i} \frac{x_j}{r_j}, \tag{3}$$

which is the time required for the other nodes to transmit their aggregated frames of size $x_j$; this time should be shorter than or equal to $d_i$. This duration includes the protocol overheads of the other $n-1$ nodes as well.

In order to have an upper bound on the sum of the aggregation sizes of nodes, as another constraint, we assume $\sum_{i=1}^{n} x_i \le \mathit{max}_{aggr}$, where $\mathit{max}_{aggr}$ denotes the maximum total aggregation size of all nodes. We use this constraint mainly for attaining fairness among nodes, which will be discussed in detail later.

3.2 Optimization Problem

Convex optimization and constraint programming [18] have been used in many engineering applications to reach an optimum. They are promising approaches that can maximize a utility function in the presence of constraints. Here, we formulate our problem as a convex optimization problem. Through the Primal Problem, we present the network optimization problem, which is centralized and needs all network information. Then, in order to have a distributed version of this optimization problem that is suitable for WLANs, we present its Dual Problem and show how nodes can follow a distributed iterative method to solve the optimization problem.

3.2.1 Primal Problem

We model the optimized packet aggregation problem as follows:

$$\max_{x} \; U(x), \tag{4}$$
$$\text{s.t.} \quad B_i(x) \le d_i, \quad i \in N, \tag{5}$$
$$\sum_{i=1}^{n} x_i \le \mathit{max}_{aggr}, \tag{6}$$
$$x: \text{a vector of positive integers}. \tag{7}$$

This means that we try to find a vector $x$ which specifies the aggregation size of each node in order to achieve the maximum utility in the network. Constraint (5) restricts the aggregation sizes so that the delay requirements of nodes are met. Constraint (6) is the upper bound defined on the sum of the aggregation sizes. In this subsection, we assume that $\mathit{max}_{aggr}$ is large enough that the feasible set of constraint (5) is a subset of that of (6). Hence, we neglect constraint (6) in the following formulation, but we will return to it in the next subsection. Moreover, since the above problem involves integer programming, which makes computing the solution complex, we relax it by removing constraint (7). Then, we round the solution to obtain integer outputs. Simulation results will show that this approximation still gives good results.

3.2.2 Dual Problem

Although the objective of problem (4) can be separated among nodes, the constraints are coupled across the network. This coupling would require a centralized method, which imposes extra overhead. In order to have a distributed solution, and for the sake of simplicity in designing the channel access protocol, we solve the problem through its dual problem.
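As a minimal sketch of the primal problem (not the authors' implementation), the following Python snippet evaluates objective (4) and checks constraints (5) and (6) for a candidate allocation. The function names and the example values are hypothetical, and the rates are assumed to be given in bytes per second so that each term x_j / r_j is a time in seconds.

```python
import math


def channel_wait(x, r, p_oh, i):
    """B_i(x) from (3): overhead of the other n-1 nodes plus their airtime."""
    n = len(x)
    return (n - 1) * p_oh + sum(x[j] / r[j] for j in range(n) if j != i)


def network_utility(x, w):
    """U(x) from (1)-(2): weighted sum of log aggregation sizes."""
    return sum(wi * math.log(xi) for wi, xi in zip(w, x))


def is_feasible(x, r, d, p_oh, max_aggr):
    """True if x meets every delay constraint (5) and the sum bound (6)."""
    delay_ok = all(channel_wait(x, r, p_oh, i) <= d[i] for i in range(len(x)))
    return delay_ok and sum(x) <= max_aggr


# Illustrative values only (assumed, not taken from the paper):
# three nodes, 1e6 bytes/s each, 100 us overhead, 5 ms delay bound.
rates = [1_000_000.0] * 3
delays = [0.005] * 3
print(is_feasible([2000, 2000, 2000], rates, delays, 0.0001, 65535))
print(is_feasible([3000, 3000, 3000], rates, delays, 0.0001, 65535))
```

With these assumed numbers, the first allocation keeps every node's waiting time within its bound, while the second one exceeds it, which is the situation constraint (5) is meant to exclude.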
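First, we take the constraints into account by defining the Lagrangian. Writing problem (4)-(5) in its equivalent minimization form (minimizing $-U(x)$) yields

$$L(x, \lambda) = -U(x) + \sum_{i=1}^{n} \lambda_i \left(B_i(x) - d_i\right), \tag{8}$$

where $\lambda_i$ is the Lagrange multiplier associated with the $i$-th inequality constraint, and the vector $\lambda = (\lambda_i, i \in N)$ is called the dual variable of problem (4). Then, the Lagrangian dual function is defined as

$$g(\lambda) = \inf_{x} L(x, \lambda), \tag{9}$$

which is the infimum of the Lagrangian over $x$. For any $\lambda \ge 0$, the dual function yields a lower bound on the optimal value of the minimization form of problem (4)-(5) [18]. To solve (9), we should find $x_i$ such that

$$\frac{\partial L(x, \lambda)}{\partial x_i} = 0. \tag{10}$$

Taking the partial derivative of (8) with respect to $x_i$ and solving for $x_i$ yields

$$x_i = \frac{\omega_i r_i}{\sum_{j \ne i} \lambda_j}. \tag{11}$$

Substituting (11) in (9) gives

$$\begin{aligned}
g(\lambda) = {} & -\omega_1 \log \frac{\omega_1 r_1}{\lambda_2 + \lambda_3 + \cdots + \lambda_n} - \omega_2 \log \frac{\omega_2 r_2}{\lambda_1 + \lambda_3 + \cdots + \lambda_n} - \cdots - \omega_n \log \frac{\omega_n r_n}{\lambda_1 + \lambda_2 + \cdots + \lambda_{n-1}} \\
& + \lambda_1 (n-1) P_{OH} - \lambda_1 d_1 + \lambda_2 (n-1) P_{OH} - \lambda_2 d_2 + \cdots + \lambda_n (n-1) P_{OH} - \lambda_n d_n \\
& + \frac{\omega_2 \lambda_1}{\lambda_1 + \lambda_3 + \cdots + \lambda_n} + \frac{\omega_3 \lambda_1}{\lambda_1 + \lambda_2 + \lambda_4 + \cdots + \lambda_n} + \cdots + \frac{\omega_n \lambda_1}{\lambda_1 + \lambda_2 + \cdots + \lambda_{n-1}} \\
& + \frac{\omega_1 \lambda_2}{\lambda_2 + \lambda_3 + \cdots + \lambda_n} + \frac{\omega_3 \lambda_2}{\lambda_1 + \lambda_2 + \lambda_4 + \cdots + \lambda_n} + \cdots + \frac{\omega_n \lambda_2}{\lambda_1 + \lambda_2 + \cdots + \lambda_{n-1}} \\
& + \cdots + \frac{\omega_1 \lambda_n}{\lambda_2 + \lambda_3 + \cdots + \lambda_n} + \frac{\omega_2 \lambda_n}{\lambda_1 + \lambda_3 + \cdots + \lambda_n} + \cdots + \frac{\omega_{n-1} \lambda_n}{\lambda_1 + \cdots + \lambda_{n-2} + \lambda_n}.
\end{aligned} \tag{12}$$

The Lagrangian dual problem finds the tightest of these bounds, which in turn bounds the maximum achievable network utility from above. This problem is defined as

$$\max_{\lambda} \; g(\lambda), \tag{13}$$
$$\text{s.t.} \quad \lambda \ge 0. \tag{14}$$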
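To make the relationship between the dual variables and the aggregation sizes concrete, here is a small Python sketch of (11) and of the dual function evaluated through (8)-(9). It assumes the Lagrangian is taken for the minimization form of the problem (that is, with -U(x)), and it assumes every sum of the other nodes' multipliers is strictly positive. The function names and sample values are hypothetical.

```python
import math


def primal_from_dual(lam, w, r):
    """x_i = w_i * r_i / sum_{j != i} lam_j, as in (11)."""
    total = sum(lam)
    return [w[i] * r[i] / (total - lam[i]) for i in range(len(lam))]


def dual_value(lam, w, r, d, p_oh):
    """g(lam): the Lagrangian (8), in minimization form, evaluated at (11)."""
    n = len(lam)
    x = primal_from_dual(lam, w, r)
    value = -sum(w[i] * math.log(x[i]) for i in range(n))
    for i in range(n):
        b_i = (n - 1) * p_oh + sum(x[j] / r[j] for j in range(n) if j != i)
        value += lam[i] * (b_i - d[i])
    return value


# Assumed sample values, chosen only so the arithmetic is easy to follow.
lam = [1.0, 1.0, 2.0]
print(primal_from_dual(lam, [1.0, 1.0, 1.0], [10.0, 10.0, 10.0]))
```

In this sketch a node's size shrinks as the multipliers of the other nodes grow, which is how a tight delay constraint at one node pushes down the aggregation sizes of its neighbors.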

The Lagrangian dual problem (13)-(14) is a convex optimization problem, since the objective to be maximized is concave and the constraint is convex. The problem can be solved by differentiating $g(\lambda)$ with respect to $\lambda$, which yields

$$\frac{\partial g}{\partial \lambda_i} = (n-1)P_{OH} - d_i + \sum_{j \ne i} \frac{\omega_j}{\sum_{k \ne j} \lambda_k}. \tag{15}$$

As an example, (15) for $i = 1$ is written as

$$\frac{\partial g}{\partial \lambda_1} = (n-1)P_{OH} - d_1 + \frac{\omega_2}{\lambda_1 + \lambda_3 + \cdots + \lambda_n} + \frac{\omega_3}{\lambda_1 + \lambda_2 + \lambda_4 + \cdots + \lambda_n} + \cdots + \frac{\omega_n}{\lambda_1 + \lambda_2 + \cdots + \lambda_{n-1}}. \tag{16}$$

Solving $\partial g / \partial \lambda_i = 0$ yields the optimum value, which is denoted by $\lambda_i^*$, but due to the complexity of (15) we cannot give a closed-form solution. Therefore, the optimum point is calculated iteratively. To obtain a distributed solution with low complexity, we solve the dual problem using the gradient projection method [19]. In a minimization problem, the gradient projection method iteratively steps in the direction opposite to the gradient of the objective function and projects the result back onto the feasible set; for the maximization problem (13)-(14), the step is taken along the gradient. Using the following iterative equation, $\lambda_i$ is calculated for each node $i$:

$$\lambda_i^{(k+1)} = \left[\lambda_i^{(k)} + \gamma \frac{\partial g}{\partial \lambda_i^{(k)}}\right]^{+}, \tag{17}$$

where $\lambda_i^{(k)}$ denotes the value of $\lambda_i$ at iteration $k$ and $[z]^{+} = \max(z, 0)$. This means that in each iteration $k$, the values of $\lambda_i$ are updated and improved. $\gamma$ denotes the step length. Equation (17) is a descent-type method that produces a maximizing sequence for the optimization problem [18]. By this we mean an algorithm that computes a sequence of points $\lambda^{(0)}, \lambda^{(1)}, \ldots \in \mathrm{dom}(g)$ with $g(\lambda^{(k)}) \to p^*$ as $k \to \infty$, where $\mathrm{dom}(g)$ is the domain of $g$ and $p^*$ denotes the optimal value of $g$. The algorithm terminates when $p^* - g(\lambda^{(k)}) \le \epsilon$, where $\epsilon > 0$ is some specified tolerance. At each iteration, one of the $\lambda_i$ is updated and its new value is used in the calculation of the other $\lambda$ values. This process continues until we reach the $\epsilon$-neighborhood of the optimal $\lambda_i^*$.
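The iteration in (17) can be sketched in a few lines of Python. This is an illustrative, centralized sketch rather than the distributed protocol: it updates all multipliers simultaneously instead of one at a time, it stops when the largest change in any multiplier falls below a tolerance (the optimal value is unknown in practice), and it guards the denominators with a small floor. The step length, the starting point, the time unit (milliseconds), and all names are assumptions.

```python
def dual_gradient(lam, w, d, p_oh, floor=1e-12):
    """Gradient (15): (n-1)*P_OH - d_i + sum_{j != i} w_j / sum_{k != j} lam_k."""
    n = len(lam)
    total = sum(lam)
    # share[j] equals x_j / r_j under (11); the floor avoids division by zero.
    share = [w[j] / max(total - lam[j], floor) for j in range(n)]
    all_share = sum(share)
    return [(n - 1) * p_oh - d[i] + (all_share - share[i]) for i in range(n)]


def solve_dual(w, d, p_oh, gamma=0.02, tol=1e-10, max_iter=5000):
    """Projected gradient ascent on g, following (17)."""
    lam = [1.0] * len(w)
    for _ in range(max_iter):
        grad = dual_gradient(lam, w, d, p_oh)
        new = [max(l + gamma * g, 0.0) for l, g in zip(lam, grad)]
        done = max(abs(a - b) for a, b in zip(new, lam)) < tol
        lam = new
        if done:
            break
    return lam


def sizes_from_dual(lam, w, r):
    """Aggregation sizes from (11)."""
    total = sum(lam)
    return [w[i] * r[i] / (total - lam[i]) for i in range(len(lam))]


# Assumed scenario: times in ms, rates in bytes per ms, equal weights.
weights = [1.0, 1.0, 1.0]
delays_ms = [5.0, 5.0, 8.0]
lam = solve_dual(weights, delays_ms, p_oh=0.1)
print(lam, sizes_from_dual(lam, weights, [10000.0] * 3))
```

In this assumed scenario the multiplier of the node with the loose 8 ms bound goes to zero, while the two nodes with 5 ms bounds end up with equal multipliers and equal, larger aggregation sizes. A step length that is too large for the chosen units can make the iteration oscillate, so gamma here should be read as a tuned example value, not a recommendation.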
3.3 Fairness

Fairness is one of the fundamental properties of resource sharing systems [19]. Since one of the main applications of high-speed wireless networks is multimedia [20-21], attaining some kind of fairness among nodes is significantly important. In wireless networks based on CSMA/CA, the amount of service each node receives can be differentiated through a variety of parameters such as contention window size and inter-frame space [22]. It is noteworthy that in this paper, we assume that regardless of the service nodes demand, they have delay requirements that should be satisfied as well. There exist different methods for providing service differentiation, and it is of significant importance to consider delay requirements carefully while using any of these methods.

Aggregation size determines the duration for which a node can access the channel, and it can also be used as one of the service differentiation parameters. Since our approach targets aggregation size to satisfy delay constraints, it can also maintain some form of service differentiation among nodes. We use the weight vector $\omega$ in utility function (2) as an index of how much resource a user or a class of users is granted. Although each user can have his/her own weight in our model, we consider four priority classes with different weights and assign users to these classes. Moreover, each class has its own delay requirement that determines $d_i$ of node $i$. Clearly, users of the same class have the same weight and delay requirement, and they are expected to receive the same channel share. We will discuss this further in the simulation section.

Here, we evaluate the fairness properties of OPA under different conditions. First, we present some conditions which will be used later during the discussion on the fairness of our approach:

C1: $\omega_i = \omega_j$ for all nodes $i$ and $j$.
C2: the feasible set of constraint (6) is a subset of the feasible set of constraint (5).
C3: the feasible set of constraint (5) is a subset of the feasible set of constraint (6).
C4: $r_i = r_j$ for all $i$ and $j$.

Through the following definitions and lemmas, we establish the necessary background to examine the fairness of OPA.

Definition 1 (Weighted Proportional Fairness [23] (WPF)). An allocation vector $x$ is weighted proportionally fair if and only if, for any other feasible allocation $y$, we have

$$\sum_{i=1}^{n} \omega_i \frac{y_i - x_i}{x_i} \le 0. \tag{18}$$

Lemma 1 (Optimality Condition [19]). If $x^*$ is the solution of $\max f(x)$, where $f(\cdot)$ is continuously differentiable, then $(x - x^*)^{T} \nabla f(x^*) \le 0$ for every feasible $x$.

Lemma 2. Utility function (2) satisfies (18).

Proof. From Lemma 1 we have

$$(x - x^*)^{T} \, \nabla \left( \sum_{i=1}^{n} U_i(x_i^*) \right) \le 0. \tag{19}$$

Applying this to the utility function yields

$$\sum_{i=1}^{n} (x_i - x_i^*) \, \frac{\partial U_i(x_i^*)}{\partial x_i} \le 0. \tag{20}$$

After taking the derivative of each $U_i$, evaluating it at the point $x^*$, and renaming the arbitrary feasible vector as $y$, we have

$$\sum_{i=1}^{n} \omega_i \frac{y_i - x_i^*}{x_i^*} \le 0, \tag{21}$$

which shows that (2) satisfies (18).

Now, through the following theorem, we show how OPA maintains fairness under C2.

Theorem 1. Under condition C2, OPA attains weighted proportional fairness among nodes.

Proof. From Lemma 2 we can see that the utility function of OPA satisfies WPF condition (18). Since C2 holds, we can neglect constraint (5), and only constraint (6) remains. By solving the optimization problem we get

$$x_i^* = \frac{\omega_i \, \mathit{max}_{aggr}}{\sum_{j=1}^{n} \omega_j}, \tag{22}$$

meaning that each node is assigned an aggregation size proportional to its weight. Therefore, under condition C2, our approach attains weighted proportional fairness.

Theorem 2. Under conditions C1 and C2, OPA attains proportional fairness.

Proof. As a direct result of Theorem 1, we can observe that assuming condition C1 changes (18) to the required condition of proportional fairness and, as a result, (22) changes to an equal aggregation size assignment. Therefore, we have $x_i = x_j = \frac{1}{n} \mathit{max}_{aggr}$ for all $i$ and $j$.

Theorems 1 and 2 indicate that when nodes do not have strict delay requirements, or there is no delay constraint, OPA attains WPF among nodes. Hence, nodes can be easily differentiated based on the service they receive. In the rest of this subsection, we discuss delay-constrained networks and how OPA operates under this condition. First, we start by presenting a definition.

Definition 2 (Delay-Constrained WPF). An aggregation size allocation vector $x$ is delay-constrained weighted proportionally fair if and only if it satisfies both WPF condition (18) and constraint (5). The attained fairness is called Delay-Constrained WPF (DCWPF).
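The closed form (22) from Theorem 1 and the WPF test (18) are easy to check numerically. The Python sketch below is only an illustration under condition C2 (the delay constraints are ignored); the function names and the sample weights and sizes are assumptions.

```python
def wpf_allocation(w, max_aggr):
    """Aggregation sizes from (22): each node gets a share proportional to its weight."""
    total = sum(w)
    return [wi * max_aggr / total for wi in w]


def wpf_score(x, y, w):
    """Left-hand side of (18); it is non-positive for every feasible y if x is WPF."""
    return sum(wi * (yi - xi) / xi for wi, xi, yi in zip(w, x, y))


# Assumed example: weights 1:2:1 sharing a total of 64000 bytes.
weights = [1.0, 2.0, 1.0]
x_star = wpf_allocation(weights, 64000)
other = [10000.0, 30000.0, 20000.0]  # another allocation with a smaller sum
print(x_star, wpf_score(x_star, other, weights))
```

Because every ratio of weight to size is the same constant under (22), the score reduces to that constant times the difference between the competing allocation's total and the sum bound, so it can never be positive for an allocation that respects constraint (6).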
It can be observed that a DCWPF allocation vector $x$ satisfies (18) as a result of WPF. The only difference is that (22) is not necessarily valid for the elements $x_i$. In other words, DCWPF does not necessarily allocate aggregation sizes exactly proportional to the weights of nodes; instead, it tries to proportionally increase the allocated sizes of more heavily weighted nodes as long as the delay constraints of the other nodes are not violated. Now, let us see how OPA attains fairness under the new condition.

Theorem 3. Under condition C3, OPA attains DCWPF.

Proof. According to Lemma 2, it can be verified that utility function (4) satisfies (18). Since C3 holds, the solution of OPA, denoted by $x^*$, satisfies constraint (5). Therefore, according to the definition of DCWPF, $x^*$ is delay-constrained weighted proportionally fair and, as a result, OPA attains DCWPF under C3.

As stated before, one of the results of condition C3 is that $x_i / x_j$ is not exactly proportional to $\omega_i / \omega_j$. On the other hand, $d_i$ and $d_j$ have a direct impact on the ratio $x_i / x_j$. In the next propositions, we discuss a number of possible situations and their fairness properties according to the OPA implementation which will be presented later.

Proposition 1. Let $(x_i^*, i \in N)$ be the primal optimal aggregation size allocation vector. For any two nodes $i$ and $j$ under condition C4,
1) if they have the same delay requirements and weights, i.e., $d_i = d_j$ and $\omega_i = \omega_j$, then $x_i^* = x_j^*$;
2) if $d_i \le d_j$ and $\omega_i = \omega_j$, then $x_i^* \ge x_j^*$;
3) if $d_i = d_j$ and $\omega_i \ge \omega_j$, then $x_i^* \ge x_j^*$.

Proof. 1) According to complementary slackness, one of the Karush-Kuhn-Tucker (KKT) [18] conditions, we have $\lambda_i^* (B_i(x^*) - d_i) = 0$ for all $i$. Since the problem is symmetric (condition C4), $\lambda_i^*$ and $\lambda_j^*$ are either both zero or both greater than zero. In the first case, according to (11), we have $x_i^* = x_j^*$. In the second case, we have

$$B_i(x^*) - d_i = B_j(x^*) - d_j. \tag{23}$$

Since $d_i = d_j$ and $\omega_i = \omega_j$, after some manipulation (23) yields $x_i^* = x_j^*$.
2) Similar to the above argument and according to (23), if $d_j$ increases to values larger than $d_i$, $B_j$ should increase for the equality to hold. There is only one term in $B_j$ that does not appear in $B_i$, and this term involves $x_i$. Therefore, we should have $x_i^* \ge x_j^*$ for equality (23) to hold. This proves the claim.

3) Similarly, if $\omega_i$ increases in (23), then for the equality to hold we should have

$$\frac{\omega_j}{\sum_{k \ne j} \lambda_k} = \frac{\omega_i}{\sum_{k \ne i} \lambda_k}. \tag{24}$$

Therefore, we have $\lambda_i \le \lambda_j$. Applying this inequality and $\omega_i \ge \omega_j$ to (11) yields $x_i^* \ge x_j^*$.

Remark 1. If condition C4 does not hold, some changes in the values of $x_i$ may happen. Since $x_i$ is directly proportional to $r_i$, increasing the rate to $r_i'$ where $r_i' = \alpha r_i$ results in aggregation size $x_i' = \alpha x_i$. Therefore, considering this relationship between $x_i$ and $r_i$ can extend Proposition 1 to situations where C4 does not hold.

In addition to the situations discussed in the previous proposition, other questions may arise in the network, for example, how much delay less-constrained or more heavily weighted nodes experience under OPA.

Proposition 2. 1) Less-constrained nodes may experience shorter delay than their maximum tolerable amount determined by $d_i$. 2) More heavily weighted nodes may experience shorter delay than their maximum tolerable amount.

Proof. 1) According to duality theory, the Lagrange multipliers of the constraints that are not completely satisfied are zero. The slack of the constraint of a less-constrained node is zero only where $B_i(x) = d_i$. This equality is achievable only if the other nodes can increase their aggregation sizes, and this cannot be done without violating the constraints of more constrained nodes; hence a positive slack occurs, which yields $B_i(x) < d_i$. This means that the nodes other than $i$ cannot occupy the channel for as long as $d_i$, and node $i$ experiences shorter delay than $d_i$.

2) Let $x_{-i}$ denote the aggregation size vector of the nodes other than node $i$. According to Proposition 1, increasing the weight of node $i$ results in a larger value of $x_i$ and smaller values in $x_{-i}$ to meet delay requirements. Since $B_i(x)$ depends only on $x_{-i}$, so that we can denote it by $B_i(x_{-i})$, it decreases as a consequence, meaning that it may become smaller than $d_i$.
So far, we have examined what happens to the optimal allocation vector as a consequence of changing weights. Since DCWPF does not necessarily share the channel exactly in proportion to the weights, in some situations we might need to know whether we can assign weights to a number of delay-constrained nodes such that a desired proportion (allocation vector) is attained. In other words, given an allocation size vector, we might aim to obtain its associated weight vector. To show that this is possible under OPA, we first establish the necessary terminology through the following definitions. Then, we prove a theorem that shows it is possible to find such a weight vector.

Definition 3 (Feasible Vector). An aggregation size allocation vector is feasible if it satisfies constraints (5) and (6).

Definition 4 (Attainable Vector). An aggregation size allocation vector $x$ is attainable if $x$ is primal optimal for problem (4)-(6).

Definition 5 (Completely Satisfied Constraint). A constraint $B_i$ is completely satisfied if $B_i(x_{-i}) = d_i$, where $x_{-i}$ denotes the aggregation sizes of the nodes other than $i$. Therefore, no element $x_j$ in $x_{-i}$ can be increased without violating the constraint.

Theorem 4. Any feasible aggregation size allocation vector $x$ is attainable through an appropriate weight vector $\omega$, provided that each $x_i$ participates in at least one completely satisfied constraint.

Proof. The condition on completely satisfied constraints guarantees that no $x_i$ can be increased any further, because we need to maximize utility function (4) and the iterative equation (17) should converge to the maximum point. In order to prove the theorem, we need to show that an appropriate vector $\omega$ exists for any such $x$. Since our problem is convex (the utility function is concave and the constraints are convex), if we can find $\lambda$ and $\omega$ such that the KKT sufficient conditions hold, then $x$ is attainable (globally maximum). For complementary slackness to hold, we should have $\lambda_i (B_i(x) - d_i) = 0$ for each constraint $i$. For the constraints that are not completely satisfied, we have $\lambda_i = 0$.
Using (11), we have $n$ relations among the $\omega_i$ and $\lambda_i$ variables, where all $x_i$ are known. Now, we should solve these relations for $\omega_i$ and $\lambda_i$ such that $\lambda_i$ is not greater than zero for constraints that are not completely satisfied. The $\omega$ obtained from the solution of these relations makes $x$ attainable.

The above theorem shows that we can design the OPA protocol such that it converges to any aggregation size allocation vector that satisfies the conditions of the theorem and hence guarantees satisfaction of the delay constraints. Further, since we differentiate service based on the aggregation size (the time a node captures the channel), any such service allocation strategy is available (attainable) for nodes.

4 Implementation

Using OPA, nodes should follow a distributed approach to compute their optimal aggregation sizes. In this case, each node only knows its own delay requirement ($d_i$), physical rate ($r_i$), protocol overhead ($P_{OH}$), and weight ($\omega_i$). As (15) shows, each node also requires $\lambda$, the number of nodes ($n$), and $\omega$ in order to evaluate the iteration (17). Nodes can estimate the number of nodes contending for the channel by listening to transmissions in the channel and extracting their IDs (node identifiers). In addition, nodes need some information about the others. For this reason, two particular fields are included in the MAC header of the protocol. A 4-byte field, called LM (Lagrange Multiplier), carries $\lambda_i$. Moreover, a 1-byte field called W (weight) is added to the header to carry $\omega_i$. Each node $i$ calculates the latest value of its $\lambda_i$ right before transmission and sets the LM and W fields to the corresponding values. The receiving node and the other listening nodes extract the values of these fields and update their local information regarding the transmitting node.

As another useful feature for dynamic environments, we assume that nodes indicate whether they have more backlogged packets in their MAC queues through a particular field. Protocols such as IEEE 802.11n support this feature. Therefore, nodes can consider only Active nodes (nodes with backlogged packets) in their future calculations. We call this field A (active). This field can also be used to ignore inactive nodes such as those in the sleep mode. Each node $i$ considers $\lambda_j$ in (11) only if node $j$ is active. If an inactive node $j$ later wins the contention and transmits a data frame, the other nodes consider $\lambda_j$ in their calculations from then on. It is worth noting that we assume that nodes are saturated most of the time and that the periods during which they are idle or active are not short (e.g., not less than 1 second). In this case, nodes have enough time to converge to the new condition. We will discuss how our approach operates in dynamic environments in more detail in Section 6.

We also consider the case where a node is suddenly disconnected from the network for any reason while it still has backlogged packets. This can happen due to the mobility of the user as well.
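As a rough illustration of the header fields described above, the Python sketch below packs and parses LM, W, and A and keeps the per-neighbor state in a dictionary. The text only fixes the field sizes of LM (4 bytes) and W (1 byte); encoding the multiplier as a single-precision float, giving A a full byte, the network byte order, and all names are assumptions made for this example.

```python
import struct

# Assumed wire layout: LM as a 32-bit float, W as one byte, A as one byte.
HEADER = struct.Struct("!fBB")


def pack_fields(lam, weight, active):
    """Serialize the LM, W, and A fields that a sender adds to a data frame."""
    return HEADER.pack(lam, weight, 1 if active else 0)


def update_neighbor(neighbors, node_id, payload):
    """Record the latest LM, W, and A values overheard from node_id."""
    lam, weight, active = HEADER.unpack(payload)
    neighbors[node_id] = {"lm": lam, "w": weight, "a": active}


def active_lambda_sum(neighbors):
    """Sum of the multipliers of active neighbors, the denominator of (11)."""
    return sum(e["lm"] for e in neighbors.values() if e["a"] == 1)


neighbors = {}
update_neighbor(neighbors, 7, pack_fields(0.3125, 2, True))
update_neighbor(neighbors, 9, pack_fields(0.5, 1, False))
print(neighbors, active_lambda_sum(neighbors))
```

A 32-bit float loses precision compared with the value the sender computed, so a real design would need to decide whether that rounding is acceptable or whether a fixed-point encoding is preferable.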
Since the node cannot set its active field to 0, the other nodes may keep considering its information in their calculations, which makes delay deviate from the desired delay constraints. Although this problem might be rare, to cope with it, we assume that nodes periodically set the active column of a node to 0 if they do not hear any frames from it for a duration longer than an inactive threshold (e.g., 1-2 seconds depending on the number of nodes and the physical rate). The threshold is chosen long enough that at least one transmission from an active node is guaranteed within that duration, so that an active node is not ignored, and short enough to detect inactive nodes as fast as possible.

Each node maintains the aforementioned information about the other nodes as a four-column list called the NI List. The columns are ID, LM, W, and A (indicating whether the node is active), respectively. As the communication proceeds and data frames are transmitted, each node extracts the information and updates its NI List. This process continues until all nodes reach the optimal value of $\lambda$ and, as a result of (11), the optimal aggregation sizes. Apart from the weight and activeness of nodes, the only value that must be transmitted is $\lambda$. This value implicitly contains all the information required to find an optimal solution, such as the physical rates, delay requirements, and current states of nodes. Therefore, transmitting all parameters of nodes is not required. Algorithm 1 shows the pseudo-code of this process for each node.

Algorithm 1.
OPA Algorithm for Node $i$

procedure Initialize
    $\lambda_i \leftarrow 1$
    $\omega_i \leftarrow$ Initial Weight
    NI List $\leftarrow$ Null
end procedure

procedure Listen/Receive(p: DataFrame)
    if p.type = DATA FRAME then
        Update NI List(p.ID, p.lm, p.w, p.a)
    end if
    ProcessReception(p)    // Rest of processing
end procedure

procedure Send(p: DataFrame)
    if p.type = DATA FRAME then
        $\lambda_i^{(k+1)} \leftarrow \left[ \lambda_i^{(k)} + \gamma \, \partial g / \partial \lambda_i^{(k)} \right]^{+}$
        $x_i \leftarrow \omega_i r_i \big/ \sum_{j \ne i,\, a_j = 1} \lambda_j$
        $x_i \leftarrow \mathrm{round}(\min(\max(x_i, x_{min}), x_{max}))$
        p.lm $\leftarrow \lambda_i^{(k+1)}$
        p.w $\leftarrow \omega_i$
        if MAC.queue $\ne$ Null then
            p.a $\leftarrow 1$
        else
            p.a $\leftarrow 0$
        end if
        Aggregate(p, $x_i$)    // Put at most $x_i$ bytes in p
        $k \leftarrow k + 1$
    end if
    ActualSend(p)
end procedure

Since, due to physical layer limitations, arbitrarily large frame sizes are not possible in a wireless channel [1], we use a boundary guard for $x_i$ in Algorithm 1. That is, if $x_i$ is outside the supported range of frame sizes, we assign the lower or upper bound to $x_i$. In Algorithm 1, we denote these two bounds by $x_{min}$ and $x_{max}$. As shown by the algorithm, nodes first initialize their local variables. Upon receiving any data frame,

they extract its LM, W, and A fields and update their local lists. When a node is about to transmit a data frame, it calculates $\lambda_i$ based on the latest $\lambda$ values it has received. Then, $x_i$ is obtained by considering only the $\lambda$ values of active nodes. Finally, the LM, W, and A fields are set and the aggregated frame is transmitted.

5 Evaluation

In this section, we evaluate the complexity and convergence of OPA. Since OPA is a distributed approach, it should be efficient in terms of time and message complexity. Further, nodes need to keep transmitting for a while to reach the optimal point. Therefore, we examine its convergence as well.

5.1 Algorithm Complexity

According to the OPA algorithm, the time complexity of OPA is $O(n^2)$. Each node $i$ computes (17) in order to update $\lambda_i$. The only time-consuming part of this formula is calculating $\partial g / \partial \lambda_i$. Equation (15) can be used to find this value for node $i$. For instance, (16) shows this calculation for node 1. From this, it is clear that there are $n-1$ fractions and each of them adds $n-1$ values. Therefore, the time complexity of OPA is $O(n^2)$.

OPA does not impose any extra messaging overhead because each node informs the other nodes of its local information (mainly its $\lambda_i$) only through the LM field in the MAC header of a data frame; no other specific frame is required. The only overhead is four bytes in the MAC header, and since the aggregated frame can be large (up to 64 KB in IEEE 802.11n [11]), only a very small fraction of the frame is used for this purpose. Furthermore, we use one byte for the weight, which is insignificant as well. Since the time complexity of OPA is not high, because the number of nodes is usually small, and OPA does not impose any message complexity, the overhead of OPA is insignificant. This means that nodes need not worry about energy consumption.
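The operation count behind the $O(n^2)$ bound can be illustrated with a short Python sketch. Equations (15) and (16) are not reproduced in this section, so the form of each fraction is an assumption: it is taken to be $\omega_k / \sum_{l \ne k} \lambda_l$, which is the form suggested by the Hessian in Theorem 7. The function name `fraction_sum` and the input values are hypothetical; only the counting argument ($n-1$ fractions, each adding $n-1$ values) comes from the text.

```python
def fraction_sum(i, lam, w):
    """Sum over k != i of w[k] / (sum over l != k of lam[l]).

    Returns the value together with the number of additions spent on the
    denominators, to make the (n - 1) * (n - 1) count visible.
    The form of each fraction is an assumption, see the lead-in.
    """
    n = len(lam)
    total = 0.0
    additions = 0
    for k in range(n):
        if k == i:
            continue  # n - 1 fractions: one for every other node
        denom = 0.0
        for l in range(n):
            if l != k:
                denom += lam[l]  # each denominator adds n - 1 values
                additions += 1
        total += w[k] / denom
    return total, additions


# Illustrative values only: four nodes with unit multipliers and weights.
value, additions = fraction_sum(0, [1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0])
print(value, additions)
```

For four nodes the inner loop runs $3 \times 3 = 9$ times, which matches the $(n-1)^2$ count given in the text.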
5.2 Convergence

An important issue regarding OPA is its convergence to the optimal point, i.e., the aggregation size leading to maximum throughput while taking delay constraints into account. In the context of optimization, the behavior of the objective function is important, especially when the function has multiple local maxima. If a function is strictly concave/convex, its maximum/minimum point, when it exists, is unique. The next theorem investigates the primal problem for this condition.

Theorem 5. The primal problem (4)-(5) is strictly concave and admits a unique maximizer.

Proof. To establish the concavity condition, we first prove that the utility function expressed as (1) is concave. $U_i(x_i)$ is strictly concave if $\partial^2 U_i(x_i) / \partial x_i^2 < 0$. Taking the second derivative of $U_i(x_i)$ yields $-1/x_i^2 < 0$, which proves that every utility function is strictly concave. According to (2), the network utility, $U(x)$, is a non-negative and non-zero weighted sum of strictly concave functions, and consequently, it is strictly concave. Constraint (5) is an affine function, and therefore, it is convex. Since the feasible set of the optimization problem (4) and (5) is compact and its objective is strictly concave, the optimal solution exists and is unique as a consequence of strict concavity. Further, there is no local maximum other than the global one.

In the next theorem, we examine why the primal problem results in the optimum condition while satisfying delay constraints.

Theorem 6. Problem (4) approximates the network throughput, but constraint (5) exactly specifies the aggregation size.

Proof. Assume that there are two nodes, one of which sends packets and the other receives them. According to the throughput relation presented in [2], for high physical rates, we have $T \approx c x_i$, where $T$, $c$, and $x_i$ denote the throughput, a constant, and the number of aggregated packets, respectively.
This means that throughput has an approximately linear relationship with $x_i$. This yields $c x_i > x_i > \log x_i = U(x_i)$. For $n$ nodes, we have $\sum_{i=1}^{n} c x_i > \sum_{i=1}^{n} x_i > \sum_{i=1}^{n} \log x_i = U(x)$. This means that problem (4) finds a lower bound for throughput. Since constraints (5) exactly specify how much delay nodes can tolerate in terms of the aggregation sizes of the other nodes, and the optimization process finds the maximum value of $x$ that satisfies these constraints, the result is an exact value for $x$ rather than an approximation.

Step length is one of the parameters of the dual problem (13) that has a significant effect on its convergence to the optimal point. We proved that this

problem has a unique optimal point. Now, through the following theorem, we show a sufficient condition on the step length under which the OPA algorithm converges to the optimal point.

Theorem 7. Assume that the step length $\gamma$ is chosen such that $0 < \gamma \le 2/\Omega$, where

$$\Omega = \max_{1 \le j \le n} \sum_{i=1}^{n} \sum_{k \ne i,j} \frac{\omega_k}{\left( \sum_{l \ne k} \lambda_l \right)^2}.$$

Then, starting from any initial point, the generated sequence $(\lambda^{(k)})$ converges to the dual optimum, which leads to the primal optimal solution of problem (4)-(6).

Proof. According to the descent lemma for the gradient-based optimization of a function $g(\lambda)$ with constant step length, if $\nabla g(\lambda)$ satisfies the Lipschitz continuity property, i.e., there exists $M$ such that for any two vectors $\lambda'$ and $\lambda''$ in the domain we have $\| \nabla g(\lambda') - \nabla g(\lambda'') \|_2 \le M \| \lambda' - \lambda'' \|_2$, then the sequence $(\lambda^{(k)})$ obtained from $\lambda^{(k+1)} = \lambda^{(k)} + \gamma \nabla g(\lambda^{(k)})$ converges to the optimal point provided that $\epsilon \le \gamma \le (2 - \epsilon)/M$ for some $\epsilon > 0$. To prove the theorem, we need to find $M$ such that the Lipschitz continuity property holds for $\nabla g(\lambda)$, and it is sufficient to show that the Hessian of $g(\lambda)$ is upper bounded in the $l_2$-norm. The Hessian of $g(\lambda)$ is defined as $H = [h_{ij}]_{n \times n}$, where $h_{ij} = \partial^2 g(\lambda) / \partial \lambda_i \partial \lambda_j$. Taking the derivative of (12) with respect to $\lambda_i$ and $\lambda_j$ yields

$$h_{ij} = \sum_{k \ne i,j} \frac{\omega_k}{\left( \sum_{l \ne k} \lambda_l \right)^2}.$$

Since $\|H\|_2^2 \le \|H\|_1 \|H\|_\infty$, we can find an upper bound on the Hessian. We have

$$\|H\|_1 = \max_{1 \le j \le n} \sum_{i=1}^{n} |h_{ij}| = \max_{1 \le j \le n} \sum_{i=1}^{n} \sum_{k \ne i,j} \frac{\omega_k}{\left( \sum_{l \ne k} \lambda_l \right)^2},$$

and since $H$ is symmetric, $\|H\|_1 = \|H\|_\infty$ and consequently $\|H\|_2 \le \|H\|_1$. Therefore, for sufficiently small $\epsilon$ we have

$$0 < \gamma \le \frac{2}{\max_{1 \le j \le n} \sum_{i=1}^{n} \sum_{k \ne i,j} \frac{\omega_k}{\left( \sum_{l \ne k} \lambda_l \right)^2}},$$

which proves the theorem.

6 Simulation Results

In this section, we examine the performance of OPA through simulation. The main goal of OPA is constraining delay while maximizing throughput in static and even dynamic environments with a variable number of nodes. The service differentiation capability of OPA is evaluated as well.
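As a numerical illustration of the step-length condition in Theorem 7, the following Python sketch builds the Hessian entries $h_{ij}$, takes the maximum column sum as $\Omega$, and returns $2/\Omega$. The function names `hessian` and `step_length_bound` are hypothetical, and the $\lambda$ and $\omega$ values are made up for the example rather than taken from the simulations.

```python
def hessian(lam, w):
    """Matrix of h_ij = sum over k not in {i, j} of w_k / (sum_{l != k} lam_l)^2."""
    n = len(lam)
    total = sum(lam)
    # q[k] is the k-th term w_k / (sum_{l != k} lam_l)^2
    q = [w[k] / (total - lam[k]) ** 2 for k in range(n)]
    return [
        [sum(q[k] for k in range(n) if k != i and k != j) for j in range(n)]
        for i in range(n)
    ]


def step_length_bound(lam, w):
    """Return 2 / Omega, where Omega is the largest column sum of |h_ij|."""
    h = hessian(lam, w)
    n = len(h)
    omega_cap = max(sum(abs(h[i][j]) for i in range(n)) for j in range(n))
    return 2.0 / omega_cap


# Illustrative values only: three nodes with unit multipliers and weights.
lam = [1.0, 1.0, 1.0]
w = [1.0, 1.0, 1.0]
print(step_length_bound(lam, w))
```

Because the entries depend on the current $\lambda$, the bound changes as the multipliers evolve; a fixed $\gamma$ has to respect the smallest bound encountered along the iteration.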
We use the AFR implementation, which was developed in NS-2 2.27, as the base aggregation method and enhance it with our approach. Our implementation is in NS-2 as well. The simulated network topology is a single-hop network of homogeneous nodes where node $i$ sends packets to node $i+1$. Simulation results are the average of 10 different runs with unique random seeds and a confidence level of 95%. The confidence intervals of throughput, peak delay, and average delay are no more than 1 Mbps, 0.04 second, and 0.004 second, respectively. The simulation time is 10 seconds, which is long enough for nodes to reach stable behavior. In all simulation scenarios, AFR uses the maximum aggregation size, i.e., 64 KB, which leads to the maximum achievable throughput. In contrast, OPA optimizes its aggregation size based on network parameters such as delay requirements. In addition, the two approaches are examined for two different types of traffic, CBR and HDTV, which are requirements of high-speed WLANs [6].

The network performance is evaluated by calculating the following metrics:
throughput: the rate at which the network can transmit packets at the MAC layer. In particular, the value of this metric represents the total number of bits of data transmitted in the wireless channel over the transmission time.
average delay: the average time a packet spends in the head-of-line aggregated frame in the MAC queue until the node can access the channel.
peak delay: the maximum delay a packet experiences is measured for each node, and this metric is the average of these per-node peak delays.
power: the ratio of the achieved throughput to the average delay.

6.1 CBR Traffic

One type of traffic used to examine OPA is CBR. It generates UDP packets of size 1 024 B at a constant rate equal to the physical rate. The packet generation rate is high enough to saturate nodes.
http://www.hamilton.ie/tianji li/afr.html, July 2012.
Network simulator (NS). http://www.isi.edu/nsnam/ns/, July 2012.

It is noteworthy that this is the worst condition under which OPA can operate because the CBR rate is very

high and nodes may be heavily backlogged, so that only very large aggregation sizes can help transmit the backlogged packets faster. Two different scenarios utilizing CBR are implemented. In the first one, different numbers of nodes with $d_i = 0.015$ second use CBR traffic to send packets to their destinations. The number of stations varies from 10 to 30. Results of this scenario are shown in Figs. 1-4. These figures show throughput, average delay, peak delay, and power, respectively, for a number of physical rates, which are labeled in the figures as AFR-X or OPA-X, where X denotes the rate in Mbps.

Larger aggregation sizes increase throughput more. On the other hand, average delay increases as well due to the longer access time to the medium. Referring to Fig.1, since OPA limits the aggregation size, its throughput decreases, but the decrease is insignificant for lower physical rates. In the presence of 30 nodes at a rate of 300 Mbps, OPA must constrain nodes more to satisfy their delay requirements. As a result, the network throughput decreases by at most 25 Mbps. In contrast, the results in Fig.2 show that OPA is successful in satisfying delay requirements of 0.015 second. No matter how many nodes are in the network or what the physical rate is, the average delays of nodes meet their requirements. As can be seen, AFR averages much higher delays than OPA, especially for lower rates, and its delay is sensitive to the number of nodes and the physical rate.

Fig.3 shows the peak delay results of the above scenario. Peak delay heavily depends on the aggregation size because larger sizes introduce larger variations in delay. Referring to the figure, OPA limits the peak delay for any number of nodes and any physical rate. However, using AFR, nodes face serious problems since peak delay increases dramatically.
As one of the main problems of protocols based on IEEE 802.11, short-term unfairness intensifies delay variance because nodes tend to recapture the channel after a successful access due to resetting their contention window size. Therefore, larger aggregation sizes lengthen the time it takes a node to access the channel. As shown in the figure, 30 nodes using AFR at a rate of 50 Mbps may experience delay close to 0.9 second, which is not tolerable by most applications [6]. Increasing the physical rate can mitigate this problem of AFR, but peak delay is still sensitive to the number of nodes. As shown in the figure, OPA can successfully bound peak delay as a result of bounding average delay.

Fig.1. Network throughput of OPA and AFR for various numbers of nodes.

Fig.2. Average delay of OPA and AFR for various numbers of nodes.

Fig.3. Peak delay of OPA and AFR for various numbers of nodes.

Since limiting the aggregation size decreases throughput, a question arises as to whether the overall result is beneficial. To answer this, we measure the power of the above scenario; the results are illustrated in Fig.4. Referring to this figure, the power of AFR decreases quickly as the number of nodes in the network increases. OPA, in comparison with AFR, achieves much higher power. Besides, its power decreases more slowly as more nodes are added to the network. This means that the decrease in delay is much larger than the decrease in throughput in OPA, and therefore, OPA is more beneficial in general. In addition, if we use peak delay instead of the average delay in calculating

power, OPA can achieve even more power than AFR for various parameters.

Fig.4. Power of OPA and AFR for various numbers of nodes.

To examine the effects of high physical rates on OPA, in the second scenario, the number of nodes is fixed to 20 and 40, and the physical rate varies from 100 Mbps to 600 Mbps. Nodes have delay requirements of 0.015 second. Performance results of this scenario are presented in Figs. 5-7.

Fig.5. Throughput of OPA and AFR for various physical rates.

Fig.6. Peak delay of OPA and AFR for various physical rates.

Fig.7. Average delay of OPA and AFR for various physical rates.

The main means to reduce the protocol overhead for higher physical rates is using larger aggregation sizes [4]. Referring to Fig.5, AFR can reach throughput values of 438 Mbps and 345 Mbps for 20 and 40 nodes, respectively, using the maximum aggregation size at the maximum physical rate, i.e., 600 Mbps. As OPA needs to limit the aggregation size to bound delay, its throughput values are 425 Mbps and 304 Mbps, respectively, at 600 Mbps as the worst case. This means that OPA can meet delay requirements at the cost of a throughput decrease of about 1 Mbps per node at most (13 Mbps over 20 nodes and 41 Mbps over 40 nodes). However, as illustrated by Figs. 6 and 7, it can at least halve the values of peak and average delay for this physical rate. Moreover, referring to these figures, lower rates introduce longer delays that can be bounded by OPA, while the throughput decrease of OPA for lower rates is much smaller than that for higher rates.

It is worth noting that in this scenario, nodes generate CBR traffic at their physical rate as the worst case, i.e., one node can saturate the network. As will be shown, OPA achieves throughput closer to that of AFR when nodes generate traffic at rates lower than their physical rate, as is the case for HDTV traffic.
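The per-node throughput cost quoted above for the 600 Mbps case can be checked with a few lines of Python. The sketch uses only the four throughput values stated in the text; the helper name `per_node_cost` and the relative-loss figure are added here for illustration and are not part of the original evaluation.

```python
# Throughput in Mbps at a physical rate of 600 Mbps, as quoted in the text:
# number of nodes -> (AFR throughput, OPA throughput)
results = {20: (438.0, 425.0), 40: (345.0, 304.0)}


def per_node_cost(afr, opa, nodes):
    """Throughput given up by OPA, divided evenly among the nodes (Mbps)."""
    return (afr - opa) / nodes


for nodes, (afr, opa) in results.items():
    cost = per_node_cost(afr, opa, nodes)
    relative = 100.0 * (afr - opa) / afr  # loss as a percentage of AFR
    print(f"{nodes} nodes: {cost:.3f} Mbps per node, {relative:.1f}% of AFR")
```

The 20-node case gives 0.65 Mbps per node and the 40-node case gives 1.025 Mbps per node, so the cost stays close to 1 Mbps per node in the worst case considered.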
6.2 HDTV Traffic

Supporting HDTV traffic is one of the requirements of high-speed wireless LAN protocols such as IEEE 802.11n [11]. It has a rate of 19.2-24 Mbps of constant-size packets and a 200 ms peak delay requirement. In the next scenario, we investigate the performance of OPA and AFR for HDTV traffic. The number of nodes varies from 10 to 30 with physical rates of 100 Mbps and 250 Mbps. Each node has one HDTV flow, and due to the importance of peak delay for HDTV, we set nodes' delay requirements such that their peak delay is at most 200 ms.

Fig.8 shows the throughput of the two schemes. The throughput decrease of OPA in this case is much lower than that in the CBR scenario because here nodes only send HDTV traffic at a rate of 19.2 Mbps, whereas in the CBR scenario, nodes send traffic at their physical rate and the network is fully saturated. As can be observed from

the figure, the throughput decrease of OPA is insignificant even for a large number of nodes.

Fig.8. Throughput of HDTV traffic over OPA and AFR for various numbers of nodes.

The main advantage of OPA in this scenario is bounding the peak delay of the HDTV flows. Referring to Fig.9, OPA averages 200 ms peak delay for any number of nodes at rates of 100 Mbps and 250 Mbps, meaning that, with an insignificant decrease in throughput, many HDTV flows can be transmitted in the network. In contrast, almost all peak delay values of AFR are larger than those of OPA, implying that no more than 10 HDTV flows can be transmitted simultaneously by AFR at rates lower than 250 Mbps.

Fig.9. Peak delay of HDTV traffic over OPA and AFR for various numbers of nodes.

6.3 Dynamic Networks

One of the main challenges of protocols like OPA is their behavior in dynamic environments. The OPA algorithm needs a number of iterations to reach the optimum aggregation size. If the convergence process takes a long time, the delay requirements of nodes cannot be satisfied. In addition, in a very dynamic network, nodes may join or leave the network, start transmission or stop it after a while, and become idle. In these cases, convergence to the optimum aggregation size might be impossible. Therefore, fast convergence to the optimum size is very important.

To investigate OPA in a dynamic network, we test it with a number of nodes sending CBR traffic at a rate of 250 Mbps with $d_i = 0.015$ second. Nodes start transmission at time zero and keep sending. In Fig.10, we measure the average aggregation size of nodes every 0.1 second. The optimal aggregation size for each number of nodes is drawn in the figure. It can be observed that the 10-node topology converges quickly to its optimum size. As the number of nodes increases, the convergence process takes longer, but after 0.6 second, the difference between the optimal and the actual aggregation sizes decreases to less than 2 KB for all topologies.

Fig.10. Aggregation size convergence of OPA at the network start time.

In the next scenario, we assume that the network of the previous scenario is operating under a stable condition, i.e., nodes have reached the optimum aggregation size, and then some nodes leave the network. Results of this scenario are illustrated in Fig.11, which shows the average aggregation size of nodes from second 5 to second 6.

Fig.11. Aggregation size change of OPA at second 5.

At second 5, five nodes leave the network and, consequently, the optimum aggregation size changes

from 46 KB to 64 KB, 22 KB to 36 KB, and 14 KB to 21 KB for the 10-node, 20-node, and 30-node topologies, respectively. Referring to the figure, the aggregation size of each topology converges quickly to the new optimum size after the change happens. In this case, after approximately 0.4 second the difference between the optimal and the actual aggregation sizes decreases to less than 2 KB.

6.4 Service Differentiation

Here, we investigate the capability of OPA to perform service differentiation among nodes. In the first scenario, there are a number of nodes with delay requirements of 0.015 second and a physical rate of 250 Mbps. Nodes are categorized into four priority classes, Class 0, Class 1, Class 2, and Class 3, with weights of 8, 4, 2, and 1, respectively. Nodes of the same class have equal weights. Fig.12 shows the throughput of each class for various numbers of nodes. In the 8-node topology, the optimum aggregation sizes of the classes are 97 KB, 53 KB, 29 KB, and 16 KB, respectively. Since the 97 KB size is bounded by 64 KB in the OPA algorithm, the throughputs of Class 0 and Class 1 are approximately the same. As the number of nodes increases, nodes of each class reach distinct aggregation sizes, which are approximately proportional to their weights while meeting the delay requirement. In the 12-node topology, the optimum aggregation sizes are 56 KB, 29 KB, 15 KB, and 8 KB, respectively, and the throughputs are approximately proportional to the weights. The throughput curves in the figure decrease for a larger number of nodes since collisions increase, but OPA still keeps the desired proportions among classes.

Fig.12. Per-class throughput of OPA for 4 weighted classes.

In the next scenario, each priority class has a different delay requirement, shown in Fig.13 in seconds.
As before, in the 8-node topology the optimum aggregation sizes are larger than 64 KB, and therefore, they are bounded by 64 KB. Hence, the per-class throughput values of the classes are the same. In the 12-node topology, the optimum sizes are 53 KB, 42 KB, 38 KB, and 38 KB, respectively. The sizes of Class 2 and Class 3 are the same according to Proposition 2 because the aggregation size of Class 3 can be increased without affecting the constraints, and this situation persists for the other topologies. We also observe a throughput decrease for topologies with more nodes.

Fig.13. Per-class throughput of OPA for 4 classes with different delay requirements.

7 Conclusions

One of the recent requirements of wireless networks is providing high-speed data transmission and supporting multimedia traffic. Some approaches increase the efficiency of the MAC layer by aggregating packets and reducing the protocol overhead. However, their main drawback is that aggregation dramatically increases delay (jitter and peak delay), which is intolerable for multimedia applications. In this paper, in order to achieve high efficiency at the MAC layer of these networks while solving the delay increase problem, we proposed an analytical model of the wireless medium access, Optimized Packet Aggregation (OPA), which suggests the optimized aggregation size for nodes. This model takes the delay requirements of nodes as the main optimization parameter and permits nodes to transmit only a particular amount of aggregated data such that the delay constraints of nodes are satisfied. To evaluate the proposed model, we extended the AFR implementation in NS-2. Simulation results indicate that OPA dramatically decreases the network average delay while maximizing throughput, especially in saturated situations in the presence of a large number of nodes and high physical rates.
OPA bounds delay with a very small cost of decreasing throughput since it limits the aggregation size to constrain delay. Moreover, OPA can successfully share the channel proportional

to weights while meeting delay requirements. We call this new kind of fairness Delay-Constrained Weighted Proportional Fairness (DCWPF). We also proved the convergence condition of OPA and showed that its complexity in terms of time and messages is very low. Results also show that under dynamic environments, OPA can quickly reach the new optimal condition. As future work, we will try to extend the analytical model to consider different aspects such as the channel error rate and the traffic patterns of nodes. Moreover, experimental evaluation of the protocol on real IEEE 802.11n testbeds is of interest to us.

References

[1] Li T, Ni Q, Malone D, Leith D, Xiao Y, Turletti T. Aggregation with fragment retransmission for very high-speed WLANs. Trans. Networking, 2009, 17(2): 591-604.
[2] Xiao Y. IEEE 802.11n: Enhancements for higher throughput in wireless LANs. IEEE Wireless Communications, 2005, 12(6): 82-91.
[3] IEEE 802.11 WG. Part 11: Wireless LAN medium access control (MAC) and physical layer (PHY) specification. Standard, IEEE, Aug. 1999.
[4] Xiao Y, Rosdahl J. Performance analysis and enhancement for the current and future IEEE 802.11 MAC protocols. ACM SIGMOBILE Mobile Computing and Communications Review, 2003, 7(2): 6-19.
[5] Xiao Y. IEEE 802.11 performance enhancement via concatenation and piggyback mechanisms. IEEE Transactions on Wireless Communications, 2005, 4(5): 2182-2192.
[6] Stephens A P, Bjerke B, Jechoux B et al. IEEE P802.11 Wireless LANs: Usage models, IEEE-802.11-03/802r23, May 2004.
[7] Raptis P, Vitsas V, Paparrizos K. Packet delay metrics for IEEE 802.11 distributed coordination function. Mobile Networks and Applications, 2008, 14(6): 772-781.
[8] Carvalho M M, Garcia-Luna-Aceves J J. Delay analysis of IEEE 802.11 in single-hop networks. In Proc. the 11th IEEE International Conference on Network Protocols, Nov. 2003, pp.146-155.
[9] Lin Y, Wong V W S.
Frame aggregation and optimal frame size adaptation for IEEE 802.11n WLANs. In Proc. IEEE Global Telecommunications Conf., Nov. 27-Dec. 1, 2006, pp.1-6.
[10] Vitsas V, Chatzimisios P, Boucouvalas A C et al. Enhancing performance of the IEEE 802.11 distributed coordination function via packet bursting. In Proc. IEEE Global Telecommunications Conference Workshops, Nov. 29-Dec. 3, 2004, pp.245-252.
[11] IEEE. IEEE 802.11n-2009: Amendment 5: Enhancements for higher throughput. Standard, IEEE, http://standards.ieee.org/findstals/standard/802.11n-2009.html. 2009.
[12] IEEE. IEEE 802.11e-2005-IEEE standard for information technology Local and metropolitan area networks Specific requirement Part 11: Wireless LAN medium access control (MAC) and physical layer (PHY) specifications Amendment: Medium access control (MAC) quality of service enhancements. Standard, IEEE, 2005, http://standards.ieee.org/findstds/standard/802.11e-2005.html.
[13] Wang C Y, Wei H Y. IEEE 802.11n MAC enhancement and performance evaluation. Mobile Networks and Applications, 2009, 14(6): 760-771.
[14] Li T, Ni Q, Xiao Y. Investigation of the block ACK scheme in wireless ad hoc networks. Wireless Communications and Mobile Computing, 2006, 6(6): 877-888.
[15] Razafindralambo T, Lassous I G, Iannone L, Fdida S. Dynamic packet aggregation to solve performance anomaly in 802.11 wireless networks. In Proc. the 9th MSWiM, Oct. 2006, pp.247-254.
[16] Fasolo E, Rossi M, Widmer J, Zorzi M. In-network aggregation techniques for wireless sensor networks: A survey. IEEE Wireless Communications, 2007, 14(2): 70-87.
[17] Wu K, Liu C, Xiao Y, Liu J. Delay-constrained optimal data aggregation in hierarchical wireless sensor networks. Mobile Networks and Applications, 2009, 14(5): 571-589.
[18] Boyd S P, Vandenberghe L. Convex Optimization. Cambridge University Press, 2004.
[19] Levy H, Avi-Itzhak B, Raz D. Principles of fairness quantification in queuing systems.
In Lecture Notes in Computer Sciences 5233, Kouvatsos D (ed.), Springer-Verlag, 2011, pp.284-300.
[20] Singh M, Edwards B, Al E. System description and operating principles for high throughput enhancements to 802.11. IEEE 802.11-4/0870r, 2004.
[21] Mujtaba S A. IEEE P802.11 wireless LANS: TGn sync proposal technical specification. IEEE 802.11-04/8890r0, August 2004.
[22] Kumar S, Raghavan V, Deng J. Medium access control protocols for ad hoc wireless networks: A survey. Ad Hoc Networks, 2006, 4(3): 326-358.
[23] Le Boudec J Y. Rate Adaptation, Congestion Control and Fairness: A Tutorial. Ecole Polytechnique Federale de Lausanne (EPFL), Dec. 2008.

Peyman Teymoori received the B.S. and M.S. degrees in computer engineering from Ferdowsi University of Mashhad and Amirkabir University of Technology, Iran, in 2001 and 2004, respectively. Since 2007, he has been working toward the Ph.D. degree in computer engineering in the School of Electrical and Computer Engineering at the University of Tehran. His current research interests include computer networks, algorithmic aspects of wireless ad hoc networks, protocol design, and evaluation of protocols for wireless networks.

Nasser Yazdani received his B.S. degree in computer engineering from Sharif University of Technology, Tehran, Iran. He worked at the Iran Telecommunication Research Center as a researcher and developer for a few years. To pursue his education, he entered Case Western Reserve University, USA, and graduated with a Ph.D. degree in computer science and engineering. He then worked in different companies and research institutes in the USA. He joined the School of Electrical and Computer Engineering of the University of Tehran in September 2000 and is now a full professor. His research interests include networking, packet switching, access methods, operating systems, and databases.