IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 55, NO. 3, MARCH 2007

Resequencing Worst-Case Analysis for Parallel Buffered Packet Switches

Ilias Iliadis, Senior Member, IEEE, and Wolfgang E. Denzel

Abstract—This paper considers a general parallel buffered packet switch (PBPS) architecture which is based on multiple packet switches operating independently and in parallel. A load-balancing mechanism is used at each input to distribute the traffic to the parallel switches. The buffer structure of each of the parallel packet switches is based on either a dedicated, a shared, or a buffered-crosspoint output-queued architecture. As packets in such multipath PBPS switches may get out of order when they travel independently through the parallel switches, a resequencing mechanism is necessary at the output side. This paper addresses the issue of evaluating the minimum resequence-queue size required for a deadlock-free lossless operation. An analytical method is presented for the exact evaluation of the worst-case resequencing delay and the worst-case resequence-queue size. The results obtained reveal their relation, and demonstrate the impact of the various system parameters on resequencing.

Index Terms—Load balancing, parallel packet switching, resequencing, switching systems.

I. INTRODUCTION

THE CAPACITY demand of the early generations of high-performance packet switches or routers could be sufficiently covered by single-stage switch-fabric architectures. Even the majority of the current switch generations is still based on single-stage architectures, because the continued advances in very-large-scale integration (VLSI) technology made it possible, for the most part, to keep pace with the increasing capacity requirements. However, a further proliferation in capacity, in particular, an increase of the number of ports as opposed to an increase of the port bandwidth, will eventually require the transition to multistage packet switch architectures.
Owing to the complications associated with this transition, an intermediate solution that allows an easier migration is of great interest. One such solution is based on multiple single-stage switches, which provide multiplied capacity by operating independently in parallel. This concept has been referred to as parallel switch architecture in [1], parallel packet switch (PPS) in [2] and [3], and distributed packet switch in [4]. It addresses the problem arising when line rates run faster than the switch fabric does. Further advantages of the PPS concept include the possibility of a smooth migration from the existing single-stage switches and the flexibility regarding the number of parallel switches. The migration is facilitated because the individual switches can work fully independently and need not be synchronized with each other. Therefore, this concept can be applied to any switch by developing only a single new component, namely, a higher-speed switch adapter or line card that incorporates the plane load-balancing and plane server schemes. Second, using more switches than the minimum number required for accommodating the total bandwidth results in an effective speedup that improves the overall performance. In this paper, we consider the general parallel buffered packet switch (PBPS) architecture, which is based on multiple buffered packet switches operating independently and in parallel. The buffer structure of each of the parallel packet switches is based on either a dedicated, a shared, or a crosspoint output-queued architecture.

[Paper approved by A. Pattavina, the Editor for Switching Architecture Performance of the IEEE Communications Society. Manuscript received April 17, 2005; revised January 27, 2006 and August 2. The authors are with IBM Research, Zurich Research Laboratory, 8803 Rüschlikon, Switzerland (e-mail: ili@zurich.ibm.com). Digital Object Identifier /TCOMM]
Our particular interest is in the buffered-crosspoint architecture, such as the single-stage distributed packet-routing switch described in [5], which for practical reasons uses a round-robin column-scheduling mechanism. The operation is assumed to be lossless, which is achieved by employing either a credit- or a backpressure-type feedback flow-control mechanism both from the switch planes to the input adapters and from the output adapters to the switching planes. As in multistage multipath switches, a resequencing problem also arises in this context, as packets of a given flow may get out of sequence when they travel independently through the various buffered switch planes. Consequently, a resequencing mechanism is needed at the egress side. It is assumed to be based on sequence numbers rather than on time-based resequencing. The cost of resequencing is now tolerated, because it provides the possibility to obtain higher capacity with a minimum of lower-speed parallel hardware and without centralized control [3]. We address the issue of evaluating the worst-case resequence-queue size, as this determines the minimum output buffer size required for a deadlock-free lossless operation. A deadlock occurs when an egress output buffer is full of resequence cells; none of them can depart from the buffer, and no cells can enter the buffer because there is no free space left. Consequently, the knowledge of the worst-case resequence-queue size is a prerequisite for an appropriate buffer dimensioning and safe operation of the switching system. Both the disturbance of the cell sequence and the delay performance depend on how evenly the traffic load is balanced across the parallel switches. A good balance and a high efficiency are best achieved by dynamic load-balancing mechanisms that dispatch each cell independently. They can be state-independent or state-dependent.
In the former case, cells are dispatched to the planes without considering any information related to the switches, for example, in a round-robin [6], [7], random, or cyclic [1] manner. In the latter case, cells are dispatched to planes that are determined based on switch-related information, such as buffer (queue) occupancies [4], or

the knowledge of the traffic already arrived [2], [3]. Note that these mechanisms do not entail any throughput reduction, because they do not inhibit the flow of cells from the input adapters to the planes, provided there is free space. However, they cause different degrees of temporary load asymmetries, which, in turn, have an impact on the resequencing. Although the state-dependent mechanisms are likely to result in improved resequencing behavior owing to the minimization of the load asymmetries, in practice this improvement is reduced by the propagation delays (cells in flight) between adapters and switch planes. Clearly, the longer the propagation delay, the less representative and, therefore, useful the state information is. Also, simulations we conducted have shown that state-dependent load balancing is less effective when it is performed on cells rather than on variable-length packets. Moreover, as we consider the PPS concept applicable to any switch, we could not a priori assume that the switch has the capability of providing sufficient state information. We therefore focus here on the less complex, state-independent load-balancing mechanisms. Extensive work on resequencing analysis exists in various other contexts, e.g., [8]-[11] and the references therein. These results, however, are limited to the resequencing delay and the resequence-queue occupancy. To the best of our knowledge, none of the earlier studies has considered the issue of the maximum resequence-queue size, because it was assumed that if the resequence buffer is full, subsequently arriving out-of-sequence packets are dropped. In contrast, we do not allow packets or cells to be dropped in the architectures considered, because one intended use of the fabric is for server interconnect.
In this domain, the stringent latency requirements do not allow for retransmissions, which, in turn, translates into a lossless mode of operation. Moreover, early works dealt with the resequencing incurred by a single flow either in isolation [9], [10] or in the presence of multiple flows [8]. In contrast to previous works, here we consider the combined multiple-flow effect arising in the PBPS context, as at an output adapter there can be many independent flows simultaneously under resequencing, all of which should be jointly taken into account. More recent works [12] have only established upper bounds of the worst-case resequence-queue size, denoted by Q*, because, as this paper demonstrates, its exact derivation is far from straightforward. It turns out that a trivial upper bound is derived as

Q* <= K B,  (1)

where K denotes the number of parallel switch planes, and B the maximum buffer occupancy in a plane with cells of the highest priority that are destined to an output port associated with an output adapter. This paper addresses the following practical questions. How tight is this upper bound? If it is not tight, is it possible that a given switch configuration with an output buffer size that is much less than this upper bound can still operate in a deadlock-free lossless fashion? One of the contributions of this paper is that it provides the answers to these questions by deriving the actual worst-case resequence-queue size, given as follows:

Q* = (K - 1) B.  (2)

The essence of (2) is that the worst-case resequence-queue size corresponds to the largest delay cells can experience in the network, and that this largest delay is determined by the number of cell buffers that can be used by cells destined to a particular output in a single plane (B) multiplied by the number of remaining parallel planes (K - 1).
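The relation between the trivial bound (1) and the exact result (2) can be checked with a few lines of arithmetic. The helper names below are illustrative; K is the number of parallel switch planes and B the maximum per-plane buffer occupancy, as defined above.

```python
def trivial_bound(K: int, B: int) -> int:
    """Loose upper bound of (1): all K planes full of cells
    that could precede a tagged cell."""
    return K * B

def worst_case_queue(K: int, B: int) -> int:
    """Exact worst case of (2): B cells stuck ahead of a tagged cell
    in one plane, while each of the remaining K - 1 planes delivers
    one bypassing cell per arbitration cycle for B cycles."""
    return (K - 1) * B
```

For instance, `worst_case_queue(6, 1024)` gives 5120 cells against a loose bound of `trivial_bound(6, 1024)` = 6144.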
This corresponds to the case in which some cells (possibly just one) are located in a full plane and, while they are working their way through this plane, the remaining planes are left free to allow subsequent cells to bypass them. What is not at all evident, and in fact seems counterintuitive, is that the scenario described above can ever arise, in which the loading asymmetry between the full plane and any of the remaining planes is so extreme, given that a scheme that load-balances the traffic is deployed. This issue will be addressed in great detail. A range of additional contributions covers the following aspects. The worst-case resequence-queue sizes for the dynamic state-independent load-balancing mechanisms considered (random, cyclic, round-robin) are all equal to the absolute worst-case resequence-queue size. The worst-case resequence-queue sizes for the dedicated, shared, and crosspoint output-queued architectures are the same as given by (2), despite some strikingly different characteristics. In the presence of priority classes, the worst-case resequence-queue size is bounded for the highest-priority traffic only. Consequently, additional measures are needed to ensure a safe operation for the low-priority traffic classes. Although (2) turns out to apply to both a dedicated/shared architecture and a crosspoint architecture, there are distinct differences, such that the details of the underlying architecture matter. For instance, in the presence of multiple flows, the worst-case resequence-queue size in the former case is always achieved by a single flow, whereas in the latter case, it can also be achieved by multiple flows. Additional differences are given in Section III-B.3. Owing to the generality of the result obtained, one may be tempted to assume that there is a unified approach to arrive at this result.
This is, however, highly unlikely, because the analysis presented in this paper shows that the methodology developed for the dedicated/shared architecture does not apply to the crosspoint architecture, and vice versa. For small values of the number K of switch planes, the worst-case resequence-queue size established in (2) is substantially different from the trivial upper bound. For example, for a PBPS architecture based on six switch planes, each having a maximum buffering capability of 1024 cells per output port, Q* is equal to 5120 cells for the dedicated, shared, and crosspoint output-queueing. This value is about 17% lower than the loose upper bound of 6144 cells obtained as the product of the number of planes (6) and the maximum buffering capability (1024). The gap becomes even more pronounced for a smaller number of switch planes: reducing the number of switch planes to four, Q* is reduced to 3072 cells, which is 25% less than the loose upper bound of 4096 cells. The design conditions for a deadlock-free lossless operation now follow from (2), where the minimum output buffer size required depends only on the number of planes K and the maximum per-plane buffer occupancy B. It therefore indirectly depends on the switch size through B. Furthermore, it is independent of the load, because the worst-case resequence-queue size can also occur in a lightly loaded system. For example, we have conducted simulations for a server-interconnect fabric based on a buffered-crosspoint architecture with eight switch planes, each having 128 KB of buffer space per output port. Using real-application traffic based on traces of a scientific computing application, and although the average load was only 1%, the maximum resequence-queue occupancy measured (299 KB) was already 33% of the worst case (896 KB). This maximum was reached during one of the rare activity periods, and is attributed to the highly bursty nature of this traffic and the long-tailed packet-length distribution. Also, simulations we conducted using geometrically distributed packet lengths have shown that the maximum resequence-queue occupancy increases with the load, as well as with the mean packet length. It has reached values as high as 55% of the worst case. This means that if the resequencing queue provided were half of that dictated by the worst-case analysis, which is still significant, a deadlock would have occurred. This implies that, in general, reducing the resequence-queue size, even by a small factor, compared with the worst-case requirement could potentially lead to a deadlock situation.

This paper is organized as follows. In Section II, our model of a PBPS based on buffered-crosspoint switches is presented. Detailed descriptions of the state-independent load-balancing mechanisms considered, as well as of the operation of the resequence queue, are provided. Section III presents a general analytical method for the derivation of the worst-case resequencing delay and resequence-queue occupancy under the joint effect of multiple resequence flows.

Fig. 1. PBPS/buffered-crosspoint switch.
The circumstances under which the worst-case resequence-queue size arises are identified. The issue of multiple priorities is also addressed. Concluding remarks follow in Section IV.

II. PRELIMINARIES

A. Switch Model

A PBPS architecture based on a buffered-crosspoint switch is illustrated in Fig. 1. There are K parallel switch planes, each consisting of an output-queued switch running at a nominal speed of one. The ingress side comprises the input adapters, each performing the plane load-balancing function, i.e., the distribution of incoming packets or cells across the lower-capacity parallel switches (planes). At the egress side, the output traffic of all parallel switches is merged back by the reverse function, the plane server function, performed by each of the output adapters. Both the input and output adapters have a buffering capability. As shown in Fig. 1, the input and output adapters consist of individual ingress-input and egress-output ports, respectively, running at a relative speed of S. Note that the proposed architecture allows us to realize both a fast switch (by choosing a high port speed S with an appropriate number of planes K), as well as a larger switch size. Owing to the deployment of multiple planes, the internal switch speed need not be higher than the ingress/egress port speed. The speedup, or expansion factor, of the configuration considered is then given by K/S. For stability reasons, the expansion factor should be greater than or equal to one. The architecture considered provides the capability of switching variable-length packets by segmenting the packets at the ingress into fixed-length segments, and subsequently transmitting them through the switch via fixed-size switch cells. In the following, a buffer-space unit is taken to be equal to the fixed size of a switch cell. We also refer to the transmission time of a switch cell at the nominal switch port speed as a cell cycle.
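The segmentation step described above can be sketched as follows. The cell size, the padding convention, and the function names are illustrative assumptions, not taken from the paper; the point is that each fixed-size cell carries a per-flow sequence number, which is what later enables resequencing at the egress.

```python
CELL_SIZE = 64  # bytes per fixed-size switch cell (illustrative value)

def segment(payload: bytes, cell_size: int = CELL_SIZE):
    """Split a variable-length packet into fixed-size cells, each
    tagged with a per-flow sequence number; the last cell is padded."""
    cells = []
    for seq, start in enumerate(range(0, len(payload), cell_size)):
        chunk = payload[start:start + cell_size]
        cells.append((seq, chunk.ljust(cell_size, b"\x00")))
    return cells

def reassemble(cells, length: int) -> bytes:
    """Concatenate cells in sequence-number order and strip padding."""
    body = b"".join(chunk for _, chunk in sorted(cells))
    return body[:length]
```

A 150-byte packet thus becomes three 64-byte cells, and `reassemble` recovers it even if the cells are handed back out of order.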
It is, moreover, assumed that the round-trip times between input adapters and planes, and between planes and output adapters, are negligible. The system supports a number of priority classes. Priorities are obeyed at all points of queueing, i.e., at the virtual output queues (VOQs) of the input adapters, in the buffers of the switch planes, and at the output adapters. A flow is the set of fixed-length segments traveling between a pair of ingress-input and egress-output ports and transported by switch cells through the system. A given flow may contain multiple streams of individual internet protocol (IP) sessions [e.g., transmission control protocol (TCP) or user

datagram protocol (UDP)]. The switch-related information of cells comprises their incoming and outgoing ports, their sequence number, and their priority. At the output adapters, however, the cells (and the corresponding segments) are associated with the ingress-input/egress-output flows so that resequencing can be performed at the flow level. The worst-case resequencing is evaluated over all packet sizes. This, in conjunction with the fact that resequence-queue-size requirements increase with increasing packet size, implies that the results obtained apply to the case where packet sizes exceed a minimum value. In particular, in the case of a crosspoint architecture, we shall show that the worst-case resequence-queue size can be achieved by multiple flows; the minimum packet size then follows from the requirement that the cells waiting for resequencing and associated with any given one of these flows all belong to the same packet. Within the switch planes, there is a buffer for every input-output pair, also referred to as the crosspoint buffer. Crosspoint buffers of a given column correspond to the same output port and, for ease of implementation, are assumed to be served in a round-robin manner. Each crosspoint buffer distinguishes priorities and has a fixed size, which is shared among the priority classes. Let b denote the maximum occupancy of a crosspoint buffer with cells of the highest priority. Each input adapter contains VOQs corresponding to the switch output ports. Each VOQ can be fed at the aggregate rate of the ingress ports of the adapter. VOQs are served according to priorities, and according to a round-robin scheme within a priority. In one cell cycle, the plane load-balancing scheme is assumed to have the capability to dispatch one cell to each plane, i.e., up to K cells to the corresponding K planes, as shown in Fig. 1. Furthermore, these cells can be taken from the same VOQ if the remaining VOQs are empty (this is reflected by the thick lines in the figure).
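The dispatch capability just described (at most one cell per plane per cell cycle, with planes skipped when flow control withholds them) can be sketched as below. The class name and the boolean eligibility vector are illustrative stand-ins for the credit/backpressure mechanism of the paper.

```python
class PlaneDispatcher:
    """Round-robin plane load balancing: in each cell cycle, dispatch
    at most one cell to each eligible plane (illustrative sketch)."""

    def __init__(self, num_planes: int):
        self.K = num_planes
        self.ptr = 0  # round-robin pointer over the planes

    def dispatch_cycle(self, cells, eligible):
        """cells: mutable queue of cells awaiting dispatch; eligible:
        per-plane booleans reflecting flow control. Returns the list
        of (cell, plane) pairs dispatched in this cell cycle."""
        sent, used = [], set()
        for cell in list(cells):
            # advance the pointer until the first eligible, unused plane
            for _ in range(self.K):
                p = self.ptr
                self.ptr = (self.ptr + 1) % self.K
                if eligible[p] and p not in used:
                    used.add(p)
                    sent.append((cell, p))
                    cells.remove(cell)
                    break
            else:
                break  # no eligible plane left in this cycle
        return sent
```

With four eligible planes and five queued cells, one cycle dispatches four cells (one per plane) and leaves the fifth queued, matching the "up to K cells per cell cycle" capability.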
Each output adapter contains an egress output buffer of M units shared among its egress-output ports and fed from the planes. In one cell cycle, the plane server scheme can transfer one cell from each plane to an output adapter, implying a maximum rate of K cells per cell cycle, K being the number of planes. As mentioned earlier, either a credit- or a backpressure-type feedback flow-control mechanism is employed from the switch planes to the input adapters, as well as from the output adapters to the switching planes, ensuring a lossless operation. Therefore, a specific scheme has to be specified and applied when an output buffer is full, which, in turn, causes the feedback flow control to be activated. More specifically, we consider a round-robin scheme in which each of the planes periodically gets an opportunity to send a cell every K slots. In Section II-B, we comment on the efficiency of this mechanism. In addition to the buffered-crosspoint architecture depicted in Fig. 1, this paper also considers the dedicated and shared output-queued architectures. The dedicated output-queued architecture assumes a single buffer per output port that corresponds to all crosspoint buffers of the column associated with this port in the buffered-crosspoint architecture. The shared output-queued architecture assumes a single buffer for all output ports that corresponds to all crosspoint buffers. Let B denote the maximum buffer occupancy in a plane with cells of the highest priority that are destined to an output port associated with an output adapter. In the case of the buffered-crosspoint architecture, B is the sum of the maximum highest-priority occupancies of the crosspoint buffers in the corresponding column. Note that cells can join this buffer (of a given plane) at a maximum rate of one cell per input per cycle. According to the scheme described above, cells are transferred from the input to the output adapters over multiple switch planes.

Fig. 2. Maximum resequence-queue size versus output buffer size.
As a result, cells (and segments) corresponding to a given flow may arrive at an output adapter in an order different from the one in which they originally departed from the input adapter. As first-in first-out (FIFO) delivery is required, out-of-sequence cells must wait at the output adapter in a virtual resequence queue for the missing cells to arrive, so that they can be put back into proper sequence. When a missing cell arrives, it releases the corresponding resequence cells from the resequence queue. Consider an arbitrary output adapter and let R(M) denote the maximum resequence-queue size in the output buffer of size M. Clearly, it holds that R(M) <= M. For small values of M, we expect R(M) to be equal to M, as the first few out-of-sequence cells can easily fill the output buffer, leading to a deadlock. However, for sufficiently large values of M, R(M) is smaller than M and, in fact, converges to the worst-case resequence-queue size Q*, as shown in Fig. 2. Clearly, the minimum output buffer size required for safe operation is equal to the worst-case resequence-queue size incremented by one cell, i.e., Q* + 1. The objective of this work is the evaluation of this value.

B. Load-Balancing Mechanisms

Each input adapter contains a plane load-balancing function that dynamically dispatches the outgoing cells to the available planes. A cell can be dispatched to a given switch plane if the corresponding flow control permits it. A cell is said to be eligible if there is a plane that can accept it. We also refer to the planes to which an eligible cell can be dispatched as eligible planes. Within a cell cycle, there can be at most one cell dispatched to any given plane. Therefore, a high efficiency is achieved when cells are dispatched to all planes within each cell cycle. In this paper, we focus on the less complex state-independent load-balancing mechanisms, and, in particular, on the following schemes.
a) Round-Robin: A round-robin mechanism is applied to the aggregate stream of eligible cells; each cell is dispatched to the plane pointed to by a counter, which is incremented until the first eligible plane is found. b) Random: Cells are randomly dispatched to the eligible planes. c) Cyclic: This scheme is applied in switch input adapters that operate in a slotted fashion.

Fig. 3. Resequence queue.

According to this scheme, successive slots are cyclically preassigned to planes, regardless of the arrival pattern of the cells. Note that these schemes cannot exclude the possibility that an arbitrary number of successive cells of a given flow are dispatched to the same plane. Owing to this property, severe load asymmetries between planes can occur, as discussed in detail in Appendix A. At the egress side of the switch, we have considered various mechanisms for the plane server scheme. In contrast to the load-balancing schemes at the ingress side, simulations we conducted on similar schemes at the egress side revealed that more complex state-dependent plane server schemes did not result in any advantage over the round-robin scheme. Consequently, in the remainder, we assume that the plane server scheme operates in a round-robin fashion at the egress side of the switch.

C. The Resequence Queue

In the PBPS context, many flows can be simultaneously under resequencing at an arbitrary output adapter. Let us consider a specific flow and, in particular, a typical sequence of cells belonging to this flow. Fig. 3 shows a situation in which a number of cells have already arrived and been stored at the output buffer of the corresponding destination adapter, forming the resequence queue while waiting for an earlier missing cell to arrive. Let us first consider the case where the output buffer is not full, and suppose that within one cell cycle one cell was transferred to the output buffer from each of the planes, including plane 1, while the missing cell still remains queued in plane 1. The cells stemming from the other planes then join the resequence queue. The same effect occurs when the output buffer is full, albeit on a different time scale. The longest period arises when all cells in the output buffer are destined to a particular egress-output port, so that the output buffer is emptied at the rate of a single egress-output port.
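The per-flow behavior of the resequence queue described in Section II-C can be sketched as follows; the class and method names are illustrative, and a real adapter would keep one such structure per ingress-input/egress-output flow.

```python
class ResequenceQueue:
    """Hold out-of-sequence cells of one flow; release an in-order run
    as soon as the missing (principal) cell arrives."""

    def __init__(self):
        self.next_seq = 0  # sequence number the flow is waiting for
        self.held = {}     # seq -> cell: the resequence queue itself

    def arrive(self, seq, cell):
        """Accept one cell; return the cells released in order."""
        self.held[seq] = cell
        released = []
        while self.next_seq in self.held:
            released.append(self.held.pop(self.next_seq))
            self.next_seq += 1
        return released

    def occupancy(self) -> int:
        """Number of cells currently waiting for resequencing."""
        return len(self.held)
```

If cells 1 and 2 arrive before cell 0, they are held; the arrival of cell 0 (the principal missing cell) releases all three at once, which is exactly the release behavior the worst-case analysis exploits.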
In this case, the waiting cells will be transferred to the output buffer in successive cell cycles. This period constitutes an arbitration cycle, during which the plane server scheme transfers one cell from each of the planes to the output buffer. Clearly, the resequence-queue size increases by the same amount in both cases, although the arbitration cycle lasts one cell cycle in the former and several cell cycles in the latter case. This indicates that, as far as the resequence-queue size is concerned, time should be measured in relative terms (arbitration cycles) rather than in absolute terms (cell cycles). More formally, an arbitration cycle is the period it takes for the plane server scheme to check and serve (if possible) all planes, provided they are not all empty. In the remainder, unless otherwise indicated, the term cycle will refer to an arbitration cycle. Remark 1: From the above, it follows that a delay of one arbitration cycle corresponds to a resequencing delay of a bounded number of cell cycles, the longest occurring when the egress output buffer is constantly filled with cells that are all destined to the same egress-output port.

III. ANALYSIS OF MULTIPLE FLOWS

A. Worst-Case Resequencing Delay

We consider the crosspoint output-queued architecture, and describe a scenario that maximizes the resequencing delay. Let us consider multiple flows stemming from the input adapters and destined to the first output adapter, as depicted in Fig. 4 by the shaded and black cells, and let us focus on a typical tagged flow arriving at, say, the first adapter, with its cells indicated in black. At the instant a cell of the tagged flow is dispatched to plane 1, the buffers of the first plane corresponding to the first output adapter are full, as indicated in Fig. 4 by the shaded cells, which are all destined to the first egress-output port of the first output adapter. At the same time, a subsequent cell of the tagged flow is dispatched to an empty plane, say plane 2.
In the next cycle, this subsequent cell and a shaded cell will be dispatched to the output buffer. The subsequent cell will arrive at the output buffer after all shaded cells have arrived, i.e., after B cycles, B being the maximum buffer occupancy in a plane with highest-priority cells destined to an output port of the output adapter. Thus, the worst-case resequencing delay D*, experienced by this cell and expressed in arbitration cycles, is given by

D* = B,  (3)

or, by virtue of Remark 1, the corresponding number of cell cycles. Note that (3) also applies to the case of a dedicated/shared output-queued architecture. Remark 2: The worst-case resequencing delay does not depend on the number of individual ports per adapter. Note that in obtaining these results, no assumptions were made regarding the load-balancing scheme. Let us now address the following issue. Is it possible, under the deployment of a plane load-balancing scheme, to arrive at the scenario described above, where the loading asymmetry between plane 1 and any of

Fig. 4. Worst-case resequencing delay for a PBPS/crosspoint-queued switch configuration.

the remaining planes is so extreme? Note that the plane load-balancing scheme ensures that the load is equally distributed among the planes. Consequently, at first glance, this scenario, in which all the shaded cells reside in one of the planes, seems rather unlikely. Nevertheless, in Appendix A, it is shown that this scenario is feasible under any of the state-independent load-balancing schemes considered.

B. Worst-Case Resequence-Queue Size

Let us now consider a snapshot of the output resequence queue at a random cycle d, and let us assume that there are several flows under resequencing, indexed by f. A cell is characterized by the plane k it entered, its entry time t, the cycle in which it is dispatched to the egress-output buffer, its flow index f, and its sequence number. To distinguish resequence cells residing in the resequence queue from the rest of the cells, they are marked with an overline in Fig. 5. A resequence cell waits for a missing cell of the same flow; the missing cells will arrive after cycle d and are, therefore, indicated in subsequent columns of the figure. From the set of missing cells corresponding to a given flow, the one with the smallest sequence number is called the principal missing cell. As several flows are under resequencing, there is one principal missing cell per flow. These are indicated in Fig. 5 with a sequence number equal to one. Remark 3: Let a resequence cell wait for a missing cell of the same flow. This implies that the missing cell has a smaller sequence number but a later dispatch cycle. Furthermore, the two cells cannot stem from the same plane, as in all output-queued architectures considered, a cell of a flow cannot bypass, within a plane, another cell of the same flow.
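Under the reconstructed result Q* = (K - 1)B, the scenario of Fig. 4 can be replayed in a few lines: one plane holds B cells ahead of the tagged cell, the tagged flow's next cell goes to an empty plane, and each arbitration cycle moves one cell out of every nonempty plane. The function name and the fully synchronous timing are simplifying assumptions.

```python
def replay_worst_case(K: int, B: int):
    """Plane 1 holds B shaded cells ahead of a tagged cell x, while
    the remaining K - 1 planes are free and carry subsequent cells
    that bypass x. Count the bypassing cells that accumulate in the
    resequence queue before x leaves plane 1, and the delay of x."""
    ahead = B        # shaded cells queued in front of x in plane 1
    resequence = 0   # out-of-order cells waiting at the output adapter
    delay = 0        # arbitration cycles until x arrives
    while ahead > 0:
        ahead -= 1           # plane 1 delivers one shaded cell
        resequence += K - 1  # each other plane delivers a later cell
        delay += 1
    return resequence, delay
```

With K = 6 planes and B = 1024 cells, the replay accumulates 5120 resequence cells over a delay of 1024 arbitration cycles, matching (2) and (3).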
Remark 4: The dedicated/shared output-queued architecture ensures FIFO delivery because cells maintain their positions relative to one another. Thus, cells depart from a plane and enter the output adapter in the same order in which they entered the plane. In contrast, a crosspoint-queued architecture

Fig. 5. Snapshot of the output resequence queue.

with round-robin column scheduling does not necessarily guarantee a FIFO delivery. The FIFO property is guaranteed only among cells stemming from the same crosspoint buffer. Consider two cells arriving at the output adapter from a given plane, with the first having entered the plane before the second, but each entering a different crosspoint buffer. The second cell will depart prior to the first if, at the instant it enters its crosspoint buffer, the number of cells in that buffer is smaller than the number of cells in front of the first cell in its crosspoint buffer. Thus, a cell that enters a plane earlier is also dispatched to the output adapter earlier (4) for a dedicated/shared output-queued architecture and, for the crosspoint output-queued architecture considered, only for cells stemming from the same crosspoint buffer. We now proceed by defining the resequencing degree at cycle d as the maximum of the resequencing-delay cycles over the resequence cells present at cycle d, i.e.,

rho(d) = d - e(d) + 1,  (5)

where e(d) denotes the earliest dispatch cycle among the resequence cells located in the resequence queue at cycle d. Consider such a resequence cell dispatched at cycle e(d), and its corresponding principal missing cell. Clearly, at cycle e(d), the principal missing cell is stored at the buffer of its plane and remains there throughout the interval [e(d), d]. Note that the resequencing degree is bounded above by the worst-case resequencing delay (expressed in arbitration cycles), with equality holding iff two successive cells of a flow are simultaneously transferred to two planes, the buffer of the first of them being full of cells destined to the output adapter, and the buffer of the other being empty. From the above, it follows that the resequence-queue size is equal to the number of resequence cells located inside the resequence rectangle depicted in Fig. 5.
The remaining empty slots do not contribute to the resequence-queue size because they correspond either to no cell arrivals or to arrivals of cells that are not out of sequence and are, therefore, not shown. Note that the resequence-queue size at a given cycle is trivially bounded above by the maximum possible number of cells inside the resequence rectangle. Similarly, the worst-case resequence-queue size is trivially bounded above by the maximum possible number of cells inside the resequence rectangle when the length of the rows is the maximum possible, which follows by virtue of (3) and (5). This type of bound has been applied in previous works, such as [1] and [12]. We now proceed to derive tight upper bounds determining the worst-case resequence-queue size.
1) Dedicated/Shared Output-Queued Architecture:
Lemma 1: For the dedicated and shared output-queued architectures, the resequence-queue size at a given cycle is bounded above as in (6), with the equality holding iff, throughout the corresponding interval, at each cycle there are resequence cells stemming from the same set of planes.
Proof: Let us consider the missing cells and, in particular, a missing cell that has entered its plane at the earliest time compared with the others; formally, its entry time is minimal (7). It follows that it is a principal missing cell. Suppose that, of the missing cells depicted on the right-hand side of the resequence rectangle in Fig. 5, it is the one that arrived the earliest. We now consider the row of the resequence rectangle corresponding to its plane, and establish the following proposition.
Proposition 1: The slots corresponding to this row (plane) of the resequence rectangle do not contain any resequence cell.
Proof: For the purpose of contradiction, suppose that there exists a slot that contains a resequence cell which waits for this missing cell. These correspond to the cells shown

Fig. 6. Snapshot of the output resequence queue (crosspoint-queued architecture).

in Fig. 5. According to Remark 3, the corresponding ordering relations hold. The resequence cell arrives at the output adapter prior to the other cell, which by virtue of (4) implies the corresponding ordering of entry times. From definition (7), it follows that the entry time of the earliest missing cell satisfies the reverse inequality. Combining these inequalities implies that, during the same cycle, the first two cells were dispatched to their planes, and then the third cell was dispatched to its plane. This, however, leads to a contradiction, because it implies that the latter cell entered its plane earlier than the earliest missing cell. Consequently, the shaded row of Fig. 5 does not contain resequence cells.
Proposition 1 implies that the number of resequence cells in the resequence rectangle can be at most the number contributed by the remaining planes, with the equality holding iff, throughout the corresponding interval, there are resequence cells at each cycle stemming from the remaining planes. This completes the proof of the lemma.
Theorem 1: For the dedicated and shared output-queued architectures, the worst-case resequence-queue size is given by (8), achieved by a set of flows corresponding to a single input/output pair of adapters, and with a corresponding resequencing degree equal to the worst-case resequencing delay.
Proof: From (3), (5), and (6), it follows that the bound holds, with the equality holding iff equalities hold in both (5) and (6). The plane with the full buffer described in the condition of (5) corresponds to the row of Proposition 1. On the other hand, the equality in (6) implies that, throughout the corresponding interval, at each cycle there are resequence cells stemming from the remaining planes, which, in turn, excludes having a second plane with a full buffer.
This implies that all the resequence cells of the first cycle, as well as of subsequent cycles, correspond to the same flow (in terms of input/output ports), which potentially corresponds to several flows from the various ingress-input ports of an input adapter to the various egress-output ports of the output adapter. Consequently, this leads us to the unique worst-case resequence-queue-size scenario in which cells of such flows are dispatched to a plane having a full buffer, and at the same time, subsequent cells are dispatched to the remaining planes, where the buffers are empty. Note that in this scenario, the resequencing degree is equal to the worst-case resequencing delay.
Remark 5: Note that the worst-case resequence-queue size implies a maximum resequencing degree.
2) Buffered-Crosspoint Output-Queued Architecture: We begin by noting that Lemma 1, established in the case of a dedicated/shared output-queued architecture, does not necessarily hold in the case of the crosspoint output-queued architecture considered, as (4) no longer implies the required ordering, owing to the failure of the FIFO property. We therefore proceed by following a different methodology, which in the end reveals that indeed Lemma 1 does not hold. Let us consider a snapshot of the worst-case output resequence-queue size at a given cycle, as shown in Fig. 6. Consider the principal missing cells, and let us group them based on the input adapter from which they were dispatched to the planes. Suppose that a number of such groups are formed and, without loss of generality, assume that they correspond to the first input adapters. Let us also consider the principal missing cells of a given group and, in particular, the cell that was the first of these cells to enter its plane. This allows us to assign all resequence cells of the flows corresponding to this group to a single flow corresponding to that cell, without affecting the number of resequence cells.
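The unique worst-case scenario of Theorem 1 can be mimicked in a toy simulation. The parameters below (K planes, a per-plane buffer holding B cells for the tagged output) are our illustrative assumptions, not the paper's notation: one plane already holds B cells ahead of the tagged cell, the other K-1 planes are empty, and every arbitration cycle each empty plane delivers one subsequent cell of the flow, which must wait in the resequence queue.

```python
def worst_case_peak(K, B):
    """Toy model of the worst-case resequence-queue scenario.

    Cell #0 of the tagged flow joins a plane whose buffer already holds B
    cells destined to the same output, so it departs only after B
    arbitration cycles. Meanwhile, in each of those cycles the K-1 empty
    planes each deliver one subsequent cell of the flow, and every such
    cell must wait for cell #0 in the resequence queue."""
    reseq = peak = 0
    for _ in range(B):          # cycles until cell #0 leaves the full plane
        reseq += K - 1          # out-of-order arrivals from the empty planes
        peak = max(peak, reseq)
    return peak                 # equals (K - 1) * B in this toy model
```

In this sketch the peak occupancy grows linearly both in the number of planes and in the buffer size, in line with the qualitative behavior the theorem describes.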
Consequently, applying this assignment to each of the groups results in a worst-case scenario in which there is a single principal missing cell for each such group. Consider the set of these principal missing cells, and denote, for each plane, the number of cells of this set that are dispatched to it; these numbers form a vector, and the number of planes containing such cells follows accordingly. Let us now consider a plane that contains such cells. From the above, we deduce that its principal missing cells reside in the corresponding crosspoint buffers of the column of that plane. Furthermore,

the worst-case resequence-queue-size assumption implies that they reside in that plane throughout the corresponding interval. As a consequence, these crosspoints contribute cells to the corresponding row of the resequence rectangle, none of which is a resequence cell. The proof is similar to the proof given for the dedicated/shared case, with (4) now being repeatedly applied to each of the crosspoint buffers, and is therefore omitted. These cells are indicated by a dedicated symbol in Fig. 6, where particular parameter values are assumed. The number of resequence cells in the corresponding row of the region is maximized when the remaining crosspoint buffers dispatch resequence cells (also indicated in Fig. 6). Moreover, the worst-case resequence-queue-size assumption implies a high resequencing degree, which translates into a maximum length of the rows of the rectangle. This is achieved when the remaining crosspoint buffers also constantly dispatch (nonresequence) cells (likewise indicated in Fig. 6), and when, at the arrival epochs of the cells, the corresponding crosspoints are full. Owing to the round-robin service mechanism of the crosspoint buffers, each such row contains identical segments followed by the last segment, which contains the principal missing cell(s), as depicted in Fig. 6. Clearly, for a given vector, both the resequencing degree and the number of resequence cells are maximized when, for instance, the resequence cell(s) are at the beginning and the missing cell(s) are at the end of the last segment. It will later be shown that a vector that maximizes the resequence-queue size does not necessarily maximize the resequencing delay.
Remark 6: Depending on the parameter values, it may not always be possible that in all of the first rows of the last segment the cells are arranged in the form shown in Fig. 6.
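The segment structure above stems from round-robin service over the crosspoint buffers of a column. A minimal sketch (function name and setup are ours) shows the interleaving this produces and why FIFO delivery holds only within a single crosspoint buffer:

```python
from collections import deque

def round_robin_drain(crosspoints):
    """Drain a column of crosspoint FIFOs toward one output, one cell per
    arbitration slot, visiting the crosspoints in round-robin order
    (illustrative sketch). Returns the global departure order."""
    order = []
    i = 0
    n = len(crosspoints)
    while any(crosspoints):          # some crosspoint still holds cells
        if crosspoints[i]:
            order.append(crosspoints[i].popleft())  # FIFO within a crosspoint
        i = (i + 1) % n              # round-robin pointer advance
    return order
```

With two crosspoints holding ['a1', 'a2'] and ['b1'], the departure order is ['a1', 'b1', 'a2']: cell b1 overtakes a2 even if it entered its (shorter) buffer later, whereas a1 and a2, stemming from the same crosspoint, keep their relative order.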
Consequently, the expressions subsequently derived constitute upper bounds, which are tight for some special combinations of the parameters, as will be discussed below. The number of resequence cells in the corresponding row of the last segment cannot exceed the quantity depicted in Fig. 6. Furthermore, by considering all the planes, a corresponding bound follows (9). Let us now consider a plane for which the number of principal missing cells is maximal. The worst-case resequence-queue size implies that the first crosspoints constantly provide resequence cells corresponding to the resequence flows, and that the remaining crosspoints are empty, so that the corresponding row is filled with resequence cells. This is illustrated in the top two rows in Fig. 6. From the above, it follows that the resequence-queue size is bounded above as in (10), with the auxiliary quantities defined in (11) and (12). From (10), it follows that the worst-case resequence-queue size is bounded above as in (13). In order to derive this function, we first evaluate the following two tight upper bounds (14). We begin with the evaluation of the first bound. Conditioning first on one quantity and then on the other, (14) yields (15) and (16). We now establish the following lemmas.
Lemma 2: Consider the vector whose leading elements take the maximum value and whose remaining elements are equal to zero. This vector is maximal and also results in the largest resequencing degree.
Proof: See Appendix C.
Corollary 1: The bound (17) holds.
Proof: Immediate from Lemma 2 and (12).
Lemma 3: The first bound is attained under either of two parameter conditions.
Proof: See Appendix C.
Lemma 4: The second bound is attained under either of two parameter conditions.
Proof: See Appendix C.
Theorem 2: The worst-case resequence-queue size is given by (18)

for either of the conditions of Lemmas 3 and 4, each with its corresponding resequencing degree.
Proof: From (10), (14), and by combining the results and conditions given by Lemmas 3 and 4, (13) yields the stated bound. We now demonstrate that for each of these conditions, the equality holds. According to Remark 6, a sufficient condition for equality is that the cells in the first rows of the last segment are arranged in the form shown in Fig. 6. For the first condition, this is possible when the cells of each group reside in consecutive crosspoints in the corresponding plane, and, at the beginning of a segment, the round-robin mechanism serves the appropriate (modulo the number of ports) crosspoint. For the second condition, this is possible when, at the beginning of a segment, the round-robin mechanism of the plane in which the only missing cell is located serves the crosspoint following the one in which the missing cell is located. A detailed description of the corresponding worst-case scenario is provided in Appendix B. In conclusion, the derived function does indeed represent the worst-case resequence-queue size. The resequencing degree corresponding to each of the conditions is obtained from (9) by considering the maximal vector as defined by Lemma 2.
Remark 7: In contrast to the dedicated/shared output-queued architecture case, here the worst-case resequence-queue size does not necessarily imply a maximum resequencing degree. This is because when the worst-case resequence-queue size is obtained by the first of the two vectors, the corresponding resequencing degree is less than the worst-case resequencing delay. However, the resequencing degree corresponding to the worst-case resequence-queue size obtained by the second vector is equal to the worst-case resequencing delay. Note also that in the former case, Lemma 1, which was established for a dedicated/shared output-queued architecture, does not hold.
3) Comparison of the Output-Queued Architectures: From (8) and (18), it follows that the same expression applies to both a crosspoint architecture and a dedicated/shared architecture. There are, however, two differences between the dedicated/shared and the crosspoint architectures considered. First, the worst-case resequence-queue size in the former case is always achieved by a single flow, whereas in the latter case, it can also be achieved by multiple flows. Second, according to Remark 7, the worst-case resequence-queue size implies the worst-case resequencing delay only in the former case. In the latter case, the worst-case resequence-queue size does not necessarily imply a maximum resequencing delay.
Remark 8: The worst-case resequence-queue size does not depend on the remaining system parameters. Note again that in obtaining these results, no assumptions were made regarding the load-balancing scheme. As in the evaluation of the worst-case resequencing delay, it turns out that these results also apply in the case of the state-independent plane load-balancing schemes considered. The scenarios are constructed using the technique presented in Appendices A and B. This technique involves two steps: first, the creation of backpressure and backlog in the output adapter, the switch buffers, and the input adapters, by considering a hot-spot scenario; second, the consideration of appropriately synchronized arrival patterns.
C. Priorities
We consider here a system operating under the presence of multiple priorities. In this case, the results obtained in the preceding Sections III-A and III-B apply to the highest priority class. Let us now consider a lower priority flow of cells. The worst-case resequence-queue size and resequencing delay arise when a cell is dispatched to a plane and stays there indefinitely, because it is always preempted by traffic of higher priority, while at the same time, cells subsequent to it are transferred to the remaining planes, and from there to the output adapter.
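The unbounded growth this preemption causes can be seen in a toy strict-priority model. All names and parameters below are illustrative assumptions, not the paper's notation: one low-priority cell is held in a plane for as long as the high-priority busy period lasts, while its successors keep arriving via the other planes.

```python
def preempted_resequence_size(busy_cycles, K):
    """Toy strict-priority model of the low-priority worst case.

    A low-priority cell sits in one of the K planes while high-priority
    traffic preempts it for `busy_cycles` arbitration cycles; in each such
    cycle, the other K-1 planes deliver one subsequent low-priority cell
    each, and all of them wait in the resequence queue for the preempted
    cell."""
    waiting = 0
    for _ in range(busy_cycles):
        waiting += K - 1   # successors accumulate behind the missing cell
    # grows without bound as the high-priority busy period grows
    return waiting
```

Since the high-priority busy period can be arbitrarily long, the returned occupancy is unbounded regardless of any fixed output-buffer size, matching the deadlock risk described in the text.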
In this case, both the worst-case resequencing delay and the worst-case resequence-queue size are unbounded. Note also that this holds independently of the number of priorities. Consequently, the model considered so far is prone to deadlock situations under priority traffic, regardless of the size of the output buffers. From the above, it follows that additional measures are needed to ensure safe operation. One solution, for instance, could be to devise a mechanism that identifies and releases lower priority missing cells that have been blocked for a long period of time. This amounts to overriding priorities at these release instants. As a first attempt, we have simulated numerous such schemes, and found that a reduction of the maximum measured resequence-queue sizes could only be achieved by schemes of high complexity. The worst-case resequence-queue sizes associated with these schemes are the subject of further study.
IV. CONCLUSIONS
The general class of PBPSs based on a dedicated, shared, or buffered-crosspoint output-queued architecture has been considered. The issue of evaluating the minimum resequence-queue size required for a deadlock-free lossless operation in the presence of multiple flows was addressed. An analytical method was developed for evaluating the absolute worst-case resequence-queue size, independently of the load-balancing mechanism used, and it was proved that this is also the worst case for some typical load-balancing mechanisms, such as random, cyclic, and round-robin. It was demonstrated that for these mechanisms, the worst-case resequence-queue size depends only on the characteristics of the parallel switches and, in particular, on the number of their ports, the size of the internal switch buffers, and the number of parallel planes. The circumstances under which the worst-case resequence-queue size arises were identified.
Whereas earlier works derived only upper bounds of the worst-case resequence-queue size, we derived the exact values, which for a small number of planes are significantly lower than these loose upper bounds. Interestingly, in the case of the crosspoint-queued architecture with round-robin column scheduling, it is found that the resequencing delay corresponding to the exact worst-case

resequence-queue size is not necessarily the maximum possible one. Furthermore, despite some strikingly different characteristics of the output-queued architectures considered, such as the lack of the FIFO property of the crosspoint-queued architecture, the worst-case resequence-queue size is the same for all three architectures. The results obtained apply to the highest priority class only. In systems having multiple priority classes, however, additional measures are needed to ensure a safe operation. Their nature has been outlined, and is a subject of future work. A further reduction of the required resequence buffer size is possible by adopting more sophisticated (state-dependent) load-balancing mechanisms. The analysis of these schemes builds upon the analytical methods developed and the results obtained in this paper, and is a subject of ongoing investigation.
APPENDIX A
SCENARIO FOR EXTREME ASYMMETRY
We show here that the scenario with the extreme plane asymmetries considered in Section III-A is feasible under any of the state-independent load-balancing schemes considered.¹ It is based on the property that none of these schemes can exclude the possibility that successive cells of a given flow are dispatched to the same plane. While for the random scheme this is apparent, for the other two schemes, this is less so. Let us, for instance, assume a round-robin mechanism. We begin the construction of the scenario by initially considering an empty system and flows arriving at all ingress-inputs at full rate, destined to the first egress-output of the first output adapter. This hot-spot scenario eventually causes the egress-output buffer, as well as the crosspoint buffers of the first column of each plane, to fill up.²
Owing to the assumption of an expansion factor greater than or equal to one, the potential input rate is larger than the rate at which the egress-output buffer is emptied. Therefore, the egress-output buffer remains filled and exercises constant backpressure to the planes. Consequently, a period of one arbitration cycle lasts a fixed number of cell cycles, which bounds the rate at an ingress-input per arbitration cycle. We now show how one arrives at the scenario described by considering appropriately synchronized arrival patterns. Suppose, without loss of generality, that each of the round-robin input load-balancers points to the first plane. The plane asymmetry is now created by changing the arrival pattern as follows. Let us consider the arbitration cycle that begins at the instant when one cell place becomes available in the first-column buffer of the first plane, owing to a cell transfer to the output adapter.³ All flows stemming from the first input adapter are stopped, except the one between the first ingress-input port and the first egress-output port. The rate of this flow is changed to one (shaded) cell per arbitration cycle, and its first cell is transferred to the corresponding buffer of the first plane.
¹ Although this scenario refers to a crosspoint-queued architecture, a similar scenario can be constructed in the case of a dedicated/shared output-queued architecture.
² Note that when the crosspoint buffers of the switches are filled before the egress-output buffer, the input VOQs also start filling up. One then arrives at this state by allowing the egress-output buffer to fill up as well, and subsequently stopping the incoming traffic until the input VOQs become empty.
³ In the case of a cyclic load-balancing mechanism, we assume that the flow out of the output adapter is momentarily interrupted until the cell transfer occurs at a slot preassigned to the first plane.
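The state-independent round-robin dispatcher assumed throughout this construction can be sketched minimally (class and attribute names are ours). The property the scenario relies on, namely that after dispatching one cell to each of the K planes the pointer returns to the first plane, falls out directly:

```python
class RoundRobinBalancer:
    """State-independent round-robin load balancer over K planes
    (illustrative sketch)."""

    def __init__(self, K):
        self.K = K
        self.ptr = 0           # plane that receives the next cell

    def dispatch(self):
        """Return the plane for the next cell and advance the pointer."""
        plane = self.ptr
        self.ptr = (self.ptr + 1) % self.K
        return plane
```

Because the pointer state repeats with period K, repeating the same K-cycle arrival pattern reproduces the same dispatch decisions in every interval, which is exactly what the scenario exploits.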
Consider now the instant (within the same arbitration cycle) when one cell place becomes available in the corresponding buffer of the second plane, owing to a cell transfer to the output adapter. A new flow between the first ingress-input port and the last egress-output port is initiated at a matching rate. According to the dynamic state-independent schemes considered, the first cell of this flow is transferred to the corresponding buffer of the second plane, and the subsequent cells to the subsequent planes. It is now evident that at the end of the arbitration cycle, the buffer of the first plane is full, whereas the buffers of the remaining planes are no longer full (one cell place is empty). This is the first instance of asymmetry between the planes. The above sequence of arrival patterns is now successively repeated in the subsequent arbitration cycles at the remaining input adapters, such that the buffers of the first plane are full, in contrast to the buffers of the remaining planes, which are no longer full (one cell place is empty). Note also that at the end of this interval, each of the round-robin input load-balancers has dispatched cells to all planes, and therefore points to the first plane, just as it did at the beginning of the interval. The same holds in the case of the cyclic load-balancing mechanism. Repeating the scenario described above in the subsequent intervals will, therefore, result in the buffers of the first plane being always full, in contrast to the buffers of the remaining planes, which are continuously depleted until they become empty.
APPENDIX B
WORST-CASE RESEQUENCE-QUEUE SIZE
Theorem 2 states that the worst-case resequence-queue size can be achieved either by multiple flows or by a single flow. We choose here to present a scenario that builds upon the one presented in Appendix A and is based on a single flow, recognizing that a similar one can be constructed for the case of multiple flows. The corresponding condition of Theorem 2 implies that all resequence cells belong to a single flow.
This flow is assumed to correspond to the pair between the first ingress-input port and the first egress-output port, with its rate now changed to the maximum rate of (black) cells. The first, second, third, and subsequent cells of this flow are assumed to join the buffers of the first, second, third, and subsequent planes, respectively. In the following arbitration cycle, the second and subsequent cells of the flow (along with a cell from the buffer of the first plane) will be transferred to the egress-output buffer and join the resequence queue. Note also that the next cell of the flow will be dispatched to the second plane, because the buffer of the first plane is full. In this way, at each arbitration cycle, cells of the flow will be transferred from the remaining planes to the egress-output buffer. All these cells join the resequence queue, as they have to wait for the first cell of the flow, which is stored in the buffer of the first plane. This cell will be dispatched to the egress-output buffer after a number of arbitration cycles corresponding to the number of cells in the first plane upon its arrival. Consequently, the resequence-queue size will grow to the value given by (18), which is the worst case.

APPENDIX C
PROPERTIES RELATED TO THE BUFFERED-CROSSPOINT ARCHITECTURE
Proof of Lemma 2: First, we show that the vector results in the largest resequencing degree. This follows immediately from (9). To show that the vector is also maximal, it suffices to show that there is a maximal vector of this form. Suppose there is a maximal vector in which some pair of elements violates this form. We will show that the vector obtained by shifting a unit from one element of the pair to the other is also maximal. Clearly, repeatedly applying this procedure results in a maximal vector of the desired form. The following cases are considered.
1) In the first case, it follows from (12) that the resulting quantity does not decrease, and, by virtue of definition (16), the modified vector is also maximal.
2) In the second case, it likewise follows from (12) that the resulting quantity does not decrease, and, by virtue of definition (16), the modified vector is also maximal.
Proof of Lemma 3: Substituting (17) into (15) yields the first expression. We proceed by observing that the first inequality holds with equality iff the corresponding boundary condition is satisfied. Furthermore, the second inequality holds with equality iff an equivalent boundary condition is satisfied. Consequently, the combined inequality holds with equality iff either of the two conditions of the lemma is satisfied, from which the result follows.
Proof of Lemma 4: Conditioning first on one quantity and then on the other, (14) and (11) yield the stated bound, where the two terms of the maximum are equal and are obtained under the two respective conditions.
REFERENCES
[1] T. Aramaki, H. Suzuki, S.-I. Hayano, and T. Takeuchi, "Parallel ATOM switch architecture for high-speed ATM networks," in Proc. IEEE Int. Conf. Commun., Chicago, IL, Jun. 1992, vol. 1.
[2] S. Iyer, A. Awadallah, and N. McKeown, "Analysis of a packet switch with memories running slower than the line rate," in Proc. IEEE INFOCOM, Tel Aviv, Israel, Mar. 2000, vol. 2.
[3] S. Iyer and N.
McKeown, "Making parallel packet switches practical," in Proc. IEEE INFOCOM, Anchorage, AK, Apr. 2001, vol. 3.
[4] W. Wang, L. Dong, and W. Wolf, "Distributed switch architecture with dynamic load-balancing and parallel input-queued crossbars for terabit switch fabrics," in Proc. IEEE INFOCOM, New York, NY, Jun. 2002, vol. 1.
[5] F. Abel, C. Minkenberg, R. P. Luijten, M. Gusat, and I. Iliadis, "A four-terabit packet switch supporting long round-trip times," IEEE Micro, vol. 23, no. 1, Jan./Feb.
[6] F. M. Chiussi, J. G. Kneuer, and V. P. Kumar, "Low-cost scalable switching solutions for broadband networking: The ATLANTA architecture and chipset," IEEE Commun. Mag., vol. 35, no. 3, Mar.
[7] M. A. Henrion, G. J. Eilenberger, G. H. Petit, and P. H. Parmentier, "Multipath self-routing switch," IEEE Commun. Mag., vol. 31, no. 4, Apr.
[8] I. Iliadis and Y.-C. Lien, "Resequencing in distributed systems with multiple classes," in Proc. IEEE INFOCOM, New Orleans, LA, Mar. 1988.
[9] I. Iliadis and L. Y.-C. Lien, "Resequencing delay for a queueing system with two heterogeneous servers under a threshold-type scheduling," IEEE Trans. Commun., vol. 36, no. 6, Jun.
[10] I. Iliadis and L. Y.-C. Lien, "Resequencing control for a queueing system with two heterogeneous servers," IEEE Trans. Commun., vol. 41, no. 6, Jun.
[11] N. Gogate and S. S. Panwar, "On a resequencing model for high speed networks," in Proc. IEEE INFOCOM, Toronto, ON, Canada, Jun. 1994, vol. 1.
[12] H. J. Chao and J. S. Park, "Architecture designs of a large-capacity Abacus ATM switch," in Proc. IEEE GLOBECOM, Sydney, Australia, Nov. 1998, vol. 1.
Ilias Iliadis (S'84-M'88-SM'99) received the B.S. degree in electrical engineering in 1983 from the National Technical University of Athens, Athens, Greece, the M.S. degree in 1984 from Columbia University, New York, NY, as a Fulbright Scholar, and the Ph.D. degree in electrical engineering in 1988, also from Columbia University.
He has been with the IBM Zurich Research Laboratory, Rüschlikon, Switzerland, where he was responsible for the performance evaluation of IBM's PRIZMA switch. His research interests include performance evaluation, optimization and control of computer communication networks and storage systems, traffic control and engineering for networks, switch architectures, and stochastic systems. He holds several patents.
Dr. Iliadis is a member of IFIP Working Group 6.3, Sigma Xi, and the Technical Chamber of Greece. He served as a Technical Program Co-Chair for the IFIP Networking 2004 Conference.

Wolfgang E. Denzel received the M.S. and Ph.D. degrees in electrical engineering from Stuttgart University, Stuttgart, Germany, in 1979 and 1986, respectively.
Since 1985, he has been a Researcher at the IBM Zurich Research Laboratory, Rüschlikon, Switzerland, where he was responsible for the architectural design and performance evaluation of IBM's PRIZMA switch. He has worked on system aspects of ATM-based corporate networks and corporate optical networks; in these fields, he participated in several European RACE projects and coordinated the ACTS COBNET project. His recent interests are in server interconnection networks, congestion control in lossless networks, and end-to-end simulation techniques for high-performance computing systems. He holds several patents.


More information

The GLIMPS Terabit Packet Switching Engine

The GLIMPS Terabit Packet Switching Engine February 2002 The GLIMPS Terabit Packet Switching Engine I. Elhanany, O. Beeri Terabit Packet Switching Challenges The ever-growing demand for additional bandwidth reflects on the increasing capacity requirements

More information

A Path Decomposition Approach for Computing Blocking Probabilities in Wavelength-Routing Networks

A Path Decomposition Approach for Computing Blocking Probabilities in Wavelength-Routing Networks IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 8, NO. 6, DECEMBER 2000 747 A Path Decomposition Approach for Computing Blocking Probabilities in Wavelength-Routing Networks Yuhong Zhu, George N. Rouskas, Member,

More information

Joint Entity Resolution

Joint Entity Resolution Joint Entity Resolution Steven Euijong Whang, Hector Garcia-Molina Computer Science Department, Stanford University 353 Serra Mall, Stanford, CA 94305, USA {swhang, hector}@cs.stanford.edu No Institute

More information

Analysis of Binary Adjustment Algorithms in Fair Heterogeneous Networks

Analysis of Binary Adjustment Algorithms in Fair Heterogeneous Networks Analysis of Binary Adjustment Algorithms in Fair Heterogeneous Networks Sergey Gorinsky Harrick Vin Technical Report TR2000-32 Department of Computer Sciences, University of Texas at Austin Taylor Hall

More information

13 Sensor networks Gathering in an adversarial environment

13 Sensor networks Gathering in an adversarial environment 13 Sensor networks Wireless sensor systems have a broad range of civil and military applications such as controlling inventory in a warehouse or office complex, monitoring and disseminating traffic conditions,

More information

Unit 2 Packet Switching Networks - II

Unit 2 Packet Switching Networks - II Unit 2 Packet Switching Networks - II Dijkstra Algorithm: Finding shortest path Algorithm for finding shortest paths N: set of nodes for which shortest path already found Initialization: (Start with source

More information

Shared-Memory Combined Input-Crosspoint Buffered Packet Switch for Differentiated Services

Shared-Memory Combined Input-Crosspoint Buffered Packet Switch for Differentiated Services Shared-Memory Combined -Crosspoint Buffered Packet Switch for Differentiated Services Ziqian Dong and Roberto Rojas-Cessa Department of Electrical and Computer Engineering New Jersey Institute of Technology

More information

DiffServ Architecture: Impact of scheduling on QoS

DiffServ Architecture: Impact of scheduling on QoS DiffServ Architecture: Impact of scheduling on QoS Abstract: Scheduling is one of the most important components in providing a differentiated service at the routers. Due to the varying traffic characteristics

More information

On the Relationships between Zero Forcing Numbers and Certain Graph Coverings

On the Relationships between Zero Forcing Numbers and Certain Graph Coverings On the Relationships between Zero Forcing Numbers and Certain Graph Coverings Fatemeh Alinaghipour Taklimi, Shaun Fallat 1,, Karen Meagher 2 Department of Mathematics and Statistics, University of Regina,

More information

Byzantine Consensus in Directed Graphs

Byzantine Consensus in Directed Graphs Byzantine Consensus in Directed Graphs Lewis Tseng 1,3, and Nitin Vaidya 2,3 1 Department of Computer Science, 2 Department of Electrical and Computer Engineering, and 3 Coordinated Science Laboratory

More information

Stop-and-Go Service Using Hierarchical Round Robin

Stop-and-Go Service Using Hierarchical Round Robin Stop-and-Go Service Using Hierarchical Round Robin S. Keshav AT&T Bell Laboratories 600 Mountain Avenue, Murray Hill, NJ 07974, USA keshav@research.att.com Abstract The Stop-and-Go service discipline allows

More information

Optical Packet Switching

Optical Packet Switching Optical Packet Switching DEISNet Gruppo Reti di Telecomunicazioni http://deisnet.deis.unibo.it WDM Optical Network Legacy Networks Edge Systems WDM Links λ 1 λ 2 λ 3 λ 4 Core Nodes 2 1 Wavelength Routing

More information

Distributed minimum spanning tree problem

Distributed minimum spanning tree problem Distributed minimum spanning tree problem Juho-Kustaa Kangas 24th November 2012 Abstract Given a connected weighted undirected graph, the minimum spanning tree problem asks for a spanning subtree with

More information

Ch 4 : CPU scheduling

Ch 4 : CPU scheduling Ch 4 : CPU scheduling It's the basis of multiprogramming operating systems. By switching the CPU among processes, the operating system can make the computer more productive In a single-processor system,

More information

Quality of Service (QoS)

Quality of Service (QoS) Quality of Service (QoS) The Internet was originally designed for best-effort service without guarantee of predictable performance. Best-effort service is often sufficient for a traffic that is not sensitive

More information

From Static to Dynamic Routing: Efficient Transformations of Store-and-Forward Protocols

From Static to Dynamic Routing: Efficient Transformations of Store-and-Forward Protocols SIAM Journal on Computing to appear From Static to Dynamic Routing: Efficient Transformations of StoreandForward Protocols Christian Scheideler Berthold Vöcking Abstract We investigate how static storeandforward

More information

Multicast Traffic in Input-Queued Switches: Optimal Scheduling and Maximum Throughput

Multicast Traffic in Input-Queued Switches: Optimal Scheduling and Maximum Throughput IEEE/ACM TRANSACTIONS ON NETWORKING, VOL 11, NO 3, JUNE 2003 465 Multicast Traffic in Input-Queued Switches: Optimal Scheduling and Maximum Throughput Marco Ajmone Marsan, Fellow, IEEE, Andrea Bianco,

More information

Experimental Extensions to RSVP Remote Client and One-Pass Signalling

Experimental Extensions to RSVP Remote Client and One-Pass Signalling 1 Experimental Extensions to RSVP Remote Client and One-Pass Signalling Industrial Process and System Communications, Darmstadt University of Technology Merckstr. 25 D-64283 Darmstadt Germany Martin.Karsten@KOM.tu-darmstadt.de

More information

Topic 4a Router Operation and Scheduling. Ch4: Network Layer: The Data Plane. Computer Networking: A Top Down Approach

Topic 4a Router Operation and Scheduling. Ch4: Network Layer: The Data Plane. Computer Networking: A Top Down Approach Topic 4a Router Operation and Scheduling Ch4: Network Layer: The Data Plane Computer Networking: A Top Down Approach 7 th edition Jim Kurose, Keith Ross Pearson/Addison Wesley April 2016 4-1 Chapter 4:

More information

EXTREME POINTS AND AFFINE EQUIVALENCE

EXTREME POINTS AND AFFINE EQUIVALENCE EXTREME POINTS AND AFFINE EQUIVALENCE The purpose of this note is to use the notions of extreme points and affine transformations which are studied in the file affine-convex.pdf to prove that certain standard

More information

Scaling Internet Routers Using Optics Producing a 100TB/s Router. Ashley Green and Brad Rosen February 16, 2004

Scaling Internet Routers Using Optics Producing a 100TB/s Router. Ashley Green and Brad Rosen February 16, 2004 Scaling Internet Routers Using Optics Producing a 100TB/s Router Ashley Green and Brad Rosen February 16, 2004 Presentation Outline Motivation Avi s Black Box Black Box: Load Balance Switch Conclusion

More information

A Reduction of Conway s Thrackle Conjecture

A Reduction of Conway s Thrackle Conjecture A Reduction of Conway s Thrackle Conjecture Wei Li, Karen Daniels, and Konstantin Rybnikov Department of Computer Science and Department of Mathematical Sciences University of Massachusetts, Lowell 01854

More information

Title: Proposed modifications to Performance Testing Baseline: Throughput and Latency Metrics

Title: Proposed modifications to Performance Testing Baseline: Throughput and Latency Metrics 1 ATM Forum Document Number: ATM_Forum/97-0426. Title: Proposed modifications to Performance Testing Baseline: Throughput and Latency Metrics Abstract: This revised text of the baseline includes better

More information

Algorithms for Provisioning Virtual Private Networks in the Hose Model

Algorithms for Provisioning Virtual Private Networks in the Hose Model IEEE/ACM TRANSACTIONS ON NETWORKING, VOL 10, NO 4, AUGUST 2002 565 Algorithms for Provisioning Virtual Private Networks in the Hose Model Amit Kumar, Rajeev Rastogi, Avi Silberschatz, Fellow, IEEE, and

More information

Configuring QoS. Finding Feature Information. Prerequisites for QoS. General QoS Guidelines

Configuring QoS. Finding Feature Information. Prerequisites for QoS. General QoS Guidelines Finding Feature Information, on page 1 Prerequisites for QoS, on page 1 Restrictions for QoS, on page 2 Information About QoS, on page 2 How to Configure QoS, on page 10 Monitoring Standard QoS, on page

More information

588 IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 7, NO. 4, AUGUST 1999

588 IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 7, NO. 4, AUGUST 1999 588 IEEE/ACM TRANSACTIONS ON NETWORKING, VOL 7, NO 4, AUGUST 1999 The Deflection Self-Routing Banyan Network: A Large-Scale ATM Switch Using the Fully Adaptive Self-Routing its Performance Analyses Jae-Hyun

More information

Performance of UMTS Radio Link Control

Performance of UMTS Radio Link Control Performance of UMTS Radio Link Control Qinqing Zhang, Hsuan-Jung Su Bell Laboratories, Lucent Technologies Holmdel, NJ 77 Abstract- The Radio Link Control (RLC) protocol in Universal Mobile Telecommunication

More information

Technical Notes. QoS Features on the Business Ethernet Switch 50 (BES50)

Technical Notes. QoS Features on the Business Ethernet Switch 50 (BES50) Technical Notes QoS Features on the Business Ethernet Switch 50 (BES50) Version: NN70000-004 issue 1.00 Date: February 3 rd, 2009 Status: Released Copyright 2009 Nortel Networks. All rights reserved. The

More information

Module 17: "Interconnection Networks" Lecture 37: "Introduction to Routers" Interconnection Networks. Fundamentals. Latency and bandwidth

Module 17: Interconnection Networks Lecture 37: Introduction to Routers Interconnection Networks. Fundamentals. Latency and bandwidth Interconnection Networks Fundamentals Latency and bandwidth Router architecture Coherence protocol and routing [From Chapter 10 of Culler, Singh, Gupta] file:///e /parallel_com_arch/lecture37/37_1.htm[6/13/2012

More information

Shared-Memory Combined Input-Crosspoint Buffered Packet Switch for Differentiated Services

Shared-Memory Combined Input-Crosspoint Buffered Packet Switch for Differentiated Services Shared-Memory Combined -Crosspoint Buffered Packet Switch for Differentiated Services Ziqian Dong and Roberto Rojas-Cessa Department of Electrical and Computer Engineering New Jersey Institute of Technology

More information

Advanced Algorithms Class Notes for Monday, October 23, 2012 Min Ye, Mingfu Shao, and Bernard Moret

Advanced Algorithms Class Notes for Monday, October 23, 2012 Min Ye, Mingfu Shao, and Bernard Moret Advanced Algorithms Class Notes for Monday, October 23, 2012 Min Ye, Mingfu Shao, and Bernard Moret Greedy Algorithms (continued) The best known application where the greedy algorithm is optimal is surely

More information

Job-shop scheduling with limited capacity buffers

Job-shop scheduling with limited capacity buffers Job-shop scheduling with limited capacity buffers Peter Brucker, Silvia Heitmann University of Osnabrück, Department of Mathematics/Informatics Albrechtstr. 28, D-49069 Osnabrück, Germany {peter,sheitman}@mathematik.uni-osnabrueck.de

More information

ACONCURRENT system may be viewed as a collection of

ACONCURRENT system may be viewed as a collection of 252 IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 10, NO. 3, MARCH 1999 Constructing a Reliable Test&Set Bit Frank Stomp and Gadi Taubenfeld AbstractÐThe problem of computing with faulty

More information

Latency on a Switched Ethernet Network

Latency on a Switched Ethernet Network FAQ 07/2014 Latency on a Switched Ethernet Network RUGGEDCOM Ethernet Switches & Routers http://support.automation.siemens.com/ww/view/en/94772587 This entry is from the Siemens Industry Online Support.

More information

H3C S9500 QoS Technology White Paper

H3C S9500 QoS Technology White Paper H3C Key words: QoS, quality of service Abstract: The Ethernet technology is widely applied currently. At present, Ethernet is the leading technology in various independent local area networks (LANs), and

More information

The Encoding Complexity of Network Coding

The Encoding Complexity of Network Coding The Encoding Complexity of Network Coding Michael Langberg Alexander Sprintson Jehoshua Bruck California Institute of Technology Email mikel,spalex,bruck @caltech.edu Abstract In the multicast network

More information

FIRM: A Class of Distributed Scheduling Algorithms for High-speed ATM Switches with Multiple Input Queues

FIRM: A Class of Distributed Scheduling Algorithms for High-speed ATM Switches with Multiple Input Queues FIRM: A Class of Distributed Scheduling Algorithms for High-speed ATM Switches with Multiple Input Queues D.N. Serpanos and P.I. Antoniadis Department of Computer Science University of Crete Knossos Avenue

More information

Chapter III. congestion situation in Highspeed Networks

Chapter III. congestion situation in Highspeed Networks Chapter III Proposed model for improving the congestion situation in Highspeed Networks TCP has been the most used transport protocol for the Internet for over two decades. The scale of the Internet and

More information

Providing Flow Based Performance Guarantees for Buffered Crossbar Switches

Providing Flow Based Performance Guarantees for Buffered Crossbar Switches Providing Flow Based Performance Guarantees for Buffered Crossbar Switches Deng Pan Dept. of Electrical & Computer Engineering Florida International University Miami, Florida 33174, USA pand@fiu.edu Yuanyuan

More information

36 IEEE TRANSACTIONS ON BROADCASTING, VOL. 54, NO. 1, MARCH 2008

36 IEEE TRANSACTIONS ON BROADCASTING, VOL. 54, NO. 1, MARCH 2008 36 IEEE TRANSACTIONS ON BROADCASTING, VOL. 54, NO. 1, MARCH 2008 Continuous-Time Collaborative Prefetching of Continuous Media Soohyun Oh, Beshan Kulapala, Andréa W. Richa, and Martin Reisslein Abstract

More information

Lecture (08, 09) Routing in Switched Networks

Lecture (08, 09) Routing in Switched Networks Agenda Lecture (08, 09) Routing in Switched Networks Dr. Ahmed ElShafee Routing protocols Fixed Flooding Random Adaptive ARPANET Routing Strategies ١ Dr. Ahmed ElShafee, ACU Fall 2011, Networks I ٢ Dr.

More information

Latency on a Switched Ethernet Network

Latency on a Switched Ethernet Network Page 1 of 6 1 Introduction This document serves to explain the sources of latency on a switched Ethernet network and describe how to calculate cumulative latency as well as provide some real world examples.

More information

1 Introduction

1 Introduction Published in IET Communications Received on 17th September 2009 Revised on 3rd February 2010 ISSN 1751-8628 Multicast and quality of service provisioning in parallel shared memory switches B. Matthews

More information

Toward a Reliable Data Transport Architecture for Optical Burst-Switched Networks

Toward a Reliable Data Transport Architecture for Optical Burst-Switched Networks Toward a Reliable Data Transport Architecture for Optical Burst-Switched Networks Dr. Vinod Vokkarane Assistant Professor, Computer and Information Science Co-Director, Advanced Computer Networks Lab University

More information

Chapter 24 Congestion Control and Quality of Service 24.1

Chapter 24 Congestion Control and Quality of Service 24.1 Chapter 24 Congestion Control and Quality of Service 24.1 Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 24-1 DATA TRAFFIC The main focus of congestion control

More information

However, this is not always true! For example, this fails if both A and B are closed and unbounded (find an example).

However, this is not always true! For example, this fails if both A and B are closed and unbounded (find an example). 98 CHAPTER 3. PROPERTIES OF CONVEX SETS: A GLIMPSE 3.2 Separation Theorems It seems intuitively rather obvious that if A and B are two nonempty disjoint convex sets in A 2, then there is a line, H, separating

More information

QUALITY of SERVICE. Introduction

QUALITY of SERVICE. Introduction QUALITY of SERVICE Introduction There are applications (and customers) that demand stronger performance guarantees from the network than the best that could be done under the circumstances. Multimedia

More information

Crossbar - example. Crossbar. Crossbar. Combination: Time-space switching. Simple space-division switch Crosspoints can be turned on or off

Crossbar - example. Crossbar. Crossbar. Combination: Time-space switching. Simple space-division switch Crosspoints can be turned on or off Crossbar Crossbar - example Simple space-division switch Crosspoints can be turned on or off i n p u t s sessions: (,) (,) (,) (,) outputs Crossbar Advantages: simple to implement simple control flexible

More information

Thwarting Traceback Attack on Freenet

Thwarting Traceback Attack on Freenet Thwarting Traceback Attack on Freenet Guanyu Tian, Zhenhai Duan Florida State University {tian, duan}@cs.fsu.edu Todd Baumeister, Yingfei Dong University of Hawaii {baumeist, yingfei}@hawaii.edu Abstract

More information

On the Max Coloring Problem

On the Max Coloring Problem On the Max Coloring Problem Leah Epstein Asaf Levin May 22, 2010 Abstract We consider max coloring on hereditary graph classes. The problem is defined as follows. Given a graph G = (V, E) and positive

More information

Hardware Assisted Recursive Packet Classification Module for IPv6 etworks ABSTRACT

Hardware Assisted Recursive Packet Classification Module for IPv6 etworks ABSTRACT Hardware Assisted Recursive Packet Classification Module for IPv6 etworks Shivvasangari Subramani [shivva1@umbc.edu] Department of Computer Science and Electrical Engineering University of Maryland Baltimore

More information

Network Working Group Request for Comments: 1046 ISI February A Queuing Algorithm to Provide Type-of-Service for IP Links

Network Working Group Request for Comments: 1046 ISI February A Queuing Algorithm to Provide Type-of-Service for IP Links Network Working Group Request for Comments: 1046 W. Prue J. Postel ISI February 1988 A Queuing Algorithm to Provide Type-of-Service for IP Links Status of this Memo This memo is intended to explore how

More information

Dynamic Scheduling Algorithm for input-queued crossbar switches

Dynamic Scheduling Algorithm for input-queued crossbar switches Dynamic Scheduling Algorithm for input-queued crossbar switches Mihir V. Shah, Mehul C. Patel, Dinesh J. Sharma, Ajay I. Trivedi Abstract Crossbars are main components of communication switches used to

More information

IEEE TRANSACTIONS ON MULTIMEDIA, VOL. 9, NO. 3, APRIL

IEEE TRANSACTIONS ON MULTIMEDIA, VOL. 9, NO. 3, APRIL IEEE TRANSACTIONS ON MULTIMEDIA, VOL. 9, NO. 3, APRIL 2007 629 Video Packet Selection and Scheduling for Multipath Streaming Dan Jurca, Student Member, IEEE, and Pascal Frossard, Senior Member, IEEE Abstract

More information

UNIT-II OVERVIEW OF PHYSICAL LAYER SWITCHING & MULTIPLEXING

UNIT-II OVERVIEW OF PHYSICAL LAYER SWITCHING & MULTIPLEXING 1 UNIT-II OVERVIEW OF PHYSICAL LAYER SWITCHING & MULTIPLEXING Syllabus: Physical layer and overview of PL Switching: Multiplexing: frequency division multiplexing, wave length division multiplexing, synchronous

More information

An Approach to Task Attribute Assignment for Uniprocessor Systems

An Approach to Task Attribute Assignment for Uniprocessor Systems An Approach to ttribute Assignment for Uniprocessor Systems I. Bate and A. Burns Real-Time Systems Research Group Department of Computer Science University of York York, United Kingdom e-mail: fijb,burnsg@cs.york.ac.uk

More information

Scheduling Unsplittable Flows Using Parallel Switches

Scheduling Unsplittable Flows Using Parallel Switches Scheduling Unsplittable Flows Using Parallel Switches Saad Mneimneh, Kai-Yeung Siu Massachusetts Institute of Technology 77 Massachusetts Avenue Room -07, Cambridge, MA 039 Abstract We address the problem

More information

Generic Architecture. EECS 122: Introduction to Computer Networks Switch and Router Architectures. Shared Memory (1 st Generation) Today s Lecture

Generic Architecture. EECS 122: Introduction to Computer Networks Switch and Router Architectures. Shared Memory (1 st Generation) Today s Lecture Generic Architecture EECS : Introduction to Computer Networks Switch and Router Architectures Computer Science Division Department of Electrical Engineering and Computer Sciences University of California,

More information

An algorithm for Performance Analysis of Single-Source Acyclic graphs

An algorithm for Performance Analysis of Single-Source Acyclic graphs An algorithm for Performance Analysis of Single-Source Acyclic graphs Gabriele Mencagli September 26, 2011 In this document we face with the problem of exploiting the performance analysis of acyclic graphs

More information

PERSONAL communications service (PCS) provides

PERSONAL communications service (PCS) provides 646 IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 5, NO. 5, OCTOBER 1997 Dynamic Hierarchical Database Architecture for Location Management in PCS Networks Joseph S. M. Ho, Member, IEEE, and Ian F. Akyildiz,

More information

Implementing Access Lists and Prefix Lists

Implementing Access Lists and Prefix Lists An access control list (ACL) consists of one or more access control entries (ACE) that collectively define the network traffic profile. This profile can then be referenced by Cisco IOS XR softwarefeatures

More information

RECHOKe: A Scheme for Detection, Control and Punishment of Malicious Flows in IP Networks

RECHOKe: A Scheme for Detection, Control and Punishment of Malicious Flows in IP Networks > REPLACE THIS LINE WITH YOUR PAPER IDENTIFICATION NUMBER (DOUBLE-CLICK HERE TO EDIT) < : A Scheme for Detection, Control and Punishment of Malicious Flows in IP Networks Visvasuresh Victor Govindaswamy,

More information

3186 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 51, NO. 9, SEPTEMBER Zero/Positive Capacities of Two-Dimensional Runlength-Constrained Arrays

3186 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 51, NO. 9, SEPTEMBER Zero/Positive Capacities of Two-Dimensional Runlength-Constrained Arrays 3186 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL 51, NO 9, SEPTEMBER 2005 Zero/Positive Capacities of Two-Dimensional Runlength-Constrained Arrays Tuvi Etzion, Fellow, IEEE, and Kenneth G Paterson, Member,

More information

Introduction to Real-Time Communications. Real-Time and Embedded Systems (M) Lecture 15

Introduction to Real-Time Communications. Real-Time and Embedded Systems (M) Lecture 15 Introduction to Real-Time Communications Real-Time and Embedded Systems (M) Lecture 15 Lecture Outline Modelling real-time communications Traffic and network models Properties of networks Throughput, delay

More information

Performance of Multihop Communications Using Logical Topologies on Optical Torus Networks

Performance of Multihop Communications Using Logical Topologies on Optical Torus Networks Performance of Multihop Communications Using Logical Topologies on Optical Torus Networks X. Yuan, R. Melhem and R. Gupta Department of Computer Science University of Pittsburgh Pittsburgh, PA 156 fxyuan,

More information

Resource Reservation Protocol

Resource Reservation Protocol 48 CHAPTER Chapter Goals Explain the difference between and routing protocols. Name the three traffic types supported by. Understand s different filter and style types. Explain the purpose of tunneling.

More information

Sections Describing Standard Software Features

Sections Describing Standard Software Features 30 CHAPTER This chapter describes how to configure quality of service (QoS) by using automatic-qos (auto-qos) commands or by using standard QoS commands. With QoS, you can give preferential treatment to

More information

Sections Describing Standard Software Features

Sections Describing Standard Software Features 27 CHAPTER This chapter describes how to configure quality of service (QoS) by using automatic-qos (auto-qos) commands or by using standard QoS commands. With QoS, you can give preferential treatment to

More information

1188 IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 13, NO. 5, OCTOBER Wei Sun, Student Member, IEEE, and Kang G. Shin, Fellow, IEEE

1188 IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 13, NO. 5, OCTOBER Wei Sun, Student Member, IEEE, and Kang G. Shin, Fellow, IEEE 1188 IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 13, NO. 5, OCTOBER 2005 End-to-End Delay Bounds for Traffic Aggregates Under Guaranteed-Rate Scheduling Algorithms Wei Sun, Student Member, IEEE, and Kang

More information

Network Layer Enhancements

Network Layer Enhancements Network Layer Enhancements EECS 122: Lecture 14 Department of Electrical Engineering and Computer Sciences University of California Berkeley Today We have studied the network layer mechanisms that enable

More information

CS 556 Advanced Computer Networks Spring Solutions to Midterm Test March 10, YOUR NAME: Abraham MATTA

CS 556 Advanced Computer Networks Spring Solutions to Midterm Test March 10, YOUR NAME: Abraham MATTA CS 556 Advanced Computer Networks Spring 2011 Solutions to Midterm Test March 10, 2011 YOUR NAME: Abraham MATTA This test is closed books. You are only allowed to have one sheet of notes (8.5 11 ). Please

More information

Resource Sharing for QoS in Agile All Photonic Networks

Resource Sharing for QoS in Agile All Photonic Networks Resource Sharing for QoS in Agile All Photonic Networks Anton Vinokurov, Xiao Liu, Lorne G Mason Department of Electrical and Computer Engineering, McGill University, Montreal, Canada, H3A 2A7 E-mail:

More information

Wireless Networks (CSC-7602) Lecture 8 (15 Oct. 2007)

Wireless Networks (CSC-7602) Lecture 8 (15 Oct. 2007) Wireless Networks (CSC-7602) Lecture 8 (15 Oct. 2007) Seung-Jong Park (Jay) http://www.csc.lsu.edu/~sjpark 1 Today Wireline Fair Schedulling Why? Ideal algorithm Practical algorithms Wireless Fair Scheduling

More information

Dell PowerVault MD3600f/MD3620f Remote Replication Functional Guide

Dell PowerVault MD3600f/MD3620f Remote Replication Functional Guide Dell PowerVault MD3600f/MD3620f Remote Replication Functional Guide Page i THIS WHITE PAPER IS FOR INFORMATIONAL PURPOSES ONLY, AND MAY CONTAIN TYPOGRAPHICAL ERRORS AND TECHNICAL INACCURACIES. THE CONTENT

More information

Precomputation Schemes for QoS Routing

Precomputation Schemes for QoS Routing 578 IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 11, NO. 4, AUGUST 2003 Precomputation Schemes for QoS Routing Ariel Orda, Senior Member, IEEE, and Alexander Sprintson, Student Member, IEEE Abstract Precomputation-based

More information

Application Layer Multicast Algorithm

Application Layer Multicast Algorithm Application Layer Multicast Algorithm Sergio Machado Universitat Politècnica de Catalunya Castelldefels Javier Ozón Universitat Politècnica de Catalunya Castelldefels Abstract This paper presents a multicast

More information

Chapter 4 Network Layer: The Data Plane

Chapter 4 Network Layer: The Data Plane Chapter 4 Network Layer: The Data Plane A note on the use of these Powerpoint slides: We re making these slides freely available to all (faculty, students, readers). They re in PowerPoint form so you see

More information

THE TRANSPORT LAYER UNIT IV

THE TRANSPORT LAYER UNIT IV THE TRANSPORT LAYER UNIT IV The Transport Layer: The Transport Service, Elements of Transport Protocols, Congestion Control,The internet transport protocols: UDP, TCP, Performance problems in computer

More information

NOTE: The S9500E switch series supports HDLC encapsulation only on POS interfaces. Enabling HDLC encapsulation on an interface

NOTE: The S9500E switch series supports HDLC encapsulation only on POS interfaces. Enabling HDLC encapsulation on an interface Contents Configuring HDLC 1 Overview 1 HDLC frame format and frame type 1 Enabling HDLC encapsulation on an interface 1 Configuring an IP address for an interface 2 Configuring the link status polling

More information

Configuring Fabric Congestion Control and QoS

Configuring Fabric Congestion Control and QoS CHAPTER 1 Fibre Channel Congestion Control (FCC) is a Cisco proprietary flow control mechanism that alleviates congestion on Fibre Channel networks. Quality of service () offers the following advantages:

More information

4.2 Virtual Circuit and Datagram Networks

4.2 Virtual Circuit and Datagram Networks 4.2 VIRTUAL CIRCUIT AND DATAGRAM NETWORKS 313 Available bit rate (ABR) ATM network service. With the Internet offering socalled best-effort service, ATM s ABR might best be characterized as being a slightly-better-than-best-effort

More information

On The Complexity of Virtual Topology Design for Multicasting in WDM Trees with Tap-and-Continue and Multicast-Capable Switches

On The Complexity of Virtual Topology Design for Multicasting in WDM Trees with Tap-and-Continue and Multicast-Capable Switches On The Complexity of Virtual Topology Design for Multicasting in WDM Trees with Tap-and-Continue and Multicast-Capable Switches E. Miller R. Libeskind-Hadas D. Barnard W. Chang K. Dresner W. M. Turner

More information

Chapter -5 QUALITY OF SERVICE (QOS) PLATFORM DESIGN FOR REAL TIME MULTIMEDIA APPLICATIONS

Chapter -5 QUALITY OF SERVICE (QOS) PLATFORM DESIGN FOR REAL TIME MULTIMEDIA APPLICATIONS Chapter -5 QUALITY OF SERVICE (QOS) PLATFORM DESIGN FOR REAL TIME MULTIMEDIA APPLICATIONS Chapter 5 QUALITY OF SERVICE (QOS) PLATFORM DESIGN FOR REAL TIME MULTIMEDIA APPLICATIONS 5.1 Introduction For successful

More information