A Novel Feedback-based Two-stage Switch Architecture

Kwan L. Yeung and N. H. Liu
Dept. of Electrical and Electronic Engineering, The University of Hong Kong, Pokfulam, Hong Kong

Abstract

A load-balanced two-stage switch can eliminate the scheduler, is scalable, and provides close to 100% throughput. Its major problem is that packets can be mis-sequenced. Aiming at preventing packets from being received out of sequence at outputs, we provide an elegant solution to the mis-sequencing problem based on a novel two-stage switch architecture with feedback. The feedback path is constructed by judiciously selecting and coordinating the two sequences of N deterministic configurations used in the two stages of the switch, such that if middle-stage port j is connected to output k in slot t, then input k is connected to middle-stage port j in slot t+1. With a single packet buffer at each middle-stage VOQ, we show that the packet mis-sequencing problem is naturally solved. With this feedback architecture, middle-stage port j piggybacks an N-bit occupancy vector (1 bit for each VOQ) on the packet sent to output k in each time slot. As output k and input k reside on the same line-card, input k selects a packet for sending in the next time slot based on the occupancy vector received, a procedure known as port-based scheduling. Four simple port-based scheduling algorithms are proposed for load balancing in the first-stage switch. We show that they provide an unbeatable overall delay-throughput performance under various traffic conditions.

I. INTRODUCTION

With the continuous growth of bandwidth in fiber links, there is an urgent need to build high-speed switches/routers that keep pace with the increased transmission rate. Recently, a novel two-stage switch architecture was proposed [7] based on the concept of load balancing. It consists of two switch fabrics in tandem (Fig. 1), and each fabric is configured following a deterministic and periodic sequence of N configurations.
The only requirement is that each input is connected to each output exactly once in the sequence. There are many ways to generate such a sequence. As an example, a sequence can be constructed by cyclically shifting the set of input/output connections used in each time slot, such that at time slot t, input i (for i = 0, 1, 2, ..., N-1) is connected to output j, where j is given by

$$ j = (i + t) \bmod N. \qquad (1) $$

Each value of t corresponds to a configuration. Varying t from 0 to N-1 gives the required sequence of N configurations. Since the sequence is pre-determined, a scheduler [1][2] for finding the best configuration on a slot-by-slot basis is not required. This eliminates the computation overheads as well as the communication overheads [3][4][5][6].

In a two-stage switch, the first switch fabric is responsible for load balancing and the second fabric delivers packets based on their destination ports. The goal of load balancing is to make the traffic seen by the second switch fabric as evenly distributed as possible. In the original two-stage switch design [7], a packet is sent to the currently connected output of the first switch fabric (called a middle-stage output/port) as soon as it arrives. So no input port buffers/queues are required, and load balancing relies purely on the connection patterns obtained from the deterministic sequence of configurations. In [7], it is proved that such a basic load balancing scheme can already guarantee 100% throughput for a broad class of incoming traffic.

The major drawback of this two-stage approach is that packets can be mis-sequenced when they arrive at output ports (of the second switch fabric). This is because packets of the same flow (i.e. packets from the same input port to the same output port) are distributed to different middle-stage ports and thus experience different amounts of delay.
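As a concrete illustration of the cyclic-shift sequence in (1), the following minimal sketch builds the configuration for each slot and checks the stated requirement that every input meets every output of the fabric exactly once over N slots. The function names are hypothetical and chosen only for this example.

```python
def first_stage_config(t, n):
    """Configuration used in slot t: entry i is the port that input i
    is connected to, following j = (i + t) mod N from Eq. (1)."""
    return [(i + t) % n for i in range(n)]


def covers_every_pair_once(n):
    """Check that, over N consecutive slots, each input is connected
    to each of the N ports exactly once."""
    for i in range(n):
        ports = [first_stage_config(t, n)[i] for t in range(n)]
        if sorted(ports) != list(range(n)):
            return False
    return True


if __name__ == "__main__":
    # Print the N = 4 sequence, one configuration per slot.
    for t in range(4):
        print(t, first_stage_config(t, 4))
    print(covers_every_pair_once(4))
```

Because each configuration is a pure function of t, every line-card can compute it locally, which is why no slot-by-slot scheduler is needed.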
A simple approach to mis-sequencing is to re-order the packets at the output ports using re-sequencing buffers. (Re-sequencing buffers are not shown in Fig. 1.) With the original two-stage switch architecture [7], packets can be mis-sequenced by an arbitrary amount, so a finite re-sequencing buffer is not possible. Efforts are made in [8][9] to bound the delay at additional costs: N writes to memory in one time slot in [8], and a very complicated re-sequencing buffer design (such as 3-dimensional queues) in [9].

A proactive approach is to prevent packets from becoming mis-sequenced. A major advantage is that no re-sequencing buffers are needed and no re-sequencing delays are incurred. The Full Frames First (FFF) algorithm [10] is the first one thus designed, but it requires heavy state information exchange among the switch line-cards. In [11], Full Ordered Frames First (FOFF) is proposed to replace FFF. But FOFF compromises the original goal of preventing packet mis-sequencing: in FOFF, re-sequencing buffers, though finite, are added to each output port for reordering packets.

Another good attempt to prevent mis-sequencing is the Mailbox switch [12] in its basic form. By using a set of symmetric configurations in both stages of switch fabrics, a feedback path for reporting middle-stage packet departure time is created. Based on it, the next packet in the flow is dispatched and inserted in a middle-stage VOQ (virtual output queue) such that it departs no earlier than the previous packet of the same flow. Compared with the FFF algorithm, the Mailbox switch trades switch throughput (around 75%) for simplicity. However, like FFF, the Mailbox switch is extended to allow mis-sequencing for the sake of higher throughput (around 95%). Some further extension is made in [13] to study the amount of buffering to be placed at inputs, outputs, and middle-stage ports.

Despite the compromises made by earlier attempts [10][12], we believe preventing packet mis-sequencing is the way to go. In this paper, we propose an elegant solution based on a novel feedback mechanism, enabled by properly constructing and coordinating the two sequences of N configurations used in the two stages of switch fabrics. Specifically, in each time slot the two configurations used at the two switch fabrics must satisfy the following condition: if middle-stage port j is connected to output k in slot t, then input k is connected to middle-stage port j in slot t+1. Together with a single packet buffer at each middle-stage VOQ, we show that the packet mis-sequencing problem is eliminated.

With the proposed feedback architecture, middle-stage port j piggybacks an N-bit occupancy vector (1 bit for each VOQ) on the packet sent to output k in each time slot. By exploiting the fact that each pair of input k and output k resides on the same switch line-card, the occupancy vector that arrives at output k is available to input k at negligible cost. Therefore, input k can select a packet for sending in the next time slot based on the occupancy vector; this is called port-based scheduling. Four simple port-based scheduling algorithms are proposed for load balancing in the first-stage switch fabric. The idea is to forward just enough packets to the middle-stage ports, such that neither overflow nor underflow will ever occur.

In the next section, the importance of having an efficient feedback mechanism in load balancing is discussed. In Section III, our feedback-based two-stage switch architecture is presented. Four simple port-based scheduling algorithms are designed in Section IV. Then the delay-throughput performance of the proposed two-stage switch architecture is studied in Section V.
We conclude the paper in Section VI.

Fig. 1: A load-balanced two-stage switch.

II. LOAD BALANCING & FEEDBACK

Load balancing is of paramount importance in two-stage switch design. Lacking an efficient feedback mechanism leads to poor load balancing performance, which in turn hurts the overall switch performance. Switch performance is measured by both average packet delay and throughput. Existing work focuses more on improving throughput, and sometimes delay is (unconsciously) traded for higher throughput. We elaborate on these issues in this section based on the two-stage switch architecture shown in Fig. 1.

The switch is equipped with VOQs at both input and middle-stage ports, denoted by VOQ1 and VOQ2, respectively. We use VOQ1(i,k) to represent the VOQ at input port i with packets destined for output k. Similarly, VOQ2(j,k) denotes the VOQ at middle-stage port j with packets destined for output k. We define flow f_ik as the packets arriving at input i and destined for output k. Packets from f_ik are stored in VOQ1(i,k). Packets destined for output k (which may come from different input ports/flows) are placed in VOQ2(j,k) for j = 0, 1, ..., N-1.

The well-received goal of load balancing is to make the traffic seen by the second switch fabric as evenly distributed as possible. This is usually interpreted as making the queue sizes of all VOQ2(j,k)s as equal as possible [9][10]. Due to the lack of feedback, each input port keeps track of its own sending history and, based on that, the load balancing schemes in [9][10] try to keep the difference in the cumulative number of packets sent to each middle-stage port for a given flow to at most one. The problem with such per-flow-based load balancing is that each VOQ2(j,k) is shared by all flows destined to output k.
Since there is no coordination among different flows/inputs, the queue sizes of the N VOQ2(j,k)s, for j = 0, 1, ..., N-1, can differ by as many as N packets (see Footnote 1), let alone the possible discrepancy among all the N^2 VOQ2(j,k)s, for j, k = 0, 1, ..., N-1. If we have a feedback mechanism that allows us to know the size of each VOQ2(j,k) before sending a packet to it, we can surely do a much better job.

Footnote 1: Assume each input sends a single packet to output k via the same middle-stage port j. Then VOQ2(j,k) contains N packets while the others, VOQ2(j',k) for j' ≠ j, have 0-occupancy.

Here we branch out to take a different view on the goal of load balancing. We believe that load balancing for a two-stage switch is better interpreted as avoiding both overflow and underflow at every VOQ2(j,k). Overflow is definitely a bad thing as it causes packet dropping/loss and retransmission. Avoiding underflow means that if there are packets waiting in some input ports for a particular output k, then VOQ2(j,k) should not be empty at the time that middle-stage port j (thus VOQ2(j,k)) is connected to output k. (Note that we know in advance when a middle-stage port will be connected to a particular output because the sequence of N configurations is pre-determined.) Preventing underflow ensures the two-stage switch is work-conserving. This in turn helps to enable 100% throughput in the second-stage switch fabric.

Because it cannot balance queue sizes accurately, per-flow-based load balancing [9][10] suffers from the underflow problem, which intensifies if the incoming traffic is skewed. To ease the situation, inputs are encouraged to move more packets to the middle-stage ports (accordingly, the buffer size of each VOQ2(j,k) must be increased), in the hope of boosting the overall queue occupancy so as to avoid underflow at a few queues. This can enhance the throughput performance, as can be seen from [12][13]. Unfortunately, the delay performance is sacrificed.

Delay performance can be adversely affected in two ways. First, with the deterministic sequence of N configurations, each additional packet in a VOQ2(j,k) means that this packet will experience an additional delay of N slots. Second, the longer the packets stay in the middle-stage ports, the more severe the mis-sequencing problem is. Consequently, a larger re-sequencing buffer/delay is required at output ports.

Taking the delay performance into account, we would like to stress that the buffers at each VOQ2(j,k) are there to meet the load balancing goal (no overflow and no underflow), not to increase the throughput (as 100% throughput is already guaranteed if the load is balanced). In other words, if the underflow problem is solved, (extra) packets should be stored at input ports rather than at the middle stage. Note that a similar concept is adopted in designing Internet active queue management schemes, where buffers at a router are for absorbing bursts, not for increasing throughput.

This brings in another issue: how much buffer is enough? The answer depends on how the goal of no overflow and no underflow can be accomplished. This, in turn, relies on whether we have a feedback mechanism in place for letting each input port know the current queue status/sizes at its connected middle-stage port. Last but not least, this feedback mechanism must be simple enough, or it easily defeats the original purpose of designing two-stage switches.

In the next section, an efficient feedback mechanism is designed. With it, we show that a single packet buffer at each VOQ2(j,k) is sufficient to prevent both underflow and overflow. A side advantage is that an N-bit occupancy vector is enough to report the status of the N VOQs at each middle-stage port.

Some other benefits/insights can be seen from the following scenario. Assume we have perfect knowledge of middle-stage queue occupancy, and the sequence of configurations used in the first-stage switch fabric is obtained from (1).
In time slot t, if packet A from flow f_ik is delivered to join VOQ2(j,k) at the second position in the queue, packet A has to wait for N additional slots after the head-of-line (HOL) packet is delivered to output k. (On average, packets queued at the next-to-HOL position experience a middle-stage delay of 3N/2 slots for uniform traffic.) Instead of dispatching packet A to VOQ2(j,k), we can keep it in VOQ1(i,k) for possible delivery to VOQ2(j+1,k) in slot t+1. If packet A joins VOQ2(j+1,k) at the HOL position, it can be delivered to output k in less than N slots (on average N/2 slots), which is much shorter than joining VOQ2(j,k). Similarly, if joining VOQ2(j+1,k) at the HOL position fails, we can try VOQ2(j+2,k) in slot t+2. This process is repeated until packet A can be delivered.

Following this approach of only sending a packet to join at the HOL position of each VOQ2(j,k), each VOQ2(j,k) only needs a single packet buffer. At the same time, there are N chances for a packet (A) to experience less than 3N/2 slots in middle-stage ports. In order not to underutilize the first-stage switch fabric, when packet A is not sent in slot t (either because the HOL position is taken, or because packet A is not selected for sending), another packet (say, destined for k') can be sent from input i to join at the HOL position of VOQ2(j,k'). Since each input has up to N packets/queues to choose from, the chance of being able to send a packet is very high. (Otherwise, this may hurt the throughput.)

We call the algorithm for selecting a packet to send (based on the middle-stage port occupancy) a port-based scheduling algorithm. Port-based scheduling is much simpler than a general scheduling algorithm [1][2], because it is decentralized and no determination of switch configurations is required.

III. FEEDBACK PATH DESIGN

Our proposed feedback mechanism is enabled by properly constructing and coordinating the two deterministic sequences of N configurations used in the two stages of switch fabrics, such that if middle-stage port j is connected to output port k at time t, then at time (t+1) input port k is connected to the same middle-stage port j. We call this the staggered symmetry property (Property 1). Then, by exploiting the (implementation) fact that each pair of input k and output k resides on the same switch line-card, the VOQ status at middle-stage port j can be piggybacked onto the packet sent to output k, which is then made available to input k at negligible cost.

Fig. 2: A joint sequence for a 4 x 4 two-stage switch.

A. Two Sequences of N Configurations

For an N x N switch, there are N! configurations. They can be divided into (N-1)! sets of N configurations, such that each set meets the requirement that each input connects to each output exactly once. Eqn. (1) provides a way to find such a set. Let π_t denote the configuration found from (1) in slot t. Then [π_0, π_1, π_2, ..., π_{N-1}] denotes the resulting sequence of N configurations. Without loss of generality, we adopt this sequence in the first-stage switch shown in Fig. 1.

The same set of N configurations is used in the second-stage switch, but in a different order/sequence, to provide the necessary feedback path (i.e. Property 1). In this case, a sequence in the reverse order of that in the first stage, i.e. [π_{N-1}, π_{N-2}, π_{N-3}, ..., π_0], is used. Specifically, at time t (for 0 ≤ t < N), middle-stage port j is connected to output k, where k is given by

$$ k = (j + N - 1 - t) \bmod N. \qquad (2) $$

If t is inside [xN, (x+1)N), set t = t - xN before applying (2).
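The two sequences can be checked against the staggered symmetry requirement numerically. The sketch below is only an illustration under the definitions in (1) and (2); the function names are hypothetical, and the check over two periods is an arbitrary choice made for the example.

```python
def first_stage_port(i, t, n):
    """Middle-stage port reached by input i in slot t, Eq. (1)."""
    return (i + t) % n


def second_stage_output(j, t, n):
    """Output reached by middle-stage port j in slot t, Eq. (2).
    t is first reduced into [0, N), as the text prescribes."""
    return (j + n - 1 - (t % n)) % n


def has_staggered_symmetry(n, periods=2):
    """Check: if port j reaches output k in slot t, then input k
    reaches port j in slot t + 1."""
    for t in range(periods * n):
        for j in range(n):
            k = second_stage_output(j, t, n)
            if first_stage_port(k, t + 1, n) != j:
                return False
    return True


if __name__ == "__main__":
    print(has_staggered_symmetry(4), has_staggered_symmetry(32))
```

For N = 4, Eq. (2) connects middle-stage port 0 to output 3 at t = 0, and Eq. (1) connects input 3 back to middle-stage port 0 at t = 1, which is the one-slot feedback loop the architecture relies on.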

Combining the configurations/sequences in both stages, our proposed two-stage switch is configured according to the joint sequence [π_0 π_{N-1}, π_1 π_{N-2}, π_2 π_{N-3}, ..., π_{N-1} π_0]. Before we formally prove that this joint sequence has the staggered symmetry property (Property 1; see Footnote 2), let us first consider the example shown in Fig. 2 for N = 4 and a single packet buffer per VOQ2. At t = 0, middle-stage port 0 is connected to output 3. A packet is sent from VOQ2(0,3) to output 3, together with a piggybacked 4-bit queue occupancy vector. (Also refer to Fig. 3.) This feedback arrives at output 3 at the end of slot t = 0. Since output 3 and input 3 are collocated on the same line-card, this feedback is available to input 3 at the beginning of slot t = 1. Based on it, input 3 selects a packet for sending to middle-stage port 0 using a port-based scheduling algorithm.

Footnote 2: For every pair of staggered configurations in [π_0 π_{N-1}, π_1 π_{N-2}, π_2 π_{N-3}, ..., π_{N-1} π_0], namely π_{N-1} & π_1, π_{N-2} & π_2, ..., π_0 & π_0, the two configurations in the pair are always symmetric to each other with respect to the column of middle-stage ports, as can be observed from Fig. 2.

Fig. 3: Timing diagram showing the operation of the two-stage switch architecture with feedback.

B. Properties of the Joint Sequence

We prove that a two-stage switch with the proposed joint sequence [π_0 π_{N-1}, π_1 π_{N-2}, π_2 π_{N-3}, ..., π_{N-1} π_0] and a single packet buffer for each and every VOQ2(j,k) has the following important properties.

Property 1 (Staggered Symmetry). If middle-stage port j is connected to output port k at time t, then at time (t+1) input port k is connected to the same middle-stage port j.

Proof: At time t, middle-stage port j is connected to output k, where k is given by (2). We need to show that at time t+1, if input i is connected to the same middle-stage port j, then we must have k = i. At time t+1, we have j = (i+t+1) mod N from (1). Substituting j into (2), we get

$$ k = \big[ ((i + t + 1) \bmod N) + N - 1 - t \big] \bmod N. $$

If (i+t+1) < N, then

$$ k = (i + t + 1 + N - 1 - t) \bmod N = (i + N) \bmod N = i. $$

If (i+t+1) ≥ N, we have

$$ k = (i + t + 1 - N + N - 1 - t) \bmod N = i. $$

Combining the above two cases, k = i is true. #

Property 2 (Anchor Output). Input i is always connected to output K, where K = (i+N-1) mod N, via one of the middle-stage ports. We call K the anchor output of input i.

Proof: At time t, input i is connected to output k via middle-stage port j. Substituting j = (i+t) mod N from (1) into (2), we can express k in terms of i:

$$ k = \big[ ((i + t) \bmod N) + N - 1 - t \big] \bmod N = (i + N - 1) \bmod N = K. $$

We can see that K depends only on i. So a given input i is always connected to the same anchor output K. #

In Fig. 2, inputs 0/1/2/3 have anchor outputs 3/0/1/2, via one of the middle-stage ports in each time slot.

Property 3 (Deterministic Delay in Middle-stage Ports). Let K be the anchor output of input i. Every packet of flow f_ik experiences the same delay of d slots in one of the middle-stage ports, where d is given by

$$ d = \begin{cases} N, & \text{if } K = k \\ K - k, & \text{if } K > k \\ K + N - k, & \text{if } K < k \end{cases} $$

Proof: Suppose at slot t, input i is connected to its anchor output K via middle-stage port j. If a packet (A) is successfully sent from input i to join VOQ2(j,k), then VOQ2(j,k) must either be empty at slot t-1, or the packet originally in it at t-1 (B) is sent to output k in slot t. In the latter case, port j is connected to output k in slot t, so packet A's destination is the anchor output itself (k = K). It takes N slots/configurations for port j to be re-connected to output K for delivering packet A. Because of the single packet buffer per VOQ2, packet A is the only packet in VOQ2(j,K) and it will be delivered for sure after N slots. This gives the maximum delay of N slots.

Note that a given middle-stage port j is connected to each output in descending order of the output port numbers. This can be seen from Fig. 2, where middle-stage port 0 is connected to outputs 3, 2, 1, and 0 at slots t = 0, 1, 2, and 3 respectively. If packet A's target output k is K-1, middle-stage port j will be connected to output K-1 after one slot, which is the minimum delay that a packet (A) can experience in VOQ2(j,k). In general, if packet A's target output k is d ports away from the anchor output K (counted in descending order of port numbers), i.e. d = K-k if K > k, and d = K+N-k if K < k, the delay packet A experiences in VOQ2(j,k) is d slots. Obviously, the value of d is bounded by [1, N]. #

Assume traffic is uniformly distributed such that each input has the same input load and each packet is uniformly distributed to all outputs. Then d is equally likely to take each value in 1, 2, ..., N, and the average packet delay at middle-stage ports is (1+N)/2 slots.

Property 4 (In-order Packet Delivery). If a two-stage switch is configured according to the joint sequence [π_0 π_{N-1}, π_1 π_{N-2}, π_2 π_{N-3}, ..., π_{N-1} π_0], and each and every VOQ2(j,k) has a single packet buffer, in-order packet delivery is guaranteed.

Proof: Assume at times t_A and t_B (where t_B > t_A), packets A and B of flow f_ik are delivered from VOQ1(i,k) and join VOQ2(j_1,k) and VOQ2(j_2,k) respectively. Let d_A and d_B be the delays experienced in VOQ2 by the two packets. Mis-sequencing occurs only if packet B arrives at output k earlier than packet A, i.e. t_A + d_A > t_B + d_B. However, this will never happen because t_B > t_A and d_A = d_B from Property 3. #

Note that if more than one packet buffer is allocated to each middle-stage VOQ2(j,k), in-order packet delivery may not be guaranteed.

IV. PORT-BASED SCHEDULING ALGORITHMS

Assume input i is connected to middle-stage port j at slot t, and the anchor output of i is K. Based on the N-bit occupancy vector piggybacked from middle-stage port j in slot t-1, we find S_j, the set of outputs k (for k = 0, 1, ..., N-1) whose VOQ2(j,k) has 0-occupancy. Input i chooses the HOL packet at VOQ1(i,h) for sending only if h ∈ S_j and VOQ1(i,h) is not empty. This prevents overflow at VOQ2(j,h).

Since middle-stage port j is connected to each output in descending order of the output port numbers, we know that in the next slot t+1 port j will be connected to output K-1 (wrapped around by N). If VOQ2(j,K-1) is empty and VOQ1(i,K-1) is not, we face a possible underflow in VOQ2(j,K-1) at slot t+1. As such, a scheduling algorithm should give priority to VOQ1(i,K-1) at slot t. Four simple scheduling algorithms are designed with the above considerations in mind.

(1) Round-Robin (RR): If VOQ1(i,h') was selected in slot t-1, then VOQ1(i,h'+1) is selected if h'+1 ∈ S_j. Otherwise, the first output port h with h > h' and h ∈ S_j is selected. RR gives fair access to each VOQ1, and is suitable for hardware implementation [5].
(2) Optimized Round-Robin (Opt-RR): If VOQ2(j,K-1) is empty and VOQ1(i,K-1) is not, VOQ1(i,K-1) is selected. Otherwise, use RR. Opt-RR is enhanced to avoid underflow.
(3) Longest Queue First (LQF): Among all VOQ1(i,h)s with h ∈ S_j, the one with the longest queue is selected. LQF is good for non-uniform traffic. An efficient implementation of (quasi) LQF is available [14].
(4) Optimized Longest Queue First (Opt-LQF): If VOQ2(j,K-1) is empty and VOQ1(i,K-1) is not, VOQ1(i,K-1) is selected. Otherwise, use LQF.

V. PERFORMANCE EVALUATIONS

In this section, the delay-throughput performance of our proposed scheduling algorithms is studied by simulations. For comparison, we also simulate a) the LQF algorithm for the Byte-Focal switch (LQF_Byte-Focal) [9], because it performs better than FOFF [11]; b) the iSLIP algorithm [5] (with a single iteration), which serves as a benchmark for single-stage input-queued switches; and c) an output-queued switch, which serves as a lower bound on delay. Due to space limitations, we only present the results for switch size N = 32.

A. Uniform Traffic

At each time slot for each input, a packet arrives with probability p and is destined to each output with equal probability. p is called the input load. Fig. 4 shows the delay-throughput performance of all 7 algorithms. Among them, the four port-based scheduling algorithms we proposed, RR, Opt-RR, LQF, and Opt-LQF, give comparable delay of less than 20 slots up to p = 0.8. In fact, their average packet delay at middle-stage ports, the minimum cost for two-stage switches, can be easily calculated as (1+N)/2 = 16.5 slots. If we deduct this portion from the overall delay experienced, the (input port) delay of our algorithms matches the output-queued performance very well. We also notice that the performance gain of using Opt-RR and Opt-LQF is marginal because the possible underflow problem mentioned in Section IV is negligible. Compared with LQF_Byte-Focal, our algorithms give significantly smaller delay. Compared with iSLIP, our algorithms produce a smaller delay when p is large (> 0.6).
At p = 0.7, LQF_Byte-Focal requires 120 slots (not shown), iSLIP requires 88 slots, and our algorithms require only 17 slots.

Fig. 4: Delay (time slots) vs. input load p, with uniform traffic. Curves: LQF, Opt-LQF, RR, Opt-RR, output-queued, iSLIP, LQF_Byte-Focal.

B. Uniform Bursty Traffic

Bursty arrivals are modeled by the ON/OFF traffic model. In the ON state, a packet arrival is generated in every time slot. In the OFF state, no packet arrivals are generated. Packets of the same burst have the same output, and the output for each burst is uniformly distributed. Given the average input load p and average burst size s, the state transition probability from OFF to ON is p/[s(1-p)] and from ON to OFF is 1/s.

Fig. 5 shows the delay-throughput performance with mean burst size s = 30 packets. Due to the bursty nature of the traffic, delay builds up quickly with input load. We can still observe that our four proposed algorithms generally perform better than LQF_Byte-Focal and iSLIP. At p = 0.8, LQF_Byte-Focal requires 224 slots, whereas RR and Opt-RR require 180 slots, LQF 168, Opt-LQF 162, and output-queued 114.
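The ON/OFF model above can be sketched as a small two-state generator. This is an illustrative sketch, not the simulator used for the reported results: the function names are hypothetical, and two details are assumptions made here because the text does not specify them, namely that the state transition is evaluated at the end of each slot and that a new burst starts only after at least one OFF slot.

```python
import random


def on_off_source(p, s, n_outputs, slots, rng):
    """Yield, for each slot, the destination output of the arriving
    packet, or None when no packet arrives (OFF state)."""
    if not 0.0 < p < 1.0:
        raise ValueError("input load p must lie strictly between 0 and 1")
    p_off_to_on = p / (s * (1.0 - p))   # transition probability from the text
    p_on_to_off = 1.0 / s               # gives a mean burst size of s packets
    if p_off_to_on > 1.0:
        raise ValueError("p and s give an OFF-to-ON probability above 1")
    on = False
    dest = None
    for _ in range(slots):
        if on:
            yield dest
            if rng.random() < p_on_to_off:
                on = False
        else:
            yield None
            if rng.random() < p_off_to_on:
                on = True
                # All packets of one burst share a uniformly chosen output.
                dest = rng.randrange(n_outputs)


def stationary_load(p, s):
    """Long-run fraction of ON slots for the two-state chain."""
    a = p / (s * (1.0 - p))
    b = 1.0 / s
    return a / (a + b)
```

The stationary ON probability of this chain is a/(a+b) with a = p/[s(1-p)] and b = 1/s, which simplifies to p, so the two transition probabilities in the text do yield an average input load of p.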

Fig. 5: Delay (time slots) vs. input load p, with bursty traffic and s = 30. Curves: LQF, Opt-LQF, RR, Opt-RR, output-queued, iSLIP, LQF_Byte-Focal.

C. Hot-spot Traffic

Packets arriving at each input port in each time slot follow the same independent Bernoulli process with probability p. Packet destinations are generated as follows. For input port i, a packet goes to output i with probability 1/2, and goes to each of the other outputs with the same probability 1/[2(N-1)]. From Fig. 6, we can again see that our four proposed algorithms give similar, superior delay performance, with a consistent delay gap of about 24 slots (which is the delay experienced at middle-stage ports) above the output-queued performance over all input loads. Property 3 gives the same figure: a packet for output i waits N-1 = 31 slots in the middle stage, and averaging over all destinations gives about 23.5 slots for N = 32.

Fig. 6: Delay (time slots) vs. input load p, with hot-spot traffic. Curves: LQF, Opt-LQF, RR, Opt-RR, output-queued, iSLIP, LQF_Byte-Focal.

VI. CONCLUSIONS

A load-balanced two-stage switch can eliminate the scheduler, is scalable, and guarantees 100% throughput for a broad class of traffic. The major problem is that packets can be mis-sequenced. Aiming at preventing packets from becoming mis-sequenced, a novel feedback mechanism was designed in this paper for collecting the VOQ status in the middle-stage ports. Based on the collected information, input ports forward just enough packets to the middle-stage ports, so as to avoid both buffer overflow and underflow. Combined with a single packet buffer per middle-stage VOQ, we not only provided an elegant solution to the packet mis-sequencing problem, but also an unbeatable delay-throughput switch performance.

References

[1] Y. Tamir and G. L. Frazier, "High-performance multi-queue buffers for VLSI communications switches," Proceedings 15th Annual Symposium on Computer Architecture, June.
[2] N. McKeown, V. Anantharam, and J. Walrand, "Achieving 100% throughput in an input queued switch," Proceedings INFOCOM, April 1996, San Francisco, USA.
[3] T. Anderson, S. Owicki, J. Saxe and C. Thacker, "High speed switch scheduling for local area networks," ACM Transactions on Computer Systems, Vol. 11.
[4] N. McKeown, "Scheduling algorithms for input-queued cell switches," Ph.D. Thesis, University of California at Berkeley.
[5] N. McKeown, "The iSLIP scheduling algorithm for input-queued switches," IEEE/ACM Transactions on Networking, Vol. 7, No. 2, April.
[6] Y. Li, S. Panwar and H. J. Chao, "On the performance of a dual round-robin switch," Proceedings INFOCOM.
[7] C. S. Chang, W. J. Chen and H. Y. Huang, "Load balanced Birkhoff-von Neumann switches, part I: one-stage buffering," Computer Communications, Vol. 25.
[8] C. S. Chang, W. J. Chen and H. Y. Huang, "Load balanced Birkhoff-von Neumann switches, part II: multi-stage buffering," Computer Communications, Vol. 25.
[9] Y. Shen, S. Jiang, S. S. Panwar and H. J. Chao, "Byte-Focal: a practical load-balanced switch," IEEE Workshop on High Performance Switching and Routing, May 2005, Hong Kong.
[10] I. Keslassy and N. McKeown, "Maintaining packet order in two-stage switches," Proceedings INFOCOM, June 2002, New York, USA.
[11] I. Keslassy, S. T. Chuang, K. Yu, D. Miller, M. Horowitz, O. Solgaard and N. McKeown, "Scaling Internet routers using optics," ACM SIGCOMM '03, Karlsruhe, Germany, Aug.
[12] C. S. Chang, D. S. Lee and Y. J. Shih, "Mailbox switch: a scalable two-stage switch architecture for conflict resolution of ordered packets," Proceedings INFOCOM, March 2004, Hong Kong.
[13] C. Y. Tu, C. S. Chang, D. S. Lee and C. T. Chiu, "Design a simple and high performance switch using a two-stage architecture," IEEE GLOBECOM 2005, St. Louis, USA, Nov.
[14] Y. S. Lin and C. B. Shung, "Quasi-pushout cell discarding," IEEE Communications Letters, Vol. 1, Sept.


More information

Integrated Scheduling and Buffer Management Scheme for Input Queued Switches under Extreme Traffic Conditions

Integrated Scheduling and Buffer Management Scheme for Input Queued Switches under Extreme Traffic Conditions Integrated Scheduling and Buffer Management Scheme for Input Queued Switches under Extreme Traffic Conditions Anuj Kumar, Rabi N. Mahapatra Texas A&M University, College Station, U.S.A Email: {anujk, rabi}@cs.tamu.edu

More information

FIRM: A Class of Distributed Scheduling Algorithms for High-speed ATM Switches with Multiple Input Queues

FIRM: A Class of Distributed Scheduling Algorithms for High-speed ATM Switches with Multiple Input Queues FIRM: A Class of Distributed Scheduling Algorithms for High-speed ATM Switches with Multiple Input Queues D.N. Serpanos and P.I. Antoniadis Department of Computer Science University of Crete Knossos Avenue

More information

Designing Scalable Routers with a New Switching Architecture

Designing Scalable Routers with a New Switching Architecture Designing Scalable Routers with a New Switching Architecture Zuhui Yue 1, Youjian Zhao 1, Jianping Wu 2, Xiaoping Zhang 3 Department of Computer Science, Tsinghua University, Beijing, P.R.China, 100084

More information

The Arbitration Problem

The Arbitration Problem HighPerform Switchingand TelecomCenterWorkshop:Sep outing ance t4, 97. EE84Y: Packet Switch Architectures Part II Load-balanced Switches ick McKeown Professor of Electrical Engineering and Computer Science,

More information

Selective Request Round-Robin Scheduling for VOQ Packet Switch ArchitectureI

Selective Request Round-Robin Scheduling for VOQ Packet Switch ArchitectureI This full tet paper was peer reviewed at the direction of IEEE Communications Society subject matter eperts for publication in the IEEE ICC 2011 proceedings Selective Request Round-Robin Scheduling for

More information

Long Round-Trip Time Support with Shared-Memory Crosspoint Buffered Packet Switch

Long Round-Trip Time Support with Shared-Memory Crosspoint Buffered Packet Switch Long Round-Trip Time Support with Shared-Memory Crosspoint Buffered Packet Switch Ziqian Dong and Roberto Rojas-Cessa Department of Electrical and Computer Engineering New Jersey Institute of Technology

More information

Parallelism in Network Systems

Parallelism in Network Systems High Performance Switching Telecom Center Workshop: and outing Sept 4, 997. Parallelism in Network Systems Joint work with Sundar Iyer HP Labs, 0 th September, 00 Nick McKeown Professor of Electrical Engineering

More information

Scalable Schedulers for High-Performance Switches

Scalable Schedulers for High-Performance Switches Scalable Schedulers for High-Performance Switches Chuanjun Li and S Q Zheng Mei Yang Department of Computer Science Department of Computer Science University of Texas at Dallas Columbus State University

More information

Performance Analysis of WLANs Under Sporadic Traffic

Performance Analysis of WLANs Under Sporadic Traffic Performance Analysis of 802.11 WLANs Under Sporadic Traffic M. Garetto and C.-F. Chiasserini Dipartimento di Elettronica, Politecnico di Torino, Italy Abstract. We analyze the performance of 802.11 WLANs

More information

Design and Evaluation of a Parallel-Polled Virtual Output Queued Switch *

Design and Evaluation of a Parallel-Polled Virtual Output Queued Switch * Design and Evaluation of a Parallel-Polled Virtual Output Queued Switch * K. J. Christensen Department of Computer Science and Engineering University of South Florida Tampa, FL 3360 Abstract - Input-buffered

More information

Shared-Memory Combined Input-Crosspoint Buffered Packet Switch for Differentiated Services

Shared-Memory Combined Input-Crosspoint Buffered Packet Switch for Differentiated Services Shared-Memory Combined -Crosspoint Buffered Packet Switch for Differentiated Services Ziqian Dong and Roberto Rojas-Cessa Department of Electrical and Computer Engineering New Jersey Institute of Technology

More information

On Scheduling Unicast and Multicast Traffic in High Speed Routers

On Scheduling Unicast and Multicast Traffic in High Speed Routers On Scheduling Unicast and Multicast Traffic in High Speed Routers Kwan-Wu Chin School of Electrical, Computer and Telecommunications Engineering University of Wollongong kwanwu@uow.edu.au Abstract Researchers

More information

A Partially Buffered Crossbar Packet Switching Architecture and its Scheduling

A Partially Buffered Crossbar Packet Switching Architecture and its Scheduling A Partially Buffered Crossbar Packet Switching Architecture and its Scheduling Lotfi Mhamdi Computer Engineering Laboratory TU Delft, The etherlands lotfi@ce.et.tudelft.nl Abstract The crossbar fabric

More information

Shared-Memory Combined Input-Crosspoint Buffered Packet Switch for Differentiated Services

Shared-Memory Combined Input-Crosspoint Buffered Packet Switch for Differentiated Services Shared-Memory Combined -Crosspoint Buffered Packet Switch for Differentiated Services Ziqian Dong and Roberto Rojas-Cessa Department of Electrical and Computer Engineering New Jersey Institute of Technology

More information

Buffered Crossbar based Parallel Packet Switch

Buffered Crossbar based Parallel Packet Switch Buffered Crossbar based Parallel Packet Switch Zhuo Sun, Masoumeh Karimi, Deng Pan, Zhenyu Yang and Niki Pissinou Florida International University Email: {zsun3,mkari1, pand@fiu.edu, yangz@cis.fiu.edu,

More information

Int. J. Advanced Networking and Applications 1194 Volume: 03; Issue: 03; Pages: (2011)

Int. J. Advanced Networking and Applications 1194 Volume: 03; Issue: 03; Pages: (2011) Int J Advanced Networking and Applications 1194 ISA-Independent Scheduling Algorithm for Buffered Crossbar Switch Dr Kannan Balasubramanian Department of Computer Science, Mepco Schlenk Engineering College,

More information

Concurrent Round-Robin Dispatching Scheme in a Clos-Network Switch

Concurrent Round-Robin Dispatching Scheme in a Clos-Network Switch Concurrent Round-Robin Dispatching Scheme in a Clos-Network Switch Eiji Oki * Zhigang Jing Roberto Rojas-Cessa H. Jonathan Chao NTT Network Service Systems Laboratories Department of Electrical Engineering

More information

Buffer Sizing in a Combined Input Output Queued (CIOQ) Switch

Buffer Sizing in a Combined Input Output Queued (CIOQ) Switch Buffer Sizing in a Combined Input Output Queued (CIOQ) Switch Neda Beheshti, Nick Mckeown Stanford University Abstract In all internet routers buffers are needed to hold packets during times of congestion.

More information

Performance Analysis of Cell Switching Management Scheme in Wireless Packet Communications

Performance Analysis of Cell Switching Management Scheme in Wireless Packet Communications Performance Analysis of Cell Switching Management Scheme in Wireless Packet Communications Jongho Bang Sirin Tekinay Nirwan Ansari New Jersey Center for Wireless Telecommunications Department of Electrical

More information

Design of Optical Burst Switches based on Dual Shuffle-exchange Network and Deflection Routing

Design of Optical Burst Switches based on Dual Shuffle-exchange Network and Deflection Routing Design of Optical Burst Switches based on Dual Shuffle-exchange Network and Deflection Routing Man-Ting Choy Department of Information Engineering, The Chinese University of Hong Kong mtchoy1@ie.cuhk.edu.hk

More information

Providing Flow Based Performance Guarantees for Buffered Crossbar Switches

Providing Flow Based Performance Guarantees for Buffered Crossbar Switches Providing Flow Based Performance Guarantees for Buffered Crossbar Switches Deng Pan Dept. of Electrical & Computer Engineering Florida International University Miami, Florida 33174, USA pand@fiu.edu Yuanyuan

More information

THE CAPACITY demand of the early generations of

THE CAPACITY demand of the early generations of IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 55, NO. 3, MARCH 2007 605 Resequencing Worst-Case Analysis for Parallel Buffered Packet Switches Ilias Iliadis, Senior Member, IEEE, and Wolfgang E. Denzel Abstract

More information

3 log 2 B Fiber Delay Lines

3 log 2 B Fiber Delay Lines Constructing Optical LIFO Buffers of Size B with 3 log 2 B Fiber Delay Lines Xiaoliang Wang, Xiaohong Jiang Graduate School of Information Sciences Tohoku University Sendai, Japan 980-8579 Email: {waxili,jiang}@ecei.tohoku.ac.jp

More information

Designing Efficient Benes and Banyan Based Input-Buffered ATM Switches

Designing Efficient Benes and Banyan Based Input-Buffered ATM Switches Designing Efficient Benes and Banyan Based Input-Buffered ATM Switches Rajendra V. Boppana Computer Science Division The Univ. of Texas at San Antonio San Antonio, TX 829- boppana@cs.utsa.edu C. S. Raghavendra

More information

Using Traffic Models in Switch Scheduling

Using Traffic Models in Switch Scheduling I. Background Using Traffic Models in Switch Scheduling Hammad M. Saleem, Imran Q. Sayed {hsaleem, iqsayed}@stanford.edu Conventional scheduling algorithms use only the current virtual output queue (VOQ)

More information

Router architectures: OQ and IQ switching

Router architectures: OQ and IQ switching Routers/switches architectures Andrea Bianco Telecommunication etwork Group firstname.lastname@polito.it http://www.telematica.polito.it/ Computer etwork Design - The Internet is a mesh of routers core

More information

Scheduling Algorithms for Input-Queued Cell Switches. Nicholas William McKeown

Scheduling Algorithms for Input-Queued Cell Switches. Nicholas William McKeown Scheduling Algorithms for Input-Queued Cell Switches by Nicholas William McKeown B.Eng (University of Leeds) 1986 M.S. (University of California at Berkeley) 1992 A thesis submitted in partial satisfaction

More information

PCRRD: A Pipeline-Based Concurrent Round-Robin Dispatching Scheme for Clos-Network Switches

PCRRD: A Pipeline-Based Concurrent Round-Robin Dispatching Scheme for Clos-Network Switches : A Pipeline-Based Concurrent Round-Robin Dispatching Scheme for Clos-Network Switches Eiji Oki, Roberto Rojas-Cessa, and H. Jonathan Chao Abstract This paper proposes a pipeline-based concurrent round-robin

More information

Unit 2 Packet Switching Networks - II

Unit 2 Packet Switching Networks - II Unit 2 Packet Switching Networks - II Dijkstra Algorithm: Finding shortest path Algorithm for finding shortest paths N: set of nodes for which shortest path already found Initialization: (Start with source

More information

A New Integrated Unicast/Multicast Scheduler for Input-Queued Switches

A New Integrated Unicast/Multicast Scheduler for Input-Queued Switches Proc. 8th Australasian Symposium on Parallel and Distributed Computing (AusPDC 20), Brisbane, Australia A New Integrated Unicast/Multicast Scheduler for Input-Queued Switches Kwan-Wu Chin School of Electrical,

More information

Chapter 6 Queuing Disciplines. Networking CS 3470, Section 1

Chapter 6 Queuing Disciplines. Networking CS 3470, Section 1 Chapter 6 Queuing Disciplines Networking CS 3470, Section 1 Flow control vs Congestion control Flow control involves preventing senders from overrunning the capacity of the receivers Congestion control

More information

Doubling Memory Bandwidth for Network Buffers

Doubling Memory Bandwidth for Network Buffers Doubling Memory Bandwidth for Network Buffers Youngmi Joo Nick McKeown Department of Electrical Engineering, Stanford University, Stanford, CA 9435-93 {jym,nickm}@leland.stanford.edu Abstract Memory bandwidth

More information

MULTICAST is an operation to transmit information from

MULTICAST is an operation to transmit information from IEEE TRANSACTIONS ON COMPUTERS, VOL. 54, NO. 10, OCTOBER 2005 1283 FIFO-Based Multicast Scheduling Algorithm for Virtual Output Queued Packet Switches Deng Pan, Student Member, IEEE, and Yuanyuan Yang,

More information

Router/switch architectures. The Internet is a mesh of routers. The Internet is a mesh of routers. Pag. 1

Router/switch architectures. The Internet is a mesh of routers. The Internet is a mesh of routers. Pag. 1 Router/switch architectures Andrea Bianco Telecommunication etwork Group firstname.lastname@polito.it http://www.telematica.polito.it/ Computer etworks Design and Management - The Internet is a mesh of

More information

Quality of Service (QoS)

Quality of Service (QoS) Quality of Service (QoS) The Internet was originally designed for best-effort service without guarantee of predictable performance. Best-effort service is often sufficient for a traffic that is not sensitive

More information

K-Selector-Based Dispatching Algorithm for Clos-Network Switches

K-Selector-Based Dispatching Algorithm for Clos-Network Switches K-Selector-Based Dispatching Algorithm for Clos-Network Switches Mei Yang, Mayauna McCullough, Yingtao Jiang, and Jun Zheng Department of Electrical and Computer Engineering, University of Nevada Las Vegas,

More information

A Lossless Quality Transmission Algorithm for Stored VBR Video

A Lossless Quality Transmission Algorithm for Stored VBR Video 1 A Lossless Quality Transmission Algorithm for Stored VBR Video Fei Li, Yan Liu and Ishfaq Ahmad Department of Computer Science The Hong Kong University of Science and Technology Clear Water Bay, Kowloon,

More information

THERE has been much interest recently in a class of. The Interleaved Matching Switch Architecture

THERE has been much interest recently in a class of. The Interleaved Matching Switch Architecture IEEE TRASACTIOS O COMMUICATIOS, VOL. 57, O., DECEMBER 009 The Interleaved Matching Switch Architecture Bill Lin, Member, IEEE, and Isaac Keslassy, Member, IEEE. Abstract Operators need routers to provide

More information

048866: Packet Switch Architectures

048866: Packet Switch Architectures 048866: Packet Switch Architectures Output-Queued Switches Deterministic Queueing Analysis Fairness and Delay Guarantees Dr. Isaac Keslassy Electrical Engineering, Technion isaac@ee.technion.ac.il http://comnet.technion.ac.il/~isaac/

More information

Advanced Computer Networks

Advanced Computer Networks Advanced Computer Networks QoS in IP networks Prof. Andrzej Duda duda@imag.fr Contents QoS principles Traffic shaping leaky bucket token bucket Scheduling FIFO Fair queueing RED IntServ DiffServ http://duda.imag.fr

More information

Hierarchical Scheduling for DiffServ Classes

Hierarchical Scheduling for DiffServ Classes Hierarchical Scheduling for DiffServ Classes Mei Yang, Jianping Wang, Enyue Lu, S Q Zheng Department of Electrical and Computer Engineering, University of evada Las Vegas, Las Vegas, V 8954 Department

More information

Fixed-Length Switching vs. Variable-Length Switching in Input-Queued IP Switches

Fixed-Length Switching vs. Variable-Length Switching in Input-Queued IP Switches Fixed-Length Switching vs. Variable-Length Switching in Input-Queued IP Switches Chengchen Hu, Xuefei Chen, Wenjie Li, Bin Liu Department of Computer Science and Technology Tsinghua University Beijing,

More information

Matrix Unit Cell Scheduler (MUCS) for. Input-Buered ATM Switches. Haoran Duan, John W. Lockwood, and Sung Mo Kang

Matrix Unit Cell Scheduler (MUCS) for. Input-Buered ATM Switches. Haoran Duan, John W. Lockwood, and Sung Mo Kang Matrix Unit Cell Scheduler (MUCS) for Input-Buered ATM Switches Haoran Duan, John W. Lockwood, and Sung Mo Kang University of Illinois at Urbana{Champaign Department of Electrical and Computer Engineering

More information

CS 344/444 Computer Network Fundamentals Final Exam Solutions Spring 2007

CS 344/444 Computer Network Fundamentals Final Exam Solutions Spring 2007 CS 344/444 Computer Network Fundamentals Final Exam Solutions Spring 2007 Question 344 Points 444 Points Score 1 10 10 2 10 10 3 20 20 4 20 10 5 20 20 6 20 10 7-20 Total: 100 100 Instructions: 1. Question

More information

Packet Switching Queuing Architecture: A Study

Packet Switching Queuing Architecture: A Study Packet Switching Queuing Architecture: A Study Shikhar Bahl 1, Rishabh Rai 2, Peeyush Chandra 3, Akash Garg 4 M.Tech, Department of ECE, Ajay Kumar Garg Engineering College, Ghaziabad, U.P., India 1,2,3

More information

Performance of Multihop Communications Using Logical Topologies on Optical Torus Networks

Performance of Multihop Communications Using Logical Topologies on Optical Torus Networks Performance of Multihop Communications Using Logical Topologies on Optical Torus Networks X. Yuan, R. Melhem and R. Gupta Department of Computer Science University of Pittsburgh Pittsburgh, PA 156 fxyuan,

More information

An Enhanced Dynamic Packet Buffer Management

An Enhanced Dynamic Packet Buffer Management An Enhanced Dynamic Packet Buffer Management Vinod Rajan Cypress Southeast Design Center Cypress Semiconductor Cooperation vur@cypress.com Abstract A packet buffer for a protocol processor is a large shared

More information

Optical Packet Switching

Optical Packet Switching Optical Packet Switching DEISNet Gruppo Reti di Telecomunicazioni http://deisnet.deis.unibo.it WDM Optical Network Legacy Networks Edge Systems WDM Links λ 1 λ 2 λ 3 λ 4 Core Nodes 2 1 Wavelength Routing

More information

INF5050 Protocols and Routing in Internet (Friday ) Subject: IP-router architecture. Presented by Tor Skeie

INF5050 Protocols and Routing in Internet (Friday ) Subject: IP-router architecture. Presented by Tor Skeie INF5050 Protocols and Routing in Internet (Friday 9.2.2018) Subject: IP-router architecture Presented by Tor Skeie High Performance Switching and Routing Telecom Center Workshop: Sept 4, 1997. This presentation

More information

Design of a Weighted Fair Queueing Cell Scheduler for ATM Networks

Design of a Weighted Fair Queueing Cell Scheduler for ATM Networks Design of a Weighted Fair Queueing Cell Scheduler for ATM Networks Yuhua Chen Jonathan S. Turner Department of Electrical Engineering Department of Computer Science Washington University Washington University

More information

CHOKe - A simple approach for providing Quality of Service through stateless approximation of fair queueing. Technical Report No.

CHOKe - A simple approach for providing Quality of Service through stateless approximation of fair queueing. Technical Report No. CHOKe - A simple approach for providing Quality of Service through stateless approximation of fair queueing Rong Pan Balaji Prabhakar Technical Report No.: CSL-TR-99-779 March 1999 CHOKe - A simple approach

More information

A Split-Central-Buffered Load-Balancing Clos-Network Switch with In-Order Forwarding

A Split-Central-Buffered Load-Balancing Clos-Network Switch with In-Order Forwarding A Split-Central-Buffered Load-Balancing Clos-Network Switch with In-Order Forwarding Oladele Theophilus Sule, Roberto Rojas-Cessa, Ziqian Dong, Chuan-Bi Lin, arxiv:8265v [csni] 3 Dec 28 Abstract We propose

More information

Resource Sharing for QoS in Agile All Photonic Networks

Resource Sharing for QoS in Agile All Photonic Networks Resource Sharing for QoS in Agile All Photonic Networks Anton Vinokurov, Xiao Liu, Lorne G Mason Department of Electrical and Computer Engineering, McGill University, Montreal, Canada, H3A 2A7 E-mail:

More information

MULTI-PLANE MULTI-STAGE BUFFERED SWITCH

MULTI-PLANE MULTI-STAGE BUFFERED SWITCH CHAPTER 13 MULTI-PLANE MULTI-STAGE BUFFERED SWITCH To keep pace with Internet traffic growth, researchers have been continually exploring new switch architectures with new electronic and optical device

More information

DESIGN OF EFFICIENT ROUTING ALGORITHM FOR CONGESTION CONTROL IN NOC

DESIGN OF EFFICIENT ROUTING ALGORITHM FOR CONGESTION CONTROL IN NOC DESIGN OF EFFICIENT ROUTING ALGORITHM FOR CONGESTION CONTROL IN NOC 1 Pawar Ruchira Pradeep M. E, E&TC Signal Processing, Dr. D Y Patil School of engineering, Ambi, Pune Email: 1 ruchira4391@gmail.com

More information

Routing, Routers, Switching Fabrics

Routing, Routers, Switching Fabrics Routing, Routers, Switching Fabrics Outline Link state routing Link weights Router Design / Switching Fabrics CS 640 1 Link State Routing Summary One of the oldest algorithm for routing Finds SP by developing

More information

Scheduling. Scheduling algorithms. Scheduling. Output buffered architecture. QoS scheduling algorithms. QoS-capable router

Scheduling. Scheduling algorithms. Scheduling. Output buffered architecture. QoS scheduling algorithms. QoS-capable router Scheduling algorithms Scheduling Andrea Bianco Telecommunication Network Group firstname.lastname@polito.it http://www.telematica.polito.it/ Scheduling: choose a packet to transmit over a link among all

More information

Multicast Scheduling in WDM Switching Networks

Multicast Scheduling in WDM Switching Networks Multicast Scheduling in WDM Switching Networks Zhenghao Zhang and Yuanyuan Yang Dept. of Electrical & Computer Engineering, State University of New York, Stony Brook, NY 11794, USA Abstract Optical WDM

More information

Asynchronous vs Synchronous Input-Queued Switches

Asynchronous vs Synchronous Input-Queued Switches Asynchronous vs Synchronous Input-Queued Switches Andrea Bianco, Davide Cuda, Paolo Giaccone, Fabio Neri Dipartimento di Elettronica, Politecnico di Torino (Italy) Abstract Input-queued (IQ) switches are

More information

LS Example 5 3 C 5 A 1 D

LS Example 5 3 C 5 A 1 D Lecture 10 LS Example 5 2 B 3 C 5 1 A 1 D 2 3 1 1 E 2 F G Itrn M B Path C Path D Path E Path F Path G Path 1 {A} 2 A-B 5 A-C 1 A-D Inf. Inf. 1 A-G 2 {A,D} 2 A-B 4 A-D-C 1 A-D 2 A-D-E Inf. 1 A-G 3 {A,D,G}

More information

Achieving Distributed Buffering in Multi-path Routing using Fair Allocation

Achieving Distributed Buffering in Multi-path Routing using Fair Allocation Achieving Distributed Buffering in Multi-path Routing using Fair Allocation Ali Al-Dhaher, Tricha Anjali Department of Electrical and Computer Engineering Illinois Institute of Technology Chicago, Illinois

More information

The Design and Performance Analysis of QoS-Aware Edge-Router for High-Speed IP Optical Networks

The Design and Performance Analysis of QoS-Aware Edge-Router for High-Speed IP Optical Networks The Design and Performance Analysis of QoS-Aware Edge-Router for High-Speed IP Optical Networks E. Kozlovski, M. Düser, R. I. Killey, and P. Bayvel Department of and Electrical Engineering, University

More information

On Achieving Throughput in an Input-Queued Switch

On Achieving Throughput in an Input-Queued Switch 858 IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 11, NO. 5, OCTOBER 2003 On Achieving Throughput in an Input-Queued Switch Saad Mneimneh and Kai-Yeung Siu Abstract We establish some lower bounds on the speedup

More information

Stop-and-Go Service Using Hierarchical Round Robin

Stop-and-Go Service Using Hierarchical Round Robin Stop-and-Go Service Using Hierarchical Round Robin S. Keshav AT&T Bell Laboratories 600 Mountain Avenue, Murray Hill, NJ 07974, USA keshav@research.att.com Abstract The Stop-and-Go service discipline allows

More information

A Proposal for a High Speed Multicast Switch Fabric Design

A Proposal for a High Speed Multicast Switch Fabric Design A Proposal for a High Speed Multicast Switch Fabric Design Cheng Li, R.Venkatesan and H.M.Heys Faculty of Engineering and Applied Science Memorial University of Newfoundland St. John s, NF, Canada AB X

More information

Switch Architecture for Efficient Transfer of High-Volume Data in Distributed Computing Environment

Switch Architecture for Efficient Transfer of High-Volume Data in Distributed Computing Environment Switch Architecture for Efficient Transfer of High-Volume Data in Distributed Computing Environment SANJEEV KUMAR, SENIOR MEMBER, IEEE AND ALVARO MUNOZ, STUDENT MEMBER, IEEE % Networking Research Lab,

More information

Comparative Study of blocking mechanisms for Packet Switched Omega Networks

Comparative Study of blocking mechanisms for Packet Switched Omega Networks Proceedings of the 6th WSEAS Int. Conf. on Electronics, Hardware, Wireless and Optical Communications, Corfu Island, Greece, February 16-19, 2007 18 Comparative Study of blocking mechanisms for Packet

More information

Priority-aware Scheduling for Packet Switched Optical Networks in Datacenter

Priority-aware Scheduling for Packet Switched Optical Networks in Datacenter Priority-aware Scheduling for Packet Switched Optical Networks in Datacenter Speaker: Lin Wang Research Advisor: Biswanath Mukherjee PSON architecture Switch architectures and centralized controller Scheduling

More information

Network Working Group Request for Comments: 1046 ISI February A Queuing Algorithm to Provide Type-of-Service for IP Links

Network Working Group Request for Comments: 1046 ISI February A Queuing Algorithm to Provide Type-of-Service for IP Links Network Working Group Request for Comments: 1046 W. Prue J. Postel ISI February 1988 A Queuing Algorithm to Provide Type-of-Service for IP Links Status of this Memo This memo is intended to explore how

More information

1 Architectures of Internet Switches and Routers

1 Architectures of Internet Switches and Routers 1 Architectures of Internet Switches and Routers Xin Li, Lotfi Mhamdi, Jing Liu, Konghong Pun, and Mounir Hamdi The Hong-Kong University of Science & Technology. {lixin,lotfi,liujing, konghong,hamdi}@cs.ust.hk

More information

Routers with a Single Stage of Buffering * Sigcomm Paper Number: 342, Total Pages: 14

Routers with a Single Stage of Buffering * Sigcomm Paper Number: 342, Total Pages: 14 Routers with a Single Stage of Buffering * Sigcomm Paper Number: 342, Total Pages: 14 Abstract -- Most high performance routers today use combined input and output queueing (CIOQ). The CIOQ router is also

More information

DiffServ Architecture: Impact of scheduling on QoS

DiffServ Architecture: Impact of scheduling on QoS DiffServ Architecture: Impact of scheduling on QoS Abstract: Scheduling is one of the most important components in providing a differentiated service at the routers. Due to the varying traffic characteristics

More information

Multicast Transport Protocol Analysis: Self-Similar Sources *

Multicast Transport Protocol Analysis: Self-Similar Sources * Multicast Transport Protocol Analysis: Self-Similar Sources * Mine Çağlar 1 Öznur Özkasap 2 1 Koç University, Department of Mathematics, Istanbul, Turkey 2 Koç University, Department of Computer Engineering,

More information

IMPLEMENTATION OF CONGESTION CONTROL MECHANISMS USING OPNET

IMPLEMENTATION OF CONGESTION CONTROL MECHANISMS USING OPNET Nazy Alborz IMPLEMENTATION OF CONGESTION CONTROL MECHANISMS USING OPNET TM Communication Networks Laboratory School of Engineering Science Simon Fraser University Road map Introduction to congestion control

More information

Resource allocation in networks. Resource Allocation in Networks. Resource allocation

Resource allocation in networks. Resource Allocation in Networks. Resource allocation Resource allocation in networks Resource Allocation in Networks Very much like a resource allocation problem in operating systems How is it different? Resources and jobs are different Resources are buffers

More information

Using Hybrid Algorithm in Wireless Ad-Hoc Networks: Reducing the Number of Transmissions

Using Hybrid Algorithm in Wireless Ad-Hoc Networks: Reducing the Number of Transmissions Using Hybrid Algorithm in Wireless Ad-Hoc Networks: Reducing the Number of Transmissions R.Thamaraiselvan 1, S.Gopikrishnan 2, V.Pavithra Devi 3 PG Student, Computer Science & Engineering, Paavai College

More information

THE LOAD-BALANCED ROUTER

THE LOAD-BALANCED ROUTER THE LOAD-BALACED ROUTER a dissertation submitted to the department of electrical engineering and the committee on graduate studies of stanford university in partial fulfillment of the requirements for

More information

IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 21, NO. 8, AUGUST

IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 21, NO. 8, AUGUST IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 21, NO. 8, AUGUST 2013 1481 Low Propagation Delay Load-Balanced 4 4SwitchFabricICin0.13-μm CMOS Technology Ching-Te Chiu, Yu-Hao Hsu,

More information

Scheduling Algorithms to Minimize Session Delays

Scheduling Algorithms to Minimize Session Delays Scheduling Algorithms to Minimize Session Delays Nandita Dukkipati and David Gutierrez A Motivation I INTRODUCTION TCP flows constitute the majority of the traffic volume in the Internet today Most of

More information

STATE UNIVERSITY OF NEW YORK AT STONY BROOK. CEASTECHNICAL REPe. Multiclass Information Based Deflection Strategies for the Manhattan Street Network

STATE UNIVERSITY OF NEW YORK AT STONY BROOK. CEASTECHNICAL REPe. Multiclass Information Based Deflection Strategies for the Manhattan Street Network ; STATE UNIVERSITY OF NEW YORK AT STONY BROOK CEASTECHNICAL REPe Multiclass Information Based Deflection Strategies for the Manhattan Street Network J.-W. Jeng and T.G. Robertazzi June 19, 1992 Multiclass

More information

Generic Architecture. EECS 122: Introduction to Computer Networks Switch and Router Architectures. Shared Memory (1 st Generation) Today s Lecture

Generic Architecture. EECS 122: Introduction to Computer Networks Switch and Router Architectures. Shared Memory (1 st Generation) Today s Lecture Generic Architecture EECS : Introduction to Computer Networks Switch and Router Architectures Computer Science Division Department of Electrical Engineering and Computer Sciences University of California,

More information
