Design Principles for Practical Self-Routing Nonblocking Switching Networks with O(N log N) Bit-Complexity


IEEE TRANSACTIONS ON COMPUTERS, VOL. 46, NO. 10, OCTOBER 1997

Design Principles for Practical Self-Routing Nonblocking Switching Networks with O(N log N) Bit-Complexity

Ted H. Szymanski, Member, IEEE Computer Society

Abstract: Principles for designing practical self-routing nonblocking N × N circuit-switched connection networks with optimal Θ(N log N) hardware at the bit-level of complexity are described. The overall principles behind the architecture can be described as "Expand-Route-Contract." A self-routing nonblocking network with w-bit wide datapaths can be achieved by expanding the datapaths to w + z independent bit-serial connections, routing these connections through self-routing networks with blocking, and by contracting the data at the output and recovering the w-bit wide datapaths. For an appropriate redundancy z, the blocking probability can be made arbitrarily small and the fault tolerance arbitrarily high. By using efficient space domain concentrators, the architecture yields self-routing nonblocking switching networks with an optimal O(N log N) bits of memory or O(N log N log log log N) logic gates. By using a linear-cost time domain concentrator, the architecture yields self-routing nonblocking switching networks with an optimal Θ(N log N) bits of memory or logic gates. These designs meet Shannon's lower bound on memory requirements, established in the 1950s. The number of stages of crossbars can match the theoretical minimum, which has not been achieved by previous self-routing networks. The architecture is feasible with existing electrical or optical technologies. The designs of electrical and optical switch cores with Terabits of bisection bandwidth for Networks-of-Workstations (NOWs) are described.

Index Terms: Multistage networks, self-routing, nonblocking, circuit-switching, scalable, randomization, electrical, optical.

1 INTRODUCTION

An N × N nonblocking Connection Network is a circuit-switched network capable of achieving any of the N!
permutations of its N input ports onto its N output ports [35]. Such networks are often used for ATM switching or multiprocessor communication. The hardware cost is defined as the number of logic gates or bits of memory required in its construction. The depth is defined as the number of logic gates along the longest path between an input port and an output port. A self-routing network is one in which a circuit-switched connection can be established by the hardware as it propagates forward through the network, with reliance only on the local information available at each node; there is no need for any off-line path precomputation. The setup time is defined as the propagation delay of all logic gates traversed in the establishment of a connection between some input port and some output port.

(Author footnote: The author is with the Department of Electrical Engineering, McGill University, Montreal, PQ, Canada H3A 2A7. E-mail: teds@macs.ee.mcgill.ca. Manuscript received 9 Aug. 1993; revised 10 June 1996 and 17 May. For information on obtaining reprints of this article, please send e-mail to: tc@computer.org, and reference IEEECS Log Number.)

This paper presents a design for practical and scalable connection networks, i.e., nonblocking switching networks which easily and efficiently scale to large sizes. This problem is historically significant and it is important in the design of Gigabit and Terabit optical networks [36]. Using recently developed free-space optical technologies [2], [12], [18], [21], complex electronic switching nodes can be implemented on an Opto-Electronic Integrated Circuit (OEIC) with optical I/O. Based upon industry projections [2], [18], [28], within a decade, OEICs containing millions of electronic logic gates and thousands of optical I/Os will be feasible. Each OEIC can implement one stage of an optical multistage network. The optical output of one stage can be fed into the next stage through free-space, where the permutation between stages can be implemented optically.
The appeal of these networks is their very high bisection bandwidths (in the Terabit per second range) and the simplicity of their construction, since all interstage wires are implemented optically. It is important to minimize the number of stages in an optical network to minimize cost and maximize reliability. It is also important to avoid complicated backtracking control algorithms within the network, which are infeasible to achieve optically. Perhaps surprisingly, the new optical technologies are highlighting the need for good solutions to historic networking problems, such as the design of scalable switching networks with fast self-routing control algorithms. (A complete set of graphs which allows a reader to design nonblocking networks is presented in Section 2.)

OEICs with hundreds of binary circuit-switching nodes have been developed in many different smart pixel technologies, e.g., [8], [9], [12], [21], [34], [39]. Researchers at the former AT&T have demonstrated a circuit-switched optical multistage network with 60K optical channels which used off-line routing [8], [9], [12]. In spite of the considerable industrial interest in optical multistage networks, to date, there does not exist an efficient scalable nonblocking circuit-switching network architecture with fast self-routing algorithms and with a hardware complexity which is asymptotically optimal.

A recent report sponsored by the U.S. National Science Foundation (NSF) entitled "Research Priorities in Networking and Communications" [36] defined 15 key research priorities for the next decade. These priorities include "Dynamic Network Control," i.e., the need for fast routing algorithms to control the Gigabit and Terabit networks of the future, and "Switching Systems," i.e., the need for optimally scalable connection networks. According to the NSF report, "Realization of these goals require new advances in switching system theory and design. To date, there are no practical architectures for nonblocking multipoint virtual circuit switches that can meet the theoretical limits on optimal scaling with respect to all the characteristics of practical concern (switching network complexity, routing memory requirements) and most systems now being used have poor scaling properties" [36].

The Expand-Route-Contract (ERC) network architecture proposed in this paper represents one step toward these goals. The ERC network can meet Shannon's asymptotic lower bound on the hardware complexity of self-routing nonblocking networks (see next paragraph), can scale optimally to Gigabit and Terabit bandwidths associated with optical technologies, and allows for simple and very fast network control algorithms which are provably immune to congestion.

In the 1950s, Claude Shannon established a lower bound on the hardware complexity of self-routing nonblocking circuit-switching networks which are provably immune to congestion (equivalently, they never exhibit blocking given any permutation traffic pattern) [29]. According to Shannon's complexity argument, an optimal self-routing circuit-switched connection network would require Θ(N log N) hardware,¹ which includes all bits of memory [29], all logic gates, and all crosspoints, and would have a depth of O(log N) binary nodes.
¹ All logarithms are to the base 2 unless otherwise indicated.

To date, no known self-routing nonblocking circuit-switching networks with explicit constructions meet Shannon's lower bounds. The famous AKS sorting network [1] meets Shannon's lower bounds in the limit of infinitely large sizes N, but it relies on linear cost concentrators which lack explicit constructions, i.e., it cannot be built in practice. The MultiBenes network proposed by Arora, Leighton, and Maggs [3] also meets Shannon's lower bounds, but it also relies on expander graphs which lack explicit constructions, relies on complex backtracking control algorithms, and requires an AKS sorting network to acknowledge calls. These two networks are primarily of theoretical significance, and established that Shannon's lower bounds on the cost and depth of self-routing connection networks can be met in theory, but not in practice.

We point out that a store-and-forward packet-switched network, where packets are buffered in each stage, could not possibly meet Shannon's lower bound on cost. Each packet requires at least O(log N) bits to identify its destination, and, if packets are buffered in every stage, then the hardware complexity of the network is at least Θ(N log² N) bits of memory, which is suboptimal by a factor of O(log N). In addition, packet-switched networks are undesirable since they are slow [27]. A pipelined circuit-switched network with fast connection establishment can deliver a permutation of packets from its input side to its output side in roughly the same amount of time a packet-switched network requires to move packets forward one stage. For these reasons, pipelined circuit switching and the similar worm-hole routing technique have largely eliminated packet switching in recent multicomputer networks [27]. (Packet-switched networks using randomized routing are described in [23], [37], [38].)
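The memory cost of store-and-forward packet switching quoted above can be written out as a short worked bound. This is only a sketch under an assumed model that the text does not spell out: a network of at least log N stages, N packets buffered per stage, and a destination tag of at least log N bits per packet.

$$
\underbrace{N}_{\text{packets per stage}} \times \underbrace{\log N}_{\text{stages}} \times \underbrace{\log N}_{\text{tag bits per packet}} = N \log^{2} N \ \text{bits of memory},
\qquad
\frac{N \log^{2} N}{N \log N} = \log N .
$$

The ratio on the right is the O(log N) factor by which such a network exceeds the Θ(N log N) target.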
In practice, many self-routing permutation networks are based on bit-serial circuit-switched versions of Batcher's sorting network [6], with Θ(N log² N) binary nodes, Θ(log² N) depth, and Θ(log² N) setup time. These complexities are expressed at the bit-level, and include all crosspoints, all bits of internal memory, and all logic gates, where every logic gate is assumed to have bounded fan-in and bounded fan-out. There have been some innovative switch designs over the years. Douglass has proposed a rearrangeable network with O(N log^2.5 N) hardware and O(log^2.5 N) setup time [11]. Chien and Oruc proposed permutation networks with O(N log N log log N) bit cost and with O(log³ N) bit delay [7]. Using a numerical analysis, De Biase et al. proposed permutation networks with O(N log^(2+ε) N) bit cost (where ε > 0) and O(log N) bit delay [10]. The complexity of various networks is illustrated in Table 1. However, the discrepancies between the best-known theoretical results and practical results are evident in Table 1.

In this paper, we propose an architecture for self-routing nonblocking, circuit-switched connection networks, which we call the Expand-Route-Contract (ERC) architecture. An overview of the architecture is shown in Fig. 1. The nonblocking ERC architecture is based on the concept of expanding the incoming data, routing the bits through independent bit-serial networks which exhibit blocking, and compacting the data at the output. The architecture is also based on probabilistic schemes and randomization, rather than on deterministic schemes. The expansion can be accomplished in at least two ways:

1) A w-bit wide data word can be encoded with a Forward Error Correcting Code to yield a w + z bit wide word, or
2) The w-bit wide data word is submitted to an expander creating a w + z bit wide word.

After the expansion, the w + z bits of data are routed through w + z independent bit-serial self-routing circuit-switched networks, which we call bit-planes.
Each self-routing bit-plane attempts to establish bit-serial circuit-switched connections, and each bit-plane is allowed to have an arbitrary blocking probability. (The bit-serial connections can also be bit-parallel.) The expansion z is a design parameter which is chosen so that the probability that a w-bit wide connection is established is sufficiently large. Given any level of blocking in the bit-planes, it is always possible to pick the expansion z so that the mean-time-between-blocking of a w-bit wide connection is an arbitrarily large amount of time, for example years. It is important to recognize the strength of this probabilistic approach: There are only about years left in the life of our universe, and, hence, these networks can be designed so that the mean-time-between-blocking can exceed the remaining life of our universe.

TABLE 1. ASYMPTOTIC COMPLEXITIES OF VARIOUS NONBLOCKING CIRCUIT-SWITCHED NETWORKS

Fig. 1. Example of the proposed switch architecture, based on expansion, routing, and contraction.

In order to keep the overall cost at Θ(N log N), the self-routing bit-planes must have Θ(N log N) bit complexity, and the blocking probability of any bit-plane must not approach one, so that the required expansion (based on z) is bounded by a constant factor. It has recently been established that self-routing dilated banyans can be designed to have Θ(N log N) bit complexity and a blocking probability (denoted pb) which approaches zero as N approaches infinity [33]. In this paper, it is shown that the dilated banyans with low blocking probabilities can be used in the proposed Expand-Route-Contract architecture to yield hardware-efficient nonblocking switches with Θ(N log N) bit complexity (where the nonblocking property is based on probabilistic arguments). These nonblocking switches can meet Shannon's lower bound, first established in the 1950s, on the hardware complexity of self-routing nonblocking switches. (We note that the ERC architecture can yield a nonblocking network using any self-routing bit-planes, as long as the blocking probability in the bit-planes is less than one. We also point out that the bit-serial connections can be replaced by bit-parallel connections.)

In practice, each bit of high-speed memory has an equivalent cost of nearly 10 logic gates. Hence, minimization of memory requirements is often more important than minimization of logic gates [29]. The proposed ERC network can be designed with asymptotically optimal memory requirements. In summary, the proposed ERC connection network architecture has straightforward explicit constructions, uses very simple and fast routing algorithms which are easily implemented in hardware, and has very fast setup times when compared to other known networks.
This paper is organized as follows. Section 2 presents a brief review of multipath delta networks, and derives some upper bounds on the blocking in multipath delta networks. Section 3 describes the principles behind the ERC architecture. Section 4 discusses SDM and TDM constructions of multipath delta networks and derives asymptotic complexities. Section 5 describes the application of the theory to the design of electrical and optical networks, and Section 6 contains concluding remarks.

2 BLOCKING IN MULTIPATH CIRCUIT-SWITCHED DELTA NETWORKS

Delta networks are banyan networks with the self-routing property [26]. A d-dilated delta network [4], [19], [31], [32], [33] can be obtained from a regular banyan by increasing the capacity of each edge to handle up to d connections simultaneously, and by replacing all the crossbar switches with dilated crossbar switches. Each logical input port to a dilated crossbar can receive up to d connections simultaneously, and each logical output port of a dilated crossbar can transmit up to d connections simultaneously. A two-dilated delta network is shown in Fig. 2a. The theorems in this paper will apply to dilated delta networks, and the more general multipath delta networks (see next paragraph) have comparable performance.

A p-path delta network can be defined as a multipath delta network with the following property: In every stage where a routing decision must be made, there exist p alternate paths which lead to a given destination [32]. Therefore, in every stage, a p-path delta network has at least p suitable alternate paths which a connection could take while moving toward its destination. A two-path delta network is shown in Fig. 2b.

2.1 Blocking Probability in a Multipath Delta Network

THEOREM 2. Given a random uniform traffic model, the conditional blocking probability in a self-routing d-dilated b^n × b^n delta network with h connection requests sourced at each logical input port is upper bounded by

$$p_b \le n \left(\frac{eh}{d}\right)^{d} e^{-h}.$$

Fig. 2. (a) Dilated delta network built with two-dilated 2 × 2 crossbar switches. (b) Multipath delta network built with two-dilated 2 × 2 crossbar switches.

In this section, simple proofs are proposed which establish that 1) Θ(N log log N) connections will survive when routed through a p-path N × N delta network with a path multiplicity of p = Θ(log log N), and 2) that the blocking probability pb of an individual connection approaches zero as N approaches infinity, given a loading of Θ(log log N) connections per I/O port. (Some looser bounds were presented in [33].) Readers who are not interested in the mathematical proofs may proceed directly to Section 2.5, where the numeric results are discussed, without a loss of continuity.

Consider a circuit-switched d-dilated b^n × b^n delta network, with N = b^n logical input ports and N logical output ports. (Note: A logical port has a capacity to support d connections.) Let there be h connection requests applied at each logical network input port (h ≤ d). Connection requests are randomly distributed over the logical output ports. Connection requests flow from the input side to the output side. Whenever d + 1 or more requests attempt to exit a logical output port in stage i, d requests are selected at random and propagated forward and the remainder are blocked. Let the random variable N_in denote the number of requests entering the network, let N_out denote the number of requests leaving the network, and let N_blocked denote the number of connection requests blocked within the network (each variable assumes a value given a specific state of the network).
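The blocking rule described above (at most d requests leave a logical output port, the excess is blocked) can be illustrated with a small Monte Carlo sketch for a single d-dilated crossbar stage. This is not the paper's analytic model: the function name, the trial count, and the use of Python's pseudorandom generator are assumptions made for illustration, and a full multistage delta network is not simulated.

```python
import random


def simulate_dilated_crossbar(n_ports, h, d, trials=200, seed=0):
    """Estimate the blocked fraction in one d-dilated n_ports x n_ports crossbar.

    Hypothetical helper: h requests are sourced at every logical input port,
    each request picks a logical output port uniformly at random, and each
    logical output port forwards at most d requests.
    """
    rng = random.Random(seed)
    offered = 0
    blocked = 0
    for _ in range(trials):
        # arrivals[j] counts the requests destined for logical output port j
        arrivals = [0] * n_ports
        for _ in range(n_ports * h):
            arrivals[rng.randrange(n_ports)] += 1
        offered += n_ports * h
        # only the number of excess requests matters, not which ones are dropped
        blocked += sum(max(0, a - d) for a in arrivals)
    return blocked / offered


if __name__ == "__main__":
    # half-loaded eight-dilated crossbar with 64 logical ports
    print(simulate_dilated_crossbar(n_ports=64, h=4, d=8))
```

The returned ratio is an empirical estimate of blocked requests over offered requests for one stage only; it can be compared informally against the single-crossbar bound, but it says nothing about correlations between stages.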
Define the acceptance probability as pa = E[N_out]/E[N_in] and the blocking probability as pb = E[N_blocked]/E[N_in], where the expectation is taken over all states of the network. It follows that pb = 1 − pa. (These probabilities are conditional on the fact that a request exists initially.)

Theorem 1 yields a concise upper bound on the blocking in a dilated crossbar switch and is stated without proof. The proof uses Valiant's version of Chernoff's bound [38] after application of Hoeffding's result [13], to yield a closed form expression on the tail of a binomial distribution.

THEOREM 1. Given a random uniform traffic model, the conditional blocking probability pb in a d-dilated N × N crossbar, with h connection requests sourced at each logical input port, is upper bounded by

$$p_b \le \left(\frac{eh}{d}\right)^{d} e^{-h}.$$

PROOF (of Theorem 2). Suppose that all logical output ports in all stages are assigned unique labels L_{i,j} for 0 ≤ i ≤ n, 0 ≤ j ≤ N − 1, where N = b^n. A path through a network is defined as a sequence of dilated edges (i.e., edges with a capacity of d connections). A connection request (for a circuit-switched connection) follows a particular path to its destination as it is routed through the network; if it encounters a saturated edge, it blocks, otherwise, it survives. Define B_{i,j} as the number of connection requests that block at L_{i,j} in a given state. By definition, the end-to-end conditional blocking probability is given by

$$p_b = \frac{E[N_{blocked}]}{E[N_{in}]} = \frac{\sum_{s=1}^{n} \sum_{l=0}^{N-1} E[B_{s,l}]}{Nh}.$$

Consider a specific topology, the omega-inverse network, in this section (the result applies to all topologically equivalent delta networks). Due to the random uniform traffic model, the paths must be evenly distributed over the output ports in the last stage of the network. Assuming that there was no blocking in stages 1 to n − 1, then the entire network can be viewed as a d-dilated N × N crossbar (where N = b^n), where the blocking occurs only at the output ports.
By applying the upper bound from Theorem 1, it follows that the expected number of requests which block at the output ports of the last stage is given by

$$E\left[\sum_{l=0}^{N-1} B_{n,l}\right] \le \left(\frac{eh}{d}\right)^{d} e^{-h} \left(b^{n} h\right).$$

The first n − 1 stages of the network can be viewed as two smaller N/2 × N/2 dilated banyans. By repeated application of the above argument on the rest of the network, the expected number of blocked requests in the d-dilated b^n × b^n delta network is upper bounded as follows:

$$E\left[\sum_{s=1}^{n} \sum_{l=0}^{N-1} B_{s,l}\right] \le n \left(\frac{eh}{d}\right)^{d} e^{-h} \left(Nh\right).$$

Therefore, the conditional blocking probability of the entire network is upper bounded as follows:

$$p_b \le n \left(\frac{eh}{d}\right)^{d} e^{-h}.$$

Theorem 2 bounds the blocking probability in a multipath delta network given a random uniform traffic model.

2.2 Worst-Case Traffic Immunity and Randomized Routing

In the worst case, a d-dilated N × N delta network can establish only O(d√N) connections simultaneously. For example, in an eight-dilated 64K × 64K banyan, the worst-case traffic pattern only allows about three percent of all connections to pass through. To ensure immunity to worst-case traffic, it is sufficient to randomize the traffic first, through the addition of another delta network [23], [38]. The concatenation of two dilated delta networks (one for randomization and one for routing, with the innermost stages merged) yields a class of general topologies which could be called Tandem Dilated Banyan networks, which include the Dilated Benes topology as one example. A more general network obtained from the concatenation of two multipath delta networks yields a class of topologies which could be called Tandem Multipath Delta networks, which includes the MultiBenes topology [3] as one example.

Every connection request picks a random destination at the output of the randomization network and then attempts to establish a connection to that destination. Any given traffic pattern, including a worst-case pattern, is transformed into a random traffic pattern in the randomization network [20], [23], [38]. Requests then attempt to establish connections to their original destinations through the routing network. This approach eliminates congestion due to worst-case traffic patterns, as will be shown. (Note: In practice, the randomizer can be operated in various manners; see Section 2.4.)

To date, no researchers have managed to rigorously derive a bound on the blocking probability of self-routing circuit-switching networks using randomized routing. Leighton addresses the problem of deriving a rigorous proof in his textbook.
The connection requests which have survived through a randomizer are not randomly distributed over its output links: Their positions are correlated and it is not known how to bound the blocking given correlated traffic models. The difficulty of handling correlated traffic models and the unsolved nature of the problem is described in [20].

In this section, we present an alternative approach to bound the blocking in dilated delta networks using randomized routing. A key point of the proof is to note the distinction between paths and surviving connection requests. At any stage, a surviving connection request is essentially the front-end of a pipelined circuit-switched connection, or, equivalently, the front-end of a worm-hole routed connection. The surviving connection requests exiting the randomizer are not randomly distributed over its output ports, as observed by Leighton. However, the paths of all connection requests must be randomly distributed over the output ports, since the path destinations are selected at random. Hence, if we assume all connection requests are surviving connection requests at any given stage, we can exploit the fact of random path destinations and thereby upper bound the blocking at the given stage. We may then exploit the symmetry between the randomizing network and the routing network to bound the blocking in the routing network. Since the loading at each end is deterministic and identical (h paths per logical port) and the loading distribution at the middle is identical, then the pattern of paths is symmetric about the middle. We may then establish that the upper bound from Theorem 2 is valid in each network; this is formalized in the next theorem.

THEOREM 3.
Given the concatenation of two d-dilated N × N banyan networks, with h < d connection requests at each input port, which are randomly and uniformly distributed over the outputs of the first dilated banyan, such that each output port of the second dilated banyan is the destination of precisely h connection requests, the expected number of blocked requests in the second dilated banyan is upper bounded as follows:

$$E\left[\sum_{s=n+1}^{2n} \sum_{l=0}^{N-1} B_{s,l}\right] = E\left[\sum_{s=1}^{n} \sum_{l=0}^{N-1} B_{s,l}\right] \le n \left(\frac{eh}{d}\right)^{d} e^{-h} \left(Nh\right) \;\Rightarrow\; p_b \le n \left(\frac{eh}{d}\right)^{d} e^{-h}.$$

PROOF. Follows by symmetry from the proof of Theorem 2, noting that the expectation is upper bounded by considering paths which have random destinations, rather than surviving connection requests which have correlated destinations, and observing that the set of paths over which the expectation is taken in the routing network is symmetric with the set of paths in the randomization network, and observing that the direction of flow of connections is irrelevant.

Theorem 3 establishes an upper bound on the blocking in the routing network given any worst-case traffic pattern, and is necessary to establish the existence of self-routing nonblocking ERC networks which are immune to worst-case traffic patterns in Section 3.

2.3 Asymptotic Performance

We now consider the blocking in a dilated delta network as the network size scales toward infinity.

THEOREM 4. Given the concatenation of two dilated banyans, one acting as a randomization network and one acting as a routing network, then pb → 0 as N → ∞ through the appropriate choice of h and d.

PROOF. Let the dilation be d = K⌈log log N⌉ for constant K ≥ 1 and the number of traffic sources at each input port h be such that (eh/d) = 1/2, then

$$\lim_{n \to \infty} p_b \le \lim_{n \to \infty} n \left(\frac{1}{2}\right)^{K \log\log N} e^{-\frac{K \log\log N}{2e}} = \lim_{n \to \infty} \frac{n}{n^{K+c}} = 0$$

(since c > 0 and n = log N).

Theorem 4 establishes that pb → 0 as N → ∞ for the concatenation of two self-routing dilated delta networks.
These networks can be made to have an arbitrarily low pb and immunity to worst-case traffic patterns by the appropriate choice of h and d. (We note that even faster convergence to zero can be obtained by selecting a wider dilation d = K⌈log N⌉, in which case pb ≤ lim_{N→∞} n/N^(K+c) = 0, although, in practice, this larger dilation is not necessary. Wider dilations are useful in an asymptotic analysis where all N connections in a permutation do not block simultaneously.)
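The behaviour claimed by Theorem 4 can be checked numerically by evaluating the Theorem 2 bound with a dilation that grows as K⌈log log N⌉. The sketch below makes several assumptions for illustration: 2 × 2 switches (so n = log2 N stages), K = 2, and a real-valued load h = d/(2e), whereas a hardware design would use an integer number of requests per port. The function names are hypothetical.

```python
import math


def theorem2_bound(n_stages, h, d):
    """Upper bound n * (e*h/d)**d * exp(-h) on pb, as stated in Theorem 2."""
    return n_stages * (math.e * h / d) ** d * math.exp(-h)


def dilation(n_stages, K=2):
    """Dilation d = K * ceil(log2(log2 N)), where n_stages = log2 N."""
    return K * math.ceil(math.log2(n_stages))


if __name__ == "__main__":
    for n in (16, 64, 256, 1024):
        d = dilation(n)
        h = d / (2 * math.e)  # load chosen so that e*h/d = 1/2
        print(f"stages={n:5d}  dilation={d:3d}  bound={theorem2_bound(n, h, d):.3e}")
```

The printed bound shrinks as the number of stages grows, which is the qualitative statement of Theorem 4. The bound is loose, so the values are not estimates of the actual blocking probability.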

Fig. 3. Blocking probability (bit-pb) versus stages for various dilated banyans: (a) fixed dilation = 2, half-loaded, (b) fixed dilation = 4, half-loaded, (c) fixed dilation = 8, half-loaded, (d) dilation = load = O(log log N). Blocking probability approaches zero when dilation grows slowly. Bold dot represents design example discussed in Section 5.

2.4 Variations

In practice, a number of techniques can be used to either reduce the cost or lower the blocking probability. Blocking in the randomization network can be eliminated by having the crossbar switches always propagate connection requests forward in pseudorandom directions. Each switch in the randomizer can select a nonblocking state at random; the incoming connections are permuted and always propagate forward. The cost of the randomizer can also be reduced by using a one-dilated crossbar rather than d-dilated crossbars. This approach eliminates blocking in the randomizer and reduces the cost of the randomizer.

Alternatively, in practice, the blocking probability can be reduced by employing deflection routing in the randomization network; see [9]. The randomization network attempts to route requests to their intended destinations. If a request encounters a congested link in stage i, it is deflected and propagated out over the wrong link. After exiting the randomization network, some requests will arrive at their destinations, while others will arrive at incorrect destinations, and all requests are launched into the routing network. In the routing network, the requests which were deflected in the randomization network will have another chance to be routed to their destination. The deflection routing algorithm results in a lower end-to-end blocking probability than predicted by Theorems 2 and 4, but is more complex to implement electronically.

2.5 Numeric Results

In this section, the blocking probability of a dilated banyan based routing network is plotted against various parameters.
Exact analytic models for the blocking probability of dilated banyans under random uniform traffic have been published in [19], [32]. The reader is referred to those papers for details. The blocking probability of a dilated banyan can be reduced by operating at lower loads, i.e., by separating an active input port which supports d connections by one or more idle input ports. This approach was also used by Arora, Leighton, and Maggs to lower the load in the MultiBenes network [3]. To lower the load in our system, we assume that each input port which has the capacity to source up to d connection requests actually sources fewer requests, i.e., for a half-load, each dilated input port sources h = d/2 connection requests.

The blocking probabilities of various dilated banyans, with various dilations d, at half load (h = d/2), are plotted against the number of stages in Figs. 3a, 3b, and 3c. (Blocking in the last stage is eliminated in our traffic model, since at most h connections arrive at any one logical output port.) Given a fixed dilation, the pb will approach one as N → ∞, as indicated by Theorem 2, although it does so very slowly. (In all figures, the bold dots represent an eight-dilated banyan which will be used in the optical design in Section 5.)

According to Theorem 4, for pb to approach 0 as N → ∞, the dilation and loading must grow slowly with N, i.e., d = ⌈log log N⌉ and h = O(d). The blocking probabilities of various dilated banyans which meet the conditions of Theorem 4, with a dilation of ⌈4 log log N⌉ and at four different loadings (h/d = 0.125, 0.25, 0.375, and 0.5), are plotted in Fig. 3d. The number of stages is set to a very large value (300 stages), so that the asymptotic limits of the curves can be examined. As established in Theorem 4, asymptotically pb → 0 as N → ∞. Hence, dilated banyans with fast self-routing algorithms can be designed to have arbitrarily small pbs as the network size scales to infinity. These results will be necessary in Section 3 to derive nonblocking ERC networks (based on a probabilistic approach) from networks with blocking.

3 THE EXPAND-ROUTE-CONTRACT NONBLOCKING SWITCH ARCHITECTURE

In conventional circuit-switching connection networks, the connection datapaths are usually many bits wide, typically eight, 16, or 32 bits. Typically, all the bits in a connection datapath are switched together as an indivisible entity.
If the circuit-switched connection blocks, then all bits in the connection block simultaneously. Similarly, if one bit in the datapath fails, then the entire connection fails.

The proposed ERC architecture relies on a fundamentally different approach. In order to establish a w-bit wide circuit-switched connection in the ERC network, at the input side, w + z independent bit-serial connection requests are inserted into the network (where w ≥ 1). Each bit-serial connection is routed through a circuit-switched network called a bit-plane, typically a one-bit wide dilated banyan with a finite blocking probability. At the output side, all surviving bit-serial circuit-switched connections are contracted together, and a w-bit wide datapath is established if w or more bit-serial connections have survived. In principle, the bit-planes can be any self-routing bit-serial circuit-switching networks with blocking, including the conventional bit-serial banyan network. (In principle, we could also use a Forward-Error Correcting code which can correct z bit-errors to expand a w-bit data word to w + z bits at the input side, and the decoder to contract w + z bits to a valid w-bit code word at the output side. A blocked bit-serial connection appears as a consistent bit-error which can be corrected by the error-correcting code.)

Suppose that every N × N bit-plane has a blocking probability denoted bit-pb. The goal is to design an N × N switch with w-bit wide datapaths with a given blocking probability per connection (called the connection-pb), which can be arbitrarily small, given any level of blocking in the bit-planes. Typically, one may select the connection-pb to be 10^-8, although other probabilities, such as 10^-20, are easily designed. (Blocking in networks is occasionally defined as the event that a permutation cannot be routed in one pass. We assume the more practical measure in this paper, i.e., the event that a connection cannot be routed.
Under the permutation model, our dilation in Theorem 4 must be wider, i.e., O(log N), and the ERC network still yields optimal memory bit-complexity.)

Given that we insert w + z bit-serial connection requests into w + z bit-planes, the probability that a w-bit wide data path is established at each output port is given by the following. (In practice, we could inject multiple bit-serial connections into the same bit-plane.) Let pb = bit-pb, pa = 1 - bit-pb, and w ≥ 1. Therefore,

$$\Pr[\text{w-bit wide data path established}] = \sum_{j=w}^{w+z} \binom{w+z}{j}\, p_a^{\,j}\, (1 - p_a)^{\,w+z-j}.$$

In this section, define the expansion to be the number of extra bits required. For example, an expansion of four and a datapath width of w implies that 4 + w bit-serial connection requests must be operated in parallel to achieve the specified connection-pb. (If the bit-serial connections are replaced by bit-parallel connections, the expansion also applies to bit-parallel connections.) The required expansion depends upon the blocking probability in the bit-plane and the datapath width. Wider datapaths require less expansion to achieve a given connection-pb. Fig. 4 applies for a datapath width of eight bits. Fig. 4a plots the expansion required to achieve a given connection-pb when the bit-pb lies in the lowest range considered (below 0.01). Fig. 4b plots the expansion required to achieve a given connection-pb when the bit-pb is in the range of 0.01 to 0.1. Fig. 4c plots the expansion required to achieve a given connection-pb when the bit-pb is in the range of 0.1 to 0.5.

The ERC architecture yields nonblocking networks regardless of the blocking probability in the bit-planes. Bit-planes with more blocking simply require a larger expansion to achieve a given connection-pb. Figs. 3 and 4 supply sufficient data so that a reader can design a self-routing nonblocking network of their choice. For example, to achieve a connection-pb of 10^-8 at the bit-pb marked by the dot on Fig. 4a, the expansion is four bits.
It is also possible that each connection is one bit wide. For this case, we set w = 1 and the expansion, or number of extra copies of the request, can be found from the previous equation. A connection request is successful if at least one bit-serial request reaches the destination.
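As a rough illustration of the binomial expression above, the following Python sketch evaluates the connection-pb for a given expansion z and searches for the smallest expansion that meets a target. The function names, the search limit, and the sample bit-pb of 0.005 are illustrative assumptions rather than values from the paper, and the bit-planes are assumed to block independently.

```python
from math import comb


def connection_pb(w: int, z: int, bit_pb: float) -> float:
    """Probability that fewer than w of the w + z bit-serial requests survive.

    Assumes each bit-plane blocks independently with probability bit_pb.
    The failure tail is summed directly to avoid computing 1 - (value near 1).
    """
    n = w + z
    pa = 1.0 - bit_pb
    return sum(comb(n, j) * pa**j * bit_pb ** (n - j) for j in range(w))


def required_expansion(w: int, bit_pb: float, target: float, z_max: int = 256) -> int:
    """Smallest expansion z whose connection-pb does not exceed the target."""
    for z in range(z_max + 1):
        if connection_pb(w, z, bit_pb) <= target:
            return z
    raise ValueError("target not reachable within z_max extra bit-planes")


if __name__ == "__main__":
    # Byte-wide datapath, assumed bit-pb of 0.005, target connection-pb of 1e-8.
    print(required_expansion(8, 0.005, 1e-8))  # 4
    # One-bit connection (w = 1): extra copies of the request.
    print(required_expansion(1, 0.5, 1e-3))    # 9
```

Under these assumptions a byte-wide datapath needs four extra bit-planes, which is in line with the four-bit expansion quoted in the text; other bit-pb values can be tried by changing the arguments.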

Fig. 4. Contour map of expansion versus bit-pb to achieve a given connection-pb for a datapath width of eight bits: (a) lowest range of bit-pb, (b) bit-pb in the range 0.01 to 0.1, (c) bit-pb in the range 0.1 to 0.5. Bold dot corresponds to example used in Section 5.

4 HARDWARE-EFFICIENT TDM AND SDM CONSTRUCTIONS

In this section, the hardware complexity of the ERC network using multipath delta networks is examined.

4.1 Space Division Constructions

A d-dilated k × k crossbar switch consists of k kd-to-d concentrators; each concentrator collects the requests with a distinct routing bit between 0 ... k - 1 and propagates d of them forward.

Concentrator Construction 1: A simple concentrator design called a Daisy-Chain Concentrator is shown in Fig. 5. The concentrator controller consists of an array of kd-by-d control cells, where an individual control cell is shown in Fig. 5a. Each control cell requires four logic gates and drives an associated crosspoint cell, shown in Fig. 5b. Each vertical column is essentially a daisy-chain, which controls access to one output port. A busy signal travels down the daisy-chain and is asserted by the first active request. No other requests can claim the same output port once its busy signal is asserted. Other requests which encounter a busy signal in one daisy-chain column are forwarded to the next daisy-chain column to see if it is busy. An example state of a 6-to-4 concentrator is shown in Fig. 5c.

The kd-to-d daisy-chain concentrator requires five gates per cell and kd × d cells. For fixed k, the concentrator requires O(d^2) logic gates and has a setup time of O(d) logic gates. Each of the kd input ports requires O(log k) bits of memory to identify the requested logical output port. In a synchronous mode of operation, the degree k switch requires O(kd log k) bits of memory. For fixed k, the switch requires O(d) bits of memory.

THEOREM 5.
The use of the daisy-chain concentrator in the ERC architecture, which satisfies the conditions of Theorem 4, yields a self-routing nonblocking N × N connection network with O(N log N) nodes, O(N log N) bits of memory, O(N log N log log N) logic gates, and a depth of O(log N log log N) logic gate delays. (The proof follows directly by substitution, where N is the number of distinct connections supported by the network.)

When the complexity is measured in terms of crossbar nodes, the cost is an optimal O(N log N) nodes. In terms of bits of memory, the complexity is an optimal O(N log N) bits, which meets Shannon's lower bound established in the 1950s. Hence, the switch scales optimally according to these important practical metrics. In terms of logic gates, the complexity is a slightly suboptimal O(N log N log log N) logic gates, and the depth is a slightly suboptimal O(log N log log N) logic gate delays. As integrated circuits will soon support millions of gates, and as logic gate dimensions and delays will continue to shrink, these logic gate metrics have diminishing importance in practice. Furthermore, the growth rate of the O(log log N) term is so slow as to be negligible in practice. The simplicity and regular VLSI layout of this concentrator circuit make it useful in practical designs (see Section 5). (The depth of this construction can be improved and will be reported elsewhere.)

Fig. 5. (a) Daisy-chain concentrator control cell with four logic gates, (b) data plane crosspoint with tristate driver gate, (c) 6-to-4 daisy-chain concentrator with 6 × 4 array of concentrator cells. Three requests are routed to the first three output columns. (Dashed cells are not needed.)

Concentrator Construction 2: For sufficiently large dilations, the logic gate complexity of the d-dilated k × k crossbar can be improved. The crossbar can be constructed with O(kd log kd) hardware and O(log kd) depth by using self-routing concentrators with logarithmic depth, as shown in Fig. 6. Each concentrator consists of a ranking circuit to assign ranks to the active inputs, followed by an omega-inverse network, which acts as a compact concentrator for k = 2. (It has recently been established that a single Omega network can act as a zero and one concentrator simultaneously [16], and, hence, only one Omega network is necessary in Fig. 6b.)

The ranking of kd inputs is achieved by using a bit-serial pipelined binary tree, as shown in Fig. 6a. Each box or circle is a bit-serial adder (the first stage of boxes is not necessary and is drawn for symmetry). The ranks and partial sums are computed bit-serially, least significant bit first. This ranker is a pipelined multistage circuit based upon a binary tree ranker described in [20]. In the upward phase, each node propagates the sum up toward the root and propagates the count from its uppermost child horizontally. The root node (in the middle) propagates a zero count to its uppermost child, and propagates the incoming count from its upper child to the lower child.
In the downward phase, each intermediate node propagates the count arriving from above directly to its uppermost child, adds the count from above and the count arriving horizontally, and propagates the sum to its lower child. The leaves add the incoming count with a one if they have a connection, yielding the rank (from one to kd) of the request.

THEOREM 6. The use of the log-depth concentrator in the ERC architecture, which satisfies the conditions of Theorem 4, yields a self-routing nonblocking N × N connection network with O(N log N) nodes, O(N log N log log log N) bits of memory, O(N log N log log log N) logic gates, and a depth of O(log N log log log N) logic gate delays. (Proof follows by substitution.)

When the complexity is measured in terms of crossbar nodes, the cost is an optimal O(N log N) nodes. In terms of bits of memory, the hardware complexity is O(N log N log log log N) bits, which is slightly suboptimal by a small factor of O(log log log N) when compared to Shannon's lower bound. In terms of logic gates, the complexity is O(N log N log log log N) and the depth is O(log N log log log N), which is an improvement over the daisy-chain concentrator design. In situations where gate delays are more important than bits of memory, the log-depth concentrator may be preferable over the daisy-chain concentrator.

4.2 TDM Constructions

A time-division d-dilated k × k crossbar can be implemented with a hardware cost of O(k^2 d) logic gates and bits of memory and a latency of O(kd) by using a circuit called a time-bit concentrator [33]. A TDM-dilated crossbar switch with four incoming and four outgoing space-division links is shown in Fig. 7. Each link carries up to d time-multiplexed bit-serial connections. On each incoming link, the bits from the d multiplexed connections keep arriving in the same order (i.e., for a dilation of eight, bits belonging to connections arrive in order 1, 2, 3, ..., 8, and the cycle repeats).
At each input port, a circular buffer stores the routing bits for the connections. The circular buffer is loaded initially as the connection-request headers pass by. Once the circular buffers are loaded with routing bits, the multiplexed data bits arrive, and each bit is routed to a push-pop stack at the desired logical output port. In Fig. 7, we have two dedicated push-pop stacks at each output port, so that there is no contention for pushing the stack. One stack is used to store incoming bits, while the other is emptying outgoing bits. (Each stack must allow four simultaneous pushes in constant time, and can be designed as four separate push-pop stacks for simplicity.) Once all incoming bits have been routed to the output ports, d of them are then transmitted forward over each space-division link by having a read-out circuit pop the stack(s) (in constant time) in a repeatable order for d time units. Any bits left in the stack(s) after this represent blocked connections, and they are dropped. During the time one stack is emptying, another stack is storing a new set of incoming bits, which supplies the outgoing bits when the cycle repeats itself. It can be verified that this circuit acts as a d-dilated crossbar, and has O(k^2 d) hardware and O(kd) latency in gates. We point out that the time-bit concentrator is fully pipelineable, i.e., new bits can enter and exit the concentrator on every link during every clock cycle.

Fig. 6. (a) An O(log N)-depth ranking circuit, (b) a d-dilated 2 × 2 crossbar with O(log d) depth and O(d log d) bit-complexity.

Fig. 7. A 4 × 4 dilated crossbar using time-bit concentrators.

THEOREM 7. The use of the time-bit concentrator in the ERC architecture, which satisfies the conditions of Theorem 4, yields a self-routing nonblocking N × N connection network with O(N log N) nodes, O(N log N) bits of memory, O(N log N) logic gates, and a depth of O(log N log log N) logic gate delays. (Proof follows by substitution.)
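The behavior of the time-bit concentrator over one frame can be sketched in a few lines of Python. This is a behavioral model only, not the gate-level circuit of [33]: the function name, the frame representation, and the choice of last-in-first-out read-out as the "repeatable order" are assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# One time slot on an input link: None if idle, else (requested output port, data bit).
Slot = Optional[Tuple[int, int]]


def tdm_dilated_crossbar(frame: List[List[Slot]], k: int, d: int):
    """Route one frame of d time slots through a d-dilated k x k TDM crossbar.

    frame[link][slot] holds the bit arriving on an input link in a time slot.
    Returns (forwarded, dropped): forwarded[port] lists up to d entries
    (link, slot, bit) sent on that output link; dropped counts the bits left
    in the stacks, i.e., the blocked bit-serial connections.
    """
    stacks: List[List[Tuple[int, int, int]]] = [[] for _ in range(k)]

    # Push phase: in every time slot, each input link pushes its bit onto the
    # stack of the requested logical output port.
    for slot in range(d):
        for link in range(k):
            entry = frame[link][slot]
            if entry is not None:
                out_port, bit = entry
                stacks[out_port].append((link, slot, bit))

    # Read-out phase: each output link pops at most d bits; the rest are dropped.
    forwarded = []
    dropped = 0
    for stack in stacks:
        sent = []
        while stack and len(sent) < d:
            sent.append(stack.pop())
        forwarded.append(sent)
        dropped += len(stack)
    return forwarded, dropped


if __name__ == "__main__":
    # k = 2 links, dilation d = 2: three bits contend for output 0, one goes to output 1.
    frame = [[(0, 1), (0, 0)], [(0, 1), (1, 1)]]
    forwarded, dropped = tdm_dilated_crossbar(frame, k=2, d=2)
    print(forwarded, dropped)
```

Because the routing bits are fixed once the circular buffers are loaded, the same connections survive in every frame of this model, which is the property a repeatable read-out order is meant to provide.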
When the complexity is measured in terms of crossbar nodes, the cost is an optimal O(N log N) nodes. In terms of bits of memory and logic gates, the complexity is an optimal O(N log N), which meets Shannon's lower bound. Hence, the switch scales optimally according to these important practical metrics. The depth is O(log N log log N) logic gate delays, which is slightly suboptimal by the small factor of O(log log N). This design is particularly useful in optical networks, since TDM is a natural mechanism to exploit the bandwidth advantage that optics offers over electronics. (It is interesting to observe that the complexity of the ERC network is an optimal O(N log N) hardware for wider dilations of d = O(log N), provided that h = O(d) and k is constant.)

Variations: In practice, the depth of all the ERC self-routing networks can be reduced to an optimal O(log N) logic gate delays by keeping the dilation fixed and letting the expansion increase slowly with N. One may select a multistage network with a fixed dilation d and loading h, which has a complexity of O(N log N) hardware and a depth of O(log N) logic gates. From Fig. 3, for fixed dilations, it is observed that pb will rise very slowly as N increases. To keep the connection-pb below a prescribed value, the expansion can be read from Fig. 4. The analysis and numerical results indicate that the expansion is ≈ Θ(log log N). Hence, in practice, ERC networks can also be designed to have O(N log N log log N) hardware complexity and O(log N) logic gate delays.

TABLE 2
SEMICONDUCTOR INDUSTRY ASSOCIATION PROJECTIONS FOR CMOS TECHNOLOGY [28]
Columns: Year | Feature Size (µ) | Gates | Area (sq. mm) | Electrical I/O Pins | On-Chip Clock | Off-Chip Clock
Recoverable entries (in row order): off-chip clock 100, 175, 250, 350, and 500 MHz; area 600, 800, 1,000, and 1,250 sq. mm in the last four rows; 3,600 electrical I/O pins and a 1 GHz on-chip clock in the last row.

5 A TERABIT SELF-ROUTING NONBLOCKING ATM SWITCH CORE FOR NETWORKS-OF-WORKSTATIONS

One application of fast self-routing switching networks is the Networks of Workstations (NOWs) distributed computer architecture. NOWs interconnected with a centralized ATM switch core based on a multistage delta network are described in [27]. The performance of microprocessors and communication networks has been growing exponentially over the last decade, and these trends are expected to continue well into the future [22], [27], [28]. By the year 2017, the single chip micros are expected to have performances of a few Teraflops, and networks are expected to have capacities of several Terabits per second [22]. In this section, we consider the design of a scalable ATM switch core, which can interconnect a large number of networked workstations.

Each workstation typically has a CMOS Message-Processor (MP) to handle the communication protocols. Messages are supplied to the Message-Processors, where they may be fragmented into fixed size packets, assigned sequence numbers for error control, replicated for broadcasting, queued, and then transmitted over electrical or fiber links to the centralized switch core. The MPs also perform the receiving protocols. The switch core could support distributed shared memory over a NOW.

Consider a pipelined circuit-switched switch core, where the connections are established and torn down on a per-packet basis.
(The network could be synchronous or asynchronous.) As a connection moves forward, a packet is transferred in a pipelined manner byte-by-byte. Pipelined circuit switching is similar to worm-hole routing [27], in that every intermediate node buffers a few bits or bytes of a packet, and the packet can be spread out over many intermediate nodes as it moves through the network.

Consider the design of a 1K × 1K switch core with byte-wide data paths and a prescribed blocking probability per connection. In practice, protocols will ensure that no destination is overloaded, i.e., that a destination receives at most h packets at a time (for some number h). Attempts to transmit to an overloaded destination can be flagged with a busy signal and deferred until a short time later. Assume the switch is to be designed using CMOS technology available in a particular target year. Table 2 illustrates some of the Semiconductor Industry Association (SIA) estimates for CMOS technology over the next decade [28]. The figures in Table 2 will influence the design example, and are traditionally conservative.

Multistage networks can be designed with many stages of simple binary switches, or fewer stages of larger switches. To minimize the IC count, we will consider multistage networks with fewer stages of moderate size switches. A one-stage switch has minimal cost in terms of ICs. However, it requires very large crossbar switches. Due to electronic pin limitations, it is not possible to implement a 1K × 1K switch with eight-bit wide data paths on a single IC (yielding a one-stage network). However, using the design principles proposed in this paper, one can design three-, five-, seven-, or arbitrary (2n - 1)-stage networks with moderate size crossbars and with arbitrarily low blocking probabilities, which overcome the electrical pin-limitation problem. Consider a three-stage Clos network with moderate size crossbar switches in each stage, where each crossbar switch is bit-serial and eight-dilated.
Each bit-serial Clos network represents an independent bit-plane in our ERC switch, and requires 16 crossbars per stage for three stages. Assume that the effective input loading is one half, i.e., each eight-dilated input port supports four processors, rather than eight. The blocking probability of dilated networks can be read directly from Fig. 3c. According to Fig. 3c, the bit-pb of this bit-plane is below 0.01; in other words, less than one percent of the bit-serial connections will block on average, given a permutation traffic model. (For a uniform random traffic model, the blocking is higher and the required expansion can be recomputed following the method in Section 3.) To achieve a connection-pb of 10^-8, the expansion can be read from Fig. 4a. The required expansion is four bits. Hence, to establish a byte-wide connection, we launch 12 independent bit-serial connection requests into the ERC network. At each output port, a byte-wide connection is established if eight or more bit-serial connections survive.

An 8-to-12 expander is needed at each switch input port, to create 12 independent bit-serial connections from the original eight. Also, a 12-to-8 concentrator is needed at each switch output port, to compact up to 12 bit-serial requests down to an eight-bit datapath. The additional gates needed to implement these components are negligible. (A 12-to-8 concentrator requires about 500 gates, which is negligible compared to the number of gates in the MP. Acknowledgment signals can be used to identify valid bit-serial connections.)

The self-routing dilated crossbar switches can be designed by using the daisy-chain concentrators of Section 4.1. It can be verified that each eight-dilated bit-serial crossbar requires about 1 KBit of memory, which is very small when compared to the memory requirements of a buffered crossbar. (A buffered crossbar supporting ATM cells would require at least one cell buffer per link, or at least 54 KBits of memory.) Due to electrical I/O pin limitations, each IC has enough I/O pins to support only five of these crossbar switches. It follows that the electrical ERC network requires 116 ICs, and has an aggregate bandwidth of 1.4 Terabit/sec. The architecture scales to five, seven, or more stages. The five-stage switch has a bandwidth of ≈ 22.9 Terabit/sec, and the seven-stage switch has a bandwidth of ≈ 367 Terabit/sec.

The proposed architecture addresses two key networking problems, as identified in the NSF report [36]. The problem of fast control of Gigabit/Terabit networks is partially addressed by using the ERC architecture, since it uses very fast self-routing algorithms and is provably robust and immune to congestion (as demonstrated by Theorems 1-4). The problem that existing switch architectures have suboptimal hardware and memory scaling properties is addressed, as the hardware and memory complexity of the proposed architecture scales optimally or nearly optimally. Perhaps the only major remaining problem with all electrical architectures is the large number of wires between stages. This problem can be solved by using optics, as the next design example illustrates.

5.1 Design of an Opto-Electronic Switch Core

The design of an opto-electronic switch core is summarized. To minimize cost, a one-stage switch would be preferable. However, a one-stage switch will require a 1K × 1K crossbar, which is very large. Table 3 illustrates projections for the electrical and optical I/O properties of OEICs (hereafter called ICs). Column 2 is from [18]. Using the daisy-chain concentrator, a 1K × 1K crossbar with byte-wide datapaths will require in excess of 10M gates, exceeding the gate capacity of the ICs. Hence, a three-stage ERC construction, using moderate size crossbars, can be used.
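The IC count quoted for the electrical design can be cross-checked with simple arithmetic. The sketch below is one plausible accounting, not a derivation given in the paper: it assumes that the total number of crossbars is the number of bit-planes (w + z = 12) times the number of stages (3) times the crossbars per stage (16), and that crossbars are packed onto ICs without regard to bit-plane or stage boundaries. The function name and parameters are hypothetical.

```python
def ic_count(bit_planes: int, stages: int, crossbars_per_stage: int,
             crossbars_per_ic: int) -> tuple[int, int]:
    """Return (total crossbars, ICs needed) for an ERC switch core.

    Assumes crossbars can be packed freely onto ICs, so the IC count is the
    ceiling of total crossbars divided by crossbars per IC.
    """
    total = bit_planes * stages * crossbars_per_stage
    ics = -(-total // crossbars_per_ic)  # integer ceiling division
    return total, ics


if __name__ == "__main__":
    # Electrical design: 12 bit-planes, three stages, 16 crossbars per stage,
    # five crossbars per IC (pin-limited).
    print(ic_count(12, 3, 16, 5))  # (576, 116)
```

With these inputs the sketch reproduces the 116 ICs stated above; changing `crossbars_per_ic` shows how the count falls when an IC is limited by gates rather than by pins.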
TABLE 3
PROJECTED CAPACITIES FOR SINGLE CHIP OEICS (BASED ON DATA IN [2], [18])
Columns: Year | Max # Optical I/O | Optical Clock | Max. Optical I/O BW
Recoverable entries (in row order): maximum optical I/O bandwidth 0.6, 2.1, 6, 14, and 25 Tb/s; the optical clock in the last row is 1 GHz.
Note: BW is the product of optical I/O times optical clock divided by two.

It can be verified that each IC has sufficient optical I/O to support a large number of eight-dilated bit-serial crossbar switches. However, each IC is limited by logic gates to implement only about 25 of these crossbar switches, which we assume. It follows that the ERC switch core requires 24 OEICs, and has an aggregate bandwidth of 2.8 Terabit/sec (double the bandwidth of the electrical version, since the optical datapaths operate at the faster optical clock rate). A five-stage network has a bandwidth of ≈ 46 Terabit/sec, and a seven-stage network has a bandwidth of ≈ 734 Terabit/sec. These designs are technologically feasible with existing OEIC technology. The datapaths to and from the switch core can be realized with commercially available parallel fiber ribbons, such as the Motorola OPTOBUS [24]. A field programmable logic device with optical I/O, which can be dynamically programmed to implement dilated crossbars, has been developed [30]. Using these technologies and the proposed design principles, one may design arbitrarily large optical switching networks using multiple stages of moderate size crossbars.

5.2 Comparison with the ALM Network

Arora, Leighton, and Maggs describe a self-routing MultiBenes network which is nonblocking and which has an asymptotically optimal cost of O(N log N) gates and bits of memory [3]. Like the ERC network, the MultiBenes network can be viewed as the concatenation of two multipath delta networks. However, the ALM routing algorithms and nodes are considerably more complex than the proposed ERC schemes. The ALM network requires complicated backtracking routing algorithms and partial packet buffering in the nodes, which render it unattractive for optical implementations.
In addition, to achieve the optimal hardware complexity, the ALM network requires linear cost expanders for which no explicit construction is known. Nevertheless, in order to draw a comparison, we assume that their expanders can be built. The design in [3] describes a network with a path multiplicity of 10, a spacing between active logical input ports of 300, and the use of binary switching nodes. It follows that each stage in an ALM network with 1K active input ports requires at least 300K connection datapaths. Hence, an optical version of the ALM network has a cost which is several orders of magnitude more than the ERC network. In practice, one may be able to improve the performance of the ALM network. However, it is more efficient to apply the ERC design principles on the MultiBenes or Multipath Delta topology, which will yield an optimal or near-optimal network.

6 CONCLUSIONS

Principles for designing practical self-routing nonblocking switching networks, such as those used in ATM switch cores, were proposed. These principles lead to a large class of self-routing nonblocking switching networks, which are based on the concepts of expansion, routing, and contraction. These networks address two research priorities identified in a recent NSF sponsored report: the need for fast algorithms to control the Gigabit and Terabit networks of the future, and the need for optimally scalable switching networks [36]. The proposed space domain constructions yield self-routing nonblocking switching networks with an optimal O(N log N) bits of memory or O(N log N log log log N) logic gates. The proposed time domain construction yields self-routing nonblocking switching networks with an optimal Θ(N log N) bits of memory or Θ(N log N) logic gates. These designs meet Shannon's lower bound on memory requirements established in the 1950s, and they readily scale to large sizes.
The proposed architecture bridges the discrepancies between the best-known theoretical and practical results, and is attractive for both electrical and optical implementations. Fast self-routing switching networks with Terabits of bisection bandwidth can be designed using these principles.

ACKNOWLEDGMENTS

We gratefully acknowledge the comments of the referees which have improved the presentation of the paper. This research was funded by NSERC Canada Grant OPG.

REFERENCES

[1] M. Ajtai, J. Komlos, and E. Szemeredi, "An O(N log N) Sorting Network," Proc. 15th ACM Symp. Theory of Computation, pp. 1-9.
[2] ARPA/COOP/AT&T Hybrid-SEED Workshop Notes, George Mason Univ., July.
[3] S. Arora, F. Leighton, and B. Maggs, "On Line Algorithms for Path Selection in a Nonblocking Network," Proc. ACM Symp. Theory of Computing.
[4] B.D. Alleyne and I. Scherson, "Expanded Delta Networks for Very Large Parallel Computers," Proc. Int'l Conf. Parallel Processing.
[5] A. Bassalygo and M.S. Pinsker, "Complexity of Optimum Nonblocking Switching Network without Reconnections," Problems of Information Transmission, vol. 9.
[6] K.E. Batcher, "Sorting Networks and Their Applications," Proc. Spring Joint Computer Conf.
[7] M.V. Chien and A.Y. Oruc, "Adaptive Binary Sorting Schemes and Associated Interconnection Networks," Proc. Int'l Conf. Parallel Processing.
[8] T.J. Cloonan, G.W. Richards, A.L. Lentine, F.B. McCormick, and J.R. Erickson, "Free-Space Photonic Switching Architectures Based on Extended Generalized Shuffles," Applied Optics, vol. 31, no. 35, pp. 7,471-7,492, Dec.
[9] T.J. Cloonan, "Comparative Study of Optical and Electronic Interconnection Technologies for Large Asynchronous Transfer Mode Packet Switching Applications," Optical Eng., vol. 33, no. 5, pp. 1,512-1,523, May.
[10] G.A. De Biase, C. Ferrone, and A. Massini, "An O(log N) Depth Asymptotically Nonblocking Self Routing Permutation Network," IEEE Trans. Computers, vol. 44, no. 8, pp. 1,047-1,051, Aug.
[11] B. Douglass, "Rearrangeable Three-Stage Interconnection Networks and Their Routing Properties," IEEE Trans. Computers, vol. 42, no. 5, May.
[12] H.S. Hinton, T.J. Cloonan, F.B. McCormick, A.L. Lentine, and F.A.P. Tooley, "Free-Space Digital Optical Systems," Proc. IEEE, vol. 82, no. 11, pp. 1,632-1,649, Nov.
[13] W. Hoeffding, "On the Distribution of the Number of Successes in Independent Trials," Annals of Math. Statistics, vol. 27.
[14] A. Huang and S. Knauer, "Starlite: A Wideband Digital Switch," Proc. Globecom, Dec.
[15] C.Y. Jan and A.Y. Oruc, "Fast Self-Routing Permutation Switching on an Asymptotically Minimal Cost Network," IEEE Trans. Comm., vol. 42, no. 12, Dec.
[16] R. Kannan, H.F. Jordan, K.Y. Lee, and C. Reed, "A Bit-Controlled MultiChannel Time Slot Permutation Network," Proc. Second Int'l Conf. Massively Parallel Processing Using Optical Interconnects.
[17] D.M. Koppelman and A.Y. Oruc, "A Self-Routing Permutation Network," J. Parallel and Distributed Computing, vol. 10.
[18] A.V. Krishnamoorthy and D.A.B. Miller, "Scaling Optoelectronic-VLSI Circuits into the 21st Century: A Technology Roadmap," IEEE J. Selected Topics in Quantum Electronics, vol. 2, no. 1, Apr.
[19] C.P. Kruskal and M. Snir, "The Performance of Multistage Interconnection Networks for Multiprocessors," IEEE Trans. Computers, vol. 32, no. 12, pp. 1,091-1,098, Dec.
[20] F.T. Leighton, Parallel Algorithms and Architectures: Arrays, Trees and Hypercubes. Morgan-Kaufmann.
[21] A.L. Lentine et al., "700 Mb/s Operation of Optoelectronic Switching Nodes Comprised of Flip-Chip-Bonded GaAs/AlGaAs MQW Modulators and Detectors on Silicon CMOS Circuitry," Proc. Conf. Lasers and Electrooptics.
[22] T. Lewis, "The Next 10,000_2 Years: Part 1," Computer, Apr. 1996; Part 2, May.
[23] D. Mitra and R.A. Cieslak, "Randomized Parallel Communications on an Extension of the Omega Network," J. ACM, vol. 34.
[24] Motorola, OPTOBUS Data Sheet, Logic Integrated Circuits Division.
[25] D. Nassimi and S. Sahni, "Parallel Permutation and Sorting Algorithms and a New Generalized Connection Network," J. ACM, 1982.
[26] J.H. Patel, "Performance of Processor-Memory Interconnections for Multiprocessors," IEEE Trans. Computers, vol. 30, no. 10, Oct.
[27] J.L. Hennessy and D.A. Patterson, Computer Architecture: A Quantitative Approach, second edition. San Francisco: Morgan Kaufmann.
[28] Semiconductor Industry Association, The National Technology Roadmap for Semiconductors. San Jose, Calif.: SIA.
[29] C.E. Shannon, "Memory Requirements in a Telephone Exchange," Bell System Technical J.
[30] S. Sherif, T.H. Szymanski, and H.S. Hinton, "Design and Implementation of a Field Programmable Smart Pixel Array," Proc. LEOS '96 Conf. Smart Pixels, Keystone, Colo., Aug.
[31] B. Supmonchai and T.H. Szymanski, "Fast Self-Routing Concentrators for Optoelectronic Systems," submitted.
[32] T.H. Szymanski and V.C. Hamacher, "On the Universality of Multipath Multistage Interconnection Networks," Interconnection Networks, I. Scherson and Youseff, eds., IEEE CS Press.
[33] T.H. Szymanski and C. Fang, "Randomized Routing of Virtual Connections in Essentially Nonblocking log N-Depth Networks," IEEE Trans. Comm., pp. 2,521-2,531, Sept.
[34] T.H. Szymanski and H.S. Hinton, "Reconfigurable Intelligent Optical Backplane for Parallel Computing and Communications," Applied Optics, pp. 1,253-1,268, Mar.
[35] C.D. Thompson, "Generalized Connection Networks for Parallel Processor Intercommunication," IEEE Trans. Computers, vol. 27, no. 12, pp. 1,119-1,125, Dec.
[36] U.S. National Science Foundation, "Research Priorities in Networking and Communications," Report to the NSF Division of Networking and Communications Research and Infrastructure, May 12-14, 1994, Arlington, Va.
[37] E. Upfal, S. Felperin, and M. Snir, "Randomized Routing with Shorter Paths," IEEE Trans. Parallel and Distributed Systems, vol. 7, no. 4, Apr.
[38] L.G. Valiant and G.J. Brebner, "Universal Schemes for Parallel Communications," Proc. 13th Ann. ACM Symp. Theory of Computing.
[39] M. Yamaguchi and K-I Yukimatsu, "Recent Free-Space Photonic Switches," IEICE Trans. Comm., vol. E77B, no. 2, Feb.

Ted H.
Szymanski received his BSc degree in engineering science, and the MASc and PhD degrees in electrical engineering from the University of Toronto. From 1987 to 1991, he was an assistant professor at Columbia University and a principal investigator at the U.S. National Science Foundation Center for Telecommunications Research working on ATM switching networks and WDM optical architectures. He is currently an associate professor and the director of the Microelectronics and Computer Systems Laboratory at McGill University, and a project leader in the Canadian Institute for Telecommunications Research, leading a project on Optical Architectures and Applications. An intelligent optical backplane architecture developed by this project is being constructed in Canada. He is active professionally, and has served on the program committees for the 1998 and 1997 Workshops on Optics in Computer Science, the 1998 and 1997 International Conferences on Massively Parallel Processing using Optical Interconnects, the 1998 International Conference on Optical Computing, the 1997 Innovative Systems on Silicon Conference, the 1995 Workshop on High-Speed Network Computing, and the 1998, 1995, and 1994 Canadian Conferences on Programmable Logic Devices. He has also served as Session Organizer for IEEE Infocom, the International Computing and Communication Conference, and the International Conference on Parallel Processing. His personal interests include snowboarding and cycling, and his research interests include telecommunication and computing architectures, performance modeling, and optical networks. He is a member of the IEEE Computer and Communications societies.


The Reconstruction of Graphs. Dhananjay P. Mehendale Sir Parashurambhau College, Tilak Road, Pune , India. Abstract The Reconstruction of Graphs Dhananay P. Mehenale Sir Parashurambhau College, Tila Roa, Pune-4030, Inia. Abstract In this paper we iscuss reconstruction problems for graphs. We evelop some new ieas lie

More information

Probabilistic Medium Access Control for. Full-Duplex Networks with Half-Duplex Clients

Probabilistic Medium Access Control for. Full-Duplex Networks with Half-Duplex Clients Probabilistic Meium Access Control for 1 Full-Duplex Networks with Half-Duplex Clients arxiv:1608.08729v1 [cs.ni] 31 Aug 2016 Shih-Ying Chen, Ting-Feng Huang, Kate Ching-Ju Lin, Member, IEEE, Y.-W. Peter

More information

Improving Performance of Sparse Matrix-Vector Multiplication

Improving Performance of Sparse Matrix-Vector Multiplication Improving Performance of Sparse Matrix-Vector Multiplication Ali Pınar Michael T. Heath Department of Computer Science an Center of Simulation of Avance Rockets University of Illinois at Urbana-Champaign

More information

A New Search Algorithm for Solving Symmetric Traveling Salesman Problem Based on Gravity

A New Search Algorithm for Solving Symmetric Traveling Salesman Problem Based on Gravity Worl Applie Sciences Journal 16 (10): 1387-1392, 2012 ISSN 1818-4952 IDOSI Publications, 2012 A New Search Algorithm for Solving Symmetric Traveling Salesman Problem Base on Gravity Aliasghar Rahmani Hosseinabai,

More information

Verifying performance-based design objectives using assemblybased vulnerability

Verifying performance-based design objectives using assemblybased vulnerability Verying performance-base esign objectives using assemblybase vulnerability K.A. Porter Calornia Institute of Technology, Pasaena, Calornia, USA A.S. Kiremijian Stanfor University, Stanfor, Calornia, USA

More information

6 Gradient Descent. 6.1 Functions

6 Gradient Descent. 6.1 Functions 6 Graient Descent In this topic we will iscuss optimizing over general functions f. Typically the function is efine f : R! R; that is its omain is multi-imensional (in this case -imensional) an output

More information

Bends, Jogs, And Wiggles for Railroad Tracks and Vehicle Guide Ways

Bends, Jogs, And Wiggles for Railroad Tracks and Vehicle Guide Ways Ben, Jogs, An Wiggles for Railroa Tracks an Vehicle Guie Ways Louis T. Klauer Jr., PhD, PE. Work Soft 833 Galer Dr. Newtown Square, PA 19073 lklauer@wsof.com Preprint, June 4, 00 Copyright 00 by Louis

More information

Provisioning Virtualized Cloud Services in IP/MPLS-over-EON Networks

Provisioning Virtualized Cloud Services in IP/MPLS-over-EON Networks Provisioning Virtualize Clou Services in IP/MPLS-over-EON Networks Pan Yi an Byrav Ramamurthy Department of Computer Science an Engineering, University of Nebraska-Lincoln Lincoln, Nebraska 68588 USA Email:

More information