Scalable crossbar network: a non-blocking interconnection network for large-scale systems

J Supercomput
DOI 10.1007/s11227-014-1319-2

Fathollah Bistouni · Mohsen Jahanshahi

Springer Science+Business Media New York 2014

Abstract Interconnection networks (INs) are widely used in multiprocessor systems to set up connections between nodes such as processors and memory modules. A fundamental problem in INs, long regarded as one of the most challenging issues in this area, is blocking, which degrades network performance and consequently the performance of the whole system. The main option for dealing with this problem is the use of non-blocking crossbar networks. However, there are engineering and scaling difficulties in using these networks in large-scale systems. The number of pins on a VLSI chip cannot exceed a few hundred, which restricts the size of the largest crossbar that can be integrated into a single VLSI chip. A multistage implementation of the crossbar network can resolve this problem, but it comes at a high hardware cost. Therefore, this paper presents a new implementation of the crossbar network, named the scalable crossbar network (SCN), which is non-blocking and copes with the aforementioned scaling problems. In addition, performance analysis results show that SCN outperforms multistage crossbar networks and multistage interconnection networks in terms of terminal reliability, mean time to failure, and system failure rate.

Keywords Parallel computers · Interconnection networks · Blocking problem · Reliability · Scalable crossbar network

F. Bistouni
Department of Information Technology, Qazvin Branch, Islamic Azad University, Qazvin, Iran
e-mail: f_bistouni@qiau.ac.ir

M. Jahanshahi (corresponding author)
Young Researchers and Elite Club, Central Tehran Branch, Islamic Azad University, Tehran, Iran
e-mail: mohsenjahanshahi@gmail.com; mjahanshahi@iauctb.ac.ir

1 Introduction

In the early 1950s, von Neumann proposed a simple, economical design for electronic computers in which a single processing unit is connected to a single memory module. During the 1960s, the problem of parallel computing was partially addressed, as solid-state components reduced the cost of large computing machines. Afterward, very large scale integration (VLSI) evolved, placing thousands of transistors on a single chip. The success of supercomputers in tackling scientific problems such as weather modeling, aerodynamic analysis in aircraft design, and particle physics may be the strongest motivation for the development of parallel computers. Since the 1980s, this technology has played an undeniable role in solving other problems as well [1]. Parallel processors are computer systems that consist of several processing elements connected via interconnection networks [2]. In a generic multiprocessor architecture, many processors are linked through an interconnection network over which they can transmit data among themselves. In these systems, every node has a processor, a share of the main memory, and a cache hierarchy. The processors are connected to the global interconnection through a network interface. Another component of such a system is I/O: I/O devices are often connected to an I/O bus, which is interfaced to the memory modules of the different processors via the interconnection. Therefore, the processor, the memory hierarchy, and the interconnection network are critical components of a parallel system [3-5]. One of the main concerns in parallel computing systems that employ multiple processors is the mechanism for transferring information among the processing elements and memory modules. Interconnection networks are formed by a complex arrangement of switching elements and links, which determines how these components communicate.
As a result, designing high-performance interconnection networks is critical to exploiting the performance of parallel computers [4-13].

Choosing the network topology is the first step in the design of a network. A key to the efficiency of interconnection networks is that communication resources are shared. Instead of creating a dedicated channel between each terminal pair, the interconnection network is implemented with a collection of shared switching elements connected by shared links. The connection pattern of these switching elements defines the network's topology. In fact, the topology of an interconnection network is specified by a set of nodes $N^*$ connected by a set of links $L$. Messages originate and terminate in a set of terminal nodes $N$, where $N \subseteq N^*$. Therefore, a network node may be a terminal node that acts as a source (input) or destination (output) for packets, or it may be a switch node that forwards packets from input ports to associated output ports. In other words, a message is delivered between terminals by making several hops across the shared links and switch nodes from its source terminal to its destination terminal [10,14,15].

For efficient and fair use of the network resources, a message is often divided into packets prior to transmission. A packet is the smallest unit of communication that contains the destination address and sequencing information, which are carried in the packet header. For topologies in which packets may have to traverse some intermediate switching elements, the routing algorithm determines the path selected by a packet to reach its destination. At each intermediate switch, the routing algorithm indicates the next channel to be used. That channel may be selected from a set of possible choices. If all the candidate channels are busy, the packet is blocked and cannot advance; this is known as the blocking or head-of-line (HOL) blocking problem. The main cause of the problem is that communication resources such as switching elements and links are limited and shared in an interconnection network. Efficient routing is also critical to the performance of interconnection networks. When a message or packet header reaches an intermediate switch, a switching mechanism determines how and when the router switch is set; that is, when the input channel is connected to the output channel selected by the routing algorithm. In other words, the switching mechanism determines how network resources are allocated for message transmission [10,14,16-19].

More precisely, a network is said to be non-blocking if it can handle all requests that are a permutation of the inputs and outputs, where a permutation is defined as a request for parallel connections of all $N$ sources to their $N$ corresponding distinct destinations. That is, a dedicated path can be formed from each input to its corresponding output without any conflicts (shared channels). Conversely, a network is blocking if it cannot handle all such requests without conflicts [10,14,20,21]. Therefore, the blocking problem is one of the main factors in choosing an appropriate interconnection topology. Many interconnection topologies have been proposed so far, but only a few of them solve the blocking problem efficiently. For systems with $N$ nodes, the ideal topology would connect those nodes through a single $N \times N$ switch. Such a switch is known as a crossbar.

Fig. 1 A crossbar network of size $N \times M$
Crossbar networks allow any processor in the system to connect to any other processor or memory unit, so that many processors can communicate simultaneously without contention. Therefore, a crossbar network is strictly non-blocking for every permutation of connections. A crossbar can be defined as a switching network with $N$ inputs and $M$ outputs that allows up to $\min\{N, M\}$ one-to-one interconnections without contention. Figure 1 shows an $N \times M$ crossbar network. Usually $M = N$, except for crossbars connecting processors and memory modules. However a crossbar is realized, to avoid drawing the entire schematic each time we depict crossbar switches in a system, we employ the symbol for a crossbar shown in Fig. 2. Where it is clear from context that the box is a crossbar switch, we will omit the X and depict the crossbar as a simple box with inputs and outputs. Many other networks work very hard to achieve the non-blocking property that comes to the crossbar network very easily. If crossbars are trivially non-blocking, then why bother with any other non-blocking network? The main reason is scalability.
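The non-blocking property described above can be illustrated with a minimal sketch. It assumes a toy model in which an $N \times M$ crossbar is just a table of closed crosspoints; the class name `Crossbar` and the helper functions are hypothetical and are not taken from the paper.

```python
from itertools import permutations


class Crossbar:
    """Toy model of an N x M crossbar (hypothetical, for illustration only)."""

    def __init__(self, n_inputs: int, n_outputs: int) -> None:
        self.n_inputs = n_inputs
        self.n_outputs = n_outputs
        self.output_of = {}  # input index -> output index currently connected
        self.input_of = {}   # output index -> input index currently connected

    def connect(self, src: int, dst: int) -> bool:
        """Close crosspoint (src, dst); refuse only if src or dst is busy."""
        if not (0 <= src < self.n_inputs and 0 <= dst < self.n_outputs):
            raise ValueError("port index out of range")
        if src in self.output_of or dst in self.input_of:
            return False
        self.output_of[src] = dst
        self.input_of[dst] = src
        return True


def admits_all_permutations(n: int) -> bool:
    """Exhaustively check that every permutation of n ports can be set up."""
    for perm in permutations(range(n)):
        xbar = Crossbar(n, n)
        if not all(xbar.connect(src, dst) for src, dst in enumerate(perm)):
            return False
    return True


def max_connections(n_inputs: int, n_outputs: int) -> int:
    """Greedily close crosspoints and count how many one-to-one links fit."""
    xbar = Crossbar(n_inputs, n_outputs)
    return sum(
        xbar.connect(src, dst)
        for src in range(n_inputs)
        for dst in range(n_outputs)
    )


if __name__ == "__main__":
    print(admits_all_permutations(4))
    print(max_connections(3, 5))
```

In this model a connection is refused only when the requested input or output is already in use, never because of an internal shared link, which is why every permutation succeeds and the number of simultaneous connections is bounded only by $\min\{N, M\}$.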

Fig. 2 Symbol used for a $3 \times 5$ crossbar network

The number of physical connections of a switch is limited by hardware constraints such as the number of available pins and the wiring area. The advent of VLSI permitted the integration of hardware for thousands of switches into a single chip. However, the number of pins on a VLSI chip cannot exceed a few hundred, which restricts the size of the largest crossbar that can be integrated into a single VLSI chip [10,14,22]. These scaling issues preclude the use of crossbar networks for large network sizes, so a single crossbar can be used only in small-scale multiprocessors. However, there is a reasonable solution for exploiting these networks in large-scale systems: using small-size crossbar networks as building blocks for larger network sizes. According to previous works, this solution can be applied through two different approaches:

(1) Using small-size crossbar networks to build other large-size interconnection networks with a topology different from that of the crossbar network [10,14,20,23-44]. Many topologies have been introduced using this approach, most of which are known as multistage interconnection networks.
(2) Using small-size crossbar networks to build large-size crossbar networks [10,14,45]. Networks designed with this approach are non-blocking, like the crossbar network.

Using the first approach, many alternative topologies have been proposed. Generally, in these topologies, messages may have to traverse several crossbar switches before reaching the destination node. The switches in these networks are usually identical and have traditionally been organized as a set of stages. Each stage (except the input/output stages) is connected only to the previous and next stages using regular connection patterns. Input and output stages are connected to the first and last stages, respectively.
These networks are referred to as multistage interconnection networks (MINs) and have different properties depending on the number of stages and how those stages are arranged. As will be discussed in Sect. 2, this approach can greatly alleviate the scalability problem. However, it generally cannot retain the non-blocking feature of the crossbar network and has great difficulty solving the blocking problem.

Unlike the first approach, which has been studied by many researchers, very little work has been done on the second. However, we believe that developing this approach can lead to topologies that meet most of the performance requirements in this area. Therefore, our focus in this paper is on the second approach. Following this approach and the ideas proposed in [10,14] for building scalable crossbar networks, a new topology called the scalable crossbar network (SCN) is proposed (see Sect. 3), which can tackle the blocking problem efficiently. In addition, the SCN eliminates the scalability issue because it uses small-size crossbar networks as switching elements. Moreover, the SCN's self-routing mechanism is efficient and affordable. Furthermore, performance analysis (see Sect. 4) demonstrates that SCN achieves very reasonable performance in terms of critical parameters such as terminal reliability, mean time to failure, and system failure rate compared with many representative MINs, such as the shuffle-exchange network (SEN), the extra-stage shuffle-exchange network (SEN+), the Benes network, multilayer MINs, and replicated MINs, as well as with the multistage implementation of the crossbar network called the multistage crossbar network (MCN). These networks are introduced in Sect. 2.

The rest of the paper is organized as follows: an overview of related works is provided in Sect. 2. The proposed interconnection topology, SCN, is presented in Sect. 3. In Sect. 4, network performance is extensively analyzed. Finally, Sect. 5 concludes the paper.

2 Related works

As discussed in Sect. 1, the number of pins on a VLSI chip cannot exceed a few hundred, which restricts the size of the largest crossbar that can be integrated into a single VLSI chip. The solution is to use small-size crossbar networks as building blocks for larger network sizes, and it can be implemented in two different ways:

(1) Using small-size crossbar networks to build other large-size interconnection networks with a topology different from that of the crossbar network [10,14,20,23-44].
(2) Using small-size crossbar networks to build large-size crossbar networks [10,14,45].

We discuss the two approaches in detail in Sects. 2.1 and 2.2.

2.1 Using small-size crossbar networks to build other large-size interconnection networks

Many topologies have been proposed using this method. In what follows, important recent works are reviewed. Various banyan-type MIN architectures, such as the omega network, the binary n-cube network, the shuffle-exchange network, and delta networks, have been presented in the literature. Typically, all of these networks are made of small crossbar switches of size $2 \times 2$.
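As an illustration of how a banyan-type network built from $2 \times 2$ switches behaves, the following sketch models destination-tag routing in an omega network, where each stage consists of a perfect shuffle followed by a column of $2 \times 2$ switches. This is a standard textbook model and not code from the paper; the function names `omega_path` and `conflicts` are hypothetical. Because each source-destination pair has exactly one path, two connections that need the same inter-stage link cannot both be established.

```python
def omega_path(src: int, dst: int, n: int) -> list:
    """Return the link index occupied after each of the n stages.

    Each stage applies a perfect shuffle (rotate the n-bit position left by
    one) and then a 2x2 switch that sets the lowest bit to the next
    destination bit, most significant bit first (destination-tag routing).
    """
    mask = (1 << n) - 1
    pos = src
    links = []
    for stage in range(n):
        pos = ((pos << 1) | (pos >> (n - 1))) & mask  # perfect shuffle
        bit = (dst >> (n - 1 - stage)) & 1            # routing bit for this stage
        pos = (pos & ~1) | bit                        # upper or lower switch output
        links.append(pos)
    return links


def conflicts(perm: list) -> list:
    """List (source, stage, link) triples where a link is requested twice."""
    size = len(perm)
    n = size.bit_length() - 1
    if 1 << n != size:
        raise ValueError("network size must be a power of two")
    used = set()
    clashes = []
    for src, dst in enumerate(perm):
        for stage, link in enumerate(omega_path(src, dst, n)):
            if (stage, link) in used:
                clashes.append((src, stage, link))
            used.add((stage, link))
    return clashes


if __name__ == "__main__":
    identity = list(range(8))
    bit_reversal = [0, 4, 2, 6, 1, 5, 3, 7]
    print(conflicts(identity))      # empty list: the identity permutation passes
    print(conflicts(bit_reversal))  # non-empty: some links are requested twice
```

For an $8 \times 8$ network the identity permutation is routed without conflict, whereas the bit-reversal permutation requests some links twice, so it cannot be realized in a single pass.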
These basic networks provide just one path between any source-destination pair using $2 \times 2$ switches as basic elements. The network fails if even one of the switches fails, resulting in poor fault-tolerance.

In [23], a MIN called gamma was proposed. The gamma network is an interconnection network connecting $N$ inputs to $N$ outputs. It consists of $\log_2 N + 1$ stages, numbered from stage 0 to stage $\log_2 N$, with $N$ crossbar switches per stage. The crossbar switches in the first and last stages are of size $1 \times 3$ and $3 \times 1$, respectively, and the intermediate stages have $3 \times 3$ crossbar switches. The stages are linked via power-of-two and identity connections in such a way that redundant paths exist between the input and output terminals. However, the gamma network provides only one path when the source and destination nodes have the same number. Therefore, it provides different levels of fault tolerance to different node pairs and fails to guarantee high fault-tolerance for all source-destination pairs. To improve the fault-tolerance of the gamma network, two new designs of 4-disjoint-path MINs called 4-disjoint gamma interconnection networks (4DGIN-1 and 4DGIN-2) were proposed in [24]. Both designs have $\log_2 N + 1$ stages, from stage 0 to stage $\log_2 N$, and each stage involves $N$ sources and $N$ destinations. These networks use small crossbar switches of sizes such as $2 \times 4$, $4 \times 2$, $2 \times 3$, and $3 \times 2$.

In [25], a fault-tolerant network called CSMIN (combining switches multistage interconnection network) was designed. CSMIN provides two disjoint paths to tolerate a single fault and can dynamically reroute packets between these two paths to resolve collisions. A CSMIN of size $N$ consists of $\log_2 N + 1$ stages labeled from 0 to $\log_2 N$. In this topology, the switches at the first and last stages are $2 \times 4$ and $3 \times 2$ crossbars, respectively. The switches located at stage 1 are $3 \times 3$ crossbars, and each switch located at an intermediate stage is a $4 \times 4$ crossbar. In [26], to eliminate the backtracking penalties of the CSMIN, a new design called the Fault-tolerant Fully-Chained Combining Switches Multi-stage Interconnection Network (FCSMIN) was proposed. The FCSMIN has multiple paths between any source-destination pair to provide better fault-tolerance. The FCSMIN changes one of the original non-straight links of CSMIN at stages 1 to $\log_2 N - 1$ to a chained link. An FCSMIN of size $N$ consists of $\log_2 N + 1$ stages labeled from 0 to $\log_2 N$. As in the CSMIN, all switching elements used in FCSMIN are small crossbar networks. The first stage of FCSMIN is similar to that of CSMIN, with $2 \times 4$ crossbar switches. For stages 1 to $\log_2 N - 1$, each switch of FCSMIN is augmented with a chaining link, and $3 \times 3$ crossbar switches replace the switches at the intermediate stages. The design also removes one of the non-straight links between the last two stages, so that the final stage has $2 \times 1$ crossbar switches.
A new class of fault-tolerant MINs called extra group networks (EGN) was introduced in [27]. An EGN of size $N$ has one $m \times 1$ multiplexer stage (input stage), $\log_2(N/m)$ intermediate stages of $2 \times 2$ crossbar switches, and one $1 \times m$ demultiplexer stage (output stage). There are $\frac{N}{2} + \frac{N}{2m}$ switches in each of the intermediate stages, $N + \frac{N}{m}$ multiplexers in the input stage, and $N + \frac{N}{m}$ demultiplexers in the output stage. Each unique-path network of size $N/m$, together with its associated multiplexers and demultiplexers, is called a group. Therefore, like the previous works, the EGN is also a MIN consisting of small-size crossbar networks.

To further improve fault-tolerance and reliability in the EGN, a new MIN topology named the improved extra group network (IEGN) is proposed in [20]. Like the EGN, the IEGN is a MIN with small-size crossbar networks as switching elements. The IEGN is an EGN with additional auxiliary links. Adding these auxiliary links increases the size of the crossbar switches from $2 \times 2$ to $3 \times 3$. The auxiliary links enhance fault-tolerance and maintain a better level of performance even in the presence of faults. Generally, an IEGN of size $N \times N$ has one $m \times 1$ multiplexer stage (input stage), $\log_2(N/m)$ intermediate stages of $3 \times 3$ crossbar switches, and one $1 \times m$ demultiplexer stage (output stage). There are $\frac{N}{2} + \frac{N}{2m}$ crossbar switches in each of the intermediate stages, $N + \frac{N}{m}$ multiplexers in the input stage, and $N + \frac{N}{m}$ demultiplexers in the output stage.

Reference [28] proposed a new architectural concept called HASIN (hierarchical adaptive switching interconnection network). The HASIN topology uses crossbar switches at the local level and a mesh topology at the global level. In this work, power consumption is reduced because small crossbar switches are used to compose the clusters. As the crossbars have a simple architecture and do not require buffers, their power consumption is much smaller than that of a conventional router architecture. This hierarchical topology not only reduces the number of hops and exploits communication locality, but also provides low power consumption.

Reference [29] is an effort to improve the reliability and fault-tolerance of banyan-type networks such as the shuffle-exchange network (SEN) by introducing a new topology, called the augmented shuffle-exchange network (ASEN), consisting of small crossbar switches. The ASEN is a SEN with one less stage, additional intrastage links called auxiliary links, multiplexers, demultiplexers, and a slightly more complex switching element. An ASEN of size $N \times N$ consists of $\log_2 N - 1$ stages of $N/2$ switching elements. The switches in the final stage are $2 \times 2$ crossbar switches, and the remaining switches in stages 1 through $\log_2 N - 2$ are $3 \times 3$ crossbar switches. There is one $2 \times 1$ multiplexer for each input link of a switch in stage 1 and one $1 \times 2$ demultiplexer for each output link of a switch in stage $\log_2 N - 1$. Therefore, an ASEN of size $N \times N$ has $N$ multiplexers in the input stage and $N$ demultiplexers in the output stage. The network complexity (defined as the number of $2 \times 2$ switching elements) of an $N \times N$ ASEN is given by $\frac{3N}{2}\left(1 + \frac{3}{4}(\log_2 N - 2)\right)$.

Reference [33] introduces a class of fault-tolerant MINs named Augmented Baseline Networks (ABNs). An ABN of size $N \times N$ consists of two identical groups of $N/2$ sources and $N/2$ destinations. The switching elements in the final stage are $2 \times 2$ crossbar switches, and the remaining switches in stages 1 through $\log_2 N - 3$ are $3 \times 3$ crossbar switches. Each source is linked to both groups via multiplexers.
There is one $4 \times 1$ multiplexer for each input link of a switch in stage 1 and one $1 \times 2$ demultiplexer for each output link of a switch in stage $\log_2 N - 2$. In each stage, the crossbar switches can be grouped into conjugate subsets, where each subset is composed of all switches in a particular stage that lead to the same subset of destinations. These switches communicate through the auxiliary links to form a conjugate loop. The conjugate loops are formed in such a way that the two switches that form a loop have their respective conjugate switches in a different loop. This pair of loops is called conjugate loops. The network complexity of an $N \times N$ ABN is equal to $\frac{9N}{8}\left(\frac{16}{9} + (\log_2 N - 3)\right)$.

Another type of MIN is the replicated MIN [20,39]. Replicated MINs enlarge banyan-type MINs by replicating them $L$ times. The resulting MINs are arranged in $L$ layers. An $L$-layer replicated MIN of size $N \times N$ consists of $\log_2 N$ stages of $\frac{LN}{2}$ switching elements. Typically, all switching elements are small crossbar switches of size $2 \times 2$. Packets are received by the inputs of the network and distributed to the layers. There is one $1 \times L$ demultiplexer for each input before stage 1. In addition, there is one $L \times 1$ multiplexer for the $L$ output links of the $L$ switching layers in stage $\log_2 N$ that are connected to each output node. Therefore, an $L$-layer replicated MIN of size $N \times N$ has $N$ demultiplexers in the input stage and $N$ multiplexers in the output stage. The network complexity of an $N \times N$ $L$-layer replicated MIN is $\frac{LN}{4}\left(1 + 2\log_2 N\right)$. In contrast to banyan-type MINs, replicated MINs offer better performance in terms of reliability and multicasting because multiple paths exist for a source-destination pair.

Almost all of the topologies described above are multistage interconnection networks. Since all these topologies are made of small crossbar switches, each of which can be implemented on a single chip, they can largely solve the scalability problem. However, these topologies leave the fundamental blocking problem without an efficient solution. In what follows, we present the arguments on this issue.

MINs can be classified into two main categories:

(1) Single-path (banyan-type) MINs.
(2) Multi-path MINs.

In single-path MINs, there is only one path between each source-destination pair, which minimizes the number of switches and stages. Therefore, a connection between a free source-destination pair is not always possible because of conflicts with existing connections. These networks are also known as blocking MINs. The second type, multi-path MINs, can provide multiple paths between each source-destination pair. These networks are also known as fault-tolerant MINs. Compared with single-path MINs, fault-tolerant MINs can use their multiple paths to reduce conflicts and increase fault-tolerance. However, most fault-tolerant MINs still cannot resolve all conflicts and therefore cannot guarantee a non-blocking state in the network, so most of them are also blocking MINs. Nevertheless, there is hope of solving the blocking problem by using two specific classes of fault-tolerant MINs:

(1) Rearrangeable non-blocking (or simply rearrangeable) MINs, such as the Benes network [43] and $(2n - 1)$-stage shuffle-exchange networks ($n = \log_2 N$) [40-42].
(2) The non-blocking Clos network [44].

A rearrangeable MIN is always capable of connecting all free sources to free destinations in a permutation request, but to accomplish this in some scenarios the existing connections may have to be rearranged; this property is called rearrangeable non-blocking. Although rearrangeable MINs are theoretically capable of operating in a non-blocking mode, in practice there are some underlying issues that make these networks poor candidates for solving the blocking problem.
These issues are as follows:

(a) In rearrangeable networks, reorganizing connections may not be acceptable because applications cannot be interrupted [45].
(b) Rearrangeable networks require a central controller to rearrange connections and were proposed for array processors. Moreover, connections cannot be easily rearranged on multiprocessors because processors access the network asynchronously. So, rearrangeable networks behave like blocking networks when accesses are asynchronous [10].

A network is non-blocking if any permutation can be set up without the need to reroute (or rearrange) any of the connections that are already set up. Therefore, if any free source can be connected to any free destination without altering the path taken by any other traffic, then the network is non-blocking. The best-known example of a non-blocking MIN is the Clos network. The Clos network is a three-stage network in which each stage is composed of a number of crossbar switches (Clos networks with any odd number of stages can be derived recursively from the three-stage Clos by replacing the switches of the middle stage with three-stage Clos networks). A symmetric Clos network is characterized by a triple $(m, n, r)$, where $m$ is the number of middle-stage crossbar switches, $n$ is the number of input (output) ports on each switch in the first (last) stage, and $r$ is the number of crossbar switches in each of the first and last stages. In a Clos network, each middle-stage switch has one input link from every switch in the first stage and one output link to every switch in the last stage. Thus, the $r$ first-stage switches are $n \times m$ crossbars that connect $n$ input ports to $m$ middle switches, the $m$ middle switches are $r \times r$ crossbars that connect $r$ first-stage switches to $r$ last-stage switches, and the $r$ last-stage switches are $m \times n$ crossbars that connect $m$ middle switches to $n$ output ports. Although the Clos network can be non-blocking, it suffers from some issues that make it a poor candidate for solving the blocking problem. These issues are as follows:

(a) It has been proved that a Clos network is non-blocking if $m \ge 2n - 1$ [14,46]. Therefore, it can solve the blocking problem only under certain conditions, and thus it suffers from some limitations.
(b) It usually needs a suitable mechanism or control for assigning the connections, and this mechanism or control is too complex in a Clos network [14,30,45-48]. In routing a circuit in a Clos network, the only free decision is at the first-stage switch, where any of the $m$ middle switches can be chosen as long as the link to that middle switch is available. The middle switch must choose the single link to the output switch (and the route is not possible if this link is busy). Similarly, the last-stage switch must choose the selected output port. Therefore, the problem of routing in a Clos network is the problem of assigning each circuit (or packet) to a middle switch.

In summary, according to the arguments made in this sub-section, MINs are able to solve the scalability problem of the crossbar network and can also provide properties such as fault-tolerance. However, these networks cannot appropriately cope with the blocking problem.

2.2 Using small-size crossbar networks to build large-size crossbar networks

In this case, small-size crossbar networks are used as switching elements to build large-scale crossbar networks. With this technique, first, the blocking problem can easily be solved because, as previously mentioned, crossbar networks have been proven to be strictly non-blocking. Second, the scalability problem can be solved. Although the number of pins on a VLSI chip cannot exceed a few hundred, large crossbars can be realized by partitioning them into smaller crossbars, each one implemented using a single chip.
Therefore, this technique is a much more efficient approach than the one discussed in Sect. 2.1. The question that arises is whether previously designed topologies are based on this approach. To the best of our knowledge, although some ideas have been proposed [10,14], the number of practical solutions in this field is almost zero. In other words, much less practical effort has been devoted to this approach than to the first one. A multistage implementation of the crossbar architecture called the multistage crossbar network (MCN), consisting of small-size crossbar networks as switching elements, is one of the rare examples of such efforts [45] and is introduced next. Figure 3 shows an MCN of size $4 \times 4$, which is built using a number of $2 \times 2$ crossbar switches. In general, an $N \times N$ MCN can be realized with $N^2$ crossbar switches of size $2 \times 2$. This type of network can provide multiple paths for some source-destination pairs. Also, unlike in most networks, the path length (defined as the number of switching elements between a source-destination pair) is not identical for different source-destination pairs in this topology; it can vary between 1 and $2N - 1$ switching elements. Unlike the typical crossbar network (Fig. 1), whose port count is limited by the pin count, such crossbar-based networks are useful for building large crossbar networks and improve scalability because they use small-size crossbar switches as switching elements.

Fig. 3 An MCN of size $4 \times 4$

Although this structure can satisfy the scalability requirement, it has a major disadvantage compared to the typical crossbar network: higher hardware cost. The cost of an interconnection network can be calculated according to the number of crosspoints within its switches [20,33,39,47]. Since there are $N^2$ crosspoints in a crossbar network, its cost is equal to $N^2$. However, since the MCN has $N^2$ crossbar switches of size $2 \times 2$, its cost is equal to $4N^2$. In other words, the MCN is four times more expensive than the typical crossbar network, which is non-negligible, especially in large-scale systems.

According to the arguments made in this sub-section, the approach of using small-size crossbar networks to build large-size crossbar networks is a more appropriate technique for designing scalable non-blocking topologies than the first approach (described in Sect. 2.1). However, as mentioned before, little work has been done so far to exploit this approach. Also, although the MCN, which is built on the basis of this approach, can solve the problems of scalability and blocking, it imposes a higher hardware cost than the typical crossbar network. Therefore, our motivation in this paper is to provide a new interconnection topology based on this approach, named the scalable crossbar network (SCN), that meets the following main requirements:

(1) It is non-blocking.
(2) It is made up of small-size crossbar switches in order to overcome the limit on the number of pins on a single VLSI chip.
(3) It does not impose a higher hardware cost than the typical crossbar network.

As will be discussed in Sect. 3, the SCN meets all these requirements. In addition, it has an efficient routing mechanism that is fast, affordable, and self-routing. Also, the performance analysis in Sect. 4 will show that the SCN performs very well in terms of key parameters such as terminal reliability, mean time to failure, and system failure rate compared to various interconnection networks such as SEN, SEN+, the Benes network, replicated MINs, multilayer MINs, and MCN.
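The crosspoint-based cost comparison made in Sect. 2.2 can be reproduced with a short sketch. It assumes only the figures stated in the text: a single $N \times N$ crossbar has $N^2$ crosspoints, and an $N \times N$ MCN uses $N^2$ switches of size $2 \times 2$. The function names are hypothetical.

```python
def crossbar_crosspoints(n: int) -> int:
    """Crosspoints in a single n x n crossbar switch."""
    return n * n


def mcn_crosspoints(n: int) -> int:
    """Crosspoints in an n x n MCN built from n^2 switches of size 2 x 2."""
    switches = n * n
    crosspoints_per_switch = 2 * 2
    return switches * crosspoints_per_switch


def cost_table(sizes):
    """Return (size, crossbar cost, MCN cost) rows for the given sizes."""
    return [(n, crossbar_crosspoints(n), mcn_crosspoints(n)) for n in sizes]


if __name__ == "__main__":
    for n, xbar, mcn in cost_table([4, 64, 1024]):
        print(f"N={n}: crossbar={xbar}, MCN={mcn}, ratio={mcn // xbar}")
```

The ratio is 4 for every network size, which is the fourfold cost overhead of the MCN noted above.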

3 Scalable crossbar network (SCN)

A typical crossbar network (shown in Fig. 1) can be seen as a network that contains only one switching element. In other words, there is only one crossbar switch between each source-destination pair in a typical crossbar network. This structure brings great benefits to a topology. Having only one switching element between each source-destination pair increases the transmission speed between the nodes: the path length, which is directly related to latency, is always equal to 1. Also, the non-blocking state can be guaranteed for every permutation of connections, because a permutation is defined as a request for parallel connections from all N sources to their N corresponding distinct destinations. In addition, unlike the MCN, in which a large number of switches contribute to the cost, the hardware cost of the network is just the number of crosspoints in a single switch. As a result, the cost of such a network is lower than the cost of the MCN. Despite the above advantages, a typical single-switch crossbar network suffers from a major problem: the limit on the number of pins on a VLSI chip, which in turn limits the size of the network.

Given the above arguments, using only one switching element between each source-destination pair achieves many advantages, especially for non-blocking operation. Therefore, we will utilize this idea to design a new non-blocking topology, the SCN. However, as noted, we are faced with a major problem, scalability. To deal with this engineering problem, we will use a number of small-size crossbar switches rather than a single large-size switch, so that each switch supports a certain group of sources and destinations. Therefore, in this new structure, we need a group of resources (such as switching elements and links) in order to assign them to different nodes.
In other words, to meet the basic objectives of non-blocking operation, scalability, and low cost, the new topology combines several methods, as follows: (1) Similar to the typical crossbar network, there is only one switch between each source-destination pair. (2) A number of small-size crossbar switches are used as switching elements. (3) Network resources are grouped so as to allocate them fairly to different nodes, to facilitate the routing mechanism, and to keep only one switch between each source-destination pair. The first two methods have already been used in the literature: the first in the crossbar network and the second in networks such as MINs and the MCN. However, to the best of our knowledge, the third method is new and has not appeared in any previous work on the design of non-blocking networks. The combination of these three methods is a further contribution of this work.

In the following, we describe the topological structure of the SCN. Figure 4 shows an SCN of size 8 × 8, which is built using a number of 2 × 2 crossbar switches (for better understanding, the connections to source 0 are highlighted). An N × N SCN requires $N/C$ switching groups, each group containing $N/C$ crossbar switches of size C × C. For instance, with $N = 8$ and $C = 2$ (Fig. 4), an 8 × 8 SCN requires four switching groups, each containing four crossbar switches of size 2 × 2. The groups are labeled $G_i$, $i = 1, 2, \ldots, N/C$, and the switching elements are labeled $SE_{i,j}$, where $i = 1, 2, \ldots, N/C$ is the group number and $j = 1, 2, \ldots, N/C$ is the switch number. In this network, each source is connected to one switch in each group. Therefore, each source is connected to $N/C$ switches. For example, in Fig. 4, the source 000 (0) is connected to the four switches $SE_{1,1}$, $SE_{2,1}$, $SE_{3,1}$, and $SE_{4,1}$. Also, each of these switches is connected to two destinations that are distinct from those of the other switches: $SE_{1,1}$ is connected to destinations 000 and 001, $SE_{2,1}$ to destinations 010 and 011, $SE_{3,1}$ to destinations 100 and 101, and $SE_{4,1}$ to destinations 110 and 111. In other words, each switching group supports $C$ destinations, and all the switching groups together cover all network destinations. Therefore, the SCN is designed such that there is only one intermediate switch between each source-destination pair.

Fig. 4 An SCN of size 8 × 8

Moreover, this network is designed in such a way that each switch of size C × C carries the traffic of only $C$ sources. Since, in a permutation of connections, no two sources request the same destination at the same time, all $C$ sources can be connected to their destinations by a C × C crossbar switch without collision. Therefore, the SCN eliminates the possibility of blocking, and with this structure we achieve our main goal, namely the design of a non-blocking network. On the other hand, the SCN is a network made of small-size crossbar switches. As a result, the scalability problem is also solved by this technique.

Both of these features (i.e., non-blocking operation and scalability) were already present in the MCN. However, as discussed in Sect. 2, its main problem was a higher hardware cost compared with the typical crossbar network. The question that arises here is: what is the cost of the SCN? As discussed in Sect. 2, the cost of a network can be calculated from the number of crosspoints within a switching element and the number of switching elements within the network [20,33,39,45,47]. Accordingly, the costs of the crossbar network and the MCN are $N^2$ and $4N^2$, respectively. Since the SCN comprises $N/C$ switching groups and there are $N/C$ switches of size C × C in each group, the total number of switches is $N^2/C^2$. Also, since these switches are of size C × C, each has $C^2$ crosspoints, and the cost of the SCN is $(N^2/C^2) \cdot C^2 = N^2$. Therefore, the cost of the SCN is one quarter of the cost of the MCN, and it is equal to the cost of the crossbar network.

Routing in the SCN is self-routing. In a self-routing procedure, the switches examine the destinations of their input data and set themselves accordingly. No central routing hardware is needed [20,49,50].
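The grouping and cost argument above, together with the two-phase self-routing rule that the following paragraphs spell out for 2 × 2 switches (group number from the high-order destination bits, switch state from the exclusive-OR of the low-order source and destination bits), can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, and the position of a source's switch inside a group (`source >> 1`) is an assumption that is merely consistent with source 000 attaching to $SE_{i,1}$ in Fig. 4.

```python
from itertools import permutations


def scn_cost(n: int, c: int) -> int:
    """Crosspoints in an n x n SCN built from c x c crossbar switches."""
    groups = n // c               # N/C switching groups
    switches_per_group = n // c   # N/C switches in each group
    return groups * switches_per_group * c * c


def route(source: int, dest: int):
    """Two-phase self-routing for C = 2: (group number, switch state)."""
    group = (dest >> 1) + 1              # bits d_n..d_2 read as an integer, plus 1
    state = (source & 1) ^ (dest & 1)    # s_1 XOR d_1
    return group, "exchange" if state else "straight"


def states_consistent(perm) -> bool:
    """Check that two sources sharing a 2 x 2 switch never need different states.

    ASSUMPTION: sources 2k and 2k+1 share switch k+1 of every group.
    """
    chosen = {}
    for source, dest in enumerate(perm):
        group, state = route(source, dest)
        key = (group, source >> 1)
        if chosen.setdefault(key, state) != state:
            return False
    return True


if __name__ == "__main__":
    print(scn_cost(8, 2))          # crosspoints of the 8 x 8 SCN of Fig. 4
    print(route(0b101, 0b100))     # source 101 to destination 100
    print(all(states_consistent(p) for p in permutations(range(8))))
```

The last line enumerates every permutation of an 8 × 8 network and checks that a shared switch is never asked to be in both states at once, which is the condition a two-state 2 × 2 element needs in order to serve both of its inputs simultaneously.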
The routing tag consists of binary digits that control the connection through the different switching groups from input to output. Let the source S and destination D be represented in binary as $S = s_n \ldots s_1$ and $D = d_n \ldots d_1$, where $n = \log_2 N$. Routing in the SCN consists of two phases: (1) Determine the switching group number. (2) Determine the switch state (straight or exchange).

To send a message from a source to a destination in the SCN, it is first necessary to determine the number of the switching group (GN) in which the destination is located. This is calculated from the destination tag bits as follows:

$$GN = \left( \sum_{i=0}^{(\log_2 N) - 2} d_{i+2} \, 2^{i} \right) + 1$$

For example, consider the destination 110 ($d_3 = 1$, $d_2 = 1$, $d_1 = 0$). The group number for this destination is $GN = \left( \sum_{i=0}^{1} d_{i+2} \, 2^{i} \right) + 1 = (d_2 \cdot 2^0 + d_3 \cdot 2^1) + 1 = (1 + 2) + 1 = 4$.

After determining the group number, the next step is to specify the switch state. The switch state specifies whether the message entering the switch should be sent to the upper output or the lower output. If the incoming message leaves through the output port at the same position as its input port, the state is called straight. Otherwise, the state is exchange. These two modes are shown in Fig. 5, assuming both input ports are working. Let the upper input and output lines be labeled i and the lower input and output lines be labeled j. (1) Straight: input i to output i, input j to output j; (2) Exchange: input i to output j, input j to output i.

Fig. 5 States of switching elements

The switch state (SS) in the SCN can be obtained from the source and destination tag bits, as follows:

$$SS = \begin{cases} \text{Straight}, & s_1 \oplus d_1 = 0 \\ \text{Exchange}, & s_1 \oplus d_1 = 1 \end{cases}$$

For instance, consider the source 101 ($s_3 = 1$, $s_2 = 0$, $s_1 = 1$) and destination 100 ($d_3 = 1$, $d_2 = 0$, $d_1 = 0$). Then $GN = \left( \sum_{i=0}^{1} d_{i+2} \, 2^{i} \right) + 1 = (d_2 \cdot 2^0 + d_3 \cdot 2^1) + 1 = (0 + 2) + 1 = 3$, so the message is routed through the third switching group, and since $s_1 \oplus d_1 = 1 \oplus 0 = 1$, the switch state is exchange. To better appreciate this, the routing values for source 000 and all destinations are summarized in Table 1.

Table 1 The routing values for source 000

Source   Destination   Group number   s_1 ⊕ d_1   Switch state
000      000           1              0           Straight
000      001           1              1           Exchange
000      010           2              0           Straight
000      011           2              1           Exchange
000      100           3              0           Straight
000      101           3              1           Exchange
000      110           4              0           Straight
000      111           4              1           Exchange

Also, for further discussion, the routing values for the permutation

$$P = \begin{pmatrix} 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 1 & 3 & 5 & 7 & 2 & 4 & 6 & 0 \end{pmatrix}$$

are as follows, where in each case $GN = \left( \sum_{i=0}^{1} d_{i+2} \, 2^{i} \right) + 1 = (d_2 \cdot 2^0 + d_3 \cdot 2^1) + 1$:

Source 000 and destination 001: $GN = (0 + 0) + 1 = 1$ and $s_1 \oplus d_1 = 0 \oplus 1 = 1$
Source 001 and destination 011: $GN = (1 + 0) + 1 = 2$ and $s_1 \oplus d_1 = 1 \oplus 1 = 0$
Source 010 and destination 101: $GN = (0 + 2) + 1 = 3$ and $s_1 \oplus d_1 = 0 \oplus 1 = 1$
Source 011 and destination 111: $GN = (1 + 2) + 1 = 4$ and $s_1 \oplus d_1 = 1 \oplus 1 = 0$
Source 100 and destination 010: $GN = (1 + 0) + 1 = 2$ and $s_1 \oplus d_1 = 0 \oplus 0 = 0$
Source 101 and destination 100: $GN = (0 + 2) + 1 = 3$ and $s_1 \oplus d_1 = 1 \oplus 0 = 1$
Source 110 and destination 110: $GN = (1 + 2) + 1 = 4$ and $s_1 \oplus d_1 = 0 \oplus 0 = 0$
Source 111 and destination 000: $GN = (0 + 0) + 1 = 1$ and $s_1 \oplus d_1 = 1 \oplus 0 = 1$

4 Performance analysis

In this section, the performance of the SCN is evaluated against known networks, namely SEN, SEN+, Benes network, two-layer replicated MIN, multilayer MIN 1248, multilayer MIN 1888, and MCN. (The digits in the names of the multilayer MINs refer to the number of layers at stage 1, stage 2, etc. The start replication factor $G_S$, growth factor $G_F$, and layer limit factor $G_L$ are defined for each network as follows: network 1248: $G_S = 2$, $G_F = 2$, and $G_L = 8$; network 1888: $G_S = 2$, $G_F = 8$, and $G_L = 8$.) These networks cover a wide range of topology types, which is why we chose them for comparison. The SEN is a blocking network. The SEN+ is a fault-tolerant network. The Benes network is a rearrangeable network. The replicated MIN is a network with a certain structure and high efficiency. Also, it has been shown in [39] that both multilayer networks 1248 and 1888 perform well. Moreover, the MCN is a non-blocking network.

Generally, reliability is defined by the IEEE as the ability of a system or component to perform its required functions under stated conditions for a specified period of time [51]. In the domain of interconnection networks, many researchers have therefore come to regard it as the most immediate parameter for any efficient network topology [20,24,29,31–35,52–56].
In addition, reliability analysis is a mathematical description of a system and can give exact information about the performance of the system. An interconnection network should be able to deliver information reliably. Interconnection networks can be designed for continuous operation in the presence of a limited number of faults. In other words, in an interconnection network, reliability is a measure of how often the network correctly performs the task of delivering messages. In most situations, messages need to be delivered 100 % of the time without loss.

A network can be defined as a collection of nodes and links (known as vertices and edges in graph theory, respectively) in which some particular nodes are called terminals [57,58]. The reliability of a network is defined as the probability that a certain set of terminal nodes is connected. This connection is achieved when there is at least one fault-free path between the nodes. If the connection is achieved, the network is in the up state; otherwise, it is in the down state [20,32,35,57,59–62]. However, connectivity analysis is very challenging in the case of complex networks. Complex networks consist of multiple source and destination nodes, a complex topology, interdependencies at the component and system levels, and uncertainties in the actual conditions of network components and in deterioration models [59,60]. According to this definition, lifeline networks such as electrical and gas networks [59,60,63], wireless mobile ad hoc networks (MANETs) [64], wireless mesh networks [65–68], wireless sensor networks [69,70], sensors based on nanowire networks [71], social networks [72], stochastic-flow manufacturing networks (SMNs) [73], and interconnection networks [10,14,20,32,35] are known as complex network systems from the viewpoint of reliability.

According to the reported research, the reliability of complex networks can be investigated by simulation or by analytical models. Although simulation-based approaches are easily implemented, there are some restrictions on their effectiveness. For instance, the number of simulations must be large enough to provide a comprehensive study, which can be extremely time-consuming. Furthermore, simulation presents a small range of results compared with analytical methods. Analytical methods have often been avoided because of their complexity, in favor of the simplicity of simulation. However, analytical methods based on reliability equations give an exact solution for the reliability of a system, which eliminates the time-consuming calculations and the non-repeatability of the simulation methodology. Given the reliability equation of a system, further analyses can be performed, such as computing exact values of the reliability, the failure rate at specific points in time, and the system MTTF (mean time to failure). In addition, reliability optimization techniques can be utilized to support design improvement efforts.

Given the above arguments, it can be concluded that reliability is a key parameter in network performance, and that analytical methods should be used in reliability analysis to obtain an exact solution for the reliability of a system. Therefore, the reliability parameter will be studied carefully in this paper by the analytical method.
An important measure of reliability, which is of interest to many researchers, is the terminal reliability [20,24,31,34,54,74]. As a result, our focus in this paper is on analyzing the terminal reliability. Another important metric that we should investigate is the time that a system is available, often referred to as uptime in the IT industry. The length of time that a system is online between outages or failures can be thought of as the time to failure of that system. The mean time to failure (MTTF) is the expected value of the time to failure. Because of the crucial nature of this parameter, it will also be analyzed in this paper as one of the most important performance metrics.

In many cases, the primary objective in system performance analysis is to obtain the failure distribution of the entire system from the failure distributions of its components. The parameter that can be used to study these cases is the system failure rate. However, this parameter has been investigated in almost none of the previous works. Therefore, for a deeper analysis of the performance of the networks, this important parameter will be analyzed as well.

Moreover, the cost-effectiveness parameter, which is emphasized in most reported works [4,27,32,33,35], can be used to assess the performance of MINs in terms of cost. Therefore, we will discuss this parameter to give a comprehensive review of the hardware costs of the networks.

Here, the switch fault model will be used for the reliability analysis of the MINs; therefore, it will be assumed that each switching component (i.e., switching elements, multiplexers, and demultiplexers) may fail. In addition, the Weibull distribution is one of the most commonly used distributions in reliability. Hence, we assume that the time to failure of the switching components is described by a Weibull distribution with three parameters: the lifetime variable t, the characteristic life or scale parameter η, and the slope or shape parameter β.

4.1 Mathematical analyses

The terminal reliability, denoted $R(t)$, is defined as the probability of successful communication between a source-destination pair. In this paper, we assume that $r(t)$ is the probability that a 2 × 2 switching element ($SE_{2 \times 2}$) is operational. Also, given the number of gates in switching components of different sizes, their operational probability can be derived from $r(t)$ [4,32,33]. It is also assumed that the hardware complexity of a component is directly proportional to its number of gates [4,33].

In the SCN, there is only one switching element between each source-destination pair. Therefore, we have:

$$R_{SCN}(t) = r(t) \tag{1}$$

Also, for the SEN and SEN+, we have:

$$R_{SEN}(t) = r(t)^{\log_2 N} \tag{2}$$

$$R_{SEN+}(t) = r(t)^2 \left(1 - \left(1 - r(t)^{(\log_2 N) - 1}\right)^2\right) \tag{3}$$

For the 8 × 8 Benes network, the terminal reliability is given by:

$$R_{8 \times 8\,Benes}(t) = r(t)^2 \left(1 - \left(1 - r(t)^2 \left(1 - (1 - r(t))^2\right)\right)^2\right) \tag{4}$$

The terminal reliability of the N × N Benes network is calculated as follows:

$$R_{N \times N\,Benes}(t) = r(t)^2 \left(1 - \left(1 - R_{\frac{N}{2} \times \frac{N}{2}\,Benes}(t)\right)^2\right) \tag{5}$$

Also, for the two-layer replicated MIN and the multilayer MINs, we have:

$$R_{replicated}(t) = r(t) \left(1 - \left(1 - r(t)^{\log_2 N}\right)^2\right) \tag{6}$$

$$R_{network1248}(t) = r(t)^4 \left(1 - \left(1 - r(t)^2 \left(1 - \left(1 - r(t)^2 \left(1 - \left(1 - r(t)^{(\log_2 N) - 3}\right)^2\right)\right)^2\right)\right)^2\right) \tag{7}$$

$$R_{network1888}(t) = r(t)^{10} \left(1 - \left(1 - r(t)^{(\log_2 N) - 1}\right)^8\right) \tag{8}$$

In the case of the MCN, as discussed in Sect. 2, the path length (defined as the number of switching elements between a source-destination pair) is not identical for different source-destination pairs.
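As a brief aside before the MCN case is developed, Eqs. (1)–(8) can be evaluated numerically with a short sketch such as the one below. It is only an illustration under stated assumptions: the function names are hypothetical, the switch reliability uses the standard two-parameter Weibull form $r(t) = e^{-(t/\eta)^\beta}$, the values of η and β in the usage lines are arbitrary placeholders rather than values from the paper, and the base case of the Benes recursion (a 2 × 2 network is a single switch) is inferred from Eq. (4).

```python
import math


def weibull_r(t: float, eta: float, beta: float) -> float:
    """Standard Weibull reliability exp(-(t/eta)^beta) of one 2 x 2 switch."""
    return math.exp(-((t / eta) ** beta))


def log2_int(n: int) -> int:
    """log2 of a power-of-two network size."""
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two >= 2")
    return n.bit_length() - 1


def r_scn(r: float, n: int) -> float:            # Eq. (1)
    return r


def r_sen(r: float, n: int) -> float:            # Eq. (2)
    return r ** log2_int(n)


def r_sen_plus(r: float, n: int) -> float:       # Eq. (3)
    return r**2 * (1 - (1 - r ** (log2_int(n) - 1)) ** 2)


def r_benes(r: float, n: int) -> float:          # Eqs. (4) and (5)
    if n == 2:                                   # assumed base case: one switch
        return r
    return r**2 * (1 - (1 - r_benes(r, n // 2)) ** 2)


def r_replicated(r: float, n: int) -> float:     # Eq. (6)
    return r * (1 - (1 - r ** log2_int(n)) ** 2)


def r_network1248(r: float, n: int) -> float:    # Eq. (7), meaningful for n >= 8
    a = 1 - (1 - r ** (log2_int(n) - 3)) ** 2
    c = 1 - (1 - r**2 * a) ** 2
    e = 1 - (1 - r**2 * c) ** 2
    return r**4 * e


def r_network1888(r: float, n: int) -> float:    # Eq. (8)
    return r**10 * (1 - (1 - r ** (log2_int(n) - 1)) ** 8)


if __name__ == "__main__":
    r = weibull_r(t=1000.0, eta=10000.0, beta=1.5)   # placeholder parameters
    for f in (r_scn, r_sen, r_sen_plus, r_benes,
              r_replicated, r_network1248, r_network1888):
        print(f.__name__, f(r, 8))
```

Because every expression other than Eq. (1) multiplies $r(t)$ by further factors that are at most 1, the value returned by `r_scn` bounds the others from above for any $0 \le r(t) \le 1$ and $N \ge 8$.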
In this network, the path length can vary between 1 and $(2N - 1)$ switching elements. This makes the reliability vary between different sources and destinations. To solve this, we will consider an average