Design and Evaluation of a Fault-Tolerant Adaptive Router for Parallel Computers

Tsutomu YOSHINAGA, Hiroyuki HOSOGOSHI, Masahiro SOWA
Graduate School of Information Systems, University of Electro-Communications, Chofu-shi, Tokyo, Japan.
yosinaga@is.uec.ac.jp

Abstract

In this paper, we propose a design methodology for fault-tolerant adaptive routers for parallel and distributed computers. The key idea of our method is to integrate minimal and non-minimal routing, each supported by an independent set of virtual channels (VCs). Distinguishing the routing functions for each set of VCs simplifies the design of fault-tolerant algorithms. After describing the method, we apply it to a routing algorithm for two-dimensional mesh and torus networks. This algorithm, called Detour-NF, supports three routing modes: deterministic, minimal fully adaptive, and non-minimal fault-tolerant operation. We also discuss the hardware cost and operational speed of minimal and non-minimal routers based on our design, which is written in a hardware description language (HDL). Communication performance and fault-tolerance are demonstrated by an HDL simulation. The experimental results show that supporting both minimal and non-minimal routing modes is advantageous for high-bandwidth and low-latency communication, as well as for fault-tolerance.

Keywords: fault-tolerance, adaptive router, non-minimal routing algorithm, hardware design, communication performance

1. Introduction

Recent high-performance parallel and distributed computers consist of thousands of nodes or processing elements. For such large systems, communication performance as well as fault-tolerance is important. Adaptive routing is an approach to obtaining high-bandwidth and low-latency communication by dynamically selecting message paths based on the network state. By avoiding congested areas or faults, messages can be routed more smoothly than with deterministic routing.
In spite of this potential, adaptive routers have not been widely used in real computer systems. One reason is that adaptive routers do not preserve in-order message delivery. Since a software implementation of in-order delivery involves a large overhead, system designers often adopt a deterministic routing algorithm. However, if only a limited number of messages require in-order delivery and the rest can be routed without this constraint, deterministic routing for all messages is unnecessarily strict. A solution is to support both adaptive and non-adaptive routing modes. If a user could choose the preferred routing mode on a per-message or per-application basis, the system could take advantage of both adaptive routing and in-order delivery.

Adaptive routing algorithms can be classified into two categories: minimal and non-minimal path selection [4]. Minimal adaptive algorithms are usually used to increase performance by improving the utilization of network resources, such as channels. However, they do not provide fault-tolerance. Therefore, non-minimal adaptive routing is required for reliable communication in the presence of faults. Once a non-minimal adaptive routing algorithm is supported, it is not difficult to use the router as a minimal router by restricting path selection to the minimal paths.

Generally, non-minimal routers require more virtual channels (VCs) than minimal routers for deadlock prevention, even though the additional VCs may rarely be used when there are no faults. In the static fault model (that is, a model in which faults can be detected before computation starts and no dynamic faults appear during computation), these VCs can be utilized more effectively under the minimal routing constraint. Our goal is to design a fault-tolerant adaptive router that can switch its routing operation mode without wasting hardware resources such as VCs.
In this paper, we assume the static fault model. First, we consider the design methodology of non-minimal routing algorithms. Then, we show an example design of a non-minimal adaptive router. We also analyze the hardware cost and performance of our design using a hardware description language (HDL).

2. Related Works

There have been several studies that combine deterministic and adaptive routing modes in a router. Fulgham and Snyder proposed the Triplex router, which supports oblivious, minimal and non-minimal fully adaptive routing classes [7]. This router basically provides a non-minimal class for congested networks. Millar and Najjar designed a hybrid deterministic and adaptive router [11]. This router tries to combine low-latency deterministic routing at low traffic with the flexibility of adaptive routing at high traffic. Fault-tolerance is not the main issue in either of these designs.

Most fault-tolerant adaptive routing algorithms for direct networks have been discussed for k-ary n-cubes and wormhole switching [3]. Linder and Harden proposed an adaptive and fault-tolerant wormhole routing algorithm using the concept of a virtual network [9]. Their algorithm requires multiple VCs per physical channel, plus additional VCs to support faulty channels. Glass and Ni proposed the turn model [8], which tolerates a limited number of faults in an n-dimensional mesh. The turn model yields partially adaptive routing algorithms that do not perform well in fault-free cases [1]. Boppana and Chalasani proposed fault rings and fault chains, which mark the boundaries of faulty components. The fault rings and fault chains can handle block faults on a two-dimensional mesh with four VCs per physical channel. Suh proposed software-based rerouting as a cost-effective alternative to the hardware solution [13], although its performance is relatively low.

Duato proposed a methodology for the design of fault-tolerant, fully adaptive routing algorithms [5]. He defined the redundancy level of a network, which represents the maximum number of simultaneous faults. In his methodology, a designer sets the redundancy level first.
Then, a fault-tolerant routing function is defined by repeating two steps: adding VCs to extend the non-minimal routing function, and removing redundant output channels by checking whether or not the extended channel dependency graph is acyclic. With Duato's methodology, we can generally design an appropriate fault-tolerant, fully adaptive routing algorithm, although the procedure is complex. In this paper, we propose an alternative methodology for designing fault-tolerant, fully adaptive routing algorithms. This design method is simpler than Duato's method in that it distinguishes minimal and non-minimal VCs.

3. Design Methodology

To preserve simple switchability between minimal and non-minimal routing modes, our design method for fault-tolerant adaptive routing algorithms starts from an existing minimal fully adaptive routing function R. Normally, R requires a few VCs per physical channel to prevent deadlocks. Since R supplies minimal paths to messages, we call these VCs minimal VCs. The design steps for adding fault-tolerance to R are as follows:

1. Add one or more VCs to each physical channel. We call these additional VCs non-minimal VCs.

2. Define a routing function R1 which specifies how to use the non-minimal VCs. Once a message enters a non-minimal VC, R1 supplies only non-minimal VCs to the misrouted message until it is delivered to its destination. Therefore, there is no channel dependency from the non-minimal VCs to the minimal VCs. The channel dependency graph of R1, which is constructed from the non-minimal VCs, must be acyclic to avoid deadlocks.

3. Combine the original minimal fully adaptive routing function R and the non-minimal routing function R1 such that a message follows a minimal path supplied by R whenever possible. When the message is blocked by a faulty channel and there is no alternative minimal path to its destination, it is misrouted onto a non-minimal path supplied by R1.

4. Finally, if necessary, add 180-degree turns to support backtrack channels when messages switch their routing function from R to R1.

The number of additional non-minimal VCs and the routing function R1 determine the fault-tolerance ability. A larger number of non-minimal VCs may handle a wider variety of fault patterns, whereas fewer non-minimal VCs are cost-efficient for a network that is rarely faulty. The new routing function obtained by combining R and R1 is deadlock-free, because R and R1 are independently deadlock-free on the virtual networks formed by the minimal and non-minimal VCs, respectively. Livelock can be avoided by limiting the number of misrouting hops. The final step of the method is not always necessary, but it increases the misrouting flexibility when only local fault information is available.

4. Routing Algorithm

This section presents a fault-tolerant adaptive routing algorithm for two-dimensional mesh and torus networks as an application of the method described in Section 3. For the minimal fully adaptive routing function R, we use Duato's Protocol, which requires two and three VCs per physical channel for meshes and tori, respectively [6]. One VC is an adaptive VC, and the remaining one (in the mesh) or two (in the torus) are escape VC(s). By restricting the adaptive VC to non-adaptive use, the router can easily operate as a deterministic router that preserves in-order message delivery. For flow control, wormhole switching is used.

The fault model that we consider here is a static one. Channel faults are assumed to be recognized only by the live routers directly connected to them, and no global fault information is used.

4.1. Mesh algorithm

As stated in the previous section, the first step is to add one VC to each physical channel. Then, we define a non-minimal routing function R1. In order to maximize the fault-tolerance ability that can be obtained from a single non-minimal VC, we select the negative-first routing function [8]. Step 3 combines the minimal routing function R with the non-minimal function R1. To prevent performance degradation, misrouting is allowed only for messages which have a single minimal path to their destination, and only when that path is blocked by a faulty channel.

Figure 1 shows some examples of the allowed misrouting paths for messages that are going straight. Thick arrows represent the message paths supplied by the minimal fully adaptive function R, and thin U-shaped arrows represent the non-minimal paths for bypassing the faulty channels, which are drawn as dotted lines. Since R1 is a negative-first algorithm, it prohibits turns from a positive direction to a negative direction. However, as shown in Figure 1, any straight message, except on the negative edge channels, can bypass the faulty channel. For example, when a message advancing from the bottom to the top is blocked by a fault, it first turns to the left (the negative direction in the x dimension), then turns upward (positive), and finally turns to the right (positive).
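The negative-first restriction on R1 and the acyclicity requirement from step 2 of Section 3 can be illustrated with a small sketch. It is not the authors' HDL implementation: the direction names, the helper functions, and the choice of a 4 x 4 mesh are assumptions made only for this example, and the rule is encoded as "a negative hop may not follow a positive hop".

```python
from collections import deque

# Unit moves on a 2D mesh; W and S are treated as the negative directions.
MOVES = {"E": (1, 0), "W": (-1, 0), "N": (0, 1), "S": (0, -1)}
NEGATIVE = {"W", "S"}


def negative_first(prev_dir, next_dir):
    """Negative-first rule: a negative hop may not follow a positive hop."""
    return prev_dir in NEGATIVE or next_dir not in NEGATIVE


def unrestricted(prev_dir, next_dir):
    """No turn restriction at all (used only as a contrast case)."""
    return True


def path_allowed(dirs, rule=negative_first):
    """Check every consecutive pair of hops in a path against the rule."""
    return all(rule(a, b) for a, b in zip(dirs, dirs[1:]))


def channel_dependency_graph(n, rule):
    """Map each channel (node, direction) of an n x n mesh to the channels
    a message may request next under the given turn rule."""
    def inside(x, y):
        return 0 <= x < n and 0 <= y < n

    graph = {}
    for x in range(n):
        for y in range(n):
            for d1, (dx, dy) in MOVES.items():
                mx, my = x + dx, y + dy
                if not inside(mx, my):
                    continue  # no such channel at the mesh edge
                graph[((x, y), d1)] = [
                    ((mx, my), d2)
                    for d2, (ex, ey) in MOVES.items()
                    if inside(mx + ex, my + ey) and rule(d1, d2)
                ]
    return graph


def is_acyclic(graph):
    """Kahn's algorithm: acyclic iff every vertex can be removed in turn."""
    indegree = {v: 0 for v in graph}
    for successors in graph.values():
        for w in successors:
            indegree[w] += 1
    queue = deque(v for v, d in indegree.items() if d == 0)
    removed = 0
    while queue:
        v = queue.popleft()
        removed += 1
        for w in graph[v]:
            indegree[w] -= 1
            if indegree[w] == 0:
                queue.append(w)
    return removed == len(graph)


# Bypass for an upward message: left (W), up (N), right (E).
upward_bypass_ok = path_allowed(["W", "N", "E"])
# A bypass that starts with a positive hop and then goes negative is rejected.
positive_start_ok = path_allowed(["N", "W", "S"])
nf_acyclic = is_acyclic(channel_dependency_graph(4, negative_first))
free_acyclic = is_acyclic(channel_dependency_graph(4, unrestricted))
```

Under these assumptions, the left-up-right bypass passes the turn check, the bypass that begins with a positive hop does not, and the channel dependency graph is acyclic only when the negative-first rule is applied.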
In a negative-positive-positive misrouting case such as the bottom-to-top message described above, adaptive routing can be applied to the path selection of the positive-positive portion. On the other hand, only a single misrouting path is available for a negative-negative-positive bypass. This is because R1, the negative-first algorithm, is a partially adaptive routing algorithm. If R1 were defined by other variants of the turn model, such as north-last or west-first, we could not guarantee the bypass for messages in some directions because of the imbalance of prohibited turns.

We notice that every bypass starts with a negative transmission, which is followed by at least one positive transmission; a bypass cannot start from a positive transmission. Therefore, when a message advancing in the positive direction is blocked by a fault at the final turn, it needs a 180-degree turn to start the bypass from a negative transmission. Otherwise, the misrouting path becomes longer or non-local fault information is required. Here, we extend the routing function to support 180-degree turns for messages that are moving in positive directions. Figure 2 shows examples of 180-degree turns when messages switch routing functions from R to R1.

[Figure 1. Misrouting examples for straight messages. Legend: positive direction, negative direction, routing path by R, misrouting path by R1, faulty channel.]

[Figure 2. Misrouting examples with 180-degree turns. Legend: positive direction, negative direction, routing path by R, misrouting path by R1, faulty channel.]

4.2. Torus algorithm

The mesh algorithm is simple, but has the disadvantage that it cannot handle faults on the negative edge. We can design a torus algorithm that handles them by utilizing the wraparound channels that connect the nodes across the dateline. One way is to prepare two non-minimal VCs, which eliminates the occurrence of torus cycles.
Another way is to restrict the number of misrouting hops across the dateline to one, so that no torus cycle arises even with a single non-minimal VC. This can be guaranteed as follows: once a message is misrouted from one dimension, it is not allowed to pass the wraparound channels of that dimension. In other words, the non-minimal routing function R1 regards the network topology as a mesh.

Figure 3 shows the misrouting paths of two messages in a two-dimensional torus. We assume that the dotted lines are the datelines of each dimension. Message A is misrouted from the X to the Y dimension, and it is no longer allowed to pass the wraparound channel of the X dimension. So, the direction of message A is changed on the misrouting path of the X dimension. Message B can be misrouted by only a single hop across the dateline of the Y dimension. Because messages which need more than two hops after passing the dateline do not exist, the torus cycle never occurs.

[Figure 3. Misrouting paths on tori. S: source node; D: destination node; faulty channels and datelines are marked.]

We call the resulting routing algorithm Detour-NF. Its features are summarized as follows:

- It can support three routing modes (deterministic, minimal fully adaptive, and non-minimal fault-tolerant) without wasting hardware resources.
- It suits networks where the fault rate is relatively low, because it combines minimal fully adaptive and non-minimal fault-tolerant routing.
- It does not require global fault information or routing table management.

When there is no static faulty channel in the network, the user may use the Detour-NF router as a fully adaptive minimal router which has two escape VCs and two adaptive VCs per physical channel. Alternatively, the non-minimal VCs may be reserved, even on a network without static faults, in order to tolerate dynamic faults.

5. Router Design

We have designed the Detour-NF router for two-dimensional tori so that we could evaluate its hardware cost and operational speed. We first explain the hardware organization of Detour-NF, then compare its cost and speed with those of a dimension-ordered deterministic router and Duato's minimal fully adaptive router, based on our designs in Verilog-HDL.

5.1. Hardware Organization and Routing Logic

Figure 4 shows the block diagram of Detour-NF. It consists of four network ports (North, East, West and South) and one processing element interface (PE I/F). All are connected to each other by wires. Instead of a central crossbar switch, separate multiplexers are placed in each port [14, 15].

[Figure 4. Hardware organization. Blocks: North, East, West and South ports and the PE I/F, each containing virtual channels (VCs), address decoders (ADs) and an output channel arbiter (OCA), connected by wires.]
Each network port consists of four VCs with address decoders (ADs) and an output channel arbiter (OCA). The PE I/F consists of two VCs with ADs and an OCA. One difference between Detour-NF and the pure Duato's minimal adaptive router is that Detour-NF has a connection from the non-minimal VC to the OCA in the west and south ports to support 180-degree turns.

Messages generated by the PE are stored in one of the VCs in the PE I/F. The message header, containing the destination address, is decoded by the AD. Having an independent AD per VC enables parallel header decoding for multiple messages. The AD creates one or two output request signals based on the routing function. When there are two candidate output ports, the two requests are propagated simultaneously to the OCAs in the selected output ports. The OCA arbitrates among the output requests from the VCs which hold messages, and returns an acknowledge signal to one of them. When the acknowledged VC receives multiple acknowledgments, it decides the output port based on some selection policy, and then injects the message into the network. Since there is no central crossbar switch in the router, the routing decision and intra-router data transmission are not serialized.

We use a dedicated signal that is exchanged between two adjacent routers to detect a faulty channel. The routing functions R and R1 are automatically selected by the ADs, based on the possible paths for messages and the fault information. For fault-free networks, one of the routing modes (the dimension-order or minimal fully adaptive mode) can be chosen based on a flag in the message header or by a static router setting. The entire routing function is realized in hardware logic, which gives fast routing decisions and requires no routing table lookup [2].

This router adopts an input buffer scheme. A message arriving from the network is stored in a requested VC in the network port. Then, the routing action is repeated until the message reaches its destination.

5.2. Hardware Cost and Speed

Table 1 shows the speed and hardware cost of the three routers: dimension-order, Duato's protocol (DP) and Detour-NF. The values in this table were obtained from synthesis results using the Synopsys FPGA Compiler II. We specified the target device as a Xilinx VirtexE V600EFG900-8, with a higher priority for speed. The maximum clock frequency reflects the complexity of the circuits. The required chip area, which is represented by the number of cells, and the total number of flip-flops (FFs) show the hardware cost. For the dimension-order router, we show the results for two configurations (three and four VCs per physical channel) to compare with the minimum number of VCs for DP and Detour-NF, which have three and four VCs, respectively. All of the routers were designed with 32-bit physical channels, and the buffer capacity per VC is eight 32-bit flits. We also show the ratios relative to the values of the DP router.

Table 1. Synthesis results of three routers.

Router             Dimension-order  Dimension-order  Duato's protocol  Detour-NF
VCs / channel      3                4                3                 4
Max. clock (MHz)   89 (1.05)        89 (1.05)        85 (1.00)         82 (0.96)
Area (cells)       4974 (0.85)      6331 (1.08)      5882 (1.00)       9151 (1.56)
Total FFs          4016 (0.99)      5133 (1.26)      4066 (1.00)       5245 (1.29)

Numbers in ( ) show the ratios to the values of Duato's minimal fully adaptive router.

The dimension-order router can be operated at the fastest clock frequency because of its simple routing logic.
Detour-NF is 4% slower than DP in clock frequency because of its more complex logic and larger amount of hardware. When we increase the number of VCs, the area and total FFs increase with the buffer space. The dimension-order router and Detour-NF with four VCs require 26% and 29% more FFs than DP, respectively. The increase in area of the dimension-order router is smaller than that of Detour-NF because the wire area of the dimension-order router is smaller. (The X-Y dimension-ordered router does not require wires from the north and south ports to the east and west ports, because turns from the vertical dimension to the horizontal dimension are prohibited.)

From logic synthesis, we can say that the Detour-NF router is more complex than the other two, but the speed degradation and the increase in hardware are not considerable when we take into account its fault-tolerance ability.

6. Communication Performance

6.1. Simulation Conditions

In order to compare the performance characteristics of the adaptive and non-adaptive routing algorithms, we simulated routers which have four VCs per physical channel. The simulation was executed on a 10 by 10 two-dimensional torus network using an HDL simulator. We assume that the clock frequency of the routers is 100 MHz, and that each router takes three clock cycles to forward a message header by one hop. We also assume that the cable delay between two routers is less than a single clock cycle.

Before showing the simulation results, we summarize the relationship between the routing algorithms and their usage of the four VCs per physical channel:

- Dimension-order: 4 non-adaptive VCs
- Duato's Protocol (DP): 2 non-adaptive (escape) VCs, 2 minimal fully adaptive VCs
- Detour-NF: 2 non-adaptive VCs, 1 minimal fully adaptive VC, 1 non-minimal partially adaptive VC

For the dimension-order router, all four VCs are used as non-adaptive VCs. DP requires two non-adaptive (escape) VCs, and its additional two VCs can be used as minimal fully adaptive VCs. Finally, Detour-NF uses two non-adaptive VCs with one minimal fully adaptive VC and one non-minimal partially adaptive VC.
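As a rough sanity check on the simulation parameters above, the following sketch works out a few zero-load figures. It is only an illustrative estimate, not a result from the paper: it assumes destinations are uniform over all nodes (including the source), that a channel moves one 32-bit flit per clock cycle, and that cable delay and contention are ignored. All names are hypothetical.

```python
CLOCK_HZ = 100_000_000   # router clock assumed in the simulation (100 MHz)
HOP_CYCLES = 3           # clock cycles for a header to pass one router
FLIT_BYTES = 4           # 32-bit physical channel, one flit per cycle (assumed)
RADIX = 10               # nodes per dimension of the 10 x 10 torus
MESSAGE_BYTES = 64       # message size used in the latency evaluation


def ring_distance(a, b, k):
    """Minimal hop count between positions a and b on a k-node ring."""
    d = abs(a - b)
    return min(d, k - d)


def mean_minimal_hops(k):
    """Average minimal distance on a k x k torus for uniform destinations
    (the source itself is counted as a possible destination)."""
    per_dimension = sum(ring_distance(0, b, k) for b in range(k)) / k
    return 2 * per_dimension


flits_per_message = MESSAGE_BYTES // FLIT_BYTES
cycle_ns = 1e9 / CLOCK_HZ
mean_hops = mean_minimal_hops(RADIX)
# Header latency at zero load: hops x cycles per hop x cycle time.
header_latency_ns = mean_hops * HOP_CYCLES * cycle_ns
# Peak bandwidth of one physical channel in MB/s.
channel_mb_per_s = FLIT_BYTES * CLOCK_HZ / 1e6
```

With these assumptions a 64-byte message is 16 flits, the mean minimal distance is 5 hops, the zero-load header latency is about 150 ns, and one channel peaks at 400 MB/s. Measured latencies are necessarily higher, since they include queueing at the source and contention in the network.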

The simulated traffic patterns are random and hot-spot. In the random traffic, each node chooses the destination randomly, so that messages are spread uniformly over the network. In the hot-spot traffic, one-fourth of all messages are forced to a destination in the middle column of the torus, and the rest of the messages are sent randomly. For each simulation, we ignore the first 2000 messages, and the communication bandwidth is calculated for the following 5000 messages. We evaluated the network bandwidth by varying the message size. We evaluated the average message latency for 64-byte (16-flit) messages by changing the injection rate from the source nodes. For the latency evaluation, we measured the time for messages to reach their destinations after the source nodes created them.

6.2. Faulty models

We simulated the Detour-NF router on fault-free and faulty networks. For the fault-free network, the non-minimal VC was not used, in order to compare the performance with DP. For the faulty models, we assumed channel or node faults. Figure 5 shows the locations of the faulty channels and nodes. First, we examine the bandwidth for random traffic with four and eight faulty channels. The four faulty channels are marked as x in the figure. In the case of eight faulty channels, four additional channels are also set as faulty. Next, we show the bandwidth for the cases of two and four faulty nodes. A faulty node is modeled by marking all four channels connected to it as faulty. The faulty nodes do not take part in the communication; namely, they never send or receive any messages.

6.3. Results

(1) Random traffic

Figure 6 shows the network bandwidth for random traffic with and without faulty components. This graph shows that DP achieves the highest bandwidth, although it does not have fault-tolerance ability. The bandwidth of the Detour-NF router lies between those of the two minimal routing modes, DP and dimension-order.
This tendency derives from the routing freedom, which increases with the number of adaptive VCs. When we increase the number of faulty channels or nodes, the bandwidth of Detour-NF is degraded. The performance degradation caused by the separated faulty channels is larger than that caused by the faulty nodes. Even in the case of four faulty nodes, Detour-NF has a clear advantage over the dimension-order mode. One reason for these results is that a faulty node is never a message source or destination. This condition eases congestion around the faults compared to the faulty channels.

[Figure 5. Locations of the faulty channels and nodes. Legend: four faulty channels, additional four faulty channels, two faulty nodes, additional two faulty nodes.]

Figure 7 shows the average message latency for random traffic. The network saturates at a certain bandwidth, beyond which the latency increases. The saturation points show the peak bandwidth of each routing algorithm for a 64-byte message. This graph shows that the saturation points and average latency of Detour-NF are midway between those of DP and dimension-order. The more faulty nodes there are, the lower the saturation bandwidth. However, a small number of faulty nodes, such as two or four, can be tolerated by a single non-minimal VC without degrading communication performance very much. We obtained similar results for another uniform communication pattern, the all-to-all traffic pattern [10].

(2) Hot-spot traffic

Figure 8 shows the network bandwidth for hot-spot traffic. We notice that the DP and Detour-NF modes achieve a much higher bandwidth than the dimension-order router because of adaptive routing. The bandwidth of Detour-NF is degraded for larger messages because it has fewer adaptive VCs. However, the performance degradation caused by the faulty nodes is relatively small compared with that for random traffic. Figure 9 shows the average message latency for hot-spot traffic.
The saturation bandwidth and latency of Detour-NF do not differ greatly between the fault-free network and the networks with two or four faulty nodes. The reason is that the bottleneck at the hot spot has a larger impact than the few faulty nodes. Owing to adaptive routing, the Detour-NF router shows a saturation bandwidth and latency that are closer to those of DP than to those of the dimension-order mode.
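For readers who want to reproduce the hot-spot pattern described in Section 6.1, a destination generator could look like the sketch below. The paper does not specify how the destination within the middle column is chosen, so this version assumes it is uniform over that column; the function name, the exclusion of the source as a destination, and the use of column index k // 2 as "the middle column" are likewise assumptions.

```python
import random


def pick_destination(src, rng, k=10, hot_fraction=0.25, hot_column=None):
    """Return a destination (x, y) on a k x k torus for hot-spot traffic.

    With probability hot_fraction the destination lies in the hot column;
    otherwise it is uniform over the whole network. The source node itself
    is never returned (an assumption of this sketch).
    """
    if hot_column is None:
        hot_column = k // 2  # assumed position of the middle column
    hot = rng.random() < hot_fraction
    while True:
        x = hot_column if hot else rng.randrange(k)
        dst = (x, rng.randrange(k))
        if dst != src:
            return dst


rng = random.Random(1)  # fixed seed so that runs are repeatable
samples = [pick_destination((0, 0), rng) for _ in range(20000)]
hot_share = sum(1 for x, _ in samples if x == 5) / len(samples)
```

Note that the share of traffic reaching the middle column is higher than one-fourth, because the randomly addressed messages also land there about one time in ten; under these assumptions `hot_share` comes out near one-third. Setting `hot_fraction=0.0` gives the random traffic pattern.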

[Figure 6. Bandwidth for random traffic. Bandwidth [GB/s] versus message size [bytes] for Duato (no faults), Detour-NF (no faults, 2 and 4 faulty nodes, 4 and 8 faulty channels) and dimension-order (no faults).]

[Figure 7. Average message latency for random traffic. Average latency [us] versus bandwidth [GB/s] for Duato (no faults), Detour-NF (no faults, 2 and 4 faulty nodes) and dimension-order (no faults).]

[Figure 8. Bandwidth for hot-spot traffic. Bandwidth [GB/s] versus message size [bytes] for Duato (no faults), Detour-NF (no faults, 2 and 4 faulty nodes) and dimension-order (no faults).]

[Figure 9. Average message latency for hot-spot traffic. Average latency [us] versus bandwidth [GB/s] for Duato (no faults), Detour-NF (no faults, 2 and 4 faulty nodes) and dimension-order (no faults).]

From these studies, we can conclude that supporting multiple routing modes, such as deterministic, minimal fully adaptive, and non-minimal adaptive routing, has advantages. The Detour-NF router provides fault-tolerance ability and reasonable performance for cases with few faulty components. Although the deterministic and minimal fully adaptive routing modes do not tolerate any faults, they remain advantageous for fault-free networks. The deterministic mode is good for an environment requiring in-order message delivery, and the minimal fully adaptive routing mode provides good performance.

7. Conclusion

We have considered the design of fault-tolerant adaptive routers and proposed a method for constructing routing algorithms. Our method is simple in the sense that it integrates minimal and non-minimal routing algorithms on independent sets of VCs. We believe our method suits router design in this era of VLSI technology and large-scale parallel and distributed systems. Based on our method, we showed an example design, Detour-NF.
The Detour-NF design can be used not only as a non-minimal adaptive router but also as a minimal adaptive or non-adaptive router without wasting hardware resources. It suits environments where fault rates are relatively low, covering both fault-free and faulty networks. Decoding message headers in hardware, without routing table management, is useful for fast operation and simplifies the treatment of static network faults. However, the fault-tolerance ability depends on the number of VCs. To support unconstrained fault regions with a few VCs, a routing algorithm for irregular networks, such as up*/down* routing [12], could be a candidate for the non-minimal routing function R1 in our design method. We are currently considering a flexible algorithm such as up*/down* routing for networks with a short mean time between failures.

Acknowledgments

We would like to thank Prof. Takanobu Baba of Utsunomiya University for his helpful comments. We also wish to thank Osamu Mitobe and Ta Quoq Viet, graduate school students at the University of Electro-Communications, for their help in our experiments. This research is supported in part by Grants-in-Aid for Scientific Research of the Japan Society for the Promotion of Science (JSPS). The study has been done using CAD tools provided by the VLSI Design and Education Center (VDEC) at the University of Tokyo.

References

[1] R.V. Boppana and S. Chalasani: A Comparison of Adaptive Wormhole Routing Algorithms, Proc. 20th ISCA (1993).
[2] A.A. Chien: A Cost and Speed Model for k-ary n-cube Wormhole Routers, IEEE Trans. Parallel and Distributed Systems, vol. 9, no. 2 (1998).
[3] W.J. Dally and C.L. Seitz: Deadlock-Free Message Routing in Multiprocessor Interconnection Networks, IEEE Trans. Computers, vol. C-36, no. 5 (1987).
[4] J. Duato, S. Yalamanchili, and L. Ni: Interconnection Networks, an Engineering Approach, IEEE Computer Society Press, p. 515 (1997).
[5] J. Duato: A Theory of Fault-Tolerant Routing in Wormhole Networks, IEEE Trans. Parallel and Distributed Systems, vol. 8, no. 8 (1997).
[6] J. Duato: A Necessary and Sufficient Condition for Deadlock-Free Adaptive Routing in Wormhole Networks, IEEE Trans. Parallel and Distributed Systems, vol. 6, no. 10 (1995).
[7] M.L. Fulgham and L. Snyder: Triplex Router: A Versatile Torus Routing Algorithm, Technical Report UW-CSE, University of Washington (1996).
[8] C.J. Glass and L.M. Ni: The Turn Model for Adaptive Routing, Proc. 19th ISCA (1992).
[9] D.H. Linder and J.C. Harden: An Adaptive and Fault Tolerant Wormhole Routing Strategy for k-ary n-cubes, IEEE Trans. Computers, vol. 40, no. 1, pp. 2-12 (1991).
[10] H. Hosogoshi, O. Mitobe, T. Yoshinaga, and M. Sowa: Design of a Fault-Tolerant Fully Adaptive Router, Proc. Symposium on Advanced Computing Systems and Infrastructures (2003, in Japanese).
[11] D.R. Millar and W.A. Najjar: Preliminary Evaluation of a Hybrid Deterministic/Adaptive Router, Proc. Parallel Computing, Routing and Communication Workshop, Lecture Notes in Computer Science, vol. 1417 (1997).
[12] M.D. Schroeder, A.D. Birrell, M. Burrows, H. Murray, R.M. Needham, T.L. Rodeheffer, E.H. Satterthwaite, and C.P. Thacker: Autonet: A High-Speed, Self-Configurable Local Area Network using Point-to-Point Links, IEEE J. Selected Areas Commun., vol. 9, no. 8 (1991).
[13] Y.-J. Suh, et al.: Software Based Fault-Tolerant Oblivious Routing in Pipelined Networks, Proc. ICPP, vol. 1 (1995).
[14] T. Yoshinaga, M. Hayashi, M. Horita, Y. Yamaguchi, K. Ootsu, and T. Baba: A Cost and Performance Comparison for Wormhole Routers based on HDL Designs, Proc. ICPADS '98 (1998).
[15] T. Yoshinaga, M. Hayashi, M. Horita, S. Nakamura, K. Ootsu, and T. Baba: Recover-x: An Adaptive Router with Limited Escape Channels, Proc. ICPADS 2000 (2000).


More information

Deadlock and Livelock. Maurizio Palesi

Deadlock and Livelock. Maurizio Palesi Deadlock and Livelock 1 Deadlock (When?) Deadlock can occur in an interconnection network, when a group of packets cannot make progress, because they are waiting on each other to release resource (buffers,

More information

EE482, Spring 1999 Research Paper Report. Deadlock Recovery Schemes

EE482, Spring 1999 Research Paper Report. Deadlock Recovery Schemes EE482, Spring 1999 Research Paper Report Deadlock Recovery Schemes Jinyung Namkoong Mohammed Haque Nuwan Jayasena Manman Ren May 18, 1999 Introduction The selected papers address the problems of deadlock,

More information

NOC Deadlock and Livelock

NOC Deadlock and Livelock NOC Deadlock and Livelock 1 Deadlock (When?) Deadlock can occur in an interconnection network, when a group of packets cannot make progress, because they are waiting on each other to release resource (buffers,

More information

Lecture 24: Interconnection Networks. Topics: topologies, routing, deadlocks, flow control

Lecture 24: Interconnection Networks. Topics: topologies, routing, deadlocks, flow control Lecture 24: Interconnection Networks Topics: topologies, routing, deadlocks, flow control 1 Topology Examples Grid Torus Hypercube Criteria Bus Ring 2Dtorus 6-cube Fully connected Performance Bisection

More information

Lecture 13: Interconnection Networks. Topics: lots of background, recent innovations for power and performance

Lecture 13: Interconnection Networks. Topics: lots of background, recent innovations for power and performance Lecture 13: Interconnection Networks Topics: lots of background, recent innovations for power and performance 1 Interconnection Networks Recall: fully connected network, arrays/rings, meshes/tori, trees,

More information

Networks: Routing, Deadlock, Flow Control, Switch Design, Case Studies. Admin

Networks: Routing, Deadlock, Flow Control, Switch Design, Case Studies. Admin Networks: Routing, Deadlock, Flow Control, Switch Design, Case Studies Alvin R. Lebeck CPS 220 Admin Homework #5 Due Dec 3 Projects Final (yes it will be cumulative) CPS 220 2 1 Review: Terms Network characterized

More information

Communication in Multicomputers with Nonconvex Faults

Communication in Multicomputers with Nonconvex Faults Communication in Multicomputers with Nonconvex Faults Suresh Chalasani Rajendra V. Boppana Technical Report : CS-96-12 October 1996 The University of Texas at San Antonio Division of Computer Science San

More information

The Odd-Even Turn Model for Adaptive Routing

The Odd-Even Turn Model for Adaptive Routing IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 11, NO. 7, JULY 2000 729 The Odd-Even Turn Model for Adaptive Routing Ge-Ming Chiu, Member, IEEE Computer Society AbstractÐThis paper presents

More information

Deadlock-free Fault-tolerant Routing in the Multi-dimensional Crossbar Network and Its Implementation for the Hitachi SR2201

Deadlock-free Fault-tolerant Routing in the Multi-dimensional Crossbar Network and Its Implementation for the Hitachi SR2201 Deadlock-free Fault-tolerant Routing in the Multi-dimensional Crossbar Network and Its Implementation for the Hitachi SR2201 Yoshiko Yasuda, Hiroaki Fujii, Hideya Akashi, Yasuhiro Inagami, Teruo Tanaka*,

More information

Interconnection topologies (cont.) [ ] In meshes and hypercubes, the average distance increases with the dth root of N.

Interconnection topologies (cont.) [ ] In meshes and hypercubes, the average distance increases with the dth root of N. Interconnection topologies (cont.) [ 10.4.4] In meshes and hypercubes, the average distance increases with the dth root of N. In a tree, the average distance grows only logarithmically. A simple tree structure,

More information

Deadlock-free XY-YX router for on-chip interconnection network

Deadlock-free XY-YX router for on-chip interconnection network LETTER IEICE Electronics Express, Vol.10, No.20, 1 5 Deadlock-free XY-YX router for on-chip interconnection network Yeong Seob Jeong and Seung Eun Lee a) Dept of Electronic Engineering Seoul National Univ

More information

Lecture 12: Interconnection Networks. Topics: dimension/arity, routing, deadlock, flow control

Lecture 12: Interconnection Networks. Topics: dimension/arity, routing, deadlock, flow control Lecture 12: Interconnection Networks Topics: dimension/arity, routing, deadlock, flow control 1 Interconnection Networks Recall: fully connected network, arrays/rings, meshes/tori, trees, butterflies,

More information

Routing Algorithms. Review

Routing Algorithms. Review Routing Algorithms Today s topics: Deterministic, Oblivious Adaptive, & Adaptive models Problems: efficiency livelock deadlock 1 CS6810 Review Network properties are a combination topology topology dependent

More information

Lecture 15: PCM, Networks. Today: PCM wrap-up, projects discussion, on-chip networks background

Lecture 15: PCM, Networks. Today: PCM wrap-up, projects discussion, on-chip networks background Lecture 15: PCM, Networks Today: PCM wrap-up, projects discussion, on-chip networks background 1 Hard Error Tolerance in PCM PCM cells will eventually fail; important to cause gradual capacity degradation

More information

Deadlock. Reading. Ensuring Packet Delivery. Overview: The Problem

Deadlock. Reading. Ensuring Packet Delivery. Overview: The Problem Reading W. Dally, C. Seitz, Deadlock-Free Message Routing on Multiprocessor Interconnection Networks,, IEEE TC, May 1987 Deadlock F. Silla, and J. Duato, Improving the Efficiency of Adaptive Routing in

More information

Recall: The Routing problem: Local decisions. Recall: Multidimensional Meshes and Tori. Properties of Routing Algorithms

Recall: The Routing problem: Local decisions. Recall: Multidimensional Meshes and Tori. Properties of Routing Algorithms CS252 Graduate Computer Architecture Lecture 16 Multiprocessor Networks (con t) March 14 th, 212 John Kubiatowicz Electrical Engineering and Computer Sciences University of California, Berkeley http://www.eecs.berkeley.edu/~kubitron/cs252

More information

Performance Analysis of a Minimal Adaptive Router

Performance Analysis of a Minimal Adaptive Router Performance Analysis of a Minimal Adaptive Router Thu Duc Nguyen and Lawrence Snyder Department of Computer Science and Engineering University of Washington, Seattle, WA 98195 In Proceedings of the 1994

More information

A Hybrid Interconnection Network for Integrated Communication Services

A Hybrid Interconnection Network for Integrated Communication Services A Hybrid Interconnection Network for Integrated Communication Services Yi-long Chen Northern Telecom, Inc. Richardson, TX 7583 kchen@nortel.com Jyh-Charn Liu Department of Computer Science, Texas A&M Univ.

More information

Wormhole Routing Techniques for Directly Connected Multicomputer Systems

Wormhole Routing Techniques for Directly Connected Multicomputer Systems Wormhole Routing Techniques for Directly Connected Multicomputer Systems PRASANT MOHAPATRA Iowa State University, Department of Electrical and Computer Engineering, 201 Coover Hall, Iowa State University,

More information

Routing Algorithm. How do I know where a packet should go? Topology does NOT determine routing (e.g., many paths through torus)

Routing Algorithm. How do I know where a packet should go? Topology does NOT determine routing (e.g., many paths through torus) Routing Algorithm How do I know where a packet should go? Topology does NOT determine routing (e.g., many paths through torus) Many routing algorithms exist 1) Arithmetic 2) Source-based 3) Table lookup

More information

A Deterministic Fault-Tolerant and Deadlock-Free Routing Protocol in 2-D Meshes Based on Odd-Even Turn Model

A Deterministic Fault-Tolerant and Deadlock-Free Routing Protocol in 2-D Meshes Based on Odd-Even Turn Model A Deterministic Fault-Tolerant and Deadlock-Free Routing Protocol in 2-D Meshes Based on Odd-Even Turn Model Jie Wu Dept. of Computer Science and Engineering Florida Atlantic University Boca Raton, FL

More information

Communication in Multicomputers with Nonconvex Faults?

Communication in Multicomputers with Nonconvex Faults? In Proceedings of EUROPAR 95 Communication in Multicomputers with Nonconvex Faults? Suresh Chalasani 1 and Rajendra V. Boppana 2 1 Dept. of ECE, University of Wisconsin-Madison, Madison, WI 53706-1691,

More information

Lecture: Interconnection Networks

Lecture: Interconnection Networks Lecture: Interconnection Networks Topics: Router microarchitecture, topologies Final exam next Tuesday: same rules as the first midterm 1 Packets/Flits A message is broken into multiple packets (each packet

More information

The Cray T3E Network:

The Cray T3E Network: The Cray T3E Network: Adaptive Routing in a High Performance 3D Torus Steven L. Scott and Gregory M. Thorson Cray Research, Inc. {sls,gmt}@cray.com Abstract This paper describes the interconnection network

More information

Generic Methodologies for Deadlock-Free Routing

Generic Methodologies for Deadlock-Free Routing Generic Methodologies for Deadlock-Free Routing Hyunmin Park Dharma P. Agrawal Department of Computer Engineering Electrical & Computer Engineering, Box 7911 Myongji University North Carolina State University

More information

BARP-A Dynamic Routing Protocol for Balanced Distribution of Traffic in NoCs

BARP-A Dynamic Routing Protocol for Balanced Distribution of Traffic in NoCs -A Dynamic Routing Protocol for Balanced Distribution of Traffic in NoCs Pejman Lotfi-Kamran, Masoud Daneshtalab *, Caro Lucas, and Zainalabedin Navabi School of Electrical and Computer Engineering, The

More information

Deadlock- and Livelock-Free Routing Protocols for Wave Switching

Deadlock- and Livelock-Free Routing Protocols for Wave Switching Deadlock- and Livelock-Free Routing Protocols for Wave Switching José Duato,PedroLópez Facultad de Informática Universidad Politécnica de Valencia P.O.B. 22012 46071 - Valencia, SPAIN E-mail:jduato@gap.upv.es

More information

A Simple and Efficient Mechanism to Prevent Saturation in Wormhole Networks Λ

A Simple and Efficient Mechanism to Prevent Saturation in Wormhole Networks Λ A Simple and Efficient Mechanism to Prevent Saturation in Wormhole Networks Λ E. Baydal, P. López and J. Duato Depto. Informática de Sistemas y Computadores Universidad Politécnica de Valencia, Camino

More information

Routing Algorithms, Process Model for Quality of Services (QoS) and Architectures for Two-Dimensional 4 4 Mesh Topology Network-on-Chip

Routing Algorithms, Process Model for Quality of Services (QoS) and Architectures for Two-Dimensional 4 4 Mesh Topology Network-on-Chip Routing Algorithms, Process Model for Quality of Services (QoS) and Architectures for Two-Dimensional 4 4 Mesh Topology Network-on-Chip Nauman Jalil, Adnan Qureshi, Furqan Khan, and Sohaib Ayyaz Qazi Abstract

More information

TDT Appendix E Interconnection Networks

TDT Appendix E Interconnection Networks TDT 4260 Appendix E Interconnection Networks Review Advantages of a snooping coherency protocol? Disadvantages of a snooping coherency protocol? Advantages of a directory coherency protocol? Disadvantages

More information

Optimal Topology for Distributed Shared-Memory. Multiprocessors: Hypercubes Again? Jose Duato and M.P. Malumbres

Optimal Topology for Distributed Shared-Memory. Multiprocessors: Hypercubes Again? Jose Duato and M.P. Malumbres Optimal Topology for Distributed Shared-Memory Multiprocessors: Hypercubes Again? Jose Duato and M.P. Malumbres Facultad de Informatica, Universidad Politecnica de Valencia P.O.B. 22012, 46071 - Valencia,

More information

4. Networks. in parallel computers. Advances in Computer Architecture

4. Networks. in parallel computers. Advances in Computer Architecture 4. Networks in parallel computers Advances in Computer Architecture System architectures for parallel computers Control organization Single Instruction stream Multiple Data stream (SIMD) All processors

More information

Lecture 16: On-Chip Networks. Topics: Cache networks, NoC basics

Lecture 16: On-Chip Networks. Topics: Cache networks, NoC basics Lecture 16: On-Chip Networks Topics: Cache networks, NoC basics 1 Traditional Networks Huh et al. ICS 05, Beckmann MICRO 04 Example designs for contiguous L2 cache regions 2 Explorations for Optimality

More information

Adaptive Multimodule Routers

Adaptive Multimodule Routers daptive Multimodule Routers Rajendra V Boppana Computer Science Division The Univ of Texas at San ntonio San ntonio, TX 78249-0667 boppana@csutsaedu Suresh Chalasani ECE Department University of Wisconsin-Madison

More information

Routing and Deadlock

Routing and Deadlock 3.5-1 3.5-1 Routing and Deadlock Routing would be easy...... were it not for possible deadlock. Topics For This Set: Routing definitions. Deadlock definitions. Resource dependencies. Acyclic deadlock free

More information

CHAPTER 6 FPGA IMPLEMENTATION OF ARBITERS ALGORITHM FOR NETWORK-ON-CHIP

CHAPTER 6 FPGA IMPLEMENTATION OF ARBITERS ALGORITHM FOR NETWORK-ON-CHIP 133 CHAPTER 6 FPGA IMPLEMENTATION OF ARBITERS ALGORITHM FOR NETWORK-ON-CHIP 6.1 INTRODUCTION As the era of a billion transistors on a one chip approaches, a lot of Processing Elements (PEs) could be located

More information

Communication Performance in Network-on-Chips

Communication Performance in Network-on-Chips Communication Performance in Network-on-Chips Axel Jantsch Royal Institute of Technology, Stockholm November 24, 2004 Network on Chip Seminar, Linköping, November 25, 2004 Communication Performance In

More information

On Constructing the Minimum Orthogonal Convex Polygon in 2-D Faulty Meshes

On Constructing the Minimum Orthogonal Convex Polygon in 2-D Faulty Meshes On Constructing the Minimum Orthogonal Convex Polygon in 2-D Faulty Meshes Jie Wu Department of Computer Science and Engineering Florida Atlantic University Boca Raton, FL 33431 E-mail: jie@cse.fau.edu

More information

Lecture 22: Router Design

Lecture 22: Router Design Lecture 22: Router Design Papers: Power-Driven Design of Router Microarchitectures in On-Chip Networks, MICRO 03, Princeton A Gracefully Degrading and Energy-Efficient Modular Router Architecture for On-Chip

More information

Lecture 18: Communication Models and Architectures: Interconnection Networks

Lecture 18: Communication Models and Architectures: Interconnection Networks Design & Co-design of Embedded Systems Lecture 18: Communication Models and Architectures: Interconnection Networks Sharif University of Technology Computer Engineering g Dept. Winter-Spring 2008 Mehdi

More information

Deadlock and Router Micro-Architecture

Deadlock and Router Micro-Architecture 1 EE482: Advanced Computer Organization Lecture #8 Interconnection Network Architecture and Design Stanford University 22 April 1999 Deadlock and Router Micro-Architecture Lecture #8: 22 April 1999 Lecturer:

More information

Software-Based Deadlock Recovery Technique for True Fully Adaptive Routing in Wormhole Networks

Software-Based Deadlock Recovery Technique for True Fully Adaptive Routing in Wormhole Networks Software-Based Deadlock Recovery Technique for True Fully Adaptive Routing in Wormhole Networks J. M. Martínez, P. López, J. Duato T. M. Pinkston Facultad de Informática SMART Interconnects Group Universidad

More information

A Fully Adaptive Fault-Tolerant Routing Methodology Based on Intermediate Nodes

A Fully Adaptive Fault-Tolerant Routing Methodology Based on Intermediate Nodes A Fully Adaptive Fault-Tolerant Routing Methodology Based on Intermediate Nodes N.A. Nordbotten 1, M.E. Gómez 2, J. Flich 2, P.López 2, A. Robles 2, T. Skeie 1, O. Lysne 1, and J. Duato 2 1 Simula Research

More information

OASIS NoC Architecture Design in Verilog HDL Technical Report: TR OASIS

OASIS NoC Architecture Design in Verilog HDL Technical Report: TR OASIS OASIS NoC Architecture Design in Verilog HDL Technical Report: TR-062010-OASIS Written by Kenichi Mori ASL-Ben Abdallah Group Graduate School of Computer Science and Engineering The University of Aizu

More information

Design of a router for network-on-chip. Jun Ho Bahn,* Seung Eun Lee and Nader Bagherzadeh

Design of a router for network-on-chip. Jun Ho Bahn,* Seung Eun Lee and Nader Bagherzadeh 98 Int. J. High Performance Systems Architecture, Vol. 1, No. 2, 27 Design of a router for network-on-chip Jun Ho Bahn,* Seung Eun Lee and Nader Bagherzadeh Department of Electrical Engineering and Computer

More information

On Constructing the Minimum Orthogonal Convex Polygon for the Fault-Tolerant Routing in 2-D Faulty Meshes 1

On Constructing the Minimum Orthogonal Convex Polygon for the Fault-Tolerant Routing in 2-D Faulty Meshes 1 On Constructing the Minimum Orthogonal Convex Polygon for the Fault-Tolerant Routing in 2-D Faulty Meshes 1 Jie Wu Department of Computer Science and Engineering Florida Atlantic University Boca Raton,

More information

FT-Z-OE: A Fault Tolerant and Low Overhead Routing Algorithm on TSV-based 3D Network on Chip Links

FT-Z-OE: A Fault Tolerant and Low Overhead Routing Algorithm on TSV-based 3D Network on Chip Links FT-Z-OE: A Fault Tolerant and Low Overhead Routing Algorithm on TSV-based 3D Network on Chip Links Hoda Naghibi Jouybari College of Electrical Engineering, Iran University of Science and Technology, Tehran,

More information

Fault-adaptive routing

Fault-adaptive routing Fault-adaptive routing Presenter: Zaheer Ahmed Supervisor: Adan Kohler Reviewers: Prof. Dr. M. Radetzki Prof. Dr. H.-J. Wunderlich Date: 30-June-2008 7/2/2009 Agenda Motivation Fundamentals of Routing

More information

Design and Implementation of Low Complexity Router for 2D Mesh Topology using FPGA

Design and Implementation of Low Complexity Router for 2D Mesh Topology using FPGA Design and Implementation of Low Complexity Router for 2D Mesh Topology using FPGA Maheswari Murali * and Seetharaman Gopalakrishnan # * Assistant professor, J. J. College of Engineering and Technology,

More information

MESH-CONNECTED networks have been widely used in

MESH-CONNECTED networks have been widely used in 620 IEEE TRANSACTIONS ON COMPUTERS, VOL. 58, NO. 5, MAY 2009 Practical Deadlock-Free Fault-Tolerant Routing in Meshes Based on the Planar Network Fault Model Dong Xiang, Senior Member, IEEE, Yueli Zhang,

More information

DESIGN AND IMPLEMENTATION ARCHITECTURE FOR RELIABLE ROUTER RKT SWITCH IN NOC

DESIGN AND IMPLEMENTATION ARCHITECTURE FOR RELIABLE ROUTER RKT SWITCH IN NOC International Journal of Engineering and Manufacturing Science. ISSN 2249-3115 Volume 8, Number 1 (2018) pp. 65-76 Research India Publications http://www.ripublication.com DESIGN AND IMPLEMENTATION ARCHITECTURE

More information

Generalized Theory for Deadlock-Free Adaptive Wormhole Routing and its Application to Disha Concurrent

Generalized Theory for Deadlock-Free Adaptive Wormhole Routing and its Application to Disha Concurrent Generalized Theory for Deadlock-Free Adaptive Wormhole Routing and its Application to Disha Concurrent Anjan K. V. Timothy Mark Pinkston José Duato Pyramid Technology Corp. Electrical Engg. - Systems Dept.

More information

A Novel Energy Efficient Source Routing for Mesh NoCs

A Novel Energy Efficient Source Routing for Mesh NoCs 2014 Fourth International Conference on Advances in Computing and Communications A ovel Energy Efficient Source Routing for Mesh ocs Meril Rani John, Reenu James, John Jose, Elizabeth Isaac, Jobin K. Antony

More information

A Dynamic NOC Arbitration Technique using Combination of VCT and XY Routing

A Dynamic NOC Arbitration Technique using Combination of VCT and XY Routing 727 A Dynamic NOC Arbitration Technique using Combination of VCT and XY Routing 1 Bharati B. Sayankar, 2 Pankaj Agrawal 1 Electronics Department, Rashtrasant Tukdoji Maharaj Nagpur University, G.H. Raisoni

More information

A VERIOG-HDL IMPLEMENTATION OF VIRTUAL CHANNELS IN A NETWORK-ON-CHIP ROUTER. A Thesis SUNGHO PARK

A VERIOG-HDL IMPLEMENTATION OF VIRTUAL CHANNELS IN A NETWORK-ON-CHIP ROUTER. A Thesis SUNGHO PARK A VERIOG-HDL IMPLEMENTATION OF VIRTUAL CHANNELS IN A NETWORK-ON-CHIP ROUTER A Thesis by SUNGHO PARK Submitted to the Office of Graduate Studies of Texas A&M University in partial fulfillment of the requirements

More information

JUNCTION BASED ROUTING: A NOVEL TECHNIQUE FOR LARGE NETWORK ON CHIP PLATFORMS

JUNCTION BASED ROUTING: A NOVEL TECHNIQUE FOR LARGE NETWORK ON CHIP PLATFORMS 1 JUNCTION BASED ROUTING: A NOVEL TECHNIQUE FOR LARGE NETWORK ON CHIP PLATFORMS Shabnam Badri THESIS WORK 2011 ELECTRONICS JUNCTION BASED ROUTING: A NOVEL TECHNIQUE FOR LARGE NETWORK ON CHIP PLATFORMS

More information

SoC Design. Prof. Dr. Christophe Bobda Institut für Informatik Lehrstuhl für Technische Informatik

SoC Design. Prof. Dr. Christophe Bobda Institut für Informatik Lehrstuhl für Technische Informatik SoC Design Prof. Dr. Christophe Bobda Institut für Informatik Lehrstuhl für Technische Informatik Chapter 5 On-Chip Communication Outline 1. Introduction 2. Shared media 3. Switched media 4. Network on

More information

SoC Design Lecture 13: NoC (Network-on-Chip) Department of Computer Engineering Sharif University of Technology

SoC Design Lecture 13: NoC (Network-on-Chip) Department of Computer Engineering Sharif University of Technology SoC Design Lecture 13: NoC (Network-on-Chip) Department of Computer Engineering Sharif University of Technology Outline SoC Interconnect NoC Introduction NoC layers Typical NoC Router NoC Issues Switching

More information

Combining In-Transit Buffers with Optimized Routing Schemes to Boost the Performance of Networks with Source Routing?

Combining In-Transit Buffers with Optimized Routing Schemes to Boost the Performance of Networks with Source Routing? Combining In-Transit Buffers with Optimized Routing Schemes to Boost the Performance of Networks with Source Routing? J. Flich 1,P.López 1, M. P. Malumbres 1, J. Duato 1, and T. Rokicki 2 1 Dpto. Informática

More information

Interconnection Networks: Topology. Prof. Natalie Enright Jerger

Interconnection Networks: Topology. Prof. Natalie Enright Jerger Interconnection Networks: Topology Prof. Natalie Enright Jerger Topology Overview Definition: determines arrangement of channels and nodes in network Analogous to road map Often first step in network design

More information

Flow Control can be viewed as a problem of

Flow Control can be viewed as a problem of NOC Flow Control 1 Flow Control Flow Control determines how the resources of a network, such as channel bandwidth and buffer capacity are allocated to packets traversing a network Goal is to use resources

More information

ECE 669 Parallel Computer Architecture

ECE 669 Parallel Computer Architecture ECE 669 Parallel Computer Architecture Lecture 21 Routing Outline Routing Switch Design Flow Control Case Studies Routing Routing algorithm determines which of the possible paths are used as routes how

More information

NOW Handout Page 1. Outline. Networks: Routing and Design. Routing. Routing Mechanism. Routing Mechanism (cont) Properties of Routing Algorithms

NOW Handout Page 1. Outline. Networks: Routing and Design. Routing. Routing Mechanism. Routing Mechanism (cont) Properties of Routing Algorithms Outline Networks: Routing and Design Routing Switch Design Case Studies CS 5, Spring 99 David E. Culler Computer Science Division U.C. Berkeley 3/3/99 CS5 S99 Routing Recall: routing algorithm determines

More information

PERFORMANCE EVALUATION OF FAULT TOLERANT METHODOLOGIES FOR NETWORK ON CHIP ARCHITECTURE

PERFORMANCE EVALUATION OF FAULT TOLERANT METHODOLOGIES FOR NETWORK ON CHIP ARCHITECTURE PERFORMANCE EVALUATION OF FAULT TOLERANT METHODOLOGIES FOR NETWORK ON CHIP ARCHITECTURE By HAIBO ZHU A thesis submitted in partial fulfillment of the requirements for the degree of MASTER OF SCIENCE IN

More information

Networks-on-Chip Router: Configuration and Implementation

Networks-on-Chip Router: Configuration and Implementation Networks-on-Chip : Configuration and Implementation Wen-Chung Tsai, Kuo-Chih Chu * 2 1 Department of Information and Communication Engineering, Chaoyang University of Technology, Taichung 413, Taiwan,

More information

A Literature Review of on-chip Network Design using an Agent-based Management Method

A Literature Review of on-chip Network Design using an Agent-based Management Method A Literature Review of on-chip Network Design using an Agent-based Management Method Mr. Kendaganna Swamy S Dr. Anand Jatti Dr. Uma B V Instrumentation Instrumentation Communication Bangalore, India Bangalore,

More information

Interconnection Networks: Routing. Prof. Natalie Enright Jerger

Interconnection Networks: Routing. Prof. Natalie Enright Jerger Interconnection Networks: Routing Prof. Natalie Enright Jerger Routing Overview Discussion of topologies assumed ideal routing In practice Routing algorithms are not ideal Goal: distribute traffic evenly

More information

Combining In-Transit Buffers with Optimized Routing Schemes to Boost the Performance of Networks with Source Routing

Combining In-Transit Buffers with Optimized Routing Schemes to Boost the Performance of Networks with Source Routing Combining In-Transit Buffers with Optimized Routing Schemes to Boost the Performance of Networks with Source Routing Jose Flich 1,PedroLópez 1, Manuel. P. Malumbres 1, José Duato 1,andTomRokicki 2 1 Dpto.

More information

Global Adaptive Routing Algorithm Without Additional Congestion Propagation Network

Global Adaptive Routing Algorithm Without Additional Congestion Propagation Network 1 Global Adaptive Routing Algorithm Without Additional Congestion ropagation Network Shaoli Liu, Yunji Chen, Tianshi Chen, Ling Li, Chao Lu Institute of Computing Technology, Chinese Academy of Sciences

More information

SURVEY ON LOW-LATENCY AND LOW-POWER SCHEMES FOR ON-CHIP NETWORKS

SURVEY ON LOW-LATENCY AND LOW-POWER SCHEMES FOR ON-CHIP NETWORKS SURVEY ON LOW-LATENCY AND LOW-POWER SCHEMES FOR ON-CHIP NETWORKS Chandrika D.N 1, Nirmala. L 2 1 M.Tech Scholar, 2 Sr. Asst. Prof, Department of electronics and communication engineering, REVA Institute

More information

Lecture 3: Flow-Control

Lecture 3: Flow-Control High-Performance On-Chip Interconnects for Emerging SoCs http://tusharkrishna.ece.gatech.edu/teaching/nocs_acaces17/ ACACES Summer School 2017 Lecture 3: Flow-Control Tushar Krishna Assistant Professor

More information

OASIS Network-on-Chip Prototyping on FPGA

OASIS Network-on-Chip Prototyping on FPGA Master thesis of the University of Aizu, Feb. 20, 2012 OASIS Network-on-Chip Prototyping on FPGA m5141120, Kenichi Mori Supervised by Prof. Ben Abdallah Abderazek Adaptive Systems Laboratory, Master of

More information

FPGA based Design of Low Power Reconfigurable Router for Network on Chip (NoC)

FPGA based Design of Low Power Reconfigurable Router for Network on Chip (NoC) FPGA based Design of Low Power Reconfigurable Router for Network on Chip (NoC) D.Udhayasheela, pg student [Communication system],dept.ofece,,as-salam engineering and technology, N.MageshwariAssistant Professor

More information

Fault-Tolerant and Deadlock-Free Routing in 2-D Meshes Using Rectilinear-Monotone Polygonal Fault Blocks

Fault-Tolerant and Deadlock-Free Routing in 2-D Meshes Using Rectilinear-Monotone Polygonal Fault Blocks Fault-Tolerant and Deadlock-Free Routing in -D Meshes Using Rectilinear-Monotone Polygonal Fault Blocks Jie Wu Department of Computer Science and Engineering Florida Atlantic University Boca Raton, FL

More information

Network-on-chip (NOC) Topologies

Network-on-chip (NOC) Topologies Network-on-chip (NOC) Topologies 1 Network Topology Static arrangement of channels and nodes in an interconnection network The roads over which packets travel Topology chosen based on cost and performance

More information
