SOFTWARE BASED FAULT-TOLERANT OBLIVIOUS ROUTING IN PIPELINED NETWORKS*


Young-Joo Suh, Binh Vien Dao, Jose Duato, and Sudhakar Yalamanchili

Computer Systems Research Laboratory, School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, Georgia
Facultad de Informatica, Universidad Politecnica de Valencia, P.O.B, Valencia, Spain
{suh, dao, phone: (404) fax: (404)

Abstract -- This paper presents a software based approach to fault-tolerant routing in oblivious, wormhole routed networks. When a message encounters a faulty output link, it is removed from the network by the local router and delivered to the messaging layer of the local node's operating system. The message passing software can re-route this message along a non-minimal oblivious path or via an intermediate node, which will forward the message to the destination. A message may encounter multiple faults and pass through multiple intermediate nodes. This paper discusses deadlock, livelock, and performance issues. Router designs are minimally impacted, remaining compact, oblivious, and fast. Therefore this approach is a good candidate for incorporation into the next generation of wormhole routed multiprocessor networks.

1.0 Introduction

Interconnection networks in modern machines make use of oblivious, fixed path wormhole routing to achieve high throughput and low message latency. However, fault tolerant communication requires the network to be able to dynamically route messages along alternative, possibly non-minimal paths. This paper proposes a software based approach for re-routing messages blocked by faults. The techniques reported here are motivated by several considerations. First, the approach is targeted towards environments where the fault rates are relatively low, i.e., on the order of a maximum of 3 failed components between repair cycles. During this time we wish the machine to continue functioning, with possibly degraded communication performance.
Solutions for higher fault rates have been addressed elsewhere [9]. Second, we wish to retain the features of existing oblivious router designs, i.e., compactness and speed. This implies that additional hardware complexity in the form of additional virtual channels should be avoided. Finally, we wish to make the common case fast. Therefore messages that do not encounter faults should be minimally impacted.

The basic idea is quite simple. When a message encounters a faulty link, it is removed from the network by the local router and delivered to the messaging layer of the local node's operating system. The message passing software either i) modifies the header so the message may follow an alternative dimension order path, or ii) computes an intermediate node address. In either case, the message is re-injected into the network. In the case that the message is transmitted to the intermediate node, it will be forwarded upon receipt to the final destination. A message may encounter multiple faults and pass through multiple intermediate nodes.

The problem is distinct from adaptive packet routing in networks using packet switching or virtual cut through [14]. Routing is oblivious and is based on wormhole routing. Re-routing must consider dependencies across multiple routers caused by small buffers (< message size) and pipelined dataflow. The proposed techniques accommodate a wider range of fault patterns than previously proposed wormhole routing techniques. Only messages that encounter faults are affected, and degradation is largely proportional to the number of faults. If the mean time between failures is large, this approach is viable. Thus, we feel it is a good candidate for inclusion in the next generation of wormhole routed networks, and it targets commercial multiprocessors.

* This research was supported in part by a grant from the National Science Foundation under grant CCR
This is particularly true when the application environment does not justify the use of expensive, custom, fault-tolerant backplanes.

2.0 Fault Model

This paper considers k-ary n-cube networks. Adjacent faulty links and faulty nodes are coalesced into fault regions. Fault regions may overlap, forming a larger fault region. We also assume that the fault regions do not disconnect the network. Fault regions may be convex, or L-shaped or U-shaped concave regions. Some example fault regions are illustrated in Figure 1. Messages are routed obliviously, and therefore cannot progress when the required output link at a node leads to a fault region. Such a message is removed from the network and delivered to the local node's message handling software. The message is said to be absorbed at the local node and is subsequently referred to as a faulted message. This status is recorded in the message header.

3.0 Fault Tolerant Oblivious Routing

This section describes the software-based fault-tolerant oblivious routing algorithm, e-sft. In the absence of faults, the e-sft routing algorithm is equivalent to the traditional dimension order routing algorithm, e-cube [8]. When the outgoing link at a node leads to a fault region, the message is absorbed and delivered to the messaging layer of the local node. The header may now be modified to reflect a different, non-minimal path around the faulty region. For example, in Figure 1, a message from node P to node Q is blocked by a fault region. The message can be transmitted along the negative X-dimension across the wraparound channel to node Q. Alternatively, an intermediate node may be selected, and the message is sent to this node. The intermediate node will receive the message and

forward it to the final destination. For example, in Figure 1, a message from node P to node R may first be sent to intermediate node T. Node T can then transmit the message in dimension order to node R. If other faults are encountered by the re-routed message, the process is repeated.

3.1 Routing Header

In order to keep track of the manner in which a message is re-routed, the header contains a 2 bit flag called the Direction Flag (DF). The DF is used to record the following information regarding the path that a message has taken:

00 - Traversing the shortest distance along the X-dimension
01 - Traversing the longer distance along the X-dimension
10 - Traversing the shortest distance along the Y-dimension
11 - Traversing the longer distance along the Y-dimension

For example, the message from P to Q in Figure 1 will have a DF value of <01>. Since fault-free routing is dimension order, a DF value of <10> would imply the message had attempted to traverse the X-dimension in both directions and is now trying to traverse the Y-dimension. The exception to this interpretation is when the source and destination nodes are in the same column. In this case the message will attempt the Y-dimension first and the interpretation of DF is reversed, i.e., <00> and <01> (<10> and <11>) refer to the Y-dimension (X-dimension). The DF value in the header is only modified when the message is absorbed at a node. It enables e-sft routing to keep track of the directions along each dimension that have been attempted and therefore aids in re-routing decisions.

There are three additional fields in the header. A faulted status bit (F) indicates that the message has encountered at least one fault and is being re-routed. This bit enables a node to distinguish between messages destined for the local node and messages which must be forwarded. A prevent flag (PF) status bit is used to prevent the occurrence of certain livelock situations.
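The DF encoding listed above can be sketched in a few lines of Python. This is only an illustrative reading of the text, not the authors' implementation: the names `DF`, `interpret_df`, and the `same_column` parameter are hypothetical, and the swap of X and Y follows the stated exception for source and destination nodes in the same column.

```python
from enum import IntEnum
from typing import Tuple


class DF(IntEnum):
    """2-bit Direction Flag values as listed in the text (names are hypothetical)."""
    FIRST_SHORT = 0b00   # shortest distance along the first-tried dimension
    FIRST_LONG = 0b01    # longer distance along the first-tried dimension
    SECOND_SHORT = 0b10  # shortest distance along the other dimension
    SECOND_LONG = 0b11   # longer distance along the other dimension


def interpret_df(df: DF, same_column: bool) -> Tuple[str, str]:
    """Return (dimension, path) for a DF value.

    Normally <00>/<01> refer to X and <10>/<11> to Y. When source and
    destination are in the same column, the interpretation is reversed.
    """
    first, second = ("Y", "X") if same_column else ("X", "Y")
    dimension = first if df in (DF.FIRST_SHORT, DF.FIRST_LONG) else second
    path = "shortest" if df in (DF.FIRST_SHORT, DF.SECOND_SHORT) else "longer"
    return dimension, path


# The P-to-Q message in Figure 1 carries DF = <01>: longer path in X.
print(interpret_df(DF(0b01), same_column=False))
```

Keeping the flag as a two-bit integer mirrors the header field, while the interpretation stays in software, consistent with the statement that DF is only modified when a message is absorbed.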
The role of the PF will become clear from the example described later. A two bit re-route table field (RT) specifies one of three tables to be used for re-routing decisions. Finally, since messages may be routed through intermediate nodes, the header must contain two sets of address fields. The first records the final destination address (X-Final, Y-Final). This is an absolute address. The second is used for routing the messages and is an offset within each dimension (X-Dest, Y-Dest). The message header now appears as shown in Figure 2. Note that the routers only process the offset fields and set the F bit. All of the remaining header processing is done in software, and only when messages encounter faults. Thus, router operations are minimally impacted.

[Figure 1. Examples of Fault Regions]

3.2 The Routing Algorithm

The network hardware routes messages using traditional dimension order routing based on the X-Dest and Y-Dest fields in the message header. If the outgoing channel at a router is faulty, the router sets the F bit and routes the message to the local processor interface. This causes the message to be marked as a faulted message and ejected to the local messaging layer. If the F bit is set, the messaging software checks the X-Final and Y-Final fields to determine if the message is to be delivered locally. If not, a re-routing function is invoked. The X-Dest and Y-Dest fields are updated, and if necessary the DF and PF flags (as described below) are modified. The re-routing function depends only on the relative address of the first node where a fault is encountered and the final destination. Let the coordinates of the node where the message first encounters a faulty channel be $(x_f, y_f)$ and the coordinates of the destination be $(x_d, y_d)$.
Let the offsets at $(x_f, y_f)$ in each dimension be $\Delta x$ (given by $x_d - x_f$) and $\Delta y$ (given by $y_d - y_f$). There are three possible cases: i) $\Delta y = 0$, ii) $\Delta x = 0$, and iii) $\Delta x \neq 0$ and $\Delta y \neq 0$. The re-routing decisions for each case are captured in Tables 1-3. The RT field identifies which of the three cases applies to the message and is set at $(x_f, y_f)$. The notation $(x_d, y_d)^{s}_{x}$ signifies that the message header is modified so that the message is transmitted to the node with coordinates $(x_d, y_d)$ along the shortest path in the X-dimension. The notation $(x_d, y_d)^{l}_{x}$ signifies transmission along the longer path in the X-dimension. Note that re-routed messages still follow dimension order, though not the shortest path within a dimension (due to faults). When re-routed messages are absorbed at nodes due to faults, the RT field identifies the table to be used in making the re-routing decision for the message. If the RT field is 0, then this is the first node at which the message encountered a fault and RT must be set to signify one of the three cases above.

[Figure 2. Format of the message header. Fields: RT, DF, F, PF, X-Dest (X-coordinate offset), Y-Dest (Y-coordinate offset), X-Final (X-coordinate of final destination), Y-Final (Y-coordinate of final destination).]

The tables can be interpreted as follows. If a header to be re-routed has a DF value shown in the first column, it is re-routed to the node shown in the second column. The notation specifies the direction and the dimension in which the message is to be transmitted. The remarks column specifies the action that takes place if the message subsequently encounters a fault before reaching the node specified in column 2 or arrives at the intermediate node without

meeting a fault. The tables attempt to capture the following ideas. When a message encounters a fault, it is first re-routed in the same dimension in the opposite direction. If another fault is encountered, the message is routed in an orthogonal dimension in an attempt to route around the faulty regions. The DF keeps track of the directions attempted. The PF will be shown to prevent certain livelock situations within concave fault regions.

Table 1: Re-routing decisions for the case $\Delta y = 0$

DF | Re-route to | Remarks
00 | $(x_d, y_d)^{s}_{x}$ | DF=<01> after meeting a fault
01 | $(x_d, y_d)^{l}_{x}$ | DF=<10> after meeting a fault if PF = 0; DF=<11> after meeting a fault if PF = 1
10 | $(x_c, y_c + r)^{s}_{y}$ | DF=<00> & PF is unchanged if no fault met; DF=<11> and PF = 1 if fault met
11 | $(x_c, y_c - r)^{s}_{y}$ | DF=<00> after received by a node

Table 2: Re-routing decisions for the case $\Delta x = 0$

DF | Re-route to | Remarks
00 | $(x_c, y_d)^{s}_{y}$ | [missing in source]
01 | $(x_c, y_d)^{l}_{y}$ | DF=<10> after meeting a fault if PF = 0; DF=<11> after meeting a fault if PF = 1
10 | $(x_c + r, y_c)^{s}_{x}$ | DF=<00> & PF is unchanged if no fault met; DF=<11> and PF = 1 if fault met
11 | $(x_c - r, y_c)^{s}_{x}$ | DF=<00> after received by a node

Table 3: Re-routing decisions for the case $\Delta x \neq 0$ & $\Delta y \neq 0$

DF | Re-route to | Remarks
00 | [missing in source] | [missing in source]
01 | [missing in source] | DF=<10> after meeting a fault
10 | $(x_c, y_d)^{s}_{y}$ | DF=<11> after meeting a fault; DF=<00> if no fault met
11 | $(x_c, y_d)^{l}_{y}$ | DF=<00> after received by a node

Table 1 shows the routing decisions for the case $\Delta y = 0$, i.e., the first node at which the message was absorbed and the destination node are on the same row. The message is injected into the network with DF=<00> and PF=0 along the shortest path in the X-dimension. This is captured in the first row of the table. The remarks column indicates the action taken when the message encounters a faulty link. In this case, DF is changed to <01> to signify re-routing along the longer path in the X-dimension and the message is re-injected. If the message encounters another faulty channel before reaching $(x_d, y_d)$, DF is changed to <10> (PF=0) or <11> (PF=1) as denoted in the remarks column of row 2. Let us assume DF is set to <10>.
In this case, from row 3 we see that the message is re-routed r hops along the positive direction in the Y-dimension in an attempt to find a path around the fault region. The message is explicitly sent to an intermediate node $(x_c, y_c + r)$, where $(x_c, y_c)$ is the current node. When the message reaches this intermediate node, it is absorbed and DF is reset. However, if instead the message with DF=<10> encounters a faulty channel, DF is set to <11> and PF is set to 1, and the message is re-routed r hops in the negative direction in the Y-dimension. The choice of r is arbitrary. The right choice depends upon the expected height of the fault region. The safest (and most expensive) choice would be to use r = 1 hop.

Table 2 illustrates the routing decisions for the case of $\Delta x = 0$. These are nearly identical to the case for $\Delta y = 0$. However, when the offset in the Y-dimension is eliminated at an intermediate node, the RT field is changed to select Table 1. Table 3 captures decisions for the case of $\Delta x \neq 0$ & $\Delta y \neq 0$. In this case the message must traverse both the X- and Y-dimensions, and, after the message encounters a faulty channel along the longer path in the X-dimension, e-sft attempts to eliminate the offset in one of the dimensions, reducing subsequent re-routing decisions to those captured in Table 1 and Table 2. When one of the dimensions has been eliminated, the RT field is changed to reflect the choice of Table 1 or Table 2.

Figure 1 illustrates an example where $\Delta y = 0$. In the figure, S denotes the source node and D the destination node. The path taken by the message is numbered. A message sent with DF=<00> from S meets a faulty channel and is absorbed by node A (1). Since $\Delta y = 0$, DF is set to <01> and the message is transmitted to $(x_d, y_d)^{l}_{x}$, where it meets a faulty channel at node B. Now DF is changed to <10> and the message is to be transmitted along the Y-dimension. However, since $\Delta y = 0$, the purpose now is to traverse the Y-dimension just far enough to be routed around the faulty region.
In this example r = 1, so the message is sent to node C, which is located one hop away from node B (3). Since the message is received rather than being ejected due to a fault, DF is reset to <00>. The message tries the shortest path in the X-dimension and fails (4), and the process is repeated until step 8. After step 8, DF is set to <10> and node F tries to send the message to $(x_c, y_c + 1)^{s}_{y}$. However, the Y-dimension channel is faulty. Therefore DF is changed to <11>, which indicates that the message should be sent in the opposite direction in the Y-dimension. After some more failures in the X-dimension, the message arrives at node H. In the next step, the message is delivered to the final destination.

Note that in step 11 and step 5 (or step 2 and step 14) the DF value is <01>, and after both steps the message meets a faulty channel at node C in the X-dimension. If node C changes DF to <10> after both steps 5 and 11, the message passes through nodes E-C-F-G-F-C-E-C-F..., infinitely, resulting in livelock. Therefore we require DF to be set to <11> after step 11 to force the message to traverse the negative Y-dimension. This distinction is realized by the PF bit in the header. The PF is initially 0. When DF makes a transition from <10> to <11>, PF is set to 1 and remains at 1. The value of PF is used to prevent cycles in the course of the message transmission (which correspond to a cycle of DF values) and thus avoid certain livelock situations.
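The DF and PF transitions described for the case of a zero Y offset (Table 1) can be sketched as a small state machine. This is a minimal illustration under stated assumptions, not the authors' messaging-layer code: the function names `next_df_pf` and `reroute_target`, the default ring size `k=16`, and the modulo wraparound of the intermediate node coordinate are all hypothetical choices made for the sketch, and the event sequence replayed at the end is an abstracted trace rather than the exact numbered path of Figure 1.

```python
from typing import Tuple


def next_df_pf(df: int, pf: int, fault_met: bool) -> Tuple[int, int]:
    """Next (DF, PF) following the remarks column of Table 1."""
    if df == 0b00 and fault_met:
        return 0b01, pf                      # try the longer path in X
    if df == 0b01 and fault_met:
        return (0b11 if pf == 1 else 0b10), pf
    if df == 0b10:
        # Fault met: switch to negative Y and latch PF; otherwise reset DF.
        return (0b11, 1) if fault_met else (0b00, pf)
    if df == 0b11 and not fault_met:
        return 0b00, pf                      # received by the intermediate node
    raise ValueError("combination not covered by Table 1")


def reroute_target(df: int, cur: Tuple[int, int], dest: Tuple[int, int],
                   r: int = 1, k: int = 16) -> Tuple[int, int]:
    """Node the header is re-addressed to for a given DF value.

    Assumption: coordinates wrap modulo k, as in a k-ary torus.
    """
    xc, yc = cur
    if df in (0b00, 0b01):
        return dest                          # still heading for (x_d, y_d) in X
    step = r if df == 0b10 else -r           # <10>: +r hops in Y, <11>: -r hops
    return (xc, (yc + step) % k)


# Replay an abstract sequence of absorptions: True = absorbed due to a fault,
# False = received at an intermediate node without meeting a fault.
df, pf = 0b00, 0
trace = []
for fault in [True, True, False, True, True, True, False, True, True]:
    df, pf = next_df_pf(df, pf, fault)
    trace.append((format(df, "02b"), pf))
print(trace)
```

In the replay, the second time a message in state <01> meets a fault, PF is already 1, so DF moves to <11> instead of <10>. That is the distinction the text relies on to break the cycle of DF values.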

3.3 Deadlock and Livelock Freedom

A few observations can be made about the behavior of e-sft routing. Re-routed messages still follow dimension order, which is deadlock-free [8], though not necessarily the shortest path in each dimension. When intermediate nodes are utilized, these nodes are selected to be in the same column or row as the current node. Finally, absorbed messages use dynamically allocated buffers in node memory (rather than router buffers) to prevent introducing dependencies between consumption channels and injection channels. It has been shown that these latter dependencies could lead to deadlock [2]. Thus we have the following theorem.

Theorem 1. e-sft is deadlock-free.

Due to space limitations, the proof of deadlock freedom is provided in a detailed technical report [15]. While e-sft is deadlock-free, it is not necessarily livelock-free for arbitrarily large fault patterns. However, when the number of faults is limited to a small number, e.g., 3, livelock freedom can also be guaranteed [15]. Under the current fault model, the PF flag enables a message to exit certain types of concave regions, and routing does proceed around convex regions. These observations and experience with simulations are encouraging. However, the issue of livelock-freedom for the current fault model is still under study.

4.0 Performance Evaluation

The performance of e-sft was evaluated with flit-level simulation studies of message passing in a 16-ary 2-cube with 16 flit long messages and a single flit routing header. We use a congestion control mechanism (similar to [1]) that places a limit on the size of the buffer on the injection channels. If the input buffers are filled, messages cannot be injected into the network until a message in the buffer has been routed. Injection rates are specified as the number of 16 flit messages injected each 5000 cycle period. Thus, an injection rate of 20 corresponds to 0.064 flits/node/cycle (20 x 16 / 5000). Note that these are 32 bit flits.
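As a unit check on the injection rates defined above, the conversion from messages per period to flits per node per cycle can be written out directly. This is a hedged sketch: the function name `flits_per_node_per_cycle` is hypothetical, and it assumes the stated rate is counted per node, which the flits/node/cycle unit suggests but the text does not state explicitly.

```python
def flits_per_node_per_cycle(messages_per_period: int,
                             flits_per_message: int = 16,
                             period_cycles: int = 5000) -> float:
    """Convert an injection rate (messages per period) to flits/node/cycle.

    Assumption: the rate is a per-node count of 16-flit messages
    injected in each 5000-cycle period.
    """
    return messages_per_period * flits_per_message / period_cycles


# Injection rate of 20 messages per 5000-cycle period.
rate_20 = flits_per_node_per_cycle(20)
print(rate_20)
```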
A 32 bit header enables routing within a torus. A flit crosses a channel in a single cycle, and traverses a router from input to output in a single cycle. Routing decisions are assumed to take a single cycle with the network operating with a 50 MHz clock and a 20 ns cycle time. The software cost for absorbing and re-injecting a message is derived from measured times on an Intel Paragon and reported work with active message implementations [11]. Based on these studies we assess this cost at 25 µs per absorption/injection, or 50 µs each time a message must be processed by the messaging software at an intermediate node. If the message encounters busy injection buffers when it is being re-injected, it is re-queued for re-injection at a later time. Absorbed messages have priority over new messages to prevent starvation. Relative to existing router designs [4], the only additional functionality required is in the Routing Arbitration Block [4]. One side effect of the increased header size is a possible increase in virtual channel buffer size and the width of the internal datapaths, although 32 bit datapaths appear to be reasonable for the next generation of routers. The remaining required functionality of e-sft is implemented in the messaging layer software.

4.1 Simulation Results

In a fault-free network the behavior of e-cube and e-sft is identical. Simulation experiments placed a single fault region of varying size within the network. Performance of e-sft is shown in Figure 3 for three different sized concave fault regions (5, 8, and 11 nodes) and for a 9 node convex fault region. Due to the greater difficulty in entering and exiting a concave fault region, the average message latency for e-sft is greater in the presence of concave fault regions than for equivalent sized convex fault regions. The curves also show that for each of the particular fault configurations, the latency remains relatively constant as the throughput increases.
As the throughput increases, the number of messages each node injects and receives increases, but the percentage of messages that encounter the fault region remains relatively constant. Therefore, the latency remains relatively flat.

[Figure 3. Latency-throughput curves (Latency Vs. Throughput). x-axis: Throughput (Flits/Cycle/Node); y-axis: Latency (Clock Cycles); curves: 5 Node Concave, 8 Node Concave, 11 Node Concave, 9 Node Convex.]

[Figure 4. Latency-throughput vs. node faults. Panels: Latency Vs Faults and Throughput Vs Faults; x-axis: Faults (Node Failures); y-axes: Latency (Clock Cycles) and Throughput (Flits/Cycle/Node); curves: Inject = 10, 20, 40, 60.]

Another factor is that the high latencies of re-routed messages

mask some of the growth in the latency of messages that do not encounter faults, though a close inspection of the graphs reveals a small but steady growth in average latency.

Figure 4 shows the performance of e-sft in the presence of a single convex fault region ranging in size from 1 failed router node to 21 failed router nodes. The latency plot indicates that when the network is below saturation traffic, the increase in the size of the fault block causes significant increases in the average message latency. This is due to the increase in the number of messages encountering larger fault regions (a 1 node fault region represents 0.4% of the total number of nodes in the network, while a 21 node fault region represents 8.2% of the total number of nodes).

The latency and throughput curves for high injection rates (60) represent an interesting case. Throughput and latency appear to remain relatively constant. At high rates and larger fault regions, more messages become absorbed and re-routed. However, the limited buffer size provides a natural throttling mechanism for both new messages and absorbed messages waiting to be re-injected. As a result, active faulted messages in the network form a smaller percentage of the traffic, and both the latency and throughput characteristics are dominated by the steady state values of traffic unaffected by faults. The initial drop in throughput for a small number of faults is due to the fact that a higher percentage of faulted messages are delivered, reducing throughput. These results suggest that sufficient buffering of faulted messages and priorities in re-injecting them have a significant impact on the performance of faulted messages. At lower injection rates the throughput of the network remains relatively constant, independent of the size of the fault blocks, since fault blocks only increase the latency of the messages.
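The percentages quoted above, and the cycle cost of the software overhead stated in Section 4.0, follow from simple arithmetic on the simulation parameters. The sketch below is only a check of those figures: the helper names `fault_fraction` and `overhead_cycles` are hypothetical, and it assumes the 16-ary 2-cube has 16 x 16 = 256 nodes and uses the 20 ns cycle time and 50 µs per-node software cost given earlier.

```python
def fault_fraction(failed_nodes: int, k: int = 16, n: int = 2) -> float:
    """Fraction of nodes inside the fault region in a k-ary n-cube."""
    return failed_nodes / (k ** n)


def overhead_cycles(overhead_us: float, cycle_ns: float = 20.0) -> float:
    """Software overhead expressed in network clock cycles."""
    return overhead_us * 1000.0 / cycle_ns


# Share of the 256 nodes covered by the smallest and largest fault regions.
print(round(100 * fault_fraction(1), 1), round(100 * fault_fraction(21), 1))

# Cycles consumed each time a message passes through the messaging layer.
print(overhead_cycles(50))
```

Because one pass through the messaging layer costs thousands of cycles while a flit crosses a channel in one, even a few re-routing steps dominate the latency of a faulted message.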
Since messages are guaranteed delivery, when operating well below saturation the network quickly reaches the steady state throughput. The effect of the overhead on the message latencies can be significant. Message latency histograms show peaks at intervals of 2500 cycles (corresponding to the 50 µs software overhead each time a message passes through the messaging layer software at a node). Among messages that do encounter faults, it appears that on average the majority of faulted messages do not require more than three re-routing steps to be routed around the fault region. In general, the results demonstrate good performance with messages being re-routed a few times. In practice, we find that the probability of multiple router failures before repair is very low. Therefore we expect that the large majority of faulted messages will not have to pass through more than one node. This would make these techniques attractive for next generation wormhole routed networks.

5.0 Concluding Remarks

We find that performance of re-routed messages is significantly affected by the techniques for buffering and re-injection. While the large majority of traffic is unaffected by faults, reliable message delivery and improving the latency of faulted messages will require better understanding of how re-routed messages should be handled. Finally, it appears that this approach can be extended naturally to networks employing adaptive routing. For example, many fully adaptive routing protocols rely on dimension order routing over a subset of channels [10,7] to avoid deadlock. Messages which are blocked waiting on these channels and experience faults on these channels can be absorbed and re-routed. Furthermore, partially adaptive routing protocols such as those based on the Turn Model [13] can also be adapted in a similar fashion. These issues and extensions are the subject of ongoing research.

REFERENCES

[1] R. Boppana and S. Chalasani, A comparison of adaptive wormhole routing algorithms, Proc. of Int. Symp. on Computer Architecture, pp ,
[2] R. Boppana and S. Chalasani, Fault-tolerant routing with non-adaptive wormhole algorithms in mesh networks, Proc. of Supercomputing, pp ,
[3] S. Chalasani and R. Boppana, Fault-tolerant wormhole routing in tori, Proc. of Int. Conf. on Supercomputing,
[4] A. Chien, A cost and speed model for k-ary n-cube wormhole routers, Proc. of Hot Interconnects Workshop, Aug
[5] A. Chien and J. H. Kim, Planar-adaptive routing: Low-cost adaptive networks for multiprocessors, Proc. of Int. Symp. on Computer Architecture, pp ,
[6] W. J. Dally, Virtual-channel flow control, IEEE Trans. on Parallel and Distributed Systems, vol. 3, pp , March
[7] W. J. Dally and H. Aoki, Deadlock-free adaptive routing in multicomputer networks using virtual channels, IEEE Trans. on Parallel and Distributed Systems, vol. 4, pp , April
[8] W. J. Dally and C. L. Seitz, Deadlock-free message routing in multiprocessor interconnection networks, IEEE Trans. on Computers, C-36, pp. , May
[9] B. V. Dao, J. Duato, and S. Yalamanchili, Configurable flow control mechanisms for fault tolerant routing, Proc. of Int. Symp. on Computer Architecture, June
[10] J. Duato, A new theory of deadlock-free adaptive routing in wormhole networks, IEEE Trans. on Parallel and Distributed Systems, vol. 4, pp , Dec
[11] T. von Eicken, D. E. Culler, S. C. Goldstein, and K. E. Schauser, Active messages: a mechanism for integrated communication and computation, Proc. of Int. Symp. on Computer Architecture, pp ,
[12] C. J. Glass and L. Ni, Fault-tolerant wormhole routing in meshes, Proc. of the Fault Tolerant Computing Symposium,
[13] C. J. Glass and L. Ni, The turn model for adaptive routing, Proc. of Int. Symp. on Computer Architecture, pp ,
[14] S. Konstantinidou and L. Snyder, Chaos router: architecture and performance, Proc. of Int. Symp. on Computer Architecture, pp ,
[15] Y. J. Suh, B. V. Dao, J. Duato, and S. Yalamanchili, Software based fault tolerant routing, Technical Report TR-GIT/CSRL-95/04, Georgia Institute of Technology, Atlanta, Georgia, April 1995.

Deadlock- and Livelock-Free Routing Protocols for Wave Switching

Deadlock- and Livelock-Free Routing Protocols for Wave Switching Deadlock- and Livelock-Free Routing Protocols for Wave Switching José Duato,PedroLópez Facultad de Informática Universidad Politécnica de Valencia P.O.B. 22012 46071 - Valencia, SPAIN E-mail:jduato@gap.upv.es

More information

Fault-Tolerant Wormhole Routing Algorithms in Meshes in the Presence of Concave Faults

Fault-Tolerant Wormhole Routing Algorithms in Meshes in the Presence of Concave Faults Fault-Tolerant Wormhole Routing Algorithms in Meshes in the Presence of Concave Faults Seungjin Park Jong-Hoon Youn Bella Bose Dept. of Computer Science Dept. of Computer Science Dept. of Computer Science

More information

A Simple and Efficient Mechanism to Prevent Saturation in Wormhole Networks Λ

A Simple and Efficient Mechanism to Prevent Saturation in Wormhole Networks Λ A Simple and Efficient Mechanism to Prevent Saturation in Wormhole Networks Λ E. Baydal, P. López and J. Duato Depto. Informática de Sistemas y Computadores Universidad Politécnica de Valencia, Camino

More information

Fault-Tolerant Routing Algorithm in Meshes with Solid Faults

Fault-Tolerant Routing Algorithm in Meshes with Solid Faults Fault-Tolerant Routing Algorithm in Meshes with Solid Faults Jong-Hoon Youn Bella Bose Seungjin Park Dept. of Computer Science Dept. of Computer Science Dept. of Computer Science Oregon State University

More information

Optimal Topology for Distributed Shared-Memory. Multiprocessors: Hypercubes Again? Jose Duato and M.P. Malumbres

Optimal Topology for Distributed Shared-Memory. Multiprocessors: Hypercubes Again? Jose Duato and M.P. Malumbres Optimal Topology for Distributed Shared-Memory Multiprocessors: Hypercubes Again? Jose Duato and M.P. Malumbres Facultad de Informatica, Universidad Politecnica de Valencia P.O.B. 22012, 46071 - Valencia,

More information

Communication in Multicomputers with Nonconvex Faults

Communication in Multicomputers with Nonconvex Faults Communication in Multicomputers with Nonconvex Faults Suresh Chalasani Rajendra V. Boppana Technical Report : CS-96-12 October 1996 The University of Texas at San Antonio Division of Computer Science San

More information

A Hybrid Interconnection Network for Integrated Communication Services
