Characterization of Deadlocks in Interconnection Networks
Sugath Warnakulasuriya, Timothy Mark Pinkston
SMART Interconnects Group, EE-System Dept., University of Southern California, Los Angeles, CA

Abstract

Deadlock-free routing algorithms have been developed recently without fully understanding the frequency and characteristics of deadlocks. Using a simulator capable of true deadlock detection, we measure a network's susceptibility to deadlock due to various design parameters. The effects of bidirectionality, routing adaptivity, virtual channels, buffer size and node degree on deadlock formation are studied. In the process, we provide insight into the frequency and characteristics of deadlocks and the relationship between routing flexibility, blocked messages, resource dependencies and the degree of correlation needed to form deadlock.

1 Introduction

Interconnection network routing algorithms aim to minimize message blocking by efficiently utilizing network virtual channel and physical channel resources while ensuring deadlock freedom. Routing approaches to accomplish this can be based on avoiding deadlock or on recovering from deadlock. The main distinction between these two approaches is the decision made in trading off routing freedom and deadlock formation. Avoidance-based routing algorithms enforce certain routing restrictions in order to avoid deadlocks altogether [1, 2, 3]. Recovery-based routing algorithms relax routing restrictions and recover from potential deadlock situations [4, 5].

The circumstances under which either routing approach is preferable depend critically on the frequency with which deadlocks occur and the resulting effects. For instance, deadlock may be so infrequent for a particular network configuration that avoidance-based routing inefficiently uses network resources, resulting in frequent message blocking.
On the other hand, deadlock may be so frequent and costly in some network configurations that avoidance-based routing outperforms recovery-based routing.

This paper precisely quantifies the frequency and characteristics of deadlock formation in wormhole and cut-through k-ary n-cube networks and identifies network design parameters which influence deadlock formation. This enables us to better understand the nature of deadlocks and their likelihood and to determine the circumstances under which routing algorithms should be based on recovery as opposed to avoidance. In accomplishing this, we analyze the effects of different traffic patterns, bidirectionality, routing adaptivity, node degree, number of virtual channels and buffer depth on the frequency and characteristics of deadlocks. To our knowledge, no other study of router-related deadlock in interconnection networks has been performed in the detail presented here.

In the next section, we classify deadlocks through examples. Section 3 presents the experiments we performed and the results. Section 4 presents related work, and important findings are summarized in Section 5.

This research was supported by an NSF Research Initiation Award, grant ECS, and an NSF Career Award, grant ECS.

2 Deadlock Formation

Deadlocks in interconnection networks can occur as a result of cyclic resource dependencies formed when messages hold onto some resources (i.e., virtual channels) while waiting to acquire others. As a message progresses through a network, it acquires exclusive ownership of a virtual channel (VC) prior to each hop. When the header flit of a message blocks, it can be thought of as requesting the exclusive use of one of possibly many alternative VCs in order to progress to the next hop. A blocked message resumes once a new VC is acquired. As the tail of a message moves through the network, it releases previously acquired VCs that are no longer needed, so they can become available for other messages.
The exclusive ownership and resource wait-for conditions, along with the condition that messages are not preempted, make cyclic dependencies and deadlock possible.

2.1 Depicting Deadlocks

We use channel wait-for graphs (CWGs) [6] to model resource dependencies within interconnection networks. Although similar to dependency graphs used in previous work [, 7, 8], these graphs depict network state reflecting resource allocations and requests existing at a particular point in time, not the resource allocations allowed by the routing algorithm. Hence, in this context, CWGs depicting the entire network state are not necessarily connected.

Figures 1 through 4 show examples of messages being routed in k-ary n-cube wormhole networks, along with the corresponding CWGs. In the network illustrations (Figures 1a, 2a, 3a and 4a), the source and destination nodes of message m_i are labeled s_i and d_i, respectively. VC labeling in these figures is done only to facilitate explanation and is not
intended to convey information regarding the relative positions of VCs within the network. In the CWGs (Figures 1b, 2b, 3b and 4b), vertices represent VCs. Outgoing arc(s) at each vertex are labeled with the message which currently owns that VC. A path formed by a series of solid arcs with the same label indicates the temporal order in which VCs were acquired and continue to be owned by a particular message. Blocked messages are represented by connecting the ends of such paths to one or more desired VCs using dashed arcs. At any vertex, the labels of incoming dashed arcs represent the group of messages that desire to use that VC at this instant in time. Only those portions of the network's CWGs useful for illustrative purposes are shown in these figures.

Figure 1. (a) A "single-cycle deadlock" for DOR with 1 VC. (b) The CWG contains a knot.

Figure 1a shows five messages being routed statically in dimension order within a torus-connected network with one VC. Three of these messages are blocked, while the other two have acquired all of the channels needed to reach their destinations. The first blocked message has acquired channels 1 and 2, and requires 3 to continue. Similarly, the second has acquired channels 3, 4, and 5, and requires 6 to continue; the third has acquired channels 6, 7, and 0, and requires 1 to continue. Thus, each of these blocked messages will wait indefinitely for one of the other messages in the group to release an owned VC.

Figure 1b shows the CWG for the scenario in Figure 1a. There is a single cycle in this graph consisting of vertices 0, 1, 2, 3, 4, 5, 6 and 7. Given the set of all resources involved in this cycle, R = {0, 1, 2, 3, 4, 5, 6, 7}, observe that the set of vertices that can be reached by each and every member of R is R itself.
This type of relationship formed by vertices in one or more cycles is referred to as a knot [9]. Assuming that the routing function is connected, a knot is a necessary and sufficient condition for deadlock [6].

2.2 Classifying Deadlocks

2.2.1 Single-Cycle Deadlocks

Deadlock can be characterized by its deadlock set, resource set, and knot cycle density. The deadlock depicted in Figure 1 is what we refer to as a single-cycle deadlock. In this example, the deadlock involves 3 messages in its deadlock set, occupies 8 channels in its resource set {0, 1, 2, 3, 4, 5, 6, 7}, and has a knot cycle density of one cycle (true of all single-cycle deadlocks).

Figure 2. (a) A "single-cycle deadlock" for minimal adaptive routing with 1 VC. (b) The CWG contains a knot.

Single-cycle deadlocks are more likely to occur in networks having minimal resources and/or highly restrictive routing options on available resources. In the above example (Figure 1) of a torus network with one VC that allows only non-adaptive (static) dimension ordered routing, the routing function returns at most a single channel option. This is reflected in the CWG by a single dashed outgoing arc at any vertex in Figure 1b (maximum fan-out of one). In such a network, a single cycle is sufficient to form a knot. However, for this to occur, a correlated resource dependency among multiple messages must form.

Single-cycle deadlocks are also possible in networks which use less restrictive routing (e.g., minimal adaptive routing with only one VC) when only one routing option is available to all messages comprising the deadlock set (e.g., due to faulty links or routing in the destination's dimension). An example is illustrated in Figures 2a and 2b. Here, each of the messages in the group has acquired VCs, exhausted its routing adaptivity, and is therefore waiting to acquire the one channel needed to reach its destination. However, the required channels are already owned by members of this group of messages.
The CWG (Figure 2b) contains a single cycle, and the vertices in this cycle form a knot, R = {1, 3, 5, 7}. Hence, with a knot cycle density of one, this too is a single-cycle deadlock; its deadlock set contains 4 messages and its resource set includes 8 channels, {0, 1, 2, 3, 4, 5, 6, 7}. This single-cycle deadlock requires not only that all of the messages in the deadlock set have exhausted their adaptivity, but also that they own all of the resources needed by other messages in the deadlock set. Therefore, an even higher degree of correlation of message resource dependency is required for this type of deadlock to occur.

In this example, one further message has acquired 8 and 9, and is waiting for a VC owned by a message which is involved in the deadlock. Although the message is not able to
proceed until the deadlock is resolved, it is not considered to be in the deadlock set as its resources do not meet the condition for participation in a knot as described previously. This type of message is referred to as a dependent message and is distinguished from those messages actually in the deadlock set. The usefulness of this distinction is evident when developing deadlock detection mechanisms for recovery-based routers. The detection mechanism must be careful not to incorrectly identify dependent messages as being among those properly in the deadlock set, as removing them from the network will not resolve the deadlock. Moreover, dependent messages may be transient in that they may be able to proceed using an alternate resource not owned by one of the messages in the deadlock set.

Figure 3. (a) A "multi-cycle deadlock" for minimal adaptive routing with 2 VCs. (b) The CWG contains a knot.

2.2.2 Multi-Cycle Deadlocks

Figures 3a and 3b depict the network and the CWG for a more complex example of a deadlock, one comprised of multiple resource dependency cycles. This network uses minimal adaptive routing and two VCs per physical channel. Once again, all messages (m1 ... m8) have exhausted their adaptivity and are blocked. Each message is waiting to acquire one of two VCs needed to continue routing, both of which are owned by other members of the group. There are multiple unique cycles in the CWG. The set of all vertices involved in this group of cycles, R = {1, 3, 5, 7, 9, 11, 13, 15}, meets the requirement for a knot.
This is an example of what is referred to as a multi-cycle deadlock; its deadlock set has 8 messages {m1, ..., m8}, its resource set has 16 VCs {0, 1, ..., 15}, and its knot cycle density is 4 cycles.

CWGs similar to Figure 3b, where there are multiple outgoing dashed arcs per blocked message (fan-out > 1), are indicative of networks which allow a greater degree of routing flexibility (e.g., provide multiple VCs per physical channel, allow adaptive routing, etc.). Given that the messages in this example have exhausted their adaptivity, the vertices with a fan-out of two in Figure 3b correspond to a routing relation that supplies two alternative resources for each of the blocked messages. Should messages have blocked prior to exhausting their adaptivity, vertices with larger fan-out (i.e., 4) would exist in the graph. As can be seen from this example, the fan-out of vertices in the CWG, which is determined by routing adaptivity and the number of VCs per physical channel, greatly influences the number of unique cycles that can form. More importantly, increasing the routing flexibility exponentially increases the degree of correlation of resource dependency required for multiple cycles to form knots.

Figure 4. (a) A "cyclic non-deadlock" for minimal adaptive routing with 2 VCs. (b) The CWG does not contain a knot.

2.2.3 Cyclic Non-Deadlocks

A scenario in which multiple cycles exist but which does not result in deadlock (referred to as cyclic non-deadlock) is depicted in Figures 4a and 4b. This is similar to the previous example except that one message's destination is changed, allowing it to acquire the required VCs on its way to its destination. There are 8 unique cycles in the CWG. Given the set of all vertices in this group of cycles, {1, 3, 5, 9, 11, 13, 15}, note that vertices 7 and 16 are reachable from members of this set, but the opposite does not hold.
This set (or any subset thereof) does not meet the conditions for a knot; therefore, there is no deadlock in this network. This is because the message that owns 7 may eventually reach its destination and subsequently release 7, which will allow one of the two messages waiting for this channel to continue. Other messages will then be able to proceed in a similar fashion.

This example confirms the notion that cycles are necessary but not sufficient for deadlock, as was concluded by Duato [7]. Resource dependency graphs of deadlock avoidance algorithms based on Duato's framework may have cycles but will always have an escape resource to avoid deadlock (such as 7 in Figure 4b). The elimination of these cycles as required by some avoidance-based routing schemes is therefore overly restrictive. Similarly, eliminating cycles in a packet wait-for graph [10] to avoid deadlock is also overly restrictive: the packet wait-for graph for this example clearly contains cycles, yet no deadlock exists.

In summary, single-cycle deadlocks are possible in networks which have a single channel resource and limited adaptivity defined on that resource (due to static routing or exhausted adaptivity). Multi-cycle deadlocks involving highly correlated message resource dependencies are possible in networks using multiple resources and which allow
greater routing adaptivity over those resources. It has been shown that the number of blocked messages (number of vertices which have outgoing dashed arcs) and the flexibility in routing (fan-out of these vertices) greatly influence the formation of cycles [11]. However, deadlock occurs only when a group of cycles forms a knot.

3 Deadlock Characterization

Our approach for precisely detecting deadlocks is based on a theoretical framework which defines a deadlock as a knot within a CWG [6]. We implement a deadlock detection algorithm that is able to identify knots within the CWG of an ongoing network simulation. The deadlock detection algorithm involves maintaining a CWG, detecting cycles within this graph, and identifying groups of cycles which form knots. It is implemented in a flit-level simulator called FlexSim (an extension of FlitSim 2.0).

All simulations are run for normalized loads up to full network capacity or until the network saturates with respect to the number of resource dependency cycles, generally well beyond the loads at which network performance saturates (shown in the figures by a vertical dashed line). Each simulation is run for 0,000 simulation cycles beyond steady state. Unless otherwise stated, all simulations are performed using uniform traffic, a 16-ary 2-cube with bidirectional channels, a fixed message size of 32 flits, an edge buffer depth of two flits, one injection and reception channel, and a channel selection policy which favors continuing routing in the current dimension over turning. Minimal true fully adaptive routing (TFAR) is used for adaptive routing and dimension ordered routing (DOR) is used for static routing. Since no other restrictions are enforced, deadlocks are possible for both routing schemes. The deadlock detection algorithm is invoked every 50 simulation cycles.
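The knot test at the heart of the detection steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not FlexSim's implementation: the CWG is assumed to be given as a plain adjacency dictionary (VC label to the list of VCs reachable over one solid or dashed arc), and the names `reachable` and `find_knots` are hypothetical. A vertex set is reported as a knot when every member reaches exactly that set, which is the definition used in Section 2.1.

```python
from collections import deque


def reachable(graph, start):
    """Return the set of vertices reachable from `start` over one or more arcs."""
    seen = set()
    queue = deque(graph.get(start, ()))
    while queue:
        vertex = queue.popleft()
        if vertex not in seen:
            seen.add(vertex)
            queue.extend(graph.get(vertex, ()))
    return seen


def find_knots(graph):
    """Return every knot in `graph` as a frozenset of vertices.

    A set R is a knot when the set of vertices reachable from each
    member of R is R itself. Brute-force sketch: one search per vertex.
    """
    vertices = set(graph) | {w for succ in graph.values() for w in succ}
    reach = {v: reachable(graph, v) for v in vertices}
    knots = set()
    for v in vertices:
        r = reach[v]
        # v must lie on a cycle, and nothing reachable from v may escape r.
        if v in r and all(reach[w] == r for w in r):
            knots.add(frozenset(r))
    return list(knots)


# Hypothetical CWGs modeled loosely on the examples in Section 2.
ring = {i: [(i + 1) % 8] for i in range(8)}      # single cycle over VCs 0..7
with_dependent = {**ring, 8: [9], 9: [0]}        # a dependent message owns 8 and 9
with_escape = {0: [1], 1: [2, 3], 2: [0]}        # VC 3 acts as an escape resource

print(find_knots(ring))            # one knot containing VCs 0..7
print(find_knots(with_dependent))  # same knot; VCs 8 and 9 are excluded
print(find_knots(with_escape))     # no knot: a cyclic non-deadlock
```

The second example shows why dependent messages fall outside the deadlock set: their VCs can reach the knot but cannot be reached from it. A production detector would more likely use a strongly-connected-components pass than one search per vertex.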
Deadlocks are broken by removing a message in the deadlock set (flit-by-flit) from the network so as to synthesize a recovery procedure (as in the Disha scheme [5]). Deadlock frequency is presented as normalized deadlocks, which is the ratio of the number of deadlocks to the number of messages delivered. When no deadlocks exist, we instead use the total number of resource dependency cycles formed and the amount of congestion (number and percentage of blocked messages) to represent the conditions that could lead to deadlock formation. The size of deadlock and resource sets and the knot cycle density are used to describe the size and complexity of deadlocks.

3.1 Effect of Physical Links on Deadlocks

In studying the effect of network links on deadlock formation, we measure the frequency of deadlocks in tori with uni- and bidirectional channels. We assume DOR with one VC per physical channel for both networks (all other parameters set to default values). Figures 5a and 5b show normalized deadlocks vs. load rate and deadlock set size vs. load rate for the two networks under uniform traffic. Normalized load rate is calculated based on total link bandwidth and average internode distance, which differ between the two networks. The figures show that the uni-torus leads to relatively more deadlock despite having generated less overall traffic load.

Figure 5. (a) Normalized deadlocks vs. load rate. (b) Deadlock set size vs. load rate.

Below network saturation, there are 1 and 7 deadlocks for every 100 messages delivered (on average) in the bi- and unidirectional networks, respectively. For the two networks, no more than 4 (bi) and (uni) messages are involved in each deadlock below saturation loads. This indicates that unless messages experience deadlock more than once, up to % (bi) and 15% (uni) of all messages participate in deadlock.
Deep into saturation, deadlock frequency grows to 11% (bi) and 60% (uni) while the number of messages involved in deadlock converges to around 6 for both networks. From this, we can infer that at highly saturated load rates messages may be involved in multiple deadlocks prior to being delivered, particularly in the unidirectional network.

The deadlocks formed in both networks are of the single-cycle deadlock variety described in Section 2.2.1. The requirements and factors leading to deadlock for the two networks, however, are different, which helps to explain the disparity in deadlock frequency. For one, a bi-torus requires a minimum of messages per deadlock whereas only messages comprise the minimal deadlock set for a uni-torus. As confirmed by Figure 5b, the uni-torus has deadlocks involving fewer messages for all load rates up through deep saturation. Second, and more importantly, for uniform traffic in a torus with 16 nodes per dimension each bi-link is used by 1% of the messages traveling in a particular direction within a given dimension whereas each uni-link is used by 50% of the messages in the network. This suggests that the highly correlated resource dependencies resulting from all network traffic having to travel in the same direction (and turn) to reach their respective destinations are a major contributor to deadlock frequency.

Our results show that, as expected, adding routing resources (e.g., bidirectional physical links) reduces resource contention such that the correlated resource dependencies required for deadlock are less likely to form. Although bidirectionality significantly reduces deadlock frequency, it does not by itself reduce the likelihood of deadlock formation to sufficiently low levels. However, bidirectionality may be combined with other techniques (following sections) to reduce deadlock frequency to well within acceptable levels.

3.2 Effect of Adaptivity on Deadlocks
Effect of Adaptivity on Deadlocks In studying the effect of adaptivity on deadlock formation, we measure the frequency of deadlocks and cycles in tori using DOR and TFAR. To focus on the effects of adaptivity alone, we again use a single VC per physical channel for both algorithms. Figures 6a and 6b show the normalized
deadlocks and cycles vs. load rate and the deadlock and resource set size vs. load rate for the two algorithms under uniform traffic. DOR allows only single-cycle deadlocks to form (as in Figure 1), so one curve can represent both cycle and deadlock information. In contrast, TFAR allows cyclic non-deadlocks (similar to Figure 4). Since many more cycles can exist than there are deadlocks, two different curves are used to convey cycle and deadlock formation.

Figure 6. (a) Normalized deadlocks and cycles vs. load rate. (b) Deadlock and resource set size vs. load rate.

Figure 7. (a) Normalized deadlocks vs. load rate. (b) Number of cycles vs. percent of messages blocked.

Our results show that TFAR suffers no deadlocks below network saturation, 1 deadlock per 100 messages delivered at saturation, and about the same number of deadlocks as messages delivered in deep saturation. The ratio of deadlocks to messages delivered for DOR is even smaller prior to saturation (less than 1 per 1000 messages delivered). This rate gradually increases to 1 deadlock for every 10 messages delivered in deep saturation. In terms of actual number of deadlocks (not normalized to throughput), DOR suffers more than TFAR by as much as a factor of 6. Interestingly, DOR has higher sustained throughput than TFAR despite having a larger number of deadlocks. This explains the discrepancy between actual deadlock and normalized deadlock. It is also observed that the performance of TFAR is highly sensitive to just a few deadlocks while the performance of DOR remains relatively unaffected even as the number of deadlocks grows.
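The gap between actual and normalized deadlock counts discussed above follows directly from the definition of the metric. The sketch below uses made-up counts (they are not measurements from the paper), chosen only to match the reported deep-saturation ratios of roughly 1 deadlock per 10 messages for DOR, roughly 1 per message for TFAR, and a six-fold difference in raw deadlock counts. The function name `normalized_deadlocks` is hypothetical.

```python
def normalized_deadlocks(deadlocks, messages_delivered):
    """Ratio of deadlocks detected to messages delivered in the same interval."""
    if messages_delivered <= 0:
        raise ValueError("messages_delivered must be positive")
    return deadlocks / messages_delivered


# Hypothetical deep-saturation counts, for illustration only.
dor = normalized_deadlocks(deadlocks=600, messages_delivered=6000)
tfar = normalized_deadlocks(deadlocks=100, messages_delivered=100)

print(dor)   # 0.1 -> six times more deadlocks, yet a lower normalized value
print(tfar)  # 1.0 -> fewer deadlocks, but throughput has collapsed
```

Because the denominator is throughput, a scheme that delivers few messages can show a high normalized value even when its raw deadlock count is small.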
The size of deadlock and resource sets in DOR is inherently limited by the single-cycle deadlocks which form. Given that deadlocks are broken immediately upon detection, the effects of deadlocks in DOR are local, isolated to a given row or column within the network. The relatively simpler correlation of message dependency required for these deadlocks makes them more likely but, at the same time, less severe. In contrast, TFAR can lead to large multi-cycle deadlocks which have a more global effect upon the network. Hence, the higher degree of correlation of message resource dependency required for these deadlocks makes them less likely but more severe.

The results shown in Figure 6b confirm our hypothesis. Large multi-cycle deadlocks appear in TFAR with deadlock sets and resource sets that are 5 to 7 and 7 to 10 times larger than those of DOR, respectively. What's more, the knot cycle densities for TFAR deadlocks are greater by a factor of 10 to 0. Some of the larger deadlocks observed in TFAR involve as many as 5% of the messages within the network, occupy more than 40% of the channels, and involve hundreds of cycles, thus confirming their global nature. As a result, the residual effects of such large deadlocks are longer-term and widespread; just a few can greatly degrade performance. This is in contrast to the deadlocks in DOR, which have more localized, shorter-term effects, thereby making DOR's performance less affected by a large number of deadlocks.

The cyclic non-deadlocks in TFAR may also degrade performance. Duato [] has described situations where messages block cyclically faster than they can be drained and remain blocked for extended periods, leading to large message latencies. The large number of cycles we have observed even in the absence of deadlocks suggests that this may be occurring.
Hence, low throughput resulting from these cyclic non-deadlocks contributes to the higher normalized deadlock frequency for TFAR although fewer actual deadlocks form. Given that TFAR with a single VC makes harmful deadlocks and cyclic non-deadlocks probable, recovery-based adaptive routing would benefit from additional VCs. Next we will examine the effect of additional VCs on reducing the likelihood of deadlock formation.

3.3 Effect of Virtual Channels on Deadlocks

In investigating the effects of traffic flow on deadlock formation using multiple VCs per physical channel, we measure the deadlock frequency of DOR and TFAR in tori networks which allow the unrestricted use of 2, 3, and 4 VCs (all other parameters default). For experiments in which deadlock did not occur, we use network congestion and resource dependency cycles formed as a measure of the likelihood of possible deadlock. Figures 7a and 7b show normalized deadlocks vs. load rate and number of cycles vs. percentage of blocked messages under uniform traffic. In Figure 7b, each curve is annotated with the load rate at which cycles first appear (first point) and the load rate at which the highest number of cycles was found (last point).

In Figure 7a, DOR with two VCs (DOR2) does not lead to deadlock prior to saturation; the second VC more than doubles the load at which deadlocks begin to appear when using only 1 VC. At its saturation load rate, approximately 1 deadlock occurs for every 100 messages delivered. Deadlock frequency increases to 1 for every 5 messages delivered in deep saturation. Beyond saturation, the actual number of deadlocks for DOR1 and DOR2 is roughly the same. However, a larger reduction in throughput at loads after saturation makes the normalized deadlock measure slightly higher for DOR2 (as shown in the figure). With 3 or more VCs, DOR suffers no deadlocks. In contrast, 2 VCs are sufficient to discourage deadlocks in TFAR (DOR3, DOR4, TFAR2, TFAR3 and TFAR4 are not plotted as no deadlocks occurred).
A number of factors contribute to the elimination of deadlocks when additional VCs are introduced. The new VCs are resources that become available to messages which would otherwise block. The likelihood of the formation of cycles and knots decreases when fewer messages are blocked within the network. The new VCs also provide a higher number of routing options for those messages which still block within the network. As was illustrated in Section 2, additional routing options increase the deadlock set size, resource set size, and knot cycle density needed for deadlock, thereby requiring a higher degree of correlation of message dependency in order for deadlock to form. This greatly diminishes the likelihood of deadlocks. Note that TFAR amplifies the effects of additional VCs since adaptivity makes new routing options available in each dimension. This explains why TFAR is able to eliminate all deadlocks with a smaller number of VCs (two instead of three for DOR). The simpler correlation of message dependencies required for deadlock in DOR, combined with restrictions in the use of the new resources, makes 2 VCs insufficient to eliminate deadlock in DOR.

Figure 7b indicates that adding VCs reduces congestion and allows higher loads to be applied before a large number of cyclic non-deadlocks form. TFAR1 results in increasingly higher congestion and a larger number of cycles starting at saturation. TFAR2 eliminates the cycles encountered at low load rates in TFAR1, and substantially reduces the overall congestion (from over 70% of the messages being blocked down to as few as 1%). As TFAR2 reaches saturation, its congestion increases while the number of cycles grows rapidly. The third and the fourth VCs for TFAR and DOR show a similar effect on reducing congestion and eliminating cycles at loads prior to saturation, leading to rapid growth in cycles once saturation is reached.
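The way adaptivity amplifies the benefit of extra VCs can be made concrete by counting the routing options (the CWG fan-out) of a blocked header. The sketch below is a simplification under stated assumptions: VCs may be used without restriction, each remaining productive dimension offers exactly one minimal physical channel, and the function name `fan_out` is hypothetical. It reproduces the fan-out values quoted in Section 2: one for DOR with 1 VC, two for minimal adaptive routing with 2 VCs once adaptivity is exhausted, and four when two dimensions are still productive.

```python
def fan_out(adaptive, productive_dims, vcs_per_channel):
    """Number of candidate VCs a blocked header may wait on (minimal routing).

    Assumes unrestricted VC use and one minimal physical channel per
    productive dimension; both are simplifications for illustration.
    """
    if productive_dims < 1 or vcs_per_channel < 1:
        raise ValueError("need at least one productive dimension and one VC")
    # Static (dimension-ordered) routing can only use the current dimension.
    usable_dims = productive_dims if adaptive else 1
    return usable_dims * vcs_per_channel


print(fan_out(adaptive=False, productive_dims=2, vcs_per_channel=1))  # 1
print(fan_out(adaptive=True, productive_dims=1, vcs_per_channel=2))   # 2
print(fan_out(adaptive=True, productive_dims=2, vcs_per_channel=2))   # 4
```

Under these assumptions each added VC raises the fan-out by one for DOR but by the number of productive dimensions for adaptive routing.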
In summary, we observe that additional VCs are able to reduce the number of messages which block within the network, as expected. This, along with the higher degree of correlation of message dependencies required for deadlock in the presence of a larger number of routing options due to the additional VCs, greatly diminishes the likelihood of deadlock. We find the extent to which deadlocks are eliminated with only a few VCs per physical channel to be surprising. However, as networks with multiple VCs reach saturation, a higher number of blocked messages along with the larger number of routing options increases the number of cycles exponentially. The fact that no knots exist even in the presence of hundreds of thousands of cycles suggests the formation of extremely large cyclic non-deadlocks at saturation loads. Operating below saturation avoids this performance degradation.

3.4 Effect of Buffer Depth on Deadlocks

We now investigate the effects of increasing the channel buffer size on deadlock formation. We measure the frequency of deadlocks in bidirectional tori with channel buffer depths of 2, 4, 6, 8, 16, and 32 flits. TFAR with one VC per physical channel is used. Using a buffer of the same depth as message length corresponds to virtual cut-through switching [12]. Other buffer depths correspond to wormhole or buffered wormhole switching [13]. Figures 8a and 8b show normalized deadlocks vs. load rate and normalized deadlocks vs. the number of messages in the network.

Figure 8. (a) Normalized deadlocks vs. load rate. (b) Normalized deadlocks vs. messages in the network.

As shown in Figure 8a, networks with buffer depths of 2, 4 and 6 flits all saturate at a similar load rate.
After saturation, these networks lead to a large number of deadlocks (15 to 5 deadlocks for every 10 messages delivered). The network with a buffer depth of 8 flits saturates at a 5% higher load rate, and leads to a similar deadlock frequency for load rates beyond saturation. Networks with buffer depths of 16 and 32 flits saturate at a 75% higher load than the smallest buffers, reflecting the larger capacity of these networks. A buffer depth of 16 flits leads to the highest number of deadlocks (15 to 5 deadlocks per every 10 messages delivered) while the virtual cut-through network (buffer depth of 32 flits) leads to the smallest number of deadlocks (1 deadlock for every messages delivered) at load rates beyond saturation.

The increase in saturation load as the buffer depth is increased confirms that each message requires the simultaneous use of fewer channels due to the higher capacity. This allows for message compaction. Below saturation, compaction leads to less resource contention and allows more messages to be serviced by the network. The similar saturation load for buffer depths of 2, 4, and 6 flits (6%, 12%, and 18% of the message size) indicates that the amount of compaction occurring for these buffer sizes is alike, and suggests that messages have blocked close to their source nodes, thereby neutralizing the effect of compaction. Increases in saturation loads are greater for larger increments in buffer sizes, thereby suggesting effective compaction for these buffer sizes (buffer sizes of 8, 16 and 32 flits, which can accommodate 25%, 50%, and 100% of a message, respectively).

When normalized with respect to the number of messages in the network (Figure 8b), the networks with smaller capacity buffers lead to a substantially higher number of deadlocks. This is explained by the fact that in these networks, each message requires the simultaneous use of a larger number of resources, thereby leading to higher resource contention.
Although higher capacity buffers allow more messages to enter the network and, potentially, a larger number of messages to block at saturation, the degree of correlation of message dependency required for deadlocks also increases due to the message compaction, thereby making multi-cycle deadlocks with large deadlock and resource sets less probable.
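The compaction argument in this section can be made concrete with a small calculation. The sketch below is a simplified illustration and not the simulator used in this study. It assumes a 32-flit message (inferred from the statement that a 16-flit buffer holds 50% of a message), assumes that a blocked message compacts fully into the buffers behind its header, and ignores flits in transit on links. The function names are hypothetical.

```python
import math

# Assumption: 32-flit messages, inferred from "16 flits = 50% of a message".
MESSAGE_LENGTH_FLITS = 32


def channels_held(buffer_depth: int, message_length: int = MESSAGE_LENGTH_FLITS) -> int:
    """Channel buffers occupied by one blocked, fully compacted message."""
    if buffer_depth <= 0 or message_length <= 0:
        raise ValueError("buffer depth and message length must be positive")
    return math.ceil(message_length / buffer_depth)


def buffer_fraction(buffer_depth: int, message_length: int = MESSAGE_LENGTH_FLITS) -> float:
    """Fraction of a message that fits in a single channel buffer."""
    return buffer_depth / message_length


def compaction_table(depths=(2, 4, 6, 8, 16, 32)):
    """Rows of (buffer depth, fraction of message per buffer, channels held)."""
    return [(d, buffer_fraction(d), channels_held(d)) for d in depths]


if __name__ == "__main__":
    for depth, fraction, held in compaction_table():
        print(f"buffer={depth:2d} flits  holds {fraction:6.1%} of a message  "
              f"blocked message spans {held:2d} channel buffers")
```

Under these assumptions a blocked message ties up 16 channel buffers at a depth of 2 flits but only one at a depth of 32 flits, which is the sense in which deeper buffers reduce the number of resources each message holds simultaneously.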
3.5 Effect of Network Node Degree on Deadlocks
To investigate the effects of node degree on the frequency of deadlocks, we measured deadlock frequency in a 16-ary 2-cube (2D) and a 4-ary 4-cube (4D) torus-connected network, both of which use TFAR routing with one VC. Load rate was normalized based on the total link bandwidth and average internode distance of the two networks. The 4D network resulted in relatively fewer deadlocks at loads prior to saturation (less than 1% of the deadlocks which occurred for the 2D network). Also, the 4D network achieved higher performance well beyond the saturation load of the 2D network, thereby leading to an even larger gap in the normalized deadlock frequency. The two main factors contributing to this are the additional network resources (physical channels) and the increased routing freedom (dimensions). As in the other experiments, the additional links reduce resource contention, and the high node degree, along with adaptive routing, increases the correlation of message dependencies required for knots to form. The few deadlocks that did form in the 4D network were all single-cycle deadlocks, which suggests that the few messages in the deadlock sets were restricted in their routing because they had exhausted their routing adaptivity towards the destination.
3.6 Effect of Non-Uniform Traffic on Deadlocks
The deadlock frequencies for non-uniform traffic patterns (bit-reversal, matrix-transpose, perfect-shuffle, and hot-spot) were similar to (in most cases, within 10% of) the deadlock frequencies for the uniform traffic patterns in the experiments described above. The characteristics of the deadlocks (deadlock set size, resource set size, and knot cycle density) were similar as well. The only exception to this was for DOR. Single-cycle deadlocks in DOR (as shown in Figure 1) require circular overlap of messages.
The source and destination pairs designated by some of these non-uniform traffic patterns are such that this overlap is not possible.
4 Related Work
Deadlock approximation schemes proposed previously [4, 5] have provided little insight into the frequency of true deadlocks. In contrast, our work presents frequencies of actual deadlocks as well as their characteristics as they relate to key network parameters. CWGs and similar constructs have previously been used to statically represent connections allowed by deadlock-avoidance based algorithms [, 8]. In contrast, we use these graphs to model dynamic resource allocation in unrestricted routing, and to precisely define and detect deadlocks. A summary of work characterizing deadlocks as knots in generalized resource graphs intended to describe deadlocks in operating systems is presented in [9]. Our work is a specialized application of this framework, intended for depicting deadlocks in interconnection networks.
5 Conclusions and Future Work
We characterize the causal effects of various network parameters on blocked messages, resource dependency cycles, and deadlocks to gain a greater understanding of the viability of deadlock recovery-based routing. Through simulation and analysis, we empirically show how deadlock probability is influenced by these factors when no routing restrictions are enforced to avoid deadlock. Our results for k-ary n-cube networks with n confirm that deadlock probability is less in bidirectional networks than in unidirectional networks, and that it decreases as node degree and adaptivity are increased. Localized deadlocks of limited harmful effect are more probable with dimension ordered routing whereas globally harmful deadlocks are probable with true fully adaptive routing. Deadlock probability is less in virtual cut-through networks than in buffered wormhole and wormhole networks, as expected.
Interestingly, however, deadlocks are highly improbable (none were detected) if as few as VCs are used with dimension ordered routing and only VCs are used with true fully adaptive routing in bidirectional wormhole networks. These results lead us to conclude that recovery-based routing is viable since the unrestricted use of only a few virtual channels is sufficient to make deadlock highly improbable. Providing greater routing flexibility and buffer capacity through increased routing adaptivity, number of virtual and physical channels (bidirectional), and buffer depth greatly increases the complexity of correlated resource dependencies required for deadlock to occur. We will continue this characterization study by examining the effect of irregular network topology, hybrid message length, misrouting, etc., on deadlock. We also plan to characterize deadlock formation under hybrid non-uniform traffic loads using program-driven simulations.
References
[1] Andrew A. Chien and J. H. Kim. Planar-Adaptive Routing: Low-Cost Adaptive Networks for Multiprocessors. In Proc. of the 19th Symposium on Computer Architecture, pp 68-77, May 199.
[2] L. Ni and C. Glass. The Turn Model for Adaptive Routing. In Proc. of the 19th International Symposium on Computer Architecture, IEEE Computer Society, pages 78-87, May 199.
[3] J. Duato. A New Theory of Deadlock-free Adaptive Routing in Wormhole Networks. IEEE Transactions on Parallel and Distributed Systems, 4(1):10-11, December 199.
[4] J. Kim, Z. Liu, and A. Chien. Compressionless Routing: A Framework for Adaptive and Fault-tolerant Routing. In Proc. of the 1st International Symposium on Computer Architecture, pp 89-00, April
[5] Anjan K.V. and Timothy M. Pinkston. An Efficient, Fully Adaptive Deadlock Recovery Scheme: Disha. In Proc. of the nd International Symposium on Computer Architecture, pp 01-10, June
[6] Sugath Warnakulasuriya and Timothy Mark Pinkston. Implementation of Deadlock Detection in a Simulated Interconnection Network Environment. Technical Report CENG 97-01, University of Southern California, January
[7] J. Duato. A Necessary and Sufficient Condition for Deadlock-free Adaptive Routing in Wormhole Networks. IEEE Transactions on Parallel and Distributed Systems, 6(10): , October
[8] Loren Schwiebert and D.N. Jayasimha. A Necessary and Sufficient Condition for Deadlock-Free Wormhole Routing. Journal of Parallel and Distributed Computing,, (1996).
[9] Mamoru Maekawa, Arthur E. Oldehoft, and Rodney R. Oldehoft. Operating Systems: Advanced Concepts. Benjamin Cummings,
[10] William J. Dally and Hiromichi Aoki. Deadlock-Free Adaptive Routing in Multicomputer Networks Using Virtual Channels. IEEE Transactions on Parallel and Distributed Systems, Vol. 4, No. 4, April, 199.
[11] Timothy Mark Pinkston and Sugath Warnakulasuriya. On Deadlock in Interconnection Networks. To appear in Proc. of the 4th International Symposium on Computer Architecture, June
[12] Parviz Kermani and Leonard Kleinrock. Virtual cut-through: A new computer communication switching technique. Computer Networks, pages 67-86,
[13] C.B. Stunkle et al. The SP high-performance switch. IBM Systems Journal, vol. 4, no., pp , 1995.
More informationThe Cray T3E Network:
The Cray T3E Network: Adaptive Routing in a High Performance 3D Torus Steven L. Scott and Gregory M. Thorson Cray Research, Inc. {sls,gmt}@cray.com Abstract This paper describes the interconnection network
More informationThomas Moscibroda Microsoft Research. Onur Mutlu CMU
Thomas Moscibroda Microsoft Research Onur Mutlu CMU CPU+L1 CPU+L1 CPU+L1 CPU+L1 Multi-core Chip Cache -Bank Cache -Bank Cache -Bank Cache -Bank CPU+L1 CPU+L1 CPU+L1 CPU+L1 Accelerator, etc Cache -Bank
More informationFault Tolerant and Secure Architectures for On Chip Networks With Emerging Interconnect Technologies. Mohsin Y Ahmed Conlan Wesson
Fault Tolerant and Secure Architectures for On Chip Networks With Emerging Interconnect Technologies Mohsin Y Ahmed Conlan Wesson Overview NoC: Future generation of many core processor on a single chip
More informationTopologies. Maurizio Palesi. Maurizio Palesi 1
Topologies Maurizio Palesi Maurizio Palesi 1 Network Topology Static arrangement of channels and nodes in an interconnection network The roads over which packets travel Topology chosen based on cost and
More information3. Evaluation of Selected Tree and Mesh based Routing Protocols
33 3. Evaluation of Selected Tree and Mesh based Routing Protocols 3.1 Introduction Construction of best possible multicast trees and maintaining the group connections in sequence is challenging even in
More informationn = 2 n = 2 n = 1 n = 1 λ 12 µ λ λ /2 λ /2 λ22 λ 22 λ 22 λ n = 0 n = 0 λ 11 λ /2 0,2,0,0 1,1,1, ,0,2,0 1,0,1,0 0,2,0,0 12 1,1,0,0
A Comparison of Allocation Policies in Wavelength Routing Networks Yuhong Zhu a, George N. Rouskas b, Harry G. Perros b a Lucent Technologies, Acton, MA b Department of Computer Science, North Carolina
More informationDesign of a System-on-Chip Switched Network and its Design Support Λ
Design of a System-on-Chip Switched Network and its Design Support Λ Daniel Wiklund y, Dake Liu Dept. of Electrical Engineering Linköping University S-581 83 Linköping, Sweden Abstract As the degree of
More informationNetwork-on-chip (NOC) Topologies
Network-on-chip (NOC) Topologies 1 Network Topology Static arrangement of channels and nodes in an interconnection network The roads over which packets travel Topology chosen based on cost and performance
More information