Oblivious Routing Design for Mesh Networks to Achieve a New Worst-Case Throughput Bound
Guang Sun, Chia-Wei Chang, Bill Lin, Lieguang Zeng
Tsinghua University; University of California, San Diego
State Key Laboratory on Microwave and Digital Communications, Tsinghua National Laboratory for Information Science and Technology, Department of Electronic Engineering, Tsinghua University, Beijing 100084, China

Abstract

1/2 network capacity is often believed to be the limit of worst-case throughput for mesh networks. However, this paper provides a new worst-case throughput bound, which is higher than 1/2 network capacity, for odd radix two-dimensional mesh networks. In addition, we propose a routing algorithm called U2TURN that can achieve this worst-case throughput bound for odd radix meshes. For even radix meshes, we prove that U2TURN achieves the optimal worst-case throughput, namely, half of network capacity. U2TURN considers all routing paths with at most 2 turns and distributes the traffic loads uniformly in both X and Y dimensions. Theoretical analysis and simulation results show that U2TURN outperforms existing routing algorithms in worst-case throughput. Moreover, U2TURN achieves good average-case throughput at the expense of approximately 1.5x the minimal average hop count. For asymmetric meshes, we further propose an algorithm called U2TURN-A and provide theoretical analysis for different algorithms. Both theoretical analysis and simulation show that U2TURN and U2TURN-A outperform the existing algorithms VAL, DOR and O1TURN in both worst-case and average-case throughput for asymmetric meshes.

I. INTRODUCTION

Networks-on-chip (NoC) have been proposed as a new intra-chip communication infrastructure. Regular topologies, especially the mesh topology, have been widely deployed [1][2]. The choice of oblivious routing algorithm, which is oblivious of the network state when determining a route, has a significant impact on network performance, such as throughput and latency.
Average-case throughput provides an average performance metric, whereas worst-case throughput provides a performance guarantee. It is well known that the optimal worst-case throughput for oblivious routing in two-dimensional mesh networks is 1/2 of the network capacity when the radix is even. For example, such an analysis for the even radix case is provided in [1]. For the odd radix case, it is also well known that a worst-case throughput of 1/2 network capacity can be achieved. This 1/2 of network capacity is often believed to be the limit of the worst-case throughput. For example, in the near-optimality analysis of the O1TURN routing algorithm [4], the near optimality of O1TURN in the odd radix case is relative to a presumed 1/2 network capacity limit.

In this paper, we provide a new worst-case throughput bound of $(k+1)/(2k+1)$ of network capacity for the odd radix two-dimensional mesh with radix $k$, which is higher than 1/2 network capacity. We propose a non-minimal oblivious routing algorithm called U2TURN (uniform 2-turn) that can achieve this worst-case throughput bound. Improved analytical results for oblivious routing are important because they often form the theoretical underpinnings for a variety of network analysis problems.

The main contributions of this paper are as follows:
a) We provide a new worst-case throughput bound, which is higher than 1/2 network capacity, for the odd radix two-dimensional mesh.
b) U2TURN achieves this worst-case throughput. It outperforms VAL [10], DOR [5] and O1TURN [4] in worst-case throughput for all odd radix mesh networks. For the even radix two-dimensional mesh, we prove that U2TURN achieves the optimal worst-case throughput, namely, half of network capacity.
c) For average-case throughput, U2TURN also outperforms VAL, DOR, and O1TURN, on average over the six evaluated network sizes (both odd and even radix), by 7.%, 39.7% and 8.8%, respectively.
d) For asymmetric meshes, we describe a variant called U2TURN-A and analyze the theoretical performance of different routing algorithms. Both theoretical analysis and simulation show that U2TURN and U2TURN-A achieve better worst-case and average-case throughput than the existing algorithms VAL, DOR and O1TURN.

The rest of this paper is organized as follows. Section II provides a brief background on routing algorithms and performance metrics. Section III describes our U2TURN routing algorithm and provides theoretical performance analysis. Section IV discusses the performance of different algorithms in asymmetric meshes. Section V provides simulation results. Finally, Section VI draws conclusions.

II. BACKGROUND

A. Routing Algorithms

In this section, we first enumerate some classic routing algorithms in NoC, and then discuss the pros and cons of each routing scheme by comparing it with our routing design.

The dimension-order routing (DOR) [5] algorithm is one of the most popular routing algorithms. However, it suffers from poor worst-case and average-case throughput due to the lack of routing diversity [4]. On the other hand, Valiant proposed a routing algorithm called VAL [10] that offers excellent worst-case throughput (half of network capacity), but it suffers from poor average-case throughput and long routing paths [6]. ROMM [3] overcomes some of the drawbacks of DOR and VAL. It achieves minimal-length routing and good average-case throughput. However, its worst-case throughput is very poor. Thus, these existing routing algorithms suffer from either poor worst-case throughput (e.g. DOR, ROMM) or poor latency (e.g. VAL).

O1TURN [4] achieves both minimal-length routing and good throughput. For 2D odd radix mesh networks, O1TURN has worst-case throughput within a factor of $1/k^2$ of half of network capacity.
However, instead of using at most one turn as in O1TURN routing, our U2TURN routing considers all routing paths with at most 2 turns and distributes the traffic uniformly in both X and Y dimensions, which results in a worst-case throughput higher than half of network capacity for all odd radix mesh networks. Simulations show that U2TURN outperforms the above routing algorithms in both worst-case and average-case throughput.

Although the routing algorithms described in [9] and [7] also use routes with at most 2 turns, they are developed for torus networks rather than meshes. Besides, 2TURN [9] does not have a closed-form description. It requires solving a linear program for each network radix, and the linear program grows rapidly, making it difficult to scale to large networks [7]. IVAL [9] has only been shown to achieve the same worst-case throughput as VAL for torus networks, whereas U2TURN has a closed-form description and
achieves a higher worst-case throughput than VAL in the case of odd radix mesh networks. Like I2TURN and W2TURN described in [7] for torus networks, U2TURN also provides closed-form analytical expressions for evaluating the average path length. These closed-form analytical expressions helped us to derive a routing algorithm that achieves a worst-case throughput better than the 1/2 network capacity previously believed to be the limit of worst-case throughput for mesh networks.

B. Performance Metrics

In this section, we give the performance metrics, such as worst-case and average-case throughput, for oblivious routing algorithms. In order to simplify the discussion of these performance metrics, like [4][6], we ignore flow control issues and assume single-flit packets that route from node to node in one cycle. In particular, we elaborate on the concept of network capacity, and provide a brief overview of analytical methods to evaluate worst-case and average-case throughput for oblivious routing algorithms.

Network Capacity: Network capacity is defined by the maximal sustainable channel load $\Upsilon$ when a network is loaded with uniformly distributed traffic [4]. For any $k$-ary $n$-mesh [1][6], independent of $n$,

$$\Upsilon = \begin{cases} \dfrac{k}{4}, & k \text{ is even} \\[2mm] \dfrac{k^{2}-1}{4k}, & k \text{ is odd} \end{cases} \qquad (1)$$

where $\Upsilon$ is the inverse of the network capacity.

Worst-Case Throughput: For a given routing algorithm $R$ and traffic matrix $\Lambda$, the maximal channel load $\Upsilon(R, \Lambda)$ is the expected traffic load crossing the most heavily loaded channel under $R$ and $\Lambda$ [6]. For a given routing algorithm $R$, the worst-case channel load $\Upsilon_{wc}(R)$ is the heaviest channel load that can be caused by any admissible traffic [6]. A doubly sub-stochastic traffic matrix $\Lambda$ is admissible when it has row and column sums of at most 1. For a given routing algorithm, [8] provides a method to find the worst-case traffic that causes the worst-case channel load and worst-case throughput. Note that the worst-case channel load is the inverse of the worst-case throughput.
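As a small illustration of the network capacity definition, the following Python sketch evaluates the uniform-traffic channel load and its inverse, assuming Eq. (1) reads $k/4$ for even $k$ and $(k^2-1)/(4k)$ for odd $k$. The function names are illustrative and do not come from the paper.

```python
from fractions import Fraction


def uniform_channel_load(k: int) -> Fraction:
    """Maximal channel load of a k-ary n-mesh under uniform traffic (assumed Eq. 1)."""
    if k < 2:
        raise ValueError("radix k must be at least 2")
    if k % 2 == 0:
        return Fraction(k, 4)
    return Fraction(k * k - 1, 4 * k)


def network_capacity(k: int) -> Fraction:
    """Network capacity is the inverse of the uniform-traffic channel load."""
    return 1 / uniform_channel_load(k)


def normalized_throughput(channel_load, k: int) -> Fraction:
    """Throughput (1 / channel_load) expressed as a fraction of network capacity."""
    return uniform_channel_load(k) / Fraction(channel_load)


if __name__ == "__main__":
    for k in (3, 4, 5, 8):
        print(k, uniform_channel_load(k), network_capacity(k))
```

Exact fractions are used instead of floats so that comparisons against 1/2 of network capacity are not affected by rounding.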
Like [6][4], we use the normalized worst-case throughput, which is normalized to the network capacity, as the worst-case performance metric:

$$\Theta_{wc}(R) = \left(\frac{\Upsilon_{wc}(R)}{\Upsilon}\right)^{-1} \qquad (2)$$

Average-Case Throughput: As the average performance metric, we use the normalized average-case throughput. Using the same method as in [6][4][8], the average-case throughput for a given routing algorithm $R$ is found by averaging the throughput over a large set of random traffic matrices $T$ (e.g. $|T| = 10^6$, the one million random matrices used in Table II).

III. U2TURN ROUTING

In this section, we first describe our U2TURN routing algorithm for mesh networks, and then provide the performance analysis, including throughput analysis and latency analysis.

A. Routing Description

As the name implies, U2TURN considers any routing path between the source and destination with at most 2 turns, where each turn changes dimension from X to Y or from Y to X. Note that U-turns, that is, turns that reverse direction along the same dimension, are not allowed in U2TURN. In comparison to O1TURN routing, U2TURN is more flexible in that O1TURN routes are a subset of U2TURN routes. O1TURN only uses routes with at most one turn, whereas U2TURN considers all routing paths with at most 2 turns and balances the traffic in both X and Y dimensions. The greater flexibility and more uniform traffic distribution enable U2TURN to achieve better throughput performance.

For implementation, U2TURN routing consists of two parts: XYX-Routing with probability 0.5 and YXY-Routing with probability 0.5. First, we elaborate on XYX-Routing. Suppose $(x_1, y_1)$ is the source node and $(x_2, y_2)$ is the destination node. XYX-Routing is a three-segment routing generated as follows:
a) X-segment: choose uniformly at random an X position $x^{*} \in [0, k-1]$, and route in the X-dimension from the source node $(x_1, y_1)$ to $(x^{*}, y_1)$.
b) Y-segment: route from $(x^{*}, y_1)$ to $(x^{*}, y_2)$ in the Y-dimension.
c) X-segment: route from $(x^{*}, y_2)$ to the destination node $(x_2, y_2)$ in the X-dimension.
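The three steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: nodes are `(x, y)` tuples with coordinates in `[0, k-1]`, the helper names (`segment`, `xyx_route`, `yxy_route`, `u2turn_route`) are hypothetical, and the same-row (or same-column) case is handled as a single minimal segment, as the degenerate case of the routing description requires.

```python
import random


def segment(start, end, fixed, axis):
    """Minimal-length hops along one dimension, excluding the starting node."""
    step = 1 if end >= start else -1
    coords = range(start + step, end + step, step)
    return [(c, fixed) if axis == "x" else (fixed, c) for c in coords]


def xyx_route(src, dst, k, rng=random):
    """XYX-Routing: X-segment to a random column, Y-segment, then X-segment."""
    (xs, ys), (xd, yd) = src, dst
    if ys == yd:
        # Degenerate case: same row, so a single minimal X-segment suffices.
        return [src] + segment(xs, xd, ys, "x")
    xm = rng.randrange(k)  # intermediate X position, uniform over [0, k-1]
    path = [src]
    path += segment(xs, xm, ys, "x")
    path += segment(ys, yd, xm, "y")
    path += segment(xm, xd, yd, "x")
    return path


def yxy_route(src, dst, k, rng=random):
    """YXY-Routing is XYX-Routing with the roles of X and Y swapped."""
    (xs, ys), (xd, yd) = src, dst
    mirrored = xyx_route((ys, xs), (yd, xd), k, rng)
    return [(x, y) for (y, x) in mirrored]


def u2turn_route(src, dst, k, rng=random):
    """Pick XYX-Routing or YXY-Routing with probability 0.5 each."""
    if rng.random() < 0.5:
        return xyx_route(src, dst, k, rng)
    return yxy_route(src, dst, k, rng)
```

For example, `xyx_route((0, 0), (4, 3), 5)` returns one of the five XYX paths of a 5×5 mesh, each chosen with probability 1/5. Passing a seeded `random.Random` instance as `rng` makes the choice reproducible.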
Each of these three segments uses minimal-length routing, although the combination of segments can result in non-minimal routing along the entire path. There is a degenerate case. When $y_1 = y_2$, only one-segment routing is needed: route directly from $(x_1, y_1)$ to $(x_2, y_2)$ using minimal-length routing in the X-dimension. Fig. 1(a) depicts the five possible XYX-Routing paths from the source node to the destination node in a 5×5 mesh, where each path has a probability of 1/5.

YXY-Routing is the symmetric counterpart of XYX-Routing, as shown in Fig. 1(b). There is also a degenerate case for YXY-Routing. When $x_1 = x_2$, only one-segment routing is needed: route directly from $(x_1, y_1)$ to $(x_2, y_2)$ using minimal-length routing in the Y-dimension.

[Fig. 1. Routing paths for XYX-Routing and YXY-Routing in a 5×5 mesh, from source S to destination D: (a) XYX-Routing; (b) YXY-Routing.]

From the routing description above, we know that XYX-Routing and YXY-Routing cover all routing paths between the source and destination with at most 2 turns. In addition, XYX-Routing distributes the traffic uniformly along the X-dimension, and YXY-Routing distributes the traffic uniformly along the Y-dimension. U2TURN uses XYX-Routing and YXY-Routing with equal probability, so that it balances traffic loads in both X and Y dimensions, which results in a high throughput. As for the deadlock problem, two virtual channels per physical channel are sufficient to achieve deadlock-free routing by incrementing a packet's virtual channel set after each turn from the Y to the X dimension [6].

B. Performance Analysis

1) Throughput Analysis: In this section, we prove that U2TURN achieves a new worst-case throughput bound of $(k+1)/(2k+1)$ of network capacity for an odd radix two-dimensional mesh with radix $k$, which is higher than 1/2 of network capacity. For the even case, U2TURN achieves the optimal worst-case throughput, namely, half of network capacity. We prove these results through the following three claims.

Claim 1.
For a one-dimensional mesh, the worst-case channel load of minimal-length routing is $(k-1)/2$ when the radix $k$ is odd and $k/2$ when $k$ is even.

Proof: [8] proposed a method that finds the worst-case traffic for any given oblivious routing algorithm in a given topology. Using this method, we find that the worst-case traffic for
minimal-length routing in a one-dimensional mesh is Tornado: from $x$ to $(x + \lfloor k/2 \rfloor) \bmod k$. The heaviest channel load of minimal-length routing under the Tornado traffic matrix is $(k-1)/2$ when $k$ is odd and $k/2$ when $k$ is even. Therefore, this is the worst-case channel load for minimal-length routing in a one-dimensional mesh. As the worst-case throughput is the inverse of the worst-case channel load, the worst-case throughput of minimal-length routing in an odd radix one-dimensional mesh is $2/(k-1)$, which is higher than half of network capacity. For an even radix one-dimensional mesh, the worst-case throughput of minimal-length routing is $2/k$, which is optimal, namely, half of network capacity.

Claim 2. For the odd radix two-dimensional mesh, the worst-case channel load of XYX-Routing is $(k-1)/2$ in the Y-dimension and $(k^2-1)/(2k)$ in the X-dimension, respectively. For the even radix two-dimensional mesh, the worst-case channel load of XYX-Routing is $k/2$ in both the X and Y dimensions.

Proof: As discussed in Section II-B, the maximal channel load under uniformly distributed traffic is $(k^2-1)/(4k)$ when $k$ is odd and $k/4$ when $k$ is even. XYX-Routing consists of three routing segments, and the two X-segments together place twice the uniform load on the X-dimension. The maximal channel load under twice-uniform traffic is simply $2 \cdot (k^2-1)/(4k) = (k^2-1)/(2k)$ when $k$ is odd and $2 \cdot (k/4) = k/2$ when $k$ is even. Thus, for the two-dimensional mesh, the worst-case channel load of XYX-Routing in the X-dimension is $(k^2-1)/(2k)$ when the radix is odd and $k/2$ when the radix is even.

As for the Y-dimension in XYX-Routing, it uses minimal-length routing from $(x^{*}, y_1)$ to $(x^{*}, y_2)$. Since traffic is uniformly distributed along the X-dimension, the Y-dimension channels with different X positions but the same Y position carry the same traffic load. So the Y-dimension traffic of XYX-Routing reduces to the case in Claim 1, namely, minimal-length routing in a one-dimensional mesh. Thus, the maximal channel load in the Y dimension for XYX-Routing is $(k-1)/2$ when the radix is odd and $k/2$ when $k$ is even.
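The one-dimensional loads used here can be checked numerically. The sketch below is an illustration only: it assumes the Tornado pattern sends node $x$ to $(x + \lfloor k/2 \rfloor) \bmod k$ with unit rate, and it measures the heaviest directed channel on a $k$-node line under minimal routing. It evaluates this one traffic pattern and does not by itself show that no admissible traffic is worse.

```python
def line_channel_loads(dest):
    """Load on each directed channel of a line when node x sends one unit
    of traffic to dest[x] along the unique minimal path."""
    loads = {}
    for src, dst in enumerate(dest):
        step = 1 if dst > src else -1
        for node in range(src, dst, step):
            channel = (node, node + step)
            loads[channel] = loads.get(channel, 0) + 1
    return loads


def tornado(k):
    """Assumed Tornado permutation: x -> (x + floor(k/2)) mod k."""
    return [(x + k // 2) % k for x in range(k)]


def tornado_max_load(k):
    """Heaviest channel load on a k-node line under the Tornado pattern."""
    return max(line_channel_loads(tornado(k)).values())


if __name__ == "__main__":
    for k in range(2, 10):
        print(k, tornado_max_load(k))
```

For every radix the result equals $\lfloor k/2 \rfloor$, which is $(k-1)/2$ for odd $k$ and $k/2$ for even $k$, matching the loads stated in Claim 1.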
Therefore, since $(k^2-1)/(2k) > (k-1)/2$, the worst-case channel load for XYX-Routing is $(k^2-1)/(2k)$ when the radix is odd and $k/2$ when the radix is even. The worst-case throughput for XYX-Routing is then $2k/(k^2-1)$ when the radix is odd and $2/k$ when the radix is even. As shown in Section II-B, the network capacity for the mesh is the inverse of $\Upsilon$, namely, $4k/(k^2-1)$ when the radix is odd and $4/k$ when the radix is even. Thus, the worst-case throughput for XYX-Routing equals half of network capacity.

Claim 3. For the odd radix two-dimensional mesh, U2TURN achieves a worst-case throughput bound of $(k+1)/(2k+1)$ of network capacity, which is higher than half of network capacity. For the even radix two-dimensional mesh, U2TURN achieves the optimal worst-case throughput, namely, half of network capacity.

Proof: The worst-case channel load analysis for YXY-Routing is the same as for XYX-Routing because of symmetry. It follows that the worst-case channel load of YXY-Routing in the odd radix two-dimensional mesh is $(k-1)/2$ in the X-dimension and $(k^2-1)/(2k)$ in the Y-dimension, respectively. For the even radix two-dimensional mesh, the worst-case channel load of YXY-Routing is $k/2$ in both the X and Y dimensions.

U2TURN consists of two parts: XYX-Routing with probability 0.5 and YXY-Routing with probability 0.5. Thus, for the odd radix two-dimensional mesh, the worst-case channel load of U2TURN in the X-dimension is

$$0.5 \cdot \frac{k^2-1}{2k} + 0.5 \cdot \frac{k-1}{2} = \frac{2k^2-k-1}{4k}.$$

Similarly, the worst-case channel load of U2TURN in the Y-dimension is also $(2k^2-k-1)/(4k)$. So the worst-case channel load of U2TURN is $(2k^2-k-1)/(4k)$ and the worst-case throughput is $4k/(2k^2-k-1)$ in the odd radix two-dimensional mesh.

For the even radix two-dimensional mesh, the worst-case channel loads of U2TURN in the X-dimension and Y-dimension are both $0.5 \cdot (k/2) + 0.5 \cdot (k/2) = k/2$. So the worst-case channel load of U2TURN is $k/2$ and the worst-case throughput is $2/k$ in the even radix two-dimensional mesh.

As shown in Section II-B, the network capacity for the mesh is $4k/(k^2-1)$ when the radix is odd and $4/k$ when the radix is even.
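The arithmetic of this proof can be checked with exact fractions. The following Python sketch is only a consistency check under the stated assumptions: it takes the per-dimension XYX loads from Claim 2 as given, mirrors them for YXY-Routing, averages the two, and normalizes against the uniform-traffic load of Section II-B. Function names are illustrative.

```python
from fractions import Fraction


def xyx_loads(k):
    """(X-dimension, Y-dimension) worst-case loads of XYX-Routing, per Claim 2."""
    if k % 2 == 0:
        return Fraction(k, 2), Fraction(k, 2)
    return Fraction(k * k - 1, 2 * k), Fraction(k - 1, 2)


def u2turn_normalized_worst_case(k):
    """Normalized worst-case throughput of an equal mix of XYX and YXY routing."""
    lx_xyx, ly_xyx = xyx_loads(k)
    lx_yxy, ly_yxy = ly_xyx, lx_xyx          # YXY is XYX with X and Y swapped
    load_x = (lx_xyx + lx_yxy) / 2
    load_y = (ly_xyx + ly_yxy) / 2
    worst_load = max(load_x, load_y)
    uniform = Fraction(k, 4) if k % 2 == 0 else Fraction(k * k - 1, 4 * k)
    return uniform / worst_load               # throughput relative to capacity


if __name__ == "__main__":
    for k in (3, 4, 5, 6, 7, 8):
        print(k, u2turn_normalized_worst_case(k))
```

For odd $k$ the value simplifies to $(k+1)/(2k+1)$, for example 4/7 at $k = 3$, and for even $k$ it is exactly 1/2.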
Therefore, for an odd radix two-dimensional mesh, U2TURN achieves a worst-case throughput bound of

$$\frac{4k}{2k^2-k-1} \Big/ \frac{4k}{k^2-1} = \frac{k+1}{2k+1}$$

of network capacity, which is higher than half of network capacity for every odd radix $k$. For an even radix two-dimensional mesh, U2TURN achieves a worst-case throughput of $\frac{2}{k} \big/ \frac{4}{k} = 0.5$ of network capacity, which is optimal.

Fig. 2 shows that U2TURN outperforms VAL, DOR and O1TURN in worst-case throughput for different odd radix networks. The worst-case throughput in the even radix case is optimal, so we do not show the even results in Fig. 2. Unlike U2TURN, the optimal routing (including 2TURN) proposed in [9] does not have a closed-form description; it requires solving a separate linear program for each network radix. The linear program grows so fast that it is difficult to scale to large networks [7]. Note that U2TURN achieves the same optimal worst-case throughput as the optimal routing that can be obtained with the method proposed in [9].

[Fig. 2. Worst-case throughput for odd radix meshes. Axes: network radix ($k$) versus normalized throughput. Series: VAL, DOR, O1TURN, our algorithm, optimal routing.]

2) Latency Analysis: For latency analysis, we use the average hop count $H_{ave}(R)$ of a given routing algorithm $R$ to measure the average latency. We assume that the traffic between all source-destination pairs is the same when computing $H_{ave}(R)$ [6]. The average hop count for DOR in a $k$-ary 2-mesh network is $\frac{k^2-1}{3k} + \frac{k^2-1}{3k}$, where each component $\frac{k^2-1}{3k}$ represents the average hop count in each dimension using minimal-length routing [1][6]. In U2TURN, minimal-length routing is used in three segments. However, as mentioned in Section III-A, there is a degenerate case when $y_1 = y_2$, which has a probability of $1/k$. In this degenerate case, there is only one minimal-length routing segment. Therefore, the average hop count of U2TURN is given as follows:

$$H_{ave}(\text{U2TURN}) = 3 \cdot \frac{k^2-1}{3k}\left(1 - \frac{1}{k}\right) + \frac{1}{k} \cdot \frac{k^2-1}{3k} = \frac{k^2-1}{3k}\left(3 - \frac{2}{k}\right) \qquad (3)$$

In order to quantify the penalty factor in average latency for a given routing algorithm $R$ over DOR, we normalize the average hop count
of $R$ to that of DOR. So the penalty factor for U2TURN is $\frac{3}{2} - \frac{1}{k}$, which is less than 1.5 for all radix. Compared with VAL, which has a penalty factor of 2, U2TURN achieves better performance in both throughput and latency. U2TURN is a non-minimal routing algorithm, with approximately 1.5x the average hop count of O1TURN and DOR. In comparison with DOR and O1TURN, U2TURN offers a latency-bandwidth tradeoff for mesh networks: better throughput at the expense of a higher latency cost.

[Fig. 3. Traffic patterns which achieve the maximal load lower bound in DOR routing: (a) DOR-XY routing; (b) DOR-YX routing.]

IV. ASYMMETRIC MESHES

Network Capacity: Section II-B discusses the network capacity for $k$-ary $n$-mesh networks, namely symmetric meshes. We can obtain the network capacity of asymmetric meshes using the same analysis method as for symmetric meshes [1][6]. Suppose $k_{max} = \max(k_x, k_y)$ in the asymmetric mesh; the maximal sustainable channel loads under uniform traffic are

$$\Upsilon_{asym} = \begin{cases} \dfrac{k_{max}}{4}, & k_{max} \text{ is even} \\[2mm] \dfrac{k_{max}^{2}-1}{4k_{max}}, & k_{max} \text{ is odd} \end{cases} \qquad (4)$$

where $\Upsilon_{asym}$ is the inverse of the network capacity.

A. Throughput Analysis for Different Algorithms

In the following, we analyze the worst-case throughput of different routing algorithms in asymmetric meshes under the assumption $k_{max} = k_x$. For the case $k_{max} = k_y$, these algorithms just need to exchange the operations between the X and Y dimensions.

VAL: For VAL routing, the two-stage minimal routing carries twice the uniform load. Thus, the worst-case channel load of VAL is $2\Upsilon_{asym}$, which results in the normalized worst-case throughput

$$T_{WC}(\mathrm{VAL}) = \frac{1}{2}. \qquad (5)$$

DOR: For DOR routing, the worst-case throughput can be different for various $k_y$ in meshes where $k_{max} = k_x > k_y > 1$. However, we can obtain a bound on the maximal load that holds for different $k_y$:

$$L_{max}(\mathrm{DOR}) \geq \begin{cases} \dfrac{k_{max}}{2}, & k_{max} \text{ is even} \\[2mm] \dfrac{1+k_{max}}{2}, & k_{max} \text{ is odd} \end{cases} \qquad (6)$$

The result for the even $k_{max}$ case is easy to find under the traffic in which the nodes in the left half are sources and the nodes in the right half are destinations.
For the odd $k_{max}$ case, the result can be found under the traffic patterns shown in Fig. 3, where the red nodes represent sources and the green nodes represent destinations. For both DOR-XY and DOR-YX routing, the channel loads represented by black arrows achieve the maximal load bound of $(1+k_{max})/2$. Eqn. 6 only gives lower bounds. Certainly, there are worse cases for different $k_y$. For example, when $k_x - k_y = 1$, the maximal loads are $k_{max} - 1$ for both even and odd $k_{max}$, as shown in Fig. 4. Therefore, according to Eqn. 6, the upper bound of the normalized worst-case throughput for DOR routing is

$$T_{WC}(\mathrm{DOR}) \leq \begin{cases} \dfrac{1}{2}, & k_{max} \text{ is even} \\[2mm] \dfrac{k_{max}-1}{2k_{max}} < \dfrac{1}{2}, & k_{max} \text{ is odd} \end{cases} \qquad (7)$$

[Fig. 4. Examples where the maximal loads are $k_{max} - 1$ in DOR routing: (a) DOR-XY routing (7×6); (b) DOR-YX routing (6×5).]

O1TURN: For O1TURN, we can also obtain a bound on the maximal load:

$$L_{max}(\mathrm{O1TURN}) \geq \frac{k_{max}}{2}, \quad k_{max} \text{ is even or odd.} \qquad (8)$$

Fig. 5 shows the traffic patterns for this load lower bound in both the even and odd $k_{max}$ cases. For the 7×6 mesh (Fig. 5(a)) and the 6×5 mesh (Fig. 5(b)), the channel loads represented by black arrows are 3.5 and 3, respectively. Therefore, according to Eqn. 8, the upper bound of the normalized worst-case throughput for O1TURN routing is

$$T_{WC}(\mathrm{O1TURN}) \leq \begin{cases} \dfrac{1}{2}, & k_{max} \text{ is even} \\[2mm] \dfrac{k_{max}^{2}-1}{2k_{max}^{2}} < \dfrac{1}{2}, & k_{max} \text{ is odd} \end{cases} \qquad (9)$$

[Fig. 5. Traffic patterns which achieve the maximal load lower bound in O1TURN routing: (a) O1TURN routing (7×6); (b) O1TURN routing (6×5).]

U2TURN: Section III-A describes the U2TURN algorithm for the symmetric case. Following the analysis of Claim 2 in Section III-B, we can obtain the maximal channel loads in the X and Y dimensions ($L_x$ and $L_y$) in the mesh $k_x > k_y$ for YXY and XYX routing, which are shown in Table I. For meshes with $k_x > k_y$, the maximal channel load is decided by $L_x$ because $L_x > L_y$ for all the cases in Table I.
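The per-dimension loads behind Table I can be sketched as follows. This is an illustration that assumes the reasoning of Claims 1 and 2 carries over to a $k_x \times k_y$ mesh unchanged: the dimension that is traversed twice sees twice the uniform load of a line of that length, and the other dimension sees the one-dimensional minimal-routing worst case. Function names are hypothetical.

```python
from fractions import Fraction


def one_dim_minimal_load(k):
    """Worst-case load of minimal routing on a k-node line: (k-1)/2 odd, k/2 even."""
    return Fraction(k // 2)


def twice_uniform_load(k):
    """Twice the uniform-traffic channel load of a k-node line."""
    return Fraction(k, 2) if k % 2 == 0 else Fraction(k * k - 1, 2 * k)


def table1_loads(kx, ky):
    """Maximal X and Y channel loads for XYX-routing and YXY-routing."""
    return {
        "XYX": {"Lx": twice_uniform_load(kx), "Ly": one_dim_minimal_load(ky)},
        "YXY": {"Lx": one_dim_minimal_load(kx), "Ly": twice_uniform_load(ky)},
    }


if __name__ == "__main__":
    for kx, ky in ((7, 6), (6, 5), (8, 4), (9, 5)):
        print((kx, ky), table1_loads(kx, ky))
```

Under these assumptions, $L_x$ is never smaller than $L_y$ whenever $k_x > k_y$, so the X-dimension determines the maximal channel load for both routings.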
YXY-routing achieves excellent worst-case channel loads when $k_x > k_y$:

$$L_{max}(\mathrm{YXY}) = \begin{cases} \dfrac{k_{max}}{2}, & k_{max} \text{ is even} \\[2mm] \dfrac{k_{max}-1}{2}, & k_{max} \text{ is odd} \end{cases} \qquad (10)$$

Thus, the normalized worst-case throughput for YXY-routing in the mesh $k_x > k_y$ is

$$T_{WC}(\mathrm{YXY}) = \begin{cases} \dfrac{1}{2}, & k_{max} \text{ is even} \\[2mm] \dfrac{k_{max}+1}{2k_{max}} > \dfrac{1}{2}, & k_{max} \text{ is odd} \end{cases} \qquad (11)$$

For meshes with $k_y > k_x$, XYX-routing achieves the same worst-case throughput as YXY-routing in meshes with $k_x > k_y$, namely, Eqn. 11. Therefore, U2TURN only implements YXY-routing in $k_x > k_y$ meshes. In $k_x < k_y$ meshes, U2TURN only implements XYX-routing. Thus, U2TURN achieves the worst-case throughput of Eqn. 11.

U2TURN-A: U2TURN achieves the optimal worst-case throughput for both odd and even radix networks and for both symmetric and asymmetric networks. In the case of asymmetric meshes, we can employ a variant of U2TURN that we refer to as U2TURN-A. U2TURN-A is a variant of U2TURN that optimizes for average-case performance in asymmetric meshes.
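The closed forms for VAL, DOR, O1TURN and U2TURN in asymmetric meshes can be compared side by side. The Python sketch below assumes the expressions read as follows: 1/2 for VAL (Eqn. 5), $(k_{max}-1)/(2k_{max})$ as an upper bound for DOR with odd $k_{max}$ (Eqn. 7), $(k_{max}^2-1)/(2k_{max}^2)$ as an upper bound for O1TURN with odd $k_{max}$ (Eqn. 9), and $(k_{max}+1)/(2k_{max})$ for U2TURN with odd $k_{max}$ (Eqn. 11), with 1/2 in every even case. The function name is illustrative.

```python
from fractions import Fraction


def asym_worst_case_throughput(k_max):
    """Normalized worst-case throughput per algorithm for a given k_max.

    VAL and U2TURN entries are the stated values; DOR and O1TURN entries
    are upper bounds, so the real throughput of those two may be lower.
    """
    half = Fraction(1, 2)
    if k_max % 2 == 0:
        return {"VAL": half, "DOR": half, "O1TURN": half, "U2TURN": half}
    return {
        "VAL": half,
        "DOR": Fraction(k_max - 1, 2 * k_max),
        "O1TURN": Fraction(k_max ** 2 - 1, 2 * k_max ** 2),
        "U2TURN": Fraction(k_max + 1, 2 * k_max),
    }


if __name__ == "__main__":
    for k_max in (5, 6, 7, 9):
        print(k_max, asym_worst_case_throughput(k_max))
```

For odd $k_{max}$ the ordering is U2TURN above VAL above the O1TURN bound above the DOR bound; for $k_{max} = 7$ the values are 4/7, 1/2, 24/49 and 3/7.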
TABLE I. MAXIMAL CHANNEL LOADS IN X AND Y DIMENSIONS FOR XYX-ROUTING AND YXY-ROUTING.

k_x × k_y mesh       | XYX-routing                                       | YXY-routing
even k_x, even k_y   | L_x = k_x/2, L_y = k_y/2                          | L_x = k_x/2, L_y = k_y/2
even k_x, odd k_y    | L_x = k_x/2, L_y = (k_y - 1)/2                    | L_x = k_x/2, L_y = (k_y^2 - 1)/(2 k_y)
odd k_x, even k_y    | L_x = (k_x^2 - 1)/(2 k_x), L_y = k_y/2            | L_x = (k_x - 1)/2, L_y = k_y/2
odd k_x, odd k_y     | L_x = (k_x^2 - 1)/(2 k_x), L_y = (k_y - 1)/2      | L_x = (k_x - 1)/2, L_y = (k_y^2 - 1)/(2 k_y)

U2TURN-A can achieve better average-case throughput than U2TURN at the expense of worst-case throughput optimality. U2TURN-A is implemented by using XYX-routing and YXY-routing with the same probability. According to Table I, U2TURN-A achieves the following worst-case channel loads when $k_x > k_y$:

$$L_{max}(\text{U2TURN-A}) = L_x(\text{U2TURN-A}) = \tfrac{1}{2}L_x(\mathrm{XYX}) + \tfrac{1}{2}L_x(\mathrm{YXY}) = \begin{cases} \dfrac{k_{max}}{2}, & k_{max} \text{ is even} \\[2mm] \dfrac{(k_{max}-1)(2k_{max}+1)}{4k_{max}}, & k_{max} \text{ is odd} \end{cases} \qquad (12)$$

For meshes with $k_x < k_y$, U2TURN-A achieves the same maximal channel load as in the $k_x > k_y$ case, namely, Eqn. 12, except that $k_{max} = k_y$. So, the normalized worst-case throughput is

$$T_{WC}(\text{U2TURN-A}) = \begin{cases} \dfrac{1}{2}, & k_{max} \text{ is even} \\[2mm] \dfrac{k_{max}+1}{2k_{max}+1} > \dfrac{1}{2}, & k_{max} \text{ is odd} \end{cases} \qquad (13)$$

As can be seen in Eqn. 13, the worst-case throughput of U2TURN-A is the same as that of U2TURN for the even radix case, which is optimal. However, in the odd radix case, although the worst-case throughput of U2TURN-A is better than 1/2 network capacity, it is only near-optimal in that U2TURN-A cannot achieve the optimal worst-case throughput of U2TURN shown in Eqn. (11).

B. Throughput Comparison for Different Algorithms

According to Equations 5, 7, 9, 11 and 13, our U2TURN and U2TURN-A algorithms achieve better worst-case throughput than VAL, DOR and O1TURN in asymmetric meshes. The theoretical analysis above shows that U2TURN and U2TURN-A achieve the best worst-case throughput of 1/2 network capacity in asymmetric meshes with even $k_{max}$. For asymmetric meshes with odd $k_{max}$, U2TURN and U2TURN-A achieve a worst-case throughput better than 1/2 network capacity. Fig. 6 shows the normalized worst-case throughput of different routing algorithms for different asymmetric mesh sizes. It shows that our U2TURN and U2TURN-A algorithms outperform existing algorithms like VAL, DOR and O1TURN in worst-case throughput, which also validates the theoretical worst-case throughput analysis for asymmetric meshes.

[Fig. 6. Worst-case throughput for different routing algorithms in different asymmetric mesh sizes.]

As for average-case throughput, our algorithms also achieve excellent performance. Fig. 7 shows the average-case throughput of different routing algorithms for different asymmetric mesh sizes. Compared with DOR, O1TURN and VAL, U2TURN-A achieves better average-case throughput by 4.6%, 0.8% and 4.5%, respectively. Compared with DOR, O1TURN and VAL, our U2TURN achieves better average-case throughput by .9%, 8.5% and 8.8%, respectively. Compared with U2TURN-A, U2TURN achieves better worst-case throughput but worse average throughput.

[Fig. 7. Average-case throughput for different routing algorithms in different asymmetric mesh sizes.]

V. EVALUATION FOR SYMMETRIC MESHES

In this section, we compare the throughput of U2TURN with VAL, DOR and O1TURN with respect to each traffic matrix shown in Table II. Table III and Table IV provide the evaluation results for different odd and even radix symmetric networks, respectively. As shown in Table III, U2TURN outperforms all other routing algorithms in worst-case throughput for odd radix mesh networks, and achieves a new optimal worst-case throughput bound of $(k+1)/(2k+1)$, which is higher than 1/2 of network capacity. For the even radix mesh networks shown in Table IV, U2TURN achieves the best worst-case throughput, namely, half of network capacity. This also validates the throughput analysis of U2TURN in Section III-B. For average-case throughput, U2TURN also outperforms VAL, DOR, and O1TURN, on average over the six evaluated network sizes (both odd and even radix), by 7.%, 39.7% and 8.8%, respectively.
U2TURN also performs very well in both odd and even radix networks under adversarial traffic patterns, namely, Transpose and DOR-WC. For Complement and Nearest-Neighbor traffic, U2TURN outperforms VAL, but does worse than DOR and O1TURN. For ideal uniformly Random traffic, although DOR achieves better throughput than U2TURN, DOR is significantly worse in average-case and worst-case throughput. In summary, our evaluation shows that U2TURN performs very well across different network sizes and traffic matrices.

VI. CONCLUSION

U2TURN is a closed-form non-minimal oblivious routing algorithm for mesh networks. We prove that it achieves a new worst-case throughput bound of $(k+1)/(2k+1)$ of network capacity with odd radix, which is higher than 1/2 network capacity. For even radix
meshes, U2TURN achieves the optimal worst-case throughput. Thus, it outperforms existing algorithms such as VAL, DOR and O1TURN in worst-case throughput. In addition, when compared with the optimal routings that can be obtained with the method proposed in [9] for odd radix meshes, U2TURN achieves the same optimal worst-case throughput. For average-case throughput, U2TURN also outperforms VAL, DOR and O1TURN, on average over the six simulation cases, by 7.%, 39.7% and 8.8%, respectively. Moreover, we provide a closed-form expression for the average hop count of U2TURN in the latency analysis. In comparison with DOR and O1TURN, U2TURN achieves a higher throughput at the expense of an average hop count that is approximately 1.5x minimal. For asymmetric meshes, both theoretical analysis and simulation show that our U2TURN and U2TURN-A algorithms achieve better worst-case and average-case throughput than the existing algorithms VAL, DOR and O1TURN.

TABLE II. TRAFFIC MATRICES EVALUATED.

Worst-Case        | Worst-case traffic matrix that results in the lowest throughput
Average-Case      | Average throughput over a million random traffic matrices
Transpose         | Packets route from (x, y) to (y, x)
Random            | Destination node chosen uniformly at random
DOR-WC            | Worst-case traffic for DOR. Packets route from (x, y) to (k-y-1, k-x-1)
Complement        | Packet at (x, y) sent to (k-x-1, k-y-1)
Nearest-Neighbor  | Packets sent to nodes one hop away with equal probability

TABLE III. NORMALIZED WORST-CASE & AVERAGE-CASE THROUGHPUT FOR ODD CASE.

Mesh sizes: 3x3, 5x5, 7x7
Columns: VAL, DOR, O1TURN, U2TURN
Rows (per mesh size): Worst-case, Average-case, Transpose, Random, DOR-WC, Complement, Nearest-Neighbor
[Numeric entries are missing from this transcription.]

VII. ACKNOWLEDGMENTS

This work was partially supported by the Creative Research Groups of National Natural Science Foundation (NNSF) under grant No and 63305, the National S&T Major Project under No.
0ZX and 009ZX , Research Fund of Tsinghua University under No , and ZTE Corporation. The authors would also like to acknowledge the support of NSF CCF

TABLE IV. NORMALIZED WORST-CASE & AVERAGE-CASE THROUGHPUT FOR EVEN CASE.

Mesh sizes: 4x4, 6x6, 8x8
Columns: VAL, DOR, O1TURN, U2TURN
Rows (per mesh size): Worst-case, Average-case, Transpose, Random, DOR-WC, Complement, Nearest-Neighbor
[Numeric entries are missing from this transcription.]

REFERENCES

[1] W. Dally and B. Towles. Principles and practices of interconnection networks. Morgan Kaufmann, 2004.
[2] S. Guang, L. Shijun, J. Depeng, L. Yong, S. Li, Y. Zhang, and Z. Lieguang. Performance-aware hybrid algorithm for mapping IPs onto mesh-based network on chip. IEICE Transactions on Information and Systems, 94(5), 2011.
[3] T. Nesson and S. Johnsson. ROMM routing on mesh and torus networks. In Proceedings of the seventh annual ACM symposium on Parallel algorithms and architectures. ACM, 1995.
[4] D. Seo, A. Ali, W. Lim, N. Rafique, and M. Thottethodi. Near-optimal worst-case throughput routing for two-dimensional mesh networks. In ACM SIGARCH Computer Architecture News, volume 33. IEEE Computer Society, 2005.
[5] H. Sullivan and T. Bashkow. A large scale, homogeneous, fully distributed parallel machine, I. In ACM SIGARCH Computer Architecture News, volume 5. ACM, 1977.
[6] R. Sunkam Ramanujam and B. Lin. Randomized partially-minimal routing on three-dimensional mesh networks. Computer Architecture Letters, 7(2):37-40, 2008.
[7] R. Sunkam Ramanujam and B. Lin. Weighted random oblivious routing on torus networks. In Proceedings of the 5th ACM/IEEE Symposium on Architectures for Networking and Communications Systems. ACM, 2009.
[8] B. Towles and W. Dally. Worst-case traffic for oblivious routing functions. In Proceedings of the fourteenth annual ACM symposium on Parallel algorithms and architectures. ACM, 2002.
[9] B. Towles, W. Dally, and S. Boyd. Throughput-centric routing algorithm design. In Proceedings of the fifteenth annual ACM symposium on Parallel algorithms and architectures. ACM, 2003.
[10] L. Valiant and G. Brebner. Universal schemes for parallel communication. In Proceedings of the thirteenth annual ACM symposium on Theory of computing. ACM, 1981.
More informationGlobal Adaptive Routing Algorithm Without Additional Congestion Propagation Network
1 Global Adaptive Routing Algorithm Without Additional Congestion ropagation Network Shaoli Liu, Yunji Chen, Tianshi Chen, Ling Li, Chao Lu Institute of Computing Technology, Chinese Academy of Sciences
More informationNear-Optimal Worst-case Throughput Routing for Two-Dimensional Mesh Networks
Near-Optimal Worst-case Throughput Routing for Two-Dimensional Mesh Networks Daeho Seo Akif Ali Won-Taek Lim Nauman Rafique Mithuna Thottethodi School of Electrical and Computer Engineering Purdue University
More informationInterconnection Networks: Topology. Prof. Natalie Enright Jerger
Interconnection Networks: Topology Prof. Natalie Enright Jerger Topology Overview Definition: determines arrangement of channels and nodes in network Analogous to road map Often first step in network design
More informationTopologies. Maurizio Palesi. Maurizio Palesi 1
Topologies Maurizio Palesi Maurizio Palesi 1 Network Topology Static arrangement of channels and nodes in an interconnection network The roads over which packets travel Topology chosen based on cost and
More informationNetwork-on-chip (NOC) Topologies
Network-on-chip (NOC) Topologies 1 Network Topology Static arrangement of channels and nodes in an interconnection network The roads over which packets travel Topology chosen based on cost and performance
More informationThroughput-Centric Routing Algorithm Design
Throughput-Centric Routing Algorithm Design Brian Towles, William J. Dally, and Stephen Boyd Department of Electrical Engineering Stanford University {btowles,billd}@cva.stanford.edu ABSTRACT The increasing
More informationStatic Virtual Channel Allocation in Oblivious Routing
Static Virtual Channel Allocation in Oblivious Routing Keun Sup Shim Myong Hyon Cho Michel Kinsy Tina Wen Mieszko Lis G. Edward Suh Srinivas Devadas Computer Science and Artificial Intelligence Laboratory
More informationChapter 7 Slicing and Dicing
1/ 22 Chapter 7 Slicing and Dicing Lasse Harju Tampere University of Technology lasse.harju@tut.fi 2/ 22 Concentrators and Distributors Concentrators Used for combining traffic from several network nodes
More informationEarly Transition for Fully Adaptive Routing Algorithms in On-Chip Interconnection Networks
Technical Report #2012-2-1, Department of Computer Science and Engineering, Texas A&M University Early Transition for Fully Adaptive Routing Algorithms in On-Chip Interconnection Networks Minseon Ahn,
More informationTopology basics. Constraints and measures. Butterfly networks.
EE48: Advanced Computer Organization Lecture # Interconnection Networks Architecture and Design Stanford University Topology basics. Constraints and measures. Butterfly networks. Lecture #: Monday, 7 April
More informationBasic Switch Organization
NOC Routing 1 Basic Switch Organization 2 Basic Switch Organization Link Controller Used for coordinating the flow of messages across the physical link of two adjacent switches 3 Basic Switch Organization
More informationDesign of a router for network-on-chip. Jun Ho Bahn,* Seung Eun Lee and Nader Bagherzadeh
98 Int. J. High Performance Systems Architecture, Vol. 1, No. 2, 27 Design of a router for network-on-chip Jun Ho Bahn,* Seung Eun Lee and Nader Bagherzadeh Department of Electrical Engineering and Computer
More informationBARP-A Dynamic Routing Protocol for Balanced Distribution of Traffic in NoCs
-A Dynamic Routing Protocol for Balanced Distribution of Traffic in NoCs Pejman Lotfi-Kamran, Masoud Daneshtalab *, Caro Lucas, and Zainalabedin Navabi School of Electrical and Computer Engineering, The
More informationOblivious Routing in On-Chip Bandwidth-Adaptive Networks Myong Hyon Cho, Mieszko Lis, Keun Sup Shim, Michel Kinsy, Tina Wen, and Srinivas Devadas
Computer Science and Artificial Intelligence Laboratory Technical Report MIT-CSAIL-TR-009-0 March 7, 009 Oblivious Routing in On-Chip Bandwidth-Adaptive Networks Myong Hyon Cho, Mieszko Lis, Keun Sup Shim,
More informationLecture 3: Topology - II
ECE 8823 A / CS 8803 - ICN Interconnection Networks Spring 2017 http://tusharkrishna.ece.gatech.edu/teaching/icn_s17/ Lecture 3: Topology - II Tushar Krishna Assistant Professor School of Electrical and
More informationCS 498 Hot Topics in High Performance Computing. Networks and Fault Tolerance. 9. Routing and Flow Control
CS 498 Hot Topics in High Performance Computing Networks and Fault Tolerance 9. Routing and Flow Control Intro What did we learn in the last lecture Topology metrics Including minimum diameter of directed
More informationCenter Concentrated X-Torus Topology
Center Concentrated X-Torus Topolog Dinesh Kumar 1, Vive Kumar Sehgal, and Nitin 3 1 Department of Comp. Sci. & Eng., Japee Universit of Information Technolog, Wanaghat, Solan, Himachal Pradesh, INDIA-17334.
More informationPath-Diverse Inorder Routing
Path-Diverse Inorder Routing Mieszko Lis, Myong Hyon Cho, Keun Sup Shim, and Srinivas Devadas Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology {mieszko, mhcho,
More informationA Novel Energy Efficient Source Routing for Mesh NoCs
2014 Fourth International Conference on Advances in Computing and Communications A ovel Energy Efficient Source Routing for Mesh ocs Meril Rani John, Reenu James, John Jose, Elizabeth Isaac, Jobin K. Antony
More informationIN this letter we focus on OpenFlow-based network state
1 On the Impact of Networ State Collection on the Performance of SDN Applications Mohamed Aslan, and Ashraf Matrawy Carleton University, ON, Canada Abstract Intelligent and autonomous SDN applications
More informationScheduling Unsplittable Flows Using Parallel Switches
Scheduling Unsplittable Flows Using Parallel Switches Saad Mneimneh, Kai-Yeung Siu Massachusetts Institute of Technology 77 Massachusetts Avenue Room -07, Cambridge, MA 039 Abstract We address the problem
More informationLecture 2: Topology - I
ECE 8823 A / CS 8803 - ICN Interconnection Networks Spring 2017 http://tusharkrishna.ece.gatech.edu/teaching/icn_s17/ Lecture 2: Topology - I Tushar Krishna Assistant Professor School of Electrical and
More informationECE 1749H: Interconnec1on Networks for Parallel Computer Architectures: Rou1ng. Prof. Natalie Enright Jerger
ECE 1749H: Interconnec1on Networks for Parallel Computer Architectures: Rou1ng Prof. Natalie Enright Jerger Announcements Feedback on your project proposals This week Scheduled extended 1 week Next week:
More informationSlim Fly: A Cost Effective Low-Diameter Network Topology
TORSTEN HOEFLER, MACIEJ BESTA Slim Fly: A Cost Effective Low-Diameter Network Topology Images belong to their creator! NETWORKS, LIMITS, AND DESIGN SPACE Networks cost 25-30% of a large supercomputer Hard
More informationPerformance of Multihop Communications Using Logical Topologies on Optical Torus Networks
Performance of Multihop Communications Using Logical Topologies on Optical Torus Networks X. Yuan, R. Melhem and R. Gupta Department of Computer Science University of Pittsburgh Pittsburgh, PA 156 fxyuan,
More informationRouting Algorithms for 2-D Mesh Network-On-Chip Architectures
Routing Algorithms for 2-D Mesh Network-On-hip Architectures Edward hron Gene Kishinevsky Brandon Nefcy Nishant Patil Department of Electrical Engineering, Stanford University {echron, genek, nefcy, nppatil}@stanford.edu
More informationAdaptive Channel Queue Routing on k-ary n-cubes
Adaptive Channel Queue Routing on k-ary n-cubes Arjun Singh, William J Dally, Amit K Gupta, Brian Towles Computer Systems Laboratory, Stanford University {arjuns,billd,btowles,agupta}@cva.stanford.edu
More informationTopologies. Maurizio Palesi. Maurizio Palesi 1
Topologies Maurizio Palesi Maurizio Palesi 1 Network Topology Static arrangement of channels and nodes in an interconnection network The roads over which packets travel Topology chosen based on cost and
More informationECE 4750 Computer Architecture, Fall 2017 T06 Fundamental Network Concepts
ECE 4750 Computer Architecture, Fall 2017 T06 Fundamental Network Concepts School of Electrical and Computer Engineering Cornell University revision: 2017-10-17-12-26 1 Network/Roadway Analogy 3 1.1. Running
More informationDeadlock-free XY-YX router for on-chip interconnection network
LETTER IEICE Electronics Express, Vol.10, No.20, 1 5 Deadlock-free XY-YX router for on-chip interconnection network Yeong Seob Jeong and Seung Eun Lee a) Dept of Electronic Engineering Seoul National Univ
More informationDynamic Stress Wormhole Routing for Spidergon NoC with effective fault tolerance and load distribution
Dynamic Stress Wormhole Routing for Spidergon NoC with effective fault tolerance and load distribution Nishant Satya Lakshmikanth sailtosatya@gmail.com Krishna Kumaar N.I. nikrishnaa@gmail.com Sudha S
More informationChapter 3 : Topology basics
1 Chapter 3 : Topology basics What is the network topology Nomenclature Traffic pattern Performance Packaging cost Case study: the SGI Origin 2000 2 Network topology (1) It corresponds to the static arrangement
More informationInterconnection networks
Interconnection networks When more than one processor needs to access a memory structure, interconnection networks are needed to route data from processors to memories (concurrent access to a shared memory
More informationOblivious Routing in On-Chip Bandwidth-Adaptive Networks
009 8th International Conference on Parallel Architectures and Compilation Techniques Oblivious Routing in On-Chip Bandwidth-Adaptive Networks Myong Hyon Cho, Mieszko Lis, Keun Sup Shim, Michel Kinsy,
More informationAsynchronous Bypass Channel Routers
1 Asynchronous Bypass Channel Routers Tushar N. K. Jain, Paul V. Gratz, Alex Sprintson, Gwan Choi Department of Electrical and Computer Engineering, Texas A&M University {tnj07,pgratz,spalex,gchoi}@tamu.edu
More informationin Oblivious Routing
Static Virtual Channel Allocation in Oblivious Routing Keun Sup Shim, Myong Hyon Cho, Michel Kinsy, Tina Wen, Mieszko Lis G. Edward Suh (Cornell) Srinivas Devadas MIT Computer Science and Artificial Intelligence
More informationLecture 26: Interconnects. James C. Hoe Department of ECE Carnegie Mellon University
18 447 Lecture 26: Interconnects James C. Hoe Department of ECE Carnegie Mellon University 18 447 S18 L26 S1, James C. Hoe, CMU/ECE/CALCM, 2018 Housekeeping Your goal today get an overview of parallel
More informationPerformance Analysis of a Minimal Adaptive Router
Performance Analysis of a Minimal Adaptive Router Thu Duc Nguyen and Lawrence Snyder Department of Computer Science and Engineering University of Washington, Seattle, WA 98195 In Proceedings of the 1994
More informationDISTRIBUTED ROUTER FABRICS
DISTRIBUTED ROUTER FABRICS A DISSERTATION SUBMITTED TO THE DEPARTMENT OF ELECTRICAL ENGINEERING AND THE COMMITTEE ON GRADUATE STUDIES OF STANFORD UNIVERSITY IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR
More informationCentralized Adaptive Routing for NoCs
Centralized Adaptive Routing for NoCs Authors Name/s per 1st Affiliation (Author) line 1 (of Affiliation): dept. name of organization line 2: name of organization, acronyms acceptable line 3: City, Country
More informationA Survey of Techniques for Power Aware On-Chip Networks.
A Survey of Techniques for Power Aware On-Chip Networks. Samir Chopra Ji Young Park May 2, 2005 1. Introduction On-chip networks have been proposed as a solution for challenges from process technology
More informationRouting Algorithms. Review
Routing Algorithms Today s topics: Deterministic, Oblivious Adaptive, & Adaptive models Problems: efficiency livelock deadlock 1 CS6810 Review Network properties are a combination topology topology dependent
More informationSurveying Formal and Practical Approaches for Optimal Placement of Replicas on the Web
Surveying Formal and Practical Approaches for Optimal Placement of Replicas on the Web TR020701 April 2002 Erbil Yilmaz Department of Computer Science The Florida State University Tallahassee, FL 32306
More informationQUANTIZER DESIGN FOR EXPLOITING COMMON INFORMATION IN LAYERED CODING. Mehdi Salehifar, Tejaswi Nanjundaswamy, and Kenneth Rose
QUANTIZER DESIGN FOR EXPLOITING COMMON INFORMATION IN LAYERED CODING Mehdi Salehifar, Tejaswi Nanjundaswamy, and Kenneth Rose Department of Electrical and Computer Engineering University of California,
More informationDemand Based Routing in Network-on-Chip(NoC)
Demand Based Routing in Network-on-Chip(NoC) Kullai Reddy Meka and Jatindra Kumar Deka Department of Computer Science and Engineering, Indian Institute of Technology Guwahati, Guwahati, India Abstract
More informationAdaptive Routing in Hexagonal Torus Interconnection Networks
Adaptive Routing in Hexagonal Torus Interconnection Networks Arash Shamaei and Bella Bose School of Electrical Engineering and Computer Science Oregon State University Corvallis, OR 97331 5501 Email: {shamaei,bose}@eecs.oregonstate.edu
More informationSpider-Web Topology: A Novel Topology for Parallel and Distributed Computing
Spider-Web Topology: A Novel Topology for Parallel and Distributed Computing 1 Selvarajah Thuseethan, 2 Shanmuganathan Vasanthapriyan 1,2 Department of Computing and Information Systems, Sabaragamuwa University
More informationEC 513 Computer Architecture
EC 513 Computer Architecture On-chip Networking Prof. Michel A. Kinsy Virtual Channel Router VC 0 Routing Computation Virtual Channel Allocator Switch Allocator Input Ports VC x VC 0 VC x It s a system
More informationA Multicast Routing Algorithm for 3D Network-on-Chip in Chip Multi-Processors
Proceedings of the World Congress on Engineering 2018 ol I A Routing Algorithm for 3 Network-on-Chip in Chip Multi-Processors Rui Ben, Fen Ge, intian Tong, Ning Wu, ing hang, and Fang hou Abstract communication
More informationChapter 4 : Butterfly Networks
1 Chapter 4 : Butterfly Networks Structure of a butterfly network Isomorphism Channel load and throughput Optimization Path diversity Case study: BBN network 2 Structure of a butterfly network A K-ary
More informationThroughout this course, we use the terms vertex and node interchangeably.
Chapter Vertex Coloring. Introduction Vertex coloring is an infamous graph theory problem. It is also a useful toy example to see the style of this course already in the first lecture. Vertex coloring
More informationEfficient Second-Order Iterative Methods for IR Drop Analysis in Power Grid
Efficient Second-Order Iterative Methods for IR Drop Analysis in Power Grid Yu Zhong Martin D. F. Wong Dept. of Electrical and Computer Engineering Dept. of Electrical and Computer Engineering Univ. of
More informationECE 1749H: Interconnec1on Networks for Parallel Computer Architectures: Rou1ng. Prof. Natalie Enright Jerger
ECE 1749H: Interconnec1on Networks for Parallel Computer Architectures: Rou1ng Prof. Natalie Enright Jerger Rou1ng Overview Discussion of topologies assumed ideal rou1ng In prac1ce Rou1ng algorithms are
More informationDesign of a high-throughput distributed shared-buffer NoC router
Design of a high-throughput distributed shared-buffer NoC router The MIT Faculty has made this article openly available Please share how this access benefits you Your story matters Citation As Published
More informationFast-Response Multipath Routing Policy for High-Speed Interconnection Networks
HPI-DC 09 Fast-Response Multipath Routing Policy for High-Speed Interconnection Networks Diego Lugones, Daniel Franco, and Emilio Luque Leonardo Fialho Cluster 09 August 31 New Orleans, USA Outline Scope
More informationDynamic Wavelength Assignment for WDM All-Optical Tree Networks
Dynamic Wavelength Assignment for WDM All-Optical Tree Networks Poompat Saengudomlert, Eytan H. Modiano, and Robert G. Gallager Laboratory for Information and Decision Systems Massachusetts Institute of
More informationAchieve Significant Throughput Gains in Wireless Networks with Large Delay-Bandwidth Product
Available online at www.sciencedirect.com ScienceDirect IERI Procedia 10 (2014 ) 153 159 2014 International Conference on Future Information Engineering Achieve Significant Throughput Gains in Wireless
More informationRouting Algorithms, Process Model for Quality of Services (QoS) and Architectures for Two-Dimensional 4 4 Mesh Topology Network-on-Chip
Routing Algorithms, Process Model for Quality of Services (QoS) and Architectures for Two-Dimensional 4 4 Mesh Topology Network-on-Chip Nauman Jalil, Adnan Qureshi, Furqan Khan, and Sohaib Ayyaz Qazi Abstract
More informationImproving the Data Scheduling Efficiency of the IEEE (d) Mesh Network
Improving the Data Scheduling Efficiency of the IEEE 802.16(d) Mesh Network Shie-Yuan Wang Email: shieyuan@csie.nctu.edu.tw Chih-Che Lin Email: jclin@csie.nctu.edu.tw Ku-Han Fang Email: khfang@csie.nctu.edu.tw
More informationThe final publication is available at
Document downloaded from: http://hdl.handle.net/10251/82062 This paper must be cited as: Peñaranda Cebrián, R.; Gómez Requena, C.; Gómez Requena, ME.; López Rodríguez, PJ.; Duato Marín, JF. (2016). The
More informationA Low-Overhead Hybrid Routing Algorithm for ZigBee Networks. Zhi Ren, Lihua Tian, Jianling Cao, Jibi Li, Zilong Zhang
A Low-Overhead Hybrid Routing Algorithm for ZigBee Networks Zhi Ren, Lihua Tian, Jianling Cao, Jibi Li, Zilong Zhang Chongqing Key Lab of Mobile Communications Technology, Chongqing University of Posts
More informationOFAR-CM: Efficient Dragonfly Networks with Simple Congestion Management
Marina Garcia 22 August 2013 OFAR-CM: Efficient Dragonfly Networks with Simple Congestion Management M. Garcia, E. Vallejo, R. Beivide, M. Valero and G. Rodríguez Document number OFAR-CM: Efficient Dragonfly
More informationEcient Implementation of Sorting Algorithms on Asynchronous Distributed-Memory Machines
Ecient Implementation of Sorting Algorithms on Asynchronous Distributed-Memory Machines B. B. Zhou, R. P. Brent and A. Tridgell Computer Sciences Laboratory The Australian National University Canberra,
More informationTwo Multicasting Schemes for Irregular 3D Mesh-based Bufferless NoCs
MATEC Web of Conferences 22, 01028 ( 2015) DOI: 10.1051/ matecconf/ 20152201028 C Owned by the authors, published by EDP Sciences, 2015 Two Multicasting Schemes for Irregular 3D Mesh-based Bufferless NoCs
More informationA COMPARISON OF MESHES WITH STATIC BUSES AND HALF-DUPLEX WRAP-AROUNDS. and. and
Parallel Processing Letters c World Scientific Publishing Company A COMPARISON OF MESHES WITH STATIC BUSES AND HALF-DUPLEX WRAP-AROUNDS DANNY KRIZANC Department of Computer Science, University of Rochester
More informationERA: An Efficient Routing Algorithm for Power, Throughput and Latency in Network-on-Chips
: An Efficient Routing Algorithm for Power, Throughput and Latency in Network-on-Chips Varsha Sharma, Rekha Agarwal Manoj S. Gaur, Vijay Laxmi, and Vineetha V. Computer Engineering Department, Malaviya
More informationANALYSIS AND IMPROVEMENT OF VALIANT ROUTING IN LOW- DIAMETER NETWORKS
ANALYSIS AND IMPROVEMENT OF VALIANT ROUTING IN LOW- DIAMETER NETWORKS Mariano Benito Pablo Fuentes Enrique Vallejo Ramón Beivide With support from: 4th IEEE International Workshop of High-Perfomance Interconnection
More informationImproving Range Query Performance on Historic Web Page Data
Improving Range Query Performance on Historic Web Page Data Geng LI Lab of Computer Networks and Distributed Systems, Peking University Beijing, China ligeng@net.pku.edu.cn Bo Peng Lab of Computer Networks
More informationDiagonal Principal Component Analysis for Face Recognition
Diagonal Principal Component nalysis for Face Recognition Daoqiang Zhang,2, Zhi-Hua Zhou * and Songcan Chen 2 National Laboratory for Novel Software echnology Nanjing University, Nanjing 20093, China 2
More informationPartitioning Methods for Multicast in Bufferless 3D Network on Chip
Partitioning Methods for Multicast in Bufferless 3D Network on Chip Chaoyun Yao 1(B), Chaochao Feng 1, Minxuan Zhang 1, Wei Guo 1, Shouzhong Zhu 1, and Shaojun Wei 2 1 College of Computer, National University
More informationDesign and Implementation of Multistage Interconnection Networks for SoC Networks
International Journal of Computer Science, Engineering and Information Technology (IJCSEIT), Vol.2, No.5, October 212 Design and Implementation of Multistage Interconnection Networks for SoC Networks Mahsa
More informationPDA-HyPAR: Path-Diversity-Aware Hybrid Planar Adaptive Routing Algorithm for 3D NoCs
PDA-HyPAR: Path-Diversity-Aware Hybrid Planar Adaptive Routing Algorithm for 3D NoCs Jindun Dai *1,2, Renjie Li 2, Xin Jiang 3, Takahiro Watanabe 2 1 Department of Electrical Engineering, Shanghai Jiao
More informationNetwork Dilation: A Strategy for Building Families of Parallel Processing Architectures Behrooz Parhami
Network Dilation: A Strategy for Building Families of Parallel Processing Architectures Behrooz Parhami Dept. Electrical & Computer Eng. Univ. of California, Santa Barbara Parallel Computer Architecture
More informationRecall: The Routing problem: Local decisions. Recall: Multidimensional Meshes and Tori. Properties of Routing Algorithms
CS252 Graduate Computer Architecture Lecture 16 Multiprocessor Networks (con t) March 14 th, 212 John Kubiatowicz Electrical Engineering and Computer Sciences University of California, Berkeley http://www.eecs.berkeley.edu/~kubitron/cs252
More informationSanaz Azampanah Ahmad Khademzadeh Nader Bagherzadeh Majid Janidarmian Reza Shojaee
Sanaz Azampanah Ahmad Khademzadeh Nader Bagherzadeh Majid Janidarmian Reza Shojaee Application-Specific Routing Algorithm Selection Function Look-Ahead Traffic-aware Execution (LATEX) Algorithm Experimental
More informationGeometric transformations assign a point to a point, so it is a point valued function of points. Geometric transformation may destroy the equation
Geometric transformations assign a point to a point, so it is a point valued function of points. Geometric transformation may destroy the equation and the type of an object. Even simple scaling turns a
More informationLecture 9: Group Communication Operations. Shantanu Dutt ECE Dept. UIC
Lecture 9: Group Communication Operations Shantanu Dutt ECE Dept. UIC Acknowledgement Adapted from Chapter 4 slides of the text, by A. Grama w/ a few changes, augmentations and corrections Topic Overview
More information548 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 30, NO. 4, APRIL 2011
548 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 30, NO. 4, APRIL 2011 Extending the Effective Throughput of NoCs with Distributed Shared-Buffer Routers Rohit Sunkam
More informationCapacity of Grid-Oriented Wireless Mesh Networks
Capacity of Grid-Oriented Wireless Mesh Networks Nadeem Akhtar and Klaus Moessner Centre for Communication Systems Research University of Surrey Guildford, GU2 7H, UK Email: n.akhtar@surrey.ac.uk, k.moessner@surrey.ac.uk
More informationTesting Isomorphism of Strongly Regular Graphs
Spectral Graph Theory Lecture 9 Testing Isomorphism of Strongly Regular Graphs Daniel A. Spielman September 26, 2018 9.1 Introduction In the last lecture we saw how to test isomorphism of graphs in which
More informationAn Empirical Study of Performance Benefits of Network Coding in Multihop Wireless Networks
An Empirical Study of Performance Benefits of Network Coding in Multihop Wireless Networks Dimitrios Koutsonikolas, Y. Charlie Hu, Chih-Chun Wang School of Electrical and Computer Engineering, Purdue University,
More informationA novel look-ahead routing algorithm based on graph theory for triplet-based network-on-chip router
LETTER IEICE Electronics Express, Vol.15, No.8, 1 12 A novel look-ahead routing algorithm based on graph theory for triplet-based network-on-chip router Xu Chen, Feng Shi a), Fei Yin, and Xiaojun Wang
More informationComplementary Graph Coloring
International Journal of Computer (IJC) ISSN 2307-4523 (Print & Online) Global Society of Scientific Research and Researchers http://ijcjournal.org/ Complementary Graph Coloring Mohamed Al-Ibrahim a*,
More informationDesign of a System-on-Chip Switched Network and its Design Support Λ
Design of a System-on-Chip Switched Network and its Design Support Λ Daniel Wiklund y, Dake Liu Dept. of Electrical Engineering Linköping University S-581 83 Linköping, Sweden Abstract As the degree of
More information4. Networks. in parallel computers. Advances in Computer Architecture
4. Networks in parallel computers Advances in Computer Architecture System architectures for parallel computers Control organization Single Instruction stream Multiple Data stream (SIMD) All processors
More informationQuantification of Capacity and Transmission Delay for Mobile Ad Hoc Networks (MANET)
Quantification of Capacity and Transmission Delay for Mobile Ad Hoc Networks (MANET) 1 Syed S. Rizvi, 2 Aasia Riasat, and 3 Khaled M. Elleithy 1, 3 Computer Science and Engineering Department, University
More informationDLABS: a Dual-Lane Buffer-Sharing Router Architecture for Networks on Chip
DLABS: a Dual-Lane Buffer-Sharing Router Architecture for Networks on Chip Anh T. Tran and Bevan M. Baas Department of Electrical and Computer Engineering University of California - Davis, USA {anhtr,
More informationConstant Queue Routing on a Mesh
Constant Queue Routing on a Mesh Sanguthevar Rajasekaran Richard Overholt Dept. of Computer and Information Science Univ. of Pennsylvania, Philadelphia, PA 19104 ABSTRACT Packet routing is an important
More information6 Randomized rounding of semidefinite programs
6 Randomized rounding of semidefinite programs We now turn to a new tool which gives substantially improved performance guarantees for some problems We now show how nonlinear programming relaxations can
More informationFT-Z-OE: A Fault Tolerant and Low Overhead Routing Algorithm on TSV-based 3D Network on Chip Links
FT-Z-OE: A Fault Tolerant and Low Overhead Routing Algorithm on TSV-based 3D Network on Chip Links Hoda Naghibi Jouybari College of Electrical Engineering, Iran University of Science and Technology, Tehran,
More informationA Combined BIT and TIMESTAMP Algorithm for. the List Update Problem. Susanne Albers, Bernhard von Stengel, Ralph Werchner
A Combined BIT and TIMESTAMP Algorithm for the List Update Problem Susanne Albers, Bernhard von Stengel, Ralph Werchner International Computer Science Institute, 1947 Center Street, Berkeley, CA 94704,
More informationCommunication Networks I December 4, 2001 Agenda Graph theory notation Trees Shortest path algorithms Distributed, asynchronous algorithms Page 1
Communication Networks I December, Agenda Graph theory notation Trees Shortest path algorithms Distributed, asynchronous algorithms Page Communication Networks I December, Notation G = (V,E) denotes a
More information