Thermal-aware Steiner Routing for 3D Stacked ICs

Size: px
Start display at page:

Download "Thermal-aware Steiner Routing for 3D Stacked ICs"

Transcription

1 Thermal-aware Steiner Routing for 3D Stacked ICs Mohit Pathak and Sung Kyu Lim School of Electrical and Computer Engineering Georgia Institute of Technology {mohitp, Abstract In this paper, we present the first work on the Steiner routing for 3D stacked ICs. In the 3D Steiner routing problem, the pins are located in multiple device layers, which makes it more general than its D counterpart. Our algorithm consists of two steps: tree construction and tree refinement. Our tree construction algorithm builds a delay-oriented Steiner tree under a given thermal profile. We show that thermal-aware 3D tree construction involves the minimization of two-variable Elmore delay function. In our tree refinement algorithm, we reposition the through-vias while preserving the original routing topology for further thermal optimization under performance constraint. We employ a novel scheme to relax the initial NLP formulation to ILP and consider all through-vias from all nets simultaneously. Our related experiments show the effectiveness of our proposed solutions. I. INTRODUCTION The emerging 3D die stacking technology enables the integration of multiple planar integrated circuits in the vertical direction with high-density interconnect through-silicon-vias. The pitch of these vertical die-to-die vias can be as short as a few microns, thus substantially reduce the communication latency, in particular, the global interconnect. One of the major concerns of 3D ICs is thermal dissipation. Stacking of different device layers combined with the low thermal conductivity of the bonding material may result in excessively high on-chip temperature. The location of through vias in a Steiner tree has high impact on the delay at the sink nodes of the tree. In addition, effective placement of through vias can play a significant role in lowering the temperature of the chip since these vias establish thermal paths to the heat sink. Thus, an approach which tries to optimize the location of through vias such that both performance and thermal issues are addressed is important. The contributions of this paper are as follows: We formulate and solve the new 3D IC Steiner Routing (3DSR) problem for multi-pin net routing in 3D stacked ICs. Our algorithm considers all die simultaneously and constructs performance-oriented Steiner trees under given thermal profile while determining the locations for the related through-vias. We show that thermal-aware 3D tree construction involves the minimization of two-variable Elmore delay function. We formulate and solve the new Through-via Relocation problem for thermal optimization in 3D stacked ICs. We further optimize the thermal objective by relocating This material is based upon work supported by the National Science Foundation under CAREER Grant No. CCF and MARCO CS. through vias in each tree under a given timing constraint while preserving the original performance-oriented routing topology. We employ a novel scheme to relax the initial NLP formulation to ILP and consider all throughvias from all nets simultaneously. A previous work [1] talks about Steiner trees in different routing layers, but their work does not consider pins on different device layers. Thus, this work cannot be used for 3D ICs without any significant modification. Another recent work on thermal via insertion [] is targeting thermal optimization during routing for 3D ICs. However, the thermal vias are additional vias that do not carry any signal and may cause congestion problem. In our algorithm, we relocate the existing through-vias for thermal optimization. Lastly, some analytical results on the location of through vias were recently presented in [3]. However, this work focused on two-pin nets and did not perform Steiner tree construction. II. PRELIMINARIES A. Problem Formulation We assume the following is given: (i) a set of m nets {n 0, n 1,, n m 1 }, where each net is represented by a list of pins n i = {p 0,p 1,,p k 1 } with p 0 as the driver, (ii) a 3D routing grid G that represents the routing resource in a given 3D stacked IC, where each grid node represents a routing region and each edge denotes the adjacency among the regions, (iii) each x/y grid edge is associated with horizontal/vertical wire capacity and z with via capacity, (iv) the location of each pin p(x, y, z), and (v) a 3D thermal grid Z with temperature at all grid nodes. A 3D Steiner Tree is defined to be a set of D (= planar) Steiner trees connected by through vias. The goal of the Thermal-aware 3D Steiner Routing problem is to generate a 3D Steiner tree for each net while satisfying the capacity constraints specified in the underlying G. 1 The objective is to minimize (i) the maximum temperature among all nodes in the thermal grid, and (ii) the maximum Elmore delay among all pins in each tree, where the delay is computed based on the given thermal distribution. This paper uses the recent temperature dependent interconnect delay model proposed in [4]. The line resistance per unit length can be calculated as: r(x) =r 0 (1 + β T (x)), where r 0 is the resistance at 0 C, β is the temperature co-efficient of resistance, and T (x) denotes the temperature at location x. 1 Note that we do not explicitly minimize routing/via congestion but implicitly address it by satisfying the capacity constraints. Each edge capacity can be set independently to reflect the availability of the routing resource /07/$ IEEE 05

2 B. Overview of the Approach The goal of our 3D Steiner routing is to address the following two important design objectives: performance and thermal. We tackle this problem in two steps: 1) Construction step: we first perform thermal analysis from the given 3D placement. We then construct a performance-oriented routing tree for each net under the non-uniform thermal profile. We recompute the temperature values to reflect the routing. ) Refinement step: We optimize the thermal objective by relocating through vias in each tree under given timing constraints. Note that the through via relocation preserves the original tree topology while optimizing the thermal objective. We recompute the timing slacks and temperature values to reflect the relocation. We repeat this step until no more thermal improvement is possible. III. 3D STEINER TREE CONSTRUCTION A. Overview of the Algorithm The basic approach of our 3D Steiner tree construction algorithm is similar to SERT [5], where an existing tree is incrementally grown by connecting a new sink pin to it. SERT starts with the driver pin and selects the sink pin that minimizes Elmore delay when connected to the driver. This process continues until all sink pins are connected to the tree that is being grown. The goal is to minimize the maximum Elmore delay among all sink pins of the tree. Here the biggest challenge is to compute the point on the tree where the new pin connects to. There are three major differences between SERT and our work: (i) all the pins in SERT are located in the same die, whereas our 3D algorithm handles the pins located in different die. This 3D case requires the usage of through vias, and the location of these vias has huge impact on the topology of the tree as well as the sink pin delay. (ii) the delay optimization in SERT is based on single variable, whereas our algorithm deals with two-variable function optimization, (iii) our interconnect delay is computed based on the given thermal profile. A pseudocode of our algorithm is shown in Figure 1. Our routing algorithm consists of two phases: construction (line 1-13) and refinement (14-15). We construct 3D Steiner trees during the construction phase while ignoring congestion and then alleviate congestion by rip-up-and-reroute during the refinement phase. Given a net n, our 3D Steiner tree T n initially contains the driver pin (line ). We store the remaining pins of n in Q n (line 3). We then examine all pinedge pair (line 5-6) and compute the impact of connecting the pin to the edge on Elmore delay under the given thermal profile Z, where the pin is from Q n and the edge is from T n. Specifically, the delay impact is calculated based on the increase in temperature-dependent Elmore delay among all pins currently in T n (line 9-10). This requires the computation of connection point x and the through via location y (line 7-8) (to be discussed in Section III-B). Next, we select the pin-edge pair that results in the minimum max-delay increase (line 11) Thermal-aware 3D Steiner Routing Algorithm input: netlist NL, routing graph R, thermal profile Z output: 3D Steiner tree for each net 1. for (each net n NL). T n = p 0 (n); 3. Q n = setofpinsofn except p 0 ; 4. while (Q n ) 5. for (each pin a Q n ) 6. for (each edge e T n ) 7. x = connection point for a e; 8. y = through via location on e(x, a); 9. update d(p) for all p T n ; 10. X(a, e) = max{d(p) p T n a}; 11. (a min,e min ) = pin+edge pair with min X; 1. T n = T n e min ; 13. remove a min from Q n ; 14. for (each non-timing critical T n violating capacity) 15. rip-up-and-reroute T n under Z; Fig. 1. Pseudocode of thermal-aware 3D Steiner routing algorithm. In case e and a are located in different planes, e(r, a) will contain a through via. and add the pin to T n (line 1-13). Our rip-up-and-reroute is done only on the less timing critical nets, i.e., the nets with smaller max-delay values (line 14-15). The complexity of our algorithm is O(nm 4 ), where n is the total number of nets and m is the maximum number of pins in any net. The O(m 4 ) term is based on the while-loop, two for-loops, and Elmore delay computation for all sinks (line 9). Since m is bounded by a constant in VLSI circuits typically, our algorithm runs in linear time in practice. B. Connection Point and Via Location In this section we discuss how we can efficiently construct 3D Steiner trees. Our discussion is based on two-die case for the simplicity of the discussion, but our algorithm is applicable to multiple die without any modification. Let r 1 and c 1 denote the unit length resistance and capacitance values for die 1. r and c are similarly defined for die. The capacitance and resistance of a through via connecting the two die are denoted C via and R via. Given a pin p and an edge e T,theconnection point is defined as the point on e to which p is connected. The connection point computation for D case has been presented in [5], where the Elmore delay change on an entire tree caused by adding a new pin to the tree is a function of a single variable x, the location of connection point. We extend this work by introducing a second variable y that represents the location of through via. We then optimize the two-variable delay function and determine the location of connection point (= x) and through via (= y) for 3D case. ReferringtoFigure,e(p, c) and e(q, b) are edges on T. p is the parent node of e(p, c), and q is the parent node of e(q, b). a is the new pin that needs to connect to e(p, c). Edge e(p, c) lies on die 1 with interconnect parasitics r 1 and c 1, whereas a lies on die with interconnect parasitics r and c. 06

3 die 1 die p 0 g (case 4) b (case 3) connection point p x d c (case ) q y, through via dz a (case 1) Fig.. Illustration of how pin a connects to e(p, c) T. e(y, a) is routed in the bottom die, where as all other edges are routed in the top die. x is the location of connection point on e(p, c). y is the location of the through via inserted on e(x, a). e(q, b) is another branch in T. g is another sink that is not a part of the subtree rooted at p. d is the shortest distance point on e(p, c) from a, andδz is the distance between the through via and a. The Elmore delay of T a is a function of both x and y. d is the point on e(p, c) that is of the shortest distance to a. x is the connection point, and y is the location of through via. Our first goal is to derive Elmore delay equations that are the functions of δx and δy. In what follows, we let δx denote the distance between node p and node x, δq, δa, δb, δc, and δd are used similarly. δy is the distance between x and y, and δz is the distance between y and a. LetT b denote the subtree rooted at node b. In order to compute the Elmore delay change on all sink pins in T caused by adding a to T, we consider the following four cases: (i) delay at the node to be added (= node a), (ii) delay at the subtree located after the connection point (= node c), (iii) delay at the subtree that could be located either before or after the connection point (= node b), (iv) delay of the nodes not in T p. Figure illustrates these four cases: Case 1: We handle the delay at node a. In this case, d(a) is a sum of four functions. f 1 is the delay from p 0 to p. f is the delay from p to x without considering e(q, b) and T b. f 3 is the delay from p to x when considering e(q, b) and T b. f 4 is the delay from x to a. Thus we have d(a) =f 1 +f +f 3 +f 4, where f 1 = K 0 + K 1 {c 1 δy + C via + c δz + c 1 δc + C c + C b +c 1 (δb ( δq)} ) δx f = r 1 δx c 1 + c 1δy + C via + c δz + c 1 (δc δx)+c c { r1 δx(c 1 (δb δq)+c b ), if δx δq f 3 = r 1 δq(c ( 1 (δb δq)+c b ), ) if δx δq δy f 4 = r 1 δy c 1 + C via + c δz δz +r c + R via ( Cvia + c δz ) where δz = δa (δx + δy), K 0 is the sum of resistance and capacitance products along p 0 p path, K 1 is the sum of resistance along p 0 p path, and C i is the capacitance of the sub-tree rooted at i th node. Case : The new delay at node c is given by d(c) =f 1 + f + f 3 + f 4, where f 3 = r 1 δq(c 1 (δb δq)+c b ) { } f 4 c1 (δc δx) = r 1 (δc δx) + C c Case 3: The new delay at node b is given by d(b) =f 1 + f + f 3, where { f r1 δx(c = 1 δy + C via + c δz), if δx δq r 1 δq(c 1 δy + C via + c δz), if δx δq { } f 3 δq = r 1 δq c 1 + c 1(δb δq)+c b + C c Case 4: For all other nodes not in T p, the added delay is a function of the added capacitance, which is linear in terms of x and y and given by C = c 1 (δx + δy)+c via + c δz In case of connecting two pins separated by non-zero intermediate layers, we use stacked vias so that no routing in these intermediate layers are used. C. Optimization of Delay Equations We discuss how the delay equations derived in the previous section can be optimized to give a small set of possible optimum location points. We first consider the conditions needed to determine the minimum of a general quadratic function of two variables. We later show how the delay equations derived in the previous section can be optimized using these conditions. In general, for a quadratic function of two variables f(δx, δy), the maximum or the minimum of the function depends upon the values of F δx and the determinant of the Hessian matrix H 1 : F F δx δx δy F δx δy F δy where F is the delay function under consideration. The above values for a quadratic function of two variables are always constant. For our case we have 0 δx δd and 0 δy δa and thus consider the following cases: Case A: If F δx 0 and H 1 0, the minimum can be found at the boundary points, i.e., δx =0or δx = δd and δy =0or δy = δa. Thus we have four points to look for the minimum. Case B: If F 0 F 0 and H 1 =0,wehavea δx δy concave function, and the minimum lies on the boundary points. Case C: If F δx = 0 F δy = 0 and H 1 = 0, then f(δx, δy) is a linear function of δx and δy, and the minimum lies at the boundary points. Case D: If H 1 < 0, the critical point found is a saddle point. and the minimum lies at the boundary, although a different set of boundary points may need to be chosen. The set of boundary points may be found by setting δx = 0 or δx = δd and minimizing f(δx, δy) as a function of δy or setting δy = 0 or δy = δa and minimizing f(δx, δy) as a function of δx. We show that the Elmore delay at each sink node in T can be optimized by considering any of the 4 cases shown above. 07

4 p3 (L3) v1 driver (L1) R v1 v R v3 v3 p4 (L4) p1 (L1) p (L) goal is then to find a new location for each movable via in each Steiner tree so that the maximum temperature among all nodes in the thermal grid is minimized while the timing and routing resource capacity constraints are not violated. Note that we preserve the original topology of the Steiner trees. All that is changing is the location of through vias for thermal optimization, where the movable through vias are moved into thermal hotspots under timing constraint to reduce the thermal resistivity. Fig. 3. Illustration of movable range. The figure is a top view of a multilayer grid. The driver is represented by a triangle, the sinks are represented by dots, and the vias are represented by square boxes. The driver is located in layer 1 (= L1). The movable range of each via is represented by the dotted lines. Thus, there is a fixed number of points (x, y) for which the Elmore delay can be minimized. IV. THERMAL-AWARE THROUGH VIA RELOCATION A. Overview of the Algorithm The motivation behind our through via relocation is to move as many through vias into thermal hotspots as possible while preserving the original tree topology we obtain during our construction step. The objective is to minimize the maximum temperature on the chip while not violating the timing and routing resource capacity constraints. Note that the problem of optimizing the number of through vias and temperature is non-linear in nature since we need to solve the equation T = PR, where T is the temperature matrix, P is the power vector, and R is the thermal resistance matrix such that R 1 a, where a is the number of through vias. General solutions available for solving non-linear problems cannot be applied directly for large problem sizes. In this section we propose an innovative problem formulation which helps us effectively overcome the non-linear nature of this problem. We propose a relaxed ILP based formulation in which the number of integer variables are kept at minimum. Our ILP-based method optimizes through vias on all nets simultaneously, which is more rigorous than a sequential approach that optimizes the nets one by one. In addition, we target all hotspots simultaneously instead of iteratively targeting one by one. Our experimental results in Section V demonstrate the advantage of this approach. B. Movable Range We start with the set of 3D Steiner trees we obtain from the construction step. All pins in each tree T i are associated with timing constraint that denotes the required arrival time in terms of Elmore delay. Each through via v T i is associated with the movable range, denoted R via, that denotes the range of new location along its route to the connection point so that the timing constraints are not violated. An illustration is shown in Figure 3. In case the movable range of a through via v is a single point, v is non-movable; otherwise it is movable. Our C. Fast Thermal Analysis To optimize the location of through vias, we use a fast thermal analysis model described in []. An illustration is shown in Figure 4. A tile structure is imposed on the surface, where each tile is approximated as a resistive chain as shown in Figure 4. In 3D ICs, the heat sinks are attached to the bottom or top side of the 3D IC stack, with other boundaries being adiabatic. Thus, the dominant heat flow is in the vertical direction. For the purpose of optimization we view each tile stack in Figure 4 as an independent thermal resistive chain. In our model we do not consider effects of lateral thermal dissipation, this can be justified since the thermal conductivity of epoxy material used to join the die is much lower than that of silicon itself. This essentially means that it is difficult for heat to dissipate in the vertical direction as compared to horizontal direction. Our fast thermal model which optimizes vertical heat flow helps in reducing the overall chip temperature. To obtain effective temperature reduction the full resistive thermal model [] (considering lateral resistances) is run twice, once before and once after calling our via relocation routine. The values in temperature reduction reported in our experiments are based on the full resistive model. D. Non-linear Programming In the following sections we first show how the thermal relocation problem can be formulated as a NLP (non linear programming) formulation. We then show how the NLP may be converted to an ILP problem formulation. The ILP formulation adds a large number of integer variables in the problem thus making it difficult to solve. We finally propose our reduced ILP problem formulation (which is a relaxed ILP formulation) which reduces the number of integer variables significantly. The NLP based formulation is defined as follows (Table I explains the notations we use in the formulation): Minimize Σα T (1) The authors in [] suggest to add thermal-vias for further thermal optimization. Our method, which simply relocates exiting through-vias, can be used in conjunction with these approaches for more rigorous thermal optimization. 08

5 TABLE I VARIABLES AND CONSTANTS USED IN OUR ILP FORMULATION T temperature at tile (i, j, k) α temperature-related weight for tile (i, j, k), which is computed by T org org /Tmax, wheret org denotes the original temperature for (i, j, k) before the optimization, and Tmax org is the maximum value among all T org values. constant vmax maximum number of through vias each tile can accommodate. constant β n becomes 1 when a through via is moved to tile (i, j, k) so that the total number of movable through vias at (i, j, k) changes from n 1 to n. V org original number of vias (= movable + non-movable) in tile (i, j, k) before the optimization. constant V m number of through vias in tile (i, j, k), which is just m. number of vias (= movable + non-movable) in tile (n) G cur G max R m R novia α γ n δ m Subject to: (i, j, k) after the optimization. becomes 1 if a through via in net n is moved from tile (i, j, k) to (x, y, k); 0 otherwise. current usage of wires and through vias for a grid (i, j, k) in 3D routing grid G. capacity constraint of wires and through vias for a grid (i, j, k) in 3D routing grid G. constant thermal resistance of tile i, j, k having m vias. thermal resistance of tile i, j, k with no through vias. constant thermal resistance of one through via. constant becomes 1 if the number of through vias in the tile (i, j, k) is n. it is the temperature difference between tile i, j, k and i, j, k 1 if the number of through vias in tile i, j, k is m. δ m =(P i,j,n + + P ) R m constant T = T 1 +(P i,j,n + + P ) R opt V =ΣM x,y,k R opt = α α + R novia () (3) = V org + V (4) v,w,k (n) ΣM (n), n (5) G cur G max, (i, j, k) (6) Σ (n) =1, n (7) (n) {0, 1} (8) Equation (1) is our objective function, where we minimize the weighted sum of temperature values at all thermal tiles (thermal tiles and thermal mesh nodes are used interchangeably in this paper). The weights α are computed based on the initial temperature measured before the through via relocation. In this case, the higher the α is, the lower the T we desire. Equation () gives the temperature at each tile based on our fast thermal model illustrated in Figure 4. Equation(3) shows the variation of thermal resistance based on the number of through vias in a tile. This is obtained from solving the following parallel resistance relation: 1/R opt =1/Rnovia opt /α. Equation (4) is from the definition of V. Equation + (5) states that the total change in the number of through vias for tile (i, j, k) is the total number of through vias moved into R lateral Tile stack array Sing tile stack Tile stack analysis Fig. 4. Thermal model used, where R b denotes the thermal resistance to the heatsink. The convention used in our ILP formulation is that adding through vias at point (= tile) i reduces the value of R i. (i, j, k) minus the total number of through vias moved out of (i, j, k). Equation (6) ensures that the routing resource (= wires and through vias) capacity constraints are satisfied. Equation (8) states that (n) are binary integer variables. Equation (7) ensures that only one via per net is moved. Note that this restriction is unavoidable since the movable range of a through via is computed independent of other through vias. Once a through via is moved, it affects the timing constraint, movability, and the range of all other through vias in the same net. The ultimate way to perform via relocation is to consider all vias from all nets simultaneously, which is computationally expensive. Our method that considers one via from all nets simultaneously is better than a sequential approach that considers all vias from a single net. We see that the original via relocation problem is nonlinear in nature due the inverse relation between thermal resistance and number of through vias in a tile (= R opt P 5 P 4 P 3 P P R b R 5 R 4 R 3 R R 1 vs ). In the next section we propose a simplified integer linear programming formulation which overcomes the nonlinear problem formulation. E. Integer Linear Programming From the NLP formulation we see that the number of through vias in each tile is an integer variable. Our ILP based formulation differs from the NLP in the following way: we replace Equations () and (3) with the following: T = T 1 + γ 0 δ γ vmax δ vmax (9) 1 γ 1 +γ + + vmax γ vmax = (10) Σγ m =1 (i, j, k) (11) γ n {0, 1} (1) Equation (9) gives the new equation for calculating the temperature. In this equation γ n are the new integer variables that are added whereas δ (refer to Table I) are the constants which are calculated for each possible value of the number of vias in a tile. Equation (10) equates the γ n variable with the optimum number of vias in a tile. Equation (11) ensures that for each tile only one γ n takes a value 1. Finally equation (1) ensures that γ n is either 0 or 1. All other equations are same to our previous NLP formulation. 09

6 We see that for this new ILP formulation a large number of new integer variables γ n are added. The number of these new integer variables is proportional to the number of tiles in our thermal grid and the number of vias possible in each grid. Adding such large number of integer variables makes the problem harder to solve. In our next section we propose our reduced ILP formulation which removes the need of integer γ n variables, thus reducing the large number of integer variables required. F. Reduced Integer Linear Programming Our ILP-based via relocation is formulated as follows (Table I explains the notations we use in the formulation): Minimize Subject to: Σα T (13) T = T 1 + δ 0 β 1 (δ 0 δ 1 ) (14) β vmax (δ vmax 1 δ vmax ) β 1 + β + + β vmax = (15) = V org + V (16) V =ΣM x,y,k v,w,k (n) ΣM (n), n (17) G cur G max, (i, j, k) (18) 0 β m 1 (19) (n) {0, 1} (0) Σ (n) =1, n (1) Equation (14) is a new way to compute the temperature at the routing tile (i, j, k) compared with Equation (9) (interested readers are referred to our technical report [6] for more details). Equation (15) states that the total number of through vias in a tile (i, j, k), which is specified by the β i values (1 i vmax), should equal to. Equation (19) restricts the range of values β n can take. All other equations are same as discussed in NLP formulation. To overcome the restriction of moving just one via per net (Equation 1) we iterate the entire relaxed ILP multiple times so that multiple through vias from a single net is given a chance to relocate in an iterative fashion. The number of integer variables (= (n)) can be huge if the number of nets is larger or the thermal grid is finer. This makes our ILP formulation less desirable for a large problem instance. We overcome this limitation by relaxing these integer variables and solving the resulting LP problem. We round the continuous variables based on a threshold value λ. We additionally perform a gainbased gradient search to obtain the λ value that generates the best quality results. V. EXPERIMENTAL RESULTS We implemented our algorithms in C++/STL and ran our experiments on Linux PC running at GHz. We tested our algorithms with three sets of benchmark: ISCAS89, ITC99, TABLE III TEMPERATURE COMPARISON BETWEEN GREEDY AND OUR SIMULTANEOUS THROUGH VIA RELOCATION APPROACH. T org AND Topt RESPECTIVELY DENOTE THE TEMPERATURE BEFORE AND AFTER THERMAL OPTIMIZATION. greedy ILP multiple ILP ckt T org T opt cpu T opt cpu T opt cpu iter s b14 opt s s b0 opt b1 opt b opt ibm ibm ibm ibm ibm RATIO and ISPD98. Our experiment is based on 4-die stacked IC, where the top two and bottom two dies are bonded in face-toface and the middle two in back-to-back. We assume all 4 dies have different unit-length resistance and capacitance values [3]: r 1 = 86Ω/mm and c 1 = 396fF/mm, r = 175Ω/mm and c = 100fF/mm, r 3 = 74Ω/mm and c 3 = 79fF/mm, and r 4 = 154Ω/mm and c 4 = 10fF/mm for unit-length interconnect for the four dies. R via = 106Ω/mm and C via = 140fF/mm values are used for the face-to-face vias, and R via = 53Ω/mm and C via = 80fF/mm values are used for the back-to-back vias. All runtime reported are in seconds. We use [7] to analyze the temperature profile from a given 3D placement [8]. To compare our results we implemented a wirelength driven 3D maze router. During tree construction in our 3D maze router, a new pin is added to an already existing tree such that the added interconnect is shortest possible. We also implemented a 3D version of Steiner arborescence. We first converted the 3D problem to a D problem by mapping the corresponding pin locations to the D plane, then we implemented a D Steiner arborescence used in [9]. All the wire routing was done on the same plane as that of the driver pin, finally all pins which were not located on the same plane as the driver pin were connected using stacked vias. We report the total wirelength, number of vias, the maximum path-delay among all sinks and processor time in seconds for each circuit. A. 3D Steiner Routing Results Table II shows a comparison among 3D maze, 3D A-tree, and our 3D Steiner routing. We observe that our 3D Steiner router achieves 49% average performance improvement over 3D maze routing and about 9% improvement over 3D A-tree. However, we see an increase in 1% wirelength and 7% vias over 3D maze router. Also our 3D Steiner router works slower than 3D A-tree, which can be attributed to a large number of high fanout nets present in the design. Our 3D Steiner tree generation takes O(p 4 ) time to run, where p is the number of pins, and does not scale well with increasing fanout size. 10

7 TABLE II 3D ROUTING FOR ALL TYPES OF NETS. WECOMPARE3D MAZE ROUTER AND OUR 3D STEINER ROUTER. THE RUNTIME IS IN SECONDS. THE VIA BOUND COLUMN SHOWS A LOWER BOUND ON THE VIA COUNT. 3D Maze Routing 3D A-tree 3D Steiner Routing ckt size v-bound wire delay via cpu wire delay via cpu wire delay via cpu s b14 opt s s b0 opt b1 opt b opt ibm ibm ibm ibm ibm RATIO The v-bound column in Table II shows the lower bound of the through via usage for each circuit. For the two-die twopin nets, the lower bound is 1. For the multi-die, multi-pin nets, we use the minimum possible via to connect all pins in the dies. We see that the number of vias needed by the 3D maze or 3D Steiner router is about twice as many as the minimum required. The number of vias used in 3D A-tree algorithm is the highest. However, this should not be used as a comparison metric, since we constructed the 3D A-tree to primarily compare performance results. If we look at circuit ibm13, we observe that 3D-Atree performs better than our 3D Steiner in terms of performance. This was due to congestion that caused larger number of nets to be ripped and re-routed in our 3D Steiner algorithm. In some cases, we observed that 3D A-tree caused lower congestion thus required fewer re-routing. B. Through Via Relocation Results To compare the results of our through via relocation, we implemented a fast greedy algorithm, which tries to move through vias into thermal hotspots in an iterative fashion. We choose a single hotspot and relocate the movable vias into it. We then repeat this process for the next hotspot until no more temperature improvement can be obtained. Table III shows the reduction in maximum temperature by our approach and the greedy algorithm (the accurate thermal model was used to calculate final temperature). We observe that our simultaneous approach achieves consistent improvement over the greedy approach at the expense of additional runtime. The runtime of our ILP-based method ranges from 70 to 800 seconds. The runtime for our biggest circuit, ibm17, that contains 184K nets is around 800 seconds. This shows that our method scales well with the complexity of the circuit while maintaining high quality solutions. In Table III we also show the impact of multiple iterations on our relaxed ILP. Our entire through via relocation algorithm can be repeated multiple times since the temperature values change from each iteration. However, we observe that the majority of the improvement is achieved from the first iteration and that the algorithm converges quickly without any major improvement during the subsequent iterations. VI. CONCLUSIONS This paper studied two new problems that are important for 3D stacked IC technology: 3D Steiner tree construction and through via relocation. Our routing algorithm is based on a constructive method, where a 3D Steiner tree is grown by connecting a new pin to the existing tree. We derived twovariable delay equations and optimized them to compute the location of the through vias under given thermal profile. For through via relocation, we devised an innovative technique which helps avoid the non-linear optimization required for temperature optimization. Our formulation can handle large number of vias simultaneously for an effective temperature optimization. REFERENCES [1] M. Yilidiz and P.H.Madded, Preferred Direction Steiner Trees, IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, 00. [] J. Cong and Y. Zhang, Thermal Via Planning for 3-D ICs, in Proc. IEEE Int. Conf. on Computer-Aided Design, 005. [3] V. Pavlidis and E. Friedman, Interconnect Delay Minimization through Interlayer Via Placement in 3-D ICs, in Proc. Great Lakes Symposum on VLSI, 005. [4] A. Ajami, K. Banerjee, and M. Pedram, Effects of non-uniform substrate temperature on the clock signal integrity in high performance designs, in Proc. of IEEE Custom Integrated Circuits Conference, May 001, pp. pp [5] K. Boese, A. Kahng, B. McCoyy, and G. Robins, Near-Optimal Critical Sink Routing Tree Constructions, IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, [6] M. Pathak and S. K. Lim, Thermal-aware Steiner Routing for 3D Stacked ICs, Georgia Institute of Technology, Tech. Rep. GIT-CERCS-07-18, 007. [7] T.-Y. Wang and C. C.-P. Chen, 3-d thermal-adi: A linear-time chip level transient thermal simulator, IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, pp , 00. [8] B. Goplen and S. Sapatnekar, Efficient Thermal Placement of Standard Cells in 3D ICs using a Force Directed Approach, in Proc. IEEE Int. Conf. on Computer-Aided Design, 003. [9] J. Cong, K.-S. Leung, and D. Zhou, Performance Driven Interconnect Design Based on distributed RC delay model, in Proc. ACM Design Automation Conf.,

Thermal-aware Steiner Routing for 3D Stacked ICs

Thermal-aware Steiner Routing for 3D Stacked ICs Thermal-aware Steiner Routing for 3D Stacked ICs Mohit Pathak and Sung Kyu Lim School of Electrical and Computer Engineering Georgia Institute of Technology {mohitp, limsk}@ece.gatech.edu Abstract In this

More information

A Study of Through-Silicon-Via Impact on the 3D Stacked IC Layout

A Study of Through-Silicon-Via Impact on the 3D Stacked IC Layout A Study of Through-Silicon-Via Impact on the Stacked IC Layout Dae Hyun Kim, Krit Athikulwongse, and Sung Kyu Lim School of Electrical and Computer Engineering Georgia Institute of Technology, Atlanta,

More information

Basic Idea. The routing problem is typically solved using a twostep

Basic Idea. The routing problem is typically solved using a twostep Global Routing Basic Idea The routing problem is typically solved using a twostep approach: Global Routing Define the routing regions. Generate a tentative route for each net. Each net is assigned to a

More information

Planning for Local Net Congestion in Global Routing

Planning for Local Net Congestion in Global Routing Planning for Local Net Congestion in Global Routing Hamid Shojaei, Azadeh Davoodi, and Jeffrey Linderoth* Department of Electrical and Computer Engineering *Department of Industrial and Systems Engineering

More information

Temperature-Aware Routing in 3D ICs

Temperature-Aware Routing in 3D ICs Temperature-Aware Routing in 3D ICs Tianpei Zhang, Yong Zhan and Sachin S. Sapatnekar Department of Electrical and Computer Engineering University of Minnesota 1 Outline Temperature-aware 3D global routing

More information

On GPU Bus Power Reduction with 3D IC Technologies

On GPU Bus Power Reduction with 3D IC Technologies On GPU Bus Power Reduction with 3D Technologies Young-Joon Lee and Sung Kyu Lim School of ECE, Georgia Institute of Technology, Atlanta, Georgia, USA yjlee@gatech.edu, limsk@ece.gatech.edu Abstract The

More information

SEMICONDUCTOR industry is beginning to question the

SEMICONDUCTOR industry is beginning to question the 644 IEEE TRANSACTIONS ON COMPONENTS AND PACKAGING TECHNOLOGIES, VOL. 29, NO. 3, SEPTEMBER 2006 Placement and Routing for 3-D System-On-Package Designs Jacob Rajkumar Minz, Student Member, IEEE, Eric Wong,

More information

Symmetrical Buffered Clock-Tree Synthesis with Supply-Voltage Alignment

Symmetrical Buffered Clock-Tree Synthesis with Supply-Voltage Alignment Symmetrical Buffered Clock-Tree Synthesis with Supply-Voltage Alignment Xin-Wei Shih, Tzu-Hsuan Hsu, Hsu-Chieh Lee, Yao-Wen Chang, Kai-Yuan Chao 2013.01.24 1 Outline 2 Clock Network Synthesis Clock network

More information

Double Patterning-Aware Detailed Routing with Mask Usage Balancing

Double Patterning-Aware Detailed Routing with Mask Usage Balancing Double Patterning-Aware Detailed Routing with Mask Usage Balancing Seong-I Lei Department of Computer Science National Tsing Hua University HsinChu, Taiwan Email: d9762804@oz.nthu.edu.tw Chris Chu Department

More information

Congestion-Aware Power Grid. and CMOS Decoupling Capacitors. Pingqiang Zhou Karthikk Sridharan Sachin S. Sapatnekar

Congestion-Aware Power Grid. and CMOS Decoupling Capacitors. Pingqiang Zhou Karthikk Sridharan Sachin S. Sapatnekar Congestion-Aware Power Grid Optimization for 3D circuits Using MIM and CMOS Decoupling Capacitors Pingqiang Zhou Karthikk Sridharan Sachin S. Sapatnekar University of Minnesota 1 Outline Motivation A new

More information

Metal-Density Driven Placement for CMP Variation and Routability

Metal-Density Driven Placement for CMP Variation and Routability Metal-Density Driven Placement for CMP Variation and Routability ISPD-2008 Tung-Chieh Chen 1, Minsik Cho 2, David Z. Pan 2, and Yao-Wen Chang 1 1 Dept. of EE, National Taiwan University 2 Dept. of ECE,

More information

Bus-Aware Microarchitectural Floorplanning

Bus-Aware Microarchitectural Floorplanning Bus-Aware Microarchitectural Floorplanning Dae Hyun Kim School of Electrical and Computer Engineering Georgia Institute of Technology daehyun@ece.gatech.edu Sung Kyu Lim School of Electrical and Computer

More information

OpenAccess In 3D IC Physical Design

OpenAccess In 3D IC Physical Design OpenAccess In 3D IC Physical Design Jason Cong, Jie Wei,, Yan Zhang VLSI CAD Lab Computer Science Department University of California, Los Angeles Supported by DARPA and CFD Research Corp Outline 3D IC

More information

Buffered Routing Tree Construction Under Buffer Placement Blockages

Buffered Routing Tree Construction Under Buffer Placement Blockages Buffered Routing Tree Construction Under Buffer Placement Blockages Abstract Interconnect delay has become a critical factor in determining the performance of integrated circuits. Routing and buffering

More information

An Efficient Computation of Statistically Critical Sequential Paths Under Retiming

An Efficient Computation of Statistically Critical Sequential Paths Under Retiming An Efficient Computation of Statistically Critical Sequential Paths Under Retiming Mongkol Ekpanyapong, Xin Zhao, and Sung Kyu Lim Intel Corporation, Folsom, California, USA Georgia Institute of Technology,

More information

A Study of IR-drop Noise Issues in 3D ICs with Through-Silicon-Vias

A Study of IR-drop Noise Issues in 3D ICs with Through-Silicon-Vias A Study of IR-drop Noise Issues in 3D ICs with Through-Silicon-Vias Moongon Jung and Sung Kyu Lim School of Electrical and Computer Engineering Georgia Institute of Technology, Atlanta, Georgia, USA Email:

More information

Thermal-Aware 3D IC Placement Via Transformation

Thermal-Aware 3D IC Placement Via Transformation Thermal-Aware 3D IC Placement Via Transformation Jason Cong, Guojie Luo, Jie Wei and Yan Zhang Department of Computer Science University of California, Los Angeles Los Angeles, CA 90095 Email: { cong,

More information

AS VLSI technology scales to deep submicron and beyond, interconnect

AS VLSI technology scales to deep submicron and beyond, interconnect IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL., NO. 1 TILA-S: Timing-Driven Incremental Layer Assignment Avoiding Slew Violations Derong Liu, Bei Yu Member, IEEE, Salim

More information

Problem Formulation. Specialized algorithms are required for clock (and power nets) due to strict specifications for routing such nets.

Problem Formulation. Specialized algorithms are required for clock (and power nets) due to strict specifications for routing such nets. Clock Routing Problem Formulation Specialized algorithms are required for clock (and power nets) due to strict specifications for routing such nets. Better to develop specialized routers for these nets.

More information

Thermal-Aware 3D IC Physical Design and Architecture Exploration

Thermal-Aware 3D IC Physical Design and Architecture Exploration Thermal-Aware 3D IC Physical Design and Architecture Exploration Jason Cong & Guojie Luo UCLA Computer Science Department cong@cs.ucla.edu http://cadlab.cs.ucla.edu/~cong Supported by DARPA Outline Thermal-Aware

More information

Chapter 5 Global Routing

Chapter 5 Global Routing Chapter 5 Global Routing 5. Introduction 5.2 Terminology and Definitions 5.3 Optimization Goals 5. Representations of Routing Regions 5.5 The Global Routing Flow 5.6 Single-Net Routing 5.6. Rectilinear

More information

Multilayer Routing on Multichip Modules

Multilayer Routing on Multichip Modules Multilayer Routing on Multichip Modules ECE 1387F CAD for Digital Circuit Synthesis and Layout Professor Rose Friday, December 24, 1999. David Tam (2332 words, not counting title page and reference section)

More information

Topological Routing to Maximize Routability for Package Substrate

Topological Routing to Maximize Routability for Package Substrate Topological Routing to Maximize Routability for Package Substrate Speaker: Guoqiang Chen Authors: Shenghua Liu, Guoqiang Chen, Tom Tong Jing, Lei He, Tianpei Zhang, Robby Dutta, Xian-Long Hong Outline

More information

VLSI Physical Design: From Graph Partitioning to Timing Closure

VLSI Physical Design: From Graph Partitioning to Timing Closure VLSI Physical Design: From Graph Partitioning to Timing Closure Chapter 5 Global Routing Original uthors: ndrew. Kahng, Jens, Igor L. Markov, Jin Hu VLSI Physical Design: From Graph Partitioning to Timing

More information

University of California at Berkeley. Berkeley, CA the global routing in order to generate a feasible solution

University of California at Berkeley. Berkeley, CA the global routing in order to generate a feasible solution Post Routing Performance Optimization via Multi-Link Insertion and Non-Uniform Wiresizing Tianxiong Xue and Ernest S. Kuh Department of Electrical Engineering and Computer Sciences University of California

More information

ECE260B CSE241A Winter Routing

ECE260B CSE241A Winter Routing ECE260B CSE241A Winter 2005 Routing Website: / courses/ ece260bw05 ECE 260B CSE 241A Routing 1 Slides courtesy of Prof. Andrew B. Kahng Physical Design Flow Input Floorplanning Read Netlist Floorplanning

More information

VLSI Physical Design: From Graph Partitioning to Timing Closure

VLSI Physical Design: From Graph Partitioning to Timing Closure Chapter 5 Global Routing Original uthors: ndrew. Kahng, Jens, Igor L. Markov, Jin Hu Chapter 5 Global Routing 5. Introduction 5.2 Terminology and Definitions 5.3 Optimization Goals 5. Representations of

More information

Floorplan and Power/Ground Network Co-Synthesis for Fast Design Convergence

Floorplan and Power/Ground Network Co-Synthesis for Fast Design Convergence Floorplan and Power/Ground Network Co-Synthesis for Fast Design Convergence Chen-Wei Liu 12 and Yao-Wen Chang 2 1 Synopsys Taiwan Limited 2 Department of Electrical Engineering National Taiwan University,

More information

L14 - Placement and Routing

L14 - Placement and Routing L14 - Placement and Routing Ajay Joshi Massachusetts Institute of Technology RTL design flow HDL RTL Synthesis manual design Library/ module generators netlist Logic optimization a b 0 1 s d clk q netlist

More information

A Novel Framework for Multilevel Full-Chip Gridless Routing

A Novel Framework for Multilevel Full-Chip Gridless Routing A Novel Framework for Multilevel Full-Chip Gridless Routing Tai-Chen Chen Yao-Wen Chang Shyh-Chang Lin Graduate Institute of Electronics Engineering Graduate Institute of Electronics Engineering SpringSoft,

More information

Layer Assignment for Reliable System-on-Package

Layer Assignment for Reliable System-on-Package Layer Assignment for Reliable System-on-Package Jacob R. Minz and Sung Kyu Lim School of Electrical and Computer Engineering Georgia Institute of Technology, Atlanta, GA 30332-0250 {jrminz,limsk}@ece.gatech.edu

More information

Introduction. A very important step in physical design cycle. It is the process of arranging a set of modules on the layout surface.

Introduction. A very important step in physical design cycle. It is the process of arranging a set of modules on the layout surface. Placement Introduction A very important step in physical design cycle. A poor placement requires larger area. Also results in performance degradation. It is the process of arranging a set of modules on

More information

Clock Tree Resynthesis for Multi-corner Multi-mode Timing Closure

Clock Tree Resynthesis for Multi-corner Multi-mode Timing Closure Clock Tree Resynthesis for Multi-corner Multi-mode Timing Closure Subhendu Roy 1, Pavlos M. Mattheakis 2, Laurent Masse-Navette 2 and David Z. Pan 1 1 ECE Department, The University of Texas at Austin

More information

Estimation of Wirelength

Estimation of Wirelength Placement The process of arranging the circuit components on a layout surface. Inputs: A set of fixed modules, a netlist. Goal: Find the best position for each module on the chip according to appropriate

More information

DpRouter: A Fast and Accurate Dynamic- Pattern-Based Global Routing Algorithm

DpRouter: A Fast and Accurate Dynamic- Pattern-Based Global Routing Algorithm DpRouter: A Fast and Accurate Dynamic- Pattern-Based Global Routing Algorithm Zhen Cao 1,Tong Jing 1, 2, Jinjun Xiong 2, Yu Hu 2, Lei He 2, Xianlong Hong 1 1 Tsinghua University 2 University of California,

More information

Place and Route for FPGAs

Place and Route for FPGAs Place and Route for FPGAs 1 FPGA CAD Flow Circuit description (VHDL, schematic,...) Synthesize to logic blocks Place logic blocks in FPGA Physical design Route connections between logic blocks FPGA programming

More information

Floorplan Management: Incremental Placement for Gate Sizing and Buffer Insertion

Floorplan Management: Incremental Placement for Gate Sizing and Buffer Insertion Floorplan Management: Incremental Placement for Gate Sizing and Buffer Insertion Chen Li, Cheng-Kok Koh School of ECE, Purdue University West Lafayette, IN 47907, USA {li35, chengkok}@ecn.purdue.edu Patrick

More information

CS612 Algorithms for Electronic Design Automation. Global Routing

CS612 Algorithms for Electronic Design Automation. Global Routing CS612 Algorithms for Electronic Design Automation Global Routing Mustafa Ozdal CS 612 Lecture 7 Mustafa Ozdal Computer Engineering Department, Bilkent University 1 MOST SLIDES ARE FROM THE BOOK: MODIFICATIONS

More information

Efficient Second-Order Iterative Methods for IR Drop Analysis in Power Grid

Efficient Second-Order Iterative Methods for IR Drop Analysis in Power Grid Efficient Second-Order Iterative Methods for IR Drop Analysis in Power Grid Yu Zhong Martin D. F. Wong Dept. of Electrical and Computer Engineering Dept. of Electrical and Computer Engineering Univ. of

More information

Physical Design of Digital Integrated Circuits (EN0291 S40) Sherief Reda Division of Engineering, Brown University Fall 2006

Physical Design of Digital Integrated Circuits (EN0291 S40) Sherief Reda Division of Engineering, Brown University Fall 2006 Physical Design of Digital Integrated Circuits (EN029 S40) Sherief Reda Division of Engineering, Brown University Fall 2006 Lecture 09: Routing Introduction to Routing Global Routing Detailed Routing 2

More information

Three-Dimensional Integrated Circuits: Performance, Design Methodology, and CAD Tools

Three-Dimensional Integrated Circuits: Performance, Design Methodology, and CAD Tools Three-Dimensional Integrated Circuits: Performance, Design Methodology, and CAD Tools Shamik Das, Anantha Chandrakasan, and Rafael Reif Microsystems Technology Laboratories Massachusetts Institute of Technology

More information

Full-chip Multilevel Routing for Power and Signal Integrity

Full-chip Multilevel Routing for Power and Signal Integrity Full-chip Multilevel Routing for Power and Signal Integrity Jinjun Xiong Lei He Electrical Engineering Department, University of California at Los Angeles, CA, 90095, USA Abstract Conventional physical

More information

A Novel Performance-Driven Topology Design Algorithm

A Novel Performance-Driven Topology Design Algorithm A Novel Performance-Driven Topology Design Algorithm Min Pan, Chris Chu Priyadarshan Patra Electrical and Computer Engineering Dept. Intel Corporation Iowa State University, Ames, IA 50011 Hillsboro, OR

More information

SEMICONDUCTOR industry is beginning to question the

SEMICONDUCTOR industry is beginning to question the IEEE TRANSACTIONS ON COMPONENTS AND PACKAGING TECHNOLOGIES, VOL.??, NO.??, JANUARY(?) 2005 1 Placement and Routing for 3D System-On-Package Designs Jacob Minz, Eric Wong, Mohit Pathak, and Sung Kyu Lim,

More information

Buffered Steiner Trees for Difficult Instances

Buffered Steiner Trees for Difficult Instances Buffered Steiner Trees for Difficult Instances C. J. Alpert 1, M. Hrkic 2, J. Hu 1, A. B. Kahng 3, J. Lillis 2, B. Liu 3, S. T. Quay 1, S. S. Sapatnekar 4, A. J. Sullivan 1, P. Villarrubia 1 1 IBM Corp.,

More information

CAD Algorithms. Placement and Floorplanning

CAD Algorithms. Placement and Floorplanning CAD Algorithms Placement Mohammad Tehranipoor ECE Department 4 November 2008 1 Placement and Floorplanning Layout maps the structural representation of circuit into a physical representation Physical representation:

More information

Thermal-Driven Multilevel Routing for 3-D ICs

Thermal-Driven Multilevel Routing for 3-D ICs Thermal-Driven Multilevel Routing for 3-D ICs Jason Cong and Yan Zhang Computer Science Department, UCLA Los Angeles, CA 90095 tel. 310-206-5449, fax. 310-825-2273 cong, zhangyan@cs.ucla.edu Abstract 3-D

More information

Placement Algorithm for FPGA Circuits

Placement Algorithm for FPGA Circuits Placement Algorithm for FPGA Circuits ZOLTAN BARUCH, OCTAVIAN CREŢ, KALMAN PUSZTAI Computer Science Department, Technical University of Cluj-Napoca, 26, Bariţiu St., 3400 Cluj-Napoca, Romania {Zoltan.Baruch,

More information

Symmetrical Buffer Placement in Clock Trees for Minimal Skew Immune to Global On-chip Variations

Symmetrical Buffer Placement in Clock Trees for Minimal Skew Immune to Global On-chip Variations Symmetrical Buffer Placement in Clock Trees for Minimal Skew Immune to Global On-chip Variations Renshen Wang Department of Computer Science and Engineering University of California, San Diego La Jolla,

More information

IN recent years, interconnect delay has become an increasingly

IN recent years, interconnect delay has become an increasingly 436 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 18, NO. 4, APRIL 1999 Non-Hanan Routing Huibo Hou, Jiang Hu, and Sachin S. Sapatnekar, Member, IEEE Abstract This

More information

Fast Dual-V dd Buffering Based on Interconnect Prediction and Sampling

Fast Dual-V dd Buffering Based on Interconnect Prediction and Sampling Based on Interconnect Prediction and Sampling Yu Hu King Ho Tam Tom Tong Jing Lei He Electrical Engineering Department University of California at Los Angeles System Level Interconnect Prediction (SLIP),

More information

Adaptive Power Blurring Techniques to Calculate IC Temperature Profile under Large Temperature Variations

Adaptive Power Blurring Techniques to Calculate IC Temperature Profile under Large Temperature Variations Adaptive Techniques to Calculate IC Temperature Profile under Large Temperature Variations Amirkoushyar Ziabari, Zhixi Bian, Ali Shakouri Baskin School of Engineering, University of California Santa Cruz

More information

Routability-Driven Bump Assignment for Chip-Package Co-Design

Routability-Driven Bump Assignment for Chip-Package Co-Design 1 Routability-Driven Bump Assignment for Chip-Package Co-Design Presenter: Hung-Ming Chen Outline 2 Introduction Motivation Previous works Our contributions Preliminary Problem formulation Bump assignment

More information

Performance-Preserved Analog Routing Methodology via Wire Load Reduction

Performance-Preserved Analog Routing Methodology via Wire Load Reduction Electronic Design Automation Laboratory (EDA LAB) Performance-Preserved Analog Routing Methodology via Wire Load Reduction Hao-Yu Chi, Hwa-Yi Tseng, Chien-Nan Jimmy Liu, Hung-Ming Chen 2 Dept. of Electrical

More information

Iterative-Constructive Standard Cell Placer for High Speed and Low Power

Iterative-Constructive Standard Cell Placer for High Speed and Low Power Iterative-Constructive Standard Cell Placer for High Speed and Low Power Sungjae Kim and Eugene Shragowitz Department of Computer Science and Engineering University of Minnesota, Minneapolis, MN 55455

More information

Incremental Layer Assignment for Critical Path Timing

Incremental Layer Assignment for Critical Path Timing Incremental Layer Assignment for Critical Path Timing Derong Liu 1, Bei Yu 2, Salim Chowdhury 3, and David Z. Pan 1 1 ECE Department, University of Texas at Austin, Austin, TX, USA 2 CSE Department, Chinese

More information

FastPlace 2.0: An Efficient Analytical Placer for Mixed- Mode Designs

FastPlace 2.0: An Efficient Analytical Placer for Mixed- Mode Designs FastPlace.0: An Efficient Analytical Placer for Mixed- Mode Designs Natarajan Viswanathan Min Pan Chris Chu Iowa State University ASP-DAC 006 Work supported by SRC under Task ID: 106.001 Mixed-Mode Placement

More information

IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS 1

IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS 1 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS 1 RegularRoute: An Efficient Detailed Router Applying Regular Routing Patterns Yanheng Zhang and Chris Chu Abstract In this paper, we propose

More information

Co-optimization of TSV assignment and micro-channel placement for 3D-ICs

Co-optimization of TSV assignment and micro-channel placement for 3D-ICs THE INSTITUTE FOR SYSTEMS RESEARCH ISR TECHNICAL REPORT 2012-10 Co-optimization of TSV assignment and micro-channel placement for 3D-ICs Bing Shi, Ankur Srivastava and Caleb Serafy ISR develops, applies

More information

Thermal Via Planning for 3-D ICs

Thermal Via Planning for 3-D ICs Thermal Via Planning for 3-D ICs Jason Cong Computer Science Department, UCLA Los Angeles, CA 90095 cong@cs.ucla.edu Yan Zhang Computer Science Department, UCLA Los Angeles, CA 90095 zhangyan@cs.ucla.edu

More information

Test-TSV Estimation During 3D-IC Partitioning

Test-TSV Estimation During 3D-IC Partitioning Test-TSV Estimation During 3D-IC Partitioning Shreepad Panth 1, Kambiz Samadi 2, and Sung Kyu Lim 1 1 Dept. of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA 30332 2

More information

Very Large Scale Integration (VLSI)

Very Large Scale Integration (VLSI) Very Large Scale Integration (VLSI) Lecture 6 Dr. Ahmed H. Madian Ah_madian@hotmail.com Dr. Ahmed H. Madian-VLSI 1 Contents FPGA Technology Programmable logic Cell (PLC) Mux-based cells Look up table PLA

More information

IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 20, NO. 3, MARCH

IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 20, NO. 3, MARCH IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 20, NO. 3, MARCH 2012 459 NCTU-GR: Efficient Simulated Evolution-Based Rerouting and Congestion-Relaxed Layer Assignment on 3-D Global

More information

ECE 5745 Complex Digital ASIC Design Topic 13: Physical Design Automation Algorithms

ECE 5745 Complex Digital ASIC Design Topic 13: Physical Design Automation Algorithms ECE 7 Complex Digital ASIC Design Topic : Physical Design Automation Algorithms Christopher atten School of Electrical and Computer Engineering Cornell University http://www.csl.cornell.edu/courses/ece7

More information

Advanced Surface Based MoM Techniques for Packaging and Interconnect Analysis

Advanced Surface Based MoM Techniques for Packaging and Interconnect Analysis Electrical Interconnect and Packaging Advanced Surface Based MoM Techniques for Packaging and Interconnect Analysis Jason Morsey Barry Rubin, Lijun Jiang, Lon Eisenberg, Alina Deutsch Introduction Fast

More information

Microprocessor Thermal Analysis using the Finite Element Method

Microprocessor Thermal Analysis using the Finite Element Method Microprocessor Thermal Analysis using the Finite Element Method Bhavya Daya Massachusetts Institute of Technology Abstract The microelectronics industry is pursuing many options to sustain the performance

More information

EDA for ONoCs: Achievements, Challenges, and Opportunities. Ulf Schlichtmann Dresden, March 23, 2018

EDA for ONoCs: Achievements, Challenges, and Opportunities. Ulf Schlichtmann Dresden, March 23, 2018 EDA for ONoCs: Achievements, Challenges, and Opportunities Ulf Schlichtmann Dresden, March 23, 2018 1 Outline Placement PROTON (nonlinear) PLATON (force-directed) Maze Routing PlanarONoC Challenges Opportunities

More information

Linking Layout to Logic Synthesis: A Unification-Based Approach

Linking Layout to Logic Synthesis: A Unification-Based Approach Linking Layout to Logic Synthesis: A Unification-Based Approach Massoud Pedram Department of EE-Systems University of Southern California Los Angeles, CA February 1998 Outline Introduction Technology and

More information

An Efficient Routing Tree Construction Algorithm with Buffer Insertion, Wire Sizing and Obstacle Considerations

An Efficient Routing Tree Construction Algorithm with Buffer Insertion, Wire Sizing and Obstacle Considerations An Efficient Routing Tree Construction Algorithm with uffer Insertion, Wire Sizing and Obstacle Considerations Sampath Dechu Zion Cien Shen Chris C N Chu Physical Design Automation Group Dept Of ECpE Dept

More information

Thermal Analysis on Face-to-Face(F2F)-bonded 3D ICs

Thermal Analysis on Face-to-Face(F2F)-bonded 3D ICs 1/16 Thermal Analysis on Face-to-Face(F2F)-bonded 3D ICs Kyungwook Chang, Sung-Kyu Lim School of Electrical and Computer Engineering Georgia Institute of Technology Introduction Challenges in 2D Device

More information

Unit 7: Maze (Area) and Global Routing

Unit 7: Maze (Area) and Global Routing Unit 7: Maze (Area) and Global Routing Course contents Routing basics Maze (area) routing Global routing Readings Chapters 9.1, 9.2, 9.5 Filling Unit 7 1 Routing Unit 7 2 Routing Constraints 100% routing

More information

3-D INTEGRATED CIRCUITS (3-D ICs) are emerging

3-D INTEGRATED CIRCUITS (3-D ICs) are emerging 862 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 21, NO. 5, MAY 2013 Study of Through-Silicon-Via Impact on the 3-D Stacked IC Layout Dae Hyun Kim, Student Member, IEEE, Krit

More information

Chapter 28: Buffering in the Layout Environment

Chapter 28: Buffering in the Layout Environment Chapter 28: Buffering in the Layout Environment Jiang Hu, and C. N. Sze 1 Introduction Chapters 26 and 27 presented buffering algorithms where the buffering problem was isolated from the general problem

More information

FUTURE processors implemented in deep submicrometer. Multiobjective Microarchitectural Floorplanning for 2-D and 3-D ICs

FUTURE processors implemented in deep submicrometer. Multiobjective Microarchitectural Floorplanning for 2-D and 3-D ICs 38 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 26, NO. 1, JANUARY 2007 Multiobjective Microarchitectural Floorplanning for 2-D and 3-D ICs Michael Healy, Student

More information

A Thermal-aware Application specific Routing Algorithm for Network-on-chip Design

A Thermal-aware Application specific Routing Algorithm for Network-on-chip Design A Thermal-aware Application specific Routing Algorithm for Network-on-chip Design Zhi-Liang Qian and Chi-Ying Tsui VLSI Research Laboratory Department of Electronic and Computer Engineering The Hong Kong

More information

A Provably Good Approximation Algorithm for Rectangle Escape Problem with Application to PCB Routing

A Provably Good Approximation Algorithm for Rectangle Escape Problem with Application to PCB Routing A Provably Good Approximation Algorithm for Rectangle Escape Problem with Application to PCB Routing Qiang Ma Hui Kong Martin D. F. Wong Evangeline F. Y. Young Department of Electrical and Computer Engineering,

More information

Static Compaction Techniques to Control Scan Vector Power Dissipation

Static Compaction Techniques to Control Scan Vector Power Dissipation Static Compaction Techniques to Control Scan Vector Power Dissipation Ranganathan Sankaralingam, Rama Rao Oruganti, and Nur A. Touba Computer Engineering Research Center Department of Electrical and Computer

More information

CATALYST: Planning Layer Directives for Effective Design Closure

CATALYST: Planning Layer Directives for Effective Design Closure CATALYST: Planning Layer Directives for Effective Design Closure Yaoguang Wei 1, Zhuo Li 2, Cliff Sze 2 Shiyan Hu 3, Charles J. Alpert 2, Sachin S. Sapatnekar 1 1 Department of Electrical and Computer

More information

a) wire i with width (Wi) b) lij C coupled lij wire j with width (Wj) (x,y) (u,v) (u,v) (x,y) upper wiring (u,v) (x,y) (u,v) (x,y) lower wiring dij

a) wire i with width (Wi) b) lij C coupled lij wire j with width (Wj) (x,y) (u,v) (u,v) (x,y) upper wiring (u,v) (x,y) (u,v) (x,y) lower wiring dij COUPLING AWARE ROUTING Ryan Kastner, Elaheh Bozorgzadeh and Majid Sarrafzadeh Department of Electrical and Computer Engineering Northwestern University kastner,elib,majid@ece.northwestern.edu ABSTRACT

More information

An explicit feature control approach in structural topology optimization

An explicit feature control approach in structural topology optimization th World Congress on Structural and Multidisciplinary Optimisation 07 th -2 th, June 205, Sydney Australia An explicit feature control approach in structural topology optimization Weisheng Zhang, Xu Guo

More information

A Framework for Systematic Evaluation and Exploration of Design Rules

A Framework for Systematic Evaluation and Exploration of Design Rules A Framework for Systematic Evaluation and Exploration of Design Rules Rani S. Ghaida* and Prof. Puneet Gupta EE Dept., University of California, Los Angeles (rani@ee.ucla.edu), (puneet@ee.ucla.edu) Work

More information

[14] M. A. B. Jackson, A. Srinivasan and E. S. Kuh, Clock routing for high-performance ICs, 27th ACM

[14] M. A. B. Jackson, A. Srinivasan and E. S. Kuh, Clock routing for high-performance ICs, 27th ACM Journal of High Speed Electronics and Systems, pp65-81, 1996. [14] M. A. B. Jackson, A. Srinivasan and E. S. Kuh, Clock routing for high-performance ICs, 27th ACM IEEE Design AUtomation Conference, pp.573-579,

More information

How Much Logic Should Go in an FPGA Logic Block?

How Much Logic Should Go in an FPGA Logic Block? How Much Logic Should Go in an FPGA Logic Block? Vaughn Betz and Jonathan Rose Department of Electrical and Computer Engineering, University of Toronto Toronto, Ontario, Canada M5S 3G4 {vaughn, jayar}@eecgutorontoca

More information

Parallel Simulated Annealing for VLSI Cell Placement Problem

Parallel Simulated Annealing for VLSI Cell Placement Problem Parallel Simulated Annealing for VLSI Cell Placement Problem Atanu Roy Karthik Ganesan Pillai Department Computer Science Montana State University Bozeman {atanu.roy, k.ganeshanpillai}@cs.montana.edu VLSI

More information

Crosslink Insertion for Variation-Driven Clock Network Construction

Crosslink Insertion for Variation-Driven Clock Network Construction Crosslink Insertion for Variation-Driven Clock Network Construction Fuqiang Qian, Haitong Tian, Evangeline Young Department of Computer Science and Engineering The Chinese University of Hong Kong {fqqian,

More information

A Design Tradeoff Study with Monolithic 3D Integration

A Design Tradeoff Study with Monolithic 3D Integration A Design Tradeoff Study with Monolithic 3D Integration Chang Liu and Sung Kyu Lim Georgia Institute of Techonology Atlanta, Georgia, 3332 Phone: (44) 894-315, Fax: (44) 385-1746 Abstract This paper studies

More information

ICS 252 Introduction to Computer Design

ICS 252 Introduction to Computer Design ICS 252 Introduction to Computer Design Lecture 16 Eli Bozorgzadeh Computer Science Department-UCI References and Copyright Textbooks referred (none required) [Mic94] G. De Micheli Synthesis and Optimization

More information

Circuit Model for Interconnect Crosstalk Noise Estimation in High Speed Integrated Circuits

Circuit Model for Interconnect Crosstalk Noise Estimation in High Speed Integrated Circuits Advance in Electronic and Electric Engineering. ISSN 2231-1297, Volume 3, Number 8 (2013), pp. 907-912 Research India Publications http://www.ripublication.com/aeee.htm Circuit Model for Interconnect Crosstalk

More information

Clock Skew Optimization Considering Complicated Power Modes

Clock Skew Optimization Considering Complicated Power Modes Clock Skew Optimization Considering Complicated Power Modes Chiao-Ling Lung 1,2, Zi-Yi Zeng 1, Chung-Han Chou 1, Shih-Chieh Chang 1 National Tsing-Hua University, HsinChu, Taiwan 1 Industrial Technology

More information

Congestion Reduction during Placement with Provably Good Approximation Bound

Congestion Reduction during Placement with Provably Good Approximation Bound Congestion Reduction during Placement with Provably Good Approximation Bound X. YANG Synplicity, Inc. M. WANG Cadence Design Systems, Inc. R. KASTNER University of California, Santa Barbara and S. GHIASI

More information

Reducing Power in an FPGA via Computer-Aided Design

Reducing Power in an FPGA via Computer-Aided Design Reducing Power in an FPGA via Computer-Aided Design Steve Wilton University of British Columbia Power Reduction via CAD How to reduce power dissipation in an FPGA: - Create power-aware CAD tools - Create

More information

Constraint-Driven Floorplanning based on Genetic Algorithm

Constraint-Driven Floorplanning based on Genetic Algorithm Proceedings of the 2007 WSEAS International Conference on Computer Engineering and Applications, Gold Coast, Australia, January 17-19, 2007 147 Constraint-Driven Floorplanning based on Genetic Algorithm

More information

Interconnect Delay and Area Estimation for Multiple-Pin Nets

Interconnect Delay and Area Estimation for Multiple-Pin Nets Interconnect Delay and Area Estimation for Multiple-Pin Nets Jason Cong and David Z. Pan UCLA Computer Science Department Los Angeles, CA 90095 Sponsored by SRC and Avant!! under CA-MICRO Presentation

More information

MAPLE: Multilevel Adaptive PLacEment for Mixed Size Designs

MAPLE: Multilevel Adaptive PLacEment for Mixed Size Designs MAPLE: Multilevel Adaptive PLacEment for Mixed Size Designs Myung Chul Kim, Natarajan Viswanathan, Charles J. Alpert, Igor L. Markov, Shyam Ramji Dept. of EECS, University of Michigan IBM Corporation 1

More information

Department of Electrical and Computer Engineering

Department of Electrical and Computer Engineering LAGRANGIAN RELAXATION FOR GATE IMPLEMENTATION SELECTION Yi-Le Huang, Jiang Hu and Weiping Shi Department of Electrical and Computer Engineering Texas A&M University OUTLINE Introduction and motivation

More information

Short Papers. Creating and Exploiting Flexibility in Rectilinear Steiner Trees

Short Papers. Creating and Exploiting Flexibility in Rectilinear Steiner Trees IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 22, NO. 5, MAY 2003 605 Short Papers Creating and Exploiting Flexibility in Rectilinear Steiner Trees Elaheh Bozorgzadeh,

More information

Diagonal Routing in High Performance Microprocessor Design

Diagonal Routing in High Performance Microprocessor Design Diagonal Routing in High Performance Microprocessor Design Noriyuki Ito, Hideaki Katagiri, Ryoichi Yamashita, Hiroshi Ikeda, Hiroyuki Sugiyama, Hiroaki Komatsu, Yoshiyasu Tanamura, Akihiko Yoshitake, Kazuhiro

More information

Abbas El Gamal. Joint work with: Mingjie Lin, Yi-Chang Lu, Simon Wong Work partially supported by DARPA 3D-IC program. Stanford University

Abbas El Gamal. Joint work with: Mingjie Lin, Yi-Chang Lu, Simon Wong Work partially supported by DARPA 3D-IC program. Stanford University Abbas El Gamal Joint work with: Mingjie Lin, Yi-Chang Lu, Simon Wong Work partially supported by DARPA 3D-IC program Stanford University Chip stacking Vertical interconnect density < 20/mm Wafer Stacking

More information

Congestion Reduction during Placement with Provably Good Approximation Bound 1

Congestion Reduction during Placement with Provably Good Approximation Bound 1 Congestion Reduction during Placement with Provably Good Approximation Bound 1 Xiaojian Yang Maogang Wang Ryan Kastner Soheil Ghiasi Majid Sarrafzadeh Synplicity Inc. Sunnyvale, California Cadence Design

More information

THE continuous increase of the problem size of IC routing

THE continuous increase of the problem size of IC routing 382 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 24, NO. 3, MARCH 2005 MARS A Multilevel Full-Chip Gridless Routing System Jason Cong, Fellow, IEEE, Jie Fang, Min

More information