Routability-Driven Repeater Block Planning for Interconnect-Centric Floorplanning


Probir Sarkar and Cheng-Kok Koh, Member, IEEE

This work was supported by NSF CAREER award CCR. P. Sarkar was with the School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, USA. He is now with Conexant Systems, Newport Beach, CA 92660, USA. C.-K. Koh is with the School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, USA.

Abstract: In this paper, we present a repeater block planning algorithm for interconnect-centric floorplanning. We introduce the concept of independent feasible regions for repeaters and derive an analytical formula for their computation. We develop a routability-driven repeater clustering algorithm to perform repeater block planning based on iterative deletion. The goal is to obtain a high quality solution for the repeater block locations so that performance-driven interconnect synthesis at the routing stage can be carried out with ease, while minimizing the chip area. Experimental results show that our method increases the percentage of all global nets that meet their target delays from 67.5% in [1] to 85%. Moreover, our approach minimizes the expected routing congestion, making it easier for performance-driven routers to synthesize global nets that require the insertion of repeaters to meet timing constraints.

Index Terms: Buffer insertion, deep submicron, floorplanning, physical design, routing.

1 Introduction

Due to the continued scaling of VLSI technologies, interconnects play a dominant role in determining system performance, power, reliability, and cost. To ensure timing closure of designs, the impact of interconnects must be considered as early as possible in the design flow.

Several interconnect synthesis techniques (topology construction, repeater insertion, device sizing, and wire sizing and spacing) have been studied in the literature. A comprehensive survey of these techniques can be found in [2]. Studies in [3, 4, 5, 6] showed that repeater insertion is among the most effective methods to optimize signal delay. Without repeaters, the delay of a long RC wire is quadratic in the wire length. With judicious insertion of repeaters, the delay becomes linear [7, 8, 9]. However, most interconnect synthesis techniques were designed for post-placement interconnect optimization. In [10], it was projected that over 700K repeaters will be inserted in a single chip for the 70 nm technology. The insertion of that many repeaters will significantly change the floorplan and placement of a design, rendering the original floorplan and placement invalid and leading to slow design convergence.

Floorplanning [11, 12], being the first stage of the physical design process, has significant effects on overall system power, performance, and reliability. However, very few existing timing-driven floorplanning techniques consider the option of repeater insertion. An advantage of considering repeater optimization during floorplanning is that the problem size at this stage is smaller than the problem size faced by place-and-route and hence permits a more effective search of the design space. In [13], the floorplanner assumed that repeaters could be inserted arbitrarily in an existing floorplan. However, repeaters consume silicon resources, and certain circuit blocks such as the cache may not allow the insertion of repeaters. To overcome this problem, the authors in [1] considered the insertion of blocks of repeaters in the channel regions between circuit blocks. However, the greedy clustering of repeaters into a repeater block may result in routing congestion.

As pointed out by the authors in [14], it is important to perform interconnect planning for global routing during the floorplanning stage. Many designs are routing-limited; it may not be feasible to get signals into and out of repeaters because routing resources are limited. Therefore, it is important to consider routing feasibility (in subsequent steps) during the global distribution of repeaters in the floorplanning step.

In this paper, we propose a routability-driven repeater block planning algorithm for interconnect-centric floorplanning. We introduce the concept of independent feasible region (IFR) for repeater insertion under a delay constraint, and derive an analytical formula for the computation of IFRs. All repeaters are freely movable within their respective IFRs without violating the timing constraints. This has the advantage that each repeater of a net has equal flexibility of position, which facilitates global optimization. As in [1], we cluster individual repeaters into repeater blocks to form a regular layout structure for a higher density implementation. We develop an effective routability-driven repeater

block planning algorithm. A tile-based congestion model is used to drive an iterative deletion heuristic for repeater clustering. Experimental results show that our method can boost the completion rate, i.e., the percentage of all global nets that meet their target delays, to 85% from the 67.5% completion rate reported in [1]. Furthermore, the expected congestion due to our repeater block planning algorithm is lower than that produced by a planner that does not consider congestion.

The remainder of the paper is organized as follows. In Section 2, we present the problem formulation. Section 3 defines the independent feasible region for repeater insertion. Section 4 presents the routability-driven repeater block planning algorithm. Experimental results are shown in Section 5, followed by the conclusion in Section 6. An extended abstract of this work was presented at ISPD 2000 [15].

2 Problem Formulation

In this paper we study the following repeater block planning problem: given an initial floorplan and timing constraints on each net, find the number, locations, and sizes of the repeater blocks to be inserted in order to meet the timing constraints. As in [1], we consider the insertion of repeater blocks in the channel regions between circuit blocks; repeater insertion is not allowed inside the circuit blocks themselves. Although the primary objective of the repeater block planner is to meet the timing constraints for all nets, it is equally important to avoid routing hot spots and to keep the number of repeater blocks and the increase in chip area within tolerable limits.

The two objectives of reducing the total number of repeater blocks added and avoiding regions with high routing congestion are contradictory in nature; by clustering a large number of repeaters into a repeater block, we create a highly congested routing region around that repeater block. Figure 1 compares a routability-driven repeater block planner with a repeater planner that tries to minimize repeater blocks. Three feasible regions from repeaters of different nets are shown, and the dashed lines represent the routes passing through the region. If minimizing repeater blocks is the sole objective, the repeater block would be chosen in the high congestion region, as shown on the left in Figure 1. A routability-aware repeater planner, however, would move the repeater blocks away from the high congestion zone. As shown on the right in Figure 1, a routability-driven repeater block planner would strike a balance between the two contradictory criteria: routing congestion and repeater block count.

[Figure 1: Impact of routing congestion on repeater block planning. Left panel, repeater-block-based approach: assigns repeaters to congested regions. Right panel, congestion-driven approach: separates repeater blocks if the region is highly congested. Legend: feasible regions, repeater blocks, routes.]

We measure the overall cost of a repeater block plan by a weighted composition of two cost functions: one cost function guides the solution towards the minimum number of repeater blocks, and the other minimizes the routing congestion of the solution. We present the cost functions in detail in Sections 4.1 and 4.2. The weights of the two cost functions may be suitably adjusted to reflect the relative importance of each criterion. Our repeater block planner tries to obtain a repeater block plan that minimizes the overall cost of the solution.

3 Feasible Region Computation

In this section, we introduce the concept of independent feasible region (IFR) for repeater placement and obtain an expression for computing its width. We define the independent feasible region for a repeater to be the region in which it can be placed such that the timing constraint of the net is satisfied, assuming that the other repeaters of that net are also located within their respective independent feasible regions. The effectiveness of its use in repeater block planning is demonstrated in Section 5 by the significantly higher completion rates that we obtain as compared to [1].

3.1 Preliminaries

First, we present the definitions and expressions that will be used in stating the main result for IFR computation. Each driver/repeater is modeled as a switch-level RC circuit [16], and the Elmore delay formula [17] is used for delay computations. The notation for the physical parameters of the interconnect and repeater we use in this paper is as follows:

$r$: wire resistance per unit length;
$c$: wire capacitance per unit length;
$T_b$: intrinsic repeater delay;
$C_b$: repeater input capacitance;
$R_b$: repeater output resistance.

Given a wire segment of length $l$ with driver output resistance $R$ and sink capacitance $C$, the Elmore delay of this segment is defined as

$$D(R, C, l) = \frac{rc}{2}\, l^2 + (Rc + rC)\, l + RC. \qquad (1)$$

Using the above expression, the Elmore delay of a single-source, single-sink net $N$ (two-pin net) of length $L$ with $n$ repeaters can be expressed as

$$D_N(x_1, x_2, \ldots, x_n, L) = D(R_d, C_b, x_1) + D(R_b, C_s, L - x_n) + \sum_{i=1}^{n-1} D(R_b, C_b, x_{i+1} - x_i) + n T_b,$$

where $R_d$ is the driver resistance, $C_s$ is the sink capacitance, and $x_i$ is the location of the $i$-th repeater. The optimal locations of the $n$ repeaters for delay minimization of the net, as shown in [18], are

$$x_i = (i - 1)\, y_L + x_L, \quad i \in \{1, 2, \ldots, n\}, \qquad (2)$$

where

$$x_L = \frac{1}{n+1}\left(L + \frac{n\,(R_b - R_d)}{r} + \frac{C_s - C_b}{c}\right), \qquad (3)$$

$$y_L = \frac{1}{n+1}\left(L - \frac{R_b - R_d}{r} + \frac{C_s - C_b}{c}\right). \qquad (4)$$

We denote the optimal delay for the net $N$, of length $L$, with $n$ repeaters by $D^N_{opt}(n, L) = D_N(x_1, x_2, \ldots, x_n, L)$, with the $x_i$ given by (2).

3.2 Feasible Region for Repeater Insertion

In [1], the feasible region (FR) for repeater insertion was defined as the region in which a repeater could be placed, assuming that all the remaining repeaters were optimally placed with respect to it, in order to satisfy the target delay constraint, denoted by $D^N_{tgt}$. We denote the width of the feasible region for a given repeater by $W_{FR}$. An analytical expression for $W_{FR}$ was given in [1]. The following theorem presents an alternative but equivalent analytical expression, of which the proof is presented in the Appendix.

[Figure 2: Independent feasible regions. The diagram shows a net from source to sink with repeaters 1, ..., i-1, i, i+1, ..., segment lengths $l_1$, ..., $l_i$, $l_{i+1}$, and IFRs of width $W_{IFR}$.]

Theorem 1 For $D^N_{tgt} \ge D^N_{opt}(n, L)$, the width of the feasible region for the $i$-th repeater ($1 \le i \le n$) of the net $N$ is

$$W_{FR} = 2 \sqrt{\frac{2\,\bigl(D^N_{tgt} - D^N_{opt}(n, L)\bigr)\,(n - i + 1)\, i}{rc\,(n + 1)}}.$$

However, such a definition of FRs has the disadvantage that if we assign a repeater to a location that is on or near the boundary of its feasible region, the feasible regions of the other repeaters are almost entirely eliminated.

3.3 Independent Feasible Region

As opposed to the definition of feasible region, the independent feasible region of a repeater is the region where it can be placed while meeting the timing specifications of the net, assuming that the other repeaters are placed within their respective independent feasible regions. Formally, we define the independent feasible region (IFR) for the $i$-th repeater of a net $N$ as

$$IFR_i = \left(x_i - \frac{W_{IFR}}{2},\; x_i + \frac{W_{IFR}}{2}\right) \cap (0, L)$$

such that for all $(x_1, x_2, \ldots, x_i, \ldots, x_n) \in IFR_1 \times IFR_2 \times \cdots \times IFR_n$, $D_N(x_1, x_2, \ldots, x_n, L) \le D^N_{tgt}$. Here, $W_{IFR}$ and $D^N_{tgt}$ respectively denote the width of the independent feasible region $IFR_i$ and the target delay associated with the net. The interval endpoints use the optimal location $x_i$ of (2), while the tuple in the condition ranges over all placements in the product of the IFRs. Note that the final placement of a repeater in its IFR does not depend on the placement of the other repeaters, so long as they are placed within their respective IFRs. To allocate an equal degree of freedom to each repeater in the net, we choose the IFR intervals to be of equal width (see Figure 2). We have the following theorem for the width of the independent feasible region of a repeater.

Theorem 2 For $D^N_{tgt} \ge D^N_{opt}(n, L)$, the width of the independent feasible region for the $i$-th

repeater ($1 \le i \le n$) of the net $N$ is

$$W_{IFR} = 2 \sqrt{\frac{D^N_{tgt} - D^N_{opt}(n, L)}{rc\,(2n - 1)}}.$$

The proof is presented in the Appendix. It can be trivially shown that when $n = 1$, $W_{FR} = W_{IFR}$. Although $W_{IFR} < W_{FR}$ for $n > 1$, using IFRs for repeater planning results in a solution of higher quality because it facilitates a higher degree of fairness during global optimization of repeater block clustering. Note that when the objective is to minimize the number of repeaters inserted in a net, the independent feasible regions of repeaters belonging to the same net do not overlap each other. Otherwise, at least one repeater can be eliminated without violating the given timing constraint.

3.4 Two-Dimensional IFR

In the preceding discussions, we were limited to repeater insertion along a one-dimensional line. Implicit in the discussions was the assumption that the route from source to sink is specified by some global router. For repeater block planning during floorplanning, however, no routing information is available. We assume that each net would be routed with a shortest path within the bounding box containing the two terminals. Therefore, we need to compute two-dimensional regions in which the repeaters can be placed. The two-dimensional independent feasible region (2-D IFR) of a repeater is defined as the union of the one-dimensional IFRs of that repeater on all monotonic Manhattan routes between source and sink. Therefore, 2-D IFRs, as in [1], are convex octilinear polygons with horizontal, vertical, and $\pm 1$-slope boundaries (see Figure 3).

However, 2-D IFRs of repeaters belonging to the same net are not completely independent of each other. As the widths and locations of a 2-D IFR are valid only under the assumption that a monotonic Manhattan route exists between the source and the sink, the assignments of repeaters to locations within their respective 2-D IFRs must be made such that they constitute a monotone path from source to sink.
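As a numerical companion to the one-dimensional results of Sections 3.1 to 3.3, the following Python sketch evaluates the Elmore delay of (1), the optimal locations of (2) to (4), and the widths of Theorems 1 and 2, and then checks that the worst corner of the box $IFR_1 \times \cdots \times IFR_n$ just meets the target delay. The technology values in `P`, the net length, the 5% slack, and all function names are assumptions made for this illustration and are not taken from the paper.

```python
import itertools
import math

# Hypothetical technology values chosen only for illustration (ohm, farad,
# second, micrometre); they are assumptions, not data from the paper.
P = {"r": 0.075, "c": 0.118e-15, "Tb": 36.4e-12,
     "Cb": 23.4e-15, "Rb": 180.0, "Rd": 90.0, "Cs": 50.0e-15}

def elmore(R, C, l, p):
    """Elmore delay of a wire of length l driven by R and loaded by C, eq. (1)."""
    return 0.5 * p["r"] * p["c"] * l * l + (R * p["c"] + p["r"] * C) * l + R * C

def net_delay(xs, L, p):
    """Delay of a two-pin net of length L with repeaters at ordered positions xs."""
    d = elmore(p["Rd"], p["Cb"], xs[0], p)          # driver to first repeater
    d += elmore(p["Rb"], p["Cs"], L - xs[-1], p)    # last repeater to sink
    for a, b in zip(xs, xs[1:]):                    # repeater to repeater
        d += elmore(p["Rb"], p["Cb"], b - a, p)
    return d + len(xs) * p["Tb"]

def optimal_locations(n, L, p):
    """Delay-optimal repeater locations, eqs. (2) to (4)."""
    dR = (p["Rb"] - p["Rd"]) / p["r"]
    dC = (p["Cs"] - p["Cb"]) / p["c"]
    x_L = (L + n * dR + dC) / (n + 1)
    y_L = (L - dR + dC) / (n + 1)
    return [x_L + i * y_L for i in range(n)]

def w_fr(slack, n, i, p):
    """Width of the feasible region of the i-th repeater (Theorem 1)."""
    return 2.0 * math.sqrt(2.0 * slack * (n - i + 1) * i / (p["r"] * p["c"] * (n + 1)))

def w_ifr(slack, n, p):
    """Common width of the independent feasible regions (Theorem 2)."""
    return 2.0 * math.sqrt(slack / (p["r"] * p["c"] * (2 * n - 1)))

n, L = 3, 12000.0
x_opt = optimal_locations(n, L, P)
d_opt = net_delay(x_opt, L, P)
d_tgt = 1.05 * d_opt                    # assumed 5% timing slack
half = 0.5 * w_ifr(d_tgt - d_opt, n, P)

# The delay is convex in the repeater positions, so its maximum over the box
# IFR_1 x ... x IFR_n is attained at one of the 2^n corners.
worst = max(net_delay([x + s * half for x, s in zip(x_opt, signs)], L, P)
            for signs in itertools.product((-1.0, 1.0), repeat=n))
print(f"D_opt = {d_opt:.4e} s, W_IFR = {2 * half:.1f} um, worst corner = {worst:.4e} s")
```

Under these assumed values the three IFRs are disjoint and lie inside $(0, L)$, so the corner check covers every placement allowed by the definition in Section 3.3.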
In Figure 3, for example, the repeater assignments, which form a non-monotonic sequence from the source to the sink, violate the monotonicity constraint even though the repeaters are within their respective 2-D IFRs. Therefore, whenever the 2-D IFR of a repeater is modified, the 2-D IFRs for all other repeaters in the net may need to be updated (see Footnote 1). In this paper, we develop an efficient scheme to perform the update. Details of the updating scheme are presented in Section 4.4. In the subsequent discussions, we shall deal with only 2-D IFRs; we refer to them simply as IFRs.

Footnote 1: Even though 2-D IFRs are not truly independent, they do not exhibit the same problem as FRs when a repeater is assigned to the boundary of its FR, i.e., the FRs of remaining repeaters are almost entirely eliminated. Note that for 2-D FRs, repeaters of a net must also constitute a monotone path from source to sink because a 2-D FR is similarly obtained as the union of the one-dimensional FRs of that repeater on all monotonic Manhattan routes between source and sink.

[Figure 3: Non-monotonic repeater assignments. Labels: source; sink; circuit blocks; feasible regions for repeaters 1 and 2; feasible regions reduced by circuit blocks. Annotation: the assignment of repeaters to the shown locations will result in a non-monotonic path.]

4 Repeater Block Planning

In this section, we describe in detail our routability-driven repeater block planning algorithm. Given a floorplan, we assume that repeaters can only be inserted within the vertical and horizontal channels [19] defined by circuit blocks as in [1]. We also assume that no repeaters can be inserted into circuit blocks. For the purpose of routing congestion computation (to be described in Section 4.1), we divide the entire chip area into a set of rectangular routing tiles as shown in Figure 4. The routing tiles correspond to higher metal layers reserved for the routing of global nets. We also divide the channel between circuit blocks into a set of repeater-block tiles, which are of a finer resolution than routing tiles. As in [1], these repeater-block tiles represent locations where the repeaters may be placed. Figure 4 shows a subset of repeater-block tiles. In essence, we construct a two-level tile structure, one level for estimating routing congestion and the other for defining candidate repeater block (CRB) locations.

For each repeater $b$ to be inserted, we find $S_b$, the set of CRBs in which it can be placed. As shown in Figure 4, each repeater has several CRBs to which it may be assigned. The objective

of the repeater block planner is to assign each repeater to a single CRB. To solve this problem, we take the approach of first generating the candidate set for each repeater, and then using a routing-congestion driven iterative deletion [20] algorithm to obtain an assignment for each repeater. The iterative deletion procedure operates on a bipartite graph $G$ that represents the set of all possible repeater assignments. Let $B$ be the set of all repeaters that need to be inserted. The edge set of $G$ is defined as $E(G) = \{(b, c) : b \in B, c \in S_b\}$. The iterative deletion algorithm starts with a redundant solution space that contains all possible assignments of repeaters to CRBs. One at a time, the algorithm deletes an incompatible repeater assignment, i.e., an assignment that results in high routing congestion or too many repeater blocks. Equivalently, the iterative deletion algorithm removes one edge at a time from the bipartite graph. When the iterative deletion algorithm terminates, we obtain assignments of repeaters to CRBs that are compatible with each other in terms of routability and repeater block count.

[Figure 4: Creation of routing tiles and candidate repeater blocks. Labels: routing tiles; circuit blocks; source; sink; bounding box of the net; independent feasible region; CRBs of the feasible region.]

Figure 5 gives the overall flow of our repeater block planning algorithm. Steps 1 through 4 are data preparation stages. Step 1 constructs a two-level tile structure as shown in Figure 4. In Steps 2 and 3, we compute the minimum number of repeaters required for each two-pin net to meet its target delay [1, 18] and assign $W_{IFR}$ of each repeater based on the available net slack ($D^N_{tgt} - D^N_{opt}(n, L)$) as defined in Theorem 2. Then, we generate the candidate set $S_b$ for each repeater $b$ by intersecting the IFR of the repeater with the set of repeater-block tiles within the bounding box of the net. If the IFR for a repeater does not have any intersections with the set of repeater-block tiles, then we consider possible repeater locations along the boundaries of circuit blocks. For such nets, we allow repeaters to be placed along the circuit block boundaries in order to meet their timing constraints. Step 4 constructs the bipartite graph $G$.
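The core loop of iterative deletion on the bipartite graph (Steps 5, 6, and 10 of Figure 5) can be sketched in Python as below. This is a deliberately reduced illustration, not the authors' implementation: the monotonicity and congestion updates are omitted, the edge cost is passed in as a callable that is simply re-evaluated at every iteration, and `sharing_cost` is a hypothetical toy cost that only rewards CRBs wanted by many repeaters.

```python
def iterative_deletion(candidates, edge_cost):
    """Assign each repeater to one CRB by repeatedly deleting the costliest edge.

    candidates: dict mapping a repeater name to a non-empty set of CRB names (S_b).
    edge_cost:  callable (b, c, cands) -> float, re-evaluated at every iteration.
    """
    cands = {b: set(s) for b, s in candidates.items()}
    if any(not s for s in cands.values()):
        raise ValueError("every repeater needs at least one candidate CRB")
    while True:
        # Only repeaters with more than one remaining CRB still have deletable edges.
        edges = [(b, c) for b, s in cands.items() if len(s) > 1 for c in s]
        if not edges:
            break
        # Highest cost first; ties are broken by name to keep the run deterministic.
        b, c = max(edges, key=lambda e: (edge_cost(e[0], e[1], cands), e))
        cands[b].remove(c)
    # Each repeater is now left with exactly one CRB, which is its assignment.
    return {b: next(iter(s)) for b, s in cands.items()}

def sharing_cost(b, c, cands):
    """Toy stand-in for the edge weight: CRBs wanted by few repeaters cost more."""
    return 1.0 / sum(1 for s in cands.values() if c in s)

S = {"b1": {"t1", "t2"}, "b2": {"t2", "t3"}, "b3": {"t2"}}
assignment = iterative_deletion(S, sharing_cost)
print(assignment)
```

With this toy cost the three repeaters end up clustered in the shared tile `t2`; a congestion term in the cost would push some of them apart when the tile is crowded.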

Repeater Block Planning Algorithm
1. Build a two-level tile structure on a given floorplan;
2. Compute IFR for each repeater $b \in B$;
3. Obtain CRB set $S_b$ for each repeater $b \in B$;
4. Generate the bipartite graph $G$;
5. while there exists a repeater to be assigned do
6.     Delete the highest cost edge of $G$;
7.     Update monotonicity;
8.     Update congestion matrix;
9.     Update edge costs;
10.    Assign repeater to a CRB if required;
Figure 5: Repeater Block Planning Algorithm.

Steps 5 to 10 perform the iterative deletion operations. The iterative deletion procedure assigns each repeater to one of its CRBs. To accomplish the task of deleting incompatible repeater assignments, the edges in $G$ have dynamic weights. The weight of edge $(b, c)$ reflects the cost of assigning repeater $b$ to CRB $c$, assuming that the other repeaters can go into any of their CRBs with equal probability. An edge of a higher cost implies that the repeater block assignment is likely to be incompatible with the rest, and the iterative deletion operation in Steps 6 to 10 (to be described in Section 4.3) seeks to remove the highest cost edge, i.e., the most incompatible repeater block assignment. As edges are iteratively deleted from graph $G$, the routability of the floorplan is modified and so is the expected repeater block count. These changes are reflected in the edge costs of the graph at every iteration, thus making the edge weights dynamic. We shall present the routing congestion model and the associated cost function in Section 4.1, and the edge weight, which is a composite cost function of the routing congestion cost and repeater block cost, in Section 4.2.

4.1 Congestion Model

The congestion model employed is essentially a two-dimensional rectangular-grid-based probabilistic map assuming two-bend routing for each segment. This is similar to that developed in [14]. The congestion of a routing tile $\mathrm{tile}(i, j)$ is defined as follows: $C_h(i, j)$ is the expected number of horizontal routes

passing through $\mathrm{tile}(i, j)$, and $C_v(i, j)$ is the expected number of vertical routes passing through $\mathrm{tile}(i, j)$.

To derive the congestion numbers for the routing grid, we first compute the expected number of routes passing through every routing tile for a wire segment with a fixed source and a fixed sink. Without loss of generality, we consider the source to be located in $\mathrm{tile}(0, 0)$ and the sink to be located in $\mathrm{tile}(m, n)$, and assume that a bend consumes both horizontal and vertical routing resources. It is easy to see that there are a total of $(m + n)$ possible two-bend routes from source to sink. Assuming that all these routes are equally likely, we obtain the probability matrix shown in Figure 6. $\delta C_h(i, j)$ is defined as the probability of a horizontal route passing through $\mathrm{tile}(i, j)$. Similarly, $\delta C_v(i, j)$ is defined for vertical routes.

Tile | $\delta C_h(i, j)$ | $\delta C_v(i, j)$
$0 < i < m$, $0 < j < n$ | $\frac{1}{m+n}$ | $\frac{1}{m+n}$
$0 < i < m$, $j = 0$ | $\frac{m-i+1}{m+n}$ | $\frac{1}{m+n}$
$0 < i < m$, $j = n$ | $\frac{i+1}{m+n}$ | $\frac{1}{m+n}$
$i = 0$, $0 < j < n$ | $\frac{1}{m+n}$ | $\frac{n-j+1}{m+n}$
$i = m$, $0 < j < n$ | $\frac{1}{m+n}$ | $\frac{j+1}{m+n}$
$i = 0$, $j = 0$ or $i = m$, $j = n$ | $\frac{m}{m+n}$ | $\frac{n}{m+n}$
Figure 6: Probability matrix.

We define a subnet as the segment of a net between two consecutive repeaters, between the source and the first repeater, or between the last repeater and the sink. In the problem at hand, the source and the sink of a subnet may have several candidate locations if they are repeaters. We compute the contribution of each subnet to the congestion matrices $C_h$ and $C_v$ by fixing the source and sink of a subnet segment to the centroids of their CRBs. Thus, given a set of two-pin nets and candidate locations for the repeaters, we can compute the congestion matrices $C_h$ and $C_v$ as follows:

$$C_h(i, j) = \sum_{\text{subnets}} \delta C_h(i, j), \qquad C_v(i, j) = \sum_{\text{subnets}} \delta C_v(i, j).$$
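The entries of Figure 6 can be cross-checked by enumerating the $m + n$ routes directly. The Python sketch below does this for a subnet whose source is in tile $(0, 0)$ and whose sink is in tile $(m, n)$ with $m, n \ge 1$; the function name and the route bookkeeping (a bend tile is charged in both directions, a source or sink tile only in the direction the route actually uses) are assumptions of this illustration.

```python
from fractions import Fraction

def two_bend_probabilities(m, n):
    """Return (dCh, dCv), indexed [i][j], for source tile (0, 0) and sink tile (m, n)."""
    if m < 1 or n < 1:
        raise ValueError("this sketch only covers m >= 1 and n >= 1")
    h = [[0] * (n + 1) for _ in range(m + 1)]
    v = [[0] * (n + 1) for _ in range(m + 1)]
    # Horizontal-vertical-horizontal routes with the vertical run in column k.
    # k = 0 and k = m are the two one-bend (L-shaped) routes.
    for k in range(m + 1):
        if k >= 1:                        # run along row 0 from column 0 to k
            for x in range(k + 1):
                h[x][0] += 1
        if k <= m - 1:                    # run along row n from column k to m
            for x in range(k, m + 1):
                h[x][n] += 1
        for y in range(n + 1):            # vertical run in column k
            v[k][y] += 1
    # Vertical-horizontal-vertical routes with the horizontal run in row l.
    for l in range(1, n):
        for y in range(l + 1):
            v[0][y] += 1
        for y in range(l, n + 1):
            v[m][y] += 1
        for x in range(m + 1):
            h[x][l] += 1
    total = m + n                         # number of distinct routes
    dch = [[Fraction(a, total) for a in col] for col in h]
    dcv = [[Fraction(a, total) for a in col] for col in v]
    return dch, dcv

dch, dcv = two_bend_probabilities(4, 3)
print("dCh(0,0) =", dch[0][0], " dCv(0,0) =", dcv[0][0])
```

Summing these per-subnet matrices over all subnets, after translating each one to its own source and sink tiles, gives the congestion matrices $C_h$ and $C_v$.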

[Figure 7: Piecewise-linear congestion cost function. Horizontal axis: normalized congestion. Vertical axis: congestion cost. Labelled segment slopes include 10 and 100.]

As is done in a number of other routers, we assign a congestion cost to each routing tile. A number of ways of modeling the congestion cost have been proposed in the literature [21, 22, 23]. For the purpose of this work, we use a monotonic piecewise-linear cost function of the type proposed in [24] (see Figure 7).

4.2 Dynamic Edge Weights

The cost of assigning a repeater to a CRB (or the edge cost) is a weighted composite function comprising the congestion cost and the repeater block cost.

Congestion Cost: The congestion cost of an edge is defined as the maximum congestion cost among all routing tiles in the one-bend routing path for the subnet that has the repeater as its source and the subnet that has the repeater as its sink. The respective sink and source of the subnets are assumed to be located at the centroids of their CRBs. One-bend routing, instead of two-bend routing, is used for congestion cost calculation because of its efficiency. The congestion metric and the associated cost functions used have been described in Section 4.1. We denote the congestion cost of an edge $e(b, c) \in E(G)$ as $CC(e)$.

Repeater Block Cost: The repeater block cost function is used to guide the iterative deletion procedure to converge to a solution with the minimum number of repeater blocks. Let $c$ be a CRB. Let $B_c$ be the number of IFRs intersecting on this CRB. Also, define $B_{max}$ to be the maximum number of repeaters that can be inserted into a CRB. The repeater block cost of the CRB $c$, $BB(c)$, is defined as follows: If the number of repeaters assigned to the

channel tile is less than $B_{max}$, then we define $BB(c)$ to be $1 / \min(B_c, B_{max})$. Otherwise, we define $BB(c) = \infty$.

The composite cost function is a weighted product of the two costs. The cost of an edge $e(b, c) \in E(G)$ is defined as

$$COST(e) = \bigl(CC(e)\bigr)^{p_1} \cdot \bigl(BB(c)\bigr)^{p_2},$$

where $p_1$ and $p_2$ are positive parameters such that $p_1 + p_2 = 1$. Changing the values of $p_1$ and $p_2$ allows us to trade off the two contradictory criteria involved.

4.3 Iterative Deletion

The iterative improvement method modifies the solution space repeatedly by changing a small portion at each step. Our procedure begins with a redundant set of possible locations for each repeater, and removes a single highest-cost redundant assignment at each step, while attempting to minimize the cost of the solution. It proceeds by deleting the highest cost edge $e(b, c)$ of $G$. Let $N$ denote the net to which repeater $b$ belongs. We update the bipartite graph and its associated edge costs as follows:

Monotonicity Update: As mentioned in Section 3.4, the formula for the IFR of each repeater has been derived assuming that a monotonic route between the source and the sink through the repeater locations is generated. To prevent non-monotonic repeater location assignments, we update the candidate sets for repeaters in the net $N$ to ensure the existence of a monotonic path. A CRB in $N$ lies on a monotonic path if there exists a sequence of CRBs, one in each CRB set of the remaining repeaters, that forms a monotonic path from source to sink. In Figure 3, for example, the repeater assignments shown form a non-monotonic sequence from the source to the sink. CRBs that do not lie on a monotonic path are removed. In Section 4.4, we present an efficient technique to remove non-monotone CRBs.

Congestion Update: Deletion of the highest cost edge and the corresponding monotonicity update change the solution set represented by the bipartite graph $G$. After monotonicity update, the CRB sets of repeaters on a net change.
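The composite edge cost of Section 4.2 can be written down in a few lines of Python. The sketch below takes the congestion cost $CC(e)$ as a plain number, because the breakpoints of the piecewise-linear function are not recoverable from the text; reading the full-CRB case as an infinite cost is likewise an assumption of this illustration, as are the function and parameter names.

```python
import math

def repeater_block_cost(num_ifrs, num_assigned, b_max):
    """BB(c): num_ifrs is B_c, the number of IFRs intersecting CRB c (at least 1)."""
    if num_assigned >= b_max:
        return math.inf              # assumed reading: a full CRB is prohibitive
    return 1.0 / min(num_ifrs, b_max)

def edge_cost(cc, bb, p1):
    """COST(e) = CC(e)^p1 * BB(c)^p2 with p2 = 1 - p1 and 0 < p1 < 1."""
    if not 0.0 < p1 < 1.0:
        raise ValueError("p1 must lie strictly between 0 and 1")
    if math.isinf(bb):
        return math.inf              # avoids 0 * inf when the congestion cost is 0
    return (cc ** p1) * (bb ** (1.0 - p1))

# A CRB covered by 4 IFRs, no repeater assigned yet, capacity 10, equal weights.
bb = repeater_block_cost(num_ifrs=4, num_assigned=0, b_max=10)
print(edge_cost(cc=4.0, bb=bb, p1=0.5))
```

Because the cost is a product, raising `p1` towards 1 makes the deletion order follow congestion almost exclusively, while lowering it favours dense clustering into few repeater blocks.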
Once all the changes to the CRB set of a repeater have been determined, the new centroid of the CRB set can be computed in linear time with respect to the number of deleted CRBs. The congestion update procedure reflects the impact of these changes onto the congestion matrices $C_h$ and $C_v$. As

the centroids of the CRB sets of repeaters on the net before and after the monotonicity update are known, the congestion matrix can be updated very efficiently (in time linear in the number of routing tiles through which the affected subnets pass). Note that the update is necessary only when the new and old centroids of a subnet lie in different routing tiles. It is important to note that the centroids usually change their routing tile locations only after several iterations of monotonicity updates. We also observe that the congestion matrix does not change significantly with each congestion update.

Edge Cost Update: Due to monotonicity and congestion updates, the cost of assigning a repeater to a particular tile changes. This procedure updates the cost of the edges in the graph $G$. The updates of the repeater block costs can be done by simply recomputing the function $BB(c)$ for all CRBs that have been deleted or to which repeater assignments have been made. The update of $BB(c)$ for all edges incident on $c$ is linear with respect to the degree of $c$ in $G$. However, it is very costly to reflect the changes of the congestion cost in the edge cost with each congestion update. As the congestion matrix does not change significantly with each occurrence of congestion update, we perform a lazy update of the routing cost for all the edges after every $K$ occurrences of congestion updates. In our implementation, $K = 20$ provides a satisfactory compromise between efficiency and solution quality.

The assignment of a repeater to a repeater block is accomplished when the candidate set for the repeater consists of only one repeater block.

4.4 An Efficient Technique for Monotonicity Update

As stated in Section 3.4, it is important that the eventual repeater assignments of a net constitute a monotonic path from source to sink. To ensure that, the candidate sets of other repeaters belonging to the same net need to be properly updated upon the deletion of a CRB from the candidate set of a repeater of that net. This procedure is referred to as Monotonicity Update.

In the rest of this section, we shall consider the case where the net $N$, whose source is located at $(X_s, Y_s)$ and destination at $(X_d, Y_d)$, has $X_s \le X_d$ and $Y_s \le Y_d$. (We use the notation $(X, Y)$ for coordinates to avoid any confusion with the notation $x$ and $y$ used for denoting optimal repeater placement in Section 3.) The other three cases, in which one or both of these inequalities are reversed, can be handled in a similar

fashion.

[Figure 8: An independent dominated set and its independent dominated contour. White squares correspond to un-deleted CRBs. Lightly shaded squares correspond to deleted CRBs. Black squares correspond to CRBs in the independent dominated set. The independent dominated contour is depicted by thick lines. The darker shaded squares correspond to the region in which one should search for new members of the independent dominated set should $c_j$ be deleted. Labels: IDS, IDC, un-deleted CRBs, deleted CRBs, $c_i$, $c_j$, $c_k$.]

Let $(X_i, Y_i)$ be the location of the CRB to which the $i$-th repeater $b_i$ is assigned. The Monotonicity Update procedure must ensure that the following property is preserved by the eventual assignments of $n$ repeaters:

Monotonicity Property (MP): for all $i, j$ with $1 \le i \le n$, $1 \le j \le n$, and $i \ne j$,
$X_i \le X_j$ if $i < j$, Prop. (1)
$Y_i \le Y_j$ if $i < j$. Prop. (2)

In order to preserve the monotonicity property, we make use of the dominance relation of CRBs. Consider two CRBs $c$ and $c'$ at locations $(X_c, Y_c)$ and $(X_{c'}, Y_{c'})$, respectively. We say that $c'$ dominates $c$ if and only if $X_c \le X_{c'}$ and $Y_c \le Y_{c'}$. If neither $c$ nor $c'$ dominates the other, they are mutually independent. Given the set of CRBs $S_b$ of repeater $b$, we define an independent dominated set $IDS_b$ to be a subset of $S_b$ such that (i) all CRBs in $IDS_b$ are mutually independent; (ii) other than self-dominance, CRBs from $IDS_b$ do not dominate CRBs from $S_b$; and (iii) every CRB from $S_b$ dominates (self-dominance included) at least one CRB from $IDS_b$.

Figure 8 shows the independent dominated set (IDS) of a repeater. We arrange the CRBs in an IDS by their X-coordinates in increasing order. From each CRB in the IDS, we extend a vertical line segment northward either infinitely if the CRB is the first

in the set or until it reaches the Y-coordinate of the previous CRB otherwise. We also extend from each CRB in the IDS a horizontal line segment eastward either infinitely if the CRB is the last in the set or until it reaches the X-coordinate of the next CRB otherwise. These line segments define a monotone contour of the IDS, which is dominated by CRBs in the northeast region of the contour. We refer to the monotone contour as the independent dominated contour (IDC) (see Figure 8).

Consider two independent dominated sets $IDS_{b_i}$ and $IDS_{b_j}$ for repeaters $b_i$ and $b_j$, respectively, with $i < j$. We say that $IDS_{b_j}$ dominates $IDS_{b_i}$, written $IDS_{b_i} \preceq IDS_{b_j}$, if and only if $IDC_{b_j}$ is completely in the northeast region of $IDC_{b_i}$. When $IDS_{b_j}$ dominates $IDS_{b_i}$, every CRB in $IDS_{b_j}$ dominates at least one CRB in $IDS_{b_i}$. In other words, there exist some repeater assignments that constitute a monotone path. In order for the eventual repeater assignments to preserve the monotone property, we maintain the following relaxed monotone property:

Relaxed Monotonicity Property (RMP): for all $i, j$ with $1 \le i \le n$, $1 \le j \le n$, and $i \ne j$,
$IDS_{b_i} \preceq IDS_{b_j}$ if $i < j$. Prop. (3)

When a CRB $c$ of repeater $b_i$ is considered for deletion, we have to update $IDS_{b_i}$ if $c$ happens to be in $IDS_{b_i}$. As a result, we have to update the IDSs of subsequent repeaters $b_j$, $i < j$, if Prop. (3) is violated. If the sequence of updates results in an empty $S_{b_k}$ for some $b_k$, then $c$ should not be deleted in the first place. Instead, $b_i$ should be assigned to $c$.

Based on the above observation, a monotonicity update procedure that preserves the relaxed monotonicity property has been developed. Figure 9 gives the details of the update procedure. We assume that $e(b_i, c)$ is the edge to be deleted, where $b_i$ is the $i$-th repeater of net $N$, and $c$ is a CRB in $S_{b_i}$. To be more general, the monotonicity update procedure takes in a subset of edges (adjacent to $b_i$) that are to be deleted. We denote the subset of edges by $\hat{E}_{b_i}$.
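The dominance relation, the IDS, and the check behind the relaxed monotonicity property can be illustrated with a small Python sketch. CRBs are represented here as integer `(X, Y)` pairs for the case $X_s \le X_d$, $Y_s \le Y_d$; the function names are hypothetical, and the IDS is recomputed from scratch rather than maintained incrementally as in the paper.

```python
def dominates(a, b):
    """True if CRB a dominates CRB b, i.e. X_b <= X_a and Y_b <= Y_a."""
    return b[0] <= a[0] and b[1] <= a[1]

def independent_dominated_set(crbs):
    """CRBs of the set that dominate no other CRB, ordered by increasing X."""
    ids, lowest_y = [], None
    for x, y in sorted(set(crbs)):          # scan by X, then by Y
        if lowest_y is None or y < lowest_y:
            ids.append((x, y))              # strictly lower than everything to its left
            lowest_y = y
    return ids

def ids_dominates(ids_j, ids_i):
    """True if every CRB of ids_j dominates at least one CRB of ids_i."""
    return all(any(dominates(cj, ci) for ci in ids_i) for cj in ids_j)

def rmp_violators(s_next, ids_prev):
    """CRBs of the next repeater lying outside the northeast region of IDC_prev."""
    return sorted(c for c in s_next if not any(dominates(c, ci) for ci in ids_prev))

S1 = {(0, 2), (1, 1), (2, 0), (2, 2), (1, 2)}   # candidate CRBs of repeater b_1
S2 = {(0, 0), (1, 1), (3, 0), (2, 3)}           # candidate CRBs of repeater b_2
IDS1 = independent_dominated_set(S1)
print("IDS of b_1:", IDS1)
print("CRBs of b_2 violating RMP:", rmp_violators(S2, IDS1))
```

In this example the CRB `(0, 0)` of the second repeater lies southwest of the contour of the first, so it is the one a monotonicity update would mark for deletion.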
The Boolean variable ViolationFlag monitors whether an IFR has become empty because of the monotonicity update. When that happens, we un-delete all CRBs that have been deleted and, making use of the changes stored in the variable IDS/IDC Change, revert to the original IDS and IDC for each IFR. It also signals to the iterative deletion procedure in Figure 5 that $b_i$ should be assigned to $c$, where $R_{b_i} = \{e(b_i, c)\}$ is the input edge set.

Monotonicity Update Procedure($R_{b_i}$)
 1. for each $c \in R_{b_i}$ do
 2.     mark $c$ deleted;
 3. if $S_{b_i} = \emptyset$ then
 4.     ViolationFlag ← TRUE;
 5. else /* did the next IFR become empty? */
 6.     IDS/IDC Change ← IDS/IDC Update($R_{b_i}$, $S_{b_i}$, $S_{b_{i+1}}$);
 7.     $R_{b_{i+1}}$ ← {$c' \in S_{b_{i+1}}$ that violates RMP};
 8.     if $R_{b_{i+1}} = \emptyset$ then
 9.         ViolationFlag ← FALSE;
10.     else
11.         ViolationFlag ← Monotonicity Update Procedure($R_{b_{i+1}}$);
12. if ViolationFlag = TRUE then
13.     for each $c \in R_{b_i}$ do
14.         mark $c$ undeleted;
15.     un-do IDS/IDC Change;
16. return ViolationFlag;

Figure 9: Monotonicity Update Procedure.

While the other steps are fairly straightforward, Steps 6 and 7 require further explanation. The procedure IDS/IDC Update updates $IDS_{b_i}$ and, accordingly, $IDC_{b_i}$. When we delete a CRB from $S_{b_i}$, we do not have to update $IDS_{b_i}$ if that CRB is not originally in $IDS_{b_i}$. Now consider the case in which we delete a CRB from $IDS_{b_i}$. In the IDS shown in Figure 8, for example, we have three ordered CRBs $c_i$, $c_j$, and $c_k$. When we delete $c_j$ from the IDS, we search for new members of the IDS within the darker shaded region, that is, the rectangle bounded by $X_{c_j}$, $X_{c_k} - 1$, $Y_{c_j}$, and $Y_{c_i} - 1$, assuming that every CRB occupies a unit square. The new IDS members can be computed by sweeping either row-wise from $Y_{c_j}$ to $Y_{c_i} - 1$ or column-wise from $X_{c_j}$ to $X_{c_k} - 1$. Let $c_r$ be the first CRB that we encounter when we sweep row-wise from $Y_{c_j}$ to $Y_{c_i} - 1$ such that $X_{c_r} < X_{c_k}$. Then, any remaining new IDS members lie in the rectangular region bounded by $X_{c_j}$, $X_{c_r} - 1$, $Y_{c_r} + 1$, and $Y_{c_i} - 1$. When we sweep column-wise, we search for the first CRB $c_c$ such that $Y_{c_c} < Y_{c_i}$. Then, the next new IDS member, if any, lies in the rectangular region bounded by $X_{c_c} + 1$, $X_{c_k} - 1$, $Y_{c_i}$, and $Y_{c_c} - 1$.

Although the number of columns or rows that the algorithm has to sweep varies, we are able to amortize the operation cost such that the run-time complexity of a CRB deletion operation is $O(1)$. The cost of marking a CRB as deleted, be it an IDS member or not, is charged to the CRB itself. In the following analysis, we assume that the algorithm performs a row-wise sweep after deleting an IDS member. If sweeping a row produces a new IDS member, we charge the operation cost to that CRB. An IDS member, once created, will not be involved in any sweeping operation again. The next time it is charged with an operation cost is when it is marked deleted. If the sweep produces no new IDS member, then we charge the cost to a previously deleted CRB that is located at the east-most boundary of the rectangular region. For the example used in the preceding two paragraphs, we charge the cost of sweeping the rows from $Y_{c_j}$ to $Y_{c_r} - 1$ to the CRBs with coordinates $(X_{c_k} - 1, Y_{c_j}), \ldots, (X_{c_k} - 1, Y_{c_r} - 1)$. Such CRBs will not be charged with another sweep, as they now lie outside the northeast region of the IDC. Therefore, every CRB is charged at most twice, i.e., $O(1)$.
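The row-wise sweep described above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: CRBs are assumed to be integer `(X, Y)` tuples on a unit grid, the function and parameter names are hypothetical, `x_max` and `y_max` are assumed grid extents used when the deleted CRB has no IDS neighbour on one side, and the amortized charging scheme is not modelled.

```python
from typing import Iterable, List, Optional, Set, Tuple

CRB = Tuple[int, int]  # assumed model: (X, Y) tile coordinates of a unit-square CRB


def ids_of(crbs: Iterable[CRB]) -> List[CRB]:
    """Reference IDS recomputed from scratch, ordered by increasing X."""
    ids: List[CRB] = []
    for c in sorted(set(crbs)):
        if not ids or c[1] < ids[-1][1]:
            ids.append(c)
    return ids


def new_ids_members(alive: Set[CRB], c_i: Optional[CRB], c_j: CRB,
                    c_k: Optional[CRB], x_max: int, y_max: int) -> List[CRB]:
    """Row-wise sweep for the IDS members that replace the deleted CRB c_j.

    alive    : un-deleted CRBs of the repeater (c_j already removed)
    c_i, c_k : IDS neighbours of c_j (None if c_j was first or last in the IDS)
    """
    x_hi = c_k[0] - 1 if c_k is not None else x_max  # east bound, X_ck - 1
    y_hi = c_i[1] - 1 if c_i is not None else y_max  # north bound, Y_ci - 1
    found: List[CRB] = []
    for y in range(c_j[1], y_hi + 1):                # sweep rows south to north
        if x_hi < c_j[0]:
            break                                    # search rectangle is exhausted
        for x in range(c_j[0], x_hi + 1):            # scan one row west to east
            if (x, y) in alive:
                found.append((x, y))                 # first hit in the row plays the role of c_r
                x_hi = x - 1                         # remaining members lie west of it
                break
    found.reverse()                                  # restore increasing-X order
    return found


if __name__ == "__main__":
    s_b = {(1, 6), (2, 4), (5, 2), (2, 5), (3, 5), (4, 4), (3, 6), (6, 3)}
    c_i, c_j, c_k = ids_of(s_b)                      # [(1, 6), (2, 4), (5, 2)]
    alive = s_b - {c_j}
    new = new_ids_members(alive, c_i, c_j, c_k, x_max=6, y_max=6)
    print(new)                                       # [(2, 5), (4, 4)]
    print([c_i] + new + [c_k] == ids_of(alive))      # True
```

Comparing the sweep result with a from-scratch recomputation, as the example does, is a convenient way to test an incremental update of this kind.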
Propagating the change in the contour from $IDC_{b_i}$ to $IDC_{b_{i+1}}$ may turn out to be slightly more costly. Using the example in Figure 8, let $c_r$ be the CRB that is ordered before $c_k$ in the new IDS after $c_j$ is deleted. If $(X_{c_k} - 1, Y_{c_r} - 1)$ is in the northeast region of $IDC_{b_{i+1}}$, then we have to update $IDC_{b_{i+1}}$. As $IDS_{b_{i+1}}$ is ordered, we can perform a binary search in $O(\log |IDS_{b_{i+1}}|)$ time to find the two adjacent CRBs in $IDS_{b_{i+1}}$ whose Y-coordinates sandwich $Y_{c_r} - 1$. Note that $|IDS_{b_{i+1}}|$ is no greater than the minimum of the numbers of columns and rows in $IDS_{b_{i+1}}$. If $(X_{c_k} - 1, Y_{c_r} - 1)$ does not dominate the two adjacent CRBs, then no deletion of CRBs in $S_{b_{i+1}}$ is required.

We can achieve a speed-up by taking advantage of the geometrical properties of IFRs. As mentioned in Section 3.4, all IFRs are octilinear in shape. Therefore, we can define for $S_{b_{i+1}}$ a line of slope $-1$, $Y + X = M$, such that $X_c + Y_c \ge M$ for each $c \in S_{b_{i+1}}$. Only when $X_{c_k} + Y_{c_r} - 2 \ge M$ do we have to perform a binary search. In general, we have to verify that the new contour between $c_i$ and $c_k$ does not invade the northeast region of $IDC_{b_{i+1}}$ after deleting $c_j$.

The algorithm outlined in Figure 9 illustrates how the deletion of a CRB of $b_i$ affects the IFRs of repeaters $b_j$, $j > i$, in net $N$. Such a deletion also affects the IFRs of repeaters $b_j$, $j < i$, in net $N$. Similar to the algorithm in Figure 9, we develop an algorithm to perform monotonicity updates of IFRs for $b_j$, $j < i$, by defining the independent dominating set of $S_b$ as follows: (i) all CRBs in the independent dominating set are mutually independent; (ii) other than self-dominance, CRBs from $S_b$ do not dominate CRBs from the independent dominating set; and (iii) every CRB from $S_b$ is dominated (self-dominance included) by at least one CRB from the independent dominating set. The independent dominating set also defines an independent dominating contour, which dominates all CRBs in the southwest region of the contour. As the algorithm to update the independent dominating contours of preceding repeaters is very similar to the algorithm in Figure 9, we do not present its details here.

5 Experimental Results

We have implemented our repeater block planning algorithm in C on a SUN UltraSPARC II machine. In this section, we present the details of our experimental setup and the results obtained. The interconnect and repeater parameters and the Elmore delay model have been described in Section 3.1. The values used for these parameters (see Table 1) are based on a 0.18 µm technology in the NTRS'97 roadmap [25]. We report the results of our repeater block planner for six MCNC [26] benchmark circuits. The relevant details of these benchmarks are shown in Table 2.

Table 1: Parameter values for interconnect, repeater, driver, and sink.

Symbol         Description                                   Value
$r$            wire resistance per unit length (Ω/µm)
$c$            wire capacitance per unit length (fF/µm)
$T_b$          intrinsic repeater delay (ps)                 36.4
$C_s$, $C_b$   sink/repeater input capacitance (fF)          23.4
$R_b$, $R_d$   driver/repeater output resistance (Ω)         180
In this work we focus on solving the problem of repeater block planning for two-pin (single-source, single-sink) nets. As the benchmark files do not provide information on signal direction, we choose the first pin to be the source and all the others to be sinks, and decompose a multiple-terminal net into a set of two-pin nets. This may not be ideal, especially for path-based timing optimizations, but it provides a sufficiently good model of the input to a repeater block planner. We ignore all single-pin, power, and ground interconnects.

Table 2: Details of MCNC benchmarks.

Circuit    Modules    Nets    2-Pin Nets
apte
xerox
hp
ami33
ami49
playout

The initial floorplans of the MCNC benchmark circuits used for this work were obtained from [27], and are the same as those used in [1]. These floorplans were generated by running simulated tempering using an improved Monte-Carlo technique [28].

Since the MCNC benchmarks do not come with any timing information, we assign target delays to the two-pin nets as follows. We ignore all two-pin nets whose lengths are smaller than the critical length $L_{min}$ [18], above which repeater insertion can be used for delay reduction. For each remaining net we compute the optimal delay $T_{opt}$ obtainable by repeater insertion [18] and then randomly assign a target delay between 1.05 and 1.20 times $T_{opt}$, as in [1]. As we generate these timing constraints on our own, a direct comparison between our method and that in [1] may not be fair, even though we do use their results for reference in the following discussion.

In Table 3 we report the following results from our repeater block planning algorithm: (i) the ratio of the number of nets for which the delay constraint is met to the number of nets for which it is not satisfied, MET / NOT MET; (ii) the total number of repeaters inserted to meet timing constraints, $N_{REP}$; (iii) the maximum tile congestion, $C_{MAX}$; (iv) the chip area increase, $\delta a$, expressed as a percentage of the original chip area; and (v) the CPU time required, $T_{CPU}$. We also include the results for these benchmark circuits reported in [1].

Compared to [1], MET is significantly higher. The success rates of meeting the timing constraints (or completion rates) range from 71% (for apte) to 95% (for playout). The average completion rate is 85%. On the other hand, the average completion rate for the same examples in [1] is 67.5%, and the completion rates range from 58% (for xerox and hp) to 84% (for ami33). Our completion rates are higher for all benchmark circuits in Table 2.

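Table 3: Comparison of repeater block planning solutions.

Circuit                               MET / NOT MET   $N_{REP}$   $C_{MAX}$   $\delta a$   $T_{CPU}$ (s)
apte      $p_1 = 1$, $p_2 = 0$             /
          $p_1 = 0$, $p_2 = 1$             /
          $p_1 = p_2 = 0.5$                /
          [1]                         102 /
xerox     $p_1 = 1$, $p_2 = 0$             /
          $p_1 = 0$, $p_2 = 1$             /
          $p_1 = p_2 = 0.5$                /
          [1]                         260 /
hp        $p_1 = 1$, $p_2 = 0$             /
          $p_1 = 0$, $p_2 = 1$             /
          $p_1 = p_2 = 0.5$                /
          [1]                         131 /
ami33     $p_1 = 1$, $p_2 = 0$             /
          $p_1 = 0$, $p_2 = 1$             /
          $p_1 = p_2 = 0.5$                /
          [1]                         305 /
ami49     $p_1 = 1$, $p_2 = 0$             /
          $p_1 = 0$, $p_2 = 1$             /
          $p_1 = p_2 = 0.5$                /
          [1]                         412 /
playout   $p_1 = 1$, $p_2 = 0$             /
          $p_1 = 0$, $p_2 = 1$             /
          $p_1 = p_2 = 0.5$                /
          [1]                         1533 /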

In Table 3, we report an identical $N_{REP}$ under three different combinations of $p_1$ and $p_2$ for each benchmark circuit, even though the completion rates are different. The completion rates are different because some nets cannot meet the timing constraints after we expand the floorplan to accommodate the repeater blocks along the boundaries of circuit blocks (see Section 4). In fact, all entries in $N_{REP}$, $C_{MAX}$, and $\delta a$ include nets that do not meet the timing constraints after floorplan expansions.

Table 3 also shows that we generally require a smaller number of repeaters to achieve a higher completion rate, compared to [1]. In [1], all channels between blocks are expanded at the beginning of the planning step [27]. As a result, net lengths become longer, and more repeaters are required. In our algorithm, we expand the channels only when required. Therefore, our net lengths are shorter in general, and we require a smaller number of repeaters to meet the timing constraints.

The results in Table 3 also show that our algorithm is able to reduce routing congestion. With $p_1 = 1$ and $p_2 = 0$, the objective function is oriented towards the reduction of congestion cost, whereas setting $p_1 = 0$ and $p_2 = 1$ causes the repeater block count to be minimized. The setting $p_1 = p_2 = 0.5$ represents a solution with a tradeoff between the two contradictory cost functions. Table 3 shows that routing congestion levels can be lowered when they are accounted for during repeater block planning. In xerox, for example, $C_{MAX}$ for $p_1 = 0$ and $p_2 = 1$ (i.e., the objective of minimizing the number of repeater blocks) is about 50% higher than the $C_{MAX}$ for $p_1 = 1$ and $p_2 = 0$ (i.e., the objective of minimizing the routing congestion). Clearly, if the design is routing-limited, the actual completion rates for the former case will be reduced accordingly. However, the average increase in chip area due to repeater insertion is 1.25%, which is higher than the 1.05% average increase reported in [1]. Moreover, the run-times of our algorithm are higher than those in [1].

6 Conclusion

In this paper we have presented a repeater block planning algorithm that uses the concept of independent feasible regions to perform repeater clustering. The assignments of repeaters to repeater blocks are routability-driven, and thus we are able to avoid regions of high routing congestion. Experimental results show that our technique performs significantly better than existing repeater planning methods.

7 Acknowledgments

The authors would like to thank Prof. Jason Cong, Mr. David Z. Pan, and Mr. Tianming Kong of UCLA for providing the floorplans of the MCNC benchmark circuits. They would also like to thank them and Dr. Kevin K.-Y. Chao of Intel for their helpful discussions, and the anonymous reviewers for their constructive comments.

Appendix: Proofs of Theorems 1 and 2

We consider a net $N$ of length $L$ optimally inserted with $n$ repeaters. We denote the optimal placement of the first repeater by $x_L$, the distance between two optimally placed repeaters by $y_L$, and the distance between the last repeater and the sink by $z_L$. The expressions for $x_L$ and $y_L$ are given by Eqns. (3) and (4), respectively, and $z_L = L - (n-1)\,y_L - x_L$. In the following lemma and theorems, we deal with repeaters displaced from their optimal locations. As a result, the distances between the driver and the first repeater, between repeaters, and between the last repeater and the sink change. We capture the resultant change in the delay by the following expression:

$$\Delta(R, C, w, l) = D(R, C, l + w) - D(R, C, l) = \frac{rc}{2}\,w^2 + (rcl + Rc + rC)\,w \tag{5}$$

where $w$ is the change in net length, $l$ is the original net length, $R$ is the driver resistance, $C$ is the sink capacitance, and $D(R, C, l)$ is the Elmore delay in Eqn. (1). The coefficient of $w$ in Eqn. (5) has an interesting property, summarized in the following lemma:

Lemma 1 The coefficient of $w$ in Eqn. (5) satisfies the following equalities:

$$rc\,x_L + R_d c + r C_b = rc\,y_L + R_b c + r C_b = rc\,z_L + R_b c + r C_s \tag{6}$$

Proof: From Eqns. (2)-(4), we can show that

$$\begin{aligned}
rc\,x_L + R_d c + r C_b
&= \frac{rc}{n+1} L + \frac{cn}{n+1} R_b + \frac{c}{n+1} R_d + \frac{r}{n+1} C_s + \frac{rn}{n+1} C_b \\
&= rc \left( \frac{L}{n+1} - \frac{R_b - R_d}{(n+1)\,r} + \frac{C_s - C_b}{(n+1)\,c} \right) + R_b c + r C_b \\
&= rc\,y_L + R_b c + r C_b
\end{aligned}$$

and

$$\begin{aligned}
rc\,z_L + R_b c + r C_s
&= rc\,\bigl(L - (n-1)\,y_L - x_L\bigr) + R_b c + r C_s \\
&= \frac{rc}{n+1} L + \frac{cn}{n+1} R_b + \frac{c}{n+1} R_d + \frac{r}{n+1} C_s + \frac{rn}{n+1} C_b \\
&= rc\,y_L + R_b c + r C_b.
\end{aligned}$$
□

Theorem 1 For $D_{tgt}^{N} \ge D_{opt}^{N}(n, L)$, the width of the feasible region for the $i$-th repeater ($i \le n$) of the net $N$ is

$$W_{FR} = 2 \sqrt{\frac{2\,\bigl(D_{tgt}^{N} - D_{opt}^{N}(n, L)\bigr)\,(n - i + 1)\,i}{rc\,(n+1)}}$$

Proof: Let $D_{opt}^{N}(n, L)$ denote the optimal delay for a net $N$ of length $L$ inserted with $n$ repeaters. Also, let $D_{opt}^{N}(n, L + W)$ denote the optimal delay for the same net (i.e., same $R_d$ and $C_s$) but with a difference of $W$ in its wirelength. First, we consider the difference in the optimal delays of the two configurations:

$$\begin{aligned}
D_{opt}^{N}(n, L + W) - D_{opt}^{N}(n, L)
&= D(R_d, C_b, x_{L+W}) - D(R_d, C_b, x_L) \\
&\quad + (n-1)\,\bigl[ D(R_b, C_b, y_{L+W}) - D(R_b, C_b, y_L) \bigr] \\
&\quad + D(R_b, C_s, z_{L+W}) - D(R_b, C_s, z_L)
\end{aligned}$$

From Eqns. (2)-(4),

$$x_{L+W} - x_L = y_{L+W} - y_L = z_{L+W} - z_L = \frac{W}{n+1}$$

Therefore, we can rewrite the difference between the optimal delays as:

$$D_{opt}^{N}(n, L + W) - D_{opt}^{N}(n, L) = \Delta\!\left(R_d, C_b, \tfrac{W}{n+1}, x_L\right) + (n-1)\,\Delta\!\left(R_b, C_b, \tfrac{W}{n+1}, y_L\right) + \Delta\!\left(R_b, C_s, \tfrac{W}{n+1}, z_L\right) \tag{7}$$

We split $N$ into two subnets $N_1$ and $N_2$ at the $i$-th repeater. Therefore, the length of $N_1$ is $L_1 = x_L + (i-1)\,y_L$ and the sink of $N_1$ is the $i$-th repeater of $N$. The length of $N_2$ is $L_2 = (n-i)\,y_L + z_L$ and the driver of $N_2$ is the $i$-th repeater. There are $i - 1$ repeaters in $N_1$ and $n - i$ repeaters in $N_2$. As $y_{L_1}$, $z_{L_1}$, $x_{L_2}$, and $y_{L_2}$ correspond to the distances between optimally placed repeaters in $N$, the following equalities hold:

$$y_{L_1} = z_{L_1} = x_{L_2} = y_{L_2} = y_L \tag{8}$$

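Furthermore,

$$x_{L_1} = x_L \quad \text{and} \quad z_{L_2} = z_L \tag{9}$$

Let the $i$-th repeater of $N$ be displaced by a distance $W$ from its optimal location ($W$ being positive toward the sink), and the remaining repeaters be placed optimally with respect to the displaced repeater. Therefore, the lengths of $N_1$ and $N_2$ change by $W$ and $-W$, respectively. The delay of such a configuration is

$$\hat{D}(n, L) = D_{opt}^{N_1}(i-1, L_1 + W) + D_{opt}^{N_2}(n-i, L_2 - W) + T_b$$

From Eqns. (7) and (8),

$$D_{opt}^{N_1}(i-1, L_1 + W) - D_{opt}^{N_1}(i-1, L_1) = \Delta\!\left(R_d, C_b, \tfrac{W}{i}, x_{L_1}\right) + (i-1)\,\Delta\!\left(R_b, C_b, \tfrac{W}{i}, y_{L_1}\right)$$

$$D_{opt}^{N_2}(n-i, L_2 - W) - D_{opt}^{N_2}(n-i, L_2) = (n-i)\,\Delta\!\left(R_b, C_b, \tfrac{-W}{n-i+1}, y_{L_2}\right) + \Delta\!\left(R_b, C_s, \tfrac{-W}{n-i+1}, z_{L_2}\right)$$

Therefore,

$$\begin{aligned}
\hat{D}(n, L) - D_{opt}^{N}(n, L)
&= \Delta\!\left(R_d, C_b, \tfrac{W}{i}, x_{L_1}\right) + (i-1)\,\Delta\!\left(R_b, C_b, \tfrac{W}{i}, y_{L_1}\right) \\
&\quad + (n-i)\,\Delta\!\left(R_b, C_b, \tfrac{-W}{n-i+1}, y_{L_2}\right) + \Delta\!\left(R_b, C_s, \tfrac{-W}{n-i+1}, z_{L_2}\right)
\end{aligned}$$

Expanding the $\Delta$'s and substituting $x_L$ for $x_{L_1}$, $y_L$ for $y_{L_1}$ and $y_{L_2}$, and $z_L$ for $z_{L_2}$, we obtain

$$\begin{aligned}
\hat{D}(n, L) - D_{opt}^{N}(n, L)
&= i\,\frac{rc}{2} \left(\frac{W}{i}\right)^{2} + \frac{W}{i}\,(rc\,x_L + R_d c + r C_b) + (i-1)\,\frac{W}{i}\,(rc\,y_L + R_b c + r C_b) \\
&\quad + (n-i+1)\,\frac{rc}{2} \left(\frac{W}{n-i+1}\right)^{2} - (n-i)\,\frac{W}{n-i+1}\,(rc\,y_L + R_b c + r C_b) \\
&\quad - \frac{W}{n-i+1}\,(rc\,z_L + R_b c + r C_s)
\end{aligned}$$

Making use of the equalities in Lemma 1, the terms linear in $W$ cancel and

$$\hat{D}(n, L) - D_{opt}^{N}(n, L) = \frac{rc}{2}\,W^{2} \left( \frac{1}{i} + \frac{1}{n-i+1} \right)$$

Therefore, the maximum displacement for the $i$-th repeater without violating the timing constraint $D_{tgt}^{N}$ is

$$W = \sqrt{\frac{2\,\bigl(D_{tgt}^{N} - D_{opt}^{N}(n, L)\bigr)\,(n-i+1)\,i}{(n+1)\,rc}}$$

Hence, $W_{FR}$ for the $i$-th repeater is

$$W_{FR} = 2 \sqrt{\frac{2\,\bigl(D_{tgt}^{N} - D_{opt}^{N}(n, L)\bigr)\,(n-i+1)\,i}{(n+1)\,rc}}$$
□

Theorem 2 For $D_{tgt}^{N} \ge D_{opt}^{N}(n, L)$, the width of the independent feasible region for the $i$-th repeater ($i \le n$) of the net $N$ is

$$W_{IFR} = 2 \sqrt{\frac{D_{tgt}^{N} - D_{opt}^{N}(n, L)}{rc\,(2n - 1)}}$$

Proof: Consider the net $N$ of length $L$ having $n$ repeaters as shown in Figure 2. Let $w_i$ be the displacement of the $i$-th repeater from its optimal position ($w_i$ being positive toward the sink). Let $w_0 = 0$ and $w_{n+1} = 0$ denote the displacements of the source and the sink from their original positions, respectively. The delay for this configuration of repeaters is

$$\begin{aligned}
D &= D(R_d, C_b, x_L) + \Delta(R_d, C_b, w_1 - w_0, x_L) \\
&\quad + \sum_{i=1}^{n-1} \bigl[ D(R_b, C_b, y_L) + \Delta(R_b, C_b, w_{i+1} - w_i, y_L) \bigr] \\
&\quad + D(R_b, C_s, z_L) + \Delta(R_b, C_s, w_{n+1} - w_n, z_L) + n T_b \\
&= D_{opt}^{N}(n, L) + \frac{rc}{2} \sum_{i=1}^{n+1} (w_i - w_{i-1})^{2} + (rc\,x_L + R_d c + r C_b)\,(w_1 - w_0) \\
&\quad + \sum_{i=1}^{n-1} (rc\,y_L + R_b c + r C_b)\,(w_{i+1} - w_i) + (rc\,z_L + R_b c + r C_s)\,(w_{n+1} - w_n)
\end{aligned}$$

Making use of the fact that the displacements $w_0 = w_{n+1} = 0$, and that the second last term of the preceding equation is a telescopic sum, we derive the following expression for the last three terms of the preceding equation:

$$(rc\,x_L + R_d c + r C_b)\,w_1 + (rc\,y_L + R_b c + r C_b)\,(w_n - w_1) - (rc\,z_L + R_b c + r C_s)\,w_n$$

which sum to zero because of the equalities in Lemma 1.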
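The identities used in these proofs can be checked numerically with a small Python sketch. Everything specific in it is an assumption made for illustration: the parameter values are arbitrary rather than taken from Table 1, the Elmore delay `D` is assumed to have the form $\frac{rc}{2} l^2 + (Rc + rC)\,l + RC$ (which is consistent with Eqn. (5)), and `spacing` uses closed forms for $x_L$, $y_L$, and $z_L$ chosen so that the equalities of Lemma 1 hold, since Eqns. (2)-(4) are not reproduced here. The alternating displacement pattern used to exercise Theorem 2 is likewise our own choice of test case.

```python
from math import isclose, sqrt

# Arbitrary illustrative values (assumptions, not the entries of Table 1).
r, c = 0.1, 0.2            # wire resistance and capacitance per unit length
R_d, R_b = 150.0, 180.0    # driver and repeater output resistance
C_b, C_s = 20.0, 30.0      # repeater and sink input capacitance
T_b = 36.4                 # intrinsic repeater delay


def D(R, C, l):
    """Assumed Elmore delay of a wire of length l driven by R and loaded by C."""
    return 0.5 * r * c * l * l + (R * c + r * C) * l + R * C


def spacing(n, L, Rd, Cs):
    """Assumed optimal segment lengths (x, y, z) for n >= 1 repeaters on length L."""
    dR, dC = (R_b - Rd) / r, (Cs - C_b) / c
    x = (L + n * dR + dC) / (n + 1)
    y = (L - dR + dC) / (n + 1)
    z = (L - dR - n * dC) / (n + 1)
    return x, y, z


def d_opt(n, L, Rd, Cs):
    """Delay with n optimally spaced repeaters between driver Rd and sink Cs."""
    if n == 0:
        return D(Rd, Cs, L)
    x, y, z = spacing(n, L, Rd, Cs)
    return D(Rd, C_b, x) + (n - 1) * D(R_b, C_b, y) + D(R_b, Cs, z) + n * T_b


def displaced_delay(n, L, i, W):
    """Delay when repeater i moves by W and the rest are re-optimised around it."""
    x, y, z = spacing(n, L, R_d, C_s)
    L1 = x + (i - 1) * y        # length of subnet N1 (driver to repeater i)
    L2 = (n - i) * y + z        # length of subnet N2 (repeater i to sink)
    return d_opt(i - 1, L1 + W, R_d, C_b) + d_opt(n - i, L2 - W, R_b, C_s) + T_b


def w_fr(n, L, i, D_tgt):
    """Feasible-region width of repeater i as stated in Theorem 1."""
    slack = D_tgt - d_opt(n, L, R_d, C_s)
    return 2 * sqrt(2 * slack * (n - i + 1) * i / (r * c * (n + 1)))


def w_ifr(n, L, D_tgt):
    """Independent-feasible-region width as stated in Theorem 2."""
    slack = D_tgt - d_opt(n, L, R_d, C_s)
    return 2 * sqrt(slack / (r * c * (2 * n - 1)))


def delay_with_displacements(n, L, w):
    """Delay when repeater k (1-based) is displaced by w[k-1] from its optimum."""
    x, y, z = spacing(n, L, R_d, C_s)
    pos = [0.0] + [x + k * y + w[k] for k in range(n)] + [L]
    total = n * T_b
    for k in range(n + 1):
        R = R_d if k == 0 else R_b
        C = C_s if k == n else C_b
        total += D(R, C, pos[k + 1] - pos[k])
    return total


if __name__ == "__main__":
    n, L, i = 4, 10000.0, 2
    D_tgt = 1.1 * d_opt(n, L, R_d, C_s)
    # Moving repeater i to the edge of its feasible region should use up the slack.
    print(isclose(displaced_delay(n, L, i, w_fr(n, L, i, D_tgt) / 2), D_tgt))
    # Alternating +/- W_IFR/2 displacements should also land on the target delay.
    half = w_ifr(n, L, D_tgt) / 2
    w = [half if k % 2 == 0 else -half for k in range(n)]
    print(isclose(delay_with_displacements(n, L, w), D_tgt))
```

Under these assumptions both checks are expected to print `True`; with other parameter values the same identities should hold as long as the assumed closed forms for the spacings are used.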


More information

Approximation Algorithms for Wavelength Assignment

Approximation Algorithms for Wavelength Assignment Approximation Algorithms for Wavelength Assignment Vijay Kumar Atri Rudra Abstract Winkler and Zhang introduced the FIBER MINIMIZATION problem in [3]. They showed that the problem is NP-complete but left

More information

Accepting that the simple base case of a sp graph is that of Figure 3.1.a we can recursively define our term:

Accepting that the simple base case of a sp graph is that of Figure 3.1.a we can recursively define our term: Chapter 3 Series Parallel Digraphs Introduction In this chapter we examine series-parallel digraphs which are a common type of graph. They have a significant use in several applications that make them

More information

6. Dicretization methods 6.1 The purpose of discretization

6. Dicretization methods 6.1 The purpose of discretization 6. Dicretization methods 6.1 The purpose of discretization Often data are given in the form of continuous values. If their number is huge, model building for such data can be difficult. Moreover, many

More information

On Multi-Stack Boundary Labeling Problems

On Multi-Stack Boundary Labeling Problems On Multi-Stack Boundary Labeling Problems MICHAEL A. BEKOS 1, MICHAEL KAUFMANN 2, KATERINA POTIKA 1, ANTONIOS SYMVONIS 1 1 National Technical University of Athens School of Applied Mathematical & Physical

More information

Thermal-aware Steiner Routing for 3D Stacked ICs

Thermal-aware Steiner Routing for 3D Stacked ICs Thermal-aware Steiner Routing for 3D Stacked ICs Mohit Pathak and Sung Kyu Lim School of Electrical and Computer Engineering Georgia Institute of Technology {mohitp, limsk}@ece.gatech.edu Abstract In this

More information

VLSI Physical Design: From Graph Partitioning to Timing Closure

VLSI Physical Design: From Graph Partitioning to Timing Closure Chapter 5 Global Routing Original uthors: ndrew. Kahng, Jens, Igor L. Markov, Jin Hu Chapter 5 Global Routing 5. Introduction 5.2 Terminology and Definitions 5.3 Optimization Goals 5. Representations of

More information

Laboratory 6. - Using Encounter for Automatic Place and Route. By Mulong Li, 2013

Laboratory 6. - Using Encounter for Automatic Place and Route. By Mulong Li, 2013 CME 342 (VLSI Circuit Design) Laboratory 6 - Using Encounter for Automatic Place and Route By Mulong Li, 2013 Reference: Digital VLSI Chip Design with Cadence and Synopsys CAD Tools, Erik Brunvand Background

More information

Very Large Scale Integration (VLSI)

Very Large Scale Integration (VLSI) Very Large Scale Integration (VLSI) Lecture 6 Dr. Ahmed H. Madian Ah_madian@hotmail.com Dr. Ahmed H. Madian-VLSI 1 Contents FPGA Technology Programmable logic Cell (PLC) Mux-based cells Look up table PLA

More information

THE continuous increase of the problem size of IC routing

THE continuous increase of the problem size of IC routing 382 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 24, NO. 3, MARCH 2005 MARS A Multilevel Full-Chip Gridless Routing System Jason Cong, Fellow, IEEE, Jie Fang, Min

More information

A CSP Search Algorithm with Reduced Branching Factor

A CSP Search Algorithm with Reduced Branching Factor A CSP Search Algorithm with Reduced Branching Factor Igor Razgon and Amnon Meisels Department of Computer Science, Ben-Gurion University of the Negev, Beer-Sheva, 84-105, Israel {irazgon,am}@cs.bgu.ac.il

More information

Behavioral Array Mapping into Multiport Memories Targeting Low Power 3

Behavioral Array Mapping into Multiport Memories Targeting Low Power 3 Behavioral Array Mapping into Multiport Memories Targeting Low Power 3 Preeti Ranjan Panda and Nikil D. Dutt Department of Information and Computer Science University of California, Irvine, CA 92697-3425,

More information

VLSI Physical Design: From Graph Partitioning to Timing Closure

VLSI Physical Design: From Graph Partitioning to Timing Closure VLSI Physical Design: From Graph Partitioning to Timing Closure Chapter 5 Global Routing Original uthors: ndrew. Kahng, Jens, Igor L. Markov, Jin Hu VLSI Physical Design: From Graph Partitioning to Timing

More information

T. Biedl and B. Genc. 1 Introduction

T. Biedl and B. Genc. 1 Introduction Complexity of Octagonal and Rectangular Cartograms T. Biedl and B. Genc 1 Introduction A cartogram is a type of map used to visualize data. In a map regions are displayed in their true shapes and with

More information

c 2011 Yun Wei Chang

c 2011 Yun Wei Chang c 2011 Yun Wei Chang SINGLE-LAYER BUS ROUTING FOR HIGH-SPEED BOARDS BY YUN WEI CHANG THESIS Submitted in partial fulfillment of the requirements for the degree of Master of Science in Electrical and Computer

More information

Chapter 14 Global Search Algorithms

Chapter 14 Global Search Algorithms Chapter 14 Global Search Algorithms An Introduction to Optimization Spring, 2015 Wei-Ta Chu 1 Introduction We discuss various search methods that attempts to search throughout the entire feasible set.

More information

Cost Models for Query Processing Strategies in the Active Data Repository

Cost Models for Query Processing Strategies in the Active Data Repository Cost Models for Query rocessing Strategies in the Active Data Repository Chialin Chang Institute for Advanced Computer Studies and Department of Computer Science University of Maryland, College ark 272

More information

Module 1 Lecture Notes 2. Optimization Problem and Model Formulation

Module 1 Lecture Notes 2. Optimization Problem and Model Formulation Optimization Methods: Introduction and Basic concepts 1 Module 1 Lecture Notes 2 Optimization Problem and Model Formulation Introduction In the previous lecture we studied the evolution of optimization

More information

Interleaving Schemes on Circulant Graphs with Two Offsets

Interleaving Schemes on Circulant Graphs with Two Offsets Interleaving Schemes on Circulant raphs with Two Offsets Aleksandrs Slivkins Department of Computer Science Cornell University Ithaca, NY 14853 slivkins@cs.cornell.edu Jehoshua Bruck Department of Electrical

More information

5 The Theory of the Simplex Method

5 The Theory of the Simplex Method 5 The Theory of the Simplex Method Chapter 4 introduced the basic mechanics of the simplex method. Now we shall delve a little more deeply into this algorithm by examining some of its underlying theory.

More information

Placement Algorithm for FPGA Circuits

Placement Algorithm for FPGA Circuits Placement Algorithm for FPGA Circuits ZOLTAN BARUCH, OCTAVIAN CREŢ, KALMAN PUSZTAI Computer Science Department, Technical University of Cluj-Napoca, 26, Bariţiu St., 3400 Cluj-Napoca, Romania {Zoltan.Baruch,

More information

ADAPTIVE TILE CODING METHODS FOR THE GENERALIZATION OF VALUE FUNCTIONS IN THE RL STATE SPACE A THESIS SUBMITTED TO THE FACULTY OF THE GRADUATE SCHOOL

ADAPTIVE TILE CODING METHODS FOR THE GENERALIZATION OF VALUE FUNCTIONS IN THE RL STATE SPACE A THESIS SUBMITTED TO THE FACULTY OF THE GRADUATE SCHOOL ADAPTIVE TILE CODING METHODS FOR THE GENERALIZATION OF VALUE FUNCTIONS IN THE RL STATE SPACE A THESIS SUBMITTED TO THE FACULTY OF THE GRADUATE SCHOOL OF THE UNIVERSITY OF MINNESOTA BY BHARAT SIGINAM IN

More information

Cluster-based approach eases clock tree synthesis

Cluster-based approach eases clock tree synthesis Page 1 of 5 EE Times: Design News Cluster-based approach eases clock tree synthesis Udhaya Kumar (11/14/2005 9:00 AM EST) URL: http://www.eetimes.com/showarticle.jhtml?articleid=173601961 Clock network

More information

Optimized Implementation of Logic Functions

Optimized Implementation of Logic Functions June 25, 22 9:7 vra235_ch4 Sheet number Page number 49 black chapter 4 Optimized Implementation of Logic Functions 4. Nc3xe4, Nb8 d7 49 June 25, 22 9:7 vra235_ch4 Sheet number 2 Page number 5 black 5 CHAPTER

More information

Multi-Cluster Interleaving on Paths and Cycles

Multi-Cluster Interleaving on Paths and Cycles Multi-Cluster Interleaving on Paths and Cycles Anxiao (Andrew) Jiang, Member, IEEE, Jehoshua Bruck, Fellow, IEEE Abstract Interleaving codewords is an important method not only for combatting burst-errors,

More information

Monotone Paths in Geometric Triangulations

Monotone Paths in Geometric Triangulations Monotone Paths in Geometric Triangulations Adrian Dumitrescu Ritankar Mandal Csaba D. Tóth November 19, 2017 Abstract (I) We prove that the (maximum) number of monotone paths in a geometric triangulation

More information

Integrated Floorplanning with Buffer/Channel Insertion for Bus-Based Microprocessor Designs 1

Integrated Floorplanning with Buffer/Channel Insertion for Bus-Based Microprocessor Designs 1 Integrated Floorplanning with Buffer/ for Bus-Based Microprocessor Designs 1 Faran Rafiq Intel Microlectronics Services, 20325 NW Von Neumann Dr. AG3-318, Beaverton, OR 97006 faran.rafiq@intel.com Malgorzata

More information

An Efficient Transformation for Klee s Measure Problem in the Streaming Model Abstract Given a stream of rectangles over a discrete space, we consider the problem of computing the total number of distinct

More information

AN 567: Quartus II Design Separation Flow

AN 567: Quartus II Design Separation Flow AN 567: Quartus II Design Separation Flow June 2009 AN-567-1.0 Introduction This application note assumes familiarity with the Quartus II incremental compilation flow and floorplanning with the LogicLock

More information

EEL 4783: HDL in Digital System Design

EEL 4783: HDL in Digital System Design EEL 4783: HDL in Digital System Design Lecture 13: Floorplanning Prof. Mingjie Lin Topics Partitioning a design with a floorplan. Performance improvements by constraining the critical path. Floorplanning

More information

FastPlace 2.0: An Efficient Analytical Placer for Mixed- Mode Designs

FastPlace 2.0: An Efficient Analytical Placer for Mixed- Mode Designs FastPlace.0: An Efficient Analytical Placer for Mixed- Mode Designs Natarajan Viswanathan Min Pan Chris Chu Iowa State University ASP-DAC 006 Work supported by SRC under Task ID: 106.001 Mixed-Mode Placement

More information

ARCHITECTURE AND CAD FOR DEEP-SUBMICRON FPGAs

ARCHITECTURE AND CAD FOR DEEP-SUBMICRON FPGAs ARCHITECTURE AND CAD FOR DEEP-SUBMICRON FPGAs THE KLUWER INTERNATIONAL SERIES IN ENGINEERING AND COMPUTER SCIENCE ARCHITECTURE AND CAD FOR DEEP-SUBMICRON FPGAs Vaughn Betz Jonathan Rose Alexander Marquardt

More information

Pebble Sets in Convex Polygons

Pebble Sets in Convex Polygons 2 1 Pebble Sets in Convex Polygons Kevin Iga, Randall Maddox June 15, 2005 Abstract Lukács and András posed the problem of showing the existence of a set of n 2 points in the interior of a convex n-gon

More information

Line Arrangement. Chapter 6

Line Arrangement. Chapter 6 Line Arrangement Chapter 6 Line Arrangement Problem: Given a set L of n lines in the plane, compute their arrangement which is a planar subdivision. Line Arrangements Problem: Given a set L of n lines

More information

22 Elementary Graph Algorithms. There are two standard ways to represent a

22 Elementary Graph Algorithms. There are two standard ways to represent a VI Graph Algorithms Elementary Graph Algorithms Minimum Spanning Trees Single-Source Shortest Paths All-Pairs Shortest Paths 22 Elementary Graph Algorithms There are two standard ways to represent a graph

More information

a) wire i with width (Wi) b) lij C coupled lij wire j with width (Wj) (x,y) (u,v) (u,v) (x,y) upper wiring (u,v) (x,y) (u,v) (x,y) lower wiring dij

a) wire i with width (Wi) b) lij C coupled lij wire j with width (Wj) (x,y) (u,v) (u,v) (x,y) upper wiring (u,v) (x,y) (u,v) (x,y) lower wiring dij COUPLING AWARE ROUTING Ryan Kastner, Elaheh Bozorgzadeh and Majid Sarrafzadeh Department of Electrical and Computer Engineering Northwestern University kastner,elib,majid@ece.northwestern.edu ABSTRACT

More information

Statistical Timing Analysis Using Bounds and Selective Enumeration

Statistical Timing Analysis Using Bounds and Selective Enumeration IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 22, NO. 9, SEPTEMBER 2003 1243 Statistical Timing Analysis Using Bounds and Selective Enumeration Aseem Agarwal, Student

More information

Exploiting On-Chip Data Transfers for Improving Performance of Chip-Scale Multiprocessors

Exploiting On-Chip Data Transfers for Improving Performance of Chip-Scale Multiprocessors Exploiting On-Chip Data Transfers for Improving Performance of Chip-Scale Multiprocessors G. Chen 1, M. Kandemir 1, I. Kolcu 2, and A. Choudhary 3 1 Pennsylvania State University, PA 16802, USA 2 UMIST,

More information

Introduction of ISPD18 Contest Problem

Introduction of ISPD18 Contest Problem Introduction of ISPD18 Contest Problem Detailed routing can be divided into two steps. First, an initial detailed routing step is used to generate a detailed routing solution while handling the major design

More information

γ 2 γ 3 γ 1 R 2 (b) a bounded Yin set (a) an unbounded Yin set

γ 2 γ 3 γ 1 R 2 (b) a bounded Yin set (a) an unbounded Yin set γ 1 γ 3 γ γ 3 γ γ 1 R (a) an unbounded Yin set (b) a bounded Yin set Fig..1: Jordan curve representation of a connected Yin set M R. A shaded region represents M and the dashed curves its boundary M that

More information

IN VLSI CIRCUITS. MUL'I'-PADs, SINGLE LAYER POWER NETT ROUTING

IN VLSI CIRCUITS. MUL'I'-PADs, SINGLE LAYER POWER NETT ROUTING MUL''-PADs, SNGLE LAYER POWER NETT ROUTNG N VLS CRCUTS H. Cai Delft University of Techndogy Departmen$ of Electrical hgineering Delft. The Neth.m?ands ABSTRACT An algorithm is presented for obtaining a

More information

An algorithm for Performance Analysis of Single-Source Acyclic graphs

An algorithm for Performance Analysis of Single-Source Acyclic graphs An algorithm for Performance Analysis of Single-Source Acyclic graphs Gabriele Mencagli September 26, 2011 In this document we face with the problem of exploiting the performance analysis of acyclic graphs

More information

22 Elementary Graph Algorithms. There are two standard ways to represent a

22 Elementary Graph Algorithms. There are two standard ways to represent a VI Graph Algorithms Elementary Graph Algorithms Minimum Spanning Trees Single-Source Shortest Paths All-Pairs Shortest Paths 22 Elementary Graph Algorithms There are two standard ways to represent a graph

More information

Design Compiler Graphical Create a Better Starting Point for Faster Physical Implementation

Design Compiler Graphical Create a Better Starting Point for Faster Physical Implementation Datasheet Create a Better Starting Point for Faster Physical Implementation Overview Continuing the trend of delivering innovative synthesis technology, Design Compiler Graphical streamlines the flow for

More information

On Increasing Signal Integrity with Minimal Decap Insertion in Area-Array SoC Floorplan Design

On Increasing Signal Integrity with Minimal Decap Insertion in Area-Array SoC Floorplan Design On Increasing Signal Integrity with Minimal Decap Insertion in Area-Array SoC Floorplan Design Chao-Hung Lu Department of Electrical Engineering National Central University Taoyuan, Taiwan, R.O.C. Email:

More information

TRIPLE patterning lithography (TPL) is regarded as

TRIPLE patterning lithography (TPL) is regarded as IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 24, NO. 4, APRIL 2016 1319 Triple Patterning Lithography Aware Optimization and Detailed Placement Algorithms for Standard Cell-Based

More information

University of California at Berkeley. Berkeley, CA the global routing in order to generate a feasible solution

University of California at Berkeley. Berkeley, CA the global routing in order to generate a feasible solution Post Routing Performance Optimization via Multi-Link Insertion and Non-Uniform Wiresizing Tianxiong Xue and Ernest S. Kuh Department of Electrical Engineering and Computer Sciences University of California

More information

Multilayer Routing on Multichip Modules

Multilayer Routing on Multichip Modules Multilayer Routing on Multichip Modules ECE 1387F CAD for Digital Circuit Synthesis and Layout Professor Rose Friday, December 24, 1999. David Tam (2332 words, not counting title page and reference section)

More information

Term Paper for EE 680 Computer Aided Design of Digital Systems I Timber Wolf Algorithm for Placement. Imran M. Rizvi John Antony K.

Term Paper for EE 680 Computer Aided Design of Digital Systems I Timber Wolf Algorithm for Placement. Imran M. Rizvi John Antony K. Term Paper for EE 680 Computer Aided Design of Digital Systems I Timber Wolf Algorithm for Placement By Imran M. Rizvi John Antony K. Manavalan TimberWolf Algorithm for Placement Abstract: Our goal was

More information

Can Recursive Bisection Alone Produce Routable Placements?

Can Recursive Bisection Alone Produce Routable Placements? Supported by Cadence Can Recursive Bisection Alone Produce Routable Placements? Andrew E. Caldwell Andrew B. Kahng Igor L. Markov http://vlsicad.cs.ucla.edu Outline l Routability and the placement context

More information

Distance Trisector Curves in Regular Convex Distance Metrics

Distance Trisector Curves in Regular Convex Distance Metrics Distance Trisector Curves in Regular Convex Distance Metrics Tetsuo sano School of Information Science JIST 1-1 sahidai, Nomi, Ishikawa, 923-1292 Japan t-asano@jaist.ac.jp David Kirkpatrick Department

More information

A Provably Good Approximation Algorithm for Rectangle Escape Problem with Application to PCB Routing

A Provably Good Approximation Algorithm for Rectangle Escape Problem with Application to PCB Routing A Provably Good Approximation Algorithm for Rectangle Escape Problem with Application to PCB Routing Qiang Ma Hui Kong Martin D. F. Wong Evangeline F. Y. Young Department of Electrical and Computer Engineering,

More information

L14 - Placement and Routing

L14 - Placement and Routing L14 - Placement and Routing Ajay Joshi Massachusetts Institute of Technology RTL design flow HDL RTL Synthesis manual design Library/ module generators netlist Logic optimization a b 0 1 s d clk q netlist

More information

How Much Logic Should Go in an FPGA Logic Block?

How Much Logic Should Go in an FPGA Logic Block? How Much Logic Should Go in an FPGA Logic Block? Vaughn Betz and Jonathan Rose Department of Electrical and Computer Engineering, University of Toronto Toronto, Ontario, Canada M5S 3G4 {vaughn, jayar}@eecgutorontoca

More information