Partitioning. Hidenori Sato Akira Onozawa Hiroaki Matsuda. BTM. Bakoglu et al. [2] proposed an H-tree structure.


Balanced-Mesh Clock Routing Technique Using Circuit Partitioning

Hidenori Sato, Akira Onozawa, Hiroaki Matsuda
NTT LSI Laboratories, 3-1, Morinosato Wakamiya, Atsugi-shi, Kanagawa Pref., Japan

Abstract

A clock routing technique using balanced-mesh routing is proposed, which incorporates the advantages of both the well-known balanced-tree and fixed-mesh routing methods. The circuit is partitioned into sub-blocks called Mesh-Routing Regions (MRs), in each of which clock skew is suppressed below a constant by mesh routing. The net from the clock source to each MR is then routed as a balanced tree. When the technique was applied to the design of an MPEG2-encoder LSI, a skew of 210 ps was achieved.

1 Introduction

In a synchronous VLSI design, circuit speed is limited by the critical path delay and clock skew. The critical path delay is the maximum path delay through the combinational circuits between two synchronous elements, i.e., flip-flops (FFs). The clock skew is the maximum difference in the delay times from the clock source to the FFs. It is said that the clock skew must be less than 5% of the critical path delay to build high-performance electronic systems, which is a very tight design constraint. Until recently, transistor delay was the dominant factor affecting performance. With deep-submicron technology, however, the interconnect delay makes up a large part of the overall delay [1]. Thus, clock skew consideration in layout design is crucial.

Several clock-routing techniques have been proposed in recent years. These approaches can be classified into those based on the Balanced-Tree Method (BTM), where the clock net is routed as a tree so that the delay times of the clock signal are balanced [2, 3, 4, 5, 6], and those based on the Fixed-Mesh Method (FMM), where the clock net is routed as a fixed mesh driven by a large buffer [7].

Much important work has been presented regarding BTM. Bakoglu et al. [2] proposed an H-tree structure.
The H-tree can reduce the skew, but the placement and size of FFs are subject to certain restrictions to keep the H-tree symmetric. Jackson et al. [3] proposed the Method of Means and Medians, which recursively partitions a circuit into two subsets and then connects the subsets considering their centers of mass. This method reduces skew even if FFs are not placed symmetrically. Minami et al. [5] proposed the Path Delay Balancing Method, which merges two clock subtrees in a bottom-up manner at a point where the skew is minimized. Tsay [4] also proposed a method of this type, and Edahiro [6] improved it to minimize the length of the clock net while keeping the skew zero. On the whole, BTM can achieve very low (possibly zero) skew, but it may increase the number of routing tracks and the delay because of detours that cannot be predicted before routing. In other words, BTM may increase area and delay time by making the skew unnecessarily small. This is especially critical in the design of chips having many FFs, e.g., MPEG2 LSIs.

FMM has been applied to the design of a DEC Alpha chip [7] (footnote 1). The entire chip is covered by a big mesh of interconnect metal that drives all the FFs. Although it achieved a clock skew of less than 300 ps in a 0.75-µm technology, the power dissipated by the clock was almost 40% of the total power dissipation of the chip, because FMM tends to overestimate the skew, leading to an increased number of interconnects and the need for a large buffer. However, a fixed mesh is easy to route and at most one routing track is required in each channel, which means that, unlike in BTM, the area increase due to clock routing is predictable.

Footnote 1: The mesh is called a grid in the original paper.

ED&TC '96, IEEE

Taking the advantages of both BTM and FMM into account, we developed a practical clock routing method called the Balanced-Mesh Method (BMM), in which the circuit is partitioned into sub-blocks and the clock net in each sub-block is routed as a mesh (see Fig. 1). Each mesh is driven by a relatively small clock buffer placed at its center row, and these buffers are routed from the clock source using a balanced tree or a minimum-delay tree.

[Figure 1: The Balanced-Mesh Method. Labels: balanced or minimum-delay tree; mesh routing for clock net; layout block; cell rows; clock source of the chip; clock source of mesh routing; clock buffer.]

The circuit is partitioned so that each sub-block's skew and clock-signal delay time are bounded by given allowances, based on the relationship among the clock skew, delay time, and FF density in a chip. This relationship is determined beforehand by circuit simulation. Since the simulation is performed under the worst condition, the clock skew and the delay time in an actual design can be made lower than the given allowances.

Since the area covered by the meshes and the buffers driving them are in general smaller in BMM than in FMM, BMM can reduce the power dissipation of the clock signal significantly. Furthermore, it retains the advantages of FMM, i.e., easy routing with at most one routing track in each channel. Another important point is that, in general, the delay with mesh routing is smaller than with tree-based approaches [8, 9].

BMM was applied to several circuits, including an MPEG2 LSI. The results show that BMM can be used to design circuits with clock frequencies above 100 MHz with almost no area overhead.

Section 2 discusses the effect of partitioning on the clock skew and delay time and describes the layout flow using BMM. Section 3 overviews the partitioning algorithm, and Section 4 shows the experimental results and the method's effect.

2 Balanced-Mesh Method

This section presents an empirical model for the clock skew and delay time of mesh routing and the layout flow based on that model.

[Figure 2: Simulation Model for Mesh Routing. Labels: input capacitance of FFs.]
2.1 Characterization of Mesh Routing

A mesh for a clock net is a combination of a loop, horizontal interconnections in every other channel, and a vertical center interconnection. The clock source is placed at the center of the loop (see Fig. 2). A buffer is connected to this source.

We first simulated the effect of mesh routing with HSPICE [10] under the following conditions: 1) the aspect ratio of the loop is 1, 2) the input capacitances of the FFs are localized to realize the worst case, and 3) the clock interconnects are twice as thick as the others. The interconnections were modeled by RC ladders. We examined the clock skew S and delay time d in relation to the number of FFs N and the mesh area A, because it is reasonable to assume that the clock-signal delay time depends on the total clock net length, which depends on A, and on the amount of input capacitance, which is proportional to N.

The simulation results are summarized in Fig. 3. Figure 3 (a) shows the constant-skew curves in terms of N and A. The hatched area shows the region where the skew is less than a constant $S_1$. Figure 3 (b) shows the constant-delay curves in terms of N and A. Here the delay time is the average of the delay times from the clock source to the FFs. As a result, S and d are approximately represented as follows:

$$S = \alpha \cdot g(N, A) + \mathrm{Const.}, \qquad (1)$$

$$d = \beta N + \gamma A + \mathrm{Const.} \qquad (2)$$

Here $g(N, A)$ denotes the term in N and A of eq. (1); only the fragment "(N 2 )" of it survives in this transcription, so its exact form is not reproduced.
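As an illustration of how the constants of eq. (2) might be extracted from simulation results, the following minimal sketch fits d = beta*N + gamma*A + Const. by ordinary least squares. The fitting procedure, the function names and the sample values are assumptions made for this example; the text only states that the constants are determined from simulations.

```python
def solve_linear(a, b):
    """Solve a small dense system a x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    # Work on an augmented copy so the caller's data is untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: samples do not determine the constants")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_delay_model(samples):
    """Least-squares fit of d = beta*N + gamma*A + const to (N, A, d) samples."""
    rows = [(float(n), float(a), 1.0) for n, a, _ in samples]
    d = [float(s[2]) for s in samples]
    # Normal equations (X^T X) p = X^T d with p = (beta, gamma, const).
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xtd = [sum(r[i] * di for r, di in zip(rows, d)) for i in range(3)]
    return tuple(solve_linear(xtx, xtd))


def delay(n_ff, area, beta, gamma, const):
    """Evaluate eq. (2) for a mesh with n_ff flip-flops and the given area."""
    return beta * n_ff + gamma * area + const


# Hypothetical samples (N, A in mm^2, d in ns), generated from
# beta = 0.002, gamma = 0.05, const = 0.3. They are not data from the paper.
SAMPLES = [(100, 2.0, 0.6), (200, 2.0, 0.8), (100, 4.0, 0.7), (300, 6.0, 1.2)]
beta, gamma, const = fit_delay_model(SAMPLES)
```

With real HSPICE samples the fit would not be exact, and the residuals would indicate how well the linear form of eq. (2) holds for a given technology and buffer.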

[Figure 3: (a) Constant-skew curves and (b) constant-delay curves in terms of N and A.]

Here, α, β, and γ are positive constants determined from the simulations. These constants depend only on the technologies used and the drivability of the clock buffer, not on the circuit type. That is, by carefully partitioning the circuit into sub-blocks having at most N FFs and area A that satisfy eqs. (1) and (2) for given S and d, we can suppress the skew and delay in each partitioned block. A sub-block region that ensures this skew bound is called a Mesh-Routing Region (MR).

Given curves (1) and (2), the constraints on N and A that ensure given S and d are determined as follows. As shown in Fig. 4, N and A must lie below both the constant-skew curve for S and the constant-delay curve for d. In addition, we need a lower bound on the delay times so that they can be equalized to reduce the skew among MRs. N and A realizing an MR must therefore be bounded by these two delay curves and a skew curve. We further shrink this region into the hatched rectangle shown in Fig. 4 to keep a safe margin from the skew and delay boundaries. This is done as follows. We draw a line indicating the FF density, i.e., the ratio N/A, of the chip. Points (a) and (b) are then determined as the intersections of this line with the constant-skew and constant-delay curves. The region for an MR is the rectangle determined by these two points, in which MRs have a density close to that of the chip and a safe margin from the delay and skew boundaries. In practice, this rectangle is also large enough to maintain the freedom of the partitioning.
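The construction of the rectangle can be sketched numerically. The sketch below assumes that point (a) is where the density line meets the delay lower-bound curve and that point (b) is the first upper boundary reached along the line (constant-skew curve or delay upper bound); this is one reading of Fig. 4, not a statement from the text. The skew function used in the example is a made-up monotone placeholder and is not eq. (1); all names and numbers are hypothetical.

```python
def hit_on_density_line(curve, target, rho, area_hi, tol=1e-9):
    """Return (N, A) on the line N = rho * A where curve(N, A) reaches target.

    Assumes curve increases along the line and that the crossing lies between
    A = 0 and A = area_hi; bisection is used so no closed form is required.
    """
    lo, hi = 0.0, area_hi
    if curve(rho * hi, hi) < target:
        raise ValueError("target not reached on the line before area_hi")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if curve(rho * mid, mid) < target:
            lo = mid
        else:
            hi = mid
    area = 0.5 * (lo + hi)
    return rho * area, area


def mr_rectangle(skew, delay, s_max, d_min, d_max, rho, area_hi):
    """Rectangle (N_min, N_max, A_min, A_max) spanned by points (a) and (b)."""
    # Point (a): the density line meets the delay lower-bound curve.
    n_a, a_a = hit_on_density_line(delay, d_min, rho, area_hi)
    # Point (b): the first upper boundary reached, skew or delay upper bound.
    n_s, a_s = hit_on_density_line(skew, s_max, rho, area_hi)
    n_d, a_d = hit_on_density_line(delay, d_max, rho, area_hi)
    n_b, a_b = (n_s, a_s) if a_s <= a_d else (n_d, a_d)
    if a_b <= a_a:
        raise ValueError("empty MR region: the allowances are inconsistent")
    return n_a, n_b, a_a, a_b


def toy_delay(n, a):
    """Eq. (2) with hypothetical constants (ns)."""
    return 0.002 * n + 0.05 * a + 0.3


def toy_skew(n, a):
    """Placeholder monotone skew model (ps); NOT the paper's eq. (1)."""
    return 0.5 * n * a


# Chip density of 50 FFs/mm^2, skew allowance 400 ps, delay window 0.6 to 1.2 ns.
n_min, n_max, a_min, a_max = mr_rectangle(
    toy_skew, toy_delay, s_max=400.0, d_min=0.6, d_max=1.2, rho=50.0, area_hi=100.0
)
```

In this toy setting the skew curve is reached before the delay upper bound, so the skew allowance determines the upper corner of the rectangle.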
Formally, an MR is defined as follows:

Definition 2.1 Sub-block SB(i) having N(i) FFs and area A(i) is an MR if it satisfies

$$N_{\min} \le N(i) \le N_{\max}, \qquad (3)$$

$$A_{\min} \le A(i) \le A_{\max}, \qquad (4)$$

where $N_{\min}$, $N_{\max}$, $A_{\min}$ and $A_{\max}$ are constants that represent the lower and upper limits of N(i) and A(i), respectively, determined as shown in Fig. 4.

[Figure 4: MR Constraints. Labels: density of chip; delay upper bound; delay lower bound; range of delay time; points (a) and (b); $N_{\min}$, $N_{\max}$, $A_{\min}$, $A_{\max}$.]

We call (3) and (4) the MR constraints. These ranges can be widened further by providing clock buffers of several sizes.

2.2 Layout Flow using BMM

The layout flow using BMM is as follows:

(Step 1.) MR Partitioning,
(Step 2.) Floorplan,
(Step 3.) Placement,
(Step 4.) Global Routing of Clock Net,
(Step 5.) Global Routing of Other Signal Nets,
(Step 6.) Detailed Routing.

Before layout, we first determine eqs. (1) and (2) from simulation results. This is done only once for a technology. The MR constraints are then determined according to the required performance of a particular design.

In Step 1, the circuit is partitioned into MRs so that they satisfy the MR constraints, and then a clock buffer, the size of which depends on the values of N and A, is selected for each MR (see Fig. 5). We perform this step before placement because the placement quality is not degraded very much by the partitioning, since an MR is fairly large and can contain a few logic blocks. Further, it is difficult to find room for clock buffers after placement because of their size.

[Figure 5: Image of MR Partitioning. Labels: MR1, MR2; cell; clock buffer; inter-MR clock net; intra-MR clock net.]

Next, the floorplan of the MRs is performed considering the number of nets crossing the MRs. This is followed by cell placement without taking the clock net into consideration. Cells are placed within the MR they belong to. The clock buffer is positioned at either the left or the right edge of the center row of the corresponding MR so that it adjoins the power lines, because the clock buffer dissipates a lot of power.

In Step 4, the routing is classified into two types: intra-MR and inter-MR. Intra-MR routing is the mesh routing in each MR. Inter-MR routing is minimum-delay-time routing from the clock source to all MRs, because even minimum-delay-time routing gives a small clock skew (see Table 2) when the circuit is not very large. A balanced tree can also be used, since the number of MRs is small and very little area is wasted on detours. The global routing of other signal nets is performed after this. The last step is channel routing.

We developed an MR-partitioning program and a clock-global-routing program. The MR-partitioning algorithm is described in detail in the next section.

3 The MR-Partitioning Algorithm

3.1 Problem Formulation

Let G(V, E) be a hypergraph with vertex set V and edge set E, where $v \in V$ and $e \in E$ correspond to a cell and a net of the given circuit, respectively. Hereafter, we use the graph notation and the circuit interchangeably. The set V is partitioned into $N_p$ subsets ($N_p$ is defined later in this section). Let $G_i = G_i(V_i, E_i)$ be the graph of a partition of G, where $\bigcup_{i=1}^{N_p} V_i = V$ and $V_i \cap V_j = \emptyset$ ($i \ne j$; $i, j = 1, \ldots, N_p$).

The MR-partitioning problem:

Minimize: $\sum_{i,j} C(i, j)$
Subject to: MR constraints (3) and (4),

where C(i, j) is the number of nets connecting $G_i$ and $G_j$ ($i, j = 1, \ldots, N_p$). $N_p$ is

$$N_p = \text{either } \lfloor N_{all} / N_{target} \rfloor \text{ or } \lceil N_{all} / N_{target} \rceil, \qquad (5)$$

where $N_{all}$ and $N_{target}$ are the total number of FFs and the expected number of FFs per MR, respectively; the default value of $N_{target}$ is the mean of $N_{\min}$ and $N_{\max}$. As shown in eq. (5), $N_p$ can be calculated in one of two ways. The one satisfying the following feasibility constraints is used:

$$N_p \cdot N_{\min} \le N_{all} \le N_p \cdot N_{\max}, \qquad (6)$$

$$N_p \cdot A_{\min} \le A \le N_p \cdot A_{\max}. \qquad (7)$$

Here, A is the area of the entire circuit. If neither choice in eq. (5) satisfies eqs. (6) and (7), the partitioning is infeasible, although this is very rare in practice. If the partitioning is feasible, each $G_i$ can have approximately $N_{target}$ FFs.

3.2 Mincut-based Bipartitioning Technique

The circuit is partitioned into MRs by iterating a bipartitioning technique, as in the conventional mincut algorithm [11, 12]. To satisfy the MR constraints at the final level of partitioning, we impose constraints on the intermediate levels of partitioning.

Let $G^P$ be a sub-circuit at an intermediate level that will be further partitioned into $N_p^P$ ($\le N_p$) MRs. Let the number of FFs and the area of $G^P$ be $N^P$ and $A^P$, respectively. Suppose we partition $G^P$ into two sub-circuits, $G^A$ and $G^B$, that have $N_p^A = \lceil N_p^P / 2 \rceil$ MRs and $N_p^B = \lfloor N_p^P / 2 \rfloor$ MRs, respectively. We denote the number of FFs and the area of $G^A$ and $G^B$ by $N^A$, $A^A$, $N^B$ and $A^B$, respectively. We impose the following constraints on $G^A$ and $G^B$:

$$N^A = \left\lceil N^P \cdot \frac{N_p^A}{N_p^P} \right\rceil, \qquad (8)$$

$$N^B = N^P - N^A, \qquad (9)$$

$$\frac{A_{\min}}{N_{\max}} \cdot N^A \le A^A \le \frac{A_{\max}}{N_{\min}} \cdot N^A, \qquad (10)$$

$$\frac{A_{\min}}{N_{\max}} \cdot N^B \le A^B \le \frac{A_{\max}}{N_{\min}} \cdot N^B. \qquad (11)$$

It can be shown that a partitioning satisfying the above constraints is always possible, and that MRs satisfying the MR constraints are obtained in the end, if eqs. (6) and (7) hold. The MR partitioning algorithm is summarized below.
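Before turning to the algorithm itself, the bookkeeping of Sections 3.1 and 3.2 can be sketched in a few lines. This is a minimal illustration under stated assumptions: eqs. (5) to (9) are implemented as they are read from this transcription, the cut cost counts each unordered pair of parts touched by a net once, and all function names and numbers are hypothetical.

```python
import math


def choose_num_mrs(n_all, area_all, n_min, n_max, a_min, a_max, n_target=None):
    """Pick N_p from the two candidates of eq. (5) that passes eqs. (6) and (7)."""
    if n_target is None:
        n_target = (n_min + n_max) / 2  # default stated in the text
    for n_p in (math.floor(n_all / n_target), math.ceil(n_all / n_target)):
        if n_p < 1:
            continue
        ff_ok = n_p * n_min <= n_all <= n_p * n_max        # eq. (6)
        area_ok = n_p * a_min <= area_all <= n_p * a_max   # eq. (7)
        if ff_ok and area_ok:
            return n_p
    return None  # neither candidate is feasible


def split_ff_counts(n_ff_parent, n_p_parent):
    """MR counts and FF targets for the two halves of a bipartition, eqs. (8), (9)."""
    n_p_a = -(-n_p_parent // 2)   # ceil(N_p^P / 2)
    n_p_b = n_p_parent // 2       # floor(N_p^P / 2)
    n_ff_a = -(-(n_ff_parent * n_p_a) // n_p_parent)  # eq. (8), integer ceiling
    n_ff_b = n_ff_parent - n_ff_a                     # eq. (9)
    return (n_p_a, n_ff_a), (n_p_b, n_ff_b)


def cut_cost(nets, part_of):
    """Objective: for every net, count each unordered pair of parts it connects."""
    cost = 0
    for net in nets:
        k = len({part_of[cell] for cell in net})
        cost += k * (k - 1) // 2
    return cost


# Hypothetical chip: 1000 FFs on 20 mm^2, MR window 100..200 FFs and 2..4 mm^2.
n_p = choose_num_mrs(1000, 20.0, 100, 200, 2.0, 4.0)
halves = split_ff_counts(1000, 5)
example_cut = cut_cost(
    [["c1", "c2"], ["c2", "c3", "c4"], ["c1", "c4"]],
    {"c1": 0, "c2": 0, "c3": 1, "c4": 2},
)
```

Whether a net spanning three parts should contribute three to the objective (as here) or be counted differently is not settled by the text, so `cut_cost` should be adapted if another convention is intended.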

Algorithm 3.1: MR Partitioning

begin
    initial clustering;
    while $N_p^P > 1$ do
    begin
        set partitioning constraints (8), (9), (10) and (11);
        initial partitioning;
        while number of clusters in $G^A$ and $G^B$ is more than 2 do
        begin
            swap clusters [11];
            hierarchical clustering;
        end
        decompose clusters to initial cluster level;
        choose next sub-circuit if $N_p^P > 1$;
    end
end

In initial clustering, the cells are clustered according to the conventional definition of connectivity. Each cluster is forced to include at most one FF. With this clustering, cells having strong connectivity can be put in one MR and the complexity of the partitioning can be reduced. In initial partitioning, the clusters are partitioned into $G^A$ and $G^B$ under constraints (8), (9), (10) and (11), maintaining the logical structure as much as possible. The partitioning is improved in the swap-clusters step, which is based on the well-known mincut algorithm [11]. Note that the moving of clusters is restricted by constraints (8), (9), (10) and (11). After the improvement, the circuit is further clustered in the hierarchical-clustering step with an increased cluster size, as in [12], and the partitioning is iterated until the number of clusters in $G^A$ and $G^B$ becomes one. As shown in Fig. 6, this algorithm can reduce the number of local optimum solutions.

4 Experimental Results

The MR-partitioning program and the clock-global-routing program were implemented in C on a Sun SPARCstation 2. BMM was tested on three ISCAS benchmark circuits, two industrial circuits, data1 and data2 (see Table 1), and an MPEG2 LSI.

First, we show the results for the ISCAS data. We assumed a 0.5-µm CMOS technology and set MR constraints (3) and (4) to produce a skew below 180 ps, which enables designs running at 100 MHz or faster. The clock interconnect width was twice the normal width, so the resistance was half the normal interconnect resistance.
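The two numbers in this setup can be checked with simple arithmetic. The sketch below assumes, for illustration only, that the critical path delay equals one clock period, so that the 5% guideline from the Introduction becomes a fraction of the period; the resistance formula is the usual sheet-resistance relation R = R_sheet * length / width. Function names and the wire dimensions are hypothetical.

```python
def skew_budget_ps(clock_mhz, fraction=0.05):
    """Skew allowance in ps, taking the critical path delay as one clock period."""
    period_ps = 1e6 / clock_mhz  # period of an f-MHz clock, in picoseconds
    return fraction * period_ps


def within_budget(skew_ps, clock_mhz, fraction=0.05):
    """True if the skew fits the allowance for the given clock frequency."""
    return skew_ps <= skew_budget_ps(clock_mhz, fraction)


def wire_resistance_ohm(sheet_ohm_per_sq, length_um, width_um):
    """Resistance of a uniform wire: sheet resistance times number of squares."""
    return sheet_ohm_per_sq * length_um / width_um


# 100 MHz gives a 10 ns period, so 5% of it is about 500 ps; the 180 ps
# target used in the experiments leaves a wide margin under this assumption.
budget_100mhz = skew_budget_ps(100.0)
target_ok = within_budget(180.0, 100.0)

# Doubling the width halves the resistance (hypothetical 0.05 ohm/sq, 1 mm wire).
r_min_width = wire_resistance_ohm(0.05, 1000.0, 1.0)
r_dbl_width = wire_resistance_ohm(0.05, 1000.0, 2.0)
```

If the critical path is shorter than the clock period, the allowance shrinks proportionally, so the margin computed here is an upper estimate.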
Table 2 shows the skew from each clock buffer to each FF (Intra-MR), the skew from the clock source to the clock buffers (Inter-MR), and the skew from the clock source to each FF (Overall). The maximum clock delay times from the source to the FFs (Phase Delay) are also shown. They were calculated by HSPICE [10]. We achieved a skew below 180 ps for all the data with BMM.

[Figure 6: An example of the improvement obtained with the algorithm. Panels: (a) Improvement on Initial Clusters, (b) Hierarchical Clustering for Initial Clusters, (c) Swapping Clusters According to Gain, (d) More Hierarchical Clustering. A hatched initial cluster is a cluster including an FF. In (a), the number of nets that cross the cut-line (cuts) is 6, while that number is 3 in (c), after the hierarchical clustering of (a) and the swapping of clusters.]

[Table 1: Experimental Data. Columns: Circuit; # modules (except FFs); # FFs; # MRs. Rows: three ISCAS circuits, data1, data2. The numerical entries are not preserved in this transcription.]

[Table 2: Skew and Phase Delay with BMM. Columns: Circuit; Clock Skew (ps): Intra-MR, Inter-MR, Overall; Phase Delay (ns); Area (mm²). The numerical entries are not preserved in this transcription.]

Next, we examined the dependency of skew and phase delay on placement. All tests used the same MR partitioning results and floorplan; only the placement was different. As shown in Table 3, the differences in both skew and phase delay were at most about 30 ps even when the placement differed.

[Table 3: Skew and Phase Delay of S13207 for Different Placements. Columns: Test No.; Clock Skew (ps): Intra-MR, Inter-MR, Overall; Phase Delay (ns); Area (mm²). The numerical entries are not preserved in this transcription.]

Table 4 compares results for the proposed routing (BMM) and (pseudo) Steiner tree routing (Steiner). For each type of routing, we changed the width of the interconnect. Dbl indicates a double-width interconnect and Min indicates a minimum-width interconnect. With the proposed routing method, the skew was below the allowance, whereas it was more than 1 ns with the conventional routing method. This indicates that BMM with wide interconnects is very effective in reducing the clock skew. Table 4 also lists the interconnect capacitances and the phase delay times of the clock net. The phase delay of BMM is smaller than that of Steiner tree routing with the same width, although the capacitance of BMM is larger. In addition, the area is comparable. Since the delay times and the area of BTM must be larger than with Steiner tree routing because of the detours (footnote 2), we can say that BMM is more efficient than BTM in terms of both delay and area.

Footnote 2: For instance, Huang et al. [13] showed that the total net length of a zero-skew tree can be more than 1.5 times larger than that of a Steiner tree.

[Table 4: Skew, Phase Delay and Capacitance of S13207 for Different Routing Methods. (BMM & Dbl is the proposed method, and Steiner & Min is the conventional routing method.) Columns: Routing: Route, Width; Clock Skew (ps): Intra-MR, Inter-MR, Overall; Phase Delay (ns); Capacitance (pF); Area (mm²). Rows: BMM Dbl, Steiner Dbl, BMM Min, Steiner Min. The numerical entries are not preserved in this transcription.]

Furthermore, we examined the robustness of MR partitioning for data1 and data2 under the same MR constraints as for the ISCAS data. The result is shown in Table 5. In the table, auto means the results were obtained by the MR-partitioning program, and logic means the partitioning was determined according to the logical structure. All MRs satisfy the MR constraints. The results show that both partitioning methods give almost the same area and total net length. In addition, the results indicate that BMM gives a skew below the allowance and almost the same phase delay as long as the MRs satisfy the MR constraints.

[Table 5: Skew, Phase Delay, Area and Total Net Length for Different MR Partitioning Methods. Columns: Data; Partitioning Method; Clock Skew (ps): Intra-MR, Inter-MR, Overall; Phase Delay (ns); Area (mm²); Total Net Length (mm). Rows: data1 auto, data1 logic, data2 auto, data2 logic. The numerical entries are not preserved in this transcription.]

We applied BMM to the design of a 0.5-µm, 122-MHz MPEG2-encoder LSI whose die size is [value missing] mm² and that contains 1.6 M transistors. We set the constraints to produce a skew below 350 ps. The chip was partitioned into 14 MRs, which were routed by an approximately balanced tree from a clock-root buffer placed at the center. It is noteworthy that this root buffer is much smaller than the one used in the DEC Alpha design [7], since it only drives 14 buffers. The achieved clock skew was 210 ps, and the total interconnect capacitance of the clock nets was 64 pF. To compare this result with FMM, we calculated the interconnect capacitance of a mesh that covers this chip. It contains horizontal interconnects in every other channel and vertical interconnects routed at an interval of 1 mm. The result showed that the interconnect capacitance could be 150 pF. This indicates that BMM is much more efficient than FMM in terms of reducing the power dissipation. If the buffer size is taken into account in the above calculation, the difference in capacitance (i.e., power) would be even larger, because FMM uses a much bigger buffer than BMM does.

5 Conclusions

We have proposed a clock routing technique called the Balanced-Mesh Method (BMM), which incorporates the advantages of both the well-known Balanced-Tree Method (BTM) and the Fixed-Mesh Method (FMM). We developed an MR-partitioning program and a clock-global-routing program to implement it. The experimental results for a couple of ISCAS benchmark circuits show that BMM can achieve a small skew and phase delay regardless of the placement. BMM was applied to the design of an MPEG2 LSI with 1.6 M transistors. The clock skew was 210 ps, which enabled a clock frequency of 122 MHz. In addition, it was experimentally shown that BMM is better than BTM in reducing phase delay and provides much lower power dissipation than FMM.

In BMM, the MR constraints are determined from simulation results. In future work, we will analyze the theoretical basis of mesh routing and determine the MR constraints theoretically. Further, we will extend BMM to multiple clocks.

Acknowledgements

The authors would like to thank Toru Adachi and Ryota Kasai for helpful discussions, and Hiroshi Miyashita for HSPICE simulation.

References

[1] H. B. Bakoglu: "Circuits, Interconnections, and Packaging for VLSI", Addison-Wesley Publishing Company (1990).
[2] H. B. Bakoglu, J. T. Walker and J. D. Meindl: "A Symmetric Clock-Distribution Tree and Optimized High-Speed Interconnections for Reduced Clock Skew in ULSI and WSI Circuits", Proc. of IEEE International Conference on Computer Design, pp. 118-122 (1986).
[3] M. A. B. Jackson, A. Srinivasan and E. S. Kuh: "Clock Routing for High-Performance ICs", Proc. of ACM/IEEE Design Automation Conference, pp. 322-327 (1990).
[4] R.-S. Tsay: "Exact Zero Skew", Proc. of International Conference on Computer-Aided Design, pp. 336-339 (1991).
[5] F. Minami and M. Takano: "Clock Tree Synthesis Based on RC Delay Balancing", Proc. of IEEE Custom Integrated Circuits Conference, pp. ...-28.3.4 (1992).
[6] M. Edahiro: "A Clustering-Based Optimization Algorithm in Zero-Skew Routings", Proc. of ACM/IEEE Design Automation Conference, pp. 612-616 (1993).
[7] D. Dobberpuhl et al.: "A 200-MHz 64-b Dual-Issue CMOS Microprocessor", IEEE Journal of Solid-State Circuits, pp. 1555-1567 (1992).
[8] P. K. Chan and K. Karplus: "Computing Signal Delay in General RC Networks by Tree/Link Partitioning", IEEE Transactions on Computer-Aided Design, pp. 898-902 (1990).
[9] B. A. McCoy and G. Robins: "Non-Tree Routing", Proc. of IEEE European Design and Test Conference, pp. 430-434 (1994).
[10] "HSPICE User's Manual (HSPICE Version H92)", Meta-Software (1992).
[11] C. M. Fiduccia and R. M. Mattheyses: "A Linear-Time Heuristic for Improving Network Partitions", Proc. of ACM/IEEE Design Automation Conference, pp. 175-181 (1982).
[12] M. Edahiro and T. Yoshimura: "New Placement and Global Routing Algorithms for Standard Cell Layouts", Proc. of ACM/IEEE Design Automation Conference, pp. 642-645 (1990).
[13] D. J.-H. Huang, A. B. Kahng and C.-W. A. Tsao: "On the Bounded-Skew Clock and Steiner Routing Problems", Proc. of International Conference on Computer-Aided Design, pp. 508-513 (1995).


More information

On GPU Bus Power Reduction with 3D IC Technologies

On GPU Bus Power Reduction with 3D IC Technologies On GPU Bus Power Reduction with 3D Technologies Young-Joon Lee and Sung Kyu Lim School of ECE, Georgia Institute of Technology, Atlanta, Georgia, USA yjlee@gatech.edu, limsk@ece.gatech.edu Abstract The

More information

Routing. Robust Channel Router. Figures taken from S. Gerez, Algorithms for VLSI Design Automation, Wiley, 1998

Routing. Robust Channel Router. Figures taken from S. Gerez, Algorithms for VLSI Design Automation, Wiley, 1998 Routing Robust Channel Router Figures taken from S. Gerez, Algorithms for VLSI Design Automation, Wiley, 1998 Channel Routing Algorithms Previous algorithms we considered only work when one of the types

More information

ICS 252 Introduction to Computer Design

ICS 252 Introduction to Computer Design ICS 252 Introduction to Computer Design Lecture 16 Eli Bozorgzadeh Computer Science Department-UCI References and Copyright Textbooks referred (none required) [Mic94] G. De Micheli Synthesis and Optimization

More information

Ecient Processor Allocation for 3D Tori. Wenjian Qiao and Lionel M. Ni. Department of Computer Science. Michigan State University

Ecient Processor Allocation for 3D Tori. Wenjian Qiao and Lionel M. Ni. Department of Computer Science. Michigan State University Ecient Processor llocation for D ori Wenjian Qiao and Lionel M. Ni Department of Computer Science Michigan State University East Lansing, MI 4884-107 fqiaow, nig@cps.msu.edu bstract Ecient allocation of

More information

Three-Dimensional Integrated Circuits: Performance, Design Methodology, and CAD Tools

Three-Dimensional Integrated Circuits: Performance, Design Methodology, and CAD Tools Three-Dimensional Integrated Circuits: Performance, Design Methodology, and CAD Tools Shamik Das, Anantha Chandrakasan, and Rafael Reif Microsystems Technology Laboratories Massachusetts Institute of Technology

More information

L14 - Placement and Routing

L14 - Placement and Routing L14 - Placement and Routing Ajay Joshi Massachusetts Institute of Technology RTL design flow HDL RTL Synthesis manual design Library/ module generators netlist Logic optimization a b 0 1 s d clk q netlist

More information
