SEMICONDUCTOR industry is beginning to question the

Size: px
Start display at page:

Download "SEMICONDUCTOR industry is beginning to question the"

Transcription

1 644 IEEE TRANSACTIONS ON COMPONENTS AND PACKAGING TECHNOLOGIES, VOL. 29, NO. 3, SEPTEMBER 2006 Placement and Routing for 3-D System-On-Package Designs Jacob Rajkumar Minz, Student Member, IEEE, Eric Wong, Student Member, IEEE, Mohit Pathak, Student Member, IEEE, and Sung Kyu Lim, Senior Member, IEEE Abstract Three-dimensional (3-D) packaging via system-onpackage (SOP) is a viable alternative to system-on-chip (SOC) to meet the rigorous requirements of today s mixed signal system integration. In this article, we present the first physical design algorithms for thermal and power supply noise-aware 3-D placement and crosstalk-aware 3-D global routing. Existing approaches consider the thermal distribution, power supply noise, and crosstalk issues as an afterthought, which may require an expensive cooling scheme, more decoupling capacitors (=decap), and additional routing layers. Our goal is to overcome this problem with our thermal/decap/crosstalk-aware 3-D layout automation tools. The traditional design objectives such as performance, area, wirelength, and via are considered simultaneously to ensure high quality results. The related experimental results demonstrate the effectiveness of our approaches. Index Terms Crosstalk, mixed-signal CAD, placement and routing, power supply noise, system-on-package (SOP), thermal distribution, three-dimensional (3-D) packaging. I. INTRODUCTION SEMICONDUCTOR industry is beginning to question the viability of system-on-chip (SOC) approach due to its lowyield and high-cost problem. Recently, three-dimensional (3-D) packaging via system-on-package (SOP) [1] [3] has been proposed as an alternative solution to meet the rigorous requirements of today s mixed signal system integration. 1 The SOP is about 3-D integration of multiple functions in a miniaturized package achieved by thin film embedding. The 3-D SOP concept optimizes integrated circuits (ICs) for transistors and the package for integration of digital, RF, optical, sensor and others. It accomplishes this by both build-up SOP, similar to IC fabrication, and by stacked SOP, similar to parallel board fabrication. The uniqueness of 3-D SOP is in the highly integrated or embedded RF, optical or digital functional blocks, and sensors, in contrast to stacked ICs and stacked package. Due to the high complexity in designing large-scale SOP under multiple objectives and constraints, computer-aided design (CAD) tools have become indispensable. Thus, innovative ideas on CAD tools for Manuscript received November 26, 2004; revised July 30, This work was supported by Grants from MARCO C2S2 and NSF CAREER under CCF This work was recommended for publication by Associate Editor C. Amon upon evaluation of the reviewers comments. The authors are with the School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA USA ( limsk@ece.gatech.edu). Color versions of Figs. 6, 7, 10, and 19 are available online at Digital Object Identifier /TCAPT A special issue on SOP (vol. 27, issue 2, May 2004) provides a comprehensive survey of the state-of-the-art in SOP technology. SOP technology are crucial to fully exploit the potential of this new emerging technology. However, there exists very few tools, if not none, that handle the complexity of automatic 3-D SOP layout generation. This article tackles the three most crucial issues that threaten the reliability of SOP paradigm: thermal distribution, power supply noise, and crosstalk. First, thermal issues can no longer be ignored in high performance 3-D packages due to higher power densities and other issues. High temperatures not only require more advanced heat sinks, they also degrade circuit performance. Interconnect delay increases with temperature, which degrades circuit timing. If timing deteriorates enough, logic faults can occur. Hence thermal issues must be considered early-on in the design process. Second, the continuing trend of reducing power supply voltage has resulted in reduced noise margin, which effects reliability and may even cause functional failures due to spurious transitions. Active devices in 3-D packaging draw a large volume of instantaneous current during switching, which causes simultaneous switching noise (SSN). Third, due to the scaling down of device geometry in deep-submicron technologies, the crosstalk noise between adjacent nets has become a major concern in high performance packaging design. Increased coupling noise can cause signal delays, logic hazards and even malfunctioning of the designs. Thus controlling the level of crosstalk noise in 3-D packages chip is an important task for the designers. However, existing approaches consider these issues as an afterthought, which may require expensive cooling schemes, more decoupling capacitors decap, and more routing layers. In addition, many time-consuming iterations are required between full-length thermal/ssn/crosstalk simulation and manual layout repair until the convergence to a satisfactory result. We note that the placement of modules in 3-D SOP design has huge impact on thermal distribution and SSN while the routing of the signal nets has direct impact on crosstalk. Therefore, the goal of this article is to present the first physical layout algorithm for 3-D SOP that combines thermal/decap-aware 3-D placement and crosstalk-aware 3-D global routing algorithm. The traditional design objectives such as performance, area, wirelength, and via costs are considered simultaneously to ensure high quality results. The related experimental results demonstrate the effectiveness of our approach. The remainder of this paper is organized as follows. Section II discusses previous works. Section III presents the problem formulation and an overview of our 3-D algorithms. Section IV presents the SOP placement algorithm. Section V presents the SOP routing algorithm. Section VI presents the experimental results. We conclude in Section VII /$ IEEE

2 MINZ et al.: 3-D SYSTEM-ON-PACKAGE DESIGNS 645 II. RELATED WORKS The physical design for 3-D integration has drawn interest from both academia and industry recently. Unlike the active physical design research effort in mixed signal SOC [4], [5] and 3-D stacked die technology [6] [8], however, the research for 3-D SOP considerably lacks behind due to its short history. The physical design research for SOP has been recently pioneered by the [9] [14]. Decoupling capacitor placement is done for two-dimensional (2-D) printed circuit board (PCB) designs [15] [17]. Thermal-aware module placement algorithms are proposed for PCB designs [18], [19]. Many multichip module (MCM) routing algorithms have been proposed in the literature [20] [25]. Several works on MCM pin redistribution include [26] [28]. There exists several similarities and differences between MCM and SOP routing. In general, the design objectives and constraints in both MCM and SOP routing are quite similar, where performance, wirelength, via, layer, and crosstalk optimization are crucial. In addition, the routing process is divided into multiple steps: pin distribution, topology generation, layer assignment, and detailed routing. A notable difference between SOP and MCM routing lies in the fact that there exists multiple device layers in SOP, whereas in MCMs there is only one device layer. Therefore, nets are now connecting pins located in all intermediate layers in SOP, and the blocks in each layer behave as obstacles. In MCMs, however, all pins are located only at the top layer, and there exists no obstacles except for the wires themselves. This makes SOP or 3-D package routing problem more general than MCM routing. In SOP designs, some pins in each device layer can be distributed either to the layer above or below. In addition, some nets that connect blocks in nonneighboring device layers need to penetrate intermediate device layers. Since the blocks in these intermediate layers become obstacles, designers use routing channels in each device layer to insert feed-through vias. Thus, we have to determine which routing channel to use for feed-through vias. These routing channels are also used for intra-layer connection. Thus, after feed-through via insertion is done, we need to decide which remaining channels to use for the intra-layer connection. We then need to finish connection between the original and distributed pins. In case the pin location is not known for some soft blocks, we need to determine their location either along the boundary or underneath the block during our local routing step. III. PROBLEM FORMULATION A. SOP Layer Structure The layer structure in multilayer SOP is illustrated in Figs. 1 and 2. The placement layers 2 contain the blocks (such as ICs, embedded passives, opto-electric components, etc), which from the point of view of physical design are just rectangular blocks with pins along the boundary. The interval between two adjacent placement layers is called the routing interval. A routing interval contains a stack of routing layers sandwiched between pin distribution layers. These layers are actually routing 2 We use placement layer and device layer interchangeably. Fig. 1. Illustration of 3-D SOP, where A, D, M, and P, respectively, node analog chip, digital chip, memory module, and embedded passive blocks. Fig. 2. Illustration of the layer structure and routing resource in SOP. The block and white dots respectively denote the original and redistributed pins. The x denotes a feed-through pin for an x-net to pass through a placement layer using a routing channel. The solid, dotted, and arrowed lines, respectively, denote signal wires, vias, and feed-through vias. layer pairs so that the rectilinear partial net topologies may be assigned to them. The pin distribution layers in each routing interval are used to evenly distribute pins from the nets that are assigned to this interval. Then these evenly distributed pins are connected using the routing layer pairs. Each placement layer consists of a pair of routing layers, so routing is permitted. A feed-through via is used to connect two pin distribution layers from different routing intervals. Thus, the routing channels in each placement layer are used for two purposes: i) accommodate feed-through vias and ii) perform local routing, where limited number of intra-layer connections are made. In the SOP model the nets are classified into two categories. The nets which have all their terminals in the same placement layer are called -nets, while the ones having terminals in different placement layers are -nets. The -nets can be routed in a single routing interval or indeed within the placement layer itself. On the other hand, the -nets may span more than one routing interval. We use the floor connection graph (FCG) [29], illustrated in Fig. 3, to model the placement layer, where each routing channel becomes an edge and each channel intersection point becomes a node. In addition, each soft block becomes a node, and pin assignment edges are added to this node to connect to all adjacent channels. This model allows us to determine which boundary to

3 646 IEEE TRANSACTIONS ON COMPONENTS AND PACKAGING TECHNOLOGIES, VOL. 29, NO. 3, SEPTEMBER 2006 Fig. 3. Graph-based modeling of placement layer: (a) a connection between two hard blocks and (b) b is a soft block, and the new pin is located on the bottom boundary. Dotted lines denote pi assignment edges. use for the pins with unknown location. Each channel edge is associated with i) via capacity for feed-through vias and ii) wire capacity for local routing. In addition, pin assignment edges have pin capacity for each boundary of a soft block. The routing and pin distribution layers in each routing interval are modeled with a standard 3-D grid, where each node represents a routing region and each -direction edge represents each horizontal/vertical boundary among the regions. Each -direction edge represents a group of vias each region can accommodate. Thus, all edges in this 3-D grid are associated with capacity: and edges for wire capacity, and edges for via capacity. B. SOP Placement Problem The following are given as the input to our 3-D SOP placement problem: i) a set of blocks that represent the various active and passive components in the given SOP design, ii) width, height, and maximum switching currents for each block, iii) a netlist that specifies how the blocks are connected via electrical wires, iv), the number of placement layers in the 3-D packaging structure, and v) the number of power/ground signal layers along with the location of the power/ground pins. For each net from a given netlist, let denote the wirelength of. The wirelength is the sum of Manhattan distance in, and directions, where the direction is the height of the associated vias. 3 Let, and, respectively, denote the width, height, and area of the placement layer. Let, i.e., the product of the maximum width and the maximum height among all placement layers. Let. Let and denote the average width and height among all placement layers. Let, i.e., the sum of the deviation of the dimension among all placement layers. Last, the area objective of 3-D SOP placement, denoted, is the weighted sum of, and. Let denote the maximum temperature among all blocks. Let denote the total amount of decoupling capacitance required to suppress the simultaneous switching noise (SSN) 3 We assume that the height of vias connecting two x y layers in a routing and pin distribution layer pair is of one grid, whereas the height of feed-through vias that penetrate a placement layer is six grid. Fig. 4. Illustration of crosstalk computation between two Steiner trees. The numbers denote the coupling length. (a) Crosstalk is 7. (b) Crosstalk-free routing. under the given tolerance value. Last, the goal of the SOP Placement Problem is to find the location of each block in the placement layers such that the following cost function is minimized while SSN constraint is satisfied: 4 In addition, decaps are required to be placed adjacent to the blocks that require them. C. SOP Routing Problem For each net from a given netlist NL, let and, respectively, denote the amount of crosstalk and via associated with. Let denote the coupling length between and as illustrated in Fig. 4. We define as follows: where denote the routing layer that contains net. For each net, let denote the Elmore delay [30] at sink. Then, the maximum sink delay of net, denoted, is. The performance of a SOP global routing is to minimize the following cost function: For each routing interval in a 3-D package with placement layers, let, and, respectively, denote the top pin distribution layer pairs, routing layer pairs, and bottom pin distribution layer pairs. There exists three kinds of connections in each routing interval : top (connection between and ), middle (connection between and ), and bottom (connection between and ). We use the routing layer pairs in, and, respectively, for the top, middle, and bottom connections. We construct the routing grid, denoted, that contains all the distributed pins 4 In this weighted cost function, the area term A is normalized to A =A, where A is the footprint area of the initial solution used for Simulated Annealing. This will ensure the area term stays close to one during the annealing process. The same normalization is done for the other objectives as well wirelength, decap, and thermal. (1)

4 MINZ et al.: 3-D SYSTEM-ON-PACKAGE DESIGNS 647 Fig. 6. Top: 3-D grid of a SOP for thermal modeling. Fig. 5. Overall design flow. We place the feedthrough-vias for -nets and finish connection in each placement layer. More details are presented in Section V. from and in a 2-D grid. These pins are connected by a graph-based Steiner tree topology generation algorithm (we use an existing heuristic [31] for this purpose). The total layers used in a SOP global routing solution is given by Section V-D discusses how to compute and Section V-E discusses how to compute and. Last, the formal definition of SOP Global Routing Problem is as follows: Given a 3-D placement and netlist, generate a routing topology for each net, assign to a set of routing layers and assign all pins of to legal locations. All conflicting nets are assigned to different routing layers while satisfying various wire/via capacity constraints. The objective is to minimize the following cost function: We minimize during our layer assignment step and during our channel assignment step. is the focus during our topology generation step. The wirelength, via, and crosstalk minimization are addressed in all steps of our global router. D. Overview of the Approach An overview of our 3-D SOP physical design tool is shown in Fig. 5. The input to the flow is a module-level netlist that specifies how the SOP modules such as digital die, analog die, embedded passives, opto/mems modules, etc, are connected. Our 3-D module placement is based on the Simulated Annealing approach [32], where we examine many candidate 3-D module placement solutions and evaluate each of them using our thermal and power supply noise analyzers. More details are presented in Section IV. Our 3-D global routing is decomposed into five steps, where the input/output (IO) pins from the blocks are first evenly distributed using pin distribution layers to alleviate congestion problem. We then construct routing tree topology for all nets and assign them to the routing layers to minimize crosstalk. (2) IV. SOP PLACEMENT ALGORITHM A. Overview of the Algorithm Simulated Annealing is a very popular approach for module placement due to its high quality solutions and flexibility in handling various constraints. We extend the existing 2-D Sequence Pair scheme [33] to represent our 3-D module placement solutions. Simulated Annealing procedure starts with an initial multi-layer placement along with its cost in terms of area, wirelength, decap, and maximum temperature. We then make a random perturbation (move) to the initial solution to generate a new 3-D placement solution and measure its cost. We perform 3-D SSN analysis to compute the amount of decap needed. The algorithm does a one time set-up of the thermal matrices. These matrices are used during incremental temperature calculations to evaluate the thermal cost. The thermal modeling and evaluation is explained in Section IV-B. White space insertion (discussed in Section IV-D) is done and the increase in footprint area is estimated. If the new cost is lower than the old one, the solution is accepted; otherwise the new solution is accepted based on some probability that is dependent on temperature of the annealing schedule. We examine a predetermined number of candidate solutions at each temperature. The temperature is decreased exponentially, and the annealing process terminates when the freezing temperature is reached. The actual decap allocation is done after optimization using LP solver discussed in Section IV-D. B. 3-D Thermal Analysis We use a 3-D thermal resistance mesh, as shown in Fig. 6, for thermal analysis. Each node models a small volume of the 3-D SOP substrate, and each edge denotes the connectivity between two adjacent regions. This is equivalent to using a discrete approximation of the steady state thermal equation, where is thermal conductivity, is temperature, and is power. This results in the matrix equation. Since the matrix computation involved with thermal analysis for each candidate 3-D placement solution is expensive, we

5 648 IEEE TRANSACTIONS ON COMPONENTS AND PACKAGING TECHNOLOGIES, VOL. 29, NO. 3, SEPTEMBER 2006 (a) (b) Fig. 7. Illustration of 3-D power supply modeling. (a) Multilayer power supply network, where multiple sets of device, routing, and power supply layers are stacked together. (b) 3-D-grid modeling. where black and gray nodes respectively denote power supply and consumption nodes. Fig. 8. Illustration of SSN calculation. The dominant current source for block A is s, which is not located in the same layer. The dominant (shortest) path p carries I =6 amount of current, where I denotes the current demand of A. The block C draws current from s and s using p ;p, and p (each of these carries I =3 amount of current). The resistance of p, the overlap between p and p, contributes to the SSN at B and C. tackle this problem with the following scheme. Assuming that the thermal conductivity of the modules are similar, swapping the module locations would not change the thermal resistance matrix too much. This means that matrix only needs to be computed once in the beginning. To calculate the temperature profile of a new block configuration, the power profile needs to be updated and then multiplied by. Alternatively, a change in power profile can be defined. Multiplying and will give change in temperature profile. Adding to the old temperature profile will give the new temperature profile. These equations summarize the two methods:. Swapping two blocks usually has a small effect on the power profile, so should be sparse. This reduces the number of multiplications used by the second method at the expense of doing extra additions and subtractions. C. 3-D Power Supply Noise Modeling We model the P/G network for 3-D SOP as a 3-D grid graph as shown in Fig. 7. The edges in the grid-graph have inductive and resistive impedances. The mesh contains power-supply points and connection points, which supply and consume currents. The current distribution for the blocks is inversely proportional to the impedance of path from source. The dominant current source for a block is defined as the voltage source supplying significantly more power to the block than any other neighboring sources. The dominant path for a block is the path from the dominant supply to the block causing the most drop in voltage. It has been shown experimentally in [34] that the shortest path between the dominant current source (nearest Vdd pins) and the block offers highly accurate SSN estimation within reasonable runtime. In our 3-D SSN analysis engine, we compute dominant paths for all blocks which is dynamically updated whenever a new placement solution is evaluated in terms of SSN. Let be a dominant current path for block. Then denotes the set of dominating paths overlapping with ( includes itself). Let be the overlapping segments between path and. Let and denote the resistance and inductance of. Fig. 9. LP-based 3-D decap allocation. After the current paths and their values have been determined for all blocks, the SSN for is given by where is the current in the path, which is the sum of all currents through this path to various consumers. An illustration is shown in Fig. 8. The weight of and its rate of change are the resistive and inductive components of the path. Let denote the maximum charge drawn from the power supply by block. If, where is the noise tolerance, the decap allocated to block is given by where denotes the total number of blocks to be placed. Finally, the decap cost is given by. D. 3-D Decoupling Capacitor Placement In our LP-based 3-D decap allocation shown in Fig. 9, denotes the amount of decap allocated from white space to block. The objective is to maximize the utilization of white spaces for decap allocation. The constraint (??) limits the total allocation of a white space to its total area, where denotes the area of white space, and denotes the neighboring blocks of. The constraint (??) ensures that the total amount of decap allocated to each block does not exceed its demand, where denotes the demand.

6 MINZ et al.: 3-D SYSTEM-ON-PACKAGE DESIGNS 649 Fig. 10. Illustration of 3-D decap allocation: (a) 3-D placement, (b) X-expansion, and (c) XY -expansion, where the darker blocks denote the neighboring blocks of the decap ( = white space) inserted. Note that blocks from other layers can utilize the white space for decap insertion. Fig. 11. Pseudocode for coarse pin distribution algorithm. Unlike the 2-D decap assignment as done in [34], the neighboring blocks in 3-D case are the adjacent blocks either from the same placement layer or from neighboring layers. The assumption here is that the white space from different layers can be used to allocate decap. In order to facilitate this, we introduce predetermined parameters to control decap allocation to block from white space module. evaluates the usefulness of whitespace to be used as a decap for block. The value of decreases with increasing distance of whitespace from the module that requires decap. Hence allocations of decap from far located white spaces will be avoided. The second aspect of the formulation deals with the generation of, the optimal area that will accommodate all decap without too much penalty on the final footprint area. The decap budget for each block is determined through SSN analysis of a particular 3-D placement. The decap allocation involves several iteration between white space insertion and LP solving to reach a good solution both in terms of completeness of decap allocated and the additional increase in area. The white space is generated by expanding the floorplan in the and direction as illustrated in Fig. 10. In a multilayer placement, however, we have to take additional care to balance the expansion in each layer, so as to minimize the expansion of the total footprint area. The expansion is proportional to the decap demand of each module. In our sequence pair-based 3-D placement, we modify the horizontal and vertical constraint graphs to expand the placement into and directions, respectively. Note that we may have to iterate between white space insertion and allocation if the current expansion does not satisfy all the decap needs. In order to prevent from iterating between LP and white space insertion, we start by adding white spaces generously in the floorplan then use LP to perform compaction. In our scheme, the -expansion is controlled by parameters discussed earlier, where the existing white spaces have higher ranks than the newly inserted ones. The actual amount of expansion is determined by a parameter 1, where we increase the amount of expansion by a factor of for each block that needs decap. This ensures that the initial expansion satisfies the LP constraint. Since LP determines the individual, we use these assignment values to decide which white space insertion was not necessary and thus can be removed. As exact decap allocation is a time-consuming process, it is impractical to solve an LP problem for every candidate solution. However, our white space insertion takes, which is computationally equivalent to computing block location using longest path computation for area. Therefore, we can afford white space insertion (and the calculation of area increase due to decap insertion) at every move. We perform the actual LP-based decap allocation at the end of the annealing process as a compaction stage. V. SOP GLOBAL ROUTING A. Overview of the Approach Our 3-D global router is divided into the following five steps. 1) Pin redistribution: we first determine which set of -nets and -net segments are assigned to each routing interval. The pins from these nets are then evenly distributed in the top and bottom pin distribution layers. Our pin redistribution step is further divided into three steps: coarse pin distribution, net distribution, and detailed pin distribution. 2) Topology generation: Steiner trees are generated for all nets in each routing interval so that the performance of the routed design is optimized. 3) Layer assignment: the routed nets are assigned to a unique routing pair in the routing layer so that the total number of layers used is minimized. 4) Channel assignment: for each -net, its location of feedthrough via in the routing channel is determined. We also assign channels and finish the connections for the i-nets that are to be routed in each placement layer. 5) Local routing: we finish connections between the pins now located in the routing channels and the pins on the block boundaries. We also determine the location of pins from soft blocks. We use an existing max-cut partitioning heuristic [35] for the net distribution in step 1. For step 2, we use an existing RSA/G heuristic [31] to generate the net topologies. In addition, we use the congestion-driven rip-up-and-reroute [36] for step 5. Therefore, the focus of this paper is to develop heuristics for the remaining steps in 1, 3, and 4. B. Coarse Pin Distribution A pseudocode for our coarse pin distribution algorithm is shown in Fig. 11. First, we assign all pins in the placement layers to a nearby grid point in CP, a 2-D grid, while trying to balance the number of pins assigned to each grid point (line

7 650 IEEE TRANSACTIONS ON COMPONENTS AND PACKAGING TECHNOLOGIES, VOL. 29, NO. 3, SEPTEMBER 2006 Fig. 12. Illustration of coarse pin distribution. Pins along the external boundary are not shown for simplicity. Fig. 14. Pseudocode for SOP detailed pin distribution algorithm. Fig. 13. Illustration of the gain computation for coarse pin distribution. A net n indicated by the black node is moved from P to P. Then, g (n) =02; g = 0, and g = 01, where deg(p )=3becomes the maximum-degree partition. 1 4). An illustration is shown in Fig. 12. We impose pin capacity for each grid point so that the pins are evenly distributed in CP. Our approach is to visit the pins in a random order and find the best grid for each pin. For each pin, a grid point that is closest to the original location of and has not violated the pin capacity constraint is chosen (line 3). After this process is finished, CP serves the starting point of our min-cut placement based algorithm, where each grid point corresponds to a partition. We then iteratively improve the quality of this initial solution via move-based approach. In our new heuristic algorithm, our cost function is based on i) how far the new pin location is from the initial location, ii) total wirelength, and iii) how evenly distributed the inter-partition connections are. For each pin, wedefine the displacement gain, denoted, to represent how much the distance between the original and new location is reduced if is moved to another partition. We define the wirelength gain, denoted, to represent how much the length of the nets that contain (estimated by the half-perimeter of the bounding box) is reduced if is moved to another partition. For a partition, let denote the number of nets that have connections to. Then, the cutsize balance factor is defined to be the difference between the maximum and minimum among all partitions. We define the balance gain, denoted, to represent how much the cutsize balance factor is reduced. Our move-based multilevel mincut partitioning algorithm performs cell move based on the combined gain function Fig. 13 shows an illustration of the gain computation. In our multilevel approach, we first perform restricted multilevel clustering (line 5) that preserves the initial placement result, where two pins that are in different partitions initially are not clustered together. At each level of the cluster hierarchy from top to bottom (line 8), we compute the combined gain for each cluster and perform cluster moves. In order to compute the displacement and balance gain of a group of pins ( cluster), we add the individual displacement and balance gain of all pins in this cluster. When there is no gain at a certain level, we decompose the clusters into next lower level and perform refinement. This process continues until we obtain a solution at the bottom level (line 12). C. Detailed Pin Distribution After coarse pin distribution and net distribution are finished, we know which set of nets are assigned to each routing interval as well as their (evenly distributed) entry/exit points in pin distribution layers. However, the coarse pin distribution is done based on the 2-D grid that merged all multiple placement layers into one. The even pin distribution in this 2-D grid offers a good enough reference points for net distribution. But, it does not consider even pin distribution in each individual routing interval. In addition, it is also possible that pin capacity for each routing region in each routing interval may be violated. Therefore, the goal of detailed pin distribution is to address these problems in each routing interval so that the subsequent topology generation and layer assignment truly benefit from this even pin distribution. In addition, we use a grid large enough for each routing interval to legalize pin location, i.e., each grid point contains only one pin. Since the crosstalk minimization are addressed during the prior steps, the major focus of detailed pin distribution step is on i) how far the new location is from the original location obtained from coarse pin distribution and ii) the total wirelength. A pseudocode for our net distribution algorithm is shown in Fig. 14. Our force-directed heuristic algorithm encourages all pins from the same net to be placed closer to the center of mass while minimizing the distance between the old and new pin location. The size of the grids for detailed pin distribution is determined for each routing interval so that each pin can be assigned to a unique grid point (line 1 3). In addition, we project the coarse pin distribution result to this new set of grids (line 5). Note that there still exists overlap among the pins in at this point even though DP is usually finer than CP. In order to remove this overlap, we apply an additional force that slightly pulls each pin toward the center of mass. For each pin in a net, the displacement force (line 8) for -direction is defined as follows:

8 MINZ et al.: 3-D SYSTEM-ON-PACKAGE DESIGNS 651 Fig. 15. Illustration of layer assignment. where denotes the -coordinate of, the center of mass of, and denotes the width of the bounding box of. We compute using the -coordinates. Note that. The vector is then added to (line 9). This minor change on the original pin location helps to remove most of the overlap in while not increasing the wirelength too much. We then sort the pins based on the lexicographic order of new locations and assign each pin starting from the top-most row in the left-most column (line 10 11). Due to its simplicity, this deterministic algorithm is quite efficient and effective in reducing the additional wirelength required for pin distribution as well as the total wirelength among all nets as shown in Section VI. D. SOP Layer Assignment For each routing interval, the routing grid contains all the (redistributed) pins from the top and bottom pin distribution layers ( and ). We generate Steiner-tree based routing topology using the RSA/G heuristic [31] to connect these pins. The goal is to minimize the maximum sink delay as discussed in Section III-C. The routing layer in each routing interval consists of several layer pairs, where each pair consists of one layer for horizontal wires and another layer for vertical wires. Thus, we can assign a entire rectilinear routing tree to a routing pair. In addition, two trees that are intersecting can also be assigned to the same routing pair provided that they do not violated the wire capacity of the routing regions involved. An illustration is shown in Fig. 15. The SOP layer assignment problem is to assign each net to a routing layer pair so that the wire capacity constraint is satisfied and the total number of layer pairs used for all routing intervals is minimized. Our approach is to perform layer assignment for each routing interval independently. For each routing interval, we construct a layer constraint graph (LCG) [37], denoted LCG, as follows: corresponding to each net in we have a node in LCG. Two nodes LCG have an edge between them if a net segment and are sharing the same edge in, i.e., and are sharing the same boundary of a routing region. Then we use a node coloring algorithm to assign colors to the nodes in LCG such that no two nodes sharing an edge are assigned the same color. Let denote the total number of colors used during the node coloring, and let denote the wire capacity of the boundary in routing region. Then, the total number of layer pairs used in this routing interval is computed as follows:. Last, a node with color is assigned to layer pair. Let and, respectively, denote the maximum number of wires used among all horizontal and vertical edges in. Then, the following is a lower-bound Fig. 16. Pseudocode for SOP layer assignment algorithm. on the number of layers used in :. Fig. 16 shows the SOP layer assignment algorithm that includes our coloring heuristic. We first sort all nodes in LCG in a decreasing order of the number of their neighbors (line 3). Let denote the set of colors used by the neighbors of (line 6). We visit the nodes in the sorted order (line 5). In case there exists an used color that is not included in (line 7), we assign this color to (line 12). In case there exist multiple colors that satisfy this condition, we use the lowest color. Otherwise, we introduce a new color and assign it to (line 9 10). Last, we assign a layer pair to each net based on its color (line 13). In spite of its simplicity, this greedy algorithm provides results that are very close to the lower bound on total number of layers used as demonstrated in Section VI. E. Sop Channel Assignment Our prior topology generation and layer assignment steps focus on the connections among the distributed pins in the pin distribution layers. after these steps are finished, there remain two kinds of connections: i) connections between the original and distributed pins and ii) feed-through via insertion. For both types of connections, the routing channels in the placement layers are used. During the SOP channel assignment step, each pin in the pin distribution layer is mapped to a routing channel in the neighboring placement layer. In addition, each pin from an x-net that needs to penetrate a placement layer is also mapped to a routing channel in the placement layer. Last, we generate routing topology for each pin-to-channel connection and assign it to a routing layer pair in the pin distribution layer. Each placement layer is modeled with the floor connection graph (FCG) [29] as illustrated in Fig. 3. During the SOP local routing step, we use the congestion-driven rip-up-and-reroute [36] to i) finish the routing for the given set of two-pin connections while satisfying the wire capacity constraint and ii) decide the location of pins for soft blocks along their boundaries. An illustration is shown in Fig. 17. For each placement layer, let denote the set of pins that need to be mapped to a routing channel in. contains pins from and. The pins in are grouped into two sets: terminal pin set for the pins that have terminals in and feed-through pin set for the pairs of pins that require feed-through vias to penetrate.

9 652 IEEE TRANSACTIONS ON COMPONENTS AND PACKAGING TECHNOLOGIES, VOL. 29, NO. 3, SEPTEMBER 2006 TABLE I AREA/WIRE-DRIVEN VERSUS DECAP-DRIVEN VERSUS THERMAL-DRIVEN ALGORITHMS Fig. 17. Illustration of (a) channel assignment and (b) local routing. Let denote the set of routing channels in. Each channel is associated with the pin capacity constraint. The goal of SOP Channel Assignment Problem is to map each pin to a routing channel for 1 and finish -to- connection while satisfying the pin capacity constraint, i.e.,. For each pair of pins,we map and to the same channel. Let denote the number of layers used to finish connection between pins from and, and let denote the number of layers used to finish connection between pins from and. The objective of SOP channel assignment is to minimize the following cost function: A pseudocode for SOP channel assignment algorithm is shown in Fig. 18. We visit each placement layer and assign the feed-through pins and terminal pins to the channels. In addition, an L-shaped routing topology for each pin-to-channel is constructed in a 2-D grid (line 3). The signal delay of feed-through vias is larger than that of other types of vias. Since each channel is under a capacity constraint, it is important to assign the feed-through pins to the nearest channels first. Therefore, our strategy is to perform channel assignment for the feed-through pins first to minimize the delay of -nets that require feed-through vias. In addition, we give priority to the Fig. 18. Pseudocode for SOP channel assignment algorithm. pins that are included in long nets. Thus, we sort the pins based on the wirelength (line 4). Our heuristic algorithm assigns pins to channels based on the cost of mapping we seek a channel with the best mapping cost for a given pin (line 6 8). We compute the cost of mapping for a given pin and a channel as follows: cost room dist bend cong where room denotes the number of pins can accommodate until it violates the capacity constraint, dist is the Manhattan distance between and bend is the number of bends in the connection between and, and cong denotes the total number of existing connections along the proposed L-shaped route. We choose the channel with the maximum cost. Note that a channel is represented with a line instead of a point. Thus, distance and bend are based on the shortest connection between and any point on. Upon a pin to channel mapping, we update the usage of channel in and edges in (line 10). After the channel assignment and topology generation for all pin-to-channel connections are finished, we perform layer assignment using our coloring heuristic presented in Section V-D and compute the total number of layers used.

10 MINZ et al.: 3-D SYSTEM-ON-PACKAGE DESIGNS 653 VI. EXPERIMENTAL RESULTS We implemented our algorithms in C /STL and ran our experiments on Linux Beowulf clusters. We tested our algorithms with two sets of benchmarks. The first set is from the standard GSRC floorplan circuits [38]. The second set, named the GT benchmark, was synthesized from the IBM circuits [39], where we use our multilevel partitioner [40] to divide the gate-level netlist into multiple blocks. The GSRC benchmarks are small to medium-sized in terms of both the number of blocks and nets. 5 The GT benchmarks contain medium to large number of blocks with dense connections. A. SOP Placement Results The dimension of the area was assumed to be in mm and area per unit decap cost was chosen to be 50 mm. The number of placement layers is fixed to four for all experiments. In our experiments, we used the same parameters across the benchmarks. The technology parameters used were 0.01 m for wire resistance per unit length, 1 ph/ m for wire inductance per unit length, and 10 ff/ m for wire capacitance per unit length. We report the average runtime for each benchmark measured in seconds. For our thermal analysis we used a grid size of in the and direction. The number of active layers was four with three passive layers in between them. The conductivity of the substrate was chosen to be that of silicon (0.1 W/mm C). The conductivity of the sides of the package was fixed at 0.01 W/mm C. The heat sink was assumed to be at the top of the package with a conductivity of 0.5 W/mm C and the conductivity of the bottom of the package was chosen to be 0.2 W/mm C. The power density of the blocks varied from 10 W/mm^2 to 300 W/mm^2 depending on the switching current demands. The switching current demands were generated using a formula using a random number and the size of the block. Table I shows a comparison among 3-D SOP placement algorithms with three different objectives: 1) area/wire-driven ( 1 and 0 in (1), Section III B), 2) decap-driven ( 1 and 0), and 3) thermal-driven ( 1 and 0). Our baseline algorithm is the area/wirelength-driven algorithm. We achieved a 21% improvement over baseline in decap cost using our decap-driven algorithm, with a 7% increase in final area and 15% increase in wirelength. The temperature decreases by 3%. Our thermal-driven floorplanning achieved a 21% improvement over baseline with a dramatic increase in total area of 76%. Wirelength increased by 28% and decap requirement increases by 19%. We note that it may be possible to reduce area and wirelength costs by fine-tuning parameters. Table I shows the result of our final 3-D placement algorithm that considers all four objectives: area, wirelength, decap, and thermal ( 1 in (1), Section III-B). We note that our area/wire/decap/thermal-driven algorithm achieves an improvement in both decap and temperature over baseline 5 The GT benchmark circuits are available for download at our website: Fig. 19. Decap versus temperature correlation plot for n300. TABLE II DECAP VERSUS TEMPERATURE CORRELATION CONSTANTS by 13% and 9%, respectively. There is a reasonable increase in final area and wirelength by 19% and 15%, respectively. We also tested our area/wire/decap algorithm under a thermal constraint. In this case, all candidate solutions during the simulated annealing are discarded if the maximum temperature is above a certain threshold (100 C in our case). We observe that the area, wirelength, and decap results improve simultaneously at the cost of slight increase in temperature. The CPU times of the algorithm depend on the number of candidate solutions evaluated. The times reported in the table directly reflect the subset of the configuration space spanned by the algorithm. The constrained algorithm ran much faster, making a small number of moves because fewer moves were accepted. Fig. 19 shows the temperature and decap requirement of each block of the final floorplan of the n300 benchmark. The figure clearly shows that there is little correlation between temperature and decap. This shows that minimizing one objective does not necessarily minimize the other objective as a positive correlation would indicate. It also means that minimizing both objectives is not mutually exclusive, as a negative correlation would indicate. This matches with the block placement results. Table II shows the correlation constants. Table III shows power supply noise simulation results for three 3-D placement schemes-no-decap aware, decap-aware, and decap-aware decap-placement. The P/G plane structure size is 246 mm 254 mm, and the top P/G plane pair was modeled using cavity resonator model [41] and simulated in HSPICE. The placement layer that uses this P/G plan includes 14 active devices. The dc 5 V sources are located at four edges in the plane pair and fourteen current sources exist in the plane pair. As can be seen from the Table III, the SSN for the

11 654 IEEE TRANSACTIONS ON COMPONENTS AND PACKAGING TECHNOLOGIES, VOL. 29, NO. 3, SEPTEMBER 2006 TABLE III POWER SUPPLY NOISE SIMULATION RESULTS. WE REPORT SSN NOISE FOR EACH BLOCK PLACED IN THE TOP PLACEMENT LAYER. (a) NONOPTIMIZED CASE WITHOUT ANY DECOUPLING CAPACITOR. FROM TABLE I WE NEED 26.7 DECAP UNITS TO SUPPRESS SSN NOISE. (b) OPTIMIZED CASE WITHOUT ANY DECOUPLING CAPACITOR. FROM TABLE I WE NEED 19.9 DECAP UNITS. (c) OPTIMIZED CASE WITH DECOUPLING CAPACITORS INSERTED TABLE VI SOP PIN REDISTRIBUTION RESULTS USING OUR COARSE AND DETAILED PIN DISTRIBUTION ALGORITHMS. WE REPORT THE WIRELENGTH BETWEEN THE ORIGINAL AND THE NEW LOCATION (dw), TOTAL WIRELENGTH (wl), CROSSTALK (xt) AND TOTAL NUMBER OF LAYER PAIRS USED (1 yr) IN THE ROUTING LAYERS TABLE IV BENCHMARK CHARACTERISTICS OF GSRC AND GT CIRCUITS.WCAP DENOTES THE WIRE CAPACITY OFEACH BOUNDARY IN THE ROUTING TILES.CP AND DP RESPECTIVELY DENOTE THE DIMENSION OF 2-D GRIDS USED DURING OUR COARSE AND DETAILED PIN DISTRIBUTION STEPS TABLE VII TOPOLOGY GENERATION AND LAYER ASSIGNMENT RESULTS. WE REPORT THE TOTAL WIRELENGTH (WL), ELMORE DELAY OF THE NETS WITH MAXIMUM SINK DELAY (dly), AND THE LOWER BOUND (low) AND THE ACTUAL NUMBER (1 yr) OF LAYERS USED TABLE V AREA INFORMATION OF THE 4-LAYER SOP FLOORPLANNING FOR GSRC AND GT BENCHMARK DESIGNS. THE MIN AND MAX RESPECTIVELY DENOTE THE DIMENSION OF THE MINIMUM AND MAXIMUM FLOORPLAN AREA, WHEREAS THE FINAL COLUMN DENOTES THE DIMENSION OF THE OVERALL FOOTPRINT decap-aware algorithm is lower compared to nondecap-aware algorithm. The SSN of the noisiest block blk4 is 1.58 V, which is reduced to 1.42 V by our decap-aware scheme. With the insertion of decap, the noise is suppressed to 0.22 V. In addition, the total amount of decap required for nondecap-aware algorithm is 26.7, which is reduced to 19.9 with our decap-aware scheme. The largest amount of decap is used for blk5 (0.50 nf), because of an increase in its SSN after optimization. The numbers show that the SSN was efficiently suppressed and the amount of decap reduced by using our algorithms. B. SOP Routing Results Table IV shows the characteristics of the GSRC and GT benchmark designs used during routing (see Table V). In Table VI we compare pin redistribution results. Under the DPD columns, we perform DPD ( detailed pin distribution presented in Section V-C) only, where we skip CPD ( coarse pin distribution presented in Section V-B) and assign all -nets to the routing interval below for net distribution. Under the CPD DPD, we perform CPD using the algorithm, assign all -nets to the routing interval below for net distribution, and perform DPD. DPD serves as our baseline, where CPD+DPD respectively demonstrate the impact of our coarse pin distribution. In all cases, we perform detailed pin distribution to legalize the pin location, i.e., remove overlaps among the pins. We use the following metrics to evaluate our solutions: wirelength between the original and the new location (dw), total wirelength (wl), crosstalk (xt) and total number of layer pairs used (lyr) in the routing layers. The displacement (dw) and wirelength (wl) results are scaled by 10, and the time reported is the average runtime among the GSRC/GT circuits. From the comparison between DPD and CPD DPD, we note that the displacement result (dw) increases by an average of 50%. However, CPD lowers the total wirelength (wl) consistently by 10% on average and the number of layers (lyr) by 10% on average. In Table VII we show our topology generation (RSA/G) and layer assignment (LAYER) results. We used the technology parameters for process for Elmore delay computation. Specifically, the driver resistance of 29.4 k, input capacitance of ff, unit-length resistance of 0.82 m and

12 MINZ et al.: 3-D SYSTEM-ON-PACKAGE DESIGNS 655 TABLE VIII SOP CHANNEL ASSIGNMENT RESULTS. WE REPORT THE LAYER USAGE, WIRELENGTH, AND VIA FOR THE BASELINE (WIRELENGTH MINIMIZATION ONLY) AND MULTI-OBJECTIVE ALGORITHMS TABLE IX SOP LOCAL ROUTING RESULTS FOR THE BASELINE (WIRELENGTH MINIMIZATION ONLY) AND MULTI-OBJECTIVE ALGORITHMS. WE REPORT THE WIRELENGTH (wl), MAXIMUM (max) AND AVERAGE (avg) ROUTING DEMAND AS WELL AS THE STANDARD DEVIATION (dev) unit-length capacitance of 0.24 ff/ m are used. We report the total wirelength (wl), Elmore delay of the nets with maximum sink delay (dly), and the lower bound and the actual number of layers used for the top-most routing interval. In general, GSRC benchmarks have bigger delay than GT benchmarks due to the larger average wirelength. Our layer assignment algorithm presented in Section V-D is able to achieve results very close to the lower bounds discussed in Section V-D. For the GT circuits, the layer assignment results are within 10% of the lower bound. For the GSRC circuits, we were able to achieve the results equal to the lower bound. Our channel assignment results are shown in Table VIII. The baseline case is where we optimize the wirelength only. We then compare it to our multi-objective channel assignment algorithm that simultaneously optimizes the wirelength, via, and layer usage. Our comparison indicates that the number of layers is consistently and significantly reduced especially for the bigger GT benchmarks, where an average improvement of 48% is observed. In case of the second largest benchmark gt1000, we achieved 57% improvement. This saving on the layer usage comes at the cost of increase in wirelength and vias. The average increase in wirelength is 12% and 30% for GSRC and GT benchmarks, respectively. The average increase in via usage is 63% and 79% for GSRC and GT benchmarks, respectively. We noted that the channel assignment result is very sensitive to the weighting constants among the objectives used in our cost function. This indicates that the solution space of the channel assignment problem offers many useful tradeoff points. Table IX reports our local routing results. We report the wirelength, maximum and average routing demand as well as the standard deviation. Our baseline is the local routing optimized for wirelength only. We then compare it against our multi-objective local routing algorithm that simultaneously optimizes wirelength and routing demands. In both cases, the same pin demand constraint is imposed. We note that the improvement of our multi-objective algorithm over the baseline is significant, especially for GSRC circuits the routing demands were reduced by 33% on average while the wirelength increased by only 10%. In addition, we reduced the routing demands for the GT benchmarks by 41% on average, with wirelength increase by 20%. In our biggest benchmarks (gt1500), our routing demand reduction Fig. 20. Snapshot of four-layer SOP with three routing interval for n200 benchmark. The routing-tile boundaries with heavy usage are shown with thicker lines. is the largest (53%), which comes with the maximum increase in wirelength (23%). This again indicates that the local routing result is very sensitive to the weighting constants among the objectives used in our cost function. The lower standard deviation of our multi-objective algorithm indicates that the routing demand is more evenly distributed ( lower congestion) compared to the wirelength-only case. Last, Fig. 20 shows a snapshot of a four-layer SOP with 3 routing interval for n200 benchmark. VII. CONCLUSION In this article, we presented the first physical layout algorithm for 3-D SOP designs that includes thermal/decap-aware 3-D placement and crosstalk-aware 3-D routing algorithm. We extended the thermal and SSN models to 3-D and used them to guide our 3-D module placement. In addition to estimating the amount of decap needed to keep the SSN at the circuits tolerance level, we also efficiently used near-by white spaces to allocate decap if necessary. Our routing process is divided into pin redistribution, topology generation, layer assignment, channel assignment, and local routing steps. Our major objective is to reduce the amount of crosstalk and layers while satisfying various constraints on routing resource. Our ongoing work includes SOP detailed routing.

13 656 IEEE TRANSACTIONS ON COMPONENTS AND PACKAGING TECHNOLOGIES, VOL. 29, NO. 3, SEPTEMBER 2006 REFERENCES [1] R. Tummala and V. Madisetti, System on chip or system on package?, IEEE Design Test Comput., vol. 16, no. 2, pp , Apr./Jun [2] R. Tummala, SOP: What is it and why? A new microsystem-integration technology paradigm-moore s law for system integration of miniaturized convergent systems of the next decade, IEEE Trans. Adv. Packag., vol. 27, no. 2, pp , May [3] R. R. Tummala et al., The SOP for miniaturized, mixed-signal computing, communication, and consumer systems of the next decade, IEEE Trans. Adv. Packag., vol. 27, no. 2, pp , May [4] K. Kundert, H. Chang, D. Jefferies, G. Lamant, E. Malavasi, and F. Sendig, Design of mixed-signal systems-on-a-chip, IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 19, no. 12, pp , Dec [5] G. Gielen and R. Rutenbar, Computer-aided design of analog and mixed-signal integrated circuits, Proc. IEEE, vol. 88, no. 12, pp , Dec [6] Y. Deng and W. Maly, Interconnect characteristics of 2.5-D system integration scheme, in Proc. Int. Symp. Phys. Design, 2001, p [7] S. Das, A. Chandrakasan, and R. Reif, Design tools for 3-D integrated circuits, in Proc. Asia South Pacific Design Automat. Conf., 2003, pp [8] B. Goplen and S. Sapatnekar, Efficient thermal placement of standard cells in 3-D IC s using a force directed approach, in Proc. IEEE Int. Conf. Comput.-Aided Design, Nov. 2003, pp [9] J. Minz and S. K. Lim, Layer assignment for system on packages, in Proc. Asia South Pacific Design Automat. Conf., Jan. 2004, pp [10] J. Minz, M. Pathak, and S. K. Lim, Net and pin distribution for 3-D package global routing, in Proc. Design, Automat. Test Europe, Feb. 2004, pp [11] R. Ravichandran, J. Minz, M. Pathak, S. Easwar, and S. K. Lim, Physical layout automation for system-on-packages, in Proc. IEEE Electron. Comp. Technol. Conf., vol. 1, Jun. 2004, pp [12] J. Minz, S. K. Lim, J. Choi, and M. Swaminathan, Module placement for power supply noise and wire congestion avoidance in 3-D packaging, in Proc. IEEE Electr. Perform. Electron. Packag., 2004, pp [13] J. Minz and S. K. Lim, A global router for system-on-package targeting layer and crosstalk minimization, in Proc. IEEE Electr. Perform. Electron. Packag., 2004, pp [14] J. Minz, E. Wong, and S. K. Lim, Thermal and congestion-aware physical design for 3-D system-on-pacakge, in Proc. 55th IEEE Electron. Comp. Technol. Conf., May 31 Jun , pp [15] J. Choi, S. Chun, N. Na, M. Swaminathan, and L. Smith, A methodology for the placement and optimization of decoupling capacitors for gigahertz systems, in Proc. VLSI Design Symp., 2000, pp [16] A. Kamo, T. Watanabe, and H. Asai, An optimization method for placement of decoupling capacitors on printed circuit board, in Proc. IEEE Electr. Perform. Electron. Packag., 2000, pp [17] Y. Chen, Z. Chen, and J. Fang, Optimum placement of decoupling capacitors on packages and printed circuit boards under the guidance of electromagnetic field simulation, in Proc. IEEE Electron. Comp. Technol. Conf., Orlando, FL, May 1996, pp [18] J. Lee, Thermal placement algorithm based on heat conduction analogy, IEEE Trans. Compon. Packag. Technol., vol. 26, no. 2, pp , Jun [19] K. Deb, P. Jain, N. Gupta, and H. Maji, Multiobjective placement of electronic components using evolutionary algorithms, IEEE Trans. Compon. Packag. Technol., vol. 27, no. 3, pp , Sep [20] K. Khoo and J. Cong, An efficient multilayer MCM router based on four-via routing, IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 14, no. 10, pp , Oct [21] J. D. Cho, K. Liao, S. Raje, and M. Sarrafzadeh, M2R: Multilayer routing algorithm for high-performance MCMs, IEEE Trans. Circuits. Syst. I, vol. 41, no. 4, pp , Apr [22] W. W. Dai, Topological routing in surf: Generating a rubberband sketch, in Proc. ACM Design Automat. Conf., 1991, pp [23] D. Wang and E. Kuh, A new timing-driven multilayer MCM/IC routing algorithm, in Proc. IEEE Multi-Chip Module Conf., 1997, pp [24] Y. Cha, C. Rim, and K. Nakajima, A simple and effective greedy multilayer router for MCMs, in Proc. Int. Symp. Physical Design, 1997, pp [25] J. Cong and P. Madden, Performance driven multi-layer general area routing for PCB/MCM designs, in Proc. ACM Design Automat. Conf., Mar. 1998, pp [26] J. Cho and M. Sarrafzadeh, The pin redistribution problem in multichip modules, Math. Programm. Spec. Iss. Appl. Discrete Programm. Comput. Sci., pp , [27] D. Chang, T. Gonzales, and O. Ibarra, A flow based approach to the pin redistribution problem for multi-chip modules, in Proc. Great Lakes Symp. VLSI, 1994, p [28] J. D. Cho, An optimum pin redistribution for multichip modules, in Proc. IEEE Multi-Chip Module Conf. (MCMC 96), 1996, p [29] J. Cong, Pin assignment with global routing for general cell design, IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 10, no. 11, pp , Nov [30] W. Elmore, The transient response of damped linear networks with particular regard to wideband amplifiers, J. Appl. Phys., pp , [31] J. Cong, A. B. Kahng, and K.-S. Leung, Efficient algorithms for the minimum shortest path steiner arborescence problem with applications to VLSI physical design, IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 17, no. 1, pp , Jan [32] S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi, Optimization by simulated annealing, Science, pp , [33] H. Murata, K. Fujiyoshi, S. Nakatake, and Y. Kajitani, Rectangle packing based module placement, in Proc. IEEE Int. Conf. Computer-Aided Design, 1995, pp [34] S. Zhao, C. Koh, and K. Roy, Decoupling capacitance allocation and its application to power supply noise aware floorplanning, IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 21, no. 1, pp , Jan [35] J. D. Cho, S. Raje, M. Sarrafzadeh, M. Sriram, and S. M. Kang, Crosstalk-minimum layer assignment, in Proc. IEEE Custom Integr. Circuits Conf., San Diego, CA, 1993, pp [36] L. McMurchie and C. Ebeling, Pathfinder: A negotiation based performance-driven router for FPGAs, in Proc. ACM Int. Symp. Field-Programmable Gate Arrays, 1995, pp [37] M. Ho, M. Sarrafzadeh, G. Vijayan, and C. Wong, Layer assignment for multichip modules, IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 9, no. 12, pp , Dec [38] GSRC, Floorplan Benchmark Suites, Tech. Rep., Online. Available: [39] C. J. Alpert, The ISPD98 circuit benchmark suite, in Proc. Int. Symp. Phys. Design, 1998, pp [40] J. Cong and S. K. Lim, Edge separability based circuit clustering with application to multi-level circuit partitioning, IEEE Trans. Comput.- Aided Design Integr. Circuits Syst., vol. 23, no. 3, pp , Mar [41] N. Na, J. Choi, M. Swaminathan, J. P. Libous, and D. P. O Connor, Modeling and simulation of core switching noise for asics, IEEE Trans. Adv. Packag., vol. 25, no. 1, pp. 4 11, Feb Jacob Rajkumar Minz (S 05) received the B.Tech. degree in computer science and engineering from the Indian Institute of Technology, Kharagpur, in 2001 and is currently pursuing the Ph.D. degree at the School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta. He was with the Advanced VLSI Design Lab, Indian Institute of Technology, Kharagpur, for one year, where he was involved in the design of digital chips. His areas of interest are physical design automation and algorithms for electronic CAD. Eric Wong (S 05) received the B.S. degree in electrical engineering from the State University of New York at Binghamton in 2004 and is currently pursuing the Ph.D. degree in electrical and computer engineering at the Georgia Institute of Technology, Atlanta. His research interests include physical VLSI design and design automation for VLSI circuits.

14 MINZ et al.: 3-D SYSTEM-ON-PACKAGE DESIGNS 657 Mohit Pathak (S 05) received the B.Tech. degree in computer science and engineering from the Indian Institute of Technology, Kharagpur, in 2004 and is currently pursuing the Ph.D. degree in the School of Electrical and Computer Engineering, Georgia Institute of Technology (Georgia Tech), Atlanta. He is a Graduate Research Assistant at Georgia Tech. He was with Magma Design Automation, India for one year, where he worked as a Member of Technical Staff. His areas of interest are physical design automation and VLSI design. Sung Kyu Lim (S 94 M 00 SM 05) received the B.S., M.S., and Ph.D. degrees from the Computer Science Department, University of California, Los Angeles (UCLA), in 1994, 1997, and 2000, respectively. From 2000 to 2001, he was a Post-Doctoral Scholar at UCLA, and a Senior Engineer at Aplus Design Technologies, Inc. He joined the School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, as an Assistant Professor in August 2001 and the College of Computing as an Adjunct Assistant Professor in September He served as a Guest Editor for ACM Transactions on Design Automation of Electronic Systems. His research focus is on the physical design automation for 3-D circuits, 3-D system-on-packages, microarchitectural physical planning, field programmable analog arrays, and quantum cell automata. Dr. Lim received the Design Automation Conference Graduate Scholarship in 2003 and the National Science Foundation Faculty Early Career Development (CAREER) Award in He has been on the Advisory Board of the ACM Special Interest Group on Design Automation since He has served the Technical Program Committee of several ACM and IEEE conferences on electronic design automation.

SEMICONDUCTOR industry is beginning to question the

SEMICONDUCTOR industry is beginning to question the IEEE TRANSACTIONS ON COMPONENTS AND PACKAGING TECHNOLOGIES, VOL.??, NO.??, JANUARY(?) 2005 1 Placement and Routing for 3D System-On-Package Designs Jacob Minz, Eric Wong, Mohit Pathak, and Sung Kyu Lim,

More information

Layer Assignment for Reliable System-on-Package

Layer Assignment for Reliable System-on-Package Layer Assignment for Reliable System-on-Package Jacob R. Minz and Sung Kyu Lim School of Electrical and Computer Engineering Georgia Institute of Technology, Atlanta, GA 30332-0250 {jrminz,limsk}@ece.gatech.edu

More information

Introduction. A very important step in physical design cycle. It is the process of arranging a set of modules on the layout surface.

Introduction. A very important step in physical design cycle. It is the process of arranging a set of modules on the layout surface. Placement Introduction A very important step in physical design cycle. A poor placement requires larger area. Also results in performance degradation. It is the process of arranging a set of modules on

More information

CAD Algorithms. Placement and Floorplanning

CAD Algorithms. Placement and Floorplanning CAD Algorithms Placement Mohammad Tehranipoor ECE Department 4 November 2008 1 Placement and Floorplanning Layout maps the structural representation of circuit into a physical representation Physical representation:

More information

Problem Formulation. Specialized algorithms are required for clock (and power nets) due to strict specifications for routing such nets.

Problem Formulation. Specialized algorithms are required for clock (and power nets) due to strict specifications for routing such nets. Clock Routing Problem Formulation Specialized algorithms are required for clock (and power nets) due to strict specifications for routing such nets. Better to develop specialized routers for these nets.

More information

Floorplan and Power/Ground Network Co-Synthesis for Fast Design Convergence

Floorplan and Power/Ground Network Co-Synthesis for Fast Design Convergence Floorplan and Power/Ground Network Co-Synthesis for Fast Design Convergence Chen-Wei Liu 12 and Yao-Wen Chang 2 1 Synopsys Taiwan Limited 2 Department of Electrical Engineering National Taiwan University,

More information

Basic Idea. The routing problem is typically solved using a twostep

Basic Idea. The routing problem is typically solved using a twostep Global Routing Basic Idea The routing problem is typically solved using a twostep approach: Global Routing Define the routing regions. Generate a tentative route for each net. Each net is assigned to a

More information

Placement Algorithm for FPGA Circuits

Placement Algorithm for FPGA Circuits Placement Algorithm for FPGA Circuits ZOLTAN BARUCH, OCTAVIAN CREŢ, KALMAN PUSZTAI Computer Science Department, Technical University of Cluj-Napoca, 26, Bariţiu St., 3400 Cluj-Napoca, Romania {Zoltan.Baruch,

More information

Chapter 5 Global Routing

Chapter 5 Global Routing Chapter 5 Global Routing 5. Introduction 5.2 Terminology and Definitions 5.3 Optimization Goals 5. Representations of Routing Regions 5.5 The Global Routing Flow 5.6 Single-Net Routing 5.6. Rectilinear

More information

L14 - Placement and Routing

L14 - Placement and Routing L14 - Placement and Routing Ajay Joshi Massachusetts Institute of Technology RTL design flow HDL RTL Synthesis manual design Library/ module generators netlist Logic optimization a b 0 1 s d clk q netlist

More information

Estimation of Wirelength

Estimation of Wirelength Placement The process of arranging the circuit components on a layout surface. Inputs: A set of fixed modules, a netlist. Goal: Find the best position for each module on the chip according to appropriate

More information

Decoupling Capacitor Planning and Sizing For Noise and Leakage Reduction*

Decoupling Capacitor Planning and Sizing For Noise and Leakage Reduction* Decoupling Capacitor Planning and Sizing For Noise and Leakage Reduction* Eric Wong, Jacob Minz, and Sung Kyu Lim School of Electrical and Computer Engineering Georgia Institute of Technology Atlanta,

More information

Routing. Robust Channel Router. Figures taken from S. Gerez, Algorithms for VLSI Design Automation, Wiley, 1998

Routing. Robust Channel Router. Figures taken from S. Gerez, Algorithms for VLSI Design Automation, Wiley, 1998 Routing Robust Channel Router Figures taken from S. Gerez, Algorithms for VLSI Design Automation, Wiley, 1998 Channel Routing Algorithms Previous algorithms we considered only work when one of the types

More information

Thermal-aware Steiner Routing for 3D Stacked ICs

Thermal-aware Steiner Routing for 3D Stacked ICs Thermal-aware Steiner Routing for 3D Stacked ICs Mohit Pathak and Sung Kyu Lim School of Electrical and Computer Engineering Georgia Institute of Technology {mohitp, limsk}@ece.gatech.edu Abstract In this

More information

A Study of Through-Silicon-Via Impact on the 3D Stacked IC Layout

A Study of Through-Silicon-Via Impact on the 3D Stacked IC Layout A Study of Through-Silicon-Via Impact on the Stacked IC Layout Dae Hyun Kim, Krit Athikulwongse, and Sung Kyu Lim School of Electrical and Computer Engineering Georgia Institute of Technology, Atlanta,

More information

Three-Dimensional Integrated Circuits: Performance, Design Methodology, and CAD Tools

Three-Dimensional Integrated Circuits: Performance, Design Methodology, and CAD Tools Three-Dimensional Integrated Circuits: Performance, Design Methodology, and CAD Tools Shamik Das, Anantha Chandrakasan, and Rafael Reif Microsystems Technology Laboratories Massachusetts Institute of Technology

More information

VLSI Physical Design: From Graph Partitioning to Timing Closure

VLSI Physical Design: From Graph Partitioning to Timing Closure Chapter 5 Global Routing Original uthors: ndrew. Kahng, Jens, Igor L. Markov, Jin Hu Chapter 5 Global Routing 5. Introduction 5.2 Terminology and Definitions 5.3 Optimization Goals 5. Representations of

More information

Multilayer Routing on Multichip Modules

Multilayer Routing on Multichip Modules Multilayer Routing on Multichip Modules ECE 1387F CAD for Digital Circuit Synthesis and Layout Professor Rose Friday, December 24, 1999. David Tam (2332 words, not counting title page and reference section)

More information

VLSI Physical Design: From Graph Partitioning to Timing Closure

VLSI Physical Design: From Graph Partitioning to Timing Closure VLSI Physical Design: From Graph Partitioning to Timing Closure Chapter 5 Global Routing Original uthors: ndrew. Kahng, Jens, Igor L. Markov, Jin Hu VLSI Physical Design: From Graph Partitioning to Timing

More information

On GPU Bus Power Reduction with 3D IC Technologies

On GPU Bus Power Reduction with 3D IC Technologies On GPU Bus Power Reduction with 3D Technologies Young-Joon Lee and Sung Kyu Lim School of ECE, Georgia Institute of Technology, Atlanta, Georgia, USA yjlee@gatech.edu, limsk@ece.gatech.edu Abstract The

More information

Very Large Scale Integration (VLSI)

Very Large Scale Integration (VLSI) Very Large Scale Integration (VLSI) Lecture 6 Dr. Ahmed H. Madian Ah_madian@hotmail.com Dr. Ahmed H. Madian-VLSI 1 Contents FPGA Technology Programmable logic Cell (PLC) Mux-based cells Look up table PLA

More information

Pseudopin Assignment with Crosstalk Noise Control

Pseudopin Assignment with Crosstalk Noise Control 598 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 20, NO. 5, MAY 2001 Pseudopin Assignment with Crosstalk Noise Control Chin-Chih Chang and Jason Cong, Fellow, IEEE

More information

Place and Route for FPGAs

Place and Route for FPGAs Place and Route for FPGAs 1 FPGA CAD Flow Circuit description (VHDL, schematic,...) Synthesize to logic blocks Place logic blocks in FPGA Physical design Route connections between logic blocks FPGA programming

More information

THE continuous increase of the problem size of IC routing

THE continuous increase of the problem size of IC routing 382 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 24, NO. 3, MARCH 2005 MARS A Multilevel Full-Chip Gridless Routing System Jason Cong, Fellow, IEEE, Jie Fang, Min

More information

Cluster-based approach eases clock tree synthesis

Cluster-based approach eases clock tree synthesis Page 1 of 5 EE Times: Design News Cluster-based approach eases clock tree synthesis Udhaya Kumar (11/14/2005 9:00 AM EST) URL: http://www.eetimes.com/showarticle.jhtml?articleid=173601961 Clock network

More information

FPGA Power Management and Modeling Techniques

FPGA Power Management and Modeling Techniques FPGA Power Management and Modeling Techniques WP-01044-2.0 White Paper This white paper discusses the major challenges associated with accurately predicting power consumption in FPGAs, namely, obtaining

More information

ECE260B CSE241A Winter Routing

ECE260B CSE241A Winter Routing ECE260B CSE241A Winter 2005 Routing Website: / courses/ ece260bw05 ECE 260B CSE 241A Routing 1 Slides courtesy of Prof. Andrew B. Kahng Physical Design Flow Input Floorplanning Read Netlist Floorplanning

More information

Introduction VLSI PHYSICAL DESIGN AUTOMATION

Introduction VLSI PHYSICAL DESIGN AUTOMATION VLSI PHYSICAL DESIGN AUTOMATION PROF. INDRANIL SENGUPTA DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING Introduction Main steps in VLSI physical design 1. Partitioning and Floorplanning l 2. Placement 3.

More information

ICS 252 Introduction to Computer Design

ICS 252 Introduction to Computer Design ICS 252 Introduction to Computer Design Lecture 16 Eli Bozorgzadeh Computer Science Department-UCI References and Copyright Textbooks referred (none required) [Mic94] G. De Micheli Synthesis and Optimization

More information

A Study of IR-drop Noise Issues in 3D ICs with Through-Silicon-Vias

A Study of IR-drop Noise Issues in 3D ICs with Through-Silicon-Vias A Study of IR-drop Noise Issues in 3D ICs with Through-Silicon-Vias Moongon Jung and Sung Kyu Lim School of Electrical and Computer Engineering Georgia Institute of Technology, Atlanta, Georgia, USA Email:

More information

Mincut Placement with FM Partitioning featuring Terminal Propagation. Brett Wilson Lowe Dantzler

Mincut Placement with FM Partitioning featuring Terminal Propagation. Brett Wilson Lowe Dantzler Mincut Placement with FM Partitioning featuring Terminal Propagation Brett Wilson Lowe Dantzler Project Overview Perform Mincut Placement using the FM Algorithm to perform partitioning. Goals: Minimize

More information

Optimum Placement of Decoupling Capacitors on Packages and Printed Circuit Boards Under the Guidance of Electromagnetic Field Simulation

Optimum Placement of Decoupling Capacitors on Packages and Printed Circuit Boards Under the Guidance of Electromagnetic Field Simulation Optimum Placement of Decoupling Capacitors on Packages and Printed Circuit Boards Under the Guidance of Electromagnetic Field Simulation Yuzhe Chen, Zhaoqing Chen and Jiayuan Fang Department of Electrical

More information

Unit 7: Maze (Area) and Global Routing

Unit 7: Maze (Area) and Global Routing Unit 7: Maze (Area) and Global Routing Course contents Routing basics Maze (area) routing Global routing Readings Chapters 9.1, 9.2, 9.5 Filling Unit 7 1 Routing Unit 7 2 Routing Constraints 100% routing

More information

FastPlace 2.0: An Efficient Analytical Placer for Mixed- Mode Designs

FastPlace 2.0: An Efficient Analytical Placer for Mixed- Mode Designs FastPlace.0: An Efficient Analytical Placer for Mixed- Mode Designs Natarajan Viswanathan Min Pan Chris Chu Iowa State University ASP-DAC 006 Work supported by SRC under Task ID: 106.001 Mixed-Mode Placement

More information

3-D INTEGRATED CIRCUITS (3-D ICs) are emerging

3-D INTEGRATED CIRCUITS (3-D ICs) are emerging 862 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 21, NO. 5, MAY 2013 Study of Through-Silicon-Via Impact on the 3-D Stacked IC Layout Dae Hyun Kim, Student Member, IEEE, Krit

More information

Physical Design of Digital Integrated Circuits (EN0291 S40) Sherief Reda Division of Engineering, Brown University Fall 2006

Physical Design of Digital Integrated Circuits (EN0291 S40) Sherief Reda Division of Engineering, Brown University Fall 2006 Physical Design of Digital Integrated Circuits (EN029 S40) Sherief Reda Division of Engineering, Brown University Fall 2006 Lecture 09: Routing Introduction to Routing Global Routing Detailed Routing 2

More information

EE582 Physical Design Automation of VLSI Circuits and Systems

EE582 Physical Design Automation of VLSI Circuits and Systems EE582 Prof. Dae Hyun Kim School of Electrical Engineering and Computer Science Washington State University Preliminaries Table of Contents Semiconductor manufacturing Problems to solve Algorithm complexity

More information

An Interconnect-Centric Design Flow for Nanometer Technologies

An Interconnect-Centric Design Flow for Nanometer Technologies An Interconnect-Centric Design Flow for Nanometer Technologies Jason Cong UCLA Computer Science Department Email: cong@cs.ucla.edu Tel: 310-206-2775 URL: http://cadlab.cs.ucla.edu/~cong Exponential Device

More information

Constructive floorplanning with a yield objective

Constructive floorplanning with a yield objective Constructive floorplanning with a yield objective Rajnish Prasad and Israel Koren Department of Electrical and Computer Engineering University of Massachusetts, Amherst, MA 13 E-mail: rprasad,koren@ecs.umass.edu

More information

CAD Algorithms. Circuit Partitioning

CAD Algorithms. Circuit Partitioning CAD Algorithms Partitioning Mohammad Tehranipoor ECE Department 13 October 2008 1 Circuit Partitioning Partitioning: The process of decomposing a circuit/system into smaller subcircuits/subsystems, which

More information

Double Patterning-Aware Detailed Routing with Mask Usage Balancing

Double Patterning-Aware Detailed Routing with Mask Usage Balancing Double Patterning-Aware Detailed Routing with Mask Usage Balancing Seong-I Lei Department of Computer Science National Tsing Hua University HsinChu, Taiwan Email: d9762804@oz.nthu.edu.tw Chris Chu Department

More information

ECE 5745 Complex Digital ASIC Design Topic 13: Physical Design Automation Algorithms

ECE 5745 Complex Digital ASIC Design Topic 13: Physical Design Automation Algorithms ECE 7 Complex Digital ASIC Design Topic : Physical Design Automation Algorithms Christopher atten School of Electrical and Computer Engineering Cornell University http://www.csl.cornell.edu/courses/ece7

More information

Thermal-aware Steiner Routing for 3D Stacked ICs

Thermal-aware Steiner Routing for 3D Stacked ICs Thermal-aware Steiner Routing for 3D Stacked ICs Mohit Pathak and Sung Kyu Lim School of Electrical and Computer Engineering Georgia Institute of Technology {mohitp, limsk}@ece.gatech.edu Abstract In this

More information

Full-chip Routing Optimization with RLC Crosstalk Budgeting

Full-chip Routing Optimization with RLC Crosstalk Budgeting 1 Full-chip Routing Optimization with RLC Crosstalk Budgeting Jinjun Xiong, Lei He, Member, IEEE ABSTRACT Existing layout optimization methods for RLC crosstalk reduction assume a set of interconnects

More information

Floorplan Management: Incremental Placement for Gate Sizing and Buffer Insertion

Floorplan Management: Incremental Placement for Gate Sizing and Buffer Insertion Floorplan Management: Incremental Placement for Gate Sizing and Buffer Insertion Chen Li, Cheng-Kok Koh School of ECE, Purdue University West Lafayette, IN 47907, USA {li35, chengkok}@ecn.purdue.edu Patrick

More information

Physical Design Implementation for 3D IC Methodology and Tools. Dave Noice Vassilios Gerousis

Physical Design Implementation for 3D IC Methodology and Tools. Dave Noice Vassilios Gerousis I NVENTIVE Physical Design Implementation for 3D IC Methodology and Tools Dave Noice Vassilios Gerousis Outline 3D IC Physical components Modeling 3D IC Stack Configuration Physical Design With TSV Summary

More information

A Multi-Layer Router Utilizing Over-Cell Areas

A Multi-Layer Router Utilizing Over-Cell Areas A Multi-Layer Router Utilizing Over-Cell Areas Evagelos Katsadas and Edwin h e n Department of Electrical Engineering University of Rochester Rochester, New York 14627 ABSTRACT A new methodology is presented

More information

A Review Paper on Reconfigurable Techniques to Improve Critical Parameters of SRAM

A Review Paper on Reconfigurable Techniques to Improve Critical Parameters of SRAM IJSRD - International Journal for Scientific Research & Development Vol. 4, Issue 09, 2016 ISSN (online): 2321-0613 A Review Paper on Reconfigurable Techniques to Improve Critical Parameters of SRAM Yogit

More information

Physical Design Closure

Physical Design Closure Physical Design Closure Olivier Coudert Monterey Design System DAC 2000 DAC2000 (C) Monterey Design Systems 1 DSM Dilemma SOC Time to market Million gates High density, larger die Higher clock speeds Long

More information

Chip/Package/Board Design Flow

Chip/Package/Board Design Flow Chip/Package/Board Design Flow EM Simulation Advances in ADS 2011.10 1 EM Simulation Advances in ADS2011.10 Agilent EEsof Chip/Package/Board Design Flow 2 RF Chip/Package/Board Design Industry Trends Increasing

More information

CHAPTER 1 INTRODUCTION

CHAPTER 1 INTRODUCTION CHAPTER 1 INTRODUCTION Rapid advances in integrated circuit technology have made it possible to fabricate digital circuits with large number of devices on a single chip. The advantages of integrated circuits

More information

ESE 570 Cadence Lab Assignment 2: Introduction to Spectre, Manual Layout Drawing and Post Layout Simulation (PLS)

ESE 570 Cadence Lab Assignment 2: Introduction to Spectre, Manual Layout Drawing and Post Layout Simulation (PLS) ESE 570 Cadence Lab Assignment 2: Introduction to Spectre, Manual Layout Drawing and Post Layout Simulation (PLS) Objective Part A: To become acquainted with Spectre (or HSpice) by simulating an inverter,

More information

TABLE OF CONTENTS 1.0 PURPOSE INTRODUCTION ESD CHECKS THROUGHOUT IC DESIGN FLOW... 2

TABLE OF CONTENTS 1.0 PURPOSE INTRODUCTION ESD CHECKS THROUGHOUT IC DESIGN FLOW... 2 TABLE OF CONTENTS 1.0 PURPOSE... 1 2.0 INTRODUCTION... 1 3.0 ESD CHECKS THROUGHOUT IC DESIGN FLOW... 2 3.1 PRODUCT DEFINITION PHASE... 3 3.2 CHIP ARCHITECTURE PHASE... 4 3.3 MODULE AND FULL IC DESIGN PHASE...

More information

Congestion-Aware Power Grid. and CMOS Decoupling Capacitors. Pingqiang Zhou Karthikk Sridharan Sachin S. Sapatnekar

Congestion-Aware Power Grid. and CMOS Decoupling Capacitors. Pingqiang Zhou Karthikk Sridharan Sachin S. Sapatnekar Congestion-Aware Power Grid Optimization for 3D circuits Using MIM and CMOS Decoupling Capacitors Pingqiang Zhou Karthikk Sridharan Sachin S. Sapatnekar University of Minnesota 1 Outline Motivation A new

More information

CS612 Algorithms for Electronic Design Automation. Global Routing

CS612 Algorithms for Electronic Design Automation. Global Routing CS612 Algorithms for Electronic Design Automation Global Routing Mustafa Ozdal CS 612 Lecture 7 Mustafa Ozdal Computer Engineering Department, Bilkent University 1 MOST SLIDES ARE FROM THE BOOK: MODIFICATIONS

More information

Topics. ! PLAs.! Memories: ! Datapaths.! Floor Planning ! ROM;! SRAM;! DRAM. Modern VLSI Design 2e: Chapter 6. Copyright 1994, 1998 Prentice Hall

Topics. ! PLAs.! Memories: ! Datapaths.! Floor Planning ! ROM;! SRAM;! DRAM. Modern VLSI Design 2e: Chapter 6. Copyright 1994, 1998 Prentice Hall Topics! PLAs.! Memories:! ROM;! SRAM;! DRAM.! Datapaths.! Floor Planning Programmable logic array (PLA)! Used to implement specialized logic functions.! A PLA decodes only some addresses (input values);

More information

Temperature-Aware Routing in 3D ICs

Temperature-Aware Routing in 3D ICs Temperature-Aware Routing in 3D ICs Tianpei Zhang, Yong Zhan and Sachin S. Sapatnekar Department of Electrical and Computer Engineering University of Minnesota 1 Outline Temperature-aware 3D global routing

More information

Eliminating Routing Congestion Issues with Logic Synthesis

Eliminating Routing Congestion Issues with Logic Synthesis Eliminating Routing Congestion Issues with Logic Synthesis By Mike Clarke, Diego Hammerschlag, Matt Rardon, and Ankush Sood Routing congestion, which results when too many routes need to go through an

More information

Linking Layout to Logic Synthesis: A Unification-Based Approach

Linking Layout to Logic Synthesis: A Unification-Based Approach Linking Layout to Logic Synthesis: A Unification-Based Approach Massoud Pedram Department of EE-Systems University of Southern California Los Angeles, CA February 1998 Outline Introduction Technology and

More information

Obstacle-Aware Longest-Path Routing with Parallel MILP Solvers

Obstacle-Aware Longest-Path Routing with Parallel MILP Solvers , October 20-22, 2010, San Francisco, USA Obstacle-Aware Longest-Path Routing with Parallel MILP Solvers I-Lun Tseng, Member, IAENG, Huan-Wen Chen, and Che-I Lee Abstract Longest-path routing problems,

More information

Iterative-Constructive Standard Cell Placer for High Speed and Low Power

Iterative-Constructive Standard Cell Placer for High Speed and Low Power Iterative-Constructive Standard Cell Placer for High Speed and Low Power Sungjae Kim and Eugene Shragowitz Department of Computer Science and Engineering University of Minnesota, Minneapolis, MN 55455

More information

HIPEX Full-Chip Parasitic Extraction. Summer 2004 Status

HIPEX Full-Chip Parasitic Extraction. Summer 2004 Status HIPEX Full-Chip Parasitic Extraction Summer 2004 Status What is HIPEX? HIPEX Full-Chip Parasitic Extraction products perform 3D-accurate and 2D-fast extraction of parasitic capacitors and resistors from

More information

Data Partitioning. Figure 1-31: Communication Topologies. Regular Partitions

Data Partitioning. Figure 1-31: Communication Topologies. Regular Partitions Data In single-program multiple-data (SPMD) parallel programs, global data is partitioned, with a portion of the data assigned to each processing node. Issues relevant to choosing a partitioning strategy

More information

Chapter 2 On-Chip Protection Solution for Radio Frequency Integrated Circuits in Standard CMOS Process

Chapter 2 On-Chip Protection Solution for Radio Frequency Integrated Circuits in Standard CMOS Process Chapter 2 On-Chip Protection Solution for Radio Frequency Integrated Circuits in Standard CMOS Process 2.1 Introduction Standard CMOS technologies have been increasingly used in RF IC applications mainly

More information

Hipex Full-Chip Parasitic Extraction

Hipex Full-Chip Parasitic Extraction What is Hipex? products perform 3D-accurate and 2D-fast extraction of parasitic capacitors and resistors from hierarchical layouts into hierarchical transistor-level netlists using nanometer process technology

More information

Thermal-Aware 3D IC Physical Design and Architecture Exploration

Thermal-Aware 3D IC Physical Design and Architecture Exploration Thermal-Aware 3D IC Physical Design and Architecture Exploration Jason Cong & Guojie Luo UCLA Computer Science Department cong@cs.ucla.edu http://cadlab.cs.ucla.edu/~cong Supported by DARPA Outline Thermal-Aware

More information

ECO-system: Embracing the Change in Placement

ECO-system: Embracing the Change in Placement Motivation ECO-system: Embracing the Change in Placement Jarrod A. Roy and Igor L. Markov University of Michigan at Ann Arbor Cong and Sarrafzadeh: state-of-the-art incremental placement techniques unfocused

More information

Term Paper for EE 680 Computer Aided Design of Digital Systems I Timber Wolf Algorithm for Placement. Imran M. Rizvi John Antony K.

Term Paper for EE 680 Computer Aided Design of Digital Systems I Timber Wolf Algorithm for Placement. Imran M. Rizvi John Antony K. Term Paper for EE 680 Computer Aided Design of Digital Systems I Timber Wolf Algorithm for Placement By Imran M. Rizvi John Antony K. Manavalan TimberWolf Algorithm for Placement Abstract: Our goal was

More information

Advanced Surface Based MoM Techniques for Packaging and Interconnect Analysis

Advanced Surface Based MoM Techniques for Packaging and Interconnect Analysis Electrical Interconnect and Packaging Advanced Surface Based MoM Techniques for Packaging and Interconnect Analysis Jason Morsey Barry Rubin, Lijun Jiang, Lon Eisenberg, Alina Deutsch Introduction Fast

More information

CENG 4480 Lecture 11: PCB

CENG 4480 Lecture 11: PCB CENG 4480 Lecture 11: PCB Bei Yu Reference: Chapter 5 of Ground Planes and Layer Stacking High speed digital design by Johnson and Graham 1 Introduction What is a PCB Why we need one? For large scale production/repeatable

More information

On Increasing Signal Integrity with Minimal Decap Insertion in Area-Array SoC Floorplan Design

On Increasing Signal Integrity with Minimal Decap Insertion in Area-Array SoC Floorplan Design On Increasing Signal Integrity with Minimal Decap Insertion in Area-Array SoC Floorplan Design Chao-Hung Lu Department of Electrical Engineering National Central University Taoyuan, Taiwan, R.O.C. Email:

More information

Sungmin Bae, Hyung-Ock Kim, Jungyun Choi, and Jaehong Park. Design Technology Infrastructure Design Center System-LSI Business Division

Sungmin Bae, Hyung-Ock Kim, Jungyun Choi, and Jaehong Park. Design Technology Infrastructure Design Center System-LSI Business Division Sungmin Bae, Hyung-Ock Kim, Jungyun Choi, and Jaehong Park Design Technology Infrastructure Design Center System-LSI Business Division 1. Motivation 2. Design flow 3. Parallel multiplier 4. Coarse-grained

More information

Planning for Local Net Congestion in Global Routing

Planning for Local Net Congestion in Global Routing Planning for Local Net Congestion in Global Routing Hamid Shojaei, Azadeh Davoodi, and Jeffrey Linderoth* Department of Electrical and Computer Engineering *Department of Industrial and Systems Engineering

More information

Parallel Simulated Annealing for VLSI Cell Placement Problem

Parallel Simulated Annealing for VLSI Cell Placement Problem Parallel Simulated Annealing for VLSI Cell Placement Problem Atanu Roy Karthik Ganesan Pillai Department Computer Science Montana State University Bozeman {atanu.roy, k.ganeshanpillai}@cs.montana.edu VLSI

More information

Calibrating Achievable Design GSRC Annual Review June 9, 2002

Calibrating Achievable Design GSRC Annual Review June 9, 2002 Calibrating Achievable Design GSRC Annual Review June 9, 2002 Wayne Dai, Andrew Kahng, Tsu-Jae King, Wojciech Maly,, Igor Markov, Herman Schmit, Dennis Sylvester DUSD(Labs) Calibrating Achievable Design

More information

Reduce Your System Power Consumption with Altera FPGAs Altera Corporation Public

Reduce Your System Power Consumption with Altera FPGAs Altera Corporation Public Reduce Your System Power Consumption with Altera FPGAs Agenda Benefits of lower power in systems Stratix III power technology Cyclone III power Quartus II power optimization and estimation tools Summary

More information

Comprehensive Place-and-Route Platform Olympus-SoC

Comprehensive Place-and-Route Platform Olympus-SoC Comprehensive Place-and-Route Platform Olympus-SoC Digital IC Design D A T A S H E E T BENEFITS: Olympus-SoC is a comprehensive netlist-to-gdsii physical design implementation platform. Solving Advanced

More information

VLSI Lab Tutorial 3. Virtuoso Layout Editing Introduction

VLSI Lab Tutorial 3. Virtuoso Layout Editing Introduction VLSI Lab Tutorial 3 Virtuoso Layout Editing Introduction 1.0 Introduction The purpose of this lab tutorial is to guide you through the design process in creating a custom IC layout for your CMOS inverter

More information

How Much Logic Should Go in an FPGA Logic Block?

How Much Logic Should Go in an FPGA Logic Block? How Much Logic Should Go in an FPGA Logic Block? Vaughn Betz and Jonathan Rose Department of Electrical and Computer Engineering, University of Toronto Toronto, Ontario, Canada M5S 3G4 {vaughn, jayar}@eecgutorontoca

More information

OPTIMISATION OF PIN FIN HEAT SINK USING TAGUCHI METHOD

OPTIMISATION OF PIN FIN HEAT SINK USING TAGUCHI METHOD CHAPTER - 5 OPTIMISATION OF PIN FIN HEAT SINK USING TAGUCHI METHOD The ever-increasing demand to lower the production costs due to increased competition has prompted engineers to look for rigorous methods

More information

Computer-Based Project on VLSI Design Co 3/7

Computer-Based Project on VLSI Design Co 3/7 Computer-Based Project on VLSI Design Co 3/7 IC Layout and Symbolic Representation This pamphlet introduces the topic of IC layout in integrated circuit design and discusses the role of Design Rules and

More information

Animation of VLSI CAD Algorithms A Case Study

Animation of VLSI CAD Algorithms A Case Study Session 2220 Animation of VLSI CAD Algorithms A Case Study John A. Nestor Department of Electrical and Computer Engineering Lafayette College Abstract The design of modern VLSI chips requires the extensive

More information

A Path Based Algorithm for Timing Driven. Logic Replication in FPGA

A Path Based Algorithm for Timing Driven. Logic Replication in FPGA A Path Based Algorithm for Timing Driven Logic Replication in FPGA By Giancarlo Beraudo B.S., Politecnico di Torino, Torino, 2001 THESIS Submitted as partial fulfillment of the requirements for the degree

More information

Graph Models for Global Routing: Grid Graph

Graph Models for Global Routing: Grid Graph Graph Models for Global Routing: Grid Graph Each cell is represented by a vertex. Two vertices are joined by an edge if the corresponding cells are adjacent to each other. The occupied cells are represented

More information

Three-Dimensional Cylindrical Model for Single-Row Dynamic Routing

Three-Dimensional Cylindrical Model for Single-Row Dynamic Routing MATEMATIKA, 2014, Volume 30, Number 1a, 30-43 Department of Mathematics, UTM. Three-Dimensional Cylindrical Model for Single-Row Dynamic Routing 1 Noraziah Adzhar and 1,2 Shaharuddin Salleh 1 Department

More information

An integrated placement and routing approach

An integrated placement and routing approach Retrospective Theses and Dissertations 2006 An integrated placement and routing approach Min Pan Iowa State University Follow this and additional works at: http://lib.dr.iastate.edu/rtd Part of the Electrical

More information

Thermal-Aware 3D IC Placement Via Transformation

Thermal-Aware 3D IC Placement Via Transformation Thermal-Aware 3D IC Placement Via Transformation Jason Cong, Guojie Luo, Jie Wei and Yan Zhang Department of Computer Science University of California, Los Angeles Los Angeles, CA 90095 Email: { cong,

More information

A Path Decomposition Approach for Computing Blocking Probabilities in Wavelength-Routing Networks

A Path Decomposition Approach for Computing Blocking Probabilities in Wavelength-Routing Networks IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 8, NO. 6, DECEMBER 2000 747 A Path Decomposition Approach for Computing Blocking Probabilities in Wavelength-Routing Networks Yuhong Zhu, George N. Rouskas, Member,

More information

Genetic Algorithm for Circuit Partitioning

Genetic Algorithm for Circuit Partitioning Genetic Algorithm for Circuit Partitioning ZOLTAN BARUCH, OCTAVIAN CREŢ, KALMAN PUSZTAI Computer Science Department, Technical University of Cluj-Napoca, 26, Bariţiu St., 3400 Cluj-Napoca, Romania {Zoltan.Baruch,

More information

342 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 17, NO. 3, MARCH /$ IEEE

342 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 17, NO. 3, MARCH /$ IEEE 342 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 17, NO. 3, MARCH 2009 Custom Networks-on-Chip Architectures With Multicast Routing Shan Yan, Student Member, IEEE, and Bill Lin,

More information

An Interconnect-Centric Design Flow for Nanometer. Technologies

An Interconnect-Centric Design Flow for Nanometer. Technologies An Interconnect-Centric Design Flow for Nanometer Technologies Jason Cong Department of Computer Science University of California, Los Angeles, CA 90095 Abstract As the integrated circuits (ICs) are scaled

More information

AN AUTOMATED INTERCONNECT DESIGN SYSTEM

AN AUTOMATED INTERCONNECT DESIGN SYSTEM AN AUTOMATED INTERCONNECT DESIGN SYSTEM W. E. Pickrell Automation Systems, Incorporated INTRODUCTION This paper describes a system for automatically designing and producing artwork for interconnect surfaces.

More information

TCG-Based Multi-Bend Bus Driven Floorplanning

TCG-Based Multi-Bend Bus Driven Floorplanning TCG-Based Multi-Bend Bus Driven Floorplanning Tilen Ma Department of CSE The Chinese University of Hong Kong Shatin, N.T. Hong Kong Evangeline F.Y. Young Department of CSE The Chinese University of Hong

More information

Automated Extraction of Physical Hierarchies for Performance Improvement on Programmable Logic Devices

Automated Extraction of Physical Hierarchies for Performance Improvement on Programmable Logic Devices Automated Extraction of Physical Hierarchies for Performance Improvement on Programmable Logic Devices Deshanand P. Singh Altera Corporation dsingh@altera.com Terry P. Borer Altera Corporation tborer@altera.com

More information

VERY large scale integration (VLSI) design for power

VERY large scale integration (VLSI) design for power IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 7, NO. 1, MARCH 1999 25 Short Papers Segmented Bus Design for Low-Power Systems J. Y. Chen, W. B. Jone, Member, IEEE, J. S. Wang,

More information

Computing Submesh Reliability in Two-Dimensional Meshes

Computing Submesh Reliability in Two-Dimensional Meshes Computing Submesh Reliability in Two-Dimensional Meshes Chung-yen Chang and Prasant Mohapatra Department of Electrical and Computer Engineering Iowa State University Ames, Iowa 511 E-mail: prasant@iastate.edu

More information

MAPLE: Multilevel Adaptive PLacEment for Mixed Size Designs

MAPLE: Multilevel Adaptive PLacEment for Mixed Size Designs MAPLE: Multilevel Adaptive PLacEment for Mixed Size Designs Myung Chul Kim, Natarajan Viswanathan, Charles J. Alpert, Igor L. Markov, Shyam Ramji Dept. of EECS, University of Michigan IBM Corporation 1

More information

1120 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 11, NO. 6, DECEMBER 2003

1120 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 11, NO. 6, DECEMBER 2003 1120 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 11, NO. 6, DECEMBER 2003 Fixed-Outline Floorplanning: Enabling Hierarchical Design Saurabh N. Adya, Member, IEEE, and Igor L.

More information

MODULAR PARTITIONING FOR INCREMENTAL COMPILATION

MODULAR PARTITIONING FOR INCREMENTAL COMPILATION MODULAR PARTITIONING FOR INCREMENTAL COMPILATION Mehrdad Eslami Dehkordi, Stephen D. Brown Dept. of Electrical and Computer Engineering University of Toronto, Toronto, Canada email: {eslami,brown}@eecg.utoronto.ca

More information

Statistical Timing Analysis Using Bounds and Selective Enumeration

Statistical Timing Analysis Using Bounds and Selective Enumeration IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 22, NO. 9, SEPTEMBER 2003 1243 Statistical Timing Analysis Using Bounds and Selective Enumeration Aseem Agarwal, Student

More information