High-Density Integration of Functional Modules Using Monolithic 3D-IC Technology

Shreepad Panth (1), Kambiz Samadi (2), Yang Du (2), and Sung Kyu Lim (1)
(1) Dept. of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA
(2) Qualcomm Research, San Diego, CA

Abstract

Three-dimensional integrated circuits (3D-ICs) have emerged as a promising solution to continue device scaling. They can be realized using through-silicon vias (TSVs), or by monolithic integration using monolithic inter-tier vias (MIVs), an emerging alternative that provides much higher via densities. In this paper, we provide a framework for floorplanning existing IP blocks into 3D-ICs using MIVs. We take the floorplanning solution all the way through place-and-route and report post-layout metrics for area, wirelength, timing, and power consumption. Results show that TSV-based 3D designs outperform 2D designs in wirelength by up to 14%, and only in large-scale circuits. MIV-based 3D designs, however, offer an average wirelength improvement of 33% for a wide range of benchmark circuits. We also show that while TSV-based 3D cannot improve performance and power unless the TSV capacitance is reduced, MIV-based 3D offers significant reductions of up to 33% in the longest path delay and 35% in the inter-block net power.

I. INTRODUCTION

Three-dimensional integrated circuits (3D-ICs) have emerged as a promising solution to extend the scaling trajectory predicted by Moore's Law. Currently, through-silicon vias (TSVs) enable 3D-ICs by allowing vertical stacking of multiple dies that are fabricated separately. However, the quality of TSV-based 3D-ICs strongly depends on the TSV dimensions and parasitics, and such designs are limited to memory-on-logic or large logic-on-logic designs with a relatively small number of global interconnects [1], [2].

Fig. 1. A sample monolithic 3D technology with three metal layers per tier.
An emerging alternative to TSV-based 3D is monolithic 3D, which enables orders of magnitude higher integration density than TSV-based technology because of the extremely small size of the monolithic inter-tier vias (MIVs). Monolithic 3D integration fabricates two or more tiers of devices sequentially, instead of bonding two previously fabricated dies using micro bumps and TSVs. Figure 1 shows a typical monolithic 3D structure with three metal layers per tier. The two device tiers are connected by inter-tier vias, which are essentially the same size as intra-tier vias. To fabricate the top device tier, a low-thermal-budget process must be used to prevent damage to the underlying tier's back-end-of-line (BEOL).

Several monolithic 3D integration processes are currently under development. CEA/LETI [3], [4] has developed a sequential integration flow based on a low-temperature bonding process. Samsung [5] has developed an S3 technology for a 3-tier SRAM cell using a low-thermal TFT process. Overall, MIVs provide better electrical characteristics than TSVs (e.g., lower parasitics and less electrical coupling), and also enable higher integration densities due to their small size.

In this paper, we propose an efficient 3D design space exploration framework (i.e., 3D floorplanning) that accounts for the different characteristics of TSV-based and monolithic 3D integration technologies. Since re-designing existing logic, memory, and IP blocks for 3D incurs significant design overhead and cost, near-term 3D-ICs will focus on reusing existing blocks [6], [7], [8]. We therefore present a floorplanning framework that uses different optimization objectives, based on the physical characteristics of both TSVs and MIVs. To the best of our knowledge, this paper is the first to provide a 3D floorplanning framework specifically for monolithic 3D integration technology.
To further enhance the applicability of our approach, we integrate our proposed 3D block-level floorplanning framework with existing commercial place and route (P&R) tools to assess the solution quality more accurately. We use four testcases of varying complexity, ranging from 33K to 1.7M gates in a 45nm technology, to show the impact of our proposed methodology. The contributions of our work are listed below.

(1) We propose and develop an integrated 3D block-level floorplanning framework with appropriate objective functions for both TSV-based and monolithic 3D to enable efficient 3D-IC design space exploration.
(2) In addition to using simulated annealing for floorplanning, we propose a post-floorplan refinement (PFPR) heuristic that achieves an average reduction of 6.13% in inter-block wirelength with respect to the initial floorplan.
(3) We propose a methodology for MIV planning that relies on custom scripts and existing commercial P&R tools.
(4) We develop a methodology that takes the obtained 3D floorplan all the way through place and route, and is capable of reporting post-layout timing and power numbers.

II. RELATED WORK

Monolithic design for high-performance ICs was presented in [9]. That work presented two design styles: one in which PMOS and NMOS devices are fabricated on separate layers, and another in which standard cells have both PMOS and NMOS devices in the same tier. The authors also presented a placement algorithm to fully utilize high-density MIVs. Although the stackup is similar to ours, that study was carried out at the gate level, and its placement algorithm is not applicable to block-level monolithic 3D-ICs. Design for monolithic 3D SRAM was carried out in [10]. The authors provided different design styles of the SRAM cell, assuming separate PMOS and NMOS tiers, and compared them with respect to static noise margin, write margin, and data retention voltage.
While several prior works exist on adding TSVs at the gate level or core level, only a few works consider adding TSVs into existing whitespace between blocks at the floorplanning stage. Simultaneous buffering and TSV planning was carried out in [11], but the authors reported inaccurate 3D HPWL and timing metrics. An improved algorithm was presented in [7], but the same inaccurate HPWL metric was used. Results based on an improved BB-HPWL metric were presented in [8], and the most accurate HPWL metric, based on subnets, was presented in [6]. However, none of these papers compared the quality of their engine with that of a commercially available tool, or took the obtained floorplans through place and route and reported post-layout numbers. This paper addresses these shortcomings, so the numbers reported here are the most accurate. To the best of our knowledge, this is the first work to fully exploit the high density offered by monolithic 3D integration, use a validated floorplanner to perform block-level monolithic 3D design, and compare post-layout 3D wirelength, timing, and power numbers with those of a commercial tool.

/13/$ IEEE

III. 3D FLOORPLANNING WITH MONOLITHIC INTER-TIER VIAS

A. Problem Formulation and Overview

A general form of the 3D floorplanning problem can be stated as follows: given the number of desired tiers and a set of blocks along with their widths and heights, determine the (x, y, z) location of each block and of all MIVs/TSVs.

Fig. 2. The design flow to obtain a 3D floorplan with TSV/MIV insertion. Flow steps: center-to-center based annealing; update with pin locations; annealing-based refinement; Monolithic? (No: TSV planning. Yes: create Verilog and DEF files with pins, route with Encounter, extract MIV location and connectivity); create Verilog/DEF file for each die. Legend: existing work, custom program, Cadence Encounter.

The overall design flow is shown in Figure 2. We first perform floorplanning to determine the location of all the blocks, assuming the pins are placed at the block centers. Once the locations of all the blocks are determined, we update the locations of the pins and perform a refinement step (i.e., PFPR) to further minimize wirelength. Depending on whether we are dealing with TSVs or MIVs, we use a different via planning engine. Finally, we create a separate Verilog file for each die/tier with the corresponding connectivity information, and a design exchange format (DEF) file with the locations of blocks and TSVs/MIVs. Each of these steps is explained further in the following subsections.

B. Floorplanning Engine

In this step, we take the description of all the blocks as well as the connectivity information, and generate a floorplan that minimizes a cost function that depends on whether we are using TSV-based or monolithic 3D. We use a simulated annealing engine similar to [6], maintaining a separate sequence pair for each die. We perform the following moves during the annealing process: (1) change the aspect ratio of a block (or rotate it in the case of hard blocks), (2) swap two blocks in either the positive sequence, the negative sequence, or both, and (3) move or swap two blocks between a pair of dies/tiers.

In TSV-based 3D, we need to control the number of TSVs because of their significant silicon area. Hence, the TSV-based 3D cost function is given as follows.

$C_{TSV} = \alpha \cdot WL + \beta \cdot A + \gamma \cdot N_{TSV}$    (1)

In the above equation, $WL$ represents the inter-block wirelength, $A$ represents the chip area, and $N_{TSV}$ represents the number of TSVs. In monolithic 3D, however, the MIV size is negligible and we do not need to constrain the number of MIVs, which opens up the possibility for further optimization. The monolithic 3D cost function is given as follows.

$C_{MIV} = \alpha \cdot WL + \beta \cdot A$    (2)

Considering the pin locations of the blocks during floorplanning would require an extra step to compute the physical location of all block-pins. Since the number of block-pins is quite large, this would lead to a large runtime overhead. We instead propose a post-floorplan refinement (PFPR) step that considers pin locations once block locations have been determined.

C. Post-Floorplan Refinement (PFPR)

After we determine the relative locations of all the blocks, we update the blocks with the pin locations. Each block has 8 possible orientations: 0°, 90°, 180°, 270°, and their flipped counterparts. Without changing the relative locations of the blocks in the floorplan, each block can only take four of these orientations. If the pins are in the center of a block, the four orientations are all equivalent (0° and 180°, or 90° and 270°, together with their flipped counterparts). However, if the pins are placed along the periphery, each of the four orientations gives a different wirelength. The goal of this step is to determine the orientation of each block such that the wirelength is minimized. To do this, we use simulated annealing in which the only allowed operation is to change a block's orientation among the four allowed choices. No sequence pair is necessary, as the relative locations of blocks do not change. Furthermore, wirelength can be computed incrementally, as we only change one block at a time.

D. MIV Planning Algorithm

Once we obtain the 3D floorplanning result, we need to insert TSVs, or MIVs in the case of monolithic 3D, to connect blocks in different tiers. Since TSVs are big (around 5μm to 10μm) and we may not have enough whitespace in the dies, a whitespace manipulation step is required. We use an existing TSV planner [6] that constructs a 3D rectilinear Steiner tree (RST) from a rectilinear Steiner minimum tree (RSMT), and then moves TSVs to nearby whitespace based on a network-flow formulation. If there is insufficient whitespace, we insert whitespace at the desired locations. In the case of monolithic 3D, however, MIVs are very small (around 70nm), and hence we can safely assume that there is always whitespace available for MIV insertion. In this case, we can use existing obstacle-avoiding routers to perform MIV insertion.
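As an illustration of the post-floorplan refinement step of Section III-C, the following minimal Python sketch anneals over block orientations only, with block positions held fixed. It is a sketch under stated assumptions, not the authors' implementation: the four allowed orientations are taken to be the footprint-preserving ones (named here with the DEF-style codes N, S, FN, FS), the function names, cooling schedule, and example data are hypothetical, and wirelength is recomputed from scratch rather than incrementally for brevity.

```python
import math
import random

# Assumption: the four allowed orientations are those that keep the w x h footprint.
ORIENTATIONS = ("N", "S", "FN", "FS")

def pin_position(block, orient, pin):
    """Absolute (x, y) of a pin; block = (x0, y0, w, h), pin = offset in orientation N."""
    x0, y0, w, h = block
    px, py = pin
    if orient in ("S", "FN"):   # 180-degree rotation or mirror about the vertical axis
        px = w - px
    if orient in ("S", "FS"):   # 180-degree rotation or mirror about the horizontal axis
        py = h - py
    return x0 + px, y0 + py

def hpwl(nets, blocks, orients, pins):
    """Half-perimeter wirelength summed over all nets (list of (block, pin) pairs)."""
    total = 0.0
    for net in nets:
        pts = [pin_position(blocks[b], orients[b], pins[b][p]) for b, p in net]
        xs, ys = zip(*pts)
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total

def refine_orientations(nets, blocks, pins, steps=2000, t0=10.0, alpha=0.995, seed=0):
    """Simulated annealing whose only move is re-orienting one block."""
    rng = random.Random(seed)
    names = sorted(blocks)
    orients = {b: "N" for b in names}
    cost = hpwl(nets, blocks, orients, pins)
    best, best_cost = dict(orients), cost
    t = t0
    for _ in range(steps):
        b = rng.choice(names)
        old = orients[b]
        orients[b] = rng.choice([o for o in ORIENTATIONS if o != old])
        new_cost = hpwl(nets, blocks, orients, pins)
        # Accept improvements always, and uphill moves with Boltzmann probability.
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / t):
            cost = new_cost
            if cost < best_cost:
                best, best_cost = dict(orients), cost
        else:
            orients[b] = old    # reject: restore the previous orientation
        t *= alpha
    return best, best_cost

# Hypothetical example: two 10x10 blocks whose pins initially face away from each other.
blocks = {"A": (0, 0, 10, 10), "B": (20, 0, 10, 10)}
pins = {"A": {"p": (1, 5)}, "B": {"p": (9, 5)}}
nets = [[("A", "p"), ("B", "p")]]
initial = hpwl(nets, blocks, {"A": "N", "B": "N"}, pins)
best, best_cost = refine_orientations(nets, blocks, pins)
print(initial, best_cost, best)
```

In this toy case the refinement mirrors both blocks so that their pins face each other. A production version would update only the nets touching the re-oriented block, which is what makes the incremental wirelength computation mentioned above cheap.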
We use the IC router in Cadence SOC Encounter. Since it is limited to 15 metal layers, we use three metal layers to represent a given tier, for the MIV planning stage only. This allows us to represent up to 5 tiers. For example, if a block is in tier 2, we use metal layer 4 to place block-pins, and metal layers 5 and 6 to represent inter-block routing on that tier. Vias between metal 6 and 7 represent MIVs between tiers 2 and 3. Our choice of the number of metal layers is justified because we only route the inter-block nets in our block-level monolithic 3D designs, and they are routed in the top 2 or 3 metal layers of each tier.

Algorithm 1: MIV Planning Algorithm
Input : Location of all blocks in B, block orientation, block-pin locations, and connectivity information
Output: Number, location, and connectivity information of MIVs
 1  for n ← 1 to N_net do
 2      add connectivity information into a Verilog file;
 3  end
 4  for i ← 1 to |B| do
 5      for p ← 1 to N_pin^{b_i} do
 6          add pin physical location (x_p^{b_i}, y_p^{b_i}, l^{b_i}) in the DEF;
 7      end
 8      add routing blockage for b_i on its assigned layer l^{b_i};
 9  end
10  read the above Verilog and DEF files into SOC Encounter;
11  route the design and save the routed DEF file;
12  read the routed DEF file and reconstruct the routing graphs;
13  extract corresponding subnets in each die/tier from the routing graphs;
14  create Verilog file for each die/tier with subnet connectivity;
15  create DEF file for each die/tier with MIV locations;

TABLE I. DESIGN STATISTICS FOR ALL BENCHMARKS
Columns: Design; # Gates; # Blk; # Inter-blk nets; Intra-blk WL (μm); Target period (ns)
des_perf: 33, , ,
cf_rca_16: , ,135 1,210,
cf_fft_256_8: , ,402 4,490,
mult: ,639, ,471 12,354,

Our MIV planning heuristic starts by creating a netlist that contains the connectivity information of the pins of all the 3D nets, as shown in Lines 1 to 3 of Algorithm 1, where N_net denotes the total number of 3D nets. We then create a DEF file that contains the physical location of every pin of each block; x_p^{b_i} and y_p^{b_i} denote the x and y coordinates of pin p of block b_i, respectively, and l^{b_i} denotes the metal layer that block b_i is assigned to. In addition, we add routing blockages for each block to account for (1) the fact that MIVs cannot be placed within the blocks and (2) the internal wiring of each block (Lines 4 to 9). Next, we give the Verilog and DEF files to SOC Encounter to route all the 3D nets simultaneously (Lines 10 and 11).
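The tier-to-layer convention used for MIV planning (three router layers per tier, with the 15-layer router limit) can be made concrete with a small sketch. The function name and return format below are hypothetical and chosen only for illustration; the mapping itself follows the tier 2 example given in the text.

```python
LAYERS_PER_TIER = 3      # router layers used to represent one tier (MIV planning only)
MAX_ROUTER_LAYERS = 15   # layer limit of the router, as stated in the text

def tier_layers(tier):
    """Map a 1-indexed tier to (pin_layer, routing_layers, miv_via_layers).

    miv_via_layers is the pair of metal layers whose connecting via stands for
    an MIV to the next tier, or None for the topmost representable tier.
    """
    max_tiers = MAX_ROUTER_LAYERS // LAYERS_PER_TIER
    if not 1 <= tier <= max_tiers:
        raise ValueError(f"tier must be between 1 and {max_tiers}")
    base = (tier - 1) * LAYERS_PER_TIER
    pin_layer = base + 1                      # block-pins sit on the lowest layer
    routing_layers = (base + 2, base + 3)     # inter-block routing on this tier
    miv_via_layers = (base + 3, base + 4) if tier < max_tiers else None
    return pin_layer, routing_layers, miv_via_layers

# Tier 2: pins on metal 4, routing on metal 5 and 6, MIVs as vias between metal 6 and 7.
print(tier_layers(2))
```

With this convention, counting the vias between the top layer of one group and the bottom layer of the next in the routed DEF gives the number of MIVs between the corresponding pair of tiers.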
Routing all 3D nets simultaneously avoids possible congestion issues, which is feasible because of the small size of MIVs. Once we obtain the routed DEF, we trace the routing topology to determine (1) which MIV belongs to which net, and (2) which block-pin each MIV connects to (Lines 12 and 13). Finally, we generate the Verilog and DEF files for each tier (Lines 14 and 15), which contain the block and MIV locations.

IV. EVALUATION

A. Experimental Setup

All required code and scripts are implemented in C/C++ and Python, and all experiments are carried out on a 2.5 GHz 64-bit Linux system. The 45nm Nangate open source standard cell library is used in our experiments. The TSV diameter, landing pad size, pitch, and thickness are assumed to be 6μm, 7μm, 10μm, and 50μm, respectively. The MIV diameter, pitch, and thickness are 0.07μm, 0.28μm, and 0.31μm, respectively. The TSV resistance and capacitance are 50mΩ and 122fF, respectively. These parasitics are measured values, taken from [12]. The MIV resistance and capacitance are similar to those of local vias, and are 4Ω and 1fF, respectively. The monolithic structure is similar to that of Figure 1, except that we use six metal layers per tier.

Fig. 3. Our design flow used to get post-layout simulation results.

We consider four benchmarks in this work, statistics of which are shown in Table I. The first three are taken from the OpenCores benchmark suite [13], and the fourth is a custom-built 256-bit integer multiplier. This multiplier is built out of 256x4-bit multiplier and 512-bit adder blocks, arranged into an adder tree. Each multiplier block has 3 pipeline stages and each adder block has 4 pipeline stages. The design flow used to obtain all results is shown in Figure 3. It consists of roughly two steps: block design, and top-level design and analysis.

1) Block Design: We begin by designing each block separately in Cadence SOC Encounter.
The netlist for each block is obtained by grouping modules bottom-up along the hierarchy until they reach a certain area threshold. Timing constraints for each block depend on the overall system frequency, and are determined by context characterization. Each block is then placed, routed, and timing optimized in SOC Encounter. This step finalizes the pin locations within each block. We choose four blocks at random from the cf_rca_16 testcase and show their layouts in Figure 4.

2) Top-level Design and Analysis: We perform floorplanning using the methodology described in Section III-B. Three different floorplanning methodologies are considered. The first two, (1) TSV-based 3D (TSV) and (2) monolithic 3D (MIV), have already been described. The third one, MIV TF, is obtained by using the same floorplan output as in the TSV case (before whitespace insertion), but with the MIV-planning engine instead of the TSV-planning engine. This compares the quality of the two via planning methodologies starting from the same floorplan. The number of MIVs used in MIV TF can be larger than the number of TSVs, because multi-pin nets might use far more MIVs due to their small size. Some sample layouts for 2D floorplanning and 2-die implementations of cf_rca_16 are shown in Figure 4. We next route each die separately in SOC Encounter, and perform parasitic extraction to obtain the SPEF files for each die. In addition, we create a top-level Verilog file with the interconnections between dies, and a top-level SPEF file with the TSV/MIV parasitics. All netlist and parasitic information is then fed into Synopsys PrimeTime to obtain true 3D timing and power numbers.

Fig. 4. Some sample layouts for the cf_rca_16 testcase, along with select block designs, and zoomed-in shots of TSVs and MIVs.

B. Experimental Results and Discussions

1) Floorplanner Validation: We run our floorplanner in 2D mode, and compare it with the results obtained from wirelength-driven floorplanning in Cadence Encounter. The Encounter footprint area is obtained by gradually increasing the area and running floorplanning until no block overlap is observed. The results are summarized in Table II. The large area reduction in the cf_fft_256_8 design is due to the fact that Cadence Encounter repeatedly produces module overlaps when provided with a smaller area. This is presumably due to some bug in the legalization stage of SOC Encounter. It can still provide wirelength comparable to that of our floorplanner, however, as this particular testcase is only locally connected, and each block communicates with only one or two neighbours. As seen from Table II, our floorplanner produces results comparable to those of SOC Encounter.

TABLE II. A COMPARISON OF THE PERFORMANCE OF OUR FLOORPLANNER AND CADENCE ENCOUNTER
                  Footprint (mm^2)         Inter-block WL (m)
Design            Encounter   Ours         Encounter   Ours
des_perf          (1.00)      (0.92)       (1.00)      (1.01)
cf_rca_16         (1.00)      (0.93)       (1.00)      (1.02)
cf_fft_256_8      (1.00)      (0.68)       (1.00)      (1.06)
mult              (1.00)      (0.94)       (1.00)      (1.05)
Average

2) Comparison of 2D versus 3D: In this section, we compare the wirelength, timing, and top-level net power of the 2D and 3D cases of all designs. The clock period assumed for total negative slack (TNS) and power calculation is taken from Table I. The different components of net power are explained in Figure 5. We have intra-block nets and inter-block nets. The inter-block net power is further split into three components: (1) the intra-block component (OBN-Int.), (2) the inter-block component (OBN-Top), and (3) the pin component of the loading cell (OBN-Pin). At the block level, the only component of net power that can be optimized is OBN-Top. Furthermore, since we do not have a true 3D timing optimization engine, we report pre-optimization timing and power numbers. The results for all designs are summarized in Table III.

Fig. 5. Various components of net power reported in this paper.
From this table, we see that monolithic 3D gives a significant advantage in inter-block wirelength. The total wirelength reduction depends on the ratio of inter-block wirelength to intra-block wirelength, and varies from circuit to circuit. TSV-based 3D design, however, does not give any wirelength improvement for the small design des_perf, and we only start to see small improvements in the cf_rca_16 and cf_fft_256_8 testcases. With the largest design, we again see no improvement, mainly because we need to travel a large distance to the nearest whitespace to place a TSV. Also, as expected, MIV TF gives better wirelength than the TSV-based method, but worse than the MIV case.

With respect to timing and net power, we see that the MIV case improves the longest path delay (LPD), the total negative slack (TNS), and the top-level net power. The timing of MIV TF is sometimes better than the timing of MIV, as wirelength-driven floorplanning does not guarantee the best timing. In the benchmarks considered, except for the 2-die case of cf_fft_256_8, the TSV case does not give any timing or power improvement over 2D. This is because the large 122fF TSV capacitance is equivalent to more than 700μm of Metal 10 wire in the 45nm technology, and a significant number of such long wires must be shortened before an appreciable reduction is seen.

In general, the reduction in top net power of MIV follows the reduction in top net wirelength. The only exception is mult. Here we see that our 2D design has 43% more power than the Encounter design, with only 5% more wirelength. This is because power consumption depends on the wirelength distribution, and our floorplanner results in solutions in which the longer nets have higher switching activity.
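The comparison between the 122fF TSV capacitance and roughly 700μm of wire can be turned into a quick back-of-the-envelope estimate. The sketch below treats "more than 700μm" as exactly 700μm, which gives an upper bound on the implied wire capacitance per unit length, and it ignores wire resistance, driver strength, and pin loads. The net lengths in the example are hypothetical and chosen only to illustrate the break-even argument.

```python
TSV_CAP_FF = 122.0          # TSV capacitance from the experimental setup
MIV_CAP_FF = 1.0            # MIV capacitance from the experimental setup
TSV_EQUIV_WIRE_UM = 700.0   # assumption: "more than 700 um" taken as exactly 700 um

# Implied wire capacitance per micron (an upper bound under the assumption above).
WIRE_CAP_FF_PER_UM = TSV_CAP_FF / TSV_EQUIV_WIRE_UM

# Wire length whose capacitance equals that of a single MIV.
MIV_EQUIV_WIRE_UM = MIV_CAP_FF / WIRE_CAP_FF_PER_UM

def net_cap_ff(wire_um, n_vias=0, via_cap_ff=0.0):
    """Lumped capacitance of a net: wire capacitance plus inter-tier via capacitance."""
    return wire_um * WIRE_CAP_FF_PER_UM + n_vias * via_cap_ff

# Hypothetical net: 1000 um in 2D, shortened to 500 um in 3D using one inter-tier via.
cap_2d = net_cap_ff(1000.0)
cap_tsv = net_cap_ff(500.0, n_vias=1, via_cap_ff=TSV_CAP_FF)
cap_miv = net_cap_ff(500.0, n_vias=1, via_cap_ff=MIV_CAP_FF)

print(f"wire cap ~ {WIRE_CAP_FF_PER_UM:.3f} fF/um, one MIV ~ {MIV_EQUIV_WIRE_UM:.1f} um of wire")
print(f"2D: {cap_2d:.1f} fF, TSV 3D: {cap_tsv:.1f} fF, MIV 3D: {cap_miv:.1f} fF")
```

Under these assumptions, a TSV only reduces the net capacitance when it saves more than about 700μm of wire, whereas an MIV pays for itself after only a few microns. This is consistent with the observation that TSV-based 3D needs many long wires to show a benefit.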
Therefore, we conclude that monolithic 3D can provide significant benefits over 2D even in the case of small designs, while TSV-based 3D is suitable for designs with a large number of long interconnections or for memory-on-logic stacking applications; in the case of logic-on-logic, an improvement will be observed only with smaller TSV parasitics.

3) Power benefit of monolithic 3D: We provide a detailed pre-optimization power breakdown of all four testcases in Table IV, with the legend explained in Figure 5. We compare 2D with MIV-based 3D, and also provide a reference case of ideal interconnections. This ideal case does not correspond to any real physical scenario, but represents the theoretical minimum power consumption at the block level. The values are obtained by setting the parasitics of the OBN-Top nets to zero in PrimeTime.

TABLE III. A COMPARISON OF WIRELENGTH, TIMING AND TOP NET POWER OF 2D VERSUS 3D
Columns: Footprint (μm × μm); Normalised Si. area; #MIV/#TSV; Inter-block routed WL (μm); Total routed WL (μm); LPD (ns); TNS (ns); OBN-Top power (mW)

des_perf
Encounter: 256x ,805 (1.00) 563,293 (1.00) 1.65 (1.00) (1.00) (1.00)
Ours: 251x ,489 (1.01) 566,977 (1.01) 1.73 (1.05) (1.21) (1.06)
MIV, 2 Dies: 146x , ,678 (0.76) 478,166 (0.85) 1.44 (0.87) (0.58) 8.55 (0.76)
MIV, 3 Dies: 127x , ,240 (0.63) 432,728 (0.77) 1.23 (0.74) (0.35) 7.29 (0.65)
MIV, 4 Dies: 111x , ,868 (0.58) 415,356 (0.74) 1.10 (0.67) (0.17) 6.41 (0.57)
TSV, 2 Dies: 215x ,092 (1.34) 683,580 (1.21) 2.18 (1.32) (1.92) (1.88)
TSV, 3 Dies: 320x ,267 (1.46) 725,755 (1.29) 2.46 (1.49) (3.33) (2.69)
TSV, 4 Dies: 359x ,739 (2.08) 945,227 (1.68) 4.09 (2.48) (4.37) (4.28)
MIV TF, 2 Dies: 213x ,823 (1.05) 581,311 (1.03) 2.06 (1.25) (1.35) (1.13)
MIV TF, 3 Dies: 211x ,226 (1.00) 563,714 (1.00) 1.65 (1.00) (1.2) (1.04)
MIV TF, 4 Dies: 186x , ,356 (0.68) (0.80) 1.25 (0.75) (0.40) 7.29 (0.65)

cf_rca_16
Encounter: 667x ,673 (1.00) 1,572,291 (1.00) 1.85 (1.00) -2, (1.00) 4.71 (1.00)
Ours: 555x ,542 (1.02) 1,578,160 (1.00) 1.75 (0.95) -2, (0.78) 4.73 (1.00)
MIV, 2 Dies: 416x , ,156 (0.80) 1,499,774 (0.95) 1.73 (0.94) -1, (0.71) 3.74 (0.79)
MIV, 3 Dies: 367x , ,910 (0.71) 1,466,258 (0.93) 1.72 (0.93) -1, (0.63) 3.61 (0.77)
MIV, 4 Dies: 273x , ,583 (0.67) 1,451,201 (0.92) 1.69 (0.92) -1, (0.57) 3.37 (0.72)
TSV, 2 Dies: 484x ,347 (1.07) 1,564,965 (1.00) 2.48 (1.34) -11,093 (4.02) 7.49 (1.59)
TSV, 3 Dies: 377x ,425 (1.11) 1,612,043 (1.03) 3.23 (1.75) -16,074 (5.82) (2.44)
TSV, 4 Dies: 350x ,090 (0.95) 1,555,708 (0.99) 3.63 (1.97) -18,825 (6.81) 13.4 (2.85)
MIV TF, 2 Dies: 438x ,631 (0.89) 1,534,249 (0.98) 1.79 (0.97) -2,463.8 (0.89) 4.12 (0.87)
MIV TF, 3 Dies: 375x ,093 (0.78) 1,491,711 (0.95) 1.65 (0.90) -1, (0.49) 3.7 (0.79)
MIV TF, 4 Dies: 317x ,092 (0.73) 1,473,710 (0.94) 1.66 (0.90) (0.45) 3.45 (0.73)

cf_fft_256_8
Encounter: 1,300x1, ,674 (1.00) 4,904,487 (1.00) 2.18 (1.00) -22,308 (1.00) 7.7 (1.00)
Ours: 1,142x ,933 (1.06) 4,927,746 (1.00) 2.12 (0.97) -11,388 (0.51) 8.2 (1.06)
MIV, 2 Dies: 819x , ,787 (0.64) 4,754,600 (0.97) 1.96 (0.90) -3,618 (0.16) 5.3 (0.69)
MIV, 3 Dies: 581x , ,256 (0.61) 4,745,069 (0.97) 1.9 (0.87) -4,447 (0.20) 5.06 (0.66)
MIV, 4 Dies: 595x , ,049 (0.65) 4,759,862 (0.97) 1.85 (0.85) -4,023 (0.18) 5.29 (0.69)
TSV, 2 Dies: 679x ,166 (0.89) 4,859,979 (0.99) 2.1 (0.96) -14,655 (0.66) 9.22 (1.20)
TSV, 3 Dies: 653x ,592 (0.86) 4,848,405 (0.99) 2.47 (1.13) -34,950 (1.57) 11.1 (1.44)
TSV, 4 Dies: 584x ,216 (1.02) 4,913,029 (1.00) 3.22 (1.48) -67,602 (3.03) (2.16)
MIV TF, 2 Dies: 675x ,887 (0.87) 4,848,700 (0.99) 1.87 (0.86) -6,314 (0.28) 6.82 (0.89)
MIV TF, 3 Dies: 649x ,045 (0.82) 4,829,858 (0.98) 1.74 (0.80) -1,358 (0.06) 6.24 (0.81)
MIV TF, 4 Dies: 578x ,465 (0.75) 4,801,278 (0.98) 1.85 (0.85) -1,626 (0.07) 5.74 (0.75)

mult
Encounter: 2,280x2, ,089,968 (1.00) 29,444,308 (1.00) 1.12 (1.00) (1.00) (1.00)
Ours: 2,144x2, ,870,346 (1.05) 30,224,686 (1.03) 1.27 (1.14) (1.17) (1.43)
MIV, 2 Dies: 1,506x1, ,513 13,815,376 (0.81) 26,169,716 (0.89) 1.17 (1.05) (1.16) (1.01)
MIV, 3 Dies: 1,286x1, ,682 11,392,196 (0.67) 23,746,536 (0.81) 0.95 (0.85) (0.62) 125 (0.87)
MIV, 4 Dies: 1,177x1, ,994 10,116,222 (0.59) 22,470,562 (0.76) 0.97 (0.87) (0.60) (0.77)
TSV, 2 Dies: 1,608x1, ,683 18,825,744 (1.10) 31,180,084 (1.06) 1.76 (1.58) (2.04) (2.11)
TSV, 3 Dies: 1,508x1, ,599 21,184,404 (1.24) 33,538,744 (1.14) 2.02 (1.8) (3.87) (2.58)
TSV, 4 Dies: 1,240x1, ,232 20,890,062 (1.22) 33,244,402 (1.13) 2.45 (2.19) (4.37) (2.61)
MIV TF, 2 Dies: 1,601x1, ,162 16,127,948 (0.94) 28,482,288 (0.97) 1.06 (0.95) (0.95) (1.30)
MIV TF, 3 Dies: 1,501x1, , ,560,50 (0.89) 27,610,390 (0.94) 0.99 (0.88) (0.86) (1.25)
MIV TF, 4 Dies: 1,182x1, ,260 15,1246,51 (0.89) 27,478,991 (0.93) 1.12 (1.00) (1.07) (1.30)

Fig. 6. Timing slack histograms comparing 2D and MIV-based 3D (2 die) for the FFT benchmark. Negative slacks are shown in red, and positive slacks in green.
With a reduction in the wirelength of top-level nets, we expect a reduction in the following power components: (1) the inter-block component of inter-block nets (OBN-Top), and (2) the switching power of the standard cells driving inter-block nets. From Table IV, we see that even theoretically, only a 10% average reduction in the total power consumption is possible, and the reduction is larger for designs with relatively more inter-block nets. We also see that MIV-based 3D gives a 3.1% average reduction in the total power consumption across our four testcases. If we consider the parameter that is being optimized by floorplanning, i.e., OBN-Top, we see that a large reduction in power consumption is obtained by using monolithic 3D. The reduction in driving cell power is present in all testcases, but is most noticeable in mult, which has a huge number of driving cells.

Since we do not have a true 3D timing optimization tool, we cannot compare post-optimization numbers directly. However, we can predict the trend from the TNS reduction (Table III) and the timing slack histograms (shown for the cf_fft_256_8 testcase in Figure 6). Due to the average reduction of 51% in TNS, fewer buffer insertions and less cell upsizing will be required to meet timing. Also, since the entire slack histograms are shifted towards the right, techniques such as timing slack redistribution or multi-Vth design can be employed to achieve further power benefit.

TABLE IV. A DETAILED SPLIT UP OF THE POWER FOR 2D AND MONOLITHIC 3D (IN mW)
Columns in the table: Std. Cell; Leakage; IBN; OBN-Pin; OBN-Int.; OBN-Top; Total
Rows below list OBN-Top (normalized) and Total (mW, normalized).

des_perf
Ideal interconnections: (-)     50.1 (0.80)
Encounter:              (1.00)  62.5 (1.00)
Ours:                   (1.06)  63.3 (1.01)
MIV, 2 Dies:            (0.76)  59.5 (0.95)
MIV, 3 Dies:            (0.65)  58.1 (0.93)
MIV, 4 Dies:            (0.57)  57.1 (0.91)

cf_rca_16
Ideal interconnections: (-)     (0.97)
Encounter:              (1.00)  (1.00)
Ours:                   (1.00)  (1.00)
MIV, 2 Dies:            (0.79)  (0.99)
MIV, 3 Dies:            (0.77)  (0.99)
MIV, 4 Dies:            (0.72)  (0.99)

cf_fft_256_8
Ideal interconnections: (-)     (0.98)
Encounter:              (1.00)  (1.00)
Ours:                   (1.06)  (1.00)
MIV, 2 Dies:            (0.69)  (0.99)
MIV, 3 Dies:            (0.66)  (0.99)
MIV, 4 Dies:            (0.69)  (0.99)

mult
Ideal interconnections: (-)     (0.84)
Encounter:              (1.00)  (1.00)
Ours:                   (1.43)  (1.04)
MIV, 2 Dies:            (1.01)  (0.98)
MIV, 3 Dies:            (0.87)  (0.97)
MIV, 4 Dies:            (0.77)  (0.95)

C. Design Guidelines for Block-Level MIV-based 3D

We consider two possible scenarios: timing-critical and power-critical designs. In the case of timing-critical designs, we have shown that MIV-based 3D can give a significant reduction in the longest path delay as well as in the total negative slack. Larger reductions in delay will be seen for designs with combinational paths through blocks. In the case of power-critical designs, we have shown that MIV-based 3D gives a significant reduction in inter-block net power and, depending on the number of inter-block nets, significant savings in the power of the cells driving inter-block nets. Further power reduction can be achieved in one of several ways: (1) re-designing the blocks to downsize inter-block drivers, (2) voltage scaling of the 3D system, which will shift the entire timing distribution back to the 2D case, and (3) multi-Vth optimization, which will require fewer low-Vth cells to meet timing, reducing device power.

V. CONCLUSIONS

In this paper, we provided a floorplanning framework for monolithic 3D-ICs, and a methodology to obtain post-layout wirelength, timing, and power numbers for block-level 3D-ICs. We demonstrated that monolithic inter-tier via (MIV)-based 3D-ICs can achieve up to 42% reduction in wirelength when compared with 2D-ICs. In addition, we compared our monolithic 3D designs with through-silicon-via (TSV)-based 3D-IC designs in terms of area, wirelength, power, and performance. We observed that TSV-based 3D is only beneficial if either the TSV capacitance scales down, or the circuit has a large number of long wires. We also showed that, due to a significant reduction in the total negative slack and an increase in the positive slacks, MIV-based 3D-ICs require less timing optimization. Moreover, with the application of advanced methods such as multi-Vth design, further reduction in power is possible.

REFERENCES

[1] K. Yang, D. H. Kim, and S. K. Lim, "Design Quality Tradeoff Studies for 3D ICs Built with Nano-scale TSVs and Devices," in Proc. Int. Symp. on Quality Electronic Design, 2012.
[2] X. Dong, J. Zhao, and Y. Xie, "Fabrication Cost Analysis and Cost-Aware Design Space Exploration for 3D-ICs," in IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, 2010.
[3] P. Batude et al., "Advances in 3D CMOS Sequential Integration," in Proc. IEEE Int. Electron Devices Meeting, 2009.
[4] O. Thomas et al., "Compact 6T SRAM cell with robust read/write stabilizing design in 45nm Monolithic 3D IC technology," in Proc. IEEE Int. Conf. on Integrated Circuit Design and Tech., 2009.
[5] S.-M. Jung, H. Lim, K. Kwak, and K. Kim, "500-MHz DDR High-Performance 72-Mb 3-D SRAM Fabricated With Laser-Induced Epitaxial c-Si Growth Technology for a Stand-Alone and Embedded Memory Application," in IEEE Trans. on Electron Devices, 2010.
[6] D. H. Kim, R. O. Topaloglu, and S. K. Lim, "Block-Level 3D IC Design with Through-Silicon-Via Planning," in Proc. Asia and South Pacific Design Aut. Conf., 2012.
[7] M. Tsai, T. Wang, and T. Hwang, "Through-Silicon Via Planning in 3-D Floorplanning," in IEEE Trans. on VLSI Systems, 2011.
[8] J. Knechtel, I. Markov, and J. Lienig, "Assembling 2-D Blocks Into 3-D Chips," in IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, 2012.
[9] S. Bobba et al., "CELONCEL: Effective design technique for 3-D monolithic integration targeting high performance integrated circuits," in Proc. Asia and South Pacific Design Aut. Conf., 2011.
[10] C. Liu and S. K. Lim, "Ultra-High Density 3D SRAM Cell Designs for Monolithic 3D Integration," in Proc. IEEE Int. Interconnect Technology Conference.
[11] H. Xu, D. Sheqin, M. Yuchun, and H. Xianlong, "Simultaneous buffer and interlayer via planning for 3D floorplanning," in Proc. Int. Symp. on Quality Electronic Design, 2009.
[12] X. Wu et al., "Electrical Characterization for Inter-tier Connections and Timing Analysis for 3-D ICs," in IEEE Trans. on VLSI Systems, 2012.
[13]

A Design Tradeoff Study with Monolithic 3D Integration

A Design Tradeoff Study with Monolithic 3D Integration A Design Tradeoff Study with Monolithic 3D Integration Chang Liu and Sung Kyu Lim Georgia Institute of Techonology Atlanta, Georgia, 3332 Phone: (44) 894-315, Fax: (44) 385-1746 Abstract This paper studies

More information

On GPU Bus Power Reduction with 3D IC Technologies

On GPU Bus Power Reduction with 3D IC Technologies On GPU Bus Power Reduction with 3D Technologies Young-Joon Lee and Sung Kyu Lim School of ECE, Georgia Institute of Technology, Atlanta, Georgia, USA yjlee@gatech.edu, limsk@ece.gatech.edu Abstract The

More information

AS TECHNOLOGY scaling approaches its limits, 3-D

AS TECHNOLOGY scaling approaches its limits, 3-D 1716 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 36, NO. 10, OCTOBER 2017 Shrunk-2-D: A Physical Design Methodology to Build Commercial-Quality Monolithic 3-D ICs

More information

A Study of IR-drop Noise Issues in 3D ICs with Through-Silicon-Vias

A Study of IR-drop Noise Issues in 3D ICs with Through-Silicon-Vias A Study of IR-drop Noise Issues in 3D ICs with Through-Silicon-Vias Moongon Jung and Sung Kyu Lim School of Electrical and Computer Engineering Georgia Institute of Technology, Atlanta, Georgia, USA Email:

More information

Monolithic 3D IC Design for Deep Neural Networks

Monolithic 3D IC Design for Deep Neural Networks Monolithic 3D IC Design for Deep Neural Networks 1 with Application on Low-power Speech Recognition Kyungwook Chang 1, Deepak Kadetotad 2, Yu (Kevin) Cao 2, Jae-sun Seo 2, and Sung Kyu Lim 1 1 School of

More information

On Enhancing Power Benefits in 3D ICs: Block Folding and Bonding Styles Perspective

On Enhancing Power Benefits in 3D ICs: Block Folding and Bonding Styles Perspective On Enhancing Power Benefits in 3D ICs: Block Folding and Bonding Styles Perspective Moongon Jung, Taigon Song, Yang Wan, Yarui Peng, and Sung Kyu Lim School of ECE, Georgia Institute of Technology, Atlanta,

More information

Physical Design Implementation for 3D IC Methodology and Tools. Dave Noice Vassilios Gerousis

Physical Design Implementation for 3D IC Methodology and Tools. Dave Noice Vassilios Gerousis I NVENTIVE Physical Design Implementation for 3D IC Methodology and Tools Dave Noice Vassilios Gerousis Outline 3D IC Physical components Modeling 3D IC Stack Configuration Physical Design With TSV Summary

More information

Thermal Analysis on Face-to-Face(F2F)-bonded 3D ICs

Thermal Analysis on Face-to-Face(F2F)-bonded 3D ICs 1/16 Thermal Analysis on Face-to-Face(F2F)-bonded 3D ICs Kyungwook Chang, Sung-Kyu Lim School of Electrical and Computer Engineering Georgia Institute of Technology Introduction Challenges in 2D Device

More information

Abbas El Gamal. Joint work with: Mingjie Lin, Yi-Chang Lu, Simon Wong Work partially supported by DARPA 3D-IC program. Stanford University

Abbas El Gamal. Joint work with: Mingjie Lin, Yi-Chang Lu, Simon Wong Work partially supported by DARPA 3D-IC program. Stanford University Abbas El Gamal Joint work with: Mingjie Lin, Yi-Chang Lu, Simon Wong Work partially supported by DARPA 3D-IC program Stanford University Chip stacking Vertical interconnect density < 20/mm Wafer Stacking

More information

A Study of Through-Silicon-Via Impact on the 3D Stacked IC Layout

A Study of Through-Silicon-Via Impact on the 3D Stacked IC Layout A Study of Through-Silicon-Via Impact on the Stacked IC Layout Dae Hyun Kim, Krit Athikulwongse, and Sung Kyu Lim School of Electrical and Computer Engineering Georgia Institute of Technology, Atlanta,

More information

Floorplan and Power/Ground Network Co-Synthesis for Fast Design Convergence

Floorplan and Power/Ground Network Co-Synthesis for Fast Design Convergence Floorplan and Power/Ground Network Co-Synthesis for Fast Design Convergence Chen-Wei Liu 12 and Yao-Wen Chang 2 1 Synopsys Taiwan Limited 2 Department of Electrical Engineering National Taiwan University,

More information

Design and Analysis of Ultra Low Power Processors Using Sub/Near-Threshold 3D Stacked ICs

Design and Analysis of Ultra Low Power Processors Using Sub/Near-Threshold 3D Stacked ICs Design and Analysis of Ultra Low Power Processors Using Sub/Near-Threshold 3D Stacked ICs Sandeep Kumar Samal, Yarui Peng, Yang Zhang, and Sung Kyu Lim School of ECE, Georgia Institute of Technology, Atlanta,

More information

Tier-Partitioning for Power Delivery vs Cooling Tradeoff in 3D VLSI for Mobile Applications

Tier-Partitioning for Power Delivery vs Cooling Tradeoff in 3D VLSI for Mobile Applications Tier-Partitioning for Power Delivery vs Cooling Tradeoff in 3D VLSI for Mobile Applications Shreepad Panth, Kambiz Samadi, Yang Du, and Sung Kyu Lim School of ECE, Georgia Institute of Technology, Atlanta,

More information

Test-TSV Estimation During 3D-IC Partitioning

Test-TSV Estimation During 3D-IC Partitioning Test-TSV Estimation During 3D-IC Partitioning Shreepad Panth 1, Kambiz Samadi 2, and Sung Kyu Lim 1 1 Dept. of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA 30332 2

More information

How Much Cost Reduction Justifies the Adoption of Monolithic 3D ICs at 7nm Node?

How Much Cost Reduction Justifies the Adoption of Monolithic 3D ICs at 7nm Node? How Much Cost Reduction Justifies the Adoption of Monolithic 3D ICs at 7nm Node? Bon Woong Ku, Peter Debacker, Dragomir Milojevic, Praveen Raghavan, and Sung Kyu Lim School of ECE, Georgia Institute of

More information

On the Design of Ultra-High Density 14nm Finfet based Transistor-Level Monolithic 3D ICs

On the Design of Ultra-High Density 14nm Finfet based Transistor-Level Monolithic 3D ICs 2016 IEEE Computer Society Annual Symposium on VLSI On the Design of Ultra-High Density 14nm Finfet based Transistor-Level Monolithic 3D ICs Jiajun Shi 1,2, Deepak Nayak 1,Motoi Ichihashi 1, Srinivasa

More information

3D systems-on-chip. A clever partitioning of circuits to improve area, cost, power and performance. The 3D technology landscape

3D systems-on-chip. A clever partitioning of circuits to improve area, cost, power and performance. The 3D technology landscape Edition April 2017 Semiconductor technology & processing 3D systems-on-chip A clever partitioning of circuits to improve area, cost, power and performance. In recent years, the technology of 3D integration

More information

Tier Partitioning Strategy to Mitigate BEOL Degradation and Cost Issues in Monolithic 3D ICs

Tier Partitioning Strategy to Mitigate BEOL Degradation and Cost Issues in Monolithic 3D ICs Tier Partitioning Strategy to Mitigate BEOL Degradation and Cost Issues in Monolithic 3D ICs Sandeep Kumar Samal, Deepak Nayak, Motoi Ichihashi, Srinivasa Banna, and Sung Kyu Lim School of ECE, Georgia

More information

Eliminating Routing Congestion Issues with Logic Synthesis

Eliminating Routing Congestion Issues with Logic Synthesis Eliminating Routing Congestion Issues with Logic Synthesis By Mike Clarke, Diego Hammerschlag, Matt Rardon, and Ankush Sood Routing congestion, which results when too many routes need to go through an

More information

PHYSICAL DESIGN METHODOLOGIES FOR MONOLITHIC 3D ICS

PHYSICAL DESIGN METHODOLOGIES FOR MONOLITHIC 3D ICS PHYSICAL DESIGN METHODOLOGIES FOR MONOLITHIC 3D ICS A Dissertation Presented to The Academic Faculty by Shreepad Amar Panth In Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy

More information

Design and Analysis of 3D IC-Based Low Power Stereo Matching Processors

Design and Analysis of 3D IC-Based Low Power Stereo Matching Processors Design and Analysis of 3D IC-Based Low Power Stereo Matching Processors Seung-Ho Ok 1, Kyeong-ryeol Bae 1, Sung Kyu Lim 2, and Byungin Moon 1 1 School of Electronics Engineering, Kyungpook National University,

More information

Thermal-Aware 3D IC Physical Design and Architecture Exploration

Thermal-Aware 3D IC Physical Design and Architecture Exploration Thermal-Aware 3D IC Physical Design and Architecture Exploration Jason Cong & Guojie Luo UCLA Computer Science Department cong@cs.ucla.edu http://cadlab.cs.ucla.edu/~cong Supported by DARPA Outline Thermal-Aware

More information

3-D integrated circuits (3-D ICs) have emerged as a

3-D integrated circuits (3-D ICs) have emerged as a IEEE TRANSACTIONS ON COMPONENTS, PACKAGING AND MANUFACTURING TECHNOLOGY, VOL. 6, NO. 4, APRIL 2016 637 Probe-Pad Placement for Prebond Test of 3-D ICs Shreepad Panth, Member, IEEE, and Sung Kyu Lim, Senior

More information

Iterative-Constructive Standard Cell Placer for High Speed and Low Power

Iterative-Constructive Standard Cell Placer for High Speed and Low Power Iterative-Constructive Standard Cell Placer for High Speed and Low Power Sungjae Kim and Eugene Shragowitz Department of Computer Science and Engineering University of Minnesota, Minneapolis, MN 55455

More information

OpenAccess In 3D IC Physical Design

OpenAccess In 3D IC Physical Design OpenAccess In 3D IC Physical Design Jason Cong, Jie Wei,, Yan Zhang VLSI CAD Lab Computer Science Department University of California, Los Angeles Supported by DARPA and CFD Research Corp Outline 3D IC

More information

UCLA 3D research started in 2002 under DARPA with CFDRC

UCLA 3D research started in 2002 under DARPA with CFDRC Coping with Vertical Interconnect Bottleneck Jason Cong UCLA Computer Science Department cong@cs.ucla.edu http://cadlab.cs.ucla.edu/ cs edu/~cong Outline Lessons learned Research challenges and opportunities

More information

Three DIMENSIONAL-CHIPS

Three DIMENSIONAL-CHIPS IOSR Journal of Electronics and Communication Engineering (IOSR-JECE) ISSN: 2278-2834, ISBN: 2278-8735. Volume 3, Issue 4 (Sep-Oct. 2012), PP 22-27 Three DIMENSIONAL-CHIPS 1 Kumar.Keshamoni, 2 Mr. M. Harikrishna

More information

Three-Dimensional Integrated Circuits: Performance, Design Methodology, and CAD Tools

Three-Dimensional Integrated Circuits: Performance, Design Methodology, and CAD Tools Three-Dimensional Integrated Circuits: Performance, Design Methodology, and CAD Tools Shamik Das, Anantha Chandrakasan, and Rafael Reif Microsystems Technology Laboratories Massachusetts Institute of Technology

More information

A Review Paper on Reconfigurable Techniques to Improve Critical Parameters of SRAM

A Review Paper on Reconfigurable Techniques to Improve Critical Parameters of SRAM IJSRD - International Journal for Scientific Research & Development Vol. 4, Issue 09, 2016 ISSN (online): 2321-0613 A Review Paper on Reconfigurable Techniques to Improve Critical Parameters of SRAM Yogit

More information

A Framework for Systematic Evaluation and Exploration of Design Rules

A Framework for Systematic Evaluation and Exploration of Design Rules A Framework for Systematic Evaluation and Exploration of Design Rules Rani S. Ghaida* and Prof. Puneet Gupta EE Dept., University of California, Los Angeles (rani@ee.ucla.edu), (puneet@ee.ucla.edu) Work

More information

Improving Detailed Routability and Pin Access with 3D Monolithic Standard Cells

Improving Detailed Routability and Pin Access with 3D Monolithic Standard Cells Improving Detailed Routability and Pin Access with 3D Monolithic Standard Cells Daohang Shi, and Azadeh Davoodi University of Wisconsin - Madison {dshi7,adavoodi}@wisc.edu ABSTRACT We study the impact

More information

INTERNATIONAL JOURNAL OF PROFESSIONAL ENGINEERING STUDIES Volume 9 /Issue 3 / OCT 2017

INTERNATIONAL JOURNAL OF PROFESSIONAL ENGINEERING STUDIES Volume 9 /Issue 3 / OCT 2017 Design of Low Power Adder in ALU Using Flexible Charge Recycling Dynamic Circuit Pallavi Mamidala 1 K. Anil kumar 2 mamidalapallavi@gmail.com 1 anilkumar10436@gmail.com 2 1 Assistant Professor, Dept of

More information

PLACEMENT OF TSVS IN THREE DIMENSIONAL INTEGRATED CIRCUITS (3D IC) College of Engineering, Madurai, India.

PLACEMENT OF TSVS IN THREE DIMENSIONAL INTEGRATED CIRCUITS (3D IC) College of Engineering, Madurai, India. Volume 117 No. 16 2017, 179-184 ISSN: 1311-8080 (printed version); ISSN: 1314-3395 (on-line version) url: http://www.ijpam.eu ijpam.eu PLACEMENT OF TSVS IN THREE DIMENSIONAL INTEGRATED CIRCUITS (3D IC)

More information

Power-Supply-Network Design in 3D Integrated Systems

Power-Supply-Network Design in 3D Integrated Systems Power-Supply-Network Design in 3D Integrated Systems Michael B. Healy and Sung Kyu Lim School of Electrical and Computer Engineering, Georgia Institute of Technology 777 Atlantic Dr. NW, Atlanta, GA 3332

More information

3-D INTEGRATED CIRCUITS (3-D ICs) are emerging

3-D INTEGRATED CIRCUITS (3-D ICs) are emerging 862 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 21, NO. 5, MAY 2013 Study of Through-Silicon-Via Impact on the 3-D Stacked IC Layout Dae Hyun Kim, Student Member, IEEE, Krit

More information

ESE 570 Cadence Lab Assignment 2: Introduction to Spectre, Manual Layout Drawing and Post Layout Simulation (PLS)

ESE 570 Cadence Lab Assignment 2: Introduction to Spectre, Manual Layout Drawing and Post Layout Simulation (PLS) ESE 570 Cadence Lab Assignment 2: Introduction to Spectre, Manual Layout Drawing and Post Layout Simulation (PLS) Objective Part A: To become acquainted with Spectre (or HSpice) by simulating an inverter,

More information

6T- SRAM for Low Power Consumption. Professor, Dept. of ExTC, PRMIT &R, Badnera, Amravati, Maharashtra, India 1

6T- SRAM for Low Power Consumption. Professor, Dept. of ExTC, PRMIT &R, Badnera, Amravati, Maharashtra, India 1 6T- SRAM for Low Power Consumption Mrs. J.N.Ingole 1, Ms.P.A.Mirge 2 Professor, Dept. of ExTC, PRMIT &R, Badnera, Amravati, Maharashtra, India 1 PG Student [Digital Electronics], Dept. of ExTC, PRMIT&R,

More information

Silicon Virtual Prototyping: The New Cockpit for Nanometer Chip Design

Silicon Virtual Prototyping: The New Cockpit for Nanometer Chip Design Silicon Virtual Prototyping: The New Cockpit for Nanometer Chip Design Wei-Jin Dai, Dennis Huang, Chin-Chih Chang, Michel Courtoy Cadence Design Systems, Inc. Abstract A design methodology for the implementation

More information

Introduction 1. GENERAL TRENDS. 1. The technology scale down DEEP SUBMICRON CMOS DESIGN

Introduction 1. GENERAL TRENDS. 1. The technology scale down DEEP SUBMICRON CMOS DESIGN 1 Introduction The evolution of integrated circuit (IC) fabrication techniques is a unique fact in the history of modern industry. The improvements in terms of speed, density and cost have kept constant

More information

Full-Chip Through-Silicon-Via Interfacial Crack Analysis and Optimization for 3D IC

Full-Chip Through-Silicon-Via Interfacial Crack Analysis and Optimization for 3D IC Full-Chip Through-Silicon-Via Interfacial Crack Analysis and Optimization for 3D IC Moongon Jung 1, Xi Liu 2, Suresh K. Sitaraman 2, David Z. Pan 3, and Sung Kyu Lim 1 1 School of ECE, Georgia Institute

More information

ECE 637 Integrated VLSI Circuits. Introduction. Introduction EE141

ECE 637 Integrated VLSI Circuits. Introduction. Introduction EE141 ECE 637 Integrated VLSI Circuits Introduction EE141 1 Introduction Course Details Instructor Mohab Anis; manis@vlsi.uwaterloo.ca Text Digital Integrated Circuits, Jan Rabaey, Prentice Hall, 2 nd edition

More information

MOORE s law historically enables designs with higher

MOORE s law historically enables designs with higher 634 IEEE TRANSACTIONS ON NANOTECHNOLOGY, VOL. 17, NO. 4, JULY 2018 Interdie Coupling Extraction and Physical Design Optimization for Face-to-Face 3-D ICs Yarui Peng, Member, IEEE, Dusan Petranovic, Member,

More information

An Overview of Standard Cell Based Digital VLSI Design

An Overview of Standard Cell Based Digital VLSI Design An Overview of Standard Cell Based Digital VLSI Design With examples taken from the implementation of the 36-core AsAP1 chip and the 1000-core KiloCore chip Zhiyi Yu, Tinoosh Mohsenin, Aaron Stillmaker,

More information

Designing 3D Tree-based FPGA TSV Count Minimization. V. Pangracious, Z. Marrakchi, H. Mehrez UPMC Sorbonne University Paris VI, France

Designing 3D Tree-based FPGA TSV Count Minimization. V. Pangracious, Z. Marrakchi, H. Mehrez UPMC Sorbonne University Paris VI, France Designing 3D Tree-based FPGA TSV Count Minimization V. Pangracious, Z. Marrakchi, H. Mehrez UPMC Sorbonne University Paris VI, France 13 avril 2013 Presentation Outlook Introduction : 3D Tree-based FPGA

More information

DYNAMIC CIRCUIT TECHNIQUE FOR LOW- POWER MICROPROCESSORS Kuruva Hanumantha Rao 1 (M.tech)

DYNAMIC CIRCUIT TECHNIQUE FOR LOW- POWER MICROPROCESSORS Kuruva Hanumantha Rao 1 (M.tech) DYNAMIC CIRCUIT TECHNIQUE FOR LOW- POWER MICROPROCESSORS Kuruva Hanumantha Rao 1 (M.tech) K.Prasad Babu 2 M.tech (Ph.d) hanumanthurao19@gmail.com 1 kprasadbabuece433@gmail.com 2 1 PG scholar, VLSI, St.JOHNS

More information

Reconfigurable Multicore Server Processors for Low Power Operation

Reconfigurable Multicore Server Processors for Low Power Operation Reconfigurable Multicore Server Processors for Low Power Operation Ronald G. Dreslinski, David Fick, David Blaauw, Dennis Sylvester, Trevor Mudge University of Michigan, Advanced Computer Architecture

More information

Mixed Cell-Height Implementation for Improved Design Quality in Advanced Nodes

Mixed Cell-Height Implementation for Improved Design Quality in Advanced Nodes Mixed Cell-Height Implementation for Improved Design Quality in Advanced Nodes Sorin Dobre, Andrew B. Kahng + and Jiajia Li UC San Diego, ECE and + CSE Depts., La Jolla, CA 92093, {abk, jil150}@ucsd.edu

More information

Fast, Accurate A Priori Routing Delay Estimation

Fast, Accurate A Priori Routing Delay Estimation Fast, Accurate A Priori Routing Delay Estimation Jinhai Qiu Implementation Group Synopsys Inc. Mountain View, CA Jinhai.Qiu@synopsys.com Sherief Reda Division of Engineering Brown University Providence,

More information

Synthesizable FPGA Fabrics Targetable by the VTR CAD Tool

Synthesizable FPGA Fabrics Targetable by the VTR CAD Tool Synthesizable FPGA Fabrics Targetable by the VTR CAD Tool Jin Hee Kim and Jason Anderson FPL 2015 London, UK September 3, 2015 2 Motivation for Synthesizable FPGA Trend towards ASIC design flow Design

More information

940 IEEE TRANSACTIONS ON ELECTRON DEVICES, VOL. 62, NO. 3, MARCH 2015

940 IEEE TRANSACTIONS ON ELECTRON DEVICES, VOL. 62, NO. 3, MARCH 2015 940 IEEE TRANSACTIONS ON ELECTRON DEVICES, VOL. 62, NO. 3, MARCH 2015 Evaluating Chip-Level Impact of Cu/Low-κ Performance Degradation on Circuit Performance at Future Technology Nodes Ahmet Ceyhan, Member,

More information

A 256-Radix Crossbar Switch Using Mux-Matrix-Mux Folded-Clos Topology

A 256-Radix Crossbar Switch Using Mux-Matrix-Mux Folded-Clos Topology http://dx.doi.org/10.5573/jsts.014.14.6.760 JOURNAL OF SEMICONDUCTOR TECHNOLOGY AND SCIENCE, VOL.14, NO.6, DECEMBER, 014 A 56-Radix Crossbar Switch Using Mux-Matrix-Mux Folded-Clos Topology Sung-Joon Lee

More information

Lab. Course Goals. Topics. What is VLSI design? What is an integrated circuit? VLSI Design Cycle. VLSI Design Automation

Lab. Course Goals. Topics. What is VLSI design? What is an integrated circuit? VLSI Design Cycle. VLSI Design Automation Course Goals Lab Understand key components in VLSI designs Become familiar with design tools (Cadence) Understand design flows Understand behavioral, structural, and physical specifications Be able to

More information

THROUGH-SILICON-VIA-AWARE PREDICTION AND PHYSICAL DESIGN FOR MULTI-GRANULARITY 3D INTEGRATED CIRCUITS

THROUGH-SILICON-VIA-AWARE PREDICTION AND PHYSICAL DESIGN FOR MULTI-GRANULARITY 3D INTEGRATED CIRCUITS THROUGH-SILICON-VIA-AWARE PREDICTION AND PHYSICAL DESIGN FOR MULTI-GRANULARITY 3D INTEGRATED CIRCUITS A Dissertation Presented to The Academic Faculty By Dae Hyun Kim In Partial Fulfillment of the Requirements

More information

Emerging Platforms, Emerging Technologies, and the Need for Crosscutting Tools Luca Carloni

Emerging Platforms, Emerging Technologies, and the Need for Crosscutting Tools Luca Carloni Emerging Platforms, Emerging Technologies, and the Need for Crosscutting Tools Luca Carloni Department of Computer Science Columbia University in the City of New York NSF Workshop on Emerging Technologies

More information

Floorplan Management: Incremental Placement for Gate Sizing and Buffer Insertion

Floorplan Management: Incremental Placement for Gate Sizing and Buffer Insertion Floorplan Management: Incremental Placement for Gate Sizing and Buffer Insertion Chen Li, Cheng-Kok Koh School of ECE, Purdue University West Lafayette, IN 47907, USA {li35, chengkok}@ecn.purdue.edu Patrick

More information

Calibrating Achievable Design GSRC Annual Review June 9, 2002

Calibrating Achievable Design GSRC Annual Review June 9, 2002 Calibrating Achievable Design GSRC Annual Review June 9, 2002 Wayne Dai, Andrew Kahng, Tsu-Jae King, Wojciech Maly,, Igor Markov, Herman Schmit, Dennis Sylvester DUSD(Labs) Calibrating Achievable Design

More information

Double Patterning-Aware Detailed Routing with Mask Usage Balancing

Double Patterning-Aware Detailed Routing with Mask Usage Balancing Double Patterning-Aware Detailed Routing with Mask Usage Balancing Seong-I Lei Department of Computer Science National Tsing Hua University HsinChu, Taiwan Email: d9762804@oz.nthu.edu.tw Chris Chu Department

More information

MONOLITHIC 3-D integration is emerging as an

MONOLITHIC 3-D integration is emerging as an IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF IEGRATED CIRCUITS AND SYSTEMS, VOL. 37, NO. 4, APRIL 018 845 Detailed-Placement-Enabled Dynamic Power Optimization of Multitier Gate-Level Monolithic 3-D ICs

More information

POWER, PERFORMANCE, AND COST IMPACT OF GATE-LEVEL MONOLITHIC 3D IC IN THE 7NM TECHNOLOGY NODE. A Dissertation Presented to The Academic Faculty

POWER, PERFORMANCE, AND COST IMPACT OF GATE-LEVEL MONOLITHIC 3D IC IN THE 7NM TECHNOLOGY NODE. A Dissertation Presented to The Academic Faculty POWER, PERFORMANCE, AND COST IMPACT OF GATE-LEVEL MONOLITHIC 3D IC IN THE 7NM TECHNOLOGY NODE A Dissertation Presented to The Academic Faculty By Bon Woong Ku In Partial Fulfillment of the Requirements

More information

CAD for VLSI. Debdeep Mukhopadhyay IIT Madras

CAD for VLSI. Debdeep Mukhopadhyay IIT Madras CAD for VLSI Debdeep Mukhopadhyay IIT Madras Tentative Syllabus Overall perspective of VLSI Design MOS switch and CMOS, MOS based logic design, the CMOS logic styles, Pass Transistors Introduction to Verilog

More information

DFT-3D: What it means to Design For 3DIC Test? Sanjiv Taneja Vice President, R&D Silicon Realization Group

DFT-3D: What it means to Design For 3DIC Test? Sanjiv Taneja Vice President, R&D Silicon Realization Group I N V E N T I V E DFT-3D: What it means to Design For 3DIC Test? Sanjiv Taneja Vice President, R&D Silicon Realization Group Moore s Law & More : Tall And Thin More than Moore: Diversification Moore s

More information

An overview of standard cell based digital VLSI design

An overview of standard cell based digital VLSI design An overview of standard cell based digital VLSI design Implementation of the first generation AsAP processor Zhiyi Yu and Tinoosh Mohsenin VCL Laboratory UC Davis Outline Overview of standard cellbased

More information

Physical Design of a 3D-Stacked Heterogeneous Multi-Core Processor

Physical Design of a 3D-Stacked Heterogeneous Multi-Core Processor Physical Design of a -Stacked Heterogeneous Multi-Core Processor Randy Widialaksono, Rangeen Basu Roy Chowdhury, Zhenqian Zhang, Joshua Schabel, Steve Lipa, Eric Rotenberg, W. Rhett Davis, Paul Franzon

More information

Problem Formulation. Specialized algorithms are required for clock (and power nets) due to strict specifications for routing such nets.

Problem Formulation. Specialized algorithms are required for clock (and power nets) due to strict specifications for routing such nets. Clock Routing Problem Formulation Specialized algorithms are required for clock (and power nets) due to strict specifications for routing such nets. Better to develop specialized routers for these nets.

More information

A Low Power 720p Motion Estimation Processor with 3D Stacked Memory

A Low Power 720p Motion Estimation Processor with 3D Stacked Memory A Low Power 720p Motion Estimation Processor with 3D Stacked Memory Shuping Zhang, Jinjia Zhou, Dajiang Zhou and Satoshi Goto Graduate School of Information, Production and Systems, Waseda University 2-7

More information

Introduction. Summary. Why computer architecture? Technology trends Cost issues

Introduction. Summary. Why computer architecture? Technology trends Cost issues Introduction 1 Summary Why computer architecture? Technology trends Cost issues 2 1 Computer architecture? Computer Architecture refers to the attributes of a system visible to a programmer (that have

More information

The Pennsylvania State University The Graduate School College of Engineering ELECTRONIC DESIGN AUTOMATION CHALLENGES IN THREE

The Pennsylvania State University The Graduate School College of Engineering ELECTRONIC DESIGN AUTOMATION CHALLENGES IN THREE The Pennsylvania State University The Graduate School College of Engineering ELECTRONIC DESIGN AUTOMATION CHALLENGES IN THREE DIMENSIONAL INTEGRATED CIRCUITS (3D ICS) A Thesis in Computer Science and Engineering

More information

DESIGN AND PERFORMANCE ANALYSIS OF CARRY SELECT ADDER

DESIGN AND PERFORMANCE ANALYSIS OF CARRY SELECT ADDER DESIGN AND PERFORMANCE ANALYSIS OF CARRY SELECT ADDER Bhuvaneswaran.M 1, Elamathi.K 2 Assistant Professor, Muthayammal Engineering college, Rasipuram, Tamil Nadu, India 1 Assistant Professor, Muthayammal

More information

EE5780 Advanced VLSI CAD

EE5780 Advanced VLSI CAD EE5780 Advanced VLSI CAD Lecture 1 Introduction Zhuo Feng 1.1 Prof. Zhuo Feng Office: EERC 513 Phone: 487-3116 Email: zhuofeng@mtu.edu Class Website http://www.ece.mtu.edu/~zhuofeng/ee5780fall2013.html

More information

VLSI Design Automation

VLSI Design Automation VLSI Design Automation IC Products Processors CPU, DSP, Controllers Memory chips RAM, ROM, EEPROM Analog Mobile communication, audio/video processing Programmable PLA, FPGA Embedded systems Used in cars,

More information

ASIC Physical Design Top-Level Chip Layout

ASIC Physical Design Top-Level Chip Layout ASIC Physical Design Top-Level Chip Layout References: M. Smith, Application Specific Integrated Circuits, Chap. 16 Cadence Virtuoso User Manual Top-level IC design process Typically done before individual

More information

AMchip architecture & design

AMchip architecture & design Sezione di Milano AMchip architecture & design Alberto Stabile - INFN Milano AMchip theoretical principle Associative Memory chip: AMchip Dedicated VLSI device - maximum parallelism Each pattern with private

More information

Full Custom Layout Optimization Using Minimum distance rule, Jogs and Depletion sharing

Full Custom Layout Optimization Using Minimum distance rule, Jogs and Depletion sharing Full Custom Layout Optimization Using Minimum distance rule, Jogs and Depletion sharing Umadevi.S #1, Vigneswaran.T #2 # Assistant Professor [Sr], School of Electronics Engineering, VIT University, Vandalur-

More information

Fast Delay Estimation with Buffer Insertion for Through-Silicon-Via-Based 3D Interconnects

Fast Delay Estimation with Buffer Insertion for Through-Silicon-Via-Based 3D Interconnects Fast Delay Estimation with Buffer Insertion for Through-Silicon-Via-Based 3D Interconnects Young-Joon Lee and Sung Kyu Lim Electrical and Computer Engineering, Georgia Institute of Technology email: yjlee@gatech.edu

More information

CAD Algorithms. Placement and Floorplanning

CAD Algorithms. Placement and Floorplanning CAD Algorithms Placement Mohammad Tehranipoor ECE Department 4 November 2008 1 Placement and Floorplanning Layout maps the structural representation of circuit into a physical representation Physical representation:

More information

Floorplan considering interconnection between different clock domains

Floorplan considering interconnection between different clock domains Proceedings of the 11th WSEAS International Conference on CIRCUITS, Agios Nikolaos, Crete Island, Greece, July 23-25, 2007 115 Floorplan considering interconnection between different clock domains Linkai

More information

FABRICATION TECHNOLOGIES

FABRICATION TECHNOLOGIES FABRICATION TECHNOLOGIES DSP Processor Design Approaches Full custom Standard cell** higher performance lower energy (power) lower per-part cost Gate array* FPGA* Programmable DSP Programmable general

More information

Machine Learning Based Variation Modeling and Optimization for 3D ICs

Machine Learning Based Variation Modeling and Optimization for 3D ICs J. lnf. Commun. Converg. Eng. 14(4): 258-267, Dec. 2016 Regular paper Machine Learning Based Variation Modeling and Optimization for 3D ICs Sandeep Kumar Samal 1, Guoqing Chen 2, and Sung Kyu Lim 1*, Member,

More information

Symmetrical Buffer Placement in Clock Trees for Minimal Skew Immune to Global On-chip Variations

Symmetrical Buffer Placement in Clock Trees for Minimal Skew Immune to Global On-chip Variations Symmetrical Buffer Placement in Clock Trees for Minimal Skew Immune to Global On-chip Variations Renshen Wang Department of Computer Science and Engineering University of California, San Diego La Jolla,

More information

Wojciech P. Maly Department of Electrical and Computer Engineering Carnegie Mellon University 5000 Forbes Ave. Pittsburgh, PA

Wojciech P. Maly Department of Electrical and Computer Engineering Carnegie Mellon University 5000 Forbes Ave. Pittsburgh, PA Interconnect Characteristics of 2.5-D System Integration Scheme Yangdong Deng Department of Electrical and Computer Engineering Carnegie Mellon University 5000 Forbes Ave. Pittsburgh, PA 15213 412-268-5234

More information

FPGA. Logic Block. Plessey FPGA: basic building block here is 2-input NAND gate which is connected to each other to implement desired function.

FPGA. Logic Block. Plessey FPGA: basic building block here is 2-input NAND gate which is connected to each other to implement desired function. FPGA Logic block of an FPGA can be configured in such a way that it can provide functionality as simple as that of transistor or as complex as that of a microprocessor. It can used to implement different

More information

Thermal-Aware 3D IC Placement Via Transformation

Thermal-Aware 3D IC Placement Via Transformation Thermal-Aware 3D IC Placement Via Transformation Jason Cong, Guojie Luo, Jie Wei and Yan Zhang Department of Computer Science University of California, Los Angeles Los Angeles, CA 90095 Email: { cong,

More information

VLSI Design Automation

VLSI Design Automation VLSI Design Automation IC Products Processors CPU, DSP, Controllers Memory chips RAM, ROM, EEPROM Analog Mobile communication, audio/video processing Programmable PLA, FPGA Embedded systems Used in cars,

More information

Stacked Silicon Interconnect Technology (SSIT)

Stacked Silicon Interconnect Technology (SSIT) Stacked Silicon Interconnect Technology (SSIT) Suresh Ramalingam Xilinx Inc. MEPTEC, January 12, 2011 Agenda Background and Motivation Stacked Silicon Interconnect Technology Summary Background and Motivation

More information

Congestion-Aware Power Grid. and CMOS Decoupling Capacitors. Pingqiang Zhou Karthikk Sridharan Sachin S. Sapatnekar

Congestion-Aware Power Grid. and CMOS Decoupling Capacitors. Pingqiang Zhou Karthikk Sridharan Sachin S. Sapatnekar Congestion-Aware Power Grid Optimization for 3D circuits Using MIM and CMOS Decoupling Capacitors Pingqiang Zhou Karthikk Sridharan Sachin S. Sapatnekar University of Minnesota 1 Outline Motivation A new

More information
