Fast Dual-V dd Buffering Based on Interconnect Prediction and Sampling

Size: px
Start display at page:

Download "Fast Dual-V dd Buffering Based on Interconnect Prediction and Sampling"

Transcription

1 Based on Interconnect Prediction and Sampling Yu Hu King Ho Tam Tom Tong Jing Lei He Electrical Engineering Department University of California at Los Angeles System Level Interconnect Prediction (SLIP), 2007

2 Outline Introduction

3 Motivation Motivation Major Contributions Aggressive buffering increases interconnect power 35% cells are buffers at 65nm technology [Saxena, TCAD 04] 2 Previous work for singal-v dd buffering Power-optimal single-v dd buffer insertion [Lillis, JSSC 96] Delay-optimal buffered tree generation [Cong, DAC 00; Alpert, TCAD 02] 3 Previous work for dual-v dd buffering Power-optimal dual-v dd buffer insertion and buffered tree construction [Tam, DAC 05] 4 Problem remains Power aware buffering suffers from the expensive computational cost Dual-V dd buffers dramatically increase computational complexity

4 Motivation Motivation Major Contributions Aggressive buffering increases interconnect power 35% cells are buffers at 65nm technology [Saxena, TCAD 04] 2 Previous work for singal-v dd buffering Power-optimal single-v dd buffer insertion [Lillis, JSSC 96] Delay-optimal buffered tree generation [Cong, DAC 00; Alpert, TCAD 02] 3 Previous work for dual-v dd buffering Power-optimal dual-v dd buffer insertion and buffered tree construction [Tam, DAC 05] 4 Problem remains Power aware buffering suffers from the expensive computational cost Dual-V dd buffers dramatically increase computational complexity

5 Motivation Motivation Major Contributions Aggressive buffering increases interconnect power 35% cells are buffers at 65nm technology [Saxena, TCAD 04] 2 Previous work for singal-v dd buffering Power-optimal single-v dd buffer insertion [Lillis, JSSC 96] Delay-optimal buffered tree generation [Cong, DAC 00; Alpert, TCAD 02] 3 Previous work for dual-v dd buffering Power-optimal dual-v dd buffer insertion and buffered tree construction [Tam, DAC 05] 4 Problem remains Power aware buffering suffers from the expensive computational cost Dual-V dd buffers dramatically increase computational complexity

6 Motivation Motivation Major Contributions Aggressive buffering increases interconnect power 35% cells are buffers at 65nm technology [Saxena, TCAD 04] 2 Previous work for singal-v dd buffering Power-optimal single-v dd buffer insertion [Lillis, JSSC 96] Delay-optimal buffered tree generation [Cong, DAC 00; Alpert, TCAD 02] 3 Previous work for dual-v dd buffering Power-optimal dual-v dd buffer insertion and buffered tree construction [Tam, DAC 05] 4 Problem remains Power aware buffering suffers from the expensive computational cost Dual-V dd buffers dramatically increase computational complexity

7 Motivation Major Contributions Major Constributions Focus on speedup for power-aware dual-v dd buffer insertion and buffered tree construction 2 Propose three speedup techniques for power optimized dual-v dd buffer insertion based on interconnect prediction and sampling Pre-buffer Slack Pruning (PSP) extended from the one presented in [Shi, DAC 03] Predictive Min-delay Pruning (PMP) 3D sampling Runtime grows linearly w.r.t. the tree-size and we achieve 50x speedup compared with [Tam, DAC 05] 3 Incorporate the fast buffer insertion and grid reduction in power optimized buffered tree construction algorithm

8 Motivation Major Contributions Major Constributions Focus on speedup for power-aware dual-v dd buffer insertion and buffered tree construction 2 Propose three speedup techniques for power optimized dual-v dd buffer insertion based on interconnect prediction and sampling Pre-buffer Slack Pruning (PSP) extended from the one presented in [Shi, DAC 03] Predictive Min-delay Pruning (PMP) 3D sampling Runtime grows linearly w.r.t. the tree-size and we achieve 50x speedup compared with [Tam, DAC 05] 3 Incorporate the fast buffer insertion and grid reduction in power optimized buffered tree construction algorithm

9 Motivation Major Contributions Major Constributions Focus on speedup for power-aware dual-v dd buffer insertion and buffered tree construction 2 Propose three speedup techniques for power optimized dual-v dd buffer insertion based on interconnect prediction and sampling Pre-buffer Slack Pruning (PSP) extended from the one presented in [Shi, DAC 03] Predictive Min-delay Pruning (PMP) 3D sampling Runtime grows linearly w.r.t. the tree-size and we achieve 50x speedup compared with [Tam, DAC 05] 3 Incorporate the fast buffer insertion and grid reduction in power optimized buffered tree construction algorithm

10 Modeling Dual-V dd Buffering Problem Formulation Introduction 2 Modeling Dual-V dd Buffering Problem Formulation 3 4 5

11 Delay, Slew and Power Modeling Elmore Delay Model The delay of wire with length l is The delay of a buffer d buf is Modeling Dual-V dd Buffering Problem Formulation d(l) = ( 2 c w l + c load ) r w l () d buf = d int + r o c load (2) Bakoglus slew metric (Elmore delay times ln9) 2 Power Model Wire power dissipation E w = 0.5 c w l V 2 dd (3) Lumped buffer dynamic/short-circuit power Can be easily extended to leakage power

12 Dual-V dd Buffering Modeling Dual-V dd Buffering Problem Formulation Intuitions for power aware dual-v dd buffering High V dd buffers drive critical interconnect edges for timing optimization 2 Low V dd buffers drive non-critical interconnect edges for power 3 Achieves power saving since power is proportional to α V 2 dd 4 Suffer no loss of delay optimality Constraints in dual-v dd buffering Disallowing low V dd drives high V dd Not affect optimality [Tam, DAC 05] Exclude power and delay overhead of level converters

13 Problem Formulation Modeling Dual-V dd Buffering Problem Formulation Dual-V dd buffer insertion and sizing (dbis) Given An interconnect fanout tree with source n src, sink nodes n s and Steiner points n p Buffer stations n b 2 Find A buffer size and V dd level assignment 3 Such that (objective) The RAT qn src at the source n src is met The power consumed by the interconnect tree is minimized Slew rate at every input of the buffers and the sinks n s is upper bounded by the slew rate bound s

14 Problem Formulation Modeling Dual-V dd Buffering Problem Formulation Dual V dd Buffered Tree Construction (dtree) Given An un-routed net with source n src and sink nodes n s Buffer stations n b 2 Find A routing for the buffered tree A buffer size and V dd level assignment 3 Such that (objective) The RAT qn src at the source n src is met The power consumed by the interconnect tree is minimized Slew rate at every input of the buffers and the sinks n s is upper bounded by the slew rate bound s

15 Baseline Algorithm Data Structure for Pruning 3D Sampling Pre-buffer Slack Pruning Predictive Min-delay Pruning Experimental Results for Introduction 2 3 Baseline Algorithm Data Structure for Pruning 3D Sampling Pre-buffer Slack Pruning Predictive Min-delay Pruning Experimental Results for 4

16 Baseline Algorithm Data Structure for Pruning 3D Sampling Pre-buffer Slack Pruning Predictive Min-delay Pruning Experimental Results for Baseline Algorithm Flow Based on [Lillis, JSSC 96] 2 Dynamic programming with partial solution (option) pruning 3 Options must now record downstream V dd levels for buffering to prevent V dd L V dd H, which removes unnecessary search in solution space 4 Still quite slow for large nets Definition (Domination) In node n, option Φ = (rat, cap, pwr, θ) dominates Φ 2 = (rat 2, cap 2, pwr 2, θ), if rat rat 2, cap cap 2, and pwr pwr 2

17 Baseline Algorithm Data Structure for Pruning 3D Sampling Pre-buffer Slack Pruning Predictive Min-delay Pruning Experimental Results for Options are indexed by capacitive values 2 Few options left after power-delay sampling [Tam, DAC 05] under the same capacitive index 3 A linear list is used to store options under the same capacitive value 4 Capacitive values are organized by a binary search tree

18 Baseline Algorithm Data Structure for Pruning 3D Sampling Pre-buffer Slack Pruning Predictive Min-delay Pruning Experimental Results for Options are indexed by capacitive values 2 Few options left after power-delay sampling [Tam, DAC 05] under the same capacitive index 3 A linear list is used to store options under the same capacitive value 4 Capacitive values are organized by a binary search tree Percentage of options (00%) > Number of options indexed by the same capative value

19 Baseline Algorithm Data Structure for Pruning 3D Sampling Pre-buffer Slack Pruning Predictive Min-delay Pruning Experimental Results for Options are indexed by capacitive values 2 Few options left after power-delay sampling [Tam, DAC 05] under the same capacitive index 3 A linear list is used to store options under the same capacitive value 4 Capacitive values are organized by a binary search tree 40 Percentage of options (00%) > Number of options indexed by the same capative value C 0 = (p, q ), (p 2, q 2 ),... C = 8 C 2 = (p, q ), (p 2, q 2 ),... (p, q ), (p 2, q 2 ),... C 3 = 5 C 4 = 9 C 5 = 2 (p 3, q 3 ), (p 3, q 3 ),... (p 4, q 4 ), (p 4, q 4 ),... (p 5, q 5 ), (p 5, q 5 ),

20 Motivation for 3D Sampling Baseline Algorithm Data Structure for Pruning 3D Sampling Pre-buffer Slack Pruning Predictive Min-delay Pruning Experimental Results for Power-delay sampling has shown effectiveness [Tam, DAC 05] 2 The size of the third dimenstion (capacitive values) increases significantly for large testcases node# sink# > 00 > % 62% % 64% % 65% % 7% 3 Sampling on all dimensions in power-delay-capacitance solution space is necessary

21 Motivation for 3D Sampling Baseline Algorithm Data Structure for Pruning 3D Sampling Pre-buffer Slack Pruning Predictive Min-delay Pruning Experimental Results for Power-delay sampling has shown effectiveness [Tam, DAC 05] 2 The size of the third dimenstion (capacitive values) increases significantly for large testcases node# sink# > 00 > % 62% % 64% % 65% % 7% 3 Sampling on all dimensions in power-delay-capacitance solution space is necessary

22 Motivation for 3D Sampling Baseline Algorithm Data Structure for Pruning 3D Sampling Pre-buffer Slack Pruning Predictive Min-delay Pruning Experimental Results for Power-delay sampling has shown effectiveness [Tam, DAC 05] 2 The size of the third dimenstion (capacitive values) increases significantly for large testcases node# sink# > 00 > % 62% % 64% % 65% % 7% 3 Sampling on all dimensions in power-delay-capacitance solution space is necessary

23 Baseline Algorithm Data Structure for Pruning 3D Sampling Pre-buffer Slack Pruning Predictive Min-delay Pruning Experimental Results for The idea of 3D sampling is to pick only a certain number of options among all options uniformly over the power-delay-capacitance space for upstream propagation power dissipation power dissipation required arrival time capacitance required arrival time capacitance 50 (a) 2D sampling (b) 3D sampling

24 Baseline Algorithm Data Structure for Pruning 3D Sampling Pre-buffer Slack Pruning Predictive Min-delay Pruning Experimental Results for Pre-buffer Slack Pruning (PSP) [Shi, ASPDAC 05] Suppose R min is the minimal resistance in the buffer library. For two non-redundant options Φ = (rat, cap, pwr, θ ) and Φ 2 = (rat 2, cap 2, pwr 2, θ 2 ), where rat < rat 2 and cap < cap 2, then Φ 2 is pruned, if (rat 2 rat )/(cap 2 cap ) R min Extension for Dual-V dd PSP Choose a proper high / low V dd buffer resistance R H / R L for PSP to avoid overly aggressive pruning 2 If θ = true, R min = R H. Otherwise, R min = R L.

25 Baseline Algorithm Data Structure for Pruning 3D Sampling Pre-buffer Slack Pruning Predictive Min-delay Pruning Experimental Results for Assume a continuous number of buffers and buffer sizes 2 The optimum unit length delay, delay opt, is given by [Bakoglou, book, 990] delay opt = 2 r s c o rc( + 2 ( + c p )) (4) c o 3 (4) calculates the lower bound of delay from any node to the source Predictive min-delay pruning (PMP) A newly generated option Φ = (rat, cap, pwr, θ) is pruned if rat delay opt dis(v) < RAT 0, where RAT 0 is the target RAT at the source, dis(v) is the distance of the node-to-source path

26 Experimental Settings Baseline Algorithm Data Structure for Pruning 3D Sampling Pre-buffer Slack Pruning Predictive Min-delay Pruning Experimental Results for Extract technical parameters by BSIM4 and SPICE under 65nm Table: Settings for the 65nm global interconnect. Settings Values Interconnect r w = 0.86Ω/µm, c w = 0.059fF/µm V dd H Buffer c in = 0.47fF, Vdd H =.2V (min size) ro H = 4.7kΩ, dh b = 72ps, EH b = 84fJ V dd L Buffer c in = 0.47fF, Vdd L = 0.9V (min size) Level converter ro L = 5.4kΩ, dl b = 98ps, EL b = 34fJ c in = 0.47fF, E LC = 5.7fJ (min size) d LC = 220ps 2 Randomly generate and route 0 nets by GeoSteiner

27 Experimental Settings Baseline Algorithm Data Structure for Pruning 3D Sampling Pre-buffer Slack Pruning Predictive Min-delay Pruning Experimental Results for Extract technical parameters by BSIM4 and SPICE under 65nm Table: Settings for the 65nm global interconnect. Settings Values Interconnect r w = 0.86Ω/µm, c w = 0.059fF/µm V dd H Buffer c in = 0.47fF, Vdd H =.2V (min size) ro H = 4.7kΩ, dh b = 72ps, EH b = 84fJ V dd L Buffer c in = 0.47fF, Vdd L = 0.9V (min size) Level converter ro L = 5.4kΩ, dl b = 98ps, EL b = 34fJ c in = 0.47fF, E LC = 5.7fJ (min size) d LC = 220ps 2 Randomly generate and route 0 nets by GeoSteiner

28 Baseline Algorithm Data Structure for Pruning 3D Sampling Pre-buffer Slack Pruning Predictive Min-delay Pruning Experimental Results for runtime (s) delay power net DVB sam pmp+sam psp+sam all sam pmp+sam psp+sam all sam pmp+sam psp+sam all s s s s s s s s s ave Sampling grid size is 20x20x20 for 3D sampling and 20x20 for DVB 2 Accompanied with substantial speedup, 3D sampling brings significant error for large testcases 3 Combining PSP/PMP with 3D sampling improves runtime and solution quality 4 PSP / PMP prunes many redundant options and 3D sampling can always select option samples from a good candidate pool

29 Baseline Algorithm Data Structure for Pruning 3D Sampling Pre-buffer Slack Pruning Predictive Min-delay Pruning Experimental Results for runtime (s) delay power net DVB sam pmp+sam psp+sam all sam pmp+sam psp+sam all sam pmp+sam psp+sam all s s s s s s s s s ave Sampling grid size is 20x20x20 for 3D sampling and 20x20 for DVB 2 Accompanied with substantial speedup, 3D sampling brings significant error for large testcases 3 Combining PSP/PMP with 3D sampling improves runtime and solution quality 4 PSP / PMP prunes many redundant options and 3D sampling can always select option samples from a good candidate pool

30 Baseline Algorithm Data Structure for Pruning 3D Sampling Pre-buffer Slack Pruning Predictive Min-delay Pruning Experimental Results for runtime (s) delay power net DVB sam pmp+sam psp+sam all sam pmp+sam psp+sam all sam pmp+sam psp+sam all s s s s s s s s s ave Sampling grid size is 20x20x20 for 3D sampling and 20x20 for DVB 2 Accompanied with substantial speedup, 3D sampling brings significant error for large testcases 3 Combining PSP/PMP with 3D sampling improves runtime and solution quality 4 PSP / PMP prunes many redundant options and 3D sampling can always select option samples from a good candidate pool

31 Baseline Algorithm Data Structure for Pruning 3D Sampling Pre-buffer Slack Pruning Predictive Min-delay Pruning Experimental Results for runtime (s) delay power net DVB sam pmp+sam psp+sam all sam pmp+sam psp+sam all sam pmp+sam psp+sam all s s s s s s s s s ave Sampling grid size is 20x20x20 for 3D sampling and 20x20 for DVB 2 Accompanied with substantial speedup, 3D sampling brings significant error for large testcases 3 Combining PSP/PMP with 3D sampling improves runtime and solution quality 4 PSP / PMP prunes many redundant options and 3D sampling can always select option samples from a good candidate pool

32 Buffered Tree Construction Baseline Algorithm Routing Grid Reduction Experimental Results Introduction Buffered Tree Construction Baseline Algorithm Routing Grid Reduction Experimental Results 5

33 Buffered Tree Construction Baseline Algorithm Routing Grid Reduction Experimental Results Baseline Algorithm Extend the delay optimization buffered tree construction algorithm [Cong, DAC 00] Build Hanan Graph w / buffer insertion nodes according to locations of buffer stations Path search on the grid by option propagation 2 Option growth is exponential 3 Power and dual-v dd buffers further accelerate option growth Definition (Domination) In node n, option Φ = (S, R, rat, cap, pwr, θ ) dominates Φ 2 = (S 2, R 2, rat 2, cap 2, pwr 2, θ 2 ), if S S 2, rat rat 2, cap cap 2, and pwr pwr 2

34 Routing Grid Reduction Buffered Tree Construction Baseline Algorithm Routing Grid Reduction Experimental Results Option growth is exponential w.r.t. routing grid size 2 Restrict the progation direction always towards the source (c) Before reduction (d) After reduction

35 Experimental Results Buffered Tree Construction Baseline Algorithm Routing Grid Reduction Experimental Results Our Fast stree / dtree algorithm runs over 00x faster than S-Tree/D-Tree [Tam, DAC 05] within % power overhead 2 Be able to solve the buffered tree construction problem on 0 sinks with 400+ buffer stations test cases runtime(s) RAT*(ps) power(fj) name n# s# nl# S-Tree stree D-Tree dtree S-Tree stree S-Tree stree D-Tree dtree grid grid grid grid grid grid < 00 < 00 > 99% < 0% < 0%

36 Conclusions Future Work Introduction Conclusions Future Work

37 Conclusions Future Work Conclusions Presented efficient algorithms to dual-v dd buffering for power optimization Studied three pruning techniques including PSP, PMP and 3D sampling Proposed grid reduction for buffered tree construction speedup Experimental results show that we obtain over 50x and 00x speedup compared with the most efficient existing algorithms [Tam, DAC 05] for dual-v dd buffer insertion and buffered tree construction, respectively

38 Conclusions Future Work Future Work Further improve the efficiency of buffered tree construction by adapting hierarchical tree generation algorithm 2 Slack allocation for more power reduction Chip level FPGA dual-v dd assignment [Lin, DAC 05] Fix buffer location, assign V dd levels Consider multiple critical path Solve as a linear programming problem More freedom of ASIC buffering introduces more challenge to algorithms

Fast Algorithms for Power-optimal Buffering

Fast Algorithms for Power-optimal Buffering Submission under Review, Please DO NOT Distribute Fast Algorithms for Power-optimal Buffering Yu Hu and Jing Tong Electrical Engineering Dept. Tsinghua University, Beijing, P.R. China {matrix98, jingtong}@tsinghua.edu.cn

More information

Making Fast Buffer Insertion Even Faster Via Approximation Techniques

Making Fast Buffer Insertion Even Faster Via Approximation Techniques 1A-3 Making Fast Buffer Insertion Even Faster Via Approximation Techniques Zhuo Li 1,C.N.Sze 1, Charles J. Alpert 2, Jiang Hu 1, and Weiping Shi 1 1 Dept. of Electrical Engineering, Texas A&M University,

More information

Power Optima l Dual-V dd Bu ffered Tre e Con sid erin g Bu f fer Stations and Blockages

Power Optima l Dual-V dd Bu ffered Tre e Con sid erin g Bu f fer Stations and Blockages Power Optima l Dual-V Bu ffered Tre e Con sid erin g Bu f fer Stations and Blockages 30.1 King o Tam and ei e Electrical Engineering Dept. Univ. of California, os Angeles, CA 90095, USA {ktam, lhe}@ee.ucla.edu

More information

Physical Design of Digital Integrated Circuits (EN0291 S40) Sherief Reda Division of Engineering, Brown University Fall 2006

Physical Design of Digital Integrated Circuits (EN0291 S40) Sherief Reda Division of Engineering, Brown University Fall 2006 Physical Design of Digital Integrated Circuits (EN0291 S40) Sherief Reda Division of Engineering, Brown University Fall 2006 1 Lecture 10: Repeater (Buffer) Insertion Introduction to Buffering Buffer Insertion

More information

Time Algorithm for Optimal Buffer Insertion with b Buffer Types *

Time Algorithm for Optimal Buffer Insertion with b Buffer Types * An Time Algorithm for Optimal Buffer Insertion with b Buffer Types * Zhuo Dept. of Electrical Engineering Texas University College Station, Texas 77843, USA. zhuoli@ee.tamu.edu Weiping Shi Dept. of Electrical

More information

Interconnect Delay and Area Estimation for Multiple-Pin Nets

Interconnect Delay and Area Estimation for Multiple-Pin Nets Interconnect Delay and Area Estimation for Multiple-Pin Nets Jason Cong and David Z. Pan UCLA Computer Science Department Los Angeles, CA 90095 Sponsored by SRC and Avant!! under CA-MICRO Presentation

More information

Fast Algorithms For Slew Constrained Minimum Cost Buffering

Fast Algorithms For Slew Constrained Minimum Cost Buffering 18.3 Fast Algorithms For Slew Constrained Minimum Cost Buffering Shiyan Hu, Charles J. Alpert, Jiang Hu, Shrirang Karandikar, Zhuo Li, Weiping Shi, C. N. Sze Department of Electrical and Computer Engineering,

More information

An Efficient Algorithm For RLC Buffer Insertion

An Efficient Algorithm For RLC Buffer Insertion An Efficient Algorithm For RLC Buffer Insertion Zhanyuan Jiang, Shiyan Hu, Jiang Hu and Weiping Shi Texas A&M University, College Station, Texas 77840 Email: {jerryjiang, hushiyan, jianghu, wshi}@ece.tamu.edu

More information

An Efficient Routing Tree Construction Algorithm with Buffer Insertion, Wire Sizing and Obstacle Considerations

An Efficient Routing Tree Construction Algorithm with Buffer Insertion, Wire Sizing and Obstacle Considerations An Efficient Routing Tree Construction Algorithm with uffer Insertion, Wire Sizing and Obstacle Considerations Sampath Dechu Zion Cien Shen Chris C N Chu Physical Design Automation Group Dept Of ECpE Dept

More information

DpRouter: A Fast and Accurate Dynamic- Pattern-Based Global Routing Algorithm

DpRouter: A Fast and Accurate Dynamic- Pattern-Based Global Routing Algorithm DpRouter: A Fast and Accurate Dynamic- Pattern-Based Global Routing Algorithm Zhen Cao 1,Tong Jing 1, 2, Jinjun Xiong 2, Yu Hu 2, Lei He 2, Xianlong Hong 1 1 Tsinghua University 2 University of California,

More information

Leakage Efficient Chip-Level Dual-Vdd Assignment with Time Slack Allocation for FPGA Power Reduction

Leakage Efficient Chip-Level Dual-Vdd Assignment with Time Slack Allocation for FPGA Power Reduction 44.1 Leakage Efficient Chip-Level Dual-Vdd Assignment with Time Slack Allocation for FPGA Power Reduction Yan Lin and Lei He Electrical Engineering Department University of California, Los Angeles, CA

More information

Variation Tolerant Buffered Clock Network Synthesis with Cross Links

Variation Tolerant Buffered Clock Network Synthesis with Cross Links Variation Tolerant Buffered Clock Network Synthesis with Cross Links Anand Rajaram David Z. Pan Dept. of ECE, UT-Austin Texas Instruments, Dallas Sponsored by SRC and IBM Faculty Award 1 Presentation Outline

More information

Symmetrical Buffered Clock-Tree Synthesis with Supply-Voltage Alignment

Symmetrical Buffered Clock-Tree Synthesis with Supply-Voltage Alignment Symmetrical Buffered Clock-Tree Synthesis with Supply-Voltage Alignment Xin-Wei Shih, Tzu-Hsuan Hsu, Hsu-Chieh Lee, Yao-Wen Chang, Kai-Yuan Chao 2013.01.24 1 Outline 2 Clock Network Synthesis Clock network

More information

AN OPTIMIZED ALGORITHM FOR SIMULTANEOUS ROUTING AND BUFFER INSERTION IN MULTI-TERMINAL NETS

AN OPTIMIZED ALGORITHM FOR SIMULTANEOUS ROUTING AND BUFFER INSERTION IN MULTI-TERMINAL NETS www.arpnjournals.com AN OPTIMIZED ALGORITHM FOR SIMULTANEOUS ROUTING AND BUFFER INSERTION IN MULTI-TERMINAL NETS C. Uttraphan 1, N. Shaikh-Husin 2 1 Embedded Computing System (EmbCoS) Research Focus Group,

More information

Vdd Programmable and Variation Tolerant FPGA Circuits and Architectures

Vdd Programmable and Variation Tolerant FPGA Circuits and Architectures Vdd Programmable and Variation Tolerant FPGA Circuits and Architectures Prof. Lei He EE Department, UCLA LHE@ee.ucla.edu Partially supported by NSF. Pathway to Power Efficiency and Variation Tolerance

More information

PERFORMANCE optimization has always been a critical

PERFORMANCE optimization has always been a critical IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 18, NO. 11, NOVEMBER 1999 1633 Buffer Insertion for Noise and Delay Optimization Charles J. Alpert, Member, IEEE, Anirudh

More information

A Novel Performance-Driven Topology Design Algorithm

A Novel Performance-Driven Topology Design Algorithm A Novel Performance-Driven Topology Design Algorithm Min Pan, Chris Chu Priyadarshan Patra Electrical and Computer Engineering Dept. Intel Corporation Iowa State University, Ames, IA 50011 Hillsboro, OR

More information

Congestion-Aware Power Grid. and CMOS Decoupling Capacitors. Pingqiang Zhou Karthikk Sridharan Sachin S. Sapatnekar

Congestion-Aware Power Grid. and CMOS Decoupling Capacitors. Pingqiang Zhou Karthikk Sridharan Sachin S. Sapatnekar Congestion-Aware Power Grid Optimization for 3D circuits Using MIM and CMOS Decoupling Capacitors Pingqiang Zhou Karthikk Sridharan Sachin S. Sapatnekar University of Minnesota 1 Outline Motivation A new

More information

Porosity Aware Buffered Steiner Tree Construction

Porosity Aware Buffered Steiner Tree Construction IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. XX, NO. Y, MONTH 2003 100 Porosity Aware Buffered Steiner Tree Construction Charles J. Alpert, Gopal Gandham, Milos Hrkic,

More information

An Interconnect-Centric Design Flow for Nanometer. Technologies

An Interconnect-Centric Design Flow for Nanometer. Technologies An Interconnect-Centric Design Flow for Nanometer Technologies Jason Cong Department of Computer Science University of California, Los Angeles, CA 90095 Abstract As the integrated circuits (ICs) are scaled

More information

A Faster Approximation Scheme For Timing Driven Minimum Cost Layer Assignment

A Faster Approximation Scheme For Timing Driven Minimum Cost Layer Assignment A Faster Approximation Scheme For Timing Driven Minimum Cost Layer Assignment ABSTRACT Shiyan Hu Dept. of Electrical and Computer Engineering Michigan Technological University Houghton, Michigan 49931

More information

Path-Based Buffer Insertion

Path-Based Buffer Insertion 1346 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 26, NO. 7, JULY 2007 Path-Based Buffer Insertion C. N. Sze, Charles J. Alpert, Jiang Hu, and Weiping Shi Abstract

More information

Fast Delay Estimation with Buffer Insertion for Through-Silicon-Via-Based 3D Interconnects

Fast Delay Estimation with Buffer Insertion for Through-Silicon-Via-Based 3D Interconnects Fast Delay Estimation with Buffer Insertion for Through-Silicon-Via-Based 3D Interconnects Young-Joon Lee and Sung Kyu Lim Electrical and Computer Engineering, Georgia Institute of Technology email: yjlee@gatech.edu

More information

Buffered Steiner Trees for Difficult Instances

Buffered Steiner Trees for Difficult Instances Buffered Steiner Trees for Difficult Instances C. J. Alpert 1, M. Hrkic 2, J. Hu 1, A. B. Kahng 3, J. Lillis 2, B. Liu 3, S. T. Quay 1, S. S. Sapatnekar 4, A. J. Sullivan 1, P. Villarrubia 1 1 IBM Corp.,

More information

Buffered Routing Tree Construction Under Buffer Placement Blockages

Buffered Routing Tree Construction Under Buffer Placement Blockages Buffered Routing Tree Construction Under Buffer Placement Blockages Abstract Interconnect delay has become a critical factor in determining the performance of integrated circuits. Routing and buffering

More information

Multi-Voltage Domain Clock Mesh Design

Multi-Voltage Domain Clock Mesh Design Multi-Voltage Domain Clock Mesh Design Can Sitik Electrical and Computer Engineering Drexel University Philadelphia, PA, 19104 USA E-mail: as3577@drexel.edu Baris Taskin Electrical and Computer Engineering

More information

Double Patterning Layout Decomposition for Simultaneous Conflict and Stitch Minimization

Double Patterning Layout Decomposition for Simultaneous Conflict and Stitch Minimization Double Patterning Layout Decomposition for Simultaneous Conflict and Stitch Minimization Kun Yuan, Jae-Seo Yang, David Z. Pan Dept. of Electrical and Computer Engineering The University of Texas at Austin

More information

PushPull: Short Path Padding for Timing Error Resilient Circuits YU-MING YANG IRIS HUI-RU JIANG SUNG-TING HO. IRIS Lab National Chiao Tung University

PushPull: Short Path Padding for Timing Error Resilient Circuits YU-MING YANG IRIS HUI-RU JIANG SUNG-TING HO. IRIS Lab National Chiao Tung University PushPull: Short Path Padding for Timing Error Resilient Circuits YU-MING YANG IRIS HUI-RU JIANG SUNG-TING HO IRIS Lab National Chiao Tung University Outline Introduction Problem Formulation Algorithm -

More information

Architecture Evaluation for

Architecture Evaluation for Architecture Evaluation for Power-efficient FPGAs Fei Li*, Deming Chen +, Lei He*, Jason Cong + * EE Department, UCLA + CS Department, UCLA Partially supported by NSF and SRC Outline Introduction Evaluation

More information

S 1 S 2. C s1. C s2. S n. C sn. S 3 C s3. Input. l k S k C k. C 1 C 2 C k-1. R d

S 1 S 2. C s1. C s2. S n. C sn. S 3 C s3. Input. l k S k C k. C 1 C 2 C k-1. R d Interconnect Delay and Area Estimation for Multiple-Pin Nets Jason Cong and David Zhigang Pan Department of Computer Science University of California, Los Angeles, CA 90095 Email: fcong,pang@cs.ucla.edu

More information

Fast Interconnect Synthesis with Layer Assignment

Fast Interconnect Synthesis with Layer Assignment Fast Interconnect Synthesis with Layer Assignment Zhuo Li IBM Research lizhuo@us.ibm.com Tuhin Muhmud IBM Systems and Technology Group tuhinm@us.ibm.com Charles J. Alpert IBM Research alpert@us.ibm.com

More information

Device And Architecture Co-Optimization for FPGA Power Reduction

Device And Architecture Co-Optimization for FPGA Power Reduction 54.2 Device And Architecture Co-Optimization for FPGA Power Reduction Lerong Cheng, Phoebe Wong, Fei Li, Yan Lin, and Lei He Electrical Engineering Department University of California, Los Angeles, CA

More information

Chapter 28: Buffering in the Layout Environment

Chapter 28: Buffering in the Layout Environment Chapter 28: Buffering in the Layout Environment Jiang Hu, and C. N. Sze 1 Introduction Chapters 26 and 27 presented buffering algorithms where the buffering problem was isolated from the general problem

More information

An Interconnect-Centric Design Flow for Nanometer Technologies

An Interconnect-Centric Design Flow for Nanometer Technologies An Interconnect-Centric Design Flow for Nanometer Technologies Professor Jason Cong UCLA Computer Science Department Los Angeles, CA 90095 http://cadlab.cs.ucla.edu/~ /~cong

More information

Simultaneous Shield and Buffer Insertion for Crosstalk Noise Reduction in Global Routing

Simultaneous Shield and Buffer Insertion for Crosstalk Noise Reduction in Global Routing Simultaneous Shield and Buffer Insertion for Crosstalk Noise Reduction in Global Routing Tianpei Zhang and Sachin S. Sapatnekar Department of Electrical and Computer Engineering, University of Minnesota

More information

AS VLSI technology scales to deep submicron and beyond, interconnect

AS VLSI technology scales to deep submicron and beyond, interconnect IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL., NO. 1 TILA-S: Timing-Driven Incremental Layer Assignment Avoiding Slew Violations Derong Liu, Bei Yu Member, IEEE, Salim

More information

Vdd Programmability to Reduce FPGA Interconnect Power

Vdd Programmability to Reduce FPGA Interconnect Power Vdd Programmability to Reduce FPGA Interconnect Power Fei Li, Yan Lin and Lei He Electrical Engineering Department University of California, Los Angeles, CA 90095 ABSTRACT Power is an increasingly important

More information

Crosslink Insertion for Variation-Driven Clock Network Construction

Crosslink Insertion for Variation-Driven Clock Network Construction Crosslink Insertion for Variation-Driven Clock Network Construction Fuqiang Qian, Haitong Tian, Evangeline Young Department of Computer Science and Engineering The Chinese University of Hong Kong {fqqian,

More information

Linking Layout to Logic Synthesis: A Unification-Based Approach

Linking Layout to Logic Synthesis: A Unification-Based Approach Linking Layout to Logic Synthesis: A Unification-Based Approach Massoud Pedram Department of EE-Systems University of Southern California Los Angeles, CA February 1998 Outline Introduction Technology and

More information

Incremental Layer Assignment for Critical Path Timing

Incremental Layer Assignment for Critical Path Timing Incremental Layer Assignment for Critical Path Timing Derong Liu 1, Bei Yu 2, Salim Chowdhury 3, and David Z. Pan 1 1 ECE Department, University of Texas at Austin, Austin, TX, USA 2 CSE Department, Chinese

More information

Timing-Driven Maze Routing

Timing-Driven Maze Routing 234 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 19, NO. 2, FEBRUARY 2000 Timing-Driven Maze Routing Sung-Woo Hur, Ashok Jagannathan, and John Lillis Abstract This

More information

Pyramids: An Efficient Computational Geometry-based Approach for Timing-Driven Placement

Pyramids: An Efficient Computational Geometry-based Approach for Timing-Driven Placement Pyramids: An Efficient Computational Geometry-based Approach for Timing-Driven Placement Tao Luo, David A. Papa, Zhuo Li, C. N. Sze, Charles J. Alpert and David Z. Pan Magma Design Automation / Austin,

More information

Topological Routing to Maximize Routability for Package Substrate

Topological Routing to Maximize Routability for Package Substrate Topological Routing to Maximize Routability for Package Substrate Speaker: Guoqiang Chen Authors: Shenghua Liu, Guoqiang Chen, Tom Tong Jing, Lei He, Tianpei Zhang, Robby Dutta, Xian-Long Hong Outline

More information

Postgrid Clock Routing for High Performance Microprocessor Designs

Postgrid Clock Routing for High Performance Microprocessor Designs IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 31, NO. 2, FEBRUARY 2012 255 Postgrid Clock Routing for High Performance Microprocessor Designs Haitong Tian, Wai-Chung

More information

Symmetrical Buffer Placement in Clock Trees for Minimal Skew Immune to Global On-chip Variations

Symmetrical Buffer Placement in Clock Trees for Minimal Skew Immune to Global On-chip Variations Symmetrical Buffer Placement in Clock Trees for Minimal Skew Immune to Global On-chip Variations Renshen Wang Department of Computer Science and Engineering University of California, San Diego La Jolla,

More information

Construction of All Rectilinear Steiner Minimum Trees on the Hanan Grid

Construction of All Rectilinear Steiner Minimum Trees on the Hanan Grid Construction of All Rectilinear Steiner Minimum Trees on the Hanan Grid Sheng-En David Lin and Dae Hyun Kim Presenter: Dae Hyun Kim (Assistant Professor) daehyun@eecs.wsu.edu School of Electrical Engineering

More information

HS3DPG: Hierarchical Simulation for 3D P/G Network Shuai Tao, Xiaoming Chen,Yu Wang, Yuchun Ma, Yiyu Shi, Hui Wang, Huazhong Yang

HS3DPG: Hierarchical Simulation for 3D P/G Network Shuai Tao, Xiaoming Chen,Yu Wang, Yuchun Ma, Yiyu Shi, Hui Wang, Huazhong Yang HS3DPG: Hierarchical Simulation for 3D P/G Network Shuai Tao, Xiaoming Chen,Yu Wang, Yuchun Ma, Yiyu Shi, Hui Wang, Huazhong Yang Presented by Shuai Tao (Tsinghua University, Beijing,China) January 24,

More information

SUBMITTED FOR PUBLICATION TO: IEEE TRANSACTIONS ON VLSI, DECEMBER 5, A Low-Power Field-Programmable Gate Array Routing Fabric.

SUBMITTED FOR PUBLICATION TO: IEEE TRANSACTIONS ON VLSI, DECEMBER 5, A Low-Power Field-Programmable Gate Array Routing Fabric. SUBMITTED FOR PUBLICATION TO: IEEE TRANSACTIONS ON VLSI, DECEMBER 5, 2007 1 A Low-Power Field-Programmable Gate Array Routing Fabric Mingjie Lin Abbas El Gamal Abstract This paper describes a new FPGA

More information

Physical Synthesis for FPGA Interconnect Power Reduction by Dual-Vdd Budgeting and Retiming

Physical Synthesis for FPGA Interconnect Power Reduction by Dual-Vdd Budgeting and Retiming Physical Synthesis for FPGA Interconnect Power Reduction by Dual-Vdd Budgeting and Retiming YU HU, YAN LIN, and LEI HE University of California, Los Angeles and TIM TUAN Xilinx Research Labs Field programmable

More information

Symmetrical Buffer Placement in Clock Trees for Minimal Skew Immune to Global On-chip Variations

Symmetrical Buffer Placement in Clock Trees for Minimal Skew Immune to Global On-chip Variations XXVII IEEE INTERNATIONAL CONFERENCE ON COMPUTER DESIGN, OCTOBER 5, 2009 Symmetrical Buffer Placement in Clock Trees for Minimal Skew Immune to Global On-chip Variations Renshen Wang 1 Takumi Okamoto 2

More information

An FPGA Architecture Supporting Dynamically-Controlled Power Gating

An FPGA Architecture Supporting Dynamically-Controlled Power Gating An FPGA Architecture Supporting Dynamically-Controlled Power Gating Altera Corporation March 16 th, 2012 Assem Bsoul and Steve Wilton {absoul, stevew}@ece.ubc.ca System-on-Chip Research Group Department

More information

An Exact Algorithm for the Statistical Shortest Path Problem

An Exact Algorithm for the Statistical Shortest Path Problem An Exact Algorithm for the Statistical Shortest Path Problem Liang Deng and Martin D. F. Wong Dept. of Electrical and Computer Engineering University of Illinois at Urbana-Champaign Outline Motivation

More information

Total Power-Optimal Pipelining and Parallel Processing under Process Variations in Nanometer Technology

Total Power-Optimal Pipelining and Parallel Processing under Process Variations in Nanometer Technology otal Power-Optimal Pipelining and Parallel Processing under Process ariations in anometer echnology am Sung Kim 1, aeho Kgil, Keith Bowman 1, ivek De 1, and revor Mudge 1 Intel Corporation, Hillsboro,

More information

Minimizing Power Dissipation during. University of Southern California Los Angeles CA August 28 th, 2007

Minimizing Power Dissipation during. University of Southern California Los Angeles CA August 28 th, 2007 Minimizing Power Dissipation during Write Operation to Register Files Kimish Patel, Wonbok Lee, Massoud Pedram University of Southern California Los Angeles CA August 28 th, 2007 Introduction Outline Conditional

More information

Routing Tree Construction with Buffer Insertion under Buffer Location Constraints and Wiring Obstacles

Routing Tree Construction with Buffer Insertion under Buffer Location Constraints and Wiring Obstacles Routing Tree Construction with Buffer Insertion under Buffer Location Constraints and Wiring Obstacles Ying Rao, Tianxiang Yang University of Wisconsin Madison {yrao, tyang}@cae.wisc.edu ABSTRACT Buffer

More information

OUTLINE Introduction Power Components Dynamic Power Optimization Conclusions

OUTLINE Introduction Power Components Dynamic Power Optimization Conclusions OUTLINE Introduction Power Components Dynamic Power Optimization Conclusions 04/15/14 1 Introduction: Low Power Technology Process Hardware Architecture Software Multi VTH Low-power circuits Parallelism

More information

FPGA Power Reduction Using Configurable Dual-Vdd

FPGA Power Reduction Using Configurable Dual-Vdd FPGA Power Reduction Using Configurable Dual-Vdd 45.1 Fei Li, Yan Lin and Lei He Electrical Engineering Department University of California, Los Angeles, CA {feil, ylin, lhe}@ee.ucla.edu ABSTRACT Power

More information

An Efficient Chip-level Time Slack Allocation Algorithm for Dual-Vdd FPGA Power Reduction

An Efficient Chip-level Time Slack Allocation Algorithm for Dual-Vdd FPGA Power Reduction An Efficient Chip-level Time Slack Allocation Algorithm for Dual-Vdd FPGA Power Reduction Yan Lin 1, Yu Hu 1, Lei He 1 and Vijay Raghunat 2 Electrical Engineering Dept., UCLA, Los Angeles, CA 1 Purdue

More information

Achieving Lightweight Multicast in Asynchronous Networks-on-Chip Using Local Speculation

Achieving Lightweight Multicast in Asynchronous Networks-on-Chip Using Local Speculation Achieving Lightweight Multicast in Asynchronous Networks-on-Chip Using Local Speculation Kshitij Bhardwaj Dept. of Computer Science Columbia University Steven M. Nowick 2016 ACM/IEEE Design Automation

More information

POWER OPTIMIZATION USING BODY BIASING METHOD FOR DUAL VOLTAGE FPGA

POWER OPTIMIZATION USING BODY BIASING METHOD FOR DUAL VOLTAGE FPGA POWER OPTIMIZATION USING BODY BIASING METHOD FOR DUAL VOLTAGE FPGA B.Sankar 1, Dr.C.N.Marimuthu 2 1 PG Scholar, Applied Electronics, Nandha Engineering College, Tamilnadu, India 2 Dean/Professor of ECE,

More information

Problem Formulation. Specialized algorithms are required for clock (and power nets) due to strict specifications for routing such nets.

Problem Formulation. Specialized algorithms are required for clock (and power nets) due to strict specifications for routing such nets. Clock Routing Problem Formulation Specialized algorithms are required for clock (and power nets) due to strict specifications for routing such nets. Better to develop specialized routers for these nets.

More information

Cluster-based approach eases clock tree synthesis

Cluster-based approach eases clock tree synthesis Page 1 of 5 EE Times: Design News Cluster-based approach eases clock tree synthesis Udhaya Kumar (11/14/2005 9:00 AM EST) URL: http://www.eetimes.com/showarticle.jhtml?articleid=173601961 Clock network

More information

Circuit Model for Interconnect Crosstalk Noise Estimation in High Speed Integrated Circuits

Circuit Model for Interconnect Crosstalk Noise Estimation in High Speed Integrated Circuits Advance in Electronic and Electric Engineering. ISSN 2231-1297, Volume 3, Number 8 (2013), pp. 907-912 Research India Publications http://www.ripublication.com/aeee.htm Circuit Model for Interconnect Crosstalk

More information

Place and Route for FPGAs

Place and Route for FPGAs Place and Route for FPGAs 1 FPGA CAD Flow Circuit description (VHDL, schematic,...) Synthesize to logic blocks Place logic blocks in FPGA Physical design Route connections between logic blocks FPGA programming

More information

Repeater Insertion in Tree Structured Inductive Interconnect: Software Description

Repeater Insertion in Tree Structured Inductive Interconnect: Software Description Repeater Insertion in Tree Structured Inductive Interconnect: Software Description Yehea I. Ismail ECE Department Northwestern University Evanston, IL 60208 Eby G. Friedman ECE Department University of

More information

ALGORITHMS FOR THE SCALING TOWARD NANOMETER VLSI PHYSICAL SYNTHESIS. A Dissertation CHIN NGAI SZE

ALGORITHMS FOR THE SCALING TOWARD NANOMETER VLSI PHYSICAL SYNTHESIS. A Dissertation CHIN NGAI SZE ALGORITHMS FOR THE SCALING TOWARD NANOMETER VLSI PHYSICAL SYNTHESIS A Dissertation by CHIN NGAI SZE Submitted to the Office of Graduate Studies of Texas A&M University in partial fulfillment of the requirements

More information

On the Decreasing Significance of Large Standard Cells in Technology Mapping

On the Decreasing Significance of Large Standard Cells in Technology Mapping On the Decreasing Significance of Standard s in Technology Mapping Jae-sun Seo, Igor Markov, Dennis Sylvester, and David Blaauw Department of EECS, University of Michigan, Ann Arbor, MI 48109 {jseo,imarkov,dmcs,blaauw}@umich.edu

More information

VID (Voltage ID) features require timing analysis on even more corners

VID (Voltage ID) features require timing analysis on even more corners Mehmet Avci Number of Corners for Typical Part Introduction The number of signoff PVT corners is increasing VID (Voltage ID) features require timing analysis on even more corners Designers try to reduce

More information

Technology Dependent Logic Optimization Prof. Kurt Keutzer EECS University of California Berkeley, CA Thanks to S. Devadas

Technology Dependent Logic Optimization Prof. Kurt Keutzer EECS University of California Berkeley, CA Thanks to S. Devadas Technology Dependent Logic Optimization Prof. Kurt Keutzer EECS University of California Berkeley, CA Thanks to S. Devadas 1 RTL Design Flow HDL RTL Synthesis Manual Design Module Generators Library netlist

More information

POWER PERFORMANCE OPTIMIZATION METHODS FOR DIGITAL CIRCUITS

POWER PERFORMANCE OPTIMIZATION METHODS FOR DIGITAL CIRCUITS POWER PERFORMANCE OPTIMIZATION METHODS FOR DIGITAL CIRCUITS Radu Zlatanovici zradu@eecs.berkeley.edu http://www.eecs.berkeley.edu/~zradu Department of Electrical Engineering and Computer Sciences University

More information

NoCIC: A Spice-based Interconnect Planning Tool Emphasizing Aggressive On-Chip Interconnect Circuit Methods

NoCIC: A Spice-based Interconnect Planning Tool Emphasizing Aggressive On-Chip Interconnect Circuit Methods 1 NoCIC: A Spice-based Interconnect Planning Tool Emphasizing Aggressive On-Chip Interconnect Circuit Methods V. Venkatraman, A. Laffely, J. Jang, H. Kukkamalla, Z. Zhu & W. Burleson Interconnect Circuit

More information

A Survey on Buffered Clock Tree Synthesis for Skew Optimization

A Survey on Buffered Clock Tree Synthesis for Skew Optimization A Survey on Buffered Clock Tree Synthesis for Skew Optimization Anju Rose Tom 1, K. Gnana Sheela 2 1, 2 Electronics and Communication Department, Toc H Institute of Science and Technology, Kerala, India

More information

Power-Optimal Pipelining in Deep Submicron Technology Λ

Power-Optimal Pipelining in Deep Submicron Technology Λ 8. Power-Optimal Pipelining in Deep Submicron Technology Λ Seongmoo Heo and Krste Asanović MIT Computer Science and Artificial Intelligence Laboratory Vassar Street, Cambridge, MA 9 fheomoo,krsteg@csail.mit.edu

More information

[14] M. A. B. Jackson, A. Srinivasan and E. S. Kuh, Clock routing for high-performance ICs, 27th ACM

[14] M. A. B. Jackson, A. Srinivasan and E. S. Kuh, Clock routing for high-performance ICs, 27th ACM Journal of High Speed Electronics and Systems, pp65-81, 1996. [14] M. A. B. Jackson, A. Srinivasan and E. S. Kuh, Clock routing for high-performance ICs, 27th ACM IEEE Design AUtomation Conference, pp.573-579,

More information

CELL-BASED design technology has dominated

CELL-BASED design technology has dominated 16 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 6, NO., FEBRUARY 007 Performance Benefits of Monolithically Stacked 3-D FPGA Mingjie Lin, Student Member, IEEE, Abbas

More information

Abbas El Gamal. Joint work with: Mingjie Lin, Yi-Chang Lu, Simon Wong Work partially supported by DARPA 3D-IC program. Stanford University

Abbas El Gamal. Joint work with: Mingjie Lin, Yi-Chang Lu, Simon Wong Work partially supported by DARPA 3D-IC program. Stanford University Abbas El Gamal Joint work with: Mingjie Lin, Yi-Chang Lu, Simon Wong Work partially supported by DARPA 3D-IC program Stanford University Chip stacking Vertical interconnect density < 20/mm Wafer Stacking

More information

Concurrent Flip-Flop and Repeater Insertion for High Performance Integrated Circuits

Concurrent Flip-Flop and Repeater Insertion for High Performance Integrated Circuits Concurrent Flip-Flop and Repeater Insertion for High Performance Integrated Circuits Pasquale Cocchini pasquale.cocchini@intel.com, Intel Labs, CADResearch Abstract For many years, CMOS process scaling

More information

Incremental and On-demand Random Walk for Iterative Power Distribution Network Analysis

Incremental and On-demand Random Walk for Iterative Power Distribution Network Analysis Incremental and On-demand Random Walk for Iterative Power Distribution Network Analysis Yiyu Shi Wei Yao Jinjun Xiong Lei He Electrical Engineering Dept., UCLA IBM Thomas J. Watson Research Center Los

More information

Clock Tree Resynthesis for Multi-corner Multi-mode Timing Closure

Clock Tree Resynthesis for Multi-corner Multi-mode Timing Closure Clock Tree Resynthesis for Multi-corner Multi-mode Timing Closure Subhendu Roy 1, Pavlos M. Mattheakis 2, Laurent Masse-Navette 2 and David Z. Pan 1 1 ECE Department, The University of Texas at Austin

More information

A Framework for Systematic Evaluation and Exploration of Design Rules

A Framework for Systematic Evaluation and Exploration of Design Rules A Framework for Systematic Evaluation and Exploration of Design Rules Rani S. Ghaida* and Prof. Puneet Gupta EE Dept., University of California, Los Angeles (rani@ee.ucla.edu), (puneet@ee.ucla.edu) Work

More information

Simultaneous Shield and Buffer Insertion for Crosstalk Noise Reduction in Global Routing

Simultaneous Shield and Buffer Insertion for Crosstalk Noise Reduction in Global Routing Simultaneous Shield and Buffer Insertion for Crosstalk Noise Reduction in Global Routing Tianpei Zhang and Sachin S. Sapatnekar Department of Electrical and Computer Engineering, University of Minnesota

More information

An Interconnect-Centric Design Flow for Nanometer Technologies

An Interconnect-Centric Design Flow for Nanometer Technologies An Interconnect-Centric Design Flow for Nanometer Technologies Jason Cong UCLA Computer Science Department Email: cong@cs.ucla.edu Tel: 310-206-2775 URL: http://cadlab.cs.ucla.edu/~cong Exponential Device

More information

CONTINUED scaling of feature sizes in integrated circuits

CONTINUED scaling of feature sizes in integrated circuits 1070 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 29, NO. 7, JULY 2010 Dose Map and Placement Co-Optimization for Improved Timing Yield and Leakage Power Kwangok

More information

Semi-supervised learning and active learning

Semi-supervised learning and active learning Semi-supervised learning and active learning Le Song Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012 Combining classifiers Ensemble learning: a machine learning paradigm where multiple learners

More information

FPGA Power Management and Modeling Techniques

FPGA Power Management and Modeling Techniques FPGA Power Management and Modeling Techniques WP-01044-2.0 White Paper This white paper discusses the major challenges associated with accurately predicting power consumption in FPGAs, namely, obtaining

More information

FIELD programmable gate arrays (FPGAs) provide an attractive

FIELD programmable gate arrays (FPGAs) provide an attractive IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 13, NO. 9, SEPTEMBER 2005 1035 Circuits and Architectures for Field Programmable Gate Array With Configurable Supply Voltage Yan Lin,

More information

Department of Electrical and Computer Engineering

Department of Electrical and Computer Engineering LAGRANGIAN RELAXATION FOR GATE IMPLEMENTATION SELECTION Yi-Le Huang, Jiang Hu and Weiping Shi Department of Electrical and Computer Engineering Texas A&M University OUTLINE Introduction and motivation

More information

Kyoung Hwan Lim and Taewhan Kim Seoul National University

Kyoung Hwan Lim and Taewhan Kim Seoul National University Kyoung Hwan Lim and Taewhan Kim Seoul National University Table of Contents Introduction Motivational Example The Proposed Algorithm Experimental Results Conclusion In synchronous circuit design, all sequential

More information

Thermal-Aware 3D IC Physical Design and Architecture Exploration

Thermal-Aware 3D IC Physical Design and Architecture Exploration Thermal-Aware 3D IC Physical Design and Architecture Exploration Jason Cong & Guojie Luo UCLA Computer Science Department cong@cs.ucla.edu http://cadlab.cs.ucla.edu/~cong Supported by DARPA Outline Thermal-Aware

More information

Large Scale Circuit Partitioning

Large Scale Circuit Partitioning Large Scale Circuit Partitioning With Loose/Stable Net Removal And Signal Flow Based Clustering Jason Cong Honching Li Sung-Kyu Lim Dongmin Xu UCLA VLSI CAD Lab Toshiyuki Shibuya Fujitsu Lab, LTD Support

More information

CATALYST: Planning Layer Directives for Effective Design Closure

CATALYST: Planning Layer Directives for Effective Design Closure CATALYST: Planning Layer Directives for Effective Design Closure Yaoguang Wei 1, Zhuo Li 2, Cliff Sze 2 Shiyan Hu 3, Charles J. Alpert 2, Sachin S. Sapatnekar 1 1 Department of Electrical and Computer

More information

THE INCREASING impact of interconnect delay on the

THE INCREASING impact of interconnect delay on the 1126 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 36, NO. 7, JULY 2017 Incremental Layer Assignment Driven by an External Signoff Timing Engine Vinicius Livramento,

More information

Fine-Grained Sleep Transistor Sizing Algorithm for Leakage Power Minimization

Fine-Grained Sleep Transistor Sizing Algorithm for Leakage Power Minimization 6.1 Fine-Grained Sleep Transistor Sizing Algorithm for Leakage Power Minimization De-Shiuan Chiou, Da-Cheng Juan, Yu-Ting Chen, and Shih-Chieh Chang Department of CS, National Tsing Hua University, Hsinchu,

More information

University of California at Berkeley. Berkeley, CA the global routing in order to generate a feasible solution

University of California at Berkeley. Berkeley, CA the global routing in order to generate a feasible solution Post Routing Performance Optimization via Multi-Link Insertion and Non-Uniform Wiresizing Tianxiong Xue and Ernest S. Kuh Department of Electrical Engineering and Computer Sciences University of California

More information

Post-Route Gate Sizing for Crosstalk Noise Reduction

Post-Route Gate Sizing for Crosstalk Noise Reduction Post-Route Gate Sizing for Crosstalk Noise Reduction Murat R. Becer, David Blaauw, Ilan Algor, Rajendran Panda, Chanhee Oh, Vladimir Zolotov and Ibrahim N. Hajj Motorola Inc., Univ. of Michigan Ann Arbor,

More information

Simultaneous Slack Matching, Gate Sizing and Repeater Insertion for Asynchronous Circuits

Simultaneous Slack Matching, Gate Sizing and Repeater Insertion for Asynchronous Circuits Simultaneous Slack Matching, Gate Sizing and Repeater Insertion for Asynchronous Circuits Gang Wu and Chris Chu Department of Electrical and Computer Engineering, Iowa State University, IA Email: {gangwu,

More information

Clock Skew Optimization Considering Complicated Power Modes

Clock Skew Optimization Considering Complicated Power Modes Clock Skew Optimization Considering Complicated Power Modes Chiao-Ling Lung 1,2, Zi-Yi Zeng 1, Chung-Han Chou 1, Shih-Chieh Chang 1 National Tsing-Hua University, HsinChu, Taiwan 1 Industrial Technology

More information

PRIME: A Novel Processing-in-memory Architecture for Neural Network Computation in ReRAM-based Main Memory

PRIME: A Novel Processing-in-memory Architecture for Neural Network Computation in ReRAM-based Main Memory Scalable and Energy-Efficient Architecture Lab (SEAL) PRIME: A Novel Processing-in-memory Architecture for Neural Network Computation in -based Main Memory Ping Chi *, Shuangchen Li *, Tao Zhang, Cong

More information

Lithography Simulation-Based Full-Chip Design Analyses

Lithography Simulation-Based Full-Chip Design Analyses Lithography Simulation-Based Full-Chip Design Analyses Puneet Gupta a, Andrew B. Kahng a, Sam Nakagawa a,saumilshah b and Puneet Sharma c a Blaze DFM, Inc., Sunnyvale, CA; b University of Michigan, Ann

More information

Temperature-Aware Routing in 3D ICs

Temperature-Aware Routing in 3D ICs Temperature-Aware Routing in 3D ICs Tianpei Zhang, Yong Zhan and Sachin S. Sapatnekar Department of Electrical and Computer Engineering University of Minnesota 1 Outline Temperature-aware 3D global routing

More information