PushPull: Short Path Padding for Timing Error Resilient Circuits YU-MING YANG IRIS HUI-RU JIANG SUNG-TING HO. IRIS Lab National Chiao Tung University
|
|
- Jade Jefferson
- 5 years ago
- Views:
Transcription
1 PushPull: Short Path Padding for Timing Error Resilient Circuits YU-MING YANG IRIS HUI-RU JIANG SUNG-TING HO IRIS Lab National Chiao Tung University
2 Outline Introduction Problem Formulation Algorithm - PushPull Experimental Results Conclusion
3 Outline 3 Introduction Problem Formulation Algorithm - PushPull Experimental Results Conclusion
4 Introduction 4 Timing characterization is difficult due to a wide range of dynamic variations Supply voltage droops, process variations, etc. A timing guardband is reserved to ensure correct functionality Degrade circuit performance, i.e., limit the clock frequency clk1 Timing guardband clk Timing with normal condition Timing with dynamic variation T 1 T
5 Resilient Circuit 5 Eliminate the guardband by error detection and correction Ex: Razor flip-flop (one error-detection circuit) Correction through instruction replay Cons: short path issue error detection window clk clk_del Data error 0 1 FF Latch D. Ernst et al., Razor: a low-power pipeline based on circuit-level timing speculation MICRO, 003
6 Short Path Issue 6 Short paths should exceed the error detection window Otherwise, may detect false timing errors Require a significant hold time margin for short paths Focus on short path padding in this paper error detection window clk clk_del Data error 1 0 False timing error
7 Previous Work 7 Determine the padding delay path by path Combine with clock skew scheduling to minimize the clock period at logic synthesis stage J.P. Fishburn, IEEE TC, 1990 R.B. Deokar and S.S. Sapatnekar, ISCAS, 1994 S.H. Huang et al., DAC, 007 Determine the gate/wire to insert padding delay Solve by linear programming N. V. Shenoy et al., ICCAD, 1993 Heuristics Pad the gate with the largest setup slack US Patent 6,990,646,006 and 7,78,16,007 Pad the gate passed by most hold violating paths Y. Liu et al., DAC, 011
8 Short Path Padding Example 8 A=0.6 a=0.3 FF 1 i 1 (1.0/0.7/0.3, 0.4/0.1/-0.3) (1./0.8/0.4, 0.5/0./-0.3) R=1.0 r=0.3 R=1. r=0.5 i 3 1 FF (1.1/0.1/1.0, 0.4/0.1/-0.3) o 1 The largest setup slack FF 1 (1.0/0.7/0.3, (0.8/0.7/0.1, (0.8/0.8/0.0, 0.4/0.1/-0.3) 0.3/0.1/-0.) 0.3/0./-0.1) o 1 i R= r=0.5 i 3 (1.1/0.1/1.0, (1.1/0.4/0.7, (0.8/0.4/0.4, 0.4/0.1/-0.3) 0.4/0.4/0.0) 0.1/0.4/0.3) 1 (1./0.8/0.4, (1./1.1/0.1, 0.5/0./-0.3) 0.5/0.5/0.0) R=1.0 r=0.3 FF Setup Legend: Hold Unfix ID (R/A/S, r/a/h) : gate : edge : PI/PO : Flip-flop ID 1 3 Delay
9 Short Path Padding Example (cont d) 9 The most hold violating path A=0.6 a=0.3 FF 1 i 1 (1.0/0.7/0.3, 0.4/0.1/-0.3) R=1.0 r=0.3 i 3 1 FF (1.1/0.1/1.0, 0.4/0.1/-0.3) (1./0.8/0.4, 0.5/0./-0.3) o 1 R=1. r=0.5 FF 1 i 1 i (0.8/0.8/0.0, (1.0/0.7/0.3, (0.8/0.7/0.1, 0.3/0./-0.1) 0.4/0.1/-0.3) 0.3/0.1/-0.) (1.1/0.1/1.0, (0.8/0.1/0.7, 0.4/0.1/-0.3) 0.1/0.1/0.0) (1./0.8/0.4, (1./1.1/0.1, 0.5/0./-0.3) 0.5/0.5/0.0) R=1.0 r=0.3 o 1 R=1. r=0.5 FF Legend: Unfix ID (R/A/S, r/a/h) : gate : edge : PI/PO : Flip-flop ID 1 3 Delay
10 Short Path Padding Example (cont d) 10 A=0.6 a=0.3 FF 1 i 1 (1.0/0.7/0.3, 0.4/0.1/-0.3) (1./0.8/0.4, 0.5/0./-0.3) R=1.0 r=0.3 R=1. r=0.5 i 3 1 FF (1.1/0.1/1.0, 0.4/0.1/-0.3) o 1 FF 1 i 1 i (0.9/0.9/0.0, 0.3/0.3/0.0) (0.9/0./0.7, 0./0./0.0) (1./1./0.0, 0.5/0.5/0.0) R=1.0 r=0.3 o 1 R=1. r=0.5 FF Legend: Optimal padding delay = 0.5 ID (R/A/S, r/a/h) : gate : edge : PI/PO : Flip-flop ID 1 3 Delay
11 Our Contributions 11 Find the padding values and locations with a global view Compute the fanout padding flexibility of each gate Realize delay padding at the post-layout stage Coarse-grained delay padding: by spare cells Fine-grained delay padding: by dummy metal
12 Outline 1 Introduction Problem Formulation Algorithm - PushPull Experimental Results Conclusion
13 Problem Formulation 13 Given Placed and routed resilient design Cell library Spare cells and dummy metal information Target clock period and error detection window Goal Pad short paths Satisfy Setup/hold timing constraints Minimize the padding overhead
14 Delay Padding 14 Lengthen short paths by buffer insertion and extra capacitance hook-up buffer any type of cells Buffer insertion Padding wire Extra cap hook-up Padding gate
15 Outline 15 Introduction Problem Formulation Algorithm - PushPull Experimental Results Conclusion
16 Overview of PushPull 16 Start netlist, DEF/SPEF file cell library, timing constraints Padding Value Determination Padding resource collection Fanout padding flexibility calculation / Feasibility checking Load/Buffer Allocation Spare cell selection Dummy metal allocation Y Padding value decision Improvement? N Padding value refinement Timing analysis End
17 17 Example: Padding Value Determination First, pad gates Legend: A=0.6 a=0.3 FF 1 i (1.0/0.7/0.3, (1.0/1.0/0.0, 0.5/0.1/-0.4) 0.5/0.4/-0.1) (1./0.8/0.4, (1./1./0.0, 0.5/0./-0.3) 0.5/0.5/0.0) R=1.0 r=0.5 R=1. r=0.5 i 3 1 FF (1.1/0.3/0.7, (1.1/0.1/1.0, 0.3/0.3/0.0) 0.4/0.1/-0.3) o 1 ID (R/A/S, r/a/h) : gate : edge : PI/PO : Flip-flop ID 1 3 Delay
18 18 Example: Padding Value Determination Second, pad wires Legend: A=0.6 a=0.3 FF 1 i (1.0/1.0/0.0, 0.5/0.4/-0.1) 0.5/0.5/0.0) (0.6,-0.1) (0.5,0.0) (0.0,-0.1) (0.0,0.0) (0.6,0.) +0.3 (1./1./0.0, 0.5/0.5/0.0) R=1.0 r=0.5 R=1. r=0.5 i 3 1 FF (1.1/0.3/0.7, 0.3/0.3/0.0) o 1 ID (R/A/S, r/a/h) : gate : edge : PI/PO : Flip-flop ID 1 3 Delay
19 Example: Load/Buffer Allocation 19 A=0.6 a=0.3 FF 1 i (1.0/1.0/0.0, 0.5/0.5/0.0) (1./1./0.0, 0.5/0.5/0.0) R=1.0 r=0.5 R=1. r=0.5 i 3 1 FF (1.1/0.3/0.7, 0.3/0.3/0.0) 0.5 Spare cell Dummy metal o 1 Legend: ID (R/A/S, r/a/h) : gate : edge : PI/PO : Flip-flop ID 1 3 Delay
20 Padding Value Determination 0 Start netlist, DEF/SPEF file cell library, timing constraints Padding Value Determination Padding resource collection Fanout padding flexibility calculation / Feasibility checking Load/Buffer Allocation Spare cell selection Dummy metal allocation Y Padding value decision Improvement? N Padding value refinement Timing analysis End
21 Fanout Padding Flexibility Calculation 1 Fanout padding flexibility P F (i) of gate i Reflect the maximum padding value allowed on its whole fanout cone with a global view P F i, j = 0, if g i PO or H i 0; min 0, min H i, j e i, j E H i, otherwise. A=0.6 a=0.3 FF 1 i 1 (0.8/0.7/0.1, 0.3/0.1/-0.) (1.0/0.7/0.3, 0.4/0.1/-0.3) P F ()=0.1 P F (3)=0.3 P F (1)=0.0 (1./0.8/0.4, 0.5/0./-0.3) R=1.0 r=0.3 R=1. r=0.5 i 3 1 FF (0.8/0.1/0.7, 0.1/0.1/0.0) (1.1/0.1/1.0, 0.4/0.1/-0.3) +0.3 o 1 (1./1.1/0.1, 0.5/0.5/0.0)
22 Padding Value Decision Decide the padding value of each gate in the topological order Only pad the remaining slack P i = max P saf i P F i, 0 A=0.6 a=0.3 FF 1 i 1 P()=0. (1.0/0.7/0.3, 0.4/0.1/-0.3) P F ()=0.1 P F (3)=0.3 P F (1)=0. (1./0.8/0.4, (1./1.0/0., 0.5/0./-0.3) R=1.0 r=0.3 R=1. r=0.5 i 3 1 FF (1.1/0.1/1.0, 0.4/0.1/-0.3) P(3)=0.0 P(1)=0. o 1
23 Padding Value Decision 3 Iterate until no more improvement A=0.6 a=0.3 FF 1 i 1 P()=0. (0.9/0.9/0.0, 0.3/0.3/0.0) P F ()=0.0 P F (3)=0.0 P F (1)=0.0 R=1.0 r=0.3 R=1. r=0.5 i 3 1 FF (0.9/0.1/0.8, (0.9/0./0.7, 0./0.1/-0.1) 0./0./0.0) P(3)=0.1 (1./1./0.0, 0.5/0.4/-0.1) 0.5/0.5/0.0) P(1)=0. o 1
24 Padding Value Refinement 4 Further reduce the total padding values with forked path Push the padding values toward the gates where two or more paths fork A=0. A=0. (0.9/0.6/0.3, 0.4/0.4/0.0) P()=0.3 P(3)=0.3 3 (0.9/0.6/0.3, 0.4/0.4/0.0) (1.0/0.7/0.3, 0.5/0.5/0.0) 1 P(1)=0.3 Total = 0.6 Total = 0.3 R=1.0 r=0.5 FF 1 A=0.4 (0.5/0.5/0.0, 0.1/0.1/0.0) 1 P(1)=0. Total = 0.5 Total = 0.3 (1.0/1.0/0.0, 0.5/0.5/0.0) 0.0 P()= P(3)=0.3 3 (1.0/1.0/0.0, 0.5/0.5/0.0) R=1.0 r=0.5 FF 1 R=1.0 r=0.5 FF Joined short paths Forked short paths
25 Buffer Insertion 5 When hold violations cannot be fully cleaned by padding on gates Padding on wires (buffer insertion) +0.1 A=0.6 a=0.3 FF 1 i 1 (1.0/1.0/0.0, 0.5/0.4/-0.1) 0.5/0.5/0.0) (0.6, (0.5, -0.1) 0.0) (0.0, -0.1) 0.0) (0.0, 0.) P()=0.3 P(3)=0. (1./1./0.0, 0.5/0.5/0.0) R=1.0 r=0.5 R=1. r=0.5 i 3 1 FF (1.0/0.3/0.7, 0.3/0.3/0.0) (0.0, 0.1) P(1)=0.1 (0.7, 0.0) (0.7, 0.0) (0.0, 0.0) o 1
26 Load/Buffer Allocation 6 Start netlist, DEF/SPEF file cell library, timing constraints Padding Value Determination Padding resource collection Fanout padding flexibility calculation / Feasibility checking Load/Buffer Allocation Spare cell selection Dummy metal allocation Y Padding value decision Improvement? N Padding value refinement Timing analysis End
27 Spare Cell Selection (1/) 7 Extract the available spare cells located within the bounding box of its fanout net Find suitable spare cell candidates for gate/wire Reduce to the subset sum problem +0.5 s 1 : 0. s s : : 0.1 spare cell 3 Spare cells {s 1 } {s 1, s } {s 1, s, s 3 } Padding value {s 3 } {s } 0.15 {s } 0.0 {s 1 } 0.0 {s 1 } 0.0 {s 1 } 0.0 {s 1 } 0.0 {s 1 } 0.5 {s, s 3 }
28 Spare Cell Selection (/) 8 Solve the competition problem between different padding gates/wires Record multiple subset sum solutions Sort the gates/wires in ascending order of the number of spare cell solutions Assign the gates/wires to their best solution in the sorted order Padding gates/wires Spare cells s 1 Spare cells Subset sum for g 1 g 1 s s s 3 s 4 g s 3 Subset sum for g Subset sum for w 3 s 5 w 3 s 4 s 5 optimal suboptimal s 6
29 Dummy Metal Allocation 9 Fix remaining padding by dummy metal insertion Convert the delay to an amount of capacitance Assign un-overlapped metal resource Resort the unfixed padding to maximum network flow padding g g w d 1 d bounding box 0.15/0.15 -/0.15 s 0./0. -/0. 0.1/0.1 -/0.1 Padding gates/wires g 1 g w / -/ 0.1/ -/ 0.1/ -/ 0.1/ -/ Dummy metal d 1 d 0.5/0.5 -/0.5 t 0./0. -/0. flow/capacity
30 Outline 30 Introduction Problem Formulation Algorithm - PushPull Experimental Results Conclusion
31 Experimental Results 31 Platform: Intel Xeon CPU with 16GB memory, CentOS 5.5 Compare with two greedy heuristics Circuit Greedy 1: Greedy : Ours: PushPull Largest setup first [] Most path sharing first [1][][3] TNS 1 THS 1 Padding #Ite. Runtime TNS 1 THS 1 Padding Runtime TNS #Ite. 1 THS 1 Padding Runtime #Ite. (ps) (ps) delay (ps) (s) (ps) (ps) delay (ps) (s) (ps) (ps) delay (ps) (s) s s , , , s , , , s , , , s s , , , s , , , des_perf , , ,0.0 3,414 6, , ,088.0,57 4,0.46 b ,675, , ,18, ,90 37, , ,79,30.0 6,0 10, TNS 1 : total negative setup slack after padding value determination. THS 1 : total negative hold slack after padding value determination. [1] S. Yoshikawa, Hold time error correction method and correction program for integrated circuits, US Patent 6,990,646, 006. [] Y. Sun et al., Method and apparatus for fixing hold time violations in a circuit design, US Patent 7,78,16, 007. [3] Y. Liu et al., Re-synthesis for cost-efficient circuit-level timing speculation, DAC, pp , 011.
32 Experimental Results (cont d) 3 Compare with LP solutions Conservative LP Ours: PushPull Circuit #Gate #FF #SFF clock period THS (ps) TNS 1 THS 1 Padding Runtime TNS 1 THS 1 Padding Normalized #Ite. Runtime (ns) (ps) (ps) delay (ps) (s) (ps) (ps) delay (ps) to LP (s) s s , , , s , , , s , , , s s , , , s ,890 1, , , , des_perf 51,349 8,8081, , , , b19 7,87 5,541, ,481,800.0 NA NA NA timeout ,675, TNS 1 : total negative setup slack after padding value determination. THS 1 : total negative hold slack after padding value determination. The impact of input slew on padding delay capacitance conversion The average error rate over all cases is only 0.56%. LP: N. V. Shenoy et al., "Minimum padding to satisfy short path constraints," ICCAD-93.
33 Experimental Results (cont d) 33 Compare our load/buffer allocation method with LP+Mapping Circuit LP+Mapping Ours: PushPull TNS (ps) THS (ps) Runtime (s) TNS (ps) THS (ps) Runtime (s) s s s s s s s , des_perf , b19 NA NA NA TNS : total negative setup slack after load/buffer allocation. THS : total negative hold slack after load/buffer allocation. LP+Mapping: N. V. Shenoy et al., "Minimum padding to satisfy short path constraints," ICCAD-93.
34 Conclusion 34 Focus on the severe short path padding problem in resilient circuits Enable the timing error detection and correction mechanism of resilient circuits Determine the padding values and locations with a global view vs. the local view adopted by resent prior work Realize the determined padding values at physical implementation Propose coarse-grained and fine-grained load/buffer allocation by using spare cells and dummy metal
35 Future Work 35 Allocate spare cell and dummy metal simultaneously Spare cells: discrete capacitance resource Dummy metal: continuous capacitance resource Adopt Composite Current Source (CCS) timing model Use more accuracy delay model
36 36 Thank You! Contact info: Yu-Ming Yang
37 37 Backup Slides
38 The Resilient Circuit Design Flow 38 Logic synthesis Timing analysis S th /H th selection Resynthesis Timing analysis Resilient circuit replacement Placement & routing Short path padding
Kyoung Hwan Lim and Taewhan Kim Seoul National University
Kyoung Hwan Lim and Taewhan Kim Seoul National University Table of Contents Introduction Motivational Example The Proposed Algorithm Experimental Results Conclusion In synchronous circuit design, all sequential
More informationClock Tree Resynthesis for Multi-corner Multi-mode Timing Closure
Clock Tree Resynthesis for Multi-corner Multi-mode Timing Closure Subhendu Roy 1, Pavlos M. Mattheakis 2, Laurent Masse-Navette 2 and David Z. Pan 1 1 ECE Department, The University of Texas at Austin
More informationSymmetrical Buffered Clock-Tree Synthesis with Supply-Voltage Alignment
Symmetrical Buffered Clock-Tree Synthesis with Supply-Voltage Alignment Xin-Wei Shih, Tzu-Hsuan Hsu, Hsu-Chieh Lee, Yao-Wen Chang, Kai-Yuan Chao 2013.01.24 1 Outline 2 Clock Network Synthesis Clock network
More informationOn Constructing Lower Power and Robust Clock Tree via Slew Budgeting
1 On Constructing Lower Power and Robust Clock Tree via Slew Budgeting Yeh-Chi Chang, Chun-Kai Wang and Hung-Ming Chen Dept. of EE, National Chiao Tung University, Taiwan 2012 年 3 月 29 日 Outline 2 Motivation
More informationAsia and South Pacific Design Automation Conference
Asia and South Pacific Design Automation Conference Authors: Kuan-Yu Lin, Hong-Ting Lin, and Tsung-Yi Ho Presenter: Hong-Ting Lin chibli@csie.ncku.edu.tw http://eda.csie.ncku.edu.tw Electronic Design Automation
More informationPerformance-Preserved Analog Routing Methodology via Wire Load Reduction
Electronic Design Automation Laboratory (EDA LAB) Performance-Preserved Analog Routing Methodology via Wire Load Reduction Hao-Yu Chi, Hwa-Yi Tseng, Chien-Nan Jimmy Liu, Hung-Ming Chen 2 Dept. of Electrical
More informationPhysical Design of Digital Integrated Circuits (EN0291 S40) Sherief Reda Division of Engineering, Brown University Fall 2006
Physical Design of Digital Integrated Circuits (EN029 S40) Sherief Reda Division of Engineering, Brown University Fall 2006 Lecture 08: Interconnect Trees Introduction to Graphs and Trees Minimum Spanning
More informationPhysical Design of Digital Integrated Circuits (EN0291 S40) Sherief Reda Division of Engineering, Brown University Fall 2006
Physical Design of Digital Integrated Circuits (EN0291 S40) Sherief Reda Division of Engineering, Brown University Fall 2006 1 Lecture 10: Repeater (Buffer) Insertion Introduction to Buffering Buffer Insertion
More informationTechnology Dependent Logic Optimization Prof. Kurt Keutzer EECS University of California Berkeley, CA Thanks to S. Devadas
Technology Dependent Logic Optimization Prof. Kurt Keutzer EECS University of California Berkeley, CA Thanks to S. Devadas 1 RTL Design Flow HDL RTL Synthesis Manual Design Module Generators Library netlist
More informationL14 - Placement and Routing
L14 - Placement and Routing Ajay Joshi Massachusetts Institute of Technology RTL design flow HDL RTL Synthesis manual design Library/ module generators netlist Logic optimization a b 0 1 s d clk q netlist
More informationPOWER PERFORMANCE OPTIMIZATION METHODS FOR DIGITAL CIRCUITS
POWER PERFORMANCE OPTIMIZATION METHODS FOR DIGITAL CIRCUITS Radu Zlatanovici zradu@eecs.berkeley.edu http://www.eecs.berkeley.edu/~zradu Department of Electrical Engineering and Computer Sciences University
More informationInterconnect Delay and Area Estimation for Multiple-Pin Nets
Interconnect Delay and Area Estimation for Multiple-Pin Nets Jason Cong and David Z. Pan UCLA Computer Science Department Los Angeles, CA 90095 Sponsored by SRC and Avant!! under CA-MICRO Presentation
More informationRetiming. Adapted from: Synthesis and Optimization of Digital Circuits, G. De Micheli Stanford. Outline. Structural optimization methods. Retiming.
Retiming Adapted from: Synthesis and Optimization of Digital Circuits, G. De Micheli Stanford Outline Structural optimization methods. Retiming. Modeling. Retiming for minimum delay. Retiming for minimum
More informationImplementing Tile-based Chip Multiprocessors with GALS Clocking Styles
Implementing Tile-based Chip Multiprocessors with GALS Clocking Styles Zhiyi Yu, Bevan Baas VLSI Computation Lab, ECE Department University of California, Davis, USA Outline Introduction Timing issues
More informationCluster-based approach eases clock tree synthesis
Page 1 of 5 EE Times: Design News Cluster-based approach eases clock tree synthesis Udhaya Kumar (11/14/2005 9:00 AM EST) URL: http://www.eetimes.com/showarticle.jhtml?articleid=173601961 Clock network
More informationProblem Formulation. Specialized algorithms are required for clock (and power nets) due to strict specifications for routing such nets.
Clock Routing Problem Formulation Specialized algorithms are required for clock (and power nets) due to strict specifications for routing such nets. Better to develop specialized routers for these nets.
More informationECE 4514 Digital Design II. Spring Lecture 20: Timing Analysis and Timed Simulation
ECE 4514 Digital Design II Lecture 20: Timing Analysis and Timed Simulation A Tools/Methods Lecture Topics Static and Dynamic Timing Analysis Static Timing Analysis Delay Model Path Delay False Paths Timing
More informationLecture 11 Logic Synthesis, Part 2
Lecture 11 Logic Synthesis, Part 2 Xuan Silvia Zhang Washington University in St. Louis http://classes.engineering.wustl.edu/ese461/ Write Synthesizable Code Use meaningful names for signals and variables
More informationTing Wu, Chi-Ying Tsui, Mounir Hamdi Hong Kong University of Science & Technology Hong Kong SAR, China
CMOS Crossbar Ting Wu, Chi-Ying Tsui, Mounir Hamdi Hong Kong University of Science & Technology Hong Kong SAR, China OUTLINE Motivations Problems of Designing Large Crossbar Our Approach - Pipelined MUX
More information3. Implementing Logic in CMOS
3. Implementing Logic in CMOS 3. Implementing Logic in CMOS Jacob Abraham Department of Electrical and Computer Engineering The University of Texas at Austin VLSI Design Fall 27 September, 27 ECE Department,
More informationCAD Technology of the SX-9
KONNO Yoshihiro, IKAWA Yasuhiro, SAWANO Tomoki KANAMARU Keisuke, ONO Koki, KUMAZAKI Masahito Abstract This paper outlines the design techniques and CAD technology used with the SX-9. The LSI and package
More informationVID (Voltage ID) features require timing analysis on even more corners
Mehmet Avci Number of Corners for Typical Part Introduction The number of signoff PVT corners is increasing VID (Voltage ID) features require timing analysis on even more corners Designers try to reduce
More informationEE582 Physical Design Automation of VLSI Circuits and Systems
EE582 Prof. Dae Hyun Kim School of Electrical Engineering and Computer Science Washington State University Preliminaries Table of Contents Semiconductor manufacturing Problems to solve Algorithm complexity
More informationLarge Scale Circuit Partitioning
Large Scale Circuit Partitioning With Loose/Stable Net Removal And Signal Flow Based Clustering Jason Cong Honching Li Sung-Kyu Lim Dongmin Xu UCLA VLSI CAD Lab Toshiyuki Shibuya Fujitsu Lab, LTD Support
More informationParallel Streaming Computation on Error-Prone Processors. Yavuz Yetim, Margaret Martonosi, Sharad Malik
Parallel Streaming Computation on Error-Prone Processors Yavuz Yetim, Margaret Martonosi, Sharad Malik Upsets/B muons/mb Average Number of Dopant Atoms Hardware Errors on the Rise Soft Errors Due to Cosmic
More informationChapter 5 Global Routing
Chapter 5 Global Routing 5. Introduction 5.2 Terminology and Definitions 5.3 Optimization Goals 5. Representations of Routing Regions 5.5 The Global Routing Flow 5.6 Single-Net Routing 5.6. Rectilinear
More informationUltra Low Power (ULP) Challenge in System Architecture Level
Ultra Low Power (ULP) Challenge in System Architecture Level - New architectures for 45-nm, 32-nm era ASP-DAC 2007 Designers' Forum 9D: Panel Discussion: Top 10 Design Issues Toshinori Sato (Kyushu U)
More informationVdd Programmable and Variation Tolerant FPGA Circuits and Architectures
Vdd Programmable and Variation Tolerant FPGA Circuits and Architectures Prof. Lei He EE Department, UCLA LHE@ee.ucla.edu Partially supported by NSF. Pathway to Power Efficiency and Variation Tolerance
More informationA Hierarchical Bin-Based Legalizer for Standard-Cell Designs with Minimal Disturbance
A Hierarchical Bin-Based Legalizer for Standard- Designs with Minimal Disturbance Yu-Min Lee, Tsung-You Wu, and Po-Yi Chiang Department of Electrical Engineering National Chiao Tung University ASPDAC,
More informationVariation Tolerant Buffered Clock Network Synthesis with Cross Links
Variation Tolerant Buffered Clock Network Synthesis with Cross Links Anand Rajaram David Z. Pan Dept. of ECE, UT-Austin Texas Instruments, Dallas Sponsored by SRC and IBM Faculty Award 1 Presentation Outline
More informationFloorplan and Power/Ground Network Co-Synthesis for Fast Design Convergence
Floorplan and Power/Ground Network Co-Synthesis for Fast Design Convergence Chen-Wei Liu 12 and Yao-Wen Chang 2 1 Synopsys Taiwan Limited 2 Department of Electrical Engineering National Taiwan University,
More informationPessimism Reduction In Static Timing Analysis Using Interdependent Setup and Hold Times
Pessimism Reduction In Static Timing Analysis Using Interdependent Setup and Hold Times Emre Salman, Ali Dasdan, Feroze Taraporevala, Kayhan Küçükçakar, and Eby G. Friedman Department of Electrical and
More informationPower-Mode-Aware Buffer Synthesis for Low-Power Clock Skew Minimization
This article has been accepted and published on J-STAGE in advance of copyediting. Content is final as presented. IEICE Electronics Express, Vol.* No.*,*-* Power-Mode-Aware Buffer Synthesis for Low-Power
More informationABC basics (compilation from different articles)
1. AIG construction 2. AIG optimization 3. Technology mapping ABC basics (compilation from different articles) 1. BACKGROUND An And-Inverter Graph (AIG) is a directed acyclic graph (DAG), in which a node
More informationFast Dual-V dd Buffering Based on Interconnect Prediction and Sampling
Based on Interconnect Prediction and Sampling Yu Hu King Ho Tam Tom Tong Jing Lei He Electrical Engineering Department University of California at Los Angeles System Level Interconnect Prediction (SLIP),
More informationIterative-Constructive Standard Cell Placer for High Speed and Low Power
Iterative-Constructive Standard Cell Placer for High Speed and Low Power Sungjae Kim and Eugene Shragowitz Department of Computer Science and Engineering University of Minnesota, Minneapolis, MN 55455
More informationPlace and Route for FPGAs
Place and Route for FPGAs 1 FPGA CAD Flow Circuit description (VHDL, schematic,...) Synthesize to logic blocks Place logic blocks in FPGA Physical design Route connections between logic blocks FPGA programming
More informationFPGA for Complex System Implementation. National Chiao Tung University Chun-Jen Tsai 04/14/2011
FPGA for Complex System Implementation National Chiao Tung University Chun-Jen Tsai 04/14/2011 About FPGA FPGA was invented by Ross Freeman in 1989 SRAM-based FPGA properties Standard parts Allowing multi-level
More informationVLSI Test Technology and Reliability (ET4076)
VLSI Test Technology and Reliability (ET4076) Lecture 8 (1) Delay Test (Chapter 12) Said Hamdioui Computer Engineering Lab Delft University of Technology 2009-2010 1 Learning aims Define a path delay fault
More informationElasticFlow: A Complexity-Effective Approach for Pipelining Irregular Loop Nests
ElasticFlow: A Complexity-Effective Approach for Pipelining Irregular Loop Nests Mingxing Tan 1 2, Gai Liu 1, Ritchie Zhao 1, Steve Dai 1, Zhiru Zhang 1 1 Computer Systems Laboratory, Electrical and Computer
More informationAn Interconnect-Centric Design Flow for Nanometer Technologies
An Interconnect-Centric Design Flow for Nanometer Technologies Professor Jason Cong UCLA Computer Science Department Los Angeles, CA 90095 http://cadlab.cs.ucla.edu/~ /~cong
More informationSYNTHESIS FOR ADVANCED NODES
SYNTHESIS FOR ADVANCED NODES Abhijeet Chakraborty Janet Olson SYNOPSYS, INC ISPD 2012 Synopsys 2012 1 ISPD 2012 Outline Logic Synthesis Evolution Technology and Market Trends The Interconnect Challenge
More informationSungmin Bae, Hyung-Ock Kim, Jungyun Choi, and Jaehong Park. Design Technology Infrastructure Design Center System-LSI Business Division
Sungmin Bae, Hyung-Ock Kim, Jungyun Choi, and Jaehong Park Design Technology Infrastructure Design Center System-LSI Business Division 1. Motivation 2. Design flow 3. Parallel multiplier 4. Coarse-grained
More informationOverview. Design flow. Principles of logic synthesis. Logic Synthesis with the common tools. Conclusions
Logic Synthesis Overview Design flow Principles of logic synthesis Logic Synthesis with the common tools Conclusions 2 System Design Flow Electronic System Level (ESL) flow System C TLM, Verification,
More informationMetal-Density Driven Placement for CMP Variation and Routability
Metal-Density Driven Placement for CMP Variation and Routability ISPD-2008 Tung-Chieh Chen 1, Minsik Cho 2, David Z. Pan 2, and Yao-Wen Chang 1 1 Dept. of EE, National Taiwan University 2 Dept. of ECE,
More informationLeakage Efficient Chip-Level Dual-Vdd Assignment with Time Slack Allocation for FPGA Power Reduction
44.1 Leakage Efficient Chip-Level Dual-Vdd Assignment with Time Slack Allocation for FPGA Power Reduction Yan Lin and Lei He Electrical Engineering Department University of California, Los Angeles, CA
More informationNOLO : A No-Loop, Predictive Useful Skew Methodology for Improved Timing in IC Implementation
NOLO : A No-Loop, Predictive Useful Skew Methodology for Improved Timing in IC Implementation Tuck-Boon Chan, Andrew B. Kahng and Jiajia Li CSE and ECE Departments, UC San Diego, La Jolla, CA 92093 {tbchan,
More informationLinking Layout to Logic Synthesis: A Unification-Based Approach
Linking Layout to Logic Synthesis: A Unification-Based Approach Massoud Pedram Department of EE-Systems University of Southern California Los Angeles, CA February 1998 Outline Introduction Technology and
More informationAbacus: Fast Legalization of Standard Cell Circuits with Minimal Movement
EDA Institute for Electronic Design Automation Prof. Ulf Schlichtmann Abacus: Fast Legalization of Standard Cell Circuits with Minimal Movement Peter Spindler, Ulf Schlichtmann and Frank M. Johannes Technische
More informationECE 5745 Complex Digital ASIC Design Topic 13: Physical Design Automation Algorithms
ECE 7 Complex Digital ASIC Design Topic : Physical Design Automation Algorithms Christopher atten School of Electrical and Computer Engineering Cornell University http://www.csl.cornell.edu/courses/ece7
More informationA Routing Approach to Reduce Glitches in Low Power FPGAs
A Routing Approach to Reduce Glitches in Low Power FPGAs Quang Dinh, Deming Chen, Martin Wong Department of Electrical and Computer Engineering University of Illinois at Urbana-Champaign This research
More informationCongestion-Aware Power Grid. and CMOS Decoupling Capacitors. Pingqiang Zhou Karthikk Sridharan Sachin S. Sapatnekar
Congestion-Aware Power Grid Optimization for 3D circuits Using MIM and CMOS Decoupling Capacitors Pingqiang Zhou Karthikk Sridharan Sachin S. Sapatnekar University of Minnesota 1 Outline Motivation A new
More informationEITF35: Introduction to Structured VLSI Design
EITF35: Introduction to Structured VLSI Design Part 1.1.2: Introduction (Digital VLSI Systems) Liang Liu liang.liu@eit.lth.se 1 Outline Why Digital? History & Roadmap Device Technology & Platforms System
More informationBest Practices for Implementing ARM Cortex -A12 Processor and Mali TM -T6XX GPUs for Mid-Range Mobile SoCs.
Best Practices for Implementing ARM Cortex -A12 Processor and Mali TM -T6XX GPUs for Mid-Range Mobile SoCs. Cortex-A12: ARM-Cadence collaboration Joint team working on ARM Cortex -A12 irm flow irm content:
More informationDI-SSD: Desymmetrized Interconnection Architecture and Dynamic Timing Calibration for Solid-State Drives
DI-SSD: Desymmetrized Interconnection Architecture and Dynamic Timing Calibration for Solid-State Drives Ren-Shuo Liu and Jian-Hao Huang System and Storage Design Lab Department of Electrical Engineering
More informationChapter 5 Global Timing Constraints. Global Timing Constraints 5-1
Chapter 5 Global Timing Constraints Global Timing Constraints 5-1 Objectives After completing this module, you will be able to Apply timing constraints to a simple synchronous design Specify global timing
More informationAn Interconnect-Centric Design Flow for Nanometer Technologies. Outline
An Interconnect-Centric Design Flow for Nanometer Technologies Jason Cong UCLA Computer Science Department Email: cong@cs.ucla.edu Tel: 310-206-2775 http://cadlab.cs.ucla.edu/~cong Outline Global interconnects
More informationDepartment of Electrical and Computer Engineering
LAGRANGIAN RELAXATION FOR GATE IMPLEMENTATION SELECTION Yi-Le Huang, Jiang Hu and Weiping Shi Department of Electrical and Computer Engineering Texas A&M University OUTLINE Introduction and motivation
More informationTOPIC : Verilog Synthesis examples. Module 4.3 : Verilog synthesis
TOPIC : Verilog Synthesis examples Module 4.3 : Verilog synthesis Example : 4-bit magnitude comptarator Discuss synthesis of a 4-bit magnitude comparator to understand each step in the synthesis flow.
More informationOptimal Multi-Domain Clock Skew Scheduling
Optimal Multi-Domain Clock Skew Scheduling Li Li, Yinghai Lu, and Hai Zhou Department of Electrical Engineering and Computer Science Northwestern University Abstract Clock skew scheduling is an effective
More informationCo-synthesis and Accelerator based Embedded System Design
Co-synthesis and Accelerator based Embedded System Design COE838: Embedded Computer System http://www.ee.ryerson.ca/~courses/coe838/ Dr. Gul N. Khan http://www.ee.ryerson.ca/~gnkhan Electrical and Computer
More informationRegularity for Reduced Variability
Regularity for Reduced Variability Larry Pileggi Carnegie Mellon pileggi@ece.cmu.edu 28 July 2006 CMU Collaborators Andrzej Strojwas Slava Rovner Tejas Jhaveri Thiago Hersan Kim Yaw Tong Sandeep Gupta
More informationSequential Circuit Design: Principle
Sequential Circuit Design: Principle Chapter 8 1 Outline 1. Overview on sequential circuits 2. Synchronous circuits 3. Danger of synthesizing asynchronous circuit 4. Inference of basic memory elements
More informationPhysical Design Closure
Physical Design Closure Olivier Coudert Monterey Design System DAC 2000 DAC2000 (C) Monterey Design Systems 1 DSM Dilemma SOC Time to market Million gates High density, larger die Higher clock speeds Long
More informationPreclass Warmup. ESE535: Electronic Design Automation. Motivation (1) Today. Bisection Width. Motivation (2)
ESE535: Electronic Design Automation Preclass Warmup What cut size were you able to achieve? Day 4: January 28, 25 Partitioning (Intro, KLFM) 2 Partitioning why important Today Can be used as tool at many
More informationTechnology Mapping and Packing. FPGAs
Technology Mapping and Packing for Coarse-grained, Anti-fuse Based FPGAs Chang Woo Kang, Ali Iranli, and Massoud Pedram University of Southern California Department of Electrical Engineering Los Angeles
More informationLogic Synthesis. Logic Synthesis. Gate-Level Optimization. Logic Synthesis Flow. Logic Synthesis. = Translation+ Optimization+ Mapping
Logic Synthesis Logic Synthesis = Translation+ Optimization+ Mapping Logic Synthesis 2 Gate-Level Optimization Logic Synthesis Flow 3 4 Design Compiler Procedure Logic Synthesis Input/Output 5 6 Design
More informationTrace Signal Selection to Enhance Timing and Logic Visibility in Post-Silicon Validation
Trace Signal Selection to Enhance Timing and Logic Visibility in Post-Silicon Validation Hamid Shojaei, and Azadeh Davoodi University of Wisconsin 1415 Engineering Drive, Madison WI 53706 Email: {shojaei,
More informationA mm 2 780mW Fully Synthesizable PLL with a Current Output DAC and an Interpolative Phase-Coupled Oscillator using Edge Injection Technique
A 0.0066mm 2 780mW Fully Synthesizable PLL with a Current Output DAC and an Interpolative Phase-Coupled Oscillator using Edge Injection Technique Wei Deng, Dongsheng Yang, Tomohiro Ueno, Teerachot Siriburanon,
More informationEliminating False Loops Caused by Sharing in Control Path
Eliminating False Loops Caused by Sharing in Control Path ALAN SU and YU-CHIN HSU University of California Riverside and TA-YUNG LIU and MIKE TIEN-CHIEN LEE Avant! Corporation In high-level synthesis,
More informationGate Sizing by Lagrangian Relaxation Revisited
Gate Sizing by Lagrangian Relaxation Revisited Jia Wang, Debasish Das, and Hai Zhou Electrical Engineering and Computer Science Northwestern University Evanston, Illinois, United States October 17, 2007
More informationAn overview of standard cell based digital VLSI design
An overview of standard cell based digital VLSI design Implementation of the first generation AsAP processor Zhiyi Yu and Tinoosh Mohsenin VCL Laboratory UC Davis Outline Overview of standard cellbased
More informationArea Optimization of Resilient Designs Guided by a Mixed Integer Geometric Program
Area Optimization of Resilient Designs Guided by a Mixed Integer Geometric Program Hsin-Ho Huang 1 Huimei Cheng 1 Chris Chu 2 Peter A. Beerel 1 hsinhohu@usc.edu huimeich@usc.edu cnchu@iastate.edu pabeerel@usc.edu
More informationCadence On-Line Document
Cadence On-Line Document 1 Purpose: Use Cadence On-Line Document to look up command/syntax in SoC Encounter. 2 Cadence On-Line Document An on-line searching system which can be used to inquire about LEF/DEF
More informationEECS Components and Design Techniques for Digital Systems. Lec 20 RTL Design Optimization 11/6/2007
EECS 5 - Components and Design Techniques for Digital Systems Lec 2 RTL Design Optimization /6/27 Shauki Elassaad Electrical Engineering and Computer Sciences University of California, Berkeley Slides
More informationVerilog Code File Normally this will not have delay information, but it will have fanout (loading) information. Cell Model File
Delay Modeling Verilog Code File Normally this will not have delay information, but it will have fanout (loading) information. Cell Model File Delay Modeling This has delay information but not loading
More informationA Global-Local Optimization Framework for Simultaneous Multi-Mode Multi-Corner Clock Skew Variation Reduction
A Global-Local Optimization Framework for Simultaneous Multi-Mode Multi-Corner Clock Skew Variation Reduction Kwangsoo Han, Andrew B. Kahng, Jongpil Lee, Jiajia Li and Siddhartha Nath CSE and ECE Departments,
More informationDesign of Adaptive Communication Channel Buffers for Low-Power Area- Efficient Network-on. on-chip Architecture
Design of Adaptive Communication Channel Buffers for Low-Power Area- Efficient Network-on on-chip Architecture Avinash Kodi, Ashwini Sarathy * and Ahmed Louri * Department of Electrical Engineering and
More informationFocus On Structural Test: AC Scan
Focus On Structural Test: AC Scan Alfred L. Crouch Chief Scientist Inovys Corporation al.crouch@inovys.com The DFT Equation The Problem What is Driving Modern Test Technology? 300mm Wafers Volume Silicon/Test
More informationGate Sizing and Vth Assignment for Asynchronous Circuits Using Lagrangian Relaxation. Gang Wu, Ankur Sharma and Chris Chu Iowa State University
1 Gate Sizing and Vth Assignment for Asynchronous Circuits Using Lagrangian Relaxation Gang Wu, Ankur Sharma and Chris Chu Iowa State University Introduction 2 Asynchronous Gate Selection Problem: Selecting
More informationPieceTimer: A Holistic Timing Analysis Framework Considering Setup/Hold Time Interdependency Using A Piecewise Model
PieceTimer: A Holistic Timing Analysis Framewor Considering Setup/Hold Time Interdependency Using A Piecewise Model Grace Li Zhang, Bing Li, Ulf Schlichtmann Institute for Electronic Design Automation
More informationCharacteristics of the ITC 99 Benchmark Circuits
Characteristics of the ITC 99 Benchmark Circuits Scott Davidson Sun Microsystems, Inc. ITC 99 Benchmarks - Scott Davidson Page 1 Outline Why Benchmark? Some History. Soliciting Benchmarks Benchmark Characteristics
More informationDelay Estimation for Technology Independent Synthesis
Delay Estimation for Technology Independent Synthesis Yutaka TAMIYA FUJITSU LABORATORIES LTD. 4-1-1 Kamikodanaka, Nakahara-ku, Kawasaki, JAPAN, 211-88 Tel: +81-44-754-2663 Fax: +81-44-754-2664 E-mail:
More information101-1 Under-Graduate Project Digital IC Design Flow
101-1 Under-Graduate Project Digital IC Design Flow Speaker: Ming-Chun Hsiao Adviser: Prof. An-Yeu Wu Date: 2012/9/25 ACCESS IC LAB Outline Introduction to Integrated Circuit IC Design Flow Verilog HDL
More informationAn Interconnect-Centric Design Flow for Nanometer Technologies
An Interconnect-Centric Design Flow for Nanometer Technologies Jason Cong UCLA Computer Science Department Email: cong@cs.ucla.edu Tel: 310-206-2775 URL: http://cadlab.cs.ucla.edu/~cong Exponential Device
More informationEfficient Static Timing Analysis Using a Unified Framework for False Paths and Multi-Cycle Paths
Efficient Static Timing Analysis Using a Unified Framework for False Paths and Multi-Cycle Paths Shuo Zhou, Bo Yao, Hongyu Chen, Yi Zhu and Chung-Kuan Cheng University of California at San Diego La Jolla,
More informationOptimal Prescribed-Domain Clock Skew Scheduling
Optimal Prescribed-Domain Clock Skew Scheduling Li Li, Yinghai Lu, Hai Zhou Electrical Engineering and Computer Science Northwestern University 6B-4 Abstract Clock skew scheduling is an efficient technique
More informationAS VLSI technology scales to deep submicron and beyond, interconnect
IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL., NO. 1 TILA-S: Timing-Driven Incremental Layer Assignment Avoiding Slew Violations Derong Liu, Bei Yu Member, IEEE, Salim
More informationPhysical Implementation
CS250 VLSI Systems Design Fall 2009 John Wawrzynek, Krste Asanovic, with John Lazzaro Physical Implementation Outline Standard cell back-end place and route tools make layout mostly automatic. However,
More informationICS 252 Introduction to Computer Design
ICS 252 Introduction to Computer Design Logic Optimization Eli Bozorgzadeh Computer Science Department-UCI Hardware compilation flow HDL RTL Synthesis netlist Logic synthesis library netlist Physical design
More informationHIGH-LEVEL SYNTHESIS
HIGH-LEVEL SYNTHESIS Page 1 HIGH-LEVEL SYNTHESIS High-level synthesis: the automatic addition of structural information to a design described by an algorithm. BEHAVIORAL D. STRUCTURAL D. Systems Algorithms
More informationClock Skew Optimization Considering Complicated Power Modes
Clock Skew Optimization Considering Complicated Power Modes Chiao-Ling Lung 1,2, Zi-Yi Zeng 1, Chung-Han Chou 1, Shih-Chieh Chang 1 National Tsing-Hua University, HsinChu, Taiwan 1 Industrial Technology
More informationA Non-Volatile Microcontroller with Integrated Floating-Gate Transistors
A Non-Volatile Microcontroller with Integrated Floating-Gate Transistors Wing-kei Yu, Shantanu Rajwade, Sung-En Wang, Bob Lian, G. Edward Suh, Edwin Kan Cornell University 2 of 32 Self-Powered Devices
More informationSimultaneous OPC- and CMP-Aware Routing Based on Accurate Closed-Form Modeling
Simultaneous OPC- and CMP-Aware Routing Based on Accurate Closed-Form Modeling Shao-Yun Fang, Chung-Wei Lin, Guang-Wan Liao, and Yao-Wen Chang March 26, 2013 Graduate Institute of Electronics Engineering
More informationComprehensive Place-and-Route Platform Olympus-SoC
Comprehensive Place-and-Route Platform Olympus-SoC Digital IC Design D A T A S H E E T BENEFITS: Olympus-SoC is a comprehensive netlist-to-gdsii physical design implementation platform. Solving Advanced
More informationChapter 4. The Processor
Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware 4.1 Introduction We will examine two MIPS implementations
More informationLibAbs: An Efficient and Accurate Timing Macro-Modeling Algorithm for Large Hierarchical Designs
LibAbs: An Efficient and Accurate Timing Macro-Modeling Algorithm for Large Hierarchical Designs Tin-Yin Lai Dept. of ECE, UIUC, IL, USA tinyinlai@gmail.com Tsung-Wei Huang Dept. of ECE, UIUC, IL, USA
More informationCAD Algorithms. Placement and Floorplanning
CAD Algorithms Placement Mohammad Tehranipoor ECE Department 4 November 2008 1 Placement and Floorplanning Layout maps the structural representation of circuit into a physical representation Physical representation:
More informationIntroduction to Innovus
Introduction to Innovus Courtesy of Dr. Dae Hyun Kim@WSU http://csce.uark.edu +1 (479) 575-6043 yrpeng@uark.edu Introduction to Innovus Innovus was called Innovus before v15 Standard Placement and Routing
More informationAn Automated System for Checking Lithography Friendliness of Standard Cells
An Automated System for Checking Lithography Friendliness of Standard Cells I-Lun Tseng, Senior Member, IEEE, Yongfu Li, Senior Member, IEEE, Valerio Perez, Vikas Tripathi, Zhao Chuan Lee, and Jonathan
More information