An overview of standard cell based digital VLSI design
  Implementation of the first generation AsAP processor
  Zhiyi Yu and Tinoosh Mohsenin
  VCL Laboratory, UC Davis
Outline
  Overview of standard cell-based design
  Overview of AsAP
  Implementation of the first generation AsAP
Standard cell based IC vs. custom design IC
  Standard cell based IC: design using standard cells
    Standard cells come from a library provider
    Many different choices for cell size, delay, and leakage power
    Many EDA tools automate this flow
    Shorter design time
  Custom design IC: design everything yourself
    Higher performance
Standard cell based VLSI design flow
  Front end
    System specification and architecture
    HDL coding & behavioral simulation
    Synthesis & gate level simulation
  Back end
    Placement and routing
    DRC (Design Rule Check), LVS (Layout vs. Schematic)
    Dynamic simulation and static analysis
AsAP (Asynchronous Array of Simple Processors)
  A processing chip containing multiple uniform, simple processor elements
  Each processor has its own local clock generator
  Each processor communicates with its neighboring processors through dual-clock FIFOs
Diagram of a 3x3 AsAP
  [Figure: 3x3 array of processors. Labels shown for one processor: In-FIFO0, In-FIFO1, Inst Mem, Data Mem, ALU, MAC, Control, Clock, Output]
  More information: http://www.ece.ucdavis.edu/vcl/asap/
Simple diagram of the front-end design flow
  System specification -> RTL coding -> Synthesis -> Gate level code
  Ex: c = !a & b is synthesized to
    INV (.in (a), .out (a_inv));
    AND (.in1 (a_inv), .in2 (b), .out (c));
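The slide's example is small enough to check exhaustively. The following is a minimal Python sketch, not part of the original flow, that models the two cells as functions and compares the gate level netlist with the expression c = !a & b for every input combination. The function names (`inv`, `and2`, `rtl`, `gate_level`, `equivalent`) are hypothetical and chosen only for illustration.

```python
from itertools import product


def inv(a: int) -> int:
    """Model of the INV cell: logical NOT of a 1-bit value."""
    return 1 - a


def and2(in1: int, in2: int) -> int:
    """Model of the AND cell: logical AND of two 1-bit values."""
    return in1 & in2


def rtl(a: int, b: int) -> int:
    """Behavioral description from the slide: c = !a & b."""
    return (1 - a) & b


def gate_level(a: int, b: int) -> int:
    """Gate level netlist from the slide: INV drives a_inv, AND produces c."""
    a_inv = inv(a)
    return and2(a_inv, b)


def equivalent() -> bool:
    """Compare both descriptions on all four input combinations."""
    return all(rtl(a, b) == gate_level(a, b)
               for a, b in product((0, 1), repeat=2))


if __name__ == "__main__":
    print("netlist matches RTL:", equivalent())
```

This mirrors, on a toy scale, what gate level simulation after synthesis is meant to confirm: the synthesized gates compute the same function as the HDL code.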
Simple diagram of the back-end design flow
  Gate level Verilog from synthesis -> Place & Route -> Final layout (go for fabrication)
  Checks on the layout: DRC (design rule check), LVS (layout vs. schematic)
  Place & Route also produces gate level Verilog and timing information for gate level dynamic and/or static analysis
Back-end design of AsAP
  Technology: TSMC 0.18 μm CMOS
  Standard cell library: Artisan
  Tools
    Synthesis: Synopsys Design Compiler
    Placement & routing: Cadence Encounter
    DRC & LVS: Calibre
    Static timing analysis: PrimeTime
Flow of placement and routing
  Import needed files
  Floorplan
  Placement & in-place optimization
  Clock tree generation
  Routing
Import needed files
  Gate level Verilog (.v)
    INV (.in (a), .out (a_inv));
    AND (.in1 (a_inv), .in2 (b), .out (c));
  Geometry information (.lef)
    INV: 1 um width; AND: 2 um width
  Timing information (.lib)
    INV: 1 ns delay; AND: 2 ns delay
    Delay (a->c): 1 ns + 2 ns = 3 ns
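The path delay on this slide is a sum of per-cell delays. The sketch below reproduces that sum in Python using the slide's example numbers. The dictionaries and function names are hypothetical stand-ins for what the .lef and .lib files provide. Real library files are far richer (for example, delays that depend on load and input transition), and this sketch ignores wire delay entirely.

```python
# Example per-cell data, using the numbers from the slide.
CELL_WIDTH_UM = {"INV": 1.0, "AND": 2.0}  # geometry, as a .lef file would describe
CELL_DELAY_NS = {"INV": 1.0, "AND": 2.0}  # timing, as a .lib file would describe

# Cells on the path from input a to output c in the example netlist.
PATH_A_TO_C = ["INV", "AND"]


def path_delay_ns(path):
    """Sum the cell delays along a path (no wire delay in this sketch)."""
    return sum(CELL_DELAY_NS[cell] for cell in path)


def total_width_um(cells):
    """Sum the cell widths, a rough lower bound on the row length needed."""
    return sum(CELL_WIDTH_UM[cell] for cell in cells)


if __name__ == "__main__":
    print("delay a->c (ns):", path_delay_ns(PATH_A_TO_C))
    print("total cell width (um):", total_width_um(PATH_A_TO_C))
```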
Floorplan
  Size of chip
  Location of pins
  Location of main blocks
  Power supply: give enough power for each gate
    [Figure: a 1.8 V supply feeds Gate 1 to Gate 4 along a metal VDD rail, with VSS as the return rail; the current in the rail lowers the voltage to 1.75 V, 1.7 V, and 1.65 V at successive points, so another power connection is needed]
    Voltage drop equation: V2 = V1 - I * R
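The voltage drop relation on this slide (each stretch of the VDD rail loses I * R) can be applied segment by segment. The following is a rough Python sketch under assumed values: a 1.8 V supply, four gates that each draw 10 mA, and 1 ohm of rail resistance between neighbors. These numbers and the function names are illustrative assumptions, not data from the AsAP design. In this model each segment carries the current of every gate further from the supply, so the drops are not equal, unlike the evenly spaced voltages drawn in the figure.

```python
def rail_voltages(v_supply, seg_resistance_ohm, gate_currents_a):
    """Return the VDD voltage seen by each gate along one power rail.

    Applies V2 = V1 - I * R to each rail segment in turn. Segment k carries
    the current of gate k plus that of every gate further from the supply.
    """
    voltages = []
    v = v_supply
    for k in range(len(gate_currents_a)):
        segment_current = sum(gate_currents_a[k:])
        v = v - segment_current * seg_resistance_ohm
        voltages.append(v)
    return voltages


def gates_below(voltages, v_min):
    """Indices of gates whose supply falls below an assumed minimum voltage."""
    return [k for k, v in enumerate(voltages) if v < v_min]


if __name__ == "__main__":
    volts = rail_voltages(1.8, 1.0, [0.01, 0.01, 0.01, 0.01])
    print([round(v, 3) for v in volts])
    print("gates needing another supply connection:", gates_below(volts, 1.72))
```

Gates reported by `gates_below` are the ones that would motivate adding another power connection closer to them during floorplanning.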
Floorplan of a single processor
  [Figure: block placement of Inst Mem, Data Mem, ALU, MAC, Control, Clock, InFIFO 0, and InFIFO 1]
Placement & in-place optimization
  Placement: place the gates
  In-place optimization
    Why: timing differs between synthesis and layout (wire delay)
    How: change gate sizes, insert buffers
    Must not change the circuit function!
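To see why inserting buffers can recover timing lost to wire delay, consider the standard Elmore estimate for a distributed RC wire, 0.5 * r * c * L^2, which grows with the square of the length. The Python sketch below compares an unbuffered wire with the same wire split by equally spaced buffers. All values (0.5 kOhm/mm, 0.2 pF/mm, a 4 mm wire, a 0.1 ns buffer) are hypothetical, and the model ignores driver resistance and buffer input capacitance. It is an illustration of the idea, not the algorithm used by any particular place and route tool.

```python
def wire_delay_ns(length_mm, r_kohm_per_mm, c_pf_per_mm):
    """Elmore delay of a distributed RC wire: 0.5 * r * c * L^2.

    With r in kOhm/mm and c in pF/mm the result is in ns.
    """
    return 0.5 * r_kohm_per_mm * c_pf_per_mm * length_mm ** 2


def buffered_delay_ns(length_mm, r_kohm_per_mm, c_pf_per_mm,
                      n_buffers, buffer_delay_ns):
    """Delay of the same wire split into equal segments by n_buffers buffers."""
    segments = n_buffers + 1
    segment_length = length_mm / segments
    wire_part = segments * wire_delay_ns(segment_length,
                                         r_kohm_per_mm, c_pf_per_mm)
    return wire_part + n_buffers * buffer_delay_ns


if __name__ == "__main__":
    print("unbuffered (ns):", wire_delay_ns(4.0, 0.5, 0.2))
    print("one buffer (ns):", buffered_delay_ns(4.0, 0.5, 0.2, 1, 0.1))
```

With these assumed numbers one buffer shortens the path, because halving the wire length quarters each segment's delay. Adding a buffer leaves the logic function unchanged, which is the constraint the slide stresses.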
Placement of a single processor
Clock tree Main parameters: skew, delay, transition time
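Two of the parameters named on this slide can be computed directly from clock arrival times at the flip-flops: delay (often called insertion delay) is the longest time from the clock source to any sink, and skew is the difference between the latest and earliest arrival. The Python sketch below uses made-up arrival times and hypothetical sink names. Transition time is a property of the clock edge itself and is not modeled here.

```python
def insertion_delay_ns(arrivals_ns):
    """Longest clock arrival time from the source to any sink."""
    return max(arrivals_ns.values())


def skew_ns(arrivals_ns):
    """Difference between the latest and earliest clock arrival."""
    return max(arrivals_ns.values()) - min(arrivals_ns.values())


if __name__ == "__main__":
    # Hypothetical arrival times (ns) at three flip-flop clock pins.
    arrivals = {"ff_alu": 0.52, "ff_mac": 0.55, "ff_ctrl": 0.50}
    print("insertion delay (ns):", insertion_delay_ns(arrivals))
    print("skew (ns):", round(skew_ns(arrivals), 3))
```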
Clock tree of single processor
Routing
  Connect the gates using wires
  Two steps
    Connect the global signals (power)
    Connect the other signals
Metal Layer Topology Routing
Layout of a single processor
  Area: 0.8 mm x 0.8 mm
  Estimated speed: 450 MHz
Layout of the first generation 6x6 AsAP
  Area: 30 mm^2 in 180 nm CMOS
  36 processors
  114 pads
  [Figure label: one processor]
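A few quantities follow directly from the layout figures in the slides: the area of one processor, the area taken by all 36 processors, and the clock period implied by the estimated 450 MHz. The Python sketch below only does that arithmetic. The constant and function names are hypothetical, and the slides do not break down how the rest of the chip area is used.

```python
PROC_SIDE_MM = 0.8       # single processor is 0.8 mm x 0.8 mm
NUM_PROCESSORS = 36      # 6x6 array
CHIP_AREA_MM2 = 30.0     # full chip area
EST_CLOCK_MHZ = 450.0    # estimated speed of a single processor


def processor_area_mm2():
    """Area of one processor tile."""
    return PROC_SIDE_MM * PROC_SIDE_MM


def array_area_mm2():
    """Area covered by all processor tiles."""
    return NUM_PROCESSORS * processor_area_mm2()


def clock_period_ns():
    """Clock period corresponding to the estimated frequency."""
    return 1000.0 / EST_CLOCK_MHZ


if __name__ == "__main__":
    print("one processor (mm^2):", round(processor_area_mm2(), 2))
    print("36 processors (mm^2):", round(array_area_mm2(), 2))
    print("fraction of chip:", round(array_area_mm2() / CHIP_AREA_MM2, 3))
    print("clock period (ns):", round(clock_period_ns(), 3))
```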
Verification after layout
  DRC (design rule check)
  LVS (layout vs. schematic)
    .gds vs. (Verilog + SPICE module)
  Gate level Verilog dynamic simulation
    Mainly checks the function
    The post-layout gate level Verilog differs from the synthesis result
Useful tools
  Dynamic simulation: ModelSim (Mentor), NC-Verilog (Cadence), Active-HDL
  Synthesis: Design Compiler, Design Analyzer (Synopsys)
  Placement & routing
    Encounter & icfb (Cadence)
    Astro (Synopsys)
  DRC & LVS
    Calibre (Mentor)
    Dracula (Cadence)
  Static analysis: PrimeTime (Synopsys)