EE219A Spring 2008
Special Topics in Circuits and Signal Processing
Lecture 9: FPGA Architecture
Ranier Yap, Mohamed Ali

Announcements
- Homework 2 posted
  - Due Wed, May 7
  - Now is the time to turn in your Hw 1 if you haven't done so
- Next Monday
  - Out of town (second and last travel for this quarter)
  - Plan TBA on Wednesday
- Synplicity DSP training
  - Fri, May 16 (9am-5pm)
Slide 2
Lecture Today
- FPGA Architecture
  - Ranier Yap, Mohamed Ali
- More notes online (EEweb)
Slide 3

FPGAs: Basics and Examples
Presented by Ranier Yap and Mohamed Ali
Outline
- FPGA Architecture
  - Logic Block Architecture
  - Routing Architecture and Techniques
  - Interconnect Switches
- Xilinx FPGAs Overview
- Technological Side Effects (65nm)
- Virtex-5 FPGAs
- References
Slide 5

FPGA Architecture
- FPGA = Field-Programmable Gate Array
- Basic elements
  - Logic block architecture
  - Routing architecture
    - Layout
    - Routing techniques
  - Interconnect switches
Slide 6
Logic Block Architecture
- Granularity classifications
  - Fine grain
  - Coarse grain
Slide 7

Logic Block Architecture: Fine Grain
- Few, simple logic elements in a block
- Advantage: high utilization of each logic block
- Disadvantage: many interconnects and programmable switches
  - Larger chip area
  - Lower performance
Slide 8
Logic Block Architecture: Coarse Grain
- A few complex logic elements that each perform many functions
- Used in most FPGAs
- Example: Actel ACT1
  - 8 inputs to the logic block
  - Performs all 2-input functions, most 3-input functions, and some 4-input functions
  - Uses Shannon's Expansion Theorem
Slide 9
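Shannon's expansion theorem (Slide 9) states that f(x, ...) = x·f(1, ...) + x'·f(0, ...), so each expansion step maps onto one 2:1 multiplexer. That is why a multiplexer-based block can realize many different functions. The sketch below is a minimal Python illustration; the helper names `mux2` and `shannon_expand` and the sample `majority` function are illustrative assumptions, not taken from the Actel documentation.

```python
from itertools import product

def mux2(sel, d0, d1):
    """2:1 multiplexer: returns d1 when sel is 1, otherwise d0."""
    return d1 if sel else d0

def shannon_expand(f, n_inputs):
    """Rebuild f from its cofactors using only 2:1 muxes.

    f takes n_inputs bits (0/1). The returned function evaluates
    f(x0, rest) = mux2(x0, f(0, rest), f(1, rest)) recursively.
    """
    if n_inputs == 0:
        const = f()
        return lambda: const
    f0 = shannon_expand(lambda *rest: f(0, *rest), n_inputs - 1)  # cofactor x0 = 0
    f1 = shannon_expand(lambda *rest: f(1, *rest), n_inputs - 1)  # cofactor x0 = 1
    return lambda x0, *rest: mux2(x0, f0(*rest), f1(*rest))

def majority(a, b, c):
    """Sample 3-input function (hypothetical example)."""
    return (a & b) | (b & c) | (a & c)

mux_tree = shannon_expand(majority, 3)
matches = all(mux_tree(*bits) == majority(*bits)
              for bits in product((0, 1), repeat=3))
print(matches)
```

The mux tree agrees with the original function on all eight input combinations, which is the property the theorem guarantees.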
Routing Architecture: Importance
- Delay: interconnect (RC) delay accounts for 40-60% of the total delay
- Area: interconnects and switches take up the majority of the chip area
Slide 11

Layout (Row-Based)
- Type 1: Row based
  - Cells are located adjacent to the routing channels
  - Horizontal routing channels
  - Estimating the optimum number of tracks and segments is difficult
  - Main tradeoff: area vs. delay
Slide 12
Routing Techniques (Row-Based): Fully Segmented Channel
- Switches are needed at every cross-point
- Flexible routing
- Many switches
Slide 13

Routing Techniques (Row-Based): Non-Segmented Channel
- One track for one connection
- Few switches
Slide 14
Routing Techniques (Row-Based): 1-Segment Routing
- Tracks are divided into segments of various lengths
- Few switches
Slide 15

Routing Techniques (Row-Based): 2-Segment Routing
- Programmable segments are more flexible
- Fewer tracks
Slide 16
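The track-versus-switch tradeoff in Slides 13 to 16 can be made concrete by counting the tracks a set of connections needs under the two extreme channel styles. The sketch below assumes each connection is a closed interval of columns, and it uses the left-edge greedy algorithm, a standard channel-routing heuristic brought in here for illustration rather than taken from the slides. The connection list `nets` is made up.

```python
def tracks_non_segmented(connections):
    """One full-length track per connection (Slide 14)."""
    return len(connections)

def tracks_fully_segmented(connections):
    """Left-edge greedy assignment.

    Connections whose column spans do not overlap may share a track,
    because a fully segmented track can be cut at any column by
    leaving a switch open.
    """
    track_right_ends = []  # rightmost occupied column of each track
    for left, right in sorted(connections):
        for i, end in enumerate(track_right_ends):
            if end < left:                  # track is free from `left` onward
                track_right_ends[i] = right
                break
        else:
            track_right_ends.append(right)  # no free track: open a new one
    return len(track_right_ends)

nets = [(0, 3), (4, 7), (2, 5), (6, 9), (8, 9)]  # (first column, last column)
print(tracks_non_segmented(nets), tracks_fully_segmented(nets))
```

For this example the non-segmented channel needs five tracks while the fully segmented one needs two, at the cost of a switch at every cross-point.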
Layout (Matrix-Based)
- Type 2: Matrix/symmetrical based
  - Horizontal and vertical routing channels
  - Long interconnect lines
Slide 17

Routing Techniques (Matrix-Based)
- Connection Blocks (C-Blocks) connect the I/Os of logic blocks to the routing channels
- Switch Blocks (S-Blocks) connect segments at the intersections of routing channels
Slide 18
Routing Techniques (Matrix-Based): PIP (Programmable Interconnect Point)
- Fewer PIPs mean higher speed but lower routability
- Buffering between switches reduces loading and thus delay
Slide 19
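Slide 19's point that buffering between switches reduces delay can be illustrated with the Elmore delay model, in which a chain of N identical pass-transistor stages has delay R·C·N(N+1)/2 and therefore grows quadratically with N. The sketch below assumes ideal buffers with a fixed delay `t_buf`; the values of R, C, and `t_buf` are invented for illustration and do not describe any particular device.

```python
def elmore_chain_delay(n_switches, r_ohm, c_farad):
    """Elmore delay of n identical RC stages in series.

    The capacitor at node i is charged through i series resistors, so the
    delay is sum(i * R * C for i = 1..n) = R * C * n * (n + 1) / 2.
    """
    return sum(i * r_ohm * c_farad for i in range(1, n_switches + 1))

def buffered_chain_delay(n_switches, r_ohm, c_farad, every, t_buf):
    """Delay when a buffer is inserted after every `every` switches.

    Each buffer isolates its section, so section delays add linearly.
    Assumes n_switches is a multiple of `every`.
    """
    sections = n_switches // every
    return sections * (elmore_chain_delay(every, r_ohm, c_farad) + t_buf)

R, C, T_BUF = 1e3, 10e-15, 20e-12   # 1 kOhm, 10 fF, 20 ps (illustrative values)
unbuffered = elmore_chain_delay(8, R, C)
buffered = buffered_chain_delay(8, R, C, every=2, t_buf=T_BUF)
print(f"{unbuffered * 1e12:.0f} ps unbuffered, {buffered * 1e12:.0f} ps buffered")
```

With these numbers the buffered path is faster even though each buffer adds its own delay, because the quadratic term is replaced by a sum of short sections.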
Interconnect Switches: Type 1, Antifuse
- A high voltage (11-21V) is applied to blow the fuse
- Not reprogrammable
- Requires an additional programming circuit
[Figure: metal-to-metal antifuse cross-section. Labels: Metal 3, Metal-to-Metal Antifuse, Metal 2, Via, Metal 1, Contact, Silicon]
Slide 21

Interconnect Switches: Type 2, SRAM Based
- Uses pass transistors
- Each pass transistor is controlled by an SRAM configuration bit
- Higher interconnect resistance and capacitance than antifuse
- SRAM is volatile: the stored value is lost when power is removed
Slide 22
Interconnect Switches: Type 3, EPROM Based
- Uses a floating-gate transistor
- The transistor is turned OFF by injecting charge onto the floating gate
- The configuration is retained when power is removed
Slide 23
Xilinx FPGAs
- Two well-known series
  - Spartan series
  - Virtex series
Slide 25

Xilinx FPGAs: Spartan Series
- Spartan, Spartan-II, and Spartan-3 families
- Up to 50% lower system cost than competing FPGAs
- Largest selection of device/package options
- Most comprehensive IP library
- Efficient, cost-effective board designs
- Increased system reliability by eliminating external components
Slide 26
Xilinx FPGAs: Virtex Series
- Virtex-E, Virtex-II, Virtex-II Pro, Virtex-4, and Virtex-5 families
- Xilinx claims that Virtex FPGAs can replace ASICs in many applications
  - Most advanced logic fabric
  - Highest-performance FPGAs
  - Highest density
  - Highest-throughput embedded processing
  - Highest-speed serial connectivity
  - Greatest memory capacity
  - Lower power consumption
Slide 27
Technological Side Effects (65nm): Soft Errors
- Occur when an alpha particle strikes a node and changes its state
- Solution: patented methods to improve the robustness of devices
Slide 29

Technological Side Effects (65nm): Wear-out Mechanisms
- Hot Carrier Injection (HCI)
- Time-Dependent Dielectric Breakdown (TDDB)
- Negative Bias Temperature Instability (NBTI)
- Solution: use a lower voltage and a thicker gate oxide, at the expense of performance
Slide 30
Technological Side Effects (65nm): Latchup
- Occurs when a device has current forced into or out of the substrate
- Sustained latchup can destroy the device
- Solution:
  - Follow conservative design rules
  - Re-lay out the chip if latchup is discovered during testing
Slide 31

Technological Side Effects (65nm): Excessive Leakage Current
- Solution: a third gate oxide thickness on transistors that do not require high performance
Slide 32
Outline (Virtex-5)
- Virtex-5 FPGA Family Overview
- Configurable Logic Blocks (CLBs)
- Inputs and Outputs
- Block RAM
- Clock Resources
- Power Minimization in Virtex-5
Slide 34
Virtex-5 FPGA Family Overview
- 65nm copper CMOS process
- 1.0V core voltage
- 12 metal layers
- 550MHz clock technology
- Up to 50K Virtex-5 slices (330K logic cells)
  - 4 LUTs and 4 FFs per slice
- Up to 1000 DSP48E slices
  - One 25x18 multiplier, one adder, and one accumulator per DSP48E slice
- Up to 18Mbits of memory
- Up to 1,200 user I/Os
  - 1.2 to 3.3V I/O operation
Slide 35

Virtex-5 FPGA Family Overview: Four Platforms
- Virtex-5 LX: high-performance general logic applications
- Virtex-5 LXT: high-performance logic with advanced serial connectivity
- Virtex-5 SXT: high-performance signal processing applications with advanced serial connectivity
- Virtex-5 FXT: high-performance embedded systems with advanced serial connectivity
Slide 36
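The DSP48E slice described on Slide 35 (one 25x18 multiplier feeding an adder and accumulator) can be sketched behaviorally as a multiply-accumulate unit. The class below is a simplified model under stated assumptions: operands are signed two's-complement values, the accumulator is 48 bits wide (inferred from the DSP48E name, not stated on the slide), and pipeline registers and the slice's other operating modes are ignored. The names `MacSlice` and `to_signed` are hypothetical.

```python
def to_signed(value, bits):
    """Wrap an integer into a two's-complement value of the given width."""
    mask = (1 << bits) - 1
    value &= mask
    return value - (1 << bits) if value & (1 << (bits - 1)) else value

class MacSlice:
    """Behavioral multiply-accumulate: 25x18 signed multiply, 48-bit accumulator."""
    A_BITS, B_BITS = 25, 18
    P_BITS = 48   # assumption: accumulator width implied by the DSP48E name

    def __init__(self):
        self.p = 0

    def clock(self, a, b):
        """One clock cycle: p <= p + a * b, wrapped to the accumulator width."""
        a = to_signed(a, self.A_BITS)
        b = to_signed(b, self.B_BITS)
        self.p = to_signed(self.p + a * b, self.P_BITS)
        return self.p

# Dot product of three sample/coefficient pairs (made-up data).
mac = MacSlice()
for sample, coeff in [(3, 2), (-4, 5), (7, -1)]:
    mac.clock(sample, coeff)
print(mac.p)
```

One such unit computes one tap of a FIR filter per clock cycle, which is why the DSP slice count is a headline figure for the signal-processing platforms.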
Virtex-5 FPGA Family Overview
[Figure-only slide]
Slide 37

Virtex-5 FPGA Family Overview
[Figure-only slide]
Slide 38
Virtex-5 FPGA Family: Configurable Logic Blocks (CLBs)
- CLBs are the main logic resources for implementing sequential and combinational designs
- A CLB contains two independent slices with no direct connection between them
Slide 40
Virtex-5 FPGA Family: Configurable Logic Blocks (CLBs)
- Each slice is connected to the global routing paths through the switch matrix
- Slices in the same column of different CLBs are connected by a fast carry chain
Slide 41

Virtex-5 FPGA Family: Configurable Logic Blocks (CLBs)
- Two types of slices
  - Regular slices (SLICEL)
  - Slices that support more functions (SLICEM)
- Every CLB contains one or two SLICELs
- Every other CLB column contains a SLICEM
Slide 42
Virtex-5 FPGA Family
[Figure: SLICEL diagram]
Slide 43

Virtex-5 FPGA Family: Configurable Logic Blocks (CLBs)
- A SLICEL contains four independent 6-input LUTs
  - Each can be used simply as a ROM
  - Each can be used as two 5-input LUTs with shared inputs
Slide 44
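The LUT modes on Slide 44 follow from treating a 6-input LUT as a 64-entry, 1-bit ROM: the inputs form the address, and the stored bits are the truth table. Splitting the table into two 32-entry halves gives two 5-input functions of the same inputs. The sketch below is a behavioral model only; the class and method names are hypothetical, and the choice of which half drives which output is an assumption for illustration.

```python
class Lut6:
    """6-input LUT modeled as a 64-entry, 1-bit ROM."""

    def __init__(self, init):
        self.init = init & ((1 << 64) - 1)   # truth-table contents

    def o6(self, a):
        """Single 6-input function: `a` is the 6-bit address."""
        return (self.init >> (a & 0x3F)) & 1

    def dual_o5_o6(self, a5):
        """Two 5-input functions of the same 5-bit address `a5`.

        The lower 32 table bits feed one output and the upper 32 bits
        the other (equivalent to holding the sixth input high).
        """
        a5 &= 0x1F
        lower = (self.init >> a5) & 1
        upper = (self.init >> (32 + a5)) & 1
        return lower, upper

def truth_table(func, n_inputs):
    """Pack func(address) for every address into one integer."""
    return sum(func(a) << a for a in range(1 << n_inputs))

def parity5(a):
    """XOR of five inputs."""
    return bin(a).count("1") & 1

def and5(a):
    """AND of five inputs."""
    return int(a == 0x1F)

lut = Lut6(truth_table(parity5, 5) | (truth_table(and5, 5) << 32))
print(lut.dual_o5_o6(0b11111), lut.dual_o5_o6(0b00011))
```

Here one physical LUT provides both a 5-input parity and a 5-input AND, which is the sharing that Slide 44 refers to.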
Slide 45

Virtex-5 FPGA Family: Configurable Logic Blocks (CLBs)
- Fast lookahead carry logic
  - Dedicated carry logic
  - The carry chain runs upward through multiple CLBs, with 4 bits per slice
  - S is the propagate input and DI is the generate input
  - CYINIT may be used as the first carry bit
Slide 46
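The carry chain on Slide 46 can be sketched bit by bit: when the propagate input S is 1 the incoming carry passes through, otherwise the generate input DI becomes the carry, and the sum bit is S XOR carry-in. The model below chains 4-bit blocks to build an adder. It is a behavioral sketch under the assumption that S = A XOR B is computed in the LUT and DI = A; the function names are hypothetical.

```python
def carry4(s, di, ci):
    """One slice's worth of carry logic (4 bits).

    For each bit: sum output O = S xor carry-in, and
    carry-out = carry-in if S (propagate) else DI (generate).
    s, di: lists of four bits, LSB first; ci: incoming carry bit.
    Returns (o, co) where co is the carry out of the top bit.
    """
    o = []
    carry = ci
    for s_bit, di_bit in zip(s, di):
        o.append(s_bit ^ carry)
        carry = carry if s_bit else di_bit
    return o, carry

def add(a, b, width=8, cyinit=0):
    """Ripple 4-bit carry blocks upward, as along a slice column.

    Assumes width is a multiple of 4.
    """
    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]
    s = [x ^ y for x, y in zip(a_bits, b_bits)]   # propagate, from the LUTs
    di = a_bits                                    # carry value when not propagating
    result, carry = 0, cyinit
    for base in range(0, width, 4):
        o, carry = carry4(s[base:base + 4], di[base:base + 4], carry)
        result |= sum(bit << (base + i) for i, bit in enumerate(o))
    return result, carry

print(add(0x5A, 0xC7))
```

When S is 0 the two operand bits are equal, so either one is the correct carry out; that is why a single DI input is enough for generation.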
Virtex-5 FPGA Family: Configurable Logic Blocks (CLBs)
- Storage element configuration
  - Positive- or negative-edge-triggered flip-flop, or level-sensitive (active-high or active-low) latch
  - Synchronous or asynchronous set/reset (using the SR and REV inputs)
  - D inputs come from the LUTs or directly from the AX, BX, CX, and DX inputs
Slide 47

Virtex-5 FPGA Family: Configurable Logic Blocks (CLBs)
[Figure-only slide]
Slide 48
Virtex-5 FPGA Family: Configurable Logic Blocks (CLBs)
- Besides ROMs and LUTs, the LUTs within a SLICEM can be configured as:
  - Single-port 32x1-bit RAM
  - Dual-port 32x1-bit RAM
  - Quad-port 32x2-bit RAM
  - Simple dual-port 32x6-bit RAM
  - Single-port 64x1-bit RAM
  - Dual-port 64x1-bit RAM
  - Quad-port 64x1-bit RAM
  - Simple dual-port 64x3-bit RAM
  - Single-port 128x1-bit RAM
  - Dual-port 128x1-bit RAM
  - Single-port 256x1-bit RAM
  - 32-bit shift register, without using the slice FFs
Slide 49

Virtex-5 FPGA Family: Configurable Logic Blocks (CLBs)
- RAM inside a SLICEM is called distributed RAM
- Distributed RAM modules have a synchronous write and an asynchronous read
- The outputs can be made synchronous by routing them through the SLICEM FFs
Slide 50
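The distributed RAM timing on Slide 50 (writes happen on a clock edge, reads follow the address immediately, and an optional flip-flop registers the output) can be sketched as a small behavioral model. The class below assumes a 64x1 single-port configuration and assumes the output flip-flop samples the value present before the edge; both the class name and that ordering are illustrative assumptions.

```python
class DistributedRam:
    """Behavioral model: synchronous write, asynchronous read."""

    def __init__(self, depth=64):
        self.mem = [0] * depth
        self.q = 0            # optional registered output (slice flip-flop)

    def read(self, addr):
        """Asynchronous read: data follows the address with no clock."""
        return self.mem[addr]

    def clock(self, addr, data=0, we=False):
        """Rising clock edge: register the read data, then write if enabled."""
        self.q = self.mem[addr]      # flip-flop samples the pre-edge value
        if we:
            self.mem[addr] = data & 1
        return self.q

ram = DistributedRam()
ram.clock(addr=5, data=1, we=True)   # write 1 to address 5 on this edge
async_value = ram.read(5)            # visible immediately after the write
registered_before = ram.q            # register still holds the old value
ram.clock(addr=5)                    # one more edge updates the register
print(async_value, registered_before, ram.q)
```

The extra clock needed for the registered output is the latency cost of making the read synchronous.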
Virtex-5 FPGA Family: Inputs and Outputs
- Each I/O pad is connected to an I/O block and to ILOGIC, OLOGIC, and IODELAY blocks
- The I/O block may be configured for a wide variety of I/O standards
Slide 52
Virtex-5 FPGA Family: Inputs and Outputs
[Figure: basic I/O block]
Slide 53

Virtex-5 FPGA Family: Inputs and Outputs
[Figure: ILOGIC block]
Slide 54
Virtex-5 FPGA Family: Inputs and Outputs
[Figure: OLOGIC block]
Slide 55

Virtex-5 FPGA Family: Inputs and Outputs
- The I/O blocks are equipped with Digitally Controlled Impedance (DCI)
  - Adjusts the output impedance or input termination to accurately match the characteristic impedance of the PCB transmission line
  - Continuously adjusts the impedance to compensate for changes due to process variations, temperature, and supply voltage fluctuations
  - Provides parallel or series termination for transmitters and receivers
Slide 56
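The reason DCI on Slide 56 matches the termination to the line's characteristic impedance can be shown with the standard reflection coefficient, Γ = (Z_L − Z_0) / (Z_L + Z_0), which is zero only when the two impedances are equal. The sketch below uses an illustrative 50-ohm line; the impedance values are assumptions and not Virtex-5 specifications.

```python
def reflection_coefficient(z_load, z0):
    """Fraction of the incident voltage wave reflected at a termination."""
    return (z_load - z0) / (z_load + z0)

Z0 = 50.0  # ohms, illustrative trace impedance
for z_term in (50.0, 60.0, 100.0):
    gamma = reflection_coefficient(z_term, Z0)
    print(f"termination {z_term:5.1f} ohm -> reflection {gamma:+.3f}")
```

A termination that drifts with temperature or supply voltage moves away from the matched case, which is why the slide stresses continuous adjustment.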
Virtex-5 FPGA Family: Block RAM Features
- Each block RAM can store up to 36Kb of data
- A block can be configured as two independent 18Kb RAMs
- Writes and reads are synchronous
- The read and write ports are independent
Slide 58
Virtex-5 FPGA Family: Block RAM Features
- The memory content can be initialized or cleared by the configuration bitstream
- The block RAM can be configured as a FIFO
- A write operation requires one clock edge
- A read operation requires one clock edge
Slide 59
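The block RAM properties on Slides 58 and 59 (independent read and write ports, each operation completing on one clock edge, optional FIFO configuration) can be combined in a small behavioral FIFO. The sketch below is a generic single-clock FIFO and an assumption-laden model, not the Virtex-5 FIFO logic itself; the class name `SyncFifo` and its interface are hypothetical.

```python
class SyncFifo:
    """FIFO on a memory with independent write and read ports.

    Both operations take effect on a clock edge; the read data is held
    in an output register, mirroring a synchronous block RAM read.
    """

    def __init__(self, depth):
        self.mem = [None] * depth
        self.depth = depth
        self.wr_ptr = self.rd_ptr = self.count = 0
        self.dout = None

    def clock(self, wr_en=False, din=None, rd_en=False):
        """One clock edge; a write and a read may occur in the same cycle."""
        do_read = bool(rd_en) and self.count > 0             # ignore reads when empty
        do_write = bool(wr_en) and self.count < self.depth   # ignore writes when full
        if do_read:
            self.dout = self.mem[self.rd_ptr]
            self.rd_ptr = (self.rd_ptr + 1) % self.depth
        if do_write:
            self.mem[self.wr_ptr] = din
            self.wr_ptr = (self.wr_ptr + 1) % self.depth
        self.count += int(do_write) - int(do_read)
        return self.dout

fifo = SyncFifo(depth=4)
fifo.clock(wr_en=True, din=10)
fifo.clock(wr_en=True, din=20)
first = fifo.clock(rd_en=True)                        # one edge per read
second = fifo.clock(rd_en=True, wr_en=True, din=30)   # read and write together
print(first, second, fifo.count)
```

Because the two ports are independent, the simultaneous read and write in the last step does not require arbitration.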
Virtex-5 FPGA Family: Clock Resources, Global Clocks
- Each Virtex-5 device has 32 global clock lines
- They can clock all sequential resources (CLBs, block RAMs, and I/Os)
- Global clock lines are driven by a global clock buffer, which
  - Can be used as a clock enable
  - Can select between two clock sources
- A global clock buffer can be driven by a Clock Management Tile (CMT), which adjusts the clock delay relative to another clock
Slide 61

Virtex-5 FPGA Family: Clock Resources, Regional Clocks
- A Virtex-5 device is divided into regions (8 to 24)
- Each region has two regional clock buffers and four regional clock trees
- Each region has an associated I/O bank with four clock-capable inputs
- A regional clock buffer can divide the incoming clock rate by any integer from 1 to 8
- A regional clock buffer can also drive the regional clock trees of the adjacent regions
Slide 62
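The two buffer behaviors on Slides 61 and 62 can be sketched in a few lines: a global clock buffer that selects one of two sources and can gate the result, and a regional clock buffer that divides the incoming rate by an integer from 1 to 8. The functions below are idealized models that ignore glitch-free switching and phase alignment; their names and signatures are hypothetical.

```python
def global_clock_buffer(clk_a, clk_b, select_b=False, enable=True):
    """Idealized global buffer: picks one of two sources, gated by enable."""
    if not enable:
        return 0
    return clk_b if select_b else clk_a

def divided_clock(n_input_edges, divide):
    """Rising-edge indices of a divide-by-N clock derived from the input.

    `divide` must be an integer from 1 to 8 (the range stated on Slide 62).
    The output rises on every `divide`-th rising edge of the input.
    """
    if not 1 <= divide <= 8:
        raise ValueError("divide ratio must be between 1 and 8")
    return [edge for edge in range(n_input_edges) if edge % divide == 0]

print(divided_clock(12, 3))
print(global_clock_buffer(1, 0), global_clock_buffer(1, 0, select_b=True))
```

Dividing in the regional buffer lets slower I/O-side logic run from the same source clock as the faster fabric logic.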
Power Minimization in Virtex-5: Static Power
- Challenges
  - High leakage current across the channel or through the gate oxide
Slide 64
Power Minimization in Virtex-5: Static Power Reduction
- Triple-oxide process technology
  - FPGAs normally use very thin-oxide transistors for high performance (with high leakage) and thicker-oxide transistors for the high-voltage-tolerant transistors in the I/O blocks
  - The triple-oxide process introduces a third, medium-thickness gate oxide
  - This makes it possible to use the right transistor for the right job
Slide 65

Power Minimization in Virtex-5: Static Power Reduction
- Triple-oxide process technology
  - Thin-oxide transistors are used for the core logic
  - Thick-oxide transistors are used for the I/O blocks
  - Mid-oxide transistors have lower performance but dramatically reduced leakage compared to thin-oxide transistors, and are used in:
    - The configuration memory (no need for high performance)
    - Pass gates used in routing (no need for fast switching)
Slide 66
Power Minimization in Virtex-5: Static Power Reduction
- The use of 6-input LUTs (for the first time) increases logic capacity
  - More logic is implemented locally
  - Fewer drivers are needed, and hence there is less leakage
Slide 67

Power Minimization in Virtex-5: Dynamic Power Reduction
- Large LUTs localize the logic, reducing the load capacitance contributed by the programmable interconnect
- Virtex-5 has a new, more uniform routing architecture that reduces the number of hops, i.e., reduces capacitance
Slide 68
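The link between capacitance and dynamic power on Slide 68 follows from the standard switching-power relation P = α·C·V²·f, where α is the switching activity. The sketch below compares two made-up net capacitances at the 1.0V core voltage quoted on Slide 35; the activity, capacitance, and frequency values are illustrative assumptions, not Xilinx data.

```python
def dynamic_power(alpha, c_farad, v_volt, f_hz):
    """Switching power of a node: activity * capacitance * V^2 * frequency."""
    return alpha * c_farad * v_volt ** 2 * f_hz

# Illustrative numbers only; 1.0 V is the core voltage quoted on Slide 35.
base = dynamic_power(alpha=0.125, c_farad=200e-15, v_volt=1.0, f_hz=250e6)
fewer_hops = dynamic_power(alpha=0.125, c_farad=150e-15, v_volt=1.0, f_hz=250e6)
print(f"{base * 1e6:.2f} uW -> {fewer_hops * 1e6:.2f} uW "
      f"({fewer_hops / base:.0%} of baseline)")
```

Because power is linear in capacitance, removing a quarter of a net's routing capacitance removes a quarter of its switching power at the same voltage and frequency.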
Power Minimization in Virtex-5: Dynamic Power Reduction
- The block RAMs are composed of smaller 9Kb RAMs
- Only the 9Kb RAM being addressed is selected during a read or write operation, which reduces the power consumed in the other 9Kb RAMs
Slide 69
References
- Virtex-5 FPGA User Guide (xilinx.com)
- Virtex-5 Family Overview (xilinx.com)
- http://en.wikipedia.org/wiki/field-programmable_gate_array
- http://www.ecs.umass.edu/ece/tessier/courses/697ff/lect13-ece697f.ppt
- http://www.eecg.toronto.edu/~vaughn/challenge/fpga_arch.html
- http://www.chipdesignmag.com/print.php?articleid=434?issueid=16
- http://www.cs.cmu.edu/afs/cs.cmu.edu/academic/class/15828-s98/lectures/0119/index.htm
- http://www.edacafe.com/books/asic/book/ch05/ch05.1.php
- Channel Segmentation Design for Symmetrical FPGAs, Wai-Kei Mak, 1997
- Architecture of FPGAs and CPLDs: A Tutorial, Stephen Brown and Jonathan Rose
- Programmable Logic Handbook, Ashok K. Sharma, 1998
- Power Consumption in 65nm FPGAs, Derek Curd, Xilinx WP246 (v1.2), February 1, 2007
Slide 71