Virtex-II Architecture

Virtex-II Architecture
The device combines block SelectRAM resources, I/O Blocks (IOBs), dedicated multipliers, programmable interconnect, Configurable Logic Blocks (CLBs), and clock management (DCMs, BUFGMUXes). The Virtex-II core operates at 1.5 V.
Basic FPGA Architecture 2-2 2005 Xilinx, Inc. All Rights Reserved

Slices and CLBs
Each Virtex-II CLB contains four slices. Local routing provides feedback between slices in the same CLB and routing to neighboring CLBs. A switch matrix provides access to general routing resources.
[Figure: one CLB with its switch matrix, slices S0 to S3, two BUFTs, local routing, the SHIFT chain, and the CIN/COUT carry connections]

Simplified Slice Structure
Each slice has four outputs: two registered and two non-registered. Two BUFTs are associated with each CLB and are accessible by all 16 CLB outputs. Carry logic runs vertically, upward only, with two independent carry chains per CLB.
[Figure: Slice 0 with two LUTs, two carry blocks, and two storage elements with D, CE, PRE, CLR, and Q pins]

Detailed Slice Structure
The next few slides discuss the slice features: LUTs; MUXF5, MUXF6, MUXF7, and MUXF8 (only the F5 and F6 MUXes are shown in this diagram); carry logic; MULT_ANDs; and sequential elements.

Look-Up Tables
Combinatorial logic is stored in Look-Up Tables (LUTs), also called Function Generators (FGs). Capacity is limited by the number of inputs, not by the complexity of the function. Delay through the LUT is constant.
Example truth table stored in a 4-input LUT (inputs A, B, C, D; output Z):
A B C D | Z
0 0 0 0 | 0
0 0 0 1 | 0
0 0 1 0 | 0
0 0 1 1 | 1
0 1 0 0 | 1
0 1 0 1 | 1
. . .
1 1 0 0 | 0
1 1 0 1 | 0
1 1 1 0 | 0
1 1 1 1 | 1
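
The point that LUT capacity depends on input count rather than complexity can be sketched in Python: any 4-input function becomes the same 16-bit table and the same single lookup. The helper names `make_lut4` and `lut4_read`, the example function, and the input bit ordering are illustrative assumptions, not Xilinx definitions.

```python
from itertools import product


def make_lut4(func):
    """Build the 16-bit table for a 4-input Boolean function (hypothetical helper)."""
    init = 0
    for a, b, c, d in product((0, 1), repeat=4):
        index = (a << 3) | (b << 2) | (c << 1) | d  # assumed input ordering
        if func(a, b, c, d):
            init |= 1 << index
    return init


def lut4_read(init, a, b, c, d):
    """Evaluate the LUT: one table lookup, whatever the stored function is."""
    index = (a << 3) | (b << 2) | (c << 1) | d
    return (init >> index) & 1


def example_function(a, b, c, d):
    """Function chosen only for illustration."""
    return (a & b) | (c ^ d)


INIT = make_lut4(example_function)
```

A simple AND and a complicated mixed expression both occupy exactly one LUT here, which is why the delay is the same for both.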

Connecting Look-Up Tables
MUXF5 combines the two LUTs in each slice. MUXF6 combines slices S0 and S1, and likewise slices S2 and S3. MUXF7 combines the two MUXF6 outputs. MUXF8 combines the two MUXF7 outputs (from the CLB above or below).
[Figure: one CLB with slices S0 to S3, each containing an F5 MUX plus one of the F6, F7, F6, and F8 MUXes]
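
As a rough illustration of how a multiplexer widens the function a pair of LUTs can implement, the sketch below splits a 5-input function into two 4-input tables and selects between them with the fifth input, which is the role MUXF5 plays. All helper names and the parity example are assumptions made for this sketch.

```python
def build_lut4(func4):
    """Return the 16-bit table for a 4-input function (hypothetical helper)."""
    init = 0
    for i in range(16):
        a, b, c, d = (i >> 3) & 1, (i >> 2) & 1, (i >> 1) & 1, i & 1
        if func4(a, b, c, d):
            init |= 1 << i
    return init


def lut4(init, a, b, c, d):
    """Look up one bit of a 4-input table."""
    return (init >> ((a << 3) | (b << 2) | (c << 1) | d)) & 1


def split_5_input(func5):
    """Shannon-expand a 5-input function on its fifth input e."""
    low = build_lut4(lambda a, b, c, d: func5(a, b, c, d, 0))
    high = build_lut4(lambda a, b, c, d: func5(a, b, c, d, 1))
    return low, high


def muxf5(low, high, a, b, c, d, e):
    """Select between the two LUT outputs using e, as a 2:1 multiplexer would."""
    return lut4(high, a, b, c, d) if e else lut4(low, a, b, c, d)


def parity5(a, b, c, d, e):
    """Example 5-input function, chosen only for illustration."""
    return a ^ b ^ c ^ d ^ e


LOW, HIGH = split_5_input(parity5)
```

Repeating the same split one level up is the idea behind MUXF6, MUXF7, and MUXF8.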

Fast Carry Logic
Simple, fast, and complete arithmetic logic. A dedicated XOR gate provides single-level sum completion. The carry logic uses dedicated routing resources, and all synthesis tools can infer it.
[Figure: two carry chains in one CLB. The first runs from CIN through slices S0 and S1, with COUT going to S0 of the next CLB. The second runs from CIN through slices S2 and S3, with COUT going to CIN of S2 of the next CLB.]
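
A minimal Python model of one common way such a carry chain is described: per bit, the LUT produces a propagate signal, a carry multiplexer passes either the incoming carry or an operand bit, and the dedicated XOR completes the sum. The function name and the bit-serial loop are illustrative assumptions; real hardware evaluates the chain combinationally.

```python
def carry_chain_add(a, b, width, cin=0):
    """Add two unsigned integers bit by bit through a modeled carry chain."""
    total = 0
    carry = cin
    for i in range(width):
        ai = (a >> i) & 1
        bi = (b >> i) & 1
        propagate = ai ^ bi                  # computed in the LUT
        sum_bit = propagate ^ carry          # dedicated XOR gate
        carry = carry if propagate else ai   # carry multiplexer
        total |= sum_bit << i
    return total, carry
```

For example, `carry_chain_add(200, 100, 8)` returns the low 8 bits of the sum together with the carry out of the top bit.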

MULT_AND Gate
Highly efficient multiply-and-add implementation. Earlier FPGA architectures require two LUTs per bit to perform the multiplication and addition. The MULT_AND gate enables an area reduction by performing the multiply and the add in one LUT per bit.
[Figure: operands A and B feed a LUT and the MULT_AND gate, which drive CY_MUX (pins S, DI, CI, CO) and CY_XOR to form A x B; the older approach needs a second LUT]

Flexible Sequential Elements
Each storage element can be either a flip-flop or a latch. There are two in each slice and eight in each CLB. Inputs come from LUTs or from an independent CLB input. Set and reset controls are separate and can be synchronous or asynchronous. All controls are shared within a slice. Control signals can be inverted locally within a slice.
[Figure: primitives FDRSE_1 (D, CE, S, R, Q), FDCPE (D, PRE, CE, CLR, Q), and LDCPE (D, PRE, CE, G, CLR, Q)]

Shift Register LUT (SRL16CE)
Dynamically addressable serial shift registers. Maximum delay of 16 clock cycles per LUT (128 per CLB). Cascadable to other LUTs or CLBs for longer shift registers, through a dedicated connection from Q15 to the D input of the next SRL16CE. Shift register length can be changed asynchronously by toggling address A.
[Figure: a LUT configured as a 16-stage shift register with D, CE, CLK, and A[3:0] inputs, the addressed output Q, and the cascade output Q15]
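
A small behavioral sketch in Python may help show what "dynamically addressable" means: data shifts in on each enabled clock, and the address picks which of the 16 taps drives the output, so the delay is the address plus one cycles. The class below is an illustrative model with assumed method names, not the vendor simulation model.

```python
class SRL16CE:
    """Behavioral sketch of a 16-bit addressable shift register."""

    def __init__(self, init=0):
        # bits[0] is the most recently shifted-in value.
        self.bits = [(init >> i) & 1 for i in range(16)]

    def clock(self, d, ce=1):
        """Rising clock edge: shift only when clock enable is high."""
        if ce:
            self.bits = [d & 1] + self.bits[:15]

    def q(self, a):
        """Addressed output; changing 'a' changes the tap without a clock."""
        return self.bits[a & 0xF]

    @property
    def q15(self):
        """Cascade output, taken from the last stage."""
        return self.bits[15]


srl = SRL16CE()
srl.clock(1)          # shift in a single 1
for _ in range(3):
    srl.clock(0)      # follow it with zeros
```

After these four clocks the 1 sits at tap 3, so reading with address 3 returns it while address 0 returns the latest 0.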

IOB Element
Input path: two DDR registers. Output path: two DDR registers and two 3-state enable DDR registers. The input and output paths have separate clocks and clock enables. Set and reset signals are shared.
[Figure: IOB with two 3-state registers (OCK1, OCK2) feeding a DDR MUX, two output registers (OCK1, OCK2) feeding a DDR MUX, two input registers (ICK1, ICK2), and the PAD]

Distributed SelectRAM Resources
Uses a LUT in a slice as memory. Writes are synchronous and reads are asynchronous; the accompanying flip-flops can be used to create a synchronous read. RAM and ROM are initialized during configuration, and data can be written to RAM after configuration. Emulated dual-port RAM provides one read/write port and one read-only port.
[Figure: primitives RAM16X1S (D, WE, WCLK, A0 to A3, O), RAM32X1S (D, WE, WCLK, A0 to A4, O), and RAM16X1D (D, WE, WCLK, A0 to A3, SPO, DPRA0 to DPRA3, DPO)]

Block SelectRAM Resources
Up to 3.5 Mb of RAM in 18-kb blocks. Synchronous read and write. True dual-port memory: each port has synchronous read and write capability, and each port can use a different clock. Supports initial values. Synchronous reset on output latches. Supports parity bits, with one parity bit per eight data bits.
[Figure: 18-kb block SelectRAM memory with port A pins DIA, DIPA, ADDRA, WEA, ENA, SSRA, CLKA, DOA, DOPA and port B pins DIB, DIPB, ADDRB, WEB, ENB, SSRB, CLKB, DOB, DOPB]

Dual-port Block RAM Configurations
Configurations available on each port:
Configuration   Depth   Data Bits   Parity Bits
16k x 1         16K     1           0
8k x 2          8K      2           0
4k x 4          4K      4           0
2k x 9          2K      8           1
1k x 18         1K      16          2
512 x 36        512     32          4
Ports A and B can be configured independently. Data-width conversion is supported, including parity bits.
[Figure: 8-bit data written on port A (8 bits wide) and read as 32-bit data on port B (32 bits wide)]
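
The arithmetic behind the configuration table can be checked with a short Python sketch: every aspect ratio holds 16 Kb of data, and the configurations with parity add 2 Kb to reach the full 18 Kb. The tuple layout and function names are assumptions made for this sketch.

```python
# (depth in words, data bits per word, parity bits per word)
CONFIGS = [
    (16 * 1024, 1, 0),
    (8 * 1024, 2, 0),
    (4 * 1024, 4, 0),
    (2 * 1024, 8, 1),
    (1024, 16, 2),
    (512, 32, 4),
]


def capacity_bits(depth, data_bits, parity_bits):
    """Return (data capacity, parity capacity) in bits for one configuration."""
    return depth * data_bits, depth * parity_bits


def narrow_words_per_wide_word(narrow_width, wide_width):
    """How many narrow-port words make up one wide-port word."""
    return wide_width // narrow_width


for depth, data_bits, parity_bits in CONFIGS:
    data, parity = capacity_bits(depth, data_bits, parity_bits)
    print(f"{depth} x {data_bits + parity_bits}: {data} data bits, {parity} parity bits")
```

With port A at 8 bits and port B at 32 bits, `narrow_words_per_wide_word(8, 32)` shows that four writes on port A fill one word read on port B.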

Dedicated Multiplier Blocks
18-bit two's complement signed operation. Optimized to implement multiply-and-accumulate functions. Multipliers are physically located next to block SelectRAM memory.
[Figure: 18 x 18 multiplier with inputs Data_A (18 bits) and Data_B (18 bits) and a 36-bit output; operand sizes shown are 4 x 4, 8 x 8, 12 x 12, and 18 x 18 signed]
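
A short Python sketch of signed 18 x 18 multiplication, showing why a 36-bit output is wide enough for every operand pair. The helper names are assumptions, and the model covers only the arithmetic, not timing or pipelining.

```python
def to_signed(value, bits):
    """Interpret the low 'bits' bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        return value - (1 << bits)
    return value


def mult18x18(a, b):
    """Multiply two 18-bit two's complement operands; return the 36-bit pattern."""
    product = to_signed(a, 18) * to_signed(b, 18)
    return product & ((1 << 36) - 1)


# Largest-magnitude case: (-2**17) * (-2**17) = 2**34, which fits in 36 signed bits.
WORST_CASE = mult18x18(0x20000, 0x20000)
```

Narrower signed operands can use the same block after sign extension to 18 bits.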

Xilinx Design Flow
The flow diagram shows these stages: Plan & Budget; Create Code/Schematic; HDL RTL Simulation; Synthesize to create netlist; Functional Simulation; Implement (Translate, Map, Place & Route); Attain Timing Closure; Timing Simulation; Create BIT File.

Xilinx Implementation
Once you generate a netlist, you can implement the design. Implementation produces several outputs: reports, timing simulation netlists, floorplan files, FPGA Editor files, and more.
[Figure: the Implement stage, made up of Translate, Map, and Place & Route]

What is Implementation?
Implementation is more than just Place & Route; it includes many phases.
Translate: merge multiple design files into a single netlist.
Map: group logical symbols from the netlist (gates) into physical components (slices and IOBs).
Place & Route: place components onto the chip, connect the components, and extract timing data into reports.
Each phase generates files that allow you to use other Xilinx tools: Floorplanner, FPGA Editor, and XPower.

Project Summary
The summary covers Design Overview, Device Utilization, Performance and Constraints, and Reports.

Map Reports
Map Report contents:
Command line options for the map program.
Design summary: a list of how many device resources are used, plus errors and warnings.
Removed logic summary: a list of logic that was removed due to sourceless or loadless nets.
IOB properties: indicates whether an I/O flip-flop is used and lists the attributes on each I/O pin.
The Post-Map Static Timing Report is not covered here.

Map Report Example
Release 4.1i - Map E.30
Xilinx Mapping Report File for Design 'top'
Design Information
------------------
Command Line   : map -p xc2v40-fg256-4 -cm area -k 4 -c 100 -tx off top.ngd
Target Device  : x2v40
Target Package : fg256
Target Speed   : -4
Mapper Version : virtex2 -- $Revision: 1.58 $
Mapped Date    : Tue Aug 21 09:42:20 2001
Design Summary
--------------

Map Report Example (continued)
Number of errors:      0
Number of warnings:    0
Number of Slices:                                182 out of 256   71%
Number of Slices containing unrelated logic:       0 out of 182    0%
Number of Slice Flip Flops:                      170 out of 512   33%
Total Number 4 input LUTs:                       248 out of 512   48%
   Number used as LUTs:                          167
   Number used as a route-thru:                   81
Number of bonded IOBs:                            26 out of  88   29%
Number of GCLKs:                                   1 out of  16    6%
Total equivalent gate count for design:  3,475
Additional JTAG gate count for IOBs:     1,248
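
The utilization percentages in this report can be reproduced with a few lines of Python, reading the figures as used-out-of-available pairs (182 of 256 slices, and so on). Truncating rather than rounding is an assumption that happens to match every figure shown; the dictionary keys are shortened labels chosen for this sketch.

```python
# (used, available) pairs read from the Design Summary above.
USAGE = {
    "Slices": (182, 256),
    "Slice Flip Flops": (170, 512),
    "4 input LUTs": (248, 512),
    "Bonded IOBs": (26, 88),
    "GCLKs": (1, 16),
}


def percent_used(used, available):
    """Integer utilization percentage (truncated, an assumption about the tool)."""
    return used * 100 // available


for name, (used, available) in USAGE.items():
    print(f"{name}: {used} out of {available} = {percent_used(used, available)}%")

# The LUT total is the sum of LUTs used as logic and as route-thrus.
LUTS_AS_LOGIC, LUTS_AS_ROUTE_THRU = 167, 81
```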

Place & Route Reports
Place & Route Report contents:
Command line options for the par program.
Errors and warnings.
Device utilization summary, similar to the Design Summary from the Map Report.
Unrouted nets.
Timing summary: statistics on average routing delays, and performance versus constraints if the design contains timing constraints.

Timing Reports
Timing Report contents (for designs with constraints):
Command line options for the trce program.
Timing Constraints section: a summary of each timing constraint and details on paths that fail to meet constraints.
Data Sheet section: setup/hold, clock-to-pad, timing between clock domains, and pad-to-pad delay information, organized in an easy-to-read table format.
Timing Summary section: number of errors and Timing Score, and constraint coverage.

Timing Report Example
Release 4.1i - Trace E.30
Copyright (c) 1995-2001 Xilinx, Inc. All rights reserved.
trce -e 3 -l 3 -xml top top.ncd -o top.twr top.pcf
Design file:              top.ncd
Physical constraint file: top.pcf
Device,speed:             xc2v40,-4 (ADVANCED 1.85 2001-07-24)
Report level:             error report
--------------------------------------------------------------------------------
WARNING:Timing - No timing constraints found, doing default enumeration.
================================================================================
Timing constraint: Default period analysis
 8292 items analyzed, 0 timing errors detected.
 Minimum period is 8.852ns.
 Maximum delay is 11.830ns.
--------------------------------------------------------------------------------

Timing Report Example (continued)
All constraints were met.
Data Sheet report:
-----------------
All values displayed in nanoseconds (ns)
Clock FiftyM_clk to Pad
---------------+------------+
               | clk (edge) |
Destination Pad|   to PAD   |
---------------+------------+
EN             |   10.035(R)|
half1          |    9.465(R)|
half2          |    9.166(F)|
half3          |    9.740(R)|
half4          |    9.174(F)|
---------------+------------+
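
The minimum period in the example report (8.852 ns) converts directly into a maximum clock frequency, and comparing it with a required period gives the slack. The sketch below shows that arithmetic; the 10 ns requirement is a hypothetical value used only to illustrate the slack calculation.

```python
def max_frequency_mhz(min_period_ns):
    """Convert a minimum clock period in ns into a maximum frequency in MHz."""
    return 1000.0 / min_period_ns


def slack_ns(required_period_ns, min_period_ns):
    """Positive slack means the design meets the required period."""
    return required_period_ns - min_period_ns


MIN_PERIOD_NS = 8.852            # from the example report
FMAX_MHZ = max_frequency_mhz(MIN_PERIOD_NS)

HYPOTHETICAL_REQUIREMENT_NS = 10.0   # assumed requirement, not from the report
SLACK_NS = slack_ns(HYPOTHETICAL_REQUIREMENT_NS, MIN_PERIOD_NS)

print(f"Fmax is about {FMAX_MHZ:.2f} MHz; slack against 10 ns is {SLACK_NS:.3f} ns")
```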

Without Timing Constraints
This design had no timing constraints or pin assignments entered when it was implemented. Note the logical structure of the placement and pins. Xilinx recommends that you compile your design at least once without timing constraints or pin assignments. This design has a maximum system clock frequency of 50 MHz.

With Timing Constraints
This is the same design with three global timing constraints entered with the Constraints Editor. It has a maximum system clock frequency of 60 MHz. Note how most of the logic is placed closer to the edge of the device, where the pins have been placed.

Period Constraint
In this example, the Period constraint optimizes all delay paths between flip-flops. The Period constraint does NOT optimize delay paths from input pads to output pads (purely combinatorial), paths from input pads to flip-flops, or paths from flip-flops to output pads.
[Figure: inputs ADATA, CDATA, and BUS[7..0]; CLK distributed through a BUFG to flip-flops FLOP1 to FLOP5; outputs OUT1 and OUT2; shaded blocks are combinatorial logic]

The Period Constraint
A synchronous element is a flip-flop, a latch, or a synchronous RAM. The Period constraint covers paths between synchronous elements that are clocked by the reference net. Synchronous elements are grouped by the clock signal that drives them. This is called forward propagation, and it lets a single constraint cover large pieces of logic.

Offset Constraint
In this example, the Offset constraint optimizes delay paths from input pads to flip-flops (Offset In) and paths from flip-flops to output pads (Offset Out).
[Figure: the same circuit as in the Period Constraint example, with inputs ADATA, CDATA, and BUS[7..0], CLK through a BUFG, five flip-flops, and outputs OUT1 and OUT2; the Offset In and Offset Out paths are marked]

The Offset Constraint
The Offset constraint covers paths from input pads to synchronous elements clocked by the reference net (Offset In), and paths from synchronous elements clocked by the reference net to output pads (Offset Out). Note that this constraint does not cover paths between synchronous elements or paths from pads to pads (purely combinatorial paths).
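
The path coverage described for the Period and Offset constraints can be summarized as a small lookup, sketched here in Python. The endpoint labels "pad" and "sync" and the function name are assumptions made for illustration; the result for pad-to-pad paths reflects that neither constraint covers them.

```python
# Endpoint kinds: "pad" for an I/O pad, "sync" for a flip-flop, latch,
# or synchronous RAM clocked by the reference net.
COVERAGE = {
    ("sync", "sync"): "Period",
    ("pad", "sync"): "Offset In",
    ("sync", "pad"): "Offset Out",
    ("pad", "pad"): None,  # purely combinatorial: covered by neither constraint
}


def covering_constraint(source, destination):
    """Return which of the two global constraints covers a path, if any."""
    return COVERAGE[(source, destination)]


for path in COVERAGE:
    print(path, "->", covering_constraint(*path))
```

Taken together, Period plus Offset In and Offset Out cover every path that starts or ends at a synchronous element, which leaves only pad-to-pad paths unconstrained.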