Core Facts. Documentation Design File Formats. Verification Instantiation Templates Reference Designs & Application Notes Additional Items

Similar documents
Core Facts. Documentation Design File Formats. Verification Instantiation Templates Reference Designs & Application Notes Additional Items

User Manual for FC100

Core Facts. Documentation Design File Formats. Verification Instantiation Templates Reference Designs & Application Notes Additional Items

Documentation. Design File Formats. Constraints Files. Verification. Slices 1 IOB 2 GCLK BRAM

Documentation Design File Formats

FFT/IFFTProcessor IP Core Datasheet

logibayer.ucf Core Facts

Core Facts. Documentation. Encrypted VHDL Fallerovo setaliste Zagreb, Croatia. Verification. Reference Designs &

Core Facts. Documentation. Design File Formats. Simulation Tool Used Designed for interfacing configurable (32 or 64

AN 464: DFT/IDFT Reference Design

Fast Fourier Transform IP Core v1.0 Block Floating-Point Streaming Radix-2 Architecture. Introduction. Features. Data Sheet. IPC0002 October 2014

Support Triangle rendering with texturing: used for bitmap rotation, transformation or scaling

LogiCORE IP Multiply Adder v2.0

April 7, 2010 Data Sheet Version: v4.00

LogiCORE IP Fast Fourier Transform v7.1

SHA Core, Xilinx Edition. Core Facts

April 6, 2010 Data Sheet Version: v2.05. Support 16 times Support provided by Xylon

An Efficient Architecture for Ultra Long FFTs in FPGAs and ASICs

LogiCORE IP Fast Fourier Transform v7.1

FFT MegaCore Function User Guide

LogiCORE IP Initiator/Target v5 and v6 for PCI-X

Single Channel HDLC Core V1.3. LogiCORE Facts. Features. General Description. Applications

High-Performance 16-Point Complex FFT Features 1 Functional Description 2 Theory of Operation

White Paper. Floating-Point FFT Processor (IEEE 754 Single Precision) Radix 2 Core. Introduction. Parameters & Ports

An Efficient Architecture for Ultra Long FFTs in FPGAs and ASICs

January 19, 2010 Product Specification Rev1.0. Core Facts. Documentation Design File Formats. Slices 1 BUFG/

32 Channel HDLC Core V1.2. Applications. LogiCORE Facts. Features. General Description. X.25 Frame Relay B-channel and D-channel

ISim Hardware Co-Simulation Tutorial: Accelerating Floating Point FFT Simulation

ISim Hardware Co-Simulation Tutorial: Accelerating Floating Point Fast Fourier Transform Simulation

FPGA Matrix Multiplier

Virtex-5 GTP Aurora v2.8

32-Bit Initiator/Target v3 & v4 for PCI

Documentation. Implementation Xilinx ISE v10.1. Simulation

FFT MegaCore Function User Guide

Discontinued IP. Distributed Memory v7.1. Functional Description. Features

Data Side OCM Bus v1.0 (v2.00b)

Basic Xilinx Design Capture. Objectives. After completing this module, you will be able to:

FFT Compiler IP Core User Guide

LogiCORE IP FIFO Generator v6.1

7 Series FPGAs Memory Interface Solutions (v1.9)

Block RAM. Size. Ports. Virtex-4 and older: 18Kb Virtex-5 and newer: 36Kb, can function as two 18Kb blocks

10GBase-R PCS/PMA Controller Core

Analysis of Radix- SDF Pipeline FFT Architecture in VLSI Using Chip Scope

High-Performance DDR3 SDRAM Interface in Virtex-5 Devices Author: Adrian Cosoroaba

24K FFT for 3GPP LTE RACH Detection

LogiCORE IP Serial RapidIO v5.6

Discontinued IP. Verification

LogiCORE IP Device Control Register Bus (DCR) v2.9 (v1.00b)

FPGA Implementation of Discrete Fourier Transform Using CORDIC Algorithm

algotronix Calton Road Edinburgh, Scotland United Kingdom, EH8 8JQ Phone: URL:

Utility Reduced Logic (v1.00a)

An Overview of a Compiler for Mapping MATLAB Programs onto FPGAs

1024-Point Complex FFT/IFFT V Functional Description. Features. Theory of Operation

Channel FIFO (CFIFO) (v1.00a)

AXI4-Stream Infrastructure IP Suite

LogiCORE IP AXI Video Direct Memory Access (axi_vdma) (v3.01.a)

FPGA Implementation and Validation of the Asynchronous Array of simple Processors

LogiCORE IP ChipScope Pro Integrated Controller (ICON) (v1.05a)

Asynchronous FIFO V3.0. Features. Synchronization and Timing Issues. Functional Description

SDACCEL DEVELOPMENT ENVIRONMENT. The Xilinx SDAccel Development Environment. Bringing The Best Performance/Watt to the Data Center

Utility Bus Split (v1.00a)

Gamma Corrector IP Core User Guide

ISim Hardware Co-Simulation Tutorial: Accelerating Floating Point FFT Simulation

ISSN Vol.02, Issue.11, December-2014, Pages:

FlexRIO. FPGAs Bringing Custom Functionality to Instruments. Ravichandran Raghavan Technical Marketing Engineer. ni.com

LogiCORE IP Mailbox (v1.00a)

3GPP Turbo Encoder v4.0

Virtex-6 FPGA Embedded Tri-Mode Ethernet MAC Wrapper v1.4

AT40K FPGA IP Core AT40K-FFT. Features. Description

Table 1: Example Implementation Statistics for Xilinx FPGAs

Multi-Gigahertz Parallel FFTs for FPGA and ASIC Implementation

LogiCORE IP AXI Video Direct Memory Access (axi_vdma) (v3.00.a)

Image Compression System on an FPGA

Graphics Controller Core

LogiCORE IP AXI DataMover v3.00a

White Paper Taking Advantage of Advances in FPGA Floating-Point IP Cores

Digital Signal Processing for Analog Input

Color Space Converter

Stratix II vs. Virtex-4 Performance Comparison

LogiCORE IP AXI Master Lite (axi_master_lite) (v1.00a)

Virtex-5 FPGA Embedded Tri-Mode Ethernet MAC Wrapper v1.7

Basic FPGA Architectures. Actel FPGAs. PLD Technologies: Antifuse. 3 Digital Systems Implementation Programmable Logic Devices

Introduction to Partial Reconfiguration Methodology

ISE Design Suite Software Manuals and Help

Fibre Channel Arbitrated Loop v2.3

AccelDSP Synthesis Tool

Synthesis Options FPGA and ASIC Technology Comparison - 1

Virtual Input/Output v3.0

Agenda. Introduction FPGA DSP platforms Design challenges New programming models for FPGAs

FPGA Architecture Overview. Generic FPGA Architecture (1) FPGA Architecture

LUTs. Block RAMs. Instantiation. Additional Items. Xilinx Implementation Tools. Verification. Simulation

AES Core Specification. Author: Homer Hsing

Synchronous FIFO V3.0. Features. Functional Description

Instantiation. Verification. Simulation. Synthesis

Sine/Cosine Look-Up Table v4.2. Features. Functional Description. θ = THETA radians 2 2 OUTPUT_WIDTH-1 OUTPUT_WIDTH-1

SHA3 Core Specification. Author: Homer Hsing

High Performance Pipelined Design for FFT Processor based on FPGA

Supported Device Family (1) Supported User Interfaces. Simulation Models Supported S/W Drivers. Simulation. Notes:

FFT MegaCore Function User Guide

TSEA44 - Design for FPGAs

Transcription:

(FFT_PIPE) Product Specification Dillon Engineering, Inc. 4974 Lincoln Drive Edina, MN USA, 55436 Phone: 952.836.2413 Fax: 952.927.6514 E mail: info@dilloneng.com URL: www.dilloneng.com Core Facts Documentation Design File Formats Constraints Files Verification Instantiation Templates Reference Designs & Application Notes Additional Items Provided with Core Features Simulation Tool Used Radix 2 Fast Fourier Transform (FFT) with Aldec Riviera 2008.06 pipelined butterfly rank structure Support IEEE 754 Floating Point data Support Provided by Dillon Engineering, Inc. - Uses Xilinx Coregen math operators - Customizable precision, speed, and size - Any width fixed point builds also available Run time selectable length N=32 to 2 m, (m=5 16) - See Table 1 for example 2 m max length builds - Length up to N=64K (m=16) possible in Virtex 5 SX240 Run time selectable Forward/Inverse transform mode Continuous processing at speeds up to Fmax (see Table 1). - Data rate of 250MSamples/sec in Virtex 5. Natural order inputs and outputs Includes C/C++ bit accurate model and data generator - Model also usable from MATLAB Includes Verilog testbench and run scripts User Guide ISE Project with EDIF/NGC netlist, Verilog source available for extra cost.ucf constraints Verilog Testbench, Test Vectors Verilog None C/C++ Model Table 1: Example Implementation Statistics for Xilinx FPGAs Family Example Device FFT Length Fmax MULT/ DCM & Design (MHz) Slice FF 1 Slice LUT 1 IOB 2 GCLK BRAM DSP48/E MGT Tools Spartan 3A XC3SD3400A 5 256 150 23,867 30,143 137 1 16 96 N/A ISE 10.1.02 Virtex 4 XC4VSX55 12 1024 200 21,585 26,079 137 1 26 352 N/A ISE 10.1.02 Virtex 5 XC5VSX50T 3 1024 250 20,562 20,333 137 1 19 176 N/A ISE 10.1.02 Virtex 5 XC5VSX95T 2 16,384 200 27,799 29,163 137 1 109 256 N/A ISE 10.1.02 Notes: 1) Actual slice count dependent on percentage of unrelated logic see Mapping Report File for details 2) Assuming all core I/Os and clocks are routed off chip. October 14, 2008 2008 Xilinx, Inc. All rights reserved. XILINX, the Xilinx Logo, and other designated brands included herein are trademarks of Xilinx, Inc.

Figure 1: Pipelined FFT Block Diagram General Description Dillon Engineering's Pipelined Floating Point FFT IP Core uses a modular radix 2 Fast Fourier Transform (FFT) architecture to provide discrete transforms on data frames or continuous data streams, with sample rate up to the maximum clock frequency. This efficient structure employs a single butterfly and a single delay feedback path per rank for low localized memory usage. True IEEE 754 floating point data maintained throughout, supporting a large dynamic range of data without requiring complicated fixedpoint analysis. The standard Pipelined IP Core is easily scalable to any Xilinx device and customizable to suit many FFT applications. Functional Description Frame The Frame blocks use control signaling to delimit discrete data frames per the selected transform length. Bit Reverse The Bit Reverse block converts natural order inputs to bit reversed order as required by the FFT engine. Pipelined Ranks The Pipelined Rank blocks daisy chain the FFT processing from input to output. Each rank is optimized to contain the proper radix 2 butterfly math elements, twiddle factor ROMs, and local datapath memories for efficient continuous processing. Variable Length Select The Variable Length Select block multiplexes the rank outputs for variable transform length support. Applications The Pipelined Floating Point FFT IP Core is useful in High Performance Embedded Computing (HPEC) applications which require continuous Digital Signal Processing (DSP) at high sample rates. Floating point FFT hardware acceleration or co processing is often a goal of scientific algorithms used in High Performance Computing (HPC). End applications and markets include radar, sonar, spectral analysis, telecommunications and image processing. 2

Core Modifications The standard IP Core is available in netlist or parameterized source code and supports the following: Netlist builds for any Xilinx device. FFT length and speed depend on chip resources and speed grade. Per transform length selectable in powers of 2 from 32 to 2 m points, where m=5 16. Per transform mode selectable between Forward and Inverse FFT. Static length and mode configuration. Pipeline must be clear before changing these configuration settings. IEEE 754 single precision floating point math operators using Xilinx Coregen full DSP usage/maximum latency floating_point_v4_0 cores. Decimation in time (DIT) algorithm with internal bit reversal, providing natural order data inputs and outputs. Potential customized deliveries from Dillon Engineering include: Fixed single length of 2 m for a slight logic savings over run time selectable length. Fixed Forward or Inverse mode for a slight logic savings over run time selectable mode. Pipelined configuration settings. Allows dynamic mode and/or length switching on back to back transforms. Bit reversal stage removed for a slight logic savings and elimination of a BlockRAM FIFO and associated latency. (Note data must then be input in bit reversed order to provide natural order outputs.) Decimation in frequency (DIF) build option, which inputs data in natural order and outputs data in bit reversed order. Any Xilinx Floating Point operator adjustments to precision and latencies, with logic parameter settings to match. Xilinx Floating Point operators are built separately with Coregen, providing RTL source and.ngc netlists. Thus all trade offs between speed, number of pipeline stages, DSP48/Mult macro usage, double or custom precision float, etc., can be supported. Any width fixed point math operators in lieu of floating point. Options for various scaling, rounding and saturation modes, all matched bit accurate with the C/C++ model. Contact Dillon Engineering for more details. Core I/O Signals The core signal I/O have not been fixed to specific device pins to provide flexibility for interfacing with user logic. Descriptions of all signal I/O are provided in Table 2. Table 2: Core I/O Signals. Signal Signal Direction Description CLK Input Clock Input. Single source used for all I/O and internal clocking. RST_N Input Active low asynchronous reset. Resets all control logic. 3

DIR Input Transform mode select. 0 = Forward FFT, 1 = Inverse FFT. SEL[3:0] Input Transform length select. Valid range is from 4'd5 (indicating transform length of 32) up to the maximum length supported by the build (e.g. 4'd10 for a transform length of 1024). Number of SEL bits is dependent on the maximum length. SYNC_IN Input Input sync strobe. Indicates to the core to begin processing i_data on the following clock cycle. A[63:0] Input Input data. Complex data of the form R + iq, where R is contained in bits 63:32 and Q is contained in bits 31:0, each a single precision floating point number. SYNC_OUT Output Output sync strobe. Indicates the core is sending processed o_data beginning on the following clock cycle. X[63:0] Output Output data. Complex data of the form R + iq, where R is contained in bits 63:32 and Critical Signal Descriptions Q is contained in bits 31:0, each a single precision floating point number. All interface and internal operation of the core is synchronous to CLK. Simple SYNC strobes are used on the input and output interfaces to signal that data is valid on the following clock cycle. An active SYNC coinciding with the last data point thus indicates back to back transforms. A SYNC_IN strobe active while the core is already inputing data is ignored. Tying SYNC_IN active will signal the core to perform continuous transforms, and SYNC_OUT will strobe as normal to frame the output data. Figure 2: Interface Input Timing, 1K Length Back to Back Transforms Figure 3: Interface Output Timing, 1K Length Back to Back Transforms The DIR and SEL configuration inputs are by default selectable per transform, but must be stable starting with SYNC_IN active and must not be changed until the transformed data has been completely output from the core (i.e. 2 m clocks after the corresponding SYNC_OUT). Core Assumptions Following SYNC_IN, the initial transform has a start up latency dependent on the bit reversal stage, the floating point core latencies and the length of the transform. The core provides continuous processing at steady state, though the SYNC IN to OUT latencies may vary slightly due to internal pipeline alignment. 4

The standard core with transform length of 1024 has a start up latency of around 2300 clock cycles, or 9.2usec at 250MHz clock rate. Latencies of other lengths 2 m follow approximately the formula: Latency (in clock cycles) = (2 x 2 m ) + (m x ((2 x fp_add_delay) + fp_mult_delay)) Verification Methods The core is verified to be bit accurate with the C/C++ data model under all supported lengths, modes, throughputs and data format, using a rigorous simulation suite of directed and random data. Our model development is evaluated in terms of SQNR with a double precision floating point software FFT implementation. Dillon Engineering's FFT IP Cores have been proven over the years in many Xilinx designs. Ordering Information This product is available directly from Dillon Engineering, Inc. Please contact Dillon Engineering for pricing and additional information about this product using the contact information on the front page of this datasheet. Visit www.dilloneng.com/fft_ip to see all of Dillon Engineering's FFT IP offerings, including: UltraLong FFTs (up to 64M points, fixed or floating point) Parallel Butterfly FFTs (continuous FFTs at multiple points per clock cycle) Full Parallel FFTs (extremely fast rates, up to 25GSamples/sec) 2D FFTs (Two dimensional transform for image processing) Mixed Radix FFTs (for non power of 2 FFT lengths) Related Information Xilinx Programmable Logic For information on Xilinx programmable logic or development system software, contact your local Xilinx sales office, or: Xilinx, Inc. 2100 Logic Drive San Jose, CA 95124 Phone: +1 408 559 7778 Fax: +1 408 559 7114 URL: www.xilinx.com 5