Signal Processing Algorithms into Fixed Point FPGA Hardware Dennis Silage ECE Temple University

Signal Processing Algorithms into Fixed Point FPGA Hardware Dennis Silage silage@temple.edu ECE Temple University www.temple.edu/scdl

Signal Processing Algorithms into Fixed Point FPGA Hardware: Motivation; Development of Programmable Logic; Algorithms into Fixed Point; The MathWorks Fixed Point Designer; Fixed Point Design Project Example; FPGA Synthesis; SCDL.

Motivation: Why use FPGAs in digital signal processing? Application-specific performance in a non-ASIC configuration; low power consumption and heat dissipation; vector and parallel processing.

Motivation: Why use fixed-point signal processing in FPGAs? No register size restriction; no possible overflow as in fixed register processing; synthesis (put-and-place) may be augmented; high-speed integer arithmetic.

Development of Programmable Logic: DEC PDP-8/E and ENIAC.

Development of Programmable Logic: The earliest programmable ICs (1978) were the Programmable Array Logic (PAL) devices, such as the PAL16L8.

Development of Programmable Logic: Combinational logic and registered-output PAL devices were then configured as macrocells (1983), as in the PAL22V10.

Development of Programmable Logic: Complex Programmable Logic Devices (CPLD) combined several macrocells and a programmable interconnection system (1988). Example: Xilinx XC2Cxx CPLD.

Development of Programmable Logic: As an alternative architecture, the early FPGA (1994) provided an array of configurable logic blocks (CLB) dominated by a complex routing scheme. Example: Xilinx Spartan FPGA.

Development of Programmable Logic: The Xilinx Spartan-IIE FPGA (1998) utilizes a matrix of routing channels surrounding the CLBs and input-output blocks (IOB) for fine-grained interconnection.

Development of Programmable Logic: The Xilinx Spartan-3E FPGA (2003) had routing channels surrounding the CLBs but included coarse-grained structures such as block RAM, digital clock managers (DCM) and multipliers.

Development of Programmable Logic: Xilinx FPGAs have now split into two families: the low-cost Spartan series and the high-performance Virtex, Kintex and Artix series.

Development of Programmable Logic: Except for the first of the Virtex series, all the Xilinx FPGAs feature a coarse-grained architecture: multipliers in hardware (DSP48), block RAM with variable size registers, and a digital clock manager (DCM).

Development of Programmable Logic: Although applications in DSP and digital image processing (DIP) are facilitated, the coarse-grained architecture has drawbacks for other applications: hardware multipliers consume power; block RAM may be an impediment in some applications; synthesis (put-and-place) may be difficult because of the coarse grain.

Development of Programmable Logic: The first in the Xilinx Virtex series (1999) was a sea of CLBs. However, the largest of the class was the XCV1000, with an array of only 64 by 96 (6144) 2-slice CLBs and a 200 MHz clock. (Figure: Virtex I 2-slice CLB.)

Development of Programmable Logic: The configuration of the CLB remains the same today, with only increases in width. Virtex I has a 4-input look-up table (LUT) for combinational logic. Virtex I has no hardware multipliers but some block RAM (128 Kbits).

Development of Programmable Logic: The XCV1000 Virtex I had 512 IO pins and was designed for integration.

Development of Programmable Logic: The versatility of the Virtex I lies in its sea of CLBs, although the array is small by the standards of today.

Development of Programmable Logic: The largest Virtex 7 has 178 000 2-slice CLBs, roughly a 30-fold increase over the XCV1000, but also 3360 hardware multipliers and 68 Mbit of block RAM in its coarse-grained architecture.

Development of Programmable Logic: Although it is over 18 years old, the Virtex I XCV1000 is still used by the SCDL for proof of concept because of its ease of synthesis (put-and-place) in vector operations and its IO capabilities for performance verification. www.temple.edu/scdl

Algorithms into Fixed Point: FPGA designs implemented in fixed point will always be more efficient than their floating-point equivalents because fixed-point implementations consume fewer resources and less power.

Algorithms into Fixed Point: Fixed-point computer arithmetic references and research form the basis for implementation.

Algorithms into Fixed Point: Discrete impulse applied to both FIR filters.
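
The impulse comparison on this slide can be illustrated with a minimal Python sketch. It is not the author's filter: the tap values, the 10-bit fraction length and the helper names (quantize, fir_float, fir_fixed) are all assumptions chosen for illustration. Applying a discrete impulse returns the filter coefficients themselves, so the difference between the two outputs shows the coefficient quantization error directly.

```python
def quantize(x, frac_bits):
    """Round x to the nearest multiple of 2**-frac_bits and return the integer code."""
    return int(round(x * (1 << frac_bits)))

def fir_float(coeffs, samples):
    """Direct-form FIR filter in floating point."""
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                acc += c * samples[n - k]
        out.append(acc)
    return out

def fir_fixed(coeff_codes, sample_codes):
    """Direct-form FIR filter on integer codes with a full-width accumulator."""
    out = []
    for n in range(len(sample_codes)):
        acc = 0
        for k, c in enumerate(coeff_codes):
            if n - k >= 0:
                acc += c * sample_codes[n - k]
        out.append(acc)
    return out

FRAC_BITS = 10                          # assumed fraction length
coeffs = [0.1, 0.25, 0.3, 0.25, 0.1]    # illustrative taps, not from the slides
impulse = [1.0] + [0.0] * 7             # discrete impulse

y_float = fir_float(coeffs, impulse)

coeff_codes = [quantize(c, FRAC_BITS) for c in coeffs]
impulse_codes = [quantize(s, FRAC_BITS) for s in impulse]
y_codes = fir_fixed(coeff_codes, impulse_codes)

# A product of two values with FRAC_BITS fraction bits has 2 * FRAC_BITS fraction bits.
y_fixed = [code / (1 << (2 * FRAC_BITS)) for code in y_codes]
max_error = max(abs(a - b) for a, b in zip(y_float, y_fixed))
```

With round-to-nearest quantization, max_error stays within half of one least significant bit of the coefficient format.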

Algorithms into Fixed Point: Dynamic power saving >80%.

Algorithms into Fixed Point: CORDIC (COordinate Rotation DIgital Computer), or Volder's algorithm, is a simple and efficient process to calculate hyperbolic and trigonometric functions, typically converging with one digit (or bit) per iteration.

Algorithms into Fixed Point: CORDIC implementations using fixed-point arithmetic are attractive as they can exhibit high performance and low resource usage.
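
As a rough illustration of why CORDIC suits fixed-point hardware, the following Python sketch computes sine and cosine in rotation mode using only integer shifts, adds and a small arctangent table. It covers the circular (trigonometric) case only, not the hyperbolic one. The 16-bit fraction length, the iteration count and the function name cordic_sin_cos are assumptions for illustration.

```python
import math

FRAC_BITS = 16              # assumed fraction length
ITERATIONS = 16             # roughly one bit of accuracy per iteration
ONE = 1 << FRAC_BITS

# Arctangent table atan(2**-i), stored as integer codes.
ATAN_TABLE = [int(round(math.atan(2.0 ** -i) * ONE)) for i in range(ITERATIONS)]

# CORDIC gain compensation: product of 1 / sqrt(1 + 2**(-2i)).
GAIN = 1.0
for i in range(ITERATIONS):
    GAIN /= math.sqrt(1.0 + 2.0 ** (-2 * i))
GAIN_CODE = int(round(GAIN * ONE))

def cordic_sin_cos(angle):
    """Return (sin, cos) of angle in radians, for |angle| <= pi/2."""
    x = GAIN_CODE           # pre-scaled unit vector along the x axis
    y = 0
    z = int(round(angle * ONE))   # residual angle still to be rotated
    for i in range(ITERATIONS):
        if z >= 0:
            x, y, z = x - (y >> i), y + (x >> i), z - ATAN_TABLE[i]
        else:
            x, y, z = x + (y >> i), y - (x >> i), z + ATAN_TABLE[i]
    return y / ONE, x / ONE

sine, cosine = cordic_sin_cos(math.pi / 6)
```

In hardware the table and the gain constant would be precomputed, so each iteration costs two shifts and three additions or subtractions.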

Algorithms into Fixed Point Contemporary research:

MathWorks Fixed Point Designer: Provides data types and tools for developing fixed-point algorithms to optimize performance on embedded hardware. Analyzes the design and proposes data types and attributes such as word length and scaling.

MathWorks Fixed Point Designer: Performs bit-true simulations to observe the impact of limited range and precision. Converts double-precision algorithms to fixed point. Specifies data attributes such as rounding mode and overflow action.
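
The effect of rounding mode and overflow action can be seen in a small Python sketch. This is not the Fixed Point Designer API: the function to_fixed, its parameter names and the two modes offered for each attribute are hypothetical, chosen only to show how the same real value maps to different signed fixed-point codes.

```python
import math

def to_fixed(value, word_length, frac_length, rounding="nearest", overflow="saturate"):
    """Quantize value to a signed fixed-point code; return (code, real_value)."""
    scaled = value * (1 << frac_length)
    if rounding == "nearest":
        code = math.floor(scaled + 0.5)     # ties round toward +infinity
    elif rounding == "floor":
        code = math.floor(scaled)           # truncate toward -infinity
    else:
        raise ValueError("unsupported rounding mode")

    lo = -(1 << (word_length - 1))          # most negative code
    hi = (1 << (word_length - 1)) - 1       # most positive code
    if overflow == "saturate":
        code = max(lo, min(hi, code))
    elif overflow == "wrap":
        code = (code - lo) % (1 << word_length) + lo   # two's-complement wraparound
    else:
        raise ValueError("unsupported overflow action")
    return code, code / (1 << frac_length)

# 8-bit word with 4 fraction bits: representable range is -8.0 to 7.9375.
in_range_nearest = to_fixed(3.3, 8, 4)
in_range_floor = to_fixed(3.3, 8, 4, rounding="floor")
too_big_saturated = to_fixed(9.0, 8, 4)
too_big_wrapped = to_fixed(9.0, 8, 4, overflow="wrap")
```

Saturation clamps 9.0 to the largest representable value, while wraparound turns it into a negative number, which is why the overflow action matters in a bit-true simulation.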

MathWorks Fixed Point Designer: Optimizes data types to meet numerical accuracy requirements and target hardware constraints. Compares fixed-point results with floating-point baselines and supports Verilog HDL code generation.

MathWorks Fixed Point Designer: The fixed-point design process starts with a floating-point algorithm for later verification.

MathWorks Fixed Point Designer: The Fixed Point Designer iteratively translates the floating-point algorithm to fixed point.

MathWorks Fixed Point Designer: The fixed-point algorithm is verified as an intermediary Register Transfer Language (RTL).

MathWorks Fixed Point Designer: The verified RTL fixed-point algorithm is implemented in an FPGA with either the MathWorks HDL Coder or the Xilinx System Generator.

MathWorks Fixed Point Designer: There are two design flows for conversion from floating point to fixed point in MATLAB/Simulink: automatic conversion with the Fixed-Point Converter App, and manual conversion with fi at the command line.

MathWorks Fixed Point Designer, Fixed Point Converter App: Verifies the full intended operating range of the algorithm using code coverage results; proposes fraction lengths based on default word lengths; proposes word lengths based on default fraction lengths; optimizes whole numbers.

MathWorks Fixed Point Designer, Fixed Point Converter App: Specifies safety margins for min/max data; displays a histogram of the bits that each variable uses.
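
The idea of proposing a fraction length from a default word length and a simulated min/max range, widened by a safety margin, can be sketched in Python. This is not the algorithm the Fixed Point Converter App uses: the function propose_fraction_length and the convention that a margin of 0.1 means 10 percent are assumptions for illustration.

```python
import math

def propose_fraction_length(sim_min, sim_max, word_length, safety_margin=0.0):
    """Propose a fraction length for a signed word that covers the simulated range."""
    # Widen the observed range by the safety margin (0.1 means 10 percent).
    peak = max(abs(sim_min), abs(sim_max)) * (1.0 + safety_margin)
    if peak == 0.0:
        return word_length - 1              # all bits except the sign are fractional
    # Integer bits needed so that peak < 2**int_bits (sign bit counted separately).
    int_bits = math.floor(math.log2(peak)) + 1
    return word_length - 1 - int_bits

# Simulated range of -3.7 to 2.0 in a 16-bit word.
without_margin = propose_fraction_length(-3.7, 2.0, 16)
with_margin = propose_fraction_length(-3.7, 2.0, 16, safety_margin=0.1)
```

In this example the 10 percent margin pushes the peak past 4.0, so one more integer bit is reserved and one fraction bit of precision is given up.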

MathWorks Fixed Point Designer, Fixed Point Converter App: Detects overflows; HDL Coder for FPGA synthesis.

MathWorks Fixed Point Designer, Fixed Point Converter App: Example: fixed-point parallel-form digital filter, with parallel execution in the FPGA.

MathWorks Fixed Point Designer, Fixed Point Converter App: Analysis results for the fixed-point parallel filter.

MathWorks Fixed Point Designer, Manual Conversion: fi constructs a fixed-point numeric object.

MathWorks Fixed Point Designer, Manual Conversion: However, fi requires more experience in its use.

Fixed Point Designer Project Example: Real-time calculation of the projection matrix for the Algebraic Reconstruction Technique (ART) in 3D tomographic chemical threat mapping using hyperspectral imaging (MESH, Inc.).

Fixed Point Designer Project Example: Path measurements from a limited number of sensors are used to produce a concentration map. ART is used for noisy, sparsely sampled data from a limited number of sensors (MESH, Inc.).

Fixed Point Designer Project Example: A path-integrated concentration length (CL) is determined from wind velocity and integration time. ART uses the projection matrix to produce the object concentration vector for mapping.

Fixed Point Designer Project Example: The object vector is solved for iteratively, adjusting the part that is affected by the current CL measurement on each iteration.
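
The row-by-row adjustment described here matches the classical ART (Kaczmarz) update, which can be sketched in Python as follows. This is not the project's implementation: it uses floating point for readability rather than the fixed-point FPGA arithmetic of the slides, and the 2 by 2 grid, the path weights, the relaxation parameter and the function name art_sweep are all hypothetical.

```python
def art_sweep(projection, measurements, estimate, relaxation=1.0):
    """One ART sweep: adjust the estimate once for each CL measurement."""
    x = list(estimate)
    for row, cl in zip(projection, measurements):
        norm_sq = sum(a * a for a in row)
        if norm_sq == 0.0:
            continue                        # path crosses no cells
        residual = cl - sum(a * xi for a, xi in zip(row, x))
        step = relaxation * residual / norm_sq
        # Only cells with a nonzero weight on this path are changed.
        x = [xi + step * a for xi, a in zip(x, row)]
    return x

# Hypothetical 4-cell grid crossed by 4 paths; each weight is the path length in a cell.
projection = [
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [1.0, 0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0, 1.0],
]
true_concentration = [1.0, 2.0, 3.0, 4.0]
measurements = [sum(a * c for a, c in zip(row, true_concentration)) for row in projection]

estimate = [0.0] * 4
for _ in range(200):
    estimate = art_sweep(projection, measurements, estimate)
```

Each update is a dot product followed by a scaled vector addition, which is the kind of multiply-accumulate work that maps onto parallel fixed-point hardware.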

Fixed Point Designer Project Example: The iterative adjustment is an intensive parallel operation, and the calculation is done in fixed point and in real time on an FPGA.

FPGA Synthesis: Synthesis of a digital logic system to meet functionality and timing constraints is only the first task.

FPGA Synthesis: Xilinx provides the Integrated Logic Analyzer (ILA), which samples logic signals at design speeds and stores them on-chip in block RAM (BRAM). Porting a signal out of the FPGA for an external logic analyzer can distort the timing.

FPGA Synthesis: However, the most vexing problem is the put-and-place hardware synthesis of the design because of the coarse-grained architecture of the FPGA. The Xilinx PlanAhead tool eases but does not completely eliminate this problem.

FPGA Synthesis: The coarse-grained architecture often thwarts put-and-place, leaving only a smaller percentage of FPGA resources available for implementation.

FPGA Synthesis: The fine-grained architecture of the Virtex I FPGA facilitates put-and-place.

System Chip Design Laboratory (www.temple.edu/scdl): The SCDL was established in 1999 with funding from the Western Design Center, the developer of the 65C02/65C816/65C832 microprocessors. (Image: SCDL in 1999.)

System Chip Design Laboratory: An integrated design environment (IDE) to facilitate the rapid hardware/software codesign of digital signal processor (DSP) or process control soft cores in Xilinx FPGAs.

System Chip Design Laboratory: The Xilinx System Generator and MATLAB/Simulink for Xilinx FPGA hardware-in-the-loop embedded processing.

System Chip Design Laboratory: Real-time benchmarking and performance of the Xilinx Zynq System-on-Chip (SoC) processor and bus. The Zynq integrates a dual-core ARM Cortex-A9 processor, FPGA fabric, hard-core peripherals and the AMBA bus.

System Chip Design Laboratory: Development of a hardware scheduler of multiple tasks in a hard real-time operating system (RTOS). Implementation in hardware of services provided by a software RTOS kernel, such as scheduling and inter-process communication.

System Chip Design Laboratory: Education in embedded system design using the Xilinx FPGA.

System Chip Design Laboratory: Education in embedded system design, now using the Xilinx Zynq SoC.