Chapter 3 Classification of FFT Processor Algorithms

Similar documents
EE123 Digital Signal Processing

Fast Fourier Transform (FFT) Algorithms

Outline. Applications of FFT in Communications. Fundamental FFT Algorithms. FFT Circuit Design Architectures. Conclusions

Fast Fourier Transform (FFT)

The following algorithms have been tested as a method of converting an I.F. from 16 to 512 MHz to 31 real 16 MHz USB channels:

Spectral leakage and windowing

Computation of DFT. point DFT. W has a specific structure and because

Filter design. 1 Design considerations: a framework. 2 Finite impulse response (FIR) filter design

1.2 Binomial Coefficients and Subsets

APPLICATION NOTE PACE1750AE BUILT-IN FUNCTIONS

Computation of DFT. W has a specific structure and because

Lecture 1: Introduction and Strassen s Algorithm

CIS 121 Data Structures and Algorithms with Java Fall Big-Oh Notation Tuesday, September 5 (Make-up Friday, September 8)

Creating Exact Bezier Representations of CST Shapes. David D. Marshall. California Polytechnic State University, San Luis Obispo, CA , USA

INTERNATIONAL JOURNAL OF ADVANCED RESEARCH IN ENGINEERING AND TECHNOLOGY (IJARET) A NEW RADIX-4 FFT ALGORITHM

Design of Efficient Pipelined Radix-2 2 Single Path Delay Feedback FFT

FPGA Implementation of Pipeline Digit-Slicing Multiplier-Less Radix 2 2 DIF SDF Butterfly for Fast Fourier Transform Structure

EE 459/500 HDL Based Digital Design with Programmable Logic. Lecture 13 Control and Sequencing: Hardwired and Microprogrammed Control

Elementary Educational Computer

Linear Time-Invariant Systems

Project 2.5 Improved Euler Implementation

Polynomial Functions and Models. Learning Objectives. Polynomials. P (x) = a n x n + a n 1 x n a 1 x + a 0, a n 0

Major CSL Write your name and entry no on every sheet of the answer script. Time 2 Hrs Max Marks 70

The number n of subintervals times the length h of subintervals gives length of interval (b-a).

How do we evaluate algorithms?

Introduction to FFT Processors

Area As A Limit & Sigma Notation

A New Morphological 3D Shape Decomposition: Grayscale Interframe Interpolation Method

CIS 121 Data Structures and Algorithms with Java Spring Stacks, Queues, and Heaps Monday, February 18 / Tuesday, February 19

CSC 220: Computer Organization Unit 11 Basic Computer Organization and Design

Lecture 5. Counting Sort / Radix Sort

Ones Assignment Method for Solving Traveling Salesman Problem

COMP Parallel Computing. PRAM (1): The PRAM model and complexity measures

FFT: An example of complex combinational circuits

. Written in factored form it is easy to see that the roots are 2, 2, i,

EE University of Minnesota. Midterm Exam #1. Prof. Matthew O'Keefe TA: Eric Seppanen. Department of Electrical and Computer Engineering

GPUMP: a Multiple-Precision Integer Library for GPUs

An Improved Shuffled Frog-Leaping Algorithm for Knapsack Problem

BOOLEAN MATHEMATICS: GENERAL THEORY

9.1. Sequences and Series. Sequences. What you should learn. Why you should learn it. Definition of Sequence

Matrix representation of a solution of a combinatorial problem of the group theory

Lecture 2. RTL Design Methodology. Transition from Pseudocode & Interface to a Corresponding Block Diagram

Pattern Recognition Systems Lab 1 Least Mean Squares

LU Decomposition Method

Evaluation scheme for Tracking in AMI

A Study on the Performance of Cholesky-Factorization using MPI

CSE 417: Algorithms and Computational Complexity

3D Model Retrieval Method Based on Sample Prediction

Recurrent Formulas of the Generalized Fibonacci Sequences of Third & Fourth Order

Improvement of the Orthogonal Code Convolution Capabilities Using FPGA Implementation

Big-O Analysis. Asymptotics

CIS 121 Data Structures and Algorithms with Java Spring Stacks and Queues Monday, February 12 / Tuesday, February 13

Improving Template Based Spike Detection

Appendix D. Controller Implementation

Lifting a butterfly A component-based FFT

Cubic Polynomial Curves with a Shape Parameter

Data Structures and Algorithms. Analysis of Algorithms

Optimal Mapped Mesh on the Circle

CHAPTER IV: GRAPH THEORY. Section 1: Introduction to Graphs

Lecturers: Sanjam Garg and Prasad Raghavendra Feb 21, Midterm 1 Solutions

FPGA IMPLEMENTATION OF BASE-N LOGARITHM. Salvador E. Tropea

The isoperimetric problem on the hypercube

Analysis Metrics. Intro to Algorithm Analysis. Slides. 12. Alg Analysis. 12. Alg Analysis

Introduction to FFT Processors. Chih-Wei Liu VLSI Signal Processing Lab Department of Electronics Engineering National Chiao-Tung University

An Efficient Algorithm for Graph Bisection of Triangularizations

Load balanced Parallel Prime Number Generator with Sieve of Eratosthenes on Cluster Computers *

A Note on Least-norm Solution of Global WireWarping

Algorithm. Counting Sort Analysis of Algorithms

Python Programming: An Introduction to Computer Science

arxiv: v2 [cs.ds] 24 Mar 2018

A Very Simple Approach for 3-D to 2-D Mapping

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 4. The Processor. Part A Datapath Design

n n B. How many subsets of C are there of cardinality n. We are selecting elements for such a

Chapter 24. Sorting. Objectives. 1. To study and analyze time efficiency of various sorting algorithms

Sorting in Linear Time. Data Structures and Algorithms Andrei Bulatov

Neural Networks A Model of Boolean Functions

Reversible Realization of Quaternary Decoder, Multiplexer, and Demultiplexer Circuits

A Generalized Set Theoretic Approach for Time and Space Complexity Analysis of Algorithms and Functions

Counting Regions in the Plane and More 1

Examples and Applications of Binary Search

Running Time. Analysis of Algorithms. Experimental Studies. Limitations of Experiments

Math Section 2.2 Polynomial Functions

Math 3201 Notes Chapter 4: Rational Expressions & Equations

Running Time ( 3.1) Analysis of Algorithms. Experimental Studies. Limitations of Experiments

Analysis of Algorithms

NTH, GEOMETRIC, AND TELESCOPING TEST

Accuracy Improvement in Camera Calibration

Arithmetic Transforms for Verifying Compositions of Sequential Datapaths

top() Applications of Stacks

1. Introduction o Microscopic property responsible for MRI Show and discuss graphics that go from macro to H nucleus with N-S pole

Automatic Generation of Polynomial-Basis Multipliers in GF (2 n ) using Recursive VHDL

CSC165H1 Worksheet: Tutorial 8 Algorithm analysis (SOLUTIONS)

QMDD and Spectral Transformation of Binary and Multiple-Valued Functions *

Performance Plus Software Parameter Definitions

Asymmetrical Load-Balancing for Incremental Fast Fourier Transform on Multi-Core Processors

COMP 558 lecture 6 Sept. 27, 2010

BOOLEAN DIFFERENTIATION EQUATIONS APPLICABLE IN RECONFIGURABLE COMPUTATIONAL MEDIUM

Computer Science Foundation Exam. August 12, Computer Science. Section 1A. No Calculators! KEY. Solutions and Grading Criteria.

k (check node degree) and j (variable node degree)

The Nature of Light. Chapter 22. Geometric Optics Using a Ray Approximation. Ray Approximation

Transcription:

Chapter Classificatio of FFT Processor Algorithms The computatioal complexity of the Discrete Fourier trasform (DFT) is very high. It requires () 2 complex multiplicatios ad () complex additios [5]. As a alterative to direct computatio of the DFT, we ca use Fast Fourier Trasform (FFT) algorithms to efficietly calculate the DFT. May computatioally efficiet FFT algorithms have bee proposed. This chapter presets a brief collectio of the FFT algorithms ad their properties.. Discrete Fourier Trasform (DFT) Give a legth sequece x(), a DFT geerated complex sequece X(k) ca be defied as k Xk [ ] = x [ ] k=,,..., k where Twiddle Factor, = e 2π j k (.) If the periodic ad symmetric properties of the twiddle factor m are exploited, the the computatio of X[k] will be more efficiet. symmetric : periodic : m /2 m = = m m Two types of FFT algorithms are Decimatio i Time (DIT) ad Decimatio i Frequecy (DIF). The former is based o decompositio of the iput sequece ito smaller subsequeces ad the latter is based o dividig the output sequece ito smaller subsequeces [5]. The computatioal complexity of these two types are the

same. I this chapter, we oly itroduce the DIF FFT algorithms..2 Fixed Radix Algorithms.2. Radix2 DIF FFT Algorithm The radix2 algorithm [5] is the most commo fixed algorithm. This algorithm ca oly compute the data havig a power of 2 legth. The radix2 butterfly calculatio is very simple amog the overall class of FFT algorithms. Each butterfly stage reads two data iputs ad produces two outputs. The defiitio of a poit DFT is give agai as follows: ( /2) k /2 X( k) = x( ) x( ) k = k Xk [ ] = x [ ] k=,,..., ( /2) /2 k ( /2) k x ( ) x ( ) 2 The iput sequece, x() ca be separated ito the first half of the sequece ad the secod half. The mathematical expressios are defied i the followig equatios: where, = ( /2) /2 k ( /2) k k x( ) x ( ) 2 (.2) 2π j k ( /2) k 2 k = = cos( π ) si( π ) = ( ) e k j k e get X( k) = x( ) ( ) x( ) ( /2) /2 k k k 2 ( /2) k k = [ x( ) ( ) x( )] (.) 2 The eve output represeted i Equatio (.2) is obtaied by the summatio of the first half ad the secod half of the iput sequece for each of the /2 poit DFTs.

The odd output represeted i Equatio (.) is obtaied by the subtractio of the first half ad the secod half of the iput sequece for each of the /2 poit DFTs ad multiplyig this subtractive result by. For the first stage, 2r ad 2r is substituted for idex k i Equatio (.), we get the Equatio (.4) ad (.5). ( /2) X[2 r] = ( x [ ] x [ ( / 2)]) = ( /2) g [ ] ( /2) r /2 r /2 X[2r ] = ( x [ ] x [ ( / 2)]) = ( /2) h [ ] r /2 r /2 (.4) (.5) k Xk [ ] = x [ ] k=,,..., k=2r, k=2r ( /2) ([] x x [ ( /2)]) r /2 st stage ( /2) ( x [ ] x [ ( / 2)]) r /2 r=2l, r=2l r=2l, r=2l ( /4) ([] g g [ ( /4)]) l /4 ( /4) ([] g g [ ( /4)]) l 2d stage /2 /4 ( /4) ([] h h [ ( /4)]) l /4 ( /4) ([] h h [ ( /4)]) l /2 /4 l=2p, l=2p l=2p, l=2p l=2p, l=2p l=2p, l=2p A A2 B B2 rd stage C C2 D D2 Figure. Separatio procedure for radix2 algorithm Figure. shows the basic cocept of the separatio procedure for the radix2 algorithm. This procedure ca be applied recursively util log2 stages are produced. That is, the fial stage has bee reduced to 2poit DFT computatios. 2

Amog the stages, there is a /2 butterfly structure i each stage, as show i Figure.2. Radix2 butterfly elemet requires oly two complex adders ad oe complex multiplier. For this reaso, its hardware cost is very low. The FFT operatio has iputs ad outputs, so we eed registers to store all the output data from all the butterfly stage calculatios. If we use iplace computatio, it will ot be ecessary to use storage registers. To reduce registers, we reuse the same registers to store successive iteratios of the results of each butterfly. This kid of calculatio is called a iplace computatio [5]. The procedure show i Figure. is depicted with a complete =6poit example i Figure.. It is obvious that whe the iput sequeces of the DIF FFT algorithm are arraged i ormal, its output sequeces are arraged i bitreversed after usig the iplace computatio method. p m P q Q P= p q Q= ( p q) Figure.2 Butterfly structure of radix2 DIF FFT m

ormal x[] bit reversed x[] 2 4 4 2 x[2] 8 8 2 6 2 8 8 4 4 2 x[4] 6 2 2 6 4 x[] 6 4 2 x[] x[2] 4 6 8 x[] 5 6 8 2 x[] x[4] 6 6 2 8 4 7 6 8 4 2 Figure. Sigal flow graph of complete decompositio of a 6 poit radix2 DIF FFT 4

.2.2 Radix4 DIF FFT Algorithm Each radix4 butterfly stage reads four data iputs ad produces four data outputs. For this reaso, this algorithm provides higher computatio speed compared to the radix2 algorithm. Decomposig Equatio (.4) ad (.5) ito four groups of frequecy compoets: 4u, 4u, 4u2, 4u, the radix 4 algorithm [6] ca be obtaied. ( /4) u /4 (.6) X[4 u] = {( x [ ] x [ ( / 2)]) ( x [ / 4] x [ ( / 4)])} ( /4) X[4u ] = {( x [ ] x [ ( / 2)]) j( x [ / 4] x [ ( / 4)])} ( /4) X[4u 2] = {( x [ ] x [ ( / 2)]) ( x [ / 4] x [ ( / 4)])} ( /4) X[4u ] = {( x [ ] x [ ( / 2)]) j( x [ / 4] x [ ( / 4)])} 2 u /4 u /4 u /4 (.7) (.8) (.9) The butterfly sigal flow graph of Equatio (.6) to Equatio (.9) is depicted i Figure.4, where the iterval is ~(/4). e ca observe that the radix4 butterfly eeds oly the three complex multipliers,, ad eight complex adders 2 for each stage. Multiplicatio by the j term causes the real ad imagiary parts of the data to be exchaged. The sigal flow graph i Figure.5 is illustrated for the computatio of a6poit DFT usig the radix4 algorithm. x[] X[4u] x[/4] 2 X[4u2] x[2/4] X[4u] x[/4] j X[4u] Figure.4 Butterfly structure for radix4 algorithm 5

radix4 FFT radix4 FFT ormal x[] 6 4 bit reversed x[] x[] x[2] x[] x[4] j j j j 6 6 6 6 2 6 4 6 6 6 6 6 2 6 6 6 6 6 6 9 6 j j j j 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 x[2] x[4] x[] x[] Figure.5 Sigal flow graph of complete decompositio of a 6 poit radix4 DIF FFT.2. Radix8 DIF FFT Algorithm Based o the two algorithms preseted i the previous subsectio, we coclude that with icreasig the radix, the cost i hardware will icrease liearly, complexity will icrease, ad implemetatio difficulty will icrease. However, whe usig higher radix algorithms (e.g. radix8), the umber of complex multiplicatios 6

decreases [7]. I this case, the commo direct mappig method of radixr butterfly elemet operatio is more complicated ad has larger area. So, the implemetatio of the butterfly elemet structure is of great iterest to decrease the hardware complexity for higher radix algorithm. The duty of a desiger is how to arrage ad desig the data flow of the butterfly elemet structures to achieve small area ad high performace. Here, we itroduce radix8 FFT algorithm. Each radix8 butterfly stage reads eight data iputs ad produces eight data outputs. Radix8 algorithm [8] is expressed i Equatio (.) through (.7). /8 X[8k ] = [{( x [ ] x [ / 2]) ( x [ / 4] x [ / 4])} (.) {( x [ /8] x [ 5 /8]) ( x [ /8] x [ 7 /8])}] /8 /8 [8 ] = [{( [ ] [ / 2]) ( [ / 4] [ / 4])}. X k x x j x x {( [ / 8] [ 5 / 8]) ( [ / 8] [ 7 / 8])}] 8k x x j x x /8 /8 X[8k 4] = [{( x [ ] x [ /2]) ( x [ /4] x [ /4])} {( x[ / 8] x [ 5 / 8]) ( x [ / 8] x [ 7 / 8])}] 8k 4 (.4) The above eight equatios ca be depicted with the butterfly structure i Figure.6. This structure fully utilizes the twiddle factor with its properties of symmetry ad 7 8k 8k 2 (.) X[8k 2] = [{( x [ ] x [ / 2]) ( x [ / 4] x [ / 4])} j (.2) {( x[ / 8] x[ 5 / 8]) j( x[ / 8] x[ 7 / 8])}] /8 /8 [8 ] = [{( [ ] [ / 2]) ( [ / 4] [ / 4])}. X k x x j x x (.) {( x[ /8] x[ 5 /8]) j( x[ /8] x[ 7 /8])}] /8 /8 [8 5] = [{( [ ] [ / 2]) ( [ / 4] [ / 4])}. 8k X k x x j x x (.5) {( x[ /8] x[ 5 /8]) j( x[ /8] x[ 7 /8])}] /8 8k 5 X[8k 6] = [{( x [ ] x [ / 2]) ( x [ / 4] x [ / 4])} j (.6) {( x[ / 8] x [ 5 / 8]) ( x [ / 8] x [ 7 / 8])}] 8k 6 /8 /8 [8 7] = [{( [ ] [ / 2]) ( [ / 4] [ / 4])}. X k x x j x x {( x[ / 8] x[ 5 / 8]) j( x[ / 8] x[ 7 / 8])}] 8k 7 (.7)

periodicity. It requires two costat multipliers ad /8 /8, seve otrivial complex multipliers ad three trivial factors j. The two costat multipliers with ad /8 /8 ca be replaced by 2 additios by usig shift adders [9]. The computatio of a 8poit DFT accomplished by the radix8 algorithm is depicted i Figure.7. x[] X[8k] x[/8] 4 X[8k4] x[2/8] 2 X[8k2] x[/8] j 6 X[8k6] x[4/8] X[8k] x[5/8] / 8 5 X[8k5] x[6/8] j X[8k] x[7/8] j / 8 7 X[8k7] Figure.6 Butterfly structure of radix8 algorithm 8

ormal x[] 8 bit reversed X[] 8 X[4] 8 X[2] j 8 X[6] 8 X[] 8 8 X[5] x[7] j j 8 8 8 X[] X[7] Figure.7 Sigal flow graph of a 8poit radix8 DIF FFT. Split Radix Algorithms Refereces [7] ad [] preset the fudametal idea of the split radix algorithm. The split radix algorithm specifies that a poit DFT ca be implemeted by decomposig differet parts of the iput sequece usig differet algorithms. Split radix algorithms such as the radix2/4 split algorithm, ad the radix2/8 split algorithm have yet to be preseted. Although they require less complex multipliers, they are ot ofte used i IC desig due to their irregular structure... Split Radix2/4 Algorithm This algorithm [] utilizes a radix2 algorithm to calculate the eveidexed output sequeces ad a radix4 algorithm to calculate the oddidexed output 9

sequeces. This idea ca be represeted by the mathematical expressio i Equatio (.8) through (.2). For the eveidexed term, ( /2) u /2. (.8) X[2 u] = ( x [ ] x [ ( / 2)]) For the oddidexed term, ( /4) u /4 (.9) X[4u ] = {( x [ ] x [ ( / 2)]) j( x [ / 4] x [ ( / 4)])} ( /4) X[4u ] = {( x [ ] x [ ( /2)]) j( x [ /4] x [ ( /4)])} u /4. (.2) Equatio (.9) ad (.2) ca be represeted by the followig butterfly elemet structure icludig twiddle factor coefficiets. Figure.9 is the computatio of a 6poit split radix2/4 algorithm. x[] x[/4] x[2/4] X[4u] x[/4] j X[4u] Figure.8 Butterfly structure of split radix2/4 algorithm 2

ormal x[] bit reversed x[] j x[2] 8 8 x[] x[2] j j j 6 6 2 6 6 6 8 8 j x[4] x[] x[] j 6 x[] x[4] j 6 6 j 9 6 j Figure.9 Computatio of a 6poit split radix2/4 algorithm..2 Split Radix2/8 Algorithm The split radix2/8 algorithm [7] [9] utilizes a radix8 algorithm to calculate the oddidexed output istead of a radix4 algorithm. The eveidexed output is calculated the same way as i the radix2/4 algorithm. That is, the radix2/8 algorithm 2

is based o the decompositio of the legth DFT sequece ito oe legth/2 DFT ad four the legth/8 DFTs. For the eveidexed term with oe legth/2 DFTs, ( /2) u /2. (.2) X[2 u] = ( x [ ] x [ ( / 2)]) For the oddidexed term with four legth/8 DFTs, ( /8) /8 (.22) X[8u ] = {[([] x x [ ( /2)]) j([ x /4] x [ /4])] [( x [ /8] x [ 5 /8]) j( x [ /8] x [ 7 /8])]} u /8 ( /8) /8 (.2) X[8u ] = {[([] x x [ ( /2)]) j([ x /4] x [ /4])] [( x [ /8] x [ 5 /8]) j( x [ /8] x [ 7 /8])]} u /8 ( /8) /8 (.24) X[8u 5] = {[([] x x [ ( /2)]) j([ x /4] x [ /4])] [( x [ /8] x [ 5 /8]) j( x [ /8] x [ 7 /8])]} 5 u /8 ( /8) /8 (.25) X[8u 7] = {[( x [ ] x [ ( / 2)]) j( x [ / 4] x [ / 4])] [( x [ /8] x [ 5 /8]) j( x [ /8] x [ 7 /8])]} 7 u /8 To make the idea of the radix2/8 algorithm easier to uderstad, the simplified butterfly structure is illustrated i Figure.. A 6poit split radix2/8 algorithm is show i Figure.. x[] x[/8] x[2/8] x[/8] x[4/8] X[8u] x[5/8] / 8 5 X[8u5] x[6/8] j X[8u] x[7/8] j / 8 7 X[8u7] Figure. Butterfly structure of radix2/8 algorithm 22

atural x[] bit reversed x[] j x[2] 8 j j 8 x[4] x[] x[2] j 8 8 6 6 6 5 6 6 x[] x[] j 6 x[] x[4] j 8 6 j 8 7 6 Figure. A 6poit split radix2/8 algorithm 2

.4 Mixed Radix Algorithm The radices of the butterfly structure i differet FFT computatio stages eed ot be equal. Such a algorithm is called a mixed radix algorithm []. he the FFT size is ot a power of radixr, we ca oly use a mixed radix algorithm to desig a FFT. For example, a 24poit sequece is ot a power of 8. Thus, we ca combie three stages with a radix8 algorithm ad oe stage with a radix2 algorithm to operate the trasform over a 24poit data sequece. 24