An FFT/IFFT design versus Altera and Xilinx cores


2008 International Conference on Reconfigurable Computing and FPGAs

C. Gonzalez-Concejero, V. Rodellar, A. Alvarez-Marquina, E. Martinez de Icaya and P. Gomez-Vilda
Departamento de Arquitectura y Tecnología de Sistemas Informáticos. Grupo de investigación en informática aplicada al procesamiento de señal e imagen. Facultad de Informática, Universidad Politécnica de Madrid. Campus de Montegancedo s/n, Boadilla del Monte (Madrid), Spain
cconcejero@gmail.com; victoria@pino.datsi.fi.upm.es

Abstract

In this paper, a portable hardware design implementing a Fast Fourier Transform, oriented to its reusability as a core, is presented. The module has been developed using the radix-2 Decimation-In-Time algorithm. Structural modeling in VHDL is used to describe, simulate and implement the design. The module is portable among different EDA tools and technology independent. It has been synthesized with Quartus II from Altera and ISE from Xilinx. Detailed performance results are presented, together with a comparison against the results provided by the Altera and Xilinx FFT IP cores. These show that the proposed design uses fewer physical resources but achieves lower throughput than the commercial cores. In addition, the IP core from Xilinx shows better throughput than Altera's, but at a higher implementation cost.

1. Introduction

IP cores are part of the growing Electronic Design Automation (EDA) industry trend towards repeated use of previously designed components. IP cores offered by vendors are rigorously tested and optimized for the highest performance and lowest cost in programmable logic devices. These parameterized IP blocks can be implemented easily, reducing design and test time and also time-to-market, because they avoid designing standardized functions from scratch. Ideally these blocks should be entirely portable among different EDA tools and fully parameterizable.
However, most vendors offer only their own non-portable IP cores, with many features and functionalities that are sometimes useless for a specific application.

The Fast Fourier Transform (FFT) and its inverse (IFFT) are fundamental blocks used in many applications in science and engineering, such as communications, spectrum analysis and digital signal processing. The main FPGA companies, such as Altera and Xilinx, offer FFT/IFFT cores that can be easily embedded in more complex designs with their design tools and that are supported and optimized for a wide range of their device families. The FFT/IFFT v5.0 from Xilinx allows transform sizes from 8 to [...] samples, data precision from 8 to 24 bits, floating point and unscaled and scaled fixed point arithmetic, four different architectures to choose from, block or distributed RAM, run time programmability, etc. [2]. The FFT/IFFT v8.0 from Altera allows transform sizes from 64 to [...] samples depending on the type of architecture chosen, data precision from 8 to 32 bits, floating point and fixed point arithmetic, embedded memory, multiple I/O data flow modes, etc. [3].

In this paper, we present a radix-2 FFT/IFFT design that supports any transform size, fixed point arithmetic, a pipeline structure and a parameterized data format. The synthesis results of the proposed model are compared with the FFT/IFFT cores from the vendors mentioned above, and the advantages and disadvantages of each realization are discussed. The next section describes the principles of the FFT structure and the mathematical formulation. The architectural design is presented in Section 3. Section 4 shows implementation and design results. Finally, conclusions are presented in Section 5.

2. The FFT algorithm

Audio and communications signal processing are well-developed fields, massively used nowadays in many applications and products.
Since digital communications are quite active fields, the arithmetic complexity of the Discrete Fourier Transform (DFT) algorithm becomes a significant factor with an impact on global computational costs. Cooley and Tukey [1] developed the well-known radix-2 Fast Fourier Transform (FFT) algorithm to reduce the computational load of the DFT. It lowers the arithmetic complexity from $O(N^2)$ to $O(N \log N)$, and the regularity of the algorithm makes it suitable for VLSI implementation.

Among the different FFT approaches ([4], [5] and [6]), the fixed-radix and the split-radix methods are the two most widely used. A split-radix FFT is theoretically more efficient than a fixed-radix algorithm [7], since it shows the least computational complexity among traditional FFT algorithms. However, the supporting structure would render it less suitable for implementation on digital signal processors. Unlike the irregular butterfly structure of the split-radix FFT, the fixed-radix FFT is simple to analyze and implement in hardware due to its structural regularity. Therefore, the fixed-radix FFT is by far the more widely used, although it involves more computations from the algorithmic point of view.

The N-point DFT of a sequence x(k) is defined as [8]:

$$X(n) = \sum_{k=0}^{N-1} x(k)\, W_N^{nk}, \qquad n = 0, 1, \ldots, N-1 \tag{1}$$

where

$$W_N = e^{-j2\pi/N} = \cos\frac{2\pi}{N} - j\,\sin\frac{2\pi}{N} \tag{2}$$

is referred to as the twiddle factor, N is the transform size and $j = \sqrt{-1}$. In turn, the exponent applied to $W_N$ in each butterfly depends on the number of stages and the number of samples. Similarly, the Inverse Discrete Fourier Transform (IDFT) is expressed as:

$$x(k) = \frac{1}{N} \sum_{n=0}^{N-1} X(n)\, W_N^{-nk} \tag{3}$$

The algorithm used in the present processor implementation is a version of Cooley and Tukey's Decimation-In-Time (DIT) FFT algorithm. The DIT algorithm first rearranges the input elements in bit-reversed order and then builds up the output transform. Figure 1 shows the form of this scrambling for an 8-point FFT; on the left, the input data samples are arranged in bit-reversed order. As can be seen, the N-point DIT-FFT algorithm consists of $\log_2 N$ stages, each stage consisting of N/2 butterfly operations [9].
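The stage structure just described can be sketched in software. The following Python fragment is a minimal illustration, not the hardware design of this paper: it loads the samples in bit-reversed order, runs log2 N stages of N/2 butterflies in place, and is checked against a direct evaluation of the DFT sum. The function names (`bit_reverse`, `fft_dit_radix2`, `dft`) are hypothetical, and floating-point arithmetic is assumed instead of the fixed-point format used in the core.

```python
import cmath


def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of `index`."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def fft_dit_radix2(samples):
    """Radix-2 DIT FFT computed in place on a bit-reversed copy of `samples`."""
    n = len(samples)
    stages = n.bit_length() - 1
    if n == 0 or (1 << stages) != n:
        raise ValueError("length must be a power of two")
    # Input scrambling: position i receives the sample with bit-reversed index.
    data = [complex(samples[bit_reverse(i, stages)]) for i in range(n)]
    for stage in range(stages):
        half = 1 << stage          # distance between the two butterfly inputs
        step = n >> (stage + 1)    # twiddle exponent increment in this stage
        for start in range(0, n, 2 * half):
            for j in range(half):
                w = cmath.exp(-2j * cmath.pi * (j * step) / n)
                a = data[start + j]
                bw = data[start + j + half] * w
                data[start + j] = a + bw          # A' = A + B*W
                data[start + j + half] = a - bw   # B' = A - B*W
    return data


def dft(samples):
    """Direct O(N^2) evaluation of the DFT sum, used only as a reference."""
    n = len(samples)
    return [
        sum(samples[k] * cmath.exp(-2j * cmath.pi * i * k / n) for k in range(n))
        for i in range(n)
    ]
```

Calling `fft_dit_radix2` on a list of 8 samples performs 3 stages of 4 butterflies each and returns the spectrum in natural order, which can be compared element by element with `dft` on the same list.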
The input data are multiplied by the twiddle factor. The solid dots represent addition/subtraction operations. The outputs are arranged in their natural order.

Figure 1. Signal flow graph for 8-point DIT-FFT with input scrambling.

The DIT-FFT radix-2 butterfly is shown in Figure 2 [9]. It takes a pair of complex input data values A and B and produces a pair of complex outputs A' and B':

$$A = x + jX \tag{4}$$

$$B = y + jY \tag{5}$$

where x, y and X, Y are respectively the real and imaginary parts of the input data, and:

$$A' = A + B\,W_N^{k} \tag{6}$$

$$B' = A - B\,W_N^{k} \tag{7}$$

Figure 2. Radix-2 butterfly structure

Taking into consideration (2), (4) and (5), equations (6) and (7) may be written as:

$$A' = \left[\left(x + y\cos\frac{2\pi k}{N} + Y\sin\frac{2\pi k}{N}\right) + j\left(X + Y\cos\frac{2\pi k}{N} - y\sin\frac{2\pi k}{N}\right)\right] \tag{8}$$

$$B' = \left[\left(x - y\cos\frac{2\pi k}{N} - Y\sin\frac{2\pi k}{N}\right) + j\left(X - Y\cos\frac{2\pi k}{N} + y\sin\frac{2\pi k}{N}\right)\right] \tag{9}$$

3. Architectural design

The objective of this paper is to implement expressions (8) and (9) in an efficient way, keeping in mind the reusability of the resulting design as an embedded core in a wide range of possible applications. The design has been modeled in VHDL according to the restrictions and recommendations for high level synthesis [10]. The design is portable among different EDA tools and technology independent. This module is designed to be integrated in a Speech Recognition System.

The FFT architecture consists of a single DIT-FFT radix-2 butterfly, a dual-port memory to hold the input samples, intermediate operations and results, a control unit, an address generation unit and two ROM memories to store the twiddle factors. The block diagram of the FFT is depicted in Figure 3. The scheduling details are given in Table 1.

The architecture of the FFT processor can best be understood by inspecting its operation. The operation is partitioned into three main processes: DATA load, COMPUTE and RESULT unload. The operation cycle starts with the DATA load process, which consists of reading and loading sample data into the memory. During the COMPUTE process, the kernel butterfly operation is calculated. Finally, in the RESULT unload process the FFT results are made available at the output, ready to be used by another application. A brief description of the main blocks is given next.

Figure 3. Block diagram of the FFT/IFFT

A. ROMs and RAM

ROM memories store the W coefficients. The size of these memories is N/4, due to the symmetry of the trigonometric functions: the amplitudes of the sine and cosine are the same in the four quadrants and differ only in sign. According to the system workflow, two data must be read from the RAM with a cycle delay between them (Table 1, cycles 0 and 1) and loaded to the butterfly unit. Meanwhile, the two outputs of the butterfly block have to be written in the RAM with a cycle delay between them (Table 1 (cont), cycles 6 and 7).

B. Butterfly element

The butterfly is the core calculation. The butterfly takes two data from memory and computes two other data from them.
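As an illustration of how expressions (8) and (9) decompose into elementary operations, the sketch below computes one butterfly with four real multiplications and six additions/subtractions. The intermediate names M1 to M4 and S1 to S6 follow the labels of the butterfly scheduling in Table 1, with S4 and S5 taken as x - S1 and X + S2 so that the outputs agree with (8) and (9). The function name `butterfly`, its argument order and the use of floating point are assumptions made for the example; the core itself works in fixed point and reads cos and sin from ROM.

```python
import math


def butterfly(x, X, y, Y, k, n):
    """Radix-2 DIT butterfly on A = x + jX and B = y + jY with twiddle W_N^k.

    Returns ((Re A', Im A'), (Re B', Im B')).
    """
    phi = 2.0 * math.pi * k / n
    cos_phi = math.cos(phi)   # value a cosine ROM would supply
    sin_phi = math.sin(phi)   # value a sine ROM would supply

    # Four multipliers
    m1 = Y * cos_phi
    m2 = Y * sin_phi
    m3 = y * cos_phi
    m4 = y * sin_phi

    # First adder/subtractor level: real and imaginary parts of B * W
    s1 = m3 + m2
    s2 = m1 - m4

    # Second adder/subtractor level
    s3 = x + s1   # Re A'
    s4 = x - s1   # Re B'
    s5 = X + s2   # Im A'
    s6 = X - s2   # Im B'

    # Link step: pair real and imaginary parts
    return (s3, s5), (s4, s6)
```

For example, `butterfly(1.0, 2.0, 3.0, -1.0, 1, 8)` returns the real and imaginary parts of A + B·W and A - B·W for A = 1 + 2j, B = 3 - j and W = exp(-j2π/8).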
Results are written back to the same memory locations as the inputs, since an in-place algorithm is used. This makes efficient use of the available memory, as the transformed data overwrites the input data. The structure of the butterfly, employing a straightforward implementation of (8) and (9), requires four multipliers, three adders, three subtractors and two modules to link the real and imaginary parts of the data (Figure 4).

Figure 4. Butterfly processing architecture

The arithmetic operations involved in this block are performed with a pipeline data flow structure. The operations to calculate a butterfly take four time instants (cycles 2 to 5), as can be seen in the butterfly scheduling shown in Table 1.

C. Address generation and control units

The purpose of the Address Generation Unit (AGU) is to produce valid addresses for the RAM and the ROM blocks. It also keeps track of which butterfly is being computed in which stage. The block-level description of the AGU basically consists of a $\log_2 N$-bit up counter, a ram_index generator and a rom_index generator. The counter output is used to address the RAM during the DATA load and RESULT unload processes. During the DATA load process data should be bit-reversed while being written, but no extra hardware is required to implement the bit reversal; it can simply be carried out by wire reversal. Moreover, the counter keeps track of the current stage in the FFT computation, and supplies the ram_index generator with the number of the stage that is currently being computed.

The ram_index generator is responsible for generating addresses for the RAM during the COMPUTE process. The input of the ram_index generator is the address provided by the counter. The addresses to read and write the data inputs, A and B, can be calculated as follows:
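The authors' own listing for the address calculation is not reproduced in this copy of the paper. As a stand-in, the Python sketch below shows one common way of deriving the two RAM addresses of a butterfly from a counter value and the current stage in an in-place radix-2 DIT FFT: a bit is inserted into the counter at the position given by the stage number. Here the A address carries a 0 in that position and the B address a 1, which is an assumption and may not match the bit convention of the original fragment. The names `load_address` and `butterfly_addresses` are hypothetical.

```python
def load_address(counter, bits):
    """Bit-reversed write address for the DATA load process.

    In hardware this is only a reordering of the counter wires; in
    software the same mapping is computed bit by bit.
    """
    reversed_value = 0
    for _ in range(bits):
        reversed_value = (reversed_value << 1) | (counter & 1)
        counter >>= 1
    return reversed_value


def butterfly_addresses(stage, counter):
    """RAM addresses of inputs A and B for butterfly `counter` of `stage`.

    Both arguments are 0-based. The counter bits below position `stage`
    stay in place, the remaining bits move one position up, and the
    freed bit selects between A (0) and B (1).
    """
    low = counter & ((1 << stage) - 1)   # bits below the inserted position
    high = counter >> stage              # bits at or above it
    addr_a = (high << (stage + 1)) | low
    addr_b = addr_a | (1 << stage)
    return addr_a, addr_b
```

For an 8-point transform, stage 1 yields the address pairs (0, 2), (1, 3), (4, 6) and (5, 7) as the counter runs from 0 to 3, which are the butterfly pairs of the second stage in the usual 8-point DIT signal flow graph.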

The address B is calculated just by changing the bit 1 to 0 in the fragment of the algorithm referred to above. The rom_index generator is responsible for producing addresses for the ROM during the COMPUTE process. It only needs to know the current stage to generate the address.

The control unit is implemented as a finite state machine with twelve states. The sequence of events is determined by the control unit depending on the signals it receives from the corresponding units; it also generates other control signals to take care of housekeeping duties, i.e., incrementing and clearing counters.

Unit       | Operations, in schedule order
-----------|--------------------------------------------
RAM read   | A(x,X), B(y,Y), nextA1, nextB1, nextA2
ROM        | cosφ, nextcosφ1
ROM        | sinφ, nextsinφ1
Mult       | M1 = Y·cosφ, nextM1_1
Mult       | M2 = Y·sinφ, nextM2_1
Mult       | M3 = y·cosφ, nextM3_1
Mult       | M4 = y·sinφ, nextM4_1
+/-        | S1 = M3 + M2
+/-        | S2 = M1 - M4
+/-        | S3 = x + S1
+/-        | S4 = x - S1
+/-        | S5 = X + S2
+/-        | S6 = X - S2
Link       | (idle)
Link       | (idle)

Table 1. Butterfly scheduling (cycles 0-4)

Unit       | Operations, in schedule order
-----------|--------------------------------------------
RAM read   | nextB2, nextA3, nextB3, nextA4, nextB4
ROM        | nextcosφ2, nextcosφ3, nextcosφ4
ROM        | nextsinφ2, nextsinφ3, nextsinφ4
Mult       | nextM1_2, nextM1_3
Mult       | nextM2_2, nextM2_3
Mult       | nextM3_2, nextM3_3
Mult       | nextM4_2, nextM4_3
+/-        | nextS1_1, nextS1_2, nextS1_3
+/-        | nextS2_1, nextS2_2, nextS2_3
+/-        | nextS3_1, nextS3_2
+/-        | nextS4_1, nextS4_2
+/-        | nextS5_1, nextS5_2
+/-        | nextS6_1, nextS6_2
Link       | A' = S3 + jS5, nextA'1, nextA'2
Link       | B' = S4 + jS6, nextB'1, nextB'2
RAM write  | A', nextA'1
RAM write  | B', nextB'1

Table 1 (cont). Butterfly scheduling (cycles 5-9)

4. Implementation results

Generally speaking, it is very difficult to make a fair comparison of design performance because there is no standard benchmarking methodology for FPGAs. Current CAD tools provide a settings menu that allows the designer to explore different trade-offs among design performance, logic resources demanded, power consumption, memory usage and compilation time. Additionally, user constraints can be included to guide the CAD tool towards better performance results. However, the settings producing the best results for one design may not be appropriate for another.
The compilation results presented next were obtained with default settings and no constraints. Our model has been synthesized with Quartus II v8.0 and ISE v10.1, and the results have been compared against the FFT/IFFT cores available in the DSP libraries of these CAD tools, which are v8.0 for Altera and v5.0 for Xilinx. These cores have been included using the MegaWizard Plug-In Manager tool for Altera and the CoreGen tool for Xilinx. Their structures and detailed pin count can be found in [2][3].

The device selection is also critical due to the differences in the FPGA inner architectures, some designs being easily implemented in a specific architecture while others are not. To take this aspect into consideration, we have chosen families from the two FPGA vendors that may be considered technologically comparable. Concerning the selection of the specific devices, the criterion has been to choose a device large enough to support a real time speech recognition system (our goal application), where the FFT IP will be an embedded block. The target devices have been the Stratix II EP2S15F484C3 from Altera and the Virtex IV xc4vlx15-12sf363 from Xilinx.

Commercial IP cores offer different configuration possibilities (arithmetic, radix, architectures, number of butterfly engines, I/O modes, etc.) that must be carefully selected to match the characteristics of our design as closely as possible, so that the performance results are comparable. The characteristics of our design are: Decimation-In-Time (DIT) FFT algorithm, radix-2, two's complement fixed point arithmetic, single butterfly engine, pipeline structure, parameterized number of samples (N), data size and twiddle factors, and a structure implemented with 4 multipliers/6 adders. The options chosen for the commercial IP core generation were:

Xilinx: unscaled (full precision) fixed-point arithmetic, burst I/O architecture because it uses the DIT method and radix-2, output in natural order, and data and twiddle factors in RAM.

Altera: block floating-point arithmetic, burst data flow I/O because it is the only option that generates a single output FFT engine, number of parallel engines = 1, and the 4 multipliers/2 adders implementation.

The synthesis results of our design versus the FFT cores from Altera and Xilinx are shown in Table 2 and Table 3. The results in both tables are for N = 64, 128, 256, 512, 1024, 2048 and 4096 samples. The data and twiddle sizes are 16 bits in all cases. For each N, the first row contains the results for our design and the second row the results for the vendor core (VC). The comparison has been carried out in terms of physical resources, number of pins, memory occupation, DSPs and Fmax. The number of resources available in the devices and the amount required for each particular implementation are also indicated. Note that the percentages shown in the tables are provided by the tools, which round up or down depending on the case.

N    | Design | ALUTs (12480) | Registers (12480) | Pins (343) | Mem. bits (419328) | DSP (96) | Fmax (MHz)
-----|--------|---------------|-------------------|------------|--------------------|----------|-----------
64   | Ours   | [...] (2%)    | 478 (4%)          | 67 (20%)   | 3190 (<1%)         | 8 (8%)   | [...]
64   | VC     | [...] (5%)    | 1365 (11%)        | 85 (25%)   | 2560 (<1%)         | 8 (8%)   | [...]
128  | Ours   | [...] (2%)    | 483 (4%)          | 67 (20%)   | 6271 (1%)          | 8 (8%)   | [...]
128  | VC     | [...] (6%)    | 1486 (12%)        | 85 (25%)   | 4864 (2%)          | 8 (8%)   | [...]
256  | Ours   | [...] (2%)    | 488 (4%)          | 67 (20%)   | 12424 (3%)         | 8 (8%)   | [...]
256  | VC     | [...] (6%)    | 1430 (11%)        | 85 (25%)   | 9472 (3%)          | 8 (8%)   | [...]
512  | Ours   | [...] (2%)    | 491 (4%)          | 67 (20%)   | 24721 (6%)         | 8 (8%)   | [...]
512  | VC     | [...] (6%)    | 1532 (12%)        | 85 (25%)   | 18688 (4%)         | 8 (8%)   | [...]
1024 | Ours   | [...] (2%)    | 494 (4%)          | 67 (20%)   | 49306 (12%)        | 8 (8%)   | [...]
1024 | VC     | [...] (6%)    | 1477 (12%)        | 85 (25%)   | 37120 (9%)         | 8 (8%)   | [...]
2048 | Ours   | [...] (2%)    | 497 (4%)          | 67 (20%)   | 98467 (23%)        | 8 (8%)   | [...]
2048 | VC     | [...] (6%)    | 1578 (13%)        | 85 (25%)   | 73184 (18%)        | 8 (8%)   | [...]
4096 | Ours   | [...] (2%)    | 500 (4%)          | 67 (20%)   | [...] (49%)        | 8 (8%)   | [...]
4096 | VC     | [...] (7%)    | 1522 (12%)        | 85 (25%)   | [...] (35%)        | 8 (8%)   | [...]

Table 2. Altera results for EP2S15F484C3

Comparing the results obtained from Altera's IP with the results from this design, the following similarities and differences may be observed. In both cases, the number of ALUTs demanded stays at a similar percentage for all values of N, but the vendor IP demands around three times more resources than ours. The same behavior is observed for the number of registers. The number of pins is constant for all values of N, but our design needs 18 fewer pins than the vendor IP. As expected, the memory demand increases in proportion to N. Altera's core performs better than this design on this parameter, and in our case the gap widens as N increases: according to the percentages given by the CAD tool, the difference is 1% for N = 256, 3% for N = 1024 and 14% for N = 4096. The number of DSPs is the same in both cases.

As expected, the frequency of our design decreases as N increases, but in the case of Altera the behavior seems to be erratic. It can be noticed from Table 2 that the frequency increases and decreases as N doubles. At first sight the results seem to lack consistency: for N = 128 samples the value of Fmax is [...] MHz, for N = 256 it increases to [...] MHz, for N = 512 it decreases to [...] MHz and for N = 1024 it increases to [...] MHz, and the same tendency can be observed for the rest of the values shown in Table 2. Grouping the results into odd and even powers of two leads to a different interpretation. In both groups the frequency decreases as N increases. For odd powers of two (128, 512 and 2048) the frequency decreases as [...] MHz, [...] MHz and 225 MHz respectively. For even powers of two (256, 1024 and 4096) it decreases as [...] MHz, [...] MHz and [...] MHz respectively. Comparing our results with these groups, it may be concluded that our results are better for the odd powers but worse for the even powers of two.
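The grouping into odd and even powers of two can be made concrete with a short calculation. A radix-4 pass resolves two of the log2 N stages of the transform, so an even power of two is covered by radix-4 passes alone, while an odd power of two leaves one stage for a radix-2 pass, which is the explanation attributed to the vendor in the following paragraph. The sketch below only counts passes under that assumption; the function name `pass_counts` is hypothetical and the code does not model the internals of any vendor core.

```python
def pass_counts(n):
    """Return (radix4_passes, radix2_passes) for a power-of-two size `n`,
    assuming radix-4 passes are used wherever possible."""
    stages = n.bit_length() - 1
    if n < 2 or (1 << stages) != n:
        raise ValueError("n must be a power of two greater than 1")
    return stages // 2, stages % 2


if __name__ == "__main__":
    for size in (64, 128, 256, 512, 1024, 2048, 4096):
        radix4, radix2 = pass_counts(size)
        group = "odd" if radix2 else "even"
        print(f"N={size}: log2 N is {group}, {radix4} radix-4 pass(es), "
              f"{radix2} radix-2 pass(es)")
```

With this rule, 128, 512 and 2048 fall in the group that needs a final radix-2 pass, and 64, 256, 1024 and 4096 in the group that does not.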
The same erratic behavior may be noticed in the number of registers. The reason given by the vendor is that, for burst architectures, a radix-4 decomposition is normally applied; when N is an odd power of two, the FFT MegaCore automatically implements a radix-2 pass last to complete the transformation.

N    | Design | Slice FFs (12288) | LUTs (12280) | Pins (240) | Slices (6144) | RAM blocks (48) | DSP (32) | Fmax (MHz)
-----|--------|-------------------|--------------|------------|---------------|-----------------|----------|-----------
64   | Ours   | [...] (3%)        | 324 (2%)     | 67 (28%)   | 283 (4%)      | 3 (6%)          | 4 (12%)  | [...]
64   | VC     | [...] (8%)        | 839 (6%)     | 100 (41%)  | 749 (12%)     | 5 (10%)         | 6 (18%)  | [...]
128  | Ours   | [...] (3%)        | 340 (2%)     | 67 (28%)   | 291 (4%)      | 3 (6%)          | 4 (12%)  | [...]
128  | VC     | [...] (9%)        | 888 (7%)     | 104 (43%)  | 779 (12%)     | 5 (10%)         | 6 (18%)  | [...]
256  | Ours   | [...] (3%)        | 346 (2%)     | 67 (28%)   | 295 (4%)      | 3 (6%)          | 4 (12%)  | [...]
256  | VC     | [...] (9%)        | 944 (7%)     | 108 (45%)  | 836 (13%)     | 5 (10%)         | 6 (18%)  | [...]
512  | Ours   | [...] (3%)        | 353 (2%)     | 67 (28%)   | 301 (4%)      | 3 (6%)          | 4 (12%)  | [...]
512  | VC     | [...] (10%)       | 1035 (8%)    | 112 (46%)  | 903 (14%)     | 5 (10%)         | 6 (18%)  | [...]
1024 | Ours   | [...] (3%)        | 361 (2%)     | 67 (28%)   | 305 (4%)      | 4 (8%)          | 4 (12%)  | [...]
1024 | VC     | [...] (10%)       | 1098 (8%)    | 116 (48%)  | 941 (15%)     | 6 (10%)         | 6 (18%)  | [...]
2048 | Ours   | [...] (3%)        | 368 (2%)     | 67 (28%)   | 310 (5%)      | 6 (12%)         | 4 (12%)  | [...]
2048 | VC     | [...] (11%)       | 1172 (9%)    | 120 (50%)  | 1003 (16%)    | 9 (18%)         | 6 (18%)  | [...]
4096 | Ours   | [...] (4%)        | 375 (3%)     | 67 (28%)   | 314 (5%)      | 12 (25%)        | 4 (12%)  | [...]
4096 | VC     | [...] (12%)       | 1229 (10%)   | 124 (51%)  | 1043 (16%)    | 15 (31%)        | 6 (18%)  | [...]

Table 3. Xilinx results for xc4vlx15-12sf363

Concerning the Xilinx results shown in Table 3, in our solution the demand for slice flip-flops stays at an almost constant percentage for all values of N, whereas in the commercial IP core these resources increase with N, and the difference between both designs grows as N increases. A similar behavior is observed for the number of LUTs and occupied slices. In our case, the number of pins remains the same (67) for all values of N, while the Xilinx IP requires 4 more pins each time N doubles. In both implementations the RAM blocks remain at around the same percentage (6% and 10%) up to 512 samples and increase for the rest of the values in the table. The number of DSPs is the same for all values of N, but our implementation uses 2 fewer DSPs than the Xilinx one. For all the physical resources discussed above, our implementation gives better results than the Xilinx IP, but the latter achieves a much better Fmax, surpassing our solution by around 150 MHz.

Concerning latency, our design presents poor results compared with the commercial cores because it needs 4 cycles to calculate one butterfly while the others need only one cycle. The estimated total number of cycles and the throughput for calculating a complete FFT for 256 and 1024 samples are given in Table 4.

Table 4. Throughput for 256 and 1024 samples (columns: cycles, and throughput in μs for our design on Altera, our design on Xilinx, the Altera IP and the Xilinx IP; rows: N = 256 and N = 1024; values [...])

The throughput of our core is better when implemented in Altera than in Xilinx, being around 3 times faster for the first and between 4.5 and 5 times for the second. If we compare the Xilinx and Altera IPs, the latency is similar, but Xilinx achieves higher frequencies and better throughput.

5. Summary and conclusions

This paper presents an N-point FFT/IFFT architecture which is portable among different EDA tools and technology independent. The design is oriented to its reusability as a core. The performance of the design has been compared with the commercial cores provided by Altera and Xilinx. Those cores were configured with the characteristics closest to our design in order to make the results comparable. Our design gives better results in terms of the physical resources demanded, but its throughput is poorer than that of the commercial IP implementations. Among the commercial IP cores, Xilinx gives better throughput than Altera. The implementation cost of the two is difficult to compare fairly because the inner structures of the FPGAs are different but, as a first approximation, taking the results of our design as a reference, the Xilinx implementation seems to be more costly.
Along with these performance results come other considerations which need to be evaluated in order to select the best approach depending on system requirements, such as ease of implementation, cost and performance. Generating a design from a commercial IP core is as easy as pressing a button, but the user has no control over the design because the cores are provided as a black box. They offer a variety of features and functionalities to be configured, and their implementations are supposedly optimized for a subset of the vendor's devices, giving the best performance on them, but they lack portability. Besides the economic cost, the system may require less performance than that offered by commercial IP cores, and this is the case in the present application. Our FFT design has been integrated as part of a Speech Recognition System for isolated commands and implemented in an FPGA together with the other parts of the system, such as end point detection, MFCC feature extraction and HMM modeling. In this case, keeping the physical resources low enough to implement the full system in the same FPGA is more important than the other criteria, as long as real time processing is achieved, and this condition is fulfilled by the design described in this paper.

6. Acknowledgments

This work was supported by grants CCG06- UPM/IF28, TEC C02-00 from Plan Nacional de I+D, Ministry of Education and Science, and by Project HESPERIA ([...]) from the Programme CENIT, Ministry of Industry, Spain.

7. References

[1] J.W. Cooley, J.W. Tukey, An algorithm for the machine calculation of complex Fourier series, Math. of Comp., 1965, vol. 19, pp. [...]
[2] [...]ft.pdf
[3] [...]
[4] J.-Y. Oh, M.-S. Lim, A radix-2^4 SDF pipeline FFT processor for OFDM modulation, in: The First IEEE VTS APWCS (Asia Pacific Wireless Communications Symposium), January [...]
[5] Lihong Jia, Yonghong Gao, Jouni Isoaho and Hannu Tenhunen, A new VLSI-oriented FFT algorithm and implementation, IEEE ASIC Conf., 1998, pp. [...]
[6] Saad Bouguezel, M. Omair Ahmad and M.N.S. Swamy, An efficient split-radix FFT algorithm, Int. Symp. Circuits Systems, 2003, pp. [...]
[7] S. G. Johnson and M. Frigo, A modified split-radix FFT with fewer arithmetic operations, IEEE Transactions on Signal Processing, 2007, pp. [...]
[8] J. G. Proakis and D. G. Manolakis, Digital Signal Processing: Principles, Algorithms and Applications, 2nd ed. New York: Macmillan, 1992
[9] W. B. Jervis and E. C. Ifeachor, Digital Signal Processing: A Practical Approach. Reading, MA: Addison-Wesley, [...]
[10] M. Keating and P. Bricaud, Reuse Methodology Manual: For System-on-a-Chip Designs. Third Edition. Kluwer Academic Publishers, [...]


More information

FFT MegaCore Function User Guide

FFT MegaCore Function User Guide FFT MegaCore Function User Guide 101 Innovation Drive San Jose, CA 95134 www.altera.com MegaCore Version: 8.1 Document Date: November 2008 Copyright 2008 Altera Corporation. All rights reserved. Altera,

More information

User Manual for FC100

User Manual for FC100 Sundance Multiprocessor Technology Limited User Manual Form : QCF42 Date : 6 July 2006 Unit / Module Description: IEEE-754 Floating-point FPGA IP Core Unit / Module Number: FC100 Document Issue Number:

More information

Low Power Complex Multiplier based FFT Processor

Low Power Complex Multiplier based FFT Processor Low Power Complex Multiplier based FFT Processor V.Sarada, Dr.T.Vigneswaran 2 ECE, SRM University, Chennai,India saradasaran@gmail.com 2 ECE, VIT University, Chennai,India vigneshvlsi@gmail.com Abstract-

More information

LogiCORE IP Fast Fourier Transform v7.1

LogiCORE IP Fast Fourier Transform v7.1 LogiCORE IP Fast Fourier Transform v7.1 DS260 April 19, 2010 Introduction The Xilinx LogiCORE IP Fast Fourier Transform (FFT) implements the Cooley-Tukey FFT algorithm, a computationally efficient method

More information

FPGA Based Design and Simulation of 32- Point FFT Through Radix-2 DIT Algorith

FPGA Based Design and Simulation of 32- Point FFT Through Radix-2 DIT Algorith FPGA Based Design and Simulation of 32- Point FFT Through Radix-2 DIT Algorith Sudhanshu Mohan Khare M.Tech (perusing), Dept. of ECE Laxmi Naraian College of Technology, Bhopal, India M. Zahid Alam Associate

More information

Analysis of Radix- SDF Pipeline FFT Architecture in VLSI Using Chip Scope

Analysis of Radix- SDF Pipeline FFT Architecture in VLSI Using Chip Scope Analysis of Radix- SDF Pipeline FFT Architecture in VLSI Using Chip Scope G. Mohana Durga 1, D.V.R. Mohan 2 1 M.Tech Student, 2 Professor, Department of ECE, SRKR Engineering College, Bhimavaram, Andhra

More information

The Serial Commutator FFT

The Serial Commutator FFT The Serial Commutator FFT Mario Garrido Gálvez, Shen-Jui Huang, Sau-Gee Chen and Oscar Gustafsson Journal Article N.B.: When citing this work, cite the original article. 2016 IEEE. Personal use of this

More information

Design of Delay Efficient Distributed Arithmetic Based Split Radix FFT

Design of Delay Efficient Distributed Arithmetic Based Split Radix FFT Design of Delay Efficient Arithmetic Based Split Radix FFT Nisha Laguri #1, K. Anusudha *2 #1 M.Tech Student, Electronics, Department of Electronics Engineering, Pondicherry University, Puducherry, India

More information

International Journal of Innovative and Emerging Research in Engineering. e-issn: p-issn:

International Journal of Innovative and Emerging Research in Engineering. e-issn: p-issn: Available online at www.ijiere.com International Journal of Innovative and Emerging Research in Engineering e-issn: 2394-3343 p-issn: 2394-5494 Design and Implementation of FFT Processor using CORDIC Algorithm

More information

AN FFT PROCESSOR BASED ON 16-POINT MODULE

AN FFT PROCESSOR BASED ON 16-POINT MODULE AN FFT PROCESSOR BASED ON 6-POINT MODULE Weidong Li, Mark Vesterbacka and Lars Wanhammar Electronics Systems, Dept. of EE., Linköping University SE-58 8 LINKÖPING, SWEDEN E-mail: {weidongl, markv, larsw}@isy.liu.se,

More information

A 4096-Point Radix-4 Memory-Based FFT Using DSP Slices

A 4096-Point Radix-4 Memory-Based FFT Using DSP Slices A 4096-Point Radix-4 Memory-Based FFT Using DSP Slices Mario Garrido Gálvez, Miguel Angel Sanchez, Maria Luisa Lopez-Vallejo and Jesus Grajal Journal Article N.B.: When citing this work, cite the original

More information

Design & Development of IP-core of FFT for Field Programmable Gate Arrays Bhawesh Sahu ME Reserch Scholar,sem(IV),

Design & Development of IP-core of FFT for Field Programmable Gate Arrays Bhawesh Sahu ME Reserch Scholar,sem(IV), Design & Development of IP-core of FFT for Field Programmable Gate Arrays Bhawesh Sahu ME Reserch Scholar,sem(IV), VLSI design, SSTC,SSGI(FET),Bhilai, Anil Kumar Sahu Assistant Professor,SSGI(FET),Bhlai

More information

Fused Floating Point Arithmetic Unit for Radix 2 FFT Implementation

Fused Floating Point Arithmetic Unit for Radix 2 FFT Implementation IOSR Journal of VLSI and Signal Processing (IOSR-JVSP) Volume 6, Issue 2, Ver. I (Mar. -Apr. 2016), PP 58-65 e-issn: 2319 4200, p-issn No. : 2319 4197 www.iosrjournals.org Fused Floating Point Arithmetic

More information

A DCT Architecture based on Complex Residue Number Systems

A DCT Architecture based on Complex Residue Number Systems A DCT Architecture based on Complex Residue Number Systems J. RAMÍREZ (), A. GARCÍA (), P. G. FERNÁNDEZ (3), L. PARRILLA (), A. LLORIS () () Dept. of Electronics and () Dept. of Computer Sciences (3) Dept.

More information

DESIGN OF PARALLEL PIPELINED FEED FORWARD ARCHITECTURE FOR ZERO FREQUENCY & MINIMUM COMPUTATION (ZMC) ALGORITHM OF FFT

DESIGN OF PARALLEL PIPELINED FEED FORWARD ARCHITECTURE FOR ZERO FREQUENCY & MINIMUM COMPUTATION (ZMC) ALGORITHM OF FFT IMPACT: International Journal of Research in Engineering & Technology (IMPACT: IJRET) ISSN(E): 2321-8843; ISSN(P): 2347-4599 Vol. 2, Issue 4, Apr 2014, 199-206 Impact Journals DESIGN OF PARALLEL PIPELINED

More information

FAST FOURIER TRANSFORM (FFT) and inverse fast

FAST FOURIER TRANSFORM (FFT) and inverse fast IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 39, NO. 11, NOVEMBER 2004 2005 A Dynamic Scaling FFT Processor for DVB-T Applications Yu-Wei Lin, Hsuan-Yu Liu, and Chen-Yi Lee Abstract This paper presents an

More information

LOW-POWER SPLIT-RADIX FFT PROCESSORS

LOW-POWER SPLIT-RADIX FFT PROCESSORS LOW-POWER SPLIT-RADIX FFT PROCESSORS Avinash 1, Manjunath Managuli 2, Suresh Babu D 3 ABSTRACT To design a split radix fast Fourier transform is an ideal person for the implementing of a low-power FFT

More information

IMPLEMENTATION OF FAST FOURIER TRANSFORM USING VERILOG HDL

IMPLEMENTATION OF FAST FOURIER TRANSFORM USING VERILOG HDL IMPLEMENTATION OF FAST FOURIER TRANSFORM USING VERILOG HDL 1 ANUP TIWARI, 2 SAMIR KUMAR PANDEY 1 Department of ECE, Jharkhand Rai University,Ranchi, Jharkhand, India 2 Department of Mathematical Sciences,

More information

An FPGA Implementation of the Powering Function with Single Precision Floating-Point Arithmetic

An FPGA Implementation of the Powering Function with Single Precision Floating-Point Arithmetic An FPGA Implementation of the Powering Function with Single Precision Floating-Point Arithmetic Pedro Echeverría, Marisa López-Vallejo Department of Electronic Engineering, Universidad Politécnica de Madrid

More information

LogiCORE IP Fast Fourier Transform v7.1

LogiCORE IP Fast Fourier Transform v7.1 LogiCORE IP Fast Fourier Transform v7.1 DS260 March 1, 2011 Introduction The Xilinx LogiCORE IP Fast Fourier Transform (FFT) implements the Cooley-Tukey FFT algorithm, a computationally efficient method

More information

FAST Fourier transform (FFT) is an important signal processing

FAST Fourier transform (FFT) is an important signal processing IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I: REGULAR PAPERS, VOL. 54, NO. 4, APRIL 2007 889 Balanced Binary-Tree Decomposition for Area-Efficient Pipelined FFT Processing Hyun-Yong Lee, Student Member,

More information

REAL TIME DIGITAL SIGNAL PROCESSING

REAL TIME DIGITAL SIGNAL PROCESSING REAL TIME DIGITAL SIGAL PROCESSIG UT-FRBA www.electron.frba.utn.edu.ar/dplab UT-FRBA Frequency Analysis Fast Fourier Transform (FFT) Fast Fourier Transform DFT: complex multiplications (-) complex aditions

More information

A Normal I/O Order Radix-2 FFT Architecture to Process Twin Data Streams for MIMO

A Normal I/O Order Radix-2 FFT Architecture to Process Twin Data Streams for MIMO 2402 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 24, NO. 6, JUNE 2016 A Normal I/O Order Radix-2 FFT Architecture to Process Twin Data Streams for MIMO Antony Xavier Glittas,

More information

Pipelined Quadratic Equation based Novel Multiplication Method for Cryptographic Applications

Pipelined Quadratic Equation based Novel Multiplication Method for Cryptographic Applications , Vol 7(4S), 34 39, April 204 ISSN (Print): 0974-6846 ISSN (Online) : 0974-5645 Pipelined Quadratic Equation based Novel Multiplication Method for Cryptographic Applications B. Vignesh *, K. P. Sridhar

More information

DUE to the high computational complexity and real-time

DUE to the high computational complexity and real-time IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 15, NO. 3, MARCH 2005 445 A Memory-Efficient Realization of Cyclic Convolution and Its Application to Discrete Cosine Transform Hun-Chen

More information

Reconfigurable Fast Fourier Transform Architecture for Orthogonal Frequency Division Multiplexing Systems

Reconfigurable Fast Fourier Transform Architecture for Orthogonal Frequency Division Multiplexing Systems Reconfigurable Fast Fourier Transform Architecture for Orthogonal Frequency Division Multiplexing Systems Konstantinos E. MAOLOPOULOS, Konstantinos G. AKOS, Dionysios I. REISIS and iolaos G. VLASSOPOULOS

More information

Keywords: Fast Fourier Transforms (FFT), Multipath Delay Commutator (MDC), Pipelined Architecture, Radix-2 k, VLSI.

Keywords: Fast Fourier Transforms (FFT), Multipath Delay Commutator (MDC), Pipelined Architecture, Radix-2 k, VLSI. ww.semargroup.org www.ijvdcs.org ISSN 2322-0929 Vol.02, Issue.05, August-2014, Pages:0294-0298 Radix-2 k Feed Forward FFT Architectures K.KIRAN KUMAR 1, M.MADHU BABU 2 1 PG Scholar, Dept of VLSI & ES,

More information

RECENTLY, researches on gigabit wireless personal area

RECENTLY, researches on gigabit wireless personal area 146 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 55, NO. 2, FEBRUARY 2008 An Indexed-Scaling Pipelined FFT Processor for OFDM-Based WPAN Applications Yuan Chen, Student Member, IEEE,

More information

An Area Efficient Mixed Decimation MDF Architecture for Radix. Parallel FFT

An Area Efficient Mixed Decimation MDF Architecture for Radix. Parallel FFT An Area Efficient Mixed Decimation MDF Architecture for Radix Parallel FFT Reshma K J 1, Prof. Ebin M Manuel 2 1M-Tech, Dept. of ECE Engineering, Government Engineering College, Idukki, Kerala, India 2Professor,

More information

Twiddle Factor Transformation for Pipelined FFT Processing

Twiddle Factor Transformation for Pipelined FFT Processing Twiddle Factor Transformation for Pipelined FFT Processing In-Cheol Park, WonHee Son, and Ji-Hoon Kim School of EECS, Korea Advanced Institute of Science and Technology, Daejeon, Korea icpark@ee.kaist.ac.kr,

More information

Image Compression System on an FPGA

Image Compression System on an FPGA Image Compression System on an FPGA Group 1 Megan Fuller, Ezzeldin Hamed 6.375 Contents 1 Objective 2 2 Background 2 2.1 The DFT........................................ 3 2.2 The DCT........................................

More information

FPGA Implementation of Multiplier for Floating- Point Numbers Based on IEEE Standard

FPGA Implementation of Multiplier for Floating- Point Numbers Based on IEEE Standard FPGA Implementation of Multiplier for Floating- Point Numbers Based on IEEE 754-2008 Standard M. Shyamsi, M. I. Ibrahimy, S. M. A. Motakabber and M. R. Ahsan Dept. of Electrical and Computer Engineering

More information

FFT MegaCore Function User Guide

FFT MegaCore Function User Guide FFT MegaCore Function User Guide 101 Innovation Drive San Jose, CA 95134 www.altera.com MegaCore Version: 8.0 Document Date: May 2008 Copyright 2008 Altera Corporation. All rights reserved. Altera, The

More information

Verilog for High Performance

Verilog for High Performance Verilog for High Performance Course Description This course provides all necessary theoretical and practical know-how to write synthesizable HDL code through Verilog standard language. The course goes

More information

Modified Welch Power Spectral Density Computation with Fast Fourier Transform

Modified Welch Power Spectral Density Computation with Fast Fourier Transform Modified Welch Power Spectral Density Computation with Fast Fourier Transform Sreelekha S 1, Sabi S 2 1 Department of Electronics and Communication, Sree Budha College of Engineering, Kerala, India 2 Professor,

More information

On-Chip Implementation of Pipeline Digit- Slicing Multiplier-Less Butterfly for Fast Fourier Transform Architecture

On-Chip Implementation of Pipeline Digit- Slicing Multiplier-Less Butterfly for Fast Fourier Transform Architecture Neonode Inc From the SelectedWors of Dr. Rozita Teymourzadeh, CEng. 010 On-Chip Implementation of Pipeline Digit- Slicing Multiplier-Less Butterfly for Fast Fourier Transform Architecture Yazan Samir Rozita

More information

An efficient multiplierless approximation of the fast Fourier transform using sum-of-powers-of-two (SOPOT) coefficients

An efficient multiplierless approximation of the fast Fourier transform using sum-of-powers-of-two (SOPOT) coefficients Title An efficient multiplierless approximation of the fast Fourier transm using sum-of-powers-of-two (SOPOT) coefficients Author(s) Chan, SC; Yiu, PM Citation Ieee Signal Processing Letters, 2002, v.

More information

Parallel-computing approach for FFT implementation on digital signal processor (DSP)

Parallel-computing approach for FFT implementation on digital signal processor (DSP) Parallel-computing approach for FFT implementation on digital signal processor (DSP) Yi-Pin Hsu and Shin-Yu Lin Abstract An efficient parallel form in digital signal processor can improve the algorithm

More information

IMPLEMENTATION OF DOUBLE PRECISION FLOATING POINT RADIX-2 FFT USING VHDL

IMPLEMENTATION OF DOUBLE PRECISION FLOATING POINT RADIX-2 FFT USING VHDL IMPLEMENTATION OF DOUBLE PRECISION FLOATING POINT RADIX-2 FFT USING VHDL Tharanidevi.B 1, Jayaprakash.R 2 Assistant Professor, Dept. of ECE, Bharathiyar Institute of Engineering for Woman, Salem, TamilNadu,

More information

A scalable, fixed-shuffling, parallel FFT butterfly processing architecture for SDR environment

A scalable, fixed-shuffling, parallel FFT butterfly processing architecture for SDR environment LETTER IEICE Electronics Express, Vol.11, No.2, 1 9 A scalable, fixed-shuffling, parallel FFT butterfly processing architecture for SDR environment Ting Chen a), Hengzhu Liu, and Botao Zhang College of

More information

FPGA Implementation of a High Speed Multistage Pipelined Adder Based CORDIC Structure for Large Operand Word Lengths

FPGA Implementation of a High Speed Multistage Pipelined Adder Based CORDIC Structure for Large Operand Word Lengths International Journal of Computer Science and Telecommunications [Volume 3, Issue 5, May 2012] 105 ISSN 2047-3338 FPGA Implementation of a High Speed Multistage Pipelined Adder Based CORDIC Structure for

More information

4DM4 Lab. #1 A: Introduction to VHDL and FPGAs B: An Unbuffered Crossbar Switch (posted Thursday, Sept 19, 2013)

4DM4 Lab. #1 A: Introduction to VHDL and FPGAs B: An Unbuffered Crossbar Switch (posted Thursday, Sept 19, 2013) 1 4DM4 Lab. #1 A: Introduction to VHDL and FPGAs B: An Unbuffered Crossbar Switch (posted Thursday, Sept 19, 2013) Lab #1: ITB Room 157, Thurs. and Fridays, 2:30-5:20, EOW Demos to TA: Thurs, Fri, Sept.

More information

FPGA Design Challenge :Techkriti 14 Digital Design using Verilog Part 2

FPGA Design Challenge :Techkriti 14 Digital Design using Verilog Part 2 FPGA Design Challenge :Techkriti 14 Digital Design using Verilog Part 2 Anurag Dwivedi Recap Verilog- Hardware Description Language Modules Combinational circuits assign statement Control statements Sequential

More information

THE orthogonal frequency-division multiplex (OFDM)

THE orthogonal frequency-division multiplex (OFDM) 26 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 57, NO. 1, JANUARY 2010 A Generalized Mixed-Radix Algorithm for Memory-Based FFT Processors Chen-Fong Hsiao, Yuan Chen, Member, IEEE,

More information

System Verification of Hardware Optimization Based on Edge Detection

System Verification of Hardware Optimization Based on Edge Detection Circuits and Systems, 2013, 4, 293-298 http://dx.doi.org/10.4236/cs.2013.43040 Published Online July 2013 (http://www.scirp.org/journal/cs) System Verification of Hardware Optimization Based on Edge Detection

More information

White Paper. Floating-Point FFT Processor (IEEE 754 Single Precision) Radix 2 Core. Introduction. Parameters & Ports

White Paper. Floating-Point FFT Processor (IEEE 754 Single Precision) Radix 2 Core. Introduction. Parameters & Ports White Paper Introduction Floating-Point FFT Processor (IEEE 754 Single Precision) Radix 2 Core The floating-point fast fourier transform (FFT) processor calculates FFTs with IEEE 754 single precision (1

More information

Performance Analysis of CORDIC Architectures Targeted by FPGA Devices

Performance Analysis of CORDIC Architectures Targeted by FPGA Devices International OPEN ACCESS Journal Of Modern Engineering Research (IJMER) Performance Analysis of CORDIC Architectures Targeted by FPGA Devices Guddeti Nagarjuna Reddy 1, R.Jayalakshmi 2, Dr.K.Umapathy

More information

VLSI Implementation of Low Power Area Efficient FIR Digital Filter Structures Shaila Khan 1 Uma Sharma 2

VLSI Implementation of Low Power Area Efficient FIR Digital Filter Structures Shaila Khan 1 Uma Sharma 2 IJSRD - International Journal for Scientific Research & Development Vol. 3, Issue 05, 2015 ISSN (online): 2321-0613 VLSI Implementation of Low Power Area Efficient FIR Digital Filter Structures Shaila

More information

An Efficient Multi Precision Floating Point Complex Multiplier Unit in FFT

An Efficient Multi Precision Floating Point Complex Multiplier Unit in FFT An Efficient Multi Precision Floating Point Complex Multiplier Unit in FFT Mrs. Yamini Gayathri T Assistant Professor, ACS College of Engineering, Department of ECE, Bangalore-560074, India Abstract- Discrete

More information

RISC IMPLEMENTATION OF OPTIMAL PROGRAMMABLE DIGITAL IIR FILTER

RISC IMPLEMENTATION OF OPTIMAL PROGRAMMABLE DIGITAL IIR FILTER RISC IMPLEMENTATION OF OPTIMAL PROGRAMMABLE DIGITAL IIR FILTER Miss. Sushma kumari IES COLLEGE OF ENGINEERING, BHOPAL MADHYA PRADESH Mr. Ashish Raghuwanshi(Assist. Prof.) IES COLLEGE OF ENGINEERING, BHOPAL

More information

High Throughput Energy Efficient Parallel FFT Architecture on FPGAs

High Throughput Energy Efficient Parallel FFT Architecture on FPGAs High Throughput Energy Efficient Parallel FFT Architecture on FPGAs Ren Chen Ming Hsieh Department of Electrical Engineering University of Southern California Los Angeles, USA 989 Email: renchen@usc.edu

More information

Design Optimization Techniques Evaluation for High Performance Parallel FIR Filters in FPGA

Design Optimization Techniques Evaluation for High Performance Parallel FIR Filters in FPGA Design Optimization Techniques Evaluation for High Performance Parallel FIR Filters in FPGA Vagner S. Rosa Inst. Informatics - Univ. Fed. Rio Grande do Sul Porto Alegre, RS Brazil vsrosa@inf.ufrgs.br Eduardo

More information

FFT. There are many ways to decompose an FFT [Rabiner and Gold] The simplest ones are radix-2 Computation made up of radix-2 butterflies X = A + BW

FFT. There are many ways to decompose an FFT [Rabiner and Gold] The simplest ones are radix-2 Computation made up of radix-2 butterflies X = A + BW FFT There are many ways to decompose an FFT [Rabiner and Gold] The simplest ones are radix-2 Computation made up of radix-2 butterflies A X = A + BW B Y = A BW B. Baas 442 FFT Dataflow Diagram Dataflow

More information

FPGA Implementation of IP-core of FFT Block for DSP Applications

FPGA Implementation of IP-core of FFT Block for DSP Applications IJISET - International Journal of Innovative Science, Engineering & Technology, Vol. 1 Issue 10, December 2014. FPGA Implementation of IP-core of FFT Block for DSP Applications Bhawesh Sahu 1,Anil Sahu

More information

Design of Feature Extraction Circuit for Speech Recognition Applications

Design of Feature Extraction Circuit for Speech Recognition Applications Design of Feature Extraction Circuit for Speech Recognition Applications SaambhaviVB, SSSPRao and PRajalakshmi Indian Institute of Technology Hyderabad Email: ee10m09@iithacin Email: sssprao@cmcltdcom

More information

Implementation of Floating Point Multiplier Using Dadda Algorithm

Implementation of Floating Point Multiplier Using Dadda Algorithm Implementation of Floating Point Multiplier Using Dadda Algorithm Abstract: Floating point multiplication is the most usefull in all the computation application like in Arithematic operation, DSP application.

More information

Novel design of multiplier-less FFT processors

Novel design of multiplier-less FFT processors Signal Processing 8 (00) 140 140 www.elsevier.com/locate/sigpro Novel design of multiplier-less FFT processors Yuan Zhou, J.M. Noras, S.J. Shepherd School of EDT, University of Bradford, Bradford, West

More information

STUDY OF A CORDIC BASED RADIX-4 FFT PROCESSOR

STUDY OF A CORDIC BASED RADIX-4 FFT PROCESSOR STUDY OF A CORDIC BASED RADIX-4 FFT PROCESSOR 1 AJAY S. PADEKAR, 2 S. S. BELSARE 1 BVDU, College of Engineering, Pune, India 2 Department of E & TC, BVDU, College of Engineering, Pune, India E-mail: ajay.padekar@gmail.com,

More information

TOPICS PIPELINE IMPLEMENTATIONS OF THE FAST FOURIER TRANSFORM (FFT) DISCRETE FOURIER TRANSFORM (DFT) INVERSE DFT (IDFT) Consulted work:

TOPICS PIPELINE IMPLEMENTATIONS OF THE FAST FOURIER TRANSFORM (FFT) DISCRETE FOURIER TRANSFORM (DFT) INVERSE DFT (IDFT) Consulted work: 1 PIPELINE IMPLEMENTATIONS OF THE FAST FOURIER TRANSFORM (FFT) Consulted work: Chiueh, T.D. and P.Y. Tsai, OFDM Baseband Receiver Design for Wireless Communications, John Wiley and Sons Asia, (2007). Second

More information

DSP Co-Processing in FPGAs: Embedding High-Performance, Low-Cost DSP Functions

DSP Co-Processing in FPGAs: Embedding High-Performance, Low-Cost DSP Functions White Paper: Spartan-3 FPGAs WP212 (v1.0) March 18, 2004 DSP Co-Processing in FPGAs: Embedding High-Performance, Low-Cost DSP Functions By: Steve Zack, Signal Processing Engineer Suhel Dhanani, Senior

More information

FPGA Matrix Multiplier

FPGA Matrix Multiplier FPGA Matrix Multiplier In Hwan Baek Henri Samueli School of Engineering and Applied Science University of California Los Angeles Los Angeles, California Email: chris.inhwan.baek@gmail.com David Boeck Henri

More information

Radix-4 FFT Algorithms *

Radix-4 FFT Algorithms * OpenStax-CNX module: m107 1 Radix-4 FFT Algorithms * Douglas L Jones This work is produced by OpenStax-CNX and licensed under the Creative Commons Attribution License 10 The radix-4 decimation-in-time

More information

Fast Block LMS Adaptive Filter Using DA Technique for High Performance in FGPA

Fast Block LMS Adaptive Filter Using DA Technique for High Performance in FGPA Fast Block LMS Adaptive Filter Using DA Technique for High Performance in FGPA Nagaraj Gowd H 1, K.Santha 2, I.V.Rameswar Reddy 3 1, 2, 3 Dept. Of ECE, AVR & SVR Engineering College, Kurnool, A.P, India

More information

Efficient Radix-4 and Radix-8 Butterfly Elements

Efficient Radix-4 and Radix-8 Butterfly Elements Efficient Radix4 and Radix8 Butterfly Elements Weidong Li and Lars Wanhammar Electronics Systems, Department of Electrical Engineering Linköping University, SE581 83 Linköping, Sweden Tel.: +46 13 28 {1721,

More information

On Designs of Radix Converters Using Arithmetic Decompositions

On Designs of Radix Converters Using Arithmetic Decompositions On Designs of Radix Converters Using Arithmetic Decompositions Yukihiro Iguchi 1 Tsutomu Sasao Munehiro Matsuura 1 Dept. of Computer Science, Meiji University, Kawasaki 1-51, Japan Dept. of Computer Science

More information

Speed Optimised CORDIC Based Fast Algorithm for DCT

Speed Optimised CORDIC Based Fast Algorithm for DCT GRD Journals Global Research and Development Journal for Engineering International Conference on Innovations in Engineering and Technology (ICIET) - 2016 July 2016 e-issn: 2455-5703 Speed Optimised CORDIC

More information

ELEC 427 Final Project Area-Efficient FFT on FPGA

ELEC 427 Final Project Area-Efficient FFT on FPGA ELEC 427 Final Project Area-Efficient FFT on FPGA Hamed Rahmani-Mohammad Sadegh Riazi- Seyyed Mohammad Kazempour Introduction The aim of this project was to design a 16 point Discrete Time Fourier Transform

More information

Design of FPGA Based Radix 4 FFT Processor using CORDIC

Design of FPGA Based Radix 4 FFT Processor using CORDIC Design of FPGA Based Radix 4 FFT Processor using CORDIC Chetan Korde 1, Dr. P. Malathi 2, Sudhir N. Shelke 3, Dr. Manish Sharma 4 1,2,4 Department of Electronics and Telecommunication Engineering, DYPCOE,

More information

2 Assoc Prof, Dept of ECE, RGM College of Engineering & Technology, Nandyal, AP-India,

2 Assoc Prof, Dept of ECE, RGM College of Engineering & Technology, Nandyal, AP-India, ISSN 2319-8885 Vol.03,Issue.27 September-2014, Pages:5486-5491 www.ijsetr.com MDC FFT/IFFT Processor with 64-Point using Radix-4 Algorithm for MIMO-OFDM System VARUN REDDY.P 1, M.RAMANA REDDY 2 1 PG Scholar,

More information

Parallel FIR Filters. Chapter 5

Parallel FIR Filters. Chapter 5 Chapter 5 Parallel FIR Filters This chapter describes the implementation of high-performance, parallel, full-precision FIR filters using the DSP48 slice in a Virtex-4 device. ecause the Virtex-4 architecture

More information

MULTIPLIERLESS HIGH PERFORMANCE FFT COMPUTATION

MULTIPLIERLESS HIGH PERFORMANCE FFT COMPUTATION MULTIPLIERLESS HIGH PERFORMANCE FFT COMPUTATION Maheshwari.U 1, Josephine Sugan Priya. 2, 1 PG Student, Dept Of Communication Systems Engg, Idhaya Engg. College For Women, 2 Asst Prof, Dept Of Communication

More information

Multi-Gigahertz Parallel FFTs for FPGA and ASIC Implementation

Multi-Gigahertz Parallel FFTs for FPGA and ASIC Implementation Multi-Gigahertz Parallel FFTs for FPGA and ASIC Implementation Doug Johnson, Applications Consultant Chris Eddington, Technical Marketing Synopsys 2013 1 Synopsys, Inc. 700 E. Middlefield Road Mountain

More information

CHAPTER 4. DIGITAL DOWNCONVERTER FOR WiMAX SYSTEM

CHAPTER 4. DIGITAL DOWNCONVERTER FOR WiMAX SYSTEM CHAPTER 4 IMPLEMENTATION OF DIGITAL UPCONVERTER AND DIGITAL DOWNCONVERTER FOR WiMAX SYSTEM 4.1 Introduction FPGAs provide an ideal implementation platform for developing broadband wireless systems such

More information

FPGA Polyphase Filter Bank Study & Implementation

FPGA Polyphase Filter Bank Study & Implementation FPGA Polyphase Filter Bank Study & Implementation Raghu Rao Matthieu Tisserand Mike Severa Prof. John Villasenor Image Communications/. Electrical Engineering Dept. UCLA 1 Introduction This document describes

More information

DIGITAL VS. ANALOG SIGNAL PROCESSING Digital signal processing (DSP) characterized by: OUTLINE APPLICATIONS OF DIGITAL SIGNAL PROCESSING

DIGITAL VS. ANALOG SIGNAL PROCESSING Digital signal processing (DSP) characterized by: OUTLINE APPLICATIONS OF DIGITAL SIGNAL PROCESSING 1 DSP applications DSP platforms The synthesis problem Models of computation OUTLINE 2 DIGITAL VS. ANALOG SIGNAL PROCESSING Digital signal processing (DSP) characterized by: Time-discrete representation

More information

Latest Innovation For FFT implementation using RCBNS

Latest Innovation For FFT implementation using RCBNS Latest Innovation For FFT implementation using SADAF SAEED, USMAN ALI, SHAHID A. KHAN Department of Electrical Engineering COMSATS Institute of Information Technology, Abbottabad (Pakistan) Abstract: -

More information