Implementation of FFT Processor using Urdhva Tiryakbhyam Sutra of Vedic Mathematics

Similar documents
FPGA Implementation of FFT Processor in Xilinx

Low Power and Memory Efficient FFT Architecture Using Modified CORDIC Algorithm

Analysis of Radix- SDF Pipeline FFT Architecture in VLSI Using Chip Scope

FPGA Based Design and Simulation of 32- Point FFT Through Radix-2 DIT Algorith

An Efficient Design of Vedic Multiplier using New Encoding Scheme

Low-Power Split-Radix FFT Processors Using Radix-2 Butterfly Units

FPGA IMPLEMENTATION OF DFT PROCESSOR USING VEDIC MULTIPLIER. Amrita School of Engineering, Coimbatore, Amrita Vishwa Vidyapeetham, India

High Performance Pipelined Design for FFT Processor based on FPGA

STUDY OF A CORDIC BASED RADIX-4 FFT PROCESSOR

Fused Floating Point Arithmetic Unit for Radix 2 FFT Implementation

Research Article Design of A Novel 8-point Modified R2MDC with Pipelined Technique for High Speed OFDM Applications

An Area Efficient Mixed Decimation MDF Architecture for Radix. Parallel FFT

Keywords: Fast Fourier Transforms (FFT), Multipath Delay Commutator (MDC), Pipelined Architecture, Radix-2 k, VLSI.

International Journal of Innovative and Emerging Research in Engineering. e-issn: p-issn:

DESIGN METHODOLOGY. 5.1 General

A Normal I/O Order Radix-2 FFT Architecture to Process Twin Data Streams for MIMO

Area-Time Efficient Square Architecture

Abstract. Literature Survey. Introduction. A.Radix-2/8 FFT algorithm for length qx2 m DFTs

VHDL IMPLEMENTATION OF FLOATING POINT MULTIPLIER USING VEDIC MATHEMATICS

Design of High Speed Area Efficient IEEE754 Floating Point Multiplier

Review on 32-Bit IEEE 754 Complex Number Multiplier Based on FFT Architecture using BOOTH Algorithm

MULTIPLIERLESS HIGH PERFORMANCE FFT COMPUTATION

Fig.1. Floating point number representation of single-precision (32-bit). Floating point number representation in double-precision (64-bit) format:

FPGA Implementation of Discrete Fourier Transform Using CORDIC Algorithm

VHDL IMPLEMENTATION OF A FLEXIBLE AND SYNTHESIZABLE FFT PROCESSOR

Design of Delay Efficient Distributed Arithmetic Based Split Radix FFT

Pipelined Quadratic Equation based Novel Multiplication Method for Cryptographic Applications

AN FFT PROCESSOR BASED ON 16-POINT MODULE

FAST FOURIER TRANSFORM (FFT) and inverse fast

VLSI IMPLEMENTATION AND PERFORMANCE ANALYSIS OF EFFICIENT MIXED-RADIX 8-2 FFT ALGORITHM WITH BIT REVERSAL FOR THE OUTPUT SEQUENCES.

Design of FPGA Based Radix 4 FFT Processor using CORDIC

Hemraj Sharma 1, Abhilasha 2

FPGA Implementation of Low-Area Floating Point Multiplier Using Vedic Mathematics

A SIMULINK-TO-FPGA MULTI-RATE HIERARCHICAL FIR FILTER DESIGN

FPGA IMPLEMENTATION OF HIGH SPEED DCT COMPUTATION OF JPEG USING VEDIC MULTIPLIER

TOPICS PIPELINE IMPLEMENTATIONS OF THE FAST FOURIER TRANSFORM (FFT) DISCRETE FOURIER TRANSFORM (DFT) INVERSE DFT (IDFT) Consulted work:

DESIGN OF PARALLEL PIPELINED FEED FORWARD ARCHITECTURE FOR ZERO FREQUENCY & MINIMUM COMPUTATION (ZMC) ALGORITHM OF FFT

Implementation of Efficient Modified Booth Recoder for Fused Sum-Product Operator

Design of Vedic Multiplier for Digital Signal Processing Applications R.Naresh Naik 1, P.Siva Nagendra Reddy 2, K. Madan Mohan 3

Vedic Mathematics Based Floating Point Multiplier Implementation for 24 Bit FFT Computation

A Novel OFDM using Radix 22 and FFT Algorithm

Research Article International Journal of Emerging Research in Management &Technology ISSN: (Volume-6, Issue-8) Abstract:

Design and Performance Analysis of 32 and 64 Point FFT using Multiple Radix Algorithms

An Efficient High Speed VLSI Architecture Based 16-Point Adaptive Split Radix-2 FFT Architecture

THE orthogonal frequency-division multiplex (OFDM)

Design and Simulation of 32 bit Floating Point FFT Processor Using VHDL

VLSI Implementation of Low Power Area Efficient FIR Digital Filter Structures Shaila Khan 1 Uma Sharma 2

High speed DCT design using Vedic mathematics N.J.R. Muniraj 1 and N.Senathipathi 2

Low Power Floating-Point Multiplier Based On Vedic Mathematics

Parallel-computing approach for FFT implementation on digital signal processor (DSP)

CORDIC Based FFT for Signal Processing System

IMPLEMENTATION OF FAST FOURIER TRANSFORM USING VERILOG HDL

FPGA Design, Implementation and Analysis of Trigonometric Generators using Radix 4 CORDIC Algorithm

An Efficient Multi Precision Floating Point Complex Multiplier Unit in FFT

Run-Time Reconfigurable multi-precision floating point multiplier design based on pipelining technique using Karatsuba-Urdhva algorithms

Design and Implementation of 3-D DWT for Video Processing Applications

Design of 2-D DWT VLSI Architecture for Image Processing

FPGA Implementation of 16-Point Radix-4 Complex FFT Core Using NEDA

Design of a Multiplier Architecture Based on LUT and VHBCSE Algorithm For FIR Filter

Design And Simulation Of Pipelined Radix-2 k Feed-Forward FFT Architectures

A BSD ADDER REPRESENTATION USING FFT BUTTERFLY ARCHITECHTURE

Floating-Point Butterfly Architecture Based on Binary Signed-Digit Representation

ISSN Vol.02, Issue.11, December-2014, Pages:

Low Power Complex Multiplier based FFT Processor

ISSN Vol.02, Issue.11, December-2014, Pages:

High Speed Radix 8 CORDIC Processor

Efficient Radix-4 and Radix-8 Butterfly Elements

: : (91-44) (Office) (91-44) (Residence)

Efficient Methods for FFT calculations Using Memory Reduction Techniques.

Implimentation of A 16-bit RISC Processor for Convolution Application

Speed Optimised CORDIC Based Fast Algorithm for DCT

We are IntechOpen, the world s leading publisher of Open Access books Built by scientists, for scientists. International authors and editors

Implementation of Area Efficient Multiplexer Based Cordic

A High-Speed FPGA Implementation of an RSD- Based ECC Processor

REALIZATION OF MULTIPLE- OPERAND ADDER-SUBTRACTOR BASED ON VEDIC MATHEMATICS

IMPLEMENTATION OF AN ADAPTIVE FIR FILTER USING HIGH SPEED DISTRIBUTED ARITHMETIC

Novel design of multiplier-less FFT processors

New Integer-FFT Multiplication Architectures and Implementations for Accelerating Fully Homomorphic Encryption

A scalable, fixed-shuffling, parallel FFT butterfly processing architecture for SDR environment

LOW-POWER SPLIT-RADIX FFT PROCESSORS

FAST Fourier transform (FFT) is an important signal processing

Twiddle Factor Transformation for Pipelined FFT Processing

CORDIC Based DFT on FPGA for DSP Applications

The Serial Commutator FFT

RECENTLY, researches on gigabit wireless personal area

IMPLEMENTATION OF DOUBLE PRECISION FLOATING POINT RADIX-2 FFT USING VHDL

International Journal of Advance Engineering and Research Development

Reconfigurable FFT Processor A Broader Perspective Survey

HIGH-PERFORMANCE RECONFIGURABLE FIR FILTER USING PIPELINE TECHNIQUE

Area And Power Efficient LMS Adaptive Filter With Low Adaptation Delay

DUE to the high computational complexity and real-time

Pipelined High Speed Double Precision Floating Point Multiplier Using Dadda Algorithm Based on FPGA

OPTIMIZING THE POWER USING FUSED ADD MULTIPLIER

Performance Analysis of CORDIC Architectures Targeted by FPGA Devices

IMPLEMENTATION OF OPTIMIZED 128-POINT PIPELINE FFT PROCESSOR USING MIXED RADIX 4-2 FOR OFDM APPLICATIONS

High Throughput Energy Efficient Parallel FFT Architecture on FPGAs

International Research Journal of Engineering and Technology (IRJET) e-issn:

FFT/IFFTProcessor IP Core Datasheet

Implementation of Two Level DWT VLSI Architecture

FPGA Implementation of a High Speed Multistage Pipelined Adder Based CORDIC Structure for Large Operand Word Lengths

Transcription:

Implementation of FFT Processor using Urdhva Tiryakbhyam Sutra of Vedic Mathematics Yojana Jadhav 1, A.P. Hatkar 2 PG Student [VLSI & Embedded system], Dept. of ECE, S.V.I.T Engineering College, Chincholi, Nashik, Maharashtra, India 1 Professor, Dept. of ECE, S.V.I.T Engineering College, Chincholi,Nashik, Maharashtra, India 2 ABSTRACT:The FFT is a faster version of the Discrete Fourier Transform (DFT) and calculates Discrete Fourier Transform efficiently by reducing the computational complexity. It has low power consumption. It also reduces the area hence making the processor area-efficient.fast Fourier Transform is an essential data processing technique in communication systems and DSP systems. In this brief, we propose high speed and area efficient 64 point FFT processor using Vedic algorithm. To reduce computational complexity and area, we develop FFT architecture by devising a radix-4 algorithm and optimizing the realization by Vedic algorithm. Furthermore, it can be used in decimation in frequency (DIF) and decimation in time (DIT) decompositions. Moreover, the design can achieve very high speed, which makes them suitable for the most demanding applications of FFT. Indeed, the proposed radix-4 Vedic algorithm based architecture requires fewer hardware resources. The synthesis results are same as that of theoretical analysis and it is observed that more than 15% reduction can be achieved in terms of slices count. In addition, the dynamic power consumption can be reduced and speed can be increased by as much as 25% using Vedic algorithm. KEYWORDS:FFT, Vedic algorithm, DSP, radix-4,urdhavatiryakbhyam. I. INTRODUCTION Now a days, one of the important application of FFT is the orthogonal frequency division multiplexing (OFDM)[1]. It is mainly used in wireless local area network (WLAN), digital audio broadcasting (DAB), digital video broadcasting-terrestrial (DVB-T) and digital video broadcasting-handheld (DVB-H). Due to such diverse application of FFT, it is desirable to develop efficient FFT to meet the requirement of various OFDM communication standards [1]. The FFT is a faster version of the Discrete Fourier Transform (DFT) and calculates Discrete Fourier Transform efficiently in our work by reducing the computational complexity. Memory-based architecture is widely adopted to design an FFT processor. It consists of butterfly processing element and memory units. It has low power consumption but long latency and low throughput. To improve the efficiency of memory based FFT architecture, radix-4 butterfly processing units along with dual port memory is adopted. With the advancement of VLSI, Fast Fourier Transform (FFT) is applied to wide field of digital signal processing (DSP) and communication system applications. The butterfly processing unit decides cost and characteristic of FFT processor. A butterfly unit is composed of complex adders and multipliers. The multiplier is usually the speed bottleneck in the design of the FFT processor [2]. If these complex multiplications are implemented using shift and add operation, it results in higher hardware cost and also, limits the performance of FFT[2]. To improve the performance of such complex computation Vedic algorithm is adopted. The benefits of Vedic algorithm for efficient implementation of complex multiplier have been largely unexplored. In recent years, several designs of FFT using Vedic algorithm have been proposed, but with the limited set of parameters. VEDIC mathematics [3] is the ancient Indian system of mathematics which mainly deals with Vedic mathematical formulae and their application to various branches of mathematics. The word Vedic is derived from the word Veda which means the store-house of all knowledge. Vedic mathematics was reconstructed from the ancient Indian scrip- Copyright to IJIRCCE DOI: 10.15680/IJIRCCE.2016. 0403061 3287

tures (Vedas) by Sri BharatiKrisnaTirtha (1884-1960) after his eight years of research on Vedas [3].This paper presents a simple digital multiplier architecture [3] based on the ancient Vedic mathematics Sutra (formula) called Urdhva- Tiryakbhyam(Vertically and Cross wise) Sutra which was traditionally used for decimal system in ancient India. In [3], this Sutra is shown to be a much more efficient multiplication algorithm as compared to the conventional counterparts. Urdhva-Tiryakbhyam Sutra [4] is first applied to the binary number system and is used to develop digital multiplier architecture. This Sutra also shows the effectiveness of reducing the N N multiplier [4] structure into an efficient 4 4 multiplier structures. This work presents a systematic design methodology for fast and area efficient digital multiplier based on Vedic mathematics [4]. In recent years, there has been some research in the design of multi-path pipelined FFT processors that provide a high throughput [5]. The area becomes even larger because the memory modules are duplicated for the 16 data path approach. In order to reduce the area and power consumption, several FFT algorithms and dynamic scaling schemes have been proposed [5]. The radix of the algorithm greatly influences the architecture of the FFT processor and the complexity of the implementation. A small radix is desirable because it results in a simple butterfly. Nevertheless, a high radix reduces the number of twiddle factor multiplications. The radix rk algorithms simultaneously achieve a simple butterfly and a reduced number of twiddle factor multiplications [5]. II. RELATED WORK In order to reduce the area and power consumption, several FFT algorithms and dynamic scaling schemes have been proposed [5]. The radix of the algorithm greatly influences the architecture of the FFT processor and the complexity of the implementation. A small radix is desirable because it results in a simple butterfly.nevertheless, a high radix reduces the number of twiddle factor multiplications. The radix rk algorithms simultaneously achieve a simple butterfly and a reduced number of twiddle factor multiplications [5]. The radix-2 algorithm is a well known simple algorithm for FFT processors, but it requires many complex multipliers. The radix-4 algorithm is primarily used for high data throughput FFT architectures, but requires a 4-point butterfly unit with high complexity. One of the mentioned Sutras in the last subsection, UrdhvaTiryakbhyam (Vertically and Crosswise), deals with the multiplication of numbers. This Sutra has been traditionally used for the multiplication of two numbers in the decimal number system. In this paper, we apply the same idea to the binary number system to make it compatible with the digital hardware. The digits on the two ends of the line are multiplied and the result is added with the previous carry. When there are more lines in one step, all the results are added to the previous carry. The least significant digit of the number thus obtained acts as one of the result digits and the rest act as the carry for the next step. Initially the carry is taken to be zero. Fig. 1. Urdhva-tiryagbyham method of multiplication. Copyright to IJIRCCE DOI: 10.15680/IJIRCCE.2016. 0403061 3288

III.HARDWARE AND SOFTWARE DESIGN A. Address Generation Unit (AGU): The address generator performs memory addressing. The address generation unit controls the address bus going to memory. The FFT processor reads and writes from and to the 8 twin port memory banks concurrently. There are 8 read address buses, and 8 write address buses. They have two signals, Start and Busy. The first one enables the processor to process the data and the second one indicates processing of data inputs [1]. The incoming data samples are partitioned into even and odd samples. After the assertion of the Start signal 64 input samples are clocked into the memory bank. A 6-bit counter controls the serial input of the data in the memory bank. Once all the data inputs are stored in memory bank, a signal Busy is asserted to process the data. AGU also controls the operation of selector unit by generating appropriate signals at a time. Initially, memory has the first 64 complex time samples. The butterfly unit selects four samples simultaneously and processes them and outputs the results in the same memory location. The butterfly unit is then given four more samples from memory banks. This process is repeated 16 times (for a 64-point FFT) until all the inputs in memory banks are depleted. Fig. 2. Architecture of 64-point FFT using Vedic algorithm. B.Radix-4 Butterfly Unit using Vedic Algorithm: To perform 64-point FFT a single 4-point FFT unit is recursively used. This 4-point FFT is designed using high speed radix-4 algorithm. Moreover, the performance of FFT is limited by arithmetic operation such as complex multiplication. Complex multiplication of two numbers requires 4 multipliers, 2 adders and 1 subtractor or 3 multiplier and 5 adders. This large number of multipliers degrades the performance of FFT [2]. Vedic algorithm is an ancient and well known technique for arithmetic operation. The method which is used for multiplications is Urdhvatiryagbhyam which means vertical and crosswise. In this way, this algorithm performs multiplication of two given numbers in vertical and crosswise manner until left with only MSB bits. The proposed FFT utilizes Urdhvatiryagbhyam method of Vedic algorithm to perform complex twiddle factor multiplications. C.Twiddle Factor Generator: The twiddle factors are generated using Vedic algorithm based Coordinate Rotation Digital Computer algorithm (CORDIC). CORDIC is a popular technique for computing trigonometric and exponential functions [2]. It performs rotation of a vector through some angle, specified by its coordinates. To eliminate the need of ROM to store twiddle factor, proposed FFT uses CORDIC to generate various values of twiddle factor. It uses two mode, rotation mode and vectoring mode. In rotation mode, the rotation by -45 degrees and the other by -135 degrees is obtained. D. RAM: Copyright to IJIRCCE DOI: 10.15680/IJIRCCE.2016. 0403061 3289

A memory unit is composed of 8 dual-port memory banks which facilitate 8-way parallel data access [2]. Each memory bank is 8-bit wide. From memory perceptive, in- place memory scheme is adopted so as to minimize the hardware resources and speed up the memory access time. For radix-r FFT, r banks of memory are needed to store data, and each memory bank could be dual-port memory. With "in-place" strategy, the r outputs of the butterfly can be written back to the same memory locations of the r inputs, and replace the old data. E.Commutator: The Commutator is one type of a switch that ensures efficient data routing mechanism. It consists of 8-to-1 MUXs [2]. It performs following functions. Commutator 1 routes the butterfly outputs to appropriate memory banks. These memory bank outputs are routed to proper 4-point butterfly inputs according to the control signals obtained from address generation unit. Commutator 2 is responsible for routing the input data to proper memory bank during FFT input period. It is also responsible for routing the memory output data to proper output. IV. RESULT AND DISCUSSION An efficient 64 point FFT module was designed using Vedic multipliers and modified adder and simulated using VHDL.FFT utilizing modified wallace multiplier and regular carry select adder was also simulated to compare the performances in terms of area and speed. Both the designs were implemented using the Sparten 3E and their synthesis reports were analyzed. Fig 3 shows the simulation of the testbench which was developed to test and verify the 64 point FFT. The values were verified with Matlab output. The 32 bit inputs are entered in both cases and the product and sum is verified respectively. The conclusion proves that our proposed system is better in terms of area and speed compared to the existing FFT processors. For implementation of Braun Multiplier we are using Xilinx Integrated Software Environment (ISE). We are using Xilinx ISE 8.1i Project Navigator. The Xilinx Integrated Software Environment (ISE) consists of a set of programs to capture, simulate and implement digital hardware designs in a FPGA [5]. We use the Xilinx ISE 8.1i Project Navigator GUI for this work to help us visually manage and automate the entire design process with the aid of toolbars, menus or icons. Fig.3 Test Bench waveforms for VedicMultiplier Copyright to IJIRCCE DOI: 10.15680/IJIRCCE.2016. 0403061 3290

TABLE I COMPARISON BETWEEN SYNTHESIS REPORTS OF PROPOSED AND CONVENTIONAL FFT FFT Type Number of Slices Minimum period (ns) Maximum combinational path delay (ns) Proposed Vedic FFT 428 6.670ns 2.655ns Conventional FFT 834 9.718 ns 3.745ns V.CONCLUSION Implementation results prove that the proposed FFT led to 26% area reduction and 25% speed improvement when compared with conventional FFT. This design had a maximum clock frequency of 95.2MHz. Such a 64-point FFT implemented using Vedic concepts and modified adder was found to have a good balance between performance and hardware requirements and is therefore most suitable for use in OFDM. Also, area minimization is obtained by devising an efficient Vedic algorithm based butterfly processing structure, while the novel twiddle factor multiplier has low power consumption and hardware complexity. Synthesis results show that the proposed FFT processor can provide up to 380.51 MHz speed and slices count is 373. The proposed FFT architecture can also be altered to support other longer FFT sizes. The main applications of this proposed FFT are in Digital Signal Processing, OFDM Systems, Digital Image Processing and Communication systems. REFERENCES 1. Nisha John, Prof. Sadanandan G.K, FPGA Implementation of a Novel Efficient Vedic FFT/IFFT Processor For OFDM, International Journal of Advanced Research in Electrical, Electronics and Instrumentation Engineering, Vol. 3, Issue 3, September 2014. 2. More T.V.,Panat A.R. FPGA implementation of FFT using vedic algorithm, Computational intelligene and Computing Research(ICCIC),pp- 1-5,2013. 3. Harpreet Singh Dhillon, AbhijitMitra, A Digital Multiplier Architecture using UrdhvaTiryakbhyam Sutra of Vedic Mathematics,IITG, pp-1-4,2010. 4. A.RonishaPrakash, S.Kirubaveni, Performance Evaluation of FFT Processor Using Conventional and Vedic Algorithm, IEEE International Conference on Emerging Trends in Computing, Communication and Nanotechnology, pp-1-6,2013. 5. AsmitaHaveliya, FPGA implementation of a Vedic convolution algorithm, International Journal of Engineering Research and Applications, Vol. 2, Issue 1, pp.678-684,jan-feb 2012. Copyright to IJIRCCE DOI: 10.15680/IJIRCCE.2016. 0403061 3291