Modified Welch Power Spectral Density Computation with Fast Fourier Transform

Similar documents
Power Spectral Density Computation using Modified Welch Method

Analysis of Radix- SDF Pipeline FFT Architecture in VLSI Using Chip Scope

DESIGN OF PARALLEL PIPELINED FEED FORWARD ARCHITECTURE FOR ZERO FREQUENCY & MINIMUM COMPUTATION (ZMC) ALGORITHM OF FFT

The Serial Commutator FFT

FPGA Based Design and Simulation of 32- Point FFT Through Radix-2 DIT Algorith

Research Article International Journal of Emerging Research in Management &Technology ISSN: (Volume-6, Issue-8) Abstract:

Twiddle Factor Transformation for Pipelined FFT Processing

A Ripple Carry Adder based Low Power Architecture of LMS Adaptive Filter

Parallel-computing approach for FFT implementation on digital signal processor (DSP)

VHDL IMPLEMENTATION OF A FLEXIBLE AND SYNTHESIZABLE FFT PROCESSOR

Low Power and Memory Efficient FFT Architecture Using Modified CORDIC Algorithm

Keywords: Fast Fourier Transforms (FFT), Multipath Delay Commutator (MDC), Pipelined Architecture, Radix-2 k, VLSI.

Fused Floating Point Arithmetic Unit for Radix 2 FFT Implementation

Abstract. Literature Survey. Introduction. A.Radix-2/8 FFT algorithm for length qx2 m DFTs

IMPLEMENTATION OF AN ADAPTIVE FIR FILTER USING HIGH SPEED DISTRIBUTED ARITHMETIC

TOPICS PIPELINE IMPLEMENTATIONS OF THE FAST FOURIER TRANSFORM (FFT) DISCRETE FOURIER TRANSFORM (DFT) INVERSE DFT (IDFT) Consulted work:

An Area Efficient Mixed Decimation MDF Architecture for Radix. Parallel FFT

FAST FOURIER TRANSFORM (FFT) and inverse fast

Linköping University Post Print. Analysis of Twiddle Factor Memory Complexity of Radix-2^i Pipelined FFTs

AN FFT PROCESSOR BASED ON 16-POINT MODULE

VLSI Implementation of Low Power Area Efficient FIR Digital Filter Structures Shaila Khan 1 Uma Sharma 2

Streamlined real-factor FFTs

ISSN Vol.02, Issue.11, December-2014, Pages:

VLSI IMPLEMENTATION AND PERFORMANCE ANALYSIS OF EFFICIENT MIXED-RADIX 8-2 FFT ALGORITHM WITH BIT REVERSAL FOR THE OUTPUT SEQUENCES.

Research Article Design of A Novel 8-point Modified R2MDC with Pipelined Technique for High Speed OFDM Applications

FPGA Implementation of Multiplierless 2D DWT Architecture for Image Compression

Power and Area Efficient Implementation for Parallel FIR Filters Using FFAs and DA

Low-Power Split-Radix FFT Processors Using Radix-2 Butterfly Units

Efficient Methods for FFT calculations Using Memory Reduction Techniques.

LOW-POWER SPLIT-RADIX FFT PROCESSORS

IMPLEMENTATION OF DOUBLE PRECISION FLOATING POINT RADIX-2 FFT USING VHDL

COPY RIGHT. To Secure Your Paper As Per UGC Guidelines We Are Providing A Electronic Bar Code

AN EFFICIENT DESIGN OF VLSI ARCHITECTURE FOR FAULT DETECTION USING ORTHOGONAL LATIN SQUARES (OLS) CODES

DUE to the high computational complexity and real-time

FPGA Implementation of 16-Point Radix-4 Complex FFT Core Using NEDA

REAL TIME DIGITAL SIGNAL PROCESSING

High Performance Pipelined Design for FFT Processor based on FPGA

A Pipelined Fused Processing Unit for DSP Applications

Novel design of multiplier-less FFT processors

We are IntechOpen, the world s leading publisher of Open Access books Built by scientists, for scientists. International authors and editors

MULTIPLIERLESS HIGH PERFORMANCE FFT COMPUTATION

ENERGY EFFICIENT PARAMETERIZED FFT ARCHITECTURE. Ren Chen, Hoang Le, and Viktor K. Prasanna

Adaptive FIR Filter Using Distributed Airthmetic for Area Efficient Design

An Efficient High Speed VLSI Architecture Based 16-Point Adaptive Split Radix-2 FFT Architecture

A Normal I/O Order Radix-2 FFT Architecture to Process Twin Data Streams for MIMO

Low Power Floating-Point Multiplier Based On Vedic Mathematics

AREA-DELAY EFFICIENT FFT ARCHITECTURE USING PARALLEL PROCESSING AND NEW MEMORY SHARING TECHNIQUE

DESIGN & SIMULATION PARALLEL PIPELINED RADIX -2^2 FFT ARCHITECTURE FOR REAL VALUED SIGNALS

DESIGN METHODOLOGY. 5.1 General

An efficient multiplierless approximation of the fast Fourier transform using sum-of-powers-of-two (SOPOT) coefficients

Low Power Complex Multiplier based FFT Processor

STUDY OF A CORDIC BASED RADIX-4 FFT PROCESSOR

RECENTLY, researches on gigabit wireless personal area

FAST Fourier transform (FFT) is an important signal processing

VLSI Design and Implementation of High Speed and High Throughput DADDA Multiplier

FIR Filter Architecture for Fixed and Reconfigurable Applications

HIGH-THROUGHPUT FINITE FIELD MULTIPLIERS USING REDUNDANT BASIS FOR FPGA AND ASIC IMPLEMENTATIONS

International Journal of Innovative and Emerging Research in Engineering. e-issn: p-issn:

FPGA Implementation of Discrete Fourier Transform Using CORDIC Algorithm

Design and Implementation of VLSI 8 Bit Systolic Array Multiplier

Multiplierless Unity-Gain SDF FFTs

Implementation of FFT Processor using Urdhva Tiryakbhyam Sutra of Vedic Mathematics

ENT 315 Medical Signal Processing CHAPTER 3 FAST FOURIER TRANSFORM. Dr. Lim Chee Chin

THE orthogonal frequency-division multiplex (OFDM)

Fixed Point Streaming Fft Processor For Ofdm

ENERGY EFFICIENT PARAMETERIZED FFT ARCHITECTURE. Ren Chen, Hoang Le, and Viktor K. Prasanna

Design And Simulation Of Pipelined Radix-2 k Feed-Forward FFT Architectures

FOLDED ARCHITECTURE FOR NON CANONICAL LEAST MEAN SQUARE ADAPTIVE DIGITAL FILTER USED IN ECHO CANCELLATION

Fast Block LMS Adaptive Filter Using DA Technique for High Performance in FGPA

ISSN (Online), Volume 1, Special Issue 2(ICITET 15), March 2015 International Journal of Innovative Trends and Emerging Technologies

High Throughput Energy Efficient Parallel FFT Architecture on FPGAs

IMPLEMENTATION OF FAST FOURIER TRANSFORM USING VERILOG HDL

FFT. There are many ways to decompose an FFT [Rabiner and Gold] The simplest ones are radix-2 Computation made up of radix-2 butterflies X = A + BW

A 4096-Point Radix-4 Memory-Based FFT Using DSP Slices

SCALABLE IMPLEMENTATION SCHEME FOR MULTIRATE FIR FILTERS AND ITS APPLICATION IN EFFICIENT DESIGN OF SUBBAND FILTER BANKS

A DCT Architecture based on Complex Residue Number Systems

Image Compression System on an FPGA

ASIC IMPLEMENTATION OF 16 BIT CARRY SELECT ADDER

Implementation of a Unified DSP Coprocessor

Power Optimized Programmable Truncated Multiplier and Accumulator Using Reversible Adder

International Journal for Research in Applied Science & Engineering Technology (IJRASET) IIR filter design using CSA for DSP applications

Fixed Point LMS Adaptive Filter with Low Adaptation Delay

Energy Optimizations for FPGA-based 2-D FFT Architecture

Volume 5, Issue 5 OCT 2016

Agenda for supervisor meeting the 22th of March 2011

Implementation of Lifting-Based Two Dimensional Discrete Wavelet Transform on FPGA Using Pipeline Architecture

On the Design of High Speed Parallel CRC Circuits using DSP Algorithams

Sum to Modified Booth Recoding Techniques For Efficient Design of the Fused Add-Multiply Operator

Implementation of Efficient Modified Booth Recoder for Fused Sum-Product Operator

A BSD ADDER REPRESENTATION USING FFT BUTTERFLY ARCHITECHTURE

ON CONFIGURATION OF RESIDUE SCALING PROCESS IN PIPELINED RADIX-4 MQRNS FFT PROCESSOR

Implementation of A Optimized Systolic Array Architecture for FSBMA using FPGA for Real-time Applications

Fault Tolerant Parallel Filters Based on ECC Codes

FPGA IMPLEMENTATION OF DFT PROCESSOR USING VEDIC MULTIPLIER. Amrita School of Engineering, Coimbatore, Amrita Vishwa Vidyapeetham, India

Digital Signal Processing. Soma Biswas

A scalable, fixed-shuffling, parallel FFT butterfly processing architecture for SDR environment

Classification of Mental Task for Brain Computer Interface Using Artificial Neural Network

High-Speed and Low-Power Split-Radix FFT

Area Delay Power Efficient Carry-Select Adder

VLSI Implementation of Parallel CRC Using Pipelining, Unfolding and Retiming

Transcription:

Modified Welch Power Spectral Density Computation with Fast Fourier Transform Sreelekha S 1, Sabi S 2 1 Department of Electronics and Communication, Sree Budha College of Engineering, Kerala, India 2 Professor, Department of Electronics and Communication, Sree Budha College of Engineering, Kerala, India Abstract: Spectral analysis finds application in different fields. In vibration monitoring, the spectral content of measured signals give information on the wear and other characteristics of mechanical parts under study. In economics, meteorology, astronomy and several other fields, the spectral analysis may reveal hidden periodicities in the studied data, which are to be associated with cyclic behavior. This paper describes the low-complexity modified algorithm and architectural view to compute the power spectral density (PSD) using the Modified Welch Method. Welch algorithm also used for spectral power estimation. But high computational complexity is the main drawback of Welch method. To reduce the high computational complexity use 50% overlap by computing N/2-point FFT, where N is the length of the window and is merged with previous N/2-point FFT and then calculate the N-point FFT. Frequency-domain windowing operation is preferred for high performance. Parallel FFT approach is the best choice to improve the computational speed. Keywords: power spectral density, low-complexity, Welch method, windowing, Fast Fourier Transform 1. Introduction Spectral Analysis is an important role in many application such as biomedical signal analyzer[1] for extracting information from the relevant data, analysis of radar and sonar signal [2] for distinguishing and tracking signals of interest [3], [4], and spectrum sensing [5], [6]. Power Spectral Density (PSD) is a power intensity measurement of signal in the frequency domain. In practice, the PSD is computed from the FFT spectrum of a signal. It provides a useful way to characterize the amplitude versus frequency content of a random signal. Welch s method is the technique used for calculating the power spectral density based on fast fourier transform. Power spectral density P (w) is the fourier transform of the auto-correlation of the input signal Φx (l) [7]. Power Spectral Density can be computed in two ways: parametric method and non-parametric methods. Autoregressive-moving average (ARMA) model identification, Minimum-Variance Distortion less Response Method (MVDR) and eigen decomposition based methods [8] are few examples of parametric methods. Periodogram and multiple- window methods are come under in nonparametric approach. Fast Fourier Transform technique is widely used method for calculating the periodogram. Due to the unavailability of efficient FFT algorithms, the periodogram method is preferred over other parametric approaches. Welch PSD method [9] is a popular nonparametric method to estimate the spectral power density based on the periodogram. So this low-complexity architecture suitable for low-power embedded systems. In Welch algorithm, squared magnitude of the periodogram is calculated and then averaged the individual periodogram, which reduces the variance of power measurements. The final result is an array of power measurement Vs frequency. This paper presents original modifications to the Welch PSD method for low-power embedded applications to develop a low- complexity architecture. Biomedical signal analysis [1] is one of the application where a specific hardware can be used in low-cost and low-power systems. Constraints like area, performance and power consumption should be taken into deliberation while trying to incorporate PSD computation into biomedical monitoring systems. Such systems enforce strict constraints on power consumption. This paper is prearranged as follows. Section II provides overview of the power spectral density computation based on the Welch method. Section III, gives the modified Welch s algorithm and FFT architectures. Section IV discussed with Simulation results and some conclusions in Section V. 2. PSD computation By dividing the signal into multiple segments, we can compute the PSD based on the Welch method then the modified periodograms of these segments are calculated and after that, averaging these modified periodograms. The normal algorithm description as follows, The input signal X (n) is divided into N overlapping segments. The specified window is applied to each segment. FFT is applied to the each windowed data. Each periodograms of new windowed data segment is computed which is periodogram. Taking the average of these periodograms to obtain power spectral density. The Welch method [Welch 1967] is obtained from Bartlett method in two respects. First, the data segments in the Welch method are allowed to overlap another one is the windowing of each data segment, this is done before computing the periodogram. Mathematical form of Welch method can be described as, Paper ID: IJSER151016 92 of 96

denote the j th segment. In (1), (j - 1) K is the starting point for the j th sequence of observations. If K = M, then the sequences do not overlap (but are contiguous) and we get the sample splitting used by the Bartlett method (which leads to S = L = N=M data subsamples). However, the value recommended for K in the Welch method is K = M=2, in which case S ' 2M=N data segments (with 50% overlap between successive segments) are obtained. The windowed periodogram corresponding to y j (t) is computed as The window function for these non-overlapped segments will be different over two consecutive segments. Therefore, the windowing operation performed in the frequency domain. The proposed modifications include two new steps in the algorithm: windowing operation has to be performed in frequency domain, and merge two -point FFTs into an -point FFT. where P denotes the power of the window function {v (t)}: The Welch estimate of PSD is determined by averaging the windowed periodograms in (3): 3. Low-Complexity Power Spectral Density The low-complexity Modified Welch algorithm is presented in this section.the main idea behind this is to compute the FFT of the individual non-overlapped segments (i.e., half of the actual segments) and then generate the FFT of the overlapped segments by combining the non-overlapped segments. Fig1 shows the flow chart of modified Welch algorithm. In the modified algorithm the input signal vector is divided into (L+1) non-overlapping segments of length N/2. To each segment we apply an N/2 - point FFT. We obtain an N-point FFT by merging two N/2-point FFTs and the specified window is applied to it. Modified periodograms of each windowed segment is then calculated and is averaged to form the spectral estimate. Fig2 shows the filter circuit used for windowing in frequency domain. Figure2: Filter Circuit for Windowing We can see that the core components incorporated in the modified algorithms such as FFT and Absolute-Square Multiple Accumulator (AMAC).The function of AMAC circuit is to compute the periodograms and average them over L segments. Absolute-square multiple accumulator (AMAC) circuit is similar to the multiply multipleaccumulator (MMAC) in [10]. Fig3 illustrates architecture for the AMAC block for a completely sequential computation. In this figure, the AMAC block stores the values of L different periodograms. The number of registers depends on the size of the FFT used in the PSD computation. The address decoder and multiplexers are correctly accumulating the periodogram outputs. Figure 3: Absolute square-multiple accumulator (AMAC) circuit. A) Fast Fourier Transform Architecture Figure1 : Flow chart of Modified Welch Method Discrete Fourier Transform (DFT) is the most important technique used in digital signal and image processing Paper ID: IJSER151016 93 of 96

applications. It has been usually implemented in digital communication systems such as Radars, Ultra Wide Band receivers and image processing applications. The direct insight of this algorithm done using N-sample input, requires a large number of operations (N2 complex multiplications and N (N 1) complex additions). To minimize the number of operations Cooley- Tukey [1] introduced a fast algorithm called Fast Fourier Transform (FFT) [11]. FFT was developed to efficiently and effectively speed up its Computation time and significantly reduce the hardware cost. Commonly, FFT analyzes an input signal sequence by using decimation-in-frequency (DIF) or decimation-in-time (DIT) decomposition to build up an efficiently computational signal-flow graph (SFG). The latter, consists on decomposing DFT computation into small building blocks called radix-2 by using efficiently the symmetry and the periodicity of the twiddle factors. This decomposition minimizes the complexity from O(N2) to O(NlogN) FFT algorithm can be implemented on software platforms such as General Purpose Processor (GPPs) and Digital Signal Processors (DSPs) and in hardware circuits including Field Programmable Gate Arrays (FPGAs) and Application Specific Integrated Circuits (ASIC). FPGA design of FFT is often adapted to fit high speed on low-power specification due to the reality that FPGAs have grown in capacity and performance and decreased in cost. Figure4: Butterfly Scheme Equation 7 and 8 describe the radix-2 butterfly operation at stage m as shown in Fig5 4. Simulation Results The PSD estimation using Modified Welch method is subjected to behavioral simulation, synthesis, implementation and post-layout simulation using the Xilinx ISE Design Suite 14.7. Verified its functionality using different test vectors using Xilinx ISim Simulator. Implementation followed the step explained as above. The design entry is done using VHDL. ISim, which provides a complete, full-featured HDL simulator integrated within ISE, is used for both behavioral simulation and post-layout simulation. Xilinx Synthesis Technology (XST) is used for the synthesis of HDL designs to create Xilinx specific netlist files. High performance is the advantage of this system. Here, our work employs a DIF decomposition with parallel FFT. Parallel FFT architectures are fast and high throughput architectures with parallelism. Compared to other architectures, it offers high throughput and energy efficient implementations with small latency but the hardware complexity is little high and less flexible. B. FFT Algorithm Fast Fourier Transform (FFT) is widely used method for computing Discrete Fourier Transform (DFT). The discrete Fourier transforms (DFT), X (k) of N-point discrete-time signal x (n) is defined by: N 1 X(k) = kn x( n) W N k=0,1,..n-1 (6) n 0 Where, W N kn = e j 2 kn N Figure3.4 shows the signal flow graph of 32-point Decimation-In-Frequency (DIF) radix-2fft. FFT algorithms composed of butterfly calculation units as illustrated in Fig4. X m+1 (q) = X m (p)+ X m (q) (7) X m+1 (q) = [X m (p)- X m (q)] W N r (8) Figure5: Signal flow graph of 32-point radix-2 FFT A. Graphs for FFT Computation Using Parallel Approach The output waveform of the Fast Fourier transform computation obtained is as shown in the fig6 Paper ID: IJSER151016 94 of 96

The output for the first stage is available immediately after the corresponding inputs are obtained unlike in the normal N- point FFT. In our computation 16-point R2SDF DIT FFT structure is used. We choose 16-bit input signal. In the first stage the input signal is real in nature. But the twiddle factor multiplication makes the input to the successive stages complex. Figure7: Output Waveforms of Window Function Of Real and Imaginary Outputs Figure 6: Output Waveforms of the Fourier Transform Computation B.Graphs for windowing The results of the windowing function of real and imaginary outputs of FFT is shown in fig7. The window function is obtained in frequency domain. Windowing is done in order to reduce the number of non-zero coefficients. Window function is applied separately for real and complex outputs of FFT. C. Graphs for Periodogram Operation The modified periodogram of each windowed segment is computed. To each of the real and complex window output periodogram is applied. Periodogram is simply the square of the windowed data output. The results of the periodogram operation is shown in fig8. The periodogram output is then averaged to obtain the final result. Since the input series is divided into segments of length 16, the periodogram output is averaged for 16 values. The averaged value gives the spectral power. The final power spectral density estimate is shown in fig9. Figure8: Output Waveform of Modified Periodogram 5. Conclusion Figure9: PSD Computation output In this paper designed a parallel architecture instead of pipeline in Welch PSD calculation which gives better and faster computation than pipeline design. Performance loss is the main disadvantage of the pipeline architecture but it reduces the area.high latency is another drawback of this system. These can be overcome by using parallel structure. Similarly, it could increase maximum throughput. The Modified Welch PSD computation suitable for lowpower application. The proposed method can be used for processing signals like speech, electroencephalogram (EEG), electrocardiogram (ECG) etc. Further improvements can be made in the FFT which again leads further reduction in the computation complexity and there by reduces the chip-area. Paper ID: IJSER151016 95 of 96

References International Journal of Scientific Engineering and Research (IJSER) [1] Y. Park, L. Luo, K. K. Parhi, and T. Netoff, Seizure prediction with spectral power of EEG using costsensitive support vector machines, Epilepsia, vol. 52, no. 10, pp. 1761 1770, Oct. 2011. [2] F. El-Hawary and T. Richards, A systolic computer architecture for spectrum analysis, Proc. OCEANS, vol. 4, pp. 1061 1065, Sep. 1989. [3] P. Stoica and R. L. Moses, Introduction to Spectral Analysis. Englewood Cliffs, NJ, USA: Prentice-Hall, 1997. [4] M. Hayes, Statistical Digital Signal Processing and Modeling. New York, NY, USA: Wiley, 1996. 11, no. 1, pp. 116 130, 2009. [5] T.-H. Yu, C.-H. Yang, D. Cabric, and D. Markovic, A 7.4 mw 200 MS/s wideband spectrum sensing digital baseband processor for cognitive radios, in Proc. Symp. VLSI Circuits, Jun. 2011, pp. 254 255. [6] T. Yucek and H. Arslan, A survey of spectrum sensing algorithms for cognitive radio applications, IEEE Commun. Surveys Tutorials, vol. [7] A. Oppenheim and R. Schafer, Discrete Time Signal Processing, 2 nd ed. Englewood Cliffs, NJ, USA: Prentice-Hall. [8] S. Haykin, Adaptive Filter Theory, 4th ed. Englewood Cliffs, NJ, USA: Prentice-Hall. [9] P. D. Welch, The use of fast fourier transform for the estimation of power spectra: A method based on time averaging over short, modified periodograms, IEEE Trans. Audio Electroacoustics, vol. AU-15, pp. 70 73, Jun. 1967. [10] V. Sundararajan and K. K. Parhi, A novel multiply multiple accumulator component for low power PDSP design, in Proc. IEEE Int. Conf. Acoustics, Speech Signal Process., Jun. 2000, vol. 6, pp. 3247 3250. [11] M. Ayinala and K. K. Parhi, FFT architectures for realvalued signals based on radix-23 and radix-24 algorithms, IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 60, 2013. Paper ID: IJSER151016 96 of 96