High-Performance FIR Filter Architecture for Fixed and Reconfigurable Applications

Similar documents
FIR Filter Architecture for Fixed and Reconfigurable Applications

Design of Fir Filter Architecture Using Manifold Steady Method

MCM Based FIR Filter Architecture for High Performance

INTERNATIONAL JOURNAL OF PROFESSIONAL ENGINEERING STUDIES Volume 10 /Issue 1 / JUN 2018

Vertical-Horizontal Binary Common Sub- Expression Elimination for Reconfigurable Transposed Form FIR Filter

HIGH-PERFORMANCE RECONFIGURABLE FIR FILTER USING PIPELINE TECHNIQUE

Batchu Jeevanarani and Thota Sreenivas Department of ECE, Sri Vasavi Engg College, Tadepalligudem, West Godavari (DT), Andhra Pradesh, India

ISSN (Online), Volume 1, Special Issue 2(ICITET 15), March 2015 International Journal of Innovative Trends and Emerging Technologies

Critical-Path Realization and Implementation of the LMS Adaptive Algorithm Using Verilog-HDL and Cadence-Tool

VLSI Implementation of Low Power Area Efficient FIR Digital Filter Structures Shaila Khan 1 Uma Sharma 2

IMPLEMENTATION OF AN ADAPTIVE FIR FILTER USING HIGH SPEED DISTRIBUTED ARITHMETIC

Adaptive FIR Filter Using Distributed Airthmetic for Area Efficient Design

FPGA Implementation of 16-Point Radix-4 Complex FFT Core Using NEDA

Power and Area Efficient Implementation for Parallel FIR Filters Using FFAs and DA

Low Power Floating Point Computation Sharing Multiplier for Signal Processing Applications

Implementation of a Low Power Decimation Filter Using 1/3-Band IIR Filter

An HEVC Fractional Interpolation Hardware Using Memory Based Constant Multiplication

Low-Power, High-Throughput and Low-Area Adaptive Fir Filter Based On Distributed Arithmetic Using FPGA

A SIMULINK-TO-FPGA MULTI-RATE HIERARCHICAL FIR FILTER DESIGN

Design of a Multiplier Architecture Based on LUT and VHBCSE Algorithm For FIR Filter

FPGA BASED ADAPTIVE RESOURCE EFFICIENT ERROR CONTROL METHODOLOGY FOR NETWORK ON CHIP

Design of a Pipelined 32 Bit MIPS Processor with Floating Point Unit

A Comparative Analysis of Energy Preservation Performance Metric for ERAODV, RAODV, AODV and DSDV Routing Protocols in MANET

Area And Power Efficient LMS Adaptive Filter With Low Adaptation Delay

Area and Delay Optimization using Various Multiple Constant Multiplication Techniques for FIR Filter

DESIGN AND IMPLEMENTATION OF DA- BASED RECONFIGURABLE FIR DIGITAL FILTER USING VERILOGHDL

ENERGY EFFICIENT MULTIPATH ROUTING FOR MOBILE AD HOC NETWORKS

Fast Block LMS Adaptive Filter Using DA Technique for High Performance in FGPA

Fixed Point LMS Adaptive Filter with Low Adaptation Delay

Parallel FIR Filters. Chapter 5

Implementation of Efficient Modified Booth Recoder for Fused Sum-Product Operator

Improving Reconfiguration Speed for Dynamic Circuit Specialization using Placement Constraints

The Efficient Implementation of Numerical Integration for FPGA Platforms

Design of Delay Efficient Distributed Arithmetic Based Split Radix FFT

Design and Implementation of 3-D DWT for Video Processing Applications

Digital Filter Synthesis Considering Multiple Adder Graphs for a Coefficient

An Efficient Constant Multiplier Architecture Based On Vertical- Horizontal Binary Common Sub-Expression Elimination Algorithm

[Kamboj* et al., 5(9): September, 2016] ISSN: IC Value: 3.00 Impact Factor: 4.116

A Low Power Asynchronous FPGA with Autonomous Fine Grain Power Gating and LEDR Encoding

CHAPTER 4. DIGITAL DOWNCONVERTER FOR WiMAX SYSTEM

A Ripple Carry Adder based Low Power Architecture of LMS Adaptive Filter

Maximizing the Lifetime of Clustered Wireless Sensor Network VIA Cooperative Communication

OPTIMIZATION OF FIR FILTER USING MULTIPLE CONSTANT MULTIPLICATION

Mitigating Superfluous Flooding of Control Packets MANET

Implementation of digit serial fir filter using wireless priority service(wps)

A New Energy-Aware Routing Protocol for. Improving Path Stability in Ad-hoc Networks

AnEfficientImplementationofDigitFIRFiltersusingMemorybasedRealization

DESIGN AND IMPLEMENTATION BY USING BIT LEVEL TRANSFORMATION OF ADDER TREES FOR MCMS USING VERILOG

Bit-Level Optimization of Adder-Trees for Multiple Constant Multiplications for Efficient FIR Filter Implementation

RUN-TIME RECONFIGURABLE IMPLEMENTATION OF DSP ALGORITHMS USING DISTRIBUTED ARITHMETIC. Zoltan Baruch

Performance Evaluation of Various Routing Protocols in MANET

A HIGH PERFORMANCE FIR FILTER ARCHITECTURE FOR FIXED AND RECONFIGURABLE APPLICATIONS

INTEGER SEQUENCE WINDOW BASED RECONFIGURABLE FIR FILTERS.

A Reconfigurable Multifunction Computing Cache Architecture

Simulation and Performance Analysis of Throughput and Delay on Varying Time and Number of Nodes in MANET

A COMPARISON OF REACTIVE ROUTING PROTOCOLS DSR, AODV AND TORA IN MANET

Design of Convolution Encoder and Reconfigurable Viterbi Decoder

Simulation & Performance Analysis of Mobile Ad-Hoc Network Routing Protocol

RISC IMPLEMENTATION OF OPTIMAL PROGRAMMABLE DIGITAL IIR FILTER

4DM4 Lab. #1 A: Introduction to VHDL and FPGAs B: An Unbuffered Crossbar Switch (posted Thursday, Sept 19, 2013)

International Journal for Research in Applied Science & Engineering Technology (IJRASET) IIR filter design using CSA for DSP applications

Impulse Radio Ultra Wide Band Based Mobile Adhoc Network Routing Performance Analysis

Ameliorate Threshold Distributed Energy Efficient Clustering Algorithm for Heterogeneous Wireless Sensor Networks

International Journal of Modern Trends in Engineering and Research e-issn No.: , Date: April, 2016

A Novel Approach of Area-Efficient FIR Filter Design Using Distributed Arithmetic with Decomposed LUT

To enhance Routing in ZigBee Wireless Networks

Compact Clock Skew Scheme for FPGA based Wave- Pipelined Circuits

End-To-End Delay Optimization in Wireless Sensor Network (WSN)

COPY RIGHT. To Secure Your Paper As Per UGC Guidelines We Are Providing A Electronic Bar Code

Fault Tolerant Parallel Filters Based On Bch Codes

INTRODUCTION TO FIELD PROGRAMMABLE GATE ARRAYS (FPGAS)

DESIGN AND IMPLEMENTATION OF HIGH SPEED 64-BIT PARALLEL CRC GENERATION

DUE to the high computational complexity and real-time

Impact of IEEE MAC Packet Size on Performance of Wireless Sensor Networks

High Performance and Area Efficient DSP Architecture using Dadda Multiplier

INTRODUCTION TO FPGA ARCHITECTURE

Fault Tolerant Parallel Filters Based on ECC Codes

Secure Enhanced Authenticated Routing Protocol for Mobile Ad Hoc Networks

Computation of Multiple Node Disjoint Paths

Implementation of Reduce the Area- Power Efficient Fixed-Point LMS Adaptive Filter with Low Adaptation-Delay

ECE 341 Midterm Exam

Implementation of Lifting-Based Two Dimensional Discrete Wavelet Transform on FPGA Using Pipeline Architecture

ARITHMETIC operations based on residue number systems

An Energy Efficient Multicast Routing Based On Genetic Algorithm for MANET

ISSN: [Bilani* et al.,7(2): February, 2018] Impact Factor: 5.164

Design and Implementation of VLSI 8 Bit Systolic Array Multiplier

Performance Analysis and Enhancement of Routing Protocol in Manet

VLSI Design and Implementation of High Speed and High Throughput DADDA Multiplier

3. Evaluation of Selected Tree and Mesh based Routing Protocols

The Xilinx XC6200 chip, the software tools and the board development tools

A Literature survey on Improving AODV protocol through cross layer design in MANET

Synthesis of DSP Systems using Data Flow Graphs for Silicon Area Reduction

Chapter 7 CONCLUSION

A High-Speed FPGA Implementation of an RSD-Based ECC Processor

Effect of Variable Bit Rate Traffic Models on the Energy Consumption in MANET Routing Protocols

AN OPTIMIZED CLUSTER BASED APPROACH FOR MULTI- SOURCE MULTICAST ROUTING PROTOCOL IN MOBILE AD HOC NETWORKS USING OWCA

FPGA IMPLEMENTATION FOR REAL TIME SOBEL EDGE DETECTOR BLOCK USING 3-LINE BUFFERS

Basic FPGA Architectures. Actel FPGAs. PLD Technologies: Antifuse. 3 Digital Systems Implementation Programmable Logic Devices

A Dedicated Hardware Solution for the HEVC Interpolation Unit

forward packets do not forward packets

Transcription:

High-Performance FIR Filter Architecture for Fixed and Reconfigurable Applications Pallavi R. Yewale ME Student, Dept. of Electronics and Tele-communication, DYPCOE, Savitribai phule University, Pune, Maharashtra, India. ABSTRACT: Transpose form finite impulse response (FIR) filter is naturally a pipelined structure which supports the multiple constant multiplications (MCM) technique but direct form FIR filter structure does not support MCM technique. The MCM is more effective in Transpose form when the common operand is multiple with the set of constant coefficients that reduce the computational delay. The implementation of MCM technique is easier in fixed coefficient Transpose form FIR filter but complex in reconfigurable coefficients. In fixed coefficients transpose FIR filter, area and delay are reduced by using MCM technique. The low-complexity design using the MCM technique is implemented for fixed coefficients transpose form FIR filters and multiplier-based design is used for reconfigurable transpose form FIR filter. The implemented transpose form FIR filter structure achieved less area and delay than the direct-form FIR filter structure. The XILINX software tool is used for simulation. KEYWORDS: Transpose form FIR filter, multiple constant multiplications (MCM) technique, Block processing. I. INTRODUCTION Finite Impulse response (FIR) digital filter is used in several DSP applications, such as, echo cancellation, speech processing, equalization, adaptive noise cancellation, and various communication applications, including softwaredefined radio (SDR), etc. Many of these applications require FIR filters of large order to meet the stringent frequency specifications. And this filters need to support high sampling rate for high-speed digital communication. The number of multiplications and additions required for their filter output, increases linearly with the filter order. Nodes in MANET have limited battery power and these batteries cannot be replaced or recharged in complex scenarios. To prolong or maximize the network lifetime these batteries should be used efficiently. The energy consumption of each node varies according to its communication state: transmitting, receiving, listening or sleeping modes. Researchers and industries both are working on the mechanism to prolong the lifetime of the node s battery. But routing algorithms plays an important role in energy efficiency because routing algorithm will decide which node has to be selected for communication. There is no redundant computation available in the FIR filter, real-time implementation of a large order FIR filter in a resource constrained environment is a challenging task. Filter coefficients very often remain constant and known a prioriin signal processing applications. This feature has been utilized to reduce the complexity of realization of multiplications. FIR filter has two maintaining the Integrity of the Specifications configurations, namely direct form FIR filter and transposes form FIR filter. Several designs have been designed by various researchers for efficient realization of FIR filters (having Fixed coefficients) using multiple constant multiplication (MCM) and distributed arithmetic (DA) [6] and methods [7]. DAbased designs use lookup tables (LUTs) to store pre-computed result stored the computational complexity. The Transpose form FIR filter only needs N delay units, where N is the order of the filter potentially half as much as direct form. This structure is obtained by reversing the order of the numerator and denominator sections of Direct Form, since they are in fact two linear systems. Then, one will notice that there are two columns of delays that tap off the center net, and these can be combined since they are redundant. The disadvantage is that Transpose form increases the possibility of arithmetic overflow for filters of high Q or resonance. This is because, conceptually, the signal is first passed through an all-pole filter (which normally boosts gain at the resonant frequencies) before the result of that is Copyright to IJIRCCE DOI: 10.15680/IJIRCCE.2017. 0505039 9477

saturated, then passed through an all-zero filter (which often attenuates much of what the all-pole half amplifies). In this paper, we realize the possibility of block FIR filter in transpose form configuration to take advantage of the MCMtechnique and the naturally pipelining for area delay efficient realization of large order. The main contributions of this paper are as follows. 1. The Computational analysis of transpose form FIR filter and derivation of flow graph with reduced register complexity. 2. Block formulation for transpose form FIR filter. 3. Design of transpose form FIR filter block for reconfigurable applications 4. A low-complexity design method using multiple constant multiplications technique for the block implementation of fixed FIR filters. II. EXISTING SYSTEM In [2] authors used average residual battery level of the entire network and it was calculated by adding two fields to the RREQ packet header of a on-demand routing algorithm i) average residual battery energy of the nodes on the path ii) number of hops that the RREQ packet has passed through.according to their equation retransmission time is proportional to residual battery energy. Those nodes having more battery energy than the average energy will be selected because its retransmission time will be less. Small hop count is selected at the stage when most of the nodes have same retransmission time. Individual battery power of a node is considered as a metric to prolong the network lifetime in [3]. Authors used an optimization function which considers nature of the packet, size of the packet and distance between the nodes, number of hops and transmission time are also considered for optimization. In [ 4] initial population for Genetic Algorithm has been computed from the multicast group which has a set of paths from source to destination and the calculated lifetime of each path. Lifetime of the path is used as a fitness function. Fitness function will select the highest chromosomes which is having highest lifetime. Cross over and mutation operators are used to enhance the selection. In [5] authors improved AODV protocol by implementing a balanced energy consumption idea into route discovery process. RREQ message will be forwarded when the nodes have sufficient amount of energy to transmit the message otherwise message will be dropped. This condition will be checked with threshold value which is dynamically changing. It allows a node with over used battery to refuse to route the traffic in order to prolong the network life. In [6] Authors had modified the route table of AODV adding power factor field. Only active nodes can take part in rout selection and remaining nodes can be idle. The lifetime of a node is calculated and transmitted along with Hello packets. In [7] authors considered the individual battery power of the node and number of hops, as the large number of hops will help in reducing the range of the transmission power. Route discovery has been done in the same way as being done in on-demand routing algorithms. After packet has been reached to the destination, destination will wait for time δt and collects all the packets. After time δt it calls the optimization function to select the path and send RREP. Optimization function uses the individual node s battery energy; if node is having low energy level then optimization function will not use that node. III. DRAWBACKS It provides only block performances. High delay. Occupies high area Units. IV. PROPOSED SYSTEM The proposed structure for block FIR filter is shown in Figure for the block size L = 4. It consists of one coefficient selection unit (CSU), one register unit (RU), M number of inner product units (IPUs), and one pipeline adder unit (PAU). The CSU stores coefficients of all the filters to be used for the reconfigurable. Copyright to IJIRCCE DOI: 10.15680/IJIRCCE.2017. 0505039 9478

Application. It is implemented using N ROM LUTs, such that filter coefficients of any particular channel filter are obtained in one clock cycle, where N is the filter length. The RU receives xk during the kth cycle and produces L rows of S0 k in parallel. L rows of S0 k are transmitted to M IPUs of the proposed structure. The M IPUs also receive M short-weight vectors from the CSU such that during the kth cycle, the (m + 1) th IPU receives the weight vector cm m 1 from the CSU and L rows of S0 k form the RU. Each IPU performs matrix vector product of S0 k with the short-weight vector cm, and computes a block of L partial filter outputs (rm k). Therefore, each IPU performs L inner-product computations of L rows of S0 k with a common weight vector cm. In each cycle, the proposed structure receives a block of L inputs and produces a block of L filter outputs, where the duration of each cycle is T = TM + TA + TFA log2 L, TM is one multiplier delay, TA is one adder. Fig Fig.1 Proposed structures Fig.1 (a) Structure of RU Copyright to IJIRCCE DOI: 10.15680/IJIRCCE.2017. 0505039 9479

Fig.1 (b) Structure of (m+1) IPU V. MCM-BASED IMPLEMENTATION OF FIXED-COEFFICIENT FIR FILTER We discuss the derivation of MCM units for transpose form block FIR filter, and the design of proposed structure for fixed filters. For fixed-coefficient implementation, the CSU is no longer required, since the structure is to be tailored for only one given filter. Similarly, IPUs are not required. Those multiplications are required to be mapped to the MCM units for a low-complexity realization. In the following, we show that the proposed formulation for MCM-based implementation of block FIR filter makes use of the symmetry in input matrix S0k to perform horizontal and vertical common sub expression elimination [17] and to minimize the number of shift-add operations in the MCM blocks. Fig.2 (a) Internal structure of (l + 1)th IPC for L = 4. Fig.2 (b) Structure of PAU for block size L = 4. Copyright to IJIRCCE DOI: 10.15680/IJIRCCE.2017. 0505039 9480

Is no longer required, since the structure is to be tailored for only one given filter. The recurrence relation can be expressed as Y (z) = z 1 z 1 (z 1rM 1 +rm 2 +rm 3) +r1+r0. Where R =S0 k C Similarly, IPUs are not required. The multiplications are required to be mapped to the MCM units for a low complexity realization. In the following, we show that the proposed formulation for MCM based implementation of block FIR filter makes use of the symmetry in input matrixs0k to perform horizontal and vertical common sub expression elimination and to minimize the number of shift-add operations in the MCM blocks. Input matrix MCM in TRANSPOSE FORM BLOCK FIR FILTER Of Length = 16 and Block Size = 4 Copyright to IJIRCCE DOI: 10.15680/IJIRCCE.2017. 0505039 9481

As shown in Table I, MCM can be applied in both horizontal and vertical direction of the coefficient matrix. The sample x(4k 3) appears in four rows or four columns of the following whereas x(4k) appears in only one row or one column Therefore, all the four rows of coefficient matrix are involved in the MCM for the x(4k 3), whereas only the first row of coefficients are involved in the MCM for x(4k). For larger values of N or the smaller block sizes, the row size of the coefficient matrix is larger that results in larger MCM size across all the samples, which results into larger saving in computational complexity. The proposed MCM-based structure for FIR filters for block size L = 4 is shown in Fig. 9 for the purpose of illustration. The MCM-based structure (shown in Fig. 9) involves six MCM blocks corresponding to six input samples. Each MCM block produces the necessary product terms as listed in Table I. The sub expressions of the MCM blocks are shift added in the adder network to produce the inner-product values (rl, m), for 0 l L 1 and 0 m (N/L) 1 corresponding to the matrix product of (14). The inner-product values are finally added in the PAU of Fig. 8(b) to obtain a block of filter output. VI. HARDWARE AND TIME COMPLEXITIES The proposed structure for reconfigurable application consists of one CSU, one RU, M IPUs, and one PAU. The CSU consists of N ROM units of P words each, where P is the number of FIR filters to be implemented by the proposed reconfigurable structure. We have excluded complexity of CSU in the performance comparison, since it is common in all the RFIR structures. Each IPU is comprised of L IP cells, where each IP cell involves L multipliers and (L 1) adders. The RU involves (L 1) registers of B-bit width. The PAU involves (M 1) adders and the same number of registers, where each register has a width of (B+ B_), B, and B_ respectively, being the bit width of input sample and filter coefficients. Therefore, the proposed structure involves LN multipliers, L (N 1) adders, and [B (N 1) + B (N L)] (flip flops) FFs; and processes L samples in every cycle. Fig.3 Proposed MCM Structure Copyright to IJIRCCE DOI: 10.15680/IJIRCCE.2017. 0505039 9482

VII. ADVANTAGES Block and higher order N processing. Less area requirement Low delay VIII. SIMULATION AND DISCUSSION Experimental Result for fixed FIR Filter Experimental Result of Reconfigurable FIR Filter Copyright to IJIRCCE DOI: 10.15680/IJIRCCE.2017. 0505039 9483

To implement the fixed and reconfigurable coefficients Transpose form FIR filter is simulated using Xilinx Tool. The area (no. of LUT s and flip-flop s clock buffers and I/O buffers) and delay were analyzed for Fixed and reconfigurable coefficients. From the analysis, it is observed that the area and delay reduced when compare to the direct form FIR filter. In the below simulation result is for Fixed coefficient Transpose form FIR filter with 15 coefficients. The input given to the filter is 0001(1) and the output achieved for the filter is 1010. The delay for the fixed transpose form FIR filter is 1.19 ns. The delay of the fixed transpose form filter is reduced when compare to the direct form filter delay. The delay of the direct form fir filter is 1.35ns. In the below simulation result is for reconfigurable coefficients Transpose form FIR filter. The input given to the filter is 0001(1) and the coefficients given for the filter is, 0000, 0001, 0010, 0011, 0100, and 0101. The output achieved for the filter is 1111 (15). The total number LUT used for the reconfigurable FIR is 152 and the Flip-flops are 63 and the shift register, clock buffers and I/O buffers are used 27 cells. The LUTs, Flip-flops are reduced when compare to the direct form FIR filter. The delay for the reconfigurable transpose form FIR filter is 1.27 ns. The delay of the fixed transpose form filter is reduced when compare to the direct form filter delay. The delay of the direct form fir filter is 1.35ns direct form filter delay. The delay of the direct form fir filter is 1.36ns. IX. PERFORMANCE MEASUREMENT Table 1: comparison table reconfigurable fir for the area and delay Delay (ns) LUT s and Flip Flop s Direct form Reconfigurable FIR 1.35 368 Transpose form Reconfigurable FIR 1.26 241 Table 2: comparison table fixed fir for the area and Delay Delay (ns) LUT s and Flip Flop s Direct form Reconfigurable FIR - 1.36 120 Transpose form Reconfigurable FIR 1.198 60 Copyright to IJIRCCE DOI: 10.15680/IJIRCCE.2017. 0505039 9484

X. CONCLUSION In this paper, we implemented the design of Transpose form FIR filter for fixed coefficient and reconfigurable coefficients. The transpose form FIR filter is naturally a pipelined structure which supports the multiple constant multiplications (MCM) technique. The implementation of MCM technique is easier in fixed coefficient Transpose form FIR filter but complex in reconfigurable coefficients. In fixed coefficients transpose FIR filter, area and delay are reduced by using MCM technique. The low-complexity design using the MCM technique is implemented for fixed coefficients transpose form FIR filters and multiplier-based design is used for reconfigurable transpose form FIR filter. The implemented transpose form FIR filter structure achieved less area and delay than the direct-form FIR filter structure. In future, the delay and area will be further reduced in transpose form FIR filter by applying another technique. REFERENCES [1] B. K. Mohanty and P. K. Meher, A High-Performance FIR Filter Architecture for Fixed and Reconfigurable Applications IEEE Tran on VLSI,Feb 2015. [2] S. Y. Park and P. K. Meher, Efficient FPGA and ASIC realizations of a DA-based reconfigurable FIR digital filter, IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 61, no. 7, pp. 511 515, Jul. 2014.. [3] B. K. Mohanty and P. K. Meher, A high-performance energy-efficient architecture for FIR adaptive filter based on new distributed arithmetic formulation of block LMS algorithm, IEEE Trans. Signal Process., vol. 61, no. 4, pp. 921 932, Feb. 2013. [4] R. Mahesh and A. P. Vinod, New reconfigurable architectures for implementing FIR filters with low complexity, IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 29, no. 2, pp. 275 288, Feb. 2010. [5] A. P. Vinod and E. M. Lai, Low power and high-speed Implementation of FIR filters for software defined radio receivers, IEEE Trans. Wireless Commun., vol. 7, no. 5, pp. 1669 1675, Jul. 2006. [6] J. Park, W. Jeong, H. Mahmoodi-Meimand, Y. Wang, H. Choo, and K. Roy, Computation sharing programmable FIR filter for low-power and high-performance applications, IEEE J. Solid State Circuits, vol. 39, no. 2, pp. 348 357, Feb. 2004. Copyright to IJIRCCE DOI: 10.15680/IJIRCCE.2017. 0505039 9485