FPGA Implementation of High Speed FIR Filters Using Add and Shift Method
|
|
- Morris Bates
- 5 years ago
- Views:
Transcription
1 FPGA Implementation of High Speed FIR Filters Using Add and Shift Method Abstract We present a method for implementing high speed Finite Impulse Response (FIR) filters using just registered adders and hardwired shifts. We extensively use a modified common subexpression elimination algorithm to reduce the number of adders. We target our optimizations to Xilinx Virtex II devices where we compare our implementations with those produced by Xilinx CoregenTM using istributed Arithmetic. We observe up to 50% reduction in the number of slices and up to 75% reduction in the number of s for fully parallel implementations. We also observed up to 50% reduction in the total dynamic power consumption of the filters. Our designs perform significantly faster than the MAC filters, which use embedded multipliers. 1. Introduction FPGAs are being increasingly used for a variety of computationally intensive applications, mainly in the realm of igital Signal Processing (SP) and communications [2-8]. ue to rapid increases in the technology, current generation of FPGAs contain a very high number of Configurable Logic Blocks (CLBs), and are becoming more feasible for implementing a wide range of applications. The high non-recurring engineering (NRE) costs and long development time for ASICs are making FPGAs more attractive for application specific SP solutions. SP functions such as FIR filters and transforms are used in a number of applications such as communication and multimedia. These functions are major determinants of the performance and power consumption of the whole system. Therefore it is important to have good tools for optimizing these functions. Equation (I) represents the output of an L tap FIR filter, which is the convolution of the latest L input samples. L is the number of coefficients h(k) of the filter, and x(n) represents the input time series. y[n] = h[k] x[n-k] k= 0, 1,..., L-1 (I) The conventional tapped delay line realization of this inner product is shown in Figure 1. This implementation translates to L multiplications and L-1 additions per sample to compute the result. This can be implemented using a single Multiply Accumulate (MAC) engine, but it would require L MAC cycles, before the next input sample can be processed. Using a parallel implementation with L MACs can speed up the performance L times. A general purpose multiplier occupies a large area on FPGAs. Since all the multiplications are with constants, the full flexibility of a general purpose multiplier is not required, and the area can be vastly reduced using techniques developed for constant multiplication. Though most of the current generation FPGAs such as Virtex II TM have embedded multipliers to handle these multiplications, the number of these multipliers is typically limited. Furthermore, the size of these multipliers is limited to only 18 bits, which limits the precision of the computations for high speed requirements. The ideal implementation would involve a sharing of the Combinational Logic Blocks (CLBs) and these multipliers. In this paper, we present a technique that is better than conventional techniques for implementation on the CLBs. X [n] x h x x x L-1 h L-2 h L-3 h 1 z -1 z z -1 z -1 Figure 1. A MAC FIR filter block diagram An alternative to the above approach is istributed Arithmetic (A) which is a well known method to save resources. Using A method, the filter can be implemented either in bit serial or fully parallel mode to trade bandwidth for area utilization. Assuming coefficients c[n] are known constants, equation (I) can be rewritten as follows: y[n] = c[n] x[n] n = 0, 1,, N-1 (II) Variable x[n] can be represented by: x [n] = x b [n] 2 b b=0, 1,, B-1 (III) x b [n] [0, 1] where x b [n] is the b th bit of x[n] and B is the input width. Finally, the inner product can be rewritten as follows: y = c[n] x b [k] 2 b = c[0] (x B-1 [0]2 B-1 x B-2 [0]2 B-2 x 0 [0]2 0 ) c[1] (x B-1 [1]2 B-1 x B-2 [1]2 B-2 x 0 [1]2 0 ) c[n-1] (x B-1 [N-1]2 B-1 x B-2 [0]2 B-2 x 0 [N- 1]2 0 ) = (c[0] x B-1 [0] c[1] x B-1 [1] c[n-1] x B-1 [N- 1])2 B-1 (c[0] x B-2 [0] c[1] x B-2 [1] c[n-1] x B- 2 [N- 1])2 B-2 (c[0] x 0 [0] c[1] x 0 [1] c[n-1] x 0 [N-1])2 0 = 2 b c[n] x b [k] (IV) where n=0, 1,, N-1 and b=0, 1,, B-1 The coefficients in most of SP applications for the multiply accumulate operation are constants. The partial products are obtained by multiplying the coefficients c i by multiplying one bit of data x i at a time in AN operation. These partial products should be added and the result depend only on the outputs of the input shift registers. The AN functions and adders can be replaced by Look Up Tables (s) that gives the partial product. This is shown in Figure 2. Input sequence is fed into the shift register at the input sample rate. The serial output is x h 0 z -1 y [n]
2 presented to the RAM based shift registers (registers are not shown in Figure for simplicity) at the bit clock rate which is n1 times (n is number of bits in a data input sample) the sample rate. The RAM based shift register stores the data in a particular address. The outputs of registered s are added and loaded to the scaling accumulator from LSB to MSB and the result which is the filter output will be accumulated over the time. For an n bit input, n1 clock cycles are needed for a symmetrical filter to generate the output. In conventional MAC method with a limited number of MAC engines, as the filter length is increased, the system sample rate is decreased. This is not the case with serial A architectures since the filter sample rate is decoupled from the filter length. As the filter length is increased, the throughput is maintained but more logic resources are consumed. Though the serial A architecture is efficient by construction, its performance is limited by the fact that the next input sample can be processed only after every bit of the current input samples are processed. Each bit of the current input samples takes one clock cycle to x 0 x 1 x 2 x 3 x 4 x 5 x 6 x 7 scaling accumulator << x 0 x 1 x 2 x 3 x 4 x 5 x 6 x 7 x 0 [i1] x 1 [i1] x 2 [i1] x 3 [i1] x 4 [i1] x 5 [i1] x 6 [i1] x 7 [i1] scaling accumulator Figure 3. A 2 bit parallel A FIR filter block diagram A popular technique for implementing the transposed form of FIR filters is the use of a multiplier block, instead of using multipliers for each constant as shown in Figure 4. The multiplications with the set of constants {h k } are replaced by an optimized set of additions and shift operations, involving computation sharing. Further optimization can be done by factorizing the expression and finding common subexpressions. << Address ata C C 0 C C 0 C 1 C 2 C 3 Figure 2. A serial A FIR filter block diagram process. Therefore, if the input bitwidth is 12, then a new input can be sampled every 12 clock cycles. The performance of the circuit can be improved by modifying the architecture to a parallel architecture which processes the data bits in groups. Figure 3 shows the block diagram of a 2 bit parallel A FIR filter. The tradeoff here is performance for area since increasing the number of bits sampled has a significant effect on resource utilization on FPGA. For instance, doubling the number of bits sampled, doubles the throughput and results in the half the number of clock cycles. This change doubles the number of s as well as the size of the scaling accumulator. The number of bits being processed can be increased to its maximum size which is the input length n. This gives the maximum throughput to the filter. For a fully parallel implementation of the A filter (PA), the number of s required would be enormous. In this work we show an alternative to the PA method for implementing high speed FIR filters that consumes significantly lesser area and power. Figure 4. Replacing constant multiplication by multiplier block The performance of this filter architecture is limited by the latency of the biggest adder and is the same as that of the PA. In this work, we present a filter architecture based on the architecture illustrated in Figure 4. We use a modified common subexpression elimination algorithm that minimizes the number of adders as well as the number of latches. We show that such a scheme produces similar performance as the PA method, with significantly reduced area. We compare the area produced by our method after place and route with the PA based implementation produced by the commercial Xilinx Coregen TM tool. In addition, we also simulate the designs and compare the dynamic power consumption. The rest of the paper is organized as follows: Section 2 presents some related work. In Section3, we describe our filter architecture. In Section 4, we present our optimization algorithm for reducing the total area of the design. In Section 5, we describe our experimental setup and present our results. Finally we conclude the paper in Section Related Work Multiplications with constants have to be performed in many signal processing and communication applications such as FIR filters, audio, video and image
3 processing. Using a general purpose multiplier for performing constant multiplication is expensive in terms of area and latency. Though embedded multipliers are very popular in modern FPGAs, the size (18x18) and the number of these multipliers is a limitation. The partial product Constant Coefficient Multiplier called KCM in [9] is a popular approach for performing constant multiplications in FPGAs. The partial products are generated by table look-ups and are summed up using fast adders. This KCM is about one quarter the size of a fully functional multiplier [9]. istributed Arithmetic has been used extensively for implementing FIR filters on FPGAs [12, 13]. In [12], both Serial and Parallel istributed Arithmetic designs are used to implement a 16-tap FIR filter. The author notes the superior performance achieved by using PA over using a traditional VLIW SP processor. The Xilinx TM CORE Generator has a highly parameterizable, optimized filter core for implementing digital FIR filters [13]. It generates synthesized core that targeting a wide range of Xilinx devices. The Serial istributed Arithmetic (SA) architecture is very area efficient for serial implementations of filters. The fully Parallel istributed Arithmetic (PA) structure can be used for fastest sample rates, but it suffers from excessive area. In this work, we compare the area of our fully parallel architecture derived from the Shift and Add method and compare the area with that produced by PA. In [14], FIR filters are implemented using the Add and Shift method. Canonical Signed igit (CS) encoding is used for the coefficients to minimize the number of additions. The authors observe that the critical path of the filter can be reduced considerably by using registered adders, at minimal increase in area. This is because of the availability of a flip-flop at the output of the. This fact is illustrated in Figure 5, which compares a non-registered and a registered output. We also use registered adders in our design, and the performance of our filters is limited by the latency of the biggest adder. In comparison with [14], we extensively use common subexpression elimination for reducing the number of adders. Furthermore, the use of subexpression elimination increases the number of latches that are required to make the filter fully parallel. Our subexpression elimination algorithm considers both the cost of the latches as well as the cost of an adder. 3. Filter architecture We base our filter architecture on the transposed form of the FIR filter as shown in Figure 1. The filter can be divided into two main parts, the multiplier block and the delay block, and is illustrated in Figure 4. In the multiplier block, the current input variable x[n] is multiplied by all the coefficients of the filter to produce the y i outputs. These y i outputs are then delayed and added in the delay block to produce the filter output y[n]. We perform all our optimizations in the multiplier block. The constant multiplications are decomposed into registered additions and hardwire shifts. The additions are performed using two input adders, which are arranged in the fastest tree structure. We use registered adders, so that the performance of the filter is only limited by the slowest adder. We use common subexpression elimination extensively, to reduce the number of adders, which leads to a reduction in the area. To synchronize all the intermediate values in the computation, we insert registers in the dataflow, wherever necessary. X 1 y 1 X 0 y 0 X y s X z -1 Logic Block 2 Logic Block 1 carry s 1 s 0 X 1 y 1 y Logic Block 2 X 0 y 0 s' 0 Logic Block 1 s' carry (a) (b) Figure 5. Registered adder at no additional cost Performing subexpression elimination can sometimes increase the number of registers substantially, and the overall area could possibly increase. Consider the two expressions F 1 and F 2 which could be part of the multiplier block. F 1 = A B C F 2 = A B C E Figure 6 shows the original unoptimized expression trees. Both the expressions have a minimum critical path of two addition cycles. These expressions require a total of six registered adders for the fastest implementation, and no extra registers are required. From the expressions we can see that the computation A B C is common to both the expressions. If we extract this subexpression, we get the structure shown in Figure 7. Since both and E need to wait for two addition cycles to be added to (A B C), we need to use two registers each for and E, such that new values for A,B,C, and E can be read in at each clock cycle. Assuming that the cost of an adder and a register with the same bitwidth are the same, the structure shown in Figure 7 occupies more area than the one shown in Figure 6. A more careful subexpression elimination algorithm would only extract the common subexpression A B (or AC or B C). The number of adders is decreased by one from the original, and no additional registers are added. This is illustrated in Figure 8. The algorithm for performing this kind of optimization is described in the next section. Figure 6. Unoptimized expression trees s' 1
4 detected by performing intersections among the set of divisors. 4.2 Optimization algorithm Figure 7. Extracting common expression (A B C) We first calculate the minimum number of registers required for our design. We calculate this by arranging the original expressions in the fastest possible tree structure, and then inserting registers. For example, for the six term expression F = A B C E F, we have the fastest tree structure with three addition steps, and we require one register to synchronize the intermediate values, such that new values for A,B,C,,E,F can be read in every clock cycle. This is illustrated in Figure 9. Figure 8. Extracting common subexpression (AB) 4. Optimization algorithm The goal of our optimization is to reduce the area of the multiplier block by reducing the number of adders and any additional registers required for the fastest implementation of the FIR filter. We first give a brief overview of the common subexpression elimination methods. A detailed description can be found in [1]. We then present the modified optimization algorithm to be used for our work. 4.1 Overview of common subexpression elimination We use a polynomial transformation of constant multiplications. Given a representation for the constant C, and the variable X, the multiplication C*X can be represented as a summation of terms denoting the decomposition of the multiplication into shifts and additions as C*X = ± i i XL (V) The terms can be either positive or negative when the constants are represented using signed digit representations such as the Canonical Signed igit (CS) representation. The exponent of L represents the magnitude of the left shift and the i s represent the digit positions of the non-zero digits of the constants. For example the multiplication 7*X = (100-1) CS *X = X<<3 X = XL 3 X, using the polynomial transformation. We use the divisors to represent all possible common subexpressions. ivisors are obtained from an expression by looking at every pair of terms in the expression and dividing the terms by the minimum exponent of L. For example in the expression F = XL 2 XL 3 XL 5, consider the pair of terms (XL 2 XL 3 ). The minimum exponent of L in the two terms is L 2. ividing by L 2, we get the divisor (X XL). From the other two pairs of terms (XL 2 XL 5 ) and (XL 3 XL 5 ), we get the divisors (X XL 3 ) and (X XL 2 ) respectively. These divisors are significant, because every common subexpression in the set of expressions can be Figure 9. Calculating registers required for fastest evaluation We first generate all the divisors for the set of expressions describing the multiplier block. We then use an iterative algorithm, where we extract the divisor that has the greatest value. To calculate the value of the divisor, we assume that the cost of a registered adder and a register is the same. We calculate the value of a divisor as the number of additions saved by extracting it minus the number of registers that have to be added. After selecting the best divisor, we rewrite the expressions using it. We then generate new divisors from the new terms that have been generated due to rewriting, and add them to the dynamic list of divisors. The iteration stops when there is no valuable divisor remaining in the set of divisors. Consider the expressions shown in Figure 6. We need six registered adders and no additional registers for the fastest evaluation of F 1 and F 2. Now consider the selection of the divisor d 1 = (AB). This divisor saves one addition and does not increase the number of registers. ivisors (A C) and (B C) also have the same value, but (AB) is selected randomly. The expressions are now rewritten as: d 1 = (A B) F 1 = d 1 C F 2 = d 1 C E After rewriting the expressions and forming new divisors, the divisor d 2 = (d 1 C) is considered. This divisor saves one adder, but introduces five additional registers, as can be seen in Figure 7. Therefore this divisor has a value of - 4. No other valuable divisors can be found and the iteration stops. We end up with the expressions shown in Figure 8.
5 ReduceArea( {P i } ) { {P i } = Set of expressions in polynomial form; {} = Set o f divisors = ϕ ; //Step 1: Creating divisors and calculating minimum number of registers required for each expression P i in {P i } { { new } = Findivisors(P i ); Update frequency statistics of divisors in {}; {} = {} { new }; P i ->MinRegisters = Calculate Minimum registers required for fastest evaluation of P i ; } //Step 2: Iterative selection and elimination of best divisor while(1) { } Find d = ivisor in {} with greatest Value; // Value = Num Additions reduced Num Registers Added; if( d == NULL) break; Rewrite affected expressions in {P i } using d; Remove divisors in {} that have become invalid; Update frequency statistics of affected divisors; { new } = Set of new divisors from new terms added by division; {} = {} { new }; } Figure 10. Optimization algorithm to reduce area 5. Experiments The goal of our experiments was to compare the number of resources consumed by our add and shift method with that produced by the cores generated by the commercial Coregen TM tool, based on istributed Arithmetic. Besides the resources, we also compared the power consumption of the two implementations, and also measured the performance. For our experiments, we considered 10 FIR filters of various sizes (6, 10, 13, 20, 28, 41, 61, 71, 119 and 152 tap filters). We targeted the Xilinx Virtex II device for our experiments. The constants were normalized to 16 digit of precision and the input samples were assumed to be 12 bits wide. For the add and shift method, we decomposed all the constant multiplications into additions and shifts and optimized the expressions using the algorithm explained in Section 4.2. We used the Xilinx Integrated Software Environment (ISE) for performing synthesis and implementation of the designs. All the designs were synthesized for maximum performance. Table 1a shows the resources utilized for the various filters and the performance in terms of Million samples per second (Msps) for the filters implemented using the add and shift method. Table 1b, shows the same numbers for the filters implemented using Xilinx Coregen, using the Parallel istributed Arithmetic (PA) method. % Reductio Table 1a. Filter Synthesis using Add Shift method Filter (# taps) Slices s FFs Figure 11 plots the reduction in the number of resources, in terms of the number of Slices, Look Up Tables (s) and the number of Flip Flops (FFs). From the results, we can observe an average reduction of 58.7% in the number of s, and about 25% reduction in the number of slices and FFs. Though our algorithm does not optimize for performance, the synthesis produces better performance in most of the cases, and for the 13 and 20 tap filters, we observe about 26% improvement in performance. Table 1b. Filter Synthesis using Coregen (PA method) Reduction in resources # of taps Figure 11. Reduction in resources Performance (Msps) Filter (# taps) Slices s FFs Performance (Msps) Slices s FFs Figure 12 compares power consumption for our add/shift method versus Coregen TM. From the results we can observe up to 50% reduction in dynamic power consumption. We did not include the quiescent power into our calculation since that value is the same for both methods. The power consumption is the result of applying the same test stimulus to both designs and measuring the power using XPower tools provided by Xilinx ISE software.
6 Power (mw ynamic Power consumption Filter (# taps) Figure 11. Power consumption Add/Shift Coregen Comparison with MAC filters using embedded multipliers Coregen TM can produce FIR filters based on the Multiply Accumulate (MAC) method, which makes use of the embedded multipliers. We implemented some of these filters using the MAC method to compare the resource usage and performance with our add and shift method. ue to tool limitations we had to do the experiments for Virtex IV device and due to extremely long synthesis time for the MAC filter, we only implemented a subset of these filters. We present the synthesis results in terms of number of slices on the Virtex IV device and the performance in Msps in Table 2. Table 2. Comparing with MAC filter on Virtex IV Add Shift MAC Filter Method filter (# taps) Slices Msps Slices Msps From the table, it can be seen that the MAC filter uses fewer number of slices compared to the add-shift method, but it also uses the embedded multipliers. The number of multipliers is equal to the number of taps of the filter. For implementing the 61-tap MAC filter, we had to use a higher capacity device, since the device we were using did not have 62 embedded multipliers. The results show that we achieve higher performance as the filter size increases. 6. Conclusion In this paper we presented a multiplierless technique, based on the add and shift method and common subexpression elimination for low area, low power and high speed implementations of FIR filters. We validated our techniques on Virtex II TM devices where we observed significant area and power reductions over traditional istributed Arithmetic based techniques. In future, we would like to modify our algorithm to make use of the limited number of embedded multipliers available on the FPGA devices. 7. References [1] A.Hosangadi, F.Fallah, and R.Kastner, "Reducing Hardware complexity by iteratively eliminating two term common subexpressions", Asia South Pacific esign Automation Conference (ASP-AC), [2] K..Underwood and K.S.Hemmert, "Closing the Gap: CPU and FPGA Trends in Sustainable Floating- Point BLAS Performance", International Symposium on Field-Programmable Custom Computing Machines, California, USA, [3] L.Zhuo and V.K.Prasanna, "Sparse Matrix-Vector Multiplication on FPGAs", International Symposium on Field Programmable Gate Arrays (FPGA), Monterey, CA, [4] Y.Meng, A.P.Brown, R.A.Iltis, T.Sherwood, H.Lee, and R.Kastner, "MP Core: Algorithm and esign Techniques for Efficient Channel Estimation in Wireless Applications", esign Automation Conference (AC), Anaheim, CA, [5] B. L. Hutchings and B. E. Nelson, "Gigaop SP on FPGA", Acoustics, Speech, and Signal Processing, Proceedings. (ICASSP '01) IEEE International Conference on, [6] A.Alsolaim, J.Becker, M.Glesner, and J.Starzyk, "Architecture and Application of a ynamically Reconfigurable Hardware Array for Future Mobile Communication Systems", International Symposium on Field Programmable Custom Computing Machines (FCCM), [7] S.J.Melnikoff, S.F.uigley, and M.J.Russell, "Implementing a Simple Continuous Speech Recognition System on an FPGA", International Symposium on Field-Programmable Custom Computing Machines (FCCM), [8] T.Yokota, M.Nagafuchi, Y.Mekada, T.Yoshinaga, K.Ootsu, and T.Baba, "A Scalable FPGA-based Custom Computing Machine for Medical Image Processing", International Symposium on Field- Programmable Custom Computing Machines (FCCM), [9] K.Chapman, "Constant Coefficient Multipliers for the XC4000E," Xilinx Technical Report [10] M.J.Wirthlin, "Constant Coefficient Multiplication Using Look-Up Tables," Journal of VLSI Signal Processing, vol. 36, pp. 7-15, [11] K. Wiatr and E. Jamro, "Constant coefficient multiplication in FPGA structures", Euromicro Conference, Proceedings of the 26th, [12] G.R.Goslin, "A Guide to Using Field Programmable Gate Arrays (FPGAs) for Application-Specific igital Signal Processing Performance," Xilinx Application Note, San Jose [13] "istributed Arithmetic FIR Filter v9.0," Xilinx Product Specification [14] M. Yamada and A. Nishihara, "High-speed FIR digital filter with CS coefficients implemented on FPGA", esign Automation Conference, Proceedings of the ASP-AC Asia and South Pacific, 2001.
The Efficient Implementation of Numerical Integration for FPGA Platforms
Website: www.ijeee.in (ISSN: 2348-4748, Volume 2, Issue 7, July 2015) The Efficient Implementation of Numerical Integration for FPGA Platforms Hemavathi H Department of Electronics and Communication Engineering
More informationParallel FIR Filters. Chapter 5
Chapter 5 Parallel FIR Filters This chapter describes the implementation of high-performance, parallel, full-precision FIR filters using the DSP48 slice in a Virtex-4 device. ecause the Virtex-4 architecture
More informationSimultaneous Optimization of Delay and Number of Operations in Multiplierless Implementation of Linear Systems
Simultaneous Optimization of Delay and Number of Operations in Multiplierless Implementation of Linear Systems Abstract The problem of implementing linear systems in hardware by efficiently using shifts
More informationA SIMULINK-TO-FPGA MULTI-RATE HIERARCHICAL FIR FILTER DESIGN
A SIMULINK-TO-FPGA MULTI-RATE HIERARCHICAL FIR FILTER DESIGN Xiaoying Li 1 Fuming Sun 2 Enhua Wu 1, 3 1 University of Macau, Macao, China 2 University of Science and Technology Beijing, Beijing, China
More informationFPGA Implementation of Multiplierless 2D DWT Architecture for Image Compression
FPGA Implementation of Multiplierless 2D DWT Architecture for Image Compression Divakara.S.S, Research Scholar, J.S.S. Research Foundation, Mysore Cyril Prasanna Raj P Dean(R&D), MSEC, Bangalore Thejas
More informationSimultaneous Optimization of Delay and Number of Operations in Multiplierless Implementation of Linear Systems
Simultaneous Optimization of Delay and Number of Operations in Multiplierless Implementation of Linear Systems Anup Hosangadi Farzan Fallah Ryan Kastner University of California, Fujitsu Labs of America,
More informationTwo High Performance Adaptive Filter Implementation Schemes Using Distributed Arithmetic
Two High Performance Adaptive Filter Implementation Schemes Using istributed Arithmetic Rui Guo and Linda S. ebrunner Abstract istributed arithmetic (A) is performed to design bit-level architectures for
More informationDESIGN AND IMPLEMENTATION OF DA- BASED RECONFIGURABLE FIR DIGITAL FILTER USING VERILOGHDL
DESIGN AND IMPLEMENTATION OF DA- BASED RECONFIGURABLE FIR DIGITAL FILTER USING VERILOGHDL [1] J.SOUJANYA,P.G.SCHOLAR, KSHATRIYA COLLEGE OF ENGINEERING,NIZAMABAD [2] MR. DEVENDHER KANOOR,M.TECH,ASSISTANT
More informationPipelined Quadratic Equation based Novel Multiplication Method for Cryptographic Applications
, Vol 7(4S), 34 39, April 204 ISSN (Print): 0974-6846 ISSN (Online) : 0974-5645 Pipelined Quadratic Equation based Novel Multiplication Method for Cryptographic Applications B. Vignesh *, K. P. Sridhar
More informationImplementing FIR Filters
Implementing FIR Filters in FLEX Devices February 199, ver. 1.01 Application Note 73 FIR Filter Architecture This section describes a conventional FIR filter design and how the design can be optimized
More informationRUN-TIME RECONFIGURABLE IMPLEMENTATION OF DSP ALGORITHMS USING DISTRIBUTED ARITHMETIC. Zoltan Baruch
RUN-TIME RECONFIGURABLE IMPLEMENTATION OF DSP ALGORITHMS USING DISTRIBUTED ARITHMETIC Zoltan Baruch Computer Science Department, Technical University of Cluj-Napoca, 26-28, Bariţiu St., 3400 Cluj-Napoca,
More informationImplementation of Floating Point Multiplier Using Dadda Algorithm
Implementation of Floating Point Multiplier Using Dadda Algorithm Abstract: Floating point multiplication is the most usefull in all the computation application like in Arithematic operation, DSP application.
More informationAdaptive FIR Filter Using Distributed Airthmetic for Area Efficient Design
International Journal of Scientific and Research Publications, Volume 5, Issue 1, January 2015 1 Adaptive FIR Filter Using Distributed Airthmetic for Area Efficient Design Manish Kumar *, Dr. R.Ramesh
More informationImplementation of High Speed FIR Filter using Serial and Parallel Distributed Arithmetic Algorithm
Volume 25 No.7, July 211 Implementation of High Speed FIR Filter using Serial and Parallel Distriuted Arithmetic Algorithm Narendra Singh Pal Electronics &Communication, Dr.B.R.Amedkar Tecnology, Jalandhar
More informationMCM Based FIR Filter Architecture for High Performance
ISSN No: 2454-9614 MCM Based FIR Filter Architecture for High Performance R.Gopalana, A.Parameswari * Department Of Electronics and Communication Engineering, Velalar College of Engineering and Technology,
More informationA HIGH PERFORMANCE FIR FILTER ARCHITECTURE FOR FIXED AND RECONFIGURABLE APPLICATIONS
A HIGH PERFORMANCE FIR FILTER ARCHITECTURE FOR FIXED AND RECONFIGURABLE APPLICATIONS Saba Gouhar 1 G. Aruna 2 gouhar.saba@gmail.com 1 arunastefen@gmail.com 2 1 PG Scholar, Department of ECE, Shadan Women
More informationDesign of 2-D DWT VLSI Architecture for Image Processing
Design of 2-D DWT VLSI Architecture for Image Processing Betsy Jose 1 1 ME VLSI Design student Sri Ramakrishna Engineering College, Coimbatore B. Sathish Kumar 2 2 Assistant Professor, ECE Sri Ramakrishna
More informationISSN (Online), Volume 1, Special Issue 2(ICITET 15), March 2015 International Journal of Innovative Trends and Emerging Technologies
VLSI IMPLEMENTATION OF HIGH PERFORMANCE DISTRIBUTED ARITHMETIC (DA) BASED ADAPTIVE FILTER WITH FAST CONVERGENCE FACTOR G. PARTHIBAN 1, P.SATHIYA 2 PG Student, VLSI Design, Department of ECE, Surya Group
More informationVLSI Implementation of Low Power Area Efficient FIR Digital Filter Structures Shaila Khan 1 Uma Sharma 2
IJSRD - International Journal for Scientific Research & Development Vol. 3, Issue 05, 2015 ISSN (online): 2321-0613 VLSI Implementation of Low Power Area Efficient FIR Digital Filter Structures Shaila
More informationINTERNATIONAL JOURNAL OF PROFESSIONAL ENGINEERING STUDIES Volume 10 /Issue 1 / JUN 2018
A HIGH-PERFORMANCE FIR FILTER ARCHITECTURE FOR FIXED AND RECONFIGURABLE APPLICATIONS S.Susmitha 1 T.Tulasi Ram 2 susmitha449@gmail.com 1 ramttr0031@gmail.com 2 1M. Tech Student, Dept of ECE, Vizag Institute
More informationCHAPTER 4. DIGITAL DOWNCONVERTER FOR WiMAX SYSTEM
CHAPTER 4 IMPLEMENTATION OF DIGITAL UPCONVERTER AND DIGITAL DOWNCONVERTER FOR WiMAX SYSTEM 4.1 Introduction FPGAs provide an ideal implementation platform for developing broadband wireless systems such
More informationHIGH-PERFORMANCE RECONFIGURABLE FIR FILTER USING PIPELINE TECHNIQUE
HIGH-PERFORMANCE RECONFIGURABLE FIR FILTER USING PIPELINE TECHNIQUE Anni Benitta.M #1 and Felcy Jeba Malar.M *2 1# Centre for excellence in VLSI Design, ECE, KCG College of Technology, Chennai, Tamilnadu
More informationDesign of a Multiplier Architecture Based on LUT and VHBCSE Algorithm For FIR Filter
African Journal of Basic & Applied Sciences 9 (1): 53-58, 2017 ISSN 2079-2034 IDOSI Publications, 2017 DOI: 10.5829/idosi.ajbas.2017.53.58 Design of a Multiplier Architecture Based on LUT and VHBCSE Algorithm
More informationFPGA Implementation of 16-Point Radix-4 Complex FFT Core Using NEDA
FPGA Implementation of 16-Point FFT Core Using NEDA Abhishek Mankar, Ansuman Diptisankar Das and N Prasad Abstract--NEDA is one of the techniques to implement many digital signal processing systems that
More informationAn Efficient Implementation of Floating Point Multiplier
An Efficient Implementation of Floating Point Multiplier Mohamed Al-Ashrafy Mentor Graphics Mohamed_Samy@Mentor.com Ashraf Salem Mentor Graphics Ashraf_Salem@Mentor.com Wagdy Anis Communications and Electronics
More informationAn HEVC Fractional Interpolation Hardware Using Memory Based Constant Multiplication
2018 IEEE International Conference on Consumer Electronics (ICCE) An HEVC Fractional Interpolation Hardware Using Memory Based Constant Multiplication Ahmet Can Mert, Ercan Kalali, Ilker Hamzaoglu Faculty
More informationA Ripple Carry Adder based Low Power Architecture of LMS Adaptive Filter
A Ripple Carry Adder based Low Power Architecture of LMS Adaptive Filter A.S. Sneka Priyaa PG Scholar Government College of Technology Coimbatore ABSTRACT The Least Mean Square Adaptive Filter is frequently
More informationInternational Journal of Advanced Research in Electrical, Electronics and Instrumentation Engineering
An Efficient Implementation of Double Precision Floating Point Multiplier Using Booth Algorithm Pallavi Ramteke 1, Dr. N. N. Mhala 2, Prof. P. R. Lakhe M.Tech [IV Sem], Dept. of Comm. Engg., S.D.C.E, [Selukate],
More informationFault Tolerant Parallel Filters Based On Bch Codes
RESEARCH ARTICLE OPEN ACCESS Fault Tolerant Parallel Filters Based On Bch Codes K.Mohana Krishna 1, Mrs.A.Maria Jossy 2 1 Student, M-TECH(VLSI Design) SRM UniversityChennai, India 2 Assistant Professor
More informationArea And Power Efficient LMS Adaptive Filter With Low Adaptation Delay
e-issn: 2349-9745 p-issn: 2393-8161 Scientific Journal Impact Factor (SJIF): 1.711 International Journal of Modern Trends in Engineering and Research www.ijmter.com Area And Power Efficient LMS Adaptive
More informationComparison of Adders for optimized Exponent Addition circuit in IEEE754 Floating point multiplier using VHDL
International Journal of Engineering Research and Development e-issn: 2278-067X, p-issn: 2278-800X, www.ijerd.com Volume 11, Issue 07 (July 2015), PP.60-65 Comparison of Adders for optimized Exponent Addition
More informationIMPLEMENTATION OF AN ADAPTIVE FIR FILTER USING HIGH SPEED DISTRIBUTED ARITHMETIC
IMPLEMENTATION OF AN ADAPTIVE FIR FILTER USING HIGH SPEED DISTRIBUTED ARITHMETIC Thangamonikha.A 1, Dr.V.R.Balaji 2 1 PG Scholar, Department OF ECE, 2 Assitant Professor, Department of ECE 1, 2 Sri Krishna
More informationCOPY RIGHT. To Secure Your Paper As Per UGC Guidelines We Are Providing A Electronic Bar Code
COPY RIGHT 2018IJIEMR.Personal use of this material is permitted. Permission from IJIEMR must be obtained for all other uses, in any current or future media, including reprinting/republishing this material
More informationA Novel Approach of Area-Efficient FIR Filter Design Using Distributed Arithmetic with Decomposed LUT
IOSR Journal of Electronics and Communication Engineering (IOSR-JECE) e-issn: 2278-2834,p- ISSN: 2278-8735. Volume 7, Issue 2 (Jul. - Aug. 2013), PP 13-18 A Novel Approach of Area-Efficient FIR Filter
More informationDesign Optimization Techniques Evaluation for High Performance Parallel FIR Filters in FPGA
Design Optimization Techniques Evaluation for High Performance Parallel FIR Filters in FPGA Vagner S. Rosa Inst. Informatics - Univ. Fed. Rio Grande do Sul Porto Alegre, RS Brazil vsrosa@inf.ufrgs.br Eduardo
More informationImplementation and Impact of LNS MAC Units in Digital Filter Application
Implementation and Impact of LNS MAC Units in Digital Filter Application Hari Krishna Raja.V.S *, Christina Jesintha.R * and Harish.I * * Department of Electronics and Communication Engineering, Sri Shakthi
More informationA Dedicated Hardware Solution for the HEVC Interpolation Unit
XXVII SIM - South Symposium on Microelectronics 1 A Dedicated Hardware Solution for the HEVC Interpolation Unit 1 Vladimir Afonso, 1 Marcel Moscarelli Corrêa, 1 Luciano Volcan Agostini, 2 Denis Teixeira
More informationA High Speed Binary Floating Point Multiplier Using Dadda Algorithm
455 A High Speed Binary Floating Point Multiplier Using Dadda Algorithm B. Jeevan, Asst. Professor, Dept. of E&IE, KITS, Warangal. jeevanbs776@gmail.com S. Narender, M.Tech (VLSI&ES), KITS, Warangal. narender.s446@gmail.com
More informationImplementation of Double Precision Floating Point Multiplier in VHDL
ISSN (O): 2349-7084 International Journal of Computer Engineering In Research Trends Available online at: www.ijcert.org Implementation of Double Precision Floating Point Multiplier in VHDL 1 SUNKARA YAMUNA
More informationDesign and Implementation of Effective Architecture for DCT with Reduced Multipliers
Design and Implementation of Effective Architecture for DCT with Reduced Multipliers Susmitha. Remmanapudi & Panguluri Sindhura Dept. of Electronics and Communications Engineering, SVECW Bhimavaram, Andhra
More informationBy, Ajinkya Karande Adarsh Yoga
By, Ajinkya Karande Adarsh Yoga Introduction Early computer designers believed saving computer time and memory were more important than programmer time. Bug in the divide algorithm used in Intel chips.
More informationFPGA IMPLEMENTATION OF FLOATING POINT ADDER AND MULTIPLIER UNDER ROUND TO NEAREST
FPGA IMPLEMENTATION OF FLOATING POINT ADDER AND MULTIPLIER UNDER ROUND TO NEAREST SAKTHIVEL Assistant Professor, Department of ECE, Coimbatore Institute of Engineering and Technology Abstract- FPGA is
More informationHigh Performance Applications on Reconfigurable Clusters
High Performance Applications on Reconfigurable Clusters by Zahi Samir Nakad Thesis Submitted to the Faculty of Virginia Polytechnic Institute and State University In partial fulfillment of the requirements
More informationFPGA Matrix Multiplier
FPGA Matrix Multiplier In Hwan Baek Henri Samueli School of Engineering and Applied Science University of California Los Angeles Los Angeles, California Email: chris.inhwan.baek@gmail.com David Boeck Henri
More informationPERFORMANCE ANALYSIS OF HIGH EFFICIENCY LOW DENSITY PARITY-CHECK CODE DECODER FOR LOW POWER APPLICATIONS
American Journal of Applied Sciences 11 (4): 558-563, 2014 ISSN: 1546-9239 2014 Science Publication doi:10.3844/ajassp.2014.558.563 Published Online 11 (4) 2014 (http://www.thescipub.com/ajas.toc) PERFORMANCE
More informationAn FPGA Implementation of the Powering Function with Single Precision Floating-Point Arithmetic
An FPGA Implementation of the Powering Function with Single Precision Floating-Point Arithmetic Pedro Echeverría, Marisa López-Vallejo Department of Electronic Engineering, Universidad Politécnica de Madrid
More informationDESIGN AND IMPLEMENTATION BY USING BIT LEVEL TRANSFORMATION OF ADDER TREES FOR MCMS USING VERILOG
DESIGN AND IMPLEMENTATION BY USING BIT LEVEL TRANSFORMATION OF ADDER TREES FOR MCMS USING VERILOG 1 S. M. Kiranbabu, PG Scholar in VLSI Design, 2 K. Manjunath, Asst. Professor, ECE Department, 1 babu.ece09@gmail.com,
More informationBit-Level Optimization of Adder-Trees for Multiple Constant Multiplications for Efficient FIR Filter Implementation
Bit-Level Optimization of Adder-Trees for Multiple Constant Multiplications for Efficient FIR Filter Implementation 1 S. SRUTHI, M.Tech, Vlsi System Design, 2 G. DEVENDAR M.Tech, Asst.Professor, ECE Department,
More informationVLSI Design Of a Novel Pre Encoding Multiplier Using DADDA Multiplier. Guntur(Dt),Pin:522017
VLSI Design Of a Novel Pre Encoding Multiplier Using DADDA Multiplier 1 Katakam Hemalatha,(M.Tech),Email Id: hema.spark2011@gmail.com 2 Kundurthi Ravi Kumar, M.Tech,Email Id: kundurthi.ravikumar@gmail.com
More informationPerformance Analysis of CORDIC Architectures Targeted by FPGA Devices
International OPEN ACCESS Journal Of Modern Engineering Research (IJMER) Performance Analysis of CORDIC Architectures Targeted by FPGA Devices Guddeti Nagarjuna Reddy 1, R.Jayalakshmi 2, Dr.K.Umapathy
More informationReduction of Latency and Resource Usage in Bit-Level Pipelined Data Paths for FPGAs
Reduction of Latency and Resource Usage in Bit-Level Pipelined Data Paths for FPGAs P. Kollig B. M. Al-Hashimi School of Engineering and Advanced echnology Staffordshire University Beaconside, Stafford
More informationBatchu Jeevanarani and Thota Sreenivas Department of ECE, Sri Vasavi Engg College, Tadepalligudem, West Godavari (DT), Andhra Pradesh, India
Memory-Based Realization of FIR Digital Filter by Look-Up- Table Optimization Batchu Jeevanarani and Thota Sreenivas Department of ECE, Sri Vasavi Engg College, Tadepalligudem, West Godavari (DT), Andhra
More informationFPGA IMPLEMENTATION FOR REAL TIME SOBEL EDGE DETECTOR BLOCK USING 3-LINE BUFFERS
FPGA IMPLEMENTATION FOR REAL TIME SOBEL EDGE DETECTOR BLOCK USING 3-LINE BUFFERS 1 RONNIE O. SERFA JUAN, 2 CHAN SU PARK, 3 HI SEOK KIM, 4 HYEONG WOO CHA 1,2,3,4 CheongJu University E-maul: 1 engr_serfs@yahoo.com,
More informationDesign and Implementation of Low-Complexity Redundant Multiplier Architecture for Finite Field
Design and Implementation of Low-Complexity Redundant Multiplier Architecture for Finite Field Veerraju kaki Electronics and Communication Engineering, India Abstract- In the present work, a low-complexity
More informationFAST FIR FILTERS FOR SIMD PROCESSORS WITH LIMITED MEMORY BANDWIDTH
Key words: Digital Signal Processing, FIR filters, SIMD processors, AltiVec. Grzegorz KRASZEWSKI Białystok Technical University Department of Electrical Engineering Wiejska
More informationDesign and Analysis of Efficient Reconfigurable Wavelet Filters
Design and Analysis of Efficient Reconfigurable Wavelet Filters Amit Pande and Joseph Zambreno Department of Electrical and Computer Engineering Iowa State University Ames, IA 50011 Email: {amit, zambreno}@iastate.edu
More informationTSEA44 - Design for FPGAs
2015-11-24 Now for something else... Adapting designs to FPGAs Why? Clock frequency Area Power Target FPGA architecture: Xilinx FPGAs with 4 input LUTs (such as Virtex-II) Determining the maximum frequency
More informationLogiCORE IP FIR Compiler v7.0
LogiCORE IP FIR Compiler v7.0 Product Guide for Vivado Design Suite Table of Contents IP Facts Chapter 1: Overview Feature Summary.................................................................. 5 Licensing
More informationFIR Filter Architecture for Fixed and Reconfigurable Applications
FIR Filter Architecture for Fixed and Reconfigurable Applications Nagajyothi 1,P.Sayannna 2 1 M.Tech student, Dept. of ECE, Sudheer reddy college of Engineering & technology (w), Telangana, India 2 Assosciate
More informationHigh Level Abstractions for Implementation of Software Radios
High Level Abstractions for Implementation of Software Radios J. B. Evans, Ed Komp, S. G. Mathen, and G. Minden Information and Telecommunication Technology Center University of Kansas, Lawrence, KS 66044-7541
More informationMeasuring Improvement When Using HUB Formats to Implement Floating-Point Systems under Round-to- Nearest
Measuring Improvement When Using HUB Formats to Implement Floating-Point Systems under Round-to- Nearest Abstract: This paper analyzes the benefits of using half-unitbiased (HUB) formats to implement floatingpoint
More informationINTRODUCTION TO FPGA ARCHITECTURE
3/3/25 INTRODUCTION TO FPGA ARCHITECTURE DIGITAL LOGIC DESIGN (BASIC TECHNIQUES) a b a y 2input Black Box y b Functional Schematic a b y a b y a b y 2 Truth Table (AND) Truth Table (OR) Truth Table (XOR)
More informationTechnology-Dependent Optimization of FIR Filters based on Carry-Save Multiplier and 4:2 Compressor unit
ELECTRONICS, VOL. 20, NO. 2, ECEMBER 206 43 Technology-ependent Optimization of FIR Filters based on Carry-Save Multiplier and Compressor unit Burhan Khurshid and Roohie Naaz Abstract - This work presents
More informationOPTIMIZATION OF FIR FILTER USING MULTIPLE CONSTANT MULTIPLICATION
OPTIMIZATION OF FIR FILTER USING MULTIPLE CONSTANT MULTIPLICATION 1 S.Ateeb Ahmed, 2 Mr.S.Yuvaraj 1 Student, Department of Electronics and Communication/ VLSI Design SRM University, Chennai, India 2 Assistant
More informationOPTIMIZATION OF AREA COMPLEXITY AND DELAY USING PRE-ENCODED NR4SD MULTIPLIER.
OPTIMIZATION OF AREA COMPLEXITY AND DELAY USING PRE-ENCODED NR4SD MULTIPLIER. A.Anusha 1 R.Basavaraju 2 anusha201093@gmail.com 1 basava430@gmail.com 2 1 PG Scholar, VLSI, Bharath Institute of Engineering
More informationFPGA Polyphase Filter Bank Study & Implementation
FPGA Polyphase Filter Bank Study & Implementation Raghu Rao Matthieu Tisserand Mike Severa Prof. John Villasenor Image Communications/. Electrical Engineering Dept. UCLA 1 Introduction This document describes
More informationIMPLEMENTATION OF CONFIGURABLE FLOATING POINT MULTIPLIER
IMPLEMENTATION OF CONFIGURABLE FLOATING POINT MULTIPLIER 1 GOKILA D, 2 Dr. MANGALAM H 1 Department of ECE, United Institute of Technology, Coimbatore, Tamil nadu, India 2 Department of ECE, Sri Krishna
More informationHigh-Performance FIR Filter Architecture for Fixed and Reconfigurable Applications
High-Performance FIR Filter Architecture for Fixed and Reconfigurable Applications Pallavi R. Yewale ME Student, Dept. of Electronics and Tele-communication, DYPCOE, Savitribai phule University, Pune,
More informationAn FPGA Based Floating Point Arithmetic Unit Using Verilog
An FPGA Based Floating Point Arithmetic Unit Using Verilog T. Ramesh 1 G. Koteshwar Rao 2 1PG Scholar, Vaagdevi College of Engineering, Telangana. 2Assistant Professor, Vaagdevi College of Engineering,
More informationHigh Speed Systolic Montgomery Modular Multipliers for RSA Cryptosystems
High Speed Systolic Montgomery Modular Multipliers for RSA Cryptosystems RAVI KUMAR SATZODA, CHIP-HONG CHANG and CHING-CHUEN JONG Centre for High Performance Embedded Systems Nanyang Technological University
More informationSystem Verification of Hardware Optimization Based on Edge Detection
Circuits and Systems, 2013, 4, 293-298 http://dx.doi.org/10.4236/cs.2013.43040 Published Online July 2013 (http://www.scirp.org/journal/cs) System Verification of Hardware Optimization Based on Edge Detection
More informationComparison of pipelined IEEE-754 standard floating point multiplier with unpipelined multiplier
Journal of Scientific & Industrial Research Vol. 65, November 2006, pp. 900-904 Comparison of pipelined IEEE-754 standard floating point multiplier with unpipelined multiplier Kavita Khare 1, *, R P Singh
More informationAn Efficient Constant Multiplier Architecture Based On Vertical- Horizontal Binary Common Sub-Expression Elimination Algorithm
Volume-6, Issue-6, November-December 2016 International Journal of Engineering and Management Research Page Number: 229-234 An Efficient Constant Multiplier Architecture Based On Vertical- Horizontal Binary
More informationA Novel Distributed Arithmetic Multiplierless Approach for Computing Complex Inner Products
606 Int'l Conf. Par. and Dist. Proc. Tech. and Appl. PDPTA'5 A ovel Distributed Arithmetic Multiplierless Approach for Computing Complex Inner Products evin. Bowlyn, and azeih M. Botros. Ph.D. Candidate,
More informationA Novel Efficient VLSI Architecture for IEEE 754 Floating point multiplier using Modified CSA
RESEARCH ARTICLE OPEN ACCESS A Novel Efficient VLSI Architecture for IEEE 754 Floating point multiplier using Nishi Pandey, Virendra Singh Sagar Institute of Research & Technology Bhopal Abstract Due to
More informationXilinx DSP. High Performance Signal Processing. January 1998
DSP High Performance Signal Processing January 1998 New High Performance DSP Alternative New advantages in FPGA technology and tools: DSP offers a new alternative to ASICs, fixed function DSP devices,
More informationFixed Point LMS Adaptive Filter with Low Adaptation Delay
Fixed Point LMS Adaptive Filter with Low Adaptation Delay INGUDAM CHITRASEN MEITEI Electronics and Communication Engineering Vel Tech Multitech Dr RR Dr SR Engg. College Chennai, India MR. P. BALAVENKATESHWARLU
More informationFPGA Implementation of a High Speed Multistage Pipelined Adder Based CORDIC Structure for Large Operand Word Lengths
International Journal of Computer Science and Telecommunications [Volume 3, Issue 5, May 2012] 105 ISSN 2047-3338 FPGA Implementation of a High Speed Multistage Pipelined Adder Based CORDIC Structure for
More informationVHDL IMPLEMENTATION OF FLOATING POINT MULTIPLIER USING VEDIC MATHEMATICS
VHDL IMPLEMENTATION OF FLOATING POINT MULTIPLIER USING VEDIC MATHEMATICS I.V.VAIBHAV 1, K.V.SAICHARAN 1, B.SRAVANTHI 1, D.SRINIVASULU 2 1 Students of Department of ECE,SACET, Chirala, AP, India 2 Associate
More informationA Low Power Asynchronous FPGA with Autonomous Fine Grain Power Gating and LEDR Encoding
A Low Power Asynchronous FPGA with Autonomous Fine Grain Power Gating and LEDR Encoding N.Rajagopala krishnan, k.sivasuparamanyan, G.Ramadoss Abstract Field Programmable Gate Arrays (FPGAs) are widely
More informationAnalysis of High-performance Floating-point Arithmetic on FPGAs
Analysis of High-performance Floating-point Arithmetic on FPGAs Gokul Govindu, Ling Zhuo, Seonil Choi and Viktor Prasanna Dept. of Electrical Engineering University of Southern California Los Angeles,
More informationERROR MODELLING OF DUAL FIXED-POINT ARITHMETIC AND ITS APPLICATION IN FIELD PROGRAMMABLE LOGIC
ERROR MODELLING OF DUAL FIXED-POINT ARITHMETIC AND ITS APPLICATION IN FIELD PROGRAMMABLE LOGIC Chun Te Ewe, Peter Y. K. Cheung and George A. Constantinides Department of Electrical & Electronic Engineering,
More informationCHAPTER 4 BLOOM FILTER
54 CHAPTER 4 BLOOM FILTER 4.1 INTRODUCTION Bloom filter was formulated by Bloom (1970) and is used widely today for different purposes including web caching, intrusion detection, content based routing,
More informationDesign and Optimized Implementation of Six-Operand Single- Precision Floating-Point Addition
2011 International Conference on Advancements in Information Technology With workshop of ICBMG 2011 IPCSIT vol.20 (2011) (2011) IACSIT Press, Singapore Design and Optimized Implementation of Six-Operand
More informationEECS150 - Digital Design Lecture 6 - Field Programmable Gate Arrays (FPGAs)
EECS150 - Digital Design Lecture 6 - Field Programmable Gate Arrays (FPGAs) September 12, 2002 John Wawrzynek Fall 2002 EECS150 - Lec06-FPGA Page 1 Outline What are FPGAs? Why use FPGAs (a short history
More informationLow Power and Memory Efficient FFT Architecture Using Modified CORDIC Algorithm
Low Power and Memory Efficient FFT Architecture Using Modified CORDIC Algorithm 1 A.Malashri, 2 C.Paramasivam 1 PG Student, Department of Electronics and Communication K S Rangasamy College Of Technology,
More informationHead, Dept of Electronics & Communication National Institute of Technology Karnataka, Surathkal, India
Mapping Signal Processing Algorithms to Architecture Sumam David S Head, Dept of Electronics & Communication National Institute of Technology Karnataka, Surathkal, India sumam@ieee.org Objectives At the
More informationAn Efficient Design of Sum-Modified Booth Recoder for Fused Add-Multiply Operator
An Efficient Design of Sum-Modified Booth Recoder for Fused Add-Multiply Operator M.Chitra Evangelin Christina Associate Professor Department of Electronics and Communication Engineering Francis Xavier
More informationFigurel. TEEE-754 double precision floating point format. Keywords- Double precision, Floating point, Multiplier,FPGA,IEEE-754.
AN FPGA BASED HIGH SPEED DOUBLE PRECISION FLOATING POINT MULTIPLIER USING VERILOG N.GIRIPRASAD (1), K.MADHAVA RAO (2) VLSI System Design,Tudi Ramireddy Institute of Technology & Sciences (1) Asst.Prof.,
More informationInternational Journal of Research in Computer and Communication Technology, Vol 4, Issue 11, November- 2015
Design of Dadda Algorithm based Floating Point Multiplier A. Bhanu Swetha. PG.Scholar: M.Tech(VLSISD), Department of ECE, BVCITS, Batlapalem. E.mail:swetha.appari@gmail.com V.Ramoji, Asst.Professor, Department
More informationLOW COMPLEXITY SUBBAND ANALYSIS USING QUADRATURE MIRROR FILTERS
LOW COMPLEXITY SUBBAND ANALYSIS USING QUADRATURE MIRROR FILTERS Aditya Chopra, William Reid National Instruments Austin, TX, USA Brian L. Evans The University of Texas at Austin Electrical and Computer
More informationOutline. EECS150 - Digital Design Lecture 6 - Field Programmable Gate Arrays (FPGAs) FPGA Overview. Why FPGAs?
EECS150 - Digital Design Lecture 6 - Field Programmable Gate Arrays (FPGAs) September 12, 2002 John Wawrzynek Outline What are FPGAs? Why use FPGAs (a short history lesson). FPGA variations Internal logic
More informationImplementation of digit serial fir filter using wireless priority service(wps)
Implementation of digit serial fir filter using wireless priority service(wps) S.Aruna Assistant professor,ece department MVSR Engineering College Nadergul,Hyderabad-501510 V.Sravanthi PG Scholar, ECE
More informationFPGA architecture and design technology
CE 435 Embedded Systems Spring 2017 FPGA architecture and design technology Nikos Bellas Computer and Communications Engineering Department University of Thessaly 1 FPGA fabric A generic island-style FPGA
More informationAPPLICATION NOTE. Block Adaptive Filter. General Description of Adaptive Filters. Adaptive Filter Design Overview
APPLICATION NOTE Block Adaptive Filter XAPP 055 January 9, 1997 (Version 1.1) Application Note by Bill Allaire and Bud Fischer Summary This application note describes a specific design for implementing
More informationUniversity, Patiala, Punjab, India 1 2
1102 Design and Implementation of Efficient Adder based Floating Point Multiplier LOKESH BHARDWAJ 1, SAKSHI BAJAJ 2 1 Student, M.tech, VLSI, 2 Assistant Professor,Electronics and Communication Engineering
More informationFILTER SYNTHESIS USING FINE-GRAIN DATA-FLOW GRAPHS. Waqas Akram, Cirrus Logic Inc., Austin, Texas
FILTER SYNTHESIS USING FINE-GRAIN DATA-FLOW GRAPHS Waqas Akram, Cirrus Logic Inc., Austin, Texas Abstract: This project is concerned with finding ways to synthesize hardware-efficient digital filters given
More informationEmbedded Systems: Hardware Components (part I) Todor Stefanov
Embedded Systems: Hardware Components (part I) Todor Stefanov Leiden Embedded Research Center Leiden Institute of Advanced Computer Science Leiden University, The Netherlands Outline Generic Embedded System
More informationImplementation Of Quadratic Rotation Decomposition Based Recursive Least Squares Algorithm
157 Implementation Of Quadratic Rotation Decomposition Based Recursive Least Squares Algorithm Manpreet Singh 1, Sandeep Singh Gill 2 1 University College of Engineering, Punjabi University, Patiala-India
More informationDesign and Implementation of 3-D DWT for Video Processing Applications
Design and Implementation of 3-D DWT for Video Processing Applications P. Mohaniah 1, P. Sathyanarayana 2, A. S. Ram Kumar Reddy 3 & A. Vijayalakshmi 4 1 E.C.E, N.B.K.R.IST, Vidyanagar, 2 E.C.E, S.V University
More information