CHAPTER 4. DIGITAL DOWNCONVERTER FOR WiMAX SYSTEM


CHAPTER 4 IMPLEMENTATION OF DIGITAL UPCONVERTER AND DIGITAL DOWNCONVERTER FOR WiMAX SYSTEM

4.1 Introduction

FPGAs provide an ideal implementation platform for developing broadband wireless systems such as WCDMA and WiMAX. To accelerate these broadband systems, state-of-the-art, high-performance FPGAs are used. FPGAs have gained rapid acceptance over the past decade because they can be applied to a very wide range of applications. Using logic blocks and programmable routing resources, an FPGA can be configured to implement custom hardware functionality, and because it is completely reconfigurable it can be reprogrammed for new applications. High-level design tools such as System Generator and DSP Builder have shortened the design cycle. As FPGAs are truly parallel in nature, different processing operations do not have to compete for the same resources: each independent processing task is assigned to a dedicated section of the chip and can function autonomously, without any influence from other logic blocks. FPGAs aimed at dedicated DSP applications are also available, so the filtering operations currently implemented in custom VLSI devices can now be implemented in an FPGA (Sun, M.T. et al., 1989).

Distributed Arithmetic (DA) can be used to save resources in FPGA implementations of DSP functions. DA replaces combinational elements with memory, resulting in a low-cost look-up table (LUT) based FPGA implementation. The designer can also select a serial or parallel DA implementation to trade off speed against resource utilization (Stanley A. White, 1989).

In this chapter, FPGA implementations of the DUC and DDC for a WiMAX system are proposed using DA. Different configurations for serial and parallel implementations are presented and compared in terms of resource utilization on a Stratix II GX device. DSP Builder is used to implement pipelining and scaling of parameters. The basics of the DA architecture and methods to reduce the ROM requirement are presented in section 4.2. An overview of the architecture of the Stratix II GX device is given in section 4.3. Serial and parallel implementations of FIR filters with the DA architecture are explored in section 4.4. The implementations of the DUC and DDC are presented in sections 4.5 and 4.6 respectively.

4.2 Distributed Arithmetic Architecture

DA is a very efficient mechanism for trading combinational logic for memory in high-performance computation, and it can save significant area in DSP hardware design. When the number of elements in a vector is nearly the same as the word size, DA is quite fast because it replaces explicit multiplications with ROM look-ups, a technique that maps efficiently onto Field Programmable Gate Arrays (FPGAs) (Sun, M.T., 1989).

Figure 4.1: Basic Architecture of Distributed Arithmetic

In DA, multiplications are reordered and mixed in such a way that the arithmetic becomes

distributed through the structure rather than being lumped. With the advent of FPGA technology, DA plays a significant role in improving system performance. The basic architecture for a DA implementation is shown in figure 4.1. No multipliers are required; only accumulators, registers and read-only memories (ROMs) are used. N-bit registers store the input vectors. As an example, the general sum of products (SOP) equation (4.1), which defines the response of linear, time-invariant networks, is implemented with the DA architecture shown in figure 4.2.

$$y_n = \sum_{k=0}^{M-1} a_k\, b_k(n) \qquad (4.1)$$

where $y_n$ is the response of the network at time n, $b_k(n)$ is the k-th input variable at time n and $a_k$ is the weighting factor of the k-th input variable, which is constant for all n and therefore time invariant (Xilinx application note). Because the coefficients are constants, the sums of coefficients can be precomputed. For each bit position of the inputs, the partial sum has only $2^M$ possible values, which can be stored in a ROM of $2^M$ words. The bit-serial input data directly address the ROM, and the ROM output is added into an accumulator to obtain the inner sum. Additional control circuitry is required to handle subtraction when the sign bits address the ROM (Chung, J. C., et al., 1998). The accumulator output converges to the final result after N cycles, where N is the input word length.

To show this process, a FIR filter implemented using the DA architecture is shown in figure 4.2. The input vector X holds four elements of four bits each. The ROM contains all 16 combinations of the constant vector elements $A_i$. Each of the $X_i$ elements is delivered one bit at a time, MSB first. Every clock cycle, the register is loaded with the sum of the left-shifted previous register value and the current ROM contents. $T_s$ is the sign bit to control

the addition/subtraction operation. When $T_s$ is high, the accumulator subtracts the current ROM contents from the left-shifted previous result; when it is low, the accumulator adds the current ROM contents to the left-shifted previous result. After four cycles, the register holds the final dot product.

Figure 4.2: FIR Filter using Distributed Arithmetic

The only drawback is the size of the required ROM, which grows exponentially with each added address line. There is one address line for each element of the vector, so K elements need K address lines and a ROM of $2^K$ words. The ROM size can be reduced by two methods (Ansari, Z.A., 2003). The first method is based on ROM decomposition, shown in figure 4.3. The memory is partitioned into smaller parts and the outputs of all the ROMs are summed with an additional adder. The amount of memory is reduced from $2^N$ words to $2 \cdot 2^{N/2}$

words, if the original memory is partitioned into two parts. For N = 8, the number of words to be stored is reduced from $2^8 = 256$ to $2 \cdot 2^4 = 32$. Hence, this approach reduces the memory significantly at the cost of an additional adder.

Figure 4.3: Reducing the memory using decomposition.

The second approach is based on a special coding of the ROM contents. The memory size can be halved by using a scheme based on the identity

$$x = \frac{1}{2}\left[x - (-x)\right] \qquad (4.2)$$

In two's complement representation, a number is negated by inverting all of its bits and then adding 1 at the least significant position. With $\bar{x}_k$ denoting the inverted bit $x_k$ and $W_d$ the data word length, identity 4.2 can be rewritten as (Stanley A. White, 1989)

$$x = \frac{1}{2}\left[-x_0 + \sum_{k=1}^{W_d-1} x_k 2^{-k} - \left(-\bar{x}_0 + \sum_{k=1}^{W_d-1} \bar{x}_k 2^{-k} + 2^{-(W_d-1)}\right)\right] \qquad (4.3)$$

$$x = -(x_0 - \bar{x}_0)\,2^{-1} + \sum_{k=1}^{W_d-1} (x_k - \bar{x}_k)\,2^{-k-1} - 2^{-W_d} \qquad (4.4)$$

Notice that $x_k - \bar{x}_k$, the difference between a bit and its complement, can only take the values -1 or +1. Using this expression for each input, the FIR filter equation yields

$$y = \sum_{k=1}^{W_d-1} F_k(x_{1k}, x_{2k}, \ldots, x_{Nk})\,2^{-k-1} - F_0(x_{10}, x_{20}, \ldots, x_{N0})\,2^{-1} + F(0, 0, \ldots, 0)\,2^{-W_d} \qquad (4.5)$$

where

$$F_k(x_{1k}, x_{2k}, \ldots, x_{Nk}) = \sum_{i=1}^{N} a_i\,(x_{ik} - \bar{x}_{ik})$$

The function $F_k$ is shown in Table 4.1 for N = 3.

Table 4.1: Address and Contents of ROM

x1  x2  x3  |  F_k              |  y1  y2  |  A/S
0   0   0   |  -a1 - a2 - a3    |  0   0   |  A
0   0   1   |  -a1 - a2 + a3    |  0   1   |  A
0   1   0   |  -a1 + a2 - a3    |  1   0   |  A
0   1   1   |  -a1 + a2 + a3    |  1   1   |  A
1   0   0   |  +a1 - a2 - a3    |  1   1   |  S
1   0   1   |  +a1 - a2 + a3    |  1   0   |  S
1   1   0   |  +a1 + a2 - a3    |  0   1   |  S
1   1   1   |  +a1 + a2 + a3    |  0   0   |  S

Notice that only half of the values need to be stored, since the other half can be obtained by changing the signs. To exploit this redundancy, the address is modified as shown on the right of table 4.1, using equations 4.6 and 4.7.

$$y_1 = x_1 \oplus x_2 \qquad (4.6)$$

$$y_2 = x_1 \oplus x_3 \qquad (4.7)$$

Here, variable $x_1$ has been selected as the control signal. The add/sub control (i.e., $x_1$)

must also provide the correct addition/subtraction function when the sign bits are accumulated. Therefore, the following add/subtract control signal is used:

$$A/S = x_1 \oplus x_{\text{sign bit}} \qquad (4.8)$$

where the control signal $x_{\text{sign bit}}$ is zero at all times except when the sign bit arrives. Figure 4.4 shows the resulting principle for distributed arithmetic with a halved ROM. Only N - 1 variables are used to address the memory. The XOR gates used for halving the memory can be merged with the XOR gates used for inverting the function $F_k$.

Figure 4.4: Distributed arithmetic with smaller ROM

This technique for reducing the memory size can easily be implemented with a small modification of the shift accumulator.

4.3 General FPGA Architecture

Major FPGA specifications include the number of configurable logic blocks (CLBs), the number of fixed-function logic blocks such as multipliers, and the size of the memory resources. Although an FPGA chip has many other parts, these are typically the most

important when selecting and comparing FPGAs. The configurable blocks of logic, such as slices or logic cells, are made up of two basic elements: flip-flops and LUTs. Figure 4.5 shows the different parts of an FPGA.

Figure 4.5: Different Parts of an FPGA

Figure 4.6: Structure of an FPGA

The structure of an FPGA is array based, meaning that each chip comprises a two-dimensional array of logic blocks that can be interconnected via horizontal and vertical

routing channels. An illustration of this type of architecture is shown in figure 4.6. The CLB is based on LUTs. A LUT is a small one-bit-wide memory array, where the address lines of the memory are the inputs of the logic block and the one-bit output of the memory is the LUT output. A LUT with K inputs therefore corresponds to a $2^K \times 1$-bit memory and can realize any logic function of its K inputs by programming the truth table of that function directly into the memory.

Stratix II FPGAs

The Stratix II family of FPGAs is based on a 1.5 V, 0.13 μm, all-layer copper SRAM process, with densities of up to 79,040 logic elements (LEs) and up to 7.5 Mbits of RAM (Altera publication, 2002). Stratix devices offer up to 22 digital signal processing (DSP) blocks with up to 176 (9-bit × 9-bit) embedded multipliers, optimized for DSP applications, that enable efficient implementation of high-performance filters. Stratix devices support various I/O standards and also offer a complete clock management solution, with a hierarchical clock structure providing up to 420 MHz performance.

Stratix devices contain a two-dimensional row- and column-based architecture to implement custom logic. A series of column and row interconnects of varying length and speed provides signal interconnects between logic array blocks (LABs), memory block structures, and DSP blocks. The logic array consists of LABs, with 10 logic elements (LEs) in each LAB. An LE is a small unit of logic providing efficient implementation of user logic functions. LABs are grouped into rows and columns across the device.

M512 RAM blocks are simple dual-port memory blocks with 512 bits. These blocks provide dedicated simple dual-port or single-port memory up to 18 bits wide. M512 blocks are grouped into columns across the device in between certain LABs. M4K RAM blocks are dual-port memory blocks with 4K bits plus parity (4,608 bits). These blocks provide dedicated dual-port, simple dual-port, or single-port memory up to 36 bits wide. These

blocks are grouped into columns across the device in between certain LABs. M-RAM blocks are dual-port memory blocks with 512K bits. These blocks provide dedicated dual-port, simple dual-port, or single-port memory up to 144 bits wide. Several M-RAM blocks are located individually or in pairs within the device's logic array. DSP blocks can implement up to eight full-precision 9 × 9-bit multipliers, four full-precision 18 × 18-bit multipliers, or one full-precision 36 × 36-bit multiplier with add or subtract features. These blocks also contain 18-bit input shift registers for digital signal processing applications, including FIR and infinite impulse response (IIR) filters. DSP blocks are grouped into two columns in each device (Altera publication, 2002).

Figure 4.7: Block Diagram of Stratix II FPGA

Each Stratix device I/O pin is fed by an I/O element (IOE) located at the end of LAB rows and columns around the periphery of the device. I/O pins support numerous single-ended and differential I/O standards. Each IOE contains a bidirectional I/O buffer and six registers for registering input, output, and output enable signals. The number of M512 RAM, M4K RAM, and DSP blocks varies by device, along with the row and column numbers and the number of M-RAM blocks.

Logic Array Blocks (LABs)

The LAB local interconnect can drive LEs within the same LAB. It is driven by column and row interconnects and by LE outputs within the same LAB (Altera publication, 2002).

Figure 4.8: Stratix LAB Structure

Neighbouring LABs, M512 RAM blocks, M4K RAM blocks, or DSP blocks from the left and right can also drive a LAB's local interconnect through the direct link connection. The direct link connection feature minimizes the use of row and column interconnects,

providing higher performance and flexibility. Each LE can drive 30 other LEs through fast local and direct link interconnects. Each LAB contains dedicated logic for driving control signals to its LEs. The control signals include two clocks, two clock enables, two asynchronous clears, synchronous clear, asynchronous preset/load, synchronous load, and add/subtract control signals. This gives a maximum of 10 control signals at a time. Although synchronous load and clear signals are generally used when implementing counters, they can also be used with other functions. Each LAB's clock and clock enable signals are linked. If the LAB uses both the rising and falling edges of a clock, it also uses both LAB clock signals. Deasserting the clock enable signal turns off the LAB clock. Each LAB can use two asynchronous clear signals and an asynchronous load/preset signal. The asynchronous load acts as a preset when the asynchronous load data input is tied high. With the LAB addnsub control signal (see figure 4.9), a single LE can implement a one-bit adder and subtractor. This saves LE resources and improves performance for logic functions such as DSP correlators and signed multipliers that alternate between addition and subtraction depending on the data.

Logic Elements (LEs)

The smallest unit of logic in the Stratix architecture, the LE, is compact and provides advanced features with efficient logic utilization. Each LE contains a four-input LUT, which is a function generator that can implement any function of four variables (Altera publication, 2002). In addition, each LE contains a programmable register and a carry chain with carry-select capability. A single LE also supports a dynamic single-bit addition or subtraction mode, selectable by a LAB-wide control signal. Each LE drives all types of interconnects: local, row, column, LUT chain, register chain, and direct link interconnects. Each LE's programmable register can be configured for D, T, JK or SR operation.
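The idea that a four-input LUT can implement any function of four variables can be illustrated with a small behavioural model. The sketch below is not Altera's implementation; the helper names `make_lut` and `lut_read` and the example function are assumptions made only for illustration. It stores the truth table of a Boolean function in a 16-entry list and evaluates the function purely by memory look-up.

```python
def make_lut(func, k=4):
    """Program a k-input LUT: store func's truth table in a 2**k x 1-bit memory."""
    table = []
    for addr in range(2 ** k):
        # Bit i of the address is logic input i.
        bits = [(addr >> i) & 1 for i in range(k)]
        table.append(func(*bits) & 1)
    return table


def lut_read(table, *inputs):
    """Evaluate the LUT: the inputs form the address, the stored bit is the output."""
    addr = 0
    for i, bit in enumerate(inputs):
        addr |= (bit & 1) << i
    return table[addr]


def example_function(a, b, c, d):
    """Arbitrary four-variable function chosen for the demonstration."""
    return (a & b) ^ (c | d)


lut = make_lut(example_function)
print(len(lut))                  # number of stored bits
print(lut_read(lut, 1, 1, 0, 0))
```

Changing `example_function` changes only the stored table, not the read logic, which is the sense in which the LUT acts as a general function generator.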

Figure 4.9: Block Diagram of Stratix LE

Each register has data, true asynchronous load data, clock, clock enable, clear, and asynchronous load/preset inputs. Global signals, general-purpose I/O pins, or any internal logic can drive the register's clock and clear control signals. Either general-purpose I/O pins or internal logic can drive the clock enable, preset, asynchronous load, and asynchronous data. The asynchronous load data input comes from the data3 input of the LE. Each LE has three outputs that drive the local, row, and column routing resources. The LUT or register output can drive these three outputs independently. Two LE outputs drive column or row and direct link routing connections, and one drives local interconnect resources. This allows the LUT to drive one output while the register drives another output. This improves device utilization because the register and the LUT can serve unrelated functions.

TriMatrix Memory

TriMatrix memory consists of three types of RAM blocks: M512, M4K, and M-RAM blocks (Altera publication, 2002). Although these memory blocks are different, they

all can implement various types of memory with or without parity, including true dual-port, simple dual-port, and single-port RAM, ROM, and FIFO buffers. The largest TriMatrix memory block, the M-RAM block, is useful for applications where a large volume of data must be stored on-chip. The M-RAM block can be configured in true dual-port RAM, simple dual-port RAM, single-port RAM and FIFO RAM modes. Only synchronous operation is supported in the M-RAM block. The memory address depth and output width can be configured as 64K × 8 bits, 32K × 16 bits, 16K × 32 bits, 8K × 64 bits, and 4K × 128 bits. Mixed-width configurations are also possible, allowing different read and write widths.

Digital Signal Processing Block

The most commonly used DSP functions are finite impulse response (FIR) filters, complex FIR filters, infinite impulse response (IIR) filters, fast Fourier transform (FFT) functions and discrete cosine transform (DCT) functions. Additionally, some applications need specialized operations such as multiply-add and multiply-accumulate operations. Stratix devices provide DSP blocks to meet the arithmetic requirements of these functions. Each Stratix device has two columns of DSP blocks, which implement DSP functions faster than LE-based implementations. Each DSP block can be configured to support up to eight 9 × 9-bit multipliers, four 18 × 18-bit multipliers or one 36 × 36-bit multiplier (Altera publication, 2002). As indicated, the Stratix DSP block can support one 36 × 36-bit multiplier in a single DSP block. This is true for any matched-sign multiplication, but dynamic and mixed-sign multiplications are handled differently. The largest multiplications that fit into a single DSP block are therefore distinguished by the sign handling of the operands: unsigned by unsigned, signed by signed, unsigned by signed, signed by unsigned, signed by

dynamic sign, dynamic sign by signed, unsigned by dynamic sign, dynamic sign by unsigned, dynamic sign with a different sign control for each operand, or dynamic sign with the same sign control for both operands. DSP block multipliers can optionally feed an adder/subtractor or accumulator within the block, depending on the configuration. This makes routing to LEs easier, saves LE routing resources, and increases performance, because all connections and blocks are within the DSP block. The DSP block registers can also be used efficiently to implement shift registers for FIR filter applications.

Modes of Operation

Together with its adder, subtractor, and accumulate functions, a DSP block offers simple multiplier, multiply-accumulator and multipliers-adder modes of operation. In simple multiplier mode, shown in figure 4.10, the DSP block drives the multiplier sub-block result directly to the output, with or without an output register. Up to four 18 × 18-bit multipliers or eight 9 × 9-bit multipliers can drive their results directly out of one DSP block. DSP blocks can also implement one 36 × 36-bit multiplier in multiplier mode, using four 18 × 18-bit multipliers combined with a dedicated adder and internal shift circuitry to achieve 36-bit multiplication. In MAC mode, the DSP block drives multiplied results to the adder/subtractor/accumulator block configured as an accumulator, as shown in figure 4.11. Two multiply-accumulators of up to 18 × 18 bits can be implemented in one DSP block. The first and third multiplier sub-blocks are unused in this mode, because only one multiplier can feed each of the two accumulators. The multiply-accumulator output can be up to 52 bits wide. The addnsub signal can set the accumulator for decimation and the overflow signal indicates an underflow condition (Altera publication, 2002).

For FIR filters, the DSP block combines the four-multipliers adder mode with the shift register inputs.

Figure 4.10: Block Diagram of DSP block in Simple Multiplier Mode

Figure 4.11: Block Diagram of DSP block in Multiply Accumulate Mode

One set of shift inputs contains the filter data, while the other holds the coefficients, loaded in serial or in parallel. The input shift register eliminates the need for shift registers external to the DSP block. This architecture simplifies filter design, since the DSP block implements all of the filter circuitry. One DSP block can implement an entire 18-bit FIR filter with up to four taps.

Figure 4.12: Block Diagram of DSP block in Four Multiplier Adder Mode

For filters with more taps, DSP blocks can be cascaded accordingly (Altera publication, 2002).
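As a rough behavioural sketch of this cascading idea (not a model of the actual DSP block hardware), each block below holds four coefficients and a four-stage input shift register, produces a four-tap partial sum, and passes the sample it shifts out to the next block. The class name `FourTapBlock` and the function `cascaded_fir` are hypothetical names introduced for this illustration, and the coefficient count is assumed to be a multiple of four.

```python
class FourTapBlock:
    """Behavioural stand-in for one block in four-multipliers adder mode."""

    def __init__(self, coeffs):
        assert len(coeffs) == 4
        self.coeffs = list(coeffs)
        self.taps = [0, 0, 0, 0]          # input shift register

    def step(self, sample_in):
        """Shift in one sample; return (partial_sum, sample_shifted_out)."""
        shifted_out = self.taps[-1]
        self.taps = [sample_in] + self.taps[:-1]
        partial = sum(c * x for c, x in zip(self.coeffs, self.taps))
        return partial, shifted_out


def cascaded_fir(coeffs, samples):
    """FIR filter built from cascaded four-tap blocks (len(coeffs) % 4 == 0)."""
    blocks = [FourTapBlock(coeffs[i:i + 4]) for i in range(0, len(coeffs), 4)]
    outputs = []
    for x in samples:
        total, carry = 0, x
        for block in blocks:
            # The sample leaving one shift register feeds the next block.
            partial, carry = block.step(carry)
            total += partial
        outputs.append(total)
    return outputs


# Impulse response of an 8-tap filter made from two cascaded blocks.
print(cascaded_fir([1, 2, 3, 4, 5, 6, 7, 8], [1, 0, 0, 0, 0, 0, 0, 0, 0]))
```

Feeding a unit impulse returns the coefficients in order, which is a quick check that the shift-register chain spans both blocks correctly.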

I/O Structure

The IOE in Stratix devices contains a bidirectional I/O buffer, six registers and a latch for complete embedded bidirectional single data rate or DDR transfer. As shown in figure 4.13, the IOE contains two input registers (plus a latch), two output registers and two output enable registers. The design can use both input registers and the latch to capture DDR input, and both output registers to drive DDR outputs.

Figure 4.13: Stratix IOE structure

Additionally, the design can use the output enable (OE) register for fast clock-to-output enable timing. The negative-edge-clocked OE register is used for DDR SDRAM interfacing. The

Quartus II software automatically duplicates a single OE register that controls multiple output or bidirectional pins. The IOEs are located in I/O blocks around the periphery of the Stratix device. There are up to four IOEs per row I/O block and six IOEs per column I/O block. The row I/O blocks drive row, column, or direct link interconnects. The column I/O blocks drive column interconnects (Altera publication, 2002). Efficient use of the FPGA architecture already reduces resource usage, and DA with a suitable structural implementation improves the FPGA design further.

4.4 Distributed Arithmetic FIR Filter

As discussed in chapter 3, FIR filters have the advantages of linear phase, high stability, fewer finite-precision errors and efficient implementation. However, they require a higher order, i.e. more coefficients, than an IIR filter. This high order increases the hardware requirements, arithmetic operations, area usage and power consumption when designing and fabricating the filter. Reducing these parameters is therefore a major objective, which can be attained through efficient use of DA in the FPGA implementation. Mathematically, a FIR filter can be written as

$$y[n] = \sum_{k=0}^{N} a_k\, x[n-k] \qquad (4.9)$$

In Equation 4.9, x[n] represents the input, y[n] represents the filter output and $a_k$ represents the filter coefficients. This filter is of Nth order and contains N+1 taps. Equation 4.9 can be implemented conventionally using multipliers, adders and delay elements, as shown in figure 4.14. The delay elements can be implemented using memory elements, and at any time only the N most recent inputs need to be stored (Chang, T. S. and Jen, C. W., 1999). However, implementing the FIR filter in this manner is expensive, as it consumes N+1 MAC units, which becomes very costly for a high filter order N.
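To make the cost of the conventional structure concrete, the following minimal sketch evaluates Equation 4.9 directly with a delay line and counts the multiply-accumulate operations. The function name `direct_form_fir` and the sample coefficients are assumptions chosen for illustration only; the code models the arithmetic, not any particular hardware.

```python
def direct_form_fir(coeffs, samples):
    """Direct-form FIR: y[n] = sum_k a_k * x[n-k], with N+1 MACs per output."""
    delay_line = [0] * len(coeffs)        # delay_line[k] holds x[n-k]
    outputs, mac_count = [], 0
    for x in samples:
        delay_line = [x] + delay_line[:-1]
        acc = 0
        for a_k, x_k in zip(coeffs, delay_line):
            acc += a_k * x_k              # one multiply-accumulate per tap
            mac_count += 1
        outputs.append(acc)
    return outputs, mac_count


coeffs = [1, 2, 3]                        # N = 2, so N + 1 = 3 taps
y, macs = direct_form_fir(coeffs, [1, 0, 0, 4])
print(y, macs)
```

With three taps and four input samples the model performs twelve multiplications, i.e. N+1 per output sample, which is the cost that the DA approach aims to avoid.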

Figure 4.14: Conventional method for FIR Filter Implementation

To overcome the high MAC unit requirement, the DA architecture can be used, which is very efficient at implementing a sum of products (SOP) (Stanley A. White, 1989). DA implements MAC operations using LUTs/ROMs instead of dedicated multipliers. DA is bit serial in nature, and parallel implementations can be developed by using serial DA FIRs in parallel. Let the input variable x[n-k], in two's complement fixed-point fractional format, contain M bits, and let |x[n-k]| < 1. It can then be expressed as

$$x[n-k] = -x_{k,0} + \sum_{m=1}^{M-1} x_{k,m}\, 2^{-m} \qquad (4.10)$$

In Equation 4.10, $x_{k,0}$ is the Most Significant Bit (MSB) or sign bit and $x_{k,M-1}$ is the Least Significant Bit (LSB) of the M-bit variable x[n-k]. Note that the $x_{k,m}$ are binary variables and can only assume the values 0 or 1. Substituting Equation 4.10 in Equation 4.9, we get

$$y[n] = -\sum_{k=0}^{N} x_{k,0}\, a_k + \sum_{k=0}^{N} \sum_{m=1}^{M-1} x_{k,m}\, a_k\, 2^{-m} \qquad (4.11)$$
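The bit-level expansion in Equation 4.10 can be checked numerically. The sketch below, with the hypothetical helpers `to_bits` and `from_bits`, quantises a value in [-1, 1) to an M-bit two's complement word, lists its bits MSB first, and rebuilds the value using the negative-weighted sign bit and the positive fractional weights. Exact fractions are used so that no rounding hides an error.

```python
from fractions import Fraction


def to_bits(value, m_bits):
    """Quantise value in [-1, 1) to M-bit two's complement; bits MSB first."""
    code = round(value * 2 ** (m_bits - 1)) & (2 ** m_bits - 1)
    return [(code >> (m_bits - 1 - m)) & 1 for m in range(m_bits)]


def from_bits(bits):
    """Equation 4.10: x = -x_0 + sum over m = 1..M-1 of x_m * 2**-m."""
    fractional = sum(Fraction(bits[m], 2 ** m) for m in range(1, len(bits)))
    return -bits[0] + fractional


bits = to_bits(-0.375, 4)
print(bits, from_bits(bits))
```

For the 4-bit example the sign bit contributes -1 and the remaining bits add back 1/2 and 1/8, recovering -3/8.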

Equation 4.11 can be expanded and rearranged as

$$\begin{aligned}
y[n] = {} & -\left[x_{0,0}\, a_0 + x_{1,0}\, a_1 + x_{2,0}\, a_2 + \cdots + x_{N,0}\, a_N\right] \\
& + \left[x_{0,1}\, a_0 + x_{1,1}\, a_1 + x_{2,1}\, a_2 + \cdots + x_{N,1}\, a_N\right] 2^{-1} \\
& + \left[x_{0,2}\, a_0 + x_{1,2}\, a_1 + x_{2,2}\, a_2 + \cdots + x_{N,2}\, a_N\right] 2^{-2} \\
& + \cdots \\
& + \left[x_{0,M-1}\, a_0 + x_{1,M-1}\, a_1 + x_{2,M-1}\, a_2 + \cdots + x_{N,M-1}\, a_N\right] 2^{-(M-1)}
\end{aligned} \qquad (4.12)$$

In Equation 4.12, each product inside the square brackets denotes a logical AND operation and the plus signs denote arithmetic addition. The negative powers of 2 outside the brackets can be implemented simply by shifting the results of the computation to the right. The MAC operations in Equation 4.9 are thus converted to addition, subtraction, shifting and logical AND operations (Stanley A. White, 1989). The bits of the input variables can be used to address the LUT. A serial DA FIR filter can be constructed using a single LUT that is time shared to process all the bits. Input shift registers (ISRs) are required to supply the bits serially to the LUT in the serial DA FIR filter shown in figure 4.15. Bits are output from the ISR MSB first. To construct the parallel DA FIR filter shown in figure 4.16, M LUTs are required. The 1st bits of all the inputs are connected to the 1st LUT, the 2nd bits of all the inputs are connected to the 2nd LUT, and so on (Tyler J. Moeller and David R. Martinez, 1999). The parallel filter produces one output every clock cycle, whereas the serial filter produces one output every M clock cycles. The LUT addresses and contents are calculated from equation 4.13 and shown in table 4.2.

$$F = x_{0,0}\, a_0 + x_{1,0}\, a_1 + x_{2,0}\, a_2 \qquad (4.13)$$
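A bit-serial DA filter along the lines of Equation 4.12 can be sketched in a few lines. This is a behavioural model under stated assumptions, not the DSP Builder implementation used in this chapter: inputs are M-bit two's complement integers rather than fractions, bits are processed MSB first with a doubling (left-shifting) accumulator, and the names `build_lut`, `da_dot` and `da_fir` are hypothetical.

```python
def build_lut(coeffs):
    """LUT[addr] = sum of the coefficients whose address bit is 1 (cf. Eq. 4.13)."""
    return [sum(c for i, c in enumerate(coeffs) if (addr >> i) & 1)
            for addr in range(2 ** len(coeffs))]


def da_dot(lut, window, m_bits):
    """Bit-serial inner product; window[k] is x[n-k] as an M-bit signed integer."""
    mask = 2 ** m_bits - 1
    codes = [x & mask for x in window]            # two's complement bit patterns
    acc = 0
    for m in range(m_bits):                       # m = 0 is the sign bit (MSB first)
        shift = m_bits - 1 - m
        addr = 0
        for i, code in enumerate(codes):
            addr |= ((code >> shift) & 1) << i    # bit m of every input forms the address
        # Left-shift the accumulator; subtract the LUT word on the sign-bit cycle.
        acc = 2 * acc + (-lut[addr] if m == 0 else lut[addr])
    return acc


def da_fir(coeffs, samples, m_bits):
    """Serial DA FIR: one LUT, M look-ups per output sample, no multipliers."""
    lut = build_lut(coeffs)
    window = [0] * len(coeffs)
    outputs = []
    for x in samples:
        window = [x] + window[:-1]
        outputs.append(da_dot(lut, window, m_bits))
    return outputs


print(da_fir([3, -1, 2], [5, -7, 2, -8, 7], m_bits=4))
```

The model uses only look-ups, additions, one subtraction per output and shifts, and it takes M look-up cycles per output, matching the serial behaviour described above. A parallel version would evaluate the M look-ups at once with M copies of the LUT.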

Table 4.2: Address and Contents of an LUT

x0,0  x1,0  x2,0  |  Contents
0     0     0     |  0
0     0     1     |  a2
0     1     0     |  a1
0     1     1     |  a1 + a2
1     0     0     |  a0
1     0     1     |  a0 + a2
1     1     0     |  a0 + a1
1     1     1     |  a0 + a1 + a2

Figure 4.15: Serial Distributed Arithmetic FIR Filter

When all channels have the same filtering requirements, a multi-channel DA FIR filter can be constructed by time sharing the LUTs across data from multiple channels. For a multi-channel DA FIR filter, the amount of memory required to store input variables is larger, since the input variables of multiple streams have to be stored, but the logic resources required to compute the results are the same as for a single-channel filter. As the filter processes input data one bit at a time per clock cycle, therefore

serial structures will require a number of clock cycles equal to the input data width to calculate an output. In contrast, a parallel structure calculates the filter output in a single clock cycle, so parallel structures provide the highest speed at the expense of a large area. Another option is a multibit serial structure, which combines several small serial FIR filters in parallel to generate the FIR output. This structure provides greater throughput than a standard serial structure while using less area than a fully parallel structure. Thus different architectures can be used depending on the specific requirement in terms of area or speed.

Figure 4.16: Parallel Distributed Arithmetic FIR Filter

4.5 Design and Implementation of Proposed Digital Up Converter for WiMAX System

In this section, the design and implementation of the proposed DUC for a WiMAX system using DA is presented. Fully serial, multibit serial and fully parallel architectures are implemented in order to choose the best architecture. The

interpolation filters are implemented using a Nyquist FIR design with a direct-form polyphase structure. The input sample frequency, passband ripple and stopband attenuation are taken as 11.2 MHz, dB and 60 dB respectively. The interpolation factor is taken as 8. The proposed DUC is implemented by cascading a pulse shaping single rate FIR filter, an interpolation by 2 filter and an interpolation by 4 filter. The design and implementation of these three filters are presented in the following subsections.

Design and Implementation of Pulse Shaping Single Rate FIR Filter

In the DUC, the pulse shaping filter is used to attenuate out-of-band power in order to meet the spectral mask requirement. A root raised cosine (RRC) filter is well suited to pulse shaping, as its transition band response meets the Nyquist criterion. The pulse shaping single rate FIR filter is designed with a roll-off factor of 0.25 and a stopband attenuation of 60 dB. The passband and stopband frequencies are taken as 4.65 MHz and 5.35 MHz respectively. The filter is designed and implemented for fully serial, multibit serial and fully parallel architectures. The resources utilized by the different architectures and their performance in terms of speed are shown in tables 4.3 and 4.4. Table 4.3 shows that, for the DA fully serial architecture of the single rate filter, as the number of serial units is increased from 1 to 4, the number of logic cells increases from 3941 to 4051, i.e. an increase of 2.8%, whereas the number of clock cycles required to process input data and to generate output data decreases from 16 to 4, i.e. the speed increases fourfold. The results for the fully parallel architecture are shown in table 4.4. Table 4.4 shows that the DA fully parallel architecture with pipeline level 1 provides the best performance among all the parallel architectures.

Comparing the results of tables 4.3 and 4.4, the DA fully serial architecture with 4

Table 4.3: Comparison of FPGA Resource Utilization by Distributed Arithmetic Fully Serial Interpolator Single Rate Filter with Different Numbers of Serial Units
Columns: FPGA Resources | No. of Serial Units = 1 | No. of Serial Units = 2 | No. of Serial Units = 4
Rows: Logic Cells | M | M4K | Process Input Data | Generate Output Data

Table 4.4: Comparison of FPGA Resource Utilization by Distributed Arithmetic Fully Parallel Interpolator Single Rate Filter with Different Levels of Pipelining
Columns: Resources | Pipeline Level 1 | Pipeline Level 2 | Pipeline Level 3
Rows: Logic Cells | M | M4K | Process Input Data | Generate Output Data

26 serial units requires 4051 Logic cells, whereas DA fully parallel architecture with pipeline level of 1 requires 5137 Logic cells. And DA fully parallel architecture with pipeline level of 1 requires 1 clock cycle to process input data and 1 clock cycle to generate output data whereas DA fully serial architecture having 4 numbers of serial units requires 4 clock cycles to process input data and 4 clock cycles to generate output data. Thus as compared to DA fully serial architecture having 4 numbers of serial units, the speed of DA fully parallel architecture with pipeline level of 1 increases by four folds at an expense of only about 26.8% of FPGA resources. As best result in term of speed are obtained in fully parallel architecture with pipeline level of 1, so for this filter design, fully parallel architecture with pipeline level 1 is used Design and Implementation of Interpolation by 2 FIR Filter In interpolation by 2 filter, the input sample rate will be 11.2 Msps and at output, it will provide 22.4 Msps. So interpolation by 2 filter is designed with input sample rate 11.2 Msps, passband ripple of 0.015, stopband attenuation of 60 db and interpolation factor of 2. This interpolation by 2 filter is implemented for fully serial, multibit serial and fully parallel architectures. The resources utilized by different architectures and their performance in term of speed is shown in tables 4.5 and 4.6. From table 4.5, it is concluded that in case of DA fully serial architecture for interpolation by 2 filter, as the number of serial units are increased from 1 to 4, the number of logic cells increases from 523 to 1021 i.e. there is an increase of approximately 95%. Whereas number of clock cycles required to process input data decreases from 32 to 8 and number of clock cycles required to generate output data decreases from 16 to 4 i.e. the speed increases by fourfold. Table 4.6 shows the result for fully parallel architecture with pilpeline levels 1, 2 and 3. 
Pipeline level 1 shows the best results in terms of speed, with fewer resources, among the fully

Table 4.5: Comparison of FPGA Resource Utilization by Distributed Arithmetic Fully Serial Interpolation by 2 Filter with different Number of Serial Units
Columns: FPGA Resources; No. of Serial Units = 1; No. of Serial Units = 2; No. of Serial Units = 4
Rows: Logic Cells; M; M4K; Process Input Data; Generate Output Data

Table 4.6: Comparison of FPGA Resource Utilization by Distributed Arithmetic Fully Parallel Interpolation by 2 Filter with different levels of Pipelining
Columns: Resources; Pipeline Level 1; Pipeline Level 2; Pipeline Level 3
Rows: Logic Cells; M; M4K; Process Input Data; Generate Output Data

parallel architectures. On comparing the results of tables 4.5 and 4.6, it is concluded that the DA fully serial architecture with 4 serial units requires 1021 logic cells, whereas the DA fully parallel architecture with pipeline level 1 requires 1890 logic cells. Also, the DA fully parallel architecture with pipeline level 1 requires 2 clock cycles to process input data and 1 clock cycle to generate output data, whereas the DA fully serial architecture with 4 serial units requires 8 clock cycles to process input data and 4 clock cycles to generate output data. Thus, compared with the DA fully serial architecture with 4 serial units, the DA fully parallel architecture with pipeline level 1 is four times faster at the cost of about 85% more logic cells.

Design and Implementation of Interpolation by 4 FIR Filter

In the DUC, after the signal is interpolated by 2, it is interpolated by 4 to reach the required overall interpolation factor of 8. The input sample rate for the interpolation by 4 filter is 22.4 Msps, the passband ripple is dB and the stopband attenuation is 60 dB. This interpolation by 4 filter is designed and implemented for fully serial, multibit serial and fully parallel architectures. The resources utilized by the different architectures and their performance in terms of speed are shown in tables 4.7 and 4.8. From table 4.7, it is concluded that for the DA fully serial architecture for the interpolation by 4 filter, as the number of serial units is increased from 1 to 4, the number of logic cells increases from 584 to 818, an increase of approximately 40%, whereas the number of clock cycles required to process input data decreases from 64 to 16 and the number required to generate output data decreases from 16 to 4, i.e. the speed increases fourfold.
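The sample rates quoted for the two interpolation stages can be cross-checked by multiplying the input rate by each interpolation factor in turn. The sketch below assumes only the rates and factors stated in the text; the helper name `stage_rates` is hypothetical.

```python
from math import prod


def stage_rates(input_rate_msps, factors):
    """Sample rate at each stage boundary, starting from the input rate."""
    rates = [input_rate_msps]
    for factor in factors:
        rates.append(rates[-1] * factor)
    return rates


duc_factors = [2, 4]                    # interpolation by 2, then interpolation by 4
rates = stage_rates(11.2, duc_factors)  # Msps before, between and after the two stages

print(prod(duc_factors))  # overall interpolation factor of the two stages
print(rates)              # rate entering each stage and leaving the last one
```

The overall factor is 8, the intermediate rate is 22.4 Msps, and the final rate equals the 89.6 Msps that the DDC takes as its input sample rate.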
From table 4.8, it is concluded that for the DA fully parallel architecture for the interpolation by 4 filter, pipeline level 1 provides the best speed among all pipeline levels, with fewer required resources. On comparing the results of tables 4.7 and 4.8, it is concluded that the DA fully serial

Table 4.7: Comparison of FPGA Resource Utilization by Distributed Arithmetic Fully Serial Interpolation by 4 Filter with different Number of Serial Units
Columns: FPGA Resources; No. of Serial Units = 1; No. of Serial Units = 2; No. of Serial Units = 4
Rows: Logic Cells; M; M4K; Process Input Data; Generate Output Data

Table 4.8: Comparison of FPGA Resource Utilization by Distributed Arithmetic Fully Parallel Interpolation by 4 Filter with different levels of Pipelining
Columns: Resources; Pipeline Level 1; Pipeline Level 2; Pipeline Level 3
Rows: Logic Cells; M; M4K; Process Input Data; Generate Output Data

architecture with 4 serial units requires 818 logic cells, whereas the DA fully parallel architecture with pipeline level 1 requires 1038 logic cells. Also, the DA fully parallel architecture with pipeline level 1 requires 4 clock cycles to process input data and 1 clock cycle to generate output data, whereas the DA fully serial architecture with 4 serial units requires 16 clock cycles to process input data and 4 clock cycles to generate output data. Thus, compared with the DA fully serial architecture with 4 serial units, the DA fully parallel architecture with pipeline level 1 is four times faster at the cost of about 27% more logic cells.

Figure 4.17: Logic cells used by different stages of DUC with different number of serial units for fully serial DA architecture

The variations in the number of logic cells used by the pulse shaping, interpolation by 2 and interpolation by 4 filters are shown in figure 4.17 for the fully serial DA architecture with different numbers of serial units, and in figure 4.18 for the fully parallel DA architecture with different pipeline levels. From the above discussion, it is concluded that for implementing the different stages, the fully parallel DA architecture with pipeline level 1 provides high speed with a moderate area requirement. So, in the proposed design, fully

parallel DA architecture with pipeline level 1 is used to implement all the interpolator stages of the DUC for the WiMAX system.

Figure 4.18: Logic cells used by different stages of DUC with different levels of pipelining for fully parallel DA architecture

4.6 Design and Implementation of Proposed Digital Down Converter for WiMAX System

In this section, the design and implementation of the proposed DDC for the WiMAX system using DA is presented. Fully serial, multibit serial and fully parallel architectures are implemented in order to choose the best one. The decimation filters are implemented using a Nyquist FIR design with a direct form polyphase structure. The input sample rate, passband ripple and stopband attenuation are taken as 89.6 Msps, dB and 60 dB respectively. The overall decimation factor is 8. The proposed DDC is implemented by cascading a decimation by 4 filter, a decimation by 2 filter and a decimation channel filter. The design and implementation of the decimation by 4 filter,

decimation by 2 filter and channel filter are presented in the following subsections.

Design and Implementation of Decimation by 4 FIR Filter

The decimation by 4 filter reduces the sample rate from 89.6 Msps to 22.4 Msps. The design specifications are taken as stopband attenuation 60 dB, passband attenuation dB and decimation factor 4. This decimation by 4 filter is designed and implemented for fully serial, multibit serial and fully parallel architectures. The resources utilized by the different architectures and their performance in terms of speed are shown in tables 4.9 and 4.10.

Table 4.9: Comparison of FPGA Resource Utilization by Distributed Arithmetic Fully Serial Decimation by 4 Filter with different Number of Serial Units
Columns: FPGA Resources; No. of Serial Units = 1; No. of Serial Units = 2; No. of Serial Units = 4
Rows: Logic Cells; M; M4K; Process Input Data; Generate Output Data

From table 4.9, it is concluded that for the DA fully serial architecture for the decimation by 4 filter, as the number of serial units is increased from 1 to 4, the number of logic cells increases from 590 to 824, an increase of approximately 40%. However, the number of clock cycles required to process input data decreases from 16 to 4 and the number required to generate output data decreases from 64 to 16, i.e. the speed increases fourfold. From table 4.10, it is concluded that the DA fully parallel

architecture with pipeline level 1 outperforms the other pipeline levels.

Table 4.10: Comparison of FPGA Resource Utilization by Distributed Arithmetic Fully Parallel Decimation by 4 Filter with different levels of Pipelining
Columns: Resources; Pipeline Level 1; Pipeline Level 2; Pipeline Level 3
Rows: Logic Cells; M; M4K; Process Input Data; Generate Output Data

On comparing the results of tables 4.9 and 4.10, it is concluded that the DA fully serial architecture with 4 serial units requires 824 logic cells, whereas the DA fully parallel architecture with pipeline level 1 requires 1044 logic cells. Also, the DA fully parallel architecture with pipeline level 1 requires 1 clock cycle to process input data and 4 clock cycles to generate output data, whereas the DA fully serial architecture with 4 serial units requires 4 clock cycles to process input data and 16 clock cycles to generate output data. Thus, compared with the DA fully serial architecture with 4 serial units, the DA fully parallel architecture with pipeline level 1 is four times faster at the cost of about 27% more logic cells. So this filter is implemented with the DA fully parallel architecture with pipeline level 1.

Design and Implementation of Decimation by 2 FIR Filter

In the DDC, the decimation by 2 filter follows the decimation by 4 filter. Its function is to reduce the sample rate by a further factor of 2. So the input sample rate for

this filter is 22.4 Msps and the output sample rate is 11.2 Msps. Among the other design specifications, the passband ripple and stopband attenuation are taken as dB and 60 dB. This decimation by 2 filter is designed and implemented for fully serial, multibit serial and fully parallel architectures. The resources utilized by the different architectures and their performance in terms of speed are shown in tables 4.11 and 4.12.

Table 4.11: Comparison of FPGA Resource Utilization by Distributed Arithmetic Fully Serial Decimation by 2 Filter with different Number of Serial Units
Columns: FPGA Resources; No. of Serial Units = 1; No. of Serial Units = 2; No. of Serial Units = 4
Rows: Logic Cells; M; M4K; Process Input Data; Generate Output Data

From table 4.11, it is concluded that for the DA fully serial architecture for the decimation by 2 filter, as the number of serial units is increased from 1 to 4, the number of logic cells increases to 1024, an increase of approximately 94%, whereas the number of clock cycles required to process input data decreases from 16 to 4 and the number required to generate output data decreases from 32 to 8, i.e. the speed increases fourfold. From table 4.12, it can be seen that the DA fully parallel architecture with pipeline level 1 provides the best speed with fewer resources than the other parallel structures. On comparing the results of tables 4.11 and 4.12, it

is concluded that the DA fully serial architecture with 4 serial units requires 1024 logic cells, whereas the DA fully parallel architecture with pipeline level 1 requires 1893 logic cells. Also, the DA fully parallel architecture with pipeline level 1 requires 1 clock cycle to process input data and 2 clock cycles to generate output data, whereas the DA fully serial architecture with 4 serial units requires 4 clock cycles to process input data and 8 clock cycles to generate output data. Thus, compared with the DA fully serial architecture with 4 serial units, the DA fully parallel architecture with pipeline level 1 is four times faster at the cost of about 85% more logic cells. So the decimation by 2 filter is designed with the fully parallel architecture with pipeline level 1.

Table 4.12: Comparison of FPGA Resource Utilization by Distributed Arithmetic Fully Parallel Decimation by 2 Filter with different levels of Pipelining
Columns: Resources; Pipeline Level 1; Pipeline Level 2; Pipeline Level 3
Rows: Logic Cells; M; M4K; Process Input Data; Generate Output Data

Design and Implementation of Decimation Channel Filter

In the DDC, the channel filter is used after the decimation by 2 filter. The main function of this filter is to provide stopband attenuation to remove adjacent channel interference. In

addition, it also has to keep the passband ripple within range. For this filter, an RRC filter with Nyquist design is used, with roll-off factor 0.25 and stopband attenuation 60 dB. This decimation channel filter is designed and implemented for fully serial, multibit serial and fully parallel architectures. The resources utilized by the different architectures and their performance in terms of speed are shown in tables 4.13 and 4.14.

Table 4.13: Comparison of FPGA Resource Utilization by Distributed Arithmetic Fully Serial Decimator Channel Filter with different Number of Serial Units
Columns: FPGA Resources; No. of Serial Units = 1; No. of Serial Units = 2; No. of Serial Units = 4
Rows: Logic Cells; M; M4K; Process Input Data; Generate Output Data

From table 4.13, it is concluded that for the DA fully serial architecture for the single rate channel filter of the DDC, as the number of serial units is increased from 1 to 4, the number of logic cells increases from 2093 to 2255, an increase of approximately 8%, whereas the number of clock cycles required to process input data and to generate output data decreases from 16 to 4, i.e. the speed increases fourfold. From table 4.14, it is concluded that for the DA fully parallel architecture for the single rate channel filter, the pipeline level 1 parallel structure provides the best performance among the pipeline levels in

terms of speed with less area.

Table 4.14: Comparison of FPGA Resource Utilization by Distributed Arithmetic Fully Parallel Decimator Channel Filter with different levels of Pipelining
Columns: Resources; Pipeline Level 1; Pipeline Level 2; Pipeline Level 3
Rows: Logic Cells; M; M4K; Process Input Data; Generate Output Data

On comparing the results of tables 4.13 and 4.14, it is concluded that the DA fully serial architecture with 4 serial units requires 2255 logic cells, whereas the DA fully parallel architecture with pipeline level 1 requires 3148 logic cells. Also, the DA fully parallel architecture with pipeline level 1 requires 1 clock cycle to process input data and 1 clock cycle to generate output data, whereas the DA fully serial architecture with 4 serial units requires 4 clock cycles to process input data and 4 clock cycles to generate output data. Thus, compared with the DA fully serial architecture with 4 serial units, the DA fully parallel architecture with pipeline level 1 is four times faster at the cost of about 40% more logic cells. So this filter is designed with the DA fully parallel architecture with pipeline level 1. The variations in the number of logic cells used by the decimation by 4, decimation by 2 and decimation channel filters for the fully serial DA architecture with different numbers of

serial units are shown in figure 4.19, and those for the fully parallel DA architecture with different pipeline levels are shown in figure 4.20.

Figure 4.19: Logic cells used by different stages of DDC with different number of serial units for fully serial DA architecture

Figure 4.20: Logic cells used by different stages of DDC with different levels of pipelining for fully parallel DA architecture

From this discussion, it is concluded that the fully parallel DA architecture with pipeline level 1 has high speed with

moderate area requirement. So, in the proposed design, the fully parallel DA architecture with pipeline level 1 is used to implement all decimator stages of the DDC for the WiMAX system. Overall, the fully parallel DA architecture with pipeline level 1 is used to implement all interpolator and decimator stages of the DUC and DDC for the WiMAX system.

4.7 Conclusions

Because of their high performance and their ability to implement DSP functions efficiently, FPGAs are a good choice for increasing the performance of broadband communication systems such as WiMAX. The availability of high level design tools also helps to shorten the design cycle for FPGA implementation. DA can be used to implement low cost LUT based DSP functions in either serial or parallel form. When the number of elements in a vector is the same as the word size, DA gives a fast operating speed, which is achieved by replacing multiplications with ROM based LUTs. Decomposition and coding techniques are used to reduce the ROM size. FIR filters can be implemented using a serial or parallel DA architecture. A parallel DA FIR filter produces one output every clock cycle, whereas a serial DA FIR filter requires M clock cycles to produce an output. The parallel architecture therefore provides higher speed. The multibit serial architecture is another option, which combines several small serial FIR units in parallel. It provides greater throughput than the standard serial architecture, but less than the parallel architecture. So, to improve performance in terms of speed, the DA parallel architecture with pipeline level 1 is used for the proposed designs of the interpolation and decimation filters of the DUC and DDC for the WiMAX system.
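To make the LUT based idea concrete, the following is a minimal software sketch of a bit-serial DA inner product. It is not the DSP Builder implementation used in this chapter: the names `build_lut` and `da_dot`, the integer coefficients and the 8-bit two's-complement word width are assumptions chosen purely for illustration.

```python
def build_lut(coeffs):
    """Precompute every partial sum of the coefficients (the DA ROM contents)."""
    n = len(coeffs)
    return [sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
            for addr in range(1 << n)]


def da_dot(coeffs, samples, width):
    """Inner product of coeffs and samples using LUT reads, shifts and adds only.

    samples must fit in `width`-bit two's complement.
    """
    lut = build_lut(coeffs)
    mask = (1 << width) - 1
    words = [s & mask for s in samples]  # two's-complement bit patterns
    acc = 0
    for bit in range(width):  # one LUT access per bit position
        # The LUT address collects the current bit of every sample.
        addr = 0
        for k, w in enumerate(words):
            addr |= ((w >> bit) & 1) << k
        term = lut[addr] << bit
        # The most significant bit carries negative weight in two's complement.
        acc += -term if bit == width - 1 else term
    return acc


coeffs = [3, -1, 4, 2]
samples = [5, -7, 2, -8]
direct = sum(c * x for c, x in zip(coeffs, samples))  # conventional multiply-accumulate
print(da_dot(coeffs, samples, 8) == direct)
```

In this sketch the loop over bit positions plays the role of the serial architecture, with one LUT access per bit of the input word; a parallel architecture would evaluate all bit positions at once with replicated LUTs and an adder tree, which is the speed against area trade-off discussed above.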

Stratix. Introduction. Features... Programmable Logic Device Family. Preliminary Information

Stratix. Introduction. Features... Programmable Logic Device Family. Preliminary Information Stratix Programmable Logic Device Family February 2002, ver. 1.0 Data Sheet Introduction Preliminary Information The Stratix family of programmable logic devices (PLDs) is based on a 1.5-V, 0.13-µm, all-layer

More information

Stratix. Introduction. Features... 10,570 to 114,140 LEs; see Table 1. FPGA Family. Preliminary Information

Stratix. Introduction. Features... 10,570 to 114,140 LEs; see Table 1. FPGA Family. Preliminary Information Stratix FPGA Family December 2002, ver. 3.0 Data Sheet Introduction Preliminary Information The Stratix TM family of FPGAs is based on a 1.5-V, 0.13-µm, all-layer copper SRAM process, with densities up

More information

Section I. Cyclone FPGA Family Data Sheet

Section I. Cyclone FPGA Family Data Sheet Section I. Cyclone FPGA Family Data Sheet This section provides designers with the data sheet specifications for Cyclone devices. The chapters contain feature definitions of the internal architecture,

More information

Altera FLEX 8000 Block Diagram

Altera FLEX 8000 Block Diagram Altera FLEX 8000 Block Diagram Figure from Altera technical literature FLEX 8000 chip contains 26 162 LABs Each LAB contains 8 Logic Elements (LEs), so a chip contains 208 1296 LEs, totaling 2,500 16,000

More information

Chapter 2. Cyclone II Architecture

Chapter 2. Cyclone II Architecture Chapter 2. Cyclone II Architecture CII51002-1.0 Functional Description Cyclone II devices contain a two-dimensional row- and column-based architecture to implement custom logic. Column and row interconnects

More information

Parallel FIR Filters. Chapter 5

Parallel FIR Filters. Chapter 5 Chapter 5 Parallel FIR Filters This chapter describes the implementation of high-performance, parallel, full-precision FIR filters using the DSP48 slice in a Virtex-4 device. ecause the Virtex-4 architecture

More information

Implementing FIR Filters

Implementing FIR Filters Implementing FIR Filters in FLEX Devices February 199, ver. 1.01 Application Note 73 FIR Filter Architecture This section describes a conventional FIR filter design and how the design can be optimized

More information

Distributed by: www.jameco.com 1-800-831-4242 The content and copyrights of the attached material are the property of its owner. Section I. Stratix II Device Family Data Sheet This section provides the

More information

Section I. Cyclone FPGA Family Data Sheet

Section I. Cyclone FPGA Family Data Sheet Section I. Cyclone FPGA Family Data Sheet This section provides designers with the data sheet specifications for Cyclone devices. The chapters contain feature definitions of the internal architecture,

More information

19. Implementing High-Performance DSP Functions in Stratix & Stratix GX Devices

19. Implementing High-Performance DSP Functions in Stratix & Stratix GX Devices 19. Implementing High-Performance SP Functions in Stratix & Stratix GX evices S52007-1.1 Introduction igital signal processing (SP) is a rapidly advancing field. With products increasing in complexity,

More information

Introduction to Field Programmable Gate Arrays

Introduction to Field Programmable Gate Arrays Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden, 31 May 9 June 2007 Javier Serrano, CERN AB-CO-HT Outline Digital Signal

More information

2. Stratix II Architecture

2. Stratix II Architecture 2. Stratix II Architecture SII51002-4.3 Functional Description Stratix II devices contain a two-dimensional row- and column-based architecture to implement custom logic. A series of column and row interconnects

More information

Using the DSP Blocks in Stratix & Stratix GX Devices

Using the DSP Blocks in Stratix & Stratix GX Devices Using the SP Blocks in Stratix & Stratix GX evices November 2002, ver. 3.0 Application Note 214 Introduction Traditionally, designers had to make a trade-off between the flexibility of off-the-shelf digital

More information

Section I. Cyclone FPGA Family Data Sheet

Section I. Cyclone FPGA Family Data Sheet Section I. Cyclone FPGA Family Data Sheet This section provides designers with the data sheet specifications for Cyclone devices. The chapters contain feature definitions of the internal architecture,

More information

Basic FPGA Architectures. Actel FPGAs. PLD Technologies: Antifuse. 3 Digital Systems Implementation Programmable Logic Devices

Basic FPGA Architectures. Actel FPGAs. PLD Technologies: Antifuse. 3 Digital Systems Implementation Programmable Logic Devices 3 Digital Systems Implementation Programmable Logic Devices Basic FPGA Architectures Why Programmable Logic Devices (PLDs)? Low cost, low risk way of implementing digital circuits as application specific

More information

IMPLEMENTATION OF AN ADAPTIVE FIR FILTER USING HIGH SPEED DISTRIBUTED ARITHMETIC

IMPLEMENTATION OF AN ADAPTIVE FIR FILTER USING HIGH SPEED DISTRIBUTED ARITHMETIC IMPLEMENTATION OF AN ADAPTIVE FIR FILTER USING HIGH SPEED DISTRIBUTED ARITHMETIC Thangamonikha.A 1, Dr.V.R.Balaji 2 1 PG Scholar, Department OF ECE, 2 Assitant Professor, Department of ECE 1, 2 Sri Krishna

More information

4. DSP Blocks in Stratix IV Devices

4. DSP Blocks in Stratix IV Devices 4. DSP Blocks in Stratix IV Devices February 2011 SIV51004-3.1 SIV51004-3.1 This chapter describes how the Stratix IV device digital signal processing (DSP) blocks are optimized to support DSP applications

More information

EN2911X: Reconfigurable Computing Topic 01: Programmable Logic

EN2911X: Reconfigurable Computing Topic 01: Programmable Logic EN2911X: Reconfigurable Computing Topic 01: Programmable Logic Prof. Sherief Reda School of Engineering, Brown University Fall 2012 1 FPGA architecture Programmable interconnect Programmable logic blocks

More information

Field Programmable Gate Array (FPGA)

Field Programmable Gate Array (FPGA) Field Programmable Gate Array (FPGA) Lecturer: Krébesz, Tamas 1 FPGA in general Reprogrammable Si chip Invented in 1985 by Ross Freeman (Xilinx inc.) Combines the advantages of ASIC and uc-based systems

More information

RUN-TIME RECONFIGURABLE IMPLEMENTATION OF DSP ALGORITHMS USING DISTRIBUTED ARITHMETIC. Zoltan Baruch

RUN-TIME RECONFIGURABLE IMPLEMENTATION OF DSP ALGORITHMS USING DISTRIBUTED ARITHMETIC. Zoltan Baruch RUN-TIME RECONFIGURABLE IMPLEMENTATION OF DSP ALGORITHMS USING DISTRIBUTED ARITHMETIC Zoltan Baruch Computer Science Department, Technical University of Cluj-Napoca, 26-28, Bariţiu St., 3400 Cluj-Napoca,

More information

8. Migrating Stratix II Device Resources to HardCopy II Devices

8. Migrating Stratix II Device Resources to HardCopy II Devices 8. Migrating Stratix II Device Resources to HardCopy II Devices H51024-1.3 Introduction Altera HardCopy II devices and Stratix II devices are both manufactured on a 1.2-V, 90-nm process technology and

More information

CHAPTER 3 MULTISTAGE FILTER DESIGN FOR DIGITAL UPCONVERTER AND DOWNCONVERTER

CHAPTER 3 MULTISTAGE FILTER DESIGN FOR DIGITAL UPCONVERTER AND DOWNCONVERTER CHAPTER 3 MULTISTAGE FILTER DESIGN FOR DIGITAL UPCONVERTER AND DOWNCONVERTER 3.1 Introduction The interpolation and decimation filter design problem is a very important issue in the modern digital communication

More information

A SIMULINK-TO-FPGA MULTI-RATE HIERARCHICAL FIR FILTER DESIGN

A SIMULINK-TO-FPGA MULTI-RATE HIERARCHICAL FIR FILTER DESIGN A SIMULINK-TO-FPGA MULTI-RATE HIERARCHICAL FIR FILTER DESIGN Xiaoying Li 1 Fuming Sun 2 Enhua Wu 1, 3 1 University of Macau, Macao, China 2 University of Science and Technology Beijing, Beijing, China

More information

Section I. Cyclone II Device Family Data Sheet

Section I. Cyclone II Device Family Data Sheet Section I. Cyclone II Device Family Data Sheet This section provides provides information for board layout designers to successfully layout their boards for Cyclone II devices. It contains the required

More information

FPGA architecture and design technology

FPGA architecture and design technology CE 435 Embedded Systems Spring 2017 FPGA architecture and design technology Nikos Bellas Computer and Communications Engineering Department University of Thessaly 1 FPGA fabric A generic island-style FPGA

More information

2. TriMatrix Embedded Memory Blocks in Stratix II and Stratix II GX Devices

2. TriMatrix Embedded Memory Blocks in Stratix II and Stratix II GX Devices 2. TriMatrix Embedded Memory Blocks in Stratix II and Stratix II GX Devices SII52002-4.5 Introduction Stratix II and Stratix II GX devices feature the TriMatrix memory structure, consisting of three sizes

More information

Topics. Midterm Finish Chapter 7

Topics. Midterm Finish Chapter 7 Lecture 9 Topics Midterm Finish Chapter 7 ROM (review) Memory device in which permanent binary information is stored. Example: 32 x 8 ROM Five input lines (2 5 = 32) 32 outputs, each representing a memory

More information

Chapter 5. Digital Design and Computer Architecture, 2 nd Edition. David Money Harris and Sarah L. Harris. Chapter 5 <1>

Chapter 5. Digital Design and Computer Architecture, 2 nd Edition. David Money Harris and Sarah L. Harris. Chapter 5 <1> Chapter 5 Digital Design and Computer Architecture, 2 nd Edition David Money Harris and Sarah L. Harris Chapter 5 Chapter 5 :: Topics Introduction Arithmetic Circuits umber Systems Sequential Building

More information

INTRODUCTION TO FPGA ARCHITECTURE

INTRODUCTION TO FPGA ARCHITECTURE 3/3/25 INTRODUCTION TO FPGA ARCHITECTURE DIGITAL LOGIC DESIGN (BASIC TECHNIQUES) a b a y 2input Black Box y b Functional Schematic a b y a b y a b y 2 Truth Table (AND) Truth Table (OR) Truth Table (XOR)

More information

INTEGER SEQUENCE WINDOW BASED RECONFIGURABLE FIR FILTERS.

INTEGER SEQUENCE WINDOW BASED RECONFIGURABLE FIR FILTERS. INTEGER SEQUENCE WINDOW BASED RECONFIGURABLE FIR FILTERS Arulalan Rajan 1, H S Jamadagni 1, Ashok Rao 2 1 Centre for Electronics Design and Technology, Indian Institute of Science, India (mrarul,hsjam)@cedt.iisc.ernet.in

More information

The Role of Distributed Arithmetic in FPGA-based Signal Processing

The Role of Distributed Arithmetic in FPGA-based Signal Processing Introduction The Role of Distributed Arithmetic in FPGA-based Signal Processing Distributed Arithmetic (DA) plays a key role in embedding DSP functions in the Xilinx 4000 family of FPGA devices. In this

More information

Section I. MAX II Device Family Data Sheet

Section I. MAX II Device Family Data Sheet Section I. MAX II Device Family Data Sheet This section provides designers with the data sheet specifications for MAX II devices. The chapters contain feature definitions of the internal architecture,

More information

MAX 10 FPGA Device Overview

MAX 10 FPGA Device Overview 2014.09.22 M10-OVERVIEW Subscribe MAX 10 devices are the industry s first single chip, non-volatile programmable logic devices (PLDs) to integrate the optimal set of system components. The following lists

More information

Section I. Cyclone II Device Family Data Sheet

Section I. Cyclone II Device Family Data Sheet Section I. Cyclone II Device Family Data Sheet This section provides information for board layout designers to successfully layout their boards for Cyclone II devices. It contains the required PCB layout

More information

Field Programmable Gate Array (FPGA) Devices

Field Programmable Gate Array (FPGA) Devices Field Programmable Gate Array (FPGA) Devices 1 Contents Altera FPGAs and CPLDs CPLDs FPGAs with embedded processors ACEX FPGAs Cyclone I,II FPGAs APEX FPGAs Stratix FPGAs Stratix II,III FPGAs Xilinx FPGAs

More information

ECE 645: Lecture 1. Basic Adders and Counters. Implementation of Adders in FPGAs

ECE 645: Lecture 1. Basic Adders and Counters. Implementation of Adders in FPGAs ECE 645: Lecture Basic Adders and Counters Implementation of Adders in FPGAs Required Reading Behrooz Parhami, Computer Arithmetic: Algorithms and Hardware Design Chapter 5, Basic Addition and Counting,

More information

Intel Stratix 10 Logic Array Blocks and Adaptive Logic Modules User Guide

Intel Stratix 10 Logic Array Blocks and Adaptive Logic Modules User Guide Intel Stratix 10 Logic Array Blocks and Adaptive Logic Modules User Guide Subscribe Send Feedback Latest document on the web: PDF HTML Contents Contents 1 Intel Stratix 10 LAB and Overview... 3 2 HyperFlex

More information

EECS150 - Digital Design Lecture 6 - Field Programmable Gate Arrays (FPGAs)

EECS150 - Digital Design Lecture 6 - Field Programmable Gate Arrays (FPGAs) EECS150 - Digital Design Lecture 6 - Field Programmable Gate Arrays (FPGAs) September 12, 2002 John Wawrzynek Fall 2002 EECS150 - Lec06-FPGA Page 1 Outline What are FPGAs? Why use FPGAs (a short history

More information

Stratix II Device Handbook, Volume 1

Stratix II Device Handbook, Volume 1 Stratix II Device Handbook, Volume 1 101 Innovation Drive San Jose, CA 95134 www.altera.com SII5V1-4.5 Copyright 2011 Altera Corporation. All rights reserved. Altera, The Programmable Solutions Company,

More information

Low Power Design Techniques

Low Power Design Techniques Low Power Design Techniques August 2005, ver 1.0 Application Note 401 Introduction This application note provides low-power logic design techniques for Stratix II and Cyclone II devices. These devices

More information
