ON THE IMPLEMENTATION OF DIFFERENT HYPERBOLIC TANGENT SOLUTIONS IN FPGA. Darío Baptista and Fernando Morgado-Dias


Madeira Interactive Technologies Institute and Centro de Competências de Ciências Exactas e da Engenharia, Universidade da Madeira, Campus da Penteada, Funchal, Madeira, Portugal.

Abstract: This paper describes the development of an artificial neuron using blocks from the System Generator library. The advantage of using System Generator blocks is that they are easy to convert to HDL and to implement in an FPGA. The first solution presented here is an efficient "query-by-table" method (lookup table), in which the function's values are pre-calculated and stored in a table. The second set of solutions implements third-degree polynomials. For the polynomial implementation three methods were developed, each using different hardware for the multiplication: Booth's algorithm, 2's complement and Digital Signal Processors (DSPs). The best solution describes the hyperbolic tangent with an error equal to 1x

Copyright CONTROLO2012

Keywords: Artificial Neural Networks, System Generator, Hyperbolic Tangent.

1. INTRODUCTION

Almost since Artificial Neural Networks (ANN) were first used for modelling and control, there has been a need for hardware implementations. This need led to many experimental and academic prototypes and to several commercial solutions. One of the first commercial implementations was developed by Nestor and Intel in 1993 for Optical Character Recognition (OCR) and uses Radial Basis Functions (Morgado-Dias, 2004). The need for physical implementations also arises in medical applications of ANN that work inside or in close connection with the body. The work in (Stieglitz, 2006) contains many examples of neural vision systems that cannot be used attached to computers.
A large share of the difficulty of implementing an ANN in hardware comes from the non-linear activation function, most often a hyperbolic tangent. Since this function cannot easily be calculated from elementary operations, it has to be replaced by an approximation that both consumes fewer resources and does not delay the calculation too much. To this end many solutions have been tested, among them piecewise linear functions, polynomial approximations, Taylor series, Elliott functions and the CORDIC algorithm. The best solutions found in the review of the literature are (Pinto, 2006), which uses a set of five 5th order polynomial approximations with a maximum error of 8x10^-5 (using the sigmoid function), and (Ferreira, 2007), which uses a piecewise linear approximation determined by an optimized algorithm and achieved an error of 2.18x10^-5 in the hyperbolic tangent and of 5x10^-8 in a control loop simulation.

This paper presents an automated solution that uses Xilinx System Generator to obtain an FPGA implementation of a neuron. It develops the work published in (Reis et al., 2011), with improved solutions for the activation function.

2. THEORETICAL DESCRIPTION

A basic step in the construction of an artificial neuron is the appropriate choice of activation function. This paper focuses on the implementation of the hyperbolic tangent using System Generator blocks. To build this activation function it is first necessary to analyze its parity. Figure 1 shows that the function is symmetric about the origin, i.e., it is an odd function:

$f(-x) = -f(x)$ (1)

Fig. 1: Graphical representation of hyperbolic tangent

2.1 Implementation of Hyperbolic Tangent using the ROM

The first implementation of the hyperbolic tangent consists in storing, in a Read Only Memory (ROM), the function's values for input values between 0 and 6. For input values greater than 6 the function is taken as equal to 1. The implementation of the hyperbolic tangent function using System Generator blocks is represented in Figure 2 (Reis et al., 2011).

Fig. 2: Implementation of the hyperbolic tangent function using a ROM.

The ROM stores the values of the hyperbolic tangent, but the inputs of the lookup table (LUT) are memory addresses rather than the inputs of the hyperbolic tangent. The outputs are unsigned fixed-point values. For the LUT to perform the required function several transformations are necessary. These are done by the subsystem represented in Figure 2 (Reis et al., 2011). To make efficient use of the memory, the hyperbolic tangent is only stored for values between 0 and 6, since the function is odd and saturates.

2.2 Implementation of Hyperbolic Tangent using an interpolating polynomial

The second implementation of this function uses a polynomial interpolator. To do this, a set of polynomials was determined that describes the hyperbolic tangent function for input values between 0 and 6. To determine these polynomials the Chebyshev interpolation was used. To achieve high accuracy the domain of the function was split into smaller intervals. For this implementation 25 third-degree polynomials were used, in order to describe the hyperbolic tangent with a Mean Square Error (MSE) equal to 1x. For input values higher than 6, the function is approximated by the constant value 1. Each polynomial has the following general form:

$P(x) = a + bx + cx^2 + dx^3$ (2)

Note that all coefficients are represented with 32 bits, of which 2 bits are for the integer part and 30 bits are for the fractional part. If the polynomial were implemented exactly as written in eq. (2), 3 additions and 6 multiplications would have to be performed. However, the same polynomial can be written so that it needs only 3 multiplications and 3 additions, which saves resources in the FPGA:

$P(x) = a + (b + (c + dx)x)x$ (3)

Fig. 3: Flowchart of the hyperbolic tangent function using polynomial interpolation
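The polynomial approach described above can be sketched in software. The following is a minimal floating-point illustration, not the authors' System Generator design: it assumes 25 equal-width intervals on [0, 6] (the text gives the number of polynomials but not how the domain was split), fits each cubic through four Chebyshev nodes, evaluates it in the nested form of eq. (3), and uses the odd symmetry of eq. (1) for negative inputs. The names `fit_cubic`, `horner` and `tanh_poly` are hypothetical.

```python
import math

X_MAX = 6.0      # beyond this input the text replaces the function by the constant 1
SEGMENTS = 25    # number of cubics quoted in the text; equal widths are an assumption


def chebyshev_nodes(lo, hi, n=4):
    """Return n Chebyshev nodes mapped from [-1, 1] onto [lo, hi]."""
    mid, half = 0.5 * (lo + hi), 0.5 * (hi - lo)
    return [mid + half * math.cos((2 * k + 1) * math.pi / (2 * n)) for k in range(n)]


def fit_cubic(lo, hi, f=math.tanh):
    """Coefficients (a, b, c, d) of the cubic interpolating f at 4 Chebyshev nodes."""
    xs = chebyshev_nodes(lo, hi)
    newton = [f(x) for x in xs]
    # Newton divided differences, computed in place.
    for j in range(1, 4):
        for i in range(3, j - 1, -1):
            newton[i] = (newton[i] - newton[i - 1]) / (xs[i] - xs[i - j])
    # Expand the nested Newton form into a + b*x + c*x^2 + d*x^3 (lowest power first).
    mono = [newton[3]]
    for i in (2, 1, 0):
        shifted = [0.0] + mono                       # mono * x
        scaled = [-xs[i] * m for m in mono] + [0.0]  # mono * (-x_i)
        mono = [s + t for s, t in zip(shifted, scaled)]
        mono[0] += newton[i]
    return tuple(mono)


def horner(coeffs, x):
    """Evaluate eq. (3): three multiplications and three additions."""
    a, b, c, d = coeffs
    return a + (b + (c + d * x) * x) * x


# One cubic per interval, stored the way a coefficient memory would hold them.
TABLE = [fit_cubic(i * X_MAX / SEGMENTS, (i + 1) * X_MAX / SEGMENTS)
         for i in range(SEGMENTS)]


def tanh_poly(x):
    """Approximate tanh(x) using odd symmetry, saturation and a per-interval cubic."""
    sign = -1.0 if x < 0 else 1.0
    mag = abs(x)
    if mag >= X_MAX:
        return sign
    idx = min(int(mag * SEGMENTS / X_MAX), SEGMENTS - 1)  # guard against rounding up
    return sign * horner(TABLE[idx], mag)
```

Because the sketch works in double precision, it says nothing about the error of the 32-bit fixed-point hardware version; it only illustrates the structure (interval selection, coefficient lookup, nested evaluation).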

Figure 3 shows the flowchart of the polynomial interpolator's implementation. It shows that only two operations are needed to implement this idea: addition and multiplication. The multiplication was implemented in three different forms: using a specific feature of the FPGA (DSP, Digital Signal Processor), using Booth's algorithm and using 2's complement multiplication.

2.2.1 Multiplication using DSPs

The DSPs are devices which have the capacity to process large quantities of operations in a short time. This ability makes them well suited to applications where delay is not tolerable. When using an FPGA that has this resource available, the user can choose whether or not DSPs are used for a given operation.

2.2.2 Multiplication using 2's complement

This implementation allows a multiplication without the use of DSPs. Figure 4 shows the procedure to calculate a multiplication using 2's complement (Morgado Dias, 2010). In this example 4-bit numbers are used. The multiplicand can have any sign and the multiplier must be positive.

Fig. 4: Example of multiplication in 2's complement using 4-bit numbers.

Figure 4 shows that, at the beginning, an initial partial sum is applied (the partial sum is 0 because the accumulator is initialized to 0). The first product, which is the first digit of the multiplier (B0) multiplied by the multiplicand, is placed underneath this partial sum (Morgado Dias, 2010). The next step is to calculate the next partial sum, to which sign extension is applied. Underneath this partial sum is placed the product of the second digit of the multiplier (B1) and the multiplicand, after shifting and sign extension. This procedure is repeated until the product of the last digit of the multiplier (B3) and the multiplicand has been added, which gives the desired result (Morgado Dias, 2010). Figure 5 shows the implementation of the algorithm using 2's complement.

Fig. 5: Flowchart of the implementation of the multiplication using 2's complement

2.2.3 Implementation of multiplication using Booth's algorithm

This implementation also allows a multiplication without the use of DSPs. Figure 6 shows the procedure to calculate a multiplication using Booth's algorithm (Sêrro, 2007). The example uses 4-bit numbers with a positive multiplier and a multiplicand of any sign.

Fig. 6: Example of multiplication using Booth's algorithm.

Figure 6 shows that, at the beginning, an initial partial sum is applied (the initial partial sum is 0 because the accumulator is initialized to 0) and a copy of the multiplicand is placed underneath this partial sum. The operation performed between the partial sum and the multiplicand is determined by the pair of bits "$B_i, B_{i-1}$". Table 1 shows all possible combinations of the pair "$B_i, B_{i-1}$" and their respective operations (Sêrro, 2007).
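As a rough software model of the two DSP-free multipliers described above, the sketch below works on 4-bit operands with an 8-bit accumulator. It is an illustration under stated assumptions, not the System Generator implementation of Figures 5 and 7: the function names are hypothetical, sign extension is modelled by masking Python integers to the accumulator width, and the Booth rule used is the standard radix-2 one (subtract the shifted multiplicand on the bit pair 10, add it on 01, do nothing on 00 and 11, with the bit to the right of B0 taken as 0).

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def _check_signed(value, bits):
    """Raise if value does not fit in a signed two's complement word of `bits` bits."""
    if not -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
        raise ValueError(f"{value} does not fit in {bits} signed bits")


def shift_add_multiply(multiplicand, multiplier, bits=4):
    """Signed multiplicand times non-negative multiplier by shift-and-add."""
    _check_signed(multiplicand, bits)
    if not 0 <= multiplier < (1 << (bits - 1)):
        raise ValueError("multiplier must be non-negative and fit in the word")
    mask = (1 << (2 * bits)) - 1
    m = multiplicand & mask          # sign-extended multiplicand, 2*bits wide
    acc = 0                          # the accumulator starts at 0
    for i in range(bits):
        if (multiplier >> i) & 1:    # digit B_i selects the shifted multiplicand
            acc = (acc + (m << i)) & mask
    return to_signed(acc, 2 * bits)


def booth_multiply(multiplicand, multiplier, bits=4):
    """Radix-2 Booth multiplication of two signed `bits`-wide operands."""
    _check_signed(multiplicand, bits)
    _check_signed(multiplier, bits)
    mask = (1 << (2 * bits)) - 1
    m = multiplicand & mask
    b = multiplier & ((1 << bits) - 1)
    acc = 0
    prev = 0                         # implicit bit to the right of B0
    for i in range(bits):
        cur = (b >> i) & 1
        if (cur, prev) == (1, 0):    # pair 10: subtract the shifted multiplicand
            acc = (acc - (m << i)) & mask
        elif (cur, prev) == (0, 1):  # pair 01: add the shifted multiplicand
            acc = (acc + (m << i)) & mask
        prev = cur                   # pairs 00 and 11: no operation
    return to_signed(acc, 2 * bits)
```

For example, `shift_add_multiply(-3, 5)` and `booth_multiply(-3, 5)` both return -15. The Booth model also accepts a negative multiplier, which goes beyond the positive-multiplier example discussed in the text.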

Table 1: Possible combinations of the pair "$B_i, B_{i-1}$" and their respective operations.

$B_i$ $B_{i-1}$ | Operation between the partial sum and the multiplicand
10 | Subtraction
01 | Addition
00 | No operation
11 | No operation

To get the next partial sum, sign extension is applied. Next, the shifted multiplicand is placed under the partial sum. Table 1 is consulted once more to determine the operation. This is repeated until the last pair of bits of the multiplier ("B3, B2") is reached, which gives the desired result (Sêrro, 2007). Figure 7 shows the implementation of Booth's algorithm.

Fig. 7: Flowchart of the implementation of multiplication using Booth's algorithm.

3. RESULTS

To compare all these solutions, each system was compiled and a co-simulation block was created. This makes it possible to simulate the system's behaviour when it is inserted into the FPGA. The co-simulation block was created for the Virtex 5 XC5VFX130T-1FF1738 FPGA. Table 2 presents the resources used by all the solutions presented.

Table 2 shows that, among the solutions which use the polynomial interpolator, the use of DSPs saves the other resources of the FPGA. The implementation that uses the ROM represents each stored hyperbolic tangent value with 32 bits. The implementation where the multiplication is done with Booth's algorithm is the solution that uses the most resources. On the other hand, neither this alternative nor 2's complement requires DSPs, and 2's complement uses fewer resources than Booth's algorithm. So, if the intention is to save DSPs, it is better to use the 2's complement multiplication. The solution which uses the ROM saves more logic resources than any other solution, although its error is higher and it uses more memory.

Figure 8 shows the system created to do the test with the co-simulation block.
To test the implementations, an input consisting of 1200 values ranging between -6 and 6 was used. Figures 9 and 10 show the error between the hyperbolic tangent values and the response of each system. Note that Figure 9 is the same for all three solutions using the polynomial interpolation. Comparing Figures 9 and 10, it can be seen that the system which uses the ROM presents an error 10 times higher than the system which uses the polynomial interpolator.

Fig. 8: Test system for co-simulation block.

Fig. 9: Error between the values of the hyperbolic tangent and the system response using the polynomial interpolation.

Fig. 10: Error between the values of hyperbolic tangent and the system response using ROM
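The test procedure above (1200 input values between -6 and 6, compared against the true hyperbolic tangent) can be mimicked in software. The sketch below applies it to a simple model of the ROM solution. The ROM depth of 1024 entries and the floor-based address mapping are assumptions made only for illustration, since the number of stored values and the address transformation are not given in the text; the names `tanh_rom` and `error_stats` are hypothetical. The resulting numbers are therefore not expected to reproduce Figures 9 and 10.

```python
import math

ROM_DEPTH = 1024   # hypothetical depth; the text does not give the number of entries
X_MAX = 6.0        # stored range of the input, as in the text
FRAC_BITS = 30     # 32-bit words with 30 fractional bits, as in the text

# Unsigned fixed-point ROM contents for inputs in [0, X_MAX).
ROM = [round(math.tanh(i * X_MAX / ROM_DEPTH) * (1 << FRAC_BITS))
       for i in range(ROM_DEPTH)]


def tanh_rom(x):
    """Look up tanh(|x|) in the ROM model and restore the sign (odd symmetry)."""
    sign = -1.0 if x < 0 else 1.0
    mag = abs(x)
    if mag >= X_MAX:
        return sign                                      # saturated region
    addr = min(int(mag * ROM_DEPTH / X_MAX), ROM_DEPTH - 1)  # input -> address
    return sign * ROM[addr] / (1 << FRAC_BITS)


def error_stats(approx, n=1200, lo=-6.0, hi=6.0):
    """Return (maximum absolute error, mean square error) over n equally spaced inputs."""
    xs = [lo + (hi - lo) * k / (n - 1) for k in range(n)]
    errs = [approx(x) - math.tanh(x) for x in xs]
    mse = sum(e * e for e in errs) / n
    return max(abs(e) for e in errs), mse


max_err, mse = error_stats(tanh_rom)
```

Any other approximation with the same call signature can be passed to `error_stats` to compare it on the same input sweep.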

Table 2: The resources of FPGA used for each system (Design Summary of Hyperbolic Tangent)

Resource | ROM | Polynomial, multiplication with DSP | Polynomial, multiplication with 2's complement | Polynomial, multiplication with Booth's algorithm
--- | --- | --- | --- | ---
Slice Logic Utilization | | | |
Number of Slice Registers | 642 out of 81,920 (1%) | 424 out of 81,920 (1%) | 424 out of 81,920 (1%) | 424 out of 81,920 (1%)
Number used as Flip Flops | 641 | 424 | 424 | 424
Number used as Latch-thrus | 1 | | |
Number of Slice LUTs | 739 out of 81,920 (1%) | 1,003 out of 81,920 (1%) | 2,558 out of 81,920 (3%) | 5,444 out of 81,920 (6%)
Number used as logic | 637 out of 81,920 (1%) | 996 out of 81,920 (1%) | 2,551 out of 81,920 (3%) | 5,437 out of 81,920 (6%)
Number using O6 output only (logic) | 404 | 670 | 2,220 | 5,063
Number using O5 output only (logic) | 146 | 86 | 87 | 87
Number using O5 and O6 (logic) | 87 | 240 | 244 | 287
Number used as Memory | 92 out of 25,280 (1%) | | |
Number using O6 output only (memory) | 90 | | |
Number used as exclusive route-thru | | | |
Number of route-thrus | | | |
Number using O6 output only (route-thrus) | 156 | 93 | 93 | 93
Number using O5 and O6 (route-thrus) | 1 | | |
Slice Logic Distribution | | | |
Number of occupied Slices | 303 out of 20,480 (1%) | 419 out of 20,480 (2%) | 766 out of 20,480 (3%) | 1,730 out of 20,480 (8%)
Number of LUT Flip Flop pairs used | 962 | 1,158 | 2,685 | 5,591
Number with an unused Flip Flop | 320 out of 962 (33%) | 734 out of 1,158 (63%) | 2,261 out of 2,685 (84%) | 5,167 out of 5,591 (92%)
Number with an unused LUT | 223 out of 962 (23%) | 155 out of 1,158 (13%) | 127 out of 2,685 (4%) | 147 out of 5,591 (2%)
Number of fully used LUT-FF pairs | 419 out of 962 (43%) | 269 out of 1,158 (23%) | 297 out of 2,685 (11%) | 277 out of 5,591 (4%)
Number of unique control sets | | | |
Number of slice register sites lost to control set restrictions | 43 out of 81,920 (1%) | 32 out of 81,920 (1%) | 32 out of 81,920 (1%) | 32 out of 81,920 (1%)
IO Utilization | | | |
Number of bonded IOBs | 1 out of 840 (1%) | 1 out of 840 (1%) | 1 out of 840 (1%) | 1 out of 840 (1%)
Specific Feature Utilization | | | |
Number of BlockRAM/FIFO | 12 out of 298 (4%) | 2 out of 298 (1%) | 2 out of 298 (1%) | 2 out of 298 (1%)
Number of 36k BlockRAM used | 9 | 2 | 2 | 2
Number of 18k BlockRAM used | 5 | | |
Total Memory used (KB) | 414 out of 10,728 (3%) | 72 out of 10,728 (1%) | 72 out of 10,728 (1%) | 72 out of 10,728 (1%)
Number of BUFG/BUFGCTRLs | 4 out of 32 (9%) | 3 out of 32 (9%) | 3 out of 32 (9%) | 3 out of 32 (9%)
Number used as BUFGCTRLs | 2 | 1 | 1 | 1
Number of BSCANs | 1 out of 4 (25%) | 1 out of 4 (25%) | 1 out of 4 (25%) | 1 out of 4 (25%)
Number of DSP48Es | 12 out of 320 (3%) | 12 out of 320 (3%) | -- | --

4. CONCLUSION

This paper presents different implementations of the hyperbolic tangent, where the primary aim is to find the solution that saves resources while giving the best performance. The first implementation stores the values of the hyperbolic tangent in a ROM. This block works like a lookup table because a memory address must be supplied to get the value of the function. This implementation saves more resources than the second implementation (using an interpolation polynomial). However, it has the disadvantage of a higher MSE.

The second implementation uses the Chebyshev interpolation. For the specified error, 25 polynomials of third order were needed. To implement the polynomials in the FPGA, adders and multipliers are needed. The multipliers can be implemented using DSPs, but the multiplication can also be done without them. Two such solutions were tested: Booth's algorithm and 2's complement. The purpose of implementing three different ways of performing a multiplication is to find the best balance between performance and resource use. After analyzing the implementations it can be stated that using DSPs frees other resources, and that 2's complement is less resource demanding than Booth's algorithm, while neither of these two solutions uses DSPs.

The reader may therefore ask: among the three implementations using the interpolation polynomial, which is the most effective in which case? One plausible answer can be found by analyzing the structure of the intended ANN. If the ANN has many inputs or many neurons in the hidden layer, it is advisable to use the 2's complement algorithm. This implementation saves DSPs, which will be needed for the sum of products between the weights and the inputs. Otherwise, it is better to use the multiplication implemented with DSPs.

5. ACKNOWLEDGMENTS

The authors would like to acknowledge the Portuguese Foundation for Science and Technology for their support for this work through project PEst-OE/EEI/LA0009/

REFERENCES

Morgado Dias, F. (2010). Sistemas Digitais Princípios e Prática. Primeira Edição. FCA.

Sêrro, C. (2007). Multiplicação e Divisão inteira. Accessed: 6th March. [Online] Aula05_MultDiv.pdf

Stieglitz, T. and Meyer, J. (2006). Biomedical Microdevices for Neural Implants, BIOMEMS Microsystems. Vol. 16, pp.

Morgado-Dias, F., Antunes, A. and Mota, A. (2004). Artificial neural networks: a review of commercial hardware, in Engineering Applications of Artificial Intelligence, Vol. 17, pp.

Pinto, J. O. P., Bose, B. K., Leite, L. C., Silva, L. E. B. and Romero, M. E. (2006). Field Programmable Gate Array (FPGA) Based Neural Network Implementation of Stator Flux Oriented Vector Control of Induction Motor Drive, in IEEE International Conference on Industrial Technology.

Ferreira, P., Ribeiro, P., Antunes, A. and Morgado Dias, F. (2007). A high bit resolution FPGA implementation of a FNN with a new algorithm for the activation function, in Neurocomputing, Vol. 71, pp.

Reis, L., Aguiar, L., Baptista, D. and Morgado-Dias, F. (2011). ANGE Automatic Neural Generator, International Conference on Artificial Neural Networks, Espoo, Finland.

ANGE Automatic Neural Generator

ANGE Automatic Neural Generator ANGE Automatic Neural Generator Leonardo Reis 1, Luis Aguiar 1, Darío Baptista 1, Fernando Morgado Dias 1 1 Centro de Competências de Ciências Exactas e da Engenharia, Universidade da Madeira Campus penteada,

More information

NEURON IMPLEMENTATION USING SYSTEM GENERATOR

NEURON IMPLEMENTATION USING SYSTEM GENERATOR NEURON IMPLEMENTATION USING SYSTEM GENERATOR Luis Aguiar,2, Leonardo Reis,2, Fernando Morgado Dias,2 Centro de Competências de Ciências Exactas e da Engenharia, Universidade da Madeira Campus da Penteada,

More information

HARDWARE IMPLEMENTATION OF AN ARTIFICIAL NEURAL NETWORK WITH AN EMBEDDED MICROPROCESSOR IN A FPGA

HARDWARE IMPLEMENTATION OF AN ARTIFICIAL NEURAL NETWORK WITH AN EMBEDDED MICROPROCESSOR IN A FPGA HARDWARE IMPLEMENTATION OF AN ARTIFICIAL NEURAL NETWORK WITH AN EMBEDDED MICROPROCESSOR IN A FPGA Gisnara Rodrigues Hoelzle 2 and Fernando Morgado Dias 1,2 1 Centro de Ciências Matemáticas CCM, Universidade

More information

DUAL-PROCESSOR NEURAL NETWORK IMPLEMENTATION IN FPGA. Diego Santos, Henrique Nunes, Fernando Morgado-Dias

DUAL-PROCESSOR NEURAL NETWORK IMPLEMENTATION IN FPGA. Diego Santos, Henrique Nunes, Fernando Morgado-Dias DUAL-PROCESSOR NEURAL NETWORK IMPLEMENTATION IN FPGA Diego Santos, Henrique Nunes, Fernando Morgado-Dias Madeira Interactive Technologies Institute and Centro de Competências de Ciências Exactas e da Engenharia,

More information

A Matlab Tool for Analyzing and Improving Fault Tolerance of Artificial Neural Networks

A Matlab Tool for Analyzing and Improving Fault Tolerance of Artificial Neural Networks A Matlab Tool for Analyzing and Improving Fault Tolerance of Artificial Neural Networks Rui Borralho*. Pedro Fontes*. Ana Antunes*. Fernando Morgado Dias**. *Escola Superior de Tecnologia de Setúbal do

More information

COMPARING DIFFERENT IMPLEMENTATIONS FOR THE LEVENBERG-MARQUARDT ALGORITHM. Darío Baptista, Fernando Morgado-Dias

COMPARING DIFFERENT IMPLEMENTATIONS FOR THE LEVENBERG-MARQUARDT ALGORITHM. Darío Baptista, Fernando Morgado-Dias COMPARIG DIFFERE IMPLEMEAIOS FOR HE LEVEBERG-MARQUARD ALGORIHM Darío Baptista, Fernando Morgado-Dias Madeira Interactive echnologies Institute and Centro de Competências de Ciências Eactas e da Engenharia,

More information

KARNUMA AN EDUCATIONAL TOOL FOR DIGITAL SYSTEMS. Valentim Freitas and Fernando Morgado-Dias

KARNUMA AN EDUCATIONAL TOOL FOR DIGITAL SYSTEMS. Valentim Freitas and Fernando Morgado-Dias KARNUMA AN EDUCATIONAL TOOL FOR DIGITAL SYSTEMS Valentim Freitas and Fernando Morgado-Dias Madeira Interactive Technologies Institute and Centro de Competências de Ciências Exactas e da Engenharia, Universidade

More information

Design and Simulation of Pipelined Double Precision Floating Point Adder/Subtractor and Multiplier Using Verilog

Design and Simulation of Pipelined Double Precision Floating Point Adder/Subtractor and Multiplier Using Verilog Design and Simulation of Pipelined Double Precision Floating Point Adder/Subtractor and Multiplier Using Verilog Onkar Singh (1) Kanika Sharma (2) Dept. ECE, Arni University, HP (1) Dept. ECE, NITTTR Chandigarh

More information

Double Precision Floating-Point Multiplier using Coarse-Grain Units

Double Precision Floating-Point Multiplier using Coarse-Grain Units Double Precision Floating-Point Multiplier using Coarse-Grain Units Rui Duarte INESC-ID/IST/UTL. rduarte@prosys.inesc-id.pt Mário Véstias INESC-ID/ISEL/IPL. mvestias@deetc.isel.ipl.pt Horácio Neto INESC-ID/IST/UTL

More information

MONITORING AND MANAGEMENT OF ENERGY IN PENTEADA BUILDING A CASE STUDY

MONITORING AND MANAGEMENT OF ENERGY IN PENTEADA BUILDING A CASE STUDY MONITORING AND MANAGEMENT OF ENERGY IN PENTEADA BUILDING A CASE STUDY Adriano Lopes 1, António Carvalho 1, Filipe Oliveira 2 and Fernando Morgado-Dias 1 1 Madeira Interactive Technologies Institute and

More information

Introduction to Field Programmable Gate Arrays

Introduction to Field Programmable Gate Arrays Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden, 31 May 9 June 2007 Javier Serrano, CERN AB-CO-HT Outline Digital Signal

More information

ECE 341 Midterm Exam

ECE 341 Midterm Exam ECE 341 Midterm Exam Time allowed: 90 minutes Total Points: 75 Points Scored: Name: Problem No. 1 (10 points) For each of the following statements, indicate whether the statement is TRUE or FALSE: (a)

More information

International Journal of Advanced Research in Electrical, Electronics and Instrumentation Engineering

International Journal of Advanced Research in Electrical, Electronics and Instrumentation Engineering An Efficient Implementation of Double Precision Floating Point Multiplier Using Booth Algorithm Pallavi Ramteke 1, Dr. N. N. Mhala 2, Prof. P. R. Lakhe M.Tech [IV Sem], Dept. of Comm. Engg., S.D.C.E, [Selukate],

More information

A SIMULINK-TO-FPGA MULTI-RATE HIERARCHICAL FIR FILTER DESIGN

A SIMULINK-TO-FPGA MULTI-RATE HIERARCHICAL FIR FILTER DESIGN A SIMULINK-TO-FPGA MULTI-RATE HIERARCHICAL FIR FILTER DESIGN Xiaoying Li 1 Fuming Sun 2 Enhua Wu 1, 3 1 University of Macau, Macao, China 2 University of Science and Technology Beijing, Beijing, China

More information

DIGITAL IMPLEMENTATION OF THE SIGMOID FUNCTION FOR FPGA CIRCUITS

DIGITAL IMPLEMENTATION OF THE SIGMOID FUNCTION FOR FPGA CIRCUITS DIGITAL IMPLEMENTATION OF THE SIGMOID FUNCTION FOR FPGA CIRCUITS Alin TISAN, Stefan ONIGA, Daniel MIC, Attila BUCHMAN Electronic and Computer Department, North University of Baia Mare, Baia Mare, Romania

More information

COMPUTER ARCHITECTURE AND ORGANIZATION. Operation Add Magnitudes Subtract Magnitudes (+A) + ( B) + (A B) (B A) + (A B)

COMPUTER ARCHITECTURE AND ORGANIZATION. Operation Add Magnitudes Subtract Magnitudes (+A) + ( B) + (A B) (B A) + (A B) Computer Arithmetic Data is manipulated by using the arithmetic instructions in digital computers. Data is manipulated to produce results necessary to give solution for the computation problems. The Addition,

More information

An Efficient Hardwired Realization of Embedded Neural Controller on System-On-Programmable-Chip (SOPC)

An Efficient Hardwired Realization of Embedded Neural Controller on System-On-Programmable-Chip (SOPC) An Efficient Hardwired Realization of Embedded Neural Controller on System-On-Programmable-Chip (SOPC) Karan Goel, Uday Arun, A. K. Sinha ABES Engineering College, Department of Computer Science & Engineering,

More information

Design of Delay Efficient Distributed Arithmetic Based Split Radix FFT

Design of Delay Efficient Distributed Arithmetic Based Split Radix FFT Design of Delay Efficient Arithmetic Based Split Radix FFT Nisha Laguri #1, K. Anusudha *2 #1 M.Tech Student, Electronics, Department of Electronics Engineering, Pondicherry University, Puducherry, India

More information

AES Core Specification. Author: Homer Hsing

AES Core Specification. Author: Homer Hsing AES Core Specification Author: Homer Hsing homer.hsing@gmail.com Rev. 0.1.1 October 30, 2012 This page has been intentionally left blank. www.opencores.org Rev 0.1.1 ii Revision History Rev. Date Author

More information

IMPROVEMENTS TO THE BACKPROPAGATION ALGORITHM

IMPROVEMENTS TO THE BACKPROPAGATION ALGORITHM Annals of the University of Petroşani, Economics, 12(4), 2012, 185-192 185 IMPROVEMENTS TO THE BACKPROPAGATION ALGORITHM MIRCEA PETRINI * ABSTACT: This paper presents some simple techniques to improve

More information

DSP Resources. Main features: 1 adder-subtractor, 1 multiplier, 1 add/sub/logic ALU, 1 comparator, several pipeline stages

DSP Resources. Main features: 1 adder-subtractor, 1 multiplier, 1 add/sub/logic ALU, 1 comparator, several pipeline stages DSP Resources Specialized FPGA columns for complex arithmetic functionality DSP48 Tile: two DSP48 slices, interconnect Each DSP48 is a self-contained arithmeticlogical unit with add/sub/multiply/logic

More information

Hardware implementation of hyperbolic tangent and sigmoid activation functions

Hardware implementation of hyperbolic tangent and sigmoid activation functions BULLETIN OF THE POLISH ACADEMY OF SCIENCES TECHNICAL SCIENCES, Vol. 66, No. 5, 2018 DOI: 10.24425/124272 Hardware implementation of hyperbolic tangent and sigmoid activation functions Z. HAJDUK* Rzeszów

More information

(+A) + ( B) + (A B) (B A) + (A B) ( A) + (+ B) (A B) + (B A) + (A B) (+ A) (+ B) + (A - B) (B A) + (A B) ( A) ( B) (A B) + (B A) + (A B)

(+A) + ( B) + (A B) (B A) + (A B) ( A) + (+ B) (A B) + (B A) + (A B) (+ A) (+ B) + (A - B) (B A) + (A B) ( A) ( B) (A B) + (B A) + (A B) COMPUTER ARITHMETIC 1. Addition and Subtraction of Unsigned Numbers The direct method of subtraction taught in elementary schools uses the borrowconcept. In this method we borrow a 1 from a higher significant

More information

PROJECT REPORT IMPLEMENTATION OF LOGARITHM COMPUTATION DEVICE AS PART OF VLSI TOOLS COURSE

PROJECT REPORT IMPLEMENTATION OF LOGARITHM COMPUTATION DEVICE AS PART OF VLSI TOOLS COURSE PROJECT REPORT ON IMPLEMENTATION OF LOGARITHM COMPUTATION DEVICE AS PART OF VLSI TOOLS COURSE Project Guide Prof Ravindra Jayanti By Mukund UG3 (ECE) 200630022 Introduction The project was implemented

More information

Journal of Engineering Technology Volume 6, Special Issue on Technology Innovations and Applications Oct. 2017, PP

Journal of Engineering Technology Volume 6, Special Issue on Technology Innovations and Applications Oct. 2017, PP Oct. 07, PP. 00-05 Implementation of a digital neuron using system verilog Azhar Syed and Vilas H Gaidhane Department of Electrical and Electronics Engineering, BITS Pilani Dubai Campus, DIAC Dubai-345055,

More information

Implementation of CORDIC Algorithms in FPGA

Implementation of CORDIC Algorithms in FPGA Summer Project Report Implementation of CORDIC Algorithms in FPGA Sidharth Thomas Suyash Mahar under the guidance of Dr. Bishnu Prasad Das May 2017 Department of Electronics and Communication Engineering

More information

[Dixit*, 4.(9): September, 2015] ISSN: (I2OR), Publication Impact Factor: 3.785

[Dixit*, 4.(9): September, 2015] ISSN: (I2OR), Publication Impact Factor: 3.785 IJESRT INTERNATIONAL JOURNAL OF ENGINEERING SCIENCES & RESEARCH TECHNOLOGY REALIZATION OF CANNY EDGE DETECTION ALGORITHM USING FPGA S.R. Dixit*, Dr. A.Y.Deshmukh * Research scholar Department of Electronics

More information

ECE 30 Introduction to Computer Engineering

ECE 30 Introduction to Computer Engineering ECE 30 Introduction to Computer Engineering Study Problems, Set #6 Spring 2015 1. With x = 1111 1111 1111 1111 1011 0011 0101 0011 2 and y = 0000 0000 0000 0000 0000 0010 1101 0111 2 representing two s

More information

High Performance Architecture for Reciprocal Function Evaluation on Virtex II FPGA

High Performance Architecture for Reciprocal Function Evaluation on Virtex II FPGA EurAsia-ICT 00, Shiraz-Iran, 9-31 Oct. High Performance Architecture for Reciprocal Function Evaluation on Virtex II FPGA M. Anane, H. Bessalah and N. Anane Centre de Développement des Technologies Avancées

More information

HIGH-PERFORMANCE RECONFIGURABLE FIR FILTER USING PIPELINE TECHNIQUE

HIGH-PERFORMANCE RECONFIGURABLE FIR FILTER USING PIPELINE TECHNIQUE HIGH-PERFORMANCE RECONFIGURABLE FIR FILTER USING PIPELINE TECHNIQUE Anni Benitta.M #1 and Felcy Jeba Malar.M *2 1# Centre for excellence in VLSI Design, ECE, KCG College of Technology, Chennai, Tamilnadu

More information

Binary Adders. Ripple-Carry Adder

Binary Adders. Ripple-Carry Adder Ripple-Carry Adder Binary Adders x n y n x y x y c n FA c n - c 2 FA c FA c s n MSB position Longest delay (Critical-path delay): d c(n) = n d carry = 2n gate delays d s(n-) = (n-) d carry +d sum = 2n

More information

CS6303 COMPUTER ARCHITECTURE LESSION NOTES UNIT II ARITHMETIC OPERATIONS ALU In computing an arithmetic logic unit (ALU) is a digital circuit that performs arithmetic and logical operations. The ALU is

More information
