FINITE IMPULSE RESPONSE FILTER DESIGN ON DISTRIBUTED ARITHMETIC ARCHITECTURE MUHAMAD IQBAL BIN ABU ZAHARIN

Similar documents
MICRO-SEQUENCER BASED CONTROL UNIT DESIGN FOR A CENTRAL PROCESSING UNIT TAN CHANG HAI

HARDWARE AND SOFTWARE CO-SIMULATION PLATFORM FOR CONVOLUTION OR CORRELATION BASED IMAGE PROCESSING ALGORITHMS SAYED OMID AYAT

THE COMPARISON OF IMAGE MANIFOLD METHOD AND VOLUME ESTIMATION METHOD IN CONSTRUCTING 3D BRAIN TUMOR IMAGE

HARDWARE-ACCELERATED LOCALIZATION FOR AUTOMATED LICENSE PLATE RECOGNITION SYSTEM CHIN TECK LOONG UNIVERSITI TEKNOLOGI MALAYSIA

IMPROVED IMAGE COMPRESSION SCHEME USING HYBRID OF DISCRETE FOURIER, WAVELETS AND COSINE TRANSFORMATION MOH DALI MOUSTAFA ALSAYYH

HIGH SPEED SIX OPERANDS 16-BITS CARRY SAVE ADDER AWATIF BINTI HASHIM

BLOCK-BASED NEURAL NETWORK MAPPING ON GRAPHICS PROCESSOR UNIT ONG CHIN TONG UNIVERSITI TEKNOLOGI MALAYSIA

HARDWARE/SOFTWARE SYSTEM-ON-CHIP CO-VERIFICATION PLATFORM BASED ON LOGIC-BASED ENVIRONMENT FOR APPLICATION PROGRAMMING INTERFACING TEO HONG YAP

DESIGN AND IMPLEMENTATION OF A MUSIC BOX USING FPGA TAN KIAN YIAK

INTEGRATION OF CUBIC MOTION AND VEHICLE DYNAMIC FOR YAW TRAJECTORY MOHD FIRDAUS BIN MAT GHANI

ENHANCING SRAM PERFORMANCE OF COMMON GATE FINFET BY USING CONTROLLABLE INDEPENDENT DOUBLE GATES CHONG CHUNG KEONG UNIVERSITI TEKNOLOGI MALAYSIA

SUPERVISED MACHINE LEARNING APPROACH FOR DETECTION OF MALICIOUS EXECUTABLES YAHYE ABUKAR AHMED

OPTIMIZE PERCEPTUALITY OF DIGITAL IMAGE FROM ENCRYPTION BASED ON QUADTREE HUSSEIN A. HUSSEIN

COLOUR IMAGE WATERMARKING USING DISCRETE COSINE TRANSFORM AND TWO-LEVEL SINGULAR VALUE DECOMPOSITION BOKAN OMAR ALI

LOGICAL OPERATORS AND ITS APPLICATION IN DETERMINING VULNERABLE WEBSITES CAUSED BY SQL INJECTION AMONG UTM FACULTY WEBSITES NURUL FARIHA BINTI MOKHTER

ADAPTIVE LOOK-AHEAD ROUTING FOR LOW LATENCY NETWORK ON-CHIP NADERA NAJIB QAID AL AREQI UNIVERSITI TEKNOLOGI MALAYSIA

AUTOMATIC APPLICATION PROGRAMMING INTERFACE FOR MULTI HOP WIRELESS FIDELITY WIRELESS SENSOR NETWORK

IMPLEMENTATION OF UNMANNED AERIAL VEHICLE MOVING OBJECT DETECTION ALGORITHM ON INTEL ATOM EMBEDDED SYSTEM

THREE BIT SUBTRACTION CIRCUIT VIA FIELD PROGRAMMABLE GATE ARRAY (FPGA) NOORAISYAH BINTI ARASID B

ISOGEOMETRIC ANALYSIS OF PLANE STRESS STRUCTURE CHUM ZHI XIAN

Signature :.~... Name of supervisor :.. ~NA.lf... l.?.~mk.. :... 4./qD F. Universiti Teknikal Malaysia Melaka

A LEVY FLIGHT PARTICLE SWARM OPTIMIZER FOR MACHINING PERFORMANCES OPTIMIZATION ANIS FARHAN BINTI KAMARUZAMAN UNIVERSITI TEKNOLOGI MALAYSIA

SLANTING EDGE METHOD FOR MODULATION TRANSFER FUNCTION COMPUTATION OF X-RAY SYSTEM FARHANK SABER BRAIM UNIVERSITI TEKNOLOGI MALAYSIA

PERFOMANCE ANALYSIS OF SEAMLESS VERTICAL HANDOVER IN 4G NETWOKS MOHAMED ABDINUR SAHAL

FPGA Implementation of 16-Point Radix-4 Complex FFT Core Using NEDA

AMBA AXI BUS TO NETWORK-ON-CHIP BRIDGE NG KENG YOKE UNIVERSITI TEKNOLOGI MALAYSIA

MAGNETIC FLUX LEAKAGE SYSTEM FOR WIRE ROPE INSPECTION USING BLUETOOTH COMMUNICATION MUHAMMAD MAHFUZ BIN SALEHHON UNIVERSITI TEKNOLOGI MALAYSIA

STUDY OF FLOATING BODIES IN WAVE BY USING SMOOTHED PARTICLE HYDRODYNAMICS (SPH) HA CHEUN YUEN UNIVERSITI TEKNOLOGI MALAYSIA

SECURE-SPIN WITH HASHING TO SUPPORT MOBILITY AND SECURITY IN WIRELESS SENSOR NETWORK MOHAMMAD HOSSEIN AMRI UNIVERSITI TEKNOLOGI MALAYSIA

SOLUTION AND INTERPOLATION OF ONE-DIMENSIONAL HEAT EQUATION BY USING CRANK-NICOLSON, CUBIC SPLINE AND CUBIC B-SPLINE WAN KHADIJAH BINTI WAN SULAIMAN

DETECTION OF WORMHOLE ATTACK IN MOBILE AD-HOC NETWORKS MOJTABA GHANAATPISHEH SANAEI

MODELLING AND REASONING OF LARGE SCALE FUZZY PETRI NET USING INFERENCE PATH AND BIDIRECTIONAL METHODS ZHOU KAIQING

OPTIMIZED BURST ASSEMBLY ALGORITHM FOR MULTI-RANKED TRAFFIC OVER OPTICAL BURST SWITCHING NETWORK OLA MAALI MOUSTAFA AHMED SAIFELDEEN

AN IMPROVED PACKET FORWARDING APPROACH FOR SOURCE LOCATION PRIVACY IN WIRELESS SENSORS NETWORK MOHAMMAD ALI NASSIRI ABRISHAMCHI

PRIVACY FRIENDLY DETECTION TECHNIQUE OF SYBIL ATTACK IN VEHICULAR AD HOC NETWORK (VANET) SEYED MOHAMMAD CHERAGHI

ENHANCING TIME-STAMPING TECHNIQUE BY IMPLEMENTING MEDIA ACCESS CONTROL ADDRESS PACU PUTRA SUARLI

LOCALIZING NON-IDEAL IRISES VIA CHAN-VESE MODEL AND VARIATIONAL LEVEL SET OF ACTIVE CONTOURS WITHTOUT RE- INITIALIZATION QADIR KAMAL MOHAMMED ALI

MULTICHANNEL ORTHOGONAL FREQUENCY DIVISION MULTIPLEXING -ROF FOR WIRELESS ACCESS NETWORK MOHD JIMMY BIN ISMAIL

A HIGH PERFORMANCE FIR FILTER ARCHITECTURE FOR FIXED AND RECONFIGURABLE APPLICATIONS

SEMANTICS ORIENTED APPROACH FOR IMAGE RETRIEVAL IN LOW COMPLEX SCENES WANG HUI HUI

DYNAMIC TIMESLOT ALLOCATION TECHNIQUE FOR WIRELESS SENSOR NETWORK OON ERIXNO

The Efficient Implementation of Numerical Integration for FPGA Platforms

SOFTWARE-BASED SELF-TESTING FOR A RISC PROCESSOR TEH WEE MENG UNIVERSITI TEKNOLOGI MALAYSIA

A NEW STEGANOGRAPHY TECHNIQUE USING MAGIC SQUARE MATRIX AND AFFINE CIPHER WALEED S. HASAN AL-HASAN UNIVERSITI TEKNOLOGI MALAYSIA

On Designs of Radix Converters Using Arithmetic Decompositions

ADAPTIVE ONLINE FAULT DETECTION ON NETWORK-ON-CHIP BASED ON PACKET LOGGING MECHANISM LOO LING KIM UNIVERSITI TEKNOLOGI MALAYSIA

DYNAMIC MOBILE SERVER FOR LIVE CASTING APPLICATIONS MUHAMMAD SAZALI BIN HISHAM UNIVERSITI TEKNOLOGI MALAYSIA

Power and Area Efficient Implementation for Parallel FIR Filters Using FFAs and DA

RECOGNITION OF PARTIALLY OCCLUDED OBJECTS IN 2D IMAGES ALMUASHI MOHAMMED ALI UNIVERSITI TEKNOLOGI MALAYSIA

This item is protected by original copyright

A Novel Approach of Area-Efficient FIR Filter Design Using Distributed Arithmetic with Decomposed LUT

DEVELOPMENT OF PESONA RISC MICROPROCESSOR ARCHITECTURE IN FPGA MOHD FAHMIR ADZRAN BIN RAMLEE

INTERNATIONAL JOURNAL OF PROFESSIONAL ENGINEERING STUDIES Volume 10 /Issue 1 / JUN 2018

LINK QUALITY AWARE ROUTING ALGORITHM IN MOBILE WIRELESS SENSOR NETWORKS RIBWAR BAKHTYAR IBRAHIM UNIVERSITI TEKNOLOGI MALAYSIA

DUE to the high computational complexity and real-time

DESIGN AND IMPLEMENTATION OF DA- BASED RECONFIGURABLE FIR DIGITAL FILTER USING VERILOGHDL

A SIMULINK-TO-FPGA MULTI-RATE HIERARCHICAL FIR FILTER DESIGN

SPEED ENHANCEMENT ON A MATRIX INVERSION HARDWARE ARCHITECTURE BASED ON GAUSS-JORDAN ELIMINATION OH ENG WEI UNIVERSITI TEKNOLOGI MALAYSIA

Two High Performance Adaptive Filter Implementation Schemes Using Distributed Arithmetic

A Novel Distributed Arithmetic Multiplierless Approach for Computing Complex Inner Products

ENHANCEMENT OF UML-BASED WEB ENGINEERING FOR METAMODELS: HOMEPAGE DEVELOPMENT CASESTUDY KARZAN WAKIL SAID

PART A SULIT (EKT 221) BAHAGIAN A. Answer ALL questions. Question 1. a) Briefly explain the concept of Clock Gating.

HERMAN. A thesis submitted in fulfilment of the requirements for the award of the degree of Doctor of Philosophy (Computer Science)

IMPROVED IMPLEMENTATION OF DIGITAL WATERMARKING TECHNIQUES AHMED SABEEH YOUSIF UNIVERSITI TEKNOLOGI MALAYSIA

ANOMALY DETECTION IN WIRELESS SENSOR NETWORK (WSN) LAU WAI FAN

ONTOLOGY-BASED SEMANTIC HETEROGENEOUS DATA INTEGRATION FRAMEWORK FOR LEARNING ENVIRONMENT

ARM PROCESSOR EMULATOR MOHAMAD HASRUZAIRIN B MOHD HASHIM

RGB COLOR IMAGE WATERMARKING USING DISCRETE WAVELET TRANSFORM DWT TECHNIQUE AND 4-BITS PLAN BY HISTOGRAM STRETCHING KARRAR ABDUL AMEER KADHIM

Adaptive FIR Filter Using Distributed Airthmetic for Area Efficient Design

FINGERPRINT DATABASE NUR AMIRA BINTI ARIFFIN THESIS SUBMITTED IN FULFILMENT OF THE DEGREE OF COMPUTER SCIENCE (COMPUTER SYSTEM AND NETWORKING)

Fault Tolerant Parallel Filters Based On Bch Codes

IMPLEMENTATION AND PERFORMANCE ANALYSIS OF IDENTITY- BASED AUTHENTICATION IN WIRELESS SENSOR NETWORKS MIR ALI REZAZADEH BAEE

Fast Block LMS Adaptive Filter Using DA Technique for High Performance in FGPA

AN ENHANCED SIMULATED ANNEALING APPROACH FOR CYLINDRICAL, RECTANGULAR MESH, AND SEMI-DIAGONAL TORUS NETWORK TOPOLOGIES NORAZIAH BINTI ADZHAR

Verilog for High Performance

Three-D DWT of Efficient Architecture

ENHANCING WEB SERVICE SELECTION USING ENHANCED FILTERING MODEL AJAO, TAJUDEEN ADEYEMI

DEVELOPMENT OF SPAKE S MAINTENANCE MODULE FOR MINISTRY OF DEFENCE MALAYSIA SYED ARDI BIN SYED YAHYA KAMAL UNIVERSITI TEKNOLOGI MALAYSIA

FAST FOURIER TRANSFORM MODULE FOR IMPLEMENTATION IN NIOS II EMBEDDED PROCESSOR HISHAM AZHARI AHMED OSMAN UNIVERSITI TEKNOLOGI MALAYSIA

Signature : IHSAN BIN AHMAD ZUBIR. Date : 30 November 2007

UNIVERSITI PUTRA MALAYSIA DIGITAL SIGNAL PROCESSOR (DSP) DESIGN USING VERY LONG INSTRUCTION WORD (VLIW) ARCHITECTURE LEE LINI LEE FK

SYSTEM VERILOG RTL MODELING WITH EMBEDDED ASSERTIONS CHOW CHEE SIANG UNIVERSITI TEKNOLOGI MALAYSIA

VERILOG DESIGN OF INPUT/OUTPUT PROCESSOR WITH BUILT-IN-SELF-TEST GOH KENG HOO UNIVERSITI TEKNOLOGI MALAYSIA

An Efficient Implementation of Fixed-Point LMS Adaptive Filter Using Verilog HDL

DATASET GENERATION AND NETWORK INTRUSION DETECTION BASED ON FLOW-LEVEL INFORMATION AHMED ABDALLA MOHAMEDALI ABDALLA

VLSI Implementation of Low Power Area Efficient FIR Digital Filter Structures Shaila Khan 1 Uma Sharma 2

Design of Delay Efficient Distributed Arithmetic Based Split Radix FFT

CAMERA CALIBRATION FOR UNMANNED AERIAL VEHICLE MAPPING AHMAD RAZALI BIN YUSOFF

HIGH-PERFORMANCE RECONFIGURABLE FIR FILTER USING PIPELINE TECHNIQUE

INTRODUCTION TO CATAPULT C

CHAPTER - 2 : DESIGN OF ARITHMETIC CIRCUITS

MAC PROTOCOL FOR WIRELESS COGNITIVE NETWORK FARAH NAJWA BINTI MOKHTAR

UNIVERSITI PUTRA MALAYSIA METHODOLOGY OF FUZZY-BASED TUNING FOR SLIDING MODE CONTROLLER

BORANG PENGESAHAN STATUS TESIS

MCM Based FIR Filter Architecture for High Performance

Faculty of Computer Science and Information Technology

FPGA Implementation of Multiplierless 2D DWT Architecture for Image Compression

Implementation of Low Power High Speed 32 bit ALU using FPGA

Design and Implementation of VLSI 8 Bit Systolic Array Multiplier

COMBINING TABLES. Akademi Audit Negara. CAATs ASAS ACL / 1

Transcription:

FINITE IMPULSE RESPONSE FILTER DESIGN ON DISTRIBUTED ARITHMETIC ARCHITECTURE MUHAMAD IQBAL BIN ABU ZAHARIN A project report submitted in partial fulfilment of the requirement for the award of the degree of Master of Engineering (Electrical Electronics & Telecommunications) Faculty of Electrical Engineering Universiti Teknologi Malaysia JUNE 2012

iii ACKNOWLEDGEMENT In the name of Allah, the Most Beneficent and Merciful, with the deepest gratefulness to Allah who has given me the strength and ability to complete this project report. First and foremost, I wish to express my sincere appreciation to my project supervisor, Professor Dr. Mohamed Khalil Hani, for encouragement, guidance and support. I would also like to thank Puan Nordinah Ismail for her helpful suggestions and ideas. Their knowledge and support throughout this project has made this work possible. I would also like to thank my family member for their advice, prayer, encouragement and continuous moral support for the completion of this project report. Last but not least, thank you to my lecturers, friends and supporting staffs who has helped me to complete this thesis in various ways.

iv ABSTRACT In signal and image processing application, highly repetitive operations and intense multiplication computation that exist in Digital Signal Processing systems makes it challenging to achieve less hardware requirement (area) and less latency (speed) performance using software based platforms, thus a lot of hardware design architecture becomes available to fill the gap. A close examination of the algorithms used in these, and related, applications reveals that many of these fundamental actions involve calculation of sum of products, vector dot product, inner product or multiply and accumulate (MAC). A study on the FIR filter involves multiply and accumulates operation where inner product forms the basis of the algorithms of the core. It is the aim of this paper to develop efficient architecture of a FIR filter on Distributed Arithmetic (DA) through ROM based ideally suited for efficient computation in order to achieve better performances in terms of speed and area by providing an effective methodology to implement MAC using a simple combination of memory elements, adders and shifters instead of lumped multipliers. The design is an 8-tap low pass filter based on 16-bit input samples and 16-bit signed coefficients at sampling frequency of 16MHz. All the coefficients are stored in the ROM 8x16 words size. The behavioural model of the FIR filter is structured in Verilog code and synthesized using Altera QuartusII version 11. The timing simulation is verified by running testbench codes written in Verilog using ModelSim 6.6d. The results from the simulation show that the FIR filter can be implemented using Distributed Architecture rather than using lumped multipliers.

v ABSTRAK Di dalam applikasi isyarat dan imej pemprosesan, operasi penggandaan yang terlalu mendalam dan berulang yang wujud di dalam sistem Pemprosesan Isyarat Digital adalah mencabar terutamanya dalam memenuhi syarat kurang perkakasan dan mempercepatkan kelajuan pemprosesan dalam melaksanakan pemprosesan melalui perisian menyebabkan wujudnya banyak rekacipta senibina perkakasan baru dalam memenuhi kekurangan tersebut. Penyelidikan yang terperinci terhadap algorithma yang digunakan banyak menjurus kepada asas hasil tambah penggandaan, penggandaan vektor dan hasil darab dan penambahan. Pengkajian terhadap finite impulse response filter (FIR) banyak melibatkan kepada asas hasil darab dan penambahan di mana penggandaaan vektor menjadi asas kepada algorithma teras. Kertas ini bertujuan untuk membangunkan rekabentuk filter menggunakan Distributed Arithmetic (DA) melalui pengenalan memori ROM untuk pengiraan yang lebih berkesan dalam memenuhi keputusan yang lebih baik dari segi kelajuan pemprosesan dan jumlah perkakasan yang digunakan dengan menggunakan kaedah hasil darab dan penambahan yang lebih berkesan dengan menggunakan kombinasi memori, penambah dan pengalih dari menggunakan longgokan pendarab. Rekacipta ini adalah 8-tap low pass filter yang mempunyai 16-bit sampel input dan 16-bit pekali pada kadar frekuensi sampel 16MHz. Kesemua pekali disimpan di dalam memori ROM bersaiz 8x16. Model pemprosesan filter dibina dengan menggunakan kod Verilog dan disintesis menggunakan perisian Altera Quartus versi11. Simulasi pemasaan diperiksa menggunakan kod ditulis menggunakan kod Verilog dan perisian Modelsim 6.6d. Keputusan ujian menunjukkan FIR filter boleh direkabentuk menggunakan Distributed Arithmetic Architecture berbanding menggunakan longgokan pendarab.

vi TABLE OF CONTENTS CHAPTER TITLE PAGE DECLARATION ii ACKNOWLEDGEMENT iii ABSTRACT iv ABSTRAK v TABLE OF CONTENTS vi LIST OF TABLES ix LIST OF FIGURES x LIST OF ABBREVIATIONS xii 1 INTRODUCTION 1.1 Motivation and Rational of the Work 1 1.2 Objectives 2 1.3 Scope of Work 3 1.4 Organization of Project Report 3 2 BACKGROUND OF DISTRIBUTED ARITHMETIC 2.1 Distributed Arithmetic 5 2.2 Mathematical Background 6 2.3 Algorithm Test 8 2.4 DA Memory Construction 8 2.5 DA Memory Organization 10 3 RESEARCH METHODOLOGY 3.1 Distributed Arithmetic Transformation 14

vii 3.2 FIR Filter Design Procedure 17 3.2.1 Filter Specification 17 3.2.2 Coefficient Calculation 18 3.2.3 Realization 18 3.2.4 Implementation 18 3.3 RTL Design Flow 19 3.3.1 Algorithmic Modeling 20 3.3.2 RTL Modeling 20 3.3.3 Datapath (DU) and Control Unit (CU) Design 20 3.3.4 Logic Synthesis 21 3.3.5 Verification 21 4 FIR FILTER DESIGN AND MATLAB ANALYSIS 4.1 Introduction to DA based Finite Impulse Response Filter 22 4.2 Why Implement FIR Filter using DA? 23 4.3 Filter Design and Analysis using MATLAB 24 4.4 Impulse Response 25 4.5 Pole Zero Plot 26 4.6 Realization and Implementation 27 5 RTL DESIGN OF FIR FILTER 5.1 RTL Design of FIR Filter 29 5.2 Design Specification 29 5.3 Algorithm Modelling 30 5.4 RTL Modelling 31 5.5 Datapath Unit Design 33 5.6 Control Unit Design 34 5.7 Design of Complete FIR Filter 35 5.8 Simulation by Components 36

viii 5.8.1 ROM LUT 36 5.8.2 REGR1 Shift Right 36 5.8.3 PSC Load and Shift Register 37 5.8.4 PSC and ROM Integration 38 5.8.5 ALU 39 5.8.6 Accumulator Register C 39 5.9 Simulation Results of DU, CU, Top Module 40 5.9.1 Datapath Unit 40 5.9.2 Control Unit 41 5.9.3 Top Module of FIR 41 5.10 Simulation Results Comparison between MATLAB and FPGA 42 6 FUTURE WORK AND CONLUSION 6.1 DA Limitation 43 6.2 Conclusion 43 REFERENCES 45 APPENDIX 46-72

ix LIST OF TABLES TABLE NO. TITLE PAGE 2.1 ROM LUT 9 2.2 ROM 0 for I = 0, row 0 of matrix A 12 2.3 ROM 1 for I = 1, row 1 of matrix A 12 2.4 ROM 2 for I = 2 row 2 of matrix A 12 5.1 Table describes state and its operation 32 5.2 RTL Code derived from the ASM flowchart 33 5.3 RTL Control Signal Table for FIR filter 34 5.4 Operation table of the shift right register 37 5.5 PSC Load or Shift Operation Table 38 5.6 ALU operation 39

x LIST OF FIGURES FIGURE NO. TITLE PAGE 2.1 Output from the Algorithm test 8 2.2 Top level functional block diagram of a DA 10 3.1 An 8-bit multiplier (AND gates) with adder to produce 15 output, Y 3.2 DA is a bit serial technique 15 3.3 Substitute the Scaling Accumulator into the original 16 design 3.4 Pre-computed sums stored in LUT for the bitwise 16 addition 3.5 Design flow of a filter 17 3.6 RTL flow design 19 4.1 Direct form FIR implementation derived from 23 Matlab s Simulink model 4.2 A lowpass filter frequency response 24 4.3 The impulse response of the FIR filter 26 4.4 The Pole / Zero Plot from Matlab analysis 27 4.5 MATLAB simulation output as Discrete Impulse input 28 forced 5.1 ASM Flowchart of the FIR filter 32 5.2 Functional block diagram of DU of an 8-order FIR 33 filter 5.3 Functional block diagram of the Control Unit 35 5.4 Top level block diagram of FIR filter 35 5.5 The contents of the ROM are the coefficients fetched based on the input data B as the address 36

xi 5.6 Shift right register output 36 5.7 Serial-In Parallel Out 37 5.8 Parallel-in Serial Out 37 5.9 Integration between PSC and ROM. The input 38 addresses successfully call the ROM memory content as expected 5.10 Timing simulation of ALU operation 39 5.11 Register C is a positive-edge triggered D-type 39 5.12 Timing simulation of Datapath Unit 40 5.13 Timing simulation of Datapath Unit when filter 40 coefficients are applied 5.14 Timing simulation of Control Unit 41 5.15 Timing simulation of Top Module of FIR filter 41 5.16 MATLAB Simulation when input is 6 42 5.17 Timing simulation of Top Module when input 6 is applied 42

xii LIST OF ABBREVIATIONS ASM - Algorithmic State Machine CU - Control Unit DA - Distributed Arithmetic Architecture DU - Datapath Unit FBD - Functional Block Diagram FIR - Finite Impulse Response FSM - Finite State Machines RTL - Register Transfer Operation MAC - Multiply and Accumulate

CHAPTER 1 INTRODUCTION This project report describes about the design of efficient architecture of Finite Impulse Response (FIR) filter implemented on Distributed Arithmetic (DA) through ROM based for a fast computation of multiply and accumulate (MAC) in order to achieve better performances in terms of area and speed. 1.1 Motivation and Rational of the Work Finite Impulse Response filter is widely used in signal and image processing applications. FIR filter has impulse response to any finite length of input. The filter settles to zero in finite time. The output of this linear time invariant system is determined by convolving its input signal with impulse response which is a weighted sum of the current and finite number of previous values of the input. The standard form of FIR filter is realized by a shifting register a loop in which the filter coefficients are multiplied by the shifting register values. The sum of these multiplications will determine the output value. This operation is also called multiply and accumulates (MAC) operation which is the core of FIR filter

2 implementations. As the number of filter order increases the hardware requirement (eg: more multipliers, delays thus increasing area) become more important and crucial especially for developing system with hardware and cost constraints. Multiplication is convolution. Multiplication of integers and discrete time convolution are same operations. The bit level description of multiplication can be mixed with the convolution. The most-often encountered form of computation in digital signal processing is inner products. Inner product computations can execute efficiently by DA by taking advantage of pre-computed data stored into ROM memory. Our motivation for using DA in this FIR filter design is because of its computational efficiency. The advantages are best exploited in circuit design. By proper design one may reduce the total gate count in a signal processing arithmetic unit by a number seldom smaller than 50 and often as large as 80 percent (Stanley A. White,1989). 1.2 Objectives In every FIR filter, multiply and accumulate is the core of FIR implementation. A FIR filter simply perform convolution as represented in the below equation: N 1 y( n) A. b( n k) k 0 k (1.1) where: b(n-k) input signal Ak is the coefficients/impulse response representation N- filter order By taking advantage of DA architecture, this project report will discuss on the FIR implementation using DA and how it can reduce the resource requirements for the inner product computations.

3 1.3 Scope of Work Prior the design work, the DA theoretical study and background understanding is developed in order to get the idea how DA manipulates the bit level description. The scope of design work for the FIR design can be divided into two aspects. One is FIR filter design using MATLAB in order to obtained the coefficients and analyze the filter performance. The second aspect is the RTL design of the FIR filter. The computed coefficients from the MATLAB are stored into the ROM memory of DA architecture. The logic synthesize of the Verilog HDL codes is run using Altera Quartus II version 11 with simulation run on Modelsim version 6.6 with testbenches generated in order to verify the design correctness. The comparison between the MATLAB output and designed FIR Filter simulation outputs will be discuss in terms of its performance and hardware requirements. 1.4 Organization of Project Report This project report comprise of six main chapters; Introduction, Background of Distributed Arithmetic, Research Methodology, FIR Filter and Matlab Analysis, RTL Design of FIR Filter and Future Development and Conclusion. Chapter 1 discusses the Motivation and Rational of the Work, Scope of Work and Organization of this Project Report. Chapter 2 discusses about the Background of Distributed Arithmetic. This chapter focuses on DA and its mathematical background where algorithm was also tested using C-program. The bit serial technique is discussed essentially contributes to the memory construction and organization.

4 Chapter 3 discusses the Research Methodology applied throughout the development of this project. Three aspects covered including the Distributed Arithmetic architecture study, FIR filter design flow and RTL design flow. Chapter 4 discusses the FIR Filter Design and Matlab Analysis that covers the filter design work flow, filter theory and understanding and how the coefficients are computed. Chapter 5 covers the RTL Design of the FIR Filter. Describes the step by step flow from Algorithm modelling, RTL modelling, Datapath Unit (DU) design and Control Unit (CU) design and the integration of both DU and CU. Simulation by components of each module and output comparison with MATLAB are also discussed. Chapter 6 discusses Future Work and Conclusion. The current DA architecture still have outstanding issue specifically when the filter order become increasingly large the memory size will expand exponentially with 2 k. Several techniques and proposed modified architecture will also be discussed.

REFERENCES 1. A. Amira and F.Bensaali, An FPGA based Parameterisable System for Matrix Product Implementation,in Field-Programmable Technology (FPT), 2003. Proceedings. 2003 IEEE International Conference on 17 December 2003. 2. S. Chandrasekaran and A. Amira, Novel Sparse OBC based Distributed Arithmetic Architecture for Matrix Transforms in Circuits and Systems, 2007. ISCAS 2007. IEEE International Symposium on 27-30 May 2007. 3. Mohamed Khalil Hani, Starter s Guide to Digital Systems VHDL & Verilog Design 4. Application Note The Role of FPGA based Signal Processing, URL:http://www.xilinx.com 5. Stanley A. White, Applications of Distributed Arithmetic to Digital Signal Processing: A Tutorial Review, IEEE ASSP Magazine, July, 1989 6. H.C. Chen, J.I. Guo, C.W. Jen and T.S Chang, Distributed Arithmetic realization of cyclic convolution and its DFT application, IEEE Proc.- Circuits Devices Syst., Vol. 152, No. 6, December 2005 7. Bo Hong, Haibin Yin, Xiumin Wang, Ying Xiao, Implementation of FIR Filter on FPGA Using DA-OBC Algorithm, IEEE International Conference, December 2010. 8. Pramod Kumar Maher, New Approach to Look-Up-Table Design and Memory-Based Realization of FIR Digital Filter, IEEE Transcations on Circuits and Systems, Vol57, No. 3, March 2010

46 9. Heejong Yoo and David V.Anderson, Hardware Efficient Distributed Arithmetic Architecture for High Order Digital Filters, IEEE International Conference, March 2005. 10. Tsutomu Sasao, Yukihiro Iguchi, Takahiro Suzuki, On LUT Cascade Realization of FIR Filters, Proceedings of the 2005 8 th Euromicro conference on Digital System Design.