ARCHITECTURAL DESIGN OF 8 BIT FLOATING POINT MULTIPLICATION UNIT


Usha S. 1 and Vijaya Kumar V. 2
1 VLSI Design, Sathyabama University, Chennai, India
2 Department of Electronics and Communication Engineering, Sathyabama University, Chennai, India
E-Mail: ushass.sekar@gmail.com

ABSTRACT
In modern high-speed processors, the floating point ALU (FP ALU) is one of the important units, performing arithmetic and logical operations on floating point numbers. Floating point multiplication is one of the arithmetic operations that takes the most computational time, and when implemented in hardware it requires a large area because of the large number of components. However, the advantage of implementing the floating point unit in hardware is that the overflow and underflow conditions arising during the operation can be handled. This paper presents a high-speed 8-bit floating point multiplier. To increase the speed and to reduce the power consumption due to clock load, the wave pipelining method has been used. The coding is written in Verilog HDL and the design is analyzed in the Xilinx environment.

Keywords: FPU, floating point multiplication, wave pipelining, HDL.

INTRODUCTION
Multipliers are extensively used in microprocessors, digital signal processors and communication applications. For the multiplication of larger numbers and higher orders, a huge number of adders is needed to perform the partial product addition. High-speed multipliers are needed to increase the speed of processors, and high-throughput arithmetic operations are important to achieve the desired performance in many real-time signal and image processing applications. One of the key arithmetic operations in such applications is multiplication, and the development of fast multiplier circuits has been a subject of interest for decades [4]. Reducing the time delay and the power consumption are essential requirements for many applications. In the past, multiplication was generally implemented with a sequence of addition, subtraction and shift operations. The two most common multiplication algorithms used in digital hardware are the array multiplication algorithm and the Booth multiplication algorithm. Because of the importance of digital multipliers in DSPs, they have always been an active area of research.

An 8-bit floating point number consists of sign, exponent and mantissa bits, and all of these bits have to be processed to perform the multiplication operation. Pipelining and parallel processing are concepts that can be used in computational units to reduce the delay [1, 8]. Wave pipelining is a technique for pipelining digital systems that can increase the clock frequency of practical circuits without increasing the number of storage elements. In wave pipelining, many coherent waves of data are sent through a block of combinational logic by applying new inputs faster than the propagation delay through the logic [3]. Ideally, if all paths from input to output have equal delay, the circuit's clock frequency is limited only by the rise and fall times, the clock skew, and the setup and hold times of the storage elements [2, 9]. In an ordinary pipelined system there is one wave of data between register stages: a new set of values is clocked into one set of registers, and those values are allowed to reach the next set of registers before the first set is clocked again. In contrast, wave pipelining uses multiple coherent waves of data between the storage elements.
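To make the contrast concrete, the fragment below is a minimal Verilog sketch of conventional (register-based) pipelining, in which a register stage separates the combinational blocks so that only one wave of data exists between successive registers; the module and signal names are illustrative and are not taken from the paper's design.

// Minimal sketch of a conventional two-stage pipeline: a register stage
// separates the combinational blocks, so only one wave of data exists
// between successive registers at any time.
module pipe2_sketch (
    input  wire       clk,
    input  wire [3:0] a, b,
    output reg  [7:0] product
);
    // Stage 1: register the incoming operands.
    reg [3:0] a_r, b_r;
    always @(posedge clk) begin
        a_r <= a;
        b_r <= b;
    end

    // Stage 2: combinational multiply followed by the output register.
    always @(posedge clk)
        product <= a_r * b_r;
endmodule

Wave pipelining, by contrast, removes such intermediate registers and relies on balanced path delays so that several data waves travel through the combinational logic at once.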
8 BIT FLOATING POINT REPRESENTATION
Floating-point operations are used to simplify the addition, subtraction, multiplication and division of fractional numbers. They are used when dealing with fractional numbers such as 5.724, very large numbers, and signed fractional numbers. When performing arithmetic operations involving fractions or very large numbers, it is necessary to know the location of the binary (radix) point and to properly align this point before the arithmetic operation. For floating-point operations, the location of the binary point depends on the format used by the computer, and all numbers are placed in this format before the arithmetic operation is carried out. The fractional portion of the number is called the mantissa, and the portion indicating the scale factor, i.e. the exponent, is called the characteristic. Floating-point numbers are coded so that inverting the sign bit inverts the sign; consequently the same operator performs either addition or subtraction depending on the signs of the two operands. In the 8-bit IEEE-style floating point representation used here, the first bit is assigned to the sign of the number, the next three bits to the exponent, and the last four bits to the mantissa. Figure-1 represents the 8-bit floating point number in this format.
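The 1-3-4 field split can be summarized by the following minimal Verilog sketch, which simply slices an 8-bit operand into its sign, exponent and mantissa fields; the wire and module names are illustrative.

// Sketch: splitting an 8-bit floating point operand into its fields.
// Bit 7     : sign
// Bits 6..4 : 3-bit exponent
// Bits 3..0 : 4-bit mantissa
module fp8_fields (
    input  wire [7:0] a,
    output wire       sign,
    output wire [2:0] exponent,
    output wire [3:0] mantissa
);
    assign sign     = a[7];
    assign exponent = a[6:4];
    assign mantissa = a[3:0];
endmodule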

Figure-1. 8-bit floating point representation.

8 BIT FLOATING POINT MULTIPLICATION
The multiplication entity has two 8-bit inputs, A and B. The flow diagram of the multiplication process is shown in Figure-2. The multiplication of two floating point numbers A and B involves the following steps.

Obtain the input bits and assign them to separate wires: Sa and Sb are the sign bits, Ea and Eb the exponent bits, and Ma and Mb the mantissas of the inputs A and B.

Carry out the ex-or operation between Sa and Sb to obtain the resultant sign bit, as given in Equation (1):

Sign = Sa ^ Sb                      (1)

Add the exponents and subtract the bias (100) from the result to obtain the resultant exponent, as given in Equation (3):

Er = Ea + Eb - bias                 (3)

Multiply the mantissas Ma and Mb using the conventional multiplication method and normalize the result.

Figure-2. Flow diagram of floating point multiplication.

FLOATING POINT MULTIPLICATION UNIT
The architectural-level design helps in obtaining the exact result of the floating point multiplication; hence the design does not require any status register for the overflow and underflow conditions. The design is separated into three modules: sign, exponent and mantissa. For exceptions such as Not a Number (NaN) [7] and infinity, the design returns the result 0. The different modules are explained in detail below. Figure-3 shows the block diagram of the floating point multiplication unit. The ex-or gate produces the sign bit of the result: Sa and Sb are the inputs and sign is the output of the sign module. Ex is the output of the 3-bit ripple carry adder, and Er is the output of the exponent module after the bias is subtracted.
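Before the individual modules are described, the behavioral Verilog sketch below summarizes the datapath outlined above (sign ex-or, exponent addition with bias subtraction, mantissa multiplication with a one-bit normalization). It is a minimal illustration under stated assumptions rather than the paper's RTL: the bias is assumed to be 3'b100, the normalization policy (keeping the upper four bits of the mantissa product) is assumed, and rounding, overflow and underflow handling are omitted.

// Behavioral sketch of the 8-bit floating point multiply steps.
// Bias value and normalization policy are assumptions, not the paper's RTL.
module fp8_mul_sketch (
    input  wire [7:0] a, b,
    output wire [7:0] out
);
    // Field extraction: 1 sign, 3 exponent, 4 mantissa bits.
    wire       sa = a[7];
    wire       sb = b[7];
    wire [2:0] ea = a[6:4];
    wire [2:0] eb = b[6:4];
    wire [3:0] ma = a[3:0];
    wire [3:0] mb = b[3:0];

    // Resultant sign: Sign = Sa ^ Sb.
    wire sign = sa ^ sb;

    // 4 x 4 mantissa multiplication (conventional method).
    wire [7:0] mprod = ma * mb;

    // One-bit normalization (assumed policy): if the top product bit is set,
    // keep bits [7:4] and increment the exponent, otherwise keep bits [6:3].
    wire       norm = mprod[7];
    wire [3:0] mant = norm ? mprod[7:4] : mprod[6:3];

    // Resultant exponent: Er = Ea + Eb - bias (+1 when normalized),
    // with the bias assumed to be 3'b100.
    wire [2:0] er = ea + eb - 3'b100 + (norm ? 3'd1 : 3'd0);

    assign out = {sign, er, mant};
endmodule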

The mantissa module is designed for the conventional multiplication, with pipeline registers added between each stage.

Figure-3. Floating point multiplication unit.

Sign module
In this module the resultant sign bit is obtained by performing the ex-or operation between the inputs Sa and Sb. Figure-4 shows the ex-or gate that produces the output sign.

Figure-4. Ex-Or gate for resultant sign bit.

Exponent module
In this module the processing steps for the exponents are carried out: the exponents are added and the bias is subtracted from the result.

Mantissa module
In this module the multiplication of the mantissas is carried out. The terms M1 to M16 are obtained by the conventional multiplication method using AND gates. By repeated addition the sum and carry of each element are obtained; the sums s0 to s7 form the added result, and the final carry c9 is also stored in the register. Figure-5 shows the circuit for mantissa multiplication. When pipeline registers are used between these stages, the speed of operation is increased. Wave pipelining is the concept of finding the delay of each element and then making the delay the same across the whole circuit. Multiple signals with different timing delays are sent to trigger the transmission gates of the wave-pipelined circuit, and hence transfer the data from one stage to another.

Figure-5. Circuit for mantissa multiplication.

The output of the AND gate M1 is taken directly as the LSB of the result, s0. The outputs of M2 and M5 are fed into a half adder; its sum output is s1, and its carry is given as an input to a full adder along with M3 and M6. A half adder then combines the resulting sum with the input M9 to produce the sum s2. The same method is used to obtain the other outputs up to s7, as shown in Figure-5; a structural sketch of the first three columns is given below.
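As an illustration of the array structure just described, the structural Verilog sketch below forms the partial products with AND gates and reproduces the first three output columns (s0, s1, s2) using half adders and a full adder. The row-wise numbering of M1..M16 (M1..M4 = Ma AND Mb[0], M5..M8 = Ma AND Mb[1], and so on) is an assumption inferred from the text.

// Structural sketch of the first three columns of the 4 x 4 mantissa
// array multiplier. Partial products are assumed to be numbered row-wise:
// M1..M4 = Ma & Mb[0], M5..M8 = Ma & Mb[1], M9..M12 = Ma & Mb[2], ...
module half_adder (input wire x, y, output wire s, c);
    assign s = x ^ y;
    assign c = x & y;
endmodule

module full_adder (input wire x, y, cin, output wire s, c);
    assign s = x ^ y ^ cin;
    assign c = (x & y) | (cin & (x ^ y));
endmodule

module mantissa_cols_sketch (
    input  wire [3:0] ma, mb,
    output wire       s0, s1, s2
);
    // Partial products used by the first three columns.
    wire m1 = ma[0] & mb[0];
    wire m2 = ma[1] & mb[0];
    wire m3 = ma[2] & mb[0];
    wire m5 = ma[0] & mb[1];
    wire m6 = ma[1] & mb[1];
    wire m9 = ma[0] & mb[2];

    // Column 0: M1 goes straight out as the LSB, s0.
    assign s0 = m1;

    // Column 1: a half adder on M2 and M5 gives s1 and the carry c1.
    wire c1;
    half_adder ha1 (.x(m2), .y(m5), .s(s1), .c(c1));

    // Column 2: a full adder on M3, M6 and c1, then a half adder with M9, gives s2.
    wire t2, c2a, c2b;
    full_adder fa1 (.x(m3), .y(m6), .cin(c1), .s(t2), .c(c2a));
    half_adder ha2 (.x(t2), .y(m9), .s(s2), .c(c2b));
    // The carries c2a and c2b propagate to the next column (not shown here).
endmodule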

SIMULATION RESULTS

Figure-6. Simulation result of the floating point multiplier for uneven exponents.

In Figure-6 the simulation result for the multiplier with different exponents is shown. Here A, B and clk are the inputs and out is the output.

Figure-7. Simulation result for floating point multiplier with equal exponents.

In Figure-7 the simulation result for the multiplier with equal exponents is shown. Here also A, B and clk are the inputs and out is the output.

Figure-8. Simulation result for wave pipelined multiplier.

In Figure-8 the simulation result for the wave-pipelined multiplier with different exponents is shown. Here A, B and ena are the inputs and out_reg is the output.

Table-1. Performance analysis for synchronous multiplication unit.

Module                            Power (W)   Delay (s)   PDP (J)
Synchronous multiplication unit   192 m       0.20 n      395.5 p

Table-2. Performance analysis for wave pipelined multiplication unit.

Module                               Power (W)   Delay (s)   PDP (J)
Wave pipelined multiplication unit   160 m       0.62 n      99.05 p

Table-1 and Table-2 show the performance analysis of the multiplication unit; the performance is improved by 65% in the wave-pipelined design.

CONCLUSIONS
Thus an efficient floating point multiplication unit has been designed. To increase the computation speed, pipeline registers are used, which makes the hardware implementation of the design easier. The work gives a method for hierarchical multiplier design and clearly indicates the computational advantages offered by pipelined stages. For the exponent part, 3-bit adders are used, and the sign bit is obtained by the ex-or operation on the input sign bits. The same design is also carried out using the wave pipelining method: the increase in delay is negligible, while the power consumption of the wave-pipelined multiplier is reduced by 65%. The design is analyzed in the Xilinx environment and the results are obtained after simulation.

REFERENCES

[1] Dave Omkar R. ASIC Implementation of 32 and 64 bit Floating Point ALU using Pipelining. International Journal for Computer Applications.

[2] Shuchi Bhagwani. Comparative Study of Various Network Processors Processing Elements Topologies. International Journal of Engineering Science and Innovative Technology (IJESIT). 2(1).

[3] Deck C. Wong et al. Designing High Performance Digital Circuits Using Wave Pipelining: Algorithms and Practical Experience. IEEE Transactions on Computer Aided Design of Integrated Circuits and Systems. 12(1).

[4] Sivarama Dandamudi. Guide to RISC Processors for Programmers and Engineers.

[5] M. Morris Mano. 1993. Computer System Architecture. 3rd edition, Prentice-Hall, New Jersey, USA. pp. 346-348.

[6] Premananda B.S. et al. Design and Implementation of 8-Bit Vedic Multiplier. International Journal of Advanced Research in Electrical, Electronics and Instrumentation Engineering. 2(12).

[7] Shrivastava Purnima et al. VHDL Environment for Floating Point Arithmetic Logic Unit - ALU Design and Simulation. Research Journal of Engineering Sciences, ISSN 2278-9472, 1(2): 1-6.

[8] Ravi T. and Kannan V. 2012. Design and Analysis of Low Power CNTFET TSPC D Flip-Flop Based Shift Registers. Applied Mechanics and Materials Journal.

[9] Ravi T. et al. 2013. Ultra Low Power Single Edge Triggered Delay Flip Flop Based Shift Registers Using 10-Nanometer Carbon Nano Tube Field Effect Transistor. American Journal of Applied Sciences. 10(12): 1509-1520.