A comparative study of Floating Point Multipliers Using Ripple Carry Adder and Carry Look Ahead Adder

1 Jaidev Dalvi, 2 Shreya Mahajan, 3 Saya Mogra, 4 Akanksha Warrier, 5 Darshana Sankhe
1,2,3,4,5 Department of Electronics, D. J. Sanghvi College of Engineering, Mumbai, India

Abstract — This paper presents a comparative study of floating point multipliers that use two different adders to implement a Vedic multiplication algorithm. Floating point numbers are represented using the IEEE-754 single precision format. The adders compared are the Ripple Carry Adder and the Carry Look Ahead Adder. The algorithm used for multiplication is the Urdhav Tiryakbhyam algorithm, based on ancient concepts of Vedic Mathematics. A detailed analysis of the delay of each implementation is presented. Both designs were coded in VHDL, simulated using Altera Quartus Prime (version 16.0 Standard Edition) and synthesized.

Keywords — floating point multiplication, IEEE-754 single precision format, Urdhav Tiryakbhyam, Ripple Carry Adder, Carry Look Ahead Adder, VHDL, Altera Quartus

I. INTRODUCTION

The floating point format is crucial for representing numbers when the data to be represented spans a wide range of values, and it has therefore found applications in a varied array of fields, including digital signal processing, digital image processing and embedded systems. A floating point representation is an unencoded member of a floating-point format, representing a finite number, a signed infinity, a quiet NaN, or a signalling NaN. Finite numbers are represented by three components: a sign, an exponent, and a significand; the numerical value is the signed product of the significand and the radix raised to the power of the exponent [2].

Multiplication and addition are the most frequently used floating point arithmetic operations. Today, most computational functions, such as those used in image processing and signal processing, involve recursive multiplication over large datasets, which must be performed within nanoseconds to ensure that the processing occurs in real time. Since floating point multiplication entails multiplication of the mantissas as well as addition of the exponents, accurate and speed-optimized multipliers and adders are essential for developing an efficient floating point multiplier. We implemented two floating point multipliers, both using a Vedic multiplication algorithm for multiplying the mantissas: one uses ripple carry adders for all the additions (the exponent addition as well as the additions within the multiplier), while the other uses carry look ahead adders. We have analysed the delay in each case and present our findings here. The Vedic multiplier, based on the Urdhav Tiryakbhyam algorithm, is faster than a conventional multiplier [9].

II. IEEE-754 STANDARD

Over the years there have been many formats for floating point representation, but the most significant one is defined by the IEEE 754 standard. It was adopted in 1985 and revised in 2008, and it is currently used by all processors and coprocessors [1]. The standard encompasses both decimal and binary floating point representations; in this paper we consider the binary representation only. IEEE 754 binary floating point single precision is a 32-bit representation, while the double precision format is a 64-bit representation. The 32-bit representation consists of three parts. The sign of the number is given in the first bit.
If the number is positive, this bit is 0; if the number is negative, this bit is 1. The next 8 bits represent the exponent to the base 2 [2]. The value stored in the exponent field is an unsigned integer E', held in excess-127 format: the signed exponent E is represented as E' = E + 127. Thus E' lies in the range 0 <= E' <= 255, while E lies in the range -126 to +127 (the values E' = 0 and E' = 255 are reserved for special cases). The mantissa is represented in the last 23 bits, giving 2^23 precision. The MSB of the mantissa is always equal to 1, a convention known as binary normalization; this bit is not part of the 23 stored bits and is implicitly assumed [3].

The standard gives exact representations of positive and negative infinity and of positive and negative zero ("negative zero"), defines five exceptions (underflow, overflow, divide by zero, invalid and inexact) with corresponding exception flags, special NaN values for representing invalid results, denormal numbers for representing values smaller than the normalized range, and four rounding modes. An interrupt routine, either system defined or user defined, can be attached to any of the exception flags.

Fig. 1 : IEEE 754 Single Precision format

Table 1 : Interpretation of IEEE 754 single precision encodings

Exponent (E')    Significand (N)    Value/Comments
255              Not equal to 0     NaN (does not represent a number)
255              0                  -infinity or +infinity, depending on the sign bit
0 < E' < 255     Any                Normalized number
0                0                  -0 or +0, depending on the sign bit
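To make the field layout concrete, the following VHDL sketch unpacks the three fields of a single precision word and restores the implicit leading 1 of the mantissa. It is a minimal illustration of the format description above, not a component of the paper's design; the entity and port names (fp_unpack, x, sign, exponent, mantissa) are our own.

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity fp_unpack is
  port (
    x        : in  std_logic_vector(31 downto 0);  -- IEEE-754 single precision word
    sign     : out std_logic;                      -- bit 31
    exponent : out unsigned(7 downto 0);           -- bits 30..23, biased: E' = E + 127
    mantissa : out unsigned(23 downto 0));         -- bits 22..0 with hidden bit restored
end entity fp_unpack;

architecture rtl of fp_unpack is
begin
  sign     <= x(31);
  exponent <= unsigned(x(30 downto 23));
  -- Normalized numbers (0 < E' < 255) carry an implicit leading 1;
  -- for E' = 0 (zero and denormals) the hidden bit is 0.
  mantissa <= unsigned('1' & x(22 downto 0)) when unsigned(x(30 downto 23)) /= 0
              else unsigned('0' & x(22 downto 0));
end architecture rtl;

As a check, the word 01000010111101101110100101111001 used in Section IV unpacks to sign 0 and E' = 10000101 (133, so E = 6), and its significand is approximately 1.929, giving 1.929 x 2^6 = 123.456.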

III. FLOATING POINT MULTIPLICATION

The multiplication of two floating point numbers is a stepwise process [4]. First, the two numbers are converted to the IEEE 754 standard format of representation. Then the significands are multiplied: the 23-bit mantissa of each number is prefixed with the implicit 1 at the MSB (as required by the format), giving two 24-bit numbers, which are multiplied using the Urdhav Tiryakbhyam Sutra. The exponents are added, the output sign is computed, and the result is normalized, as described below.

Fig. 2 : Algorithm for floating point multiplication [10]

A. MANTISSA MULTIPLICATION

The Urdhav Tiryakbhyam algorithm follows a vertical and crosswise mechanism for the multiplication of two numbers. Derived from ancient Indian concepts of mathematics, it multiplies two numbers in a short amount of time. In the Urdhav Tiryakbhyam algorithm, we first begin with a smaller block for multiplication. Let us initially consider a 3x3 basic block. For a 3-bit operation, we first multiply the least significant bits (LSBs) of the multiplicand and the multiplier; this gives us the LSB of the result. We then perform a crosswise multiplication of the two least significant bits of both the multiplier and the multiplicand. This crosswise multiplication is then performed across all three bits, followed by the two most significant bits (MSBs), and finally the MSBs of the multiplicand and multiplier. In each stage, the carry from the previous stage is added to the output. We eventually get a 6-bit result from the 3-bit numbers [3].

Fig. 3 : The 3-bit macro using the Urdhav Tiryakbhyam algorithm [3]

The 3-bit block is then built into a 6-bit block: we carry out the same crosswise process for a 6x6 multiplication, and the obtained result is a 12-bit result. The 6-bit blocks are then used to make the 12-bit block, and we perform a 12x12-bit crosswise multiplication to obtain the final 24-bit mantissa result [5]. Fig. 4 shows how the additions within a 6-bit block are arranged from its 3x3 partial products.

Fig. 4 : Additions performed within the multiplier at each stage. The 6-bit multiplication A5A4A3A2A1A0 x B5B4B3B2B1B0 is assembled as RESULT = C + ((D + E) << 3) + (F << 6), where

C = A2A1A0 x B2B1B0
D = A5A4A3 x B2B1B0
E = A2A1A0 x B5B4B3
F = A5A4A3 x B5B4B3
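To illustrate the vertical and crosswise mechanism, here is a minimal behavioural VHDL sketch of the 3x3 basic block; the entity name vedic_mult3 is ours, and the paper's actual macro of Fig. 3 is structural rather than behavioural. Every pair (i, j) contributes one single-bit vertical or crosswise product at column weight i + j, and the accumulation folds the stage carries forward.

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity vedic_mult3 is
  port (
    a : in  std_logic_vector(2 downto 0);
    b : in  std_logic_vector(2 downto 0);
    p : out std_logic_vector(5 downto 0));
end entity vedic_mult3;

architecture behavioural of vedic_mult3 is
begin
  process (a, b)
    variable acc : unsigned(5 downto 0);
  begin
    acc := (others => '0');
    -- Each (i, j) pair is one vertical or crosswise single-bit product;
    -- adding it at weight i + j also folds the stage carries into acc.
    for i in 0 to 2 loop
      for j in 0 to 2 loop
        if (a(i) and b(j)) = '1' then
          acc := acc + shift_left(to_unsigned(1, acc'length), i + j);
        end if;
      end loop;
    end loop;
    p <= std_logic_vector(acc);
  end process;
end architecture behavioural;

Larger blocks then follow Fig. 4: four such units compute C, D, E and F, and the 6-bit result is C + ((D + E) << 3) + (F << 6).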

B. EXPONENT ADDITION

As seen from the operation method for floating point multiplication, an important stage is the addition of exponents. The biased exponents must first be made unbiased: to get the original exponent from the biased exponent, we subtract 127. We then add the two exponents, and to get a biased exponent output we add 127 back to the result. This can be expressed mathematically as

Output = (E1 - 127) + (E2 - 127) + 127 = E1 + E2 - 127

Exponent addition is implemented with the use of an adder. Since it involves adding n-bit numbers, we can use either a Ripple Carry Adder or a Carry Look Ahead Adder, and we perform a comparative study of the delay resulting from the choice of adder. In all summations demanded by the algorithm, and in calculating the final biased exponent, we use the same kind of adder (Ripple Carry or Carry Look Ahead) throughout.

1. RIPPLE CARRY ADDER

A ripple carry adder is a logic circuit that ripples the carry bit through its various stages. Multiple full adders are cascaded to add two n-bit numbers. Every stage apart from the first has a carry-in bit: the carry-out of each stage serves as the carry-in for the succeeding stage. The sum and carry-out of a stage are only valid once the carry-in of that stage has arrived; this contributes a propagation delay, a lapse between the input and the appearance of the output [1]. While this delay is considerable, the adder's simplicity is its advantage: the layout is very easy to understand, and the gate delay can be calculated by inspecting the circuit. If each stage takes X units of time, an n-bit adder has a delay of n·X. Although the full adders operate in parallel, the carry must ripple from the LSB column to the MSB column, and it takes X units for the carry-out of one column to arrive as input to the adder in the column to its immediate left.

Fig. 5 : Ripple Carry adder

For the addition of two 8-bit numbers, the Ripple Carry Adder gives a delay of 9.86 ns. Even for large operand widths, the complexity of this form of addition stays simple: as the number of digits increases, the corresponding increase in the number of gates is quite feasible.

2. CARRY LOOK AHEAD ADDER

A ripple carry adder is slowed down by the propagation of the carry through each stage: the sum and carry outputs of a stage cannot be produced until the input carry arrives. This delay is known as the carry propagation delay. Other arithmetic operations, such as multiplication and division, contain adder segments within them, so this speed limitation slows down complex arithmetic operations by a considerable amount. A carry look ahead adder solves this issue by calculating the carries beforehand [7]. There are two conditions that produce a carry: both bits are 1, or one of the two bits is 1 and the carry-in (the carry-out of the previous stage) is 1. A CLA adder first calculates whether a particular digit will propagate a carry if one comes in from the previous stage; this is then evaluated for a group, i.e. whether the group as a whole will propagate the carry. For carry bit C1,

C1 = G0 + P0·C0

where G0 = a·b (the generate term) and P0 = a xor b (the propagate term), a and b being the input bits of the stage. Substituting the equation of the previous stage gives

C2 = G1 + P1·(G0 + P0·C0)

Generalizing,

Ci = C0·P0·P1···Pi-1 + G0·P1·P2···Pi-1 + G1·P2·P3···Pi-1 + ... + Gi-2·Pi-1 + Gi-1

which can be written compactly as the sum over j = -1, ..., i-1 of Gj·Pj+1·Pj+2···Pi-1, where G-1 denotes C0 [1]. A CLA adder is faster because it calculates multiple carries in parallel; for wider operands, stages of super-groups are added when needed.

Fig. 6 : Carry Look Ahead adder

For the addition of two 8-bit numbers, the Carry Look Ahead Adder gives a delay of 9.69 ns. As observed, this is more efficient than the Ripple Carry Adder owing to the speed optimization; the difference in delay is 0.17 ns. Over successive additions, this cumulatively optimizes the speed by a large margin.
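The two adder choices can be contrasted directly in VHDL. The sketches below are minimal illustrations under our own naming (rca8, cla4); the widths and structure of the paper's actual designs may differ. The first shows the rippling carry chain of the RCA:

library ieee;
use ieee.std_logic_1164.all;

entity rca8 is
  port (
    a, b : in  std_logic_vector(7 downto 0);
    cin  : in  std_logic;
    sum  : out std_logic_vector(7 downto 0);
    cout : out std_logic);
end entity rca8;

architecture rtl of rca8 is
  signal c : std_logic_vector(8 downto 0);
begin
  c(0) <= cin;
  stage : for i in 0 to 7 generate
    sum(i)   <= a(i) xor b(i) xor c(i);                        -- full-adder sum
    c(i + 1) <= (a(i) and b(i)) or (c(i) and (a(i) xor b(i))); -- carry ripples to next stage
  end generate;
  cout <= c(8);
end architecture rtl;

The second flattens the carry chain into two-level look-ahead equations:

library ieee;
use ieee.std_logic_1164.all;

entity cla4 is
  port (
    a, b : in  std_logic_vector(3 downto 0);
    cin  : in  std_logic;
    sum  : out std_logic_vector(3 downto 0);
    cout : out std_logic);
end entity cla4;

architecture rtl of cla4 is
  signal g, p : std_logic_vector(3 downto 0);
  signal c    : std_logic_vector(4 downto 0);
begin
  g <= a and b;   -- generate:  Gi = Ai·Bi
  p <= a xor b;   -- propagate: Pi = Ai xor Bi

  -- All carries are computed directly from G, P and Cin; nothing ripples.
  c(0) <= cin;
  c(1) <= g(0) or (p(0) and c(0));
  c(2) <= g(1) or (p(1) and g(0)) or (p(1) and p(0) and c(0));
  c(3) <= g(2) or (p(2) and g(1)) or (p(2) and p(1) and g(0))
               or (p(2) and p(1) and p(0) and c(0));
  c(4) <= g(3) or (p(3) and g(2)) or (p(3) and p(2) and g(1))
               or (p(3) and p(2) and p(1) and g(0))
               or (p(3) and p(2) and p(1) and p(0) and c(0));

  sum  <= p xor c(3 downto 0);
  cout <= c(4);
end architecture rtl;

In cla4, each carry c(i) is a sum of products of the generate and propagate signals and cin, exactly the generalized expression above, so all carries settle in parallel; the cost is progressively wider gates, which is why super-group stages are introduced for larger widths.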

C. SIGN CALCULATION

To get the output sign bit, an XOR operation is performed between the input sign bits. This can be expressed as

S = S1 xor S2

D. NORMALIZATION

The result obtained is then normalized, yielding the 23 normalized mantissa bits and a biased exponent. First, the result is checked for a leading 1. If the leading 1 lies in the position assumed as the 46th bit, the exponent result is incremented by 1 and the mantissa M[45:23] is taken as the normalized set of bits. Alternatively, without incrementing the exponent result, the mantissa M[44:22] is taken as the normalized set of bits [8].
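A direct transcription of this rule into VHDL might look as follows. This is a sketch under the text's bit numbering, with assumptions of ours: a product vector indexed 46 downto 0 so that the leading-1 test falls on bit 46, a 9-bit biased exponent to absorb overflow, and truncation in place of IEEE rounding.

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity fp_normalize is
  port (
    product  : in  std_logic_vector(46 downto 0);  -- mantissa product, indexed as in the text
    exp_in   : in  unsigned(8 downto 0);           -- biased exponent sum E1 + E2 - 127
    mant_out : out std_logic_vector(22 downto 0);  -- normalized 23-bit mantissa
    exp_out  : out unsigned(8 downto 0));
end entity fp_normalize;

architecture rtl of fp_normalize is
begin
  process (product, exp_in)
  begin
    if product(46) = '1' then
      -- Leading 1 in the 46th bit: take M[45:23] and increment the exponent.
      mant_out <= product(45 downto 23);
      exp_out  <= exp_in + 1;
    else
      -- Otherwise take M[44:22] with the exponent unchanged.
      mant_out <= product(44 downto 22);
      exp_out  <= exp_in;
    end if;
    -- Rounding is ignored here: bits below the kept slice are truncated.
  end process;
end architecture rtl;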

IV. EXPERIMENTAL RESULTS

The proposed floating point multiplier has been coded in VHDL, then synthesized and simulated using Altera Quartus, targeting the Altera Cyclone IV device.

Fig. 7 : Vedic multiplier using Ripple Carry adders

In the first implementation, the multiplier was designed using a Ripple Carry Adder: all addition operations within the Vedic algorithm, along with the addition of exponents, were performed using only an RCA. We found that the overall delay of this implementation is 34.884 ns.

Fig. 8 : Vedic multiplier using CLA adders

We then tested the same floating point multiplier using a CLA adder. Here we found a delay of 27.3 ns; this reduced delay can be attributed to the faster carry computation of the Carry Look Ahead Adder. The difference in delay between the two implementations is 7.584 ns.

The given outputs display, in both multipliers, a multiplication operation between the following two numbers:

input1 = 123.456
input2 = 456.789

which in IEEE 754 single precision floating point format correspond to

input1 = 01000010111101101110100101111001
input2 = 01000011111001000110010011111110

It was also observed that the more 1s there are in the values being multiplied, the greater the delay of their multiplication and hence the greater the difference between the delays.

V. CONCLUSION

This paper presented an efficient implementation of a floating point multiplier using a Vedic algorithm, and compared its implementation using two different adders on the basis of the delay in computing the output. We observed that the floating point multiplier based on the Urdhav Tiryakbhyam Sutra implemented using a Carry Look Ahead adder gave a faster output than the one using a Ripple Carry Adder.

REFERENCES

[1] M. Al-Ashrafy, A. Salem and W. Anis, "An efficient implementation of floating point multiplier," 2011 Saudi International Electronics, Communications and Photonics Conference (SIECPC), IEEE, 2011.
[2] "IEEE Standard for Floating-Point Arithmetic," IEEE Std 754-2008, pp. 1-70, Aug. 29, 2008.
[3] K. Paldurai and K. Hariharan, "FPGA implementation of delay optimized single precision floating point multiplier," 2015 International Conference on Advanced Computing and Communication Systems, IEEE, 2015.
[4] N. Shirazi, A. Walters and P. Athanas, "Quantitative analysis of floating point arithmetic on FPGA based custom computing machines," Proceedings of the IEEE Symposium on FPGAs for Custom Computing Machines (FCCM '95), pp. 155-162, 1995.
[5] S. Arish and R. K. Sharma, "Run-time reconfigurable multi-precision floating point multiplier design for high speed, low-power applications," 2015 2nd International Conference on Signal Processing and Integrated Networks (SPIN), IEEE, 2015.
[7] P. S. Kumar et al., "Efficient floating point multiplier implementation via carry save multiplier," Middle-East Journal of Scientific Research 22.11 (2014): 1652-1657.
[8] B. S. Ganesh, J. E. N. Abhilash and G. R. Kumar, "Design and implementation of floating point multiplier for better timing performance," International Journal of Advanced Research in Computer Engineering & Technology 1.7 (2012).
[9] G. G. Kumar and V. Charishma, "Design of high speed Vedic multiplier using Vedic mathematics techniques," International Journal of Scientific and Research Publications 2.3 (2012).
[10] W. Stallings, Computer Organization and Architecture: Designing for Performance. Pearson Education India, 2000.