A Binary Floating-Point Adder with the Signed-Digit Number Arithmetic

Similar documents
Floating Point. The World is Not Just Integers. Programming languages support numbers with fraction

Floating Point Arithmetic

FLOATING POINT ADDERS AND MULTIPLIERS

CO212 Lecture 10: Arithmetic & Logical Unit

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Digital Computer Arithmetic ECE 666

FPGA IMPLEMENTATION OF FLOATING POINT ADDER AND MULTIPLIER UNDER ROUND TO NEAREST

Numeric Encodings Prof. James L. Frankel Harvard University

Number System. Introduction. Decimal Numbers

Floating-Point Data Representation and Manipulation 198:231 Introduction to Computer Organization Lecture 3

CPE300: Digital System Architecture and Design

EE878 Special Topics in VLSI. Computer Arithmetic for Digital Signal Processing

By, Ajinkya Karande Adarsh Yoga

ADDERS AND MULTIPLIERS

Divide: Paper & Pencil

Floating-point representations

Floating-point representations

Computer Architecture and Organization

Floating Point Square Root under HUB Format

Number Systems Standard positional representation of numbers: An unsigned number with whole and fraction portions is represented as:

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Digital Computer Arithmetic ECE 666

CS 265. Computer Architecture. Wei Lu, Ph.D., P.Eng.

Computer Arithmetic andveriloghdl Fundamentals

ECE232: Hardware Organization and Design

At the ith stage: Input: ci is the carry-in Output: si is the sum ci+1 carry-out to (i+1)st state

The Sign consists of a single bit. If this bit is '1', then the number is negative. If this bit is '0', then the number is positive.

CS 101: Computer Programming and Utilization

ECE331: Hardware Organization and Design

CS Computer Architecture. 1. Explain Carry Look Ahead adders in detail

Floating-point Arithmetic. where you sum up the integer to the left of the decimal point and the fraction to the right.

Implementation of Double Precision Floating Point Multiplier in VHDL

Measuring Improvement When Using HUB Formats to Implement Floating-Point Systems under Round-to- Nearest

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Digital Computer Arithmetic ECE 666

A Library of Parameterized Floating-point Modules and Their Use

Chapter Three. Arithmetic

CHAPTER 1 Numerical Representation

Chapter 5 : Computer Arithmetic

Signed Multiplication Multiply the positives Negate result if signs of operand are different

Number Systems and Computer Arithmetic

COMP2611: Computer Organization. Data Representation

EC2303-COMPUTER ARCHITECTURE AND ORGANIZATION

EE878 Special Topics in VLSI. Computer Arithmetic for Digital Signal Processing

Topics. 6.1 Number Systems and Radix Conversion 6.2 Fixed-Point Arithmetic 6.3 Seminumeric Aspects of ALU Design 6.4 Floating-Point Arithmetic

HIGH PERFORMANCE QUATERNARY ARITHMETIC LOGIC UNIT ON PROGRAMMABLE LOGIC DEVICE

CPE 323 REVIEW DATA TYPES AND NUMBER REPRESENTATIONS IN MODERN COMPUTERS

ECE 30 Introduction to Computer Engineering

CPE 323 REVIEW DATA TYPES AND NUMBER REPRESENTATIONS IN MODERN COMPUTERS

Chapter 03: Computer Arithmetic. Lesson 09: Arithmetic using floating point numbers

MIPS Integer ALU Requirements

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Digital Computer Arithmetic ECE 666

Organisasi Sistem Komputer

Chapter 10 - Computer Arithmetic

Principles of Computer Architecture. Chapter 3: Arithmetic

1. NUMBER SYSTEMS USED IN COMPUTING: THE BINARY NUMBER SYSTEM

±M R ±E, S M CHARACTERISTIC MANTISSA 1 k j

C NUMERIC FORMATS. Overview. IEEE Single-Precision Floating-point Data Format. Figure C-0. Table C-0. Listing C-0.

Chapter 2 Data Representations

VTU NOTES QUESTION PAPERS NEWS RESULTS FORUMS Arithmetic (a) The four possible cases Carry (b) Truth table x y

COMPUTER ARCHITECTURE AND ORGANIZATION. Operation Add Magnitudes Subtract Magnitudes (+A) + ( B) + (A B) (B A) + (A B)

Part III The Arithmetic/Logic Unit. Oct Computer Architecture, The Arithmetic/Logic Unit Slide 1

Advanced Computer Architecture-CS501

Data Representations & Arithmetic Operations

Implementation of IEEE754 Floating Point Multiplier

UNIT IV: DATA PATH DESIGN

CHW 261: Logic Design

A Radix-10 SRT Divider Based on Alternative BCD Codings

10.1. Unit 10. Signed Representation Systems Binary Arithmetic

An FPGA based Implementation of Floating-point Multiplier

3.5 Floating Point: Overview

The ALU consists of combinational logic. Processes all data in the CPU. ALL von Neuman machines have an ALU loop.

THE INTERNATIONAL JOURNAL OF SCIENCE & TECHNOLEDGE

Fixed Point. Basic idea: Choose a xed place in the binary number where the radix point is located. For the example above, the number is

Double Precision Floating-Point Arithmetic on FPGAs

Computer Arithmetic Ch 8

International Journal of Advanced Research in Electrical, Electronics and Instrumentation Engineering

Computer Arithmetic Ch 8

Digital Computer Arithmetic

IEEE Standard for Floating-Point Arithmetic: 754

Computer Architecture and IC Design Lab. Chapter 3 Part 2 Arithmetic for Computers Floating Point

An Efficient Implementation of Floating Point Multiplier

Chapter 2. Data Representation in Computer Systems

Implementation of Floating Point Multiplier Using Dadda Algorithm


Double Precision Floating-Point Multiplier using Coarse-Grain Units

Floating-Point Butterfly Architecture Based on Binary Signed-Digit Representation

Number Systems. Both numbers are positive

Efficient Floating-Point Representation for Balanced Codes for FPGA Devices (preprint version)

Computer Sc. & IT. Digital Logic. Computer Sciencee & Information Technology. 20 Rank under AIR 100. Postal Correspondence

FPGA Implementation of Multiplier for Floating- Point Numbers Based on IEEE Standard

Redundant-Digit Floating-Point Addition Scheme Based on a Stored Rounding Value

9/3/2015. Data Representation II. 2.4 Signed Integer Representation. 2.4 Signed Integer Representation

A Single/Double Precision Floating-Point Reciprocal Unit Design for Multimedia Applications

Floating Point Numbers

CMPSCI 145 MIDTERM #1 Solution Key. SPRING 2017 March 3, 2017 Professor William T. Verts

Floating Point Arithmetic

Inf2C - Computer Systems Lecture 2 Data Representation

Binary Addition. Add the binary numbers and and show the equivalent decimal addition.

Computer Architecture Set Four. Arithmetic

A Pipelined Fused Processing Unit for DSP Applications

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 3. Arithmetic for Computers Implementation

Floating-Point Arithmetic

Transcription:

Proceedings of the 2007 WSEAS International Conference on Computer Engineering and Applications, Gold Coast, Australia, January 17-19, 2007 528 A Binary Floating-Point Adder with the Signed-Digit Number Arithmetic Shugang Wei Gunma University Department of Computer Science Tenjin-cho 1-5-1, Kiryu-shi, Gunma 376-8515 Japan wei@ja4.cs.gunma-u.ac.jp Abstract: In a conventional binary floating-point number arithmetic system, two s complement number representation is often used to perform addition/subtraction in a floating-point adder. Since the significand of an addition operand is usually expressed as a sign-magnitude number representation, the swapping operation of two operands and the carry propagation in the addition will limit the arithmetic speed. In this paper, we introduce a radix-two signed-digit(sd) number arithmetic to the floating-point number arithmetic. Then the swapping operation is not required and the carry propagation becomes free for the inner addition. We present an addition circuit architecture using the SD arithmetic with a normal binary floating-point number representation in both input and output. Efficient SD-binary conversion and normalization circuits are also proposed. Key Words: Floating-point number, Signed-digit number, Arithmetic circuit, Binary addition, IEEE standard 754. 1 Introduction Since fixed-point number systems have certain limitations in the relatively small range of numbers that can be represented and the possible loss of significant digits during computations, floating-point number systems are often used to manipulate very large and very small numbers for scientific computation and digital signal processing[1, 2]. A floating-point number is usually represented as an exponent part and a significand part, so that the arithmetic with floating-point numbers is very complicated comparing to that with the fixed-point numbers. In basic arithmetic operations, the floating-point addition and subtraction are most complicated ones that consist of the operations such as exponent manipulation, alignment shifting, addition, normalization and rounding. A number of adders for the IEEE Standard 754, a binary floating-point data format, have been presented[1, 3]. However, the swapping operation of two operands in the ordinary binary number representation and the carry propagation during the addition of them will limit the arithmetic operation speed. It is well known that the carry propagation is limited to one position during additions of signeddigit(sd) numbers[4]. The redundant SD number representation is often used to achieve high-speed arithmetic[6]. In this paper, we present a new architecture of the floating-point adder by introducing a radixtwo SD number arithmetic. Therefore, the swapping operation of two operands is not required and the carry propagation becomes free for the inner addition. We also present efficient SD-binary conversion and normalization circuits. The design and simulation results show that the delay time of the proposed floating-point adder is reduced to 77 % comparing to the conventional one. 2 Addition in Floating-Point Number System 2.1 IEEE Floating-Point Number System The floating-point number system has a very large range of number representations, comparing to the fixed-point number system with the same wordlength. Figure 1 shows a 32-bit floating-point data format, which is well known as IEEE standard 754 [4]. The sign bit is the leading bit and the 8-bit exponent field precedes the 23-bit significands field. The exponent is biased by adding 2 7 1 and the significand is in a signand-magnitude notation. The 1 is hidden in the number representation, so that the value range expressed by the significands field is [1, 2). Therefore, the value of such a floating-point number x is given by the following equation: x = ( 1) s M b e 127, (1)

Proceedings of the 2007 WSEAS International Conference on Computer Engineering and Applications, Gold Coast, Australia, January 17-19, 2007 529 s 1 bit Biased exponent e + bias Significand M=1.f (the 1 is hidden) 8 bits 23 bits : 32 bits Figure 1: IEEE standard floating-point number representation. f 8:9<; 5 C "!$# "!&% '(*),+.- / 021 3 465 025 7 8:9<; 569<="9?>A@256B SO^+_).),P.H where ( 1) s denotes the sign of x. That is, s = 0 for positive numbers and s = 1 for negative numbers. b = 2 is the exponent base, and e expresses the exponent. M = 1.f and f is a pure unsigned fractional number. Most of addition/subtraction time is for the arithmetic operations on the significands. 2.2 Conventional Adder Architecture Let x 1 and x 2 be two floating-point numbers, as shown in Fig.1. Since the subtraction can be performed by x 1 x 2 = x 1 + ( x 2 ), we consider the addition without loss of generality. When e 1 e 2, the addition can be performed as follows: x 1 + x 2 = (( 1) s 1 M 1 + ( 1) s 2 M 2 2 (e 1 e 2 ) )2 e 1 bias To implement the above arithmetic, the following steps are required for adding two floating-point numbers: (1) Calculate the difference of the two exponents, d = e 1 e 2. (2) Shift the significand of the smaller number by d bit positions to the right. (3) Add the aligned significands and set the exponent of the result equal to the exponent of the larger operand. (4) Normalize the resultant significand and adjust the resultant exponent if necessary. When x 1 < x 2 and e 1 = e 2 specially, the swapping of the operands with a comparison is required, and a pretty long delay time for the comparison is necessary. A conventional architecture of floating-point adder is illustrated in Fig.2[4, 7]. The swapper compares the magnitudes and changes the order of the significands. The bit-inverter makes the two s complement for the negative value of the second operand. Then the aligned binary numbers are added in the binary adder. The rounding and shifting operations are performed for the normalization. In the floating-point addition circuit, swapping two unsigned significands with a comparison time and adding them with a carry propagation delay time will limit the arithmetic operation speed. DFEG(.+IHKJMLON.NIP.H `a.bi(.nieg(.c Q,RKSFTIEGUWVXPIH 8:9<; 5 021 3 465 025 7 8:9<; 569<="9?>A@256B YF+.- / Z\[\] dfefghihjefkml_n kfio Figure 2: Conventional binary floating-point adder. 3 Floating-Point Adder Using SD Arithmetic 3.1 Architecture of the Proposed Floating- Point Adder We introduce a radix-two SD number arithmetic to the floating-point addition for getting a shorter delay time. An architecture of the floating-point adder using the SD number arithmetic is shown in Fig.3. The swapping is not required for the addition of two SD numbers, then the unsigned significands are directly aligned by shifting in R-shifters. The shift bit number to right is determined by the values of the exponents in the circuit Judge Size. Then the guard digits, R and S, are generated for rounding. After attaching the signs to the addends in SD Coder, the SD addition of two SD numbers are performed. The carry propagation is free for the SD addition, so that a high-speed floating-point adder may be designed. If the operands have the same sign, the exponent is increased by 1 and the SD number is shifted to right. To convert the SD number into the IEEE standard floating-point data format, an SD Decoder is used to transfer the SD number into a binary one, and a properly left-shifting is performed for the postnormalization. For the SD-tobinary conversion, a binary addition is performed and the carry propagation will arises for the binary number arithmetic. In parallel, the number of shift bit number,

Proceedings of the 2007 WSEAS International Conference on Computer Engineering and Applications, Gold Coast, Australia, January 17-19, 2007 530 35476 0! "#%$'&)( * +-,. /10 +-0 2 35476 0 4784:9<; 01= x p-1 y p-1 x p-2 y p-2 x 1 y 1 x 0 y 0 c p-1 c 1 c 0 >?A@CBD#)BDEF&'G H)E >?u@vb)#db)e&]g H)E SDFA ADD2 ADD2 ADD2 - < w wx?jigfh)mdb)e?jilknmdm)bde OPO 1Q-R1 \]X^?JS)TVUWG B)E_N>H)[D#)M)TV#D` 35476 0 +-,. /10 +-0 2 35476 0 4784:9<; 01= aj&)( * bdcde?jtv`d#)bdm]x^idtv`)tg KNETG S)YZB'G TV(?JSDTVUWG:X KNYZHD[)#'G hjijkmlnl^ijoqpsr ojnt Figure 3: Architecture of floating-point adder using SD arithmetic. which is required in the following L-shifter, are decided by the circuit Shift-Amount. When the operands have the same sign, the sign will be as the output sign; when they have different signs, the output data sign will be determined by the calculation result in SD Decoder. 3.2 Sign-Digit Arithmetic A number x can be represented by a p-digit radix-two SD number representation as follows: x = x p 1 2 p 1 + x p 2 2 p 2 + + x 0, x i { 1, 0, 1} (i = 0, 1,, p 1), (2) which can be denoted as x = (x p 1, x p 2,, x 0 ) SD. The SD number representation has redundancy; for example, 7 may be represented by (0, 1, 1, 1) SD, (1, 1, 1, 1) SD or (1, 0, 0, 1) SD for p = 4. By using the redundant number representation, parallel arithmetic can be achieved without the carry propagation which occurs during addition in an ordinary binary system. In the SD number representation, x has a value in the range of [ (2 p 1), 2 p 1]. An SD adder is shown in Fig.4, performing the following two steps. z p-1 Figure 4: Block diagram of SD adder. SD Addition (x + y): Let c i, w i and z i be the intermediate carry, intermediate sum and sum at the ith SD digit (i = 0, 1,, p 1), respectively. For each digit, the following two steps are performed. Add1: When abs(x i ) = abs(y i ), z 1 (w i, c i ) = (0, (x i + y i ) div 2); when abs(x i ) abs(y i ), ( (x i + y i ), x i + y i ) if (x i + y i ) and (x (w i, c i ) = i 1 + y i 1 ) have the same sign (x i + y i, 0) otherwise Since x i, y i { 1, 0, 1}, w i, c i { 1, 0, 1}. Add2: z i = w i + c i 1. Thus, z = z p 2 p + z p 1 2 p 1 + + z 0 It is always true that 2c i + w i = x i + y i, and w i and c i 1 do not have the same sign so that z i { 1, 0, 1}. Thus the carry propagation is limited to one digit position in an SD adder as shown in Fig.4.. An SD number obtained from the SD addition should be converted into a binary number, then the rounding and the normalization is performed. For the postnormalization, the shift bit number is required. We can convert the SD number into the binary and calculate the number of shift bits in parallel. In SD Decoder, at first the SD number is divided into a positive and a negative binary number. Then an addition in the binary number representation is performed, where the negative number is represented by a two s complement. If the sign is negative, then the number is changed by making the complement at every bit. The shift bit number is obtained directly from the SD number in parallel with the SD-to-binary conversion. A method similar to a Leading Zero circuit is applied, performing the following procedure. Shift Amount: Consider a p-digit SD number, x = (x p 1, x p 2,, x 0 ) SD. Let t be the shift bit number. z 0

Proceedings of the 2007 WSEAS International Conference on Computer Engineering and Applications, Gold Coast, Australia, January 17-19, 2007 531 Table 1: Example of the proposed floating-point addition. R-Shifting: Sig A 1.01000001 R:0,S:0 Sig B aligned 0.01110001 R:1,S:1 SD Coding and : Sig A SD 1.01000001 R:0,S:0 Sig B SD aligned 0.0 1 1 1000 1 R: 1,S : 1 ADD2: c i 00.0 100000 1 0 w i 1.001 10000 1 1 Sig A+B 1.0 11 1000 1 1 1 SD Recoding: Positive(+) 01.00100000 10 Negative(-) 11.10101110 11 Sig (A + B) 00.11001111 01 Normalizatioin: Sig (A + B) N 1.10011111 Shift 1 bit Table 2: Performance of the proposed floating-point adder Area(Gates) Delay time [ns] Main circuits Binary SD Binary SD Swapper 118 18.93 Adder 228 534 26.90 5.93 RS Generator 244 244 8.34 8.34 Rounding 65 9.75 SD Coder 24 0.79 SD Decoder 208 26.91 Shiftamount 172 (24.39) Total 945 1449 92.90 71.79 the absolute value of a i. Thus, a p-digit radix-two SD number a is represented by a vector with 2p-bit length. t := 0 for i = p-1 downto 0, begin if x(i) = 0 then t := t + 1 else if x(i)=x(i-1) <> 0 then exit else for j = i-1 downto 0, if x(i)=-x(j-1)<>0 then t := t + 1 end else exit Shift-Amount circuit based on the above procedure is more complicated than the normal Leading Zero circuit. However, the delay time may be smaller than that of SD Decoder circuit. Example 1: Comsider a simple addition of A and B having the following binary expressions: { A = 2 3 1.01000001 B = 2 1 1.11000111. After shifting and generating the guard digits, R and S, the SD number representations for them can be obtained in SD Coder. Table 1 illustrates the arithmetic operation process, where 1 = 1. The shift bit number is 1. The round-to-nearest method is used for the normalization and the final addition result is A + B = 2 2 1.10011111. 4 On VLSI Implementation We specify a binary representation for a radix-two signed-digit a i, that is, a i (1) is the sign and a i (0) is a = (a p 1, a p 2,, a 0 ) SD = [a p 1 (1)a p 1 (0) a p 2 (1)a p 2 (0) a 0 (1)a 0 (0)] (3) For example, (1, 0, 0, 1) SD = [01000011]. Using the binary representation, the arithmetic operations in functional blocks including the SD arithmetic circuits are described and simulated. By using VHDL description codes, and the simulation and the logic circuit synthesis are performed by a synthesis software tool. In our experiments, the delay time is obtained by the simulation under the condition of 1µm CMOS gate array technology. In Table 2, the design results of the floating-point adder having IEEE standard data format is shown, based on the SD arithmetic method. For the performance evaluation, a floating-point adder with a binary arithmetic is also designed for comparing with the presented one. In the proposed floating-point adder, the swapper, which is used in the binary adder, is not required. Moreover, the addition is performed by the SD adder. Therefore, the delay time of the addition circuit is reduced by 77 %, comparing to the binary one. However, the presented floating-point adder needs about 1.5 times of hardware the binary one needs. 5 Conclusion A new architecture of a floating-point adder using the SD arithmetic has been proposed. Because the swapping operation is not required and the carry propagation is free in the SD addition, a high-speed floating-

Proceedings of the 2007 WSEAS International Conference on Computer Engineering and Applications, Gold Coast, Australia, January 17-19, 2007 532 point adder can be implemented by the presented method. The design and simulation results under the condition of 1µm CMOS gate array technology show that the SD floating-point adder has a shorter delay time by 77% comparing to that of a conversional floatingpoint adder. Our studies also focus on the error analysis of the proposed adder and the applications. References: [1] A. R. Omondi,Computer Arithmetic Systems: Algorithms, Architecture and Implementation, Prentice Hall, 1994. [2] L. Wanhammar,DSP Integrated Circuits, Academic Press, 1999. [3] Leading-zero Anticipatory Logic for Highspeed Floating Point Addition, Technical Report of IEICE DSP95-98 ICD95-147, 1995. [4] ANSI/IEEE Standard 754-1985 for Binary Floating-point Arithmetic, IEEE, 1985. [5] A.avizienis, Signed-digit number representations for fast parallel arithmetic, IRE Trans. Elect. Comput., EC-10, pp.389-400, 1961. [6] S.Wei and K.Shimizu, Residue Arithmetic Circuits Using a Signed-Digit Number Representation, IEEE Proc. of ISCAS 2000, pp.i24-i27, May 2000. [7] Woo-Chan Park, Design of the Floatingpoint Adder Supporting the Format Conversion and the Rounding Operations with Simultaneous Rounding Scheme, IEICE Trans.Inf.&Syst.,Vol.E85-D,No.8, 2002.