Double Precision Floating-Point Arithmetic on FPGAs

MITSUBISHI ELECTRIC ITE VI-Lab

Title: Double Precision Floating-Point Arithmetic on FPGAs
Internal Reference: VIL04-D098, Rev. A
Author: S. Paschalakis, P. Lee
Publication Date: Dec. 2003
Reference: Paschalakis, S., Lee, P., "Double Precision Floating-Point Arithmetic on FPGAs", in Proc. 2nd IEEE International Conference on Field-Programmable Technology (FPT'03), Tokyo, Japan, Dec. 2003.

Mitsubishi Electric ITE B.V. - Visual Information Laboratory. All rights reserved.

Double Precision Floating-Point Arithmetic on FPGAs

Stavros Paschalakis, Mitsubishi Electric ITE B.V., VI-Lab
Peter Lee, University of Kent at Canterbury

Abstract. We present low cost FPGA floating-point arithmetic circuits for all the common operations, i.e. addition/subtraction, multiplication, division and square root. Such circuits can be extremely useful in the FPGA implementation of complex systems that benefit from the reprogrammability and parallelism of the FPGA device but also require a general purpose arithmetic unit. While previous work has considered circuits for low precision floating-point formats, we consider the implementation of 64-bit double precision circuits that also provide rounding and exception handling.

1. Introduction

FPGAs have established themselves as invaluable tools in the implementation of high performance systems, combining the reprogrammability advantage of general purpose processors with the speed and parallel processing advantages of custom hardware. However, a problem that is frequently encountered with FPGA-based system-on-a-chip solutions, e.g. in signal processing or computer vision applications, is that the algorithmic frameworks of most real-world problems will, at some point, require general purpose arithmetic processing units, which are not standard components of the FPGA device. Various researchers have therefore examined the FPGA implementation of floating-point operators [1-7] to alleviate this problem. The earliest work considered the implementation of operators in low precision custom formats, e.g. 16 or 18 bits in total, in order to reduce the associated circuit costs and increase their speed. More recently, the increasing size of FPGA devices has allowed researchers to efficiently implement operators in the 32-bit single precision format, the most basic format of the ANSI/IEEE binary floating-point arithmetic standard [8], and also to consider features such as rounding and exception handling. In this paper we consider the implementation of FPGA floating-point arithmetic circuits for all the common operations, i.e. addition/subtraction, multiplication, division and square root, in the 64-bit double precision format, which is the format most commonly used in scientific computations. All the operators presented here provide rounding and exception handling. We have used these circuits in the implementation of a high-speed object recognition system which performs the extraction, normalisation and classification of moment descriptors and relies partly on custom parallel processing structures and partly on floating-point processing. A detailed description of that system is not given here but can be found in [9].

2. Floating-Point Numerical Representation

This section examines the double precision floating-point format only briefly; more details and discussions can be found in [8,10]. In a floating-point representation system of radix β, a real number N is represented in terms of a sign s, with s=0 or s=1, an exponent e and a significand S, so that N = (−1)^s · β^e · S. The IEEE standard specifies that double precision floating-point numbers comprise 64 bits, i.e. a sign bit (bit 63), 11 bits for the biased exponent E (bits 62 down to 52) and 52 bits for the fraction f (bits 51 down to 0). E is an unsigned biased number and the true exponent e is obtained as e = E − E_bias, with E_bias = 1023. The fraction f represents a number in the range [0,1) and the significand S is given by S = 1.f, which lies in the range [1,2).
The leading 1 of the significand is commonly referred to as the hidden bit. It is usually made explicit for operations, a process usually referred to as unpacking. When the MSB of the significand is 1 and is followed by the radix point, the representation is said to be normalised. For double precision numbers, the range of the unbiased exponent e is [−1022, 1023], which translates to a range of only [1, 2046] for the biased exponent E. The values E=0 and E=2047 are reserved for special quantities. The number zero is represented with E=0 and f=0; the hidden significand bit is also 0, not 1. Zero has a positive or negative sign like normal numbers. When E=0 and f≠0, the number has e = −1022 and a significand S = 0.f. The hidden bit is 0, not 1, and the sign is determined as for normal numbers. Such numbers are referred to as denormalised. Because of the additional complexity and cost, this part of the standard is not commonly implemented in hardware; for the same reason, our circuits do not support denormalised numbers. An exponent E=2047 with a fraction f=0 represents infinity.
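As an illustration of the packed format and the unpacking just described, the short Python sketch below (ours, not the paper's hardware; the function name and the integer scaling of the significand are our own choices) splits a 64-bit double into its sign, biased exponent and significand, making the hidden bit explicit:

```python
import struct

def unpack_double(x: float):
    """Split an IEEE 754 double into sign, biased exponent E and the
    53-bit significand with the hidden bit made explicit (scaled by 2**52).
    Illustrative sketch only, not the paper's unpacking circuit."""
    bits = struct.unpack('>Q', struct.pack('>d', x))[0]
    s = bits >> 63                      # sign bit (bit 63)
    E = (bits >> 52) & 0x7FF            # 11-bit biased exponent (bits 62..52)
    f = bits & ((1 << 52) - 1)          # 52-bit fraction (bits 51..0)
    if E == 0:
        S = f                           # zero/denormalised: hidden bit is 0
    else:
        S = (1 << 52) | f               # normal number: significand 1.f
    return s, E, S

# Example: -1.5 has s=1, E=1023 (so e=0) and S=1.1000...0 in binary
print(unpack_double(-1.5))  # (1, 1023, 0x18000000000000)
```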

Table 1. Double precision floating-point operator statistics on a XILINX XCV1000 Virtex FPGA device*.

                               Adder          Multiplier     Divider        Square Root
Slices                         675 (5.49%)    495 (4.03%)    343 (2.79%)    347 (2.82%)
Slice flip-flops
4-input LUTs                   1,…
Total equivalent gate count    10,334         8,426          6,464          5,366

* Device utilisation figures include I/O flip-flops: 194 for the adder, 193 for the multiplier, 193 for the divider and 129 for the square root.

The sign of infinity is determined as for normal numbers. Finally, an exponent E=2047 and a fraction f≠0 represent the symbolic unsigned entity NaN (Not a Number), which is produced by operations like 0/0 and 0×∞. The standard does not specify particular NaN values, allowing the implementation of multiple NaNs; here, only one NaN is provided, with E=2047 and a single fixed fraction value.

A note should also be made on the issue of rounding. Clearly, arithmetic operations on the significands can result in values which do not fit in the chosen representation and need to be rounded. The IEEE standard specifies four rounding modes. Here, only the default mode is considered, which is the most difficult to implement and is known as round-to-nearest-even (RNE). It is implemented by extending the relevant significands by three bits beyond their LSB (L) [10]. These bits are referred to, from most significant to least significant, as guard (G), round (R) and sticky (S). The first two are normal extension bits, while the sticky bit is the OR of all the bits below the R bit. Rounding up, by adding a 1 to the L bit, is performed when (i) G=1 and R∨S=1, for any L, or (ii) G=1 and R∨S=0, for L=1. In all other cases, truncation takes place.
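The RNE rule above can be stated compactly in code. The following is a minimal behavioural sketch (ours, not the paper's adder datapath), operating on a significand already extended by the G, R and S bits:

```python
def round_rne(sig56: int) -> int:
    """Round a 56-bit significand (53 result bits followed by G, R, S)
    to 53 bits with round-to-nearest-even, per the rule above."""
    kept = sig56 >> 3                   # bits MSB down to L
    L = kept & 1
    G = (sig56 >> 2) & 1
    R = (sig56 >> 1) & 1
    S = sig56 & 1
    round_up = G == 1 and ((R | S) == 1 or L == 1)
    return kept + round_up              # may overflow into a supernormal value

# tie cases: round_rne(0b101100) == 0b110 (L=1 rounds up to even),
#            round_rne(0b100100) == 0b100 (L=0 truncates)
```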
3. Addition/Subtraction

The main steps in the calculation of the sum or difference R of two floating-point numbers A and B are as follows. First, calculate the absolute difference of the two exponents, |E_A − E_B|, and set the exponent E_R of the result to the larger of the two. Then, shift right the significand which corresponds to the smaller exponent by |E_A − E_B| places. Add or subtract the two significands S_A and S_B, according to the effective operation, and make the result positive if it is negative. Normalise the resulting significand S_R, adjusting E_R as appropriate, and round, which may require E_R to be readjusted. Clearly, this procedure is quite generic and various modifications exist. Because addition is the most frequent operation in scientific computations, our circuit aims at a low implementation cost combined with a low latency. The circuit is not pipelined, so that key components may be reused, and has a fixed latency of three clock cycles. Its overall organisation is shown in Figure 1.

In the first cycle, the operands A and B are unpacked and checks for zero, infinity or NaN are performed. For now we can assume that neither operand is infinity or NaN. Based on the sign bits s_A and s_B and the original operation, the effective operation when both operands are made positive is determined, e.g. (−A) − (−B) becomes −(A − B), which results in the same effective operation but with a sign inversion of the result. From this point, it can be assumed that both A and B are positive. The absolute difference |E_A − E_B| is calculated using two cascaded adders and a multiplexer. Both adders are fast ripple-carry adders which use the dedicated carry logic of the device (here, "fast ripple-carry" will always refer to such adders). Implicit in this calculation is also the identification of the larger of the two exponents, which provisionally becomes the exponent E_R of the result.

The relation between the two operands A and B is determined from the relation between E_A and E_B and by comparing the significands S_A and S_B, which is required if E_A = E_B. This significand comparison deviates from the generic algorithm given earlier but has certain advantages, as will be seen. The significand comparator was implemented using seven 8-bit comparators that operate in parallel and an additional 8-bit comparator which processes their outputs; all the comparators employ the fast carry logic of the device. If B > A, the significands S_A and S_B are swapped. Both significands are then extended to 56 bits, i.e. by the G, R and S bits discussed earlier, and are stored in registers. Swapping S_A and S_B is equivalent to swapping A and B and making an adjustment to the sign s_R, and this swapping requires only multiplexers.

In the second cycle, the significand alignment shift is performed and the effective operation is carried out. The advantage of swapping the significands is that it is always S_B which undergoes the alignment shift; hence, only the S_B path needs a shifter. A modified 6-stage barrel shifter wired for alignment shifts performs the alignment. Each stage in the barrel shifter can clear the bits which rotate back from the MSB, to achieve the alignment, and each stage also calculates the OR of the bits that are shifted out and cleared. This allows the sticky bit S to be calculated as the OR of these six partial sticky bits along with the value already in the sticky bit position of the output pattern. The barrel shifter is organised so that the 32-bit stage is followed by the 16-bit stage and so on, so that the large ORs do not become a significant factor with respect to the speed of the shifter.
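Behaviourally, each alignment shift amounts to the following (a sketch under our own integer encoding of the 56-bit significand, not the paper's stage-by-stage barrel shifter):

```python
def align_significand(s_b: int, shift: int) -> int:
    """Alignment right shift of the 56-bit significand S_B, folding every
    bit shifted out into the sticky (lowest) position, as the modified
    barrel shifter does stage by stage."""
    if shift >= 56:
        return 1 if s_b else 0          # everything becomes sticky
    lost = s_b & ((1 << shift) - 1)     # bits shifted out and cleared
    out = s_b >> shift
    return out | (1 if lost else 0)     # OR the partial stickies into S

# e.g. align_significand(0b10100100, 4) == 0b1011: a lost 1 sets the sticky bit
```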

Figure 1. Floating-point adder

Figure 2. Leading-1 detection

The shifter output is the 56-bit significand S_B, aligned and with the correct values in the G, R and S positions. A fast ripple-carry adder then calculates either S_A + S_B or S_A − S_B, according to the effective operation. The advantage of the earlier significand comparison is that the result of this operation can never be negative, since S_A ≥ S_B after alignment. The result of this operation is the provisional significand S_R of the result, and is routed back onto the S_B path.

Clearly, S_R will not necessarily be normalised. More specifically, setting aside S_R = 0, there are three cases: (a) S_R is normalised, (b) S_R is subnormal and requires a left shift by one or more places, and (c) S_R is supernormal and requires a right shift by just one place. For the first two cases, a leading-1 detection component examines S_R and calculates the appropriate normalisation shift size, equal to the number of 0 bits above the leading 1. Figure 2 shows a straightforward design for a leading-1 detector: the 56-bit leading-1 detector comprises seven 8-bit components and some simple connecting logic. For the final normalisation case, i.e. a supernormal S_R in the range [2,4), the output of the leading-1 detector is overridden; this case is easily detected by monitoring the carry-out of the significand adder. Finally, if S_R = 0 then normalisation is obviously inapplicable; in this case, the leading-1 detector produces a zero shift size and S_R is treated as a normalised significand.
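A behavioural sketch of the leading-1 detector of Figure 2, with the seven 8-bit components and the connecting logic modelled in plain Python (the names are ours):

```python
def leading1_8bit(byte: int):
    """8-bit leading-1 detector: (found, number of 0s above the leading 1)."""
    for z in range(8):
        if (byte >> (7 - z)) & 1:
            return True, z
    return False, 8

def norm_shift_56bit(sig: int) -> int:
    """Normalisation shift size for a 56-bit significand, built from seven
    8-bit detectors plus connecting logic as in Figure 2. Returns the number
    of 0 bits above the leading 1, or 0 for a zero significand, matching the
    convention described above."""
    zeros = 0
    for k in range(6, -1, -1):          # most significant byte first
        found, z = leading1_8bit((sig >> (8 * k)) & 0xFF)
        zeros += z
        if found:
            return zeros
    return 0                            # sig == 0: zero shift size
```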
The normalisation of S_R takes place in the third and final cycle of the operation. S_R is normalised by the alignment barrel shifter, which is wired for right shifts. If S_R is already normalised, it passes straight through the shifter unmodified. If S_R is subnormal, it is right shifted and rotated by the appropriate number of places so that the required normalisation left shift is achieved. If S_R is supernormal, it is right shifted by one place, feeding a 1 in at the MSB, and the sticky bit S is recalculated. The output of the shifter is the normalised 56-bit S_R with the correct values in the G, R and S positions. Then, rounding is performed as discussed earlier; the rounding addition is performed by the significand adder, with S_R on the S_B path. The normalised and rounded S_R is given by the top 53 bits of the result, i.e. the MSB down to the L bit, of which the MSB will become hidden during packing. A complication that could arise is the rounding addition producing the supernormal significand S_R = 10.0…0. However, no actual normalisation needs to take place, because the bits from MSB−1 down to L already represent the correct fraction f_R for packing.

Each time S_R requires normalisation, the exponent E_R needs to be adjusted. This relies on the same hardware used for the processing of E_A and E_B in the first cycle. One adder performs the adjustment arising from the normalisation of S_R: if S_R is normalised, E_R passes through unmodified; if S_R is subnormal, E_R is reduced by the left shift amount required for the normalisation; if S_R is supernormal, E_R is incremented by 1. The second cascaded adder increments this result by 1, and the two results are multiplexed: if the rounded S_R is supernormal, the result of the second adder is the correct E_R; otherwise, the result of the first adder is the correct E_R. The calculation of the sign s_R is performed in the first cycle and is trivial, requiring only a few logic gates. Checks are also performed on the final E_R to detect an overflow or underflow, whereby R is forced to the appropriate patterns before packing. Another check is for an effective subtraction with A = B, whereby R is set to a positive zero, according to the IEEE standard. Finally, infinity or NaN operands result in an infinity or NaN value for R according to a set of rules; these are not included here but can be found in [9].
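The exponent adjustment described above, with its two cascaded adders and output multiplexer, reduces to the following sketch (a behavioural model under assumed signal names, not the paper's netlist):

```python
def adjust_exponent(e_r: int, norm_shift: int, supernormal: bool,
                    rounding_overflow: bool) -> int:
    """Exponent adjustment reusing the two cascaded exponent adders: the
    first applies the normalisation correction, the second adds one more,
    and a multiplexer selects according to whether rounding produced a
    supernormal significand."""
    first = e_r + 1 if supernormal else e_r - norm_shift   # first adder
    second = first + 1                                     # second cascaded adder
    return second if rounding_overflow else first          # output multiplexer

# e.g. a subnormal S_R shifted left by 2, no rounding overflow:
# adjust_exponent(1025, 2, False, False) == 1023
```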

Table 1 shows the implementation statistics of the double precision floating-point adder on a XILINX XCV1000 Virtex FPGA device of -6 speed grade. At 5.49% device usage, the circuit is quite small; the figures also include 194 I/O synchronisation registers. The circuit can operate at up to ~25MHz, the critical path lying on the significand processing path and comprising 41.1% logic and 58.9% routing delays. Since the design is not pipelined and has a latency of three cycles, this gives rise to a performance of ~8.33 MFLOPS. Obviously, the circuit is small enough to allow multiple instances to be incorporated in a single FPGA if required.

4. Multiplication

The most significant steps in the calculation of the product R of two numbers A and B are as follows. Calculate the exponent of the result as E_R = E_A + E_B − E_bias. Multiply S_A and S_B to obtain the significand S_R of the result. Normalise S_R, adjusting E_R as appropriate, and round, which may require E_R to be readjusted. After addition and subtraction, multiplication is the most frequent operation in scientific computing. Thus, our double precision floating-point multiplier aims at a low implementation cost while maintaining a low latency, relative to the scale of the significand multiplication involved. The circuit is not pipelined and has a fixed latency of ten cycles. Unlike the floating-point adder, which operates on a single clock, this circuit operates on two clocks: a primary clock (CLK1), to which the ten-cycle latency corresponds, and an internal secondary clock (CLK2), which is twice as fast as the primary clock and is used by the significand multiplier. Figure 3 shows the overall organisation of this circuit.

Figure 3. Floating-point multiplier

In the first CLK1 cycle, the operands A and B are unpacked and checks are performed for zero, infinity or NaN operands. For now we can assume that neither operand is zero, infinity or NaN. The sign s_R of the result is easily determined as the XOR of the signs s_A and s_B. From this point, A and B can be considered positive. As the first step in the calculation of E_R, the sum E_A + E_B is calculated using a fast ripple-carry adder. In the second CLK1 cycle, the excess bias is removed from E_A + E_B using the same adder that was used for the exponent processing of the previous cycle. This completes the calculation of E_R. The significand multiplication also begins in the second CLK1 cycle. Since both S_A and S_B are 53-bit normalised numbers, S_R will initially be 106 bits long and in the range [1,4). The significand multiplier is based on the modified Booth 2-bit parallel multiplier recoding method and has been implemented using a serial carry-save adder array and a fast ripple-carry adder for the assimilation of the final carry bits into the final sum bits.
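For reference, the radix-4 (2-bit) modified Booth recoding that such a multiplier is built on can be sketched as follows. This is the standard textbook recoding, shown because the paper gives no circuit-level detail of it; all names here are ours:

```python
def booth_digits(b: int, width: int):
    """Radix-4 modified Booth recoding of a width-bit unsigned multiplier
    into digits in {-2,-1,0,+1,+2}, one digit per 2 multiplier bits."""
    b <<= 1                              # append the implicit 0 below the LSB
    table = {0b000: 0, 0b001: 1, 0b010: 1, 0b011: 2,
             0b100: -2, 0b101: -1, 0b110: -1, 0b111: 0}
    for i in range(0, width + 1, 2):     # overlapping 3-bit windows
        yield table[(b >> i) & 0b111]

def booth_multiply(a: int, b: int, width: int) -> int:
    """Accumulate one Booth partial product per digit; the paper's hardware
    retires these through the carry-save adder array instead."""
    return sum((d * a) << (2 * n) for n, d in enumerate(booth_digits(b, width)))

assert booth_multiply(123456789, 987654321, 30) == 123456789 * 987654321
```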
With respect to the carry-save array, this contains two cascaded carry-save adders, which retire four sum and two carry bits in each CLK2 cycle. For the 53-bit S_A and S_B, 14 CLK2 cycles are required to produce all the sum and carry bits, i.e. until the end of the eighth CLK1 cycle. Alongside the carry-save array, a 4-bit fast ripple-carry adder performs a partial assimilation, i.e. it processes the four sum and two carry bits produced in the previous carry-save addition cycle, taking a carry-in from the previous partial assimilation. Taking into account the logic levels associated with the generation of the partial products required by Booth's method and the carry-save additions, the latency of this component is so small that it has no effect on the clocking of the carry-save array, while it greatly accelerates the subsequent carry-propagate addition. The results of these partial assimilations need not be stored; all that needs to be stored is their OR, since they would eventually have been ORed into the sticky bit S.

In the ninth CLK1 cycle, the final sum and carry bits produced by the carry-save array are added together, taking a carry-in from the last partial assimilation. Since S_R is in the range [1,4), it can be written as y1y2.y3y4y5…y104y105y106. Bits y1 to y56 are produced by this final carry assimilation. Bits y57 to y106 do not exist as such; all we have is their OR, which we can write as y57+, calculated during the partial assimilations of the previous cycles. Now, if y1=0, then S_R is normalised and the 56-bit S_R for rounding is given by y2.y3y4y5…y54y55y56y57+. If y1=1, S_R requires a 1-bit right shift for normalisation, and the final 56-bit S_R for rounding is given by y1.y2y3y4…y53y54y55y56+, where y56+ is the OR of y56 and y57+. The normalisation of a supernormal S_R is achieved using multiplexers switched by y1.
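The formation of the 56-bit significand from the product bits, as just described, can be sketched behaviourally. For clarity this sketch sees the whole 106-bit product, whereas the hardware only ever has y1..y56 plus the accumulated OR y57+:

```python
def normalise_product(p: int):
    """Form the 56-bit significand (53 bits + G,R,S) and the exponent
    increment from a 106-bit significand product in [1,4), following the
    y-bit description above."""
    y1 = p >> 105                                # integer bit y1
    hi = p >> 50                                 # y1..y56
    y57_plus = 1 if p & ((1 << 50) - 1) else 0   # OR of y57..y106
    if y1 == 0:                                  # product in [1,2): normalised
        return (hi << 1) | y57_plus, 0           # y2..y56 followed by y57+
    sticky = (hi & 1) | y57_plus                 # y56 folds into the sticky
    return (hi & ~1) | sticky, 1                 # y1..y55 + sticky; E_R += 1

# e.g. normalise_product(1 << 104) == (1 << 55, 0): 1.0 * 1.0 needs no shift
```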

If S_R is supernormal, E_R is adjusted, i.e. incremented by 1, using the same adder that was previously used for the exponent processing. This adjustment is not performed after it is determined that S_R is supernormal; rather, it is performed at the beginning of this cycle, and the adjusted E_R then either replaces the old value or is discarded. The rounding decision is also made in the ninth CLK1 cycle, without waiting for the final carry assimilation to finish. That is, a rounding decision is reached for both a normal and a supernormal S_R, and the correct decision is chosen once y1 has been calculated. In the tenth and final CLK1 cycle, the rounding of S_R is performed using the same fast ripple-carry adder that is used by the significand multiplier. The result of this addition is the final normalised and rounded 53-bit significand. As for the adder of the previous section, the complication that might arise is a supernormal S_R after rounding. As before, no actual normalisation is needed, because it would not change the fraction f_R for packing. The exponent E_R, however, is adjusted, i.e. incremented by 1, using the same exponent processing adder. This adjustment is performed at the beginning of this cycle, and the correct E_R is then chosen between the previous and the adjusted value based on the final S_R. Checks are also performed on the final E_R to detect an overflow or underflow, whereby R is forced to the correct bit patterns before packing. Finally, zero, infinity or NaN operands result in a zero, infinity or NaN value for R according to a simple set of rules.

Table 1 shows the implementation statistics of the double precision floating-point multiplier. The circuit is quite small, occupying only 4.03% of the device; the figures also include 193 I/O synchronisation registers. The primary clock CLK1 can be set to a frequency of up to ~40MHz, its critical path comprising 36.4% logic and 63.6% routing delays, while the secondary clock CLK2 can be set to a frequency of up to ~75MHz, its critical path comprising 36.8% logic and 63.2% routing delays. Since the circuit is not pipelined and has a fixed latency of ten CLK1 cycles, frequencies of 37MHz and 74MHz for CLK1 and CLK2 respectively give rise to a performance in the order of 3.7 MFLOPS. Obviously, the circuit is small enough to allow multiple instances to be placed in a single chip.

5. Division

In general, division is a much less frequent operation than those examined above. The most significant steps in the calculation of the quotient R of two numbers A (the dividend) and B (the divisor) are as follows. Calculate the exponent of the result as E_R = E_A − E_B + E_bias. Divide S_A by S_B to obtain the significand S_R. Normalise S_R, adjusting E_R as appropriate, and round, which may require E_R to be readjusted. Our double precision floating-point divider aims solely at a low implementation cost. A non-pipelined design is adopted, incorporating an economical significand divider, with a fixed latency of 60 clock cycles. Figure 4 shows the overall organisation of this circuit.

Figure 4. Floating-point divider

In the first cycle, the operands A and B are unpacked. For now, we can assume that neither operand is zero, infinity or NaN. The sign s_R of the result is the XOR of s_A and s_B. As the first step in the calculation of E_R, the difference E_A − E_B is calculated using a fast ripple-carry adder.
In the second cycle, the bias is added to E_A − E_B, using the same exponent processing adder of the previous cycle. This completes the calculation of E_R. The significand division also begins in the second cycle. The division algorithm employed here is the simple non-performing sequential algorithm, and the division proceeds as follows. First, the remainder of the division is set to the value of the dividend S_A. The divisor S_B is subtracted from the remainder. If the result is positive or zero, the MSB of the quotient S_R is 1 and this result replaces the remainder; otherwise, the MSB of S_R is 0 and the remainder is not replaced. The remainder is then shifted left by one place, and the divisor S_B is subtracted from it again for the calculation of the next bit down, and so on. The significand divider calculates one bit per cycle, and its main components are two registers, for the remainder and the divisor, a fast ripple-carry adder, and a shift register for S_R. The divider operates for 55 cycles, i.e. during cycles 2 to 56, and produces a 55-bit S_R, the two least significant bits being the G and R bits. In cycle 57, the sticky bit S is calculated as the OR of all the bits of the final remainder. Since both S_A and S_B are normalised, S_R will be in the range (0.5,2), i.e. if not already normalised, it will require a left shift by just one place.
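The quotient bit recurrence of the non-performing algorithm can be sketched as follows (a behavioural model, ours, with significands as plain integers; the paper's hardware produces exactly one of these bits per clock cycle):

```python
def divide_significands(s_a: int, s_b: int, bits: int = 55):
    """Non-performing sequential division: one quotient bit per step, the
    last two bits being G and R, plus the sticky bit taken from the final
    remainder."""
    rem, q = s_a, 0
    for _ in range(bits):
        trial = rem - s_b               # trial subtraction
        q <<= 1
        if trial >= 0:                  # subtraction succeeds:
            q |= 1                      #   quotient bit is 1
            rem = trial                 #   the difference replaces the remainder
        rem <<= 1                       # shift the remainder left one place
    return q, (1 if rem else 0)         # quotient bits and sticky S

# e.g. divide_significands(3, 2, bits=5) == (0b11000, 0): 1.1b/1.0b = 1.1000b
```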

This normalisation is also performed in cycle 57, and no additional hardware is required, since S_R is already stored in a shift register. If S_R requires normalisation, the exponent E_R is decremented by 1 in cycle 58; this exponent adjustment is performed using the same adder used for the exponent processing of the previous cycles. Also in cycle 58, S_R is transferred to the divisor register, which is connected to the adder of the significand divider. In cycle 59, the 56-bit S_R is rounded to 53 bits using the significand divider adder. For a supernormal S_R after rounding, no normalisation is actually required, but the exponent E_R is incremented by 1; this takes place in cycle 60, again using the same exponent processing adder. Checks are also performed on E_R for an overflow or underflow, whereby the result R is appropriately set before packing. As for zero, infinity and NaN operands, R will also be zero, infinity or NaN according to a simple set of rules.

Table 1 shows the implementation statistics of the double precision floating-point divider. The circuit is very small, occupying only 2.79% of the device, a figure which also includes 193 I/O synchronisation registers. The circuit can operate at up to ~60MHz, the critical path comprising 42.8% logic and 57.2% routing delays. Since the design is not pipelined and has a fixed latency of 60 clock cycles, this gives a performance in the order of 1 MFLOPS. As with the previous circuits, the implementation considered here is small enough to allow multiple instances to be incorporated in a single FPGA device if needed.

6. Square Root

The square root function is much less frequent than the previous operations. Thus, our floating-point square root circuit aims solely at a low implementation cost. A non-pipelined design is adopted, with a fixed latency of 59 cycles. Figure 5 shows the organisation of this circuit.

Figure 5. Floating-point square root

With the circuit considered here, the calculation of the square root R of the floating-point number A proceeds as follows. In the first cycle, the operand A is unpacked. For now we can assume that A is positive and not zero, infinity or NaN. The biased exponent E_R of the result is calculated directly from the biased exponent E_A using [9]

    E_R = (E_A + 1022)/2   if E_A is even (S_A is also left-shifted one place)
    E_R = (E_A + 1023)/2   if E_A is odd                                    (1)

E_R is calculated using a fast ripple-carry adder, while the division by 2 is just a matter of discarding the LSB of the numerator, which will always be even. The calculation of the significand S_R, i.e. of the square root of S_A, starts in the second clock cycle. According to (1), S_A will be in the range [1,4); consequently, its square root will be in the range [1,2), i.e. it will always be normalised. Denoting S_R as y1.y2y3y4…, each bit yn is calculated using [9]

    yn = 1 if (Xn − Tn) ≥ 0, and yn = 0 otherwise
    Xn+1 = 2(Xn − Tn) if yn = 1, or Xn+1 = 2Xn if yn = 0
    Tn+1 = y1.y2y3…yn01                                                     (2)

with X1 = S_A/2, T1 = 0.1 (binary) and n = 1, 2, 3, …

From (2) it can be seen that the adopted square root calculation algorithm is quite similar to the division method examined in the previous section. Based on this algorithm, a significand square root circuit calculates one bit of S_R per clock cycle. The main components of this part of the circuit are two registers, for Xn and Tn, and a fast ripple-carry adder.
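A behavioural sketch of recurrence (2), using exact rational arithmetic for clarity (ours, not the circuit; the hardware keeps Xn and Tn in fixed-point registers):

```python
from fractions import Fraction

def sqrt_significand(S_A: Fraction, bits: int = 55) -> int:
    """Bit-serial square root of a significand S_A in [1,4) following (2):
    X1 = S_A/2, T1 = 0.1 (binary), one result bit yn per step. Returns
    y = y1.y2... as an integer with `bits` bits, y1 first."""
    X = S_A / 2
    T = Fraction(1, 2)                  # T1 = 0.1b
    y = 0
    for n in range(1, bits + 1):
        d = X - T
        if d >= 0:
            y = (y << 1) | 1            # yn = 1
            X = 2 * d                   # Xn+1 = 2(Xn - Tn)
        else:
            y = y << 1                  # yn = 0
            X = 2 * X                   # Xn+1 = 2 Xn
        # Tn+1 = y1.y2...yn01: partial result with "01" appended
        T = Fraction(y, 1 << (n - 1)) + Fraction(1, 1 << (n + 1))
    return y                            # a nonzero final X feeds the sticky bit

# e.g. sqrt_significand(Fraction(9, 4), bits=8) == 0b11000000 (sqrt(10.01b) = 1.1b)
```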
The Tn register has been implemented so that each flip-flop has its own enable signal, which allows each individual bit yn to be stored in the correct position in the register, and which also controls the reset and set inputs of the next two flip-flops in the register so that the correct Tn is formed for each cycle of the process. After the significand square root calculation is complete, the contents of the Tn register form the significand S_R for rounding. The significand square root circuit thus operates for 55 cycles, i.e. during clock cycles 2 to 56, and produces a 55-bit S_R, the last two bits being the guard bit G and the round bit R. In cycle 57, the sticky bit S is calculated as the OR of all the bits of the final remainder of the significand square root calculation, and a 56-bit S_R for rounding is thus formulated. In cycle 58, S_R is rounded to 53 bits using the same adder that is used by the significand square root circuit.

For a supernormal S_R after rounding, no actual normalisation is required, but the exponent E_R is incremented by 1; this adjustment takes place in cycle 59 and is performed using the same adder used for the exponent processing during the first cycle. From the definition of E_R in (1), it is easy to see that an overflow or underflow can never occur. Finally, for zero, infinity, NaN or negative operands, simple rules apply with regard to the result R.

Table 1 shows the implementation statistics of our double precision floating-point square root circuit. The circuit is very small, occupying only 2.82% of the device, a figure which includes 129 I/O synchronisation registers. The circuit can operate at up to ~80MHz, the critical path comprising 53.0% logic and 47.0% routing delays. Since the circuit is not pipelined and has a fixed latency of 59 clock cycles, this gives rise to a performance in the order of 1.36 MFLOPS.

7. Discussion

We have presented low cost FPGA floating-point arithmetic circuits in the 64-bit double precision format and for all the common operations. Such circuits can be extremely useful in the FPGA-based implementation of complex systems that benefit from the reprogrammability and parallelism of the FPGA device but also require a general purpose arithmetic unit. In our case, these circuits were used in the implementation of a high-speed object recognition system which relies partly on custom parallel processing structures and partly on floating-point processing. The implementation statistics of the operators show that they are very economical in relation to contemporary FPGAs, which also facilitates placing multiple instances of the desired operators in the same chip. Although non-pipelined circuits were considered here to achieve low circuit costs, the adder and multiplier, with latencies of three and ten cycles respectively, are suitable for pipelining to increase their throughput. For the divider and square root operators, pipelining the existing designs may not be the most efficient option, and different designs and/or algorithms should be considered, e.g. a high radix SRT algorithm for division [6]. Clearly, there are significant speed and circuit size tradeoffs to consider when deciding on the range and precision of FPGA floating-point arithmetic circuits. A direct comparison with other floating-point unit implementations is very difficult to perform, not only because of floating-point format differences, but also due to other circuit characteristics; e.g. all the circuits presented here incorporate I/O registers, which would eventually be absorbed by the surrounding hardware. As an indication, and with some caution, the double precision floating-point adder presented here occupies approximately the same number of slices as the single precision floating-point adder of [7]. That circuit has a higher latency than the adder presented here, but also a higher maximum clock speed, which results in both circuits having approximately the same performance; the adder of [7], however, is fully pipelined and has a much higher peak throughput. In conclusion, the circuits presented here provide an indication of the costs of FPGA floating-point operators using a long format. The choice of floating-point format for any given problem ultimately rests with the designer.

References
1. Shirazi, N., Walters, A., Athanas, P., "Quantitative Analysis of Floating Point Arithmetic on FPGA Based Custom Computing Machines", in Proc. IEEE Symposium on FPGAs for Custom Computing Machines, 1995.
2. Loucas, L., Cook, T.A., Johnson, W.H., "Implementation of IEEE Single Precision Floating Point Addition and Multiplication on FPGAs", in Proc. IEEE Symposium on FPGAs for Custom Computing Machines, 1996.
3. Li, Y., Chu, W., "Implementation of Single Precision Floating Point Square Root on FPGAs", in Proc. 5th IEEE Symposium on Field-Programmable Custom Computing Machines, 1997.
4. Ligon, W.B., McMillan, S., Monn, G., Schoonover, K., Stivers, F., Underwood, K.D., "A Re-Evaluation of the Practicality of Floating Point Operations on FPGAs", in Proc. IEEE Symposium on FPGAs for Custom Computing Machines, 1998.
5. Belanovic, P., Leeser, M., "A Library of Parameterized Floating-Point Modules and Their Use", in Proc. Field Programmable Logic and Applications, 2002.
6. Wang, X., Nelson, B.E., "Tradeoffs of Designing Floating-Point Division and Square Root on Virtex FPGAs", in Proc. 11th IEEE Symposium on Field-Programmable Custom Computing Machines, 2003.
7. Digital Core Design, Alliance Core Data Sheets for Floating-Point Arithmetic, 2001.
8. ANSI/IEEE Std 754-1985, IEEE Standard for Binary Floating-Point Arithmetic, 1985.
9. Goldberg, D., "What Every Computer Scientist Should Know About Floating-Point Arithmetic", ACM Computing Surveys, vol. 23, no. 1, 1991.
10. Paschalakis, S., Moment Methods and Hardware Architectures for High Speed Binary, Greyscale and Colour Pattern Recognition, Ph.D. Thesis, Department of Electronics, University of Kent at Canterbury, UK, 2001.


More information

Chapter 2: Number Systems

Chapter 2: Number Systems Chapter 2: Number Systems Logic circuits are used to generate and transmit 1s and 0s to compute and convey information. This two-valued number system is called binary. As presented earlier, there are many

More information

Chapter 2 Float Point Arithmetic. Real Numbers in Decimal Notation. Real Numbers in Decimal Notation

Chapter 2 Float Point Arithmetic. Real Numbers in Decimal Notation. Real Numbers in Decimal Notation Chapter 2 Float Point Arithmetic Topics IEEE Floating Point Standard Fractional Binary Numbers Rounding Floating Point Operations Mathematical properties Real Numbers in Decimal Notation Representation

More information

VHDL IMPLEMENTATION OF IEEE 754 FLOATING POINT UNIT

VHDL IMPLEMENTATION OF IEEE 754 FLOATING POINT UNIT VHDL IMPLEMENTATION OF IEEE 754 FLOATING POINT UNIT Ms. Anjana Sasidharan Student, Vivekanandha College of Engineering for Women, Namakkal, Tamilnadu, India. Abstract IEEE-754 specifies interchange and

More information

Computer Arithmetic andveriloghdl Fundamentals

Computer Arithmetic andveriloghdl Fundamentals Computer Arithmetic andveriloghdl Fundamentals Joseph Cavanagh Santa Clara University California, USA ( r ec) CRC Press vf J TayiorS«. Francis Group ^"*" "^ Boca Raton London New York CRC Press is an imprint

More information

Design and Implementation of Low power High Speed. Floating Point Adder and Multiplier

Design and Implementation of Low power High Speed. Floating Point Adder and Multiplier Design and Implementation of Low power High Speed Floating Point Adder and Multiplier Thesis report submitted towards the partial fulfillment of requirements for the award of the degree of Master of Technology

More information

FPGA Implementation of Multiplier for Floating- Point Numbers Based on IEEE Standard

FPGA Implementation of Multiplier for Floating- Point Numbers Based on IEEE Standard FPGA Implementation of Multiplier for Floating- Point Numbers Based on IEEE 754-2008 Standard M. Shyamsi, M. I. Ibrahimy, S. M. A. Motakabber and M. R. Ahsan Dept. of Electrical and Computer Engineering

More information

Floating-Point Numbers in Digital Computers

Floating-Point Numbers in Digital Computers POLYTECHNIC UNIVERSITY Department of Computer and Information Science Floating-Point Numbers in Digital Computers K. Ming Leung Abstract: We explain how floating-point numbers are represented and stored

More information

Floating Point Unit Generation and Evaluation for FPGAs

Floating Point Unit Generation and Evaluation for FPGAs Floating Point Unit Generation and Evaluation for FPGAs Jian Liang and Russell Tessier Department of Electrical and Computer Engineering University of Massachusetts Amherst, MA 13 {jliang,tessier}@ecs.umass.edu

More information

Chapter 4 Section 2 Operations on Decimals

Chapter 4 Section 2 Operations on Decimals Chapter 4 Section 2 Operations on Decimals Addition and subtraction of decimals To add decimals, write the numbers so that the decimal points are on a vertical line. Add as you would with whole numbers.

More information

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Digital Computer Arithmetic ECE 666

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Digital Computer Arithmetic ECE 666 UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering Digital Computer Arithmetic ECE 666 Part 6b High-Speed Multiplication - II Israel Koren ECE666/Koren Part.6b.1 Accumulating the Partial

More information

EE 109 Unit 19. IEEE 754 Floating Point Representation Floating Point Arithmetic

EE 109 Unit 19. IEEE 754 Floating Point Representation Floating Point Arithmetic 1 EE 109 Unit 19 IEEE 754 Floating Point Representation Floating Point Arithmetic 2 Floating Point Used to represent very small numbers (fractions) and very large numbers Avogadro s Number: +6.0247 * 10

More information

Floating-Point Numbers in Digital Computers

Floating-Point Numbers in Digital Computers POLYTECHNIC UNIVERSITY Department of Computer and Information Science Floating-Point Numbers in Digital Computers K. Ming Leung Abstract: We explain how floating-point numbers are represented and stored

More information

Run-Time Reconfigurable multi-precision floating point multiplier design based on pipelining technique using Karatsuba-Urdhva algorithms

Run-Time Reconfigurable multi-precision floating point multiplier design based on pipelining technique using Karatsuba-Urdhva algorithms Run-Time Reconfigurable multi-precision floating point multiplier design based on pipelining technique using Karatsuba-Urdhva algorithms 1 Shruthi K.H., 2 Rekha M.G. 1M.Tech, VLSI design and embedded system,

More information

Floating point. Today! IEEE Floating Point Standard! Rounding! Floating Point Operations! Mathematical properties. Next time. !

Floating point. Today! IEEE Floating Point Standard! Rounding! Floating Point Operations! Mathematical properties. Next time. ! Floating point Today! IEEE Floating Point Standard! Rounding! Floating Point Operations! Mathematical properties Next time! The machine model Chris Riesbeck, Fall 2011 Checkpoint IEEE Floating point Floating

More information

Inf2C - Computer Systems Lecture 2 Data Representation

Inf2C - Computer Systems Lecture 2 Data Representation Inf2C - Computer Systems Lecture 2 Data Representation Boris Grot School of Informatics University of Edinburgh Last lecture Moore s law Types of computer systems Computer components Computer system stack

More information

Double Precision Floating Point Core VHDL

Double Precision Floating Point Core VHDL Double Precision Floating Point Core VHDL Introduction This document describes the VHDL double precision floating point core, posted at www.opencores.org. The Verilog version of the code is in folder fpu_double,

More information

Signed umbers. Sign/Magnitude otation

Signed umbers. Sign/Magnitude otation Signed umbers So far we have discussed unsigned number representations. In particular, we have looked at the binary number system and shorthand methods in representing binary codes. With m binary digits,

More information

FPGA Matrix Multiplier

FPGA Matrix Multiplier FPGA Matrix Multiplier In Hwan Baek Henri Samueli School of Engineering and Applied Science University of California Los Angeles Los Angeles, California Email: chris.inhwan.baek@gmail.com David Boeck Henri

More information

Pipelining Of Double Precision Floating Point Division And Square Root Operations On Fieldprogrammable

Pipelining Of Double Precision Floating Point Division And Square Root Operations On Fieldprogrammable University of Central Florida Electronic Theses and Dissertations Masters Thesis (Open Access) Pipelining Of Double Precision Floating Point Division And Square Root Operations On Fieldprogrammable Gate

More information

Floating Point Representation in Computers

Floating Point Representation in Computers Floating Point Representation in Computers Floating Point Numbers - What are they? Floating Point Representation Floating Point Operations Where Things can go wrong What are Floating Point Numbers? Any

More information

Implementation of a High Speed Binary Floating point Multiplier Using Dadda Algorithm in FPGA

Implementation of a High Speed Binary Floating point Multiplier Using Dadda Algorithm in FPGA Implementation of a High Speed Binary Floating point Multiplier Using Dadda Algorithm in FPGA Ms.Komal N.Batra 1, Prof. Ashish B. Kharate 2 1 PG Student, ENTC Department, HVPM S College of Engineering

More information

Let s put together a Manual Processor

Let s put together a Manual Processor Lecture 14 Let s put together a Manual Processor Hardware Lecture 14 Slide 1 The processor Inside every computer there is at least one processor which can take an instruction, some operands and produce

More information

VLSI Implementation of a High Speed Single Precision Floating Point Unit Using Verilog

VLSI Implementation of a High Speed Single Precision Floating Point Unit Using Verilog VLSI Implementation of a High Speed Single Precision Floating Point Unit Using Verilog Ushasree G ECE-VLSI, VIT University, Vellore- 632014, Tamil Nadu, India usha010@gmail.com Abstract- To represent very

More information

Math 340 Fall 2014, Victor Matveev. Binary system, round-off errors, loss of significance, and double precision accuracy.

Math 340 Fall 2014, Victor Matveev. Binary system, round-off errors, loss of significance, and double precision accuracy. Math 340 Fall 2014, Victor Matveev Binary system, round-off errors, loss of significance, and double precision accuracy. 1. Bits and the binary number system A bit is one digit in a binary representation

More information

Digital Fundamentals

Digital Fundamentals Digital Fundamentals Tenth Edition Floyd Chapter 1 Modified by Yuttapong Jiraraksopakun Floyd, Digital Fundamentals, 10 th 2008 Pearson Education ENE, KMUTT ed 2009 Analog Quantities Most natural quantities

More information

Arithmetic Logic Unit

Arithmetic Logic Unit Arithmetic Logic Unit A.R. Hurson Department of Computer Science Missouri University of Science & Technology A.R. Hurson 1 Arithmetic Logic Unit It is a functional bo designed to perform "basic" arithmetic,

More information

COMP 303 Computer Architecture Lecture 6

COMP 303 Computer Architecture Lecture 6 COMP 303 Computer Architecture Lecture 6 MULTIPLY (unsigned) Paper and pencil example (unsigned): Multiplicand 1000 = 8 Multiplier x 1001 = 9 1000 0000 0000 1000 Product 01001000 = 72 n bits x n bits =

More information