EE 260: Introduction to Digital Design Arithmetic II Yao Zheng Department of Electrical Engineering University of Hawaiʻi at Mānoa Overview n Integer multiplication n Booth s algorithm n Integer division n Restoring, non-restoring n Floating point representation n Floating point addition, multiplication Multiplication Array Multiplier n Flashback to 3 rd grade n Multiplier n Multiplicand n Partial products n Final sum n Base 10: 8 x 9 = 72 n PP: 8 + 0 + 0 + 64 = 72 n How wide is the result? n log(n x m) = log(n) + log(m) n 32b x 32b = 64b result 1 0 0 1 0 0 n Adding all partial products simultaneously using an array of basic cells S in C in A i B j Full Adder C out S out A i,b j 16-bit Array Multiplier Instead: Multicycle Multipliers l l Conceptually straightforward Fairly expensive hardware, integer multiplies relatively rare l Mostly used in array address calc: replace with shifts n Combinational multipliers n Very hardware-intensive n Integer multiply relatively rare n Not the right place to spend resources n Multicycle multipliers n Iterate through bits of multiplier n Conditionally add shifted multiplicand Chapter 8: Arithmetic II 1
Multiplier 1 0 0 Multiplier 1 0 0 Start Multiplier0 = 1 1. Test Multiplier0 = 0 Multiplier0 1a. Add multiplicand to product and place the result in Product register 2. Shift the Multiplicand register left 1 bit 3. Shift the Multiplier register right 1 bit No: < 32 repetitions 32nd repetition? Yes: 32 repetitions Done Multiplier Improvements n Do we really need a 64-bit adder? n No, since low-order bits are not involved n Hence, just use a 32-bit adder n Shift product register right on every step n Do we really need a separate multiplier register? n No, since low-order bits of 64-bit product are initially unused n Hence, just store multiplier there initially Multiplier Multiplicand 32 bits 32-bit ALU Shift right Product Write 64 bits 1 0 0 Control test Multiplier 1 0 0 Start Product0 = 1 1. Test Product0 = 0 Product0 1a. Add multiplicand to the left half of the product and place the result in the left half of the Product register 2. Shift the Product register right 1 bit No: < 32 repetitions 32nd repetition? Yes: 32 repetitions Done n Recall Signed Multiplication n For p = a x b, if a<0 or b<0, then p < 0 n If a<0 and b<0, then p > 0 n Hence sign(p) = sign(a) xor sign(b) n Hence n Convert multiplier, multiplicand to positive number with (n-1) bits n Multiply positive numbers n Compute sign, convert product accordingly n Or, n Perform sign-extension on shifts for prev. design n Right answer falls out Chapter 8: Arithmetic II 2
Signed Multiplication Booth s Encoding n Recall grade school trick n When multiplying by 9: n Multiply by 10 (easy, just shift digits left) n Subtract once n E.g. n 123454 x 9 = 123454 x (10 1) = 1234540 123454 n Converts addition of six partial products to one shift and one subtraction n Booth s algorithm applies same principle n Except no 9 in binary, just 1 and 0 n So, it s actually easier! Booth s Encoding Booth s Algorithm n Search for a run of 1 bits in the multiplier n E.g. 0110 has a run of 2 1 bits in the middle n Multiplying by 0110 (6 in decimal) is equivalent to multiplying by 8 and subtracting twice, since 6 x m = (8 2) x m = 8m 2m n Hence, iterate right to left and: n Subtract multiplicand from product at first 1 n Add multiplicand to product after last 1 n Don t do either for 1 bits in the middle Current bit Bit to right Explanation Example Operation 1 0 Begins run of 1 00001111000 Subtract 1 1 Middle of run of 1 00001111000 Nothing 0 1 End of a run of 1 00001111000 Add 0 0 Middle of a run of 0 00001111000 Nothing Booth s Encoding n Really just a new way to encode numbers n Normally positionally weighted as 2 n n With Booth, each position has a sign bit n Can be extended to multiple bits 0 1 1 0 Binary +1 0-1 0 1-bit Booth +2-2 2-bit Booth 2-bits/cycle Booth Multiplier n For every pair of multiplier bits n If Booth s encoding is -2 n Shift multiplicand left by 1, then subtract n If Booth s encoding is -1 n Subtract n If Booth s encoding is 0 n Do nothing n If Booth s encoding is 1 n Add n If Booth s encoding is 2 n Shift multiplicand left by 1, then add Chapter 8: Arithmetic II 3
2 bits/cycle Booth s 1 bit Booth 00 +0 01 +M; 10 -M; 11 +0 Current Previous Operation Explanation 00 0 +0;shift 2 [00] => +0, [00] => +0; 2x(+0)+(+0)=+0 00 1 +M; shift 2 [00] => +0, [01] => +M; 2x(+0)+(+M)=+M 01 0 +M; shift 2 [01] => +M, [10] => -M; 2x(+M)+(-M)=+M 01 1 +2M; shift 2 [01] => +M, [11] => +0; 2x(+M)+(+0)=+2M 10 0-2M; shift 2 [10] => -M, [00] => +0; 2x(-M)+(+0)=-2M 10 1 -M; shift 2 [10] => -M, [01] => +M; 2x(-M)+(+M)=-M 11 0 -M; shift 2 [11] => +0, [10] => -M; 2x(+0)+(-M)=-M 11 1 +0; shift 2 [11] => +0, [11] => +0; 2x(+0)+(+0)=+0 Booth s Example n Negative multiplicand: -6 x 6 = -36 1010 x 0110, 0110 in Booth s encoding is +0-0 Hence: 1111 1010 x 0 0000 0000 1111 0100 x 1 0000 1100 1110 1000 x 0 0000 0000 1101 0000 x +1 1101 0000 Final Sum: 1101 1100 (-36) Booth s Example n Negative multiplier: -6 x -2 = 12 1010 x 1110, 1110 in Booth s encoding is 00-0 Hence: 1111 1010 x 0 0000 0000 1111 0100 x 1 0000 1100 1110 1000 x 0 0000 0000 1101 0000 x 0 0000 0000 Final Sum: 0000 1100 (12) Integer Division n Again, back to 3 rd grade (74 8 = 9 rem 2) 1 0 0 1 Quotient Divisor 1 0 0 1 0 1 0 Dividend - 1 0 1 0 1 1 0 1 0-1 0 Remainder Integer Division Integer Division n How does hardware know if division fits? n Condition: if remainder divisor n Use subtraction: (remainder divisor) 0 n OK, so if it fits, what do we do? n Remainder n+1 = Remainder n divisor n What if it doesn t fit? n Have to restore original remainder n Called restoring division Divisor Shift right 64 bits 64-bit ALU Remainder Write 64 bits Divisor Control test Quotient Shift left 32 bits 1 0 0 1 Quotient 1 0 0 1 0 1 0 Dividend - 1 0 1 0 1 1 0 1 0-1 0 Remainder Chapter 8: Arithmetic II 4
Integer Division Start 1. Subtract the Divisor register from the Remainder register and place the result in the Remainder register Remainder > 0 Test Remainder Remainder < 0 Integer Division Divisor 2a. Shift the Quotient register to the left, 2b. Restore the original value by adding setting the new rightmost bit to 1 the Divisor register to the Remainder register and place the sum in the Remainder register. Also shift the Quotient register to the left, setting the new least significant bit to 0 1 0 0 1 Quotient 1 0 0 1 0 1 0 Dividend - 3. Shift the Divisor register right 1 bit 1 0 1 0 1 1 0 1 0 No: < 33 repetitions 33rd repetition? - 1 0 Remainder Yes: 33 repetitions Done Division Improvements n Hardware similar to multiplier n Can store quotient in remainder register n Only need 32b ALU n Shift remainder left vs. divisor right Improved Divider Divisor 32 bits 32-bit ALU Shift right Remainder Shift left Write 64 bits Control test Division Improvements n Skip first subtract n Can t shift 1 into quotient anyway n Hence shift first, then subtract n Undo extra shift at end Improved Divider Start 1. Shift the Remainder register left 1 bit 2. Subtract the Divisor register from the left half of the Remainder register and place the result in the left half of the Remainder register Remainder > 0 Test Remainder Remainder < 0 3a. Shift the Remainder register to the left, setting the new rightmost bit to 1 3b. Restore the original value by adding the Divisor register to the left half of the Remainder register and place the sum in the left half of the Remainder register. Also shift the Remainder register to the left, setting the new rightmost bit to 0 32nd repetition? No: < 32 repetitions Yes: 32 repetitions Done. Shift left half of Remainder right 1 bit Chapter 8: Arithmetic II 5
Improved Divider Further Improvements n Division still takes: n 2 ALU cycles per bit position n 1 to check for divisibility (subtract) n One to restore (if needed) n Can reduce to 1 cycle per bit n Called non-restoring division n Avoids restore of remainder when test fails Non-restoring Division n Consider remainder to be restored: R i = R i-1 d < 0 n Since R i is negative, we must restore it, right? n Well, maybe not. Consider next step i+1: R i+1 = 2 x (R i ) d = 2 x (R i d) + d n Hence, we can compute R i+1 by not restoring R i, and adding d instead of subtracting d n Same value for R i+1 results n Throughput of 1 bit per cycle NR Division Example Iteration Step Divisor Remainder 0 Initial values 0010 0000 0111 Shift rem left 1 0010 0000 1110 1 2: Rem = Rem - Div 0010 1110 1110 3b: Rem < 0 (add next), sll 0 0010 1101 1100 2 2: Rem = Rem + Div 0010 1111 1100 3b: Rem < 0 (add next), sll 0 0010 1111 1000 3 2: Rem = Rem + Div 0010 0001 1000 3a: Rem > 0 (sub next), sll 1 0010 0011 0001 4 Rem = Rem Div 0010 0001 0001 Rem > 0 (sub next), sll 1 0010 0010 0011 Shift Rem right by 1 0010 0001 0011 Floating Point n Want to represent larger range of numbers n Fixed point (integer): -2 n-1 (2 n-1 1) n How? Sacrifice precision for range by providing exponent to shift relative weight of each bit position n Similar to scientific notation: 3.14159 x 10 23 n Cannot specify every discrete value in the range, but can span much larger range Floating Point n Still use a fixed number of bits n Sign bit S, exponent E, significand F n Value: (-1) S x F x 2 E n IEEE 754 standard S E F Size Exponent Significand Range Single precision 32b 8b 23b 2x10 +/-38 Double precision 64b 11b 52b 2x10 +/-308 Chapter 8: Arithmetic II 6
Floating Point Exponent n Exponent specified in biased or excess notation n Why? n To simplify sorting n Sign bit is MSB to ease sorting n 2 s complement exponent: n Large numbers have positive exponent n Small numbers have negative exponent n Sorting does not follow naturally Excess or Biased Exponent Exponent 2 s Compl Excess-127-127 1000 0001 0000 0000-126 1000 0010 0000 0001 +127 0111 1111 1111 1110 n Value: (-1) S x F x 2 (E-bias) n SP: bias is 127 n DP: bias is 1023 Floating Point Normalization n S,E,F representation allows more than one representation for a particular value, e.g. 1.0 x 10 5 = 0.1 x 10 6 = 10.0 x 10 4 n This makes comparison operations difficult n Prefer to have a single representation n Hence, normalize by convention: n Only one digit to the left of the floating point n In binary, that digit must be a 1 n Since leading 1 is implicit, no need to store it n Hence, obtain one extra bit of precision for free FP Overflow/Underflow n FP Overflow n Analogous to integer overflow n Result is too big to represent n Means exponent is too big n FP Underflow n Result is too small to represent n Means exponent is too small (too negative) n Both can raise an exception under IEEE754 IEEE754 Special Cases Single Precision Double Precision Value Exponent Significand Exponent Significand 0 0 nonzero 0 nonzero denormalized 1-254 anything 1-2046 anything fp number 255 0 2047 0 infinity 255 nonzero 2047 nonzero NaN (Not a Number) n Rounding is important FP Rounding n Small errors accumulate over billions of ops n FP rounding hardware helps n Compute extra guard bit beyond 23/52 bits n Further, compute additional round bit beyond that n Multiply may result in leading 0 bit, normalize shifts guard bit into product, leaving round bit for rounding n Finally, keep sticky bit that is set whenever 1 bits are lost to the right n Differentiates between 0.5 and 0.500000000001 Chapter 8: Arithmetic II 7
Floating Point Addition n Just like grade school n First, align decimal points n Then, add significands n Finally, normalize result n Example 9.997 x 10 2 9.997000 x 10 2 FP Adder Sign Exponent Significand Sign Exponent Significand Small ALU Exponent difference 0 1 0 1 1 0 Control Shift right Big ALU 0 1 0 1 Compare exponents Shift smaller number right Add 4.631 x 10-1 0.004631 x 10 2 Increment or decrement Shift left or right Normalize Sum 10.001631 x 10 2 Normalized 1.0001631 x 10 3 Rounding hardware Round Sign Exponent Significand FP Multiplication Carry Save Adder n Sign: P s = A s xor B s n Exponent: P E = A E + B E n Due to bias/excess, must subtract bias e = e1 + e2 E = e + 1023 = e1 + e2 + 1023 E = (E1 1023) + (E2 1023) + 1023 E = E1 + E2 1023 n Significand: P F = A F x B F n Standard integer multiply (23b or 52b + g/r/s bits) n Use Wallace tree of s to sum partial products A + B => S Save carries A + B => S, C out Use C in A + B + C => S1, S2 (3# to 2# in parallel) Used in combinational multipliers by building a Wallace Tree c b a c s Adding Up Many Bits f e d c b a S 0 FP Multiplication n Compute sign, exponent, significand n Normalize n Shift left, right by 1 n Check for overflow, underflow n Round n Normalize again (if necessary) S 2 S 1 Chapter 8: Arithmetic II 8
Logical Operations n Bitwise AND, OR, XOR, NOR n Implement w/ 32 gates in parallel n Shifts and rotates n rol => rotate left (MSB->LSB) n ror => rotate right (LSB->MSB) n sll -> shift left logical (0->LSB) n srl -> shift right logical (0->LSB) n sra -> shift right arithmetic (old MSB->new MSB) Shifter d 7 d 6 d 1 d 0 d 0 0 Mux shift based on 0 th bit by 0 or 1 stage0 s0 7 s0 0 s0 7 s0 6 s0 0 s0 2 s0 1 0 s0 0 0 Mux shift based on 1 st bit by 0 or 2 stage1 s1 7 s1 0 s17 s1 s1 3 s1 4 s1 0 s1 3 0 0 0 Mux shift based on 2 nd bit by 0 or 4 dout dout 7 dout 7 shamt0 shamt1 shamt2 Shifter E.g., Shift left logical for d<7:0> and shamt<2:0> Using 2-1 Muxes called Mux(select, in 0, in 1 ) stage0<7:0> = Mux(shamt<0>,d<7:0>, 0 d<7:1>) stage1<7:0> = Mux(shamt<1>, stage0<7:0>, 00 stage0<6:2>) dout<7:0) = Mux(shamt<2>, stage1<7:0>, 0000 stage1<3:0>) For Barrel shifter used wider muxes n Integer multiply n Combinational n Multicycle n Booth s algorithm n Integer divide n Multicycle restoring n Non-restoring Summary Summary n Floating point representation n Normalization n Overflow, underflow n Rounding n Floating point add n Floating point multiply Chapter 8: Arithmetic II 9