EECS15 - Digital Design
Lecture 13 - Combinational Logic & Arithmetic Circuits Part 3
October 8, 22
John Wawrzynek
Fall 22 EECS15 - Lec13-cla3 Page 1

Multiplication

                      a3    a2    a1    a0     Multiplicand
                 X    b3    b2    b1    b0     Multiplier
                    a3b0  a2b0  a1b0  a0b0
              a3b1  a2b1  a1b1  a0b1           Partial
        a3b2  a2b2  a1b2  a0b2                 products
  a3b3  a2b3  a1b3  a0b3
  ...                     a1b0+a0b1   a0b0     Product

Many different circuits exist for multiplication. Each one has a different balance between speed (performance) and amount of logic (cost).
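The partial-product layout above can be sketched in software. This is only an illustration, not a circuit description: the function name `partial_products` and the 4-bit default width are assumptions made for the example.

```python
def partial_products(a: int, b: int, n: int = 4) -> list:
    """Return the n partial products of a * b, each shifted into its column.

    Hypothetical helper for illustration; n is the operand width in bits.
    """
    rows = []
    for j in range(n):
        b_j = (b >> j) & 1               # bit j of the multiplier
        # Row j is A ANDed with b_j, weighted by 2**j (shifted left j places).
        rows.append((a if b_j else 0) << j)
    return rows


rows = partial_products(0b1011, 0b0110)  # 11 * 6
product = sum(rows)                      # summing the rows gives the product
```

Each multiplier circuit in the rest of the lecture is a different way of organizing this final sum.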
Shift and Add Multiplier

[Figure: n-bit adder feeding the P and B n-bit shift registers; A is an n-bit register.]

Sums each partial product, one at a time.
In binary, each partial product is a shifted version of A or 0.

Control Algorithm:
1. P ← 0, A ← multiplicand, B ← multiplier
2. If LSB of B == 1 then add A to P, else add 0
3. Shift [P][B] right 1
4. Repeat steps 2 and 3 n-1 times.
5. [P][B] has product.

Cost ∝ n, T = n clock cycles.
What is the critical path for determining the min clock period?

Shift and Add Multiplier

Signed Multiplication:
Remember for 2's complement numbers MSB has negative weight:

$$X = \sum_{i=0}^{n-2} x_i 2^i - x_{n-1} 2^{n-1}$$

ex: $-6 = 11010_2 = 0 \cdot 2^0 + 1 \cdot 2^1 + 0 \cdot 2^2 + 1 \cdot 2^3 - 1 \cdot 2^4 = 0 + 2 + 0 + 8 - 16 = -6$

Therefore for multiplication:
a) subtract final partial product
b) sign-extend partial products

Modifications to shift & add circuit:
a) adder/subtractor
b) sign-extender on P shifter register
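A minimal software sketch of the control algorithm, including the two signed modifications, might look like the following. The function name, the `signed` flag, and the 4-bit default width are assumptions for illustration. Python's unbounded integers stand in for the P register, so the adder's carry-out and the sign extension on the right shift come for free rather than being modeled as hardware.

```python
def shift_add_multiply(a: int, b: int, n: int = 4, signed: bool = False) -> int:
    """Multiply two n-bit values using the [P][B] shift-and-add scheme.

    Hypothetical helper: with signed=True, a and b are two's complement
    values in the range -2**(n-1) .. 2**(n-1) - 1.
    """
    B = b & ((1 << n) - 1)   # multiplier bit pattern held in the B register
    P = 0                    # step 1: P <- 0
    for i in range(n):
        if B & 1:            # step 2: look at the LSB of B
            if signed and i == n - 1:
                P -= a       # MSB has negative weight: subtract final partial product
            else:
                P += a
        # Step 3: shift [P][B] right by 1; the LSB of P moves into the MSB of B.
        B = (B >> 1) | ((P & 1) << (n - 1))
        P >>= 1              # arithmetic shift, so a negative P stays sign-extended
    return (P << n) | B      # step 5: [P][B] holds the product


unsigned_result = shift_add_multiply(11, 6)              # 66
signed_result = shift_add_multiply(-6, 3, signed=True)   # -18
```

The loop runs n times in total, which corresponds to step 2 and 3 once plus the n-1 repetitions in step 4.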
Array Multiplier

Generates all n partial products simultaneously.
Each row: n-bit adder with AND gates.

[Figure: array of cells with inputs a0-a3 and b0-b3 and outputs P0-P7. Each cell combines a_i and b_j and feeds a full adder (FA) with sum in, carry in, sum out, and carry out.]

What is the critical path?

Carry-save Addition

Speeding up multiplication is a matter of speeding up the summing of the partial products. Carry-save addition can help.
Carry-save addition passes (saves) the carries to the output, rather than propagating them.

Example: sum three numbers, 3_10 = 0011, 2_10 = 0010, 3_10 = 0011

                            3_10   0011
                          + 2_10   0010
  carry-save add            c      0100 = 4_10
                            s      0001 = 1_10
                            3_10   0011
  carry-save add            c      0010 = 2_10
                            s      0110 = 6_10
  carry-propagate add              1000 = 8_10

In general, carry-save addition takes in 3 numbers and produces 2, whereas carry-propagate takes 2 and produces 1.
With this technique, we can avoid carry propagation until the final addition.
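The worked example can be reproduced with a few lines of bitwise arithmetic. This is a sketch; `carry_save_add` is a name chosen for illustration, and the first step passes 0 as the third operand since only two numbers are being combined there.

```python
def carry_save_add(x: int, y: int, z: int):
    """3-to-2 reduction: return (carry, sum) with carry + sum == x + y + z."""
    s = x ^ y ^ z                            # per-bit sum, no carry propagation
    c = ((x & y) | (x & z) | (y & z)) << 1   # per-bit carry, moved up one column
    return c, s


c1, s1 = carry_save_add(0b0011, 0b0010, 0)   # 3 + 2
c2, s2 = carry_save_add(c1, s1, 0b0011)      # fold in the second 3
total = c2 + s2                              # the only carry-propagate add
```

Every bit position is computed independently, which is why the delay of a carry-save stage does not grow with the operand width.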
Carry-save Circuits

[Figure: rows of full adders (FA) wired as carry-save adders over inputs x0, x1, x2, followed by a carry-propagate adder (CPA).]

When adding sets of numbers, carry-save can be used on all but the final sum.
Standard adder (carry propagate) is used for final sum.

Array Multiplier using Carry-save Addition

[Figure: array multiplier with inputs a0-a3 and b0-b3. Each cell combines a_i and b_j and feeds a full adder (FA) with sum in, carry in, sum out, and carry out; carries are saved from row to row. A fast carry-propagate adder at the bottom produces P4-P7.]
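As a rough software analogue of the carry-save array multiplier, the partial products can be folded into a running (carry, sum) pair, with a single carry-propagate addition at the end. The names `carry_save_add` and `carry_save_multiply` and the 4-bit default are assumptions for this sketch; it models the arithmetic only, not the gate-level structure.

```python
def carry_save_add(x: int, y: int, z: int):
    """3-to-2 reduction: return (carry, sum) with carry + sum == x + y + z."""
    return ((x & y) | (x & z) | (y & z)) << 1, x ^ y ^ z


def carry_save_multiply(a: int, b: int, n: int = 4) -> int:
    """Unsigned n-bit multiply: carry-save on every row, one CPA at the end."""
    carry, total = 0, 0
    for j in range(n):
        row = (a << j) if (b >> j) & 1 else 0           # partial product j
        carry, total = carry_save_add(carry, total, row)  # carries are saved, not propagated
    return carry + total                                 # final carry-propagate add


result = carry_save_multiply(0b1011, 0b0110)   # 11 * 6
```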
Carry-save Addition is associative and commutative. For example:
(((X0 + X1) + X2) + X3) = ((X0 + X1) + (X2 + X3))

[Figure: balanced tree of carry-save adders over inputs x7 ... x0 with depth log_{3/2} N, followed by a CPA with depth log_2 N.]

A balanced tree can be used to reduce the logic delay.
This structure is the basis of the Wallace Tree Multiplier.
Partial products are summed with the tree. Fast CPA (ex: CLA) is used for final sum.
Multiplier delay ∝ log_{3/2} N + log_2 N

Division

                  1001     Quotient
Divisor 1000 | 1001010     Dividend
              -1000
                  10
                  101
                  1010
                 -1000
                    10     Remainder (or Modulo result)

See how big a number can be subtracted, creating quotient bit on each step.
Binary => 1 * divisor or 0 * divisor
Dividend = Quotient x Divisor + Remainder
=> sizeof(dividend) = sizeof(quotient) + sizeof(divisor)
3 versions of divide, successive refinement
DIVIDE HARDWARE Version 1

64-bit Divisor register, 64-bit adder/subtractor, 64-bit Remainder register, 32-bit Quotient register

[Figure: Divisor (64 bits, shift right) and Remainder (64 bits, write) feed the add/sub unit; Quotient (32 bits, shift left); all driven by Control.]

Divide Algorithm Version 1

Takes n+1 steps for n-bit Quotient & Rem.

7_10 / 2_10:   Remainder 0000 0111   Quotient 0000   Divisor 0010 0000

Start: Place Dividend in Remainder.
1. Subtract the Divisor register from the Remainder register, and place the result in the Remainder register.
Test Remainder:
2a. Remainder >= 0: Shift the Quotient register to the left, setting the new rightmost bit to 1.
2b. Remainder < 0: Restore the original value by adding the Divisor register to the Remainder register, & place the sum in the Remainder register. Also shift the Quotient register to the left, setting the new least significant bit to 0.
3. Shift the Divisor register right 1 bit.
n+1 repetition?
No (< n+1 repetitions): repeat from step 1.
Yes (n+1 repetitions, n = 4 here): Done.
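The Version 1 steps can be sketched as a short simulation. The function name `divide_v1` and the 4-bit default are assumptions for illustration; the sketch also assumes an n-bit unsigned dividend and a non-zero n-bit divisor, and uses Python's signed integers for the "Remainder < 0" test instead of inspecting a sign bit.

```python
def divide_v1(dividend: int, divisor: int, n: int = 4):
    """Restoring division, Version 1: returns (quotient, remainder)."""
    remainder = dividend        # Start: place Dividend in Remainder
    div = divisor << n          # Divisor begins in the left half of its 2n-bit register
    quotient = 0
    for _ in range(n + 1):      # n+1 repetitions
        remainder -= div        # step 1: Remainder = Remainder - Divisor
        if remainder >= 0:
            quotient = (quotient << 1) | 1   # step 2a: shift in a 1
        else:
            remainder += div                 # step 2b: restore the old Remainder
            quotient = quotient << 1         #          and shift in a 0
        div >>= 1               # step 3: shift Divisor right 1 bit
    return quotient, remainder


q, r = divide_v1(7, 2)   # the 7/2 case with n = 4
```

The first iteration always takes the 2b branch under these assumptions, since an n-bit dividend is smaller than the divisor shifted left by n.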
Version 1 Division Example 7/2

Iteration  Step                                   Quotient  Divisor    Remainder
0          Initial values                         0000      0010 0000  0000 0111
1          1: rem = rem - div                     0000      0010 0000  1110 0111
           2b: rem < 0 => +div, sll Q, Q0 = 0     0000      0010 0000  0000 0111
           3: shift div right                     0000      0001 0000  0000 0111
2          1: rem = rem - div                     0000      0001 0000  1111 0111
           2b: rem < 0 => +div, sll Q, Q0 = 0     0000      0001 0000  0000 0111
           3: shift div right                     0000      0000 1000  0000 0111
3          1: rem = rem - div                     0000      0000 1000  1111 1111
           2b: rem < 0 => +div, sll Q, Q0 = 0     0000      0000 1000  0000 0111
           3: shift div right                     0000      0000 0100  0000 0111
4          1: rem = rem - div                     0000      0000 0100  0000 0011
           2a: rem >= 0 => sll Q, Q0 = 1          0001      0000 0100  0000 0011
           3: shift div right                     0001      0000 0010  0000 0011
5          1: rem = rem - div                     0001      0000 0010  0000 0001
           2a: rem >= 0 => sll Q, Q0 = 1          0011      0000 0010  0000 0001
           3: shift div right                     0011      0000 0001  0000 0001

Observations on Divide Version 1

1/2 bits in divisor always 0
=> 1/2 of 64-bit adder is wasted
=> 1/2 of divisor is wasted
Instead of shifting divisor to right, shift remainder to left?
1st step cannot produce a 1 in quotient bit (otherwise too big)
=> switch order to shift first and then subtract, can save 1 iteration
DIVIDE HARDWARE Version 2

32-bit Divisor register, 32-bit ALU, 64-bit Remainder register, 32-bit Quotient register

[Figure: Divisor (32 bits) and the left half of Remainder (64 bits, shift left, write) feed the add/sub unit; Quotient (32 bits, shift left); all driven by Control.]

Divide Algorithm Version 2

7_10 / 2_10:   Remainder 0000 0111   Quotient 0000   Divisor 0010

Start: Place Dividend in Remainder.
1. Shift the Remainder register left 1 bit.
2. Subtract the Divisor register from the left half of the Remainder register, & place the result in the left half of the Remainder register.
Test Remainder:
3a. Remainder >= 0: Shift the Quotient register to the left, setting the new rightmost bit to 1.
3b. Remainder < 0: Restore the original value by adding the Divisor register to the left half of the Remainder register, & place the sum in the left half of the Remainder register. Also shift the Quotient register to the left, setting the new least significant bit to 0.
nth repetition?
No (< n repetitions): repeat from step 1.
Yes (n repetitions, n = 4 here): Done.
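A possible simulation of the Version 2 steps follows. As before, `divide_v2` and the 4-bit default are illustrative assumptions, the inputs are assumed to be an n-bit unsigned dividend and a non-zero n-bit divisor, and Python's unbounded integers are used, so a bit shifted out of the top of a real 2n-bit Remainder register is simply kept rather than modeled.

```python
def divide_v2(dividend: int, divisor: int, n: int = 4):
    """Restoring division, Version 2: returns (quotient, remainder)."""
    mask = (1 << n) - 1
    remainder = dividend                  # Start: Dividend in the right half of Remainder
    quotient = 0
    for _ in range(n):                    # n repetitions
        remainder <<= 1                   # step 1: shift Remainder left 1 bit
        left = (remainder >> n) - divisor # step 2: subtract from the left half
        if left >= 0:
            quotient = (quotient << 1) | 1   # step 3a: shift in a 1
        else:
            left += divisor                  # step 3b: restore the left half
            quotient = quotient << 1         #          and shift in a 0
        remainder = (left << n) | (remainder & mask)  # write the left half back
    return quotient, remainder >> n       # final remainder sits in the left half


q, r = divide_v2(7, 2)   # the 7/2 case with n = 4
```

Compared with Version 1, the divisor never moves, so the subtraction only ever involves n bits.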
Observations on Divide Version 2

Eliminate Quotient register by combining with Remainder as shifted left.
Start by shifting the Remainder left as before.
Thereafter loop contains only two steps because the shifting of the Remainder register shifts both the remainder in the left half and the quotient in the right half.
The consequence of combining the two registers together and the new order of the operations in the loop is that the remainder will be shifted left one time too many.
Thus the final correction step must shift back only the remainder in the left half of the register.

DIVIDE HARDWARE Version 3

32-bit Divisor register, 32-bit adder/subtractor, 64-bit Remainder register, (0-bit Quotient reg)

[Figure: Divisor (32 bits) and the HI half of Remainder feed the 32-bit ALU; Remainder (64 bits, shift left, write) is split into HI and LO (Quotient); all driven by Control.]
Divide Algorithm Version 3

7_10 / 2_10:   Remainder 0000 0111   Divisor 0010

Start: Place Dividend in Remainder.
1. Shift the Remainder register left 1 bit.
2. Subtract the Divisor register from the left half of the Remainder register, & place the result in the left half of the Remainder register.
Test Remainder:
3a. Remainder >= 0: Shift the Remainder register to the left, setting the new rightmost bit to 1.
3b. Remainder < 0: Restore the original value by adding the Divisor register to the left half of the Remainder register, & place the sum in the left half of the Remainder register. Also shift the Remainder register to the left, setting the new least significant bit to 0.
*upper-half
nth repetition?
No (< n repetitions): repeat from step 2.
Yes (n repetitions, n = 4 here): Done. Shift left half of Remainder right 1 bit.

Observations on Divide Version 3

Same hardware as shift and add multiplier: just a 64-bit register to shift left or shift right.
HI and LO registers in MIPS combine to act as 64-bit register for multiply and divide.
Signed divides: Simplest is to remember signs, make positive, and complement quotient and remainder if necessary.
Note: Dividend and Remainder must have same sign.
Note: Quotient negated if Divisor sign & Dividend sign disagree, e.g., -7 / 2 = -3, remainder = -1.
Possible for quotient to be too large: if divide 64-bit integer by 1, quotient is 64 bits ("called saturation").
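The Version 3 steps and the "remember signs" rule for signed divides can be sketched together. The names `divide_v3` and `divide_signed` and the 4-bit default are assumptions for illustration, and the divisor is assumed non-zero. Python's unbounded integers are used, so a bit shifted out of the top of a real 2n-bit Remainder register is kept rather than modeled.

```python
def divide_v3(dividend: int, divisor: int, n: int = 4):
    """Restoring division, Version 3: quotient shares the Remainder register."""
    mask = (1 << n) - 1
    rem = dividend << 1                    # step 1: initial shift left
    for _ in range(n):                     # n repetitions
        left = (rem >> n) - divisor        # step 2: subtract from the left half
        if left >= 0:
            # Step 3a: keep the difference, shift left, new rightmost bit = 1.
            rem = (((left << n) | (rem & mask)) << 1) | 1
        else:
            # Step 3b: restore (left half unchanged), shift left, new bit = 0.
            rem = rem << 1
    quotient = rem & mask                  # right half holds the quotient
    remainder = rem >> (n + 1)             # Done: left half shifted right 1 bit
    return quotient, remainder


def divide_signed(dividend: int, divisor: int, n: int = 4):
    """Signed divide: remember signs, divide magnitudes, then fix up signs."""
    q, r = divide_v3(abs(dividend), abs(divisor), n)
    if (dividend < 0) != (divisor < 0):
        q = -q                             # signs disagree: negate the quotient
    if dividend < 0:
        r = -r                             # remainder takes the dividend's sign
    return q, r


unsigned_case = divide_v3(7, 2)
signed_case = divide_signed(-7, 2)
```

Note that this sign convention truncates toward zero, which differs from Python's own `divmod`, so the signed results are best checked against Dividend = Quotient x Divisor + Remainder.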