inary Multiplication The key to multiplication was memorizing a digit-by-digit table Everything else was just adding 2 3 4 5 6 7 8 9 2 3 4 5 6 7 8 9 2 2 4 6 8 2 4 6 8 3 3 6 9 2 5 8 2 24 27 + You ve got to be kidding It can t be that easy 4 4 8 2 6 2 24 28 32 36 5 5 5 2 25 3 35 4 45 6 6 2 8 24 3 36 42 48 54 7 7 4 2 28 35 42 49 56 63 8 8 6 24 32 4 48 56 64 72 9 9 8 27 36 45 54 63 72 8 /23/27 Comp 4 - Fall 27
Digit by digit = bit by bit Hey, that looks like an AND gate The inary Multiplication Table X inary multiplication is implemented using the same basic longhand algorithm that you learned in grade school. A 2 2 A A A 3 3 A j i is a A A 2 A 3 partial product A A A A 3 2 A A A 2 A 3 2 2 2 2 + A A A 3 A 3 3 2 3 3 x A Easy part: forming partial products (just an AND gate since I is either or ) Hard part: adding M, N-bit partial products Multiplying N-digit number by M-digit number gives (N+M)-digit result /23/27 Comp 4 - Fall 27 2
Multiplying in Assembly One can use this Shift and Add approach to write a multiply function in assembly language: Hum, maybe we could do something more clever. ; multiplies r and r mult: mov r3,# ; zero product part: tst r,# ; check if least significant bit= addne r3,r3,r ; add multiplicand to product mov r,r,lsl # ; multiplicand *= 2 movs r,r,lsr # ; multiplier /= 2 bne part ; continue while multiplier is not mov r,r3 ; copy product to return value bx lr Multiplier r: r: /23/27 Comp 4 - Fall 27 Multiplicand 3
Multiplier Unit-lock We introduce a new abstraction to aid in the construction of multipliers called the Unsigned Multiplier Unit-block We did a similar thing last lecture when we converted our adder to an add/subtract unit. A k are bits of the Multiplicand and i are bits of the Multiplier. The P i,k inputs and outputs represent partial products which are partial results from adding together shifted instances of the Multiplicand. The initial P,k is zero. Add/Subtract Unit lock A C Unsigned Multiply Unit lock i A i CO S i FA S i p i-,k A CO S CI A k CI Subtract C i- C k FA C k- p i,k i /23/27 Comp 4 - Fall 27 4
Simple Combinational Multiplier t PD = * t PD Is this faster than our assembly code? not 6 t PD = (2*(N-) + N) * t PD HA A Co S HA A Co S HA A Co S HA A Co S To determine the timing specification of a composite combinational circuit we find the worst-case path for every output to any input. Components N * HA N(N-) * FA N: this circuit only works for nonnegative operands /23/27 Comp 4 - Fall 27 5
Carry-Save Multiplier Observation: Rather than propagating the carries to the next adder in each row, they can instead be forwarded to the next column of the following row t PD = 8 * t PD t PD = (N+N) * t PD Components N * HA N 2 * FA These Adders can be removed, and the AND gate outputs tied directly to the Carry inputs of the next stage. This small performance improvement hardly seems worth the effort, however, this design is easier to pipeline. /23/27 Comp 4 - Fall 27 6
Higher-Radix Multiplication Idea: If we could use, say, 2 bits of the multiplier in generating each partial product we would halve the number of rows and halve the latency of the multiplier! A N- A N-2 A 3 A 2 A A M- M-2 3 2 x M/2 2... ooth s insight: rewrite 2*A and 3*A cases, leave 4A for next partial product to do! K+,K *A K+,K *A = *A = *A A = 2*A Just 2A or a shift 4A 2A = 3*A Requires 4A A adding /23/27 Comp 4 - Fall 27 7
ooth Recoding of Multiplier current bit pair -89 =. Hey, isn t that a negative number? = - * 2 (-) + 2 * 2 2 ( 8) + (-2) * 2 4 (-32) + (-) * 2 6 (-64) -89 2K+ 2K 2K- from previous bit pair action add add A add A add 2*A sub 2*A sub A sub A add -2*A+A -A+A An encoding where each bit has the following weights: W( 2K+ ) = -2 * 2 2K W( 2K ) = * 2 2K W( 2K- ) = * 2 2K Yep! ooth recoding works for 2-Complement integers, now we can build a signed multiplier. A in this bit means the previous stage needed to add 4*A. Since this stage is shifted by 2 bits with respect to the previous stage, adding 4*A in the previous stage is like adding A in this stage! /23/27 Comp 4 - Fall 27 8
ooth Multiplier unit block Logic surrounding each basic adder: - Control lines (x2, Sub, Zero) Are shared across each row - Must handle the + when Sub is (extra half adders in a carry-save array) Signed Multiply Unit lock A i A i- x2 Sub Zero NOTE: - ooth recoding can be used to implement signed multiplications 2K+ 2K 2K- x2 Sub Zero X X X X p i,k- A CO FACI S p i,k /23/27 Comp 4 - Fall 27 9
igger Multipliers Using the approaches described we can construct multipliers of arbitrary sizes, by considering every adder at the bit level We can also, build bigger multipliers using smaller ones A 4 4 4 4 P H P L Considering this problem at a higher-level leads to more non-obvious optimizations I O /23/27 Comp 4 - Fall 27
Can We Multiply With Less? How many operations are needed to multiply 2, 2-digit numbers? 4 multipliers 4 Adders This technique generalizes You can build an 8-bit multiplier using 4 4-bit multipliers and 4 8-bit adders O(N 2 + N) = O(N 2 ) A x CD D DA C CA /23/27 Comp 4 - Fall 27
O(N 2 ) multiplier logic The functional blocks look like C A D Mult Mult Mult Mult Add Add HA Add Add A x CD D DA C CA Product bits /23/27 Comp 4 - Fall 27 2
A Trick The two middle partial products can be computed using a single multiplier and other partial products DA + C = (C + D)(A + ) (CA + D) 3 multipliers 8 adders This can be applied recursively (i.e. applied within each partial product) Leads to O(N.58 ) adders This trick is becoming more popular as N grows. However, it is less regular, and the overhead of the extra adders is high for small N x A CD D DA C CA /23/27 Comp 4 - Fall 27 3
Let s Try it y Hand ) Choose 2, 2 digit numbers to multiply: ab cd 42 x 37 2) Multiply digits: p = a x c, p2 = b x d, p3 = (c + d)(a + b) p = 4 x 3 = 2, p2 = 2*7 = 4, p3 = (4+2)x(3+7) = 6 3) Compute partial subtracted sum, SS = p3 - (p + p2) SS = 6 - (2 + 4) = 34 4) Add as follows: p = x p + x SS + p2 P = 2 + 34 + 4 = 554 = 42 x 37 42 x 37 =? /23/27 Comp 4 - Fall 27 4
An O(N.58 ) Multiplier The functional blocks would look like: A x CD D SS CA Where SS = (C+D)(A+) (CA+D) C A Mult HA Add Mult Add Add SS Add Add Add Add D Mult Add Note: Adders with a bubble on one of their inputs becomes a subtractor in this notation. /23/27 Comp 4 - Fall 27 Product bits 5
Next time We dive into floating-point arithmetic /23/27 Comp 4 - Fall 27 6