Arithmetic. Chapter 3 Computer Organization and Design

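Addition. Binary addition works like decimal addition, carrying into the next column: 0000 0111 + 0000 0101 = 0000 1100 (7 + 5 = 12). Subtraction adds the negated (two's complement) operand: 0000 0111 + 1111 1011 = 0000 0010 (7 + (-5) = 2).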

Over(under)flow. For signed addition: if the operands have the same sign, the result must have that sign as well, otherwise the addition overflowed. If the operands have different signs, there can be no over(under)flow.

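Over(under)flow. For signed subtraction: if the operands have the same sign, there can be no over(under)flow. If the operands have different signs, the sign of the result must match the sign of the minuend (the first operand), otherwise the subtraction overflowed. Zero counts as positive here, since its sign bit is 0.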

Over(under)flow. For unsigned addition: if the result is smaller than either operand, it is overflow. For unsigned subtraction: if the result is greater than the minuend (the first operand), it is overflow.
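The sign and magnitude rules from the overflow slides can be sketched in C as follows. This is a minimal illustration, assuming 32-bit operands; the function names are made up for this example, and the arithmetic is carried out on unsigned values so that the wrap-around itself is well defined in C.

```c
#include <stdbool.h>
#include <stdint.h>

/* Signed addition: overflow only when both operands share a sign
   and the result's sign differs from it. */
bool add_overflows_signed(int32_t a, int32_t b) {
    uint32_t sum = (uint32_t)a + (uint32_t)b;   /* wraps modulo 2^32 */
    bool sign_a = a < 0;
    bool sign_b = b < 0;
    bool sign_sum = (sum >> 31) != 0;
    return sign_a == sign_b && sign_sum != sign_a;
}

/* Signed subtraction a - b: overflow only when the operands have
   different signs and the result's sign differs from the minuend's. */
bool sub_overflows_signed(int32_t a, int32_t b) {
    uint32_t diff = (uint32_t)a - (uint32_t)b;  /* wraps modulo 2^32 */
    bool sign_a = a < 0;
    bool sign_b = b < 0;
    bool sign_diff = (diff >> 31) != 0;
    return sign_a != sign_b && sign_diff != sign_a;
}

/* Unsigned addition: the wrapped sum is smaller than an operand. */
bool add_overflows_unsigned(uint32_t a, uint32_t b) {
    return (uint32_t)(a + b) < a;
}

/* Unsigned subtraction a - b: the wrapped difference exceeds the minuend. */
bool sub_overflows_unsigned(uint32_t a, uint32_t b) {
    return (uint32_t)(a - b) > a;
}
```

For instance, add_overflows_signed(INT32_MAX, 1) reports overflow, while add_overflows_signed(1, -1) cannot overflow because the signs differ.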

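In practice... MIPS signed operations (add, addi, sub) can cause overflow. Overflow raises an exception and saves the address of the offending instruction in the Exception PC (EPC), which is read with mfc0. Unsigned operations (addu, addiu, subu) never raise an overflow exception. C ignores integer overflows.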

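Multimedia Operations. Instruction-set extensions such as MMX, SSE, etc. Graphics and image processing often work with 8- or 16-bit values, so a 32-bit register is treated as four 8-bit registers or two 16-bit registers. To do this the hardware breaks the carry chain at the element boundaries, so that a carry never propagates from one element into the next.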
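Breaking the carry chain can be imitated in software. The sketch below is not how MMX or SSE are implemented; it is a plain C illustration (the function name add_4x8 is made up here) that adds four 8-bit lanes packed in a 32-bit word without letting a carry cross a lane boundary.

```c
#include <stdint.h>

/* Add four packed 8-bit lanes; each lane wraps modulo 256 on its own. */
uint32_t add_4x8(uint32_t a, uint32_t b) {
    /* Add the low 7 bits of every lane. The top bit of each lane is
       cleared first, so a carry can reach bit 7 of its own lane but
       can never spill into the neighbouring lane. */
    uint32_t low7 = (a & 0x7F7F7F7Fu) + (b & 0x7F7F7F7Fu);
    /* Add the top bit of every lane with XOR, which discards the
       carry out of the lane. */
    return low7 ^ ((a ^ b) & 0x80808080u);
}
```

With a = 0x01FF7F80 and b = 0x01010180 the lanes are added independently and the result is 0x02008000: the lane 0xFF + 0x01 wraps to 0x00 without disturbing its neighbour.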

Multiplication. Each bit of the multiplier contributes one shifted partial product:
       1000
     x 1001
     ------
       1000
      0000
     0000
    1000
    -------
    1001000

Hardware. [Figure: multiplicand register, adder, multiplier register, product register, control.]
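The sequential hardware can be mimicked in C by a shift-and-add loop. This is a sketch assuming unsigned 32-bit operands and a 64-bit product; the function name is illustrative, and each loop iteration stands for one step of the control unit.

```c
#include <stdint.h>

/* Shift-and-add multiplication, one multiplier bit per step. */
uint64_t shift_add_multiply(uint32_t multiplicand, uint32_t multiplier) {
    uint64_t mcand = multiplicand;   /* widened so it can shift left */
    uint64_t product = 0;
    for (int step = 0; step < 32; step++) {
        if (multiplier & 1u) {
            product += mcand;        /* adder: add the shifted multiplicand */
        }
        mcand <<= 1;                 /* multiplicand moves one place left */
        multiplier >>= 1;            /* expose the next multiplier bit */
    }
    return product;
}
```

shift_add_multiply(8, 9) reproduces the binary example 1000 x 1001 = 1001000, that is, 72.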

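Can we speed up? All the partial products can be formed in parallel. We could have a cascade of adders and add them all, but the accumulated gate delays will be a problem. We could have a tree of additions, which takes logarithmic time in the number of partial products. Use carry-save adders.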
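A carry-save adder reduces three numbers to two without propagating carries, which is why it suits the adder tree. The following C sketch models one on 32-bit words; the function names csa_sum and csa_carry are made up for this illustration.

```c
#include <stdint.h>

/* Per-bit sum of three inputs, ignoring carries. */
uint32_t csa_sum(uint32_t a, uint32_t b, uint32_t c) {
    return a ^ b ^ c;
}

/* Per-bit carries of three inputs, moved one position to the left.
   No carry ripples between bit positions: every bit is independent. */
uint32_t csa_carry(uint32_t a, uint32_t b, uint32_t c) {
    return ((a & b) | (a & c) | (b & c)) << 1;
}
```

The invariant is csa_sum(a, b, c) + csa_carry(a, b, c) == a + b + c (modulo 2^32), so only the final addition needs a conventional carry-propagating adder.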

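Cascade of Adders. [Figure: each multiplier bit is ANDed (&) with the multiplicand to form a partial product; the partial products pass through a chain of adders, one after another.]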

Tree of Adders. [Figure: the partial products mplier3*mcant, mplier2*mcant, mplier1*mcant and mplier0*mcant feed a tree of three adders that produces the product bits Pr7 Pr6 ... Pr1 Pr0.]
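The tree can be sketched in C for 4-bit operands. This illustration assumes the four partial products are added pairwise and the two intermediate sums are then combined, so the additions form two levels instead of a chain of three; the function name is hypothetical.

```c
#include <stdint.h>

/* 4-bit by 4-bit multiplication with a two-level adder tree. */
uint8_t tree_multiply_4bit(uint8_t mcand, uint8_t mplier) {
    uint8_t pp[4];
    /* Partial product i is the multiplicand shifted left by i,
       or zero when multiplier bit i is 0. */
    for (int i = 0; i < 4; i++) {
        pp[i] = ((mplier >> i) & 1u) ? (uint8_t)((mcand & 0xFu) << i) : 0;
    }
    /* Level 1: two additions that are independent of each other. */
    uint8_t sum01 = (uint8_t)(pp[0] + pp[1]);
    uint8_t sum23 = (uint8_t)(pp[2] + pp[3]);
    /* Level 2: one addition combining the two intermediate sums. */
    return (uint8_t)(sum01 + sum23);
}
```

The largest product, 15 x 15 = 225, still fits in the 8 product bits.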

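Division
unsigned long long rem, ldvsr;      /* 64-bit working values */
unsigned int quot;
int i;
rem = dividend;
ldvsr = ((unsigned long long)divisor) << 31;   /* align divisor with the top bit */
quot = 0;
for (i = 0; i < 32; i++) {          /* one quotient bit per iteration */
    quot = quot << 1;
    if (rem >= ldvsr) {
        rem -= ldvsr;               /* divisor fits: subtract it, set the bit */
        quot += 1;
    }
    ldvsr = ldvsr >> 1;             /* move to the next lower bit position */
}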

Doing it faster. Hard! The reason is that we do not know the remainder in advance, and each step has to compare against it, so we cannot unroll the loop. We can do it in octal or hex, producing several quotient bits per step. This resembles the pencil-and-paper decimal method. It is tricky in hardware.

Multiply on MIPS. The result can be up to 64 bits, but all registers are 32 bits. MIPS uses two special registers, Hi and Lo, and two special instructions to read them: mfhi (move from hi) and mflo (move from lo).
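The split of a 64-bit product into Hi and Lo can be pictured in C. This is only a model of what the two registers hold after an unsigned multiply; the function names multu_hi and multu_lo are made up for the illustration.

```c
#include <stdint.h>

/* Upper 32 bits of the 64-bit product: what mfhi would return. */
uint32_t multu_hi(uint32_t a, uint32_t b) {
    uint64_t product = (uint64_t)a * (uint64_t)b;
    return (uint32_t)(product >> 32);
}

/* Lower 32 bits of the 64-bit product: what mflo would return. */
uint32_t multu_lo(uint32_t a, uint32_t b) {
    uint64_t product = (uint64_t)a * (uint64_t)b;
    return (uint32_t)(product & 0xFFFFFFFFu);
}
```

For 0xFFFFFFFF x 2 the product is 0x1FFFFFFFE, so Hi holds 1 and Lo holds 0xFFFFFFFE.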

Divide on MIPS. Similar problem: we need one 32-bit register for the quotient and another for the remainder. We use the same registers, Hi and Lo. Hi: remainder. Lo: quotient.

Signed Mult/Div. We have two versions of each instruction: mult/multu and div/divu, for signed and unsigned operands. For signed mult the CPU converts the operands to positive and remembers the signs. If the signs are the same the result is positive, otherwise negative.

Signed Mult/Div. Division is tricky. It follows the formula Dividend = q * divisor + r. For example:
 7 =   2  *   3  +   1
-7 = (-2) *   3  + (-1)
-7 =   2  * (-3) + (-1)
 7 = (-2) * (-3) +   1

The Rule. If dividend and divisor have the same sign, the quotient is positive. If dividend and divisor have different signs, the quotient is negative. The sign of the remainder always matches the dividend.
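The rule can be sketched in C in the same way as signed multiplication: divide the magnitudes, then fix the signs. This is an illustration with made-up names (div_result, signed_divide); it assumes a nonzero divisor and ignores the single case that does not fit in 32 bits, the most negative integer divided by -1.

```c
#include <stdint.h>

typedef struct {
    int32_t quot;
    int32_t rem;
} div_result;

/* Signed division built from an unsigned divide plus the sign rule. */
div_result signed_divide(int32_t dividend, int32_t divisor) {
    /* Magnitudes, computed in unsigned arithmetic. */
    uint32_t a = dividend < 0 ? 0u - (uint32_t)dividend : (uint32_t)dividend;
    uint32_t b = divisor < 0 ? 0u - (uint32_t)divisor : (uint32_t)divisor;
    uint32_t q = a / b;
    uint32_t r = a % b;

    div_result result;
    /* Quotient is negative exactly when the signs differ. */
    result.quot = ((dividend < 0) != (divisor < 0)) ? -(int32_t)q : (int32_t)q;
    /* Remainder takes the sign of the dividend. */
    result.rem = (dividend < 0) ? -(int32_t)r : (int32_t)r;
    return result;
}
```

C's own / and % operators follow the same convention since C99 (the quotient is truncated toward zero), so signed_divide(-7, 3) agrees with -7 / 3 and -7 % 3.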

Floating point. So far we have dealt with integers, a.k.a. fixed point. In mathematics we have the reals, which a computer can only approximate. The representation does not distinguish rationals from irrationals. We follow scientific notation, the one preferred by scientists.

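Scientific notation. 32.43 * 10^-2 = .3243. For computer use we improve it by switching to base 2, with the exponent also written in binary: 1010110.10101 * 2^101 = 101011010101. We further improve it (normalize) so that exactly one nonzero bit precedes the binary point: 1.0011001 * 2^110 = 1001100.1. Since the first bit of a normalized number is always 1, we make it implicit (we do not include it in the representation).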

Representation. For 1.0010101 * 2^110 we need a sign, a fraction and an exponent. We need to compromise between space for the fraction and space for the exponent. We care more about accuracy than about representing large numbers.

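MIPS Representation (32 bits). Sign: 1 bit, 0: positive, 1: negative. Exponent: 8 bits, biased rather than two's complement. Fraction: 23 bits; together with the sign bit this is a sign-and-magnitude representation. Overflow: the magnitude is too large to be represented (the exponent does not fit in its field). Underflow: a nonzero value whose magnitude is too small to be represented (the negative exponent does not fit in its field).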

MIPS Representation (64 bits) Sign: 1 bit Exponent: 11 bits Fraction: 52 bits.
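The 32-bit layout can be inspected from C by copying the bits of a float into an integer. The sketch assumes that float is the 32-bit IEEE 754 format, which holds on MIPS and on most current platforms; the helper names are made up for this example.

```c
#include <stdint.h>
#include <string.h>

/* Raw bit pattern of a float (assumes a 32-bit IEEE 754 float). */
static uint32_t float_bits(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* well-defined way to reinterpret */
    return bits;
}

/* Bit 31: sign. */
uint32_t fp_sign(float f) {
    return float_bits(f) >> 31;
}

/* Bits 30..23: biased exponent. */
uint32_t fp_exponent(float f) {
    return (float_bits(f) >> 23) & 0xFFu;
}

/* Bits 22..0: fraction, without the implicit leading 1. */
uint32_t fp_fraction(float f) {
    return float_bits(f) & 0x7FFFFFu;
}
```

For -0.75, which is -1.1 * 2^-1 in binary, the fields are sign 1, exponent 126 (that is, -1 plus the bias 127) and fraction 0x400000 (the single fraction bit .1).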

IEEE 754 Floating point std. Introduced to facilitate the exchange of floating point numbers, increase portability and improve the quality of computation. MIPS follows IEEE 754. It has an implicit leading bit: 1.0101101101001 is represented as .0101101101001 and the leading 1 is implied. Definition: the full value 1.0101101101001, leading 1 included, is the significand.

Special Numbers. How do we represent zero? We can't if we have an implicit leading bit, so we define zero to be 000...00 (all zeros). The exponent is biased (add 127 for single precision or 1023 for double precision). For normalized numbers: (-1)^sign * (1 + Fraction) * 2^(exponent - bias). For denormalized numbers (when the exponent field has the smallest value, 0): (-1)^sign * (Fraction) * 2^(1 - bias), that is, the same scale as the smallest normalized exponent.
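The two formulas can be checked with a small decoder for single-precision bit patterns. This is a sketch, not a complete IEEE 754 decoder: it handles normalized and denormalized numbers only, it uses the IEEE 754 convention that denormalized numbers are scaled by 2^(1 - bias), and the names pow2 and decode_binary32 are made up here.

```c
#include <stdint.h>

/* 2 raised to an integer power, by repeated doubling or halving. */
static double pow2(int e) {
    double r = 1.0;
    while (e > 0) { r *= 2.0; e--; }
    while (e < 0) { r /= 2.0; e++; }
    return r;
}

/* Value of a single-precision bit pattern (exponent field 255,
   used for infinity and NaN, is not handled in this sketch). */
double decode_binary32(uint32_t bits) {
    const int bias = 127;
    int sign = (int)(bits >> 31);
    int exponent = (int)((bits >> 23) & 0xFFu);
    double fraction = (double)(bits & 0x7FFFFFu) / 8388608.0;  /* / 2^23 */
    double magnitude;
    if (exponent == 0) {
        /* Denormalized: no implicit leading 1, fixed scale 2^(1 - bias). */
        magnitude = fraction * pow2(1 - bias);
    } else {
        /* Normalized: implicit leading 1. */
        magnitude = (1.0 + fraction) * pow2(exponent - bias);
    }
    return sign ? -magnitude : magnitude;
}
```

The all-zeros pattern falls in the denormalized branch with Fraction = 0, which is how zero gets a representation despite the implicit leading bit.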

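Special Numbers. Infinity (positive or negative): exponent 255, fraction 0. Not a Number (NaN): exponent 255, fraction not zero. (255 is the largest exponent field in single precision; double precision uses 2047.) Dividing a nonzero number by zero gives infinity. Infinity minus infinity gives NaN.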
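Recognizing the special values only takes a look at the exponent and fraction fields. A minimal C sketch for single-precision bit patterns follows; the enum and function names are illustrative.

```c
#include <stdint.h>

typedef enum { KIND_FINITE, KIND_INFINITY, KIND_NAN } value_kind;

/* Classify a single-precision bit pattern by its fields. */
value_kind classify_binary32(uint32_t bits) {
    uint32_t exponent = (bits >> 23) & 0xFFu;
    uint32_t fraction = bits & 0x7FFFFFu;
    if (exponent != 255u) {
        return KIND_FINITE;            /* zero, denormalized or normalized */
    }
    /* Exponent field all ones: infinity if the fraction is zero, else NaN. */
    return fraction == 0u ? KIND_INFINITY : KIND_NAN;
}
```

The sign bit plays no part in the test, so 0x7F800000 and 0xFF800000 are positive and negative infinity respectively.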

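Comparisons in Floating Point. The placement of the sign in the top bit and the bias of the exponent were chosen to make comparison a bit simpler: a larger exponent field means a larger magnitude, so numbers can be compared without converting to a different representation.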
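One way to see the benefit is to compare floats through their bit patterns. The sketch below assumes 32-bit IEEE 754 floats and uses a well-known bit trick that is not part of the slides: each pattern is mapped to an unsigned key so that unsigned integer comparison of the keys orders the floats. NaNs are not handled, and -0.0 is ordered just below +0.0. The function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Raw bit pattern of a float (assumes a 32-bit IEEE 754 float). */
static uint32_t float_bits(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Map a float to a key whose unsigned order matches the float order. */
uint32_t float_order_key(float f) {
    uint32_t bits = float_bits(f);
    if (bits & 0x80000000u) {
        /* Negative: a larger magnitude must give a smaller key. */
        return ~bits;
    }
    /* Non-negative: exponent then fraction already sort correctly;
       setting the top bit places these keys above all negative ones. */
    return bits | 0x80000000u;
}
```

For non-negative numbers no mapping is needed at all: because the biased exponent sits above the fraction, the raw bit patterns already compare like the values they encode.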

Floating Point Addition. Add 1.111 * 2^1 and 1.011 * 2^-1. Step 1: shift the operand with the smaller exponent right until the exponents are the same: 1.011 * 2^-1 = 0.01011 * 2^1. Step 2: add the significands: 1.111 + 0.01011 = 10.00111. Step 3: normalize: 10.00111 * 2^1 = 1.000111 * 2^2. Step 4: round (rounding can produce an unnormalized result, in which case go back to Step 3).
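The four steps can be traced with a toy format in C. Everything here is an assumption made for illustration: positive numbers only, a 4-bit significand (1 + 3 fraction bits, as in the example, read as 1.111 * 2^1 plus 1.011 * 2^-1), eight extra working bits, and round-half-up instead of the IEEE rounding modes. Bits shifted out beyond the working bits are simply dropped.

```c
#include <stdint.h>

#define FRAC  3   /* fraction bits kept in the result */
#define GUARD 8   /* extra low-order working bits */

/* value = (sig / 2^FRAC) * 2^exp, with sig in [2^FRAC, 2^(FRAC+1)) */
typedef struct {
    uint32_t sig;
    int exp;
} toyfp;

toyfp toy_add(toyfp a, toyfp b) {
    if (a.exp < b.exp) {              /* make a the larger exponent */
        toyfp tmp = a; a = b; b = tmp;
    }
    /* Step 1: align b to a's exponent by shifting it right. */
    int shift = a.exp - b.exp;
    uint32_t sa = a.sig << GUARD;
    uint32_t sb = shift < 32 ? (b.sig << GUARD) >> shift : 0u;
    /* Step 2: add the significands. */
    uint32_t sum = sa + sb;
    int exp = a.exp;
    /* Step 3: normalize if the sum reached 10.xxx. */
    if (sum >> (FRAC + 1 + GUARD)) {
        sum >>= 1;
        exp++;
    }
    /* Step 4: round to FRAC fraction bits (half rounds up). */
    sum += 1u << (GUARD - 1);
    if (sum >> (FRAC + 1 + GUARD)) {  /* rounding overflowed: renormalize */
        sum >>= 1;
        exp++;
    }
    toyfp result = { sum >> GUARD, exp };
    return result;
}
```

With a = {15, 1} (1.111 * 2^1) and b = {11, -1} (1.011 * 2^-1) the exact sum 1.000111 * 2^2 rounds to {9, 2}, that is, 1.001 * 2^2. With a = {15, 0} and b = {15, -4} the rounding step carries out of the significand, which shows why Step 4 may need to go back to Step 3.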

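Floating Point Multiplication. Multiply 1.100 * 2^-1 by 1.110 * 2^-2. Step 1: add the exponents: -1 - 2 = -3. Step 2: multiply the significands: 1.100 * 1.110 = 10.101000. Step 3: normalize and check for overflow: 10.101000 * 2^-3 = 1.0101000 * 2^-2. Step 4: round to three fraction bits: 1.011 * 2^-2 (the halfway case is rounded up here). Step 5: adjust the sign: positive if the operands have the same sign, negative otherwise.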
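The multiplication steps can be traced with a toy format as well. The sketch makes its own assumptions: a sign bit, a 4-bit significand (1 + 3 fraction bits), round-half-up rather than the IEEE rounding modes, no overflow or underflow checks on the exponent, and the example operands read as 1.100 * 2^-1 and 1.110 * 2^-2. The type and function names are made up.

```c
#include <stdint.h>

#define FRAC 3   /* fraction bits kept in the result */

/* value = (-1)^sign * (sig / 2^FRAC) * 2^exp,
   with sig in [2^FRAC, 2^(FRAC+1)) */
typedef struct {
    int sign;
    uint32_t sig;
    int exp;
} toy_float;

toy_float toy_mul(toy_float a, toy_float b) {
    /* Step 1: add the exponents. */
    int exp = a.exp + b.exp;
    /* Step 2: multiply the significands; the product has 2*FRAC
       fraction bits and lies in [1, 4). */
    uint32_t prod = a.sig * b.sig;
    int frac_bits = 2 * FRAC;
    /* Step 3: normalize. If the product is 10.xxx or 11.xxx, move the
       binary point one place left and bump the exponent. */
    if (prod >> (2 * FRAC + 1)) {
        frac_bits++;
        exp++;
    }
    /* Step 4: round to FRAC fraction bits (half rounds up). */
    int drop = frac_bits - FRAC;
    prod += 1u << (drop - 1);
    uint32_t sig = prod >> drop;
    if (sig >> (FRAC + 1)) {          /* rounding overflowed: renormalize */
        sig >>= 1;
        exp++;
    }
    /* Step 5: the result is negative exactly when the signs differ. */
    toy_float result = { a.sign ^ b.sign, sig, exp };
    return result;
}
```

With a = {0, 12, -1} and b = {0, 14, -2} the product 10.101000 * 2^-3 normalizes to 1.0101 * 2^-2 and rounds to sig 11, exp -2, that is, 1.011 * 2^-2.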

Example
void mm(double x[][32], double y[][32], double z[][32])
{
    int i, j, k;
    for (i = 0; i != 32; i++)
        for (j = 0; j != 32; j++)
            for (k = 0; k != 32; k++)
                x[i][j] += y[i][k] * z[k][j];   /* x = x + y * z, 32x32 matrices */
}

Example (continued): saving and restoring registers in mm
mm:                             # save $s0, $s1, $s2 on the stack
        addi  $sp, $sp, -12
        sw    $s2, 8($sp)
        sw    $s1, 4($sp)
        sw    $s0, 0($sp)
                                # body: the three nested loops over i, j, k
mmexit:                         # restore from the stack
        lw    $s2, 8($sp)
        lw    $s1, 4($sp)
        lw    $s0, 0($sp)
        addi  $sp, $sp, 12
        jr    $ra

Example (continued): the loop nest of mm
        li    $t1, 32           # mat. size
        li    $s0, 0            # i = 0
L1:     li    $s1, 0            # j = 0
L2:     li    $s2, 0            # k = 0
        sll   $t2, $s0, 5       # i rows
        addu  $t2, $t2, $s1     # j elements
        sll   $t2, $t2, 3       # doubles are 8 bytes
        addu  $t2, $a0, $t2     # address of x[i][j]
        l.d   $f4, 0($t2)       # $f4 = x[i][j]
L3:     sll   $t0, $s2, 5       # Load z[k][j]
        ...
        l.d   $f16, 0($t0)
        ...                     # Load y[i][k]
        mul.d $f16, $f18, $f16
        add.d $f4, $f4, $f16    # f4 += y[i][k]*z[k][j]
        addiu $s2, $s2, 1
        bne   $s2, $t1, L3
        s.d   $f4, 0($t2)
        ...
        bne   $s0, $t1, L1

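Floating point instructions. Single-precision arithmetic: add.s, sub.s, mul.s, div.s. Double-precision arithmetic: add.d, sub.d, mul.d, div.d. Load, store and move for coprocessor 1: lwc1, swc1, mfc1, mtc1. Single-precision compares: c.lt.s, c.gt.s, c.eq.s, c.ne.s, c.le.s, c.ge.s. Double-precision compares: c.lt.d, c.gt.d, c.eq.d, c.ne.d, c.le.d, c.ge.d. Branches on the comparison result: bc1t, bc1f.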

Floating point registers. f0..f3: return results. f4..f11, f16..f19: temporaries. f12..f15: arguments. f20..f31: saved temporaries.

Accuracy. To round properly we need to carry extra bits (typically two). Many architectures have multiply and add in one instruction: A += b*c. Floating point arithmetic is not associative: in single precision (1 + 10^10) - 10^10 != 1, because the 1 is lost when it is added to 10^10.
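The loss of associativity can be reproduced in C with single-precision values. This sketch uses made-up function names; the intermediate result is stored in a volatile float so that it is really rounded to single precision, even on machines that evaluate expressions in a wider format.

```c
/* (small + big) - big: the small term can be lost in the first addition. */
float add_then_sub(float small, float big) {
    volatile float t = small + big;   /* rounded to single precision */
    return t - big;
}

/* small + (big - big): the big terms cancel first, so small survives. */
float sub_then_add(float small, float big) {
    volatile float t = big - big;
    return small + t;
}
```

With small = 1 and big = 10^10, neighbouring single-precision values near 10^10 are 1024 apart, so 1 + 10^10 rounds back to 10^10 and add_then_sub returns 0, while sub_then_add returns 1.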