Arithmetic for Computers. Hwansoo Han

Arithmetic for Computers
- Operations on integers
  - Addition and subtraction
  - Multiplication and division
  - Dealing with overflow
- Floating-point real numbers
  - Representation and operations

Integer Addition
- Example: 7 + 6
- Overflow if result out of range
  - Adding positive (+) and negative (-) operands: no overflow
  - Adding two positive (+) operands: overflow if result sign is 1
  - Adding two negative (-) operands: overflow if result sign is 0
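The sign rule above can be sketched in a few lines of Python. This is an illustrative model, assuming 32-bit two's-complement operands; the function name `add32` is hypothetical and not part of the slides.

```python
def add32(a, b):
    """Add two signed 32-bit integers; return (wrapped_result, overflow_flag)."""
    raw = (a + b) & 0xFFFFFFFF                      # keep only the low 32 bits
    # Reinterpret the 32-bit pattern as a signed value.
    result = raw - (1 << 32) if raw & 0x80000000 else raw
    # Overflow only when both operands share a sign and the result's sign differs.
    overflow = (a < 0) == (b < 0) and (result < 0) != (a < 0)
    return result, overflow
```

For example, `add32(7, 6)` reports no overflow, while adding 1 to the largest positive value wraps to a negative result and sets the flag.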

Integer Subtraction
- Add negation of second operand
- Example: 7 - 6 = 7 + (-6)
    +7: 0000 0000 0000 0111
    -6: 1111 1111 1111 1010
    +1: 0000 0000 0000 0001
- Overflow if result out of range
  - Subtracting two positive (+) or two negative (-) operands: no overflow
  - Subtracting positive (+) from negative (-) operand: overflow if result sign is 0
  - Subtracting negative (-) from positive (+) operand: overflow if result sign is 1
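A small Python sketch of subtraction as "add the negation", using the same 16-bit width as the bit patterns in the example. The helper names `sub16` and `bits16` are hypothetical.

```python
def bits16(x):
    """Return the 16-bit two's-complement pattern of x as a string."""
    return format(x & 0xFFFF, "016b")

def sub16(a, b):
    """Compute a - b on 16-bit two's-complement values by adding the negation of b."""
    neg_b = (~b + 1) & 0xFFFF              # negate: invert all bits, then add 1
    raw = ((a & 0xFFFF) + neg_b) & 0xFFFF  # 16-bit add, carry out discarded
    return raw - 0x10000 if raw & 0x8000 else raw  # reinterpret as signed
```

Here `bits16(-6)` gives the pattern shown for -6, and `sub16(7, 6)` returns 1.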

Dealing with Overflow
- Some languages (e.g., C) ignore overflow
  - Use MIPS addu, addiu, subu instructions
- Other languages (e.g., Ada, Fortran) require raising an exception
  - Use MIPS add, addi, sub instructions
  - On overflow, invoke exception handler
    - Save PC in exception program counter (EPC) register
    - Jump to predefined handler address
    - mfc0 (move from coprocessor register) instruction can retrieve EPC value, to return after corrective action

Arithmetic for Multimedia
- Graphics and media processing operates on vectors of 8-bit and 16-bit data
  - Use 64-bit adder, with partitioned carry chain
    - Operate on 8×8-bit, 4×16-bit, or 2×32-bit vectors
  - SIMD (single-instruction, multiple-data)
- Saturating operations
  - On overflow, result is largest representable value
    - vs. 2s-complement modulo arithmetic
  - E.g., clipping in audio, saturation in video
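To contrast saturating and modulo arithmetic, here is a minimal sketch that models each 8-bit lane as an unsigned Python integer. The function name `add_u8_lanes` and the list-of-lanes representation are assumptions for illustration; real SIMD hardware operates on packed registers.

```python
def add_u8_lanes(a, b, saturate):
    """Add two equal-length lists of unsigned 8-bit lane values element-wise."""
    out = []
    for x, y in zip(a, b):
        s = x + y
        if saturate:
            out.append(min(s, 255))   # clamp to the largest representable value
        else:
            out.append(s & 0xFF)      # wrap around (modulo 256)
    return out
```

With lanes `[200, 100]` and `[100, 50]`, saturation yields 255 in the first lane, whereas modulo arithmetic wraps it to 44.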

Multiplication
- Start with long-multiplication approach

    multiplicand        1000
    multiplier        × 1001
                        ----
                        1000
                       0000
                      0000
                     1000
                     -------
    product          1001000

- Length of product is the sum of operand lengths

Multiplication Hardware
[Figure: multiplier datapath; the product register is initially 0]

Optimized Multiplier
- Perform steps in parallel: add/shift
[Figure: optimized multiplier datapath; labels: Product, Multiplier, Shift right, Write]
- One cycle per partial-product addition
  - That's OK, if frequency of multiplications is low
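The add/shift loop can be modeled in software. The sketch below assumes the usual textbook arrangement in which the multiplier starts in the right half of a double-width product register; that arrangement and the name `shift_add_multiply` are assumptions, since the figure itself is not reproduced here.

```python
def shift_add_multiply(multiplicand, multiplier, n=32):
    """Unsigned n-bit multiply with one add/shift step per multiplier bit."""
    # Right half of the product register starts as the multiplier; left half is 0.
    product = multiplier & ((1 << n) - 1)
    for _ in range(n):
        if product & 1:                     # low bit is the current multiplier bit
            product += multiplicand << n    # add multiplicand into the left half
        product >>= 1                       # shift the whole register right
    return product
```

Running it on the long-multiplication example with `n=4`, `shift_add_multiply(0b1000, 0b1001, n=4)` gives `0b1001000`.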

Faster Multiplier
- Uses multiple adders
  - Cost/performance tradeoff
- Can be pipelined
  - Several multiplications performed in parallel

MIPS Multiplication
- Two 32-bit registers for product (separately provided)
  - HI: most-significant 32 bits
  - LO: least-significant 32 bits
- Instructions
  - mult rs, rt / multu rs, rt
    - 64-bit product in HI/LO
  - mfhi rd / mflo rd
    - Move from HI/LO to rd
    - Can test HI value to see if product overflows 32 bits
  - mul rd, rs, rt
    - Least-significant 32 bits of product -> rd
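As a rough software model of the HI/LO split, the sketch below computes a signed 64-bit product and checks whether it fits in 32 bits by testing HI. The function names are hypothetical, and the "HI must be the sign extension of LO" test is one reasonable reading of "test HI value" for signed `mult`, not something the slide spells out.

```python
def mult(rs, rt):
    """Model signed mult: return (HI, LO) as unsigned 32-bit halves of the 64-bit product."""
    product = (rs * rt) & 0xFFFFFFFFFFFFFFFF   # 64-bit two's-complement pattern
    return product >> 32, product & 0xFFFFFFFF

def fits_in_32_bits(hi, lo):
    """A signed product fits in 32 bits when HI is just the sign extension of LO."""
    expected_hi = 0xFFFFFFFF if lo & 0x80000000 else 0
    return hi == expected_hi
```

For instance, `mult(0x10000, 0x10000)` leaves a nonzero HI with LO equal to 0, so the check reports that the product does not fit.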

Division
- Check for 0 divisor
- Long division approach
  - If divisor ≤ dividend bits: 1 bit in quotient, subtract
  - Otherwise: 0 bit in quotient, bring down next dividend bit
- Restoring division
  - Do the subtract, and if remainder goes < 0, add divisor back
- Signed division
  - Divide using absolute values
  - Adjust sign of quotient and remainder as required

                     1001    quotient
  divisor  1000 ) 1001010    dividend
                 -1000
                 -----
                     10
                     101
                     1010
                    -1000
                    -----
                       10    remainder

- n-bit operands yield n-bit quotient and remainder
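Restoring division as described above can be sketched for unsigned operands as follows. The function name `restoring_divide` is hypothetical, and signed operands would need the absolute-value and sign-adjustment steps on top of this.

```python
def restoring_divide(dividend, divisor, n=32):
    """Unsigned n-bit restoring division; returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("divisor is 0")
    quotient, remainder = 0, 0
    for i in range(n - 1, -1, -1):
        # Bring down the next dividend bit.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        remainder -= divisor                  # trial subtraction
        if remainder < 0:
            remainder += divisor              # went negative: restore
            quotient = quotient << 1          # quotient bit is 0
        else:
            quotient = (quotient << 1) | 1    # quotient bit is 1
    return quotient, remainder
```

On the worked example, `restoring_divide(0b1001010, 0b1000, n=8)` returns quotient `0b1001` and remainder `0b10`.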

Division Hardware
[Figure: divider datapath. Divisor register: initially divisor in left half. Remainder register: initially dividend. Other labels: Shift left, Write]

Optimized Divider
- One cycle per partial-remainder subtraction
- Quotient in right half of 64-bit remainder register
- Looks a lot like a multiplier!
  - Same hardware can be used for both

Faster Division
- Cannot use parallel hardware as in multiplier
  - Subtraction is conditional on sign of remainder
- Faster dividers (e.g., SRT division) generate multiple quotient bits per step
  - Guess quotient bits and correct wrong guesses in subsequent steps
  - Still require multiple steps

MIPS Division
- Use HI/LO registers for result
  - HI: 32-bit remainder
  - LO: 32-bit quotient
- Instructions
  - div rs, rt / divu rs, rt
  - No overflow or divide-by-0 checking
    - Software must perform checks if required
  - Use mfhi, mflo to access result

Floating Point
- We need a way to represent
  - Numbers with fractions, e.g., 3.1416
  - Very small numbers, e.g., 0.000000001
  - Very large numbers, e.g., \(3.15576 \times 10^{9}\)
- Like scientific notation
  - \(-2.34 \times 10^{56}\)
  - \(+0.002 \times 10^{-4}\)
  - \(+987.02 \times 10^{9}\)
- In binary
  - \(\pm 1.xxxxxxx_2 \times 2^{yyyy}\)
- Types float and double in C

Floating Point Representation
- Sign, exponent, significand: \((-1)^{\text{sign}} \times \text{significand} \times 2^{\text{exponent}}\)
- More bits for significand gives more accuracy
- More bits for exponent increases range
- IEEE 754 floating point standard:
  - Single precision: 8-bit exponent, 23-bit significand, 1-bit sign
  - Double precision: 11-bit exponent, 52-bit significand, 1-bit sign

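Fractional Binary Numbers
- Representation
    bits:     \(b_i \; b_{i-1} \cdots b_2 \; b_1 \; b_0 \,.\, b_{-1} \; b_{-2} \; b_{-3} \cdots b_{-j}\)
    weights:  \(2^{i},\ 2^{i-1},\ \ldots,\ 4,\ 2,\ 1,\ 1/2,\ 1/4,\ 1/8,\ \ldots,\ 2^{-j}\)
  - Bits to right of binary point represent fractional powers of 2
  - Represents rational number: \(\sum_{k=-j}^{i} b_k \times 2^{k}\)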

Fractional Binary Numbers (cont'd)
- Examples:
    Value      Representation
    5 + 3/4    \(101.11_2\)
    2 + 7/8    \(10.111_2\)
  - \(0.111111\ldots_2\) represents a value just below 1.0
    - \(1/2 + 1/4 + 1/8 + \cdots + 1/2^{i} + \cdots \to 1.0\) (notation: \(1.0 - \epsilon\))
- Representable numbers
  - Can only exactly represent numbers of the form \(x/2^{k}\)
  - Other numbers have repeating bit representations
    Value    Representation
    1/3      \(0.0101010101[01]\ldots_2\)
    1/5      \(0.001100110011[0011]\ldots_2\)
    1/10     \(0.0001100110011[0011]\ldots_2\)
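The repeating patterns in the table can be reproduced with exact rational arithmetic. The sketch below uses the standard doubling method to peel off one bit at a time; the function name `binary_fraction_bits` is hypothetical.

```python
from fractions import Fraction

def binary_fraction_bits(value, nbits):
    """Return the first nbits bits after the binary point of a value in [0, 1)."""
    bits = []
    for _ in range(nbits):
        value *= 2                # shift the binary point one place right
        bit = int(value >= 1)     # the integer part is the next bit
        bits.append(str(bit))
        value -= bit              # keep only the fractional part
    return "".join(bits)
```

For example, `binary_fraction_bits(Fraction(1, 10), 13)` returns `"0001100110011"`, matching the last row of the table.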

IEEE 754 FP: Normalized Values
- Condition: exponent ≠ 000...0 and exponent ≠ 111...1
- Significand coded with implied leading 1
  - significand = \(1.xxx \ldots x_2\)
  - Minimum when 000...0 (significand = 1.0)
  - Maximum when 111...1 (significand = \(2.0 - \epsilon\))
  - Get extra leading bit for free
- Exponent is biased to make sorting easier
  - exponent: 1 ~ 254 (single precision)
  - bias: 127 for single precision (1023 for double precision)
- Format: \((-1)^{\text{sign}} \times (1 + \text{significand}) \times 2^{\text{exponent} - \text{bias}}\), where "significand" here means the stored fraction bits
    | sign | exponent | significand |

IEEE 754 FP: Normalized Values (cont'd)
- Example for -0.75
  - Decimal: -0.75 = -(1/2 + 1/4)
  - Binary: \(-0.11_2 = -1.1_2 \times 2^{-1}\)
  - Floating-point exponent = 126 = \(01111110_2\) (-1 = 126 - 127)
  - IEEE single precision: \(-0.75 = (-1)^{1} \times (1 + .1000 \ldots 0000_2) \times 2^{126-127}\)

      sign  exponent  significand
      1     01111110  10000000000000000000000

- Another example for 15213.0
  - \(15213_{10} = 11101101101101_2 = 1.1101101101101_2 \times 2^{13}\)
  - significand = \(1101101101101_2\) (implied leading 1)
  - exponent = 140 = \(10001100_2\) (exponent - bias = 13, bias = 127 for single precision)

      sign  exponent  significand
      0     10001100  11011011011010000000000
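Both encodings can be checked with Python's standard `struct` module, which packs a value as an IEEE 754 single. The helper name `float32_fields` is hypothetical.

```python
import struct

def float32_fields(x):
    """Return (sign, exponent, fraction) bit strings of x encoded as IEEE 754 single."""
    (word,) = struct.unpack(">I", struct.pack(">f", x))  # raw 32-bit pattern
    bits = format(word, "032b")
    return bits[0], bits[1:9], bits[9:]   # 1 sign bit, 8 exponent bits, 23 fraction bits
```

Calling `float32_fields(-0.75)` and `float32_fields(15213.0)` returns the same three fields worked out by hand above.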

IEEE 754 FP: Denormalized Values
- Condition: exp = 000...0
- Value
  - exponent = 1 - bias (e.g., single precision: -126 = 1 - 127)
  - significand = \(0.xxx \ldots x_2\) (no implied leading 1)
- Cases
  - exp = 000...0, frac = 000...0
    - Represents value 0
    - Note that there are distinct values +0 and -0 (depending on sign bit)
  - exp = 000...0, frac ≠ 000...0
    - Numbers very close to 0.0, in the open interval \((-1 \times 2^{-126},\ 1 \times 2^{-126})\)
    - "Gradual underflow": possible numeric values are spaced evenly near 0.0

[Figure: number line; normalized values below \(-1.0 \times 2^{-126}\), denormalized values around 0.0, normalized values above \(1.0 \times 2^{-126}\)]
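To make the denormalized rule concrete, here is a hand-written decoder for single-precision bit patterns. It is a sketch that covers only zero, denormalized, and normalized patterns (the all-ones exponent is deliberately not handled), and the name `decode_float32` is hypothetical.

```python
def decode_float32(word):
    """Decode a 32-bit pattern by hand (zero, denormalized, and normalized cases only)."""
    sign = -1.0 if word >> 31 else 1.0
    exp = (word >> 23) & 0xFF            # 8-bit exponent field
    frac = (word & 0x7FFFFF) / 2**23     # 23-bit fraction as a value in [0, 1)
    if exp == 0:
        # Denormalized: exponent is 1 - bias and there is no implied leading 1.
        return sign * frac * 2.0 ** (1 - 127)
    # Normalized: implied leading 1 and biased exponent.
    return sign * (1 + frac) * 2.0 ** (exp - 127)
```

Adjacent denormalized patterns differ by the same amount, so `decode_float32(2) - decode_float32(1)` equals `decode_float32(1)`, which illustrates the even spacing near 0.0.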

IEEE 754 FP: Special Values
- Condition: exp = 111...1
- exp = 111...1, frac = 000...0
  - Represents value ∞ (infinity)
  - Operation that overflows
  - Both positive and negative
  - e.g., 1.0/0.0 = -1.0/-0.0 = +∞, 1.0/-0.0 = -∞
- exp = 111...1, frac ≠ 000...0
  - Not-a-Number (NaN)
  - Represents case when no numeric value can be determined
  - e.g., 0.0/0.0, sqrt(-1), ∞ - ∞

IEEE 754 FP: Summary

  Single precision        Double precision        Object represented         Category
  Exponent   Fraction     Exponent   Fraction
  0          0            0          0            0                          denormalized
  0          Nonzero      0          Nonzero      ± denormalized number      denormalized
  1-254      Anything     1-2046     Anything     ± floating-point number    normalized
  255        0            2047       0            ± Infinity                 special
  255        Nonzero      2047       Nonzero      NaN (Not-a-Number)         special
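The single-precision half of the summary table maps directly onto a small classifier. This sketch uses the standard `struct` module to obtain the bit pattern; the function name `classify_float32` and the returned labels are hypothetical.

```python
import struct

def classify_float32(x):
    """Classify x by the exponent and fraction fields of its single-precision encoding."""
    (word,) = struct.unpack(">I", struct.pack(">f", x))
    exp = (word >> 23) & 0xFF     # 8-bit exponent field
    frac = word & 0x7FFFFF        # 23-bit fraction field
    if exp == 0:
        return "zero" if frac == 0 else "denormalized"
    if exp == 255:
        return "infinity" if frac == 0 else "NaN"
    return "normalized"
```

For example, `classify_float32(float("inf") - float("inf"))` returns `"NaN"`, since the difference of two infinities has no numeric value.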

Floating Point Addition
- Sketch of FP addition
  1. Align numbers to have the same exponent (the larger exponent)
  2. Add two significands with their signs and implied leading 1s
  3. Normalize the sum, checking overflow and underflow
  4. Round the sum and renormalize if necessary
- Example for 0.5 + (-0.4375)
  - FP format: \(0.5 = 1.000_2 \times 2^{-1}\), \(-0.4375 = -1.110_2 \times 2^{-2}\)
  - Align to the larger exponent: \(-1.110_2 \times 2^{-2} = -0.111_2 \times 2^{-1}\)
  - Addition of significands: \(1.000_2 + (-0.111_2) = 0.001_2\)
  - Result: \(0.001_2 \times 2^{-1}\)
  - Normalize: \(1.000_2 \times 2^{-4}\)
  - Checking overflow and underflow: -126 ≤ -4 ≤ 127
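The first three steps can be sketched with integer significands. In this toy model, which is an assumption for illustration and not the hardware algorithm, a number is a pair `(signed_significand, exponent)` with the significand scaled by `2**frac_bits`, so `1.000` with three fraction bits is 8. Bits shifted out are simply dropped, so step 4 (rounding) is not modeled. The name `fp_add` is hypothetical.

```python
def fp_add(a, b, frac_bits=3):
    """Add two (signed_significand, exponent) pairs; returns a normalized pair."""
    (sa, ea), (sb, eb) = a, b
    if ea < eb:                                   # make a the operand with the larger exponent
        (sa, ea), (sb, eb) = (sb, eb), (sa, ea)
    # Step 1: align b to the larger exponent (shifted-out bits are dropped).
    sb = (abs(sb) >> (ea - eb)) * (1 if sb >= 0 else -1)
    # Step 2: add the signed significands.
    total, exp = sa + sb, ea
    if total == 0:
        return 0, 0
    # Step 3: normalize so that 1.0 <= |significand| < 2.0.
    one = 1 << frac_bits
    while abs(total) >= 2 * one:
        total = (abs(total) >> 1) * (1 if total > 0 else -1)
        exp += 1
    while abs(total) < one:
        total <<= 1
        exp -= 1
    if not -126 <= exp <= 127:                    # single-precision exponent range
        raise OverflowError("exponent out of range")
    return total, exp
```

With 0.5 as `(8, -1)` and -0.4375 as `(-14, -2)`, `fp_add((8, -1), (-14, -2))` returns `(8, -4)`, that is, 1.000 × 2^-4.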

FP Adder Hardware
[Figure: FP adder datapath, annotated with Step 1 through Step 4]

Floating-Point Multiplication
- Sketch of FP multiplication
  - Add exponents without bias
  - Multiply two significands
  - Check for underflow and overflow
  - Normalize the result
  - Round and renormalize if necessary
  - Set the sign bit
- Example for \((1.000_2 \times 2^{-1}) \times (-1.110_2 \times 2^{-2})\)
  - Exponents: -1 + (-2) = -3
  - Significand: \(1.000_2 \times 1.110_2 = 1.110000_2\)
  - Check: -126 ≤ -3 ≤ 127
  - Normalize: \(1.110_2 \times 2^{-3}\)
  - Sign bit: \(-1.110_2 \times 2^{-3}\)
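The multiplication steps can be modeled the same way. This sketch assumes both inputs are normalized, nonzero pairs `(signed_significand, exponent)` with the significand scaled by `2**frac_bits`; extra product bits are truncated rather than rounded. The name `fp_mul` and this representation are hypothetical.

```python
def fp_mul(a, b, frac_bits=3):
    """Multiply two normalized (signed_significand, exponent) pairs."""
    (sa, ea), (sb, eb) = a, b
    exp = ea + eb                            # add the unbiased exponents
    mag = abs(sa) * abs(sb)                  # product carries 2 * frac_bits fraction bits
    if mag >= 2 << (2 * frac_bits):          # product in [2, 4): normalize
        mag >>= 1
        exp += 1
    mag >>= frac_bits                        # truncate back to frac_bits (no rounding)
    if not -126 <= exp <= 127:               # single-precision exponent range
        raise OverflowError("exponent out of range")
    sign = -1 if (sa < 0) != (sb < 0) else 1 # result is negative if the signs differ
    return sign * mag, exp
```

For the example above, `fp_mul((8, -1), (-14, -2))` returns `(-14, -3)`, that is, -1.110 × 2^-3.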

FP Arithmetic Hardware
- FP multiplier is of similar complexity to FP adder
  - But uses a multiplier for significands instead of an adder
- Operations are somewhat more complicated
  - In addition to overflow we can have underflow
- FP arithmetic hardware usually does
  - Addition, subtraction, multiplication, division, reciprocal, square root
  - FP <-> integer conversion
- Operations usually take several cycles
  - Can be pipelined

FP Instructions in MIPS
- FP hardware is coprocessor 1
  - Adjunct processor that extends the ISA
- Separate FP registers
  - 32 single-precision: $f0, $f1, ..., $f31
  - Paired for double-precision: $f0/$f1, $f2/$f3, ...
    - Release 2 of MIPS ISA supports 32 × 64-bit FP registers
- FP instructions operate only on FP registers
  - Generally no integer operations on FP data, or vice versa
  - More registers with minimal code-size impact

Floating Point Instructions (MIPS)
- Single precision and double precision instructions
  - Addition: add.s, add.d
  - Subtraction: sub.s, sub.d
  - Multiplication: mul.s, mul.d
  - Division: div.s, div.d
- Comparison sets or clears FP condition code bit
  - c.xx.s, c.xx.d (xx can be eq, neq, lt, le, gt, ge)
    - e.g., c.lt.s $f3, $f4
- Branch on FP condition code true or false
  - bc1t, bc1f
    - e.g., bc1t targetlabel
- FP load and store instructions (from/to coprocessor 1)
  - lwc1, ldc1, swc1, sdc1 (w: 32-bit data, d: 64-bit data)
    - e.g., ldc1 $f8, 32($sp)

Accurate Arithmetic
- IEEE Std 754 specifies additional rounding control
  - Extra bits of precision (guard, round, sticky)
  - Choice of rounding modes
  - Allows programmer to fine-tune numerical behavior of a computation
- Not all FP units implement all options
  - Most programming languages and FP libraries just use defaults
- Trade-off between hardware complexity, performance, and market requirements

Ariane 5
- Ariane 5 tragedy (June 4, 1996)
  - Exploded 37 seconds after liftoff
  - Satellites worth $500 million
- Why?
  - Computed horizontal velocity as floating-point number
  - Converted to 16-bit integer
    - Careful analysis of Ariane 4 trajectory proved 16-bit is enough
  - Reused a module from 10-year-old software
    - Overflowed for Ariane 5
    - No precise specification for the S/W

0.10
- Gulf War in 1991
  - US Patriot missile failed to intercept Iraqi Scud missile
  - 28 soldiers were killed
- Binary representation of 0.10
  - Patriot incremented a counter once every 0.10 seconds
  - Implemented to multiply the counter value by 0.10 to get actual time
  - 24-bit binary representation of 0.10 actually was 0.099999904632568359375, which is off by 0.000000095367431640625 (doesn't seem like much)
  - After 100 hours, the time is off by 0.34 seconds
  - Enough time for a Scud to travel 500 meters!
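The figures above can be reproduced with exact rational arithmetic. The sketch assumes the stored constant was 0.1 chopped (not rounded) to 23 bits after the binary point, which is an assumption chosen because it reproduces the quoted value; the variable names are hypothetical.

```python
from fractions import Fraction

TENTH = Fraction(1, 10)

# Assumption: 0.1 chopped to 23 fractional bits (int() truncates toward zero).
stored = Fraction(int(TENTH * 2**23), 2**23)
error_per_tick = TENTH - stored          # error added at every 0.1-second tick

ticks = 100 * 60 * 60 * 10               # 100 hours at ten ticks per second
drift_seconds = float(error_per_tick * ticks)

print(float(stored))                     # the chopped constant
print(round(drift_seconds, 2))           # accumulated clock error after 100 hours
```

Because the error per tick is constant, the drift grows linearly with uptime, which is why a long-running system was affected while a freshly restarted one would not have been.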

Summary
- Computer arithmetic is constrained by limited precision
- Bit patterns have no inherent meaning
  - But standards do exist for number representations
    - Two's complement, IEEE 754 floating point
  - Computer instructions determine meaning of the bit patterns
- Performance and accuracy are important
  - So there are many complexities in real machines
  - Algorithm choice is important
    - May lead to hardware optimizations for both space and time (e.g., multiplication)