ARM D COMPUTER ORGANIZATION AND Edition The Hardware/Software Interface Chapter 3 Arithmetic for Computers Modified and extended by R.J. Leduc - 2016
In this chapter, we will investigate: How integer arithmetic operations are carried out: Addition and subtraction Multiplication and division Floating-point (real) numbers How they are represented in a computer How their arithmetic operations are performed 3.1 Introduction Arithmetic for Computers Chapter 3 Arithmetic for Computers 2
3.2 Addition and Subtraction Integer Addition Example: 7 + 6 Overflow occurs if result out of range Adding +ve and ve operands, no overflow Adding two +ve operands Overflow if MSB of result is 1 Adding two ve operands Overflow if MSB of result is 0 Chapter 3 Arithmetic for Computers 3
Oversized Operands ARMv8 is set up to add two 64 bit numbers How do you handle the addition of two 128 bit numbers? You add the first 64 bits as normal You link the two additions using the carry out from this operation You add the upper 64 bits using the carry out of previous stage as the carry in of the current operation ARMv8 has instructions for this: see Section 2.19, Figure 2.41 (i.e. ADC, ADCS regular operation with added with carry ) Similar for subtraction The hardware that performs addition, subtraction and logical operations such as AND and OR is called an Arithmetic Logic Unit (ALU) Chapter 3 Arithmetic for Computers 4
Integer Subtraction To subtract, convert second operand to 2's complement version, and add the result to first operand Example: 7 6 = 7 + ( 6) +7: 0000 0000 0000 0111 6: 1111 1111 1111 1010 +1: 0000 0000 0000 0001 Overflow occurs if result is out of range Subtracting two +ve or two ve operands, no overflow Subtracting +ve from ve operand Overflow if MSB of result is 0 Subtracting ve from +ve operand Overflow if MSB of result is 1 Chapter 3 Arithmetic for Computers 5
n- Bit Ripple Carry Adder For each stage (called a full adder), we are adding xi + yi+ci to get si and ci+1 Normally the first carry in (c0) is set to zero could also be connected to the carry out (cn) from another n-bit adder, to form a 2n-bit adder Chapter 3 Arithmetic for Computers 6
Full Adder A full adder takes as input a carry from the previous stage (ci) and a bit from each n-bit number being added (xi and yi). It produces a sum bit (si) and a carry out to next stage (ci+1) Chapter 3 Arithmetic for Computers 7
Adder/ Subtractor If signal Add/Sub = 0, then we get X +Y + 0 If signal Add/Sub = 1, then we get X +Y + 1 = X Y Remember, 2's complement of Y is Y + 1, where Y is the bitwise complement of Y Chapter 3 Arithmetic for Computers 8
3.3 Multiplication Integer Multiplication We first look at base-10 longmultiplication Consider the calculation on right which we limited to only 0 and 1 digits For base-2, digits can be only 0 or 1 Means that at each step, we either get zero, or a shifted copy of multiplicand Final step is to add values from all steps together multiplicand multiplier product 1000 1001 1000 0000 0000 1000 1001000 Length of product is the sum of operand lengths Chapter 3 Arithmetic for Computers 9
Integer Multiplication II The 64 bit multiplicand starts in right half of multiplicand register It is shifted left one bit on each step Multiplier is shifted right once on each step Product register is initialized to zero Control decides when to shift each register and when to load a new value into product register Chapter 3 Arithmetic for Computers 10
Multiplication Algorithm Multiplier0 is the least significant bit in the multiplier register Only add multiplicand to product and store in product if multiplier0 = 1, else do nothing Figure should have 64th repetition? as label, not 32 See textbook for more efficient version of multiplier See example on page 195 in text Chapter 3 Arithmetic for Computers 11
Signed Multiplication Method presented so far would only work on unsigned numbers For signed, convert all operands to magnitude (positive) version Do multiplication as normal If the signs of initial operands not the same, negate the final product. Chapter 3 Arithmetic for Computers 12
LEGv8 Multiplication Three LEGv8 multiply instructions: MUL: multiply SMULH: signed multiply high Gives the upper 64 bits of the product, assuming the operands are signed UMULH: unsigned multiply high Gives the lower 64 bits of the product Gives the upper 64 bits of the product, assuming the operands are unsigned LEGv8 multiply instructions do not set overflow condition code Chapter 3 Arithmetic for Computers 13
quotient dividend divisor 1001 1000 1001010-1000 10 101 1010-1000 10 remainder n-bit operands yield n-bit quotient and remainder We first consider base-10 division Consider the calculation on left which we limited to only 0 and 1 digits Dividend = Quotient x Divisor + Remainder Need to check that divisor is not zero As binary numbers only have 0 or 1, the divisor can only divide into dividend once or not at all at each spot 3.4 Division Integer Division Chapter 3 Arithmetic for Computers 14
Division Hardware We start with quotient set to zero Shifted to left one bit per iteration and new bit added Divisor starts on left side of register and is shifted to the right one bit each iteration Control decides when to shift divisor and quotient registers and when to store new value in remainder Initially divisor in left half Initially dividend Chapter 3 Arithmetic for Computers 15
Division Algorithm At step1, we calculate Remainder register Divisor register This is to determine if divisor is smaller than dividend If remainder was larger (result positive), then divisor can divide into dividend Shift quotient left, set LSB to 1 Otherwise, shift quotient left and set LSB to zero Chapter 3 Arithmetic for Computers 16
Signed Division For signed operands, do division using absolute values of the operands If divisor and dividend not originally same sign, then negate quotient Adjust the sign of the remainder by using the following rule: Dividend (original sign) and remainder must have same sign The signs of the divisor and quotient do not affect sign of remainder See example on page 200 of text Chapter 3 Arithmetic for Computers 17
LEGv8 Division LEGv8 has two instructions for division: SDIV (signed) UDIV (unsigned) Both instructions ignore overflow and division-by-zero Chapter 3 Arithmetic for Computers 18
Method to represent real numbers in digital hardware Number represented as an n-bit integer part, and a k-bit fractional part 3.5 Floating Point Fixed Point Numbers This means the decimal point is fixed For binary number: B = bn-1... b0.b-1b-2 b-k, its base-10 value is: V(B) = Σi=-k to n-1 (bi x 2i ) For B = 000.0001001, if n = 4, and k= 3, we would get 0000.000 Limited usefulness Chapter 3 Arithmetic for Computers 19
Floating Point Numbers To represent a number as fixed point, require a digit for each position Representing 2376 x 10-78 would require many zeros... Floating point method represents numbers by a mantissa containing the significant digits plus an exponent of Radix R: Mantissa x RExponent Numbers are often normalized so that the decimal place is to the right of the leftmost digit. i.e. 5.234 x 1043 or 6.31 x 10-28 Chapter 3 Arithmetic for Computers 20
IEEE Floating Point Formats Two main representations are the IEEE singleprecision 32 bit format and the IEEE doubleprecision 64-bit format Chapter 3 Arithmetic for Computers 21
Floating Point Numbers II Numbers are normalized so that mostsignificant bit of mantissa (fraction) is a 1 Base 10 value for single-precision number is: V = +/- 1.M x 2Exponent The exponent is stored in excess 127 format so that E is unsigned and ranges from 0 to 255. Thus: Exponent = E -127 Chapter 3 Arithmetic for Computers 22
Floating Point Numbers III The value of E = 0 is defined to be exact zero and E = 255 to be infinity. This means Exponent can go in range -126 to 127 Example: (13) 3 10 = (1101)2 = 1.101 x 2, E = 3 + 127 = 130 Single-precision: 0 10000010 1010... 0 = 0 E M In C, types float and double Chapter 3 Arithmetic for Computers 23
Floating-Point Example II What number is represented by the singleprecision float 11000000101000 00 S=1 Fraction = 01000 002 E = 100000012 = 129 X = ( 1) (1.01)2 2(129 127) = ( 1) 1.25 22 = 5.0 Chapter 3 Arithmetic for Computers 24
Overflow and Underflow With floating point numbers, overflow means that the exponent is too large to fit in the exponent field. Underflow occurs when the negative exponent is too small to fit in the exponent field. For single precision, exponents can go in range from -126 to 127 For double precision, the range is -1022 to 1023 Chapter 3 Arithmetic for Computers 25
Exceptions LEGv8 uses exceptions to inform the user when floating point overflow or underflow occurs An exception is an unscheduled procedure call The address of the instruction that overflowed/ underflowed is saved in a register Computer jumps to a predefined address to invoke the appropriate routine for that exception Chapter 3 Arithmetic for Computers 26
Infinities and NaNs IEEE standard defines some codes to identify special situations: Exponent = 111...1, Mantissa = 000...0 Means the number is ±infinity This can be used in subsequent calculations, avoiding need for overflow check Exponent = 111...1, Mantissa 000...0 Result is Not-a-Number (NaN) Indicates illegal or undefined result e.g., 0.0 / 0.0 Can be used in subsequent calculations Chapter 3 Arithmetic for Computers 27
FP Registers LEGv8 has 32 double-precision registers just for floating point operations Labeled: D0,, D31 LEGv8 has 32 single-precision registers just for floating point operations Labeled: S0,, S31 FP instructions operate only on FP registers Programs generally don t do integer ops on FP data, or vice versa Means more registers with minimal codesize impact Chapter 3 Arithmetic for Computers 28
FP Registers II FP load and store instructions LDURS - load single-precision LDURD - load double-precision STURS - store single-precision STURD - store double-precision See Elaboration on page 222 of text about ARMv8 difference from LEGv8 Figure 3.17 of text gives a listing of the new FP instructions plus some examples Chapter 3 Arithmetic for Computers 29
FP Instructions in LEGv8 Single-precision arithmetic FADDS, FSUBS, FMULS, FDIVS Double-precision arithmetic FADDD, FSUBD, FMULD, FDIVD FCMPS, FCMPD Sets or clears FP condition-code bits Branch on FP condition code true or false e.g., FADDD D2, D4, D6 Single- and double-precision comparison e.g., FADDS S2, S4, S6 B.cond (these are the B.LE etc. conditional branches discussed earlier) See the FP examples on page 222 and 223 of text Chapter 3 Arithmetic for Computers 30
Read Section 3.6 for own interest Contains some common terms (i.e. SIMD) that you should be aware of 3.6 Parallelism and Computer Arithmetic: Subword Parallelism Subword Parallellism Chapter 3 Arithmetic for Computers 31
Read Section 3.8 for own interest Useful if you wish to program in ARMv8 assembly in the future 3.8 the Rest of the ARMv8 Arithmetic Instructions REAL Stuff: The rest of ARMv8 Arithmetic Instructions Chapter 3 Arithmetic for Computers 32
Read Section 3.10 on own 3.10 Fallacies and Pitfalls Fallacies and Pitfalls Chapter 3 Arithmetic for Computers 33
Read Section 3.11 on own 3.11 Concluding Remarks Concluding Remarks Chapter 3 Arithmetic for Computers 34