Slide Set 11 for ENCM 369 Winter 2015 Lecture Section 01
Steve Norman, PhD, PEng
Electrical & Computer Engineering
Schulich School of Engineering
University of Calgary
Winter Term, 2015

Contents
Integer Multiplication and Division
Introduction to Floating-Point Numbers
MIPS Formats for F-P Numbers
IEEE Floating-Point Standards
MIPS Floating-Point Registers
Coprocessor 1
Translating C F-P Code to MIPS A.L.
Quick Overview of F-P Algorithms and Hardware
Some Data and Remarks about Speed of Arithmetic

Integer Multiplication and Division
So far in ENCM 369, we've looked in detail at integer addition and subtraction, and also at the use of left-shift instructions for multiplication by powers of two.
Obviously, many computer programs need to multiply integers by values that are not powers of two, and some programs need to do integer division.

However, the basics of integer multiplication and division with computers are relatively easy to learn by reading textbooks and online material, and lecture time in ENCM 369 in Winter 2015 is now very scarce.
So lecture content will skip integer multiplication and division, and move on to floating-point numbers.
To find slides on integer multiplication and division, go to the ENCM 369 Winter 2015 Home Page, then click on "Page with links to PDFs of handouts and other documents used in both lecture sections".
You can expect that a small number of marks on the 2015 final exam will be related to basic understanding of MIPS integer multiplication and division instructions.

Introduction to floating-point numbers
Floating-point is the generic name given to the kinds of numbers you've seen in C and C++ with types double and float.
Section 5.3.2 in the textbook is about the basic concepts of floating-point numbers.
Section 6.7.4 provides a very brief introduction to MIPS floating-point registers and instructions.

Scientific Notation
This is a format that engineering students should be very familiar with!
Example: 6.02214179 × 10^23 mol^-1
Example: 1.60217656 × 10^-19 C
Floating-point representation has the same structure as scientific notation, but floating-point typically uses base two, not base ten.

Introductory floating-point example
A programmer gives a value to a constant in some C code:
    const double electron_charge = -1.60217656e-19;
The C compiler will use the base ten constant in the C code to create a base two constant a computer can work with.
When the program runs, the number the computer uses is
    -1.0111101001001101101000010110101110011100011110101101 × two^(-111111)
(the exponent is written in base two; it is -63 ten), which is very close to but not exactly equal to -1.60217656 × 10^-19.
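
One way to look at the base two value a C system actually stores is the C99 "%a" conversion, which writes the significand in hexadecimal followed by the power of two. The sketch below is only an illustration; the function names format_hex_double and show_electron_charge are made up for this example, and the exact digit layout printed by "%a" can vary a little between C libraries.

```c
#include <stdio.h>

/* Write d into buf in C99 hexadecimal floating-point form, for example
 * "0x1.8p+1" for 3.0: hex significand, then 'p', then the power of two
 * written in base ten. (Hypothetical helper, not from the slides.) */
void format_hex_double(double d, char *buf, size_t size)
{
    snprintf(buf, size, "%a", d);
}

/* Print the constant from the slide in base ten and in hex form. */
void show_electron_charge(void)
{
    const double electron_charge = -1.60217656e-19;
    char buf[64];

    format_hex_double(electron_charge, buf, sizeof buf);
    printf("%.17g is stored as %s\n", electron_charge, buf);
}
```

Calling show_electron_charge() from main should show a power of two of -63, matching the exponent -111111 two given above. Each hex digit after the point stands for four bits of the fraction.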

Names for parts of a non-zero floating-point number
Example: - 1.01001100011010111010111 × two^00001011
    sign:        -
    significand: 1.01001100011010111010111
    fraction:    01001100011010111010111
    exponent:    00001011
The significand includes bits from both sides of the binary point. Another name for significand is mantissa. (Note: This is not base ten, so we should not use the term decimal point!)
The fraction is the part of the significand that is to the right of the binary point. So the fraction represents some number that is ≥ 0 but < 1.

Normalized non-zero floating-point numbers
In normalized form, an f-p number must have a single 1 bit immediately to the left of the binary point, and no other 1 bits left of the binary point.
Therefore, the significand of a normalized number must be ≥ 1.0 and must also be < 10.0 two. (In English: greater than or equal to one, strictly less than two.)

Normalized non-zero f-p numbers: examples
Which of the following are in normalized form?
A. -1.00000000 × two^00000101
B. +10.0000000 × two^00100101
C. +1.10001011 × two^00010111
D. -0.11101100 × two^00001100
E. +101.111011 × two^01001100

Example conversion from base ten to base-two floating-point
What is 9.375 ten expressed as a normalized f-p number?
What are the sign, significand, fraction, and exponent of this normalized f-p number?

Standard organizations for bits of floating-point numbers
For computer hardware to work with f-p numbers there must be precise rules about how to encode these numbers.
The most usual overall sizes for f-p numbers are 32 bits or 64 bits, but other sizes (e.g., 16, 80, or 128 bits) are possible.
We need one bit for the sign and some number of bits for information about the exponent; the remaining bits can be used for information about the significand.

Sign information for non-zero f-p numbers
This requires a single bit.
A sign bit of 0 is used for positive numbers.
A sign bit of 1 is used for negative numbers.

Exponent information for non-zero f-p numbers
Exponents in f-p numbers are signed integers! f-p numbers with small magnitudes will have negative exponents.
So of course two's complement is used for exponents, right...?
WRONG! In fact, an alternate system for signed integers, called biased notation, is used for exponents in f-p numbers.
(This fact explains why many introductions to two's-complement systems state that two's complement is almost always used for signed integers in modern digital hardware.)

How does biased notation work?
The biased exponent is equal to the actual exponent plus some number called a bias.
The bias is chosen so that roughly half the allowable actual exponents are negative, and roughly half are positive.
Example: The bias for an 8-bit exponent is 127 ten, or 0111_1111 two. If the actual exponent is 3 ten, what is the biased exponent in base ten and base two?
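
The rule "biased exponent = actual exponent + bias" can be sketched in a few lines of C. The names BIAS_8BIT, to_biased and from_biased are invented for this illustration. With a bias of 127, an actual exponent of 3 gives 130 ten = 1000_0010 two, and an actual exponent of -3 gives 124 ten = 0111_1100 two.

```c
/* Bias for the 8-bit exponent field described above. */
#define BIAS_8BIT 127

/* Biased exponent = actual exponent + bias. */
unsigned to_biased(int actual_exponent)
{
    return (unsigned)(actual_exponent + BIAS_8BIT);
}

/* Actual exponent = biased exponent - bias. */
int from_biased(unsigned biased_exponent)
{
    return (int)biased_exponent - BIAS_8BIT;
}
```

Note that the biased values are always non-negative, so they can be stored as ordinary unsigned bit patterns.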

Why is biased notation used for exponents in f-p numbers?
It turns out that biased notation helps with the design of relatively small, speedy circuits to decide whether one f-p number is less than another f-p number. (We won't study the details of that in ENCM 369.)
Also, it's useful that the bit pattern for an actual exponent of zero is not a sequence of zero bits; then a sequence of zero bits can have a different, special meaning.

Significand information for a non-zero, normalized f-p number
Significand: 1.XXX...XXX
    Bit to the left of the binary point: we know this bit will be a 1.
    Bits to the right of the binary point: any pattern of 1's and 0's is possible here.
There is no need to encode the entire significand. Instead we can record only the bits of the fraction.
Leaving out the 1 bit from the left of the binary point allows more precision in the fraction.

MIPS formats for 32-bit and 64-bit f-p numbers
32-bit format:
    bit 31:      sign bit
    bits 30-23:  biased exponent
    bits 22-0:   fraction
64-bit format:
    bit 63:      sign bit
    bits 62-52:  biased exponent
    bits 51-0:   fraction
Exponent bias for 32-bit format: 127 ten = 0111_1111 two.
Exponent bias for 64-bit format: 1023 ten = 011_1111_1111 two.

MIPS formats for 32-bit and 64-bit f-p numbers
The 32-bit format is called single precision.
The 64-bit format is called double precision.
We'll see later that MIPS instruction mnemonics for single-precision operations end in .s, as in mov.s, while the mnemonics for double-precision operations end in .d, as in add.d.

Example: How is 9.375 ten encoded in 32-bit and 64-bit formats?
From previous work: 9.375 = 9 + 1/4 + 1/8 = 1001.011 two = 1.001011 × two^three (normalized)
For each of the 32-bit and 64-bit formats, what are the bit patterns for the biased exponents?
What are the complete bit patterns for the f-p numbers?
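
A pencil-and-paper answer can be checked by asking a C system for the bits it stores. This is a minimal sketch, assuming float and double are the 32-bit and 64-bit formats described above (true on typical systems); float_bits and double_bits are names made up for this example. Working by hand, the 32-bit pattern has sign 0, biased exponent 3 + 127 = 130 = 1000_0010 two, and fraction 001011 followed by 17 zeros, which is 0x41160000. The 64-bit pattern has biased exponent 3 + 1023 = 1026 = 100_0000_0010 two and fraction 001011 followed by 46 zeros, which is 0x4022C00000000000.

```c
#include <stdint.h>
#include <string.h>

/* Return the 32 bits stored for a float. memcpy copies the bit pattern
 * unchanged; a cast such as (uint32_t)f would convert the VALUE instead. */
uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Return the 64 bits stored for a double. */
uint64_t double_bits(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return bits;
}
```

For example, printf("%08x\n", (unsigned)float_bits(9.375f)) can be used to print the 32-bit pattern in hexadecimal.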

More examples
How would 9.375 ten be encoded in the 32-bit format?
How would 0.125 ten be encoded in the 32-bit format?
What base ten number does the 32-bit pattern 1_0111_1110_11_[21 zeros] represent?
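
Going from a bit pattern back to a value follows the same rules in reverse: pull out the three fields, remove the bias, and put back the 1 bit that was left out of the fraction. The function below is a sketch of that process for normalized numbers only; decode_single is a hypothetical name. For the pattern 1_0111_1110_11_[21 zeros], which is 0xBF600000 in hexadecimal, the sign bit is 1, the biased exponent is 126 (actual exponent -1), and the significand is 1.11 two = 1.75, so the value is -1.75 × 2^-1 = -0.875.

```c
#include <stdint.h>

/* Decode a 32-bit f-p pattern by hand. Handles normalized numbers only;
 * zero, denormalized numbers, infinity and NaN are NOT handled. */
double decode_single(uint32_t bits)
{
    int sign = (int)(bits >> 31);
    int exponent = (int)((bits >> 23) & 0xFF) - 127;   /* remove the bias */
    uint32_t fraction = bits & 0x7FFFFF;               /* low 23 bits */

    /* Put back the 1 bit that is not stored: significand = 1.fraction */
    double value = 1.0 + fraction / 8388608.0;         /* 8388608 = 2^23 */

    /* Scale by 2^exponent using exact multiplications or divisions by 2. */
    for (; exponent > 0; exponent--)
        value *= 2.0;
    for (; exponent < 0; exponent++)
        value /= 2.0;

    return sign ? -value : value;
}
```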

How to represent zero in f-p formats
A special rule says that if all exponent and fraction bits are zero, the number being represented is 0.0.
So, what are the representations of 0.0 in 32-bit and 64-bit formats?

IEEE standards for floating-point numbers and arithmetic (1)
IEEE: Institute of Electrical and Electronics Engineers
"IEEE 754" and "IEEE floating-point" are informal names for both the original IEEE 754-1985 standard and the revised IEEE 754-2008 standard.
Prior to the development of the IEEE 754-1985 standard, different companies produced a wide variety of incompatible schemes for floating-point numbers.

IEEE standards for floating-point numbers and arithmetic (2)
Modern computer architectures (if they have f-p at all) typically implement part or all of IEEE standard f-p.
MIPS follows the IEEE standard for 32-bit and 64-bit f-p types. The same is true for x86, x86-64, ARM and many other architectures. (So examples in earlier slides were not really MIPS-specific; they would also be correct for many other architectures.)
In C and C++, float is typically 32-bit IEEE f-p, and double is typically 64-bit IEEE f-p.

Scope of IEEE f-p standards
In addition to 32-bit and 64-bit formats, various other formats are specified, for example, 16-bit and 128-bit formats.
There are detailed rules for arithmetic: comparison, addition, multiplication, and many other operations.
There are detailed rules for rounding: choosing an approximate value when exact results can't be represented.

Special IEEE f-p bit patterns
    exponent bits   fraction bits        meaning
    all 0's         all 0's              number is 0.0, as seen already
    all 0's         at least one 1 bit   denormalized number
    all 1's         all 0's              ±infinity, depending on sign bit
    all 1's         at least one 1 bit   NaN: not a number
If the exponent field of an IEEE f-p bit pattern has at least one 0 bit and at least one 1 bit, the bit pattern represents a normal, non-zero f-p number.
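
The table of special patterns translates almost directly into code. Here is a small sketch for the 32-bit format; the enum fp_kind and the function classify_single are names invented for this illustration (the C library offers fpclassify in <math.h> for real programs).

```c
#include <stdint.h>

enum fp_kind {
    KIND_ZERO,
    KIND_DENORMALIZED,
    KIND_INFINITY,
    KIND_NAN,
    KIND_NORMAL
};

/* Classify a 32-bit IEEE f-p bit pattern using only its exponent and
 * fraction fields. The sign bit (bit 31) does not affect the result. */
enum fp_kind classify_single(uint32_t bits)
{
    uint32_t exponent = (bits >> 23) & 0xFF;   /* bits 30-23 */
    uint32_t fraction = bits & 0x7FFFFF;       /* bits 22-0  */

    if (exponent == 0)                         /* exponent bits all 0's */
        return fraction == 0 ? KIND_ZERO : KIND_DENORMALIZED;
    if (exponent == 0xFF)                      /* exponent bits all 1's */
        return fraction == 0 ? KIND_INFINITY : KIND_NAN;
    return KIND_NORMAL;                        /* mix of 0 and 1 bits */
}
```

For example, classify_single(0x7F800000) is KIND_INFINITY (sign 0, exponent bits all 1's, fraction bits all 0's), and classify_single(0x41160000), the pattern for 9.375, is KIND_NORMAL.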

Denormalized numbers
These are non-zero numbers with magnitudes so tiny that they can't be represented in the normal sign-exponent-fraction format.
Example: 1.25 × 2^-128 in the 32-bit format. The range of biased exponents for normal numbers is 0000_0001 two to 1111_1110 two, that is, 1 to 254 ten, which allows encoding of actual exponents from -126 ten to +127 ten. An actual exponent of -128 is outside that range.
We will NOT study the details of the denormalized number format in ENCM 369.
(However, if you're curious, 1.25 × 2^-128 is represented as 0 00000000 01010000000000000000000 in the 32-bit format.)

Infinity
IEEE standard f-p arithmetic specifies many ways to generate ±infinity. Some common examples...
x / 0.0 generates +infinity if x > 0.0.
x / 0.0 generates -infinity if x < 0.0.
If a and b are regular f-p numbers but the everyday math product a × b is too large in magnitude to be an f-p number, then a * b will be ±infinity, depending on the signs of a and b.
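
These rules can be tried out in C. The sketch below assumes a C system with IEEE f-p arithmetic and default settings (no trapping on division by zero), which is the usual case on x86-64 and ARM. The functions divide and multiply are trivial helpers made up for this example; putting the operations inside functions simply keeps a compiler from warning about dividing by a constant zero. INFINITY is a standard macro from <math.h>.

```c
#include <math.h>   /* for the INFINITY macro */

/* x / y with IEEE rules: a non-zero x divided by 0.0 gives +infinity or
 * -infinity, depending on the sign of x. */
double divide(double x, double y)
{
    return x / y;
}

/* a * b with IEEE rules: if the exact product is too large in magnitude
 * for a double, the result is +infinity or -infinity. */
double multiply(double a, double b)
{
    return a * b;
}
```

With these helpers, divide(1.0, 0.0) compares equal to INFINITY, divide(-1.0, 0.0) compares equal to -INFINITY, and multiply(1e200, 1e200) compares equal to INFINITY because 10^400 is far beyond the range of a 64-bit f-p number.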

NaN: not a number
NaN is specified as the result for many computations where not even ±infinity makes sense as a result. Examples...
0.0 / 0.0
infinity / infinity
sqrt(x), where x < 0.0
asin(x), where x > 1.0 or x < -1.0 (asin is the C library inverse sine function.)
arithmetic operation with one or more NaNs as inputs, e.g., 1.0 + x, where x is NaN
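
Some of the NaN examples above can be reproduced with a few lines of C. This is a sketch assuming IEEE f-p arithmetic with default compiler settings; the function names are invented for the example. The volatile qualifier is there only to make the division happen when the program runs instead of at compile time. isnan and INFINITY come from <math.h>.

```c
#include <math.h>   /* for isnan and INFINITY */

/* 0.0 / 0.0 is NaN. */
double zero_over_zero(void)
{
    volatile double zero = 0.0;
    return zero / zero;
}

/* infinity / infinity is NaN. */
double inf_over_inf(void)
{
    volatile double inf = INFINITY;
    return inf / inf;
}

/* Arithmetic with a NaN input gives a NaN result. */
double one_plus(double x)
{
    return 1.0 + x;
}
```

A program can test a result with isnan, for example isnan(one_plus(zero_over_zero())), which should be true (non-zero).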

Demonstration of f-p infinity
In everyday math, 1630^3 = 4,330,747,000 and (5.7 × 10^102)^3 = 1.85193 × 10^308.
    #include <stdio.h>

    int main(void)
    {
        int i = 1630;
        double d1 = 1630.0, d2 = 5.7e102;

        /* The int product is too big for a 32-bit int. */
        printf("%d cubed is %d\n", i, i * i * i);
        printf("%.1f cubed is %.1f\n", d1, d1 * d1 * d1);
        /* The double product is too big for a 64-bit f-p number. */
        printf("%g cubed is %g\n", d2, d2 * d2 * d2);
        return 0;
    }
Program output...
    1630 cubed is 35779704
    1630.0 cubed is 4330747000.0
    5.7e+102 cubed is inf

Demonstration of Not a Number
    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        double a = 1.0, b = 2.0, c = 2.0;
        double sqrt_of_d, r1, r2;

        /* b * b - 4.0 * a * c is -4.0, so sqrt returns NaN. */
        sqrt_of_d = sqrt(b * b - 4.0 * a * c);
        r1 = (-b + sqrt_of_d) / (2.0 * a);
        r2 = (-b - sqrt_of_d) / (2.0 * a);
        printf("r1 = %g, r2 = %g\n", r1, r2);
        return 0;
    }
Program output...
    r1 = -nan, r2 = -nan

The usefulness of infinity and NaN
Recall that for integer addition, subtraction, and multiplication, C and C++ systems usually will NOT tell you that results are wrong because magnitudes of numbers got out of hand.
Results of ±infinity or NaN in floating-point computation clearly indicate that something has gone wrong. This is helpful!
Of course, absence of ±infinity and NaN does NOT prove that your program's results are correct!

ENCM 369 Lecture Document: Floating-Point Format Examples
Please read this document carefully. Here are some brief notes on what you will see:
π cannot be represented exactly in f-p format. (This is probably not a surprise.)
0.6 cannot be represented exactly in f-p format. (This might be surprising.)
32-bit and 64-bit f-p approximations are given for π and 0.6.
f-p bit patterns for 1.0 are given; they are very different from integer bit patterns for 1.

Floating-point registers
Most processor architectures that have f-p instructions have a set of floating-point registers (FPRs) that is separate from the set of general-purpose registers (GPRs).
Important: Most f-p instructions have only FPRs as sources and destination!
But there have to be a few instructions for copying data between FPRs and GPRs, or between FPRs and memory.

MIPS FPRs
There are sixteen 64-bit double-precision FPRs: $f0, $f2, $f4, ..., $f28, $f30. (Note that odd numbers are not allowed for names of these double-precision registers.)
There are thirty-two 32-bit single-precision FPRs: $f0, $f1, $f2, ..., $f30, $f31.
Attention: Unlike the set of GPRs, where $zero has special behaviour, none of the FPRs hold a constant value of 0.0.
Section 6.7.4 in the textbook suggests names such as $fv0, $fv1, $ft0-$ft3, and so on for the 64-bit FPRs. Those names do not work in MARS!

MIPS FPR organization: Each 64-bit FPR shares bits with two 32-bit FPRs...
    64-bit double-precision FPR    32-bit single-precision FPRs
    $f0                            $f1, $f0
    $f2                            $f3, $f2
    ...                            ...
    $f30                           $f31, $f30

Each 64-bit MIPS FPR shares bits with two 32-bit FPRs: Detailed example
    bits 63-32 of 64-bit $f4 are bits 31-0 of 32-bit $f5
    bits 31-0 of 64-bit $f4 are bits 31-0 of 32-bit $f4
A program using the 64-bit $f4 for a double variable must not at the same time use the 32-bit $f4 or $f5 for a float variable!

Coprocessor 1: The MIPS term for floating-point unit
In the old days, when dinosaurs roamed, and processor chips only had hundreds of thousands of transistors (or fewer), f-p units were literally coprocessors.
The main processor and the f-p unit were separate chips with separate sockets on a motherboard.
Example: Intel 80386 (main processor) and 80387 (f-p unit).

Coprocessor 1, continued
In 2015, a single chip (with hundreds of millions of transistors) can have 2 or 4 or 8 cores; each core has a main processor, its own floating-point unit, and a lot of other stuff.
Students in ENCM 369 need to know that coprocessor 1 means floating-point unit, because c1 shows up in the mnemonics for many MIPS f-p instructions, and because the Coproc 1 tab in MARS is where you need to look to find values of FPRs.

Some important MIPS c1 instructions
    mnemonic  what mnemonic stands for          operation performed
    mtc1      move to coprocessor 1             copy 32 bits from GPR to 32-bit FPR
    mfc1      move from coprocessor 1           copy 32 bits from 32-bit FPR to GPR
    lwc1      load word to coprocessor 1        copy 32 bits from memory to 32-bit FPR
    swc1      store word from coprocessor 1     copy 32 bits from 32-bit FPR to memory
    ldc1      load double to coprocessor 1      copy 64 bits from memory to 64-bit FPR
    sdc1      store double from coprocessor 1   copy 64 bits from 64-bit FPR to memory

Example translation #1 from C code to MIPS f-p instructions
    x = i + y;
x and y are of type double in $f2 and $f4, and i is of type int in $s0.
WRONG ANSWER:
    addu $f2, $s0, $f4
Why is this a wrong answer? What would be correct code?
Let's trace the correct code assuming that $s0 contains 2 and $f4 contains 1.5.
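
Part of the reason the wrong answer is wrong can be seen from C: the int 2 and the double 2.0 have completely different bit patterns, so the integer must be converted before an f-p add makes sense. C does that conversion silently; assembly language code has to do it with explicit instructions. (One typical MIPS sequence, given here as an assumption and not taken from the slides, is mtc1 to copy the int bits into an FPR, cvt.d.w to convert them to a double, then add.d.) The sketch below uses made-up function names and assumes double is the 64-bit IEEE format.

```c
#include <stdint.h>
#include <string.h>

/* Return the 64 bits stored for a double. */
uint64_t double_bits(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return bits;
}

/* What x = i + y means in C when i is an int and y is a double. */
double add_int_to_double(int i, double y)
{
    double converted = (double)i;   /* explicit here; implicit in i + y */
    return converted + y;
}
```

With i = 2 and y = 1.5, add_int_to_double returns 3.5. The bit pattern of the converted value 2.0 is 0x4000000000000000, which looks nothing like the integer pattern 0x00000002.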

Example translation #2 from C code to MIPS f-p instructions
    if (x < y)
        x = y;
x and y are of type double in $f2 and $f4.
WRONG ANSWER:
        slt   $t0, $f2, $f4
        bne   $t0, $zero, L1
        add.d $f2, $f4, $zero
    L1:
Why is this a wrong answer? What would be correct code?

F-P comparisons in MIPS: How to compare?
type is d for double precision, s for single precision...
    Instruction              Test
    c.lt.type FPR1, FPR2     is FPR1 < FPR2?
    c.le.type FPR1, FPR2     is FPR1 ≤ FPR2?
    c.eq.type FPR1, FPR2     is FPR1 = FPR2?
slt puts its result in a GPR. Where does an f-p comparison put its result?

F-P comparisons in MIPS: How to branch?
bc1t: branch if coprocessor 1 flag is true.
bc1f: branch if coprocessor 1 flag is false.
Messy detail: Actually, MIPS has eight separate coprocessor 1 flag bits, but by default c.lt.d, c.le.d, c.eq.d, c.lt.s, c.le.s, c.eq.s, bc1t and bc1f all access the same single flag bit.

Key things to learn from examples #1 and #2
Do NOT assume that f-p instructions are organized just like integer instructions!
Mixing types often works in C arithmetic expressions but usually DOESN'T work in assembly language arithmetic instructions.
Before writing f-p MARS code in Lab 12, carefully study f-p instruction documentation provided along with the lab instructions.

Register-use conventions and FPRs in ENCM 369
Students are expected to know conventions related to use of GPRs.
Use of FPRs makes register-use conventions much more complicated.
We'll use simplified register-use conventions for FPRs; each lab and final-exam f-p programming problem will give a description of FPR-use conventions needed for that problem.

Addresses live in GPRs, never in FPRs!
    void foo(void)
    {
        double d;
        double *p;
        /* more code */
    }
What kind of register should be used for d?
What kind of register should be used for p?

More detail about MIPS f-p programming
There won't be any more lecture time spent on details of MIPS floating-point instructions.
You'll learn about the most frequently-used f-p instructions by doing Lab 12.

The minifloat type (as introduced in Lab 12)
This is an 8-bit f-p type similar to the IEEE 754 types, but with tiny, tiny exponent and fraction fields...
    bit 7:     sign bit
    bits 6-4:  biased exponent
    bits 3-0:  fraction
Minifloat is useless for practical computation, but good for classroom examples and pencil-and-paper lab exercises.
The exponent bias is 3 ten = 011 two.

Let's try to understand f-p addition by adding two minifloats...
(Similar steps would be needed for 32- or 64-bit addition, but we would have to keep track of a lot more bits!)
Bits of a are 01001111; bits of b are 00010110.
a represents 3.875 ten; b represents 0.34375 ten. (Check these values yourself!)
So, how to compute the best possible minifloat result for a + b?
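
The values of a and b can be checked with a short decoder. This is a sketch for normalized minifloats only, written under the assumption that the format works like the 32-bit format with smaller fields (bias 3, hidden 1 bit); minifloat_to_double is a name made up here, and the special patterns with exponent bits 000 or 111 are not handled. For a = 01001111, the biased exponent is 100 two = 4, so the actual exponent is 1, and the significand is 1.1111 two = 1.9375, giving 3.875. For b = 00010110, the actual exponent is 1 - 3 = -2 and the significand is 1.0110 two = 1.375, giving 0.34375.

```c
#include <stdint.h>

/* Decode a normalized minifloat: bit 7 is the sign bit, bits 6-4 are
 * the biased exponent (bias 3), bits 3-0 are the fraction.
 * Patterns with exponent bits 000 or 111 are NOT handled. */
double minifloat_to_double(uint8_t bits)
{
    int sign = (bits >> 7) & 1;
    int exponent = ((bits >> 4) & 7) - 3;        /* remove the bias */
    double value = 1.0 + (bits & 0xF) / 16.0;    /* significand 1.ffff */

    /* Scale by 2^exponent with exact multiplications or divisions. */
    for (; exponent > 0; exponent--)
        value *= 2.0;
    for (; exponent < 0; exponent++)
        value /= 2.0;

    return sign ? -value : value;
}
```

The exact sum 3.875 + 0.34375 = 4.21875 needs more fraction bits than a minifloat has. The two nearest minifloats are 01010000 (4.0) and 01010001 (4.25), so any minifloat result for a + b has to be rounded.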

Rounding errors in f-p arithmetic
Both rounded results in the minifloat addition example are approximations to the exact sum, which is 4.21875 ten.
The same kind of rounding errors will occur in 32- and 64-bit f-p arithmetic operations.
Relative sizes of rounding errors decrease as the number of fraction bits increases.

Floating-point hardware example: Adder
(Unfortunately this year's textbook doesn't have any example circuits for f-p arithmetic.)
An f-p adder would have to implement all the steps we've just seen in minifloat addition:
    comparing exponents of the two inputs
    shifting the input with the smaller exponent
    adding
    normalizing the sum
    rounding the sum
Note how much more complicated this is than a simple integer adder!

Floating-point hardware concepts: Just a couple of remarks
f-p arithmetic circuits are significantly larger and more complex than integer arithmetic circuits.
But modern f-p circuits are very fast, because f-p performance has been a key selling point for processors and other digital hardware...
    games and video processing for consumers
    high-speed number-crunching for science and industry

Remember this: F-P math is usually approximate
    /* Classic mistake: Counting using fractions... */
    double x;
    for (x = 0.1; x <= 0.3; x += 0.1)
        printf("x is %f\n", x);
Expected output...
    x is 0.100000
    x is 0.200000
    x is 0.300000
Actual output...
    x is 0.100000
    x is 0.200000
What went wrong here?
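
What went wrong: none of 0.1, 0.2 and 0.3 can be represented exactly in base two, and with 64-bit IEEE arithmetic the rounded result of 0.1 + 0.1 + 0.1 is slightly greater than the double nearest to 0.3, so the test x <= 0.3 fails on the third pass. A common fix is to count with an int and compute x from the counter. The sketch below shows both versions; the function names are made up for this example, and the pass counts assume ordinary 64-bit double arithmetic as on x86-64.

```c
#include <stdio.h>

/* The classic mistake: a double used as the loop counter.
 * Returns the number of passes through the loop. */
int count_with_double(void)
{
    int passes = 0;
    double x;

    for (x = 0.1; x <= 0.3; x += 0.1) {
        printf("x is %f\n", x);
        passes++;
    }
    return passes;
}

/* Safer pattern: count with an int, which is exact, and compute x
 * from the counter. Returns the number of passes through the loop. */
int count_with_int(void)
{
    int passes = 0;
    int i;

    for (i = 1; i <= 3; i++) {
        double x = i * 0.1;     /* still approximate, but the loop
                                   count no longer depends on it */
        printf("x is %f\n", x);
        passes++;
    }
    return passes;
}
```

The values of x in the second version are still approximations, but the number of passes is now controlled by exact integer arithmetic.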

Some Data and Remarks about Speed of Arithmetic
See the lecture document called Arithmetic Performance Examples.
Let's make some notes about the copy_test and op_test functions.
Computer used: 2009 MacBook Pro with 2013 version of C compiler from Apple.

Speed of Arithmetic: Observations (1)
copy_test runs at the same speed for int, double and float. This isn't surprising: x86 and x86-64 D-caches are designed to allow reading or writing 64-bit data in a single access.
Addition, multiplication and shifts are almost free for this particular arrangement of C code and C compiler: op_test never takes much more time than copy_test, except when OP_CHOICE asks for division.

Speed of Arithmetic: Observations (2)
Multiplication is approximately as fast as addition for all three types.
Using a shift to multiply ints by 512 = 2^9 was not significantly faster than using multiplication; performance of integer multipliers in modern hardware is very good.
Division, for all three types, is terribly slow compared to addition and multiplication.

Speed of Arithmetic: Observations (3)
f-p addition and multiplication are sometimes faster and never very much slower than corresponding integer operations.
Except for division, double-precision math seems to be just as fast as single-precision math.

Speed of Arithmetic: Programming Ideas (1)
With current processors, do not prefer integer arithmetic over f-p just to get a speed increase. (Many years ago, f-p arithmetic was usually much slower than integer arithmetic.)
Try to avoid division in loops where your programs spend a lot of time.
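
One common way to take division out of a hot loop, when every element is divided by the same value, is to do a single division before the loop and multiply inside it. The sketch below is only an illustration with made-up function names; whether it is actually faster depends on the processor and compiler, so measure. Be aware that multiplying by 1.0 / d can differ from dividing by d in the last bit of the result, because 1.0 / d is itself rounded; the two versions agree exactly when d is a power of two.

```c
#include <stddef.h>

/* Straightforward version: one division per array element. */
void scale_by_division(double *a, size_t n, double d)
{
    size_t i;

    for (i = 0; i < n; i++)
        a[i] = a[i] / d;
}

/* Version with the division moved out of the loop: one division in
 * total, then one multiplication per array element. */
void scale_by_reciprocal(double *a, size_t n, double d)
{
    const double r = 1.0 / d;
    size_t i;

    for (i = 0; i < n; i++)
        a[i] = a[i] * r;
}
```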

Speed of Arithmetic: Programming Ideas (2)
Relative performance for different types and different operations is highly processor-dependent.
Technology can change a lot within just a few years!
So don't rely on results obtained using a 2009 x86-64 processor to guide your number-crunching designs on ARM (or some other hardware) in 2018! Try different data types and algorithms and measure performance.