Objectives. look at floating point representation in its basic form expose errors of a different form: rounding error highlight IEEE-754 standard

Size: px
Start display at page:

Download "Objectives. look at floating point representation in its basic form expose errors of a different form: rounding error highlight IEEE-754 standard"

Transcription

1 Floating Point

2 Objectives look at floating point representation in its basic form expose errors of a different form: rounding error highlight IEEE-754 standard 1

3 Why this is important: Errors come in two forms: truncation error and rounding error we always have them case study: Intel our jobs as developers: reduce impact 2

4 Intel 3

5 Intel 4

6 Intel 5

7 Intel Timeline June 1994 Intel engineers discover the division error. Managers decide the error will not impact many people. Keep the issue internal. June 1994 Dr Nicely at Lynchburg College notices computation problems Oct 19, 1994 After months of testing, Nicely confirms that other errors are not the cause. The problem is in the Intel Processor. Oct 24, 1994 Nicely contacts Intel. Intel duplicates error. Oct 30, 1994 After no action from Intel, Nicely sends an 6

8 Intel Timeline FROM: Dr. Thomas R. Nicely Professor of Mathematics Lynchburg College 1501 Lakeside Drive Lynchburg, Virginia Phone: Fax: Internet: TO: Whom it may concern RE: Bug in the Pentium FPU DATE: 30 October 1994 It appears that there is a bug in the floating point unit (numeric coprocessor) of many, and perhaps all, Pentium processors. In short, the Pentium FPU is returning erroneous values for certain division operations. For example, 0001/ is calculated incorrectly (all digits beyond the eighth significant digit are in error). This can be verified in compiled code, an ordinary spreadsheet such as Quattro Pro or Excel, or even the Windows calculator (use the scientific mode), by computing 00( )*(1/ ), which should equal 1 exactly (within some extremely small rounding error; in general, coprocessor results should contain 19 significant decimal digits). However, the Pentiums tested return

9 Intel Timeline Nov 1, 1994 Software company Phar Lap Software receives Nicely s . Sends to colleagues at Microsoft, Borland, Watcom, etc. decide the error will not impact many people. Keep the issue internal. Nov 2, with description goes global. Nov 15, 1994 USC reverse-engineers the chip to expose the problem. Intel still denies a problem. Stock falls. Nov 22, 1994 CNN Moneyline interviews Intel. Says the problem is minor. Nov 23, 1994 The MathWorks develops a fix. Nov 24, 1994 New York Times story. Intel still sending out flawed chips. Will replace chips only if it caused a problem in an important application. Dec 12, 1994 IBM halts shipment of Pentium based PCs Dec 16, 1994 Intel stock falls again. 8

10 Numerical bugs Obvious Software has bugs Not-SO-Obvious Numerical software has two unique bugs: 1. roundoff error 2. truncation error 9

11 Numerical Errors Roundoff Roundoff occurs when digits in a decimal point ( ) are lost (0.3333) due to a limit on the memory available for storing one numerical value. Truncation Truncation error occurs when discrete values are used to approximate a mathematical expression. 10

12 Uncertainty: well- or ill-conditioned? Errors in input data can cause uncertain results input data can be experimental or rounded. leads to a certain variation in the results well-conditioned: numerical results are insensitive to small variations in the input ill-conditioned: small variations lead to drastically different numerical calculations (a.k.a. poorly conditioned) 11

13 Our Job As numerical analysts, we need to 1. solve a problem so that the calculation is not susceptible to large roundoff error 2. solve a problem so that the approximation has a tolerable truncation error How? incorporate roundoff-truncation knowledge into the mathematical model the method the algorithm the software design awareness correct interpretation of results 12

14 Wanted: Real Numbers... in a computer Computers can represent integers, using bits: 23 = = (10111) 2 How would we represent fractions, e.g ? 13

15 Wanted: Real Numbers... in a computer Computers can represent integers, using bits: 23 = = (10111) 2 How would we represent fractions, e.g ? Idea: Keep going down past zero exponent: = So: Could store a fixed number of bits with exponents 0 a fixed number of bits with exponents < 0 This is called fixed-point arithmetic. 13

16 Fixed-Point Numbers Suppose we use units of 64 bits, with 32 bits for exponents 0 and 32 bits for exponents < 0. What numbers can we represent? 14

17 Fixed-Point Numbers Suppose we use units of 64 bits, with 32 bits for exponents 0 and 32 bits for exponents < 0. What numbers can we represent?

18 Fixed-Point Numbers Suppose we use units of 64 bits, with 32 bits for exponents 0 and 32 bits for exponents < 0. What numbers can we represent? Smallest: Largest:

19 Fixed-Point Numbers How many digits of relative accuracy (think relative rounding error) are available for the smallest vs. the largest number with 64 bits of fixed point?

20 Fixed-Point Numbers How many digits of relative accuracy (think relative rounding error) are available for the smallest vs. the largest number with 64 bits of fixed point? For large numbers: about 19 For small numbers: few or none Idea: Don t fix the location of the 0 exponent. Let it float. 15

21 Floating Points Normalized Floating-Point Representation Real numbers are stored as x = ±(0.d 1 d 2 d 3... d m ) β β e d 1 d 2 d 3... d m is the significand, e is the exponent e is negative, positive or zero the general normalized form requires d

22 Floating Point Example In base can be written as ( ) can be written as ( )

23 Floating Point Suppose we have only 3 bits for a (non-normalized) significand and a 1 bit exponent stored like.d 1 d 2 d 3 e 1 All possible combinations give: = ,0, = 7 So we get 0, 1 16, 2 16,..., 7 16, 0, 1 4, 2 4,..., 7 4, and 0, 1 8, 2 8,..., 7 8. On the real line: 18

24 Overflow, Underflow computations too close to zero may result in underflow computations too large may result in overflow overflow error is considered more severe underflow can just fall back to 0 19

25 Normalizing If we use the normalized form in our 4-bit case, we lose ,0,1 along with other. So we cannot represent 1 16, 1 8, and

26 Next: Floating Point Numbers We re familiar with base 10 representation of numbers: and 1234 = = we write as an integer part and a fractional part: a 3 a 2 a 1 a 0.b 1 b 2 b 3 b 4 For some (even simple) numbers, there may be an infinite number of digits to the right: π = /9 = =

27 Other bases So far, we have just base 10. What about base β? binary (β = 2), octal (β = 8), hexadecimal (β = 16), etc In the β-system we have n (a n... a 2 a 1 a 0.b 1 b 2 b 3 b 4... ) β = a k β k + b k β k k=0 k=0 22

28 Integer Conversion An algorithm to compute the base 2 representation of a base 10 integer (N) 10 = (a j a j 1... a 2 a 0 ) 2 Compute (N) 10 /2 = Q + R/2: = a j 2 j + + a a N 2 = a j 2 j a a 0 }{{}}{{} 2 =Q =R/2 Example Example: compute (11) 10 base 2 11/2 = 5R1 a 0 = 1 5/2 = 2R1 a 1 = 1 2/2 = 1R0 a 2 = 0 1/2 = 0R1 a 3 = 1 So (11) 10 = (1011) 2 23

29 The other way... Convert a base-2 number to base-10: ( ) 2 = = 1 + 2(0 + 2(1 + 2(0 + 2(0 + 2(0 + 2(1 + 2(1))))))) =

30 Converting fractions straight forward way is not easy goal: for x [0, 1] write x = 0.b 1 b 2 b 3 b 4 = β(x) = (c 1.c 2 c 3 c 4... ) β c k β k = (0.c 1 c 2 c 3... ) β k=1 multiplication by β in base-β only shifts the radix 25

31 Fraction Algorithm An algorithm to compute the binary representation of a fraction x: x = 0.b 1 b 2 b 3 b 4... = b Multiply x by 2. The integer part of 2x is b 1 2x = b b b Example Example:Compute the binary representation of = 1.25 b 1 = = 0.5 b 2 = = 1.0 b 3 = 1 So (0.625) 10 = (0.101) 2 26

32 A problem with precision r 0 = x f o r k = 1, 2,..., m i f r k 1 2 k b k = 1 r k = r k 1 2 k else b k = 0 end end k 2 k b k r k = r k 1 b k 2 k

33 Floating Point numbers Convert 13.5 = (1101.1) 2 into floating point representation. 28

34 Floating Point numbers Convert 13.5 = (1101.1) 2 into floating point representation = = (1.1011)

35 Floating Point numbers Convert 13.5 = (1101.1) 2 into floating point representation = = (1.1011) What pieces do you need to store an FP number? 28

36 Floating Point numbers Convert 13.5 = (1101.1) 2 into floating point representation = = (1.1011) What pieces do you need to store an FP number? Significand: (1.1011) 2 Exponent: 3 Idea: Notice that the leading digit (in binary) of the significand is always one. Only store

37 Floating Point numbers Final storage format: 13.5 = = (1.1011) Significand: 1011 a fixed number of bits Exponent: 3 a (signed!) integer allowing a certain range Exponent is most often stored as a positive offset from a certain negative number. E.g. 3 = 1023 }{{} }{{} implicit offset stored Actually stored: 1026, a positive integer. 29

38 Unrepresentable numbers? Can you think of a somewhat central number that we cannot represent as x = (1. ) 2 2 p? 30

39 Unrepresentable numbers? Can you think of a somewhat central number that we cannot represent as x = (1. ) 2 2 p? Zero. Which is somewhat embarrassing. Core problem: The implicit 1. It s a great idea, were it not for this issue. 30

40 Unrepresentable numbers? Have to break the pattern. Idea: Declare one exponent special, and turn off the leading one for that one. (say, -1023, a.k.a. stored exponent 0) For all larger exponents, the leading one remains in effect. Bonus Q: With this convention, what is the binary representation of a zero? 31

41 Web link: IEEE 754 Web link: What Every Computer Scientist Should Know About Floating-Point Arithmetic 32

42 IEEE-754: Why?!?! IEEE-754 is a widely used standard accepted by hardware/software makers defines the floating point distribution for our computation offer several rounding modes which effect accuracy Floating point arithmetic emerges in nearly every piece of code even modest mathematical operation yield loss of significant bits several pitfalls in common mathematical expressions 33

43 IEEE Floating Point (v. 754) How much storage do we actually use in practice? 32-bit word lengths are typical IEEE Single-precision floating-point numbers use 32 bits IEEE Double-precision floating-point numbers use 64 bits Goal: use the 32-bits to best represent the normalized floating point number 34

44 IEEE Single Precision (Marc-32) x = ±q 2 m 1-bit sign 8-bit exponent m 23-bit fraction q The leading one in the normalized signficand q does not need to be represented: b 1 = 1 is hidden bit IEEE 754: put x in 1.f normalized form 0 < m = c < 255 Largest exponent = 127, Smallest exponent =

45 IEEE Single Precision x = ±q 2 m Process for x = : 1. Convert both integer and fractional to binary: x = ( ) 2 2. Convert to 1.f form: x = }{{} ( } {{ } ) Convert exponent 5 = c 127 c = 132 c = (10 } 000 {{ 100 } ) 2 8 }{{} 1 10 } 000 {{ 100 } 101 } {{ }

46 IEEE Single Precision Special Cases: denormalized/subnormal numbers: use 1 extra bit in the significant: exponent is now 126 (less precision, more range), indicated by in the exponent field two zeros: +0 and 0 (0 fraction, 0 exponent) two s: + and (0 fraction, exponenet) NaN (any fraction, exponent) 37

47 IEEE Double Precision 1-bit sign 11-bit exponent 52-bit fraction single-precision: about 6 decimal digits of precision double-precision: about 15 decimal digits of precision m = c

48 Precision vs. Range type range approx range x single x x double x small numbers example 39

49 Subnormal Numbers What is the smallest representable number in an FP system with 4 stored bits in the significand and an exponent range of [ 7, 7]? 40

50 Subnormal Numbers What is the smallest representable number in an FP system with 4 stored bits in the significand and an exponent range of [ 7, 7]? First attempt: Significand as small as possible all zeros after the implicit leading one Exponent as small as possible: 7 (1.0000) Unfortunately: wrong. We can go way smaller by using the special exponent (which turns off the implicit leading one). 40

51 Subnormal Numbers We ll assume that the special exponent is 8. So: (0.0001) Numbers with the special exponent are called subnormal (or denormal) FP numbers. Technically, zero is also a subnormal. Note: It is thus quite natural to park the special exponent at the low end of the exponent range. 41

52 Subnormal Numbers Why would you want to know about subnormals? Because computing with them is often slow, because it is implemented using FP assist, i.e. not in actual hardware. Many C compilers support options to flush subnormals to zero. 42

53 Subnormal Numbers Why would you want to know about subnormals? Because computing with them is often slow, because it is implemented using FP assist, i.e. not in actual hardware. Many C compilers support options to flush subnormals to zero. FP systems without subnormals will underflow (return 0) as soon as the exponent range is exhausted. This smallest representable normal number is called the underflow level, or UFL. 42

54 Subnormal Numbers Beyond the underflow level, subnormals provide for gradual underflow by keeping going as long as there are bits in the significand, but it is important to note that subnormals don t have as many accurate digits as normal numbers. Analogously (but much more simply no supernormals ): the overflow level, OFL. 43

55 Summary: Translating Floating Point Numbers To summarize: To translate a stored (double precision) floating point value consisting of the stored fraction (a 52-bit integer) and the stored exponent value e stored the into its (mathematical) value, follow the following method: (1 + fraction 2 52 ) e stored e stored 0, value = (fraction 2 52 ) e stored = 0. 44

56 A problem with precision For other numbers, such as 1 5 needed. = 0.2, an infinite length is So 0.2 is stored just fine in base-10, but needs infinite number of digits in base-2!!! This is roundoff error in its basic form... 45

57 Floating Point and Rounding Error What is the relative error produced by working with floating point numbers? What is smallest floating point number > 1? Assume 4 stored bits in the significand. 46

58 Floating Point and Rounding Error What is the relative error produced by working with floating point numbers? What is smallest floating point number > 1? Assume 4 stored bits in the significand. (1.0001) = x ( ) 2 46

59 Floating Point and Rounding Error What is the relative error produced by working with floating point numbers? What is smallest floating point number > 1? Assume 4 stored bits in the significand. (1.0001) = x ( ) 2 What s the smallest FP number > 1024 in that same system? 46

60 Floating Point and Rounding Error What is the relative error produced by working with floating point numbers? What is smallest floating point number > 1? Assume 4 stored bits in the significand. (1.0001) = x ( ) 2 What s the smallest FP number > 1024 in that same system? (1.0001) = x ( ) 2 Can we give that number a name? 46

61 Floating Point and Rounding Error Unit roundoff or machine precision or machine epsilon or ε mach is a bound on the relative error in a floating point represention for a given rounding procedure. With chopping as the rounding procedure, it is the smallest number such that float(1 + ε) 1. In the above system, ε mach = (0.0001) 2. Another related quantity is ULP, or unit in the last place. It is important to note, that if chopping is used, then ε mach is the distance between 1.0 and the next largest floating point number. 47

62 ɛ m With chopping, take x = 1.0 and add 1/2, 1/4,..., 2 i : Hidden bit 52 bits e e e e e e e e e e Ooops! use fl(x) to represent the floating point machine number for the real number x fl( ) 1, but fl( ) = 1 48

63 ɛ m : Machine Epsilon Machine epsilon ɛ m is the smallest number such that fl(1 + ɛ m ) 1 With rounding-to-nearest, The double precision machine epsilon is about The single precision machine epsilon is about

64 50

65 Floating Point Errors Not all reals can be exactly represented as a machine floating point number. Then what? Round-off error IEEE options: Round to next nearest FP (preferred), Round to 0, Round up, and Round down Let x + and x be the two floating point machine numbers closest to x round to nearest: round(x) = x or x +, whichever is closest round toward 0: round(x) = x or x +, whichever is between 0 and x round toward (down): round(x) = x round toward + (up): round(x) = x + 51

66 Floating Point and Rounding Error What does this say about the relative error incurred in floating point calculations? 52

67 Floating Point and Rounding Error What does this say about the relative error incurred in floating point calculations? The factor to get from one FP number to the next larger one is (mostly) independent of magnitude: 1 + ε mach. Since we can t represent any results between x and x (1 + ε mach ), that s really the minimum error incurred. In terms of relative error: x x x = x(1 + ε mach ) x x = ε mach. At least theoretically, ε mach is the maximum relative error in any FP operations. (Practical implementations do fall short of this.) 52

68 Floating Point and Rounding Error What s that same number for double-precision floating point? (52 bits in the significand) 53

69 Floating Point and Rounding Error What s that same number for double-precision floating point? (52 bits in the significand) We can expect FP math to consistently introduce relative errors of about Working in double precision gives you about 16 (decimal) accurate digits. 53

70 Implementing Arithmetic How is floating point addition implemented? Consider adding a = (1.101) and b = (1.001) in a system with three bits in the significand. 54

71 Implementing Arithmetic How is floating point addition implemented? Consider adding a = (1.101) and b = (1.001) in a system with three bits in the significand. Rough algorithm: 1. Bring both numbers onto a common exponent 2. Do grade-school addition from the front, until you run out of digits in your system. 3. Round result. a = b = a + b

72 Problems with FP Addition What happens if you subtract two numbers of very similar magnitude? As an example, consider a = (1.1011) and b = (1.1010)

73 Problems with FP Addition What happens if you subtract two numbers of very similar magnitude? As an example, consider a = (1.1011) and b = (1.1010) a = b = a b ???? 2 1 or, once we normalize, 1.???? 2 3. There is no data to indicate what the missing digits should be. 55

74 Problems with FP Addition Machine fills them with its best guess, which is not often good. This phenomenon is called Catastrophic Cancellation. 56

75 Floating Point Arithmetic Problem: The set of representable machine numbers is FINITE. So not all math operations are well defined! Basic algebra breaks down in floating point arithmetic Example a + (b + c) (a + b) + c 57

76 Floating Point Arithmetic Rule 1. fl(x) = x(1 + ɛ), where ɛ ɛ m Rule 2. For all operations (one of +,,, /) fl(x y) = (x y)(1 + ɛ ), where ɛ ɛ m Rule 3. For +, operations fl(a b) = fl(b a) There were many discussions on what conditions/rules should 58

77 Floating Point Arithmetic Consider the sum of 3 numbers: y = a + b + c. Done as fl(fl(a + b) + c) η = fl(a + b) = (a + b)(1 + ɛ 1 ) y 1 = fl(η + c) = (η + c)(1 + ɛ 2 ) = [(a + b)(1 + ɛ 1 ) + c] (1 + ɛ 2 ) = [(a + b + c) + (a + b)ɛ 1 )] (1 + ɛ 2 ) [ = (a + b + c) 1 + a + b ] a + b + c ɛ 1(1 + ɛ 2 ) + ɛ 2 So disregarding the high order term ɛ 1 ɛ 2 fl(fl(a+b)+c) = (a+b+c)(1+ɛ 3 ) with ɛ 3 a + b a + b + c ɛ 1 +ɛ 2 59

78 Floating Point Arithmetic If we redid the computation as y 2 = fl(a + fl(b + c)) we would find fl(a+fl(b+c)) = (a+b+c)(1+ɛ 4 ) with ɛ 4 b + c a + b + c ɛ 1 +ɛ 2 Main conclusion: The first error is amplified by the factor (a + b)/y in the first case and (b + c)/y in the second case. In order to sum n numbers more accurately, it is better to start with the small numbers first. [However, sorting before adding is usually not worth the cost!] 60

79 Floating Point Arithmetic One of the most serious problems in floating point arithmetic is that of cancellation. If two large and close-by numbers are subtracted the result (a small number) carries very few accurate digits (why?). This is fine if the result is not reused. If the result is part of another calculation, then there may be a serious problem Example Roots of the equation x 2 + 2px q = 0 Assume we want the root with smallest absolute value: 2 q 61

80 Catastrophic Cancellation Adding c = a + b will result in a large error if Let a b a b a = x.xxx 10 0 b = y.yyy 10 8 Then finite precision {}}{ x.xxx xxxx xxxx xxxx yyyy yyyy yyyy yyyy = x.xxx xxxx zzzz zzzz???? }{{????} lost precision 62

81 Catastrophic Cancellation Subtracting c = a b will result in large error if a b. For example a = x.xxxx xxxx xxx1 b = x.xxxx xxxx xxx0 lost {}}{ ssss... lost {}}{ tttt... Then finite precision {}}{ x.xxx xxxx xxx1 + x.xxx xxxx xxx0 = ???? }{{????} lost precision 63

82 Summary addition: c = a + b if a b or a b subtraction: c = a b if a b catastrophic: caused by a single operation, not by an accumulation of errors can often be fixed by mathematical rearrangement 64

83 Loss of Significance Example x = and y = What is the relative error in x y in a computer with 5 decimal digits of accuracy? x y ( x ȳ) x y =

84 Loss of Significance Loss of Precision Theorem Let x and y be (normalized) floating point machine numbers with x > y > 0. If 2 p 1 y x 2 q for positive integers p and q, the significant binary digits lost in calculating x y is between q and p. 66

85 Loss of Significance Example Consider x = and y = < 1 y x = < 2 12 So we lose 11 or 12 bits in the computation of x y. yikes! Example Back to the other example (5 digits): x = and y = < 1 y x = < 10 5 So we lose 4 or 5 bits in the computation of x y. Here, x y = which has only 1 significant digit that we can be sure about 67

86 Loss of Significance So what to do? Mainly rearrangement. f (x) = x

87 Loss of Significance So what to do? Mainly rearrangement. f (x) = x Problem at x 0. 68

88 Loss of Significance So what to do? Mainly rearrangement. f (x) = x Problem at x 0. One type of fix: ( ) ( ) x f (x) = x x no subtraction! = x 2 x

Roundoff Errors and Computer Arithmetic

Roundoff Errors and Computer Arithmetic Jim Lambers Math 105A Summer Session I 2003-04 Lecture 2 Notes These notes correspond to Section 1.2 in the text. Roundoff Errors and Computer Arithmetic In computing the solution to any mathematical problem,

More information

Floating-Point Numbers in Digital Computers

Floating-Point Numbers in Digital Computers POLYTECHNIC UNIVERSITY Department of Computer and Information Science Floating-Point Numbers in Digital Computers K. Ming Leung Abstract: We explain how floating-point numbers are represented and stored

More information

Floating-Point Numbers in Digital Computers

Floating-Point Numbers in Digital Computers POLYTECHNIC UNIVERSITY Department of Computer and Information Science Floating-Point Numbers in Digital Computers K. Ming Leung Abstract: We explain how floating-point numbers are represented and stored

More information

Scientific Computing: An Introductory Survey

Scientific Computing: An Introductory Survey Scientific Computing: An Introductory Survey Chapter 1 Scientific Computing Prof. Michael T. Heath Department of Computer Science University of Illinois at Urbana-Champaign Copyright c 2002. Reproduction

More information

Floating Point (with contributions from Dr. Bin Ren, William & Mary Computer Science)

Floating Point (with contributions from Dr. Bin Ren, William & Mary Computer Science) Floating Point (with contributions from Dr. Bin Ren, William & Mary Computer Science) Floating Point Background: Fractional binary numbers IEEE floating point standard: Definition Example and properties

More information

Computational Methods. Sources of Errors

Computational Methods. Sources of Errors Computational Methods Sources of Errors Manfred Huber 2011 1 Numerical Analysis / Scientific Computing Many problems in Science and Engineering can not be solved analytically on a computer Numeric solutions

More information

Scientific Computing. Error Analysis

Scientific Computing. Error Analysis ECE257 Numerical Methods and Scientific Computing Error Analysis Today s s class: Introduction to error analysis Approximations Round-Off Errors Introduction Error is the difference between the exact solution

More information

2.1.1 Fixed-Point (or Integer) Arithmetic

2.1.1 Fixed-Point (or Integer) Arithmetic x = approximation to true value x error = x x, relative error = x x. x 2.1.1 Fixed-Point (or Integer) Arithmetic A base 2 (base 10) fixed-point number has a fixed number of binary (decimal) places. 1.

More information

Bindel, Fall 2016 Matrix Computations (CS 6210) Notes for

Bindel, Fall 2016 Matrix Computations (CS 6210) Notes for 1 Logistics Notes for 2016-09-07 1. We are still at 50. If you are still waiting and are not interested in knowing if a slot frees up, let me know. 2. There is a correction to HW 1, problem 4; the condition

More information

Floating Point Representation. CS Summer 2008 Jonathan Kaldor

Floating Point Representation. CS Summer 2008 Jonathan Kaldor Floating Point Representation CS3220 - Summer 2008 Jonathan Kaldor Floating Point Numbers Infinite supply of real numbers Requires infinite space to represent certain numbers We need to be able to represent

More information

CS321. Introduction to Numerical Methods

CS321. Introduction to Numerical Methods CS31 Introduction to Numerical Methods Lecture 1 Number Representations and Errors Professor Jun Zhang Department of Computer Science University of Kentucky Lexington, KY 40506 0633 August 5, 017 Number

More information

Data Representation Floating Point

Data Representation Floating Point Data Representation Floating Point CSCI 2400 / ECE 3217: Computer Architecture Instructor: David Ferry Slides adapted from Bryant & O Hallaron s slides via Jason Fritts Today: Floating Point Background:

More information

Floating Point Arithmetic

Floating Point Arithmetic Floating Point Arithmetic CS 365 Floating-Point What can be represented in N bits? Unsigned 0 to 2 N 2s Complement -2 N-1 to 2 N-1-1 But, what about? very large numbers? 9,349,398,989,787,762,244,859,087,678

More information

Data Representation Floating Point

Data Representation Floating Point Data Representation Floating Point CSCI 2400 / ECE 3217: Computer Architecture Instructor: David Ferry Slides adapted from Bryant & O Hallaron s slides via Jason Fritts Today: Floating Point Background:

More information

Review of Calculus, cont d

Review of Calculus, cont d Jim Lambers MAT 460/560 Fall Semester 2009-10 Lecture 4 Notes These notes correspond to Sections 1.1 1.2 in the text. Review of Calculus, cont d Taylor s Theorem, cont d We conclude our discussion of Taylor

More information

Floating Point January 24, 2008

Floating Point January 24, 2008 15-213 The course that gives CMU its Zip! Floating Point January 24, 2008 Topics IEEE Floating Point Standard Rounding Floating Point Operations Mathematical properties class04.ppt 15-213, S 08 Floating

More information

Table : IEEE Single Format ± a a 2 a 3 :::a 8 b b 2 b 3 :::b 23 If exponent bitstring a :::a 8 is Then numerical value represented is ( ) 2 = (

Table : IEEE Single Format ± a a 2 a 3 :::a 8 b b 2 b 3 :::b 23 If exponent bitstring a :::a 8 is Then numerical value represented is ( ) 2 = ( Floating Point Numbers in Java by Michael L. Overton Virtually all modern computers follow the IEEE 2 floating point standard in their representation of floating point numbers. The Java programming language

More information

Review Questions 26 CHAPTER 1. SCIENTIFIC COMPUTING

Review Questions 26 CHAPTER 1. SCIENTIFIC COMPUTING 26 CHAPTER 1. SCIENTIFIC COMPUTING amples. The IEEE floating-point standard can be found in [131]. A useful tutorial on floating-point arithmetic and the IEEE standard is [97]. Although it is no substitute

More information

Floating Point Representation in Computers

Floating Point Representation in Computers Floating Point Representation in Computers Floating Point Numbers - What are they? Floating Point Representation Floating Point Operations Where Things can go wrong What are Floating Point Numbers? Any

More information

CS321 Introduction To Numerical Methods

CS321 Introduction To Numerical Methods CS3 Introduction To Numerical Methods Fuhua (Frank) Cheng Department of Computer Science University of Kentucky Lexington KY 456-46 - - Table of Contents Errors and Number Representations 3 Error Types

More information

MAT128A: Numerical Analysis Lecture Two: Finite Precision Arithmetic

MAT128A: Numerical Analysis Lecture Two: Finite Precision Arithmetic MAT128A: Numerical Analysis Lecture Two: Finite Precision Arithmetic September 28, 2018 Lecture 1 September 28, 2018 1 / 25 Floating point arithmetic Computers use finite strings of binary digits to represent

More information

What Every Programmer Should Know About Floating-Point Arithmetic

What Every Programmer Should Know About Floating-Point Arithmetic What Every Programmer Should Know About Floating-Point Arithmetic Last updated: October 15, 2015 Contents 1 Why don t my numbers add up? 3 2 Basic Answers 3 2.1 Why don t my numbers, like 0.1 + 0.2 add

More information

Foundations of Computer Systems

Foundations of Computer Systems 18-600 Foundations of Computer Systems Lecture 4: Floating Point Required Reading Assignment: Chapter 2 of CS:APP (3 rd edition) by Randy Bryant & Dave O Hallaron Assignments for This Week: Lab 1 18-600

More information

Floating Point Puzzles. Lecture 3B Floating Point. IEEE Floating Point. Fractional Binary Numbers. Topics. IEEE Standard 754

Floating Point Puzzles. Lecture 3B Floating Point. IEEE Floating Point. Fractional Binary Numbers. Topics. IEEE Standard 754 Floating Point Puzzles Topics Lecture 3B Floating Point IEEE Floating Point Standard Rounding Floating Point Operations Mathematical properties For each of the following C expressions, either: Argue that

More information

Binary floating point encodings

Binary floating point encodings Week 1: Wednesday, Jan 25 Binary floating point encodings Binary floating point arithmetic is essentially scientific notation. Where in decimal scientific notation we write in floating point, we write

More information

Floating Point Numbers

Floating Point Numbers Floating Point Numbers Summer 8 Fractional numbers Fractional numbers fixed point Floating point numbers the IEEE 7 floating point standard Floating point operations Rounding modes CMPE Summer 8 Slides

More information

Systems I. Floating Point. Topics IEEE Floating Point Standard Rounding Floating Point Operations Mathematical properties

Systems I. Floating Point. Topics IEEE Floating Point Standard Rounding Floating Point Operations Mathematical properties Systems I Floating Point Topics IEEE Floating Point Standard Rounding Floating Point Operations Mathematical properties IEEE Floating Point IEEE Standard 754 Established in 1985 as uniform standard for

More information

3.5 Floating Point: Overview

3.5 Floating Point: Overview 3.5 Floating Point: Overview Floating point (FP) numbers Scientific notation Decimal scientific notation Binary scientific notation IEEE 754 FP Standard Floating point representation inside a computer

More information

CS 6210 Fall 2016 Bei Wang. Lecture 4 Floating Point Systems Continued

CS 6210 Fall 2016 Bei Wang. Lecture 4 Floating Point Systems Continued CS 6210 Fall 2016 Bei Wang Lecture 4 Floating Point Systems Continued Take home message 1. Floating point rounding 2. Rounding unit 3. 64 bit word: double precision (IEEE standard word) 4. Exact rounding

More information

fractional quantities are typically represented in computers using floating point format this approach is very much similar to scientific notation

fractional quantities are typically represented in computers using floating point format this approach is very much similar to scientific notation Floating Point Arithmetic fractional quantities are typically represented in computers using floating point format this approach is very much similar to scientific notation for example, fixed point number

More information

Floating-Point Arithmetic

Floating-Point Arithmetic Floating-Point Arithmetic Raymond J. Spiteri Lecture Notes for CMPT 898: Numerical Software University of Saskatchewan January 9, 2013 Objectives Floating-point numbers Floating-point arithmetic Analysis

More information

Scientific Computing: An Introductory Survey

Scientific Computing: An Introductory Survey Scientific Computing: An Introductory Survey Chapter 1 Scientific Computing Prof. Michael T. Heath Department of Computer Science University of Illinois at Urbana-Champaign Copyright c 2002. Reproduction

More information

What we need to know about error: Class Outline. Computational Methods CMSC/AMSC/MAPL 460. Errors in data and computation

What we need to know about error: Class Outline. Computational Methods CMSC/AMSC/MAPL 460. Errors in data and computation Class Outline Computational Methods CMSC/AMSC/MAPL 460 Errors in data and computation Representing numbers in floating point Ramani Duraiswami, Dept. of Computer Science Computations should be as accurate

More information

Floating Point Puzzles. Lecture 3B Floating Point. IEEE Floating Point. Fractional Binary Numbers. Topics. IEEE Standard 754

Floating Point Puzzles. Lecture 3B Floating Point. IEEE Floating Point. Fractional Binary Numbers. Topics. IEEE Standard 754 Floating Point Puzzles Topics Lecture 3B Floating Point IEEE Floating Point Standard Rounding Floating Point Operations Mathematical properties For each of the following C expressions, either: Argue that

More information

Lecture Notes to Accompany. Scientific Computing An Introductory Survey. What is scientific computing?

Lecture Notes to Accompany. Scientific Computing An Introductory Survey. What is scientific computing? Lecture Notes to Accompany Scientific Computing An Introductory Survey Second Edition by Michael T. Heath Scientific Computing What is scientific computing? Design and analysis of algorithms for solving

More information

Computational Methods CMSC/AMSC/MAPL 460. Representing numbers in floating point and associated issues. Ramani Duraiswami, Dept. of Computer Science

Computational Methods CMSC/AMSC/MAPL 460. Representing numbers in floating point and associated issues. Ramani Duraiswami, Dept. of Computer Science Computational Methods CMSC/AMSC/MAPL 460 Representing numbers in floating point and associated issues Ramani Duraiswami, Dept. of Computer Science Class Outline Computations should be as accurate and as

More information

Chapter 2 Float Point Arithmetic. Real Numbers in Decimal Notation. Real Numbers in Decimal Notation

Chapter 2 Float Point Arithmetic. Real Numbers in Decimal Notation. Real Numbers in Decimal Notation Chapter 2 Float Point Arithmetic Topics IEEE Floating Point Standard Fractional Binary Numbers Rounding Floating Point Operations Mathematical properties Real Numbers in Decimal Notation Representation

More information

CS 261 Fall Floating-Point Numbers. Mike Lam, Professor.

CS 261 Fall Floating-Point Numbers. Mike Lam, Professor. CS 261 Fall 2018 Mike Lam, Professor https://xkcd.com/217/ Floating-Point Numbers Floating-point Topics Binary fractions Floating-point representation Conversions and rounding error Binary fractions Now

More information

Representing and Manipulating Floating Points. Jo, Heeseung

Representing and Manipulating Floating Points. Jo, Heeseung Representing and Manipulating Floating Points Jo, Heeseung The Problem How to represent fractional values with finite number of bits? 0.1 0.612 3.14159265358979323846264338327950288... 2 Fractional Binary

More information

Floating point. Today! IEEE Floating Point Standard! Rounding! Floating Point Operations! Mathematical properties. Next time. !

Floating point. Today! IEEE Floating Point Standard! Rounding! Floating Point Operations! Mathematical properties. Next time. ! Floating point Today! IEEE Floating Point Standard! Rounding! Floating Point Operations! Mathematical properties Next time! The machine model Chris Riesbeck, Fall 2011 Checkpoint IEEE Floating point Floating

More information

Classes of Real Numbers 1/2. The Real Line

Classes of Real Numbers 1/2. The Real Line Classes of Real Numbers All real numbers can be represented by a line: 1/2 π 1 0 1 2 3 4 real numbers The Real Line { integers rational numbers non-integral fractions irrational numbers Rational numbers

More information

EE878 Special Topics in VLSI. Computer Arithmetic for Digital Signal Processing

EE878 Special Topics in VLSI. Computer Arithmetic for Digital Signal Processing EE878 Special Topics in VLSI Computer Arithmetic for Digital Signal Processing Part 4-B Floating-Point Arithmetic - II Spring 2017 Koren Part.4b.1 The IEEE Floating-Point Standard Four formats for floating-point

More information

Giving credit where credit is due

Giving credit where credit is due CSCE 230J Computer Organization Floating Point Dr. Steve Goddard goddard@cse.unl.edu http://cse.unl.edu/~goddard/courses/csce230j Giving credit where credit is due Most of slides for this lecture are based

More information

CS 261 Fall Floating-Point Numbers. Mike Lam, Professor. https://xkcd.com/217/

CS 261 Fall Floating-Point Numbers. Mike Lam, Professor. https://xkcd.com/217/ CS 261 Fall 2017 Mike Lam, Professor https://xkcd.com/217/ Floating-Point Numbers Floating-point Topics Binary fractions Floating-point representation Conversions and rounding error Binary fractions Now

More information

Giving credit where credit is due

Giving credit where credit is due JDEP 284H Foundations of Computer Systems Floating Point Dr. Steve Goddard goddard@cse.unl.edu Giving credit where credit is due Most of slides for this lecture are based on slides created by Drs. Bryant

More information

Representing and Manipulating Floating Points

Representing and Manipulating Floating Points Representing and Manipulating Floating Points Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu The Problem How to represent fractional values with

More information

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Digital Computer Arithmetic ECE 666

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Digital Computer Arithmetic ECE 666 UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering Digital Computer Arithmetic ECE 666 Part 4-B Floating-Point Arithmetic - II Israel Koren ECE666/Koren Part.4b.1 The IEEE Floating-Point

More information

Floating Point. CSE 351 Autumn Instructor: Justin Hsia

Floating Point. CSE 351 Autumn Instructor: Justin Hsia Floating Point CSE 351 Autumn 2016 Instructor: Justin Hsia Teaching Assistants: Chris Ma Hunter Zahn John Kaltenbach Kevin Bi Sachin Mehta Suraj Bhat Thomas Neuman Waylon Huang Xi Liu Yufang Sun http://xkcd.com/899/

More information

Floating-Point Arithmetic

Floating-Point Arithmetic Floating-Point Arithmetic ECS30 Winter 207 January 27, 207 Floating point numbers Floating-point representation of numbers (scientific notation) has four components, for example, 3.46 0 sign significand

More information

2 Computation with Floating-Point Numbers

2 Computation with Floating-Point Numbers 2 Computation with Floating-Point Numbers 2.1 Floating-Point Representation The notion of real numbers in mathematics is convenient for hand computations and formula manipulations. However, real numbers

More information

Floating Point Puzzles The course that gives CMU its Zip! Floating Point Jan 22, IEEE Floating Point. Fractional Binary Numbers.

Floating Point Puzzles The course that gives CMU its Zip! Floating Point Jan 22, IEEE Floating Point. Fractional Binary Numbers. class04.ppt 15-213 The course that gives CMU its Zip! Topics Floating Point Jan 22, 2004 IEEE Floating Point Standard Rounding Floating Point Operations Mathematical properties Floating Point Puzzles For

More information

System Programming CISC 360. Floating Point September 16, 2008

System Programming CISC 360. Floating Point September 16, 2008 System Programming CISC 360 Floating Point September 16, 2008 Topics IEEE Floating Point Standard Rounding Floating Point Operations Mathematical properties Powerpoint Lecture Notes for Computer Systems:

More information

Divide: Paper & Pencil

Divide: Paper & Pencil Divide: Paper & Pencil 1001 Quotient Divisor 1000 1001010 Dividend -1000 10 101 1010 1000 10 Remainder See how big a number can be subtracted, creating quotient bit on each step Binary => 1 * divisor or

More information

Floating-Point Data Representation and Manipulation 198:231 Introduction to Computer Organization Lecture 3

Floating-Point Data Representation and Manipulation 198:231 Introduction to Computer Organization Lecture 3 Floating-Point Data Representation and Manipulation 198:231 Introduction to Computer Organization Instructor: Nicole Hynes nicole.hynes@rutgers.edu 1 Fixed Point Numbers Fixed point number: integer part

More information

Representing and Manipulating Floating Points

Representing and Manipulating Floating Points Representing and Manipulating Floating Points Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu The Problem How to represent fractional values with

More information

Floating Point. The World is Not Just Integers. Programming languages support numbers with fraction

Floating Point. The World is Not Just Integers. Programming languages support numbers with fraction 1 Floating Point The World is Not Just Integers Programming languages support numbers with fraction Called floating-point numbers Examples: 3.14159265 (π) 2.71828 (e) 0.000000001 or 1.0 10 9 (seconds in

More information

Chapter Three. Arithmetic

Chapter Three. Arithmetic Chapter Three 1 Arithmetic Where we've been: Performance (seconds, cycles, instructions) Abstractions: Instruction Set Architecture Assembly Language and Machine Language What's up ahead: Implementing

More information

2 Computation with Floating-Point Numbers

2 Computation with Floating-Point Numbers 2 Computation with Floating-Point Numbers 2.1 Floating-Point Representation The notion of real numbers in mathematics is convenient for hand computations and formula manipulations. However, real numbers

More information

15213 Recitation 2: Floating Point

15213 Recitation 2: Floating Point 15213 Recitation 2: Floating Point 1 Introduction This handout will introduce and test your knowledge of the floating point representation of real numbers, as defined by the IEEE standard. This information

More information

1. NUMBER SYSTEMS USED IN COMPUTING: THE BINARY NUMBER SYSTEM

1. NUMBER SYSTEMS USED IN COMPUTING: THE BINARY NUMBER SYSTEM 1. NUMBER SYSTEMS USED IN COMPUTING: THE BINARY NUMBER SYSTEM 1.1 Introduction Given that digital logic and memory devices are based on two electrical states (on and off), it is natural to use a number

More information

Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective A Programmer's Perspective Representing Numbers Gal A. Kaminka galk@cs.biu.ac.il Fractional Binary Numbers 2 i 2 i 1 4 2 1 b i b i 1 b 2 b 1 b 0. b 1 b 2 b 3 b j 1/2 1/4 1/8 Representation Bits to right

More information

Floating-point numbers. Phys 420/580 Lecture 6

Floating-point numbers. Phys 420/580 Lecture 6 Floating-point numbers Phys 420/580 Lecture 6 Random walk CA Activate a single cell at site i = 0 For all subsequent times steps, let the active site wander to i := i ± 1 with equal probability Random

More information

CS429: Computer Organization and Architecture

CS429: Computer Organization and Architecture CS429: Computer Organization and Architecture Dr. Bill Young Department of Computer Sciences University of Texas at Austin Last updated: September 18, 2017 at 12:48 CS429 Slideset 4: 1 Topics of this Slideset

More information

Numerical Methods 5633

Numerical Methods 5633 Numerical Methods 5633 Lecture 2 Marina Krstic Marinkovic mmarina@maths.tcd.ie School of Mathematics Trinity College Dublin Marina Krstic Marinkovic 1 / 15 5633-Numerical Methods Organisational Assignment

More information

Floating Point. CSE 351 Autumn Instructor: Justin Hsia

Floating Point. CSE 351 Autumn Instructor: Justin Hsia Floating Point CSE 351 Autumn 2017 Instructor: Justin Hsia Teaching Assistants: Lucas Wotton Michael Zhang Parker DeWilde Ryan Wong Sam Gehman Sam Wolfson Savanna Yee Vinny Palaniappan Administrivia Lab

More information

Mathematical preliminaries and error analysis

Mathematical preliminaries and error analysis Mathematical preliminaries and error analysis Tsung-Ming Huang Department of Mathematics National Taiwan Normal University, Taiwan August 28, 2011 Outline 1 Round-off errors and computer arithmetic IEEE

More information

Floating-Point Arithmetic

Floating-Point Arithmetic Floating-Point Arithmetic if ((A + A) - A == A) { SelfDestruct() } Reading: Study Chapter 4. L12 Multiplication 1 Why Floating Point? Aren t Integers enough? Many applications require numbers with a VERY

More information

CSCI 402: Computer Architectures. Arithmetic for Computers (3) Fengguang Song Department of Computer & Information Science IUPUI.

CSCI 402: Computer Architectures. Arithmetic for Computers (3) Fengguang Song Department of Computer & Information Science IUPUI. CSCI 402: Computer Architectures Arithmetic for Computers (3) Fengguang Song Department of Computer & Information Science IUPUI 3.5 Today s Contents Floating point numbers: 2.5, 10.1, 100.2, etc.. How

More information

Computer Arithmetic Floating Point

Computer Arithmetic Floating Point Computer Arithmetic Floating Point Chapter 3.6 EEC7 FQ 25 About Floating Point Arithmetic Arithmetic basic operations on floating point numbers are: Add, Subtract, Multiply, Divide Transcendental operations

More information

Representing and Manipulating Floating Points

Representing and Manipulating Floating Points Representing and Manipulating Floating Points Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu SSE23: Introduction to Computer Systems, Spring 218,

More information

1.2 Round-off Errors and Computer Arithmetic

1.2 Round-off Errors and Computer Arithmetic 1.2 Round-off Errors and Computer Arithmetic 1 In a computer model, a memory storage unit word is used to store a number. A word has only a finite number of bits. These facts imply: 1. Only a small set

More information

Floating-point representation

Floating-point representation Lecture 3-4: Floating-point representation and arithmetic Floating-point representation The notion of real numbers in mathematics is convenient for hand computations and formula manipulations. However,

More information

Representing and Manipulating Floating Points. Computer Systems Laboratory Sungkyunkwan University

Representing and Manipulating Floating Points. Computer Systems Laboratory Sungkyunkwan University Representing and Manipulating Floating Points Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu The Problem How to represent fractional values with

More information

Basics of Computation. PHY 604:Computational Methods in Physics and Astrophysics II

Basics of Computation. PHY 604:Computational Methods in Physics and Astrophysics II Basics of Computation Basics of Computation Computers store information and allow us to operate on it. That's basically it. Computers have finite memory, so it is not possible to store the infinite range

More information

Bryant and O Hallaron, Computer Systems: A Programmer s Perspective, Third Edition. Carnegie Mellon

Bryant and O Hallaron, Computer Systems: A Programmer s Perspective, Third Edition. Carnegie Mellon Carnegie Mellon Floating Point 15-213/18-213/14-513/15-513: Introduction to Computer Systems 4 th Lecture, Sept. 6, 2018 Today: Floating Point Background: Fractional binary numbers IEEE floating point

More information

Floating-point Arithmetic. where you sum up the integer to the left of the decimal point and the fraction to the right.

Floating-point Arithmetic. where you sum up the integer to the left of the decimal point and the fraction to the right. Floating-point Arithmetic Reading: pp. 312-328 Floating-Point Representation Non-scientific floating point numbers: A non-integer can be represented as: 2 4 2 3 2 2 2 1 2 0.2-1 2-2 2-3 2-4 where you sum

More information

Review: MULTIPLY HARDWARE Version 1. ECE4680 Computer Organization & Architecture. Divide, Floating Point, Pentium Bug

Review: MULTIPLY HARDWARE Version 1. ECE4680 Computer Organization & Architecture. Divide, Floating Point, Pentium Bug ECE468 ALU-III.1 2002-2-27 Review: MULTIPLY HARDWARE Version 1 ECE4680 Computer Organization & Architecture 64-bit Multiplicand reg, 64-bit ALU, 64-bit Product reg, 32-bit multiplier reg Divide, Floating

More information

Computational Economics and Finance

Computational Economics and Finance Computational Economics and Finance Part I: Elementary Concepts of Numerical Analysis Spring 2015 Outline Computer arithmetic Error analysis: Sources of error Error propagation Controlling the error Rates

More information

Floating-Point Arithmetic

Floating-Point Arithmetic Floating-Point Arithmetic if ((A + A) - A == A) { SelfDestruct() } Reading: Study Chapter 3. L12 Multiplication 1 Approximating Real Numbers on Computers Thus far, we ve entirely ignored one of the most

More information

COMP2611: Computer Organization. Data Representation

COMP2611: Computer Organization. Data Representation COMP2611: Computer Organization Comp2611 Fall 2015 2 1. Binary numbers and 2 s Complement Numbers 3 Bits: are the basis for binary number representation in digital computers What you will learn here: How

More information

Floating point. Today. IEEE Floating Point Standard Rounding Floating Point Operations Mathematical properties Next time.

Floating point. Today. IEEE Floating Point Standard Rounding Floating Point Operations Mathematical properties Next time. Floating point Today IEEE Floating Point Standard Rounding Floating Point Operations Mathematical properties Next time The machine model Fabián E. Bustamante, Spring 2010 IEEE Floating point Floating point

More information

Number Representations

Number Representations Number Representations times XVII LIX CLXX -XVII D(CCL)LL DCCC LLLL X-X X-VII = DCCC CC III = MIII X-VII = VIIIII-VII = III 1/25/02 Memory Organization Viewed as a large, single-dimension array, with an

More information

Divide: Paper & Pencil CS152. Computer Architecture and Engineering Lecture 7. Divide, Floating Point, Pentium Bug. DIVIDE HARDWARE Version 1

Divide: Paper & Pencil CS152. Computer Architecture and Engineering Lecture 7. Divide, Floating Point, Pentium Bug. DIVIDE HARDWARE Version 1 Divide: Paper & Pencil Computer Architecture and Engineering Lecture 7 Divide, Floating Point, Pentium Bug 1001 Quotient 1000 1001010 Dividend 1000 10 101 1010 1000 10 (or Modulo result) See how big a

More information

Section 1.4 Mathematics on the Computer: Floating Point Arithmetic

Section 1.4 Mathematics on the Computer: Floating Point Arithmetic Section 1.4 Mathematics on the Computer: Floating Point Arithmetic Key terms Floating point arithmetic IEE Standard Mantissa Exponent Roundoff error Pitfalls of floating point arithmetic Structuring computations

More information

Floating-Point Arithmetic

Floating-Point Arithmetic Floating-Point Arithmetic if ((A + A) - A == A) { SelfDestruct() } L11 Floating Point 1 What is the problem? Many numeric applications require numbers over a VERY large range. (e.g. nanoseconds to centuries)

More information

The Perils of Floating Point

The Perils of Floating Point The Perils of Floating Point by Bruce M. Bush Copyright (c) 1996 Lahey Computer Systems, Inc. Permission to copy is granted with acknowledgement of the source. Many great engineering and scientific advances

More information

Floating-point representations

Floating-point representations Lecture 10 Floating-point representations Methods of representing real numbers (1) 1. Fixed-point number system limited range and/or limited precision results must be scaled 100101010 1111010 100101010.1111010

More information

Floating-point representations

Floating-point representations Lecture 10 Floating-point representations Methods of representing real numbers (1) 1. Fixed-point number system limited range and/or limited precision results must be scaled 100101010 1111010 100101010.1111010

More information

Floating Point Numbers

Floating Point Numbers Floating Point Numbers Computer Systems Organization (Spring 2016) CSCI-UA 201, Section 2 Instructor: Joanna Klukowska Slides adapted from Randal E. Bryant and David R. O Hallaron (CMU) Mohamed Zahran

More information

Floating Point Numbers

Floating Point Numbers Floating Point Numbers Computer Systems Organization (Spring 2016) CSCI-UA 201, Section 2 Fractions in Binary Instructor: Joanna Klukowska Slides adapted from Randal E. Bryant and David R. O Hallaron (CMU)

More information

Floating Point Numbers. Lecture 9 CAP

Floating Point Numbers. Lecture 9 CAP Floating Point Numbers Lecture 9 CAP 3103 06-16-2014 Review of Numbers Computers are made to deal with numbers What can we represent in N bits? 2 N things, and no more! They could be Unsigned integers:

More information

IEEE Standard 754 Floating Point Numbers

IEEE Standard 754 Floating Point Numbers IEEE Standard 754 Floating Point Numbers Steve Hollasch / Last update 2005-Feb-24 IEEE Standard 754 floating point is the most common representation today for real numbers on computers, including Intel-based

More information

Chapter 2. Data Representation in Computer Systems

Chapter 2. Data Representation in Computer Systems Chapter 2 Data Representation in Computer Systems Chapter 2 Objectives Understand the fundamentals of numerical data representation and manipulation in digital computers. Master the skill of converting

More information

Finite arithmetic and error analysis

Finite arithmetic and error analysis Finite arithmetic and error analysis Escuela de Ingeniería Informática de Oviedo (Dpto de Matemáticas-UniOvi) Numerical Computation Finite arithmetic and error analysis 1 / 45 Outline 1 Number representation:

More information

Floating Point : Introduction to Computer Systems 4 th Lecture, May 25, Instructor: Brian Railing. Carnegie Mellon

Floating Point : Introduction to Computer Systems 4 th Lecture, May 25, Instructor: Brian Railing. Carnegie Mellon Floating Point 15-213: Introduction to Computer Systems 4 th Lecture, May 25, 2018 Instructor: Brian Railing Today: Floating Point Background: Fractional binary numbers IEEE floating point standard: Definition

More information

CS 33. Data Representation (Part 3) CS33 Intro to Computer Systems VIII 1 Copyright 2018 Thomas W. Doeppner. All rights reserved.

CS 33. Data Representation (Part 3) CS33 Intro to Computer Systems VIII 1 Copyright 2018 Thomas W. Doeppner. All rights reserved. CS 33 Data Representation (Part 3) CS33 Intro to Computer Systems VIII 1 Copyright 2018 Thomas W. Doeppner. All rights reserved. Byte-Oriented Memory Organization 00 0 FF F Programs refer to data by address

More information

Computer Architecture Chapter 3. Fall 2005 Department of Computer Science Kent State University

Computer Architecture Chapter 3. Fall 2005 Department of Computer Science Kent State University Computer Architecture Chapter 3 Fall 2005 Department of Computer Science Kent State University Objectives Signed and Unsigned Numbers Addition and Subtraction Multiplication and Division Floating Point

More information

Floating Point Numbers

Floating Point Numbers Floating Point Floating Point Numbers Mathematical background: tional binary numbers Representation on computers: IEEE floating point standard Rounding, addition, multiplication Kai Shen 1 2 Fractional

More information

Today: Floating Point. Floating Point. Fractional Binary Numbers. Fractional binary numbers. bi bi 1 b2 b1 b0 b 1 b 2 b 3 b j

Today: Floating Point. Floating Point. Fractional Binary Numbers. Fractional binary numbers. bi bi 1 b2 b1 b0 b 1 b 2 b 3 b j Floating Point 15 213: Introduction to Computer Systems 4 th Lecture, Jan 24, 2013 Instructors: Seth Copen Goldstein, Anthony Rowe, Greg Kesden 2 Fractional binary numbers What is 1011.101 2? Fractional

More information

Floating Point. CSE 238/2038/2138: Systems Programming. Instructor: Fatma CORUT ERGİN. Slides adapted from Bryant & O Hallaron s slides

Floating Point. CSE 238/2038/2138: Systems Programming. Instructor: Fatma CORUT ERGİN. Slides adapted from Bryant & O Hallaron s slides Floating Point CSE 238/2038/2138: Systems Programming Instructor: Fatma CORUT ERGİN Slides adapted from Bryant & O Hallaron s slides Today: Floating Point Background: Fractional binary numbers IEEE floating

More information