CSCI 402: Computer Architectures. Arithmetic for Computers (3) Fengguang Song Department of Computer & Information Science IUPUI.


3.5 Today's Contents
- Floating point numbers: 2.5, 10.1, 100.2, etc.
- How can computers support these real numbers?

Concept of Floating Point Numbers
- Used to represent real numbers (non-integers)
- Also used to represent very small and very large numbers
- Scientific notation: a number has a single digit to the left of the decimal point
- Normalized: a number in scientific notation that has no leading 0s
  - Some examples? $1.0 \times 10^{-9}$ is normalized; $0.1 \times 10^{-8}$ and $10.0 \times 10^{-10}$ are not normalized
- Apply the same notation to binary numbers: $\pm 1.xxxxxxx_2 \times 2^{yyyy}$
  - $1.xxxxxxx_2$ is the significand (or mantissa); $yyyy$ is the exponent
  - Always the same format, which can simplify floating point arithmetic algorithms
- Types float and double in C

IEEE Floating Point Standard
- There is a compromise between fraction and exponent:
  - Fraction bits for precision vs. exponent bits for range
  - However, we only have 32 or 64 bits
  - More precision is always possible, but at a cost
- Defined by IEEE Standard 754
  - Developed in response to divergence of representations
  - Resolves portability issues for scientific code
  - Now almost universally adopted
- Two representations:
  - Single precision floating point (32-bit)
  - Double precision floating point (64-bit)

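IEEE Floating-Point Format
- Fields, from left to right: S (1 bit), Exponent (single: 8 bits, double: 11 bits), Fraction (single: 23 bits, double: 52 bits)
- $x = (-1)^{S} \times (1 + \text{Fraction}) \times 2^{(\text{Exponent} - \text{Bias})}$
  - Exponent here is the stored exponent, in [0, 255] for single precision
- S: sign bit (0 ⇒ non-negative, 1 ⇒ negative)
- Normalized significand: $1.0 \le |\text{significand}| < 2.0$
  - Always has a leading (pre-binary-point) 1, so there is no need to represent it explicitly ("hidden bit")
  - Significand = Fraction with the "1." restored
- Stored exponent = actual exponent + Bias
  - Actual exponent = stored exponent - Bias
  - Stored exponent is unsigned: e.g., 0 to 255, or 0 to 2047
  - Single: Bias = 127; Double: Bias = 1023

Single-Precision Range
- First of all: infinity (±∞) is representable
- Exponents 00000000 and 11111111 are reserved
- The smallest value (closest to zero)?
  - Exponent: 00000001 ⇒ actual exponent = 1 - 127 = -126
  - Fraction: 000...00 ⇒ significand = 1.0
  - $\pm 1.0 \times 2^{-126} \approx \pm 1.2 \times 10^{-38}$
- The largest value (furthest from zero)?
  - Exponent: 11111110 ⇒ actual exponent = 254 - 127 = +127
  - Fraction: 111...11 ⇒ significand ≈ 2.0
  - $\pm 2.0 \times 2^{+127} \approx \pm 3.4 \times 10^{+38}$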

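Double-Precision Range
- Again, exponents 000...00 and 111...11 are reserved
- Smallest value
  - Exponent: 00000000001 ⇒ actual exponent = 1 - 1023 = -1022
  - Fraction: 000...00 ⇒ significand = 1.0
  - $\pm 1.0 \times 2^{-1022} \approx \pm 2.2 \times 10^{-308}$
- Largest value
  - Exponent: 11111111110 ⇒ actual exponent = 2046 - 1023 = +1023
  - Fraction: 111...11 ⇒ significand ≈ 2.0
  - $\pm 2.0 \times 2^{+1023} \approx \pm 1.8 \times 10^{+308}$
- Main advantage: a much larger fraction!

Overflow and Underflow
- When a number becomes too big to be represented by the exponent field → Overflow
  - i.e., magnitude > $2.0 \times 2^{127} \approx 3.4 \times 10^{38}$ (float)
- When a number becomes too small to be represented in the exponent field → Underflow
  - i.e., a nonzero value between $\pm 1.0 \times 2^{-126} \approx \pm 1.2 \times 10^{-38}$ (float)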


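IEEE 754 Encoding of Floating Point Numbers
- 0 is a special case in the IEEE 754 Std.
  - With the hidden 1, every normalized significand is at least 1.0, so it seems we have no way to represent 0: even the all-zero bit pattern would read as $1.0 \times 2^{(0 - \text{Bias})}$ (a stored exponent of zero, times the hidden one)!

Denormalized Numbers
- Exponent = 000...0 ⇒ hidden bit is 0, and the actual exponent is fixed at 1 - Bias (-126 for single, -1022 for double):
  $x = (-1)^{S} \times (0 + \text{Fraction}) \times 2^{(1 - \text{Bias})}$
- Smaller than normalized 1.xxxx numbers
- Allow for gradual underflow
- Denormalized number with Fraction = 000...0:
  $x = (-1)^{S} \times (0 + 0) \times 2^{(1 - \text{Bias})} = \pm 0.0$
- Two representations of 0.0!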

Representation of Infinities and NaNs
- Exponent = 111...1, Fraction = 000...0 ⇒ ±Infinity
  - e.g., 1.0 / 0.0
  - Can be used in subsequent calculations, avoiding the need for an overflow check
- Exponent = 111...1, Fraction ≠ 000...0 ⇒ Not-a-Number (NaN)
  - Indicates an illegal or undefined result, e.g., 0.0 / 0.0, Inf - Inf
  - Can still be used in subsequent calculations

Floating-Point Numbers' Precision
- Precision: minimum relative difference between two floating point numbers?
- For single (23-bit fraction): approximately $2^{-23}$
  - Equivalent to $23 \times \log_{10} 2 \approx 6.9$ decimal digits of precision
  - Not sufficient for scientific computing!
- For double (52-bit fraction): approximately $2^{-52}$
  - Equivalent to $52 \times \log_{10} 2 \approx 15.7$ decimal digits of precision
- Also called machine epsilon: the minimum epsilon s.t. 1 + epsilon > 1 (the gap between 1 and the next representable number)

Accuracy vs Precision
- Accuracy is how close a measured value is to the true value.
- Precision is how close the measured values are to each other.
[Figure: three panels labeled "Low accuracy, high precision", "High accuracy, low precision", and "High accuracy, high precision"]
