FLOATING POINT NUMBERS

Robert P. Webber, Longwood University

We have seen how decimal fractions can be converted to binary. For instance, we can write 6.25_10 as

    4 + 2 + 1/4 = 2^2 + 2^1 + 2^-2
                = 1*2^2 + 1*2^1 + 0*2^0 + 0*2^-1 + 1*2^-2
                = 110.01_2.

Teaching a computer how to do arithmetic using such binary fractions would be difficult. One problem is that the binary point is not fixed; it needs to float. If you multiply a two place fraction by another two place fraction, for instance, the result has four fractional places, not two.

Computer scientists realized that it would be easier to do floating point arithmetic if the numbers were written in scientific notation. You may recall this notation from your science classes, where very large numbers and numbers that are very close to zero are written using powers of ten. For example,

    1,234,000,000 = 1.234 * 10^9
    0.0000567     = 5.67 * 10^-5

Computers use binary notation and powers of two, of course. The resulting format is called floating point notation. A floating point number has three parts: its sign, a fractional part, and an exponent:

    ± fractional_part * 2^exponent

For instance, the floating point number 5.16 * 2^13 (with its fractional part written in decimal) has a positive sign, a fractional part of 5.16, and an exponent of 13. It is equivalent to 5.16 * 2^13 = 5.16 * 8192 = 42,270.72 in ordinary signed decimal form.

There are many slightly different floating point formats. In an effort to bring order from chaos, the Institute of Electrical and Electronics Engineers (IEEE) developed a standard form called IEEE 754 single precision. Many computers use this standard, and it is the one we will examine.
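
The conversion of 6.25 to 110.01 shown earlier can be sketched in a few lines of Python. This is only an illustration, not part of the original handout: the function name to_binary and its max_frac_bits parameter are made up for this example, and the sketch handles non-negative numbers only.

```python
def to_binary(x, max_frac_bits=23):
    """Return the binary expansion of a non-negative number as a string.

    Hypothetical helper for illustration. The fractional part is found by
    repeated doubling and is cut off after max_frac_bits digits.
    """
    if x < 0:
        raise ValueError("this sketch handles non-negative numbers only")
    whole = int(x)          # part to the left of the binary point
    frac = x - whole        # part to the right of the binary point
    digits = []
    while frac > 0 and len(digits) < max_frac_bits:
        frac *= 2           # shift the next binary digit into the ones place
        bit = int(frac)
        digits.append(str(bit))
        frac -= bit
    whole_str = bin(whole)[2:]   # bin(6) is '0b110'; drop the '0b' prefix
    return whole_str + ("." + "".join(digits) if digits else "")


print(to_binary(6.25))   # 110.01
```

Numbers such as 0.1 have no finite binary expansion, so for them the loop stops only when max_frac_bits digits have been produced.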

The IEEE standard uses 32 bits for each floating point number, divided into three fields. The left most bit is the sign bit. The next eight bits hold the exponent. The final 23 bits contain the fractional part.

    Sign (1 bit)    Exponent (8 bits)    Fractional part (23 bits)

The sign bit is 0 for a positive number or zero, and 1 for a negative number.

The fractional part assumes the number is binary and in the form 1.xx...x, where xx...x denotes binary digits. This is called normalized form. Since a normalized number always begins with 1, there is no need to store that bit. Only the part after the binary point is stored. The assumed 1 is called a hidden bit, and it provides 24 bits of accuracy in only 23 bit spaces.

The exponent must allow for a sign, since exponents can be positive or negative. It must also allow for quick comparison to other exponents, because many comparisons must be done in floating point arithmetic. Two's complement notation would provide the sign, but not quick comparison. To allow that, the IEEE form uses excess 127 notation. In base 10,

    excess 127 exponent = signed decimal exponent + 127.

For example, convert the decimal number 50.5 to IEEE format.

Step 1: Convert 50.5 to base 2.

    50.5 = 32 + 16 + 2 + 1/2 = 2^5 + 2^4 + 2^1 + 2^-1 = 110010.1_2.

Step 2: Write the number in normalized form. We must move the binary point five places to the left, so the exponent is 5.

    110010.1 = 1.100101 * 2^5.

We drop the left-most 1, because it is the hidden bit, so the fractional part is 100101. Notice there is no need to write trailing zeros when we write the number by hand. When we write it as the computer will store it, however, we will need to add 17 trailing zeros to make 23 bits for the fractional part in all. The complete fractional part is 10010100000000000000000.

Step 3: Find the excess 127 form of the exponent.

    127 + 5 = 132 = 128 + 4 = 2^7 + 2^2 = 10000100_2

The sign is 0, since the number is positive; the excess 127 exponent is 10000100, and the fractional part is 100101 followed by 17 zeros. The IEEE form is 01000010010010100000000000000000, and this is how the number would be stored in the computer. We could write this as 424A0000 in hexadecimal form for better readability. The form may be clearer if we break it into fields.

    0       10000100    10010100000000000000000
    sign    exponent    fractional part

Here's another example. Write -121.75_10 in IEEE floating point format.

Step 1: Convert 121.75 to binary.

    121.75 = 64 + 32 + 16 + 8 + 1 + 1/2 + 1/4
           = 2^6 + 2^5 + 2^4 + 2^3 + 2^0 + 2^-1 + 2^-2
           = 1111001.11_2

Step 2: Write the binary number in normalized form.

    1111001.11 = 1.11100111 * 2^6

Step 3: Find the excess 127 exponent.

    127 + 6 = 133 = 128 + 4 + 1 = 2^7 + 2^2 + 2^0 = 10000101_2

The sign is 1, since the number is negative; the fractional part is 11100111 followed by 15 zeros (for a total of 23 bits), and the exponent is 10000101. The IEEE form is 11000010111100111000000000000000, or C2F38000 in hexadecimal. Broken up into fields, it is

    1       10000101    11100111000000000000000
    sign    exponent    fractional part
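
The three steps used in both examples can be followed mechanically. The Python sketch below is an illustration under stated assumptions, not the handout's own method: the name encode_single is hypothetical, the input must be nonzero, and any fraction bits beyond the 23rd are simply cut off rather than rounded. The standard struct module is used only to cross-check the hand-built pattern.

```python
import struct


def encode_single(x):
    """Build the 32-bit IEEE 754 single precision pattern of x as a bit string.

    Hypothetical helper: handles nonzero numbers in the normalized range and
    truncates (does not round) fraction bits beyond the 23rd.
    """
    if x == 0:
        raise ValueError("zero has no normalized form; not handled in this sketch")
    sign = 1 if x < 0 else 0
    magnitude = abs(x)

    # Step 2: move the binary point until the number looks like 1.xx...x
    exponent = 0
    while magnitude >= 2:
        magnitude /= 2
        exponent += 1
    while magnitude < 1:
        magnitude *= 2
        exponent -= 1

    # Drop the hidden bit and keep 23 fraction bits.
    fraction = int((magnitude - 1) * 2**23)

    # Step 3: excess 127 form of the exponent.
    biased = exponent + 127

    return f"{sign:01b}{biased:08b}{fraction:023b}"


bits = encode_single(50.5)
print(bits)                          # 01000010010010100000000000000000
print(f"{int(bits, 2):08X}")         # 424A0000

# Cross-check against the pattern the struct module produces.
print(struct.pack(">f", 50.5).hex().upper())   # 424A0000
```

Calling encode_single(-121.75) gives the pattern of the second example, C2F38000 in hexadecimal.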

MAGNITUDE AND PRECISION

Magnitude refers to the raw size of the number; that is, how large or small it can be. Precision refers to the number of digits of accuracy in a number. Magnitude and precision measure different things. Magnitude refers to the possible number of digits, precision to how many of those digits are accurate.

Often we don't care much about the precision in very large numbers. For instance, the 2007 United States population was 302 million people. We don't really mean exactly 302,000,000, of course. The number is accurate to only three digits. Indeed, it would not be possible to be much more accurate, since the exact population is constantly changing.

When a computer stores a number in integer format, every digit is accurate. The largest integer that can be stored in 32 bits using two's complement notation is 2^31 - 1, which is 2,147,483,647. Say an integer has value 15,431. We can be sure that each digit is correct. However, an integer such as 3,500,630,119 cannot be stored in 32 bits. It is too large.

Floating point numbers generally do not have this precision property. The magnitude is determined by the exponent, while the precision is determined by the fractional part. Not all digits of a displayed floating point number are necessarily accurate. In IEEE format, the precision is 24 bits, including the hidden bit. This translates to about seven decimal digits of accuracy.

The magnitude, however, is much larger. The biggest exponent available for ordinary numbers in eight bits of excess 127 notation is 127 (the all-ones exponent field is reserved for special values), so the largest number that can be represented has all 1's in the fractional part and 127_10 in the exponent: 1.11...1 * 2^127. This number is approximately 3.4 * 10^38. A number that is larger than 3.4 * 10^38 cannot be stored in a computer using standard IEEE format. This is a huge number, but it is possible to exceed it.
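
The limits quoted above can be checked with a short Python sketch. It is an illustration only: the helper name to_single is made up for this example, and it relies on the standard struct module to round a value to single precision. The value 2^24 + 1 is chosen because it needs 25 significant bits, one more than the format holds.

```python
import struct


def to_single(x):
    """Round a Python float to the nearest IEEE 754 single precision value.

    Hypothetical helper: packs x into 4 bytes and unpacks it again.
    """
    return struct.unpack(">f", struct.pack(">f", x))[0]


largest_int32 = 2**31 - 1
largest_single = (2 - 2**-23) * 2.0**127   # 1.11...1 (24 ones) times 2^127

print(largest_int32)              # 2147483647
print(largest_single)             # about 3.4e38

# Magnitude: a value beyond the limit cannot be packed at all.
try:
    struct.pack(">f", 3.5e38)
except OverflowError:
    print("3.5e38 is too large for single precision")

# Precision: 2^24 fits exactly, but 2^24 + 1 needs 25 bits and is rounded.
print(to_single(16777216.0))      # 16777216.0
print(to_single(16777217.0))      # 16777216.0
```
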
For example, chess offers about 35 legal choices for each move on average, and the number of possible board positions grows exponentially to more than 10^50, which exceeds the largest value (about 3.4 * 10^38) that IEEE single precision can hold.

It is important to realize that while the magnitude allows us to store numbers, the precision may mean that not all digits are accurate. For instance, suppose the budget for a large corporation is $632,785,417.25. This number can be stored in standard floating point, because its magnitude is much less than 3.4 * 10^38, but only about seven digits of accuracy will be preserved. The stored value will be approximately $632,785,400.00. The last several digits will be lost.

Other floating point formats, such as IEEE 754 double precision, are available that increase the precision and also the magnitude. Regardless, you should always remember that a computer generated number is only accurate to a maximum number of digits (about seven for standard IEEE single precision). Any digits beyond that maximum will not be reliable.

Exercises

In problems 1 through 8, write the decimal number in normalized form; that is, in the form 1.xx...x * 2^exponent.

1. 562
2. 961
3. 1055
4. 2050
5. 69
6. 120
7. 28.125
8. 106.25

In problems 9 through 12, find the excess 127 form of the base 10 number.

9. 7
10. 38
11. 8
12. 19

In problems 13 through 18, write the decimal number in IEEE floating point format.

13. 42.5
14. 105.375
15. 1 26
16. 145.625 4
17. 11/16
18. 15/32

In problems 19 through 28, can the quantity be stored as a 32 bit integer? As a standard IEEE floating point number? In each case, if it can be stored, will the stored number be accurate? Explain your answers.

19. The number of seconds in an hour.
20. The number of seconds in a day.
21. The number of seconds in a week.

22. The number of seconds in the month of March.
23. The number of seconds in a non-leap year.
24. The number of seconds in a century.
25. The number 3.141592653 (the first ten digits of the number π).
26. The number 2.718281828 (the first ten digits of the number e).
27. The distance from the Sun to the Earth, expressed in miles.
28. The United States national debt.

In problems 29 through 32, show how the two numbers would be stored, the first in binary 2's complement integer form, the second in IEEE form. Assume 32 bits are used for each.

29. 35, 35.0
30. 6, 6.0
31. 14, 14.0
32. 161, 161.0

In problems 33 through 36, the bit pattern represents an IEEE floating point format number. Find its ordinary signed decimal form.

33. 01000001110110000000000000000000 (the pattern is in binary)
34. 3EA00000 (the pattern is in hexadecimal)
35. BDE00000 (the pattern is in hexadecimal)
36. 11000010111100110100000000000000 (the pattern is in binary)
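
Answers to problems such as 33 through 36 can be checked by reversing the encoding steps. The Python sketch below is an illustration only: the names decode_single and hex_to_bits are hypothetical, and the sketch assumes a normalized number (an exponent field that is neither all zeros nor all ones). It is shown on the two worked examples from the text rather than on the exercise patterns.

```python
def hex_to_bits(h):
    """Expand an 8-digit hexadecimal pattern into a 32-character bit string."""
    return f"{int(h, 16):032b}"


def decode_single(bits):
    """Return the decimal value of a 32-bit IEEE 754 single precision pattern.

    Hypothetical helper: assumes a normalized number, so exponent fields of
    all zeros or all ones are not handled.
    """
    if len(bits) != 32 or set(bits) - {"0", "1"}:
        raise ValueError("expected a string of exactly 32 binary digits")
    sign = -1 if bits[0] == "1" else 1
    exponent = int(bits[1:9], 2) - 127        # undo the excess 127 form
    fraction = 1 + int(bits[9:], 2) / 2**23   # restore the hidden bit
    return sign * fraction * 2.0**exponent


print(decode_single("01000010010010100000000000000000"))   # 50.5
print(decode_single(hex_to_bits("C2F38000")))               # -121.75
```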