# Numeric Precision 101

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 > Service and Support > Technical Support TS Home Intro to Services News and Info Contact TS Site Map FAQ Feedback TS-654 Numeric Precision 101 This paper is intended as a basic introduction to the area of numeric precision. Maybe you re just beginning to program with SAS or maybe it s been a long time since you studied number theory in college. This paper gives a basic overview of numeric precision and representation issues within SAS applications. You can read more details in TS-230 which can be downloaded from our website, One of the most common issues that face each SAS user is numeric precision and representation. Many users write programs that crunch numbers and when the results are output, they assume that "what they see is what they get". Of course we expect computers to be able to do math properly..they re at least that smart, right? If computers weren t reliable, why would nearly all business functions rely on their integrity? These sound like reasonable arguments to support our belief that computers are able to output exactly what is stored internally and represent every number as we can on paper. However, this is not entirely true. We have a difficult time understanding what happens inside of a computer system when we ask it to represent numbers. This is because the number system with which we re most familiar and the number system the computer uses to store numbers are different. As young children we were taught math with the decimal (base ten) number system by using our ten fingers for counting. When the numbers became too large to do our calculations, we did them on paper and always knew we could trust the results. After all, that was the only way we knew to add, subtract, multiply, and divide. This number system became the standard by which everything quantitative would be based. Years later when we began using a computer, many of us tried to apply the same rules that had become second nature and realized the same rules didn t apply. Why aren t the results the same? The reason is that due to hardware limitations and the way numbers are stored, a finite set of numbers must be used to represent the infinite real number system. People are accustomed to decimal arithmetic. Computers use binary arithmetic with finite precision. Therefore, when dealing with numbers that do not have an exact binary representation, computers often produce results that differ slightly from what people expect. For example, the decimal values 0.1 and 0.3 do not have exact binary representations. In decimal arithmetic 3*0.1 is exactly equal to 0.3, but this equality does not hold in binary arithmetic. If you print these two values in SAS, they appear the same, but by computing the difference, you can see they are really not the same: data a; point_three=0.3; three_times_point_one=3 * 0.1; difference=point_three three_times_point_one; proc print noobs; three_ point_ times_ three point_one difference E-17 file:///i /TS-DOC%20TS-654%20-%20Numeric%20Precision%20101.htm (1 of 12) [10/11/02 4:01:11 PM]

2 Floating Point Representation Although there are various means to store numbers, SAS uses floating point, or real binary, representation to store all numeric values. One of the main reasons SAS chose floating point was for computational speed. Floating point also allows for numbers of very large magnitude (numbers such as 2 to the 30 th power) and high degrees of precision (many digits to the right of the decimal place). You re probably more familiar with this method than you realize, because it s an implementation of scientific notation. Each number is represented as a mantissa between 0 and 1 raised to a power of 10. One major difference between scientific notation and floating point representation is that on most operating systems the base is not 10, but is either 2 or 16 depending on the system. The following diagram shows a number written in scientific notation: decimal value 987 =.987 X 10 3 the mantissa is the number multiplied by the base, in this example,.987 the base is the number raised to a power, in this example, 10 the exponent is the power to which the base is raised, in this example, 3 You ll see how floating point representation is calculated on an MVS system and a PC system a little later. In most situations, you will not be affected directly by the way SAS stores numbers, however, you may see what appear to be peculiar results from time to time due to precision and representation issues. Precision Versus Magnitude SAS stores all numeric values in eight bytes of storage unless you specify differently. This doesn t mean that a value is limited to eight digits, but rather that eight bytes are used to store the value. A numeric value is comprised of the mantissa, the sign for the mantissa, the exponent, and the sign for the exponent. Many times real numbers can t be stored with precision, which is the accuracy with which a number can be represented. For integers, precision isn t an issue but magnitude can be. There is a maximum integer that can be accurately represented depending on the number of bytes used to store the variable. The following chart summarizes the largest integer by length for SAS variables: Length In Bytes Mainframe Unix PC N/A 3 65,536 8,192 8, ,777,216 2,097,152 2,097, ,294,967, ,870, ,870, ,099,511,627, ,438,953, ,438,953, ,474,946,710,656 35,184,372,088,832 35,184,372,088, ,057,594,037,927,936 9,007,199,254,740,992 9,007,199,254,740,992 file:///i /TS-DOC%20TS-654%20-%20Numeric%20Precision%20101.htm (2 of 12) [10/11/02 4:01:11 PM]

3 One of the first things you probably noticed about the table is that the maximum integers for each additional byte represented precisely with various lengths is different across hosts. This is due to the specifications of the floating point representation being used by each operating system. The base (2 or 16), the number of bits used to represent the mantissa, and the number of bits used to represent the exponent affect the precision that is available. Whether an operating system truncates or rounds digits that cannot be stored also affects representation error as well. Due to all of these factors that can affect storage, it s imperative that real numbers are stored in the full eight bytes of storage. If you re interested in saving disk space and would like to shorten the lengths of numeric variables that contain integers, use a LENGTH statement in your DATA step to change the number of bytes used to store the value. Be sure the values will be less than or equal to the longest integer by the length specified in the above table. The hardware differ from one another in a variety of ways. The following chart outlines them: Platforms Specifications IBM Mainframe VAX/VMS IEEE ** Base Exponent Bits Mantissa bits Round or Truncate Truncate Round Round Bias for Exponent **IEEE is used by OS/2, Windows, and UNIX Let s discuss what the table means. BASE - base number system used to compute numbers in floating point representation * base 16 - uses numbers 0-9 and letters A-F (to represent the values 10-15) 268,435, , the value 3000 is represented as BB8 = B*(16**2) + B*(16**1) + 8*(16**0) = 11* *16 + 8*1 = 3000 * base 2 uses digits 0 and 1 file:///i /TS-DOC%20TS-654%20-%20Numeric%20Precision%20101.htm (3 of 12) [10/11/02 4:01:11 PM]

4 the value 184 is represented as = 1*(2**7) + 0*(2**6) + 1*(2**5) + 1*(2**4) + 1*(2**3) + 0*(2**2) + 0*(2**1) + 0*(2**0) = = 184 EXPONENT BITS the number of bits reserved for storing the exponent, which determines the magnitude of the number we can store. You can see that the number of exponent bits varies between the operating systems. It stands to reason that IEEE systems yield numbers of greater magnitude since they use more bits for the exponent. MANTISSA BITS the number of bits reserved for storing the mantissa which determines the precision by which the number is stored. Since there are more bits reserved for the mantissa on mainframes, you can expect greater precision on a mainframe compared to a PC. ROUND or TRUNCATE When the mantissa can't be represented precisely, a convention must be adopted on how to handle the last character that can be represented in the number of bytes reserved for the mantissa. One convention is to truncate the value at the length that can be stored. This convention is used by IBM Mainframe systems. An alternative is to round the value based on the digits that cannot be stored, which is done on VAX/VMS and IEEE systems. There is no right or wrong way to handle this dilemma since neither convention results in an exact representation of the value. BIAS The bias for the exponent makes it possible to store both positive and negative exponents without the need for an additional sign bit. This will make more sense when you see examples of computing floating point representation below. file:///i /TS-DOC%20TS-654%20-%20Numeric%20Precision%20101.htm (4 of 12) [10/11/02 4:01:11 PM]

5 Although the IEEE platforms use the same set of specifications for floating point representation, you may occasionally see varying results between the platforms due to hardware differences. The IEEE standard determines how they store numbers in floating point, but that doesn t mean they perform all calculations the same way. It only means that the same number entered into each operating system is stored in floating point representation exactly the same way. On the Windows platforms, the processor does computations in extended real precision. This means that instead of the normal 64 bits that are used to store numeric values (53 bits for the mantissa and 11 bits for the exponent), there are 16 additional bits, 12 for the mantissa and 4 for the exponent. Numeric values aren t stored in 80 bits (10 bytes) since the maximum width for a numeric variable in SAS is 8 bytes. This simply means that the processor uses 80 bits to represent numeric values before it s passed back to memory, which is 64 bits. On Windows, this allows storage of larger numbers than the standard IEEE floating point format used by operating systems such as UNIX. This is one reason why you may see slightly different values from operating systems that use the IEEE standard. It is not uncommon to get slightly different results between operating systems whose floating point representation components differ (i.e. MVS and PC, MVS and UNIX). Some problems with numeric precision arise because the underlying instructions that each operating system uses to do addition, multiplication, division, etc. are slightly different. There is no standard method for doing computations since all operating systems attempt to compute numbers as accurately as possible. It would be helpful at this point to show how floating point representation is calculated on Windows and MVS platforms. This further illustrates how each operating system represents numeric values as best it can, but with some limitations. FLOATING POINT REPRESENTATION on WINDOWS The byte layout for a 64 bit double precision number is as follows: SEEEEEEE EEEEMMMM MMMMMMMM MMMMMMMM byte 1 byte 2 byte 3 byte 4 MMMMMMMM MMMMMMMM MMMMMMMM MMMMMMMM byte 5 byte 6 byte 7 byte 8 S=sign E=exponent M=mantissa This example shows the conversion process for the decimal value to floating point representation. 1. Write out the value in binary which involves the base 2 number system ½ ¼ 1/ ( =.75) is represented in binary as file:///i /TS-DOC%20TS-654%20-%20Numeric%20Precision%20101.htm (5 of 12) [10/11/02 4:01:11 PM]

6 2. Move the decimal over until there's only one digit to the left of it. This process is called normalizing the value. In this case we'll move it 7 places so our exponent is 7. ( ) 3. The bias is 1023 so we add 7 to this number to get Convert 1030 to hexadecimal representation (using base 16 number system) which is 406 and this will be placed in the exponent portion of our final result Convert 406 to binary If the value you re converting is negative, change the first bit to 1 so you have which in hex translates to C Go back to #2 above throw away the first digit and decimal point (called implied 1 bit) so that you now have Break these up into nibbles (half bytes) so that you have In order to have a complete nibble at the end, add enough zeros to complete 4 bits: Convert these to their hex equivalent and this is the mantissa portion. F F 8 8. The final floating point representation for is 406FF The final floating point representation for is C06FF This example was fairly straight forward in that the value was represented precisely. After seeing how the decimal portion of the value is computed in floating point, it s easy to see that many values can t be done so easily. Here s an example of how floating point representation is computed on MVS, and I ve purposely chosen a value that isn t represented exactly. FLOATING POINT REPRESENTATION on MVS file:///i /TS-DOC%20TS-654%20-%20Numeric%20Precision%20101.htm (6 of 12) [10/11/02 4:01:11 PM]

7 IBM mainframe operating environments (OS/390 and CMS) use the same representation made up of 8 bytes as follows: SEEEEEEE MMMMMMMM MMMMMMMM MMMMMMMM byte 1 byte 2 byte 3 byte 4 MMMMMMMM MMMMMMMM MMMMMMMM MMMMMMMM byte 5 byte 6 byte 7 byte 8 S=sign E=exponent M=mantissa This example shows the conversion process for the decimal value to floating point representation. This illustrates how values that we can represent in our decimal number system can t be represented exactly in floating point, which results in numeric precision issues. 1. Because the base is 16, you must first convert the value to hexadecimal notation. 2. Convert the integer portion (decimal) = = 200 * 16 ** 0 Move the decimal point all the way to the left, count the number of positions that you moved, and this is the exponent. 3. Convert the fraction portion.1 = 1/10 = 1.6/16 =.200 * 16 **3 The numerator can t be a fraction so we ll keep the 1 and convert the.6 portion again.6 = 6/10 = 9.6/16 Again, can t have fractions for the numerator so keep the 9 and reconvert.6 portion. The.6 continues to repeat as 9.6 which means you keep the 9 and reconvert. The closest that.1 can be represented is * 16 ** 0 (hexadecimal) 4. The exponent for the value is 3 (step 2 above). To determine the actual exponent that will be stored, take the exponent value and add the bias to it: file:///i /TS-DOC%20TS-654%20-%20Numeric%20Precision%20101.htm (7 of 12) [10/11/02 4:01:11 PM]

8 true exponent + bias = = 43(hexadecimal) = stored exponent The final portion to be determined is the sign of the mantissa. By convention, the sign bit for positive mantissas is 0, and the sign for negative mantissas is 1. This information is stored in the first bit of the first byte. From the hexadecimal value in step 4, compute the decimal equivalent and write it in binary. Add the sign bit to the first position. The stored value now looks like this: 43 hexadecimal = 4*16**1 + 3*16**0 = 67 decimal = binary Had we been computing the floating point representation for , the binary representation for the exponent with sign would be: = 195 in decimal = C3 in hex 5. The final step is to put it all together: floating point representation for C floating point representation for This example illustrates perfectly how values that can be represented exactly in our decimal number system can t necessarily be represented as precisely in floating point. When you notice a floating point value has a repeating pattern of numbers (like the above value has repeating 9s ), it s a good sign that this value can t be represented exactly. This is analogous to trying to represent 1/3 in base 10..it s not exact. The closest we can come is (repeating 3s forever). Now that you ve learned how floating point representation is determined, let s look at an example of an arithmetic operation involving real numbers. We know that should be equal to 3.8 if the calculation is done in decimal arithmetic. data a; x= ; So we should be able to compare it to the literal value of 3.8 and find that they are equal right? data a; x= ; if x=3.8 then put 'equal'; else put 'not equal'; not equal The above output to the SAS log indicates that they are not equal. The next thing you d probably do is output the data set with PROC PRINT to verify the value is indeed 3.8. file:///i /TS-DOC%20TS-654%20-%20Numeric%20Precision%20101.htm (8 of 12) [10/11/02 4:01:11 PM]

9 The SAS System Obs x PROC PRINT indicates the value is 3.8. That s because a slight amount of fuzzing is done in the W.D format used by PROC PRINT whereby some values may appear to be different than they are stored. Now let s look at the value using a format. 279 data a; 280 x= ; 281 if x=3.8 then put 'equal'; 282 else put 'not equal'; 283 put x=10.8; 283 put x=18.16; 284 not equal x= x= The first format used was 10.8 which still shows the value as 3.8; however, displaying the value with a larger width indicates that the value is slightly less than 3.8. Another way to verify that X is not 3.8 is by outputting the assigned value of 3.8 and the stored value with the HEX16. format. This is a special format that can be used to show floating point representation. 322 data a; 323 x= ; 324 literal=3.8; 325 put x=hex16. literal=hex16.; 326 x=400e literal=400e You can see that the values are definitely different. This example may be alarming to you or make you question the integrity of computer output. Regardless of how much precision is available, the problem remains that not all numbers can be represented exactly. In the decimal number system, the fraction 1/3 cannot be precisely represented. Likewise, many fractions (for example,.1) cannot be exactly represented in base 2 or base 16 numbering systems. What happens when you add 1/3 three times? Is the value exactly one? No, it s rather than 1. It stands to reason that when values can t be represented exactly in floating point, performing mathematical operations with other non-exact values will further compound the impreciseness of the result. Here s another example of a numeric precision issue that occurs on MVS but not on the PC. data a; input gender \$ height; cards; m 60 m 58 m 59 m 70 m 60 file:///i /TS-DOC%20TS-654%20-%20Numeric%20Precision%20101.htm (9 of 12) [10/11/02 4:01:11 PM]

10 m 58 ; proc freq; tables gender/out=new; Output from PROC PRINT of data set NEW Gender COUNT PERCENT m data final; set new; if percent=100 then put equal ; else put not equal ; not equal NOTE: There were 1 observations read from the data set WORK.NEW. PROC FREQ created an output data set with a PERCENT variable. Since all of the values for GENDER are the same, we would expect PERCENT to have a value of 100. When the value of PERCENT is tested, the log indicates that PERCENT is not 100. The algorithm used by PROC FREQ to produce the variable PERCENT involves mathematical computations. The result was very close to 100 but not exactly. Using the ROUND function on the IF statement resolves the issue. data final; set new; if round(percent)=100 then put equal ; else put not equal ; equal NOTE: There were 1 observations read from the data set WORK.NEW. There are several things to keep in mind when working with real numbers. 1. Know your data 2. Decide the level of significance you need 3. Remember that all numeric values are stored in floating point representation 4. Use comparison functions such as ROUND The safest way to avoid numeric precision issues is to use whole numbers. You may say this is impossible if your data coming into SAS contains real numbers rather than integers only. You can create integers using several functions. For example, in working with money amounts, you would be concerned with two decimal places. You could use the ROUND function to the thousands place(always use one additional decimal place than you need at the end), multiply by file:///i /TS-DOC%20TS-654%20-%20Numeric%20Precision%20101.htm (10 of 12) [10/11/02 4:01:11 PM]

11 that amount (in this case 1000) and then use the INT function to ensure stray decimals are removed. Then you could do your calculations with the whole numbers and divide by 1000 at the end to get the decimal values back. A format with two decimal places (for money amounts) can be used. In this example, the variable X values are stored in the SAS data set as real numbers. In order to convert them to integers, use the ROUND function to the level of significance, multiply by that level and use the INT function to return integers. To sum all values of NEW, use the SUM statement and on the last observation (detected with the END= option) the sum is divided by data a; set b end=last; new=int(round(x,.001)*1000); sum+new; if last then sum=sum/1000; In the mathematical equation that I used to illustrate numeric precision issues, the ROUND function would have caused the values to compare equally. This happens because the computed value is very close to 3.8. Very close isn t good enough when comparing values to one another. 48 data a; 49 x= ; 50 if round(x,.01)=3.8 then put 'equal'; 51 else put 'not equal'; 52 equal NOTE: The data set WORK.A has 1 observations and 1 variables. NOTE: DATA statement used: real time 0.20 seconds cpu time 0.06 seconds TRANSFERRING DATA BETWEEN OPERATING SYSTEMS You must use extreme caution when moving data between operating systems. The chart on page 3 indicates that the number of bits used to store the mantissa and exponent varies among platforms. This affects the magnitude and precision that is maintained. For more information, refer to "SAS Technical Report P-195: Transporting SAS Files Between Host Systems". In conclusion, although numeric precision and representation issues have been occurring since the inception of computer technology, it occasionally takes users by surprise. Hopefully much of the mystery has been removed and you realize that in a nutshell, our infinite number system can t be precisely represented in the finite confines of an operating system s floating point representation. Take some time to know your data and use the appropriate tools for comparisons. The SAS System can store numbers with great magnitude and precision, but you must stay aware of possible problems whenever your programs use real numbers. file:///i /TS-DOC%20TS-654%20-%20Numeric%20Precision%20101.htm (11 of 12) [10/11/02 4:01:11 PM]

### Signed umbers. Sign/Magnitude otation

Signed umbers So far we have discussed unsigned number representations. In particular, we have looked at the binary number system and shorthand methods in representing binary codes. With m binary digits,

### Number Systems Using and Converting Between Decimal, Binary, Octal and Hexadecimal Number Systems

Number Systems Using and Converting Between Decimal, Binary, Octal and Hexadecimal Number Systems In everyday life, we humans most often count using decimal or base-10 numbers. In computer science, it

### What Every Programmer Should Know About Floating-Point Arithmetic

What Every Programmer Should Know About Floating-Point Arithmetic Last updated: October 15, 2015 Contents 1 Why don t my numbers add up? 3 2 Basic Answers 3 2.1 Why don t my numbers, like 0.1 + 0.2 add

### Number Systems. Binary Numbers. Appendix. Decimal notation represents numbers as powers of 10, for example

Appendix F Number Systems Binary Numbers Decimal notation represents numbers as powers of 10, for example 1729 1 103 7 102 2 101 9 100 decimal = + + + There is no particular reason for the choice of 10,

### Divisibility Rules and Their Explanations

Divisibility Rules and Their Explanations Increase Your Number Sense These divisibility rules apply to determining the divisibility of a positive integer (1, 2, 3, ) by another positive integer or 0 (although

### 1. NUMBER SYSTEMS USED IN COMPUTING: THE BINARY NUMBER SYSTEM

1. NUMBER SYSTEMS USED IN COMPUTING: THE BINARY NUMBER SYSTEM 1.1 Introduction Given that digital logic and memory devices are based on two electrical states (on and off), it is natural to use a number

### Imelda C. Go, South Carolina Department of Education, Columbia, SC

PO 082 Rounding in SAS : Preventing Numeric Representation Problems Imelda C. Go, South Carolina Department of Education, Columbia, SC ABSTRACT As SAS programmers, we come from a variety of backgrounds.

### Floating-point Arithmetic. where you sum up the integer to the left of the decimal point and the fraction to the right.

Floating-point Arithmetic Reading: pp. 312-328 Floating-Point Representation Non-scientific floating point numbers: A non-integer can be represented as: 2 4 2 3 2 2 2 1 2 0.2-1 2-2 2-3 2-4 where you sum

### Module 2: Computer Arithmetic

Module 2: Computer Arithmetic 1 B O O K : C O M P U T E R O R G A N I Z A T I O N A N D D E S I G N, 3 E D, D A V I D L. P A T T E R S O N A N D J O H N L. H A N N E S S Y, M O R G A N K A U F M A N N

### The Perils of Floating Point

The Perils of Floating Point by Bruce M. Bush Copyright (c) 1996 Lahey Computer Systems, Inc. Permission to copy is granted with acknowledgement of the source. Many great engineering and scientific advances

### FLOATING POINT NUMBERS

FLOATING POINT NUMBERS Robert P. Webber, Longwood University We have seen how decimal fractions can be converted to binary. For instance, we can write 6.25 10 as 4 + 2 + ¼ = 2 2 + 2 1 + 2-2 = 1*2 2 + 1*2

### Floating Point Considerations

Chapter 6 Floating Point Considerations In the early days of computing, floating point arithmetic capability was found only in mainframes and supercomputers. Although many microprocessors designed in the

### 1.2 Round-off Errors and Computer Arithmetic

1.2 Round-off Errors and Computer Arithmetic 1 In a computer model, a memory storage unit word is used to store a number. A word has only a finite number of bits. These facts imply: 1. Only a small set

### CS101 Lecture 04: Binary Arithmetic

CS101 Lecture 04: Binary Arithmetic Binary Number Addition Two s complement encoding Briefly: real number representation Aaron Stevens (azs@bu.edu) 25 January 2013 What You ll Learn Today Counting in binary

### Characters, Strings, and Floats

Characters, Strings, and Floats CS 350: Computer Organization & Assembler Language Programming 9/6: pp.8,9; 9/28: Activity Q.6 A. Why? We need to represent textual characters in addition to numbers. Floating-point

### Classes of Real Numbers 1/2. The Real Line

Classes of Real Numbers All real numbers can be represented by a line: 1/2 π 1 0 1 2 3 4 real numbers The Real Line { integers rational numbers non-integral fractions irrational numbers Rational numbers

### Chapter 2 Float Point Arithmetic. Real Numbers in Decimal Notation. Real Numbers in Decimal Notation

Chapter 2 Float Point Arithmetic Topics IEEE Floating Point Standard Fractional Binary Numbers Rounding Floating Point Operations Mathematical properties Real Numbers in Decimal Notation Representation

### CS429: Computer Organization and Architecture

CS429: Computer Organization and Architecture Dr. Bill Young Department of Computer Sciences University of Texas at Austin Last updated: September 18, 2017 at 12:48 CS429 Slideset 4: 1 Topics of this Slideset

### Chapter 2: Number Systems

Chapter 2: Number Systems Logic circuits are used to generate and transmit 1s and 0s to compute and convey information. This two-valued number system is called binary. As presented earlier, there are many

### Administrivia. CMSC 216 Introduction to Computer Systems Lecture 24 Data Representation and Libraries. Representing characters DATA REPRESENTATION

Administrivia CMSC 216 Introduction to Computer Systems Lecture 24 Data Representation and Libraries Jan Plane & Alan Sussman {jplane, als}@cs.umd.edu Project 6 due next Friday, 12/10 public tests posted

### Section 1.4 Mathematics on the Computer: Floating Point Arithmetic

Section 1.4 Mathematics on the Computer: Floating Point Arithmetic Key terms Floating point arithmetic IEE Standard Mantissa Exponent Roundoff error Pitfalls of floating point arithmetic Structuring computations

### Inf2C - Computer Systems Lecture 2 Data Representation

Inf2C - Computer Systems Lecture 2 Data Representation Boris Grot School of Informatics University of Edinburgh Last lecture Moore s law Types of computer systems Computer components Computer system stack

### Floating point. Today! IEEE Floating Point Standard! Rounding! Floating Point Operations! Mathematical properties. Next time. !

Floating point Today! IEEE Floating Point Standard! Rounding! Floating Point Operations! Mathematical properties Next time! The machine model Chris Riesbeck, Fall 2011 Checkpoint IEEE Floating point Floating

### CS321. Introduction to Numerical Methods

CS31 Introduction to Numerical Methods Lecture 1 Number Representations and Errors Professor Jun Zhang Department of Computer Science University of Kentucky Lexington, KY 40506 0633 August 5, 017 Number

### Floating Point Puzzles The course that gives CMU its Zip! Floating Point Jan 22, IEEE Floating Point. Fractional Binary Numbers.

class04.ppt 15-213 The course that gives CMU its Zip! Topics Floating Point Jan 22, 2004 IEEE Floating Point Standard Rounding Floating Point Operations Mathematical properties Floating Point Puzzles For

### CREATING FLOATING POINT VALUES IN MIL-STD-1750A 32 AND 48 BIT FORMATS: ISSUES AND ALGORITHMS

CREATING FLOATING POINT VALUES IN MIL-STD-1750A 32 AND 48 BIT FORMATS: ISSUES AND ALGORITHMS Jeffrey B. Mitchell L3 Communications, Telemetry & Instrumentation Division Storm Control Systems ABSTRACT Experimentation

### Data Representation Floating Point

Data Representation Floating Point CSCI 2400 / ECE 3217: Computer Architecture Instructor: David Ferry Slides adapted from Bryant & O Hallaron s slides via Jason Fritts Today: Floating Point Background:

### Appendix. Numbering Systems. In this Appendix

Numbering Systems ppendix n this ppendix ntroduction... inary Numbering System... exadecimal Numbering System... Octal Numbering System... inary oded ecimal () Numbering System... 5 Real (Floating Point)

### 2 Computation with Floating-Point Numbers

2 Computation with Floating-Point Numbers 2.1 Floating-Point Representation The notion of real numbers in mathematics is convenient for hand computations and formula manipulations. However, real numbers

### Significant Figures & Scientific Notation

Significant Figures & Scientific Notation Measurements are important in science (particularly chemistry!) Quantity that contains both a number and a unit Must be able to say how correct a measurement is

### Learning the Binary System

Learning the Binary System www.brainlubeonline.com/counting_on_binary/ Formated to L A TEX: /25/22 Abstract This is a document on the base-2 abstract numerical system, or Binary system. This is a VERY

### IEEE Standard 754 Floating Point Numbers

IEEE Standard 754 Floating Point Numbers Steve Hollasch / Last update 2005-Feb-24 IEEE Standard 754 floating point is the most common representation today for real numbers on computers, including Intel-based

### Floating point. Today. IEEE Floating Point Standard Rounding Floating Point Operations Mathematical properties Next time.

Floating point Today IEEE Floating Point Standard Rounding Floating Point Operations Mathematical properties Next time The machine model Fabián E. Bustamante, Spring 2010 IEEE Floating point Floating point

### System Programming CISC 360. Floating Point September 16, 2008

System Programming CISC 360 Floating Point September 16, 2008 Topics IEEE Floating Point Standard Rounding Floating Point Operations Mathematical properties Powerpoint Lecture Notes for Computer Systems:

### CHAPTER 2 Data Representation in Computer Systems

CHAPTER 2 Data Representation in Computer Systems 2.1 Introduction 37 2.2 Positional Numbering Systems 38 2.3 Decimal to Binary Conversions 38 2.3.1 Converting Unsigned Whole Numbers 39 2.3.2 Converting

### CHAPTER 2 Data Representation in Computer Systems

CHAPTER 2 Data Representation in Computer Systems 2.1 Introduction 37 2.2 Positional Numbering Systems 38 2.3 Decimal to Binary Conversions 38 2.3.1 Converting Unsigned Whole Numbers 39 2.3.2 Converting

### Floating Point Arithmetic

Floating Point Arithmetic CS 365 Floating-Point What can be represented in N bits? Unsigned 0 to 2 N 2s Complement -2 N-1 to 2 N-1-1 But, what about? very large numbers? 9,349,398,989,787,762,244,859,087,678

### Digital Computers and Machine Representation of Data

Digital Computers and Machine Representation of Data K. Cooper 1 1 Department of Mathematics Washington State University 2013 Computers Machine computation requires a few ingredients: 1 A means of representing

### Systems I. Floating Point. Topics IEEE Floating Point Standard Rounding Floating Point Operations Mathematical properties

Systems I Floating Point Topics IEEE Floating Point Standard Rounding Floating Point Operations Mathematical properties IEEE Floating Point IEEE Standard 754 Established in 1985 as uniform standard for

### Chapter 2 Bits, Data Types, and Operations

Chapter 2 Bits, Data Types, and Operations How do we represent data in a computer? At the lowest level, a computer is an electronic machine. works by controlling the flow of electrons Easy to recognize

### Description Hex M E V smallest value > largest denormalized negative infinity number with hex representation 3BB0 ---

CSE2421 HOMEWORK #2 DUE DATE: MONDAY 11/5 11:59pm PROBLEM 2.84 Given a floating-point format with a k-bit exponent and an n-bit fraction, write formulas for the exponent E, significand M, the fraction

### 10.1. Unit 10. Signed Representation Systems Binary Arithmetic

0. Unit 0 Signed Representation Systems Binary Arithmetic 0.2 BINARY REPRESENTATION SYSTEMS REVIEW 0.3 Interpreting Binary Strings Given a string of s and 0 s, you need to know the representation system

### IBM 370 Basic Data Types

IBM 370 Basic Data Types This lecture discusses the basic data types used on the IBM 370, 1. Two s complement binary numbers 2. EBCDIC (Extended Binary Coded Decimal Interchange Code) 3. Zoned Decimal

### CS367 Test 1 Review Guide

CS367 Test 1 Review Guide This guide tries to revisit what topics we've covered, and also to briefly suggest/hint at types of questions that might show up on the test. Anything on slides, assigned reading,

### 4.1 QUANTIZATION NOISE

DIGITAL SIGNAL PROCESSING UNIT IV FINITE WORD LENGTH EFFECTS Contents : 4.1 Quantization Noise 4.2 Fixed Point and Floating Point Number Representation 4.3 Truncation and Rounding 4.4 Quantization Noise

### Floating-Point Numbers in Digital Computers

POLYTECHNIC UNIVERSITY Department of Computer and Information Science Floating-Point Numbers in Digital Computers K. Ming Leung Abstract: We explain how floating-point numbers are represented and stored

### But first, encode deck of cards. Integer Representation. Two possible representations. Two better representations WELLESLEY CS 240 9/8/15

Integer Representation Representation of integers: unsigned and signed Sign extension Arithmetic and shifting Casting But first, encode deck of cards. cards in suits How do we encode suits, face cards?

### Fundamentals of Programming Session 8

Fundamentals of Programming Session 8 Instructor: Reza Entezari-Maleki Email: entezari@ce.sharif.edu 1 Fall 2013 These slides have been created using Deitel s slides Sharif University of Technology Outlines

### Scientific Computing. Error Analysis

ECE257 Numerical Methods and Scientific Computing Error Analysis Today s s class: Introduction to error analysis Approximations Round-Off Errors Introduction Error is the difference between the exact solution

### More Programming Constructs -- Introduction

More Programming Constructs -- Introduction We can now examine some additional programming concepts and constructs Chapter 5 focuses on: internal data representation conversions between one data type and

### Logic, Words, and Integers

Computer Science 52 Logic, Words, and Integers 1 Words and Data The basic unit of information in a computer is the bit; it is simply a quantity that takes one of two values, 0 or 1. A sequence of k bits

### 8/30/2016. In Binary, We Have A Binary Point. ECE 120: Introduction to Computing. Fixed-Point Representations Support Fractions

University of Illinois at Urbana-Champaign Dept. of Electrical and Computer Engineering ECE 120: Introduction to Computing Fixed- and Floating-Point Representations In Binary, We Have A Binary Point Let

### How & Why We Subnet Lab Workbook

i How & Why We Subnet Lab Workbook ii CertificationKits.com How & Why We Subnet Workbook Copyright 2013 CertificationKits LLC All rights reserved. No part of this book maybe be reproduced or transmitted

### Floating Point. CSC207 Fall 2017

Floating Point CSC207 Fall 2017 Ariane 5 Rocket Launch Ariane 5 rocket explosion In 1996, the European Space Agency s Ariane 5 rocket exploded 40 seconds after launch. During conversion of a 64-bit to

### 15213 Recitation 2: Floating Point

15213 Recitation 2: Floating Point 1 Introduction This handout will introduce and test your knowledge of the floating point representation of real numbers, as defined by the IEEE standard. This information

### IEEE-754 floating-point

IEEE-754 floating-point Real and floating-point numbers Real numbers R form a continuum - Rational numbers are a subset of the reals - Some numbers are irrational, e.g. π Floating-point numbers are an

### How to Do Word Problems. Building the Foundation

Building the Foundation The notion that Mathematics is a language, is held by many mathematicians and is being expressed on frequent occasions. Mathematics is the language of science. It is unique among

### Binary Addition & Subtraction. Unsigned and Sign & Magnitude numbers

Binary Addition & Subtraction Unsigned and Sign & Magnitude numbers Addition and subtraction of unsigned or sign & magnitude binary numbers by hand proceeds exactly as with decimal numbers. (In fact this

### Computer Numbers and their Precision, I Number Storage

Computer Numbers and their Precision, I Number Storage Learning goal: To understand how the ways computers store numbers lead to limited precision and how that introduces errors into calculations. Learning

### Binary the Digital Language

College of DuPage DigitalCommons@C.O.D. Computer and Internetworking Technologies (CIT) Scholarship Computer Science 7-1-2012 Binary the Digital Language Clyde Cox College of DuPage, coxclyde@cod.edu Follow

### 3.5 Floating Point: Overview

3.5 Floating Point: Overview Floating point (FP) numbers Scientific notation Decimal scientific notation Binary scientific notation IEEE 754 FP Standard Floating point representation inside a computer

### Numbers and Representations

Çetin Kaya Koç http://koclab.cs.ucsb.edu/teaching/cs192 koc@cs.ucsb.edu Çetin Kaya Koç http://koclab.cs.ucsb.edu Fall 2016 1 / 38 Outline Computational Thinking Representations of integers Binary and decimal

### Signed Binary Numbers

Signed Binary Numbers Unsigned Binary Numbers We write numbers with as many digits as we need: 0, 99, 65536, 15000, 1979, However, memory locations and CPU registers always hold a constant, fixed number

### On a 64-bit CPU. Size/Range vary by CPU model and Word size.

On a 64-bit CPU. Size/Range vary by CPU model and Word size. unsigned short x; //range 0 to 65553 signed short x; //range ± 32767 short x; //assumed signed There are (usually) no unsigned floats or doubles.

### Number Systems and Computer Arithmetic

Number Systems and Computer Arithmetic Counting to four billion two fingers at a time What do all those bits mean now? bits (011011011100010...01) instruction R-format I-format... integer data number text

### Binary, Hexadecimal and Octal number system

Binary, Hexadecimal and Octal number system Binary, hexadecimal, and octal refer to different number systems. The one that we typically use is called decimal. These number systems refer to the number of

### UNIT - I: COMPUTER ARITHMETIC, REGISTER TRANSFER LANGUAGE & MICROOPERATIONS

UNIT - I: COMPUTER ARITHMETIC, REGISTER TRANSFER LANGUAGE & MICROOPERATIONS (09 periods) Computer Arithmetic: Data Representation, Fixed Point Representation, Floating Point Representation, Addition and

### 1010 2?= ?= CS 64 Lecture 2 Data Representation. Decimal Numbers: Base 10. Reading: FLD Digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9

CS 64 Lecture 2 Data Representation Reading: FLD 1.2-1.4 Decimal Numbers: Base 10 Digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 Example: 3271 = (3x10 3 ) + (2x10 2 ) + (7x10 1 ) + (1x10 0 ) 1010 10?= 1010 2?= 1

### Floating Point Numbers

Floating Point Numbers Summer 8 Fractional numbers Fractional numbers fixed point Floating point numbers the IEEE 7 floating point standard Floating point operations Rounding modes CMPE Summer 8 Slides

### CSC201, SECTION 002, Fall 2000: Homework Assignment #2

1 of 7 11/8/2003 7:34 PM CSC201, SECTION 002, Fall 2000: Homework Assignment #2 DUE DATE Monday, October 2, at the start of class. INSTRUCTIONS FOR PREPARATION Neat, in order, answers easy to find. Staple

### 9/23/15. Agenda. Goals of this Lecture. For Your Amusement. Number Systems and Number Representation. The Binary Number System

For Your Amusement Number Systems and Number Representation Jennifer Rexford Question: Why do computer programmers confuse Christmas and Halloween? Answer: Because 25 Dec = 31 Oct -- http://www.electronicsweekly.com

### Data Types Literals, Variables & Constants

VISUAL BASIC Data Types Literals, Variables & Constants Copyright 2013 Dan McElroy Under the Hood As a DRIVER of an automobile, you may not need to know everything that happens under the hood, although

### IEEE Floating Point Numbers Overview

COMP 40: Machine Structure and Assembly Language Programming (Fall 2015) IEEE Floating Point Numbers Overview Noah Mendelsohn Tufts University Email: noah@cs.tufts.edu Web: http://www.cs.tufts.edu/~noah

### Creating Custom Financial Statements Using

Creating Custom Financial Statements Using Steve Collins Sage 50 Solution Provider scollins@iqacct.com 918-851-9713 www.iqaccountingsolutions.com Financial Statement Design Sage 50 Accounting s built in

### Data Representation and Introduction to Visualization

Data Representation and Introduction to Visualization Massimo Ricotti ricotti@astro.umd.edu University of Maryland Data Representation and Introduction to Visualization p.1/18 VISUALIZATION Visualization

### Crash Course in Modernization. A whitepaper from mrc

Crash Course in Modernization A whitepaper from mrc Introduction Modernization is a confusing subject for one main reason: It isn t the same across the board. Different vendors sell different forms of

### Representation of Numbers and Arithmetic in Signal Processors

Representation of Numbers and Arithmetic in Signal Processors 1. General facts Without having any information regarding the used consensus for representing binary numbers in a computer, no exact value

### 4.7 Approximate Integration

4.7 Approximate Integration Some anti-derivatives are difficult to impossible to find. For example, 1 0 e x2 dx or 1 1 1 + x3 dx We came across this situation back in calculus I when we introduced the

### ECE 2020B Fundamentals of Digital Design Spring problems, 6 pages Exam Two 26 February 2014

Instructions: This is a closed book, closed note exam. Calculators are not permitted. If you have a question, raise your hand and I will come to you. Please work the exam in pencil and do not separate

### BITWISE OPERATORS. There are a number of ways to manipulate binary values. Just as you can with

BITWISE OPERATORS There are a number of ways to manipulate binary values. Just as you can with decimal numbers, you can perform standard mathematical operations - addition, subtraction, multiplication,

### Calculations with Sig Figs

Calculations with Sig Figs When you make calculations using data with a specific level of uncertainty, it is important that you also report your answer with the appropriate level of uncertainty (i.e.,

### Microsoft Excel 2010 Handout

Microsoft Excel 2010 Handout Excel is an electronic spreadsheet program you can use to enter and organize data, and perform a wide variety of number crunching tasks. Excel helps you organize and track

### Computational Methods. Sources of Errors

Computational Methods Sources of Errors Manfred Huber 2011 1 Numerical Analysis / Scientific Computing Many problems in Science and Engineering can not be solved analytically on a computer Numeric solutions

### 4 Operations On Data 4.1. Foundations of Computer Science Cengage Learning

4 Operations On Data 4.1 Foundations of Computer Science Cengage Learning Objectives After studying this chapter, the student should be able to: List the three categories of operations performed on data.

### Lecture Notes: Floating-Point Numbers

Lecture Notes: Floating-Point Numbers CS227-Scientific Computing September 8, 2010 What this Lecture is About How computers represent numbers How this affects the accuracy of computation Positional Number

### C NUMERIC FORMATS. Overview. IEEE Single-Precision Floating-point Data Format. Figure C-0. Table C-0. Listing C-0.

C NUMERIC FORMATS Figure C-. Table C-. Listing C-. Overview The DSP supports the 32-bit single-precision floating-point data format defined in the IEEE Standard 754/854. In addition, the DSP supports an

### Co-processor Math Processor. Richa Upadhyay Prabhu. NMIMS s MPSTME February 9, 2016

8087 Math Processor Richa Upadhyay Prabhu NMIMS s MPSTME richa.upadhyay@nmims.edu February 9, 2016 Introduction Need of Math Processor: In application where fast calculation is required Also where there

### The Design and Implementation of a Rigorous. A Rigorous High Precision Floating Point Arithmetic. for Taylor Models

The and of a Rigorous High Precision Floating Point Arithmetic for Taylor Models Department of Physics, Michigan State University East Lansing, MI, 48824 4th International Workshop on Taylor Methods Boca

Advanced Reporting Tool The Advanced Reporting tool is designed to allow users to quickly and easily create new reports or modify existing reports for use in the Rewards system. The tool utilizes the Active

### Multiply Decimals Multiply # s, Ignore Decimals, Count # of Decimals, Place in Product from right counting in to left

Multiply Decimals Multiply # s, Ignore Decimals, Count # of Decimals, Place in Product from right counting in to left Dividing Decimals Quotient (answer to prob), Dividend (the # being subdivided) & Divisor

### 3.4 Equivalent Forms of Rational Numbers: Fractions, Decimals, Percents, and Scientific Notation

3.4 Equivalent Forms of Rational Numbers: Fractions, Decimals, Percents, and Scientific Notation We know that every rational number has an infinite number of equivalent fraction forms. For instance, 1/

### Lesson Plan. Preparation

Math Practicum in Information Technology Lesson Plan Performance Objective Upon completion of this lesson, each student will be able to convert between different numbering systems and correctly write mathematical

### Lecture 1. Course Overview, Python Basics

Lecture 1 Course Overview, Python Basics We Are Very Full! Lectures and Labs are at fire-code capacity We cannot add sections or seats to lectures You may have to wait until someone drops No auditors are

### Floating Point Numbers. Lecture 9 CAP

Floating Point Numbers Lecture 9 CAP 3103 06-16-2014 Review of Numbers Computers are made to deal with numbers What can we represent in N bits? 2 N things, and no more! They could be Unsigned integers:

### Digital Electronics A Practical Approach with VHDL William Kleitz Ninth Edition

Digital Electronics A Practical Approach with VHDL William Kleitz Ninth Edition Pearson Education Limited Edinburgh Gate Harlow Essex CM20 2JE England and Associated Companies throughout the world Visit

### Lab Using the Windows Calculator with Network Addresses

Objectives Part 1: Access the Windows Calculator Part 2: Convert between Numbering Systems Part 3: Convert Host IPv4 Addresses and Subnet Masks into Binary Part 4: Determine the Number of Hosts in a Network

### ECE 372 Microcontroller Design Assembly Programming Arrays. ECE 372 Microcontroller Design Assembly Programming Arrays

Assembly Programming Arrays Assembly Programming Arrays Array For Loop Example: unsigned short a[]; for(j=; j