Numerical Methods for Eng [ENGR 391] [Lyes KADEM 2007]


CHAPTER I: Approximations and Round-off Errors

I. Introduction

The concept of error is central to the effective use of numerical methods. When an analytical solution is available, we can compare the numerical result against it directly. However, when the analytical solution is not available (which is usually the case), we have to estimate the errors.

The first step in minimizing error is to simplify the problem and use simple formulations that can be solved analytically. Sometimes, however, the simplified results are far from reality, and more complex formulations are needed; as a consequence, they are more difficult, or impossible, to solve analytically. Solving these problems is then only possible with numerical methods. The drawback of numerical methods is that they yield approximate results. It is therefore important to develop criteria to determine whether our approximation of the solution is acceptable.

II. Accuracy and precision

The errors associated with computations or measurements can be characterized by their accuracy and their precision.

Accuracy: how closely a computed or measured value agrees with the true value.
Precision: how closely individual computed or measured values agree with each other.

[Figure 1.2: accuracy and precision. (a) inaccurate and imprecise; (b) accurate and imprecise; (c) inaccurate and precise; (d) accurate and precise.]

In engineering problems, we try to minimize both imprecision and inaccuracy.
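To make the distinction concrete, here is a minimal Python sketch (my own illustration, with made-up measurement values, not from the notes) that quantifies inaccuracy as the bias of the mean from the true value and imprecision as the spread of the values:

```python
import statistics

true_value = 10.0

# Two sets of repeated "measurements" of the same quantity (assumed values).
accurate_imprecise = [9.2, 10.9, 9.5, 10.4]    # centered on 10, widely scattered
precise_inaccurate = [8.51, 8.50, 8.49, 8.50]  # tightly clustered, biased low

for name, values in [("accurate but imprecise", accurate_imprecise),
                     ("precise but inaccurate", precise_inaccurate)]:
    bias = statistics.mean(values) - true_value   # inaccuracy
    spread = statistics.stdev(values)             # imprecision
    print(f"{name}: bias = {bias:+.3f}, spread = {spread:.3f}")
```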

III. Error definitions

The errors encountered in numerical methods can be classified into:

Truncation errors: errors due to the fact that we use an approximation to solve the problem instead of solving it analytically.
Round-off errors: errors that appear when numbers with a limited number of significant figures are used to represent exact numbers (examples: π, e, ...).

When considering the errors due to the use of a numerical method, the true value of the solution can be written as:

True value = approximation + error

Hence, the error can be computed as:

Error (E_t) = true value − approximation

E_t is the true error, since we are comparing the approximation with the true value. To take the magnitude of the quantities into account, it is preferable to normalize the error by the true value:

True fractional relative error = true error / true value

We can express this as a percentage:

ε_t = (true error / true value) × 100%

where ε_t is the true relative error.

An important point to notice is that the definition of the true error uses the true value of the solution. However, the true value is not always available, and we then have to compute an approximation of the error. For that, we normalize the error by the best available estimate of the true value:

ε_a = (approximate error / approximation) × 100%

In real life, however, the approximate error itself is not directly known. What is the solution? Since several numerical methods involve an iterative process, we define the error as:

ε_a = ((current approximation − previous approximation) / current approximation) × 100%
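To see how the iterative estimate ε_a drives a stopping test, here is a minimal Python sketch (my own illustration, not from the notes; the Maclaurin series of e^x is only an assumed example):

```python
import math

def exp_series(x, eps_s=0.05):
    """Approximate e**x by its Maclaurin series, stopping once the
    approximate percent relative error |eps_a| drops below eps_s (%)."""
    term, total = 1.0, 1.0          # first term of the series: x**0 / 0! = 1
    n = 0
    while True:
        n += 1
        term *= x / n               # next term: x**n / n!
        previous, total = total, total + term
        eps_a = abs((total - previous) / total) * 100.0
        if eps_a < eps_s:
            return total, n, eps_a

approx, terms, eps_a = exp_series(0.5)
eps_t = abs((math.exp(0.5) - approx) / math.exp(0.5)) * 100.0
print(f"e^0.5 ~ {approx:.7f} after {terms} terms "
      f"(eps_a = {eps_a:.4f}%, true eps_t = {eps_t:.4f}%)")
```

Note that ε_a is computable during the iteration, while ε_t requires the true value, which is exactly why the approximate estimate is needed in practice.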

You can notice from the above formulation that the error can be negative or positive. In practice, what matters most is that the absolute value of the error falls below a prescribed limit ε_s (this limit depends strongly on the application and on the acceptable computational time):

|ε_a| < ε_s

III.1. Round-off errors

These errors originate from the fact that computers retain only a fixed number of significant figures during a calculation. They are therefore directly related to the manner in which numbers are stored in a computer. Remember that instead of the decimal (base-10) number system we use, a computer uses the binary (base-2) system. Why? Because this corresponds to the on/off states of electronic components. In a 16-bit computer word, a number is stored as a sign bit followed by 15 bits for the magnitude:

1 0 1 0 1 1 0 1 0 0 1 1 1 0 1 0
sign | number

III.2. Floating-point representation

The floating-point representation is used to store fractional quantities. The number is expressed in the form:

m × b^e

where m is the mantissa, b is the base of the number system used, and e is the exponent. As an example, the number 156.76 could be represented as 0.15676 × 10^3 in a floating-point base-10 system.

Usually, for the storage of fractional quantities, the first bit is reserved for the sign, then come the bits of the signed exponent, and the last bits hold the mantissa. For optimal storage, leading zero digits in the mantissa are removed and absorbed into the exponent. Consider:

1/34 = 0.0294117...

With a four-digit mantissa, this would be stored as:

0.0294 × 10^0

Because of the zero before the 2, we lose the digit 1. A better storage is the normalized form:

0.2941 × 10^-1

Floating-point representation allows both fractions and very large numbers to be stored. However, this has a computational cost, since floating-point numbers take more time to process than integers, and a precision cost, since only a finite number of figures can be stored in the mantissa.
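A minimal Python sketch (illustrative, not from the notes; the helper name is my own) of this normalization step for a base-10 machine with a four-digit chopped mantissa:

```python
import math

def normalize(x, t=4):
    """Represent x as m * 10**e with 0.1 <= |m| < 1, keeping a
    t-digit mantissa by chopping (truncation, not rounding)."""
    if x == 0.0:
        return 0.0, 0
    e = math.floor(math.log10(abs(x))) + 1   # exponent so that 0.1 <= |m| < 1
    m = x / 10**e
    m = math.trunc(m * 10**t) / 10**t        # chop the mantissa to t digits
    return m, e

m, e = normalize(1/34)
print(f"1/34 = {1/34:.7f} is stored as {m} x 10^{e}")   # 0.2941 x 10^-1
```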

In the 64-bit IEEE 754 standard, the word is divided as follows:

sign: 1 bit | exponent: 11 bits | mantissa: 52 bits | total = 64 bits

III.3. Limited range of quantities that may be represented

Because the number of bits is limited, some very large or very small numbers cannot be represented. If you try to store a number outside this range, you will generate an overflow error.

How to deal with the problem of π?

π = 3.14159265358979...

To store it on a base-10 system carrying seven significant figures, we can omit the figures after the seventh:

π ≈ 3.141592

This is called chopping, and it generates an error of about 0.00000065. Alternatively, we can round using the eighth figure:

π ≈ 3.141593

which generates an error of about −0.00000035. Rounding therefore reduces the error.

III.4. Comparison between two numbers

When comparing two numbers, it is wiser to test that their difference is less than an acceptably small tolerance than to test for strict equality. If you want to test whether a = b, the best solution is to write in your program:

If |a − b| ≤ ε

The machine epsilon can be used as the tolerance ε; this ensures a certain portability of the code, since it does not depend on the storage characteristics of the machine used:

ε = b^(1−t)

where b is the base and t is the number of digits in the mantissa.

III.5. Extended precision

It is also possible to increase the accuracy of a computation by assigning double precision to the variables. In this case, about 15 to 16 decimal digits of precision and a range of approximately 10^−308 to 10^308 are available. However, this increases both the execution time and the memory required.

Note: In almost all engineering problems, the precision provided by computers is sufficient. Computers using the IEEE format allow 52 bits to be used for the mantissa.
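A short Python sketch (my own illustration; the chop helper is an assumed name) that reproduces the chopping-versus-rounding comparison for π and prints the machine epsilon of the double-precision format:

```python
import math
import sys

def chop(x, sig):
    """Keep sig significant decimal figures by truncation."""
    e = math.floor(math.log10(abs(x))) + 1
    return math.trunc(x * 10**(sig - e)) / 10**(sig - e)

pi_chopped = chop(math.pi, 7)          # 3.141592
pi_rounded = float(f"{math.pi:.6f}")   # rounded at the seventh figure: 3.141593

print(f"chopping error: {math.pi - pi_chopped:.8f}")   # ~ 0.00000065
print(f"rounding error: {math.pi - pi_rounded:.8f}")   # ~ -0.00000035

# Machine epsilon of IEEE 754 double precision: b**(1-t) = 2**(1-53)
print(f"machine epsilon: {sys.float_info.epsilon}")    # 2.220446049250313e-16
```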

IV. Arithmetic manipulation of computer numbers

Basic arithmetic operations such as addition, subtraction, and multiplication can lead to significant round-off errors.

- Addition

The mantissa of the number with the smaller exponent is shifted so that the two exponents match. Consider a computer with a 4-digit mantissa and a 1-digit exponent, and suppose chopping is used. Adding 0.1557 × 10^1 and 0.4381 × 10^-1 proceeds as follows:

0.4381 × 10^-1 → 0.004381 × 10^1 → chopped to 0.0043 × 10^1

  0.1557 × 10^1
+ 0.0043 × 10^1
---------------
  0.1600 × 10^1

- Subtraction

The same alignment happens for subtraction:

  0.3641 × 10^2
− 0.2686 × 10^2
---------------
  0.0955 × 10^2

Because of the zero just before the 9, the result is normalized:

0.0955 × 10^2 → 0.9550 × 10^1

Note that a zero is appended to fill the fourth digit of the mantissa.

- Multiplication

0.1363 × 10^3 × 0.6423 × 10^-1 = 0.08754549 × 10^2
→ normalization: 0.8754549 × 10^1
→ chopping: 0.8754 × 10^1

The errors produced by these arithmetic manipulations may seem negligible, but several engineering methods require an iterative process to find the solution. The computations are then interdependent, and this can lead to a dramatic growth of the round-off errors.
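The 4-digit chopping machine above can be simulated directly with Python's decimal module, which lets us set the working precision and select truncation (ROUND_DOWN) as the rounding mode. A minimal sketch, assuming the same three examples:

```python
from decimal import Decimal, getcontext, ROUND_DOWN

# Simulate a machine with a 4-significant-digit mantissa that chops.
getcontext().prec = 4
getcontext().rounding = ROUND_DOWN

print(Decimal("1.557") + Decimal("0.04381"))   # 1.600  (0.1600 x 10^1)
print(Decimal("36.41") - Decimal("26.86"))     # 9.55   (normalized: 0.9550 x 10^1)
print(Decimal("136.3") * Decimal("0.06423"))   # 8.754  (0.8754 x 10^1)
```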

IV.1. Errors due to addition of large and small numbers

4000 + 0.0010 is computed as:

  0.4000 × 10^4
+ 0.0000001 × 10^4
------------------
  0.4000001 × 10^4 → chopping → 0.4000 × 10^4

The small number is completely ignored. This kind of problem usually occurs in the computation of infinite series whose first terms are large. To avoid it, compute the series in ascending order (smallest terms first).

IV.2. Subtractive cancellation

This error occurs when we subtract two nearly equal floating-point numbers; for example, calculate √9.01 − 3 on a 3-decimal-digit computer. To limit subtractive cancellation, use double precision (e.g. the function double(X) in Scilab or Matlab). A sketch illustrating both effects appears after the Additional information note below.

Single precision [32 bits]
- 24-bit mantissa (the first bit is assumed equal to 1 and not stored, so 23 bits are stored).
- 8-bit signed exponent.

Double precision [64 bits, IEEE 754]
- 53-bit mantissa (the leading bit is implicit, so 52 bits are stored; see the layout in section III.2).
- 11-bit signed exponent.

Additional information

On June 4, 1996, an unmanned Ariane 5 rocket launched by the European Space Agency exploded just forty seconds after lift-off from Kourou, French Guiana. The rocket was on its first voyage, after a decade of development costing $7 billion. The destroyed rocket and its cargo were valued at $500 million. A board of inquiry investigated the causes of the explosion and issued a report within two weeks. It turned out that the cause of the failure was a software error in the inertial reference system. Specifically, a 64-bit floating-point number giving the horizontal velocity of the rocket with respect to the platform was converted to a 16-bit signed integer. The number was larger than 32,767, the largest integer storable in a 16-bit signed integer, and thus the conversion failed.
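As promised above, a minimal Python sketch (my own illustration, with assumed example values) of both effects: summing single-precision terms in descending versus ascending order, and the cancellation in √9.01 − 3:

```python
import math
import numpy as np

# IV.1: summation order. In single precision (float32), each term 1e-4 is
# below half a unit in the last place of 4000, so adding the large term
# first throws every small term away; ascending order keeps them.
terms = np.array([4000.0] + [1e-4] * 10000, dtype=np.float32)

descending = np.float32(0.0)
for t in terms:                       # large term first
    descending += t
ascending = np.float32(0.0)
for t in terms[::-1]:                 # smallest terms first
    ascending += t
print(f"descending: {descending}, ascending: {ascending}")  # exact sum: 4001.0

# IV.2: subtractive cancellation. On a 3-decimal-digit machine, sqrt(9.01)
# is stored as 3.00, so the subtraction returns 0; the true value is ~1.666e-3.
print(f"3-digit machine: {3.00 - 3:.6f}")                # 0.000000
print(f"double precision: {math.sqrt(9.01) - 3:.6f}")    # 0.001666
```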

Round-off error on Detroit Edison bills (Jim Rees)

Detroit Edison's residential electric bill has a section titled "Energy Use Report." This section reports incorrect numbers due to improper integer round-off. One of the fields gives the average daily energy use for the month in kilowatt-hours, rounded to the nearest integer value. Another field gives the percent change against the same month of the previous year; the percent change is calculated using the rounded value for energy use. This can result in large errors. For example, my February 2005 use was 11.68 kWh/day, compared to 11.21 the previous year. After rounding this becomes 12 compared to 11, and the change is reported on the bill as 9 percent (12/11 − 1) instead of the correct 4 percent (11.68/11.21 − 1). I wrote to Detroit Edison about this. Their only response was an offer to "assist [you] in understanding how the percentage... is calculated." Since I already know how it is calculated (incorrectly), I declined the offer.

Rounding error changes parliament makeup (Debora Weber-Wulff, 7 Apr 1992)

We experienced a shattering computer error during a German election this past Sunday (5 April). The elections to the parliament for the state of Schleswig-Holstein were affected.

German elections are quite complicated to calculate. First, there is the 5% clause: no party with less than 5% of the vote may be seated in parliament; all the votes for such a party are lost. Seats are distributed by direct vote and by list. All persons winning a precinct vote (i.e. having more votes than any other candidate in the precinct) are seated. Then a complicated system (often D'Hondt; newer systems are now in use) is invoked that seats persons from the party lists according to the proportion of the votes for each party. Often quite a number of extra seats (and office space and salaries) are necessary so that the seat distribution reflects the vote percentages each party got.

On Sunday the votes were being counted, and it looked like the Green party was hanging on by its teeth to a vote percentage of exactly 5%. This meant that the Social Democrats (SPD) could not have anyone from their list seated, which was most unfortunate, as their candidate for minister president was number one on the list, and the SPD won all precincts: no extra seats needed. After midnight (and after the election results were published), someone discovered that the Greens actually had only 4.97% of the vote. The program that prints out the percentages uses only one place after the decimal, and had *rounded the count up* to 5%! This software had been used for *years*, and no one had thought to turn off the rounding in this very critical (and, IMHO, very undemocratic) region. So the 4.97% of the votes were thrown away, the seats were recalculated, the SPD got to seat one person from its list, and now has a one-seat majority in the parliament. And the newspapers are clucking about the "computers" making such a mistake.