Floating-Point Arithmetic

Size: px
Start display at page:

Download "Floating-Point Arithmetic"

Transcription

1 Floating-Point Arithmetic ECS30 Winter 207 January 27, 207

2 Floating point numbers Floating-point representation of numbers (scientific notation) has four components, for example, sign significand base exponent Computers use binary (base 2) representation, each digit is the integer 0 or, e.g. 00 = = 22 in base 0 and 0 = = =

3 Floating point numbers The representation of a floating point number: x = ±b 0.b b 2 b p 2 E where It is normalized, i.e., b0 = (the hidden bit) Precision (= p) is the number of bits in the significand (mantissa) (including the hidden bit). Machine epsilon ɛm = 2 (p ), the gap between the number and the smallest floating point number that is greater than. Special numbers 0, 0, Inf =, Inf =, NaN = Not a Number.

4 IEEE standard All computers designed since 985 use the IEEE Standard for Binary Floating-Point Arithmetic (ANSI/IEEE Std ), represent each number as a binary number and use binary arithmetic. Essentials of the IEEE standard:. consistent representation of floating-point numbers by all machines adopting the standard; 2. correctly rounded floating-point operations, using various rounding modes; 3. consistent treatment of exceptional situation such as division by zero. Citation of 989 Turing Award to William Kahan:... his devotion to providing systems that can be safely and robustly used for numerical computations has impacted the lives of virtually anyone who will use a computer in the future

5 IEEE single precision format Single format takes 32 bits (=4 bytes) long: 8 23 s E f sign exponent binary point fraction It represents the number ( ) s (.f) 2 E 27 The leading in the fraction need not be stored explicitly since it is always (hidden bit) Precision p = 24 and machine epsilon ɛ m = The E 27 in the exponent is to avoid the need for storage of a sign bit. E min = ( ) 2 = () 0, E max = (0) 2 = (254) 0. The range of positive normalized numbers: N min = Emin 27 = N max =. 2 Emax Special repsentations for 0, ± and NaN.

6 IEEE double pecision format IEEE double format takes 64 bits (= 8 bytes) long: 52 s E f sign exponent binary point fraction It represents the numer ( ) s (.f) 2 E 023 Precision p = 53 and machine epsilon ɛ m = The range of positive normalized numbers is from N min = N max = Special repsentations for 0, ± and NaN.

7 Rounding modes Let a positive real number x be in the normalized range, i.e., N min x N max, and write in the normalized form x =.b b 2 b p b p b p E, Then the closest floating-pont number less than or equal to x is i.e., x is obtained by truncating. x =.b b 2 b p 2 E The next floating-point number bigger than x (also the next one that bigger than x) is x + = (.b b 2 b p ) 2 E If x is negative, the situtation is reversed.

8 rounding modes, cont d Four rounding modes:. round down: fl(x) = x 2. round up: fl(x) = x + 3. round towards zero: fl(x) = x of x 0 fl(x) = x + of x 0 4. round to nearest (IEEE default rounding mode): fl(x) = x or x + whichever is nearer to x. Note: except that if x > Nmax, fl(x) =, and if x < Nmax, fl(x) =. In the case of tie, i.e., x and x + are the same distance from x, the one with its least significant bit equal to zero is chosen.

9 Rounding error When the round to nearest is in effect, Therefore, we have relerr = relerr(x) = fl(x) x x 2 ɛ m = , single precision , double precision.

10 Floating-point arithmetic IEEE rules for correctly rounded floating-point operations: if x and y are correctly rounded floating-point numbers, then where for the round to nearest, fl(x + y) = (x + y)( + δ) fl(x y) = (x y)( + δ) fl(x y) = (x y)( + δ) fl(x/y) = (x/y)( + δ) δ 2 ɛ m IEEE standard also requires that correctly rounded remainder and square root operations be provided.

11 Floating-point arithmetic, cont d IEEE standard response to exceptions Event Example Set result to Invalid operation 0/0, 0 NaN Division by zero Finite nonzero/0 ± Overflow x > N max ± or ±N max underflow x 0, x < N min ±0, ±N min or subnormal Inexact whenever fl(x y) x y correctly rounded value

12 Floating-point arithmetic error Let ˆx and ŷ be the floating-point numbers and that ˆx = x( + τ ) and ŷ = y( + τ 2 ), for τ i τ where τ i could be the relative errors in the process of collecting/getting the data from the original source or the previous operations, and Question: how do the four basic arithmetic operations behave?

13 Floating-point arithmetic error: +, Addition and subtraction: fl(ˆx + ŷ) = (ˆx + ŷ)( + δ), δ 2 ɛm = x( + τ )( + δ) + y( + τ 2)( + δ) = x + y + x(τ + δ + O(τɛ m)) + y(τ 2 + δ + O(τɛ m)) ( = (x + y) + x (τ + δ + O(τɛm)) + y ) (τ2 + δ + O(τɛm)) x + y x + y (x + y)( + ˆδ), where ˆδ can be bounded as follows: ˆδ x + y x + y ( τ + ) 2 ɛ m + O(τɛ m ).

14 Floating-point arithmetic error: +, Three possible cases:. If x and y have the same sign, i.e., xy > 0, then x + y = x + y ; this implies ˆδ τ + 2 ɛ m + O(τɛ m ). Thus fl(ˆx + ŷ) approximates x + y well. 2. If x y x + y 0, then ( x + y )/ x + y ; this implies that ˆδ could be nearly or much bigger than. This is so called catastrophic cancellation, it causes relative errors or uncertainties already presented in ˆx and ŷ to be magnified. 3. In general, if ( x + y )/ x + y is not too big, fl(ˆx + ŷ) provides a good approximation to x + y.

15 Catastrophic cancellation: example Computing x + x straightforward causes substantial loss of significant digits for large n x fl( x + ) fl( x) fl(fl( x + ) fl( x).00e e e e-06.00e e e e-06.00e e e e-07.00e e e e-07.00e e e e-08.00e e e e-08.00e e e e+00

16 Catastrophic cancellation: example Catastrophic cancellation can sometimes be avoided if a formula is properly reformulated. For example, one can compute x + x almost to full precision by using the equality x + x = x + + x. Consequently, the computed results are n fl(/( n + + n)).00e e-06.00e e-06.00e e-07.00e e-07.00e e-08.00e e-08.00e e-09

17 Catastrophic cancellation: example 2 Consider the function h(x) = cos x x 2 Note that 0 f(x) < /2 for all x 0. Let x =.2 0 8, then the computed is completely wrong! fl(h(x)) = Alternatively, the function can be re-written as ( ) 2 sin(x/2) h(x) =. x/2 Consequently, for x =.2 0 8, then the computed function fl(h(x)) = < /2 is fine!

18 Floating-point arithmetic error:, / Multiplication and Division: fl(ˆx ŷ) = (ˆx ŷ)( + δ) = xy( + τ )( + τ 2 )( + δ) xy( + ˆδ ), fl(ˆx/ŷ) = (ˆx/ŷ)( + δ) = (x/y)( + τ )( + τ 2 ) ( + δ) xy( + ˆδ ), where ˆδ = τ + τ 2 + δ + O(τɛ m ) ˆδ = τ τ 2 + δ + O(τɛ m ). Thus ˆδ 2τ + 2 ɛ m + O(τɛ m ) and ˆδ 2τ + 2 ɛ m + O(τɛ m ). Multiplication and division are very well-behaved!

19 Reading Section.7 of Numerical Computing with MATLAB by C. Moler Websites discussions of numerical disasters: T. Huckle, Collection of software bugs huckle/bugse.html K. Vuik, Some disasters caused by numerical errors D. Arnold, Some disasters attributable to bad numerical computing arnold/disasters/disasters.html In-depth material: D. Goldberg, What every computer scientist should know about floating-point arithmetic, ACM Computing Survey, Vol.23(), pp.5-48, 99

20 LU factorization revisited: the need of pivoting in LU factorization, numerically LU factorization without pivoting [ ] [ ] [ A = = = LU = l 2 ] [ ] u u 2 u 22 In three decimal-digit floating-point arithmetic, we have [ ] [ ] 0 0 L = fl(/0 4 = ) 0 4, [ ] [ ] Û = fl( 0 4 = ) 0 4, Check: LÛ = [ ] [ ] [ = 0 ] A,

21 LU factorization revisited: the need of pivoting in LU factorization, numerically [ ] Consider solving Ax = for x using this LU factorization. 2 Solving Ly = [ 2 ] = ŷ = fl(/) = ŷ 2 = fl(2 0 4 ) = 0 4. note that the value 2 has been lost by subtracting 0 4 from it. Solving [ ] Ûx = ŷ = 0 4 = x2 = fl(( 04 )/( 0 4 )) = x = fl(( )/0 4 ) = 0, [ ] [ ] x 0 x = = is a completely erroneous solution to the correct x 2 [ ] answer x.

22 LU factorization revisited: the need of pivoting in LU factorization, numerically LU factorization with partial pivoting P A = LU, By exchanging the order of the rows of A, i.e., [ ] 0 P =, 0 Then for the LU factorization of P A is given by [ ] [ ] 0 0 L = fl(0 4 = /) 0 4, [ ] [ ] Û = fl( 0 4 =. ) The computed LU approximates A very accurately. As a result, the computed solution x is also perfect!

Computer Arithmetic. 1. Floating-point representation of numbers (scientific notation) has four components, for example, 3.

Computer Arithmetic. 1. Floating-point representation of numbers (scientific notation) has four components, for example, 3. ECS231 Handout Computer Arithmetic I: Floating-point numbers and representations 1. Floating-point representation of numbers (scientific notation) has four components, for example, 3.1416 10 1 sign significandbase

More information

Binary floating point encodings

Binary floating point encodings Week 1: Wednesday, Jan 25 Binary floating point encodings Binary floating point arithmetic is essentially scientific notation. Where in decimal scientific notation we write in floating point, we write

More information

Classes of Real Numbers 1/2. The Real Line

Classes of Real Numbers 1/2. The Real Line Classes of Real Numbers All real numbers can be represented by a line: 1/2 π 1 0 1 2 3 4 real numbers The Real Line { integers rational numbers non-integral fractions irrational numbers Rational numbers

More information

2 Computation with Floating-Point Numbers

2 Computation with Floating-Point Numbers 2 Computation with Floating-Point Numbers 2.1 Floating-Point Representation The notion of real numbers in mathematics is convenient for hand computations and formula manipulations. However, real numbers

More information

CSCI 402: Computer Architectures. Arithmetic for Computers (3) Fengguang Song Department of Computer & Information Science IUPUI.

CSCI 402: Computer Architectures. Arithmetic for Computers (3) Fengguang Song Department of Computer & Information Science IUPUI. CSCI 402: Computer Architectures Arithmetic for Computers (3) Fengguang Song Department of Computer & Information Science IUPUI 3.5 Today s Contents Floating point numbers: 2.5, 10.1, 100.2, etc.. How

More information

Floating-point representation

Floating-point representation Lecture 3-4: Floating-point representation and arithmetic Floating-point representation The notion of real numbers in mathematics is convenient for hand computations and formula manipulations. However,

More information

2 Computation with Floating-Point Numbers

2 Computation with Floating-Point Numbers 2 Computation with Floating-Point Numbers 2.1 Floating-Point Representation The notion of real numbers in mathematics is convenient for hand computations and formula manipulations. However, real numbers

More information

Floating Point Representation in Computers

Floating Point Representation in Computers Floating Point Representation in Computers Floating Point Numbers - What are they? Floating Point Representation Floating Point Operations Where Things can go wrong What are Floating Point Numbers? Any

More information

Floating-Point Numbers in Digital Computers

Floating-Point Numbers in Digital Computers POLYTECHNIC UNIVERSITY Department of Computer and Information Science Floating-Point Numbers in Digital Computers K. Ming Leung Abstract: We explain how floating-point numbers are represented and stored

More information

Scientific Computing: An Introductory Survey

Scientific Computing: An Introductory Survey Scientific Computing: An Introductory Survey Chapter 1 Scientific Computing Prof. Michael T. Heath Department of Computer Science University of Illinois at Urbana-Champaign Copyright c 2002. Reproduction

More information

Floating-Point Numbers in Digital Computers

Floating-Point Numbers in Digital Computers POLYTECHNIC UNIVERSITY Department of Computer and Information Science Floating-Point Numbers in Digital Computers K. Ming Leung Abstract: We explain how floating-point numbers are represented and stored

More information

MAT128A: Numerical Analysis Lecture Two: Finite Precision Arithmetic

MAT128A: Numerical Analysis Lecture Two: Finite Precision Arithmetic MAT128A: Numerical Analysis Lecture Two: Finite Precision Arithmetic September 28, 2018 Lecture 1 September 28, 2018 1 / 25 Floating point arithmetic Computers use finite strings of binary digits to represent

More information

Bindel, Fall 2016 Matrix Computations (CS 6210) Notes for

Bindel, Fall 2016 Matrix Computations (CS 6210) Notes for 1 Logistics Notes for 2016-09-07 1. We are still at 50. If you are still waiting and are not interested in knowing if a slot frees up, let me know. 2. There is a correction to HW 1, problem 4; the condition

More information

Roundoff Errors and Computer Arithmetic

Roundoff Errors and Computer Arithmetic Jim Lambers Math 105A Summer Session I 2003-04 Lecture 2 Notes These notes correspond to Section 1.2 in the text. Roundoff Errors and Computer Arithmetic In computing the solution to any mathematical problem,

More information

Mathematical preliminaries and error analysis

Mathematical preliminaries and error analysis Mathematical preliminaries and error analysis Tsung-Ming Huang Department of Mathematics National Taiwan Normal University, Taiwan August 28, 2011 Outline 1 Round-off errors and computer arithmetic IEEE

More information

Finite arithmetic and error analysis

Finite arithmetic and error analysis Finite arithmetic and error analysis Escuela de Ingeniería Informática de Oviedo (Dpto de Matemáticas-UniOvi) Numerical Computation Finite arithmetic and error analysis 1 / 45 Outline 1 Number representation:

More information

CS321. Introduction to Numerical Methods

CS321. Introduction to Numerical Methods CS31 Introduction to Numerical Methods Lecture 1 Number Representations and Errors Professor Jun Zhang Department of Computer Science University of Kentucky Lexington, KY 40506 0633 August 5, 017 Number

More information

Most nonzero floating-point numbers are normalized. This means they can be expressed as. x = ±(1 + f) 2 e. 0 f < 1

Most nonzero floating-point numbers are normalized. This means they can be expressed as. x = ±(1 + f) 2 e. 0 f < 1 Floating-Point Arithmetic Numerical Analysis uses floating-point arithmetic, but it is just one tool in numerical computation. There is an impression that floating point arithmetic is unpredictable and

More information

Computational Mathematics: Models, Methods and Analysis. Zhilin Li

Computational Mathematics: Models, Methods and Analysis. Zhilin Li Computational Mathematics: Models, Methods and Analysis Zhilin Li Chapter 1 Introduction Why is this course important (motivations)? What is the role of this class in the problem solving process using

More information

fractional quantities are typically represented in computers using floating point format this approach is very much similar to scientific notation

fractional quantities are typically represented in computers using floating point format this approach is very much similar to scientific notation Floating Point Arithmetic fractional quantities are typically represented in computers using floating point format this approach is very much similar to scientific notation for example, fixed point number

More information

Floating-Point Arithmetic

Floating-Point Arithmetic Floating-Point Arithmetic Raymond J. Spiteri Lecture Notes for CMPT 898: Numerical Software University of Saskatchewan January 9, 2013 Objectives Floating-point numbers Floating-point arithmetic Analysis

More information

Chapter 3. Errors and numerical stability

Chapter 3. Errors and numerical stability Chapter 3 Errors and numerical stability 1 Representation of numbers Binary system : micro-transistor in state off 0 on 1 Smallest amount of stored data bit Object in memory chain of 1 and 0 10011000110101001111010010100010

More information

Numerical Computing: An Introduction

Numerical Computing: An Introduction Numerical Computing: An Introduction Gyula Horváth Horvath@inf.u-szeged.hu Tom Verhoeff T.Verhoeff@TUE.NL University of Szeged Hungary Eindhoven University of Technology The Netherlands Numerical Computing

More information

Table : IEEE Single Format ± a a 2 a 3 :::a 8 b b 2 b 3 :::b 23 If exponent bitstring a :::a 8 is Then numerical value represented is ( ) 2 = (

Table : IEEE Single Format ± a a 2 a 3 :::a 8 b b 2 b 3 :::b 23 If exponent bitstring a :::a 8 is Then numerical value represented is ( ) 2 = ( Floating Point Numbers in Java by Michael L. Overton Virtually all modern computers follow the IEEE 2 floating point standard in their representation of floating point numbers. The Java programming language

More information

Divide: Paper & Pencil

Divide: Paper & Pencil Divide: Paper & Pencil 1001 Quotient Divisor 1000 1001010 Dividend -1000 10 101 1010 1000 10 Remainder See how big a number can be subtracted, creating quotient bit on each step Binary => 1 * divisor or

More information

Floating Point Numbers

Floating Point Numbers Floating Point Numbers Summer 8 Fractional numbers Fractional numbers fixed point Floating point numbers the IEEE 7 floating point standard Floating point operations Rounding modes CMPE Summer 8 Slides

More information

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Digital Computer Arithmetic ECE 666

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Digital Computer Arithmetic ECE 666 UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering Digital Computer Arithmetic ECE 666 Part 4-A Floating-Point Arithmetic Israel Koren ECE666/Koren Part.4a.1 Preliminaries - Representation

More information

Floating-point Arithmetic. where you sum up the integer to the left of the decimal point and the fraction to the right.

Floating-point Arithmetic. where you sum up the integer to the left of the decimal point and the fraction to the right. Floating-point Arithmetic Reading: pp. 312-328 Floating-Point Representation Non-scientific floating point numbers: A non-integer can be represented as: 2 4 2 3 2 2 2 1 2 0.2-1 2-2 2-3 2-4 where you sum

More information

1.2 Round-off Errors and Computer Arithmetic

1.2 Round-off Errors and Computer Arithmetic 1.2 Round-off Errors and Computer Arithmetic 1 In a computer model, a memory storage unit word is used to store a number. A word has only a finite number of bits. These facts imply: 1. Only a small set

More information

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Digital Computer Arithmetic ECE 666

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Digital Computer Arithmetic ECE 666 UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering Digital Computer Arithmetic ECE 666 Part 4-C Floating-Point Arithmetic - III Israel Koren ECE666/Koren Part.4c.1 Floating-Point Adders

More information

Floating Point Representation. CS Summer 2008 Jonathan Kaldor

Floating Point Representation. CS Summer 2008 Jonathan Kaldor Floating Point Representation CS3220 - Summer 2008 Jonathan Kaldor Floating Point Numbers Infinite supply of real numbers Requires infinite space to represent certain numbers We need to be able to represent

More information

Class 5. Data Representation and Introduction to Visualization

Class 5. Data Representation and Introduction to Visualization Class 5. Data Representation and Introduction to Visualization Visualization Visualization is useful for: 1. Data entry (initial conditions). 2. Code debugging and performance analysis. 3. Interpretation

More information

ECE232: Hardware Organization and Design

ECE232: Hardware Organization and Design ECE232: Hardware Organization and Design Lecture 11: Floating Point & Floating Point Addition Adapted from Computer Organization and Design, Patterson & Hennessy, UCB Last time: Single Precision Format

More information

Review of Calculus, cont d

Review of Calculus, cont d Jim Lambers MAT 460/560 Fall Semester 2009-10 Lecture 4 Notes These notes correspond to Sections 1.1 1.2 in the text. Review of Calculus, cont d Taylor s Theorem, cont d We conclude our discussion of Taylor

More information

Objectives. look at floating point representation in its basic form expose errors of a different form: rounding error highlight IEEE-754 standard

Objectives. look at floating point representation in its basic form expose errors of a different form: rounding error highlight IEEE-754 standard Floating Point Objectives look at floating point representation in its basic form expose errors of a different form: rounding error highlight IEEE-754 standard 1 Why this is important: Errors come in two

More information

Review Questions 26 CHAPTER 1. SCIENTIFIC COMPUTING

Review Questions 26 CHAPTER 1. SCIENTIFIC COMPUTING 26 CHAPTER 1. SCIENTIFIC COMPUTING amples. The IEEE floating-point standard can be found in [131]. A useful tutorial on floating-point arithmetic and the IEEE standard is [97]. Although it is no substitute

More information

CS321 Introduction To Numerical Methods

CS321 Introduction To Numerical Methods CS3 Introduction To Numerical Methods Fuhua (Frank) Cheng Department of Computer Science University of Kentucky Lexington KY 456-46 - - Table of Contents Errors and Number Representations 3 Error Types

More information

Data Representation and Introduction to Visualization

Data Representation and Introduction to Visualization Data Representation and Introduction to Visualization Massimo Ricotti ricotti@astro.umd.edu University of Maryland Data Representation and Introduction to Visualization p.1/18 VISUALIZATION Visualization

More information

Introduction to floating point arithmetic

Introduction to floating point arithmetic Introduction to floating point arithmetic Matthias Petschow and Paolo Bientinesi AICES, RWTH Aachen petschow@aices.rwth-aachen.de October 24th, 2013 Aachen, Germany Matthias Petschow (AICES, RWTH Aachen)

More information

Math 340 Fall 2014, Victor Matveev. Binary system, round-off errors, loss of significance, and double precision accuracy.

Math 340 Fall 2014, Victor Matveev. Binary system, round-off errors, loss of significance, and double precision accuracy. Math 340 Fall 2014, Victor Matveev Binary system, round-off errors, loss of significance, and double precision accuracy. 1. Bits and the binary number system A bit is one digit in a binary representation

More information

Floating-point representations

Floating-point representations Lecture 10 Floating-point representations Methods of representing real numbers (1) 1. Fixed-point number system limited range and/or limited precision results must be scaled 100101010 1111010 100101010.1111010

More information

Floating-point representations

Floating-point representations Lecture 10 Floating-point representations Methods of representing real numbers (1) 1. Fixed-point number system limited range and/or limited precision results must be scaled 100101010 1111010 100101010.1111010

More information

Data Representation Floating Point

Data Representation Floating Point Data Representation Floating Point CSCI 2400 / ECE 3217: Computer Architecture Instructor: David Ferry Slides adapted from Bryant & O Hallaron s slides via Jason Fritts Today: Floating Point Background:

More information

Up next. Midterm. Today s lecture. To follow

Up next. Midterm. Today s lecture. To follow Up next Midterm Next Friday in class Exams page on web site has info + practice problems Excited for you to rock the exams like you have been the assignments! Today s lecture Back to numbers, bits, data

More information

Floating-Point Data Representation and Manipulation 198:231 Introduction to Computer Organization Lecture 3

Floating-Point Data Representation and Manipulation 198:231 Introduction to Computer Organization Lecture 3 Floating-Point Data Representation and Manipulation 198:231 Introduction to Computer Organization Instructor: Nicole Hynes nicole.hynes@rutgers.edu 1 Fixed Point Numbers Fixed point number: integer part

More information

Data Representation Floating Point

Data Representation Floating Point Data Representation Floating Point CSCI 2400 / ECE 3217: Computer Architecture Instructor: David Ferry Slides adapted from Bryant & O Hallaron s slides via Jason Fritts Today: Floating Point Background:

More information

Computational Methods CMSC/AMSC/MAPL 460. Representing numbers in floating point Error Analysis. Ramani Duraiswami, Dept. of Computer Science

Computational Methods CMSC/AMSC/MAPL 460. Representing numbers in floating point Error Analysis. Ramani Duraiswami, Dept. of Computer Science Computational Methods CMSC/AMSC/MAPL 460 Representing numbers in floating point Error Analysis Ramani Duraiswami, Dept. of Computer Science Class Outline Recap of floating point representation Matlab demos

More information

Scientific Computing: An Introductory Survey

Scientific Computing: An Introductory Survey Scientific Computing: An Introductory Survey Chapter 1 Scientific Computing Prof. Michael T. Heath Department of Computer Science University of Illinois at Urbana-Champaign Copyright c 2002. Reproduction

More information

Floating Point Arithmetic

Floating Point Arithmetic Floating Point Arithmetic CS 365 Floating-Point What can be represented in N bits? Unsigned 0 to 2 N 2s Complement -2 N-1 to 2 N-1-1 But, what about? very large numbers? 9,349,398,989,787,762,244,859,087,678

More information

Lecture Notes to Accompany. Scientific Computing An Introductory Survey. What is scientific computing?

Lecture Notes to Accompany. Scientific Computing An Introductory Survey. What is scientific computing? Lecture Notes to Accompany Scientific Computing An Introductory Survey Second Edition by Michael T. Heath Scientific Computing What is scientific computing? Design and analysis of algorithms for solving

More information

Computational Methods. Sources of Errors

Computational Methods. Sources of Errors Computational Methods Sources of Errors Manfred Huber 2011 1 Numerical Analysis / Scientific Computing Many problems in Science and Engineering can not be solved analytically on a computer Numeric solutions

More information

Lecture 10. Floating point arithmetic GPUs in perspective

Lecture 10. Floating point arithmetic GPUs in perspective Lecture 10 Floating point arithmetic GPUs in perspective Announcements Interactive use on Forge Trestles accounts? A4 2012 Scott B. Baden /CSE 260/ Winter 2012 2 Today s lecture Floating point arithmetic

More information

The Perils of Floating Point

The Perils of Floating Point The Perils of Floating Point by Bruce M. Bush Copyright (c) 1996 Lahey Computer Systems, Inc. Permission to copy is granted with acknowledgement of the source. Many great engineering and scientific advances

More information

What Every Programmer Should Know About Floating-Point Arithmetic

What Every Programmer Should Know About Floating-Point Arithmetic What Every Programmer Should Know About Floating-Point Arithmetic Last updated: October 15, 2015 Contents 1 Why don t my numbers add up? 3 2 Basic Answers 3 2.1 Why don t my numbers, like 0.1 + 0.2 add

More information

Hani Mehrpouyan, California State University, Bakersfield. Signals and Systems

Hani Mehrpouyan, California State University, Bakersfield. Signals and Systems Hani Mehrpouyan, Department of Electrical and Computer Engineering, California State University, Bakersfield Lecture 3 (Error and Computer Arithmetic) April 8 th, 2013 The material in these lectures is

More information

Floating Point Numbers. Lecture 9 CAP

Floating Point Numbers. Lecture 9 CAP Floating Point Numbers Lecture 9 CAP 3103 06-16-2014 Review of Numbers Computers are made to deal with numbers What can we represent in N bits? 2 N things, and no more! They could be Unsigned integers:

More information

ME 261: Numerical Analysis. ME 261: Numerical Analysis

ME 261: Numerical Analysis. ME 261: Numerical Analysis ME 261: Numerical Analysis 3. credit hours Prereq.: ME 163/ME 171 Course content Approximations and error types Roots of polynomials and transcendental equations Determinants and matrices Solution of linear

More information

2.1.1 Fixed-Point (or Integer) Arithmetic

2.1.1 Fixed-Point (or Integer) Arithmetic x = approximation to true value x error = x x, relative error = x x. x 2.1.1 Fixed-Point (or Integer) Arithmetic A base 2 (base 10) fixed-point number has a fixed number of binary (decimal) places. 1.

More information

Numerical Methods 5633

Numerical Methods 5633 Numerical Methods 5633 Lecture 2 Marina Krstic Marinkovic mmarina@maths.tcd.ie School of Mathematics Trinity College Dublin Marina Krstic Marinkovic 1 / 15 5633-Numerical Methods Organisational Assignment

More information

Floating-point numbers. Phys 420/580 Lecture 6

Floating-point numbers. Phys 420/580 Lecture 6 Floating-point numbers Phys 420/580 Lecture 6 Random walk CA Activate a single cell at site i = 0 For all subsequent times steps, let the active site wander to i := i ± 1 with equal probability Random

More information

EE878 Special Topics in VLSI. Computer Arithmetic for Digital Signal Processing

EE878 Special Topics in VLSI. Computer Arithmetic for Digital Signal Processing EE878 Special Topics in VLSI Computer Arithmetic for Digital Signal Processing Part 4-B Floating-Point Arithmetic - II Spring 2017 Koren Part.4b.1 The IEEE Floating-Point Standard Four formats for floating-point

More information

Section 1.4 Mathematics on the Computer: Floating Point Arithmetic

Section 1.4 Mathematics on the Computer: Floating Point Arithmetic Section 1.4 Mathematics on the Computer: Floating Point Arithmetic Key terms Floating point arithmetic IEE Standard Mantissa Exponent Roundoff error Pitfalls of floating point arithmetic Structuring computations

More information

Computer arithmetics: integers, binary floating-point, and decimal floating-point

Computer arithmetics: integers, binary floating-point, and decimal floating-point n!= 0 && -n == n z+1 == z Computer arithmetics: integers, binary floating-point, and decimal floating-point v+w-w!= v x+1 < x Peter Sestoft 2010-02-16 y!= y p == n && 1/p!= 1/n 1 Computer arithmetics Computer

More information

Lecture 13: (Integer Multiplication and Division) FLOATING POINT NUMBERS

Lecture 13: (Integer Multiplication and Division) FLOATING POINT NUMBERS Lecture 13: (Integer Multiplication and Division) FLOATING POINT NUMBERS Lecture 13 Floating Point I (1) Fall 2005 Integer Multiplication (1/3) Paper and pencil example (unsigned): Multiplicand 1000 8

More information

CS 6210 Fall 2016 Bei Wang. Lecture 4 Floating Point Systems Continued

CS 6210 Fall 2016 Bei Wang. Lecture 4 Floating Point Systems Continued CS 6210 Fall 2016 Bei Wang Lecture 4 Floating Point Systems Continued Take home message 1. Floating point rounding 2. Rounding unit 3. 64 bit word: double precision (IEEE standard word) 4. Exact rounding

More information

ecture 25 Floating Point Friedland and Weaver Computer Science 61C Spring 2017 March 17th, 2017

ecture 25 Floating Point Friedland and Weaver Computer Science 61C Spring 2017 March 17th, 2017 ecture 25 Computer Science 61C Spring 2017 March 17th, 2017 Floating Point 1 New-School Machine Structures (It s a bit more complicated!) Software Hardware Parallel Requests Assigned to computer e.g.,

More information

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Digital Computer Arithmetic ECE 666

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Digital Computer Arithmetic ECE 666 UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering Digital Computer Arithmetic ECE 666 Part 4-B Floating-Point Arithmetic - II Israel Koren ECE666/Koren Part.4b.1 The IEEE Floating-Point

More information

CS 61C: Great Ideas in Computer Architecture Floating Point Arithmetic

CS 61C: Great Ideas in Computer Architecture Floating Point Arithmetic CS 61C: Great Ideas in Computer Architecture Floating Point Arithmetic Instructors: Vladimir Stojanovic & Nicholas Weaver http://inst.eecs.berkeley.edu/~cs61c/ New-School Machine Structures (It s a bit

More information

Outline. 1 Scientific Computing. 2 Approximations. 3 Computer Arithmetic. Scientific Computing Approximations Computer Arithmetic

Outline. 1 Scientific Computing. 2 Approximations. 3 Computer Arithmetic. Scientific Computing Approximations Computer Arithmetic Outline 1 2 3 Michael T. Heath 2 / 46 Introduction Computational Problems General Strategy What is scientific computing? Design and analysis of algorithms for numerically solving mathematical problems

More information

Numerical computing. How computers store real numbers and the problems that result

Numerical computing. How computers store real numbers and the problems that result Numerical computing How computers store real numbers and the problems that result The scientific method Theory: Mathematical equations provide a description or model Experiment Inference from data Test

More information

Basics of Computation. PHY 604:Computational Methods in Physics and Astrophysics II

Basics of Computation. PHY 604:Computational Methods in Physics and Astrophysics II Basics of Computation Basics of Computation Computers store information and allow us to operate on it. That's basically it. Computers have finite memory, so it is not possible to store the infinite range

More information

Number Systems. Both numbers are positive

Number Systems. Both numbers are positive Number Systems Range of Numbers and Overflow When arithmetic operation such as Addition, Subtraction, Multiplication and Division are performed on numbers the results generated may exceed the range of

More information

Computer Architecture and IC Design Lab. Chapter 3 Part 2 Arithmetic for Computers Floating Point

Computer Architecture and IC Design Lab. Chapter 3 Part 2 Arithmetic for Computers Floating Point Chapter 3 Part 2 Arithmetic for Computers Floating Point Floating Point Representation for non integral numbers Including very small and very large numbers 4,600,000,000 or 4.6 x 10 9 0.0000000000000000000000000166

More information

Floating Point. The World is Not Just Integers. Programming languages support numbers with fraction

Floating Point. The World is Not Just Integers. Programming languages support numbers with fraction 1 Floating Point The World is Not Just Integers Programming languages support numbers with fraction Called floating-point numbers Examples: 3.14159265 (π) 2.71828 (e) 0.000000001 or 1.0 10 9 (seconds in

More information

Lecture 17 Floating point

Lecture 17 Floating point Lecture 17 Floating point What is floatingpoint? A representation ±2.5732 10 22 NaN Single, double, extended precision (80 bits) A set of operations + = * / rem Comparison < = > Conversions between different

More information

AMTH142 Lecture 10. Scilab Graphs Floating Point Arithmetic

AMTH142 Lecture 10. Scilab Graphs Floating Point Arithmetic AMTH142 Lecture 1 Scilab Graphs Floating Point Arithmetic April 2, 27 Contents 1.1 Graphs in Scilab......................... 2 1.1.1 Simple Graphs...................... 2 1.1.2 Line Styles........................

More information

What we need to know about error: Class Outline. Computational Methods CMSC/AMSC/MAPL 460. Errors in data and computation

What we need to know about error: Class Outline. Computational Methods CMSC/AMSC/MAPL 460. Errors in data and computation Class Outline Computational Methods CMSC/AMSC/MAPL 460 Errors in data and computation Representing numbers in floating point Ramani Duraiswami, Dept. of Computer Science Computations should be as accurate

More information

AM205: lecture 2. 1 These have been shifted to MD 323 for the rest of the semester.

AM205: lecture 2. 1 These have been shifted to MD 323 for the rest of the semester. AM205: lecture 2 Luna and Gary will hold a Python tutorial on Wednesday in 60 Oxford Street, Room 330 Assignment 1 will be posted this week Chris will hold office hours on Thursday (1:30pm 3:30pm, Pierce

More information

Floating Point (with contributions from Dr. Bin Ren, William & Mary Computer Science)

Floating Point (with contributions from Dr. Bin Ren, William & Mary Computer Science) Floating Point (with contributions from Dr. Bin Ren, William & Mary Computer Science) Floating Point Background: Fractional binary numbers IEEE floating point standard: Definition Example and properties

More information

Computational Methods CMSC/AMSC/MAPL 460. Representing numbers in floating point and associated issues. Ramani Duraiswami, Dept. of Computer Science

Computational Methods CMSC/AMSC/MAPL 460. Representing numbers in floating point and associated issues. Ramani Duraiswami, Dept. of Computer Science Computational Methods CMSC/AMSC/MAPL 460 Representing numbers in floating point and associated issues Ramani Duraiswami, Dept. of Computer Science Class Outline Computations should be as accurate and as

More information

Chapter 3: Arithmetic for Computers

Chapter 3: Arithmetic for Computers Chapter 3: Arithmetic for Computers Objectives Signed and Unsigned Numbers Addition and Subtraction Multiplication and Division Floating Point Computer Architecture CS 35101-002 2 The Binary Numbering

More information

Computer Representation of Numbers and Computer Arithmetic

Computer Representation of Numbers and Computer Arithmetic Computer Representation of Numbers and Computer Arithmetic January 17, 2017 Contents 1 Binary numbers 6 2 Memory 9 3 Characters 12 1 c A. Sandu, 1998-2015. Distribution outside classroom not permitted.

More information

Computer Systems C S Cynthia Lee

Computer Systems C S Cynthia Lee Computer Systems C S 1 0 7 Cynthia Lee 2 Today s Topics LECTURE: Floating point! Real Numbers and Approximation MATH TIME! Some preliminary observations on approximation We know that some non-integer numbers

More information

The ALU consists of combinational logic. Processes all data in the CPU. ALL von Neuman machines have an ALU loop.

The ALU consists of combinational logic. Processes all data in the CPU. ALL von Neuman machines have an ALU loop. CS 320 Ch 10 Computer Arithmetic The ALU consists of combinational logic. Processes all data in the CPU. ALL von Neuman machines have an ALU loop. Signed integers are typically represented in sign-magnitude

More information

Floating point numbers in Scilab

Floating point numbers in Scilab Floating point numbers in Scilab Michaël Baudin May 2011 Abstract This document is a small introduction to floating point numbers in Scilab. In the first part, we describe the theory of floating point

More information

FP_IEEE_DENORM_GET_ Procedure

FP_IEEE_DENORM_GET_ Procedure FP_IEEE_DENORM_GET_ Procedure FP_IEEE_DENORM_GET_ Procedure The FP_IEEE_DENORM_GET_ procedure reads the IEEE floating-point denormalization mode. fp_ieee_denorm FP_IEEE_DENORM_GET_ (void); DeNorm The denormalization

More information

Chapter 03: Computer Arithmetic. Lesson 09: Arithmetic using floating point numbers

Chapter 03: Computer Arithmetic. Lesson 09: Arithmetic using floating point numbers Chapter 03: Computer Arithmetic Lesson 09: Arithmetic using floating point numbers Objective To understand arithmetic operations in case of floating point numbers 2 Multiplication of Floating Point Numbers

More information

COSC 243. Data Representation 3. Lecture 3 - Data Representation 3 1. COSC 243 (Computer Architecture)

COSC 243. Data Representation 3. Lecture 3 - Data Representation 3 1. COSC 243 (Computer Architecture) COSC 243 Data Representation 3 Lecture 3 - Data Representation 3 1 Data Representation Test Material Lectures 1, 2, and 3 Tutorials 1b, 2a, and 2b During Tutorial a Next Week 12 th and 13 th March If you

More information

Computational Economics and Finance

Computational Economics and Finance Computational Economics and Finance Part I: Elementary Concepts of Numerical Analysis Spring 2015 Outline Computer arithmetic Error analysis: Sources of error Error propagation Controlling the error Rates

More information

Numeric Encodings Prof. James L. Frankel Harvard University

Numeric Encodings Prof. James L. Frankel Harvard University Numeric Encodings Prof. James L. Frankel Harvard University Version of 10:19 PM 12-Sep-2017 Copyright 2017, 2016 James L. Frankel. All rights reserved. Representation of Positive & Negative Integral and

More information

Computer arithmetics: integers, binary floating-point, and decimal floating-point

Computer arithmetics: integers, binary floating-point, and decimal floating-point z+1 == z Computer arithmetics: integers, binary floating-point, and decimal floating-point Peter Sestoft 2013-02-18** x+1 < x p == n && 1/p!= 1/n 1 Computer arithmetics Computer numbers are cleverly designed,

More information

Lecture Objectives. Structured Programming & an Introduction to Error. Review the basic good habits of programming

Lecture Objectives. Structured Programming & an Introduction to Error. Review the basic good habits of programming Structured Programming & an Introduction to Error Lecture Objectives Review the basic good habits of programming To understand basic concepts of error and error estimation as it applies to Numerical Methods

More information

Floating Point Arithmetic. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Floating Point Arithmetic. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University Floating Point Arithmetic Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Floating Point (1) Representation for non-integral numbers Including very

More information

Floating Point Arithmetic

Floating Point Arithmetic Floating Point Arithmetic Computer Systems, Section 2.4 Abstraction Anything that is not an integer can be thought of as . e.g. 391.1356 Or can be thought of as + /

More information

MATH 353 Engineering mathematics III

MATH 353 Engineering mathematics III MATH 353 Engineering mathematics III Instructor: Francisco-Javier Pancho Sayas Spring 2014 University of Delaware Instructor: Francisco-Javier Pancho Sayas MATH 353 1 / 20 MEET YOUR COMPUTER Instructor:

More information

Representing and Manipulating Floating Points. Jo, Heeseung

Representing and Manipulating Floating Points. Jo, Heeseung Representing and Manipulating Floating Points Jo, Heeseung The Problem How to represent fractional values with finite number of bits? 0.1 0.612 3.14159265358979323846264338327950288... 2 Fractional Binary

More information

Representing and Manipulating Floating Points

Representing and Manipulating Floating Points Representing and Manipulating Floating Points Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu The Problem How to represent fractional values with

More information

Floating Point Arithmetic

Floating Point Arithmetic Floating Point Arithmetic Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu EEE3050: Theory on Computer Architectures, Spring 2017, Jinkyu Jeong (jinkyu@skku.edu)

More information

Lecture Notes: Floating-Point Numbers

Lecture Notes: Floating-Point Numbers Lecture Notes: Floating-Point Numbers CS227-Scientific Computing September 8, 2010 What this Lecture is About How computers represent numbers How this affects the accuracy of computation Positional Number

More information

Floating-point operations I

Floating-point operations I Floating-point operations I The science of floating-point arithmetics IEEE standard Reference What every computer scientist should know about floating-point arithmetic, ACM computing survey, 1991 Chih-Jen

More information