Some issues related to double roundings

Erik Martin-Dorel (ENS Lyon), Guillaume Melquiond (Inria), Jean-Michel Muller (CNRS)
Valencia, June 2012

Introduction: floating-point arithmetic, very quickly

Definition. Assuming extremal exponents $e_{\min}$ and $e_{\max}$, a finite, precision-$p$, binary FP number $x$ is of the form

$$x = M \cdot 2^{e-p+1}, \qquad (1)$$

where $M$ and $e$ are integers such that

$$|M| \le 2^p - 1, \qquad e_{\min} \le e \le e_{\max}. \qquad (2)$$

- The largest $M$ (in magnitude) such that (1) and (2) hold is the integral significand of $x$.
- The corresponding value of $e$ (for $x \ne 0$) is the exponent of $x$.
- A subnormal number has $e = e_{\min}$ and $|M| < 2^{p-1}$ (subnormals are assumed available).

Introduction: IEEE 754 correctly rounded operations

Definition 1 (Correct rounding). The rounding function $\circ$ is chosen among:
- toward $-\infty$: RD(x) is the largest FP number $\le x$;
- toward $+\infty$: RU(x) is the smallest FP number $\ge x$;
- toward zero: RZ(x) is equal to RD(x) if $x \ge 0$, and to RU(x) if $x \le 0$;
- to nearest: RN(x) is the FP number closest to $x$. In case of a tie: the one whose integral significand is even (another tie-breaking rule: away from 0).

A correctly rounded operation $\top$ returns $\circ(a \top b)$ for all FP numbers $a$ and $b$.

Introduction: IEEE 754 correctly rounded operations (continued)

IEEE 754-1985: correct rounding for $+$, $-$, $\times$, $\div$, $\sqrt{\ }$ and some conversions. Advantages:
- if the result of an operation is exactly representable, we get it;
- if we just use the 4 arithmetic operations and $\sqrt{\ }$, the arithmetic is deterministic: algorithms and proofs can rely on the specifications;
- accuracy and portability are improved;
- ...

FP arithmetic becomes a structure in itself, which can be studied.

A few elementary algorithms and properties. First example: Sterbenz Lemma

Lemma 2 (Sterbenz). Let $a$ and $b$ be positive FP numbers. If $a/2 \le b \le 2a$, then $a - b$ is a FP number (hence it is computed exactly, in any rounding mode).

Proof: straightforward using the notation $x = M \times 2^{e+1-p}$.

Error of rounded-to-nearest FP addition (Møller, Knuth, Dekker)

Reminder: RN(x) is $x$ rounded to nearest.

Lemma 3. Let $a$ and $b$ be two FP numbers. Let $s = \mathrm{RN}(a + b)$ and $r = (a + b) - s$. If no overflow occurs when computing $s$, then $r$ is a FP number.

In other words, the error of a FP addition is exactly representable by a FPN.

Error of FP addition: proof

Assume $|a| \ge |b|$.

1. $s$ is the FP number nearest $a + b$, so it is at least as close to $a + b$ as $a$ is. Hence $|(a + b) - s| \le |(a + b) - a|$, therefore $|r| \le |b|$.
2. Denote $a = M_a \times 2^{e_a-p+1}$ and $b = M_b \times 2^{e_b-p+1}$, with $|M_a|, |M_b| \le 2^p - 1$ ($M_a$ and $M_b$ chosen largest), and $e_a \ge e_b$.
   - $a + b$ is a multiple of $2^{e_b-p+1}$, so $s$ and $r$ are multiples of $2^{e_b-p+1}$ too: there exists $R \in \mathbb{Z}$ such that $r = R \times 2^{e_b-p+1}$;
   - but $|r| \le |b|$ implies $|R| \le |M_b| \le 2^p - 1$, so $r$ is a FP number.

Get r: the Fast2Sum algorithm (Dekker)

Theorem 4 (Fast2Sum (Dekker)). Assume subnormal numbers are available and no overflow occurs. Let $a$ and $b$ be FP numbers such that $e_a \ge e_b$. The following algorithm computes $s$ and $r$ such that:
- $s + r = a + b$ exactly;
- $s$ is the FP number that is closest to $a + b$.

Algorithm 1 (FastTwoSum)
    s ← RN(a + b)
    z ← RN(s − a)
    r ← RN(b − z)

C Program 1
    s = a+b;
    z = s-a;
    r = b-z;

Important remark: proving the behavior of such algorithms requires use of the correct rounding property.

The 2Sum algorithm (Knuth)

Does not require comparison of $a$ and $b$.

Algorithm 2 (2Sum(a, b))
    s ← RN(a + b)
    a' ← RN(s − b)
    b' ← RN(s − a')
    δ_a ← RN(a − a')
    δ_b ← RN(b − b')
    t ← RN(δ_a + δ_b)

If $a$ and $b$ are normal FPN, then $a + b = s + t$.

It's not that simple

So we do live in the best of all possible worlds...
- correct rounding implies deterministic arithmetic;
- we easily compute the error of a FP addition (by the way: same for $\times$);
- we can re-inject that error later on in a calculation, to compute accurate sums, dot-products...
- there are already many such algorithms, and probably more to come.

... except I'm a liar!

Double roundings and similar problems: deterministic arithmetic?

- Several FP formats coexist in a given environment, so it is difficult to know in which format some operations are performed;
- this may make the result of a sequence of operations difficult to predict.

Assume all declared variables are of the same format. Two phenomena may occur when a wider format is available:
- implicit variables, such as the result of a+b in d = (a+b)*c: it is not clear in which format they are computed;
- explicit variables may be first computed (hence rounded) in the wider format, and then rounded again to the destination format.

Double roundings and similar problems: deterministic arithmetic?

C program:

    #include <stdio.h>

    int main(void)
    {
        double a = 1848874847.0;
        double b = 19954562207.0;
        double c;
        c = a * b;
        printf("c = %20.19e\n", c);
        return 0;
    }

Depending on the environment, the program prints 3.6893488147419103232e+19 or 3.6893488147419111424e+19 (the latter being the binary64 number closest to the exact product).

Double roundings and similar problems: what happened?

Exact value of a*b: $36893488147419107329 = 2^{65} + 2^{12} + 1$. Its binary representation has 66 bits:

$$\underbrace{\underbrace{1\,00\cdots0}_{53\ \text{bits}}\;10000000000}_{64\ \text{bits}}\;01$$

If the product is first rounded to double-extended precision (64 bits), we get

$$\underbrace{\underbrace{1\,00\cdots0}_{53\ \text{bits}}\;10000000000}_{64\ \text{bits}} \times 4 = 2^{65} + 2^{12},$$

which is exactly halfway between two consecutive binary64 numbers. If that intermediate value is then rounded to the binary64 destination format, the tie goes to the even significand:

$$\underbrace{1\,00\cdots0}_{53\ \text{bits}} \times 2^{13} = 36893488147419103232.$$

The result has been rounded down, whereas it should have been rounded up.

Double roundings and similar problems: is it a problem?

- These phenomena are almost always innocuous (the error is very slightly above 1/2 ulp);
- they may make the behavior of some programs difficult to predict;
- most compilers offer options that prevent this problem. And yet,
  - needing these options restricts the portability of numerical programs;
  - they may have a bad impact on performance and/or accuracy.

Hence our goal: examine which properties remain true when double roundings occur.

Double roundings: known issues

- $\pm$, $\times$, $\div$, $\sqrt{\ }$: conditions on the precision of the wider format under which double roundings do not change the result (Kahan, Figueroa's PhD 2000);
- double roundings may cause a problem in binary to decimal conversions. Solutions given by Goldberg, and by Cornea et al.;
- double roundings may occur, even without an available wider format, when performing scaled division iterations to avoid overflow or underflow;
- rounding towards $\pm\infty$ or 0: double roundings do not change the result of a calculation, so we focus on round to nearest only.

Notation, background material and preliminary remarks: notation

- precision-$p$ target format, and precision-$(p + p')$ internal format;
- $\mathrm{RN}_k(u)$ means $u$ rounded to the nearest precision-$k$ FP number; when the precision is omitted, it is $p$;
- precision-$p$ midpoint: a number exactly halfway between two consecutive precision-$p$ FPN.

[Figure: number line showing x, the neighboring FPNs, a midpoint, and ulp(x)]

Throughout the presentation, we assume that no overflow occurs.

Double roundings and double rounding slips

When the arithmetic operation $x \top y$ appears in a program:
- double rounding: what is actually performed is $\mathrm{RN}_p(\mathrm{RN}_{p+p'}(x \top y))$;
- double rounding slip: a double rounding occurs and the obtained result differs from $\mathrm{RN}_p(x \top y)$.

Remark 1. If a double rounding slip occurs, the error of $a + b$ may not be a FPN.

Some preliminary remarks: with double rounding, the error of a + b may not be a FPN

Consider $a = 1\underbrace{xxxx\cdots x}_{p-3\ \text{bits}}01$, where $xxxx\cdots x$ is any $(p-3)$-bit chain. Also consider

$$b = 0.0\underbrace{111111\cdots1}_{p\ \text{ones}} = \frac{1}{2} - 2^{-p-1}.$$

We have

$$a + b = \underbrace{1xxxx\cdots x01}_{p\ \text{bits}}.0\underbrace{111111\cdots1}_{p\ \text{ones}},$$

so that if $1 \le p' \le p$,

$$u = \mathrm{RN}_{p+p'}(a + b) = 1xxxx\cdots x01.100\cdots0.$$

The round-to-nearest-even rule thus implies

$$s = \mathrm{RN}_p(u) = 1xxxx\cdots x10 = a + 1.$$

Therefore,

$$s - (a + b) = a + 1 - \left(a + \frac{1}{2} - 2^{-p-1}\right) = \frac{1}{2} + 2^{-p-1} = 0.\underbrace{10000\cdots01}_{p+1\ \text{bits}},$$

which is not exactly representable in precision-$p$ FP arithmetic.

A few preliminary remarks

Remark 2. If a double rounding slip occurs when evaluating $a \top b$, then $\mathrm{RN}_{p+p'}(a \top b)$ is a precision-$p$ midpoint, i.e., a number exactly halfway between two consecutive precision-$p$ FP numbers.

[Figure: number line showing x, RN_{p+p'}(x) and RN_p(RN_{p+p'}(x))]

A few preliminary remarks (continued)

Remark 3. Since the precision-$p$ FPNs are precision-$(p + p')$ FPNs, each time $a \top b$ is exactly representable in precision-$p$ arithmetic, we get it:

$$\mathrm{RN}_p(\mathrm{RN}_{p+p'}(a \top b)) = \mathrm{RN}_p(a \top b) = a \top b.$$

As a consequence, Sterbenz Lemma still holds in the presence of double roundings.

Remark 4. Let $a$ and $b$ be precision-$p$ FP numbers, and define $s = \mathrm{RN}_p(\mathrm{RN}_{p+p'}(a + b))$. Then $a + b - s$ fits in at most $p + 2$ bits, so that as soon as $p' \ge 2$, we have

$$\mathrm{RN}_p(\mathrm{RN}_{p+p'}(a + b - s)) = \mathrm{RN}_p(a + b - s). \qquad (3)$$

Fast2Sum and double roundings

Algorithm 3 (Fast2Sum-with-double-roundings(a, b))
    s ← RN_p[RN_{p+p'}(a + b)]
    z ← ∘(s − a)
    t ← RN_p[RN_{p+p'}(b − z)]

Here ∘(u) stands for RN_p(u), RN_{p+p'}(u), RN_p(RN_{p+p'}(u)), or any faithful rounding.

Fast2Sum and double roundings

Theorem 5. If $p \ge 3$, $p' \ge 2$, and $a$ and $b$ are precision-$p$ FPN with $e_a \ge e_b$, then Algorithm 3 satisfies:
- $z = s - a$ exactly;
- if no double rounding slip occurred when computing $s$ (i.e., if $s = \mathrm{RN}_p(a + b)$), then $t = (a + b - s)$ exactly;
- otherwise, $t = \mathrm{RN}_p(a + b - s)$.

The proof of Theorem 5 is rather complex (many sub-cases). We have a formal proof that uses the Coq proof assistant.

Fast2Sum and double roundings: proof in the (much simpler) case $a \ge |b|$

1. If $a$ and $b$ have the same sign, then $a \le a + b \le 2a$, hence ($2a$ is a FPN, rounding is increasing) $a \le s \le 2a$, which implies (Sterbenz) $z = s - a$. Therefore $t = \mathrm{RN}_p[\mathrm{RN}_{p+p'}(b - z)] = \mathrm{RN}_p[\mathrm{RN}_{p+p'}((a + b) - s)]$.
2. If $a$ and $b$ have opposite signs, then
   - either $|b| \ge a/2$, which implies (Sterbenz) that $a + b$ is a FPN, thus $s = a + b$, $z = b$ and $t = 0$;
   - or $|b| < a/2$, which implies $a + b > a/2$, hence $s \ge a/2$, thus (Sterbenz) $z = s - a$. Therefore $t = \mathrm{RN}_p[\mathrm{RN}_{p+p'}(b - z)] = \mathrm{RN}_p[\mathrm{RN}_{p+p'}((a + b) - s)]$.

Remark 4 then gives $t = \mathrm{RN}_p((a + b) - s)$.

2Sum and double roundings

Algorithm 4 (2Sum-with-double-roundings(a, b))
    (1) s ← RN_p(RN_{p+p'}(a + b)) or RN_p(a + b)
    (2) a' ← RN_p(RN_{p+p'}(s − b)) or RN_p(s − b)
    (3) b' ← ∘(s − a')
    (4) δ_a ← RN_p(RN_{p+p'}(a − a')) or RN_p(a − a')
    (5) δ_b ← RN_p(RN_{p+p'}(b − b')) or RN_p(b − b')
    (6) t ← RN_p(RN_{p+p'}(δ_a + δ_b)) or RN_p(δ_a + δ_b)

Here ∘(u) stands for RN_p(u), RN_{p+p'}(u), RN_p(RN_{p+p'}(u)), or any faithful rounding.

2Sum and double roundings

Theorem 6. Assume a target precision $p \ge 4$ and an internal precision $p + p'$, with $p' \ge 2$. If $a$ and $b$ are precision-$p$ FPN, and if no overflow occurs, then Algorithm 4 satisfies:
- if no double rounding slip occurred when computing $s$, then $t = (a + b - s)$ exactly;
- otherwise, $t = \mathrm{RN}_p(a + b - s)$.

Proofs and tech. report available at http://hal-ens-lyon.archives-ouvertes.fr/ensl-00644408 (submitted to a journal).

Application: summation algorithms. The u and γ_k notations

Higham's notations, very slightly adapted to the context of double roundings. Define

$$u = 2^{-p} \quad \text{and} \quad u' = 2^{-p} + 2^{-p-p'} + 2^{-2p-p'}.$$

For any integer $k \ll 2^p$, define

$$\gamma_k = \frac{ku}{1 - ku} \approx k\,2^{-p} \quad \text{and} \quad \gamma'_k = \frac{ku'}{1 - ku'} \approx k\,(2^{-p} + 2^{-p-p'}).$$

Application: summation algorithms

Naive, recursive-sum algorithm, rewritten with double roundings.

Algorithm 5
    r ← a_1
    for i = 2 to n do
        r ← RN_p(RN_{p+p'}(r + a_i))
    end for
    return r

Property 1.

$$\left| r - \sum_{i=1}^{n} a_i \right| \le \gamma'_{n-1} \sum_{i=1}^{n} |a_i|.$$

Without double roundings, the bound is $\gamma_{n-1} \sum_{i=1}^{n} |a_i|$.

Rump, Ogita and Oishi's K-fold summation algorithm

Algorithm 6 (VecSum(a), where a = (a_1, a_2, ..., a_n))
    p ← a
    for i = 2 to n do
        (p_i, p_{i−1}) ← 2Sum(p_i, p_{i−1})
    end for
    return p

Algorithm 7 (K-fold summation algorithm)
    for k = 1 to K − 1 do
        a ← VecSum(a)
    end for
    c = a_1
    for i = 2 to n − 1 do
        c ← RN(c + a_i)
    end for
    return RN(a_n + c)

Rump, Ogita and Oishi's K-fold summation algorithm

- Without double roundings, if $4nu < 1$, the FPN $\sigma$ returned by Algorithm 7 satisfies

$$\left| \sigma - \sum_{i=1}^{n} a_i \right| \le \left(u + \gamma_{n-1}^{2}\right) \left| \sum_{i=1}^{n} a_i \right| + \gamma_{2n-2}^{K} \sum_{i=1}^{n} |a_i|. \qquad (4)$$

- If a double rounding slip occurs in the first call to VecSum, it is not possible to show an error bound better than one proportional to $2^{-2p} \sum_{i=1}^{n} |a_i|$.

Rump, Ogita and Oishi's K-fold summation algorithm

Example (with n = 5, but easily generalizable):

$$(a_1, a_2, a_3, a_4, a_5) = \left(2^{p-1} + 1,\ \frac{1}{2} - 2^{-p-1},\ -2^{p-1},\ -2,\ \frac{1}{2}\right)$$

- Algorithm 7 is run with double roundings, with $1 \le p' \le p$;
- in the first addition ($a_1 + a_2$), a double rounding slip occurs: after the first Fast2Sum, $p_2 = 2^{p-1} + 2$ and $p_1 = -1/2$, so that $p_1 + p_2 \ne a_1 + a_2$;
- at the end of the first call to VecSum, the returned vector is

$$\left(-\frac{1}{2},\ 0,\ 0,\ 0,\ \frac{1}{2}\right).$$

Algorithm 7 returns 0 for all $K$, whereas $\sum a_i = -2^{-p-1}$. The final error is approximately $2^{-2p-1} \sum |a_i|$, for all $K$.

Conclusion

- We investigated the possible influence of double roundings on several algorithms of the FP literature;
- many important properties are preserved;
- depending on the considered applications, these properties may suffice, or specific compilation options should be chosen to prevent double roundings;
- hopefully, implementation of IEEE 754-2008 will bring some help;
- some proofs (e.g., 2Sum) are long and tricky, hence the formal proof.