Efficient implementation of elementary functions in the medium-precision range

Fredrik Johansson (LFANT, INRIA Bordeaux)
22nd IEEE Symposium on Computer Arithmetic (ARITH 22), Lyon, France, June 2015

Elementary functions

Functions: exp, log, sin, cos, atan
My goal is to do interval arithmetic with arbitrary precision.
Input: floating-point number $x = a \cdot 2^b$, precision $p \ge 2$
Output: $m, r$ with $f(x) \in [m - r, m + r]$ and $r \approx 2^{-p} |f(x)|$

Precision ranges

Hardware precision ($n \approx 53$ bits)
  Extensively studied - elsewhere!
Medium precision
  Multiplication costs $M(n) = O(n^2)$ or $O(n^{1.6})$
  Argument reduction + rectangular splitting: $O(n^{1/3} M(n))$
  In the lower range, software overhead is significant
Very high precision
  Multiplication costs $M(n) = O(n \log n \log \log n)$
  Asymptotically fast algorithms: binary splitting, arithmetic-geometric mean (AGM) iteration: $O(M(n) \log n)$

Improvements in this work

1. The low-level mpn layer of GMP is used exclusively
   Most libraries (e.g. MPFR) use more convenient types, e.g. mpz and mpfr, to implement elementary functions
2. Argument reduction based on lookup tables
   Old idea, not well explored in high precision
3. Faster evaluation of Taylor series
   Optimized version of Smith's rectangular splitting algorithm
   Takes advantage of mpn-level functions

Recipe for elementary functions

Functions: exp(x); sin(x), cos(x); log(1 + x); atan(x)
1. Domain reduction using $\pi$ and $\log(2)$:
   $x \in [0, \log 2)$ for exp; $x \in [0, \pi/4)$ for sin and cos; $x \in [0, 1)$ for log(1 + x); $x \in [0, 1)$ for atan
2. Argument-halving $r \approx 8$ times:
   $\exp(x) = [\exp(x/2)]^2$
   $\log(1 + x) = 2 \log(\sqrt{1 + x})$
   giving $x \in [0, 2^{-r})$
3. Taylor series
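The halving recipe for exp can be sketched in a few lines. This is only an illustration in double precision, not the Arb code: the function name `exp_by_halving` and the fixed count of 8 Taylor terms are assumptions made for the example.

```python
def exp_by_halving(x: float, r: int = 8, terms: int = 8) -> float:
    """Approximate exp(x) for x in [0, log 2) using argument halving."""
    # Step 1: halve the argument r times, so y lies in [0, 2^-r).
    y = x / 2.0**r
    # Step 2: sum a short Taylor series for exp(y).
    s, term = 1.0, 1.0
    for k in range(1, terms):
        term *= y / k
        s += term
    # Step 3: undo the halvings with exp(x) = [exp(x/2)]^2, applied r times.
    for _ in range(r):
        s *= s
    return s
```

Each squaring roughly doubles the relative error, so in a real multiple-precision setting the r squarings have to be paid for with extra working precision.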

Better recipe at medium precision

Functions: exp(x); sin(x), cos(x); log(1 + x); atan(x)
1. Domain reduction using $\pi$ and $\log(2)$:
   $x \in [0, \log 2)$ for exp; $x \in [0, \pi/4)$ for sin and cos; $x \in [0, 1)$ for log(1 + x); $x \in [0, 1)$ for atan
2. Lookup table with $2^r \approx 2^8$ entries:
   $\exp(t + x) = \exp(t) \exp(x)$
   $\log(1 + t + x) = \log(1 + t) + \log(1 + x/(1 + t))$
   giving $x \in [0, 2^{-r})$
3. Taylor series

Argument reduction formulas

What we want to compute: $f(x)$, $x \in [0, 1)$
Table size: $q = 2^r$
Precomputed value: $f(t)$, $t = i/q$, $i = \lfloor 2^r x \rfloor$
Remaining value to compute: $f(y)$, $y \in [0, 2^{-r})$

$\exp(x) = \exp(t) \exp(y)$, $y = x - i/q$
$\sin(x) = \sin(t) \cos(y) + \cos(t) \sin(y)$, $y = x - i/q$
$\cos(x) = \cos(t) \cos(y) - \sin(t) \sin(y)$, $y = x - i/q$
$\log(1 + x) = \log(1 + t) + \log(1 + y)$, $y = (qx - i)/(i + q)$
$\operatorname{atan}(x) = \operatorname{atan}(t) + \operatorname{atan}(y)$, $y = (qx - i)/(ix + q)$
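The reduction formulas can be tried out in double precision. The sketch below is an illustration under stated assumptions: the table names and helper functions are hypothetical, r = 8 is picked to match the recipe, and the `math` library calls on the reduced argument y stand in for the Taylor series that a real implementation would sum.

```python
import math

R = 8
Q = 1 << R  # table size q = 2^r

# Precomputed f(t) for t = i/q, i = 0 .. q-1.
EXP_TABLE = [math.exp(i / Q) for i in range(Q)]
LOG_TABLE = [math.log1p(i / Q) for i in range(Q)]
ATAN_TABLE = [math.atan(i / Q) for i in range(Q)]

def reduce_exp(x):
    i = int(x * Q)                      # i = floor(2^r x)
    return i, x - i / Q                 # y = x - i/q

def reduce_log(x):
    i = int(x * Q)
    return i, (Q * x - i) / (i + Q)     # y = (qx - i)/(i + q)

def reduce_atan(x):
    i = int(x * Q)
    return i, (Q * x - i) / (i * x + Q) # y = (qx - i)/(ix + q)

def exp_via_table(x):
    i, y = reduce_exp(x)
    return EXP_TABLE[i] * math.exp(y)       # exp(t) exp(y)

def log1p_via_table(x):
    i, y = reduce_log(x)
    return LOG_TABLE[i] + math.log1p(y)     # log(1+t) + log(1+y)

def atan_via_table(x):
    i, y = reduce_atan(x)
    return ATAN_TABLE[i] + math.atan(y)     # atan(t) + atan(y)
```

In all three cases the reduced argument satisfies 0 <= y < 2^-r, which is what makes the subsequent series short.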

Optimizing lookup tables

Using m = 2 tables gives the same reduction as m = 1 table with $2^{10}$ entries

Table columns: Function, Precision, m, r, Entries, Size (KiB)
Rows: exp, sin, cos, log, atan (two precision levels each) and a total; the numeric entries did not survive extraction.

Taylor series

Logarithmic series:
$\operatorname{atan}(x) = x - \frac{1}{3} x^3 + \frac{1}{5} x^5 - \frac{1}{7} x^7 + \dots$
$\log(1 + x) = 2 \operatorname{atanh}(x/(x + 2))$
With $x < 2^{-10}$, need 230 terms for 4600-bit precision

Exponential series:
$\exp(x) = 1 + x + \frac{1}{2!} x^2 + \frac{1}{3!} x^3 + \frac{1}{4!} x^4 + \dots$
$\sin(x) = x - \frac{1}{3!} x^3 + \frac{1}{5!} x^5 - \dots$, $\cos(x) = 1 - \frac{1}{2!} x^2 + \frac{1}{4!} x^4 - \dots$
With $x < 2^{-10}$, need 280 terms for 4600-bit precision

Above 300 bits: $\cos(x) = \sqrt{1 - \sin^2(x)}$
Above 800 bits: $\exp(x) = \sinh(x) + \sqrt{1 + \sinh^2(x)}$
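The term counts quoted above can be checked with exact integer arithmetic. The sketch below counts how many terms are at least $2^{-p}$ in magnitude at the worst case $x = 2^{-r}$; the function names and this stopping rule are assumptions for illustration, so the result for exp only needs to land near the rounded figure of 280.

```python
def atan_terms(prec: int, r: int) -> int:
    """Count terms x^(2k+1)/(2k+1) that are >= 2^-prec when x = 2^-r."""
    k = 0
    # Term k is 1 / ((2k+1) * 2^(r(2k+1))); keep it while that is >= 2^-prec.
    while (2 * k + 1) << (r * (2 * k + 1)) <= 1 << prec:
        k += 1
    return k

def exp_terms(prec: int, r: int) -> int:
    """Count terms x^n/n! that are >= 2^-prec when x = 2^-r."""
    n, fact = 0, 1
    # Term n is 1 / (n! * 2^(r n)); keep it while that is >= 2^-prec.
    while fact << (r * n) <= 1 << prec:
        n += 1
        fact *= n
    return n
```

Calling `atan_terms(4600, 10)` reproduces the 230 terms stated for the logarithmic series, and `exp_terms(4600, 10)` gives a count close to 280.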

Evaluating Taylor series using rectangular splitting

Paterson and Stockmeyer, 1973: $\sum_{i=0}^{n} c_i x^i$ in $O(n)$ cheap steps + $O(n^{1/2})$ expensive steps

$(c_0 + c_1 x + c_2 x^2 + c_3 x^3) +$
$(c_4 + c_5 x + c_6 x^2 + c_7 x^3) x^4 +$
$(c_8 + c_9 x + c_{10} x^2 + c_{11} x^3) x^8 +$
$(c_{12} + c_{13} x + c_{14} x^2 + c_{15} x^3) x^{12}$

Smith, 1989: elementary and hypergeometric functions
Brent & Zimmermann, 2010: improvements to Smith
FJ, 2014: generalization to D-finite functions
New: optimized algorithm for elementary functions
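A minimal sketch of rectangular splitting follows, assuming small coefficients so that a coefficient-times-power product counts as a cheap step and a full multiplication by x or x^m counts as an expensive one. The function name and the returned operation count are illustrative choices, not part of any library.

```python
def rectangular_split(coeffs, x, m):
    """Evaluate sum(coeffs[i] * x**i) in blocks of m coefficients.

    Returns (value, expensive), where expensive counts full multiplications.
    """
    # Precompute 1, x, ..., x^m: m - 1 expensive multiplications.
    powers = [1, x]
    for _ in range(m - 1):
        powers.append(powers[-1] * x)
    expensive = m - 1

    total = 0
    first = True
    # Horner's rule in x^m, walking the blocks from the highest one down.
    for start in reversed(range(0, len(coeffs), m)):
        if not first:
            total = total * powers[m]   # expensive: full multiplication
            expensive += 1
        first = False
        for j, c in enumerate(coeffs[start:start + m]):
            total += c * powers[j]      # cheap: small coefficient times a power
    return total, expensive
```

With 16 coefficients and m = 4 this uses 3 multiplications to build the powers and 3 more in the outer Horner loop, which is the $O(n^{1/2})$ behaviour when m is chosen near $\sqrt{n}$. Passing a `fractions.Fraction` for x makes the result exact.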

Logarithmic series

Rectangular splitting:
$x + \frac{1}{2} x^2 + x^3 \left\{ \frac{1}{3} + \frac{1}{4} x + \frac{1}{5} x^2 + x^3 \left\{ \frac{1}{6} + \frac{1}{7} x + \frac{1}{8} x^2 \right\} \right\}$

Improved algorithm with fewer divisions:
$\frac{1}{60} \left[ 60x + 30x^2 + x^3 \left\{ 20 + 15x + 12x^2 + x^3 \left\{ 10 + \frac{60}{56} \left[ 8x + 7x^2 \right] \right\} \right\} \right]$

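Exponential series

Rectangular splitting:
$1 + x + \frac{1}{2} \left[ x^2 + \frac{1}{3} x^3 \left\{ 1 + \frac{1}{4} \left[ x + \frac{1}{5} \left[ x^2 + \frac{1}{6} x^3 \left\{ 1 + \frac{1}{7} \left[ x + \frac{1}{8} x^2 \right] \right\} \right] \right] \right\} \right]$

Improved algorithm with fewer divisions:
$1 + x + \frac{1}{24} \left[ 12x^2 + x^3 \left\{ 4 + x + \frac{1}{30} \left[ 6x^2 + x^3 \left\{ 1 + \frac{1}{56} \left[ 8x + x^2 \right] \right\} \right] \right\} \right]$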
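The "fewer divisions" forms for the logarithmic and exponential series can be checked against the plain eight-term series with exact rational arithmetic. The nested expressions below are a reconstruction of the slide examples, assuming the logarithmic example is the all-positive series $x + x^2/2 + \dots + x^8/8$; the function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def log_series_direct(x):
    # x + x^2/2 + ... + x^8/8
    return sum(x**k / k for k in range(1, 9))

def log_series_fewer_divisions(x):
    # Denominators cleared to 60, with one correction factor 60/56.
    inner = 10 + Fraction(60, 56) * (8 * x + 7 * x**2)
    middle = 20 + 15 * x + 12 * x**2 + x**3 * inner
    return (60 * x + 30 * x**2 + x**3 * middle) / 60

def exp_series_direct(x):
    # 1 + x + x^2/2! + ... + x^8/8!
    return sum(x**k / factorial(k) for k in range(9))

def exp_series_fewer_divisions(x):
    # Only three divisions, by 56, 30 and 24.
    g3 = 1 + (8 * x + x**2) / 56
    g2 = 4 + x + (6 * x**2 + x**3 * g3) / 30
    return 1 + x + (12 * x**2 + x**3 * g2) / 24
```

For a `Fraction` argument both pairs of functions agree exactly, which confirms that grouping the denominators changes the number of divisions but not the value.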

Coefficients for exp series (on a 64-bit machine)

Numerators 0-20, denominator 20!/0!
Numerators 21-33, denominator 33!/20!
...
Numerators 288-294, denominator 294!/287!
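The first two groups above are consistent with a greedy rule: keep extending a group while the product of its consecutive integers still fits in one 64-bit word. The sketch below implements that rule; whether the actual coefficient table follows it for every group is an assumption, and the function name is hypothetical.

```python
def word_sized_groups(n_terms: int, bits: int = 64):
    """Greedily split 1 .. n_terms-1 into runs a..b with a*(a+1)*...*b < 2^bits.

    Returns a list of (a, b, product) tuples; index 0 joins the first group.
    """
    groups, start, prod = [], 1, 1
    for k in range(1, n_terms):
        if prod * k >= 1 << bits:
            # Including k would overflow a word: close the current group.
            groups.append((start, k - 1, prod))
            start, prod = k, 1
        prod *= k
    groups.append((start, n_terms - 1, prod))
    return groups
```

The first group produced ends at 20 with product 20!, and the second runs from 21 to 33 with product 33!/20!.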

Taylor series evaluation using mpn arithmetic

We use n-word fixed-point numbers (ulp = $2^{-64n}$)
Negative numbers are handled implicitly or using two's complement!

Example:

    // sum = sum + term * coeff
    sum[n] += mpn_addmul_1(sum, term, n, coeff);

term is n words: real number in $[0, 1)$
sum is n + 1 words: real number in $[0, 2^{64})$
coeff is 1 word: integer in $[0, 2^{64})$
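To make the carry handling concrete, here is a small Python model of what `mpn_addmul_1` does on arrays of 64-bit limbs (least significant limb first). It is a teaching model only: the helper names `addmul_1`, `to_limbs` and `from_limbs` are made up for this sketch, and real code would call GMP directly.

```python
MASK = (1 << 64) - 1

def addmul_1(rp, sp, n, limb):
    """Model of mpn_addmul_1: rp[0:n] += sp[0:n] * limb, returning the carry limb."""
    carry = 0
    for i in range(n):
        t = rp[i] + sp[i] * limb + carry   # always fits in 128 bits
        rp[i] = t & MASK
        carry = t >> 64
    return carry

def to_limbs(value, n):
    """Split a non-negative integer into n 64-bit limbs."""
    return [(value >> (64 * i)) & MASK for i in range(n)]

def from_limbs(limbs):
    """Reassemble an integer from its limbs."""
    return sum(limb << (64 * i) for i, limb in enumerate(limbs))
```

The slide's statement `sum[n] += mpn_addmul_1(sum, term, n, coeff)` then corresponds to `s[n] += addmul_1(s, t, n, coeff)`: the low n limbs are updated in place and the returned carry is absorbed by the extra top limb of the sum.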

Taylor series summation

$c_0 + c_1 x + c_2 x^2 + c_3 x^3 + x^4 \left[ c_4 + c_5 x + c_6 x^2 + c_7 x^3 \right]$

    // inner block: c[4] + c[5] x + c[6] x^2 + c[7] x^3
    sum[n] += mpn_addmul_1(sum, xpowers[3], n, c[7]);
    sum[n] += mpn_addmul_1(sum, xpowers[2], n, c[6]);
    sum[n] += mpn_addmul_1(sum, xpowers[1], n, c[5]);
    sum[n] += c[4];

    // multiply by x^4 and keep the high n + 1 words
    mpn_mul(tmp, sum, n+1, xpowers[4], n);
    mpn_copyi(sum, tmp+n, n+1);

    // outer block: c[0] + c[1] x + c[2] x^2 + c[3] x^3
    sum[n] += mpn_addmul_1(sum, xpowers[3], n, c[3]);
    sum[n] += mpn_addmul_1(sum, xpowers[2], n, c[2]);
    sum[n] += mpn_addmul_1(sum, xpowers[1], n, c[1]);
    sum[n] += c[0];
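The same sequence of steps can be mimicked with Python integers acting as fixed-point numbers with `prec` fractional bits. This is a sketch under assumptions: the function name is hypothetical, a right shift by `prec` plays the role of `mpn_mul` followed by keeping the high words, and the coefficients are small non-negative integers.

```python
def taylor_fixed_point(c, X, prec, m=4):
    """Fixed-point value of sum(c[k] * x^k), where x = X / 2^prec and 0 <= x < 1."""
    # Truncated powers 1, x, ..., x^m, each scaled by 2^prec.
    xpowers = [1 << prec, X]
    for _ in range(m - 1):
        xpowers.append((xpowers[-1] * X) >> prec)
    total = 0
    for start in reversed(range(0, len(c), m)):
        # Multiply the running sum by x^m and drop the low bits
        # (mpn_mul, then keep the high words). A no-op for the first block.
        total = (total * xpowers[m]) >> prec
        block = c[start:start + m]
        for j in range(len(block) - 1, 0, -1):
            total += block[j] * xpowers[j]   # corresponds to mpn_addmul_1
        total += block[0] << prec            # corresponds to sum[n] += c[start]
    return total
```

Because every truncation rounds down and all quantities are non-negative, the result never exceeds the exact value, and for small coefficients the deficit stays within a few ulps.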

Alternating signs

$c_0 - c_1 x + c_2 x^2 - c_3 x^3 + x^4 \left[ c_4 - c_5 x + c_6 x^2 - c_7 x^3 \right]$

    // inner block: c[4] - c[5] x + c[6] x^2 - c[7] x^3
    sum[n] -= mpn_submul_1(sum, xpowers[3], n, c[7]);
    sum[n] += mpn_addmul_1(sum, xpowers[2], n, c[6]);
    sum[n] -= mpn_submul_1(sum, xpowers[1], n, c[5]);
    sum[n] += c[4];

    // multiply by x^4 and keep the high n + 1 words
    mpn_mul(tmp, sum, n+1, xpowers[4], n);
    mpn_copyi(sum, tmp+n, n+1);

    // outer block: c[0] - c[1] x + c[2] x^2 - c[3] x^3
    sum[n] -= mpn_submul_1(sum, xpowers[3], n, c[3]);
    sum[n] += mpn_addmul_1(sum, xpowers[2], n, c[2]);
    sum[n] -= mpn_submul_1(sum, xpowers[1], n, c[1]);
    sum[n] += c[0];

Including divisions (exponential series)

$\frac{1}{q_0} \left[ c_0 + c_1 x + c_2 x^2 + c_3 x^3 + \frac{1}{q_4} \left[ c_4 x^4 + c_5 x^5 \right] \right]$

    // inner group: c[4] x^4 + c[5] x^5, then divide by q[4]
    sum[n] += mpn_addmul_1(sum, xpowers[5], n, c[5]);
    sum[n] += mpn_addmul_1(sum, xpowers[4], n, c[4]);
    mpn_divrem_1(sum, 0, sum, n+1, q[4]);

    // outer group: c[0] + c[1] x + c[2] x^2 + c[3] x^3, then divide by q[0]
    sum[n] += mpn_addmul_1(sum, xpowers[3], n, c[3]);
    sum[n] += mpn_addmul_1(sum, xpowers[2], n, c[2]);
    sum[n] += mpn_addmul_1(sum, xpowers[1], n, c[1]);
    sum[n] += c[0];
    mpn_divrem_1(sum, 0, sum, n+1, q[0]);
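A fixed-point sketch of this scheme for the exponential series is given below. It assumes that group (a, b) uses numerators $c_k = b!/k!$ and denominator $q = b!/(a-1)!$ (or $b!$ for the first group), which matches the factorial ratios listed for the coefficient table; the function name and the use of Python integers in place of mpn arrays are choices made for illustration.

```python
from math import factorial

def exp_taylor_fixed(X, prec, groups):
    """Fixed-point sum of x^k/k! for k up to the last group end; x = X / 2^prec.

    groups is a list of index ranges (a, b), e.g. [(0, 3), (4, 5)].
    """
    top = groups[-1][1]
    # Truncated powers of x, scaled by 2^prec.
    xpow = [1 << prec]
    for _ in range(top):
        xpow.append((xpow[-1] * X) >> prec)
    total = 0
    # Evaluate backwards, from the innermost group outwards.
    for a, b in reversed(groups):
        for k in range(b, a - 1, -1):
            c = factorial(b) // factorial(k)      # numerator c_k = b!/k!
            total += c * xpow[k]                  # corresponds to mpn_addmul_1
        q = factorial(b) // factorial(max(a - 1, 0))
        total //= q                               # corresponds to mpn_divrem_1
    return total
```

With groups `[(0, 3), (4, 5)]` this is exactly the six-term example: the inner group is divided by 20, the outer one by 6. Each division shrinks the error accumulated so far, which is why the final result stays within a couple of ulps of the exact truncated series.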

Including divisions (logarithmic series)

$\frac{1}{q_0} \left[ c_0 + c_1 x + c_2 x^2 + c_3 x^3 \right] + \frac{1}{q_4} \left[ c_4 x^4 + c_5 x^5 \right]$
$= \frac{1}{q_0} \left[ c_0 + c_1 x + c_2 x^2 + c_3 x^3 + \frac{q_0}{q_4} \left[ c_4 x^4 + c_5 x^5 \right] \right]$

    // inner group: c[4] x^4 + c[5] x^5
    sum[n] += mpn_addmul_1(sum, xpowers[5], n, c[5]);
    sum[n] += mpn_addmul_1(sum, xpowers[4], n, c[4]);

    // multiply by q[0] (sum grows to n + 2 words), then divide by q[4]
    sum[n+1] = mpn_mul_1(sum, sum, n+1, q[0]);
    mpn_divrem_1(sum, 0, sum, n+2, q[4]);

    // outer group: c[0] + c[1] x + c[2] x^2 + c[3] x^3, then divide by q[0]
    sum[n] += mpn_addmul_1(sum, xpowers[3], n, c[3]);
    sum[n] += mpn_addmul_1(sum, xpowers[2], n, c[2]);
    sum[n] += mpn_addmul_1(sum, xpowers[1], n, c[1]);
    sum[n] += c[0];
    mpn_divrem_1(sum, 0, sum, n+1, q[0]);
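For logarithm-type series the group denominators are not nested, so the running sum is rescaled by $q_0/q_4$ when moving from one group to the next. The sketch below illustrates this on the series $\sum_k x^k/(k+1)$, with each group denominator taken as the least common multiple of its term denominators; that series, the lcm choice and the function names are assumptions made for the example, not the coefficients used in Arb.

```python
from math import gcd

def lcm_range(a, b):
    """Least common multiple of the integers a .. b."""
    out = 1
    for d in range(a, b + 1):
        out = out * d // gcd(out, d)
    return out

def log_like_taylor_fixed(X, prec, groups):
    """Fixed-point sum of x^k/(k+1) over the index ranges in groups; x = X / 2^prec."""
    top = groups[-1][1]
    xpow = [1 << prec]
    for _ in range(top):
        xpow.append((xpow[-1] * X) >> prec)
    # One common denominator per group of terms.
    q = [lcm_range(a + 1, b + 1) for a, b in groups]
    total = 0
    for g in range(len(groups) - 1, -1, -1):
        a, b = groups[g]
        for k in range(b, a - 1, -1):
            total += (q[g] // (k + 1)) * xpow[k]   # numerator c_k = q/(k+1)
        if g > 0:
            # Rescale to the next denominator: multiply first, then divide,
            # mirroring mpn_mul_1 followed by mpn_divrem_1.
            total = total * q[g - 1] // q[g]
    return total // q[0]
```

Multiplying before dividing keeps the intermediate value exact at the cost of one extra word, which is the reason the sum temporarily occupies n + 2 words in the code above.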

Error bounds and correctness

The algorithm evaluates each truncated Taylor series with $\le 2$ fixed-point ulp error

How does that work?
  mpn_addmul_1-type operations are exact
  Clearing denominators gives up to 64 guard bits
  Evaluation is done backwards: multiplications and divisions remove error

Must also show: no overflow, correct handling of signs
Proof depends on the precise sequence of numerators and denominators used.
Proof by exhaustive side computation!

Benchmarks

The code is part of the Arb library
Open source (GPL version 2 or later)

Timings (microseconds / function evaluation)

Table columns: Bits, exp, sin, cos, log, atan (the numeric entries did not survive extraction).
Measurements done on an Intel i7-2600S CPU.

Speedup vs MPFR

Table columns: Bits, exp, sin, cos, log, atan (the numeric entries did not survive extraction).
Measurements done on an Intel i7-2600S CPU.

Comparison to MPFR

[Figure, left panel: time (seconds) against precision (bits) for exp(x), log(x), sin(x), cos(x), atan(x), with MPFR shown dashed. Right panel: speedup of Arb vs MPFR against precision (bits) for the same functions.]
Measurements done on an Intel i7-2600S CPU.

Comparison at fixed precisions (the numeric entries of these tables did not survive extraction)

Double (53 bits) precision, microseconds/eval, Intel i7-2600S:
  Columns: exp, sin, cos, log, atan
  Rows: EGLIBC, MPFR, Arb

Quad (113 bits) precision, microseconds/eval, Intel T4400:
  Columns: exp, sin, cos, log, atan
  Rows: MPFR, libquadmath, QD, Arb

Quad-double (212 bits) precision, microseconds/eval, Intel T4400:
  Columns: exp, sin, cos, log, atan
  Rows: MPFR, QD, Arb

Summary

Elementary functions with error bounds
Variable precision up to 4600 bits
mpn arithmetic + lookup tables + efficient algorithm to evaluate Taylor series (rectangular splitting, optimized denominator sequence)
Similar algorithm for all functions (no Newton iteration, etc.)
Improvement over MPFR: up to 3-4x for cos, 8-10x for sin/exp/log, 30x for atan
Gap to double precision LIBM (EGLIBC): 4-7x

Future work

Optimizations
  Gradually change precision in Taylor series summation
  Use short multiplications (no GMP support)
  Use precomputed inverses (no GMP support)
  Assembly for low precision (1-3 words?)
  Further tuning of lookup tables
What if floating-point expansions or carry-save bignums are used instead of GMP/64-bit full words? Vectorization?
Formally verified implementation?
Port to other libraries (e.g. MPFR)

Thank you!


1.1 Numbers system :- 1.1 Numbers system :- 1.3.1 Decimal System (0-9) :- Decimal system is a way of writing numbers. Any number, from huge quantities to tiny fractions, can be written in the decimal system using only the ten

More information

Floating Point (with contributions from Dr. Bin Ren, William & Mary Computer Science)

Floating Point (with contributions from Dr. Bin Ren, William & Mary Computer Science) Floating Point (with contributions from Dr. Bin Ren, William & Mary Computer Science) Floating Point Background: Fractional binary numbers IEEE floating point standard: Definition Example and properties

More information

GPU Floating Point Features

GPU Floating Point Features CSE 591: GPU Programming Floating Point Considerations Klaus Mueller Computer Science Department Stony Brook University Objective To understand the fundamentals of floating-point representation To know

More information

Review Questions 26 CHAPTER 1. SCIENTIFIC COMPUTING

Review Questions 26 CHAPTER 1. SCIENTIFIC COMPUTING 26 CHAPTER 1. SCIENTIFIC COMPUTING amples. The IEEE floating-point standard can be found in [131]. A useful tutorial on floating-point arithmetic and the IEEE standard is [97]. Although it is no substitute

More information

Green Globs And Graphing Equations

Green Globs And Graphing Equations Green Globs And Graphing Equations Green Globs and Graphing Equations has four parts to it which serve as a tool, a review or testing device, and two games. The menu choices are: Equation Plotter which

More information

Debugging Floating-Point Math in Racket. Neil Toronto RacketCon 2013

Debugging Floating-Point Math in Racket. Neil Toronto RacketCon 2013 Debugging Floating-Point Math in Racket Neil Toronto RacketCon 213 Racket Floating-Point Support Fast (JIT-ed) and compliant (IEEE 754 and C99) 1 Racket Floating-Point Support Fast (JIT-ed) and compliant

More information

Introduction to Engineering gii

Introduction to Engineering gii 25.108 Introduction to Engineering gii Dr. Jay Weitzen Lecture Notes I: Introduction to Matlab from Gilat Book MATLAB - Lecture # 1 Starting with MATLAB / Chapter 1 Topics Covered: 1. Introduction. 2.

More information

9 Using Equation Networks

9 Using Equation Networks 9 Using Equation Networks In this chapter Introduction to Equation Networks 244 Equation format 247 Using register address lists 254 Setting up an enable contact 255 Equations displayed within the Network

More information

Starting MATLAB To logon onto a Temple workstation at the Tech Center, follow the directions below.

Starting MATLAB To logon onto a Temple workstation at the Tech Center, follow the directions below. What is MATLAB? MATLAB (short for MATrix LABoratory) is a language for technical computing, developed by The Mathworks, Inc. (A matrix is a rectangular array or table of usually numerical values.) MATLAB

More information

THE representation formats and the behavior of binary

THE representation formats and the behavior of binary SUBMITTED TO IEEE TC - NOVEMBER 2016 1 Exact Lookup Tables for the Evaluation of Trigonometric and Hyperbolic Functions Hugues de Lassus Saint-Geniès, David Defour, and Guillaume Revy Abstract Elementary

More information

CSI31 Lecture 5. Topics: 3.1 Numeric Data Types 3.2 Using the Math Library 3.3 Accumulating Results: Factorial

CSI31 Lecture 5. Topics: 3.1 Numeric Data Types 3.2 Using the Math Library 3.3 Accumulating Results: Factorial CSI31 Lecture 5 Topics: 3.1 Numeric Data Types 3.2 Using the Math Library 3.3 Accumulating Results: Factorial 1 3.1 Numberic Data Types When computers were first developed, they were seen primarily as

More information

Mathematical Modeling to Formally Prove Correctness

Mathematical Modeling to Formally Prove Correctness 0 Mathematical Modeling to Formally Prove Correctness John Harrison Intel Corporation Gelato Meeting April 24, 2006 1 The human cost of bugs Computers are often used in safety-critical systems where a

More information

A Multiple -Precision Division Algorithm. By David M. Smith

A Multiple -Precision Division Algorithm. By David M. Smith A Multiple -Precision Division Algorithm By David M. Smith Abstract. The classical algorithm for multiple -precision division normalizes digits during each step and sometimes makes correction steps when

More information

ECEn 621 Computer Arithmetic Project #3. CORDIC History

ECEn 621 Computer Arithmetic Project #3. CORDIC History ECEn 621 Computer Arithmetic Project #3 CORDIC Algorithms Slide #1 CORDIC History Coordinate Rotation Digital Computer Invented in late 1950 s Based on the observation that: if you rotate a unit-length

More information

My 2 hours today: 1. Efficient arithmetic in finite fields minute break 3. Elliptic curves. My 2 hours tomorrow:

My 2 hours today: 1. Efficient arithmetic in finite fields minute break 3. Elliptic curves. My 2 hours tomorrow: My 2 hours today: 1. Efficient arithmetic in finite fields 2. 10-minute break 3. Elliptic curves My 2 hours tomorrow: 4. Efficient arithmetic on elliptic curves 5. 10-minute break 6. Choosing curves Efficient

More information

= f (a, b) + (hf x + kf y ) (a,b) +

= f (a, b) + (hf x + kf y ) (a,b) + Chapter 14 Multiple Integrals 1 Double Integrals, Iterated Integrals, Cross-sections 2 Double Integrals over more general regions, Definition, Evaluation of Double Integrals, Properties of Double Integrals

More information

arxiv:cs/ v1 [cs.ds] 11 May 2005

arxiv:cs/ v1 [cs.ds] 11 May 2005 The Generic Multiple-Precision Floating-Point Addition With Exact Rounding (as in the MPFR Library) arxiv:cs/0505027v1 [cs.ds] 11 May 2005 Vincent Lefèvre INRIA Lorraine, 615 rue du Jardin Botanique, 54602

More information

Data Types and Basic Calculation

Data Types and Basic Calculation Data Types and Basic Calculation Intrinsic Data Types Fortran supports five intrinsic data types: 1. INTEGER for exact whole numbers e.g., 1, 100, 534, -18, -654321, etc. 2. REAL for approximate, fractional

More information

Honors Precalculus: Solving equations and inequalities graphically and algebraically. Page 1

Honors Precalculus: Solving equations and inequalities graphically and algebraically. Page 1 Solving equations and inequalities graphically and algebraically 1. Plot points on the Cartesian coordinate plane. P.1 2. Represent data graphically using scatter plots, bar graphs, & line graphs. P.1

More information

Bisecting Floating Point Numbers

Bisecting Floating Point Numbers Squishy Thinking by Jason Merrill jason@squishythinking.com Bisecting Floating Point Numbers 22 Feb 2014 Bisection is about the simplest algorithm there is for isolating a root of a continuous function:

More information

A Thread-Safe Arbitrary Precision Computation Package (Full Documentation)

A Thread-Safe Arbitrary Precision Computation Package (Full Documentation) A Thread-Safe Arbitrary Precision Computation Package (Full Documentation) David H. Bailey August 18, 2018 Abstract Numerous research studies have arisen, particularly in the realm of mathematical physics

More information

Finite arithmetic and error analysis

Finite arithmetic and error analysis Finite arithmetic and error analysis Escuela de Ingeniería Informática de Oviedo (Dpto de Matemáticas-UniOvi) Numerical Computation Finite arithmetic and error analysis 1 / 45 Outline 1 Number representation:

More information

Goals for This Lecture:

Goals for This Lecture: Goals for This Lecture: Understand integer arithmetic Understand mixed-mode arithmetic Understand the hierarchy of arithmetic operations Introduce the use of intrinsic functions Real Arithmetic Valid expressions

More information

16.69 TAYLOR: Manipulation of Taylor series

16.69 TAYLOR: Manipulation of Taylor series 859 16.69 TAYLOR: Manipulation of Taylor series This package carries out the Taylor expansion of an expression in one or more variables and efficient manipulation of the resulting Taylor series. Capabilities

More information

Chapter 3. Errors and numerical stability

Chapter 3. Errors and numerical stability Chapter 3 Errors and numerical stability 1 Representation of numbers Binary system : micro-transistor in state off 0 on 1 Smallest amount of stored data bit Object in memory chain of 1 and 0 10011000110101001111010010100010

More information

CS 6210 Fall 2016 Bei Wang. Lecture 4 Floating Point Systems Continued

CS 6210 Fall 2016 Bei Wang. Lecture 4 Floating Point Systems Continued CS 6210 Fall 2016 Bei Wang Lecture 4 Floating Point Systems Continued Take home message 1. Floating point rounding 2. Rounding unit 3. 64 bit word: double precision (IEEE standard word) 4. Exact rounding

More information

Outline. CSE 1570 Interacting with MATLAB. Starting MATLAB. Outline. MATLAB Windows. MATLAB Desktop Window. Instructor: Aijun An.

Outline. CSE 1570 Interacting with MATLAB. Starting MATLAB. Outline. MATLAB Windows. MATLAB Desktop Window. Instructor: Aijun An. CSE 170 Interacting with MATLAB Instructor: Aijun An Department of Computer Science and Engineering York University aan@cse.yorku.ca Outline Starting MATLAB MATLAB Windows Using the Command Window Some

More information

Math 121. Graphing Rational Functions Fall 2016

Math 121. Graphing Rational Functions Fall 2016 Math 121. Graphing Rational Functions Fall 2016 1. Let x2 85 x 2 70. (a) State the domain of f, and simplify f if possible. (b) Find equations for the vertical asymptotes for the graph of f. (c) For each

More information

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Digital Computer Arithmetic ECE 666

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Digital Computer Arithmetic ECE 666 UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering Digital Computer Arithmetic ECE 666 Part 8 Division through Multiplication Israel Koren ECE666/Koren Part.8.1 Division by Convergence

More information

MYSQL NUMERIC FUNCTIONS

MYSQL NUMERIC FUNCTIONS MYSQL NUMERIC FUNCTIONS http://www.tutorialspoint.com/mysql/mysql-numeric-functions.htm Copyright tutorialspoint.com MySQL numeric functions are used primarily for numeric manipulation and/or mathematical

More information

Floating-point representations

Floating-point representations Lecture 10 Floating-point representations Methods of representing real numbers (1) 1. Fixed-point number system limited range and/or limited precision results must be scaled 100101010 1111010 100101010.1111010

More information

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Digital Computer Arithmetic ECE 666

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Digital Computer Arithmetic ECE 666 UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering Digital Computer Arithmetic ECE 666 Part 4-C Floating-Point Arithmetic - III Israel Koren ECE666/Koren Part.4c.1 Floating-Point Adders

More information

f(0.2) < 0 and f(0.3) > 0 f(1.2) > 0 and f(1.3) < 0 f(1) = 1 > 0 and f(2) 0.

f(0.2) < 0 and f(0.3) > 0 f(1.2) > 0 and f(1.3) < 0 f(1) = 1 > 0 and f(2) 0. 1. [Burden and Faires, Section 1.1 Problem 1]. Show that the following equations have at least one solution in the given intervals. a. xcosx 2x 2 +3x 1 = 0 on [0.2,0.3] and [1.2,1.3]. Let f(x) = xcosx

More information

Floating-point representations

Floating-point representations Lecture 10 Floating-point representations Methods of representing real numbers (1) 1. Fixed-point number system limited range and/or limited precision results must be scaled 100101010 1111010 100101010.1111010

More information

Numerical computing. How computers store real numbers and the problems that result

Numerical computing. How computers store real numbers and the problems that result Numerical computing How computers store real numbers and the problems that result The scientific method Theory: Mathematical equations provide a description or model Experiment Inference from data Test

More information

MAT128A: Numerical Analysis Lecture Two: Finite Precision Arithmetic

MAT128A: Numerical Analysis Lecture Two: Finite Precision Arithmetic MAT128A: Numerical Analysis Lecture Two: Finite Precision Arithmetic September 28, 2018 Lecture 1 September 28, 2018 1 / 25 Floating point arithmetic Computers use finite strings of binary digits to represent

More information

FP-ANR: Another representation format to handle cancellation at run-time. David Defour LAMPS, Univ. de Perpignan (France)

FP-ANR: Another representation format to handle cancellation at run-time. David Defour LAMPS, Univ. de Perpignan (France) FP-ANR: Another representation format to handle cancellation at run-time David Defour LAMPS, Univ. de Perpignan (France) Motivations Number s representation really matters List of problems Uncertainty

More information

Floating Point Puzzles. Lecture 3B Floating Point. IEEE Floating Point. Fractional Binary Numbers. Topics. IEEE Standard 754

Floating Point Puzzles. Lecture 3B Floating Point. IEEE Floating Point. Fractional Binary Numbers. Topics. IEEE Standard 754 Floating Point Puzzles Topics Lecture 3B Floating Point IEEE Floating Point Standard Rounding Floating Point Operations Mathematical properties For each of the following C expressions, either: Argue that

More information

Lab 1 - Worksheet Spring 2013

Lab 1 - Worksheet Spring 2013 Math 300 UMKC Lab 1 - Worksheet Spring 2013 Learning Objectives: 1. How to use Matlab as a calculator 2. Learn about Matlab built in functions 3. Matrix and Vector arithmetics 4. MATLAB rref command 5.

More information

Introduction to Numerical Computing

Introduction to Numerical Computing Statistics 580 Introduction to Numerical Computing Number Systems In the decimal system we use the 10 numeric symbols 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 to represent numbers. The relative position of each symbol

More information

Make Computer Arithmetic Great Again?

Make Computer Arithmetic Great Again? Make Computer Arithmetic Great Again? Jean-Michel Muller CNRS, ENS Lyon, Inria, Université de Lyon France ARITH-25 June 2018 -2- An apparent contradiction low number of paper submissions to Arith these

More information

Outline. CSE 1570 Interacting with MATLAB. Starting MATLAB. Outline (Cont d) MATLAB Windows. MATLAB Desktop Window. Instructor: Aijun An

Outline. CSE 1570 Interacting with MATLAB. Starting MATLAB. Outline (Cont d) MATLAB Windows. MATLAB Desktop Window. Instructor: Aijun An CSE 170 Interacting with MATLAB Instructor: Aijun An Department of Computer Science and Engineering York University aan@cse.yorku.ca Outline Starting MATLAB MATLAB Windows Using the Command Window Some

More information

A Multiple-Precision Division Algorithm

A Multiple-Precision Division Algorithm Digital Commons@ Loyola Marymount University and Loyola Law School Mathematics Faculty Works Mathematics 1-1-1996 A Multiple-Precision Division Algorithm David M. Smith Loyola Marymount University, dsmith@lmu.edu

More information

Julia Calculator ( Introduction)

Julia Calculator ( Introduction) Julia Calculator ( Introduction) Julia can replicate the basics of a calculator with the standard notations. Binary operators Symbol Example Addition + 2+2 = 4 Substraction 2*3 = 6 Multify * 3*3 = 9 Division

More information

Easy Accurate Reading and Writing of Floating-Point Numbers. Aubrey Jaffer 1. August 2018

Easy Accurate Reading and Writing of Floating-Point Numbers. Aubrey Jaffer 1. August 2018 Easy Accurate Reading and Writing of Floating-Point Numbers Aubrey Jaffer August 8 Abstract Presented here are algorithms for converting between (decimal) scientific-notation and (binary) IEEE-754 double-precision

More information

MATLAB Basics EE107: COMMUNICATION SYSTEMS HUSSAIN ELKOTBY

MATLAB Basics EE107: COMMUNICATION SYSTEMS HUSSAIN ELKOTBY MATLAB Basics EE107: COMMUNICATION SYSTEMS HUSSAIN ELKOTBY What is MATLAB? MATLAB (MATrix LABoratory) developed by The Mathworks, Inc. (http://www.mathworks.com) Key Features: High-level language for numerical

More information

Scientific Computing. Error Analysis

Scientific Computing. Error Analysis ECE257 Numerical Methods and Scientific Computing Error Analysis Today s s class: Introduction to error analysis Approximations Round-Off Errors Introduction Error is the difference between the exact solution

More information

Reals 1. Floating-point numbers and their properties. Pitfalls of numeric computation. Horner's method. Bisection. Newton's method.

Reals 1. Floating-point numbers and their properties. Pitfalls of numeric computation. Horner's method. Bisection. Newton's method. Reals 1 13 Reals Floating-point numbers and their properties. Pitfalls of numeric computation. Horner's method. Bisection. Newton's method. 13.1 Floating-point numbers Real numbers, those declared to be

More information

Fixed point algorithmic math package user s guide By David Bishop

Fixed point algorithmic math package user s guide By David Bishop Fixed point algorithmic math package user s guide By David Bishop (dbishop@vhdl.org) The fixed point matrix math package was designed to be a synthesizable matrix math package. Because this package allows

More information

ECE232: Hardware Organization and Design

ECE232: Hardware Organization and Design ECE232: Hardware Organization and Design Lecture 11: Floating Point & Floating Point Addition Adapted from Computer Organization and Design, Patterson & Hennessy, UCB Last time: Single Precision Format

More information

Floating Point January 24, 2008

Floating Point January 24, 2008 15-213 The course that gives CMU its Zip! Floating Point January 24, 2008 Topics IEEE Floating Point Standard Rounding Floating Point Operations Mathematical properties class04.ppt 15-213, S 08 Floating

More information

Representation of Numbers

Representation of Numbers Computer Architecture 10 Representation of Numbers Made with OpenOffice.org 1 Number encodings Additive systems - historical Positional systems radix - the base of the numbering system, the positive integer

More information

Math 340 Fall 2014, Victor Matveev. Binary system, round-off errors, loss of significance, and double precision accuracy.

Math 340 Fall 2014, Victor Matveev. Binary system, round-off errors, loss of significance, and double precision accuracy. Math 340 Fall 2014, Victor Matveev Binary system, round-off errors, loss of significance, and double precision accuracy. 1. Bits and the binary number system A bit is one digit in a binary representation

More information

GPU & Computer Arithmetics

GPU & Computer Arithmetics GPU & Computer Arithmetics David Defour University of Perpignan Key multicore challenges Performance challenge How to scale from 1 to 1000 cores The number of cores is the new MegaHertz Power efficiency

More information

Floating Point Puzzles. Lecture 3B Floating Point. IEEE Floating Point. Fractional Binary Numbers. Topics. IEEE Standard 754

Floating Point Puzzles. Lecture 3B Floating Point. IEEE Floating Point. Fractional Binary Numbers. Topics. IEEE Standard 754 Floating Point Puzzles Topics Lecture 3B Floating Point IEEE Floating Point Standard Rounding Floating Point Operations Mathematical properties For each of the following C expressions, either: Argue that

More information

Floating Point Numbers

Floating Point Numbers Floating Point Numbers Computer Systems Organization (Spring 2016) CSCI-UA 201, Section 2 Instructor: Joanna Klukowska Slides adapted from Randal E. Bryant and David R. O Hallaron (CMU) Mohamed Zahran

More information

Floating Point Numbers

Floating Point Numbers Floating Point Numbers Computer Systems Organization (Spring 2016) CSCI-UA 201, Section 2 Fractions in Binary Instructor: Joanna Klukowska Slides adapted from Randal E. Bryant and David R. O Hallaron (CMU)

More information