CSE 160 Lecture 14. Floating Point Arithmetic Condition Variables

Similar documents
Lecture 10. Floating point arithmetic GPUs in perspective

Lecture 17 Floating point

Lecture 6. Floating Point Arithmetic Stencil Methods Introduction to OpenMP

Lecture 15 Floating point

Section. Announcements

CSE 160 Lecture 3. Applications with synchronization. Scott B. Baden

FP_IEEE_DENORM_GET_ Procedure

Floating-Point Arithmetic

Floating Point Numbers

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Digital Computer Arithmetic ECE 666

Announcements. Lab Friday, 1-2:30 and 3-4:30 in Boot your laptop and start Forte, if you brought your laptop

Floating-point numbers. Phys 420/580 Lecture 6

Bindel, Fall 2016 Matrix Computations (CS 6210) Notes for

2 Computation with Floating-Point Numbers

EE878 Special Topics in VLSI. Computer Arithmetic for Digital Signal Processing

CSE 160 Lecture 9. Load balancing and Scheduling Some finer points of synchronization NUMA

The Sign consists of a single bit. If this bit is '1', then the number is negative. If this bit is '0', then the number is positive.

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Digital Computer Arithmetic ECE 666

CSCI 402: Computer Architectures

Representing and Manipulating Floating Points

Representing and Manipulating Floating Points. Jo, Heeseung

Most nonzero floating-point numbers are normalized. This means they can be expressed as. x = ±(1 + f) 2 e. 0 f < 1

Finite arithmetic and error analysis

2 Computation with Floating-Point Numbers

Floating Point Puzzles. Lecture 3B Floating Point. IEEE Floating Point. Fractional Binary Numbers. Topics. IEEE Standard 754

Floating point. Today. IEEE Floating Point Standard Rounding Floating Point Operations Mathematical properties Next time.

System Programming CISC 360. Floating Point September 16, 2008

Chapter 2 Float Point Arithmetic. Real Numbers in Decimal Notation. Real Numbers in Decimal Notation

Floating Point January 24, 2008

Scientific Computing. Error Analysis

Floating-point representation

Binary floating point encodings

Foundations of Computer Systems

Floating Point. CSE 238/2038/2138: Systems Programming. Instructor: Fatma CORUT ERGİN. Slides adapted from Bryant & O Hallaron s slides

Up next. Midterm. Today s lecture. To follow

Floating Point Puzzles. Lecture 3B Floating Point. IEEE Floating Point. Fractional Binary Numbers. Topics. IEEE Standard 754

Bryant and O Hallaron, Computer Systems: A Programmer s Perspective, Third Edition. Carnegie Mellon

Representing and Manipulating Floating Points

Floating point. Today! IEEE Floating Point Standard! Rounding! Floating Point Operations! Mathematical properties. Next time. !

Parallel Computing. Prof. Marco Bertini

Precision and Accuracy

Floating Point Representation. CS Summer 2008 Jonathan Kaldor

Representing and Manipulating Floating Points. Computer Systems Laboratory Sungkyunkwan University

UC Berkeley CS61C : Machine Structures

CS 261 Fall Floating-Point Numbers. Mike Lam, Professor.

Floating Point Numbers

Floating Point Numbers

Systems I. Floating Point. Topics IEEE Floating Point Standard Rounding Floating Point Operations Mathematical properties

Lecture 13: (Integer Multiplication and Division) FLOATING POINT NUMBERS

Classes of Real Numbers 1/2. The Real Line

Representing and Manipulating Floating Points

Module 12 Floating-Point Considerations

Floating Point. CSE 351 Autumn Instructor: Justin Hsia

Floating Point : Introduction to Computer Systems 4 th Lecture, May 25, Instructor: Brian Railing. Carnegie Mellon

CS 261 Fall Floating-Point Numbers. Mike Lam, Professor.

Floating-Point Numbers in Digital Computers

Data Representation Floating Point

Giving credit where credit is due

Data Representation Floating Point

Floating-Point Numbers in Digital Computers

Giving credit where credit is due

Data Representation Floating Point

MAT128A: Numerical Analysis Lecture Two: Finite Precision Arithmetic

Lecture 10 Midterm review

Floating Point (with contributions from Dr. Bin Ren, William & Mary Computer Science)

Floating Point. CSC207 Fall 2017

Computational Methods. Sources of Errors

.. \ , Floating Point. Arithmetic. (The IEEE Standard) :l.

Floating Point Puzzles The course that gives CMU its Zip! Floating Point Jan 22, IEEE Floating Point. Fractional Binary Numbers.

IEEE Standard 754 Floating Point Numbers

Efficient floating point: software and hardware

COMP2611: Computer Organization. Data Representation

Lecture 5. Applications: N-body simulation, sorting, stencil methods

Floating Point. CSE 351 Autumn Instructor: Justin Hsia

Floating Point Numbers. Lecture 9 CAP

CSCI 402: Computer Architectures. Arithmetic for Computers (3) Fengguang Song Department of Computer & Information Science IUPUI.

C++11 threads -- a simpler interface than pthreads Examples largely taken from 013/02/24/investigating-c11-threads/

Inf2C - Computer Systems Lecture 2 Data Representation

Floating Point. CSE 351 Autumn Instructor: Justin Hsia

Today: Floating Point. Floating Point. Fractional Binary Numbers. Fractional binary numbers. bi bi 1 b2 b1 b0 b 1 b 2 b 3 b j

CS429: Computer Organization and Architecture

COMPUTER SCIENCE 314 Numerical Methods SPRING 2013 ASSIGNMENT # 2 (25 points) January 22

UC Berkeley CS61C : Machine Structures

CS 33. Data Representation (Part 3) CS33 Intro to Computer Systems VIII 1 Copyright 2018 Thomas W. Doeppner. All rights reserved.

Floating-point operations I

IEEE Standard 754 for Binary Floating-Point Arithmetic.

Floating Point Considerations

Computational Mathematics: Models, Methods and Analysis. Zhilin Li

Floating-point representations

Computer Arithmetic. 1. Floating-point representation of numbers (scientific notation) has four components, for example, 3.

Floating-point representations

CS61C : Machine Structures

Numerical considerations

C NUMERIC FORMATS. Overview. IEEE Single-Precision Floating-point Data Format. Figure C-0. Table C-0. Listing C-0.

CS321. Introduction to Numerical Methods

CS321 Introduction To Numerical Methods

Numerical Computing: An Introduction

September, 2003 Saeid Nooshabadi

Lecture 10: Floating Point, Digital Design

Floating Point Representation in Computers

Transcription:

CSE 16 Lecture 14 Floating Point Arithmetic Condition Variables

Announcements Pick up your regrade exams in office hours today All exams will be moved to student affairs on Wednesday 213 Scott B Baden / CSE 16 / Fall 213 2

Today s lecture Revisiting Gaussian Elimination Floating Point Arithmetic Condition Variables Testing and Debugging 213 Scott B Baden / CSE 16 / Fall 213 3

Visualizing the Gaussian Elimenation Add multiples of each row to later rows to make A upper triangular for each column k zero it out below the diagonal by adding multiples of row k to later rows for k = to n-1 Column kz for each row i below row k for i = k+1 to n-1 add a multiple of row k to row I for j = k+1 to n-1 A[i,j] -= A[ i, k] / A[k,k] * A[k,j] [k,k] [k,j] Row k [i,k] [i,j] Row i 213 Scott B Baden / CSE 16 / Fall 213 4

Eliminating the entries below the diagonal Column k Add multiples of each row to lower rows to make A upper triangular k For each column k : to n-1 subtract multiples of row k: A[k,k+1:n] from rows i = k+1 to n [k,k] [k,j] Row k to zero out column k below row k Multipliers m ik = A[i,k]/A[k,k] [i,k] [i,j] Row i cancel the elements below the diagonal: A[k+1:n-1,k] Update only to the right of & below A[k,k] for i = k+1 to n-1 A[i,k+1:n ] = m ik A[k, k+1:n ] 213 Scott B Baden / CSE 16 / Fall 213 5

Parallelization We ll use 1D vertical strip partitioning The represents outstanding work in succeeding k iterations We divide the trailing matrix + pivot column among the cores The pivot column is always owned by core After i=1 After i=2 After i=3 After i=n-1 213 Scott B Baden / CSE 16 / Fall 213 6

Communication and control Each thread in charge of eliminating N/P columns But as we eliminate columns, N is shrinking Pivot selection is serial work The trailing matrix multiplication parallelizes perfectly 213 Scott B Baden / CSE 16 / Fall 213 7

Today s lecture Revisiting Gaussian Elimination Floating Point Arithmetic Condition Variables Testing and Debugging 213 Scott B Baden / CSE 16 / Fall 213 8

A representation ±25732 1 22 What is floating point? Single, double, extended precision NaN A set of operations + = * / rem Comparison < = > Conversions between different formats, binary to decimal Exception handling IEEE Floating point standard P754 Universally accepted W Kahan received the Turing Award in 1989 for the design of IEEE Floating Point Standard Revised in 28 213 Scott B Baden / CSE 16 / Fall 213 9

IEEE Floating point standard P754 Normalized representation ±1d d 2 esp Macheps = Machine epsilon = ε = 2 relative error in each operation -#significand bits OV = overflow threshold = largest number UN = underflow threshold = smallest number ±Zero: ±significand and exponent = Format # bits #significand bits macheps #exponent bits exponent range ---------- -------- ---------------------- ------------ -------------------- ---------------------- Single 32 23+1 2-24 (~1-7 ) 8 2-126 - 2 127 (~1 +-38 ) Double 64 52+1 2-53 (~1-16 ) 11 2-122 - 2 123 (~1 +-38 ) Double 8 64 2-64 (~1-19 ) 15 2-16382 - 2 16383 (~1 +-4932 ) Jim Demmel 213 Scott B Baden / CSE 16 / Fall 213 1

What happens in a floating point operation? Round to the nearest representable floating point number that corresponds to the exact value(correct rounding) Round to nearest value with the lowest order bit = (rounding toward nearest even) Others are possible We don t need the exact value to work this out! Applies to + = * / rem Error formula: fl(a op b) = (a op b)*(1 + δ) where op one of +, -, *, / δ ε assuming no overflow, underflow, or divide by zero Addition example fl( x i ) = i=1:n x i *(1+e i ) e i < (n-1)ε 213 Scott B Baden / CSE 16 / Fall 213 11

Denormalized numbers Compute: if (a b) then x = a/(a-b) We should never divide by, even if a-b is tiny Underflow exception occurs when exact result a-b < underflow threshold UN Return a denormalized number for a-b Relax restriction that leading digit is 1: ±d d x 2 min_exp Reserve the smallest exponent value, lose a set of small normalized numbers Fill in the gap between and UN uniform distribution of values 213 Scott B Baden / CSE 16 / Fall 213 12

Anomalous behavior Floating point arithmetic is not associative (x + y) + z x+ (y+z) Distributive law doesn t always hold These expressions have different values when y z x*y x*z x(y-z) Optimizers can t reason about floating point If we compute a quantity in extended precision (8 bits) we lose digits when we store to memory y x float x, y=, z= ; x = y + z; y=x; 213 Scott B Baden / CSE 16 / Fall 213 13

NaN (Not a Number) Invalid exception Exact result is not a well-defined real number /, -1 NaN op number = NaN We can have a quiet NaN or an snan Quiet does not raise an exception, but propagates a distinguished value Eg missing data: max(3,nan) = 3 Signaling - generate an exception when accessed Detect uninitialized data 213 Scott B Baden / CSE 16 / Fall 213 14

Exception Handling An exception occurs when the result of a floating point operation is not representable as a normalized floating point number 1/, -1 P754 standardizes how we handle exceptions Overflow: - exact result > OV, too large to represent Underflow: exact result nonzero and < UN, too small to represent Divide-by-zero: nonzero/ Invalid: /, -1, log(), etc Inexact: there was a rounding error (common) Two possible responses Stop the program, given an error message Tolerate the exception, possibly repairing the error 213 Scott B Baden / CSE 16 / Fall 213 15

Graph the function An example f(x) = sin(x) / x f() = 1 But we get a singularity @ x=: 1/x = This is an accident in how we represent the function (W Kahan) We catch the exception (divide by ) Substitute the value f() = 1 213 Scott B Baden / CSE 16 / Fall 213 16

Exception handling An important part of the standard, 5 exceptions Overflow and Underflow Divide-by-zero Invalid Inexact Each of the 5 exceptions manipulates 2 flags Sticky flag set by an exception Remains set until explicitly cleared by the user Exception flag: should a trap occur? If so, we can enter a trap handler But requires precise interrupts, causes problems on a parallel computer We can use exception handling to build faster algorithms Try the faster but riskier algorithm Rapidly test for accuracy (possibly with the aid of exception handling) Substitute slower more stable algorithm as needed 213 Scott B Baden / CSE 16 / Fall 213 17

Today s lecture Revisiting Gaussian Elimination Floating Point Arithmetic Condition Variables Testing and Debugging 213 Scott B Baden / CSE 16 / Fall 213 18

Recall Lock_guard The lock_guard constructor acquires (locks) the provided lock constructor argument When a lock_guard destructor is called, it releases (unlocks) the lock int val, v; std::mutex valmutex; { std::lock_guard<std::mutex> lg(valmutex); v = val; } if (v >= ) f(v); else f(-v); 213 Scott B Baden / CSE 16 / Fall 213 19

Flexible locking with unique_lock The constructor need not acquire (lock) the provided lock constructor argument The destructor unlocks the lock, if currently owned std::mutex m; { std::unique_lock<std::mutex> lock_a(m,std::defer_lock); std::lock(lock_a); } ; 213 Scott B Baden / CSE 16 / Fall 213 2

Condition variables Threads can signal one another with a message: the buffer is ready Or, we might want to wait for a certain time to elapse: but what if we fell asleep during the alarm? We could busy wait on an atomic variable, or a variable protected by a critical section, but this can be wasteful C++ provides condition variables, a preferable way to handle the above situations 213 Scott B Baden / CSE 16 / Fall 213 21

Using Condition variables We need a lock in order to use a condition variable Wait() will check for the desired condition within a critical section protected by the lock If the condition has been met, wait() exits Else, it unlocks the mutex and the thread enters a wait state Notify_one( ) causes the thread waiting on the condition to awaken The thread reacquires the lock If the condition has been met, Notify_one( ) returns, with the lock in the acquired state Else it blocks again How can this happen? 213 Scott B Baden / CSE 16 / Fall 213 22

Example with condition variable (I) #include <mutex> #include <condition_variable> #include <thread> #include <queue> bool more_data_to_prepare() { return false; } data_chunk prepare_data() { return data_chunk(); } bool is_last_chunk(data_chunk&) { return true; } std::mutex mut; std::queue<data_chunk> data_queue; std::condition_variable data_cond; 213 Scott B Baden / CSE 16 / Fall 213 23

Example with condition variable (II) void Producer_T() { while(more_to_produce()) { data_chunk const data=produce(); std::lock_guard<std::mutex> lk(mut); data_queuepush(data); data_condnotify_one(); } } void Consumer_T() { while(true) { std::unique_lock<std::mutex> lk(mut); } } data_condwait(lk,[]{return!data_queueempty();}); data_chunk data=data_queuefront(); data_queuepop(); lkunlock(); Consume(data); if(is_last_chunk(data)) break; int main() { std::thread t1(producer_t); std::thread t2(consumer_t); t1join(); t2join(); } 213 Scott B Baden / CSE 16 / Fall 213 24

Building a barrier with Condition variables Cleaner design than with locks only barrier(int NT =2) : ndone(), _nt(nt){}; void bsync() { unique_lock<mutex> lk(mtx,std::defer_lock); }; lklock(); if (++ndone < _nt) cvarwait(lk); else{ ndone = ; cvarnotify_all(); } Barrier(int NT=2): arrival(unlocked), departure(locked), count= {}; void bsync( ){ arrivallock( ); if (++count < NT) arrivalunlock( ); else departureunlock( ); departurelock( ); if (--count > ) departureunlock( ); else arrivalunlock( ); } 213 Scott B Baden / CSE 16 / Fall 213 25

Today s lecture Revisiting Gaussian Elimination Floating Point Arithmetic Condition Variables Testing and Debugging 213 Scott B Baden / CSE 16 / Fall 213 26

Fin