Fixed Point. Basic idea: Choose a xed place in the binary number where the radix point is located. For the example above, the number is

Similar documents
Floating-point representations

Floating-point representations

Floating-Point Data Representation and Manipulation 198:231 Introduction to Computer Organization Lecture 3

FLOATING POINT NUMBERS

Divide: Paper & Pencil

Floating-point Arithmetic. where you sum up the integer to the left of the decimal point and the fraction to the right.

The Sign consists of a single bit. If this bit is '1', then the number is negative. If this bit is '0', then the number is positive.

ECE232: Hardware Organization and Design

Chapter 03: Computer Arithmetic. Lesson 09: Arithmetic using floating point numbers

MIPS Integer ALU Requirements

Floating Point Arithmetic

Computer Architecture and IC Design Lab. Chapter 3 Part 2 Arithmetic for Computers Floating Point

CO212 Lecture 10: Arithmetic & Logical Unit

Floating Point. The World is Not Just Integers. Programming languages support numbers with fraction

EE878 Special Topics in VLSI. Computer Arithmetic for Digital Signal Processing

Chapter 2 Float Point Arithmetic. Real Numbers in Decimal Notation. Real Numbers in Decimal Notation

Floating Point Numbers

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Digital Computer Arithmetic ECE 666

Floating Point. CSE 351 Autumn Instructor: Justin Hsia

Floating Point. CSE 351 Autumn Instructor: Justin Hsia

Operations On Data CHAPTER 4. (Solutions to Odd-Numbered Problems) Review Questions

Floating Point (with contributions from Dr. Bin Ren, William & Mary Computer Science)

Floating Point. CSE 351 Autumn Instructor: Justin Hsia

Numeric Encodings Prof. James L. Frankel Harvard University

Floating Point Representation. CS Summer 2008 Jonathan Kaldor

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Digital Computer Arithmetic ECE 666

Floating point. Today! IEEE Floating Point Standard! Rounding! Floating Point Operations! Mathematical properties. Next time. !

Floating Point Numbers. Lecture 9 CAP

3.5 Floating Point: Overview

Foundations of Computer Systems

Floating Point January 24, 2008

EE 109 Unit 19. IEEE 754 Floating Point Representation Floating Point Arithmetic

Computer Arithmetic Floating Point

Floating Point : Introduction to Computer Systems 4 th Lecture, May 25, Instructor: Brian Railing. Carnegie Mellon

Representing and Manipulating Floating Points

Floating Point Arithmetic

Floating Point Puzzles The course that gives CMU its Zip! Floating Point Jan 22, IEEE Floating Point. Fractional Binary Numbers.

Floating Point Arithmetic

Floating Point Numbers

Floating Point Numbers

Computer Arithmetic Ch 8

Computer Arithmetic Ch 8

Number Systems Standard positional representation of numbers: An unsigned number with whole and fraction portions is represented as:

Floating point. Today. IEEE Floating Point Standard Rounding Floating Point Operations Mathematical properties Next time.

Systems I. Floating Point. Topics IEEE Floating Point Standard Rounding Floating Point Operations Mathematical properties

Module 12 Floating-Point Considerations

Bryant and O Hallaron, Computer Systems: A Programmer s Perspective, Third Edition. Carnegie Mellon

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Digital Computer Arithmetic ECE 666

Number Representations

Floating Point. CSE 238/2038/2138: Systems Programming. Instructor: Fatma CORUT ERGİN. Slides adapted from Bryant & O Hallaron s slides

Floating Point Puzzles. Lecture 3B Floating Point. IEEE Floating Point. Fractional Binary Numbers. Topics. IEEE Standard 754

Floating Point Puzzles. Lecture 3B Floating Point. IEEE Floating Point. Fractional Binary Numbers. Topics. IEEE Standard 754

COSC 243. Data Representation 3. Lecture 3 - Data Representation 3 1. COSC 243 (Computer Architecture)

Number Systems and Computer Arithmetic

Chapter 4. Operations on Data

VTU NOTES QUESTION PAPERS NEWS RESULTS FORUMS Arithmetic (a) The four possible cases Carry (b) Truth table x y

System Programming CISC 360. Floating Point September 16, 2008

Module 12 Floating-Point Considerations

Review: MULTIPLY HARDWARE Version 1. ECE4680 Computer Organization & Architecture. Divide, Floating Point, Pentium Bug

ECE331: Hardware Organization and Design

Architecture and Design of Generic IEEE-754 Based Floating Point Adder, Subtractor and Multiplier

Representing and Manipulating Floating Points

Data Representation Floating Point

Floating Point Arithmetic. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Giving credit where credit is due

Data Representation Floating Point

Today: Floating Point. Floating Point. Fractional Binary Numbers. Fractional binary numbers. bi bi 1 b2 b1 b0 b 1 b 2 b 3 b j

Giving credit where credit is due

Integer Multiplication. Back to Arithmetic. Integer Multiplication. Example (Fig 4.25)

Computer Architecture Chapter 3. Fall 2005 Department of Computer Science Kent State University

Floating Point Arithmetic

Module 2: Computer Arithmetic

EE 109 Unit 20. IEEE 754 Floating Point Representation Floating Point Arithmetic

Implementation of IEEE754 Floating Point Multiplier

Finite arithmetic and error analysis

±M R ±E, S M CHARACTERISTIC MANTISSA 1 k j

Signed Multiplication Multiply the positives Negate result if signs of operand are different

15213 Recitation 2: Floating Point

Binary Addition & Subtraction. Unsigned and Sign & Magnitude numbers

CS429: Computer Organization and Architecture

Representing and Manipulating Floating Points

C NUMERIC FORMATS. Overview. IEEE Single-Precision Floating-point Data Format. Figure C-0. Table C-0. Listing C-0.

COMP2611: Computer Organization. Data Representation

Floating Point Numbers

IEEE-754 floating-point

Chapter Three. Arithmetic

IEEE Standard for Floating-Point Arithmetic: 754

COMPUTER ORGANIZATION AND ARCHITECTURE

CS 33. Data Representation (Part 3) CS33 Intro to Computer Systems VIII 1 Copyright 2018 Thomas W. Doeppner. All rights reserved.

Written Homework 3. Floating-Point Example (1/2)

Classes of Real Numbers 1/2. The Real Line

COMPUTER ORGANIZATION AND DESIGN

mith College Computer Science Fixed & Floating Point Formats CSC231 Fall 2017 Week #15 Dominique Thiébaut

Homework 1 graded and returned in class today. Solutions posted online. Request regrades by next class period. Question 10 treated as extra credit

Representing and Manipulating Floating Points. Jo, Heeseung

Divide: Paper & Pencil CS152. Computer Architecture and Engineering Lecture 7. Divide, Floating Point, Pentium Bug. DIVIDE HARDWARE Version 1

Organisasi Sistem Komputer

Numerical computing. How computers store real numbers and the problems that result

Representing and Manipulating Floating Points. Computer Systems Laboratory Sungkyunkwan University

4 Operations On Data 4.1. Foundations of Computer Science Cengage Learning

Transcription:

Real Numbers How do we represent real numbers? Several issues: How many digits can we represent? What is the range? How accurate are mathematical operations? Consistency... Is a + b = b + a? Is (a + b) + c = a + (b + c)? Is (a + b) b = a?

Real Numbers How do we represent real numbers? Several issues: How many digits can we represent? What is the range? How accurate are mathematical operations? Consistency... Is a + b = b + a? Is (a + b) + c = a + (b + c)? Is (a + b) b = a?

Fixed Point Basic idea: 0 1 0 0 1 0 1 0 radix point is here Choose a xed place in the binary number where the radix point is located. For the example above, the number is (010.01010) 2 = 2 + 2 2 + 2 4 = (2.3125) 10 How would you do mathematical operations?

Floating-Point Some problematic numbers... 6.023 10 23 6.673 10 11 6.62607 10 34 Scienti c computations require a number of digits of precision... But they also need range permit the radix point to move Šoating-point numbers

Floating-Point: Scienti c Notation sign + 6.023 x 10 23 exponent mantissa radix (base) Number represented as: mantissa, exponent Arithmetic multiplication, division: perform operation on mantissa, add/subtract exponent addition, subtraction: convert operands to have the same exponent value, add/subtract mantissas

Fixed Point Basic idea: 0 1 0 0 1 0 1 0 radix point is here Choose a xed place in the binary number where the radix point is located. For the example above, the number is (010.01010) 2 = 2 + 2 2 + 2 4 = (2.3125) 10 How would you do mathematical operations?

Floating-Point Some problematic numbers... 6.023 10 23 6.673 10 11 6.62607 10 34 Scienti c computations require a number of digits of precision... But they also need range permit the radix point to move Šoating-point numbers

Floating-Point: Scienti c Notation sign + 6.023 x 10 23 exponent mantissa radix (base) Number represented as: mantissa, exponent Arithmetic multiplication, division: perform operation on mantissa, add/subtract exponent addition, subtraction: convert operands to have the same exponent value, add/subtract mantissas

Floating-Point: Scienti c Notation sign + 6.023 x 10 23 exponent mantissa radix (base) Representation: IEEE 754 standard Standardized in mid-80s Single precision: 32 bits Double precision: 64 bits Both supported by the MIPS processor.

Floating-Point Basics Several design issues. Format Choices: representation of mantissa (aka signi cand), exponent, sign normal forms 6.023 10 23 = 0.6023 10 24 =... range and precision Arithmetic: equalize exponents for add/subtract inexact results, rounding exceptional conditions/errors

IEEE Single Precision Format S exponent (E) mantissa (M) 1 8 23 Uses biased exponent, actual exponent is (E 127). 127 is called the bias (excess 127 bias) Number: ( 1) S (1.M) 2 E 127 when 0 < E < 255. The implied 1 is referred to as a hidden bit. Double-precision: 64 bits, 11-bit exponent, excess 1023 bias, 52-bit signi cand

Normalized Numbers Signi cand is of the form 1.x. Example: 0.375 = (0.011) 2 = +(1.1) 2 2 2 0 01111101 10000000000000000000000 + (125) (0.1) 10 2 Example: 3.25 = (11.01) 2 = (1.101) 2 2 1 1 10000000 10100000000000000000000 (128) 10 (0.101) 2 Example: 0 = (0) 2 =?

Zero Exponent E = 0 is special for this reason. Zero signi cand: number is 0 Non-zero signi cand: denormalized number ( 1) S (0.M) 2 126 Denormal numbers are used to extend the range of Šoating-point numbers. Double-precision: exponent would be 1022. Some hardware does not implement denormal arithmetic, but uses software emulation On a Sun UltraSparc, > 80x slowdown...

Adding Normalized Numbers To calculate X + Y, assuming Y X : Alignment of radix point (denormalize smaller number) d := Exp(Y ) Exp(X), set Exp := Exp(Y ) Sig(X) := Sig(X) >> d Add the aligned components Sig = Sig(X) + Sig(Y ) Normalize the result Shift Sig left/right, changing Exp Check for overšow in Exp Round; repeat if not normalized

Adding Normalized Numbers Example: 4-bit signi cand 1.0110 2 3 + 1.1000 2 2 Align 1.0110 2 3 + 0.1100 2 3 Add 10.0010 2 3 Normalize 1.0001 2 4

Adding Normalized Numbers Example: 4-bit signi cand 1.0001 2 3 1.1110 2 1 Align 1.0001 2 3-0.01111 2 3 Subtract 0.10011 2 3 Normalize/Round 1.0011 2 2 Without extra bit, result would be 1.0010 2 2

Accuracy IEEE standard: want result to be as accurate as possible Maximum error: 1 2 ulp (units in last place) when compared to in nite precision arithmetic Alignment step can be problematic! How many bits are actually needed for arithmetic? Extra bit in last example: guard bit

Rounding Standard speci es 4 different rounding modes: round to nearest even (default) round toward + round toward round toward 0 How many bits are necessary to correctly implement the standard? Remember, the maximum permissible error is 1 2 ulp.

Round Bit Example: 4-bit signi cand 1.0000 2 0 1.0001 2 2 Align 1.0000 2 0-0.010001 2 0 Subtract 0.101111 2 0 Normalize/Round 1.01111 2 1 1.1000 2 1 (simple round up) Without extra bit, result would be 1.0111 2 1

Sticky Bit And Round To Nearest Even Example: 4-bit signi cand 1.0000 2 0 + 1.0001 2 5 Align 1.0000 2 0 + 0.000010 001 2 0 Add 1.000010 001 2 0 Normalize/Round 1.0001 2 0, or 1.0000 2 0 Sticky bit: keep track of whether the bits shifted out are non-zero.

In nity and NaNs Sources of error: If the result is too large to be represented ± What about 0/0,? not a number (NaN) NaNs propagate NaN +x = NaN Can be used to initialize Šoating-point variables. Representation: exponent is all 1 s ( 255 or 2047). If signi cand is 0, ; otherwise NaN.

Exceptions Invalid Operation, 0, etc. square root of a negative number OverŠow Divide by Zero UnderŠow denormal result or non-zero result underšows to zero Inexact rounding error is not zero