Floating-point representations

Similar documents
Floating-point representations

Floating Point Arithmetic

Floating Point. The World is Not Just Integers. Programming languages support numbers with fraction

Divide: Paper & Pencil

CO212 Lecture 10: Arithmetic & Logical Unit

Floating Point Numbers

ECE232: Hardware Organization and Design

EE878 Special Topics in VLSI. Computer Arithmetic for Digital Signal Processing

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Digital Computer Arithmetic ECE 666

Fixed Point. Basic idea: Choose a xed place in the binary number where the radix point is located. For the example above, the number is

FLOATING POINT NUMBERS

Floating Point (with contributions from Dr. Bin Ren, William & Mary Computer Science)

Foundations of Computer Systems

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Digital Computer Arithmetic ECE 666

Floating-Point Data Representation and Manipulation 198:231 Introduction to Computer Organization Lecture 3

Computer Arithmetic Floating Point

Floating Point Puzzles. Lecture 3B Floating Point. IEEE Floating Point. Fractional Binary Numbers. Topics. IEEE Standard 754

CSCI 402: Computer Architectures. Arithmetic for Computers (3) Fengguang Song Department of Computer & Information Science IUPUI.

C NUMERIC FORMATS. Overview. IEEE Single-Precision Floating-point Data Format. Figure C-0. Table C-0. Listing C-0.

Floating Point : Introduction to Computer Systems 4 th Lecture, May 25, Instructor: Brian Railing. Carnegie Mellon

The Sign consists of a single bit. If this bit is '1', then the number is negative. If this bit is '0', then the number is positive.

Floating point. Today! IEEE Floating Point Standard! Rounding! Floating Point Operations! Mathematical properties. Next time. !

Representing and Manipulating Floating Points

Module 2: Computer Arithmetic

Bryant and O Hallaron, Computer Systems: A Programmer s Perspective, Third Edition. Carnegie Mellon

MIPS Integer ALU Requirements

Floating Point. CSE 351 Autumn Instructor: Justin Hsia

Chapter 4. Operations on Data

Chapter 2 Float Point Arithmetic. Real Numbers in Decimal Notation. Real Numbers in Decimal Notation

Floating Point January 24, 2008

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Digital Computer Arithmetic ECE 666

Floating Point Puzzles. Lecture 3B Floating Point. IEEE Floating Point. Fractional Binary Numbers. Topics. IEEE Standard 754

Floating Point. CSE 238/2038/2138: Systems Programming. Instructor: Fatma CORUT ERGİN. Slides adapted from Bryant & O Hallaron s slides

COMP2611: Computer Organization. Data Representation

Operations On Data CHAPTER 4. (Solutions to Odd-Numbered Problems) Review Questions

Floating point. Today. IEEE Floating Point Standard Rounding Floating Point Operations Mathematical properties Next time.

Signed Multiplication Multiply the positives Negate result if signs of operand are different

Numeric Encodings Prof. James L. Frankel Harvard University

Floating Point Representation. CS Summer 2008 Jonathan Kaldor

System Programming CISC 360. Floating Point September 16, 2008

Today: Floating Point. Floating Point. Fractional Binary Numbers. Fractional binary numbers. bi bi 1 b2 b1 b0 b 1 b 2 b 3 b j

Data Representation Floating Point

Chapter Three. Arithmetic

Floating Point Puzzles The course that gives CMU its Zip! Floating Point Jan 22, IEEE Floating Point. Fractional Binary Numbers.

Systems I. Floating Point. Topics IEEE Floating Point Standard Rounding Floating Point Operations Mathematical properties

Representing and Manipulating Floating Points

Floating Point Numbers

Floating Point Numbers

Data Representation Floating Point

3.5 Floating Point: Overview

Floating-point Arithmetic. where you sum up the integer to the left of the decimal point and the fraction to the right.

Number Systems Standard positional representation of numbers: An unsigned number with whole and fraction portions is represented as:

Computer Arithmetic Ch 8

Computer Arithmetic Ch 8

Giving credit where credit is due

Finite arithmetic and error analysis

Floating Point Numbers

Giving credit where credit is due

Computer Architecture and IC Design Lab. Chapter 3 Part 2 Arithmetic for Computers Floating Point

Part V Real Arithmetic

Up next. Midterm. Today s lecture. To follow

CS 33. Data Representation (Part 3) CS33 Intro to Computer Systems VIII 1 Copyright 2018 Thomas W. Doeppner. All rights reserved.

COMPUTER ORGANIZATION AND DESIGN

Floating Point Numbers. Lecture 9 CAP

Number Representations

CS61c Midterm Review (fa06) Number representation and Floating points From your friendly reader

Floating Point Arithmetic. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

IEEE Standard for Floating-Point Arithmetic: 754

±M R ±E, S M CHARACTERISTIC MANTISSA 1 k j

Representing and Manipulating Floating Points. Jo, Heeseung

Floating Point. CSE 351 Autumn Instructor: Justin Hsia

Representing and Manipulating Floating Points

Floating Point COE 308. Computer Architecture Prof. Muhamed Mudawar. Computer Engineering Department King Fahd University of Petroleum and Minerals

Double Precision Floating-Point Multiplier using Coarse-Grain Units

Part V Real Arithmetic

15213 Recitation 2: Floating Point

Classes of Real Numbers 1/2. The Real Line

The ALU consists of combinational logic. Processes all data in the CPU. ALL von Neuman machines have an ALU loop.

Floating Point Arithmetic

Chapter 03: Computer Arithmetic. Lesson 09: Arithmetic using floating point numbers

CS 101: Computer Programming and Utilization

Chapter 2 Data Representations

CS429: Computer Organization and Architecture

Floating Point. CSE 351 Autumn Instructor: Justin Hsia

4 Operations On Data 4.1. Foundations of Computer Science Cengage Learning

Floating-Point Arithmetic

Divide: Paper & Pencil CS152. Computer Architecture and Engineering Lecture 7. Divide, Floating Point, Pentium Bug. DIVIDE HARDWARE Version 1

Number Systems and Computer Arithmetic

Data Representations & Arithmetic Operations

Computer Organization: A Programmer's Perspective

Lecture 13: (Integer Multiplication and Division) FLOATING POINT NUMBERS

Representing and Manipulating Floating Points. Computer Systems Laboratory Sungkyunkwan University

Computer Architecture Chapter 3. Fall 2005 Department of Computer Science Kent State University

Floating Point Arithmetic

Floating Point Representation in Computers

1. NUMBER SYSTEMS USED IN COMPUTING: THE BINARY NUMBER SYSTEM

Number Systems. Both numbers are positive

ecture 25 Floating Point Friedland and Weaver Computer Science 61C Spring 2017 March 17th, 2017

EE260: Logic Design, Spring n Integer multiplication. n Booth s algorithm. n Integer division. n Restoring, non-restoring

Measuring Improvement When Using HUB Formats to Implement Floating-Point Systems under Round-to- Nearest

Transcription:

Lecture 10 Floating-point representations Methods of representing real numbers (1) 1. Fixed-point number system limited range and/or limited precision results must be scaled 100101010 1111010 100101010.1111010 2. Rational number system difficult arithmetic operations 1001010101 0110001011 Number = numerator denominator 1001010101 0110001011 Methods of representing real numbers (2) 3. Logarithmic number system 1 10010101010 sign logarithm wide range low precision Number = - 2 1001010.1010 2 4. Floating-point number systems 1

Fixed-point vs. floating point representation (1) Fixed-point ulp 0000 0000. 0000 1001 2 1001 0000. 0000 0000 2 Absolute representation error ulp with truncation ulp/2 with rounding Relative representation error large small Absolute representation error constant Relative representation error variable - large for small numbers - small for large numbers Fixed-point vs. floating point representation (2) Floating-point 1.001 2-5 1.001 2 7 Absolute representation error variable - large for large numbers - small for small numbers Relative representation error constant Floating point number (1) number = ± significand base exponent x = ± s b e Typical value of the base b=2 but in the past b=2, 8, 16, 256 Typical range of the significand s [1, 2) Other common choice s [0, 1) Such s is called mantissa 2

Floating point number (2) Exponent e typically represented using biased representation Exponent stored as e + bias where bias = 2 h-1-1, h number of bits of the exponent Extra cost compared to the 2 s complement representation is small because of relatively small sizes of exponents Range of floating point numbers [-max, -min] and [min, max] 0 where max = largest significand largest exponent max = s max b e max min = smallest significand smallest exponent min = s min b e min Current standard ANSI / IEEE Std 754-1985 Single-precision or short format, 32 bits Double-precision or long format, 64 bits 3

Operations on special numbers 0 NaN not-a-number 0 / 0 = NaN + 0 = NaN ± Ordinary Number = ± NaN + Ordinary Number = NaN Denormals 0.f 2-126 hidden 1 replaced by 0 smallest exponent Representation e+bias = 0 s = f 0 Implementation of denormals is optional Extended formats Used internally to reduce the effect of accumulated errors Number of bits Representation Single-precision Exponent Significand Double-precision Exponent Significand normal 8 23+1 11 52+1 extended 11 32+1 15 64+1 4

Unpacking Unpacking & packing numbers 1. Separate sign, exponent, significand 2. Reinstate hidden 1 3. Convert to internal extended format Packing 1. Test for special values 2. Remove hidden 1 3. Combine sign, exponent, and significand Addition (1) (± s1 b e1 ) + (± b e2 ) = = assuming e1 > e2 = = (± s1 b e1 ) + (± b e1 ) = b e1-e2 = (± s1 ± ) b e1 b e1-e2 Allignment shift or preshift e1>e2 s1 1.01110101 1.01110101 e1-e2 positions 0.00001011 Addition (2): Normalization shift or post shift s = ± s1 ± b e1-e2 s1, [1, 2) For the same sign of operands 1 s < 4 A single right shift of s and adding 1 to the exponent might be required For the different signs of operands 0 s < 2 Multiple left shifts of s and decrementing exponent might be necessary 5

Addition (3) If a common case of e1-e2 2 s1 [1, 2) [0, 1/2) s1- (1/2, 2) s1- might need to be shifted left by one position Addition (4) Each significand represented in two s complement notation using at least the following bits z 1 - sign bit c out z 1 z 0. z -1 z -2 z -l G R S G - guard bit: protects against the loss of precision R - round bit: basic bit used for rounding during left shift by one position S - sticky bit: auxiliary bit used for rounding during left shift by one position Addition (4) R - used for rounding in case of one bit left shift performed during normalization if R=0, set z -l to 0, discarded part < ulp/2 R=1 and S 0 set z -l to 1, discarded part < ulp/2 R=1 and S=0 round to the nearest even number discarded part = ulp/2, S - logical OR of all bits shifted through the given position during preshift (alignment shift) 6

Multiplication (± s1 b e1 ) (± b e2 ) = = (± s1 ) b e1+e2 = s b e 1 s1 < 4 A single right shift of s and adding 1 to the exponent might be required If e1 and e2 are of the same sign, multiplication may lead to overflow or underflow Division ± s1 b e1 = ± ± b e2 s1 be1-e2 1 2 s1 < < 2 A single left shift of s and subtracting 1 from the exponent might be required If e1 and e2 are of the opposite signs, multiplication may lead to overflow or underflow Comparisons and exceptions Valid comparisons < 0 + 0 = 0 NaN NaN NaN ± Comparisons that generate exceptions NaN < + NaN > - Operations that generate exceptions ( ) + (+ ) 0 0 / 0 / 7

Rounding schemes (1) Truncation Signed-magnitude representation 2 s complement representation Rounding schemes (2) Round to the nearest (rtn) Signed-magnitude representation 2 s complement representation Rounding schemes (3) Round to the nearest (rtn) Rounding errors 0.00 0 0 0.01 0-0.25 0.10 1 +0.5 0.11 1 +0.25 Average error +0.125 8

Rounding schemes (4) Round to the nearest even (rtne) Signed-magnitude representation 2 s complement representation 9