Floating-Point Data Representation and Manipulation 198:231 Introduction to Computer Organization Lecture 3


Floating-Point Data Representation and Manipulation
198:231 Introduction to Computer Organization
Instructor: Nicole Hynes, nicole.hynes@rutgers.edu

Fixed Point Numbers
Fixed point number: integer part + fractional part, with a fixed number of digits to the left and right of the radix point.
In decimal (23 is the integer part, .784 is the fraction):
  23.784_10 = 2×10^1 + 3×10^0 + 7×10^-1 + 8×10^-2 + 4×10^-3
Similarly in binary:
  10.1011_2 = 1×2^1 + 0×2^0 + 1×2^-1 + 0×2^-2 + 1×2^-3 + 1×2^-4
            = 2 + 0 + 0.5 + 0 + 0.125 + 0.0625
            = 2.6875_10
What about base b?
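The positional expansion above can be sketched in a few lines of Python. This is only an illustration: the function name fixed_point_value is made up here, and exact fractions are used so that no rounding hides the arithmetic. It also answers the "base b" question for bases up to 36.

```python
from fractions import Fraction

def fixed_point_value(digits, base=2):
    """Evaluate a fixed-point digit string such as '10.1011' in the given base."""
    int_part, _, frac_part = digits.partition(".")
    value = Fraction(0)
    # Integer digits: Horner's rule gives weights base^(n-1) ... base^0.
    for d in int_part:
        value = value * base + int(d, base)
    # Fractional digits carry weights base^-1, base^-2, ...
    for i, d in enumerate(frac_part, start=1):
        value += Fraction(int(d, base), base ** i)
    return value

print(float(fixed_point_value("10.1011")))        # 2.6875
print(float(fixed_point_value("23.784", base=10)))  # 23.784
```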

Converting Decimal Fraction to Binary Fraction
Algorithm illustration: 0.6875_10 = ?_2
  multiply by 2          int part   frac part
  0.6875 × 2 = 1.375        1        0.375
  0.375  × 2 = 0.75         0        0.75
  0.75   × 2 = 1.5          1        0.5
  0.5    × 2 = 1.0          1        0
Read off the int parts in order, top to bottom. Stop when frac part = 0.
Therefore, 0.6875_10 = 0.1011_2

Decimal Fraction to Binary Fraction
Converting from decimal to binary may result in a nonterminating fraction.
Example: 0.1_10 = 0.0 0011 0011 0011 ..._2 (the sequence 0011 repeats)
May need to round to the desired number of fractional places.
Example: 0.1_10 = ?_2
  multiply by 2      int part   frac part
  0.1 × 2 = 0.2         0        0.2
  0.2 × 2 = 0.4         0        0.4
  0.4 × 2 = 0.8         0        0.8
  0.8 × 2 = 1.6         1        0.6
  0.6 × 2 = 1.2         1        0.2
  0.2 × 2 = 0.4         0        0.4
  0.4 × 2 = 0.8         0        0.8
  0.8 × 2 = 1.6         1        0.6
  0.6 × 2 = 1.2         1        0.2
  ... (repeats)
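The multiply-by-2 procedure can be sketched as follows. The function name fraction_to_binary and the max_bits cutoff are assumptions added for illustration; the cutoff simply truncates a nonterminating expansion rather than rounding it.

```python
from fractions import Fraction

def fraction_to_binary(x, max_bits=24):
    """Convert 0 <= x < 1 to a binary fraction string, truncated at max_bits."""
    bits = []
    while x != 0 and len(bits) < max_bits:
        x *= 2
        int_part = int(x)          # 0 or 1: the next bit
        bits.append(str(int_part))
        x -= int_part              # keep only the fractional part
    return "0." + ("".join(bits) or "0")

print(fraction_to_binary(Fraction("0.6875")))            # 0.1011
print(fraction_to_binary(Fraction("0.1"), max_bits=12))  # 0.000110011001
```

Exact fractions are used instead of Python floats because a float literal such as 0.1 is itself already rounded to binary.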

Rounding
Because computers represent numbers using a fixed number of bits, both the range and the precision of representable numbers are limited.
Precision is usually associated with the number of fractional bits allowed by the computer representation.
If a number has more fractional bits than the representation allows, it must be rounded to the required precision.
Example: How should 10.1011_2 be rounded to 2 fractional bits? To 10.10_2? Or to 10.11_2?

Rounding
Rounding modes. Let a be the number and ā be its rounded value.
1. Round-toward-zero
   Round a to the nearest number ā of desired precision such that |ā| ≤ |a|.
   Also called truncation because it simply drops excess fractional bits.
2. Round-down
   Round a to the nearest number ā of desired precision such that ā ≤ a.
   Also called round-toward-negative-infinity.
3. Round-up
   Round a to the nearest number ā of desired precision such that ā ≥ a.
   Also called round-toward-positive-infinity.
4. Round-to-even
   Round a to the number ā of desired precision such that |a − ā| is minimized.
   If there is a tie, choose the ā whose least significant digit/bit is even.
   Also called round-to-nearest.
   Default mode used in the IEEE floating point format, which we'll discuss next.

Rounding
Rounding examples. Assume precision is 2 fractional places (decimal digits or bits).
  Number          Round-toward-0   Round-down    Round-up      Round-to-even
  1.4523_10       1.45_10          1.45_10       1.46_10       1.45_10
  -2.1786_10      -2.17_10         -2.18_10      -2.17_10      -2.18_10
  10.10011_2      10.10_2          10.10_2       10.11_2       10.10_2
  -1.00110_2      -1.00_2          -1.01_2       -1.00_2       -1.01_2
  -10.11100_2     -10.11_2         -11.00_2      -10.11_2      -11.00_2
  1.10100_2       1.10_2           1.10_2        1.11_2        1.10_2
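The four modes can be checked against the binary rows of the table with a short sketch. The function name round_fixed and the mode strings are made up for this illustration. It scales the value so that the desired precision becomes the integer part, and relies on the fact that Python's built-in round() on a Fraction rounds ties to the even integer.

```python
import math
from fractions import Fraction

def round_fixed(a, frac_bits, mode):
    """Round the exact value a to frac_bits binary fractional places."""
    scale = 2 ** frac_bits
    scaled = a * scale                 # desired precision is now the integer part
    if mode == "toward-zero":
        n = math.trunc(scaled)
    elif mode == "down":
        n = math.floor(scaled)
    elif mode == "up":
        n = math.ceil(scaled)
    elif mode == "to-even":
        n = round(scaled)              # ties go to the even integer
    else:
        raise ValueError(f"unknown mode: {mode}")
    return Fraction(n, scale)

a = -Fraction(0b1011100, 2 ** 5)       # -10.11100_2 = -2.875
for mode in ("toward-zero", "down", "up", "to-even"):
    print(mode, float(round_fixed(a, 2, mode)))
```

For -10.11100_2 this yields -2.75, -3.0, -2.75 and -3.0, that is -10.11_2, -11.00_2, -10.11_2 and -11.00_2.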

Fixed Point Arithmetic
Adapt integer arithmetic algorithms. Will illustrate for unsigned fixed point only.
Addition and Subtraction
Similar to integer addition/subtraction. Just align the radix points.
Example: 100.101_2 + 10.1101_2
    100.101      = 4.625
  +  10.1101     = 2.8125
  ----------
    111.0111     = 7.4375
  (binary points aligned)

Fixed Point Arithmetic
Multiplication
1. Ignore radix points; multiply as integers.
2. Insert radix point of product: no. of fractional places = sum of the no. of fractional places of the two operands.
Example: 11.01_2 × 0.101_2
          1 1.0 1     = 3.25
      ×   0.1 0 1     = 0.625
      -----------
          1 1 0 1
        0 0 0 0
      1 1 0 1
    0 0 0 0
    -------------
    1 0.0 0 0 0 1     = 2.03125
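The two multiplication steps translate directly into code. A minimal sketch, assuming unsigned operands given as binary strings; the function name fixed_mul is hypothetical.

```python
def fixed_mul(a, b):
    """Multiply two unsigned binary fixed-point strings, e.g. '11.01' and '0.101'."""
    def split(s):
        int_part, _, frac_part = s.partition(".")
        # Step 1: ignore the radix point and read the digits as one integer.
        return int(int_part + frac_part, 2), len(frac_part)

    ia, fa = split(a)
    ib, fb = split(b)
    product = ia * ib
    # Step 2: the product has fa + fb fractional places.
    places = fa + fb
    if places == 0:
        return format(product, "b")
    digits = format(product, "b").zfill(places + 1)
    return digits[:-places] + "." + digits[-places:]

print(fixed_mul("11.01", "0.101"))  # 10.00001
```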

Fixed Point Arithmetic
Division
1. Shift the radix point of the divisor right until the divisor is a whole integer.
2. Shift the radix point of the dividend right by the same number of positions.
3. Divide as in integer division.
4. The radix point of the quotient is in the same position as that of the dividend.
Example: 10.1101_2 ÷ 1.01_2       (2.8125_10 ÷ 1.25_10)
       = 1011.01_2 ÷ 101_2        (11.25_10 ÷ 5_10)
          10.01
        -------
  101 ) 1011.01
       -101
        ---
          01 01
          -1 01
          -----
              0
  Quotient = 10.01_2 = 2.25_10
May result in a quotient with a non-terminating fractional part; round to the desired number of fractional places.

Floating Point Numbers
Fixed point numbers can also be written in scientific notation, also referred to as floating point format (significand × base^exponent):
  Decimal: 975.673  = 9.75673 × 10^2       (significand 9.75673, exponent 2)
           0.000324 = 3.24 × 10^-4
  Binary:  1101.011 = 1.101011 × 2^3       (significand 1.101011, exponent 3)
           0.010111 = 1.0111 × 2^-2
The significand (a.k.a. mantissa) is normalized: exactly one nonzero digit/bit to the left of the decimal/binary point.
Allows for a more compact representation of real numbers than fixed point format.

Floating Point Representation
Most computers support the IEEE 754 standard for encoding floating point numbers:
  Single precision (32 bits): C type float
  Double precision (64 bits): C type double
Intel x86 processors also support an extended precision format (80 bits).

IEEE Single Precision FP Format
Normalized binary FP number: ±1.fraction × 2^exponent   (significand = 1.fraction)
Single precision FP format (32 bits):
  | s (1 bit) | b_exp (8 bits) | frac (23 bits) |
  Field   # Bits   Value                                      Remarks
  s       1        0 if number is positive; 1 if negative
  b_exp   8        exponent + bias,                           called the biased exponent
                   where bias = 2^(8-1) − 1 = 2^7 − 1 = 127
  frac    23       fractional part of significand             the 1 to the left of the binary point is not stored (hidden bit)

IEEE Single Precision FP Format
Problem: Find the single precision FP representation of 54.625_10.
Solution:
1. Convert to binary: 54.625_10 = 110110.101_2
2. Normalize: 110110.101_2 = 1.10110101_2 × 2^5
3. Map to single precision FP format:
   s = 0 (the number is positive)
   frac = 10110101000000000000000 (pad with zeros to make 23 bits)
   b_exp = 5 + 127 = 132 = 10000100_2
4. Answer: 0 10000100 10110101000000000000000
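One way to check a hand encoding like this is to let Python pack the value as a 32-bit float and split the bits into the three fields. The helper name float_fields is made up; struct.pack and struct.unpack are from the standard library.

```python
import struct

def float_fields(x):
    """Return (s, b_exp, frac) of x's IEEE single-precision encoding as bit strings."""
    # Pack as a big-endian 32-bit float, then reinterpret the bytes as an unsigned int.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    s = bits >> 31
    b_exp = (bits >> 23) & 0xFF
    frac = bits & 0x7FFFFF
    return format(s, "b"), format(b_exp, "08b"), format(frac, "023b")

print(float_fields(54.625))
# ('0', '10000100', '10110101000000000000000')
```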

IEEE Single Precision FP Format
FP numbers that can be represented in IEEE single precision format:
1. Normalized values
   Numbers of the form ±1.fraction × 2^exponent
   -126 ≤ exponent ≤ 127, so 1 ≤ b_exp ≤ 254
   Largest-magnitude positive/negative number = ±1.11...1_2 × 2^127
   Smallest-magnitude positive/negative normalized number = ±1.00...0_2 × 2^-126
   Observations on b_exp:
   - always positive
   - 00000000 (all zeros) and 11111111 (all 1's) are not used: these bit patterns are reserved for special values
   Bit pattern: | s | b_exp ≠ 0 and ≠ 255 | frac |

IEEE Single Precision FP Format
2. Denormalized values
   a. b_exp = 0 and frac = 0 represents the value ±0.0
      Note: there are two representations of zero.
      Bit pattern: | s | 00000000 | 00000000000000000000000 |
   b. b_exp = 0 and frac ≠ 0 represents a binary number of the form ±0.fraction × 2^-126
      Bit pattern: | s | 00000000 | frac ≠ 0 |
      Notes:
      - significand < 1 (the bit to the left of the binary point is 0)
      - the exponent of the binary number must be -126 (= 1 − bias)
      - allows representation of numbers smaller in magnitude than the smallest normalized number, ±1.00...0_2 × 2^-126

IEEE Single Precision FP Format
3. Special values
   a. b_exp = all 1's and frac = 0 represents the value ±∞. Typically used to represent results that overflow.
      Bit pattern: | s | 11111111 | 00000000000000000000000 |
   b. b_exp = all 1's and frac ≠ 0 represents NaN ("Not a Number"). Typically used to represent results that can't be represented as a real number (e.g., √-1).
      Bit pattern: | s | 11111111 | frac ≠ 0 |
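The three cases (normalized, denormalized, special) can be gathered into one decoding sketch. The function name decode_single is hypothetical; the fields are passed as plain integers, and math.ldexp(m, e) computes m × 2^e.

```python
import math

BIAS = 127

def decode_single(s, b_exp, frac):
    """Decode single-precision fields (given as integers) into a Python float."""
    sign = -1.0 if s else 1.0
    if b_exp == 0xFF:
        # Special values: infinity when frac == 0, NaN otherwise.
        return sign * math.inf if frac == 0 else math.nan
    if b_exp == 0:
        # Denormalized: 0.fraction x 2^(1 - bias); frac == 0 gives +/-0.0.
        return sign * math.ldexp(frac / 2 ** 23, 1 - BIAS)
    # Normalized: 1.fraction x 2^(b_exp - bias), restoring the hidden bit.
    return sign * math.ldexp(1 + frac / 2 ** 23, b_exp - BIAS)

print(decode_single(0, 0b10000100, 0b10110101 << 15))  # 54.625
print(decode_single(1, 0xFF, 0))                        # -inf
print(decode_single(0, 0, 1))                           # smallest positive denormalized value
```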

Why Use a Biased Representation?
The IEEE single precision FP format can be generalized to any number of exponent and fractional bits:
  ±1.fraction × 2^exponent
  | s (1 bit) | b_exp (k bits) | frac (n bits) |
For a k-bit biased exponent field:
- bias = 2^(k-1) − 1
- b_exp = exponent + bias
- The exponent of a normalized FP number is limited to [−(2^(k-1) − 2), (2^(k-1) − 1)]
- As a result, 1 ≤ b_exp ≤ (2^k − 2)
- As before, b_exp = all 0's and b_exp = all 1's are used to represent denormalized values and special values
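The formulas for a k-bit exponent field are easy to tabulate. In this small sketch the function name and the dictionary keys are illustrative only.

```python
def exponent_parameters(k):
    """Bias and exponent limits for a k-bit biased exponent field."""
    bias = 2 ** (k - 1) - 1
    return {
        "bias": bias,
        "min_exponent": 1 - bias,   # equals -(2^(k-1) - 2)
        "max_exponent": bias,       # equals 2^(k-1) - 1
        "max_b_exp": 2 ** k - 2,    # all 1's (2^k - 1) is reserved
    }

for k in (8, 11, 15):
    print(k, exponent_parameters(k))
```

With k = 8, 11 and 15 this gives the biases 127, 1023 and 16383 of the single, double and x86 extended formats.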

Why Use a Biased Representation?
By biasing the exponent, i.e. adding (2^(k-1) − 1) to the true exponent, the resulting biased exponent is always nonnegative and hence can be treated as an unsigned integer.
Comparing unsigned integers is easy. Treated as unsigned integers, which is larger: 10100111 or 10111010?
Compare bitwise starting from the left (msb). Stop at the first bit position where the numbers differ. The number with a 1 bit there is larger.
  1 0 1 0 0 1 1 1
  1 0 1 1 1 0 1 0   (larger)
Two numbers in IEEE FP format with the same sign can be compared using the same algorithm:
  0 10100111 01011100000000000000000   = +1.010111_2 × 2^40
  0 10111011 01101100000000000000000   = +1.011011_2 × 2^60
  so +1.010111_2 × 2^40 < +1.011011_2 × 2^60
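This property can be observed by reinterpreting the bits of two positive floats as unsigned integers. A sketch: single_bits is a made-up helper name, and math.ldexp is used only to build the two example values exactly.

```python
import math
import struct

def single_bits(x):
    """Return the 32-bit IEEE single-precision pattern of x as an unsigned int."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return bits

a = math.ldexp(0b1010111, 40 - 6)   # 1.010111_2 x 2^40
b = math.ldexp(0b1011011, 60 - 6)   # 1.011011_2 x 2^60

print(format(single_bits(a), "032b"))
print(format(single_bits(b), "032b"))
# For two positive values, integer order of the patterns matches numeric order.
print(single_bits(a) < single_bits(b), a < b)
```

The comparison only carries over this directly when both numbers have the same sign, as the slide states; negative values order in reverse.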

IEEE Double Precision FP Format
Normalized binary FP number: ±1.fraction × 2^exponent   (significand = 1.fraction)
Double precision FP format (64 bits):
  | s (1 bit) | b_exp (11 bits) | frac (52 bits) |
  Field   # Bits   Value                                          Remarks
  s       1        0 if number is positive; 1 if negative
  b_exp   11       exponent + bias,                               called the biased exponent
                   where bias = 2^(11-1) − 1 = 2^10 − 1 = 1023
  frac    52       fractional part of significand                 the 1 to the left of the binary point is not stored (hidden bit)

x86 Extended Precision
Normalized binary FP number: ±1.fraction × 2^exponent   (significand = 1.fraction)
Extended precision FP format (80 bits):
  | s (1 bit) | b_exp (15 bits) | 1.frac (64 bits, leading 1 stored explicitly) |
  Field   # Bits   Value                                            Remarks
  s       1        0 if number is positive; 1 if negative
  b_exp   15       exponent + bias,                                 called the biased exponent
                   where bias = 2^(15-1) − 1 = 2^14 − 1 = 16,383
  frac    64       entire significand 1.fraction                    no hidden bit!

Floating Point Arithmetic
Addition and Subtraction
1. Make exponents equal
2. Add/subtract significands
3. Normalize result
Why? Let A = a × 2^e1 and B = b × 2^e2, and suppose e1 < e2.
Then A can be rewritten as A = (a × 2^-(e2−e1)) × 2^e2
Therefore, A + B = ((a × 2^-(e2−e1)) + b) × 2^e2
That is, shift a right by (e2 − e1) places; then add to b.

Floating Point Arithmetic
Addition Example: IEEE single precision format
    s b_exp    frac
    0 01111101 00000000000000000000000     1.0_2 × 2^-2 = 0.25_10
  + 0 10000101 10010000000000000000000     1.1001_2 × 2^6 = 100.0_10
Don't forget the hidden bit! To simplify the illustration, let's show the hidden bit (the hidden bit plus frac form the significand):
    s b_exp    h frac
    0 01111101 1 00000000000000000000000     0.25_10
    0 10000101 1 10010000000000000000000   100.0_10

Floating Point Arithmetic
Addition Example, Cont.
    0 01111101 1 00000000000000000000000     0.25_10
  + 0 10000101 1 10010000000000000000000   100.0_10
1. Make exponents equal
   To leave the value unchanged:
   - Shift significand left by 1 bit ⇒ must decrease exponent by 1
   - Shift significand right by 1 bit ⇒ must increase exponent by 1
   Increase the smaller exponent to equal the larger exponent. Why? The significand is then shifted right, losing only least significant bits.
   Therefore, increase the exponent of 0.25_10, shifting its significand right by 10000101 − 01111101 = 00001000 = 8_10 places.

Floating Point Arithmetic
Addition Example, Cont.
Shift the significand of 0.25_10 right by 8 places. Note that the hidden bit is shifted into the msb of frac.
  s b_exp    h frac
  0 01111101 1 00000000000000000000000   original value
  0 01111110 0 10000000000000000000000   shift right 1 place
  0 01111111 0 01000000000000000000000   shift right 2 places
  0 10000000 0 00100000000000000000000   shift right 3 places
  0 10000001 0 00010000000000000000000   shift right 4 places
  0 10000010 0 00001000000000000000000   shift right 5 places
  0 10000011 0 00000100000000000000000   shift right 6 places
  0 10000100 0 00000010000000000000000   shift right 7 places
  0 10000101 0 00000001000000000000000   shift right 8 places

Floating Point Arithmetic
Addition Example, Cont.
2. Add significands
    0 10000101 0 00000001000000000000000     0.25_10
  + 0 10000101 1 10010000000000000000000   100.0_10
  --------------------------------------
    0 10000101 1 10010001000000000000000
3. Normalize result (already normalized; hide the hidden bit)
    0 10000101 10010001000000000000000     100.25_10
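The three addition steps can be traced on the integer fields. This is a simplified sketch, not a full IEEE adder: the function name fp_add is made up, both operands are assumed positive and normalized, bits shifted out are simply dropped (no rounding), and exponent overflow is not checked.

```python
def fp_add(b_exp1, frac1, b_exp2, frac2):
    """Add two positive normalized single-precision values given as (b_exp, frac)."""
    # Restore the hidden bit to get 24-bit significands.
    sig1 = (1 << 23) | frac1
    sig2 = (1 << 23) | frac2
    # Step 1: make exponents equal by shifting the smaller operand right.
    if b_exp1 < b_exp2:
        sig1 >>= b_exp2 - b_exp1
        b_exp = b_exp2
    else:
        sig2 >>= b_exp1 - b_exp2
        b_exp = b_exp1
    # Step 2: add significands.
    sig = sig1 + sig2
    # Step 3: normalize if the sum carried out past the hidden-bit position.
    if sig >> 24:
        sig >>= 1
        b_exp += 1
    # Hide the hidden bit again.
    return b_exp, sig & 0x7FFFFF

b_exp, frac = fp_add(0b01111101, 0, 0b10000101, 0b1001 << 19)  # 0.25 + 100.0
print(format(b_exp, "08b"), format(frac, "023b"))
# 10000101 10010001000000000000000
```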

Floating Point Arithmetic
Multiplication
1. Add exponents
2. Multiply significands
3. Normalize result
Why? Let A = a × 2^e1 and B = b × 2^e2
Then, A × B = (a × b) × 2^(e1+e2)

Floating Point Arithmetic
Multiplication Example: IEEE single precision format
    s b_exp    frac
    0 01111100 01000000000000000000000     1.01_2 × 2^-3 = 0.15625_10
  × 1 10000011 11000000000000000000000    -1.11_2 × 2^4 = -28.0_10
As before, let's show the hidden bit (the hidden bit plus frac form the significand):
    s b_exp    h frac
    0 01111100 1 01000000000000000000000     0.15625_10
    1 10000011 1 11000000000000000000000   -28.0_10

Floating Point Arithmetic
Multiplication Example, Cont.
1. Add true exponents
    b_exp1: 0 01111100 1 01000000000000000000000     0.15625_10
    b_exp2: 1 10000011 1 11000000000000000000000   -28.0_10
   Note that these are biased exponents:
     b_exp1 = true_exponent1 + 127  ⇒  true_exponent1 = b_exp1 − 127
     b_exp2 = true_exponent2 + 127  ⇒  true_exponent2 = b_exp2 − 127
   Now, true_exponent_result = true_exponent1 + true_exponent2. Therefore,
     b_exp_result = true_exponent_result + 127
                  = (b_exp1 − 127) + (b_exp2 − 127) + 127
                  = (b_exp1 + b_exp2) − 127
                  = (01111100 + 10000011) − 01111111
                  = 10000000

Floating Point Arithmetic
Multiplication Example, Cont.
2. Multiply significands
    0 01111100 1 01000000000000000000000     0.15625_10
    1 10000011 1 11000000000000000000000   -28.0_10
   significand_result = 1.01_2 × 1.11_2 = 10.0011_2
   sign_result = 1 (Why?)
   b_exp_result = 10000000 (from previous slide)
3. Normalize result
   Shift significand_result right by 1 bit ⇒ 1.00011_2 (hide the hidden bit in IEEE format!)
   Increase b_exp_result by 1 ⇒ 10000001
   Result: 1 10000001 00011000000000000000000   = -4.375_10
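The multiplication steps can be sketched on integer fields in the same way. Assumptions: fp_mul is a hypothetical name, both operands are normalized, low-order product bits are truncated rather than rounded, and exponent overflow/underflow is ignored. The sign is computed as the XOR of the operand signs.

```python
def fp_mul(s1, b_exp1, frac1, s2, b_exp2, frac2):
    """Multiply two normalized single-precision values given as integer fields."""
    s = s1 ^ s2                       # result is negative iff the signs differ
    # Step 1: add true exponents, i.e. add biased exponents and remove one bias.
    b_exp = b_exp1 + b_exp2 - 127
    # Step 2: multiply 24-bit significands; the product has 46 fractional bits.
    sig = ((1 << 23) | frac1) * ((1 << 23) | frac2)
    # Step 3: normalize. A product in [2, 4) has bit 47 set.
    if sig >> 47:
        sig >>= 1
        b_exp += 1
    # Keep the top 23 fractional bits and hide the hidden bit.
    return s, b_exp, (sig >> 23) & 0x7FFFFF

s, b_exp, frac = fp_mul(0, 0b01111100, 0b01 << 21, 1, 0b10000011, 0b11 << 21)
print(s, format(b_exp, "08b"), format(frac, "023b"))
# 1 10000001 00011000000000000000000
```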

Floating Point Arithmetic
Division
1. Subtract exponents
2. Divide significands
3. Normalize result
Why? Let A = a × 2^e1 and B = b × 2^e2
Then, A / B = (a / b) × 2^(e1−e2)

Floating Point Arithmetic
Division Example: IEEE single precision format
    s b_exp    frac
    0 10000110 00011000000000000000000     1.00011_2 × 2^7 = 140.0_10
  ÷ 0 01111101 11000000000000000000000     1.11_2 × 2^-2 = 0.4375_10
As before, let's show the hidden bit (the hidden bit plus frac form the significand):
    s b_exp    h frac
    0 10000110 1 00011000000000000000000   140.0_10
    0 01111101 1 11000000000000000000000     0.4375_10

Floating Point Arithmetic
Division Example, Cont.
1. Subtract true exponents
    b_exp1: 0 10000110 1 00011000000000000000000   140.0_10
    b_exp2: 0 01111101 1 11000000000000000000000     0.4375_10
   Note that these are biased exponents:
     b_exp1 = true_exponent1 + 127  ⇒  true_exponent1 = b_exp1 − 127
     b_exp2 = true_exponent2 + 127  ⇒  true_exponent2 = b_exp2 − 127
   Now, true_exponent_result = true_exponent1 − true_exponent2. Therefore,
     b_exp_result = true_exponent_result + 127
                  = (b_exp1 − 127) − (b_exp2 − 127) + 127
                  = (b_exp1 − b_exp2) + 127
                  = (10000110 − 01111101) + 01111111
                  = 10001000

Floating Point Arithmetic
Division Example, Cont.
2. Divide significands
    0 10000110 1 00011000000000000000000   140.0_10
    0 01111101 1 11000000000000000000000     0.4375_10
   significand_result = 1.00011_2 ÷ 1.11_2 = 0.101_2
   sign_result = 0
   b_exp_result = 10001000 (from previous slide)
3. Normalize result
   Shift significand_result left by 1 bit ⇒ 1.01_2 (hide the hidden bit in IEEE format!)
   Decrease b_exp_result by 1 ⇒ 10000111
   Result: 0 10000111 01000000000000000000000   = 320.0_10
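Division follows the same pattern. This sketch makes the same simplifying assumptions as before: fp_div is a made-up name, operands are normalized, the quotient is truncated rather than rounded, and exponent range checks are omitted.

```python
def fp_div(s1, b_exp1, frac1, s2, b_exp2, frac2):
    """Divide two normalized single-precision values given as integer fields."""
    s = s1 ^ s2
    # Step 1: subtract true exponents, i.e. subtract biased exponents and add one bias.
    b_exp = b_exp1 - b_exp2 + 127
    sig1 = (1 << 23) | frac1
    sig2 = (1 << 23) | frac2
    # Step 2: divide significands, keeping 23 fractional bits in the quotient.
    q = (sig1 << 23) // sig2          # quotient lies in (0.5, 2)
    # Step 3: normalize. A quotient below 1 is shifted left by one bit.
    if q < (1 << 23):
        q = (sig1 << 24) // sig2
        b_exp -= 1
    return s, b_exp, q & 0x7FFFFF

s, b_exp, frac = fp_div(0, 0b10000110, 0b11 << 18, 0, 0b01111101, 0b11 << 21)
print(s, format(b_exp, "08b"), format(frac, "023b"))
# 0 10000111 01000000000000000000000
```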