IEEE-754 floating-point


Real and floating-point numbers

- Real numbers R form a continuum
  - Rational numbers are a subset of the reals
  - Some numbers are irrational, e.g. π
- Floating-point numbers are an approximation of real numbers
  - If finite in length, they are a subset of the rationals
  - Consist of a sign, a significant-digits part (the mantissa or significand), and an exponent of the base (people usually use base 10)

Floating Point

- Floating-point numbers are represented by:
  - a sign
  - a significand or mantissa
  - an exponent
- Fields, in order: sign part, exponent part, significand part
- Sign is easy
  - 0: number is positive
  - 1: number is negative
  - numerically, factor = $(-1)^{\text{sign}}$
- Significand and exponent have structure

Significand

- Floating-point numbers are normalized
  - Represent as binary (fixed-point) number
  - Multiply by positive or negative power of 2, such that there is a single 1 bit to the left of the radix point
- Example:
  - $14.5_{10} = 1110.1000_2 = 1.1101000_2 \times 2^{3}$
- The leftmost bit (to the left of the radix point) is always 1, so it doesn't need to be stored
  - The 1 is hidden or implicit
  - Store 1101000 as the significand
- Example 2:
  - $0.3125_{10} = 0.0101_2 = (1.)0100000_2 \times 2^{-2}$
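The normalization step can be sketched in Python as follows. This is a minimal illustration, not part of the original slides: `normalize` is a hypothetical helper name, and it relies on the standard-library `math.frexp`, which returns a fraction in [0.5, 1) that is then rescaled to [1, 2).

```python
import math

def normalize(x):
    """Return (significand, exponent) with 1 <= significand < 2
    and x == significand * 2**exponent, for finite x > 0."""
    if not (x > 0 and math.isfinite(x)):
        raise ValueError("expected a finite positive number")
    m, e = math.frexp(x)      # x == m * 2**e with 0.5 <= m < 1
    return m * 2, e - 1       # shift one bit so that 1 <= significand < 2

print(normalize(14.5))    # (1.8125, 3), and 1.8125 is 1.1101 in binary
print(normalize(0.3125))  # (1.25, -2), and 1.25 is 1.01 in binary
```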

Exponent

- Exponent is a power of 2
- Exponents can be positive or negative
- Exponents are stored in Excess-N notation
  - N is typically $2^{m-1} - 1$ for m-bit storage
- Example: $2^{3}$ in 5 bits
  - Excess-$(2^{5-1} - 1)$ = Excess-15
  - $3 + 15 = 18_{10} = 10010_2$
- Example: $2^{-2}$ in 8 bits
  - Excess-$(2^{8-1} - 1)$ = Excess-127
  - $-2 + 127 = 125_{10} = 01111101_2$
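The Excess-N arithmetic can be checked with a short Python sketch. The function names are hypothetical, and the bias is assumed to be $2^{m-1} - 1$ as stated in the slide.

```python
def excess_encode(exponent, bits):
    """Encode a signed exponent in Excess-N notation, N = 2**(bits-1) - 1."""
    bias = 2 ** (bits - 1) - 1
    stored = exponent + bias
    if not 0 <= stored < 2 ** bits:
        raise ValueError("exponent does not fit in the field")
    return format(stored, f"0{bits}b")

def excess_decode(field):
    """Inverse of excess_encode: bit string back to signed exponent."""
    bias = 2 ** (len(field) - 1) - 1
    return int(field, 2) - bias

print(excess_encode(3, 5))    # 10010
print(excess_encode(-2, 8))   # 01111101
```

One consequence of this encoding is that the stored fields sort in the same order as the exponents they represent, when compared as unsigned integers.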

IEEE-754

- A standard for representing floating-point (f.p.) numbers in computer systems
  - Three binary formats, two decimal formats
  - additional "storage" formats
  - adopted in 1985, updated in 2008
  - many operational details
- All formats share some characteristics
  - Normalized
  - Implicit MSb
  - Sign-magnitude representation for significand
  - Excess-N representation for exponent
  - Special values for exceptional cases

Formats

- binary16: "Half-precision", storage only
- binary32: "Single precision"
- binary64: "Double precision"
- binary128: "Quadruple precision"
- decimal32: storage only
- decimal64
- decimal128
- Decimal formats are new to the 2008 revision
  - IBM z-systems implement these formats

IEEE-754 Binary Formats

Examples

- 1, in Binary16: 0 01111 0000000000 = 0x3c00
  - sign bit: 0
  - exponent: 0, stored as $0 + 15 = 15_{10} = 01111_2$
  - significand: $0000000000_2$ (leftmost 1-bit is implicit)
- -2, in Binary32: 1 10000000 00000000000000000000000 = 0xc000 0000
  - sign bit: 1
  - exponent: 1, stored as $1 + 127 = 128_{10} = 10000000_2$
  - significand: $00000000000000000000000_2$
- 0.3125, in Binary32: 0 01111101 01000000000000000000000 = 0x3ea0 0000
  - sign bit: 0
  - exponent: -2, stored as $-2 + 127 = 125_{10} = 01111101_2$
  - significand: $01000000000000000000000_2$
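These three encodings can be reproduced with Python's standard `struct` module, whose `e` and `f` format codes pack binary16 and binary32 values. The helper name `encode_hex` is an illustrative assumption; the `>` prefix requests big-endian output so the hex digits read in the same order as the bit fields.

```python
import struct

def encode_hex(value, fmt):
    """Hex digits of value packed as 'e' (binary16), 'f' (binary32)
    or 'd' (binary64), most significant byte first."""
    return struct.pack(">" + fmt, value).hex()

print(encode_hex(1.0, "e"))      # 3c00
print(encode_hex(-2.0, "f"))     # c0000000
print(encode_hex(0.3125, "f"))   # 3ea00000
```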

Exceptional Values: small

- Exponent = all 0s
  - 0 significand: true zero
    - positive and negative 0 are both legal
  - non-zero significand: values are subnormal or denormalized
    - no implicit one bit
    - trade off precision for smaller exponents
- Binary16 examples:
  - 0 00000 0000000000 = +0, true (positive) zero
  - 1 00000 1111111111 = $-0.1111111111_2 \times 2^{-14}$, the largest-magnitude (negative) subnormal
  - 0 00000 0000000001 = $0.0000000001_2 \times 2^{-14} = 1 \times 2^{-24}$, the smallest positive number in Binary16
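As a rough check of the subnormal rule, the sketch below compares the formula $0.f \times 2^{-14}$ against the binary16 decoding built into Python's `struct` module (format code `e`). Both function names are hypothetical, and `subnormal_formula` is only meaningful for patterns whose exponent field is all 0s.

```python
import struct

def binary16_value(pattern):
    """Decode a 16-bit pattern with struct's built-in binary16 support."""
    return struct.unpack(">e", pattern.to_bytes(2, "big"))[0]

def subnormal_formula(pattern):
    """Value of a pattern whose exponent field is all 0s: 0.f * 2**-14."""
    sign = -1.0 if pattern & 0x8000 else 1.0
    fraction = pattern & 0x03FF          # low 10 bits
    return sign * (fraction / 1024) * 2.0 ** -14

# 0x0001 is 0 00000 0000000001 and 0x83FF is 1 00000 1111111111
for pattern in (0x0000, 0x0001, 0x83FF):
    print(hex(pattern), binary16_value(pattern), subnormal_formula(pattern))
```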

Exceptional Values: large

- Exponent = all 1s
  - 0 significand: positive or negative infinity
  - non-zero significand: NaN (Not a Number), an indication of an error condition
    - e.g. zero divided by zero
- Binary16 examples:
  - 1 11111 0000000000 = negative infinity, $-\infty$
  - 0 11111 1000000000 = quiet NaN, e.g. 0/0
    - indeterminate values; the sign doesn't matter
  - 0 11111 0100000000 = signaling NaN
    - invalid operations, e.g. a machine exception
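A small classifier makes the all-1s exponent cases concrete. This is a sketch under one assumption worth stating: the most significant significand bit is treated as the quiet/signaling flag, which matches the examples above and the IEEE 754-2008 recommendation, though some older hardware used the opposite convention. `classify_binary16` is a hypothetical name.

```python
def classify_binary16(pattern):
    """Classify a 16-bit pattern by its exponent and significand fields."""
    exponent = (pattern >> 10) & 0x1F
    fraction = pattern & 0x3FF
    if exponent != 0x1F:
        return "finite"
    if fraction == 0:
        return "-infinity" if pattern & 0x8000 else "+infinity"
    # Assumed convention: top significand bit set means quiet NaN.
    return "quiet NaN" if fraction & 0x200 else "signaling NaN"

print(classify_binary16(0xFC00))  # -infinity      (1 11111 0000000000)
print(classify_binary16(0x7E00))  # quiet NaN      (0 11111 1000000000)
print(classify_binary16(0x7D00))  # signaling NaN  (0 11111 0100000000)
```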

Binary32 Format Again

- Fields: sign (1 bit), exponent (8 bits), significand (23 bits)
  - sign: 1 if negative
  - exponent: Excess-127 notation, range -126 to +127
  - significand: normalized to $1 \le \text{value} < 2$, leftmost 1 bit not represented
- All 0s in the exponent and significand fields represent ±0
- Other values with all 0s in the exponent field (looks like -127) are subnormal or denormalized values
  - exponent is -126
  - hidden bit is 0
- Values with all 1s in the exponent field (looks like 128) and significand 0 (all 0 bits) represent ±infinity
- Other values with all 1s in the exponent field represent NaNs, "Not a Number" values
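The rules above translate fairly directly into a decoder. The following is a minimal Python sketch, with `decode_binary32` as a hypothetical name; it takes the raw 32-bit pattern as an integer and does not distinguish quiet from signaling NaNs.

```python
import math

def decode_binary32(pattern):
    """Turn a 32-bit pattern into its value, following the field rules."""
    sign = -1.0 if pattern >> 31 else 1.0
    exponent = (pattern >> 23) & 0xFF      # 8-bit Excess-127 field
    fraction = pattern & 0x7FFFFF          # 23 stored significand bits
    if exponent == 0xFF:                   # all 1s: infinity or NaN
        return sign * math.inf if fraction == 0 else math.nan
    if exponent == 0:                      # all 0s: zero or subnormal
        return sign * (fraction / 2 ** 23) * 2.0 ** -126
    # Normal number: restore the hidden 1 bit and remove the bias.
    return sign * (1 + fraction / 2 ** 23) * 2.0 ** (exponent - 127)

print(decode_binary32(0xC0000000))  # -2.0
print(decode_binary32(0x3EA00000))  # 0.3125
```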

C types and IEEE-754

- C's float datatype generally uses "single precision", a.k.a. Binary32
  - about 7 decimal digits of precision
  - dynamic range roughly $10^{-45}$ to $10^{+38}$
- C's double datatype generally uses "double precision", a.k.a. Binary64
  - about 15 decimal digits of precision
  - dynamic range roughly $10^{-324}$ to $10^{+308}$
  - double frequently used for scientific calculations
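The precision gap between the two formats can be observed from Python, whose `float` is Binary64 on typical platforms (an assumption, though it holds for mainstream CPython builds). The hypothetical helper `as_binary32` rounds a value to Binary32 by packing and unpacking it with `struct`.

```python
import struct
import sys

def as_binary32(x):
    """Round a Python float to the nearest Binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

# 2**24 + 1 needs 25 significant bits, one more than Binary32 keeps.
print(as_binary32(16777217.0))        # 16777216.0
print(as_binary32(0.1) == 0.1)        # False
print(sys.float_info.dig)             # decimal digits a double preserves
print(2.0 ** -149, sys.float_info.max)  # smallest float, largest double
```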

Show the Bits in Binary32
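A minimal sketch of such a bit display, written in Python; `show_bits` is a hypothetical helper name. It packs the value as Binary32, reinterprets the same four bytes as an unsigned integer, and splits the 32 bits into the sign, exponent and significand fields.

```python
import struct

def show_bits(x):
    """Return the Binary32 encoding of x as 'sign exponent significand'."""
    (pattern,) = struct.unpack(">I", struct.pack(">f", x))
    bits = format(pattern, "032b")
    return f"{bits[0]} {bits[1:9]} {bits[9:]}"

print(show_bits(-2.0))     # 1 10000000 00000000000000000000000
print(show_bits(0.3125))   # 0 01111101 01000000000000000000000
```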

Intel Processors: "Endian"-ness

- Intel, AMD processors are "Little-endian"
  - Core i7, Opteron, etc.
- Little-endian: Least Significant Byte (LSB) stored in lowest memory address
- Big-endian: LSB stored in highest memory address
  - Most Significant Byte (MSB) stored in lowest memory address
- Multi-byte values are affected by the endianness
  - That's everything except characters

A routine to inspect endianness
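One possible form of such a routine, sketched in Python: store the integer 1 in native byte order and look at the byte at the lowest address. The name `byte_order` is an assumption; `sys.byteorder` from the standard library reports the same information directly.

```python
import struct
import sys

def byte_order():
    """Report host endianness from the in-memory layout of the integer 1."""
    first_byte = struct.pack("=I", 1)[0]   # '=' selects native byte order
    # Little-endian machines put the least significant byte (0x01) first.
    return "little" if first_byte == 1 else "big"

print(byte_order(), sys.byteorder)
```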

Floating-Point -1.0 in 32-bit Intel Memory:

Memory address    Contents (hex)   Contents (binary)
0x7fff1b5a4360    0x00             0000_0000
0x7fff1b5a4361    0x00             0000_0000
0x7fff1b5a4362    0x80             1000_0000
0x7fff1b5a4363    0xbf             1011_1111

f.p. value -1.0 is sign 1, exponent 0x7f, significand 0x000000
= 1 0111_1111 000_0000_0000_0000_0000_0000

Floating-Point -2.0 in 32-bit Intel Memory:

Memory address    Contents (hex)   Contents (binary)
0x7fff1b5a4360    0x00             0000_0000
0x7fff1b5a4361    0x00             0000_0000
0x7fff1b5a4362    0x00             0000_0000
0x7fff1b5a4363    0xc0             1100_0000

f.p. value -2.0 is sign 1, exponent 0x80, significand 0x000000
= 1 1000_0000 000_0000_0000_0000_0000_0000

Floating-Point 8.5 in 32-bit Intel Memory:

Memory address    Contents (hex)   Contents (binary)
0x7fff1b5a4360    0x00             0000_0000
0x7fff1b5a4361    0x00             0000_0000
0x7fff1b5a4362    0x08             0000_1000
0x7fff1b5a4363    0x41             0100_0001

f.p. value 8.5 is sign 0, exponent 0x82, significand 0x080000
= 0 1000_0010 000_1000_0000_0000_0000_0000

Floating-Point 8.99 in 32-bit Intel Memory:

Memory address    Contents (hex)   Contents (binary)
0x7fff1b5a4360    0x0a             0000_1010
0x7fff1b5a4361    0xd7             1101_0111
0x7fff1b5a4362    0x0f             0000_1111
0x7fff1b5a4363    0x41             0100_0001

f.p. value 8.99 is sign 0, exponent 0x82, significand 0x0fd70a
= 0 1000_0010 000_1111_1101_0111_0000_1010
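The byte layouts for -1.0, -2.0, 8.5 and 8.99 can be reproduced without an Intel machine by asking `struct` for little-endian output explicitly. This is an illustrative sketch; `memory_bytes` is a hypothetical name, and the list it returns runs from the lowest address to the highest.

```python
import struct

def memory_bytes(x):
    """Bytes of x stored as Binary32 on a little-endian machine,
    listed from lowest to highest memory address."""
    return [f"0x{b:02x}" for b in struct.pack("<f", x)]

for value in (-1.0, -2.0, 8.5, 8.99):
    print(value, memory_bytes(value))
# The last line printed is: 8.99 ['0x0a', '0xd7', '0x0f', '0x41']
```

Reading each list right to left gives the familiar big-endian hex form, e.g. 0x410fd70a for 8.99.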