C NUMERIC FORMATS. Overview. IEEE Single-Precision Floating-point Data Format. Figure C-0. Table C-0. Listing C-0.

Similar documents
COMP2611: Computer Organization. Data Representation

ECE232: Hardware Organization and Design

CO212 Lecture 10: Arithmetic & Logical Unit

FLOATING POINT NUMBERS

The Sign consists of a single bit. If this bit is '1', then the number is negative. If this bit is '0', then the number is positive.

Floating-point representations

Floating-point representations

Number Systems Standard positional representation of numbers: An unsigned number with whole and fraction portions is represented as:

Floating Point (with contributions from Dr. Bin Ren, William & Mary Computer Science)

Floating-point Arithmetic. where you sum up the integer to the left of the decimal point and the fraction to the right.

Floating Point Numbers

Floating Point. The World is Not Just Integers. Programming languages support numbers with fraction

The ALU consists of combinational logic. Processes all data in the CPU. ALL von Neuman machines have an ALU loop.

Inf2C - Computer Systems Lecture 2 Data Representation

Floating Point Arithmetic

Floating-Point Data Representation and Manipulation 198:231 Introduction to Computer Organization Lecture 3

Numeric Encodings Prof. James L. Frankel Harvard University

Representing and Manipulating Floating Points

CSCI 402: Computer Architectures. Arithmetic for Computers (3) Fengguang Song Department of Computer & Information Science IUPUI.

Chapter 2 Float Point Arithmetic. Real Numbers in Decimal Notation. Real Numbers in Decimal Notation

Computer Architecture Chapter 3. Fall 2005 Department of Computer Science Kent State University

CPE 323 REVIEW DATA TYPES AND NUMBER REPRESENTATIONS IN MODERN COMPUTERS

CPE 323 REVIEW DATA TYPES AND NUMBER REPRESENTATIONS IN MODERN COMPUTERS

Floating Point Puzzles. Lecture 3B Floating Point. IEEE Floating Point. Fractional Binary Numbers. Topics. IEEE Standard 754

Representing and Manipulating Floating Points

Computer Architecture and IC Design Lab. Chapter 3 Part 2 Arithmetic for Computers Floating Point

EE 109 Unit 19. IEEE 754 Floating Point Representation Floating Point Arithmetic

Module 2: Computer Arithmetic

Number Systems. Decimal numbers. Binary numbers. Chapter 1 <1> 8's column. 1000's column. 2's column. 4's column


Foundations of Computer Systems

Representing and Manipulating Floating Points

CS 101: Computer Programming and Utilization

Floating Point January 24, 2008

Computer Organization: A Programmer's Perspective

Floating point. Today! IEEE Floating Point Standard! Rounding! Floating Point Operations! Mathematical properties. Next time. !

Floating Point Numbers

Divide: Paper & Pencil

Floating Point Numbers. Lecture 9 CAP

Chapter 5 : Computer Arithmetic

Representing and Manipulating Floating Points. Computer Systems Laboratory Sungkyunkwan University

Chapter 2 Data Representations

Computer Arithmetic Ch 8

Computer Arithmetic Ch 8

UNIT-III COMPUTER ARTHIMETIC

Finite arithmetic and error analysis

Systems I. Floating Point. Topics IEEE Floating Point Standard Rounding Floating Point Operations Mathematical properties

Floating Point Arithmetic. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

COMP Overview of Tutorial #2

Floating Point Puzzles The course that gives CMU its Zip! Floating Point Jan 22, IEEE Floating Point. Fractional Binary Numbers.

Operations On Data CHAPTER 4. (Solutions to Odd-Numbered Problems) Review Questions

Representing and Manipulating Floating Points. Jo, Heeseung

Number Systems. Binary Numbers. Appendix. Decimal notation represents numbers as powers of 10, for example

Number Systems. Both numbers are positive

Data Representations & Arithmetic Operations

Kinds Of Data CHAPTER 3 DATA REPRESENTATION. Numbers Are Different! Positional Number Systems. Text. Numbers. Other

World Inside a Computer is Binary

Floating Point Puzzles. Lecture 3B Floating Point. IEEE Floating Point. Fractional Binary Numbers. Topics. IEEE Standard 754

15213 Recitation 2: Floating Point

CS 261 Fall Floating-Point Numbers. Mike Lam, Professor.

Signed Multiplication Multiply the positives Negate result if signs of operand are different

System Programming CISC 360. Floating Point September 16, 2008

Floating point. Today. IEEE Floating Point Standard Rounding Floating Point Operations Mathematical properties Next time.

Organisasi Sistem Komputer

COMPUTER ORGANIZATION AND ARCHITECTURE

Introduction to Computers and Programming. Numeric Values

Floating Point Representation in Computers

Data Representation Floating Point

CS 261 Fall Floating-Point Numbers. Mike Lam, Professor.

10.1. Unit 10. Signed Representation Systems Binary Arithmetic

3.5 Floating Point: Overview

Data Representation Floating Point

Giving credit where credit is due

Giving credit where credit is due

Data Representation Floating Point

Computer (Literacy) Skills. Number representations and memory. Lubomír Bulej KDSS MFF UK

IT 1204 Section 2.0. Data Representation and Arithmetic. 2009, University of Colombo School of Computing 1

Floating Point Arithmetic

CS 33. Data Representation (Part 3) CS33 Intro to Computer Systems VIII 1 Copyright 2018 Thomas W. Doeppner. All rights reserved.

4 MAC INSTRUCTIONS. Figure 4-0. Table 4-0. Listing 4-0.

LogiCORE IP Floating-Point Operator v6.2

EE 109 Unit 20. IEEE 754 Floating Point Representation Floating Point Arithmetic

VTU NOTES QUESTION PAPERS NEWS RESULTS FORUMS Arithmetic (a) The four possible cases Carry (b) Truth table x y

Floating Point : Introduction to Computer Systems 4 th Lecture, May 25, Instructor: Brian Railing. Carnegie Mellon

Computer Arithmetic Floating Point

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Digital Computer Arithmetic ECE 666

MIPS Integer ALU Requirements

Floating Point Numbers

Floating Point Numbers

Course Schedule. CS 221 Computer Architecture. Week 3: Plan. I. Hexadecimals and Character Representations. Hexadecimal Representation

Chapter 4. Operations on Data

Instruction Set extensions to X86. Floating Point SIMD instructions

±M R ±E, S M CHARACTERISTIC MANTISSA 1 k j

CHAPTER V NUMBER SYSTEMS AND ARITHMETIC

Bryant and O Hallaron, Computer Systems: A Programmer s Perspective, Third Edition. Carnegie Mellon

Floating Point. CSE 238/2038/2138: Systems Programming. Instructor: Fatma CORUT ERGİN. Slides adapted from Bryant & O Hallaron s slides

Co-processor Math Processor. Richa Upadhyay Prabhu. NMIMS s MPSTME February 9, 2016

Topics. 6.1 Number Systems and Radix Conversion 6.2 Fixed-Point Arithmetic 6.3 Seminumeric Aspects of ALU Design 6.4 Floating-Point Arithmetic

Integers and Floating Point

CPE300: Digital System Architecture and Design

Transcription:

C NUMERIC FORMATS Figure C-. Table C-. Listing C-. Overview The DSP supports the 32-bit single-precision floating-point data format defined in the IEEE Standard 754/854. In addition, the DSP supports an extended-precision version of the same format with eight additional bits in the mantissa (4 bits total). The DSP also supports 32-bit fixed-point formats fractional and integer which can be signed (twos-complement) or unsigned. IEEE Single-Precision Floating-point Data Format IEEE Standard 754/854 specifies a 32-bit single-precision floating-point format, shown in Figure C-1. A number in this format consists of a sign bit s, a 24-bit significand, and an 8-bit unsigned-magnitude exponent e. For normalized numbers, the significand consists of a 23-bit fraction f and a hidden bit of 1 that is implicitly presumed to precede f 22 in the significand. The binary point is presumed to lie between this hidden bit and f 22. The least significant bit (LSB) of the fraction is f ; the LSB of the exponent is e. The hidden bit effectively increases the precision of the floating-point significand to 24 bits from the 23 bits actually stored in the data format. It also insures that the significand of any number in the IEEE normalized number format is always greater than or equal to 1 and less than 2. ADSP-2116 SHARC DSP Hardware Reference C-1

IEEE Single-Precision Floating-point Data Format The unsigned exponent e can range between 1 e 254 for normal numbers in the single-precision format. This exponent is biased by +127 (254 2). To calculate the true unbiased exponent, 127 must be subtracted from e. 31 3 23 22 s e e 1. f f 7 22 Hidden Bit Binary Point Figure C-1. IEEE 32-Bit Single-Precision Floating-Point Format The IEEE Standard also provides for several special data types in the single-precision floating-point format: An exponent value of 255 (all ones) with a nonzero fraction is a Not-A-Number (NAN). NANs are usually used as flags for data flow control, for the values of uninitialized variables, and for the results of invalid operations such as *. Infinity is represented as an exponent of 255 and a zero fraction. Note that because the fraction is signed, both positive and negative Infinity can be represented. Zero is represented by a zero exponent and a zero fraction. As with Infinity, both positive Zero and negative Zero can be represented. C-2 ADSP-2116 SHARC DSP Hardware Reference

Numeric Formats The IEEE single-precision floating-point data types supported by the DSP and their interpretations are summarized in Table C-1. Table C-1. IEEE Single-Precision Floating-Point Data Types Type Exponent Fraction Value NAN 255 Nonzero Undefined Infinity 255 ( 1) s Infinity Normal 1 e 254 Any ( 1) s (1.f 22- ) 2 e 127 Zero ( 1) s Zero Extended Precision Floating-point Format The extended precision floating-point format is 4 bits wide, with the same 8-bit exponent as in the standard format but a 32-bit significand. This format is shown in Figure C-2. In all other respects, the extended floating-point format is the same as the IEEE standard format. Short Word Floating-point Format The DSP supports a 16-bit floating-point data type and provides conversion instructions for it. The short float data format has an 11-bit mantissa with a four-bit exponent plus sign bit, as shown in Figure C-3. The 16-bit floating-point numbers reside in the lower 16 bits of the 32-bit floating-point field. ADSP-2116 SHARC DSP Hardware Reference C-3

Packing For Floating-point Data 39 38 31 3 s e e 1. f f 7 3 Hidden Bit Binary Point Figure C-2. 4-Bit Extended-Precision Floating-Point Format 15 14 11 1 s e e 1. f f 3 1 Hidden Bit Binary Point Figure C-3. 16-Bit Floating-Point Format Packing For Floating-point Data Two shifter instructions, FPACK and FUNPACK, perform the packing and unpacking conversions between 32-bit floating-point words and 16-bit floating-point words. The FPACK instruction converts a 32-bit IEEE floating-point number to a 16-bit floating-point number. FUN- PACK converts the 16-bit floating-point numbers back to 32-bit IEEE floating-point. Each instruction executes in a single cycle. The results of C-4 ADSP-2116 SHARC DSP Hardware Reference

Numeric Formats the FPACK and FUNPACK operations appear in Table C-2 and Table C-3. Table C-2. FPACK Operations Condition Result 135 < exp Largest magnitude representation. 12 < exp 135 Exponent is MSB of source exponent concatenated with the three LSBs of source exponent. The packed fraction is the rounded upper 11 bits of the source fraction. 19 < exp 12 Exponent=. Packed fraction is the upper bits (source exponent 11) of the source fraction prefixed by zeros and the hidden 1. The packed fraction is rounded. exp < 11 Packed word is all zeros. exp = source exponent sign bit remains the same in all cases Table C-3. FUNPACK Operations Condition Result < exp 15 Exponent is the 3 LSBs of the source exponent prefixed by the MSB of the source exponent and four copies of the complement of the MSB. The unpacked fraction is the source fraction with 12 zeros appended. exp = Exponent is (12 N) where N is the number of leading zeros in the source fraction. The unpacked fraction is the remainder of the source fraction with zeros appended to pad it and the hidden 1 stripped away. exp = source exponent sign bit remains the same in all cases The short float type supports gradual underflow. This method sacrifices precision for dynamic range. When packing a number which would have ADSP-2116 SHARC DSP Hardware Reference C-5

Fixed-point Formats underflowed, the exponent is set to zero and the mantissa (including hidden 1) is right-shifted the appropriate amount. The packed result is a denormal which can be unpacked into a normal IEEE floating-point number. During the FPACK operation, an overflow will set the SV condition and non-overflow will clear it. During the FUNPACK operation, the SV condition will be cleared. The SZ and SS conditions are cleared by both instructions. Fixed-point Formats The DSP supports two 32-bit fixed-point formats: fractional and integer. In both formats, numbers can be signed (twos-complement) or unsigned. The four possible combinations are shown in Figure C-4. In the fractional format, there is an implied binary point to the left of the most significant magnitude bit. In integer format, the binary point is understood to be to the right of the LSB. Note that the sign bit is negatively weighted in a twos-complement format. ALU outputs always have the same width and data format as the inputs. The multiplier, however, produces a 64-bit product from two 32-bit inputs. If both operands are unsigned integers, the result is a 64-bit unsigned integer. If both operands are unsigned fractions, the result is a 64-bit unsigned fraction. These formats are shown in Figure C-5. If one operand is signed and the other unsigned, the result is signed. If both inputs are signed, the result is signed and automatically shifted left one bit. The LSB becomes zero and bit 62 moves into the sign bit position. Normally bit 63 and bit 62 are identical when both operands are signed. (The only exception is full-scale negative multiplied by itself.) Thus, the left shift normally removes a redundant sign bit, increasing the precision of the most significant product. Also, if the data format is fractional, a single-bit left shift renormalizes the MSP to a fractional format. C-6 ADSP-2116 SHARC DSP Hardware Reference

Numeric Formats The signed formats with and without left shifting are shown in Figure C-6. The multiplier has an 8-bit accumulator to allow the accumulation of 64- bit products. For more information on the multiplier and accumulator, see Multiply Accumulator (Multiplier) on page 2-13. ADSP-2116 SHARC DSP Hardware Reference C-7

Fixed-point Formats 31 3 29 SIGNED INTEGER -2 31 2 3 2 29 2 2 2 SIGN POINT SIGNED FRACTIONAL 31 3 29-2 - 2-1 2-2 2-29 2-3 2-31 SIGN POINT 31 3 29 UNSIGNED INTEGER 2 31 2 3 2 29 2 2 2 POINT 31 3 29 UNSIGNED FRACTIONAL 2-1 2-2 2-3 2-3 2-31 2-32 POINT Figure C-4. 32-Bit Fixed-Point Formats C-8 ADSP-2116 SHARC DSP Hardware Reference

Numeric Formats UNSIGNED INTEGER 63 62 61 2 63 2 62 2 61 2 2 2 PO INT UNSIGNED FRACTIONAL 63 62 61 2-1 2-2 2-3 2-62 2-63 2-64 PO INT Figure C-5. 64-Bit Unsigned Fixed-Point Product ADSP-2116 SHARC DSP Hardware Reference C-9

Fixed-point Formats SIGNED INTEGER, NO LEFT SHIFT 63 62 61-2 63 2 3 62 2 29 62 2 2 2 SIGN POINT SIGN ED FRACTIO NA L, NO LEFT SHIFT 63 62 61-2 2-1 -2-2 2-1 2-6 -29 2-3 -61 2-62 -31 SIGN S POINT SIGNED FRACTIONAL, WITH LEFT SHIFT 63 62 61-2 2-1 2-2 2-61 2-62 2-63 SIGN POINT Figure C-6. 64-Bit Signed Fixed-Point Product C-1 ADSP-2116 SHARC DSP Hardware Reference