Digital Signal Processing Introduction to Finite-Precision Numerical Effects

Similar documents
D. Richard Brown III Associate Professor Worcester Polytechnic Institute Electrical and Computer Engineering Department

D. Richard Brown III Professor Worcester Polytechnic Institute Electrical and Computer Engineering Department

World Inside a Computer is Binary

ECE 450:DIGITAL SIGNAL. Lecture 10: DSP Arithmetic

CSCI 402: Computer Architectures. Arithmetic for Computers (3) Fengguang Song Department of Computer & Information Science IUPUI.

COMP2611: Computer Organization. Data Representation

CS101 Lecture 04: Binary Arithmetic

Representation of Numbers and Arithmetic in Signal Processors

Number Systems. Binary Numbers. Appendix. Decimal notation represents numbers as powers of 10, for example

Floating Point (with contributions from Dr. Bin Ren, William & Mary Computer Science)

Chapter 2 Float Point Arithmetic. Real Numbers in Decimal Notation. Real Numbers in Decimal Notation

Integers and Floating Point

ECE4703 B Term Laboratory Assignment 2 Floating Point Filters Using the TMS320C6713 DSK Project Code and Report Due at 3 pm 9-Nov-2017

Floating-Point Data Representation and Manipulation 198:231 Introduction to Computer Organization Lecture 3

Floating Point Puzzles. Lecture 3B Floating Point. IEEE Floating Point. Fractional Binary Numbers. Topics. IEEE Standard 754

Floating Point January 24, 2008

Representing and Manipulating Floating Points. Jo, Heeseung

ECE 2020B Fundamentals of Digital Design Spring problems, 6 pages Exam Two Solutions 26 February 2014

Data Representation Floating Point

Representing and Manipulating Floating Points

Floating Point Numbers

2.1. Unit 2. Integer Operations (Arithmetic, Overflow, Bitwise Logic, Shifting)

Binary Representations and Arithmetic

Mathematical preliminaries and error analysis

Representing and Manipulating Floating Points

Systems I. Floating Point. Topics IEEE Floating Point Standard Rounding Floating Point Operations Mathematical properties

Computer Systems C S Cynthia Lee

Section 1.4 Mathematics on the Computer: Floating Point Arithmetic

Floating Point Numbers

Floating Point Numbers

CS429: Computer Organization and Architecture

Data Representation Floating Point

Floating Point Puzzles. Lecture 3B Floating Point. IEEE Floating Point. Fractional Binary Numbers. Topics. IEEE Standard 754

Floating point. Today! IEEE Floating Point Standard! Rounding! Floating Point Operations! Mathematical properties. Next time. !

Representing and Manipulating Floating Points. Computer Systems Laboratory Sungkyunkwan University

ECE232: Hardware Organization and Design

ECE 2020B Fundamentals of Digital Design Spring problems, 6 pages Exam Two 26 February 2014

C NUMERIC FORMATS. Overview. IEEE Single-Precision Floating-point Data Format. Figure C-0. Table C-0. Listing C-0.

Inf2C - Computer Systems Lecture 2 Data Representation

Integers. N = sum (b i * 2 i ) where b i = 0 or 1. This is called unsigned binary representation. i = 31. i = 0

System Programming CISC 360. Floating Point September 16, 2008

Computer Architecture Chapter 3. Fall 2005 Department of Computer Science Kent State University

Computer Architecture and System Software Lecture 02: Overview of Computer Systems & Start of Chapter 2

Floating Point Arithmetic

Representing and Manipulating Floating Points

Floating-point Arithmetic. where you sum up the integer to the left of the decimal point and the fraction to the right.

CMPSCI 145 MIDTERM #2 SPRING 2017 April 7, 2017 Professor William T. Verts NAME PROBLEM SCORE POINTS GRAND TOTAL 100

Floating Point Puzzles The course that gives CMU its Zip! Floating Point Jan 22, IEEE Floating Point. Fractional Binary Numbers.

Giving credit where credit is due

Computer Organization: A Programmer's Perspective

Chapter Three. Arithmetic

10.1. Unit 10. Signed Representation Systems Binary Arithmetic

CS 261 Fall Floating-Point Numbers. Mike Lam, Professor.

LABORATORY. 3 Representing Numbers OBJECTIVE REFERENCES. Learn how negative numbers and real numbers are encoded inside computers.

Computer Systems Programming. Practice Midterm. Name:

Giving credit where credit is due

Number Representations

Design and Verify Embedded Signal Processing Systems Using MATLAB and Simulink

VIII. DSP Processors. Digital Signal Processing 8 December 24, 2009

FLOATING POINT FORMATS

CS 33. Data Representation (Part 3) CS33 Intro to Computer Systems VIII 1 Copyright 2018 Thomas W. Doeppner. All rights reserved.

CS 31: Introduction to Computer Systems. 03: Binary Arithmetic January 29

Finite arithmetic and error analysis

Princeton University Computer Science 217: Introduction to Programming Systems. Goals of this Lecture. Number Systems and Number Representation

Floating point. Today. IEEE Floating Point Standard Rounding Floating Point Operations Mathematical properties Next time.

Introduction to Computer Science (I1100) Data Storage

Lecture 4 - Number Representations, DSK Hardware, Assembly Programming

Number Systems and Number Representation

Instruction Set extensions to X86. Floating Point SIMD instructions

Class 5. Data Representation and Introduction to Visualization

3.1 DATA REPRESENTATION (PART C)

Floating-point to Fixed-point Conversion. Digital Signal Processing Programs (Short Version for FPGA DSP)

Data Representation in Computer Memory

Department of Computer Science Purdue University, West Lafayette

Natural Numbers and Integers. Big Ideas in Numerical Methods. Overflow. Real Numbers 29/07/2011. Taking some ideas from NM course a little further

Floating Point. CSE 351 Autumn Instructor: Justin Hsia

Bits, Words, and Integers

Digital Arithmetic. Digital Arithmetic: Operations and Circuits Dr. Farahmand

CS 261 Fall Floating-Point Numbers. Mike Lam, Professor.

Run time environment of a MIPS program

Data Representation and Introduction to Visualization

Computer System and programming in C

Administrivia. CMSC 216 Introduction to Computer Systems Lecture 24 Data Representation and Libraries. Representing characters DATA REPRESENTATION

ECE 2030D Computer Engineering Spring problems, 5 pages Exam Two 8 March 2012

Bryant and O Hallaron, Computer Systems: A Programmer s Perspective, Third Edition. Carnegie Mellon

15213 Recitation 2: Floating Point

Description Hex M E V smallest value > largest denormalized negative infinity number with hex representation 3BB0 ---

Floating Point. CSE 238/2038/2138: Systems Programming. Instructor: Fatma CORUT ERGİN. Slides adapted from Bryant & O Hallaron s slides

Floating Point Considerations

Data Representation 1

MC1601 Computer Organization

4.1 QUANTIZATION NOISE

Floating-point representation

Computer Arithmetic Floating Point

Numeric Encodings Prof. James L. Frankel Harvard University

CSE 1320 INTERMEDIATE PROGRAMMING - OVERVIEW AND DATA TYPES

in this web service Cambridge University Press

ECE 2030B 1:00pm Computer Engineering Spring problems, 5 pages Exam Two 10 March 2010

fractional quantities are typically represented in computers using floating point format this approach is very much similar to scientific notation

1. NUMBER SYSTEMS USED IN COMPUTING: THE BINARY NUMBER SYSTEM

Transcription:

Digital Signal Processing Introduction to Finite-Precision Numerical Effects D. Richard Brown III D. Richard Brown III 1 / 9

Floating-Point vs. Fixed-Point DSP chips are generally divided into fixed-point and floating-point types. Examples: Fixed point: Texas Instruments TMS320C6455, Analog Devices Blackfin,... Floating point: Texas Instruments TMS320C6727, Analog Devices SHARC,... Both types can process floating-point data, but the fixed-point processors use software libraries that are very slow. Hence, fixed-point processors are typically only used to process fixed-point data. Floating-point DSP Easier to program Less likelihood of overflow Usually more accurate Fixed-point DSP Cheaper Lower power Faster D. Richard Brown III 2 / 9

Common Floating-Point Data Types (IEEE 754) Single-precision (32 bit, usually the float datatype) 6-9 significant digits of precision smallest positive value: 1.4 10 38 largest positive value: 3.4 10 38 Special values for ± and NaN. Double-Precision (64 bit, usually the double datatype) 15-17 significant digits of precision smallest positive value: 10 308 largest positive value: 10 308 Special values for ± and NaN. Huge dynamic range means low likelihood of overflow. Matlab uses double-precision for most of its calculations. Double-precision is often quite a bit slower than single-precision on most DSP chips. D. Richard Brown III 3 / 9

Common Integer-Valued Data Types Bits Usual datatype Minimum value Maximum Value 8 char or signed char -128 127 16 short -32768 32767 32 int 2 31 2 31 1 Smaller dynamic range than floating point data types means extra care must be taken to avoid overflow. Given a B + 1 bit datatype with bits {b 0, b 1,..., b B }, decimal values are typically represented with two s complement encoding: decimal value = b 0 2 B + B b i 2 B i i=1 D. Richard Brown III 4 / 9

Fixed-Point Representation A fixed-point number representation uses an integer-valued datatype and associates with it a certain number of fractional bits, denoted as q. This simply scales the integer value as ) B decimal value = 2 ( b q 02 B + b i2 B i Note that q is usually positive, but can be negative. As an example, suppose we want to quantize 1 2 0.7071 to a fixed point number with B + 1 = 4 total bits. The integer datatype has a range of 8 to +7. i=1 Frac bits q Quantized value (dec) Quantized value (bin) Integer value 0 1 0001 1 1 0.5 000 1 1 2 0.75 00 11 3 3 0.75 0 110 6 4 0.4375 0111 7 D. Richard Brown III 5 / 9

Fixed-Point Products Consider: Fixed point variable x with B x + 1 total bits and q x fractional bits. Fixed point variable y with B y + 1 total bits and q y fractional bits. If z = x y, then z will require B z + 1 B x + B y + 2 total bits and q z q x + q y fractional bits to capture the product without any possibility of overflow or loss of precision. Many fixed-point DSPs have an extended-precision (also called double-length) accumulator to accommodate fixed-point products without any possibility of overflow or loss of precision. D. Richard Brown III 6 / 9

Fixed-Point Sums Consider fixed point variables {x 1,..., x N } all with B x + 1 total bits and q x fractional bits. If N z = then z will require total bits and i=1 x i B z + 1 B x + 1 + log 2 (N) q z q x fractional bits to capture the sum without any possibility of overflow or loss of precision. For fixed-point products and sums, if the z datatype does not have enough bits to hold the result, we usually reduce precision (decrease the number of fractional bits) to avoid overflow. D. Richard Brown III 7 / 9

Overview of Finite-Precision Effects Input/output quantization, i.e., ADC and DAC. Filter coefficient quantization. Product roundoff. Potential overflow in sums. x(t) ideal sampling ˆx[n] ŷ[n] ideal reconstruction y(t) z 1 a â In general, nonlinear systems like this are difficult to analyze. We will isolate the sources of quantization error and adopt some approximate approaches to make the analysis tractable. D. Richard Brown III 8 / 9

Fixed-Point Filter Design in fdatool D. Richard Brown III 9 / 9