H5H4, H5E7 lecture 5 Fixed point arithmetic. Overview


H5H4, H5E7 lecture 5: Fixed point arithmetic
I. Verbauwhede
Acknowledgements: H. DeMan, V. Öwall, D. Hwang
K.U.Leuven

Overview
Lecture 1: what is a system-on-chip
Lecture 2: terminology for the different steps
Lecture 3: models of computation, SDFG
Lecture 4: control flow
Lecture 5 (today): fixed point refinement

H5H4 goal: Skiing down a mountain
[Figure: design flow from Specification, through Algorithm Transformations, Memory Transformations and Optimizations, and Floating-point to Fixed-point, down to the target architecture (ASIC, special-purpose retargetable coprocessor, DSP processor, DSP-RISC, RISC). Annotations in the figure: SPW, Matlab, C++; pipelining, unrolling; loop merging, compaction; 40 bit accumulator.]

References
P. Lapsley et al., DSP Processor Fundamentals: Architectures and Features, IEEE Press, 1997, Chapter.
W. Sung, K. Kum, Simulation-based Word-Length Optimization Method for Fixed-point Digital Signal Processing Systems, IEEE Trans. on Signal Processing, Vol. 43, No. 12, Dec. 1995.
Viktor Öwall, Dept. of Electroscience, Lund, Sweden.
M. Ercegovac, T. Lang, Digital Arithmetic, Morgan Kaufmann Publishers, 2004.
Fridge project:

Finite word lengths: a must for DSP
Floating-point: powerful, but expensive (storage and operations), e.g. 3 bytes (mantissa) + 1 byte (exponent)
DSP applications need: high speed, minimum area, low power
Hence: fixed-point refinement
[Figure: multiply-accumulate datapath annotated with word lengths]

Consequences of Bad Use of Approximations
Example: Failure of Patriot Missile (1991 Feb. 25)
An American Patriot Missile battery in Dhahran, Saudi Arabia, failed to intercept an incoming Iraqi Scud missile.
The Scud struck an American Army barracks, killing 28.
Cause, per GAO/IMTEC-92-26 report: software problem (inaccurate calculation of the time since boot)
Specifics of the problem:
  Time in tenths of a second, as measured by the system's internal clock, was multiplied by 1/10 to get the time in seconds.
  Internal registers were 24 bits wide.
  1/10 = 0.00011001100110011001100 (chopped to 24 b)
  Error per clock tick ≈ 0.000000095
  Error in 100-hr operation period ≈ 0.000000095 × 100 × 60 × 60 × 10 = 0.34 s
  Distance traveled by Scud = (0.34 s) × (1676 m/s) ≈ 570 m
  This put the Scud outside the Patriot's range gate.
Ironically, the fact that the bad time calculation had been improved in some (but not all) code parts contributed to the problem, since it meant that the inaccuracies did not cancel out.
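The Patriot numbers can be checked with a few lines of exact arithmetic. This is a minimal sketch, assuming that "chopped to 24 b" means 23 bits after the binary point (this assumption reproduces the per-tick error of about 0.000000095); the names `FRAC_BITS` and `chop` are illustrative only. The unrounded drift comes out slightly above the rounded 0.34 s used on the slide, so the distance is about 575 m rather than 570 m.

```python
from fractions import Fraction

FRAC_BITS = 23  # assumption: 24-bit register with 23 bits after the binary point


def chop(value: Fraction, frac_bits: int) -> Fraction:
    """Truncate a non-negative value to frac_bits fractional bits."""
    scale = 2 ** frac_bits
    return Fraction(int(value * scale), scale)  # int() truncates toward zero


tenth = Fraction(1, 10)
error_per_tick = tenth - chop(tenth, FRAC_BITS)  # error made on every 0.1 s tick

ticks = 100 * 60 * 60 * 10                       # tenths of a second in 100 hours
drift_s = float(error_per_tick * ticks)          # accumulated clock error in seconds
miss_m = drift_s * 1676                          # Scud speed from the slide: 1676 m/s

print(f"error per tick : {float(error_per_tick):.3e}")
print(f"drift after 100 h: {drift_s:.4f} s")
print(f"tracking error : {miss_m:.0f} m")
```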

Consequences of Bad Approximations
Example: Explosion of Ariane Rocket (1996 June 4)
An unmanned Ariane 5 rocket launched by the European Space Agency veered off its flight path, broke up, and exploded about 40 seconds after lift-off (altitude of 3700 m).
The $500 million rocket (with cargo) was on its 1st voyage after a decade of development costing $7 billion.
Cause: software error in the inertial reference system
Specifics of the problem:
  A 64-bit floating point number relating to the horizontal velocity of the rocket was being converted to a 16-bit signed integer.
  An SRI (inertial reference system) software exception arose during the conversion because the 64-bit floating point number had a value greater than what can be represented by a 16-bit signed integer (max 32 767).

Outline
Number representation
Location of decimal point
Precision
Dynamic range
Truncation, rounding
Overflow

Binary numbers, unsigned integers
MSB = Most Significant Bit, LSB = Least Significant Bit
N bits give 2^N values. For N = 3:
  000 (0)
  001 (1)
  010 (2)
  011 (3)
  100 (4)
  101 (5)
  110 (6)
  111 (7)
[V. Öwall]

Dynamic range and Resolution
[Table: number of bits, number of levels, resolution for a fixed full-scale voltage (V_fs = 0.5 V), and dynamic range for a fixed LSB voltage; e.g. 8 bits give 256 levels.]
How do we use the bits? Depends on the application!
[V. Öwall]

Number Representation
Unsigned numbers
Signed digit numbers
  Sign magnitude
  One's complement
  Two's complement
Notation: <W,L> with W = K + L
  W = wordlength
  L = number of bits behind the decimal (or binary) point
  K = number of integer bits

Signed-Digit Representations
1) Signed-Magnitude: redundant
2) Biased: non-redundant
3) Complement
   A) Radix Complement (r = 2: two's complement): non-redundant
   B) Digit Complement or Diminished-Radix Complement (r = 2: one's complement): redundant
Redundant: two representations for the same number (here: +0 and -0)
Non-redundant: each representation is a different number

Sign Magnitude
Unsigned numbers with a sign bit
[Figure: number circle for signed magnitude]
- Two zeros
+ Low power?
[V. Öwall]

One's Complement
Signed numbers by inverting (complementing) all bits
[Figure: number circle for one's complement]
- Two zeros
+ Easy to convert to negative
[V. Öwall]

Two's Complement
Most widely used fixed point numbering system
Negation: complement all bits and add 1 LSB
[Figure: number circle for two's complement]
+ One zero
+ Easy addition
- Not so easy to convert to negative
[V. Öwall]

[Figure: comparison of signed-magnitude, biased, two's complement and one's complement encodings]
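The three signed encodings can be compared side by side with a small sketch. The function names below are illustrative and not taken from the lecture; values are assumed to satisfy |value| < 2^(bits-1).

```python
def sign_magnitude(value: int, bits: int) -> int:
    """Sign bit in the MSB, magnitude in the remaining bits."""
    sign = 1 if value < 0 else 0
    return (sign << (bits - 1)) | abs(value)


def ones_complement(value: int, bits: int) -> int:
    """Negative numbers are the bitwise inverse of the positive pattern."""
    mask = (1 << bits) - 1
    return value if value >= 0 else (~(-value)) & mask


def twos_complement(value: int, bits: int) -> int:
    """Negative numbers are 'invert all bits, then add 1 LSB' (value mod 2^bits)."""
    return value & ((1 << bits) - 1)


if __name__ == "__main__":
    print("value  sign-mag  ones-c  twos-c")
    for v in range(-7, 8):
        print(f"{v:5d}  {sign_magnitude(v, 4):04b}      "
              f"{ones_complement(v, 4):04b}    {twos_complement(v, 4):04b}")
```

Sign magnitude and one's complement each have a second pattern for zero (1000 and 1111 for 4 bits) that this table never produces; two's complement uses that spare pattern for -8.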

Position of decimal point
<W,L>: total number of bits W, fractional bits L
Value representation: two's complement (i = -1) or unsigned (i = 1), where i is the sign of the weight of the MSB
[Figure: bit positions of a <W,L> word from MSB to LSB]
How do you store this decimal point?

Fixed point for DSP processors
Simple binary integer (two's complement): <W,0>, sign bit at the MSB, no fractional bits
Simple binary fractional representation: <W,W-1>, sign bit followed by W-1 fractional bits
  Values in [-1, 1[
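A short sketch of how a raw bit pattern is interpreted under the <W,L> notation. The function name and the `signed` flag are assumptions made for illustration; the binary point itself is never stored, it only scales the integer by 2^-L.

```python
def fixed_to_value(pattern: int, W: int, L: int, signed: bool = True) -> float:
    """Interpret the W-bit pattern as a <W,L> number (L bits behind the binary point)."""
    if signed and pattern & (1 << (W - 1)):
        pattern -= 1 << W          # MSB has negative weight in two's complement
    return pattern / (1 << L)      # binary point is implicit: scale by 2^-L


# <4,0>: plain two's complement integer
print(fixed_to_value(0b1111, 4, 0))                # -1.0
# <4,3>: fractional representation, values in [-1, 1[
print(fixed_to_value(0b1000, 4, 3))                # -1.0
print(fixed_to_value(0b0111, 4, 3))                # 0.875
# same bits read as unsigned
print(fixed_to_value(0b1111, 4, 0, signed=False))  # 15.0
```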

Mantissa representation
Mantissa: e.g. 24 bit
  One sign bit
  Leading mantissa bit = 1 (always!), so the magnitude of the mantissa lies between 1 and 2
Exponent: e.g. 8 bit
Value = mantissa × 2^exponent
[Figure: <W,L> word with sign bit, MSB and LSB]

Precision
Quantization error = error when a longer numeric format is converted to a shorter one
  E.g.: rounding a decimal number to fewer digits introduces an error of at most half a unit in the last digit kept.
Maximum precision (in bits) = log2(maximum value / max quantization error)
  E.g.: 16 bit fractional representation: max value ≈ 1, max error = 2^-16 (with rounding), so maximum precision = 16 bits
Importance of scaling!!

Dynamic range
Dynamic range = largest number / smallest number in a given data format
  E.g. 32 bit fractional value: ratio = (1 - 2^-31) / 2^-31 = 2^31 - 1 ≈ 2.15 × 10^9 ≈ 187 dB
Telecom: 50 dB, High End Audio: 90 dB and more
DSP processors: provide a few more bits than the dynamic range requires
Scaling!!

Rounding
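The precision and dynamic range definitions translate directly into two one-line formulas. A minimal sketch; the function names are illustrative, and the dB figure uses the usual 20·log10 convention for amplitude ratios.

```python
import math


def precision_bits(max_value: float, max_error: float) -> float:
    """Maximum precision in bits = log2(maximum value / max quantization error)."""
    return math.log2(max_value / max_error)


def dynamic_range_db(largest: float, smallest: float) -> float:
    """Dynamic range in dB = 20 * log10(largest / smallest)."""
    return 20 * math.log10(largest / smallest)


# 16-bit fractional format with rounding: max value about 1, max error 2^-16
print(precision_bits(1.0, 2 ** -16))

# 32-bit fractional format: largest 1 - 2^-31, smallest non-zero step 2^-31
print(dynamic_range_db(1 - 2 ** -31, 2 ** -31))

# 16-bit fractional format, for comparison with the 90 dB audio requirement
print(dynamic_range_db(1 - 2 ** -15, 2 ** -15))
```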

How do we quantize?
[Figure: four quantization characteristics (fxp value versus flp value): floor(x) = two's-complement truncation (cheap, nasty); magnitude truncation (sign-magnitude); ceil(x) (unusual); round (best, expensive)]

Rounding
Rounding occurs when we want to approximate a more precise number (i.e. more fractional bits L) with a less precise number (i.e. fewer fractional bits L').
  Example 1: old format (K=6, L=8), new format with K'=6 and fewer, but still nonzero, fractional bits
  Example 2: old format (K=6, L=8), new format (K'=6, L'=0)
The following slides show rounding from L > 0 fractional bits to L' = 0 bits, but the mathematics hold true for any L' < L.
Usually, the number of integral bits is kept the same: K' = K.

Rounding Equation
x = x_{k-1} x_{k-2} ... x_1 x_0 . x_{-1} x_{-2} ... x_{-l}   (whole part . fractional part)
y = x_{k-1}' ... rounded result written as y_{k-1} y_{k-2} ... y_1 y_0
y = round(x)

Rounding Techniques
Different rounding techniques:
1) truncation
   results in round towards zero in signed magnitude
   results in round towards -∞ in two's complement
2) round to nearest number
3) round to nearest even number (or odd number)
4) round towards +∞
Other rounding techniques:
5) jamming or von Neumann
6) ROM rounding
Each will differ in its error depending on the representation of numbers, i.e. signed magnitude versus two's complement.
Error = round(x) - x

1) Truncation
The simplest possible rounding scheme: chopping or truncation
  x_{k-1} x_{k-2} ... x_1 x_0 . x_{-1} x_{-2} ... x_{-l}  -> trunc ->  x_{k-1} x_{k-2} ... x_1 x_0
Truncation in signed-magnitude results in a number chop(x) that is never of larger magnitude than x. This is called round towards zero or inward rounding.
  011.10 (3.5)10 -> 011 (3)10, Error = -0.5
  111.10 (-3.5)10 -> 111 (-3)10, Error = +0.5
Truncation in two's complement results in a number chop(x) that is never larger than x. This is called round towards -∞ or downward-directed rounding.
  011.10 (3.5)10 -> 011 (3)10, Error = -0.5
  100.10 (-3.5)10 -> 100 (-4)10, Error = -0.5

Truncation Function Graph: chop(x)
[Figure: truncation or chopping of a signed-magnitude number (same as round toward 0)]
[Figure: truncation or chopping of a two's-complement number (same as round to -∞)]

Bias in two's complement truncation
[Table: X (binary), X (decimal), chop(x) (binary), chop(x) (decimal), Error (decimal)]
For L = 2 and L' = 0 the discarded bits .00, .01, .10, .11 give errors 0, -0.25, -0.5, -0.75.
Assuming all combinations of positive and negative values of x are equally likely, the average error is -0.375.
In general, average error = -(2^-L' - 2^-L)/2, where L' = new number of fractional bits.

Implementation of truncation in hardware
Easy: just ignore (i.e. truncate) the fractional digits from L to L'+1.
  x_{k-1} x_{k-2} .. x_1 x_0 . x_{-1} x_{-2} .. x_{-L}  ->  y_{k-1} y_{k-2} .. y_1 y_0 . (ignore, i.e. truncate, the rest)
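The difference between the two kinds of truncation, and the bias of two's complement truncation, can be reproduced with exact fractions. This is a sketch under the assumption that every representable value occurs equally often; the helper names and the choice K = 3 are illustrative.

```python
import math
from fractions import Fraction


def chop_twos_complement(x: Fraction, L_new: int) -> Fraction:
    """Dropping low bits of a two's complement number rounds toward -infinity."""
    scale = 2 ** L_new
    return Fraction(math.floor(x * scale), scale)


def chop_sign_magnitude(x: Fraction, L_new: int) -> Fraction:
    """Dropping low bits of a sign-magnitude number rounds toward zero."""
    scale = 2 ** L_new
    return Fraction(math.trunc(x * scale), scale)


def average_error(chop, L: int, L_new: int, K: int = 3) -> Fraction:
    """Mean of chop(x) - x over all values with K integer and L fractional bits."""
    half_span = 2 ** (K + L - 1)
    values = [Fraction(k, 2 ** L) for k in range(-half_span, half_span)]
    errors = [chop(x, L_new) - x for x in values]
    return sum(errors) / len(errors)


x = Fraction(-7, 2)                      # -3.5
print(chop_twos_complement(x, 0))        # -4
print(chop_sign_magnitude(x, 0))         # -3
print(average_error(chop_twos_complement, 2, 0))   # -3/8
print(average_error(chop_sign_magnitude, 2, 0))    # 0
```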

2) Round to nearest number
Rounding to the nearest number is what we normally think of when we say "round".
rtn in two's complement:
  010.10 (2.5)10 -> 011 (3)10, Error = +0.5
  101.10 (-2.5)10 -> 110 (-2)10, Error = +0.5

Round to Nearest Function Graph: rtn(x)
[Figure: rtn(x) versus x]

Bias in two's complement round to nearest
[Table: X (binary), X (decimal), rtn(x) (binary), rtn(x) (decimal), Error (decimal)]
For L = 2 and L' = 0 the discarded bits .00, .01, .10, .11 give errors 0, -0.25, +0.5, +0.25.
Assuming all combinations of positive and negative values of x are equally likely, the average error is +0.125.
Smaller average error than truncation, but the error is still not symmetric.
The problem is the midway value: exactly at .5 or -.5 the number is always rounded up, which leads to a positive error bias.
Overflow problem if only K' = K integral bits are allocated.
  Example: rtn(011.10) = 100 -> overflow
  This overflow only occurs for positive numbers near the maximum positive value, not for negative numbers.

Truncation and rounding
Truncation: cheapest, but introduces bias
  E.g. to <4,0>: 0011.1 = 3.5 truncates to 0011 = 3; 1100.1 = -3.5 truncates to 1100 = -4
  Always a smaller (or equal) number
Rounding: round to the nearest
  Simple hardware trick: add 1/2 of the smallest number (half an LSB of the result) and truncate
  E.g. to <4,0>: 0011.1 = 3.5 rounds to 0100 = 4; 1100.1 = -3.5 rounds to 1101 = -3
How in hardware?
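The "add half and truncate" trick is easy to sketch on scaled integers, where Python's arithmetic right shift plays the role of two's complement truncation. The function names and the `fits` helper are assumptions for illustration.

```python
def round_nearest(raw: int, L: int, L_new: int) -> int:
    """Round a value stored as raw = x * 2^L to L_new < L fractional bits.

    Adds half an LSB of the target format, then truncates (arithmetic shift).
    The result is the rounded value scaled by 2^L_new.
    """
    shift = L - L_new
    half = 1 << (shift - 1)
    return (raw + half) >> shift


def fits(value: int, bits: int) -> bool:
    """True if value is representable as a bits-wide two's complement integer."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


print(round_nearest(7, 1, 0))    # 0011.1 =  3.5 ->  4
print(round_nearest(-7, 1, 0))   # 1100.1 = -3.5 -> -3
r = round_nearest(14, 2, 0)      # 011.10 =  3.5 ->  4
print(r, fits(r, 3))             # 4 does not fit in K' = 3 bits: overflow
```

Because the midway value is always rounded upward, both 3.5 and -3.5 produce an error of +0.5, which is the positive bias mentioned above.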

Rounding
Rounding to the nearest: still biased for numbers exactly half way
More expensive: convergent rounding
[Figure: 8-bit word a7 ... a0 (a7 = sign bit) rounded to 4-bit word b3 ... b0 (b3 = sign bit)]
  If a3:a0 > 1000: b3:b0 = a7:a4 + a3
  If a3:a0 < 1000: b3:b0 = a7:a4 + a3
  If a3:a0 = 1000: b3:b0 = a7:a4 + a4

Overflow
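The three cases of convergent rounding can be sketched for the 8-bit to 4-bit example. The function name is illustrative; the input is assumed to be an 8-bit two's complement value held in a Python int (-128..127), and overflow of the 4-bit result (e.g. for 0111.1000) is not handled here.

```python
def convergent_round(a: int) -> int:
    """Round a7:a0 to the upper four bits, sending ties to the even neighbour."""
    upper = a >> 4           # a7:a4, arithmetic shift keeps the sign
    low = a & 0xF            # a3:a0, the bits that are discarded
    if low == 0b1000:        # exactly half way: add a4, so the result becomes even
        return upper + (upper & 1)
    return upper + (low >> 3)  # otherwise add a3 (1 above half way, 0 below)


print(convergent_round(0b0010_1000))  # 2.5  -> 2
print(convergent_round(0b0011_1000))  # 3.5  -> 4
print(convergent_round(0b0010_1001))  # just above 2.5 -> 3
print(convergent_round(-40))          # -2.5 -> -2
print(convergent_round(-56))          # -3.5 -> -4
```

Half of the ties go up and half go down, so the positive bias of plain round-to-nearest disappears.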

What happens on an overflow?
[Figure: fxp value versus flp value for wrap-around and for saturation (clipping at the max. value)]

Adding Two's Complement Numbers: Ignoring Overflow
Ignoring overflow, adding a K.L two's complement number to a K.L two's complement number results in a K.L number: the carry out c_K is ignored.
Example with K = 4:
  If the sum of two positive numbers wraps around to a negative result, 2^K = 16 must be added to get the correct result.
  If the sum of two negative numbers wraps around to a positive result (e.g. +1), -2^K = -16 must be added to get the correct result.
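Wrap-around and saturation can be contrasted on scaled integers. A minimal sketch: the function names are illustrative and the operand values are chosen here for illustration, since the operands of the slide examples are not available.

```python
def wrap(value: int, bits: int) -> int:
    """Keep only the low bits and reinterpret them as two's complement."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def saturate(value: int, bits: int) -> int:
    """Clip to the most negative / most positive representable value."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))


# K = 4, L = 0: the true sums 11 and -15 do not fit in [-8, 7]
print(wrap(5 + 6, 4), saturate(5 + 6, 4))          # -5 (off by -16) versus 7
print(wrap(-8 + -7, 4), saturate(-8 + -7, 4))      # +1 (off by +16) versus -8

# K = 4, L = 2: values are stored as x * 4 in 6 bits; 12.75 wraps to -3.25
print(wrap(int(12.75 * 4), 6) / 4)
```

Wrap-around is off by exactly 2^K, which is what makes temporary overflows harmless; saturation gives a smaller error but needs extra hardware.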

Two's Complement Wraparound Property
Temporary wraparounds are fine as long as the final value is in the correct dynamic range.
Example (4 bits): add two negative numbers with sum -14, then add 7
  First addition gives 0010: should be (-14)10, not (+2)10 -> wraparound/overflow
  0010 + 0111 = 1001: final result is correct, (-7)10
If the final result is guaranteed to be in the correct dynamic range [-8,+7], then intermediate wraparounds are fine.

Adding Two's Complement Numbers: Avoiding or Detecting Overflow
To avoid overflow, adding a <K+L,L> two's complement number to a <K+L,L> two's complement number results in a <K+L+1,L> number.
  To compute: sign extend the MSB of both operands, ignore c_{K+1}.
If the result is confined to a <K+L,L> number, overflow detection is needed:
  c_K XOR c_{K-1} indicates overflow (carry out of the sign bit differs from the carry into the sign bit).
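Both the wraparound property and the c_K XOR c_{K-1} detection rule can be tried out with a bit-level adder model. This is a sketch with illustrative names; the operands -8 and -6 are one possible pair with sum -14 and are an assumption, not the slide's own values.

```python
def add_with_overflow(a: int, b: int, bits: int):
    """Add two two's complement numbers; return (wrapped sum, overflow flag)."""
    mask = (1 << bits) - 1
    ua, ub = a & mask, b & mask
    low = mask >> 1                                       # all bits below the sign bit
    carry_in = ((ua & low) + (ub & low)) >> (bits - 1)    # carry into the sign bit
    total = ua + ub
    carry_out = total >> bits                             # carry out of the sign bit
    result = total & mask
    signed = result - (1 << bits) if result >> (bits - 1) else result
    return signed, bool(carry_out ^ carry_in)


s1, ovf1 = add_with_overflow(-8, -6, 4)   # true sum -14 does not fit
s2, ovf2 = add_with_overflow(s1, 7, 4)    # wraps back into range
print(s1, ovf1)   # 2 True
print(s2, ovf2)   # -7 True
print(add_with_overflow(3, 4, 4))         # (7, False)
```

The two overflows cancel, so the final -7 is correct even though both additions raised the flag.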

Subtracting Two's Complement Numbers: Ignoring Overflow
Ignoring overflow, subtracting a <K+L,L> two's complement number from a <K+L,L> two's complement number results in a <K+L,L> number.
Example: 0111 - 1000 = 1111 (ignore c_K)
  7 - (-8) resulted in -1
  A wraparound/overflow occurred
  Must add 2^K = 2^4 = 16 to get the correct value of +15
Again we see the modulo effect.
As with addition, temporary wraparounds are okay as long as the final result is in the correct dynamic range.

Subtracting Two's Complement Numbers: Avoiding or Detecting Overflow
To avoid overflow, subtracting a <K+L,L> two's complement number from a <K+L,L> two's complement number results in a <K+L+1,L> number (ignore c_{K+1}).
If the result is confined to a K.L number, overflow detection is needed: c_K XOR c_{K-1} indicates overflow.

Negating a Two's Complement Number
Negating a K.L two's complement number usually only requires a K.L digit result.
The only exception is negating the most negative number, which needs one extra integer bit, i.e. a (K+1).L digit result.
  -(0111) = 1001
  -(1000) = 01000: need an extra bit to negate the most negative number
Again overflow detection is needed.

Outline
Number representation
Location of decimal point
Precision
Dynamic range
Truncation, rounding
Overflow
Now: what to do?

The Wordlength, i.e. nr of bits
[Figure: UMTS filter (FIR) with input x(n), delay elements D, coefficients h0, h1, h2, h3 and output y(n)]
Every extra bit costs energy/power, delay and area, so the word length has to be reduced (starting from the floating-point description).
[V. Öwall]

The Wordlength, i.e. nr of bits
[Figure: the same filter, annotated with word lengths]
The output of an adder needs an extra bit to be sure of no overflow, e.g. 11 + 01 = 100.
Multiplier: M × N bits needs M + N bits for full precision.
Precision has to be limited.
[V. Öwall]

During design: specify fixed-point formats for signals
[Figure: floating-point algorithm between A/D converters; every signal and operator (+, *) still needs a format (W, L, Q), to be derived from the system context for data and the system context for coefficients]

Fixed-point refinement: optimization problem
Minimize overall cost: minimal word lengths, truncate and wrap-around where possible
MSB determination:
  goal: avoid unwanted overflows
  method: find min, max signal values
  result: MSB position, value representation, overflow behaviour
LSB determination:
  goal: keep required precision
  method: evaluate difference between flp and fxp behavior
  result: LSB position, quantization
[Figure: safe range and quantization cost over time]

1. MSB determination: range calculations
[Figure: data flow graph (x, delay d, coefficient c, multiplier m, adder, output y) annotated with range information from the range calculation]
Analytical method:
  Put a range (min, max) on inputs and states
  Propagate the range over the operators
  This gives a safe (pessimistic) estimate

Word length propagation
Range propagation translates to word length growth.
E.g. two's complement integer addition:
  A and B represented by <W,0>
  A + B needs <W+1,0>
  A - B needs <W+1,0>
In general: A is represented by <W_A,L_A>, B by <W_B,L_B>
  A + B needs <max(W_A, W_B) + 1, max(L_A, L_B)>
  (this holds when the binary points are aligned, L_A = L_B; otherwise the result needs max(K_A, K_B) + 1 integer bits and max(L_A, L_B) fractional bits)
Gets more complicated for multiplication.
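Range propagation and the resulting word length growth can be sketched with a tiny format type. Everything here (the `Fmt` class, `add_fmt`, `mul_fmt`, `add_range`, `mul_range`) is a hypothetical helper written for illustration; the addition rule tracks integer and fractional bits separately, and the multiplication rule is the full-precision M + N bits mentioned earlier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fmt:
    W: int  # total word length
    L: int  # fractional bits

    @property
    def K(self) -> int:  # integer bits, including the sign bit
        return self.W - self.L


def add_fmt(a: Fmt, b: Fmt) -> Fmt:
    """Format that can hold a + b (or a - b) without overflow or precision loss."""
    L = max(a.L, b.L)
    K = max(a.K, b.K) + 1
    return Fmt(K + L, L)


def mul_fmt(a: Fmt, b: Fmt) -> Fmt:
    """Full-precision product format."""
    return Fmt(a.W + b.W, a.L + b.L)


def add_range(a, b):
    """Propagate (min, max) over an adder."""
    return (a[0] + b[0], a[1] + b[1])


def mul_range(a, b):
    """Propagate (min, max) over a multiplier."""
    products = [x * y for x in a for y in b]
    return (min(products), max(products))


print(add_fmt(Fmt(8, 0), Fmt(8, 0)))          # Fmt(W=9, L=0)
print(add_fmt(Fmt(8, 0), Fmt(8, 7)))          # Fmt(W=16, L=7)
print(mul_fmt(Fmt(8, 7), Fmt(8, 7)))          # Fmt(W=16, L=14)
print(add_range((-1, 1), mul_range((-1, 1), (0.6, 0.6))))
```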

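Range calculations: grows unbounded?
[Figure: first-order feedback loop X(n) -> + -> Y(n) with delay z^-1 and coefficient a, unrolled over time: in range propagation the range of the feedback signal F(n) grows with every iteration n, bounded by max F and min F depending on a]

Alternative: collect signal statistics during simulations
[Figure: data flow graph (x, delay d, coefficient c, multiplier m, adder, output y) driven by stimuli, with min/max collected on each signal]
Perform simulation with realistic stimuli.
Collect the minimum and maximum value on each signal during the simulation.
This gives an optimistic, stimuli-dependent estimate.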
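How stimuli-dependent the simulation-based estimate is can be seen on a first-order recursive filter. This is a sketch only: the filter y(n) = x(n) + a·y(n-1), the coefficient a = 0.9 and both stimuli are assumptions chosen for illustration, as are the function names.

```python
def simulate_range(stimuli, a: float):
    """Run y(n) = x(n) + a * y(n-1) and collect min and max of y."""
    y = 0.0
    lo = hi = 0.0
    for x in stimuli:
        y = x + a * y
        lo, hi = min(lo, y), max(hi, y)
    return lo, hi


def integer_bits(lo: float, hi: float) -> int:
    """Smallest K (sign bit included) with [lo, hi] inside [-2^(K-1), 2^(K-1)[."""
    k = 1
    while lo < -(2 ** (k - 1)) or hi >= 2 ** (k - 1):
        k += 1
    return k


step = [1.0] * 200                                 # worst case for a = 0.9
alternating = [(-1.0) ** n for n in range(200)]    # same amplitude, benign for a = 0.9

print(integer_bits(*simulate_range(step, 0.9)))         # 5: y approaches 1 / (1 - 0.9) = 10
print(integer_bits(*simulate_range(alternating, 0.9)))  # 2: y never exceeds 1
```

Both stimuli have |x| ≤ 1, yet they suggest different MSB positions, which is why the simulation result is only an optimistic estimate.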

Combine both methods for accurate MSB determination
[Table: per signal name: min, max and MSB1 from signal statistics; min, max and MSB2 from range propagation]
If MSB1 == MSB2: wrap-around(MSB1)
If MSB1 < MSB2: choose saturate(MSB1) or wrap-around(MSB2)
If MSB1 << MSB2: range propagation problem (MSB1 + saturation to be used)

Transform DFG for cheaper solution
Scaling by moving multiplications or shifters over operators; use commutativity, associativity, distributivity (check accuracy!)
Need to verify also the LSB behavior

2. Quantization effects can be modeled as additive noise (LSB)
[Figure: quantizer Q (B bits) between input and output, replaced by an adder that adds noise to the input]
Quantization noise is approximated by a statistical model with the following assumptions:
  the noise is uncorrelated to the input
  the noise is white
  the probability distribution is uniform

Each quantization effect is modeled by a mean and a variance
  Rounding: m_n = 0 and σ_n^2 = Δ^2/12
  Truncation: m_n = -Δ/2 and σ_n^2 = Δ^2/12
  Magnitude truncation: m_n = 0 and σ_n^2 = Δ^2/3
Δ is the quantization step.
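The mean and variance of the three noise models can be checked numerically. This is a Monte Carlo sketch, assuming an input that is uniformly distributed over [-1, 1[ and a step Δ = 2^-8; the function names, the sample count and the seed are arbitrary choices. For magnitude truncation the variance is taken around zero, i.e. it is the mean squared error.

```python
import math
import random


def rounding(x: float, step: float) -> float:
    return math.floor(x / step + 0.5) * step


def truncation(x: float, step: float) -> float:       # two's complement truncation
    return math.floor(x / step) * step


def magnitude_truncation(x: float, step: float) -> float:
    return math.trunc(x / step) * step


def noise_stats(quantize, step: float, samples: int = 100_000, seed: int = 1):
    """Return (mean, mean squared deviation from the mean) of quantize(x) - x."""
    rng = random.Random(seed)
    errors = []
    for _ in range(samples):
        x = rng.uniform(-1.0, 1.0)
        errors.append(quantize(x, step) - x)
    mean = sum(errors) / samples
    var = sum((e - mean) ** 2 for e in errors) / samples
    return mean, var


delta = 2.0 ** -8
for name, q in [("rounding", rounding), ("truncation", truncation),
                ("magnitude truncation", magnitude_truncation)]:
    mean, var = noise_stats(q, delta)
    print(f"{name:22s} mean = {mean / delta:+.3f} Δ   variance = {var / delta**2:.4f} Δ²")
```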

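This results in an equivalent linear network
[Figure: first-order recursive filter X(n) -> + -> Y(n) with delay z^-1, coefficient a and quantizer Q in the feedback path, next to its linear equivalent in which Q is replaced by an additive noise source e(n)]
But quantization is a non-linear operation!

Limit cycles are an example of non-linear behavior
[Figure: first-order recursive filter X(n) -> + -> Y(n) with delay z^-1, coefficient multiplier and B-bit quantizer Q, shown without rounding and with rounding]
Input: x(0) = 14, x(n) = 0 for n > 0; round to nearest integer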
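A zero-input limit cycle is easy to reproduce in simulation. In this sketch the filter is y(n) = x(n) + Q(a·y(n-1)) with x(0) = 14 as on the slide; the coefficient a = 0.6 is an assumption (the coefficient of the slide example is not available), and the function names are illustrative. Exact fractions are used so that the rounding decisions do not depend on floating-point error.

```python
import math
from fractions import Fraction


def round_to_nearest_integer(v):
    """Round to nearest, ties upward (add 1/2 and truncate toward -infinity)."""
    return math.floor(v + Fraction(1, 2))


def first_order(a: Fraction, x0: int, steps: int, quantize=None):
    """Impulse-like response of y(n) = x(n) + Q(a * y(n-1)), x(n) = 0 for n > 0."""
    y = Fraction(0)
    out = []
    for n in range(steps):
        x = x0 if n == 0 else 0
        feedback = a * y
        if quantize is not None:
            feedback = quantize(feedback)
        y = x + feedback
        out.append(y)
    return out


a = Fraction(3, 5)  # assumed coefficient 0.6
ideal = first_order(a, 14, 12)
rounded = first_order(a, 14, 12, round_to_nearest_integer)

print([round(float(v), 3) for v in ideal])   # decays toward 0
print([int(v) for v in rounded])             # gets stuck at 1
```

Without rounding the output decays to zero; with rounding it settles at 1 and never reaches zero, which the linear noise model cannot predict.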

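Limit cycle example
[Figure: limit cycle example]

a) LSB determination must be based on simulations
[Figure: the stimuli x drive both the all-fixed-point model (filter with delay z^-1, coefficient 0.6, multiplier m and quantizers Q) and the reference model without quantizers; both are simulated and their outputs y compared. Output ok? no: adjust and simulate again; yes: done]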

b) Gradual refinement is necessary to keep the problem manageable
For each signal S:
  quantize S only
  simulate with the stimuli
  compare with the reference simulation
  performance ok? no: adjust the quantization of S and repeat; yes: return
[Figure: filter (stimuli x, delay z^-1, coefficient 0.6, multiplier m, output y) with a single quantizer Q inserted, compared against the reference simulation]

Conclusion
Number representation
Location of decimal point
Precision
Dynamic range
Truncation, rounding
Overflow
Now: what to do?
Why are we doing this?
  Area/time/power optimization
  Important design optimization for JPEG project


Floating Point (with contributions from Dr. Bin Ren, William & Mary Computer Science) Floating Point (with contributions from Dr. Bin Ren, William & Mary Computer Science) Floating Point Background: Fractional binary numbers IEEE floating point standard: Definition Example and properties

More information

10.1. Unit 10. Signed Representation Systems Binary Arithmetic

10.1. Unit 10. Signed Representation Systems Binary Arithmetic 0. Unit 0 Signed Representation Systems Binary Arithmetic 0.2 BINARY REPRESENTATION SYSTEMS REVIEW 0.3 Interpreting Binary Strings Given a string of s and 0 s, you need to know the representation system

More information

Digital Fundamentals

Digital Fundamentals Digital Fundamentals Tenth Edition Floyd Chapter 2 2009 Pearson Education, Upper 2008 Pearson Saddle River, Education NJ 07458. All Rights Reserved Decimal Numbers The position of each digit in a weighted

More information

CO212 Lecture 10: Arithmetic & Logical Unit

CO212 Lecture 10: Arithmetic & Logical Unit CO212 Lecture 10: Arithmetic & Logical Unit Shobhanjana Kalita, Dept. of CSE, Tezpur University Slides courtesy: Computer Architecture and Organization, 9 th Ed, W. Stallings Integer Representation For

More information

Up next. Midterm. Today s lecture. To follow

Up next. Midterm. Today s lecture. To follow Up next Midterm Next Friday in class Exams page on web site has info + practice problems Excited for you to rock the exams like you have been the assignments! Today s lecture Back to numbers, bits, data

More information

Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective A Programmer's Perspective Representing Numbers Gal A. Kaminka galk@cs.biu.ac.il Fractional Binary Numbers 2 i 2 i 1 4 2 1 b i b i 1 b 2 b 1 b 0. b 1 b 2 b 3 b j 1/2 1/4 1/8 Representation Bits to right

More information

Addition and multiplication

Addition and multiplication Addition and multiplication Arithmetic is the most basic thing you can do with a computer, but it s not as easy as you might expect! These next few lectures focus on addition, subtraction, multiplication

More information

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Digital Computer Arithmetic ECE 666

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Digital Computer Arithmetic ECE 666 UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering Digital Computer Arithmetic ECE 666 Part 4-A Floating-Point Arithmetic Israel Koren ECE666/Koren Part.4a.1 Preliminaries - Representation

More information

Computer (Literacy) Skills. Number representations and memory. Lubomír Bulej KDSS MFF UK

Computer (Literacy) Skills. Number representations and memory. Lubomír Bulej KDSS MFF UK Computer (Literacy Skills Number representations and memory Lubomír Bulej KDSS MFF UK Number representations? What for? Recall: computer works with binary numbers Groups of zeroes and ones 8 bits (byte,

More information

CHAPTER V NUMBER SYSTEMS AND ARITHMETIC

CHAPTER V NUMBER SYSTEMS AND ARITHMETIC CHAPTER V-1 CHAPTER V CHAPTER V NUMBER SYSTEMS AND ARITHMETIC CHAPTER V-2 NUMBER SYSTEMS RADIX-R REPRESENTATION Decimal number expansion 73625 10 = ( 7 10 4 ) + ( 3 10 3 ) + ( 6 10 2 ) + ( 2 10 1 ) +(

More information

Organisasi Sistem Komputer

Organisasi Sistem Komputer LOGO Organisasi Sistem Komputer OSK 8 Aritmatika Komputer 1 1 PT. Elektronika FT UNY Does the calculations Arithmetic & Logic Unit Everything else in the computer is there to service this unit Handles

More information

Computer Arithmetic. L. Liu Department of Computer Science, ETH Zürich Fall semester, Reconfigurable Computing Systems ( L) Fall 2012

Computer Arithmetic. L. Liu Department of Computer Science, ETH Zürich Fall semester, Reconfigurable Computing Systems ( L) Fall 2012 Reconfigurable Computing Systems (252-2210-00L) all 2012 Computer Arithmetic L. Liu Department of Computer Science, ETH Zürich all semester, 2012 Source: ixed-point arithmetic slides come from Prof. Jarmo

More information

Introduction to Field Programmable Gate Arrays

Introduction to Field Programmable Gate Arrays Introduction to Field Programmable Gate Arrays Lecture 2/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden, 31 May 9 June 2007 Javier Serrano, CERN AB-CO-HT Outline Digital Signal

More information

Floating-point representations

Floating-point representations Lecture 10 Floating-point representations Methods of representing real numbers (1) 1. Fixed-point number system limited range and/or limited precision results must be scaled 100101010 1111010 100101010.1111010

More information

Floating-point representations

Floating-point representations Lecture 10 Floating-point representations Methods of representing real numbers (1) 1. Fixed-point number system limited range and/or limited precision results must be scaled 100101010 1111010 100101010.1111010

More information

Section 1.4 Mathematics on the Computer: Floating Point Arithmetic

Section 1.4 Mathematics on the Computer: Floating Point Arithmetic Section 1.4 Mathematics on the Computer: Floating Point Arithmetic Key terms Floating point arithmetic IEE Standard Mantissa Exponent Roundoff error Pitfalls of floating point arithmetic Structuring computations

More information

Floating Point Arithmetic

Floating Point Arithmetic Floating Point Arithmetic CS 365 Floating-Point What can be represented in N bits? Unsigned 0 to 2 N 2s Complement -2 N-1 to 2 N-1-1 But, what about? very large numbers? 9,349,398,989,787,762,244,859,087,678

More information

Representation of Numbers

Representation of Numbers Computer Architecture 10 Representation of Numbers Made with OpenOffice.org 1 Number encodings Additive systems - historical Positional systems radix - the base of the numbering system, the positive integer

More information

Chapter Three. Arithmetic

Chapter Three. Arithmetic Chapter Three 1 Arithmetic Where we've been: Performance (seconds, cycles, instructions) Abstractions: Instruction Set Architecture Assembly Language and Machine Language What's up ahead: Implementing

More information

Wordlength Optimization

Wordlength Optimization EE216B: VLSI Signal Processing Wordlength Optimization Prof. Dejan Marković ee216b@gmail.com Number Systems: Algebraic Algebraic Number e.g. a = + b [1] High level abstraction Infinite precision Often

More information

Accuracy versus precision

Accuracy versus precision Accuracy versus precision Accuracy is a consistent error from the true value, but not necessarily a good or precise error Precision is a consistent result within a small error, but not necessarily anywhere

More information

CS 64 Week 1 Lecture 1. Kyle Dewey

CS 64 Week 1 Lecture 1. Kyle Dewey CS 64 Week 1 Lecture 1 Kyle Dewey Overview Bitwise operation wrap-up Two s complement Addition Subtraction Multiplication (if time) Bitwise Operation Wrap-up Shift Left Move all the bits N positions to

More information

CHW 261: Logic Design

CHW 261: Logic Design CHW 261: Logic Design Instructors: Prof. Hala Zayed Dr. Ahmed Shalaby http://www.bu.edu.eg/staff/halazayed14 http://bu.edu.eg/staff/ahmedshalaby14# Slide 1 Slide 2 Slide 3 Digital Fundamentals CHAPTER

More information

Number Systems Prof. Indranil Sen Gupta Dept. of Computer Science & Engg. Indian Institute of Technology Kharagpur Number Representation

Number Systems Prof. Indranil Sen Gupta Dept. of Computer Science & Engg. Indian Institute of Technology Kharagpur Number Representation Number Systems Prof. Indranil Sen Gupta Dept. of Computer Science & Engg. Indian Institute of Technology Kharagpur 1 Number Representation 2 1 Topics to be Discussed How are numeric data items actually

More information

unused unused unused unused unused unused

unused unused unused unused unused unused BCD numbers. In some applications, such as in the financial industry, the errors that can creep in due to converting numbers back and forth between decimal and binary is unacceptable. For these applications

More information

Chapter 03: Computer Arithmetic. Lesson 09: Arithmetic using floating point numbers

Chapter 03: Computer Arithmetic. Lesson 09: Arithmetic using floating point numbers Chapter 03: Computer Arithmetic Lesson 09: Arithmetic using floating point numbers Objective To understand arithmetic operations in case of floating point numbers 2 Multiplication of Floating Point Numbers

More information

ECE 450:DIGITAL SIGNAL. Lecture 10: DSP Arithmetic

ECE 450:DIGITAL SIGNAL. Lecture 10: DSP Arithmetic ECE 450:DIGITAL SIGNAL PROCESSORS AND APPLICATIONS Lecture 10: DSP Arithmetic Last Session Floating Point Arithmetic Addition Block Floating Point format Dynamic Range and Precision 2 Today s Session Guard

More information

COSC 243. Data Representation 3. Lecture 3 - Data Representation 3 1. COSC 243 (Computer Architecture)

COSC 243. Data Representation 3. Lecture 3 - Data Representation 3 1. COSC 243 (Computer Architecture) COSC 243 Data Representation 3 Lecture 3 - Data Representation 3 1 Data Representation Test Material Lectures 1, 2, and 3 Tutorials 1b, 2a, and 2b During Tutorial a Next Week 12 th and 13 th March If you

More information

Roundoff Errors and Computer Arithmetic

Roundoff Errors and Computer Arithmetic Jim Lambers Math 105A Summer Session I 2003-04 Lecture 2 Notes These notes correspond to Section 1.2 in the text. Roundoff Errors and Computer Arithmetic In computing the solution to any mathematical problem,

More information

Representation of Numbers and Arithmetic in Signal Processors

Representation of Numbers and Arithmetic in Signal Processors Representation of Numbers and Arithmetic in Signal Processors 1. General facts Without having any information regarding the used consensus for representing binary numbers in a computer, no exact value

More information

MACHINE LEVEL REPRESENTATION OF DATA

MACHINE LEVEL REPRESENTATION OF DATA MACHINE LEVEL REPRESENTATION OF DATA CHAPTER 2 1 Objectives Understand how integers and fractional numbers are represented in binary Explore the relationship between decimal number system and number systems

More information

Chapter 2. Positional number systems. 2.1 Signed number representations Signed magnitude

Chapter 2. Positional number systems. 2.1 Signed number representations Signed magnitude Chapter 2 Positional number systems A positional number system represents numeric values as sequences of one or more digits. Each digit in the representation is weighted according to its position in the

More information

CHAPTER 5: Representing Numerical Data

CHAPTER 5: Representing Numerical Data CHAPTER 5: Representing Numerical Data The Architecture of Computer Hardware and Systems Software & Networking: An Information Technology Approach 4th Edition, Irv Englander John Wiley and Sons 2010 PowerPoint

More information

M1 Computers and Data

M1 Computers and Data M1 Computers and Data Module Outline Architecture vs. Organization. Computer system and its submodules. Concept of frequency. Processor performance equation. Representation of information characters, signed

More information

ECE232: Hardware Organization and Design

ECE232: Hardware Organization and Design ECE232: Hardware Organization and Design Lecture 11: Floating Point & Floating Point Addition Adapted from Computer Organization and Design, Patterson & Hennessy, UCB Last time: Single Precision Format

More information

At the ith stage: Input: ci is the carry-in Output: si is the sum ci+1 carry-out to (i+1)st state

At the ith stage: Input: ci is the carry-in Output: si is the sum ci+1 carry-out to (i+1)st state Chapter 4 xi yi Carry in ci Sum s i Carry out c i+ At the ith stage: Input: ci is the carry-in Output: si is the sum ci+ carry-out to (i+)st state si = xi yi ci + xi yi ci + xi yi ci + xi yi ci = x i yi

More information

CS6303 COMPUTER ARCHITECTURE LESSION NOTES UNIT II ARITHMETIC OPERATIONS ALU In computing an arithmetic logic unit (ALU) is a digital circuit that performs arithmetic and logical operations. The ALU is

More information

CHAPTER 1 Numerical Representation

CHAPTER 1 Numerical Representation CHAPTER 1 Numerical Representation To process a signal digitally, it must be represented in a digital format. This point may seem obvious, but it turns out that there are a number of different ways to

More information

CPE300: Digital System Architecture and Design

CPE300: Digital System Architecture and Design CPE300: Digital System Architecture and Design Fall 2011 MW 17:30-18:45 CBC C316 Arithmetic Unit 10122011 http://www.egr.unlv.edu/~b1morris/cpe300/ 2 Outline Recap Fixed Point Arithmetic Addition/Subtraction

More information

Computer Organisation CS303

Computer Organisation CS303 Computer Organisation CS303 Module Period Assignments 1 Day 1 to Day 6 1. Write a program to evaluate the arithmetic statement: X=(A-B + C * (D * E-F))/G + H*K a. Using a general register computer with

More information

2.1. Unit 2. Integer Operations (Arithmetic, Overflow, Bitwise Logic, Shifting)

2.1. Unit 2. Integer Operations (Arithmetic, Overflow, Bitwise Logic, Shifting) 2.1 Unit 2 Integer Operations (Arithmetic, Overflow, Bitwise Logic, Shifting) 2.2 Skills & Outcomes You should know and be able to apply the following skills with confidence Perform addition & subtraction

More information

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 3. Arithmetic for Computers Implementation

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 3. Arithmetic for Computers Implementation COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 3 Arithmetic for Computers Implementation Today Review representations (252/352 recap) Floating point Addition: Ripple

More information

ECE 2030D Computer Engineering Spring problems, 5 pages Exam Two 8 March 2012

ECE 2030D Computer Engineering Spring problems, 5 pages Exam Two 8 March 2012 Instructions: This is a closed book, closed note exam. Calculators are not permitted. If you have a question, raise your hand and I will come to you. Please work the exam in pencil and do not separate

More information

Number Systems. Both numbers are positive

Number Systems. Both numbers are positive Number Systems Range of Numbers and Overflow When arithmetic operation such as Addition, Subtraction, Multiplication and Division are performed on numbers the results generated may exceed the range of

More information

Data Representation 1

Data Representation 1 1 Data Representation Outline Binary Numbers Adding Binary Numbers Negative Integers Other Operations with Binary Numbers Floating Point Numbers Character Representation Image Representation Sound Representation

More information

Floating Point : Introduction to Computer Systems 4 th Lecture, May 25, Instructor: Brian Railing. Carnegie Mellon

Floating Point : Introduction to Computer Systems 4 th Lecture, May 25, Instructor: Brian Railing. Carnegie Mellon Floating Point 15-213: Introduction to Computer Systems 4 th Lecture, May 25, 2018 Instructor: Brian Railing Today: Floating Point Background: Fractional binary numbers IEEE floating point standard: Definition

More information