CSCI 402: Computer Architectures. Arithmetic for Computers (4) Fengguang Song Department of Computer & Information Science IUPUI.

Homework 4
Assigned on Thursday, Feb 22. Due time: 11:59pm on Monday night, March 5. You have 1.5 weeks (not 2 weeks!). The TA will post the solution on March 6 or 7; your midterm exam is on March 8, so you can use the solution to prepare for it.
Problems: 3.4, 3.12, 3.18 (assuming both inputs are two-digit octal numbers consisting of 6 binary bits), 3.20, 3.22, 3.23, 3.24, 3.27, 3.41, 3.42, 3.43

IEEE Floating-Point Format
Layout: S | Exponent (single: 8 bits, double: 11 bits) | Fraction (single: 23 bits, double: 52 bits)
x = (-1)^S × (1 + Fraction) × 2^(stored exponent - Bias), with the stored exponent in [0, 255] for single precision
S: sign bit (0 ⇒ non-negative, 1 ⇒ negative)
Normalized significand: 1.0 <= significand < 2.0
  Always has a leading (pre-binary-point) 1, so there is no need to represent it explicitly ("hidden bit")
  Significand = Fraction with the leading "1." restored
Stored exponent = actual exponent + Bias; actual exponent = stored exponent - Bias
  The stored exponent is unsigned: 0 to 255 (single) or 0 to 2047 (double)
  Single: Bias = 127; Double: Bias = 1023

Floating-Point Example (1/2)
Q: How to represent -0.75_10 (= -0.11_2)?
-0.75 = -1.1_2 × 2^-1   // normalized number
S = 1
Fraction = 1000...00_2
Stored exponent = -1 + Bias
  Single: -1 + 127 = 126 = 01111110_2
  Double: -1 + 1023 = 1022 = 01111111110_2
Single: 1 01111110 1000...00
Double: 1 01111111110 1000...00
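To make the field layout concrete, here is a small C sketch (not from the lecture) that pulls the sign, stored exponent, and fraction out of a single-precision value and rebuilds it with the formula above; the bit widths and the bias of 127 follow the IEEE 754 single format described on the slide.

  #include <stdio.h>
  #include <stdint.h>
  #include <string.h>
  #include <math.h>

  int main(void) {
      float f = -0.75f;
      uint32_t bits;
      memcpy(&bits, &f, sizeof bits);          /* reinterpret the 32 bits of the float */

      uint32_t s    = bits >> 31;              /* 1 sign bit */
      uint32_t exp  = (bits >> 23) & 0xFF;     /* 8-bit stored exponent */
      uint32_t frac = bits & 0x7FFFFF;         /* 23-bit fraction */

      /* x = (-1)^S * (1 + Fraction) * 2^(stored exponent - 127), normalized case only */
      double value = (s ? -1.0 : 1.0) * (1.0 + frac / 8388608.0) * ldexp(1.0, (int)exp - 127);

      printf("S=%u  stored exponent=%u (actual %d)  fraction=0x%06X\n",
             s, exp, (int)exp - 127, frac);
      printf("reconstructed value = %g\n", value);   /* prints -0.75 */
      return 0;
  }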

Floating-Point Example (2/2)
Q: What decimal value is represented by the following single-precision floating-point number?
  1 10000001 01000...00
S = 1
Stored exponent = 10000001_2 = 129
Fraction = 01000...00_2
x = (-1)^1 × (1 + 0.01_2) × 2^(129 - 127) = (-1) × 1.25 × 2^2 = -5.0

Floating-Point Addition (base 10)
First, consider a decimal example (suppose 4 digits of significand and 2 digits of exponent):
  9.999 × 10^1 + 1.610 × 10^-1
Four steps:
1. Align decimal points (make the exponents equal)
   Shift the number with the smaller exponent: 9.999 × 10^1 + 0.016 × 10^1 (now the exponents are equal)
2. Add significands
   9.999 × 10^1 + 0.016 × 10^1 = 10.015 × 10^1
3. Normalize the result & check for over/underflow
   1.0015 × 10^2
4. Round and renormalize the output if necessary
   1.002 × 10^2

Floating-Point Addition (binary)
Similarly, consider a 4-digit binary example:
  1.000_2 × 2^-1 + (-1.110_2 × 2^-2)   (i.e., 0.5 + (-0.4375))
Step 1. Align binary points (so that the exponents are equal)
  Shift the number with the smaller exponent: 1.000_2 × 2^-1 + (-0.111_2 × 2^-1)
Step 2. Add significands
  1.000_2 × 2^-1 - 0.111_2 × 2^-1 = 0.001_2 × 2^-1
Step 3. Normalize the result & check for over/underflow
  1.000_2 × 2^-4   // -4 is within the exponent range (between -127 and 128)
Step 4. Round and renormalize if necessary
  1.000_2 × 2^-4 (no change) = 0.0625

FP Adder Hardware
[Figure: FP adder datapath. Step 1: compare the exponents and shift the smaller number's fraction right to align; Step 2: add the aligned fractions; Step 3: normalize the new fraction; Step 4: round.]
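The same four steps can be traced in a few lines of C. This is only an illustrative sketch (not the lecture's code): it works on the slide's toy 4-bit significands stored as integers scaled by 2^3, and it assumes the inputs are normalized and that the first operand has the larger exponent.

  #include <stdio.h>
  #include <math.h>

  int main(void) {
      /* Operands: 1.000_2 x 2^-1 and -1.110_2 x 2^-2; significands scaled by 2^3 = 8 */
      int sigA = 8,   expA = -1;    /*  1.000_2 ->  8/8 */
      int sigB = -14, expB = -2;    /* -1.110_2 -> -14/8 */

      /* Step 1: align binary points - shift the operand with the smaller exponent right */
      while (expB < expA) { sigB /= 2; expB++; }       /* -14/8 x 2^-2 -> -7/8 x 2^-1 */

      /* Step 2: add the significands */
      int sig = sigA + sigB, exp = expA;               /* 8 - 7 = 1 : 0.001_2 x 2^-1 */

      /* Step 3: normalize - shift left until the leading 1 is back in the units place */
      while (sig != 0 && sig > -8 && sig < 8) { sig *= 2; exp--; }   /* 1.000_2 x 2^-4 */

      /* Step 4: rounding is a no-op here because no bits were lost */
      printf("result = %d/8 x 2^%d = %g\n", sig, exp, ldexp(sig / 8.0, exp));  /* 0.0625 */
      return 0;
  }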

FP Adder Hardware
An FP adder is more complex than an integer adder. Doing the addition in one cycle would make the clock cycle too long, and a slower clock would penalize all instructions. So the FP adder takes several cycles; it can be pipelined.

Next: Floating-Point Multiplication
Given two operands (2 normalized inputs):
  (-1)^S1 × m1 × 2^E1   // m1 is the significand: 1.xxxxx
  (-1)^S2 × m2 × 2^E2
Exact result? Suppose it is (-1)^S × m × 2^E:
  Sign S: S = 1 if S1 ≠ S2; S = 0 otherwise
  Significand m: m1 × m2   // just multiply the two significands
  Exponent E: E1 + E2
Fixing the result:
  Round m to fit the significand precision
  Overflow if E is out of range
Implementation: the most complex part is multiplying the 2 significands

Floating-Point Multiplication
Again, first consider a decimal example (suppose 4 digits of significand and 2 digits of exponent):
  1.110 × 10^10 × 9.200 × 10^-5
1. Add exponents
   New exponent = 10 + (-5) = 5
2. Multiply the two significands
   1.110 × 9.200 = 10.212  ⇒  10.212 × 10^5
3. Normalize the result & check for over/underflow
   1.0212 × 10^6
4. Round and renormalize if necessary
   1.021 × 10^6
5. Determine the sign of the result from the signs of the operands
   +1.021 × 10^6

Floating-Point Multiplication
Now, let's consider a 4-digit binary example:
  1.000_2 × 2^-1 × (-1.110_2 × 2^-2)   (i.e., 0.5 × (-0.4375))
1. Add exponents
   Unbiased: -1 + (-2) = -3
2. Multiply significands
   1.000_2 × 1.110_2 = 1.110_2  ⇒  1.110_2 × 2^-3
3. Normalize the result & check for over/underflow
   1.110_2 × 2^-3   // no over/underflow
4. Round and renormalize if necessary
   1.110_2 × 2^-3   // no change
5. Determine the sign: +ve × -ve ⇒ -ve
   -1.110_2 × 2^-3 = -0.21875
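A matching C sketch of the multiplication steps (again illustrative only, using the same scaled-integer significands as the addition sketch above): the exponents are added, the 4-bit significands are multiplied exactly, the product is renormalized, and the sign comes from XOR-ing the operand signs.

  #include <stdio.h>
  #include <math.h>

  int main(void) {
      /* Operands: 1.000_2 x 2^-1 (positive) and 1.110_2 x 2^-2 (negative) */
      int sigA = 8,  expA = -1, signA = 0;   /* significands scaled by 2^3 = 8 */
      int sigB = 14, expB = -2, signB = 1;

      int exp  = expA + expB;                /* Step 1: add the (unbiased) exponents */
      int prod = sigA * sigB;                /* Step 2: multiply significands; now scaled by 2^6 */

      /* Step 3: normalize so the product lies in [1.0, 2.0), i.e., in [64, 128) when scaled by 2^6 */
      while (prod >= 128) { prod /= 2; exp++; }

      /* Step 4: rounding is a no-op here (the dropped bits are already zero) */
      int sign = signA ^ signB;              /* Step 5: sign of the result */

      double result = (sign ? -1.0 : 1.0) * ldexp(prod / 64.0, exp);
      printf("result = %s%d/64 x 2^%d = %g\n", sign ? "-" : "", prod, exp, result);  /* -0.21875 */
      return 0;
  }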

Multiplying the significands (1.000_2 × 1.110_2):

        1.000
      x 1.110
      -------
         0000
        1000
       1000
      1000
    ---------
     1.110000

FP MIPS Instructions
Floating-point hardware is an adjunct processor that extends the existing MIPS ISA; it is called coprocessor 1 (c1).
There are 32 separate FP registers:
  32 single-precision registers: $f0, $f1, ..., $f31
  They can be paired to store double-precision values: $f0/$f1, $f2/$f3, ..., i.e., 16 double-precision registers
FP instructions can operate only on FP registers.
There are also special load and store instructions:
  lwc1, swc1 (single); ldc1, sdc1 (double)
  e.g., lwc1 $f8, 32($sp)

[Figure: CPU (central processing unit) and FPU (floating-point unit, "coprocessor 1"). The CPU holds integer registers $0..$31 and performs integer arithmetic, multiplication, division, and logical ops; the FPU holds FP registers $f0..$f31 and performs floating-point arithmetic, multiplication, division, and int/float conversion. Data moves between the two with mfc1/mtc1, and to/from memory (2^32 bytes) with lw/sw on the CPU side and lwc1/swc1 on the FPU side.]

FP MIPS Instructions
Single-precision arithmetic: add.s, sub.s, mul.s, div.s
  e.g., add.s $f0, $f1, $f6   // $f0 = $f1 + $f6
Double-precision arithmetic: add.d, sub.d, mul.d, div.d
  e.g., mul.d $f4, $f4, $f6   // $f4 = $f4 * $f6
Comparison: c.xx.s, c.xx.d (xx is eq, lt, le, ...) sets the FP condition-code bit
  e.g., c.lt.s $f3, $f4
Branch on the FP condition code being true or false: bc1t ("branch C1 true"), bc1f ("branch C1 false")
  e.g., bc1t TargetLabel

FP MIPS Example: F to C (Fahrenheit to Celsius)
C code:
  float f2c (float fahr) {
      return ((5.0/9.0) * (fahr - 32.0));
  }
fahr is in $f12, the result goes in $f0, and the literals are stored in global memory.
Compiled MIPS code:
  f2c: lwc1  $f16, const5($gp)    # $f16 = 5.0
       lwc1  $f18, const9($gp)    # $f18 = 9.0
       div.s $f16, $f16, $f18     # $f16 = 5.0/9.0
       lwc1  $f18, const32($gp)   # $f18 = 32.0
       sub.s $f18, $f12, $f18     # $f18 = fahr - 32.0
       mul.s $f0, $f16, $f18      # $f0 = product result
       jr    $ra                  # return

FP Example: Array Multiplication
X = X + Y × Z, where all are 32 × 32 matrices with 64-bit double-precision elements.
C code (as written on the slide; a compilable variant follows below):
  void mm (double x[][], double y[][], double z[][]) {
      int i, j, k;
      for (i = 0; i != 32; i = i + 1)
          for (j = 0; j != 32; j = j + 1)
              for (k = 0; k != 32; k = k + 1)
                  x[i][j] = x[i][j] + y[i][k] * z[k][j];
  }
Addresses of x, y, z are in $a0, $a1, $a2, and i, j, k are in $s0, $s1, $s2.
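The prototype on the slide (double x[][]) is slide shorthand and is not legal C as written. A compilable variant, assuming fixed 32 × 32 dimensions (the test harness in main is mine, not from the lecture):

  #include <stdio.h>

  #define N 32

  /* x = x + y * z for N x N matrices of doubles */
  void mm(double x[N][N], double y[N][N], double z[N][N]) {
      for (int i = 0; i != N; i = i + 1)
          for (int j = 0; j != N; j = j + 1)
              for (int k = 0; k != N; k = k + 1)
                  x[i][j] = x[i][j] + y[i][k] * z[k][j];
  }

  int main(void) {
      static double x[N][N], y[N][N], z[N][N];
      /* y and z as identity matrices: after one call x should equal y * z = identity */
      for (int i = 0; i < N; i++) { y[i][i] = 1.0; z[i][i] = 1.0; }
      mm(x, y, z);
      printf("x[0][0] = %g, x[0][1] = %g\n", x[0][0], x[0][1]);   /* prints 1 and 0 */
      return 0;
  }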

FP Example: Array Multiplication
MIPS code:
      li   $t1, 32         # $t1 = 32 (row size/loop end)
      li   $s0, 0          # i = 0; initialize 1st for loop
  L1: li   $s1, 0          # j = 0; restart 2nd for loop
  L2: li   $s2, 0          # k = 0; restart 3rd for loop
      sll  $t2, $s0, 5     # $t2 = i * 32, i-th row
      addu $t2, $t2, $s1   # $t2 = i * 32 + j, j-th column
      sll  $t2, $t2, 3     # $t2 = byte offset of [i][j]
      addu $t2, $a0, $t2   # $t2 = byte address of x[i][j]
      l.d  $f4, 0($t2)     # $f4 = 8 bytes of x[i][j]
  L3: sll  $t0, $s2, 5     # $t0 = k * 32, k-th row
      addu $t0, $t0, $s1   # $t0 = k * 32 + j, j-th column
      sll  $t0, $t0, 3     # $t0 = byte offset of [k][j]
      addu $t0, $a2, $t0   # $t0 = byte address of z[k][j]
      l.d  $f16, 0($t0)    # $f16 = 8 bytes of z[k][j]
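The transcription stops partway through the inner loop here (the following slide is missing). For completeness, a possible continuation following the standard textbook (P&H) version of this example; treat it as a sketch rather than the lecture's exact code:

      sll   $t0, $s0, 5      # $t0 = i * 32, i-th row of y
      addu  $t0, $t0, $s2    # $t0 = i * 32 + k
      sll   $t0, $t0, 3      # $t0 = byte offset of [i][k]
      addu  $t0, $a1, $t0    # $t0 = byte address of y[i][k]
      l.d   $f18, 0($t0)     # $f18 = 8 bytes of y[i][k]
      mul.d $f16, $f18, $f16 # $f16 = y[i][k] * z[k][j]
      add.d $f4, $f4, $f16   # $f4 = x[i][j] + y[i][k] * z[k][j]
      addiu $s2, $s2, 1      # k = k + 1
      bne   $s2, $t1, L3     # if (k != 32) go to L3
      s.d   $f4, 0($t2)      # x[i][j] = $f4
      addiu $s1, $s1, 1      # j = j + 1
      bne   $s1, $t1, L2     # if (j != 32) go to L2
      addiu $s0, $s0, 1      # i = i + 1
      bne   $s0, $t1, L1     # if (i != 32) go to L1
      jr    $ra              # return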

Accuracy of Floating-Point Numbers
Only a subset of the real numbers can be represented by a computer!

Accurate Arithmetic
NOTE: Floating-point numbers are approximations of real numbers: 53 significand bits vs. an infinite number of real numbers (consider just [0.0, 1.0]).
IEEE Std 754 offers rounding control, allowing the programmer to fine-tune the numerical behavior of a computation.
The hardware always keeps two extra bits of precision (guard and round) that are used during intermediate computations.
However, not all FP hardware implements all options, and most programming languages and FP libraries just use the defaults.

Accurate Arithmetic: Guard & Round Bits
The IEEE 754 standard specifies the use of 2 extra bits on the right during intermediate calculations: the guard bit and the round bit.
Example: Add 2.56 × 10^0 and 2.34 × 10^2, assuming 3 significant digits.
Without guard and round bits:
  2.56 × 10^0 = 0.0256 × 10^2, which is truncated to 0.02 × 10^2
  2.34 + 0.02 = 2.36 × 10^2
With guard and round bits:
  2.34 + 0.0256 = 2.3656 × 10^2, which rounds to 2.37 × 10^2
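A tiny C demonstration of the "approximation" point (illustrative only; the specific values are my choice, not from the lecture): a single-precision significand has 24 bits, so at 2^24 adding 1.0f no longer changes the value, and 0.1 cannot be stored exactly.

  #include <stdio.h>

  int main(void) {
      /* 2^24 = 16777216 is the last point where consecutive integers are exact in float */
      float big = 16777216.0f;
      printf("16777216.0f + 1.0f == 16777216.0f ? %s\n",
             (big + 1.0f == big) ? "yes" : "no");          /* prints "yes" */

      /* 0.1 has no finite binary expansion, so it is rounded when stored */
      printf("0.1 stored as a double = %.20f\n", 0.1);      /* not exactly 0.1 */
      printf("0.1 + 0.2 == 0.3 ? %s\n",
             (0.1 + 0.2 == 0.3) ? "yes" : "no");            /* prints "no" */
      return 0;
  }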

IEEE Std 754 has 4 different rounding modes; the 1st is the default, and the others are called directed rounding:
  Round to nearest: round to the nearest value, with "ties to even" (if the number falls midway, it is rounded to the nearest even number)
  Round toward 0: directed rounding toward zero (truncation)
  Round toward +∞: directed rounding toward positive infinity (ceiling)
  Round toward -∞: directed rounding toward negative infinity (floor)
Example (nearest even, with dollar amounts):
  Value:        $1.40  $1.60  $1.50  $2.50  -$1.50
  Nearest even: $1.00  $2.00  $2.00  $2.00  -$2.00

Accurate Arithmetic
A conceptual view: first compute the exact result, then make it fit into the desired precision.
  Possibly overflow if the exponent is too large
  Possibly round to fit into the significand
Rounding modes (illustrated with $ rounding):
  Value:        $1.40  $1.60  $1.50  $2.50  -$1.50
  Toward zero:  $1.00  $1.00  $1.00  $2.00  -$1.00
  Toward -∞:    $1.00  $1.00  $1.00  $2.00  -$2.00
  Toward +∞:    $2.00  $2.00  $2.00  $3.00  -$1.00
  Nearest even: $1.00  $2.00  $2.00  $2.00  -$2.00
The rounding methods differ only in tie cases (fraction = 0.5); there is no disagreement when the fraction is not 0.5. (However, the IRS always rounds 0.5 up!)
Fused multiply-add: a = a + (b × c), rounding only once at the end.
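The four modes can be observed from C through the C99 <fenv.h> interface; the sketch below (illustrative, not from the lecture) reproduces the dollar-rounding table with rint(), which rounds according to the current rounding mode.

  #include <fenv.h>
  #include <math.h>
  #include <stdio.h>

  int main(void) {
      double vals[] = {1.40, 1.60, 1.50, 2.50, -1.50};
      int modes[]        = {FE_TONEAREST,   FE_TOWARDZERO, FE_DOWNWARD,   FE_UPWARD};
      const char *name[] = {"nearest even", "toward zero", "toward -inf", "toward +inf"};

      for (int m = 0; m < 4; m++) {
          fesetround(modes[m]);                 /* select the rounding mode */
          printf("%-12s:", name[m]);
          for (int i = 0; i < 5; i++)
              printf(" %6.2f", rint(vals[i]));  /* rint() honors the current mode */
          printf("\n");
      }
      fesetround(FE_TONEAREST);                 /* restore the default */
      return 0;
  }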

Interpretation of Data
The BIG Picture: bits have no inherent meaning! They could be anything.
E.g., given 32 bits, what do they mean? The interpretation depends on the instruction applied to them.
Computer representations of numbers have limited range and limited precision; you must remember that they are approximations (they carry rounding errors).
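To make "interpretation depends on the instruction" concrete, a short C sketch (illustrative; the bit pattern is the same one decoded by hand in Floating-Point Example (2/2) above): one 32-bit pattern is printed once as a signed integer and once as an IEEE single-precision float.

  #include <stdio.h>
  #include <stdint.h>
  #include <string.h>

  int main(void) {
      uint32_t bits = 0xC0A00000u;   /* one 32-bit pattern: 1 10000001 01000...00 */

      int32_t as_int;
      float   as_float;
      memcpy(&as_int, &bits, sizeof as_int);     /* interpret as a two's-complement integer */
      memcpy(&as_float, &bits, sizeof as_float); /* interpret as an IEEE 754 single */

      printf("bits 0x%08X as int:   %d\n", bits, as_int);     /* -1063256064 */
      printf("bits 0x%08X as float: %g\n", bits, as_float);   /* -5.0 */
      return 0;
  }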