Basic Arithmetic and the ALU. Basic Arithmetic and the ALU. Background. Background. Integer multiplication, division

Similar documents
Lecture Topics. Announcements. Today: Integer Arithmetic (P&H ) Next: continued. Consulting hours. Introduction to Sim. Milestone #1 (due 1/26)

Arithmetic Logic Unit. Digital Computer Design

EE260: Logic Design, Spring n Integer multiplication. n Booth s algorithm. n Integer division. n Restoring, non-restoring

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 3. Arithmetic for Computers Implementation

Arithmetic and Logical Operations

Number Systems and Computer Arithmetic

CS 101: Computer Programming and Utilization

ECE232: Hardware Organization and Design

Tailoring the 32-Bit ALU to MIPS

Lecture Topics. Announcements. Today: Integer Arithmetic (P&H ) Next: The MIPS ISA (P&H ) Consulting hours. Milestone #1 (due 1/26)

ECE 30 Introduction to Computer Engineering

Integer Multiplication. Back to Arithmetic. Integer Multiplication. Example (Fig 4.25)

Binary Arithmetic. Daniel Sanchez Computer Science & Artificial Intelligence Lab M.I.T.

CPE300: Digital System Architecture and Design

Chapter 3: part 3 Binary Subtraction

Review: MIPS Organization

Chapter 3: Arithmetic for Computers

NUMBER OPERATIONS. Mahdi Nazm Bojnordi. CS/ECE 3810: Computer Organization. Assistant Professor School of Computing University of Utah

Number representations

EECS150 - Digital Design Lecture 13 - Combinational Logic & Arithmetic Circuits Part 3

Bitwise Instructions

By, Ajinkya Karande Adarsh Yoga

Microcomputers. Outline. Number Systems and Digital Logic Review

Chapter 5. Digital Design and Computer Architecture, 2 nd Edition. David Money Harris and Sarah L. Harris. Chapter 5 <1>

COMP 303 Computer Architecture Lecture 6

Divide: Paper & Pencil

Lec-6-HW-3-ALUarithmetic-SOLN

Digital Logic & Computer Design CS Professor Dan Moldovan Spring 2010

Arithmetic Circuits. Design of Digital Circuits 2014 Srdjan Capkun Frank K. Gürkaynak.

ECE260: Fundamentals of Computer Engineering

MIPS Integer ALU Requirements

ENE 334 Microprocessors

Timing for Ripple Carry Adder

Computer Architecture Set Four. Arithmetic

Computer Architecture and Organization: L04: Micro-operations

Chapter 4. Operations on Data

Outline. EEL-4713 Computer Architecture Multipliers and shifters. Deriving requirements of ALU. MIPS arithmetic instructions

Math 230 Assembly Programming (AKA Computer Organization) Spring 2008

ECE260: Fundamentals of Computer Engineering

Unsigned Binary Integers

Unsigned Binary Integers

Floating-Point Data Representation and Manipulation 198:231 Introduction to Computer Organization Lecture 3

Course Topics - Outline

Chapter 4 Arithmetic Functions

VTU NOTES QUESTION PAPERS NEWS RESULTS FORUMS Arithmetic (a) The four possible cases Carry (b) Truth table x y

Arithmetic Operations

ECE 341 Midterm Exam

Computer Arithmetic Multiplication & Shift Chapter 3.4 EEC170 FQ 2005

Computer Architecture and Organization

Chapter 3 Arithmetic for Computers

Arithmetic and Bitwise Operations on Binary Data

CPE 335 Computer Organization. MIPS Arithmetic Part I. Content from Chapter 3 and Appendix B

COMP2611: Computer Organization. Data Representation

CS3350B Computer Architecture MIPS Instruction Representation

ECE 341. Lecture # 6

Introduction to the MIPS. Lecture for CPSC 5155 Edward Bosworth, Ph.D. Computer Science Department Columbus State University

Digital Circuit Design and Language. Datapath Design. Chang, Ik Joon Kyunghee University

Logic Instructions. Basic Logic Instructions (AND, OR, XOR, TEST, NOT, NEG) Shift and Rotate instructions (SHL, SAL, SHR, SAR) Segment 4A

Operations On Data CHAPTER 4. (Solutions to Odd-Numbered Problems) Review Questions

University of Illinois at Chicago. Lecture Notes # 10

CS352H: Computer Systems Architecture

Floating Point. The World is Not Just Integers. Programming languages support numbers with fraction

CHAPTER V NUMBER SYSTEMS AND ARITHMETIC

Arithmetic for Computers

Computer Architecture

Week 7: Assignment Solutions

Lecture 8: Addition, Multiplication & Division

Binary Logic (review)

Arithmetic Operators There are two types of operators: binary and unary Binary operators:

Chapter 2. Instructions: Language of the Computer. HW#1: 1.3 all, 1.4 all, 1.6.1, , , , , and Due date: one week.

Two-Level CLA for 4-bit Adder. Two-Level CLA for 4-bit Adder. Two-Level CLA for 16-bit Adder. A Closer Look at CLA Delay

Chapter 10 Binary Arithmetics

Forecast. Instructions (354 Review) Basics. Basics. Instruction set architecture (ISA) is its vocabulary. Instructions are the words of a computer

Arithmetic-logic units

CHAPTER 6 ARITHMETIC, LOGIC INSTRUCTIONS, AND PROGRAMS

ECE 154A Introduction to. Fall 2012

Shift and Rotate Instructions

CSCI 402: Computer Architectures

ECEN 468 Advanced Logic Design

Binary Adders: Half Adders and Full Adders

COSC 243. Data Representation 3. Lecture 3 - Data Representation 3 1. COSC 243 (Computer Architecture)

Announcements HW1 is due on this Friday (Sept 12th) Appendix A is very helpful to HW1. Check out system calls

Instruction Set extensions to X86. Floating Point SIMD instructions

Computer Science 324 Computer Architecture Mount Holyoke College Fall Topic Notes: Bits and Bytes and Numbers

Chapter 2A Instructions: Language of the Computer

ECE331: Hardware Organization and Design

Number Systems and Their Representations

ECE 4514 Digital Design II. Spring Lecture 7: Dataflow Modeling

Advanced Computer Architecture-CS501

Lecture Topics. Announcements. Today: The MIPS ISA (P&H ) Next: continued. Milestone #1 (due 1/26) Milestone #2 (due 2/2)

Part III The Arithmetic/Logic Unit. Oct Computer Architecture, The Arithmetic/Logic Unit Slide 1

Chapter 3 Arithmetic for Computers. ELEC 5200/ From P-H slides

4 Operations On Data 4.1. Foundations of Computer Science Cengage Learning

ECOM4311 Digital Systems Design

Computer Science 324 Computer Architecture Mount Holyoke College Fall Topic Notes: MIPS Instruction Set Architecture

CENG3420 L05: Arithmetic and Logic Unit

Basic Arithmetic (adding and subtracting)

CENG3420 Lecture 03 Review

Xuan Guo. Lecture XIV: Review of Chapter 3 & 4. CSC 3210 Computer Organization and Programming Georgia State University. March 5, 2015.


Transcription:

Basic Arithmetic and the ALU Forecast Representing numbs, 2 s Complement, unsigned ition and subtraction /Sub ALU full add, ripple carry, subtraction, togeth Carry-Lookahead addition, etc. Logical opations and, or, xor, nor, shifts - barrel shift Ovflow, MMX Basic Arithmetic and the ALU Integ multiplication, division floating point arithmetic lat not crucial for the project CS/ECE 552 Lecture Notes: Chapt 4 1 CS/ECE 552 Lecture Notes: Chapt 4 2 Background Background Recall: n bits give rise to 2 n combinations let us call a string of 32 bits as b31 b30... b3 b2 b1 b0 No inhent meaning one intpretation f(b31... b4 b3 b2 b1 b0) -> value anoth f(b31... b4 b3 b2 b1 b0) -> control signals 32-bit types include unsigned integs singed integs single-precision floating point MIPS instructions (A.10) CS/ECE 552 Lecture Notes: Chapt 4 3 CS/ECE 552 Lecture Notes: Chapt 4 4

Unsigned integs Signed Integs f(b31... b0) = b31 x 2 31 +... + b1 x 2 + b0 x 2 0 Treat as normal binary numb e.g., 0...011010101 =1x2 7 +1x2 6 +0x2 5 +1x2 4 +0x2 3 +1x2 2 +0x2 1 +1x2 0 = 128 + 64 + 16 + 4 + 1 = 213 max f (111... 11) = 2 32-1 = 4, 294, 967, 295 min f(000... 00) = 0 2 s Complement f(b31 b30... b1 b0) = -b31 x 2 31 +... + b1 x 2 + b0 x 2 0 max f(0111... 11) = 2 31-1 = 2147483647 min f(100... 00) = -2 31 = -2147483648 (asymmetric) range [-2 31, 2 31-1] => #values (2 31-1 - -2 31 + 1) = 2 32 E.g., -6 000... 0110 --> 111.. 1001 + 1 --> 111...1010 range [0, 2 32-1] => # values (2 32-1) - 0 + 1 = 2 32 CS/ECE 552 Lecture Notes: Chapt 4 5 CS/ECE 552 Lecture Notes: Chapt 4 6 Why 2 s Complement ition and Subtraction why not use signed magnitude 2 s complement makes comput arithmetic simpl just like humans don t work with Roman Numals Representation affects ease of calculation 4-bit unsigned example 0 0 1 1 3 1 0 1 0 10 1 1 0 1 13 not answ 000 111 001-1 0 1 110-2 2 010-3 101-4 3 011 100 000 111 001-3 0 1 110-2 2 010-1 101-0 3 011 100 4-bit 2 s Complement - ignoring ovflow 0 0 1 1 3 1 0 1 0-6 1 1 0 1-3 CS/ECE 552 Lecture Notes: Chapt 4 7 CS/ECE 552 Lecture Notes: Chapt 4 8

Subtraction A - B = A + 2 s complement of B E.g., 3-2 0 0 1 1 3 1 1 1 0-2 0 0 0 1 1 full add (a, b, c in )--> (c out, s) c out = two of more of (a, b, c in ) s = exactly one or three of (a, b, c in ) CS/ECE 552 Lecture Notes: Chapt 4 9 CS/ECE 552 Lecture Notes: Chapt 4 10 Ripple-carry Ripple-carry Subtractor Just concatenate the full adds A - B = A + (-B) => invt B and set C in0 to 1 c in C out 1 C out a 0 b 0 a 1 b 1 a 2 b 2 a 31b 31 a 0 b 0 a 1 b 1 a 2 b 2 a 3 b 3 CS/ECE 552 Lecture Notes: Chapt 4 11 CS/ECE 552 Lecture Notes: Chapt 4 12

Combined Ripple-carry /Subtractor Carry Lookahead control = 1 => subtract XOR B with control and set C in0 to control a 0 b 0 a 1 b 1 a 2 b 2 C out a 31 b 31 opation The above ALU is too slow - gate delays for add = 32 x FA + XOR ~= 64 - too slow Theoretically: In parallel sum 0 = f(c in, a 0, b 0 ) sum i = f(c in, a i... a 0, b i... b 0 ) sum 31 = f(c in, a 31... a 0, b 31... b 0 ) CS/ECE 552 Lecture Notes: Chapt 4 13 CS/ECE 552 Lecture Notes: Chapt 4 14 Carry Lookahead Carry Lookahead Need compromise 0101 0100 build tree so delay is O(log 2 n) for n bits E.g., 2 x 5 gate delays for 32-bits 0011 0110 Need both to genate and at least one to propagate Define: g i = a i * b i ## carry genate We will give the basic idea with (a) 4-bit then (b) 16-bit add p i = a i + b i ## carry propagate Recall: c i+1 = a i * b i + a i * c i + b i * c i A little convoluted! = a i * b i + (a i + b i ) * c i = g i + p i * c i CS/ECE 552 Lecture Notes: Chapt 4 15 CS/ECE 552 Lecture Notes: Chapt 4 16

Carry Lookahead 4-bit Carry Lookahead Thefore c 1 = g 0 + p 0 * c 2 = g 1 + p 1 * c 1 = g 1 + p 1 * (g 0 + p 0 * ) c 4 Carry Lookahead Block = g 1 + p 1 * g 0 + p 1 * p 0 * g 3 p 3 a 3 b 3 g 2 p 2 a 2 b 2 g 1 p 1 a 1 b 1 g 0 p 0 a 0 b 0 c 3 = g 2 + p 2 * g 1 + p 2 * p 1 * g 0 + p 2 * p 1 * p 0 * c 4 = g 3 + p 3 *g 2 + p 3 *p 2 *g 1 + p 3 *p 2 *p 1 *g 0 + p 3 *p 2 *p 1 *p 0 * c 3 c 2 c 1 Uses one level to form p i and g i, two levels for carry But, this needs n+1 fanin at the OR and the rightmost AND s 3 s 2 s 1 s 0 CS/ECE 552 Lecture Notes: Chapt 4 17 CS/ECE 552 Lecture Notes: Chapt 4 18 Hiachical Carry Lookahead for 16 bits Hiachical Carry Lookahead for 16 bits Build 16-bit add from four 4-bit adds c 15 Carry Lookahead Block Figure out Genate and Propagate for 4-bits togeth G 0,3 = g 3 + p 3 * g 2 + p 3 * p 2 * g 1 + p 3 * p 2 * p 1 * g 0 P G a,b 12-15 P G a,b 8-11 P G a 4-7 b 4-7 P G a 0-3 b 0-3 P 0,3 = p 3 * p 2 * p 1 * p 0 (Notation a little diffent from the book) G 4,7 = g 7 + p 7 * g 6 + p 7 * p 6 * g 5 + p 7 * p 6 * p 5 * g 4 c 12 c 8 c 4 P 4,7 = p 7 * p 6 * p 5 * p 4 G 12,15 = g 15 + p 15 * g 14 + p 15 * p 14 * g 13 + p 15 * p 14 * p 13 * g 12 s 12-15 s 8-11 s 4-7 s 0-3 P 12,15 = p 15 * p 14 * p 13 * p 12 CS/ECE 552 Lecture Notes: Chapt 4 19 CS/ECE 552 Lecture Notes: Chapt 4 20

Carry Lookahead Basics Carry Lookahead: Compute G s and P s Fill in the holes in G s and P s: G i, k = G j+1,k + P j+1, k * G i,j (assume i < j +1 < k ) P i,k = P i,j * P j+1, k G 12,15 P 12,15 G 8,11 P 8,11 G 4,7 P 4,7 G 0,3 P 0,3 G 0,7 = G 4,7 + P 4,7 * G 0,3 P 0,7 = P 0,3 * P 4,7 G 8,15 = G 12,15 + P 12,15 * G 8,11 P 8,15 = P 8,11 * P 12, 15 G 0,15 = G 8,15 + P 8,15 * G 0,7 P 0,15 = P 0,7 * P 8, 15 G 8,15 P 8,15 G 0,7 P 0,7 G 0,15 P 0,15 CS/ECE 552 Lecture Notes: Chapt 4 21 CS/ECE 552 Lecture Notes: Chapt 4 22 c 12 Carry Lookahead: Compute c s g 12 - g 15 g 8 - g 11 g 4 - g 7 g 0 - g 3 p 12 - p 15 p 8 - p 11 p 4 - p 7 p 0 - p 3 c4 c0 G 8,11 P 8,11 c 8 G 0,3 P 0,3 Oth s: Carry Select Two adds in parallel - one with C in 0 and the oth c in 1 When C in is done, select the right result 0 1 c 8 G 0,7 P 0,7 next select 2-1 Mux CS/ECE 552 Lecture Notes: Chapt 4 24 select CS/ECE 552 Lecture Notes: Chapt 4 23

Oth s: Carry Save Wallace Tree A + B -> S f e d c b a Save carries A + B -> S, C out CSA CSA Use C in A + B + C -> S1, S2 (3# to 2# in parallel) Used in combinational multiplis by building a Wallace Tree c b a CSA CSA CSA c s CS/ECE 552 Lecture Notes: Chapt 4 25 CS/ECE 552 Lecture Notes: Chapt 4 26 Logical Opations Shift Bitwise AND, OR, XOR, NOR Implement with 32 gates in parallel Shifts and rotates rol -> rotate left (MSB --> LSB) ror -> rotate right (LSB --> MSB) sll -> shift left logical (0 --> LSB) srl -> shift right logical (0 --> MSB) srl -> shift right arithmetic (old MSB --> new MSB) E.g., Shift left logical for d<7:0> and shamt<2:0> Using 2-1 Muxes called Mux(select, in 0, in 1 ) stage0<7:0> = Mux(shamt<0>,d<7:0>, 0 d<7:1>) stage1<7:0> = Mux(shamt<1>, stage0<7:0>, 00 stage0<6:2>) dout<7:0) = Mux(shamt<2>, stage1<7:0>, 0000 stage1<3:0>) For Barrel shift used wid muxes CS/ECE 552 Lecture Notes: Chapt 4 27 CS/ECE 552 Lecture Notes: Chapt 4 28

Shift All Togeth Mux d 0 0 d 7 d 6 d 1 d 0 shift based on 0 th bit by 0 or 1 stage0 s0 7 s0 0 s0 7 s0 6 s0 0 s0 2 s0 1 0 s0 0 0 Mux shift based on 1 st bit by 0 or 2 stage1 s1 7 s1 0 s1 7 s1 3 s1 4 s1 0 s1 3 0 s1 0 0 Mux dout shift based on 2 nd bit by 0 or 4 shamt0 shamt1 shamt2 a b invt Mux carry in opation Mux result dout 7 dout 7 CS/ECE 552 Lecture Notes: Chapt 4 29 CS/ECE 552 Lecture Notes: Chapt 4 30 Ovflow Ovflow with n-bits only 2 n combinations Unsigned [0, 2 n -1], 2 s Complement [-2 n-1, 2 n-1-1] Unsigned 5 + 6 > 7 101 + 110 More involved for 2 s Complement -1+ -1 = -2 111 + 111 1110 110 = -2 is correct => can t just use carry-out 1011 f(3:0) = a(2:0) + b(2:0) => ovflow = f(3) ;; carryout CS/ECE 552 Lecture Notes: Chapt 4 31 CS/ECE 552 Lecture Notes: Chapt 4 32

ition Ovflow When is ovflow NOT possible? p1, p2 > 0 and n1, n2 < 0 p1 + p2 p1 + n1 not possible n1 + p2 not possible n1 + n2 ovflow = X * a(2) * b(2) + Y * a(2) * b(2) What are X and Y? ition Ovflow 2 + 3 = 5 > 4 010 + 011 101 = -3 < 0! In genal, X = f(2) -1 + -4 111 + 100 011 which is 011 > 0 In genal Y = f(2) Ovflow = f(2) * a(2) * b(2) + f(2) * a(2) * b(2) CS/ECE 552 Lecture Notes: Chapt 4 33 CS/ECE 552 Lecture Notes: Chapt 4 34 Subtraction Ovflow What to do on ovflow No ovflow on a-b if signs are same neg - pos ==> neg ;; ovflow othwise pos - neg ==> pos ;; ovflow othwise ovflow = f(2) * (a2) * b(2) + f(2) * a(2) * (b2) Ignore! Flag - condition code that may be tested by software sticky flag - e.g., for floating point trap - possibly with mask CS/ECE 552 Lecture Notes: Chapt 4 35 CS/ECE 552 Lecture Notes: Chapt 4 36

zo = f(2) + f(1) + f(0) can t also look at f(3) because 001 +1 + 111-1 1000 0 So, negative = f(2) Zo and Negative Zo and Negative May also want correct answ even on ovflow negative = (a < b) = (a-b < 0) even if ovflow E.g., is -4 < 2? 100-4 - 010 2 1010-6 => ovflow If you work it out, negative = f(2) XOR ovflow CS/ECE 552 Lecture Notes: Chapt 4 37 CS/ECE 552 Lecture Notes: Chapt 4 38 MMX MMX, cont. MMX [Peleg & Weis, IEEE Micro, Aug. 96] Goal 2x pformance in audio, video, etc. Key technique: SIMD - single instruction multiple data 1999 Streaming SIMD Extensions in same spirit Data types 1 x 64 bit quad word 2 x 32 bit double-word 4 x 16 bit word 8 x 8 bit byte E.g., ADDB (for byte) 17 87 100... 6 more + 17 13 200... 6 more ---- ---- ----... 34 100 44 == 300 mod 256 or 255 == maximum value E.g., 16 element dot product from matrix multiply [a1... a16] x [b1... b16] = a1*b1 +... + a16*b16 IA-32: 32 loads, 16 *, 15 +, 12 loop control = 76 instr. MMX: 16 instr. Cycles 200 for int, 76 for FP, & 12 for MMX (6x ov FP!) CS/ECE 552 Lecture Notes: Chapt 4 39 CS/ECE 552 Lecture Notes: Chapt 4 40

MMX, cont. Oths: MOV, (UN)PACK, & MASK (e.g., next) 15 15 100 120 101 76 15 15 15 15 15 15 15 15 15 15 -------------------------------- FF FF 00 00 00 00 FF FF Why? Weathpson at 00 s & weathmap at FF s Comments Backward compatible & no OS changes (ovload FP regs) Oths have similar: Sun, HP, and now Intel SSE ISVs (i.e., for games) have not (yet) embraced CS/ECE 552 Lecture Notes: Chapt 4 41