Tailoring the 32-Bit ALU to MIPS


MIPS ALU extensions
- Overflow detection: carry into MSB XOR carry out of MSB
- Branch instructions
- Shift instructions
- Slt instruction
- Immediate instructions

ALU performance
- Performance vs. cost
- Carry lookahead adder
- Implementation alternatives

Branch Instructions
- beq $t5, $t6, L
- Use subtraction: (a - b) = 0 implies a = b
- Add hardware to test if the result is 0
- OR all 32 result bits and invert the OR output:

$$\text{ZERO} = \overline{\text{Result}_0 + \text{Result}_1 + \cdots + \text{Result}_{31}}$$

- Note: signal ZERO is 1 when the result is zero!
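The overflow and ZERO rules above can be sketched in software. This is a minimal behavioral model, not the gate-level design; the function name `alu_sub` and the constant `MASK32` are assumptions introduced only for illustration.

```python
MASK32 = 0xFFFFFFFF  # hypothetical helper constant: keeps values to 32 bits


def alu_sub(a, b):
    """Return (result, zero, overflow) for the 32-bit subtraction a - b."""
    a &= MASK32
    b_inv = ~b & MASK32  # two's complement subtraction: a + ~b + 1
    # Carry into the MSB: add only the low 31 bits and inspect bit 31.
    carry_in_msb = ((a & 0x7FFFFFFF) + (b_inv & 0x7FFFFFFF) + 1) >> 31
    total = a + b_inv + 1
    carry_out_msb = total >> 32
    result = total & MASK32
    # Overflow: carry into MSB XOR carry out of MSB.
    overflow = carry_in_msb ^ carry_out_msb
    # ZERO: OR all 32 result bits together, then invert.
    any_bit = 0
    for i in range(32):
        any_bit |= (result >> i) & 1
    zero = any_bit ^ 1
    return result, zero, overflow
```

For beq, a control unit would take the branch when `zero` is 1, for example after `alu_sub(5, 5)`.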

Branch Support
- Zero output: 1 if (A = B), 0 otherwise

Shift instructions
- SLL, SRL, and SRA
- We need a data line for a shifter (L and R)
- However, shifters are much more easily implemented at the transistor level (outside the ALU)

Barrel shifters
- Diagonal closed-switch pattern controlled by the control unit
- Example with inputs x3 x2 x1 x0:

  Output, x:     x3 x2 x1 x0
  Output, x<<1:  x2 x1 x0 0
  Output, x>>1:  0  x3 x2 x1
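The three shift instructions differ only in which bits are filled in. The sketch below models their results on 32-bit values; it says nothing about the transistor-level barrel shifter, and the lowercase function names are assumptions chosen to mirror the mnemonics.

```python
MASK32 = 0xFFFFFFFF  # hypothetical helper constant


def sll(x, shamt):
    """Shift left logical: zeros enter from the right."""
    return (x << shamt) & MASK32


def srl(x, shamt):
    """Shift right logical: zeros enter from the left."""
    return (x & MASK32) >> shamt


def sra(x, shamt):
    """Shift right arithmetic: copies of the sign bit enter from the left."""
    x &= MASK32
    shifted = x >> shamt
    if x >> 31:  # negative value: fill the vacated high bits with 1s
        shifted |= (MASK32 << (32 - shamt)) & MASK32
    return shifted
```

The shift amount is assumed to be in the range 0 to 31, matching a 5-bit shamt field.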

Immediate Instructions
- First input to ALU is the first register (rs)
- Second input is either
  - data from a register (rt), or
  - a zero- or sign-extended immediate
- Add a mux at the second input of the ALU

[Figure: datapath with Registers (rs, rt), the 16-bit immediate from the IR sign-extended to 32 bits, a 0/1 mux on the second ALU input, the Control Unit, and ALU outputs Result, Zero, Overflow, and Memory address]

Slt Instruction
- slt rd, rs, rt
- rd = 0000 0000 0000 0000 0000 0000 0000 000r, where r = 1 if (rs < rt), 0 otherwise
- A < B => A - B < 0
  1. Perform subtraction using the full adder
  2. Check the highest-order bit (sign bit)
  3. The sign bit tells us whether A < B
- New input line (Less) goes directly to the mux
- New control code for slt
- Result for slt is not the output from the ALU
- Need a new 1-bit ALU for the most significant bit
  - It has a new output line (Set) used only for slt
  - (Overflow detection logic is also associated with this bit)
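The three slt steps can be sketched as follows. One addition beyond the slide, labeled here as an assumption of this sketch: the sign bit alone gives the wrong answer when A - B overflows, so the sketch XORs the sign bit with the overflow signal before using it as Set. The function name `slt` mirrors the mnemonic; `MASK32` is a hypothetical helper.

```python
MASK32 = 0xFFFFFFFF  # hypothetical helper constant


def slt(rs, rt):
    """Return 1 if rs < rt as signed 32-bit values, else 0."""
    a, b = rs & MASK32, rt & MASK32
    # Step 1: subtract using two's complement addition.
    diff = (a + (~b & MASK32) + 1) & MASK32
    # Step 2: take the highest-order (sign) bit of the difference.
    sign = diff >> 31
    # Overflow on a - b: operand signs differ and the result sign differs from a.
    overflow = ((a ^ b) & (a ^ diff)) >> 31
    # Step 3: the (overflow-corrected) sign bit is the Set output.
    return sign ^ overflow
```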

Slt Support
[Figure: 1-bit ALUs for the first bit (LSB) and for the sign bit]
- What is the control code for slt?

Overview
[Figure: ALU overview; labels: I-instruction, 32-bit memory address]

ALU Performance
- Hardware executes in parallel
- Is a 32-bit ALU as fast as a 1-bit ALU?
- Speed vs. cost
  - Fewer sequential gates vs. number of gates
- Two extremes to do addition
  - Ripple carry and sum-of-products
- How could you get rid of the ripple? Carry-lookahead adder

$$c_1 = b_0 c_0 + a_0 c_0 + a_0 b_0$$
$$c_2 = b_1 c_1 + a_1 c_1 + a_1 b_1, \qquad c_2 = c_2(a_0, b_0, c_0, a_1, b_1)$$
$$c_3 = b_2 c_2 + a_2 c_2 + a_2 b_2, \qquad c_3 = c_3(a_0, b_0, c_0, a_1, b_1, a_2, b_2)$$
$$c_4 = b_3 c_3 + a_3 c_3 + a_3 b_3, \qquad c_4 = c_4(a_0, b_0, c_0, a_1, b_1, a_2, b_2, a_3, b_3)$$

- Not feasible! Too many inputs to the gates

Conclusions
- We can build an ALU to support the MIPS ISA
  - Key idea: use a multiplexer to select the ALU output
  - Subtraction uses two's complement addition
  - Replicate the 1-bit ALU to produce a 32-bit ALU
- Important points about hardware
  - All of the gates in the ALU work in parallel
  - The speed of a gate is affected by the number of inputs
  - The speed of a circuit is affected by the number of gates in series (on the critical path or the deepest level of logic)
- Our primary focus (conceptual)
  - Clever changes to organization can improve performance (similar to using better algorithms in software)
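The carry equations above can be checked in software. The sketch below computes the carries two ways: stage by stage (ripple), and directly from the inputs by substituting each carry into the next. The generate and propagate terms `g` and `p` are the usual shorthand for $a_i b_i$ and $a_i + b_i$; they and the function names are assumptions not taken from the slides. Bit lists are least significant bit first.

```python
def to_bits(x, n):
    """Hypothetical helper: the n low bits of x, least significant first."""
    return [(x >> i) & 1 for i in range(n)]


def ripple_carries(a_bits, b_bits, c0):
    """c[i+1] = b_i c_i + a_i c_i + a_i b_i, one stage after another."""
    carries = [c0]
    for a, b in zip(a_bits, b_bits):
        c = carries[-1]
        carries.append((b & c) | (a & c) | (a & b))
    return carries


def lookahead_carries(a_bits, b_bits, c0):
    """Each carry written as a function of a, b and c0 only."""
    g = [a & b for a, b in zip(a_bits, b_bits)]  # generate
    p = [a | b for a, b in zip(a_bits, b_bits)]  # propagate
    carries = [c0]
    for i in range(len(g)):
        # Term in which c0 is propagated through stages 0..i.
        term = c0
        for k in range(i + 1):
            term &= p[k]
        carry = term
        # Terms in which stage j generates and stages j+1..i propagate.
        for j in range(i + 1):
            term = g[j]
            for k in range(j + 1, i + 1):
                term &= p[k]
            carry |= term
        carries.append(carry)
    return carries
```

The number of product terms grows with each bit position, which is the gate fan-in problem the slide calls not feasible for a full 32-bit expansion.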

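Review: 32-bit ALU
- 1-bit ALU
- Requirements: control codes and operations

Datapath
- Datapath for
  - ALU instructions
  - lw/sw instructions
  - Imm instructions
  - Branch instructions

[Figure: datapath with Registers (rs, rt), the 16-bit immediate from the IR sign-extended to 32 bits, a 0/1 mux on the second ALU input, the Control Unit, and ALU outputs Result, Zero, Overflow, and Memory address]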
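The sign-extend unit and the 0/1 mux on the second ALU input can be modeled in a few lines. This is only a sketch: the select name `alu_src` and the `signed` flag are hypothetical stand-ins for whatever control signals the control unit actually produces.

```python
MASK32 = 0xFFFFFFFF  # hypothetical helper constant


def sign_extend16(imm):
    """Copy bit 15 of a 16-bit immediate into bits 16..31."""
    imm &= 0xFFFF
    return imm | 0xFFFF0000 if imm & 0x8000 else imm


def zero_extend16(imm):
    """Fill bits 16..31 with zeros."""
    return imm & 0xFFFF


def alu_second_input(rt_value, imm16, alu_src, signed=True):
    """Mux: alu_src = 0 selects register rt, alu_src = 1 selects the immediate."""
    if alu_src == 0:
        return rt_value & MASK32
    return sign_extend16(imm16) if signed else zero_extend16(imm16)
```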

3.3 Multiplication
- More complicated than addition
  - Accomplished via shifting and addition
- Requires more time and chip area
- 3 versions of the pencil-and-paper algorithm

        0010    (multiplicand)
      x 1011    (multiplier)
        ----
        0010    1 -> copy & shift (multiplicand to left)
       0010     1 -> copy & shift
      0000      0 -> shift
     0010       1 -> copy & shift
    --------
    00010110    Sum Partial Products

First Version (V.1)
- Example: 0010 x 1011. The rows show the 8-bit multiplicand as it shifts left each step; the last row is the product.

    0000 0010
    0000 0100
    0000 1000
    0001 0000
    ---------
    0001 0110
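The first version can be sketched as a loop in which the multiplicand sits in a double-width register and shifts left while the multiplier shifts right. The function name `multiply_v1` and the width parameter `n` are assumptions for illustration; `n=4` reproduces the small example, and `n=32` corresponds to the 64-bit hardware.

```python
def multiply_v1(multiplicand, multiplier, n=32):
    """Unsigned shift-and-add multiply with a 2n-bit multiplicand register."""
    mask_2n = (1 << (2 * n)) - 1
    mcand = multiplicand & ((1 << n) - 1)  # starts in the right half
    mplier = multiplier & ((1 << n) - 1)
    product = 0
    for _ in range(n):
        if mplier & 1:  # test Multiplier0
            product = (product + mcand) & mask_2n
        mcand = (mcand << 1) & mask_2n  # shift multiplicand left
        mplier >>= 1  # shift multiplier right
    return product
```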

V.1: Hardware
[Figure: Multiplicand register (64 bits, shift left), 64-bit ALU, Product register (64 bits, write), Multiplier register (32 bits, shift right), and control test of Multiplier0]
- Problems: half of the bits of the multiplicand are always 0
- Wasteful, slow

Steps
- Unsigned multiplication: shift-and-add
  - Generate one partial product for each digit in the multiplier
  - Partial product = 0 if multiplier digit = 0; multiplicand if multiplier digit = 1
  - Total product = sum of (left-shifted) partial products
  - The multiplication of two n-bit binary integers results in a product of up to 2n bits in length
- Signed multiplication
  - Convert the operands to positive numbers and remember the original signs
  - Need to extend the sign of the product; there are better techniques

Second Version (V.2)
[Figure: Multiplicand (32 bits), 32-bit ALU, Multiplier (32 bits, shift right), Product (64 bits, shift right, write), control test]

Flowchart:
- Start
- 1. Test Multiplier0
  - Multiplier0 = 1: 1a. Add multiplicand to the left half of the product and place the result in the left half of the Product register
  - Multiplier0 = 0: go to step 2
- 2. Shift the Product register right 1 bit
- 3. Shift the Multiplier register right 1 bit
- 32nd repetition?
  - No (< 32 repetitions): return to step 1
  - Yes (32 repetitions): Done

Example: 0010 x 1011

    Iteration  Multiplier0  Product after step 1  Product after step 2
    (initial)                                     0000 0000
    1          1            0010 0000             0001 0000
    2          1            0011 0000             0001 1000
    3          0            0001 1000             0000 1100
    4          1            0010 1100             0001 0110
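The V.2 flowchart can be sketched as below. The multiplicand stays fixed, the add goes into the left half of the product, and the product shifts right. The name `multiply_v2`, the width parameter `n`, and the returned `trace` list are assumptions added so the 4-bit example can be reproduced.

```python
def multiply_v2(multiplicand, multiplier, n=32):
    """Return (product, trace); trace holds the product after each shift."""
    product = 0
    trace = []
    for _ in range(n):
        # Step 1 / 1a: if Multiplier0 is 1, add into the left half.
        if multiplier & 1:
            # Python integers keep the carry out of the left half,
            # and the shift below moves it back into the register.
            product += multiplicand << n
        # Step 2: shift the Product register right 1 bit.
        product >>= 1
        # Step 3: shift the Multiplier register right 1 bit.
        multiplier >>= 1
        trace.append(product)
    return product, trace
```

Calling `multiply_v2(0b0010, 0b1011, n=4)` yields the same sequence of shifted products as the last column of the example table.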

Final Version (V.3)
[Figure: Multiplicand (32 bits), 32-bit ALU, Product (64 bits, shift right, write), control test]

Flowchart:
- Start
- 1. Test Product0
  - Product0 = 1: 1a. Add multiplicand to the left half of the product and place the result in the left half of the Product register
  - Product0 = 0: go to step 2
- 2. Shift the Product register right 1 bit
- 32nd repetition?
  - No (< 32 repetitions): return to step 1
  - Yes (32 repetitions): Done

Example: 0010 x 1011 (the multiplier starts in the right half of the Product register)

    Iteration  Product0  Product after step 1  Product after step 2
    (initial)                                  0000 1011
    1          1         0010 1011             0001 0101
    2          1         0011 0101             0001 1010
    3          0         0001 1010             0000 1101
    4          1         0010 1101             0001 0110

General View
[Figure: Multiplicand register M (M31 ... M0), 32-bit ALU, carry bit C, register A (A31 ... A0), Multiplier register Q (Q31 ... Q0), shift right, add control]

Example: 1011 (multiplicand, 11) x 1101 (multiplier, 13) = 1000 1111 (product, 143)

    Cycle      Step   C  A     Q     M
    (initial)         0  0000  1101  1011
    1          Add    0  1011  1101  1011
               Shift  0  0101  1110  1011
    2          Shift  0  0010  1111  1011
    3          Add    0  1101  1111  1011
               Shift  0  0110  1111  1011
    4          Add    1  0001  1111  1011
               Shift  0  1000  1111  1011
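A minimal sketch of the final version: the multiplier is loaded into the right half of the product register, so a single right shift per iteration serves both. The name `multiply_v3` and the width parameter `n` are assumptions; with `n=4` the intermediate values follow the C, A, Q columns of the General View example.

```python
def multiply_v3(multiplicand, multiplier, n=32):
    """Unsigned multiply with the multiplier held in the product's right half."""
    low_mask = (1 << n) - 1
    m = multiplicand & low_mask
    product = multiplier & low_mask  # left half 0, right half = multiplier
    for _ in range(n):
        if product & 1:  # test Product0
            # Add into the left half; the extra bit Python keeps above
            # 2n bits plays the role of the carry bit C.
            product += m << n
        product >>= 1  # one shift moves C, the sum, and the multiplier
    return product
```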

MIPS Multiplication
- Special-purpose registers for the result (Hi, Lo)
- Two multiply instructions
  - mult: signed
  - multu: unsigned
- mfhi, mflo: move contents from Hi, Lo to general-purpose registers (GPRs)
- No overflow detection in hardware => software overflow detection
  - Hi must be 0 for multu, or the replicated sign of Lo for mult

Faster Multiplier
- Uses multiple adders
  - Cost/performance tradeoff
- Can be pipelined
  - Several multiplications performed in parallel
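The software overflow test described above can be sketched as follows. The functions model only the (Hi, Lo) results; the names `mult`, `multu`, `mult_overflows`, and `multu_overflows` and the helper `to_signed32` are assumptions for illustration, not part of any MIPS toolchain.

```python
MASK32 = 0xFFFFFFFF  # hypothetical helper constant


def to_signed32(x):
    """Interpret the low 32 bits of x as a two's complement value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def multu(rs, rt):
    """Unsigned multiply: return (Hi, Lo)."""
    p = (rs & MASK32) * (rt & MASK32)
    return (p >> 32) & MASK32, p & MASK32


def mult(rs, rt):
    """Signed multiply: return (Hi, Lo) of the 64-bit two's complement product."""
    p = (to_signed32(rs) * to_signed32(rt)) & 0xFFFFFFFFFFFFFFFF
    return (p >> 32) & MASK32, p & MASK32


def multu_overflows(hi, lo):
    """The product fits in 32 unsigned bits only if Hi is 0."""
    return hi != 0


def mult_overflows(hi, lo):
    """The product fits in 32 signed bits only if Hi replicates Lo's sign."""
    sign_fill = MASK32 if lo & 0x80000000 else 0
    return hi != sign_fill
```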

3.4 Division
- Long division of unsigned binary integers

                   00001101     Quotient
    Divisor  1011 )10010011     Dividend
                    1011
                   ------
                   001110       Partial remainder
                     1011
                     ------
                     001111     Partial remainder
                       1011
                       ----
                        100     Remainder

- Dividend = Quotient * Divisor + Remainder

Division Hardware
[Figure: division hardware; labels: "Initially divisor in left half", "Initially dividend"]
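The long division above can be sketched as a subtract-and-restore loop. This assumes the common textbook arrangement that the figure labels suggest (divisor starting in the left half of a double-width register and shifting right, remainder register starting with the dividend); the slide itself does not spell out the steps, so the function `divide_v1` and its parameter `n` are illustrative only.

```python
def divide_v1(dividend, divisor, n=32):
    """Unsigned restoring division; returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("divisor must be nonzero")
    remainder = dividend  # remainder register initially holds the dividend
    div = divisor << n  # divisor initially in the left half
    quotient = 0
    for _ in range(n + 1):
        remainder -= div  # trial subtraction
        if remainder >= 0:
            quotient = (quotient << 1) | 1  # it fit: quotient bit 1
        else:
            remainder += div  # restore the old value
            quotient = quotient << 1  # quotient bit 0
        div >>= 1  # shift the divisor right
    return quotient & ((1 << n) - 1), remainder
```

With `n=8`, `divide_v1(0b10010011, 0b1011, n=8)` reproduces the worked example: 147 divided by 11.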

Optimized Divider
- One cycle per partial-remainder subtraction
- Looks a lot like a multiplier!
  - Same hardware can be used for both

MIPS
- Multiply and divide use existing hardware
  - ALU and shifter
- Extra hardware: 64-bit register able to SLL/SRA
  - Hi contains the remainder (mfhi)
  - Lo contains the quotient (mflo)
- Instructions
  - div: signed divide
  - divu: unsigned divide
- MIPS ignores overflow?
- Division by 0 must be checked in software
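The Hi/Lo convention for divide can be sketched as below: Hi receives the remainder and Lo the quotient. Two choices here are assumptions of the sketch: the software check for a zero divisor is modeled by raising a Python exception, and signed division truncates toward zero with the remainder taking the sign of the dividend. The function names mirror the mnemonics and are illustrative only.

```python
MASK32 = 0xFFFFFFFF  # hypothetical helper constant


def to_signed32(x):
    """Interpret the low 32 bits of x as a two's complement value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def divu(rs, rt):
    """Unsigned divide: return (Hi, Lo) = (remainder, quotient)."""
    a, b = rs & MASK32, rt & MASK32
    if b == 0:  # stands in for the check software must perform
        raise ZeroDivisionError("check for division by 0 before dividing")
    return a % b, a // b


def div(rs, rt):
    """Signed divide: return (Hi, Lo) = (remainder, quotient)."""
    a, b = to_signed32(rs), to_signed32(rt)
    if b == 0:
        raise ZeroDivisionError("check for division by 0 before dividing")
    q = abs(a) // abs(b)  # divide magnitudes, then fix the sign
    if (a < 0) != (b < 0):
        q = -q
    r = a - q * b  # remainder has the sign of the dividend
    return r & MASK32, q & MASK32
```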

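MIPS Processor
[Figure: MIPS processor datapath with Registers, 16-to-32-bit sign extend, input muxes, IR, ALU (Sub and Operation controls; Zero, Overflow, and Memory address outputs), Control Unit, and the Hi/Lo register able to SLL/SRA]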