HW2 solutions

3.10

Pseudoinstruction       What is accomplished            Minimum sequence of MIPS instructions
move $t5, $t3           $t5 = $t3                       add  $t5, $t3, $0
clear $t5               $t5 = 0                         xor  $t5, $t5, $t5
li $t5, small           $t5 = small                     addi $t5, $0, small
li $t5, big             $t5 = big                       lui  $t5, big[31:16]
                                                        ori  $t5, $t5, big[15:0]
lw $t5, big($t3)        $t5 = mem[$t3 + big]            lui  $t5, big[31:16]
                                                        ori  $t5, $t5, big[15:0]
                                                        add  $t5, $t5, $t3
                                                        lw   $t5, 0($t5)
addi $t5, $t3, big      $t5 = $t3 + big                 lui  $t5, big[31:16]
                                                        ori  $t5, $t5, big[15:0]
                                                        add  $t5, $t5, $t3
beq $t5, small, L       if $t5 = small, branch to L     addi $at, $0, small
                                                        beq  $t5, $at, L
beq $t5, big, L         if $t5 = big, branch to L       lui  $at, big[31:16]
                                                        ori  $at, $at, big[15:0]
                                                        beq  $at, $t5, L
ble $t5, $t3, L         if $t5 <= $t3, branch to L      slt  $at, $t3, $t5
                                                        beq  $at, $0, L
bge $t5, $t3, L         if $t5 >= $t3, branch to L      slt  $at, $t5, $t3
                                                        beq  $at, $0, L
bgt $t5, $t3, L         if $t5 > $t3, branch to L       slt  $at, $t3, $t5
                                                        bne  $at, $0, L

3.25 You did this for Lab 1.

3.29

      sbn temp, temp, .+1     # temp = 0;
      sbn temp, b, .+1        # temp = -b;
      sbn a, temp, .+1        # a = a - (-b) = a + b;

3.30

      sbn neg_a, neg_a, .+1   # neg_a = 0;
      sbn neg_a, a, .+1       # neg_a = -a;
      sbn c, c, .+1           # c = 0;
loop: sbn b, one, .+1         # do { b = b - 1;
      sbn c, neg_a, .+1       #      c = c + a;
      sbn temp, temp, .+1     #      temp = 0;
      sbn temp, b, loop       # } while (b > 0);

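The 3.29 and 3.30 sequences can be checked with a small simulator. The sketch below is illustrative only. It assumes the usual meaning of sbn a, b, L (Mem[a] = Mem[a] - Mem[b], then branch to L if the result is negative) and models .+1 as the index of the next instruction. The names run_sbn, ADD and MULTIPLY are hypothetical and not part of the original solution.

```python
def run_sbn(program, mem, max_steps=10_000):
    """Run a list of (a, b, target) sbn instructions on the dict mem.

    Assumed semantics: mem[a] -= mem[b]; jump to target if mem[a] < 0,
    otherwise fall through to the next instruction.
    """
    pc = 0
    steps = 0
    while pc < len(program):
        if steps == max_steps:
            raise RuntimeError("step limit reached")
        a, b, target = program[pc]
        mem[a] -= mem[b]
        pc = target if mem[a] < 0 else pc + 1
        steps += 1
    return mem


# 3.29: a = a + b. Every branch target is ".+1", i.e. the next index.
ADD = [
    ("temp", "temp", 1),   # temp = 0
    ("temp", "b", 2),      # temp = -b
    ("a", "temp", 3),      # a = a - (-b)
]

# 3.30: c = a * b. Index 3 is the "loop" label.
MULTIPLY = [
    ("neg_a", "neg_a", 1),  # neg_a = 0
    ("neg_a", "a", 2),      # neg_a = -a
    ("c", "c", 3),          # c = 0
    ("b", "one", 4),        # loop: b = b - 1
    ("c", "neg_a", 5),      # c = c + a
    ("temp", "temp", 6),    # temp = 0
    ("temp", "b", 3),       # temp = -b; repeat while b > 0
]

if __name__ == "__main__":
    print(run_sbn(ADD, {"a": 5, "b": 7, "temp": 99})["a"])  # 12
    start = {"a": 6, "b": 4, "c": 0, "one": 1, "neg_a": 0, "temp": 0}
    print(run_sbn(MULTIPLY, start)["c"])  # 24
```

Under these assumptions, running MULTIPLY with b = 0 still executes the loop body once, so c ends up equal to a rather than 0.
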
Note (1) This solution does not work if b = 0. That is acceptable, because the problem description says to assume that a and b are greater than 0. Perfectionist students are likely to write solutions that do work for b = 0, though, so their answers would be an instruction or two longer.

4.17

      add  $t2, $t3, $t4
      slt  $t2, $t2, $t3

4.23 You need only alter the full adder for the MSB such that the Set output is the value of the full adder output XORed with the Overflow.

4.24

(a * 2^32 + b) * (c * 2^32 + d) = a * c * 2^64 + a * d * 2^32 + b * c * 2^32 + b * d

      multu $t5, $t7          # b * d
      mflo  $t3               # product[31:0] = (b * d)[31:0]
      mfhi  $t2               # product[63:32] = (b * d)[63:32]
      multu $t4, $t7          # a * d
      mflo  $t8               # $t8 = (a * d)[31:0]
      mfhi  $t1               # product[95:64] = (a * d)[63:32]
      addu  $t2, $t2, $t8     # product[63:32] += (a * d)[31:0]
      sltu  $t8, $t2, $t8     # $t8 = carry of 63:32 in last op
      addu  $t1, $t1, $t8     # product[95:64] += carry
      sltu  $t0, $t1, $t8     # product[127:96] = carry in 95:64
      multu $t5, $t6          # b * c
      mflo  $t8               # $t8 = (b * c)[31:0]
      addu  $t2, $t2, $t8     # product[63:32] += (b * c)[31:0]
      sltu  $t8, $t2, $t8     # $t8 = carry of 63:32 in last op
      addu  $t1, $t1, $t8     # product[95:64] += carry
      mfhi  $t8               # $t8 = (b * c)[63:32]
      addu  $t1, $t1, $t8     # product[95:64] += (b * c)[63:32]
      sltu  $t8, $t1, $t8     # $t8 = carry of 95:64 in last op
      addu  $t0, $t0, $t8     # product[127:96] += carry
      multu $t4, $t6          # a * c
      mflo  $t8               # $t8 = (a * c)[31:0]
      addu  $t1, $t1, $t8     # product[95:64] += (a * c)[31:0]
      sltu  $t8, $t1, $t8     # $t8 = carry of 95:64 in last op
      addu  $t0, $t0, $t8     # product[127:96] += carry
      mfhi  $t8               # $t8 = (a * c)[63:32]
      addu  $t0, $t0, $t8     # product[127:96] += (a * c)[63:32]

4.52 Each CSA has a delay of 2T.

The iterative CLA-based multiplier takes: 16 layers * CLA delay = 16 * 7T = 112T
The CSA multiplier takes: 6 layers * 2T + CLA delay = 6 * 2T + 7T = 19T
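The 4.52 arithmetic can be reproduced in a few lines. The sketch below assumes that the 6 layers come from reducing 16 partial products to 2 with carry-save adders that each take 3 operands and produce 2. The solution does not state how the layer count was obtained, so that reading is an assumption, and csa_layers is a hypothetical helper name. The 2T and 7T delays are the ones given above.

```python
def csa_layers(operands):
    """Count 3-to-2 carry-save layers needed to reduce operands to two."""
    layers = 0
    while operands > 2:
        # Each full group of three operands becomes two; leftovers pass through.
        operands = 2 * (operands // 3) + operands % 3
        layers += 1
    return layers


CSA_DELAY = 2          # in units of T
CLA_DELAY = 7          # in units of T
PARTIAL_PRODUCTS = 16

iterative_delay = PARTIAL_PRODUCTS * CLA_DELAY
csa_delay = csa_layers(PARTIAL_PRODUCTS) * CSA_DELAY + CLA_DELAY

if __name__ == "__main__":
    print(csa_layers(PARTIAL_PRODUCTS), iterative_delay, csa_delay)  # 6 112 19
```
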

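The 4.24 decomposition can be cross-checked against arbitrary-precision arithmetic. The sketch below is not a translation of the register-level code. It is a minimal model that forms the same four 32 x 32 partial products and propagates carries with the same add-then-compare idea as addu followed by sltu. The helper names multu, mul_64x64 and as_int are hypothetical.

```python
MASK = 0xFFFFFFFF


def multu(x, y):
    """Model of multu: return (hi, lo) halves of a 32 x 32 unsigned product."""
    product = x * y
    return (product >> 32) & MASK, product & MASK


def mul_64x64(a, b, c, d):
    """Return (a*2^32 + b) * (c*2^32 + d) as four 32-bit words, low word first."""
    words = [0, 0, 0, 0]

    def add_into(index, value):
        # Add value into words[index], rippling the carry into higher words.
        while index < 4 and value:
            total = (words[index] + value) & MASK    # like addu
            value = 1 if total < value else 0        # like sltu: detect wrap-around
            words[index] = total
            index += 1

    # (factor, factor, index of the word where the low half lands)
    for x, y, position in ((b, d, 0), (a, d, 1), (b, c, 1), (a, c, 2)):
        hi, lo = multu(x, y)
        add_into(position, lo)
        add_into(position + 1, hi)
    return words


def as_int(words):
    """Reassemble low-word-first 32-bit words into one Python integer."""
    return sum(word << (32 * i) for i, word in enumerate(words))


if __name__ == "__main__":
    a, b, c, d = 0x12345678, 0x9ABCDEF0, 0x0FEDCBA9, 0x87654321
    expected = ((a << 32) | b) * ((c << 32) | d)
    print(as_int(mul_64x64(a, b, c, d)) == expected)
```
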
4.53

(a[i+1] a[i] a[i-1])
0 0 0 == NOP + NOP = NOP
0 0 1 == NOP + multiplicand = multiplicand
0 1 0 == 2 * multiplicand + (-multiplicand) = multiplicand
0 1 1 == 2 * multiplicand + NOP = 2 * multiplicand
1 0 0 == -(2 * multiplicand) + NOP = -(2 * multiplicand)
1 0 1 == -(2 * multiplicand) + multiplicand = -multiplicand
1 1 0 == NOP + (-multiplicand) = -multiplicand
1 1 1 == NOP + NOP = NOP

4.54 See Lecture notes. Basic algorithm is: Take the top 4 bits of the dividend, and subtract off the divisor. Based on the top bit of the result, we choose whether the next stage is an add (top bit was 1) or a subtract (top bit was 0). The inverted value of this top bit is also the quotient bit for that stage. The input to the next stage is simply the lower 3 bits of the subtracted (or added) result, along with the next bit of the dividend. The divisor remains the same. You continue this until you have used up all the bits of the dividend. The remainder is the final sum, unless the top bit is 1, in which case you have to add the divisor to that final sum to fix the remainder.

A5

            .ktext 0x80000080
            sw    $a0, save0
            sw    $a1, save1
            mfc0  $k0, $13            # Move Cause into $k0
            mfc0  $k1, $14            # Move EPC into $k1
            addiu $v0, $zero, 0x44
            slt   $v0, $v0, $k0       # Ignore interrupts
            bgtz  $v0, _restore
            move  $a0, $k0            # Move Cause into $a0
            move  $a1, $k1            # EPC into $a1
            jal   print_excp          # Print exception error msg
_restore:   lw    $a0, save0
            lw    $a1, save1
            lw    $k0, -4($k1)        # $k0 = previous instruction
            srl   $k0, $k0, 26        # $k0 = opcode of prev instr
            ori   $k1, $zero, 2       # opcode of j
            beq   $k0, $k1, _delayslot
            ori   $k1, $zero, 4       # opcode of beq
            beq   $k0, $k1, _delayslot
            # and so on for: jr, jal, bne, bltz, bgezal, bczt...
_done:      mfc0  $k1, $14            # reload EPC into $k1
            addiu $k1, $k1, 4         # Do not reexecute fault instr
            jr    $k1
            rfe                       # done in delay-slot of jr
_delayslot: mfc0  $k1, $14            # reload EPC into $k1
            addiu $k0, $k1, -4        # $k0 = EPC - 4
            addiu $k1, $k1, 4         # $k1 = EPC + 4
            jr    $k0                 # poke at branching instr
            j     _check
_check:     rfe
            jr    $k1
            or    $zero, $zero, $zero

            .kdata
save0:      .word 0
save1:      .word 0

This problem is hard. The basic idea of this solution is to do everything possible in order not to touch the instruction that caused the exception. We need a way to poke the branching instruction, that is, to execute that instruction without executing any instructions around it. This procedure works by calling the branching instruction with jr, but putting a j in the delay-slot of the jr, so that we jump back after executing the branching instruction and do not execute its regular delay-slot. If it turns out that the branch is not taken (which may happen with a bne or beq), then we jump back to EPC+4. Note: this solution assumes that a j in a branch delay slot will NOT be executed if the branch is taken! Other elegant solutions will be highly appreciated.

B.6

A  B  !A  !B  !(A+B)  !A*!B  !(A*B)  !A+!B
0  0  1   1   1       1      1       1
0  1  1   0   0       0      1       1
1  0  0   1   0       0      1       1
1  1  0   0   0       0      0       0

B.10

a) F = (!x3 && !x2 && x1) || (!x3 && x2 && !x1) || (x3 && !x2 && !x1)
b) F = (!x3 && x2 && x1) || (x3 && !x2 && x1) || (x3 && x2 && !x1)
c) F = (!x3 && !x2) || (!x3 && !x1)
d) F = (x3 && !x2) || (x3 && x1)

B.14 Simply use two muxes:

B.21
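Looking back at the truth table and the four functions given for B.6 and B.10, both can be checked by enumerating every input combination. The sketch below assumes that the product terms of each function are joined by OR, since the operator between them did not survive in the transcription. The names de_morgan_holds, minterms and FUNCTIONS are hypothetical.

```python
from itertools import product


def de_morgan_holds():
    """Check both De Morgan identities for every combination of A and B."""
    return all(
        (not (a or b)) == ((not a) and (not b))
        and (not (a and b)) == ((not a) or (not b))
        for a, b in product((False, True), repeat=2)
    )


def minterms(f):
    """List the x3 x2 x1 input patterns for which f is true."""
    return ["%d%d%d" % bits for bits in product((0, 1), repeat=3) if f(*bits)]


# Assumption: the terms of each function are ORed together.
FUNCTIONS = {
    "a": lambda x3, x2, x1: (not x3 and not x2 and x1)
    or (not x3 and x2 and not x1)
    or (x3 and not x2 and not x1),
    "b": lambda x3, x2, x1: (not x3 and x2 and x1)
    or (x3 and not x2 and x1)
    or (x3 and x2 and not x1),
    "c": lambda x3, x2, x1: (not x3 and not x2) or (not x3 and not x1),
    "d": lambda x3, x2, x1: (x3 and not x2) or (x3 and x1),
}

if __name__ == "__main__":
    print(de_morgan_holds())
    for name, f in FUNCTIONS.items():
        print(name, minterms(f))
```
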

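Returning to the division procedure described in 4.54: the add-or-subtract rule can be sketched in software and compared against ordinary division. The version below is a generic non-restoring divider and not the lecture-note datapath. It keeps the partial remainder as a signed Python integer instead of a fixed-width value with a top bit, and the name nonrestoring_divide and its bits parameter are hypothetical.

```python
def nonrestoring_divide(dividend, divisor, bits=4):
    """Unsigned non-restoring division of a bits-wide dividend.

    Returns (quotient, remainder). A negative partial remainder plays the
    role of "top bit was 1" in the description above.
    """
    if divisor <= 0:
        raise ValueError("divisor must be positive")
    remainder = 0
    quotient = 0
    for i in range(bits - 1, -1, -1):
        next_bit = (dividend >> i) & 1
        if remainder >= 0:
            # Previous result was non-negative: this stage subtracts.
            remainder = 2 * remainder + next_bit - divisor
        else:
            # Previous result was negative: this stage adds instead.
            remainder = 2 * remainder + next_bit + divisor
        # The quotient bit is the inverse of the sign of the new result.
        quotient = (quotient << 1) | (1 if remainder >= 0 else 0)
    if remainder < 0:
        # Final fix-up: add the divisor back to get the true remainder.
        remainder += divisor
    return quotient, remainder


if __name__ == "__main__":
    print(nonrestoring_divide(13, 3))          # (4, 1)
    print(nonrestoring_divide(200, 7, bits=8)) # (28, 4)
```
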
B.22 State Assignments: Left (00), Middle a (01), Right (10), Middle b (11)

Current    Next
S1 S0      S1 S0
0  0       0  1
0  1       1  0
1  0       1  1
1  1       0  0

Solving the K-Maps for the next-state values of S1 and S0, you get:

next S1 = XOR (S1, S0)
next S0 = NOT (S0)

The Outputs are associated with the state (where both Middle a and Middle b output Middle).

C.1 This looks just like the PLA on page C-20, except that there are now state bits S0 through S9. The logic is the same; it just looks a lot bigger. Each column should also be connected to only one of the state bits.
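The B.22 next-state equations can be stepped through to confirm that they reproduce the transition table. This is a minimal sketch. The names STATE_NAMES, OUTPUTS, next_state and walk are hypothetical, and the starting state Left is chosen only for illustration.

```python
STATE_NAMES = {
    (0, 0): "Left",
    (0, 1): "Middle a",
    (1, 0): "Right",
    (1, 1): "Middle b",
}

# Both middle states drive the same Middle output.
OUTPUTS = {"Left": "Left", "Middle a": "Middle", "Right": "Right", "Middle b": "Middle"}


def next_state(s1, s0):
    """Next-state logic from the K-maps: S1 <- S1 XOR S0, S0 <- NOT S0."""
    return s1 ^ s0, s0 ^ 1


def walk(start=(0, 0), steps=4):
    """Return the names of the states visited over the given number of steps."""
    state = start
    visited = []
    for _ in range(steps):
        visited.append(STATE_NAMES[state])
        state = next_state(*state)
    return visited


if __name__ == "__main__":
    print(walk())                               # ['Left', 'Middle a', 'Right', 'Middle b']
    print([OUTPUTS[name] for name in walk()])   # ['Left', 'Middle', 'Right', 'Middle']
```
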