CS 351 Exam 2 Mon. 11/2/2015

Similar documents
CS 351 Exam 2 Wed. 4/5/2017

CS 351 Exam 2, Section 1 Wed. 11/2/2016

CS3350B Computer Architecture Quiz 3 March 15, 2018

CS 351 Exam 2, Fall 2012

ENCM 369 Winter 2013: Reference Material for Midterm #2 page 1 of 5

CSEN 601: Computer System Architecture Summer 2014

Computer Organization and Structure

Lab 4 Report. Single Cycle Design BAOTUNG C. TRAN EEL4713C

CENG 3420 Lecture 06: Datapath

Chapter 4. The Processor. Computer Architecture and IC Design Lab

CISC 662 Graduate Computer Architecture. Lecture 4 - ISA

Lecture 4: Review of MIPS. Instruction formats, impl. of control and datapath, pipelined impl.

CISC 662 Graduate Computer Architecture. Lecture 4 - ISA MIPS ISA. In a CPU. (vonneumann) Processor Organization

Laboratory Exercise 6 Pipelined Processors 0.0

Processor. Han Wang CS3410, Spring 2012 Computer Science Cornell University. See P&H Chapter , 4.1 4

COMPUTER ORGANIZATION AND DESIGN. The Hardware/Software Interface. Chapter 4. The Processor: A Based on P&H

ECE Exam I February 19 th, :00 pm 4:25pm

CENG 3420 Computer Organization and Design. Lecture 06: MIPS Processor - I. Bei Yu

Computer Architecture. The Language of the Machine

Processor (I) - datapath & control. Hwansoo Han

MIPS%Assembly% E155%

CSE 378 Midterm 2/12/10 Sample Solution


CSc 256 Midterm 2 Fall 2011

5/17/2012. Recap from Last Time. CSE 2021: Computer Organization. The RISC Philosophy. Levels of Programming. Stored Program Computers

Final Project: MIPS-like Microprocessor

Reduced Instruction Set Computer (RISC)

Recap from Last Time. CSE 2021: Computer Organization. Levels of Programming. The RISC Philosophy 5/19/2011

ELEC 5200/6200 Computer Architecture and Design Spring 2017 Lecture 4: Datapath and Control

CS Computer Architecture Spring Week 10: Chapter

ECE 331 Hardware Organization and Design. Professor Jay Taneja UMass ECE - Discussion 5 2/22/2018

The Processor: Datapath & Control

Anne Bracy CS 3410 Computer Science Cornell University. [K. Bala, A. Bracy, E. Sirer, and H. Weatherspoon]

ECE232: Hardware Organization and Design. Computer Organization - Previously covered

Lecture 2. Instructions: Language of the Computer (Chapter 2 of the textbook)

LECTURE 5. Single-Cycle Datapath and Control

EECS150 - Digital Design Lecture 10- CPU Microarchitecture. Processor Microarchitecture Introduction

NATIONAL UNIVERSITY OF SINGAPORE

COMPSCI 313 S Computer Organization. 7 MIPS Instruction Set

University of California College of Engineering Computer Science Division -EECS. CS 152 Midterm I

A Processor. Kevin Walsh CS 3410, Spring 2010 Computer Science Cornell University. See: P&H Chapter , 4.1-3

Machine Organization & Assembly Language

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor

Chapter 4. Instruction Execution. Introduction. CPU Overview. Multiplexers. Chapter 4 The Processor 1. The Processor.

Instructions: Language of the Computer

Instructions: MIPS ISA. Chapter 2 Instructions: Language of the Computer 1

Course Administration

Introduction to the MIPS. Lecture for CPSC 5155 Edward Bosworth, Ph.D. Computer Science Department Columbus State University

4. What is the average CPI of a 1.4 GHz machine that executes 12.5 million instructions in 12 seconds?

Reduced Instruction Set Computer (RISC)

CS 61C: Great Ideas in Computer Architecture. MIPS CPU Datapath, Control Introduction

ECE 331 Hardware Organization and Design. Professor Jay Taneja UMass ECE - Discussion 3 2/8/2018

EEM 486: Computer Architecture. Lecture 2. MIPS Instruction Set Architecture

MIPS R-format Instructions. Representing Instructions. Hexadecimal. R-format Example. MIPS I-format Example. MIPS I-format Instructions

Lecture 4: MIPS Instruction Set

ENGN1640: Design of Computing Systems Topic 03: Instruction Set Architecture Design

101 Assembly. ENGR 3410 Computer Architecture Mark L. Chang Fall 2009

Computer Architecture. MIPS Instruction Set Architecture

Concocting an Instruction Set

The MIPS Instruction Set Architecture

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor

Chapter 2. Computer Abstractions and Technology. Lesson 4: MIPS (cont )

ICS 233 COMPUTER ARCHITECTURE. MIPS Processor Design Multicycle Implementation

A General-Purpose Computer The von Neumann Model. Concocting an Instruction Set. Meaning of an Instruction. Anatomy of an Instruction

CS3350B Computer Architecture MIPS Instruction Representation

CS232 Final Exam May 5, 2001

M2 Instruction Set Architecture

ECE 2035 Programming HW/SW Systems Fall problems, 7 pages Exam Two 23 October 2013

THE HONG KONG UNIVERSITY OF SCIENCE & TECHNOLOGY Computer Organization (COMP 2611) Spring Semester, 2014 Final Examination

Grading: 3 pts each part. If answer is correct but uses more instructions, 1 pt off. Wrong answer 3pts off.

EE557--FALL 1999 MAKE-UP MIDTERM 1. Closed books, closed notes

The Processor. Z. Jerry Shi Department of Computer Science and Engineering University of Connecticut. CSE3666: Introduction to Computer Architecture

EECS150 - Digital Design Lecture 9- CPU Microarchitecture. Watson: Jeopardy-playing Computer

Chapter 2. Instructions: Language of the Computer. HW#1: 1.3 all, 1.4 all, 1.6.1, , , , , and Due date: one week.

Single Cycle CPU Design. Mehran Rezaei

Concocting an Instruction Set

Computer Architecture

MIPS Instruction Set

Faculty of Science FINAL EXAMINATION

CS 61C: Great Ideas in Computer Architecture Datapath. Instructors: John Wawrzynek & Vladimir Stojanovic

Chapter 2: Instructions:

MIPS Reference Guide

Computer Organization MIPS ISA

The MIPS Processor Datapath

Chapter 4. The Processor

Instructions: Language of the Computer

Instruction Set Architecture part 1 (Introduction) Mehran Rezaei

Midterm. Sticker winners: if you got >= 50 / 67

CENG3420 Lecture 03 Review

CPU Organization (Design)

Character Is a byte quantity (00~FF or 0~255) ASCII (American Standard Code for Information Interchange) Page 91, Fig. 2.21

Systems Architecture

MIPS ISA. 1. Data and Address Size 8-, 16-, 32-, 64-bit 2. Which instructions does the processor support

Anne Bracy CS 3410 Computer Science Cornell University. See P&H Chapter: , , Appendix B

Final Exam Fall 2007

CS 61c: Great Ideas in Computer Architecture

CSc 256 Midterm 2 Spring 2012

EECS 151/251A Fall 2017 Digital Design and Integrated Circuits. Instructor: John Wawrzynek and Nicholas Weaver. Lecture 13 EE141

CS222: MIPS Instruction Set

Chapter 5 Solutions: For More Practice

Transcription:

CS 351 Exam 2 Mon. 11/2/2015 Name: Rules and Hints The MIPS cheat sheet and datapath diagram are attached at the end of this exam for your reference. You may use one handwritten 8.5 11 cheat sheet (front and back). This is the only additional resource you may consult during this exam. No calculators. You may write your answers in the form [mathematical expression][units]. There is no need to actually do the arithmetic. Include step-by-step explanations and comments in your answers, and show as much of your work as possible, in order to maximize your partial credit. You may use extra scratch paper if you need more space, but make it clear where to find your answer to each question. Grade Your Score Max Score Problem 1: Short answer 18 Problem 2: Tracing the MIPS datapath 32 Problem 3: Pipeline performance and implementation 25 Problem 4: Caches 25 Total 100

Problem 1: Short answer (18 points) Part A (6 points) Assume that you have a MIPS processor whose stages are shown in the table: IF ID EX MEM WB 300 ps 350 ps 300 ps 350 ps 200 ps You have a pipelined and a non-pipelined version of this processor. The pipelined version takes 13 cycles to run a sequence of MIPS instructions, while the non-pipelined version takes 10 cycles. What can you conclude about their relative performance? Explain. Part B (6 points) In which stage of the datapath does branch prediction take place? What are the inputs and outputs of the branch predictor? 2

Part C (6 points) Is it possible to design a cache that doesn t need to store tag bits? Explain. Problem 2: Tracing the MIPS datapath (32 points) Answer the following questions about the single-cycle datapath provided at the end of this exam. This is the datapath that executes a single instruction, from start to finish, on every clock cycle. Be sure to explain your answers if you want to receive partial credit. Provide the exact numeric answer (in decimal, hex, or binary) to each question if possible. If not, describe the value without stating an exact number. Consider the execution of this instruction: ADDU $sp, $sp, $t0 Assume the following: The bits of this instruction are stored in addresses 900 903 in memory. The initial value in $sp is 0x8abcdefa. The initial value in $t0 is -8. The main control unit will set ALUOp to 00 for add, 01 for subtract, and 10 for look at the function field. 3

Part A: Instruction memory (4 points) What value goes into the instruction memory s Read Address port? How many bits is that value? What is the value of the instruction memory s Instruction[31:0] output? Part B: Register file (7 points) What values go into the register file s Read Register 1 and Read Register 2 ports? How many bits are those values? What values go into the register file s Write Register and Write Data ports? How many bits are those values? What is the value of the RegWrite control signal? Part C: ALU (7 points) What are the values of the ALU s two data inputs? How many bits are those values? 4

What two values go into the ALU control unit? How many bits are those values? What are the values of the ALU s ALU result and Zero outputs? How many bits are those values? Part D: Branch target adder (7 points) What two values are input to the top right adder? How many bits are those values? What is the output of the top right adder? What is the output of the top right mux? Conceptually, what is this mux choosing between? 5

Part E: Modifying the datapath (7 points) How would you set the control signals differently if this instruction were ADDIU $sp, $sp, -8? Specifically, point out any differences in the values of: RegWrite MemRead MemWrite RegDst ALUSrc ALUOp MemtoReg Branch Problem 3: Pipeline hazards (25 points) Part A: Data hazards (10 points) You have the following sequence of MIPS instructions: LW $t0, 0($sp) # Instruction A ADD $t1, $t0, $t2 # B LW $t3, 4($t1) # C SW $t3, 4($sffp) # D ADD $t4, $t5, $t6 # E Fill out the following chart. You are only responsible for documenting read-after-write dependencies between consecutive instructions. Producer and Consumer should be the instructions that produce and consume the value in question (e.g. A and B). Stage produced refers to the pipeline stage in which the producing instruction first produces the value that will ultimately go in the register. Stage consumed refers to the pipeline stage in which the consuming instruction will actually use this value. Leave any unneeded rows blank. The first row is just an example and doesn t pertain to this set of instructions. 6

Producer Register Stage produced Consumer Stage consumed X $s0 MEM Y EX Part B: Branch prediction (15 points) Consider a branch with the following pattern:... NT, T, T, NT, T, T,... Don t assume that the first branch outcome will be the NT it could start anywhere. Give the long-term accuracy of the following schemes. If the long-term accuracy depends on the initial state of the predictor, explain. Always predict not taken A 1-bit predictor A 2-bit predictor, initialized to weakly not taken Explain your answers. 7

Draw a state machine for a predictor that always achieves 100% long-term accuracy for this branch. You should specify the initial state of your predictor. 8

Problem 4: Caches (25 points) You have a system with the following memory hierarchy: Separate L1 instruction and data caches, which each have access times of 1 cycle and hit rates of 90% A unified L2 cache with an access time of 6 cycles and a hit rate of 75% Main memory with an access time of 120 cycles Part A: AMAT (10 points) What is the average memory access time, in cycles, on this system? Part B: Tracing (15 points) For this problem, consider a cache that Has 8 blocks (entries) Is direct-mapped Has 1 byte per block Memory addresses on this system are 8 bits. Assume that all memory references in this problem are reads (instruction fetches or data loads) and that the cache is initially empty (all blocks invalid). 9

Draw the state of the cache, including all metadata, after the following sequence of references: 3, 1, 17, 1, 17, 1, 17, 1, 17, 3. What is the hit rate of this sequence of references? How many of the misses were cold-start vs. capacity vs.conflict? 10

MIPS Arithmetic Instructions Instruction Operation Fmt Opcode Funct ADD $rd, $rs, $rt reg[rd] = reg[rs] + reg[rt] R 0 0x20 ADDI $rt, $rs, imm reg[rt] = reg[rs] + SignExt(imm) I 0x08 ADDU $rd, $rs, $rt reg[rd] = reg[rs] + reg[rt] R 0 0x21 ADDIU $rt, $rs, imm reg[rt] = reg[rs] + SignExt(imm) I 0x09 SUB $rd, $rs, $rt reg[rd] = reg[rs] - reg[rt] R 0 0x22 SUBU $rd, $rs, $rt reg[rd] = reg[rs] - reg[rt] R 0 0x23 LUI $rt, imm reg[rt] = (imm << 16) 0 I 0x0f The signed instructions (ADD, ADDI, SUB) can cause overflow exceptions. MIPS Logical Instructions Instruction Operation Fmt Opcode Funct AND $rd, $rs, $rt reg[rd] = reg[rs] & reg[rt] R 0 0x24 ANDI $rt, $rs, imm reg[rt] = reg[rs] & ZeroExt(imm) I 0x0c NOR $rd, $rs, $rt reg[rd] = ~(reg[rs] reg[rt]) R 0 0x27 OR $rd, $rs, $rt reg[rd] = reg[rs] reg[rt] R 0 0x25 ORI $rt, $rs, imm reg[rt] = reg[rs] ZeroExt(imm) I 0x0d SLL $rd, $rt, shamt reg[rd] = reg[rt] << shamt R 0 0x00 SRL $rd, $rt, shamt reg[rd] = reg[rt] >> shamt R 0 0x02 The SLL and SRL instructions fill in the shifted-out bits with 0. MIPS Branch and Jump Instructions Instruction Operation Fmt Opcode Funct BEQ $rs, $rt, label if (reg[rs] == reg[rt]) PC=BrAddr I 0x04 BNE $rs, $rt, label if (reg[rs]!= reg[rt]) PC=BrAddr I 0x05 J label PC = JumpAddr J 0x02 JAL label $ra = PC+4; PC = JumpAddr J 0x03 JR $rs PC = reg[$rs] R 0 0x08 BrAddr = PC + 4 + SignExt (imm << 2) JumpAddr = (PC+4)[31:28] (26-bit imm << 2) MIPS Memory Access Instructions Instruction Operation Fmt Opcode Funct LW $rt, imm($rs) reg[rt] = Mem[rs + SignExt(imm)] I 0x23 SW $rt, imm($rs) Mem[rs + SignExt(imm)] = reg[rt] I 0x2b LH $rt, imm($rs) Loads 16b from memory I 0x21 LHU $rt, imm($rs) Loads 16b from memory I 0x25 SH $rt, imm($rs) Stores 16b to memory I 0x29 LB $rt, imm($rs) Loads 8b from memory I 0x20 LBU $rt, imm($rs) Loads 8b from memory I 0x24 SB $rt, imm($rs) Stores 8b to memory I 0x28 LH and LB sign-extend the values read from memory in order to fill the leftmost bits of the 32-bit register. LHU/LBU zero-extend.

MIPS Comparison Instructions Instruction Operation Fmt Opcode Funct SLT $rd, $rs, $rt reg[rd] = ($rs < $rt) R 0 0x2a SLTI $rt, $rs, imm reg[rt] = ($rs < SignExt(imm)) I 0x0a SLTU $rd, $rs, $rt reg[rd] = ($rs < $rt) R 0 0x2b SLTIU $rt, $rs, imm reg[rt] = ($rs < SignExt(imm)) I 0x0b Instruction Formats R- format 6b: opcode 5b: rs 5b: rt 5b: rd 5b: shamt 6b: funct I- format 6b: opcode 5b: rs 5b: rt 16b: immediate operand J- format 6b: opcode 26b: JumpAddr[28:2] Register List Name Number Use Preserved across function call? $zero 0 Constant value 0 Y $at 1 Assembler temporary N $v0-$v1 2-3 Function return values N $a0-$a3 4-7 Function arguments N $t0-$t7 8-15 Temporary values N $s0-$s7 16-23 Saved values Y $t8-$t9 24-25 More temporary values N $k0-$k1 26-27 Reserved for OS kernel N $gp 28 Global (heap) pointer Y $sp 29 Stack pointer Y $fp 30 Frame pointer Y $ra 31 Return address Y

FIGURE 4.17 The simple datapath with the control unit. The input to the control unit is the 6-bit opcode field from the instruction. The outputs of the control unit consist of three 1-bit signals that are used to control multiplexors (RegDst, ALUSrc, and MemtoReg), three signals for con trolling reads and writes in the register file and data memory (RegWrite, MemRead, and MemWrite), a 1-bit signal used in determining whether to possibly branch (Branch), and a 2-bit control signal for the ALU (ALUOp). An AND gate is used to combine the branch control signal and the Zero output from the ALU; the AND gate output controls the selection of the next PC. Notice that PCSrc is now a derived signal, rather than one coming directly from the control unit. Thus, we drop the signal name in subsequent figures. Copyright 2009 Elsevier, Inc. All rights reserved. Chapter 4 The Processor 18