Winter 2006 FINAL EXAMINATION Auxiliary Gymnasium Tuesday, April 18 7:00pm to 10:00pm

University of Calgary
Department of Electrical and Computer Engineering
ENCM 369: Computer Organization
Lecture Instructor for L01 and L02: Dr. S. A. Norman

Winter 2006 FINAL EXAMINATION
Auxiliary Gymnasium
Tuesday, April 18, 7:00pm to 10:00pm

NAME (printed):
U of C ID NUMBER:
LECTURE SECTION (L01 was MWF at 8am, L02 was MWF at noon):
SIGNATURE:

Please don't write anything within this box.

    Problem   Marks
    1         / 10
    2         / 12
    3         / 10
    4         / 17
    5         / 3
    6         / 8
    7         / 8
    8         / 10
    9         / 9
    TOTAL     / 87

Instructions

Please note that the official University of Calgary examination regulations are printed on page 1 of the Examination Regulations and Reference Material booklet that accompanies this examination paper. All of those regulations are in effect for this examination, except that you must write your answers on the question paper, not in the examination booklet.

You may not use electronic calculators or computers during the examination.

The examination is closed-book. You may not refer to books or notes during the examination, with one exception: you may refer to the Examination Regulations and Reference Material booklet that accompanies this examination paper.

You are not required to add comments to assembly language code you write, but you are strongly encouraged to do so, because writing good comments will improve the probability that your code is correct and will help you to check your code after it is finished.

Some problems are relatively easy and some are relatively difficult. Go after the easy marks first.

Write all answers on the question paper and hand in the question paper when you are done. Please do not hand in the Examination Regulations and Reference Material booklet.

Please print or write your answers legibly. What cannot be read cannot be marked.

If you write anything you do not want marked, put a large X through it and write rough work beside it.

You may use the backs of pages for rough work.

ENCM 369 Winter 2006 Final Examination page 2 of 9

PROBLEM 1 (10 marks). Below is the beginning of a SPIM translation of the procedure func1 in the C code listed below. Complete the SPIM translation, using only instructions from the Final Examination Instruction Subset described in the Examination Regulations and Reference Material booklet. Follow the calling conventions used in lectures and labs, and observe the following additional conventions regarding floating-point registers:

    $f2, $f4, ..., $f10 may be used like $t0-$t9;
    $f20, $f22, ..., $f30 may be used like $s0-$s7.

C code:

    void func1(double *y, double *x, int n)
    {
        int i;
        for (i = 0; i < n; i++)
            y[i] = x[i] + 0.01 * i;
    }

Beginning of the SPIM translation:

            .data
    c0pt01: .double 0.01
            .text
            .globl  func1
    func1:
            la      $t0, c0pt01
            l.d     $f2, ($t0)

PROBLEM 2 (12 marks). Consider the following C program, which has been correctly translated into SPIM code in the listing below it. Note that the reverse function makes a copy of a string, with the order of the non-'\0' characters reversed.

    char foo[] = "AEIOU";

    void reverse(char *dest, const char *src)
    {
        const char *p;
        p = src;
        while (*p != '\0')
            p++;
        while (p != src) {
            p--;
            *dest = *p;
            dest++;
        }
        *dest = '\0';
    }

    int main(void)
    {
        char buf[8];
        buf[4] = 'X';
        buf[5] = 'X';
        buf[6] = 'X';
        buf[7] = 'X';
        reverse(buf, foo);
        return 0;
    }

SPIM translation:

            .data
            .globl  foo
    foo:    .asciiz "AEIOU"

            .text
            .globl  reverse
    reverse:
            addu    $t0, $a1, $zero
    L1:
            lb      $t1, ($t0)
            beq     $t1, $zero, L2
            addiu   $t0, $t0, 1
            j       L1
    L2:
            beq     $t0, $a1, L3
            addiu   $t0, $t0, -1
            lb      $t2, ($t0)
            sb      $t2, ($a0)
            addiu   $a0, $a0, 1
            j       L2
    L3:
            sb      $zero, ($a0)
            # POINT ONE
            jr      $ra

            .text
            .globl  main
    main:
            addiu   $sp, $sp, -12
            sw      $ra, 8($sp)
            ori     $t9, $zero, 'X'
            sb      $t9, 4($sp)
            sb      $t9, 5($sp)
            sb      $t9, 6($sp)
            sb      $t9, 7($sp)
            addiu   $a0, $sp, 0
            la      $a1, foo
            jal     reverse
            # Next 2 lines were in
            # wrong order in original.
            lw      $ra, 8($sp)
            addiu   $sp, $sp, 12
            jr      $ra

For the point in time when the assembly language program reaches POINT ONE, list the values of the registers in the table below as hexadecimal numbers. Also show the contents of the stack frame of main as hexadecimal numbers in the diagram below. Note that two of the words in the stack frame are used to hold bytes, and the address offsets of the bytes are indicated in the upper left corner of each byte.

To solve the problem, you will need some (but maybe not all) of the following information:

    In ASCII, 'A' is 0x41, 'E' is 0x45, 'I' is 0x49, 'O' is 0x4f, 'U' is 0x55, and 'X' is 0x58.
    The address of foo[0] is 0x1001_0000.
    When main starts, $ra contains 0x0040_0018 and $sp contains 0x7fff_ff40.
    The address of the first instruction of reverse is 0x0040_0024 and the address of the first instruction of main is 0x0040_0058.
    SPIM is able to translate the la pseudoinstruction in main into a single machine instruction.

    register   value
    $ra
    $a0
    $a1
    $t0
    $t1
    $t2
    $t9

[Diagram: stack frame of main, drawn with high addresses at the top. The region above the frame is labelled "data saved before main was called". Two of the words in the frame are divided into four bytes each, with the bytes labelled 0, 1, 2, 3.]

PROBLEM 3 (10 marks). Write a SPIM translation of the procedure func2 in the C code shown below. Use only instructions from the Final Examination Instruction Subset described in the Examination Regulations and Reference Material booklet. Follow the calling conventions used in lectures and labs, and observe the following additional conventions regarding floating-point registers:

    floating-point return values of type double go in $f0;
    $f2, $f4, ..., $f10 may be used like $t0-$t9;
    $f20, $f22, ..., $f30 may be used like $s0-$s7;
    the arguments to func2 arrive in $f12, $f14, and $f16;
    the argument to func3 goes in $f12.

C code:

    double func3(double func3_arg);

    double func2(double left, double right, double x)
    {
        double v;
        v = func3(x);
        if (v < left)
            v = left;
        else if (right < v)
            v = right;
        return 3.14159265358979323846 * v - x;
    }

Hint: Making a diagram for the stack frame of func2 is highly recommended.

PROBLEM 4 (total of 17 marks). The Exam16 ISA (instruction set architecture) describes a system in which addresses, instructions, and data words are all 16 bits wide. It has sixteen 16-bit general purpose registers, and a 16-bit PC. The instructions of Exam16 are as follows. [Correction notice: The table on the original exam had bits 15-12 wrong in sub and slt.]

    Mnemonic   Format                Description
    add        0000_ssss_tttt_dddd   Add source registers selected by bits ssss and
                                     tttt, put result in register selected by bits dddd.
    sub        0001_ssss_tttt_dddd   Same as add, except ALU operation is subtraction.
    slt        0010_ssss_tttt_dddd   Same as add, except ALU operation is
                                     set-on-less-than.
    brz        0011_ssss_oooo_oooo   Branch if register is zero: If register selected by
                                     bits ssss contains zero, branch forward or backward
                                     by number of instructions in 8-bit 2's-complement
                                     offset oooo_oooo.
    lw         0100_0000_aaaa_dddd   Using register selected by bits aaaa as an address,
                                     load word from data memory into register selected
                                     by bits dddd.
    sw         0101_ssss_aaaa_0000   Using register selected by bits aaaa as an address,
                                     store word from register selected by bits ssss into
                                     data memory.

Note that unlike MIPS, there are no offsets built into Exam16 load and store instructions.

Below is a nearly-complete datapath for a computer that implements the Exam16 ISA. It is very much in the style of the single-cycle MIPS subset implementation studied in ENCM 369. Note that there are two 16-bit adders and a 16-bit ALU. The ALUOp signal works as follows: 00 asks for addition, 01 for subtraction, and 10 for set-on-less-than. The circuit labeled "All Bits 0?" is a big NOR gate: the 1-bit output is 1 if all 16 input bits are 0, and is 0 otherwise.

[Figure: nearly-complete Exam16 datapath. Labels recoverable from the figure: PC (clocked); Instruction Memory with an Address input and an Instruction[15-0] output; a constant 2 feeding Adder #1; Sign Extend applied to Instruction[7-0], with a 16-bit output; Shift Left 1; Adder #2; All Bits 0?; Registers (clocked) with inputs Reg #1, Reg #2, Write Reg #, Write Data, and RegWrite, and outputs Data #1 and Data #2; register-number fields Instruction[11-8], Instruction[7-4], and Instruction[3-0]; ALU with control input ALUOp; Data Memory (clocked) with inputs Address, Write Data, MemRead, and MemWrite, and a 16-bit Data output; Instruction[15-12] to Control Unit.]

Part a (2 marks). The Address and Write Data inputs to the Data Memory are not connected to anything. What signals should be sent to these inputs? Explain why.

Part b (3 marks). The Write Data input to the Register File is not connected to anything. How should this signal be driven? Here is a hint: Introduce a new control signal, give it a name, and use it to control a multiplexer. Briefly give reasons to support your design.
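
As a study aid, the bit fields in the Exam16 format table can be pulled out of a 16-bit instruction word with shifts and masks. The following is only a minimal sketch in C, not part of the original examination; the type name exam16_fields and the function name exam16_decode are hypothetical, chosen for illustration.

```c
#include <stdint.h>

/* Hypothetical container for the fields of one Exam16 instruction word. */
typedef struct {
    unsigned opcode;  /* bits 15-12 */
    unsigned s;       /* bits 11-8: ssss */
    unsigned t;       /* bits 7-4: tttt (or aaaa for lw and sw) */
    unsigned d;       /* bits 3-0: dddd */
    int      offset;  /* bits 7-0 as a signed value (meaningful for brz only) */
} exam16_fields;

/* Split an instruction word into the fields named in the format table. */
exam16_fields exam16_decode(uint16_t instr)
{
    exam16_fields f;
    f.opcode = (instr >> 12) & 0xFu;
    f.s      = (instr >> 8) & 0xFu;
    f.t      = (instr >> 4) & 0xFu;
    f.d      = instr & 0xFu;
    /* Sign-extend the 8-bit 2's-complement offset. */
    f.offset = (int)(instr & 0xFFu);
    if (f.offset >= 128)
        f.offset -= 256;
    return f;
}
```

The sign-extension step mirrors what the Sign Extend block in the datapath does in hardware; how the extended offset is combined with the PC is left to the datapath itself.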

PROBLEM 4 (continued).

Part c (3 marks). The input to the PC register is not connected to anything. How should this signal be driven? A new control signal, a new multiplexer, and perhaps some other new, simple logic element will be needed. Briefly give reasons to support your design.

Part d (6 marks). Fill in the table of control signal values below. The last two columns are reserved for the new control signals you introduced in parts b and c; please write in the names of these signals. Use X in table cells to indicate that a particular control signal is a "don't care" for a particular instruction.

    Instruction   MemRead   MemWrite   RegWrite   ALUOp   ________   ________
    add
    sub
    slt
    brz
    lw
    sw

Part e (3 marks). Suppose you want to extend the Exam16 ISA to include an addc ("add constant") instruction with the following format:

    1000_ssss_cccc_dddd

ssss encodes the source register, dddd encodes the destination register, and cccc encodes a constant in the range from 0 to 15. Describe all the datapath changes (not control changes) that would be needed to add support for addc while continuing to support the original six Exam16 instructions.

PROBLEM 5 (3 marks). In the five-stage-pipelined implementation of the MIPS instruction subset, the 3rd stage is execution in the ALU, the 4th stage is memory access, and the 5th stage is writeback to the register file. Consider the following sequence of MIPS instructions:

    add  $t0, $t1, $t2
    lw   $t3, 12($t0)
    sub  $t4, $t4, $t3

Explain how forwarding can be used to start the lw one clock cycle later than the add, and why forwarding cannot be used to start the sub one clock cycle later than the lw.

PROBLEM 6 (8 marks). The multicycle implementation for the MIPS subset is shown in Figure 5.28 on page 6 of your Reference Material booklet. Support for the addi instruction can be added without changing the datapath; all that is needed is two new states in the finite state machine used for the main control unit. The new states would occur after state 1. Fill in the table below to show the values of control signals needed for the four steps of addi. Use X in table cells to indicate that a particular control signal is a "don't care" for a particular state.

    signal        state 0   state 1   1st new state   2nd new state
    IorD
    MemRead
    MemWrite
    IRWrite
    ALUOp
    ALUSrcA
    ALUSrcB
    RegDst
    MemtoReg
    RegWrite
    PCSource
    PCWrite
    PCWriteCond

Here are some reminders and hints:

    The format for addi is 001000_sssss_ddddd_cccc_cccc_cccc_cccc, where sssss selects the source, ddddd selects the destination, and bits 15-0 supply the constant to be used in the addition.
    In state 0, the instruction is fetched and the PC+4 computation is done in the ALU.
    In state 1, which like state 0 is common to all instructions, a branch target is computed.
    In the 2nd new state, the register file will be updated.
    If ALUOp is 00, the ALU will do an addition.

PROBLEM 7 (total of 8 marks).

Part a (4 marks). How would the number 0.078125 be represented as an IEEE 754 double-precision number? (Note that the IEEE 754 formats are summarized on page 2 of your Reference Material booklet.) Here are some hints: first, 0.078125 = (1/16 + 1/64), and second, 1023 is 0x3ff. [Correction notice: The original exam incorrectly said that 1023 is 0x7ff.] Show your work, and use base sixteen to represent your final answer.

Part b (4 marks). Suppose that $f2 contains 0x7f00_0000 and $f3 contains 0xc080_0000 before the instruction

    mul.s $f0, $f2, $f3

is run. What bit pattern will the instruction write to $f0? Show your work, and use base sixteen to represent your final answer. Here is a hint: The bit patterns in $f2 and $f3 have been chosen so that any arithmetic you might have to do to solve this problem will be relatively easy.
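
For checking hand-worked IEEE 754 answers after the fact, the sign, exponent, and fraction fields can be inspected in software. The following is only a minimal sketch in C, not part of the original examination. It assumes that float and double on the host machine are the 32-bit and 64-bit IEEE 754 formats, and all function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Copy the raw bit pattern of a double into a 64-bit integer.
   Assumes double is the 64-bit IEEE 754 format. */
uint64_t double_bits(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* Double-precision fields: 1 sign bit, 11 exponent bits, 52 fraction bits. */
unsigned sign_field(uint64_t bits)
{
    return (unsigned)(bits >> 63);
}

unsigned exponent_field(uint64_t bits)
{
    return (unsigned)((bits >> 52) & 0x7FFu);  /* biased by 1023 */
}

uint64_t fraction_field(uint64_t bits)
{
    return bits & ((UINT64_C(1) << 52) - 1);
}

/* Single-precision helpers, assuming float is the 32-bit IEEE 754 format. */
uint32_t float_bits(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

float float_from_bits(uint32_t bits)
{
    float x;
    memcpy(&x, &bits, sizeof x);
    return x;
}
```

As an illustration with a value that is not one of the exam's, 0.15625 = 1/8 + 1/32 = 1.25 × 2^-3, so exponent_field(double_bits(0.15625)) is 1023 - 3 = 1020 (0x3fc) and the fraction field has only its second-highest bit set.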

PROBLEM 8 (total of 10 marks). In parts a, b, and c, show your work so you can get partial credit if you make a mistake. Also, note that the Reference Material booklet has a table of powers of two.

Part a (3 marks). A cache for a computer system with 32-bit words and 32-bit addresses is direct-mapped, has one-word blocks, and has a capacity of 8192 words. Which bits of an address would be used for byte offset, which bits for index, and which bits for tag?

Part b (3 marks). A cache for a computer system with 64-bit words and 48-bit addresses is 4-way set-associative, has eight-word blocks, and has a capacity of 65,536 bytes. Which bits of an address would be used for byte offset, which bits for block offset, which bits for index, and which bits for tag?

Part c (2 marks). What would be the total number of bits of tag stored in the cache of part b? You may express your answer as a product of integers, something like 29 × 384, or 7 × 32 × 256. (Neither of those examples is a correct answer for this problem!)

Part d (2 marks). A current processor design used in laptop computers has a 32-kilobyte Level 1 (L1) I-cache, a 32-kilobyte L1 D-cache, and a 512-kilobyte unified Level 2 (L2) cache. The L1 caches contain information that is also in the L2 cache. Why would it be a bad idea to simplify the design by getting rid of the L2 cache and using the space for bigger L1 caches instead: 256 kilobytes for the L1 I-cache, and the same for the L1 D-cache?
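
The address-splitting questions in parts a and b follow a common pattern: each field width is a base-two logarithm of a cache parameter, and the tag takes whatever address bits remain. The following is only a minimal sketch of that arithmetic in C, not part of the original examination. The names cache_fields, log2_exact, and cache_address_fields are hypothetical, and the sketch assumes every parameter is a power of two.

```c
#include <assert.h>

/* Hypothetical record of how many address bits each field uses. */
typedef struct {
    unsigned byte_offset_bits;   /* selects a byte within a word */
    unsigned block_offset_bits;  /* selects a word within a block */
    unsigned index_bits;         /* selects a set */
    unsigned tag_bits;           /* remaining high-order bits */
} cache_fields;

/* Base-two logarithm of n; n is assumed to be a power of two. */
static unsigned log2_exact(unsigned long n)
{
    unsigned bits = 0;
    assert(n != 0 && (n & (n - 1)) == 0);
    while (n > 1) {
        n >>= 1;
        bits++;
    }
    return bits;
}

/* Compute field widths for a cache with the given geometry.
   ways is 1 for a direct-mapped cache. */
cache_fields cache_address_fields(unsigned address_bits,
                                  unsigned long bytes_per_word,
                                  unsigned long words_per_block,
                                  unsigned long ways,
                                  unsigned long capacity_bytes)
{
    cache_fields f;
    unsigned long block_bytes = bytes_per_word * words_per_block;
    unsigned long sets = capacity_bytes / (block_bytes * ways);

    f.byte_offset_bits  = log2_exact(bytes_per_word);
    f.block_offset_bits = log2_exact(words_per_block);
    f.index_bits        = log2_exact(sets);
    f.tag_bits = address_bits - f.byte_offset_bits
                 - f.block_offset_bits - f.index_bits;
    return f;
}
```

As an illustration with a geometry that is not one of the exam's, cache_address_fields(32, 4, 4, 2, 16384) describes a 2-way cache with 16-byte blocks and 512 sets, giving 2 byte-offset bits, 2 block-offset bits, 9 index bits, and 19 tag bits.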

PROBLEM 9 (total of 9 marks).

Part a (2 marks). Consider a computer that runs the MIPS instruction set. The computer supports virtual memory with a page size of 65,536 bytes. Suppose that a process has the instruction

    lw $s2, 16($a0)

located at virtual address 0x0041_fff8, and suppose that just before the instruction is fetched, $a0 = 0x1002_fffc. When the instruction is fetched, the instruction TLB contains the following translations:

    virtual page number   valid bit   physical page number
    0x0041                0           0x9900
    0x0040                1           0x9711
    0x0041                1           0x9822
    0x0042                1           0x9633

Will there be a hit or a miss in the instruction TLB? If there is a miss, explain why. If there is a hit, what will be the physical address used to fetch the instruction?

Part b (2 marks). Continuing from the instruction fetch in part a, just before the data memory access step, the data TLB contains the following translations:

    virtual page number   valid bit   physical page number
    0x7fff                1           0x9255
    0x1001                1           0x9166
    0x1002                0           0x9388
    0x1003                1           0x9477

Will there be a hit or a miss in the data TLB? If there is a miss, explain why. If there is a hit, what will be the physical address used for data memory access?

Part c (2 marks). Briefly describe what information would be in a page table, and what that information would be used for.

Part d (3 marks). Data TLBs are often designed so that each entry in a TLB will have a virtual page number, a physical page number, a valid bit, and another bit called a dirty bit (and maybe a few other special-purpose bits of memory). The dirty bit is set to 0 whenever the entry is updated by the operating system kernel, and is changed to 1 the first time the process using the associated page writes to that page. Suppose the kernel needs to copy a page of data from disk to physical memory. Describe a situation in which the kernel could save a significant amount of time by discovering that a dirty bit in the TLB is 0.
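
The TLB lookups in parts a and b can be modelled in software as a search for a valid entry whose virtual page number matches the high-order bits of the address. The following is only a minimal sketch in C, not part of the original examination. It assumes 32-bit addresses and the 65,536-byte page size stated in part a; the names tlb_entry and tlb_translate are hypothetical, and a real TLB compares all entries in parallel rather than in a loop.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_OFFSET_BITS 16  /* 65,536-byte pages: 2^16 bytes per page */

/* Hypothetical model of one TLB entry. */
typedef struct {
    uint32_t vpn;    /* virtual page number */
    int      valid;  /* nonzero if the translation may be used */
    uint32_t ppn;    /* physical page number */
} tlb_entry;

/* Look up vaddr in a TLB of n entries.
   On a hit, store the physical address in *paddr and return 1.
   On a miss (no valid entry with a matching VPN), return 0. */
int tlb_translate(const tlb_entry *tlb, size_t n,
                  uint32_t vaddr, uint32_t *paddr)
{
    uint32_t vpn    = vaddr >> PAGE_OFFSET_BITS;
    uint32_t offset = vaddr & ((UINT32_C(1) << PAGE_OFFSET_BITS) - 1);
    size_t i;

    for (i = 0; i < n; i++) {
        if (tlb[i].valid && tlb[i].vpn == vpn) {
            /* The page offset passes through translation unchanged. */
            *paddr = (tlb[i].ppn << PAGE_OFFSET_BITS) | offset;
            return 1;
        }
    }
    return 0;
}
```

An entry whose valid bit is 0 is skipped even when its virtual page number matches, which is why the valid bit has to be checked before the physical page number is used.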