Prerequisite Quiz January 23, 2007 CS252 Computer Architecture and Engineering


University of California, Berkeley
College of Engineering
Computer Science Division, EECS
Spring 2007, John Kubiatowicz

Prerequisite Quiz
January 23, 2007
CS252 Computer Architecture and Engineering

This prerequisite quiz will be used in determining class admissions. Good Luck!

Your Name:
SID Number:
Discussion Section:

1    2    3    Total

[ This page left for π ]
3.141592653589793238462643383279502884197169399375105820974944

Problem 1: Memory Hierarchy

Problem 1a: Below is a series of memory read references sent to a cache. The cache holds 128 bytes total. It has 2-word blocks (i.e. 64 bits), is 2-way set associative, and uses a least-recently-used replacement policy. Assume that the cache is initially empty. Classify each memory reference as a hit or a miss. Identify each cache miss as either compulsory, conflict, or capacity. One example is shown below. Feel free to use space in the margin as scratch.

    Address   Hit/Miss?   Miss Type?
    0x7       Miss        Compulsory
    0x4D
    0x2A
    0x79
    0xAB
    0xCE
    0x2E
    0x4B
    0x6D
    0x8A
    0xAF
    0x29
    0xC8
    0xCE
    0x6A

Problem 1b: Calculate the Miss Rate and the Hit Rate:
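The classification asked for in Problems 1a and 1b can be checked mechanically. The following is a minimal sketch, not an official solution. It assumes byte addresses and 4-byte words (so 8-byte blocks and 8 sets), and it uses the common "3 Cs" convention: a miss is compulsory if the block was never referenced before, capacity if a fully associative LRU cache of the same size would also miss, and conflict otherwise. The names `classify` and `TRACE` are hypothetical.

```python
from collections import OrderedDict

# Reference stream from Problem 1a (byte addresses).
TRACE = [0x7, 0x4D, 0x2A, 0x79, 0xAB, 0xCE, 0x2E, 0x4B,
         0x6D, 0x8A, 0xAF, 0x29, 0xC8, 0xCE, 0x6A]

def classify(addresses, cache_bytes=128, block_bytes=8, ways=2):
    """Return (address, hit, miss_type) for each reference.

    Assumption: 4-byte words, so a 2-word block is 8 bytes.
    """
    num_blocks = cache_bytes // block_bytes
    num_sets = num_blocks // ways
    # One LRU-ordered dict per set; the oldest entry is first.
    sets = [OrderedDict() for _ in range(num_sets)]
    # Shadow fully associative LRU cache of the same capacity,
    # used only to separate capacity misses from conflict misses.
    full = OrderedDict()
    seen = set()
    results = []
    for addr in addresses:
        block = addr // block_bytes
        lru = sets[block % num_sets]
        hit = block in lru
        if hit:
            lru.move_to_end(block)
        else:
            if len(lru) == ways:
                lru.popitem(last=False)  # evict least recently used
            lru[block] = True
        full_hit = block in full
        if full_hit:
            full.move_to_end(block)
        else:
            if len(full) == num_blocks:
                full.popitem(last=False)
            full[block] = True
        if hit:
            kind = None
        elif block not in seen:
            kind = "compulsory"
        elif full_hit:
            kind = "conflict"
        else:
            kind = "capacity"
        seen.add(block)
        results.append((addr, hit, kind))
    return results

results = classify(TRACE)
misses = sum(1 for _, hit, _ in results if not hit)
miss_rate = misses / len(results)
hit_rate = 1 - miss_rate

for addr, hit, kind in results:
    print(f"{addr:#04x}  {'Hit' if hit else 'Miss':4}  {kind or ''}")
print(f"miss rate = {miss_rate:.2f}, hit rate = {hit_rate:.2f}")
```

Changing `block_bytes` or `ways` shows how sensitive the hit/miss pattern is to the word-size assumption.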

Problem 1c: Suppose you have a 32-bit processor, with a virtual-memory page size of 16K. The data cache is 32K in size with 32-byte cache blocks. Finally, your TLB has 4 entries. Assume that you wish to do TLB lookups in parallel with cache lookups. Draw a block diagram of the data cache and TLB organization, showing a virtual address as input and both a physical address and data as output. Include cache hit and TLB hit output signals. Include as much information about the internals of the TLB and cache organization as possible. Include, among other things, all of the comparators in the system and any muxes as well. You can indicate RAM with a simple block, but make sure to label address widths and data widths. Make sure to use abstraction in your diagram so that we can understand it. Label the function of various blocks and the width of any buses.
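Before drawing the diagram for Problem 1c, it helps to work out the bit-field widths. The sketch below is one way to do that arithmetic, under assumptions the quiz does not state: physical addresses are 32 bits, the TLB is fully associative, and the cache index must fit entirely inside the page offset so that indexing can start before translation finishes. The function name and dictionary keys are hypothetical.

```python
def log2_exact(n):
    """Return log2(n) for a power of two; raise otherwise."""
    if n <= 0 or n & (n - 1):
        raise ValueError(f"{n} is not a power of two")
    return n.bit_length() - 1

def parallel_lookup_fields(va_bits=32, pa_bits=32, page_bytes=16 * 1024,
                           cache_bytes=32 * 1024, block_bytes=32,
                           tlb_entries=4):
    """Field widths for a cache indexed in parallel with the TLB.

    Assumptions: 32-bit physical addresses, fully associative TLB.
    """
    page_offset_bits = log2_exact(page_bytes)
    block_offset_bits = log2_exact(block_bytes)
    # Index bits must come from the untranslated page offset.
    max_index_bits = page_offset_bits - block_offset_bits
    max_way_bytes = (1 << max_index_bits) * block_bytes
    # Smallest associativity that keeps each way within one page
    # (ceiling division).
    min_ways = max(1, -(-cache_bytes // max_way_bytes))
    index_bits = log2_exact(cache_bytes // (min_ways * block_bytes))
    return {
        "vpn_bits": va_bits - page_offset_bits,
        "page_offset_bits": page_offset_bits,
        "block_offset_bits": block_offset_bits,
        "index_bits": index_bits,
        "min_ways": min_ways,
        "cache_tag_bits": pa_bits - index_bits - block_offset_bits,
        "tlb_comparators": tlb_entries,   # one per fully associative entry
        "cache_comparators": min_ways,    # one tag compare per way
    }

fields = parallel_lookup_fields()
for name, value in fields.items():
    print(f"{name}: {value}")
```

Under these assumptions the cache needs at least two ways, because a single 32K way would need index bits that lie above the 14-bit page offset.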

Problem 2: Pipelining

[Figure 1: A simple 5-stage pipeline. Labeled blocks: Next PC, Next SEQ PC, adders, instruction memory (Address), IF/ID, register file (RS1, RS2, RD), sign extend (Imm), Zero?, ID/EX, ALU, EX/MEM, data memory, MEM/WB, muxes, WB Data.]

Problem 2a: Is it possible to have zero branch delay-slots without guessing? Explain carefully.

Problem 2b: What is a load delay-slot? Why does it exist in the above pipeline?

Problem 2c: What sort of logic would be involved in stalling so that the above pipeline will have correct behavior for instruction sequences like:

    lw   r2, 32(r19)   ; load value
    addi r3, r2, #3    ; add 3 to it

Be general (get all sequences involving loads). Feel free to use pseudo-code to represent logic; data in registers can be named using the name of the register (i.e. rs1 value in IF/ID = r0):
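The kind of pseudo-code Problem 2c invites can be sketched as an interlock check between the instruction in IF/ID and the one in ID/EX. This is an illustration rather than the expected answer. It assumes that ALU results are otherwise forwarded, so only a load followed immediately by a consumer needs a bubble, and that r0 is hardwired to zero as in DLX. The class and field names (`IfId`, `IdEx`, `uses_rs1`, and so on) are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class IfId:
    """Decoded fields of the instruction sitting in IF/ID (hypothetical)."""
    rs1: int
    rs2: int
    uses_rs1: bool = True   # most instructions read rs1
    uses_rs2: bool = False  # e.g. R-type ALU ops and stores read rs2

@dataclass
class IdEx:
    """Fields of the instruction sitting in ID/EX (hypothetical)."""
    is_load: bool
    rd: int

def must_stall(if_id: IfId, id_ex: IdEx) -> bool:
    """True if the instruction in IF/ID must wait one cycle for a load."""
    if not id_ex.is_load or id_ex.rd == 0:
        return False  # not a load, or the load targets r0 (assumed hardwired)
    reads_rs1 = if_id.uses_rs1 and if_id.rs1 == id_ex.rd
    reads_rs2 = if_id.uses_rs2 and if_id.rs2 == id_ex.rd
    return reads_rs1 or reads_rs2

# lw r2, 32(r19) in ID/EX, addi r3, r2, #3 in IF/ID: load-use hazard.
hazard = must_stall(IfId(rs1=2, rs2=0), IdEx(is_load=True, rd=2))
# An add in ID/EX can forward its result, so no stall is required.
no_hazard = must_stall(IfId(rs1=2, rs2=0), IdEx(is_load=False, rd=2))
print(hazard, no_hazard)
```

When `must_stall` is true, the usual response is to hold the PC and IF/ID registers and inject a no-op into ID/EX for one cycle.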

The code sequence below is written in DLX assembly. Assume that it will execute on a pipeline similar to that in Figure 1, which includes a multiplier in the ALU. In addition, assume that there is one branch delay slot and that load instructions stall if necessary to get correct execution.

     0         addi r5, r0, #50    ; reset sum to max
     4         lw   r1, 16(r19)    ; load base address from stack
     8         lw   r2, 32(r19)    ; load number of iterations
    12  loop:  lw   r3, 0(r1)      ; load data x
    16         lw   r4, 4(r1)      ; load data y
    20         mulu r7, r3, r4     ; multiply them together
    24         subu r5, r5, r7     ; decrement result
    28         addi r1, r1, 8      ; increment base by 2 words
    32         addi r2, r2, -1     ; decrement count
    36         bnez r2, loop       ; branch to next iteration
    40         noop                ; do nothing
    44  exit:  hcf                 ; Halt and catch fire (exit)

Problem 2d: Assuming that multiplications take 2 cycles to compute, but we otherwise utilize the pipeline shown in Figure 1, how many cycles will the above code take for each iteration of the loop? Explain.

Problem 2e: Rearrange the instructions above to shorten the number of cycles/iteration:

Problem 2f: Draw the forwarding logic and control logic required to handle the following sequence without stalling in the pipeline of Figure 1:

    lw r1, 16(r2)
    sw r1, 32(r2)

Problem 2g: Would the existence of the logic from 2f change your answer for 2c?


Problem 3: State Machine Control

In this problem, you must design a six-state finite state machine (FSM) that implements a Gray Code Counter with exceptions. There are 4 count states (0-3) and 2 exception states (Overflow, Underflow). The counter only counts when the COUNT signal is asserted.

When counting up (the UP signal is asserted), the counter will count as follows:

    Underflow, 0, 1, 3, 2, Overflow, Overflow, Overflow

When counting down (the UP signal is not asserted), the counter will count as follows:

    Overflow, 2, 3, 1, 0, Underflow, Underflow, Underflow

The RESET signal will take the counter to the 0 state.

Problem 3a: Complete the following State Transition Diagram for the Gray-Code counter:

[Diagram: six state bubbles labeled Uf, 0, 1, 2, 3, Of, with transitions left to be filled in.]
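The behavior described above can be captured as a small next-state function, which is a convenient reference when filling in the diagram and the transition table. This is a behavioral sketch only, not a gate-level answer. It assumes RESET takes priority over COUNT, and the names `ORDER` and `next_state` are hypothetical.

```python
# States in counting-up order: Gray code 0, 1, 3, 2 between the
# two saturating exception states.
ORDER = ["Uf", "0", "1", "3", "2", "Of"]

def next_state(state, count, up, reset=False):
    """Return the next state of the Gray-code counter.

    Assumption: RESET overrides COUNT and UP.
    """
    if reset:
        return "0"
    if not count:
        return state  # hold when COUNT is not asserted
    i = ORDER.index(state)
    if up:
        i = min(i + 1, len(ORDER) - 1)  # saturate at Overflow
    else:
        i = max(i - 1, 0)               # saturate at Underflow
    return ORDER[i]

def walk(start, up, steps):
    """List the states visited after each of `steps` counting cycles."""
    visited, state = [], start
    for _ in range(steps):
        state = next_state(state, count=True, up=up)
        visited.append(state)
    return visited

up_walk = walk("Uf", up=True, steps=7)
down_walk = walk("Of", up=False, steps=7)
print(up_walk)
print(down_walk)
```

Enumerating `next_state` over all six states and both values of COUNT and UP yields the rows of a state transition table for whatever bit encoding is chosen.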

Problem 3b: Construct a State Transition Table for this FSM. Encode the state as 3 bits: S_r S_1 S_0. Here, S_r indicates out of range: S_r = 1 means either Uf or Of, depending on the values of the other bits. The other two bits (assuming that S_r = 0) indicate the value, where S_1 is the MSB (i.e. S_r S_1 S_0 = 010 for state 2). Ignore RESET. Feel free to encode the out-of-range states in any way that you like (hint: the more don't-care states, the easier it is to encode):

Problem 3c: Derive Next-State Logic Equations, given your state transition table. Include the RESET signal in your equations. You will have 3 equations. Simplify these as much as possible (i.e. combine together terms as much as possible). Show your work. Hint: Separate the COUNT=0 and COUNT=1 cases, derive separately, then combine.

    S_r =
    S_1 =
    S_0 =

Problem 3d: Draw logic for the S_0 state bit, including (1) all inputs, (2) the flip-flop that stores the state, (3) the clock input.
