ECE 30, Lab #8 Spring 2014


Shown above is a multi-cycle CPU. There are six special registers in this datapath: PC, IR, MDR, A, B, and ALUOut. Of these, PC and IR are enabled to change when PCWr and IRWr, respectively, are high (logic 1). These control signals change shortly after the sampling (rising) edge of the clock, as the control FSM changes its state. Therefore, the outputs of these registers do not change until the next sampling edge of the clock after the enabling control signals are asserted.

Additionally, assume that:

- This multi-cycle CPU breaks instructions down into the following stages, each of which takes exactly one clock cycle:
    IF:    fetch instruction; compute PC+4
    ID/RF: decode instruction; fetch source operands; compute branch target
    EX:    perform arithmetic/logic operation or compute data memory address
    DM:    data memory access
    WB:    write result to register file
- PC+4 and the branch target (PC+4 + 4*offset) are computed by the ALU in the IF and ID/RF stages, respectively. Note that these are computed regardless of the instruction type.
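The two ALU computations performed in IF and ID/RF can be sketched in a few lines. This is a minimal illustration, assuming the standard MIPS conventions of 4-byte instructions and a 16-bit sign-extended branch offset; the function names are hypothetical and not part of the lab.

```python
MASK32 = 0xFFFFFFFF  # registers and addresses are 32 bits wide


def next_pc(pc: int) -> int:
    """IF stage: the ALU computes PC + 4 (assumes 4-byte instructions)."""
    return (pc + 4) & MASK32


def sign_extend16(value: int) -> int:
    """Interpret the low 16 bits of value as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(pc: int, offset16: int) -> int:
    """ID/RF stage: the ALU computes (PC + 4) + 4 * offset."""
    return (next_pc(pc) + 4 * sign_extend16(offset16)) & MASK32


if __name__ == "__main__":
    print(hex(branch_target(0x00400000, 3)))       # forward branch
    print(hex(branch_target(0x00400010, 0xFFFF)))  # offset of -1
```

Both values are computed for every instruction; the control unit only decides later whether the branch target is actually written into PC.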

1. Note that there are items/connections missing in the CPU diagram. In this problem you will fill in these five missing items. Each circle (A, B, C) highlights an area missing one or more connections/items. For each circle, draw the missing connection(s)/item(s) onto the CPU diagram. Then, do the following:

   - Briefly explain what you added to the diagram (if you add a connection, also specify the number of bits the connection represents; if this number is less than 2, then also specify which exact bits are connected from the source of the connection).
   - Give an instruction which uses the added portion of the CPU. Choose one of the following basic instructions: j, beq, add/sub, lw, sw.
   - Give the CPU stage which uses the added portion of the CPU for this instruction (IF, ID/RF, EX, DM, WB).

2. In this problem you will examine the control signals used in executing the sub and beq instructions. In addition to the information provided before, the following information will be useful for this problem:

   - ALUOp = 00 signals the ALU to add A and B.
   - ALUOp = 01 signals the ALU to subtract B from A.
   - ALUOp = 10 signals the ALU to do the operation specified by the R-format function code.

   Shown below is the state diagram for the sub and beq instructions. Complete the state table shown below for the sub and beq instructions. Fill out the table in binary. Use dashes (-) for unspecified or don't-care values. For don't cares, do NOT assign a random value; always clearly indicate that it is a don't care.

   State  Stage  IorD  MemRd  MemWr  IRWr  ALUSrcA  ALUSrcB  ALUOp  PCSrc  PCWr  PCWrCond  RegWr  RegDst  MemtoReg
   0      IF
   1      ID/RF
   2      EX
   3      WB
   4      EX

3. (a) Assume the instruction mix shown below for a multi-cycle implementation. Complete the table with (i) the number of clock cycles (#CC) and (ii) the phases (in the right order) it takes to execute each of the following instruction types. What is the average CPI of a machine that implements this instruction mix? If the clock period of an alternative single-cycle implementation is equal to the time it takes to execute the longest multi-cycle instruction, how much faster is the multi-cycle implementation?

   Instruction         Frequency  #CC  Phases (IF, ID/RF, EX, DM, WB)
   Arithmetic/Logical  0%
   lw                  20%
   sw                  0%
   beq                 %
   j                   %

   (b) How many clock cycles would the jr instruction take? Explain why it is the same as / different from a j instruction.

4. Consider a processor with a direct-mapped write-through cache with 8 one-word blocks. Assume that the memory address is 32 bits wide and the memory is byte-addressable.

   (a) Show the layout of the cache, including the data, valid and tag bits, and any logic required to determine hit/miss and select the appropriate data item when reading from the cache. Also, indicate which bits in the 32-bit memory address are used as block offset (if applicable), byte offset, tag, and index, and show where each of these groups of bits is used in the cache architecture. Make sure to label the width of all fields and signals.

   (b) What is the total amount of memory (in bytes) required to build this cache (including both data and other necessary bits)? Show your calculations.

   (c) What is the block offset (if applicable), byte offset, tag, and index for word address? Give your answer in decimal notation.
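The address split for a direct-mapped cache can be sketched as follows. The defaults model the 8 one-word-block cache described above, assuming 32-bit words, power-of-two sizes, and one valid bit per block; the helper names and the example address are hypothetical.

```python
def split_address(byte_addr: int, num_blocks: int = 8, words_per_block: int = 1):
    """Split a byte address into (tag, index, block_offset, byte_offset)."""
    byte_offset = byte_addr & 0x3            # 4 bytes per 32-bit word
    word_addr = byte_addr >> 2
    block_offset = word_addr % words_per_block
    block_addr = word_addr // words_per_block
    index = block_addr % num_blocks
    tag = block_addr // num_blocks
    return tag, index, block_offset, byte_offset


def cache_storage_bits(num_blocks: int = 8, words_per_block: int = 1,
                       addr_bits: int = 32) -> int:
    """Total bits: per block, 1 valid bit + tag bits + data bits.

    Assumes num_blocks and words_per_block are powers of two.
    """
    offset_bits = 2 + (words_per_block.bit_length() - 1)  # byte + block offset
    index_bits = num_blocks.bit_length() - 1
    tag_bits = addr_bits - index_bits - offset_bits
    return num_blocks * (1 + tag_bits + 32 * words_per_block)


if __name__ == "__main__":
    # Hypothetical example: word address 22 is byte address 88.
    print(split_address(22 * 4))        # (2, 6, 0, 0)
    print(cache_storage_bits() // 8)    # 60 bytes
```

With 8 one-word blocks the fields are 2 bits of byte offset, 3 bits of index, and 27 bits of tag, so each block stores 1 + 27 + 32 = 60 bits.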

   (d) The following series of address references is given as word addresses: ,, 8,, 20,,,,,,, 3,,,,. Assume the cache is initially empty. Complete the table below. Label each reference as a hit or a miss and show the final contents of the cache in the diagram you drew under (a).

   Address  Index  Tag  Hit/Miss
   8
   20
   3

   (e) The following miss rate measurements have been made. Instruction miss rate is %; data miss rate is 8%. Assume that one-half of the instructions contain a data reference and that the cache miss penalty, in number of clock cycles, is ( + (block size in words)). Calculate the average miss penalty per instruction.

5. Consider a processor with a direct-mapped write-through cache with four-word blocks and a total size of words. Assume that the memory address is 32 bits wide and the memory is byte-addressable.

   (a) Show the layout of the cache, including the data, valid and tag bits, and any logic required to determine hit/miss and select the appropriate data item when reading from the cache. Also, indicate which bits in the 32-bit memory address are used as block offset (if applicable), byte offset, tag, and index, and show where each of these groups of bits is used in the cache architecture. Make sure to label the width of all fields and signals.

   (b) What is the total amount of memory (in bytes) required to build this cache (including both data and other necessary bits)? Show your calculations.

   (c) What is the block offset (if applicable), byte offset, tag, and index for word address? Give your answer in decimal notation.

   (d) The following series of address references is given as word addresses: ,, 8,, 20,,,,,,, 3,,,,. Assume the cache is initially empty. Complete the table below. Label each reference as a hit or a miss and show the final contents of the cache in the diagram you drew under (a).
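Labeling references as hits or misses in a direct-mapped cache can be checked with a small simulation. This is a sketch under stated assumptions: the trace and the cache sizes used in the example are hypothetical, and the function name is not from the lab.

```python
def simulate_direct_mapped(word_addrs, num_blocks, words_per_block=1):
    """Return a list of (address, index, tag, 'hit' | 'miss') tuples.

    The cache starts empty; only tags are tracked, since data values do not
    affect hit/miss behaviour.
    """
    tags = [None] * num_blocks          # None marks an invalid (empty) block
    results = []
    for addr in word_addrs:
        block_addr = addr // words_per_block
        index = block_addr % num_blocks
        tag = block_addr // num_blocks
        hit = tags[index] == tag
        if not hit:
            tags[index] = tag           # a miss replaces the block at this index
        results.append((addr, index, tag, "hit" if hit else "miss"))
    return results


if __name__ == "__main__":
    # Hypothetical trace on 8 one-word blocks: 1 and 9 conflict at index 1.
    for row in simulate_direct_mapped([1, 9, 1, 2, 2], num_blocks=8):
        print(row)
```

Larger blocks turn some of these misses into hits when nearby words are referenced, at the cost of fewer distinct indexes for a fixed total size.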

   Address  Block offset  Index  Tag  Hit/Miss
   8
   20
   3

   (e) The following miss rate measurements have been made. Instruction miss rate is %; data miss rate is %. Assume that one-half of the instructions contain a data reference and that the cache miss penalty, in number of clock cycles, is ( + (block size in words)). Calculate the average miss penalty per instruction.

6. Consider a processor with a two-way set-associative write-back cache with four-word blocks and a total size of words. Assume that the memory address is 32 bits wide and the memory is byte-addressable.

   (a) Show the layout of the cache, including the data, all special bits, and any logic required to determine hit/miss and select the appropriate data item when reading from the cache. Also, indicate which bits in the 32-bit memory address are used as block offset (if applicable), byte offset, tag, and index, and show where each of these groups of bits is used in the cache architecture. Make sure to label the width of all fields and signals.

   (b) What is the total amount of memory (in bytes) required to build this cache (including both data and other necessary bits)? Show your calculations.

   (c) What is the block offset (if applicable), byte offset, tag, and index for word address? Give your answer in decimal notation.

   (d) The following series of address references is given as word addresses: ,, 8,, 20,,,,,,, 3,,,,. Assume the cache is initially empty and LRU is used as the replacement policy. Complete the table below. Label each reference as a hit or a miss and show the final contents of the cache in the diagram you drew under (a).
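A two-way set-associative cache with LRU replacement can be simulated along the same lines. This is a minimal sketch: the number of sets and the trace in the example are hypothetical, the function name is illustrative, and write-back state (dirty bits) and the specific way holding each block are not modelled because they do not change hit/miss outcomes.

```python
def simulate_two_way_lru(word_addrs, num_sets, words_per_block=4):
    """Return a list of (address, set_index, tag, 'hit' | 'miss') tuples.

    Each set is a list of at most two tags ordered from least to most
    recently used.
    """
    sets = [[] for _ in range(num_sets)]
    results = []
    for addr in word_addrs:
        block_addr = addr // words_per_block
        index = block_addr % num_sets
        tag = block_addr // num_sets
        ways = sets[index]
        hit = tag in ways
        if hit:
            ways.remove(tag)            # re-appended below as most recent
        elif len(ways) == 2:
            ways.pop(0)                 # evict the least recently used tag
        ways.append(tag)
        results.append((addr, index, tag, "hit" if hit else "miss"))
    return results


if __name__ == "__main__":
    # Hypothetical trace on 2 sets of four-word blocks; all map to set 0.
    for row in simulate_two_way_lru([0, 8, 0, 16, 0], num_sets=2):
        print(row)
```

In this trace the second reference to address 0 refreshes its LRU position, so the later miss on address 16 evicts the block holding address 8 instead.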

   Address  Block offset  Set index  Tag  Set entry  Hit/Miss
   8
   20
   3

   (e) The following miss rate measurements have been made. Instruction miss rate is 3%; data miss rate is %. Assume that one-half of the instructions contain a data reference and that the cache miss penalty, in number of clock cycles, is ( + (block size in words)). Calculate the average miss penalty per instruction.
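One way to read "average miss penalty per instruction" is as the expected number of miss cycles each instruction incurs: every instruction is fetched once, and a fraction of instructions also make a data reference. The sketch below uses that interpretation; the rates and the constant term of the penalty in the example are hypothetical placeholders, not the lab's values.

```python
def miss_cycles_per_instruction(instr_miss_rate: float,
                                data_miss_rate: float,
                                data_refs_per_instr: float,
                                base_penalty: int,
                                block_words: int) -> float:
    """Expected miss cycles per instruction.

    penalty = base_penalty + block_words            (cycles per miss)
    result  = (instruction misses + data misses per instruction) * penalty
    """
    penalty = base_penalty + block_words
    misses_per_instr = instr_miss_rate + data_refs_per_instr * data_miss_rate
    return misses_per_instr * penalty


if __name__ == "__main__":
    # Hypothetical numbers: 2% instruction misses, 4% data misses,
    # half of the instructions reference data, penalty = 6 + 4 cycles.
    print(miss_cycles_per_instruction(0.02, 0.04, 0.5, 6, 4))
```

Because the penalty grows with block size while larger blocks usually lower the miss rates, the three cache configurations can be compared by evaluating this expression once per configuration.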