The University of Alabama in Huntsville Electrical & Computer Engineering Department CPE Test II November 14, 2000

Similar documents
Final Exam Fall 2007

CS 230 Practice Final Exam & Actual Take-home Question. Part I: Assembly and Machine Languages (22 pts)

Computer Architecture CS372 Exam 3

CSEE W3827 Fundamentals of Computer Systems Homework Assignment 5 Solutions

CENG 3531 Computer Architecture Spring a. T / F A processor can have different CPIs for different programs.

CS232 Final Exam May 5, 2001

Updated Exercises by Diana Franklin

ECE Sample Final Examination

Pipelining. CSC Friday, November 6, 2015

Final Exam Fall 2008

c. What are the machine cycle times (in nanoseconds) of the non-pipelined and the pipelined implementations?

CS 251, Winter 2019, Assignment % of course mark

CS 251, Winter 2018, Assignment % of course mark

ECE 30, Lab #8 Spring 2014

Data Hazards Compiler Scheduling Pipeline scheduling or instruction scheduling: Compiler generates code to eliminate hazard

LECTURE 10. Pipelining: Advanced ILP

Static, multiple-issue (superscaler) pipelines

ECE 3056: Architecture, Concurrency, and Energy of Computation. Sample Problem Sets: Pipelining

CS2100 Computer Organisation Tutorial #10: Pipelining Answers to Selected Questions

ENGN 2910A Homework 03 (140 points) Due Date: Oct 3rd 2013

CS232 Final Exam May 5, 2001

Final Exam Spring 2017

3/12/2014. Single Cycle (Review) CSE 2021: Computer Organization. Single Cycle with Jump. Multi-Cycle Implementation. Why Multi-Cycle?

ENCM 369 Winter 2013: Reference Material for Midterm #2 page 1 of 5

cs470 - Computer Architecture 1 Spring 2002 Final Exam open books, open notes

CS/COE1541: Introduction to Computer Architecture

ECE331: Hardware Organization and Design

Computer Organization and Structure

Chapter 4. The Processor

CS161 Design and Architecture of Computer Systems. Cache $$$$$

The University of Michigan - Department of EECS EECS 370 Introduction to Computer Architecture Midterm Exam 2 solutions April 5, 2011

4. What is the average CPI of a 1.4 GHz machine that executes 12.5 million instructions in 12 seconds?

Advanced Parallel Architecture Lessons 5 and 6. Annalisa Massini /2017

ECE Exam II - Solutions October 30 th, :35 pm 5:55pm

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor

The Processor Pipeline. Chapter 4, Patterson and Hennessy, 4ed. Section 5.3, 5.4: J P Hayes.

MIPS Pipelining. Computer Organization Architectures for Embedded Computing. Wednesday 8 October 14

Chapter 4. Instruction Execution. Introduction. CPU Overview. Multiplexers. Chapter 4 The Processor 1. The Processor.


COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor

Chapter 4. The Processor

COMP2611: Computer Organization. The Pipelined Processor

ECEC 355: Pipelining

CS 61C Fall 2016 Guerrilla Section 4: MIPS CPU (Datapath & Control)

Assignment 1 solutions

Midnight Laundry. IC220 Set #19: Laundry, Co-dependency, and other Hazards of Modern (Architecture) Life. Return to Chapter 4

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 4. The Processor

Pipelined processors and Hazards

EE557--FALL 1999 MAKE-UP MIDTERM 1. Closed books, closed notes

Welcome to Part 3: Memory Systems and I/O

Lecture 8: Data Hazard and Resolution. James C. Hoe Department of ECE Carnegie Mellon University

Review: Abstract Implementation View

Chapter 4. The Processor

T = I x CPI x C. Both effective CPI and clock cycle C are heavily influenced by CPU design. CPI increased (3-5) bad Shorter cycle good

EN2910A: Advanced Computer Architecture Topic 02: Review of classical concepts

ECE 313 Computer Organization FINAL EXAM December 14, This exam is open book and open notes. You have 2 hours.

Chapter 4. The Processor

Question 1: (20 points) For this question, refer to the following pipeline architecture.

6.823 Computer System Architecture Datapath for DLX Problem Set #2

CSE Lecture 13/14 In Class Handout For all of these problems: HAS NOT CANNOT Add Add Add must wait until $5 written by previous add;

THE HONG KONG UNIVERSITY OF SCIENCE & TECHNOLOGY Computer Organization (COMP 2611) Spring Semester, 2014 Final Examination

EC 413 Computer Organization - Fall 2017 Problem Set 3 Problem Set 3 Solution

1 Hazards COMP2611 Fall 2015 Pipelined Processor

bits 5..0 the sub-function of opcode 0, 32 for the add instruction

CS 351 Exam 2 Mon. 11/2/2015

COMPUTER ORGANIZATION AND DESIGN

The Processor: Datapath and Control. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

ECE Exam II - Solutions November 8 th, 2017

EXAMINATIONS 2003 END-YEAR COMP 203. Computer Organisation

Winter 2002 FINAL EXAMINATION

ECE 4750 Computer Architecture, Fall 2017 T05 Integrating Processors and Memories

Outline. A pipelined datapath Pipelined control Data hazards and forwarding Data hazards and stalls Branch (control) hazards Exception

Perfect Student CS 343 Final Exam May 19, 2011 Student ID: 9999 Exam ID: 9636 Instructions Use pencil, if you have one. For multiple choice

ECE154A Introduction to Computer Architecture. Homework 4 solution

Pipelining and Caching. CS230 Tutorial 09

CS/CoE 1541 Mid Term Exam (Fall 2018).

What do we have so far? Multi-Cycle Datapath (Textbook Version)

Do-While Example. In C++ In assembly language. do { z--; while (a == b); z = b; loop: addi $s2, $s2, -1 beq $s0, $s1, loop or $s2, $s1, $zero

Comprehensive Exams COMPUTER ARCHITECTURE. Spring April 3, 2006

ELE 375 / COS 471 Final Exam Fall, 2001 Prof. Martonosi

Pipelining Analogy. Pipelined laundry: overlapping execution. Parallelism improves performance. Four loads: Non-stop: Speedup = 8/3.5 = 2.3.

These actions may use different parts of the CPU. Pipelining is when the parts run simultaneously on different instructions.

CS/CoE 1541 Exam 1 (Spring 2019).

Pipelining concepts The DLX architecture A simple DLX pipeline Pipeline Hazards and Solution to overcome

1 /10 2 /16 3 /18 4 /15 5 /20 6 /9 7 /12

Processor (I) - datapath & control. Hwansoo Han

Lecture Topics. Announcements. Today: Single-Cycle Processors (P&H ) Next: continued. Milestone #3 (due 2/9) Milestone #4 (due 2/23)

CS3350B Computer Architecture Quiz 3 March 15, 2018

6.004 Tutorial Problems L22 Branch Prediction

ECE232: Hardware Organization and Design

CS 61C Summer 2016 Guerrilla Section 4: MIPS CPU (Datapath & Control)

COMPUTER ORGANIZATION AND DESIGN

Pipeline Hazards. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Chapter 4. The Processor Designing the datapath

ECE 30 Introduction to Computer Engineering

Pipelining concepts The DLX architecture A simple DLX pipeline Pipeline Hazards and Solution to overcome

NATIONAL UNIVERSITY OF SINGAPORE

LECTURE 9. Pipeline Hazards

Chapter 4. The Processor. Computer Architecture and IC Design Lab

Cache Organizations for Multi-cores

Transcription:

The University of Alabama in Huntsville Electrical & Computer Engineering Department CPE 513 01 Test II November 14, 2000 Name: 1. (5 points) For an eight-stage pipeline, how many cycles does it take to execute n instructions if the pipeline is empty when these instructions begin to execute? 2. (1 point) A occurs when an instruction depends on the results of a previous instruction still in the pipeline. 3. (10 points) Here is a series of address references given as word addresses: 1, 4, 8, 5, 20, 17, 19, 56, 9, 11, 4, 43, 5, 6, 9, 17. Assuming a two-way set-associative cache with one-word blocks and a total size of 16 words that is initially empty, (a )label each reference in the list as a hit or a miss and (b) show the final contents of the cache. Assume LRU replacement.

4. (20 points) Consider three machines with different cache configurations: Cache 1: Direct-mapped with one-word blocks Cache 2: Direct-mapped with four-word blocks Cache 3: Two-way set associative with two-word blocks The following miss rate measurements have been made: Cache 1: Instruction miss rate is 8%, data miss rate is 4% Cache 2: Instruction miss rate is 2%, data miss rate is 5% Cache 3: Instruction miss rate is 4%, data miss rate is 2% For these machines, one-half of the instructions contain a data reference. Assume that the cache miss penalty is 6 + Block size in words. The CPI for this workload was measured on a machine with cache 1 and was found to be 2.0. The cycle time for machine 1 is 2.4 ns, the cycle time for machine 2 is 2.0 ns, and the cycle time for machine 3 is 1.6 ns. Determine which machine is the fastest and which is the slowest. 5. (1 point) Pipelining improves performance by increasing, as opposed to decreasing the execution time of an individual instruction. 6. (3 points).the three C s of cache memories are:., and.

7. (15 points) (a )Identify all of the data dependencies in the following code. (b) How is each data dependency either handled or not handled by forwarding? If forwarding occurs, state the source of the data. add $2, $5, $4 lw $2, 25($2) add $4, $2, $5 sw $5, 100($2) add $3, $2, $4 8. (10 points) Consider a virtual memory system with the following properties: 28-bit virtual byte address 1 Kbyte pages 16-bit physical address What is the total size of the page table for each process on this machine, assuming that the valid, protection, dirty, and use bits take a total of 4 bits and that all the virtual pages are in use? (Assume that disk addresses are not stored in the page table.)

9. (15 points) Consider executing the following code on the pipelined datapath of Chapter 6, which has forwarding, in which branches complete in the MEM stage, and in which there is no branch prediction: sort: addi $sp, $sp, -20 sw $ra, 16($sp) sw $s3, 12($sp) sw $s2, 8($sp) sw $s1, 4($sp) sw $s0, 0($sp) add $s2, $a0, $zero add $s3, $a1, $zero add $s0, $zero, $zero for1tst: slt $t0, $s0, $s3 beq $t0, $zero, exit1 addi $s1, $s0, -1 for2tst: slt $s1, $s0, $zero bne $t0, $zero, exit2 add $t1, $s1, $s1 add $t1, $t1, $t1 add $t2, $s2, $t1 lw $t3, 0($t2) lw $t4, 4($t2) slt $t0, $t4, $t3 beq $t0, $zero, exit2 add $a0, $s2, $zero add $a1, $s1, $zero jal swap addi $s1, $s1, -1 j for2tst exit2: addi $s0, $s0, 1 j for1tst exit1: lw $s0, 0($sp) lw $s1, 4($sp) lw $s2, 8($sp) lw $s3, 12($sp) lw $ra, 16($sp) addi $sp, $sp, 20 jr $ra If the addi instruction with the label sort begins executing in cycle 1 and the beq $t0, $zero, exit1 is taken, what are the values stored in the following fields of the ID/EX pipeline register in the 17 th cycle? Assume that before the instructions are executed, the state of the machine was as follows: The PC has the value 50010, the address of the addi instruction. Every register has the initial value 1010 plus the register number. Every memory word accessed as data has the initial value 100010 plus the byte address of the word. ID/EX.EX = ID/EX.PCInc = ID/EX.ReadRegister1 = ID/EX.ReadRegister2 = ID/EX.WriteRt = ID/EX.WriteRd =

10. (15 points) Consider a pipeline for a register-memory architecture. The architecture has two instruction formats: a register-register format and a register-memory format. There is a single memory addressing mode (offset + base register). There is a set of ALU operations with format: or ALUop Rdest, Rsrc1, Rsrc2 ALUop Rdest, Rsrc1, MEM where the ALUop is one of the following: Add, Subtract, And, Or, Load (Rsrc1 ignored), Store. Rdest, Rsrc1 and Rsrc2 are registers. MEM is a base register and offset pair. Branches use a full compare of two registers and are PC-relative. Assume that this machine is pipelined so that a new instruction is started every clock cycle. The following pipeline structure-similar to that used in the VAX 8700 micropipeline-is The first ALU stage is used for effective address calculation for memory references and branches. The second ALU state is used for operations and branch comparisons. RF is both a decode and register-fetch stage. Assume that when a register read and a register write of the same register occur in the same clock cycle, the write data is forwarded. Determine two types of data forwarding for any ALUs that will be needed. Assume that there are separate ALUs for the ALU1 and ALU2 pipe stages. Describe these types of forwarding. Be careful to consider forwarding across an intervening instruction, e. g., ADD R1, Any instruction Add, R1,

11. (5 points) Use the following code fragment: loop: lw $1, 0($2) addi $1, $1, 4 sw $1, 0($2) addi $2, $2, 4 bne $2, $3, loop Insert any necessary stalls, or reorder, if possible, to prevent stalling.