CENG 3531 Computer Architecture, Spring

Exam 2
April 12, 2012

You have 80 minutes to complete the exam. Please write your answers clearly and legibly on this exam paper.

GRADE:
Name:
Class ID:

1. (22 pts) Circle the selected answer for T/F and multiple-choice questions, and fill in the blanks for the rest. Each part is 2 points.

a. T / F  A processor can have different CPIs for different programs.

b. T / F  In a multi-cycle implementation, the first two stages, instruction fetch and instruction decode, are the same for all instruction classes.

c. T / F  Increasing the depth of pipelining always decreases performance.

d. T / F  Pipelining improves performance by increasing throughput.

e. T / F  To pass data from an early pipeline stage to a later pipeline stage, the data must be placed in a pipeline register so that it is not lost when the next instruction enters that pipeline stage.

f. T / F  Data forwarding resolves the data hazard that occurs when an instruction tries to read a register following a load instruction that writes the same register.

g. The ideal speedup of a pipelined system with four ideal stages is ______.

h. One solution to data hazards can be ______.

i. One solution to structural hazards can be ______.

j. Which one of the following processors has the highest possible MIPS rate in ideal conditions?
   a. A single-issue processor driven by a 1 GHz clock.
   b. A 2-issue processor driven by a 500 MHz clock.
   c. A 4-issue processor driven by a 250 MHz clock.
   d. An 8-issue VLIW processor driven by a 200 MHz clock.

k. Which one of the following is NOT calculated by the ALU?
   a. Arithmetic result for arithmetic instructions
   b. Memory address for load/store instructions
   c. Branch target address
   d. Address of the next instruction
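The comparison in question 1(j) comes down to one multiplication per option. The sketch below is only an illustration, assuming that "ideal conditions" means every issue slot completes one instruction per clock cycle with no stalls; the helper name `peak_mips` and the `options` table are made up for this example.

```python
# Hypothetical helper: peak MIPS rate, ASSUMING every issue slot completes
# one instruction per clock cycle (no stalls of any kind).
def peak_mips(issue_width, clock_mhz):
    # instructions/second = issue_width * clock_hz; dividing by 10^6 to get
    # MIPS cancels against the MHz unit, so the product is already in MIPS.
    return issue_width * clock_mhz

# (issue width, clock in MHz) for the four options of question 1(j)
options = {
    "a": (1, 1000),
    "b": (2, 500),
    "c": (4, 250),
    "d": (8, 200),
}

rates = {label: peak_mips(width, mhz) for label, (width, mhz) in options.items()}
best = max(rates, key=rates.get)

for label, rate in rates.items():
    print(f"option {label}: {rate} MIPS peak")
print("highest peak rate:", best)
```

Under this assumption the first three options tie, because halving the clock while doubling the issue width leaves the product unchanged.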

2. (18 pts) Answer the following questions, giving all necessary details.

a. (5 pts) Given the sequences of array references below, determine whether each sequence exhibits spatial or temporal locality.

   A[10], B[10], A[11], B[11], A[12], B[12], A[9], B[9]

   A[1], B[1], A[1000], B[1000], A[1], B[1], A[1000], B[1000]

b. (5 pts) List the memory components in the memory hierarchy. Order them from fastest to slowest and from smallest to largest.

   Memory Components    Order fastest to slowest    Order smallest to largest

c. (8 pts) Name the five pipeline stages of the MIPS architecture. Explain what part of the instruction execution is performed in each stage. Give enough detail and be specific.

3. (16 pts) The datapath for the 5-stage MIPS pipeline architecture is given below. List the resources that are used during the execution of each instruction below. Ignore the MUXes. When listing, use the numbers associated with the resources.

   1  Program Counter
   2  Adder in IF stage
   3  Instruction Memory
   4  Register File
   5  Sign-extension Unit
   6  Shift-left-2 Unit
   7  Adder in EX stage
   8  ALU
   9  Data Memory

   Instruction              Resources used
   beq s4, zero, loop

4. (10 pts) The latencies of individual stages in the five-stage MIPS architecture are given below.

   Stage      IF      ID      EX      MEM     WB
   Latency    200ps   300ps   250ps   400ps   100ps

a. What is the clock cycle time in a pipelined and non-pipelined processor?

   Pipelined version:
   Non-pipelined version:

b. What is the total latency of a lw instruction in a pipelined and non-pipelined processor?

   Pipelined version:
   Non-pipelined version:
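The arithmetic behind question 4 can be sketched as follows. This is a minimal illustration under two assumptions that the exam does not state: the non-pipelined processor is a single-cycle design whose clock covers all five stages, and pipeline registers add no delay. The function names are hypothetical.

```python
# Stage latencies from question 4, in picoseconds.
stage_latency_ps = {"IF": 200, "ID": 300, "EX": 250, "MEM": 400, "WB": 100}

def pipelined_cycle_ps(latencies):
    # The pipelined clock must be long enough for the slowest stage.
    return max(latencies.values())

def single_cycle_ps(latencies):
    # ASSUMPTION: the non-pipelined design is single-cycle, so one clock
    # period covers every stage back to back.
    return sum(latencies.values())

def pipelined_instruction_latency_ps(latencies):
    # An instruction such as lw passes through every stage, and each stage
    # takes one full (slowest-stage) clock cycle.
    return len(latencies) * pipelined_cycle_ps(latencies)

print("pipelined cycle:", pipelined_cycle_ps(stage_latency_ps), "ps")
print("non-pipelined cycle:", single_cycle_ps(stage_latency_ps), "ps")
print("lw latency, pipelined:", pipelined_instruction_latency_ps(stage_latency_ps), "ps")
print("lw latency, non-pipelined:", single_cycle_ps(stage_latency_ps), "ps")
```

Under these assumptions pipelining shortens the clock cycle but lengthens the latency of an individual lw, since every stage is stretched to the length of the slowest one.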

5. (10 pts) What is the accuracy of always-taken and always-not-taken branch predictors for the repeating (T, T, NT, T, T, T, NT, NT, T, T) pattern of branch outcomes?

   always-taken:
   always-not-taken:

6. (5 pts) Given the code fragment below, schedule the code to avoid stalls within a loop iteration. Make the necessary changes, if needed, in the code. Assume the classic five-stage MIPS architecture supports full forwarding.

   loop: lw   s1, 0(t1)
         add  s3, s1, s2
         sw   s3, 0(t1)
         subi t1, t1, 4
         bne  t1, t2, loop
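For question 5, the accuracy of a static predictor is the fraction of outcomes that match its fixed guess. The sketch below shows one way to count this for the ten-outcome pattern; the function name `static_predictor_accuracy` is made up for this example, and because the pattern repeats, one period is assumed to be representative of the whole run.

```python
def static_predictor_accuracy(outcomes, predict_taken):
    # outcomes: list of booleans, True meaning the branch was taken.
    # A static predictor makes the same guess every time, so it is correct
    # exactly when the actual outcome equals that guess.
    correct = sum(1 for taken in outcomes if taken == predict_taken)
    return correct / len(outcomes)

# One period of the repeating pattern from question 5.
pattern = [o == "T" for o in "T T NT T T T NT NT T T".split()]

always_taken = static_predictor_accuracy(pattern, predict_taken=True)
always_not_taken = static_predictor_accuracy(pattern, predict_taken=False)

print(f"always-taken:     {always_taken:.0%}")
print(f"always-not-taken: {always_not_taken:.0%}")
```

The two accuracies always sum to 100%, since every outcome is matched by exactly one of the two fixed guesses.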

7. (20 pts) Show the pipeline timing diagram for one iteration of the loop using the classic five-stage MIPS architecture. For all parts, assume that register read and register write can be done in the same clock cycle and that branches are resolved in the EX stage (i.e., the branch target address will be known at the end of the EX stage and the target instruction can be fetched in the next clock cycle). There is no branch prediction mechanism employed.

   loop: lw   s1, 0(s2)
         addi s2, s2, 4
         bne  s4, zero, loop

a. (9 pts) Show the pipeline timing diagram of this instruction sequence, assuming forwarding is not supported by the architecture.

   Clock Cycle          1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18
   lw   s1, 0(s2)       F  D  X  M  W
   addi s2, s2, 4
   bne  s4, zero, loop
   lw   s1, 0(s2)

   It takes ______ clock cycles to execute one iteration (from ID of first lw to ID of next lw).

b. (9 pts) Repeat part (a), assuming forwarding is fully supported by the architecture.

   Clock Cycle          1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18
   lw   s1, 0(s2)       F  D  X  M  W
   addi s2, s2, 4
   bne  s4, zero, loop
   lw   s1, 0(s2)

   It takes ______ clock cycles to execute one iteration.

c. (2 pts) What is the speedup obtained by forwarding?
