CS/CoE 1541 Mid Term Exam (Fall 2018).
|
|
- MargaretMargaret Hunter
- 5 years ago
- Views:
Transcription
1 CS/CoE 1541 Mid Term Exam (Fall 2018). Name: Question 1: ( =20 points) For this question, refer to the following pipeline architecture. a) Consider the execution of the following code (5 instructions) on the architecture: L: lw r3, 8(r10) add r4,r5, r6 beq r9, r10, LL sw r7, 100(r8) addi r1, r11, 8 Assuming that at the end of some cycle, C, the lw instruction is in the MEM/WB buffer, specify the values of the following control signals at the end of cycle C (use if you do not care about the value being 0 or 1). ID/E.RegWrite: 0 E/MEM.RegWrite: 1 MEM/WB.RegWrite: 1 ID/E.MemtoReg: x E/MEM.MemtoReg: 1 MEM/WB.MemtoReg: 0 ID/E.MemWrite: 0 E/MEM.MemWrite: 0 ID/E.MemRead: x E/MEM.MemRead: x ID/E.Branch: 1 ID/E.RegDst: x ID/E.ALUSrc: 0 1
2 b) What will be stored in the PC and the IF/ID buffer at the end of cycle C? - The PC will store: L+16 - The IF/ID buffer will store: A will store L+16 and B will store the 32 bits instruction sw r7, 100(r8) c) What is the reason for having the three shaded components in the E stage: - The shaded shift left 2 unit: to multiply the branch offset by 4 - The shaded ALU unit: To compute the target address - The shaded AND gate: To force the target address into PC only when a branch is in E d) Now assume that the branch condition of the beq instruction, which is in the ID/E buffer at the end of Cycle C evaluates to true, what actions (in terms of changing the values stored in the inter-stage buffers) will be taken by the architecture during cycle C+1 as a result of the branch being taken? 1) PC = the target address 2) IF/ID = 0 (so that to force opcode = 0, the opcode for a no-op) 3) ID/E.MemWrite = ID/E.RegWrite = ID/ED.Branch = 0 e) If 25% of the instructions executing on this architecture are branch instructions, 30% are lw/sw instructions and 45% are ALU instructions, what would be the CPI for the architecture assuming that 40% of the branches are taken (consider only control hazards and ignore the effect of other hazards)? CPI = * 0.4 * 2 = 1.2 2
3 Question 2 (5+4+4=13 points): Assume that the 5-stage pipelined architecture has only one memory unit which is used for fetching instructions (in the IF stage) and fetching/storing data (in the MEM stage). Hence, if at a given cycle, the IF stage wants to read an instruction and the MEM stage wants to read or write data, then one of the two stages has to stall and use the memory in a later cycle. Assume that when the IF and MEM stages compete for the memory in a given cycle, priority is given to the MEM stage. a) Complete the following timing diagram to trace the execution of the first 7 instructions of the following code segment: I1: lw $1, 100($0) I2: and $3, $2, $4 I3: add $9, $9, $10 I4: sw $7, 50($6) I5: sub $3, $3, $1 I6: sub $2, $2, $2 I7: bneq $9, $6, I1 I8:.. I9: IF ID E MEM WB Cycle 1 I1: lw Cycle 2 I2: and I1: lw Cycle 3 I3: add I2: and I1: lw Cycle 4 I4: sw I3: add I2: and I1: lw sw -- add and lw I5: sub sw -- add and I6: sub I5: sub sw -- add bneq I6: sub I5: sub sw -- bneq -- I6: sub I5: sub sw bneq -- I6: sub I5: sub bneq -- I6: sub bneq -- bneq b) Assuming that the loop (I1-I7) will execute iterations and that the branch condition/target is resolved in the E stage, compute the CPI during the execution of the loop if the branch is always predicted not-taken (ignore the time to fill up the pipeline). During each iteration (7 instructions) there will be two bubbles due to structural hazards and two bubbles due to control hazards. Hence each iteration will take 11 cycles to complete. CPI = 11/7 c) What would the CPI be if a 1-bit branch predictor is used? A one bit branch predictor will be correct 9998 out of times can ignore structural hazard Hence, each iteration will take 9 cycles to complete CPI = 9/7 3
4 Question 3 (6+6=12 points): The following figure shows two forwarding paths from E/MEM and MEM/WB buffers to the E stage. A forwarding unit sets the mux control signals, A and B, to 1 or 2 whenever it detects data dependencies that may cause hazards. a) Using the notation given in the figure for the information stored in the ID/E, E/MEM and MEM/WB buffers, state the condition(s) that will cause the forwarding unit to set up control signal A to the value 1. (MEM/WB.Rd = ID/E.Rs) AND (MEM/WB.RegWrite=1) AND (MEM/WB.Rd!= 0) b) As clear from the figure, the branch condition is determined in the ID stage. A data hazard that we did not discuss in class occurs when the instruction in the E stage writes into a register, $r, and at the same cycle, a branch instruction uses register $r to determine the branch condition. For example add $r1, $r2, $r3 beq $r1, $r5, L causes a data hazard because the condition of the branch is evaluated using the old value of $r1 rather than the value computed by the add instruction. By drawing on the figure, show an additional forwarding path that can be used to avoid the above data hazard. 4
5 Question 4 (5+5+5=15 points): Consider the following loop which accumulates in $r3 the sum of the N elements of a vector. The value of N is initially stored in $r5 and the address of the first element of the vector is initially stored in $r2. I1: lw $r1, 0($r2) I2: add $r3, $r3, $r1 I3: addi $r2, $r2, -4 I4: addi $r5, $r5, -1 I5: bneq $r5, $0, I1 In this question, assume a superscalar architecture with two pipelines, one for ALU/branch instructions and the other for lw/sw instructions with hardware support for forwarding. a) Show, using the table below, the schedule for one iteration of the loop on the superscalar. You should rearrange the instructions to minimize the execution time (of course, without violating correctness). b) Assuming that N is a multiple of 3, show the code for the loop after unrolling it twice. lw $r1, 0($r2) lw $r6, -4($r2) lw $r7, -8($r2) add $r3, $r3, $r1 add $r3, $r3, $r6 add $r3, $r3, $r7 addi $r2, $r2, -12 addi $r5, $r5, -3 bneq $r5, $0, I1 c) Show, using the table below, the schedule for one iteration of the unrolled loop on the superscalar. You should rearrange the instructions to minimize the execution time Answer of part a Schedule for the original loop (one iteration) Answer of part c Schedule for the unrolled loop ALU/branch pipeline Load/store pipeline ALU/branch pipeline Load/store pipeline addi $r5, $r5, -1 lw $r1, 0($r2) addi $r5, $r5, -3 lw $r1, 0($r2) addi $r2, $r2, -4 addi $r2, $r2, -12 lw $r6, -4($r2) add $r3, $r3, $r1 add $r3, $r3, $r1 lw $r7, 4($r2) bneq $r5, $0, I1 add $r3, $r3, $r6 add $r3, $r3, $r7 bneq $r5, $0, I1 5
6 Question 5 (10 points): Indicate which of the following sentences is True and which is False. For the first 4 sentences, refer to the execution of the following sequence of instructions on a 5-stage CPU pipeline: I1: add $r1, $r3, $r2 I2: lw $r1, 100($r2) I3: lw $r3, 104($r1) true false 1 Assuming no forwarding or stalling hardware, data hazard will occur because I2 will not use the correct register data. 2 Assuming no forwarding or stalling hardware, data hazard will occur because I3 will not use the correct register data. 3 Assuming forwarding but not stalling hardware, data hazard will occur because I2 will not use the correct register data. 4 Assuming forwarding but not stalling hardware, data hazard will occur because I3 will not use the correct register data. 5 When an exception occurs in a 5-stage pipelined CPU, the address of the exception handler is stored in the EPC (Exception PC) register 6 In DRAM open row policy, opening a row occurs before every access to the row. 7 In DRAM closed row policy, closing a row occurs after every access to the row. 8 In the write allocate cache policy, a block is allocated in the cache on either a read or a write miss. 9 Increasing the cache s associativity reduces compulsory misses 10 If memory words are accessed sequentially (ex: 1,2,3,4,5,6, ) and the cache block size is 4 words, then the cache miss rate is 25% 6
7 Question 6 (4+5+6=15 points): In this question, use the notation [tag, M(address),...] to describe the content of each entry. For example [4,M(46)] indicates that the entry contains tag=4 and the data from memory location (word address) 46. Similarly, [4,M(46),M(47)] indicates that the entry contains a block of two words from word addresses 46 and 47. a) Show the content of each of the caches shown below after the two memory references (word addresses) 31, 36 (a) An 8-words, direct mapped cache with block size = 1 word (b) An 8-words, direct mapped cache with block size = 2 words Index Content of cache Block index Content of cache [4,M(36), M(37)] 3 3 [3,M(30),M(31)] 4 [4,M(36)] [3,M(31)] b) Show the content of the cache shown below after the three memory references (word addresses) 31, 36, 20 (c) A 32-words, 2-ways set associative cache with Block size = 2 words Block Index 0 Content of cache (Way 1) Content of cache (Way 2) 1 2 [2, M(36),M(37)] [1,M(20),M(21)] [1, M(30),M(31)] 7
8 Question 7 (5+5+5=15 points): Assume that the CPI of a particular CPU is 2.0 with an infinitely large cache and consider the addition of a 4-way, 1MByte L1 cache to the CPU. You have two possible options for the L1 cache: Option C1: uses a block size of 32 bytes and results in an effective miss rate of 4% Option C2: uses a block size of 64 bytes and results in an effective miss rate of 3%. Because the block size in C2 is larger than the block size in C1, the miss penalty for C2 is 100 cycles while the miss penalty of C1 is 80 cycles. (a) By computing and comparing the CPI for each of the cache options, determine which option results in a faster system. CPI with C1 (32 bytes blocks) = * 80 = 5.2 CPI with C2 (64 bytes blocks) = * 100 = 5 Hence, option C2 is the faster than option C1. (b) To compare the total number of bits (for tag and data, ignoring other bits) required to implement the L1 cache, complete the following sentences: For option C1, - the tag for each block in the cache consists of N-5-13 = N-18 bits (14 if N= 32) - the total number of blocks in the cache is _2 15 For option C2, - the tag for each block in the cache consists of N-6-12 = N-18 bits (14 if N= 32) - the total number of blocks in the cache is 2 14 Hence, option C1 uses more bits (total) for tags and data than option C2 (c) Assume that an L2 cache is added to option C2 to reduce the miss penalty for blocks that are not in L1 but are in L2 to 10 cycles. Assuming that the miss penalty of L2 is 100 cycles and that its hit rate is 30% (30% of the misses from L1 are found in L2), what is the CPI of the system with two levels of caches? CPI = * ( * 100) = 4.4 8
Question 1: (20 points) For this question, refer to the following pipeline architecture.
This is the Mid Term exam given in Fall 2018. Note that Question 2(a) was a homework problem this term (was not a homework problem in Fall 2018). Also, Questions 6, 7 and half of 5 are from Chapter 5,
More informationCS/CoE 1541 Exam 1 (Spring 2019).
CS/CoE 1541 Exam 1 (Spring 2019). Name: Question 1 (8+2+2+3=15 points): In this problem, consider the execution of the following code segment on a 5-stage pipeline with forwarding/stalling hardware and
More informationChapter 4. The Processor
Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified
More informationCOMPUTER ORGANIZATION AND DESIGN
COMPUTER ORGANIZATION AND DESIGN 5 Edition th The Hardware/Software Interface Chapter 4 The Processor 4.1 Introduction Introduction CPU performance factors Instruction count CPI and Cycle time Determined
More informationCOMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 4. The Processor
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle
More informationLECTURE 3: THE PROCESSOR
LECTURE 3: THE PROCESSOR Abridged version of Patterson & Hennessy (2013):Ch.4 Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU
More informationChapter 4. The Processor
Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified
More informationStatic, multiple-issue (superscaler) pipelines
Static, multiple-issue (superscaler) pipelines Start more than one instruction in the same cycle Instruction Register file EX + MEM + WB PC Instruction Register file EX + MEM + WB 79 A static two-issue
More informationOPEN BOOK, OPEN NOTES. NO COMPUTERS, OR SOLVING PROBLEMS DIRECTLY USING CALCULATORS.
CS/ECE472 Midterm #2 Fall 2008 NAME: Student ID#: OPEN BOOK, OPEN NOTES. NO COMPUTERS, OR SOLVING PROBLEMS DIRECTLY USING CALCULATORS. Your signature is your promise that you have not cheated and will
More informationFull Datapath. Chapter 4 The Processor 2
Pipelining Full Datapath Chapter 4 The Processor 2 Datapath With Control Chapter 4 The Processor 3 Performance Issues Longest delay determines clock period Critical path: load instruction Instruction memory
More informationDepartment of Computer and IT Engineering University of Kurdistan. Computer Architecture Pipelining. By: Dr. Alireza Abdollahpouri
Department of Computer and IT Engineering University of Kurdistan Computer Architecture Pipelining By: Dr. Alireza Abdollahpouri Pipelined MIPS processor Any instruction set can be implemented in many
More informationComputer Architecture CS372 Exam 3
Name: Computer Architecture CS372 Exam 3 This exam has 7 pages. Please make sure you have all of them. Write your name on this page and initials on every other page now. You may only use the green card
More informationCS 2410 Mid term (fall 2015) Indicate which of the following statements is true and which is false.
CS 2410 Mid term (fall 2015) Name: Question 1 (10 points) Indicate which of the following statements is true and which is false. (1) SMT architectures reduces the thread context switch time by saving in
More informationCOMPUTER ORGANIZATION AND DESIGN
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle
More informationFinal Exam Fall 2007
ICS 233 - Computer Architecture & Assembly Language Final Exam Fall 2007 Wednesday, January 23, 2007 7:30 am 10:00 am Computer Engineering Department College of Computer Sciences & Engineering King Fahd
More informationThe Processor (3) Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University
The Processor (3) Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu EEE3050: Theory on Computer Architectures, Spring 2017, Jinkyu Jeong (jinkyu@skku.edu)
More informationInstruction word R0 R1 R2 R3 R4 R5 R6 R8 R12 R31
4.16 Exercises 419 Exercise 4.11 In this exercise we examine in detail how an instruction is executed in a single-cycle datapath. Problems in this exercise refer to a clock cycle in which the processor
More informationCS 251, Winter 2019, Assignment % of course mark
CS 251, Winter 2019, Assignment 5.1.1 3% of course mark Due Wednesday, March 27th, 5:30PM Lates accepted until 1:00pm March 28th with a 15% penalty 1. (10 points) The code sequence below executes on a
More informationComputer Architecture V Fall Practice Exam Questions
Computer Architecture V22.0436 Fall 2002 Practice Exam Questions These are practice exam questions for the material covered since the mid-term exam. Please note that the final exam is cumulative. See the
More informationOutline. A pipelined datapath Pipelined control Data hazards and forwarding Data hazards and stalls Branch (control) hazards Exception
Outline A pipelined datapath Pipelined control Data hazards and forwarding Data hazards and stalls Branch (control) hazards Exception 1 4 Which stage is the branch decision made? Case 1: 0 M u x 1 Add
More informationECE Exam II - Solutions November 8 th, 2017
ECE 3056 Exam II - Solutions November 8 th, 2017 1. (15 pts) To the base pipeline we add data forwarding to EX, data hazard detection and stall generation, and branches implemented in MEM and predicted
More informationEE557--FALL 1999 MAKE-UP MIDTERM 1. Closed books, closed notes
NAME: STUDENT NUMBER: EE557--FALL 1999 MAKE-UP MIDTERM 1 Closed books, closed notes Q1: /1 Q2: /1 Q3: /1 Q4: /1 Q5: /15 Q6: /1 TOTAL: /65 Grade: /25 1 QUESTION 1(Performance evaluation) 1 points We are
More informationPipeline Hazards. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University
Pipeline Hazards Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Hazards What are hazards? Situations that prevent starting the next instruction
More informationCENG 3531 Computer Architecture Spring a. T / F A processor can have different CPIs for different programs.
Exam 2 April 12, 2012 You have 80 minutes to complete the exam. Please write your answers clearly and legibly on this exam paper. GRADE: Name. Class ID. 1. (22 pts) Circle the selected answer for T/F and
More informationPipelining. CSC Friday, November 6, 2015
Pipelining CSC 211.01 Friday, November 6, 2015 Performance Issues Longest delay determines clock period Critical path: load instruction Instruction memory register file ALU data memory register file Not
More informationThomas Polzer Institut für Technische Informatik
Thomas Polzer tpolzer@ecs.tuwien.ac.at Institut für Technische Informatik Pipelined laundry: overlapping execution Parallelism improves performance Four loads: Speedup = 8/3.5 = 2.3 Non-stop: Speedup =
More informationCOMPUTER ORGANIZATION AND DESI
COMPUTER ORGANIZATION AND DESIGN 5 Edition th The Hardware/Software Interface Chapter 4 The Processor 4.1 Introduction Introduction CPU performance factors Instruction count Determined by ISA and compiler
More informationCOMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle
More informationCOMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition The Processor - Introduction
More informationChapter 4. Instruction Execution. Introduction. CPU Overview. Multiplexers. Chapter 4 The Processor 1. The Processor.
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor The Processor - Introduction
More informationMultiple Instruction Issue. Superscalars
Multiple Instruction Issue Multiple instructions issued each cycle better performance increase instruction throughput decrease in CPI (below 1) greater hardware complexity, potentially longer wire lengths
More informationComputer Architecture Computer Science & Engineering. Chapter 4. The Processor BK TP.HCM
Computer Architecture Computer Science & Engineering Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware
More informationFinal Exam Fall 2008
COE 308 Computer Architecture Final Exam Fall 2008 page 1 of 8 Saturday, February 7, 2009 7:30 10:00 AM Computer Engineering Department College of Computer Sciences & Engineering King Fahd University of
More informationPlease state clearly any assumptions you make in solving the following problems.
Computer Architecture Homework 3 2012-2013 Please state clearly any assumptions you make in solving the following problems. 1 Processors Write a short report on at least five processors from at least three
More informationDetermined by ISA and compiler. We will examine two MIPS implementations. A simplified version A more realistic pipelined version
MIPS Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified
More informationCS 251, Winter 2018, Assignment % of course mark
CS 251, Winter 2018, Assignment 5.0.4 3% of course mark Due Wednesday, March 21st, 4:30PM Lates accepted until 10:00am March 22nd with a 15% penalty 1. (10 points) The code sequence below executes on a
More informationChapter 4. The Processor
Chapter 4 The Processor 1 Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A
More informationProcessor (II) - pipelining. Hwansoo Han
Processor (II) - pipelining Hwansoo Han Pipelining Analogy Pipelined laundry: overlapping execution Parallelism improves performance Four loads: Speedup = 8/3.5 =2.3 Non-stop: 2n/0.5n + 1.5 4 = number
More informationc. What are the machine cycle times (in nanoseconds) of the non-pipelined and the pipelined implementations?
Brown University School of Engineering ENGN 164 Design of Computing Systems Professor Sherief Reda Homework 07. 140 points. Due Date: Monday May 12th in B&H 349 1. [30 points] Consider the non-pipelined
More information3/12/2014. Single Cycle (Review) CSE 2021: Computer Organization. Single Cycle with Jump. Multi-Cycle Implementation. Why Multi-Cycle?
CSE 2021: Computer Organization Single Cycle (Review) Lecture-10b CPU Design : Pipelining-1 Overview, Datapath and control Shakil M. Khan 2 Single Cycle with Jump Multi-Cycle Implementation Instruction:
More informationChapter 4 The Processor 1. Chapter 4A. The Processor
Chapter 4 The Processor 1 Chapter 4A The Processor Chapter 4 The Processor 2 Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware
More informationECE331: Hardware Organization and Design
ECE331: Hardware Organization and Design Lecture 27: Midterm2 review Adapted from Computer Organization and Design, Patterson & Hennessy, UCB Midterm 2 Review Midterm will cover Section 1.6: Processor
More informationCMSC411 Fall 2013 Midterm 1
CMSC411 Fall 2013 Midterm 1 Name: Instructions You have 75 minutes to take this exam. There are 100 points in this exam, so spend about 45 seconds per point. You do not need to provide a number if you
More informationControl Hazards - branching causes problems since the pipeline can be filled with the wrong instructions.
Control Hazards - branching causes problems since the pipeline can be filled with the wrong instructions Stage Instruction Fetch Instruction Decode Execution / Effective addr Memory access Write-back Abbreviation
More informationFull Datapath. Chapter 4 The Processor 2
Pipelining Full Datapath Chapter 4 The Processor 2 Datapath With Control Chapter 4 The Processor 3 Performance Issues Longest delay determines clock period Critical path: load instruction Instruction memory
More informationBasic Instruction Timings. Pipelining 1. How long would it take to execute the following sequence of instructions?
Basic Instruction Timings Pipelining 1 Making some assumptions regarding the operation times for some of the basic hardware units in our datapath, we have the following timings: Instruction class Instruction
More informationPipelining Analogy. Pipelined laundry: overlapping execution. Parallelism improves performance. Four loads: Non-stop: Speedup = 8/3.5 = 2.3.
Pipelining Analogy Pipelined laundry: overlapping execution Parallelism improves performance Four loads: Speedup = 8/3.5 = 2.3 Non-stop: Speedup =2n/05n+15 2n/0.5n 1.5 4 = number of stages 4.5 An Overview
More informationChapter 4. The Processor
Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified
More informationMidnight Laundry. IC220 Set #19: Laundry, Co-dependency, and other Hazards of Modern (Architecture) Life. Return to Chapter 4
IC220 Set #9: Laundry, Co-dependency, and other Hazards of Modern (Architecture) Life Return to Chapter 4 Midnight Laundry Task order A B C D 6 PM 7 8 9 0 2 2 AM 2 Smarty Laundry Task order A B C D 6 PM
More informationIn-order vs. Out-of-order Execution. In-order vs. Out-of-order Execution
In-order vs. Out-of-order Execution In-order instruction execution instructions are fetched, executed & committed in compilergenerated order if one instruction stalls, all instructions behind it stall
More informationChapter 4. The Processor
Chapter 4 The Processor 4.1 Introduction Introduction CPU performance factors Instruction count CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations Determined by ISA
More informationDo not open this exam until instructed to do so. CS/ECE 354 Final Exam May 19, CS Login: QUESTION MAXIMUM SCORE TOTAL 115
Name: Solution Signature: Student ID#: Section #: CS Login: Section 2: Section 3: 11:00am (Wood) 1:00pm (Castano) CS/ECE 354 Final Exam May 19, 2000 1. This exam is open book/notes. 2. No calculators.
More information14:332:331 Pipelined Datapath
14:332:331 Pipelined Datapath I n s t r. O r d e r Inst 0 Inst 1 Inst 2 Inst 3 Inst 4 Single Cycle Disadvantages & Advantages Uses the clock cycle inefficiently the clock cycle must be timed to accommodate
More informationEIE/ENE 334 Microprocessors
EIE/ENE 334 Microprocessors Lecture 6: The Processor Week #06/07 : Dejwoot KHAWPARISUTH Adapted from Computer Organization and Design, 4 th Edition, Patterson & Hennessy, 2009, Elsevier (MK) http://webstaff.kmutt.ac.th/~dejwoot.kha/
More informationThe University of Michigan - Department of EECS EECS 370 Introduction to Computer Architecture Midterm Exam 2 solutions April 5, 2011
1. Performance Principles [5 pts] The University of Michigan - Department of EECS EECS 370 Introduction to Computer Architecture Midterm Exam 2 solutions April 5, 2011 For each of the following comparisons,
More informationCS 2506 Computer Organization II Test 2. Do not start the test until instructed to do so! printed
Instructions: Print your name in the space provided below. This examination is closed book and closed notes, aside from the permitted fact sheet, with a restriction: 1) one 8.5x11 sheet, both sides, handwritten
More informationChapter 4. The Processor
Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware 4.1 Introduction We will examine two MIPS implementations
More informationSOLUTION. Midterm #1 February 26th, 2018 Professor Krste Asanovic Name:
SOLUTION Notes: CS 152 Computer Architecture and Engineering CS 252 Graduate Computer Architecture Midterm #1 February 26th, 2018 Professor Krste Asanovic Name: I am taking CS152 / CS252 This is a closed
More informationCSEE 3827: Fundamentals of Computer Systems
CSEE 3827: Fundamentals of Computer Systems Lecture 21 and 22 April 22 and 27, 2009 martha@cs.columbia.edu Amdahl s Law Be aware when optimizing... T = improved Taffected improvement factor + T unaffected
More informationPipelining. lecture 15. MIPS data path and control 3. Five stages of a MIPS (CPU) instruction. - factory assembly line (Henry Ford years ago)
lecture 15 Pipelining MIPS data path and control 3 - factory assembly line (Henry Ford - 100 years ago) - car wash Multicycle model: March 7, 2016 Pipelining - cafeteria -... Main idea: achieve efficiency
More informationCS 2506 Computer Organization II Test 2. Do not start the test until instructed to do so! printed
Instructions: Print your name in the space provided below. This examination is closed book and closed notes, aside from the permitted fact sheet, with a restriction: 1) one 8.5x11 sheet, both sides, handwritten
More informationPipelining Exercises, Continued
Pipelining Exercises, Continued. Spot all data dependencies (including ones that do not lead to stalls). Draw arrows from the stages where data is made available, directed to where it is needed. Circle
More informationECE473 Computer Architecture and Organization. Pipeline: Control Hazard
Computer Architecture and Organization Pipeline: Control Hazard Lecturer: Prof. Yifeng Zhu Fall, 2015 Portions of these slides are derived from: Dave Patterson UCB Lec 15.1 Pipelining Outline Introduction
More informationComputer Organization and Structure. Bing-Yu Chen National Taiwan University
Computer Organization and Structure Bing-Yu Chen National Taiwan University The Processor Logic Design Conventions Building a Datapath A Simple Implementation Scheme An Overview of Pipelining Pipelined
More information1 Hazards COMP2611 Fall 2015 Pipelined Processor
1 Hazards Dependences in Programs 2 Data dependence Example: lw $1, 200($2) add $3, $4, $1 add can t do ID (i.e., read register $1) until lw updates $1 Control dependence Example: bne $1, $2, target add
More informationData Hazards Compiler Scheduling Pipeline scheduling or instruction scheduling: Compiler generates code to eliminate hazard
Data Hazards Compiler Scheduling Pipeline scheduling or instruction scheduling: Compiler generates code to eliminate hazard Consider: a = b + c; d = e - f; Assume loads have a latency of one clock cycle:
More informationCS 2506 Computer Organization II Test 2. Do not start the test until instructed to do so! printed
Instructions: Print your name in the space provided below. This examination is closed book and closed notes, aside from the permitted fact sheet, with a restriction: 1) one 8.5x11 sheet, both sides, handwritten
More informationMIPS Pipelining. Computer Organization Architectures for Embedded Computing. Wednesday 8 October 14
MIPS Pipelining Computer Organization Architectures for Embedded Computing Wednesday 8 October 14 Many slides adapted from: Computer Organization and Design, Patterson & Hennessy 4th Edition, 2011, MK
More information4. The Processor Computer Architecture COMP SCI 2GA3 / SFWR ENG 2GA3. Emil Sekerinski, McMaster University, Fall Term 2015/16
4. The Processor Computer Architecture COMP SCI 2GA3 / SFWR ENG 2GA3 Emil Sekerinski, McMaster University, Fall Term 2015/16 Instruction Execution Consider simplified MIPS: lw/sw rt, offset(rs) add/sub/and/or/slt
More informationThe Processor. Z. Jerry Shi Department of Computer Science and Engineering University of Connecticut. CSE3666: Introduction to Computer Architecture
The Processor Z. Jerry Shi Department of Computer Science and Engineering University of Connecticut CSE3666: Introduction to Computer Architecture Introduction CPU performance factors Instruction count
More informationInstruction Frequency CPI. Load-store 55% 5. Arithmetic 30% 4. Branch 15% 4
PROBLEM 1: An application running on a 1GHz pipelined processor has the following instruction mix: Instruction Frequency CPI Load-store 55% 5 Arithmetic 30% 4 Branch 15% 4 a) Determine the overall CPI
More informationEXAM #1. CS 2410 Graduate Computer Architecture. Spring 2016, MW 11:00 AM 12:15 PM
EXAM #1 CS 2410 Graduate Computer Architecture Spring 2016, MW 11:00 AM 12:15 PM Directions: This exam is closed book. Put all materials under your desk, including cell phones, smart phones, smart watches,
More information1. Truthiness /8. 2. Branch prediction /5. 3. Choices, choices /6. 5. Pipeline diagrams / Multi-cycle datapath performance /11
The University of Michigan - Department of EECS EECS 370 Introduction to Computer Architecture Midterm Exam 2 ANSWER KEY November 23 rd, 2010 Name: University of Michigan uniqname: (NOT your student ID
More informationLecture Topics. Announcements. Today: Data and Control Hazards (P&H ) Next: continued. Exam #1 returned. Milestone #5 (due 2/27)
Lecture Topics Today: Data and Control Hazards (P&H 4.7-4.8) Next: continued 1 Announcements Exam #1 returned Milestone #5 (due 2/27) Milestone #6 (due 3/13) 2 1 Review: Pipelined Implementations Pipelining
More informationECE154A Introduction to Computer Architecture. Homework 4 solution
ECE154A Introduction to Computer Architecture Homework 4 solution 4.16.1 According to Figure 4.65 on the textbook, each register located between two pipeline stages keeps data shown below. Register IF/ID
More informationCS 2410 Mid term (fall 2018)
CS 2410 Mid term (fall 2018) Name: Question 1 (6+6+3=15 points): Consider two machines, the first being a 5-stage operating at 1ns clock and the second is a 12-stage operating at 0.7ns clock. Due to data
More informationCS 2506 Computer Organization II Test 2
Instructions: Print your name in the space provided below. This examination is closed book and closed notes, aside from the permitted one-page formula sheet. No calculators or other computing devices may
More informationComputer Architecture Computer Science & Engineering. Chapter 4. The Processor BK TP.HCM
Computer Architecture Computer Science & Engineering Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware
More informationLECTURE 9. Pipeline Hazards
LECTURE 9 Pipeline Hazards PIPELINED DATAPATH AND CONTROL In the previous lecture, we finalized the pipelined datapath for instruction sequences which do not include hazards of any kind. Remember that
More informationAdvanced Parallel Architecture Lessons 5 and 6. Annalisa Massini /2017
Advanced Parallel Architecture Lessons 5 and 6 Annalisa Massini - Pipelining Hennessy, Patterson Computer architecture A quantitive approach Appendix C Sections C.1, C.2 Pipelining Pipelining is an implementation
More informationECE 2300 Digital Logic & Computer Organization. Caches
ECE 23 Digital Logic & Computer Organization Spring 217 s Lecture 2: 1 Announcements HW7 will be posted tonight Lab sessions resume next week Lecture 2: 2 Course Content Binary numbers and logic gates
More informationECE Sample Final Examination
ECE 3056 Sample Final Examination 1 Overview The following applies to all problems unless otherwise explicitly stated. Consider a 2 GHz MIPS processor with a canonical 5-stage pipeline and 32 general-purpose
More informationEE557--FALL 1999 MIDTERM 1. Closed books, closed notes
NAME: SOLUTIONS STUDENT NUMBER: EE557--FALL 1999 MIDTERM 1 Closed books, closed notes GRADING POLICY: The front page of your exam shows your total numerical score out of 75. The highest numerical score
More informationCS232 Final Exam May 5, 2001
CS232 Final Exam May 5, 2 Name: This exam has 4 pages, including this cover. There are six questions, worth a total of 5 points. You have 3 hours. Budget your time! Write clearly and show your work. State
More informationECE 3056: Architecture, Concurrency, and Energy of Computation. Sample Problem Sets: Pipelining
ECE 3056: Architecture, Concurrency, and Energy of Computation Sample Problem Sets: Pipelining 1. Consider the following code sequence used often in block copies. This produces a special form of the load
More informationCS2100 Computer Organisation Tutorial #10: Pipelining Answers to Selected Questions
CS2100 Computer Organisation Tutorial #10: Pipelining Answers to Selected Questions Tutorial Questions 2. [AY2014/5 Semester 2 Exam] Refer to the following MIPS program: # register $s0 contains a 32-bit
More informationComputer Architecture EE 4720 Final Examination
Name Computer Architecture EE 4720 Final Examination Primary: 6 December 1999, Alternate: 7 December 1999, 10:00 12:00 CST 15:00 17:00 CST Alias Problem 1 Problem 2 Problem 3 Problem 4 Exam Total (25 pts)
More informationCS252 Prerequisite Quiz. Solutions Fall 2007
CS252 Prerequisite Quiz Krste Asanovic Solutions Fall 2007 Problem 1 (29 points) The followings are two code segments written in MIPS64 assembly language: Segment A: Loop: LD r5, 0(r1) # r5 Mem[r1+0] LD
More informationCS146 Computer Architecture. Fall Midterm Exam
CS146 Computer Architecture Fall 2002 Midterm Exam This exam is worth a total of 100 points. Note the point breakdown below and budget your time wisely. To maximize partial credit, show your work and state
More informationPipelining. Ideal speedup is number of stages in the pipeline. Do we achieve this? 2. Improve performance by increasing instruction throughput ...
CHAPTER 6 1 Pipelining Instruction class Instruction memory ister read ALU Data memory ister write Total (in ps) Load word 200 100 200 200 100 800 Store word 200 100 200 200 700 R-format 200 100 200 100
More informationCSE Lecture 13/14 In Class Handout For all of these problems: HAS NOT CANNOT Add Add Add must wait until $5 written by previous add;
CSE 30321 Lecture 13/14 In Class Handout For the sequence of instructions shown below, show how they would progress through the pipeline. For all of these problems: - Stalls are indicated by placing the
More informationInstruction Pipelining Review
Instruction Pipelining Review Instruction pipelining is CPU implementation technique where multiple operations on a number of instructions are overlapped. An instruction execution pipeline involves a number
More informationCOSC 6385 Computer Architecture - Pipelining
COSC 6385 Computer Architecture - Pipelining Fall 2006 Some of the slides are based on a lecture by David Culler, Instruction Set Architecture Relevant features for distinguishing ISA s Internal storage
More informationLECTURE 10. Pipelining: Advanced ILP
LECTURE 10 Pipelining: Advanced ILP EXCEPTIONS An exception, or interrupt, is an event other than regular transfers of control (branches, jumps, calls, returns) that changes the normal flow of instruction
More informationCS/CoE 1541 Exam 2 (Spring 2019).
CS/CoE 1541 Exam 2 (Spring 2019) Name: Question 1 (5+5+5=15 points): Show the content of each of the caches shown below after the two memory references 35, 44 Use the notation [tag, M(address),] to describe
More informationELE 655 Microprocessor System Design
ELE 655 Microprocessor System Design Section 2 Instruction Level Parallelism Class 1 Basic Pipeline Notes: Reg shows up two places but actually is the same register file Writes occur on the second half
More informationAdvanced Computer Architecture CMSC 611 Homework 3. Due in class Oct 17 th, 2012
Advanced Computer Architecture CMSC 611 Homework 3 Due in class Oct 17 th, 2012 (Show your work to receive partial credit) 1) For the following code snippet list the data dependencies and rewrite the code
More informationCSE 490/590 Computer Architecture Homework 2
CSE 490/590 Computer Architecture Homework 2 1. Suppose that you have the following out-of-order datapath with 1-cycle ALU, 2-cycle Mem, 3-cycle Fadd, 5-cycle Fmul, no branch prediction, and in-order fetch
More informationCS433 Midterm. Prof Josep Torrellas. October 16, Time: 1 hour + 15 minutes
CS433 Midterm Prof Josep Torrellas October 16, 2014 Time: 1 hour + 15 minutes Name: Alias: Instructions: 1. This is a closed-book, closed-notes examination. 2. The Exam has 4 Questions. Please budget your
More informationProcessor (I) - datapath & control. Hwansoo Han
Processor (I) - datapath & control Hwansoo Han Introduction CPU performance factors Instruction count - Determined by ISA and compiler CPI and Cycle time - Determined by CPU hardware We will examine two
More information