EECS 370 Introduction to Computer Architecture: Midterm Exam 2 Answer Key


The University of Michigan - Department of EECS
EECS 370 Introduction to Computer Architecture
Midterm Exam 2 ANSWER KEY
November 23rd, 2010

Name:
University of Michigan uniqname: (NOT your student ID number!)

Open book, open notes. No laptops, PDAs, cell phones, etc. (calculators are OK). Questions vary in difficulty; it is strongly recommended that you do not spend too much time on any one question. For questions where a box is provided, please put your final answer in the box.

The rules of the Honor Code of the University of Michigan - College of Engineering apply for this exam.

Honor code pledge: I have neither given nor received aid on this examination, nor have I concealed any violations of the Honor Code.

Signature: (Exams without a signed pledge will not be graded)

Question                                 Score/Point Value
1. Truthiness                            /8
2. Branch prediction                     /5
3. Choices, choices                      /6
4. Hazards                               /12
5. Pipeline diagrams                     /12
6. Multi-cycle datapath performance      /11
7. A pipeline with cache                 /11
8. A simple cache for LC2K               /10
9. Pipeline design                       /17
10. Cache misses                         /18
TOTAL                                    /110

1. Truthiness [8 points]

For each of the following statements, indicate whether it is True or False. Circle your choice. [1 pt per correct answer]

On caches:
a) Capacity misses may occur due to the size of a block in a cache. TRUE / FALSE
b) Compulsory misses cannot be reduced by increasing the size of the cache, while keeping the block size the same. TRUE / FALSE
c) Compulsory misses may be reduced with a larger block size. TRUE / FALSE
d) Conflict misses may or may not occur depending on the replacement policy in use. TRUE / FALSE

On datapaths:
e) Resolving branches earlier in a pipeline will result in a faster clock frequency than if resolved in later pipeline stages. TRUE / FALSE
f) A multi-cycle datapath is the cheapest one to implement compared to single-cycle and pipelined ones, as it reuses datapath elements. TRUE / FALSE
g) Pipelined execution produces the highest throughput compared to single-cycle and multi-cycle execution. TRUE / FALSE
h) A single-cycle datapath processor has a faster clock frequency than one implemented with a multi-cycle datapath, if everything else is the same. TRUE / FALSE

2. Branch Prediction [5 points]

Consider the 2-bit saturating predictor as discussed in lecture. The predictor is initialized to 2; it is incremented when a branch is taken (T) and decremented when a branch is not taken (N). Given the execution sequence below for a single branch, fill in the table with the counter values and predictions at each branch occurrence. [0.5 pt per correct column]

Table rows: Prediction before branch resolves; Counter after branch resolves
Init N N T T N T T T T T N N N T N T T

What is the accuracy of this branch predictor (provide your answer as a percentage)? [1 pt]
3/8 = 0.375 = 37.5%
Answer: 37.5%
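The predictor in Question 2 can be checked with a short simulation. This is a minimal sketch under two assumptions that the transcription does not confirm: the counter predicts taken when its value is 2 or 3, and the first eight symbols of the flattened table (N N T T N T T T) are the branch outcomes. The function name `simulate_two_bit_predictor` is illustrative only. Under these assumptions the result agrees with the key's 3/8 accuracy.

```python
def simulate_two_bit_predictor(outcomes, initial_counter=2):
    """Simulate one 2-bit saturating counter over a list of 'T'/'N' outcomes."""
    counter = initial_counter
    predictions = []   # prediction made before each branch resolves
    counters = []      # counter value after each branch resolves
    correct = 0
    for outcome in outcomes:
        # Assumption: counter values 2 and 3 predict taken, 0 and 1 predict not taken.
        prediction = "T" if counter >= 2 else "N"
        predictions.append(prediction)
        if prediction == outcome:
            correct += 1
        # Increment on taken, decrement on not taken, saturating at 0 and 3.
        if outcome == "T":
            counter = min(counter + 1, 3)
        else:
            counter = max(counter - 1, 0)
        counters.append(counter)
    return predictions, counters, correct / len(outcomes)


# Assumed reading of the outcome row in the flattened table.
outcomes = list("NNTTNTTT")
predictions, counters, accuracy = simulate_two_bit_predictor(outcomes)
print("outcomes:   ", " ".join(outcomes))
print("predictions:", " ".join(predictions))
print("counters:   ", " ".join(str(c) for c in counters))
print(f"accuracy: {accuracy:.1%}")
```

Changing `initial_counter` or the outcome string shows how sensitive the accuracy is to the starting state of the counter.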

3. Choices, Choices [6 points]

For the two questions below, select the option(s) that best match: [1 pt per correct answer]

a) Associate the characteristics below with one option among A, B or C at the right:
   Highest frequency of access: B
   Makes use of both temporal and spatial locality: C
   Largest access latency: A
   Located physically closest to the ALU: B

   A. Optical disk
   B. Register
   C. Cache

b) Consider the following snippet of MIPS assembly:
   1:  lui $s1, 10
   2*: add $zero, $s1, $s1
   3:  xor $s2, $s1, $s1
   4:  sw $s1, 0($sp)
   5:  lw $s3, 0($sp)
   6:  div $s3, $s2

   * Remember that writing to register $zero is not an error and no exception will be triggered. The register write just fails silently.

   Which instruction (line number), if any, causes the processor to trigger an exception? [2 pts]
   A. 3   B. 4   C. 5   D. 6   E. None

4. Hazards [12 points]

Oh no! You are provided with a 5-stage LC2K pipeline which has neither hazard detection capabilities nor data forwarding paths. However, you are slightly relieved to learn that the register file does support internal forwarding, i.e., registers can be read and written in the same cycle (similar to what we have done in lecture). Consider the LC2K code snippet shown on the right.

a) Draw arrows on the code snippet to indicate all the hazards present. For instance, if there were two hazards between registers C1 and A2 and between C1 and B2 for instructions 1 and 2 below, your answer should look like: [3 pts]
   1: opcode1 A1 B1 C1
   2: opcode2 A2 B2 C2

b) Use the boxed space below to show all dependencies among the instructions in the snippet. Whenever a particular instruction must complete before another instruction, use an arrow to indicate this constraint. As an example, a diagram like the one shown indicates that instruction 3 must be executed after 2, instruction 4 must be executed after 3, and instruction 6 can only be performed after both 1 and 4 are completed. Instruction 5 is not shown anywhere in the example diagram, thus it can be executed at any time. [4 pts]

   Your diagram:
   1: add
   2: nand
   3: lw 1 4 laba
   4: sw 3 4 labb
   5: nand
   6: add

c) Based on your answers in parts a) and b) above, reorder the snippet's instructions to achieve maximum performance, adding noops as necessary to ensure correctness. (Note: Your answer should not exceed 10 instructions) [5 pts]

   Other 8-cycle solutions are possible.
   1: add
   2: lw 1 4 laba
   3: noop
   4: nand
   5: add
   6: noop
   7: sw 3 4 labb
   8: nand
   9:
   10:

5. Pipeline Diagrams [12 points]

You are currently working on an 8-stage LC2K pipeline design, as follows:
   Fetch: Instructions are fetched
   ID1: Register values begin to be loaded
   ID2: Register values loading completes
   EX1: ALU operation begins for all instructions
   EX2: Branches are resolved
   EX3: ALU operation ends for all instructions
   MEM: Memory access
   WB: Write back

Branches are predicted not taken and mispredicted branches are squashed. Note that they are resolved in EX2. In addition, the pipeline has full forwarding paths from EX3/MEM and MEM/WB back to EX1. If an instruction must be stalled, the stall should be at the latest possible pipeline stage. The design executes the following snippet of code:
   0: lw 0 2 two
   1: sw 0 3 six
   2: add
   3: beq
   4: halt
   5: six .fill 6
   6: two .fill 2

Please complete the table below, indicating which instruction (use its opcode) is executing in which pipeline stage for the first 10 cycles of execution. Indicate stalls with a --. To get you started, we completed the first four cycles for you. [2 pts per correct cycle, total 12 pts]

Cycle  Fetch  ID1   ID2   EX1   EX2   EX3   MEM   WB
1      lw
2      sw     lw
3      add    sw    lw
4      beq    add   sw    lw
5      halt   beq   add   sw    lw
6      halt   beq   add   --    sw    lw
7      halt   beq   add   --    --    sw    lw
8             halt  beq   add   --    --    sw    lw
9             halt  beq   --    add   --    --    sw
10            halt  beq   --    --    add   --    --

6. Multi-cycle Datapath Performance [11 points]

The following code took 32.6 nanoseconds to execute:
         lw    0 1 init
         lw    0 2 one
         lw    0 4 end
   start beq   1 4 done
         add
         add
         beq   0 0 start
   done  halt
   init  .fill 0
   one   .fill 1
   end   .fill 10

a) How many times does this code go through the loop? [1 pt]
   10 times

b) If this program was executed on the LC2K multi-cycle datapath discussed in class, how many cycles does it take to execute? To compute this number, fill in the table below with the number of times each type of instruction is executed and the total number of cycles it took to execute that instruction. Then provide your answer for the total number of cycles. Assume it takes 4 cycles for the halt instruction. As an example, in a snippet consisting of two consecutive add instructions, the add instruction is executed twice. [5 pts, 1 for each correct row]

   Instruction type   Number of times executed   Total cycles for instruction
   add
   lw                 3                          15
   beq
   halt               1                          4
   .fill              0                          0

   Total cycles executed for the entire program: [1 pt] 183 cycles

c) What is the average CPI? [Show your calculation to receive credit for this part] [2 pts]
   183 cycles / 45 instructions ≈ 4.07 CPI

d) What is the frequency of this processor (in GHz)? [Show your calculation to receive credit for this part] [2 pts]
   183 cycles / 32.6 ns ≈ 5.61 GHz
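The totals in Question 6 can be reproduced with a few lines of arithmetic. This sketch assumes per-instruction cycle counts of 4 for add and beq and 5 for lw; only the 4-cycle halt and the lw row (3 executions, 15 cycles) appear in the transcription, so the add and beq values are assumptions chosen because they reproduce the key's 183-cycle total. The helper name `count_executions` is hypothetical.

```python
# Cycles per instruction type on the multi-cycle datapath.
# lw = 5 follows from the key's table (3 executions, 15 cycles); halt = 4 is given.
# add = 4 and beq = 4 are assumptions that reproduce the key's 183-cycle total.
CYCLES_PER_INSTRUCTION = {"add": 4, "lw": 5, "beq": 4, "halt": 4}


def count_executions(loop_iterations):
    """Count dynamic instructions for the loop in Question 6."""
    return {
        "lw": 3,                                  # three loads before the loop
        # The exit test 'beq 1 4 done' runs once per iteration plus the final
        # exit check; 'beq 0 0 start' runs once per iteration.
        "beq": (loop_iterations + 1) + loop_iterations,
        "add": 2 * loop_iterations,               # two adds in the loop body
        "halt": 1,
    }


counts = count_executions(loop_iterations=10)
total_instructions = sum(counts.values())
total_cycles = sum(counts[op] * CYCLES_PER_INSTRUCTION[op] for op in counts)
average_cpi = total_cycles / total_instructions
frequency_ghz = total_cycles / 32.6   # cycles per nanosecond equals GHz

for op, n in counts.items():
    print(f"{op:5s} executed {n:2d} times, {n * CYCLES_PER_INSTRUCTION[op]:3d} cycles")
print(f"total: {total_cycles} cycles, {total_instructions} instructions")
print(f"CPI = {average_cpi:.2f}, frequency = {frequency_ghz:.2f} GHz")
```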

7. A pipeline with cache [11 points]

Consider a 5-stage pipelined processor similar to the LC2K we have studied in class. All branches are predicted not taken and resolved in the memory stage. There are separate instruction and data caches. If there is a miss in any of the caches, the data will always be available in memory. The pipeline has the following instruction breakdown and statistics:

Instruction mix: 10% add, 10% nand, 25% lw, 20% sw, 30% beq, 5% noop
15% of add and nand are followed by dependent instructions
25% of loads are followed by dependent instructions
5% of all stores are followed by a load instruction that loads from the same address as the preceding store
75% of branches are taken
40% of branches are followed by another branch
The D-cache hits 95% of the time; the I-cache hits 99.9% of the time. Each memory access takes 150 cycles.

Below is a list of possible factors that may impact the overall CPI of this pipelined datapath. Please calculate the amount of CPI increase (beyond the ideal CPI = 1) due to each of these factors. If you believe that a factor does not impact this pipeline's CPI, simply indicate 0. [YOU **MUST SHOW YOUR WORK** TO RECEIVE ANY CREDIT FOR YOUR ANSWERS]

a) Branch misprediction [2 pts]
   0.3 * 0.75 * 3 = 0.675 CPI
b) Adds and nands followed by dependent instructions [1 pt]
   0 CPI
c) Cache misses [2 pts, 1 per addendum]
   1.00*0.001*150 + ( )*0.05*150 = CPI
d) Loads followed by dependent instructions [2 pts]
   0.25 * 0.25 * 1 = 0.0625 CPI
e) Stores to an address followed by loads to the same address [1 pt]
   0 CPI
f) Branches followed by another branch [1 pt]
   0 CPI

TOTAL CPI for this datapath: [2 pts]
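The CPI contributions in Question 7 can be tallied as below. The blank parenthesis in part c) and the blank total are not recoverable from the transcription, so this sketch assumes that the data-cache term uses the combined lw and sw fraction (0.25 + 0.20); the total it prints is computed under that assumption and is not quoted from the key. All variable names are illustrative.

```python
# Instruction mix and statistics taken from the question text.
FRACTION = {"add": 0.10, "nand": 0.10, "lw": 0.25, "sw": 0.20, "beq": 0.30, "noop": 0.05}
MEMORY_ACCESS_CYCLES = 150

# a) Branches are predicted not taken and resolved in MEM, so each taken branch
#    squashes 3 instructions (the factor of 3 is the one used in the key).
branch_cpi = FRACTION["beq"] * 0.75 * 3

# c) Every instruction accesses the I-cache (0.1% miss rate).
icache_cpi = 1.00 * 0.001 * MEMORY_ACCESS_CYCLES
#    Assumption: only lw and sw access the D-cache (5% miss rate).
dcache_cpi = (FRACTION["lw"] + FRACTION["sw"]) * 0.05 * MEMORY_ACCESS_CYCLES

# d) A load followed by a dependent instruction costs one stall cycle.
load_use_cpi = FRACTION["lw"] * 0.25 * 1

# b), e), f) contribute 0 according to the key.
total_cpi = 1 + branch_cpi + icache_cpi + dcache_cpi + load_use_cpi

print(f"branch misprediction: {branch_cpi:.4f}")
print(f"cache misses:         {icache_cpi + dcache_cpi:.4f}")
print(f"load-use stalls:      {load_use_cpi:.4f}")
print(f"total CPI (under the stated assumption): {total_cpi:.4f}")
```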

8. A simple cache for LC2K [10 points]

We want to design a cache for our LC2K pipelined datapath. As you know, LC2K is word-addressable with 16-bit addresses. Assume the memory has an average access time of 100 ns. We would like to add a direct-mapped, 1 KB cache with a block size of 4 words.

a) How many sets are in the cache? [2 pts]
   Answer: 64
b) What is the associativity of the cache? [2 pts]
   Answer: 1-way (direct-mapped also acceptable)
c) Indicate below how many address bits must be reserved for tag, set index and block offset: [3 pts]
   Answer: tag / set index / block offset

Now, we want to evaluate the performance improvement of the memory system of this LC2K design with our cache in place. Assume that the cache has an average access time of 4 ns and a hit rate of 92% when running our greatest benchmark, the combination program from project 2.

d) What is the average access time of the system with the cache in place? [2 pts] (show your work below to receive credit)
   4 ns + (1 - 0.92) * 100 ns = 12 ns
   (1 pt for 11.68 ns, which assumes cache access overlaps memory access)
   Answer: 12 ns
e) What is the performance improvement compared to the original system without the cache? (old access time / new access time) [1 pt]
   100 ns / 12 ns = 8.33
   Answer: 8.33 times better
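The cache geometry and access-time arithmetic in Question 8 can be worked through as follows. This is a sketch that assumes a 4-byte (32-bit) LC2K word, which is consistent with the key's answer of 64 sets; the address-bit split it prints is derived from that assumption because the key's values for part c) did not survive transcription.

```python
CACHE_BYTES = 1024        # 1 KB cache
WORDS_PER_BLOCK = 4
BYTES_PER_WORD = 4        # assumption: LC2K words are 32 bits wide
ADDRESS_BITS = 16         # word-addressable, 16-bit addresses
WAYS = 1                  # direct-mapped

num_blocks = CACHE_BYTES // (WORDS_PER_BLOCK * BYTES_PER_WORD)
num_sets = num_blocks // WAYS

# Memory is word-addressable, so the block offset selects a word, not a byte.
# For a power of two n, n.bit_length() - 1 equals log2(n).
offset_bits = WORDS_PER_BLOCK.bit_length() - 1
index_bits = num_sets.bit_length() - 1
tag_bits = ADDRESS_BITS - index_bits - offset_bits

# Average access time: every access pays the cache time; misses also pay memory.
cache_ns, memory_ns, hit_rate = 4, 100, 0.92
average_ns = cache_ns + (1 - hit_rate) * memory_ns
# Alternative from the key: cache access overlaps the memory access.
overlapped_ns = hit_rate * cache_ns + (1 - hit_rate) * memory_ns
speedup = memory_ns / average_ns

print(f"sets = {num_sets}, tag/index/offset = {tag_bits}/{index_bits}/{offset_bits}")
print(f"average access time = {average_ns:.2f} ns (overlapped: {overlapped_ns:.2f} ns)")
print(f"speedup = {speedup:.2f}x")
```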

9. Pipeline Design [17 points]

Your LC2K company sales are rapidly growing. Sales are doubling every year. The company programmer tells you that soon there will be a problem with the accounting software. Since the LC2K does not support floating point, all accounting is done in cents using signed two's complement integer arithmetic. With this data representation, the annual sales revenue for LC2K systems will no longer fit in a word three years from now.

a) What is the current annual sales revenue for LC2K systems? (You can round the value to thousands of $.) [2 pts]

   The revenue will overflow when its value in cents no longer fits in a 32-bit signed integer, which happens at 2**31 cents. Sales double each year, and it will overflow in three years, so we are currently at 1/8 of this: 2**28 cents.
   Answer: $2,684 thousand
   Precise answer: current annual sales of $2,684,354.56.
   Reasonable rounded answer: annual sales of $2.7 million per year (also accept $3 million or $2.6 million or something more precise for full credit).

You are feeling optimistic that long-term sales of LC2K systems will continue to increase, so you decide that sales figures should be stored as single-precision floating point numbers. To maintain acceptable performance of the accounting software, you create a floating point add instruction, "fadd", and you build a new pipelined machine that implements the traditional LC2K instruction set plus this instruction. However, you find that implementing "fadd" makes the ALU slower. As the ALU was already one part of the critical path that limited clock speed, you must slow the clock down as a result. The old ALU produced a result in 3 nsec, while the new ALU produces a result in 4 nsec, and the old machine had a maximum clock frequency of 250 MHz.

b) What is the maximum clock frequency of the new machine? [2 pts]

   The old clock was 250 MHz, so the clock period was 4 nsec. The ALU was said to be in the critical path, and it changed from 3 nsec to 4 nsec, which slows the critical path down by 1 nsec, so the period must change from 4 nsec to 5 nsec. A 5 nsec clock period implies a clock frequency of 200 MHz.
   Answer: 200 MHz

To avoid slowing the clock down, you decide to build a separate pipelined ALU just to handle "fadd". This ALU takes input from ID/EX and after two cycles it places a result into a new pipeline register in MEM/WB called "faluresult". This new ALU has an internal pipeline register so it can accept new input data on every cycle and it can also produce new output data on every cycle. Data hazards are resolved with "detect and forward". Assume that the pipeline behaves like the version in the lectures (the register file has internal forwarding so new register values are read correctly on the same cycle that they are written).
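The arithmetic behind parts a) and b) of Question 9 can be checked directly. This short sketch follows the key's reasoning (overflow at 2**31 cents, three doublings away, and a critical path that grows by the extra ALU delay); the variable names are illustrative only.

```python
# Part a): revenue in cents overflows a 32-bit signed integer at 2**31.
overflow_cents = 2 ** 31
years_until_overflow = 3
# Sales double each year, so today's revenue is the overflow value halved three times.
current_cents = overflow_cents // (2 ** years_until_overflow)
current_dollars = current_cents / 100

# Part b): the ALU sits on the critical path, so the extra ALU delay
# is added directly to the clock period.
old_frequency_mhz = 250
old_period_ns = 1000 / old_frequency_mhz          # 4 ns
old_alu_ns, new_alu_ns = 3, 4
new_period_ns = old_period_ns + (new_alu_ns - old_alu_ns)
new_frequency_mhz = 1000 / new_period_ns

print(f"current revenue: {current_cents} cents = ${current_dollars:,.2f}")
print(f"new clock: {new_period_ns:.0f} ns period, {new_frequency_mhz:.0f} MHz")
```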

Above we provide a figure showing the pipelined LC2K with the new floating point ALU (marked "falu"). However, we did not get a chance to complete all of the connections. Your job is to answer the questions below. Questions c) and d) refer to the lettered locations, A through F, near the center of the figure. You should circle exactly one choice for each question.

c) Where should the upper input of the new floating point ALU be connected? [1 pt]
   A  B  C  D  E  F
   (F also works if they selected E for d))
d) Where should the lower input of the new floating point ALU be connected? [1 pt]
   A  B  C  D  E  F
   (E also works if they selected F for c))
e) The five MUXes in the figure have been labeled M1 through M5. Circle below all of the MUXes that will need one or more additional inputs to support the new falu: [1 pt]
   M1  M2  M3  M4  M5

For the remaining parts of this question, consider each of the following instruction sequences. Assume there are no dependent instructions preceding or following these sequences.

Sequence V   Sequence W   Sequence X   Sequence Y   Sequence Z
fadd         add          fadd         fadd         lw
fadd         fadd         add          lw           fadd

For each statement below, list all sequences (out of V, W, X, Y, Z) for which the statement is true. The first one is completed for you as an example: [10 pts, 2 for each sequence placed in all its correct slots]

[EXAMPLE] This sequence modifies register 3: [ANSWER]: V W X

f) No stall is needed: Answer: W
g) A stall of exactly one cycle is needed: Answer: V, X, Y, Z
h) A stall of two or more cycles is needed: Answer: NONE
i) No forwarding is needed: Answer: NONE
j) Forwarding from EX/MEM is needed: Answer: W
k) Forwarding from MEM/WB is needed: Answer: V, X, Y, Z

Sequences behave as follows:
V: one-cycle stall, then register 3 forwards from MEM/WB
W: no stall; register 3 forwards from EX/MEM
X: one-cycle stall, then register 3 forwards from MEM/WB
Y: one-cycle stall, then register 5 forwards from MEM/WB
Z: one-cycle stall, then register 5 forwards from MEM/WB

10. Cache Misses [18 points]

Cache-pro Incorporated has developed a new cache data replacement algorithm called least-frequently used (LFU). LFU tracks how often an address has been accessed once it was loaded in the cache. The addresses in the cache that are accessed the least are evicted first. If there are multiple items in a block, the sum of the counts of all items in the block is used. We have a 16-byte cache using LFU in a system that uses 32-bit addresses. The cache has the following properties: byte-addressable memory, 2-byte block size, write-back, 2-way set associative.

a) Indicate below how many address bits must be reserved for tag, set index and block offset: [3 pts]
   Answer: tag: 29 / set index / block offset

b) Now we perform a sequence of memory accesses to this cache at the addresses and in the order reported in the table below: 12, 15, ... Your job is to fill out the table. Make sure to provide the set and block numbers as decimal values. The # LFU field tracks the current access, as well as previous ones, to the same block. In the last two fields, circle whether the access is a hit or a miss and what type of miss. On the right, we drew a 16-byte cache that you must partition into blocks and sets and use to indicate the values that you write or overwrite for each access in order. We started to fill the table for you. Make sure you report the first four addresses in the cache schematic on the right as well. [14 pts, 0.5 for each Set/Block, LFU, Hit/Miss, type of miss]

Dec  Hex   Set #  Block #  # LFU  Hit/Miss    Type of miss
12   0xC                          Miss        Comp
15   0xF                          Miss        Comp
33   0x21                         Miss        Comp
11   0xB                          Miss        Comp
55   0x37                         Hit / Miss  Comp / Cap / Conf
14   0xE                          Hit / Miss  Comp / Cap / Conf
87   0x57                         Hit / Miss  Comp / Cap / Conf
27   0x1B                         Hit / Miss  Comp / Cap / Conf
13   0xD                          Hit / Miss  Comp / Cap / Conf
63   0x3F                         Hit / Miss  Comp / Cap / Conf
86   0x56                         Hit / Miss  Comp / Cap / Conf

c) How many cache hits and cache misses occurred while accessing the memory addresses of part b)? [1 pt]
   Hits: 2   Misses:
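The LFU policy described in Question 10 can be simulated to follow the access sequence step by step. This is a minimal sketch, not the key's worked table: it assumes that a block's count restarts at 1 each time the block is loaded, that ties (none occur in this sequence) go to the older block, and that the block number reported is the memory block number (address divided by the block size). It separates only compulsory from non-compulsory misses. The function name `simulate_lfu` is hypothetical.

```python
CACHE_BYTES = 16
BLOCK_BYTES = 2
WAYS = 2
NUM_SETS = CACHE_BYTES // (BLOCK_BYTES * WAYS)   # 4 sets


def simulate_lfu(addresses):
    """Return (address, set index, block number, LFU count, outcome) per access."""
    # One dict per set, mapping a resident block number to its access count.
    sets = [dict() for _ in range(NUM_SETS)]
    seen_blocks = set()   # blocks referenced at least once, to spot compulsory misses
    results = []
    for address in addresses:
        block = address // BLOCK_BYTES     # assumption: memory block number
        set_index = block % NUM_SETS
        lines = sets[set_index]
        if block in lines:
            lines[block] += 1
            outcome = "hit"
        else:
            outcome = "compulsory miss" if block not in seen_blocks else "non-compulsory miss"
            if len(lines) == WAYS:
                # Evict the least-frequently used block in this set.
                # min() keeps the earliest-inserted block on a tie (assumption).
                victim = min(lines, key=lines.get)
                del lines[victim]
            lines[block] = 1               # assumption: count restarts on reload
        seen_blocks.add(block)
        results.append((address, set_index, block, lines[block], outcome))
    return results


addresses = [12, 15, 33, 11, 55, 14, 87, 27, 13, 63, 86]
results = simulate_lfu(addresses)
for address, set_index, block, count, outcome in results:
    print(f"{address:3d} {hex(address):>5s} set {set_index} block {block:2d} LFU {count} {outcome}")
hits = sum(1 for r in results if r[4] == "hit")
misses = len(results) - hits
print(f"hits = {hits}, misses = {misses}")
```

Under these assumptions the simulation reports two hits, which agrees with the hit count given in part c).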


More information

Chapter 4. Instruction Execution. Introduction. CPU Overview. Multiplexers. Chapter 4 The Processor 1. The Processor.

Chapter 4. Instruction Execution. Introduction. CPU Overview. Multiplexers. Chapter 4 The Processor 1. The Processor. COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor The Processor - Introduction

More information

COMPUTER ORGANIZATION AND DESIGN

COMPUTER ORGANIZATION AND DESIGN COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle

More information

COSC 6385 Computer Architecture - Pipelining

COSC 6385 Computer Architecture - Pipelining COSC 6385 Computer Architecture - Pipelining Fall 2006 Some of the slides are based on a lecture by David Culler, Instruction Set Architecture Relevant features for distinguishing ISA s Internal storage

More information

Pipelining. CSC Friday, November 6, 2015

Pipelining. CSC Friday, November 6, 2015 Pipelining CSC 211.01 Friday, November 6, 2015 Performance Issues Longest delay determines clock period Critical path: load instruction Instruction memory register file ALU data memory register file Not

More information

Lecture Topics. Announcements. Today: Data and Control Hazards (P&H ) Next: continued. Exam #1 returned. Milestone #5 (due 2/27)

Lecture Topics. Announcements. Today: Data and Control Hazards (P&H ) Next: continued. Exam #1 returned. Milestone #5 (due 2/27) Lecture Topics Today: Data and Control Hazards (P&H 4.7-4.8) Next: continued 1 Announcements Exam #1 returned Milestone #5 (due 2/27) Milestone #6 (due 3/13) 2 1 Review: Pipelined Implementations Pipelining

More information

ISA Instruction Operation

ISA Instruction Operation This exam has 6 problems. Make sure you have a complete exam before you begin. Write your name on every page in case pages become separated during grading. You will have three hours to complete this exam.

More information

CS 230 Practice Final Exam & Actual Take-home Question. Part I: Assembly and Machine Languages (22 pts)

CS 230 Practice Final Exam & Actual Take-home Question. Part I: Assembly and Machine Languages (22 pts) Part I: Assembly and Machine Languages (22 pts) 1. Assume that assembly code for the following variable definitions has already been generated (and initialization of A and length). int powerof2; /* powerof2

More information

CS 352H Computer Systems Architecture Exam #1 - Prof. Keckler October 11, 2007

CS 352H Computer Systems Architecture Exam #1 - Prof. Keckler October 11, 2007 CS 352H Computer Systems Architecture Exam #1 - Prof. Keckler October 11, 2007 Name: Solutions (please print) 1-3. 11 points 4. 7 points 5. 7 points 6. 20 points 7. 30 points 8. 25 points Total (105 pts):

More information

Instruction Frequency CPI. Load-store 55% 5. Arithmetic 30% 4. Branch 15% 4

Instruction Frequency CPI. Load-store 55% 5. Arithmetic 30% 4. Branch 15% 4 PROBLEM 1: An application running on a 1GHz pipelined processor has the following instruction mix: Instruction Frequency CPI Load-store 55% 5 Arithmetic 30% 4 Branch 15% 4 a) Determine the overall CPI

More information

CSF Cache Introduction. [Adapted from Computer Organization and Design, Patterson & Hennessy, 2005]

CSF Cache Introduction. [Adapted from Computer Organization and Design, Patterson & Hennessy, 2005] CSF Cache Introduction [Adapted from Computer Organization and Design, Patterson & Hennessy, 2005] Review: The Memory Hierarchy Take advantage of the principle of locality to present the user with as much

More information

CS 61C: Great Ideas in Computer Architecture Pipelining and Hazards

CS 61C: Great Ideas in Computer Architecture Pipelining and Hazards CS 61C: Great Ideas in Computer Architecture Pipelining and Hazards Instructors: Vladimir Stojanovic and Nicholas Weaver http://inst.eecs.berkeley.edu/~cs61c/sp16 1 Pipelined Execution Representation Time

More information

CS2100 Computer Organisation Tutorial #10: Pipelining Answers to Selected Questions

CS2100 Computer Organisation Tutorial #10: Pipelining Answers to Selected Questions CS2100 Computer Organisation Tutorial #10: Pipelining Answers to Selected Questions Tutorial Questions 2. [AY2014/5 Semester 2 Exam] Refer to the following MIPS program: # register $s0 contains a 32-bit

More information

THE HONG KONG UNIVERSITY OF SCIENCE & TECHNOLOGY Computer Organization (COMP 2611) Spring Semester, 2014 Final Examination

THE HONG KONG UNIVERSITY OF SCIENCE & TECHNOLOGY Computer Organization (COMP 2611) Spring Semester, 2014 Final Examination THE HONG KONG UNIVERSITY OF SCIENCE & TECHNOLOGY Computer Organization (COMP 2611) Spring Semester, 2014 Final Examination May 23, 2014 Name: Email: Student ID: Lab Section Number: Instructions: 1. This

More information

Chapter 4 The Processor 1. Chapter 4A. The Processor

Chapter 4 The Processor 1. Chapter 4A. The Processor Chapter 4 The Processor 1 Chapter 4A The Processor Chapter 4 The Processor 2 Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware

More information

Introduction to Pipelining. Silvina Hanono Wachman Computer Science & Artificial Intelligence Lab M.I.T.

Introduction to Pipelining. Silvina Hanono Wachman Computer Science & Artificial Intelligence Lab M.I.T. Introduction to Pipelining Silvina Hanono Wachman Computer Science & Artificial Intelligence Lab M.I.T. L15-1 Performance Measures Two metrics of interest when designing a system: 1. Latency: The delay

More information

Pipelining. Ideal speedup is number of stages in the pipeline. Do we achieve this? 2. Improve performance by increasing instruction throughput ...

Pipelining. Ideal speedup is number of stages in the pipeline. Do we achieve this? 2. Improve performance by increasing instruction throughput ... CHAPTER 6 1 Pipelining Instruction class Instruction memory ister read ALU Data memory ister write Total (in ps) Load word 200 100 200 200 100 800 Store word 200 100 200 200 700 R-format 200 100 200 100

More information

Instruction Pipelining Review

Instruction Pipelining Review Instruction Pipelining Review Instruction pipelining is CPU implementation technique where multiple operations on a number of instructions are overlapped. An instruction execution pipeline involves a number

More information

Full Datapath. Chapter 4 The Processor 2

Full Datapath. Chapter 4 The Processor 2 Pipelining Full Datapath Chapter 4 The Processor 2 Datapath With Control Chapter 4 The Processor 3 Performance Issues Longest delay determines clock period Critical path: load instruction Instruction memory

More information

CS232 Final Exam May 5, 2001

CS232 Final Exam May 5, 2001 CS232 Final Exam May 5, 2 Name: Spiderman This exam has 4 pages, including this cover. There are six questions, worth a total of 5 points. You have 3 hours. Budget your time! Write clearly and show your

More information

Outline. A pipelined datapath Pipelined control Data hazards and forwarding Data hazards and stalls Branch (control) hazards Exception

Outline. A pipelined datapath Pipelined control Data hazards and forwarding Data hazards and stalls Branch (control) hazards Exception Outline A pipelined datapath Pipelined control Data hazards and forwarding Data hazards and stalls Branch (control) hazards Exception 1 4 Which stage is the branch decision made? Case 1: 0 M u x 1 Add

More information

Advanced Parallel Architecture Lessons 5 and 6. Annalisa Massini /2017

Advanced Parallel Architecture Lessons 5 and 6. Annalisa Massini /2017 Advanced Parallel Architecture Lessons 5 and 6 Annalisa Massini - Pipelining Hennessy, Patterson Computer architecture A quantitive approach Appendix C Sections C.1, C.2 Pipelining Pipelining is an implementation

More information

University of California, Berkeley College of Engineering

University of California, Berkeley College of Engineering University of California, Berkeley College of Engineering Department of Electrical Engineering and Computer Sciences Spring 2016 Instructors: Vladimir Stojanovic, Nicholas Weaver 2016-04-04 L J After the

More information

HY425 Lecture 05: Branch Prediction

HY425 Lecture 05: Branch Prediction HY425 Lecture 05: Branch Prediction Dimitrios S. Nikolopoulos University of Crete and FORTH-ICS October 19, 2011 Dimitrios S. Nikolopoulos HY425 Lecture 05: Branch Prediction 1 / 45 Exploiting ILP in hardware

More information

CS252 Graduate Computer Architecture Midterm 1 Solutions

CS252 Graduate Computer Architecture Midterm 1 Solutions CS252 Graduate Computer Architecture Midterm 1 Solutions Part A: Branch Prediction (22 Points) Consider a fetch pipeline based on the UltraSparc-III processor (as seen in Lecture 5). In this part, we evaluate

More information

Lecture 3: The Processor (Chapter 4 of textbook) Chapter 4.1

Lecture 3: The Processor (Chapter 4 of textbook) Chapter 4.1 Lecture 3: The Processor (Chapter 4 of textbook) Chapter 4.1 Introduction Chapter 4.1 Chapter 4.2 Review: MIPS (RISC) Design Principles Simplicity favors regularity fixed size instructions small number

More information

Computer System Architecture Midterm Examination Spring 2002

Computer System Architecture Midterm Examination Spring 2002 Computer System Architecture 6.823 Midterm Examination Spring 2002 Name: This is an open book, open notes exam. 110 Minutes 1 Pages Notes: Not all questions are of equal difficulty, so look over the entire

More information

/ : Computer Architecture and Design Fall 2014 Midterm Exam Solution

/ : Computer Architecture and Design Fall 2014 Midterm Exam Solution 16.482 / 16.561: Computer Architecture and Design Fall 2014 Midterm Exam Solution 1. (8 points) UEvaluating instructions Assume the following initial state prior to executing the instructions below. Note

More information

4.1.3 [10] < 4.3>Which resources (blocks) produce no output for this instruction? Which resources produce output that is not used?

4.1.3 [10] < 4.3>Which resources (blocks) produce no output for this instruction? Which resources produce output that is not used? 2.10 [20] < 2.2, 2.5> For each LEGv8 instruction in Exercise 2.9 (copied below), show the value of the opcode (Op), source register (Rn), and target register (Rd or Rt) fields. For the I-type instructions,

More information

EECS 470 Midterm Exam

EECS 470 Midterm Exam EECS 470 Midterm Exam Winter 2014 Name: unique name: Sign the honor code: I have neither given nor received aid on this exam nor observed anyone else doing so. Scores: NOTES: # Points Page 2 /12 Page 3

More information

CS 251, Winter 2019, Assignment % of course mark

CS 251, Winter 2019, Assignment % of course mark CS 251, Winter 2019, Assignment 5.1.1 3% of course mark Due Wednesday, March 27th, 5:30PM Lates accepted until 1:00pm March 28th with a 15% penalty 1. (10 points) The code sequence below executes on a

More information

ECE 341 Final Exam Solution

ECE 341 Final Exam Solution ECE 341 Final Exam Solution Time allowed: 110 minutes Total Points: 100 Points Scored: Name: Problem No. 1 (10 points) For each of the following statements, indicate whether the statement is TRUE or FALSE.

More information

Lecture 9. Pipeline Hazards. Christos Kozyrakis Stanford University

Lecture 9. Pipeline Hazards. Christos Kozyrakis Stanford University Lecture 9 Pipeline Hazards Christos Kozyrakis Stanford University http://eeclass.stanford.edu/ee18b 1 Announcements PA-1 is due today Electronic submission Lab2 is due on Tuesday 2/13 th Quiz1 grades will

More information

EEC 170 Computer Architecture Fall Cache Introduction Review. Review: The Memory Hierarchy. The Memory Hierarchy: Why Does it Work?

EEC 170 Computer Architecture Fall Cache Introduction Review. Review: The Memory Hierarchy. The Memory Hierarchy: Why Does it Work? EEC 17 Computer Architecture Fall 25 Introduction Review Review: The Hierarchy Take advantage of the principle of locality to present the user with as much memory as is available in the cheapest technology

More information

CMSC411 Fall 2013 Midterm 1

CMSC411 Fall 2013 Midterm 1 CMSC411 Fall 2013 Midterm 1 Name: Instructions You have 75 minutes to take this exam. There are 100 points in this exam, so spend about 45 seconds per point. You do not need to provide a number if you

More information

3/12/2014. Single Cycle (Review) CSE 2021: Computer Organization. Single Cycle with Jump. Multi-Cycle Implementation. Why Multi-Cycle?

3/12/2014. Single Cycle (Review) CSE 2021: Computer Organization. Single Cycle with Jump. Multi-Cycle Implementation. Why Multi-Cycle? CSE 2021: Computer Organization Single Cycle (Review) Lecture-10b CPU Design : Pipelining-1 Overview, Datapath and control Shakil M. Khan 2 Single Cycle with Jump Multi-Cycle Implementation Instruction:

More information

Computer Architecture V Fall Practice Exam Questions

Computer Architecture V Fall Practice Exam Questions Computer Architecture V22.0436 Fall 2002 Practice Exam Questions These are practice exam questions for the material covered since the mid-term exam. Please note that the final exam is cumulative. See the

More information

ECE 313 Computer Organization FINAL EXAM December 14, This exam is open book and open notes. You have 2 hours.

ECE 313 Computer Organization FINAL EXAM December 14, This exam is open book and open notes. You have 2 hours. This exam is open book and open notes. You have 2 hours. Problems 1-4 refer to a proposed MIPS instruction lwu (load word - update) which implements update addressing an addressing mode that is used in

More information

Complex Pipelines and Branch Prediction

Complex Pipelines and Branch Prediction Complex Pipelines and Branch Prediction Daniel Sanchez Computer Science & Artificial Intelligence Lab M.I.T. L22-1 Processor Performance Time Program Instructions Program Cycles Instruction CPI Time Cycle

More information

CS161 Design and Architecture of Computer Systems. Cache $$$$$

CS161 Design and Architecture of Computer Systems. Cache $$$$$ CS161 Design and Architecture of Computer Systems Cache $$$$$ Memory Systems! How can we supply the CPU with enough data to keep it busy?! We will focus on memory issues,! which are frequently bottlenecks

More information

Pipeline design. Mehran Rezaei

Pipeline design. Mehran Rezaei Pipeline design Mehran Rezaei How Can We Improve the Performance? Exec Time = IC * CPI * CCT Optimization IC CPI CCT Source Level * Compiler * * ISA * * Organization * * Technology * With Pipelining We

More information

CS 351 Exam 2 Mon. 11/2/2015

CS 351 Exam 2 Mon. 11/2/2015 CS 351 Exam 2 Mon. 11/2/2015 Name: Rules and Hints The MIPS cheat sheet and datapath diagram are attached at the end of this exam for your reference. You may use one handwritten 8.5 11 cheat sheet (front

More information

c. What are the machine cycle times (in nanoseconds) of the non-pipelined and the pipelined implementations?

c. What are the machine cycle times (in nanoseconds) of the non-pipelined and the pipelined implementations? Brown University School of Engineering ENGN 164 Design of Computing Systems Professor Sherief Reda Homework 07. 140 points. Due Date: Monday May 12th in B&H 349 1. [30 points] Consider the non-pipelined

More information

Pipelining and Exploiting Instruction-Level Parallelism (ILP)

Pipelining and Exploiting Instruction-Level Parallelism (ILP) Pipelining and Exploiting Instruction-Level Parallelism (ILP) Pipelining and Instruction-Level Parallelism (ILP). Definition of basic instruction block Increasing Instruction-Level Parallelism (ILP) &

More information

Modern Computer Architecture

Modern Computer Architecture Modern Computer Architecture Lecture3 Review of Memory Hierarchy Hongbin Sun 国家集成电路人才培养基地 Xi an Jiaotong University Performance 1000 Recap: Who Cares About the Memory Hierarchy? Processor-DRAM Memory Gap

More information

ECE260: Fundamentals of Computer Engineering

ECE260: Fundamentals of Computer Engineering ECE260: Fundamentals of Computer Engineering Pipelined Datapath and Control James Moscola Dept. of Engineering & Computer Science York College of Pennsylvania ECE260: Fundamentals of Computer Engineering

More information

SOLUTION. Midterm #1 February 26th, 2018 Professor Krste Asanovic Name:

SOLUTION. Midterm #1 February 26th, 2018 Professor Krste Asanovic Name: SOLUTION Notes: CS 152 Computer Architecture and Engineering CS 252 Graduate Computer Architecture Midterm #1 February 26th, 2018 Professor Krste Asanovic Name: I am taking CS152 / CS252 This is a closed

More information

Processor design - MIPS

Processor design - MIPS EASY Processor design - MIPS Q.1 What happens when a register is loaded? 1. The bits of the register are set to all ones. 2. The bit pattern in the register is copied to a location in memory. 3. A bit

More information

Review: Performance Latency vs. Throughput. Time (seconds/program) is performance measure Instructions Clock cycles Seconds.

Review: Performance Latency vs. Throughput. Time (seconds/program) is performance measure Instructions Clock cycles Seconds. Performance 980 98 982 983 984 985 986 987 988 989 990 99 992 993 994 995 996 997 998 999 2000 7/4/20 CS 6C: Great Ideas in Computer Architecture (Machine Structures) Caches Instructor: Michael Greenbaum

More information

CS 251, Winter 2018, Assignment % of course mark

CS 251, Winter 2018, Assignment % of course mark CS 251, Winter 2018, Assignment 5.0.4 3% of course mark Due Wednesday, March 21st, 4:30PM Lates accepted until 10:00am March 22nd with a 15% penalty 1. (10 points) The code sequence below executes on a

More information

Basic Pipelining Concepts

Basic Pipelining Concepts Basic ipelining oncepts Appendix A (recommended reading, not everything will be covered today) Basic pipelining ipeline hazards Data hazards ontrol hazards Structural hazards Multicycle operations Execution

More information

ECE232: Hardware Organization and Design

ECE232: Hardware Organization and Design ECE232: Hardware Organization and Design Lecture 17: Pipelining Wrapup Adapted from Computer Organization and Design, Patterson & Hennessy, UCB Outline The textbook includes lots of information Focus on

More information

Appendix C. Authors: John Hennessy & David Patterson. Copyright 2011, Elsevier Inc. All rights Reserved. 1

Appendix C. Authors: John Hennessy & David Patterson. Copyright 2011, Elsevier Inc. All rights Reserved. 1 Appendix C Authors: John Hennessy & David Patterson Copyright 2011, Elsevier Inc. All rights Reserved. 1 Figure C.2 The pipeline can be thought of as a series of data paths shifted in time. This shows

More information

LECTURE 10. Pipelining: Advanced ILP

LECTURE 10. Pipelining: Advanced ILP LECTURE 10 Pipelining: Advanced ILP EXCEPTIONS An exception, or interrupt, is an event other than regular transfers of control (branches, jumps, calls, returns) that changes the normal flow of instruction

More information

Appendix C: Pipelining: Basic and Intermediate Concepts

Appendix C: Pipelining: Basic and Intermediate Concepts Appendix C: Pipelining: Basic and Intermediate Concepts Key ideas and simple pipeline (Section C.1) Hazards (Sections C.2 and C.3) Structural hazards Data hazards Control hazards Exceptions (Section C.4)

More information