Course on Advanced Computer Architectures
|
|
- Alexis Caldwell
- 5 years ago
- Views:
Transcription
1 Course on Advanced Computer Architectures Politecnico di Milano, June 30th, 2014 Surname (Cognome) Name (Nome) POLIMI ID Number POLIMI Address Signature (Firma) SOLUTION Prof. C. Silvano EX1A ( 2.5points) EX1B ( 2.5points) EX1C ( 2.5points) EX2A ( 5 points) EX2B ( 2.5 points) Q3 ( 5 points) Q4 ( 5 points) Q5 ( 5 points) Q6 ( 3 points) TOTAL (33 points) (33 points)
2 Course on: Advanced Computer Architectures Prof. C Silvano EXAM 30/06/2014 Please write in CAPITAL LETTERS AND BLACK/BLUE COLORS!!! (MAIUSCOLO e COLORE NERO/BLU!!!) EXERCISE 1A PIPELINE BASIC (2.5 points) Given the following loop expressed in a high level language: for (i =0; i < N; i ++) vectb[i] = vectb[i] + k1 + k2; The program has been compiled in MIPS assembly code assuming that registers $t6 and $t7 have been initialized with values 0 and 4N respectively. The symbol VECTB is a 16-bit constant. The processor clock frequency is 1.2 GHz. Let us consider the loop executed by 5-stage pipelined MIPS processor WITHOUT any optimization in the pipeline (PLEASE don t consider any inter-iteration dependencies) 1. Identify the RAW (Read After Write) Hazards in the pipeline scheme and identify the Hazard Type (Data Hazard or Control Hazard) in the last column 2. Identify in the first column the number of stalls to be inserted before each instruction (or between the stage IF and ID of each instruction) necessary to solve the hazards Num. Stalls INSTRUCTION C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 C11 Hazard Type 3 FOR:beq $t6,$t7,end IF ID EX ME WB CNTR 3 lw $t2,vectb($t6) IF ID EX ME WB CNTR 3 addi $t2,$t2,k1 IF ID EX ME WB RAW $t2 3 addi $t2,$t2,k2 IF ID EX ME WB RAW $t2 3 sw $t2,vectb($t6) IF ID EX ME WB RAW $t2 addi $t6,$t6,4 IF ID EX ME WB 3 blt $t6,$t7, FOR IF ID EX ME WB RAW $t6 blt $t6, $t7, FOR # branch on less than Express the formula then calculate the following metrics: Instruction Count per iteration (IC): IC = 7 Number of stalls per iteration: N stall = 18 Asymptotic CPI (N cycles) : CPI AS = (IC + # stalls) / IC = (7+18) /7 = 3.57 Asymptotic Throughput (expressed in MIPS) (N cycles): MIPS AS = f CLOCK / CPI AS * 10 6 = ( ) / (CPI AS * 10 6 ) = /3.57=336 Page 1 - SOLUTION
3 Course on: Advanced Computer Architectures Prof. C Silvano EXAM 30/06/2014 Please write in CAPITAL LETTERS AND BLACK/BLUE COLORS!!! (MAIUSCOLO e COLORE NERO/BLU!!!) EXERCISE 1B PIPELINE OPTIMIZATIONS (2.5 points) Assuming there are the following optimisations in the pipeline (PLEASE don t consider any inter-iteration dependencies) - In the Register File it is possible the read and write at the same address in the same clock cycle; - Forwarding - Computation of PC and TARGET ADDRESS for branch & jump instructions anticipated in the ID stage 1. Identify the RAW (Read After Write) Hazards in the pipeline scheme and identify the Hazard Type (Data Hazard or Control Hazard) in the penultimate (one before last) column 2. Identify in the first column the number of stalls to be inserted before each instruction (or between the stage IF and ID of each instruction) necessary to solve the hazards 3. Draw the pipeline scheme by inserting the stalls to solve the given hazards and adding an ARROW to indicate the Forwarding paths used: Num. Stalls INSTRUCTION C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 C11 C12 C13 C14 C15 Hazard Type Fowarding Path 1 FOR:beq $t6,$t7,end S IF ID EX ME WB CNTR 1 lw $t2,vectb($t6) S IF ID EX ME WB CNTR 1 addi $t2,$t2,k1 IF ID S EX ME WB RAW $t2 MEM-EX $t2 addi $t2,$t2,k2 IF S ID EX ME WB (RAW $t2) EX-EX $t2 sw $t2,vectb($t6) S IF ID EX ME WB (RAW $t2) EX-EX $t2 addi $t6,$t6,4 IF ID EX ME WB 1 blt $t6,$t7, FOR IF S ID EX ME WB RAW $t6 EX-ID $t6 Express the formula then calculate the following metrics: Number of stalls per iteration: N stall = 4 Asymptotic CPI (N cycles) : CPI AS = (IC + # stalls) / IC = (7+4) /7 = 1.57 Asymptotic Throughput (expressed in MIPS) (N cycles): MIPS AS = f CLOCK / CPI AS * 10 6 = ( ) / (CPI AS * 10 6 ) = /1.57= 764 Calculate the speedup with respect to the EXERCISE 1A: Speedup = CPI AS1A /CPI AS1B = / = 3.57 / 1.57 = 2.27 Page 2 - SOLUTION
4 Course on: Advanced Computer Architectures Prof. C Silvano EXAM 30/06/2014 Please write in CAPITAL LETTERS AND BLACK/BLUE COLORS!!! (MAIUSCOLO e COLORE NERO/BLU!!!) EXERCISE 1C PIPELINE OPTIMIZATIONS with DATA CACHE MISSES(2.5 points) Assuming there are the previous optimisations in the pipeline (EX 1B) and EACH ACCESS TO THE DATA MEMORY GENERATE A DATA CACHE MISS requiring 2 stalls to access the data memory, Draw the pipeline scheme by inserting the stalls due to the given hazards and to solve the data cache misses and adding an ARROW to indicate the Forwarding paths used: Num. INSTRUCTION C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 C11 C14 C15 C16 C17 C18 C19 C20 C21 C22 Hazard Fowarding Stalls Type Path 1 FOR:beq $t6,$t7,end S IF ID EX ME WB CNTR 3 lw $t2,vectb($t6) S IF ID EX ME S S WB CNTR + miss 1 addi $t2,$t2,k1 IF ID S S S EX ME WB RAW $t2 MEM-EX $t2 addi $t2,$t2,k2 IF S S S ID EX ME WB (RAW $t2) EX-EX $t2 2 sw $t2,vectb($t6) S S S IF ID EX ME S S WB (RAW $t2) +miss EX-EX $t2 addi $t6,$t6,4 IF ID EX S S ME WB 1 blt $t6,$t7, FOR IF S S S ID EX ME WB RAW $t6 EX-ID $t6 Express the formula then calculate the following metrics: Number of stalls per iteration: N stall = 8 (4 out of them are due to data cache misses) Asymptotic CPI (N cycles) : CPI AS = (IC + # stalls) / IC = (7+8) /7 = 2.14 Asymptotic Throughput (expressed in MIPS) (N cycles): MIPS AS = f CLOCK / CPI AS * 10 6 = ( ) / (CPI AS * 10 6 ) = /2.14 = 561 Calculate the speedup with respect to the EXERCISE 1A: Speedup = CPI AS1A /CPI AS1C = 3.57 / 2.14 = 1.67 Page 3 - SOLUTION
5 Course on: Advanced Computer Architectures Prof. C Silvano EXAM 30/06/2014 Please write in CAPITAL LETTERS AND BLACK/BLUE COLORS!!! (MAIUSCOLO e COLORE NERO/BLU!!!) EXERCISE 2A - TOMASULO (5 points) Please consider the program in the table be executed on a CPU with dynamic scheduling based on TOMASULO algorithm with: 2 RESERVATION STATIONS (RS1, RS2) + 1 LOAD/STORE unit (LDU1) with latency 6 2 RESERVATION STATIONS (RS3, RS4) + 2 ALU/BR FUs (ALU1, ALU2) with latency 2 Check STRUCTURAL hazards for RS in ISSUE phase Check RAW hazards and Check STRUCTURAL hazards for FUs in START EXECUTE phase WRITE RESULT in RESERVATION STATIONS and RF We assume 1 CDB for RF Static Branch Prediction BTFNT (BACKWARD TAKEN FORWARD NOT TAKEN) with Branch Target Buffer 1. Please complete the TOMASULO TABLE by assuming all cache HITS and considering ONE ITERATION: ISTRUZIONE Prediction ISSUE START WRITE Hazards Type RSi UNIT T /NT EXEC RESULT FOR:beq $t6,$t7,end NT RS3 ALU1 lw $t2,vectb($t6) RS1 LDU1 addi $t2,$t2,k RAW $t2 RS4 ALU2 addi $t2,$t2,k STRUCT RS3 +RAW $t2 RS3 ALU1 sw $t2,vectb($t6) (STRUCT LDU1)+RAW $t2 RS2 LDU1 addi $t6,$t6, STRUCT RS4 RS4 ALU2 blt $t6,$t7, FOR T STRUCT RS3 +(RAW $t6) RS3 ALU1 2. Express the formula then calculate the following metrics: CPI = (#clock cyles / IC) = 22/7 =3.14 IPC = 1/CPI = 1/ 3.14 = 0.32 Page 4 - SOLUTION
6 Course on: Advanced Computer Architectures Prof. C Silvano EXAM 30/06/2014 Please write in CAPITAL LETTERS AND BLACK/BLUE COLORS!!! (MAIUSCOLO e COLORE NERO/BLU!!!) EXERCISE 2B - TOMASULO (2.5 points) Please consider the program in the table be executed on a CPU with dynamic scheduling based on TOMASULO algorithm with: 2 RESERVATION STATIONS (RS1, RS2) + 1 LOAD/STORE unit (LDU1) with latency 6 3 RESERVATION STATIONS (RS3, RS4, RS5) + 2 ALU/BR FU (ALU1, ALU2) with latency 2 Check STRUCTURAL hazards for RS in ISSUE phase Check RAW hazards and Check STRUCTURAL hazards for FUs in START EXECUTE phase WRITE RESULT in RESERVATION STATIONS and RF We assume 1 CDB for RF Static Branch Prediction BTFNT (BACKWARD TAKEN FORWARD NOT TAKEN) with Branch Target Buffer 1. Please complete the TOMASULO TABLE by assuming all cache HITS and considering ONE ITERATION: ISTRUZIONE Prediction ISSUE START WRITE Hazards Type RSi UNIT T /NT EXEC RESULT FOR:beq $t6,$t7,end NT RS3 ALU1 lw $t2,vectb($t6) RS1 LDU1 addi $t2,$t2,k RAW $t2 RS4 ALU2 addi $t2,$t2,k RAW $t2 RS5 ALU1 sw $t2,vectb($t6) (STRUCT LDU1) +RAW $t2 RS2 LDU1 addi $t6,$t6, STRUCT ALU2 +STRUCT RF WR RS3 ALU2 blt $t6,$t7, FOR T STRUCT RS4+(STRUCT ALU1)+RAW $t6 RS4 ALU1 2. Express the formula then calculate the following metrics: CPI = (#clock cyles / IC) = 22/7 = 3.14 The introduction of more reservation station has been advantageous? NO Page 5 - SOLUTION
7 Course on Advanced Computer Architectures Prof. C Silvano EXAM 30/06/2014 Please write in CAPITAL LETTERS AND BLACK/BLUE COLORS!!! (MAIUSCOLO e COLORE NERO/BLU!!!) QUESTION (3): INSTRUCTION LEVEL PARALLELISM (5 points) Describe the role of Reservation Stations in Tomasulo processor architectures Describe the role of Reorder Buffer in Tomasulo processor architectures Page 6 - SOLUTION
8 Course on Advanced Computer Architectures Prof. C Silvano EXAM 30/06/2014 Please write in CAPITAL LETTERS AND BLACK/BLUE COLORS!!! (MAIUSCOLO e COLORE NERO/BLU!!!) QUESTION (4): MULTITHREADING (5 points) Explain the main concepts of multithreading Explain the main concepts of fine-grained multithreading Page 7 - SOLUTION
9 Course on Advanced Computer Architectures Prof. C Silvano EXAM 30/06/2014 Please write in CAPITAL LETTERS AND BLACK/BLUE COLORS!!! (MAIUSCOLO e COLORE NERO/BLU!!!) QUESTION (5): MULTIPROCESSORS (5 points) Describe the main concepts and benefits of distributed shared memory architectures Page 8 - SOLUTION
10 Course on Advanced Computer Architectures Prof. C Silvano EXAM 30/06/2014 Please write in CAPITAL LETTERS AND BLACK/BLUE COLORS!!! (MAIUSCOLO e COLORE NERO/BLU!!!) QUESTION (6): PERFORMANCE OF MEMORY HIERARCHY (3 points) Write the formula for the AVERAGE MEMORY ACCESS TIME (AMAT) when there is L1 cache only: Write the formula for the AVERAGE MEMORY ACCESS TIME (AMAT) when there are L1 and L2 caches: Write the formula for the GLOBAL MISS RATE for Last Level Cache (L2): Page 9 - SOLUTION
Corso Architetture Avanzate dei Calcolatori
Corso Architetture Avanzate dei Calcolatori Politecnico di Milano, Feb. 5th, 2014 Cognome (Surname) Nome (Name) POLIMICodice Persona (POLIMIPersonal Code) email Firma (Signature) SOLUTION Prof. C. Silvano
More informationADVANCED COMPUTER ARCHITECTURES: Prof. C. SILVANO Written exam 11 July 2011
ADVANCED COMPUTER ARCHITECTURES: 088949 Prof. C. SILVANO Written exam 11 July 2011 SURNAME NAME ID EMAIL SIGNATURE EX1 (3) EX2 (3) EX3 (3) EX4 (5) EX5 (5) EX6 (4) EX7 (5) EX8 (3+2) TOTAL (33) EXERCISE
More informationCOGNOME NOME MATRICOLA
CORSO Architectures for Multimedia Systems - Code: 078650 (073335) - Prof. C. SILVANO Prova del 3 settembre 2010 COGNOME NOME MATRICOLA EMAIL EXERCISE 1 - PIPELINE Given the following loop expressed in
More informationArchitectures for Multimedia Systems - Code: (073335) Prof. C. SILVANO 5th July 2010
Architectures for Multimedia Systems - Code: 078650 (073335) Prof. C. SILVANO 5th July 2010 SURNAME NAME STUDENT ID (MATRICOLA) EMAIL EXERCISE 1 - PIPELINE Given the following loop expressed in a high
More informationCourse on Advanced Computer Architectures
Surname (Cognome) Name (Nome) POLIMI ID Number Signature (Firma) SOLUION Politecnico di Milano, June 22nd, 2018 Course on Advanced Computer Architectures Prof. D. Sciuto, Prof. C. Silvano EX1 EX2 EX3 Q1
More informationCourse on Advanced Computer Architectures
Surname (Cognome) Name (Nome) POLIMI ID Number Signature (Firma) SOLUTION Politecnico di Milano, July 9, 2018 Course on Advanced Computer Architectures Prof. D. Sciuto, Prof. C. Silvano EX1 EX2 EX3 Q1
More informationAdvanced Parallel Architecture Lessons 5 and 6. Annalisa Massini /2017
Advanced Parallel Architecture Lessons 5 and 6 Annalisa Massini - Pipelining Hennessy, Patterson Computer architecture A quantitive approach Appendix C Sections C.1, C.2 Pipelining Pipelining is an implementation
More informationFinal Exam Fall 2007
ICS 233 - Computer Architecture & Assembly Language Final Exam Fall 2007 Wednesday, January 23, 2007 7:30 am 10:00 am Computer Engineering Department College of Computer Sciences & Engineering King Fahd
More informationCENG 3531 Computer Architecture Spring a. T / F A processor can have different CPIs for different programs.
Exam 2 April 12, 2012 You have 80 minutes to complete the exam. Please write your answers clearly and legibly on this exam paper. GRADE: Name. Class ID. 1. (22 pts) Circle the selected answer for T/F and
More informationInstruction Frequency CPI. Load-store 55% 5. Arithmetic 30% 4. Branch 15% 4
PROBLEM 1: An application running on a 1GHz pipelined processor has the following instruction mix: Instruction Frequency CPI Load-store 55% 5 Arithmetic 30% 4 Branch 15% 4 a) Determine the overall CPI
More information3/12/2014. Single Cycle (Review) CSE 2021: Computer Organization. Single Cycle with Jump. Multi-Cycle Implementation. Why Multi-Cycle?
CSE 2021: Computer Organization Single Cycle (Review) Lecture-10b CPU Design : Pipelining-1 Overview, Datapath and control Shakil M. Khan 2 Single Cycle with Jump Multi-Cycle Implementation Instruction:
More informationFinal Exam Fall 2008
COE 308 Computer Architecture Final Exam Fall 2008 page 1 of 8 Saturday, February 7, 2009 7:30 10:00 AM Computer Engineering Department College of Computer Sciences & Engineering King Fahd University of
More information4. What is the average CPI of a 1.4 GHz machine that executes 12.5 million instructions in 12 seconds?
Chapter 4: Assessing and Understanding Performance 1. Define response (execution) time. 2. Define throughput. 3. Describe why using the clock rate of a processor is a bad way to measure performance. Provide
More informationComputer Architecture CS372 Exam 3
Name: Computer Architecture CS372 Exam 3 This exam has 7 pages. Please make sure you have all of them. Write your name on this page and initials on every other page now. You may only use the green card
More informationMIPS Pipelining. Computer Organization Architectures for Embedded Computing. Wednesday 8 October 14
MIPS Pipelining Computer Organization Architectures for Embedded Computing Wednesday 8 October 14 Many slides adapted from: Computer Organization and Design, Patterson & Hennessy 4th Edition, 2011, MK
More informationPipelining. CSC Friday, November 6, 2015
Pipelining CSC 211.01 Friday, November 6, 2015 Performance Issues Longest delay determines clock period Critical path: load instruction Instruction memory register file ALU data memory register file Not
More informationAdvanced Computer Architecture CMSC 611 Homework 3. Due in class Oct 17 th, 2012
Advanced Computer Architecture CMSC 611 Homework 3 Due in class Oct 17 th, 2012 (Show your work to receive partial credit) 1) For the following code snippet list the data dependencies and rewrite the code
More informationFinal Exam Spring 2017
COE 3 / ICS 233 Computer Organization Final Exam Spring 27 Friday, May 9, 27 7:3 AM Computer Engineering Department College of Computer Sciences & Engineering King Fahd University of Petroleum & Minerals
More informationCS2100 Computer Organisation Tutorial #10: Pipelining Answers to Selected Questions
CS2100 Computer Organisation Tutorial #10: Pipelining Answers to Selected Questions Tutorial Questions 2. [AY2014/5 Semester 2 Exam] Refer to the following MIPS program: # register $s0 contains a 32-bit
More informationCS/CoE 1541 Exam 1 (Spring 2019).
CS/CoE 1541 Exam 1 (Spring 2019). Name: Question 1 (8+2+2+3=15 points): In this problem, consider the execution of the following code segment on a 5-stage pipeline with forwarding/stalling hardware and
More informationAdvanced Instruction-Level Parallelism
Advanced Instruction-Level Parallelism Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu EEE3050: Theory on Computer Architectures, Spring 2017, Jinkyu
More informationECE260: Fundamentals of Computer Engineering
Pipelining James Moscola Dept. of Engineering & Computer Science York College of Pennsylvania Based on Computer Organization and Design, 5th Edition by Patterson & Hennessy What is Pipelining? Pipelining
More informationMidnight Laundry. IC220 Set #19: Laundry, Co-dependency, and other Hazards of Modern (Architecture) Life. Return to Chapter 4
IC220 Set #9: Laundry, Co-dependency, and other Hazards of Modern (Architecture) Life Return to Chapter 4 Midnight Laundry Task order A B C D 6 PM 7 8 9 0 2 2 AM 2 Smarty Laundry Task order A B C D 6 PM
More informationLECTURE 10. Pipelining: Advanced ILP
LECTURE 10 Pipelining: Advanced ILP EXCEPTIONS An exception, or interrupt, is an event other than regular transfers of control (branches, jumps, calls, returns) that changes the normal flow of instruction
More informationProcessor (IV) - advanced ILP. Hwansoo Han
Processor (IV) - advanced ILP Hwansoo Han Instruction-Level Parallelism (ILP) Pipelining: executing multiple instructions in parallel To increase ILP Deeper pipeline Less work per stage shorter clock cycle
More informationChapter 4. The Processor
Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified
More informationCS146 Computer Architecture. Fall Midterm Exam
CS146 Computer Architecture Fall 2002 Midterm Exam This exam is worth a total of 100 points. Note the point breakdown below and budget your time wisely. To maximize partial credit, show your work and state
More informationThe Processor: Instruction-Level Parallelism
The Processor: Instruction-Level Parallelism Computer Organization Architectures for Embedded Computing Tuesday 21 October 14 Many slides adapted from: Computer Organization and Design, Patterson & Hennessy
More informationAlexandria University
Alexandria University Faculty of Engineering Computer and Communications Department CC322: CC423: Advanced Computer Architecture Sheet 3: Instruction- Level Parallelism and Its Exploitation 1. What would
More informationLECTURE 3: THE PROCESSOR
LECTURE 3: THE PROCESSOR Abridged version of Patterson & Hennessy (2013):Ch.4 Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU
More informationCS433 Homework 2 (Chapter 3)
CS433 Homework 2 (Chapter 3) Assigned on 9/19/2017 Due in class on 10/5/2017 Instructions: 1. Please write your name and NetID clearly on the first page. 2. Refer to the course fact sheet for policies
More informationCS 2410 Mid term (fall 2015) Indicate which of the following statements is true and which is false.
CS 2410 Mid term (fall 2015) Name: Question 1 (10 points) Indicate which of the following statements is true and which is false. (1) SMT architectures reduces the thread context switch time by saving in
More informationChapter 4. Instruction Execution. Introduction. CPU Overview. Multiplexers. Chapter 4 The Processor 1. The Processor.
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor The Processor - Introduction
More informationEECC551 Exam Review 4 questions out of 6 questions
EECC551 Exam Review 4 questions out of 6 questions (Must answer first 2 questions and 2 from remaining 4) Instruction Dependencies and graphs In-order Floating Point/Multicycle Pipelining (quiz 2) Improving
More informationECE 4750 Computer Architecture, Fall 2017 T05 Integrating Processors and Memories
ECE 4750 Computer Architecture, Fall 2017 T05 Integrating Processors and Memories School of Electrical and Computer Engineering Cornell University revision: 2017-10-17-12-06 1 Processor and L1 Cache Interface
More informationPipelining: Hazards Ver. Jan 14, 2014
POLITECNICO DI MILANO Parallelism in wonderland: are you ready to see how deep the rabbit hole goes? Pipelining: Hazards Ver. Jan 14, 2014 Marco D. Santambrogio: marco.santambrogio@polimi.it Simone Campanoni:
More informationCMSC411 Fall 2013 Midterm 2 Solutions
CMSC411 Fall 2013 Midterm 2 Solutions 1. (12 pts) Memory hierarchy a. (6 pts) Suppose we have a virtual memory of size 64 GB, or 2 36 bytes, where pages are 16 KB (2 14 bytes) each, and the machine has
More informationCS 61C: Great Ideas in Computer Architecture Pipelining and Hazards
CS 61C: Great Ideas in Computer Architecture Pipelining and Hazards Instructors: Vladimir Stojanovic and Nicholas Weaver http://inst.eecs.berkeley.edu/~cs61c/sp16 1 Pipelined Execution Representation Time
More informationCS433 Midterm. Prof Josep Torrellas. October 16, Time: 1 hour + 15 minutes
CS433 Midterm Prof Josep Torrellas October 16, 2014 Time: 1 hour + 15 minutes Name: Alias: Instructions: 1. This is a closed-book, closed-notes examination. 2. The Exam has 4 Questions. Please budget your
More informationCS433 Homework 2 (Chapter 3)
CS Homework 2 (Chapter ) Assigned on 9/19/2017 Due in class on 10/5/2017 Instructions: 1. Please write your name and NetID clearly on the first page. 2. Refer to the course fact sheet for policies on collaboration..
More informationCS/CoE 1541 Mid Term Exam (Fall 2018).
CS/CoE 1541 Mid Term Exam (Fall 2018). Name: Question 1: (6+3+3+4+4=20 points) For this question, refer to the following pipeline architecture. a) Consider the execution of the following code (5 instructions)
More informationCOMPUTER ORGANIZATION AND DESI
COMPUTER ORGANIZATION AND DESIGN 5 Edition th The Hardware/Software Interface Chapter 4 The Processor 4.1 Introduction Introduction CPU performance factors Instruction count Determined by ISA and compiler
More informationFull Datapath. Chapter 4 The Processor 2
Pipelining Full Datapath Chapter 4 The Processor 2 Datapath With Control Chapter 4 The Processor 3 Performance Issues Longest delay determines clock period Critical path: load instruction Instruction memory
More informationOPEN BOOK, OPEN NOTES. NO COMPUTERS, OR SOLVING PROBLEMS DIRECTLY USING CALCULATORS.
CS/ECE472 Midterm #2 Fall 2008 NAME: Student ID#: OPEN BOOK, OPEN NOTES. NO COMPUTERS, OR SOLVING PROBLEMS DIRECTLY USING CALCULATORS. Your signature is your promise that you have not cheated and will
More informationGood luck and have fun!
Midterm Exam October 13, 2014 Name: Problem 1 2 3 4 total Points Exam rules: Time: 90 minutes. Individual test: No team work! Open book, open notes. No electronic devices, except an unprogrammed calculator.
More informationCOMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle
More informationILP concepts (2.1) Basic compiler techniques (2.2) Reducing branch costs with prediction (2.3) Dynamic scheduling (2.4 and 2.5)
Instruction-Level Parallelism and its Exploitation: PART 1 ILP concepts (2.1) Basic compiler techniques (2.2) Reducing branch costs with prediction (2.3) Dynamic scheduling (2.4 and 2.5) Project and Case
More informationCOMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition The Processor - Introduction
More informationFor this problem, consider the following architecture specifications: Functional Unit Type Cycles in EX Number of Functional Units
CS333: Computer Architecture Spring 006 Homework 3 Total Points: 49 Points (undergrad), 57 Points (graduate) Due Date: Feb. 8, 006 by 1:30 pm (See course information handout for more details on late submissions)
More informationAdvanced d Instruction Level Parallelism. Computer Systems Laboratory Sungkyunkwan University
Advanced d Instruction ti Level Parallelism Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu ILP Instruction-Level Parallelism (ILP) Pipelining:
More informationChapter 4. The Processor
Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified
More informationStatic, multiple-issue (superscaler) pipelines
Static, multiple-issue (superscaler) pipelines Start more than one instruction in the same cycle Instruction Register file EX + MEM + WB PC Instruction Register file EX + MEM + WB 79 A static two-issue
More informationECE331: Hardware Organization and Design
ECE331: Hardware Organization and Design Lecture 27: Midterm2 review Adapted from Computer Organization and Design, Patterson & Hennessy, UCB Midterm 2 Review Midterm will cover Section 1.6: Processor
More informationData Hazards Compiler Scheduling Pipeline scheduling or instruction scheduling: Compiler generates code to eliminate hazard
Data Hazards Compiler Scheduling Pipeline scheduling or instruction scheduling: Compiler generates code to eliminate hazard Consider: a = b + c; d = e - f; Assume loads have a latency of one clock cycle:
More informationQuestion 1: (20 points) For this question, refer to the following pipeline architecture.
This is the Mid Term exam given in Fall 2018. Note that Question 2(a) was a homework problem this term (was not a homework problem in Fall 2018). Also, Questions 6, 7 and half of 5 are from Chapter 5,
More informationProcessor (II) - pipelining. Hwansoo Han
Processor (II) - pipelining Hwansoo Han Pipelining Analogy Pipelined laundry: overlapping execution Parallelism improves performance Four loads: Speedup = 8/3.5 =2.3 Non-stop: 2n/0.5n + 1.5 4 = number
More informationCISC 662 Graduate Computer Architecture Lecture 13 - CPI < 1
CISC 662 Graduate Computer Architecture Lecture 13 - CPI < 1 Michela Taufer http://www.cis.udel.edu/~taufer/teaching/cis662f07 Powerpoint Lecture Notes from John Hennessy and David Patterson s: Computer
More informationCS 251, Winter 2019, Assignment % of course mark
CS 251, Winter 2019, Assignment 5.1.1 3% of course mark Due Wednesday, March 27th, 5:30PM Lates accepted until 1:00pm March 28th with a 15% penalty 1. (10 points) The code sequence below executes on a
More informationReal Processors. Lecture for CPSC 5155 Edward Bosworth, Ph.D. Computer Science Department Columbus State University
Real Processors Lecture for CPSC 5155 Edward Bosworth, Ph.D. Computer Science Department Columbus State University Instruction-Level Parallelism (ILP) Pipelining: executing multiple instructions in parallel
More informationCOMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 4. The Processor
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle
More informationInstruction word R0 R1 R2 R3 R4 R5 R6 R8 R12 R31
4.16 Exercises 419 Exercise 4.11 In this exercise we examine in detail how an instruction is executed in a single-cycle datapath. Problems in this exercise refer to a clock cycle in which the processor
More informationFull Datapath. Chapter 4 The Processor 2
Pipelining Full Datapath Chapter 4 The Processor 2 Datapath With Control Chapter 4 The Processor 3 Performance Issues Longest delay determines clock period Critical path: load instruction Instruction memory
More informationCOSC4201 Instruction Level Parallelism Dynamic Scheduling
COSC4201 Instruction Level Parallelism Dynamic Scheduling Prof. Mokhtar Aboelaze Parts of these slides are taken from Notes by Prof. David Patterson (UCB) Outline Data dependence and hazards Exposing parallelism
More informationCISC 662 Graduate Computer Architecture. Lecture 10 - ILP 3
CISC 662 Graduate Computer Architecture Lecture 10 - ILP 3 Michela Taufer http://www.cis.udel.edu/~taufer/teaching/cis662f07 Powerpoint Lecture Notes from John Hennessy and David Patterson s: Computer
More informationComputer Organization and Structure
Computer Organization and Structure 1. Assuming the following repeating pattern (e.g., in a loop) of branch outcomes: Branch outcomes a. T, T, NT, T b. T, T, T, NT, NT Homework #4 Due: 2014/12/9 a. What
More informationCIS 662: Midterm. 16 cycles, 6 stalls
CIS 662: Midterm Name: Points: /100 First read all the questions carefully and note how many points each question carries and how difficult it is. You have 1 hour 15 minutes. Plan your time accordingly.
More information4.1.3 [10] < 4.3>Which resources (blocks) produce no output for this instruction? Which resources produce output that is not used?
2.10 [20] < 2.2, 2.5> For each LEGv8 instruction in Exercise 2.9 (copied below), show the value of the opcode (Op), source register (Rn), and target register (Rd or Rt) fields. For the I-type instructions,
More informationPipeline issues. Pipeline hazard: RaW. Pipeline hazard: RaW. Calcolatori Elettronici e Sistemi Operativi. Hazards. Data hazard.
Calcolatori Elettronici e Sistemi Operativi Pipeline issues Hazards Pipeline issues Data hazard Control hazard Structural hazard Pipeline hazard: RaW Pipeline hazard: RaW 5 6 7 8 9 5 6 7 8 9 : add R,R,R
More informationThomas Polzer Institut für Technische Informatik
Thomas Polzer tpolzer@ecs.tuwien.ac.at Institut für Technische Informatik Pipelined laundry: overlapping execution Parallelism improves performance Four loads: Speedup = 8/3.5 = 2.3 Non-stop: Speedup =
More informationECEC 355: Pipelining
ECEC 355: Pipelining November 8, 2007 What is Pipelining Pipelining is an implementation technique whereby multiple instructions are overlapped in execution. A pipeline is similar in concept to an assembly
More informationSolutions to exercises on Instruction Level Parallelism
Solutions to exercises on Instruction Level Parallelism J. Daniel García Sánchez (coordinator) David Expósito Singh Javier García Blas Computer Architecture ARCOS Group Computer Science and Engineering
More information3.16 Historical Perspective and References
Case Studies and Exercises by Jason D. Bakos and Robert P. Colwell 247 Power4 Power5 Power6 Power7 Introduced 2001 2004 2007 2010 Initial clock rate (GHz) 1.3 1.9 4.7 3.6 Transistor count (M) 174 276 790
More informationInstruction-Level Parallelism and Its Exploitation
Chapter 2 Instruction-Level Parallelism and Its Exploitation 1 Overview Instruction level parallelism Dynamic Scheduling Techniques es Scoreboarding Tomasulo s s Algorithm Reducing Branch Cost with Dynamic
More informationLecture-13 (ROB and Multi-threading) CS422-Spring
Lecture-13 (ROB and Multi-threading) CS422-Spring 2018 Biswa@CSE-IITK Cycle 62 (Scoreboard) vs 57 in Tomasulo Instruction status: Read Exec Write Exec Write Instruction j k Issue Oper Comp Result Issue
More informationDepartment of Computer and IT Engineering University of Kurdistan. Computer Architecture Pipelining. By: Dr. Alireza Abdollahpouri
Department of Computer and IT Engineering University of Kurdistan Computer Architecture Pipelining By: Dr. Alireza Abdollahpouri Pipelined MIPS processor Any instruction set can be implemented in many
More informationPipelining concepts The DLX architecture A simple DLX pipeline Pipeline Hazards and Solution to overcome
Thoai Nam Pipelining concepts The DLX architecture A simple DLX pipeline Pipeline Hazards and Solution to overcome Reference: Computer Architecture: A Quantitative Approach, John L Hennessy & David a Patterson,
More informationProf. Hakim Weatherspoon CS 3410, Spring 2015 Computer Science Cornell University. P & H Chapter 4.10, 1.7, 1.8, 5.10, 6
Prof. Hakim Weatherspoon CS 3410, Spring 2015 Computer Science Cornell University P & H Chapter 4.10, 1.7, 1.8, 5.10, 6 Why do I need four computing cores on my phone?! Why do I need eight computing
More informationReferences EE457. Out of Order (OoO) Execution. Instruction Scheduling (Re-ordering of instructions)
EE457 Out of Order (OoO) Execution Introduction to Dynamic Scheduling of Instructions (The Tomasulo Algorithm) By Gandhi Puvvada References EE557 Textbook Prof Dubois EE557 Classnotes Prof Annavaram s
More informationPipeline Hazards. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University
Pipeline Hazards Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Hazards What are hazards? Situations that prevent starting the next instruction
More informationCS 61C Fall 2016 Guerrilla Section 4: MIPS CPU (Datapath & Control)
CS 61C Fall 2016 Guerrilla Section 4: MIPS CPU (Datapath & Control) 1) If this exam were a CPU, you d be halfway through the pipeline (Sp15 Final) We found that the instruction fetch and memory stages
More informationThese actions may use different parts of the CPU. Pipelining is when the parts run simultaneously on different instructions.
MIPS Pipe Line 2 Introduction Pipelining To complete an instruction a computer needs to perform a number of actions. These actions may use different parts of the CPU. Pipelining is when the parts run simultaneously
More informationCSEE 3827: Fundamentals of Computer Systems
CSEE 3827: Fundamentals of Computer Systems Lecture 21 and 22 April 22 and 27, 2009 martha@cs.columbia.edu Amdahl s Law Be aware when optimizing... T = improved Taffected improvement factor + T unaffected
More informationFour Steps of Speculative Tomasulo cycle 0
HW support for More ILP Hardware Speculative Execution Speculation: allow an instruction to issue that is dependent on branch, without any consequences (including exceptions) if branch is predicted incorrectly
More informationCMSC 411 Practice Exam 1 w/answers. 1. CPU performance Suppose we have the following instruction mix and clock cycles per instruction.
CMSC 4 Practice Exam w/answers General instructions. Be complete, yet concise. You may leave arithmetic expressions in any form that a calculator could evaluate.. CPU performance Suppose we have the following
More informationIF1/IF2. Dout2[31:0] Data Memory. Addr[31:0] Din[31:0] Zero. Res ALU << 2. CPU Registers. extension. sign. W_add[4:0] Din[31:0] Dout[31:0] PC+4
12 1 CMPE110 Fall 2006 A. Di Blas 110 Fall 2006 CMPE pipeline concepts Advanced ffl ILP ffl Deep pipeline ffl Static multiple issue ffl Loop unrolling ffl VLIW ffl Dynamic multiple issue Textbook Edition:
More informationCS 251, Winter 2018, Assignment % of course mark
CS 251, Winter 2018, Assignment 5.0.4 3% of course mark Due Wednesday, March 21st, 4:30PM Lates accepted until 10:00am March 22nd with a 15% penalty 1. (10 points) The code sequence below executes on a
More informationPipelining and Caching. CS230 Tutorial 09
Pipelining and Caching CS230 Tutorial 09 Pipelining Hazards Data hazard: What happens when one instruction needs something that isn t ready? Example: add $3, $1, $2 add $5, $3, $4 This is solved by forwarding
More informationChapter 3 Instruction-Level Parallelism and its Exploitation (Part 1)
Chapter 3 Instruction-Level Parallelism and its Exploitation (Part 1) ILP vs. Parallel Computers Dynamic Scheduling (Section 3.4, 3.5) Dynamic Branch Prediction (Section 3.3) Hardware Speculation and Precise
More informationEE557--FALL 1999 MAKE-UP MIDTERM 1. Closed books, closed notes
NAME: STUDENT NUMBER: EE557--FALL 1999 MAKE-UP MIDTERM 1 Closed books, closed notes Q1: /1 Q2: /1 Q3: /1 Q4: /1 Q5: /15 Q6: /1 TOTAL: /65 Grade: /25 1 QUESTION 1(Performance evaluation) 1 points We are
More informationChapter 7. Digital Design and Computer Architecture, 2 nd Edition. David Money Harris and Sarah L. Harris. Chapter 7 <1>
Chapter 7 Digital Design and Computer Architecture, 2 nd Edition David Money Harris and Sarah L. Harris Chapter 7 Chapter 7 :: Topics Introduction (done) Performance Analysis (done) Single-Cycle Processor
More informationExploiting ILP with SW Approaches. Aleksandar Milenković, Electrical and Computer Engineering University of Alabama in Huntsville
Lecture : Exploiting ILP with SW Approaches Aleksandar Milenković, milenka@ece.uah.edu Electrical and Computer Engineering University of Alabama in Huntsville Outline Basic Pipeline Scheduling and Loop
More informationMultiple Instruction Issue. Superscalars
Multiple Instruction Issue Multiple instructions issued each cycle better performance increase instruction throughput decrease in CPI (below 1) greater hardware complexity, potentially longer wire lengths
More informationWritten Exam / Tentamen
Written Exam / Tentamen Computer Organization and Components / Datorteknik och komponenter (IS1500), 9 hp Computer Hardware Engineering / Datorteknik, grundkurs (IS1200), 7.5 hp KTH Royal Institute of
More informationCache Optimization. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University
Cache Optimization Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Cache Misses On cache hit CPU proceeds normally On cache miss Stall the CPU pipeline
More informationCOMPUTER ORGANIZATION AND DESIGN
COMPUTER ORGANIZATION AND DESIGN 5 Edition th The Hardware/Software Interface Chapter 4 The Processor 4.1 Introduction Introduction CPU performance factors Instruction count CPI and Cycle time Determined
More informationCS 61C Summer 2016 Guerrilla Section 4: MIPS CPU (Datapath & Control)
CS 61C Summer 2016 Guerrilla Section 4: MIPS CPU (Datapath & Control) 1) If this exam were a CPU, you d be halfway through the pipeline (Sp15 Final) We found that the instruction fetch and memory stages
More informationDYNAMIC AND SPECULATIVE INSTRUCTION SCHEDULING
DYNAMIC AND SPECULATIVE INSTRUCTION SCHEDULING Slides by: Pedro Tomás Additional reading: Computer Architecture: A Quantitative Approach, 5th edition, Chapter 3, John L. Hennessy and David A. Patterson,
More informationPipelining concepts The DLX architecture A simple DLX pipeline Pipeline Hazards and Solution to overcome
Pipeline Thoai Nam Outline Pipelining concepts The DLX architecture A simple DLX pipeline Pipeline Hazards and Solution to overcome Reference: Computer Architecture: A Quantitative Approach, John L Hennessy
More informationCS433 Midterm. Prof Josep Torrellas. October 19, Time: 1 hour + 15 minutes
CS433 Midterm Prof Josep Torrellas October 19, 2017 Time: 1 hour + 15 minutes Name: Instructions: 1. This is a closed-book, closed-notes examination. 2. The Exam has 4 Questions. Please budget your time.
More informationThe Processor Pipeline. Chapter 4, Patterson and Hennessy, 4ed. Section 5.3, 5.4: J P Hayes.
The Processor Pipeline Chapter 4, Patterson and Hennessy, 4ed. Section 5.3, 5.4: J P Hayes. Pipeline A Basic MIPS Implementation Memory-reference instructions Load Word (lw) and Store Word (sw) ALU instructions
More information