1 /10 2 /16 3 /18 4 /15 5 /20 6 /9 7 /12
- Oswin Doyle
- 5 years ago
Transcription
MASSACHUSETTS INSTITUTE OF TECHNOLOGY
DEPARTMENT OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCE
Computation Structures
Fall 2018
Practice Quiz #3B

Name
Athena login name
Score
1 /10 2 /16 3 /18 4 /15 5 /20 6 /9 7 /12

Recitation section
o WF 11, (Silvina)
o WF 1, (Andy)
o WF 12, (Silvina)
o None (pick up quiz in 32-G846)

Please enter your name, Athena login name, and recitation section above. Enter your answers in the spaces provided below. You can use the extra white space and the backs of the pages for scratch work.

Problem 1. Potpourri (10 points)

For the following questions, circle the correct one of the two choices.

(A) (2 points) In a system with virtual memory:
a. User processes need to use system calls to communicate with peripherals that are using MMIO addresses outside of the process's address space.
b. User processes can directly load from and store to MMIO addresses because MMIO addresses are never virtual addresses. Virtual memory is deactivated for them.

(B) (2 points) In the processors that we studied in class and in part II (and part III) of Lab 7, virtual-to-physical address translation happens for:
a. Every memory operation emitted in user mode (instruction loads, data loads, and data stores get their address translated).
b. Only data memory operations emitted in user mode (data loads and data stores get their address translated).

(C) (2 points) After an interrupt in a process at pc X returns to the original running process, the control flow is restored to:
a. Address X
b. Address X + 4

(D) (2 points) An exception always returns control to the process that caused the exception.
a. True
b. False
e.g., the sleep syscall in Lab 7 returns to another process

Fall of 13 - Practice Quiz #3B
(E) (2 points) System calls (invoked using ecall in RISC-V):
a. Behave like jal, so the user process must save all caller-saved registers before invoking a system call.
b. Behave differently from jal, so the user process does not need to save all caller-saved registers before invoking a system call.
All registers are treated as callee-saved registers in system calls.
Problem 2. Virtual Memory (16 points)

A standard RISC-V CPU is connected to a memory management unit (MMU) that uses a page table to translate 32-bit virtual addresses to 28-bit physical addresses using a page size of 2^16 bytes.

(A) (6 points) Including the D (dirty) and R (resident) control bits, please give the number of entries in the page map and the number of bits required for each entry in the page map.

VA: VPN: 16 bits, page offset: 16 bits
PA: PPN: 12 bits, page offset: 16 bits

Number of entries in the page map: 2^16
Number of bits required for each page map entry: 14

(B) (10 points) The following program fragment is executed and a record is made of the inputs and outputs of the MMU. The record is shown in the table below.

    lw   x12, 0(x10)
    addi x10, x10, 4
    slli x12, x12, 5
    sw   x11, 0(x12)

Access type    Virtual address    Physical address
Inst. fetch    0x13FFF8           0x3FFFF8
Data read      0x x
Inst. fetch    0x13FFFC           0x3FFFFC
Inst. fetch    0x x
Inst. fetch    0x x
Data write     0x x

Using information from the program and the table above, please deduce the contents of as many entries as possible in the page table. Please make an entry in the table below for each page table entry we learn about, giving the VPN, the D and R control bits, and the PPN, showing the state of the page table after the execution of the program fragment. If you can't deduce the value of a field, please leave the field blank. Assume that pages holding instructions are read-only.

VPN D R PPN
0x x3F 0x98 1 0x89 0x x42 0x x
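The arithmetic behind part (A) can be checked with a short sketch. The function names below (page_map_shape, split_address) are illustrative helpers, not part of the quiz, and the sketch assumes a single-level page map with exactly two control bits (D and R) per entry, as the problem states.

```python
def page_map_shape(va_bits, pa_bits, page_offset_bits, control_bits=2):
    """Return (number of page map entries, bits per entry).

    Hypothetical helper: assumes a flat, single-level page map whose
    entries hold a PPN plus `control_bits` flags (D and R here).
    """
    vpn_bits = va_bits - page_offset_bits   # bits that index the page map
    ppn_bits = pa_bits - page_offset_bits   # bits stored as the physical page number
    return 2 ** vpn_bits, ppn_bits + control_bits


def split_address(addr, page_offset_bits=16):
    """Split an address into (page number, page offset)."""
    return addr >> page_offset_bits, addr & ((1 << page_offset_bits) - 1)


entries, entry_bits = page_map_shape(va_bits=32, pa_bits=28, page_offset_bits=16)
print(entries == 2 ** 16, entry_bits)            # True 14

# First instruction fetch from the MMU record: VA 0x13FFF8 -> PA 0x3FFFF8
vpn, offset = split_address(0x13FFF8)
ppn, _ = split_address(0x3FFFF8)
print(hex(vpn), hex(ppn), hex(offset))           # 0x13 0x3f 0xfff8
```

Applying split_address to each recorded virtual/physical pair is one way to read VPN-to-PPN mappings out of the MMU record for part (B).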
Problem 3. Pipelining Combinational Circuits (18 points)

For each of the questions below, please create a valid K-stage pipeline of the given circuit. Each component in the circuit is annotated with its propagation delay. Show your pipelining contours and place large black circles on the signal arrows to indicate the placement of pipeline registers. Give the latency and throughput of each design, assuming ideal registers (t_PD = 0, t_SETUP = 0). Remember that our convention is to place a pipeline register on each output.

(A) (3 points) Show the maximum-throughput 1-stage pipeline.
t_CLK = 11 ns
Latency (ns): 11
Throughput (ns^-1): 1/11

(B) (5 points) Show the maximum-throughput 2-stage pipeline using a minimal number of registers.
t_CLK = 7 ns; latency = 2 * 7 = 14
Latency (ns): 14
Throughput (ns^-1): 1/7

(C) (5 points) Show the maximum-throughput pipeline using a minimal number of registers.
t_CLK = 4 ns; latency = 3 * 4 = 12
Latency (ns): 12
Throughput (ns^-1): 1/
(D) (5 points) You manage to reimplement the slowest combinational component in the previous circuit (the one with a propagation delay of 4 ns) using two components with propagation delays of 2 ns, as shown at right. Show the maximum-throughput pipeline using a minimal number of registers.
t_CLK = 3 ns; latency = 4 * 3 =
Latency (ns): 12
Throughput (ns^-1): 1/3
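The latency and throughput figures in Problem 3 all follow one rule: with ideal registers, a K-stage pipeline clocked at t_CLK has latency K * t_CLK and throughput 1/t_CLK. A minimal sketch of that rule follows; pipeline_metrics is a hypothetical helper name, and the clock periods are taken from the solutions above rather than derived from the circuit, which is not reproduced in this transcript.

```python
from fractions import Fraction


def pipeline_metrics(stages, t_clk_ns):
    """Latency (ns) and throughput (results per ns) of an ideal K-stage pipeline.

    Assumes ideal registers (t_PD = 0, t_SETUP = 0), so t_clk_ns is just the
    slowest stage's combinational delay.
    """
    latency_ns = stages * t_clk_ns
    throughput_per_ns = Fraction(1, t_clk_ns)
    return latency_ns, throughput_per_ns


# Stage counts and clock periods as given in the solutions to parts (A), (B), (D)
for label, stages, t_clk in [("A", 1, 11), ("B", 2, 7), ("D", 4, 3)]:
    latency, throughput = pipeline_metrics(stages, t_clk)
    print(label, latency, throughput)
```

Note that adding stages never shortens latency: going from 1 stage to 2 raises it from 11 ns to 14 ns while throughput improves from 1/11 to 1/7.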
Problem 4. Pipelined Processors (15 points)

Consider the execution of the following code sequence on a 5-stage pipelined RISC-V processor, which is fully bypassed, predicts all branches not-taken, and kills instructions following taken branches. Assume that branch taken/not taken decisions are made in the EXE stage. Also, assume that the results of load operations are not available until the WB stage.

The loop sums the first 100 elements of the integer array at address 0x1000 and stores the result at address 0x2000. Assume execution halts at instruction unimp.

       lui  x11, 1          // x11 = 0x1000 (array)
       lui  x15, 2          // x15 = 0x2000 (result)
       addi x12, x0, 0      // x12 holds sum
       addi x14, x11, 400   // address of array element 101
    A: lw   x13, 0(x11)     // load next array element
       addi x11, x11, 4     // addr of next array element
       add  x12, x12, x13   // add element value to sum
       bne  x11, x14, A     // loop until element 101
       sw   x12, 0(x15)     // store result
       xor  x2, x3, x4
       unimp

To help answer the following questions, fill in the pipeline diagram below showing execution of the loop assuming that the loop was previously executed, and will be repeated again after this iteration. Extra copies of this table are provided at the end of the quiz.

Cycle
IF    lw   addi add  bne  bne  sw   xor  lw   addi add
DEC        lw   addi add  add  bne  sw   NOP  lw   addi
EXE             lw   addi NOP  add  bne  NOP  NOP  lw
MEM                  lw   addi NOP  add  bne  NOP  NOP
WB                        lw   addi NOP  add  bne  NOP

Bypass paths
Cycle 5: lw to add
Cycle 6: addi to lw
Cycle 7: add to sw
(A) (3 points) Are there points in the execution of the sequence when data is bypassed from the EXE stage back to the DEC stage? If so, give the instruction(s) in the DEC stage at each such point; otherwise, enter NONE.
Instruction(s) in DEC, or NONE: NONE

(B) (3 points) Are there points in the execution of the sequence when data is bypassed from the WB stage back to the DEC stage? If so, give the instruction(s) in the DEC stage at each such point; otherwise, enter NONE.
Instruction(s) in DEC, or NONE: add, bne

(C) (3 points) Are there points during the execution of the sequence when the pipeline is stalled? If so, give the instruction(s) in the DEC stage at each such point; otherwise, enter NONE.
Instruction(s) in DEC, or NONE: add

(D) (6 points) Fill in the contents of the WB stage below. You may have blank spaces indicating the time it takes for the lw instruction to reach the WB stage.
cycle
WB stage: lw addi NOP add bne NOP
Problem 5. Processor Pipeline Performance (20 points)

You are designing a four-stage RISC-V processor (IF, DEC, EXE, WB) with a BTB for next address prediction and a scoreboard for stalling on data hazards. Currently you are trying to decide whether to include bypassing through the register file from the write-back stage to the decode stage. As part of this evaluation, you construct two processors:

Processor A: No bypassing from WB to DEC.
Processor B: Bypassing from WB to DEC through the register file.

(A) (2 points) Question not covered on this quiz.

You are using the following loop of an important program to evaluate the performance of the processor:

    L1: lw   t0, 0(a0)
        add  a1, a1, t0
        addi a0, a0, 4
        blt  a0, a2, L1

For the following questions, assume this loop has been running for a long time.

(B) (4 points) How many cycles per loop iteration does the decode stage stall due to read-after-write hazards in the following cases?
Processor A decode stall cycles per iteration: 4
Processor B decode stall cycles per iteration: 2

(C) (2 points) How many cycles does this loop take to execute in the following cases?
Processor A cycles per iteration: 8
Processor B cycles per iteration: 6

Processor A
IF    lw   add  addi addi addi blt  lw   lw   lw   add  addi
DEC        lw   add  add  add  addi blt  blt  blt  lw   add
EXE             lw   NOP  NOP  add  addi NOP  NOP  blt  lw
WB                   lw   NOP  NOP  add  addi NOP  NOP  blt

Processor B
IF    lw   add  addi addi blt  lw   lw   add
DEC        lw   add  add  addi blt  blt  lw   add  add
EXE             lw   NOP  add  addi NOP  blt  lw   NOP  add
WB                   lw   NOP  add  addi NOP  blt  lw   NOP
Processor A has the following propagation delays for each of the pipeline stages:

IF: 2 ns
DEC: 3.0 ns
EX: 3.5 ns
WB: 1.0 ns

The logic for the bypassing path of processor B can be viewed as taking the output from the DEC and WB stages of processor A and adding additional bypass logic (BYP), as shown in the picture below. Assume the BYP logic has a propagation delay of 1 ns.

(D) (4 points) What is the minimum clock period for each processor?
Clock period for processor A: 3.5 ns
Clock period for processor B: 4 ns

(E) (4 points) For the loop shown above, what is the average cycles per instruction for the two processors?
Average cycles per instruction for processor A: 8/4 = 2
Average cycles per instruction for processor B: 6/4 = 3/2

(F) (4 points) For the loop shown above, what is the average number of instructions per second for the two processors?
Average number of instructions per second for processor A: 1/(7 ns)
Average number of instructions per second for processor B: 1/(6 ns)

Instr/sec = 1/((cyc/instr) * (sec/cycle))
A: 1/(2 * 3.5 ns) = 1/(7 ns)
B: 1/((3/2) * 4 ns) = 1/(6 ns)
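The numbers in parts (D) through (F) can be reproduced with a short sketch. The helper names are illustrative, not from the quiz. For processor B, the sketch assumes the 1 ns BYP logic sits in series with the 3 ns DEC logic (a 4 ns stage); that is one reading consistent with the stated 4 ns answer, since the picture is not available in this transcript.

```python
def min_clock_period(stage_delays_ns):
    """With ideal registers, the clock period is set by the slowest stage."""
    return max(stage_delays_ns.values())


def time_per_instruction(cycles_per_iter, instrs_per_iter, t_clk_ns):
    """Return (CPI, average ns per instruction) for a steady-state loop."""
    cpi = cycles_per_iter / instrs_per_iter
    return cpi, cpi * t_clk_ns


proc_a = {"IF": 2.0, "DEC": 3.0, "EX": 3.5, "WB": 1.0}
# Assumption: BYP (1 ns) adds to the DEC path in processor B.
proc_b = {"IF": 2.0, "DEC+BYP": 3.0 + 1.0, "EX": 3.5, "WB": 1.0}

t_a = min_clock_period(proc_a)   # 3.5 ns
t_b = min_clock_period(proc_b)   # 4.0 ns

# The loop has 4 instructions; cycles per iteration come from part (C).
print(time_per_instruction(8, 4, t_a))   # (2.0, 7.0): one instruction per 7 ns
print(time_per_instruction(6, 4, t_b))   # (1.5, 6.0): one instruction per 6 ns
```

Under these assumptions processor B wins despite its slower clock, because removing two stall cycles per iteration outweighs the 0.5 ns longer period.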
Problem 6. Synchronization (9 points)

G. Nome has designed four separate concurrent processes, each of which prints a single character A, C, G or T. Her customers place orders for sequences that satisfy certain constraints and Ms. Nome adds semaphores as appropriate to ensure the printed sequence meets the specified criteria.

For each of the customer orders below, add the appropriate semaphores so the running processes will produce sequences that make the customers happy. Don't forget to specify the semaphores' initial values! To receive full credit, don't impose any unnecessary constraints. Assume the processes start running immediately and that there are no constraints on the order in which statements in different processes are executed except those imposed by your semaphores. Processes will run indefinitely although they may, of course, end up stuck in a WAIT().

(A) (4 points) I'd like the sequence CAT.

semc = 1; sema = 0; semt = 0; semg = 0;

Process #1     Process #2     Process #3     Process #4
A:             C:             G:             T:
wait(sema)     wait(semc)     wait(semg)     wait(semt)
print("A")     print("C")     print("G")     print("T")
signal(semt)   signal(sema)
goto A         goto C         goto G         goto T

(B) (5 points) My sequences have to be exactly 4 characters long.

semc = 4;

Process #1     Process #2     Process #3     Process #4
A:             C:             G:             T:
wait(semc)     wait(semc)     wait(semc)     wait(semc)
print("A")     print("C")     print("G")     print("T")
goto A         goto C         goto G         goto T

Problem 7. Branch Prediction in Complex Pipeline (12 points)

Ben Bitdiddle has decided his high-performance RISC-V processor should have 8 pipeline stages, shown below.

IF1 IF2 Instruction fetch, first cycle Instruction fetch, second cycle
D RF ALU MEM1 MEM2 WB Instruction decode, calculate branch target address Read/bypass register operands, make branch decision Perform ALU operation on operands LD/ST memory access, first cycle LD/ST memory access, second cycle Write result to register file at end of cycle

Unless directed otherwise, the IF1 stage speculates that the next instruction comes from PC+4. The determination that an instruction is a branch instruction (e.g., beq, bne) is made in the D stage. The calculation of the branch target address is also made in the D stage. The actual branch decision (taken/not taken) is made in the ALU stage.

(A) (4 points) With the 8-stage pipeline, what is the number of NOPs introduced into the pipeline when a branch instruction changes the PC to the branch target address, i.e., it's a taken branch? When a branch instruction is not a taken branch? The number of NOPs introduced is called the branch penalty.
Branch penalty for taken branches (# of NOPs introduced): 4
Branch penalty for not-taken branches (# of NOPs introduced): 0

To reduce the penalty for taken branches, Ben plans to use a direct-mapped Branch Target Buffer (BTB) in the IF1 stage together with a Branch History Table (BHT) in the D stage. Detailed descriptions of the BTB and BHT are provided below. Recall that in the ALU stage, it becomes known whether the branch was actually taken or not.

1. The BTB holds entrypc, targetpc pairs for jumps and branches predicted to be taken. Assume that the targetpc predicted by the BTB is always correct for this question. (Yet the direction still might be wrong.)
2. The BTB is accessed every cycle. If there is a match with the current PC, PC is redirected to the targetpc predicted by the BTB (unless PC is redirected by an older instruction); if not, it is set to PC+4.
3. In the D stage (Instruction decode, calculate branch target address), a conditional branch instruction (beq/bne) looks up the BHT, but an unconditional jump (jal, jalr) does not.
If a branch is predicted to be taken, stages IF1 and IF2 are flushed and the PC is redirected to the calculated branch target address.
(B) (8 points) Fill out the following table of the number of pipeline bubbles (inserted NOPs) for conditional branches.

BTB hit?   BHT predicted taken?   Actually taken?   Pipeline bubbles
Yes        Yes                    Yes               0
Yes        Yes                    No                4
Yes        No                     Yes               Cannot occur
Yes        No                     No                Cannot occur
No         Yes                    Yes               2
No         Yes                    No                4
No         No                     Yes               4
No         No                     No
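One way to reason about the bubble counts above: when the PC is redirected from a given stage, every younger instruction in an earlier stage is flushed, so the cost equals that stage's position in the pipeline. The sketch below encodes this reading. STAGES, flush_cost and branch_bubbles are illustrative names, and the model is an assumption that reproduces the filled-in rows of the table, not the official solution method. It returns None for the two combinations the table marks "Cannot occur".

```python
STAGES = ["IF1", "IF2", "D", "RF", "ALU", "MEM1", "MEM2", "WB"]


def flush_cost(stage):
    """Bubbles created by a PC redirect issued from `stage`:
    one per pipeline stage that precedes it."""
    return STAGES.index(stage)


def branch_bubbles(btb_hit, bht_taken, actually_taken):
    """Bubbles for a conditional branch under the assumed model."""
    if btb_hit and not bht_taken:
        return None  # marked "Cannot occur" in the quiz table
    predicted_taken = btb_hit or bht_taken
    if predicted_taken != actually_taken:
        # Misprediction: corrected when the decision is known in ALU.
        return flush_cost("ALU")
    if actually_taken and not btb_hit:
        # Correctly predicted taken, but only by the BHT in the D stage.
        return flush_cost("D")
    # BTB redirected in IF1 correctly, or PC+4 speculation was correct.
    return 0


for row in [(True, True, True), (True, True, False),
            (False, True, True), (False, True, False),
            (False, False, True)]:
    print(row, branch_bubbles(*row))
```

For the last row (no BTB hit, BHT predicts not taken, branch not taken) this model yields 0, which agrees with the not-taken penalty given in part (A).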
Blank pipeline diagrams for Problem 4

Cycle
IF    lw
DEC
EXE
MEM
WB

Cycle
IF    lw
DEC
EXE
MEM
WB

END OF PRACTICE QUIZ 3!
6.004 Tutorial Problems L22 Branch Prediction
6.004 Tutorial Problems L22 Branch Prediction Branch target buffer (BTB): Direct-mapped cache (can also be set-associative) that stores the target address of jumps and taken branches. The BTB is searched
More information1 Hazards COMP2611 Fall 2015 Pipelined Processor
1 Hazards Dependences in Programs 2 Data dependence Example: lw $1, 200($2) add $3, $4, $1 add can t do ID (i.e., read register $1) until lw updates $1 Control dependence Example: bne $1, $2, target add
More information1 /20 2 /18 3 /20 4 /18 5 /24
M A S S A C H U S E T T S I N S T I T U T E O F T E C H N O L O G Y DEPARTMENT OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCE 6.S084 Computation Structures Spring 2018 1 /20 2 /18 3 /20 4 /18 5 /24 Practice
More information1 /20 2 /18 3 /20 4 /18 5 /24
M A S S A C H U S E T T S I N S T I T U T E O F T E C H N O L O G Y DEPARTMENT OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCE 6.S084 Computation Structures Spring 2018 1 /20 2 /18 3 /20 4 /18 5 /24 Practice
More information1 /15 2 /20 3 /20 4 /25 5 /20
M A S S A C H U S E T T S I N S T I T U T E O F T E C H N O L O G Y DEPARTMENT OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCE 6.S084 Computation Structures Spring 2018 1 /15 2 /20 3 /20 4 /25 5 /20 Quiz
More information3/12/2014. Single Cycle (Review) CSE 2021: Computer Organization. Single Cycle with Jump. Multi-Cycle Implementation. Why Multi-Cycle?
CSE 2021: Computer Organization Single Cycle (Review) Lecture-10b CPU Design : Pipelining-1 Overview, Datapath and control Shakil M. Khan 2 Single Cycle with Jump Multi-Cycle Implementation Instruction:
More informationIntroduction to Pipelining. Silvina Hanono Wachman Computer Science & Artificial Intelligence Lab M.I.T.
Introduction to Pipelining Silvina Hanono Wachman Computer Science & Artificial Intelligence Lab M.I.T. L15-1 Performance Measures Two metrics of interest when designing a system: 1. Latency: The delay
More informationAdvanced Parallel Architecture Lessons 5 and 6. Annalisa Massini /2017
Advanced Parallel Architecture Lessons 5 and 6 Annalisa Massini - Pipelining Hennessy, Patterson Computer architecture A quantitive approach Appendix C Sections C.1, C.2 Pipelining Pipelining is an implementation
More informationPipelining. CSC Friday, November 6, 2015
Pipelining CSC 211.01 Friday, November 6, 2015 Performance Issues Longest delay determines clock period Critical path: load instruction Instruction memory register file ALU data memory register file Not
More informationMIPS Pipelining. Computer Organization Architectures for Embedded Computing. Wednesday 8 October 14
MIPS Pipelining Computer Organization Architectures for Embedded Computing Wednesday 8 October 14 Many slides adapted from: Computer Organization and Design, Patterson & Hennessy 4th Edition, 2011, MK
More informationComplex Pipelines and Branch Prediction
Complex Pipelines and Branch Prediction Daniel Sanchez Computer Science & Artificial Intelligence Lab M.I.T. L22-1 Processor Performance Time Program Instructions Program Cycles Instruction CPI Time Cycle
More informationCS252 Graduate Computer Architecture Midterm 1 Solutions
CS252 Graduate Computer Architecture Midterm 1 Solutions Part A: Branch Prediction (22 Points) Consider a fetch pipeline based on the UltraSparc-III processor (as seen in Lecture 5). In this part, we evaluate
More information1 /18 2 /16 3 /18 4 /26 5 /22
M A S S A C H U S E T T S I N S T I T U T E O F T E C H N O L O G Y DEPARTMENT OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCE 6.004 Computation Structures Fall 2018 Quiz #2 1 /18 2 /16 3 /18 4 /26 5 /22
More informationChapter 4. The Processor
Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified
More informationc. What are the machine cycle times (in nanoseconds) of the non-pipelined and the pipelined implementations?
Brown University School of Engineering ENGN 164 Design of Computing Systems Professor Sherief Reda Homework 07. 140 points. Due Date: Monday May 12th in B&H 349 1. [30 points] Consider the non-pipelined
More informationCS2100 Computer Organisation Tutorial #10: Pipelining Answers to Selected Questions
CS2100 Computer Organisation Tutorial #10: Pipelining Answers to Selected Questions Tutorial Questions 2. [AY2014/5 Semester 2 Exam] Refer to the following MIPS program: # register $s0 contains a 32-bit
More informationDepartment of Computer and IT Engineering University of Kurdistan. Computer Architecture Pipelining. By: Dr. Alireza Abdollahpouri
Department of Computer and IT Engineering University of Kurdistan Computer Architecture Pipelining By: Dr. Alireza Abdollahpouri Pipelined MIPS processor Any instruction set can be implemented in many
More information1 /18 2 /16 3 /18 4 /26 5 /22
M A S S A C H U S E T T S I N S T I T U T E O F T E C H N O L O G Y DEPARTMENT OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCE 6.004 Computation Structures Fall 2018 Quiz #2 1 /18 2 /16 3 /18 4 /26 5 /22
More informationCENG 3531 Computer Architecture Spring a. T / F A processor can have different CPIs for different programs.
Exam 2 April 12, 2012 You have 80 minutes to complete the exam. Please write your answers clearly and legibly on this exam paper. GRADE: Name. Class ID. 1. (22 pts) Circle the selected answer for T/F and
More informationCOMPUTER ORGANIZATION AND DESIGN
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle
More informationCSE Lecture 13/14 In Class Handout For all of these problems: HAS NOT CANNOT Add Add Add must wait until $5 written by previous add;
CSE 30321 Lecture 13/14 In Class Handout For the sequence of instructions shown below, show how they would progress through the pipeline. For all of these problems: - Stalls are indicated by placing the
More informationLECTURE 3: THE PROCESSOR
LECTURE 3: THE PROCESSOR Abridged version of Patterson & Hennessy (2013):Ch.4 Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU
More informationCS 230 Practice Final Exam & Actual Take-home Question. Part I: Assembly and Machine Languages (22 pts)
Part I: Assembly and Machine Languages (22 pts) 1. Assume that assembly code for the following variable definitions has already been generated (and initialization of A and length). int powerof2; /* powerof2
More informationSOLUTIONS. CS152 Computer Architecture and Engineering. ISAs, Microprogramming and Pipelining Assigned 8/26/2016 Problem Set #1 Due September 13
CS152 Computer Architecture and Engineering SOLUTIONS ISAs, Microprogramming and Pipelining Assigned 8/26/2016 Problem Set #1 Due September 13 The problem sets are intended to help you learn the material,
More informationChapter 4. The Processor
Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified
More informationThe University of Alabama in Huntsville Electrical & Computer Engineering Department CPE Test II November 14, 2000
The University of Alabama in Huntsville Electrical & Computer Engineering Department CPE 513 01 Test II November 14, 2000 Name: 1. (5 points) For an eight-stage pipeline, how many cycles does it take to
More informationOutline. A pipelined datapath Pipelined control Data hazards and forwarding Data hazards and stalls Branch (control) hazards Exception
Outline A pipelined datapath Pipelined control Data hazards and forwarding Data hazards and stalls Branch (control) hazards Exception 1 4 Which stage is the branch decision made? Case 1: 0 M u x 1 Add
More informationPipelined CPUs. Study Chapter 4 of Text. Where are the registers?
Pipelined CPUs Where are the registers? Study Chapter 4 of Text Second Quiz on Friday. Covers lectures 8-14. Open book, open note, no computers or calculators. L17 Pipelined CPU I 1 Review of CPU Performance
More informationCOMPUTER ORGANIZATION AND DESIGN
COMPUTER ORGANIZATION AND DESIGN 5 Edition th The Hardware/Software Interface Chapter 4 The Processor 4.1 Introduction Introduction CPU performance factors Instruction count CPI and Cycle time Determined
More informationFull Datapath. Chapter 4 The Processor 2
Pipelining Full Datapath Chapter 4 The Processor 2 Datapath With Control Chapter 4 The Processor 3 Performance Issues Longest delay determines clock period Critical path: load instruction Instruction memory
More information/ : Computer Architecture and Design Fall Midterm Exam October 16, Name: ID #:
16.482 / 16.561: Computer Architecture and Design Fall 2014 Midterm Exam October 16, 2014 Name: ID #: For this exam, you may use a calculator and two 8.5 x 11 double-sided page of notes. All other electronic
More informationCOMPUTER ORGANIZATION AND DESI
COMPUTER ORGANIZATION AND DESIGN 5 Edition th The Hardware/Software Interface Chapter 4 The Processor 4.1 Introduction Introduction CPU performance factors Instruction count Determined by ISA and compiler
More informationFinal Exam Fall 2007
ICS 233 - Computer Architecture & Assembly Language Final Exam Fall 2007 Wednesday, January 23, 2007 7:30 am 10:00 am Computer Engineering Department College of Computer Sciences & Engineering King Fahd
More informationL19 Pipelined CPU I 1. Where are the registers? Study Chapter 6 of Text. Pipelined CPUs. Comp 411 Fall /07/07
Pipelined CPUs Where are the registers? Study Chapter 6 of Text L19 Pipelined CPU I 1 Review of CPU Performance MIPS = Millions of Instructions/Second MIPS = Freq CPI Freq = Clock Frequency, MHz CPI =
More informationPipeline Hazards. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University
Pipeline Hazards Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Hazards What are hazards? Situations that prevent starting the next instruction
More informationPipelining concepts The DLX architecture A simple DLX pipeline Pipeline Hazards and Solution to overcome
Thoai Nam Pipelining concepts The DLX architecture A simple DLX pipeline Pipeline Hazards and Solution to overcome Reference: Computer Architecture: A Quantitative Approach, John L Hennessy & David a Patterson,
More informationECE 154B Spring Project 4. Dual-Issue Superscalar MIPS Processor. Project Checkoff: Friday, June 1 nd, Report Due: Monday, June 4 th, 2018
Project 4 Dual-Issue Superscalar MIPS Processor Project Checkoff: Friday, June 1 nd, 2018 Report Due: Monday, June 4 th, 2018 Overview: Some machines go beyond pipelining and execute more than one instruction
More informationECE260: Fundamentals of Computer Engineering
Pipelining James Moscola Dept. of Engineering & Computer Science York College of Pennsylvania Based on Computer Organization and Design, 5th Edition by Patterson & Hennessy What is Pipelining? Pipelining
More informationCS252 Prerequisite Quiz. Solutions Fall 2007
CS252 Prerequisite Quiz Krste Asanovic Solutions Fall 2007 Problem 1 (29 points) The followings are two code segments written in MIPS64 assembly language: Segment A: Loop: LD r5, 0(r1) # r5 Mem[r1+0] LD
More informationControl Hazards - branching causes problems since the pipeline can be filled with the wrong instructions.
Control Hazards - branching causes problems since the pipeline can be filled with the wrong instructions Stage Instruction Fetch Instruction Decode Execution / Effective addr Memory access Write-back Abbreviation
More informationPipelining concepts The DLX architecture A simple DLX pipeline Pipeline Hazards and Solution to overcome
Pipeline Thoai Nam Outline Pipelining concepts The DLX architecture A simple DLX pipeline Pipeline Hazards and Solution to overcome Reference: Computer Architecture: A Quantitative Approach, John L Hennessy
More informationData Hazards Compiler Scheduling Pipeline scheduling or instruction scheduling: Compiler generates code to eliminate hazard
Data Hazards Compiler Scheduling Pipeline scheduling or instruction scheduling: Compiler generates code to eliminate hazard Consider: a = b + c; d = e - f; Assume loads have a latency of one clock cycle:
More informationENGN 2910A Homework 03 (140 points) Due Date: Oct 3rd 2013
ENGN 2910A Homework 03 (140 points) Due Date: Oct 3rd 2013 Professor: Sherief Reda School of Engineering, Brown University 1. [from Debois et al. 30 points] Consider the non-pipelined implementation of
More informationCOMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 4. The Processor
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle
More informationECEC 355: Pipelining
ECEC 355: Pipelining November 8, 2007 What is Pipelining Pipelining is an implementation technique whereby multiple instructions are overlapped in execution. A pipeline is similar in concept to an assembly
More informationInstruction Level Parallelism. Appendix C and Chapter 3, HP5e
Instruction Level Parallelism Appendix C and Chapter 3, HP5e Outline Pipelining, Hazards Branch prediction Static and Dynamic Scheduling Speculation Compiler techniques, VLIW Limits of ILP. Implementation
More informationThomas Polzer Institut für Technische Informatik
Thomas Polzer tpolzer@ecs.tuwien.ac.at Institut für Technische Informatik Pipelined laundry: overlapping execution Parallelism improves performance Four loads: Speedup = 8/3.5 = 2.3 Non-stop: Speedup =
More informationOrange Coast College. Business Division. Computer Science Department. CS 116- Computer Architecture. Pipelining
Orange Coast College Business Division Computer Science Department CS 116- Computer Architecture Pipelining Recall Pipelining is parallelizing execution Key to speedups in processors Split instruction
More informationComputer Organization MIPS Architecture. Department of Computer Science Missouri University of Science & Technology
Computer Organization MIPS Architecture Department of Computer Science Missouri University of Science & Technology hurson@mst.edu Computer Organization Note, this unit will be covered in three lectures.
More informationCOMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle
More informationStructure of Computer Systems
288 between this new matrix and the initial collision matrix M A, because the original forbidden latencies for functional unit A still have to be considered in later initiations. Figure 5.37. State diagram
More informationCOMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition The Processor - Introduction
More informationChapter 4. Instruction Execution. Introduction. CPU Overview. Multiplexers. Chapter 4 The Processor 1. The Processor.
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor The Processor - Introduction
More informationDepartment of Electrical Engineering and Computer Sciences Fall 2017 Instructors: Randy Katz, Krste Asanovic CS61C MIDTERM 2
University of California, Berkeley College of Engineering Department of Electrical Engineering and Computer Sciences Fall 2017 Instructors: Randy Katz, Krste Asanovic 2017 10 31 CS61C MIDTERM 2 After the
More informationFinal Exam Fall 2008
COE 308 Computer Architecture Final Exam Fall 2008 page 1 of 8 Saturday, February 7, 2009 7:30 10:00 AM Computer Engineering Department College of Computer Sciences & Engineering King Fahd University of
More informationLecture 7 Pipelining. Peng Liu.
Lecture 7 Pipelining Peng Liu liupeng@zju.edu.cn 1 Review: The Single Cycle Processor 2 Review: Given Datapath,RTL -> Control Instruction Inst Memory Adr Op Fun Rt
More information/ : Computer Architecture and Design Fall 2014 Midterm Exam Solution
16.482 / 16.561: Computer Architecture and Design Fall 2014 Midterm Exam Solution 1. (8 points) UEvaluating instructions Assume the following initial state prior to executing the instructions below. Note
More informationMidnight Laundry. IC220 Set #19: Laundry, Co-dependency, and other Hazards of Modern (Architecture) Life. Return to Chapter 4
IC220 Set #9: Laundry, Co-dependency, and other Hazards of Modern (Architecture) Life Return to Chapter 4 Midnight Laundry Task order A B C D 6 PM 7 8 9 0 2 2 AM 2 Smarty Laundry Task order A B C D 6 PM
More informationThe Processor (3) Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University
The Processor (3) Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu EEE3050: Theory on Computer Architectures, Spring 2017, Jinkyu Jeong (jinkyu@skku.edu)
OPEN BOOK, OPEN NOTES. NO COMPUTERS, OR SOLVING PROBLEMS DIRECTLY USING CALCULATORS.
CS/ECE472 Midterm #2 Fall 2008 NAME: Student ID#: OPEN BOOK, OPEN NOTES. NO COMPUTERS, OR SOLVING PROBLEMS DIRECTLY USING CALCULATORS. Your signature is your promise that you have not cheated and will
CS3350B Computer Architecture Quiz 3 March 15, 2018
CS3350B Computer Architecture Quiz 3 March 15, 2018 Student ID number: Student Last Name: Question 1.1 1.2 1.3 2.1 2.2 2.3 Total Marks The quiz consists of two exercises. The expected duration is 30 minutes.
Page 1. CISC 662 Graduate Computer Architecture. Lecture 8 - ILP 1. Pipeline CPI. Pipeline CPI (I) Pipeline CPI (II) Michela Taufer
CISC 662 Graduate Computer Architecture Lecture 8 - ILP 1 Michela Taufer Pipeline CPI http://www.cis.udel.edu/~taufer/teaching/cis662f07 Powerpoint Lecture Notes from John Hennessy and David Patterson
EE557--FALL 1999 MAKE-UP MIDTERM 1. Closed books, closed notes
NAME: STUDENT NUMBER: EE557--FALL 1999 MAKE-UP MIDTERM 1 Closed books, closed notes Q1: /10 Q2: /10 Q3: /10 Q4: /10 Q5: /15 Q6: /10 TOTAL: /65 Grade: /25 1 QUESTION 1 (Performance evaluation) 10 points We are
ELE 655 Microprocessor System Design
ELE 655 Microprocessor System Design Section 2 Instruction Level Parallelism Class 1 Basic Pipeline Notes: Reg shows up two places but actually is the same register file Writes occur on the second half
Hakim Weatherspoon CS 3410 Computer Science Cornell University
Hakim Weatherspoon CS 3410 Computer Science Cornell University The slides are the product of many rounds of teaching CS 3410 by Professors Weatherspoon, Bala, Bracy, McKee, and Sirer. memory inst register
2B 52 AB CA 3E A1 +29 A B C. CS120 Fall 2018 Final Prep and super secret quiz 9
CS120 Fall 2018 Final Prep and super secret quiz 9 1) Convert 8-bit (2-digit) 2's complement hex values: 4-29 Binary: Hex: x29 2) Convert 8-bit 2's complement hex to decimal: x3 Binary: xe5 Decimal: 58 Note 3*6+
Page # CISC 662 Graduate Computer Architecture. Lecture 8 - ILP 1. Pipeline CPI. Pipeline CPI (I) Michela Taufer
CISC 662 Graduate Computer Architecture Lecture 8 - ILP 1 Michela Taufer http://www.cis.udel.edu/~taufer/teaching/cis662f07 Powerpoint Lecture Notes from John Hennessy and David Patterson's: Computer Architecture,
Computer Architecture. Lecture 6.1: Fundamentals of
CS3350B Computer Architecture Winter 2015 Lecture 6.1: Fundamentals of Instructional Level Parallelism Marc Moreno Maza www.csd.uwo.ca/courses/cs3350b [Adapted from lectures on Computer Organization and
CS152 Computer Architecture and Engineering January 29, 2013 ISAs, Microprogramming and Pipelining Assigned January 29 Problem Set #1 Due February 14
CS152 Computer Architecture and Engineering January 29, 2013 ISAs, Microprogramming and Pipelining Assigned January 29 Problem Set #1 Due February 14 http://inst.eecs.berkeley.edu/~cs152/sp13 The problem
ECE 3056: Architecture, Concurrency, and Energy of Computation. Sample Problem Sets: Pipelining
ECE 3056: Architecture, Concurrency, and Energy of Computation Sample Problem Sets: Pipelining 1. Consider the following code sequence used often in block copies. This produces a special form of the load
Static, multiple-issue (superscalar) pipelines
Static, multiple-issue (superscalar) pipelines Start more than one instruction in the same cycle Instruction Register file EX + MEM + WB PC Instruction Register file EX + MEM + WB A static two-issue
Multiple Instruction Issue. Superscalars
Multiple Instruction Issue Multiple instructions issued each cycle better performance increase instruction throughput decrease in CPI (below 1) greater hardware complexity, potentially longer wire lengths
CS 61C: Great Ideas in Computer Architecture. Lecture 13: Pipelining. Krste Asanović & Randy Katz
CS 61C: Great Ideas in Computer Architecture Lecture 13: Pipelining Krste Asanović & Randy Katz http://inst.eecs.berkeley.edu/~cs61c/fa17 RISC-V Pipeline Pipeline Control Hazards Structural Data R-type
Computer System Architecture Quiz #1 March 8th, 2019
Computer System Architecture 6.823 Quiz #1 March 8th, 2019 Name: This is a closed book, closed notes exam. 80 Minutes 14 Pages (+2 Scratch) Notes: Not all questions are of equal difficulty, so look over
The Processor Pipeline. Chapter 4, Patterson and Hennessy, 4ed. Section 5.3, 5.4: J P Hayes.
The Processor Pipeline Chapter 4, Patterson and Hennessy, 4ed. Section 5.3, 5.4: J P Hayes. Pipeline A Basic MIPS Implementation Memory-reference instructions Load Word (lw) and Store Word (sw) ALU instructions
Chapter 4. The Processor
Chapter 4 The Processor 1 Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A
Design Project Computation Structures Fall 2018
Due date: Friday December 7th 11:59:59pm EST. This is a hard deadline: To comply with MIT rules, we cannot allow the use of late days. Getting started: To create your initial Design Project repository,
CS 251, Winter 2019, Assignment % of course mark
CS 251, Winter 2019, Assignment 5.1.1 3% of course mark Due Wednesday, March 27th, 5:30PM Lates accepted until 1:00pm March 28th with a 15% penalty 1. (10 points) The code sequence below executes on a
Processor design - MIPS
EASY Processor design - MIPS Q.1 What happens when a register is loaded? 1. The bits of the register are set to all ones. 2. The bit pattern in the register is copied to a location in memory. 3. A bit
Pipelined Processor Design
Pipelined Processor Design Pipelined Implementation: MIPS Virendra Singh Computer Design and Test Lab. Indian Institute of Science (IISc) Bangalore virendra@computer.org Advance Computer Architecture http://www.serc.iisc.ernet.in/~viren/courses/aca/aca.htm
LECTURE 10. Pipelining: Advanced ILP
LECTURE 10 Pipelining: Advanced ILP EXCEPTIONS An exception, or interrupt, is an event other than regular transfers of control (branches, jumps, calls, returns) that changes the normal flow of instruction
Photo David Wright STEVEN R. BAGLEY PIPELINES AND ILP
Photo David Wright https://www.flickr.com/photos/dhwright/3312563248 STEVEN R. BAGLEY PIPELINES AND ILP INTRODUCTION Been considering what makes the CPU run at a particular speed Spent the last two weeks
CS 61C: Great Ideas in Computer Architecture. Multiple Instruction Issue, Virtual Memory Introduction
CS 61C: Great Ideas in Computer Architecture Multiple Instruction Issue, Virtual Memory Introduction Instructor: Justin Hsia 7/26/2012 Summer 2012 Lecture #23 1 Parallel Requests Assigned to computer e.g.
CS433 Midterm. Prof Josep Torrellas. October 19, Time: 1 hour + 15 minutes
CS433 Midterm Prof Josep Torrellas October 19, 2017 Time: 1 hour + 15 minutes Name: Instructions: 1. This is a closed-book, closed-notes examination. 2. The Exam has 4 Questions. Please budget your time.
Computer Architecture and Engineering CS152 Quiz #1 Wed, February 17th, 2016 Professor George Michelogiannakis Name: <ANSWER KEY>
Computer Architecture and Engineering CS152 Quiz #1 Wed, February 17th, 2016 Professor George Michelogiannakis Name: This is a closed book, closed notes exam. 80 Minutes. 18 pages Notes: Not
Chapter 4 The Processor 1. Chapter 4B. The Processor
Chapter 4 The Processor 1 Chapter 4B The Processor Chapter 4 The Processor 2 Control Hazards Branch determines flow of control Fetching next instruction depends on branch outcome Pipeline can't always
CS150 Fall 2012 Solutions to Homework 6
CS150 Fall 2012 Solutions to Homework 6 October 6, 2012 Problem 1 a.) Answer: 0.09 ns This delay is given in Table 65 as T ILO, specifically An Dn LUT address to A. b.) Answer: 0.41 ns In Table 65, this
CS 152 Computer Architecture and Engineering
CS 152 Computer Architecture and Engineering Lecture 20 Advanced Processors I 2005-4-5 John Lazzaro (www.cs.berkeley.edu/~lazzaro) TAs: Ted Hong and David Marquardt www-inst.eecs.berkeley.edu/~cs152/ Last
Computer Architecture CS372 Exam 3
Name: Computer Architecture CS372 Exam 3 This exam has 7 pages. Please make sure you have all of them. Write your name on this page and initials on every other page now. You may only use the green card
ECE473 Computer Architecture and Organization. Pipeline: Control Hazard
Computer Architecture and Organization Pipeline: Control Hazard Lecturer: Prof. Yifeng Zhu Fall, 2015 Portions of these slides are derived from: Dave Patterson UCB Lec 15.1 Pipelining Outline Introduction
ECE331: Hardware Organization and Design
ECE331: Hardware Organization and Design Lecture 27: Midterm2 review Adapted from Computer Organization and Design, Patterson & Hennessy, UCB Midterm 2 Review Midterm will cover Section 1.6: Processor
Pipelining Analogy. Pipelined laundry: overlapping execution. Parallelism improves performance. Four loads: Non-stop: Speedup = 8/3.5 = 2.3.
Pipelining Analogy Pipelined laundry: overlapping execution Parallelism improves performance Four loads: Speedup = 8/3.5 = 2.3 Non-stop: Speedup = 2n/(0.5n + 1.5) ≈ 4 = number of stages 4.5 An Overview
Lecture 29 Review. CPU time: the best metric. Be sure you understand CC, clock period. Common (and good) performance metrics.
Be sure you understand CC, clock period Lecture 29 Review Suggested reading: Everything Q1: D[8] = D[8] + RF[1] + RF[4] I[15]: Add R2, R1, R4 RF[1] = 4 I[16]: MOV R3, 8 RF[4] = 5 I[17]: Add R2, R2, R3
Final Exam Spring 2017
COE 301 / ICS 233 Computer Organization Final Exam Spring 2017 Friday, May 19, 2017 7:30 AM Computer Engineering Department College of Computer Sciences & Engineering King Fahd University of Petroleum & Minerals
1. Objectives: Using the Logisim simulator Designing and testing a Pipelined 16-bit
Lecture 3: The Processor (Chapter 4 of textbook) Chapter 4.1
Lecture 3: The Processor (Chapter 4 of textbook) Chapter 4.1 Introduction Chapter 4.1 Chapter 4.2 Review: MIPS (RISC) Design Principles Simplicity favors regularity fixed size instructions small number
2 GHz = 500 picosec frequency. Vars declared outside of main() are in static. 2^(# offset bits) = block size. Put starting arrow in FSM diagrams
CS 61C Fall 2011 Kenny Do Final cheat sheet Increment memory addresses by multiples of 4, since lw and sw are byte-aligned When going from C to MIPS, always use addu, addiu, and subu When saving stuff into
6.823 Computer System Architecture Datapath for DLX Problem Set #2
6.823 Computer System Architecture Datapath for DLX Problem Set #2 Spring 2002 Students are allowed to collaborate in groups of up to 3 people. A group hands in only one copy of the solution to a problem
Do not open this exam until instructed to do so. CS/ECE 354 Final Exam May 19, CS Login: QUESTION MAXIMUM SCORE TOTAL 115
Name: Solution Signature: Student ID#: Section #: CS Login: Section 2: Section 3: 11:00am (Wood) 1:00pm (Castano) CS/ECE 354 Final Exam May 19, 2000 1. This exam is open book/notes. 2. No calculators.
EN2910A: Advanced Computer Architecture Topic 02: Review of classical concepts
EN2910A: Advanced Computer Architecture Topic 02: Review of classical concepts Prof. Sherief Reda School of Engineering Brown University S. Reda EN2910A FALL'15 1 Classical concepts (prerequisite) 1. Instruction