ECE 411 Exam 2. November 3, pm-10pm. This exam has 5 questions. Make sure you have a complete exam before you begin.
Name: ________

November 3, pm-10pm

This exam has 5 questions. Make sure you have a complete exam before you begin. Write your name on every page in case pages become separated during grading. You will have three hours to complete this exam. Write all of your answers on the exam itself. If you need more space to answer a given question, continue on the back of the page, but clearly indicate that you have done so. This exam is closed-book. You may use one sheet of notes. You may use a calculator. Do not do anything that might be perceived as cheating. The minimum penalty for cheating will be a grade of zero. Show all of your work on all problems. Correct answers that do not include work demonstrating how they were generated may not receive full credit, and answers that show no work cannot receive partial credit. The exam is meant to test your understanding. Ample time has been provided, so be patient and read the questions carefully before you answer. Good luck!

Question                      Points   Score
Machine Problem               16
Data Hazard                   18
Control Hazard                18
IEEE Floating Point Format    21
Tomasulo                      22
Total:                        95
1. Machine Problem (16 points)

For the entire question, the cache has 8 sets, each line is 128 bits, and it is byte addressable.

(a) (4 points) The given figures show a controller and part of a datapath for an ECE 411 MP2 cache. Write SystemVerilog code to implement the module labeled gen_mem_resp. The mem_resp signal should only be high when a memory access is being performed. (Write your answer on page 4.)

[Figure: partial cache datapath. Legible labels: address_split, data_array, tag_array, valid_array, dirty_array, way_dirty, mem_write, gen_mem_resp, mem_resp.]
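The cache geometry stated for this question fixes how an address is divided: a 128-bit line is 16 bytes (4 offset bits) and 8 sets need 3 index bits. The following Python sketch illustrates that split. The 16-bit address width is an assumption based on the LC-3b types imported by the module skeleton, and the function name split_address is hypothetical.

```python
OFFSET_BITS = 4   # 128-bit line = 16 bytes -> log2(16) offset bits
INDEX_BITS = 3    # 8 sets -> log2(8) index bits
ADDR_BITS = 16    # assumption: LC-3b style 16-bit byte addresses


def split_address(addr):
    """Return (tag, index, offset) for a byte address."""
    addr &= (1 << ADDR_BITS) - 1
    offset = addr & ((1 << OFFSET_BITS) - 1)
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


if __name__ == "__main__":
    tag, index, offset = split_address(0x1234)
    print(f"tag={tag:#x} index={index} offset={offset}")
```

Under these assumptions the tag is 16 - 3 - 4 = 9 bits wide, so the tag comparison would fall in the 9 to 16 bit comparator class of the latency table.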
import lc3b_types::*;

module gen_mem_resp
(
    // Your answer for part A starts here
);

[Handwritten answer illegible in the transcription.]

endmodule : gen_mem_resp

(b) (4 points) Given the following table of component latencies, compute the latency of mem_resp for part A for a write.

Component                        Latency
AND gate                         2 ns
OR gate                          2 ns
2-to-1 mux                       4 ns
1-8 bit comparator               4 ns
9-16 bit comparator              6 ns
tag/data/valid/dirty array       20 ns

[Handwritten answer illegible in the transcription.]

ECE 411 Exam 2, Page 4
[Figure: DataPath. Two cache ways, CacheComponent0 and CacheComponent1, each with ports address, rdata, wdata, read, write, byte_enable, write_line, cache_line_input, eviction_data, eviction_address, hit, miss, eviction. A 1-bit array (write, input, index, output) produces lru_out, which selects between the two ways. CPU-side signals: mem_address, mem_read, mem_write, mem_wdata, mem_rdata, byte_enable, mem_resp. Physical memory signals: pmem_address, pmem_read, pmem_write, pmem_wdata, pmem_rdata, pmem_resp. The label mem_address[6:4] also appears.]

[Figure: State Machine. Input/output signals: lru_out, mem_address, cw, eviction, pw, pr, pmem_resp, miss.]

Output values:
State     cw   pw   pr
HitIdle   0    0    0
WB        0    1    0
LdWait    0    0    1
Load      1    0    0

Transition labels legible in the figure: ~miss, miss&~eviction, eviction, pmem_resp, ~pmem_resp, always.
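The controller figure can be read as a four-state machine. The sketch below is one plausible reading, not the exam's official diagram: the output values per state come from the figure, but the pairing of transition labels with arcs is an assumption, since the arrows did not survive transcription. The function name next_state is hypothetical.

```python
# Outputs (cw, pw, pr) per state, as listed in the state machine figure.
OUTPUTS = {
    "HitIdle": (0, 0, 0),
    "WB": (0, 1, 0),
    "LdWait": (0, 0, 1),
    "Load": (1, 0, 0),
}


def next_state(state, miss, eviction, pmem_resp):
    """Assumed arcs: HitIdle -> WB on eviction, HitIdle -> LdWait on
    miss & ~eviction, WB -> LdWait and LdWait -> Load on pmem_resp,
    Load -> HitIdle always."""
    if state == "HitIdle":
        if not miss:
            return "HitIdle"
        return "WB" if eviction else "LdWait"
    if state == "WB":
        return "LdWait" if pmem_resp else "WB"
    if state == "LdWait":
        return "Load" if pmem_resp else "LdWait"
    return "HitIdle"  # Load returns unconditionally


if __name__ == "__main__":
    state = "HitIdle"
    # A miss that needs a write-back, with memory responding every other cycle.
    for miss, eviction, resp in [(1, 1, 0), (1, 1, 0), (1, 1, 1),
                                 (1, 0, 0), (1, 0, 1), (0, 0, 0)]:
        print(state, OUTPUTS[state])
        state = next_state(state, bool(miss), bool(eviction), bool(resp))
```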
2. Data Hazard (18 points)

This question tests your understanding of the mechanisms for handling data hazards in pipelined processors. Assume the 5-stage pipelined processor discussed in class with a transparent register file. It has no data forwarding but stalls on data hazards. Consider the following code segment:

I1: LDR R1, R0, A
I2: LDR R2, R0, B
I3: ADD R1, R1, R2
I4: STR R1, R0, C
I5: AND R3, R1, 4
I6: ADD R3, R1, R3

(a) (5 points) Identify all the register data dependencies in the code above. For each occurrence of a dependency, give the two instructions and the register involved in the form (I1, I2, Rn), which means I2 depends on I1 through their use of Rn. Leave blank if no dependency is present for that category.

RAW Dependency: [handwritten answer illegible in the transcription]
WAW Dependency: [handwritten answer illegible in the transcription]
WAR Dependency:

Which of the dependences is/are data hazard(s) in this system? Why or why not?
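Part (a) can be checked mechanically. The sketch below scans the six instructions and reports RAW, WAW and WAR pairs. It is not the exam's answer key: it assumes the "nearest producer" convention (a read depends only on the most recent earlier write of that register), and it treats STR as reading both R1 and R0 and writing no register. The function name find_dependencies is hypothetical.

```python
# (name, destination register or None, source registers)
PROGRAM = [
    ("I1", "R1", ["R0"]),         # LDR R1, R0, A
    ("I2", "R2", ["R0"]),         # LDR R2, R0, B
    ("I3", "R1", ["R1", "R2"]),   # ADD R1, R1, R2
    ("I4", None, ["R1", "R0"]),   # STR R1, R0, C (reads R1 and R0)
    ("I5", "R3", ["R1"]),         # AND R3, R1, 4
    ("I6", "R3", ["R1", "R3"]),   # ADD R3, R1, R3
]


def find_dependencies(program):
    raw, waw, war = [], [], []
    last_writer = {}          # register -> name of most recent writer
    readers_since_write = {}  # register -> readers since its last write
    for name, dest, srcs in program:
        for reg in srcs:
            if reg in last_writer:
                raw.append((last_writer[reg], name, reg))
            readers_since_write.setdefault(reg, []).append(name)
        if dest is not None:
            if dest in last_writer:
                waw.append((last_writer[dest], name, dest))
            for reader in readers_since_write.get(dest, []):
                if reader != name:  # an instruction does not conflict with itself
                    war.append((reader, name, dest))
            last_writer[dest] = name
            readers_since_write[dest] = []
    return raw, waw, war


if __name__ == "__main__":
    for label, deps in zip(("RAW", "WAW", "WAR"), find_dependencies(PROGRAM)):
        print(label, deps)
```

With this convention the code segment has six RAW pairs, two WAW pairs and no WAR pairs; a convention that also counts older producers (for example I1 to I4 through R1) would list more.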
(b) (4 points) Complete the following table with all the data hazards handled with stalls but no forwarding. Fill each cell with the pipeline stage (IF, ID, EX, MEM or WB) the instruction is in at each cycle. Leave the cell blank only when the instruction is not in the pipeline. The first two rows have been completed as an example.

[Table: pipeline timing diagram with one row per instruction (I1 to I6) and one column per cycle. The handwritten entries are illegible in the transcription.]

(c) (4 points) What is the speedup of running this code segment with full data forwarding (MEM->EX, WB->EX and WB->MEM)? Show your calculation.

(d) (5 points) Sort the following optimizations by the maximum speedup they can achieve on this code segment, assuming that we have both stalling and bypassing. Use only > and =. For example, a > b means option a has a higher speedup than option b. Put your answer in the box and explain your reasoning below the box.

a. Instruction reordering
b. Hardware register renaming
c. 2-issue superscalar
d. 3-issue superscalar
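The cycle counts behind parts (b) and (c) can be estimated with a small in-order timing model. This is a sketch under stated assumptions, not the official solution: without forwarding, a consumer finishes ID in the same cycle its producer is in WB (transparent register file); with forwarding, an ALU result is usable in EX one cycle after the producer's EX, a load result one cycle after the producer's MEM, and store data never stalls because WB->MEM forwarding always arrives in time in an in-order pipeline. The names Instr and total_cycles are hypothetical.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    op: str
    dest: Optional[str]
    ex_srcs: Tuple[str, ...]        # registers needed when EX starts
    mem_srcs: Tuple[str, ...] = ()  # registers needed only in MEM (store data)


PROGRAM = [
    Instr("LDR", "R1", ("R0",)),
    Instr("LDR", "R2", ("R0",)),
    Instr("ADD", "R1", ("R1", "R2")),
    Instr("STR", None, ("R0",), ("R1",)),
    Instr("AND", "R3", ("R1",)),
    Instr("ADD", "R3", ("R1", "R3")),
]


def total_cycles(program, forwarding):
    """Cycle in which the last instruction is in WB (first IF is cycle 1)."""
    last_writer = {}  # register -> (EX cycle of producer, producer opcode)
    prev_ex = 2       # so the first instruction has IF=1, ID=2, EX=3
    for ins in program:
        ex = prev_ex + 1  # in-order: at best one cycle behind the previous EX
        # Without forwarding every source is read from the register file in ID.
        srcs = ins.ex_srcs if forwarding else ins.ex_srcs + ins.mem_srcs
        for reg in srcs:
            if reg not in last_writer:
                continue
            p_ex, p_op = last_writer[reg]
            if forwarding:
                # Value exists at end of MEM for loads, end of EX otherwise.
                avail = p_ex + 1 if p_op == "LDR" else p_ex
                ex = max(ex, avail + 1)
            else:
                # Producer WB is p_ex + 2; consumer ID may share that cycle.
                ex = max(ex, p_ex + 3)
        if ins.dest is not None:
            last_writer[ins.dest] = (ex, ins.op)
        prev_ex = ex
    return prev_ex + 2  # MEM and WB follow EX


if __name__ == "__main__":
    stall_only = total_cycles(PROGRAM, forwarding=False)
    forwarded = total_cycles(PROGRAM, forwarding=True)
    print(stall_only, forwarded, round(stall_only / forwarded, 2))
```

Under these assumptions the segment takes 16 cycles with stalls only and 11 cycles with full forwarding (one load-use stall before I3 remains), which would give a speedup of 16/11, roughly 1.45. A different pipeline convention from the course would change these numbers.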
3. Control Hazard (18 points)

Assume the following conditions for this problem:
L1 cache hit rate: 100%
BTB hit rate: 100%
Branch decision is available at the end of the EX stage

For every cycle, fill in the PC of the instruction in each stage. Only the instructions shown in the following table are branch instructions; there are no other instructions that would cause a PC change (i.e., no JSR/JMP/TRAP/etc.) and no indirect instructions. Leave the cell blank if there is no valid instruction in that stage.

Branch ID   Branch Instruction PC   Target PC   Action
0           0x22                    0x38        Not Taken
1           0x24                    0x36        Taken
2           0x38                    0x50        Not Taken

(a) (6 points) There is no prediction (and no BTB).

Cycle #   IF     ID     EX     MEM    WB
0         0x22   0x20   0x1e   0x1c   0x1a
1 to 7    [handwritten entries illegible in the transcription]

(b) (6 points) BTB is present, always predict taken.

Cycle #   IF     ID     EX     MEM    WB
[Cycle 0 is pre-filled (MEM 0x1c and WB 0x1a are legible); the handwritten entries for later cycles are illegible in the transcription.]
(c) (6 points) BRIEFLY explain why it is (or is not) challenging to resolve a branch in the EX stage or earlier. BRIEFLY give a high-level idea of how it would work (compared to the baseline: resolve branch in WB). Resolve means "having the branch decision ready".

[Handwritten answer illegible in the transcription.]
4. IEEE Floating Point Format (21 points)

This question tests your understanding of the IEEE 754 Floating Point Standard. Assume a hypothetical 6-bit floating-point format (1-bit S, 3-bit E, 2-bit M) that conforms to the IEEE 754 standard in answering the following questions. The format supports denormalized numbers. Show your work for full and partial credit.

(a) (3 points) What is the representation of decimal value 0.0 in this format?
S = 0, E = 0, M = 0 (0 000 00)

(b) (3 points) What is the decimal value of the largest representable number in this format?

(c) (3 points) What is the decimal value of ( )? You can express it as a power of 2.

[Part (d) and the handwritten answers to parts (b) through (d) are illegible in the transcription.]

(e) (3 points) What would be the correct result value of the previous [word illegible]?

(f) (3 points) What is the ULP for ( )?

(g) (3 points) What is the decimal value of ( )?

[Handwritten answers illegible in the transcription.]
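The 6-bit format described in this question can be decoded with the usual IEEE 754 rules scaled down: the bias for a 3-bit exponent is 2^(3-1) - 1 = 3, an all-zero exponent marks zero or a denormalized number, and an all-one exponent marks infinity or NaN. The following is a minimal sketch of such a decoder; the function name decode6 is hypothetical, and the bit patterns that the exam asks about did not survive transcription, so the usage lines only show patterns chosen here for illustration.

```python
import math

EXP_BITS = 3
MAN_BITS = 2
BIAS = (1 << (EXP_BITS - 1)) - 1  # 3 for a 3-bit exponent


def decode6(bits):
    """Decode a 6-bit pattern laid out as S | EEE | MM into a Python float."""
    sign = -1.0 if (bits >> (EXP_BITS + MAN_BITS)) & 1 else 1.0
    exp = (bits >> MAN_BITS) & ((1 << EXP_BITS) - 1)
    man = bits & ((1 << MAN_BITS) - 1)
    frac = man / (1 << MAN_BITS)
    if exp == (1 << EXP_BITS) - 1:   # all ones: infinity or NaN
        return sign * math.inf if man == 0 else math.nan
    if exp == 0:                     # zero or denormalized: no hidden 1
        return sign * frac * 2.0 ** (1 - BIAS)
    return sign * (1.0 + frac) * 2.0 ** (exp - BIAS)


if __name__ == "__main__":
    print(decode6(0b000000))  # zero
    print(decode6(0b011011))  # largest finite pattern: E=110, M=11
    print(decode6(0b000001))  # smallest positive denormal
```

The largest finite pattern has E = 110 and M = 11, which decodes to 1.75 x 2^3 = 14.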
5. Tomasulo (22 points)

Follow these guidelines:
Null (empty) values are denoted by "".
All instructions are fetched and reside in the instruction queue.
One instruction can be issued per cycle.
If all operands are available, an instruction can issue and begin execution on the same cycle.
LD/ST instructions can begin execution once the address is available.
When a reservation station obtains its last operand value, it can begin execution on the next cycle.
If one instruction finishes executing, the result is broadcast in the same cycle. The value is written to the register file in the next cycle.
If two or more instructions finish execution in the same cycle, the one issued first broadcasts first.
There are one ADDF/SUBF unit, one MULTF/DIVF unit, and one LD/ST unit.
Each pipelined execution unit has two reservation stations (or load/store buffers) and can start executing a new instruction every clock cycle.
The execution unit latencies are ADDF/SUBF = 5, MULTF = 10, DIVF = 20, LD = 10, ST = 15. These latencies do not include writing the results to reservation stations and/or registers.
Reservation stations are deallocated during the result writing cycle and can be reassigned on the following cycle. Deallocated but not reassigned reservation stations should be indicated with null fields.
Initial register contents of R1: 100.
Values in memory should be denoted by MEM(addrs).

Here is a snapshot of the system at cycle 0 (C0).

Instruction Status:
#   Instruction          Issued?   EX Complete   Results Written?
1   ADDF F1, F2, F3      Y
2   SUBF F4, F1, F5      N
3   LD F1, 0(R1)         N
4   DIVF F7, F3, F8      N
5   ADDF F4, F4, F1      N
6   MULTF F6, F3, F4     N
7   ST 0(R1), F6         N

Reservation Stations:
Name    Busy?   OP     Value1   Value2   Producer1   Producer2
Add1    Y       ADDF   2        3
Add2    N
Mult1   N
Mult2   N
Load/Store Buffers:
Name     Busy?   Address   Producer   Value
Load1    N                 N/A        N/A
Store1   N

Register File (F1-F8) Status:
Register   F1     F2   F3   F4   F5   F6   F7   F8
Producer   Add1
Value

(a) (2 points) Suppose that instruction 1 issued in cycle C0. Which instructions will issue in the next 6 cycles and at which cycle? If an instruction is not issued, briefly explain why it cannot.

[Handwritten answer largely illegible in the transcription.]

(b) (8 points) Show the state of the system at the end of cycle 12 (C12).

Instruction Status:
#   Instruction          Issued?   EX Complete   Results Written?
[Handwritten entries for instructions 1 to 7 illegible in the transcription.]
Reservation Stations:
Name    Busy?   OP   Value1   Value2   Producer1   Producer2
Add1
Add2
Mult1
Mult2
[Handwritten entries illegible in the transcription.]

Load/Store Buffers:
Name     Busy?   Address   Producer   Value
Load1                      N/A        N/A
Store1
[Handwritten entries illegible in the transcription.]

Register File (F1-F8) Status:
Register   F1   F2   F3   F4   F5   F6   F7   F8
Producer
Value
[Handwritten entries illegible in the transcription.]

(c) (4 points) Suppose the following instruction occurred after the given instructions. The instruction may not run correctly on the machine described in this problem. Why would it be incorrect? What should be done to correct the problem?

8: LD F10, 0(R1)
(d) (4 points) Assume the instruction LD F1, 0(R1) causes a page fault at the last cycle of its execution. Without a reorder buffer, some of the registers may contain wrong values. Fill in the values for the register file at the cycle when the LD instruction incurs the page fault. Indicate the registers that have wrong values.

Register File (F1-F8) Status:
Register         F1   F2   F3   F4   F5   F6   F7   F8
Value
Value Correct?
[Handwritten entries illegible in the transcription.]

(e) (4 points) Besides exceptions, what other kinds of situations or instructions can a reorder buffer support?
ISA Instruction Operation
This exam has 6 problems. Make sure you have a complete exam before you begin. Write your name on every page in case pages become separated during grading. You will have three hours to complete this exam.
More informationECE 505 Computer Architecture
ECE 505 Computer Architecture Pipelining 2 Berk Sunar and Thomas Eisenbarth Review 5 stages of RISC IF ID EX MEM WB Ideal speedup of pipelining = Pipeline depth (N) Practically Implementation problems
More informationThis Set. Scheduling and Dynamic Execution Definitions From various parts of Chapter 4. Description of Three Dynamic Scheduling Methods
10-1 Dynamic Scheduling 10-1 This Set Scheduling and Dynamic Execution Definitions From various parts of Chapter 4. Description of Three Dynamic Scheduling Methods Not yet complete. (Material below may
More informationFinal Exam Fall 2007
ICS 233 - Computer Architecture & Assembly Language Final Exam Fall 2007 Wednesday, January 23, 2007 7:30 am 10:00 am Computer Engineering Department College of Computer Sciences & Engineering King Fahd
More informationPipeline issues. Pipeline hazard: RaW. Pipeline hazard: RaW. Calcolatori Elettronici e Sistemi Operativi. Hazards. Data hazard.
Calcolatori Elettronici e Sistemi Operativi Pipeline issues Hazards Pipeline issues Data hazard Control hazard Structural hazard Pipeline hazard: RaW Pipeline hazard: RaW 5 6 7 8 9 5 6 7 8 9 : add R,R,R
More informationThis Set. Scheduling and Dynamic Execution Definitions From various parts of Chapter 4. Description of Two Dynamic Scheduling Methods
10 1 Dynamic Scheduling 10 1 This Set Scheduling and Dynamic Execution Definitions From various parts of Chapter 4. Description of Two Dynamic Scheduling Methods Not yet complete. (Material below may repeat
More information3/12/2014. Single Cycle (Review) CSE 2021: Computer Organization. Single Cycle with Jump. Multi-Cycle Implementation. Why Multi-Cycle?
CSE 2021: Computer Organization Single Cycle (Review) Lecture-10b CPU Design : Pipelining-1 Overview, Datapath and control Shakil M. Khan 2 Single Cycle with Jump Multi-Cycle Implementation Instruction:
More informationGood luck and have fun!
Midterm Exam October 13, 2014 Name: Problem 1 2 3 4 total Points Exam rules: Time: 90 minutes. Individual test: No team work! Open book, open notes. No electronic devices, except an unprogrammed calculator.
More informationHardware-based speculation (2.6) Multiple-issue plus static scheduling = VLIW (2.7) Multiple-issue, dynamic scheduling, and speculation (2.
Instruction-Level Parallelism and its Exploitation: PART 2 Hardware-based speculation (2.6) Multiple-issue plus static scheduling = VLIW (2.7) Multiple-issue, dynamic scheduling, and speculation (2.8)
More informationECE 411 Exam 1. Name:
This exam has 5 problems. Make sure you have a complete exam before you begin. Write your name on every page in case pages become separated during grading. You will have 3 hours to complete this exam.
More informationPipelining. CSC Friday, November 6, 2015
Pipelining CSC 211.01 Friday, November 6, 2015 Performance Issues Longest delay determines clock period Critical path: load instruction Instruction memory register file ALU data memory register file Not
More information6.823 Computer System Architecture
6.823 Computer System Architecture Problem Set #4 Spring 2002 Students are encouraged to collaborate in groups of up to 3 people. A group needs to hand in only one copy of the solution to a problem set.
More informations complement 1-bit Booth s 2-bit Booth s
ECE/CS 552 : Introduction to Computer Architecture FINAL EXAM May 12th, 2002 NAME: This exam is to be done individually. Total 6 Questions, 100 points Show all your work to receive partial credit for incorrect
More informationChapter 06: Instruction Pipelining and Parallel Processing
Chapter 06: Instruction Pipelining and Parallel Processing Lesson 09: Superscalar Processors and Parallel Computer Systems Objective To understand parallel pipelines and multiple execution units Instruction
More informationCS433 Homework 2 (Chapter 3)
CS Homework 2 (Chapter ) Assigned on 9/19/2017 Due in class on 10/5/2017 Instructions: 1. Please write your name and NetID clearly on the first page. 2. Refer to the course fact sheet for policies on collaboration..
More informationCS433 Homework 2 (Chapter 3)
CS433 Homework 2 (Chapter 3) Assigned on 9/19/2017 Due in class on 10/5/2017 Instructions: 1. Please write your name and NetID clearly on the first page. 2. Refer to the course fact sheet for policies
More informationChapter 3 Instruction-Level Parallelism and its Exploitation (Part 1)
Chapter 3 Instruction-Level Parallelism and its Exploitation (Part 1) ILP vs. Parallel Computers Dynamic Scheduling (Section 3.4, 3.5) Dynamic Branch Prediction (Section 3.3) Hardware Speculation and Precise
More informationECE 411 Exam 1. This exam has 5 problems. Make sure you have a complete exam before you begin.
This exam has 5 problems. Make sure you have a complete exam before you begin. Write your name on every page in case pages become separated during grading. You will have three hours to complete this exam.
More informationILP concepts (2.1) Basic compiler techniques (2.2) Reducing branch costs with prediction (2.3) Dynamic scheduling (2.4 and 2.5)
Instruction-Level Parallelism and its Exploitation: PART 1 ILP concepts (2.1) Basic compiler techniques (2.2) Reducing branch costs with prediction (2.3) Dynamic scheduling (2.4 and 2.5) Project and Case
More informationDynamic Scheduling. Better than static scheduling Scoreboarding: Tomasulo algorithm:
LECTURE - 13 Dynamic Scheduling Better than static scheduling Scoreboarding: Used by the CDC 6600 Useful only within basic block WAW and WAR stalls Tomasulo algorithm: Used in IBM 360/91 for the FP unit
More informationCOSC4201 Instruction Level Parallelism Dynamic Scheduling
COSC4201 Instruction Level Parallelism Dynamic Scheduling Prof. Mokhtar Aboelaze Parts of these slides are taken from Notes by Prof. David Patterson (UCB) Outline Data dependence and hazards Exposing parallelism
More informationSOLUTION. Midterm #1 February 26th, 2018 Professor Krste Asanovic Name:
SOLUTION Notes: CS 152 Computer Architecture and Engineering CS 252 Graduate Computer Architecture Midterm #1 February 26th, 2018 Professor Krste Asanovic Name: I am taking CS152 / CS252 This is a closed
More informationInstruction Level Parallelism. Appendix C and Chapter 3, HP5e
Instruction Level Parallelism Appendix C and Chapter 3, HP5e Outline Pipelining, Hazards Branch prediction Static and Dynamic Scheduling Speculation Compiler techniques, VLIW Limits of ILP. Implementation
More informationHardware-based Speculation
Hardware-based Speculation Hardware-based Speculation To exploit instruction-level parallelism, maintaining control dependences becomes an increasing burden. For a processor executing multiple instructions
More informationUniversity of Toronto Faculty of Applied Science and Engineering
Print: First Name:............ Solutions............ Last Name:............................. Student Number:............................................... University of Toronto Faculty of Applied Science
More informationCS433 Midterm. Prof Josep Torrellas. October 19, Time: 1 hour + 15 minutes
CS433 Midterm Prof Josep Torrellas October 19, 2017 Time: 1 hour + 15 minutes Name: Instructions: 1. This is a closed-book, closed-notes examination. 2. The Exam has 4 Questions. Please budget your time.
More informationCS152 Exam #2 Fall Professor Dave Patterson
CS152 Exam #2 Fall 2003 Professor Dave Patterson Question 1: Potpourri (Jack and Dave s Question) Part A: TLBs entries have valid bits and dirty bits. Data caches have them also. Which of the following
More informationTomasulo s Algorithm
Tomasulo s Algorithm Architecture to increase ILP Removes WAR and WAW dependencies during issue WAR and WAW Name Dependencies Artifact of using the same storage location (variable name) Can be avoided
More informationCourse on Advanced Computer Architectures
Surname (Cognome) Name (Nome) POLIMI ID Number Signature (Firma) SOLUTION Politecnico di Milano, July 9, 2018 Course on Advanced Computer Architectures Prof. D. Sciuto, Prof. C. Silvano EX1 EX2 EX3 Q1
More information/ : Computer Architecture and Design Fall 2014 Midterm Exam Solution
16.482 / 16.561: Computer Architecture and Design Fall 2014 Midterm Exam Solution 1. (8 points) UEvaluating instructions Assume the following initial state prior to executing the instructions below. Note
More informationCS 351 Exam 2, Fall 2012
CS 351 Exam 2, Fall 2012 Your name: Rules You may use one handwritten 8.5 x 11 cheat sheet (front and back). This is the only resource you may consult during this exam. Include explanations and comments
More informationCPE 631 Lecture 10: Instruction Level Parallelism and Its Dynamic Exploitation
Lecture 10: Instruction Level Parallelism and Its Dynamic Exploitation Aleksandar Milenković, milenka@ece.uah.edu Electrical and Computer Engineering University of Alabama in Huntsville Outline Tomasulo
More informationEECC551 Exam Review 4 questions out of 6 questions
EECC551 Exam Review 4 questions out of 6 questions (Must answer first 2 questions and 2 from remaining 4) Instruction Dependencies and graphs In-order Floating Point/Multicycle Pipelining (quiz 2) Improving
More informationProcessor: Superscalars Dynamic Scheduling
Processor: Superscalars Dynamic Scheduling Z. Jerry Shi Assistant Professor of Computer Science and Engineering University of Connecticut * Slides adapted from Blumrich&Gschwind/ELE475 03, Peh/ELE475 (Princeton),
More informationINSTITUTO SUPERIOR TÉCNICO. Architectures for Embedded Computing
UNIVERSIDADE TÉCNICA DE LISBOA INSTITUTO SUPERIOR TÉCNICO Departamento de Engenharia Informática Architectures for Embedded Computing MEIC-A, MEIC-T, MERC Lecture Slides Version 3.0 - English Lecture 09
More informationReorder Buffer Implementation (Pentium Pro) Reorder Buffer Implementation (Pentium Pro)
Reorder Buffer Implementation (Pentium Pro) Hardware data structures retirement register file (RRF) (~ IBM 360/91 physical registers) physical register file that is the same size as the architectural registers
More informationComputer System Architecture Final Examination Spring 2002
Computer System Architecture 6.823 Final Examination Spring 2002 Name: This is an open book, open notes exam. 180 Minutes 22 Pages Notes: Not all questions are of equal difficulty, so look over the entire
More informationChapter 3 (CONT II) Instructor: Josep Torrellas CS433. Copyright J. Torrellas 1999,2001,2002,2007,
Chapter 3 (CONT II) Instructor: Josep Torrellas CS433 Copyright J. Torrellas 1999,2001,2002,2007, 2013 1 Hardware-Based Speculation (Section 3.6) In multiple issue processors, stalls due to branches would
More informationDAT105: Computer Architecture Study Period 2, 2009 Exercise 3 Chapter 2: Instruction-Level Parallelism and Its Exploitation
Study Period 2, 2009 Exercise 3 Chapter 2: Instruction-Level Parallelism and Its Exploitation Mafijul Islam Department of Computer Science and Engineering November 19, 2009 Study Period 2, 2009 Goals:
More informationEECS 470 Midterm Exam Answer Key Fall 2004
EECS 470 Midterm Exam Answer Key Fall 2004 Name: unique name: Sign the honor code: I have neither given nor received aid on this exam nor observed anyone else doing so. Scores: # Points Part I /23 Part
More informationCMSC411 Fall 2013 Midterm 2 Solutions
CMSC411 Fall 2013 Midterm 2 Solutions 1. (12 pts) Memory hierarchy a. (6 pts) Suppose we have a virtual memory of size 64 GB, or 2 36 bytes, where pages are 16 KB (2 14 bytes) each, and the machine has
More informationUniversity of Toronto Faculty of Applied Science and Engineering
Print: First Name:......... SOLUTION............... Last Name:............................. Student Number:............................................... University of Toronto Faculty of Applied Science
More informationFull Name: NetID: Midterm Summer 2017
Full Name: NetID: Midterm Summer 2017 OAKLAND UNIVERSITY, School of Engineering and Computer Science CSE 564: Computer Architecture Please write and/or mark your answers clearly and neatly; answers that
More informationCMSC411 Fall 2013 Midterm 1
CMSC411 Fall 2013 Midterm 1 Name: Instructions You have 75 minutes to take this exam. There are 100 points in this exam, so spend about 45 seconds per point. You do not need to provide a number if you
More informationEITF20: Computer Architecture Part3.2.1: Pipeline - 3
EITF20: Computer Architecture Part3.2.1: Pipeline - 3 Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Dynamic scheduling - Tomasulo Superscalar, VLIW Speculation ILP limitations What we have done
More informationInstruction Frequency CPI. Load-store 55% 5. Arithmetic 30% 4. Branch 15% 4
PROBLEM 1: An application running on a 1GHz pipelined processor has the following instruction mix: Instruction Frequency CPI Load-store 55% 5 Arithmetic 30% 4 Branch 15% 4 a) Determine the overall CPI
More informationStructure of Computer Systems
288 between this new matrix and the initial collision matrix M A, because the original forbidden latencies for functional unit A still have to be considered in later initiations. Figure 5.37. State diagram
More informationCIS 662: Midterm. 16 cycles, 6 stalls
CIS 662: Midterm Name: Points: /100 First read all the questions carefully and note how many points each question carries and how difficult it is. You have 1 hour 15 minutes. Plan your time accordingly.
More informationCS152 Computer Architecture and Engineering March 13, 2008 Out of Order Execution and Branch Prediction Assigned March 13 Problem Set #4 Due March 25
CS152 Computer Architecture and Engineering March 13, 2008 Out of Order Execution and Branch Prediction Assigned March 13 Problem Set #4 Due March 25 http://inst.eecs.berkeley.edu/~cs152/sp08 The problem
More information/ : Computer Architecture and Design Fall Midterm Exam October 16, Name: ID #:
16.482 / 16.561: Computer Architecture and Design Fall 2014 Midterm Exam October 16, 2014 Name: ID #: For this exam, you may use a calculator and two 8.5 x 11 double-sided page of notes. All other electronic
More informationReduction of Data Hazards Stalls with Dynamic Scheduling So far we have dealt with data hazards in instruction pipelines by:
Reduction of Data Hazards Stalls with Dynamic Scheduling So far we have dealt with data hazards in instruction pipelines by: Result forwarding (register bypassing) to reduce or eliminate stalls needed
More informationCPE 631 Lecture 11: Instruction Level Parallelism and Its Dynamic Exploitation
Lecture 11: Instruction Level Parallelism and Its Dynamic Exploitation Aleksandar Milenkovic, milenka@ece.uah.edu Electrical and Computer Engineering University of Alabama in Huntsville Outline Instruction
More informationChapter 4. The Processor
Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified
More informationInstruction Level Parallelism (ILP)
Instruction Level Parallelism (ILP) Pipelining supports a limited sense of ILP e.g. overlapped instructions, out of order completion and issue, bypass logic, etc. Remember Pipeline CPI = Ideal Pipeline
More informationDynamic Scheduling. Chapter 2: Instruction Level Parallelism. Dynamic Scheduling. Dynamic Scheduling
Chapter 2: Instruction Level Parallelism Dynamic instruction scheduling Tomasulo s algorithm Advanced superscalar processors branch prediction, reorder buffer Case studies R10000, K5, Alpha 21264, P6 Compiler
More informationECE 313 Computer Organization FINAL EXAM December 14, This exam is open book and open notes. You have 2 hours.
This exam is open book and open notes. You have 2 hours. Problems 1-4 refer to a proposed MIPS instruction lwu (load word - update) which implements update addressing an addressing mode that is used in
More informationFinal Exam Fall 2008
COE 308 Computer Architecture Final Exam Fall 2008 page 1 of 8 Saturday, February 7, 2009 7:30 10:00 AM Computer Engineering Department College of Computer Sciences & Engineering King Fahd University of
More informationScoreboard information (3 tables) Four stages of scoreboard control
Scoreboard information (3 tables) Instruction : issued, read operands and started execution (dispatched), completed execution or wrote result, Functional unit (assuming non-pipelined units) busy/not busy
More informationECE 411, Exam 1. Good luck!
This exam has 6 problems. Make sure you have a complete exam before you begin. Write your name on every page in case pages become separated during grading. You will have three hours to complete this exam.
More informationLecture 16: Core Design. Today: basics of implementing a correct ooo core: register renaming, commit, LSQ, issue queue
Lecture 16: Core Design Today: basics of implementing a correct ooo core: register renaming, commit, LSQ, issue queue 1 The Alpha 21264 Out-of-Order Implementation Reorder Buffer (ROB) Branch prediction
More informationWebsite for Students VTU NOTES QUESTION PAPERS NEWS RESULTS
Advanced Computer Architecture- 06CS81 Hardware Based Speculation Tomasulu algorithm and Reorder Buffer Tomasulu idea: 1. Have reservation stations where register renaming is possible 2. Results are directly
More informationThe basic structure of a MIPS floating-point unit
Tomasulo s scheme The algorithm based on the idea of reservation station The reservation station fetches and buffers an operand as soon as it is available, eliminating the need to get the operand from
More informationFor this problem, consider the following architecture specifications: Functional Unit Type Cycles in EX Number of Functional Units
CS333: Computer Architecture Spring 006 Homework 3 Total Points: 49 Points (undergrad), 57 Points (graduate) Due Date: Feb. 8, 006 by 1:30 pm (See course information handout for more details on late submissions)
More informationCS252 Graduate Computer Architecture Midterm 1 Solutions
CS252 Graduate Computer Architecture Midterm 1 Solutions Part A: Branch Prediction (22 Points) Consider a fetch pipeline based on the UltraSparc-III processor (as seen in Lecture 5). In this part, we evaluate
More informationCS152 Computer Architecture and Engineering. Complex Pipelines
CS152 Computer Architecture and Engineering Complex Pipelines Assigned March 6 Problem Set #3 Due March 20 http://inst.eecs.berkeley.edu/~cs152/sp12 The problem sets are intended to help you learn the
More informationCSE 240A Midterm Exam
Student ID Page 1 of 7 2011 Fall Professor Steven Swanson CSE 240A Midterm Exam Please write your name at the top of each page This is a close book, closed notes exam. No outside material may be used.
More informationMIPS Pipelining. Computer Organization Architectures for Embedded Computing. Wednesday 8 October 14
MIPS Pipelining Computer Organization Architectures for Embedded Computing Wednesday 8 October 14 Many slides adapted from: Computer Organization and Design, Patterson & Hennessy 4th Edition, 2011, MK
More informationCOSC 6385 Computer Architecture - Pipelining
COSC 6385 Computer Architecture - Pipelining Fall 2006 Some of the slides are based on a lecture by David Culler, Instruction Set Architecture Relevant features for distinguishing ISA s Internal storage
More informationChapter 4. Advanced Pipelining and Instruction-Level Parallelism. In-Cheol Park Dept. of EE, KAIST
Chapter 4. Advanced Pipelining and Instruction-Level Parallelism In-Cheol Park Dept. of EE, KAIST Instruction-level parallelism Loop unrolling Dependence Data/ name / control dependence Loop level parallelism
More informationDYNAMIC AND SPECULATIVE INSTRUCTION SCHEDULING
DYNAMIC AND SPECULATIVE INSTRUCTION SCHEDULING Slides by: Pedro Tomás Additional reading: Computer Architecture: A Quantitative Approach, 5th edition, Chapter 3, John L. Hennessy and David A. Patterson,
More informationProcessor (II) - pipelining. Hwansoo Han
Processor (II) - pipelining Hwansoo Han Pipelining Analogy Pipelined laundry: overlapping execution Parallelism improves performance Four loads: Speedup = 8/3.5 =2.3 Non-stop: 2n/0.5n + 1.5 4 = number
More informationShort Answer: [3] What is the primary difference between Tomasulo s algorithm and Scoreboarding?
Short Answer: [] What is the primary difference between Tomasulo s algorithm and Scoreboarding? [] Which data hazard occurs when instructions are allowed to complete out of order? Which one occurs when
Lecture 9: Dynamic ILP. Topics: out-of-order processors (Sections )
Lecture 9: Dynamic ILP Topics: out-of-order processors (Sections 2.3-2.6) 1 An Out-of-Order Processor Implementation Reorder Buffer (ROB) Branch prediction and instr fetch R1 R1+R2 R2 R1+R3 BEQZ R2 R3
Floating Point/Multicycle Pipelining in DLX
Floating Point/Multicycle Pipelining in DLX Completion of DLX EX stage floating point arithmetic operations in one or two cycles is impractical since it requires: A much longer CPU clock cycle, and/or
Page 1. Recall from Pipelining Review. Lecture 16: Instruction Level Parallelism and Dynamic Execution #1: Ideas to Reduce Stalls
CS252 Graduate Computer Architecture Recall from Pipelining Review Lecture 16: Instruction Level Parallelism and Dynamic Execution #1: March 16, 2001 Prof. David A. Patterson Computer Science 252 Spring
DYNAMIC SPECULATIVE EXECUTION
DYNAMIC SPECULATIVE EXECUTION Slides by: Pedro Tomás Additional reading: Computer Architecture: A Quantitative Approach, 5th edition, Chapter 3, John L. Hennessy and David A. Patterson, Morgan Kaufmann,
Pipelining and Vector Processing
Chapter 8 Pipelining and Vector Processing 8 1 If the pipeline stages are heterogeneous, the slowest stage determines the flow rate of the entire pipeline. This leads to other stages idling. 8 2 Pipeline
EECS 470 Midterm Exam Winter 2008 answers
EECS 470 Midterm Exam Winter 2008 answers Name: KEY unique name: KEY Sign the honor code: I have neither given nor received aid on this exam nor observed anyone else doing so. Scores: #Page Points 2 /10
CISC 662 Graduate Computer Architecture. Lecture 10 - ILP 3
CISC 662 Graduate Computer Architecture Lecture 10 - ILP 3 Michela Taufer http://www.cis.udel.edu/~taufer/teaching/cis662f07 Powerpoint Lecture Notes from John Hennessy and David Patterson's: Computer
CS 2506 Computer Organization II Test 2
Instructions: Print your name in the space provided below. This examination is closed book and closed notes, aside from the permitted one-page formula sheet. No calculators or other computing devices may
THE HONG KONG UNIVERSITY OF SCIENCE & TECHNOLOGY Computer Organization (COMP 2611) Spring Semester, 2014 Final Examination
THE HONG KONG UNIVERSITY OF SCIENCE & TECHNOLOGY Computer Organization (COMP 2611) Spring Semester, 2014 Final Examination May 23, 2014 Name: Email: Student ID: Lab Section Number: Instructions: 1. This
E0-243: Computer Architecture
E0-243: Computer Architecture L1 ILP Processors RG:E0243:L1-ILP Processors 1 ILP Architectures Superscalar Architecture VLIW Architecture EPIC, Subword Parallelism, RG:E0243:L1-ILP Processors 2 Motivation
ECE Sample Final Examination
ECE 3056 Sample Final Examination 1 Overview The following applies to all problems unless otherwise explicitly stated. Consider a 2 GHz MIPS processor with a canonical 5-stage pipeline and 32 general-purpose
University of Toronto Faculty of Applied Science and Engineering
Print: First Name:............ Solutions............ Last Name:............................. Student Number:............................................... University of Toronto Faculty of Applied Science
Midterm I SOLUTIONS March 18, 2009 CS252 Graduate Computer Architecture
University of California, Berkeley College of Engineering Computer Science Division EECS Spring 2009 John Kubiatowicz Midterm I SOLUTIONS March 18, 2009 CS252 Graduate Computer Architecture Your Name:
Hardware-Based Speculation
Hardware-Based Speculation Execute instructions along predicted execution paths but only commit the results if prediction was correct Instruction commit: allowing an instruction to update the register
The Processor Pipeline. Chapter 4, Patterson and Hennessy, 4ed. Section 5.3, 5.4: J P Hayes.
The Processor Pipeline Chapter 4, Patterson and Hennessy, 4ed. Section 5.3, 5.4: J P Hayes. Pipeline A Basic MIPS Implementation Memory-reference instructions Load Word (lw) and Store Word (sw) ALU instructions
Updated Exercises by Diana Franklin
C-82 Appendix C Pipelining: Basic and Intermediate Concepts Updated Exercises by Diana Franklin C.1 [15/15/15/15/25/10/15] Use the following code fragment: Loop: LD R1,0(R2) ;load R1 from address
CS 2410 Mid term (fall 2018)
CS 2410 Mid term (fall 2018) Name: Question 1 (6+6+3=15 points): Consider two machines, the first being a 5-stage operating at 1ns clock and the second is a 12-stage operating at 0.7ns clock. Due to data
Lecture: Out-of-order Processors. Topics: out-of-order implementations with issue queue, register renaming, and reorder buffer, timing, LSQ
Lecture: Out-of-order Processors Topics: out-of-order implementations with issue queue, register renaming, and reorder buffer, timing, LSQ 1 An Out-of-Order Processor Implementation Reorder Buffer (ROB)
CS252 Graduate Computer Architecture Lecture 6. Recall: Software Pipelining Example
CS252 Graduate Computer Architecture Lecture 6 Tomasulo, Implicit Register Renaming, Loop-Level Parallelism Extraction Explicit Register Renaming John Kubiatowicz Electrical Engineering and Computer Sciences
Page # CISC 662 Graduate Computer Architecture. Lecture 8 - ILP 1. Pipeline CPI. Pipeline CPI (I) Michela Taufer
CISC 662 Graduate Computer Architecture Lecture 8 - ILP 1 Michela Taufer http://www.cis.udel.edu/~taufer/teaching/cis662f07 Powerpoint Lecture Notes from John Hennessy and David Patterson's: Computer Architecture,
Complex Pipelining: Out-of-order Execution & Register Renaming. Multiple Function Units
6.823, L14--1 Complex Pipelining: Out-of-order Execution & Register Renaming Laboratory for Computer Science MIT http://www.csg.lcs.mit.edu/6.823 Multiple Function Units 6.823, L14--2 ALU Mem IF ID Issue WB Fadd
Lecture 11: Out-of-order Processors. Topics: more ooo design details, timing, load-store queue
Lecture 11: Out-of-order Processors Topics: more ooo design details, timing, load-store queue 1 Problem 0 Show the renamed version of the following code: Assume that you have 36 physical registers and
Advanced Computer Architecture. Chapter 4: More sophisticated CPU architectures
Advanced Computer Architecture Chapter 4: More sophisticated CPU architectures Lecturer: Paul H J Kelly Autumn 2001 Department of Computing Imperial College Room 423 email: phjk@doc.ic.ac.uk Course web
CENG 3531 Computer Architecture Spring a. T / F A processor can have different CPIs for different programs.
Exam 2 April 12, 2012 You have 80 minutes to complete the exam. Please write your answers clearly and legibly on this exam paper. GRADE: Name. Class ID. 1. (22 pts) Circle the selected answer for T/F and
Lecture 7: Pipelining Contd. More pipelining complications: Interrupts and Exceptions
Lecture 7: Pipelining Contd. Kunle Olukotun Gates 302 kunle@ogun.stanford.edu http://www-leland.stanford.edu/class/ee282h/ 1 More pipelining complications: Interrupts and Exceptions Hard to handle in pipelined
Basic Pipelining Concepts
Basic Pipelining Concepts Appendix A (recommended reading, not everything will be covered today) Basic pipelining Pipeline hazards Data hazards Control hazards Structural hazards Multicycle operations Execution
Dynamic Scheduling. CSE471 Susan Eggers 1
Dynamic Scheduling Why go out of style? expensive hardware for the time (actually, still is, relatively) register files grew so less register pressure early RISCs had lower CPIs Why come back? higher chip
CS / ECE 6810 Midterm Exam - Oct 21st 2008
Name and ID: CS / ECE 6810 Midterm Exam - Oct 21st 2008 Notes: This is an open notes and open book exam. If necessary, make reasonable assumptions and clearly state them. The only clarifications you may