Processor Design Pipelined Processor. Hung-Wei Tseng
1 Processor Design Pipelined Processor Hung-Wei Tseng
2 Pipelining
3 Pipelining
    Break up the logic with pipeline registers into pipeline stages.
    Each stage can act on a different instruction or piece of data.
    The state and control signals of each instruction are held in the pipeline registers (latches).
4 Pipelining
    [Pipeline diagram: cycles #1 through #5.]
    After the 5th cycle, the processor can work on 5 instructions in parallel.
5 Pipelining
    [Pipeline diagram: cycles #6 through #10.]
    The processor can complete 1 instruction each cycle: CPI == 1 if everything works perfectly!
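The timing on the two slides above can be checked with a little arithmetic. The following is a minimal sketch that assumes an ideal pipeline with no hazards; the function names are illustrative and do not come from the slides.

```python
def pipelined_cycles(num_instructions: int, num_stages: int = 5) -> int:
    """Cycles needed to finish a run of instructions when nothing ever stalls."""
    if num_instructions <= 0:
        return 0
    # The first instruction needs num_stages cycles to leave the pipeline;
    # every later instruction finishes one cycle after its predecessor.
    return num_stages + (num_instructions - 1)


def average_cpi(num_instructions: int, num_stages: int = 5) -> float:
    """Average cycles per instruction, including the time to fill the pipeline."""
    return pipelined_cycles(num_instructions, num_stages) / num_instructions
```

For example, `pipelined_cycles(5)` gives 9 cycles for five instructions, and `average_cpi(n)` approaches 1 as `n` grows, which is the "CPI == 1" claim on the slide.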
6 Single-cycle vs. pipeline
7 Cycle time of a pipelined processor
    The critical path is the longest possible delay between two registers in a design.
    The critical path sets the cycle time, since the cycle time must be long enough for a signal to traverse the critical path.
    Lengthening or shortening non-critical paths does not change performance.
    Ideally, all paths are about the same length.
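A small sketch of the critical-path rule follows. The stage delays are hypothetical numbers chosen only for illustration, and the overhead of the pipeline registers themselves is ignored.

```python
def cycle_time(stage_delays_ps):
    """The slowest stage (the critical path) sets the clock period."""
    return max(stage_delays_ps)


def single_cycle_time(stage_delays_ps):
    """Without pipeline registers, one cycle must cover every stage in turn."""
    return sum(stage_delays_ps)


# Hypothetical delays for five stages, in picoseconds (not from the slides).
delays = [200, 100, 200, 200, 100]
shorter_non_critical = [200, 50, 200, 200, 100]   # a non-critical stage made faster
balanced = [160, 160, 160, 160, 160]              # same total work, evenly split
```

With these numbers, `cycle_time(delays)` and `cycle_time(shorter_non_critical)` are both 200 ps, so speeding up a non-critical stage gains nothing, while the balanced split brings the cycle time down to 160 ps.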
8 Designing a 5-stage pipeline processor for MIPS
9 Basic steps of execution
    Instruction fetch: where? Instruction memory.
    Decode: what's the instruction? Where are the operands? Registers.
    Execute: ALUs.
    Memory access: where is my data? Data memory.
    Write back: where to put the result? Registers.
    Determine the next PC.
    [Figure: a processor with PC, registers, and ALU, connected to an instruction memory and a data memory.]
10 Pipeline a MIPS processor
    Instruction Fetch (IF): read the instruction from instruction memory.
    Instruction Decode (ID): figure out the incoming instruction and fetch the operands from the registers.
    Execution (EX): perform ALU functions.
    Memory Access (MEM): read/write data memory.
    Write Back (WB): write results to the register file.
11 From single-cycle to pipeline
    [Datapath diagram: the single-cycle MIPS datapath divided into Instruction Fetch, Instruction Decode, Execution, Memory Access, and Write Back by the pipeline registers IF/ID, ID/EX, EX/MEM, and MEM/WB; PCSrc = Branch & Zero.]
    Will this work?
12-13 Pipelined processor
    [Datapath diagram with the following sequence moving through the pipeline:]
        add $1, $2, $3
        lw  $4, 0($5)
        sub $6, $7, $8
        sub $9, $10, $11
        sw  $1, 0($12)
14-15 Pipelined processor
    [Datapath diagram: the control signals of each instruction travel with it through the pipeline registers.]
    Where can I find these?
16 Pipelined processor
    [Datapath diagram with RegWrite highlighted.]
    Is this right?
17 Pipelined processor
    [Datapath diagram: RegDst is applied in the EX stage, and RegWrite and MemtoReg come from the end of the pipeline.]
18 5-stage pipelined processor
    [Datapath diagram: the complete 5-stage pipelined datapath with the IF/ID, ID/EX, EX/MEM, and MEM/WB pipeline registers.]
19 Simplified pipeline diagram
    Use symbols to represent the physical resources, labeled with the abbreviations of the pipeline stages: IF, ID, EX, MEM, WB.
    The horizontal axis represents the timeline; the vertical axis represents the instruction stream.
    Example:
        add $1, $2, $3
        lw  $4, 0($5)
        sub $6, $7, $8
        sub $9, $10, $11
        sw  $1, 0($12)
20 Pipeline hazards
21 Pipeline hazards
    Even if we divide the pipeline stages perfectly, it's still hard to achieve CPI == 1.
    Pipeline hazards:
    Structural hazard: the hardware does not allow two pipeline stages to work concurrently.
    Data hazard: a later instruction in a pipeline stage depends on the outcome of an earlier instruction in the pipeline.
    Control hazard: the processor is not clear about which instruction to fetch next.
22 Can we get the right result?
    Given the current 5-stage pipeline, how many of the following MIPS code sequences can work correctly? The instructions in each sequence are labeled a through e.
    I:   a: add $1, $2, $3   b: lw $4, 0($1)   c: sub $6, $7, $8   d: sub $9, $10, $11   e: sw $1, 0($12)
    II:  a: add $1, $2, $3   b: lw $4, 0($5)   c: sub $6, $7, $8   d: sub $9, $1, $10    e: sw $1, 0($12)
    III: a: add $1, $2, $3   b: lw $4, 0($5)   c: bne $0, $7, L    d: sub $9, $10, $11   e: sw $1, 0($12)
    IV:  a: add $1, $2, $3   b: lw $4, 0($5)   c: sub $6, $7, $8   d: sub $9, $10, $11   e: sw $1, 0($12)
    I: b cannot get the $1 produced by a before a reaches WB. Data hazard.
    II: both a and d are accessing $1 in the 5th cycle. Structural hazard.
    III: we don't know whether d and e will be executed or not. Control hazard.
23 Structural hazard
24 Structural hazard
    The hardware cannot support the combination of instructions that we want to execute in the same cycle.
    The original pipeline incurs a structural hazard when two instructions compete for the same register.
    Solution: write early, read late.
    Writes occur at the clock edge and complete long enough before the end of the clock cycle. This leaves enough time for the outputs to settle for reads.
    The revised register file is the default one from now on!
    [Pipeline diagram of the sequence add $1, $2, $3; lw $4, 0($5); sub $6, $7, $8; sub $9, $1, $10; sw $1, 0($12).]
25 Structural hazard
    The design of the hardware causes structural hazards.
    We need to modify the hardware design to avoid structural hazards.
26 Data hazard
27 Data hazard
    A data hazard occurs when an instruction in the pipeline needs a value that is not available yet.
    Data dependences: the output of an instruction is the input of a later instruction.
    A dependence may result in a data hazard if the earlier instruction that produces the result is still in the pipeline when the later instruction needs it.
28 Sol. of data hazard I: Stall
    When the source operand of an instruction is not ready, stall the pipeline.
    Suspend the instruction and the instructions that follow it.
    Allow the previous instructions to proceed.
    This introduces a pipeline bubble: a bubble does nothing and propagates through the pipeline like a nop instruction.
    How to stall the pipeline? Disable the PC update and disable the pipeline registers of the earlier pipeline stages.
    When the stall is over, re-enable the pipeline registers and the PC updates.
29 Hazard detection & stall
    [Datapath diagram: a hazard detection unit drives PCWrite and IF/IDWrite.]
    Check if the destination register of the instruction in EX == a source register of the instruction in ID.
    Check if the destination register of the instruction in MEM == a source register of the instruction in ID.
    Insert a noop if we need to stall.
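The two checks made by the hazard detection unit can be sketched as a small function. This is only an illustration for a pipeline without forwarding; the `Instr` record and the `must_stall` name are assumptions made for the example, not part of the slides.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    """Hypothetical summary of an instruction sitting in a pipeline stage."""
    dest: Optional[int]      # register number written, or None if nothing is written
    srcs: Tuple[int, ...]    # register numbers read


def must_stall(in_id: Instr, in_ex: Optional[Instr], in_mem: Optional[Instr]) -> bool:
    """True if the instruction in ID has to wait for a result still in EX or MEM."""
    for older in (in_ex, in_mem):
        if older is None or older.dest is None:
            continue                 # empty stage (bubble) or no register write
        if older.dest == 0:
            continue                 # MIPS register $0 is hard-wired to zero
        if older.dest in in_id.srcs:
            return True              # ID would read a stale value: insert a noop
    return False
```

An instruction in WB needs no check here because the write-early, read-late register file lets ID read the value written in the same cycle.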
30 Performance of stall
    Insert a noop in the EX stage. Insert another noop in the EX stage while the previous noop goes to the MEM stage.
        add $1, $2, $3
        lw  $4, 0($1)
        sub $5, $2, $4
        sub $1, $3, $1
        sw  $1, 0($5)
    15 cycles! CPI == 3 (If there is no stall, CPI should be just 1!)
31 Sol. of data hazard II: Forwarding
    The result is available after the EX and MEM stages, but it is only publicized in WB!
    The data is already there; we should use it right away!
    Also called bypassing.
    [Pipeline diagram of the same sequence, marking the end of EX:] We can obtain the result here!
32 Sol. of data hazard II: Forwarding
    Take the values, wherever they are!
    [Pipeline diagram of the same sequence with forwarding.]
    10 cycles! CPI == 2 (Not optimal, but much better!)
33 When can/should we forward data?
    If the instruction entering the EX stage consumes a result from a previous instruction that is entering the MEM stage or the WB stage.
    A source of the instruction entering the EX stage is the destination of an instruction entering the MEM or WB stage.
    The previous instruction must be an instruction that updates the register file.
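The forwarding conditions above can be written out as a selection function for one ALU operand. This is a sketch in the style of the usual textbook forwarding unit; the function name, argument names, and returned labels are assumptions made for the example.

```python
def forward_select(src_reg, ex_mem_reg_write, ex_mem_rd, mem_wb_reg_write, mem_wb_rd):
    """Choose the source of one ALU operand for the instruction entering EX.

    src_reg is rs or rt of that instruction; the other arguments describe the
    two older instructions that are entering MEM and WB.
    """
    # Register $0 is hard-wired to zero in MIPS, so it is never forwarded.
    if ex_mem_reg_write and ex_mem_rd != 0 and ex_mem_rd == src_reg:
        return "EX/MEM"    # newest value, from the immediately preceding instruction
    if mem_wb_reg_write and mem_wb_rd != 0 and mem_wb_rd == src_reg:
        return "MEM/WB"    # value from two instructions back
    return "REGFILE"       # no hazard: use the value read in ID
```

Checking EX/MEM first matters when both older instructions write the same register: the more recent result must win.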
34 Forwarding in hardware
    [Datapath diagram: a forwarding unit compares Rs and Rt of the current instruction (Ins#2) with the destination of the previous instruction (Ins#1), checks the control of Ins#1 (RegWrite), and drives ForwardA and ForwardB so that the ALU result of Ins#1 can be selected as an ALU input.]
    How about load?
35 Forwarding in hardware
    [Datapath diagram: the forwarding unit also receives the ALU/MEM result, the control, and Rd of Ins#1 from the later pipeline register.]
36 There is still a case that we have to stall...
    Revisit the following code:
        add $1, $2, $3
        lw  $4, 0($1)
        sub $5, $2, $4
        sub $1, $3, $1
        sw  $1, 0($5)
    lw generates its result at the MEM stage, so we have to stall.
    If the instruction entering the EX stage depends on a load instruction that has not finished its MEM stage yet, we have to stall!
    We call this hazard detection. We need to know the following:
    1. If an instruction in EX/MEM updates a register (RegWrite)
    2. If an instruction in EX/MEM reads memory (MemRead)
    3. If the destination register of EX/MEM is a source of ID/EX (rs, rt of ID/EX == rt of EX/MEM)
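The three conditions listed above combine into a single test. The sketch below follows the slide's wording; the function and parameter names are illustrative assumptions.

```python
def load_use_stall(ex_mem_reg_write, ex_mem_mem_read, ex_mem_rt, id_ex_rs, id_ex_rt):
    """True if the instruction in ID/EX must wait one cycle for a load's data."""
    return bool(
        ex_mem_reg_write                        # 1. the older instruction updates a register
        and ex_mem_mem_read                     # 2. it is a load (reads memory)
        and ex_mem_rt != 0                      #    writes to $0 never create a dependence
        and ex_mem_rt in (id_ex_rs, id_ex_rt)   # 3. its destination is one of our sources
    )
```

For `lw $4, 0($1)` followed by `sub $5, $2, $4`, the call `load_use_stall(True, True, 4, 2, 4)` reports a stall, while an ALU instruction in EX/MEM (MemRead off) never does, because its result can be forwarded.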
37 Hazard detection with forwarding
    [Datapath diagram: the hazard detection unit (driving PCWrite and IF/IDWrite, and reading ID/EX.MemRead) together with the forwarding unit (driving ForwardA and ForwardB).]
38 Control hazard
39 Control hazard
    The processor cannot determine the next PC to fetch.
        LOOP: lw   $t3, 0($s0)
              addi $t0, $t0, 1
              add  $v0, $v0, $t3
              addi $s0, $s0, 4
              bne  $t1, $t0, LOOP
              lw   $t3, 0($s0)      (stall until the branch is resolved)
    7 cycles per loop.
40 Reducing the overhead of control hazards
41 Solution I: Delayed branches
    An agreement between the ISA and the hardware.
    Branch delay slots: the next N instructions after a branch are always executed.
    The compiler decides which instructions go in the branch delay slots. Reordering the instructions cannot affect the correctness of the program.
    MIPS has one branch delay slot.
    Good: simple hardware.
    Bad: N cannot change, and sometimes the compiler cannot find good candidates for the slot.
42 Solution I: Delayed branches
    Original:
        LOOP: lw   $t3, 0($s0)
              addi $t0, $t0, 1
              add  $v0, $v0, $t3
              addi $s0, $s0, 4
              bne  $t1, $t0, LOOP
    With the branch delay slot filled:
        LOOP: lw   $t3, 0($s0)
              addi $t0, $t0, 1
              add  $v0, $v0, $t3
              bne  $t1, $t0, LOOP
              addi $s0, $s0, 4      (branch delay slot)
    [Pipeline diagram: the next lw $t3, 0($s0) still stalls.]
    6 cycles per loop.
43 Solution II: always predict not-taken
    Always predict that the next PC is PC+4.
        LOOP: lw   $t3, 0($s0)
              addi $t0, $t0, 1
              add  $v0, $v0, $t3
              addi $s0, $s0, 4
              bne  $t1, $t0, LOOP
              sw   $v0, 0($s1)      (fetched on the predicted path)
              add  $t4, $t3, $t5    (fetched on the predicted path)
    If the branch is not taken: no stalls!
    If the branch is taken: doesn't hurt! Flush the instructions fetched incorrectly (they become nops), then fetch lw $t3, 0($s0) from LOOP.
    7 cycles per loop.
44 Solution III: always predict taken
    [Datapath diagram: the pipelined datapath with the hazard detection unit and the forwarding unit.]
45 Solution III: always predict taken
    [Datapath diagram: the shift-left-2 unit and the branch target adder moved earlier in the pipeline.]
    Still have to stall 1 cycle.
46 Solution III: always predict taken
    [Datapath diagram: a Branch Target Buffer added next to the PC.]
    Consult the BTB in the fetch stage.
47 Branch Target Buffer
    [Diagram: the PC is looked up in the Branch Target Buffer; each entry holds a branch PC and its target address or target instruction.]
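A Branch Target Buffer can be pictured as a small lookup table keyed by the branch PC. The sketch below uses an unbounded Python dictionary and made-up addresses; a real BTB has a fixed number of entries and tag matching, which this example deliberately leaves out.

```python
class BranchTargetBuffer:
    """Toy BTB: maps the PC of a branch to the target it jumped to last time."""

    def __init__(self):
        self.targets = {}

    def predict_next_pc(self, pc: int) -> int:
        # Hit: predict taken and fetch from the stored target.
        # Miss: fall back to the sequential address PC+4.
        return self.targets.get(pc, pc + 4)

    def update(self, pc: int, target: int) -> None:
        # Called once a taken branch has been resolved.
        self.targets[pc] = target


btb = BranchTargetBuffer()
first_guess = btb.predict_next_pc(0x4010)     # unknown branch: sequential fetch
btb.update(0x4010, 0x4000)                    # the branch at 0x4010 jumped to 0x4000
second_guess = btb.predict_next_pc(0x4010)    # now predicted taken
```

Because the lookup needs only the PC, it can happen in the fetch stage, before the instruction has even been decoded.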
48 Solution III: always predict taken
    Always predict taken with the help of the BTB.
        LOOP: lw   $t3, 0($s0)
              addi $t0, $t0, 1
              add  $v0, $v0, $t3
              addi $s0, $s0, 4
              bne  $t1, $t0, LOOP
              lw   $t3, 0($s0)
              addi $t0, $t0, 1
              add  $v0, $v0, $t3
    5 cycles per loop (CPI == 1!!!)
    But what if the branch is not always taken?
49 Dynamic branch prediction
50 1-bit counter
    Predict that this branch will go the same way as the last time it executed: 1 for taken, 0 for not taken.
    [Diagram: the PC selects an entry of the Branch Target Buffer; each entry holds a branch PC, a target address, and a 1-bit counter, and the selected entry predicts Taken.]
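The 1-bit scheme can be simulated in a few lines. This is a sketch for a single branch; the function name, the initial state, and the example outcome stream are assumptions chosen for illustration.

```python
def one_bit_accuracy(outcomes, initial_taken=True):
    """Fraction of correct predictions when predicting 'same as last time'."""
    last_taken = initial_taken
    correct = 0
    for taken in outcomes:
        if last_taken == taken:
            correct += 1
        last_taken = taken      # the single bit simply records the latest outcome
    return correct / len(outcomes)


# A loop branch that is taken 9 times and then falls through, executed twice.
loop_twice = ([True] * 9 + [False]) * 2
```

On `loop_twice` the predictor is wrong at each loop exit and again on the first iteration of the second run, because the exit flipped the bit. That double penalty is what the 2-bit counter is meant to avoid.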
51 2-bit counter
    A 2-bit counter for each branch.
    Predict taken if the counter value >= 2.
    If the prediction is in one of the taken states, fetch from the target PC; otherwise, use PC+4.
    States: Taken 3 (11), Taken 2 (10), Not Taken 1 (01), Not Taken 0 (00).
    [State diagram: taken and not-taken outcomes move the counter between these states. The PC selects an entry of the Branch Target Buffer, and the selected entry predicts Taken.]
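A 2-bit counter is commonly implemented as a saturating counter, which is the assumption made in this sketch; the exact transitions of the slide's state diagram did not survive, so treat the update rule as one plausible reading. The class and function names are illustrative.

```python
class TwoBitCounter:
    """Saturating 2-bit counter: states 0..3, predict taken when state >= 2."""

    def __init__(self, state: int = 2):
        self.state = state

    def predict_taken(self) -> bool:
        return self.state >= 2

    def update(self, taken: bool) -> None:
        if taken:
            self.state = min(3, self.state + 1)   # move toward "Taken 3"
        else:
            self.state = max(0, self.state - 1)   # move toward "Not Taken 0"


def accuracy(outcomes, counter):
    """Run one branch's outcome stream through the counter."""
    correct = 0
    for taken in outcomes:
        if counter.predict_taken() == taken:
            correct += 1
        counter.update(taken)
    return correct / len(outcomes)


# A loop branch taken 9 times and then not taken, executed twice.
loop_twice = ([True] * 9 + [False]) * 2
```

Starting from state 3, a single not-taken outcome only drops the counter to 2, so the first iteration of the next run is still predicted taken.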
52 Performance of 2-bit counter
    A 2-bit state machine for each branch.
        for (i = 0; i < 10; i++) {
            sum += a[i];
        }
    [State diagram: Taken 3 (11), Taken 2 (10), Not Taken 1 (01), Not Taken 0 (00).]
    The branch is predicted taken every time. The prediction is correct on every iteration through i = 9 and wrong only on the final, not-taken branch: 90% accuracy!
    Application: 80% ALU, 20% branch, and branches are resolved in the EX stage. Average CPI?
    1 + 20% * (1 - 90%) * 2 = 1.04
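The CPI estimate above is a base CPI plus the expected misprediction penalty per instruction. A minimal sketch, with illustrative parameter names:

```python
def average_cpi(base_cpi, branch_fraction, prediction_accuracy, penalty_cycles):
    """Base CPI plus the average number of penalty cycles per instruction."""
    mispredicted_fraction = branch_fraction * (1 - prediction_accuracy)
    return base_cpi + mispredicted_fraction * penalty_cycles


# Values from the slide: 20% branches, 90% accuracy, 2-cycle penalty when the
# branch is resolved in the EX stage.
cpi = average_cpi(base_cpi=1, branch_fraction=0.20,
                  prediction_accuracy=0.90, penalty_cycles=2)
```

With the slide's numbers `cpi` evaluates to approximately 1.04; raising the accuracy or resolving the branch earlier (a smaller penalty) moves it closer to 1.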
53 Make the prediction better
    Consider the following code:
        i = 0;
        do {
            if (i % 3 != 0)     // Branch Y, taken if i % 3 == 0
                a[i] *= 2;
            a[i] += i;
        } while (++i < ...);    // Branch X
    Can we capture the pattern?
        i    Y     X
        0    T     T
        1    NT    T
        2    NT    T
        3    T     T
        4    NT    T
        5    NT    T
        6    T     T
        7    NT
54 Predict using history
    Instead of using the PC to choose the predictor, use a bit vector (global history register, GHR) made up of the previous branch outcomes.
    Each entry in the history table has its own counter.
    [Diagram: an n-bit GHR, for example (T, NT, T), indexes a history table with 2^n entries; the selected counter predicts Taken.]
55 Performance of global history predictor
    Consider the same code as above, with Branch Y (taken if i % 3 == 0) and Branch X (the loop branch).
    Assume that we start with a 4-bit GHR and identical initial counter values (the trace below predicts taken for every history seen for the first time).
        i     Y predicted   Y actual   X predicted   X actual
        0     T             T          T             T
        1     T             NT         T             T
        2     T             NT         T             T
        3     T             T          T             T
        4     T             NT         T             T
        5     NT            NT         T             T
        6     T             T          T             T
        7     NT            NT         T             T
        8     NT            NT         T             T
        9     T             T          T             T
        10    NT            NT
    Nearly perfect after the first few iterations: from i = 5 onward every prediction in this trace is correct.
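The trace above can be reproduced with a short simulation. This is a sketch under stated assumptions: the GHR starts at 0000, every counter starts at 2 (weakly taken), the counters saturate, and the loop runs for a hypothetical `iterations` count, since the loop bound is not legible in the slide.

```python
def simulate_global_predictor(iterations, ghr_bits=4, initial_counter=2):
    """Return the (i, branch) pairs that a GHR-indexed predictor gets wrong."""
    mask = (1 << ghr_bits) - 1
    counters = [initial_counter] * (1 << ghr_bits)   # one 2-bit counter per history
    ghr = 0                                          # assumed initial history: all not taken
    mispredicted = []
    for i in range(iterations):
        # Branch Y is taken when i % 3 == 0; branch X is taken while ++i < iterations.
        for name, taken in (("Y", i % 3 == 0), ("X", i + 1 < iterations)):
            predicted_taken = counters[ghr] >= 2
            if predicted_taken != taken:
                mispredicted.append((i, name))
            if taken:
                counters[ghr] = min(3, counters[ghr] + 1)
            else:
                counters[ghr] = max(0, counters[ghr] - 1)
            # Shift the newest outcome into the history register.
            ghr = ((ghr << 1) | int(taken)) & mask
    return mispredicted


misses = simulate_global_predictor(20)
```

Under these assumptions the only mispredictions are branch Y at i = 1, 2, and 4 and the final loop exit, which agrees with the Y column of the slide's table. Each of the six 4-outcome histories that occur in steady state is followed by exactly one outcome, so once a history has been seen its counter predicts correctly.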
56 Branch prediction and modern processors
57 Deeper pipeline
    Higher frequencies by shortening the pipeline stages.
    Higher marketing value, since consumers usually link performance with frequency.
    Potentially higher power consumption, as dynamic/active power = αCV²f.
    If the execution time is better, the processor can still consume less energy.
58 Case Study
59 Intel Pentium 4 Microarch.
60 Intel Pentium 4
    Very deep pipeline, in order to achieve high frequency! (Starting from 1.5GHz.)
    20 stages in Netburst: 1-2 TC Nxt IP, 3-4 TC Fetch, 5 Drive, 6 Alloc, 7-8 Rename, 9 Que, 10-12 Sch, 13-14 Disp, 15-16 RF, 17 Ex, 18 Flgs, 19 Br Ck, 20 Drive.
    31 stages in Prescott.
    [Diagram label as transcribed: 3W (3.6GHz, 65nm).]
    Reference: The Microarchitecture of the Pentium 4 Processor
61 AMD Athlon 64
62 AMD Athlon 64
    12-stage pipeline.
    [Pipeline diagram: stages running from instruction address and instruction memory access through byte pick, decode, pack, dispatch, scheduling, execution, D-cache address, and D-cache access.]
    89W TDP (Opteron 2.2GHz, 90nm)
63 Demo revisited
    Why does sorting the array speed up the code despite the increased instruction count?
        if (option)
            std::sort(data, data + arraysize);
        for (unsigned i = 0; i < ...; ++i) {
            int threshold = std::rand();
            for (unsigned j = 0; j < arraysize; ++j) {
                if (data[j] >= threshold)
                    sum++;
            }
        }
64 Deep pipelining and data hazards
65 Data hazard revisited
    How many cycles does it take to execute the following code? Draw the pipeline execution diagram, assuming that we have full data forwarding.
        lw  $t0, 0($a0)
        lw  $a0, 0($t0)
        bne $a0, $zero, ...
    9 cycles.
66 Intel's latest Skylake
    [Block diagram:]
    Front end: BPU; 32K L1 Instruction Cache; Legacy Decode Pipeline (5 uops/cycle); Decoded Icache (DSB) (6 uops/cycle); MSROM (4 uops/cycle); Instruction Decode Queue (IDQ, or micro-op queue).
    Allocate/Rename/Retire/MoveElimination/ZeroIdiom, then the Scheduler.
    Port 0: Int ALU, Vec FMA, Vec MUL, Vec Add, Vec ALU, Vec Shft, Divide, Branch2.
    Port 1: Int ALU, Fast LEA, Vec FMA, Vec MUL, Vec Add, Vec ALU, Vec Shft, Int MUL, Slow LEA.
    Port 5: Int ALU, Fast LEA, Vec SHUF, Vec ALU, CVT.
    Port 6: Int ALU, Int Shft, Branch1.
    Port 2: LD/STA. Port 3: LD/STA. Port 4: STD. Port 7: STA.
    Memory: 32K L1 Data Cache; 256K L2 Cache (Unified).
    Good reference for Intel microarchitectures:
Processor Design Pipelined Processor (II) Hung-Wei Tseng
Processor Design Pipelined Processor (II) Hung-Wei Tseng Recap: Pipelining Break up the logic with pipeline registers into pipeline stages Each pipeline registers is clocked Each pipeline stage takes one
More informationCOMPUTER ORGANIZATION AND DESIGN
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle
More informationChapter 4 The Processor 1. Chapter 4B. The Processor
Chapter 4 The Processor 1 Chapter 4B The Processor Chapter 4 The Processor 2 Control Hazards Branch determines flow of control Fetching next instruction depends on branch outcome Pipeline can t always
More informationVirtual memory. Hung-Wei Tseng
Virtual memory Hung-Wei Tseng Why virtual memory How VM works VM and cache Outline 2 Virtual memory 3 Scenario I An application is design on machine A with memory size X. Can we safely execute the same
More informationLecture 3: The Processor (Chapter 4 of textbook) Chapter 4.1
Lecture 3: The Processor (Chapter 4 of textbook) Chapter 4.1 Introduction Chapter 4.1 Chapter 4.2 Review: MIPS (RISC) Design Principles Simplicity favors regularity fixed size instructions small number
More informationDepartment of Computer and IT Engineering University of Kurdistan. Computer Architecture Pipelining. By: Dr. Alireza Abdollahpouri
Department of Computer and IT Engineering University of Kurdistan Computer Architecture Pipelining By: Dr. Alireza Abdollahpouri Pipelined MIPS processor Any instruction set can be implemented in many
More informationPipeline design. Mehran Rezaei
Pipeline design Mehran Rezaei How Can We Improve the Performance? Exec Time = IC * CPI * CCT Optimization IC CPI CCT Source Level * Compiler * * ISA * * Organization * * Technology * With Pipelining We
More informationComputer and Information Sciences College / Computer Science Department Enhancing Performance with Pipelining
Computer and Information Sciences College / Computer Science Department Enhancing Performance with Pipelining Single-Cycle Design Problems Assuming fixed-period clock every instruction datapath uses one
More informationVirtual memory. Hung-Wei Tseng
Virtual memory Hung-Wei Tseng Why virtual memory How VM works VM and cache Outline 4 Virtual memory 5 Scenario I An application is design on machine A with memory size X. Can we safely execute the same
More informationCOMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 4. The Processor
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle
More informationLecture 4: Review of MIPS. Instruction formats, impl. of control and datapath, pipelined impl.
Lecture 4: Review of MIPS Instruction formats, impl. of control and datapath, pipelined impl. 1 MIPS Instruction Types Data transfer: Load and store Integer arithmetic/logic Floating point arithmetic Control
More informationFull Datapath. Chapter 4 The Processor 2
Pipelining Full Datapath Chapter 4 The Processor 2 Datapath With Control Chapter 4 The Processor 3 Performance Issues Longest delay determines clock period Critical path: load instruction Instruction memory
More informationMulti-threaded processors. Hung-Wei Tseng x Dean Tullsen
Multi-threaded processors Hung-Wei Tseng x Dean Tullsen OoO SuperScalar Processor Fetch instructions in the instruction window Register renaming to eliminate false dependencies edule an instruction to
More informationFull Datapath. Chapter 4 The Processor 2
Pipelining Full Datapath Chapter 4 The Processor 2 Datapath With Control Chapter 4 The Processor 3 Performance Issues Longest delay determines clock period Critical path: load instruction Instruction memory
More informationComputer Organization and Structure. Bing-Yu Chen National Taiwan University
Computer Organization and Structure Bing-Yu Chen National Taiwan University The Processor Logic Design Conventions Building a Datapath A Simple Implementation Scheme An Overview of Pipelining Pipelined
More informationComputer Organization and Structure
Computer Organization and Structure 1. Assuming the following repeating pattern (e.g., in a loop) of branch outcomes: Branch outcomes a. T, T, NT, T b. T, T, T, NT, NT Homework #4 Due: 2014/12/9 a. What
More informationPipeline Hazards. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University
Pipeline Hazards Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Hazards What are hazards? Situations that prevent starting the next instruction
More informationPipelining. Ideal speedup is number of stages in the pipeline. Do we achieve this? 2. Improve performance by increasing instruction throughput ...
CHAPTER 6 1 Pipelining Instruction class Instruction memory ister read ALU Data memory ister write Total (in ps) Load word 200 100 200 200 100 800 Store word 200 100 200 200 700 R-format 200 100 200 100
More informationImproving Performance: Pipelining
Improving Performance: Pipelining Memory General registers Memory ID EXE MEM WB Instruction Fetch (includes PC increment) ID Instruction Decode + fetching values from general purpose registers EXE EXEcute
More informationChapter 4. The Processor
Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified
More informationChapter 4 The Processor 1. Chapter 4A. The Processor
Chapter 4 The Processor 1 Chapter 4A The Processor Chapter 4 The Processor 2 Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware
More informationOutline. A pipelined datapath Pipelined control Data hazards and forwarding Data hazards and stalls Branch (control) hazards Exception
Outline A pipelined datapath Pipelined control Data hazards and forwarding Data hazards and stalls Branch (control) hazards Exception 1 4 Which stage is the branch decision made? Case 1: 0 M u x 1 Add
More informationComputer Architecture Computer Science & Engineering. Chapter 4. The Processor BK TP.HCM
Computer Architecture Computer Science & Engineering Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware
More informationCENG 3420 Lecture 06: Pipeline
CENG 3420 Lecture 06: Pipeline Bei Yu byu@cse.cuhk.edu.hk CENG3420 L06.1 Spring 2019 Outline q Pipeline Motivations q Pipeline Hazards q Exceptions q Background: Flip-Flop Control Signals CENG3420 L06.2
More informationCOMPUTER ORGANIZATION AND DESI
COMPUTER ORGANIZATION AND DESIGN 5 Edition th The Hardware/Software Interface Chapter 4 The Processor 4.1 Introduction Introduction CPU performance factors Instruction count Determined by ISA and compiler
More informationCOMPUTER ORGANIZATION AND DESIGN
COMPUTER ORGANIZATION AND DESIGN 5 Edition th The Hardware/Software Interface Chapter 4 The Processor 4.1 Introduction Introduction CPU performance factors Instruction count CPI and Cycle time Determined
More informationProcessor (II) - pipelining. Hwansoo Han
Processor (II) - pipelining Hwansoo Han Pipelining Analogy Pipelined laundry: overlapping execution Parallelism improves performance Four loads: Speedup = 8/3.5 =2.3 Non-stop: 2n/0.5n + 1.5 4 = number
More informationMIPS Pipelining. Computer Organization Architectures for Embedded Computing. Wednesday 8 October 14
MIPS Pipelining Computer Organization Architectures for Embedded Computing Wednesday 8 October 14 Many slides adapted from: Computer Organization and Design, Patterson & Hennessy 4th Edition, 2011, MK
More informationChapter 4. Instruction Execution. Introduction. CPU Overview. Multiplexers. Chapter 4 The Processor 1. The Processor.
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor The Processor - Introduction
More informationPipelining. CSC Friday, November 6, 2015
Pipelining CSC 211.01 Friday, November 6, 2015 Performance Issues Longest delay determines clock period Critical path: load instruction Instruction memory register file ALU data memory register file Not
More informationCOMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition The Processor - Introduction
More informationPipelining Analogy. Pipelined laundry: overlapping execution. Parallelism improves performance. Four loads: Non-stop: Speedup = 8/3.5 = 2.3.
Pipelining Analogy Pipelined laundry: overlapping execution Parallelism improves performance Four loads: Speedup = 8/3.5 = 2.3 Non-stop: Speedup =2n/05n+15 2n/0.5n 1.5 4 = number of stages 4.5 An Overview
More informationELEC 5200/6200 Computer Architecture and Design Spring 2017 Lecture 4: Datapath and Control
ELEC 52/62 Computer Architecture and Design Spring 217 Lecture 4: Datapath and Control Ujjwal Guin, Assistant Professor Department of Electrical and Computer Engineering Auburn University, Auburn, AL 36849
More informationCOMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle
More informationThe Processor (3) Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University
The Processor (3) Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu EEE3050: Theory on Computer Architectures, Spring 2017, Jinkyu Jeong (jinkyu@skku.edu)
More informationAdvanced d Instruction Level Parallelism. Computer Systems Laboratory Sungkyunkwan University
Advanced d Instruction ti Level Parallelism Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu ILP Instruction-Level Parallelism (ILP) Pipelining:
More informationLecture 9 Pipeline and Cache
Lecture 9 Pipeline and Cache Peng Liu liupeng@zju.edu.cn 1 What makes it easy Pipelining Review all instructions are the same length just a few instruction formats memory operands appear only in loads
More informationCSEN 601: Computer System Architecture Summer 2014
CSEN 601: Computer System Architecture Summer 2014 Practice Assignment 5 Solutions Exercise 5-1: (Midterm Spring 2013) a. What are the values of the control signals (except ALUOp) for each of the following
More informationComplex Pipelines and Branch Prediction
Complex Pipelines and Branch Prediction Daniel Sanchez Computer Science & Artificial Intelligence Lab M.I.T. L22-1 Processor Performance Time Program Instructions Program Cycles Instruction CPI Time Cycle
More informationECS 154B Computer Architecture II Spring 2009
ECS 154B Computer Architecture II Spring 2009 Pipelining Datapath and Control 6.2-6.3 Partially adapted from slides by Mary Jane Irwin, Penn State And Kurtis Kredo, UCD Pipelined CPU Break execution into
More informationProcessor (IV) - advanced ILP. Hwansoo Han
Processor (IV) - advanced ILP Hwansoo Han Instruction-Level Parallelism (ILP) Pipelining: executing multiple instructions in parallel To increase ILP Deeper pipeline Less work per stage shorter clock cycle
More informationChapter 4. The Processor
Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified
More information1 Hazards COMP2611 Fall 2015 Pipelined Processor
1 Hazards Dependences in Programs 2 Data dependence Example: lw $1, 200($2) add $3, $4, $1 add can t do ID (i.e., read register $1) until lw updates $1 Control dependence Example: bne $1, $2, target add
More informationEE557--FALL 1999 MAKE-UP MIDTERM 1. Closed books, closed notes
NAME: STUDENT NUMBER: EE557--FALL 1999 MAKE-UP MIDTERM 1 Closed books, closed notes Q1: /1 Q2: /1 Q3: /1 Q4: /1 Q5: /15 Q6: /1 TOTAL: /65 Grade: /25 1 QUESTION 1(Performance evaluation) 1 points We are
More informationControl Hazards - branching causes problems since the pipeline can be filled with the wrong instructions.
Control Hazards - branching causes problems since the pipeline can be filled with the wrong instructions Stage Instruction Fetch Instruction Decode Execution / Effective addr Memory access Write-back Abbreviation
More informationChapter 4. The Processor
Chapter 4 The Processor 4.1 Introduction Introduction CPU performance factors Instruction count CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations Determined by ISA
Computer Architecture Computer Science & Engineering. Chapter 4. The Processor BK TP.HCM
Computer Architecture Computer Science & Engineering Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware
EECS 151/251A Fall 2017 Digital Design and Integrated Circuits. Instructor: John Wawrzynek and Nicholas Weaver. Lecture 13 EE141
EECS 151/251A Fall 2017 Digital Design and Integrated Circuits Instructor: John Wawrzynek and Nicholas Weaver Lecture 13 Project Introduction You will design and optimize a RISC-V processor Phase 1: Design
Determined by ISA and compiler. We will examine two MIPS implementations. A simplified version A more realistic pipelined version
MIPS Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified
Pipelined Processor Design. EE/ECE 4305: Computer Architecture University of Minnesota Duluth By Dr. Taek M. Kwon
Pipelined Processor Design EE/ECE 4305: Computer Architecture University of Minnesota Duluth By Dr. Taek M. Kwon Concept Identification of Pipeline Segments Add Pipeline Registers Pipeline Stage Control
Thomas Polzer Institut für Technische Informatik
Thomas Polzer tpolzer@ecs.tuwien.ac.at Institut für Technische Informatik Pipelined laundry: overlapping execution Parallelism improves performance Four loads: Speedup = 8/3.5 = 2.3 Non-stop: Speedup =
Pipelined datapath Staging data. CS2504, Spring'2007 Dimitris Nikolopoulos
Pipelined datapath Staging data b 55 Life of a load in the MIPS pipeline Note: both the instruction and the incremented PC value need to be forwarded in the next stage (in case the instruction is a beq)
LECTURE 3: THE PROCESSOR
LECTURE 3: THE PROCESSOR Abridged version of Patterson & Hennessy (2013):Ch.4 Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU
Chapter 4. The Processor
Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified
4. What is the average CPI of a 1.4 GHz machine that executes 12.5 million instructions in 12 seconds?
Chapter 4: Assessing and Understanding Performance 1. Define response (execution) time. 2. Define throughput. 3. Describe why using the clock rate of a processor is a bad way to measure performance. Provide
CS3350B Computer Architecture Quiz 3 March 15, 2018
CS3350B Computer Architecture Quiz 3 March 15, 2018 Student ID number: Student Last Name: Question 1.1 1.2 1.3 2.1 2.2 2.3 Total Marks The quiz consists of two exercises. The expected duration is 30 minutes.
Lecture 9. Pipeline Hazards. Christos Kozyrakis Stanford University
Lecture 9 Pipeline Hazards Christos Kozyrakis Stanford University http://eeclass.stanford.edu/ee18b 1 Announcements PA-1 is due today Electronic submission Lab2 is due on Tuesday 2/13 th Quiz1 grades will
The Processor. Z. Jerry Shi Department of Computer Science and Engineering University of Connecticut. CSE3666: Introduction to Computer Architecture
The Processor Z. Jerry Shi Department of Computer Science and Engineering University of Connecticut CSE3666: Introduction to Computer Architecture Introduction CPU performance factors Instruction count
14:332:331 Pipelined Datapath
14:332:331 Pipelined Datapath Instr. Order Inst 0 Inst 1 Inst 2 Inst 3 Inst 4 Single Cycle Disadvantages & Advantages Uses the clock cycle inefficiently the clock cycle must be timed to accommodate
CSE 378 Midterm 2/12/10 Sample Solution
Question 1. (6 points) (a) Rewrite the instruction sub $v0,$t8,$a2 using absolute register numbers instead of symbolic names (i.e., if the instruction contained $at, you would rewrite that as $1.) sub
Pipelining is Hazardous!
Pipelining is Hazardous! Hazards are situations where pipelining does not work as elegantly as we would like Three kinds Structural hazards -- we have run out of a hardware resource. Data hazards -- an
COMP2611: Computer Organization. The Pipelined Processor
COMP2611: Computer Organization The 1 2 Background 2 High-Performance Processors 3 Two techniques for designing high-performance processors by exploiting parallelism: Multiprocessing: parallelism among
Lecture Topics. Announcements. Today: Data and Control Hazards (P&H ) Next: continued. Exam #1 returned. Milestone #5 (due 2/27)
Lecture Topics Today: Data and Control Hazards (P&H 4.7-4.8) Next: continued 1 Announcements Exam #1 returned Milestone #5 (due 2/27) Milestone #6 (due 3/13) 2 1 Review: Pipelined Implementations Pipelining
Instruction word R0 R1 R2 R3 R4 R5 R6 R8 R12 R31
4.16 Exercises 419 Exercise 4.11 In this exercise we examine in detail how an instruction is executed in a single-cycle datapath. Problems in this exercise refer to a clock cycle in which the processor
EIE/ENE 334 Microprocessors
EIE/ENE 334 Microprocessors Lecture 6: The Processor Week #06/07 : Dejwoot KHAWPARISUTH Adapted from Computer Organization and Design, 4 th Edition, Patterson & Hennessy, 2009, Elsevier (MK) http://webstaff.kmutt.ac.th/~dejwoot.kha/
Chapter 4. The Processor
Chapter 4 The Processor Recall. ISA? Instruction Fetch Instruction Decode Operand Fetch Execute Result Store Next Instruction Instruction Format or Encoding how is it decoded? Location of operands and
Chapter 5 Solutions: For More Practice
Chapter 5 Solutions: For More Practice 1 Chapter 5 Solutions: For More Practice 5.4 Fetching, reading registers, and writing the destination register takes a total of 300ps for both floating point add/subtract
The Processor: Instruction-Level Parallelism
The Processor: Instruction-Level Parallelism Computer Organization Architectures for Embedded Computing Tuesday 21 October 14 Many slides adapted from: Computer Organization and Design, Patterson & Hennessy
Instruction Level Parallelism. Appendix C and Chapter 3, HP5e
Instruction Level Parallelism Appendix C and Chapter 3, HP5e Outline Pipelining, Hazards Branch prediction Static and Dynamic Scheduling Speculation Compiler techniques, VLIW Limits of ILP. Implementation
Orange Coast College. Business Division. Computer Science Department. CS 116- Computer Architecture. Pipelining
Orange Coast College Business Division Computer Science Department CS 116- Computer Architecture Pipelining Recall Pipelining is parallelizing execution Key to speedups in processors Split instruction
CS232 Final Exam May 5, 2001
CS232 Final Exam May 5, 2001 Name: This exam has 4 pages, including this cover. There are six questions, worth a total of 5 points. You have 3 hours. Budget your time! Write clearly and show your work. State
DEE 1053 Computer Organization Lecture 6: Pipelining
Dept. Electronics Engineering, National Chiao Tung University DEE 1053 Computer Organization Lecture 6: Pipelining Dr. Tian-Sheuan Chang tschang@twins.ee.nctu.edu.tw Dept. Electronics Engineering National
Lecture 7 Pipelining. Peng Liu.
Lecture 7 Pipelining Peng Liu liupeng@zju.edu.cn 1 Review: The Single Cycle Processor 2 Review: Given Datapath,RTL -> Control Instruction Inst Memory Adr Op Fun Rt
3/12/2014. Single Cycle (Review) CSE 2021: Computer Organization. Single Cycle with Jump. Multi-Cycle Implementation. Why Multi-Cycle?
CSE 2021: Computer Organization Single Cycle (Review) Lecture-10b CPU Design : Pipelining-1 Overview, Datapath and control Shakil M. Khan 2 Single Cycle with Jump Multi-Cycle Implementation Instruction:
T = I x CPI x C. Both effective CPI and clock cycle C are heavily influenced by CPU design. CPI increased (3-5) bad Shorter cycle good
CPU performance equation: T = I x CPI x C Both effective CPI and clock cycle C are heavily influenced by CPU design. For single-cycle CPU: CPI = 1 good Long cycle time bad On the other hand, for multi-cycle
Real Processors. Lecture for CPSC 5155 Edward Bosworth, Ph.D. Computer Science Department Columbus State University
Real Processors Lecture for CPSC 5155 Edward Bosworth, Ph.D. Computer Science Department Columbus State University Instruction-Level Parallelism (ILP) Pipelining: executing multiple instructions in parallel
ELE 655 Microprocessor System Design
ELE 655 Microprocessor System Design Section 2 Instruction Level Parallelism Class 1 Basic Pipeline Notes: Reg shows up two places but actually is the same register file Writes occur on the second half
Chapter 5: The Processor: Datapath and Control
Chapter 5: The Processor: Datapath and Control Overview Logic Design Conventions Building a Datapath and Control Unit Different Implementations of MIPS instruction set A simple implementation of a processor
CS 351 Exam 2 Mon. 11/2/2015
CS 351 Exam 2 Mon. 11/2/2015 Name: Rules and Hints The MIPS cheat sheet and datapath diagram are attached at the end of this exam for your reference. You may use one handwritten 8.5 × 11 cheat sheet (front
What about branches? Branch outcomes are not known until EXE What are our options?
What about branches? Branch outcomes are not known until EXE What are our options? 1 Control Hazards 2 Today Quiz Control Hazards Midterm review Return your papers 3 Key Points: Control Hazards Control
PIPELINING: HAZARDS. Mahdi Nazm Bojnordi. CS/ECE 6810: Computer Architecture. Assistant Professor School of Computing University of Utah
PIPELINING: HAZARDS Mahdi Nazm Bojnordi Assistant Professor School of Computing University of Utah CS/ECE 6810: Computer Architecture Overview Announcement Homework 1 submission deadline: Jan. 30 th This
EECS150 - Digital Design Lecture 10- CPU Microarchitecture. Processor Microarchitecture Introduction
EECS150 - Digital Design Lecture 10- CPU Microarchitecture Feb 18, 2010 John Wawrzynek Spring 2010 EECS150 - Lec10-cpu Page 1 Processor Microarchitecture Introduction Microarchitecture: how to implement
Chapter 4. The Processor. Computer Architecture and IC Design Lab
Chapter 4 The Processor Introduction CPU performance factors CPI Clock Cycle Time Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS
Chapter 4 (Part II) Sequential Laundry
Chapter 4 (Part II) The Processor Baback Izadi Division of Engineering Programs bai@engr.newpaltz.edu Sequential Laundry [Figure: timeline from 6 PM to 2 AM; task order A, B, C, D; 30-minute steps]
CS2100 Computer Organisation Tutorial #10: Pipelining Answers to Selected Questions
CS2100 Computer Organisation Tutorial #10: Pipelining Answers to Selected Questions Tutorial Questions 2. [AY2014/5 Semester 2 Exam] Refer to the following MIPS program: # register $s0 contains a 32-bit
4. The Processor Computer Architecture COMP SCI 2GA3 / SFWR ENG 2GA3. Emil Sekerinski, McMaster University, Fall Term 2015/16
4. The Processor Computer Architecture COMP SCI 2GA3 / SFWR ENG 2GA3 Emil Sekerinski, McMaster University, Fall Term 2015/16 Instruction Execution Consider simplified MIPS: lw/sw rt, offset(rs) add/sub/and/or/slt
Chapter 4. The Processor
Chapter 4 The Processor 1 Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A
These actions may use different parts of the CPU. Pipelining is when the parts run simultaneously on different instructions.
MIPS Pipe Line 2 Introduction Pipelining To complete an instruction a computer needs to perform a number of actions. These actions may use different parts of the CPU. Pipelining is when the parts run simultaneously
ECE232: Hardware Organization and Design
ECE232: Hardware Organization and Design Lecture 17: Pipelining Wrapup Adapted from Computer Organization and Design, Patterson & Hennessy, UCB Outline The textbook includes lots of information Focus on
ECE 313 Computer Organization FINAL EXAM December 14, This exam is open book and open notes. You have 2 hours.
This exam is open book and open notes. You have 2 hours. Problems 1-4 refer to a proposed MIPS instruction lwu (load word - update) which implements update addressing an addressing mode that is used in
LECTURE 9. Pipeline Hazards
LECTURE 9 Pipeline Hazards PIPELINED DATAPATH AND CONTROL In the previous lecture, we finalized the pipelined datapath for instruction sequences which do not include hazards of any kind. Remember that
Lecture 8: Control COS / ELE 375. Computer Architecture and Organization. Princeton University Fall Prof. David August
Lecture 8: Control COS / ELE 375 Computer Architecture and Organization Princeton University Fall 2015 Prof. David August 1 Datapath and Control Datapath The collection of state elements, computation elements,
COMPUTER ORGANIZATION AND DESIGN
ARM COMPUTER ORGANIZATION AND DESIGN Edition The Hardware/Software Interface Chapter 4 The Processor Modified and extended by R.J. Leduc - 2016 To understand this chapter, you will need to understand some
CSEE 3827: Fundamentals of Computer Systems
CSEE 3827: Fundamentals of Computer Systems Lecture 21 and 22 April 22 and 27, 2009 martha@cs.columbia.edu Amdahl's Law Be aware when optimizing... $T_{\text{improved}} = \frac{T_{\text{affected}}}{\text{improvement factor}} + T_{\text{unaffected}}$
Static, multiple-issue (superscaler) pipelines
Static, multiple-issue (superscaler) pipelines Start more than one instruction in the same cycle Instruction Register file EX + MEM + WB PC Instruction Register file EX + MEM + WB 79 A static two-issue
ECE Exam II - Solutions November 8 th, 2017
ECE 3056 Exam II - Solutions November 8 th, 2017 1. (15 pts) To the base pipeline we add data forwarding to EX, data hazard detection and stall generation, and branches implemented in MEM and predicted
Quiz for Chapter 4 The Processor3.10
Date: 3.10 Not all questions are of equal difficulty. Please review the entire quiz first and then budget your time carefully. Name: Course: 1. [6 points] For the MIPS datapath shown below, several lines
Final Exam Spring 2017
COE 3 / ICS 233 Computer Organization Final Exam Spring 27 Friday, May 9, 27 7:3 AM Computer Engineering Department College of Computer Sciences & Engineering King Fahd University of Petroleum & Minerals
EE 457 Unit 6a. Basic Pipelining Techniques
EE 457 Unit 6a Basic Pipelining Techniques 2 Pipelining Introduction Consider a drink bottling plant Filling the bottle = 3 sec. Placing the cap = 3 sec. Labeling = 3 sec. Would you want Machine = Does
Processor (I) - datapath & control. Hwansoo Han
Processor (I) - datapath & control Hwansoo Han Introduction CPU performance factors Instruction count - Determined by ISA and compiler CPI and Cycle time - Determined by CPU hardware We will examine two
More information