Design of Digital Circuits Lecture 15: Pipelining. Prof. Onur Mutlu ETH Zurich Spring April 2017
1 Design of Digital Circuits, Lecture 15: Pipelining. Prof. Onur Mutlu, ETH Zurich, Spring 2017, April 2017
2 Agenda for Today & Next Few Lectures
- Single-cycle Microarchitectures
- Multi-cycle and Microprogrammed Microarchitectures
- Pipelining
- Issues in Pipelining: Control & Data Dependence Handling, State Maintenance and Recovery, ...
- Out-of-Order Execution
- Issues in OoO Execution: Load-Store Handling, ...
3 Readings for This Week
- H&H, Chapter 7.5 (keep reading)
4 Wrap Up Microprogramming
5 Remember: An Exercise in Microprogramming
6 Handouts
- 7 pages of Microprogrammed LC-3b design
- infk/inst-infsec/system-security-group-dam/education/Digitaltechnik_7/lecture/lc3b-figures.pdf
7 A Simple LC-3b Control and Datapath
8 [Figure: LC-3b state machine. Fetch: MAR <- PC, PC <- PC + 2 (states 18, 19); MDR <- M (state 33, repeated until R); IR <- MDR (state 35); BEN <- IR[11] & N + IR[10] & Z + IR[9] & P, then dispatch on IR[15:12] (state 32) to the ADD, AND, XOR, TRAP, SHF, LEA, LDB, LDW, STW, STB, JSR, JMP, BR, and RTI sequences, which return to state 18 or 19.]
NOTES
- B+off6: Base + SEXT[offset6]
- PC+off9: PC + SEXT[offset9]
- *OP2 may be SR2 or SEXT[imm5]
- **[15:8] or [7:0] depending on MAR[0]
9 A Simple Datapath Can Become Very Powerful
[Figure: LC-3b datapath. Labeled elements: PC, PCMUX, ADDR1MUX, ADDR2MUX, MARMUX, register file (SR1, SR2, DR), SR2MUX, ALU (ALUK), SHF, IR, condition codes N/Z/P, MAR, MDR, memory, memory-mapped I/O registers (KBDR, KBSR, DDR, DSR), INMUX, the bus gates GatePC, GateMARMUX, GateALU, GateSHF, GateMDR, and the control block.]
10 State Machine for LDW; Microsequencer
[Figure: microsequencer next-state logic. Inputs: COND1, COND0, BEN, R, IR[11], J[5:0], IRD, IR[15:12]; output: 6-bit address of the next state. LDW state sequence: 18, 33, 35, 32, 6, 25, 27.]
11 [Figure: (a) DRMUX, (b) SR1MUX, (c) logic that produces BEN from IR[11:9] and N, Z, P.]
12
13 Simple Design of the Control Structure
[Figure: the microsequencer takes R, IR[15:11], and BEN and produces a 6-bit control store address; the control store holds 2^6 microinstructions; each microinstruction feeds 9 bits (J, COND, IRD) back to the microsequencer and supplies 26 bits of datapath control.]
14 [Figure: microsequencer next-state logic, repeated from slide 10.]
15 [Figure: control store contents, one row per state (State 0 through State 63). Columns: J, IRD, Cond, LD.MDR, LD.IR, LD.BEN, LD.REG, LD.CC, LD.MAR, GatePC, GateMDR, GateALU, LD.PC, GateMARMUX, GateSHF, PCMUX, DRMUX, SR1MUX, ADDR1MUX, ADDR2MUX, MARMUX, ALUK, MIO.EN, R.W, DATA.SIZE, LSHF1.]
16 End of the Exercise in Microprogramming
17 Variable-Latency Memory
- The ready signal (R) enables memory read/write to execute correctly
  - Example: the transition from state 33 to state 35 is controlled by the R bit, which memory asserts when memory data is available
- Could we have done this in a single-cycle microarchitecture?
- What did we assume about memory and registers in a single-cycle microarchitecture?
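The effect of the R bit on the state 33 to state 35 transition can be sketched in a few lines of Python. This is a toy model, not the LC-3b control store: the function name `fetch_state_trace` and the `memory_latency` parameter are hypothetical, and memory is assumed to assert R after a fixed number of cycles.

```python
def fetch_state_trace(memory_latency):
    """Return the states visited from state 33 until state 35 is reached.

    memory_latency is a hypothetical parameter: the number of cycles
    memory needs before it asserts the ready bit R (must be >= 1).
    """
    trace = []
    state = 33
    cycles_waited = 0
    while state == 33:
        trace.append(state)
        cycles_waited += 1
        ready = cycles_waited >= memory_latency  # R bit asserted by memory
        if ready:
            state = 35  # MDR now holds valid data, so move on
        # otherwise the microsequencer stays in state 33 for another cycle
    trace.append(state)
    return trace
```

With a one-cycle memory the machine passes through state 33 once; with a slower memory it simply repeats state 33, which a single-cycle design with a fixed clock period cannot do.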
18 The Microsequencer: Advanced Questions
- What happens if the machine is interrupted?
- What if an instruction generates an exception?
- How can you implement a complex instruction using this control structure?
  - Think of the REP MOVS instruction in x86
19 The Power of Abstraction
- The concept of a control store of microinstructions gives the hardware designer a new abstraction: microprogramming
- The designer can translate any desired operation to a sequence of microinstructions
- All the designer needs to provide is
  - The sequence of microinstructions needed to implement the desired operation
  - The ability for the control logic to correctly sequence through the microinstructions
  - Any additional datapath elements and control signals needed (none if the operation can be translated into existing control signals)
20 Let's Do Some More Microprogramming
- Implement REP MOVS in the LC-3b microarchitecture
- What changes, if any, do you make to the
  - state machine?
  - datapath?
  - control store?
  - microsequencer?
- Show all changes and microinstructions
- Extra Credit Assignment
21 x86 REP MOVS (String Copy)
REP MOVS (DEST SRC)
- How many instructions does this take in the MIPS ISA?
- How many microinstructions does it take to add this to the LC-3b microarchitecture?
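As a reference point for both questions, the behavior of a byte-wise REP MOVS can be written as a short loop. The sketch below is only a behavioral model under stated assumptions: one byte per iteration, direction flag clear (addresses increment), and the source, destination, and count registers modeled as the hypothetical arguments `src`, `dest`, and `count`.

```python
def rep_movs(memory, src, dest, count):
    """Behavioral model of a byte-wise REP MOVS with incrementing addresses.

    memory: mutable byte sequence (e.g., bytearray)
    src, dest, count: stand-ins for the source index, destination index,
    and repeat-count registers.
    Returns the final (src, dest, count) values.
    """
    while count != 0:
        memory[dest] = memory[src]  # move one element
        src += 1                    # advance source pointer
        dest += 1                   # advance destination pointer
        count -= 1                  # one fewer repetition left
    return src, dest, count
```

Each loop iteration contains a load, a store, two pointer updates, a counter update, and a branch, which gives a feel for how many simple instructions or microinstructions one iteration needs.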
22 Aside: Alignment Correction in Memory
- Unaligned accesses
- LC-3b has byte load and byte store instructions that move data not aligned at the word-address boundary
  - Convenience to the programmer/compiler
- How does the hardware ensure this works correctly?
  - Take a look at state 29 for LDB
  - States 24 and 17 for STB
  - Additional logic to handle unaligned accesses
- P&P, Revised Appendix C.5
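The extra logic for a byte load can be illustrated as follows. This is a minimal sketch, assuming the byte selection follows the low address bit (MAR[0] = 0 selects bits [7:0], MAR[0] = 1 selects bits [15:8]) and that the selected byte is sign-extended to 16 bits; the function name `ldb` and its arguments are hypothetical.

```python
def ldb(word, mar0):
    """Select one byte of a 16-bit memory word and sign-extend it to 16 bits.

    word: the 16-bit value read from memory
    mar0: low bit of the address (0 -> bits [7:0], 1 -> bits [15:8])
    """
    byte = (word >> 8) & 0xFF if mar0 else word & 0xFF
    if byte & 0x80:       # sign bit of the selected byte is set
        byte |= 0xFF00    # replicate it into the upper 8 bits
    return byte
```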
23 Aside: Memory Mapped I/O
- Address control logic determines whether the address specified by LDW and STW refers to memory or to I/O devices
- Correspondingly enables memory or I/O devices and sets up muxes
- An instance where the final control signals of some datapath elements (e.g., MEM.EN or INMUX/2) cannot be stored in the control store
  - These signals are dependent on memory address
- P&P, Revised Appendix C.6
24 Advantages of Microprogrammed Control
- Allows a very simple design to do powerful computation by controlling the datapath (using a sequencer)
  - High-level ISA translated into microcode (sequence of u-instructions)
  - Microcode (u-code) enables a minimal datapath to emulate an ISA
  - Microinstructions can be thought of as a user-invisible ISA (u-ISA)
- Enables easy extensibility of the ISA
  - Can support a new instruction by changing the microcode
  - Can support complex instructions as a sequence of simple microinstructions (e.g., REP MOVS, INC [MEM])
- Enables update of machine behavior
  - A buggy implementation of an instruction can be fixed by changing the microcode in the field
  - Easier if datapath provides ability to do the same thing in different ways
25 Update of Machine Behavior
- The ability to update/patch microcode in the field (after a processor is shipped) enables
  - Ability to add new instructions without changing the processor!
  - Ability to fix buggy hardware implementations
- Examples
  - IBM 370 Model 145: microcode stored in main memory, can be updated after a reboot
  - IBM System z: Similar to 370/145.
    - Heller and Farrell, "Millicode in an IBM zSeries processor," IBM JR&D, May/Jul 2004.
  - B1700 microcode can be updated while the processor is running
    - User-microprogrammable machine!
    - Wilner, "Microprogramming environment on the Burroughs B1700," CompCon
26 Multi-Cycle vs. Single-Cycle uarch
- Advantages
- Disadvantages
- For you to fill in
27 Can We Do Better?
28 Can We Do Better?
- What limitations do you see with the multi-cycle design?
- Limited concurrency
  - Some hardware resources are idle during different phases of the instruction processing cycle
  - Fetch logic is idle when an instruction is being decoded or executed
  - Most of the datapath is idle when a memory access is happening
29 Can We Use the Idle Hardware to Improve Concurrency?
- Goal: More concurrency -> Higher instruction throughput (i.e., more work completed in one cycle)
- Idea: When an instruction is using some resources in its processing phase, process other instructions on idle resources not needed by that instruction
  - E.g., when an instruction is being decoded, fetch the next instruction
  - E.g., when an instruction is being executed, decode another instruction
  - E.g., when an instruction is accessing data memory (ld/st), execute the next instruction
  - E.g., when an instruction is writing its result into the register file, access data memory for the next instruction
30 Pipelining
31 Pipelining: Basic Idea
- More systematically:
  - Pipeline the execution of multiple instructions
  - Analogy: Assembly line processing of instructions
- Idea:
  - Divide the instruction processing cycle into distinct stages of processing
  - Ensure there are enough hardware resources to process one instruction in each stage
  - Process a different instruction in each stage
    - Instructions consecutive in program order are processed in consecutive stages
- Benefit: Increases instruction processing throughput (1/CPI)
- Downside: Start thinking about this
32 Example: Execution of Four Independent ADDs
- Multi-cycle: 4 cycles per instruction
    F D E W
            F D E W
                    F D E W
                            F D E W    (time ->)
- Pipelined: 4 cycles per 4 instructions (steady state)
    F D E W
      F D E W
        F D E W
          F D E W    (time ->)
Is life always this beautiful?
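The cycle counts behind this example can be checked with a small calculation. The sketch below assumes independent instructions, one cycle per stage, and no stalls; the function names are hypothetical.

```python
def multicycle_cycles(n_instructions, n_stages):
    """Each instruction occupies the whole machine for n_stages cycles."""
    return n_instructions * n_stages


def pipelined_cycles(n_instructions, n_stages):
    """The first instruction needs n_stages cycles to drain through the
    pipeline; every later instruction completes one cycle after its
    predecessor."""
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1)
```

For the four ADDs with stages F, D, E, W this gives 16 cycles multi-cycle versus 7 cycles pipelined. Once the pipeline is full, each additional group of 4 instructions adds only 4 cycles, which is the steady-state figure quoted on the slide.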
33 The Laundry Analogy
[Figure: four laundry loads A, B, C, D processed one after another; task order on the vertical axis, time on the horizontal axis starting at 6 PM.]
- Place one dirty load of clothes in the washer
- When the washer is finished, place the wet load in the dryer
- When the dryer is finished, take out the dry load and fold
- When folding is finished, ask your roommate (??) to put the clothes away
Observations:
- Steps to do a load are sequentially dependent
- No dependence between different loads
- Different steps do not share resources
Based on original figure from [P&H CO&D, COPYRIGHT 2004 Elsevier. ALL RIGHTS RESERVED.]
34 Pipelining Multiple Loads of Laundry
[Figure, based on P&H CO&D: loads A, B, C, D done sequentially versus overlapped in a pipeline.]
- 4 loads of laundry in parallel
- No additional resources
- Throughput increased by 4
- Latency per load is the same
35 Pipelining Multiple Loads of Laundry: In Practice
[Figure, based on P&H CO&D: pipelined loads A, B, C, D with one step slower than the others.]
- The slowest step decides throughput
36 Pipelining Multiple Loads of Laundry: In Practice
[Figure, based on P&H CO&D: pipelined loads A, B, C, D with the slow step duplicated, so loads alternate between two dryers.]
- Throughput restored (2 loads per hour) using 2 dryers
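The two "in practice" slides can be summarized numerically: steady-state throughput is set by the slowest step, and duplicating that step's resource halves its effective time. The step durations below are hypothetical (they are not given in the transcript) and were chosen so that the result matches the "2 loads per hour" stated on the slide.

```python
def loads_per_hour(step_minutes, units=None):
    """Steady-state throughput of a laundry pipeline.

    step_minutes: dict mapping step name -> minutes per load
    units: optional dict mapping step name -> number of identical
           machines available for that step (default 1 each)
    """
    units = units or {}
    # Effective time per load of each step, after spreading loads
    # across the duplicated machines; the largest one is the bottleneck.
    bottleneck = max(
        minutes / units.get(step, 1) for step, minutes in step_minutes.items()
    )
    return 60 / bottleneck


# Hypothetical durations: the dryer takes twice as long as the other steps.
steps = {"wash": 30, "dry": 60, "fold": 30, "put away": 30}
one_dryer = loads_per_hour(steps)               # limited by the dryer
two_dryers = loads_per_hour(steps, {"dry": 2})  # bottleneck removed
```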
37 An Ideal Pipeline
- Goal: Increase throughput with little increase in cost (hardware cost, in case of instruction processing)
- Repetition of identical operations
  - The same operation is repeated on a large number of different inputs (e.g., all laundry loads go through the same steps)
- Repetition of independent operations
  - No dependencies between repeated operations
- Uniformly partitionable suboperations
  - Processing can be evenly divided into uniform-latency suboperations (that do not share resources)
- Fitting examples: automobile assembly line, doing laundry
  - What about the instruction processing cycle?
38 Ideal Pipelining
- 1 stage: combinational logic (F,D,E,M,W), T ps: BW = ~(1/T)
- 2 stages: T/2 ps (F,D,E), T/2 ps (M,W): BW = ~(2/T)
- 3 stages: T/3 ps (F,D), T/3 ps (E,M), T/3 ps (M,W): BW = ~(3/T)
39 More Realistic Pipeline: Throughput
- Nonpipelined version with delay T
  - BW = 1/(T + S), where S = latch delay
- k-stage pipelined version (each stage T/k ps)
  - BW_k-stage = 1/(T/k + S)
  - BW_max = 1/(1 gate delay + S)
- Latch delay reduces throughput (switching overhead between stages)
40 More Realistic Pipeline: Cost
- Nonpipelined version with combinational cost G (G gates)
  - Cost = G + L, where L = latch cost
- k-stage pipelined version (each stage G/k gates)
  - Cost_k-stage = G + L*k
- Latches increase hardware cost
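The throughput and cost formulas of the two "More Realistic Pipeline" slides can be evaluated side by side. The sketch below uses the slides' symbols T, S, G, L, and k; the function names and any numeric values plugged in are hypothetical.

```python
def bandwidth(T, S, k=1):
    """Throughput of a k-stage pipeline: 1 / (T/k + S).

    T: total combinational delay, S: latch delay (same time unit).
    """
    return 1.0 / (T / k + S)


def cost(G, L, k=1):
    """Hardware cost of a k-stage pipeline: G + L*k.

    G: combinational logic cost, L: cost of one set of latches.
    """
    return G + L * k


def speedup(T, S, k):
    """Throughput of the k-stage version relative to the nonpipelined one."""
    return bandwidth(T, S, k) / bandwidth(T, S, 1)
```

For example, with T = 800 and S = 20 (made-up numbers), `speedup(800, 20, 4)` is about 3.7 rather than 4, because the latch delay S is paid once per stage and does not shrink as k grows, while `cost` keeps rising linearly in k.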
41 Pipelining Instruction Processing
42 Remember: The Instruction Processing Cycle
- Six-phase view: Fetch, Decode, Evaluate Address, Fetch Operands, Execute, Store Result
- Five-stage view:
  1. Instruction fetch (IF)
  2. Instruction decode and register operand fetch (ID/RF)
  3. Execute/Evaluate memory address (EX/AG)
  4. Memory operand fetch (MEM)
  5. Store/writeback result (WB)
43 Remember the Single-Cycle Uarch
[Figure, based on P&H CO&D: single-cycle MIPS datapath with PC, instruction memory, register file, sign extend, ALU and ALU control, data memory, the jump and branch target adders, and the control signals RegDst, Jump, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc, RegWrite. The whole datapath has delay T, so BW = ~(1/T).]
44 Dividing Into Stages
[Figure, based on P&H CO&D: the single-cycle datapath cut into five stages, with the register file write path marked "ignore for now".]
- IF: Instruction fetch, 200 ps
- ID: Instruction decode/register file read, 100 ps
- EX: Execute/address calculation, 200 ps
- MEM: Memory access, 200 ps
- WB: Write back, 100 ps
Is this the correct partitioning? Why not 4 or 6 stages? Why not different boundaries?
45 Pipeline Throughput
[Figure: three loads (lw $1, 100($0); lw $2, 200($0); lw $3, 300($0)) in program execution order, each passing through Instruction fetch, Reg, ALU, Data access, Reg. Nonpipelined: each instruction takes 800 ps, so instructions start 800 ps apart. Pipelined: instructions start 200 ps apart.]
5-stage speedup is 4, not 5 as predicted by the ideal model. Why?
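One way to approach the question is to compute the clock period each design needs. The sketch below assumes per-stage latencies of 200, 100, 200, 200, and 100 ps for IF, ID, EX, MEM, and WB (the values of the P&H figure the slides are based on) and ignores latch overhead; the names are hypothetical.

```python
STAGE_PS = {"IF": 200, "ID": 100, "EX": 200, "MEM": 200, "WB": 100}


def single_cycle_period(stages):
    """A single-cycle design must fit all stages into one clock period."""
    return sum(stages.values())


def pipelined_period(stages):
    """All pipeline stages share one clock, so the slowest stage sets it."""
    return max(stages.values())


def pipeline_speedup(stages):
    """Steady-state throughput gain of pipelining over single-cycle."""
    return single_cycle_period(stages) / pipelined_period(stages)
```

With these latencies the single-cycle period is 800 ps and the pipelined period is 200 ps, so the speedup is 4. It would reach 5 only if the same 800 ps were split into five equal 160 ps stages.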
46 Enabling Pipelined Processing: Pipeline Registers
[Figure, based on P&H CO&D: the five-stage datapath (IF, ID, EX, MEM, WB) with pipeline registers IF/ID, ID/EX, EX/MEM, MEM/WB between stages. Latched values: PC_F; IR_D, PC_D+4; A_E, B_E, Imm_E, PC_E+4; nPC_M, Aout_M, B_M; MDR_W, Aout_W. Each stage takes T/k ps.]
- No resource is used by more than 1 stage!
47 Pipelined Operation Example
[Figure, based on P&H CO&D: an lw instruction moving through Instruction fetch, Instruction decode, Execution, Memory, and Write back.]
- All instruction classes must follow the same path and timing through the pipeline stages.
- Any performance impact?
48 Pipelined Operation Example
[Figure, based on P&H CO&D: lw $10, 20($1) followed by sub $11, $2, $3, shown clock by clock (Clock 1 through Clock 6) as both instructions advance through Instruction fetch, Instruction decode, Execution, Memory, and Write back.]
Is life always this beautiful?
49 Illustrating Pipeline Operation: Operation View
         t0   t1   t2   t3   t4   t5
  Inst0  IF   ID   EX   MEM  WB
  Inst1       IF   ID   EX   MEM  WB
  Inst2            IF   ID   EX   MEM  ...
  Inst3                 IF   ID   EX   ...
  Inst4                      IF   ID   ...
Steady state (full pipeline) from t4 onward.
50 Illustrating Pipeline Operation: Resource View
        t0   t1   t2   t3   t4   t5   t6   t7   t8   t9   t10
  IF    I0   I1   I2   I3   I4   I5   I6   I7   I8   I9   I10
  ID         I0   I1   I2   I3   I4   I5   I6   I7   I8   I9
  EX              I0   I1   I2   I3   I4   I5   I6   I7   I8
  MEM                  I0   I1   I2   I3   I4   I5   I6   I7
  WB                        I0   I1   I2   I3   I4   I5   I6
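The resource view follows a simple rule: in cycle t, the stage at depth d holds instruction t - d, if that instruction exists. A minimal sketch, assuming an ideal five-stage pipeline with no stalls (the function name `resource_view` is hypothetical):

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def resource_view(n_instructions, n_cycles):
    """Return {stage: [instruction index or None for each cycle]}.

    Assumes one instruction enters IF per cycle and nothing stalls.
    """
    view = {}
    for depth, stage in enumerate(STAGES):
        row = []
        for t in range(n_cycles):
            i = t - depth  # instruction occupying this stage in cycle t
            row.append(i if 0 <= i < n_instructions else None)
        view[stage] = row
    return view


if __name__ == "__main__":
    for stage, row in resource_view(11, 11).items():
        cells = ["I%d" % i if i is not None else "--" for i in row]
        print("%-4s %s" % (stage, " ".join("%-3s" % c for c in cells)))
```

Reading a row gives the resource view (what one stage does over time); reading a diagonal gives the operation view of a single instruction.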
51 Control Points in a Pipeline
[Figure, based on P&H CO&D: the pipelined datapath with pipeline registers IF/ID, ID/EX, EX/MEM, MEM/WB and the control points PCSrc, RegWrite, ALUSrc, ALUOp, RegDst, Branch, MemWrite, MemRead, MemtoReg.]
Identical set of control points as the single-cycle datapath!
52 Control Signals in a Pipeline
- For a given instruction
  - same control signals as single-cycle, but
  - control signals required at different cycles, depending on stage
- Option 1: decode once using the same logic as single-cycle and buffer signals until consumed
  [Figure: the control unit produces EX, M, and WB signal groups; ID/EX holds EX, M, WB; EX/MEM holds M, WB; MEM/WB holds WB.]
- Option 2: carry relevant "instruction word/field" down the pipeline and decode locally within each or in a previous stage
Which one is better?
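Option 1 can be sketched as signal groups that are decoded once and then shed as they are consumed. In the sketch below, the signal names come from the slides, but the decode table, its values (the usual P&H settings for lw and sw), and the helper names are illustrative assumptions rather than the lecture's implementation.

```python
# Hypothetical decode table: control signals grouped by the stage that uses them.
DECODE = {
    "lw": {"EX": {"ALUSrc": 1, "RegDst": 0},
           "M": {"MemRead": 1, "MemWrite": 0, "Branch": 0},
           "WB": {"RegWrite": 1, "MemtoReg": 1}},
    "sw": {"EX": {"ALUSrc": 1, "RegDst": 0},
           "M": {"MemRead": 0, "MemWrite": 1, "Branch": 0},
           "WB": {"RegWrite": 0, "MemtoReg": 0}},
}


def drop(signals, group):
    """Forward a signal bundle without the group consumed in this stage."""
    if signals is None:
        return None
    return {g: v for g, v in signals.items() if g != group}


def step(pipe, opcode):
    """Advance one clock: decode the new instruction once in ID and shift
    the buffered control signals one pipeline register to the right."""
    return {
        "ID/EX": DECODE[opcode] if opcode else None,  # EX, M, WB groups
        "EX/MEM": drop(pipe["ID/EX"], "EX"),          # M, WB groups remain
        "MEM/WB": drop(pipe["EX/MEM"], "M"),          # only WB remains
    }


EMPTY = {"ID/EX": None, "EX/MEM": None, "MEM/WB": None}
```

Calling `step` three times with "lw", "sw", and `None` leaves lw's WB group in MEM/WB and sw's M and WB groups in EX/MEM, mirroring the narrowing signal bundles in the slide's figure.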
53 Pipelined Control Signals
[Figure, based on P&H CO&D: the pipelined datapath with the control unit in the decode stage; the EX, M, and WB control groups are carried through ID/EX, EX/MEM, and MEM/WB to the control points PCSrc, RegWrite, ALUSrc, ALUOp, RegDst, Branch, MemWrite, MemRead, MemtoReg.]
54 Another Example: Single-Cycle and Pipelined
[Figure (H&H): single-cycle MIPS datapath above the same datapath split into Fetch, Decode, Execute, Memory, and Writeback stages, with signals suffixed by stage (PCF, InstrD, SrcAE, SrcBE, WriteDataE, WriteRegE, ALUOutM, WriteDataM, ReadDataW, ALUOutW, ResultW).]
55 Another Example: Correct Pipelined Datapath
[Figure (H&H): the pipelined datapath with WriteReg carried through the Execute, Memory, and Writeback stages (WriteRegE, WriteRegM, WriteRegW).]
- WriteReg must arrive at the same time as Result
56 Another Example: Pipelined Control
[Figure (H&H): the control unit decodes Op (Instr[31:26]) and Funct (Instr[5:0]) in the Decode stage and produces RegWriteD, MemtoRegD, MemWriteD, BranchD, ALUControlD, ALUSrcD, RegDstD; these are registered forward as the E, M, and W versions until the stage that uses them.]
- Same control unit as single-cycle processor
- Control delayed to proper pipeline stage
57 Remember: An Ideal Pipeline
- Goal: Increase throughput with little increase in cost (hardware cost, in case of instruction processing)
- Repetition of identical operations
  - The same operation is repeated on a large number of different inputs (e.g., all laundry loads go through the same steps)
- Repetition of independent operations
  - No dependencies between repeated operations
- Uniformly partitionable suboperations
  - Processing can be evenly divided into uniform-latency suboperations (that do not share resources)
- Fitting examples: automobile assembly line, doing laundry
  - What about the instruction processing cycle?
58 Instruction Pipeline: Not An Ideal Pipeline
- Identical operations ... NOT!
  - Different instructions -> not all need the same stages
  - Forcing different instructions to go through the same pipe stages -> external fragmentation (some pipe stages idle for some instructions)
- Uniform suboperations ... NOT!
  - Different pipeline stages -> not the same latency
  - Need to force each stage to be controlled by the same clock -> internal fragmentation (some pipe stages are too fast but all take the same clock cycle time)
- Independent operations ... NOT!
  - Instructions are not independent of each other
  - Need to detect and resolve inter-instruction dependencies to ensure the pipeline provides correct results -> pipeline stalls (pipeline is not always moving)
59 Issues in Pipeline Design
- Balancing work in pipeline stages
  - How many stages and what is done in each stage
- Keeping the pipeline correct, moving, and full in the presence of events that disrupt pipeline flow
  - Handling dependences
    - Data
    - Control
  - Handling resource contention
  - Handling long-latency (multi-cycle) operations
- Handling exceptions, interrupts
- Advanced: Improving pipeline throughput
  - Minimizing stalls
60 Causes of Pipeline Stalls
- Stall: A condition when the pipeline stops moving
- Resource contention
- Dependences (between instructions)
  - Data
  - Control
- Long-latency (multi-cycle) operations
61 Dependences and Their Types
- Also called "dependency" or, less desirably, "hazard"
- Dependences dictate ordering requirements between instructions
- Two types
  - Data dependence
  - Control dependence
- Resource contention is sometimes called resource dependence
  - However, this is not fundamental to (dictated by) program semantics, so we will treat it separately
62 Handling Resource Contention
- Happens when instructions in two pipeline stages need the same resource
- Solution 1: Eliminate the cause of contention
  - Duplicate the resource or increase its throughput
    - E.g., use separate instruction and data memories (caches)
    - E.g., use multiple ports for memory structures
- Solution 2: Detect the resource contention and stall one of the contending stages
  - Which stage do you stall?
  - Example: What if you had a single read and write port for the register file?
63 Example Resource Dependence: RegFile
- The register file can be read and written in the same cycle:
  - write takes place during the 1st half of the cycle
  - read takes place during the 2nd half of the cycle => no problem!!!
  - However, operations that involve the register file have only half a clock cycle to complete the operation!!
[Figure (H&H): pipeline diagram over time (cycles) for add $s0, $s2, $s3; and $t0, $s0, $s1; or $t1, $s4, $s0; sub $t2, $s0, $s5, each passing through IM, RF, ALU, DM, RF.]
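The write-then-read ordering within one cycle can be modeled with a small class. This is a behavioral sketch only: the class name, its `cycle` method, and the use of register number 16 for $s0 are assumptions made for illustration.

```python
class RegisterFile:
    """Register file whose write happens in the first half of the cycle
    and whose reads happen in the second half."""

    def __init__(self, n_regs=32):
        self.regs = [0] * n_regs

    def cycle(self, write=None, reads=()):
        """Simulate one clock cycle.

        write: optional (register number, value) pair from the writeback stage
        reads: register numbers requested by the decode stage
        Returns the values read, which already reflect this cycle's write.
        """
        if write is not None:              # first half: writeback
            reg, value = write
            self.regs[reg] = value
        return [self.regs[r] for r in reads]  # second half: operand read


rf = RegisterFile()
# An older instruction writes register 16 while a younger one reads it.
values = rf.cycle(write=(16, 42), reads=(16, 17))
```

Because the write lands before the read, an instruction in decode sees the value being written back in the same cycle, so these two stages do not have to stall each other over the register file.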
More informationMIPS Pipelining. Computer Organization Architectures for Embedded Computing. Wednesday 8 October 14
MIPS Pipelining Computer Organization Architectures for Embedded Computing Wednesday 8 October 14 Many slides adapted from: Computer Organization and Design, Patterson & Hennessy 4th Edition, 2011, MK
More information15-740/ Computer Architecture Lecture 4: Pipelining. Prof. Onur Mutlu Carnegie Mellon University
15-740/18-740 Computer Architecture Lecture 4: Pipelining Prof. Onur Mutlu Carnegie Mellon University Last Time Addressing modes Other ISA-level tradeoffs Programmer vs. microarchitect Virtual memory Unaligned
More informationCOSC 6385 Computer Architecture - Pipelining
COSC 6385 Computer Architecture - Pipelining Fall 2006 Some of the slides are based on a lecture by David Culler, Instruction Set Architecture Relevant features for distinguishing ISA s Internal storage
More informationCPE 335. Basic MIPS Architecture Part II
CPE 335 Computer Organization Basic MIPS Architecture Part II Dr. Iyad Jafar Adapted from Dr. Gheith Abandah slides http://www.abandah.com/gheith/courses/cpe335_s08/index.html CPE232 Basic MIPS Architecture
More informationEECS150 - Digital Design Lecture 10- CPU Microarchitecture. Processor Microarchitecture Introduction
EECS150 - Digital Design Lecture 10- CPU Microarchitecture Feb 18, 2010 John Wawrzynek Spring 2010 EECS150 - Lec10-cpu Page 1 Processor Microarchitecture Introduction Microarchitecture: how to implement
More informationChapter 4. The Processor
Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified
More informationPipelining Analogy. Pipelined laundry: overlapping execution. Parallelism improves performance. Four loads: Non-stop: Speedup = 8/3.5 = 2.3.
Pipelining Analogy Pipelined laundry: overlapping execution Parallelism improves performance Four loads: Speedup = 8/3.5 = 2.3 Non-stop: Speedup =2n/05n+15 2n/0.5n 1.5 4 = number of stages 4.5 An Overview
More informationDepartment of Electrical and Computer Engineering The University of Texas at Austin
Department of Electrical and Computer Engineering The University of Texas at Austin EE 360N, Fall 2005 Yale Patt, Instructor Aater Suleman, Linda Bigelow, Jose Joao, Veynu Narasiman, TAs Final Exam, December,
More informationLecture 8: Control COS / ELE 375. Computer Architecture and Organization. Princeton University Fall Prof. David August
Lecture 8: Control COS / ELE 375 Computer Architecture and Organization Princeton University Fall 2015 Prof. David August 1 Datapath and Control Datapath The collection of state elements, computation elements,
More informationCOMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition The Processor - Introduction
More informationChapter 4. Instruction Execution. Introduction. CPU Overview. Multiplexers. Chapter 4 The Processor 1. The Processor.
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor The Processor - Introduction
More informationEITF20: Computer Architecture Part2.2.1: Pipeline-1
EITF20: Computer Architecture Part2.2.1: Pipeline-1 Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Pipelining Harzards Structural hazards Data hazards Control hazards Implementation issues Multi-cycle
More informationELEC 5200/6200 Computer Architecture and Design Spring 2017 Lecture 4: Datapath and Control
ELEC 52/62 Computer Architecture and Design Spring 217 Lecture 4: Datapath and Control Ujjwal Guin, Assistant Professor Department of Electrical and Computer Engineering Auburn University, Auburn, AL 36849
More informationComputer Organization and Structure. Bing-Yu Chen National Taiwan University
Computer Organization and Structure Bing-Yu Chen National Taiwan University The Processor Logic Design Conventions Building a Datapath A Simple Implementation Scheme An Overview of Pipelining Pipelined
More informationCOMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle
More informationENGN1640: Design of Computing Systems Topic 04: Single-Cycle Processor Design
ENGN64: Design of Computing Systems Topic 4: Single-Cycle Processor Design Professor Sherief Reda http://scale.engin.brown.edu Electrical Sciences and Computer Engineering School of Engineering Brown University
More informationThe LC-3 Instruction Set Architecture. ISA Overview Operate instructions Data Movement instructions Control Instructions LC-3 data path
Chapter 5 The LC-3 Instruction Set Architecture ISA Overview Operate instructions Data Movement instructions Control Instructions LC-3 data path A specific ISA: The LC-3 We have: Reviewed data encoding
More informationEITF20: Computer Architecture Part2.2.1: Pipeline-1
EITF20: Computer Architecture Part2.2.1: Pipeline-1 Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Pipelining Harzards Structural hazards Data hazards Control hazards Implementation issues Multi-cycle
More informationPipeline design. Mehran Rezaei
Pipeline design Mehran Rezaei How Can We Improve the Performance? Exec Time = IC * CPI * CCT Optimization IC CPI CCT Source Level * Compiler * * ISA * * Organization * * Technology * With Pipelining We
More informationEE 457 Unit 6a. Basic Pipelining Techniques
EE 47 Unit 6a Basic Pipelining Techniques 2 Pipelining Introduction Consider a drink bottling plant Filling the bottle = 3 sec. Placing the cap = 3 sec. Labeling = 3 sec. Would you want Machine = Does
More informationEECS150 - Digital Design Lecture 9- CPU Microarchitecture. Watson: Jeopardy-playing Computer
EECS150 - Digital Design Lecture 9- CPU Microarchitecture Feb 15, 2011 John Wawrzynek Spring 2011 EECS150 - Lec09-cpu Page 1 Watson: Jeopardy-playing Computer Watson is made up of a cluster of ninety IBM
More informationMidnight Laundry. IC220 Set #19: Laundry, Co-dependency, and other Hazards of Modern (Architecture) Life. Return to Chapter 4
IC220 Set #9: Laundry, Co-dependency, and other Hazards of Modern (Architecture) Life Return to Chapter 4 Midnight Laundry Task order A B C D 6 PM 7 8 9 0 2 2 AM 2 Smarty Laundry Task order A B C D 6 PM
More information15-740/ Computer Architecture Lecture 7: Pipelining. Prof. Onur Mutlu Carnegie Mellon University Fall 2011, 9/26/2011
15-740/18-740 Computer Architecture Lecture 7: Pipelining Prof. Onur Mutlu Carnegie Mellon University Fall 2011, 9/26/2011 Review of Last Lecture More ISA Tradeoffs Programmer vs. microarchitect Transactional
More informationDepartment of Electrical and Computer Engineering The University of Texas at Austin
Department of Electrical and Computer Engineering The University of Texas at Austin EE 60N, Fall 00 Yale Patt, Instructor Santhosh Srinath, Danny Lynch, TAs Exam, November 9, 00 Name: Problem (0 points):
More informationSpecial Microarchitecture based on a lecture by Sanjay Rajopadhye modified by Yashwant Malaiya
Special Microarchitecture based on a lecture by Sanjay Rajopadhye modified by Yashwant Malaiya Computing Layers Problems Algorithms Language Instruction Set Architecture Microarchitecture Circuits Devices
More informationCS 61C: Great Ideas in Computer Architecture Control and Pipelining
CS 6C: Great Ideas in Computer Architecture Control and Pipelining Instructors: Vladimir Stojanovic and Nicholas Weaver http://inst.eecs.berkeley.edu/~cs6c/sp6 Datapath Control Signals ExtOp: zero, sign
More informationOutline Marquette University
COEN-4710 Computer Hardware Lecture 4 Processor Part 2: Pipelining (Ch.4) Cristinel Ababei Department of Electrical and Computer Engineering Credits: Slides adapted primarily from presentations from Mike
More informationProcessor (II) - pipelining. Hwansoo Han
Processor (II) - pipelining Hwansoo Han Pipelining Analogy Pipelined laundry: overlapping execution Parallelism improves performance Four loads: Speedup = 8/3.5 =2.3 Non-stop: 2n/0.5n + 1.5 4 = number
More informationData Hazards Compiler Scheduling Pipeline scheduling or instruction scheduling: Compiler generates code to eliminate hazard
Data Hazards Compiler Scheduling Pipeline scheduling or instruction scheduling: Compiler generates code to eliminate hazard Consider: a = b + c; d = e - f; Assume loads have a latency of one clock cycle:
More informationComputer Architecture 计算机体系结构. Lecture 2. Instruction Set Architecture 第二讲 指令集架构. Chao Li, PhD. 李超博士
Computer Architecture 计算机体系结构 Lecture 2. Instruction Set Architecture 第二讲 指令集架构 Chao Li, PhD. 李超博士 SJTU-SE346, Spring 27 Review ENIAC (946) used decimal representation; vacuum tubes per digit; could store
More informationComputer Architecture. Lecture 6: Pipelining
Compter Architectre Lectre 6: Pipelining Dr. Ahmed Sallam Based on original slides by Prof. Onr tl Agenda for Today & Net Few Lectres Single-cycle icroarchitectres lti-cycle and icroprogrammed icroarchitectres
More informationPipelined Processor Design
Pipelined Processor Design Pipelined Implementation: MIPS Virendra Singh Computer Design and Test Lab. Indian Institute of Science (IISc) Bangalore virendra@computer.org Advance Computer Architecture http://www.serc.iisc.ernet.in/~viren/courses/aca/aca.htm
More informationENCM 369 Winter 2018 Lab 9 for the Week of March 19
page 1 of 9 ENCM 369 Winter 2018 Lab 9 for the Week of March 19 Steve Norman Department of Electrical & Computer Engineering University of Calgary March 2018 Lab instructions and other documents for ENCM
More informationCOMPUTER ORGANIZATION AND DESIGN
COMPUTER ORGANIZATION AND DESIGN 5 Edition th The Hardware/Software Interface Chapter 4 The Processor 4.1 Introduction Introduction CPU performance factors Instruction count CPI and Cycle time Determined
More informationLC-3 Instruction Processing
LC-3 Instruction Processing (Textbookʼs Chapter 4)# Next set of Slides:# Textbook Chapter 10-10.2# Instruction Processing# It is impossible to do all of an instruction in one clock cycle.# Processors break
More informationPipeline Overview. Dr. Jiang Li. Adapted from the slides provided by the authors. Jiang Li, Ph.D. Department of Computer Science
Pipeline Overview Dr. Jiang Li Adapted from the slides provided by the authors Outline MIPS An ISA for Pipelining 5 stage pipelining Structural and Data Hazards Forwarding Branch Schemes Exceptions and
More informationLecture 05: Pipelining: Basic/ Intermediate Concepts and Implementation
Lecture 05: Pipelining: Basic/ Intermediate Concepts and Implementation CSE 564 Computer Architecture Summer 2017 Department of Computer Science and Engineering Yonghong Yan yan@oakland.edu www.secs.oakland.edu/~yan
More informationModern Computer Architecture
Modern Computer Architecture Lecture2 Pipelining: Basic and Intermediate Concepts Hongbin Sun 国家集成电路人才培养基地 Xi an Jiaotong University Pipelining: Its Natural! Laundry Example Ann, Brian, Cathy, Dave each
More informationCOMPUTER ORGANIZATION AND DESIGN
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle
More informationCS 110 Computer Architecture. Pipelining. Guest Lecture: Shu Yin. School of Information Science and Technology SIST
CS 110 Computer Architecture Pipelining Guest Lecture: Shu Yin http://shtech.org/courses/ca/ School of Information Science and Technology SIST ShanghaiTech University Slides based on UC Berkley's CS61C
More informationMajor CPU Design Steps
Datapath Major CPU Design Steps. Analyze instruction set operations using independent RTN ISA => RTN => datapath requirements. This provides the the required datapath components and how they are connected
More informationFull Datapath. Chapter 4 The Processor 2
Pipelining Full Datapath Chapter 4 The Processor 2 Datapath With Control Chapter 4 The Processor 3 Performance Issues Longest delay determines clock period Critical path: load instruction Instruction memory
More informationDepartment of Electrical and Computer Engineering The University of Texas at Austin
Department of Electrical and Computer Engineering The University of Texas at Austin EE N Spring 7 Y. N. Patt, Instructor Chirag Sakhuja, Sarbartha Banerjee, Jonathan Dahm, Arjun Teh, TAs Exam March, 7
More informationLECTURE 3: THE PROCESSOR
LECTURE 3: THE PROCESSOR Abridged version of Patterson & Hennessy (2013):Ch.4 Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU
More informationLC-3 Instruction Processing. (Textbook s Chapter 4)
LC-3 Instruction Processing (Textbook s Chapter 4) Instruction Processing Fetch instruction from memory Decode instruction Evaluate address Fetch operands from memory Usually combine Execute operation
More informationChapter 4. The Processor
Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified
More informationEECS 151/251A Fall 2017 Digital Design and Integrated Circuits. Instructor: John Wawrzynek and Nicholas Weaver. Lecture 13 EE141
EECS 151/251A Fall 2017 Digital Design and Integrated Circuits Instructor: John Wawrzynek and Nicholas Weaver Lecture 13 Project Introduction You will design and optimize a RISC-V processor Phase 1: Design
More informationSimple Instruction Pipelining
Simple Instruction Pipelining Krste Asanovic Laboratory for Computer Science Massachusetts Institute of Technology Processor Performance Equation Time = Instructions * Cycles * Time Program Program Instruction
More informationComputer Systems Architecture Spring 2016
Computer Systems Architecture Spring 2016 Lecture 01: Introduction Shuai Wang Department of Computer Science and Technology Nanjing University [Adapted from Computer Architecture: A Quantitative Approach,
More informationLecture 5 and 6. ICS 152 Computer Systems Architecture. Prof. Juan Luis Aragón
ICS 152 Computer Systems Architecture Prof. Juan Luis Aragón Lecture 5 and 6 Multicycle Implementation Introduction to Microprogramming Readings: Sections 5.4 and 5.5 1 Review of Last Lecture We have seen
More informationColumbia University CSEE 3827 Fundamentals of Computer Systems Final Exam
Columbia University CSEE 3827 Fundamentals of Computer Systems Final Exam Prof. Martha A. Kim December 7, 23 Name: First Last (Family) UNI (e.g., mak29) You are allowed 3 hours. You may consult your own
More informationETH, Design of Digital Circuits, SS17 Practice Exercises III
ETH, Design of Digital Circuits, SS17 Practice Exercises III Instructors: Prof. Onur Mutlu, Prof. Srdjan Capkun TAs: Jeremie Kim, Minesh Patel, Hasan Hassan, Arash Tavakkol, Der-Yeuan Yu, Francois Serre,
More informationEITF20: Computer Architecture Part2.2.1: Pipeline-1
EITF20: Computer Architecture Part2.2.1: Pipeline-1 Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Pipelining Harzards Structural hazards Data hazards Control hazards Implementation issues Multi-cycle
More informationCOMP303 - Computer Architecture Lecture 8. Designing a Single Cycle Datapath
COMP33 - Computer Architecture Lecture 8 Designing a Single Cycle Datapath The Big Picture The Five Classic Components of a Computer Processor Input Control Memory Datapath Output The Big Picture: The
More informationPipelining. Ideal speedup is number of stages in the pipeline. Do we achieve this? 2. Improve performance by increasing instruction throughput ...
CHAPTER 6 1 Pipelining Instruction class Instruction memory ister read ALU Data memory ister write Total (in ps) Load word 200 100 200 200 100 800 Store word 200 100 200 200 700 R-format 200 100 200 100
More informationEI338: Computer Systems and Engineering (Computer Architecture & Operating Systems)
EI338: Computer Systems and Engineering (Computer Architecture & Operating Systems) Chentao Wu 吴晨涛 Associate Professor Dept. of Computer Science and Engineering Shanghai Jiao Tong University SEIEE Building
More informationPipelining. Maurizio Palesi
* Pipelining * Adapted from David A. Patterson s CS252 lecture slides, http://www.cs.berkeley/~pattrsn/252s98/index.html Copyright 1998 UCB 1 References John L. Hennessy and David A. Patterson, Computer
More informationLecture 6: Pipelining
Lecture 6: Pipelining i CSCE 26 Computer Organization Instructor: Saraju P. ohanty, Ph. D. NOTE: The figures, text etc included in slides are borrowed from various books, websites, authors pages, and other
More informationComputer Architecture
Lecture 3: Pipelining Iakovos Mavroidis Computer Science Department University of Crete 1 Previous Lecture Measurements and metrics : Performance, Cost, Dependability, Power Guidelines and principles in
More informationProcessor (I) - datapath & control. Hwansoo Han
Processor (I) - datapath & control Hwansoo Han Introduction CPU performance factors Instruction count - Determined by ISA and compiler CPI and Cycle time - Determined by CPU hardware We will examine two
More informationLecture 10: Pipelined Implementations
U 8-7 S 9 L- 8-7 Lectre : Pipelined Implementations James. Hoe ept of EE, U Febrary 23, 29 nnoncements: Project is de this week idterm graded, d reslts posted Handots: H9 Homework 3 (on lackboard) Graded
More informationENE 334 Microprocessors
ENE 334 Microprocessors Lecture 6: Datapath and Control : Dejwoot KHAWPARISUTH Adapted from Computer Organization and Design, 3 th & 4 th Edition, Patterson & Hennessy, 2005/2008, Elsevier (MK) http://webstaff.kmutt.ac.th/~dejwoot.kha/
More informationDepartment of Computer and IT Engineering University of Kurdistan. Computer Architecture Pipelining. By: Dr. Alireza Abdollahpouri
Department of Computer and IT Engineering University of Kurdistan Computer Architecture Pipelining By: Dr. Alireza Abdollahpouri Pipelined MIPS processor Any instruction set can be implemented in many
More information14:332:331 Pipelined Datapath
14:332:331 Pipelined Datapath I n s t r. O r d e r Inst 0 Inst 1 Inst 2 Inst 3 Inst 4 Single Cycle Disadvantages & Advantages Uses the clock cycle inefficiently the clock cycle must be timed to accommodate
More information