Chapter 4 (Part II) Sequential Laundry

Size: px
Start display at page:

Download "Chapter 4 (Part II) Sequential Laundry"

Transcription

1 Chapter 4 (Part II) The Processor Baback Izadi Division of Engineering Programs bai@engr.newpaltz.edu Sequential Laundry 6 P A T a s k O r d e r A B C D Time Sequential laundry takes 8 hours for 4 loads 2 Division of Engineering Programs SUNY New Paltz 1

2 Pipelined Laundry 6 P T a s k O r d e r A B C Time D Pipelined laundry: overlapping execution Parallelism improves performance 3 Pipelining Analogy Four loads: Speedup = 8/3.5 = 2.3 Non-stop: Pipelining doesn t help latency of single task, it helps throughput of entire workload ultiple tasks operating simultaneously using different resources Potential speedup = Number pipe stages Pipeline rate limited by slowest pipeline stage Unbalanced lengths of pipe stages reduces speedup Time to fill pipeline and time to drain it reduces speedup Stall for dependencies Speedup = 2n/0.5n = number of stages 4 Division of Engineering Programs SUNY New Paltz 2

3 Single Cycle Implementation Calculate cycle time assuming negligible delays except: memory (2ns), and adders (2ns), register file access (1ns) PCSrc 4 Add Write Shift left 2 Add resu lt 1 u x 0 PC address In s tru ction [3 1 0 ] Instruction memory In structio n [25 2 1] In structio n [20 1 6] In structio n [15 1 1] 1 u x 0 Dst In structio n [15 0 ] register 1 register 2 Write register Write data d ata 1 d ata 2 isters 16 Sign 32 extend Src 1 u x 0 control Zero result emwrite Address Write data Data memory em data emto 1 u x 0 Instru ction [5 0] Op Instruction Instr. emory ister Op. Data emory. Write Total R-format 200ps 100ps 200ps 0 100ps 600ps lw 200ps 100ps 200ps 200ps 100ps 800ps sw 200ps 100ps 200ps 200ps 700ps beq 200ps 100ps 200ps 500ps 5 Single Stage VS. Pipeline Performance Instruction Instr. emory ister Op. Data emory. Write Total R-format 200ps 100ps 200ps 0 100ps 600ps lw 200ps 100ps 200ps 200ps 100ps 800ps sw 200ps 100ps 200ps 200ps 700ps beq 200ps 100ps 200ps 500ps Single-cycle (T c = 800ps) Pipelined (T c = 200ps) 6 Division of Engineering Programs SUNY New Paltz 3

4 IPS Pipeline Five stages, one step per stage 1. IF: Instruction fetch from memory 2. ID: Instruction decode & register read 3. EX: Execute operation or calculate address 4. E: Access memory operand 5. WB: Write result back to register 7 Pipelining What makes it easy in IPS All instructions are the same length Just a few instruction formats emory operands appear only in loads and stores What makes it hard? Structural hazards: suppose we had only one memory Control hazards: need to worry about branch instructions Data hazards: an instruction depends on a previous instruction We ll build a simple pipeline and look at these issues what makes it really hard for modern processors Exception handling Trying to improve performance with out-of-order execution 8 Division of Engineering Programs SUNY New Paltz 4

5 Pipeline Speedup 9 The Five Stages of Load What do we need to add to actually split the datapath into stages? LW Ifetch /Dec Exec em Wr Ifetch: Instruction Fetch Fetch the instruction from the Instruction emory /Dec: isters Fetch and Instruction Decode Exec: Calculate the memory address em: the data from the Data emory Wr: Write the data back to the register file 10 Division of Engineering Programs SUNY New Paltz 5

6 Basic Idea IF: Instruction fetch 0 u x ID: Instruction decode/ registerfile read EX: Execute/ address calculation E: emory access WB: Write back 1 Add 4 Add Add re s ult Shift left 2 PC Addre s s Ins tructio n Ins truction me mory re gis ter 1 re gis ter 2 isters Write re gis te r Write data data 1 data 2 0 u x 1 Ze ro res ult Addre ss Data me mory Write data data 1 u x 0 16 Sign extend Basic Idea IF: Instruction fetch 0 u x ID: Instruction decode/ registerfile read EX: Execute/ address calculation E: emory access WB: Write back 1 Add 4 Add Add re s ult Shift left 2 PC Addre s s Ins tructio n Ins truction me mory re gis ter 1 re gis ter 2 isters Write re gis te r Write data data 1 data 2 0 u x 1 Ze ro res ult Addre ss Data me mory Write data data 1 u x 0 16 Sign extend Division of Engineering Programs SUNY New Paltz 6

7 Pipelining and ISA Design IPS ISA designed for pipelining All instructions are 32-bits Easier to fetch and decode in one cycle c.f. x86: 1- to 17-byte instructions Few and regular instruction formats Can decode and read registers in one step Load/store addressing Can calculate address in 3 rd stage, access memory in 4 th stage Alignment of memory operands emory access takes only one cycle 13 Pipeline registers Need registers between stages To hold information produced in previous cycle Can you find a problem? What instructions can we execute to manifest the problem? 14 Division of Engineering Programs SUNY New Paltz 7

8 Pipeline Operation Cycle-by-cycle flow of instructions through the pipelined datapath Single-clock-cycle pipeline diagram Shows pipeline usage in a single cycle Highlight resources used c.f. multi-clock-cycle diagram Graph of operation over time We ll look at single-clock-cycle diagrams for load & store 15 IF for Load, Store, 16 Division of Engineering Programs SUNY New Paltz 8

9 ID for Load, Store, 17 EX for Load 18 Division of Engineering Programs SUNY New Paltz 9

10 E for Load 19 WB for Load Wrong register number 20 Division of Engineering Programs SUNY New Paltz 10

11 Corrected Datapath for Load 21 EX for Store 22 Division of Engineering Programs SUNY New Paltz 11

12 E for Store 23 WB for Store 24 Division of Engineering Programs SUNY New Paltz 12

13 Graphically Representing Pipelines Program execution order (in instructions) lw $10, 20($1) Time (in clock cycles) CC 1 CC 2 CC 3 CC 4 CC 5 CC 6 I D sub $11, $2, $3 I D Can help with answering questions like: How many cycles does it take to execute this code? What is the doing during cycle 4? Use this representation to help understand datapaths 25 Why Pipeline? Time (clock cycles) I n s t r. O r d e r Inst 0 Inst 1 Inst 2 Inst 3 Inst 4 Im Dm Im Dm Im Dm Im Dm Im Dm 26 Division of Engineering Programs SUNY New Paltz 13

14 ulti-cycle Pipeline Diagram Form showing resource usage 27 Single-Cycle Pipeline Diagram State of pipeline in a given cycle 28 Division of Engineering Programs SUNY New Paltz 14

15 Pipelined Control (Simplified) 29 Pipeline Control We have 5 stages. What needs to be controlled in each stage? Instruction Fetch and PC Increment Instruction Decode / ister Fetch Execution emory Stage Write Back How would control be handled in an automobile plant? A fancy control center telling everyone what to do? Should we use a finite state machine? 30 Division of Engineering Programs SUNY New Paltz 15

16 Pipeline Control Control signals derived from instruction As in single-cycle implementation Pass control signals along just like the data Execution/Address Calculation stage control lines emory access stage control lines Write-back stage control lines Instruction Dst Op1 Op0 Src Branch em em Write write em to R-format lw sw X X beq X X WB Instruction Control WB EX WB IF/ID ID/EX EX/ E E/W B 31 Pipelined Control 32 Division of Engineering Programs SUNY New Paltz 16

17 Designing a Pipelined Processor Go back and examine your datapath and control diagram Associated resources with states Ensure that flows do not conflict, or figure out how to resolve Assert control in appropriate stage 33 Pipelining Troubles? Pipeline Hazards Situations that prevent starting the next instruction in the next cycle Structural hazards: attempt to use the same resource two different ways at the same time. Data hazards: attempt to use item before it is ready. Instruction depends on result of prior instruction still in the pipeline. Control hazards: attempt to make a decision before condition is evaluated. Branch instructions Can always resolve hazards by waiting. Pipeline control must detect the hazard. Take action (or delay action) to resolve hazards. 34 Division of Engineering Programs SUNY New Paltz 17

18 Structure Hazards Conflict for use of a resource In IPS pipeline with a single memory Load/store requires data access Instruction fetch would have to stall for that cycle Would cause a pipeline bubble Hence, pipelined datapaths require separate instruction/data memories Or separate instruction/data caches 35 Data Hazards Problem with starting next instruction before first is finished Dependencies that go backward in time are data hazards Time (in clock cycles) Value of register $2 : Program execution order (in instructions) sub $2, $1,$3 CC 1 CC 2 CC 3 CC 4 CC 5 CC 6 CC 7 CC 8 CC / I D and $12, $2, $5 I D or $13, $6, $2 I D add $14, $2, $2 I D sw $15, 100($2) I D 36 Division of Engineering Programs SUNY New Paltz 18

19 Data Hazard Solution An instruction depends on completion of data access by a previous instruction add $s0, $t0, $t1 sub $t2, $s0, $t3 nop nop 37 Software Solution Have compiler guarantee no hazards Where should compiler insert nop instructions? sub $2, $1, $3 and $12, $2, $5 or $13, $6, $2 add $14, $2, $2 sw $15, 100($2) Problem: It happens too often to rely on compiler It really slows us down! 38 Division of Engineering Programs SUNY New Paltz 19

20 Data Hazards Software Solution Time (in clock cycles) Value of CC 1 CC 2 CC 3 CC 4 CC 5 CC 6 CC 7 CC 8 CC 9 register $2 : / Program execution order (in instructions) sub $2, $1,$3 I D and $12, $2, $5 I D or $13, $6, $2 I D add $14, $2, $2 I D sw $15, 100($2) I D 39 Code Scheduling to Avoid Stalls C code for A = B + E; C = B + F; stall stall lw $t1, 0($t0) lw $t2, 4($t0) add $t3, $t1, $t2 sw $t3, 12($t0) lw $t4, 8($t0) add $t5, $t1, $t4 sw $t5, 16($t0) I I D D I D I D I D I D I D 11 cycles not counting the dependencies 40 Division of Engineering Programs SUNY New Paltz 20

21 Code Scheduling to Avoid Stalls Reorder code to avoid use of load result in the next instruction C code for A = B + E; C = B + F; stall stall 15 cycles lw $t1, 0($t0) lw $t2, 4($t0) nop nop add $t3, $t1, $t2 sw $t3, 12($t0) lw $t4, 8($t0) nop nop add $t5, $t1, $t4 sw $t5, 16($t0) 12 cycles lw $t1, 0($t0) lw $t2, 4($t0) nop lw $t4, 8($t0) add $t3, $t1, $t2 sw $t3, 12($t0) add $t5, $t1, $t4 sw $t5, 16($t0) 41 Forwarding (aka Bypassing) Use result when it is computed Don t wait for it to be stored in a register Requires extra connections in the datapath 42 Division of Engineering Programs SUNY New Paltz 21

22 Data Hazards in Instructions Consider this sequence: sub $2, $1,$3 and $12,$2,$5 or $13,$6,$2 add $14,$2,$2 sw $15,100($2) We can resolve hazards with forwarding How do we detect when to forward? 4.7 Data Hazards: Forwarding vs. Stalling 43 Data Hazard Solution: Forwarding Use temporary results ( forwarding), don t wait for them to be written Also, write register file during 1st half of clock and read during 2nd Time (in clock cycles) half CC 1 CC 2 CC 3 CC 4 CC 5 CC 6 CC 7 CC 8 CC 9 Value of register $2 : / Value of EX/E : X X X 20 X X X X X Value of E/W B : X X X X 20 X X X X Program execution order (in instructions) sub $2,$1,$3 I R eg D and $12, $2,$5 I D or $13, $6, $2 I D add $14, $2, $2 I D sw $15, 100($2) I D what if this $2 was $13? 44 Division of Engineering Programs SUNY New Paltz 22

23 Data Hazard Solution: Forwarding Time (in clock cycles) CC 1 CC 2 CC 3 CC 4 CC 5 CC 6 CC 7 CC 8 CC 9 Value of register $2 : / Value of EX/E : X X X 20 X X X X X Value of E/W B : X X X X 20 X X X X Program execution order (in instructions) sub $2, $1, $3 I R eg D and $12, $2,$5 I D or $13, $6, $2 add $14, $2, $2 sw $15, 100($2) what if this $2 was $13? 45 Dependencies & Forwarding 46 Division of Engineering Programs SUNY New Paltz 23

24 Forwarding Paths 47 Detecting the Need to Forward Pass register numbers along pipeline e.g., ID/EX.isterRs = register number for Rs sitting in ID/EX pipeline register operand register numbers in EX stage are given by ID/EX.isterRs, ID/EX.isterRt Data hazards when 1a. EX/E.isterRd = ID/EX.isterRs 1b. EX/E.isterRd = ID/EX.isterRt 2a. E/WB.isterRd = ID/EX.isterRs 2b. E/WB.isterRd = ID/EX.isterRt Fwd from EX/E pipeline reg Fwd from E/WB pipeline reg 48 Division of Engineering Programs SUNY New Paltz 24

25 Detecting the Need to Forward But only if forwarding instruction will write to a register! EX/E.Write, E/WB.Write And only if Rd for that instruction is not $zero EX/E.isterRd 0, E/WB.isterRd 0 49 Datapath with Forwarding 00 ister file 01 em. or earlier 10 Prior 50 Division of Engineering Programs SUNY New Paltz 25

26 Forwarding Conditions EX hazard if (EX/E.Write and (EX/E.isterRd 0) and (EX/E.isterRd = ID/EX.isterRs)) ForwardA = 10 if (EX/E.Write and (EX/E.isterRd 0) and (EX/E.isterRd = ID/EX.isterRt)) ForwardB = 10 E hazard if (E/WB.Write and (E/WB.isterRd 0) and (E/WB.isterRd = ID/EX.isterRs)) ForwardA = 01 if (E/WB.Write and (E/WB.isterRd 0) and (E/WB.isterRd = ID/EX.isterRt)) ForwardB = Hazard Conditions Steer the result from previous instruction to the ID/EX WB 00 ister file 01 em. or earlier 10 Prior EX/E Control WB E/WB IF/ID EX WB PC Instruction memory Instruction isters ux Data memory ux ux EX hazard if (EX/E.Write and (EX/E.isterRd 0) IF/ID.isterRs IF/ID.isterRt IF/ID.isterRt IF/ID.isterRd and (EX /E.isterRd = ID/EX.isterRs)) ForwardA = 10 if (EX/E.Write and (EX/E.isterRd 0) and (EX /E.isterRd = ID/EX.isterRt)) ForwardB = 10 Rs Rt Rt Rd ux Forwarding unit EX/E.isterRd E/WB.isterRd 52 Division of Engineering Programs SUNY New Paltz 26

27 Hazard Conditions Steer the result from precious instruction to the ID/EX WB EX/E Control WB E/WB IF/ID EX WB PC Instruction memory Instruction isters ux Data memory ux ux E hazard if (E/WB.Write and (E/WB.isterRd 0) IF/ID.isterRs IF/ID.isterRt IF/ID.isterRt IF/ID.isterRd and (E/WB.isterRd = ID/EX.isterRs)) ForwardA = 01 if (E/WB.Write and (E/WB.isterRd 0) and (E/WB.isterRd = ID/EX.isterRt)) ForwardB = 01 Rs Rt Rt Rd ux Forwarding unit EX/E.isterRd E/WB.isterRd 53 Double Data Hazard Consider the sequence: add $1,$1,$2 add $1,$1,$3 add $1,$1,$4 Both hazards occur Want to use the most recent Revise E hazard condition Only fwd if EX hazard condition isn t true I D I D I D 54 Division of Engineering Programs SUNY New Paltz 27

28 Revised Forwarding Condition E hazard if (E/WB.Write and (E/WB.isterRd 0) and not (EX/E.Write and (EX/E.isterRd 0) and (EX/E.isterRd = ID/EX.isterRs)) and (E/WB.isterRd = ID/EX.isterRs)) ForwardA = 01 if (E/WB.Write and (E/WB.isterRd 0) and not (EX/E.Write and (EX/E.isterRd 0) and (EX/E.isterRd = ID/EX.isterRt)) and (E/WB.isterRd = ID/EX.isterRt)) ForwardB = Can't always forward lw can still cause a hazard: An instruction tries to read a register following a load instruction that writes to the same register. Time (in clock cycles) Program CC 1 CC 2 CC 3 CC 4 CC 5 CC 6 execution order (in instructions) lw $2, 20($1) I D and $4, $2, $5 I D CC 7 CC 8 CC 9 Need to stall for one cycle or $8, $2, $6 I D add $9, $4, $2 I D slt $1, $6, $7 I D Thus, we need a hazard detection unit to stall the load instruction 56 Division of Engineering Programs SUNY New Paltz 28

29 Load-Use Data Hazard Can t always avoid stalls by forwarding If value not computed when needed Can t forward backward in time! 57 Load-Use Hazard Detection Check when using instruction is decoded in ID stage operand register numbers in ID stage are given by IF/ID.isterRs, IF/ID.isterRt Load-use hazard when ID/EX.em and ((ID/EX.isterRt = IF/ID.isterRs) or (ID/EX.isterRt = IF/ID.isterRt)) If detected, stall and insert bubble 58 Division of Engineering Programs SUNY New Paltz 29

30 How to Stall the Pipeline Force control values in ID/EX register to 0 EX, E and WB do nop (no-operation) Prevent update of PC and IF/ID register Using instruction is decoded again Following instruction is fetched again 1-cycle stall allows E to read data for lw Can subsequently forward to EX stage 59 Stall/Bubble in the Pipeline Stall inserted here 60 Division of Engineering Programs SUNY New Paltz 30

31 Stall/Bubble in the Pipeline Or, more accurately 61 Datapath with Hazard Detection Stall by letting an instruction that won t write anything go forward Controls writing of the PC and IF/ID plus UX 62 Division of Engineering Programs SUNY New Paltz 31

32 Code Scheduling to Avoid Stalls Revisiting reordering code to avoid use of load result in the next instruction C code for A = B + E; C = B + F; stall stall lw $t1, 0($t0) lw $t2, 4($t0) nop add $t3, $t1, $t2 sw $t3, 12($t0) lw $t4, 8($t0) nop add $t5, $t1, $t4 sw $t5, 16($t0) lw $t1, 0($t0) lw $t2, 4($t0) lw $t4, 8($t0) add $t3, $t1, $t2 sw $t3, 12($t0) add $t5, $t1, $t4 sw $t5, 16($t0) 11 cycles 13 cycles 63 Control Hazards Branch determines flow of control Fetching next instruction depends on branch outcome Pipeline can t always fetch correct instruction Still working on ID stage of branch In IPS pipeline Need to compare registers and compute target early in the pipeline Add hardware to do it in ID stage 64 Division of Engineering Programs SUNY New Paltz 32

33 Branch Hazards When decide to branch, other instructions may be in the pipeline! If branch outcome determined in E 4.8 Control Hazards Flush these instructions (Set control values to 0) PC 65 Stall on Branch Wait until branch outcome determined before fetching next instruction 66 Division of Engineering Programs SUNY New Paltz 33

34 Branch Prediction Longer pipelines can t readily determine branch outcome early Stall penalty becomes unacceptable Predict outcome of branch Only stall if prediction is wrong In IPS pipeline Can predict branches not taken Fetch instruction after branch, with no delay Need to add hardware for flushing instructions if we are wrong 67 IPS with Predict Not Taken Prediction correct Prediction incorrect 68 Division of Engineering Programs SUNY New Paltz 34

35 Our Original Datapath PCSrc ux 0 1 Control ID/EX WB EX/E WB E/WB IF/ID EX WB Add PC 4 Address Instruction memory Instruction Write register 1 data 1 register 2 isters Write data 2 register Write data Shift left 2 0 ux 1 Add result Add Src Zero result Branch Write data emwrite Address Data memory data emto ux 1 0 Instruction [15 0] Instruction [20 16] Instruction [15 11] Sign extend Dst 69 ux control Op em Reduce Branch Delay IF.Flush ux Hazard detection unit ID/EX WB EX/E Control 0 ux WB E/WB IF/ID EX WB 4 Shift left 2 PC Instruction memory isters = ux ux Data memory ux Sign extend ux Forwarding unit 70 Division of Engineering Programs SUNY New Paltz 35

36 Reducing Branch Delay ove hardware to determine outcome to ID stage Target address adder ister comparator Example: branch taken 36: sub $10, $4, $8 40: beq $1, $3, 7 44: and $12, $2, $5 48: or $13, $2, $6 52: add $14, $4, $2 56: slt $15, $6, $ : lw $4, 50($7) 71 Example: Branch Taken 72 Division of Engineering Programs SUNY New Paltz 36

37 Example: Branch Taken Pipeline Summary The BIG Picture Pipelining improves performance by increasing instruction throughput Executes multiple instructions in parallel Each instruction has the same latency Subject to hazards Structure, data, control Instruction set design affects complexity of pipeline implementation 74 Division of Engineering Programs SUNY New Paltz 37

38 Stalls and Performance The BIG Picture Stalls reduce performance But are required to get correct results Compiler can arrange code to avoid hazards and stalls Requires knowledge of the pipeline structure 75 Division of Engineering Programs SUNY New Paltz 38

Chapter Six. Dataı access. Reg. Instructionı. fetch. Dataı. Reg. access. Dataı. Reg. access. Dataı. Instructionı fetch. 2 ns 2 ns 2 ns 2 ns 2 ns

Chapter Six. Dataı access. Reg. Instructionı. fetch. Dataı. Reg. access. Dataı. Reg. access. Dataı. Instructionı fetch. 2 ns 2 ns 2 ns 2 ns 2 ns Chapter Si Pipelining Improve perfomance by increasing instruction throughput eecutionı Time lw $, ($) 2 6 8 2 6 8 access lw $2, 2($) 8 ns access lw $3, 3($) eecutionı Time lw $, ($) lw $2, 2($) 2 ns 8

More information

Pipelining Analogy. Pipelined laundry: overlapping execution. Parallelism improves performance. Four loads: Non-stop: Speedup = 8/3.5 = 2.3.

Pipelining Analogy. Pipelined laundry: overlapping execution. Parallelism improves performance. Four loads: Non-stop: Speedup = 8/3.5 = 2.3. Pipelining Analogy Pipelined laundry: overlapping execution Parallelism improves performance Four loads: Speedup = 8/3.5 = 2.3 Non-stop: Speedup =2n/05n+15 2n/0.5n 1.5 4 = number of stages 4.5 An Overview

More information

Processor (II) - pipelining. Hwansoo Han

Processor (II) - pipelining. Hwansoo Han Processor (II) - pipelining Hwansoo Han Pipelining Analogy Pipelined laundry: overlapping execution Parallelism improves performance Four loads: Speedup = 8/3.5 =2.3 Non-stop: 2n/0.5n + 1.5 4 = number

More information

Improve performance by increasing instruction throughput

Improve performance by increasing instruction throughput Improve performance by increasing instruction throughput Program execution order Time (in instructions) lw $1, 100($0) fetch 2 4 6 8 10 12 14 16 18 ALU Data access lw $2, 200($0) 8ns fetch ALU Data access

More information

3/12/2014. Single Cycle (Review) CSE 2021: Computer Organization. Single Cycle with Jump. Multi-Cycle Implementation. Why Multi-Cycle?

3/12/2014. Single Cycle (Review) CSE 2021: Computer Organization. Single Cycle with Jump. Multi-Cycle Implementation. Why Multi-Cycle? CSE 2021: Computer Organization Single Cycle (Review) Lecture-10b CPU Design : Pipelining-1 Overview, Datapath and control Shakil M. Khan 2 Single Cycle with Jump Multi-Cycle Implementation Instruction:

More information

Full Datapath. Chapter 4 The Processor 2

Full Datapath. Chapter 4 The Processor 2 Pipelining Full Datapath Chapter 4 The Processor 2 Datapath With Control Chapter 4 The Processor 3 Performance Issues Longest delay determines clock period Critical path: load instruction Instruction memory

More information

MIPS Pipelining. Computer Organization Architectures for Embedded Computing. Wednesday 8 October 14

MIPS Pipelining. Computer Organization Architectures for Embedded Computing. Wednesday 8 October 14 MIPS Pipelining Computer Organization Architectures for Embedded Computing Wednesday 8 October 14 Many slides adapted from: Computer Organization and Design, Patterson & Hennessy 4th Edition, 2011, MK

More information

What do we have so far? Multi-Cycle Datapath (Textbook Version)

What do we have so far? Multi-Cycle Datapath (Textbook Version) What do we have so far? ulti-cycle Datapath (Textbook Version) CPI: R-Type = 4, Load = 5, Store 4, Branch = 3 Only one instruction being processed in datapath How to lower CPI further? #1 Lec # 8 Summer2001

More information

Chapter 4. The Processor

Chapter 4. The Processor Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified

More information

Pipelining. CSC Friday, November 6, 2015

Pipelining. CSC Friday, November 6, 2015 Pipelining CSC 211.01 Friday, November 6, 2015 Performance Issues Longest delay determines clock period Critical path: load instruction Instruction memory register file ALU data memory register file Not

More information

COMPUTER ORGANIZATION AND DESIGN

COMPUTER ORGANIZATION AND DESIGN COMPUTER ORGANIZATION AND DESIGN 5 Edition th The Hardware/Software Interface Chapter 4 The Processor 4.1 Introduction Introduction CPU performance factors Instruction count CPI and Cycle time Determined

More information

Department of Computer and IT Engineering University of Kurdistan. Computer Architecture Pipelining. By: Dr. Alireza Abdollahpouri

Department of Computer and IT Engineering University of Kurdistan. Computer Architecture Pipelining. By: Dr. Alireza Abdollahpouri Department of Computer and IT Engineering University of Kurdistan Computer Architecture Pipelining By: Dr. Alireza Abdollahpouri Pipelined MIPS processor Any instruction set can be implemented in many

More information

Lecture 6: Pipelining

Lecture 6: Pipelining Lecture 6: Pipelining i CSCE 26 Computer Organization Instructor: Saraju P. ohanty, Ph. D. NOTE: The figures, text etc included in slides are borrowed from various books, websites, authors pages, and other

More information

Full Datapath. Chapter 4 The Processor 2

Full Datapath. Chapter 4 The Processor 2 Pipelining Full Datapath Chapter 4 The Processor 2 Datapath With Control Chapter 4 The Processor 3 Performance Issues Longest delay determines clock period Critical path: load instruction Instruction memory

More information

Outline Marquette University

Outline Marquette University COEN-4710 Computer Hardware Lecture 4 Processor Part 2: Pipelining (Ch.4) Cristinel Ababei Department of Electrical and Computer Engineering Credits: Slides adapted primarily from presentations from Mike

More information

T = I x CPI x C. Both effective CPI and clock cycle C are heavily influenced by CPU design. CPI increased (3-5) bad Shorter cycle good

T = I x CPI x C. Both effective CPI and clock cycle C are heavily influenced by CPU design. CPI increased (3-5) bad Shorter cycle good CPU performance equation: T = I x CPI x C Both effective CPI and clock cycle C are heavily influenced by CPU design. For single-cycle CPU: CPI = 1 good Long cycle time bad On the other hand, for multi-cycle

More information

Pipelining. Ideal speedup is number of stages in the pipeline. Do we achieve this? 2. Improve performance by increasing instruction throughput ...

Pipelining. Ideal speedup is number of stages in the pipeline. Do we achieve this? 2. Improve performance by increasing instruction throughput ... CHAPTER 6 1 Pipelining Instruction class Instruction memory ister read ALU Data memory ister write Total (in ps) Load word 200 100 200 200 100 800 Store word 200 100 200 200 700 R-format 200 100 200 100

More information

LECTURE 3: THE PROCESSOR

LECTURE 3: THE PROCESSOR LECTURE 3: THE PROCESSOR Abridged version of Patterson & Hennessy (2013):Ch.4 Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU

More information

Lecture 3: The Processor (Chapter 4 of textbook) Chapter 4.1

Lecture 3: The Processor (Chapter 4 of textbook) Chapter 4.1 Lecture 3: The Processor (Chapter 4 of textbook) Chapter 4.1 Introduction Chapter 4.1 Chapter 4.2 Review: MIPS (RISC) Design Principles Simplicity favors regularity fixed size instructions small number

More information

CSEE 3827: Fundamentals of Computer Systems

CSEE 3827: Fundamentals of Computer Systems CSEE 3827: Fundamentals of Computer Systems Lecture 21 and 22 April 22 and 27, 2009 martha@cs.columbia.edu Amdahl s Law Be aware when optimizing... T = improved Taffected improvement factor + T unaffected

More information

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 4. The Processor

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 4. The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle

More information

COMPUTER ORGANIZATION AND DESIGN

COMPUTER ORGANIZATION AND DESIGN COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle

More information

Pipeline Hazards. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Pipeline Hazards. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University Pipeline Hazards Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Hazards What are hazards? Situations that prevent starting the next instruction

More information

Chapter 4 The Processor 1. Chapter 4B. The Processor

Chapter 4 The Processor 1. Chapter 4B. The Processor Chapter 4 The Processor 1 Chapter 4B The Processor Chapter 4 The Processor 2 Control Hazards Branch determines flow of control Fetching next instruction depends on branch outcome Pipeline can t always

More information

The Processor (3) Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University

The Processor (3) Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University The Processor (3) Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu EEE3050: Theory on Computer Architectures, Spring 2017, Jinkyu Jeong (jinkyu@skku.edu)

More information

Chapter 4. The Processor

Chapter 4. The Processor Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified

More information

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle

More information

Chapter 4 The Processor 1. Chapter 4A. The Processor

Chapter 4 The Processor 1. Chapter 4A. The Processor Chapter 4 The Processor 1 Chapter 4A The Processor Chapter 4 The Processor 2 Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware

More information

微算機系統第六章. Enhancing Performance with Pipelining 陳伯寧教授電信工程學系國立交通大學. Ann, Brian, Cathy, Dave each have one load of clothes to wash, dry, and fold

微算機系統第六章. Enhancing Performance with Pipelining 陳伯寧教授電信工程學系國立交通大學. Ann, Brian, Cathy, Dave each have one load of clothes to wash, dry, and fold 微算機系統第六章 Enhancing Performance with Pipelining 陳伯寧教授電信工程學系國立交通大學 chap6- Pipeline is natural! Laundry Example Ann, Brian, athy, Dave each have one load of clothes to wash, dry, and fold A B D Washer takes

More information

Chapter 4. Instruction Execution. Introduction. CPU Overview. Multiplexers. Chapter 4 The Processor 1. The Processor.

Chapter 4. Instruction Execution. Introduction. CPU Overview. Multiplexers. Chapter 4 The Processor 1. The Processor. COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor The Processor - Introduction

More information

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition The Processor - Introduction

More information

Pipelined Datapath. Reading. Sections Practice Problems: 1, 3, 8, 12

Pipelined Datapath. Reading. Sections Practice Problems: 1, 3, 8, 12 Pipelined Datapath Lecture notes from KP, H. H. Lee and S. Yalamanchili Sections 4.5 4. Practice Problems:, 3, 8, 2 ing Note: Appendices A-E in the hardcopy text correspond to chapters 7- in the online

More information

Pipelined Datapath. Reading. Sections Practice Problems: 1, 3, 8, 12 (2) Lecture notes from MKP, H. H. Lee and S.

Pipelined Datapath. Reading. Sections Practice Problems: 1, 3, 8, 12 (2) Lecture notes from MKP, H. H. Lee and S. Pipelined Datapath Lecture notes from KP, H. H. Lee and S. Yalamanchili Sections 4.5 4. Practice Problems:, 3, 8, 2 ing (2) Pipeline Performance Assume time for stages is ps for register read or write

More information

What do we have so far? Multi-Cycle Datapath

What do we have so far? Multi-Cycle Datapath What do we have so far? lti-cycle Datapath CPI: R-Type = 4, Load = 5, Store 4, Branch = 3 Only one instrction being processed in datapath How to lower CPI frther? #1 Lec # 8 Spring2 4-11-2 Pipelining pipelining

More information

Designing a Pipelined CPU

Designing a Pipelined CPU Designing a Pipelined CPU CSE 4, S2'6 Review -- Single Cycle CPU CSE 4, S2'6 Review -- ultiple Cycle CPU CSE 4, S2'6 Review -- Instruction Latencies Single-Cycle CPU Load Ifetch /Dec Exec em Wr ultiple

More information

Pipeline Data Hazards. Dealing With Data Hazards

Pipeline Data Hazards. Dealing With Data Hazards Pipeline Data Hazards Warning, warning, warning! Dealing With Data Hazards In Software inserting independent instructions In Hardware inserting bubbles (stalling the pipeline) data forwarding Data Data

More information

Multi-cycle Datapath (Our Version)

Multi-cycle Datapath (Our Version) ulti-cycle Datapath (Our Version) npc_sel Next PC PC Instruction Fetch IR File Operand Fetch A B ExtOp ALUSrc ALUctr Ext ALU R emrd emwr em Access emto Data em Dst Wr. File isters added: IR: Instruction

More information

Computer Organization and Structure. Bing-Yu Chen National Taiwan University

Computer Organization and Structure. Bing-Yu Chen National Taiwan University Computer Organization and Structure Bing-Yu Chen National Taiwan University The Processor Logic Design Conventions Building a Datapath A Simple Implementation Scheme An Overview of Pipelining Pipelined

More information

Enhanced Performance with Pipelining

Enhanced Performance with Pipelining Chapter 6 Enhanced Performance with Pipelining Note: The slides being presented represent a mi. Some are created by ark Franklin, Washington University in St. Lois, Dept. of CSE. any are taken from the

More information

The Processor. Z. Jerry Shi Department of Computer Science and Engineering University of Connecticut. CSE3666: Introduction to Computer Architecture

The Processor. Z. Jerry Shi Department of Computer Science and Engineering University of Connecticut. CSE3666: Introduction to Computer Architecture The Processor Z. Jerry Shi Department of Computer Science and Engineering University of Connecticut CSE3666: Introduction to Computer Architecture Introduction CPU performance factors Instruction count

More information

Basic Instruction Timings. Pipelining 1. How long would it take to execute the following sequence of instructions?

Basic Instruction Timings. Pipelining 1. How long would it take to execute the following sequence of instructions? Basic Instruction Timings Pipelining 1 Making some assumptions regarding the operation times for some of the basic hardware units in our datapath, we have the following timings: Instruction class Instruction

More information

SI232 Set #20: Laundry, Co-dependency, and other Hazards of Modern (Architecture) Life. Chapter 6 ADMIN. Reading for Chapter 6: 6.1,

SI232 Set #20: Laundry, Co-dependency, and other Hazards of Modern (Architecture) Life. Chapter 6 ADMIN. Reading for Chapter 6: 6.1, SI232 Set #20: Laundry, Co-dependency, and other Hazards of Modern (Architecture) Life Chapter 6 ADMIN ing for Chapter 6: 6., 6.9-6.2 2 Midnight Laundry Task order A 6 PM 7 8 9 0 2 2 AM B C D 3 Smarty

More information

Midnight Laundry. IC220 Set #19: Laundry, Co-dependency, and other Hazards of Modern (Architecture) Life. Return to Chapter 4

Midnight Laundry. IC220 Set #19: Laundry, Co-dependency, and other Hazards of Modern (Architecture) Life. Return to Chapter 4 IC220 Set #9: Laundry, Co-dependency, and other Hazards of Modern (Architecture) Life Return to Chapter 4 Midnight Laundry Task order A B C D 6 PM 7 8 9 0 2 2 AM 2 Smarty Laundry Task order A B C D 6 PM

More information

Instruction fetch. MemRead. IRWrite ALUSrcB = 01. ALUOp = 00. PCWrite. PCSource = 00. ALUSrcB = 00. R-type completion

Instruction fetch. MemRead. IRWrite ALUSrcB = 01. ALUOp = 00. PCWrite. PCSource = 00. ALUSrcB = 00. R-type completion . (Chapter 5) Fill in the vales for SrcA, SrcB, IorD, Dst and emto to complete the Finite State achine for the mlti-cycle datapath shown below. emory address comptation 2 SrcA = SrcB = Op = fetch em SrcA

More information

Computer Architecture. Lecture 6.1: Fundamentals of

Computer Architecture. Lecture 6.1: Fundamentals of CS3350B Computer Architecture Winter 2015 Lecture 6.1: Fundamentals of Instructional Level Parallelism Marc Moreno Maza www.csd.uwo.ca/courses/cs3350b [Adapted from lectures on Computer Organization and

More information

1 Hazards COMP2611 Fall 2015 Pipelined Processor

1 Hazards COMP2611 Fall 2015 Pipelined Processor 1 Hazards Dependences in Programs 2 Data dependence Example: lw $1, 200($2) add $3, $4, $1 add can t do ID (i.e., read register $1) until lw updates $1 Control dependence Example: bne $1, $2, target add

More information

Thomas Polzer Institut für Technische Informatik

Thomas Polzer Institut für Technische Informatik Thomas Polzer tpolzer@ecs.tuwien.ac.at Institut für Technische Informatik Pipelined laundry: overlapping execution Parallelism improves performance Four loads: Speedup = 8/3.5 = 2.3 Non-stop: Speedup =

More information

Determined by ISA and compiler. We will examine two MIPS implementations. A simplified version A more realistic pipelined version

Determined by ISA and compiler. We will examine two MIPS implementations. A simplified version A more realistic pipelined version MIPS Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified

More information

Chapter 4. The Processor

Chapter 4. The Processor Chapter 4 The Processor 1 Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A

More information

Processor Design CSCE Instructor: Saraju P. Mohanty, Ph. D. NOTE: The figures, text etc included in slides are borrowed

Processor Design CSCE Instructor: Saraju P. Mohanty, Ph. D. NOTE: The figures, text etc included in slides are borrowed Lecture 3: General Purpose Processor Design CSCE 665 Advanced VLSI Systems Instructor: Saraju P. ohanty, Ph. D. NOTE: The figures, tet etc included in slides are borrowed from various books, websites,

More information

Chapter 4. The Processor

Chapter 4. The Processor Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified

More information

Computer Architecture Computer Science & Engineering. Chapter 4. The Processor BK TP.HCM

Computer Architecture Computer Science & Engineering. Chapter 4. The Processor BK TP.HCM Computer Architecture Computer Science & Engineering Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware

More information

COSC 6385 Computer Architecture - Pipelining

COSC 6385 Computer Architecture - Pipelining COSC 6385 Computer Architecture - Pipelining Fall 2006 Some of the slides are based on a lecture by David Culler, Instruction Set Architecture Relevant features for distinguishing ISA s Internal storage

More information

Chapter 3 & Appendix C Pipelining Part A: Basic and Intermediate Concepts

Chapter 3 & Appendix C Pipelining Part A: Basic and Intermediate Concepts CS359: Computer Architecture Chapter 3 & Appendix C Pipelining Part A: Basic and Intermediate Concepts Yanyan Shen Department of Computer Science and Engineering Shanghai Jiao Tong University Parallel

More information

ECE 154A Introduction to. Fall 2012

ECE 154A Introduction to. Fall 2012 ECE 154A Introduction to Computer Architecture Fall 2012 Dmitri Strukov Lecture 10 Floating point review Pipelined design IEEE Floating Point Format single: 8 bits double: 11 bits single: 23 bits double:

More information

14:332:331 Pipelined Datapath

14:332:331 Pipelined Datapath 14:332:331 Pipelined Datapath I n s t r. O r d e r Inst 0 Inst 1 Inst 2 Inst 3 Inst 4 Single Cycle Disadvantages & Advantages Uses the clock cycle inefficiently the clock cycle must be timed to accommodate

More information

Lecture Topics. Announcements. Today: Data and Control Hazards (P&H ) Next: continued. Exam #1 returned. Milestone #5 (due 2/27)

Lecture Topics. Announcements. Today: Data and Control Hazards (P&H ) Next: continued. Exam #1 returned. Milestone #5 (due 2/27) Lecture Topics Today: Data and Control Hazards (P&H 4.7-4.8) Next: continued 1 Announcements Exam #1 returned Milestone #5 (due 2/27) Milestone #6 (due 3/13) 2 1 Review: Pipelined Implementations Pipelining

More information

COMPUTER ORGANIZATION AND DESI

COMPUTER ORGANIZATION AND DESI COMPUTER ORGANIZATION AND DESIGN 5 Edition th The Hardware/Software Interface Chapter 4 The Processor 4.1 Introduction Introduction CPU performance factors Instruction count Determined by ISA and compiler

More information

Chapter 4. The Processor

Chapter 4. The Processor Chapter 4 The Processor 4.1 Introduction Introduction CPU performance factors Instruction count CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations Determined by ISA

More information

Lecture 05: Pipelining: Basic/ Intermediate Concepts and Implementation

Lecture 05: Pipelining: Basic/ Intermediate Concepts and Implementation Lecture 05: Pipelining: Basic/ Intermediate Concepts and Implementation CSE 564 Computer Architecture Summer 2017 Department of Computer Science and Engineering Yonghong Yan yan@oakland.edu www.secs.oakland.edu/~yan

More information

CS 110 Computer Architecture. Pipelining. Guest Lecture: Shu Yin. School of Information Science and Technology SIST

CS 110 Computer Architecture. Pipelining. Guest Lecture: Shu Yin.   School of Information Science and Technology SIST CS 110 Computer Architecture Pipelining Guest Lecture: Shu Yin http://shtech.org/courses/ca/ School of Information Science and Technology SIST ShanghaiTech University Slides based on UC Berkley's CS61C

More information

COMPUTER ORGANIZATION AND DESIGN

COMPUTER ORGANIZATION AND DESIGN ARM COMPUTER ORGANIZATION AND DESIGN Edition The Hardware/Software Interface Chapter 4 The Processor Modified and extended by R.J. Leduc - 2016 To understand this chapter, you will need to understand some

More information

Advanced Computer Architecture Pipelining

Advanced Computer Architecture Pipelining Advanced Computer Architecture Pipelining Dr. Shadrokh Samavi Some slides are from the instructors resources which accompany the 6 th and previous editions of the textbook. Some slides are from David Patterson,

More information

ECE232: Hardware Organization and Design

ECE232: Hardware Organization and Design ECE232: Hardware Organization and Design Lecture 17: Pipelining Wrapup Adapted from Computer Organization and Design, Patterson & Hennessy, UCB Outline The textbook includes lots of information Focus on

More information

MIPS An ISA for Pipelining

MIPS An ISA for Pipelining Pipelining: Basic and Intermediate Concepts Slides by: Muhamed Mudawar CS 282 KAUST Spring 2010 Outline: MIPS An ISA for Pipelining 5 stage pipelining i Structural Hazards Data Hazards & Forwarding Branch

More information

EIE/ENE 334 Microprocessors

EIE/ENE 334 Microprocessors EIE/ENE 334 Microprocessors Lecture 6: The Processor Week #06/07 : Dejwoot KHAWPARISUTH Adapted from Computer Organization and Design, 4 th Edition, Patterson & Hennessy, 2009, Elsevier (MK) http://webstaff.kmutt.ac.th/~dejwoot.kha/

More information

ECE260: Fundamentals of Computer Engineering

ECE260: Fundamentals of Computer Engineering Pipelining James Moscola Dept. of Engineering & Computer Science York College of Pennsylvania Based on Computer Organization and Design, 5th Edition by Patterson & Hennessy What is Pipelining? Pipelining

More information

CENG 3420 Lecture 06: Pipeline

CENG 3420 Lecture 06: Pipeline CENG 3420 Lecture 06: Pipeline Bei Yu byu@cse.cuhk.edu.hk CENG3420 L06.1 Spring 2019 Outline q Pipeline Motivations q Pipeline Hazards q Exceptions q Background: Flip-Flop Control Signals CENG3420 L06.2

More information

COMP2611: Computer Organization. The Pipelined Processor

COMP2611: Computer Organization. The Pipelined Processor COMP2611: Computer Organization The 1 2 Background 2 High-Performance Processors 3 Two techniques for designing high-performance processors by exploiting parallelism: Multiprocessing: parallelism among

More information

CPU Pipelining Issues

CPU Pipelining Issues CPU Pipelining Issues What have you been beating your head against? This pipe stuff makes my head hurt! L17 Pipeline Issues & Memory 1 Pipelining Improve performance by increasing instruction throughput

More information

Pipelined Processor Design

Pipelined Processor Design Pipelined Processor Design Pipelined Implementation: MIPS Virendra Singh Computer Design and Test Lab. Indian Institute of Science (IISc) Bangalore virendra@computer.org Advance Computer Architecture http://www.serc.iisc.ernet.in/~viren/courses/aca/aca.htm

More information

CSCI 402: Computer Architectures. Fengguang Song Department of Computer & Information Science IUPUI. Today s Content

CSCI 402: Computer Architectures. Fengguang Song Department of Computer & Information Science IUPUI. Today s Content 3/6/8 CSCI 42: Computer Architectures The Processor (2) Fengguang Song Department of Computer & Information Science IUPUI Today s Content We have looked at how to design a Data Path. 4.4, 4.5 We will design

More information

Overview of Pipelining

Overview of Pipelining EEC 58 Compter Architectre Pipelining Department of Electrical Engineering and Compter Science Cleveland State University Fndamental Principles Overview of Pipelining Pipelined Design otivation: Increase

More information

Lecture 9. Pipeline Hazards. Christos Kozyrakis Stanford University

Lecture 9. Pipeline Hazards. Christos Kozyrakis Stanford University Lecture 9 Pipeline Hazards Christos Kozyrakis Stanford University http://eeclass.stanford.edu/ee18b 1 Announcements PA-1 is due today Electronic submission Lab2 is due on Tuesday 2/13 th Quiz1 grades will

More information

Lecture 7 Pipelining. Peng Liu.

Lecture 7 Pipelining. Peng Liu. Lecture 7 Pipelining Peng Liu liupeng@zju.edu.cn 1 Review: The Single Cycle Processor 2 Review: Given Datapath,RTL -> Control Instruction Inst Memory Adr Op Fun Rt

More information

Computer and Information Sciences College / Computer Science Department Enhancing Performance with Pipelining

Computer and Information Sciences College / Computer Science Department Enhancing Performance with Pipelining Computer and Information Sciences College / Computer Science Department Enhancing Performance with Pipelining Single-Cycle Design Problems Assuming fixed-period clock every instruction datapath uses one

More information

Instruction word R0 R1 R2 R3 R4 R5 R6 R8 R12 R31

Instruction word R0 R1 R2 R3 R4 R5 R6 R8 R12 R31 4.16 Exercises 419 Exercise 4.11 In this exercise we examine in detail how an instruction is executed in a single-cycle datapath. Problems in this exercise refer to a clock cycle in which the processor

More information

Processor Architecture. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Processor Architecture. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University Processor Architecture Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Moore s Law Gordon Moore @ Intel (1965) 2 Computer Architecture Trends (1)

More information

Outline. A pipelined datapath Pipelined control Data hazards and forwarding Data hazards and stalls Branch (control) hazards Exception

Outline. A pipelined datapath Pipelined control Data hazards and forwarding Data hazards and stalls Branch (control) hazards Exception Outline A pipelined datapath Pipelined control Data hazards and forwarding Data hazards and stalls Branch (control) hazards Exception 1 4 Which stage is the branch decision made? Case 1: 0 M u x 1 Add

More information

Data Hazards Compiler Scheduling Pipeline scheduling or instruction scheduling: Compiler generates code to eliminate hazard

Data Hazards Compiler Scheduling Pipeline scheduling or instruction scheduling: Compiler generates code to eliminate hazard Data Hazards Compiler Scheduling Pipeline scheduling or instruction scheduling: Compiler generates code to eliminate hazard Consider: a = b + c; d = e - f; Assume loads have a latency of one clock cycle:

More information

Full Datapath. CSCI 402: Computer Architectures. The Processor (2) 3/21/19. Fengguang Song Department of Computer & Information Science IUPUI

Full Datapath. CSCI 402: Computer Architectures. The Processor (2) 3/21/19. Fengguang Song Department of Computer & Information Science IUPUI CSCI 42: Computer Architectures The Processor (2) Fengguang Song Department of Computer & Information Science IUPUI Full Datapath Branch Target Instruction Fetch Immediate 4 Today s Contents We have looked

More information

Processor Architecture

Processor Architecture Processor Architecture Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu SSE2030: Introduction to Computer Systems, Spring 2018, Jinkyu Jeong (jinkyu@skku.edu)

More information

ECE331: Hardware Organization and Design

ECE331: Hardware Organization and Design ECE331: Hardware Organization and Design Lecture 35: Final Exam Review Adapted from Computer Organization and Design, Patterson & Hennessy, UCB Material from Earlier in the Semester Throughput and latency

More information

Pipelining: Basic Concepts

Pipelining: Basic Concepts Pipelining: Basic Concepts Prof. Cristina Silvano Dipartimento di Elettronica e Informazione Politecnico di ilano email: silvano@elet.polimi.it Outline Reduced Instruction Set of IPS Processor Implementation

More information

Modern Computer Architecture

Modern Computer Architecture Modern Computer Architecture Lecture2 Pipelining: Basic and Intermediate Concepts Hongbin Sun 国家集成电路人才培养基地 Xi an Jiaotong University Pipelining: Its Natural! Laundry Example Ann, Brian, Cathy, Dave each

More information

Orange Coast College. Business Division. Computer Science Department. CS 116- Computer Architecture. Pipelining

Orange Coast College. Business Division. Computer Science Department. CS 116- Computer Architecture. Pipelining Orange Coast College Business Division Computer Science Department CS 116- Computer Architecture Pipelining Recall Pipelining is parallelizing execution Key to speedups in processors Split instruction

More information

Computer Architecture Computer Science & Engineering. Chapter 4. The Processor BK TP.HCM

Computer Architecture Computer Science & Engineering. Chapter 4. The Processor BK TP.HCM Computer Architecture Computer Science & Engineering Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware

More information

ECE331: Hardware Organization and Design

ECE331: Hardware Organization and Design ECE331: Hardware Organization and Design Lecture 27: Midterm2 review Adapted from Computer Organization and Design, Patterson & Hennessy, UCB Midterm 2 Review Midterm will cover Section 1.6: Processor

More information

CO Computer Architecture and Programming Languages CAPL. Lecture 18 & 19

CO Computer Architecture and Programming Languages CAPL. Lecture 18 & 19 CO2-3224 Computer Architecture and Programming Languages CAPL Lecture 8 & 9 Dr. Kinga Lipskoch Fall 27 Single Cycle Disadvantages & Advantages Uses the clock cycle inefficiently the clock cycle must be

More information

Pipelining. Maurizio Palesi

Pipelining. Maurizio Palesi * Pipelining * Adapted from David A. Patterson s CS252 lecture slides, http://www.cs.berkeley/~pattrsn/252s98/index.html Copyright 1998 UCB 1 References John L. Hennessy and David A. Patterson, Computer

More information

Lecture 2: Processor and Pipelining 1

Lecture 2: Processor and Pipelining 1 The Simple BIG Picture! Chapter 3 Additional Slides The Processor and Pipelining CENG 6332 2 Datapath vs Control Datapath signals Control Points Controller Datapath: Storage, FU, interconnect sufficient

More information

Lecture 19 Introduction to Pipelining

Lecture 19 Introduction to Pipelining CSE 30321 Lecture 19 Pipelining (Part 1) 1 Lecture 19 Introduction to Pipelining CSE 30321 Lecture 19 Pipelining (Part 1) Basic pipelining basic := single, in-order issue single issue one instruction at

More information

Pipelining: Hazards Ver. Jan 14, 2014

Pipelining: Hazards Ver. Jan 14, 2014 POLITECNICO DI MILANO Parallelism in wonderland: are you ready to see how deep the rabbit hole goes? Pipelining: Hazards Ver. Jan 14, 2014 Marco D. Santambrogio: marco.santambrogio@polimi.it Simone Campanoni:

More information

CS 61C: Great Ideas in Computer Architecture Pipelining and Hazards

CS 61C: Great Ideas in Computer Architecture Pipelining and Hazards CS 61C: Great Ideas in Computer Architecture Pipelining and Hazards Instructors: Vladimir Stojanovic and Nicholas Weaver http://inst.eecs.berkeley.edu/~cs61c/sp16 1 Pipelined Execution Representation Time

More information

Design a MIPS Processor (2/2)

Design a MIPS Processor (2/2) 93-2Digital System Design Design a MIPS Processor (2/2) Lecturer: Chihhao Chao Advisor: Prof. An-Yeu Wu 2005/5/13 Friday ACCESS IC LABORTORY Outline v 6.1 An Overview of Pipelining v 6.2 A Pipelined Datapath

More information

Lecture 3. Pipelining. Dr. Soner Onder CS 4431 Michigan Technological University 9/23/2009 1

Lecture 3. Pipelining. Dr. Soner Onder CS 4431 Michigan Technological University 9/23/2009 1 Lecture 3 Pipelining Dr. Soner Onder CS 4431 Michigan Technological University 9/23/2009 1 A "Typical" RISC ISA 32-bit fixed format instruction (3 formats) 32 32-bit GPR (R0 contains zero, DP take pair)

More information

Chapter 4. The Processor

Chapter 4. The Processor Chapter 4 The Processor Recall. ISA? Instruction Fetch Instruction Decode Operand Fetch Execute Result Store Next Instruction Instruction Format or Encoding how is it decoded? Location of operands and

More information

Lecture 8: Data Hazard and Resolution. James C. Hoe Department of ECE Carnegie Mellon University

Lecture 8: Data Hazard and Resolution. James C. Hoe Department of ECE Carnegie Mellon University 18 447 Lecture 8: Data Hazard and Resolution James C. Hoe Department of ECE Carnegie ellon University 18 447 S18 L08 S1, James C. Hoe, CU/ECE/CALC, 2018 Your goal today Housekeeping detect and resolve

More information

PS Midterm 2. Pipelining

PS Midterm 2. Pipelining PS idterm 2 Pipelining Seqential Landry 6 P 7 8 9 idnight Time T a s k O r d e r A B C D 3 4 2 3 4 2 3 4 2 3 4 2 Seqential landry takes 6 hors for 4 loads If they learned pipelining, how long wold landry

More information

ECE260: Fundamentals of Computer Engineering

ECE260: Fundamentals of Computer Engineering Data Hazards in a Pipelined Datapath James Moscola Dept. of Engineering & Computer Science York College of Pennsylvania Based on Computer Organization and Design, 5th Edition by Patterson & Hennessy Data

More information