MIPS An ISA for Pipelining

Size: px
Start display at page:

Download "MIPS An ISA for Pipelining"

Transcription

1 Pipelining: Basic and Intermediate Concepts Slides by: Muhamed Mudawar CS 282 KAUST Spring 2010 Outline: MIPS An ISA for Pipelining 5 stage pipelining i Structural Hazards Data Hazards & Forwarding Branch Hazards Handling Exceptions Handling Multicycle Operations Slide 2 1

2 MIPS: Typical RISC ISA 32-bit fixed format instruction 32 GPR (R0 contains zero) 3-address, reg-reg arithmetic instruction Single address mode for load/store: base + displacement Simple branch conditions Delayed branch See Also: SPARC, IBM Power, and Itanium Slide 3 Instruction Formats Slide 4 2

3 Overview of the MIPS isters 32 General Purpose isters (GPRs) 64-bit registers are used in MIPS64 ister 0 is always zero Any value written to R0 is discarded Special-purpose registers LO and HI Hold results of integer multiply and divide Special-purpose program counter PC 32 Floating Point isters (FPRs) 64-bit Floating Point registers in MIPS 64 FIR: Floating-point Implementation ister FCSR: Floating-point Control & Status ister GPRs R0 R31 LO HI PC FPRs F0 F31 FIR FCSR Slide 5 Load and Store Instructions Slide 6 3

4 Arithmetic / Logical Instructions Slide 7 Control Flow Instructions Slide 8 4

5 Data Transfer / Arithmetic / Logical Slide 9 Control and Floating Point Slide 10 5

6 Instruction Mix for SPECint2000 Slide 11 Instruction Mix for SPECfp2000 Slide 12 6

7 Next: MIPS An ISA for Pipelining 5 stage pipelining i Structural Hazards Data Hazards & Forwarding Branch Hazards Handling Exceptions Handling Multicycle Operations Slide 13 5 Steps of Instruction Execution Slide 14 7

8 Pipelined MIPS Datapath & Stage isters Instruction Fetch Instr. Decode. Fetch Execute Addr. Calc Memory Access Write Back Next PC Next SEQ PC Next SEQ PC MU 4 Adder RS1 Zero? X Address Memory IF/ID RS2 File ID/EX MUX MUX EX/MEM Data Memory MEM/WB MUX Imm Sign Extend RW = Rd or Rt RW RW WB Data Slide 15 Stage IF ID EX Events on Every Pipe Stage Any Instruction IF/ID.IR MEM[PC]; IF/ID.NPC PC+4 PC if ((EX/MEM.opcode=branch) & EX/MEM.cond) {EX/MEM.output} else {PC + 4} ID/EX.A s[if/id.ir[rs]]; ID/EX.B s[if/id.ir[rt]] ID/EX.NPC IF/ID.NPC; ID/EX.Imm extend(if/id.ir[imm]); ID/EX.Rw IF/ID.IR[Rt or Rd] Instruction Load / Store Branch EX/MEM.output ID/EX.A func ID/EX.B, or EX/MEM.output ID/EX.A op ID/EX.Imm EX/MEM.output ID/EX.A + ID/EX.Imm EX/MEM.B ID/EX.B MEM MEM/WB.output MEM/WB.LMD EX/MEM.output MEM[EX/MEM.output] or MEM[EX/MEM.output] EX/MEM.B WB s[mem/wb.rw] For load only: MEM/WB.Output s[mem/wb.rw] MEM/WB.LMD EX/MEM.output ID/EX.NPC + (ID/EX.Imm << 2) EX/MEM.cond br condition Slide 16 8

9 Pipelined Control Control signals derived from instruction opcode Control signals are pipelined pp just like data Slide 17 Visualizing Pipelining Pipeline registers carry data for a given instruction from one stage to the other One instruction completes each cycle Overlapped Execution of Instructions Slide 18 9

10 Pipeline Performance Assume time for stages is 100ps for register read or write 200ps for other stages Compare pipelined versus non-pipelined datapath Instr Instr fetch ister read op Memory access ister write Total time load 200ps 100 ps 200ps 200ps 100 ps 800ps store 200ps 100 ps 200ps 200ps 700ps R-format 200ps 100 ps 200ps 100 ps 600ps branch 200ps 100 ps 200ps 500ps Slide 19 Pipeline Performance Single-cycle (T c = 800ps) Pipelined (T c = 200ps) Speedup = 800 / 200 =

11 Pipeline Speedup If all stages are balanced All stages take the same time Time between instructions pipelined Time between instructions nonpipelined = Number of stages If not balanced, speedup is less Speedup due to increased throughput Latency (time for each instruction) does not decrease Slide 21 Pipelining and ISA Design MIPS ISA designed for pipelining All instructions are 32-bits Easier to fetch and decode in one cycle Compare with Intel x86: 1- to 17-byte instructions Few and regular instruction formats Can decode and read registers in one step Load/store addressing Calculate address in 3 rd stage, access memory in 4 th Alignment of memory operands Memory access takes only one cycle Slide 22 11

12 Pipelining is not quite that easy! Limits to pipelining: Hazards prevent next instruction from executing during its designated clock cycle Structural hazards: HW cannot allow two instructions to use same resource during same cycle Data hazards: Instruction depends on result of prior instruction i still in the pipeline Control hazards: Caused by delay between the fetching of instructions and decisions about changes in control flow (branches and jumps) Slide 23 Next: MIPS An ISA for Pipelining 5 stage pipelining Structural Hazards Data Hazards & Forwarding Branch Hazards Handling Exceptions Handling Multicycle Operations Slide 24 12

13 Structure Hazards Conflict for use of a resource In MIPS pipeline with a single memory Load/store requires data access Instruction fetch would have to stall for a cycle Causes a pipeline bubble Hence, pipelined datapaths require separate Instruction and Data memories Or separate Instruction and Data caches Slide 25 One Memory Port/Structural Hazards Time (clock cycles) Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 Cycle 6 Cycle 7 I n Load MEM s t Instr 1 r. Instr 2 O r d Instr 3 e r Instr 4 Mem MEM MEM MEM DMem Ifetch DMem DMem Slide 26 13

14 One Memory Port/Structural Hazards Time (clock cycles) Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 Cycle 6 Cycle 7 I n s t r. O r d e r Load Mem Instr 1 Instr 2 Stall Instr 3 Mem Mem Mem Mem Mem Mem Bubble Bubble Bubble Bubble How do you bubble the pipe? Slide 27 Resolving Structural Hazards Serious Hazard: Hazard cannot be ignored Solution 1: Delay Access to Resource Must have mechanism to delay access to resource Stall the pipeline until resource is available Solution 2: Add more hardware resources Add more hardware to eliminate the structural hazard Redesign the memory to have two ports Or have two memories, each with a single port One memory for instructions and the second for data Harvard Architecture Slide 28 14

15 Speedup Equation for Pipelining CPI Speedup CPI unpipelined pipelined Cycle Time Cycle Time unpipelined pipelined CPI pipelined Ideal CPI Average Stall cycles per Inst For simple single-issue pipeline, Ideal CPI = 1 1 Cycle Timeunpipelined Speedup 1 Pipeline stall cycles per instruction Cycle Time pipelined If stages are balances, Cycle unpipelined /Cycle pipelined = Pipeline Depth Pipeline Depth Speedup 1 Pipeline stall cycles per instruction Slide 29 Example: Dual-port vs. Single-port Memory Machine A: Two memories ( Harvard Architecture ) Machine B: Single ported memory, but it is pipelined B has a clock rate 1.05 times faster than clock rate of A Ideal pipelined CPI = 1 for both Loads are 40% of instructions executed Stall cycles per instruction due to structural hazards Speedup A/B CPI CPI B A Clock rate Clock rate A B Machine A is 1.33 times faster than B Slide 30 15

16 Writing result in Stage 4 Problem Writing back result in stage 4 Structural Hazard Conflict with writing load data in stage 5 Two instructions are attempting to write the register file during same cycle Instruct tions lw r6, 8(r5) ori r4, r3, 7 sub r5, r2, r3 sw r2, 10(r3) IF ID IF EX ID IF MEM EX ID IF WB WB EX ID WB EX MEM CC1 CC2 CC3 CC4 CC5 CC6 CC7 CC8 Time Slide 31 Resolving Write-Back Structure Hazard Solution 1 Add a second write port (costly solution) Can do two writes during same cycle Solution 2 (better for single-issue pipeline) Delay all write backs to the register file to stage 5 instructions bypass stage 4 doing nothing lw r6, 8(r5) ori r4, r3, 7 sub r5, r2, r3 IF ID IF EX ID IF MEM EX ID WB EX WB instructions skip the MEM stage WB sw r2, 10(r3) IF ID EX MEM CC1 CC2 CC3 CC4 CC5 CC6 CC7 CC8 Time Diagram shows instruction use of stages at each clock cycle Slide 32 16

17 Next: MIPS An ISA for Pipelining 5 stage pipelining Structural Hazards Data Hazards & Forwarding Branch Hazards Handling Exceptions Handling Multicycle Operations Slide 33 Data Hazards Data Dependence can cause a data hazard The dependent instructions are close to each other Pipelined execution changes the order of operand access Read After Write RAW Hazard Given two instructions I and J, where I comes before J Instruction J should read an operand after it is written by I Called a data dependence in compiler terminology I: add r1, r2, r3 # r1 is written J: sub r4, r1, r3 # r1 is read Hazard occurs when J reads the operand before I writes it Slide 34 17

18 Example of a RAW Data Hazard Time (cycles) value of r2 CC1 10 CC2 10 CC3 10 CC4 10 CC5 10/20 CC6 20 CC7 20 CC8 20 Program Execution Ord der sub r2, r1, r3 IM and r4, r2, r5 or r6, r3, r2 add r7, r2, r2 DM IM DM IM IM sw r8, 10(r2) IM DM DM DM Result of sub is needed by and, or, add, & sw instructions Instructions and & or will read old value of r2 from reg file During CC5, r2 is written and read new value is read Slide 35 Solution 1: Stalling the Pipeline Order Time (in cycles) value of r2 CC1 10 CC2 10 CC3 10 CC4 10 CC5 10/20 sub r2, r1, r3 IM DM CC6 20 CC7 20 CC8 20 Instruction and r4, r2, r5 IM bubble bubble or r6, r3, r2 IM DM DM The and instruction cannot fetch r2 until CC5 The and instruction remains in the IF/ID register until CC5 Two bubbles are inserted into ID/EX at end of CC3 & CC4 Bubbles are NOP instructions: do not modify registers or memory Bubbles delay instruction execution and waste clock cycles Slide 36 18

19 Solution 2: Forwarding Result The result is forwarded (fed back) to the input No bubbles are inserted into the pipeline and no cycles are wasted result exists in either EX/MEM or MEM/WB register Time (in cycles) CC1 CC2 CC3 CC4 CC5 CC6 CC7 CC8 Program Exec cution Order sub r2, r1, r3 IM DM and r4, r2, r5 IM DM or r6, r3, r2 IM DM add r7, r2, r2 IM sw r8, 10(r2) IM DM DM Slide 37 Hardware Support for Forwarding Slide 38 19

20 Detecting RAW Hazards Pass register numbers along pipeline ID/EX.isterRs = register number for Rs in ID/EX ID/EX.isterRt = register number for Rt in ID/EX ID/EX.isterRd = register number for Rd in ID/EX Current instruction being executed in ID/EX register Previous instruction is in the EX/MEM register Second previous is in the MEM/WB register RAW Data hazards when 1a. EX/MEM.isterRd = ID/EX.isterRs 1b. EX/MEM.isterRd = ID/EX.isterRt 2a. MEM/WB.isterRd = ID/EX.isterRs 2b. MEM/WB.isterRd = ID/EX.isterRt Fwd from EX/MEM pipeline reg Fwd from MEM/WB pipeline reg Slide 39 Detecting the Need to Forward But only if forwarding instruction will write to a register! EX/MEM.Write, MEM/WB.Write And only if Rd for that instruction is not R0 EX/MEM.isterRd 0 MEM/WB.isterRd 0 Slide 40 20

21 Forwarding Conditions Detecting RAW hazard with Previous Instruction if (EX/MEM.Write and (EX/MEM.isterRd 0) and (EX/MEM.isterRd = ID/EX.isterRs)) ForwardA = 01 (Forward from EX/MEM pipe stage) if (EX/MEM.Write and (EX/MEM.isterRd 0) and (EX/MEM.isterRd = ID/EX.isterRt)) ForwardB = 01 (Forward from EX/MEM pipe stage) Detecting RAW hazard with Second Previous if (MEM/WB.Write and (MEM/WB.isterRd 0) and (MEM/WB.isterRd = ID/EX.isterRs)) ForwardA = 10 (Forward from MEM/WB pipe stage) if (MEM/WB.Write and (MEM/WB.isterRd 0) and (MEM/WB.isterRd = ID/EX.isterRt)) ForwardB = 10 (Forward from MEM/WB pipe stage) Slide 41 Double Data Hazard Consider the sequence: add r1,r1,r2 r2 sub r1,r1,r3 and r1,r1,r4 Both hazards occur Want to use the most recent When executing AND, forward result of SUB ForwardA = 01 (from the EX/MEM pipe stage) Slide 42 21

22 Datapath with Forwarding Slide 43 Load Delay Not all RAW data hazards can be forwarded Load has a delay that cannot be eliminated by forwarding In the example shown below The LW instruction does not have data until end of CC4 AND wants data at beginning of CC4 - NOT possible Program Ord der Time (cycles) CC1 CC2 CC3 CC4 CC5 lw r2, 20(r1) IF DM and r4, r2, r5 IF DM or r6, r3, r2 IF CC6 CC7 CC8 DM However, load can forward data to second next instruction add r7, r2, r2 IF DM Slide 44 22

23 Stall the Pipeline for one Cycle Freeze the PC and the IF/ID registers No new instruction is fetched and instruction after load is stalled Allow the Load in ID/EX register to proceed Introduce a bubble into the ID/EX register Load can forward data after stalling next instruction Time (cycles) CC1 CC2 CC3 CC4 CC5 CC6 CC7 CC8 r Program Order lw r2, 20(r1) IM DM and r4, r2, r5 IM bubble DM or r6, r3, r2 IM DM Slide 45 Compiler Scheduling Compilers can schedule code in a way to avoid load stalls Consider the following statements: a = b + c; d = e f; Slow code: 2 stall cycles lw r10, (r1) # r1 = addr b lw r11, (r2) # r2 = addr c add r12, r10, $11 # stall sw r12, (r3) # r3 = addr a lw r13, (r4) # r4 = addr e lw r14, (r5) # r5 = addr f sub r15, r13, r14 # stall sw r15, (r6) # r6 = addr d Fast code: No Stalls lw r10, 0(r1) lw r11, 0(r2) lw r13, 0(r4) lw r14, 0(r5) add r12, r10, r11 sw r12, 0(r3) sub r15, r13, r14 sw r14, 0(r6) Compiler optimizes for performance. Hardware checks for safety. Slide 46 23

24 Load/Store Data Forwarding How to implement Load/Store Data Forwarding? Slide 47 Write After Read Instr J should write its result after it is read by I Called an anti-dependence by compiler writers I: sub r4, r1, r3 # r1 is read J: add r1, r2, r3 # r1 is written Results from reuse of the name r1 Hazard occurs when J writes r1 before I reads it Cannot occur in the basic 5-stage pipeline because: Reads are always in stage 2, and Writes are always in stage 5 Instructions are processed in order Slide 48 24

25 Write After Write Inst J should write its result after I Called output-dependence in compiler terminology I: sub r1, r4, r3 # r1 is written J: add r1, r2, r3 # r1 is written again This hazard also results from the reuse of name r1 Hazard when writes occur in the wrong order Can t happen in our basic 5-stage pipeline because: All writes are ordered and take place in stage 5 WAR and WAW hazards occur in complex pipelines Notice that Read After Read RAR is NOT a hazard Slide 49 Next: MIPS An ISA for Pipelining 5 stage pipelining Structural Hazards Data Hazards & Forwarding Branch Hazards Handling Exceptions Handling Multicycle Operations Slide 50 25

26 Control Hazards Branch instructions can cause great performance loss Branch instructions need two things: Branch Result Branch Target PC + 4 PC imm For our pipeline: 3-cycle branch delay Taken or Not Taken If Branch is NOT taken If Branch is Taken PC is updated 3 cycles after fetching branch instruction Branch target address is calculated in the stage Branch result is also computed in the stage What to do with the next 3 instructions after branch? Slide 51 3-Cycle Branch Delay beq r1,r3,label Ifetch DMem Next1 Ifetch DMem Next2 Ifetch DMem Next3 Ifetch DMem Label: target instruction Ifetch DMem Next1 thru Next3 instructions will be fetched anyway Pipeline should flush Next1 - Next3 if branch is taken Slide 52 26

27 Branch Stall Impact If CPI = 1 without branch stalls, and 30% branch If stalling 3 cycles per branch => new CPI = = 1.9 Two part solution: Determine branch taken or not sooner, and Compute taken branch address earlier MIPS Solution: Move branch test to ID stage (second stage) Adder to calculate new PC in ID stage Branch delay is reduced from 3 to just 1 clock cycle Slide 53 Modified Pipelined MIPS Datapath Instruction Fetch Instr. Decode. Fetch Execute Addr. Calc Memory Access Write Back Next PC 4 Adder Next SEQ PC Adder RS1 MU UX Zero? Address Memory IF/ID RS2 File ID/EX MUX EX/MEM Data Memory MEM/WB MUX Imm Sign Extend RD RD RD WB Data ID stage: Computing Branch address and result to reduce branch delay Slide 54 27

28 Four Branch Hazard Alternatives #1: Stall until branch direction is clear #2: Predict Branch Not Taken Execute successor instructions in sequence Squash instructions in pipeline if branch actually taken 47% MIPS branches not taken on average PC+4 already calculated, so use it to get next instruction #3: Predict Branch Taken 53% MIPS branches taken on average But haven t calculated branch target address until ID stage MIPS still incurs 1 cycle branch penalty Other machines: branch target known before branch outcome Slide 55 Four Branch Hazard Alternatives #4: Delayed Branch Define branch to take place AFTER following instruction branch instruction sequential successor 1 sequential successor 2... sequential successor n branch target if taken Branch delay of length n One branch delay slot allows proper decision and branch target address in 5 stage pipeline MIPS uses one branch delay slot Slide 56 28

29 Scheduling Branch Delay Slots A. From before branch B. From branch target C. From fall through add r1,r2,r3 if r2=0 then delay slot sub r4,r5,r6 add r1,r2,r3 if r1=0 then delay slot add r1,r2,r3 if r1=0 then delay slot or r7,r8,r9r8 r9 sub r4,r5,r6 becomes if r2=0 then add r1,r2,r3 becomes sub r4,r5,r6 add r1,r2,r3 if r1=0 then sub r4,r5,r6 becomes add r1,r2,r3 if r1=0 then or r7,r8,r9 sub r4,r5,r6 A is the best choice, fills delay slot & reduces instruction count (IC) In B, the sub instruction may need to be copied, increasing IC In B & C, must be okay to execute instruction in delay slot in all cases Slide 57 Effectiveness of Delayed Branch Compiler effectiveness for single branch delay slot Fills about 60% of branch delay slots About 80% of instructions executed in branch delay slots useful in computation About 50% (60% x 80%) of slots usefully filled Delayed Branch downside: As processor go to deeper pipelines and multiple issue, the branch delay ygrows and need more than one delay slot Delayed branching has lost popularity compared to more expensive but more flexible dynamic approaches Growth in available transistors has made dynamic approaches relatively cheaper Slide 58 29

30 Performance of Branch Schemes Assuming an ideal CPI = 1 without counting branch stalls Pipeline speedup over non-pipelined datapath: Pipeline depth Pipeline speedup 1 Pipeline stall cycles from branches Pipeline depth Pipeline speedup 1 Branch Frequency Branch Penalty Pipeline CPI branch stalls Ideal CPI no stalls Branch Freq Branch Penalty Slide 59 Evaluating Branch Alternatives Branch Scheme Penalty Unconditional Penalty Untaken Penalty Taken Stall always Predict taken Predict not taken Delayed branch Assume 4% unconditional branch, 6% conditional branch- untaken, and 10% conditional branch-taken. What is the impact on the CPI? Branch Scheme Unconditional Branches Untaken Branches Taken Branches All Branches 4% 6% 10% 20% Stall always CPI+0.56 Predict taken CPI+0.46 Predict not taken CPI+0.38 Delayed branch CPI+0.24 Slide 60 30

31 Next: MIPS An ISA for Pipelining 5 stage pipelining Structural Hazards Data Hazards & Forwarding Branch Hazards Handling Exceptions Handling Multicycle Operations Slide 61 Exceptions and Interrupts Unexpected events requiring change in flow of control Different ISAs use the terms differently Exception Arises within the execution of an instruction e.g., undefined opcode, overflow, syscall, Interrupt An external I/O device controller is requesting processor Exceptions and Interrupts complicate the implementation and control of the pipeline Slide 62 31

32 Types of Exceptions I/O device request (hardware interrupt) Invoking the OS (system call) Tracing instruction execution Breakpoint (programmer requested) Integer arithmetic overflow Floating Point arithmetic anomaly Page fault (requested page is not in memory) Misaligned memory access Memory protection violation Undefined instruction Hardware malfunction and Power failure Slide 63 Handling Exceptions In MIPS, exceptions are managed by a System Control Coprocessor (CP0) Save PC of offending (or interrupted) instruction Exception Program Counter (EPC) Save indication of the problem In MIPS: Cause register Jump to handler at a fixed address Slide 64 32

33 Handler Actions Read cause, and transfer to relevant handler Determine action required If program can be restarted Take corrective action Use EPC to return to program Otherwise Terminate program Report error using EPC, Cause, Slide 65 Alternative Approach Vectored Interrupts Handler address determined by the cause Example: Undefined opcode: C Overflow: C : C Instructions either Deal with the interrupt, or Jump to real handler Slide 66 33

34 Exceptions in MIPS 5-stage pipeline Stage IF ID EX MEM WB Exceptions that may occur Page fault on instruction fetch, misaligned memory access, memory protection violation Undefined or illegal opcode Arithmetic exception Page fault on data fetch, misaligned memory access, memory protection violation None Slide 67 Exceptions in a Pipeline Another form of control hazard Consider overflow on add in EX stage add r1, r2, r1 Prevent r1 from being written Complete previous instructions Flush add and subsequent instructions Set Cause and EPC register values Transfer control to handler Similar to mispredicted branch Use much of the same hardware Slide 68 34

35 5-Stage Pipeline with Exceptions Slide 69 Multiple Exceptions Pipelining overlaps multiple instructions Could have multiple exceptions at once Simple approach: deal with exception from earliest instruction Flush subsequent instructions Precise exceptions In complex pipelines Multiple instructions issued per cycle Out-of-order completion Maintaining precise exceptions is more difficult! Slide 70 35

36 Imprecise Exceptions Just stop pipeline and save state Including exception cause(s) Let the handler work out Which instruction(s) had exceptions Which to complete or flush May require manual completion Simplifies hardware, but more complex handler software Not feasible for complex multiple-issue out-of-order pipelines Slide 71 Next: MIPS An ISA for Pipelining 5 stage pipelining Structural Hazards Data Hazards & Forwarding Branch Hazards Handling Exceptions Handling Multicycle Operations Slide 72 36

37 Pipeline with Multiple Functional Units Slide 73 37

Lecture 05: Pipelining: Basic/ Intermediate Concepts and Implementation

Lecture 05: Pipelining: Basic/ Intermediate Concepts and Implementation Lecture 05: Pipelining: Basic/ Intermediate Concepts and Implementation CSE 564 Computer Architecture Summer 2017 Department of Computer Science and Engineering Yonghong Yan yan@oakland.edu www.secs.oakland.edu/~yan

More information

Pipeline Overview. Dr. Jiang Li. Adapted from the slides provided by the authors. Jiang Li, Ph.D. Department of Computer Science

Pipeline Overview. Dr. Jiang Li. Adapted from the slides provided by the authors. Jiang Li, Ph.D. Department of Computer Science Pipeline Overview Dr. Jiang Li Adapted from the slides provided by the authors Outline MIPS An ISA for Pipelining 5 stage pipelining Structural and Data Hazards Forwarding Branch Schemes Exceptions and

More information

Computer Architecture

Computer Architecture Lecture 3: Pipelining Iakovos Mavroidis Computer Science Department University of Crete 1 Previous Lecture Measurements and metrics : Performance, Cost, Dependability, Power Guidelines and principles in

More information

Modern Computer Architecture

Modern Computer Architecture Modern Computer Architecture Lecture2 Pipelining: Basic and Intermediate Concepts Hongbin Sun 国家集成电路人才培养基地 Xi an Jiaotong University Pipelining: Its Natural! Laundry Example Ann, Brian, Cathy, Dave each

More information

Lecture 3. Pipelining. Dr. Soner Onder CS 4431 Michigan Technological University 9/23/2009 1

Lecture 3. Pipelining. Dr. Soner Onder CS 4431 Michigan Technological University 9/23/2009 1 Lecture 3 Pipelining Dr. Soner Onder CS 4431 Michigan Technological University 9/23/2009 1 A "Typical" RISC ISA 32-bit fixed format instruction (3 formats) 32 32-bit GPR (R0 contains zero, DP take pair)

More information

COSC 6385 Computer Architecture - Pipelining

COSC 6385 Computer Architecture - Pipelining COSC 6385 Computer Architecture - Pipelining Fall 2006 Some of the slides are based on a lecture by David Culler, Instruction Set Architecture Relevant features for distinguishing ISA s Internal storage

More information

CPE Computer Architecture. Appendix A: Pipelining: Basic and Intermediate Concepts

CPE Computer Architecture. Appendix A: Pipelining: Basic and Intermediate Concepts CPE 110408443 Computer Architecture Appendix A: Pipelining: Basic and Intermediate Concepts Sa ed R. Abed [Computer Engineering Department, Hashemite University] Outline Basic concept of Pipelining The

More information

CISC 662 Graduate Computer Architecture Lecture 6 - Hazards

CISC 662 Graduate Computer Architecture Lecture 6 - Hazards CISC 662 Graduate Computer Architecture Lecture 6 - Hazards Michela Taufer http://www.cis.udel.edu/~taufer/teaching/cis662f07 Powerpoint Lecture Notes from John Hennessy and David Patterson s: Computer

More information

Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier

Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier Science 6 PM 7 8 9 10 11 Midnight Time 30 40 20 30 40 20

More information

EI338: Computer Systems and Engineering (Computer Architecture & Operating Systems)

EI338: Computer Systems and Engineering (Computer Architecture & Operating Systems) EI338: Computer Systems and Engineering (Computer Architecture & Operating Systems) Chentao Wu 吴晨涛 Associate Professor Dept. of Computer Science and Engineering Shanghai Jiao Tong University SEIEE Building

More information

LECTURE 3: THE PROCESSOR

LECTURE 3: THE PROCESSOR LECTURE 3: THE PROCESSOR Abridged version of Patterson & Hennessy (2013):Ch.4 Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU

More information

Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier

Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier Science Cases that affect instruction execution semantics

More information

COMPUTER ORGANIZATION AND DESIGN

COMPUTER ORGANIZATION AND DESIGN COMPUTER ORGANIZATION AND DESIGN 5 Edition th The Hardware/Software Interface Chapter 4 The Processor 4.1 Introduction Introduction CPU performance factors Instruction count CPI and Cycle time Determined

More information

Department of Computer and IT Engineering University of Kurdistan. Computer Architecture Pipelining. By: Dr. Alireza Abdollahpouri

Department of Computer and IT Engineering University of Kurdistan. Computer Architecture Pipelining. By: Dr. Alireza Abdollahpouri Department of Computer and IT Engineering University of Kurdistan Computer Architecture Pipelining By: Dr. Alireza Abdollahpouri Pipelined MIPS processor Any instruction set can be implemented in many

More information

Lecture 2: Processor and Pipelining 1

Lecture 2: Processor and Pipelining 1 The Simple BIG Picture! Chapter 3 Additional Slides The Processor and Pipelining CENG 6332 2 Datapath vs Control Datapath signals Control Points Controller Datapath: Storage, FU, interconnect sufficient

More information

Full Datapath. Chapter 4 The Processor 2

Full Datapath. Chapter 4 The Processor 2 Pipelining Full Datapath Chapter 4 The Processor 2 Datapath With Control Chapter 4 The Processor 3 Performance Issues Longest delay determines clock period Critical path: load instruction Instruction memory

More information

Lecture 3: The Processor (Chapter 4 of textbook) Chapter 4.1

Lecture 3: The Processor (Chapter 4 of textbook) Chapter 4.1 Lecture 3: The Processor (Chapter 4 of textbook) Chapter 4.1 Introduction Chapter 4.1 Chapter 4.2 Review: MIPS (RISC) Design Principles Simplicity favors regularity fixed size instructions small number

More information

The Processor: Improving the performance - Control Hazards

The Processor: Improving the performance - Control Hazards The Processor: Improving the performance - Control Hazards Wednesday 14 October 15 Many slides adapted from: and Design, Patterson & Hennessy 5th Edition, 2014, MK and from Prof. Mary Jane Irwin, PSU Summary

More information

COSC4201 Pipelining. Prof. Mokhtar Aboelaze York University

COSC4201 Pipelining. Prof. Mokhtar Aboelaze York University COSC4201 Pipelining Prof. Mokhtar Aboelaze York University 1 Instructions: Fetch Every instruction could be executed in 5 cycles, these 5 cycles are (MIPS like machine). Instruction fetch IR Mem[PC] NPC

More information

Chapter 4 The Processor 1. Chapter 4B. The Processor

Chapter 4 The Processor 1. Chapter 4B. The Processor Chapter 4 The Processor 1 Chapter 4B The Processor Chapter 4 The Processor 2 Control Hazards Branch determines flow of control Fetching next instruction depends on branch outcome Pipeline can t always

More information

Chapter 4 The Processor 1. Chapter 4A. The Processor

Chapter 4 The Processor 1. Chapter 4A. The Processor Chapter 4 The Processor 1 Chapter 4A The Processor Chapter 4 The Processor 2 Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware

More information

CSEE 3827: Fundamentals of Computer Systems

CSEE 3827: Fundamentals of Computer Systems CSEE 3827: Fundamentals of Computer Systems Lecture 21 and 22 April 22 and 27, 2009 martha@cs.columbia.edu Amdahl s Law Be aware when optimizing... T = improved Taffected improvement factor + T unaffected

More information

Outline. A pipelined datapath Pipelined control Data hazards and forwarding Data hazards and stalls Branch (control) hazards Exception

Outline. A pipelined datapath Pipelined control Data hazards and forwarding Data hazards and stalls Branch (control) hazards Exception Outline A pipelined datapath Pipelined control Data hazards and forwarding Data hazards and stalls Branch (control) hazards Exception 1 4 Which stage is the branch decision made? Case 1: 0 M u x 1 Add

More information

Instruction Pipelining Review

Instruction Pipelining Review Instruction Pipelining Review Instruction pipelining is CPU implementation technique where multiple operations on a number of instructions are overlapped. An instruction execution pipeline involves a number

More information

Chapter 4. The Processor

Chapter 4. The Processor Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified

More information

Appendix C. Abdullah Muzahid CS 5513

Appendix C. Abdullah Muzahid CS 5513 Appendix C Abdullah Muzahid CS 5513 1 A "Typical" RISC ISA 32-bit fixed format instruction (3 formats) 32 32-bit GPR (R0 contains zero) Single address mode for load/store: base + displacement no indirection

More information

Instruction Level Parallelism. Appendix C and Chapter 3, HP5e

Instruction Level Parallelism. Appendix C and Chapter 3, HP5e Instruction Level Parallelism Appendix C and Chapter 3, HP5e Outline Pipelining, Hazards Branch prediction Static and Dynamic Scheduling Speculation Compiler techniques, VLIW Limits of ILP. Implementation

More information

EITF20: Computer Architecture Part2.2.1: Pipeline-1

EITF20: Computer Architecture Part2.2.1: Pipeline-1 EITF20: Computer Architecture Part2.2.1: Pipeline-1 Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Pipelining Harzards Structural hazards Data hazards Control hazards Implementation issues Multi-cycle

More information

EITF20: Computer Architecture Part2.2.1: Pipeline-1

EITF20: Computer Architecture Part2.2.1: Pipeline-1 EITF20: Computer Architecture Part2.2.1: Pipeline-1 Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Pipelining Harzards Structural hazards Data hazards Control hazards Implementation issues Multi-cycle

More information

Pipelining. Maurizio Palesi

Pipelining. Maurizio Palesi * Pipelining * Adapted from David A. Patterson s CS252 lecture slides, http://www.cs.berkeley/~pattrsn/252s98/index.html Copyright 1998 UCB 1 References John L. Hennessy and David A. Patterson, Computer

More information

There are different characteristics for exceptions. They are as follows:

There are different characteristics for exceptions. They are as follows: e-pg PATHSHALA- Computer Science Computer Architecture Module 15 Exception handling and floating point pipelines The objectives of this module are to discuss about exceptions and look at how the MIPS architecture

More information

Appendix C. Instructor: Josep Torrellas CS433. Copyright Josep Torrellas 1999, 2001, 2002,

Appendix C. Instructor: Josep Torrellas CS433. Copyright Josep Torrellas 1999, 2001, 2002, Appendix C Instructor: Josep Torrellas CS433 Copyright Josep Torrellas 1999, 2001, 2002, 2013 1 Pipelining Multiple instructions are overlapped in execution Each is in a different stage Each stage is called

More information

Computer Architecture Spring 2016

Computer Architecture Spring 2016 Computer Architecture Spring 2016 Lecture 02: Introduction II Shuai Wang Department of Computer Science and Technology Nanjing University Pipeline Hazards Major hurdle to pipelining: hazards prevent the

More information

MIPS Pipelining. Computer Organization Architectures for Embedded Computing. Wednesday 8 October 14

MIPS Pipelining. Computer Organization Architectures for Embedded Computing. Wednesday 8 October 14 MIPS Pipelining Computer Organization Architectures for Embedded Computing Wednesday 8 October 14 Many slides adapted from: Computer Organization and Design, Patterson & Hennessy 4th Edition, 2011, MK

More information

Pipelining Analogy. Pipelined laundry: overlapping execution. Parallelism improves performance. Four loads: Non-stop: Speedup = 8/3.5 = 2.3.

Pipelining Analogy. Pipelined laundry: overlapping execution. Parallelism improves performance. Four loads: Non-stop: Speedup = 8/3.5 = 2.3. Pipelining Analogy Pipelined laundry: overlapping execution Parallelism improves performance Four loads: Speedup = 8/3.5 = 2.3 Non-stop: Speedup =2n/05n+15 2n/0.5n 1.5 4 = number of stages 4.5 An Overview

More information

Pipelining! Advanced Topics on Heterogeneous System Architectures. Politecnico di Milano! Seminar DEIB! 30 November, 2017!

Pipelining! Advanced Topics on Heterogeneous System Architectures. Politecnico di Milano! Seminar DEIB! 30 November, 2017! Advanced Topics on Heterogeneous System Architectures Pipelining! Politecnico di Milano! Seminar Room @ DEIB! 30 November, 2017! Antonio R. Miele! Marco D. Santambrogio! Politecnico di Milano! 2 Outline!

More information

Processor (II) - pipelining. Hwansoo Han

Processor (II) - pipelining. Hwansoo Han Processor (II) - pipelining Hwansoo Han Pipelining Analogy Pipelined laundry: overlapping execution Parallelism improves performance Four loads: Speedup = 8/3.5 =2.3 Non-stop: 2n/0.5n + 1.5 4 = number

More information

What is Pipelining? Time per instruction on unpipelined machine Number of pipe stages

What is Pipelining? Time per instruction on unpipelined machine Number of pipe stages What is Pipelining? Is a key implementation techniques used to make fast CPUs Is an implementation techniques whereby multiple instructions are overlapped in execution It takes advantage of parallelism

More information

Minimizing Data hazard Stalls by Forwarding Data Hazard Classification Data Hazards Present in Current MIPS Pipeline

Minimizing Data hazard Stalls by Forwarding Data Hazard Classification Data Hazards Present in Current MIPS Pipeline Instruction Pipelining Review: MIPS In-Order Single-Issue Integer Pipeline Performance of Pipelines with Stalls Pipeline Hazards Structural hazards Data hazards Minimizing Data hazard Stalls by Forwarding

More information

Full Datapath. Chapter 4 The Processor 2

Full Datapath. Chapter 4 The Processor 2 Pipelining Full Datapath Chapter 4 The Processor 2 Datapath With Control Chapter 4 The Processor 3 Performance Issues Longest delay determines clock period Critical path: load instruction Instruction memory

More information

Thomas Polzer Institut für Technische Informatik

Thomas Polzer Institut für Technische Informatik Thomas Polzer tpolzer@ecs.tuwien.ac.at Institut für Technische Informatik Pipelined laundry: overlapping execution Parallelism improves performance Four loads: Speedup = 8/3.5 = 2.3 Non-stop: Speedup =

More information

CENG 3420 Lecture 06: Pipeline

CENG 3420 Lecture 06: Pipeline CENG 3420 Lecture 06: Pipeline Bei Yu byu@cse.cuhk.edu.hk CENG3420 L06.1 Spring 2019 Outline q Pipeline Motivations q Pipeline Hazards q Exceptions q Background: Flip-Flop Control Signals CENG3420 L06.2

More information

EITF20: Computer Architecture Part2.2.1: Pipeline-1

EITF20: Computer Architecture Part2.2.1: Pipeline-1 EITF20: Computer Architecture Part2.2.1: Pipeline-1 Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Pipelining Harzards Structural hazards Data hazards Control hazards Implementation issues Multi-cycle

More information

Advanced Computer Architecture

Advanced Computer Architecture ECE 563 Advanced Computer Architecture Fall 2010 Lecture 2: Review of Metrics and Pipelining 563 L02.1 Fall 2010 Review from Last Time Computer Architecture >> instruction sets Computer Architecture skill

More information

Computer Architecture Computer Science & Engineering. Chapter 4. The Processor BK TP.HCM

Computer Architecture Computer Science & Engineering. Chapter 4. The Processor BK TP.HCM Computer Architecture Computer Science & Engineering Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware

More information

Pipelining: Hazards Ver. Jan 14, 2014

Pipelining: Hazards Ver. Jan 14, 2014 POLITECNICO DI MILANO Parallelism in wonderland: are you ready to see how deep the rabbit hole goes? Pipelining: Hazards Ver. Jan 14, 2014 Marco D. Santambrogio: marco.santambrogio@polimi.it Simone Campanoni:

More information

Instruction Level Parallelism. ILP, Loop level Parallelism Dependences, Hazards Speculation, Branch prediction

Instruction Level Parallelism. ILP, Loop level Parallelism Dependences, Hazards Speculation, Branch prediction Instruction Level Parallelism ILP, Loop level Parallelism Dependences, Hazards Speculation, Branch prediction Basic Block A straight line code sequence with no branches in except to the entry and no branches

More information

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 4. The Processor

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 4. The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle

More information

Computer Organization and Structure. Bing-Yu Chen National Taiwan University

Computer Organization and Structure. Bing-Yu Chen National Taiwan University Computer Organization and Structure Bing-Yu Chen National Taiwan University The Processor Logic Design Conventions Building a Datapath A Simple Implementation Scheme An Overview of Pipelining Pipelined

More information

COMPUTER ORGANIZATION AND DESI

COMPUTER ORGANIZATION AND DESI COMPUTER ORGANIZATION AND DESIGN 5 Edition th The Hardware/Software Interface Chapter 4 The Processor 4.1 Introduction Introduction CPU performance factors Instruction count Determined by ISA and compiler

More information

Basic Pipelining Concepts

Basic Pipelining Concepts Basic ipelining oncepts Appendix A (recommended reading, not everything will be covered today) Basic pipelining ipeline hazards Data hazards ontrol hazards Structural hazards Multicycle operations Execution

More information

COMPUTER ORGANIZATION AND DESIGN

COMPUTER ORGANIZATION AND DESIGN COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle

More information

Lecture 7 Pipelining. Peng Liu.

Lecture 7 Pipelining. Peng Liu. Lecture 7 Pipelining Peng Liu liupeng@zju.edu.cn 1 Review: The Single Cycle Processor 2 Review: Given Datapath,RTL -> Control Instruction Inst Memory Adr Op Fun Rt

More information

Appendix A. Overview

Appendix A. Overview Appendix A Pipelining: Basic and Intermediate Concepts 1 Overview Basics of Pipelining Pipeline Hazards Pipeline Implementation Pipelining + Exceptions Pipeline to handle Multicycle Operations 2 1 Unpipelined

More information

Pipeline Hazards. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Pipeline Hazards. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University Pipeline Hazards Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Hazards What are hazards? Situations that prevent starting the next instruction

More information

Determined by ISA and compiler. We will examine two MIPS implementations. A simplified version A more realistic pipelined version

Determined by ISA and compiler. We will examine two MIPS implementations. A simplified version A more realistic pipelined version MIPS Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified

More information

CSE 533: Advanced Computer Architectures. Pipelining. Instructor: Gürhan Küçük. Yeditepe University

CSE 533: Advanced Computer Architectures. Pipelining. Instructor: Gürhan Küçük. Yeditepe University CSE 533: Advanced Computer Architectures Pipelining Instructor: Gürhan Küçük Yeditepe University Lecture notes based on notes by Mark D. Hill and John P. Shen Updated by Mikko Lipasti Pipelining Forecast

More information

Chapter 4. The Processor

Chapter 4. The Processor Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified

More information

Appendix C: Pipelining: Basic and Intermediate Concepts

Appendix C: Pipelining: Basic and Intermediate Concepts Appendix C: Pipelining: Basic and Intermediate Concepts Key ideas and simple pipeline (Section C.1) Hazards (Sections C.2 and C.3) Structural hazards Data hazards Control hazards Exceptions (Section C.4)

More information

3/12/2014. Single Cycle (Review) CSE 2021: Computer Organization. Single Cycle with Jump. Multi-Cycle Implementation. Why Multi-Cycle?

3/12/2014. Single Cycle (Review) CSE 2021: Computer Organization. Single Cycle with Jump. Multi-Cycle Implementation. Why Multi-Cycle? CSE 2021: Computer Organization Single Cycle (Review) Lecture-10b CPU Design : Pipelining-1 Overview, Datapath and control Shakil M. Khan 2 Single Cycle with Jump Multi-Cycle Implementation Instruction:

More information

CMSC 411 Computer Systems Architecture Lecture 6 Basic Pipelining 3. Complications With Long Instructions

CMSC 411 Computer Systems Architecture Lecture 6 Basic Pipelining 3. Complications With Long Instructions CMSC 411 Computer Systems Architecture Lecture 6 Basic Pipelining 3 Long Instructions & MIPS Case Study Complications With Long Instructions So far, all MIPS instructions take 5 cycles But haven't talked

More information

Chapter 4. The Processor

Chapter 4. The Processor Chapter 4 The Processor 1 Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A

More information

CS4617 Computer Architecture

CS4617 Computer Architecture 1/47 CS4617 Computer Architecture Lectures 21 22: Pipelining Reference: Appendix C, Hennessy & Patterson Dr J Vaughan November 2013 MIPS data path implementation (unpipelined) Figure C.21 The implementation

More information

Speeding Up DLX Computer Architecture Hadassah College Spring 2018 Speeding Up DLX Dr. Martin Land

Speeding Up DLX Computer Architecture Hadassah College Spring 2018 Speeding Up DLX Dr. Martin Land Speeding Up DLX 1 DLX Execution Stages Version 1 Clock Cycle 1 I 1 enters Instruction Fetch (IF) Clock Cycle2 I 1 moves to Instruction Decode (ID) Instruction Fetch (IF) holds state fixed Clock Cycle3

More information

The Processor. Z. Jerry Shi Department of Computer Science and Engineering University of Connecticut. CSE3666: Introduction to Computer Architecture

The Processor. Z. Jerry Shi Department of Computer Science and Engineering University of Connecticut. CSE3666: Introduction to Computer Architecture The Processor Z. Jerry Shi Department of Computer Science and Engineering University of Connecticut CSE3666: Introduction to Computer Architecture Introduction CPU performance factors Instruction count

More information

Chapter 4. The Processor

Chapter 4. The Processor Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified

More information

Chapter 4 (Part II) Sequential Laundry

Chapter 4 (Part II) Sequential Laundry Chapter 4 (Part II) The Processor Baback Izadi Division of Engineering Programs bai@engr.newpaltz.edu Sequential Laundry 6 P 7 8 9 10 11 12 1 2 A T a s k O r d e r A B C D 30 30 30 30 30 30 30 30 30 30

More information

mywbut.com Pipelining

mywbut.com Pipelining Pipelining 1 What Is Pipelining? Pipelining is an implementation technique whereby multiple instructions are overlapped in execution. Today, pipelining is the key implementation technique used to make

More information

Pipeline Review. Review

Pipeline Review. Review Pipeline Review Review Covered in EECS2021 (was CSE2021) Just a reminder of pipeline and hazards If you need more details, review 2021 materials 1 The basic MIPS Processor Pipeline 2 Performance of pipelining

More information

14:332:331 Pipelined Datapath

14:332:331 Pipelined Datapath 14:332:331 Pipelined Datapath I n s t r. O r d e r Inst 0 Inst 1 Inst 2 Inst 3 Inst 4 Single Cycle Disadvantages & Advantages Uses the clock cycle inefficiently the clock cycle must be timed to accommodate

More information

Overview. Appendix A. Pipelining: Its Natural! Sequential Laundry 6 PM Midnight. Pipelined Laundry: Start work ASAP

Overview. Appendix A. Pipelining: Its Natural! Sequential Laundry 6 PM Midnight. Pipelined Laundry: Start work ASAP Overview Appendix A Pipelining: Basic and Intermediate Concepts Basics of Pipelining Pipeline Hazards Pipeline Implementation Pipelining + Exceptions Pipeline to handle Multicycle Operations 1 2 Unpipelined

More information

EIE/ENE 334 Microprocessors

EIE/ENE 334 Microprocessors EIE/ENE 334 Microprocessors Lecture 6: The Processor Week #06/07 : Dejwoot KHAWPARISUTH Adapted from Computer Organization and Design, 4 th Edition, Patterson & Hennessy, 2009, Elsevier (MK) http://webstaff.kmutt.ac.th/~dejwoot.kha/

More information

Chapter 4. The Processor

Chapter 4. The Processor Chapter 4 The Processor 4.1 Introduction Introduction CPU performance factors Instruction count CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations Determined by ISA

More information

What is Pipelining? RISC remainder (our assumptions)

What is Pipelining? RISC remainder (our assumptions) What is Pipelining? Is a key implementation techniques used to make fast CPUs Is an implementation techniques whereby multiple instructions are overlapped in execution It takes advantage of parallelism

More information

Data Hazards Compiler Scheduling Pipeline scheduling or instruction scheduling: Compiler generates code to eliminate hazard

Data Hazards Compiler Scheduling Pipeline scheduling or instruction scheduling: Compiler generates code to eliminate hazard Data Hazards Compiler Scheduling Pipeline scheduling or instruction scheduling: Compiler generates code to eliminate hazard Consider: a = b + c; d = e - f; Assume loads have a latency of one clock cycle:

More information

Pipeline Data Hazards. Dealing With Data Hazards

Pipeline Data Hazards. Dealing With Data Hazards Pipeline Data Hazards Warning, warning, warning! Dealing With Data Hazards In Software inserting independent instructions In Hardware inserting bubbles (stalling the pipeline) data forwarding Data Data

More information

Pipelining. CSC Friday, November 6, 2015

Pipelining. CSC Friday, November 6, 2015 Pipelining CSC 211.01 Friday, November 6, 2015 Performance Issues Longest delay determines clock period Critical path: load instruction Instruction memory register file ALU data memory register file Not

More information

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle

More information

Pipelining: Basic and Intermediate Concepts

Pipelining: Basic and Intermediate Concepts Appendix A Pipelining: Basic and Intermediate Concepts 1 Overview Basics of fpipelining i Pipeline Hazards Pipeline Implementation Pipelining + Exceptions Pipeline to handle Multicycle Operations 2 Unpipelined

More information

Instr. execution impl. view

Instr. execution impl. view Pipelining Sangyeun Cho Computer Science Department Instr. execution impl. view Single (long) cycle implementation Multi-cycle implementation Pipelined implementation Processing an instruction Fetch instruction

More information

ECE 486/586. Computer Architecture. Lecture # 12

ECE 486/586. Computer Architecture. Lecture # 12 ECE 486/586 Computer Architecture Lecture # 12 Spring 2015 Portland State University Lecture Topics Pipelining Control Hazards Delayed branch Branch stall impact Implementing the pipeline Detecting hazards

More information

Chapter 4. Instruction Execution. Introduction. CPU Overview. Multiplexers. Chapter 4 The Processor 1. The Processor.

Chapter 4. Instruction Execution. Introduction. CPU Overview. Multiplexers. Chapter 4 The Processor 1. The Processor. COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor The Processor - Introduction

More information

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition The Processor - Introduction

More information

Pipelined Datapath. Reading. Sections Practice Problems: 1, 3, 8, 12

Pipelined Datapath. Reading. Sections Practice Problems: 1, 3, 8, 12 Pipelined Datapath Lecture notes from KP, H. H. Lee and S. Yalamanchili Sections 4.5 4. Practice Problems:, 3, 8, 2 ing Note: Appendices A-E in the hardcopy text correspond to chapters 7- in the online

More information

Advanced Parallel Architecture Lessons 5 and 6. Annalisa Massini /2017

Advanced Parallel Architecture Lessons 5 and 6. Annalisa Massini /2017 Advanced Parallel Architecture Lessons 5 and 6 Annalisa Massini - Pipelining Hennessy, Patterson Computer architecture A quantitive approach Appendix C Sections C.1, C.2 Pipelining Pipelining is an implementation

More information

Lecture 9 Pipeline and Cache

Lecture 9 Pipeline and Cache Lecture 9 Pipeline and Cache Peng Liu liupeng@zju.edu.cn 1 What makes it easy Pipelining Review all instructions are the same length just a few instruction formats memory operands appear only in loads

More information

Pipelining. Ideal speedup is number of stages in the pipeline. Do we achieve this? 2. Improve performance by increasing instruction throughput ...

Pipelining. Ideal speedup is number of stages in the pipeline. Do we achieve this? 2. Improve performance by increasing instruction throughput ... CHAPTER 6 1 Pipelining Instruction class Instruction memory ister read ALU Data memory ister write Total (in ps) Load word 200 100 200 200 100 800 Store word 200 100 200 200 700 R-format 200 100 200 100

More information

第三章 Instruction-Level Parallelism and Its Dynamic Exploitation. 陈文智 浙江大学计算机学院 2014 年 10 月

第三章 Instruction-Level Parallelism and Its Dynamic Exploitation. 陈文智 浙江大学计算机学院 2014 年 10 月 第三章 Instruction-Level Parallelism and Its Dynamic Exploitation 陈文智 chenwz@zju.edu.cn 浙江大学计算机学院 2014 年 10 月 1 3.3 The Major Hurdle of Pipelining Pipeline Hazards 本科回顾 ------- Appendix A.2 3.3.1 Taxonomy

More information

Outline Marquette University

Outline Marquette University COEN-4710 Computer Hardware Lecture 4 Processor Part 2: Pipelining (Ch.4) Cristinel Ababei Department of Electrical and Computer Engineering Credits: Slides adapted primarily from presentations from Mike

More information

Advanced Computer Architecture Pipelining

Advanced Computer Architecture Pipelining Advanced Computer Architecture Pipelining Dr. Shadrokh Samavi Some slides are from the instructors resources which accompany the 6 th and previous editions of the textbook. Some slides are from David Patterson,

More information

Pipelined Datapath. Reading. Sections Practice Problems: 1, 3, 8, 12 (2) Lecture notes from MKP, H. H. Lee and S.

Pipelined Datapath. Reading. Sections Practice Problems: 1, 3, 8, 12 (2) Lecture notes from MKP, H. H. Lee and S. Pipelined Datapath Lecture notes from KP, H. H. Lee and S. Yalamanchili Sections 4.5 4. Practice Problems:, 3, 8, 2 ing (2) Pipeline Performance Assume time for stages is ps for register read or write

More information

ELE 655 Microprocessor System Design

ELE 655 Microprocessor System Design ELE 655 Microprocessor System Design Section 2 Instruction Level Parallelism Class 1 Basic Pipeline Notes: Reg shows up two places but actually is the same register file Writes occur on the second half

More information

Lecture 9. Pipeline Hazards. Christos Kozyrakis Stanford University

Lecture 9. Pipeline Hazards. Christos Kozyrakis Stanford University Lecture 9 Pipeline Hazards Christos Kozyrakis Stanford University http://eeclass.stanford.edu/ee18b 1 Announcements PA-1 is due today Electronic submission Lab2 is due on Tuesday 2/13 th Quiz1 grades will

More information

CSE 502 Graduate Computer Architecture. Lec 3-5 Performance + Instruction Pipelining Review

CSE 502 Graduate Computer Architecture. Lec 3-5 Performance + Instruction Pipelining Review CSE 502 Graduate Computer Architecture Lec 3-5 Performance + Instruction Pipelining Review Larry Wittie Computer Science, StonyBrook University http://www.cs.sunysb.edu/~cse502 and ~lw Slides adapted from

More information

Very Simple MIPS Implementation

Very Simple MIPS Implementation 06 1 MIPS Pipelined Implementation 06 1 line: (In this set.) Unpipelined Implementation. (Diagram only.) Pipelined MIPS Implementations: Hardware, notation, hazards. Dependency Definitions. Hazards: Definitions,

More information

Computer Architecture Computer Science & Engineering. Chapter 4. The Processor BK TP.HCM

Computer Architecture Computer Science & Engineering. Chapter 4. The Processor BK TP.HCM Computer Architecture Computer Science & Engineering Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware

More information

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition. Chapter 4. The Processor

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition. Chapter 4. The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor The Processor? Chapter 4 The Processor 2 Introduction We will learn How the ISA determines many aspects

More information

Computer and Information Sciences College / Computer Science Department Enhancing Performance with Pipelining

Computer and Information Sciences College / Computer Science Department Enhancing Performance with Pipelining Computer and Information Sciences College / Computer Science Department Enhancing Performance with Pipelining Single-Cycle Design Problems Assuming fixed-period clock every instruction datapath uses one

More information

LECTURE 10. Pipelining: Advanced ILP

LECTURE 10. Pipelining: Advanced ILP LECTURE 10 Pipelining: Advanced ILP EXCEPTIONS An exception, or interrupt, is an event other than regular transfers of control (branches, jumps, calls, returns) that changes the normal flow of instruction

More information

HY425 Lecture 05: Branch Prediction

HY425 Lecture 05: Branch Prediction HY425 Lecture 05: Branch Prediction Dimitrios S. Nikolopoulos University of Crete and FORTH-ICS October 19, 2011 Dimitrios S. Nikolopoulos HY425 Lecture 05: Branch Prediction 1 / 45 Exploiting ILP in hardware

More information