Pipelining. Each step does a small fraction of the job. All steps ideally operate concurrently.


1 Pipelining
- Computational assembly line
  - Each step does a small fraction of the job
  - All steps ideally operate concurrently
  - A form of vertical concurrency
- Stage/segment: responsible for 1 step
  - 1 machine cycle per stage, usually equal to 1 clock cycle
- Synchronous design ==> slowest stage dominates
  - All stages execute at about the same rate
- Simple model: common latch clock
  [Figure: Stage 1, Latch, Stage 2, Latch, ..., Stage n, Latch in series; each latch is marked CL]
Chapter 3 page 1

2 Major Pipeline Benefit = Performance
- Ideal performance
  - time-per-instruction = unpiped-instruction-time / #stages
  - ideal speedup = #stages (normally cannot be achieved)
- 2 ways to view the performance mechanism
  - Reduced CPI (n stages in both the piped and unpiped systems)
    - increased throughput: 1 instruction per cycle as opposed to 1 instruction per n cycles
    - the result is that the average CPI is reduced
  - Reduced cycle time (n stages in the piped system and 1 stage in the unpiped system)
    - work is split up into n stages
    - the result is that the cycle time is reduced

3 Other Benefits of Pipelining
- Hardware mechanisms are invisible to programmers
  - Hence no programming model shift is required to exploit this form of concurrency
  - With the exception of performance, the programmer is oblivious to the actual pipeline organization
  - BUT since the compiler tries to optimize for performance, the optimization phases of modern compilers will know the pipeline organization details
- All modern machines are pipelined
  - THE key technique in advancing the performance curve through the 80s
  - the 90s just use multiple pipelines
- Beware: no benefit is totally free/good

4 Start with Unpipelined DLX
- Every DLX instruction can be executed in 5 steps
- Unpipelined case: all 5 steps occur in the same (long) clock cycle
  - each step's outputs are just passed to the next step (no latches)
- Remember the instruction set (ignore Jump for now):
  - I-type: Opcode | rs1 | rd | immediate
    - loads and stores, R-immed ops, conditional branches (rs1 is the register tested, rd unused), Jump reg, J&Link reg
  - R-type: Opcode | rs1 | rs2 | rd | function
    - reg-reg ALU ops (func = op), read/write special regs, and moves
  - J-type: Opcode | Offset added to PC
    - Jump and Link, Jump, TRAP, and RFE

5 Steps 1 & 2
- IF: instruction fetch step
  - IR <-- Mem[PC]       fetch the next instruction from memory
  - NPC <-- PC + 4       compute the new PC
  - Note: can be done in parallel with opcode decode
- ID: instruction decode and register fetch step
  - A <-- Regs[IR_6..10]
  - B <-- Regs[IR_11..15]
  - Possible since register specifiers are encoded in fixed fields
  - We may fetch register contents that we don't use, but that is OK since the operands will be ready if the opcode is of the type that does use them
  - Also calculate the sign-extended immediate in case that's the value that the opcode needs

6 Step 3
- EX: execution/effective address calculation step
- There are 4 options depending on the opcode decoded in the previous step
  - Memory reference (effective address calculation)
    - Output <-- A + ((IR_16)**16 ## IR_16..31)       effective data address
    - SMD (store memory data register) <-- B          data to be written if it is a STORE
  - Register-Register ALU instruction
    - Output <-- A op B
  - Register-Immediate ALU instruction
    - Output <-- A op ((IR_16)**16 ## IR_16..31)
  - Branch/Jump (branch target address calculation)
    - Output <-- NPC + ((IR_16)**16 ## IR_16..31)     actually a 16-bit immed. for a branch, a 26-bit immed. for a jump
    - cond <-- (A op 0) for conditional branches; A's value is the condition base (op is = for BEQZ)
- In a simple load-store machine no instruction needs to simultaneously calculate an effective data address, calculate a branch target address, and perform an ALU op, hence execution and effective address calculation can be combined into a single cycle

7 Steps 4 & 5
- MEM: memory access/branch completion
  - Memory reference
    - LMD (load memory data register) <-- Mem[Output]      if it is a load, OR
    - Mem[Output] <-- SMD (store memory data register)     if it is a store
  - Branch
    - if (cond) then PC <-- Output else PC <-- NPC
    - for Jumps the condition is always true
- WB: write back
  - Reg-Reg:    Regs[IR_16..20] <-- Output
  - Reg-Immed:  Regs[IR_11..15] <-- Output
  - Load:       Regs[IR_11..15] <-- LMD
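The five steps can be walked through in a few lines of Python. This is only a sketch under stated assumptions: the tuple encoding `(op, f1, rs1, f3)`, the five toy opcodes, and the helper names `step` and `run` are hypothetical and are not the DLX encoding from the slides; memory is a plain dictionary and registers a list.

```python
def step(program, regs, mem, pc):
    """Execute one toy instruction in five steps; return the next PC."""
    # IF: fetch the instruction, compute NPC
    op, f1, rs1, f3 = program[pc // 4]
    npc = pc + 4
    # ID: read register operands; f3 names a register only for ADD
    a = regs[rs1]
    b = regs[f3] if op == "ADD" else regs[f1]
    imm = f3
    # EX: one of four options, selected by the opcode
    cond = False
    if op == "ADD":
        out = a + b                 # register-register
    elif op == "ADDI":
        out = a + imm               # register-immediate
    elif op in ("LW", "SW"):
        out = a + imm               # effective data address
    elif op == "BEQZ":
        out = npc + imm             # branch target
        cond = (a == 0)
    else:
        raise ValueError(f"unknown opcode {op}")
    # MEM: memory access or branch completion
    lmd = None
    if op == "LW":
        lmd = mem.get(out, 0)
    elif op == "SW":
        mem[out] = b                # b plays the role of SMD
    next_pc = out if cond else npc
    # WB: write the result back to the register file
    if op in ("ADD", "ADDI"):
        regs[f1] = out
    elif op == "LW":
        regs[f1] = lmd
    return next_pc


def run(program, n_regs=8):
    """Run a toy program from PC = 0 until the PC leaves the program."""
    regs, mem, pc = [0] * n_regs, {}, 0
    while pc < 4 * len(program):
        pc = step(program, regs, mem, pc)
    return regs, mem
```

Every instruction passes through the same five steps, and steps that have nothing to do for a given opcode (for example WB for a store) simply fall through.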

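8 DLX Datapath
[Figure: unpipelined DLX datapath divided into IF, ID, EX, MEM, and WB sections. Labeled elements: PC, adder (+4), NPC, instruction memory, IR (rs1, rs2, rd, 16-bit immediate), register file, sign extend (16 to 32 bits), A, B, IMM, Zero? test, multiplexers, SMD, COND, Output (may be EFA), data memory, LMD]
Turn dashed lines into registers and you get a pipelined DLX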

9 Unpipelined Datapath
- Branch instructions each require four cycles and all others each require five cycles to execute
- Model is essentially correct but not optimized
  - ALU operations could have been completed during stage MEM
- There are other choices
  - 2 ALUs: 1 would have sufficed since in any given cycle only 1 is active
  - book shows registers rather than labels: these registers are not needed in the unpipelined version since contents will be held for the next cycle
  - instruction and data memories do not need to be separate
  - could have gone for 1 long cycle rather than 5 short ones
- Control: finite state machine or microcode

10 DLX Pipeline Example
- Stages are active simultaneously in every cycle
  - rather than 5 cycles per instruction and then start another 5
  - ideal result is that the CPI went from 5 to 1
- Is it really this simple?
  - Of course not, but it's a start
  - e.g. what happens if you want to use the same unified instruction and data memory?
  - things will grow more hair soon, e.g. exceptions, bypass, forwarding, etc.
- Best case pipeline scenario

Table 1: Best-case pipeline timing (instruction number down, clock number across)
Instruction   1    2    3    4    5    6    7    8    9
i             IF   ID   EX   MEM  WB
i+1                IF   ID   EX   MEM  WB
i+2                     IF   ID   EX   MEM  WB
i+3                          IF   ID   EX   MEM  WB
i+4                               IF   ID   EX   MEM  WB
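The best-case timing table can be generated mechanically, since instruction k (counting from 0) occupies stage s in clock cycle k + s + 1. A small sketch; the function name `pipeline_table` is illustrative only:

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def pipeline_table(n_instructions):
    """Return one row per instruction; row[c] is the stage active in cycle c + 1."""
    total_cycles = n_instructions + len(STAGES) - 1
    rows = []
    for k in range(n_instructions):
        row = [""] * total_cycles
        for s, name in enumerate(STAGES):
            row[k + s] = name       # instruction k enters stage s in cycle k + s + 1
        rows.append(row)
    return rows


if __name__ == "__main__":
    for k, row in enumerate(pipeline_table(5)):
        print(f"i+{k}".ljust(5) + "".join(cell.ljust(5) for cell in row))
```

Reading a column of the result top to bottom shows which stage each instruction is in during that cycle; in cycle 5 all five stages are busy at once.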

11 Important Pipeline Characteristics
- Latency
  - Time it takes for an instruction to go through the pipe
  - latency = #stages x stage-delay
  - Dominant feature if there are lots of exceptions
- Throughput
  - Determined by the rate at which instructions can start/finish
  - Dominant feature if there are no exceptions

12 Benefit Example
- Unpipelined DLX
  - 5 steps take 50, 50, 60, 50, 50 ns respectively
  - Hence total instruction time = 260 ns
- Pipelined DLX
  - Looks like a 5-stage pipeline
  - But there are overheads in passing values: assume 5 ns added to the slowest stage for latches (registers)
    - Primarily due to set-up (copying) and hold times
  - Hence must run at slowest stage + overhead = 60 + 5 = 65 ns/stage
  - In steady state (no exceptions) an instruction is done every 65 ns
- Speedup = 260/65 = 4x improvement
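The arithmetic of this example generalizes to any list of stage delays. A minimal sketch (the function name and return shape are illustrative assumptions):

```python
def pipelined_speedup(stage_ns, latch_overhead_ns):
    """Return (unpipelined time, pipelined cycle time, speedup) in ns."""
    unpipelined = sum(stage_ns)                  # all steps back to back
    cycle = max(stage_ns) + latch_overhead_ns    # slowest stage plus latch overhead
    return unpipelined, cycle, unpipelined / cycle


if __name__ == "__main__":
    print(pipelined_speedup([50, 50, 60, 50, 50], 5))
```

With the numbers from the slide this gives 260 ns unpipelined, a 65 ns cycle, and a speedup of 4 rather than the ideal 5, because the stages are unbalanced and the latches add overhead.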

13 What is the Catch?
- Pipelines introduce some real hair of course
- Making every stage active all the time is hard
  - Concurrency ==> no shared resources between stages
- Resource conflicts may exist among stages
  - List the resource path of each stage
  - Look for conflicts: remember arbitrary instructions may be in the pipe at the same time
  - Duplicate resources so that each stage has its own private set
- For example
  - PC must be updated in every cycle, and can't use the same ALU in stages IF and EX
  - New instruction must be fetched: it can't compete with data memory accesses of other stages, i.e. note the conflict between the IF and MEM stages
  - Loads and stores overlap in their need for an MDR, hence two are needed: LMDR and SMDR
  - Latches are used to carry data and control information from one pipeline stage to the next, e.g., PC1, IR1, etc.
- What about memory bandwidth?
  - Memory latency hasn't changed but bandwidth demand is up to 5x for a 5-stage pipe
  - Harvard architecture (separate I & D memories) and an instruction buffer can help
  - Multi-level caches: more on this later

14 What is the Catch (cont.)?
- Life is pretty good when pipe members are independent
  - Sadly this is seldom true, e.g.
      R5 <-- R2 + R3   (i)
      R3 <-- R5 + R6   (i+1)
  - write back of R5 doesn't happen until stage 5 for inst. i
  - too bad it's needed in stage 2 of inst. i+1
  - OOPS, problem: called a pipeline hazard
- Stalling the pipeline may be needed
  - The pipeline is "stalled" until the conflict is resolved
  - This needs logic + more control
  - A deeper pipeline stalls more because of more stages
- What limits the depth of a pipeline?
  - pipeline latency, latch overhead, and clock skew

15 Pipeline Hazards
- Structural hazards
  - Caused by resource contention
    - Single-port register file: conflict with multiple stage needs
    - Memory fetch: may need one in both IF and MEM stages
  - Possible to avoid by adding resources
    - May be too costly
- Data hazards
  - Instruction needs an unavailable result from a previous stage
  - Can be mitigated somewhat by a smart compiler
- Control hazards
  - When the PC doesn't get just incremented
  - Branches and Jumps
  - Interrupts

16 Simple DLX Pipeline Activities
From an instruction class perspective

Table 1: Pipeline activity per stage, by instruction class
Stage | ALU Instruction | Load or Store Instruction | Branch Instruction
IF    | IR <-- MEM[PC]; PC <-- PC+4 | IR <-- MEM[PC]; PC <-- PC+4 | IR <-- MEM[PC]; PC <-- PC+4
ID    | A <-- Rs1; B <-- Rs2; PC1 <-- PC; IR1 <-- IR | A <-- Rs1; B <-- Rs2; PC1 <-- PC; IR1 <-- IR | A <-- Rs1; B <-- Rs2; PC1 <-- PC; IR1 <-- IR
EX    | out <-- A op B or out <-- A op ((IR1_16)**16 ## IR1_16..31) | DMAR <-- A + ((IR1_16)**16 ## IR1_16..31); SMDR <-- B {if it is a store} | out <-- PC1 + ((IR1_16)**16 ## IR1_16..31); cond <-- (Rs1 op 0)
MEM   | out1 <-- out | LMDR <-- Mem[DMAR] or Mem[DMAR] <-- SMDR | if (cond) PC <-- out
WB    | Rd <-- out1 | Rd <-- LMDR {if it is a load} | (nothing)

- Note potential stage holes, e.g. stages where nothing much happens (such as WB for a branch)
- Note: instruction decode is the same for every DLX instruction, so IF and ID do the same thing for all instructions

17 Hazards cause Stalls
- Two obvious policy choices
  - How about just stalling all stages?
    - OK, but the problem is usually adjacent stage conflicts
    - Hence nothing moves and the stall condition never clears
    - Cheapest option but sadly it doesn't work
  - Stalling instructions causing conflicts
    - Stages preceding the stalled stage can't go anywhere
    - Later stages are allowed to complete
    - Works even better in dynamic pipeline configurations
- Hairier still
  - But not every stage is always necessary, hence stage skipping is a possibility
  - Controlling this possibility is the problem since control complexity goes up

18 Pipe from a Resource View
[Figure: ld, inst1, inst2, and inst3 flowing through IF, ID, EX, MEM, WB in successive cycles; each instruction uses Mem, Reg, ALU, Mem, Reg in turn]
Structural hazard??

19 Stalls Create Bubble - Remove Hazard
[Figure: the same resource view with a row of five bubbles inserted in place of one instruction slot, which delays the following instruction by one cycle]
No conflict if inst1 is not a load

20 Calculating Stall Effects
Pipeline Speedup = (Average instruction time without pipelining) / (Average instruction time with pipelining)
Pipeline Speedup = (unpiped cycle time / piped cycle time) x (unpiped CPI / piped CPI)
Ideal CPI = unpiped CPI / Pipeline Depth
Therefore
Pipeline Speedup = (unpiped cycle time / piped cycle time) x (Ideal CPI x Pipeline Depth) / piped CPI

21 However Calculating Further
piped CPI = Ideal CPI + Pipeline stall cycles per instruction
          = 1 + average stalls per instruction
Then, if perfect balance, no overhead, and cycle times equal:
Speedup = Unpiped CPI / (1 + Pipeline stall cycles per instruction)
Unpiped CPI = pipeline depth, so
Speedup = Pipeline Depth / (1 + Pipeline stall cycles per instruction)
Similar derivations in terms of clock cycle time are possible
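The speedup relation is easy to evaluate numerically. The sketch below assumes the general form, speedup = (unpiped cycle time / piped cycle time) x (Ideal CPI x Pipeline Depth) / piped CPI with piped CPI = Ideal CPI + stalls per instruction; the function and parameter names are illustrative only.

```python
def speedup_with_stalls(depth, stalls_per_instr, ideal_cpi=1.0, cycle_ratio=1.0):
    """Pipeline speedup given the average stall cycles per instruction.

    cycle_ratio is unpiped cycle time / piped cycle time; 1.0 means the
    cycle times are equal (perfect balance, no latch overhead).
    """
    piped_cpi = ideal_cpi + stalls_per_instr
    return cycle_ratio * (ideal_cpi * depth) / piped_cpi


if __name__ == "__main__":
    print(speedup_with_stalls(5, 0.0))    # no stalls: speedup equals the depth
    print(speedup_with_stalls(5, 0.25))   # one stall every four instructions
```

With no stalls the speedup equals the pipeline depth; an average of 0.25 stall cycles per instruction already cuts a 5-stage pipeline from 5x to 4x.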

22 However, the no-overhead pipeline model is unrealistic
- Beware changes in cycle time
  - all those latches do take up time
  - hence the cycle time tends to go up and the latency per instruction also tends to increase
  - cycle time changes, along with clock skew, limit the pipeline depth
- Beware the effect of stalls
  - the deeper the pipe, the more replicated resources are needed OR the higher the chances for stalls to occur
- Example on p. 144 of the textbook
  - A machine in which 40% of instructions incur a one-cycle load structural hazard, but with a clock rate 1.05 times faster
  - The machine is still (1 + 0.4*1) * 1/1.05 = 1.3 times slower
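The textbook example compares average instruction times, each being CPI x cycle time. A short sketch of that comparison, assuming an ideal CPI of 1 for both machines and normalizing the slower clock's cycle time to 1.0 (the function name is illustrative):

```python
def avg_instr_time(ideal_cpi, hazard_fraction, stall_cycles, cycle_time):
    """Average time per instruction = CPI x cycle time."""
    cpi = ideal_cpi + hazard_fraction * stall_cycles
    return cpi * cycle_time


# Machine without the structural hazard, cycle time normalized to 1.0
no_hazard = avg_instr_time(1.0, 0.0, 0, 1.0)
# Machine with the hazard on 40% of instructions, clock 1.05 times faster
with_hazard = avg_instr_time(1.0, 0.4, 1, 1.0 / 1.05)

slowdown = with_hazard / no_hazard

if __name__ == "__main__":
    print(round(slowdown, 2))
```

The faster clock does not make up for the extra 0.4 cycles per instruction, so the machine with the hazard comes out about a third slower.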

23 Pipeline Complexities
- Pipeline overlaps instruction execution
  - hence there is some phase dependence
  - often called structural, data, and control hazards
- Consider the following code sequence

    ADD  R1, R2, R3
    SUB  R4, R5, R1
    AND  R6, R1, R7
    OR   R8, R1, R9
    XOR  R10, R1, R11

  - R1 gets produced in the first instruction and used in every subsequent instruction (a maximal illustration)
  - And these are all reg-reg OPs: it could be worse
    - jumps, long latency memory ops, exceptions, ...

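24 Consider the Data Dependencies
[Figure: pipeline diagram against the clock, stages IM, RS, ALU, DM, RD, for the sequence below, with arrows from the result of ADD to each later use of R1]

    ADD  R1, R2, R3
    SUB  R4, R1, R5
    AND  R6, R1, R7
    OR   R8, R1, R9
    XOR  R10, R1, R11

- Note: dependencies are probable but not always, due to exceptions
- Figure annotations: one of the later dependences can be fixed by splitting the register-file cycle; the last one is always OK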

25 Forwarding
- Also called bypassing, shorting, short-circuiting
- Key is to keep the ALU result around
  - ADD produces the R1 value at the ALU output
  - SUB needs it again at the ALU input
- Therefore
  - The ALU result from EX/MEM is always fed back to the ALU input latches
  - The forwarding hardware can detect whether the forwarded ALU output corresponds to a source for the current ALU operation
    - If yes, use the forwarded result as the source
    - Otherwise use the value provided by the register file
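The selection rule described on this slide amounts to one comparison per ALU input. A minimal sketch, assuming the EX/MEM latch is modeled as a `(dest_reg, value)` pair (a hypothetical representation) and that register 0 always reads as zero, as in DLX, so it is never forwarded:

```python
def select_operand(src_reg, regfile_value, ex_mem):
    """Choose the ALU input for src_reg.

    ex_mem is None when the previous instruction writes no register,
    otherwise a (dest_reg, value) pair held in the EX/MEM latch.
    """
    if ex_mem is not None:
        dest_reg, value = ex_mem
        # Forward only when the previous result targets this source register
        if dest_reg == src_reg and src_reg != 0:
            return value
    # Otherwise use the value read from the register file
    return regfile_value


if __name__ == "__main__":
    # ADD R1,R2,R3 has just produced 42; SUB reads R1 (stale value 7) and R5
    print(select_operand(1, 7, (1, 42)))   # forwarded
    print(select_operand(5, 9, (1, 42)))   # from the register file
```

In hardware this is a comparator feeding a multiplexer select line on each ALU input rather than a function call.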

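26 Result with Forwarding
[Figure: pipeline diagram against the clock, stages IM, RS, ALU, DM, RD, for the sequence below, with forwarding paths drawn in place of the earlier dependence arrows]

    ADD  R1, R2, R3
    SUB  R4, R1, R5
    AND  R6, R1, R7
    OR   R8, R1, R9
    XOR  R10, R1, R11

- Figure annotations: "This shows data forwarding model"; "Now OK due to split cycle"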

27 Generalizing Data Forwarding
- Previous example was ALU-centric
- Consider the following

    ADD  R1, R2, R3
    LW   R4, 0(R1)
    SW   12(R1), R4

- In this case
  - LW's ALU needs a forwarded value of R1 from ADD's ALU to do the EFA calculation
  - SW's DM needs a forwarded value of R4 from LW's DM to perform the store operation
  - It illustrates a non-ALU forwarding possibility

28 Data Hazard Forms
- i occurs before j in program execution order
- RAW: read after write
  - j reads before i writes, hence j gets the wrong old value
  - most common form of data hazard problem
  - as we've seen, forwarding can solve this one
- WAW: write after write
  - j writes before i writes, leaving an incorrect value
  - occurs if we allow writes in more than one pipeline stage
  - can this happen in DLX (integer pipeline)? No. Why?
    - it writes registers only in WB and writes memory only in MEM
    - it does not allow out-of-order completion
  - However, things can get a lot worse, and will when we look at the DLX FP pipeline, which allows writes in different stages and allows out-of-order completion due to varying pipeline lengths

29 The Other Hazard Possibilities
- WAR: write after read (i then j is the intended order)
  - j writes before i reads: i ends up reading the incorrect new value
  - A problem in DLX?
    - Not really, since writing happens late in the pipe and reads are early
    - However there are other machines in the world that take a different view
    - This was a problem with autoincrement addressing in the VAX, for example
- RAR: read after read
  - the other possibility, but not a hazard
- A smart compiler can always generate NOPs
  - every instruction followed in the worst case by 4 NOPs
  - sort of ridiculous from a performance viewpoint
- Can forwarding always work? (no stalls)
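The three hazard forms can be told apart from the register names alone. The sketch below classifies the dependences between an earlier instruction i and a later instruction j; representing an instruction as a `(dest, sources)` pair and the name `classify` are assumptions made for illustration. Whether a dependence actually becomes a hazard still depends on the pipeline, as the slides note for WAW and WAR in the DLX integer pipe.

```python
def classify(i, j):
    """Return the set of dependence kinds between instruction i and a later j.

    Each instruction is (dest, sources); dest is None if nothing is written.
    """
    i_dest, i_srcs = i
    j_dest, j_srcs = j
    kinds = set()
    if i_dest is not None and i_dest in j_srcs:
        kinds.add("RAW")    # j reads what i writes
    if i_dest is not None and i_dest == j_dest:
        kinds.add("WAW")    # both write the same register
    if j_dest is not None and j_dest in i_srcs:
        kinds.add("WAR")    # j writes what i reads
    return kinds


if __name__ == "__main__":
    i = ("R5", ["R2", "R3"])    # R5 <-- R2 + R3
    j = ("R3", ["R5", "R6"])    # R3 <-- R5 + R6
    print(sorted(classify(i, j)))
```

For the pair R5 <-- R2 + R3 followed by R3 <-- R5 + R6, the result is a RAW dependence on R5 and a WAR dependence on R3; only the RAW one matters in the DLX integer pipeline.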

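30 When Forwarding Fails
[Figure: pipeline diagram against the clock, stages IM, RS, ALU, DM, RD, for the sequence below; the arrow from the end of LW's DM stage to the start of SUB's ALU stage points backward in time]

    LW   R1, 0(R2)
    SUB  R4, R1, R5
    AND  R6, R1, R7
    OR   R8, R1, R9

- Forwarding in negative time!!
- Where to insert NOP?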

31 Stalls
- Some latencies can't be absorbed
  - the case on the previous slide
- Some latencies are unknown
  - any time you access memory, for example
  - caches are built to keep up, but what happens on a miss?
- Stalls are the result
  - Need pipeline interlock circuits to ensure correct execution
    - they detect a hazard and introduce bubbles until the hazard clears
  - CPI for stalled instructions will be increased by the number of bubbles
- In this case data forwarding cannot work directly
  - Bubbles cause the forwarding paths to change

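32 Bubbles and new Forwarding Paths
[Figure: pipeline diagram against the clock for the sequence below, with one bubble inserted after LW so that SUB, AND, and OR each start their remaining stages one cycle later]

    LW   R1, 0(R2)
    SUB  R4, R1, R5
    AND  R6, R1, R7
    OR   R8, R1, R9

- Figure annotation: "Now no forwarding even required"
- Note: the situation is similar even for A = B + C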

33 Handling Stalls: Hardware vs. Software
- Hardware: pipeline interlocks
  - Must detect when required data cannot be provided
  - stall stages to create a bubble
- Software: pipeline or instruction scheduling
  - A = B + C; D = E - F

    LW   R1, B
    LW   R2, C
    ADD  R3, R1, R2
    SW   A, R3
    LW   R4, E
    LW   R5, F
    SUB  R6, R4, R5
    SW   D, R6

  - Figure annotation: "Forwarding required but no stall"
- Issue: when an inst. leaves the ID stage and goes to EX
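The benefit of instruction scheduling can be checked by counting load-use stalls before and after reordering. This is a sketch under stated assumptions: a 5-stage pipe with full forwarding, where the only stall is one bubble when an instruction uses the result of the load immediately before it. The `(op, dest, sources)` tuples and the particular `scheduled` ordering are illustrative choices, not taken from the slide.

```python
def load_use_stalls(program):
    """Count one-cycle stalls caused by using a load result in the next instruction."""
    stalls = 0
    for prev, cur in zip(program, program[1:]):
        prev_op, prev_dest, _ = prev
        _, _, cur_srcs = cur
        if prev_op == "LW" and prev_dest in cur_srcs:
            stalls += 1
    return stalls


# A = B + C; D = E - F in straightforward order
unscheduled = [
    ("LW", "R1", ["B"]),
    ("LW", "R2", ["C"]),
    ("ADD", "R3", ["R1", "R2"]),
    ("SW", None, ["R3"]),
    ("LW", "R4", ["E"]),
    ("LW", "R5", ["F"]),
    ("SUB", "R6", ["R4", "R5"]),
    ("SW", None, ["R6"]),
]

# One possible reordering that keeps every load away from its first use
scheduled = [
    ("LW", "R1", ["B"]),
    ("LW", "R2", ["C"]),
    ("LW", "R4", ["E"]),
    ("ADD", "R3", ["R1", "R2"]),
    ("LW", "R5", ["F"]),
    ("SW", None, ["R3"]),
    ("SUB", "R6", ["R4", "R5"]),
    ("SW", None, ["R6"]),
]

if __name__ == "__main__":
    print(load_use_stalls(unscheduled), load_use_stalls(scheduled))
```

Under this model the straightforward order stalls twice (ADD after the load of R2, SUB after the load of R5), while the reordered version needs forwarding but never stalls.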

34 Pipeline Scheduling
- Downside
  - Uses more registers
  - Becomes a major problem in multi-issue machines
- How well does it work with only basic block (straight-line code) scope restrictions?

% of loads causing stalls
Program   Unscheduled   Scheduled
TeX       65%           25%
Spice     42%           14%
GCC       54%           31%
(indicates that the GCC app is harder to schedule)

35 Pipeline Control
- Can be implemented by an early check in ID
- Data hazard detection can be performed by
  - Comparing the destination and sources of adjacent instructions
  - See Figure 3.17 (next slide)
- ID stage now becomes a major player
  - It can determine if a new instruction should be issued to EX
  - It generates stall(s) if forwarding won't work
  - Stalls can be implemented by injecting NOPs into EX
  - It thus provides effective interlock control
  - Early detection policy simplifies hardware

36 Situations Requiring Detection

Table 1: Situations the hazard detection hardware must distinguish
Situation | Sample Code | Action Required
No dependence | LW R1,45(R2); ADD R5,R6,R7; SUB R8,R6,R7; OR R9,R6,R7 | No hazard since R1 doesn't show up in the next 3 instructions
Dependence requiring stall | LW R1,45(R2); ADD R5,R1,R7; SUB R8,R6,R7; OR R9,R6,R7 | Comparators must detect the use of R1 in ADD and stall, since LW can only produce the value after cycle 4 and ADD needs it after cycle 2: hence no way
Dependence that can be handled with forwarding | LW R1,45(R2); ADD R5,R6,R7; SUB R8,R1,R7; OR R9,R6,R7 | Comparators detect the R1 use in SUB and set up forwarding of the result to the ALU from the end of the DM stage
Dependence, but register file order splitting makes it a non-issue | LW R1,45(R2); ADD R5,R6,R7; SUB R8,R6,R7; OR R9,R1,R7 | R1 will be written in the first half of RD for LW and read in the second half of RS for OR. Hence no problem - be happy
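The four situations in the table depend only on how far the first use of the loaded register is from the load. A sketch of that decision, assuming each following instruction is given as its list of source registers (the function name and the action labels are illustrative):

```python
def load_action(load_dest, following_sources):
    """Action needed for the nearest use of load_dest in the next 3 instructions.

    following_sources[k] lists the source registers of the instruction
    k + 1 positions after the load.
    """
    actions = {1: "stall", 2: "forward", 3: "register-file split"}
    for distance, srcs in enumerate(following_sources[:3], start=1):
        if load_dest in srcs:
            return actions[distance]
    return "none"


if __name__ == "__main__":
    # LW R1,45(R2) followed by ADD, SUB, OR with R1 used in different places
    print(load_action("R1", [["R6", "R7"], ["R6", "R7"], ["R6", "R7"]]))
    print(load_action("R1", [["R1", "R7"], ["R6", "R7"], ["R6", "R7"]]))
    print(load_action("R1", [["R6", "R7"], ["R1", "R7"], ["R6", "R7"]]))
    print(load_action("R1", [["R6", "R7"], ["R6", "R7"], ["R1", "R7"]]))
```

The four calls correspond to the four rows of the table, in order. Real interlock logic makes the same decision with register-number comparators in the ID stage rather than a scan over later instructions.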

37 Load Interlock Comparators
- Must keep track of data and register source and destination names
- Allow instructions to be issued if forwarding works, otherwise stall
  - any place that can be forwarded to needs another mux: see Figure 3.20 for an example
  - any forward source will increase fanout
  - both negatively influence performance, but less than a stall
- Forwarding paths required
  - From ALUout or DMout
  - To ALUin (top or bottom), DMin, zero detection unit
- For DLX, 10 comparisons are performed to implement forwarding logic: see Figure 3.19


More information

Lecture 9. Pipeline Hazards. Christos Kozyrakis Stanford University

Lecture 9. Pipeline Hazards. Christos Kozyrakis Stanford University Lecture 9 Pipeline Hazards Christos Kozyrakis Stanford University http://eeclass.stanford.edu/ee18b 1 Announcements PA-1 is due today Electronic submission Lab2 is due on Tuesday 2/13 th Quiz1 grades will

More information

Very Simple MIPS Implementation

Very Simple MIPS Implementation 06 1 MIPS Pipelined Implementation 06 1 line: (In this set.) Unpipelined Implementation. (Diagram only.) Pipelined MIPS Implementations: Hardware, notation, hazards. Dependency Definitions. Hazards: Definitions,

More information

DLX Unpipelined Implementation

DLX Unpipelined Implementation LECTURE - 06 DLX Unpipelined Implementation Five cycles: IF, ID, EX, MEM, WB Branch and store instructions: 4 cycles only What is the CPI? F branch 0.12, F store 0.05 CPI0.1740.83550.174.83 Further reduction

More information

Chapter 4. The Processor

Chapter 4. The Processor Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified

More information

Lecture 05: Pipelining: Basic/ Intermediate Concepts and Implementation

Lecture 05: Pipelining: Basic/ Intermediate Concepts and Implementation Lecture 05: Pipelining: Basic/ Intermediate Concepts and Implementation CSE 564 Computer Architecture Summer 2017 Department of Computer Science and Engineering Yonghong Yan yan@oakland.edu www.secs.oakland.edu/~yan

More information

Computer System. Hiroaki Kobayashi 6/16/2010. Ver /16/2010 Computer Science 1

Computer System. Hiroaki Kobayashi 6/16/2010. Ver /16/2010 Computer Science 1 Computer System Hiroaki Kobayashi 6/16/2010 6/16/2010 Computer Science 1 Ver. 1.1 Agenda Basic model of modern computer systems Von Neumann Model Stored-program instructions and data are stored on memory

More information

Data Hazards Compiler Scheduling Pipeline scheduling or instruction scheduling: Compiler generates code to eliminate hazard

Data Hazards Compiler Scheduling Pipeline scheduling or instruction scheduling: Compiler generates code to eliminate hazard Data Hazards Compiler Scheduling Pipeline scheduling or instruction scheduling: Compiler generates code to eliminate hazard Consider: a = b + c; d = e - f; Assume loads have a latency of one clock cycle:

More information

Pipelining! Advanced Topics on Heterogeneous System Architectures. Politecnico di Milano! Seminar DEIB! 30 November, 2017!

Pipelining! Advanced Topics on Heterogeneous System Architectures. Politecnico di Milano! Seminar DEIB! 30 November, 2017! Advanced Topics on Heterogeneous System Architectures Pipelining! Politecnico di Milano! Seminar Room @ DEIB! 30 November, 2017! Antonio R. Miele! Marco D. Santambrogio! Politecnico di Milano! 2 Outline!

More information

PIPELINING: HAZARDS. Mahdi Nazm Bojnordi. CS/ECE 6810: Computer Architecture. Assistant Professor School of Computing University of Utah

PIPELINING: HAZARDS. Mahdi Nazm Bojnordi. CS/ECE 6810: Computer Architecture. Assistant Professor School of Computing University of Utah PIPELINING: HAZARDS Mahdi Nazm Bojnordi Assistant Professor School of Computing University of Utah CS/ECE 6810: Computer Architecture Overview Announcement Homework 1 submission deadline: Jan. 30 th This

More information

Very Simple MIPS Implementation

Very Simple MIPS Implementation 06 1 MIPS Pipelined Implementation 06 1 line: (In this set.) Unpipelined Implementation. (Diagram only.) Pipelined MIPS Implementations: Hardware, notation, hazards. Dependency Definitions. Hazards: Definitions,

More information

Modern Computer Architecture

Modern Computer Architecture Modern Computer Architecture Lecture2 Pipelining: Basic and Intermediate Concepts Hongbin Sun 国家集成电路人才培养基地 Xi an Jiaotong University Pipelining: Its Natural! Laundry Example Ann, Brian, Cathy, Dave each

More information

Instruction Level Parallelism. ILP, Loop level Parallelism Dependences, Hazards Speculation, Branch prediction

Instruction Level Parallelism. ILP, Loop level Parallelism Dependences, Hazards Speculation, Branch prediction Instruction Level Parallelism ILP, Loop level Parallelism Dependences, Hazards Speculation, Branch prediction Basic Block A straight line code sequence with no branches in except to the entry and no branches

More information

CISC 662 Graduate Computer Architecture Lecture 6 - Hazards

CISC 662 Graduate Computer Architecture Lecture 6 - Hazards CISC 662 Graduate Computer Architecture Lecture 6 - Hazards Michela Taufer http://www.cis.udel.edu/~taufer/teaching/cis662f07 Powerpoint Lecture Notes from John Hennessy and David Patterson s: Computer

More information

CAD for VLSI 2 Pro ject - Superscalar Processor Implementation

CAD for VLSI 2 Pro ject - Superscalar Processor Implementation CAD for VLSI 2 Pro ject - Superscalar Processor Implementation 1 Superscalar Processor Ob jective: The main objective is to implement a superscalar pipelined processor using Verilog HDL. This project may

More information

Chapter 4. The Processor

Chapter 4. The Processor Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified

More information

3/12/2014. Single Cycle (Review) CSE 2021: Computer Organization. Single Cycle with Jump. Multi-Cycle Implementation. Why Multi-Cycle?

3/12/2014. Single Cycle (Review) CSE 2021: Computer Organization. Single Cycle with Jump. Multi-Cycle Implementation. Why Multi-Cycle? CSE 2021: Computer Organization Single Cycle (Review) Lecture-10b CPU Design : Pipelining-1 Overview, Datapath and control Shakil M. Khan 2 Single Cycle with Jump Multi-Cycle Implementation Instruction:

More information

ELE 655 Microprocessor System Design

ELE 655 Microprocessor System Design ELE 655 Microprocessor System Design Section 2 Instruction Level Parallelism Class 1 Basic Pipeline Notes: Reg shows up two places but actually is the same register file Writes occur on the second half

More information

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle

More information

CSE 533: Advanced Computer Architectures. Pipelining. Instructor: Gürhan Küçük. Yeditepe University

CSE 533: Advanced Computer Architectures. Pipelining. Instructor: Gürhan Küçük. Yeditepe University CSE 533: Advanced Computer Architectures Pipelining Instructor: Gürhan Küçük Yeditepe University Lecture notes based on notes by Mark D. Hill and John P. Shen Updated by Mikko Lipasti Pipelining Forecast

More information

Computer Architectures. DLX ISA: Pipelined Implementation

Computer Architectures. DLX ISA: Pipelined Implementation Computer Architectures L ISA: Pipelined Implementation 1 The Pipelining Principle Pipelining is nowadays the main basic technique deployed to speed-up a CP. The key idea for pipelining is general, and

More information

Pipelining Analogy. Pipelined laundry: overlapping execution. Parallelism improves performance. Four loads: Non-stop: Speedup = 8/3.5 = 2.3.

Pipelining Analogy. Pipelined laundry: overlapping execution. Parallelism improves performance. Four loads: Non-stop: Speedup = 8/3.5 = 2.3. Pipelining Analogy Pipelined laundry: overlapping execution Parallelism improves performance Four loads: Speedup = 8/3.5 = 2.3 Non-stop: Speedup =2n/05n+15 2n/0.5n 1.5 4 = number of stages 4.5 An Overview

More information

The Processor Pipeline. Chapter 4, Patterson and Hennessy, 4ed. Section 5.3, 5.4: J P Hayes.

The Processor Pipeline. Chapter 4, Patterson and Hennessy, 4ed. Section 5.3, 5.4: J P Hayes. The Processor Pipeline Chapter 4, Patterson and Hennessy, 4ed. Section 5.3, 5.4: J P Hayes. Pipeline A Basic MIPS Implementation Memory-reference instructions Load Word (lw) and Store Word (sw) ALU instructions

More information

Multi-cycle Instructions in the Pipeline (Floating Point)

Multi-cycle Instructions in the Pipeline (Floating Point) Lecture 6 Multi-cycle Instructions in the Pipeline (Floating Point) Introduction to instruction level parallelism Recap: Support of multi-cycle instructions in a pipeline (App A.5) Recap: Superpipelining

More information

MIPS An ISA for Pipelining

MIPS An ISA for Pipelining Pipelining: Basic and Intermediate Concepts Slides by: Muhamed Mudawar CS 282 KAUST Spring 2010 Outline: MIPS An ISA for Pipelining 5 stage pipelining i Structural Hazards Data Hazards & Forwarding Branch

More information

COMPUTER ORGANIZATION AND DESIGN

COMPUTER ORGANIZATION AND DESIGN COMPUTER ORGANIZATION AND DESIGN 5 Edition th The Hardware/Software Interface Chapter 4 The Processor 4.1 Introduction Introduction CPU performance factors Instruction count CPI and Cycle time Determined

More information

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 4. The Processor

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 4. The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle

More information

Pipelined CPUs. Study Chapter 4 of Text. Where are the registers?

Pipelined CPUs. Study Chapter 4 of Text. Where are the registers? Pipelined CPUs Where are the registers? Study Chapter 4 of Text Second Quiz on Friday. Covers lectures 8-14. Open book, open note, no computers or calculators. L17 Pipelined CPU I 1 Review of CPU Performance

More information

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition The Processor - Introduction

More information

Advanced Computer Architecture

Advanced Computer Architecture Advanced Computer Architecture Chapter 1 Introduction into the Sequential and Pipeline Instruction Execution Martin Milata What is a Processors Architecture Instruction Set Architecture (ISA) Describes

More information

Chapter 4. Instruction Execution. Introduction. CPU Overview. Multiplexers. Chapter 4 The Processor 1. The Processor.

Chapter 4. Instruction Execution. Introduction. CPU Overview. Multiplexers. Chapter 4 The Processor 1. The Processor. COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor The Processor - Introduction

More information

Pipelining. Ideal speedup is number of stages in the pipeline. Do we achieve this? 2. Improve performance by increasing instruction throughput ...

Pipelining. Ideal speedup is number of stages in the pipeline. Do we achieve this? 2. Improve performance by increasing instruction throughput ... CHAPTER 6 1 Pipelining Instruction class Instruction memory ister read ALU Data memory ister write Total (in ps) Load word 200 100 200 200 100 800 Store word 200 100 200 200 700 R-format 200 100 200 100

More information

Unpipelined Machine. Pipelining the Idea. Pipelining Overview. Pipelined Machine. MIPS Unpipelined. Similar to assembly line in a factory

Unpipelined Machine. Pipelining the Idea. Pipelining Overview. Pipelined Machine. MIPS Unpipelined. Similar to assembly line in a factory Pipelining the Idea Similar to assembly line in a factory Divide instruction into smaller tasks Each task is performed on subset of resources Overlap the execution of multiple instructions by completing

More information

CO Computer Architecture and Programming Languages CAPL. Lecture 18 & 19

CO Computer Architecture and Programming Languages CAPL. Lecture 18 & 19 CO2-3224 Computer Architecture and Programming Languages CAPL Lecture 8 & 9 Dr. Kinga Lipskoch Fall 27 Single Cycle Disadvantages & Advantages Uses the clock cycle inefficiently the clock cycle must be

More information

Department of Computer and IT Engineering University of Kurdistan. Computer Architecture Pipelining. By: Dr. Alireza Abdollahpouri

Department of Computer and IT Engineering University of Kurdistan. Computer Architecture Pipelining. By: Dr. Alireza Abdollahpouri Department of Computer and IT Engineering University of Kurdistan Computer Architecture Pipelining By: Dr. Alireza Abdollahpouri Pipelined MIPS processor Any instruction set can be implemented in many

More information

Dynamic Control Hazard Avoidance

Dynamic Control Hazard Avoidance Dynamic Control Hazard Avoidance Consider Effects of Increasing the ILP Control dependencies rapidly become the limiting factor they tend to not get optimized by the compiler more instructions/sec ==>

More information

T = I x CPI x C. Both effective CPI and clock cycle C are heavily influenced by CPU design. CPI increased (3-5) bad Shorter cycle good

T = I x CPI x C. Both effective CPI and clock cycle C are heavily influenced by CPU design. CPI increased (3-5) bad Shorter cycle good CPU performance equation: T = I x CPI x C Both effective CPI and clock cycle C are heavily influenced by CPU design. For single-cycle CPU: CPI = 1 good Long cycle time bad On the other hand, for multi-cycle

More information

Processor Architecture. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Processor Architecture. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University Processor Architecture Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Moore s Law Gordon Moore @ Intel (1965) 2 Computer Architecture Trends (1)

More information

Pipelining: Basic Concepts

Pipelining: Basic Concepts Pipelining: Basic Concepts Prof. Cristina Silvano Dipartimento di Elettronica e Informazione Politecnico di ilano email: silvano@elet.polimi.it Outline Reduced Instruction Set of IPS Processor Implementation

More information

Readings. H+P Appendix A, Chapter 2.3 This will be partly review for those who took ECE 152

Readings. H+P Appendix A, Chapter 2.3 This will be partly review for those who took ECE 152 Readings H+P Appendix A, Chapter 2.3 This will be partly review for those who took ECE 152 Recent Research Paper The Optimal Logic Depth Per Pipeline Stage is 6 to 8 FO4 Inverter Delays, Hrishikesh et

More information

Pipelining. Principles of pipelining. Simple pipelining. Structural Hazards. Data Hazards. Control Hazards. Interrupts. Multicycle operations

Pipelining. Principles of pipelining. Simple pipelining. Structural Hazards. Data Hazards. Control Hazards. Interrupts. Multicycle operations Principles of pipelining Pipelining Simple pipelining Structural Hazards Data Hazards Control Hazards Interrupts Multicycle operations Pipeline clocking ECE D52 Lecture Notes: Chapter 3 1 Sequential Execution

More information

Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier

Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier Science Cases that affect instruction execution semantics

More information

L19 Pipelined CPU I 1. Where are the registers? Study Chapter 6 of Text. Pipelined CPUs. Comp 411 Fall /07/07

L19 Pipelined CPU I 1. Where are the registers? Study Chapter 6 of Text. Pipelined CPUs. Comp 411 Fall /07/07 Pipelined CPUs Where are the registers? Study Chapter 6 of Text L19 Pipelined CPU I 1 Review of CPU Performance MIPS = Millions of Instructions/Second MIPS = Freq CPI Freq = Clock Frequency, MHz CPI =

More information

Pipelined Processor Design

Pipelined Processor Design Pipelined Processor Design Pipelined Implementation: MIPS Virendra Singh Computer Design and Test Lab. Indian Institute of Science (IISc) Bangalore virendra@computer.org Advance Computer Architecture http://www.serc.iisc.ernet.in/~viren/courses/aca/aca.htm

More information

(Basic) Processor Pipeline

(Basic) Processor Pipeline (Basic) Processor Pipeline Nima Honarmand Generic Instruction Life Cycle Logical steps in processing an instruction: Instruction Fetch (IF_STEP) Instruction Decode (ID_STEP) Operand Fetch (OF_STEP) Might

More information

COMPUTER ORGANIZATION AND DESIGN

COMPUTER ORGANIZATION AND DESIGN ARM COMPUTER ORGANIZATION AND DESIGN Edition The Hardware/Software Interface Chapter 4 The Processor Modified and extended by R.J. Leduc - 2016 To understand this chapter, you will need to understand some

More information

Processor Architecture

Processor Architecture Processor Architecture Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu SSE2030: Introduction to Computer Systems, Spring 2018, Jinkyu Jeong (jinkyu@skku.edu)

More information

Pipeline Hazards. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Pipeline Hazards. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University Pipeline Hazards Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Hazards What are hazards? Situations that prevent starting the next instruction

More information

These actions may use different parts of the CPU. Pipelining is when the parts run simultaneously on different instructions.

These actions may use different parts of the CPU. Pipelining is when the parts run simultaneously on different instructions. MIPS Pipe Line 2 Introduction Pipelining To complete an instruction a computer needs to perform a number of actions. These actions may use different parts of the CPU. Pipelining is when the parts run simultaneously

More information

Lecture 2: Pipelining Basics. Today: chapter 1 wrap-up, basic pipelining implementation (Sections A.1 - A.4)

Lecture 2: Pipelining Basics. Today: chapter 1 wrap-up, basic pipelining implementation (Sections A.1 - A.4) Lecture 2: Pipelining Basics Today: chapter 1 wrap-up, basic pipelining implementation (Sections A.1 - A.4) 1 Defining Fault, Error, and Failure A fault produces a latent error; it becomes effective when

More information

Speeding Up DLX Computer Architecture Hadassah College Spring 2018 Speeding Up DLX Dr. Martin Land

Speeding Up DLX Computer Architecture Hadassah College Spring 2018 Speeding Up DLX Dr. Martin Land Speeding Up DLX 1 DLX Execution Stages Version 1 Clock Cycle 1 I 1 enters Instruction Fetch (IF) Clock Cycle2 I 1 moves to Instruction Decode (ID) Instruction Fetch (IF) holds state fixed Clock Cycle3

More information

ECE232: Hardware Organization and Design

ECE232: Hardware Organization and Design ECE232: Hardware Organization and Design Lecture 17: Pipelining Wrapup Adapted from Computer Organization and Design, Patterson & Hennessy, UCB Outline The textbook includes lots of information Focus on

More information

Suggested Readings! Recap: Pipelining improves throughput! Processor comparison! Lecture 17" Short Pipelining Review! ! Readings!

Suggested Readings! Recap: Pipelining improves throughput! Processor comparison! Lecture 17 Short Pipelining Review! ! Readings! 1! 2! Suggested Readings!! Readings!! H&P: Chapter 4.5-4.7!! (Over the next 3-4 lectures)! Lecture 17" Short Pipelining Review! 3! Processor components! Multicore processors and programming! Recap: Pipelining

More information

What do we have so far? Multi-Cycle Datapath (Textbook Version)

What do we have so far? Multi-Cycle Datapath (Textbook Version) What do we have so far? ulti-cycle Datapath (Textbook Version) CPI: R-Type = 4, Load = 5, Store 4, Branch = 3 Only one instruction being processed in datapath How to lower CPI further? #1 Lec # 8 Summer2001

More information

EE 457 Unit 6a. Basic Pipelining Techniques

EE 457 Unit 6a. Basic Pipelining Techniques EE 47 Unit 6a Basic Pipelining Techniques 2 Pipelining Introduction Consider a drink bottling plant Filling the bottle = 3 sec. Placing the cap = 3 sec. Labeling = 3 sec. Would you want Machine = Does

More information

Outline. Pipelining basics The Basic Pipeline for DLX & MIPS Pipeline hazards. Handling exceptions Multi-cycle operations

Outline. Pipelining basics The Basic Pipeline for DLX & MIPS Pipeline hazards. Handling exceptions Multi-cycle operations Pipelining 1 Outline Pipelining basics The Basic Pipeline for DLX & MIPS Pipeline hazards Structural Hazards Data Hazards Control Hazards Handling exceptions Multi-cycle operations 2 Pipelining basics

More information

Lecture Topics. Announcements. Today: Data and Control Hazards (P&H ) Next: continued. Exam #1 returned. Milestone #5 (due 2/27)

Lecture Topics. Announcements. Today: Data and Control Hazards (P&H ) Next: continued. Exam #1 returned. Milestone #5 (due 2/27) Lecture Topics Today: Data and Control Hazards (P&H 4.7-4.8) Next: continued 1 Announcements Exam #1 returned Milestone #5 (due 2/27) Milestone #6 (due 3/13) 2 1 Review: Pipelined Implementations Pipelining

More information

Full Datapath. Chapter 4 The Processor 2

Full Datapath. Chapter 4 The Processor 2 Pipelining Full Datapath Chapter 4 The Processor 2 Datapath With Control Chapter 4 The Processor 3 Performance Issues Longest delay determines clock period Critical path: load instruction Instruction memory

More information