CPU Pipelining Issues

Size: px
Start display at page:

Download "CPU Pipelining Issues"

Transcription

1 Spring 25 3/24 Lecture page 1 CPU Pipelining Issues What have you been beating your head against? This pipe stuff makes my head hurt! L17 Pipeline Issues 1 <31:29>:J<25:0>: T REG IRREG 4-Stage minimips pipeline hazards R1 W Z R2 SEL 1 0 SEL T IR FN IRMEM Y MEM MEM dr R/W ata WSEL 3 SEL ack W WE W (N: SME RF S OVE!) L17 Pipeline Issues 2

2 Spring 25 3/24 Lecture page 2 ypass Paths IR RF R1 R2 IR ypass muxes Y IR W Y W W WE L17 Pipeline Issues 3 Structural ata Hazard Consider LOS: Can we fix all these problems using our previous bypass paths? lw $t4,0($t1) add $t5,$t1,$t4 xor $t6,$t3,$t4 IF RF W i i1 i2 i3 i i5 i6 L17 Pipeline Issues 4

3 Spring 25 3/24 Lecture page 3 <31:29>:J<25:0>: T REG IRREG 4-Stage minimips R1 W Z R2 SEL 1 0 SEL T IR FN IRMEM Y MEM MEM dr R/W ata WSEL 3 SEL ack W WE W (N: SME RF S OVE!) L17 Pipeline Issues 5 Structural ata Hazard Consider LOS: Can we fix all these problems using our previous bypass paths? lw $t4,0($t1) add $t5,$t1,$t4 xor $t6,$t3,$t4 IF RF W i i1 i2 i3 i i5 i6 For a lw instruction fetched during clock i, data isn t returned from memory until late into cycle i3. ypassing will fix xor but not add! L17 Pipeline Issues 6

4 Spring 25 3/24 Lecture page 4 Load elays ypassing can t fix the problem with add since the data simply isn t available! In order to fix it we have to add pipeline interlock hardware to stall add s execution, or else program around it. lw $t4,0($t1) add $t5,$t1,$t4 xor $t6,$t3,$t4 IF RF W i i1 i2 i3 i i5 i6 xor lw add add xor dding stalls to the pipeline in order to assure proper operation is sometimes called inserting pipeline ULES This requires inserting a MUX just before the instruction register of the stage, IR, to annul the add (by inserting a NOP) as well as, clock enables on the and IR pipeline registers of earlier pipeline stages to stall the execution without annuling any instructions. This is how the simulator, SPIM works. L17 Pipeline Issues 7 Punting on Load Interlock Early versions of MIPS did not include a pipeline interlock, thus, requiring the compiler/programmer to work around it. lw $t4,0($t1) nop add $t5,$t1,$t4 xor $t6,$t3,$t4 IF RF W i i1 i2 i3 i i5 i6 If compiler knows about load delay, it can often rearrange the code sequence to eliminate the hazard. Many compilers can provide implementation-specific instruction scheduling. This requires no additional H/W, but it further confuses the instruction semantics. We ll include interlocks for SPIM compatibility. L17 Pipeline Issues 8

5 Spring 25 3/24 Lecture page 5 Load elays (cont d) ut, but, what about FSTER processors? FCT: Processors have been become very fast relative to memories! Can we just stall the pipe longer? dd more NOPs? LTERNTIVE: Longer pipelines. 1. dd MEMORY WIT stages between STRT of read operation & return of data. 2. uild pipelined memories, so that multiple (say, N) memory transactions can be in progress at once. 3. (Optional). Stall pipeline when the N limit is exceeded. 4-Stage pipeline requires RE access in LESS than one clock. Sadly, this is our bottleneck. If we want to go faster will have to surround it with pipeline stages SOLUTION: 5-Stage pipeline that allows nearly two clocks for data memory accesses... L17 Pipeline Issues 9 <31:29>:J<25:0>: T REG IRREG 5-Stage minimips Omits some details NO bypass or interlock logic R1 W Z R2 T IR FN SEL 1 0 SEL MEM IRMEM YMEM MEM ddress is available right after instruction enters stage ack IRW YW W WSEL W WE W 3 SEL dr R/W ata almost 2 clock cycles ata is needed just before rising clock edge at end of ack stage L17 Pipeline Issues 10

6 Spring 25 3/24 Lecture page 6 One More Fly in the Ointment There is one more structural hazard that we have not discussed. That is, the saving, and subsequent accesses, of the return address resulting from the jump-and-link, jal, instruction. Moreover, given that we have bought into a single delay-slot, which is always executed, we now need to store the address of the instruction FOLLOWING the delay slot instruction. We need to return here, to 8, not. Once more we need to rewrite the IS spec! jal sqr # call procedure addi $a0,$0,10 # delay slot addi $t0,$v0,-1 # return address L17 Pipeline Issues 11 Return ddress s IF RF MEM W # The code: ssume Reg[LP] 1... add $ra,$0,$0 jal f addi $t1,$ra,-4 # In delay slot... f: xor $t0,$ra,$0 or $r1,$0,$ra add $t2,$0,$ra Can Can we we make make the the regfile regfileaccesses of of i i1 i2 i3 i i5 i6 the the 3 instructions instructions following following the the jal jal add jal addi xor or add work work by by bypassing? bypassing? add jal addi xor or add Where Where do do we we get get add jal addi xor or the the right right return return address add jal addi xor address from? from? add jal addi R ecision Time I reads R writes L17 Pipeline Issues 12

7 Spring 25 3/24 Lecture page 7 <31:29>:J<25:0>: T addi $9,$31,-4 jal f REG IRREG, ypass FN R1 W Z T IR R2 SEL 1 0 SEL JL ypasses On JLs, the register file saves the next address from the ELY SLOT instruction (often 8). Note this bypass is routed from the pipeline not from the output. Thus, we need to add bypass paths for REG. add $ra,$0,$0 MEM IRMEM YMEM MEM IRW W YW dr R/W ata WSEL 3 SEL ack W WE W L17 Pipeline Issues 13 <31:29>:J<25:0>: T REG IRREG JL ypasses We need another bypass. xor $8,$31,$0, ypass addi $9,$31,-4 FN R1 W Z T IR R2 SEL 1 0 SEL In this case, the bypass path supplies the $31 operand for the XOR instruction. jal f add $ra,$0,$0 ack MEM IRMEM YMEM MEM IRW YW W WSEL W WE W 3 SEL dr R/W ata L17 Pipeline Issues 14

8 Spring 25 3/24 Lecture page 8 <31:29>:J<25:0>: T or $1,$0,$31 xor $8,$31,$0 REG IRREG addi $9,$31,-4, ypass jal f FN MEM IRMEM YMEM MEM R1 IRW W YW W Z T IR WSEL 3 SEL R2 SEL 1 0 SEL JL ypasses dr R/W ata nd, we need another MEM bypass. In this case, the bypass path supplies the $31 operand for the OR instruction. W is already taken care of, for the following, using the W stage bypass at the output of the SEL mux. ack W WE W L17 Pipeline Issues 15 <31:29>:J<25:0>: T REG IRREG NOP FN MEM IRMEM YMEM MEM R1 IRW YW W W Z T IR WSEL 3 5-Stage minimips We wanted a simple, clean pipeline but SEL R2 SEL 1 0 SEL dr R/W ata broke the sequential semantics of IS by adding a branch delay-slot and early branch resolution logic added / bypass muxes to get data before it s written to regfile added CLK EN to freeze IF/RF stages so we can wait for lw to reach W stage ack W WE W L17 Pipeline Issues 16

9 Spring 25 3/24 Lecture page 9 ypass MUX etails The previous diagram was oversimplified. Really need for the bypass muxes to precede the and muxes to provide the correct values for the jump target (), write data, and early branch decision logic. from /MEM/W/ from /MEM/W/ ypass shamt 16 (imm) ypass SEL Z 1 0 SEL To To To Mem L17 Pipeline Issues 17 ypass Logic minimips bypass logic (need two copies for / data): rs or rt rt or rd (W)* Regfile (no bypass) W bypass rt or rd (MEM)* MEM bypass rt or rd ()* bypass 0 5-bit compare Select 0 * If instruction is a sw (doesn t write into regfile), set rt for /MEM/W to $0 L17 Pipeline Issues 18

10 Spring 25 3/24 Lecture page 10 <31:29>:J<25:0>: T REG IRREG NOP Final 5-Stage minimips FN MEM IRMEM YMEM MEM R1 W T IR Z R2 SEL 1 0 SEL dded branch delay slot and early branch resolution logic to fix a CONTROL hazard dded lots of bypass paths and detection logic to fix various STRUCTURL hazards dded pipeline interlocks to fix load delay STRUCTURL hazard IRW W YW dr R/W ata WSEL 3 SEL ack W WE W L17 Pipeline Issues 19 Pipeline Summary (I) Started with unpipelined implementation direct execute, 1 cycle/instruction it had a long cycle time: mem regs alu mem wb We ended up with a 5-stage pipelined implementation increase throughput (3x???) delayed branch decision (1 cycle) Choose to execute instruction after branch delayed register writeback (3 cycles) dd bypass paths (6 x 2 12) to forward correct value memory data available only in W stage Introduce NOPs at IR, to stall IF and RF stages until L result was ready L17 Pipeline Issues 20

11 Spring 25 3/24 Lecture page 11 Pipeline Summary (II) Fallacy #1: Pipelining is easy Smart people get it wrong all of the time! Costs? Re-spins of the design. Force S/W folks to devise program/compiler workarounds. Fallacy #2: Pipelining is independent of IS Many IS decisions impact how easy/costly it is to implement pipelining (i.e. branch semantics, addressing modes). ad decisions impact future implementations. (delay slot vs. annul?, load interlocks?) and break otherwise clean semantics. For performance, S/W must be aware! Fallacy #3: Increasing Pipeline stages improves performance iminishing returns. Increasing complexity. Can introduce unusable delay slots, long interlock stalls. Power4: 7; Power5: 23 Pentium: 5; Pentium/MMX:6; PentiumIII: 10; Pentium4: 20, 31 L17 Pipeline Issues 21 RISC Simplicity??? The P.T. arnum World s Tallest warf Competition World s Most Complex RISC? VLIWs, Super-Scalars? ddressing features, eg index registers Pipelines, ypasses, nnulment,,... Primitive Machines with direct implementations Generalization of registers and operand coding RISCs Complex instructions, addressing modes L17 Pipeline Issues 22

CPU Pipelining Issues

CPU Pipelining Issues CPU Pipelining Issues What have you been beating your head against? This pipe stuff makes my head hurt! Finishing up Chapter 6 L20 Pipeline Issues 1 Structural Data Hazard Consider LOADS: Can we fix all

More information

More CPU Pipelining Issues

More CPU Pipelining Issues More CPU Pipelining Issues What have you been beating your head against? This pipe stuff makes my head hurt! Important Stuff: Study Session for Problem Set 5 tomorrow night (11/11) 5:30-9:00pm Study Session

More information

Pipeline Issues. This pipeline stuff makes my head hurt! Maybe it s that dumb hat. L17 Pipeline Issues Fall /5/02

Pipeline Issues. This pipeline stuff makes my head hurt! Maybe it s that dumb hat. L17 Pipeline Issues Fall /5/02 Pipeline Issues This pipeline stuff makes my head hurt! Maybe it s that dumb hat L17 Pipeline Issues 1 Recalling Data Hazards PROLEM: Subsequent instructions can reference the contents of a register well

More information

L19 Pipelined CPU I 1. Where are the registers? Study Chapter 6 of Text. Pipelined CPUs. Comp 411 Fall /07/07

L19 Pipelined CPU I 1. Where are the registers? Study Chapter 6 of Text. Pipelined CPUs. Comp 411 Fall /07/07 Pipelined CPUs Where are the registers? Study Chapter 6 of Text L19 Pipelined CPU I 1 Review of CPU Performance MIPS = Millions of Instructions/Second MIPS = Freq CPI Freq = Clock Frequency, MHz CPI =

More information

Pipelined CPUs. Study Chapter 4 of Text. Where are the registers?

Pipelined CPUs. Study Chapter 4 of Text. Where are the registers? Pipelined CPUs Where are the registers? Study Chapter 4 of Text Second Quiz on Friday. Covers lectures 8-14. Open book, open note, no computers or calculators. L17 Pipelined CPU I 1 Review of CPU Performance

More information

6.004 Computation Structures Spring 2009

6.004 Computation Structures Spring 2009 MIT OpenCourseWare http://ocw.mit.edu 6.4 Computation Structures Spring 29 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms. Pipelining the eta bet ta ('be-t&)

More information

CPU Pipelining Issues

CPU Pipelining Issues CPU Pipelining Issues What have you been beating your head against? This pipe stuff makes my head hurt! L17 Pipeline Issues & Memory 1 Pipelining Improve performance by increasing instruction throughput

More information

Pipelining the Beta. I don t think they mean the fish...

Pipelining the Beta. I don t think they mean the fish... Pipelining the Beta bet ta ('be-t&) n. ny of various species of small, brightly colored, long-finned freshwater fishes of the genus Betta, found in southeast sia. be ta ( b-t&, be-) n. 1. The second letter

More information

CS 61C: Great Ideas in Computer Architecture Pipelining and Hazards

CS 61C: Great Ideas in Computer Architecture Pipelining and Hazards CS 61C: Great Ideas in Computer Architecture Pipelining and Hazards Instructors: Vladimir Stojanovic and Nicholas Weaver http://inst.eecs.berkeley.edu/~cs61c/sp16 1 Pipelined Execution Representation Time

More information

Data Hazards Compiler Scheduling Pipeline scheduling or instruction scheduling: Compiler generates code to eliminate hazard

Data Hazards Compiler Scheduling Pipeline scheduling or instruction scheduling: Compiler generates code to eliminate hazard Data Hazards Compiler Scheduling Pipeline scheduling or instruction scheduling: Compiler generates code to eliminate hazard Consider: a = b + c; d = e - f; Assume loads have a latency of one clock cycle:

More information

CS 110 Computer Architecture. Pipelining. Guest Lecture: Shu Yin. School of Information Science and Technology SIST

CS 110 Computer Architecture. Pipelining. Guest Lecture: Shu Yin.   School of Information Science and Technology SIST CS 110 Computer Architecture Pipelining Guest Lecture: Shu Yin http://shtech.org/courses/ca/ School of Information Science and Technology SIST ShanghaiTech University Slides based on UC Berkley's CS61C

More information

CS61C : Machine Structures

CS61C : Machine Structures inst.eecs.berkeley.edu/~cs61c/su05 CS61C : Machine Structures Lecture #19: Pipelining II 2005-07-21 Andy Carle CS 61C L19 Pipelining II (1) Review: Datapath for MIPS PC instruction memory rd rs rt registers

More information

Chapter 4 The Processor 1. Chapter 4B. The Processor

Chapter 4 The Processor 1. Chapter 4B. The Processor Chapter 4 The Processor 1 Chapter 4B The Processor Chapter 4 The Processor 2 Control Hazards Branch determines flow of control Fetching next instruction depends on branch outcome Pipeline can t always

More information

Pipelining concepts The DLX architecture A simple DLX pipeline Pipeline Hazards and Solution to overcome

Pipelining concepts The DLX architecture A simple DLX pipeline Pipeline Hazards and Solution to overcome Pipeline Thoai Nam Outline Pipelining concepts The DLX architecture A simple DLX pipeline Pipeline Hazards and Solution to overcome Reference: Computer Architecture: A Quantitative Approach, John L Hennessy

More information

ECE/CS 552: Pipeline Hazards

ECE/CS 552: Pipeline Hazards ECE/CS 552: Pipeline Hazards Prof. Mikko Lipasti Lecture notes based in part on slides created by Mark Hill, David Wood, Guri Sohi, John Shen and Jim Smith Pipeline Hazards Forecast Program Dependences

More information

Pipelining concepts The DLX architecture A simple DLX pipeline Pipeline Hazards and Solution to overcome

Pipelining concepts The DLX architecture A simple DLX pipeline Pipeline Hazards and Solution to overcome Thoai Nam Pipelining concepts The DLX architecture A simple DLX pipeline Pipeline Hazards and Solution to overcome Reference: Computer Architecture: A Quantitative Approach, John L Hennessy & David a Patterson,

More information

Instruction Level Parallelism. Appendix C and Chapter 3, HP5e

Instruction Level Parallelism. Appendix C and Chapter 3, HP5e Instruction Level Parallelism Appendix C and Chapter 3, HP5e Outline Pipelining, Hazards Branch prediction Static and Dynamic Scheduling Speculation Compiler techniques, VLIW Limits of ILP. Implementation

More information

Pipelining. Maurizio Palesi

Pipelining. Maurizio Palesi * Pipelining * Adapted from David A. Patterson s CS252 lecture slides, http://www.cs.berkeley/~pattrsn/252s98/index.html Copyright 1998 UCB 1 References John L. Hennessy and David A. Patterson, Computer

More information

Very Simple MIPS Implementation

Very Simple MIPS Implementation 06 1 MIPS Pipelined Implementation 06 1 line: (In this set.) Unpipelined Implementation. (Diagram only.) Pipelined MIPS Implementations: Hardware, notation, hazards. Dependency Definitions. Hazards: Definitions,

More information

Lecture 3: The Processor (Chapter 4 of textbook) Chapter 4.1

Lecture 3: The Processor (Chapter 4 of textbook) Chapter 4.1 Lecture 3: The Processor (Chapter 4 of textbook) Chapter 4.1 Introduction Chapter 4.1 Chapter 4.2 Review: MIPS (RISC) Design Principles Simplicity favors regularity fixed size instructions small number

More information

Computer Architecture. Lecture 6.1: Fundamentals of

Computer Architecture. Lecture 6.1: Fundamentals of CS3350B Computer Architecture Winter 2015 Lecture 6.1: Fundamentals of Instructional Level Parallelism Marc Moreno Maza www.csd.uwo.ca/courses/cs3350b [Adapted from lectures on Computer Organization and

More information

EECS 151/251A Fall 2017 Digital Design and Integrated Circuits. Instructor: John Wawrzynek and Nicholas Weaver. Lecture 13 EE141

EECS 151/251A Fall 2017 Digital Design and Integrated Circuits. Instructor: John Wawrzynek and Nicholas Weaver. Lecture 13 EE141 EECS 151/251A Fall 2017 Digital Design and Integrated Circuits Instructor: John Wawrzynek and Nicholas Weaver Lecture 13 Project Introduction You will design and optimize a RISC-V processor Phase 1: Design

More information

EECS Digital Design

EECS Digital Design EECS 150 -- Digital Design Lecture 11-- Processor Pipelining 2010-2-23 John Wawrzynek Today s lecture by John Lazzaro www-inst.eecs.berkeley.edu/~cs150 1 Today: Pipelining How to apply the performance

More information

CS 152 Computer Architecture and Engineering Lecture 4 Pipelining

CS 152 Computer Architecture and Engineering Lecture 4 Pipelining CS 152 Computer rchitecture and Engineering Lecture 4 Pipelining 2014-1-30 John Lazzaro (not a prof - John is always OK) T: Eric Love www-inst.eecs.berkeley.edu/~cs152/ Play: 1 otorola 68000 Next week

More information

Instruction Pipelining

Instruction Pipelining Instruction Pipelining Simplest form is a 3-stage linear pipeline New instruction fetched each clock cycle Instruction finished each clock cycle Maximal speedup = 3 achieved if and only if all pipe stages

More information

Instruction Pipelining

Instruction Pipelining Instruction Pipelining Simplest form is a 3-stage linear pipeline New instruction fetched each clock cycle Instruction finished each clock cycle Maximal speedup = 3 achieved if and only if all pipe stages

More information

Very Simple MIPS Implementation

Very Simple MIPS Implementation 06 1 MIPS Pipelined Implementation 06 1 line: (In this set.) Unpipelined Implementation. (Diagram only.) Pipelined MIPS Implementations: Hardware, notation, hazards. Dependency Definitions. Hazards: Definitions,

More information

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 4. The Processor

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 4. The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle

More information

Lecture 7 Pipelining. Peng Liu.

Lecture 7 Pipelining. Peng Liu. Lecture 7 Pipelining Peng Liu liupeng@zju.edu.cn 1 Review: The Single Cycle Processor 2 Review: Given Datapath,RTL -> Control Instruction Inst Memory Adr Op Fun Rt

More information

CS3350B Computer Architecture Quiz 3 March 15, 2018

CS3350B Computer Architecture Quiz 3 March 15, 2018 CS3350B Computer Architecture Quiz 3 March 15, 2018 Student ID number: Student Last Name: Question 1.1 1.2 1.3 2.1 2.2 2.3 Total Marks The quiz consists of two exercises. The expected duration is 30 minutes.

More information

Advanced Parallel Architecture Lessons 5 and 6. Annalisa Massini /2017

Advanced Parallel Architecture Lessons 5 and 6. Annalisa Massini /2017 Advanced Parallel Architecture Lessons 5 and 6 Annalisa Massini - Pipelining Hennessy, Patterson Computer architecture A quantitive approach Appendix C Sections C.1, C.2 Pipelining Pipelining is an implementation

More information

EITF20: Computer Architecture Part2.2.1: Pipeline-1

EITF20: Computer Architecture Part2.2.1: Pipeline-1 EITF20: Computer Architecture Part2.2.1: Pipeline-1 Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Pipelining Harzards Structural hazards Data hazards Control hazards Implementation issues Multi-cycle

More information

Implementing a MIPS Processor. Readings:

Implementing a MIPS Processor. Readings: Implementing a MIPS Processor Readings: 4.1-4.11 1 Goals for this Class Unrstand how CPUs run programs How do we express the computation the CPU? How does the CPU execute it? How does the CPU support other

More information

ECEC 355: Pipelining

ECEC 355: Pipelining ECEC 355: Pipelining November 8, 2007 What is Pipelining Pipelining is an implementation technique whereby multiple instructions are overlapped in execution. A pipeline is similar in concept to an assembly

More information

CS 61C: Great Ideas in Computer Architecture. Lecture 13: Pipelining. Krste Asanović & Randy Katz

CS 61C: Great Ideas in Computer Architecture. Lecture 13: Pipelining. Krste Asanović & Randy Katz CS 61C: Great Ideas in Computer Architecture Lecture 13: Pipelining Krste Asanović & Randy Katz http://inst.eecs.berkeley.edu/~cs61c/fa17 RISC-V Pipeline Pipeline Control Hazards Structural Data R-type

More information

Lecture 9. Pipeline Hazards. Christos Kozyrakis Stanford University

Lecture 9. Pipeline Hazards. Christos Kozyrakis Stanford University Lecture 9 Pipeline Hazards Christos Kozyrakis Stanford University http://eeclass.stanford.edu/ee18b 1 Announcements PA-1 is due today Electronic submission Lab2 is due on Tuesday 2/13 th Quiz1 grades will

More information

Computer Architecture and Engineering. CS152 Quiz #1. February 19th, Professor Krste Asanovic. Name:

Computer Architecture and Engineering. CS152 Quiz #1. February 19th, Professor Krste Asanovic. Name: Computer Architecture and Engineering CS152 Quiz #1 February 19th, 2008 Professor Krste Asanovic Name: Notes: This is a closed book, closed notes exam. 80 Minutes 10 Pages Not all questions are of equal

More information

EITF20: Computer Architecture Part2.2.1: Pipeline-1

EITF20: Computer Architecture Part2.2.1: Pipeline-1 EITF20: Computer Architecture Part2.2.1: Pipeline-1 Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Pipelining Harzards Structural hazards Data hazards Control hazards Implementation issues Multi-cycle

More information

Pipeline design. Mehran Rezaei

Pipeline design. Mehran Rezaei Pipeline design Mehran Rezaei How Can We Improve the Performance? Exec Time = IC * CPI * CCT Optimization IC CPI CCT Source Level * Compiler * * ISA * * Organization * * Technology * With Pipelining We

More information

ECE 486/586. Computer Architecture. Lecture # 7

ECE 486/586. Computer Architecture. Lecture # 7 ECE 486/586 Computer Architecture Lecture # 7 Spring 2015 Portland State University Lecture Topics Instruction Set Principles Instruction Encoding Role of Compilers The MIPS Architecture Reference: Appendix

More information

1 Hazards COMP2611 Fall 2015 Pipelined Processor

1 Hazards COMP2611 Fall 2015 Pipelined Processor 1 Hazards Dependences in Programs 2 Data dependence Example: lw $1, 200($2) add $3, $4, $1 add can t do ID (i.e., read register $1) until lw updates $1 Control dependence Example: bne $1, $2, target add

More information

Page 1. Pipelining: Its Natural! Chapter 3. Pipelining. Pipelined Laundry Start work ASAP. Sequential Laundry A B C D. 6 PM Midnight

Page 1. Pipelining: Its Natural! Chapter 3. Pipelining. Pipelined Laundry Start work ASAP. Sequential Laundry A B C D. 6 PM Midnight Pipelining: Its Natural! Chapter 3 Pipelining Laundry Example Ann, Brian, Cathy, Dave each have one load of clothes to wash, dry, and fold Washer takes 30 minutes A B C D Dryer takes 40 minutes Folder

More information

Introduction to Pipelining. Silvina Hanono Wachman Computer Science & Artificial Intelligence Lab M.I.T.

Introduction to Pipelining. Silvina Hanono Wachman Computer Science & Artificial Intelligence Lab M.I.T. Introduction to Pipelining Silvina Hanono Wachman Computer Science & Artificial Intelligence Lab M.I.T. L15-1 Performance Measures Two metrics of interest when designing a system: 1. Latency: The delay

More information

Chapter 4. The Processor

Chapter 4. The Processor Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified

More information

Computer Organization MIPS Architecture. Department of Computer Science Missouri University of Science & Technology

Computer Organization MIPS Architecture. Department of Computer Science Missouri University of Science & Technology Computer Organization MIPS Architecture Department of Computer Science Missouri University of Science & Technology hurson@mst.edu Computer Organization Note, this unit will be covered in three lectures.

More information

Computer Architecture ELEC3441

Computer Architecture ELEC3441 Computer Architecture ELEC3441 RISC vs CISC Iron Law CPUTime = # of instruction program # of cycle instruction cycle Lecture 5 Pipelining Dr. Hayden Kwok-Hay So Department of Electrical and Electronic

More information

ECE232: Hardware Organization and Design

ECE232: Hardware Organization and Design ECE232: Hardware Organization and Design Lecture 17: Pipelining Wrapup Adapted from Computer Organization and Design, Patterson & Hennessy, UCB Outline The textbook includes lots of information Focus on

More information

Chapter 4. The Processor

Chapter 4. The Processor Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified

More information

Prof. Hakim Weatherspoon CS 3410, Spring 2015 Computer Science Cornell University. See P&H Chapter:

Prof. Hakim Weatherspoon CS 3410, Spring 2015 Computer Science Cornell University. See P&H Chapter: Prof. Hakim Weatherspoon CS 3410, Spring 2015 Computer Science Cornell University See P&H Chapter: 4.6 4.8 Prelim next week Tuesday at 7:30. Go to location based on netid [a g]* MRS146: Morrison Hall 146

More information

CS 152, Spring 2011 Section 2

CS 152, Spring 2011 Section 2 CS 152, Spring 2011 Section 2 Christopher Celio University of California, Berkeley About Me Christopher Celio celio @ eecs Office Hours: Tuesday 1-2pm, 751 Soda Agenda Q&A on HW1, Lab 1 Pipelining Questions

More information

Pipelining. Principles of pipelining. Simple pipelining. Structural Hazards. Data Hazards. Control Hazards. Interrupts. Multicycle operations

Pipelining. Principles of pipelining. Simple pipelining. Structural Hazards. Data Hazards. Control Hazards. Interrupts. Multicycle operations Principles of pipelining Pipelining Simple pipelining Structural Hazards Data Hazards Control Hazards Interrupts Multicycle operations Pipeline clocking ECE D52 Lecture Notes: Chapter 3 1 Sequential Execution

More information

Computer Architecture

Computer Architecture Lecture 3: Pipelining Iakovos Mavroidis Computer Science Department University of Crete 1 Previous Lecture Measurements and metrics : Performance, Cost, Dependability, Power Guidelines and principles in

More information

Computer Architecture Computer Science & Engineering. Chapter 4. The Processor BK TP.HCM

Computer Architecture Computer Science & Engineering. Chapter 4. The Processor BK TP.HCM Computer Architecture Computer Science & Engineering Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware

More information

Determined by ISA and compiler. We will examine two MIPS implementations. A simplified version A more realistic pipelined version

Determined by ISA and compiler. We will examine two MIPS implementations. A simplified version A more realistic pipelined version MIPS Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified

More information

3/12/2014. Single Cycle (Review) CSE 2021: Computer Organization. Single Cycle with Jump. Multi-Cycle Implementation. Why Multi-Cycle?

3/12/2014. Single Cycle (Review) CSE 2021: Computer Organization. Single Cycle with Jump. Multi-Cycle Implementation. Why Multi-Cycle? CSE 2021: Computer Organization Single Cycle (Review) Lecture-10b CPU Design : Pipelining-1 Overview, Datapath and control Shakil M. Khan 2 Single Cycle with Jump Multi-Cycle Implementation Instruction:

More information

LECTURE 3: THE PROCESSOR

LECTURE 3: THE PROCESSOR LECTURE 3: THE PROCESSOR Abridged version of Patterson & Hennessy (2013):Ch.4 Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU

More information

EECS150 - Digital Design Lecture 10- CPU Microarchitecture. Processor Microarchitecture Introduction

EECS150 - Digital Design Lecture 10- CPU Microarchitecture. Processor Microarchitecture Introduction EECS150 - Digital Design Lecture 10- CPU Microarchitecture Feb 18, 2010 John Wawrzynek Spring 2010 EECS150 - Lec10-cpu Page 1 Processor Microarchitecture Introduction Microarchitecture: how to implement

More information

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle

More information

Chapter 4. The Processor

Chapter 4. The Processor Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified

More information

Department of Computer and IT Engineering University of Kurdistan. Computer Architecture Pipelining. By: Dr. Alireza Abdollahpouri

Department of Computer and IT Engineering University of Kurdistan. Computer Architecture Pipelining. By: Dr. Alireza Abdollahpouri Department of Computer and IT Engineering University of Kurdistan Computer Architecture Pipelining By: Dr. Alireza Abdollahpouri Pipelined MIPS processor Any instruction set can be implemented in many

More information

Lecture Topics. Announcements. Today: Single-Cycle Processors (P&H ) Next: continued. Milestone #3 (due 2/9) Milestone #4 (due 2/23)

Lecture Topics. Announcements. Today: Single-Cycle Processors (P&H ) Next: continued. Milestone #3 (due 2/9) Milestone #4 (due 2/23) Lecture Topics Today: Single-Cycle Processors (P&H 4.1-4.4) Next: continued 1 Announcements Milestone #3 (due 2/9) Milestone #4 (due 2/23) Exam #1 (Wednesday, 2/15) 2 1 Exam #1 Wednesday, 2/15 (3:00-4:20

More information

Computer Hardware Engineering

Computer Hardware Engineering Computer Hardware Engineering IS2, spring 27 Lecture 9: LU and s ssociate Professor, KTH Royal Institute of Technology Slides version. 2 Course Structure Module : C and ssembly Programming LE LE2 LE EX

More information

CS3350B Computer Architecture Winter 2015

CS3350B Computer Architecture Winter 2015 CS3350B Computer Architecture Winter 2015 Lecture 5.5: Single-Cycle CPU Datapath Design Marc Moreno Maza www.csd.uwo.ca/courses/cs3350b [Adapted from lectures on Computer Organization and Design, Patterson

More information

Appendix C. Abdullah Muzahid CS 5513

Appendix C. Abdullah Muzahid CS 5513 Appendix C Abdullah Muzahid CS 5513 1 A "Typical" RISC ISA 32-bit fixed format instruction (3 formats) 32 32-bit GPR (R0 contains zero) Single address mode for load/store: base + displacement no indirection

More information

Learning Outcomes. Spiral 3-3. Sorting: Software Implementation REVIEW

Learning Outcomes. Spiral 3-3. Sorting: Software Implementation REVIEW 3-3. Learning Outcomes 3-3. Spiral 3-3 Single Cycle CPU I understand how the single-cycle CPU datapath supports each type of instruction I understand why each mux is needed to select appropriate inputs

More information

Lecture Topics. Announcements. Today: Data and Control Hazards (P&H ) Next: continued. Exam #1 returned. Milestone #5 (due 2/27)

Lecture Topics. Announcements. Today: Data and Control Hazards (P&H ) Next: continued. Exam #1 returned. Milestone #5 (due 2/27) Lecture Topics Today: Data and Control Hazards (P&H 4.7-4.8) Next: continued 1 Announcements Exam #1 returned Milestone #5 (due 2/27) Milestone #6 (due 3/13) 2 1 Review: Pipelined Implementations Pipelining

More information

CS61C : Machine Structures

CS61C : Machine Structures inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture #19 Designing a Single-Cycle CPU 27-7-26 Scott Beamer Instructor AI Focuses on Poker CS61C L19 CPU Design : Designing a Single-Cycle CPU

More information

Chapter 4. The Processor

Chapter 4. The Processor Chapter 4 The Processor 1 Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A

More information

Pipelined Processors. Ideal Pipelining. Example: FP Multiplier. 55:132/22C:160 Spring Jon Kuhl 1

Pipelined Processors. Ideal Pipelining. Example: FP Multiplier. 55:132/22C:160 Spring Jon Kuhl 1 55:3/C:60 Spring 00 Pipelined Design Motivation: Increase processor throughput with modest increase in hardware. Bandwidth or Throughput = Performance Pipelined Processors Chapter Bandwidth (BW) = no.

More information

COMPUTER ORGANIZATION AND DESIGN

COMPUTER ORGANIZATION AND DESIGN COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle

More information

( ) תשס"ח סמסטר ב' May, 2008 Hugo Guterman Web site:

( ) תשסח סמסטר ב' May, 2008 Hugo Guterman Web site: ארכיטקטורת יחידת עיבוד מרכזית (36113741) תשס"ח סמסטר ב' May, 2008 Hugo Guterman (hugo@ee.bgu.ac.il) Web site: http://www.ee.bgu.ac.il/~cpuarch Arch. CPU L5 Pipeline II 1 Outline More pipelining Control

More information

CS 251, Winter 2018, Assignment % of course mark

CS 251, Winter 2018, Assignment % of course mark CS 251, Winter 2018, Assignment 5.0.4 3% of course mark Due Wednesday, March 21st, 4:30PM Lates accepted until 10:00am March 22nd with a 15% penalty 1. (10 points) The code sequence below executes on a

More information

The Processor Pipeline. Chapter 4, Patterson and Hennessy, 4ed. Section 5.3, 5.4: J P Hayes.

The Processor Pipeline. Chapter 4, Patterson and Hennessy, 4ed. Section 5.3, 5.4: J P Hayes. The Processor Pipeline Chapter 4, Patterson and Hennessy, 4ed. Section 5.3, 5.4: J P Hayes. Pipeline A Basic MIPS Implementation Memory-reference instructions Load Word (lw) and Store Word (sw) ALU instructions

More information

EXAM #1. CS 2410 Graduate Computer Architecture. Spring 2016, MW 11:00 AM 12:15 PM

EXAM #1. CS 2410 Graduate Computer Architecture. Spring 2016, MW 11:00 AM 12:15 PM EXAM #1 CS 2410 Graduate Computer Architecture Spring 2016, MW 11:00 AM 12:15 PM Directions: This exam is closed book. Put all materials under your desk, including cell phones, smart phones, smart watches,

More information

Pipelining Analogy. Pipelined laundry: overlapping execution. Parallelism improves performance. Four loads: Non-stop: Speedup = 8/3.5 = 2.3.

Pipelining Analogy. Pipelined laundry: overlapping execution. Parallelism improves performance. Four loads: Non-stop: Speedup = 8/3.5 = 2.3. Pipelining Analogy Pipelined laundry: overlapping execution Parallelism improves performance Four loads: Speedup = 8/3.5 = 2.3 Non-stop: Speedup =2n/05n+15 2n/0.5n 1.5 4 = number of stages 4.5 An Overview

More information

Pipelining. Each step does a small fraction of the job All steps ideally operate concurrently

Pipelining. Each step does a small fraction of the job All steps ideally operate concurrently Pipelining Computational assembly line Each step does a small fraction of the job All steps ideally operate concurrently A form of vertical concurrency Stage/segment - responsible for 1 step 1 machine

More information

Chapter 4. The Processor

Chapter 4. The Processor Chapter 4 The Processor 4.1 Introduction Introduction CPU performance factors Instruction count CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations Determined by ISA

More information

COSC 6385 Computer Architecture - Pipelining

COSC 6385 Computer Architecture - Pipelining COSC 6385 Computer Architecture - Pipelining Fall 2006 Some of the slides are based on a lecture by David Culler, Instruction Set Architecture Relevant features for distinguishing ISA s Internal storage

More information

CSE 378 Midterm 2/12/10 Sample Solution

CSE 378 Midterm 2/12/10 Sample Solution Question 1. (6 points) (a) Rewrite the instruction sub $v0,$t8,$a2 using absolute register numbers instead of symbolic names (i.e., if the instruction contained $at, you would rewrite that as $1.) sub

More information

Lecture 8: Data Hazard and Resolution. James C. Hoe Department of ECE Carnegie Mellon University

Lecture 8: Data Hazard and Resolution. James C. Hoe Department of ECE Carnegie Mellon University 18 447 Lecture 8: Data Hazard and Resolution James C. Hoe Department of ECE Carnegie ellon University 18 447 S18 L08 S1, James C. Hoe, CU/ECE/CALC, 2018 Your goal today Housekeeping detect and resolve

More information

EITF20: Computer Architecture Part2.2.1: Pipeline-1

EITF20: Computer Architecture Part2.2.1: Pipeline-1 EITF20: Computer Architecture Part2.2.1: Pipeline-1 Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Pipelining Harzards Structural hazards Data hazards Control hazards Implementation issues Multi-cycle

More information

Pipelining is Hazardous!

Pipelining is Hazardous! Pipelining is Hazardous! Hazards are situations where pipelining does not work as elegantly as we would like Three kinds Structural hazards -- we have run out of a hardware resource. Data hazards -- an

More information

CPS104 Computer Organization and Programming Lecture 19: Pipelining. Robert Wagner

CPS104 Computer Organization and Programming Lecture 19: Pipelining. Robert Wagner CPS104 Computer Organization and Programming Lecture 19: Pipelining Robert Wagner cps 104 Pipelining..1 RW Fall 2000 Lecture Overview A Pipelined Processor : Introduction to the concept of pipelined processor.

More information

COMPUTER ORGANIZATION AND DESIGN

COMPUTER ORGANIZATION AND DESIGN COMPUTER ORGANIZATION AND DESIGN 5 Edition th The Hardware/Software Interface Chapter 4 The Processor 4.1 Introduction Introduction CPU performance factors Instruction count CPI and Cycle time Determined

More information

Real Processors. Lecture for CPSC 5155 Edward Bosworth, Ph.D. Computer Science Department Columbus State University

Real Processors. Lecture for CPSC 5155 Edward Bosworth, Ph.D. Computer Science Department Columbus State University Real Processors Lecture for CPSC 5155 Edward Bosworth, Ph.D. Computer Science Department Columbus State University Instruction-Level Parallelism (ILP) Pipelining: executing multiple instructions in parallel

More information

Pipelining! Advanced Topics on Heterogeneous System Architectures. Politecnico di Milano! Seminar DEIB! 30 November, 2017!

Pipelining! Advanced Topics on Heterogeneous System Architectures. Politecnico di Milano! Seminar DEIB! 30 November, 2017! Advanced Topics on Heterogeneous System Architectures Pipelining! Politecnico di Milano! Seminar Room @ DEIB! 30 November, 2017! Antonio R. Miele! Marco D. Santambrogio! Politecnico di Milano! 2 Outline!

More information

What is Pipelining? Time per instruction on unpipelined machine Number of pipe stages

What is Pipelining? Time per instruction on unpipelined machine Number of pipe stages What is Pipelining? Is a key implementation techniques used to make fast CPUs Is an implementation techniques whereby multiple instructions are overlapped in execution It takes advantage of parallelism

More information

EIE/ENE 334 Microprocessors

EIE/ENE 334 Microprocessors EIE/ENE 334 Microprocessors Lecture 6: The Processor Week #06/07 : Dejwoot KHAWPARISUTH Adapted from Computer Organization and Design, 4 th Edition, Patterson & Hennessy, 2009, Elsevier (MK) http://webstaff.kmutt.ac.th/~dejwoot.kha/

More information

Computer Architecture Spring 2016

Computer Architecture Spring 2016 Computer Architecture Spring 2016 Lecture 02: Introduction II Shuai Wang Department of Computer Science and Technology Nanjing University Pipeline Hazards Major hurdle to pipelining: hazards prevent the

More information

ECE/CS 250 Computer Architecture. Fall 2017

ECE/CS 250 Computer Architecture. Fall 2017 ECE/CS 250 Computer rchitecture Fall 2017 Pipelining Tyler letsch Duke University Includes material adapted from Dan Sorin (Duke) and mir Roth (Penn). This Unit: Pipelining pplication S Compiler Firmware

More information

Concocting an Instruction Set

Concocting an Instruction Set Concocting an Instruction Set Nerd Chef at work. move flour,bowl add milk,bowl add egg,bowl move bowl,mixer rotate mixer... Read: Chapter 2.1-2.7 L03 Instruction Set 1 A General-Purpose Computer The von

More information

CSE Lecture 13/14 In Class Handout For all of these problems: HAS NOT CANNOT Add Add Add must wait until $5 written by previous add;

CSE Lecture 13/14 In Class Handout For all of these problems: HAS NOT CANNOT Add Add Add must wait until $5 written by previous add; CSE 30321 Lecture 13/14 In Class Handout For the sequence of instructions shown below, show how they would progress through the pipeline. For all of these problems: - Stalls are indicated by placing the

More information

Orange Coast College. Business Division. Computer Science Department. CS 116- Computer Architecture. Pipelining

Orange Coast College. Business Division. Computer Science Department. CS 116- Computer Architecture. Pipelining Orange Coast College Business Division Computer Science Department CS 116- Computer Architecture Pipelining Recall Pipelining is parallelizing execution Key to speedups in processors Split instruction

More information

Lecture 2: Processor and Pipelining 1

Lecture 2: Processor and Pipelining 1 The Simple BIG Picture! Chapter 3 Additional Slides The Processor and Pipelining CENG 6332 2 Datapath vs Control Datapath signals Control Points Controller Datapath: Storage, FU, interconnect sufficient

More information

Pipeline Architecture RISC

Pipeline Architecture RISC Pipeline Architecture RISC Independent tasks with independent hardware serial No repetitions during the process pipelined Pipelined vs Serial Processing Instruction Machine Cycle Every instruction must

More information

CS 251, Winter 2019, Assignment % of course mark

CS 251, Winter 2019, Assignment % of course mark CS 251, Winter 2019, Assignment 5.1.1 3% of course mark Due Wednesday, March 27th, 5:30PM Lates accepted until 1:00pm March 28th with a 15% penalty 1. (10 points) The code sequence below executes on a

More information

Appendix C: Pipelining: Basic and Intermediate Concepts

Appendix C: Pipelining: Basic and Intermediate Concepts Appendix C: Pipelining: Basic and Intermediate Concepts Key ideas and simple pipeline (Section C.1) Hazards (Sections C.2 and C.3) Structural hazards Data hazards Control hazards Exceptions (Section C.4)

More information

EI338: Computer Systems and Engineering (Computer Architecture & Operating Systems)

EI338: Computer Systems and Engineering (Computer Architecture & Operating Systems) EI338: Computer Systems and Engineering (Computer Architecture & Operating Systems) Chentao Wu 吴晨涛 Associate Professor Dept. of Computer Science and Engineering Shanghai Jiao Tong University SEIEE Building

More information

Computer Architecture EE 4720 Midterm Examination

Computer Architecture EE 4720 Midterm Examination Name Solution Computer Architecture EE 4720 Midterm Examination Friday, 20 March 2015, 9:30 10:20 CT Alias The Implementation Game Problem 1 Problem 2 Problem 3 Problem 4 Problem 5 Problem 6 Exam Total

More information

Thomas Polzer Institut für Technische Informatik

Thomas Polzer Institut für Technische Informatik Thomas Polzer tpolzer@ecs.tuwien.ac.at Institut für Technische Informatik Pipelined laundry: overlapping execution Parallelism improves performance Four loads: Speedup = 8/3.5 = 2.3 Non-stop: Speedup =

More information