Pipelining: Hazards Ver. Jan 14, 2014

Size: px
Start display at page:

Download "Pipelining: Hazards Ver. Jan 14, 2014"

Transcription

1 POLITECNICO DI MILANO Parallelism in wonderland: are you ready to see how deep the rabbit hole goes? Pipelining: Hazards Ver. Jan 14, 2014 Marco D. Santambrogio: Simone Campanoni:

2 2 Back to the Intro to the Pipeline

3 Sequential vs. Pipelining Execution I1 I2 IF ID EX MEM IF ID EX MEM 10 ns 10 ns I1 IF ID EX MEM Time I2 2 ns IF ID EX MEM I3 2 ns IF ID EX MEM I4 2 ns IF ID EX MEM I5 2 ns IF ID EX MEM 3

4 Implementation of MIPS pipeline M U X ID Instruction Decode EX Execution MEM Memory Access Write Back +4 Adder IF /ID ID/E X Adder EX/MEM ME M/ WR 2- bit Left Shifter PC Read Address Instruction Ins truc tion Memory [25-21] [20-16] M [15-11] U X Register Read 1 Register Read 2 Register Write Write Data RF Content register 1 Content register 2 M U X OP ALU Zero Result Read Address Write Address Write Data WR RD Read Data Data Memory M U X IF Instruction Fetch [15-0] 16 bit Sign extension 32 bit 4

5 Note: Optimized Pipeline! Register File used in 2 stages: Read access during ID and write access during! What happens if read and write refer to the same register in the same clock cycle?! It is necessary to insert one stall From now on,! Optimized Pipeline: the RF read occurs in the second half of clock cycle and the RF write in the first half of clock cycle this is the Pipeline! What happens if read and write refer to the same register in the same clock cycle? we are going to use! It is not necessary to insert one stall 5

6 Note: Optimized Pipeline! Register File used in 2 stages: Read access during ID and write access during! What happens if read and write refer to the same register in the same clock cycle?! It is necessary to insert one stall! Optimized Pipeline: the RF read occurs in the second half of clock cycle and the RF write in the first half of clock cycle! What happens if read and write refer to the same register in the same clock cycle?! It is not necessary to insert one stall 6

7 Today 7

8 Outline! The Problem of Pipeline Hazards! Performance Issues in Pipelining 8

9 The Problem of Hazards! A hazard is created whenever there is a dependence between instructions, and instructions are close enough that the overlap caused by pipelining would change the order of access to the operands involved in the dependence.! Hazards prevent the next instruction in the pipeline from executing during its designated clock cycle.! Hazards reduce the performance from the ideal speedup gained by pipelining. 9

10 Three Classes of Hazards! Structural Hazards: Attempt to use the same resource from different instructions simultaneously! Example: Single memory for instructions and data! Data Hazards: Attempt to use a result before it is ready! Example: Instruction depending on a result of a previous instruction still in the pipeline! Control Hazards: Attempt to make a decision on the next instruction to execute before the condition is evaluated! Example: Conditional branch execution 10

11 Structural Hazards! No structural hazards in MIPS architecture:! Instruction Memory separated from Data Memory! Register File used in the same clock cycle: Read access by an instruction and write access by another instruction I1 I2 I3 I4 I5 IM REG A L DM REG U 2 ns IM REG A L DM REG U 2 ns IM REG A L DM REG U 2 ns IM REG A L DM REG U 2 ns Time IM REG A L DM REG U 11

12 Speed Up Equation for Pipelining CPI pipelined = Ideal CPI + Average Stall cycles per Inst Ideal CPI Pipeline depth Speedup = Ideal CPI + Pipeline stall CPI Cycle Cycle Time Time unpipelined pipelined For simple RISC pipeline, CPI = 1: Pipeline depth Speedup = 1 + Pipeline stall CPI Cycle Cycle Time Time unpipelined pipelined 12

13 Data Hazards! If the instructions executed in the pipeline are dependent, data hazards can arise when instructions are too close! Example: sub $2, $1, $3 # Reg. $2 written by sub and $12, $2, $5 # 1 operand ($2) depends on sub or $13, $6, $2 # 2 operand ($2) depend on sub add $14, $2, $2 # 1 ($2) & 2 ($2) depend on sub sw $15,100($2) # Base reg. ($2) depends on sub 13

14 Data Hazards in the Optimized Pipeline: Example sub $2, $1, $3 and $12, $2, $5 or $13, $6, $2 add $14, $2, $2 sw $15,100($2) A IM REG DM REG L U 2 ns A IM REG DM REG L U 2 ns A IM REG DM REG L U 2 ns A IM REG DM REG L U 2 ns Time A IM REG DM REG L U Instruction order It is necessary to insert two stalls 14

15 Type of Data Hazard Read After Write (RAW) Instr J tries to read operand before Instr I writes it I: add r1,r2,r3 J: sub r4,r1,r3 Caused by a Dependence (in compiler nomenclature). This hazard results from an actual need for communication. 15

16 Data Hazards: Possible Solutions! Compilation Techniques:! Insertion of nop (no operation) instructions! Instructions Scheduling to avoid that correlating instructions are too close! The compiler tries to insert independent instructions among correlating instructions! When the compiler does not find independent instructions, it insert nops.! Hardware Techniques:! Insertion of bubbles or stalls in the pipeline! Data Forwarding or Bypassing 16

17 Just an example... sub and or add sw A IM REG DM REG L U 2 ns 2 ns Ordine di esecuzione delle istruzioni A IM REG DM REG L U A IM REG DM REG L U 2 ns A IM REG DM REG L U 2 ns Tempo A IM REG DM REG L U I need to insert 2 bubbles or 2 stalls 17

18 Insertion of nops sub $2, $1, $3 nop nop and $12, $2, $5 or $13, $6, $2 add $14, $2, $2 sw $15,100($2) IF ID EX ME IF ID EX ME IF ID EX ME IF ID EX ME IF ID EX ME IF ID EX ME IF ID EX ME 18

19 Insertion of bubbles and stalls CK1 CK2 CK3 CK4 CK5 CK6 CK7 CK8 CK9 CK10 CK11 CK12 sub $2, $1, $3 IF ID EX MEM Time (clock cycles) and $12, $2, $5 IF bubble bubble ID EX MEM or $13, $6, $2 IF ID EX MEM add $14, $2, $2 IF ID EX MEM sw $15,100( $2 ) IF ID EX MEM Contenuto di $

20 Scheduling: Example sub $2, $1, $3 sub $2, $1, $3 and $12, $2, $5 add $4, $10, $11 or $13, $6, $2 and $7, $8, $9 add $14, $2, $2 lw $16, 100($18) sw $15,100($2) lw $17, 200($19) add $4, $10, $11 and $12, $2, $5 and $7, $8, $9 or $13, $6, $2 lw $16, 100($18) add $14, $2, $2 lw $17, 200($19) sw $15,100($2) 20

21 Forwarding! Data forwarding uses temporary results stored in the pipeline registers instead of waiting for the write back of results in the RF.! We need to add multiplexers at the inputs of ALU to fetch inputs from pipeline registers to avoid the insertion of stalls in the pipeline. 21

22 Forwarding: Example EX/EX path MEM/EX path sub $2, $1, $3 IF ID EX ME MEM/ID path and $12, $2, $5 or $13, $6, $2 add $14, $2, $2 sw $15,100($2) IF ID EX ME IF ID EX ME IF ID EX ME IF ID EX ME 22

23 Implementation of MIPS with Forwarding Unit IF/ID ID/EX EX/MEM MEM/ P C Instr Memory I n s t r u c t i o n path Reg. M u x M u x M u x M u x A L U Data Memory M u x I F / I D. R e g i s t e r R s R s I F / I D. R e g i s t e r R t R t I F / I D. R e g i s t e r R t I F / I D. R e g i s t e r R d R t R d M u x E X / M E M. R e g i s t e r R d MEM/ID path F o r w a r d i n g u n i t M E M / W B. R e g i s t e r R d MEM/EX path EX/EX path 23

24 Data Hazards: Load/Use Hazard L1: lw $s0, 4($t1) # $s0 <- M [4 + $t1] L2: add $s5, $s0, $s1 # 1 operand depends from L1 " CK1 CK2 CK3 CK4 CK5 CK6 CK7 lw $s0, 4($t1) IF ID EX MEM add $s5,$s0,$s1 IF ID EX MEM 24

25 Data Hazards: Load/Use Hazard! With forwarding using the MEM/EX path: 1 stall CK1 CK2 CK3 CK4 CK5 CK6 CK7 lw $s0, 4($t1) IF ID EX MEM add $s5,$s0,$s1 IF ID EX MEM 25

26 Data Hazards: Load/Store L1: lw $s0, 4($t1) # $s0 <- M [4 + $t1] L2: sw $s0, 4($t2) # M[4 + $t2] <- $s0 " CK1 CK2 CK3 CK4 CK5 CK6 CK7 lw $s0, 4($t1) IF ID EX MEM sw $s0, 4($t2) IF ID EX MEM Contenuto di $s / Ø Without forwarding : 3 stalls 26

27 ! Forwarding: Stall = 0 Data Hazards: Load/Store! We need a forwarding path to bring the load result from the memory (in MEM/) to the memory s input for the store. CK1 CK2 CK3 CK4 CK5 CK6 CK7 lw $s0, 4($t1) IF ID EX MEM sw $s0, 4($t2) IF ID EX MEM 27

28 Implementation of MIPS with Forwarding Unit IF/ID ID/ EX/MEM MEM/ EX P C Memoria Istruzioni I n s t r u c t i o n path Reg. M u x M u x M u x M u x A L U M u x Mem. Dati M u x I F / I D. R e g i s t e r R s R s I F / I D. R e g i s t e r R t R t MEM/MEM path I F / I D. R e g i s t e r R t I F / I D. R e g i s t e r R d R t R d M u x E X / M E M. R e g i s t e r R d MEM/ID path F o r w a r d i n g u n i t M E M / W B. R e g i s t e r R d MEM/EX path EX/EX path Ø Ø Ø Ø EX/EX path MEM/EX path MEM/ID path MEM/MEM path 28

29 Forwarding to Avoid LW-SW Data Hazard Time (clock cycles) I n s t r. add r1,r2,r3 lw r4, 0(r1) Ifetch Reg Ifetch ALU Reg DMem ALU Reg DMem Reg O r d e r sw r4,12(r1) or r8,r6,r9 Ifetch Reg Ifetch ALU Reg DMem ALU Reg DMem Reg xor r10,r9,r11 Ifetch Reg ALU DMem Reg 29

30 Data Hazard Even with Forwarding Time (clock cycles) I n s t r. lw r1, 0(r2) sub r4,r1,r6 Ifetch Reg Ifetch ALU Reg DMem ALU Reg DMem Reg O r d e r and r6,r1,r7 or r8,r1,r9 Ifetch Reg Ifetch ALU Reg DMem ALU Reg DMem Reg 30

31 Software Scheduling to Avoid Load Hazards Try producing fast code for a = b + c; d = e f; assuming a, b, c, d,e, and f in memory. Slow code: LW LW ADD SW LW LW Rb,b Rc,c Ra,Rb,Rc a,ra Re,e Rf,f SUB Rd,Re,Rf SW d,rd Fast code: LW LW LW ADD LW SW SUB Rb,b Rc,c Re,e Ra,Rb,Rc Rf,f a,ra Rd,Re,Rf SW d,rd 31 Compiler optimizes for performance. Hardware checks for safety.

32 Data Hazards! Data hazards analyzed up to now are:! RAW (READ AFTER WRITE) hazards: instruction n+1 tries to read a source register before the previous instruction n has written it in the RF.! Example: add $r1, $r2, $r3 sub $r4, $r1, $r5! By using forwarding, it is always possible to solve this conflict without introducing stalls, except for the load/use hazards where it is necessary to add one stall 32

33 Data Hazards! Other types of data hazards in the pipeline:! WAW (WRITE AFTER WRITE)! WAR (WRITE AFTER READ) 33

34 Data Hazards: WAW (WRITE AFTER WRITE)! Instruction n+1 tries to write a destination operand before it has been written by the previous instruction n write operations executed in the wrong order! This type of hazards could not occur in the MIPS pipeline because all the register write operations occur in the stage 34

35 Data Hazards: WAW (WRITE AFTER WRITE)! Example: If we assume the register write in the ALU instructions occurs in the fourth stage and that load instructions require two stages (MEM1 and MEM2) to access the data memory, we can have: CK1 CK2 CK3 CK4 CK5 CK6 CK7 lw $r1, 0($r2) IF ID EX MEM 1 MEM2 add $r1,$r2,$r3 IF ID EX 35

36 Data Hazards: WAW (WRITE AFTER WRITE)! Example: If we assume the floating point ALU operations require a multi-cycle execution, we can have: CK1 CK2 CK3 CK4 CK5 CK6 CK7 CK8 mul $f6,$f2,$f2 IF ID MUL1 MUL2 MUL3 MUL4 MEM add $f6,$f2,$f2 IF ID AD1 AD2 MEM 36

37 WAW Data Hazards! Write After Write (WAW) Instr J writes operand before Instr I writes it. I: sub r1,r4,r3 J: add r1,r2,r3 K: mul r6,r1,r7! Called an output dependence by compiler writers This also results from the reuse of name r1.! Can t happen in MIPS 5 stage pipeline because:! All instructions take 5 stages, and! Writes are always in stage 5! Will see WAR and WAW in more complicated pipes 37

38 Data Hazards: WAR (WRITE AFTER READ)! Instruction n+1 tries to write a destination operand before it has been read from the previous instruction n instruction n reads the wrong value.! This type of hazards could not occur in the MIPS pipeline because the operand read operations occur in the ID stage and the write operations in the stage.! As before, if we assume the register write in the ALU instructions occurs in the fourth stage and that we need two stages to access the data memory, some instructions could read operands too late in the pipeline. 38

39 Data Hazards: WAR (WRITE AFTER READ)! Example: Instruction sw reads $r2 in the second half of MEM2 stage and instruction add writes $r2 in the first half of stage sw reads the new value of $r2. CK1 CK2 CK3 CK4 CK5 CK6 CK7 sw $r1, 0($r2) IF ID EX MEM 1 MEM2 add $r2, $r3, $r4 IF ID EX 39

40 WAR Data Hazards! Write After Read (WAR) Instr J writes operand before Instr I reads it I: sub r4,r1,r3 J: add r1,r2,r3 K: mul r6,r1,r7! Called an anti-dependence by compiler writers. This results from reuse of the name r1.! Can t happen in MIPS 5 stage pipeline because:! All instructions take 5 stages, and! Reads are always in stage 2, and! Writes are always in stage 5 40

41 Performance Issues in Pipelining! Pipelining increases the CPU instruction throughput (number of instructions completed per unit of time), but it does not reduce the execution time (latency) of a single instruction.! Pipelining usually slightly increases the latency of each instruction due to imbalance among the pipeline stages and overhead in the control of the pipeline.! Imbalance among pipeline stages reduces performance since the clock can run no faster than the time needed for the slowest pipe stage.! Pipeline overhead arises from pipeline register delay and clock skew. 41

42 Performance Issues in Pipelining! The average instruction execution time for the unpipelined processor is: Ave. Exec. Time Unpipelined = Ave. CPI Unp. x Clock Cycle Unp. Pipeline Speedup = Ave. Exec. Time Unpipelined = Ave. Exec. Time Pipelined = Ave. CPI Unp. x Clock Cycle Unp. = Ave. CPI Pipe Clock Cycle Pipe 42

43 Performance Issues in Pipelining! The ideal CPI on a pipelined processor is almost always 1, but stalls cause the pipeline performance to degrade form the ideal performance, so we have: Ave. CPI Pipe = Ideal CPI + Pipe Stall Cycles per Instruction = 1 + Pipe Stall Cycles per Instruction! Pipe Stall Cycles per Instruction are due to Structural Hazards + Data Hazards + Control Hazards 43

44 Performance Issues in Pipelining! If we ignore the cycle time overhead of pipelining and we assume the stages are perfectly balanced, the clock cycle time of two processors can be equal, so: Pipeline Speedup = Ave. CPI Unp. 1 + Pipe Stall Cycles per Instruction! Simple case: All instructions take the same number of cycles, which must also equal to the number of pipeline stages (called pipeline depth): Pipeline Speedup = Pipeline Depth 1 + Pipe Stall Cycles per Instruction! If there are no pipeline stalls (ideal case), this leads to the intuitive result that pipelining can improve performance by the depth of the pipeline. 44

45 ! CI = # of Instructios Performances! # Clock Cycles = CI + # Stall Cycles + 4! CPI = # Clock Cycles / CI = (CI + # Stall Cycles + 4) / CI! MIPS = f clock / (CPI * 10 6 ) 45

46 Performance Time sub $2, $1, $3 and $12, $2, $5 or $13, $6, $2 add $14, $2, $2 sw $15,100($2) Instruction order A IM REG DM REG L U 2 ns A IM REG DM REG L U 2 ns A IM REG DM REG L U 2 ns A IM REG DM REG L U 2 ns A IM REG DM REG L U It is necessary to insert two stalls IC = 5 # Clock Cycles = IC + # Stall Cycles + 4 = = 11 CPI = # Clock Cycles/ CI = 11 / 5 = 2.2 MIPS = f clock / (CPI * 10 6 ) = 500 MHz / 2.2 * 10 6 =

47 Asymptotic Performance! We have n iterations of a cycle defined using m instructions. We have k stalls for the m instructions! IC AS = m * n! # Clock Cycles = IC AS + (# Stall Cycles ) AS + 4! CPI AS = lim n -> ( IC AS + # Stall Cycles AS +4) /IC AS = lim n -> ( m *n + k * n + 4 ) / m * n = (m + k) / m! MIPS AS = f clock / (CPI AS * 10 6 ) 47

48 Questions

Pipelining! Advanced Topics on Heterogeneous System Architectures. Politecnico di Milano! Seminar DEIB! 30 November, 2017!

Pipelining! Advanced Topics on Heterogeneous System Architectures. Politecnico di Milano! Seminar DEIB! 30 November, 2017! Advanced Topics on Heterogeneous System Architectures Pipelining! Politecnico di Milano! Seminar Room @ DEIB! 30 November, 2017! Antonio R. Miele! Marco D. Santambrogio! Politecnico di Milano! 2 Outline!

More information

Pipelining: Basic Concepts

Pipelining: Basic Concepts Pipelining: Basic Concepts Prof. Cristina Silvano Dipartimento di Elettronica e Informazione Politecnico di ilano email: silvano@elet.polimi.it Outline Reduced Instruction Set of IPS Processor Implementation

More information

Pipeline Review. Review

Pipeline Review. Review Pipeline Review Review Covered in EECS2021 (was CSE2021) Just a reminder of pipeline and hazards If you need more details, review 2021 materials 1 The basic MIPS Processor Pipeline 2 Performance of pipelining

More information

Computer Architecture

Computer Architecture Lecture 3: Pipelining Iakovos Mavroidis Computer Science Department University of Crete 1 Previous Lecture Measurements and metrics : Performance, Cost, Dependability, Power Guidelines and principles in

More information

CISC 662 Graduate Computer Architecture Lecture 6 - Hazards

CISC 662 Graduate Computer Architecture Lecture 6 - Hazards CISC 662 Graduate Computer Architecture Lecture 6 - Hazards Michela Taufer http://www.cis.udel.edu/~taufer/teaching/cis662f07 Powerpoint Lecture Notes from John Hennessy and David Patterson s: Computer

More information

Advanced Parallel Architecture Lessons 5 and 6. Annalisa Massini /2017

Advanced Parallel Architecture Lessons 5 and 6. Annalisa Massini /2017 Advanced Parallel Architecture Lessons 5 and 6 Annalisa Massini - Pipelining Hennessy, Patterson Computer architecture A quantitive approach Appendix C Sections C.1, C.2 Pipelining Pipelining is an implementation

More information

COSC4201 Pipelining. Prof. Mokhtar Aboelaze York University

COSC4201 Pipelining. Prof. Mokhtar Aboelaze York University COSC4201 Pipelining Prof. Mokhtar Aboelaze York University 1 Instructions: Fetch Every instruction could be executed in 5 cycles, these 5 cycles are (MIPS like machine). Instruction fetch IR Mem[PC] NPC

More information

Pipeline Overview. Dr. Jiang Li. Adapted from the slides provided by the authors. Jiang Li, Ph.D. Department of Computer Science

Pipeline Overview. Dr. Jiang Li. Adapted from the slides provided by the authors. Jiang Li, Ph.D. Department of Computer Science Pipeline Overview Dr. Jiang Li Adapted from the slides provided by the authors Outline MIPS An ISA for Pipelining 5 stage pipelining Structural and Data Hazards Forwarding Branch Schemes Exceptions and

More information

Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier

Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier Science 6 PM 7 8 9 10 11 Midnight Time 30 40 20 30 40 20

More information

Lecture 3. Pipelining. Dr. Soner Onder CS 4431 Michigan Technological University 9/23/2009 1

Lecture 3. Pipelining. Dr. Soner Onder CS 4431 Michigan Technological University 9/23/2009 1 Lecture 3 Pipelining Dr. Soner Onder CS 4431 Michigan Technological University 9/23/2009 1 A "Typical" RISC ISA 32-bit fixed format instruction (3 formats) 32 32-bit GPR (R0 contains zero, DP take pair)

More information

Modern Computer Architecture

Modern Computer Architecture Modern Computer Architecture Lecture2 Pipelining: Basic and Intermediate Concepts Hongbin Sun 国家集成电路人才培养基地 Xi an Jiaotong University Pipelining: Its Natural! Laundry Example Ann, Brian, Cathy, Dave each

More information

EI338: Computer Systems and Engineering (Computer Architecture & Operating Systems)

EI338: Computer Systems and Engineering (Computer Architecture & Operating Systems) EI338: Computer Systems and Engineering (Computer Architecture & Operating Systems) Chentao Wu 吴晨涛 Associate Professor Dept. of Computer Science and Engineering Shanghai Jiao Tong University SEIEE Building

More information

COSC 6385 Computer Architecture - Pipelining

COSC 6385 Computer Architecture - Pipelining COSC 6385 Computer Architecture - Pipelining Fall 2006 Some of the slides are based on a lecture by David Culler, Instruction Set Architecture Relevant features for distinguishing ISA s Internal storage

More information

Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier

Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier Science Cases that affect instruction execution semantics

More information

第三章 Instruction-Level Parallelism and Its Dynamic Exploitation. 陈文智 浙江大学计算机学院 2014 年 10 月

第三章 Instruction-Level Parallelism and Its Dynamic Exploitation. 陈文智 浙江大学计算机学院 2014 年 10 月 第三章 Instruction-Level Parallelism and Its Dynamic Exploitation 陈文智 chenwz@zju.edu.cn 浙江大学计算机学院 2014 年 10 月 1 3.3 The Major Hurdle of Pipelining Pipeline Hazards 本科回顾 ------- Appendix A.2 3.3.1 Taxonomy

More information

What is Pipelining? Time per instruction on unpipelined machine Number of pipe stages

What is Pipelining? Time per instruction on unpipelined machine Number of pipe stages What is Pipelining? Is a key implementation techniques used to make fast CPUs Is an implementation techniques whereby multiple instructions are overlapped in execution It takes advantage of parallelism

More information

Advanced Computer Architecture Pipelining

Advanced Computer Architecture Pipelining Advanced Computer Architecture Pipelining Dr. Shadrokh Samavi Some slides are from the instructors resources which accompany the 6 th and previous editions of the textbook. Some slides are from David Patterson,

More information

Instruction Pipelining Review

Instruction Pipelining Review Instruction Pipelining Review Instruction pipelining is CPU implementation technique where multiple operations on a number of instructions are overlapped. An instruction execution pipeline involves a number

More information

MIPS Pipelining. Computer Organization Architectures for Embedded Computing. Wednesday 8 October 14

MIPS Pipelining. Computer Organization Architectures for Embedded Computing. Wednesday 8 October 14 MIPS Pipelining Computer Organization Architectures for Embedded Computing Wednesday 8 October 14 Many slides adapted from: Computer Organization and Design, Patterson & Hennessy 4th Edition, 2011, MK

More information

What is Pipelining? RISC remainder (our assumptions)

What is Pipelining? RISC remainder (our assumptions) What is Pipelining? Is a key implementation techniques used to make fast CPUs Is an implementation techniques whereby multiple instructions are overlapped in execution It takes advantage of parallelism

More information

CPE Computer Architecture. Appendix A: Pipelining: Basic and Intermediate Concepts

CPE Computer Architecture. Appendix A: Pipelining: Basic and Intermediate Concepts CPE 110408443 Computer Architecture Appendix A: Pipelining: Basic and Intermediate Concepts Sa ed R. Abed [Computer Engineering Department, Hashemite University] Outline Basic concept of Pipelining The

More information

Instruction Level Parallelism. Appendix C and Chapter 3, HP5e

Instruction Level Parallelism. Appendix C and Chapter 3, HP5e Instruction Level Parallelism Appendix C and Chapter 3, HP5e Outline Pipelining, Hazards Branch prediction Static and Dynamic Scheduling Speculation Compiler techniques, VLIW Limits of ILP. Implementation

More information

DLX Unpipelined Implementation

DLX Unpipelined Implementation LECTURE - 06 DLX Unpipelined Implementation Five cycles: IF, ID, EX, MEM, WB Branch and store instructions: 4 cycles only What is the CPI? F branch 0.12, F store 0.05 CPI0.1740.83550.174.83 Further reduction

More information

Overview. Appendix A. Pipelining: Its Natural! Sequential Laundry 6 PM Midnight. Pipelined Laundry: Start work ASAP

Overview. Appendix A. Pipelining: Its Natural! Sequential Laundry 6 PM Midnight. Pipelined Laundry: Start work ASAP Overview Appendix A Pipelining: Basic and Intermediate Concepts Basics of Pipelining Pipeline Hazards Pipeline Implementation Pipelining + Exceptions Pipeline to handle Multicycle Operations 1 2 Unpipelined

More information

Appendix A. Overview

Appendix A. Overview Appendix A Pipelining: Basic and Intermediate Concepts 1 Overview Basics of Pipelining Pipeline Hazards Pipeline Implementation Pipelining + Exceptions Pipeline to handle Multicycle Operations 2 1 Unpipelined

More information

Pipelining. Maurizio Palesi

Pipelining. Maurizio Palesi * Pipelining * Adapted from David A. Patterson s CS252 lecture slides, http://www.cs.berkeley/~pattrsn/252s98/index.html Copyright 1998 UCB 1 References John L. Hennessy and David A. Patterson, Computer

More information

Pipelining concepts The DLX architecture A simple DLX pipeline Pipeline Hazards and Solution to overcome

Pipelining concepts The DLX architecture A simple DLX pipeline Pipeline Hazards and Solution to overcome Thoai Nam Pipelining concepts The DLX architecture A simple DLX pipeline Pipeline Hazards and Solution to overcome Reference: Computer Architecture: A Quantitative Approach, John L Hennessy & David a Patterson,

More information

Pipelining concepts The DLX architecture A simple DLX pipeline Pipeline Hazards and Solution to overcome

Pipelining concepts The DLX architecture A simple DLX pipeline Pipeline Hazards and Solution to overcome Pipeline Thoai Nam Outline Pipelining concepts The DLX architecture A simple DLX pipeline Pipeline Hazards and Solution to overcome Reference: Computer Architecture: A Quantitative Approach, John L Hennessy

More information

Minimizing Data hazard Stalls by Forwarding Data Hazard Classification Data Hazards Present in Current MIPS Pipeline

Minimizing Data hazard Stalls by Forwarding Data Hazard Classification Data Hazards Present in Current MIPS Pipeline Instruction Pipelining Review: MIPS In-Order Single-Issue Integer Pipeline Performance of Pipelines with Stalls Pipeline Hazards Structural hazards Data hazards Minimizing Data hazard Stalls by Forwarding

More information

ECEC 355: Pipelining

ECEC 355: Pipelining ECEC 355: Pipelining November 8, 2007 What is Pipelining Pipelining is an implementation technique whereby multiple instructions are overlapped in execution. A pipeline is similar in concept to an assembly

More information

Appendix C: Pipelining: Basic and Intermediate Concepts

Appendix C: Pipelining: Basic and Intermediate Concepts Appendix C: Pipelining: Basic and Intermediate Concepts Key ideas and simple pipeline (Section C.1) Hazards (Sections C.2 and C.3) Structural hazards Data hazards Control hazards Exceptions (Section C.4)

More information

Outline. Pipelining basics The Basic Pipeline for DLX & MIPS Pipeline hazards. Handling exceptions Multi-cycle operations

Outline. Pipelining basics The Basic Pipeline for DLX & MIPS Pipeline hazards. Handling exceptions Multi-cycle operations Pipelining 1 Outline Pipelining basics The Basic Pipeline for DLX & MIPS Pipeline hazards Structural Hazards Data Hazards Control Hazards Handling exceptions Multi-cycle operations 2 Pipelining basics

More information

Lecture 3: The Processor (Chapter 4 of textbook) Chapter 4.1

Lecture 3: The Processor (Chapter 4 of textbook) Chapter 4.1 Lecture 3: The Processor (Chapter 4 of textbook) Chapter 4.1 Introduction Chapter 4.1 Chapter 4.2 Review: MIPS (RISC) Design Principles Simplicity favors regularity fixed size instructions small number

More information

MIPS An ISA for Pipelining

MIPS An ISA for Pipelining Pipelining: Basic and Intermediate Concepts Slides by: Muhamed Mudawar CS 282 KAUST Spring 2010 Outline: MIPS An ISA for Pipelining 5 stage pipelining i Structural Hazards Data Hazards & Forwarding Branch

More information

Lecture 2: Processor and Pipelining 1

Lecture 2: Processor and Pipelining 1 The Simple BIG Picture! Chapter 3 Additional Slides The Processor and Pipelining CENG 6332 2 Datapath vs Control Datapath signals Control Points Controller Datapath: Storage, FU, interconnect sufficient

More information

Chapter 4 The Processor 1. Chapter 4A. The Processor

Chapter 4 The Processor 1. Chapter 4A. The Processor Chapter 4 The Processor 1 Chapter 4A The Processor Chapter 4 The Processor 2 Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware

More information

Lecture 05: Pipelining: Basic/ Intermediate Concepts and Implementation

Lecture 05: Pipelining: Basic/ Intermediate Concepts and Implementation Lecture 05: Pipelining: Basic/ Intermediate Concepts and Implementation CSE 564 Computer Architecture Summer 2017 Department of Computer Science and Engineering Yonghong Yan yan@oakland.edu www.secs.oakland.edu/~yan

More information

CMSC 411 Computer Systems Architecture Lecture 6 Basic Pipelining 3. Complications With Long Instructions

CMSC 411 Computer Systems Architecture Lecture 6 Basic Pipelining 3. Complications With Long Instructions CMSC 411 Computer Systems Architecture Lecture 6 Basic Pipelining 3 Long Instructions & MIPS Case Study Complications With Long Instructions So far, all MIPS instructions take 5 cycles But haven't talked

More information

EITF20: Computer Architecture Part2.2.1: Pipeline-1

EITF20: Computer Architecture Part2.2.1: Pipeline-1 EITF20: Computer Architecture Part2.2.1: Pipeline-1 Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Pipelining Harzards Structural hazards Data hazards Control hazards Implementation issues Multi-cycle

More information

Lecture 9. Pipeline Hazards. Christos Kozyrakis Stanford University

Lecture 9. Pipeline Hazards. Christos Kozyrakis Stanford University Lecture 9 Pipeline Hazards Christos Kozyrakis Stanford University http://eeclass.stanford.edu/ee18b 1 Announcements PA-1 is due today Electronic submission Lab2 is due on Tuesday 2/13 th Quiz1 grades will

More information

Pipelining: Basic and Intermediate Concepts

Pipelining: Basic and Intermediate Concepts Appendix A Pipelining: Basic and Intermediate Concepts 1 Overview Basics of fpipelining i Pipeline Hazards Pipeline Implementation Pipelining + Exceptions Pipeline to handle Multicycle Operations 2 Unpipelined

More information

CSE 533: Advanced Computer Architectures. Pipelining. Instructor: Gürhan Küçük. Yeditepe University

CSE 533: Advanced Computer Architectures. Pipelining. Instructor: Gürhan Küçük. Yeditepe University CSE 533: Advanced Computer Architectures Pipelining Instructor: Gürhan Küçük Yeditepe University Lecture notes based on notes by Mark D. Hill and John P. Shen Updated by Mikko Lipasti Pipelining Forecast

More information

Introduction to Embedded Architectures

Introduction to Embedded Architectures Introduction to Embedded Architectures Prof. Cristina SILVANO Politecnico di Milano Dipartimento di Elettronica e Informazione P.za L. Da Vinci 32, I-20133 Milano (Italy) Ph.: +39-02-2399-3692 e-mail:

More information

The Processor Pipeline. Chapter 4, Patterson and Hennessy, 4ed. Section 5.3, 5.4: J P Hayes.

The Processor Pipeline. Chapter 4, Patterson and Hennessy, 4ed. Section 5.3, 5.4: J P Hayes. The Processor Pipeline Chapter 4, Patterson and Hennessy, 4ed. Section 5.3, 5.4: J P Hayes. Pipeline A Basic MIPS Implementation Memory-reference instructions Load Word (lw) and Store Word (sw) ALU instructions

More information

3/12/2014. Single Cycle (Review) CSE 2021: Computer Organization. Single Cycle with Jump. Multi-Cycle Implementation. Why Multi-Cycle?

3/12/2014. Single Cycle (Review) CSE 2021: Computer Organization. Single Cycle with Jump. Multi-Cycle Implementation. Why Multi-Cycle? CSE 2021: Computer Organization Single Cycle (Review) Lecture-10b CPU Design : Pipelining-1 Overview, Datapath and control Shakil M. Khan 2 Single Cycle with Jump Multi-Cycle Implementation Instruction:

More information

Computer and Information Sciences College / Computer Science Department Enhancing Performance with Pipelining

Computer and Information Sciences College / Computer Science Department Enhancing Performance with Pipelining Computer and Information Sciences College / Computer Science Department Enhancing Performance with Pipelining Single-Cycle Design Problems Assuming fixed-period clock every instruction datapath uses one

More information

Data Hazards Compiler Scheduling Pipeline scheduling or instruction scheduling: Compiler generates code to eliminate hazard

Data Hazards Compiler Scheduling Pipeline scheduling or instruction scheduling: Compiler generates code to eliminate hazard Data Hazards Compiler Scheduling Pipeline scheduling or instruction scheduling: Compiler generates code to eliminate hazard Consider: a = b + c; d = e - f; Assume loads have a latency of one clock cycle:

More information

Page 1. Pipelining: Its Natural! Chapter 3. Pipelining. Pipelined Laundry Start work ASAP. Sequential Laundry A B C D. 6 PM Midnight

Page 1. Pipelining: Its Natural! Chapter 3. Pipelining. Pipelined Laundry Start work ASAP. Sequential Laundry A B C D. 6 PM Midnight Pipelining: Its Natural! Chapter 3 Pipelining Laundry Example Ann, Brian, Cathy, Dave each have one load of clothes to wash, dry, and fold Washer takes 30 minutes A B C D Dryer takes 40 minutes Folder

More information

EITF20: Computer Architecture Part2.2.1: Pipeline-1

EITF20: Computer Architecture Part2.2.1: Pipeline-1 EITF20: Computer Architecture Part2.2.1: Pipeline-1 Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Pipelining Harzards Structural hazards Data hazards Control hazards Implementation issues Multi-cycle

More information

Advanced Computer Architecture

Advanced Computer Architecture Advanced Computer Architecture Chapter 1 Introduction into the Sequential and Pipeline Instruction Execution Martin Milata What is a Processors Architecture Instruction Set Architecture (ISA) Describes

More information

Appendix C. Abdullah Muzahid CS 5513

Appendix C. Abdullah Muzahid CS 5513 Appendix C Abdullah Muzahid CS 5513 1 A "Typical" RISC ISA 32-bit fixed format instruction (3 formats) 32 32-bit GPR (R0 contains zero) Single address mode for load/store: base + displacement no indirection

More information

Department of Computer and IT Engineering University of Kurdistan. Computer Architecture Pipelining. By: Dr. Alireza Abdollahpouri

Department of Computer and IT Engineering University of Kurdistan. Computer Architecture Pipelining. By: Dr. Alireza Abdollahpouri Department of Computer and IT Engineering University of Kurdistan Computer Architecture Pipelining By: Dr. Alireza Abdollahpouri Pipelined MIPS processor Any instruction set can be implemented in many

More information

What do we have so far? Multi-Cycle Datapath (Textbook Version)

What do we have so far? Multi-Cycle Datapath (Textbook Version) What do we have so far? ulti-cycle Datapath (Textbook Version) CPI: R-Type = 4, Load = 5, Store 4, Branch = 3 Only one instruction being processed in datapath How to lower CPI further? #1 Lec # 8 Summer2001

More information

Copyright 2012, Elsevier Inc. All rights reserved.

Copyright 2012, Elsevier Inc. All rights reserved. Computer Architecture A Quantitative Approach, Fifth Edition Chapter 3 Instruction-Level Parallelism and Its Exploitation 1 Branch Prediction Basic 2-bit predictor: For each branch: Predict taken or not

More information

EITF20: Computer Architecture Part2.2.1: Pipeline-1

EITF20: Computer Architecture Part2.2.1: Pipeline-1 EITF20: Computer Architecture Part2.2.1: Pipeline-1 Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Pipelining Harzards Structural hazards Data hazards Control hazards Implementation issues Multi-cycle

More information

Lecture 7 Pipelining. Peng Liu.

Lecture 7 Pipelining. Peng Liu. Lecture 7 Pipelining Peng Liu liupeng@zju.edu.cn 1 Review: The Single Cycle Processor 2 Review: Given Datapath,RTL -> Control Instruction Inst Memory Adr Op Fun Rt

More information

COMP2611: Computer Organization. The Pipelined Processor

COMP2611: Computer Organization. The Pipelined Processor COMP2611: Computer Organization The 1 2 Background 2 High-Performance Processors 3 Two techniques for designing high-performance processors by exploiting parallelism: Multiprocessing: parallelism among

More information

Unpipelined Machine. Pipelining the Idea. Pipelining Overview. Pipelined Machine. MIPS Unpipelined. Similar to assembly line in a factory

Unpipelined Machine. Pipelining the Idea. Pipelining Overview. Pipelined Machine. MIPS Unpipelined. Similar to assembly line in a factory Pipelining the Idea Similar to assembly line in a factory Divide instruction into smaller tasks Each task is performed on subset of resources Overlap the execution of multiple instructions by completing

More information

CISC 662 Graduate Computer Architecture Lecture 6 - Hazards

CISC 662 Graduate Computer Architecture Lecture 6 - Hazards CISC 662 Gaduate Compute Achitectue Lectue 6 - Hazads Michela Taufe http://www.cis.udel.edu/~taufe/teaching/cis662f07 Powepoint Lectue Notes fom John Hennessy and David Patteson s: Compute Achitectue,

More information

Advanced Computer Architecture

Advanced Computer Architecture ECE 563 Advanced Computer Architecture Fall 2010 Lecture 2: Review of Metrics and Pipelining 563 L02.1 Fall 2010 Review from Last Time Computer Architecture >> instruction sets Computer Architecture skill

More information

Pipelining. Each step does a small fraction of the job All steps ideally operate concurrently

Pipelining. Each step does a small fraction of the job All steps ideally operate concurrently Pipelining Computational assembly line Each step does a small fraction of the job All steps ideally operate concurrently A form of vertical concurrency Stage/segment - responsible for 1 step 1 machine

More information

Appendix C. Instructor: Josep Torrellas CS433. Copyright Josep Torrellas 1999, 2001, 2002,

Appendix C. Instructor: Josep Torrellas CS433. Copyright Josep Torrellas 1999, 2001, 2002, Appendix C Instructor: Josep Torrellas CS433 Copyright Josep Torrellas 1999, 2001, 2002, 2013 1 Pipelining Multiple instructions are overlapped in execution Each is in a different stage Each stage is called

More information

CSE 502 Graduate Computer Architecture. Lec 3-5 Performance + Instruction Pipelining Review

CSE 502 Graduate Computer Architecture. Lec 3-5 Performance + Instruction Pipelining Review CSE 502 Graduate Computer Architecture Lec 3-5 Performance + Instruction Pipelining Review Larry Wittie Computer Science, StonyBrook University http://www.cs.sunysb.edu/~cse502 and ~lw Slides adapted from

More information

1 Hazards COMP2611 Fall 2015 Pipelined Processor

1 Hazards COMP2611 Fall 2015 Pipelined Processor 1 Hazards Dependences in Programs 2 Data dependence Example: lw $1, 200($2) add $3, $4, $1 add can t do ID (i.e., read register $1) until lw updates $1 Control dependence Example: bne $1, $2, target add

More information

Instruction-Level Parallelism (ILP)

Instruction-Level Parallelism (ILP) Instruction Level Parallelism Instruction-Level Parallelism (ILP): overlap the execution of instructions to improve performance 2 approaches to exploit ILP: 1. Rely on hardware to help discover and exploit

More information

Pipeline Hazards. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Pipeline Hazards. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University Pipeline Hazards Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Hazards What are hazards? Situations that prevent starting the next instruction

More information

14:332:331 Pipelined Datapath

14:332:331 Pipelined Datapath 14:332:331 Pipelined Datapath I n s t r. O r d e r Inst 0 Inst 1 Inst 2 Inst 3 Inst 4 Single Cycle Disadvantages & Advantages Uses the clock cycle inefficiently the clock cycle must be timed to accommodate

More information

Computer Architecture Spring 2016

Computer Architecture Spring 2016 Computer Architecture Spring 2016 Lecture 02: Introduction II Shuai Wang Department of Computer Science and Technology Nanjing University Pipeline Hazards Major hurdle to pipelining: hazards prevent the

More information

Pipelining Analogy. Pipelined laundry: overlapping execution. Parallelism improves performance. Four loads: Non-stop: Speedup = 8/3.5 = 2.3.

Pipelining Analogy. Pipelined laundry: overlapping execution. Parallelism improves performance. Four loads: Non-stop: Speedup = 8/3.5 = 2.3. Pipelining Analogy Pipelined laundry: overlapping execution Parallelism improves performance Four loads: Speedup = 8/3.5 = 2.3 Non-stop: Speedup =2n/05n+15 2n/0.5n 1.5 4 = number of stages 4.5 An Overview

More information

Full Datapath. Chapter 4 The Processor 2

Full Datapath. Chapter 4 The Processor 2 Pipelining Full Datapath Chapter 4 The Processor 2 Datapath With Control Chapter 4 The Processor 3 Performance Issues Longest delay determines clock period Critical path: load instruction Instruction memory

More information

Processor (II) - pipelining. Hwansoo Han

Processor (II) - pipelining. Hwansoo Han Processor (II) - pipelining Hwansoo Han Pipelining Analogy Pipelined laundry: overlapping execution Parallelism improves performance Four loads: Speedup = 8/3.5 =2.3 Non-stop: 2n/0.5n + 1.5 4 = number

More information

Ti Parallel Computing PIPELINING. Michał Roziecki, Tomáš Cipr

Ti Parallel Computing PIPELINING. Michał Roziecki, Tomáš Cipr Ti5317000 Parallel Computing PIPELINING Michał Roziecki, Tomáš Cipr 2005-2006 Introduction to pipelining What is this What is pipelining? Pipelining is an implementation technique in which multiple instructions

More information

(Basic) Processor Pipeline

(Basic) Processor Pipeline (Basic) Processor Pipeline Nima Honarmand Generic Instruction Life Cycle Logical steps in processing an instruction: Instruction Fetch (IF_STEP) Instruction Decode (ID_STEP) Operand Fetch (OF_STEP) Might

More information

CSE 502 Graduate Computer Architecture. Lec 3-5 Performance + Instruction Pipelining Review

CSE 502 Graduate Computer Architecture. Lec 3-5 Performance + Instruction Pipelining Review CSE 502 Graduate Computer Architecture Lec 3-5 Performance + Instruction Pipelining Review Larry Wittie Computer Science, StonyBrook University http://www.cs.sunysb.edu/~cse502 and ~lw Slides adapted from

More information

Instruction Pipelining

Instruction Pipelining Instruction Pipelining Simplest form is a 3-stage linear pipeline New instruction fetched each clock cycle Instruction finished each clock cycle Maximal speedup = 3 achieved if and only if all pipe stages

More information

ECE154A Introduction to Computer Architecture. Homework 4 solution

ECE154A Introduction to Computer Architecture. Homework 4 solution ECE154A Introduction to Computer Architecture Homework 4 solution 4.16.1 According to Figure 4.65 on the textbook, each register located between two pipeline stages keeps data shown below. Register IF/ID

More information

Instruction Pipelining

Instruction Pipelining Instruction Pipelining Simplest form is a 3-stage linear pipeline New instruction fetched each clock cycle Instruction finished each clock cycle Maximal speedup = 3 achieved if and only if all pipe stages

More information

Lecture Topics. Announcements. Today: Data and Control Hazards (P&H ) Next: continued. Exam #1 returned. Milestone #5 (due 2/27)

Lecture Topics. Announcements. Today: Data and Control Hazards (P&H ) Next: continued. Exam #1 returned. Milestone #5 (due 2/27) Lecture Topics Today: Data and Control Hazards (P&H 4.7-4.8) Next: continued 1 Announcements Exam #1 returned Milestone #5 (due 2/27) Milestone #6 (due 3/13) 2 1 Review: Pipelined Implementations Pipelining

More information

Computer Architecture. Lecture 6.1: Fundamentals of

Computer Architecture. Lecture 6.1: Fundamentals of CS3350B Computer Architecture Winter 2015 Lecture 6.1: Fundamentals of Instructional Level Parallelism Marc Moreno Maza www.csd.uwo.ca/courses/cs3350b [Adapted from lectures on Computer Organization and

More information

Pipeline Architecture RISC

Pipeline Architecture RISC Pipeline Architecture RISC Independent tasks with independent hardware serial No repetitions during the process pipelined Pipelined vs Serial Processing Instruction Machine Cycle Every instruction must

More information

EECC551 Review. Dynamic Hardware-Based Speculation

EECC551 Review. Dynamic Hardware-Based Speculation EECC551 Review Recent Trends in Computer Design. Computer Performance Measures. Instruction Pipelining. Branch Prediction. Instruction-Level Parallelism (ILP). Loop-Level Parallelism (LLP). Dynamic Pipeline

More information

PIPELINING: HAZARDS. Mahdi Nazm Bojnordi. CS/ECE 6810: Computer Architecture. Assistant Professor School of Computing University of Utah

PIPELINING: HAZARDS. Mahdi Nazm Bojnordi. CS/ECE 6810: Computer Architecture. Assistant Professor School of Computing University of Utah PIPELINING: HAZARDS Mahdi Nazm Bojnordi Assistant Professor School of Computing University of Utah CS/ECE 6810: Computer Architecture Overview Announcement Homework 1 submission deadline: Jan. 30 th This

More information

CSCI 402: Computer Architectures. Fengguang Song Department of Computer & Information Science IUPUI. Today s Content

CSCI 402: Computer Architectures. Fengguang Song Department of Computer & Information Science IUPUI. Today s Content 3/6/8 CSCI 42: Computer Architectures The Processor (2) Fengguang Song Department of Computer & Information Science IUPUI Today s Content We have looked at how to design a Data Path. 4.4, 4.5 We will design

More information

C.1 Introduction. What Is Pipelining? C-2 Appendix C Pipelining: Basic and Intermediate Concepts

C.1 Introduction. What Is Pipelining? C-2 Appendix C Pipelining: Basic and Intermediate Concepts C-2 Appendix C Pipelining: Basic and Intermediate Concepts C.1 Introduction Many readers of this text will have covered the basics of pipelining in another text (such as our more basic text Computer Organization

More information

Chapter 4 (Part II) Sequential Laundry

Chapter 4 (Part II) Sequential Laundry Chapter 4 (Part II) The Processor Baback Izadi Division of Engineering Programs bai@engr.newpaltz.edu Sequential Laundry 6 P 7 8 9 10 11 12 1 2 A T a s k O r d e r A B C D 30 30 30 30 30 30 30 30 30 30

More information

LECTURE 3: THE PROCESSOR

LECTURE 3: THE PROCESSOR LECTURE 3: THE PROCESSOR Abridged version of Patterson & Hennessy (2013):Ch.4 Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU

More information

EE 4683/5683: COMPUTER ARCHITECTURE

EE 4683/5683: COMPUTER ARCHITECTURE EE 4683/5683: COMPUTER ARCHITECTURE Lecture 4A: Instruction Level Parallelism - Static Scheduling Avinash Kodi, kodi@ohio.edu Agenda 2 Dependences RAW, WAR, WAW Static Scheduling Loop-carried Dependence

More information

Updated Exercises by Diana Franklin

Updated Exercises by Diana Franklin C-82 Appendix C Pipelining: Basic and Intermediate Concepts Updated Exercises by Diana Franklin C.1 [15/15/15/15/25/10/15] Use the following code fragment: Loop: LD R1,0(R2) ;load R1 from address

More information

Graduate Computer Architecture. Chapter 3. Instruction Level Parallelism and Its Dynamic Exploitation

Graduate Computer Architecture. Chapter 3. Instruction Level Parallelism and Its Dynamic Exploitation Graduate Computer Architecture Chapter 3 Instruction Level Parallelism and Its Dynamic Exploitation 1 Overview Instruction level parallelism Dynamic Scheduling Techniques Scoreboarding (Appendix A.8) Tomasulo

More information

Chapter 4. The Processor

Chapter 4. The Processor Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified

More information

The Processor. Z. Jerry Shi Department of Computer Science and Engineering University of Connecticut. CSE3666: Introduction to Computer Architecture

The Processor. Z. Jerry Shi Department of Computer Science and Engineering University of Connecticut. CSE3666: Introduction to Computer Architecture The Processor Z. Jerry Shi Department of Computer Science and Engineering University of Connecticut CSE3666: Introduction to Computer Architecture Introduction CPU performance factors Instruction count

More information

Complications with long instructions. CMSC 411 Computer Systems Architecture Lecture 6 Basic Pipelining 3. How slow is slow?

Complications with long instructions. CMSC 411 Computer Systems Architecture Lecture 6 Basic Pipelining 3. How slow is slow? Complications with long instructions CMSC 411 Computer Systems Architecture Lecture 6 Basic Pipelining 3 Long Instructions & MIPS Case Study So far, all MIPS instructions take 5 cycles But haven't talked

More information

Page 1. Recall from Pipelining Review. Lecture 15: Instruction Level Parallelism and Dynamic Execution

Page 1. Recall from Pipelining Review. Lecture 15: Instruction Level Parallelism and Dynamic Execution CS252 Graduate Computer Architecture Recall from Pipelining Review Lecture 15: Instruction Level Parallelism and Dynamic Execution March 11, 2002 Prof. David E. Culler Computer Science 252 Spring 2002

More information

ECE 154A Introduction to. Fall 2012

ECE 154A Introduction to. Fall 2012 ECE 154A Introduction to Computer Architecture Fall 2012 Dmitri Strukov Lecture 10 Floating point review Pipelined design IEEE Floating Point Format single: 8 bits double: 11 bits single: 23 bits double:

More information

Chapter 3. Pipelining. EE511 In-Cheol Park, KAIST

Chapter 3. Pipelining. EE511 In-Cheol Park, KAIST Chapter 3. Pipelining EE511 In-Cheol Park, KAIST Terminology Pipeline stage Throughput Pipeline register Ideal speedup Assume The stages are perfectly balanced No overhead on pipeline registers Speedup

More information

T = I x CPI x C. Both effective CPI and clock cycle C are heavily influenced by CPU design. CPI increased (3-5) bad Shorter cycle good

T = I x CPI x C. Both effective CPI and clock cycle C are heavily influenced by CPU design. CPI increased (3-5) bad Shorter cycle good CPU performance equation: T = I x CPI x C Both effective CPI and clock cycle C are heavily influenced by CPU design. For single-cycle CPU: CPI = 1 good Long cycle time bad On the other hand, for multi-cycle

More information

Instruction Level Parallelism. ILP, Loop level Parallelism Dependences, Hazards Speculation, Branch prediction

Instruction Level Parallelism. ILP, Loop level Parallelism Dependences, Hazards Speculation, Branch prediction Instruction Level Parallelism ILP, Loop level Parallelism Dependences, Hazards Speculation, Branch prediction Basic Block A straight line code sequence with no branches in except to the entry and no branches

More information

ECE473 Computer Architecture and Organization. Pipeline: Control Hazard

ECE473 Computer Architecture and Organization. Pipeline: Control Hazard Computer Architecture and Organization Pipeline: Control Hazard Lecturer: Prof. Yifeng Zhu Fall, 2015 Portions of these slides are derived from: Dave Patterson UCB Lec 15.1 Pipelining Outline Introduction

More information

Outline Marquette University

Outline Marquette University COEN-4710 Computer Hardware Lecture 4 Processor Part 2: Pipelining (Ch.4) Cristinel Ababei Department of Electrical and Computer Engineering Credits: Slides adapted primarily from presentations from Mike

More information

Pipelining. CSC Friday, November 6, 2015

Pipelining. CSC Friday, November 6, 2015 Pipelining CSC 211.01 Friday, November 6, 2015 Performance Issues Longest delay determines clock period Critical path: load instruction Instruction memory register file ALU data memory register file Not

More information