Outline Marquette University
|
|
- Lisa Dorsey
- 5 years ago
- Views:
Transcription
1 COEN-4710 Computer Hardware Lecture 4 Processor Part 2: Pipelining (Ch.4) Cristinel Ababei Department of Electrical and Computer Engineering Credits: Slides adapted primarily from presentations from Mike Johnson, Richard Povinelli, David Patterson, Morgan Kaufmann, and other online sources Page 1 Outline Pipelining Introduction Pipelined MIPS Datapath Pipeline Control Data Hazards Address data hazards - forwarding (HW solution) Double data hazards - revised forwarding Load-use data hazards - hazard detection unit to Stall Branch Hazards Flush Exceptions Exception Handling Routine Improving Performance Page 2 1
2 Laundry Example Pipelining is Natural! Ann, Brian, Cathy, Dave each have one load of clothes to wash, dry, and fold Washer takes 30 minutes A B C D Dryer takes 30 minutes Folder takes 30 minutes Stasher takes 30 minutes to put clothes into drawers Page 3 Sequential Laundry Sequential laundry takes 8 hours for 4 loads 6 PM AM T a s k O r d e r A B C D Time If we pipelined, how long would laundry take? Page 4 2
3 Pipelined Laundry: Start work ASAP Pipelined laundry takes 3.5 hours for 4 loads! 12 2 AM 6 PM T a s k A B Time O r d e r C D What s the theoretical speedup? Page 5 Pipelining Lessons T a s k O r d e r 6 PM Time A B C D Pipelining doesn t help latency of a single task, it helps throughput of entire workload Max potential speedup is Number of pipe stages Performance limited by slowest pipeline stage (Unbalanced lengths of pipe) Time to fill pipeline and time to drain it reduces speedup Might need to stall for dependencies Page 6 3
4 The Five Stages of Load Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 Load Ifetch Reg/Dec Exec Mem Wr 1. IF: Get instruction from Instruction Memory 2. ID: Register Fetch and Instruction Decode 3. EX: Calculate the memory address 4. MEM: Read the data from data Memory 5. WB: Write the data back to the register file Page 7 Pipelining Improve performance by increasing instruction throughput Program execution order Time (in instructions) lw $1, 100($0) lw $2, 200($0) lw $3, 300($0) Program execution Time order (in instructions) lw $1, 100($0) lw $2, 200($0) lw $3, 300($0) Instruction fetch Instruction fetch 2 ns Reg ALU 8 ns Data access Reg Instruction fetch Reg ALU 8 ns Data access Reg Instruction fetch 2 ns ALU Reg Instruction fetch Data access ALU Reg Reg Data access ALU Reg Data access Reg 2 ns 2 ns 2 ns 2 ns 2 ns Reg Instruction fetch 8 ns... Note when register reads and writes occur What is the ideal speedup for this pipeline? Page 8 4
5 Pipeline Speedup If all stages are balanced i.e., all take the same time Time between instructions pipelined = Time between instructions nonpipelined Number of stages If not balanced, speedup is less Speedup due to increased throughput Latency (time for each instruction) does not decrease Page 9 Implementing Pipelining What makes it easy MIPS instructions are all the same length Just a few instruction formats Memory operands appear only in loads and stores What makes it hard Structural hazards Multiple instructions needing the same datapath components Data hazards Instructions depending on previous instruction s output Control hazards Changes to instruction sequence - branching, jumping, Exception handling: overflows, interrupts Page 10 5
6 Basic MIPS Datapath Structure MEM Right-to-left flow leads to hazards WB Page 11 Pipeline registers Need registers between stages To hold information produced in previous cycle Page 12 6
7 Pipeline Operation Cycle-by-cycle flow of instructions through the pipelined datapath Single-clock-cycle pipeline diagram Shows pipeline usage in a single cycle Highlight resources used c.f. multi-clock-cycle diagram Graph of operation over time We ll look at single-clock-cycle diagrams for load & store Page 13 IF for Load, Store, Page 14 7
8 ID for Load, Store, Page 15 EX for Load Page 16 8
9 MEM for Load Page 17 WB for Load Wrong register number Page 18 9
10 Corrected Datapath for Load Page 19 EX for Store Page 20 10
11 MEM for Store Page 21 WB for Store Page 22 11
12 Graphically Representing Pipelines Program execution order (in instructions) lw$10, 20($1) Time (in clockcycles) CC1 CC2 CC3 CC4 CC5 CC6 IM Reg ALU DM Reg sub $11, $2, $3 IM Reg ALU DM Reg Can help with answering questions like: How many cycles does it take to execute this code? What is the ALU doing during cycle 4? Grayed portions what s being used for this instruction. Page 23 Multi-Cycle Pipeline Diagram Form showing resource usage Page 24 12
13 Single-Cycle Pipeline Diagram State of pipeline in a given cycle Page 25 Pipeline Control (Simplified) Page 26 13
14 Pipeline control We have 5 separate stages 1. IF: Instruction Fetch and PC Increment 2. ID: Instruction Decode / Register Fetch 3. EX: Execution 4. MEM: Memory Access 5. WR: Write Back How should control be handled? A fancy control center telling everyone what to do? Should we use a state machine? What lines need to be controlled in each stage? Page 27 Pipeline Control Pass control signals along just like the data Execution/Address Calculation stage control lines Memory access stage control lines Write-back stage control lines Instruction Reg Dst ALU Op1 ALU Op0 ALU Src Branch Mem Read Mem Write Reg write Mem to Reg R-format lw sw X X beq X X Page 28 14
15 Datapath with Control Page 29 (1) Data Hazards in ALU Instructions Consider this sequence: sub $2, $1,$3 and $12,$2,$5 or $13,$6,$2 add $14,$2,$2 sw $15,100($2) Page 30 15
16 Data hazards Problem with starting next instruction before first is finished have dependencies that go backward in time Page 31 Software Solution Have compiler guarantee no hazards: no ops sub $2, $1, $3 and $12, $2, $5 or $13, $6, $2 add $14, $2, $2 sw $15, 100($2) Where would we put no ops in this example? Problem: this slows things down! Page 32 16
17 Hardware solution - Forwarding Forwarding: Use outputs before they are actually written Page 33 Detecting the Need to Forward Pass register numbers along pipeline e.g., ID/EX.RegisterRs = register number for Rs sitting in ID/EX pipeline register ALU operand register numbers in EX stage are given by ID/EX.RegisterRs, ID/EX.RegisterRt Data hazards when 1a. EX/MEM.RegisterRd = ID/EX.RegisterRs 1b. EX/MEM.RegisterRd = ID/EX.RegisterRt 2a. MEM/WB.RegisterRd = ID/EX.RegisterRs 2b. MEM/WB.RegisterRd = ID/EX.RegisterRt Fwd from EX/MEM pipeline reg Fwd from MEM/WB pipeline reg Page 34 17
18 Detecting the Need to Forward But only if forwarding instruction will write to a register! EX/MEM.RegWrite, MEM/WB.RegWrite And only if Rd for that instruction is not $zero EX/MEM.RegisterRd 0, MEM/WB.RegisterRd 0 Page 35 Forwarding Paths Page 36 18
19 EX hazard Forwarding Conditions if (EX/MEM.RegWrite and (EX/MEM.RegisterRd 0) and (EX/MEM.RegisterRd = ID/EX.RegisterRs)) ForwardA = 10 if (EX/MEM.RegWrite and (EX/MEM.RegisterRd 0) and (EX/MEM.RegisterRd = ID/EX.RegisterRt)) ForwardB = 10 MEM hazard if (MEM/WB.RegWrite and (MEM/WB.RegisterRd 0) and (MEM/WB.RegisterRd = ID/EX.RegisterRs)) ForwardA = 01 if (MEM/WB.RegWrite and (MEM/WB.RegisterRd 0) and (MEM/WB.RegisterRd = ID/EX.RegisterRt)) ForwardB = 01 Page 37 (2) Double Data Hazard Consider the sequence: add $1,$1,$2 add $1,$1,$3 add $1,$1,$4 Both hazards occur Want to use the most recent Revise MEM hazard condition Only fwd if EX hazard condition isn t true Page 38 19
20 Revised Forwarding Condition MEM hazard if (MEM/WB.RegWrite and (MEM/WB.RegisterRd 0) and not (EX/MEM.RegWrite and (EX/MEM.RegisterRd 0) and (EX/MEM.RegisterRd = ID/EX.RegisterRs)) and (MEM/WB.RegisterRd = ID/EX.RegisterRs)) ForwardA = 01 if (MEM/WB.RegWrite and (MEM/WB.RegisterRd 0) and not (EX/MEM.RegWrite and (EX/MEM.RegisterRd 0) and (EX/MEM.RegisterRd = ID/EX.RegisterRt)) and (MEM/WB.RegisterRd = ID/EX.RegisterRt)) ForwardB = 01 Page 39 Datapath with Forwarding When might this not work? Page 40 20
21 (3) Load-Use Data Hazard Load word can still cause a hazard: Can t read a register after a load instruction writes to same register Need a hazard detection unit to stall the pipeline Need to stall for one cycle Page 41 Load-Use Hazard Detection Check when using instruction is decoded in ID stage ALU operand register numbers in ID stage are given by IF/ID.RegisterRs, IF/ID.RegisterRt Load-use hazard when ID/EX.MemRead and ((ID/EX.RegisterRt = IF/ID.RegisterRs) or (ID/EX.RegisterRt = IF/ID.RegisterRt)) If detected, stall and insert bubble Page 42 21
22 How to Stall the Pipeline Force control values in ID/EX register to 0 EX, MEM and WB do nop (no-operation) Prevent update of PC and IF/ID register Using instruction is decoded again Following instruction is fetched again 1-cycle stall allows MEM to read data for lw Can subsequently forward to EX stage Forwarding and stalling are fully inter-operable; a forward will operate correctly even across a stalled-stage in the pipeline. Page 43 Stall/Bubble in the Pipeline Stall inserted here Page 44 22
23 Stall/Bubble in the Pipeline Or, more accurately Page 45 Hazard Detection Unit How are we forcing a repeated stage here? Page 46 23
24 Code Scheduling to Avoid Stalls Reorder code to avoid use of load result in the next instruction C code for A = B + E; C = B + F; stall stall lw $t1, 0($t0) lw $t2, 4($t0) add $t3, $t1, $t2 sw $t3, 12($t0) lw $t4, 8($t0) add $t5, $t1, $t4 sw $t5, 16($t0) 13 cycles lw $t1, 0($t0) lw $t2, 4($t0) lw $t4, 8($t0) add $t3, $t1, $t2 sw $t3, 12($t0) add $t5, $t1, $t4 sw $t5, 16($t0) 11 cycles Page 47 (4) Branch Hazards If branch outcome determined at the end of EX stage By the time we know to branch, new instructions are in the pipe! : Flush these instructions (Set control values to 0) PC Page 48 24
25 We are basically predicting branch not taken What if it is? Predict Toss all instructions in the pipe Keep going (the branch instruction will update the PC correctly and we ll get to the right place) Undo any memory/register changes (this won t happen, since that s at the end of the pipe) So how do we tell the datapath to quit executing an instruction in the middle of the pipeline? Page 49 Flush To flush the pipe (when branch is taken) Set IF.Flush (new control line) Zero all control lines Similar to stall, except don t disable the PC and IF/ID write controls this effectively writes over what the previous instruction was doing. No memory or register writes will have yet happened, so everything else is OK Could we make the decision quicker, to lose fewer clock cycles? Page 50 25
26 Reducing Branch Delay Move hardware to determine outcome to ID stage Target address adder Register comparator This means the branch decision can be made during the ID stage instead of the EX stage. This is why we need an IF.Flush, but not an ID.Flush! But Problem: Messes up forwarding for branching Page 51 Example: Branch Taken Page 52 26
27 Example: Branch Taken Page 53 Data Hazards for Branches If a comparison register is a destination of 2 nd or 3 rd preceding ALU instruction add $1, $2, $3 IF ID EX MEM WB add $4, $5, $6 IF ID EX MEM WB IF ID EX MEM WB beq $1, $4, target IF ID EX MEM WB Can resolve using forwarding Page 54 27
28 Data Hazards for Branches If a comparison register is a destination of preceding ALU instruction or 2 nd preceding load instruction Need 1 stall cycle lw $1, addr IF ID EX MEM WB add $4, $5, $6 IF ID EX MEM WB beq stalled IF ID beq $1, $4, target ID EX MEM WB Page 55 Data Hazards for Branches If a comparison register is a destination of immediately preceding load instruction Need 2 stall cycles lw $1, addr IF ID EX MEM WB beq stalled IF ID beq stalled ID beq $1, $0, target ID EX MEM WB Page 56 28
29 Other branch hazard methods Dynamic branch prediction Keep a table of bits associated with the current chunk of memory, telling whether a branch was taken last time that instruction was executed. Use that to guess at the next instruction to execute Delayed branching Make the instruction set operate such that at a branch the following instruction is always executed, then the branch actually occurs. Puts the work on the programmer / assembler. MIPS does this!! (hidden in the assembler). Page 57 Stalls and Performance The BIG Picture Stalls reduce performance But are required to get correct results Compiler can arrange code to avoid hazards and stalls Requires knowledge of the pipeline structure Page 58 29
30 Exceptions: One last thing When an exception occurs (e.g. overflow), we have to stop the instruction right away Otherwise we might overwrite a value in the registers To handle this Flush current and 2 previous instructions Need to add ID.Flush and EX.Flush lines (to zero out control line values) Save address where instruction occurred Add special Exception PC (EPC) register Jump to special Exception handling routine Add special case PC input address Page 59 Final Data Path 30
31 Improving Performance Try to avoid stalls! e.g., reorder these instructions: lw $t0, 0($t1) lw $t2, 4($t1) sw $t2, 0($t1) sw $t0, 4($t1) Page 61 Instruction-Level Parallelism (ILP) Pipelining: executing multiple instructions in parallel To increase ILP Deeper pipeline Less work per stage shorter clock cycle Multiple issue Replicate pipeline stages multiple pipelines Start multiple instructions per clock cycle CPI < 1, so use Instructions Per Cycle (IPC) E.g., 4GHz 4-way multiple-issue 16 BIPS, peak CPI = 0.25, peak IPC = 4 But dependencies reduce this in practice Page 62 31
32 SuperPipelining This is just creating longer pipelines. Remember that the speedup is directly related to the number of stages in the pipe. Big issue: How do you break up a datapath into smaller pieces, each with smaller and roughly equal latencies! Page 63 Superscalar Processing Could you do laundry faster if you had 3 washers and 3 dryers? Yes, if you could keep them moving smoothly Basic method: Put multiple copies of datapath into hardware Launch multiple instructions every clock cycle Get ready to have serious control and hazard issues! Biggest issue: need to schedule instructions quickly and efficiently and get them into the pipes, in such a way that hazards are minimized. Page 64 32
33 Dynamic Pipeline Scheduling This is like regular or super pipelining, except with very advanced techniques to make sure there are always instructions ready to fill in if a stall happens (don t want to waste a clock cycle.) Basically keeps a series of instruction buffers and schedule the instructions as needed to keep the pipe full and the program running without hazards. Branch prediction / speculative execution Can create quite a problem for accurate exception handling. Page 65 Current processors Nearly all have pipelining/superscalar design Compiler technology important Can t create efficient code for a pipelined or superscalar processor without the compiler thoroughly understanding how the pipelining, scheduling, forwarding and hazard detection all work for that specific processor. Page 66 33
Pipelining Analogy. Pipelined laundry: overlapping execution. Parallelism improves performance. Four loads: Non-stop: Speedup = 8/3.5 = 2.3.
Pipelining Analogy Pipelined laundry: overlapping execution Parallelism improves performance Four loads: Speedup = 8/3.5 = 2.3 Non-stop: Speedup =2n/05n+15 2n/0.5n 1.5 4 = number of stages 4.5 An Overview
More informationProcessor (II) - pipelining. Hwansoo Han
Processor (II) - pipelining Hwansoo Han Pipelining Analogy Pipelined laundry: overlapping execution Parallelism improves performance Four loads: Speedup = 8/3.5 =2.3 Non-stop: 2n/0.5n + 1.5 4 = number
More informationDepartment of Computer and IT Engineering University of Kurdistan. Computer Architecture Pipelining. By: Dr. Alireza Abdollahpouri
Department of Computer and IT Engineering University of Kurdistan Computer Architecture Pipelining By: Dr. Alireza Abdollahpouri Pipelined MIPS processor Any instruction set can be implemented in many
More informationFull Datapath. Chapter 4 The Processor 2
Pipelining Full Datapath Chapter 4 The Processor 2 Datapath With Control Chapter 4 The Processor 3 Performance Issues Longest delay determines clock period Critical path: load instruction Instruction memory
More informationFull Datapath. Chapter 4 The Processor 2
Pipelining Full Datapath Chapter 4 The Processor 2 Datapath With Control Chapter 4 The Processor 3 Performance Issues Longest delay determines clock period Critical path: load instruction Instruction memory
More informationCSEE 3827: Fundamentals of Computer Systems
CSEE 3827: Fundamentals of Computer Systems Lecture 21 and 22 April 22 and 27, 2009 martha@cs.columbia.edu Amdahl s Law Be aware when optimizing... T = improved Taffected improvement factor + T unaffected
More informationChapter 4. The Processor
Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified
More informationThe Processor (3) Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University
The Processor (3) Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu EEE3050: Theory on Computer Architectures, Spring 2017, Jinkyu Jeong (jinkyu@skku.edu)
More informationChapter 4 The Processor 1. Chapter 4B. The Processor
Chapter 4 The Processor 1 Chapter 4B The Processor Chapter 4 The Processor 2 Control Hazards Branch determines flow of control Fetching next instruction depends on branch outcome Pipeline can t always
More informationCOMPUTER ORGANIZATION AND DESIGN
COMPUTER ORGANIZATION AND DESIGN 5 Edition th The Hardware/Software Interface Chapter 4 The Processor 4.1 Introduction Introduction CPU performance factors Instruction count CPI and Cycle time Determined
More informationLecture Topics. Announcements. Today: Data and Control Hazards (P&H ) Next: continued. Exam #1 returned. Milestone #5 (due 2/27)
Lecture Topics Today: Data and Control Hazards (P&H 4.7-4.8) Next: continued 1 Announcements Exam #1 returned Milestone #5 (due 2/27) Milestone #6 (due 3/13) 2 1 Review: Pipelined Implementations Pipelining
More informationPipeline Hazards. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University
Pipeline Hazards Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Hazards What are hazards? Situations that prevent starting the next instruction
More informationCOMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 4. The Processor
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle
More informationCOMPUTER ORGANIZATION AND DESIGN
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle
More informationLecture 3: The Processor (Chapter 4 of textbook) Chapter 4.1
Lecture 3: The Processor (Chapter 4 of textbook) Chapter 4.1 Introduction Chapter 4.1 Chapter 4.2 Review: MIPS (RISC) Design Principles Simplicity favors regularity fixed size instructions small number
More informationLECTURE 3: THE PROCESSOR
LECTURE 3: THE PROCESSOR Abridged version of Patterson & Hennessy (2013):Ch.4 Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU
More informationECE260: Fundamentals of Computer Engineering
Data Hazards in a Pipelined Datapath James Moscola Dept. of Engineering & Computer Science York College of Pennsylvania Based on Computer Organization and Design, 5th Edition by Patterson & Hennessy Data
More information14:332:331 Pipelined Datapath
14:332:331 Pipelined Datapath I n s t r. O r d e r Inst 0 Inst 1 Inst 2 Inst 3 Inst 4 Single Cycle Disadvantages & Advantages Uses the clock cycle inefficiently the clock cycle must be timed to accommodate
More informationThomas Polzer Institut für Technische Informatik
Thomas Polzer tpolzer@ecs.tuwien.ac.at Institut für Technische Informatik Pipelined laundry: overlapping execution Parallelism improves performance Four loads: Speedup = 8/3.5 = 2.3 Non-stop: Speedup =
More informationDetermined by ISA and compiler. We will examine two MIPS implementations. A simplified version A more realistic pipelined version
MIPS Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified
More informationChapter 4. The Processor
Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified
More informationECE473 Computer Architecture and Organization. Pipeline: Data Hazards
Computer Architecture and Organization Pipeline: Data Hazards Lecturer: Prof. Yifeng Zhu Fall, 2015 Portions of these slides are derived from: Dave Patterson UCB Lec 14.1 Pipelining Outline Introduction
More informationChapter 4. The Processor
Chapter 4 The Processor 1 Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A
More informationComputer Architecture Computer Science & Engineering. Chapter 4. The Processor BK TP.HCM
Computer Architecture Computer Science & Engineering Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware
More informationComputer Architecture. Lecture 6.1: Fundamentals of
CS3350B Computer Architecture Winter 2015 Lecture 6.1: Fundamentals of Instructional Level Parallelism Marc Moreno Maza www.csd.uwo.ca/courses/cs3350b [Adapted from lectures on Computer Organization and
More informationCS 110 Computer Architecture. Pipelining. Guest Lecture: Shu Yin. School of Information Science and Technology SIST
CS 110 Computer Architecture Pipelining Guest Lecture: Shu Yin http://shtech.org/courses/ca/ School of Information Science and Technology SIST ShanghaiTech University Slides based on UC Berkley's CS61C
More informationEIE/ENE 334 Microprocessors
EIE/ENE 334 Microprocessors Lecture 6: The Processor Week #06/07 : Dejwoot KHAWPARISUTH Adapted from Computer Organization and Design, 4 th Edition, Patterson & Hennessy, 2009, Elsevier (MK) http://webstaff.kmutt.ac.th/~dejwoot.kha/
More informationDesign a MIPS Processor (2/2)
93-2Digital System Design Design a MIPS Processor (2/2) Lecturer: Chihhao Chao Advisor: Prof. An-Yeu Wu 2005/5/13 Friday ACCESS IC LABORTORY Outline v 6.1 An Overview of Pipelining v 6.2 A Pipelined Datapath
More informationChapter 4. The Processor
Chapter 4 The Processor 4.1 Introduction Introduction CPU performance factors Instruction count CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations Determined by ISA
More informationCPS104 Computer Organization and Programming Lecture 19: Pipelining. Robert Wagner
CPS104 Computer Organization and Programming Lecture 19: Pipelining Robert Wagner cps 104 Pipelining..1 RW Fall 2000 Lecture Overview A Pipelined Processor : Introduction to the concept of pipelined processor.
More informationChapter 4 (Part II) Sequential Laundry
Chapter 4 (Part II) The Processor Baback Izadi Division of Engineering Programs bai@engr.newpaltz.edu Sequential Laundry 6 P 7 8 9 10 11 12 1 2 A T a s k O r d e r A B C D 30 30 30 30 30 30 30 30 30 30
More informationMIPS Pipelining. Computer Organization Architectures for Embedded Computing. Wednesday 8 October 14
MIPS Pipelining Computer Organization Architectures for Embedded Computing Wednesday 8 October 14 Many slides adapted from: Computer Organization and Design, Patterson & Hennessy 4th Edition, 2011, MK
More informationzhandling Data Hazards The objectives of this module are to discuss how data hazards are handled in general and also in the MIPS architecture.
zhandling Data Hazards The objectives of this module are to discuss how data hazards are handled in general and also in the MIPS architecture. We have already discussed in the previous module that true
More informationPipeline: Introduction
Pipeline: Introduction These slides are derived from: CSCE430/830 Computer Architecture course by Prof. Hong Jiang and Dave Patterson UCB Some figures and tables have been derived from : Computer System
More informationCENG 3420 Lecture 06: Pipeline
CENG 3420 Lecture 06: Pipeline Bei Yu byu@cse.cuhk.edu.hk CENG3420 L06.1 Spring 2019 Outline q Pipeline Motivations q Pipeline Hazards q Exceptions q Background: Flip-Flop Control Signals CENG3420 L06.2
More informationComputer Architecture Computer Science & Engineering. Chapter 4. The Processor BK TP.HCM
Computer Architecture Computer Science & Engineering Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware
More informationChapter 4 The Processor 1. Chapter 4A. The Processor
Chapter 4 The Processor 1 Chapter 4A The Processor Chapter 4 The Processor 2 Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware
More informationComputer Organization and Structure. Bing-Yu Chen National Taiwan University
Computer Organization and Structure Bing-Yu Chen National Taiwan University The Processor Logic Design Conventions Building a Datapath A Simple Implementation Scheme An Overview of Pipelining Pipelined
More informationPipelined datapath Staging data. CS2504, Spring'2007 Dimitris Nikolopoulos
Pipelined datapath Staging data b 55 Life of a load in the MIPS pipeline Note: both the instruction and the incremented PC value need to be forwarded in the next stage (in case the instruction is a beq)
More informationChapter 4. The Processor
Chapter 4 The Processor Recall. ISA? Instruction Fetch Instruction Decode Operand Fetch Execute Result Store Next Instruction Instruction Format or Encoding how is it decoded? Location of operands and
More informationDEE 1053 Computer Organization Lecture 6: Pipelining
Dept. Electronics Engineering, National Chiao Tung University DEE 1053 Computer Organization Lecture 6: Pipelining Dr. Tian-Sheuan Chang tschang@twins.ee.nctu.edu.tw Dept. Electronics Engineering National
More informationComputer and Information Sciences College / Computer Science Department Enhancing Performance with Pipelining
Computer and Information Sciences College / Computer Science Department Enhancing Performance with Pipelining Single-Cycle Design Problems Assuming fixed-period clock every instruction datapath uses one
More informationPipelining. Maurizio Palesi
* Pipelining * Adapted from David A. Patterson s CS252 lecture slides, http://www.cs.berkeley/~pattrsn/252s98/index.html Copyright 1998 UCB 1 References John L. Hennessy and David A. Patterson, Computer
More informationChapter 4. The Processor. Jiang Jiang
Chapter 4 The Processor Jiang Jiang jiangjiang@ic.sjtu.edu.cn [Adapted from Computer Organization and Design, 4 th Edition, Patterson & Hennessy, 2008, MK] Chapter 4 The Processor 2 Introduction CPU performance
More informationLecture 6: Pipelining
Lecture 6: Pipelining i CSCE 26 Computer Organization Instructor: Saraju P. ohanty, Ph. D. NOTE: The figures, text etc included in slides are borrowed from various books, websites, authors pages, and other
More informationCOMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition. Chapter 4. The Processor
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor The Processor? Chapter 4 The Processor 2 Introduction We will learn How the ISA determines many aspects
More information3/12/2014. Single Cycle (Review) CSE 2021: Computer Organization. Single Cycle with Jump. Multi-Cycle Implementation. Why Multi-Cycle?
CSE 2021: Computer Organization Single Cycle (Review) Lecture-10b CPU Design : Pipelining-1 Overview, Datapath and control Shakil M. Khan 2 Single Cycle with Jump Multi-Cycle Implementation Instruction:
More informationLECTURE 9. Pipeline Hazards
LECTURE 9 Pipeline Hazards PIPELINED DATAPATH AND CONTROL In the previous lecture, we finalized the pipelined datapath for instruction sequences which do not include hazards of any kind. Remember that
More informationPipelined Datapath. Reading. Sections Practice Problems: 1, 3, 8, 12
Pipelined Datapath Lecture notes from KP, H. H. Lee and S. Yalamanchili Sections 4.5 4. Practice Problems:, 3, 8, 2 ing Note: Appendices A-E in the hardcopy text correspond to chapters 7- in the online
More informationPipelining, Instruction Level Parallelism and Memory in Processors. Advanced Topics ICOM 4215 Computer Architecture and Organization Fall 2010
Pipelining, Instruction Level Parallelism and Memory in Processors Advanced Topics ICOM 4215 Computer Architecture and Organization Fall 2010 NOTE: The material for this lecture was taken from several
More informationPipelined Datapath. Reading. Sections Practice Problems: 1, 3, 8, 12 (2) Lecture notes from MKP, H. H. Lee and S.
Pipelined Datapath Lecture notes from KP, H. H. Lee and S. Yalamanchili Sections 4.5 4. Practice Problems:, 3, 8, 2 ing (2) Pipeline Performance Assume time for stages is ps for register read or write
More informationECS 154B Computer Architecture II Spring 2009
ECS 154B Computer Architecture II Spring 2009 Pipelining Datapath and Control 6.2-6.3 Partially adapted from slides by Mary Jane Irwin, Penn State And Kurtis Kredo, UCD Pipelined CPU Break execution into
More informationELE 655 Microprocessor System Design
ELE 655 Microprocessor System Design Section 2 Instruction Level Parallelism Class 1 Basic Pipeline Notes: Reg shows up two places but actually is the same register file Writes occur on the second half
More informationModern Computer Architecture
Modern Computer Architecture Lecture2 Pipelining: Basic and Intermediate Concepts Hongbin Sun 国家集成电路人才培养基地 Xi an Jiaotong University Pipelining: Its Natural! Laundry Example Ann, Brian, Cathy, Dave each
More informationComputer Systems Architecture Spring 2016
Computer Systems Architecture Spring 2016 Lecture 01: Introduction Shuai Wang Department of Computer Science and Technology Nanjing University [Adapted from Computer Architecture: A Quantitative Approach,
More informationMidnight Laundry. IC220 Set #19: Laundry, Co-dependency, and other Hazards of Modern (Architecture) Life. Return to Chapter 4
IC220 Set #9: Laundry, Co-dependency, and other Hazards of Modern (Architecture) Life Return to Chapter 4 Midnight Laundry Task order A B C D 6 PM 7 8 9 0 2 2 AM 2 Smarty Laundry Task order A B C D 6 PM
More informationOutline. A pipelined datapath Pipelined control Data hazards and forwarding Data hazards and stalls Branch (control) hazards Exception
Outline A pipelined datapath Pipelined control Data hazards and forwarding Data hazards and stalls Branch (control) hazards Exception 1 4 Which stage is the branch decision made? Case 1: 0 M u x 1 Add
More information1 Hazards COMP2611 Fall 2015 Pipelined Processor
1 Hazards Dependences in Programs 2 Data dependence Example: lw $1, 200($2) add $3, $4, $1 add can t do ID (i.e., read register $1) until lw updates $1 Control dependence Example: bne $1, $2, target add
More informationPipelining. Ideal speedup is number of stages in the pipeline. Do we achieve this? 2. Improve performance by increasing instruction throughput ...
CHAPTER 6 1 Pipelining Instruction class Instruction memory ister read ALU Data memory ister write Total (in ps) Load word 200 100 200 200 100 800 Store word 200 100 200 200 700 R-format 200 100 200 100
More informationLecture 3. Pipelining. Dr. Soner Onder CS 4431 Michigan Technological University 9/23/2009 1
Lecture 3 Pipelining Dr. Soner Onder CS 4431 Michigan Technological University 9/23/2009 1 A "Typical" RISC ISA 32-bit fixed format instruction (3 formats) 32 32-bit GPR (R0 contains zero, DP take pair)
More informationCOMPUTER ORGANIZATION AND DESI
COMPUTER ORGANIZATION AND DESIGN 5 Edition th The Hardware/Software Interface Chapter 4 The Processor 4.1 Introduction Introduction CPU performance factors Instruction count Determined by ISA and compiler
More informationChapter 8. Pipelining
Chapter 8. Pipelining Overview Pipelining is widely used in modern processors. Pipelining improves system performance in terms of throughput. Pipelined organization requires sophisticated compilation techniques.
More informationChapter 4. The Processor
Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified
More informationChapter 3 & Appendix C Pipelining Part A: Basic and Intermediate Concepts
CS359: Computer Architecture Chapter 3 & Appendix C Pipelining Part A: Basic and Intermediate Concepts Yanyan Shen Department of Computer Science and Engineering Shanghai Jiao Tong University Parallel
More informationCOMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle
More informationOrange Coast College. Business Division. Computer Science Department. CS 116- Computer Architecture. Pipelining
Orange Coast College Business Division Computer Science Department CS 116- Computer Architecture Pipelining Recall Pipelining is parallelizing execution Key to speedups in processors Split instruction
More informationImprove performance by increasing instruction throughput
Improve performance by increasing instruction throughput Program execution order Time (in instructions) lw $1, 100($0) fetch 2 4 6 8 10 12 14 16 18 ALU Data access lw $2, 200($0) 8ns fetch ALU Data access
More informationSI232 Set #20: Laundry, Co-dependency, and other Hazards of Modern (Architecture) Life. Chapter 6 ADMIN. Reading for Chapter 6: 6.1,
SI232 Set #20: Laundry, Co-dependency, and other Hazards of Modern (Architecture) Life Chapter 6 ADMIN ing for Chapter 6: 6., 6.9-6.2 2 Midnight Laundry Task order A 6 PM 7 8 9 0 2 2 AM B C D 3 Smarty
More informationData Hazards Compiler Scheduling Pipeline scheduling or instruction scheduling: Compiler generates code to eliminate hazard
Data Hazards Compiler Scheduling Pipeline scheduling or instruction scheduling: Compiler generates code to eliminate hazard Consider: a = b + c; d = e - f; Assume loads have a latency of one clock cycle:
More informationBasic Instruction Timings. Pipelining 1. How long would it take to execute the following sequence of instructions?
Basic Instruction Timings Pipelining 1 Making some assumptions regarding the operation times for some of the basic hardware units in our datapath, we have the following timings: Instruction class Instruction
More informationCS 61C: Great Ideas in Computer Architecture Pipelining and Hazards
CS 61C: Great Ideas in Computer Architecture Pipelining and Hazards Instructors: Vladimir Stojanovic and Nicholas Weaver http://inst.eecs.berkeley.edu/~cs61c/sp16 1 Pipelined Execution Representation Time
More informationECEC 355: Pipelining
ECEC 355: Pipelining November 8, 2007 What is Pipelining Pipelining is an implementation technique whereby multiple instructions are overlapped in execution. A pipeline is similar in concept to an assembly
More informationCPE 335 Computer Organization. Basic MIPS Pipelining Part I
CPE 335 Computer Organization Basic MIPS Pipelining Part I Dr. Iyad Jafar Adapted from Dr. Gheith Abandah slides http://www.abandah.com/gheith/courses/cpe335_s08/index.html CPE232 Basic MIPS Pipelining
More informationCOMP2611: Computer Organization. The Pipelined Processor
COMP2611: Computer Organization The 1 2 Background 2 High-Performance Processors 3 Two techniques for designing high-performance processors by exploiting parallelism: Multiprocessing: parallelism among
More informationTopics. Lecture 12: Pipelining. Introduction to pipelining. Pipelined datapath. Hazards in pipeline. Performance. Design issues.
Lecture 2: Pipelining Topics Introduction to pipelining Performance Pipelined datapath Design issues Hazards in pipeline Types Solutions Pipelining is Natural! Laundry Example Use case scenario Ann, Brian,
More informationLecture 05: Pipelining: Basic/ Intermediate Concepts and Implementation
Lecture 05: Pipelining: Basic/ Intermediate Concepts and Implementation CSE 564 Computer Architecture Summer 2017 Department of Computer Science and Engineering Yonghong Yan yan@oakland.edu www.secs.oakland.edu/~yan
More informationCOMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle
More informationEITF20: Computer Architecture Part2.2.1: Pipeline-1
EITF20: Computer Architecture Part2.2.1: Pipeline-1 Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Pipelining Harzards Structural hazards Data hazards Control hazards Implementation issues Multi-cycle
More informationECE260: Fundamentals of Computer Engineering
Pipelining James Moscola Dept. of Engineering & Computer Science York College of Pennsylvania Based on Computer Organization and Design, 5th Edition by Patterson & Hennessy What is Pipelining? Pipelining
More informationPage 1. Pipelining: Its Natural! Chapter 3. Pipelining. Pipelined Laundry Start work ASAP. Sequential Laundry A B C D. 6 PM Midnight
Pipelining: Its Natural! Chapter 3 Pipelining Laundry Example Ann, Brian, Cathy, Dave each have one load of clothes to wash, dry, and fold Washer takes 30 minutes A B C D Dryer takes 40 minutes Folder
More informationSome material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier
Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier Science 6 PM 7 8 9 10 11 Midnight Time 30 40 20 30 40 20
More informationECE232: Hardware Organization and Design
ECE232: Hardware Organization and Design Lecture 17: Pipelining Wrapup Adapted from Computer Organization and Design, Patterson & Hennessy, UCB Outline The textbook includes lots of information Focus on
More informationCS 61C: Great Ideas in Computer Architecture Control and Pipelining
CS 6C: Great Ideas in Computer Architecture Control and Pipelining Instructors: Vladimir Stojanovic and Nicholas Weaver http://inst.eecs.berkeley.edu/~cs6c/sp6 Datapath Control Signals ExtOp: zero, sign
More informationPipeline Processors David Rye :: MTRX3700 Pipelining :: Slide 1 of 15
Pipeline Processors Pipelining :: Slide 1 of 15 Pipeline Processors A common feature of modern processors Works like a series production line An operation is divided into k decoupled (independent) elementary
More informationChapter Six. Dataı access. Reg. Instructionı. fetch. Dataı. Reg. access. Dataı. Reg. access. Dataı. Instructionı fetch. 2 ns 2 ns 2 ns 2 ns 2 ns
Chapter Si Pipelining Improve perfomance by increasing instruction throughput eecutionı Time lw $, ($) 2 6 8 2 6 8 access lw $2, 2($) 8 ns access lw $3, 3($) eecutionı Time lw $, ($) lw $2, 2($) 2 ns 8
More informationChapter 4. Instruction Execution. Introduction. CPU Overview. Multiplexers. Chapter 4 The Processor 1. The Processor.
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor The Processor - Introduction
More informationPipelining: Overview. CPSC 252 Computer Organization Ellen Walker, Hiram College
Pipelining: Overview CPSC 252 Computer Organization Ellen Walker, Hiram College Pipelining the Wash Divide into 4 steps: Wash, Dry, Fold, Put Away Perform the steps in parallel Wash 1 Wash 2, Dry 1 Wash
More informationCOMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition The Processor - Introduction
More informationThe Processor. Z. Jerry Shi Department of Computer Science and Engineering University of Connecticut. CSE3666: Introduction to Computer Architecture
The Processor Z. Jerry Shi Department of Computer Science and Engineering University of Connecticut CSE3666: Introduction to Computer Architecture Introduction CPU performance factors Instruction count
More informationPipelining. CSC Friday, November 6, 2015
Pipelining CSC 211.01 Friday, November 6, 2015 Performance Issues Longest delay determines clock period Critical path: load instruction Instruction memory register file ALU data memory register file Not
More informationCISC 662 Graduate Computer Architecture Lecture 6 - Hazards
CISC 662 Graduate Computer Architecture Lecture 6 - Hazards Michela Taufer http://www.cis.udel.edu/~taufer/teaching/cis662f07 Powerpoint Lecture Notes from John Hennessy and David Patterson s: Computer
More informationEITF20: Computer Architecture Part2.2.1: Pipeline-1
EITF20: Computer Architecture Part2.2.1: Pipeline-1 Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Pipelining Harzards Structural hazards Data hazards Control hazards Implementation issues Multi-cycle
More informationCOSC121: Computer Systems. ISA and Performance
COSC121: Computer Systems. ISA and Performance Jeremy Bolton, PhD Assistant Teaching Professor Constructed using materials: - Patt and Patel Introduction to Computing Systems (2nd) - Patterson and Hennessy
More informationMIPS An ISA for Pipelining
Pipelining: Basic and Intermediate Concepts Slides by: Muhamed Mudawar CS 282 KAUST Spring 2010 Outline: MIPS An ISA for Pipelining 5 stage pipelining i Structural Hazards Data Hazards & Forwarding Branch
More informationECE154A Introduction to Computer Architecture. Homework 4 solution
ECE154A Introduction to Computer Architecture Homework 4 solution 4.16.1 According to Figure 4.65 on the textbook, each register located between two pipeline stages keeps data shown below. Register IF/ID
More informationEE557--FALL 1999 MIDTERM 1. Closed books, closed notes
NAME: SOLUTIONS STUDENT NUMBER: EE557--FALL 1999 MIDTERM 1 Closed books, closed notes GRADING POLICY: The front page of your exam shows your total numerical score out of 75. The highest numerical score
More informationECE331: Hardware Organization and Design
ECE331: Hardware Organization and Design Lecture 27: Midterm2 review Adapted from Computer Organization and Design, Patterson & Hennessy, UCB Midterm 2 Review Midterm will cover Section 1.6: Processor
More information微算機系統第六章. Enhancing Performance with Pipelining 陳伯寧教授電信工程學系國立交通大學. Ann, Brian, Cathy, Dave each have one load of clothes to wash, dry, and fold
微算機系統第六章 Enhancing Performance with Pipelining 陳伯寧教授電信工程學系國立交通大學 chap6- Pipeline is natural! Laundry Example Ann, Brian, athy, Dave each have one load of clothes to wash, dry, and fold A B D Washer takes
More informationCS 2506 Computer Organization II Test 2. Do not start the test until instructed to do so! printed
Instructions: Print your name in the space provided below. This examination is closed book and closed notes, aside from the permitted fact sheet, with a restriction: 1) one 8.5x11 sheet, both sides, handwritten
More informationELCT 501: Digital System Design
ELCT 501: Digital System Lecture 8: Pipelining Dr. Mohamed Abd El Ghany, Pipelining: Its Natural! Laundry Example Ann, brian, cathy, Dave each have one load of clothes to wash, dry, and fold Washer takes
More information