Unresolved data hazards. CS2504, Spring 2007, Dimitris Nikolopoulos

Unresolved data hazards
Arithmetic instructions that follow a load and read the register updated by the load:
    if (ID/EX.MemRead and
        ((ID/EX.RegisterRt = IF/ID.RegisterRs) or
         (ID/EX.RegisterRt = IF/ID.RegisterRt)))
      stall the pipeline
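The stall condition above can be written as a small predicate. This is a minimal sketch in Python, not the hardware itself; the class names `IDEX` and `IFID` and their field names are illustrative stand-ins for the pipeline register fields named in the condition.

```python
from dataclasses import dataclass


@dataclass
class IDEX:
    """Hypothetical model of the ID/EX pipeline register fields used here."""
    mem_read: bool    # ID/EX.MemRead: the instruction in EX is a load
    register_rt: int  # ID/EX.RegisterRt: the register the load will write


@dataclass
class IFID:
    """Hypothetical model of the IF/ID pipeline register fields used here."""
    register_rs: int  # IF/ID.RegisterRs: first source of the instruction in ID
    register_rt: int  # IF/ID.RegisterRt: second source of the instruction in ID


def load_use_hazard(id_ex: IDEX, if_id: IFID) -> bool:
    """Return True when the instruction in ID must stall behind a load in EX."""
    return id_ex.mem_read and (
        id_ex.register_rt == if_id.register_rs
        or id_ex.register_rt == if_id.register_rt
    )


# lw $2, 20($1) in EX, followed by and $4, $2, $5 in ID: stall.
print(load_use_hazard(IDEX(True, 2), IFID(2, 5)))  # True
# add (not a load) in EX: forwarding suffices, no stall.
print(load_use_hazard(IDEX(False, 2), IFID(2, 5)))  # False
```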

How do we stall a pipeline?
A bubble is actually an instruction converted to a nop (no operation) instruction. The nop is implemented by zeroing out all control signals of an instruction in flight. The goal is to freeze the PC and prevent the instruction from having an effect. The pipeline will refetch the instruction in the next cycle!

How do we stall a pipeline?
We need a hazard detection unit to nullify instructions in the event of a hazard. Hazards are detected between loads and dependent arithmetic instructions. The detection unit zeroes the EX, MEM and WB control signals, so that the instruction does not take any effect. The detection unit also controls the PCWrite signal to freeze the PC.
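What the detection unit does in a stall cycle can be sketched as follows. This is an illustrative Python model under stated assumptions: the function names are hypothetical, the control signals are represented as a plain dictionary, and the `if_id_write` output (holding the IF/ID register so the stalled instruction is decoded again) is an assumption borrowed from the usual textbook datapath, since the slide only names PCWrite.

```python
def hazard_unit_outputs(hazard: bool, control: dict) -> tuple:
    """Return (pc_write, if_id_write, control signals passed on to ID/EX).

    On a hazard the PC and IF/ID register are held and every control
    signal is zeroed, which turns the instruction into a bubble (nop).
    """
    if hazard:
        bubble = {name: 0 for name in control}
        return False, False, bubble
    return True, True, dict(control)


def next_pc(pc: int, pc_write: bool) -> int:
    """Advance the PC by one instruction unless it is frozen."""
    return pc + 4 if pc_write else pc


# Control signals of a hypothetical R-type instruction sitting in ID.
decoded = {"RegWrite": 1, "MemRead": 0, "MemWrite": 0, "ALUSrc": 0}

pc_write, if_id_write, to_id_ex = hazard_unit_outputs(True, decoded)
print(pc_write, to_id_ex)      # PC frozen, all control signals zero
print(next_pc(40, pc_write))   # PC stays at 40, so the instruction is refetched
```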

The trouble with branches
This is a case of control dependence. We don't know which instruction actually follows the branch until the MEM stage of the branch instruction. By then, however, the pipeline will have already fetched 3 instructions from the not-taken path!

Solutions for branch hazards
Assume always not taken: fetch the next instruction in program order. When the branch is resolved:
If the branch is not taken, keep going, no problem.
If the branch is taken, we need to flush 3 instructions from the pipeline.
We use nops to discard the instructions in the IF, ID and EX stages, so that they don't have any effect.
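The assume-not-taken policy can be illustrated with a toy model of the moment the branch is resolved. This is a sketch only: the pipeline is represented as a dictionary from stage name to an instruction label, and `resolve_branch` is a hypothetical helper, not part of any real simulator.

```python
NOP = "nop"


def resolve_branch(stages: dict, taken: bool, target: int, pc: int) -> tuple:
    """Resolve a branch that has reached MEM under assume-not-taken.

    stages maps 'IF', 'ID', 'EX', 'MEM' to instruction labels, with the
    branch itself in MEM. Returns (stages after resolution, next PC).
    """
    if not taken:
        # Prediction was right: the three younger instructions are kept.
        return dict(stages), pc
    flushed = dict(stages)
    for stage in ("IF", "ID", "EX"):
        flushed[stage] = NOP  # discard the wrongly fetched instructions
    return flushed, target


pipeline = {"IF": "i3", "ID": "i2", "EX": "i1", "MEM": "beq"}
print(resolve_branch(pipeline, taken=True, target=72, pc=56))
print(resolve_branch(pipeline, taken=False, target=72, pc=56))
```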

Solutions for branch hazards
Say we move the branch logic earlier in the pipeline. The ID stage is the earliest time possible.
We need circuitry to calculate the branch target address: easy, since we have the PC and the offset from the IF stage.
We need circuitry to evaluate the branch condition: harder! The branch condition may depend on earlier instructions.
Moving the condition check earlier introduces more data hazards between the branch and the earlier instructions on which it depends!

Solutions for branch hazards
Say we move the branch logic earlier in the pipeline. We need to take care of data hazards before the branch: forward from the EX/MEM and MEM/WB pipeline registers if the branch depends on a prior instruction. A data hazard can still occur if the immediately preceding instruction produces a register that the branch uses for its comparison. At the decode stage we need to decide whether we should bypass the ALU and use the dedicated branch condition logic.
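How many cycles a branch resolved in ID has to wait can be sketched as below. The stall counts are not given in the slides; they follow the usual textbook treatment of a five-stage MIPS pipeline with forwarding into the ID-stage comparator, and the tuple encoding `(kind, dest)` is purely illustrative.

```python
def branch_stall_cycles(branch_srcs, prev1=None, prev2=None) -> int:
    """Stall cycles needed before a branch can be evaluated in ID.

    branch_srcs: registers compared by the branch.
    prev1: (kind, dest) of the instruction immediately before the branch.
    prev2: (kind, dest) of the instruction two before the branch.
    kind is 'alu' or 'load'; dest is the register written (or None).
    """
    stalls = 0
    if prev1 is not None and prev1[1] in branch_srcs:
        # ALU result exists after EX (1 stall); load data after MEM (2 stalls).
        stalls = max(stalls, 2 if prev1[0] == "load" else 1)
    if prev2 is not None and prev2[1] in branch_srcs:
        # A load two instructions back is still in MEM (1 stall);
        # an ALU result can already be forwarded from EX/MEM (no stall).
        stalls = max(stalls, 1 if prev2[0] == "load" else 0)
    return stalls


# add $1, $2, $3 ; beq $1, $4, L
print(branch_stall_cycles({1, 4}, prev1=("alu", 1)))   # 1
# lw $1, 0($2) ; beq $1, $4, L
print(branch_stall_cycles({1, 4}, prev1=("load", 1)))  # 2
```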

Reducing branch delay
Moving the branch target and condition calculation to the ID stage reduces the branch stall in the taken case to one cycle. This still creates a bubble in the pipeline. We can use this bubble if we have a useful instruction to execute that is independent of the branch, i.e. one that belongs to neither the taken nor the not-taken path and is executed unconditionally. This is called using a branch delay slot.
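The benefit of resolving branches in ID instead of MEM can be estimated with a simple CPI model. This is a back-of-the-envelope sketch: it assumes a base CPI of 1, assume-not-taken fetching (so only taken branches pay a penalty), and the branch and taken fractions used in the example are made-up illustrative numbers, not measurements.

```python
def effective_cpi(branch_fraction: float, taken_fraction: float,
                  taken_penalty: int, base_cpi: float = 1.0) -> float:
    """CPI when every taken branch costs taken_penalty extra cycles."""
    return base_cpi + branch_fraction * taken_fraction * taken_penalty


# Assumed workload: 20% branches, 60% of them taken (illustrative only).
cpi_mem = effective_cpi(0.2, 0.6, taken_penalty=3)  # branch resolved in MEM
cpi_id = effective_cpi(0.2, 0.6, taken_penalty=1)   # branch resolved in ID
print(f"resolved in MEM: CPI = {cpi_mem:.2f}")
print(f"resolved in ID:  CPI = {cpi_id:.2f}")
```

With these assumed fractions the CPI drops from about 1.36 to about 1.12; a filled branch delay slot would remove the remaining one-cycle penalty for that branch.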

Solutions for branch hazards
Branch prediction using one bit: when we see a branch, store the PC of the branch instruction and the outcome of the branch in a table. When we see the branch again, look it up in the table and predict that its next outcome will be the same as its last outcome. How good is this?

Solutions for branch hazards
Branch prediction using one bit: consider a loop with 10 iterations that is executed twice in the program.
First time through the loop: on the first iteration there is nothing in the branch prediction table, so the prediction is random, but we store the actual outcome (taken) in the table. For the next eight iterations we predict taken, correctly. On the last iteration we predict taken, but the branch is not taken. Prediction accuracy is 80-90%, depending on the first random guess.
Second time through the loop: prediction accuracy is 80%, since the last recorded outcome was not taken, so both the first and the last iteration are mispredicted.
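The loop example can be checked with a few lines of Python. This is a minimal sketch of a one-bit predictor for a single branch; the function name and the way the first random guess is passed in as `first_guess` are illustrative choices.

```python
def one_bit_predictor(outcomes, first_guess: bool) -> tuple:
    """Run a one-bit predictor over a list of branch outcomes.

    Returns (number of correct predictions, last stored outcome).
    """
    prediction = first_guess
    correct = 0
    for taken in outcomes:
        if prediction == taken:
            correct += 1
        prediction = taken  # remember only the most recent outcome
    return correct, prediction


# Loop branch over 10 iterations: taken 9 times, then not taken on exit.
loop = [True] * 9 + [False]

print(one_bit_predictor(loop, first_guess=True))    # (9, False): 90%
print(one_bit_predictor(loop, first_guess=False))   # (8, False): 80%
# Second execution of the loop starts from the stored "not taken".
print(one_bit_predictor(loop, first_guess=False))   # (8, False): 80%
```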

Improving branch prediction
Introduce hysteresis, i.e. wait to see the same outcome of the branch twice before flipping the prediction. This handles loops and other predictable branches better, but it is still not perfect.
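One common way to get this hysteresis is a two-bit saturating counter. The slides do not say which scheme is meant, so the sketch below is an assumption: counter values 2 and 3 predict taken, 0 and 1 predict not taken, and the starting value is a free parameter.

```python
def two_bit_predictor(outcomes, counter: int = 3) -> tuple:
    """Run a two-bit saturating counter over a list of branch outcomes.

    Returns (number of correct predictions, final counter value).
    """
    correct = 0
    for taken in outcomes:
        predict_taken = counter >= 2
        if predict_taken == taken:
            correct += 1
        # Move one step toward the observed outcome, saturating at 0 and 3.
        counter = min(counter + 1, 3) if taken else max(counter - 1, 0)
    return correct, counter


loop = [True] * 9 + [False]

first = two_bit_predictor(loop, counter=3)
second = two_bit_predictor(loop, counter=first[1])
print(first)   # (9, 2): only the loop exit is mispredicted
print(second)  # (9, 2): the single not-taken outcome did not flip the prediction
```

Compared with the one-bit scheme, the second execution of the loop loses only the exit branch, because one not-taken outcome is not enough to change the prediction.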

Pipelined datapath with hazards
The hazard detection unit can hold the PC to introduce a bubble. Control flushes instructions in the IF stage to avoid branch hazards. The hazard detection unit zeroes control signals to prevent later instructions from updating state.

The trouble with exceptions
An arithmetic exception resembles a taken branch.
Control must freeze the current instruction.
Control needs to jump to an exception handler in the OS.
We need to flush instructions in flight so that they do not commit state.
We need to save the program counter (use the MIPS EPC register).
We need to handle both recoverable and irrecoverable exceptions.
Exceptions may occur at different stages of an instruction (e.g. illegal instruction vs. arithmetic overflow vs. bad memory address).

Pipelined datapath with exceptions
We reuse much of the branch hazard hardware to flush instructions. Since we have a new source of hazards, we use multiplexers to zero out control lines in the ID, EX and MEM stages. The ALU overflow signal needs to be fed back to the control unit.
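The control actions taken on an exception can be sketched as a single state update. This is a simplified model, not the actual datapath: the in-flight instructions are a list ordered oldest first, `take_exception` is a hypothetical helper, the handler address 0x80000180 is the value used in the standard MIPS textbook treatment, and EPC is assumed to hold the address of the faulting instruction itself.

```python
HANDLER_ADDRESS = 0x8000_0180  # assumed handler entry point (textbook MIPS value)


def take_exception(in_flight, faulting_index: int,
                   handler_address: int = HANDLER_ADDRESS) -> dict:
    """Squash the faulting instruction and everything younger than it.

    in_flight: list of (pc, label) tuples, oldest instruction first.
    Older instructions are left alone so they can complete normally.
    """
    epc = in_flight[faulting_index][0]  # save the address of the faulting instruction
    squashed = [
        (pc, label if i < faulting_index else "nop")
        for i, (pc, label) in enumerate(in_flight)
    ]
    return {"EPC": epc, "PC": handler_address, "in_flight": squashed}


# add at 0x4C overflows in EX; sub is older, slt and lw are younger.
pipeline = [(0x48, "sub"), (0x4C, "add"), (0x50, "slt"), (0x54, "lw")]
state = take_exception(pipeline, faulting_index=1)
print(hex(state["EPC"]), hex(state["PC"]))
print(state["in_flight"])
```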

Other considerations with exceptions
Possible sources of exceptions:
I/O device interrupt (recoverable)
OS system call (e.g. file I/O, recoverable)
Bad instruction (unrecoverable)
Hardware malfunction (unrecoverable)
We stop instructions in the middle of execution. The programmer needs to see each instruction as either committed or uncommitted. If the exception is recoverable, then we must restart the interrupted instruction and let it commit its result.

Other considerations with exceptions
Recovering state: when jumping to an exception handler we need to save the state of the program in memory. The state includes the EPC and the other registers in use by the program at the time of the exception. The state is restored upon return from the exception (if we are to recover from it).
The very same functionality, called a context switch, can be used to switch between processes (programs) on the processor. Context switching can improve processor utilization, e.g. by overlapping I/O with other computation.

A glimpse of state-of-the-art
Advanced pipelining techniques:
Multiple ALUs, functional units, multi-port memories and register files.
Processors able to execute multiple independent instructions in the same stage of the pipeline.
More hardware enables multiple instruction issue and CPI < 1 (IPC > 1).
Hardware performs dynamic dependence analysis between instructions.
Hardware uses more advanced forms of prediction, not only for branches, but also for data read from memory.