Instr. execution impl. view

Similar documents
Instruction Pipelining Review

Minimizing Data hazard Stalls by Forwarding Data Hazard Classification Data Hazards Present in Current MIPS Pipeline

Pipelining. CSC Friday, November 6, 2015

3/12/2014. Single Cycle (Review) CSE 2021: Computer Organization. Single Cycle with Jump. Multi-Cycle Implementation. Why Multi-Cycle?

Full Datapath. Chapter 4 The Processor 2

MIPS Pipelining. Computer Organization Architectures for Embedded Computing. Wednesday 8 October 14

Instruction Level Parallelism. ILP, Loop level Parallelism Dependences, Hazards Speculation, Branch prediction

ECE 486/586. Computer Architecture. Lecture # 12

Instruction Level Parallelism. Appendix C and Chapter 3, HP5e

LECTURE 3: THE PROCESSOR

Predict Not Taken. Revisiting Branch Hazard Solutions. Filling the delay slot (e.g., in the compiler) Delayed Branch

COSC 6385 Computer Architecture - Pipelining

Lecture 15: Pipelining. Spring 2018 Jason Tang

Full Datapath. Chapter 4 The Processor 2

1 Hazards COMP2611 Fall 2015 Pipelined Processor

Computer Architecture

COMPUTER ORGANIZATION AND DESI

Chapter 4. The Processor

COMPUTER ORGANIZATION AND DESIGN

EITF20: Computer Architecture Part2.2.1: Pipeline-1

CENG 3531 Computer Architecture Spring a. T / F A processor can have different CPIs for different programs.

CS 61C: Great Ideas in Computer Architecture Pipelining and Hazards

EITF20: Computer Architecture Part2.2.1: Pipeline-1

Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier

CSEE 3827: Fundamentals of Computer Systems

Processor (II) - pipelining. Hwansoo Han

What is Pipelining? Time per instruction on unpipelined machine Number of pipe stages

Chapter 4 The Processor 1. Chapter 4B. The Processor

Chapter 4. The Processor

Modern Computer Architecture

Basic Pipelining Concepts

The Processor: Improving the performance - Control Hazards

Appendix C: Pipelining: Basic and Intermediate Concepts

Department of Computer and IT Engineering University of Kurdistan. Computer Architecture Pipelining. By: Dr. Alireza Abdollahpouri

Data Hazards Compiler Scheduling Pipeline scheduling or instruction scheduling: Compiler generates code to eliminate hazard

COSC4201 Pipelining. Prof. Mokhtar Aboelaze York University

Pipeline Review. Review

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor

Pipelining Analogy. Pipelined laundry: overlapping execution. Parallelism improves performance. Four loads: Non-stop: Speedup = 8/3.5 = 2.3.

There are different characteristics for exceptions. They are as follows:

Chapter 4. Instruction Execution. Introduction. CPU Overview. Multiplexers. Chapter 4 The Processor 1. The Processor.

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor

Advanced issues in pipelining

Lecture 3. Pipelining. Dr. Soner Onder CS 4431 Michigan Technological University 9/23/2009 1

What is Pipelining? RISC remainder (our assumptions)

ECE260: Fundamentals of Computer Engineering

Pipelining concepts The DLX architecture A simple DLX pipeline Pipeline Hazards and Solution to overcome

Pipeline Overview. Dr. Jiang Li. Adapted from the slides provided by the authors. Jiang Li, Ph.D. Department of Computer Science

EITF20: Computer Architecture Part2.2.1: Pipeline-1

ECE473 Computer Architecture and Organization. Pipeline: Control Hazard

CS 110 Computer Architecture. Pipelining. Guest Lecture: Shu Yin. School of Information Science and Technology SIST

MIPS An ISA for Pipelining

Pipelined Processor Design

Pipelining concepts The DLX architecture A simple DLX pipeline Pipeline Hazards and Solution to overcome

Advanced Parallel Architecture Lessons 5 and 6. Annalisa Massini /2017

Overview. Appendix A. Pipelining: Its Natural! Sequential Laundry 6 PM Midnight. Pipelined Laundry: Start work ASAP

Chapter 3. Pipelining. EE511 In-Cheol Park, KAIST

Appendix A. Overview

Lecture 7: Pipelining Contd. More pipelining complications: Interrupts and Exceptions

CIS 662: Midterm. 16 cycles, 6 stalls

ELE 655 Microprocessor System Design

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 4. The Processor

Lecture 3: The Processor (Chapter 4 of textbook) Chapter 4.1

Computer Architecture Spring 2016

CMCS Mohamed Younis CMCS 611, Advanced Computer Architecture 1

LECTURE 10. Pipelining: Advanced ILP

Instruction word R0 R1 R2 R3 R4 R5 R6 R8 R12 R31

CISC 662 Graduate Computer Architecture Lecture 6 - Hazards

COMP2611: Computer Organization. The Pipelined Processor

EI338: Computer Systems and Engineering (Computer Architecture & Operating Systems)

Computer Systems Architecture I. CSE 560M Lecture 5 Prof. Patrick Crowley

Lecture 9: Case Study MIPS R4000 and Introduction to Advanced Pipelining Professor Randy H. Katz Computer Science 252 Spring 1996

4. The Processor Computer Architecture COMP SCI 2GA3 / SFWR ENG 2GA3. Emil Sekerinski, McMaster University, Fall Term 2015/16

Lecture Topics. Announcements. Today: Data and Control Hazards (P&H ) Next: continued. Exam #1 returned. Milestone #5 (due 2/27)

ECE 154A Introduction to. Fall 2012

Pipelining: Basic and Intermediate Concepts

5008: Computer Architecture HW#2

ENGN 2910A Homework 03 (140 points) Due Date: Oct 3rd 2013

CMCS Mohamed Younis CMCS 611, Advanced Computer Architecture 1

Pipelined Processor Design

Computer Architecture Computer Science & Engineering. Chapter 4. The Processor BK TP.HCM

Multi-cycle Instructions in the Pipeline (Floating Point)

Improving Performance: Pipelining

Determined by ISA and compiler. We will examine two MIPS implementations. A simplified version A more realistic pipelined version

Week 11: Assignment Solutions

Unresolved data hazards. CS2504, Spring'2007 Dimitris Nikolopoulos

Orange Coast College. Business Division. Computer Science Department. CS 116- Computer Architecture. Pipelining

COMPUTER ORGANIZATION AND DESIGN

Pipelining. Principles of pipelining. Simple pipelining. Structural Hazards. Data Hazards. Control Hazards. Interrupts. Multicycle operations

The Processor Pipeline. Chapter 4, Patterson and Hennessy, 4ed. Section 5.3, 5.4: J P Hayes.

ECE 505 Computer Architecture

The Processor: Instruction-Level Parallelism

Four Steps of Speculative Tomasulo cycle 0

Computer Architecture. Lecture 6.1: Fundamentals of

Unpipelined Machine. Pipelining the Idea. Pipelining Overview. Pipelined Machine. MIPS Unpipelined. Similar to assembly line in a factory

Performance of tournament predictors In the last lecture, we saw the design of the tournament predictor used by the Alpha

ECEC 355: Pipelining

Pipelining. Pipeline performance

Processor Architecture. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Pipelining. Maurizio Palesi

Transcription:

Pipelining Sangyeun Cho Computer Science Department Instr. execution impl. view Single (long) cycle implementation Multi-cycle implementation Pipelined implementation

Processing an instruction Fetch instruction from memory Separate instruction memory ( Harvard architecture) vs. single memory ( Von Neumann architecture) Decode instruction Read operands Perform specified computation Access memory This need be done before computation if memory provides an operand Address computation must be done before accessing memory Update machine state Register or memory Processing an instruction

Pipelining datapath pipeline registers Pipelining IF ID EX MEM WB cc1 cc2 cc3 cc4 cc5 cc6 cc7 cc8 cc9 ADD IF ID EX MEM WB SUB IF ID EX MEM WB LOAD IF ID EX MEM WB AND IF ID EX MEM WB OR IF ID EX MEM

Pipelining facts Pipeline increases instruction throughput Does not improve instruction latency Clock cycle time is limited by the longest pipeline stage Multiple instructions are overlapped and are in different stages Potential speedup = # of stages Time to fill pipeline and time to drain may affect performance Pipelining hazards Hazards deter instruction execution Structural hazard: HW can t support a specific sequence of instructions due to lack of resources; later instruction must wait Data hazard: an instruction can t proceed, waiting for an operand produced by a prior instruction still in execution Control hazard: can t decide if an instruction is going to be executed because a prior control instruction is still in execution Pipeline interlock Hardware mechanism to detect hazard conditions and prevent instructions from being unduly executed

Tackling hazards Ensuring correctness Hardware interlock Automatic detection of hazard conditions and enforcement of safe instruction execution Software approach Compiler inserts NOP instructions Improving performance Data forwarding Not through register file, but through dedicated forwarding paths Software approach Compiler (statically) schedules instructions to minimize stall times due to hazards Data hazard example

Data forwarding hardware Data forwarding example

Control hazard example branch I1 I2 I3 correct instr. Impact of branch stalls When (base) CPI = 1, 20% of instructions are branches, 3- cycle stall per branch CPI = 1 + 20% 3 = 1.6 How to minimize the impact? Reduce stall delay Determine early if branch is taken or not Compute branch target address early (Branch prediction) Example Move zero test to ID stage Provide a separate adder to calculate new PC in ID stage

Modified pipeline added hardware Branch processing strategies Stall until branch direction is known Predict NOT TAKEN Execute fall-off instructions that follow Squash (or cancel) instructions in pipeline if branch is actually taken (PC+4) already computed, so use it to get next instruction Predict TAKEN As soon as the target address is available, fetch from there 67% MIPS branches taken on average Delayed branch

Delayed branch Assume that we have N (delay) slots after a branch The instructions in the slots are executed regardless of the branch outcome If you don t have instructions to fill the slots, put NOPs there Simple hardware no branch interlock Possibly more performance if slots are filled with instructions Compiler will have to fill the slots Code size will increase Delayed branch

Precise exception Exception (or fault) Exception is a condition which changes the control flow of processor e.g., Page fault, TLB miss, divide-by-zero, software trap, undefined opcode, memory protection error, Processor mode change may occur c.f., interrupt Instructions may generate exceptions at different pipeline stages Pipeline implements precise exception if the pipeline can be stopped so that the instructions before the faulting instructions are completed and those after it can be restarted from scratch Implementing precise exception can be complicated if out-of-order completion is desired and if instructions take various cycles (e.g., floating-point or complex instructions) Smith and Pleszkun (course web page) discusses how to implement precise exception in pipelined processors Pipeline key points Hazards result in inefficiency in pipelined instruction execution Higher CPI Data hazards require that dependent instructions wait for operands to become available Data forwarding Stalls may still be required in certain cases Control hazards require control-dependent instructions wait for the branch outcome to be resolved Out-of-order execution (in-order completion) techniques have been used to improve the efficiency of pipelining