ECS 154B Computer Architecture II Spring 2009

Pipelining: Datapath and Control (6.2-6.3)
Partially adapted from slides by Mary Jane Irwin, Penn State, and Kurtis Kredo, UCD

Pipelined CPU
- Break execution into five stages: IM, Reg, ALU, DM, Reg
- Corresponds to the five instruction cycles
  - Fetch from instruction memory (IM)
  - Decode and fetch registers (Reg)
  - Execute the operation in the ALU
  - Access data memory (DM)
  - Write result back to register (Reg)
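As a rough illustration of how the five stages overlap, the sketch below computes which stage each instruction occupies in each clock cycle. The function names are made up for this example, and it assumes an ideal pipeline that never stalls.

```python
# Stage names in the order an instruction passes through them.
STAGES = ["IM", "Reg", "ALU", "DM", "Reg"]


def stage_in_cycle(instr_index, cycle):
    """Stage occupied by instruction `instr_index` (0-based) during clock
    `cycle` (1-based), or None if it is not in the pipeline that cycle."""
    stage = cycle - 1 - instr_index  # instruction i is fetched in cycle i + 1
    if 0 <= stage < len(STAGES):
        return STAGES[stage]
    return None


def total_cycles(num_instructions):
    """The first instruction takes five cycles; each later one finishes
    one cycle after its predecessor (ideal pipeline, no stalls)."""
    return len(STAGES) + num_instructions - 1


def pipeline_chart(num_instructions):
    """One row per instruction, one column per clock cycle."""
    cycles = range(1, total_cycles(num_instructions) + 1)
    return [[stage_in_cycle(i, c) or "-" for c in cycles]
            for i in range(num_instructions)]


if __name__ == "__main__":
    for row in pipeline_chart(3):
        print(" ".join(f"{s:>3}" for s in row))
```

Each row of the chart is the previous row shifted right by one cycle, which is the overlap that lets the pipeline approach one completed instruction per cycle.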

State Registers
- How do we store values across pipeline stages?
- Multi-cycle MIPS introduced state registers: IR, MDR, A, B
- Sufficient for a pipelined CPU?
[Figure: multi-cycle datapath with the PC, memory, register file, and ALU, showing the IR, MDR, A, B, and ALUOut state registers]

Pipelined State Registers
- Each instruction must maintain its own state
- Any information needed by a later stage must be passed along
- Consider PC + 4
  - Computed during Fetch stage (why?)
  - Needed by Execute stage (why?)
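One way to picture the PC + 4 case is the sketch below. The class and function names are hypothetical, and it assumes, as in the datapath on these slides, that the branch target adder sits in the Execute stage. PC + 4 is produced in Fetch, copied unchanged through IF/ID and ID/EX, and only then combined with the shifted immediate.

```python
from dataclasses import dataclass


@dataclass
class IFID:
    """Hypothetical model of the IF/ID state register."""
    pc_plus4: int
    instruction: int


@dataclass
class IDEX:
    """Hypothetical model of the ID/EX state register (only two fields shown)."""
    pc_plus4: int
    sign_ext_imm: int


def fetch(pc, instruction_memory):
    # PC + 4 is computed here because the next fetch needs it right away.
    return IFID(pc_plus4=pc + 4, instruction=instruction_memory[pc])


def decode(if_id):
    imm16 = if_id.instruction & 0xFFFF
    # Sign-extend the 16-bit immediate.
    sign_ext = imm16 - 0x10000 if imm16 & 0x8000 else imm16
    # PC + 4 is not used in Decode; it is simply passed along.
    return IDEX(pc_plus4=if_id.pc_plus4, sign_ext_imm=sign_ext)


def branch_target(id_ex):
    # Execute finally uses PC + 4: target = (PC + 4) + (immediate << 2).
    return id_ex.pc_plus4 + (id_ex.sign_ext_imm << 2)


if __name__ == "__main__":
    memory = {0x100: 0x1000FFFF}  # beq $0, $0, -1
    print(hex(branch_target(decode(fetch(0x100, memory)))))
```

By the time the branch reaches Execute, the PC itself already points at a later instruction, so the value has to travel with the instruction in the state registers.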

Pipelined State Registers
- More registers are necessary for a pipelined CPU: IF/ID, ID/EX, EX/MEM, and MEM/WB
[Figure: pipelined datapath (PC, instruction memory, register file, sign extend, shift left 2, adders, ALU, data memory) with the IF/ID, ID/EX, EX/MEM, and MEM/WB state registers between stages]

Load Word Example
[Figures: the pipelined datapath repeated for each stage of a load word: Fetch, Decode, Execute, Memory, and Write Back]
- All required values must pass through the state registers

Pipelined Instructions
- With the multi-cycle CPU, instructions could take a different number of cycles to complete
- Can we do this with a pipelined CPU?
  - R-type
  - sw
  - Jump
[Figure: pipelined datapath with the IF/ID, ID/EX, EX/MEM, and MEM/WB state registers]

Pipeline Control
- What about control signals?
  - Different for each instruction
  - Required at different stages
- Where does the control unit go?

Pipeline Control
- Pass control signals using the state registers
- No need to pass them after they are used
[Figure: the control unit and the IF/ID, ID/EX, EX/MEM, and MEM/WB state registers]
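The idea of dropping signals once they are used can be sketched as follows. The dictionary layout and function names are invented for this illustration, and the example values are the usual ones for a load word rather than something stated on the slide.

```python
def enter_id_ex(control):
    """Decode produces every signal group, so ID/EX must hold all of them."""
    return {"EX": dict(control["EX"]),
            "MEM": dict(control["MEM"]),
            "WB": dict(control["WB"])}


def enter_ex_mem(id_ex):
    """The EX signals were consumed in Execute; only MEM and WB move on."""
    return {"MEM": id_ex["MEM"], "WB": id_ex["WB"]}


def enter_mem_wb(ex_mem):
    """The MEM signals were consumed in Memory; only WB moves on."""
    return {"WB": ex_mem["WB"]}


# Example values for lw (assumed for illustration).
lw_control = {
    "EX": {"ALUSrc": 1},
    "MEM": {"MemRead": 1},
    "WB": {"RegWrite": 1},
}

if __name__ == "__main__":
    id_ex = enter_id_ex(lw_control)
    ex_mem = enter_ex_mem(id_ex)
    mem_wb = enter_mem_wb(ex_mem)
    print(sorted(id_ex), sorted(ex_mem), sorted(mem_wb))
```

Each state register ends up narrower than the one before it, which is why the control bits add little cost to the later pipeline registers.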

Pipeline Control Signals
- Execute signals: RegDst, ALUOp[1..0], ALUSrc, Branch
- Memory access signals: MemRead, MemWrite
- Write back signals: RegWrite, MemtoReg
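For concreteness, the sketch below tabulates these signals for four instruction classes. The slide itself does not list values; the ones here follow the standard textbook MIPS control table and should be treated as an assumption. The grouping follows the slide, and `CONTROL`, `GROUPS`, and `signals_for` are names made up for the example.

```python
X = None  # don't care: the signal's value does not affect this instruction

# Assumed values, taken from the usual textbook MIPS control table.
CONTROL = {
    "R-type": {"RegDst": 1, "ALUOp": 0b10, "ALUSrc": 0, "Branch": 0,
               "MemRead": 0, "MemWrite": 0, "RegWrite": 1, "MemtoReg": 0},
    "lw":     {"RegDst": 0, "ALUOp": 0b00, "ALUSrc": 1, "Branch": 0,
               "MemRead": 1, "MemWrite": 0, "RegWrite": 1, "MemtoReg": 1},
    "sw":     {"RegDst": X, "ALUOp": 0b00, "ALUSrc": 1, "Branch": 0,
               "MemRead": 0, "MemWrite": 1, "RegWrite": 0, "MemtoReg": X},
    "beq":    {"RegDst": X, "ALUOp": 0b01, "ALUSrc": 0, "Branch": 1,
               "MemRead": 0, "MemWrite": 0, "RegWrite": 0, "MemtoReg": X},
}

# Grouping as given on the slide.
GROUPS = {
    "EX": ("RegDst", "ALUOp", "ALUSrc", "Branch"),
    "MEM": ("MemRead", "MemWrite"),
    "WB": ("RegWrite", "MemtoReg"),
}


def signals_for(instr, group):
    """Return only the signals that `group`'s stage needs for `instr`."""
    return {name: CONTROL[instr][name] for name in GROUPS[group]}


if __name__ == "__main__":
    for instr in CONTROL:
        print(instr, signals_for(instr, "WB"))
```

Only R-type and lw set RegWrite, which matters later because forwarding is only needed from instructions that actually write a register.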

The Pipelined CPU So Far
[Figure: pipelined datapath with the control unit added. Control signals shown: PCSrc, RegWrite, ALUSrc, ALUOp, RegDst, Branch, MemRead, MemWrite, MemtoReg]

Data Hazard Review
- Caused when data is needed before it is ready
- Read before write: result of previous instruction needed by later instruction
- Load use: value in data memory needed by later instruction
[Figure: pipeline diagram of three overlapping instructions, each passing through IM, Reg, ALU, DM, Reg]

Read After Write Hazard Solution
- Stalling is always an option
- Forwarding data improves CPI over stalling

    add $4, $5, $6
    add $8, $4, $6
    add $10, $9, $4

[Figure: pipeline diagram (IM, Reg, ALU, DM, Reg) for the three instructions above]
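To put numbers on the CPI claim, the sketch below counts cycles for the three add instructions with and without forwarding. It rests on assumptions not stated on the slide: only ALU-to-ALU dependences (no loads), a register file that writes in the first half of a cycle and reads in the second half, and instructions given as hypothetical `(dest, src1, src2)` tuples.

```python
def cycles_with_forwarding(instrs):
    """With ALU-to-ALU forwarding no stalls are needed: 5 + (n - 1) cycles."""
    return len(instrs) + 4 if instrs else 0


def cycles_without_forwarding(instrs):
    """Stall each instruction in Decode until its sources are written back.

    Assumes the register file is written in the first half of a cycle and
    read in the second half, so Decode may share a cycle with the producer's
    Write Back.
    """
    wb_cycle = {}  # register number -> cycle in which it is written back
    prev_id = 1    # so that the first instruction decodes in cycle 2
    last_wb = 0
    for dest, *srcs in instrs:
        id_cycle = prev_id + 1  # in order: decode after the previous instruction
        for src in srcs:
            if src != 0 and src in wb_cycle:
                id_cycle = max(id_cycle, wb_cycle[src])
        last_wb = id_cycle + 3  # ALU, DM, then Write Back
        if dest != 0:
            wb_cycle[dest] = last_wb
        prev_id = id_cycle
    return last_wb


if __name__ == "__main__":
    program = [(4, 5, 6), (8, 4, 6), (10, 9, 4)]  # the three adds above
    print(cycles_with_forwarding(program), cycles_without_forwarding(program))
```

Under these assumptions the sequence takes 7 cycles with forwarding and 9 with stalling, since the second add waits two cycles for $4.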

Data Forwarding
- Take the result from the earliest point that it exists in any of the pipeline state registers and forward it to the functional units (e.g., the ALU) that need it that cycle
- For the ALU functional unit, the inputs can come from any pipeline register rather than just from ID/EX by
  - adding multiplexors to the inputs of the ALU
  - connecting the Rd write data in EX/MEM or MEM/WB to either (or both) of the EX stage's Rs and Rt ALU mux inputs
  - adding the proper control hardware to control the new muxes
- Other functional units may need similar forwarding logic
- With forwarding, the CPU can achieve a CPI of 1 even in the presence of data dependencies

Data Forwarding Conditions
- Only forward when state changes
  - Use RegWrite control signal
  - Don't forward if destination is $0
- Forward if previous destination = current source
  - Forwarding unnecessary in other cases
  - Forward if either source register needs it
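These conditions amount to a small Boolean test. The sketch below is one way to write it down; the function and parameter names are chosen for this illustration only.

```python
def needs_forward(prev_reg_write, prev_dest, cur_src):
    """True when a previous instruction's result must be forwarded to cur_src.

    prev_reg_write: RegWrite of the previous instruction (does it change state?)
    prev_dest:      destination register number of the previous instruction
    cur_src:        one source register number of the current instruction
    """
    return bool(prev_reg_write) and prev_dest != 0 and prev_dest == cur_src


def forward_either(prev_reg_write, prev_dest, rs, rt):
    """Check both source registers independently: (forward Rs?, forward Rt?)."""
    return (needs_forward(prev_reg_write, prev_dest, rs),
            needs_forward(prev_reg_write, prev_dest, rt))


if __name__ == "__main__":
    # add $4, $5, $6 followed by add $8, $4, $7: forward to Rs only.
    print(forward_either(True, 4, 4, 7))
```

The $0 check matters because $0 always reads as zero, so a result "written" to it must never replace the value a later instruction reads.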

EX/MEM Forwarding
- Register value needed by next instruction
  - Calculated by ALU this clock cycle
  - Needed as input to ALU on next clock cycle

    add $4, $5, $6
    add $8, $4, $7 or add $8, $7, $4

[Figure: pipeline diagram (IM, Reg, ALU, DM, Reg) for the two instructions above]

EX/MEM Forwarding
[Figures: Execute stage detail between the ID/EX, EX/MEM, and MEM/WB state registers, showing R[Rs], R[Rt], the immediate, ALU Cntrl, Rd, Rt, Rs, RegDst, MemWrite, RegWrite, and MemtoReg; the second figure adds the Forward Unit]

MEM/WB Forwarding
- Register value needed two instructions later
  - Calculated by ALU this clock cycle
  - Needed as input to ALU in two clock cycles

    add $4, $5, $6
    (unrelated instruction)
    add $8, $4, $7 or add $8, $7, $4

[Figure: pipeline diagram (IM, Reg, ALU, DM, Reg) for the three instructions above]

MEM/WB Forwarding
[Figures: Execute stage detail with the Forward Unit, showing the forwarding path from the MEM/WB state register back to the ALU inputs]

Forwarding Complication
- Forward unit must forward most recent value
- It may appear necessary to do MEM/WB and EX/MEM forwarding simultaneously
  - Only do EX/MEM forwarding this cycle
  - Do EX/MEM forwarding again next cycle

    add $4, $5, $6
    add $4, $4, $13
    add $8, $4, $7

[Figure: pipeline diagram (IM, Reg, ALU, DM, Reg) for the three instructions above]
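The priority rule can be sketched as a mux-select function that checks EX/MEM first and falls back to MEM/WB. `PipeReg`, `select_source`, and the string labels are hypothetical; real hardware would emit a small select code for each ALU input mux instead.

```python
from collections import namedtuple

# Hypothetical view of the fields the forward unit reads from a state register.
PipeReg = namedtuple("PipeReg", ["reg_write", "rd"])


def select_source(src, ex_mem, mem_wb):
    """Pick where one ALU input should come from, newest value first."""
    if ex_mem.reg_write and ex_mem.rd != 0 and ex_mem.rd == src:
        return "EX/MEM"  # result of the immediately preceding instruction
    if mem_wb.reg_write and mem_wb.rd != 0 and mem_wb.rd == src:
        return "MEM/WB"  # result from two instructions back
    return "ID/EX"       # no hazard: use the value read from the register file


def forward_unit(rs, rt, ex_mem, mem_wb):
    """Return the selected source for the Rs input and for the Rt input."""
    return (select_source(rs, ex_mem, mem_wb),
            select_source(rt, ex_mem, mem_wb))


if __name__ == "__main__":
    # add $8, $4, $7 in Execute: both earlier adds wrote $4.
    print(forward_unit(4, 7, PipeReg(True, 4), PipeReg(True, 4)))
```

Because the EX/MEM check comes first, the third add picks up the $4 produced by the second add rather than the stale one from the first.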

Complete ALU Input Forwarding
[Figure: Execute stage detail with the Forward Unit selecting each ALU input from ID/EX, EX/MEM, or MEM/WB]

Register Definition
- How can we specify a particular signal?
  - Each state register has a copy
  - May vary across stages
- Reference the register that contains the value
  - RegWrite value in EX/MEM state register: EX/MEM.RegWrite
  - RegWrite value in MEM/WB state register: MEM/WB.RegWrite

Forwarding Conditions
- We want to forward when
  - Previous instruction updates state
  - Previous destination used as current source
  - Previous destination not $0
- Data hazard code

    add $4, $5, $6
    sub $8, $4, $9

- How do we do this in hardware?

Forwarding Unit
[Figure: Execute stage detail with the Forward Unit reading Rs and Rt from ID/EX and Rd and RegWrite from EX/MEM and MEM/WB]

Forwarding Unit Details
[Figure: gate-level logic producing the Forward signal from EX/MEM.RegWrite, a bit-by-bit comparison of EX/MEM.RegisterRd[4..0] with ID/EX.RegisterRs[4..0], and a check that EX/MEM.RegisterRd[4..0] is not 0]
- Forward when all of the following hold
  - EX/MEM.RegWrite
  - EX/MEM.RegisterRd = ID/EX.RegisterRs
  - EX/MEM.RegisterRd ≠ 0
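A bit-level reading of this logic is sketched below: each pair of register-number bits feeds an XNOR, the five XNOR outputs are ANDed to test equality, and the five Rd bits are ORed to test for a nonzero destination. The exact gate arrangement on the slide is not recoverable from the transcript, so this is one plausible structure, with names invented for the example.

```python
def bit(value, i):
    """Extract bit i of a register number."""
    return (value >> i) & 1


def forward_signal(reg_write, rd, rs):
    """Gate-level version of: RegWrite AND (Rd == Rs) AND (Rd != 0).

    reg_write is 0 or 1; rd and rs are 5-bit register numbers.
    """
    equal = 1
    nonzero = 0
    for i in range(5):
        xnor = 1 - (bit(rd, i) ^ bit(rs, i))  # 1 when the two bits match
        equal &= xnor                         # AND of all five XNOR outputs
        nonzero |= bit(rd, i)                 # OR of the five Rd bits
    return reg_write & equal & nonzero


if __name__ == "__main__":
    # add $4, $5, $6 followed by sub $8, $4, $9: Rd = 4, Rs = 4.
    print(forward_signal(1, 4, 4))
```

The same block would be replicated for ID/EX.RegisterRt and again for the MEM/WB comparisons.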

Other Forwarding Possible
- Forwarding to data memory

    add $4, $5, $6
    sw $4, 40($7)

- Data memory to data memory copy

    lw $4, 16($7)
    sw $4, 40($7)

[Figure: pipeline diagrams (IM, Reg, ALU, DM, Reg) for both instruction pairs]

Forwarding to Memory
- What happens here?

    add $5, $6, $7
    sw $5, 8($10)

- Forwarding must occur, but not through the ALU
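A minimal sketch of this case follows. The ALU still computes the store address from the base register and offset, while the value to be stored is chosen separately from the newest matching result. The slides show the actual datapath only in figures, so the function name and the dictionary fields below are assumptions made for illustration.

```python
def store_data(rt, rt_value_from_id_ex, ex_mem, mem_wb):
    """Choose the value that sw should write to data memory.

    rt:                  register number holding the data to store
    rt_value_from_id_ex: possibly stale value read from the register file
    ex_mem, mem_wb:      dicts with 'reg_write', 'rd', and 'value' fields
                         (hypothetical view of the two state registers)
    """
    for stage in (ex_mem, mem_wb):  # newest result first
        if stage["reg_write"] and stage["rd"] != 0 and stage["rd"] == rt:
            return stage["value"]   # bypasses the ALU: this is data, not an operand
    return rt_value_from_id_ex


if __name__ == "__main__":
    # add $5, $6, $7 has just produced 30; the register file still holds 0 in $5.
    ex_mem = {"reg_write": True, "rd": 5, "value": 30}
    mem_wb = {"reg_write": False, "rd": 0, "value": 0}
    print(store_data(5, 0, ex_mem, mem_wb))
```

The address calculation for `8($10)` is untouched, which is why the forwarded value has to reach the data memory's write-data input by a path other than the ALU.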

Forwarding to Memory
[Figures: Execute and Memory stage detail between the ID/EX, EX/MEM, and MEM/WB state registers, with the Forward Unit reading Rd, Rt, and Rs to forward the store data]