Announcement. ECE475/ECE4420 Computer Architecture L4: Advanced Issues in Pipelining. Edward Suh Computer Systems Laboratory


Transcription:

ECE475/ECE4420 Computer Architecture
L4: Advanced Issues in Pipelining
Edward Suh, Computer Systems Laboratory
suh@csl.cornell.edu

Announcement
- Lab1 is released
  - Start early: we only have limited computing resources
- Reading: Appendix A.1 to A.6
- BRC (Big Red Chip)
  - Contact info on Blackboard
- Career Fair tomorrow

Roadmap
- Tricky issues in the 5-stage pipeline
  - Handling exceptions
- Deeper pipeline
- More complex pipeline with multi-cycle operations

Exceptions
- Exceptions: interrupt instruction execution unexpectedly
- Common exceptions:
  - I/O device interrupt
  - OS system call
  - Arithmetic overflow, FP anomaly
  - Page fault
  - Misaligned memory access
  - Memory protection violation
  - Illegal instruction
  - Power / hardware failure

A Taxonomy of Exceptions
- Synchronous vs. asynchronous
- User- vs. hardware-triggered
- Maskable vs. nonmaskable (NMI)

A Taxonomy of Exceptions (continued)
- Within vs. between instructions
- Resume vs. terminate

Restartable Exceptions
- What do we need to do in order to resume after an exception?

Precise Exception
- It must appear as if an interrupt is taken between two instructions (say I_i and I_{i+1})
  - the effect of all instructions up to and including I_i is totally complete
  - no effect of any instruction after I_i has taken place
- The interrupt handler either aborts the program or restarts it at I_{i+1}.

Exceptions in Pipeline
[Figure: five-stage pipeline datapath: PC, Inst. Mem, D (Decode), E (+), M (Data Mem), W]

Exceptions in Pipeline
lw   IF ID EX MEM WB
add     IF ID EX MEM WB
- How to handle multiple exceptions in the same cycle?

Exceptions in Pipeline
lw   IF ID EX MEM WB
add     IF ID EX MEM WB
- How to handle multiple exceptions for one instruction?

Exception Handling (In-Order Five-Stage Pipeline)
[Figure: five-stage pipeline datapath (PC, Inst. Mem, D/Decode, E/+, M/Data Mem, W) annotated, in stage order, with the exception sources: PC Address Exceptions, Illegal Opcode, Overflow, Data Addr Except]
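The two questions above (several exceptions in one cycle, several exceptions for one instruction) are commonly answered by carrying an exception flag along with each instruction and acting on it only at a single commit point. The Python sketch below illustrates that idea for an ideal, stall-free five-stage pipeline. It is a minimal illustration rather than the lecture's own design: the names `Instr`, `first_exception_taken`, and `COMMIT_STAGE`, and the choice of MEM as the commit point, are assumptions made here.

```python
from dataclasses import dataclass, field
from typing import Optional

STAGES = ["IF", "ID", "EX", "MEM", "WB"]
COMMIT_STAGE = "MEM"  # assumption: exceptions are taken at the MEM/WB boundary


@dataclass
class Instr:
    name: str
    faults: dict = field(default_factory=dict)  # stage name -> cause raised there
    pending: Optional[str] = None               # first cause recorded so far


def first_exception_taken(program):
    """Simulate an ideal in-order pipeline (no stalls) and return
    (instruction name, cause, cycle) for the exception that is taken,
    or None if no instruction faults."""
    last_cycle = len(program) + len(STAGES) - 1
    for cycle in range(1, last_cycle + 1):
        for index, instr in enumerate(program):   # oldest instruction first
            stage_index = cycle - 1 - index       # instruction i enters IF in cycle i + 1
            if not 0 <= stage_index < len(STAGES):
                continue
            stage = STAGES[stage_index]
            # Record only the first cause: an earlier stage overrides a later one.
            if stage in instr.faults and instr.pending is None:
                instr.pending = instr.faults[stage]
            # Act on the flag only at the commit point, which is reached in program order.
            if stage == COMMIT_STAGE and instr.pending is not None:
                return instr.name, instr.pending, cycle
    return None


# lw faults in MEM and add overflows in EX, both in cycle 4.
same_cycle = [Instr("lw", {"MEM": "data address"}), Instr("add", {"EX": "overflow"})]
print(first_exception_taken(same_cycle))  # ('lw', 'data address', 4)
```

Because the flag is examined only at the commit point, the older instruction's exception is taken first even when a younger instruction faults earlier in time (for example, an instruction-fetch fault on `add` in cycle 2).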

When Does State Change?
- An instruction is committed when it is guaranteed to complete
- Easy to restart if state has not been changed
  - Simple for MIPS: MEM/WB
- VAX: auto-increment mode, state updated in the middle of an instruction
  - need HW support to back out (undo, roll back state changes)
- Some architectures have string copy instructions
  - updates memory, cannot undo 100%
  - general-purpose registers hold all state
  - instruction continues after exception rather than restarting

Implementation Details
- What if an exception is in a branch delay slot?
- Can we restart the instruction in the delay slot?

MIPS R4000
- Eight-stage pipeline, high clock rate (superpipelined)

  IF  IS  RF  EX  DF  DS  TC  WB

  IF  select PC, start i$ access
  IS  complete i$ access
  RF  decode, register access, check i$ tag
  EX  execution (ALU)
  DF  start d$ access
  DS  complete d$ access
  TC  check d$ tag
  WB  write back result to register file

- Memory access (MEM) takes three cycles

Deep Pipelines
- Pros and Cons

Limits of Pipelining
- Cannot increase pipeline depth forever
  - hit ILP limits
  - CPI eventually begins to increase due to stalls
  - clock period does not go down enough to compensate

Multicycle Operations: Why?
- Pipelining becomes complex when we want high performance in the presence of multi-cycle operations
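The trade-off in "Limits of Pipelining" can be made concrete with the usual relation time per instruction = CPI x clock period. The Python sketch below uses a deliberately simple model with made-up numbers: the logic delay, the per-stage latch overhead, and the linear growth of CPI with depth are all assumptions chosen for illustration, not measurements from any real processor.

```python
def time_per_instruction(depth, logic_ns=10.0, latch_ns=0.2, hazard_rate=0.05):
    """Average time per instruction (ns) for a pipeline with `depth` stages.

    Hypothetical model:
    - clock period = logic delay split evenly across stages + fixed latch overhead
    - CPI = 1 + hazard_rate * (depth - 1), i.e. each hazard costs more cycles
      as the pipeline gets deeper
    """
    cycle_ns = logic_ns / depth + latch_ns
    cpi = 1.0 + hazard_rate * (depth - 1)
    return cpi * cycle_ns


for depth in (1, 5, 10, 20, 40, 80):
    print(f"depth {depth:2d}: {time_per_instruction(depth):.3f} ns per instruction")

# Depth that minimizes time per instruction under this toy model.
best_depth = min(range(1, 101), key=time_per_instruction)
print("best depth under these assumptions:", best_depth)
```

Under this model the clock period shrinks toward the latch overhead while CPI keeps growing, so the time per instruction first falls and then rises again. The location of the minimum depends entirely on the assumed parameters.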

Realistic Memory System
- Latency of access to the main memory is usually much greater than one cycle and often unpredictable
  - Solving this problem is a central issue in computer architecture
- Common approaches to improving memory performance
  - separate instruction and data memory ports: no self-modifying code
  - caches: single cycle except in case of a miss (stall)
  - interleaved memory: multiple memory accesses, bank conflicts
  - split-phase memory operations: out-of-order responses

Floating Point Unit
- Much more hardware than an integer unit
- Single-cycle floating point unit is a bad idea. Why?

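Function Unit Characteristics
[Figure: two function units, each with busy and accept signals]
- Fully pipelined: three 1-cycle stages
- Partially pipelined: two 2-cycle stages
- Function units have internal pipeline registers

Complex Pipeline Structure
[Figure: IF and ID feed parallel function units (ALU, Mem, Fadd, Fmul, Fdiv) that lead to WB; register files are GPRs and FPRs]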
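The difference between the two kinds of function unit can be sketched with two parameters: the latency (cycles until the result is ready) and the initiation interval (cycles between accepted operations). The Python below is a minimal model under that assumption. The class `FunctionUnit` and the helper `earliest_issue_cycles` are illustrative names introduced here, with latencies taken from the stage counts in the figure.

```python
class FunctionUnit:
    """A function unit that accepts a new operation every `initiation_interval` cycles."""

    def __init__(self, name, latency, initiation_interval):
        self.name = name
        self.latency = latency
        self.initiation_interval = initiation_interval
        self.next_free_cycle = 0  # first cycle in which the unit is not busy

    def can_accept(self, cycle):
        return cycle >= self.next_free_cycle

    def issue(self, cycle):
        """Start an operation in `cycle`; return the cycle its result is ready."""
        if not self.can_accept(cycle):
            raise RuntimeError(f"{self.name} is busy in cycle {cycle}")
        self.next_free_cycle = cycle + self.initiation_interval
        return cycle + self.latency


def earliest_issue_cycles(unit, count):
    """Issue `count` operations as early as the unit allows; return their issue cycles."""
    cycles, cycle = [], 0
    while len(cycles) < count:
        if unit.can_accept(cycle):
            unit.issue(cycle)
            cycles.append(cycle)
        cycle += 1
    return cycles


fully = FunctionUnit("fully pipelined", latency=3, initiation_interval=1)
partially = FunctionUnit("partially pipelined", latency=4, initiation_interval=2)
print(earliest_issue_cycles(fully, 3))      # [0, 1, 2]
print(earliest_issue_cycles(partially, 3))  # [0, 2, 4]
```

An unpipelined unit is the limiting case where the initiation interval equals the latency.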

New Challenges
- Structural conflicts at the execution stage if some FPU or memory unit is not pipelined and takes more than one cycle
- Structural conflicts at the write-back stage due to variable latencies of different function units
- Out-of-order write hazards due to variable latencies of different function units (WAW hazards)
- How to handle exceptions?

Structural Hazard
- Partially pipelined functional units
- Write-port conflict

fmult  IF ID X1 X2 X3 X4 X5 X6 X7 WB
fadd      IF ID X1 X2 X3 WB
ld           IF ID EX MM WB
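The write-port conflict in the diagram can be found mechanically by computing the write-back cycle of each instruction. The Python sketch below assumes that the three instructions are fetched in consecutive cycles, that IF and ID take one cycle each, and that all results share a single write port. The helper names are hypothetical.

```python
from collections import defaultdict

FRONT_END_STAGES = 2  # IF and ID, one cycle each (assumption)


def writeback_cycle(fetch_cycle, execute_stages):
    """Cycle in which an instruction reaches WB if it never stalls."""
    return fetch_cycle + FRONT_END_STAGES + execute_stages


def write_port_conflicts(schedule):
    """schedule: list of (name, fetch_cycle, execute_stages).
    Return {cycle: [names]} for every cycle with more than one write-back."""
    by_cycle = defaultdict(list)
    for name, fetch_cycle, execute_stages in schedule:
        by_cycle[writeback_cycle(fetch_cycle, execute_stages)].append(name)
    return {cycle: names for cycle, names in by_cycle.items() if len(names) > 1}


# fmult has X1..X7, fadd has X1..X3, ld has EX and MM.
schedule = [("fmult", 1, 7), ("fadd", 2, 3), ("ld", 3, 2)]
print(write_port_conflicts(schedule))  # {7: ['fadd', 'ld']}
```

In hardware, a check like this would be done at issue time, for example with a small reservation table for the write port, so that one of the conflicting instructions is stalled in ID.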

Data Hazard 1
- Read-After-Write hazard

fmult f2, ...   IF ID X1 X2 X3 X4 X5 X6 X7 WB
fadd ..., f2       IF ID ** ** ** ** ** ** X1 X2

Data Hazard 2
- Write-After-Write hazard

fadd f2, ...   IF ID X1 X2 X3 X4 WB
fld f2, ...       IF ID EX MM WB
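Both diagrams can be checked with a little arithmetic on cycle numbers. The Python sketch below assumes one-cycle IF and ID stages, back-to-back fetch, and full bypassing from the producer's last execute stage to the consumer's first execute stage. The function names are illustrative, not part of the lecture.

```python
FRONT_END_STAGES = 2  # IF and ID, one cycle each (assumption)


def raw_stall_cycles(producer_fetch, producer_exec_stages, consumer_fetch):
    """Stall cycles for a consumer that needs the producer's result in its
    first execute stage, assuming full bypassing."""
    last_exec_cycle = producer_fetch + FRONT_END_STAGES + producer_exec_stages - 1
    natural_start = consumer_fetch + FRONT_END_STAGES  # first execute cycle without stalls
    return max(0, last_exec_cycle + 1 - natural_start)


def has_waw_hazard(older, younger):
    """older, younger: (fetch_cycle, exec_stages) writing the same register.
    True if the younger instruction would write back before the older one."""
    def writeback(fetch_cycle, exec_stages):
        return fetch_cycle + FRONT_END_STAGES + exec_stages
    return writeback(*younger) < writeback(*older)


# Data Hazard 1: fmult (X1..X7) fetched in cycle 1, dependent fadd fetched in cycle 2.
print(raw_stall_cycles(1, 7, 2))       # 6, matching the six ** slots

# Data Hazard 2: fadd (X1..X4) fetched in cycle 1, fld (EX, MM) fetched in cycle 2.
print(has_waw_hazard((1, 4), (2, 2)))  # True: fld writes f2 in cycle 6, fadd in cycle 7
```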

Maintaining Precise Exceptions

fdiv f1, f2, f3
fadd f2, f2, f4

- Scenario: fadd done, fdiv raises exception

Multicycle Hazards Summary
- Check for structural hazards
  - unpipelined units (divide)
  - write ports
- Check for RAW hazards
  - if producer in flight, stall (apply transitive closure here)
  - many stall cycles, even with full bypassing
- Check for WAW hazards
  - if instruction in flight with same destination, stall
- How is all this accomplished?
  - dynamic scheduling by scoreboard (stay tuned)
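The three checks in the summary can be combined into a single issue-time test. The Python sketch below is a simplified illustration under stated assumptions, not a scoreboard: the record types `InFlight` and `Candidate`, the function `stall_reason`, and the cycle numbers in the example are all hypothetical, and the check is deliberately conservative (it stalls whenever a producer or same-destination writer is still in flight).

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple


@dataclass
class InFlight:
    dest: str             # register the instruction will write
    writeback_cycle: int  # cycle in which it will use the write port


@dataclass
class Candidate:
    dest: str
    sources: Tuple[str, ...]
    unit: str
    writeback_cycle: int


def stall_reason(candidate, in_flight, busy_units: Set[str]) -> Optional[str]:
    """Return why `candidate` must stall in ID, or None if it may issue."""
    if candidate.unit in busy_units:
        return "structural: function unit busy"
    if any(op.writeback_cycle == candidate.writeback_cycle for op in in_flight):
        return "structural: write port"
    if any(op.dest in candidate.sources for op in in_flight):
        return "RAW: producer in flight"
    if any(op.dest == candidate.dest for op in in_flight):
        return "WAW: same destination in flight"
    return None


# fdiv f1, f2, f3 is in flight on an unpipelined divider (cycle numbers are made up).
in_flight = [InFlight(dest="f1", writeback_cycle=20)]
busy = {"fdiv"}

fadd = Candidate(dest="f2", sources=("f2", "f4"), unit="fadd", writeback_cycle=8)
print(stall_reason(fadd, in_flight, busy))  # None: fadd f2, f2, f4 may issue
```

Note that `fadd f2, f2, f4` passes all three checks and can complete before `fdiv` does. That is exactly the scenario on the precise-exceptions slide: hazard checks alone do not stop a younger instruction from overwriting `f2` before the older `fdiv` raises its exception.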