COMP 4211 Seminar Presentation

Similar documents
Instruction Pipelining Review

Predict Not Taken. Revisiting Branch Hazard Solutions. Filling the delay slot (e.g., in the compiler) Delayed Branch

Minimizing Data hazard Stalls by Forwarding Data Hazard Classification Data Hazards Present in Current MIPS Pipeline

There are different characteristics for exceptions. They are as follows:

COSC4201. Prof. Mokhtar Aboelaze York University

CMCS Mohamed Younis CMCS 611, Advanced Computer Architecture 1

Execution/Effective address

Advanced issues in pipelining

Lecture 9: Case Study MIPS R4000 and Introduction to Advanced Pipelining Professor Randy H. Katz Computer Science 252 Spring 1996

Complications with long instructions. CMSC 411 Computer Systems Architecture Lecture 6 Basic Pipelining 3. How slow is slow?

Instruction Level Parallelism. ILP, Loop level Parallelism Dependences, Hazards Speculation, Branch prediction

Lecture 6 MIPS R4000 and Instruction Level Parallelism. Computer Architectures S

Administrivia. CMSC 411 Computer Systems Architecture Lecture 6. When do MIPS exceptions occur? Review: Exceptions. Answers to HW #1 posted

CMSC 411 Computer Systems Architecture Lecture 6 Basic Pipelining 3. Complications With Long Instructions

CISC 662 Graduate Computer Architecture Lecture 7 - Multi-cycles

CISC 662 Graduate Computer Architecture Lecture 7 - Multi-cycles. Interrupts and Exceptions. Device Interrupt (Say, arrival of network message)

LECTURE 10. Pipelining: Advanced ILP

ECE 486/586. Computer Architecture. Lecture # 12

Computer Systems Architecture I. CSE 560M Lecture 5 Prof. Patrick Crowley

Review: Evaluating Branch Alternatives. Lecture 3: Introduction to Advanced Pipelining. Review: Evaluating Branch Prediction

Instruction Level Parallelism. Appendix C and Chapter 3, HP5e

Lecture 4: Introduction to Advanced Pipelining

Announcement. ECE475/ECE4420 Computer Architecture L4: Advanced Issues in Pipelining. Edward Suh Computer Systems Laboratory

Multi-cycle Instructions in the Pipeline (Floating Point)

COSC 6385 Computer Architecture - Pipelining (II)

Chapter3 Pipelining: Basic Concepts

Chapter 3. Pipelining. EE511 In-Cheol Park, KAIST

Chapter 3 Instruction-Level Parallelism and its Exploitation (Part 1)

Static vs. Dynamic Scheduling

ELE 818 * ADVANCED COMPUTER ARCHITECTURES * MIDTERM TEST *

Hardware-based Speculation

Lecture 7: Pipelining Contd. More pipelining complications: Interrupts and Exceptions

COSC4201 Instruction Level Parallelism Dynamic Scheduling

Exception Handling. Precise Exception Handling. Exception Types. Exception Handling Terminology

ECE 4750 Computer Architecture, Fall 2017 T05 Integrating Processors and Memories

CMCS Mohamed Younis CMCS 611, Advanced Computer Architecture 1

Updated Exercises by Diana Franklin

CSE 820 Graduate Computer Architecture. week 6 Instruction Level Parallelism. Review from Last Time #1

Slide Set 8. for ENCM 501 in Winter Steve Norman, PhD, PEng

Handout 2 ILP: Part B

Complex Pipelining COE 501. Computer Architecture Prof. Muhamed Mudawar

Appendix C: Pipelining: Basic and Intermediate Concepts

Appendix C. Authors: John Hennessy & David Patterson. Copyright 2011, Elsevier Inc. All rights Reserved. 1

Page 1. Recall from Pipelining Review. Lecture 16: Instruction Level Parallelism and Dynamic Execution #1: Ideas to Reduce Stalls

Hardware-based Speculation

Instruction Frequency CPI. Load-store 55% 5. Arithmetic 30% 4. Branch 15% 4

What is ILP? Instruction Level Parallelism. Where do we find ILP? How do we expose ILP?

Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier

CS433 Homework 2 (Chapter 3)

Pipelining and Vector Processing

DYNAMIC INSTRUCTION SCHEDULING WITH SCOREBOARD

Recall from Pipelining Review. Lecture 16: Instruction Level Parallelism and Dynamic Execution #1: Ideas to Reduce Stalls

Pipelining: Issue instructions in every cycle (CPI 1) Compiler scheduling (static scheduling) reduces impact of dependences

Hardware-Based Speculation

Appendix C. Instructor: Josep Torrellas CS433. Copyright Josep Torrellas 1999, 2001, 2002,

Overview. Appendix A. Pipelining: Its Natural! Sequential Laundry 6 PM Midnight. Pipelined Laundry: Start work ASAP

Load1 no Load2 no Add1 Y Sub Reg[F2] Reg[F6] Add2 Y Add Reg[F2] Add1 Add3 no Mult1 Y Mul Reg[F2] Reg[F4] Mult2 Y Div Reg[F6] Mult1

Basic Pipelining Concepts

EECC551 Exam Review 4 questions out of 6 questions

Page 1. Recall from Pipelining Review. Lecture 15: Instruction Level Parallelism and Dynamic Execution

Appendix A. Overview

Advanced Computer Architecture. Chapter 4: More sophisticated CPU architectures

CISC 662 Graduate Computer Architecture Lecture 13 - CPI < 1

Lecture-13 (ROB and Multi-threading) CS422-Spring

The CPU Pipeline. MIPS R4000 Microprocessor User's Manual 43

NOW Handout Page 1. Review from Last Time #1. CSE 820 Graduate Computer Architecture. Lec 8 Instruction Level Parallelism. Outline

CPE 631 Lecture 09: Instruction Level Parallelism and Its Dynamic Exploitation

CS433 Homework 2 (Chapter 3)

Pipelining and Exploiting Instruction-Level Parallelism (ILP)

EITF20: Computer Architecture Part2.2.1: Pipeline-1

ELEC 5200/6200 Computer Architecture and Design Fall 2016 Lecture 9: Instruction Level Parallelism

DYNAMIC AND SPECULATIVE INSTRUCTION SCHEDULING

Lecture 4: Advanced Pipelines. Data hazards, control hazards, multi-cycle in-order pipelines (Appendix A.4-A.10)

Instruction-Level Parallelism and Its Exploitation

The basic structure of a MIPS floating-point unit

Pipelining. Principles of pipelining. Simple pipelining. Structural Hazards. Data Hazards. Control Hazards. Interrupts. Multicycle operations

Processor: Superscalars Dynamic Scheduling

CS252 Graduate Computer Architecture Lecture 5. Interrupt Controller CPU. Interrupts, Software Scheduling around Hazards February 1 st, 2012

CS4617 Computer Architecture

EITF20: Computer Architecture Part2.2.1: Pipeline-1

Question 1 (5 points) Consider a cache with the following specifications Address space is 1024 words. The memory is word addressable The size of the

MIPS ISA AND PIPELINING OVERVIEW Appendix A and C

ECE 552 / CPS 550 Advanced Computer Architecture I. Lecture 9 Instruction-Level Parallelism Part 2

CPE 631 Lecture 10: Instruction Level Parallelism and Its Dynamic Exploitation

Instruction Level Parallelism

EE557--FALL 1999 MAKE-UP MIDTERM 1. Closed books, closed notes

5008: Computer Architecture

Good luck and have fun!

Anti-Inspiration. It s true hard work never killed anybody, but I figure, why take the chance? Ronald Reagan, US President

Outline Review: Basic Pipeline Scheduling and Loop Unrolling Multiple Issue: Superscalar, VLIW. CPE 631 Session 19 Exploiting ILP with SW Approaches

T T T T T T N T T T T T T T T N T T T T T T T T T N T T T T T T T T T T T N.

Adapted from David Patterson s slides on graduate computer architecture

Four Steps of Speculative Tomasulo cycle 0

EITF20: Computer Architecture Part2.2.1: Pipeline-1

Super Scalar. Kalyan Basu March 21,

CACHE MEMORIES ADVANCED COMPUTER ARCHITECTURES. Slides by: Pedro Tomás

Floating Point/Multicycle Pipelining in DLX

DYNAMIC SPECULATIVE EXECUTION

CPE 631 Lecture 11: Instruction Level Parallelism and Its Dynamic Exploitation

CS 252 Graduate Computer Architecture. Lecture 4: Instruction-Level Parallelism

Transcription:

COMP Seminar Presentation Based On: Computer Architecture A Quantitative Approach by Hennessey and Patterson Presenter : Feri Danes Outline Exceptions Handling Floating Points Operations Case Study: MIPS R000 Pipeline Case Study: MIPS R00 Pipeline What is Exception? I/O device request Invoking an operating system service from a user program Tracing instruction execution Breakpoint (programmer-requested requested interrupt) Integer arithmetic overflow FP arithmetic anomaly Page Fault( not in main memory ) Misaligned memory access ( if alignment required ) Memory protection violation sing an undefined or unimplemented instruction Hardware malfunctions Power failure What happens during an exception? An exception occurs Operating system trap Saving the PC where the exception happens Save the operating system state Run exception code Resume the last instruction before it traps or terminate the program

How does this influence the hardware? npipelined implementations! Occur within instructions! Restartable Pipelined implementations! Preserves the CP state by ing the instruction following the exception source Exception in Pipelined architecture Force a trap instruction into the pipeline on the next Flush the pipeline for the faulting instruction and all instructions that follows After exception handling routine finishes restore the PC of the saved PC and delay branches if exsists Precise Exceptions Exceptions in MIPS ìif the pipeline can be stopped so that the instructions just the faulting instruction are completed and the faulting instruction can be restarted from scratchî Pipeline stage Problem exceptions occurring Page fault on, misaligned memory access, memory protection violation ndefined or illegal opcode Arithmetic exception Page fault on data fetch, misaligned memory access, memory violation protection None

Exceptions in MIPS Floating Points Operation Precise exceptions Exception status vector eg:! Data fault on LD! Page fault on DADD There are four separate functional units that we can assume to support Function Point operations: Integer nit FP and integer multiplier FP adder FP and integer divider LD ADD Floating Points Operation Function Point Operations

Hazards Structural Hazard Structural hazards on the FP and integer divider Structural hazards on WAW because of out of order completion Precise exception is harder to maintain Increase the number of RAW hazards Instruction ML.D F0,F,F6 ÖÖÖÖÖÖÖÖ. ÖÖÖÖÖÖÖÖ. ADD.D F,F,F6 ÖÖÖÖÖÖÖÖ. ÖÖÖÖÖÖÖÖ. L.D F,0(R) M M 5 M 6 M A 7 M5 A M6 A 9 M7 A 0 Structural Hazard RAW Hazard the instruction that can cause structural hazard in the stage the instruction when it enters the or stage Instruction L.D F0F,F6 ML.D F0,F,F6 ADD.D F,F0,F 5 M 6 M 7 M M 9 M5 0 M6 M7 A A A 5 A 6 7 S.D F,0(R)

WAW Hazard WAW Hazard Instruction ML.D F0,F,F6 ÖÖÖÖÖÖÖ. ÖÖÖÖÖÖÖ ADD.D F,F,F6 M M 5 M 6 M A 7 M5 A M6 A 9 M7 A 0 Delay the issue of Load instruction until the add instruction reach the stage Prevent the add instruction to write back to register ÖÖÖÖÖÖÖ L.D F,0(R) Forwarding nit Can be implemented by checking the destination register in these following registers: /!/ M7/ D/ / Maintaining Precise Exceptions Consider the following sequence of code! DIV.D F0,F,F! ADD.D F0,F0,F! SB.D F,F,F Consider this following scenario! ADD.D and SB.D finish before DIV.D! SB.D causes floating point exception after ADD.D has been completed but not DIV.D! Imprecise exception

Maintaining Precise Exception MIPS FP Performance Ignore buffer the result of an operation until all operation that were issued earlier are complete! History file! future file Let exceptions to be imprecise but with state information! save the instruction that precede the completed instruction and run those instructions before restarting the execution Guarantee that no instruction that precede the issuing instruction can be completed without exception before issuing that instruction Case Study MIPS R000 R000 Pipeline Eight stages pipeline Additional pipeline is allocated for cache access Often called as superpipelining

R000 Pipeline RAW Hazard -- --First half of instruction fetch, PC selection and initiation of instruction cache access IS-- --Second half instruction fetch, complete instruction cache access RF-- --Instruction decode and register fetch and instruction cache hit detection -- --Execution includes effective address calculation, AL operation, branch target calculation and branch condition calculation DF-- --Data fetch first half DS-- --Second half data fetch completion of data cache access TC-- --Tag check, determine whether the data cache access hit -- --Write Back for loads and register-register operation Branch Delay Branch Delay Branch taken! Delay Slot + Brach not taken! Delay Slot Note : Branch predicted not taken policy is used for the out cycles of branch delay

Floating Point Pipeline Floating Point Instruction Stag e A D E M N R S Functional nit FP Adder FP Divider FP Multiplier FP Multiplier FP Multiplier FP Adder FP Adder Description Mantissa ADD stage Divide pipeline stage Exception test stage First stage of multiplier Second stage of multiplier Rounding stage Operand shift stage npack FP number FP Instruction Add, Subtract Multiply Divide Square Root Negate Absolute Value FP Compare Latency 6 Initiation Interval 5 Pipe stages,,,,e+m,m,m,m,n,n+a,r,a,r,d 7,D+A,D+R,D+A,D+R,A,R,E,() 0,A,R,S,S,A,R Structural Hazard Performance Operatio n Multiply Add / E+M M M 5 N 6 N+A 7 R 9 0 Some Definition Load s: Delays arising from the use of a load result or cycles after the load Branch s: cycle on every taken branch plus unfilled or canceled branch delay slots FP result s: s because of RAW hazard for an FP operand FP structural s: Delays because of issue restrictions arising from conflicts for functional units in the FP pipeline

Performance MIPS R00 five-stage pipeline 6-bit processor used in embedded system implementation FP operations by extending the pipeline length in the Ex stage Bibliography Hennesy,J.L and Patterson,D.A,, ìcomputer Architecture A Quantitative approachî, Appendix A, Morgan Kaufmann Publisher,SA,00