Pipelining. Parts of these slides are from the support material provided by W. Stallings
|
|
- Hilary Knight
- 6 years ago
- Views:
Transcription
1 Pipelining Raul Queiroz Feitosa Parts of these slides are from the support material provided by W. Stallings Objective To present the Pipelining concept, its limitations and the techniques for performance optimization Pipelining 2 1
2 Outline Instruction Cycle Instruction Pipelining Pipeline Performance Pipeline Hazards Resource Data Control Pipelining 3 Instruction Cycle State Diagram instruction fetch instruction operation decoding operand address calculation multiple operands operand fetch instruction address decoding indirection data operation no interrupt indirection result address calculation interrrupt interrupt interrrupt check result store multiple results Pipelining 4 2
3 Outline Instruction Cycle Instruction Pipelining Pipeline Performance Pipeline Hazards Resource Data Control Pipelining 5 Instruction Pipelining Instruction cycle is split into sequential steps A specific hardware unit (pipeline stage) is built to perform each step Pipeline stages are arranged as a chain pipeline stages 1 2 k Pipelining 6 3
4 Instruction Pipelining - Example fetch instruction decode instruction CO calculate operand FO fetch operand EI execute instruction WO write operand (result) Pipelining 7 Instruction Pipeline Operation time instruction 1 instruction 2 instruction 3 instruction 4 instruction 5 instruction 6 instruction 7 instruction 15 instruction 16 Pipelining 8 4
5 Outline Instruction Cycle Instruction Pipelining Pipeline Performance Pipeline Hazards Resource Data Control Pipelining 9 Pipeline Performance Assuming k stages τ= τ 1 = τ 2 =... = τ k (τ i is the time delay of the i-th stage) T n,k time for a pipeline with k stages to execute n instructions T n,1 = n k τ (conventional machine) T n,k = k τ+ (n-1)τ = (n+k-1)τ (pipeline) The speedup S k nkτ nk = = ( n + k 1) τ n + k 1 For large n nk lim Sk = lim = k!!!!!! n n n + k 1 Pipelining 10 5
6 Pipeline Performance Speedup k = 12 stages k = 9 stages k = 6 stages Number Number of of instructions instructions Speedup Number of stages Number of instructions n = 30 instructions n = 20 instructions n = 10 instructions Pipelining 11 Pipeline Performance The optimal performance is never reached because: The execution time is different from stage to stage τ τ L 1 2 There is still a time delay to latch the output of each stage τ k Pipeline hazards τ = max τ i [ ] + d i Pipelining 12 6
7 Outline Instruction Cycle Instruction Pipelining Pipeline Performance Pipeline Hazards Resource Data Control Pipelining 13 Pipeline Hazards In some cases a portion of pipeline must stall, due to the so called hazards Also called pipeline bubble Types of hazards Resource Data Control Pipelining 14 7
8 Outline Instruction Cycle Instruction Pipelining Pipeline Performance Pipeline Hazards Resource Data Control Pipelining 15 Resource Hazards Also called structural hazards, occur when multiple instructions need the same resource, e.g., single port memory time instruction 1 instruction 2 instruction 3 instruction 4 COFOidle FO idle EI WO Pipelining 16 8
9 Outline Instruction Cycle Instruction Pipelining Pipeline Performance Pipeline Hazards Resource Data Control Pipelining 17 Data Hazards Conflict in access of an operand location Two instructions to be executed in sequence Both access a particular memory or register operand Example: time ADD EAX,EBX SUB ECX,EAX instruction 3 instruction 4 CO idle FO CO idle FO CO idle FO Pipelining 18 9
10 Outline Instruction Cycle Instruction Pipelining Pipeline Performance Pipeline Hazards Resource Data Control Pipelining 19 Control Hazards Also called branch hazard. What is the address of the instruction following a conditional branch from? Known only here time ADD EAX,EBX JNZ ADDRESS instruction 3 instruction 4 idle no memory conflict! Pipelining 20 10
11 Control Hazards Dealing with Branches Multiple Streams Two pipelines prefetch each branch into a separate pipeline (IBM 370/168 and IBM 3033). Always one pipeline produces no useful work. Prefetch Branch Target Target of branch is prefetched in addition to instructions following branch; keep target until branch is executed (IBM 360/91) Loop buffer Very fast memory maintained by fetch stage of pipeline. Check buffer before fetching from memory (CRAY-1) Branch prediction Delayed branching Pipelining 21 Branch Prediction Concept: Instead of delaying the fetch of next instruction, it is predicted Results are stored in temporary registers If prediction correct, make results definitive If prediction incorrect, flush results, and restart fetching from the right address Pipelining 22 11
12 Branch Prediction Static Methods: Predict never or always Predicted by opcode There are two codes for each branch instruction 1 bit indicates predict or predict not Compiler analyses the code, guesses and generate the appropriate branch code. Processor follows compiler suggestion Implies in code incompatibility with previous processors Pipelining 23 Branch Prediction Dynamic Methods Base on recent branch history branch instruction address target address state predict not predict not predict not not predict not not Branch History Table Branch Prediction State Diagram Pipelining 24 12
13 Delayed Branch Concept The branch takes effect until after execution of the following instruction reduces the branch penalty not not MOV EDX,ECX ADD EAX,[EBX] JZ LA instruction MOV EDX,ECX ADD EAX,[EBX] JZ LA instruction ADD EAX,[EBX] JZ LA MOV EDX,ECX instrução ADD EAX,[EBX] JZ LA MOV EDX,ECX instruction always executed LA: instruction... LA: instruction... LA: instrução... LA: instruction... conventional branch delayed branch Pipelining 25 Delayed Branch Example: conventional branch Prediction wrong MOV EDX,ECX time ADD EAX,[EBX] JZ LA instruction 1 instruction 2 instruction 3 FO branch penalty instruction 4 Pipelining 26 13
14 Delayed Branch Example: delayed branch Prediction wrong ADD EAX,[EBX] time JZ LA MOV EDX,ECX instruction 1 instruction 2 branch penalty instruction 3 instruction 4 Pipelining 27 Exercises Exercise 1: Assume the pipeline shown in slide 7 containing 6 stages. Complete the graphs below that represent the pipeline operation assuming a single port memory. Hint: take in consideration the memory accesses for instruction fetch, operand fetch and result write. PROGRAM ADD EAX,[EBX+ESI] INC EBX DEC [ESI*2+EBP] MOV CX,[ 4768] PROGRAM ADD [EBX+ESI], EAX INC EBX DEC [ESI*2+EBP] MOV CX,[ 4768] Pipelining 28 14
15 Exercises Exercise 2: Repeat the previous exercise assuming that there are separate Instruction and Data caches, so that accesses to instruction and operands may occur simultaneously. PROGRAM ADD EAX,[EBX+ESI] INC EBX DEC [ESI*2+EBP] MOV CX,[ 4768] PROGRAM ADD [EBX+ESI], EAX INC EBX DEC [ESI*2+EBP] MOV CX,[ 4768] Pipelining 29 Text Book References The topics are covered in Stallings - sections 12.3 and 12.4 Pipelining 30 15
16 Pipelining END Pipelining 31 16
UNIT- 5. Chapter 12 Processor Structure and Function
UNIT- 5 Chapter 12 Processor Structure and Function CPU Structure CPU must: Fetch instructions Interpret instructions Fetch data Process data Write data CPU With Systems Bus CPU Internal Structure Registers
More informationChapter 14 - Processor Structure and Function
Chapter 14 - Processor Structure and Function Luis Tarrataca luis.tarrataca@gmail.com CEFET-RJ L. Tarrataca Chapter 14 - Processor Structure and Function 1 / 94 Table of Contents I 1 Processor Organization
More informationCPU Structure and Function
Computer Architecture Computer Architecture Prof. Dr. Nizamettin AYDIN naydin@yildiz.edu.tr nizamettinaydin@gmail.com http://www.yildiz.edu.tr/~naydin CPU Structure and Function 1 2 CPU Structure Registers
More informationModule 4c: Pipelining
Module 4c: Pipelining R E F E R E N C E S : S T A L L I N G S, C O M P U T E R O R G A N I Z A T I O N A N D A R C H I T E C T U R E M O R R I S M A N O, C O M P U T E R O R G A N I Z A T I O N A N D A
More informationCPU Structure and Function
CPU Structure and Function Chapter 12 Lesson 17 Slide 1/36 Processor Organization CPU must: Fetch instructions Interpret instructions Fetch data Process data Write data Lesson 17 Slide 2/36 CPU With Systems
More informationUNIT V: CENTRAL PROCESSING UNIT
UNIT V: CENTRAL PROCESSING UNIT Agenda Basic Instruc1on Cycle & Sets Addressing Instruc1on Format Processor Organiza1on Register Organiza1on Pipeline Processors Instruc1on Pipelining Co-Processors RISC
More informationChapter 9 Pipelining. Jin-Fu Li Department of Electrical Engineering National Central University Jungli, Taiwan
Chapter 9 Pipelining Jin-Fu Li Department of Electrical Engineering National Central University Jungli, Taiwan Outline Basic Concepts Data Hazards Instruction Hazards Advanced Reliable Systems (ARES) Lab.
More informationWilliam Stallings Computer Organization and Architecture
William Stallings Computer Organization and Architecture Chapter 11 CPU Structure and Function Rev. 3.2.1 (2005-06) by Enrico Nardelli 11-1 CPU Functions CPU must: Fetch instructions Decode instructions
More informationPipelining, Branch Prediction, Trends
Pipelining, Branch Prediction, Trends 10.1-10.4 Topics 10.1 Quantitative Analyses of Program Execution 10.2 From CISC to RISC 10.3 Pipelining the Datapath Branch Prediction, Delay Slots 10.4 Overlapping
More informationWilliam Stallings Computer Organization and Architecture 8 th Edition. Chapter 12 Processor Structure and Function
William Stallings Computer Organization and Architecture 8 th Edition Chapter 12 Processor Structure and Function CPU Structure CPU must: Fetch instructions Interpret instructions Fetch data Process data
More informationWilliam Stallings Computer Organization and Architecture. Chapter 11 CPU Structure and Function
William Stallings Computer Organization and Architecture Chapter 11 CPU Structure and Function CPU Structure CPU must: Fetch instructions Interpret instructions Fetch data Process data Write data Registers
More informationProcessor Structure and Function
WEEK 4 + Chapter 14 Processor Structure and Function + Processor Organization Processor Requirements: Fetch instruction The processor reads an instruction from memory (register, cache, main memory) Interpret
More informationTi Parallel Computing PIPELINING. Michał Roziecki, Tomáš Cipr
Ti5317000 Parallel Computing PIPELINING Michał Roziecki, Tomáš Cipr 2005-2006 Introduction to pipelining What is this What is pipelining? Pipelining is an implementation technique in which multiple instructions
More informationWilliam Stallings Computer Organization and Architecture 10 th Edition Pearson Education, Inc., Hoboken, NJ. All rights reserved.
+ William Stallings Computer Organization and Architecture 10 th Edition 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved. 2 + Chapter 14 Processor Structure and Function + Processor Organization
More informationChapter 8. Pipelining
Chapter 8. Pipelining Overview Pipelining is widely used in modern processors. Pipelining improves system performance in terms of throughput. Pipelined organization requires sophisticated compilation techniques.
More informationChapter 9. Pipelining Design Techniques
Chapter 9 Pipelining Design Techniques 9.1 General Concepts Pipelining refers to the technique in which a given task is divided into a number of subtasks that need to be performed in sequence. Each subtask
More informationRISC & Superscalar. COMP 212 Computer Organization & Architecture. COMP 212 Fall Lecture 12. Instruction Pipeline no hazard.
COMP 212 Computer Organization & Architecture Pipeline Re-Cap Pipeline is ILP -Instruction Level Parallelism COMP 212 Fall 2008 Lecture 12 RISC & Superscalar Divide instruction cycles into stages, overlapped
More informationWhat is Pipelining? RISC remainder (our assumptions)
What is Pipelining? Is a key implementation techniques used to make fast CPUs Is an implementation techniques whereby multiple instructions are overlapped in execution It takes advantage of parallelism
More informationChapter 12. CPU Structure and Function. Yonsei University
Chapter 12 CPU Structure and Function Contents Processor organization Register organization Instruction cycle Instruction pipelining The Pentium processor The PowerPC processor 12-2 CPU Structures Processor
More informationPipelining and Vector Processing
Chapter 8 Pipelining and Vector Processing 8 1 If the pipeline stages are heterogeneous, the slowest stage determines the flow rate of the entire pipeline. This leads to other stages idling. 8 2 Pipeline
More informationLecture 15: Pipelining. Spring 2018 Jason Tang
Lecture 15: Pipelining Spring 2018 Jason Tang 1 Topics Overview of pipelining Pipeline performance Pipeline hazards 2 Sequential Laundry 6 PM 7 8 9 10 11 Midnight Time T a s k O r d e r A B C D 30 40 20
More information1 Hazards COMP2611 Fall 2015 Pipelined Processor
1 Hazards Dependences in Programs 2 Data dependence Example: lw $1, 200($2) add $3, $4, $1 add can t do ID (i.e., read register $1) until lw updates $1 Control dependence Example: bne $1, $2, target add
More informationControl Hazards. Prediction
Control Hazards The nub of the problem: In what pipeline stage does the processor fetch the next instruction? If that instruction is a conditional branch, when does the processor know whether the conditional
More informationPipelining concepts The DLX architecture A simple DLX pipeline Pipeline Hazards and Solution to overcome
Pipeline Thoai Nam Outline Pipelining concepts The DLX architecture A simple DLX pipeline Pipeline Hazards and Solution to overcome Reference: Computer Architecture: A Quantitative Approach, John L Hennessy
More informationWhat is Pipelining? Time per instruction on unpipelined machine Number of pipe stages
What is Pipelining? Is a key implementation techniques used to make fast CPUs Is an implementation techniques whereby multiple instructions are overlapped in execution It takes advantage of parallelism
More informationECE 505 Computer Architecture
ECE 505 Computer Architecture Pipelining 2 Berk Sunar and Thomas Eisenbarth Review 5 stages of RISC IF ID EX MEM WB Ideal speedup of pipelining = Pipeline depth (N) Practically Implementation problems
More informationELE 655 Microprocessor System Design
ELE 655 Microprocessor System Design Section 2 Instruction Level Parallelism Class 1 Basic Pipeline Notes: Reg shows up two places but actually is the same register file Writes occur on the second half
More informationEast Tennessee State University Department of Computer and Information Sciences CSCI 4717 Computer Architecture TEST 3 for Fall Semester, 2005
Points missed: Student's Name: Total score: /100 points East Tennessee State University Department of Computer and Information Sciences CSCI 4717 Computer Architecture TEST 3 for Fall Semester, 2005 Section
More informationMemory Hierarchy Basics. Ten Advanced Optimizations. Small and Simple
Memory Hierarchy Basics Six basic cache optimizations: Larger block size Reduces compulsory misses Increases capacity and conflict misses, increases miss penalty Larger total cache capacity to reduce miss
More informationControl Hazards - branching causes problems since the pipeline can be filled with the wrong instructions.
Control Hazards - branching causes problems since the pipeline can be filled with the wrong instructions Stage Instruction Fetch Instruction Decode Execution / Effective addr Memory access Write-back Abbreviation
More informationCOMPUTER ORGANIZATION AND DESI
COMPUTER ORGANIZATION AND DESIGN 5 Edition th The Hardware/Software Interface Chapter 4 The Processor 4.1 Introduction Introduction CPU performance factors Instruction count Determined by ISA and compiler
More informationPipelining concepts The DLX architecture A simple DLX pipeline Pipeline Hazards and Solution to overcome
Thoai Nam Pipelining concepts The DLX architecture A simple DLX pipeline Pipeline Hazards and Solution to overcome Reference: Computer Architecture: A Quantitative Approach, John L Hennessy & David a Patterson,
More informationAdvanced Parallel Architecture Lessons 5 and 6. Annalisa Massini /2017
Advanced Parallel Architecture Lessons 5 and 6 Annalisa Massini - Pipelining Hennessy, Patterson Computer architecture A quantitive approach Appendix C Sections C.1, C.2 Pipelining Pipelining is an implementation
More informationPipelining. CSC Friday, November 6, 2015
Pipelining CSC 211.01 Friday, November 6, 2015 Performance Issues Longest delay determines clock period Critical path: load instruction Instruction memory register file ALU data memory register file Not
More informationThe Processor: Improving the performance - Control Hazards
The Processor: Improving the performance - Control Hazards Wednesday 14 October 15 Many slides adapted from: and Design, Patterson & Hennessy 5th Edition, 2014, MK and from Prof. Mary Jane Irwin, PSU Summary
More informationChapter 4. The Processor
Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified
More informationCOMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition The Processor - Introduction
More informationCOMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle
More informationChapter 4. Instruction Execution. Introduction. CPU Overview. Multiplexers. Chapter 4 The Processor 1. The Processor.
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor The Processor - Introduction
More informationWeek 11: Assignment Solutions
Week 11: Assignment Solutions 1. Consider an instruction pipeline with four stages with the stage delays 5 nsec, 6 nsec, 11 nsec, and 8 nsec respectively. The delay of an inter-stage register stage of
More informationMemory Hierarchy Basics
Computer Architecture A Quantitative Approach, Fifth Edition Chapter 2 Memory Hierarchy Design 1 Memory Hierarchy Basics Six basic cache optimizations: Larger block size Reduces compulsory misses Increases
More informationControl Hazards. Branch Prediction
Control Hazards The nub of the problem: In what pipeline stage does the processor fetch the next instruction? If that instruction is a conditional branch, when does the processor know whether the conditional
More informationadministrivia final hour exam next Wednesday covers assembly language like hw and worksheets
administrivia final hour exam next Wednesday covers assembly language like hw and worksheets today last worksheet start looking at more details on hardware not covered on ANY exam probably won t finish
More information3/12/2014. Single Cycle (Review) CSE 2021: Computer Organization. Single Cycle with Jump. Multi-Cycle Implementation. Why Multi-Cycle?
CSE 2021: Computer Organization Single Cycle (Review) Lecture-10b CPU Design : Pipelining-1 Overview, Datapath and control Shakil M. Khan 2 Single Cycle with Jump Multi-Cycle Implementation Instruction:
More informationLECTURE 3: THE PROCESSOR
LECTURE 3: THE PROCESSOR Abridged version of Patterson & Hennessy (2013):Ch.4 Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU
More informationFull Datapath. Chapter 4 The Processor 2
Pipelining Full Datapath Chapter 4 The Processor 2 Datapath With Control Chapter 4 The Processor 3 Performance Issues Longest delay determines clock period Critical path: load instruction Instruction memory
More informationDepartment of Computer and IT Engineering University of Kurdistan. Computer Architecture Pipelining. By: Dr. Alireza Abdollahpouri
Department of Computer and IT Engineering University of Kurdistan Computer Architecture Pipelining By: Dr. Alireza Abdollahpouri Pipelined MIPS processor Any instruction set can be implemented in many
More informationLecture 5: Instruction Pipelining. Pipeline hazards. Sequential execution of an N-stage task: N Task 2
Lecture 5: Instruction Pipelining Basic concepts Pipeline hazards Branch handling and prediction Zebo Peng, IDA, LiTH Sequential execution of an N-stage task: 3 N Task 3 N Task Production time: N time
More informationCSEE 3827: Fundamentals of Computer Systems
CSEE 3827: Fundamentals of Computer Systems Lecture 21 and 22 April 22 and 27, 2009 martha@cs.columbia.edu Amdahl s Law Be aware when optimizing... T = improved Taffected improvement factor + T unaffected
More informationComputer Architecture
Lecture 3: Pipelining Iakovos Mavroidis Computer Science Department University of Crete 1 Previous Lecture Measurements and metrics : Performance, Cost, Dependability, Power Guidelines and principles in
More informationSUPERSCALAR AND VLIW PROCESSORS
Datorarkitektur I Fö 10-1 Datorarkitektur I Fö 10-2 What is a Superscalar Architecture? SUPERSCALAR AND VLIW PROCESSORS A superscalar architecture is one in which several instructions can be initiated
More informationParallelism. Execution Cycle. Dual Bus Simple CPU. Pipelining COMP375 1
Pipelining COMP375 Computer Architecture and dorganization Parallelism The most common method of making computers faster is to increase parallelism. There are many levels of parallelism Macro Multiple
More informationCOMPUTER ORGANIZATION AND DESIGN
COMPUTER ORGANIZATION AND DESIGN 5 Edition th The Hardware/Software Interface Chapter 4 The Processor 4.1 Introduction Introduction CPU performance factors Instruction count CPI and Cycle time Determined
More informationPractice Problems (Con t) The ALU performs operation x and puts the result in the RR The ALU operand Register B is loaded with the contents of Rx
Microprogram Control Practice Problems (Con t) The following microinstructions are supported by each CW in the CS: RR ALU opx RA Rx RB Rx RB IR(adr) Rx RR Rx MDR MDR RR MDR Rx MAR IR(adr) MAR Rx PC IR(adr)
More informationWhat is Superscalar? CSCI 4717 Computer Architecture. Why the drive toward Superscalar? What is Superscalar? (continued) In class exercise
CSCI 4717/5717 Computer Architecture Topic: Instruction Level Parallelism Reading: Stallings, Chapter 14 What is Superscalar? A machine designed to improve the performance of the execution of scalar instructions.
More informationPipeline Processors David Rye :: MTRX3700 Pipelining :: Slide 1 of 15
Pipeline Processors Pipelining :: Slide 1 of 15 Pipeline Processors A common feature of modern processors Works like a series production line An operation is divided into k decoupled (independent) elementary
More informationMIPS Pipelining. Computer Organization Architectures for Embedded Computing. Wednesday 8 October 14
MIPS Pipelining Computer Organization Architectures for Embedded Computing Wednesday 8 October 14 Many slides adapted from: Computer Organization and Design, Patterson & Hennessy 4th Edition, 2011, MK
More informationThe Processor (3) Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University
The Processor (3) Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu EEE3050: Theory on Computer Architectures, Spring 2017, Jinkyu Jeong (jinkyu@skku.edu)
More information6.004 Tutorial Problems L22 Branch Prediction
6.004 Tutorial Problems L22 Branch Prediction Branch target buffer (BTB): Direct-mapped cache (can also be set-associative) that stores the target address of jumps and taken branches. The BTB is searched
More informationPipelining. Principles of pipelining Pipeline hazards Remedies. Pre-soak soak soap wash dry wipe. l Chapter 4.4 and 4.5
Pipelining Pre-soak soak soap wash dry wipe Chapter 4.4 and 4.5 Principles of pipelining Pipeline hazards Remedies 1 Multi-stage process Sequential execution One process begins after previous finishes
More informationF28HS Hardware-Software Interface: Systems Programming
F28HS Hardware-Software Interface: Systems Programming Hans-Wolfgang Loidl School of Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh Semester 2 2017/18 0 No proprietary software has
More informationEN164: Design of Computing Systems Topic 06.b: Superscalar Processor Design
EN164: Design of Computing Systems Topic 06.b: Superscalar Processor Design Professor Sherief Reda http://scale.engin.brown.edu Electrical Sciences and Computer Engineering School of Engineering Brown
More informationCPU Structure and Function. Chapter 12, William Stallings Computer Organization and Architecture 7 th Edition
CPU Structure and Function Chapter 12, William Stallings Computer Organization and Architecture 7 th Edition CPU must: CPU Function Fetch instructions Interpret/decode instructions Fetch data Process data
More informationVorlesung / Course IN2075: Mikroprozessoren / Microprocessors
Vorlesung / Course IN2075: Mikroprozessoren / Microprocessors Pipelining 20 Nov 2017 Carsten Trinitis (LRR) P Techniques: Pipelining etc. Processor Organisation Tasks of a processor Input: Stream of instructions
More informationChapter 4. The Processor
Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified
More informationInstruction Pipelining
Instruction Pipelining Simplest form is a 3-stage linear pipeline New instruction fetched each clock cycle Instruction finished each clock cycle Maximal speedup = 3 achieved if and only if all pipe stages
More informationLRU. Pseudo LRU A B C D E F G H A B C D E F G H H H C. Copyright 2012, Elsevier Inc. All rights reserved.
LRU A list to keep track of the order of access to every block in the set. The least recently used block is replaced (if needed). How many bits we need for that? 27 Pseudo LRU A B C D E F G H A B C D E
More informationSuperscalar Processors Ch 14
Superscalar Processors Ch 14 Limitations, Hazards Instruction Issue Policy Register Renaming Branch Prediction PowerPC, Pentium 4 1 Superscalar Processing (5) Basic idea: more than one instruction completion
More informationSuperscalar Processing (5) Superscalar Processors Ch 14. New dependency for superscalar case? (8) Output Dependency?
Superscalar Processors Ch 14 Limitations, Hazards Instruction Issue Policy Register Renaming Branch Prediction PowerPC, Pentium 4 1 Superscalar Processing (5) Basic idea: more than one instruction completion
More informationInstruction Pipelining
Instruction Pipelining Simplest form is a 3-stage linear pipeline New instruction fetched each clock cycle Instruction finished each clock cycle Maximal speedup = 3 achieved if and only if all pipe stages
More informationSuperscalar Processors
Superscalar Processors Superscalar Processor Multiple Independent Instruction Pipelines; each with multiple stages Instruction-Level Parallelism determine dependencies between nearby instructions o input
More informationCS311 Lecture: Pipelining, Superscalar, and VLIW Architectures revised 10/18/07
CS311 Lecture: Pipelining, Superscalar, and VLIW Architectures revised 10/18/07 Objectives ---------- 1. To introduce the basic concept of CPU speedup 2. To explain how data and branch hazards arise as
More informationECE260: Fundamentals of Computer Engineering
Pipelining James Moscola Dept. of Engineering & Computer Science York College of Pennsylvania Based on Computer Organization and Design, 5th Edition by Patterson & Hennessy What is Pipelining? Pipelining
More informationDepartment of Computer Science and Engineering
Department of Computer Science and Engineering UNIT-III PROCESSOR AND CONTROL UNIT PART A 1. Define MIPS. MIPS:One alternative to time as the metric is MIPS(Million Instruction Per Second) MIPS=Instruction
More informationMidnight Laundry. IC220 Set #19: Laundry, Co-dependency, and other Hazards of Modern (Architecture) Life. Return to Chapter 4
IC220 Set #9: Laundry, Co-dependency, and other Hazards of Modern (Architecture) Life Return to Chapter 4 Midnight Laundry Task order A B C D 6 PM 7 8 9 0 2 2 AM 2 Smarty Laundry Task order A B C D 6 PM
More informationCPE300: Digital System Architecture and Design
CPE300: Digital System Architecture and Design Fall 2011 MW 17:30-18:45 CBC C316 Pipelining 11142011 http://www.egr.unlv.edu/~b1morris/cpe300/ 2 Outline Review I/O Chapter 5 Overview Pipelining Pipelining
More informationLecture 1 An Overview of High-Performance Computer Architecture. Automobile Factory (note: non-animated version)
Lecture 1 An Overview of High-Performance Computer Architecture ECE 463/521 Fall 2002 Edward F. Gehringer Automobile Factory (note: non-animated version) Automobile Factory (note: non-animated version)
More informationLecture: Out-of-order Processors. Topics: out-of-order implementations with issue queue, register renaming, and reorder buffer, timing, LSQ
Lecture: Out-of-order Processors Topics: out-of-order implementations with issue queue, register renaming, and reorder buffer, timing, LSQ 1 An Out-of-Order Processor Implementation Reorder Buffer (ROB)
More information6x86 PROCESSOR Superscalar, Superpipelined, Sixth-generation, x86 Compatible CPU
1-6x86 PROCESSOR Superscalar, Superpipelined, Sixth-generation, x86 Compatible CPU Product Overview Introduction 1. ARCHITECTURE OVERVIEW The Cyrix 6x86 CPU is a leader in the sixth generation of high
More informationProcessor (II) - pipelining. Hwansoo Han
Processor (II) - pipelining Hwansoo Han Pipelining Analogy Pipelined laundry: overlapping execution Parallelism improves performance Four loads: Speedup = 8/3.5 =2.3 Non-stop: 2n/0.5n + 1.5 4 = number
More informationCS Computer Architecture
CS 35101 Computer Architecture Section 600 Dr. Angela Guercio Fall 2010 An Example Implementation In principle, we could describe the control store in binary, 36 bits per word. We will use a simple symbolic
More informationFinal Lecture. A few minutes to wrap up and add some perspective
Final Lecture A few minutes to wrap up and add some perspective 1 2 Instant replay The quarter was split into roughly three parts and a coda. The 1st part covered instruction set architectures the connection
More informationInstr. execution impl. view
Pipelining Sangyeun Cho Computer Science Department Instr. execution impl. view Single (long) cycle implementation Multi-cycle implementation Pipelined implementation Processing an instruction Fetch instruction
More informationFull Datapath. Chapter 4 The Processor 2
Pipelining Full Datapath Chapter 4 The Processor 2 Datapath With Control Chapter 4 The Processor 3 Performance Issues Longest delay determines clock period Critical path: load instruction Instruction memory
More informationSuperscalar Processors Ch 13. Superscalar Processing (5) Computer Organization II 10/10/2001. New dependency for superscalar case? (8) Name dependency
Superscalar Processors Ch 13 Limitations, Hazards Instruction Issue Policy Register Renaming Branch Prediction 1 New dependency for superscalar case? (8) Name dependency (nimiriippuvuus) two use the same
More informationBasic Instruction Timings. Pipelining 1. How long would it take to execute the following sequence of instructions?
Basic Instruction Timings Pipelining 1 Making some assumptions regarding the operation times for some of the basic hardware units in our datapath, we have the following timings: Instruction class Instruction
More informationDesigning for Performance. Patrick Happ Raul Feitosa
Designing for Performance Patrick Happ Raul Feitosa Objective In this section we examine the most common approach to assessing processor and computer system performance W. Stallings Designing for Performance
More informationChapter 06: Instruction Pipelining and Parallel Processing
Chapter 06: Instruction Pipelining and Parallel Processing Lesson 09: Superscalar Processors and Parallel Computer Systems Objective To understand parallel pipelines and multiple execution units Instruction
More informationCSE Lecture 13/14 In Class Handout For all of these problems: HAS NOT CANNOT Add Add Add must wait until $5 written by previous add;
CSE 30321 Lecture 13/14 In Class Handout For the sequence of instructions shown below, show how they would progress through the pipeline. For all of these problems: - Stalls are indicated by placing the
More informationCPSC 313, 04w Term 2 Midterm Exam 2 Solutions
1. (10 marks) Short answers. CPSC 313, 04w Term 2 Midterm Exam 2 Solutions Date: March 11, 2005; Instructor: Mike Feeley 1a. Give an example of one important CISC feature that is normally not part of a
More informationInstruction Pipelining Review
Instruction Pipelining Review Instruction pipelining is CPU implementation technique where multiple operations on a number of instructions are overlapped. An instruction execution pipeline involves a number
More informationMinimizing Data hazard Stalls by Forwarding Data Hazard Classification Data Hazards Present in Current MIPS Pipeline
Instruction Pipelining Review: MIPS In-Order Single-Issue Integer Pipeline Performance of Pipelines with Stalls Pipeline Hazards Structural hazards Data hazards Minimizing Data hazard Stalls by Forwarding
More informationECE473 Computer Architecture and Organization. Pipeline: Control Hazard
Computer Architecture and Organization Pipeline: Control Hazard Lecturer: Prof. Yifeng Zhu Fall, 2015 Portions of these slides are derived from: Dave Patterson UCB Lec 15.1 Pipelining Outline Introduction
More informationLecture 8: Branch Prediction, Dynamic ILP. Topics: static speculation and branch prediction (Sections )
Lecture 8: Branch Prediction, Dynamic ILP Topics: static speculation and branch prediction (Sections 2.3-2.6) 1 Correlating Predictors Basic branch prediction: maintain a 2-bit saturating counter for each
More informationc. What are the machine cycle times (in nanoseconds) of the non-pipelined and the pipelined implementations?
Brown University School of Engineering ENGN 164 Design of Computing Systems Professor Sherief Reda Homework 07. 140 points. Due Date: Monday May 12th in B&H 349 1. [30 points] Consider the non-pipelined
More informationInstruction Level Parallelism
Instruction Level Parallelism Software View of Computer Architecture COMP2 Godfrey van der Linden 200-0-0 Introduction Definition of Instruction Level Parallelism(ILP) Pipelining Hazards & Solutions Dynamic
More informationCS 3510 Comp&Net Arch
CS 3510 Comp&Net Arch Pipeline Dr. Ken Hoganson 2010 Enhancing Performance We observed that we can obtain better performance in executing instructions, if a single cycle accomplishes multiple operations:
More informationChapter. Out of order Execution
Chapter Long EX Instruction stages We have assumed that all stages. There is a problem with the EX stage multiply (MUL) takes more time than ADD MUL ADD We can clearly delay the execution of the ADD until
More informationImprove performance by increasing instruction throughput
Improve performance by increasing instruction throughput Program execution order Time (in instructions) lw $1, 100($0) fetch 2 4 6 8 10 12 14 16 18 ALU Data access lw $2, 200($0) 8ns fetch ALU Data access
More informationCOSC 6385 Computer Architecture. - Memory Hierarchies (II)
COSC 6385 Computer Architecture - Memory Hierarchies (II) Fall 2008 Cache Performance Avg. memory access time = Hit time + Miss rate x Miss penalty with Hit time: time to access a data item which is available
More information