Chapter 3 & Appendix C Pipelining Part A: Basic and Intermediate Concepts

Similar documents
Lecture 6: Pipelining

Designing a Pipelined CPU

CPS104 Computer Organization and Programming Lecture 19: Pipelining. Robert Wagner

Pipeline: Introduction

Outline Marquette University

ELCT 501: Digital System Design

微算機系統第六章. Enhancing Performance with Pipelining 陳伯寧教授電信工程學系國立交通大學. Ann, Brian, Cathy, Dave each have one load of clothes to wash, dry, and fold

SI232 Set #20: Laundry, Co-dependency, and other Hazards of Modern (Architecture) Life. Chapter 6 ADMIN. Reading for Chapter 6: 6.1,

Chapter Six. Dataı access. Reg. Instructionı. fetch. Dataı. Reg. access. Dataı. Reg. access. Dataı. Instructionı fetch. 2 ns 2 ns 2 ns 2 ns 2 ns

CSE 2021 Computer Organization. Hugh Chesser, CSEB 1012U W12-M

Chapter 4 (Part II) Sequential Laundry

Chapter 8. Pipelining

Modern Computer Architecture

Computer Architecture. Lecture 6.1: Fundamentals of

Processor Design CSCE Instructor: Saraju P. Mohanty, Ph. D. NOTE: The figures, text etc included in slides are borrowed

CSE 141 Computer Architecture Spring Lectures 11 Exceptions and Introduction to Pipelining. Announcements

CS 110 Computer Architecture. Pipelining. Guest Lecture: Shu Yin. School of Information Science and Technology SIST

Computer Systems Architecture Spring 2016

Pipeline Processors David Rye :: MTRX3700 Pipelining :: Slide 1 of 15

Pipelining. Maurizio Palesi

Midnight Laundry. IC220 Set #19: Laundry, Co-dependency, and other Hazards of Modern (Architecture) Life. Return to Chapter 4

Page 1. Pipelining: Its Natural! Chapter 3. Pipelining. Pipelined Laundry Start work ASAP. Sequential Laundry A B C D. 6 PM Midnight

Lecture 3. Pipelining. Dr. Soner Onder CS 4431 Michigan Technological University 9/23/2009 1

COMP2611: Computer Organization. The Pipelined Processor

Advanced Computer Architecture Pipelining

Improve performance by increasing instruction throughput

Chapter 5 (a) Overview

Designing a Pipelined CPU

Enhanced Performance with Pipelining

CPE Computer Architecture. Appendix A: Pipelining: Basic and Intermediate Concepts

What do we have so far? Multi-Cycle Datapath (Textbook Version)

CS 61C: Great Ideas in Computer Architecture Control and Pipelining

Pipelining: Overview. CPSC 252 Computer Organization Ellen Walker, Hiram College

T = I x CPI x C. Both effective CPI and clock cycle C are heavily influenced by CPU design. CPI increased (3-5) bad Shorter cycle good

Pipelining Analogy. Pipelined laundry: overlapping execution. Parallelism improves performance. Four loads: Non-stop: Speedup = 8/3.5 = 2.3.

ECE154A Introduction to Computer Architecture. Homework 4 solution

Pipelining, Instruction Level Parallelism and Memory in Processors. Advanced Topics ICOM 4215 Computer Architecture and Organization Fall 2010

Basic Instruction Timings. Pipelining 1. How long would it take to execute the following sequence of instructions?

Unpipelined Machine. Pipelining the Idea. Pipelining Overview. Pipelined Machine. MIPS Unpipelined. Similar to assembly line in a factory

Pipelined Datapath. One register file is enough

Lecture 19 Introduction to Pipelining

Working on the Pipeline

Lecture 05: Pipelining: Basic/ Intermediate Concepts and Implementation

The Pipelined MIPS Processor

ECEC 355: Pipelining

Processor Architecture. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Pipelined Datapath. Reading. Sections Practice Problems: 1, 3, 8, 12

3/12/2014. Single Cycle (Review) CSE 2021: Computer Organization. Single Cycle with Jump. Multi-Cycle Implementation. Why Multi-Cycle?

Pipelined Datapath. Reading. Sections Practice Problems: 1, 3, 8, 12 (2) Lecture notes from MKP, H. H. Lee and S.

EE 457 Unit 6a. Basic Pipelining Techniques

The Processor. Z. Jerry Shi Department of Computer Science and Engineering University of Connecticut. CSE3666: Introduction to Computer Architecture

Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier

Review: Computer Organization

Processor Architecture

Pipelining. Chapter 4

Appendix A. Overview

Topics. Lecture 12: Pipelining. Introduction to pipelining. Pipelined datapath. Hazards in pipeline. Performance. Design issues.

PIPELINING. Pipelining: Natural Phenomenon. Pipelining. Pipelining Lessons

Overview of Pipelining

COSC 6385 Computer Architecture - Pipelining

Pipelining. Ideal speedup is number of stages in the pipeline. Do we achieve this? 2. Improve performance by increasing instruction throughput ...

Computer Architecture

Processor (II) - pipelining. Hwansoo Han

EITF20: Computer Architecture Part2.2.1: Pipeline-1

Lecture 15: Pipelining. Spring 2018 Jason Tang

Overview. Appendix A. Pipelining: Its Natural! Sequential Laundry 6 PM Midnight. Pipelined Laundry: Start work ASAP

EITF20: Computer Architecture Part2.2.1: Pipeline-1

Chapter 6: Pipelining

ECE260: Fundamentals of Computer Engineering

Full Datapath. Chapter 4 The Processor 2

EI338: Computer Systems and Engineering (Computer Architecture & Operating Systems)

EITF20: Computer Architecture Part2.2.1: Pipeline-1

MIPS Pipelining. Computer Organization Architectures for Embedded Computing. Wednesday 8 October 14

Module 4c: Pipelining

Design a MIPS Processor (2/2)

CISC 662 Graduate Computer Architecture Lecture 6 - Hazards

Lecture Topics. Announcements. Today: Data and Control Hazards (P&H ) Next: continued. Exam #1 returned. Milestone #5 (due 2/27)

Topics. Pipelining Lessons (Task, Resource) Pipelining Lessons (Time cycles) Pipelining is Natural! Laundry Example

Multi-cycle Datapath (Our Version)

Chapter 4 The Processor 1. Chapter 4A. The Processor

Pipelining. Pipeline performance

14:332:331 Pipelined Datapath

MC9211Computer Organization. Unit 4 Lesson 1 Processor Design

CSE 2021 Computer Organization. Hugh Chesser, CSEB 1012U W9-W

What do we have so far? Multi-Cycle Datapath

Chapter 3 & Appendix C Part B: ILP and Its Exploitation

Full Datapath. CSCI 402: Computer Architectures. The Processor (2) 3/21/19. Fengguang Song Department of Computer & Information Science IUPUI

Lecture 10: Pipelined Implementations

RISC Pipeline. Kevin Walsh CS 3410, Spring 2010 Computer Science Cornell University. See: P&H Chapter 4.6

Perfect Student CS 343 Final Exam May 19, 2011 Student ID: 9999 Exam ID: 9636 Instructions Use pencil, if you have one. For multiple choice

ECE232: Hardware Organization and Design

Pipelining. CSC Friday, November 6, 2015

Assignment 1 solutions

Pipelining: Basic Concepts

Computer and Information Sciences College / Computer Science Department Enhancing Performance with Pipelining

CO Computer Architecture and Programming Languages CAPL. Lecture 18 & 19

Chapter 4 The Processor

ECE260: Fundamentals of Computer Engineering

INSTITUTO SUPERIOR TÉCNICO. Architectures for Embedded Computing

Chapter 4 The Processor 1. Chapter 4B. The Processor

Transcription:

CS359: Computer Architecture Chapter 3 & Appendix C Pipelining Part A: Basic and Intermediate Concepts Yanyan Shen Department of Computer Science and Engineering Shanghai Jiao Tong University

Parallel Processing 2

Outline C. Introduction to Pipelining How Pipeline is Implemented Pipeline Hazards Exceptions Handling ulticycle Operations 3

Processor Performance Performance of single-cycle processor is limited by the long critical path delay The critical path limits the operating clock frequency Can we do better? New semiconductor technology will reduce the critical path delay by manufacturing smallsized transistors Core 2 Duo is manufactured with 65nm technology Core i7 is manufactured with 5nm technology Next semiconductor technology is nm technology Can we increase the processor performance with a different microarchitecture? Yes! Pipelining

Revisiting Performance Laundry Example Ann, Brian, Cathy, Dave each has one load of clothes to wash, dry, and fold Washer takes 3 minutes Dryer takes minutes Folder takes 2 minutes A B C D 5

Sequential Laundry 6 P 7 8 9 idnight Time T a s k O r d e r A B C D 3 2 3 2 3 2 3 2 Response time: 9 mins Throughput:.67 tasks / hr (= 9mins/task, 6 hours for loads) 6

Pipelined Laundry T a s k O r d e r A B C D 6 P 7 8 9 idnight Time 3 2 Notes Pipelining doesn t help latency (response time) of a single task Pipelining helps throughput of entire workload ultiple tasks operating simultaneously Potential speedup = # of pipeline stages Unbalanced lengths of pipeline stages reduce speedup Response time: 9 mins Throughput:. tasks / hr (= 52.5 mins/task, 3.5 hours for loads) 7

Pipelining Improve performance by increasing instruction throughput Fetch ister File Access () Operation Access ister Access () 2ns ns 2ns 2ns ns Sequential Execution Pipelined Execution Program execution order Time (in instructions) lw $, ($) lw $2, 2($) lw $3, 3($) Program execution Time order (in instructions) lw $, ($) lw $2, 2($) lw $3, 3($) fetch fetch 2 ns 2 6 8 2 6 8 8 ns access fetch 8 ns 2 6 8 2 fetch 2 ns fetch access access access 2 ns 2 ns 2 ns 2 ns 2 ns access fetch 8 ns... 8

Pipelining (Cont.) ultiple instructions are being executed simultaneously Program execution Time order (in instructions) lw $, ($) fetch 2 6 8 2 access lw $2, 2($) 2 ns fetch access lw $3, 3($) 2 ns fetch access fetch access fetch access fetch access fetch access fetch access fetch access 9

Pipelining (Cont.) Pipeline Speedup Time to execute an instruction pipeline = Time to execute an instruction sequential Number of stages If not balanced, speedup is less Speedup comes from increased throughput the latency of instruction does not decrease

Outline Introduction to Pipelining C.2 How Pipeline is Implemented Pipeline Hazards Exceptions Handling ulticycle Operations

Basic Idea IF: fetch u x ID: decode/ file read EX: Execute/ address calculation E: emory access : back ress isters 2 u x ress u x 6 What do we have to add to actually split the path into stages? 2

Basic Idea IF: fetch u x ID: decode/ file read EX: Execute/ address calculation E: emory access : back ress isters 2 u x ress u x 6 clock D Q F/F Q D Q F/F Q D Q F/F Q D Q F/F Q 3

Graphically Representing Pipelines Time 2 6 8 lw IF ID EX E add IF ID EX E Shading indicates the unit is being used by the instruction Shading on the right half of the file (ID or ) or means the element is being read in that stage Shading on the left half means the element is being written in that stage

Pipelined path u x IF/ID ID/EX EX/E E/ ress isters 2 u x ress u x 6 5

lw: Fetch (IF) fetch IF/ID ID/EX EX/E E/ ress isters 2 ress 6 6

lw: Decode (ID) decode IF/ID ID/EX EX/E E/ ress isters 2 ress 6 7

lw: Execution (EX) Execution IF/ID ID/EX EX/E E/ ress isters 2 ress 6 8

lw: emory (E) emory IF/ID ID/EX EX/E E/ ress isters 2 ress 6 9

lw: back () back IF/ID ID/EX EX/E E/ ress isters 2 ress 6 2

sw: emory (E) emory IF/ID ID/EX EX/E E/ ress isters 2 ress 6 2

sw: back (): do nothing back IF/ID ID/EX EX/E E/ ress isters 2 ress 6 22

Corrected path (for lw) IF/ID ID/EX EX/E E/ ress isters 2 ress 6 23

Pipelining Example add $, $5, $6 lw $3, 2($) add $2, $3, $ sub $, $2, $3 lw $, 2($) IF/ID ID/EX EX/E E/ ress isters 2 ress 6 2

Pipeline Control u x Src Note that in this implementation, the branch is resolved in the E stage IF/ID ID/EX EX/E E/ Branch ress isters 2 [5 ] 6 Src u x 6 control ress em em emto u x [2 6] [5 ] u x Op Dst 25

Pipeline Control What needs to be controlled in each stage (IF, ID, EX, E, )? IF: fetch and increment ID: decode and operand fetch from file and/or immediate EX: Execution stage Dst op[:] Src A: emory stage Branch em em : back emto (note that this signal is in ID stage) 26

Pipeline Control Extend pipeline s to include control information created in ID stage Pass control signals along just like the Execution/ress Calculation stage control lines emory access stage control lines -back stage control lines Dst Op Op Src Branch em em write em to R-format lw sw X X beq X X Control EX IF/ID ID/EX EX/E E/ 27

path with Control Src u x Control ID/EX EX/E E/ IF/ID EX ress isters 2 u x Src Branch em ress emto u x 6 [5 ] 6 control em [2 6] [5 ] u x Dst Op 28

path with Control IF: lw $, 9($) Src Control ID/EX EX/E E/ IF/ID EX ress isters 2 Src Branch em ress emto [5 ] 6 6 control em [2 6] [5 ] Op Dst 29

path with Control IF: sub $, $2, $3 ID: lw $, 9($) Src lw Control ID/EX EX/E E/ IF/ID E X ress isters 2 Src Branch em ress emto [5 ] 6 6 control em [2 6] [5 ] Op Dst 3

path with Control IF: and $2, $, $5 ID: sub $, $2, $3 EX: lw $, 9($) Src IF/ID sub Control ID/EX EX EX/E E/ ress isters 2 Src Branch em ress emto [5 ] 6 6 control em [2 6] [5 ] Op Dst 3

path with Control IF: or $3, $6, $7 ID: and $2, $, $5 EX: sub $, $2, $3 E: lw $, 9($) Src IF/ID and Control ID/EX EX EX/E E/ ress isters 2 Src Branch em ress emto [5 ] 6 6 control em [2 6] [5 ] Op Dst

path with Control IF: add $, $8, $9 Src ID: or $3, $6, $7 EX: and $2, $, $5 E: sub $,.. : lw $, 9($) IF/ID or Control ID/EX EX EX/E E/ ress isters 2 Src Branch em ress emto [5 ] 6 6 control em [2 6] [5 ] Op Dst 33

path with Control IF: xxxx ID: add $, $8, $9 EX: or $3, $6, $7 E: and $2 Src : sub $,.. IF/ID add Control ID/EX EX EX/E E/ ress isters 2 Src Branch em ress emto [5 ] 6 6 control em [2 6] [5 ] Op Dst 3

path with Control IF: xxxx ID: xxxx EX: add $, $8, $9 E: or $3,.. Src : and $2 IF/ID Control ID/EX EX EX/E E/ ress isters 2 Src Branch em ress emto [5 ] 6 6 control em [2 6] [5 ] Op Dst 35

path with Control IF: xxxx ID: xxxx EX: xxxx E: add $,.. Src : or $3 IF/ID Control ID/EX EX EX/E E/ ress isters 2 Src Branch em ress emto [5 ] 6 6 control em [2 6] [5 ] Op Dst 36

path with Control IF: xxxx ID: xxxx EX: xxxx E: xxxx Src : add $.. Control ID/EX EX/E E/ IF/ID EX ress isters 2 Src Branch em ress emto [5 ] 6 6 control em [2 6] [5 ] Op Dst 37