Today s Menu. Multi-Cycle Exceptions Pipelining. Exceptions and Interrupts. Handling Exceptions and Interrupts. Why This Is Very Messy.

Size: px
Start display at page:

Download "Today s Menu. Multi-Cycle Exceptions Pipelining. Exceptions and Interrupts. Handling Exceptions and Interrupts. Why This Is Very Messy."

Transcription

1 ulti-cycle Exceptions Today s enu Exceptions hat are they? hat do we do about them? Introduction to pipelining hy pipelining? hy is it difficult? How can we do it efficiently? Examples 1 2 Exceptions and Interrupts Exceptions are exceptional events that disrupt the normal flow of a program Terminology varies between different machines Examples of Interrupts ser hitting the keyboard Disk drive asking for attention Arrival of a network packet Examples of Exceptions Divide by zero Overflow Page fault Handling Exceptions and Interrupts hen do we jump to an exception? pon detection, invoke the O to service the event ight when it occurs? hat about in the middle of executing a multi-cycle instruction Difficult to abort the middle of an instruction Processor checks for event at the end of every instruction Processor provides E & Cause registers to inform O of cause E - Exception Counter Holds that the O should jump to when resuming execution Cause ister Holds bit-encoded cause of the exception 3 Exception Flow hy This Is Very essy hen an exception (or interrupt) occurs, control is transferred to the O hen the O is done, it jumps back to the user program (if it can) ser Process Event exception Exception return (tional) Operating ystem Exception processing by exception handler 5 You have many instructions in flight In one of these instructions, a bad thing happens, eg, divide-by-zero hat do we have to do? e have to deal with this event, since normal program execution is probably now incorrect But, we have a bunch of instructions in flight any of them, but maybe not all of them, need to get killed Don t want to kill stuff that is actually correct, and waste that work. hen do we kill them? NO -- die die die.? ait till exception-causing instruction finishes? ait till the pipeline empties? Very very very messy part of real machine design. 6

2 eview of ulticycle vs. ingle Cycle Complete ingle-cycle Datapath ingle cycle implementations have to consider the worst case delay through the path to come-up with the cycle time. ulticycle implementations have the advantage of using a different number of cycles for executing each instruction. Current emory (A) ADDE In general, the multicycle machine is better than the single cycle machine, but the actual execution time strongly depends on the workload. The most widely used machine implementation is neither single cycle, nor multicycle it s the pipelined implementation. (Next lecture) ister File 1 2 Data 1 rite Data emory (A) rite Data 7 8 Cost of the ingle Cycle Architecture ulti-cycle olution Instr Class 1 Instr Class 2 Instr Class 3 Our Cycle (longest ) Idea: Let the FATET instruction determine clock period Instr Class 1 Instr Class 2 Instr Class 3 Takes cycles Takes 2 cycles ost of the time is wasted! 9 Less asted ulti-cycle eality ulticycle Control Add Intermediate isters e are going to go further than allowing the fastest instruction to determine rate e are going to break EVEY instruction up into phases -class Load em IorD emrite Irite ister Dest rite 1 2 Data 1 A rite B ela Out Branch tore D Extend emto [5:0] hift left 2 elb Control Op 11 12

3 ulticycle Let s build cars Henry Ford, odel T, 1908 Non-pipelined: 1 car/ hours 13 1 Henry Ford, odel T, 1908 Henry Ford, odel T, 1908 Non-pipelined: 1 car/ hours Non-pipelined: 1 car/ hours Henry Ford, odel T, 1908 Henry Ford, odel T, 1908 Non-pipelined: 1 car/ hours Non-pipelined: 1 car/ hours 17 18

4 Analogy: Gasoline Transportation Henry Ford, odel T, 1908 Non-pipelined: 1 car/ hours pipelined: 1 car/hour 19 Trucking gas from depot to gas station Get the barrels Load them into the truck Drive to the gas station nload the gas eturn for more oil Let s do the math Each truck can carry 5 barrels Can load a truck with 5 barrels in 1 hour It takes each truck 1 day to drive to and from gas station Q: How many barrels per week are delivered? Q: hat if I had more trucks? GA TATION 20 Looks a Lot Like a ulticycle Processor hat are the steps Fetch an instruction (Get the barrels) Decode the instruction (Load them into the truck) OP (Drive to the gas station) emory Access (nload the gas) rite-back (eturn for more oil) Business 201 GA TATION emory ister 1 Data 1 2 rite hift Extend left 2 21 oll the barrels down the road Big fire hazard - probably will not meet OHA standards Occupational afety and Health Administration 22 Business 201 Trucking vs. Pipelines GA TATION GA TATION Build a pipeline ill meet OHA standards ight make the environmentalists angry Now let s do the math Pipeline can accept 1 barrel every hour Q: How many barrels get delivered to the gas station per day? Q: How many barrels are in-flight at any moment? Trucks Each truck can carry 5 barrels Can load a truck with 5 barrels in 1 hour Truck takes 1 day to drive to and from gas station LOT of TE when loading area, gas station, and pieces of the road are unused nless you have lots of trucks Pipelines Pipeline can accept 1 barrel every hour esources (loading area, gas station, pipeline) are always in use As long as you can keep your pipeline full (e.g., you have enough barrels) 23 2

5 Big Idea: Pipeline Concurrency Big Idea: It s Faster This computation is too long I can launch a new computation every 0ns in this structure 0 ns 0 ns Pipelined version, 5 pipe stages Pipelined version, 5 pipe stages: I can launch a new computation every 20ns in pipelined structure ~20 ns Latches, called Pipeline registers break up computation into stages ~20 ns Latches, called Pipeline registers break up computation into stages : Implementation Issues hat prevents us from just doing a zillion pipe stages? ome computations just won t divide into any finer (shorter in time) logical implementations ltimately, often comes down to circuit design issues ~20 ns ~2 ns 5 stages: OK 50 stages: ne, sorry 27 : Implementation Issues hat prevents us from just doing a zillion pipe stages? Those latches are NOT free, they take up area, and there is a real delay to go TH the latch itself ~2ns ~0.2ns In modern, deep pipeline (-20 stages), this is a real effect stage pipe Typically see logic depths in one pipe stage of -20 gates ~20 At these speeds, and with this few levels of logic, latch delay is important 28 emember the A big.little Idea? LITTLE How any Pipeline tages? E.g., Intel Pentium : over 20 stages ore than 120 instructions in flight High clock frequency (>3GHz) High I (s per Cycle) BIG Pipeline depth: 8- uch lower power Too many stages: Lots of complications hould take care of possible dependencies among in-flight instructions Control logic is huge Too little work per stage, too high a branch miss-prediction penalty bad performance Pipeline depth: 15-2 uch higher frequency 29 30

6 Performance of Pipelined ystems IP Pipeline tages npipelined instructions Throughput: 1 per 5 cycles Pipelined Pipeline stage time time Latency 5 cycles tage 1: Fetch IF tage 2: Decode ID tage 3: Execute E tage : emory Access E tage 5: rite Back (to register file) B Throughput: 1 per 1 cycle Ideal speedup only if we can keep the pipeline full! Latency 5 cycles Ideally, peedup pipeline = sequential Pipeline Depth stage Version of IP Datapath Complete 5 tage Pipeline (Drawn maller) TAGE 1 Instr. Fetch TAGE2 Decode TAGE 3 Execute TAGE emacc TAGE 5 riteback Current Current IF/ID ID/E E/E E/B E G I T emory (A) E ister File E 1 G 1 I rite T E E G I T E rite Data Data emory E (A) G I T E emory (A) ister File 1 1 rite ADDE Data emory (A) 33 3 Flow of s Through Pipeline L 1, 0(0) Cycle 1 Cycle 2 Cycle 3 Cycle D Cycle 5 Cycle 6 tage 1 - IF ( Fetch) Fetch L L 2,200(0) D Current IF/ID ID/E E/E E/B L 3, 300(0) In cycle we have 3 instructions in-flight : Inst 1 is accessing the memory (D) Inst 2 is using the (E) Inst 3 is access the register file (ID) D 35 emory (A) ister File 1 1 rite ADDE Data emory (A) 36

7 tage 2 - ID ( Decode) tage 3 - E () Decode L L Current IF/ID ID/E E/E E/B Current IF/ID ID/E E/E E/B emory (A) ister File 1 1 rite ADDE Data emory (A) 37 emory (A) ister File 1 1 rite ADDE Data emory (A) 38 tage - E (emory) Current emory (A) IF/ID ID/E E/E E/B ister File 1 1 rite ADDE emory L Data emory (A) 39 tage 5 - B (rite Back) Current emory (A) ister File 1 1 rite exte 16 nd 32 ADDE Data emory (A) riteback L IF/ID ID/E E/E E/B 0 New Complications The good news ultiple instructions are running at the same time, thru the path This works because each stage of pipeline is isolated by latches o, in the best of all possible worlds, N stage pipe has N instructions flowing thru it, speedup is close to N. The bad news s interfere with each other Common name for these: conflicts hy? Different instructions in flight thru path at same time Different instructions might want to use the same piece of hardware in the path at the same time (i.e., in same clock cycle) These conflicts contention for an over-used resource are the source of endless grief in pipeline design 1 Good News: >1 In Flight in Pipe ADD 2,3,1 B 5,6,7 ADD,11,12 Cycle 1 Cycle 2 Cycle 3 Cycle D Cycle 5 Cycle 6 D D 2

8 Bad News: s Interfere ADD, 11, 12 Cycle 1 Cycle 2 Cycle 3 Cycle D Cycle 5 Cycle 6 rite to the register file Interference in a Pipe In its most basic form, it s about contention for a resource 2 instructions want to use a piece of hardware in the pipe There s only one of these in the pipe, maybe it can t service the requirements of more than one instruction at a time ADD 17, 0, 0 D get put put The conflict from previous slide s instruction sequence ADD 16, 0, 0 D B 20, 21, 22 D get put put ADD 30, 17, 18 from the register file 3 D ometimes, You Can edesign the esource In this particular case The problem is one instruction EAD register file and the other ITE register file olution: allow ITE-then-EAD in one clock cycle ( double pump ) get put put get No conflict now, 1st instruction writes in 1st half of clock cycle, later instruction reads in 2nd half put put Now, Even this Case orks OK ADD, 11, 12 ADD 17, 0, 0 ADD 16, 0, 0 B 20, 21, 22 Cycle 1 Cycle 2 Cycle 3 Cycle D Cycle 5 Cycle 6 D 17 D D 5 ADD 30, 17, D But..This Case till crews p ADD 2,3,1 Cycle 1 Cycle 2 Cycle 3 Cycle D Cycle 5 Cycle 6 Another Conflict: Data Hazards Basic structure An instruction in flight wants to use a value that s not done yet Done means it s been computed and it s located where I would normally expect to go look in the pipe hardware to find it B 5,6,7 ADD,11,12 ADD 12,,11 D D riteback esult into D value out of 7 Basic cause You are used to assuming a purely sequential model of instruction execution N finishes before instruction N+k, for k >= 1 Ne, sorry -- not true any more in a pipeline There are dependencies now between nearby instructions ( near in sequential order of fetch from memory) Consequence Data hazards -- instructions want values that are not done yet, or in the right place yet 8

9 This Data Hazard, evisited In this particular case value is not computed or returned to register file when later instruction wants to use it as an input get get put put put put Double pumping reg file doesn t help here; later instruction needs 2 clock cycles before it s been computed & stored back. Os Cing with Data Hazards hat do you do? ometimes the dumb-sounding answer is right Hypothesis: It is BAD when certain instructions overlap in time in certain patterns in our 5 stage IP pipeline Prosed solution Don t let them overlap like this? ight - that is one solution echanics Don t let the instruction flow thru the pipe In particular, don t let it ITE any bits anywhere in the pipe hardware that represents EAL CP state (e.g., register file, memory) Name for this eration: PIPELINE TALL 9 50 Cing with Data Hazards: Example Cycle 1 Cycle 2 Cycle 3 Cycle Cycle 5 Cycle 6 olution 1 : tall Cycle 1 Cycle 2 Cycle 3 Cycle Cycle 5 Cycle 6 ADD, 11, 12 D ADD, 11, 12 D ADD 12,, 11 D ADD 12,, 11 bubble bubble D ADD 11,, 12 D ADD 11,, 12 Empty slots in in the pipe called bubbles; means no real instruction work getting saved here echanically: How Do e tall? ecall the isters Between Pipeline tages Add extra hardware to detect stall situations atches the instruction field bits Looks for read versus write conflicts in particular pipe stages Basically, a bunch of careful case logic Current IF/ID ID/E E/E E/B Add extra hardware to push bubbles thru pipe Actually, relatively easy Can just let the instruction you want to stall GO FOAD thru the pipe but, TN OFF the bits that allow any results to get written into the machine state o, the instruction executes (it does the work), but doesn t save If an instruction executes in the middle of forest, but no registers are around to save the results did it really execute? (No.) emory (A) ister File 1 1 rite ADDE Data emory (A) 53 5

10 ecall hat an Looks Like add 8, 17, 18 is stored in binary format as IP lays out instructions into fields rs rt rd shamt funct eration of the instruction s first register source erand rt rd shamt shift amount second register source erand register destination erand funct function (select type of eration) e gotta watch these reg fields 55 Data Hazard Logic Current emory (A) Data Hazard Logic s =? d t =? d between ID/E, E/E, and E/B tages IF/ID ID/E E/E E/B s t d d d ister File 1 2 Data 1 rite ADDE Data emory (A) 56 Example Example sub 2, 1, 3 d = 2 s = 1 t = 3 and 12, 2, 5 d = 12 s = 2 t = 5 or 13, 6, 2 d = 13 s = 6 t = 2 add 1, 2, 2 d = 1 s = 2 t = 2 sw 15, 0(2) d = 15 s = 2 t =?? sub 2, 1, 3 d = 2 s = 1 t = 3 and 12, 2, 5 d = 12 s = 2 t = 5 or 13, 6, 2 d = 13 s = 6 t = 2 add 1, 2, 2 d = 1 s = 2 t = 2 sw 15, 0(2) d = 15 s = 2 t = B-AND Hazard E/E.isterd == ID/E. isters == 2 B-O Hazard E/B.isterd == ID/E. istert == 2 Interactions (real or not) can be tricky Example: do instruction #1 (sub) and # (add) interact, conflict? ell, they do BOTH want to use No Dependence Between #1 and # In this case, double pumped reg file makes it ok Cycle 1 Cycle 2 Cycle 3 Cycle Cycle 5 Cycle 6 How Else Could e tall the Pipeline? Compiler can insert ns Cycle 1 Cycle 2 Cycle 3 Cycle Cycle 5 Cycle 6 B 2, 1, 3 D 2 ADD, 11, 12 D AND 12, 2, 5 O 13, 6, 2 D D n n On IP 0 = 0+0 will do it-- saves no state D D ADD 1, 2, 2 2 D ADD 12,, 11 D 59 60

11 Or, The Hardware Can imulate NOP Next lecture Cycle 1 Cycle 2 Cycle 3 Cycle Cycle 5 Cycle 6 How to fix the pipeline to avoid (most) dependency problems ADD, 11, 12 D stall bubble bubble bubble bubble stall bubble bubble bubble bubble ADD 12,, 11 D 61 62

Single-Cycle Examples, Multi-Cycle Introduction

Single-Cycle Examples, Multi-Cycle Introduction Single-Cycle Examples, ulti-cycle Introduction 1 Today s enu Single cycle examples Single cycle machines vs. multi-cycle machines Why multi-cycle? Comparative performance Physical and Logical Design of

More information

Lecture 10 Multi-Cycle Implementation

Lecture 10 Multi-Cycle Implementation Lecture 10 ulti-cycle Implementation 1 Today s enu ulti-cycle machines Why multi-cycle? Comparative performance Physical and Logical Design of Datapath and Control icroprogramming 2 ulti-cycle Solution

More information

Processor Design CSCE Instructor: Saraju P. Mohanty, Ph. D. NOTE: The figures, text etc included in slides are borrowed

Processor Design CSCE Instructor: Saraju P. Mohanty, Ph. D. NOTE: The figures, text etc included in slides are borrowed Lecture 3: General Purpose Processor Design CSCE 665 Advanced VLSI Systems Instructor: Saraju P. ohanty, Ph. D. NOTE: The figures, tet etc included in slides are borrowed from various books, websites,

More information

Chapter 4 (Part II) Sequential Laundry

Chapter 4 (Part II) Sequential Laundry Chapter 4 (Part II) The Processor Baback Izadi Division of Engineering Programs bai@engr.newpaltz.edu Sequential Laundry 6 P 7 8 9 10 11 12 1 2 A T a s k O r d e r A B C D 30 30 30 30 30 30 30 30 30 30

More information

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle

More information

COMPUTER ORGANIZATION AND DESIGN

COMPUTER ORGANIZATION AND DESIGN COMPUTER ORGANIZATION AND DESIGN 5 Edition th The Hardware/Software Interface Chapter 4 The Processor 4.1 Introduction Introduction CPU performance factors Instruction count CPI and Cycle time Determined

More information

T = I x CPI x C. Both effective CPI and clock cycle C are heavily influenced by CPU design. CPI increased (3-5) bad Shorter cycle good

T = I x CPI x C. Both effective CPI and clock cycle C are heavily influenced by CPU design. CPI increased (3-5) bad Shorter cycle good CPU performance equation: T = I x CPI x C Both effective CPI and clock cycle C are heavily influenced by CPU design. For single-cycle CPU: CPI = 1 good Long cycle time bad On the other hand, for multi-cycle

More information

COSC 6385 Computer Architecture - Pipelining

COSC 6385 Computer Architecture - Pipelining COSC 6385 Computer Architecture - Pipelining Fall 2006 Some of the slides are based on a lecture by David Culler, Instruction Set Architecture Relevant features for distinguishing ISA s Internal storage

More information

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition The Processor - Introduction

More information

Chapter 4. Instruction Execution. Introduction. CPU Overview. Multiplexers. Chapter 4 The Processor 1. The Processor.

Chapter 4. Instruction Execution. Introduction. CPU Overview. Multiplexers. Chapter 4 The Processor 1. The Processor. COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor The Processor - Introduction

More information

Computer Architecture

Computer Architecture Lecture 3: Pipelining Iakovos Mavroidis Computer Science Department University of Crete 1 Previous Lecture Measurements and metrics : Performance, Cost, Dependability, Power Guidelines and principles in

More information

第三章 Instruction-Level Parallelism and Its Dynamic Exploitation. 陈文智 浙江大学计算机学院 2014 年 10 月

第三章 Instruction-Level Parallelism and Its Dynamic Exploitation. 陈文智 浙江大学计算机学院 2014 年 10 月 第三章 Instruction-Level Parallelism and Its Dynamic Exploitation 陈文智 chenwz@zju.edu.cn 浙江大学计算机学院 2014 年 10 月 1 3.3 The Major Hurdle of Pipelining Pipeline Hazards 本科回顾 ------- Appendix A.2 3.3.1 Taxonomy

More information

What do we have so far? Multi-Cycle Datapath (Textbook Version)

What do we have so far? Multi-Cycle Datapath (Textbook Version) What do we have so far? ulti-cycle Datapath (Textbook Version) CPI: R-Type = 4, Load = 5, Store 4, Branch = 3 Only one instruction being processed in datapath How to lower CPI further? #1 Lec # 8 Summer2001

More information

COMP303 - Computer Architecture Lecture 10. Multi-Cycle Design & Exceptions

COMP303 - Computer Architecture Lecture 10. Multi-Cycle Design & Exceptions COP33 - Computer Architecture Lecture ulti-cycle Design & Exceptions Single Cycle Datapath We designed a processor that requires one cycle per instruction RegDst busw 32 Clk RegWr Rd ux imm6 Rt 5 5 Rs

More information

Chapter 4. The Processor

Chapter 4. The Processor Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified

More information

The Pipelined MIPS Processor

The Pipelined MIPS Processor 1 The niversity of Texas at Dallas Lecture #20: The Pipeline IPS Processor The Pipelined IPS Processor We complete our study of AL architecture by investigating an approach providing even higher performance

More information

Lecture 3: The Processor (Chapter 4 of textbook) Chapter 4.1

Lecture 3: The Processor (Chapter 4 of textbook) Chapter 4.1 Lecture 3: The Processor (Chapter 4 of textbook) Chapter 4.1 Introduction Chapter 4.1 Chapter 4.2 Review: MIPS (RISC) Design Principles Simplicity favors regularity fixed size instructions small number

More information

Improve performance by increasing instruction throughput

Improve performance by increasing instruction throughput Improve performance by increasing instruction throughput Program execution order Time (in instructions) lw $1, 100($0) fetch 2 4 6 8 10 12 14 16 18 ALU Data access lw $2, 200($0) 8ns fetch ALU Data access

More information

LECTURE 3: THE PROCESSOR

LECTURE 3: THE PROCESSOR LECTURE 3: THE PROCESSOR Abridged version of Patterson & Hennessy (2013):Ch.4 Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU

More information

Chapter 4. The Processor

Chapter 4. The Processor Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified

More information

3/12/2014. Single Cycle (Review) CSE 2021: Computer Organization. Single Cycle with Jump. Multi-Cycle Implementation. Why Multi-Cycle?

3/12/2014. Single Cycle (Review) CSE 2021: Computer Organization. Single Cycle with Jump. Multi-Cycle Implementation. Why Multi-Cycle? CSE 2021: Computer Organization Single Cycle (Review) Lecture-10b CPU Design : Pipelining-1 Overview, Datapath and control Shakil M. Khan 2 Single Cycle with Jump Multi-Cycle Implementation Instruction:

More information

Pipelined Datapath. One register file is enough

Pipelined Datapath. One register file is enough ipelined path The goal of pipelining is to allow multiple instructions execute at the same time We may need to perform several operations in a cycle Increment the and add s at the same time. Fetch one

More information

Basic Pipelining Concepts

Basic Pipelining Concepts Basic ipelining oncepts Appendix A (recommended reading, not everything will be covered today) Basic pipelining ipeline hazards Data hazards ontrol hazards Structural hazards Multicycle operations Execution

More information

MIPS Pipelining. Computer Organization Architectures for Embedded Computing. Wednesday 8 October 14

MIPS Pipelining. Computer Organization Architectures for Embedded Computing. Wednesday 8 October 14 MIPS Pipelining Computer Organization Architectures for Embedded Computing Wednesday 8 October 14 Many slides adapted from: Computer Organization and Design, Patterson & Hennessy 4th Edition, 2011, MK

More information

Chapter Six. Dataı access. Reg. Instructionı. fetch. Dataı. Reg. access. Dataı. Reg. access. Dataı. Instructionı fetch. 2 ns 2 ns 2 ns 2 ns 2 ns

Chapter Six. Dataı access. Reg. Instructionı. fetch. Dataı. Reg. access. Dataı. Reg. access. Dataı. Instructionı fetch. 2 ns 2 ns 2 ns 2 ns 2 ns Chapter Si Pipelining Improve perfomance by increasing instruction throughput eecutionı Time lw $, ($) 2 6 8 2 6 8 access lw $2, 2($) 8 ns access lw $3, 3($) eecutionı Time lw $, ($) lw $2, 2($) 2 ns 8

More information

CS 152 Computer Architecture and Engineering Lecture 4 Pipelining

CS 152 Computer Architecture and Engineering Lecture 4 Pipelining CS 152 Computer rchitecture and Engineering Lecture 4 Pipelining 2014-1-30 John Lazzaro (not a prof - John is always OK) T: Eric Love www-inst.eecs.berkeley.edu/~cs152/ Play: 1 otorola 68000 Next week

More information

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 4. The Processor

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 4. The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle

More information

Full Datapath. Chapter 4 The Processor 2

Full Datapath. Chapter 4 The Processor 2 Pipelining Full Datapath Chapter 4 The Processor 2 Datapath With Control Chapter 4 The Processor 3 Performance Issues Longest delay determines clock period Critical path: load instruction Instruction memory

More information

Lecture 6: Pipelining

Lecture 6: Pipelining Lecture 6: Pipelining i CSCE 26 Computer Organization Instructor: Saraju P. ohanty, Ph. D. NOTE: The figures, text etc included in slides are borrowed from various books, websites, authors pages, and other

More information

EI338: Computer Systems and Engineering (Computer Architecture & Operating Systems)

EI338: Computer Systems and Engineering (Computer Architecture & Operating Systems) EI338: Computer Systems and Engineering (Computer Architecture & Operating Systems) Chentao Wu 吴晨涛 Associate Professor Dept. of Computer Science and Engineering Shanghai Jiao Tong University SEIEE Building

More information

Chapter 4 The Processor 1. Chapter 4A. The Processor

Chapter 4 The Processor 1. Chapter 4A. The Processor Chapter 4 The Processor 1 Chapter 4A The Processor Chapter 4 The Processor 2 Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware

More information

Lecture 7 Pipelining. Peng Liu.

Lecture 7 Pipelining. Peng Liu. Lecture 7 Pipelining Peng Liu liupeng@zju.edu.cn 1 Review: The Single Cycle Processor 2 Review: Given Datapath,RTL -> Control Instruction Inst Memory Adr Op Fun Rt

More information

Pipelining: Basic Concepts

Pipelining: Basic Concepts Pipelining: Basic Concepts Prof. Cristina Silvano Dipartimento di Elettronica e Informazione Politecnico di ilano email: silvano@elet.polimi.it Outline Reduced Instruction Set of IPS Processor Implementation

More information

MIPS An ISA for Pipelining

MIPS An ISA for Pipelining Pipelining: Basic and Intermediate Concepts Slides by: Muhamed Mudawar CS 282 KAUST Spring 2010 Outline: MIPS An ISA for Pipelining 5 stage pipelining i Structural Hazards Data Hazards & Forwarding Branch

More information

Computer Architecture. Lecture 6.1: Fundamentals of

Computer Architecture. Lecture 6.1: Fundamentals of CS3350B Computer Architecture Winter 2015 Lecture 6.1: Fundamentals of Instructional Level Parallelism Marc Moreno Maza www.csd.uwo.ca/courses/cs3350b [Adapted from lectures on Computer Organization and

More information

CS 110 Computer Architecture. Pipelining. Guest Lecture: Shu Yin. School of Information Science and Technology SIST

CS 110 Computer Architecture. Pipelining. Guest Lecture: Shu Yin.   School of Information Science and Technology SIST CS 110 Computer Architecture Pipelining Guest Lecture: Shu Yin http://shtech.org/courses/ca/ School of Information Science and Technology SIST ShanghaiTech University Slides based on UC Berkley's CS61C

More information

Lecture 19 Introduction to Pipelining

Lecture 19 Introduction to Pipelining CSE 30321 Lecture 19 Pipelining (Part 1) 1 Lecture 19 Introduction to Pipelining CSE 30321 Lecture 19 Pipelining (Part 1) Basic pipelining basic := single, in-order issue single issue one instruction at

More information

Processor Architecture. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Processor Architecture. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University Processor Architecture Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Moore s Law Gordon Moore @ Intel (1965) 2 Computer Architecture Trends (1)

More information

Module 4c: Pipelining

Module 4c: Pipelining Module 4c: Pipelining R E F E R E N C E S : S T A L L I N G S, C O M P U T E R O R G A N I Z A T I O N A N D A R C H I T E C T U R E M O R R I S M A N O, C O M P U T E R O R G A N I Z A T I O N A N D A

More information

Pipelining Analogy. Pipelined laundry: overlapping execution. Parallelism improves performance. Four loads: Non-stop: Speedup = 8/3.5 = 2.3.

Pipelining Analogy. Pipelined laundry: overlapping execution. Parallelism improves performance. Four loads: Non-stop: Speedup = 8/3.5 = 2.3. Pipelining Analogy Pipelined laundry: overlapping execution Parallelism improves performance Four loads: Speedup = 8/3.5 = 2.3 Non-stop: Speedup =2n/05n+15 2n/0.5n 1.5 4 = number of stages 4.5 An Overview

More information

SI232 Set #20: Laundry, Co-dependency, and other Hazards of Modern (Architecture) Life. Chapter 6 ADMIN. Reading for Chapter 6: 6.1,

SI232 Set #20: Laundry, Co-dependency, and other Hazards of Modern (Architecture) Life. Chapter 6 ADMIN. Reading for Chapter 6: 6.1, SI232 Set #20: Laundry, Co-dependency, and other Hazards of Modern (Architecture) Life Chapter 6 ADMIN ing for Chapter 6: 6., 6.9-6.2 2 Midnight Laundry Task order A 6 PM 7 8 9 0 2 2 AM B C D 3 Smarty

More information

Chapter 3 & Appendix C Pipelining Part A: Basic and Intermediate Concepts

Chapter 3 & Appendix C Pipelining Part A: Basic and Intermediate Concepts CS359: Computer Architecture Chapter 3 & Appendix C Pipelining Part A: Basic and Intermediate Concepts Yanyan Shen Department of Computer Science and Engineering Shanghai Jiao Tong University Parallel

More information

Lecture 5: The Processor

Lecture 5: The Processor Lecture 5: The Processor CSCE 26 Computer Organization Instructor: Saraju P. ohanty, Ph. D. NOTE: The figures, text etc included in slides are borrowed from various books, websites, authors pages, and

More information

Suggested Readings! Recap: Pipelining improves throughput! Processor comparison! Lecture 17" Short Pipelining Review! ! Readings!

Suggested Readings! Recap: Pipelining improves throughput! Processor comparison! Lecture 17 Short Pipelining Review! ! Readings! 1! 2! Suggested Readings!! Readings!! H&P: Chapter 4.5-4.7!! (Over the next 3-4 lectures)! Lecture 17" Short Pipelining Review! 3! Processor components! Multicore processors and programming! Recap: Pipelining

More information

Outline Marquette University

Outline Marquette University COEN-4710 Computer Hardware Lecture 4 Processor Part 2: Pipelining (Ch.4) Cristinel Ababei Department of Electrical and Computer Engineering Credits: Slides adapted primarily from presentations from Mike

More information

Processor Architecture

Processor Architecture Processor Architecture Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu SSE2030: Introduction to Computer Systems, Spring 2018, Jinkyu Jeong (jinkyu@skku.edu)

More information

Department of Computer and IT Engineering University of Kurdistan. Computer Architecture Pipelining. By: Dr. Alireza Abdollahpouri

Department of Computer and IT Engineering University of Kurdistan. Computer Architecture Pipelining. By: Dr. Alireza Abdollahpouri Department of Computer and IT Engineering University of Kurdistan Computer Architecture Pipelining By: Dr. Alireza Abdollahpouri Pipelined MIPS processor Any instruction set can be implemented in many

More information

Lecture 9. Pipeline Hazards. Christos Kozyrakis Stanford University

Lecture 9. Pipeline Hazards. Christos Kozyrakis Stanford University Lecture 9 Pipeline Hazards Christos Kozyrakis Stanford University http://eeclass.stanford.edu/ee18b 1 Announcements PA-1 is due today Electronic submission Lab2 is due on Tuesday 2/13 th Quiz1 grades will

More information

The overall datapath for RT, lw,sw beq instrucution

The overall datapath for RT, lw,sw beq instrucution Designing The Main Control Unit: Remember the three instruction classes {R-type, Memory, Branch}: a) R-type : Op rs rt rd shamt funct 1.src 2.src dest. 31-26 25-21 20-16 15-11 10-6 5-0 a) Memory : Op rs

More information

Pipelining. Ideal speedup is number of stages in the pipeline. Do we achieve this? 2. Improve performance by increasing instruction throughput ...

Pipelining. Ideal speedup is number of stages in the pipeline. Do we achieve this? 2. Improve performance by increasing instruction throughput ... CHAPTER 6 1 Pipelining Instruction class Instruction memory ister read ALU Data memory ister write Total (in ps) Load word 200 100 200 200 100 800 Store word 200 100 200 200 700 R-format 200 100 200 100

More information

Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier

Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier Science 6 PM 7 8 9 10 11 Midnight Time 30 40 20 30 40 20

More information

EITF20: Computer Architecture Part2.2.1: Pipeline-1

EITF20: Computer Architecture Part2.2.1: Pipeline-1 EITF20: Computer Architecture Part2.2.1: Pipeline-1 Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Pipelining Harzards Structural hazards Data hazards Control hazards Implementation issues Multi-cycle

More information

The Processor. Z. Jerry Shi Department of Computer Science and Engineering University of Connecticut. CSE3666: Introduction to Computer Architecture

The Processor. Z. Jerry Shi Department of Computer Science and Engineering University of Connecticut. CSE3666: Introduction to Computer Architecture The Processor Z. Jerry Shi Department of Computer Science and Engineering University of Connecticut CSE3666: Introduction to Computer Architecture Introduction CPU performance factors Instruction count

More information

Pipelined Datapath. Reading. Sections Practice Problems: 1, 3, 8, 12 (2) Lecture notes from MKP, H. H. Lee and S.

Pipelined Datapath. Reading. Sections Practice Problems: 1, 3, 8, 12 (2) Lecture notes from MKP, H. H. Lee and S. Pipelined Datapath Lecture notes from KP, H. H. Lee and S. Yalamanchili Sections 4.5 4. Practice Problems:, 3, 8, 2 ing (2) Pipeline Performance Assume time for stages is ps for register read or write

More information

Pipelining. CSC Friday, November 6, 2015

Pipelining. CSC Friday, November 6, 2015 Pipelining CSC 211.01 Friday, November 6, 2015 Performance Issues Longest delay determines clock period Critical path: load instruction Instruction memory register file ALU data memory register file Not

More information

Pipelined Datapath. Reading. Sections Practice Problems: 1, 3, 8, 12

Pipelined Datapath. Reading. Sections Practice Problems: 1, 3, 8, 12 Pipelined Datapath Lecture notes from KP, H. H. Lee and S. Yalamanchili Sections 4.5 4. Practice Problems:, 3, 8, 2 ing Note: Appendices A-E in the hardcopy text correspond to chapters 7- in the online

More information

CSEE 3827: Fundamentals of Computer Systems

CSEE 3827: Fundamentals of Computer Systems CSEE 3827: Fundamentals of Computer Systems Lecture 21 and 22 April 22 and 27, 2009 martha@cs.columbia.edu Amdahl s Law Be aware when optimizing... T = improved Taffected improvement factor + T unaffected

More information

Designing a Pipelined CPU

Designing a Pipelined CPU Designing a Pipelined CPU CSE 4, S2'6 Review -- Single Cycle CPU CSE 4, S2'6 Review -- ultiple Cycle CPU CSE 4, S2'6 Review -- Instruction Latencies Single-Cycle CPU Load Ifetch /Dec Exec em Wr ultiple

More information

ECE232: Hardware Organization and Design

ECE232: Hardware Organization and Design ECE232: Hardware Organization and Design Lecture 17: Pipelining Wrapup Adapted from Computer Organization and Design, Patterson & Hennessy, UCB Outline The textbook includes lots of information Focus on

More information

Pipelining. Maurizio Palesi

Pipelining. Maurizio Palesi * Pipelining * Adapted from David A. Patterson s CS252 lecture slides, http://www.cs.berkeley/~pattrsn/252s98/index.html Copyright 1998 UCB 1 References John L. Hennessy and David A. Patterson, Computer

More information

14:332:331 Pipelined Datapath

14:332:331 Pipelined Datapath 14:332:331 Pipelined Datapath I n s t r. O r d e r Inst 0 Inst 1 Inst 2 Inst 3 Inst 4 Single Cycle Disadvantages & Advantages Uses the clock cycle inefficiently the clock cycle must be timed to accommodate

More information

Appendix C: Pipelining: Basic and Intermediate Concepts

Appendix C: Pipelining: Basic and Intermediate Concepts Appendix C: Pipelining: Basic and Intermediate Concepts Key ideas and simple pipeline (Section C.1) Hazards (Sections C.2 and C.3) Structural hazards Data hazards Control hazards Exceptions (Section C.4)

More information

Full Datapath. Chapter 4 The Processor 2

Full Datapath. Chapter 4 The Processor 2 Pipelining Full Datapath Chapter 4 The Processor 2 Datapath With Control Chapter 4 The Processor 3 Performance Issues Longest delay determines clock period Critical path: load instruction Instruction memory

More information

Pipelining, Instruction Level Parallelism and Memory in Processors. Advanced Topics ICOM 4215 Computer Architecture and Organization Fall 2010

Pipelining, Instruction Level Parallelism and Memory in Processors. Advanced Topics ICOM 4215 Computer Architecture and Organization Fall 2010 Pipelining, Instruction Level Parallelism and Memory in Processors Advanced Topics ICOM 4215 Computer Architecture and Organization Fall 2010 NOTE: The material for this lecture was taken from several

More information

Unpipelined Machine. Pipelining the Idea. Pipelining Overview. Pipelined Machine. MIPS Unpipelined. Similar to assembly line in a factory

Unpipelined Machine. Pipelining the Idea. Pipelining Overview. Pipelined Machine. MIPS Unpipelined. Similar to assembly line in a factory Pipelining the Idea Similar to assembly line in a factory Divide instruction into smaller tasks Each task is performed on subset of resources Overlap the execution of multiple instructions by completing

More information

Midnight Laundry. IC220 Set #19: Laundry, Co-dependency, and other Hazards of Modern (Architecture) Life. Return to Chapter 4

Midnight Laundry. IC220 Set #19: Laundry, Co-dependency, and other Hazards of Modern (Architecture) Life. Return to Chapter 4 IC220 Set #9: Laundry, Co-dependency, and other Hazards of Modern (Architecture) Life Return to Chapter 4 Midnight Laundry Task order A B C D 6 PM 7 8 9 0 2 2 AM 2 Smarty Laundry Task order A B C D 6 PM

More information

EITF20: Computer Architecture Part2.2.1: Pipeline-1

EITF20: Computer Architecture Part2.2.1: Pipeline-1 EITF20: Computer Architecture Part2.2.1: Pipeline-1 Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Pipelining Harzards Structural hazards Data hazards Control hazards Implementation issues Multi-cycle

More information

EECS 151/251A Fall 2017 Digital Design and Integrated Circuits. Instructor: John Wawrzynek and Nicholas Weaver. Lecture 13 EE141

EECS 151/251A Fall 2017 Digital Design and Integrated Circuits. Instructor: John Wawrzynek and Nicholas Weaver. Lecture 13 EE141 EECS 151/251A Fall 2017 Digital Design and Integrated Circuits Instructor: John Wawrzynek and Nicholas Weaver Lecture 13 Project Introduction You will design and optimize a RISC-V processor Phase 1: Design

More information

CS 61C: Great Ideas in Computer Architecture Pipelining and Hazards

CS 61C: Great Ideas in Computer Architecture Pipelining and Hazards CS 61C: Great Ideas in Computer Architecture Pipelining and Hazards Instructors: Vladimir Stojanovic and Nicholas Weaver http://inst.eecs.berkeley.edu/~cs61c/sp16 1 Pipelined Execution Representation Time

More information

ECE331: Hardware Organization and Design

ECE331: Hardware Organization and Design ECE331: Hardware Organization and Design Lecture 27: Midterm2 review Adapted from Computer Organization and Design, Patterson & Hennessy, UCB Midterm 2 Review Midterm will cover Section 1.6: Processor

More information

CS61C : Machine Structures

CS61C : Machine Structures inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture #22 CPU Design: Pipelining to Improve Performance II 2007-8-1 Scott Beamer, Instructor CS61C L22 CPU Design : Pipelining to Improve Performance

More information

Computer Architectures. DLX ISA: Pipelined Implementation

Computer Architectures. DLX ISA: Pipelined Implementation Computer Architectures L ISA: Pipelined Implementation 1 The Pipelining Principle Pipelining is nowadays the main basic technique deployed to speed-up a CP. The key idea for pipelining is general, and

More information

Data Hazards Compiler Scheduling Pipeline scheduling or instruction scheduling: Compiler generates code to eliminate hazard

Data Hazards Compiler Scheduling Pipeline scheduling or instruction scheduling: Compiler generates code to eliminate hazard Data Hazards Compiler Scheduling Pipeline scheduling or instruction scheduling: Compiler generates code to eliminate hazard Consider: a = b + c; d = e - f; Assume loads have a latency of one clock cycle:

More information

Modern Computer Architecture

Modern Computer Architecture Modern Computer Architecture Lecture2 Pipelining: Basic and Intermediate Concepts Hongbin Sun 国家集成电路人才培养基地 Xi an Jiaotong University Pipelining: Its Natural! Laundry Example Ann, Brian, Cathy, Dave each

More information

Dynamic Control Hazard Avoidance

Dynamic Control Hazard Avoidance Dynamic Control Hazard Avoidance Consider Effects of Increasing the ILP Control dependencies rapidly become the limiting factor they tend to not get optimized by the compiler more instructions/sec ==>

More information

4. The Processor Computer Architecture COMP SCI 2GA3 / SFWR ENG 2GA3. Emil Sekerinski, McMaster University, Fall Term 2015/16

4. The Processor Computer Architecture COMP SCI 2GA3 / SFWR ENG 2GA3. Emil Sekerinski, McMaster University, Fall Term 2015/16 4. The Processor Computer Architecture COMP SCI 2GA3 / SFWR ENG 2GA3 Emil Sekerinski, McMaster University, Fall Term 2015/16 Instruction Execution Consider simplified MIPS: lw/sw rt, offset(rs) add/sub/and/or/slt

More information

Lecture 3. Pipelining. Dr. Soner Onder CS 4431 Michigan Technological University 9/23/2009 1

Lecture 3. Pipelining. Dr. Soner Onder CS 4431 Michigan Technological University 9/23/2009 1 Lecture 3 Pipelining Dr. Soner Onder CS 4431 Michigan Technological University 9/23/2009 1 A "Typical" RISC ISA 32-bit fixed format instruction (3 formats) 32 32-bit GPR (R0 contains zero, DP take pair)

More information

COMPUTER ORGANIZATION AND DESIGN

COMPUTER ORGANIZATION AND DESIGN COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle

More information

Pipelining. Principles of pipelining. Simple pipelining. Structural Hazards. Data Hazards. Control Hazards. Interrupts. Multicycle operations

Pipelining. Principles of pipelining. Simple pipelining. Structural Hazards. Data Hazards. Control Hazards. Interrupts. Multicycle operations Principles of pipelining Pipelining Simple pipelining Structural Hazards Data Hazards Control Hazards Interrupts Multicycle operations Pipeline clocking ECE D52 Lecture Notes: Chapter 3 1 Sequential Execution

More information

1 Hazards COMP2611 Fall 2015 Pipelined Processor

1 Hazards COMP2611 Fall 2015 Pipelined Processor 1 Hazards Dependences in Programs 2 Data dependence Example: lw $1, 200($2) add $3, $4, $1 add can t do ID (i.e., read register $1) until lw updates $1 Control dependence Example: bne $1, $2, target add

More information

CO Computer Architecture and Programming Languages CAPL. Lecture 18 & 19

CO Computer Architecture and Programming Languages CAPL. Lecture 18 & 19 CO2-3224 Computer Architecture and Programming Languages CAPL Lecture 8 & 9 Dr. Kinga Lipskoch Fall 27 Single Cycle Disadvantages & Advantages Uses the clock cycle inefficiently the clock cycle must be

More information

EITF20: Computer Architecture Part2.2.1: Pipeline-1

EITF20: Computer Architecture Part2.2.1: Pipeline-1 EITF20: Computer Architecture Part2.2.1: Pipeline-1 Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Pipelining Harzards Structural hazards Data hazards Control hazards Implementation issues Multi-cycle

More information

Outline. A pipelined datapath Pipelined control Data hazards and forwarding Data hazards and stalls Branch (control) hazards Exception

Outline. A pipelined datapath Pipelined control Data hazards and forwarding Data hazards and stalls Branch (control) hazards Exception Outline A pipelined datapath Pipelined control Data hazards and forwarding Data hazards and stalls Branch (control) hazards Exception 1 4 Which stage is the branch decision made? Case 1: 0 M u x 1 Add

More information

Computer and Information Sciences College / Computer Science Department Enhancing Performance with Pipelining

Computer and Information Sciences College / Computer Science Department Enhancing Performance with Pipelining Computer and Information Sciences College / Computer Science Department Enhancing Performance with Pipelining Single-Cycle Design Problems Assuming fixed-period clock every instruction datapath uses one

More information

CSE 141 Computer Architecture Spring Lectures 11 Exceptions and Introduction to Pipelining. Announcements

CSE 141 Computer Architecture Spring Lectures 11 Exceptions and Introduction to Pipelining. Announcements CSE 4 Computer Architecture Spring 25 Lectures Exceptions and Introduction to Pipelining May 4, 25 Announcements Reading Assignment Sections 5.6, 5.9 The Processor Datapath and Control Section 6., Enhancing

More information

Pipelining. Each step does a small fraction of the job All steps ideally operate concurrently

Pipelining. Each step does a small fraction of the job All steps ideally operate concurrently Pipelining Computational assembly line Each step does a small fraction of the job All steps ideally operate concurrently A form of vertical concurrency Stage/segment - responsible for 1 step 1 machine

More information

Pipeline Overview. Dr. Jiang Li. Adapted from the slides provided by the authors. Jiang Li, Ph.D. Department of Computer Science

Pipeline Overview. Dr. Jiang Li. Adapted from the slides provided by the authors. Jiang Li, Ph.D. Department of Computer Science Pipeline Overview Dr. Jiang Li Adapted from the slides provided by the authors Outline MIPS An ISA for Pipelining 5 stage pipelining Structural and Data Hazards Forwarding Branch Schemes Exceptions and

More information

Chapter 4. The Processor

Chapter 4. The Processor Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware 4.1 Introduction We will examine two MIPS implementations

More information

Advanced Computer Architecture Pipelining

Advanced Computer Architecture Pipelining Advanced Computer Architecture Pipelining Dr. Shadrokh Samavi Some slides are from the instructors resources which accompany the 6 th and previous editions of the textbook. Some slides are from David Patterson,

More information

EECS Digital Design

EECS Digital Design EECS 150 -- Digital Design Lecture 11-- Processor Pipelining 2010-2-23 John Wawrzynek Today s lecture by John Lazzaro www-inst.eecs.berkeley.edu/~cs150 1 Today: Pipelining How to apply the performance

More information

COMPUTER ORGANIZATION AND DESI

COMPUTER ORGANIZATION AND DESI COMPUTER ORGANIZATION AND DESIGN 5 Edition th The Hardware/Software Interface Chapter 4 The Processor 4.1 Introduction Introduction CPU performance factors Instruction count Determined by ISA and compiler

More information

Chapter 4 The Processor 1. Chapter 4B. The Processor

Chapter 4 The Processor 1. Chapter 4B. The Processor Chapter 4 The Processor 1 Chapter 4B The Processor Chapter 4 The Processor 2 Control Hazards Branch determines flow of control Fetching next instruction depends on branch outcome Pipeline can t always

More information

More advanced CPUs. August 4, Howard Huang 1

More advanced CPUs. August 4, Howard Huang 1 More advanced CPUs In the last two weeks we presented the design of a basic processor. The datapath performs operations on register and memory data. A control unit translates program instructions into

More information

CISC 662 Graduate Computer Architecture Lecture 5 - Pipeline. Pipelining. Pipelining the Idea. Similar to assembly line in a factory:

CISC 662 Graduate Computer Architecture Lecture 5 - Pipeline. Pipelining. Pipelining the Idea. Similar to assembly line in a factory: CISC 662 Graduate Computer rchitecture Lecture 5 - Pipeline ichela Taufer http://www.cis.udel.edu/~taufer/courses Powerpoint Lecture Notes from John Hennessy and David Patterson s: Computer rchitecture,

More information

Ti Parallel Computing PIPELINING. Michał Roziecki, Tomáš Cipr

Ti Parallel Computing PIPELINING. Michał Roziecki, Tomáš Cipr Ti5317000 Parallel Computing PIPELINING Michał Roziecki, Tomáš Cipr 2005-2006 Introduction to pipelining What is this What is pipelining? Pipelining is an implementation technique in which multiple instructions

More information

Data paths for MIPS instructions

Data paths for MIPS instructions You are familiar with how MIPS programs step from one instruction to the next, and how branches can occur conditionally or unconditionally. We next examine the machine level representation of how MIPS

More information

Review: Abstract Implementation View

Review: Abstract Implementation View Review: Abstract Implementation View Split memory (Harvard) model - single cycle operation Simplified to contain only the instructions: memory-reference instructions: lw, sw arithmetic-logical instructions:

More information

Pipelined CPUs. Study Chapter 4 of Text. Where are the registers?

Pipelined CPUs. Study Chapter 4 of Text. Where are the registers? Pipelined CPUs Where are the registers? Study Chapter 4 of Text Second Quiz on Friday. Covers lectures 8-14. Open book, open note, no computers or calculators. L17 Pipelined CPU I 1 Review of CPU Performance

More information

ECE331: Hardware Organization and Design

ECE331: Hardware Organization and Design ECE331: Hardware Organization and Design Lecture 35: Final Exam Review Adapted from Computer Organization and Design, Patterson & Hennessy, UCB Material from Earlier in the Semester Throughput and latency

More information

ECE260: Fundamentals of Computer Engineering

ECE260: Fundamentals of Computer Engineering Pipelining James Moscola Dept. of Engineering & Computer Science York College of Pennsylvania Based on Computer Organization and Design, 5th Edition by Patterson & Hennessy What is Pipelining? Pipelining

More information