Computer Architecture. Pipelining and Instruction Level Parallelism An Introduction. Outline of This Lecture


Computer Architecture: Pipelining and Instruction Level Parallelism, An Introduction
Adapted from COD2e by Hennessy & Patterson
Chapter 6 - Pipelining Basics

Outline of This Lecture
Introduction to the Concept of Pipelined Processor
Pipelined Datapath and Pipelined Control
Pipeline Example: Instructions Interaction
Pipeline Hazards
Forwarding
Stalls
Introduction to Instruction Level Parallelism
Superscalar, VLIW
Out-of-order execution
Branch Prediction
Future

The Five Stages of Load
IF: Instruction Fetch. Fetch the instruction from the Instruction Memory
RF/ID: Registers Fetch and Instruction Decode
EX: Calculate the memory address
MEM: Read the data from the Data Memory
WB: Write the data back to the register file

         Cycle 1  Cycle 2  Cycle 3  Cycle 4  Cycle 5
Load     IF       RF/ID    EX       MEM      WB

Key Ideas Behind Pipelining
Analogy: grading the midterm exams
6 problems, six people grading the exam
Each person grades ONE problem
Pass the exam to the next person as soon as one finishes her part
Assume each problem takes 0.15 hour to grade
Each individual exam still takes 0.9 hours to grade
But with 6 people, all exams can be graded much quicker:
100 exams: about 15.75 hours with the six-person pipeline (0.9 + 99 x 0.15), vs. 100 x 0.9 = 90 hours for one person grading alone
The load instruction has 5 stages:
Five independent functional units work on each stage
Each functional unit is used only once
Another load can start as soon as the 1st finishes its IF stage
Each load still takes five cycles to complete
The throughput, however, is much higher
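The grading analogy can be checked with a few lines of arithmetic. The sketch below is only an illustration under the slide's stated assumptions (6 problems, 0.15 hour per problem, 100 exams); the function names are hypothetical and not part of the lecture.

```python
def sequential_hours(num_exams, num_problems, hours_per_problem):
    """One grader handles every problem of every exam, one exam at a time."""
    return num_exams * num_problems * hours_per_problem


def pipelined_hours(num_exams, num_problems, hours_per_problem):
    """One grader per problem; exams are handed down the line like a pipeline."""
    # The first exam needs num_problems steps to get through the line;
    # every later exam finishes one step after the exam ahead of it.
    return (num_problems + num_exams - 1) * hours_per_problem


if __name__ == "__main__":
    seq = sequential_hours(100, 6, 0.15)
    pipe = pipelined_hours(100, 6, 0.15)
    print(f"sequential: {seq:.2f} h, pipelined: {pipe:.2f} h, speedup: {seq / pipe:.2f}x")
```

The latency of a single exam is unchanged (6 x 0.15 = 0.9 hours); only the throughput improves, which is the same point the slide makes about the load instruction.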

Pipelining the Load Instruction
The five independent functional units in the pipeline are:
Instruction Memory for the IF stage
Register File's read ports for the RF/ID stage
ALU for the EX stage
Data Memory for the MEM stage
Register File's write port (bus W) for the WB stage
1 instruction enters the pipeline every cycle
1 instruction comes out of the pipeline (completes) every cycle
Effective Cycles per Instruction (CPI) is 1

Cycle     1      2      3      4      5      6      7
1st lw    IF     RF/ID  EX     MEM    WB
2nd lw           IF     RF/ID  EX     MEM    WB
3rd lw                  IF     RF/ID  EX     MEM    WB

Four Stages of R-type
IF: Instruction Fetch. Fetch the instruction from the Instruction Memory
RF/ID: Registers Fetch and Instruction Decode
EX: ALU operates on the two register operands
WB: Write the ALU output back to the register file

         Cycle 1  Cycle 2  Cycle 3  Cycle 4
R-type   IF       RF/ID    EX       WB
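The staggered lw chart can be generated mechanically, which also shows why the effective CPI approaches 1. This is a minimal sketch assuming an ideal five-stage pipeline with one instruction issued per cycle and no hazards; `pipeline_chart` and `effective_cpi` are illustrative names, not from the slides.

```python
STAGES = ["IF", "RF/ID", "EX", "MEM", "WB"]


def pipeline_chart(num_instructions, stages=STAGES):
    """Return one row per instruction; row[c] is the stage occupied in cycle c + 1."""
    total_cycles = num_instructions + len(stages) - 1
    rows = []
    for i in range(num_instructions):
        row = [""] * total_cycles
        # Instruction i enters IF in cycle i + 1 and advances one stage per cycle.
        for offset, name in enumerate(stages):
            row[i + offset] = name
        rows.append(row)
    return rows


def effective_cpi(num_instructions, num_stages=len(STAGES)):
    """Total cycles divided by instructions completed (ideal pipeline, no stalls)."""
    return (num_instructions + num_stages - 1) / num_instructions


if __name__ == "__main__":
    for label, row in zip(["1st lw", "2nd lw", "3rd lw"], pipeline_chart(3)):
        print(f"{label:8}" + "".join(f"{stage:7}" for stage in row))
    print("CPI for 3 loads:", effective_cpi(3))
    print("CPI for 1000 loads:", effective_cpi(1000))
```

The pipeline fill time (four extra cycles here) is amortized over the instruction count, so the CPI tends to 1 as the instruction stream gets longer.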

Pipelining R-type + Load
We have a problem: two instructions try to write to the register file at the same time!

Cycle    1      2      3      4      5      6      7      8      9
R-type   IF     RF/ID  EX     WB
R-type          IF     RF/ID  EX     WB
Load                   IF     RF/ID  EX     MEM    WB
R-type                        IF     RF/ID  EX     WB
R-type                               IF     RF/ID  EX     WB

Oops! We have a problem! In cycle 7 the Load and the R-type that follows it both reach WB.

Important Observation
A functional unit can be used only once per instruction
Each functional unit must be used at the same stage for all instructions:
Load uses the Register File's write port during its 5th stage (IF, RF/ID, EX, MEM, WB)
R-type uses the Register File's write port during its 4th stage (IF, RF/ID, EX, WB)
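The write-port collision can be found by computing the cycle in which each instruction reaches WB. The following is a sketch under the assumption that one instruction issues per cycle and the register file has a single write port; the function name and the tuple layout of the result are hypothetical.

```python
LOAD = ["IF", "RF/ID", "EX", "MEM", "WB"]
RTYPE = ["IF", "RF/ID", "EX", "WB"]
# R-type with a do-nothing MEM stage, so that WB falls in stage 5 like a load.
RTYPE_PADDED = ["IF", "RF/ID", "EX", "MEM", "WB"]


def write_port_conflicts(program):
    """program: list of stage lists, issued one per cycle starting at cycle 1.

    Returns (cycle, earlier_index, later_index) for every pair of instructions
    that need the register file's write port in the same cycle.
    """
    owner_of_cycle = {}
    conflicts = []
    for index, stages in enumerate(program):
        wb_cycle = index + 1 + stages.index("WB")
        if wb_cycle in owner_of_cycle:
            conflicts.append((wb_cycle, owner_of_cycle[wb_cycle], index))
        else:
            owner_of_cycle[wb_cycle] = index
    return conflicts


if __name__ == "__main__":
    mixed = [RTYPE, RTYPE, LOAD, RTYPE, RTYPE]
    print(write_port_conflicts(mixed))
    padded = [RTYPE_PADDED, RTYPE_PADDED, LOAD, RTYPE_PADDED, RTYPE_PADDED]
    print(write_port_conflicts(padded))
```

With the four-stage R-type, the Load (index 2) and the next R-type (index 3) both want the write port in cycle 7. Once every instruction writes in the same stage, the conflict list is empty.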

Solution: Delay R-type WB a Cycle
Delay R-type's register write by one cycle:
R-type instructions also use the Reg File's write port at Stage 5
The MEM stage is a NOOP stage for R-type: nothing is being done

Cycle    1      2      3      4      5      6      7      8      9
R-type   IF     RF/ID  EX     MEM    WB
R-type          IF     RF/ID  EX     MEM    WB
Load                   IF     RF/ID  EX     MEM    WB
R-type                        IF     RF/ID  EX     MEM    WB
R-type                               IF     RF/ID  EX     MEM    WB

A Pipelined Datapath
[Figure: pipelined datapath across the IF, RF/ID, EX, MEM, and WB stages, separated by the IF/ID, ID/Ex, Ex/MEM, and MEM/WB pipeline registers. Control signals shown: RegWr, ExtOp, ALUOp, Branch, RegDst, ALUSrc, MemWr, MemtoReg.]

How About Control Signals?
Control Signals at Stage N = Func (Instr. at Stage N), where N = EX, MEM, or WB
Example: Control Signals at EX Stage = Func(Load's EX)
[Figure: the pipelined datapath with a Load in the EX stage. EX-stage signals: ALUOp=Add, ExtOp=1, RegDst=0, ALUSrc=1. The Ex/MEM register receives the Load's address.]

Pipeline Control
The Main Control generates the control signals during RF/ID
Control signals for EX (ExtOp, ALUSrc, ...) are used 1 cycle later
Control signals for MEM (MemWr, Branch) are used 2 cycles later
Control signals for WB (MemtoReg, RegWr) are used 3 cycles later
[Figure: control signals carried by each pipeline register.]
ID/Ex Register: ExtOp, ALUSrc, ALUOp, RegDst, MemWr, Branch, MemtoReg, RegWr
Ex/MEM Register: MemWr, Branch, MemtoReg, RegWr
MEM/WB Register: MemtoReg, RegWr

Single Cycle, Multi-Cycle, Pipelined
[Figure: timing comparison over clock cycles 1 to 10.]
Single Cycle Implementation: Load, then Store, with the unused part of the cycle marked Waste
Multiple Cycle Implementation: Load (IF, Reg, EX, MEM, WB), then Store (IF, Reg, EX, MEM), then R-type begins with IF
Pipeline Implementation: Load (IF, Reg, EX, MEM, WB), Store (IF, Reg, EX, MEM, WB), and R-type (IF, Reg, EX, MEM, WB), each starting one cycle after the previous instruction

Hazards: Challenge to Pipelining
Limits to pipelining: hazards prevent the next instruction from executing during its designated clock cycle
structural hazards: HW cannot support this combination of instructions
the earlier case of load and R-type is like a structural hazard, but normally a structural hazard cannot be fixed by retiming the instruction
data hazards: instruction depends on the result of a prior instruction still in the pipeline
control hazards: pipelining of branches & other instructions
The common solution is to stall the later part of the pipeline until the hazard is resolved

Data Hazard on r1
Dependencies backwards in time are hazards
[Figure: pipeline diagram. Horizontal axis: Time (clock cycles). Vertical axis: Instr. Order. Stages: IF, ID/RF, EX, MEM, WB.]
add r1,r2,r3
sub r4,r1,r3
and r6,r1,r7
or r8,r1,r9
xor r10,r1,r11

HW Stalls to Resolve Hazard
Dependencies backwards in time are hazards
Eliminate reverse time by a stall
[Figure: the same pipeline diagram, with three bubbles inserted after add so that sub starts late.]
add r1,r2,r3
sub r4,r1,r3
and r6,r1,r7
or r8,r1,r9
xor r10,r1,r11
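The number of bubbles needed without forwarding can be worked out by tracking when each register's new value becomes readable. This is a simplified sketch, assuming an in-order five-stage pipeline in which a dependent instruction waits in its ID/RF stage; the function name and the flag `same_cycle_read_ok` (whether a register read in the same cycle as the write sees the new value) are illustrative assumptions.

```python
def raw_stalls(program, same_cycle_read_ok=True):
    """program: list of (dest, src1, src2) register numbers.

    Returns the stall cycles inserted before each instruction's register read
    in an in-order five-stage pipeline with no forwarding.
    """
    readable_from = {}  # register -> first cycle in which ID/RF can read its new value
    stalls = []
    prev_id = 1  # chosen so that the first instruction decodes in cycle 2 (IF in cycle 1)
    for dest, *srcs in program:
        earliest_id = prev_id + 1
        needed = max((readable_from.get(r, 0) for r in srcs), default=0)
        id_cycle = max(earliest_id, needed)
        stalls.append(id_cycle - earliest_id)
        wb_cycle = id_cycle + 3  # ID/RF -> EX -> MEM -> WB
        readable_from[dest] = wb_cycle if same_cycle_read_ok else wb_cycle + 1
        prev_id = id_cycle
    return stalls


if __name__ == "__main__":
    program = [
        (1, 2, 3),    # add r1,r2,r3
        (4, 1, 3),    # sub r4,r1,r3
        (6, 1, 7),    # and r6,r1,r7
        (8, 1, 9),    # or  r8,r1,r9
        (10, 1, 11),  # xor r10,r1,r11
    ]
    print(raw_stalls(program, same_cycle_read_ok=True))
    print(raw_stalls(program, same_cycle_read_ok=False))
```

Under these assumptions only sub has to wait: two cycles if a read during the write cycle returns the new value, three if it does not. The later instructions are delayed along with it and need no extra bubbles of their own.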

Insight: Data is available!
Pipeline registers already contain the needed data
Forward the data to the appropriate unit
[Figure: pipeline diagram. Horizontal axis: Time (clock cycles). Vertical axis: Instr. Order. Stages: IF, ID/RF, EX, MEM, WB.]
add r1,r2,r3
sub r4,r1,r3
and r6,r1,r7
or r8,r1,r9
xor r10,r1,r11

HW for Forwarding (Bypassing)
Increase multiplexors to add paths from the pipeline registers
Assumes a register read during a write gets the new value (otherwise more results must be forwarded)

Forwarding Cannot Hide All Hazards
[Figure: pipeline diagram. Horizontal axis: Time (clock cycles). Vertical axis: Instr. Order. Stages: IF, ID/RF, EX, MEM, WB.]
lw r1, 0(r2)
sub r4,r1,r6
and r6,r1,r7
or r8,r1,r9

Option: HW Stalls to Resolve Hazard
"Interlock": checks for hazard & stalls
[Figure: the same pipeline diagram with a bubble inserted in each stage after lw.]
lw r1, 0(r2)
stall
sub r4,r1,r3
and r6,r1,r7
or r8,r1,r9
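The load-use case can be separated from ordinary ALU dependencies with a small timing model. This is a sketch, assuming full forwarding into the EX stage, with ALU results available at the end of EX and load data available at the end of MEM; the function name and the tuple encoding of instructions are hypothetical.

```python
def stalls_with_forwarding(program):
    """program: list of (op, dest, src1, src2); use None for an unused source.

    Returns the stall cycles each instruction needs before its EX stage when
    results are forwarded as soon as they exist.
    """
    produced_at = {}  # register -> cycle at the end of which its value exists
    stalls = []
    prev_ex = 2  # chosen so that the first instruction executes in cycle 3
    for op, dest, *srcs in program:
        earliest_ex = prev_ex + 1
        # A forwarded value can be consumed in the cycle after it is produced.
        needed = max((produced_at.get(r, 0) + 1 for r in srcs if r is not None), default=0)
        ex_cycle = max(earliest_ex, needed)
        stalls.append(ex_cycle - earliest_ex)
        # Loads produce their data in MEM, one cycle after EX; ALU ops produce it in EX.
        produced_at[dest] = ex_cycle + 1 if op == "lw" else ex_cycle
        prev_ex = ex_cycle
    return stalls


if __name__ == "__main__":
    load_use = [
        ("lw", 1, 2, None),  # lw  r1, 0(r2)
        ("sub", 4, 1, 6),    # sub r4,r1,r6
        ("and", 6, 1, 7),    # and r6,r1,r7
        ("or", 8, 1, 9),     # or  r8,r1,r9
    ]
    print(stalls_with_forwarding(load_use))
```

In this model forwarding removes every stall from an add/sub/and chain, but the instruction immediately after a load that uses the loaded register still needs one bubble.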

Option: SW resolves hazard
SW inserts independent instructions
Worst case: performance no better/worse
[Figure: pipeline diagram. Horizontal axis: Time (clock cycles). Vertical axis: Instr. Order. Stages: IF, ID/RF, EX, MEM, WB.]
lw r1, 0(r2)
unrelated instruction
sub r4,r1,r3
and r6,r1,r7
or r8,r1,r9

Control Hazard on Branches

Hazards on Branches
[Figure: pipeline diagram. Horizontal axis: Time (clock cycles). Stages: IF, ID/RF, EX, MEM, WB.]
beq r1,r2,L
sub r4,r1,r3
and r6,r2,r7
or r8,r7,r9
L: add r1,r2,r1
Stall for two cycles on every branch!

CPI Impact: Branch Stall Impact
If CPI = 1, 30% branch, stall 2 cycles => new CPI = 1.6!
Reducing the branch penalty:
The MIPS branch is already more aggressive than most: the limited eq/neq test allows us to determine the branch condition early (after EX), rather than late (e.g., after MEM)
Doing better: use a separate comparator rather than the ALU and move the branch decision to RF (hard!!!); reduces the penalty to 1 cycle
Going further, a variety of techniques:
separating branch and destination
separating branch condition and branch decision
hardware prediction of branches
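The CPI figure on the branch-stall slide follows from adding the average stall cycles per instruction to the base CPI. The sketch below uses the slide's numbers; the function name is an illustrative choice, and the model assumes branches are the only source of stalls.

```python
def cpi_with_branch_stalls(base_cpi, branch_fraction, stall_cycles):
    """Average CPI when every branch costs a fixed number of stall cycles."""
    return base_cpi + branch_fraction * stall_cycles


if __name__ == "__main__":
    # Slide's case: CPI = 1, 30% branches, 2-cycle stall.
    print(cpi_with_branch_stalls(1.0, 0.30, 2))
    # Branch decision moved earlier, so the penalty drops to 1 cycle.
    print(cpi_with_branch_stalls(1.0, 0.30, 1))
```

With a 2-cycle penalty the CPI is 1 + 0.3 x 2 = 1.6; cutting the penalty to 1 cycle gives 1 + 0.3 x 1 = 1.3.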

When is pipelining hard?
Interrupts: 5 instructions executing in a 5-stage pipeline
How to stop the pipeline?
Restart?
Who caused the interrupt?

Stage   Problem interrupts occurring
IF      Page fault on instruction fetch; misaligned memory access; memory-protection violation
ID      Undefined or illegal opcode
EX      Arithmetic interrupt
MEM     Page fault on data fetch; misaligned memory access; memory-protection violation

Load with data page fault, Add with instruction page fault?
Solution 1: interrupt vector/instruction, restart everything incomplete

First Generation RISC Pipelines
All instructions: 1 pipeline order ("static schedule")
Register write in last stage + reads performed in first stage after issue: simplify/eliminate hazards
Memory access in stage 4: avoid all memory hazards
Control hazards use delayed branch (with fast path)
RAW hazards use bypass, except on load results
Load resolved by delayed load or stall
Good pipeline performance at little cost/complexity

Summary of Pipelining Basics
Speed Up <= Pipeline Depth (equal only in the ideal, hazard-free case)
Hazards limit performance on computers:
structural: need more HW resources
data: need forwarding, compiler scheduling
control: early evaluation & PC, delayed branch, prediction
Increasing the length of the pipe increases the impact of hazards; pipelining helps instruction bandwidth, not latency
Compilers can reduce the cost of data & control hazards:
load delay slots
branch delay slots
Exceptions (also FP, ISA) make pipelining harder

Advanced Pipelining
Pipelining exploits parallelism among instructions by overlapping them
Called Instruction Level Parallelism (ILP)
Limited by a variety of things:
parallelism in the program
compiler technology in exposing parallelism
functional unit capability: how many overlapping instructions
ability of hardware to find instructions to run in parallel
Exploiting ILP is a hot topic in processor design:
Lots of different approaches
Multiple instructions/cycle
compiler vs. HW for scheduling instructions
Both architecture approaches and compiler approaches

Exploiting Available ILP
Technique: Pipelining. HW limitation: issue rate, FU stalls, FU depth
Technique: Super-scalar. Issue multiple scalar instructions per cycle. HW limitation: hazard resolution
Technique: VLIW. Each instruction specifies multiple scalar operations. HW limitation: packing
[Figure: pipeline diagrams (IF, D, Ex, M, W) for each technique.]

Easy Superscalar
[Figure: block diagram with I-Cache, Int Reg, Inst Issue and Bypass, FP Reg, Int Unit, Load / Store Unit, FP Add, FP Mul, D-Cache.]
Issue integer and FP operations in parallel!
potential hazards?
expected speedup?
what combinations of instructions make sense?

Issuing Multiple Instructions/Cycle
Superscalar: 2 instructions, 1 FP & 1 anything else
Fetch 64 bits/clock cycle; Int on left, FP on right
Can only issue the 2nd instruction if the 1st instruction issues
More ports for FP registers to do FP load & FP op in a pair

Type              Pipe Stages
Int. instruction  IF   ID   EX   MEM  WB
FP instruction    IF   ID   EX   MEM  WB
Int. instruction       IF   ID   EX   MEM  WB
FP instruction         IF   ID   EX   MEM  WB
Int. instruction            IF   ID   EX   MEM  WB
FP instruction              IF   ID   EX   MEM  WB

1 cycle load delay expands to 3 instructions in SS: the instruction in the right half can't use the result, nor can either instruction in the next slot

Dynamic Branch Prediction
Predict direction of branches on past behavior: keep a cache of branch behavior, look up prediction
Performance = f(accuracy, cost of misprediction)
Branch prediction buffer: lower bits of the PC address index a table of 1-bit values
says whether or not the branch was taken last time
evaluate the actual branch condition; if the prediction is incorrect:
recover by flushing the pipeline, restarting fetch
reset the prediction
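The 1-bit branch prediction buffer described on the slide can be modeled in a few lines. This is a minimal sketch: the table size, the 4-byte instruction assumption used to drop the low PC bits, the initial "not taken" state, and all class and function names are assumptions made for illustration.

```python
class OneBitPredictor:
    """Table of 1-bit entries indexed by the low bits of the branch PC."""

    def __init__(self, index_bits=4):
        self.mask = (1 << index_bits) - 1
        # False means "predict not taken"; every entry starts out that way.
        self.table = [False] * (1 << index_bits)

    def _index(self, pc):
        # Assumes 4-byte instructions, so the two lowest PC bits carry no information.
        return (pc >> 2) & self.mask

    def predict(self, pc):
        return self.table[self._index(pc)]

    def update(self, pc, taken):
        # Remember only what the branch did last time.
        self.table[self._index(pc)] = taken


def count_mispredictions(predictor, pc, outcomes):
    """Run one branch through the predictor and count wrong predictions."""
    misses = 0
    for taken in outcomes:
        if predictor.predict(pc) != taken:
            misses += 1  # real hardware would flush the pipeline and restart fetch here
        predictor.update(pc, taken)
    return misses


if __name__ == "__main__":
    # A loop branch: taken nine times, then falls through; the loop runs twice.
    outcomes = ([True] * 9 + [False]) * 2
    print(count_mispredictions(OneBitPredictor(), 0x400, outcomes))
```

A 1-bit scheme mispredicts a loop branch twice per loop execution: once on exit and once on the first iteration of the next run, which is why the example reports 4 misses for 20 branches.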

Speculative Superscalar Execution
Get all available parallelism:
across branches
in the face of cache misses
limited only by data dependences
Goal: resources and available bandwidth are the only HW limit
Branch prediction: execute instructions speculatively
Hazard detection and aggressive resolution
out-of-order execution (dynamic scheduling)
in-order completion:
exception handling easier
handles incorrect speculation
[Figure: Instruction Fetch, Decode, Instruction Window, Execution Units.]
look ahead and prefetch instructions
Issue multiple instructions to the Execution Units when inputs are available

Variety of Modern Microprocessors
Columns: Processor; Instruction Completion Rate; Scheduling of pipeline; Branch prediction
PowerPC: Dynamic, nonspeculative; HW
MIPS R: Dynamic, speculative; HW
Pentium 4: Dynamic, nonspeculative; HW
UltraSPARC 4: Static; HW
Merced: ?; Static?; Static?

Limits to Multi-Issue Machines
Inherent limitations of ILP:
1 branch in 5 => 5-way VLIW busy?
Latencies of units => many operations must be scheduled
Need about Pipeline Depth x No. Functional Units of independent operations
Difficulties in building HW:
Duplicate FUs to get parallel execution
Increase ports to the Register File (3 x integer/fp rate)
Increase ports to memory
Decoding challenge and impact on clock rate, pipeline depth
Limitations specific to either SS or VLIW implementation:
Decode issue in SS
VLIW code size: unroll loops + wasted fields in VLIW
VLIW lock step => 1 hazard & all instructions stall
VLIW & binary compatibility

Summary
Instruction Level Parallelism in SW or HW
Loop level parallelism is easiest to see
SW dependencies/compiler sophistication determine if the compiler can unroll loops
SW scheduling
HW scheduling
Branch prediction
SuperScalar and VLIW: CPI < 1
Dynamic issue vs. static issue
More instructions issued/clock, larger penalty of hazards
Future? Stay tuned

Single Memory => Structural Hazard
[Figure: pipeline diagram. Horizontal axis: Time (clock cycles). Vertical axis: Instr. Order. Load, Instr 1, Instr 2, Instr 3, and Instr 4 each use the single memory (Mem) for both instruction fetch and data access.]

Stall to resolve Structural Hazard
[Figure: the same pipeline diagram with Instr 3 stalled and a bubble inserted, so that its fetch no longer overlaps the Load's memory access.]
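Which instruction collides with the Load at the single memory can be computed from the stage timing. The sketch below assumes a five-stage pipeline with IF in cycle i + 1 for the i-th instruction (0-based) and data access in the fourth stage; the function name and result layout are hypothetical.

```python
def memory_port_conflicts(uses_data_memory):
    """uses_data_memory[i] is True when instruction i accesses memory in its MEM stage.

    With one shared memory, a data access collides with the instruction fetch
    that falls in the same cycle. Returns (cycle, accessing_index, fetching_index).
    """
    conflicts = []
    for index, uses_mem in enumerate(uses_data_memory):
        if not uses_mem:
            continue
        mem_cycle = index + 4       # IF in cycle index + 1, MEM three cycles later
        fetching = mem_cycle - 1    # the instruction whose IF falls in that cycle
        if fetching < len(uses_data_memory):
            conflicts.append((mem_cycle, index, fetching))
    return conflicts


if __name__ == "__main__":
    # Load followed by Instr 1 to Instr 4, none of which access data memory.
    print(memory_port_conflicts([True, False, False, False, False]))
```

For the slide's sequence this reports a conflict in cycle 4 between the Load (index 0) and Instr 3 (index 3), which is the instruction the slide stalls.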

Duplicate to Resolve Hazard
Separate Instruction Cache (Im) & Data Cache (Dm)
[Figure: pipeline diagram. Horizontal axis: Time (clock cycles). Vertical axis: Instr. Order. Load, Instr 1, Instr 2, Instr 3, and Instr 4 pass through Im, Reg, ALU, Dm, Reg without conflict.]

The Processor: Improving Performance Data Hazards

The Processor: Improving Performance Data Hazards The Pocesso: Impoving Pefomance Data Hazads Monday 12 Octobe 15 Many slides adapted fom: and Design, Patteson & Hennessy 5th Edition, 2014, MK and fom Pof. May Jane Iwin, PSU Summay Pevious Class Pipeline

More information

COEN-4730 Computer Architecture Lecture 2 Review of Instruction Sets and Pipelines

COEN-4730 Computer Architecture Lecture 2 Review of Instruction Sets and Pipelines 1 COEN-4730 Compute Achitectue Lectue 2 Review of nstuction Sets and Pipelines Cistinel Ababei Dept. of Electical and Compute Engineeing Maquette Univesity Cedits: Slides adapted fom pesentations of Sudeep

More information

COSC 6385 Computer Architecture. - Pipelining

COSC 6385 Computer Architecture. - Pipelining COSC 6385 Compute Achitectue - Pipelining Sping 2012 Some of the slides ae based on a lectue by David Culle, Pipelining Pipelining is an implementation technique wheeby multiple instuctions ae ovelapped

More information

Introduction To Pipelining. Chapter Pipelining1 1

Introduction To Pipelining. Chapter Pipelining1 1 Intoduction To Pipelining Chapte 6.1 - Pipelining1 1 Mooe s Law Mooe s Law says that the numbe of pocessos on a chip doubles about evey 18 months. Given the data on the following two slides, is this tue?

More information

Lecture 8 Introduction to Pipelines Adapated from slides by David Patterson

Lecture 8 Introduction to Pipelines Adapated from slides by David Patterson Lectue 8 Intoduction to Pipelines Adapated fom slides by David Patteson http://www-inst.eecs.bekeley.edu/~cs61c/ * 1 Review (1/3) Datapath is the hadwae that pefoms opeations necessay to execute pogams.

More information

CISC 662 Graduate Computer Architecture Lecture 6 - Hazards

CISC 662 Graduate Computer Architecture Lecture 6 - Hazards CISC 662 Gaduate Compute Achitectue Lectue 6 - Hazads Michela Taufe http://www.cis.udel.edu/~taufe/teaching/cis662f07 Powepoint Lectue Notes fom John Hennessy and David Patteson s: Compute Achitectue,

More information

Computer Science 141 Computing Hardware

Computer Science 141 Computing Hardware Compute Science 141 Computing Hadwae Fall 2006 Havad Univesity Instucto: Pof. David Books dbooks@eecs.havad.edu [MIPS Pipeline Slides adapted fom Dave Patteson s UCB CS152 slides and May Jane Iwin s CSE331/431

More information

Chapter 4 (Part III) The Processor: Datapath and Control (Pipeline Hazards)

Chapter 4 (Part III) The Processor: Datapath and Control (Pipeline Hazards) Chapte 4 (Pat III) The Pocesso: Datapath and Contol (Pipeline Hazads) 陳瑞奇 (J.C. Chen) 亞洲大學資訊工程學系 Adapted fom class notes by Pof. M.J. Iwin, PSU and Pof. D. Patteson, UCB 1 吃感冒藥副作用怎麼辦? http://big5.sznews.com/health/images/attachement/jpg/site3/20120319/001558d90b3310d0c1683e.jpg

More information

Administrivia. CMSC 411 Computer Systems Architecture Lecture 5. Data Hazard Even with Forwarding Figure A.9, Page A-20

Administrivia. CMSC 411 Computer Systems Architecture Lecture 5. Data Hazard Even with Forwarding Figure A.9, Page A-20 Administivia CMSC 411 Compute Systems Achitectue Lectue 5 Basic Pipelining (cont.) Alan Sussman als@cs.umd.edu as@csu dedu Homewok poblems fo Unit 1 due today Homewok poblems fo Unit 3 posted soon CMSC

More information

ECE331: Hardware Organization and Design

ECE331: Hardware Organization and Design ECE331: Hadwae Oganization and Design Lectue 16: Pipelining Adapted fom Compute Oganization and Design, Patteson & Hennessy, UCB Last time: single cycle data path op System clock affects pimaily the Pogam

More information

UCB CS61C : Machine Structures

UCB CS61C : Machine Structures inst.eecs.bekeley.edu/~cs61c UCB CS61C : Machine Stuctues Lectue SOE Dan Gacia Lectue 28 CPU Design : Pipelining to Impove Pefomance 2010-04-05 Stanfod Reseaches have invented a monitoing technique called

More information

CS 61C: Great Ideas in Computer Architecture. Pipelining Hazards. Instructor: Senior Lecturer SOE Dan Garcia

CS 61C: Great Ideas in Computer Architecture. Pipelining Hazards. Instructor: Senior Lecturer SOE Dan Garcia CS 61C: Geat Ideas in Compute Achitectue Pipelining Hazads Instucto: Senio Lectue SOE Dan Gacia 1 Geat Idea #4: Paallelism So9wae Paallel Requests Assigned to compute e.g. seach Gacia Paallel Theads Assigned

More information

CENG 3420 Computer Organization and Design. Lecture 07: MIPS Processor - II. Bei Yu

CENG 3420 Computer Organization and Design. Lecture 07: MIPS Processor - II. Bei Yu CENG 3420 Compute Oganization and Design Lectue 07: MIPS Pocesso - II Bei Yu CEG3420 L07.1 Sping 2016 Review: Instuction Citical Paths q Calculate cycle time assuming negligible delays (fo muxes, contol

More information

CMCS Mohamed Younis CMCS 611, Advanced Computer Architecture 1

CMCS Mohamed Younis CMCS 611, Advanced Computer Architecture 1 CMCS 611-101 Advanced Compute Achitectue Lectue 6 Intoduction to Pipelining Septembe 23, 2009 www.csee.umbc.edu/~younis/cmsc611/cmsc611.htm Mohamed Younis CMCS 611, Advanced Compute Achitectue 1 Pevious

More information

CENG 3420 Lecture 07: Pipeline

CENG 3420 Lecture 07: Pipeline CENG 3420 Lectue 07: Pipeline Bei Yu byu@cse.cuhk.edu.hk CENG3420 L07.1 Sping 2017 Outline q Review: Flip-Flop Contol Signals q Pipeline Motivations q Pipeline Hazads q Exceptions CENG3420 L07.2 Sping

More information

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Instruc>on Level Parallelism

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Instruc>on Level Parallelism Agenda CS 61C: Geat Ideas in Compute Achitectue (Machine Stuctues) Instuc>on Level Paallelism Instuctos: Randy H. Katz David A. PaJeson hjp://inst.eecs.bekeley.edu/~cs61c/fa10 Review Instuc>on Set Design

More information

Lecture #22 Pipelining II, Cache I

Lecture #22 Pipelining II, Cache I inst.eecs.bekeley.edu/~cs61c CS61C : Machine Stuctues Lectue #22 Pipelining II, Cache I Wiewold cicuits 2008-7-29 http://www.maa.og/editoial/mathgames/mathgames_05_24_04.html http://www.quinapalus.com/wi-index.html

More information

CSE4201. Computer Architecture

CSE4201. Computer Architecture CSE 4201 Compute Achitectue Pof. Mokhta Aboelaze Pats of these slides ae taken fom Notes by Pof. David Patteson at UCB Outline MIPS and instuction set Simple pipeline in MIPS Stuctual and data hazads Fowading

More information

You Are Here! Review: Hazards. Agenda. Agenda. Review: Load / Branch Delay Slots 7/28/2011

You Are Here! Review: Hazards. Agenda. Agenda. Review: Load / Branch Delay Slots 7/28/2011 CS 61C: Geat Ideas in Compute Achitectue (Machine Stuctues) Instuction Level Paallelism: Multiple Instuction Issue Guest Lectue: Justin Hsia Softwae Paallel Requests Assigned to compute e.g., Seach Katz

More information

User Visible Registers. CPU Structure and Function Ch 11. General CPU Organization (4) Control and Status Registers (5) Register Organisation (4)

User Visible Registers. CPU Structure and Function Ch 11. General CPU Organization (4) Control and Status Registers (5) Register Organisation (4) PU Stuctue and Function h Geneal Oganisation Registes Instuction ycle Pipelining anch Pediction Inteupts Use Visible Registes Vaies fom one achitectue to anothe Geneal pupose egiste (GPR) ata, addess,

More information

CS 61C: Great Ideas in Computer Architecture Instruc(on Level Parallelism: Mul(ple Instruc(on Issue

CS 61C: Great Ideas in Computer Architecture Instruc(on Level Parallelism: Mul(ple Instruc(on Issue CS 61C: Geat Ideas in Compute Achitectue Instuc(on Level Paallelism: Mul(ple Instuc(on Issue Instuctos: Kste Asanovic, Randy H. Katz hbp://inst.eecs.bekeley.edu/~cs61c/fa12 1 Paallel Requests Assigned

More information

Lecture Topics ECE 341. Lecture # 12. Control Signals. Control Signals for Datapath. Basic Processing Unit. Pipelining

Lecture Topics ECE 341. Lecture # 12. Control Signals. Control Signals for Datapath. Basic Processing Unit. Pipelining EE 341 Lectue # 12 Instucto: Zeshan hishti zeshan@ece.pdx.edu Novembe 10, 2014 Potland State Univesity asic Pocessing Unit ontol Signals Hadwied ontol Datapath contol signals Dealing with memoy delay Pipelining

More information

Review from last lecture

Review from last lecture CSE820 Gaduate Compute Achitectue Week 3 Pefomance + Pipeline Review Based on slides by David Patteson Review fom last lectue Tacking and extapolating technology pat of achitect s esponsibility Expect

More information

CS 2461: Computer Architecture 1 Program performance and High Performance Processors

CS 2461: Computer Architecture 1 Program performance and High Performance Processors Couse Objectives: Whee ae we. CS 2461: Pogam pefomance and High Pefomance Pocessos Instucto: Pof. Bhagi Naahai Bits&bytes: Logic devices HW building blocks Pocesso: ISA, datapath Using building blocks

More information

Review: Moore s Law. EECS 252 Graduate Computer Architecture Lecture 2. Review: Joy s Law in ManyCore world. Bell s Law new class per decade

Review: Moore s Law. EECS 252 Graduate Computer Architecture Lecture 2. Review: Joy s Law in ManyCore world. Bell s Law new class per decade EECS 252 Gaduate Compute Achitectue Lectue 2 ℵ 0 Review of Instuction Sets, Pipelines, and Caches Januay 26 th, 2009 Review Mooe s Law John Kubiatowicz Electical Engineeing and Compute Sciences Univesity

More information

Data Hazards Compiler Scheduling Pipeline scheduling or instruction scheduling: Compiler generates code to eliminate hazard

Data Hazards Compiler Scheduling Pipeline scheduling or instruction scheduling: Compiler generates code to eliminate hazard Data Hazards Compiler Scheduling Pipeline scheduling or instruction scheduling: Compiler generates code to eliminate hazard Consider: a = b + c; d = e - f; Assume loads have a latency of one clock cycle:

More information

Pipeline design. Mehran Rezaei

Pipeline design. Mehran Rezaei Pipeline design Mehran Rezaei How Can We Improve the Performance? Exec Time = IC * CPI * CCT Optimization IC CPI CCT Source Level * Compiler * * ISA * * Organization * * Technology * With Pipelining We

More information

Lecture 9. Pipeline Hazards. Christos Kozyrakis Stanford University

Lecture 9. Pipeline Hazards. Christos Kozyrakis Stanford University Lecture 9 Pipeline Hazards Christos Kozyrakis Stanford University http://eeclass.stanford.edu/ee18b 1 Announcements PA-1 is due today Electronic submission Lab2 is due on Tuesday 2/13 th Quiz1 grades will

More information

Pre-requisites. This is a textbook-based course. Chapter 1. Pipelines, Performance, Caches, and Virtual Memory. January 2009 Paul H J Kelly

Pre-requisites. This is a textbook-based course. Chapter 1. Pipelines, Performance, Caches, and Virtual Memory. January 2009 Paul H J Kelly 332 Advanced Compute Achitectue Chapte 1 Intoduction and eview of Pipelines, Pefomance, Caches, and Vitual Januay 2009 Paul H J Kelly These lectue notes ae patly based on the couse text, Hennessy and Patteson

More information

ECE232: Hardware Organization and Design

ECE232: Hardware Organization and Design ECE232: Hardware Organization and Design Lecture 17: Pipelining Wrapup Adapted from Computer Organization and Design, Patterson & Hennessy, UCB Outline The textbook includes lots of information Focus on

More information

CS61C : Machine Structures

CS61C : Machine Structures inst.eecs.berkeley.edu/~cs61c/su05 CS61C : Machine Structures Lecture #19: Pipelining II 2005-07-21 Andy Carle CS 61C L19 Pipelining II (1) Review: Datapath for MIPS PC instruction memory rd rs rt registers

More information

ECE4680 Computer Organization and Architecture. Designing a Pipeline Processor

ECE4680 Computer Organization and Architecture. Designing a Pipeline Processor ECE468 Computer Organization and Architecture Designing a Pipeline Processor Pipelined processors overlap instructions in time on common execution resources. ECE468 Pipeline. 22-4-3 Start X:4 Branch Jump

More information

Lecture 7 Pipelining. Peng Liu.

Lecture 7 Pipelining. Peng Liu. Lecture 7 Pipelining Peng Liu liupeng@zju.edu.cn 1 Review: The Single Cycle Processor 2 Review: Given Datapath,RTL -> Control Instruction Inst Memory Adr Op Fun Rt

More information

COSC 6385 Computer Architecture - Pipelining

COSC 6385 Computer Architecture - Pipelining COSC 6385 Computer Architecture - Pipelining Fall 2006 Some of the slides are based on a lecture by David Culler, Instruction Set Architecture Relevant features for distinguishing ISA s Internal storage

More information

Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier

Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier Science Cases that affect instruction execution semantics

More information

Final Exam Spring 2017

Final Exam Spring 2017 COE 3 / ICS 233 Computer Organization Final Exam Spring 27 Friday, May 9, 27 7:3 AM Computer Engineering Department College of Computer Sciences & Engineering King Fahd University of Petroleum & Minerals

More information

CS420/520 Homework Assignment: Pipelining

CS420/520 Homework Assignment: Pipelining CS42/52 Homework Assignment: Pipelining Total: points. 6.2 []: Using a drawing similar to the Figure 6.8 below, show the forwarding paths needed to execute the following three instructions: Add $2, $3,

More information

Four Steps of Speculative Tomasulo cycle 0

Four Steps of Speculative Tomasulo cycle 0 HW support for More ILP Hardware Speculative Execution Speculation: allow an instruction to issue that is dependent on branch, without any consequences (including exceptions) if branch is predicted incorrectly

More information

ELE 655 Microprocessor System Design

ELE 655 Microprocessor System Design ELE 655 Microprocessor System Design Section 2 Instruction Level Parallelism Class 1 Basic Pipeline Notes: Reg shows up two places but actually is the same register file Writes occur on the second half

More information

ECE154A Introduction to Computer Architecture. Homework 4 solution

ECE154A Introduction to Computer Architecture. Homework 4 solution ECE154A Introduction to Computer Architecture Homework 4 solution 4.16.1 According to Figure 4.65 on the textbook, each register located between two pipeline stages keeps data shown below. Register IF/ID

More information
