What do we have so far? Multi-Cycle Datapath

1 What do we have so far? Multi-cycle Datapath
CPI: R-Type = 4, Load = 5, Store = 4, Branch = 3
Only one instruction is being processed in the datapath at a time.
How can CPI be lowered further?
#1 Lec # 8 Spring

2 Pipelining
Pipelining is a CPU implementation technique where multiple operations on a number of instructions are overlapped. The next instruction is fetched in the next cycle without waiting for the current instruction to complete.
An instruction execution pipeline involves a number of steps, where each step completes one part of an instruction. Each step is called a pipeline stage or a pipeline segment.
The stages are connected one to the next to form a pipeline: instructions enter at one end, progress through the stages, and exit at the other end when completed.
Pipeline throughput: the instruction completion rate of the pipeline, determined by how often an instruction exits the pipeline.
The time to move an instruction one step down the line is equal to the machine cycle and is determined by the stage with the longest processing delay.
Pipeline latency: the time required to complete an instruction: cycle time x number of pipeline stages.
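As a minimal sketch of the definitions above, the following computes cycle time, latency, and throughput from a list of stage delays. The function name and the stage delays (2, 1, 2, 2, 1 ns) are illustrative assumptions, not values given on this slide.

```python
def pipeline_timing(stage_delays_ns):
    """Return (cycle_time_ns, latency_ns, instructions_per_ns) for a pipeline."""
    # The machine cycle is set by the slowest stage.
    cycle_time = max(stage_delays_ns)
    # Latency of one instruction: cycle time x number of stages.
    latency = cycle_time * len(stage_delays_ns)
    # Ideal throughput: one instruction leaves the pipeline every cycle.
    throughput = 1.0 / cycle_time
    return cycle_time, latency, throughput


# Hypothetical stage delays in ns (an assumption for illustration).
cycle, latency, throughput = pipeline_timing([2, 1, 2, 2, 1])
print(cycle, latency, throughput)
```

Note that the latency (10 ns here) is longer than the sum of the stage delays (8 ns), because every stage is stretched to the length of the slowest one.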

3 Single Cycle vs. Pipelining
[Figure: program execution order (in instructions) against time for lw $1, 100($0), lw $2, 200($0), and lw $3, 300($0), each passing through instruction fetch, register read, ALU, data access, and register write. Single cycle: a new instruction starts every 8 ns. 5-stage pipeline: a new instruction starts every 2 ns.]
Single cycle: time for 1000 instructions = 8 x 1000 = 8000 ns
5-stage pipeline: time for 1000 instructions = time to fill pipeline + cycle time x 1000 = 8 + 2 x 1000 = 2008 ns
Pipelining speedup = 8000/2008 = 3.98
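The arithmetic on this slide can be sketched as follows, assuming the 8 ns single-cycle clock, the 2 ns pipeline clock, and a fill time of (stages - 1) cycles. The function names are hypothetical.

```python
def single_cycle_time(n_instructions, cycle_ns=8):
    """Total time when every instruction takes one long cycle."""
    return n_instructions * cycle_ns


def pipelined_time(n_instructions, stages=5, cycle_ns=2):
    """Total time: fill the pipeline, then complete one instruction per cycle."""
    fill_ns = (stages - 1) * cycle_ns
    return fill_ns + n_instructions * cycle_ns


# The speedup grows toward 8 ns / 2 ns = 4 as the fill time is amortized.
for n in (3, 1000, 1_000_000):
    speedup = single_cycle_time(n) / pipelined_time(n)
    print(n, round(speedup, 2))
```

For short instruction sequences the fill time dominates, which is why the speedup for three instructions is well below the asymptotic value.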

4 Pipelining: Design Goals
The length of the machine clock cycle is determined by the time required for the slowest pipeline stage.
An important pipeline design consideration is to balance the length of each pipeline stage.
If all stages are perfectly balanced, then the time per instruction on a pipelined machine (assuming ideal conditions with no stalls) is:
Time per instruction on unpipelined machine / Number of pipe stages
Under these ideal conditions:
Speedup from pipelining = the number of pipeline stages = k
One instruction is completed every cycle: CPI = 1.

5 From MIPS Multi-cycle Datapath: Five Stages of Load
        Cycle 1  Cycle 2  Cycle 3  Cycle 4  Cycle 5
Load    IF       ID       EX       MEM      WB
1- Instruction Fetch (IF): Fetch the instruction from the instruction memory.
2- Instruction Decode (ID): Registers fetch and instruction decode.
3- Execute (EX): Calculate the memory address.
4- Memory (MEM): Read the data from the data memory.
5- Write Back (WB): Write the data back to the register file.

6 Pipelined Processing Representation
Time in clock cycles (clock cycle number):
Instruction  1    2    3    4    5    6    7    8    9
I            IF   ID   EX   MEM  WB
I+1               IF   ID   EX   MEM  WB
I+2                    IF   ID   EX   MEM  WB
I+3                         IF   ID   EX   MEM  WB
I+4                              IF   ID   EX   MEM  WB
Time to fill the pipeline: cycles 1 to 4. The first instruction, I, completes in cycle 5; the last instruction, I+4, completes in cycle 9.
Pipeline stages: IF = Instruction Fetch, ID = Instruction Decode, EX = Execution, MEM = Memory Access, WB = Write Back
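The staircase representation can be generated mechanically. The sketch below assumes an ideal pipeline with no stalls, so instruction i occupies stage s in cycle i + s (counting from zero); the function name is hypothetical.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def pipeline_diagram(num_instructions, stages=STAGES):
    """Return one row per instruction; each row lists the stage held in each cycle."""
    total_cycles = len(stages) + num_instructions - 1
    rows = []
    for i in range(num_instructions):
        row = [""] * total_cycles
        for s, name in enumerate(stages):
            row[i + s] = name  # instruction i reaches stage s in cycle i + s
        rows.append(row)
    return rows


for i, row in enumerate(pipeline_diagram(5)):
    label = "I" if i == 0 else f"I+{i}"
    print(f"{label:<5}" + "".join(f"{cell:<5}" for cell in row))
```

With five instructions and five stages the diagram spans nine cycles: four to fill the pipeline plus one completion per cycle after that.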

7 Pipelined Processing Representation
[Figure: time on the horizontal axis and program flow on the vertical axis; six instructions each pass through IF, ID, EX, MEM, WB, each starting one cycle after the previous one.]

8 Single Cycle, Multi-cycle, vs. Pipeline
[Figure: clock timing for the three implementations.
Single cycle implementation: Load in cycle 1 and Store in cycle 2, each cycle 8 ns long, with 2 ns marked as waste.
Multiple cycle implementation (cycles 1 to 10): Load takes IF ID EX MEM WB, Store then takes IF ID EX MEM, and the R-type begins with IF.
Pipeline implementation: Load, Store, and R-type each pass through IF ID EX MEM WB, starting in consecutive cycles.]

9 Single Cycle, Multi-cycle, Pipeline: Performance Comparison Example
For 1000 instructions, execution time:
Single cycle machine: 8 ns/cycle x 1 CPI x 1000 inst = 8000 ns
Multi-cycle machine: 2 ns/cycle x 4.6 CPI (due to inst mix) x 1000 inst = 9200 ns
Ideal pipelined machine, 5 stages: 2 ns/cycle x (1 CPI x 1000 inst + 4 cycle fill) = 2008 ns

10 MIPS Pipeline Stage Identification
[Figure: the MIPS datapath (PC, instruction memory, registers, sign extend, ALU, data memory) divided into five stages. IF: Instruction fetch. ID: Instruction decode/register file read. EX: Execute/address calculation. MEM: Memory access. WB: Write back.]
What is needed to divide the datapath into pipeline stages?

11 MIPS: An Initial Pipelined Datapath
[Figure: the datapath with pipeline registers IF/ID, ID/EX, EX/MEM, and MEM/WB separating the stages IF (Instruction Fetch), ID (Instruction Decode), EX (Execution), MEM (Memory), and WB (Write Back).]
Can you find a problem even if there are no dependencies? What instructions can we execute to manifest the problem?

12 A Corrected Pipelined Datapath
[Figure: the corrected pipelined datapath with pipeline registers IF/ID, ID/EX, EX/MEM, and MEM/WB and stages IF, ID, EX, MEM, WB.]

13 Representing Pipelines Graphically
[Figure: program execution order (in instructions) against time in clock cycles CC 1 to CC 6 for lw $10, 20($1) followed by sub $11, $2, $3; each instruction is drawn as IM, Reg, ALU, DM, Reg.]
Can help with answering questions like:
How many cycles does it take to execute this code?
What is the ALU doing during cycle 4?
Use this representation to help understand datapaths.

14 Adding Pipeline Control Points
[Figure: the pipelined datapath annotated with the control signals PCSrc, RegWrite, ALUSrc, ALUOp, RegDst, Branch, MemRead, MemWrite, and MemtoReg.]

15 Pipeline Control
Pass needed control signals along from one stage to the next as the instruction travels through the pipeline, just like the data.
             Execution/address calculation     Memory access                Write-back
             stage control lines               stage control lines          stage control lines
Instruction  RegDst  ALUOp1  ALUOp0  ALUSrc    Branch  MemRead  MemWrite    RegWrite  MemtoReg
R-format     1       1       0       0         0       0        0           1         0
lw           0       0       0       1         0       1        0           1         1
sw           X       0       0       1         0       0        1           0         X
beq          X       0       1       0         1       0        0           0         X
[Figure: the control unit produces the EX, M, and WB control fields; ID/EX carries all three, EX/MEM carries M and WB, and MEM/WB carries WB.]
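One way to picture control signals travelling with the instruction is to model each pipeline register as a small dictionary that is shifted on every clock. This is only a sketch: the table values follow the standard textbook control table for this datapath, and the names CONTROL, BUBBLE, empty_pipeline, and clock are hypothetical.

```python
# Control values per instruction class, grouped by the stage that uses them.
# None stands for a don't-care (X) entry.
CONTROL = {
    "R-format": {
        "EX": {"RegDst": 1, "ALUOp1": 1, "ALUOp0": 0, "ALUSrc": 0},
        "MEM": {"Branch": 0, "MemRead": 0, "MemWrite": 0},
        "WB": {"RegWrite": 1, "MemtoReg": 0},
    },
    "lw": {
        "EX": {"RegDst": 0, "ALUOp1": 0, "ALUOp0": 0, "ALUSrc": 1},
        "MEM": {"Branch": 0, "MemRead": 1, "MemWrite": 0},
        "WB": {"RegWrite": 1, "MemtoReg": 1},
    },
    "sw": {
        "EX": {"RegDst": None, "ALUOp1": 0, "ALUOp0": 0, "ALUSrc": 1},
        "MEM": {"Branch": 0, "MemRead": 0, "MemWrite": 1},
        "WB": {"RegWrite": 0, "MemtoReg": None},
    },
    "beq": {
        "EX": {"RegDst": None, "ALUOp1": 0, "ALUOp0": 1, "ALUSrc": 0},
        "MEM": {"Branch": 1, "MemRead": 0, "MemWrite": 0},
        "WB": {"RegWrite": 0, "MemtoReg": None},
    },
}
BUBBLE = {"EX": {}, "MEM": {}, "WB": {}}


def empty_pipeline():
    """Pipeline registers holding no instruction."""
    return {"ID/EX": BUBBLE, "EX/MEM": {"MEM": {}, "WB": {}}, "MEM/WB": {"WB": {}}}


def clock(regs, decoded_class=None):
    """Advance one clock edge; the class decoded in ID enters ID/EX."""
    bundle = CONTROL.get(decoded_class, BUBBLE)
    return {
        "ID/EX": bundle,  # EX, MEM, and WB fields
        "EX/MEM": {"MEM": regs["ID/EX"]["MEM"], "WB": regs["ID/EX"]["WB"]},
        "MEM/WB": {"WB": regs["EX/MEM"]["WB"]},
    }


regs = empty_pipeline()
for decoded in ("lw", None, None):  # decode lw, then nothing for two cycles
    regs = clock(regs, decoded)
print(regs["MEM/WB"]["WB"])
```

After three clock edges the write-back fields of lw sit in MEM/WB, which matches the idea that the signals generated at decode are consumed one, two, and three cycles later.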

16 Pipeline Control
The Main Control generates the control signals during Reg/Dec.
Control signals for Exec (ExtOp, ALUSrc, ...) are used 1 cycle later.
Control signals for Mem (MemWr, Branch) are used 2 cycles later.
Control signals for Wr (MemtoReg, RegWr) are used 3 cycles later.
[Figure: the Main Control, fed from the IF/ID register, loads the ID/Ex register with ExtOp, ALUSrc, ALUOp, RegDst, MemWr, Branch, MemtoReg, and RegWr; the Ex/Mem register carries MemWr, Branch, MemtoReg, and RegWr; the Mem/Wr register carries MemtoReg and RegWr.]

17 Pipelined Datapath with Control Added
[Figure: the pipelined datapath with the control unit added; the EX, M, and WB control fields are carried in ID/EX, EX/MEM, and MEM/WB and drive PCSrc, RegWrite, ALUSrc, ALUOp, RegDst, Branch, MemRead, MemWrite, and MemtoReg.]
Target address of branch determined in MEM.

18 Basic Performance Issues In Pipelining
Pipelining increases the CPU instruction throughput: the number of instructions completed per unit time. Under ideal conditions, instruction throughput is one instruction per machine cycle, or CPI = 1.
Pipelining does not reduce the execution time of an individual instruction: the time needed to complete all processing steps of an instruction (also called instruction completion latency).
It usually slightly increases the execution time of each instruction over unpipelined implementations, due to the increased control overhead of the pipeline and the delays of the pipeline stage registers.

19 Pipelining Performance Example
Example: For an unpipelined machine: clock cycle = 10 ns, 4 cycles for ALU operations and branches, and 5 cycles for memory operations, with instruction frequencies of 40%, 20%, and 40%, respectively.
If pipelining adds 1 ns to the machine clock cycle, then the speedup in instruction execution from pipelining is:
Non-pipelined average instruction execution time = Clock cycle x Average CPI
= 10 ns x ((40% + 20%) x 4 + 40% x 5) = 10 ns x 4.4 = 44 ns
In the pipelined implementation, five stages are used with an average instruction execution time of: 10 ns + 1 ns = 11 ns
Speedup from pipelining = Instruction time unpipelined / Instruction time pipelined = 44 ns / 11 ns = 4 times
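The same calculation as a short sketch, reading the example as a 10 ns clock with a 40% / 20% / 40% mix of ALU operations, branches, and memory operations and 1 ns of pipelining overhead. The helper name average_cpi is hypothetical.

```python
def average_cpi(mix):
    """mix: iterable of (frequency, cycles) pairs whose frequencies sum to 1."""
    return sum(freq * cycles for freq, cycles in mix)


# (frequency, cycles) for ALU operations, branches, and memory operations.
mix = [(0.40, 4), (0.20, 4), (0.40, 5)]

clock_ns = 10
overhead_ns = 1

unpipelined_ns = clock_ns * average_cpi(mix)   # clock cycle x average CPI
pipelined_ns = clock_ns + overhead_ns          # one instruction per (longer) cycle
speedup = unpipelined_ns / pipelined_ns

print(round(unpipelined_ns, 2), pipelined_ns, round(speedup, 2))
```

The speedup is 4 rather than 5 for two reasons: the unpipelined machine already averages fewer than five cycles per instruction, and the pipeline clock is 10% slower.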

20 Pipeline Hazards
Hazards are situations in pipelining which prevent the next instruction in the instruction stream from executing during the designated clock cycle, resulting in one or more stall cycles.
Hazards reduce the ideal speedup gained from pipelining and are classified into three classes:
Structural hazards: Arise from hardware resource conflicts when the available hardware cannot support all possible combinations of instructions.
Data hazards: Arise when an instruction depends on the results of a previous instruction in a way that is exposed by the overlapping of instructions in the pipeline.
Control hazards: Arise from the pipelining of conditional branches and other instructions that change the PC.

21 Structural Hazards
In pipelined machines, overlapped instruction execution requires pipelining of functional units and duplication of resources to allow all possible combinations of instructions in the pipeline.
If a resource conflict arises because a hardware resource is required by more than one instruction in a single cycle, and one or more such instructions cannot be accommodated, then a structural hazard has occurred. For example:
when a machine has only one register file write port, or
when a pipelined machine has a single shared memory for data and instructions.
In these cases, stall the pipeline for one cycle for register writes or memory data access.

22 Structural Hazard Example: Single Memory For Instructions & Data
[Figure: time (clock cycles) against instruction order for Load, Instr 1, Instr 2, Instr 3, and Instr 4, each drawn as Mem, Reg, ALU, Mem, Reg; the data memory access of Load falls in the same cycle as the instruction fetch of Instr 3.]
Detection is easy in this case (right half highlight means read, left half write).

23 Data Hazards Example
Problem with starting the next instruction before the first is finished. Data dependencies that go backward in time create data hazards.
sub $2, $1, $3
and $12, $2, $5
or $13, $6, $2
add $14, $2, $2
sw $15, 100($2)
[Figure: program execution order (in instructions) against time in clock cycles CC 1 to CC 9 for the five instructions, each drawn as IM, Reg, ALU, DM, Reg, with the value of register $2 shown per cycle; $2 receives -20 when sub writes back in CC 5.]

24 Data Hazard Resolution: Stall Cycles
Stall the pipeline by a number of cycles. The control unit must detect the need to insert stall cycles. In this case two stall cycles are needed.
[Figure: the same five instructions over CC 1 to CC 11, with the value of register $2 shown per cycle (-20 from CC 9 to CC 11); and $12, $2, $5 and or $13, $6, $2 are each held for two STALL cycles until sub has written $2.]
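The two-stall result can be checked with a small timing model. The sketch below assumes no forwarding, a register file that writes in the first half of a cycle and reads in the second half (so a consumer may be in ID in the same cycle its producer is in WB), and a hypothetical (dest, sources) tuple per instruction. It does not special-case register $0.

```python
def count_stalls(program, id_to_wb=3):
    """Count stall cycles for a 5-stage pipeline without forwarding.

    program: list of (dest, sources) tuples; dest is None if nothing is written.
    id_to_wb: cycles from a producer's ID stage to its WB stage (ID, EX, MEM, WB).
    """
    if not program:
        return 0
    id_cycle = []      # cycle in which each instruction is in ID
    last_write = {}    # register -> ID cycle of its most recent writer
    for dest, sources in program:
        earliest = id_cycle[-1] + 1 if id_cycle else 2  # first ID is in cycle 2
        for reg in sources:
            if reg in last_write:
                # Cannot read before the producer writes back.
                earliest = max(earliest, last_write[reg] + id_to_wb)
        id_cycle.append(earliest)
        if dest is not None:
            last_write[dest] = earliest
    ideal_last = id_cycle[0] + len(program) - 1
    return id_cycle[-1] - ideal_last


program = [
    ("$2", ["$1", "$3"]),    # sub $2, $1, $3
    ("$12", ["$2", "$5"]),   # and $12, $2, $5
    ("$13", ["$6", "$2"]),   # or  $13, $6, $2
    ("$14", ["$2", "$2"]),   # add $14, $2, $2
    (None, ["$15", "$2"]),   # sw  $15, 100($2)
]
print(count_stalls(program))
```

Only the and instruction is delayed directly; once it has waited two cycles, the instructions behind it are far enough from sub that no further stalls are needed.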

25 Performance of Pipelines with Stalls
Hazards in pipelines may make it necessary to stall the pipeline by one or more cycles, degrading performance from the ideal CPI of 1.
CPI pipelined = Ideal CPI + Pipeline stall clock cycles per instruction
If pipelining overhead is ignored and we assume that the stages are perfectly balanced, then:
Speedup = CPI unpipelined / (1 + Pipeline stall cycles per instruction)
When all instructions take the same number of cycles, equal to the number of pipeline stages, then:
Speedup = Pipeline depth / (1 + Pipeline stall cycles per instruction)
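A one-line sketch of the last formula, under the same assumptions (no pipelining overhead, balanced stages, ideal CPI of 1). The function name and the 0.25 stall figure are hypothetical, chosen only to show the effect.

```python
def speedup_with_stalls(pipeline_depth, stall_cycles_per_instruction):
    """Speedup = pipeline depth / (1 + stall cycles per instruction)."""
    return pipeline_depth / (1 + stall_cycles_per_instruction)


print(speedup_with_stalls(5, 0))      # ideal case: speedup equals the depth
print(speedup_with_stalls(5, 0.25))   # hypothetical 0.25 stall cycles per instruction
```

Even a quarter of a stall cycle per instruction removes a full stage's worth of speedup from a five-stage pipeline.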

26 Data Hazard Resolution: Compiler Scheduling
The compiler can guarantee that no data hazards exist by reordering instructions and/or adding NOP instructions where needed. For the previous example:
sub $2, $1, $3
nop
nop
and $12, $2, $5
or $13, $6, $2
add $14, $2, $2
sw $15, 100($2)
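A compiler pass that only pads with nops (without reordering) might look like the sketch below. It assumes a consumer must sit at least three instruction slots after its producer, as in the example above, and uses hypothetical (text, dest, sources) tuples to describe instructions.

```python
NOP = ("nop", None, ())


def insert_nops(program, min_distance=3):
    """Pad a program with nops so each consumer is far enough from its producer.

    program: list of (text, dest, sources) tuples.
    min_distance: required slot distance between producer and consumer.
    """
    scheduled = []
    last_write = {}  # register -> index in `scheduled` of its latest writer
    for text, dest, sources in program:
        needed = 0
        for reg in sources:
            if reg in last_write:
                distance = len(scheduled) - last_write[reg]
                needed = max(needed, min_distance - distance)
        scheduled.extend([NOP] * needed)
        if dest is not None:
            last_write[dest] = len(scheduled)
        scheduled.append((text, dest, sources))
    return scheduled


program = [
    ("sub $2, $1, $3", "$2", ("$1", "$3")),
    ("and $12, $2, $5", "$12", ("$2", "$5")),
    ("or $13, $6, $2", "$13", ("$6", "$2")),
    ("add $14, $2, $2", "$14", ("$2", "$2")),
    ("sw $15, 100($2)", None, ("$15", "$2")),
]
for text, _, _ in insert_nops(program):
    print(text)
```

A real scheduler would first try to move independent instructions into the gap and fall back to nops only when none are available.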

27 Data Hazard Resolution: Forwarding
Observation: Why not use temporary results produced by the memory/ALU instead of waiting for them to be written back to the register bank?
Forwarding is a hardware-based technique (also called register bypassing or short-circuiting) that makes use of this observation to eliminate or minimize data hazard stalls.
Using forwarding hardware, the result of an instruction is copied directly from where it is produced (ALU, memory read port, etc.) to where subsequent instructions need it (ALU input register, memory write port, etc.).

28 Data Hazard Resolution: Forwarding
Register file forwarding to handle read/write to the same register
ALU forwarding

29 Pipelined Datapath With Forwarding
[Figure: the pipelined datapath with a forwarding unit. ID/EX carries IF/ID.RegisterRs, IF/ID.RegisterRt, and IF/ID.RegisterRd forward as Rs, Rt, and Rd; the forwarding unit compares Rs and Rt with EX/MEM.RegisterRd and MEM/WB.RegisterRd to select the ALU inputs.]
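The comparison performed by the forwarding unit can be sketched as below. The conditions and the 10 / 01 / 00 select encodings follow the usual textbook formulation for this datapath and are an assumption here, as are the RegWrite inputs and the function name; the slide itself only names the register fields.

```python
def forwarding_unit(id_ex_rs, id_ex_rt,
                    ex_mem_reg_write, ex_mem_rd,
                    mem_wb_reg_write, mem_wb_rd):
    """Return (ForwardA, ForwardB) select values for the two ALU inputs.

    0b00: use the register file value read in ID
    0b10: forward the ALU result held in EX/MEM
    0b01: forward the value held in MEM/WB
    """
    def select(src):
        # The most recent result (EX/MEM) takes priority over MEM/WB.
        if ex_mem_reg_write and ex_mem_rd != 0 and ex_mem_rd == src:
            return 0b10
        if mem_wb_reg_write and mem_wb_rd != 0 and mem_wb_rd == src:
            return 0b01
        return 0b00

    return select(id_ex_rs), select(id_ex_rt)


# and $12, $2, $5 in EX while sub $2, $1, $3 is in MEM.
print(forwarding_unit(2, 5, True, 2, False, 0))
# or $13, $6, $2 in EX while and is in MEM and sub is in WB.
print(forwarding_unit(6, 2, True, 12, True, 2))
```

Register 0 is excluded because it always reads as zero in MIPS, so a "result" destined for it must never be forwarded.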

30 Data Hazard Example With Forwarding
[Figure: the sequence sub $2, $1, $3; and $12, $2, $5; or $13, $6, $2; add $14, $2, $2; sw $15, 100($2) over CC 1 to CC 9, each instruction drawn as IM, Reg, ALU, DM, Reg. The value of EX/MEM is -20 in CC 4 and the value of MEM/WB is -20 in CC 5 (X in all other cycles); these values are forwarded to the dependent instructions.]

31 A Data Hazard Requiring A Stall
A load followed by an R-type instruction that uses the loaded value.
lw $2, 20($1)
and $4, $2, $5
or $8, $2, $6
add $9, $4, $2
slt $1, $6, $7
[Figure: program execution order (in instructions) against time in clock cycles CC 1 to CC 9, each instruction drawn as IM, Reg, ALU, DM, Reg.]
Even with forwarding in place a stall cycle is needed.
This condition must be detected by hardware.

32 A Data Hazard Requiring A Stall
A load followed by an R-type instruction that uses the loaded value.
[Figure: the same sequence (lw $2, 20($1); and $4, $2, $5; or $8, $2, $6; add $9, $4, $2; slt $1, $6, $7) over CC 1 to CC 10, with a bubble inserted after lw so that the following instructions are delayed by one cycle.]
We can stall the pipeline by keeping an instruction in the same stage.

33 Compiler Scheduling Example
Reorder the instructions to avoid as many pipeline stalls as possible:
lw $15, 0($2)
lw $16, 4($2)
sw $16, 0($2)
sw $15, 4($2)
The data hazard occurs on register $16 between the second lw and the first sw, resulting in a stall cycle.
With forwarding we need to find only one independent instruction to place between them; swapping the lw instructions works:
lw $16, 4($2)
lw $15, 0($2)
sw $16, 0($2)
sw $15, 4($2)
Without forwarding we need three independent instructions to place between them, so in addition two nops are added:
lw $16, 4($2)
lw $15, 0($2)
nop
nop
sw $16, 0($2)
sw $15, 4($2)

34 Datapath With Hazard Detection Unit
A load followed by an instruction that uses the loaded value is detected and a stall cycle is inserted.
[Figure: the pipelined datapath with forwarding unit and an added hazard detection unit. The hazard detection unit takes ID/EX.MemRead, ID/EX.RegisterRt, IF/ID.RegisterRs, and IF/ID.RegisterRt as inputs and drives PCWrite and IF/IDWrite.]
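The load-use check made by the hazard detection unit can be sketched as follows. The condition is the usual textbook one for this datapath and is an assumption here; the output name insert_bubble is hypothetical and stands for zeroing the control signals entering ID/EX.

```python
def hazard_detection_unit(id_ex_mem_read, id_ex_rt, if_id_rs, if_id_rt):
    """Detect a load in EX whose destination is read by the instruction in ID."""
    stall = bool(id_ex_mem_read) and id_ex_rt in (if_id_rs, if_id_rt)
    return {
        "PCWrite": not stall,      # hold the PC so the same instruction is refetched
        "IF/IDWrite": not stall,   # hold IF/ID so the instruction in ID is decoded again
        "insert_bubble": stall,    # hypothetical name: zero the control fields in ID/EX
    }


# lw $2, 20($1) in EX (Rt = 2) and "and $4, $2, $5" in ID (Rs = 2, Rt = 5).
print(hazard_detection_unit(True, 2, 2, 5))
# A non-load in EX never triggers the stall.
print(hazard_detection_unit(False, 2, 2, 5))
```

Freezing the PC and IF/ID for one cycle keeps the dependent instruction in place, after which the loaded value can be forwarded from MEM/WB.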

35 Control Hazards: Example
Three other instructions are in the pipeline before the branch target decision is made, when BEQ is in the MEM stage.
40 beq $1, $3, 7
44 and $12, $2, $5
48 or $13, $6, $2
52 add $14, $2, $2
72 lw $4, 50($7)
[Figure: program execution order (in instructions) against time in clock cycles CC 1 to CC 9, each instruction drawn as IM, Reg, ALU, DM, Reg.]
In the above diagram, we are predicting branch not taken.
Need to add hardware for flushing the three following instructions if we are wrong, losing three cycles.

36 Reducing Delay of Taken Branches
Next PC of a branch known in MEM stage: costs three lost cycles if taken.
If next PC is known in EX stage, one cycle is saved.
Branch address calculation can be moved to the ID stage using a register comparator, costing only one cycle if the branch is taken.
[Figure: the pipelined datapath with the branch adder and a register comparator (=) moved into ID, an IF.Flush signal, the hazard detection unit, and the forwarding unit.]

37 Pipeline Performance Example
Assume the following MIPS instruction mix:
Type         Frequency
Arith/Logic  40%
Load         30%, of which 25% are followed immediately by an instruction using the loaded value
Store        10%
Branch       20%, of which 45% are taken
What is the resulting CPI for the pipelined MIPS with forwarding and branch address calculation in ID stage?
CPI = Ideal CPI + Pipeline stall clock cycles per instruction
= 1 + stalls by loads + stalls by branches
= 1 + 0.3 x 0.25 x 1 + 0.2 x 0.45 x 1
= 1 + 0.075 + 0.09
= 1.165
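The CPI calculation can be sketched as below, taking loads as 30% of instructions (25% of them load-use) and branches as 20% (45% taken), with a one-cycle stall for each load-use pair and each taken branch. The function and parameter names are hypothetical.

```python
def pipelined_cpi(load_freq, load_use_fraction, branch_freq, taken_fraction,
                  load_stall=1, branch_stall=1, ideal_cpi=1.0):
    """CPI = ideal CPI + stall cycles per instruction from loads and branches."""
    load_stalls = load_freq * load_use_fraction * load_stall
    branch_stalls = branch_freq * taken_fraction * branch_stall
    return ideal_cpi + load_stalls + branch_stalls


cpi = pipelined_cpi(load_freq=0.30, load_use_fraction=0.25,
                    branch_freq=0.20, taken_fraction=0.45)
print(round(cpi, 3))
```

Setting branch_stall=3 in the same function models resolving branches in the MEM stage instead of ID, which shows how much the earlier branch decision is worth.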

What do we have so far? Multi-Cycle Datapath (Textbook Version)

What do we have so far? Multi-Cycle Datapath (Textbook Version) What do we have so far? ulti-cycle Datapath (Textbook Version) CPI: R-Type = 4, Load = 5, Store 4, Branch = 3 Only one instruction being processed in datapath How to lower CPI further? #1 Lec # 8 Summer2001

More information

Enhanced Performance with Pipelining

Enhanced Performance with Pipelining Chapter 6 Enhanced Performance with Pipelining Note: The slides being presented represent a mi. Some are created by ark Franklin, Washington University in St. Lois, Dept. of CSE. any are taken from the

More information

Pipelining. Chapter 4

Pipelining. Chapter 4 Pipelining Chapter 4 ake processor rns faster Pipelining is an implementation techniqe in which mltiple instrctions are overlapped in eection Key of making processor fast Pipelining Single cycle path we

More information

Overview of Pipelining

Overview of Pipelining EEC 58 Compter Architectre Pipelining Department of Electrical Engineering and Compter Science Cleveland State University Fndamental Principles Overview of Pipelining Pipelined Design otivation: Increase

More information

T = I x CPI x C. Both effective CPI and clock cycle C are heavily influenced by CPU design. CPI increased (3-5) bad Shorter cycle good

T = I x CPI x C. Both effective CPI and clock cycle C are heavily influenced by CPU design. CPI increased (3-5) bad Shorter cycle good CPU performance equation: T = I x CPI x C Both effective CPI and clock cycle C are heavily influenced by CPU design. For single-cycle CPU: CPI = 1 good Long cycle time bad On the other hand, for multi-cycle

More information

PS Midterm 2. Pipelining

PS Midterm 2. Pipelining PS idterm 2 Pipelining Seqential Landry 6 P 7 8 9 idnight Time T a s k O r d e r A B C D 3 4 2 3 4 2 3 4 2 3 4 2 Seqential landry takes 6 hors for 4 loads If they learned pipelining, how long wold landry

More information

Chapter 6: Pipelining

Chapter 6: Pipelining Chapter 6: Pipelining Otline An overview of pipelining A pipelined path Pipelined control Data hazards and forwarding Data hazards and stalls Branch hazards Eceptions Sperscalar and dynamic pipelining

More information

Instruction fetch. MemRead. IRWrite ALUSrcB = 01. ALUOp = 00. PCWrite. PCSource = 00. ALUSrcB = 00. R-type completion

Instruction fetch. MemRead. IRWrite ALUSrcB = 01. ALUOp = 00. PCWrite. PCSource = 00. ALUSrcB = 00. R-type completion . (Chapter 5) Fill in the vales for SrcA, SrcB, IorD, Dst and emto to complete the Finite State achine for the mlti-cycle datapath shown below. emory address comptation 2 SrcA = SrcB = Op = fetch em SrcA

More information

Review: Computer Organization

Review: Computer Organization Review: Compter Organization Pipelining Chans Y Landry Eample Landry Eample Ann, Brian, Cathy, Dave each have one load of clothes to wash, dry, and fold Washer takes 3 mintes A B C D Dryer takes 3 mintes

More information

1048: Computer Organization

1048: Computer Organization 8: Compter Organization Lectre 6 Pipelining Lectre6 - pipelining (cwli@twins.ee.nct.ed.tw) 6- Otline An overview of pipelining A pipelined path Pipelined control Data hazards and forwarding Data hazards

More information

Chapter 3 & Appendix C Pipelining Part A: Basic and Intermediate Concepts

Chapter 3 & Appendix C Pipelining Part A: Basic and Intermediate Concepts CS359: Compter Architectre Chapter 3 & Appendi C Pipelining Part A: Basic and Intermediate Concepts Yanyan Shen Department of Compter Science and Engineering Shanghai Jiao Tong University 1 Otline Introdction

More information

Improve performance by increasing instruction throughput

Improve performance by increasing instruction throughput Improve performance by increasing instruction throughput Program execution order Time (in instructions) lw $1, 100($0) fetch 2 4 6 8 10 12 14 16 18 ALU Data access lw $2, 200($0) 8ns fetch ALU Data access

More information

Chapter 6 Enhancing Performance with. Pipelining. Pipelining. Pipelined vs. Single-Cycle Instruction Execution: the Plan. Pipelining: Keep in Mind

Chapter 6 Enhancing Performance with. Pipelining. Pipelining. Pipelined vs. Single-Cycle Instruction Execution: the Plan. Pipelining: Keep in Mind Pipelining hink of sing machines in landry services Chapter 6 nhancing Performance with Pipelining 6 P 7 8 9 A ime ask A B C ot pipelined Assme 3 min. each task wash, dry, fold, store and that separate

More information

Multi-cycle Datapath (Our Version)

Multi-cycle Datapath (Our Version) ulti-cycle Datapath (Our Version) npc_sel Next PC PC Instruction Fetch IR File Operand Fetch A B ExtOp ALUSrc ALUctr Ext ALU R emrd emwr em Access emto Data em Dst Wr. File isters added: IR: Instruction

More information

TDT4255 Friday the 21st of October. Real world examples of pipelining? How does pipelining influence instruction

TDT4255 Friday the 21st of October. Real world examples of pipelining? How does pipelining influence instruction Review Friday the 2st of October Real world eamples of pipelining? How does pipelining pp inflence instrction latency? How does pipelining inflence instrction throghpt? What are the three types of hazard

More information

Chapter Six. Dataı access. Reg. Instructionı. fetch. Dataı. Reg. access. Dataı. Reg. access. Dataı. Instructionı fetch. 2 ns 2 ns 2 ns 2 ns 2 ns

Chapter Six. Dataı access. Reg. Instructionı. fetch. Dataı. Reg. access. Dataı. Reg. access. Dataı. Instructionı fetch. 2 ns 2 ns 2 ns 2 ns 2 ns Chapter Si Pipelining Improve perfomance by increasing instruction throughput eecutionı Time lw $, ($) 2 6 8 2 6 8 access lw $2, 2($) 8 ns access lw $3, 3($) eecutionı Time lw $, ($) lw $2, 2($) 2 ns 8

More information

Chapter 6: Pipelining

Chapter 6: Pipelining CSE 322 COPUTER ARCHITECTURE II Chapter 6: Pipelining Chapter 6: Pipelining Febrary 10, 2000 1 Clothes Washing CSE 322 COPUTER ARCHITECTURE II The Assembly Line Accmlate dirty clothes in hamper Place in

More information

Chapter 4 (Part II) Sequential Laundry

Chapter 4 (Part II) Sequential Laundry Chapter 4 (Part II) The Processor Baback Izadi Division of Engineering Programs bai@engr.newpaltz.edu Sequential Laundry 6 P 7 8 9 10 11 12 1 2 A T a s k O r d e r A B C D 30 30 30 30 30 30 30 30 30 30

More information

CS 251, Winter 2018, Assignment % of course mark

CS 251, Winter 2018, Assignment % of course mark CS 25, Winter 28, Assignment 4.. 3% of corse mark De Wednesday, arch 7th, 4:3P Lates accepted ntil Thrsday arch 8th, am with a 5% penalty. (6 points) In the diagram below, the mlticycle compter from the

More information

The final datapath. M u x. Add. 4 Add. Shift left 2. PCSrc. RegWrite. MemToR. MemWrite. Read data 1 I [25-21] Instruction. Read. register 1 Read.

The final datapath. M u x. Add. 4 Add. Shift left 2. PCSrc. RegWrite. MemToR. MemWrite. Read data 1 I [25-21] Instruction. Read. register 1 Read. The final path PC 4 Add Reg Shift left 2 Add PCSrc Instrction [3-] Instrction I [25-2] I [2-6] I [5 - ] register register 2 register 2 Registers ALU Zero Reslt ALUOp em Data emtor RegDst ALUSrc em I [5

More information

EEC 483 Computer Organization

EEC 483 Computer Organization EEC 83 Compter Organization Chapter.6 A Pipelined path Chans Y Pipelined Approach 2 - Cycle time, No. stages - Resorce conflict E E A B C D 3 E E 5 E 2 3 5 2 6 7 8 9 c.y9@csohio.ed Resorces sed in 5 Stages

More information

The single-cycle design from last time

The single-cycle design from last time lticycle path Last time we saw a single-cycle path and control nit for or simple IPS-based instrction set. A mlticycle processor fies some shortcomings in the single-cycle CPU. Faster instrctions are not

More information

The extra single-cycle adders

The extra single-cycle adders lticycle Datapath As an added bons, we can eliminate some of the etra hardware from the single-cycle path. We will restrict orselves to sing each fnctional nit once per cycle, jst like before. Bt since

More information

CS 251, Winter 2019, Assignment % of course mark

CS 251, Winter 2019, Assignment % of course mark CS 25, Winter 29, Assignment.. 3% of corse mark De Wednesday, arch 3th, 5:3P Lates accepted ntil Thrsday arch th, pm with a 5% penalty. (7 points) In the diagram below, the mlticycle compter from the corse

More information

Comp 303 Computer Architecture A Pipelined Datapath Control. Lecture 13

Comp 303 Computer Architecture A Pipelined Datapath Control. Lecture 13 Comp 33 Compter Architectre A Pipelined path Lectre 3 Pipelined path with Signals PCSrc IF/ ID ID/ EX EX / E E / Add PC 4 Address Instrction emory RegWr ra rb rw Registers bsw [5-] [2-6] [5-] bsa bsb Sign

More information

Solutions for Chapter 6 Exercises

Solutions for Chapter 6 Exercises Soltions for Chapter 6 Eercises Soltions for Chapter 6 Eercises 6. 6.2 a. Shortening the ALU operation will not affect the speedp obtained from pipelining. It wold not affect the clock cycle. b. If the

More information

Quiz #1 EEC 483, Spring 2019

Quiz #1 EEC 483, Spring 2019 Qiz # EEC 483, Spring 29 Date: Jan 22 Name: Eercise #: Translate the following instrction in C into IPS code. Eercise #2: Translate the following instrction in C into IPS code. Hint: operand C is stored

More information

EEC 483 Computer Organization. Branch (Control) Hazards

EEC 483 Computer Organization. Branch (Control) Hazards EEC 483 Compter Organization Section 4.8 Branch Hazards Section 4.9 Exceptions Chans Y Branch (Control) Hazards While execting a previos branch, next instrction address might not yet be known. s n i o

More information

EEC 483 Computer Organization

EEC 483 Computer Organization EEC 483 Compter Organization Chapter 4.4 A Simple Implementation Scheme Chans Y The Big Pictre The Five Classic Components of a Compter Processor Control emory Inpt path Otpt path & Control 2 path and

More information

Exceptions and interrupts

Exceptions and interrupts Eceptions and interrpts An eception or interrpt is an nepected event that reqires the CPU to pase or stop the crrent program. Eception handling is the hardware analog of error handling in software. Classes

More information

PIPELINING. Pipelining: Natural Phenomenon. Pipelining. Pipelining Lessons

PIPELINING. Pipelining: Natural Phenomenon. Pipelining. Pipelining Lessons Pipelining: Natral Phenomenon Landry Eample: nn, rian, Cathy, Dave each have one load of clothes to wash, dry, and fold Washer takes 30 mintes C D Dryer takes 0 mintes PIPELINING Folder takes 20 mintes

More information

Review Multicycle: What is Happening. Controlling The Multicycle Design

Review Multicycle: What is Happening. Controlling The Multicycle Design Review lticycle: What is Happening Reslt Zero Op SrcA SrcB Registers Reg Address emory em Data Sign etend Shift left Sorce A B Ot [-6] [5-] [-6] [5-] [5-] Instrction emory IR RegDst emtoreg IorD em em

More information

EXAMINATIONS 2010 END OF YEAR NWEN 242 COMPUTER ORGANIZATION

EXAMINATIONS 2010 END OF YEAR NWEN 242 COMPUTER ORGANIZATION EXAINATIONS 2010 END OF YEAR COPUTER ORGANIZATION Time Allowed: 3 Hors (180 mintes) Instrctions: Answer all qestions. ake sre yor answers are clear and to the point. Calclators and paper foreign langage

More information

Computer Architecture Chapter 5. Fall 2005 Department of Computer Science Kent State University

Computer Architecture Chapter 5. Fall 2005 Department of Computer Science Kent State University Compter Architectre Chapter 5 Fall 25 Department of Compter Science Kent State University The Processor: Datapath & Control Or implementation of the MIPS is simplified memory-reference instrctions: lw,

More information

EXAMINATIONS 2003 END-YEAR COMP 203. Computer Organisation

EXAMINATIONS 2003 END-YEAR COMP 203. Computer Organisation EXAINATIONS 2003 COP203 END-YEAR Compter Organisation Time Allowed: 3 Hors (180 mintes) Instrctions: Answer all qestions. There are 180 possible marks on the eam. Calclators and foreign langage dictionaries

More information

Designing a Pipelined CPU

Designing a Pipelined CPU Designing a Pipelined CPU CSE 4, S2'6 Review -- Single Cycle CPU CSE 4, S2'6 Review -- ultiple Cycle CPU CSE 4, S2'6 Review -- Instruction Latencies Single-Cycle CPU Load Ifetch /Dec Exec em Wr ultiple

More information

1048: Computer Organization

1048: Computer Organization 48: Compter Organization Lectre 5 Datapath and Control Lectre5A - simple implementation (cwli@twins.ee.nct.ed.tw) 5A- Introdction In this lectre, we will try to implement simplified IPS which contain emory

More information

The multicycle datapath. Lecture 10 (Wed 10/15/2008) Finite-state machine for the control unit. Implementing the FSM

The multicycle datapath. Lecture 10 (Wed 10/15/2008) Finite-state machine for the control unit. Implementing the FSM Lectre (Wed /5/28) Lab # Hardware De Fri Oct 7 HW #2 IPS programming, de Wed Oct 22 idterm Fri Oct 2 IorD The mlticycle path SrcA Today s objectives: icroprogramming Etending the mlti-cycle path lti-cycle

More information

Lecture 7. Building A Simple Processor

Lecture 7. Building A Simple Processor Lectre 7 Bilding A Simple Processor Christos Kozyrakis Stanford University http://eeclass.stanford.ed/ee8b C. Kozyrakis EE8b Lectre 7 Annoncements Upcoming deadlines Lab is de today Demo by 5pm, report

More information

Pipelining. Ideal speedup is number of stages in the pipeline. Do we achieve this? 2. Improve performance by increasing instruction throughput ...

Pipelining. Ideal speedup is number of stages in the pipeline. Do we achieve this? 2. Improve performance by increasing instruction throughput ... CHAPTER 6 1 Pipelining Instruction class Instruction memory ister read ALU Data memory ister write Total (in ps) Load word 200 100 200 200 100 800 Store word 200 100 200 200 700 R-format 200 100 200 100

More information
