EEC 483 Computer Organization. Branch (Control) Hazards


Marlene Gregory
5 years ago

Section 4.8 Branch Hazards
Section 4.9 Exceptions
Chansu Yu

Branch (Control) Hazards

While executing a previous branch, the next instruction address might not yet be known.

[Figure: pipeline timing diagram, instructions versus time step (clock cycle). The conditional branch passes through IF (calculates PC+4), ID (computes branch target address), EX (performs branch test and sets PC to target), MEM, and WB. The branch target waits through two stall cycles before its own IF, ID, EX, MEM, and WB.]

Branch Hazards

- We can stall the pipeline for every branch instruction.
  - Too slow (3 instructions).
- Or, continue execution down the sequential instruction stream, assuming that the branch will not be taken ("predict branch not taken").
  - If the condition is not met, OK! (The prediction is successful.)
  - If the condition is met, the prediction is wrong. Some unwanted instructions are in the pipeline, and we need to flush them.
- How do you compare the above two?
  - If branches are taken half the time, and if it costs little to discard the instructions, the second approach halves the cost of control hazards.
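The comparison above can be checked with a little arithmetic. The following is a minimal sketch, assuming a 3-cycle branch penalty (from the slide) and a hypothetical branch frequency of 20% of all instructions; the function name and the 20% figure are illustrative and not from the slides.

```python
def branch_stall_cycles(branch_fraction, taken_fraction, penalty_cycles, policy):
    """Average stall cycles added per instruction by branches.

    branch_fraction: share of instructions that are branches (assumed value).
    taken_fraction:  share of branches that are taken.
    penalty_cycles:  cycles lost when the pipeline must stall or flush.
    """
    if policy == "always_stall":
        # Every branch pays the full penalty.
        return branch_fraction * penalty_cycles
    if policy == "predict_not_taken":
        # Only taken branches (mispredictions) pay the penalty.
        return branch_fraction * taken_fraction * penalty_cycles
    raise ValueError(f"unknown policy: {policy}")


stall = branch_stall_cycles(0.2, 0.5, 3, "always_stall")
predict = branch_stall_cycles(0.2, 0.5, 3, "predict_not_taken")
print(f"always stall:      +{stall:.2f} cycles per instruction")
print(f"predict not taken: +{predict:.2f} cycles per instruction")
```

With branches taken half the time, the predicted cost is half the always-stall cost regardless of the branch frequency chosen.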

Branch Hazards: Reducing the cost of a taken branch

- Branch address procedure
  - IF: PC+4
  - EX: Branch address calculation, ZF evaluation
  - MEM: Branch target is selected
- Selecting the branch address at the ID stage reduces the penalty from 3 cycles to one cycle.
  - Branch address calculation can be done at the ID stage.
  - ZF evaluation: Equality can be tested at the ID stage by first exclusive ORing the respective bits of the two read registers and then ORing all the results. An all-zero result means the registers are equal.
- Control: IF.Flush flushes the instruction in the IF stage.
  - It zeros the instruction field of the IF/ID pipeline register.
  - IF.Flush = (IF/ID.Branch && ZF)?? Is this the same as PCSrc???

IF.Flush versus Zero Control Signals

- In order to put a bubble, we nullify the control signals (for a stall on a data hazard).
- Questions
  - Why can't we use the same technique for a branch hazard?
    There is no control signal at the IF stage.
  - Is zeroing control signals enough in case of a stall?
    As long as MemRead, MemWrite, and RegWrite are not asserted, no storage value is updated. The ALU will do something and the MUXes will select something, but it doesn't affect any result.
  - What does flushing mean?
    It zeros the instruction field of the IF/ID pipeline register, which in fact can be decoded as sll $0, $0, $0.
    In fact, nop = sll $0, $0, $0.
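The ID-stage equality test and the flush can be sketched in software. Two registers are equal exactly when every bit of their XOR is 0, that is, when the OR of all XOR outputs is 0. The function names below are illustrative assumptions; only the all-zero encoding of sll $0, $0, $0 is a fact of the MIPS instruction format.

```python
NOP = 0x00000000  # sll $0, $0, $0: opcode, rs, rt, rd, shamt, funct all zero


def registers_equal(a, b, width=32):
    """Model the ID-stage comparator: bitwise XOR, then OR-reduce."""
    mask = (1 << width) - 1
    diff = (a ^ b) & mask          # a 1 bit marks a position where a and b differ
    any_diff = 0
    for i in range(width):         # OR together all XOR outputs
        any_diff |= (diff >> i) & 1
    return any_diff == 0           # zero means every bit matched


def next_if_id_instruction(fetched, is_branch, equal):
    """Value latched into IF/ID: zeroed (a nop) when a taken beq flushes IF."""
    if_flush = is_branch and equal
    return NOP if if_flush else fetched


fetched = 0x00852024  # some instruction word sitting in IF (arbitrary example)
print(hex(next_if_id_instruction(fetched, True, registers_equal(7, 7))))
print(hex(next_if_id_instruction(fetched, True, registers_equal(7, 8))))
```

A real comparator performs the OR-reduction in parallel gates; the loop here only spells out the logic.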

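Branch Hazards Example

0040  beq $1, $2, 7      ; ...*4 = ...
      and $3, $4, $...
      lw  $6, 50($7)

The branch target is calculated and ZF is checked in the ID stage of beq.

[Figure: timing diagram. beq proceeds through IF, ID, EX; the and instruction is fetched and then flushed (---); lw is fetched next.]

Branch Hazards: Flushing

Implements flushing for branch hazards (only one addition!): the IF.Flush signal, which comes from the Control circuit.

[Figure: pipelined datapath with IF.Flush, the hazard detection unit, Control, the IF/ID, ID/EX, EX/MEM, and MEM/WB pipeline registers, PC, instruction memory, the shift-left-2 unit, registers, the "=" comparator in ID, sign extend, ALU, data memory, MUXes, and the forwarding unit.]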
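The target calculation in the example follows the usual MIPS rule: target = (PC + 4) + (sign-extended offset × 4). Here is a minimal sketch, assuming the beq sits at address 40 read as a decimal number (the slide's remaining addresses did not survive extraction, so this reading is an assumption).

```python
def sign_extend16(imm):
    """Interpret the low 16 bits of imm as a two's-complement value."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def branch_target(pc, imm16):
    """PC-relative branch target: PC+4 plus the word offset shifted left by 2."""
    return (pc + 4 + (sign_extend16(imm16) << 2)) & 0xFFFFFFFF


print(branch_target(40, 7))       # beq at 40 with offset 7
print(branch_target(40, 0xFFFF))  # offset -1 branches back to the beq itself
```

The shift left by 2 corresponds to the "Shift left 2" unit in the datapath, and the addition is the extra adder that moves into the ID stage.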

Stalling: What happens in the pipeline?

                CC1   CC2   CC3        CC4        CC5         CC6        CC7
beq $1,$2,7     IF    ID    EX         MEM        WB
add $3,$4,$5          IF    Null (ID)  Null (EX)  Null (MEM)  Null (WB)
target of beq               IF         ID         EX          MEM        WB

- The ID stage executes a null instruction (sll $0,$0,$0) at CC3.
- The EX stage executes a null instruction (sll $0,$0,$0) at CC4.
- The MEM stage executes a null instruction (sll $0,$0,$0) at CC5.
- The WB stage executes a null instruction (sll $0,$0,$0) at CC6.
- IF.Flush at CC3 will do.

Branch Hazards: Improvement

Main techniques for avoiding stalls:
- Eliminating branches
- Branch prediction
- Move comparison testing to an earlier stage
- Branch delay slots
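The cycle-by-cycle table above can be regenerated with a few lines of code. This is a minimal sketch, assuming a five-stage pipeline in which a flushed instruction is fetched normally and then travels through the remaining stages as a null instruction; the function and its tuple format are illustrative only.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def timeline(instructions, n_cycles):
    """Build one row of per-cycle stage labels for each instruction.

    instructions: list of (name, fetch_cycle, flushed) tuples, cycles 1-based.
    """
    rows = {}
    for name, fetch_cycle, flushed in instructions:
        cells = [""] * n_cycles
        for offset, stage in enumerate(STAGES):
            cycle = fetch_cycle + offset
            if cycle > n_cycles:
                break
            # A flushed instruction is still fetched, then becomes a bubble.
            is_null = flushed and stage != "IF"
            cells[cycle - 1] = f"Null({stage})" if is_null else stage
        rows[name] = cells
    return rows


rows = timeline(
    [("beq $1,$2,7", 1, False),
     ("add $3,$4,$5", 2, True),
     ("target of beq", 3, False)],
    n_cycles=7,
)
for name, cells in rows.items():
    print(f"{name:15}" + "".join(f"{cell:11}" for cell in cells))
```

Only one instruction is lost, which is the one-cycle penalty obtained by resolving the branch in ID.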

Branch Hazards: Eliminating Branches

The compiler can rewrite code to eliminate some branches. Examples?
- Branch to a branch.
- Loop unrolling.

[Figure: flowcharts of the two cases. A branch that leads to another branch is replaced by a single branch, and a loop whose body is followed by a yes/no branch is unrolled into repeated bodies with fewer branches.]

Branch Hazards: Earlier Branch Testing

- In the given pipeline, the branch condition is tested in EX.
- Could move the test to ID.
  - Requires an additional mini-ALU to perform tests.
  - Eliminates one stall cycle.
  - Could potentially increase cycle length.
- Still have one cycle of stall.
  - Just like unconditional branches.
- Assume this optimization.

Branch Hazards: Branch Delay Slots

While determining the next instruction address, go ahead and execute the sequentially following instruction(s).

[Figure: pipeline timing diagram, instructions versus time step (clock cycle). The conditional branch computes the branch target address in ID, then performs the branch test and sets PC to the target in EX. The branch delay instruction follows one cycle behind, and the branch target follows one cycle after that, when IF fetches the correct target.]

Branch Hazards: Branch Delay Slots

Advantage:
- Can avoid one stall per delay slot.

Disadvantages:
- Makes assembly-language programming more difficult.
- Can be difficult to find appropriate code for the slot.
- Exposes an implementation detail that could change.
  - Later implementations without a stall must still emulate the slot.
- Most modern processors avoid delay slots.

Branch Hazards: Branch Prediction

- Guess which instruction is next, and start executing it.
- What if the guess is wrong? Flush the pipeline.
- Simplest guesses: Always Taken or Never Taken.
- When to do prediction?
  - Static prediction: compiler
  - Dynamic prediction: processor

Dynamic Branch Prediction

Branch prediction buffer (branch history table): a small memory that is indexed by the lower portion of the address of the branch instruction and that contains one or more bits indicating whether the branch was recently taken or not.

[Figure: the PC addresses both the instruction memory and the BPB in the IF stage. The instruction and the prediction (T or NT) are passed to the IF/ID pipeline register.]
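The indexing scheme of the branch prediction buffer can be sketched as follows. This is a minimal model, assuming a 16-entry table, 4-byte aligned instructions (so the two lowest PC bits are dropped), and one history bit per entry; the class name and sizes are illustrative, not taken from the slides.

```python
class BranchPredictionBuffer:
    """1-bit branch history table indexed by the low bits of the branch PC."""

    def __init__(self, entries=16):
        assert entries > 0 and entries & (entries - 1) == 0, "power of two"
        self.entries = entries
        self.taken = [False] * entries  # one "recently taken?" bit per entry

    def index(self, pc):
        # Drop the 2 byte-offset bits, keep the next log2(entries) bits.
        return (pc >> 2) & (self.entries - 1)

    def predict(self, pc):
        return self.taken[self.index(pc)]

    def update(self, pc, was_taken):
        self.taken[self.index(pc)] = was_taken


bpb = BranchPredictionBuffer(entries=16)
bpb.update(0x40, True)
print(bpb.predict(0x40))          # the branch at 0x40 was recently taken
print(bpb.predict(0x40 + 16 * 4)) # a different branch that aliases to the same entry
```

Because only the lower address bits are used, two different branches can share an entry. The buffer holds no tag, so the prediction is simply a hint that may belong to another branch.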

Dynamic Branch Prediction: 1-bit predictor

[Figure: two-state diagram. In "Predict taken", a taken branch (T) keeps the state and a not-taken branch (NT) moves to "Predict not taken". In "Predict not taken", NT keeps the state and T moves back to "Predict taken".]

Prediction accuracy for a loop (closed by a beq) that runs 10 times:
  1st: ?, 2nd: correct, 3rd: correct, ..., 9th: correct, 10th: incorrect
  => 80% accuracy (because the first prediction is incorrect in the second execution of the same code)

Dynamic Branch Prediction: 2-bit predictor

What is the prediction accuracy with the same example? 90%
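The 80% and 90% figures can be reproduced with a short simulation. This sketch assumes the loop branch is taken on iterations 1 through 9 and not taken on iteration 10, and it measures the second execution of the loop so that the predictor starts in the state left by the first. The 2-bit scheme is modeled as a saturating counter, which is one common realization and an assumption here.

```python
def run_one_bit(outcomes, state=False):
    """1-bit predictor: predict whatever the branch did last time."""
    hits = 0
    for taken in outcomes:
        hits += (state == taken)
        state = taken
    return hits, state


def run_two_bit(outcomes, counter=0):
    """2-bit saturating counter: 0-1 predict not taken, 2-3 predict taken."""
    hits = 0
    for taken in outcomes:
        hits += ((counter >= 2) == taken)
        counter = min(3, counter + 1) if taken else max(0, counter - 1)
    return hits, counter


loop = [True] * 9 + [False]  # taken 9 times, then falls out of the loop

_, state = run_one_bit(loop)             # first execution warms up the predictor
hits1, _ = run_one_bit(loop, state)      # second execution is measured
_, counter = run_two_bit(loop)
hits2, _ = run_two_bit(loop, counter)

print(f"1-bit: {hits1}/10 correct")
print(f"2-bit: {hits2}/10 correct")
```

The 1-bit predictor mispredicts both the first and the last iteration, while the 2-bit counter needs two wrong guesses in a row to change its mind and so misses only the loop exit.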

4.9 Exceptions

- Another form of control hazard involves exceptions.
- When an arithmetic overflow occurs during the execution of add $1, $2, $1:
  - Transfer control to the exception routine (0x...).
  - This is the same as executing a branch instruction.
- For a taken branch, we flush pipeline registers.
  - The branch is tested at the beginning of the ID stage, and thus flushing takes place at the ID stage.
  - Since only one instruction follows the instruction in ID, we just need to flush that instruction.

Flush Control Signals

- Similar to the taken branch, we need to flush pipeline registers.
- The question is: which stages' pipeline register(s)?
  - Arithmetic overflow is detected at the end of the EX stage, and thus flushing takes place at the MEM stage (at the next cycle).
  - Since three following instructions are already in the pipeline (IF, ID, and EX stages), we need to flush those three instructions.
  - Otherwise, $1 can be written back and we cannot investigate the cause of the overflow.
- Therefore, we need ID.Flush and EX.Flush in addition to the IF.Flush control signal.

EPC and Cause

Additionally:
- EPC is written.
- Cause is written.

Exception in a Pipelined Computer

Given the instruction sequence

0x40  sub $11, $2, $4
0x44  and $12, $2, $5
0x48  or  $13, $2, $6
0x4c  add $1, $2, $1
0x50  slt $15, $6, $7
0x54  lw  $16, 50($7)
...
0x... sw  $25, 1000($0)
0x... sw  $26, 1004($0)
...

Assume an overflow exception occurs when executing add.
- EPC becomes 0x50.
- Flush signals convert the following instructions to bubbles.
- Fetching starts from 0x... (the exception service routine).
- The sub, and, and or instructions prior to add complete.
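The bookkeeping in this example can be sketched in a few lines. The model below assumes that EPC receives the address of the offending instruction plus 4 (which gives the slide's 0x50 for the add at 0x4c) and that the add and everything after it are squashed. The handler address did not survive in the slides, so `HANDLER_PC` is a hypothetical placeholder, as are the function and dictionary names.

```python
PROGRAM = {
    0x40: "sub $11, $2, $4",
    0x44: "and $12, $2, $5",
    0x48: "or  $13, $2, $6",
    0x4C: "add $1, $2, $1",
    0x50: "slt $15, $6, $7",
    0x54: "lw  $16, 50($7)",
}

HANDLER_PC = 0x1000  # HYPOTHETICAL: the real routine address is not in the slides


def overflow_exception(program, faulting_pc, handler_pc):
    """Summarize the pipeline's response to an overflow at faulting_pc."""
    addresses = sorted(program)
    return {
        "EPC": faulting_pc + 4,  # saved so the routine can locate the instruction
        "next_pc": handler_pc,   # fetching restarts at the service routine
        "completed": [a for a in addresses if a < faulting_pc],
        "squashed": [a for a in addresses if a >= faulting_pc],
    }


result = overflow_exception(PROGRAM, 0x4C, HANDLER_PC)
print(f"EPC = {result['EPC']:#x}")
print("complete:", [PROGRAM[a] for a in result["completed"]])
print("bubbles: ", [PROGRAM[a] for a in result["squashed"]])
```

Squashing the add itself keeps $1 from being overwritten, so the service routine can still inspect the operands that caused the overflow.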

Challenges

- What if more than one instruction generates exceptions?
  - While add causes an overflow exception at CC5 in EX, lw (with a wrong opcode) causes an invalid opcode exception at CC5 in IF.
  - It is not OK to generate all flushing signals.
- And, how does the exception service routine correctly identify the instruction that causes the exception?
  => Imprecise exception

Precise and Imprecise Exceptions

- Precise exceptions
  - Hardware (the CPU) correctly identifies the offending instruction and makes sure all prior instructions complete.
  - All instructions following it are not allowed to complete their execution and have not modified the process state.
- Imprecise exceptions
  - Hardware does not guarantee this and leaves it up to the operating system to determine which instruction caused the problem.
  - Some instructions following the offending instruction are allowed to complete their execution and modify the process state.
- Most modern CPUs support precise exceptions.
