Announcements. ECE4750/CS4420 Computer Architecture L11: Speculative Execution I. Edward Suh Computer Systems Laboratory



Edward Suh, Computer Systems Laboratory, suh@csl.cornell.edu

Announcements
  - Lab3 due today

Overview
  - Branch penalties limit performance of advanced processors
  - Dynamic branch prediction techniques can be very effective: ~97%
      - Fetch from predicted path ahead of branch resolution
  - Today: speculative execution
      - Cannot afford to fetch and wait; must execute speculatively
      - Key question: how to recover from a mis-speculation?
  - Reading: Chapter

Speculative Execution
  - Instructions from predicted path allowed to execute (enter EX stage)
      - Especially effective with dynamic scheduling
  - Must provide recovery mechanism in case of misprediction
      - Speculative instructions may not commit state changes
      - Speculative results must still be provided to others

Tomasulo-based FPU
  [Figure: Tomasulo-based FPU]

Three Steps to Instruction Execution
  - Step 1: Issue
      - if reservation station available (structural), then rename operands, send instruction to reservation station
  - Step 2: Execution
      - if operand(s) not available, monitor CDB (snoop)
      - inform control logic at completion
  - Step 3: Write result
      - broadcast result via CDB
      - if no WAW hazard, update register
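The three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the lecture's implementation: the names `Station`, `issue`, `ready`, and `write_result` are hypothetical, execution latency is ignored, and the "no WAW hazard" condition is modeled as "the register status still names this station".

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Station:                      # hypothetical reservation-station record
    name: str
    busy: bool = False
    op: Optional[str] = None
    vj: Optional[float] = None      # source values, once known
    vk: Optional[float] = None
    qj: Optional[str] = None        # producing station names, while waiting
    qk: Optional[str] = None

OPS = {"fadd": lambda a, b: a + b, "fsub": lambda a, b: a - b,
       "fmul": lambda a, b: a * b, "fdiv": lambda a, b: a / b}

def issue(st, op, dest, src1, src2, regs, reg_status):
    """Step 1: claim a free station and rename the operands."""
    if st.busy:
        return False                           # structural hazard: stall
    st.busy, st.op = True, op
    st.vj = st.vk = st.qj = st.qk = None
    for src, v, q in ((src1, "vj", "qj"), (src2, "vk", "qk")):
        if reg_status.get(src) is None:
            setattr(st, v, regs[src])          # value available: copy it
        else:
            setattr(st, q, reg_status[src])    # wait on the producer's tag
    reg_status[dest] = st.name                 # later readers wait on this station
    return True

def ready(st):
    """Step 2 precondition: both operands have arrived."""
    return st.busy and st.qj is None and st.qk is None

def write_result(st, stations, regs, reg_status):
    """Steps 2-3: compute, then broadcast (tag, value) on the CDB."""
    value = OPS[st.op](st.vj, st.vk)
    for other in stations:                     # waiting stations snoop the CDB
        if other.qj == st.name:
            other.vj, other.qj = value, None
        if other.qk == st.name:
            other.vk, other.qk = value, None
    for reg, tag in reg_status.items():        # only registers still waiting on us
        if tag == st.name:
            regs[reg], reg_status[reg] = value, None
    st.busy = False
    return value

# Usage with made-up register values: fmul f0,f2,f4 then fadd f8,f0,f6
regs = {"f0": 0.0, "f2": 3.0, "f4": 2.0, "f6": 10.0, "f8": 0.0}
reg_status = {r: None for r in regs}
mul1, add1 = Station("Mul1"), Station("Add1")
stations = [mul1, add1]
issue(mul1, "fmul", "f0", "f2", "f4", regs, reg_status)
issue(add1, "fadd", "f8", "f0", "f6", regs, reg_status)   # f0 is renamed to Mul1
waiting_before = not ready(add1)
write_result(mul1, stations, regs, reg_status)
write_result(add1, stations, regs, reg_status)
```

The dependent `fadd` holds the tag `Mul1` rather than a register name, which is the renaming the Issue step describes; it becomes ready only once `Mul1` broadcasts.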

Parts of Tomasulo
  - Instruction Status: in which of the three steps each instruction is
  - Reservation Station Status: is the reservation station available?
      - Busy: reservation station is busy
      - Op: operation to be performed
      - Address: effective address (if load/store)
      - Vj, Vk: source values (not registers!)
      - Qj, Qk: reservation stations producing Vj, Vk for this instruction
  - Register Status: which reservation station will update each register
      - Qi: reservation station producing the most updated value for the register

Tomasulo: Example (Clock = 7)

  Instr. Status
    Instruction         I   E   W
    fld   f6,34($2)     1
    fld   f2,45($3)     2
    fmul  f0,f2,f4      3
    fsub  f8,f6,f2      4   7
    fdiv  f10,f0,f6     5
    fadd  f6,f8,f2      6

  Reservation Stations
                Busy  Op    Vj    Vk    Qj    Qk
    0    Add1   Yes   fsub  M1    M2
         Add2   Yes   fadd        M2    Add1
         Add3
    8    Mul1   Yes   fmul  M2    f4
         Mul2   Yes   fdiv        M1    Mul1

                Busy  Address
         Load1
         Load2
         Load3

  Register Result Status
          F0    F2    F4    F6    F8    F10   F12  ...  F30
    Qi    Mul1  M2          Add2  Add1  Mul2
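The example state above can be written down directly as data, which makes the role of the V and Q fields concrete. The sketch below is illustrative only: the dictionary layout and the `broadcast` helper are hypothetical, values are kept symbolic as strings, and `"M1-M2"` is simply our name for the result of `fsub` on M1 and M2. It applies a single CDB broadcast from Add1 to the Add and Mul stations.

```python
# Add and Mul stations as transcribed from the example (symbolic values).
stations = {
    "Add1": {"busy": True, "op": "fsub", "vj": "M1", "vk": "M2", "qj": None, "qk": None},
    "Add2": {"busy": True, "op": "fadd", "vj": None, "vk": "M2", "qj": "Add1", "qk": None},
    "Mul1": {"busy": True, "op": "fmul", "vj": "M2", "vk": "f4", "qj": None, "qk": None},
    "Mul2": {"busy": True, "op": "fdiv", "vj": None, "vk": "M1", "qj": "Mul1", "qk": None},
}
reg_status = {"F0": "Mul1", "F6": "Add2", "F8": "Add1", "F10": "Mul2"}  # Qi per register
reg_values = {"F2": "M2"}                                               # already written

def broadcast(tag, value):
    """Put (tag, value) on the CDB; waiting stations and registers capture it."""
    for st in stations.values():
        if st["qj"] == tag:
            st["vj"], st["qj"] = value, None
        if st["qk"] == tag:
            st["vk"], st["qk"] = value, None
    for reg in [r for r, q in reg_status.items() if q == tag]:
        reg_values[reg] = value          # register was still waiting on this tag
        del reg_status[reg]
    # The producing station is released once its result has been written.
    stations[tag] = {"busy": False, "op": None, "vj": None, "vk": None,
                     "qj": None, "qk": None}

broadcast("Add1", "M1-M2")   # fsub f8,f6,f2 writes its result
```

After the broadcast, Add2 holds both source values and no longer waits on Add1, F8 holds the `fsub` result, and Add1 is free, while Mul2 keeps waiting on Mul1.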

Tomasulo: Example (Clock = 8)

  Instr. Status
    Instruction         I   E   W
    fld   f6,34($2)     1
    fld   f2,45($3)     2
    fmul  f0,f2,f4      3
    fsub  f8,f6,f2      4   7
    fdiv  f10,f0,f6     5
    fadd  f6,f8,f2      6

  Reservation Stations
                Busy  Op    Vj     Vk    Qj    Qk
         Add1
    2    Add2   Yes   fadd  M1-M2  M2
         Add3
    7    Mul1   Yes   fmul  M2     f4
         Mul2   Yes   fdiv         M1    Mul1

                Busy  Address
         Load1
         Load2
         Load3

  Register Result Status
          F0    F2    F4    F6    F8     F10   F12  ...  F30
    Qi    Mul1  M2          Add2  M1-M2  Mul2

Exception Handling (In-Order Five-Stage Pipeline)
  [Figure: five-stage pipeline (Inst. Mem, Decode, Execute, Data Mem, Writeback) with the commit point at the M stage. Exception flags Exc D (illegal opcode), Exc E (overflow), and Exc M (data address), together with asynchronous interrupts, select the handler address and cause, and kill the F, D, and E stages and the writeback.]
  - Hold exception flags in pipeline until commit point (M stage)
  - Exceptions in earlier pipe stages override later exceptions
  - Inject external interrupts at commit point (override others)
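The priority rules in the three bullets above amount to a small selection function evaluated at the commit point. The following is a sketch under assumptions: the function name `cause_at_commit`, the per-instruction `flags` dictionary, and the cause strings are hypothetical, and only the D, E, and M exception sources named on the slide are modeled.

```python
STAGE_ORDER = ["D", "E", "M"]   # earlier pipeline stages come first

def cause_at_commit(flags, interrupt=None):
    """Pick the cause to report when an instruction reaches the commit point.

    flags maps a stage name to the exception raised there (or is missing/None);
    the flags travel with the instruction until the M stage.
    """
    if interrupt is not None:
        return interrupt                 # external interrupt overrides others
    for stage in STAGE_ORDER:
        if flags.get(stage) is not None:
            return flags[stage]          # earliest-stage exception wins
    return None                          # no exception: instruction commits

# An instruction that raised exceptions in both D and M reports the D one.
first = cause_at_commit({"D": "illegal opcode", "M": "data addr"})
# An interrupt injected at the commit point takes priority.
second = cause_at_commit({"E": "overflow"}, interrupt="external interrupt")
```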

In-Order Commit
  [Figure: In-Order Commit]

Reorder Buffer (ROB)
  - Holds instruction results until commit
  - (1) (2) Strictly in order (allocated at issue)
  - Similar to Tomasulo's reservation stations, but:

ROB Entry Fields
  - Instruction type
  - Destination
      - branch: no destination
      - store: destination address
      - other: destination register
  - Value: instruction's result (outcome for branches)
  - Ready bit: whether instruction has executed (and result is ready)

ROB-Extended Tomasulo's
  - Reservation stations (RS):
      - buffer op and operands once issued until executed
      - track assigned ROB entry
  - ROB
      - provide renaming mechanism (tag results with ROB entry instead of RS)
      - hold instruction's relevant state from issue to commit, including result if any
  - Register Status
      - track renaming: the ROB entry
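The entry fields listed above map naturally onto a small record, and in-order allocation onto a circular buffer. The sketch below is one possible encoding, not the lecture's: the class names `ROBEntry` and `ReorderBuffer`, the `allocate` and `retire` methods, and the head/tail bookkeeping are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Union

@dataclass
class ROBEntry:
    kind: str                            # "branch", "store", or "other"
    dest: Union[str, int, None] = None   # none, memory address, or register name
    value: Optional[float] = None        # result (outcome for branches)
    ready: bool = False                  # set once the instruction has executed

class ReorderBuffer:
    def __init__(self, size: int):
        self.entries: List[Optional[ROBEntry]] = [None] * size
        self.head = 0        # oldest entry: next to commit
        self.tail = 0        # next free slot: filled at issue
        self.count = 0

    def allocate(self, kind: str, dest=None) -> Optional[int]:
        """Claim the next slot in program order; None means issue must stall."""
        if self.count == len(self.entries):
            return None
        if kind == "branch":
            dest = None                  # branches have no destination
        idx = self.tail
        self.entries[idx] = ROBEntry(kind, dest)
        self.tail = (self.tail + 1) % len(self.entries)
        self.count += 1
        return idx

    def retire(self) -> Optional[ROBEntry]:
        """Release the head entry, but only if it is ready."""
        if self.count == 0:
            return None
        entry = self.entries[self.head]
        if not entry.ready:
            return None
        self.entries[self.head] = None
        self.head = (self.head + 1) % len(self.entries)
        self.count -= 1
        return entry

rob = ReorderBuffer(2)
a = rob.allocate("other", "f0")
b = rob.allocate("store", 0x100)
reg_status = {"f0": a}               # register status tracks the ROB entry
full = rob.allocate("branch")        # buffer is full, so this returns None
rob.entries[b].ready = True          # younger entry finishes first
blocked = rob.retire()               # head is not ready yet
rob.entries[a].ready = True
retired = rob.retire()
```

Note that the register status table now stores a ROB index rather than a reservation-station name, which is the change the "ROB-Extended" slide describes.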

Instruction Execution
  - Issue:
      - get instruction from instruction queue
      - issue instruction to RS if RS and ROB entry available; stall otherwise
      - locate operands in register file or ROB entry
      - communicate ROB entry pointer to RS; update register table (rename)
  - Execute:
      - read available operands from register file or ROB
      - monitor CDB (ROB entry match) for unavailable operands
      - execute when all operands available
          - some sources call this step "issue"; we do not follow this convention
          - recall: it can take several cycles

Instruction Execution
  - Writeback:
      - dump result on CDB, together with ROB entry pointer
      - waiting RS pick up result
      - ROB entry's Value field updated, Ready field set
      - does not:
          - update register file
          - perform stores to memory (hold value in ROB entry)
  - Commit (a.k.a. graduate):
      - when instruction reaches head of ROB and Ready bit is set
      - if conventional, update register file (even if renamed), recycle ROB entry
      - if a store, perform the store to memory
      - if a mispredicted branch, flush ROB and pipeline, fetch from the right path
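The issue, writeback, and commit steps above can be exercised with a toy model. This is a sketch under several assumptions rather than a faithful pipeline: the class `ROBMachine` and its methods are hypothetical, reservation stations and execution latency are omitted (results are handed straight to `writeback`), and for a branch the Value field is assumed to hold the correct target, with a separate `mispredicted` flag.

```python
from collections import deque

class ROBMachine:
    def __init__(self, regs, rob_size=8):
        self.regs = dict(regs)     # architectural register file
        self.mem = {}              # architectural memory
        self.rename = {}           # register -> tag of newest in-flight producer
        self.rob = deque()         # rob[0] is the head (oldest instruction)
        self.rob_size = rob_size
        self.next_tag = 0
        self.fetch_pc = None       # set when a misprediction redirects fetch

    def issue(self, kind, dest=None):
        """Allocate a ROB entry in program order; None means stall."""
        if len(self.rob) == self.rob_size:
            return None
        tag = self.next_tag
        self.next_tag += 1
        self.rob.append({"tag": tag, "kind": kind, "dest": dest,
                         "value": None, "ready": False, "mispredicted": False})
        if kind == "reg":
            self.rename[dest] = tag            # update register table (rename)
        return tag

    def read(self, reg):
        """Locate an operand: register file, ready ROB value, or tag to wait on."""
        tag = self.rename.get(reg)
        if tag is None:
            return ("value", self.regs[reg])
        entry = next(e for e in self.rob if e["tag"] == tag)
        return ("value", entry["value"]) if entry["ready"] else ("wait", tag)

    def writeback(self, tag, value, mispredicted=False):
        """CDB broadcast: fill the ROB entry only; registers and memory untouched."""
        for e in self.rob:
            if e["tag"] == tag:
                e["value"], e["ready"], e["mispredicted"] = value, True, mispredicted

    def commit(self):
        """Retire the head entry if its Ready bit is set; otherwise do nothing."""
        if not self.rob or not self.rob[0]["ready"]:
            return None
        e = self.rob.popleft()
        if e["kind"] == "reg":
            self.regs[e["dest"]] = e["value"]
            if self.rename.get(e["dest"]) == e["tag"]:
                del self.rename[e["dest"]]     # no newer producer in flight
        elif e["kind"] == "store":
            self.mem[e["dest"]] = e["value"]
        elif e["kind"] == "branch" and e["mispredicted"]:
            self.rob.clear()                   # flush everything younger
            self.rename.clear()
            self.fetch_pc = e["value"]         # fetch from the right path
        return e

m = ROBMachine({"f0": 1.0, "f2": 2.0})
t0 = m.issue("reg", "f0")
t1 = m.issue("branch")
t2 = m.issue("reg", "f2")          # speculative: after the branch
t3 = m.issue("store", 0x100)       # speculative store
m.writeback(t2, 99.0)              # completes out of order
spec_view = m.read("f2")           # consumers may already use the result
early = m.commit()                 # head (t0) is not ready, nothing commits
m.writeback(t0, 5.0)
m.writeback(t1, 0x40, mispredicted=True)
m.commit()                         # f0 becomes architectural
m.commit()                         # branch commits and flushes t2, t3
```

The speculative value 99.0 is visible to dependent instructions through `read`, but it never reaches the register file because the branch ahead of it turns out to be mispredicted.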

Exception Handling
  - Unlike Scoreboard or Tomasulo's, ROB supports precise exceptions
  - Key: defer handling to commit stage
      - all previous instructions committed: architectural state up to date
      - this and subsequent instructions not committed: architectural state not compromised
      - false exceptions from mispredicted path never emerge
  - If excepting instruction makes it to commit stage:
      - flush all uncommitted instructions
      - save precise architectural state
      - handle exception
      - restore architectural state and initiate fetch

Instruction Window (IW)
  - Modern alternative to RS + ROB (combines the two)
  - Sometimes just called ROB
  - Entry fields: Inst#, use, exec, op, p1, src1, p2, src2, pd, dest, data, cause
  - ptr2: next to commit; ptr1: next available
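Deferring exceptions to commit can be illustrated with a short commit loop. The code below is a sketch, assuming a list-based ROB whose entries carry hypothetical `pc` and `exception` fields; the `handle` callback stands in for the exception handler, and returning the excepting instruction's PC as the restart point is an assumption (where fetch resumes depends on the exception type).

```python
def commit_until_exception(rob, regs, handle):
    """Commit ready entries in order; stop at the first excepting instruction."""
    while rob and rob[0]["ready"]:
        entry = rob.pop(0)
        if entry["exception"] is not None:
            rob.clear()                     # flush all uncommitted instructions
            saved = dict(regs)              # save precise architectural state
            handle(entry["pc"], entry["exception"], saved)
            regs.clear()
            regs.update(saved)              # restore architectural state
            return entry["pc"]              # assumed restart point for fetch
        regs[entry["dest"]] = entry["value"]
    return None

regs = {"f0": 0.0, "f2": 0.0}
seen = []
rob = [
    {"pc": 0, "dest": "f0", "value": 1.5, "ready": True, "exception": None},
    {"pc": 4, "dest": "f4", "value": None, "ready": True, "exception": "overflow"},
    # Younger instruction that already finished out of order:
    {"pc": 8, "dest": "f2", "value": 7.0, "ready": True, "exception": None},
]
resume_pc = commit_until_exception(
    rob, regs, lambda pc, cause, state: seen.append((pc, cause, state)))
```

The handler sees a state in which the instruction at PC 0 has committed and the one at PC 8 has not, even though the latter finished executing first, which is what makes the exception precise.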
