Dynamic Scheduling. CSE471 Susan Eggers 1


1 Dynamic Scheduling
Why go out of style?
- expensive hardware for the time (actually, still is, relatively)
- register files grew so less register pressure
- early RISCs had lower CPIs
Why come back?
- higher chip densities
- greater need to hide latencies as:
  - discrepancy between CPU & memory speeds increases
  - branch misprediction penalty increases
- was generalized to cover more than floating point operations
  - handles branches & hides branch latencies
  - hides cache misses
  - commits instructions in-order to preserve precise interrupts
  - uses a more general register renaming mechanism
    - 2 styles: large physical register file & reorder buffer
- processors now issue multiple instructions at the same time
  - more need to exploit ILP
  - 2 styles: superscalars & VLIW processors

2 Register Renaming with a Physical Register File (R10000-style)
Register renaming provides a mapping between 2 register sets
- architectural registers defined by the ISA
- physical registers implemented in the CPU
  - more of them than architectural registers
  - hold the results of the instructions committed so far (in program order)
  - hold the results of subsequent, independent instructions that have not yet committed
    - ~ issue width * # pipeline stages between register renaming & commit
- an architectural register is associated with a physical register during a register renaming stage, usually just after decode
- operands are thereafter called by their physical register number
- hazards are determined by comparing physical register numbers
Effects:
- eliminates WAW and WAR hazards
- increases ILP

3 A Register Renaming Example

Code Segment        Register Mapping    Comments
ld  r7,(r6)         r7 -> p1            p1 is allocated
...
add r8, r9, r7      r8 -> p2            use p1, not r7
...
sub r7, r2, r       r7 -> p             p is allocated; p1 is deallocated when sub commits
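The renaming steps in this example can be sketched in a few lines of Python. This is a minimal illustration, not the processor's actual logic: the `Renamer` class, its method names, and the choice to leave never-renamed source registers under their architectural names are all assumptions of the sketch, and `r4` stands in for the second source of the `sub`.

```python
from collections import deque


class Renamer:
    """Toy register renamer: a map table plus a free list (illustrative names)."""

    def __init__(self, num_physical):
        # Free physical registers, handed out in order p1, p2, ...
        self.free_list = deque(f"p{i}" for i in range(1, num_physical + 1))
        # Current architectural -> physical mapping.
        self.map_table = {}

    def rename(self, dest, srcs):
        # Sources are looked up BEFORE the destination is remapped, so an
        # instruction that reads and writes the same register sees the old value.
        # Assumption: a source that was never renamed keeps its architectural name.
        phys_srcs = [self.map_table.get(s, s) for s in srcs]
        previous = self.map_table.get(dest)   # freed when this instruction commits
        new = self.free_list.popleft()        # fresh register for the destination
        self.map_table[dest] = new
        return new, phys_srcs, previous


renamer = Renamer(num_physical=8)
ld = renamer.rename("r7", ["r6"])          # ld  r7,(r6)
add = renamer.rename("r8", ["r9", "r7"])   # add r8, r9, r7
sub = renamer.rename("r7", ["r2", "r4"])   # sub r7, r2, <placeholder source>
for name, result in (("ld", ld), ("add", add), ("sub", sub)):
    print(name, result)
```

Because the `sub` writes a fresh physical register, it no longer conflicts with the `add` that still reads `p1` (WAR) or with the earlier `ld` that wrote `r7` (WAW).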

4 The Implementation (R10000)
Modular design with regular hardware data structures
- 64 physical registers (each, for integer & FP)
- map tables for the current architectural-to-physical register mapping (separate, for integer & FP)
  - accessed with an architectural register number
  - produces a physical register number
  - a destination register is assigned a new physical register number from a free register list (separate, for integer & FP)
  - source operands refer to the latest defined destination register, i.e., the current mappings
- instruction queues (integer, FP & data transfer)
  - contain decoded & mapped instructions with the current physical register mappings
    - instructions are entered into free locations in the IQ
    - they sit there until they are dispatched to functional units
    - somewhat analogous to Tomasulo reservation stations without value fields or valid bits
  - determine when operands are available
    - compare each source operand for instructions in the IQ to destinations being written this cycle
  - determine when an appropriate functional unit is available
  - dispatch instructions to functional units
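The wakeup-and-dispatch behavior of an instruction queue can be sketched as follows. This is a hedged illustration only: `InstructionQueue`, `wakeup`, `select`, and the per-entry `waiting` set are names and bookkeeping invented for the sketch, not a description of the real hardware, and functional units are reduced to a simple count.

```python
class InstructionQueue:
    """Toy instruction queue holding renamed instructions until dispatch."""

    def __init__(self):
        self.entries = []  # kept oldest first

    def insert(self, op, dest, srcs, busy):
        # Assumption of the sketch: each entry remembers which of its source
        # physical registers are still being produced (those marked busy).
        waiting = {s for s in srcs if s in busy}
        self.entries.append({"op": op, "dest": dest, "waiting": waiting})

    def wakeup(self, written_this_cycle):
        # Compare every waiting source to the destinations written this cycle.
        for entry in self.entries:
            entry["waiting"] -= set(written_this_cycle)

    def select(self, free_units):
        # Dispatch ready instructions, oldest first, while units remain.
        dispatched = []
        for entry in list(self.entries):
            if len(dispatched) == free_units:
                break
            if not entry["waiting"]:
                self.entries.remove(entry)
                dispatched.append(entry["op"])
        return dispatched


busy = {"p1"}                              # p1 is still being loaded
iq = InstructionQueue()
iq.insert("add", "p2", ["r9", "p1"], busy)
iq.insert("sub", "p3", ["r2", "r4"], busy)
first = iq.select(free_units=1)            # sub goes ahead of the older add
iq.wakeup(["p1"])                          # the load writes p1
second = iq.select(free_units=1)           # now the add can dispatch
print(first, second)
```

The younger `sub` dispatches before the older `add`, which is the out-of-order behavior the queue exists to allow.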

5 The Implementation (R10000)
- one active list for all uncommitted instructions
  - the extra hardware needed to preserve precise interrupts
  - instructions are entered in program-generated order
  - allows instructions to complete in program-generated order
  - the mechanism for maintaining precise interrupts
- instructions are removed from the active list when:
  - an instruction commits:
    - the instruction has completed execution
    - all instructions ahead of it have completed
  - a branch is mispredicted
  - an exception occurs
- contains the previous architectural-to-physical destination register mapping
  - used to recreate the map table for instruction restart after an exception
- instructions in the other hardware structures & the functional units are identified by their active list location
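In-order commit and rollback from the saved previous mappings can be sketched like this. It is a simplified model under stated assumptions: the `ActiveList` class, its dictionary-based entries, and the `rollback(start)` interface are hypothetical, and a destination with no earlier mapping is recorded as `None`.

```python
from collections import deque


class ActiveList:
    """Toy active list: uncommitted instructions in program order."""

    def __init__(self, map_table, free_list):
        self.entries = deque()      # oldest instruction at the left
        self.map_table = map_table  # architectural -> physical
        self.free_list = free_list  # physical registers available for renaming

    def allocate(self, arch_dest, new_phys):
        # Save the previous mapping so it can be restored or freed later.
        entry = {"arch": arch_dest, "new": new_phys,
                 "old": self.map_table.get(arch_dest), "done": False}
        self.map_table[arch_dest] = new_phys
        self.entries.append(entry)
        return entry

    def commit(self):
        # Commit from the head only while the oldest instruction has completed.
        committed = []
        while self.entries and self.entries[0]["done"]:
            entry = self.entries.popleft()
            if entry["old"] is not None:
                self.free_list.append(entry["old"])  # old value is now dead
            committed.append(entry["new"])
        return committed

    def rollback(self, start=0):
        # Squash entries from position `start` onward, youngest first,
        # restoring each previous mapping and freeing the new register.
        while len(self.entries) > start:
            entry = self.entries.pop()
            if entry["old"] is None:
                del self.map_table[entry["arch"]]
            else:
                self.map_table[entry["arch"]] = entry["old"]
            self.free_list.append(entry["new"])


map_table, free_list = {}, deque()
active = ActiveList(map_table, free_list)
ld = active.allocate("r7", "p1")
add = active.allocate("r8", "p2")
sub = active.allocate("r7", "p3")

sub["done"] = True
blocked = active.commit()   # nothing commits: the older ld is not done
ld["done"] = True
head = active.commit()      # only the ld commits; the add still blocks the sub
active.rollback()           # exception: squash the add and the sub
print(blocked, head, map_table, list(free_list))
```

After the rollback the map table again points `r7` at `p1`, the value produced by the last committed instruction, which is what makes the interrupt precise.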

6 The Implementation (R10000)
- busy-register table (integer & FP):
  - indicates whether a physical register contains a value
  - used to determine operand availability
  - bit is set when a register is mapped & leaves the free list
  - bit is cleared when a FU writes the register
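A busy-register table amounts to one bit per physical register. The sketch below is illustrative only; the `BusyTable` class and its method names are assumptions, and physical registers are represented by small integers.

```python
class BusyTable:
    """Toy busy-register table: one bit per physical register."""

    def __init__(self, num_physical):
        self.busy = [False] * num_physical

    def allocate(self, preg):
        # Set when the register is mapped and leaves the free list.
        self.busy[preg] = True

    def write(self, preg):
        # Cleared when a functional unit writes the register.
        self.busy[preg] = False

    def operands_ready(self, srcs):
        # An instruction may dispatch only if none of its sources is busy.
        return not any(self.busy[s] for s in srcs)


table = BusyTable(num_physical=64)
table.allocate(20)                      # a load is renamed to register 20
before = table.operands_ready([20, 5])  # a consumer must wait
table.write(20)                         # the load result arrives
after = table.operands_ready([20, 5])   # the consumer may now dispatch
print(before, after)
```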

7 The R10000 in Action 1
    ld  A, #(reg)     arch register A defined
    add A4, A, reg
    sub A, reg, reg
    or  A, A, reg
[Figure: state of the renaming hardware at this step. Surviving labels: 4 2 Py Pz; ld unk P2; Px A not done]

8 The R10000 in Action 2
    ld  A, #(reg)     arch register A defined
    add A4, A, reg    arch register A used
    sub A, reg, reg
    or  A, A, reg
[Figure: state of the renaming hardware at this step. Surviving labels: Pz; ld unk P2; Px A not done; Py A4 not done; add P2 1 P21 1]

9 The R10000 in Action 3
    ld  A, #(reg)     arch register A defined
    add A4, A, reg    arch register A used
    sub A, reg, reg   arch register A redefined (name dependence)
    or  A, A, reg
[Figure: state of the renaming hardware at this step. Surviving labels: Pz; sub unk P22 2; ld unk P2; add P2 1 P21 1; Px A not done; Py A4 not done; P2 A done]

10 The R10000 in Action 4
    ld  A, #(reg)     arch register A defined
    add A4, A, reg    arch register A used
    sub A, reg, reg   arch register A redefined (name dependence)
    or  A, A, reg     arch register A used
[Figure: state of the renaming hardware at this step. Surviving labels: sub unk P22 2; ld unk P2; or P22 P2; add P2 1 P21 1; Px A not done; Py A4 not done; P2 A done; Pz A done]

11 The R10000 in Action: Interrupts 1
    ld  A, #(reg)     arch register A defined
    add A4, A, reg    arch register A used
    sub A, reg, reg   arch register A redefined (name dependence)
    or  A, A, reg     arch register A used
[Figure: state of the renaming hardware at this step. Surviving labels: Pz; sub unk P22 2; ld unk P2; or P22 P2; add P2 1 P21 1; Px A not done; Py A4 not done; P2 A done]

12 The R10000 in Action: Interrupts 2
    ld  A, #(reg)     arch register A defined
    add A4, A, reg    arch register A used
    sub A, reg, reg   arch register A redefined (name dependence)
    or  A, A, reg     arch register A used
[Figure: state of the renaming hardware at this step. Surviving labels: Pz; sub unk P22 2; ld unk P2; or P22 P2; add P2 1 P21 1; Px A not done; Py A4 not done]

13 The R10000 in Action: Interrupts 3
    ld  A, #(reg)     arch register A defined
    add A4, A, reg    arch register A used
    sub A, reg, reg   arch register A redefined (name dependence)
    or  A, A, reg     arch register A used
[Figure: state of the renaming hardware at this step. Surviving labels: 4 2 Py Pz; sub unk P22 2; ld unk P2; or P22 P2; add P2 1 P21 1; Px A not done]

14 The R10000 in Action: Interrupts 4
    ld  A, #(reg)     arch register A defined
    add A4, A, reg    arch register A used
    sub A, reg, reg   arch register A redefined (name dependence)
    or  A, A, reg     arch register A used
[Figure: state of the renaming hardware at this step. Surviving labels: 4 Px Py Pz]

15 R10000 Execution
In-order issue (have already fetched instructions)
- rename architectural registers to physical registers via a map table
- detect structural hazards for instruction queues (integer, memory & FP) & active list
- issue up to 4 instructions to the instruction queues
Out-of-order execution (to increase ILP)
- reservation-station-like instruction queues that indicate when an operand has been calculated
  - each instruction monitors the setting of the busy-register table
- detect functional unit structural & RAW hazards
- set busy-register table entry for the destination register
- dispatch instructions to functional units
In-order completion (to preserve precise interrupts)
- this & previous program-generated instructions have completed
- physical register in previous mapping returned to the free list
- rollback on interrupts
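The three phases above can be tied together in a small cycle-by-cycle model. This is a deliberately simplified sketch, not a model of the real pipeline: the `simulate` function, the latencies, and the register names are invented for illustration, functional units are assumed unlimited, and source registers that were never renamed are treated as always available.

```python
def simulate(program):
    """program: list of (name, dest, srcs, latency) in program order."""
    map_table, busy, window = {}, set(), []

    # In-order issue: rename each instruction and mark its destination busy.
    for number, (name, dest, srcs, latency) in enumerate(program, start=1):
        phys_srcs = [map_table[s] for s in srcs if s in map_table]
        phys_dest = f"p{number}"
        map_table[dest] = phys_dest
        busy.add(phys_dest)
        window.append({"name": name, "dest": phys_dest, "srcs": phys_srcs,
                       "latency": latency, "finish": None, "done": False})

    started, committed, head, cycle = [], [], 0, 0
    while head < len(window):
        cycle += 1
        # Results written this cycle clear their busy bits.
        for ins in window:
            if ins["finish"] == cycle:
                ins["done"] = True
                busy.discard(ins["dest"])
        # Out-of-order execution: dispatch anything whose sources are not busy.
        for ins in window:
            if ins["finish"] is None and not any(s in busy for s in ins["srcs"]):
                ins["finish"] = cycle + ins["latency"]
                started.append(ins["name"])
        # In-order completion: commit only from the head of the window.
        while head < len(window) and window[head]["done"]:
            committed.append(window[head]["name"])
            head += 1
    return started, committed


program = [
    ("ld",  "r7", ["r6"],       3),   # long-latency load
    ("add", "r8", ["r9", "r7"], 1),   # depends on the load
    ("sub", "r7", ["r2", "r4"], 1),   # independent; redefines r7
    ("or",  "r5", ["r7", "r1"], 1),   # depends on the sub
]
started, committed = simulate(program)
print("execution order:", started)
print("commit order:   ", committed)
```

In this run the `sub` and `or` start while the `add` is still waiting for the load, yet the commit order matches program order.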

Reorder Buffer Implementation (Pentium Pro)
Hardware data structures
- retirement register file (RRF) (~ IBM 360/91 physical registers)
  - physical register file that is the same size as the architectural registers