CISC 662 Graduate Computer Architecture Lecture 7 - Multi-cycles Michela Taufer Interrupts and Exceptions http://www.cis.udel.edu/~taufer/teaching/cis662f07 Powerpoint Lecture Notes from John Hennessy and David Patterson s: Computer Architecture, 4th edition ---- Additional teaching material from: Jelena Mirkovic (U Del) and John Kubiatowicz (UC Berkeley) Device Interrupt (Say, arrival of network message) Types of Exceptions Network Interrupt add r,r2,r3 subi r4,r,#4 slli r4,r4,#2 Hiccup(!) lw r2,0(r4) lw r3,4(r4) add r2,r2,r3 sw 8(r4),r2 PC saved Disable All Ints Supervisor Mode Restore PC User Mode Raise priority Reenable All Ints Save registers lw r,20(r0) lw r2,0(r) addi r3,r0,#5 sw 0(r),r3 Restore registers Clear current Int Disable All Ints Restore priority RTE Could be interrupted by disk Exceptional situations: the normal execution of instructions is changes Exceptional situations: Interrupt Fault Exceptions Name of common exceptions vary across different architectures Note that priority must be raised to avoid recursive interrupts!
Exceptions or Interrupts Examples of Exceptions IBM 360 Every event is called and interrupt Motorola 680 Every event is called exception VAX Divides events into interrupts and exceptions We use the term exceptions to cover all the events I/O device request Invoking an operating system service from a user program Tracing instruction execution Programmer-request interrupt Integer arithmetic overflow and underflow FP arithmetic anomaly Page fault (not in main memory) Misaligned memory accesses (if alignment is required) Memory-protection violation Hardware malfunction Power failure What Action for an Event? Events have characteristics that determine what action is needed in the hardware Synchronous vs. asynchronous User required vs. coerced User maskable vs user unmaskable Within vs. between instructions Resume vs. terminate Synchronous vs. Asynchronous Synchronous: means related to the instruction stream, i.e. during the execution of an instruction Must stop an instruction that is currently executing Page fault on load or store instruction Arithmetic exception Software Trap Instructions Asynchronous: means unrelated to the instruction stream, i.e. caused by an outside event. Does not have to disrupt instructions that are already executing Interrupts are asynchronous Machine checks are asynchronous 2
User Required vs. Coerced User required event: The user task directly asks for the event This even is predictable E.g., Invoke OS, trace instruction execution, breakpointing Coerced events: A hardware event, not under the control of the user, causes it This event is hard to implement because not predictable User Maskable vs User Unmaskable Usermaskable: the even can be disabled by the user task I/O device request is an example of unmaskable event. Why? Can you make an example of maskable exceptions? Interrupt Controller Hardware and Mask Levels Interrupt disable mask may be multi-bit word accessed through some special memory address Operating system constructs a hierarchy of masks that reflects some form of interrupt priority. For instance: Priority Examples 0 Software interrupts 2 Network Interrupts 4 Sound card 5 Disk Interrupt 6 Real Time clock This reflects the an order of urgency to interrupts For instance, this ordering says that disk events can interrupt the interrupt handlers for network interrupts. Within vs. Between Instructions Within instruction: the event prevents instruction completion by occurring in the middle of the execution Between instruction: the event is recognized between two instructions It is harder to implement exceptions that occur within instructions than those between instructions. Why? 3
Resume vs. Terminate Terminating event: the program s execution always stops after the interrupt Resuming event: the program s execution continues after the interrupt It is easer to implement exceptions that terminate execution. Why? Precise Interrupts/Exceptions An interrupt or exception is considered precise if there is a single instruction (or interrupt point) for which: All instructions before that have committed their state No following instructions (including the interrupting instruction) have modified any state. This means, that you can restart execution at the interrupt point and get the right answer Implicit in our previous example of a device interrupt: Interrupt point is at first lw instruction External Interrupt add subi slli r,r2,r3 r4,r,#4 r4,r4,#2 lw r2,0(r4) lw r3,4(r4) add r2,r2,r3 sw 8(r4),r2 PC saved Disable All Ints Supervisor Mode Restore PC User Mode Int handler Imprecise Interrupts/Exceptions Precise interrupt point may require multiple PCs to specify An exception is imprecise if the processor state when an exception is raised does not look exactly as if the instructions were executed sequentially in strict program order: - The pipeline may have already completed instructions that are later in program order than the instruction causing the exception (we may have div.d raising an exception but have sub.d completed already). - The pipeline may have not yet completed some instructions that are earlier in program order than the instruction causing the exception. addi r4,r3,#4 sub r,r2,r3 PC: bne r,there PC+4: and r2,r3,r5 <other insts> addi r4,r3,#4 sub r,r2,r3 PC: bne r,there PC+4: and r2,r3,r5 <other insts> Interrupt point described as <PC,PC+4> Interrupt point described as: <PC+4,there> (branch was taken) or <PC+4,PC+8> (branch was not taken) 4
Why are precise interrupts desirable? Many types of interrupts/exceptions need to be restartable. Easier to figure out what actually happened: I.e. TLB faults. Need to fix translation, then restart load/store IEEE gradual underflow, illegal operation, etc: sin( x) e.g. Suppose you are computing: f ( x) = Then, for x 0, x 0 f ( 0) = NaN + illegal _ operation 0 Want to take exception, replace NaN with, then restart. Restartability doesn t require preciseness. However, preciseness makes it a lot easier to restart. Simplify the task of the operating system a lot Less state needs to be saved away if unloading process. Quick to restart (making for fast interrupts) Approximations to precise interrupts Hardware has imprecise state at time of interrupt Exception handler must figure out how to find a precise PC at which to restart program. Emulate instructions that may remain in pipeline Example: SPARC allows limited parallelism between FP and integer core: Possible that integer instructions # - #4 have already executed at time that the first floating instruction gets a recoverable exception Interrupt handler code must fixup <float >, then emulate both <float > and <float 2> At that point, precise interrupt point is integer instruction #5. <float > <int > <int 2> <int 3> <float 2> <int 4> <int 5> Vax had string move instructions that could be in middle at time that page-fault occurred. Could be arbitrary processor state that needs to be restored to restart execution. Problems with Pipelining An unusual event happens to an instruction during its execution Examples: divide by zero, undefined opcode Hardware signal to switch the processor to a new instruction stream Example: a sound card interrupts when it needs more audio output samples (an audio click happens if it is left waiting) Problem: It must appear that the exception or interrupt must appear between 2 instructions (I i and I i+ ) The effect of all instructions up to and including I i is totaling complete No effect of any instruction after I i can take place The interrupt (exception) handler either aborts program or restarts at instruction I i+ Precise Exceptions in Simple 5-stage Pipeline Exceptions may occur at different stages in pipeline (i.e. out of order): IF: page fault on instruction fetch IF: misalignment memory access IF: memory protection violation ID: undefined or illegal opcode E: arithmetic exception MEM: page fault on data fetch MEM: misaligned memory access MEM: memory protection violation WB: None How do we stop the pipeline? How do we restart it? Do we interrupt immediately or wait? How do we sort all of this out to maintain preciseness? 5
Precise Exceptions in Simple 5-stage Pipeline All of this solved by tagging instructions in pipeline as cause exception or not and wait until end of memory stage to flag exception Interrupts become marked NOPs (like bubbles) that are placed into pipeline instead of an instruction. Assume that interrupt condition persists in case NOP flushed Clever instruction fetch might start fetching instructions from interrupt vector, but this is complicated by need for supervisor mode switch, saving of one or more PCs, etc Another Look at the Exception Problem Data TLB Bad Inst Inst TLB fault Overflow Program Flow IFetch Dcd Exec Mem WB IFetch Dcd Exec Mem WB Use pipeline to sort this out! Pass exception status along with instruction. Keep track of PCs for every instruction in pipeline. Don t act on exception until it reaches WB stage Handle interrupts through faulting noop in IF stage When instruction reaches WB stage: Save PC EPC, Interrupt vector addr PC Turn all instructions in earlier stages into noops! Time IFetch Dcd Exec Mem WB IFetch Dcd Exec Mem WB Precise Exceptions in Static Pipelines Multicycles Key observation: architected state only change in memory and register write stages. 6
Extending MIPS for FP Pipelining FP operations are long and cannot be completed in 5 cycles ( lasts more than cycle) Simply imagine that stage is duplicated for FP There are multiple FP functional units Extending MIPS for FP Pipelining Integer IF ID FP/Int Mult MEM WB 2 3 4 5 6 7 8 9 Instruction i FP Instruction i+ IF ID MEM WB Instruction i+2 FP Adder FP/Int Div Extending MIPS for FP Pipelining Extending MIPS for FP Pipelining It would be beneficial to pipeline stage for FP We define: Latency number of cycles between the instruction that produces result and instruction that uses the result Initiation interval number of cycles that must elapse before issuing two operations of a given type IF ID M M2 M3 M4 M5 M6 M7 A A2 A3 A4 MEM WB Functional unit Latency Initiation interval Integer ALU Data memory FP add FP/Int multiply FP/Int divide 0 3 6 24 25 DIV 7
Hazards In FP MIPS Pipeline J Structural Hazards Possible Use shift registers for conflict identification J Number of register writes in a cycle could be > Serialize writes or treat as structural hazards J WAW hazards possible, because instructions no longer reach WB in order J Instructions can complete in a different order than issued Problems with exceptions! J Stalls for RAW hazards more frequent Hazards In FP MIPS Pipeline J Because DIV unit is not pipelined structural hazards can occur J Because instructions have varying running times number of register writes in a cycle can be > J Instructions don t reach WB in order, so WAW hazards are possible J Instructions can raise exceptions out of order J Stalls for RAW hazards will be longer due to long latency RAW Hazards In FP MIPS Pipeline WAW Hazards In FP MIPS Pipeline MUL.D 2 3 4 5 6 7 8 9 0 IF ID M M2 M3 ADD.D IF ID A A2 A3 A4 MEM WB M4 M5 M6 M7 MEM WB L.D F4,0(R2) MUL.D F0, F4, F6 ADD.D F2, F0, F8 L.D F2, 0(R2) 2 3 4 5 6 7 8 9 0 2 3 4 5 6 7 8 IF ID stall M M2 M3 M4 M5 M6 M7 MEM WB IF stall ID stall stall stall stall stall stall A A2 A3 A4 MEM WB IF stall stall stall stall stall stall ID stall stall stall MEM WB Although it seems useless sequence of instructions (L.D overwrites F2 immediately after ADD.D writes it) we must detect WAW hazard and make sure the later value appears in register One approach (shown) is to delay writing stage of the later instruction Another approach is to stamp the result of the earlier instruction and do not write it into memory or register 8
MUL.D ADD.D L.D Write Port Structural Hazards In FP MIPS Pipeline 2 3 4 5 6 7 8 9 0 IF ID M M2 M3 M4 M5 M6 M7 MEM WB IF ID A A2 A3 A4 MEM WB J Detect write port hazard in ID stage, use shift register that indicates when already issued instructions will use write port, shift reservation register one bit at each clock cycle. We could then insert stalls right after ID stage. J Alternative solution would insert stall before MEM or WB Data Hazards In FP MIPS Pipeline J Since FP and integer operations use different registers we need only consider moves and loads/stores as potential source of hazards between FP and integer instructions J Pipeline checks in ID for: B Structural hazards DIV unit and write port B RAW hazards wait until source registers are not listed as pending destinations B WAW hazards determine if any instruction in stage has the same destination register, if so stall the current instruction Maintaining Precise Exceptions In FP MIPS Pipeline Out-of-order completion is possible DIV.D F0, F2, F4 ADD.D F0, F0, F8 SUB.D F2, F2, F4 Option : History file keeps track of original values of registers/memory; in case of out-of-order complition and exceptions, roll back original values of registers Option 2: Future file keeps track of new values, registers/memory are updated when all previous instructions have completed Option 3: Proceed only if sure that no previous instructions will cause exceptions Work in Group Given the code: DIV.D F0, F2, F4 ADD.D F0, F0, F8 SUB.D F2, F2, F4 Is there any dependency? Why is the code characterized by an out-of-order completion? Why is an out-of-order completion a problem? Make an example. How can you address the problem? 9