This course provides an overview of the SH-2 32-bit RISC CPU core used in the popular SH-2 series microcontrollers


Agnes Alexander
6 years ago

1 Course Introduction
Purpose:
- This course provides an overview of the SH-2 32-bit RISC CPU core used in the popular SH-2 series microcontrollers
Objectives:
- Learn about error detection and address errors
- Explore the pipeline in the SH-2 CPU
- Understand branching
- Obtain some helpful coding tips
Content:
- 16 pages
- 3 questions
Learning Time:
- 15 minutes

2 CPU Error Detection
The SH-2 CPU can detect and react to various error conditions:
- Address errors
- Illegal instructions (invalid op codes)
  - An undefined instruction is executed
- Slot illegal instructions
  - Certain instructions, such as one that changes the PC, cannot be located in a delay slot (i.e., the slot after a delayed branch)

3 Address Errors
- Instruction fetch from an odd address
- Instruction fetch from on-chip peripheral module space
- Instruction fetch from external memory space in single-chip mode
- Word data accessed from an odd address
- Longword data accessed from other than a longword boundary
- Longword accessed in 8-bit on-chip peripheral module space
- External memory space accessed when in single-chip mode
- Stack access to an address that is not a multiple of four (longword alignment required)
- Access using the vector base register when VBR is not a multiple of four (longword alignment required)
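The alignment rules in this list can be sketched as a small check. This is only an illustration, not how the CPU implements detection: the function name `address_error` and its parameters are hypothetical, and the memory-map rules (peripheral module space, single-chip mode) are left out because they depend on the specific device.

```python
def address_error(address, size, is_instruction_fetch=False):
    """Return a reason string if the access breaks an alignment rule, else None.

    size is the access width in bytes: 1 (byte), 2 (word) or 4 (longword).
    Hypothetical helper: only the alignment rules from the slide are modeled.
    """
    if size not in (1, 2, 4):
        raise ValueError("size must be 1, 2 or 4 bytes")
    if is_instruction_fetch:
        # Instruction fetches must come from even addresses; size is ignored here.
        return "instruction fetch from odd address" if address % 2 else None
    if size == 2 and address % 2:
        return "word data accessed from odd address"
    if size == 4 and address % 4:
        return "longword data accessed off a longword boundary"
    return None  # byte accesses and aligned accesses pass the alignment rules


if __name__ == "__main__":
    for addr, size in [(0x1001, 2), (0x1002, 4), (0x1004, 4), (0x1003, 1)]:
        print(hex(addr), size, address_error(addr, size))
```

A stack pointer or VBR value could be passed through the same check with `size=4`, since both require longword alignment.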


5 Why Use a Pipeline?
- CISC CPUs perform all the operations of an instruction sequentially and execute instructions in series, one at a time; executing an instruction can take several cycles
- RISC CPUs use a pipeline to overlap the independent parts of instructions, so that one instruction may be able to execute every cycle
Content of a typical instruction:
- IF: instruction fetch
- ID: instruction decode / register access
- ALU: ALU cycle
- MA: memory access / data cycle
- WB: register write back
Instruction length:
- CISC: variable length
- SH-2: 16 bits long
[Figure: three CISC instructions executed one after another, each passing through IF, ID, ALU, MA, WB, compared with three overlapped RISC instructions; the difference is labeled "Time saved"]
Ideal pipeline speedup: T_pipeline = T_unpipelined / (number of pipeline stages)
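As a worked instance of the ideal speedup relation on this slide, the sketch below divides the unpipelined time per instruction by the number of stages. The function name and the numbers are made up for illustration; real pipelines rarely reach this ideal.

```python
def ideal_pipelined_time(t_unpipelined, stages):
    """Ideal time per instruction: T_pipeline = T_unpipelined / number of stages."""
    if stages < 1:
        raise ValueError("a pipeline needs at least one stage")
    return t_unpipelined / stages


if __name__ == "__main__":
    # Hypothetical figures: an instruction that takes 5 cycles unpipelined,
    # run on a five-stage pipeline (IF, ID, ALU, MA, WB).
    print(ideal_pipelined_time(5.0, 5))  # 1.0 cycle per instruction
```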

6 Ideal SH-2 Pipeline Flow
- IF (Instruction Fetch): Fetches an instruction from the memory in which the program is stored
- ID (Instruction Decode): Decodes the fetched instruction
- EX (Instruction Execution): Performs data operations and address calculations according to the results of decoding
- MA (Memory Access): Accesses data in memory; generated by instructions that involve memory access (with some exceptions)
- WB (Write Back): Returns the results of a memory access (data) to a register; generated by instructions that involve memory loads (with some exceptions)
Instruction stream over time (one slot per column):
Instruction 1   IF  ID  EX  MA  WB
Instruction 2       IF  ID  EX  MA  WB
Instruction 3           IF  ID  EX  MA  WB
Instruction 4               IF  ID  EX  MA  WB
Instruction 5                   IF  ID  EX  MA  WB
Instruction 6                       IF  ID  EX  MA  WB
Ideal execution: one instruction every slot (1 cycle per slot)
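The staggered flow on this slide can be reproduced with a short script. It is a sketch under two simplifying assumptions: every instruction uses all five stages, and every stage takes exactly one slot. The names `ideal_schedule` and `render` are hypothetical.

```python
STAGES = ["IF", "ID", "EX", "MA", "WB"]


def ideal_schedule(n_instructions, stages=STAGES):
    """Return one row per instruction; row[i] is the stage held in slot i + 1, or ''."""
    total_slots = n_instructions + len(stages) - 1
    rows = []
    for i in range(n_instructions):
        row = [""] * total_slots
        for offset, name in enumerate(stages):
            row[i + offset] = name  # instruction i enters the pipeline in slot i + 1
        rows.append(row)
    return rows


def render(rows):
    """Format the schedule as fixed-width text, one instruction per line."""
    lines = []
    for number, row in enumerate(rows, start=1):
        cells = "".join(f"{cell:<4}" for cell in row).rstrip()
        lines.append(f"Instruction {number}  {cells}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(ideal_schedule(6)))
```

Under these assumptions, n instructions finish in n + 4 slots, so a long instruction stream approaches one completed instruction per slot.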

7 Hazards Prevent Ideal Flow
The pipeline can be stalled by:
- Structural hazards
  - Arise from resource conflicts that occur when the hardware cannot support all possible combinations of instructions in simultaneous, overlapped execution
- Data hazards
  - Arise when an instruction depends on the result of a previous instruction in a way that is exposed by the overlapping of instructions in the pipeline
- Control hazards
  - Arise from the pipelining of branches and other instructions that change the PC


9 Slots Requiring Multiple Cycles
- Execution time of Instruction 1: 4 cycles
- IF stage takes two cycles
- MA stage takes three cycles

10 Pipeline and Flow Control
- Due to the pipeline, instructions are fetched before previous instructions have completed
- Program timing is affected when fetched instructions are not used; this situation can occur for:
  - Conditional branching
  - Unconditional branching
- Understand the difference between:
  - Virtual time for instruction execution (ideal case)
  - Total time for an instruction to execute (actual case)
- SuperH compilers organize code to minimize pipeline stalls

11 Conditional Branch Taken
When branch is taken...
Conditional branch instructions:
- BF <destination>
- BT <destination>
Pipeline flow:
BR instr.        IF  ID  EX
Next instr.          IF  ID            (fetched, but discarded)
Next instr.              IF            (fetched, but discarded)
BR destination               IF  ID  EX  MA
The diagram marks the point at which the destination address becomes known.

12 Conditional Branch Not Taken
When branch is NOT taken...
Conditional branch instructions:
- BF <destination>
- BT <destination>
Execution continues normally:
BR inst.     IF  ID  EX  MA  WB
Next inst.       IF  ID  EX  MA  WB
Next inst.           IF  ID  EX  MA  WB
Next inst.               IF  ID  EX  MA  WB

13 Conditional Delayed Branch
When branch is taken...
Instructions:
- BF/S <destination>
- BT/S <destination>
Pipeline flow (delay introduced):
- BR instr.: IF ID EX
- Next instr.: IF, delay, ID EX MA WB
- BR destination: IF ID EX MA WB
- Following instruction: IF ID EX MA
Delayed branch / instruction re-order:
Traditional CPU:
    ADD.W R1,R0
    BGT target_address
SuperH CPU:
    BF/S target_address
    ADD.W R1,R0
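To see why re-ordering an instruction into the delay slot helps, the sketch below estimates the slots used by a loop that ends in a conditional branch. It is a rough model rather than a cycle-accurate one: the two wasted slots for a taken BT/BF follow the two "fetched, but discarded" instructions in the taken-branch diagram, while the single wasted slot for a taken BT/S or BF/S is an assumption made for illustration. All names are hypothetical.

```python
# Wasted pipeline slots per branch, keyed by (branch kind, taken?).
# ("plain", True) = 2 comes from the two discarded fetches in the diagram;
# ("delayed", True) = 1 is an assumption for this sketch.
WASTED_SLOTS = {
    ("plain", True): 2,
    ("plain", False): 0,
    ("delayed", True): 1,
    ("delayed", False): 0,
}


def loop_slots(body_instructions, iterations, branch_kind):
    """Estimate the slots used by a loop whose last instruction is a conditional branch.

    body_instructions counts every instruction in the loop, the branch included
    (and the delay-slot instruction, for a delayed branch). The branch is taken
    on every iteration except the last one.
    """
    if iterations < 1:
        raise ValueError("iterations must be at least 1")
    taken = iterations - 1
    not_taken = 1
    base = body_instructions * iterations  # one slot per instruction in the ideal case
    return (base
            + taken * WASTED_SLOTS[(branch_kind, True)]
            + not_taken * WASTED_SLOTS[(branch_kind, False)])


if __name__ == "__main__":
    # Hypothetical loop: 4 instructions per iteration, 10 iterations.
    print("BT/BF    :", loop_slots(4, 10, "plain"))
    print("BT/S BF/S:", loop_slots(4, 10, "delayed"))
```

The instruction count is the same in both cases because the re-ordered instruction only moves from before the branch to after it.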


15 Programming Tips
- Use local variables wherever possible to improve execution (locals use less ROM and RAM)
- Use modular programming to reduce far branches
- Be careful with constants; use 8-bit constants wherever possible
- Avoid operations on the MAC that will stall the pipeline
- Place functions that call each other close together
- Try to align instructions on 32-bit boundaries, especially load/store instructions
- Convert byte and word values to signed long integers, the most efficient data type for the SH-2 architecture
- Make sure that the instruction immediately following a load from memory does not use the destination register of that load
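The last tip, about the instruction that follows a load, lends itself to a simple check over an instruction list. The sketch below is a hypothetical lint, not a real assembler tool: the `Instr` record and the register lists are filled in by hand, and only adjacent instruction pairs are examined.

```python
from typing import NamedTuple, Tuple


class Instr(NamedTuple):
    text: str               # assembly text, used only for reporting
    is_load: bool           # True if the instruction loads from memory
    dest: str               # destination register, "" if none
    uses: Tuple[str, ...]   # every register the instruction reads or writes


def load_use_conflicts(program):
    """Return (load, next) text pairs where the next instruction touches the load's destination."""
    conflicts = []
    for prev, nxt in zip(program, program[1:]):
        if prev.is_load and prev.dest and prev.dest in nxt.uses:
            conflicts.append((prev.text, nxt.text))
    return conflicts


# Hand-written example sequence (hypothetical program).
PROGRAM = [
    Instr("MOV.L @R0,R1", True, "R1", ("R0", "R1")),
    Instr("ADD R1,R2", False, "R2", ("R1", "R2")),    # uses R1 right after the load
    Instr("MOV.L @R3,R4", True, "R4", ("R3", "R4")),
    Instr("ADD R5,R6", False, "R6", ("R5", "R6")),    # independent work fills the gap
    Instr("ADD R4,R6", False, "R6", ("R4", "R6")),
]

if __name__ == "__main__":
    for load, user in load_use_conflicts(PROGRAM):
        print(f"{user!r} follows {load!r} and uses its destination register")
```

In the example, only the first load is flagged; the second one has an unrelated instruction placed between the load and the use of R4.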

16 Course Summary
- Error detection
- Address errors
- Pipeline
- Branching
- Recommendations for coding
