Course Introduction

Purpose
- This course provides an overview of the SH-2A 32-bit RISC CPU core built into newer microcontrollers in the popular SH-2 series.

Objectives
- Acquire knowledge about the CPU's register banks.
- Gain an understanding of the SH-2A's on-chip cache memory.
- Review some helpful programming suggestions.

Content: 13 pages, 2 questions
Learning time: 20 minutes

SH-2A/SH2A-FPU Register Banks

The SH-2A and SH2A-FPU CPU cores have register banks that:
- Provide high-speed register save and restore, which is particularly useful for improving interrupt-processing performance.
- Can be banked automatically by interrupts; banking is enabled per interrupt priority level.
- Can be restored using the RESBANK instruction.

[Block diagram: SH-2A CPU (superscalar* RISC design) with general registers, system registers, control registers, register banks, 5-stage pipeline, hardware multiplier, CPU instruction fetch bus, CPU data fetch bus, on-chip cache, FPU (SH2A-FPU only), and clock. *Two instructions are fetched and executed simultaneously.]

Nineteen Registers Are Banked

- General registers R0 to R14 (15 registers)
- GBR
- MAC registers (MACH and MACL)
- Procedure register (PR)

Registers labelled on the slide: IBCR, IBNR

Number of Register Banks

- The SH-2A/SH2A-FPU architecture supports up to 512 banks, but the typical number is about 15.
- When all banks are full, the register contents are saved to and restored from the stack automatically.
- Exceptions can be generated when:
  - An attempt is made to bank registers when all banks are full (overflow).
  - An attempt is made to restore register contents via a RESBANK instruction when all banks are empty (underflow).

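The save and restore behaviour described on this slide can be pictured with a small software model. This is only a sketch of the bookkeeping, not the hardware mechanism: the names `bank_file_t`, `bank_save`, `bank_restore`, and the `overflow_exception` switch are hypothetical, the bank count of 15 is the "typical" figure quoted above, and the stack contents themselves are not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_BANKS 15   /* assumption: the "typical" bank count from the slide */

/* The nineteen banked registers: R0-R14, GBR, MACH, MACL, PR. */
typedef struct {
    uint32_t r[15];
    uint32_t gbr;
    uint32_t mach;
    uint32_t macl;
    uint32_t pr;
} bank_regs_t;

typedef enum {
    BANK_OK,                /* context saved to / restored from a bank        */
    BANK_SPILLED_TO_STACK,  /* banks full: context went to the stack instead  */
    BANK_FROM_STACK,        /* context came back from the stack               */
    BANK_OVERFLOW,          /* banks full and the overflow exception is on    */
    BANK_UNDERFLOW          /* restore attempted with nothing saved           */
} bank_status_t;

typedef struct {
    bank_regs_t banks[NUM_BANKS];
    int  used;                /* banks currently holding a context            */
    int  spilled;             /* contexts currently held on the stack         */
    bool overflow_exception;  /* hypothetical switch: trap instead of spill   */
} bank_file_t;

/* Models what happens when an interrupt is accepted with banking enabled. */
static bank_status_t bank_save(bank_file_t *bf, const bank_regs_t *regs)
{
    assert(bf && regs);
    if (bf->used < NUM_BANKS) {
        bf->banks[bf->used++] = *regs;
        return BANK_OK;
    }
    if (bf->overflow_exception)
        return BANK_OVERFLOW;
    bf->spilled++;            /* stack contents are not modelled here */
    return BANK_SPILLED_TO_STACK;
}

/* Models a RESBANK at the end of an interrupt service routine. */
static bank_status_t bank_restore(bank_file_t *bf, bank_regs_t *out)
{
    assert(bf && out);
    if (bf->spilled > 0) {    /* most recent contexts are on the stack */
        bf->spilled--;
        return BANK_FROM_STACK;
    }
    if (bf->used > 0) {
        *out = bf->banks[--bf->used];
        return BANK_OK;
    }
    return BANK_UNDERFLOW;    /* simplification: always reported here */
}
```

In this model, nested interrupts consume banks in last-in, first-out order, which is why the restore path checks the spilled contexts before the banks.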
Question

Is the following statement true or false?

"When the ISR begins executing, it stacks the CPU contents in RAM, a process aided by register banking."

- True
- False

On-Chip, 16KB Cache Memory

- Built-in cache controller
- Separate operand (data) and instruction caches, 8KB each
  - Four-way set associative
  - 128 entries per way
  - 16-byte cache line size
  - Operand cache: ways 2 and 3 are lockable
- Write modes: write-back and write-through, selectable
- LRU replacement algorithm employed
  - Helps minimize the impact of cache line replacement
- Pre-fetch capability
  - PREF instruction

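The figures on this slide are consistent with one another, which a few lines of arithmetic can confirm. The constant names below are illustrative only, and the lockable-size figure is simply derived from "ways 2 and 3 are lockable"; it is not a number quoted in the course.

```c
#include <assert.h>

enum {
    CACHE_WAYS      = 4,    /* four-way set associative          */
    ENTRIES_PER_WAY = 128,  /* 128 entries per way               */
    LINE_BYTES      = 16,   /* 16-byte cache line                */
    WAY_BYTES       = ENTRIES_PER_WAY * LINE_BYTES,  /* 2 KB per way   */
    CACHE_BYTES     = CACHE_WAYS * WAY_BYTES         /* 8 KB per cache */
};

/* Compile-time check that the geometry matches the stated sizes. */
static_assert(CACHE_BYTES == 8 * 1024, "each cache is 8 KB");
static_assert(2 * CACHE_BYTES == 16 * 1024, "operand + instruction = 16 KB");

/* Bytes of operand cache that could be locked if ways 2 and 3 are both used. */
static unsigned lockable_bytes(void)
{
    return 2u * WAY_BYTES;
}
```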
Structure of the Operand Cache

There are four ways (banks).

Address and Data Sections (Operand Cache)

Both the address and data sections of the cache are divided into 128 entries.

Cache Line (Operand Cache)

The data section of each entry is a cache line of 16 bytes (four 4-byte longwords).

V: Valid Bit in Address Array (Operand Cache)

V: Indicates when the data in the cache is valid (set to 1).
(Important: Flush the cache before using it; that sets the V bit to 0.)

U: Has Data Been Written To? (Operand Cache)

U: Only present in the operand cache; it indicates whether or not the entry has been written to in write-back mode. (U is 1 when the entry has been written to.)

LRU: Cache Housekeeper (Operand Cache)

LRU: Stores information about which of the four ways an entry is stored in. This is important because up to four data or instruction entries with the same entry address can be registered in the cache. The LRU also indicates the least recently used entry, which is the one replaced if replacement is necessary.

Seven Bits = 128 Entries (Operand Cache)

Entries are selected using bits 10 to 4 of the memory address; seven bits give 2^7 = 128 entries. (The four LSBs are always 0.)

Tag Address (Operand Cache)

Bits 31 to 11 of the address are stored as the tag address in the cache.

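The bit fields described on the last two slides can be expressed as shifts and masks. The helper names below are made up for illustration; only the bit positions (31 to 11 for the tag, 10 to 4 for the entry, 3 to 0 for the byte within the 16-byte line) come from the course material.

```c
#include <assert.h>
#include <stdint.h>

/* Byte position inside the 16-byte cache line: address bits 3..0. */
static inline uint32_t cache_offset(uint32_t addr)
{
    return addr & 0xFu;
}

/* Entry (set) number, 0..127: address bits 10..4. */
static inline uint32_t cache_entry(uint32_t addr)
{
    return (addr >> 4) & 0x7Fu;
}

/* Tag address stored in the address array: address bits 31..11. */
static inline uint32_t cache_tag(uint32_t addr)
{
    return addr >> 11;
}

/* Rebuilds an address from its three fields (inverse of the helpers above). */
static inline uint32_t cache_rebuild(uint32_t tag, uint32_t entry, uint32_t offset)
{
    assert(entry < 128u && offset < 16u);
    return (tag << 11) | (entry << 4) | offset;
}
```

Two addresses that are 2 KB apart, such as 0x1000 and 0x1800, select the same entry but carry different tags, which is why up to four such lines can coexist, one per way.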
V=1, Cache Hit; V=0, Cache Miss (Operand Cache)

The tag address of the access is compared with the tag stored in the selected entry. When the comparison shows a match and the V bit is 1, a cache hit occurs. If the V bit is 0, a cache miss occurs.

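The hit test can be sketched as a loop over the four ways of the selected entry. This is a software illustration under stated assumptions: the types `addr_entry_t` and `addr_array_t` and the function `cache_lookup` are hypothetical, and real hardware compares the four ways in parallel rather than one after another.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define WAYS    4
#define ENTRIES 128

/* One address-array entry, reduced to the fields needed for the hit test. */
typedef struct {
    uint32_t tag;  /* address bits 31..11 */
    bool     v;    /* valid bit           */
} addr_entry_t;

typedef struct {
    addr_entry_t way[WAYS][ENTRIES];
} addr_array_t;

/* Returns the way (0..3) that hits, or -1 on a cache miss. */
static int cache_lookup(const addr_array_t *c, uint32_t addr)
{
    assert(c);
    uint32_t entry = (addr >> 4) & 0x7Fu;  /* bits 10..4 select the entry */
    uint32_t tag   = addr >> 11;           /* bits 31..11 are the tag     */

    for (int w = 0; w < WAYS; w++) {
        const addr_entry_t *e = &c->way[w][entry];
        if (e->v && e->tag == tag)         /* match AND valid => hit */
            return w;
    }
    return -1;
}
```

A tag that happens to match while V is 0 still counts as a miss, which is the reason the earlier slide recommends flushing the cache before use.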
Cache Read Hits/Misses

Read hit
- Data is transferred from the cache to the CPU.

Read miss
- An external bus cycle starts and the cache entry is updated.
- The data is transferred to the CPU at the same time that it is loaded into the cache.
- The V bit is set and the LRU is updated.
- For the operand cache, the U bit is cleared to 0.
- If the U bit was 1, the original contents of the cache line are copied to the write-back buffer before the cache is updated.
- After the cache fill, a write-back cycle writes the buffered original contents out to external memory.

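The read sequence above can be traced with a one-line model that counts bus activity. It is a minimal sketch under stated assumptions: `line_t`, `bus_stats_t`, and `cache_read` are hypothetical names, only a single cache line is modelled, and LRU updates and the actual data transfer are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t tag;
    bool     v;   /* valid                                  */
    bool     u;   /* written to in write-back mode (dirty)  */
} line_t;

typedef struct {
    unsigned fills;        /* external reads that load a cache line     */
    unsigned write_backs;  /* write-back buffer flushes to external RAM */
} bus_stats_t;

/* Returns true on a read hit, false on a read miss. */
static bool cache_read(line_t *line, uint32_t tag, bus_stats_t *bus)
{
    assert(line && bus);

    if (line->v && line->tag == tag)
        return true;               /* read hit: data comes from the cache */

    /* Read miss: a dirty line is parked in the write-back buffer first... */
    bool had_dirty_data = line->v && line->u;

    /* ...then the line is filled from external memory. */
    bus->fills++;
    line->tag = tag;
    line->v   = true;
    line->u   = false;             /* freshly loaded data is clean */

    /* After the fill, the buffered old contents go out to external memory. */
    if (had_dirty_data)
        bus->write_backs++;

    return false;
}
```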
Operand Cache Write Hits/Misses

Write hit
- Write-back mode
  - Data is written to the cache and no external access occurs.
  - The U bit is set and the LRU is updated.
- Write-through mode
  - Data is written to the cache and an external write cycle is issued.
  - The U bit is not set; the LRU is updated.

Write miss
- Write-back mode
  - An external cycle starts and the entry is updated.
  - If the U bit of the replaced cache way is 1, the cache update occurs after the original cache line is written to the write-back buffer.
  - After the cache update, the write-back buffer is written to external memory.
- Write-through mode
  - No cache write occurs.
  - There is external memory access only.

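The four write cases can be compared side by side in a small model. As before, this is a hedged sketch rather than the hardware algorithm: `write_mode_t`, `line_t`, `bus_stats_t`, and `cache_write` are hypothetical, a single line stands in for the whole cache, and marking the line dirty after a write-back-mode miss is an assumption that the slide does not state explicitly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum { WRITE_BACK, WRITE_THROUGH } write_mode_t;

typedef struct {
    uint32_t tag;
    bool     v;   /* valid                           */
    bool     u;   /* written to in write-back mode   */
} line_t;

typedef struct {
    unsigned fills;        /* line loads from external memory       */
    unsigned write_backs;  /* write-back buffer flushes             */
    unsigned ext_writes;   /* direct external write cycles          */
} bus_stats_t;

/* Returns true on a write hit, false on a write miss. */
static bool cache_write(line_t *line, uint32_t tag, write_mode_t mode,
                        bus_stats_t *bus)
{
    assert(line && bus);

    if (line->v && line->tag == tag) {      /* write hit */
        if (mode == WRITE_BACK)
            line->u = true;                 /* cache only, no external access */
        else
            bus->ext_writes++;              /* cache + external write, U unchanged */
        return true;
    }

    if (mode == WRITE_THROUGH) {            /* write miss, write-through */
        bus->ext_writes++;                  /* external memory only */
        return false;
    }

    /* Write miss, write-back: replace the line, flushing it if dirty. */
    bool had_dirty_data = line->v && line->u;
    bus->fills++;
    line->tag = tag;
    line->v   = true;
    line->u   = true;   /* assumption: the newly written line is marked dirty */
    if (had_dirty_data)
        bus->write_backs++;                 /* buffer drained after the update */
    return false;
}
```

The counters make the trade-off visible: write-back mode defers external traffic until a dirty line is replaced, while write-through mode issues an external write on every store.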
Question

Match each SH-2A cache term to the appropriate description.

Terms
- A: Operand cache
- B: V bit
- C: U = 1
- D: Cache hit

Descriptions (with the matching term)
- B: Indicates when the data in the cache is valid
- D: Occurs when the comparison shows a match and V is 1
- A: Ways 2 and 3 can be locked
- C: Indicates that the entry has been written to in a write-back mode

Ten Helpful Programming Tips

1. Locate branch destinations on longword boundaries.
2. Use a register different from the load destination register for the next three instructions after an instruction that loads from memory.
3. Use a register different from the multiply result register for the next three instructions after a 32-bit multiply instruction.
4. Use local or automatic stack-based variables wherever possible.
5. Use modular programming.
6. Be careful with constants, using 8-bit if possible.
7. Avoid unnecessary MAC and FPU operations that might stall pipelines.
8. Place functions that call each other close together.
9. Try to align instructions on 32-bit boundaries.
10. Convert byte and word values to signed-long integers.

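Tips 4 and 10 can be illustrated at the C level. The function below is a hypothetical example, and whether it actually produces better code depends on the compiler and its optimization settings; it simply shows a local accumulator and a one-time widening of byte data to a signed 32-bit value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sums signed 8-bit samples using 32-bit arithmetic throughout. */
static int32_t sum_samples(const int8_t *samples, size_t count)
{
    assert(samples != NULL || count == 0);

    int32_t total = 0;               /* tip 4: local (automatic) accumulator */

    for (size_t i = 0; i < count; i++) {
        int32_t s = samples[i];      /* tip 10: widen the byte once to signed long */
        total += s;
    }
    return total;
}
```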
Course Summary

- Register banks of SH-2A and SH2A-FPU RISC CPU cores
- On-chip cache memory
- Suggestions for efficient programming