ECE 15B Computer Organization, Spring 2011. Dmitri Strukov. Partially adapted from Computer Organization and Design, 4th edition, Patterson and Hennessy.
Agenda
- Instruction formats
- Addressing modes
- Advanced concepts
Instruction formats
Simple datapath picture
- Let's add more details to this figure to see why instruction decoding can be simple and to see what happens for different instructions
Below the Program
High-Level Language Program (e.g., C):
    temp = v[k];
    v[k] = v[k+1];
    v[k+1] = temp;
  -> Compiler ->
Assembly Language Program (e.g., MIPS):
    lw $t0, 0($2)
    lw $t1, 4($2)
    sw $t1, 0($2)
    sw $t0, 4($2)
  -> Assembler ->
Machine Language Program (MIPS):
    0000 1001 1100 0110 1010 1111 0101 1000
    1010 1111 0101 1000 0000 1001 1100 0110
    1100 0110 1010 1111 0101 1000 0000 1001
    0101 1000 0000 1001 1100 0110 1010 1111
  -> Machine interpretation ->
Hardware Architecture Description (e.g., block diagrams)
  -> Architecture implementation ->
Logic Circuit Description (circuit schematic diagrams)
One-to-one mapping
- Assembly instruction: lw $t0, 0($2)
- Binary code: 0000 1001 1100 0110 1010 1111 0101 1000
- One assembly instruction = one 32-bit vector (always 32 bits)
- A macro or pseudo-instruction expands to more than one line of code
- Example: shift and rotate from Quiz 1
    rol $a0, $a1, $a2
  expands to
    subu $t0, $0, $a2
    srlv $t0, $a1, $t0
    sllv $a0, $a1, $a2
    or   $a0, $a0, $t0
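To see why those four instructions amount to a rotate left, the expansion can be mimicked on 32-bit values. This is only a sketch in Python: the function names are hypothetical, and it relies on the MIPS rule that srlv and sllv use just the low 5 bits of the shift-amount register.

```python
MASK32 = 0xFFFFFFFF  # keep every intermediate value within 32 bits


def rol_expanded(a1: int, a2: int) -> int:
    """Mimic the four-instruction expansion of rol $a0, $a1, $a2."""
    t0 = (0 - a2) & MASK32            # subu $t0, $0, $a2
    t0 = (a1 & MASK32) >> (t0 & 31)   # srlv $t0, $a1, $t0 (low 5 bits of $t0)
    a0 = (a1 << (a2 & 31)) & MASK32   # sllv $a0, $a1, $a2 (low 5 bits of $a2)
    return a0 | t0                    # or   $a0, $a0, $t0


def rol_reference(value: int, amount: int) -> int:
    """Plain 32-bit rotate left, used only for comparison."""
    value &= MASK32
    amount &= 31
    if amount == 0:
        return value
    return ((value << amount) | (value >> (32 - amount))) & MASK32


print(hex(rol_expanded(0x12345678, 4)))
```

The negated shift amount works because -a2 and 32 - a2 agree in their low 5 bits.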
Datapath With Control
Instruction formats
R format:  op     | rs     | rt     | rd     | shamt  | funct
           6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits
I format:  op     | rs     | rt     | constant or address
           6 bits | 5 bits | 5 bits | 16 bits
J format:  op     | address
           6 bits | 26 bits
R-format Example
    op      | rs     | rt     | rd     | shamt  | funct
    6 bits  | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits
add $t0, $s1, $s2    (note the operand order! See the green card)
    special | $s1    | $s2    | $t0    | 0      | add
    0       | 17     | 18     | 8      | 0      | 32
    000000  | 10001  | 10010  | 01000  | 00000  | 100000
00000010001100100100000000100000 (base 2) = 02324020 (base 16)
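The field packing in this example can be reproduced with a few shifts. The sketch below is illustrative only; encode_r is a hypothetical helper name, and the field values are the ones from the add $t0, $s1, $s2 example.

```python
def encode_r(op: int, rs: int, rt: int, rd: int, shamt: int, funct: int) -> int:
    """Pack the six R-format fields (6, 5, 5, 5, 5, 6 bits) into one 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# add $t0, $s1, $s2  ->  op=0, rs=$s1=17, rt=$s2=18, rd=$t0=8, shamt=0, funct=32
word = encode_r(0, 17, 18, 8, 0, 32)
print(f"{word:032b}")  # binary form of the instruction word
print(f"{word:08x}")   # hexadecimal form of the instruction word
```

Note that the destination register $t0 comes first in the assembly text but lands in the rd field, the third register field of the word.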
R Type Instruction
Load Instruction
Target Addressing Example
- Loop code from earlier example
- Assume Loop at location 80000

                              Address  Encoded fields
Loop: sll  $t1, $s3, 2        80000    0   0   19  9   2   0
      add  $t1, $t1, $s6      80004    0   9   22  9   0   32
      lw   $t0, 0($t1)        80008    35  9   8   0
      bne  $t0, $s5, Exit     80012    5   8   21  2
      addi $s3, $s3, 1        80016    8   19  19  1
      j    Loop               80020    2   20000
Exit: ...                     80024
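As a rough check of the I-format and J-format rows in this example, the fields can be packed into 32-bit words. The helper names encode_i and encode_j are hypothetical; the field values are taken from the lw, bne, addi and j rows.

```python
def encode_i(op: int, rs: int, rt: int, imm: int) -> int:
    """Pack I-format fields: op (6 bits), rs (5), rt (5), 16-bit constant or address."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def encode_j(op: int, address: int) -> int:
    """Pack J-format fields: op (6 bits), 26-bit word address."""
    return (op << 26) | (address & 0x03FFFFFF)


lw_word = encode_i(35, 9, 8, 0)      # lw   $t0, 0($t1)
bne_word = encode_i(5, 8, 21, 2)     # bne  $t0, $s5, Exit
addi_word = encode_i(8, 19, 19, 1)   # addi $s3, $s3, 1
j_word = encode_j(2, 20000)          # j    Loop

for word in (lw_word, bne_word, addi_word, j_word):
    print(f"{word:08x}")
```

Masking the immediate with 0xFFFF also lets a negative offset be stored in two's complement form.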
Branch on Equal Instruction
MIPS PC-relative or branch addressing
- Branch instructions specify: opcode, two registers, target address
- Most branch targets are near the branch (forward or backward)
    op     | rs     | rt     | constant or address
    6 bits | 5 bits | 5 bits | 16 bits
- PC-relative addressing: Target address = PC + offset × 4
- The PC is already incremented by 4 by this time
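The PC-relative rule can be sketched as two small functions, one computing the 16-bit offset the assembler would store and one recovering the target. The function names are hypothetical, and the addresses come from the loop in the Target Addressing Example (bne at 80012, Exit at 80024).

```python
def branch_offset(branch_pc: int, target: int) -> int:
    """Word offset stored in the branch, relative to the already incremented PC."""
    delta = target - (branch_pc + 4)   # PC has been incremented by 4 by this time
    if delta % 4 != 0:
        raise ValueError("branch target must be word aligned")
    offset = delta // 4
    if not -2**15 <= offset < 2**15:
        raise ValueError("target too far for a 16-bit offset")
    return offset


def branch_target(branch_pc: int, offset: int) -> int:
    """Target address = (PC + 4) + offset * 4."""
    return branch_pc + 4 + offset * 4


print(branch_offset(80012, 80024))
print(branch_target(80012, 2))
```

The range check marks the point at which the assembler would have to rewrite the branch using a jump.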
Datapath With Jumps Added
Pseudodirect or Jump Addressing
- Jump (j and jal) targets could be anywhere in the text segment
- Encode the full address in the instruction
    op     | address
    6 bits | 26 bits
- (Pseudo)direct jump addressing: Target address = PC[31..28] : (address × 4)
Implementing Jumps
    Jump:  2 (bits 31:26) | address (bits 25:0)
- Jump uses a word address
- Update the PC with the concatenation of:
    - the top 4 bits of the old PC
    - the 26-bit jump address
    - 00
- Needs an extra control signal decoded from the opcode
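The concatenation can be written out with masks and shifts. This is a sketch with hypothetical function names; it treats the PC value passed in as the "old PC" whose top 4 bits are kept, and uses the j Loop row (address field 20000, instruction at 80020) as input.

```python
def jump_target(pc: int, address26: int) -> int:
    """Concatenate the top 4 bits of the PC, the 26-bit address field, and 00."""
    return (pc & 0xF0000000) | ((address26 & 0x03FFFFFF) << 2)


def jump_field(target: int) -> int:
    """26-bit word address stored in the instruction for a given byte target."""
    return (target >> 2) & 0x03FFFFFF


print(jump_target(80020, 20000))
print(jump_field(80000))
```

Because only 26 bits are stored, a jump cannot change the top 4 bits of the PC, which is why the addressing is called pseudodirect.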
Branching Far Away
- If the branch target is too far to encode with a 16-bit offset, the assembler rewrites the code
- Example:
        beq $s0, $s1, L1
  becomes
        bne $s0, $s1, L2
        j   L1
    L2: ...
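Note on PC incrementing
- The technical term associated with the auto-incrementing of the PC is delayed branch (with delayed branches, the instruction following a branch executes before the branch takes effect)
- By default in SPIM, the delayed branch option is not checked. To see your SPIM settings, look at the simulator settings
- You can also check it by loading the following code into SPIM:
    main: bne $s0, $s0, main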
Loading constant values to registers
- Any immediate is 16 bits
- To load a 32-bit constant, one can use addi, then sll + addi, or, better, lui rt, const (load upper immediate)
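A minimal sketch of how a 32-bit constant splits into two 16-bit immediates. It assumes the common lui followed by ori pairing; ori is not named on the slide and is an assumption here, as are the function names and the sample constant.

```python
def split_constant(value: int) -> tuple[int, int]:
    """Split a 32-bit constant into its upper and lower 16-bit halves."""
    value &= 0xFFFFFFFF
    return value >> 16, value & 0xFFFF


def lui_then_ori(upper: int, lower: int) -> int:
    """Model lui rt, upper followed by ori rt, rt, lower (assumed pairing)."""
    reg = (upper << 16) & 0xFFFFFFFF   # lui: upper half set, lower 16 bits cleared
    return reg | (lower & 0xFFFF)      # ori: fill in the lower half, zero-extended


upper, lower = split_constant(0x003D0900)   # illustrative constant
print(hex(upper), hex(lower))
print(hex(lui_then_ori(upper, lower)))
```

ori is used in this sketch rather than addi because addi sign-extends its immediate, which would disturb the upper half whenever bit 15 of the lower half is set.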
Specific Addressing Mode in MIPS
Various specific addressing modes in other ISAs
- Absolute address
- Immediate data
- Inherent address
- Register direct
- Register indirect
- Base register
- Register indirect with index register
- Register indirect with index register and displacement
- Register indirect with index register scaled
- Absolute address with index register
- Memory indirect
- Program counter relative
Example: Basic x86 Addressing Modes
- Two operands per instruction

    Source/dest operand | Second source operand
    Register            | Register
    Register            | Immediate
    Register            | Memory
    Memory              | Register
    Memory              | Immediate

- Memory addressing modes
    - Address in register
    - Address = R_base + displacement
    - Address = R_base + 2^scale × R_index (scale = 0, 1, 2, or 3)
    - Address = R_base + 2^scale × R_index + displacement
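The most general of these memory addressing modes can be sketched as one function; the simpler modes fall out by leaving arguments at zero. The function name, the 32-bit wraparound and the sample values are assumptions made for illustration.

```python
def effective_address(base: int, index: int = 0, scale: int = 0, displacement: int = 0) -> int:
    """Address = R_base + 2^scale * R_index + displacement, wrapped to 32 bits."""
    if scale not in (0, 1, 2, 3):
        raise ValueError("scale must be 0, 1, 2, or 3")
    return (base + (index << scale) + displacement) & 0xFFFFFFFF


# Hypothetical case: base 0x1000, 4-byte elements (scale = 2), element 3, field offset 8
print(hex(effective_address(0x1000, index=3, scale=2, displacement=8)))
```

With scale = 2 the index register counts 4-byte elements directly, so no separate shift instruction is needed to index a word array.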
Advanced Topics: Code density examples
Recent study (2009)
Code density examples
Advanced topics: Pipelining
Datapath With Control
Pipelining Analogy
- Pipelined laundry: overlapping execution
- Parallelism improves performance
- Four loads: Speedup = 8/3.5 = 2.3
- Non-stop: Speedup = 2n / (0.5n + 1.5) ≈ 4 = number of stages
Pipeline registers
- Need registers between stages
- They hold information produced in the previous cycle
Multi-Cycle Pipeline Diagram
- Traditional form
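The two speedup figures follow from one formula. The sketch below assumes four stages of half an hour each, which is what the 2n and 0.5n + 1.5 terms imply; the function name is hypothetical.

```python
def laundry_speedup(loads: int, stages: int = 4, stage_hours: float = 0.5) -> float:
    """Ratio of sequential time to pipelined time for a given number of loads."""
    sequential = loads * stages * stage_hours        # 2n hours for n loads
    pipelined = (stages + loads - 1) * stage_hours   # 0.5n + 1.5 hours for n loads
    return sequential / pipelined


print(round(laundry_speedup(4), 1))          # four loads
print(round(laundry_speedup(1_000_000), 3))  # many loads: approaches the number of stages
```

The constant 1.5 hours is the time needed to fill the pipeline, and it becomes negligible as the number of loads grows.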
Advanced topics: Cache design basics
Datapath With Control
Principle of Locality
- Programs access a small proportion of their address space at any time
- Temporal locality
    - Items accessed recently are likely to be accessed again soon
    - E.g., instructions in a loop, induction variables
- Spatial locality
    - Items near those accessed recently are likely to be accessed soon
    - E.g., sequential instruction access, array data
Taking Advantage of Locality
- Memory hierarchy
- Store everything on disk
- Copy recently accessed (and nearby) items from disk to smaller DRAM memory
    - Main memory
- Copy more recently accessed (and nearby) items from DRAM to smaller SRAM memory
    - Cache memory attached to the CPU
Direct Mapped Cache
- Location determined by address
- Direct mapped: only one choice
    - (Block address) modulo (#Blocks in cache)
- #Blocks is a power of 2
- Use low-order address bits
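Because the number of blocks is a power of 2, the modulo reduces to keeping the low-order bits of the block address. A small sketch of that equivalence, with a hypothetical function name and illustrative sizes:

```python
def cache_index(block_address: int, num_blocks: int) -> int:
    """Index of the single cache location a block can occupy."""
    if num_blocks <= 0 or num_blocks & (num_blocks - 1):
        raise ValueError("num_blocks must be a power of 2")
    return block_address & (num_blocks - 1)   # same result as block_address % num_blocks


# Illustrative case: 8 blocks in the cache, block address 22 (binary 10110)
print(cache_index(22, 8))
```

In hardware this means no divider is needed: the index is just a few wires taken from the address.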
Tags and Valid Bits
- How do we know which particular block is stored in a cache location?
    - Store the block address as well as the data
    - Actually, only the high-order bits are needed
    - Called the tag
- What if there is no data in a location?
    - Valid bit: 1 = present, 0 = not present
    - Initially 0
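The tag and valid-bit checks can be put together in a toy model of a direct mapped cache that tracks only hits and misses, not data. The class name, the 8-block size and the access sequence are assumptions chosen for illustration.

```python
class DirectMappedCache:
    """Toy direct mapped cache: one tag and one valid bit per location."""

    def __init__(self, num_blocks: int) -> None:
        self.num_blocks = num_blocks
        self.valid = [False] * num_blocks   # valid bits are initially 0
        self.tags = [0] * num_blocks

    def access(self, block_address: int) -> bool:
        """Return True on a hit; on a miss, load the block and return False."""
        index = block_address % self.num_blocks   # low-order bits select the location
        tag = block_address // self.num_blocks    # high-order bits identify the block
        hit = self.valid[index] and self.tags[index] == tag
        if not hit:
            self.valid[index] = True
            self.tags[index] = tag
        return hit


cache = DirectMappedCache(8)
results = [cache.access(a) for a in (22, 26, 22, 26, 16, 3, 16, 18)]
print(results)
```

The last access misses even though its location is valid, because block 18 maps to the same index as block 26 but has a different tag.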
Example: Direct mapped cache