CSC 3210 Computer Organization and Programming Georgia State University March 31, 2015
This lecture Plan for the lecture: recap of register saving; subroutine linkage (call, jmpl, and ret instructions); arguments to subroutines; selected solutions for Quiz 4.
Introduction (1) Subroutines make it possible to repeat a computation, possibly with different arguments. Open subroutine: handled by the text editor or macro preprocessor; the code is inserted wherever it is required in the program. This is efficient, with no wasted instructions, but results in long code. Closed subroutine: the code appears only once in the program; a jump to it is executed whenever it is needed, and after the subroutine has executed, control returns to the location after the jump instruction.
Introduction (2) Subroutines allow code to be debugged once, with the assurance that all future uses of that code will be correct. They provide control of errors and are the basis for structured programming.
Register Saving (1) Almost any computation involves the use of registers. Usually, when a subroutine is called, registers are pushed onto the stack, and they are popped off when it returns. To reduce the execution time involved, CISC machines sometimes use a special register save mask whose set bits indicate which registers are to be saved.
[Figure: register file diagram, showing the 8 global registers]
Register Saving (2) The SPARC architecture provides a register file of typically 128 registers; the programmer has access to the eight global registers and only 24 of the mapped registers at a time. The current register set is indicated by the current window pointer, CWP (bits 0-4 of the PSR). The last free register set is marked by the window invalid mask register, WIM. Each register set has 16 registers. The save instruction changes the register mapping so that new registers are provided; the restore instruction restores the register mapping on subroutine return.
Window overflow: if a further five subroutine calls are made without any returns, window overflow occurs on the next subroutine call. That additional call moves the 16 registers from window set 7 to the stack, where the %sp of register window set 6 is pointing. Window underflow: occurs if a restore would move the CWP back to the window marked by WIM.
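The interplay of CWP and WIM can be sketched with a small Python model. This is only an illustration under stated assumptions: 8 windows, a CWP that is decremented by save, and arbitrary starting values for CWP and WIM. The class and attribute names are hypothetical, and the real spill and fill work is done by trap handlers that are not modelled here.

```python
NWINDOWS = 8  # assumption: 128 windowed registers / 16 new registers per window


class WindowFile:
    """Toy model of the current window pointer (CWP) and the invalid-window mark (WIM)."""

    def __init__(self, nwindows=NWINDOWS):
        self.nwindows = nwindows
        self.cwp = nwindows - 1  # arbitrary starting window
        self.wim = 0             # arbitrary window marked invalid
        self.spilled = 0         # windows currently saved on the stack

    def save(self):
        """Move to the next window; return True if this caused a window overflow."""
        new = (self.cwp - 1) % self.nwindows
        overflow = new == self.wim
        if overflow:
            # A trap handler would store the oldest window's 16 registers
            # on the stack and move the invalid mark one window further on.
            self.spilled += 1
            self.wim = (self.wim - 1) % self.nwindows
        self.cwp = new
        return overflow

    def restore(self):
        """Move back one window; return True if this caused a window underflow."""
        new = (self.cwp + 1) % self.nwindows
        underflow = new == self.wim
        if underflow:
            if self.spilled == 0:
                raise RuntimeError("restore without a matching save")
            # A trap handler would reload the 16 registers from the stack.
            self.spilled -= 1
            self.wim = (self.wim + 1) % self.nwindows
        self.cwp = new
        return underflow


w = WindowFile()
saves = [w.save() for _ in range(7)]        # six saves fit, the seventh overflows
restores = [w.restore() for _ in range(7)]  # the last restore underflows
```

In this model the overflow and underflow traps are symmetric: every window spilled by a save is filled again by the matching restore.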
Save & restore After a save instruction, the caller's %o6 (%sp) becomes the callee's %i6 (%fp). save %sp, -96, %sp subtracts 96 from the current stack pointer but stores the result in the new window's stack pointer, leaving the old stack pointer unchanged; the old stack pointer becomes the new frame pointer. The restore instruction restores the previous register window set; registers are reloaded from the stack only if a window underflow occurs. Like save, restore is also an add instruction.
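The effect of save on the two pointers can be written out as plain arithmetic. The following Python sketch is illustrative only; the function name and the starting stack address are hypothetical.

```python
def save_window(old_sp, adjust=-96):
    """Model `save %sp, adjust, %sp`; return (new_sp, new_fp).

    The addition reads %sp in the caller's window and writes %sp in the
    callee's window, so the caller's value is never overwritten.
    """
    new_fp = old_sp           # caller's %o6 is visible as callee's %i6
    new_sp = old_sp + adjust  # callee's %o6
    return new_sp, new_fp


caller_sp = 0x7FF0            # hypothetical stack address
callee_sp, callee_fp = save_window(caller_sp)
```

After the matching restore, the caller sees its original %sp again, because that register was only ever read.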
Subroutine Linkage (1) A ba instruction could be used to branch to a subroutine, but it leaves no way to return to the point where the subroutine was called. SPARC provides two instructions for linking to subroutines, both of which can save the address of the calling instruction in %o7. The return is to the address %o7 + 8, which skips the calling instruction and its delay slot. Inside the subroutine, after a save, %o7 is mapped to %i7, so the return is made to %i7 + 8.
Subroutine Linkage (2) The SPARC architecture supports two instructions, call and jmpl, for linking to subroutines
jmpl Instruction Used when the address of the subroutine is computed rather than known at assembly time; the address is loaded into a register. The subroutine address is the sum of the source arguments, and the address of the jmpl instruction is stored in the destination register. Format: jmpl reg_rs1 + reg_rs2_or_constant, reg_rd. It is always followed by a delay slot instruction. To call a subroutine whose address is in register %o0 and to store the return address into %o7, we would write: jmpl %o0, %o7
Call Instruction If the subroutine name is known at assembly time, the call instruction may be used. Its operand is a target address label, or a register that holds the address: call label or call %register. It transfers control to that address and stores the address of the call instruction (the %pc contents) into %o7. call %o0 is expanded to jmpl %o0, %o7. It is always followed by a delay slot instruction.
Call vs jmpl The assembler recognizes call %o0 as jmpl %o0, %o7. The return from a subroutine also makes use of the jmpl instruction: we need to return to %i7 + 8, so the assembler recognizes ret as jmpl %i7 + 8, %g0.
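Both expansions are instances of the same operation, which a short Python sketch makes explicit. The function and the register dictionary are hypothetical stand-ins, and the delay slot is ignored here; the addresses are borrowed from the example slides for illustration.

```python
def jmpl(regs, pc, rs1, op2, rd):
    """Model `jmpl rs1 + op2, rd`: return the target, record the jmpl's address in rd.

    op2 is either a register name or an integer constant.
    """
    second = regs[op2] if isinstance(op2, str) else op2
    target = regs[rs1] + second
    if rd != "g0":        # %g0 always reads as zero; writes to it are discarded
        regs[rd] = pc
    return target


regs = {"g0": 0, "o0": 2000, "o7": 0, "i7": 0}

# call %o0  ->  jmpl %o0, %o7  (that is, %o0 + %g0), issued at address 704
call_target = jmpl(regs, 704, "o0", "g0", "o7")

# After save, the caller's %o7 is the callee's %i7.
regs["i7"] = regs["o7"]

# ret  ->  jmpl %i7 + 8, %g0, issued at address 2024
return_target = jmpl(regs, 2024, "i7", 8, "g0")
```

Using %g0 as the destination is what makes ret a pure jump: the address of the ret itself is simply thrown away.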
ret instruction The call to the subroutine is:
    call subr
    nop
At the entry of the subroutine:
    subr: save %sp, ..., %sp
with the return:
    ret
    restore
The restore instruction is normally used to fill the delay slot of the ret instruction. Since ret expands to jmpl %i7 + 8, %g0, the return sequence is equivalent to:
    jmpl %i7 + 8, %g0
    restore
Example of call instruction Call and return instructions update the program counter. The listing below is for illustration only; it is not the real assembly code for .mul:
    [700]  mov  3, %o0
    [704]  call .mul
    [708]  mov  10, %o1
    [712]  add  %o0, %l0, %l2
    ...
    [2000] save %sp, -96, %sp
    ...
    [2024] ret
    [2028] restore
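Example of call instruction (cont.) During the execute stage of the call instruction, the program counter is set to 2000, so %pc takes the values 704, 708, 2000. Each instruction passes through the F, E, M, and W pipeline stages:
    [704]  call     F E M W
    [708]  mov      F E M W
    [2000] save     F E M W
In the subroutine %i0 = 30, and register %i7 holds the address of the call instruction (704) during the subroutine. On return, %pc takes the values 2400, 2404, and then %i7 + 8 = 712:
    [2400] ret      F E M W
    [2404] restore  F E M W
    [712]  add      F E M W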
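The order in which addresses are executed follows from the delay slot: a control transfer takes effect one instruction late. The Python sketch below models this with a pc/npc pair. It is a simplification under stated assumptions: the function name is hypothetical, instruction effects other than linkage are ignored, and the subroutine body is cut down to save, ret, and restore at hypothetical addresses 2000 to 2008.

```python
def trace_pc(program, start, steps):
    """Return the addresses executed, modelling delayed control transfer.

    program maps an address to (mnemonic, operand); only call, save and ret
    have an effect here, every other mnemonic just falls through.
    """
    regs = {"o7": 0, "i7": 0}
    pc, npc = start, start + 4
    executed = []
    for _ in range(steps):
        op, operand = program[pc]
        executed.append(pc)
        next_npc = npc + 4
        if op == "call":
            regs["o7"] = pc          # call saves its own address in %o7
            next_npc = operand       # target is reached after the delay slot
        elif op == "save":
            regs["i7"] = regs["o7"]  # caller's %o7 is the callee's %i7
        elif op == "ret":
            next_npc = regs["i7"] + 8
        pc, npc = npc, next_npc
    return executed


program = {
    700: ("mov", None),
    704: ("call", 2000),
    708: ("mov", None),        # delay slot of the call
    712: ("add", None),
    2000: ("save", None),
    2004: ("ret", None),       # hypothetical: placed right after save
    2008: ("restore", None),   # delay slot of ret
}
trace = trace_pc(program, 700, 7)
# trace == [700, 704, 708, 2000, 2004, 2008, 712]
```

The mov at 708 runs before the save at 2000, and the restore runs before the add at 712, which is why both delay slots must hold instructions that are safe to execute at that point.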
Arguments to Subroutines (1) 1. Arguments follow in-line after the call instruction (strongly NOT recommended). For example, a Fortran routine to add two numbers, 3 and 4, would be called with the two values stored in the words that follow the call and its delay slot, and the subroutine would read them relative to %i7. Note that the return is to %i7 + 16, jumping over the arguments. This type of argument passing is very efficient but limited: recursive calls are not possible, nor is it possible to compute any of the arguments, because they are fixed in the code at assembly time.
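The %i7 + 16 return offset follows from the instruction layout, as the Python sketch below works out. The helper names are hypothetical, and the call address 704 is borrowed from the earlier example purely for illustration.

```python
WORD = 4  # every SPARC instruction and every in-line argument word is 4 bytes


def inline_arg_addresses(call_addr, nargs):
    """Addresses of arguments stored after the call and its delay slot."""
    first = call_addr + 2 * WORD          # skip the call and the delay slot
    return [first + i * WORD for i in range(nargs)]


def return_address(call_addr, nargs):
    """Return address that jumps over the in-line argument words."""
    return call_addr + 2 * WORD + nargs * WORD


args_at = inline_arg_addresses(704, 2)    # where the values 3 and 4 would sit
resume_at = return_address(704, 2)        # 704 + 16
```

With two argument words the offset is 8 + 2 * 4 = 16; with no in-line arguments it reduces to the usual %i7 + 8.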
Arguments to Subroutines (2) 2. Placing arguments on the stack. This is time consuming, since each argument must be stored before the subroutine is called and must be loaded back into a register for any computation, but it is flexible: any number of arguments can be passed, and recursive calls are supported. In SPARC, the first 6 arguments can be placed in the out registers. Only 6 out registers are available, since %o6 is the stack pointer and %o7 is used for storing the calling address. After the save instruction, the arguments are available to the subroutine in %i0-%i5.
Arguments to Subroutines (3) The stack frame reserves: 64 bytes for saving 16 registers, starting at %sp; 4 bytes for the structure return pointer (a pointer is an address), at %sp + 64; 24 bytes for 6 arguments, starting at %sp + 68. Additional arguments may be placed on the stack starting at %sp + 92 and can be accessed by the called subroutine at %fp + 92. Standard format for writing a subroutine:
    .global subroutine_name
    subroutine_name:
        save %sp, -(64 + 4 + 24 + more_args + local) & -8, %sp
where local variables for the called subroutine (callee) are accessed relative to %fp at local_var_offset (a negative offset), and passed arguments for the callee are accessed at %fp + arguments_offset.
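The `& -8` in the save operand rounds the frame size so that the stack pointer stays 8-byte aligned. A small Python sketch of the same arithmetic, with a hypothetical function name:

```python
def frame_adjust(local_bytes=0, more_arg_bytes=0):
    """Compute the save operand -(64 + 4 + 24 + more_args + local) & -8."""
    size = 64 + 4 + 24 + more_arg_bytes + local_bytes
    # -8 has its low three bits clear, so the AND rounds the negative
    # value down (away from zero) to a multiple of 8.
    return -size & -8


minimum = frame_adjust()             # -92 rounds to -96
with_locals = frame_adjust(24)       # -116 rounds to -120
```

This is why the minimal frame appears as `save %sp, -96, %sp`: the 92 required bytes are rounded up to 96.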
1. Assume -3 (decimal) is stored in 8 bits at memory location %fp + x_offset. Show the content of register %o0 in hexadecimal after each of the following instructions. Assume the register is 32 bits wide. (10 points)
    a) ldsb [%fp + x_offset], %o0   ! ldsb loads a signed byte
    b) ldub [%fp + x_offset], %o0   ! ldub loads an unsigned byte
1. Solution: -3 (decimal) in 8 bits is 0xFD.
    a) ldsb [%fp + x_offset], %o0   ! sign-extends the byte: %o0 = 0xFFFFFFFD
    b) ldub [%fp + x_offset], %o0   ! zero-extends the byte: %o0 = 0x000000FD
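The two results can be checked with a few lines of Python that mimic sign extension and zero extension of one byte into a 32-bit register. The function names simply reuse the instruction mnemonics; they are not a real API.

```python
def ldsb(byte):
    """Sign-extend an 8-bit value into a 32-bit register image."""
    value = byte - 0x100 if byte & 0x80 else byte
    return value & 0xFFFFFFFF


def ldub(byte):
    """Zero-extend an 8-bit value into a 32-bit register image."""
    return byte & 0xFF


stored = -3 & 0xFF                 # 8-bit two's complement of -3, i.e. 0xFD
signed_result = ldsb(stored)       # 0xFFFFFFFD
unsigned_result = ldub(stored)     # 0x000000FD
```

The upper 24 bits are copies of bit 7 of the byte for ldsb and zeros for ldub.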
3. Suppose your program needs the following variables, declared in the order shown (a double occupies 8 bytes): double a, b; char c; int d; Write the offset for each variable, the save instruction that provides the storage, and the load instructions that load the variables into local registers (a -> %l0, b -> %l1, c -> %l2, d -> %l3). (10 points)
3. Solution, offsets (each variable is aligned to a multiple of its size):
    a_s = -8
    b_s = -16
    c_s = -17
    d_s = -24
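The offsets can be derived mechanically by walking down from %fp and aligning each variable to its own size. The Python sketch below assumes that alignment equals size and that sizes are powers of two; the function name is hypothetical.

```python
def assign_offsets(variables):
    """Assign negative %fp-relative offsets to (name, size_in_bytes) pairs in order."""
    offsets = {}
    offset = 0
    for name, size in variables:
        offset -= size     # make room below the previous variable
        offset &= -size    # round down to a multiple of size (power of two)
        offsets[name] = offset
    return offsets


offsets = assign_offsets([("a", 8), ("b", 8), ("c", 1), ("d", 4)])
local_bytes = -min(offsets.values())   # storage needed below %fp
```

Here d cannot start at -21 because an int must sit on a 4-byte boundary, so it moves down to -24, and the locals need 24 bytes in total.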
3. Solution, save instruction (24 bytes of local storage on top of the 92-byte minimum frame, rounded to a multiple of 8):
    save %sp, -(92 + 24) & -8, %sp
3. Solution, load instructions:
    ldd [%fp + a_s], %l0        ! load variable a into %l0/%l1
    ldd [%fp + b_s], %l2        ! load variable b into %l2/%l3
    ldsb/ldub [%fp + c_s], %l4  ! load variable c
    ld [%fp + d_s], %l5         ! load variable d
ldd fills an even/odd register pair, so a occupies %l0-%l1 and b occupies %l2-%l3; c and d therefore go in %l4 and %l5.