System Software
Assignment 1: Runtime Support for Procedures

Exercise 1: Nested procedures

Some programming languages, such as Oberon and Pascal, support nested procedures.

1. Find a run-time structure for such a language to handle the access to data in outer procedures.

In each procedure stack frame we store an additional pointer (the static link, or SL) that marks where the stack frame of the enclosing procedure is. Note that the enclosing procedure is not always the caller.

[Figure 1: Procedure activation frame. Fields, in the order shown: local variables, dynamic link, return address, static link (SL), parameters.]

Figure 1 shows the activation frame of an Oberon (or Pascal) procedure: in addition to the dynamic link and the return address, it includes the static link pointer to the frame of the enclosing procedure. Figure 2 shows a sample program with several levels of nested procedures. Figure 3 shows the stack layout after calling the procedures S, Q, Q, P, and E. The dynamic link chain (formed by the saved frame pointers) links the frames in activation order (E → P → Q → Q → S), while the static link chain links each frame to the frame of its enclosing procedure (E → S, P → Q, and Q → S). Table 1 shows some example procedure calls and accesses to local variables at various nesting levels.

2. In Oberon only global procedures can be used as values for procedure variables (e.g., VAR proc_var: PROCEDURE foo(i: INTEGER);). Explain, from the run-time system point of view, the reasons to introduce such a limitation.

A local procedure assumes that its context is available through the static link (SL) pointer, but when it is invoked through a procedure variable that context may no longer exist. To avoid a security hole in the system, such procedures cannot be assigned to procedure variables.

Figure 4 depicts a simple module with a procedure variable x and two nested procedures P and Q. In the procedure P we assign Q to the global variable x, so after the execution of P, x refers to the inner procedure Q. This does not make sense, since Q only exists in the scope of P: once P has returned, its frame is gone and the static link Q relies on would point to an invalid location.
    MODULE Test;                     (* level 0 *)
      PROCEDURE S();                 (* level 1 *)
        VAR a: INTEGER;
        PROCEDURE E();               (* level 2 *)
          (* the stack coming from *)
          (* S, Q, Q, P is depicted *)
          (* on the right *)
        END E;
        PROCEDURE Q();               (* level 2 *)
          VAR k: INTEGER;
          PROCEDURE P();             (* level 3 *)
            VAR j: INTEGER;
            E()
          END P;
          IF b THEN Q() ELSE P() END
        END Q;
        Q()
      END S;
    END Test.

Figure 2: Nested procedures example. The accompanying scope diagram shows S (a: INTEGER) enclosing E and Q (k: INTEGER), and Q enclosing P (j: INTEGER).
[Figure 3: Stack layout. Frames from the top of the stack downwards: E, P, Q, Q, S, each with its dynamic link and its static link.]

Table 1: Local variable access examples

    Case                         Calling convention     Access convention
    ---------------------------  ---------------------  ---------------------
    one level deeper (S -> Q)    push FP                S: a := 42;
                                 call Q                 mov a[FP], 42

    same level (E -> Q)          push 8[FP]             E: a := 42;
                                 call Q                 mov R0, 8[FP]
                                                        mov a[R0], 42

    one level higher (P -> Q)    mov R0, 8[FP]          P: a := 42;
                                 push 8[R0]             mov R0, 8[FP]
                                 call Q                 mov R1, 8[R0]
                                                        mov a[R1], 42
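The rules in Table 1 can be checked with a small simulation. The following C sketch is only an illustration: C has no nested procedures, so the frames are modelled as plain structs, and the names Frame, hops_for_call, enclosing, static_link_for_call and run_figure2_example are hypothetical. It assumes that a procedure at level L calling a procedure declared at level M passes the frame reached after L - M + 1 static link hops, which reproduces the three cases of the table.

```c
#include <stddef.h>

/* Hypothetical simulated activation frame: only the fields needed here. */
typedef struct Frame {
    struct Frame *static_link;   /* frame of the lexically enclosing procedure */
    struct Frame *dynamic_link;  /* frame of the caller */
    int level;                   /* nesting level of the procedure */
    int local;                   /* one local variable (a, k or j) */
} Frame;

/* Number of static link hops needed to find the callee's enclosing frame,
   starting from the caller's frame. */
int hops_for_call(int caller_level, int callee_level) {
    return caller_level - callee_level + 1;
}

/* Follow the static link chain `hops` times. */
Frame *enclosing(Frame *f, int hops) {
    while (hops-- > 0) {
        f = f->static_link;
    }
    return f;
}

/* Static link that the caller passes to a callee declared at callee_level. */
Frame *static_link_for_call(Frame *caller, int callee_level) {
    return enclosing(caller, hops_for_call(caller->level, callee_level));
}

/* Rebuilds the call sequence S, Q, Q, P, E of Figure 2, lets E execute
   a := 42, and returns the value of a found in the frame of S. */
int run_figure2_example(void) {
    Frame s  = { NULL, NULL, 1, 0 };  /* S has no enclosing frame */
    Frame q1 = { static_link_for_call(&s, 2),  &s,  2, 0 };
    Frame q2 = { static_link_for_call(&q1, 2), &q1, 2, 0 };
    Frame p  = { static_link_for_call(&q2, 3), &q2, 3, 0 };
    Frame e  = { static_link_for_call(&p, 2),  &p,  2, 0 };

    /* a is declared at level 1, E runs at level 2: one hop. */
    enclosing(&e, e.level - 1)->local = 42;
    return s.local;
}
```

In this sketch the two activations of Q get the same static link (the frame of S) but different dynamic links, which is the distinction Figure 3 draws.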
    MODULE Test;
      VAR x: PROCEDURE();
      PROCEDURE P();
        PROCEDURE Q();
        END Q;
        x := Q   (* this is an illegal statement *)
      END P;
      P();       (* now x points to an invalid location! *)
    END Test.

Figure 4: Function pointer example.
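The problem in Figure 4 can be made concrete by treating the value of a nested procedure as a pair of code address and static link. The C sketch below is a simulation under that assumption, not Oberon's actual implementation: the stack is reduced to a counter of live frames, and SimStack, ProcValue, context_is_live and run_P are hypothetical names.

```c
#include <stdbool.h>

/* Simulated run-time stack: only the number of live frames matters here. */
typedef struct {
    int depth;  /* frames 0 .. depth-1 are live */
} SimStack;

/* Assumed representation of a nested procedure value. */
typedef struct {
    int code_id;      /* stands in for the code address of Q */
    int static_link;  /* index of the frame of the enclosing procedure */
} ProcValue;

ProcValue x;  /* plays the role of the global procedure variable x */

/* True if the frame the static link refers to is still on the stack. */
bool context_is_live(const SimStack *s, ProcValue p) {
    return p.static_link >= 0 && p.static_link < s->depth;
}

/* Simulates P: performs the (illegal) assignment x := Q and reports
   whether Q's context is live while P is still running. */
bool run_P(SimStack *s) {
    int frame_of_P = s->depth++;   /* enter P: allocate its frame */
    x.code_id = 1;                 /* hypothetical id for Q */
    x.static_link = frame_of_P;    /* Q's enclosing frame is P's frame */
    bool live_inside = context_is_live(s, x);
    s->depth--;                    /* leave P: its frame is released */
    return live_inside;
}
```

While P runs, the static link stored in x is valid. After P returns, the same value refers to a frame that is no longer on the stack, which is why the assignment is rejected at compile time.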
Exercise 2: C calling conventions

The C language supports open parameter lists (e.g., void foo(int a, int b, ...);). In other words, the number of arguments is not fixed and can be larger than the number of specified parameters (in our example, two or more).

1. Which is the formula to compute the address of the parameters on the stack?

Let's look at an example using the printf function (in C), which has an open list of parameters.

    printf("%i %i %i %i", p0, p1, p2, p3);

If we pass the parameters $p_i$ left to right (as in Java or Oberon), the address of a parameter is (Figure 5 shows the stack layout):

$$\mathrm{addr}(p_i) := FP + \mathit{padding} + \mathrm{size}(p_n) + \mathrm{size}(p_{n-1}) + \dots + \mathrm{size}(p_{i+1})$$

[Figure 5: Left to right parameter passing. Stack layout starting at FP: FP, RET, p_n, ..., p_1, p_0.]

To compute the address of a parameter we need the total number ($n$) of actual parameters, which is unknown at compile time.

If we pass the parameters right to left (as in C), the address of a parameter is (Figure 6 shows the stack layout):

$$\mathrm{addr}(p_i) := FP + \mathit{padding} + \mathrm{size}(p_0) + \mathrm{size}(p_1) + \dots + \mathrm{size}(p_{i-1})$$

which poses no problem at compile time.

[Figure 6: Right to left parameter passing. Stack layout starting at FP: FP, RET, p_0, p_1, p_2, ...]

2. Give the implementation of the actions call and return in pseudocode.
The only problem caused by a variable number of parameters is that the size of the procedure frame is not known at compile time. When a procedure returns, the parameters must be removed from the stack, but their total size is unknown. To avoid this problem, the old stack pointer is stored on the stack (as a last, hidden parameter) and later used to restore the stack after the called procedure has returned.

This is the implementation of call and return with a variable number of parameters (the steps specific to this problem are marked with *). The only changes in the implementation are the order of the parameters and the additional storage of the stack pointer.

    Call                                  Return
    ------------------------------------  --------------------------------------
    save registers                        remove locals
    push parameters (right to left) *     restore FP
    push SP *                             restore PC
    save PC                               restore the SP (remove parameters) *
    branch                                restore registers
    save FP
    FP := SP
    allocate locals
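The right-to-left address formula and the saved-SP cleanup can be tried out together on a simulated stack. The C sketch below rests on several assumptions that are not in the text: every parameter occupies one word, the stack grows towards lower indices, the saved value is the SP from before the parameters were pushed, and p0 tells the callee how many further parameters follow (much as the format string does for printf). Machine, push, param_addr and call_sum are hypothetical names, and there is no overflow check.

```c
enum { STACK_WORDS = 64 };
enum { PADDING = 3 };  /* saved FP, return address, saved SP */

typedef struct {
    int mem[STACK_WORDS];  /* simulated stack, one word per slot */
    int sp;                /* index of the top-of-stack slot */
    int fp;                /* index of the saved-FP slot of the current frame */
} Machine;

void machine_init(Machine *m) {
    m->sp = STACK_WORDS;   /* empty stack */
    m->fp = STACK_WORDS;
}

void push(Machine *m, int value) {
    m->mem[--m->sp] = value;
}

/* addr(p_i) = FP + padding + size(p_0) + ... + size(p_{i-1}),
   with size(p_k) = 1 word for every k. */
int param_addr(int fp, int i) {
    return fp + PADDING + i;
}

/* Simulates one call of a variadic "sum" procedure; args[0] is p0. */
int call_sum(Machine *m, const int *args, int n) {
    int i;
    int old_sp = m->sp;              /* SP before any parameter is pushed */

    for (i = n - 1; i >= 0; i--) {   /* push parameters right to left */
        push(m, args[i]);
    }
    push(m, old_sp);                 /* hidden parameter: old SP */
    push(m, 0);                      /* placeholder for the return address */
    push(m, m->fp);                  /* save FP */
    m->fp = m->sp;                   /* FP := SP */

    /* Callee body: p0 holds the number of values that follow. */
    int count = m->mem[param_addr(m->fp, 0)];
    int sum = 0;
    for (i = 1; i <= count; i++) {
        sum += m->mem[param_addr(m->fp, i)];
    }

    m->sp = m->fp;                   /* remove locals (none here) */
    m->fp = m->mem[m->sp];           /* restore FP */
    m->sp += 1;
    m->sp += 1;                      /* pop the return address (restore PC) */
    m->sp = m->mem[m->sp];           /* restore SP: removes all parameters */
    return sum;
}
```

The callee finds p0 at a fixed offset from FP regardless of how many parameters were pushed, and the caller never needs to know their total size to clean up.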
Exercise 3: Parameter passing

1. Why do compilers typically allocate space for arguments in the stack, even when they pass them in registers?

In this way, when a parameter has to be written to memory (because an additional register is needed), it can be written to the stack like all the other parameters, without having to be handled in a special way. Let's look at an example: on the PowerPC architecture, registers 3 to 10 can be used to pass arguments.

    void foo(int a, int b, int c) {
        if (a > 0) foo(a-1, b, c);
    }

At the first call of foo the three parameters are stored in R3 to R5. When foo calls itself recursively, the registers R3 to R5 are needed again to store the new parameters, so the old values of R3, R4 and R5 have to be written back to memory. In this case the three registers can safely be written to the stack like any other parameter.

2. How can complex variables (arrays and records) be passed to a procedure? Discuss two possibilities and comment on their efficiency.

A first, simple possibility is to copy these structures onto the stack. This approach has two major drawbacks: a lot of time is required to copy all the data at every procedure invocation, and huge stack frames are generated (for example, the size of an ARRAY 1024,1024 OF LONGREAL is 8 MB). Some languages only allow variables of these types to be passed by reference, which eliminates the problem. Other languages allow records and arrays to be passed by read-only reference: in this way the structures can be passed by reference, avoiding the copy. When the callee wants to write to these structures, the data is copied to another location (lazy copy).

3. Which is the problem when passing a constant by reference? How could this problem be avoided?

If a constant is passed by reference, the callee could modify it, since only the address of the memory location is passed on the stack and the called procedure cannot know that the actual parameter is, in fact, a constant.
To avoid the problem, the constant should be copied to a temporary location, and the address of this temporary location should be passed to the callee.
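The temporary-copy approach can be sketched in C, where passing by reference is written as passing a pointer. This is a minimal illustration; LIMIT, increment, call_with_constant and limit_after_call are hypothetical names, and a compiler for a language with real by-reference parameters would generate the temporary itself.

```c
/* A constant that callees must not be able to change. */
static const int LIMIT = 10;

/* Hypothetical callee with a by-reference parameter: it modifies
   whatever location it is given. */
void increment(int *x) {
    *x += 1;
}

/* Caller side: copy the constant to a temporary and pass the address
   of the temporary instead of the address of the constant. */
int call_with_constant(void) {
    int tmp = LIMIT;   /* temporary copy of the constant */
    increment(&tmp);   /* the callee may modify the copy freely */
    return tmp;        /* value seen by the callee after its update */
}

/* The constant itself keeps its value after the call. */
int limit_after_call(void) {
    (void)call_with_constant();
    return LIMIT;
}
```

The callee still works on a writable location, so it needs no special case for constants, and the cost is one extra copy at the call site.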
Exercise 4: Return values

Functions can return a value to the caller (e.g., PROCEDURE foo(): INTEGER; in Oberon or int foo(); in C). In Oberon it is only possible to return a scalar value or a pointer; other languages, like C, allow more complex structures to be returned.

1. Describe a simple way to return a value from the callee to the caller in Oberon (scalars and pointers only).

When only scalars and pointers can be returned, the maximum size of the returned value is known and fixed (normally four or eight bytes). In this case one or two registers can be reserved for this purpose (on PowerPC usually R3, on Intel EAX).

2. Explain what should be done to return complex structures like records and arrays.

This can be solved in two ways: either the caller passes to the callee the address of the structure in which the result will be stored (this corresponds to a hidden parameter), or the callee returns a pointer to a static variable where the structure is stored.
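Both ways of returning a structure can be written out by hand in C. The sketch below makes the hidden parameter explicit as an ordinary pointer argument; Point, add_into and add_static are hypothetical names used only for illustration.

```c
typedef struct {
    int x;
    int y;
} Point;

/* Variant 1: the caller supplies the address of the result.
   A compiler would pass `result` as a hidden parameter. */
void add_into(Point *result, Point a, Point b) {
    result->x = a.x + b.x;
    result->y = a.y + b.y;
}

/* Variant 2: the callee stores the result in a static variable and
   returns a pointer to it. */
const Point *add_static(Point a, Point b) {
    static Point result;
    result.x = a.x + b.x;
    result.y = a.y + b.y;
    return &result;
}
```

The second variant needs no extra argument, but every call reuses the same static storage: a later call overwrites the result of an earlier one, so the procedure is not reentrant. With the first variant each caller owns its own result location.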