Run-time Environments
Lecture 11
Prof. Aiken, CS 143

Status
- We have covered the front-end phases:
  - Lexical analysis
  - Parsing
  - Semantic analysis
- Next are the back-end phases:
  - Optimization
  - Code generation
- We'll do code generation first...

Run-time Environments
- Before discussing code generation, we need to understand what we are trying to generate
- There are a number of standard techniques for structuring executable code that are widely used

Outline
- Management of run-time resources
- Correspondence between static (compile-time) and dynamic (run-time) structures
- Storage organization

Run-time Resources
- Execution of a program is initially under the control of the operating system
- When a program is invoked:
  - The OS allocates space for the program
  - The code is loaded into part of the space
  - The OS jumps to the entry point (i.e., "main")

Memory Layout
  Low Address
    Code
    Other Space
  High Address
Notes
- By tradition, pictures of machine organization have:
  - Low address at the top
  - High address at the bottom
  - Lines delimiting areas for different kinds of data
- These pictures are simplifications
  - E.g., not all memory need be contiguous

What is Other Space?
- Holds all data for the program
- Other Space = Data Space
- Compiler is responsible for:
  - Generating code
  - Orchestrating use of the data area

Code Generation Goals
- Two goals:
  - Correctness
  - Speed
- Most complications in code generation come from trying to be fast as well as correct

Assumptions about Execution
1. Execution is sequential; control moves from one point in a program to another in a well-defined order
2. When a procedure is called, control eventually returns to the point immediately after the call
Do these assumptions always hold?

Activations
- An invocation of procedure P is an activation of P
- The lifetime of an activation of P is
  - All the steps to execute P
  - Including all the steps in procedures P calls

Lifetimes of Variables
- The lifetime of a variable x is the portion of execution in which x is defined
- Note that
  - Lifetime is a dynamic (run-time) concept
  - Scope is a static concept
Activation Trees
- Assumption (2) requires that when P calls Q, then Q returns before P does
- Lifetimes of procedure activations are properly nested
- Activation lifetimes can be depicted as a tree

Example
  Class Main {
    g() : Int { 1 };
    f(): Int { g() };
    main(): Int {{ g(); f(); }};
  }

Example 2
  Class Main {
    g() : Int { 1 };
    f(x:Int): Int { if x = 0 then g() else f(x - 1) fi };
    main(): Int {{ f(3); }};
  }
What is the activation tree for this example?

Notes
- The activation tree depends on run-time behavior
- The activation tree may be different for every program input
- Since activations are properly nested, a stack can track currently active procedures

Example (continued)
[Figures: the first example program shown step by step, with the activation tree and the stack of currently active procedures as execution proceeds.]
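The idea that a stack can track the currently active procedures can be tried out directly. The following is a minimal Python sketch, not part of the lecture: it re-creates the first example program (main calls g, then f, and f calls g) and records each activation together with its depth. The `activation` decorator and the `tree` and `stack` names are assumptions introduced only for this illustration.

```python
from typing import Callable, List, Tuple

stack: List[str] = []             # currently active procedures, innermost last
tree: List[Tuple[int, str]] = []  # (depth, name) for each activation, in call order


def activation(proc: Callable) -> Callable:
    """Hypothetical helper: record every activation of proc."""
    def wrapper(*args):
        tree.append((len(stack), proc.__name__))  # depth = number of active callers
        stack.append(proc.__name__)               # the activation begins
        try:
            return proc(*args)
        finally:
            stack.pop()                           # the activation ends (proper nesting)
    return wrapper


@activation
def g() -> int:
    return 1


@activation
def f() -> int:
    return g()


@activation
def main() -> int:
    g()
    return f()


main()
for depth, name in tree:
    print("  " * depth + name)  # indentation shows the parent/child structure
```

Because every activation ends before its caller's does, the stack is empty again once main returns, and the indented listing is a preorder walk of the activation tree.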
Revised Memory Layout
  Low Address
    Code
    Stack
  High Address

Activation Records
- The information needed to manage one procedure activation is called an activation record (AR) or frame
- If procedure F calls G, then G's activation record contains a mix of info about F and G.

What is in G's AR when F calls G?
- F is "suspended" until G completes, at which point F resumes. G's AR contains information needed to resume execution of F.
- G's AR may also contain:
  - G's return value (needed by F)
  - Actual parameters to G (supplied by F)
  - Space for G's local variables

The Contents of a Typical AR for G
- Space for G's return value
- Actual parameters
- Pointer to the previous activation record
  - The control link; points to AR of caller of G
- Machine status prior to calling G
  - Contents of registers & program counter
- Local variables
- Other temporary values
Example 2, Revisited
  Class Main {
    g() : Int { 1 };
    f(x:Int): Int { if x = 0 then g() else f(x - 1) (**) fi };
    main(): Int {{ f(3); (*) }};
  }

  AR for f:
    result
    argument
    control link
    return address

Stack After Two Calls to f
  Main
  f:  (result) | 3 | control link | (*)
  f:  (result) | 2 | control link | (**)

Notes
- Main has no argument or local variables and its result is never used; its AR is uninteresting
- (*) and (**) are return addresses of the invocations of f
  - The return address is where execution resumes after a procedure call finishes
- This is only one of many possible AR designs
  - Would also work for C, Pascal, FORTRAN, etc.

The Main Point
- The compiler must determine, at compile-time, the layout of activation records and generate code that correctly accesses locations in the activation record
- Thus, the AR layout and the code generator must be designed together!

Example
The picture shows the state after the call to the 2nd invocation of f returns
  Main
  f:  (result) | 3 | control link | (*)
  f:  1        | 2 | control link | (**)

Discussion
- The advantage of placing the return value 1st in a frame is that the caller can find it at a fixed offset from its own frame
- There is nothing magic about this organization
  - Can rearrange order of frame elements
  - Can divide caller/callee responsibilities differently
  - An organization is better if it improves execution speed or simplifies code generation
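The stack of activation records for Example 2 can be simulated to see where each field comes from. The following is a rough Python sketch under stated assumptions, not the layout a real compiler would emit: the `ActivationRecord` class, the use of a list index as the control link, and the `after_two_calls` snapshot are all names invented for this illustration, and g's own record is left out for brevity.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ActivationRecord:
    proc: str                             # procedure this record belongs to
    result: Optional[int] = None          # slot for the return value
    argument: Optional[int] = None        # actual parameter x
    control_link: Optional[int] = None    # stack index of the caller's record
    return_address: Optional[str] = None  # label where the caller resumes


# Main's record carries no interesting fields, as the notes above point out.
stack: List[ActivationRecord] = [ActivationRecord("Main")]
after_two_calls: List[Tuple] = []  # snapshot taken once two records for f exist


def g() -> int:
    return 1  # g's record is omitted in this sketch


def f(x: int, return_address: str) -> int:
    caller = len(stack) - 1
    stack.append(ActivationRecord("f", None, x, caller, return_address))
    if not after_two_calls and sum(ar.proc == "f" for ar in stack) == 2:
        after_two_calls.extend(
            (ar.proc, ar.result, ar.argument, ar.control_link, ar.return_address)
            for ar in stack
        )
    value = g() if x == 0 else f(x - 1, "(**)")
    record = stack.pop()   # the activation ends
    record.result = value  # the callee fills the result slot for its caller
    return record.result


final = f(3, "(*)")  # the call made from main, which resumes at (*)
for row in after_two_calls:
    print(row)
```

The snapshot has the same shape as the picture: Main at the bottom, then a record for f with argument 3 and return address (*), then one with argument 2 and return address (**).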
Discussion (Cont.)
- Real compilers hold as much of the frame as possible in registers
  - Especially the method result and arguments

Globals
- All references to a global variable point to the same object
  - Can't store a global in an activation record
- Globals are assigned a fixed address once
  - Variables with fixed address are "statically allocated"
- Depending on the language, there may be other statically allocated values

Memory Layout with Static Data
  Low Address
    Code
    Static Data
    Stack
  High Address

Heap Storage
- A value that outlives the procedure that creates it cannot be kept in the AR
    method foo() { new Bar }
  The Bar value must survive deallocation of foo's AR
- Languages with dynamically allocated data use a heap to store dynamic data

Notes
- The code area contains object code
  - For most languages, fixed size and read only
- The static area contains data (not code) with fixed addresses (e.g., global data)
  - Fixed size, may be readable or writable
- The stack contains an AR for each currently active procedure
  - Each AR usually fixed size, contains locals
- Heap contains all other data
  - In C, heap is managed by malloc and free

Notes (Cont.)
- Both the heap and the stack grow
- Must take care that they don't grow into each other
- Solution: start heap and stack at opposite ends of memory and let them grow towards each other
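The "opposite ends" solution amounts to a single bounds check on each allocation. The following is a toy Python model, offered only as a sketch: the `Memory` class and its method names are hypothetical, sizes are in abstract units, and which end each region starts at is an assumption of the sketch (here the stack starts at the low end and the heap at the high end).

```python
class Memory:
    """Toy model of the free region shared by the stack and the heap."""

    def __init__(self, size: int) -> None:
        self.stack_top = 0       # first free address above the stack
        self.heap_bottom = size  # lowest address currently used by the heap

    def push_frame(self, n: int) -> int:
        """Reserve n units for an activation record; return its base address."""
        if self.stack_top + n > self.heap_bottom:
            raise MemoryError("stack would grow into the heap")
        base = self.stack_top
        self.stack_top += n
        return base

    def pop_frame(self, n: int) -> None:
        """Release the most recently pushed n units."""
        self.stack_top -= n

    def heap_alloc(self, n: int) -> int:
        """Reserve n units of heap; return the address of the new block."""
        if self.heap_bottom - n < self.stack_top:
            raise MemoryError("heap would grow into the stack")
        self.heap_bottom -= n
        return self.heap_bottom


memory = Memory(16)
frame = memory.push_frame(8)   # the stack now occupies addresses 0..7
block = memory.heap_alloc(6)   # the heap now occupies addresses 10..15
print(frame, block, memory.heap_bottom - memory.stack_top)  # bases and remaining gap
```

Neither region has a fixed limit of its own; an allocation fails only when the two would actually meet, so whichever region needs the space can use it.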
Memory Layout with Heap
  Low Address
    Code
    Static Data
    Stack (grows toward the heap)
    Heap (grows toward the stack)
  High Address

Data Layout
- Low-level details of machine architecture are important in laying out data for correct code and maximum performance
- Chief among these concerns is alignment

Alignment
- Most modern machines are (still) 32 bit
  - 8 bits in a byte
  - 4 bytes in a word
  - Machines are either byte or word addressable
- Data is word aligned if it begins at a word boundary
- Most machines have some alignment restrictions
  - Or performance penalties for poor alignment

Alignment (Cont.)
- Example: A string "Hello"
  - Takes 5 characters (without a terminating \0)
- To word align next datum, add 3 "padding" characters to the string
- The padding is not part of the string, it's just unused memory

Next Topic: Stack Machines
- A simple evaluation model
- No variables or registers
- A stack of values for intermediate results
- Each instruction:
  - Takes its operands from the top of the stack
  - Removes those operands from the stack
  - Computes the required operation on them
  - Pushes the result on the stack

Example of Stack Machine Operation
The addition operation on a stack machine (top of stack on the left):
  5, 7, 9, ...   --pop-->   9, ...   --add (5 + 7), push-->   12, 9, ...
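The stack machine evaluation model is small enough to write out in full. Below is a minimal Python sketch of such a machine with just two instructions, push and add; the tuple encoding of instructions and the `run` function are assumptions made for this illustration, and the top of the stack is the end of the Python list.

```python
from typing import List, Optional, Tuple


def run(program: List[Tuple], stack: Optional[List[int]] = None) -> List[int]:
    """Execute a stack machine program; return the final stack (top is last)."""
    stack = [] if stack is None else list(stack)
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(instr[1])      # place an integer on top of the stack
        elif op == "add":
            right = stack.pop()         # take both operands from the top...
            left = stack.pop()
            stack.append(left + right)  # ...and push the result back
        else:
            raise ValueError(f"unknown instruction: {op}")
    return stack


# The addition step pictured above: 5 is on top, then 7, then 9.
print(run([("add",)], [9, 7, 5]))
```

Note that add names no operands and no destination: both are implied by the stack discipline.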
Example of a Stack Machine Program
- Consider two instructions
  - push i - place the integer i on top of the stack
  - add - pop two elements, add them and put the result back on the stack
- A program to compute 7 + 5:
    push 7
    push 5
    add

Why Use a Stack Machine?
- Each operation takes operands from the same place and puts results in the same place
- This means a uniform compilation scheme
- And therefore a simpler compiler

Why Use a Stack Machine?
- Location of the operands is implicit
  - Always on the top of the stack
- No need to specify operands explicitly
- No need to specify the location of the result
- Instruction "add" as opposed to "add r1, r2"
  - Smaller encoding of instructions
  - More compact programs
- This is one reason why Java Bytecodes use a stack evaluation model

Optimizing the Stack Machine
- The add instruction does 3 memory operations
  - Two reads and one write to the stack
  - The top of the stack is frequently accessed
- Idea: keep the top of the stack in a register (called accumulator)
  - Register accesses are faster
- The "add" instruction is now
    acc ← acc + top_of_stack
  - Only one memory operation!

Stack Machine with Accumulator
Invariants
- The result of an expression is in the accumulator
- For op(e1,...,en), push the accumulator on the stack after computing each of e1,...,en-1
  - After the operation, pop n-1 values
- Expression evaluation preserves the stack

Stack Machine with Accumulator. Example
Compute 7 + 5 using an accumulator

  Code                        Acc   Stack
  acc ← 7                     7     ...
  push acc                    7     7, ...
  acc ← 5                     5     7, ...
  acc ← acc + top_of_stack    12    7, ...
  pop                         12    ...
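The accumulator variant can be sketched the same way. The following Python fragment is an illustration only: the mnemonic "load" stands for acc ← i, "add" stands for acc ← acc + top_of_stack, and the `run` function and tuple encoding are assumptions of this sketch rather than anything defined in the lecture.

```python
from typing import List, Optional, Tuple


def run(program: List[Tuple], stack: List) -> Tuple[Optional[int], List]:
    """Execute accumulator-machine code; return (acc, stack) with the top last."""
    acc: Optional[int] = None
    stack = list(stack)
    for op, *args in program:
        if op == "load":      # acc <- i
            acc = args[0]
        elif op == "push":    # push acc
            stack.append(acc)
        elif op == "add":     # acc <- acc + top_of_stack (one read of the stack)
            acc = acc + stack[-1]
        elif op == "pop":     # discard the saved operand
            stack.pop()
        else:
            raise ValueError(f"unknown instruction: {op}")
    return acc, stack


seven_plus_five = [("load", 7), ("push",), ("load", 5), ("add",), ("pop",)]
acc, stack = run(seven_plus_five, ["<init>"])
print(acc, stack)
```

Both invariants are visible in the outcome: the value of the expression ends up in the accumulator, and the stack is left exactly as it was found.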
A Bigger Example: 3 + (7 + 5)

  Code                        Acc   Stack
  acc ← 3                     3     <init>
  push acc                    3     3, <init>
  acc ← 7                     7     3, <init>
  push acc                    7     7, 3, <init>
  acc ← 5                     5     7, 3, <init>
  acc ← acc + top_of_stack    12    7, 3, <init>
  pop                         12    3, <init>
  acc ← acc + top_of_stack    15    3, <init>
  pop                         15    <init>

Notes
- It is very important that evaluation of a subexpression preserves the stack
  - Stack before the evaluation of 7 + 5 is 3, <init>
  - Stack after the evaluation of 7 + 5 is 3, <init>
  - The first operand is on top of the stack
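The instruction sequence for 3 + (7 + 5) follows mechanically from the invariants, which suggests a recursive code generator. The Python sketch below is one possible reading of that scheme, not the generator developed in the course: `cgen`, `run`, the "load" mnemonic for acc ← i, and the tuple representation of expressions are all hypothetical names chosen for this illustration.

```python
from typing import List, Optional, Tuple, Union

Expr = Union[int, Tuple[str, "Expr", "Expr"]]  # an integer, or ("+", e1, e2)


def cgen(e: Expr) -> List[Tuple]:
    """Emit code that leaves the value of e in acc and preserves the stack."""
    if isinstance(e, int):
        return [("load", e)]                 # acc <- i
    _, e1, e2 = e                            # only "+" is handled in this sketch
    return cgen(e1) + [("push",)] + cgen(e2) + [("add",), ("pop",)]


def run(code: List[Tuple], stack: List) -> List[Tuple]:
    """Execute code; return one (acc, stack with top first) row per instruction."""
    acc: Optional[int] = None
    stack = list(stack)
    trace = []
    for op, *args in code:
        if op == "load":
            acc = args[0]
        elif op == "push":
            stack.append(acc)
        elif op == "add":
            acc = acc + stack[-1]            # acc <- acc + top_of_stack
        elif op == "pop":
            stack.pop()
        trace.append((acc, tuple(reversed(stack))))
    return trace


code = cgen(("+", 3, ("+", 7, 5)))
trace = run(code, ["<init>"])
for instr, (acc, stack) in zip(code, trace):
    print(instr, acc, stack)
```

Since the code for each subexpression restores the stack before it finishes, the code for the enclosing addition can rely on its first operand still being on top when the add executes.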