Helmut Seidl. Virtual Machines. München. Summer 2014


1 Helmut Seidl Virtual Machines München Summer 2014 1

2 0 Introduction Principle of Interpretation: Program + Input Interpreter Output Advantage: No precomputation on the program text == no/short startup-time Disadvantages: Program parts are repeatedly analyzed during execution + less efficient access to program variables == slower execution speed 2

3 Principle of Compilation: Program Compiler Code Input Code Output Two Phases (at two different Times): Translation of the source program into a machine program (at compile time); Execution of the machine program on input data (at run time). 3

4 Preprocessing of the source program provides for efficient access to the values of program variables at run time global program transformations to increase execution speed. Disadvantage: Compilation takes time Advantage: Program execution is sped up == compilation pays off in long running or often run programs 4

5 Structure of a compiler: Source program Frontend Internal representation (Syntax tree) Optimizations Internal representation Code generation Program for target machine 5

6 Subtasks in code generation: Goal is a good exploitation of the hardware resources: 1. Instruction Selection: Selection of efficient, semantically equivalent instruction sequences; 2. Register-allocation: Best use of the available processor registers 3. Instruction Scheduling: Reordering of the instruction stream to exploit intra-processor parallelism For several reasons, e.g. modularization of code generation and portability, code generation may be split into two phases: 6

7 Intermediate representation Code generation abstract machine code abstract machine code Compiler concrete machine code alternatively: Input Interpreter Output 7

8 Virtual machine idealized architecture, simple code generation, easily implemented on real hardware. Advantages: Porting the compiler to a new target architecture is simpler, Modularization makes the compiler easier to modify, Translation of program constructs is separated from the exploitation of architectural features. 8

9 Virtual (or: abstract) machines for some programming languages: Pascal P-machine Smalltalk Bytecode Prolog WAM ( Warren Abstract Machine ) SML, Haskell STGM Java JVM 9

10 We will consider the following languages and virtual machines: C CMa // imperative PuF MaMa // functional Proll WiM // logic based C± OMa // object oriented multi-threaded C threaded CMa // concurrent 10

11 The Translation of C 11

12 1 The Architecture of the CMa Each virtual machine provides a set of instructions Instructions are executed on the virtual hardware This virtual hardware can be viewed as a set of data structures, which the instructions access... and which are managed by the run-time system For the CMa we need: 12

13 The Data Store: S 0 SP S is the (data) store, onto which new cells are allocated in a LIFO discipline == Stack. SP ( = Stack Pointer) is a register, which contains the address of the topmost allocated cell, Simplification: All types of data fit into one cell of S. 13

14 The Code/Instruction Store: C 0 1 PC C is the Code store, which contains the program. Each cell of field C can store exactly one virtual instruction. PC ( = Program Counter) is a register, which contains the address of the instruction to be executed next. Initially, PC contains the address 0. == C[0] contains the instruction to be executed first. 14

15 Execution of Programs: The machine loads the instruction in C[PC] into an Instruction Register IR and executes it. PC is incremented by 1 before the execution of the instruction:

while (true) {
    IR = C[PC]; PC++;
    execute (IR);
}

The execution of the instruction may overwrite the PC (jumps). The Main Cycle of the machine will be halted by executing the instruction halt, which returns control to the environment, e.g. the operating system. More instructions will be introduced on demand. 15
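The main cycle above can be sketched directly in Python. The representation chosen here (instructions as Python callables acting on a state dictionary) is an assumption for illustration, not the lecture's own encoding.

```python
# Minimal sketch of the CMa main cycle: fetch C[PC] into IR, increment PC,
# then execute; an instruction may overwrite PC (jumps), halt stops the loop.

def run(C):
    state = {"PC": 0, "S": [], "halted": False}
    while not state["halted"]:
        IR = C[state["PC"]]    # IR = C[PC];
        state["PC"] += 1       # PC++;  (before execution, so jumps work)
        IR(state)              # execute (IR);
    return state

def loadc(q):                  # pushes the constant q onto the stack
    return lambda st: st["S"].append(q)

def halt(st):                  # returns control to the environment
    st["halted"] = True

final = run([loadc(42), halt])
```

Running the two-instruction program leaves 42 on top of the stack and stops with PC = 2.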

16 2 Simple expressions and assignments Problem: evaluate the expression (1 + 7) ∗ 3! This means: generate an instruction sequence, which determines the value of the expression and pushes it on top of the stack... Idea: first compute the values of the subexpressions, save these values on top of the stack, then apply the operator. 16

17 The general principle: instructions expect their arguments on top of the stack, execution of an instruction consumes its operands, results, if any, are stored on top of the stack. loadc q q SP++; S[SP] = q; Instruction loadc q needs no operand on top of the stack, pushes the constant q onto the stack. Note: the content of register SP is only implicitly represented, namely through the height of the stack. 17

18 3 8 mul 24 SP--; S[SP] = S[SP] ∗ S[SP+1]; mul expects two operands on top of the stack, consumes both, and pushes their product onto the stack.... the other binary arithmetic and logical instructions, add, sub, div, mod, and, or and xor, work analogously, as do the comparison instructions eq, neq, le, leq, gr and geq. 18
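The effect of loadc and mul on the stack can be simulated directly; a sketch with the data store as a Python list (SP is the implicit index of the last element), following only the pseudo-code above:

```python
S = []                     # data store; SP is implicitly len(S) - 1

def loadc(q):
    S.append(q)            # SP++; S[SP] = q;

def mul():
    b = S.pop()            # SP--;
    S[-1] = S[-1] * b      # S[SP] = S[SP] * S[SP+1];

loadc(3)
loadc(8)
mul()                      # consumes 3 and 8, leaves 24 on top
```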

19 Example The operator leq 7 3 leq 1 Remark: 0 represents false, all other integers true. Unary operators neg and not consume one operand and produce one result. 8 neg -8 S[SP] = - S[SP]; 19

20 Example Code for 1 + 7: loadc 1 loadc 7 add Execution of this code sequence: 7 loadc 1 1 loadc 7 1 add 8 20

21 Variables are associated with memory cells in S: z: y: x: ρ delivers for each variable x the relative address of x. ρ is called Address Environment. 21

22 Variables can be used in two different ways: Example x = y+1 We are interested in the value of y, but in the address of x. The syntactic position determines whether the L-value or the R-value of a variable is required. L-value of x = address of x R-value of x = content of x code R e ρ produces code to compute the R-value of e in the address environment ρ code L e ρ analogously for the L-value Note: Not every expression has an L-value (Ex.: x+1). 22

23 We define: code R (e 1 + e 2 ) ρ = code R e 1 ρ code R e 2 ρ add... analogously for the other binary operators code R (-e) ρ = code R e ρ neg... analogously for the other unary operators code R q ρ = loadc q code L x ρ = loadc (ρ x)... 23
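These schemata translate directly into a small recursive code generator. The following sketch assumes expressions encoded as nested tuples and instructions as tuples; this encoding is mine, not the lecture's.

```python
def code_L(e, rho):
    # only variables have an L-value in this fragment
    tag, x = e
    assert tag == 'var'
    return [('loadc', rho[x])]          # code_L x rho = loadc (rho x)

def code_R(e, rho):
    tag = e[0]
    if tag == 'const':
        return [('loadc', e[1])]        # code_R q rho = loadc q
    if tag == 'var':
        return code_L(e, rho) + [('load',)]
    if tag == 'neg':
        return code_R(e[1], rho) + [('neg',)]
    if tag == 'add':                    # code_R e1; code_R e2; add
        return code_R(e[1], rho) + code_R(e[2], rho) + [('add',)]
    raise ValueError(tag)

seq = code_R(('add', ('const', 1), ('const', 7)), {})
```

For the expression 1 + 7 this yields exactly the sequence loadc 1; loadc 7; add from the slides.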

24 code R x ρ = code L x ρ load The instruction load loads the contents of the cell, whose address is on top of the stack. 13 load S[SP] = S[S[SP]]; 24

25 code R (x = e) ρ = code R e ρ code L x ρ store store writes the contents of the second topmost stack cell into the cell, whose address is on top of the stack, and leaves the written value on top of the stack. Note: this differs from the code generated by gcc?? store 13 S[S[SP]] = S[SP-1]; SP--; 25

26 Example Code for e ≡ x = y - 1 with ρ = {x ↦ 4, y ↦ 7}. code R e ρ produces: loadc 7 load loadc 1 sub loadc 4 store Improvements: Introduction of special instructions for frequently used instruction sequences, e.g., loada q = loadc q load storea q = loadc q store 26
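The example can be checked by executing the generated sequence on a small store with an explicit SP. The initial contents (y at address 7 holding 10) are an assumption for illustration.

```python
S = [0] * 20
S[7] = 10                  # y; x lives at address 4 per rho
SP = 7                     # stack top; new cells are allocated above

def loadc(q):
    global SP
    SP += 1; S[SP] = q

def load():
    S[SP] = S[S[SP]]

def sub():
    global SP
    SP -= 1; S[SP] = S[SP] - S[SP + 1]

def store():
    global SP
    S[S[SP]] = S[SP - 1]; SP -= 1

def pop():
    global SP
    SP -= 1

# code for the statement  x = y - 1;
loadc(7); load(); loadc(1); sub(); loadc(4); store(); pop()
```

Afterwards x (cell 4) holds 9 and the stack height is unchanged, as required for a statement.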

27 3 Statements and Statement Sequences If e is an expression, then e; is a statement. Statements do not deliver a value. The contents of the SP before and after the execution of the generated code must therefore be the same. code e; ρ = code R e ρ pop The instruction pop eliminates the top element of the stack. 1 pop SP--; 27

28 The code for a statement sequence is the concatenation of the code for the statements of the sequence: code (s ss) ρ = code s ρ code ss ρ codeε ρ = // empty sequence of instructions 28

29 4 Conditional and Iterative Statements We need jumps to deviate from the serial execution of consecutive statements: jump A A PC PC PC = A; 29

30 1 jumpz A PC PC 0 jumpz A A PC PC if (S[SP] == 0) PC = A; SP--; 30

31 For ease of comprehension, we use symbolic jump targets. They will later be replaced by absolute addresses. Instead of absolute code addresses, one could generate relative addresses, i.e., relative to the actual PC. Advantages: smaller addresses suffice most of the time; the code becomes relocatable, i.e., can be moved around in memory. 31

32 4.1 One-sided Conditional Statement Let us first regard s ≡ if (e) s′. Idea: Put code for the evaluation of e and s′ consecutively in the code store, Insert a conditional jump (jump on zero) in between. 32

33 code s ρ = code R e ρ jumpz A code s′ ρ A :... code R for e jumpz code for s′ 33

34 4.2 Two-sided Conditional Statement Let us now regard s ≡ if (e) s 1 else s 2. The same strategy yields: code s ρ = code R e ρ jumpz A code s 1 ρ jump B A : code s 2 ρ B :... code R for e jumpz code for s 1 jump code for s 2 34

35 Example Let ρ = {x ↦ 4, y ↦ 7} and s ≡ if (x > y) (i) x = x - y; (ii) else y = y - x; (iii) code s ρ produces:

    loada 4      // (i)
    loada 7
    gr
    jumpz A
    loada 4      // (ii)
    loada 7
    sub
    storea 4
    pop
    jump B
A:  loada 7      // (iii)
    loada 4
    sub
    storea 7
    pop
B:  ...

35

36 4.3 while-loops Let us regard the loop s ≡ while (e) s′. We generate: code s ρ = A : code R e ρ jumpz B code s′ ρ jump A B :... code R for e jumpz code for s′ jump 36

37 Example Let ρ = {a ↦ 7, b ↦ 8, c ↦ 9} and s the statement: while (a > 0) {c = c+1; a = a - b;} code s ρ produces the sequence:

A:  loada 7
    loadc 0
    gr
    jumpz B
    loada 9
    loadc 1
    add
    storea 9
    pop
    loada 7
    loada 8
    sub
    storea 7
    pop
    jump A
B:  ...

37

38 4.4 for-loops The for-loop s ≡ for (e 1 ; e 2 ; e 3 ) s′ is equivalent to the statement sequence e 1 ; while (e 2 ) {s′ e 3 ;} provided that s′ contains no continue-statement. We therefore translate: code s ρ = code R e 1 ρ pop A : code R e 2 ρ jumpz B code s′ ρ code R e 3 ρ pop jump A B :... 38

39 4.5 The switch-statement Idea: Multi-target branching in constant time! Use a jump table, which contains at its i-th position the jump to the beginning of the i-th alternative. Realized by indexed jumps. q jumpi B B+q PC PC PC = B + S[SP]; SP--; 39

40 Simplification: We only regard switch-statements of the following form:

s ≡ switch (e) {
      case 0:    ss 0 break;
      case 1:    ss 1 break;
      ...
      case k-1:  ss k-1 break;
      default:   ss k
    }

s is then translated into the instruction sequence: 40

41 code s ρ = code R e ρ
             check 0 k B
       C 0 : code ss 0 ρ
             jump D
             ...
       C k : code ss k ρ
             jump D
         B : jump C 0
             ...
             jump C k
         D : ...

The Macro check 0 k B checks, whether the R-value of e is in the interval [0, k], and executes an indexed jump into the table B The jump table contains direct jumps to the respective alternatives. At the end of each alternative is an unconditional jump out of the switch-statement. 41

42 check 0 k B = dup
                loadc 0
                geq
                jumpz A
                dup
                loadc k
                le
                jumpz A
                jumpi B
             A: pop
                loadc k
                jumpi B

The R-value of e is still needed for indexing after the comparison. It is therefore copied before the comparison. This is done by the instruction dup. The R-value of e is replaced by k before the indexed jump is executed if it is less than 0 or greater than k. 42
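Functionally, the macro computes a jump target from the R-value v of e. The following sketch captures just that computation; the name check_target is mine.

```python
def check_target(v, k, B):
    """Target of the indexed jump after check 0 k B: table entry B + v
    for v inside the interval [0, k], otherwise the default entry B + k."""
    return B + v if 0 <= v <= k else B + k
```

For a table at B = 100 with k = 3, in-range values select their own entry, while out-of-range values fall through to the default entry at B + k.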

43 3 dup 3 3 S[SP+1] = S[SP]; SP++; 43

44 Note: The jump table could be placed directly after the code for the Macro check. This would save a few unconditional jumps. However, it may require traversing the switch-statement twice. If the table starts with u instead of 0, we have to decrease the R-value of e by u before using it as an index. If all potential values of e are definitely in the interval [0, k], the macro check is not needed. 44

45 5 Storage Allocation for Variables Goal: Associate statically, i.e. at compile time, with each variable x a fixed (relative) address ρ x Assumptions: variables of basic types, e.g. int,... occupy one storage cell. variables are allocated in the store in the order, in which they are declared, starting at address 1. Consequently, we obtain for the declaration d ≡ t 1 x 1 ;... t k x k ; (t i basic type) the address environment ρ such that ρ x i = i, i = 1,..., k 45

46 5.1 Arrays Example int [11] a; The array a consists of 11 components and therefore needs 11 cells. ρ a is the address of the component a[0]. a[10] a[0] 46

47 We need a function sizeof (notation: | · |), computing the space requirement of a type:

|t| = 1          if t is basic
|t| = k · |t′|   if t ≡ t′[k]

Accordingly, we obtain for the declaration d ≡ t 1 x 1 ;... t k x k ;

ρ x 1 = 1
ρ x i = ρ x i-1 + |t i-1|   for i > 1

Since | · | can be computed at compile time, also ρ can be computed at compile time. 47
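The size function and the resulting address environment can be sketched as follows; the type encoding ('int' or ('array', k, t)) is my own choice for illustration.

```python
def size(t):
    if t == 'int':             # |t| = 1 for basic types
        return 1
    tag, k, elem = t           # |t'[k]| = k * |t'|
    return k * size(elem)

def address_env(decls):
    """decls: list of (type, name) in declaration order; addresses start at 1."""
    rho, a = {}, 1
    for t, x in decls:
        rho[x] = a             # rho x_i = rho x_{i-1} + |t_{i-1}|
        a += size(t)
    return rho

rho = address_env([('int', 'i'), (('array', 11, 'int'), 'a'), ('int', 'j')])
```

Here a (an int[11]) occupies 11 cells starting at address 2, so j ends up at address 13.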

48 Task: Extend code L and code R to expressions with accesses to array components. Let t[c] a; be the declaration of an array a. To determine the start address of a component a[i], we compute ρ a + |t| ∗ (R-value of i). In consequence: code L a[e] ρ = loadc (ρ a) code R e ρ loadc |t| mul add... or more general: 48

49 code L e 1 [e 2 ] ρ = code R e 1 ρ code R e 2 ρ loadc |t| mul add Remark: In C, an array is a pointer. A declared array a is a pointer-constant, whose R-value is the start address of the array. Formally, we define for an array e: code R e ρ = code L e ρ In C, the following are equivalent (as L-values): 2[a] a[2] ∗(a+2) Normalization: Array names and expressions evaluating to arrays occur in front of index brackets, index expressions inside the index brackets. 49

50 5.2 Structures In Modula and Pascal, structures are called Records. Simplification: Names of structure components are not used elsewhere. Alternatively, one could manage a separate environment ρ st for each structure type st. Let struct { int a; int b; } x; be part of a declaration list. x has as relative address the address of the first cell allocated for the structure. The components have addresses relative to the start address of the structure. In the example, these are a ↦ 0, b ↦ 1. 50

51 Let t ≡ struct{t 1 c 1 ;... t k c k ;}. We have

|t| = Σ_{i=1}^{k} |t i|
ρ c 1 = 0   and   ρ c i = ρ c i-1 + |t i-1|   for i > 1

We thus obtain: code L (e.c) ρ = code L e ρ loadc (ρ c) add 51
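Component offsets follow the same accumulation scheme as variable addresses; a sketch, tried out on a struct with an array field of size 7 and a pointer field of size 1 (the field sizes are assumptions matching the struct t used later in the pointer chapter):

```python
def offsets(fields):
    """fields: list of (size, name); returns (offset map, total size |t|)."""
    off, a = {}, 0
    for s, c in fields:
        off[c] = a             # rho c_1 = 0; rho c_i = rho c_{i-1} + |t_{i-1}|
        a += s
    return off, a

off, total = offsets([(7, 'a'), (1, 'b')])   # e.g. struct { int a[7]; ... *b; }
```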

52 Example Let struct { int a; int b; } x; be given, such that ρ = {x ↦ 13, a ↦ 0, b ↦ 1}. This yields: code L (x.b) ρ = loadc 13 loadc 1 add 52

53 6 Pointer and Dynamic Storage Management Pointers allow the access to anonymous, dynamically generated objects, whose life time is not subject to the LIFO-principle. == We need another potentially unbounded storage area H, the Heap. S 0 MAX H SP EP NP NP = New Pointer; points to the lowest occupied heap cell. EP = Extreme Pointer; points to the uppermost cell, to which SP can point (during execution of the actual function). 53

54 Idea: Stack and Heap grow toward each other in S, but must not collide. (Stack Overflow). A collision may be caused by an increment of SP or a decrement of NP. EP saves us the check for collision at the stack operations. The checks at heap allocations are still necessary. 54

55 What can we do with pointers (pointer values)? set a pointer to a storage cell, dereference a pointer, i.e., access the value in a storage cell pointed to by a pointer. There are two ways to set a pointer: (1) A call malloc (e) reserves a heap area of the size of the value of e and returns a pointer to this area: code R malloc(e) ρ = code R e ρ new (2) The application of the address operator & to a variable returns a pointer to this variable, i.e. its address ( = L-value). Therefore: code R (&e) ρ = code L e ρ 55

56 NP NP n n new if (NP - S[SP] ≤ EP) S[SP] = NULL; else { NP = NP - S[SP]; S[SP] = NP; } NULL is a special pointer constant, identified with the integer constant 0. In the case of a collision of stack and heap the NULL-pointer is returned. 56

57 Dereferencing of Pointers: The application of the operator ∗ to the expression e returns the contents of the storage cell, whose address is the R-value of e: code L (∗e) ρ = code R e ρ Example Given the declarations struct t { int a[7]; struct t ∗b; }; int i, j; struct t ∗pt; and the expression ((pt → b) → a)[i + 1] Because of e → a ≡ (∗e).a holds: code L (e → a) ρ = code R e ρ loadc (ρ a) add 57

58 [Figure: memory layout for the declarations above; the stack cells for i, j and pt, where pt points to a heap-allocated struct t with components a (an array of 7 ints) and b, which points to a further struct t.] 58

59 Let ρ = {i ↦ 1, j ↦ 2, pt ↦ 3, a ↦ 0, b ↦ 7}. Then:

code L ((pt → b) → a)[i+1] ρ = code R ((pt → b) → a) ρ
                               code R (i+1) ρ
                               loadc 1
                               mul
                               add
                             = code R ((pt → b) → a) ρ
                               loada 1
                               loadc 1
                               add
                               loadc 1
                               mul
                               add

59

60 For arrays, their R-value equals their L-value. Therefore:

code R ((pt → b) → a) ρ = code R (pt → b) ρ
                          loadc 0
                          add
                        = loada 3
                          loadc 7
                          add
                          load
                          loadc 0
                          add

In total, we obtain the instruction sequence:

loada 3; loadc 7; add; load; loadc 0; add; loada 1; loadc 1; add; loadc 1; mul; add

60

61 7 Conclusion We tabulate the cases of the translation of expressions: code L (e 1 [e 2 ]) ρ = code R e 1 ρ code R e 2 ρ loadc |t| mul add if e 1 has type t ∗ or t [] code L (e.a) ρ = code L e ρ loadc (ρ a) add 61

62 code L (∗e) ρ = code R e ρ code L x ρ = loadc (ρ x) code R (&e) ρ = code L e ρ code R e ρ = code L e ρ if e is an array code R (e 1 □ e 2 ) ρ = code R e 1 ρ code R e 2 ρ op where op is the instruction for the operator □ 62

63 code R q ρ = loadc q q constant code R (e 1 = e 2 ) ρ = code R e 2 ρ code L e 1 ρ store code R e ρ = code L e ρ load otherwise 63

64 Example int a[10], (∗b)[10]; with ρ = {a ↦ 7, b ↦ 17}. For the statement: ∗a = 5; we obtain: code L (∗a) ρ = code R a ρ = code L a ρ = loadc 7 code (∗a = 5;) ρ = loadc 5 loadc 7 store pop As an exercise translate: s 1 ≡ b = (&a)+2; and s 2 ≡ (b+3)[0] = 5; 64

65 code (s 1 s 2 ) ρ = loadc 7
                      loadc 2
                      loadc 10   // size of int[10]
                      mul        // scaling
                      add
                      loadc 17
                      store
                      pop        // end of s 1
                      loadc 5
                      loadc 17
                      load
                      loadc 3
                      loadc 10   // size of int[10]
                      mul        // scaling
                      add
                      store
                      pop        // end of s 2

65

66 8 Freeing Occupied Storage Problems: The freed storage area is still referenced by other pointers (dangling references). After several deallocations, the storage could look like this (fragmentation): [Figure: storage interspersed with blocks marked free] 66

67 Potential Solutions: Trust the programmer. Manage freed storage in a particular data structure (free list) == malloc or free may become expensive. Do nothing, i.e.: code free(e); ρ = code R e ρ pop == simple and (in general) efficient. Use an automatic, potentially conservative Garbage-Collection, which occasionally collects certainly inaccessible heap space. 67

68 9 Functions The definition of a function consists of: a name by which it can be called; a specification of the formal parameters; a possible result type; a block of statements. In C, we have: code R f ρ = loadc _f where _f = start address of the code for f == Function names must be maintained within the address environment! 68

69 Example int fac (int x) { if (x ≤ 0) return 1; else return x ∗ fac(x - 1); } main () { int n; n = fac(2)+fac(1); printf ("%d", n); } At every point of execution, several instances (calls) of the same function may be active, i.e., have been started, but not yet completed. The recursion tree of the example: main fac fac fac fac fac printf 69

70 We conclude: The formal parameters and local variables of the different calls of the same function (the instances) must be kept separate. Idea Allocate a dedicated memory block for each call of a function. In sequential programming languages, these memory blocks may be maintained on a stack. Therefore, they are also called stack frames. 70

71 9.1 Memory Organization for Functions SP local variables FP PCold FPold EPold organizational cells formal parameters / return value FP = Frame Pointer; points to the last organizational cell and is used for addressing the formal parameters and local variables. 71

72 Caveat The local variables receive relative addresses +1,+2,.... The formal parameters are placed below the organizational cells and therefore have negative addresses relative to FP :-) This organization is particularly well suited for function calls with variable number of arguments as, e.g., for printf. The memory block of parameters is recycled for storing the return value of the function :-)) Simplification The return value fits into a single memory cell. 72

73 Tasks of a Translator for Functions: Generate code for the body of the function! Generate code for calls! 73

74 9.2 Determining Address Environments We distinguish two kinds of variables: 1. global/extern that are defined outside of functions; 2. local/intern/automatic (including formal parameters) which are defined inside functions. == The address environment ρ maps names onto pairs (tag, a) ∈ {G, L} × Z. Caveat In general, there are further refined grades of visibility of variables. Different parts of a program may be translated relative to different address environments! 74

75 Example 0 int i; struct list { int info; struct list ∗next; } ∗l; 1 int ith (struct list ∗x, int i) { if (i ≤ 1) return x → info; else return ith (x → next, i - 1); } 2 main () { int k; scanf ("%d", &i); scanlist (&l); printf ("\n\t%d\n", ith (l,i)); } 75

76 Address Environments Occurring in the Program: 0 Before the Function Definitions: ρ 0 : i ↦ (G, 1) l ↦ (G, 2)... 1 Inside of ith: ρ 1 : i ↦ (L, -4) x ↦ (L, -3) l ↦ (G, 2) ith ↦ (G, _ith)... 76

77 Caveat The actual parameters are evaluated from right to left!! The first parameter resides directly below the organizational cells :-) For a prototype τ f(τ 1 x 1 ,..., τ k x k ) we define: x 1 ↦ (L, -2 - |τ 1 |) x i ↦ (L, -2 - (|τ 1 | +... + |τ i |)) 77

78 2 Inside of main: ρ 2 : i ↦ (G, 1) l ↦ (G, 2) k ↦ (L, 1) ith ↦ (G, _ith) main ↦ (G, _main)... 78

79 9.3 Calling/Entering and Exiting/Leaving Functions Assume that f is the current function, i.e., the caller, and f calls the function g, i.e., the callee. The code for the call must be distributed between the caller and the callee. The distribution can only be such that the code depending on information of the caller must be generated for the caller and likewise for the callee. Caveat The space requirement of the actual parameters is only known to the caller... 79

80 Actions when entering g:

1. Evaluating the actual parameters (part of f)
2. Saving of FP, EP (instruction mark, part of f)
3. Determining the start address of g (part of f)
4. Setting of the new FP (instruction call, part of f)
5. Saving PC and jump to the beginning of g (instruction call, part of f)
6. Setting of new EP (instruction enter, part of g)
7. Allocating of local variables (instruction alloc, part of g)

80

81 Actions when terminating the call:

1. Storing of the return value
2. Restoring of the registers FP, EP (instruction return)
3. Jumping back into the code of f, i.e., restoration of the PC (instruction return)
4. Popping the stack (instruction slide)

81

82 Accordingly, we obtain for a call to a function with at least one parameter and one return value: code R g(e 1 ,..., e n ) ρ = code R e n ρ... code R e 1 ρ mark code R g ρ call slide (m - 1) where m is the size of the actual parameters. 82

83 Remark Of every expression which is passed as a parameter, we determine the R-value == call-by-value passing of parameters. The function g may as well be denoted by an expression, whose R-value provides the start address of the called function... 83

84 Similar to declared arrays, function names are interpreted as constant pointers onto function code. Thus, the R-value of this pointer is the start address of the function. Caveat! For a variable int (∗g)(); the two calls (∗g)() and g() are equivalent! By means of normalization, the dereferencing of function pointers can be considered as redundant :-) During passing of parameters, these are copied. Consequently, code R f ρ = loadc (ρ f) if f is the name of a function code R (∗e) ρ = code R e ρ if e is a function pointer code R e ρ = code L e ρ move k if e is a structure of size k where 84

85 k move k for (i = k-1; i ≥ 0; i--) S[SP+i] = S[S[SP]+i]; SP = SP+k-1; 85
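The copying loop of move k can be mirrored in Python; a sketch where the store is a growable list and the new SP is returned instead of kept in a register:

```python
def move(S, SP, k):
    """move k: copy the structure of size k whose address is on top of the
    stack onto the stack starting at SP; returns the new SP (= SP + k - 1)."""
    src = S[SP]
    for i in range(k - 1, -1, -1):     # for (i = k-1; i >= 0; i--)
        while len(S) <= SP + i:        # grow the Python list as needed
            S.append(0)
        S[SP + i] = S[src + i]         # S[SP+i] = S[S[SP]+i];
    return SP + k - 1                  # SP = SP + k - 1;

S = [10, 20, 30, 0]    # cells 0..2 hold a structure of size 3
S[3] = 0               # top of stack: address of the structure
SP = move(S, 3, 3)
```

The address cell is overwritten by the first copied cell, so the structure now occupies the top k cells of the stack.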

86 The instruction mark saves the registers FP and EP onto the stack. FP EP e FP EP e e mark S[SP+1] = EP; S[SP+2] = FP; SP = SP + 2; 86

87 The instruction call saves the return address and sets FP and PC onto the new values. q call FP p PC p PC q tmp = S[SP]; S[SP] = PC; FP = SP; PC = tmp; 87

88 The instruction slide copies the return values into the correct memory cell: 42 m slide m 42 tmp = S[SP]; SP = SP-m; S[SP] = tmp; 88
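The effect of slide m on a list-based stack is a pop of the result, m further pops, and a push; a sketch following the pseudo-code tmp = S[SP]; SP = SP - m; S[SP] = tmp:

```python
def slide(S, m):
    tmp = S.pop()          # tmp = S[SP];
    for _ in range(m):     # SP = SP - m;
        S.pop()
    S.append(tmp)          # S[SP] = tmp;
    return S
```

For example, sliding the return value 42 over two consumed parameter cells turns [1, 2, 3, 42] into [1, 42]; slide 0 leaves the stack unchanged.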

89 Accordingly, we translate a function definition: code t f (specs){V_defs ss} ρ = _f: enter q // initialize EP alloc k // allocate the local variables code ss ρ f return // return from call where q = max + k with max = maximal length of the local stack k = size of the local variables ρ f = address environment for f // takes specs, V_defs and ρ into account 89

90 The instruction enter q sets the EP to the new value. If not enough space is available, program execution terminates. enter q EP EP = SP + q; if (EP ≥ NP) Error ( Stack Overflow ); 90

91 The instruction alloc k allocates memory for locals on the stack. alloc k k SP = SP + k; 91

92 The instruction return pops the current stack frame. This means it restores the registers PC, EP and FP and returns the return value on top of the stack. PC FP EP p e v return PC FP EP p e v PC = S[FP]; EP = S[FP-2]; if (EP ≥ NP) Error ( Stack Overflow ); SP = FP-3; FP = S[SP+2]; 92

93 9.4 Access to Variables, Formal Parameters and Returning of Values Accesses to local variables or formal parameters are relative to the current FP. Accordingly, we modify code L for names of variables. For ρ x = (tag, j) we define

code L x ρ = loadc j    if tag = G
             loadrc j   if tag = L

93

94 The instruction loadrc j computes the sum of FP and j. FP f loadrc j FP f f+j SP++; S[SP] = FP+j; 94

95 As an optimization, we introduce analogously to loada j and storea j the new instructions loadr j and storer j : loadr j = loadrc j load storer j = loadrc j; store 95

96 The code for return e; corresponds to an assignment to a variable with relative address -3. code return e; ρ = code R e ρ storer -3 return Example For function int fac (int x) { if (x ≤ 0) return 1; else return x ∗ fac (x - 1); } we generate: 96

97 _fac: enter q
        alloc 0
        loadr -3
        loadc 0
        leq
        jumpz A
        loadc 1
        storer -3
        return
        jump B
     A: loadr -3
        loadr -3
        loadc 1
        sub
        mark
        loadc _fac
        call
        slide 0
        mul
        storer -3
        return
     B: return

where ρ fac : x ↦ (L, -3) and q = 5. 97
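The generated code can be executed on a small model of the CMa. The following sketch interprets exactly the instructions used here; the EP/NP overflow checks are omitted, the label addresses A and B are resolved by hand, and the whole encoding (tuples for instructions, a fixed-size list for S) is illustrative, not the lecture's implementation.

```python
def run(code):
    S = [0] * 200
    SP, FP, EP, PC = -1, -1, -1, 0        # initial register values
    while True:
        op = code[PC]; PC += 1
        name = op[0]; arg = op[1] if len(op) > 1 else None
        if name == 'loadc':   SP += 1; S[SP] = arg
        elif name == 'loadr':  SP += 1; S[SP] = S[FP + arg]   # loadrc j; load
        elif name == 'storer': S[FP + arg] = S[SP]            # loadrc j; store
        elif name == 'leq':    SP -= 1; S[SP] = 1 if S[SP] <= S[SP + 1] else 0
        elif name == 'sub':    SP -= 1; S[SP] = S[SP] - S[SP + 1]
        elif name == 'mul':    SP -= 1; S[SP] = S[SP] * S[SP + 1]
        elif name == 'jump':   PC = arg
        elif name == 'jumpz':
            if S[SP] == 0: PC = arg
            SP -= 1
        elif name == 'mark':   S[SP + 1] = EP; S[SP + 2] = FP; SP += 2
        elif name == 'call':   S[SP], PC, FP = PC, S[SP], SP
        elif name == 'enter':  EP = SP + arg
        elif name == 'alloc':  SP += arg
        elif name == 'slide':  tmp = S[SP]; SP -= arg; S[SP] = tmp
        elif name == 'return':
            PC = S[FP]; EP = S[FP - 2]; SP = FP - 3; FP = S[SP + 2]
        elif name == 'halt':   return S[SP]

def fac_program(n):
    # driver: code_R n; mark; loadc _fac; call; slide 0; halt  (addresses 0..5)
    return [('loadc', n), ('mark',), ('loadc', 6), ('call',),
            ('slide', 0), ('halt',),
            # _fac at address 6
            ('enter', 5), ('alloc', 0),
            ('loadr', -3), ('loadc', 0), ('leq',), ('jumpz', 16),
            ('loadc', 1), ('storer', -3), ('return',), ('jump', 27),
            # A = 16
            ('loadr', -3), ('loadr', -3), ('loadc', 1), ('sub',),
            ('mark',), ('loadc', 6), ('call',), ('slide', 0),
            ('mul',), ('storer', -3), ('return',),
            # B = 27
            ('return',)]
```

Running fac_program(2) leaves the result 2 on top of the stack, and larger arguments exercise the recursive mark/call/return discipline.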

98 10 Translation of Whole Programs Before program execution, we have: SP = -1 FP = EP = -1 PC = 0 NP = MAX Let p ≡ V_defs F_def 1... F_def n denote a program where F_def i is the definition of a function f i of which one is called main. The code for the program p consists of: code for the function definitions F_def i ; code for the allocation of global variables; code for the call of int main(); the instruction halt which returns control to the operating system together with the value at address 0. 98

99 Then we define:

code p = enter (k + 4)
         alloc (k + 1)
         mark
         loadc _main
         call
         slide k
         halt
  _f 1 : code F_def 1 ρ
         ...
  _f n : code F_def n ρ

where ∅ = empty address environment; ρ = global address environment; k = size of the global variables 99

100 The Translation of Functional Programming Languages 100

101 11 The language PuF We only regard a mini-language PuF ( Pure Functions ). We do not treat, as yet: Side effects; Data structures; Exceptions. 101

102 A program is an expression e of the form:

e ::= b | x | (□ 1 e) | (e 1 □ 2 e 2 )
    | (if e 0 then e 1 else e 2 )
    | (e e 0... e k-1 )
    | (fun x 0... x k-1 → e)
    | (let x 1 = e 1 in e 0 )
    | (let rec x 1 = e 1 and... and x n = e n in e 0 )

An expression is therefore a basic value, a variable, the application of an operator, or a function-application, a function-abstraction, or a let-expression, i.e. an expression with locally defined variables, or a let-rec-expression, i.e. an expression with simultaneously defined local variables. For simplicity, we only allow int as basic type. 102

103 Example The following well-known function computes the factorial of a natural number: let rec fac = fun x → if x ≤ 1 then 1 else x · fac(x - 1) in fac 7 As usual, we only use the minimal amount of parentheses. There are two Semantics: CBV: Arguments are evaluated before they are passed to the function (as in SML); CBN: Arguments are passed unevaluated; they are only evaluated when their value is needed (as in Haskell). 103

104 12 Architecture of the MaMa: We know already the following components: C 0 1 PC C = Code-store contains the MaMa-program; each cell contains one instruction; PC = Program Counter points to the instruction to be executed next; 104

105 S 0 SP FP S = Runtime-Stack each cell can hold a basic value or an address; SP = Stack-Pointer points to the topmost occupied cell; as in the CMa implicitly represented; FP = Frame-Pointer points to the actual stack frame. 105

106 We also need a heap H: [Figure: the heap H, holding tagged objects; depending on its Tag, an object contains a Code Pointer, a Value, or Heap Pointers.] 106

107 ... it can be thought of as an abstract data type, being capable of holding data objects of the following form: B v 173 Basic Value C cp gp Closure F cp ap gp Function V n v[0]... v[n-1] Vector 107

108 The instruction new (tag, args) creates a corresponding object (B, C, F, V) in H and returns a reference to it. We distinguish three different kinds of code for an expression e: code V e (generates code that) computes the Value of e, stores it in the heap and returns a reference to it on top of the stack (the normal case); code B e computes the value of e, and returns it on the top of the stack (only for Basic types); code C e does not evaluate e, but stores a Closure of e in the heap and returns a reference to the closure on top of the stack. We start with the code schemata for the first two kinds: 108

109 13 Simple expressions Expressions consisting only of constants, operator applications, and conditionals are translated like expressions in imperative languages:

code B b ρ sd = loadc b
code B (□ 1 e) ρ sd = code B e ρ sd
                      op 1
code B (e 1 □ 2 e 2 ) ρ sd = code B e 1 ρ sd
                             code B e 2 ρ (sd+1)
                             op 2

109

110 code B (if e 0 then e 1 else e 2 )ρ sd = code B e 0 ρ sd jumpz A code B e 1 ρ sd jump B A: code B e 2 ρ sd B:

111 Note: ρ denotes the actual address environment, in which the expression is translated. The extra argument sd, the stack difference, simulates the movement of the SP when instruction execution modifies the stack. It is needed later to address variables. The instructions op 1 and op 2 implement the operators □ 1 and □ 2 , in the same way as the operators neg and add implement negation resp. addition in the CMa. For all other expressions, we first compute the value in the heap and then dereference the returned pointer: code B e ρ sd = code V e ρ sd getbasic 111

112 B getbasic if (H[S[SP]]!= (B,_)) Error not basic! ; else S[SP] = H[S[SP]].v; 112

113 For code V and simple expressions, we define analogously:

code V b ρ sd = loadc b; mkbasic
code V (□ 1 e) ρ sd = code B e ρ sd
                      op 1 ; mkbasic
code V (e 1 □ 2 e 2 ) ρ sd = code B e 1 ρ sd
                             code B e 2 ρ (sd+1)
                             op 2 ; mkbasic
code V (if e 0 then e 1 else e 2 ) ρ sd = code B e 0 ρ sd
                                          jumpz A
                                          code V e 1 ρ sd
                                          jump B
                                       A: code V e 2 ρ sd
                                       B: ...

113

114 17 mkbasic B 17 S[SP] = new (B,S[SP]); 114

115 14 Accessing Variables We must distinguish between local and global variables. Example Regard the function f : let c = 5 in let f = fun a → let b = a ∗ a in b + c in f c The function f uses the global variable c and the local variables a (as formal parameter) and b (introduced by the inner let). The binding of a global variable is determined, when the function is constructed (static binding!), and later only looked up. 115

116 Accessing Global Variables The bindings of global variables of an expression or a function are kept in a vector in the heap (Global Vector). They are addressed consecutively starting with 0. When an F-object or a C-object is constructed, the Global Vector for the function or the expression is determined and a reference to it is stored in the gp-component of the object. During the evaluation of an expression, the (new) register GP (Global Pointer) points to the actual Global Vector. In contrast, local variables should be administered on the stack... == General form of the address environment: ρ : Vars → {L, G} × Z 116

117 Accessing Local Variables Local variables are administered on the stack, in stack frames. Let e ≡ e′ e 0... e m-1 be the application of a function e′ to arguments e 0 ,..., e m-1. Caveat: The arity of the function f obtained for e′ does not need to be m :-) If f has arity n, then f may receive fewer than n arguments (under-supply); f may also receive more than n arguments, if its result type is again functional (over-supply). 117

118 Possible stack organisations: [figure: the F-object for e′ and the arguments e m−1, ..., e 0 pushed above FP] + Addressing of the arguments can be done relative to FP. − The local variables of e′ cannot be addressed relative to FP. − If e′ is an n-ary function with n < m, i.e., we have an over-supplied function application, the remaining m − n arguments will have to be shifted. 118

119 If e′ evaluates to a function which has already been partially applied to the parameters a 0, ..., a k−1, these have to be sneaked in underneath e 0: [figure: a 0, ..., a k−1 inserted below e 0, ..., e m−1 in the frame] 119

120 Alternative: [figure: the arguments stored in reverse, e m−1 at the bottom above FP and e 0 on top, below the F-object for e′] + The further arguments a 0, ..., a k−1 and the local variables can be allocated above the arguments. 120

121 [figure: the further arguments a 0, ..., a k−1 and the locals pushed above e 0] − Addressing of arguments and local variables relative to FP is no longer possible. (Remember: m is unknown when the function definition is translated.) 121

122 Way out: We address both arguments and local variables relative to the stack pointer SP!!! However, the stack pointer changes during program execution... [figure: the stack distance sd between the current SP and the frame base sp 0 above FP] 122

123 The difference between the current value of SP and its value sp 0 at the entry of the function body is called the stack distance, sd. Fortunately, this stack distance can be determined at compile time for each program point, by simulating the movement of the SP. The formal parameters x 0, x 1, x 2, ... successively receive the non-positive relative addresses 0, −1, −2, ..., i.e., ρ x i = (L, −i). The absolute address of the i-th formal parameter consequently is sp 0 − i = (SP − sd) − i The local let-variables y 1, y 2, y 3, ... will be successively pushed onto the stack: 123

124 [figure: the locals y 1, y 2, y 3 above sp 0; the formals x 0, x 1, ..., x k−1 below] The y i have positive relative addresses 1, 2, 3, ..., that is: ρ y i = (L, i). The absolute address of y i is then sp 0 + i = (SP − sd) + i 124

125 With CBN, we generate for the access to a variable: code V x ρ sd = getvar x ρ sd; eval The instruction eval checks whether the value has already been computed or whether its evaluation has yet to be done (== will be treated later :-) With CBV, we can just delete eval from the above code schema. The (compile-time) macro getvar is defined by: getvar x ρ sd = let (t, i) = ρ x in match t with L → pushloc (sd − i) | G → pushglob i end 125
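The getvar macro translates directly into a few lines of Python. This is a sketch; the pair encoding of the ρ entries follows the slides, the function name is an assumption:

```python
# The compile-time macro getvar: local variables are addressed relative to
# SP via the stack distance sd, globals by their index in the Global Vector.
def getvar(x, rho, sd):
    t, i = rho[x]
    if t == "L":
        return [("pushloc", sd - i)]
    return [("pushglob", i)]

rho = {"b": ("L", 1), "c": ("G", 0)}
print(getvar("b", rho, 1))  # [('pushloc', 0)]
print(getvar("c", rho, 2))  # [('pushglob', 0)]
```

The two calls reproduce exactly the accesses in the (b + c) example further below: b at (L, 1) with sd = 1 gives pushloc 0, and c at (G, 0) gives pushglob 0.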

126 The access to local variables: pushloc n: S[SP+1] = S[SP − n]; SP++; 126

127 Correctness argument: Let sp and sd be the values of the stack pointer and the stack distance, respectively, before the execution of the instruction. The value of the local variable with relative address i is loaded from S[a] with a = sp − (sd − i) = (sp − sd) + i = sp 0 + i ... exactly as it should be :-) 127
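The correctness argument is easy to check numerically; a small sketch, where the concrete values of sp 0 and sd are arbitrary:

```python
# With sp0 = sp - sd, the pushloc operand sd - i yields exactly the
# absolute address sp0 + i, for formals (i <= 0) and locals (i > 0) alike.
sp0 = 10          # value of SP at entry of the function body
sd = 4            # current stack distance
sp = sp0 + sd     # current value of SP
for i in range(-3, 4):
    a = sp - (sd - i)
    assert a == (sp - sd) + i == sp0 + i
print("pushloc addressing is consistent")
```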

128 The access to global variables is much simpler: pushglob i: SP = SP + 1; S[SP] = GP→v[i]; 128

129 Example Consider e ≡ (b + c) for ρ = {b ↦ (L, 1), c ↦ (G, 0)} and sd = 1. With CBN, we obtain:
code V e ρ 1 = getvar b ρ 1    1 pushloc 0
               eval            2 eval
               getbasic        2 getbasic
               getvar c ρ 2    2 pushglob 0
               eval            3 eval
               getbasic        3 getbasic
               add             3 add
               mkbasic         2 mkbasic 129

130 15 let-expressions As a warm-up, let us first consider the treatment of local variables :-) Let e ≡ let y 1 = e 1 in ... let y n = e n in e 0 be a nested let-expression. The translation of e must deliver an instruction sequence that allocates local variables y 1, ..., y n; in the case of CBV: evaluates e 1, ..., e n and binds the y i to their values; CBN: constructs closures for the e 1, ..., e n and binds the y i to them; evaluates the expression e 0 and returns its value. Here, we consider the non-recursive case only, i.e. where y j only depends on y 1, ..., y j−1. We obtain for CBN: 130

131 code V e ρ sd = code C e 1 ρ sd
                   code C e 2 ρ 1 (sd+1)
                   ...
                   code C e n ρ n−1 (sd+n−1)
                   code V e 0 ρ n (sd+n)
                   slide n // deallocates local variables
where ρ j = ρ ⊕ {y i ↦ (L, sd+i) | i = 1, ..., j}. In the case of CBV, we use code V also for the expressions e 1, ..., e n. Caveat! All the e i must be associated with the same binding for the global variables! 131

132 Example Consider the expression e ≡ let a = 19 in let b = a · a in a + b for ρ = ∅ and sd = 0. We obtain (for CBV):
0 loadc 19
1 mkbasic
1 pushloc 0
2 getbasic
2 pushloc 1
3 getbasic
3 mul
2 mkbasic
2 pushloc 1
3 getbasic
3 pushloc 1
4 getbasic
4 add
3 mkbasic
3 slide 2 132

133 The instruction slide k deallocates the space for the locals again: slide k: S[SP−k] = S[SP]; SP = SP − k; 133
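The effect of slide k can be sketched on a Python list acting as the stack; the encoding is illustrative only:

```python
# slide k: S[SP-k] = S[SP]; SP = SP - k. The top value survives, the k
# cells directly underneath it are discarded.
def slide(stack, k):
    if k > 0:
        stack[-k - 1] = stack[-1]
        del stack[-k:]
    return stack

print(slide([10, 20, 30, 99], 2))  # [10, 99]
```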

134 16 Function Definitions The definition of a function f requires code that allocates a functional value for f in the heap. This happens in the following steps: Creation of a Global Vector with the bindings of the free variables; Creation of an (initially empty) argument vector; Creation of an F-object, containing references to these vectors and the start address of the code for the body; Separately, code for the body has to be generated. Thus: 134

135 code V (fun x 0 ... x k−1 → e) ρ sd = getvar z 0 ρ sd
        getvar z 1 ρ (sd+1)
        ...
        getvar z g−1 ρ (sd+g−1)
        mkvec g
        mkfunval A
        jump B
   A:   targ k
        code V e ρ′ 0
        return k
   B:   ...
where {z 0, ..., z g−1} = free(fun x 0 ... x k−1 → e) and ρ′ = {x i ↦ (L, −i) | i = 0, ..., k−1} ∪ {z j ↦ (G, j) | j = 0, ..., g−1} 135
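Computing the set free(...) used in this schema is a standard recursion over the expression. A Python sketch for a fragment of the language; the tuple encoding of the AST is an assumption:

```python
# free(e) for basic values, variables, applications and functions.
def free(e):
    kind = e[0]
    if kind == "basic":                    # ("basic", b)
        return set()
    if kind == "var":                      # ("var", x)
        return {e[1]}
    if kind == "app":                      # ("app", e1, e2)
        return free(e[1]) | free(e[2])
    if kind == "fun":                      # ("fun", [x0, ..., xk-1], body)
        return free(e[2]) - set(e[1])      # formals are bound in the body
    raise ValueError(kind)

e = ("fun", ["b"], ("app", ("var", "a"), ("var", "b")))
print(sorted(free(e)))  # ['a']
```

For fun b → a b, only a is free, so the Global Vector of its F-object has a single entry, matching the example below.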

136 mkvec g: h = new (V, g); SP = SP − g + 1; for (i=0; i<g; i++) h→v[i] = S[SP + i]; S[SP] = h; 136
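mkvec g can be sketched analogously on the toy heap-as-list state (encoding assumed):

```python
# mkvec g: the topmost g references are collected into a fresh V-object;
# a reference to that object replaces them on the stack.
def mkvec(stack, heap, g):
    vec = ("V", stack[len(stack) - g:])
    del stack[len(stack) - g:]
    heap.append(vec)
    stack.append(len(heap) - 1)

heap, stack = [], [3, 7, 9]
mkvec(stack, heap, 2)
print(stack, heap)  # [3, 0] [('V', [7, 9])]
```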

137 mkfunval A: a = new (V, 0); S[SP] = new (F, A, a, S[SP]); 137

138 Example Consider f ≡ fun b → a + b for ρ = {a ↦ (L, 1)} and sd = 1. code V f ρ 1 produces:
1 pushloc 0
2 mkvec 1
2 mkfunval A
2 jump B
0 A: targ 1
0 pushglob 0
1 eval
1 getbasic
1 pushloc 1
2 eval
2 getbasic
2 add
1 mkbasic
1 return 1
2 B: ...
The secrets around targ k and return k will be revealed later :-) 138

139 17 Function Application Function applications correspond to function calls in C. The necessary actions for the evaluation of e′ e 0 ... e m−1 are: Allocation of a stack frame; Transfer of the actual parameters, i.e. with: CBV: Evaluation of the actual parameters; CBN: Allocation of closures for the actual parameters; Evaluation of the expression e′ to an F-object; Application of the function. Thus for CBN: 139

140 code V (e′ e 0 ... e m−1) ρ sd = mark A // allocation of the frame
    code C e m−1 ρ (sd+3)
    code C e m−2 ρ (sd+4)
    ...
    code C e 0 ρ (sd+m+2)
    code V e′ ρ (sd+m+3) // evaluation of e′
    apply // corresponds to call
A:  ...
To implement CBV, we use code V instead of code C for the arguments e i. Example For (f 42), ρ = {f ↦ (L, 2)} and sd = 2, we obtain with CBV:
2 mark A
5 loadc 42
6 mkbasic
6 pushloc 4
7 apply
3 A: ...

141 A Slightly Larger Example let a = 17 in let f = fun b → a + b in f 42 For CBV and sd = 0 we obtain:
0 loadc 17
1 mkbasic
1 pushloc 0
2 mkvec 1
2 mkfunval A
2 jump B
0 A: targ 1
0 pushglob 0
1 getbasic
1 pushloc 1
2 getbasic
2 add
1 mkbasic
1 return 1
2 B: mark C
5 loadc 42
6 mkbasic
6 pushloc 4
7 apply
3 C: slide 2 141

142 For the implementation of the new instructions, we must fix the organization of a stack frame: [figure: above FP the arguments and the local stack up to SP; the frame contains 3 organizational cells, with PCold at offset 0 (where FP points), FPold at offset −1 and GPold at offset −2] 142

143 Different from the CMa, the instruction mark A already saves the return address: mark A: S[SP+1] = GP; S[SP+2] = FP; S[SP+3] = A; FP = SP = SP + 3; 143
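The frame layout established by mark A can be sketched on a small explicit state (the dict encoding is an assumption):

```python
# mark A: save GP, FP and the return address A; afterwards FP points at
# the saved return address, the topmost of the 3 organisational cells.
def mark(state, A):
    s = state["stack"]
    s.extend([state["GP"], state["FP"], A])
    state["FP"] = len(s) - 1
    return state

st = {"stack": [], "GP": 5, "FP": -1}
mark(st, 42)
print(st)  # {'stack': [5, -1, 42], 'GP': 5, 'FP': 2}
```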

144 The instruction apply unpacks the F-object, a reference to which (hopefully) resides on top of the stack, and continues execution at the address given there:
h = S[SP];
if (H[h] != (F,_,_)) Error "no fun";
else {
  GP = h→gp; PC = h→cp;
  for (i=0; i < h→ap→n; i++) S[SP+i] = h→ap→v[i];
  SP = SP + h→ap→n − 1;
} 144
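apply itself can be sketched on the same toy state; the F-object encoding ("F", cp, ap, gp) and all names are assumptions:

```python
# apply: pop the F-object reference, load its code and global pointers,
# and copy the argument vector onto the stack. The arguments are stored
# so that ap[n-1], the first argument reference, ends up on top.
def apply(state):
    h = state["heap"][state["stack"].pop()]
    if h[0] != "F":
        raise RuntimeError("no fun")
    _, cp, ap, gp = h
    state["PC"], state["GP"] = cp, gp
    state["stack"].extend(ap)

st = {"heap": [("F", 42, [7, 8, 9], 0)], "stack": [0], "PC": 0, "GP": -1}
apply(st)
print(st["PC"], st["stack"])  # 42 [7, 8, 9]
```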

145 Caveat: The last element of the argument vector is the last to be put onto the stack. It must therefore be the first argument reference. This should be kept in mind when we treat the packing of the arguments of an under-supplied function application into an F-object!!! 145

146 18 Over- and Undersupply of Arguments The first instruction to be executed when entering a function body, i.e., after an apply, is targ k. This instruction checks whether there are enough arguments to evaluate the body. Only if this is the case, the execution of the code for the body is started. Otherwise, i.e. in the case of under-supply, a new F-object is returned. The test for the number of arguments uses: SP − FP 146

147 targ k is a complex instruction. We decompose its execution in the case of under-supply into several steps:
targ k = if (SP − FP < k) {
    mkvec0;  // creating the argument vector
    wrap;    // wrapping into an F-object
    popenv;  // popping the stack frame
}
The combination of these steps into one instruction is a kind of optimization :-) 147

148 The instruction mkvec0 takes all references from the stack above FP and stores them into a vector:
g = SP − FP;
h = new (V, g);
SP = FP + 1;
for (i=0; i<g; i++) h→v[i] = S[SP + i];
S[SP] = h; 148

149 The instruction wrap wraps the argument vector together with the global vector and PC − 1 into an F-object: S[SP] = new (F, PC−1, S[SP], GP); 149

150 The instruction popenv finally releases the stack frame:
GP = S[FP−2];
S[FP−2] = S[SP];
PC = S[FP];
SP = FP − 2;
FP = S[FP−1]; 150

151 Thus, we obtain for targ k in the case of under-supply: [figures, slides 151–154: starting from a frame with too few arguments, mkvec0 packs them into an argument vector, wrap creates the F-object (F, PC−1, ap, gp), and popenv releases the frame, leaving the F-object as the result] 154

155 The stack frame can be released after the execution of the body if exactly the right number of arguments was available. If there is an over-supply of arguments, the body must evaluate to a function, which consumes the rest of the arguments... The check for this is done by return k:
return k = if (SP − FP == k + 1)
    popenv;   // done
else {        // there are more arguments
    slide k;
    apply;    // another application
}
The execution of return k results in: 155

156 Case: Done. popenv releases the frame and returns the computed value. 156
157 Case: Over-supply. slide k squeezes out the k consumed arguments; apply then applies the resulting F-object to the remaining ones. 157

158 19 let-rec-expressions Consider the expression e ≡ let rec y 1 = e 1 and ... and y n = e n in e 0. The translation of e must deliver an instruction sequence that allocates local variables y 1, ..., y n; in the case of CBV: evaluates e 1, ..., e n and binds the y i to their values; CBN: constructs closures for the e 1, ..., e n and binds the y i to them; evaluates the expression e 0 and returns its value. Caveat: In a letrec-expression, the definitions can use variables that will be allocated only later! == Dummy values are put onto the stack before processing the definitions. 158

159 For CBN, we obtain:
code V e ρ sd = alloc n // allocates local variables
    code C e 1 ρ′ (sd+n)
    rewrite n
    ...
    code C e n ρ′ (sd+n)
    rewrite 1
    code V e 0 ρ′ (sd+n)
    slide n // deallocates local variables
where ρ′ = ρ ⊕ {y i ↦ (L, sd+i) | i = 1, ..., n}. In the case of CBV, we also use code V for the expressions e 1, ..., e n. Caveat: Recursive definitions of basic values are undefined with CBV!!! 159

160 Example Consider the expression e ≡ let rec f = fun x y → if y ≤ 1 then x else f (x · y) (y − 1) in f 1 for ρ = ∅ and sd = 0. We obtain (for CBV):
0 alloc 1
1 pushloc 0
2 mkvec 1
2 mkfunval A
2 jump B
0 A: targ 2
   ... // code for the body
1 return 2
2 B: rewrite 1
1 mark C
4 loadc 1
5 mkbasic
5 pushloc 4
6 apply
2 C: slide 1 160

161 The instruction alloc n reserves n cells on the stack and initialises them with n dummy nodes: alloc n C C C C n for (i=1; i<=n; i++) S[SP+i] = new (C,-1,-1); SP = SP + n; 161

162 The instruction rewrite n overwrites the contents of the heap cell pointed to by the reference at S[SP − n]: H[S[SP−n]] = H[S[SP]]; SP = SP − 1; The reference S[SP − n] remains unchanged! Only its contents are changed! 162
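This reference semantics of rewrite is the crux of letrec; a Python sketch with the heap-as-list encoding (names and encoding are assumptions):

```python
# alloc n pushes references to n dummy C-nodes; rewrite n copies heap
# *contents*, while the reference at S[SP-n] itself is left untouched.
heap, stack = [], []

def alloc(n):
    for _ in range(n):
        heap.append(("C", -1, -1))         # dummy closure node
        stack.append(len(heap) - 1)

def rewrite(n):
    heap[stack[-1 - n]] = heap[stack[-1]]  # H[S[SP-n]] = H[S[SP]]
    stack.pop()                            # SP = SP - 1

alloc(1)
ref = stack[0]
heap.append(("B", 7)); stack.append(len(heap) - 1)  # push a (B, 7) reference
rewrite(1)
print(stack, heap[ref])  # [0] ('B', 7)
```

After rewrite 1 the old reference ref still works: all earlier copies of it now see the value (B, 7), which is exactly what ties recursive knots.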

163 20 Closures and their Evaluation Closures are needed in the implementation of CBN for let- and letrec-expressions as well as for actual parameters of functions. Before the value of a variable is accessed (with CBN), this value must be available. Otherwise, a stack frame must be created to determine this value. This task is performed by the instruction eval. 163

164 eval can be decomposed into small actions:
eval = if (H[S[SP]] ≡ (C, _, _)) {
    mark0;      // allocation of the stack frame
    pushloc 3;  // copying of the reference
    apply0;     // corresponds to apply
}
A closure can be understood as a parameterless function. Thus, there is no need for an ap-component. Evaluation of the closure means evaluation of an application of this function to 0 arguments. In contrast to mark A, mark0 dumps the current PC. The difference between apply and apply0 is that no argument vector is put on the stack. 164

165 mark0: S[SP+1] = GP; S[SP+2] = FP; S[SP+3] = PC; FP = SP = SP + 3; 165

166 apply0: h = S[SP]; SP--; GP = h→gp; PC = h→cp; We thus obtain for the instruction eval: 166

167 [figures, slides 167–168: eval on a C-object at code address 42. mark0 saves GP, FP and the current PC, pushloc 3 copies the closure reference to the top, and apply0 enters the closure's code with its global vector] 168

169 The construction of a closure for an expression e consists of: Packing the bindings for the free variables into a vector; Creation of a C-object, which contains a reference to this vector and to the code for the evaluation of e:
code C e ρ sd = getvar z 0 ρ sd
    getvar z 1 ρ (sd+1)
    ...
    getvar z g−1 ρ (sd+g−1)
    mkvec g
    mkclos A
    jump B
A:  code V e ρ′ 0
    update
B:  ...
where {z 0, ..., z g−1} = free(e) and ρ′ = {z i ↦ (G, i) | i = 0, ..., g−1}. 169

170 Example Consider e ≡ a · a with ρ = {a ↦ (L, 0)} and sd = 1. We obtain:
1 pushloc 1
2 mkvec 1
2 mkclos A
2 jump B
0 A: pushglob 0
1 eval
1 getbasic
1 pushglob 0
2 eval
2 getbasic
2 mul
1 mkbasic
1 update
2 B: ... 170

171 The instruction mkclos A is analogous to the instruction mkfunval A. It generates a C-object, where the included code pointer is A. mkclos A: S[SP] = new (C, A, S[SP]); 171

172 In fact, the instruction update is the combination of the two actions: popenv; rewrite 1 It overwrites the closure with the computed value. 172

173 21 Optimizations I: Global Variables Observation: Functional programs construct many F- and C-objects. This requires the inclusion of (the bindings of) all global variables. Recall, e.g., the construction of a closure for an expression e

174 code C e ρ sd = getvar z 0 ρ sd
    getvar z 1 ρ (sd+1)
    ...
    getvar z g−1 ρ (sd+g−1)
    mkvec g
    mkclos A
    jump B
A:  code V e ρ′ 0
    update
B:  ...
where {z 0, ..., z g−1} = free(e) and ρ′ = {z i ↦ (G, i) | i = 0, ..., g−1}. 174

175 Idea: Reuse Global Vectors, i.e. share Global Vectors! Profitable in the translation of let-expressions or function applications: Build one Global Vector for the union of the free-variable sets of all let-definitions or of all arguments, respectively. Allocate (references to) Global Vectors with multiple uses in the stack frame like local variables! Support the access to the current GP, e.g., by an instruction copyglob: 175

176 copyglob: SP++; S[SP] = GP; 176

177 The optimization will cause Global Vectors to contain more components than just references to the free variables that occur in one expression... Disadvantage: Superfluous components in Global Vectors prevent the deallocation of already useless heap objects == space leaks :-( Potential remedy: Deletion of references from the Global Vector at the end of their lifetimes. 177

178 22 Optimizations II: Closures In some cases, the construction of closures can be avoided, namely for Basic values, Variables, Functions. 178

179 Basic Values: The construction of a closure for the value is at least as expensive as the construction of the B-object itself! Therefore: code C b ρ sd = code V b ρ sd = loadc b; mkbasic This replaces:
mkvec 0
mkclos A
jump B
A: loadc b
   mkbasic
   update
B: ... 179

180 Variables: Variables are either bound to values or to C-objects. Constructing another closure is therefore superfluous. Therefore: code C x ρ sd = getvar x ρ sd This replaces:
getvar x ρ sd
mkvec 1
mkclos A
jump B
A: pushglob 0
   eval
   update
B: ... 180

181 Example Consider e ≡ let rec a = b and b = 7 in a. code V e ∅ 0 produces:
0 alloc 2
2 pushloc 0
3 rewrite 2
2 loadc 7
3 mkbasic
3 rewrite 1
2 pushloc 1
3 eval
3 slide 2
The execution of this instruction sequence should deliver the basic value 7. 181

The step-by-step execution (slides 182–190):
182 alloc 2: two dummy C-objects are allocated, their references pushed.
183 pushloc 0: the reference to b's dummy node is pushed (for a = b).
184 rewrite 2: a's heap cell is overwritten with a copy of b's dummy node.
185 loadc 7: the constant 7 is pushed.
186 mkbasic: the 7 is wrapped into a B-object.
187 rewrite 1: b's heap cell is overwritten with (B, 7).
188 pushloc 1: the reference to a's cell is pushed.
189 eval: a's cell still contains the copy of the old dummy node...
190 Segmentation Fault!! 190

191 Apparently, this optimization was not quite correct :-( The Problem: The binding of variable y to variable x is established before x's dummy node is replaced!! The Solution: == cyclic definitions: reject sequences of definitions like let rec a = b and ... b = a in ...; acyclic definitions: order the definitions y = x such that the dummy node for the right side x is already overwritten. 191
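The acyclicity check for variable-to-variable definitions amounts to a simple depth-first traversal; a sketch, where the dict encoding of the definitions is an assumption:

```python
# Order definitions y = x so that x's node is rewritten before y's; a
# cycle such as `let rec a = b and b = a` raises an error.
def order_aliases(defs):
    order, state = [], {}
    def visit(y):
        if state.get(y) == "done" or y not in defs:
            return                          # finished, or not an alias
        if state.get(y) == "active":
            raise ValueError("cyclic definition")
        state[y] = "active"
        visit(defs[y])                      # the right-hand side first
        state[y] = "done"
        order.append(y)
    for y in defs:
        visit(y)
    return order

print(order_aliases({"a": "b", "b": "c"}))  # ['b', 'a'] (c defined elsewhere)
```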

192 Functions: Functions are values, which are not evaluated further. Instead of generating code that constructs a closure for an F-object, we generate code that constructs the F-object directly. Therefore: code C (fun x 0... x k 1 e) ρ sd = code V (fun x 0... x k 1 e) ρ sd 192

193 23 The Translation of a Program Expression Execution of a program e starts with PC = 0, SP = FP = GP = −1. The expression e must not contain free variables. The value of e should be determined and then a halt instruction should be executed. code e = code V e ∅ 0; halt 193

194 Remarks: The code schemata as defined so far produce spaghetti code. Reason: The code for function bodies and closures is placed directly behind the instructions mkfunval and mkclos, respectively, with a jump over this code. Alternative: Place this code somewhere else, e.g. following the halt instruction: Advantage: Elimination of the direct jumps following mkfunval and mkclos. Disadvantage: The code schemata are more complex, as they would have to accumulate the code pieces in a Code-Dump. Solution: == Disentangle the spaghetti code in a subsequent optimization phase :-) 194

195 Example let a = 17 in let f = fun b → a + b in f 42 Disentanglement of the jumps produces:
0 loadc 17
1 mkbasic
1 pushloc 0
2 mkvec 1
2 mkfunval A
2 mark B
5 loadc 42
6 mkbasic
6 pushloc 4
7 eval
7 apply
3 B: slide 2
1 halt
0 A: targ 1
0 pushglob 0
1 eval
1 getbasic
1 pushloc 1
2 eval
2 getbasic
2 add
1 mkbasic
1 return 1 195


More information

Review of the C Programming Language for Principles of Operating Systems

Review of the C Programming Language for Principles of Operating Systems Review of the C Programming Language for Principles of Operating Systems Prof. James L. Frankel Harvard University Version of 7:26 PM 4-Sep-2018 Copyright 2018, 2016, 2015 James L. Frankel. All rights

More information

Chapter 9 Subroutines and Control Abstraction. June 22, 2016

Chapter 9 Subroutines and Control Abstraction. June 22, 2016 Chapter 9 Subroutines and Control Abstraction June 22, 2016 Stack layout Common to have subroutine activation record allocated on a stack Typical fare for a frame: arguments, return value, saved registers,

More information

Intermediate Code Generation

Intermediate Code Generation Intermediate Code Generation In the analysis-synthesis model of a compiler, the front end analyzes a source program and creates an intermediate representation, from which the back end generates target

More information

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING B.E SECOND SEMESTER CS 6202 PROGRAMMING AND DATA STRUCTURES I TWO MARKS UNIT I- 2 MARKS

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING B.E SECOND SEMESTER CS 6202 PROGRAMMING AND DATA STRUCTURES I TWO MARKS UNIT I- 2 MARKS DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING B.E SECOND SEMESTER CS 6202 PROGRAMMING AND DATA STRUCTURES I TWO MARKS UNIT I- 2 MARKS 1. Define global declaration? The variables that are used in more

More information

Lecture 6: The Declarative Kernel Language Machine. September 13th, 2011

Lecture 6: The Declarative Kernel Language Machine. September 13th, 2011 Lecture 6: The Declarative Kernel Language Machine September 13th, 2011 Lecture Outline Computations contd Execution of Non-Freezable Statements on the Abstract Machine The skip Statement The Sequential

More information

Branch Addressing. Jump Addressing. Target Addressing Example. The University of Adelaide, School of Computer Science 28 September 2015

Branch Addressing. Jump Addressing. Target Addressing Example. The University of Adelaide, School of Computer Science 28 September 2015 Branch Addressing Branch instructions specify Opcode, two registers, target address Most branch targets are near branch Forward or backward op rs rt constant or address 6 bits 5 bits 5 bits 16 bits PC-relative

More information

CSCI B522 Lecture 11 Naming and Scope 8 Oct, 2009

CSCI B522 Lecture 11 Naming and Scope 8 Oct, 2009 CSCI B522 Lecture 11 Naming and Scope 8 Oct, 2009 Lecture notes for CS 6110 (Spring 09) taught by Andrew Myers at Cornell; edited by Amal Ahmed, Fall 09. 1 Static vs. dynamic scoping The scope of a variable

More information

Today. Putting it all together

Today. Putting it all together Today! One complete example To put together the snippets of assembly code we have seen! Functions in MIPS Slides adapted from Josep Torrellas, Craig Zilles, and Howard Huang Putting it all together! Count

More information

CS 536 Introduction to Programming Languages and Compilers Charles N. Fischer Lecture 11

CS 536 Introduction to Programming Languages and Compilers Charles N. Fischer Lecture 11 CS 536 Introduction to Programming Languages and Compilers Charles N. Fischer Lecture 11 CS 536 Spring 2015 1 Handling Overloaded Declarations Two approaches are popular: 1. Create a single symbol table

More information

Lectures 5. Announcements: Today: Oops in Strings/pointers (example from last time) Functions in MIPS

Lectures 5. Announcements: Today: Oops in Strings/pointers (example from last time) Functions in MIPS Lectures 5 Announcements: Today: Oops in Strings/pointers (example from last time) Functions in MIPS 1 OOPS - What does this C code do? int foo(char *s) { int L = 0; while (*s++) { ++L; } return L; } 2

More information

Intermediate Programming, Spring 2017*

Intermediate Programming, Spring 2017* 600.120 Intermediate Programming, Spring 2017* Misha Kazhdan *Much of the code in these examples is not commented because it would otherwise not fit on the slides. This is bad coding practice in general

More information

Machine Language Instructions Introduction. Instructions Words of a language understood by machine. Instruction set Vocabulary of the machine

Machine Language Instructions Introduction. Instructions Words of a language understood by machine. Instruction set Vocabulary of the machine Machine Language Instructions Introduction Instructions Words of a language understood by machine Instruction set Vocabulary of the machine Current goal: to relate a high level language to instruction

More information

CS558 Programming Languages

CS558 Programming Languages CS558 Programming Languages Winter 2017 Lecture 4a Andrew Tolmach Portland State University 1994-2017 Semantics and Erroneous Programs Important part of language specification is distinguishing valid from

More information

Name, Scope, and Binding. Outline [1]

Name, Scope, and Binding. Outline [1] Name, Scope, and Binding In Text: Chapter 3 Outline [1] Variable Binding Storage bindings and lifetime Type bindings Type Checking Scope Lifetime vs. Scope Referencing Environments N. Meng, S. Arthur 2

More information

Advanced Programming & C++ Language

Advanced Programming & C++ Language Advanced Programming & C++ Language ~6~ Introduction to Memory Management Ariel University 2018 Dr. Miri (Kopel) Ben-Nissan Stack & Heap 2 The memory a program uses is typically divided into four different

More information

Project. there are a couple of 3 person teams. a new drop with new type checking is coming. regroup or see me or forever hold your peace

Project. there are a couple of 3 person teams. a new drop with new type checking is coming. regroup or see me or forever hold your peace Project there are a couple of 3 person teams regroup or see me or forever hold your peace a new drop with new type checking is coming using it is optional 1 Compiler Architecture source code Now we jump

More information

1 Lexical Considerations

1 Lexical Considerations Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.035, Spring 2013 Handout Decaf Language Thursday, Feb 7 The project for the course is to write a compiler

More information

Informatica 3 Marcello Restelli

Informatica 3 Marcello Restelli Informatica 3 Marcello Restelli 9/15/07 Laurea in Ingegneria Informatica Politecnico di Milano Operational Semantics We describe the semantics of a programming language by means of an abstract semantic

More information

Compiler Construction

Compiler Construction Compiler Construction Thomas Noll Software Modeling and Verification Group RWTH Aachen University https://moves.rwth-aachen.de/teaching/ss-17/cc/ Generation of Intermediate Code Conceptual Structure of

More information

Intermediate Representations

Intermediate Representations Intermediate Representations A variety of intermediate representations are used in compilers Most common intermediate representations are: Abstract Syntax Tree Directed Acyclic Graph (DAG) Three-Address

More information

https://lambda.mines.edu A pointer is a value that indicates location in memory. When we change the location the pointer points to, we say we assign the pointer a value. When we look at the data the pointer

More information

Problem with Scanning an Infix Expression

Problem with Scanning an Infix Expression Operator Notation Consider the infix expression (X Y) + (W U), with parentheses added to make the evaluation order perfectly obvious. This is an arithmetic expression written in standard form, called infix

More information

SE352b: Roadmap. SE352b Software Engineering Design Tools. W3: Programming Paradigms

SE352b: Roadmap. SE352b Software Engineering Design Tools. W3: Programming Paradigms SE352b Software Engineering Design Tools W3: Programming Paradigms Feb. 3, 2005 SE352b, ECE,UWO, Hamada Ghenniwa SE352b: Roadmap CASE Tools: Introduction System Programming Tools Programming Paradigms

More information

Data Structure for Language Processing. Bhargavi H. Goswami Assistant Professor Sunshine Group of Institutions

Data Structure for Language Processing. Bhargavi H. Goswami Assistant Professor Sunshine Group of Institutions Data Structure for Language Processing Bhargavi H. Goswami Assistant Professor Sunshine Group of Institutions INTRODUCTION: Which operation is frequently used by a Language Processor? Ans: Search. This

More information

Project 3 Due October 21, 2015, 11:59:59pm

Project 3 Due October 21, 2015, 11:59:59pm Project 3 Due October 21, 2015, 11:59:59pm 1 Introduction In this project, you will implement RubeVM, a virtual machine for a simple bytecode language. Later in the semester, you will compile Rube (a simplified

More information

LECTURE 17. Expressions and Assignment

LECTURE 17. Expressions and Assignment LECTURE 17 Expressions and Assignment EXPRESSION SYNTAX An expression consists of An atomic object, e.g. number or variable. An operator (or function) applied to a collection of operands (or arguments)

More information

Semantic analysis and intermediate representations. Which methods / formalisms are used in the various phases during the analysis?

Semantic analysis and intermediate representations. Which methods / formalisms are used in the various phases during the analysis? Semantic analysis and intermediate representations Which methods / formalisms are used in the various phases during the analysis? The task of this phase is to check the "static semantics" and generate

More information

CS558 Programming Languages

CS558 Programming Languages CS558 Programming Languages Fall 2017 Lecture 3a Andrew Tolmach Portland State University 1994-2017 Binding, Scope, Storage Part of being a high-level language is letting the programmer name things: variables

More information

Compiler Construction

Compiler Construction Compiler Construction Thomas Noll Software Modeling and Verification Group RWTH Aachen University https://moves.rwth-aachen.de/teaching/ss-17/cc/ Generation of Intermediate Code Outline of Lecture 15 Generation

More information

Virtual Machine Tutorial

Virtual Machine Tutorial Virtual Machine Tutorial CSA2201 Compiler Techniques Gordon Mangion Virtual Machine A software implementation of a computing environment in which an operating system or program can be installed and run.

More information

Concepts Introduced in Chapter 7

Concepts Introduced in Chapter 7 Concepts Introduced in Chapter 7 Storage Allocation Strategies Static Stack Heap Activation Records Access to Nonlocal Names Access links followed by Fig. 7.1 EECS 665 Compiler Construction 1 Activation

More information

Thomas Polzer Institut für Technische Informatik

Thomas Polzer Institut für Technische Informatik Thomas Polzer tpolzer@ecs.tuwien.ac.at Institut für Technische Informatik Branch to a labeled instruction if a condition is true Otherwise, continue sequentially beq rs, rt, L1 if (rs == rt) branch to

More information

Chapter 2. Computer Abstractions and Technology. Lesson 4: MIPS (cont )

Chapter 2. Computer Abstractions and Technology. Lesson 4: MIPS (cont ) Chapter 2 Computer Abstractions and Technology Lesson 4: MIPS (cont ) Logical Operations Instructions for bitwise manipulation Operation C Java MIPS Shift left >>> srl Bitwise

More information

! Those values must be stored somewhere! Therefore, variables must somehow be bound. ! How?

! Those values must be stored somewhere! Therefore, variables must somehow be bound. ! How? A Binding Question! Variables are bound (dynamically) to values Subprogram Activation! Those values must be stored somewhere! Therefore, variables must somehow be bound to memory locations! How? Function

More information

Instructions: Assembly Language

Instructions: Assembly Language Chapter 2 Instructions: Assembly Language Reading: The corresponding chapter in the 2nd edition is Chapter 3, in the 3rd edition it is Chapter 2 and Appendix A and in the 4th edition it is Chapter 2 and

More information

CS143 - Written Assignment 4 Reference Solutions

CS143 - Written Assignment 4 Reference Solutions CS143 - Written Assignment 4 Reference Solutions 1. Consider the following program in Cool, representing a slightly over-engineered implementation which calculates the factorial of 3 using an operator

More information

Lectures 5-6: Introduction to C

Lectures 5-6: Introduction to C Lectures 5-6: Introduction to C Motivation: C is both a high and a low-level language Very useful for systems programming Faster than Java This intro assumes knowledge of Java Focus is on differences Most

More information

Principles of Programming Languages Topic: Scope and Memory Professor Louis Steinberg Fall 2004

Principles of Programming Languages Topic: Scope and Memory Professor Louis Steinberg Fall 2004 Principles of Programming Languages Topic: Scope and Memory Professor Louis Steinberg Fall 2004 CS 314, LS,BR,LTM: Scope and Memory 1 Review Functions as first-class objects What can you do with an integer?

More information

Code Generation & Parameter Passing

Code Generation & Parameter Passing Code Generation & Parameter Passing Lecture Outline 1. Allocating temporaries in the activation record Let s optimize our code generator a bit 2. A deeper look into calling sequences Caller/Callee responsibilities

More information

Lecture 14. No in-class files today. Homework 7 (due on Wednesday) and Project 3 (due in 10 days) posted. Questions?

Lecture 14. No in-class files today. Homework 7 (due on Wednesday) and Project 3 (due in 10 days) posted. Questions? Lecture 14 No in-class files today. Homework 7 (due on Wednesday) and Project 3 (due in 10 days) posted. Questions? Friday, February 11 CS 215 Fundamentals of Programming II - Lecture 14 1 Outline Static

More information

CS4215 Programming Language Implementation. Martin Henz

CS4215 Programming Language Implementation. Martin Henz CS4215 Programming Language Implementation Martin Henz Thursday 15 March, 2012 2 Chapter 11 impl: A Simple Imperative Language 11.1 Introduction So far, we considered only languages, in which an identifier

More information

MIPS Programming. A basic rule is: try to be mechanical (that is, don't be "tricky") when you translate high-level code into assembler code.

MIPS Programming. A basic rule is: try to be mechanical (that is, don't be tricky) when you translate high-level code into assembler code. MIPS Programming This is your crash course in assembler programming; you will teach yourself how to program in assembler for the MIPS processor. You will learn how to use the instruction set summary to

More information

CS 270 Algorithms. Oliver Kullmann. Binary search. Lists. Background: Pointers. Trees. Implementing rooted trees. Tutorial

CS 270 Algorithms. Oliver Kullmann. Binary search. Lists. Background: Pointers. Trees. Implementing rooted trees. Tutorial Week 7 General remarks Arrays, lists, pointers and 1 2 3 We conclude elementary data structures by discussing and implementing arrays, lists, and trees. Background information on pointers is provided (for

More information

Parsing Scheme (+ (* 2 3) 1) * 1

Parsing Scheme (+ (* 2 3) 1) * 1 Parsing Scheme + (+ (* 2 3) 1) * 1 2 3 Compiling Scheme frame + frame halt * 1 3 2 3 2 refer 1 apply * refer apply + Compiling Scheme make-return START make-test make-close make-assign make- pair? yes

More information

Chapter 2A Instructions: Language of the Computer

Chapter 2A Instructions: Language of the Computer Chapter 2A Instructions: Language of the Computer Copyright 2009 Elsevier, Inc. All rights reserved. Instruction Set The repertoire of instructions of a computer Different computers have different instruction

More information

7 Translation to Intermediate Code

7 Translation to Intermediate Code 7 Translation to Intermediate Code ( 7. Translation to Intermediate Code, p. 150) This chpater marks the transition from the source program analysis phase to the target program synthesis phase. All static

More information

C Programming SYLLABUS COVERAGE SYLLABUS IN DETAILS

C Programming SYLLABUS COVERAGE SYLLABUS IN DETAILS C Programming C SYLLABUS COVERAGE Introduction to Programming Fundamentals in C Operators and Expressions Data types Input-Output Library Functions Control statements Function Storage class Pointer Pointer

More information

CIT Week13 Lecture

CIT Week13 Lecture CIT 3136 - Week13 Lecture Runtime Environments During execution, allocation must be maintained by the generated code that is compatible with the scope and lifetime rules of the language. Typically there

More information

Compilers. 8. Run-time Support. Laszlo Böszörmenyi Compilers Run-time - 1

Compilers. 8. Run-time Support. Laszlo Böszörmenyi Compilers Run-time - 1 Compilers 8. Run-time Support Laszlo Böszörmenyi Compilers Run-time - 1 Run-Time Environment A compiler needs an abstract model of the runtime environment of the compiled code It must generate code for

More information

Compilers and Code Optimization EDOARDO FUSELLA

Compilers and Code Optimization EDOARDO FUSELLA Compilers and Code Optimization EDOARDO FUSELLA Contents Data memory layout Instruction selection Register allocation Data memory layout Memory Hierarchy Capacity vs access speed Main memory Classes of

More information

Intermediate Representation (IR)

Intermediate Representation (IR) Intermediate Representation (IR) Components and Design Goals for an IR IR encodes all knowledge the compiler has derived about source program. Simple compiler structure source code More typical compiler

More information

Functional Programming. Pure Functional Programming

Functional Programming. Pure Functional Programming Functional Programming Pure Functional Programming Computation is largely performed by applying functions to values. The value of an expression depends only on the values of its sub-expressions (if any).

More information

Run-time Environments

Run-time Environments Run-time Environments Status We have so far covered the front-end phases Lexical analysis Parsing Semantic analysis Next come the back-end phases Code generation Optimization Register allocation Instruction

More information

Computer Architecture

Computer Architecture Computer Architecture Chapter 2 Instructions: Language of the Computer Fall 2005 Department of Computer Science Kent State University Assembly Language Encodes machine instructions using symbols and numbers

More information

Run-time Environments

Run-time Environments Run-time Environments Status We have so far covered the front-end phases Lexical analysis Parsing Semantic analysis Next come the back-end phases Code generation Optimization Register allocation Instruction

More information