Code Generation
Frédéric Haziza <daz@it.uu.se>
Department of Computer Systems, Uppsala University
Spring 2008
Operating Systems
* Process Management
* Memory Management
* Storage Management
Compilers
* Compiling process & Lexical analysis
* Parsing
* Semantic analysis & Code generation
Analysis
* Lexical analysis
* Syntax analysis
* Semantic analysis
Synthesis
* Machine-independent code generation
* Optimization of machine-independent code
* Storage allocation
* Machine code generation
* Optimization of machine code
Outline
1 Intermediate code
2 Machine code
3 Optimizations
OSKomp 08 Code Generation
Good IR
A good intermediate representation (IR) is:
* easy to translate from the AST
* easy to translate to assembly/machine code
* easy to optimize
* easy to retarget
Well-known examples
1 Three-address code
2 P-code (for Pascal)
3 Bytecode (for Java)
P-Code
Stack-based intermediate code for Pascal.
Instruction format: F P Q
* F is the function (operation) code
* P and Q may be absent
* P specifies a static block level
* Q is an offset within a frame, or an immediate operand (i.e. a constant)
Compile-time addresses are pairs (static level, offset).
Instructions
* Instructions with no parameters operate on the top of the stack: AND, DIF, NGI, FLT, FLO, INN
* One- or two-address instructions load or store a value on the top of the stack: LDCI, LODI, LDA, STRI
* Jump instructions:
    UJP L7    unconditional jump to L7
    FJP L8    jump to L8 if the top of the stack is false
* Labels (jump targets)
if (expression) statement1 else statement2

        code to put the value of expression on top of the stack
        FJP L1
        code to implement statement1
        UJP L2
    L1: code to implement statement2
    L2:

while (expression) statement

    L1: code to put the value of expression on top of the stack
        FJP L2
        code to implement statement
        UJP L1
    L2:
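The two translation schemes above can be sketched as small code-emitting functions. This is only an illustration: the `LabelGen` class, the function names and the representation of code as lists of strings are assumptions made for the example, not part of any real P-code compiler.

```python
class LabelGen:
    """Hands out fresh labels L1, L2, ... (hypothetical helper)."""

    def __init__(self):
        self._next = 1

    def fresh(self):
        label = f"L{self._next}"
        self._next += 1
        return label


def gen_if(labels, cond_code, then_code, else_code):
    """Emit code for: if (expression) statement1 else statement2."""
    l_else = labels.fresh()
    l_end = labels.fresh()
    return (
        cond_code                # leaves the condition on top of the stack
        + [f"FJP {l_else}"]      # condition false: skip statement1
        + then_code
        + [f"UJP {l_end}"]       # statement1 done: skip statement2
        + [f"{l_else}:"]
        + else_code
        + [f"{l_end}:"]
    )


def gen_while(labels, cond_code, body_code):
    """Emit code for: while (expression) statement."""
    l_top = labels.fresh()
    l_exit = labels.fresh()
    return (
        [f"{l_top}:"]
        + cond_code              # re-evaluated on every iteration
        + [f"FJP {l_exit}"]      # condition false: leave the loop
        + body_code
        + [f"UJP {l_top}"]       # back to the test
        + [f"{l_exit}:"]
    )
```

Because the code for the condition and the statements is passed in as already generated lists, the same functions nest naturally: the body of a `while` may itself be the result of `gen_if`.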
Java Run-Time system
* Execution engine: executes bytecode instructions
* Memory manager: manages the heap in which all objects and arrays are stored
* Error and exception manager: catches runtime failures in a planned and systematic manner
* Threads interface: handles concurrency
* Class loader: loads, links and initializes classes
* Security manager: deals with attempts to run hostile programs
Principal types of bytecode instructions
* Stack manipulation
* Performing arithmetic
* Handling objects and arrays
* Control flow
* Method invocation
* Handling exceptions and concurrency
Manipulating the stack
Instruction   Meaning
iconst_4      load the integer constant 4 onto the stack
iload 4       load the value of local variable number 4 onto the stack
pop           discard the top value of the stack
dup           duplicate the top item on the stack
swap          interchange the top two values of the stack
istore 4      store the value on top of the stack in local variable number 4
Arithmetic and array instructions
Instruction   Meaning
iadd          add the two integers on top of the stack
fadd          add the two floats on top of the stack
fmul          multiply the two floats on top of the stack
iaload        push the value of an array element onto the stack, assuming the array reference and the index are already on the stack
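To make the stack discipline concrete, here is a minimal sketch of an interpreter for a handful of these instructions. It is not the JVM: instructions are written as Python tuples (an assumed encoding), only integer operations are covered, `imul` is included alongside `iadd` for illustration, and Python integers do not wrap around at 32 bits as JVM ints do.

```python
def run(program, local_vars=None):
    """Execute (opcode, operand...) tuples; return (stack, local variables)."""
    stack = []
    local_vars = dict(local_vars or {})
    for op, *args in program:
        if op == "iconst":            # push an integer constant
            stack.append(args[0])
        elif op == "iload":           # push local variable number args[0]
            stack.append(local_vars[args[0]])
        elif op == "istore":          # pop into local variable number args[0]
            local_vars[args[0]] = stack.pop()
        elif op == "pop":             # discard the top value
            stack.pop()
        elif op == "dup":             # duplicate the top value
            stack.append(stack[-1])
        elif op == "swap":            # interchange the top two values
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif op == "iadd":            # replace the top two values by their sum
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "imul":            # replace the top two values by their product
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown instruction: {op}")
    return stack, local_vars


# local 5 = 4 + local 4, leaving a copy of the sum on the stack
program = [("iconst", 4), ("iload", 4), ("iadd",), ("dup",), ("istore", 5)]
stack, variables = run(program, {4: 10})
```

After the run, `stack` holds the single value 14 and local variable 5 also holds 14.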
Conditional and unconditional branches
Instruction     Meaning
ifeq L1         jump to L1 if the integer value on top of the stack is zero
if_icmpne L1    jump to L1 if the two integer values on top of the stack are not equal
goto L1         jump to L1
if (expression) statement1 else statement2

        bytecode to put the value of expression on top of the stack
        ifeq L1
        bytecode to implement statement1
        goto L2
    L1: bytecode to implement statement2
    L2:

while (expression) statement

    L1: bytecode to put the value of expression on top of the stack
        ifeq L2
        bytecode to implement statement
        goto L1
    L2:
Considerations
CISC (complex instruction set computer)
* Wide range of addressing modes
* Small number of registers (<16)
* Many special-purpose registers
* 2-address instructions: A + B → A
* Variable-length instructions
* Instructions with side effects
* Different execution times for different instructions
RISC (reduced instruction set computer)
* Simple addressing modes (with registers)
* Many registers (>32)
* All registers are general purpose
* 3-address instructions: r3 = r1 + r2
* Fixed-length instructions (32 bits)
* No side effects, one result per instruction
* Similar execution times for all instructions
To consider when generating machine code:
* Instruction selection
* Register allocation
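The difference between 2-address and 3-address instructions matters for instruction selection. The sketch below lowers one 3-address operation to 2-address form. The mnemonics (`mov`, `add`, `sub`), the convention that `op dst, src` means `dst = dst op src`, and the scratch register name are all assumptions made for the example and do not describe a particular machine.

```python
COMMUTATIVE = {"add", "mul"}


def to_two_address(op, dest, src1, src2, scratch="tmp"):
    """Lower 'dest = src1 op src2' to instructions of the form 'op dst, src'."""
    if dest == src1:
        # Already in 2-address shape: dest = dest op src2
        return [f"{op} {dest}, {src2}"]
    if dest == src2:
        if op in COMMUTATIVE:
            # dest = src1 op dest is the same as dest = dest op src1
            return [f"{op} {dest}, {src1}"]
        # Copying src1 into dest first would destroy src2, so go through a scratch register
        return [
            f"mov {scratch}, {src1}",
            f"{op} {scratch}, {src2}",
            f"mov {dest}, {scratch}",
        ]
    # General case: copy the first operand, then operate in place
    return [f"mov {dest}, {src1}", f"{op} {dest}, {src2}"]
```

For `r3 = r1 + r2` this yields a copy followed by an in-place add, which is the extra instruction a 2-address machine pays when the destination differs from both operands.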
Example

    t1 = a + b
    t2 = c + d
    t3 = t1 * t2

t1 and t2 must both be kept until t3 has been evaluated.
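Temporaries such as t1 and t2 arise when an expression tree is flattened into three-address code. The following sketch shows one way to do this with a post-order walk. The tuple encoding of the tree and the function name are assumptions for the example, and the final operator of the expression is taken to be a multiplication, which is also an assumption.

```python
def gen_tac(expr):
    """Flatten a nested (op, left, right) tuple into three-address code.

    Leaves are variable names (strings). Returns the list of generated
    statements and the name holding the final result.
    """
    code = []

    def visit(node):
        if isinstance(node, str):
            return node                      # a variable needs no code
        op, left, right = node
        left_name = visit(left)              # operands first (post-order)
        right_name = visit(right)
        temp = f"t{len(code) + 1}"           # fresh temporary for this node
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    result = visit(expr)
    return code, result


code, result = gen_tac(("*", ("+", "a", "b"), ("+", "c", "d")))
```

Both operand temporaries are created before the statement that consumes them, which is exactly why they must stay alive until that statement has executed.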
Example 2
a * b + c * d + e * f

    t1 = a * b
    t2 = c * d
    t3 = t1 + t2
    t4 = e * f
    t5 = t3 + t4

Temporary   Register
t1          1
t2          2
t3          3
t4          1
t5          2

Do we really need distinct registers?
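One way to approach the question is to compute, for each temporary, the statement that defines it and the last statement that uses it, and then count how many temporaries are alive at the same time. The sketch below does this for straight-line code. It assumes each temporary is named tN, is assigned exactly once and is defined before it is used, and that the result of a statement may take over the register of an operand that is used there for the last time. The operators in the sample code are assumed to be multiplications.

```python
import re


def live_ranges(code):
    """Map each temporary to (defining statement, last use), numbered from 1."""
    ranges = {}
    for index, line in enumerate(code, start=1):
        target, expr = (part.strip() for part in line.split("=", 1))
        for name in re.findall(r"\bt\d+\b", expr):
            first, _ = ranges[name]
            ranges[name] = (first, index)    # extend the range to this use
        ranges[target] = (index, index)      # a new range starts here
    return ranges


def registers_needed(ranges):
    """Largest number of temporaries alive just after any statement."""
    last = max(end for _, end in ranges.values())
    best = 0
    for point in range(1, last + 1):
        # A temporary is alive from its definition up to, but not including,
        # its last use. A never-used result still occupies one register.
        alive = sum(
            1
            for start, end in ranges.values()
            if start <= point < max(end, start + 1)
        )
        best = max(best, alive)
    return best


code = [
    "t1 = a * b",
    "t2 = c * d",
    "t3 = t1 + t2",
    "t4 = e * f",
    "t5 = t3 + t4",
]
ranges = live_ranges(code)
```

Under these assumptions `registers_needed(ranges)` is 2: t3 can take over the register of t1, and t5 the register of t3, so a third register is not strictly required for this expression.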
Three-address code:

     1)     n = 0
     2)     sum2 = 0
     3)     sum3 = 0
     4) L1: t1 = n < 10
     5)     t2 = not t1
     6)     if t2 goto L2
     7)     n = n + 1
     8)     m = n * n
     9)     sum2 = sum2 + m
    10)     t3 = m * n
    11)     sum3 = sum3 + t3
    12)     goto L1
    13) L2:

Source:

    n = 0; sum2 = 0; sum3 = 0;
    while (n < 10) {
        n = n + 1;
        m = n * n;
        sum2 = sum2 + m;
        sum3 = sum3 + m * n;
    }

Variable   Live     Register
n          1..12    1
sum2       2..12    2
sum3       3..12    3
t1         4..5     4
m          8..10    4
t2         5..6     4
t3         10..11   4
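The register column of the table can be reproduced by a simple scan over the live ranges in order of their starting point. The following is a sketch of that idea only, not a full register allocator: there is no spilling, the function name is hypothetical, and a register is assumed to become free at the statement where its variable is last used, so that it can be reused by a variable defined in that same statement.

```python
def assign_registers(intervals):
    """intervals: list of (name, first, last). Returns {name: register}."""
    assignment = {}
    active = []        # (last, name) for ranges that currently hold a register
    free = []          # registers that have been released
    next_register = 1

    for name, first, last in sorted(intervals, key=lambda item: item[1]):
        # Release the registers of ranges that end at or before this start
        still_active = []
        for active_last, active_name in active:
            if active_last <= first:
                free.append(assignment[active_name])
            else:
                still_active.append((active_last, active_name))
        active = still_active

        if free:
            register = min(free)             # reuse the lowest free register
            free.remove(register)
        else:
            register = next_register         # otherwise take a new one
            next_register += 1

        assignment[name] = register
        active.append((last, name))
    return assignment


intervals = [
    ("n", 1, 12), ("sum2", 2, 12), ("sum3", 3, 12),
    ("t1", 4, 5), ("m", 8, 10), ("t2", 5, 6), ("t3", 10, 11),
]
registers = assign_registers(intervals)
```

With the live ranges from the table, n, sum2 and sum3 receive registers 1, 2 and 3, and the four short-lived variables t1, t2, m and t3 all share register 4.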
Liveness analysis
Equations of the form:
1 $\mathit{in}_n = \mathit{use}_n \cup (\mathit{out}_n \setminus \mathit{def}_n)$
2 $\mathit{out}_n = \bigcup_{s \in \mathit{succ}(n)} \mathit{in}_s$
where
* $\mathit{use}_n$: set of all variables whose values are used in statement n
* $\mathit{out}_n$: set of all variables that are live on leaving statement n
* $\mathit{def}_n$: set of all variables that are defined in statement n
* $\mathit{in}_n$: set of all variables that are live on reaching statement n
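These equations are usually solved by iterating until nothing changes. The sketch below applies that idea to the thirteen-statement loop example; the use, def and successor sets were written out by hand from that code, and the function and variable names are assumptions for the example.

```python
def liveness(use, define, succ):
    """Solve the in/out equations by iterating to a fixed point."""
    nodes = list(use)
    live_in = {n: set() for n in nodes}
    live_out = {n: set() for n in nodes}
    changed = True
    while changed:
        changed = False
        # Visiting statements backwards makes the iteration converge faster
        for n in reversed(nodes):
            out_n = set()
            for s in succ[n]:
                out_n |= live_in[s]                  # equation 2
            in_n = use[n] | (out_n - define[n])      # equation 1
            if out_n != live_out[n] or in_n != live_in[n]:
                live_out[n], live_in[n] = out_n, in_n
                changed = True
    return live_in, live_out


use = {
    1: set(), 2: set(), 3: set(), 4: {"n"}, 5: {"t1"}, 6: {"t2"},
    7: {"n"}, 8: {"n"}, 9: {"sum2", "m"}, 10: {"m", "n"},
    11: {"sum3", "t3"}, 12: set(), 13: set(),
}
define = {
    1: {"n"}, 2: {"sum2"}, 3: {"sum3"}, 4: {"t1"}, 5: {"t2"}, 6: set(),
    7: {"n"}, 8: {"m"}, 9: {"sum2"}, 10: {"t3"}, 11: {"sum3"},
    12: set(), 13: set(),
}
succ = {
    1: [2], 2: [3], 3: [4], 4: [5], 5: [6], 6: [7, 13], 7: [8],
    8: [9], 9: [10], 10: [11], 11: [12], 12: [4], 13: [],
}
live_in, live_out = liveness(use, define, succ)
```

At the loop header (statement 4) the live-in set is {n, sum2, sum3}, and after statement 8 the variable m is live as well. Nothing is live at statement 13, because this fragment contains no use of any variable after the loop.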
Typical local optimizations
* Constant folding
* Strength reduction
* Elimination of unnecessary instructions
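Each of these three optimizations can be illustrated on a single three-address instruction. In the sketch below an instruction is a tuple (dest, op, left, right), which is an assumed encoding; the rewritten forms `const` and `copy` are likewise hypothetical, and the rules shown are valid for integer arithmetic only.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def optimize(instr):
    """Apply one local rewrite to (dest, op, left, right), if any applies."""
    dest, op, left, right = instr
    # Constant folding: both operands are known at compile time
    if isinstance(left, int) and isinstance(right, int):
        return (dest, "const", OPS[op](left, right), None)
    # Strength reduction: replace a multiplication by a cheaper addition
    if op == "*" and right == 2:
        return (dest, "+", left, left)
    # Unnecessary operation: adding 0 or multiplying by 1 is just a copy
    if (op == "+" and right == 0) or (op == "*" and right == 1):
        return (dest, "copy", left, None)
    return instr
```

A real optimizer would apply such rules repeatedly, since one rewrite can expose another, for example when folding a constant turns a later operand into a known value.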
Typical global optimizations
Based on analysis of control and data flow:
* Dead code elimination
* Common subexpression elimination
* Loop optimizations
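As an illustration of dead code elimination, the sketch below removes assignments whose result is never needed. It is deliberately restricted to a single straight-line block and takes the set of variables live at the end of the block as an input; in a global optimizer that set would come from a liveness analysis over the whole flow graph. The instruction encoding (dest, operands) is an assumption, and instructions are assumed to have no side effects.

```python
def eliminate_dead_code(block, live_at_exit):
    """Drop assignments whose destination is not live afterwards.

    block: list of (dest, operands) where operands is a tuple of names.
    live_at_exit: names still needed after the block.
    """
    live = set(live_at_exit)
    kept = []
    for dest, operands in reversed(block):   # walk backwards from the exit
        if dest in live:
            kept.append((dest, operands))
            live.discard(dest)               # this definition satisfies the need
            live.update(operands)            # its operands are needed before it
        # otherwise the value is never used: the instruction is dead
    kept.reverse()
    return kept


block = [
    ("t1", ("a", "b")),
    ("t2", ("a", "c")),      # t2 is never used
    ("x", ("t1",)),
]
optimized = eliminate_dead_code(block, {"x"})
```

Here the assignment to t2 is removed, while t1 is kept because x depends on it.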