Compiler Optimization: Intermediate Representation

Virendra Singh
Associate Professor
Computer Architecture and Dependable Systems Lab
Department of Electrical Engineering
Indian Institute of Technology Bombay
http://www.ee.iitb.ac.in/~viren/
E-mail: viren@ee.iitb.ac.in

A Short Course on Compiler Construction & Optimization
Lecture 2 (23 May 2014)
From Source to Executable

A source program (foo.c: main, sub1, data) is translated by the compiler into object modules (foo.o). The linkage editor combines these with routines from the static library libc.a (printf, scanf, gets, fopen, exit, ...) into a load module (a.out). At load (run) time, the loader places main, sub1, data, printf, exit, etc. into machine memory alongside other programs and the kernel (system calls). The dynamic-library case is not shown.

23 May 2014 Compilers@EE-IITB
General Structure of a Modern Compiler

Front end:
- Lexical analysis (scanner)
- Syntax analysis (parser), building a high-level IR; a symbol table records context
- Semantic analysis
- High-level IR to low-level IR conversion

Back end:
- Control-flow / data-flow optimization (machine independent)
- Code generation: machine-independent assembly translated to machine-dependent code
Intermediate Representation (aka IR)

- The compiler's internal representation
- Language-independent and machine-independent
- Enables both machine-independent and machine-dependent optimizations
- One IR serves many targets: the AST is lowered to the IR, the IR is optimized, and code is then emitted for Pentium, Java bytecode, Itanium, TI C5x, ARM, ...
What Makes a Good IR?

- Captures high-level language constructs: easy to translate from the AST, supports high-level optimizations
- Captures low-level machine features: easy to translate to assembly, supports machine-dependent optimizations
- Narrow interface: a small number of node types (instructions) makes the IR easy to optimize and easy to retarget
Kinds of IR

- Abstract syntax trees (AST)
- Linear operator form of a tree (e.g., postfix notation)
- Directed acyclic graphs (DAG)
- Control flow graphs (CFG)
- Program dependence graphs (PDG)
- Static single assignment form (SSA)
- 3-address code
- Hybrid combinations
Categories of IR

- Structural: graphically oriented (trees, DAGs); nodes and edges tend to be large; heavily used in source-to-source translators
- Linear: pseudo-code for an abstract machine; large variation in level of abstraction; simple, compact data structures; easier to rearrange
- Hybrid: combination of graphs and linear code (e.g. CFGs); attempts to achieve the best of both worlds
Important IR Properties

- Ease of generation
- Ease of manipulation
- Cost of manipulation
- Level of abstraction
- Freedom of expression (!)
- Size of a typical procedure
- Original or derivative form

Subtle design decisions in the IR can have far-reaching effects on the speed and effectiveness of the compiler: the degree of exposed detail can be crucial.
Abstract Syntax Tree

An AST is a condensed form of the parse tree. For the expression x - 2 * y, the tree has "-" at the root, with x as one child and a "*" node (over 2 and y) as the other. A linear operator form of this tree (postfix) would be: x 2 y * -
Directed Acyclic Graph

A DAG is an AST with unique, shared nodes for each value: a subexpression that occurs more than once is represented by a single node.

  x := 2 * y + sin(2*x)
  z := x / 2
Control Flow Graph

A CFG models the transfer of control in a program:
- nodes are basic blocks (straight-line sequences of code)
- edges represent control flow (loops, if/else, goto, ...)

  if x = y then
    S1
  else
    S2
  end
  S3
3-Address Code

Statements take the form x = y op z: a single operator and at most three names.

  x - 2 * y   becomes   t1 = 2 * y
                        t2 = x - t1

Advantages: compact form; names for intermediate values.
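The flattening above can be sketched in Python. This is an illustrative encoding, not from the lecture: each instruction is a tuple (dest, op, arg1, arg2), and `flatten` walks an expression tree bottom-up, naming each intermediate value.

```python
_counter = 0

def new_temp():
    global _counter
    _counter += 1
    return f"t{_counter}"

def flatten(expr, code):
    """expr is a variable name, a constant, or a tuple (op, left, right)."""
    if not isinstance(expr, tuple):
        return expr                      # already a simple name or constant
    op, left, right = expr
    a = flatten(left, code)
    b = flatten(right, code)
    t = new_temp()                       # name the intermediate value
    code.append((t, op, a, b))           # one operator, at most three names
    return t

# x - 2 * y  becomes  t1 = 2 * y ; t2 = x - t1
code = []
flatten(("-", "x", ("*", 2, "y")), code)
```

Later slides reuse this tuple form to demonstrate individual optimization passes.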
Typical 3-Address Codes

- assignments:                       x = y op z ;  x = op y ;  x = y[i] ;  x = y
- branches:                          goto L
- conditional branches:              if x relop y goto L
- procedure calls:                   param x ;  param y ;  call p
- address and pointer assignments:   x = &y ;  *y = z
Other Hybrids Exist

- IR choices: combinations of graphs and linear codes, e.g. a CFG with 3-address code for the basic blocks
- Many variants are used in practice; there is no widespread agreement, and compilers may need several different IRs!
- Advice: choose an IR with the right level of detail, and keep manipulation costs in mind
Static Single Assignment Form

Goal: simplify procedure-global optimizations

Definition: a program is in SSA form if every variable is assigned only once.
Static Single Assignment (SSA)

- Each assignment to a temporary is given a unique name
- All uses reached by that assignment are renamed
- Compact representation, useful for many kinds of compiler optimization

  x := 3;          x1 := 3;
  x := x + 1;      x2 := x1 + 1;
  x := 7;          x3 := 7;
  x := x * 2;      x4 := x3 * 2;

Ron Cytron et al., "Efficiently computing static single assignment form and the control dependence graph", ACM TOPLAS, 1991.
Why Static?

- We only look at the static program: one assignment per variable in the program text
- At runtime, variables are assigned multiple times!
Example: Sequence

Easy to do for sequential (straight-line) code:

  Original           SSA
  a := b + c         a1 := b1 + c1
  b := c + 1         b2 := c1 + 1
  d := b + c         d1 := b2 + c1
  a := a + 1         a2 := a1 + 1
  e := a + b         e1 := a2 + b2
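For straight-line code the renaming is a single forward pass. A minimal sketch in the tuple encoding used earlier (one simplification versus the slide: variables read before any assignment keep their plain names rather than getting a version 1):

```python
def to_ssa(stmts):
    """Rename a straight-line block of (dest, op, arg1, arg2) tuples so
    every variable is assigned exactly once; each use refers to the
    latest version of its variable."""
    version = {}                      # variable -> current version number
    def use(v):
        return f"{v}{version[v]}" if v in version else v
    out = []
    for dest, op, a, b in stmts:
        a, b = use(a), use(b)         # rename uses first
        version[dest] = version.get(dest, 0) + 1
        out.append((f"{dest}{version[dest]}", op, a, b))
    return out

# the slide's example block
block = [("a", "+", "b", "c"),
         ("b", "+", "c", "1"),
         ("d", "+", "b", "c"),
         ("a", "+", "a", "1"),
         ("e", "+", "a", "b")]
ssa = to_ssa(block)
```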
Example: Condition

Conditions: what to do on a control-flow merge?

  Original           SSA
  if B then          if B then
    a := b             a1 := b
  else               else
    a := c             a2 := c
  end                end
  ... a ...          ... a? ...
Solution: Φ-Function

At the control-flow merge, a Φ-function selects the right version:

  if B then
    a1 := b
  else
    a2 := c
  end
  a3 := Φ(a1, a2)
  ... a3 ...
The Φ-Function

- Φ-functions always appear at the beginning of a basic block
- They select between values depending on control flow: a_{k+1} := Φ(a_1, ..., a_k) means the block has k preceding blocks
- All Φ-functions within a basic block are evaluated simultaneously
SSA and CFG

- SSA is normally used on control-flow graphs (CFGs)
- Basic blocks are in 3-address form
Recall: Control Flow Graph

A CFG models the transfer of control in a program:
- nodes are basic blocks (straight-line sequences of code)
- edges represent control flow (loops, if/else, goto, ...)

  if x = y then
    S1
  else
    S2
  end
  S3
SSA: a Simple Example

  if B then
    a1 := 1
  else
    a2 := 2
  end
  a3 := Φ(a1, a2)
  ... a3 ...
Recall: IR

- The front end produces the IR
- The optimizer transforms the IR into a more efficient program
- The back end transforms the IR into target code
SSA as IR
Transforming to SSA

- Problem: performance / memory
- Minimize the number of inserted Φ-functions, but do not spend too much time doing so
- Many relatively complex algorithms exist
Minimal SSA

Two steps:
1. Place Φ-functions
2. Rename variables

Where to place Φ-functions? We want the minimal number of needed Φ-functions:
- saves memory
- algorithms will work faster
Optimization: The Idea

Transform the program to improve efficiency:
- Performance: faster execution
- Size: smaller executable, smaller memory footprint

Tradeoffs:
1. Performance vs. size
2. Compilation speed and memory
Optimization on Many Levels

Optimizations happen both in the optimizer and in the back end.
Examples of Optimizations

- Constant folding / propagation
- Copy propagation
- Algebraic simplifications
- Strength reduction
- Dead code elimination
- Structure simplifications
- Loop optimizations
- Partial redundancy elimination
- Code inlining
Constant Folding

- Evaluate constant expressions at compile time
- Only possible when side-effect freeness is guaranteed

  c := 1 + 3   →   c := 4
  not false    →   true

Caveat: floating-point results can differ between machines!
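A minimal folding step over one 3-address tuple, continuing the illustrative encoding from earlier. Only side-effect-free integer arithmetic is folded here, in line with the slide's caveat about floats:

```python
import operator

# compile-time evaluable operators (integers only in this sketch)
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold(instr):
    """Fold (dest, op, a, b) into a constant load when both operands
    are compile-time integer constants; otherwise leave it unchanged."""
    dest, op, a, b = instr
    if isinstance(a, int) and isinstance(b, int) and op in OPS:
        return (dest, "const", OPS[op](a, b), None)
    return instr

folded = fold(("c", "+", 1, 3))       # c := 1 + 3  ->  c := 4
```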
Constant Propagation

- If a variable has a constant value, e.g. b := 3, later uses of b can be replaced by the constant, provided b is not reassigned in between

  b := 3           b := 3
  c := 1 + b   →   c := 1 + 3
  d := b + c       d := 3 + c

Analysis is needed, as b can be assigned more than once!
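Within one basic block the "analysis" is just a forward scan; this sketch (same hypothetical tuple encoding as before) tracks which variables currently hold known constants:

```python
def propagate_constants(stmts):
    """Forward pass over straight-line code: remember variables whose
    latest assignment is a constant and substitute them into later uses.
    Once branches appear, a real compiler needs dataflow analysis."""
    consts = {}
    out = []
    for dest, op, a, b in stmts:
        a = consts.get(a, a)          # substitute known constants into uses
        b = consts.get(b, b)
        if op == "const":
            consts[dest] = a          # dest now holds a known constant
        else:
            consts.pop(dest, None)    # dest reassigned, value unknown
        out.append((dest, op, a, b))
    return out

# the slide's example: b := 3 ; c := 1 + b ; d := b + c
block = [("b", "const", 3, None),
         ("c", "+", 1, "b"),
         ("d", "+", "b", "c")]
prop = propagate_constants(block)
```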
Copy Propagation

- For a statement x := y, replace later uses of x with y, if x and y have not been changed in between

  x := y           x := y
  c := 1 + x   →   c := 1 + y
  d := x + c       d := y + c

Analysis is needed, as x and y can be assigned more than once!

Marcus Denker
Algebraic Simplifications

- Use algebraic properties to simplify expressions:

  -(-i)       →   i
  b or true   →   true

Important for simplifying code for later optimizations.
Strength Reduction

- Replace expensive operations with simpler ones
- Example: multiplications replaced by additions

  y := x * 2   →   y := x + x

Peephole optimizations are often strength reductions.
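The slide's rewrite as a one-instruction pass in the same illustrative tuple form (a real backend would consult shifts and a cost model rather than hard-coding `* 2`):

```python
def reduce_strength(instr):
    """Peephole-style rewrite: replace a multiplication by 2 with an
    addition of the operand to itself."""
    dest, op, a, b = instr
    if op == "*" and b == 2:
        return (dest, "+", a, a)      # y := x * 2  ->  y := x + x
    if op == "*" and a == 2:
        return (dest, "+", b, b)      # commuted form: y := 2 * x
    return instr

reduced = reduce_strength(("y", "*", "x", 2))
```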
Dead Code Elimination

- Remove unnecessary code, e.g. variables assigned but never read:

  b := 3
  c := 1 + 3   →   c := 1 + 3
  d := 3 + c       d := 3 + c

- Remove code that is never reached:

  if (false) {a := 5}   →   if (false) {}
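The "assigned but never read" case is naturally a backward pass. A sketch for one basic block, where the caller supplies `live_out`, the variables still needed after the block (an assumption standing in for a real liveness analysis):

```python
def eliminate_dead(stmts, live_out):
    """Backward pass over straight-line code: drop assignments whose
    destination is never read afterwards."""
    live = set(live_out)
    kept = []
    for dest, op, a, b in reversed(stmts):
        if dest in live:
            live.discard(dest)        # this definition satisfies the use
            for v in (a, b):
                if isinstance(v, str):
                    live.add(v)       # operands are now live
            kept.append((dest, op, a, b))
    kept.reverse()
    return kept

# the slide's example: b is assigned but never read
block = [("b", "const", 3, None),
         ("c", "+", 1, 3),
         ("d", "+", 3, "c")]
cleaned = eliminate_dead(block, live_out=["d"])
```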
Simplify Structure

- Similar to dead code elimination: simplify the CFG structure
- Optimizations tend to degenerate the CFG; it needs to be cleaned up to enable further optimization!
Delete Empty Basic Blocks
Fuse Basic Blocks
Common Subexpression Elimination (CSE)

- Common subexpression: another occurrence of the expression exists whose evaluation always precedes this one, and whose operands remain unchanged in between
- Local CSE (inside one basic block): can be done while building the IR
- Global CSE (over the complete flow graph)
Example: CSE

  b := a + 2       t1 := a + 2
  c := 4 * b       b := t1
  b < c?           c := 4 * b
                   b < c?
  b := 1           b := 1
  d := a + 2       d := t1
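A sketch of local CSE in the same illustrative tuple encoding. As in the slide, each computed expression goes into a fresh temporary, so later reassignment of the named destination (b := 1) does not kill the cached value; only expressions that *mention* the reassigned variable are invalidated:

```python
def local_cse(stmts):
    """Local CSE inside one basic block: repeated (op, a, b) expressions
    with unchanged operands reuse a temporary instead of recomputing."""
    available = {}                    # (op, a, b) -> temp holding its value
    out, n = [], 0
    for dest, op, a, b in stmts:
        key = (op, a, b)
        if key in available:
            out.append((dest, "copy", available[key], None))
        else:
            n += 1
            t = f"t{n}"
            out.append((t, op, a, b))             # compute into a temp
            out.append((dest, "copy", t, None))   # then copy to dest
            available[key] = t
        # reassigning dest invalidates expressions that mention it
        available = {k: v for k, v in available.items() if dest not in k[1:]}
    return out

# the slide's example (comparisons omitted for brevity)
block = [("b", "+", "a", 2),
         ("c", "*", 4, "b"),
         ("b", "const", 1, None),
         ("d", "+", "a", 2)]          # same computation as the first line
opt = local_cse(block)
```

The extra copies introduced here are cheap and are typically removed again by copy propagation and dead-code elimination.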
Loop Optimizations

- Optimizing code in loops is important: loops are often executed, so the payoff is large
- All optimizations help when applied to loop bodies
- Some optimizations are loop-specific
Loop-Invariant Code Motion

- Move expressions that are constant over all iterations out of the loop.
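A simplified single-pass sketch over one loop body in the tuple encoding used above. `variant` names the variables whose value changes between iterations (e.g. the loop counter) and is supplied by the caller here, standing in for the reaching-definitions analysis a real pass would perform; the sketch also assumes each variable is assigned at most once in the body:

```python
def hoist_invariants(body, variant):
    """Split a loop body into (preheader, remaining body): a statement
    reading only loop-invariant operands is hoisted out of the loop."""
    variant = set(variant)
    pre, kept = [], []
    for dest, op, a, b in body:
        reads = [v for v in (a, b) if isinstance(v, str)]
        if any(v in variant for v in reads):
            variant.add(dest)         # its result changes each iteration
            kept.append((dest, op, a, b))
        else:
            pre.append((dest, op, a, b))
    return pre, kept

# hypothetical body of: for i in ...:  t := x * y ; s := t + i
body = [("t", "*", "x", "y"),         # invariant: x, y unchanged in loop
        ("s", "+", "t", "i")]         # variant: depends on i
pre, kept = hoist_invariants(body, variant={"i"})
```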
Induction Variable Optimizations

- Values of variables form an arithmetic progression:

  integer a(100)             integer a(100)
  do i = 1, 100              t1 := 202
    a(i) = 202 - 2 * i   →   do i = 1, 100
  enddo                        t1 := t1 - 2
                               a(i) = t1
                             enddo

The value assigned to a decreases by 2 each iteration; the transformation uses strength reduction.
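The two loops can be checked for equivalence directly; this Python rendering of the slide's Fortran confirms that decrementing an accumulator by 2 produces exactly the values 202 - 2*i:

```python
# original loop: a(i) = 202 - 2 * i, for i = 1..100
original = [202 - 2 * i for i in range(1, 101)]

# strength-reduced loop: the multiplication becomes a running subtraction
reduced = []
t1 = 202
for i in range(1, 101):
    t1 = t1 - 2                       # induction variable update
    reduced.append(t1)
```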
Partial Redundancy Elimination (PRE)

- Combines multiple optimizations: global common-subexpression elimination and loop-invariant code motion
- Partial redundancy: a computation done more than once on some path through the flow graph
- PRE inserts and deletes code to minimize redundancy
Code Inlining

- All optimizations so far were local to one procedure
- Problem: procedures or functions are often very short, especially in good OO code!
- Solution: copy the code of small procedures into the caller
- OO complication: polymorphic calls — which method is called?
Example: Inlining

  a := power2(b)        →   a := b * b

  power2(x) {
    return x * x
  }
Optimizations in the Back End

- Register allocation
- Instruction selection
- Peephole optimization
Register Allocation

- The processor has only a finite number of registers, which can be very small (x86)
- Non-overlapping temporaries can share one register
- Arguments can be passed via registers
- Optimizing register allocation is very important for good performance, especially on x86
Instruction Selection

- For every expression there are many ways to realize it on a given processor
- Example: multiplication by 2 can be done by a bit shift
- Instruction selection is itself a form of optimization
Peephole Optimization

- A simple local optimization: look at the code through a small window (the "peephole") and replace sequences by known shorter ones, from a pre-computed table

  store R,a ; load a,R   →   store R,a
  imul 2,R               →   ashl 1,R

Important when using simple instruction selection!
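A table-driven sketch of the first rewrite above, over a list of symbolic instructions (the mnemonics are illustrative, not a real ISA): a pre-computed table maps known two-instruction windows to shorter replacements, and the pass slides the window over the code.

```python
# pattern table: a redundant load right after a store of the same
# location can be dropped
PATTERNS = {
    (("store", "R", "a"), ("load", "a", "R")): (("store", "R", "a"),),
}

def peephole(code):
    """Slide a two-instruction window over the code, replacing any
    window found in the table; re-examine the same position after a
    replacement so chained patterns are also caught."""
    out = list(code)
    i = 0
    while i < len(out) - 1:
        repl = PATTERNS.get((out[i], out[i + 1]))
        if repl is not None:
            out[i:i + 2] = list(repl)
        else:
            i += 1
    return out

optimized = peephole([("store", "R", "a"),
                      ("load", "a", "R"),
                      ("add", "R", "1")])
```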
Thank You