Control-Flow Graphs & Dataflow Analysis. CS4410: Spring 2013


Past Few Lectures: High-level Intermediate Languages: Monadic Normal Form. Optimization as algebraic transformations: 3+4 → 7, (λx.e) v → e[v/x], fst (e1,e2) → e1. Correctness issues: limiting ourselves to "pure" (valuable) expressions when we duplicate or eliminate; avoiding variable capture by keeping bound variables unique.

Today: Imperative Representations. Like MIPS assembly at the instruction level, except we assume an infinite # of temps and abstract away details of the calling convention. But with a bit more structure: code is organized into a control-flow graph (ch. 8). Nodes: labeled basic blocks of instructions, single-entry, single-exit, i.e., no jumps, branches, or labels inside a block. Edges: jumps/branches to basic blocks. Dataflow analysis (ch. 17): computing information to answer questions about data flowing through the graph.

A CFG Abstract Syntax:

  Operands  w ::= i | x | L                  (* ints, vars, labels *)
  Cmp-op    c ::= < | > | =                  (* comparison *)
  Blocks    B ::= return w
               |  jump L
               |  if w1 c w2 then L1 else L2
               |  x := w; B                  (* move *)
               |  y := *(x + i); B           (* load *)
               |  *(x + i) := y; B           (* store *)
               |  x := p(w1,...,wn); B       (* arith op *)
               |  x := f(w1,...,wn); B       (* call *)

A CFG Abstract Syntax:

  type operand = Int of int | Var of var | Label of label

  type block =
    | Return of operand
    | Jump of label
    | If of operand * cmp * operand * label * label
    | Move of var * operand * block
    | Load of var * operand * int * block
    | Store of var * int * operand * block
    | Arith of var * primop * (operand list) * block
    | Call of var * operand * (operand list) * block

  type proc = { vars     : var list;
                prologue : label;
                epilogue : label;
                blocks   : (label * block) list }
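
For concreteness, here is a small sketch (not from the slides) that builds the two-block program "L1: x := 3; jump L2" / "L2: return x" with the operand/block/proc types above, assuming var and label are strings and filling in hypothetical cmp/primop constructors (these auxiliary definitions would sit before the types above):

  type var = string          (* assumed: variables are names *)
  type label = string        (* assumed: labels are names *)
  type cmp = Lt | Gt | Eq    (* hypothetical comparison ops *)
  type primop = Add | Sub | Mul

  (* L1: x := 3; jump L2      L2: return x *)
  let example : proc =
    { vars = ["x"];
      prologue = "L1";
      epilogue = "L2";
      blocks = [ ("L1", Move ("x", Int 3, Jump "L2"));
                 ("L2", Return (Var "x")) ] }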

Conceptually: [figure: the procedure drawn as a graph whose nodes are basic blocks and whose edges are the jumps/branches between them]

Differences with Monadic Form:

  datatype block =
    | Return of operand
    | Jump of label
    | If of operand * test * operand * label * label
    | Move of var * operand * block
    | Load of var * operand * int * block
    | Store of var * int * operand * block
    | Arith of var * primop * (operand list) * block
    | Call of var * operand * (operand list) * block

Essentially MIPS assembly with an infinite # of registers. No lambdas, so it's easy to translate to MIPS, modulo register allocation and assignment. Monadic form requires an extra pass to eliminate lambdas and make closures explicit (closure conversion). Unlike monadic form, variables are mutable.

Let's Revisit Optimizations:

  constant folding:      t := 3+4  →  t := 7
  constant propagation:  t := 7; B; u := t+3  →  t := 7; B; u := 7+3
    problem: B might assign a fresh value to t.
  copy propagation:      t := u; B; v := t+3  →  t := u; B; v := u+3
    problems: B might assign a fresh value to t or a fresh value to u!

More Optimizations:

  Dead code elimination:
    x := e; B; jump L  →  B; jump L
      problem: the block L might use x.
    x := e1; B1; x := e2; B2  →  B1; x := e2; B2   (x not used in B1)

  Common sub-expression elimination:
    x := y+z; B1; w := y+z; B2  →  x := y+z; B1; w := x; B2
      problem: B1 might change x, y, or z.

Point: Optimization on a functional representation: we only had to worry about variable capture, and we could avoid this by renaming all of the variables so that they were unique. Then:

  let x = p(v1,...,vn) in e  ==  e[p(v1,...,vn)/x]

Optimization in an imperative representation: we have to worry about intervening updates. For the defined variable, this is similar to variable capture; but we must also worry about free variables:

  x := p(v1,...,vn); B  ==  B[p(v1,...,vn)/x]

only when B modifies neither x nor any of the v_i! On the other hand, a graph representation makes it possible to be more precise about the scope of a variable.

Consider:

  let k(x,y) = let z = x+1 in c(z,y) in
  let a = x+1 in
  if b then ... k(x,a) else ... k(x,a)

If we inline the function k, we get:

  let a = x+1 in
  if b then ... let z = x+1 in c(z,a) else ... let z = x+1 in c(z,a)

so we can do CSE on x+1, eliminating the recomputation bound to z. But the price paid is that we had to duplicate the function body. Can we do this without inlining?

In the Graph World: [figure: a CFG in which the block "a := x+1; if b" branches to paths that share the block "z := x+1; jump c"] Monadic terms only let you build trees, and the scoping rules follow the tree. To localize scope, we end up copying sub-trees. What we need is some way to accommodate "scope" across paths in a graph. (CPS & SSA get the best of both.)

Constant Propagation: Try #1

  type env = var -> operand
  val init_env = fun (x:var) -> Var x
  val subst  : env -> operand -> operand
  val extend : env -> var -> operand -> env

  let rec cp (env:env) (b:block) : block =
    match b with
    | Return v -> Return (subst env v)
    | Jump L -> Jump L
    | If(v1,t,v2,L1,L2) -> If(subst env v1, t, subst env v2, L1, L2)
    | Move(x,v,b) -> let v' = subst env v in cp (extend env x v') b
    | Arith(x,p,vs,b) -> Arith(x, p, map (subst env) vs, cp env b)
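
The slides leave subst and extend abstract. A minimal sketch of plausible implementations, under the env = var -> operand representation above:

  (* Substitute through one operand: look variables up in the
     environment; constants and labels are unchanged. *)
  let subst (env : env) (v : operand) : operand =
    match v with
    | Var x -> env x
    | _ -> v

  (* Extend the environment so that x now maps to v. *)
  let extend (env : env) (x : var) (v : operand) : env =
    fun y -> if y = x then v else env y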

Problem:

  L1: x := 3; jump L2
  L2: return x

cp starts from L1 with x mapped to 3, but it never follows the jump, so the constant is not propagated into L2.

Constant Propagation: Try #2

  let rec cp (env:env) (b:block) : block =
    match b with
    | Return v -> Return (subst env v)
    | Jump L -> (setblock L (cp env (getblock L)); Jump L)
    | If(v1,t,v2,L1,L2) -> If(subst env v1, t, subst env v2, L1, L2)
    | Move(x,v,b) -> let v' = subst env v in cp (extend env x v') b
    | Arith(x,p,vs,b) -> Arith(x, p, map (subst env) vs, cp env b)
    ...

Problem:

  L1: x := 3; jump L2
  L2: y := x; jump L1

Following jumps now propagates across blocks, but on a cycle like this, cp chases L1 → L2 → L1 → ... and never terminates.

Constant Propagation: Try #3

  let rec cp (env:env) (b:block) : block =
    match b with
    | Return v -> Return (subst env v)
    | Jump L -> Jump L
    | If(v1,t,v2,L1,L2) -> If(subst env v1, t, subst env v2, L1, L2)
    | Move(x,v,b) -> let v' = subst env v in Move(x, v', cp (extend env x v') b)
    | Arith(x,p,vs,b) -> Arith(x, p, map (subst env) vs, cp env b)
    ...

Problem:

  x := 3;     {x -> 3}     x := 3;
  y := x+1;                y := 3+1;
  x := x-1;                x := 3-1;
  z := x+2;                z := 3+2;    (* wrong: x is 2 here, not 3 *)

The Arith case leaves the environment unchanged, so the stale binding x -> 3 is still used after x is reassigned.

Constant Propagation: Try #4

  let rec cp (env:env) (b:block) : block =
    match b with
    | Return v -> Return (subst env v)
    | Jump L -> Jump L
    | If(v1,t,v2,L1,L2) -> If(subst env v1, t, subst env v2, L1, L2)
    | Move(x,v,b) -> let v' = subst env v in Move(x, v', cp (extend env x v') b)
    | Arith(x,p,vs,b) ->
        Arith(x, p, map (subst env) vs, cp (extend env x (Var x)) b)
    ...

Moral: Can't just hack this up with simple substitution. To extend across blocks, we have to be careful about termination.

Available Expressions: A definition "x := e" reaches a program point p if there is no intervening assignment to x or to the free variables of e on any path leading from the definition to p. We say e is available at p. If "x:=e" is available at p, we can use x in place of e (i.e., for common sub-expression elimination.) How do we compute the available expressions at each program point?

Gen and Kill. Suppose D is a set of assignments that reaches the program point p, and p is of the form "x := e1; B". Then the statement "x := e1" generates the definition "x := e1", and kills any definition "y := e2" in D such that either x = y or x is in FV(e2). So the definitions that reach B are:

  D - { y := e2 | x = y or x in FV(e2) }  ∪  { x := e1 }

More Generally:

  statement                          gen's             kill's
  x := v                             {x := v}          {y := e | x = y or x in FV(e)}
  x := v1 p v2                       {x := v1 p v2}    {y := e | x = y or x in FV(e)}
  x := *(v+i)                        {}                {y := e | x = y or x in FV(e)}
  *(v+i) := x                        {}                {}
  jump L                             {}                {}
  return v                           {}                {}
  if v1 r v2 goto L1 else goto L2    {}                {}
  x := call v(v1,...,vn)             {}                {y := e | x = y or x in FV(e)}
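
As a sketch of how one row of this table might be coded (the defn record and its field names are hypothetical, not from the slides): a definition is killed by an assignment to x when x is its left-hand side or occurs free in its right-hand side.

  (* A definition "y := e", recorded as y plus the free variables of e. *)
  type defn = { lhs : var; rhs_fvs : var list }

  (* Does assigning to x kill definition d? *)
  let kills (x : var) (d : defn) : bool =
    x = d.lhs || List.mem x d.rhs_fvs

  (* Transfer "x := v" over the reaching set D, per the first table row:
     D - {y := e | x = y or x in FV(e)}  ∪  {x := v}. *)
  let flow_move (x : var) (v : operand) (d_in : defn list) : defn list =
    let fvs = match v with Var y -> [y] | _ -> [] in
    { lhs = x; rhs_fvs = fvs } :: List.filter (fun d -> not (kills x d)) d_in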

Flowing through the Graph: Given the available expressions Din[L] that flow into a block labeled L, we can compute the definitions Dout[L] that flow out by just using the gen & kill's for each statement in L's block. For each block L, we can define:

  succ[L] = the blocks L might jump to
  pred[L] = the blocks that might jump to L

We can then flow Dout[L] to all of the blocks in succ[L]. They'll compute new Dout's and flow them to their successors, and so on.

Algorithm Sketch:

  initialize Din[L] to be the empty set.
  initialize Dout[L] to be the available expressions that flow out of
    block L, assuming Din[L] is the set flowing in.
  loop until no change {
    for each L:
      In := intersection of Dout[L'] for all L' in pred[L]
      if In == Din[L] then continue to next block.
      Din[L]  := In
      Dout[L] := flow Din[L] through L's block
  }

Termination and Speed: We're guaranteed that this terminates because Din[L] can at worst grow to the set of all assignments in the program, and if Din[L] doesn't change, neither will Dout[L]. There are a number of tricks used to speed up the analysis: we can calculate gen/kill for a whole block before running the algorithm, and we can keep a work queue that holds only those blocks that have changed.

Gen/Kill for Available Expressions:

  statement        gen's             kills
  x := v           {x := v}          {y := e | x = y or x in FV(e)}
  x := v1 p v2     {x := v1 p v2}    {y := e | x = y or x in FV(e)}
  x := *(v+i)      {}                {y := e | x = y or x in FV(e)}
  *(v+i) := x      {}                {}
  x := v(...)      {}                {y := e | x = y or x in FV(e)}

Extending to Basic Blocks:

  Gen[B]:
    Gen[s; B] = (Gen[s] - Kill[B]) ∪ Gen[B]
    Gen[return v] = {}
    Gen[jump L] = {}
    Gen[if r(v1,v2) then L1 else L2] = {}

  Kill[B]:
    Kill[s; B] = Kill[s] ∪ Kill[B]
    Kill[return v] = {}
    Kill[jump L] = {}
    Kill[if r(v1,v2) then L1 else L2] = {}
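
A sketch of these block equations over the block type from earlier, reusing the hypothetical defn record from above and representing Kill[B] by the set of variables the block assigns (a definition is killed iff it mentions one of them):

  (* Free variables of an operand. *)
  let fv (v : operand) : var list =
    match v with Var x -> [x] | _ -> []

  (* Is definition d killed by assignments to any variable in ks? *)
  let killed_by (ks : var list) (d : defn) : bool =
    List.exists (fun x -> x = d.lhs || List.mem x d.rhs_fvs) ks

  (* Gen[B] and Kill[B], following Gen[s;B] = (Gen[s] - Kill[B]) ∪ Gen[B]
     and Kill[s;B] = Kill[s] ∪ Kill[B]. *)
  let rec gen_kill (b : block) : defn list * var list =
    match b with
    | Return _ | Jump _ | If _ -> ([], [])
    | Move (x, v, rest) ->
        let (g, k) = gen_kill rest in
        let d = { lhs = x; rhs_fvs = fv v } in
        ((if killed_by k d then g else d :: g), x :: k)
    | Arith (x, _, vs, rest) ->
        let (g, k) = gen_kill rest in
        let d = { lhs = x; rhs_fvs = List.concat (List.map fv vs) } in
        ((if killed_by k d then g else d :: g), x :: k)
    | Load (x, _, _, rest) | Call (x, _, _, rest) ->
        let (g, k) = gen_kill rest in
        (g, x :: k)                       (* gen nothing, per the table *)
    | Store (_, _, _, rest) -> gen_kill rest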

Equational Interpretation: We need to solve the following equations:

  Din[L]  = Dout[L1] ∩ ... ∩ Dout[Ln]    where pred[L] = {L1,...,Ln}
  Dout[L] = (Din[L] - Kill[L]) ∪ Gen[L]

Note that for cyclic graphs, this isn't a definition, it's an equation. E.g., x*x = 2y is not a definition for x: we must solve for x, and there might be 0 or more than 1 solution.

Solving the Equations:

  initialize Din[L] to be the empty set.
  initialize Dout[L] to be Gen[L].
  loop until no change {
    for each L:
      In := Dout[L1] ∩ ... ∩ Dout[Ln]    where pred[L] = {L1,...,Ln}
      if In == Din[L] then continue to next block.
      Din[L]  := In
      Dout[L] := (Din[L] - Kill[L]) ∪ Gen[L]
  }
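
A runnable sketch of this fixed-point loop (not the course's code), generic in the definition type, assuming per-label summaries with gen, kill, and pred precomputed, labels as strings as before, and sets represented as lists:

  (* Hypothetical per-block summary. *)
  type 'd summary = { gen : 'd list; kill : 'd list; pred : label list }

  let inter a b = List.filter (fun x -> List.mem x b) a
  let diff  a b = List.filter (fun x -> not (List.mem x b)) a
  let union a b = a @ diff b a
  let same_set a b =
    List.for_all (fun x -> List.mem x b) a &&
    List.for_all (fun x -> List.mem x a) b

  let solve (blocks : (label * 'd summary) list) =
    let din = Hashtbl.create 17 and dout = Hashtbl.create 17 in
    (* Din[L] starts empty; Dout[L] starts at Gen[L]. *)
    List.iter (fun (l, s) ->
        Hashtbl.replace din l [];
        Hashtbl.replace dout l s.gen) blocks;
    let changed = ref true in
    while !changed do
      changed := false;
      List.iter (fun (l, s) ->
          (* In := Dout[L1] ∩ ... ∩ Dout[Ln] over predecessors *)
          let inn = match s.pred with
            | [] -> []
            | p :: ps ->
                List.fold_left (fun acc p' -> inter acc (Hashtbl.find dout p'))
                  (Hashtbl.find dout p) ps in
          if not (same_set inn (Hashtbl.find din l)) then begin
            changed := true;
            Hashtbl.replace din l inn;
            Hashtbl.replace dout l (union (diff inn s.kill) s.gen)
          end) blocks
    done;
    (din, dout)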

Recap:

  Control-flow graphs:
    nodes are basic blocks: single-entry, single-exit sequences of code
    statements are imperative
    variables have no nested scope
    edges correspond to jumps/branches

  Dataflow analysis:
    example: available expressions
    iterative solution

  Next: another dataflow analysis - liveness.

Liveness Analysis: A variable x is live at a point p if there is some path from p to a use of x that does not go through a definition of x. Liveness is backwards: it flows from uses backwards. Available expressions is forwards: it flows from definitions. We would like to calculate the set of live variables coming into and out of each statement.

  dead code: in "x := e; B", if x is not live coming into B, then we can delete the assignment.
  register allocation: if x and y are live at the same point p, then they can't share a register.

Gen & Kill for Liveness: A use of x generates liveness, while a definition kills it.

  statement           gen's              kills
  x := y              {y}                {x}
  x := p(y,z)         {y,z}              {x}
  x := *(y+i)         {y}                {x}
  *(v+i) := x         {x}                {}
  x := f(y1,...,yn)   {f, y1,...,yn}     {x}

Extending to blocks:

  Gen[B]:
    Gen[s; B] = (Gen[B] - Kill[s]) ∪ Gen[s]
    Gen[return x] = {x}
    Gen[jump L] = {}
    Gen[if r(x,z) then L1 else L2] = {x,z}

  Kill[B]:
    Kill[s; B] = Kill[s] ∪ Kill[B]
    Kill[return v] = {}
    Kill[jump L] = {}
    Kill[if v1 r v2 then L1 else L2] = {}

Equations for the graph: We need to solve:

  LiveIn[L]  = Gen[L] ∪ (LiveOut[L] - Kill[L])
  LiveOut[L] = LiveIn[L1] ∪ ... ∪ LiveIn[Ln]    where succ[L] = {L1,...,Ln}

So if LiveIn changes for some successor, our LiveOut changes, which then changes our LiveIn, which then propagates to our predecessors...

Liveness Algorithm:

  initialize LiveIn[L] := Gen[L].
  initialize LiveOut[L] := {}.
  loop until no change {
    for each L:
      Out := LiveIn[L1] ∪ ... ∪ LiveIn[Ln]    where succ[L] = {L1,...,Ln}
      if Out == LiveOut[L] then continue to next block.
      LiveOut[L] := Out
      LiveIn[L]  := Gen[L] ∪ (LiveOut[L] - Kill[L])
  }
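
The same iteration with the directions flipped: a sketch using OCaml's Set over variable names, assuming gen, kill, and succ per label are precomputed (the live_summary record is hypothetical; labels are strings as before):

  module VS = Set.Make (String)

  type live_summary = { lgen : VS.t; lkill : VS.t; succ : string list }

  let solve_live (blocks : (string * live_summary) list) =
    let live_in = Hashtbl.create 17 and live_out = Hashtbl.create 17 in
    List.iter (fun (l, s) ->
        Hashtbl.replace live_in l s.lgen;      (* LiveIn[L] := Gen[L] *)
        Hashtbl.replace live_out l VS.empty) blocks;
    let changed = ref true in
    while !changed do
      changed := false;
      List.iter (fun (l, s) ->
          (* Out := LiveIn[L1] ∪ ... ∪ LiveIn[Ln] over successors *)
          let out =
            List.fold_left (fun acc l' -> VS.union acc (Hashtbl.find live_in l'))
              VS.empty s.succ in
          if not (VS.equal out (Hashtbl.find live_out l)) then begin
            changed := true;
            Hashtbl.replace live_out l out;
            Hashtbl.replace live_in l (VS.union s.lgen (VS.diff out s.lkill))
          end) blocks
    done;
    (live_in, live_out)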

Speeding up the Analysis: For liveness, flow is backwards, so processing successors before predecessors will avoid doing another loop; of course, when there's a loop, we have to just pick a place to break the cycle. For available expressions, flow is forwards, so processing predecessors before successors will avoid doing another loop. We only need to revisit blocks that change: keep a priority queue, sorted by flow order.
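
A sketch of the work-queue idea for the backwards (liveness) direction: when LiveIn[L] changes, only L's predecessors can be affected, so only they are re-enqueued. The process callback is hypothetical; it recomputes LiveOut/LiveIn for a label and reports whether LiveIn changed.

  (* preds maps each label to its predecessors. *)
  let run_worklist (preds : (string * string list) list)
                   (process : string -> bool) : unit =
    let q = Queue.create () in
    List.iter (fun (l, _) -> Queue.add l q) preds;
    while not (Queue.is_empty q) do
      let l = Queue.pop q in
      if process l then
        (* a label may appear in the queue more than once; a "seen" set
           or a priority queue sorted by flow order refines this *)
        List.iter (fun p -> Queue.add p q) (List.assoc l preds)
    done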

Representing Sets (see Appel): Consider liveness analysis: we need to calculate sets of variables, with efficient union and subtraction. The usual solution uses bitsets: bitwise operations (e.g., &, |, ~, etc.) implement the set operations. Note: this solution scales well, but has bad asymptotic complexity compared to a sparse representation. Complexity of the whole liveness algorithm? Worst case O(n^4), assuming set operations are O(n); in practice it's roughly quadratic.
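
For illustration (not from the slides): with variables numbered 0..62, one OCaml int can represent a whole set, and the transfer function becomes a couple of machine instructions:

  (* Sets of variables as bit vectors in a native int. *)
  let empty = 0
  let singleton i = 1 lsl i
  let union a b = a lor b
  let diff a b = a land lnot b
  let mem i s = (s lsr i) land 1 = 1

  (* LiveIn = Gen ∪ (LiveOut - Kill), in two bitwise operations. *)
  let live_in gen kill live_out = gen lor (live_out land lnot kill)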

Coming up:

  Register allocation [ch. 11]:
    seen: the first part, liveness analysis
    next: construct the interference graph
    then: graph coloring & simplification

  Loop-oriented optimizations [ch. 18]:
    e.g., loop-invariant removal

  CPS & SSA