Dataflow analysis (ctd.)

Similar documents
Data-flow Analysis - Part 2

Data-flow Analysis. Y.N. Srikant. Department of Computer Science and Automation Indian Institute of Science Bangalore

MIT Introduction to Dataflow Analysis. Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology

Lecture 6 Foundations of Data Flow Analysis

Lecture 6 Foundations of Data Flow Analysis

We can express this in dataflow equations using gen and kill sets, where the sets are now sets of expressions.

We can express this in dataflow equations using gen and kill sets, where the sets are now sets of expressions.

Lecture 2. Introduction to Data Flow Analysis

Data Flow Information. already computed

Lecture 4 Introduction to Data Flow Analysis

Data Flow Analysis. Agenda CS738: Advanced Compiler Optimizations. 3-address Code Format. Assumptions

Department of Computer Sciences CS502. Purdue University is an Equal Opportunity/Equal Access institution.

Compiler Structure. Data Flow Analysis. Control-Flow Graph. Available Expressions. Data Flow Facts

Data Flow Analysis. CSCE Lecture 9-02/15/2018

Foundations of Dataflow Analysis

Data Flow Analysis. Program Analysis

ELEC 876: Software Reengineering

More Dataflow Analysis

Principles of Program Analysis: Data Flow Analysis

COSE312: Compilers. Lecture 20 Data-Flow Analysis (2)

Why Global Dataflow Analysis?

Compiler Optimization and Code Generation

A main goal is to achieve a better performance. Code Optimization. Chapter 9

CMSC430 Spring 2009 Midterm 2 (Solutions)

Compiler Design. Fall Data-Flow Analysis. Sample Exercises and Solutions. Prof. Pedro C. Diniz

Compiler Optimisation

Review; questions Basic Analyses (2) Assign (see Schedule for links)

Dependence and Data Flow Models. (c) 2007 Mauro Pezzè & Michal Young Ch 6, slide 1

compile-time reasoning about the run-time ow represent facts about run-time behavior represent eect of executing each basic block

Lecture 4. More on Data Flow: Constant Propagation, Speed, Loops

Dataflow Analysis 2. Data-flow Analysis. Algorithm. Example. Control flow graph (CFG)

Register allocation. Register allocation: ffl have value in a register when used. ffl limited resources. ffl changes instruction choices

Compiler Design. Fall Control-Flow Analysis. Prof. Pedro C. Diniz

Program analysis for determining opportunities for optimization: 2. analysis: dataow Organization 1. What kind of optimizations are useful? lattice al

20b -Advanced-DFA. J. L. Peterson, "Petri Nets," Computing Surveys, 9 (3), September 1977, pp

Page # 20b -Advanced-DFA. Reading assignment. State Propagation. GEN and KILL sets. Data Flow Analysis

Compiler Optimizations. Chapter 8, Section 8.5 Chapter 9, Section 9.1.7

Introduction. Data-flow Analysis by Iteration. CSc 553. Principles of Compilation. 28 : Optimization III

Compiler Optimisation

Program Static Analysis. Overview

Compilers. Optimization. Yannis Smaragdakis, U. Athens

Compiler Optimizations. Chapter 8, Section 8.5 Chapter 9, Section 9.1.7

Example of Global Data-Flow Analysis

Intro to dataflow analysis. CSE 501 Spring 15

Why Data Flow Models? Dependence and Data Flow Models. Learning objectives. Def-Use Pairs (1) Models from Chapter 5 emphasized control

Principles of Program Analysis. Lecture 1 Harry Xu Spring 2013

arxiv: v2 [cs.pl] 16 Sep 2014

Introduction to Machine-Independent Optimizations - 4

4/24/18. Overview. Program Static Analysis. Has anyone done static analysis? What is static analysis? Why static analysis?

Lecture Notes: Dataflow Analysis Examples

Flow Analysis. Data-flow analysis, Control-flow analysis, Abstract interpretation, AAM

Program Analysis and Verification

Live Variable Analysis. Work List Iterative Algorithm Rehashed

Lecture Notes: Widening Operators and Collecting Semantics

CS243 Homework 1. Winter Due: January 23, 2019 at 4:30 pm

CS202 Compiler Construction

Thursday, December 23, The attack model: Static Program Analysis

Program Analysis. Readings

Control-Flow Graphs & Dataflow Analysis. CS4410: Spring 2013

Introduction to Machine-Independent Optimizations - 6

CS 6110 S14 Lecture 38 Abstract Interpretation 30 April 2014

Lecture 5. Data Flow Analysis

Lecture 21 CIS 341: COMPILERS

CS153: Compilers Lecture 17: Control Flow Graph and Data Flow Analysis

Iterative Program Analysis Abstract Interpretation

Dataflow Analysis. Xiaokang Qiu Purdue University. October 17, 2018 ECE 468

process variable x,y,a,b,c: integer begin x := b; -- d2 -- while (x < c) loop end loop; end process; d: a := b + c

Lecture 5. Partial Redundancy Elimination

Compiler Construction 2009/2010 SSA Static Single Assignment Form

Deferred Data-Flow Analysis : Algorithms, Proofs and Applications

Advanced Compilers Introduction to Dataflow Analysis by Example. Fall Chungnam National Univ. Eun-Sun Cho

Appendix Set Notation and Concepts

Data Flow Analysis and Computation of SSA

Midterm 2. CMSC 430 Introduction to Compilers Fall Instructions Total 100. Name: November 21, 2016

Middle End. Code Improvement (or Optimization) Analyzes IR and rewrites (or transforms) IR Primary goal is to reduce running time of the compiled code

Compiler Construction 2010/2011 Loop Optimizations

CS577 Modern Language Processors. Spring 2018 Lecture Optimization

Calvin Lin The University of Texas at Austin

Reaching Definitions and u-d Chaining M.B.Chandak

CS 6353 Compiler Construction, Homework #3

Data Flow Analysis. Suman Jana. Adopted From U Penn CIS 570: Modern Programming Language Implementa=on (Autumn 2006)

Homework Assignment #1 Sample Solutions

Compiler Construction 2016/2017 Loop Optimizations

Global Optimization. Lecture Outline. Global flow analysis. Global constant propagation. Liveness analysis. Local Optimization. Global Optimization

Lecture Compiler Middle-End

A Bad Name. CS 2210: Optimization. Register Allocation. Optimization. Reaching Definitions. Dataflow Analyses 4/10/2013

Control Flow Analysis. Reading & Topics. Optimization Overview CS2210. Muchnick: chapter 7

Functions 2/1/2017. Exercises. Exercises. Exercises. and the following mathematical appetizer is about. Functions. Functions

The Static Single Assignment Form:

Data-Flow Analysis Foundations

CIS 341 Final Examination 4 May 2017

A Gentle Introduction to Program Analysis

Static Analysis and Dataflow Analysis

Advanced Compiler Construction

Dependence and Data Flow Models

Compiler Design Prof. Y. N. Srikant Department of Computer Science and Automation Indian Institute of Science, Bangalore

CS 4120 Lecture 31 Interprocedural analysis, fixed-point algorithms 9 November 2011 Lecturer: Andrew Myers

Dependence and Data Flow Models. (c) 2007 Mauro Pezzè & Michal Young Ch 6, slide 1

Discrete Mathematics. Kruskal, order, sorting, induction

EECS 583 Class 6 Dataflow Analysis

Transcription:

Dataflow analysis (ctd.)

Available expressions Determine which expressions have already been evaluated at each point. A expression x+y is available at point p if every path from the entry to p evaluates x+y and after the last such evaluation prior to reaching p, there are no assignments to x or y Used in : global common subexpression elimination

Example Kills z-w, x+z, etc. x = x+y z = a+b Kills x+y, w*x, etc. Generates a+b 3

Available expressions What is safe? To assume that an expression is not available at some point even if it may be. The computed set of available expressions at point p will be a subset of the actual set of available expressions at p The computed set of unavailable expressions at point p will be a superset of the actual set of unavailable expressions at p Goal : make the set of available expressions as large as possible (i.e. as close to the actual set as possible)

Available expressions How are the gen and kill sets defined? gen[b] = {expressions evaluated in B without subsequently redefining its operands} kill[b] = {expressions whose operands are redefined in B without reevaluating the expression afterwards} What is the direction of the analysis? forward out[b] = gen[b] (in[b] - kill[b])

Available expressions What is the confluence operator? intersection in[b] = out[p], over the predecessors P of B How do we initialize? Start with emptyset!

Avaliable Expressions: equations AE entry (p)= for inititial point p { AE exit (q) (q,p) in the CFD} AE exit (p)= gen AE (p) U (AE entry (p) \ kill AE (p)) 7

Equations n kill AE (n) gen AE (n) 1 {a+b} 2 {a*b} 3 {a+b} 4 {a+b, a*b,a+1} 5 {a+b} for iniitial point p AE entry (p)= { AE exit (q) (q,p) in CFD} AE exit (p)= (AE entry (p) \ kill AE (p)) U gen AE (p) AE entry (1)= AE entry (2)=AE exit (1) AE entry (3)=AE exit (2) AE exit (5) AE entry (4)=AE exit (3) AE entry (5)=AE exit (4) AE exit (1)= AE entry (1) U {a+b} AE exit (2)= AE entry (2) U {a*b} AE exit (3)= AE entry (3) U {a+b} AE exit (4)= AE entry (4) - {a+b, a*b, a+1} AE exit (5)= AE entry (5) U {a+b} 8

Solution AE entry (1)= AE entry (2)=AE exit (1) AE entry (3)=AE exit (2) AE exit (5) AE entry (4)=AE exit (3) AE entry (5)=AE exit (4) AE exit (1)= AE entry (1) U {a+b} AE exit (2)= AE entry (2) U {a*b} AE exit (3)= AE entry (3) U {a+b} AE exit (4)= AE entry (4) - {a+b, a*b, a+1} AE exit (5)= AE entry (5) U {a+b} n AE entry (n) AE exit (n) 1 {a+b} 2 {a+b} {a+b, a*b} 3 {a+b} {a+b} 4 {a+b} 5 {a+b} 9

Result [x:=a+b] 1 ; [y:=a*b] 2 ; while [y>a+b] 3 do { [a:=a+1] 4 ; [x:=a+b] 5 } n AE entry (n) AE exit (n) 1 {a+b} 2 {a+b} {a+b, a*b} 3 {a+b} {a+b} 4 {a+b} 5 {a+b} Even though the expression a is redefined in the cycle (in 4), the expression a+b is always available ai the entry of the cycle (in 3). Viceversa, a*b is available at the first entry of the cycle but it is killed before the next iteration (in 4). 10

Very Busy Expressions Determine whether an expression is evaluated in all paths from a point to the exit. An expression e is very busy at point p if no matter what path is taken from p, e will be evaluated before any of its operands are defined. Used in: Code hoisting If e is very busy at point p, we can move its evaluation at p.

Example if [a>b] 1 then ([x:=b-a] 2 ; [y:=a-b] 3 ) else ([y:=b-a] 4 ; [x:=a-b] 5 ) The two expressions a-b and b-a are both very busy in program point 1. 12

Very Busy Expressions What is safe? To assume that an expression is not very busy at some point even if it may be. The computed set of very busy expressions at point p will be a subset of the actual set of available expressions at p Goal : make the set of very busy expressions as large as possible (i.e. as close to the actual set as possible)

Very Busy Expressions How are the gen and kill sets defined? gen[b] = {all expressions evaluated in B before any definitions of their operands} kill[b] = {all expressions whose operands are defined in B before any possible re-evaluation} What is the direction of the analysis? backward in[b] = gen[b] (out[b] - kill[b])

Very Busy Expressions What is the confluence operator? intersection out[b] = in[s], over the successors S of B

Very Busy Expressions: equations VB exit (p)= if p is final { VB entry (q) (p,q) in the CFD} VB entry (p)= (VB exit (p) \ kill VB (p)) U gen VB (p) 16

n kill VB (n) gen VB (n) 1 2 {b-a} 3 3 {a-b} 4 {b-a} 5 {a-b} x:=b-a y:=a-b 1 a>b 2 4 y:=b-a 5 x:=a-b VB entry (1)=VB exit (1) VB entry (2)=VB exit (2) U {b-a} VB entry (3)={a-b} VB entry (4)=VB exit (4) U {b-a} VB entry (5)={a-b} VB exit (1)= VB entry (2) VB entry (4) VB exit (2)= VB entry (3) VB exit (3)= VB exit (4)= VB entry (5) VB exit (5)= 17

Result if [a>b] 1 then ([x:=b-a] 2 ; [y:=a-b] 3 ) else ([y:=b-a] 4 ; [x:=a-b] 5 ) 3 x:=b-a y:=a-b 1 a>b 2 4 y:=b-a 5 x:=a-b n VB entry (n) VB exit (n) 1 {a-b, b-a} {a-b, b-a} 2 {a-b, b-a} {a-b} 3 {a-b} 4 {a-b, b-a} {a-b} 5 {a-b} 18

Dataflow analysis: a general framework

Dataflow Analysis Compile-time reasoning about run-time values of variables or expressions At different program points Which assignment statements produced value of variable at this point? Which variables contain values that are no longer used after this program point? What is the range of possible values of variable at this program point?

Program Representation Control Flow Graph Nodes N statements of program Edges E flow of control pred(n) = set of all predecessors of n succ(n) = set of all successors of n Start node n 0 Set of final nodes N final

Program Points One program point before each node One program point after each node Join point point with multiple predecessors Split point point with multiple successors

Basic Idea Information about program represented using values from algebraic structure Analysis produces a value for each program point Two flavors of analysis Forward dataflow analysis Backward dataflow analysis

Forward Dataflow Analysis Analysis propagates values forward through control flow graph with flow of control Each node n has a transfer function f n Input value at program point before node Output new value at program point after node Values flow from program points after predecessor nodes to program points before successor nodes At join points, values are combined using a merge function Canonical Example: Reaching Definitions

Backward Dataflow Analysis Analysis propagates values backward through control flow graph against flow of control Each node n has a transfer function f n Input value at program point after node Output new value at program point before node Values flow from program points before successor nodes to program points after predecessor nodes At split points, values are combined using a merge function Canonical Example: Live Variables

Representing the property of interest Dataflow information will be lattice values Transfer functions operate on lattice values Solution algorithm will generate increasing sequence of values at each program point Ascending chain condition will ensure termination Will use to combine values at control-flow join points

Transfer Functions Transfer function f n : P P for each node n in control flow graph f n models effect of the node on the program information

Transfer Functions Each dataflow analysis problem has a set F of transfer functions f: P P Identity function i F F must be closed under composition: f,g F. the function h(x) = f(g(x)) F Each f F must be monotone: x y implies f(x) f(y) Sometimes all f F are distributive: f(x y) = f(x) f(y) Distributivity implies monotonicity

Distributivity Implies Monotonicity Proof of distributivity implies monotonicity Assume f(x y) = f(x) f(y) Must show: x y implies f(x) f(y), and this is equivalent to show that x y = y implies f(x) f(y) = f(y) f(y) = f(x y) (by applying f to both sides) = f(x) f(y) (by distributivity)

Forward Dataflow Analysis Simulates execution of program forward with flow of control For each node n, have in n value at program point before n out n value at program point after n f n transfer function for n (given in n, computes out n ) Require that solution satisfy n. out n = f n (in n ) n n 0. in n = { out m. m in pred(n) } in n0 = I, where I summarizes information at start of program

Worklist Algorithm for Solving Forward Dataflow Equations for each n do out n := f n ( ) in n0 := I; out n0 := f n0 (I) worklist := N - { n 0 } while worklist do remove a node n from worklist in n := { out m. m in pred(n) } out n := f n (in n ) if out n changed then worklist := worklist succ(n)

Correctness Argument Why result satisfies dataflow equations? Whenever process a node n, the algorithm ensures that out n = f n (in n ) Whenever out m changes, the algoritm puts succ(m) on worklist. Consider any node n succ(m). It will eventually come off worklist and the algorithm will set in n := { out m. m in pred(n) } to ensure that in n = { out m. m in pred(n) } So final solution will satisfy dataflow equations

Termination Argument Why does algorithm terminate? Sequence of values taken on by in n or out n is a chain. If values stop increasing, worklist empties and algorithm terminates. If the lattice enjoys the ascending chain property, the algorithm terminates Algorithm terminates for finite lattices For lattices without ascending chain property, me may use widening operator

Widening Operators Detect lattice values that may be part of infinitely ascending chain Artificially raise value to least upper bound of chain Example: Lattice is set of all subsets of integers Could be used to collect possible values taken on by variable during execution of program Widening operator might raise all sets of size n or greater to TOP (likely to be useful for loops)

Reaching Definitions P = powerset of set of all definitions in program (all subsets of set of definitions in program) = (order is ) = I = in n0 = F = all functions f of the form f(x) = a (x-b) b is set of definitions that node kills a is set of definitions that node generates General pattern for many transfer functions f(x) = GEN (x-kill)

Does Reaching Definitions Satisfy the Framework Constraints? satisfies conditions for x y and y z implies x z (transitivity) x y and y x implies y = x (asymmetry) x x (idempotence) F satisfies transfer function conditions x. (x- ) = x.x F (identity) Will show f(x y) = f(x) f(y) (distributivity) f(x) f(y) = (a (x b)) (a (y b)) = a (x b) (y b) = a ((x y) b) = f(x y)

Does Reaching Definitions Framework Satisfy Properties? What about composition? Given f 1 (x) = a 1 (x-b 1 ) and f 2 (x) = a 2 (x-b 2 ) Must show f 1 (f 2 (x)) can be expressed as a (x - b) f 1 (f 2 (x)) = a 1 ((a 2 (x-b 2 )) - b 1 ) = a 1 ((a 2 - b 1 ) ((x-b 2 ) - b 1 )) = (a 1 (a 2 - b 1 )) ((x-b 2 ) - b 1 )) = (a 1 (a 2 - b 1 )) (x-(b 2 b 1 )) Let a = (a 1 (a 2 - b 1 )) and b = b 2 b 1 Then f 1 (f 2 (x)) = a (x b)

General Result All GEN/KILL transfer function frameworks satisfy Identity Distributivity Composition properties

Available Expressions P = powerset of set of all expressions in program (all subsets of set of expressions) = (order is ) = P I = in n0 = F = all functions f of the form f(x) = a (x-b) b is set of expressions that node kills a is set of expressions that node generates Another GEN/KILL analysis

Concept of Conservatism Reaching definitions use as join Optimizations must take into account all definitions that reach along ANY path Available expressions use as join Optimization requires expression to reach along ALL paths Optimizations must conservatively take all possible executions into account. Structure of analysis varies according to way analysis used.

Backward Dataflow Analysis Simulates execution of program backward against the flow of control For each node n, have in n value at program point before n out n value at program point after n f n transfer function for n (given out n, computes in n ) Require that solution satisfies n. in n = f n (out n ) n N final. out n = { in m. m in succ(n) } n N final = out n = O Where O summarizes information at end of program

Worklist Algorithm for Solving Backward Dataflow Equations for each n do in n := f n ( ) for each n N final do out n := O; in n := f n (O) worklist := N - N final while worklist do remove a node n from worklist out n := { in m. m in succ(n) } in n := f n (out n ) if in n changed then worklist := worklist pred(n)

Live Variables P = powerset of set of all variables in program (all subsets of set of variables in program) = (order is ) = O = F = all functions f of the form f(x) = a (x-b) b is set of variables that node kills a is set of variables that node reads