Lecture 6 Foundations of Data Flow Analysis


I. Meet operator
II. Transfer functions
III. Correctness, Precision, Convergence
IV. Efficiency

[ALSU 9.3]

Phillip B. Gibbons
15-745: Foundations of Data Flow

Review: Reaching Definitions

[Figure: flow graph. B1: d0: y = 3; d1: x = 10; d2: y = 11; if e. B2: d3: x = x+1; d4: y = y+4. B3: d5: x = 4; d6: z = x. Annotations: "d2 reaches this point?" and "no".]

A basic block b
- generates definitions: Gen[b], the set of definitions in b that reach the end of b
- propagates definitions: $in[b] - Kill[b]$, where Kill[b] = set of defs (in rest of program) killed by defs in b

Transfer function $f_b$: $out[b] = Gen[b] \cup (in[b] - Kill[b])$
Meet operator: $in[b] = out[p_1] \cup out[p_2] \cup \dots \cup out[p_n]$, where $p_1, \dots, p_n$ are all predecessors of b

f | Gen   | Kill
1 | {1,2} | {3,4,5}
2 | {3,4} | {1,2,5}
3 | {5,6} | {1,3}

Review: Live Variable Analysis

A variable v is live at point p if the value of v is used along some path in the flow graph starting at p.

A basic block b can
- generate live variables: Use[b], the set of locally exposed uses in b
- propagate incoming live variables: $out[b] - Def[b]$, where Def[b] = set of variables defined in the basic block

Backward analysis:
Transfer function for block b: $in[b] = Use[b] \cup (out[b] - Def[b])$
Meet operator: $out[b] = in[s_1] \cup in[s_2] \cup \dots \cup in[s_n]$, where $s_1, \dots, s_n$ are all successors of b

[Figure: flow graph with statements d0: v = 3; d1: x = 10; d2: y = 1; if e; y = 2; v = 2; z = x; y = v. Annotations: "v live at this point?" and "no".]

Review: Data Flow Analysis Framework

                        | Reaching Definitions                  | Live Variables
Domain                  | Sets of definitions                   | Sets of variables
Direction               | forward:                              | backward:
                        | $out[b] = f_b(in[b])$                 | $in[b] = f_b(out[b])$
                        | $in[b] = \wedge\, out[pred(b)]$       | $out[b] = \wedge\, in[succ(b)]$
Transfer function       | $f_b(x) = Gen_b \cup (x - Kill_b)$    | $f_b(x) = Use_b \cup (x - Def_b)$
Meet operation ($\wedge$) | $\cup$                              | $\cup$
Boundary condition      | $out[entry] = \emptyset$              | $in[exit] = \emptyset$
Initial interior points | $out[b] = \emptyset$                  | $in[b] = \emptyset$

Other data flow analysis problems fit into this general framework, e.g., Available Expressions [ALSU 9.2.6].
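The reaching-definitions transfer function and meet reviewed above can be tried out on the Gen/Kill table directly. The following is a minimal sketch, not part of the original slides: the helper names `transfer` and `meet` are hypothetical, and it assumes B1 is the only predecessor of both B2 and B3, which the flattened figure does not confirm.

```python
# Gen/Kill sets copied from the review table (definitions named by index).
GEN = {"B1": {1, 2}, "B2": {3, 4}, "B3": {5, 6}}
KILL = {"B1": {3, 4, 5}, "B2": {1, 2, 5}, "B3": {1, 3}}


def transfer(block, in_set):
    """f_b(x) = Gen[b] U (x - Kill[b])."""
    return GEN[block] | (in_set - KILL[block])


def meet(out_sets):
    """Meet for reaching definitions: union over all predecessors."""
    result = set()
    for s in out_sets:
        result |= s
    return result


# Assumption (not confirmed by the figure): B1 is the only predecessor
# of both B2 and B3, and nothing reaches the start of B1.
out_b1 = transfer("B1", set())
out_b2 = transfer("B2", meet([out_b1]))
out_b3 = transfer("B3", meet([out_b1]))

# in[] of a hypothetical block whose predecessors are B2 and B3.
merged = meet([out_b2, out_b3])

print(out_b1, out_b2, out_b3, merged)
```

Under that assumption, definition 2 survives B3 (it is not in Kill[B3]) but not B2, so it still reaches the hypothetical join point through the union.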

A Unified Framework

Data flow problems are defined by
- Domain of values: V
- Meet operator ($V \times V \to V$), initial value
- A set of transfer functions ($V \to V$)

Usefulness of unified framework
- To answer questions such as correctness, precision, convergence, and speed of convergence for a family of problems
  - If meet operators and transfer functions have properties X, then we know Y about the above.
- Reuse code

Overview: A Check List for Data Flow Problems

Semi-lattice
- set of values
- meet operator
- top, bottom
- finite descending chain?

Transfer functions
- function of each basic block
- monotone
- distributive?

Algorithm
- initialization step (entry/exit, other nodes)
- visit order: rPostOrder
- depth of the graph

I. Meet Operator

Properties of the meet operator
- commutative: $x \wedge y = y \wedge x$
- idempotent: $x \wedge x = x$
- associative: $x \wedge (y \wedge z) = (x \wedge y) \wedge z$
- there is a Top element $\top$ such that $x \wedge \top = x$

The meet operator defines a partial ordering on values:
- $x \le y$ if and only if $x \wedge y = x$ [drawn as y → x in the diagram]
- Transitivity: if $x \le y$ and $y \le z$ then $x \le z$
- Antisymmetry: if $x \le y$ and $y \le x$ then $x = y$
- Reflexivity: $x \le x$

[Figure: lattice for elementwise-min with $\top = (1,1)$ at the top, (1,0) and (0,1) in the middle, and (0,0) at the bottom.]

Partial Order

Top and Bottom elements
- Top $\top$ such that: $x \wedge \top = x$
- Bottom $\bot$ such that: $x \wedge \bot = \bot$

Values and meet operator in a data flow problem define a semi-lattice: there exists a $\top$, but not necessarily a $\bot$.

- If x, y are ordered: $x \le y$, then $x \wedge y = x$ [drawn as y → x in the diagram]
- What if x and y are not ordered? $x \wedge y \le x$, $x \wedge y \le y$, and if $w \le x$ and $w \le y$, then $w \le x \wedge y$

[Figure: two lattices over the definitions d1, d2. Intersection: $\top$ = {d1,d2} at the top, {d1} and {d2} in the middle, {} at the bottom. Union: $\top$ = {} at the top, {d1} and {d2} in the middle, {d1,d2} at the bottom.]
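The meet-operator properties and the induced partial order can be checked by brute force on the two small lattices shown in the figures. This is an illustrative sketch only; `check_semilattice` and `leq` are hypothetical helper names, not something the lecture defines.

```python
from itertools import product


def check_semilattice(values, meet, top):
    """Brute-force the meet-operator properties over a finite value set."""
    for x, y in product(values, repeat=2):
        assert meet(x, y) == meet(y, x)                      # commutative
    for x in values:
        assert meet(x, x) == x                               # idempotent
        assert meet(x, top) == x                             # top element
    for x, y, z in product(values, repeat=3):
        assert meet(x, meet(y, z)) == meet(meet(x, y), z)    # associative
    return True


def leq(meet, x, y):
    """Partial order induced by the meet: x <= y iff x ^ y = x."""
    return meet(x, y) == x


# Lattice 1: pairs of bits with elementwise min, top = (1, 1).
bit_pairs = list(product((0, 1), repeat=2))


def elementwise_min(x, y):
    return tuple(min(a, b) for a, b in zip(x, y))


# Lattice 2: subsets of {d1, d2} with union as the meet, top = {}.
subsets = [frozenset(s) for s in ([], ["d1"], ["d2"], ["d1", "d2"])]


def union(x, y):
    return x | y


print(check_semilattice(bit_pairs, elementwise_min, (1, 1)))
print(check_semilattice(subsets, union, frozenset()))
# (1,0) and (0,1) are not ordered; their meet is the greatest lower bound.
print(leq(elementwise_min, (1, 0), (0, 1)), elementwise_min((1, 0), (0, 1)))
```

Note that with union as the meet, larger sets are lower in the order: {d1,d2} is below {d1}.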

One vs. All Variables/Definitions

Lattice for each variable: e.g. intersection

[Figure: two-element lattice with 1 at the top and 0 at the bottom.]

Lattice for three variables:

[Figure: product lattice with 111 at the top; 101, 110, 011 on the next level; 100, 010, 001 below that; 000 at the bottom.]

Descending Chain

Definition
- The height of a lattice is the largest number of > relations that will fit in a descending chain: $x_0 > x_1 > x_2 > \dots$

Height of values in reaching definitions?
- Height = n, where n is the number of definitions

Important property: finite descending chain

Can an infinite lattice have a finite descending chain?
- Example: Constant Propagation/Folding
  - To determine if a variable is a constant
  - Data values: undef, ..., -1, 0, 1, 2, ..., not-a-constant

II. Transfer Functions

Basic Properties $f: V \to V$
- Has an identity function
  - There exists an f such that $f(x) = x$, for all x.
- Closed under composition
  - if $f_1, f_2 \in F$, then $f_1 \circ f_2 \in F$

Example: $out[b] = Gen[b] \cup (in[b] - Kill[b])$

Monotonicity

A framework $(F, V, \wedge)$ is monotone if and only if
- $x \le y$ implies $f(x) \le f(y)$
- i.e. a smaller or equal input to the same function will always give a smaller or equal output

Equivalently, a framework $(F, V, \wedge)$ is monotone if and only if
- $f(x \wedge y) \le f(x) \wedge f(y)$
- i.e. merging the inputs and then applying f gives a result smaller than or equal to applying the transfer function individually and then merging the results
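Both statements of monotonicity, and closure under composition, can be checked exhaustively for gen/kill transfer functions over a small set of definitions. The sketch below assumes three definitions and two hypothetical Gen/Kill pairs chosen only for illustration.

```python
from itertools import combinations, product

DEFS = ("d1", "d2", "d3")
# Every subset of DEFS: the value set V for reaching definitions.
V = [frozenset(c) for r in range(len(DEFS) + 1) for c in combinations(DEFS, r)]


def meet(x, y):
    return x | y                 # reaching definitions: meet is union


def leq(x, y):
    return meet(x, y) == x       # x <= y iff x ^ y = x


def make_transfer(gen, kill):
    gen, kill = frozenset(gen), frozenset(kill)
    return lambda x: gen | (x - kill)


def monotone_def1(f):
    """x <= y implies f(x) <= f(y)."""
    return all(leq(f(x), f(y)) for x, y in product(V, repeat=2) if leq(x, y))


def monotone_def2(f):
    """f(x ^ y) <= f(x) ^ f(y)."""
    return all(leq(f(meet(x, y)), meet(f(x), f(y)))
               for x, y in product(V, repeat=2))


# Hypothetical Gen/Kill sets, chosen only for illustration.
f1 = make_transfer({"d1"}, {"d2"})
f2 = make_transfer({"d3"}, {"d1"})


def composed(x):
    return f2(f1(x))


# f2 o f1 is again a gen/kill function:
# Gen = Gen2 U (Gen1 - Kill2), Kill = Kill1 U Kill2.
as_gen_kill = make_transfer({"d3"}, {"d1", "d2"})

print(monotone_def1(f1), monotone_def2(f1))
print(all(composed(x) == as_gen_kill(x) for x in V))
```

An exhaustive check like this only works because the value set is tiny (eight subsets); it is meant as a sanity check of the definitions, not a proof.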

Example

Reaching definitions: $f(x) = Gen \cup (x - Kill)$, $\wedge = \cup$
- Definition 1: $x_1 \le x_2$ implies $Gen \cup (x_1 - Kill) \le Gen \cup (x_2 - Kill)$
- Definition 2: $(Gen \cup (x_1 - Kill)) \cup (Gen \cup (x_2 - Kill)) = Gen \cup ((x_1 \cup x_2) - Kill)$

Note: Monotone framework does not mean that $f(x) \le x$
- e.g., reaching definitions for two definitions in a program
- suppose: $f_x$: $Gen_x = \{d_1, d_2\}$; $Kill_x = \{\}$
- then $f(x) = \{d_1, d_2\}$ for any x, including $x = \{\}$

If input(second iteration) $\le$ input(first iteration), then result(second iteration) $\le$ result(first iteration).

[Figure: union lattice with $\top$ = {} at the top, {d1} and {d2} in the middle, {d1,d2} at the bottom; here $x \le y$ iff $x \supseteq y$.]

Distributivity

A framework $(F, V, \wedge)$ is distributive if and only if
- $f(x \wedge y) = f(x) \wedge f(y)$
- i.e. merging the inputs and then applying f is equal to applying the transfer function individually and then merging the results

Example: Constant Propagation is NOT distributive

[Figure: two predecessor blocks, one with a = 2; b = 3 and the other with a = 3; b = 2, joining at a block with c = a + b.]

III. Data Flow Analysis

Definition
- Let $f_1, \dots, f_m \in F$, where $f_i$ is the transfer function for node i
- $f_p = f_{n_k} \circ \dots \circ f_{n_1}$, where p is a path through nodes $n_1, \dots, n_k$
- $f_p$ = identity function, if p is an empty path

Ideal data flow answer:
- For each node n: $\wedge\, f_{p_i}(\top)$, for all possibly executed paths $p_i$ reaching n.

[Figure: branch on "if sqrt(y) >= 0" with successors x = 0 and x = 1.]

But: determining all possibly executed paths is undecidable.

Meet-Over-Paths (MOP)

Err in the conservative direction.

Meet-Over-Paths (MOP):
- For each node n: $MOP(n) = \wedge\, f_{p_i}(\top)$, for all paths $p_i$ reaching n
- a path exists as long as there is an edge in the code
- considers more paths than necessary
- MOP = Perfect-Solution $\wedge$ Solution-to-Unexecuted-Paths
- MOP $\le$ Perfect-Solution
- Potentially more constrained; the solution is smaller, hence conservative
- It is not safe to be $>$ Perfect-Solution!

Desirable solution: as close to MOP as possible
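The constant-propagation counterexample can be worked through concretely. The sketch below models the per-variable lattice with undef as top and not-a-constant as bottom, as listed in the data values earlier; the names `meet_value`, `meet_env`, and `f_add` are hypothetical, and treating "undef + constant" as undef is an assumption of this sketch rather than something the slides state.

```python
UNDEF, NAC = "undef", "not-a-constant"   # top and bottom, per variable


def meet_value(u, v):
    """Meet on one variable: undef is top, not-a-constant is bottom."""
    if u == UNDEF:
        return v
    if v == UNDEF:
        return u
    return u if u == v else NAC


def meet_env(x, y):
    """Meet on whole environments (variable -> value), taken per variable."""
    return {var: meet_value(x[var], y[var]) for var in x}


def f_add(env):
    """Transfer function of the statement c = a + b."""
    a, b = env["a"], env["b"]
    if NAC in (a, b):
        c = NAC
    elif UNDEF in (a, b):
        c = UNDEF        # assumption of this sketch
    else:
        c = a + b
    return {**env, "c": c}


left = {"a": 2, "b": 3, "c": UNDEF}    # after a = 2; b = 3
right = {"a": 3, "b": 2, "c": UNDEF}   # after a = 3; b = 2

merge_then_apply = f_add(meet_env(left, right))          # f(x ^ y)
apply_then_merge = meet_env(f_add(left), f_add(right))   # f(x) ^ f(y)

print(merge_then_apply["c"])   # not-a-constant
print(apply_then_merge["c"])   # 5
```

Merging first loses the fact that a + b is 5 on both paths, so $f(x \wedge y)$ is strictly below $f(x) \wedge f(y)$: the framework is monotone but not distributive.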

Example: MOP considers more paths than Ideal

[Figure: flow graph with blocks B1 to B7. B1 branches to B2 and B3, which join at B4; B4 branches to B5 and B6, which join at B7. Branch conditions shown: "if x == 1" and "if x == 0".]

Assume: B2 & B3 do not update x

Ideal: considers only 2 paths
- B1-B2-B4-B6-B7 (i.e., x=1)
- B1-B3-B4-B5-B7 (i.e., x=0)

MOP: also considers unexecuted paths
- B1-B2-B4-B5-B7
- B1-B3-B4-B6-B7

Solving Data Flow Equations

Example: Reaching definitions
- $out[entry] = \{\}$
- Values = {subsets of definitions}
- Meet operator: $in[b] = \cup\, out[p]$, for all predecessors p of b
- Transfer functions: $out[b] = gen_b \cup (in[b] - kill_b)$

Any solution satisfying the equations = Fixed Point Solution (FP)

Iterative algorithm
- initializes out[b] to {}
- if it converges, then it computes the Maximum Fixed Point (MFP): the MFP is the largest of all solutions to the equations

Properties:
- FP $\le$ MFP $\le$ MOP $\le$ Perfect-solution
- FP, MFP are safe
- $in(b) \le MOP(b)$

Partial Correctness of Algorithm

If the data flow framework is monotone (i.e., $x \le y$ implies $f(x) \le f(y)$), then if the algorithm converges, $IN[b] \le MOP[b]$.

Proof: induction on path lengths
- Define IN[entry] = OUT[entry] and transfer function of entry = identity function
- Base case: path of length 0
  - Proper initialization of IN[entry]
- If true for a path of length k, $p_k = (n_1, \dots, n_k)$, then true for a path of length k+1: $p_{k+1} = (n_1, \dots, n_{k+1})$
  - Assume: $IN[n_k] \le f_{n_{k-1}}(f_{n_{k-2}}(\dots f_{n_1}(IN[entry])))$
  - $IN[n_{k+1}] = OUT[n_k] \wedge \dots \le OUT[n_k] = f_{n_k}(IN[n_k]) \le f_{n_k}(f_{n_{k-1}}(\dots f_{n_1}(IN[entry])))$, by the inductive assumption and monotonicity

Precision

If the data flow framework is distributive (i.e., $f(x \wedge y) = f(x) \wedge f(y)$), then if the algorithm converges, $IN[b] = MOP[b]$.

Monotone but not distributive: behaves as if there are additional paths

[Figure: two predecessor blocks, one with a = 2; b = 3 and the other with a = 3; b = 2, joining at a block with c = a + b.]
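For a distributive framework such as reaching definitions, the claim $IN[b] = MOP[b]$ can be checked on the seven-block example by enumerating every path. This is a rough sketch: the edges are inferred from the paths listed in the example, B1 is treated as the entry, and the Gen/Kill sets are hypothetical since the slides give none for this graph.

```python
# Edges inferred from the paths listed in the example; B1 is the entry.
SUCC = {
    "B1": ["B2", "B3"], "B2": ["B4"], "B3": ["B4"],
    "B4": ["B5", "B6"], "B5": ["B7"], "B6": ["B7"], "B7": [],
}
# Hypothetical Gen/Kill sets: d1, d2 define one variable; d3, d4 another.
GEN = {"B2": {"d1"}, "B3": {"d2"}, "B5": {"d3"}, "B6": {"d4"}}
KILL = {"B2": {"d2"}, "B3": {"d1"}, "B5": {"d4"}, "B6": {"d3"}}


def transfer(block, x):
    """out[b] = gen_b U (in[b] - kill_b)."""
    return frozenset(GEN.get(block, ())) | (x - frozenset(KILL.get(block, ())))


def paths_to(target, node="B1", prefix=()):
    """Yield each path from B1 up to, not including, target (acyclic graph)."""
    if node == target:
        yield prefix
        return
    for s in SUCC[node]:
        yield from paths_to(target, s, prefix + (node,))


def mop(target):
    """Meet (union) of f_p applied to the boundary value, over all paths p."""
    result = frozenset()
    for path in paths_to(target):
        x = frozenset()              # boundary value: out[entry] = {}
        for b in path:
            x = transfer(b, x)
        result |= x
    return result


def iterative_in():
    """Solve the equations by iteration; returns in[b] for every block."""
    order = ["B1", "B2", "B3", "B4", "B5", "B6", "B7"]
    preds = {b: [p for p in order if b in SUCC[p]] for b in order}
    out = {b: frozenset() for b in order}
    in_ = {b: frozenset() for b in order}
    changed = True
    while changed:
        changed = False
        for b in order:
            in_[b] = frozenset().union(*(out[p] for p in preds[b]))
            new_out = transfer(b, in_[b])
            if new_out != out[b]:
                out[b], changed = new_out, True
    return in_


solution = iterative_in()
print(sorted(mop("B7")), sorted(solution["B7"]))
```

Path enumeration is only feasible here because the graph is acyclic; with loops there are infinitely many paths, which is one reason the iterative solution is used in practice.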

Additional Property to Guarantee Convergence

A (monotone) data flow framework converges if there is a finite descending chain.

For each variable IN[b], OUT[b], consider the sequence of values set to each variable across iterations:
- if the sequence for in[b] is monotonically decreasing, the sequence for out[b] is monotonically decreasing (out[b] initialized to $\top$)
- if the sequence for out[b] is monotonically decreasing, the sequence for in[b] is monotonically decreasing

IV. Speed of Convergence

Speed of convergence depends on the order of node visits.
- Reverse the direction for backward flow problems

Reverse Postorder

Step 1: depth-first post order

    main() {
        count = 1;
        Visit(root);
    }

    Visit(n) {
        for each successor s that has not been visited
            Visit(s);
        PostOrder(n) = count;
        count = count + 1;
    }

Step 2: reverse order

    For each node i
        rPostOrder(i) = NumNodes - PostOrder(i)

[Figure: flow graph with blocks B1 to B7 and branch conditions "if x == 1" and "if x == 0".]

Depth-First Iterative Algorithm (forward)

    input: control flow graph CFG = (N, E, Entry, Exit)

    /* Initialize */
    out[Entry] = init_value
    For all other nodes i
        out[i] = T    /* top */
    Change = True

    /* iterate */
    While Change {
        Change = False
        For each node i in rPostOrder {
            in[i] = meet of out[p], for all predecessors p of i
            oldout = out[i]
            out[i] = f_i(in[i])
            if oldout != out[i]
                Change = True
        }
    }
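The two steps above, numbering nodes in reverse postorder and then iterating to a fixed point, can be sketched in Python as follows. This is one possible rendering under stated assumptions, not the course's reference implementation: the graph is the seven-block example with an explicit `Entry` node added, the Gen/Kill sets are hypothetical, and the function names are illustrative.

```python
def reverse_postorder(succ, root):
    """Order nodes by rPostOrder(i) = NumNodes - PostOrder(i)."""
    postorder, visited, count = {}, set(), 1

    def visit(n):
        nonlocal count
        visited.add(n)
        for s in succ[n]:
            if s not in visited:
                visit(s)
        postorder[n] = count
        count += 1

    visit(root)
    rpo = {n: len(postorder) - postorder[n] for n in postorder}
    return sorted(rpo, key=rpo.get)


def iterate_forward(succ, entry, transfer, meet, top, init_value):
    """Forward iterative algorithm; returns out[] and the number of passes."""
    order = reverse_postorder(succ, entry)
    preds = {n: [p for p in order if n in succ[p]] for n in order}
    out = {n: top for n in order}
    out[entry] = init_value
    passes, changed = 0, True
    while changed:
        changed = False
        passes += 1
        for n in order:
            if n == entry:
                continue             # out[entry] is the boundary condition
            in_n = top
            for p in preds[n]:
                in_n = meet(in_n, out[p])
            new_out = transfer(n, in_n)
            if new_out != out[n]:
                out[n], changed = new_out, True
    return out, passes


# Example graph (edges assumed) and hypothetical Gen/Kill sets.
SUCC = {
    "Entry": ["B1"], "B1": ["B2", "B3"], "B2": ["B4"], "B3": ["B4"],
    "B4": ["B5", "B6"], "B5": ["B7"], "B6": ["B7"], "B7": [],
}
GEN = {"B2": {"d1"}, "B3": {"d2"}, "B5": {"d3"}, "B6": {"d4"}}
KILL = {"B2": {"d2"}, "B3": {"d1"}, "B5": {"d4"}, "B6": {"d3"}}


def rd_transfer(block, x):
    return frozenset(GEN.get(block, ())) | (x - frozenset(KILL.get(block, ())))


order = reverse_postorder(SUCC, "Entry")
out, passes = iterate_forward(
    SUCC, "Entry", rd_transfer, lambda x, y: x | y, frozenset(), frozenset())
print(order)
print(sorted(out["B7"]), passes)
```

The recursive `visit` is fine for small graphs; for very deep graphs an explicit stack would avoid Python's recursion limit. The loop stops after the first pass in which no out[] value changes.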

Speed of Convergence

If cycles do not add information
- information can flow in one pass down a series of nodes of increasing order number: e.g., 1 -> 4 -> 5 -> 7 -> 2 -> 4 ... (the prefix 1 -> 4 -> 5 -> 7 is covered in the first pass)
- the number of passes is determined by the number of back edges in the path
- essentially the nesting depth of the graph
- Number of iterations = maximum number of back edges in any acyclic path + 2
  - (2 are necessary even if there are no cycles)
  - (2, not 1, since a last pass is needed in which nothing changes)

What is the depth?
- corresponds to depth of intervals for reducible graphs
- in real programs: average of 2.75

[ALSU 9.6.7]

Summary: A Check List for Data Flow Problems

Semi-lattice
- set of values
- meet operator
- top, bottom
- finite descending chain?

Transfer functions
- function of each basic block
- monotone
- distributive?

Algorithm
- initialization step (entry/exit, other nodes)
- visit order: rPostOrder
- depth of the graph

Friday's Class
- Global common subexpression elimination
- Constant propagation/folding