Flow Analysis. Data-flow analysis, Control-flow analysis, Abstract interpretation, AAM

Similar documents
Compiler Structure. Data Flow Analysis. Control-Flow Graph. Available Expressions. Data Flow Facts

Why Data Flow Models? Dependence and Data Flow Models. Learning objectives. Def-Use Pairs (1) Models from Chapter 5 emphasized control

Dependence and Data Flow Models. (c) 2007 Mauro Pezzè & Michal Young Ch 6, slide 1

A Gentle Introduction to Program Analysis

Program Static Analysis. Overview

Page # 20b -Advanced-DFA. Reading assignment. State Propagation. GEN and KILL sets. Data Flow Analysis

20b -Advanced-DFA. J. L. Peterson, "Petri Nets," Computing Surveys, 9 (3), September 1977, pp

Advanced Programming Methods. Introduction in program analysis

Compiler Optimization and Code Generation

CS 6110 S14 Lecture 38 Abstract Interpretation 30 April 2014

4/24/18. Overview. Program Static Analysis. Has anyone done static analysis? What is static analysis? Why static analysis?

Midterm 2. CMSC 430 Introduction to Compilers Fall Instructions Total 100. Name: November 21, 2016

Dependence and Data Flow Models. (c) 2007 Mauro Pezzè & Michal Young Ch 6, slide 1

Data Flow Analysis. Program Analysis

Data Flow Analysis. CSCE Lecture 9-02/15/2018

Control Flow Analysis. Reading & Topics. Optimization Overview CS2210. Muchnick: chapter 7

Data Flow Analysis. Agenda CS738: Advanced Compiler Optimizations. 3-address Code Format. Assumptions

Static Analysis. Systems and Internet Infrastructure Security

Foundations of Dataflow Analysis

Abstracting Definitional Interpreters

Lecture 6. Abstract Interpretation

Dataflow analysis (ctd.)

More Dataflow Analysis

CS 4120 Lecture 31 Interprocedural analysis, fixed-point algorithms 9 November 2011 Lecturer: Andrew Myers

Thursday, December 23, The attack model: Static Program Analysis

Interprocedural Analysis and Abstract Interpretation. cs6463 1

Lecture 2. Introduction to Data Flow Analysis

Static Program Analysis Part 7 interprocedural analysis

Lecture 4 Introduction to Data Flow Analysis

Algebraic Program Analysis

Compiler Optimisation

Control Flow Analysis with SAT Solvers

Static Program Analysis Part 9 pointer analysis. Anders Møller & Michael I. Schwartzbach Computer Science, Aarhus University

CS553 Lecture Generalizing Data-flow Analysis 3

Homework Assignment #1 Sample Solutions

Lecture Compiler Middle-End

Denotational Semantics. Domain Theory

Lecture 6 Foundations of Data Flow Analysis

Abstract Interpretation

PROGRAM ANALYSIS & SYNTHESIS

Data-Flow Analysis Foundations

Lecture 6 Foundations of Data Flow Analysis

Data-flow Analysis - Part 2

CSE 501 Midterm Exam: Sketch of Some Plausible Solutions Winter 1997

Lecture 5. Data Flow Analysis

Advanced Compiler Construction

Data-flow Analysis. Y.N. Srikant. Department of Computer Science and Automation Indian Institute of Science Bangalore

Principles of Program Analysis. Lecture 1 Harry Xu Spring 2013

Static Program Analysis Part 8 control flow analysis

Harvard School of Engineering and Applied Sciences CS 152: Programming Languages

We can express this in dataflow equations using gen and kill sets, where the sets are now sets of expressions.

Compiler Passes. Optimization. The Role of the Optimizer. Optimizations. The Optimizer (or Middle End) Traditional Three-pass Compiler

Compilation 2012 Static Analysis

Lecture Notes: Dataflow Analysis Examples

Introduction to Machine-Independent Optimizations - 4

Static Analysis and Dataflow Analysis

Compiler Design. Fall Control-Flow Analysis. Prof. Pedro C. Diniz

Data Flow Information. already computed

Chapter 1 Introduction

CS 432 Fall Mike Lam, Professor. Data-Flow Analysis

CS153: Compilers Lecture 17: Control Flow Graph and Data Flow Analysis

Compiler Optimisation

Automatic Determination of May/Must Set Usage in Data-Flow Analysis

Iterative Program Analysis Abstract Interpretation

The Apron Library. Bertrand Jeannet and Antoine Miné. CAV 09 conference 02/07/2009 INRIA, CNRS/ENS

History Jan 27, Still no examples. Advanced Compiler Construction Prof. U. Assmann

We can express this in dataflow equations using gen and kill sets, where the sets are now sets of expressions.

Code Optimization. Code Optimization

Compiler Design. Fall Data-Flow Analysis. Sample Exercises and Solutions. Prof. Pedro C. Diniz

Why Global Dataflow Analysis?

Static analysis and all that

MIT Introduction to Dataflow Analysis. Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology

Logic-Flow Analysis of Higher-Order Programs

Principles of Program Analysis: Data Flow Analysis

Program Analysis Course Notes

compile-time reasoning about the run-time ow represent facts about run-time behavior represent eect of executing each basic block

Data Flow Testing. CSCE Lecture 10-02/09/2017

Dependent Object Types - A foundation for Scala s type system

Review; questions Basic Analyses (2) Assign (see Schedule for links)

Goals of Program Optimization (1 of 2)

Register allocation. Register allocation: ffl have value in a register when used. ffl limited resources. ffl changes instruction choices

Combining Model Checking and Data-Flow Analysis

Control Flow Analysis & Def-Use. Hwansoo Han

Interprocedural Analysis with Data-Dependent Calls. Circularity dilemma. A solution: optimistic iterative analysis. Example

Programming Languages Lecture 15: Recursive Types & Subtyping

Interprocedural Heap Analysis using Access Graphs and Value Contexts

Program Analysis and Verification

Static Analysis. Xiangyu Zhang

A New Abstraction Framework for Affine Transformers

Computing Approximate Happens-Before Order with Static and Dynamic Analysis

Midterm 2. CMSC 430 Introduction to Compilers Fall Instructions Total 100. Name: November 11, 2015

Programming Languages and Compilers Qualifying Examination. Answer 4 of 6 questions.1

CO444H. Ben Livshits. Datalog Pointer analysis

Data Flow Analysis using Program Graphs

Compiler Optimizations. Chapter 8, Section 8.5 Chapter 9, Section 9.1.7

Compiler Optimizations. Chapter 8, Section 8.5 Chapter 9, Section 9.1.7

Model checking pushdown systems

Interprocedural Analysis. CS252r Fall 2015

Department of Computer Sciences CS502. Purdue University is an Equal Opportunity/Equal Access institution.

Compilers. Optimization. Yannis Smaragdakis, U. Athens

Transcription:

Flow Analysis Data-flow analysis, Control-flow analysis, Abstract interpretation, AAM

Helpful Reading: Sections 1.1-1.5, 2.1

Data-flow analysis (DFA) A framework for statically proving facts about program data. Focuses on simple, finite facts about programs. Necessarily over- or under-approximate (may or must). Conservatively considers all possible behaviors. Requires control-flow information; i.e., a control-flow graph. If imprecise, DFA may consider infeasible code paths! Examples: reaching defs, available expressions, liveness,

Control-flow graphs (CFGs) fact(n) entry define fact(n : int) { s := 1; while (n > 1) { s := s*n; n := n-1; } } return s; s := 1; n > 1 s := s*n; n := n-1; return s; fact(n) exit

Control-flow graphs (CFGs) Intraprocedural CFGs may have a single entry/exit. Nodes can be a full basic block or a single statement. Each block or statement has 1 + predecessor and 1 + successor (stmt or block). A fork point is where paths diverge and a join point is where paths come together. fact(n) entry s := 1; n > 1 s := s*n; n := n-1; return s; fact(n) exit

Data-flow analysis Computed by propagating facts forward or backward. Computes may or must information. Reaching definitions/assignments (def-use info): which assignments may reach each variable reference (use). Liveness: which variables are still needed at each point. Available expressions: which expressions are already stored. Very busy expressions: which expressions are computed down all possible paths forward.

Reaching defs, worklist algorithm Reaching definitions is a forward may analysis. gen(s) yields facts generated by a statement s. kill(s) yields facts invalidated by a statement s. pred(s) and succ(s) yield sets of statements preceding/succ. entry(s) = s pred(s)exit(s ); exit(s) = (entry(s) \ kill(s)) gen(s) gen, kill, pred, succ, are fixed for each s; exit is monotonic in entry (if entry(s) grows, exit(s) grows); entry for exits of preds. We iterate rules for entry/exit until reaching a fixed point.

Reaching defs, worklist algorithm Main idea: worklist-based fixed-point algorithm. All entry(s) and exit(s) are initialized to be empty. Add all statements to the worklist; all must be considered. Until the worklist is empty, remove an s from worklist: Compute entry(s) as union of all exit(s ), of predecessor s Compute exit(s) from gen(s), kill(s), and entry(s) If exit(s) was increased, add all succ(s) to the worklist (This version is for forward may kill/gen analyses.)

Reaching defs, worklist algorithm exit(s) = W = all statements s //worklist while W not empty: remove s from W entry(s) = s pred(s)exit(s ) update = (entry(s) \ kill(s)) gen(s) if update!= exit(s): exit(s) = update W = W succ(s) Forward may analysis

Reaching definitions analysis Stmt GEN KILL s := 1; 1 (s,1) (s,*) fact(n) 0 entry s := 1; 1 n > 1 2 n > 1 2 s := s*n; 3 (s,3) (s,1) (s,3) s := s*n; 3 return s; 5 n := n-1; 4 (n,4) (n,0) (n,4) n := n-1; 4 fact(n) exit return s; 5

Reaching definitions analysis (n,0) fact(n) 0 entry s := 1; 1 (n,0) (s,1) (n,4) (s,3) n > 1 2 (n,0) (s,1) (n,4) (s,3) (n,0) (s,3) (n,4) s := s*n; 3 return s; 5 (n,4) (s,3) n := n-1; 4 fact(n) exit

Reaching definitions analysis all(x) = {(x,l) l} RDentry(1) = RDexit(0) RDentry(2) = RDexit(1) RDexit(4) RDentry(3) = RDexit(2) RDentry(4) = RDexit(3) RDentry(5) = RDexit(2) RDexit(0) = {(n,0)} RDexit(1) = RDentry(1)\all(s) {(s,1)} RDexit(2) = RDentry(2) RDexit(3) = RDentry(3)\all(s) {(s,3)} RDexit(4) = RDentry(4)\all(n) {(n,4)} RDexit(5) = RDentry(5) fact(n) 0 entry s := 1; 1 n > 1 2 s := s*n; 3 return s; 5 n := n-1; 4 fact(n) exit

Lattices Facts range over lattices: partial orders with joins (least upper bounds) and meets (greatest lower bounds). {(s,1),(s,3),(n,4)} {(s,1),(s,3)} {(s,1),(n,4)} {(s,3),(n,4)} {(s,1)} {(s,3)} {(n,4)}

Lattices Facts range over lattices: partial orders with joins (least upper bounds) and meets (greatest lower bounds). A partial order is a set X and an ordering (X, ) that is: Reflexive; x. x x Transitive; x,y,z. x y y z x z Anti-symmetric; x,y. x y y x x = y Lattices must also have unique joins and meets for any 2 points. Complete lattices have unique joins/meets for any set. Cartesian product of lattices is a lattice. Map with a lattice co-domain is a lattice.

Reaching definitions analysis The 11 sets RDexit(0), RDentry(1), RDexit(1), RDexit(5), are defined in terms of one another. Written as a vector of sets: RD Our set of equations can be turned into a monotonic F: RD0 RD1 F(RD0) F(RD1) So that a satisfying vector of reachable defs is a fixed point: RD = F(RD) = F n ( ) for some n For example, the join point would end up encoded as: F(,RDexit(1),,RDexit(4), ) = (,RDexit(1) RDexit(4), )

Very busy expressions analysis Computes a set of expressions that are computed down all paths forward before any subexpressions change value. Assignments represent GEN for the right hand side and KILL for expressions containing the right hand side (assigned var). Is a backward must data-flow analysis: Propagates a set of computed expressions backward. Computes the meet (GLB, intersection) of entry(s ) in s succ(s) at each fork point to obtain exit(s).

Very busy expressions analysis Stmt GEN KILL fact(n) 0 entry s := 1; 1 1 s s*n s := 1; 1 n > 1 2 n > 1 2 s := s*n; 3 s*n s s*n s := s*n; 3 return s; 5 n := n-1; 4 n-1 n-1 s*n n := n-1; 4 fact(n) exit return s; 5

Very busy expressions analysis 1 fact(n) 0 entry s := 1; 1 n > 1 2 s*n, n-1 n-1 s := s*n; 3 return s; 5 n := n-1; 4 fact(n) exit

May/Must & Forward/Backward May Must Forward Forward, computes exit(s) from entry(s) Join ( ) at CFG join points e.g., Reaching Defs (use-def) (which assignments reach uses) Forward, computes exit(s) from entry(s) Meet ( ) at CFG join points e.g., Available Expressions Backward Backward, computes entry(s) from exit(s) Join ( ) at CFG fork points e.g., Live Variables Backward, computes entry(s) from exit(s) Meet ( ) at CFG fork points e.g., Very Busy Expressions

exit(s) = W = all statements s //worklist while W not empty: remove s from W entry(s) = s pred(s)exit(s ) update = (entry(s) \ kill(s)) gen(s) if update!= exit(s): exit(s) = update W = W succ(s) Forward may analysis

exit(s) = // except at function entry W = all statements s //worklist while W not empty: remove s from W entry(s) = s pred(s)exit(s ) update = (entry(s) \ kill(s)) gen(s) if update!= exit(s): exit(s) = update W = W succ(s) Forward must analysis

entry(s) = W = all statements s //worklist while W not empty: remove s from W exit(s) = s succ(s)entry(s ) update = (exit(s) \ kill(s)) gen(s) if update!= entry(s): entry(s) = update W = W succ(s) Backward may analysis

entry(s) = // except at function exit W = all statements s //worklist while W not empty: remove s from W exit(s) = s succ(s)entry(s ) update = (exit(s) \ kill(s)) gen(s) if update!= entry(s): entry(s) = update W = W succ(s) Backward must analysis

Abstract interpretation A general methodology for justifying or calculating sound analyses, given a precise semantics for the target language. Abstract interpretation establishes abstract semantic domains and a Galois connection between concrete and abstract. A function alpha (α: X X) defines a notion of abstraction. ^ ^ A function gamma (γ: X X) defines a corresponding notion of concretization (α implies γ and vice versa; more on this ). A concrete interpreter (F: X X) and Galois connection can be used to justify or calculate an abstract interpretation: α F γ F^

Abstraction/Concretization (Galois) ^ α( x ) x if and only if ^ x γ( x )

Abstraction/Concretization (Galois conn.) γ γ( ^ x ) ^ x x α( x ) α X X^

Abstraction/Concretization (Galois conn.) γ {,-1,0,1, } {1,2,3, } γ Int Pos-Int {1} {2} {3} α( 2 ) = Pos-Int α Values Simple Types

Constant Propagation Forward must style of DFA. Or as an abstract interpretation: Uses a flat lattice of constants with top and bottom (C, ): 0 1 2 a #f void Facts become sets of pairs (Var x C) or a map (Env: Var C).

Flow analysis Intraprocedural analysis: considers functions independently. Interprocedural analysis: considers multiple functions together. Whole-program analysis: considers an entire program at once. DFA is great for simple, local-variable-focused analyses. Analysis of heap-allocated data is much harder. The simple case is called pointer analysis (aliases, nullable). The general case is called shape analysis (full data-structures).

What about Scheme or ANF/CPS IRs?

Depends on call sites for the lambda that binds it. (lambda (k f x y) (let ([a (prim + x y)]) (f k a))) What value can f be? Data-flow depends upon control-flow.

Depends on the possible values of parameter f. (lambda (k f x y) (let ([a (prim + x y)]) (f k a))) Where does control propagate from this call-site? Control-flow depends upon data-flow.

The higher-order control-flow problem: Data-flow and control-flow properties are thoroughly entangled and mutually dependent.

The solution? Control-flow analysis: Simultaneously model control-flow behavior and data-flow behavior in a single analysis.

Control-flow analysis Use abstract interpretation to produce a single uniform analysis of all interdependent program properties. CFAs are whole-program interprocedural analyses. (Shivers 1991) k-cfa tracks k latest call-sites; bindings have context. (EXPTIME) 0-CFA produces a single model of each function. (O(n 3 )) Abstracting abstract machines (AAM): a unified methodology for deriving CFAs from concrete (precise) abstract machines! (Van Horn and Might 2010)

CPS lambda calculus e Exp ::= (ae ae ) ae AExp ::= x lam lam Lam ::= (lambda (x ) e) x Var is a set of variables

CPS lambda calculus (semantics) DOMAINS State: Exp x Env Env: Var Clo Clo: Lam x Env ATOMIC EVAL A(lam,env) = (lam,env) A(x,env) = env(x) SMALL-STEP TRANSITION ((ae0 aej), env) (e0, env ), where ((lambda (x1 xj) e0),envc) = A(ae0) cloi = A(aei) env = envc[xi cloi]

Collecting semantics (generates a trace) e Tarski. (1955)

Exp x Env Env: Var Clo Might. Abstract interpreters for free. (2010)

CPS store-passing semantics DOMAINS State: Exp x Env X Store Env: Var Addr Store: Addr Clo Clo: Lam x Env Addr: some infinite set ATOMIC EVAL A(lam,env,st) = (lam,env) A(x,env,st) = st(env(x))

CPS store-passing semantics SMALL-STEP TRANSITION ((ae0 aej), env, st) (e0, env, st ), where ((lambda (x1 xj) e0),envc) = A(ae0) cloi = A(aei) env = envc[xi alloc(xi)] st = st[alloc(xi) cloi] alloc(xi) = fresh address = xi (yields 0-CFA)

Now we may finitize the set Addr State: Exp x Env X Store Env: Var Addr Store: Addr Clo Clo: Lam x Env

Now we may finitize the set Addr State: Exp x Env X Store Env: Var Addr Store: Addr Clo ( ) Clo: Lam x Env

-3-2 -1 0 1 2 3 Negative Zero Int abstraction Positive Int Cousot and Cousot (1977), Cousot (2000)

abstraction

Abstract abstract machines ( [x ax, (f x),, y ay, f af] [ax {Int}, ay {Int}, af {(λ (w) e 1 ), (λ (z) e 2 )}] )

Abstract-state transition ( [x ax, (f x),, y ay, f af] [ax {Int}, ay {Int}, af {(λ (w) e 1 ), (λ (z) e 2 )}] )

e

abstraction abstraction Soundness concrete transition abstract transition

addresses Exponential complexity (, store) { values {

Global store widening (flow-sensitive) Control-flow graph per-point stores

Global store widening (flow-insensitive) ( ), Control-flow graph Global store

variables Flow-insensitive (CPS) 0-CFA ( call { ) call call call, lambdas { Control-flow graph (of call-sites) Variables map to sets of lambdas

Let s live-code this 0-CFA