CS553 Lecture Generalizing Data-flow Analysis 3

Similar documents
Data-flow Analysis. Y.N. Srikant. Department of Computer Science and Automation Indian Institute of Science Bangalore

We can express this in dataflow equations using gen and kill sets, where the sets are now sets of expressions.

We can express this in dataflow equations using gen and kill sets, where the sets are now sets of expressions.

More Dataflow Analysis

Data-flow Analysis - Part 2

Compiler Structure. Data Flow Analysis. Control-Flow Graph. Available Expressions. Data Flow Facts

Program Optimizations using Data-Flow Analysis

Calvin Lin The University of Texas at Austin

Compiler Optimizations. Chapter 8, Section 8.5 Chapter 9, Section 9.1.7

Compiler Optimizations. Chapter 8, Section 8.5 Chapter 9, Section 9.1.7

Compiler Design. Fall Data-Flow Analysis. Sample Exercises and Solutions. Prof. Pedro C. Diniz

Using Static Single Assignment Form

Calvin Lin The University of Texas at Austin

Control Flow Analysis. Reading & Topics. Optimization Overview CS2210. Muchnick: chapter 7

MIT Introduction to Dataflow Analysis. Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology

Interprocedural Analysis. Motivation. Interprocedural Analysis. Function Calls and Pointers

Why Global Dataflow Analysis?

Live Variable Analysis. Work List Iterative Algorithm Rehashed

Compiler Design. Fall Control-Flow Analysis. Prof. Pedro C. Diniz

CS153: Compilers Lecture 23: Static Single Assignment Form

CS153: Compilers Lecture 17: Control Flow Graph and Data Flow Analysis

Calvin Lin The University of Texas at Austin

CS202 Compiler Construction

CS 406/534 Compiler Construction Putting It All Together

Lecture 21 CIS 341: COMPILERS

Data Flow Analysis. CSCE Lecture 9-02/15/2018

Data-Flow Analysis Foundations

Compiler Optimization and Code Generation

Compiler Passes. Optimization. The Role of the Optimizer. Optimizations. The Optimizer (or Middle End) Traditional Three-pass Compiler

CS577 Modern Language Processors. Spring 2018 Lecture Optimization

Lecture Compiler Middle-End

CS153: Compilers Lecture 15: Local Optimization

Global Optimization. Lecture Outline. Global flow analysis. Global constant propagation. Liveness analysis. Local Optimization. Global Optimization

Foundations of Dataflow Analysis

Plan for Today. Concepts. Next Time. Some slides are from Calvin Lin s grad compiler slides. CS553 Lecture 2 Optimizations and LLVM 1

Lecture 6 Foundations of Data Flow Analysis

Data Flow Information. already computed

Lecture 3 Local Optimizations, Intro to SSA

Lecture Notes: Dataflow Analysis Examples

Alias Analysis. Last time Interprocedural analysis. Today Intro to alias analysis (pointer analysis) CS553 Lecture Alias Analysis I 1

Lecture 6 Foundations of Data Flow Analysis

Data Flow Analysis. Program Analysis

Register Allocation. CS 502 Lecture 14 11/25/08

Automatic Determination of May/Must Set Usage in Data-Flow Analysis

Dataflow Analysis. Xiaokang Qiu Purdue University. October 17, 2018 ECE 468

Flow Analysis. Data-flow analysis, Control-flow analysis, Abstract interpretation, AAM

Reuse Optimization. LLVM Compiler Infrastructure. Local Value Numbering. Local Value Numbering (cont)

Data Flow Analysis. Suman Jana. Adopted From U Penn CIS 570: Modern Programming Language Implementa=on (Autumn 2006)

Compilers. Optimization. Yannis Smaragdakis, U. Athens

CS 4120 Lecture 31 Interprocedural analysis, fixed-point algorithms 9 November 2011 Lecturer: Andrew Myers

(How Not To Do) Global Optimizations

IR Optimization. May 15th, Tuesday, May 14, 13

An Overview of GCC Architecture (source: wikipedia) Control-Flow Analysis and Loop Detection

Code optimization. Have we achieved optimal code? Impossible to answer! We make improvements to the code. Aim: faster code and/or less space

Induction Variable Identification (cont)

Data Flow Analysis. Agenda CS738: Advanced Compiler Optimizations. 3-address Code Format. Assumptions

A main goal is to achieve a better performance. Code Optimization. Chapter 9

Comp 204: Computer Systems and Their Implementation. Lecture 22: Code Generation and Optimisation

Optimizing Finite Automata

Introduction to Optimization Local Value Numbering

compile-time reasoning about the run-time ow represent facts about run-time behavior represent eect of executing each basic block

CSE 501 Midterm Exam: Sketch of Some Plausible Solutions Winter 1997

Undergraduate Compilers in a Day

Reading assignment: Reviews and Inspections

Goals of Program Optimization (1 of 2)

Page 1. Reading assignment: Reviews and Inspections. Foundations for SE Analysis. Ideally want general models. Formal models

Program Static Analysis. Overview

ELEC 876: Software Reengineering

CS 432 Fall Mike Lam, Professor. Data-Flow Analysis

CMSC 330: Organization of Programming Languages. Formal Semantics of a Prog. Lang. Specifying Syntax, Semantics

Homework Assignment #1 Sample Solutions

Lecture 4. More on Data Flow: Constant Propagation, Speed, Loops

Why Data Flow Models? Dependence and Data Flow Models. Learning objectives. Def-Use Pairs (1) Models from Chapter 5 emphasized control

Languages and Compiler Design II IR Code Optimization

Compilers. Lecture 2 Overview. (original slides by Sam

Data Structures and Algorithms in Compiler Optimization. Comp314 Lecture Dave Peixotto

Dependence and Data Flow Models. (c) 2007 Mauro Pezzè & Michal Young Ch 6, slide 1

Department of Computer Sciences CS502. Purdue University is an Equal Opportunity/Equal Access institution.

Static Semantics. Lecture 15. (Notes by P. N. Hilfinger and R. Bodik) 2/29/08 Prof. Hilfinger, CS164 Lecture 15 1

Context-Sensitive Pointer Analysis. Recall Context Sensitivity. Partial Transfer Functions [Wilson et. al. 95] Emami 1994

Project Compiler. CS031 TA Help Session November 28, 2011

Dependence and Data Flow Models. (c) 2007 Mauro Pezzè & Michal Young Ch 6, slide 1

CSC D70: Compiler Optimization LICM: Loop Invariant Code Motion

UNIVERSITY OF CALIFORNIA Department of Electrical Engineering and Computer Sciences Computer Science Division. P. N. Hilfinger

CSE P 501 Compilers. Parsing & Context-Free Grammars Hal Perkins Winter /15/ Hal Perkins & UW CSE C-1

Review; questions Basic Analyses (2) Assign (see Schedule for links)

Lecture 4 Introduction to Data Flow Analysis

Register allocation. Register allocation: ffl have value in a register when used. ffl limited resources. ffl changes instruction choices

EDAN65: Compilers, Lecture 06 A LR parsing. Görel Hedin Revised:

Program analysis for determining opportunities for optimization: 2. analysis: dataow Organization 1. What kind of optimizations are useful? lattice al

Dataflow analysis (ctd.)

Control Flow Analysis

CSE 501: Compiler Construction. Course outline. Goals for language implementation. Why study compilers? Models of compilation

Page # 20b -Advanced-DFA. Reading assignment. State Propagation. GEN and KILL sets. Data Flow Analysis

Intermediate representation

Redundant Computation Elimination Optimizations. Redundancy Elimination. Value Numbering CS2210

Compiler Optimisation

Data Flow Analysis and Computation of SSA

Syntax Errors; Static Semantics

20b -Advanced-DFA. J. L. Peterson, "Petri Nets," Computing Surveys, 9 (3), September 1977, pp

Transcription:

Generalizing Data-flow Analysis Announcements Project 2 writeup is available Read Stephenson paper Last Time Control-flow analysis Today C-Breeze Introduction Other types of data-flow analysis Reaching definitions, available expressions, reaching constants Abstracting data-flow analysis What s common among the different analyses? Introduce lattice-theoretic frameworks CS553 Lecture Generalizing Data-flow Analysis 2 The C-Breeze Compiler Characteristics A C source-to-source translator Implemented in C++ Makes heavy use of templates and the Standard Template Library Organized as a set of phases Parse an ANSI C program, producing an AST Dismantle the C program into a LIR form Emit C code as output Can perform various analyses and transformations Can create new phases easily Uses common design patterns: walkers and changers CS553 Lecture Generalizing Data-flow Analysis 3 1

C-Breeze Structure ANSI C cbz myphase.o libc-breeze.a ANSI C, call graph, assembly, statistics, etc. Output is determined by the phases that are invoked CS553 Lecture Generalizing Data-flow Analysis 4 C-Breeze Terminology Phase An organizational unit of work (analysis, transformation, or both) Translation unit A file of a source program Walker Uses visitor pattern to traversing the AST Changer Uses visitor pattern for traversing and changing nodes in the AST CS553 Lecture Generalizing Data-flow Analysis 5 2

Walker Example (src/main/print_walker.*) void print_walker::at_basicblock(basicblocknode * the_bb, Order ord) { if (ord == Preorder) { indent(the_bb); _out << "basicblock: "; if (the_bb->decls().empty()) _out << "(no decls) "; if (the_bb->stmts().empty()) _out << "(no stmts) "; _out << endl; in(); } } if (ord == Postorder) out(); print_walker pw; unit_list_p u; // for each translation unit in the program... for (u=cbz::program.begin(); u!= CBZ::Program.end(); u++) //... walk the AST using the CountAssg walker (*u)->walk (pw); CS553 Lecture Generalizing Data-flow Analysis 6 Other C-Breeze Information No Symbol Table Alias Analysis int main() { int a[10],b,c,d; } a[3] = 4; b = 5+3; c = c+b + a[3]; return 0;... a[3] = 4; T2 = 8; b = 8; T3 = c + 8; T4 = T3 + a[3]; c = T4;... CS553 Lecture Generalizing Data-flow Analysis 7 3

Evaluating Compiler Infrastructures Efficiency Execution time for parsing and transformation Memory requirements Speed of resulting code Front-ends and back-ends Which ones are available and how robust? Intermediate Representation How well defined is the IR-API? Pass Construction How difficult is it to write analyses and transformations? What basic analyses are available? Debugging assistance? CS553 Lecture Generalizing Data-flow Analysis 8 Reaching Definitions Definition A definition (statement) d of a variable v reaches node n if there is a path from d to n such that v is not redefined along that path d v :=... def[v] Uses of reaching definitions Build use/def chains Constant propagation Loop invariant code motion 1 a =...; 2 b =...; 3 for (...) { 4 x = a + b; 5... 6 } d n x := 5 f(x) Reaching definitions of a and b To determine whether it s legal to move statement 4 out of the loop, we need to ensure that there are no reaching definitions of a or b inside the loop CS553 Lecture Generalizing Data-flow Analysis 9 n Does this def of x reach n? can we replace n with f(5)? 4

Computing Reaching Definitions Assumption At most one definition per node We can refer to definitions by their node number Gen[n]: Definitions that are generated by node n (at most one) Kill[n]: Definitions that are killed by node n Defining Gen and Kill for various statement types statement Gen[s] Kill[s] s: t = b op c {s} def[t] {s} s: t = M[b] {s} def[t] {s} s: M[a] = b {} {} s: if a op b goto L {} {} statement Gen[s] Kill[s] s: goto L {} {} s: L: {} {} s: f(a, ) {} {} s: t=f(a, ) {s} def[t] {s} CS553 Lecture Generalizing Data-flow Analysis 10 A Better Formulation of Reaching Definitions Problem Reaching definitions gives you a set of definitions (nodes) Doesn t tell you what variable is defined Expensive to find definitions of variable v Solution Reformulate to include variable e.g., Use a set of (var, def) pairs a x= b y= n in[n] = {(x,a),(y,b),...} CS553 Lecture Generalizing Data-flow Analysis 11 5

Recall Liveness Analysis Definition A variable is live at a particular point in the program if its value at that point will be used in the future (dead, otherwise). Uses of Liveness Register allocation Dead-code elimination d n...:= v def[v] 1 a =...; 2 b =...; 3... 4 x = f(b); If a is not live out of statement 1 then statement 1 is dead code. CS553 Lecture Generalizing Data-flow Analysis 12 Available Expressions Definition An expression, x+y, is available at node n if every path from the entry node to n evaluates x+y, and there are no definitions of x or y after the last evaluation entry...x+y......x+y... x and y not defined along blue edges n...x+y... CS553 Lecture Generalizing Data-flow Analysis 13 6

Available Expressions for CSE How is this information useful? Common Subexpression Elimination (CSE) If an expression is available at a point where it is evaluated, it need not be recomputed Example 2 i := i + 1 b := 4 * i 1 i := j a := 4 * i 3 c := 4 * i 2 i := i + 1 t := 4 * i b := t 1 i := j t := 4 * i a := t 3 c := t CS553 Lecture Generalizing Data-flow Analysis 14 Aspects of Data-flow Analysis Must or may Information guaranteed or possible Direction forward or backward Flow values variables, definitions,... Initial guess universal or empty set Kill due to semantics of stmt what is removed from set Gen due to semantics of stmt what is added to set Merge how sets from two control paths compose CS553 Lecture Generalizing Data-flow Analysis 15 7

Must vs. May Information Must information Implies a guarantee May information Identifies possibilities Liveness? Available expressions? May Must safe overly large set overly small set desired information small set large set Gen add everything that add only facts that are might be true guaranteed to be true Kill remove only facts that remove everything that are guaranteed to be true might be false merge union intersection initial guess empty set universal set CS553 Lecture Generalizing Data-flow Analysis 16 Reaching Definitions: Must or May Analysis? Consider constant propagation d x = 4 n d x = 5 f(x) We need to know if d might reach node n CS553 Lecture Generalizing Data-flow Analysis 17 8

Defining Available Expressions Analysis Must or may Information? Direction? Flow values? Initial guess? Kill? Gen? Merge? Must Forward Sets of expressions Universal set Set of expressions killed by statement s Set of expressions evaluated by s Intersection CS553 Lecture Generalizing Data-flow Analysis 18 Reaching Constants Goal Compute value of each variable at each program point (if possible) Flow values Set of (variable,constant) pairs Merge function Intersection Data-flow equations Effect of node n x = c kill[n] = {(x,d) d} gen[n] = {(x,c)} Effect of node n x = y + z kill[n] = {(x,c) c} gen[n] = {(x,c) c=valy+valz, (y, valy) in[n], (z, valz) in[n]} CS553 Lecture Generalizing Data-flow Analysis 19 9

Reality Check! Some definitions and uses are ambiguous We can t tell whether or what variable is involved e.g., *p = x; /* what variable are we assigning?! */ Unambiguous assignments are called strong updates Ambiguous assignments are called weak updates Solutions Be conservative Sometimes we assume that it could be everything e.g., Defining *p (generating reaching definitions) Sometimes we assume that it is nothing e.g., Defining *p (killing reaching definitions) Try to figure it out: alias/pointer analysis (more later) CS553 Lecture Generalizing Data-flow Analysis 20 Lattices Define lattice L = (V, ) V is a set of elements of the lattice is a binary relation over the elements of V (meet or greatest lower bound) 001 011 000 010 101 100 110 111 Properties of x,y V x y V (closure) x,y V x y = y x (commutativity) x,y,z V (x y) z = x (y z) (associativity) CS553 Lecture Generalizing Data-flow Analysis 21 10

Lattices (cont) Under ( ) Imposes a partial order on V x y x y = x 001 = 000 010 100 Top ( ) A unique greatest element of V (if it exists) x V { }, x 011 101 = 111 110 Bottom ( ) A unique least element of V (if it exists) x V { }, x Height of lattice L The longest path through the partial order from greatest to least element (top to bottom) CS553 Lecture Generalizing Data-flow Analysis 22 Data-Flow Analysis via Lattices Relationship Elements of the lattice (V) represent flow values (in[] and out[] sets) e.g., Sets of live variables for liveness represents best-case information (initial flow value) e.g., Empty set represents worst-case information e.g., Universal set (meet) merges flow values e.g., Set union If x y, then x is a conservative approximation of y e.g., Superset {i} {i,j} {} {j} {i,k} {i,j,k} {k} {j,k} CS553 Lecture Generalizing Data-flow Analysis 23 11

Data-Flow Analysis via Lattices (cont) Remember what these flow values represent At each program point a lattice element represents an in[] set or an out[] set {} Initially print(x) x = y print(y) {x} {x,y} {y} Finally x = y { y } { x,y } { x } print(x) print(y) { y } CS553 Lecture Generalizing Data-flow Analysis 24 Data-Flow Analysis Frameworks Data-flow analysis framework A set of flow values (V) A binary meet operator ( ) A set of flow functions (F) (also known as transfer functions) Flow Functions F = {f: V V} f describes how each node in CFG affects the flow values Flow functions map program behavior onto lattices CS553 Lecture Generalizing Data-flow Analysis 25 12

Visualizing DFA Frameworks as Lattices Example: Liveness analysis with 3 variables S = {v1, v2, v3} = T V: 2 S = {{v1,v2,v3}, {v1,v2},{v1,v3},{v2,v3}, {v1},{v2},{v3}, } Meet ( ): { v1 } { v2 } { v3 } : Top(T): { v1,v2 } { v1,v3 } { v2,v3 } Bottom ( ): V F: {f n (X) = Gen n (X Kill n ), n} { v1,v2,v3 } = Inferior solutions are lower on the lattice More conservative solutions are lower on the lattice CS553 Lecture Generalizing Data-flow Analysis 26 Concepts Data-flow analyses are distinguished by Flow values (initial guess, type) May/must Direction Gen Kill Merge Complication Ambiguous references (strong/weak updates) Lattices Conservative approximation Optimistic (initial guess) Data-flow analysis frameworks Tuples of lattices CS553 Lecture Generalizing Data-flow Analysis 27 13