CS 4120 Lecture 31: Interprocedural analysis, fixed-point algorithms
9 November 2011
Lecturer: Andrew Myers

These notes are not yet complete.

1 Interprocedural analysis

Some analyses are not sufficiently precise when done one procedure at a time, because there is not enough information about what other procedures (either callers or callees) might do. Points-to analysis is one example where an interprocedural analysis often yields significant benefits.

1.1 Example

Abstractions for allocation are a common pattern, as in the following code:

    f() = {
        a: int[] = newarray(5);
        b: int[] = newarray(5);
        ... use a and b ...
    }

    newarray(n: int): int[] = {
        x: int[n];
        ... initialize x somehow ...
        return x;
    }

To do a good job of optimizing function f, the compiler is likely to need to reason accurately about the possible aliasing of a and b with each other and with other heap locations. With no interprocedural analysis, however, there is no information about what the result of newarray might point to, so a and b must be treated as potential aliases for almost everything.

2 Context-insensitive analysis

A simple strategy for getting more information is to combine the control flow graphs of the various functions, connecting them according to their calls and returns. The control flow graphs are connected according to the call graph of the program. How this strategy works is shown in Figure 1 for the example program. The analysis is context-insensitive: it analyzes the function newarray for all possible calling contexts at once. The result of this combined analysis is then sent back to all call sites via the return node. This will at least allow the compiler to determine that a and b are both generated by allocation alloc1, and therefore don't alias other variables. However, it won't be able to determine that a and b also don't alias each other.

The combined CFG of Figure 1 is an abstraction of the real call tree that arises at run time. The call tree has one node for each actual procedure invocation, and therefore contains two distinct nodes for newarray. One way of recovering precision is to connect the CFGs according to the possible call trees instead of the call graph, cloning the newarray CFG and creating two separate copies with distinguished allocation sites alloc1 and alloc2. Then we'd know that a can point only to alloc1 and b only to alloc2: no aliasing is possible. The result of context-sensitive analysis using cloning is shown in Figure 2.

However, cloning only works if the call tree is finite (and not huge). In general there can be an infinite number of possible call trees, and therefore an infinite number of calling contexts in which a procedure needs to be analyzed. We need a way to abstract over possible contexts.
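To make the contrast concrete, here is a minimal sketch (my own illustration, with hypothetical names, not the notes' implementation) that models points-to facts as maps from variables to sets of allocation sites. The only difference between the two analyses is whether each call site gets its own abstract allocation site or all calls share one:

    # Hypothetical sketch: how context sensitivity affects points-to
    # results for the newarray example. Allocation sites are strings;
    # points-to facts map each variable to a set of sites.

    def analyze(clone_per_call_site: bool) -> dict[str, set[str]]:
        calls = ["f1", "f2"]  # the two calls to newarray inside f
        pts: dict[str, set[str]] = {}
        for i, site in enumerate(calls):
            # Context-insensitive: every call shares one abstract site.
            # Cloning: each call site gets its own distinguished site.
            alloc = f"alloc_{site}" if clone_per_call_site else "alloc1"
            var = "a" if i == 0 else "b"
            pts.setdefault(var, set()).add(alloc)
        return pts

    print(analyze(False))  # {'a': {'alloc1'}, 'b': {'alloc1'}}: a, b may alias
    print(analyze(True))   # {'a': {'alloc_f1'}, 'b': {'alloc_f2'}}: no aliasing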

[Figure 1: Context-insensitive analysis (dataflow values shown in blue). The f CFG and a single shared newarray CFG are wired together; the analysis computes {a ↦ alloc1} after the first call, {a ↦ alloc1, b ↦ alloc1} after the second, and {x ↦ alloc1} at the return of newarray.]

[Figure 2: Context-sensitive analysis with cloning. Each call site gets its own copy of the newarray CFG (contexts 1 and 2, with allocation sites alloc1 and alloc2), yielding {a ↦ alloc1, b ↦ alloc2} in f.]

[Figure 3: Calling contexts example: j calls k and m; k calls g at two sites k₁ and k₂; m calls g and h; g and h call f.]

3 Calling contexts

Thinking backward from a given call to a procedure f, there is some number of distinct callers, and in turn, these callers may have multiple distinct callers, as depicted in Figure 3. This diagram expands the call graph of the program into a tree in the backward direction. For example, there are two distinct calls to procedure g from procedure k, labeled k₁ and k₂. Here, at least four calling contexts for f are shown, which can be enumerated in terms of the call paths from the root to f. These paths can be named by their call strings, the sequences of names visited along the path: jk₁gf, jk₂gf, jmgf, and jmhf. Each of these contexts could potentially provide and receive different information from its respective call to f, so additional precision might be gained by making a separate copy of the f CFG for each of these contexts. But in general the number of such contexts is infinite, in particular if there is any recursion.

4 Abstracting contexts

The solution to the problem of too many contexts, as with the problem of alias analysis, is abstraction: the analysis needs to project classes of calling contexts onto a single abstracted context, and do an analysis that conservatively captures the analyses that would be done on all contexts in the class. Context-insensitive analysis can be viewed as an instance of this approach, in which all contexts are abstracted to a single context. However, more fine-grained abstractions can do better.

4.1 k-limiting context-sensitive analysis

One approach to abstraction is to collapse together all contexts that agree on the last k procedure calls. This is called k-limiting context-sensitive analysis, also known as k-CFA.

4.2 Abstracting cycles

Because of recursion, the last k calls may include a recursive cycle. Another approach to context sensitivity is to consider all call strings to be equivalent if they agree once repeated recursive cycles are removed. So the call strings hgf, hggf, and hgggf would all be considered as one context, hgf. Analysis in this context entails having the CFGs of h, g, and f wired together in a way that loses information about whether calls to g are coming from h or from g itself.
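As a small illustration of these two abstractions, here is a sketch (my own, hypothetical names) that maps a concrete call string to its abstract context. For simplicity the cycle-collapsing version handles only directly recursive cycles of length one (immediate repeats); the general case removes repeated cycles of any length:

    # Hypothetical sketch of two context abstractions over call strings.
    # A call string is a tuple of names, e.g. ("j", "k1", "g", "f").

    def k_limit(call_string: tuple[str, ...], k: int) -> tuple[str, ...]:
        """Keep only the last k calls (k-limiting / k-CFA)."""
        return call_string[-k:]

    def collapse_cycles(call_string: tuple[str, ...]) -> tuple[str, ...]:
        """Drop immediately repeated names, so hggg...f collapses to hgf."""
        out: list[str] = []
        for name in call_string:
            if not out or out[-1] != name:
                out.append(name)
        return tuple(out)

    assert k_limit(("j", "k1", "g", "f"), 2) == ("g", "f")
    # hgf, hggf, hgggf all map to the single context hgf:
    assert collapse_cycles(("h", "g", "g", "g", "f")) == ("h", "g", "f")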

5 Using summaries

We would like to avoid rerunning a full program analysis of a given procedure for each calling context. This may be accomplished by computing a summary of the way the procedure affects information. If the output information can be summarized as a simple function of the input information, the summary can be used instead of the corresponding dataflow analysis. The summary is typically parameterized on the input information, which often requires making the underlying analysis more general so that it can be parameterized.

For example, summarizing the behavior of the newarray procedure, we can see that when it is called in a context c, it performs an allocation which we might name alloc_{newarray/c}. The result of analyzing the procedure is the mapping {x ↦ alloc_{newarray/c}}. Note that to summarize the procedure accurately, we generalize the points-to analysis to allow heap locations to be indexed by contexts. Given the two calls to newarray from program points we might name f₁ and f₂, applying the summary to these contexts yields the mappings {x ↦ alloc_{newarray/f₁}} and {x ↦ alloc_{newarray/f₂}}. Therefore the analysis of f determines the points-to information for a and b as {a ↦ alloc_{newarray/f₁}, b ↦ alloc_{newarray/f₂}}, so the two variables are not aliases.

6 Solving equations with fixed points

Many problems can be solved iteratively, and we have seen some of them during this course: dataflow analysis, loop-invariant expressions, LL(1) grammar construction. Type inference and information flow analysis are some other examples. The common feature of all these problems is that they can be expressed as a system of n equations over n variables x₁, ..., xₙ, whose values are elements drawn from some lattice L with finite height h and top element ⊤. The equations come in one of the following two forms:

    x₁ = f₁(x₁, ..., xₙ)        x₁ ⊑ f₁(x₁, ..., xₙ)
    x₂ = f₂(x₁, ..., xₙ)        x₂ ⊑ f₂(x₁, ..., xₙ)
        ⋮                           ⋮
    xₙ = fₙ(x₁, ..., xₙ)        xₙ ⊑ fₙ(x₁, ..., xₙ)

where, in addition, the functions fᵢ are monotonic. Given these constraints, we know that we can use iterative analysis to find maximal solutions to equations of either kind. Because the solution is maximal, the solution iterative analysis finds to one of these sets of equations (using = or ⊑) must also solve the corresponding set of equations of the other kind (⊑ or =, respectively). Therefore we will simply assume the equations use equality.

The simple iterative analysis algorithm uses the equations to evaluate x₁, x₂, x₃, and so on up to xₙ, and then repeats until convergence. However, for many problems we can speed up convergence asymptotically by evaluating the equations for the xᵢ in a different order. The key to speeding up analysis is to observe that for a given system of equations, not all the functions fᵢ depend on all the variables x₁, ..., xₙ. For any system of equations, we can draw a dependency graph on the variables, where each variable is a node and there is an edge from xᵢ to xⱼ if fⱼ depends directly on xᵢ. For dataflow analysis, this dependency graph is essentially the same as the control flow graph.

6.1 Example

Consider the dependency graph shown in Figure 4. When the equations are applied in the order 1, ..., n, it can take n − 1 iterations to propagate information all the way from xₙ to x₁, for total time O(n²). On the other hand, if the variables were evaluated in reverse order, convergence would be achieved after just one pass, in O(n) time.
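To make the simple iterative algorithm concrete, here is a minimal self-contained sketch (mine, not the course's code) that solves such a system by repeated round-robin passes. The lattice is the powerset of a small universe ordered by inclusion, with ⊤ = the whole universe; intersection and union are monotonic, so iteration from ⊤ converges to the greatest solution:

    # Round-robin fixed-point solver over a finite-height lattice
    # (hypothetical example system, not from the notes).

    U = frozenset(range(4))  # the universe; top element of the lattice

    # One monotonic equation per variable: x_i = f_i(x_0, ..., x_n-1).
    equations = [
        lambda x: x[1] & frozenset({0, 1}),  # x0 = x1 ∩ {0,1}
        lambda x: x[2] | frozenset({2}),     # x1 = x2 ∪ {2}
        lambda x: x[0] & x[1],               # x2 = x0 ∩ x1
    ]

    def solve(eqs) -> list[frozenset]:
        x = [U] * len(eqs)           # initialize every variable to top
        changed = True
        while changed:               # full passes until nothing changes
            changed = False
            for i, f in enumerate(eqs):
                new = f(x)
                if new != x[i]:
                    x[i], changed = new, True
        return x

    # Converges to x0 = {0,1}, x1 = {0,1,2}, x2 = {0,1}.
    print(solve(equations))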

[Figure 4: A simple dependency graph: the chain xₙ → xₙ₋₁ → ⋯ → x₂ → x₁.]

7 Reverse postorder algorithm

A useful refinement of the basic iterative analysis algorithm is to topologically sort the variables, at least approximately. This is accomplished by doing a depth-first postorder traversal of the nodes in the dependency graph, and ordering the evaluations in the reverse of this order. This means a node is visited before all of its successor nodes, unless the successor is reached by a back edge. For example, if we do a depth-first traversal of the graph in Figure 4, we leave each node in the order x₁, x₂, ..., xₙ. The reverse of this postorder is then to update xₙ, xₙ₋₁, ..., x₁, achieving convergence after the first iteration.

8 Worklist algorithm

There is no need to update variables whose predecessors have not changed. This motivates the worklist algorithm for solving equations, which we can now write in a more general form:

1. Initialize all xᵢ to ⊤.
2. Initialize the worklist w := {i | i ∈ 1..n}.
3. While there is some i ∈ w, repeat:
   (a) w := w \ {i}
   (b) xᵢ := fᵢ(x₁, ..., xₙ)
   (c) if xᵢ changed, w := w ∪ {j | fⱼ depends on xᵢ}

The worklist is a set of node identifiers. Usually the set acts as a FIFO queue, so that newly added elements go to the end of the queue. The initial ordering of nodes can be reverse postorder.

9 Strongly-connected components

Figure 5 shows a dependency graph where it makes sense to be more intelligent about worklist ordering. The dependency graph contains strongly connected components (SCCs) spanning multiple nodes. In fact, every directed graph can be reduced to a directed acyclic graph (DAG) whose nodes are its strongly connected components. It makes sense to propagate information through this DAG, allowing each SCC to converge before propagating its information into the rest of the DAG.

Strongly connected components can be found in linear time with two depth-first traversals:

1. Do a postorder traversal of the graph.
2. Do a traversal of the transposed graph (following edges backward), picking the nodes to start from in reverse postorder. All nodes reached from a starting node are part of the same SCC as that node.

Further, the SCCs will be found in topologically sorted order. For example, in the graph of Figure 5, one postorder traversal of the graph starting from node 1 (it doesn't matter much which node we start from) is: 9, 8, 3, 2, 7, 6, 5, 4, 1. Now we start from the end of this list, with node 1. No other nodes are reachable from it in the transposed graph, so it is its own SCC. Node 4 is the same. From node 5 we can reach nodes 6 and 7, so (5,6,7) is an SCC. From node 2 we reach node 3, so (2,3) is an SCC. And finally, (8,9) is an SCC. The SCCs are thus discovered in the topologically sorted order (1), (4), (5,6,7), (2,3), (8,9).
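A compact sketch of the worklist algorithm in Python follows (my own illustration, not the course's code). It seeds the FIFO queue in reverse postorder and re-enqueues only the dependents of variables that actually changed; deps[i] lists the variables j whose equations fⱼ depend directly on xᵢ, i.e., the successors of xᵢ in the dependency graph:

    from collections import deque

    def reverse_postorder(deps, n):
        """Order node ids so each node precedes its successors,
        except along back edges."""
        order, seen = [], set()
        def dfs(u):
            seen.add(u)
            for v in deps[u]:
                if v not in seen:
                    dfs(v)
            order.append(u)          # postorder: appended after successors
        for u in range(n):
            if u not in seen:
                dfs(u)
        return order[::-1]           # reverse postorder

    def solve(eqs, deps, top):
        n = len(eqs)
        x = [top] * n                             # 1. initialize to top
        w = deque(reverse_postorder(deps, n))     # 2. seed the worklist
        in_w = set(w)
        while w:                                  # 3. drain the worklist
            i = w.popleft()                       # (a) remove some i
            in_w.discard(i)
            new = eqs[i](x)                       # (b) re-evaluate x_i
            if new != x[i]:                       # (c) if x_i changed,
                x[i] = new
                for j in deps[i]:                 #     enqueue dependents
                    if j not in in_w:
                        w.append(j)
                        in_w.add(j)
        return x

For the chain of Figure 4 (deps[i] containing only the next node down the chain), reverse postorder visits xₙ first, so the queue drains in a single pass, matching the O(n) bound from Section 6.1.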

[Figure 5: Dependency graph with strongly connected components, over nodes 1 through 9.]

10 Priority-SCC iteration

Priority-SCC iteration works by propagating information through the DAG of SCCs. Essentially, we use the reverse postorder algorithm, but on SCCs rather than on individual nodes. To make each SCC converge quickly, we approximately topologically sort the nodes within each SCC by using reverse postorder, driven by a worklist. In the example above, this gives us, e.g., (1), (4), (5,6,7), (2,3), (8,9). Each SCC is iterated over repeatedly until it converges; the analysis then proceeds to the next SCC in the list. A worklist is used to avoid unnecessary updates within an SCC. In the worst case, where the whole graph forms one SCC, this is equivalent to reverse postorder iteration.
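The two-pass SCC construction from Section 9 can be sketched as follows (a minimal Kosaraju-style implementation of my own, not the course's code). It returns the SCCs in topologically sorted order, ready to drive priority-SCC iteration. The example graph g is a reconstruction consistent with the lecture's stated traversal (the figure's exact edges are not recoverable): it yields the postorder 9, 8, 3, 2, 7, 6, 5, 4, 1 and the SCC ordering (1), (4), (5,6,7), (2,3), (8,9):

    # Hypothetical sketch of the two-pass (Kosaraju) SCC algorithm.
    # succ maps each node to the list of its successors.

    def sccs_in_topological_order(succ: dict) -> list[list]:
        # Pass 1: postorder traversal of the graph.
        post, seen = [], set()
        def dfs1(u):
            seen.add(u)
            for v in succ.get(u, []):
                if v not in seen:
                    dfs1(v)
            post.append(u)
        for u in succ:
            if u not in seen:
                dfs1(u)

        # Transpose the graph (follow edges backward).
        pred = {u: [] for u in succ}
        for u, vs in succ.items():
            for v in vs:
                pred[v].append(u)

        # Pass 2: traverse the transposed graph, starting nodes in
        # reverse postorder; each traversal discovers one SCC.
        sccs, assigned = [], set()
        def dfs2(u, comp):
            assigned.add(u)
            comp.append(u)
            for v in pred[u]:
                if v not in assigned:
                    dfs2(v, comp)
        for u in reversed(post):
            if u not in assigned:
                comp = []
                dfs2(u, comp)
                sccs.append(comp)
        return sccs  # SCCs appear in topologically sorted order

    g = {1: [2, 4], 2: [3], 3: [2, 8], 4: [5], 5: [6], 6: [7],
         7: [5, 8], 8: [9], 9: [8]}
    # Prints [[1], [4], [5, 7, 6], [2, 3], [8, 9]]: the SCCs
    # (1), (4), (5,6,7), (2,3), (8,9) in topological order.
    print(sccs_in_topological_order(g))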