CS153: Compilers Lecture 17: Control Flow Graph and Data Flow Analysis

CS153: Compilers Lecture 17: Control Flow Graph and Data Flow Analysis Stephen Chong https://www.seas.harvard.edu/courses/cs153

Announcements: Project 5 is out, due Tuesday Nov 13 (14 days). Project 6 is out, due Tuesday Nov 20 (21 days). Project 7 will be released today, due Thursday Nov 29 (30 days).

Today: Control Flow Graphs; Basic Blocks; Dataflow Analysis; Available Expressions.

Optimizations So Far: We've looked only at local optimizations, limited to pure expressions, avoiding variable capture by giving variables unique names.

Next Few Lectures: Imperative representations. Like MIPS assembly at the instruction level, except we assume an infinite number of temps and abstract away details of the calling convention, but with a bit more structure: the code is organized into a control-flow graph. Nodes are labeled basic blocks of instructions, single-entry and single-exit, i.e., no jumps, branches, or labels inside a block. Edges are jumps/branches to basic blocks. Dataflow analysis: computing information to answer questions about data flowing through the graph.

Control-Flow Graphs: A graphical representation of a program. Nodes represent statements; edges represent control flow: how execution traverses the program.

x := 0;
y := 0;
while (n > 0) {
  if (n % 2 = 0) { x := x + n; y := y + 1; }
  else { y := y + n; x := x + 1; }
  n := n - 1;
}
print(x);

[CFG figure: x := 0 and y := 0 flow into the test n > 0; its true edge reaches the test n % 2 = 0, whose true edge leads to x := x + n; y := y + 1 and whose false edge to y := y + n; x := x + 1; both arms reach n := n - 1, which loops back to n > 0; the false edge of n > 0 exits to print(x).]

Basic Blocks: We will require that the nodes of a control-flow graph are basic blocks: sequences of statements that can be entered only at the beginning of the block and exited only at the end, by branching, by an unconditional jump to another block, or by returning from the function. Basic blocks simplify representation and analysis.

Basic Blocks: basic block = single entry, single exit. [Figure: the same example program partitioned into basic blocks: x := 0; y := 0 flows into the test n > 0; its true edge leads to the test n % 2 = 0, whose arms x := x + n; y := y + 1 and y := y + n; x := x + 1 both reach n := n - 1, which jumps back to n > 0; the false edge reaches print(x).]
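The partition into basic blocks can be computed directly from a linear instruction list: start a new block at each label (a possible entry point) and end one after each jump, branch, or return (a possible exit point). A minimal sketch in Python, with a made-up tuple encoding of instructions (not the lecture's IR):

```python
# Split a linear instruction sequence into basic blocks.
# Instructions are illustrative tuples: ("label", name), ("jump", target),
# ("branch", cond, t1, t2), ("ret",), or ("op", text) for ordinary statements.

TERMINATORS = {"jump", "branch", "ret"}

def basic_blocks(instrs):
    blocks, current = [], []
    for ins in instrs:
        if ins[0] == "label" and current:
            blocks.append(current)        # a label may only start a block
            current = []
        current.append(ins)
        if ins[0] in TERMINATORS:
            blocks.append(current)        # control may leave only at the end
            current = []
    if current:
        blocks.append(current)
    return blocks

prog = [
    ("label", "loop"), ("op", "x := x + n"), ("op", "y := y + 1"),
    ("branch", "n > 0", "loop", "done"),
    ("label", "done"), ("op", "print(x)"), ("ret",),
]
```

On `prog` this produces two blocks: the loop body ending in the branch, and the exit block ending in the return.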

CFG Abstract Syntax:

type operand = Int of int | Var of var | Label of label

type block =
  | Return of operand
  | Jump of label
  | Branch of operand * test * operand * label * label
  | Move of var * operand * block
  | Load of var * int * operand * block
  | Store of var * int * operand * block
  | Assign of var * primop * (operand list) * block
  | Call of var * operand * (operand list) * block

type proc = { vars : var list,
              prologue : label,
              epilogue : label,
              blocks : (label * block) list }

Differences with Monadic Form: Essentially MIPS assembly with an infinite number of registers. No lambdas, so easy to translate to MIPS modulo register allocation and assignment; monadic form requires an extra pass to eliminate lambdas and make closures explicit (closure conversion, lambda lifting). Unlike monadic form, variables are mutable. The Return constructor is function return, not monadic return.

Let's Revisit Optimizations: Folding: t:=3+4 becomes t:=7. Constant propagation: t:=7; B; u:=t+3 becomes t:=7; B; u:=7+3. Problem! B might assign a fresh value to t. Copy propagation: t:=u; B; v:=t+3 becomes t:=u; B; v:=u+3. Problem! B might assign a fresh value to t or u.
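The folding and propagation rewrites above can be sketched concretely. A toy Python pass over a straight-line block, with an illustrative statement encoding (not the lecture's IR), that threads an environment of variables known to be constant and drops a fact as soon as its variable is reassigned:

```python
# Toy constant folding + propagation, illustrating the "intervening
# update" problem: the fact "t is 7" survives only until t is reassigned.
# A statement is (dest, expr); an expr is an int, a variable name, or a
# ("+", a, b) tuple. This encoding is ours, purely for illustration.

def eval_expr(e, env):
    if isinstance(e, int):
        return e
    if isinstance(e, str):
        return env.get(e, e)               # propagate a known constant
    op, a, b = e
    a, b = eval_expr(a, env), eval_expr(b, env)
    if isinstance(a, int) and isinstance(b, int) and op == "+":
        return a + b                       # fold: 3 + 4 => 7
    return (op, a, b)

def optimize(block):
    env, out = {}, []
    for dest, e in block:
        e2 = eval_expr(e, env)
        env.pop(dest, None)                # kill any stale fact about dest
        if isinstance(e2, int):
            env[dest] = e2                 # record the new fact
        out.append((dest, e2))
    return out
```

On t := 3+4; u := t+3 this yields t := 7; u := 10; after an intervening t := n, the fact about t is gone and a later use of t is left alone.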

Let's Revisit Optimizations: Dead code elimination: x:=e; B; jump L becomes B; jump L. Problem! Block L might use x. x:=e1; B1; x:=e2; B2 becomes B1; x:=e2; B2 (when x is not used in B1). Common sub-expression elimination: x:=y+z; B1; w:=y+z; B2 becomes x:=y+z; B1; w:=x; B2. Problem! B1 might change x, y, or z.

Optimization in Imperative Settings: Optimizing a functional representation, we only had to worry about variable capture, which we could avoid by renaming variables so that they were unique; then: let x = p(v1,...,vn) in e == e[x ↦ p(v1,...,vn)]. Optimizing an imperative representation, we have to worry about intervening updates: for the defined variable this is similar to variable capture, but we must also worry about free variables. x := p(v1,...,vn); B == B[x ↦ p(v1,...,vn)] only when B doesn't modify x or modify any of the vi! On the other hand, the graph representation makes it possible to be more precise about the scope of a variable.

Dataflow Analysis: To handle intervening updates, we will compute analysis facts for each program point. There is a program point immediately before and after each instruction. Analysis facts are facts about variables, expressions, etc.; which facts we are interested in will depend on the particular optimization or analysis we are concerned with. Given that some facts D hold at the program point before instruction S, after S executes some facts D′ will hold. How S transforms D into D′ is called the transfer function for S. This kind of analysis is called dataflow analysis, because given a control-flow graph, we are computing facts about data/variables and propagating these facts over the control-flow graph.

Available Expressions: An expression e is available at program point p if on all paths from the entry to p, expression e is computed at least once, and there are no intervening assignments to the free variables of e. If e is available at p, we do not need to re-compute e (i.e., the basis for common sub-expression elimination). How do we compute the available expressions at each program point?

Available Expressions Example: [Annotated CFG for the program: entry; x := a + b; y := a * b; a loop testing y > a whose body is a := a + 1; x := a + b; exit on the false edge. The fact {a+b} holds after x := a + b and {a+b, a*b} after y := a * b; inside the loop, a := a + 1 kills both expressions and the following x := a + b regenerates a+b, so the meet at the loop test leaves {a+b} available there and {a+b} flows to the exit. The numbers 1-12 on the slide indicate the order in which the facts are computed in this example.]

More Formally: Suppose D is the set of expressions available at program point p, and p is immediately before x := e1; B. Then the statement x := e1 generates the available expression e1, and kills any available expression e2 in D such that x is in variables(e2). So the available expressions for B are:

(D ∪ {e1}) − { e2 | x ∈ variables(e2) }

Gen and Kill Sets: We can describe this analysis by the set of available expressions that each statement generates and kills!

Stmt                               Gen         Kill
x := v                             {v}         {e | x in e}
x := v1 op v2                      {v1 op v2}  {e | x in e}
x := *(v+i)                        {}          {e | x in e}
*(v+i) := x                        {}          {}
jump L                             {}          {}
return v                           {}          {}
if v1 op v2 goto L1 else goto L2   {}          {}
x := v(v1,...,vn)                  {}          {e | x in e}

Transfer function for statement S: λD. (D ∪ Gen(S)) − Kill(S)
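The transfer function λD. (D ∪ Gen(S)) − Kill(S) can be written out directly. A minimal Python version covering a few statement kinds from the table, with an illustrative tuple encoding (not the lecture's IR); expressions are (v1, op, v2) triples:

```python
# Transfer function for available expressions, following the gen/kill
# table: out = (D union Gen) minus Kill. The statement encoding is ours.

def vars_of(e):
    v1, _op, v2 = e
    return {v for v in (v1, v2) if isinstance(v, str)}

def transfer(stmt, d):
    kind = stmt[0]
    if kind == "binop":                       # x := v1 op v2
        _, x, v1, op, v2 = stmt
        gen = {(v1, op, v2)}                  # gen the computed expression
        return {e for e in d | gen if x not in vars_of(e)}
    if kind in ("load", "call"):              # x := *(v+i)  or  x := f(...)
        x = stmt[1]                           # gen {}, kill exprs mentioning x
        return {e for e in d if x not in vars_of(e)}
    return set(d)                             # store/jump/return: no effect
```

Applying it to x := a + b makes a+b available; a following a := a + 1 then kills it, along with its own generated a+1 (since a occurs in it).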

Available Expressions Example: What is the effect of each statement on the facts? [Alongside the example CFG:]

Stmt         Gen     Kill
x := a + b   {a+b}   {}
y := a * b   {a*b}   {}
y > a        {}      {}
a := a + 1   {a+1}   {a+1, a+b, a*b}

Aliasing: We don't track expressions involving memory (loads & stores). We can tell whether variable names are equal, but we cannot (in general) tell whether two variables will have the same value. If we tracked *x as an available expression and then saw *y := e, we wouldn't know whether to kill *x, since we don't know whether x's value will be the same as y's value.

Function Calls: Because a function call may access memory and may have side effects, we can't consider calls to be available expressions.

Flowing Through the Graph: How do we propagate available-expression facts over the control-flow graph? Given the available expressions Din[L] that flow into the block labeled L, compute the Dout[L] that flow out: the composition of the transfer functions of the statements in L's block. For each block L, we can define: succ[L] = the blocks L might jump to; pred[L] = the blocks that might jump to L. We can then flow Dout[L] to all of the blocks in succ[L]; they'll compute new Dout's and flow them to their successors, and so on. How should we combine facts from predecessors? E.g., if pred[L] = {L1, L2, L3}, how do we combine Dout[L1], Dout[L2], Dout[L3] to get Din[L]? Union or intersection?
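The pred[] map can be computed by inverting succ[], which in turn is read off each block's terminator. A minimal sketch in Python, with an illustrative block encoding (a block is a label paired with its terminator; not the lecture's IR):

```python
# succ/pred over a CFG. A terminator is ("jump", l), ("branch", l1, l2),
# or ("ret",); its successor labels are exactly the labels it names.

def successors(term):
    return list(term[1:])                  # ("ret",) has no successors

def preds(blocks):
    pred = {label: [] for label, _ in blocks}
    for label, term in blocks:
        for s in successors(term):
            pred[s].append(label)          # invert each succ edge
    return pred

cfg = [("entry", ("jump", "test")), ("test", ("branch", "body", "exit")),
       ("body", ("jump", "test")), ("exit", ("ret",))]
```

On `cfg`, the loop header "test" has predecessors ["entry", "body"], and "exit" has predecessor ["test"].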

Algorithm Sketch:

initialize Din[L] to the empty set
initialize Dout[L] to the available expressions that flow out of
  block L, assuming Din[L] is the set flowing in
loop until no change {
  for each L:
    In := ∩ { Dout[L'] | L' in pred[L] }
    if In ≠ Din[L] then {
      Din[L] := In
      Dout[L] := flow Din[L] through L's block
    }
}
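The sketch can be made concrete with intersection as the meet over predecessors (available expressions is a must-analysis: a fact must hold on all paths). Here flow(L, D) stands for the composed transfer function of block L's statements; all names are illustrative:

```python
# Iterative dataflow analysis for available expressions.
# labels: block labels; pred: label -> list of predecessor labels;
# flow(label, facts): the block's composed transfer function.

def analyze(labels, pred, flow):
    d_in = {l: set() for l in labels}
    d_out = {l: flow(l, d_in[l]) for l in labels}
    changed = True
    while changed:
        changed = False
        for l in labels:
            ps = pred[l]
            new_in = (set.intersection(*(d_out[p] for p in ps))
                      if ps else set())      # meet over predecessors
            if new_in != d_in[l]:
                d_in[l] = new_in
                d_out[l] = flow(l, new_in)   # re-flow through the block
                changed = True
    return d_in, d_out
```

One caveat worth noting: for best precision on loops, a must-analysis is usually initialized with the full universe of expressions rather than with empty sets; the empty initialization here follows the sketch above and is still safe, just possibly less precise.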

Termination and Speed: We know the available-expressions dataflow analysis will terminate! Each time through the loop, each Din[L] and Dout[L] either stays the same or increases, and if they all stay the same, we stop. There are a finite number of assignments in the program and finitely many blocks, so there is only a finite number of times we can increase Din[L] and Dout[L]. In general, if the set of facts forms a lattice and the transfer functions are monotonic, then termination is guaranteed. There are a number of tricks used to speed up the analysis: keep a work queue that holds only those blocks whose input has changed; pre-compute the transfer function for a whole block (i.e., the composition of the transfer functions of its statements).