Topic I (d): Static Single Assignment Form (SSA)


Topic I (d): Static Single Assignment Form (SSA) 621-10F/Topic-1d-SSA 1

Reading List
- Slides: Topic Ix
- Other readings as assigned in class

ABET Outcome
- Ability to apply knowledge of the SSA technique in compiler optimization.
- Ability to formulate and solve the basic SSA construction problem based on the techniques introduced in class.
- Ability to analyze the basic algorithms that use SSA form to express and formulate dataflow analysis problems.
- Knowledge of contemporary issues on this topic.

Roadmap
- Motivation
- Introduction: SSA form
- Construction Method
- Application of SSA to Dataflow Analysis Problems
- PRE (Partial Redundancy Elimination) and SSAPRE
- Summary

Prelude
SSA: A program is said to be in SSA form iff
- each variable is statically defined exactly once, and
- each use of a variable is dominated by that variable's definition.
So, is straight-line code in SSA form?

Example
[Figure: CFG containing the definitions X1, X2, X3 ← φ(X1, X2), and X4]
- In general, how do we transform an arbitrary program into SSA form?
- Does the definition of X2 dominate its use in the example?

SSA: Motivation
- Provide a uniform basis of an IR to solve a wide range of classical dataflow problems.
- Encode both dataflow and control-flow information.
- An SSA form can be constructed and maintained efficiently.
- Many SSA dataflow analysis algorithms are more efficient (have lower complexity) than their CFG counterparts.

Algorithm Complexity
Assume a 1 GHz machine and an algorithm that takes f(n) steps (1 step = 1 nanosecond).

f(n)      n = 8     n = 16      n = 32        n = 128        n = 1024
lg n      3 ns      4 ns        5 ns          7 ns           10 ns
sqrt(n)   2.8 ns    4 ns        6 ns          11 ns          32 ns
n         8 ns      16 ns       32 ns         128 ns         1 μs
n lg n    24 ns     64 ns       160 ns        896 ns         10 μs
n^2       64 ns     256 ns      1.0 μs        16 μs          1 ms
n^3       512 ns    4 μs        32.8 μs       2 ms           1.1 sec
2^n       256 ns    66 μs       4 sec         10^22 years
n!        40 μs     5.8 hours   10^19 years
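The entries of the table can be recomputed with a few lines of Python. This is only a sketch for checking the arithmetic; the names `GROWTH` and `running_time_ns` are illustrative and not part of the slides.

```python
import math

# Number of steps for each growth function; at 1 step = 1 ns the
# result is directly the running time in nanoseconds.
GROWTH = {
    "lg n": lambda n: math.log2(n),
    "sqrt(n)": lambda n: math.sqrt(n),
    "n": lambda n: n,
    "n lg n": lambda n: n * math.log2(n),
    "n^2": lambda n: n ** 2,
    "n^3": lambda n: n ** 3,
    "2^n": lambda n: 2 ** n,
    "n!": lambda n: math.factorial(n),
}

def running_time_ns(name, n):
    """Running time in nanoseconds of an f(n)-step algorithm at 1 GHz."""
    return GROWTH[name](n)

NS_PER_HOUR = 3600 * 10 ** 9
NS_PER_YEAR = 365 * 24 * NS_PER_HOUR

hours_16_factorial = running_time_ns("n!", 16) / NS_PER_HOUR
years_2_pow_128 = running_time_ns("2^n", 128) / NS_PER_YEAR
```

With these definitions, `hours_16_factorial` comes out at roughly 5.8 and `years_2_pow_128` at roughly 10^22, in line with the table.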

Where Is SSA Used in Modern Compilers?
[Figure: compiler pipeline, with the middle end operating on a good IR]
- Front end
- Middle-End:
  - Interprocedural Analysis and Optimization
  - Loop Nest Optimization and Parallelization
  - Global (Scalar) Optimization
- Backend: Code Generation

KCC Compiler Infrastructure
[Figure: compiler structure organized as front end, middle end, back end, and machine model]
- Front ends: Fortran (fe90), C (gfec), C++ (gfecc)
- IR levels: Very High WHIRL, High WHIRL, Middle WHIRL, Low WHIRL, Very Low WHIRL, CGIR, Asm file
- Phases:
  - Source to IR (Scanner, Parser, RTL, WHIRL)
  - VHO (Very High WHIRL Optimizer), Standalone Inliner, W2C/W2F
  - IPA (inter-procedural analysis & opt)
  - LNO (loop unrolling/fission/fusion/tiling/peeling etc.)
  - PREOPT (point-to analysis etc.)
  - WOPT: SSAPRE (Partial Redundancy Elimination), VNFRE (value-numbering-based full redundancy elimination), RVI-1 (Register Variable Identification)
  - RVI-2
  - Some peephole opt
  - Cflow (control flow opt), EBO (extended block opt.), PQS (predicate query system)
  - Loop Opt (Unrolling + SWP)
  - GCM (global code motion), HB sched (hyperblock schedule)
  - GRA/LRA (global/local register alloc)


Static Single-Assignment Form
Each variable has only one definition in the program text.
This single static definition can be in a loop and may be executed many times. Thus, even in a program expressed in SSA, a variable can be dynamically defined many times.
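The difference between static and dynamic definitions can be made concrete with a small sketch. The instruction encoding (pairs of target name and expression text) and the helper names `multiply_defined` and `run_loop` are assumptions made for illustration; the check covers only the "one static definition" half of the SSA property, not dominance.

```python
from collections import Counter

def multiply_defined(instructions):
    """Return the variables that have more than one static definition.

    instructions: list of (target, expression_text) pairs.
    """
    counts = Counter(target for target, _ in instructions)
    return sorted(var for var, n in counts.items() if n > 1)

# 'a' is assigned twice in the program text, so this is not SSA.
non_ssa = [("a", "0"), ("b", "a + 1"), ("a", "b * 2")]

# Every name is assigned exactly once in the program text.
ssa_loop = [("a1", "0"), ("a3", "phi(a1, a2)"), ("b2", "a3 + 1"), ("a2", "b2 * 2")]

def run_loop(n):
    """Execute the SSA loop above and count how often 'a2 = b2 * 2' runs."""
    a1 = 0
    a3 = a1                 # phi selects a1 on entry to the loop
    executions = 0
    while True:
        b2 = a3 + 1
        a2 = b2 * 2         # the single static definition of a2
        executions += 1
        if not a2 < n:
            return a2, executions
        a3 = a2             # phi selects a2 on the back edge
```

Here `multiply_defined(ssa_loop)` is empty, yet `run_loop(10)` executes the one definition of `a2` three times: single assignment is a property of the program text, not of the execution.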

Advantages of SSA
- Simpler dataflow analysis.
- No need for use-def/def-use chains, which require N × M space for N uses and M definitions.
- SSA form relates in a useful way to dominance structures.

SSA Form: An Example
SSA-form
- Each name is defined exactly once
- Each use refers to exactly one name
What's hard
- Straight-line code is trivial
- Splits in the CFG are trivial
- Joins in the CFG are hard
Building SSA Form
- Insert φ-functions at birth points
- Rename all values for uniqueness
[Figure: CFG with the definitions x ← 17 - 4, x ← a + b, x ← y - z, x ← 13 and the uses z ← x * q, s ← w - x]
[Courtesy: slides 10-14 are from the book from Prof. K. Cooper's website]

Birth Points (another notion due to Tarjan)
Consider the flow of values in this example:
[Figure: CFG with x ← 17 - 4, x ← a + b, x ← y - z, x ← 13, z ← x * q, s ← w - x]
- The value x appears everywhere.
- It takes on several values: at one point x can be 13, y - z, or 17 - 4; at another it can also be a + b.
- If each value has its own name, we need a way to merge these distinct values.
- Values are born at merge points.

Birth Points (another notion due to Tarjan)
Consider the flow of values in this example:
[Figure: CFG with x ← 17 - 4, x ← a + b, x ← y - z, x ← 13, z ← x * q, s ← w - x, annotated at its join points]
- New value for x here: 17 - 4 or y - z
- New value for x here: 13 or (17 - 4 or y - z)
- New value for x here: a + b or (13 or (17 - 4 or y - z))

Birth Points (another notion due to Tarjan)
Consider the value flow below:
[Figure: CFG with x ← 17 - 4, x ← a + b, x ← y - z, x ← 13, z ← x * q, s ← w - x; the marked join points are all birth points for values]
- All birth points are join points
- Not all join points are birth points
- Birth points are value-specific

Review
SSA-form
- Each name is defined exactly once
- Each use refers to exactly one name
What's hard
- Straight-line code is trivial
- Splits in the CFG are trivial
- Joins in the CFG are hard
Building SSA Form
- Insert φ-functions at birth points
- Rename all values for uniqueness
A φ-function is a special kind of copy that selects one of its parameters. The choice of parameter is governed by the CFG edge along which control reached the current block.
[Figure: y1 ← ... and y2 ← ... in two predecessor blocks, merged by y3 ← φ(y1, y2)]
Real machines do not implement a φ-function directly in hardware (not yet!).
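The selection rule for a φ-function can be sketched as a lookup keyed by the incoming CFG edge. The block names `P1` and `P2`, the environment dictionary, and the function `eval_phi` are hypothetical names chosen for this illustration.

```python
def eval_phi(args_by_pred, came_from, env):
    """Evaluate a phi-function.

    args_by_pred: predecessor block -> name of the phi argument for that edge
    came_from:    the predecessor block through which control arrived
    env:          current values of the SSA names
    """
    return env[args_by_pred[came_from]]

# y3 <- phi(y1, y2), where y1 flows in from block P1 and y2 from block P2.
phi_y3 = {"P1": "y1", "P2": "y2"}
env = {"y1": 10, "y2": 20}

y3_via_p1 = eval_phi(phi_y3, "P1", env)
y3_via_p2 = eval_phi(phi_y3, "P2", env)
```

The φ itself carries no condition: the value depends only on which edge was taken, which is why it behaves like a copy placed on each incoming edge.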

Use-def Dependencies in Non-straight-line Code
[Figure: three definitions a = ... above a join point and three uses of a below it]
- Many uses to many defs
- Overhead in representation
- Hard to manage

Factoring Operator φ
Factoring: when multiple edges cross a join point, create a common node that all edges must pass through.
- Number of edges reduced from 9 to 6
- A φ is regarded as a def (its parameters are uses)
- Many uses to 1 def
- Each def dominates all its uses
[Figure: three definitions a = ... feed a = φ(a, a, a) at the join point, followed by three uses of a]

Rename to represent use-def edges
- No longer necessary to represent the use-def edges explicitly
[Figure: a1 = ..., a2 = ..., a3 = ... feed a4 = φ(a1, a2, a3), followed by three uses of a4]

SSA Form in Control-Flow Path Merges
Is this code in SSA form? No: two definitions of a (in B1 and B3) reach B4.
  B1: b ← M[x]
      a ← 0
  B2: if b < 4
  B3: a ← b
  B4: c ← a + b
How can we transform this code into SSA form? We can create two versions of a, one for B1 and another for B3.

SSA Form in Control-Flow Path Merges
But which version should we use in B4 now?
  B1: b ← M[x]
      a1 ← 0
  B2: if b < 4
  B3: a2 ← b
  B4: c ← a? + b
We define a fictional function that knows which control path was taken to reach the basic block B4:
  φ(a1, a2) = a1 if we arrive at B4 from B2
              a2 if we arrive at B4 from B3

SSA Form in Control-Flow Path Merges
But which version should we use in B4 now?
We define a fictional function that knows which control path was taken to reach the basic block B4:
  φ(a2, a1) = a1 if we arrive at B4 from B2
              a2 if we arrive at B4 from B3
  B1: b ← M[x]
      a1 ← 0
  B2: if b < 4
  B3: a2 ← b
  B4: a3 ← φ(a2, a1)
      c ← a3 + b

A Loop Example
Original code:
  a ← 0
  loop body:
    b ← a + 1
    c ← c + b
    a ← b * 2
    if a < N
  return
SSA form:
  a1 ← 0
  loop body:
    a3 ← φ(a1, a2)
    b1 ← φ(b0, b2)
    c2 ← φ(c0, c1)
    b2 ← a3 + 1
    c1 ← c2 + b2
    a2 ← b2 * 2
    if a2 < N
  return
φ(b0, b2) is not necessary because b0 is never used. But the phase that generates φ functions does not know it. Unnecessary φ functions are eliminated by dead code elimination.
Note: only a and c are used in the loop body before they are redefined. b is redefined right at the beginning!

The φ function
How can we implement a φ function that knows which control path was taken?
- Answer 1: We don't! The φ function is used only to connect uses to definitions during optimization, but is never implemented.
- Answer 2: If we must execute the φ function, we can implement it by inserting MOVE instructions in all control paths.
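Answer 2 can be sketched as a pass that appends one copy per φ argument to the matching predecessor block. This is a deliberately naive illustration: the function `lower_phis`, the string encoding of instructions, and the assumption that branch instructions are stored separately from block bodies are all choices made here, and the sketch ignores critical edges and the ordering problems between several copies that a real compiler has to handle.

```python
def lower_phis(blocks, phis):
    """Replace phi-functions by copies at the end of each predecessor.

    blocks: block name -> list of straight-line instructions (strings);
            branch instructions are assumed to be kept elsewhere.
    phis:   block name -> list of (dest, {predecessor: source}) pairs.
    Returns a new mapping; the input is left unchanged.
    """
    lowered = {name: list(body) for name, body in blocks.items()}
    for phi_list in phis.values():
        for dest, sources in phi_list:
            for pred, src in sources.items():
                # MOVE executed when control leaves 'pred' towards the join.
                lowered[pred].append(f"{dest} = {src}")
    return lowered

# The B1..B4 example: a3 <- phi(a2, a1), with a1 arriving from B2, a2 from B3.
blocks = {
    "B1": ["b = M[x]", "a1 = 0"],
    "B2": [],
    "B3": ["a2 = b"],
    "B4": ["c = a3 + b"],
}
phis = {"B4": [("a3", {"B2": "a1", "B3": "a2"})]}
result = lower_phis(blocks, phis)
```

In this example the copy placed in B2 also runs on the path through B3, where it is overwritten by `a3 = a2`, so the result is still correct; in general the edge from B2 to B4 would be split to give the copy a block of its own.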


Criteria for Inserting φ Functions
We could insert one φ function for each variable at every join point (a point in the CFG with more than one predecessor). But that would be wasteful.
What should be our criteria to insert a φ function for a variable a at node z of the CFG?
Intuitively, we should add a φ function if there are two definitions of a that can reach the point z through distinct paths.

A naïve method
- Simply introduce a φ-function at each join point in the CFG.
- But, as already pointed out, this is inefficient: too many useless φ-functions may be introduced!
- What is a good algorithm to introduce only the right number of φ-functions?

Path Convergence Criterion
Insert a φ function for a variable a at a node z if all the following conditions are true:
1. There is a block x that defines a
2. There is a block y ≠ x that defines a
3. There is a non-empty path Pxz from x to z
4. There is a non-empty path Pyz from y to z
5. Paths Pxz and Pyz don't have any nodes in common other than z
6. ?
The start node contains an implicit definition of every variable.

Iterated Path-Convergence Criterion
The φ function itself is a definition of a. Therefore the path-convergence criterion is a set of equations that must be satisfied.
  while there are nodes x, y, z satisfying conditions 1-5
        and z does not contain a φ function for a
  do insert a ← φ(a, a, ..., a) at node z
This algorithm is extremely costly, because it requires the examination of every triple of nodes x, y, z and every path leading from x and from y. Can we do better? (A topic for more discussion.)

Concept of Dominance Frontiers: An Intuitive View
[Figure: the blocks bb1 ... bbn dominated by bb1, a node X, and the border between dominated and non-dominated blocks (the dominance frontier)]

Dominance Frontier
The dominance frontier DF(x) of a node x is the set of all nodes z such that x dominates a predecessor of z, without strictly dominating z.
Recall: if x dominates y and x ≠ y, then x strictly dominates y.
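The definition translates almost literally into code. The sketch below computes dominator sets with a simple iterative fixed point and then applies the definition directly; it is not the efficient algorithm used in production compilers. The function names and the four-block CFG (with a back edge from block 4 to block 2) are assumptions for illustration, and every node other than the entry is assumed to be reachable and to have at least one predecessor.

```python
def dominators(succ, entry):
    """Iteratively compute dom[n], the set of nodes that dominate n."""
    nodes = set(succ)
    preds = {n: set() for n in nodes}
    for n, targets in succ.items():
        for t in targets:
            preds[t].add(n)
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            # n is dominated by itself and by whatever dominates all its preds.
            new = set.intersection(*(dom[p] for p in preds[n])) | {n}
            if new != dom[n]:
                dom[n], changed = new, True
    return dom, preds

def dominance_frontier(succ, entry):
    """DF(x) = { z : x dominates a predecessor of z, but not strictly z }."""
    dom, preds = dominators(succ, entry)
    df = {n: set() for n in succ}
    for z in succ:
        for p in preds[z]:
            for x in dom[p]:                  # x dominates a predecessor of z
                if x == z or x not in dom[z]: # x does not strictly dominate z
                    df[x].add(z)
    return df

# Hypothetical CFG: 1 -> 2, 2 -> 3, 2 -> 4, 3 -> 4, 4 -> 2 (back edge).
cfg = {1: [2], 2: [3, 4], 3: [4], 4: [2]}
df = dominance_frontier(cfg, 1)
```

Note that block 2 ends up in its own dominance frontier because of the back edge: it dominates its predecessor 4 but does not strictly dominate itself.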

Calculate the Dominance Frontier: An Intuitive Way
How do we determine the dominance frontier of node 5?
1. Determine the dominance region of node 5: {5, 6, 7, 8}
2. Determine the targets of edges crossing from the dominance region of node 5. These targets are the dominance frontier of node 5: DF(5) = {4, 5, 12, 13}
[Figure: CFG with nodes 1 to 13]
NOTE: node 5 is in DF(5) in this case. Why?

Are we done? Not yet! See a simple example.

Putting a program into SSA form
- φ needed only at dominance frontiers of defs (where a def stops dominating)
- Dominance frontiers pre-computed based on the control flow graph
- Two phases:
  1. Insert φ's at dominance frontiers of each def (recursive)
  2. Rename the uses to their defs' names
     - Maintain and update a stack of variable versions in a pre-order traversal of the dominator tree

Example Phase 1: φ Insertion
Steps:
- def at BB 3 → φ at BB 4
- φ def at BB 4 → φ at BB 2
[Figure: CFG with BB 1: a = ...; BB 2: a = φ(a, a); BB 3: a = ...; BB 4: a = φ(a, a)]
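The insertion phase can be sketched as a worklist over the blocks that define the variable, given precomputed dominance frontiers. The function name `place_phis` and the frontier table below are assumptions: the table is what one would obtain for a four-block CFG with edges 1→2, 2→3, 2→4, 3→4, and 4→2, which is consistent with the steps of the example.

```python
def place_phis(df, def_blocks):
    """Return the blocks that need a phi-function for one variable.

    df:         block -> dominance frontier of that block (a set)
    def_blocks: blocks containing an ordinary definition of the variable
    """
    has_phi = set()
    seen = set(def_blocks)
    work = list(def_blocks)
    while work:
        x = work.pop()
        for z in df[x]:
            if z not in has_phi:
                has_phi.add(z)
                # The new phi is itself a definition, so z must be
                # processed too (this is the "recursive" part).
                if z not in seen:
                    seen.add(z)
                    work.append(z)
    return has_phi

# Assumed dominance frontiers for the four-block example.
df = {1: set(), 2: {2}, 3: {4}, 4: {2}}
phi_blocks = place_phis(df, {1, 3})
```

The definition in block 3 forces a φ in block 4, and that φ in turn forces one in block 2, matching the two steps listed in the example.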

Example Phase 2: Rename
[Figure: CFG and dominator tree over nodes 1, 2, 3, 4]
- BB 1: a1 = ...
- BB 2: a = φ(a, a1)
- BB 3: a = ...
- BB 4: a = φ(a, a)
- Stack for a: a1

Example Phase 2: Rename
[Figure: CFG and dominator tree over nodes 1, 2, 3, 4]
- BB 1: a1 = ...
- BB 2: a2 = φ(a, a1)
- BB 3: a = ...
- BB 4: a = φ(a2, a)
- Stack for a (top first): a2, a1

Example Phase 2: Rename
[Figure: CFG and dominator tree over nodes 1, 2, 3, 4]
- BB 1: a1 = ...
- BB 2: a2 = φ(a, a1)
- BB 3: a3 = ...
- BB 4: a = φ(a2, a3)
- Stack for a (top first): a3, a2, a1

Example Phase 2: Rename
[Figure: CFG and dominator tree over nodes 1, 2, 3, 4]
- BB 1: a1 = ...
- BB 2: a2 = φ(a4, a1)
- BB 3: a3 = ...
- BB 4: a4 = φ(a2, a3)
- Stack for a (top first): a4, a2, a1
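The renaming walk can be sketched as a recursive pre-order traversal of the dominator tree with one stack of versions per variable. The instruction encoding (dictionaries with a `var` field, plus `uses` for ordinary statements or `phi_args` for φ's), the function name `rename`, and the CFG and dominator tree used below (1→2, 2→3, 2→4, 3→4, 4→2, with tree 1 → 2 → {3, 4}) are assumptions made for this illustration.

```python
from collections import defaultdict

def rename(blocks, succ, dom_children, entry):
    """Rename definitions and uses so that every name has one definition."""
    counter = defaultdict(int)   # last version number handed out per variable
    stacks = defaultdict(list)   # current SSA name per variable (top of stack)

    def visit(b):
        pushed = []
        for ins in blocks[b]:
            if "phi_args" not in ins:
                # Ordinary uses refer to the version on top of the stack.
                ins["uses"] = [stacks[v][-1] for v in ins["uses"]]
            var = ins["var"]
            counter[var] += 1
            ins["dest"] = f"{var}{counter[var]}"
            stacks[var].append(ins["dest"])
            pushed.append(var)
        # Fill in the phi argument that corresponds to the edge b -> s.
        for s in succ[b]:
            for ins in blocks[s]:
                if "phi_args" in ins and stacks[ins["var"]]:
                    ins["phi_args"][b] = stacks[ins["var"]][-1]
        for child in dom_children[b]:
            visit(child)
        for var in pushed:       # leave the stacks as we found them
            stacks[var].pop()

    visit(entry)
    return blocks

blocks = {
    1: [{"var": "a", "uses": []}],
    2: [{"var": "a", "phi_args": {}}],
    3: [{"var": "a", "uses": []}],
    4: [{"var": "a", "phi_args": {}}],
}
succ = {1: [2], 2: [3, 4], 3: [4], 4: [2]}
dom_children = {1: [2], 2: [3, 4], 3: [], 4: []}
rename(blocks, succ, dom_children, 1)
```

On this input the walk reproduces the final state of the example: block 2 holds a2 = φ(a4, a1) and block 4 holds a4 = φ(a2, a3).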


Simple Constant Propagation in SSA
- If there is a statement v ← c, where c is a constant, then all uses of v can be replaced by c.
- A φ function of the form v ← φ(c1, c2, ..., cn) where all ci's are identical can be replaced by v ← c.
- Using a worklist algorithm on a program in SSA form, we can perform constant propagation in linear time.
In the next slide we assume that x, y, z are variables and a, b, c are constants.
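The two rules can be sketched as a worklist over SSA names that are known to be constant. The encoding is an assumption made here: each SSA name maps to a `(kind, args)` pair, variable operands are strings, and constants are non-string values. The function name `simple_constant_propagation` is likewise illustrative.

```python
from collections import defaultdict, deque

def simple_constant_propagation(defs):
    """Apply the two rules above until nothing changes.

    defs: SSA name -> (kind, args); kind is 'const', 'phi', or any other
          operator name. Variable operands are strings, constants are not.
    """
    uses = defaultdict(set)              # variable -> names whose defs use it
    for name, (_, args) in defs.items():
        for a in args:
            if isinstance(a, str):
                uses[a].add(name)
    const = {n: args[0] for n, (kind, args) in defs.items() if kind == "const"}
    work = deque(const)
    while work:
        v = work.popleft()
        for user in uses[v]:
            kind, args = defs[user]
            # Rule 1: replace every use of v by its constant value.
            args = [const[v] if a == v else a for a in args]
            defs[user] = (kind, args)
            # Rule 2: a phi whose arguments are one and the same constant
            # becomes a constant definition itself.
            all_const = not any(isinstance(a, str) for a in args)
            if kind == "phi" and all_const and len(set(args)) == 1:
                defs[user] = ("const", [args[0]])
                const[user] = args[0]
                work.append(user)
    return defs

program = {
    "x1": ("const", [3]),
    "x2": ("const", [3]),
    "x3": ("phi", ["x1", "x2"]),
    "z1": ("input", []),
    "y1": ("add", ["x3", "z1"]),
}
simple_constant_propagation(program)
```

Each name enters the worklist at most once, and each visit only touches the definitions that use it, which is where the near-linear running time comes from.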

Linear Time Optimizations in SSA form
- Copy propagation: The statement x ← φ(y) or the statement x ← y can be deleted, and y can be substituted for every use of x.
- Constant folding: If we have the statement x ← a ⊕ b, we can evaluate c ← a ⊕ b at compile time and replace the statement by x ← c.
- Constant conditions: The conditional "if a < b goto L1 else L2" can be replaced by "goto L1" or "goto L2", according to the compile-time evaluation of a < b; the CFG and use lists are adjusted accordingly.
- Unreachable code: eliminate unreachable blocks.

Dead-Code Elimination in SSA Form
Because there is only one definition for each variable, if the list of uses of the variable is empty, the definition is dead.
When a statement v ← x ⊕ y is eliminated because v is dead, this statement should be removed from the list of uses of x and y, which might cause those definitions to become dead.
Thus we need to iterate the dead code elimination algorithm.
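The iteration can be sketched with use counts and a worklist. The encoding (each SSA name mapped to the list of names its definition uses), the `required` set for definitions that must be kept regardless of use counts, and the function name `eliminate_dead_code` are assumptions made for this illustration; every operand is assumed to have a definition in the table.

```python
def eliminate_dead_code(defs, required=frozenset()):
    """Remove definitions whose results are never used.

    defs:     SSA name -> list of SSA names used by its definition
    required: names whose definitions must stay (e.g. side effects)
    Returns the surviving definitions as a new dictionary.
    """
    use_count = {v: 0 for v in defs}
    for operands in defs.values():
        for x in operands:
            use_count[x] += 1
    live = dict(defs)
    work = [v for v in defs if use_count[v] == 0 and v not in required]
    while work:
        v = work.pop()
        # Deleting v removes one use of each of its operands,
        # which may make their definitions dead in turn.
        for x in live.pop(v):
            use_count[x] -= 1
            if use_count[x] == 0 and x not in required:
                work.append(x)
    return live

program = {"a1": [], "b1": ["a1"], "c1": ["b1"], "d1": ["a1"]}
remaining = eliminate_dead_code(program, required={"d1"})
```

Removing `c1` makes `b1` dead, while `a1` survives because the required definition `d1` still uses it. Use counts alone never reach zero for values that only feed each other around a loop, so this simple form cannot remove such cycles.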

A Case Study: Dead Store Elimination
Steps:
1. Assume all defs are dead and all statements not required.
2. Mark the following statements required:
   a. Function return values
   b. Statements with side effects
   c. Defs of global variables
3. Variables in required statements are live.
4. Propagate liveness backwards iteratively through:
   a. use-def edges: when a variable is live, its def statement is made live
   b. control dependences

Control Dependence
- Statements in branched-to blocks depend on the conditional branch.
- Equivalent to the postdominance frontier (the dominance frontier of the inverted control flow graph).
[Figure: "if (i < n)" branching to a block containing "x = ..."]

Example of dead store elimination
Code:
  i1 = ...
  s1 = ...
  loop:
    i3 = φ(i2, i1)
    s3 = φ(s2, s1)
    i2 = i3 + 1
    s2 = s3 * s3
    if (i3 < 10)
  return s2
Propagation steps:
1. return s2 → s2
2. s2 → s2 = s3 * s3
3. s3 → s3 = φ(s2, s1)
4. s1 → s1 = ...
5. return s2 → if (i2 < 10) [control dependence]
6. i2 → i2 = i3 + 1
7. i3 → i3 = φ(i2, i1)
8. i1 → i1 = ...
Nothing is dead.

Example of dead store elimination
Code (no return statement):
  i1 = ...
  s1 = ...
  loop:
    i3 = φ(i2, i1)
    s3 = φ(s2, s1)
    i2 = i3 + 1
    s2 = s3 * s3
    if (i3 < 10)
All statements not required; whole loop deleted. Result: empty.
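Both outcomes can be reproduced with a small mark phase that starts from the required statements and walks use-def edges and control dependences backwards. The encoding is an assumption made for illustration: statements are named after the value they define, with `ret` and `br` as made-up names for the return and the loop branch; the branch is taken to test i2 and the return to be control dependent on it, following the propagation steps listed for the first example.

```python
def required_statements(defs, roots, control_deps):
    """Mark every statement that a required statement depends on.

    defs:         statement -> statements whose values it uses
    roots:        statements required from the start (returns, side effects)
    control_deps: statement -> branches it is control dependent on
    """
    live = set()
    work = list(roots)
    while work:
        s = work.pop()
        if s in live:
            continue
        live.add(s)
        work.extend(defs[s])                  # use-def edges
        work.extend(control_deps.get(s, ()))  # control dependences
    return live

defs = {
    "i1": [], "s1": [],
    "i3": ["i2", "i1"],   # i3 = phi(i2, i1)
    "s3": ["s2", "s1"],   # s3 = phi(s2, s1)
    "i2": ["i3"],         # i2 = i3 + 1
    "s2": ["s3", "s3"],   # s2 = s3 * s3
    "br": ["i2"],         # loop branch
    "ret": ["s2"],        # return s2
}
control_deps = {"ret": ["br"]}

with_return = required_statements(defs, ["ret"], control_deps)

# Second example: the same loop, but nothing returns s2.
loop_only = {name: ops for name, ops in defs.items() if name != "ret"}
without_return = required_statements(loop_only, [], {})
```

With the return present every statement is marked, so nothing is dead; without it the set of required statements is empty and the whole loop can be deleted, even though its values still use one another.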

Advantages of SSA-based optimizations
- Dependency information built-in
  - No separate phase required to compute dependency information
- Transformed output preserves SSA form
  - Little overhead to update dependencies
- Efficient algorithms due to:
  - Sparse occurrence of nodes
  - Complexity dependent only on problem size (independent of program size)
  - Linear data flow propagation along use-def edges
- Can customize treatment according to candidate
- Can re-apply algorithms as often as needed
- No separation of local optimizations from global optimizations