Reuse Optimization. LLVM Compiler Infrastructure. Local Value Numbering. Local Value Numbering (cont)

Similar documents
Calvin Lin The University of Texas at Austin

Calvin Lin The University of Texas at Austin

Using Static Single Assignment Form

A Sparse Algorithm for Predicated Global Value Numbering

ECE 5775 (Fall 17) High-Level Digital Design Automation. Static Single Assignment

Induction Variable Identification (cont)

Sparse Conditional Constant Propagation

Sparse Conditional Constant Propagation

CSE P 501 Compilers. SSA Hal Perkins Spring UW CSE P 501 Spring 2018 V-1

Sparse Conditional Constant Propagation

Lecture 3 Local Optimizations, Intro to SSA

An Overview of GCC Architecture (source: wikipedia) Control-Flow Analysis and Loop Detection

Data Structures and Algorithms in Compiler Optimization. Comp314 Lecture Dave Peixotto

Intermediate Representations Part II

What If. Static Single Assignment Form. Phi Functions. What does φ(v i,v j ) Mean?

Live Variable Analysis. Work List Iterative Algorithm Rehashed

Computing Static Single Assignment (SSA) Form. Control Flow Analysis. Overview. Last Time. Constant propagation Dominator relationships

Code Placement, Code Motion

ECE 5775 High-Level Digital Design Automation Fall More CFG Static Single Assignment

Redundant Computation Elimination Optimizations. Redundancy Elimination. Value Numbering CS2210

Static Single Assignment (SSA) Form

6. Intermediate Representation

Advanced Compiler Construction

CS553 Lecture Generalizing Data-flow Analysis 3

Rematerialization. Graph Coloring Register Allocation. Some expressions are especially simple to recompute: Last Time

INTRODUCTION TO LLVM Bo Wang SA 2016 Fall

4/1/15 LLVM AND SSA. Low-Level Virtual Machine (LLVM) LLVM Compiler Infrastructure. LL: A Subset of LLVM. Basic Blocks

Compiler Passes. Optimization. The Role of the Optimizer. Optimizations. The Optimizer (or Middle End) Traditional Three-pass Compiler

Lecture 21 CIS 341: COMPILERS

A Propagation Engine for GCC

Improvements to Linear Scan register allocation

The Development of Static Single Assignment Form

Department of Computer Sciences CS502. Purdue University is an Equal Opportunity/Equal Access institution.

A Translation Framework for Automatic Translation of Annotated LLVM IR into OpenCL Kernel Function

Plan for Today. Concepts. Next Time. Some slides are from Calvin Lin s grad compiler slides. CS553 Lecture 2 Optimizations and LLVM 1

Intermediate Representation (IR)

Languages and Compiler Design II IR Code Optimization

Middle End. Code Improvement (or Optimization) Analyzes IR and rewrites (or transforms) IR Primary goal is to reduce running time of the compiled code

Example. Example. Constant Propagation in SSA

Intermediate representation

A Bad Name. CS 2210: Optimization. Register Allocation. Optimization. Reaching Definitions. Dataflow Analyses 4/10/2013

Register allocation. Register allocation: ffl have value in a register when used. ffl limited resources. ffl changes instruction choices

Introduction to Machine-Independent Optimizations - 6

Introduction to Optimization Local Value Numbering

CS553 Lecture Dynamic Optimizations 2

Compilers and Code Optimization EDOARDO FUSELLA

Introduction to LLVM. UG3 Compiling Techniques Autumn 2018

Constant Propagation with Conditional Branches

Compiler Construction 2010/2011 Loop Optimizations

CS1622. Semantic Analysis. The Compiler So Far. Lecture 15 Semantic Analysis. How to build symbol tables How to use them to find

Instantiation of Template class

SSA Construction. Daniel Grund & Sebastian Hack. CC Winter Term 09/10. Saarland University

LLVM: A Compilation Framework. Transformation

Compiler Construction 2016/2017 Loop Optimizations

Constant Propagation

Midterm II CS164, Spring 2006

Register allocation. instruction selection. machine code. register allocation. errors

Lecture 23 CIS 341: COMPILERS

CS153: Compilers Lecture 23: Static Single Assignment Form

6. Intermediate Representation!

Program Optimizations using Data-Flow Analysis

Register Allocation in Just-in-Time Compilers: 15 Years of Linear Scan

Loop Invariant Code Motion. Background: ud- and du-chains. Upward Exposed Uses. Identifying Loop Invariant Code. Last Time Control flow analysis

CSE au Final Exam Sample Solution

The Static Single Assignment Form:

Why Global Dataflow Analysis?

LLVM code generation and implementation of nested functions for the SimpliC language

Global Register Allocation via Graph Coloring

Principles of Compiler Design

Combining Optimizations: Sparse Conditional Constant Propagation

Lecture 3 Overview of the LLVM Compiler

CSE 501 Midterm Exam: Sketch of Some Plausible Solutions Winter 1997

Register Allocation. Register Allocation. Local Register Allocation. Live range. Register Allocation for Loops

8. Static Single Assignment Form. Marcus Denker

Intermediate Code Generation

Intermediate Representations

Transla'on Out of SSA Form

Introduction. L25: Modern Compiler Design

LLVM & LLVM Bitcode Introduction

Sorting. Sorting in Arrays. SelectionSort. SelectionSort. Binary search works great, but how do we create a sorted array in the first place?

Cpt S 122 Data Structures. Course Review FINAL. Nirmalya Roy School of Electrical Engineering and Computer Science Washington State University

Intermediate Representations

7. Optimization! Prof. O. Nierstrasz! Lecture notes by Marcus Denker!

Homework I - Solution

arxiv: v2 [cs.pl] 16 Sep 2014

Fall Compiler Principles Lecture 12: Register Allocation. Roman Manevich Ben-Gurion University

Apple LLVM GPU Compiler: Embedded Dragons. Charu Chandrasekaran, Apple Marcello Maggioni, Apple

CS 4110 Programming Languages & Logics. Lecture 35 Typed Assembly Language

JOHANNES KEPLER UNIVERSITÄT LINZ

Intermediate Representations

Topic 12: Register Allocation

Linear Scan Register Allocation. Kevin Millikin

Program analysis for determining opportunities for optimization: 2. analysis: dataow Organization 1. What kind of optimizations are useful? lattice al

15-411: LLVM. Jan Hoffmann. Substantial portions courtesy of Deby Katz

Lecture Notes on Static Single Assignment Form

Compiler Optimizations. Chapter 8, Section 8.5 Chapter 9, Section 9.1.7

Week 10. Sorting. 1 Binary heaps. 2 Heapification. 3 Building a heap 4 HEAP-SORT. 5 Priority queues 6 QUICK-SORT. 7 Analysing QUICK-SORT.

CSC D70: Compiler Optimization Register Allocation

Register allocation. CS Compiler Design. Liveness analysis. Register allocation. Liveness analysis and Register allocation. V.

CS 4120 Lecture 31 Interprocedural analysis, fixed-point algorithms 9 November 2011 Lecturer: Andrew Myers

Transcription:

LLVM Compiler Infrastructure Source: LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation by Lattner and Adve Reuse Optimization Eliminate redundant operations in the dynamic execution of instructions Goals - lifelong analysis and optimization - modular compiler components IR is low level, control-flow graph and SSA - language-independent types: 8-26 bit ints, float, double, pointers, arrays, structures, functions Status -Apple is paying main developer (Chris Lattner to continue development -built OpenGL JIT compiler with it in two weeks How do redundancies arise? Loop invariant code (e.g., index calculation for arrays Sequence of similar operations (e.g., method lookup Same value be generated in multiple places in the code Types of reuse optimization Value numbering Common subexpression elimination Partial redundancy elimination CS553 Lecture Value Numbering 1 CS553 Lecture Value Numbering 2 Local Value Numbering Each variable, expression, and constant is assigned a unique number When we encounter a variable, expression or constant, see if it s already been assigned a number If so, use the value for that number If not, assign a new number Same number same value Example a := b + c d := b b := a e := d + c a b #1 #3 a #3 d #1 d + c is #1 + # 2 #3 e #3 Local Value Numbering (cont Temporaries may be necessary a := b + c a := b d := a + c t := b + c a := b d := b + c t b #1 a #3 #1 a + c is #1 + # 2 #3 d #3 b #1 t #3 a #1 a + c is #1 + # 2 #3 d #3 CS553 Lecture Value Numbering 3 CS553 Lecture Value Numbering 4 1

Global Value Numbering How do we handle control flow? w = 5 x = 5 w = 8 x = 8 Global Value Numbering (cont [Alpern, Wegman, and Zadeck 1988] Partition program variables into congruence classes All variables in a particular congruence class have the same value SSA form is helpful w #1 x #1 y = w z = x w #2 x #2 Approaches to computing congruence classes Pessimistic Assume no variables are congruent (start with n classes Iteratively coalesce classes that are determined to be congruent Optimistic Assume all variables are congruent (start with one class Iteratively partition variables that contradict assumption Slower but better results CS553 Lecture Value Numbering 5 CS553 Lecture Value Numbering 6 Role of SSA Form Basis SSA form is helpful Allows us to avoid data-flow analysis Variables correspond to values a not congruent to anything a = b a = c a = d = b = c = d Congruence classes: {,b, {,c, {,d If x and y are congruent then f(x and f(y are congruent x and y are congruent ta = a tb = b x = f(a,b y = f(ta,tb Use this fact to combine (pessimistic or split (optimistic classes Problem This is not true for φ-functions & b 1 CS553 Lecture Value Numbering 7 = = b 1 = b 2 = n m & b 2 = φ n(, b 3 = φ (b 1,b 2 m & b 3 Solution: Label φ-functions with join point CS553 Lecture Value Numbering 8 2

Pessimistic Global Value Numbering Algorithm Initially each variable is in its own congruence class Consider each assignment statement s (reverse postorder in CFG Update LHS value number with hash of RHS Identical value number congruence Why reverse postorder? Ensures that when we consider an assignment statement, we have already considered definitions that reach the RHS operands c b e a f Postorder: d, c, e, b, f, a d CS553 Lecture Value Numbering 9 for each assignment of the form: x = f(a,b ValNum[x] UniqueValue( // same for a and b for each assignment of the form: x = f(a,b (in reverse postorder ValNum[x] Hash(f ValNum[a] ValNum[b] = b 1 = b 1 i 1 = 1 #1 b 1 #2 w 3 = φ n (, x 3 = φ n (, = w 3 +i 1 = x 3 +i 1 = = i 1 #3 #4 #2 #5 #2 #6 #1 #7 #1 w 3 #8 φ n (#2,#1 #12 x 3 #9 φ n (#2,#1 #12 #10+(#12,#3 #13 #11+(#12,#3 #13 CS553 Lecture Value Numbering 10 Snag! Problem Our algorithm assumes that we consider operands before variables that depend upon it Can t deal with code containing loops! Solution Ignore back edges Make conservative (worst case assumption for previously unseen variable (i.e., assume its in it s own congruence class Optimistic Global Value Numbering Initially all variables in one congruence class Split congruence classes when evidence of non-congruence arises Variables that are computed using different functions Variables that are computed using functions with non-congruent operands CS553 Lecture Value Numbering 11 CS553 Lecture Value Numbering 12 3

Splitting Initially Variables computed using the same function are placed in the same class = f(,b 1 = f(c 1,d 1 = f(e 1,f 1 Iteratively Split classes when corresponding operands are in different classes Example: assume and c 1 are congruent, but e 1 is congruent to neither P P c 1 Splitting (cont Definitions Suppose P and are sets representing congruence classes splits P for each i into two sets P \ i contains variables in P whose i th operand is in P / i contains variables in P whose i th operand is not in properly splits P if neither resulting set is empty = f(,b 1 = f(c 1,d 1 = f(e 1,f 1 P c 1 P \ 1 P / 1 CS553 Lecture Value Numbering 13 CS553 Lecture Value Numbering 14 Algorithm worklist for each function f C f for each assignment of the form x = f(a,b C f C f { x worklist worklist {C f CC CC {C f while worklist Delete some D from worklist for each class C properly split by D (at operand i CC CC C worklist worklist C Create new congruence classes C j {C \ i D and C k {C / i D CC CC C j C k worklist worklist C j C k Example SSA code x 0 = 1 y 0 = 2 = x 0 = y 0 = x 0 Congruence classes {x 0 S 1 {y 0 {,, S 3 {, S 4 { Worklist: ={x 0, S 1 ={y 0, ={,,, S 3 ={,, S 4 ={ psplit? no psplit S 1? no psplit? yes! \ 1 = {, = S 3 / 1 = { = S 4 Note: see paper for optimization CS553 Lecture Value Numbering 15 CS553 Lecture Value Numbering 16 4

Comparing Optimistic and Pessimistic Role of SSA Differences Handling of loops Pessimistic makes worst-case assumptions on back edges Optimistic requires actual contradiction to split classes w 0 = 5 x 0 = 5 Single global result Single def reaches each use No data (flow value at each point No data flow analysis Optimistic: Iterate over congruence classes, not CFG nodes Pessimistic: Visit each assignment once =φ(w 0, =φ(x 0, = = φ-functions Make data-flow merging explicit Treat like normal functions after subscripting them for each merge point CS553 Lecture Value Numbering 17 CS553 Lecture Value Numbering 18 Next Time Lecture Parallelism and Data Locality, start reading chapter 11 in dragon book Suggested Exercise Use any programming language to write the class declarations for the equivalence class and variable structures described at the bottom of page 6 in the value numbering chapter. Instantiate the data structures for the small example programs in Figures 7.2 and 7.6 and perform optimistic value numbering. Remember to convert to SSA first. After performing value numbering, how will the program be transformed? Now perform pessimistic value numbering on the two examples. Transform the code as well. CS553 Lecture Value Numbering 19 5