Languages and Compiler Design II IR Code Optimization


Material provided by Prof. Jingke Li. Stolen with pride and modified by Herb Mayer.
PSU, Spring 2010, rev. 4/16/2010
PSU CS322 HM 1

Agenda
- IR Optimization
- Redundancy Elimination
- Sample: CSE
- Partial Redundancy Elimination (PRE)
- Copy Propagation
- Value Numbering
- Loop Invariant Code Motion
- Counter Examples
- Strength Reduction
- Induction Variable (IV) Elimination

IR Optimization
Definition: Optimization is the translation of an original program P1 into a semantically equivalent program P2 with better properties.
What counts as "better" depends on the project. Possibilities include code compactness, execution speed, numeric precision, and others.

IR Optimization
Optimizations transform a program into a functionally equivalent program with better performance. Transformations can be implemented at various stages and levels.
Advantages of IR-level optimization:
- IR operations are explicit, so cost estimates can be accurate
- IR optimizations are machine-independent, hence the results are portable across different target machines
Scopes of optimization:
- Local: transforming code by analyzing a single basic block
- Global: transforming code by analyzing a whole subroutine
- Inter-procedural: transforming code by analyzing the whole program
Concepts and techniques:
- Basic blocks & flow graphs
- Control-flow analysis & data-flow analysis

Redundancy Elimination
IR code optimization removes redundant computations. Specific examples:
- Common Subexpression Elimination (CSE): based on lexical representation, applicable to global scope
- Partial Redundancy Elimination: more powerful than CSE
- Copy Propagation: companion optimization to CSE
- Value Numbering (VN): value based, works on a single basic block
- Super-local Value Numbering: extends VN to multiple blocks
- Loop Invariant Elimination: moves code from a frequently executed part of the program to a rarely executed part

Common Subexpression Elimination (CSE)
E is a common subexpression if it occurs at L1 and L2, was computed at L1, and none of its components received new values along the path to L2.
To achieve CSE, introduce a temp to hold the subexpression when it is first evaluated; see this example from Quicksort():

BB before CSE:
    t11 := 4*i
    x := a[t11]
    t12 := 4*i
    t13 := 4*j
    t14 := a[t13]
    a[t12] := t14
    t15 := 4*j
    a[t15] := x

BB after CSE:
    t11 := 4*i
    x := a[t11]
    t12 := t11
    t13 := 4*j
    t14 := a[t13]
    a[t12] := t14
    t15 := t13
    a[t15] := x

BB after total CSE:
    t11 := 4*i
    x := a[t11]
    t13 := 4*j
    t14 := a[t13]
    a[t11] := t14
    a[t13] := x

The second occurrence of 4*i in the BB (from Quicksort()) is a common subexpression; so is the second occurrence of 4*j.
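The first CSE step within a single basic block can be sketched in Python. This is only an illustration under stated assumptions: statements are encoded as hypothetical tuples (dest, op, a, b), only binary operations are handled (array loads and stores are left out), and the function name local_cse is not from the slides.

```python
def local_cse(block):
    """Replace repeated binary expressions in one basic block by copies.

    block: list of (dest, op, a, b) tuples; operands are variable names
    (str) or constants (int). This tuple encoding is an assumption.
    """
    available = {}  # (op, a, b) -> variable that currently holds the value
    out = []
    for dest, op, a, b in block:
        key = (op, a, b)
        if op != "copy" and key in available:
            # Second occurrence: reuse the variable holding the value.
            out.append((dest, "copy", available[key], None))
        else:
            out.append((dest, op, a, b))
        # Assigning dest invalidates expressions that read dest, and
        # expressions whose value was held in dest.
        available = {k: v for k, v in available.items()
                     if dest not in (k[1], k[2]) and v != dest}
        # Record the expression unless dest is one of its own operands.
        if op != "copy" and dest not in (a, b):
            available.setdefault(key, dest)
    return out


if __name__ == "__main__":
    bb = [("t11", "*", 4, "i"), ("t12", "*", 4, "i"),
          ("t13", "*", 4, "j"), ("t15", "*", 4, "j")]
    for stmt in local_cse(bb):
        print(stmt)
```

The copies this leaves behind (t12 := t11, t15 := t13) correspond to the middle form on the slide; a later copy-propagation pass is what yields the "total CSE" form.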

CSE Across BBs
CSE can eliminate redundant computation across basic blocks.

Before CSE:
    BB1:  i := j
          a := 4 * i
          if goto BB3
    BB2:  i := j
          b := 4 * i
    BB3:  i := j
          c := 4 * i

After CSE:
    BB1:  i := j
          temp := 4 * i
          a := temp
          if goto BB3
    BB2:  i := j
          b := temp
    BB3:  i := j
          c := temp

Global CSE
Flow graph before global CSE:
    BB1:  i := m-1
          j := n
          t1 := 4*n
          v := a[t1]
    BB2:  i := i+1
          t2 := 4*i
          t3 := a[t2]
          if t3 < v goto BB2
    BB3:  j := j-1
          t4 := 4*j
          t5 := a[t4]
          if t5 > v goto BB3
    BB4:  if i >= j goto BB6
    BB5:  t6 := 4*i
          x := a[t6]
          t7 := 4*i
          t8 := 4*j
          t9 := a[t8]
          a[t7] := t9
          t10 := 4*j
          a[t10] := x
          goto BB2
    BB6:  t11 := 4*i
          x := a[t11]
          t12 := 4*i
          t13 := 4*j
          t14 := a[t13]
          a[t12] := t14
          t15 := 4*j
          a[t15] := x

Observations:
- Both 4*i in BB5 (and BB6) are CSEs: eliminate t6 and t11, t7, t12; replace with t2
- 4*j in BB5 and BB6 are CSEs: eliminate t10 and t15; replace with t8 and t13
- Now a[t2] in BB5 and BB6 become CSEs: replace with t3

Global CSE
Flow graph after global CSE:
    BB1:  i := m-1
          j := n
          t1 := 4*n
          v := a[t1]
    BB2:  i := i+1
          t2 := 4*i
          t3 := a[t2]
          if t3 < v goto BB2
    BB3:  j := j-1
          t4 := 4*j
          t5 := a[t4]
          if t5 > v goto BB3
    BB4:  if i >= j goto BB6
    BB5:  x := t3
          a[t2] := t5
          a[t4] := x
          goto BB2
    BB6:  x := t3
          t14 := a[t1]
          a[t2] := t14
          a[t1] := x

CSE Algorithm
Available expressions: an expression x op y (op being any binary operator) is available at node n if every path from the entry node to n evaluates the expression, and there are no definitions of x or y after the last evaluation.
Algorithm:
1. Compute available expressions for all expressions.
2. At each node n : w := x op y, where the expression x op y is available, search backwards for the evaluations of x op y that reach n.
3. Replace each evaluation v := x op y found in the search by t := x op y; v := t.
4. Replace n by w := t.
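Step 1, computing available expressions, is a forward data-flow problem that can be sketched as follows. The encoding is an assumption made for illustration: each flow-graph node holds one statement as a hypothetical tuple (dest, op, a, b), and predecessors are given in a dictionary. The function name is likewise illustrative.

```python
def available_expressions(stmts, preds, entry):
    """Return, for every node, the expressions available on entry to it.

    stmts: node -> (dest, op, a, b)   (assumed one statement per node)
    preds: node -> list of predecessor nodes
    """
    all_exprs = {(op, a, b) for (_, op, a, b) in stmts.values()}

    def gen(n):
        dest, op, a, b = stmts[n]
        # An expression that reads its own destination is not available after.
        return set() if dest in (a, b) else {(op, a, b)}

    def kill(n):
        dest = stmts[n][0]
        return {e for e in all_exprs if dest in e[1:]}

    # Optimistic start: everything available, except at the entry node.
    avail_in = {n: set(all_exprs) for n in stmts}
    avail_in[entry] = set()
    avail_out = {n: gen(n) | (avail_in[n] - kill(n)) for n in stmts}

    changed = True
    while changed:
        changed = False
        for n in stmts:
            if n == entry:
                new_in = set()
            else:
                # Available on entry only if available on every incoming path.
                new_in = set(all_exprs)
                for p in preds[n]:
                    new_in &= avail_out[p]
            new_out = gen(n) | (new_in - kill(n))
            if new_in != avail_in[n] or new_out != avail_out[n]:
                avail_in[n], avail_out[n] = new_in, new_out
                changed = True
    return avail_in


if __name__ == "__main__":
    stmts = {1: ("t1", "*", 4, "i"), 2: ("t2", "*", 4, "i"),
             3: ("i", "+", "i", 1), 4: ("t3", "*", 4, "i")}
    preds = {1: [], 2: [1], 3: [1], 4: [2, 3]}
    print(available_expressions(stmts, preds, entry=1))
```

In the small diamond-shaped graph of the usage example, 4*i is available at node 2 but not at node 4, because the path through node 3 redefines i.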

An Improved CSE Algorithm
The previous CSE algorithm performs an expensive backward search and inserts a new temp for every use of a common subexpression. The following ideas can improve the algorithm:
- Reduce the number of new temps by assigning a unique name to each unique expression
- Avoid the backward search by a separate traversal of the CFG
Algorithm:
1. Compute available expressions for all expressions.
2. Initialize an array Name[e] = ø for all expressions.
3. At each node n : w := x op y, where the expression x op y (denoted e below) is available:
   If Name[e] = ø, allocate a new name t and set Name[e] = t; else let t = Name[e].
   Replace n by w := t.
4. In a subsequent traversal of the CFG, at each node v := e, if Name[e] != ø, let t = Name[e]; replace the node by t := e; v := t.

Yet Another CSE Algorithm
Ideas:
- Create one temp for each unique expression.
- Let a subsequent pass eliminate unnecessary temps.
Algorithm:
1. Compute available expressions for all expressions.
2. At each evaluation of e:
   - Hash e to a name, t, in a table
   - Insert the assignment t = e
3. At a use of e where e is available:
   - Look up e's name t in the hash table
   - Replace e with t

Partial Redundancy Elimination (PRE)
An expression x op y is partially redundant at node n if some path from the entry node to n evaluates x op y, and there are no definitions of x or y after the last evaluation.
PRE optimization (it subsumes CSE):
- Discover partially redundant expressions
- Convert them to fully redundant expressions
- Remove the redundancy, to reduce the number of overall computations at runtime
[Figure: flow graphs with evaluations of x op y on paths leading to node n]

Copy Propagation
A copy statement has the form f := g.
A large number of copy statements may be generated after performing CSE optimizations. Copy propagation eliminates copy statements by using g for f wherever possible.

BB5 before:
    t6 := 4*i
    x := a[t6]
    t7 := t6
    t8 := 4*j
    t9 := a[t8]
    a[t7] := t9
    t10 := t8
    a[t10] := x
    goto BB2

BB5 after:
    t6 := 4*i
    x := a[t6]
    t8 := 4*j
    t9 := a[t8]
    a[t6] := t9
    a[t8] := x
    goto BB2
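A block-local version of copy propagation can be sketched in two small passes. The encoding is hypothetical: statements are tuples (dest, op, a, b), an array load x := a[t] is written (x, "load", t, None), and a store a[t] := v is written (None, "store", t, v). Removing the dead copies afterwards assumes the listed temps are not live at block exit unless passed in live_out.

```python
def copy_propagate(block):
    """Use g in place of f after a copy f := g, while the copy is valid."""
    copies = {}  # f -> g for copies that still hold
    out = []
    for dest, op, a, b in block:
        a = copies.get(a, a)
        b = copies.get(b, b)
        # Redefining dest invalidates copies that mention it on either side.
        copies = {f: g for f, g in copies.items() if dest not in (f, g)}
        if op == "copy" and a != dest:
            copies[dest] = a
        out.append((dest, op, a, b))
    return out


def remove_dead_copies(block, live_out=()):
    """Drop copy statements whose destination is never read afterwards."""
    live = set(live_out)
    kept = []
    for dest, op, a, b in reversed(block):
        if op == "copy" and dest not in live:
            continue  # nobody reads this copy any more
        live.discard(dest)
        live.update(x for x in (a, b) if isinstance(x, str))
        kept.append((dest, op, a, b))
    return kept[::-1]


if __name__ == "__main__":
    bb5 = [("t6", "*", 4, "i"), ("x", "load", "t6", None),
           ("t7", "copy", "t6", None), ("t8", "*", 4, "j"),
           ("t9", "load", "t8", None), (None, "store", "t7", "t9"),
           ("t10", "copy", "t8", None), (None, "store", "t10", "x")]
    for stmt in remove_dead_copies(copy_propagate(bb5)):
        print(stmt)
```

Splitting the work this way mirrors the slides: propagation only rewrites uses, and a separate clean-up removes the copies that became useless.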

Cascading Problem
CSE transformations may have a cascading effect: more rounds of CSE/copy propagation may be needed before reaching the final form.

Original:
    x := b + c
    y := a + x
    u := b + c
    v := a + u

After CSE:
    x := b + c
    y := a + x
    u := x
    v := a + u

After copy propagation:
    x := b + c
    y := a + x
    v := a + x

After a second round of CSE:
    x := b + c
    y := a + x
    v := y

Value Numbering
- Each variable is assumed to have a unique initial value
- Each unique value is assigned a unique number
- An expression's value is represented by a corresponding symbolic expression based on the operands' numbers. E.g. the value of expression x + y is 1+2, if 1 and 2 are the value numbers of x and y, respectively
- Each unique expression value is also assigned a unique number
- When a new variable or expression is encountered, check whether it has already been assigned a number; if so, use that number, otherwise assign it a new number
- Use a hash table for efficient number lookup

Sample: Value Numbering
Code:
    x := b + c
    y := a + x
    u := b + c
    v := a + u

statement       var or expr     assigned #
x := b + c      b               1
                c               2
                b+c (1+2)       3
                x               3
y := a + x      a               4
                a+x (4+3)       5
                y               5
u := b + c      u (1+2)         3
v := a + u      v (4+3)         5

Value numbering uses a single round to calculate the effect of cascaded optimizations.
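The numbering in this sample can be reproduced with a few lines of Python. This is a minimal sketch, assuming statements are hypothetical tuples (dest, op, a, b) and using a dictionary as the hash table; it only assigns numbers and does not rewrite the code.

```python
def value_number(block):
    """Assign value numbers to the statements of one basic block.

    Returns a list of (dest, number) pairs in statement order.
    """
    numbers = {}   # variable name or (op, n1, n2) key -> value number
    counter = 0

    def number_of(key):
        nonlocal counter
        if key not in numbers:
            counter += 1          # first sighting: allocate a fresh number
            numbers[key] = counter
        return numbers[key]

    result = []
    for dest, op, a, b in block:
        # The expression is identified by its operator and operand numbers.
        key = (op, number_of(a), number_of(b))
        n = number_of(key)
        numbers[dest] = n         # dest now carries the expression's value
        result.append((dest, n))
    return result


if __name__ == "__main__":
    code = [("x", "+", "b", "c"), ("y", "+", "a", "x"),
            ("u", "+", "b", "c"), ("v", "+", "a", "u")]
    print(value_number(code))  # [('x', 3), ('y', 5), ('u', 3), ('v', 5)]
```

Because u receives the same number as x, the expression a + u hashes to the same key as a + x, so v and y end up with the same number in a single pass.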

Loop Invariant Code Motion
If a loop contains a statement t := a op b such that a and b have the same values each time around the loop, then t will also have the same value each time. Hoist such loop-invariant statements out of the loop!

Before:
    BB1:  t1 := 0
    BB2:  i := i+1
          t2 := a * b
          M[i] := t2
          if a < N goto BB3
    BB3:  x := t2

After:
    BB1:  t1 := 0
          t2 := a * b
    BB2:  i := i+1
          M[i] := t2
          if a < N goto BB3
    BB3:  x := t2

Loop Invariant Criteria
A statement S : t := a1 op a2 is loop-invariant within loop L if, for each operand a_i:
1. a_i is a constant, or
2. all definitions of a_i that reach S are outside the loop, or
3. only one definition of a_i reaches S, and that definition is loop-invariant
An iterative algorithm can be used to find all loop-invariant statements.
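The iterative algorithm can be sketched as a fixed-point loop over the three criteria. The inputs are assumptions made for illustration: loop statements are hypothetical tuples (dest, op, a, b) keyed by a statement id, and reaching-definition information is supplied as a precomputed dictionary rather than derived here.

```python
def loop_invariant_statements(loop, reaching):
    """Return the ids of loop-invariant statements.

    loop:     statement id -> (dest, op, a, b) for statements inside the loop
    reaching: (statement id, variable) -> set of definition ids reaching that
              use; ids that are not keys of `loop` are definitions outside it
    """
    invariant = set()

    def operand_ok(s, x):
        if not isinstance(x, str):
            return True                      # criterion 1: constant
        defs = reaching[(s, x)]
        if all(d not in loop for d in defs):
            return True                      # criterion 2: all defs outside
        # criterion 3: exactly one reaching def, itself loop-invariant
        return len(defs) == 1 and next(iter(defs)) in invariant

    changed = True
    while changed:                           # repeat until nothing new is found
        changed = False
        for s, (dest, op, a, b) in loop.items():
            if s not in invariant and operand_ok(s, a) and operand_ok(s, b):
                invariant.add(s)
                changed = True
    return invariant


if __name__ == "__main__":
    loop = {"s1": ("i", "+", "i", 1),
            "s2": ("t2", "*", "a", "b"),
            "s3": ("t3", "+", "t2", 1),
            "s4": ("t4", "+", "t3", "i")}
    reaching = {("s1", "i"): {"d0", "s1"},
                ("s2", "a"): {"da"}, ("s2", "b"): {"db"},
                ("s3", "t2"): {"s2"},
                ("s4", "t3"): {"s3"}, ("s4", "i"): {"s1"}}
    print(sorted(loop_invariant_statements(loop, reaching)))
```

Being invariant is necessary but not sufficient for hoisting; whether a statement may actually be moved out of the loop needs further safety checks that this sketch does not attempt.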

Strength Reduction (SR)
Definition: Reduction in strength is the replacement of an operation by a cheaper one, e.g. replace * by + if feasible.
Do not make such changes in the source, e.g. do not replace j=2*k; with j=k+k; let the optimizer do this.

Before:
    BB1:  if i >= y goto BB3
    BB2:  Call func1
          j := 2 * k
          i := i + 1
          goto BB1
    BB3:  x :=...

After:
    BB1:  if i >= y goto BB3
    BB2:  Call func1
          j := k + k
          i++
          goto BB1
    BB3:  x :=...
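The single rewrite shown on this slide, 2 * k into k + k, can be sketched as a peephole pass. The tuple encoding (dest, op, a, b) and the function name are hypothetical; a real optimizer would cover many more patterns and would weigh target costs.

```python
def reduce_strength(block):
    """Rewrite multiplications by the constant 2 as additions."""
    out = []
    for dest, op, a, b in block:
        if op == "*" and isinstance(a, int) and a == 2:
            out.append((dest, "+", b, b))      # 2 * k  ->  k + k
        elif op == "*" and isinstance(b, int) and b == 2:
            out.append((dest, "+", a, a))      # k * 2  ->  k + k
        else:
            out.append((dest, op, a, b))       # leave everything else alone
    return out


if __name__ == "__main__":
    bb2 = [("j", "*", 2, "k"), ("i", "+", "i", 1)]
    print(reduce_strength(bb2))  # [('j', '+', 'k', 'k'), ('i', '+', 'i', 1)]
```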

Induction Variable Elimination (IVE)
Definition: An Induction Variable (IV) is a variable iterating through a linear progression of values in a program section.
- The program section is frequently a proper loop
- IVs are either fundamental or dependent on other IVs
- IV elimination reduces multiple IVs into fewer, thus saving operations
- Since these operations are inside inner loops, savings can be significant
- After IVE other optimizations can be applied too, e.g. SR

Induction Variable Elimination, Cont'd
Source:
    integer a(100)
    do i = 1, 100
       a(i) = 2 * i
    enddo

Notes:
- The low bound is 1, not 0 as in C++ or Java, so subtract!
- It is OK for i to be undefined after the loop
- The rhs is deliberately not 4 * i, which would be easy: = IV

Original IR:
    BB0:  t1 = 1              // i
    BB1:  If t1 > 100 goto BB3
    BB2:  t2 = 2 * t1
          t3 = 4 * t1
          t4 = t3 - 4
          t5 = A(a) + t4
          *t5 = t2
          t1 = t1 + 1
          Goto BB1
    BB3:  After loop i undefined

With a new IV t0 for the byte offset:
    BB0:  t0 = 0              // IV
          t1 = 1              // i
    BB1:  If t0 >= 400 goto BB3
    BB2:  t2 = 2 * t1
          t5 = A(a) + t0
          *t5 = t2
          t0 = t0 + 4
          t1 = t1 + 1         // i is still needed for the rhs 2 * i
          Goto BB1
    BB3:  After loop i undefined

With the IV t0 holding the element address:
    BB0:  t0 = A(a)           // IV
          t1 = 1              // i
    BB1:  If t0 >= A(a)+400 goto BB3
    BB2:  t2 = 2 * t1
          *t0 = t2
          t0 = t0 + 4
          t1 = t1 + 1         // i is still needed for the rhs 2 * i
          Goto BB1
    BB3:  After loop i undefined
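A quick way to convince oneself that the transformation preserves behaviour is to simulate the first and the last version and compare the stores. The sketch below is only an illustration: memory is modelled as a Python dict, the base address A(a) is an arbitrary assumed value, and the transformed loop keeps incrementing i (t1) because the right-hand side 2 * i still reads it.

```python
def original_loop(base=1000):
    """Address recomputed from i on every iteration: base + 4*i - 4."""
    mem = {}
    t1 = 1                      # i
    while not t1 > 100:
        t2 = 2 * t1
        t3 = 4 * t1
        t4 = t3 - 4             # low bound is 1, so subtract one element
        t5 = base + t4
        mem[t5] = t2            # *t5 = t2
        t1 = t1 + 1
    return mem


def loop_after_ive(base=1000):
    """The address itself is the induction variable, advanced by 4."""
    mem = {}
    t0 = base                   # IV: address of the current element
    t1 = 1                      # i, kept only for the value 2 * i
    while not t0 >= base + 400:
        t2 = 2 * t1
        mem[t0] = t2            # *t0 = t2
        t0 = t0 + 4
        t1 = t1 + 1
    return mem


if __name__ == "__main__":
    before, after = original_loop(), loop_after_ive()
    print(before == after, len(after))  # True 100
```

The multiplication 4 * t1, the subtraction, and the address addition all disappear from the loop body; only one addition on t0 remains for addressing.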