Loop Invariant Code Mo0on

Similar documents
Operator Strength Reduc1on

Overview Of Op*miza*on, 2

Welcome to COMP 512. COMP 512 Rice University Spring 2015

Dynamic Compila-on in Smalltalk- 80

Construc)on of Sta)c Single- Assignment Form

Intermediate Representations

Local Optimization: Value Numbering The Desert Island Optimization. Comp 412 COMP 412 FALL Chapter 8 in EaC2e. target code

Instruction Scheduling Beyond Basic Blocks Extended Basic Blocks, Superblock Cloning, & Traces, with a quick introduction to Dominators.

Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University have explicit

Introduction to Optimization Local Value Numbering

Register Allocation. Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP 412 at Rice.

Code Shape Comp 412 COMP 412 FALL Chapters 4, 5, 6 & 7 in EaC2e. source code. IR IR target. code. Front End Optimizer Back End

Computing Inside The Parser Syntax-Directed Translation. Comp 412 COMP 412 FALL Chapter 4 in EaC2e. source code. IR IR target.

Just-In-Time Compilers & Runtime Optimizers

Implementing Control Flow Constructs Comp 412

Global Register Allocation via Graph Coloring

Intermediate Representations

Transla'on Out of SSA Form

Register Alloca.on via Graph Coloring

The Big Picture. The Programming Assignment & A Taxonomy of Op6miza6ons. COMP 512 Rice University Spring 2015

Combining Optimizations: Sparse Conditional Constant Propagation

Parsing II Top-down parsing. Comp 412

Grammars. CS434 Lecture 15 Spring 2005 Department of Computer Science University of Alabama Joel Jones

Generating Code for Assignment Statements back to work. Comp 412 COMP 412 FALL Chapters 4, 6 & 7 in EaC2e. source code. IR IR target.

Instruction Selection: Preliminaries. Comp 412

Syntax Analysis, III Comp 412

Syntax Analysis, III Comp 412

The View from 35,000 Feet

Lazy Code Motion. Comp 512 Spring 2011

Loop Optimizations. Outline. Loop Invariant Code Motion. Induction Variables. Loop Invariant Code Motion. Loop Invariant Code Motion

The Software Stack: From Assembly Language to Machine Code

Parsing III. CS434 Lecture 8 Spring 2005 Department of Computer Science University of Alabama Joel Jones

Code Merge. Flow Analysis. bookkeeping

Handling Assignment Comp 412

Code Shape II Expressions & Assignment

Parsing. Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP 412 at Rice.

Introduction to Optimization, Instruction Selection and Scheduling, and Register Allocation

CS 406/534 Compiler Construction Putting It All Together

Local Register Allocation (critical content for Lab 2) Comp 412

Register Allocation. Global Register Allocation Webs and Graph Coloring Node Splitting and Other Transformations

Syntax Analysis, V Bottom-up Parsing & The Magic of Handles Comp 412

Lab 3, Tutorial 1 Comp 412

Computing Inside The Parser Syntax-Directed Translation. Comp 412

Arrays and Functions

CS415 Compilers. Intermediate Represeation & Code Generation

Naming in OOLs and Storage Layout Comp 412

Loops and Locality. with an introduc-on to the memory hierarchy. COMP 506 Rice University Spring target code. source code OpJmizer

An Overview of GCC Architecture (source: wikipedia) Control-Flow Analysis and Loop Detection

Runtime Support for OOLs Object Records, Code Vectors, Inheritance Comp 412

Fortran H and PL.8. Papers about Great Op.mizing Compilers. COMP 512 Rice University Spring 2015

Syntax Analysis, VII One more LR(1) example, plus some more stuff. Comp 412 COMP 412 FALL Chapter 3 in EaC2e. target code.

Building a Parser Part III

CSE P 501 Compilers. SSA Hal Perkins Spring UW CSE P 501 Spring 2018 V-1

Lecture 9: Loop Invariant Computation and Code Motion

Lexical Analysis. Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP 412 at Rice.

Runtime Support for Algol-Like Languages Comp 412

CS415 Compilers Overview of the Course. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

Introduction to Parsing. Comp 412

EECS 583 Class 8 Classic Optimization

Simone Campanoni Loop transformations

Intermediate Representations Part II

CS415 Compilers. Syntax Analysis. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

Parsing II Top-down parsing. Comp 412

Overview of a Compiler

CS415 Compilers. Procedure Abstractions. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

Middle End. Code Improvement (or Optimization) Analyzes IR and rewrites (or transforms) IR Primary goal is to reduce running time of the compiled code

Introduction to Optimization. CS434 Compiler Construction Joel Jones Department of Computer Science University of Alabama

Calvin Lin The University of Texas at Austin

Procedure and Function Calls, Part II. Comp 412 COMP 412 FALL Chapter 6 in EaC2e. target code. source code Front End Optimizer Back End

A Bad Name. CS 2210: Optimization. Register Allocation. Optimization. Reaching Definitions. Dataflow Analyses 4/10/2013

Lecture 21 CIS 341: COMPILERS

The Development of Static Single Assignment Form

Lecture Notes on Loop Optimizations

6. Intermediate Representation

Lecture 3 Local Optimizations, Intro to SSA

Compiler Design. Register Allocation. Hwansoo Han

A short tutorial on Git. Servesh Muralidharan 4 March 2014

Code Placement, Code Motion

Data Structures and Algorithms in Compiler Optimization. Comp314 Lecture Dave Peixotto

Introduction to optimizations. CS Compiler Design. Phases inside the compiler. Optimization. Introduction to Optimizations. V.

The So'ware Stack: From Assembly Language to Machine Code

Computing Inside The Parser Syntax-Directed Translation, II. Comp 412 COMP 412 FALL Chapter 4 in EaC2e. source code. IR IR target.

Reuse Optimization. LLVM Compiler Infrastructure. Local Value Numbering. Local Value Numbering (cont)

Tour of common optimizations

Lessons from Fi,een Years of Adap2ve Compila2on

Flaco Lock Service. RESTful Distributed Locking Over HTTP. Jose Falcon Bryon Jacob

CSC D70: Compiler Optimization Register Allocation

6. Intermediate Representation!

CS 426 Fall Machine Problem 4. Machine Problem 4. CS 426 Compiler Construction Fall Semester 2017

The Processor Memory Hierarchy

This test is not formatted for your answers. Submit your answers via to:

Goals of Program Optimization (1 of 2)

Syntax Analysis, VI Examples from LR Parsing. Comp 412

BEAMJIT: An LLVM based just-in-time compiler for Erlang. Frej Drejhammar

Intermediate Representations

CS 701. Class Meets. Instructor. Teaching Assistant. Key Dates. Charles N. Fischer. Fall Tuesdays & Thursdays, 11:00 12: Engineering Hall

Compiler Construction 2010/2011 Loop Optimizations

CS415 Compilers Register Allocation. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

Lecture Notes on Memory Layout

Single-Pass Generation of Static Single Assignment Form for Structured Languages

Transcription:

COMP 512 Rice University Spring 2015 Loop Invariant Code Mo0on A Simple Classical Approach Copyright 2015, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 512 at Rice University have explicit permission to make copies of these materials for their personal use. Faculty from other educahonal inshtuhons may use these materials for nonprofit educahonal purposes, provided this copyright nohce is preserved CitaHon numbers refer to entries in the EaC2e bibliography.

Background Code in loops executes more o5en than code outside loops An expression is loop invariant if it computes the same value, indepent of the iterahon in which it is computed Loop invariant code can oqen be moved to a spot before the loop Execute once and reuse the value mulhple Hmes Reduce the execuhon cost of the loop Techniques seem to fall into two categories Ad- hoc graph- based algorithms Complete data- flow soluhons Think of loop invariant as redundant across iterahons Always lengthens live range COMP 512, Rice University 2

Loop- invariant Code Mo?on x + y Neither x nor y change in loop Example x + y is invariant in the loop Computed on each iterahon Subexpressions don t change Move evaluahon out of loop Need block to hold it Use exishng block or insert new Rela?onship to redundancy x + y is redundant along back edge x + y is not redundant from loop entry If we add an evaluahon on the entry edge, x + y in the loop is redundant COMP 512, Rice University 3

Loop- invariant Code Mo?on x + y t x + y t t x + y t Neither x nor y change in loop OpHon 1 Move x + y into predecessor OpHon 2 Create a block for x + y What s the difference? COMP 512, Rice University 4

Loop- invariant Code Mo?on zero- trip test t x + y t x + y landing pad x + y t t SpeculaHve ConservaHve In prac?ce, many loops have a zero- trip test Determines whether to run the first iterahon Moving x + y above zero- trip test is specula5ve Lengthens path around the loop LICM techniques oqen create a landing pad COMP 512, Rice University 5

Loop- invariant Code Mo?on z x + y t x + y z t z x + y Move expression Move assignment Another difference between methods Some methods move expression evaluahon, but not assignment Easier safety condihons, easier to manage transformahon Leaves a copy operahon in the loop (may coalesce) Other methods move the enhre assignment Eliminates copy from loop, as well COMP 512, Rice University 6

Loop- invariant Code Mo?on y!= 0 x / y None of x, y, or z change in loop Control flow Another source of speculahon Move it & lengthen other path Don t move it while ( )! {! if (y!= 0)! then z = x/y! else z = x! }! Divergence may be an issue Don t want to move the op if doing so introduces an excephon If that path is hot, then y!= 0 and we might move it. Note that y is invariant in this classic example, so we could move the test, Would like to use branch probabilihes as well. COMP 512, See Cytron, Rice University Lowry, & Zadeck. 7

Loop- invariant Code Mo?on Pedagogical Plan 1. A simple and direct algorithm (today) 2. Lazy code mohon, a data- flow soluhon based on availability 3. Cytron, Lowry, & Zadeck s SSA- based approach (probably not) Which is best? The authors of 2 & 3 would each argue for their technique In prachce, each has strengths and weaknesses Taught 3 last year and had someone implement it in LLVM Surprising set of complicahons in SSA implementahon Moving around all of those definihons proved problemahc CreaHng new names, recreahng SSA name space too much work for the benefits Cannot recomm CLZ in prachce COMP 512, Rice University 8

Loop Shape Is Important Loops Evaluate condihon before loop (if needed) Evaluate condihon aqer loop Branch back to the top (if needed) Merges test with last block of loop body COMP 412 teaches this loop shape. It creates a natural place to insert a landing pad. Pre- test Landing pad Loop body Post- test while, for, do, & un5l all fit this basic model Next block For tail recursion, unroll the recursion once Paper by Mueller & Whalley in SIGPLAN 92 PLDI COMP 512, Rice University about conver5ng arbitrary loops to this form. 9

Loop- invariant Code Mo?on Preliminaries Need a list of loops, ordered inner to outer in each loop nest Define natural loop as one formed by a back edge with respect to DOM An edge (t,h) is a back edge if h DOM t h is the loop s header and t is a tail of the loop The loop s body includes all predecessors back to h Natural loops can nest Given two headers, h 1 and h 2, the loops are nested if h 1 DOM h 2 or h 2 DOM h 1 If h 1 & h 2 are unrelated by DOM, loops are not nested Inner to outer order is defined by DOM Need a landing pad on each loop Insert above loop header & redirect inbound edges Given back edge (h,t) body { h } stack empty push(t) while (stack is nonempty) { n pop() if n body then { body body { n } for each x pred(n) push(x) } } Find the loop body formed by a back edge (t,h) COMP 512, Rice University 10

Loop- invariant Code Mo?on A Simple Algorithm for each l in LOOPLIST do FactorInvariants(l ) FactorInvariants(loop) MarkInvariants(loop) for each expr e loop (x y + z) if x is marked invariant then begin allocate a new name t replace o with x t insert t e in landing pad for loop MarkInvariants(loop ) LOOPDEF block b loop DEF(b) for each op o loop (x y + z) mark x as invariant if y LOOPDEF then mark x as variant if z LOOPDEF then mark x as variant COMP 512, Rice University 11

Example Consider the following simple loop nest do i 1 to 100 do j 1 to 100 do k 1 to 100 Dimensions addressed Multiplies a(i,j,k) i * j * k 3,000,000 2,000,000 Original Code COMP 512, Rice University 12

Example Consider the following simple loop nest do i 1 to 100 do j 1 to 100 Dimensions addressed t 1 addr(a(i,j)) 20,000 Multiplies t 2 i * j 10,000 do k 1 to 100 t 1 (k) t 2 * k 1,000,000 1,000,000 After LICM on the Inner Loop COMP 512, Rice University 13

Example Consider the following simple loop nest do i 1 to 100 Dimensions addressed t 3 addr(a(i)) 100 do j 1 to 100 t 1 addr(t 3 (j)) 10,000 Multiplies t 2 i * j 10,000 do k 1 to 100 t 1 (k) t 2 * k 1,000,000 1,000,000 After Doing Middle Loop COMP 512, Rice University 14

Safety of Loop- invariant Code Mo?on What happens if we move an expression that generates an excep?on? Maybe the fault happens at a different Hme in execuhon Maybe the original code would not have faulted Ideally, the compiler should delay the fault unhl it would occur Op?ons Mask fault & add a test to loop body Replace value with one that causes fault when used Block read access to the faulted value Patch the executable with a bad opcode Generate two copies of the loop and rerun with original code Never move an evaluahon that can fault COMP 512, Rice University Applies to all the algorithms 15

Profitability of Loop- invariant Code Mo?on Does the loop body always execute? Placement of the landing pad is crucial Lengthen any paths? ConservaHve code mohon: not a problem SpeculaHve code mohon: might be an issue Rely on eshmates of branch frequency? Register pressure? Target of moved operahon has longer live range Might shorten live ranges of operands of moved op Backus dilemmna Applies to all the algorithms COMP 512, Rice University 16

Shortcomings of the Simple LICM Algorithm Moves code out of condihonals in a speculahve fashion Ignores control flow inside the loop Can imagine a more sophishcated approach based on control- depence Moves only evaluahon, not assignment To move assignments would require a more complex approach to naming both arguments and results (such as the SSA name space?) Only finds first order invariants for i 1 to n for j 1 to m x a * b y x * c First order invariant Second order invariant Need to iterate on a loop unhl it stabilizes, then move on COMP 512, Rice University 17