Register Allocation. Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP 412 at Rice.

Similar documents
Global Register Allocation via Graph Coloring

Compiler Design. Register Allocation. Hwansoo Han

Introduction to Optimization, Instruction Selection and Scheduling, and Register Allocation

Register allocation. Overview

Instruction Selection: Preliminaries. Comp 412

CS 406/534 Compiler Construction Putting It All Together

CS 406/534 Compiler Construction Instruction Selection and Global Register Allocation

Intermediate Representations

Global Register Allocation via Graph Coloring The Chaitin-Briggs Algorithm. Comp 412

Code Shape Comp 412 COMP 412 FALL Chapters 4, 5, 6 & 7 in EaC2e. source code. IR IR target. code. Front End Optimizer Back End

Register Allocation. Register Allocation. Local Register Allocation. Live range. Register Allocation for Loops

Just-In-Time Compilers & Runtime Optimizers

Register Allocation. Global Register Allocation Webs and Graph Coloring Node Splitting and Other Transformations

Local Optimization: Value Numbering The Desert Island Optimization. Comp 412 COMP 412 FALL Chapter 8 in EaC2e. target code

Code generation for modern processors

Code generation for modern processors

Low-Level Issues. Register Allocation. Last lecture! Liveness analysis! Register allocation. ! More register allocation. ! Instruction scheduling

Runtime Support for OOLs Object Records, Code Vectors, Inheritance Comp 412

Procedure and Function Calls, Part II. Comp 412 COMP 412 FALL Chapter 6 in EaC2e. target code. source code Front End Optimizer Back End

CSC D70: Compiler Optimization Register Allocation

The Processor Memory Hierarchy

Instruction Scheduling Beyond Basic Blocks Extended Basic Blocks, Superblock Cloning, & Traces, with a quick introduction to Dominators.

Naming in OOLs and Storage Layout Comp 412

Lexical Analysis. Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP 412 at Rice.

Register Allocation 1

Global Register Allocation - Part 2

Register Allocation. Lecture 38

Runtime Support for Algol-Like Languages Comp 412

Register Allocation (via graph coloring) Lecture 25. CS 536 Spring

Combining Optimizations: Sparse Conditional Constant Propagation

Fall Compiler Principles Lecture 12: Register Allocation. Roman Manevich Ben-Gurion University

Local Register Allocation (critical content for Lab 2) Comp 412

Handling Assignment Comp 412

CS415 Compilers Register Allocation. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

The Software Stack: From Assembly Language to Machine Code

Global Register Allocation

Syntax Analysis, V Bottom-up Parsing & The Magic of Handles Comp 412

Register Allocation & Liveness Analysis

Intermediate Representations

Outline. Register Allocation. Issues. Storing values between defs and uses. Issues. Issues P3 / 2006

Global Register Allocation - Part 3

Arrays and Functions

CS415 Compilers. Procedure Abstractions. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

Register Allocation. Lecture 16

Register allocation. CS Compiler Design. Liveness analysis. Register allocation. Liveness analysis and Register allocation. V.

Optimizer. Defining and preserving the meaning of the program

Sustainable Memory Use Allocation & (Implicit) Deallocation (mostly in Java)

Lab 3, Tutorial 1 Comp 412

Syntax Analysis, VI Examples from LR Parsing. Comp 412

k register IR Register Allocation IR Instruction Scheduling n), maybe O(n 2 ), but not O(2 n ) k register code

The C2 Register Allocator. Niclas Adlertz

Lecture 25: Register Allocation

Register allocation. TDT4205 Lecture 31

Computing Inside The Parser Syntax-Directed Translation. Comp 412 COMP 412 FALL Chapter 4 in EaC2e. source code. IR IR target.

Parsing. Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP 412 at Rice.

Syntax Analysis, III Comp 412

Parsing II Top-down parsing. Comp 412

Code Shape II Expressions & Assignment

Global Register Allocation

Register Allocation. Introduction Local Register Allocators

Syntax Analysis, III Comp 412

Rematerialization. Graph Coloring Register Allocation. Some expressions are especially simple to recompute: Last Time

Topic 12: Register Allocation

Generating Code for Assignment Statements back to work. Comp 412 COMP 412 FALL Chapters 4, 6 & 7 in EaC2e. source code. IR IR target.

CSE P 501 Compilers. Register Allocation Hal Perkins Autumn /22/ Hal Perkins & UW CSE P-1

Lecture Overview Register Allocation

Today More register allocation Clarifications from last time Finish improvements on basic graph coloring concept Procedure calls Interprocedural

Register allocation. instruction selection. machine code. register allocation. errors

Lecture 6. Register Allocation. I. Introduction. II. Abstraction and the Problem III. Algorithm

Register Allocation via Hierarchical Graph Coloring

Lecture 15 Register Allocation & Spilling

Register allocation. Register allocation: ffl have value in a register when used. ffl limited resources. ffl changes instruction choices

CHAPTER 3. Register allocation

register allocation saves energy register allocation reduces memory accesses.

Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University have explicit

Building a Parser Part III

Instruction Selection, II Tree-pattern matching

Register Allocation 3/16/11. What a Smart Allocator Needs to Do. Global Register Allocation. Global Register Allocation. Outline.

EECS 583 Class 15 Register Allocation

Variables vs. Registers/Memory. Simple Approach. Register Allocation. Interference Graph. Register Allocation Algorithm CS412/CS413

Implementing Object-Oriented Languages - 2

Transla'on Out of SSA Form

The View from 35,000 Feet

Computing Inside The Parser Syntax-Directed Translation, II. Comp 412 COMP 412 FALL Chapter 4 in EaC2e. source code. IR IR target.

CS415 Compilers. Intermediate Represeation & Code Generation

CHAPTER 3. Register allocation

Syntax Analysis, VII One more LR(1) example, plus some more stuff. Comp 412 COMP 412 FALL Chapter 3 in EaC2e. target code.

Register Allocation (wrapup) & Code Scheduling. Constructing and Representing the Interference Graph. Adjacency List CS2210

Computing Inside The Parser Syntax-Directed Translation. Comp 412

UG3 Compiling Techniques Overview of the Course

Register Allocation. Preston Briggs Reservoir Labs

Parsing III. CS434 Lecture 8 Spring 2005 Department of Computer Science University of Alabama Joel Jones

Introduction to Parsing. Comp 412

Global Register Allocation - 2

Implementing Control Flow Constructs Comp 412

Loops and Locality. with an introduc-on to the memory hierarchy. COMP 506 Rice University Spring target code. source code OpJmizer

Code Merge. Flow Analysis. bookkeeping

The ILOC Virtual Machine (Lab 1 Background Material) Comp 412

CS415 Compilers. Instruction Scheduling and Lexical Analysis

Lexical Analysis, V Implemen'ng Scanners. Comp 412 COMP 412 FALL Sec0on 2.5 in EaC2e. target code. source code Front End OpMmizer Back End

Transcription:

Register Allocation Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP at Rice. Copyright 00, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp at Rice University have explicit permission to make copies of these materials for their personal use. Faculty from other educational institutions may use these materials for nonprofit educational purposes, provided this copyright notice is preserved. Register Allocation Part of the compiler s back end IR Instruction Selection m register asm Instruction Scheduling m register asm Register Allocation k register asm Errors Critical properties Produce correct code that uses k (or fewer) registers Minimize added loads and stores Minimize space used to hold spilled values Operate efficiently O(n), O(n log n), maybe O(n ), but not O( n ) Comp, Fall 00

Global Register Allocation The Big Picture m register code Register Allocator k register code Optimal global allocation is NP-Complete, under almost any assumptions. At each point in the code Determine which values will reside in registers Select a register for each such value The goal is an allocation that minimizes running time Most modern, global allocators use a graph-coloring paradigm Build a conflict graph or interference graph Find a k-coloring for the graph, or change the code to a nearby problem that it can k-color Comp, Fall 00 Graph Coloring (A Background Digression) The problem A graph G is said to be k-colorable iff the nodes can be labeled with integers k so that no edge in G connects two nodes with the same label Examples -colorable -colorable Each color can be mapped to a distinct physical register Comp, Fall 00

Building the Interference Graph What is an interference? (or conflict) Two values interfere if there exists an operation where both are simultaneously live If x and y interfere, they cannot occupy the same register To compute interferences, we must know where values are live We ve seen Liveness analysis in the Data Flow Analysis lecture. Comp, Fall 00 Observation on Coloring for Register Allocation Suppose you have k registers look for a k coloring Any vertex n that has fewer than k neighbors in the interference graph (n < k) can always be colored! Pick any color not used by its neighbors there must be one Comp, Fall 00 5

Chaitin s Algorithm. While vertices with < k neighbors in G I > Pick any vertex n such that n < k and put it on the stack > Remove that vertex and all edges incident to it from G I. If G I is non-empty (all vertices have k or more neighbors) then: > Pick a vertex n (using some heuristic) and spill the live range associated with n > Remove vertex n from G I, along with all edges incident to it and put it on the spill list > If this causes some vertex in G I to have fewer than k neighbors, then go to step ; otherwise, repeat step. If the spill list is not empty, insert spill code, then rebuild the interference graph and try to allocate, again Lowers degree of n s neighbors. Otherwise, successively pop vertices off the stack and color them in the lowest color not used by some neighbor Comp, Fall 00 6 Chaitin s Algorithm in Practice Registers 5 is the only node with degree < Comp, Fall 00 7

Chaitin s Algorithm in Practice Registers 5 Now, & have degree < Comp, Fall 00 8 Chaitin s Algorithm in Practice Registers 5 Now all nodes have degree < Comp, Fall 00 9 5

Chaitin s Algorithm in Practice Registers 5 Comp, Fall 00 0 Chaitin s Algorithm in Practice Registers 5 : : : Comp, Fall 00 6

Chaitin s Algorithm in Practice Registers 5 : : : Comp, Fall 00 Chaitin s Algorithm in Practice Registers 5 : : : Comp, Fall 00 7

Chaitin s Algorithm in Practice Registers 5 : : : Comp, Fall 00 Chaitin s Algorithm in Practice Registers 5 : : : Comp, Fall 00 5 8

Chaitin s Algorithm in Practice Registers 5 : : : Comp, Fall 00 6 Improvement in Coloring Scheme Optimistic Coloring If Chaitin s algorithm reaches a state where every node has k or more neighbors, it chooses a node to spill. Briggs said, take that same node and push it on the stack When you pop it off, a color might be available for it! Registers: Chaitin s algorithm immediately spills one of these nodes For example, a node n might have k+ neighbors, but those neighbors might only use (<k) colors Degree is a loose upper bound on colorability Comp, Fall 00 Briggs et al, PLDI 89 (Also, TOPLAS 99) 7 9

Improvement in Coloring Scheme Optimistic Coloring If Chaitin s algorithm reaches a state where every node has k or more neighbors, it chooses a node to spill. Briggs said, take that same node and push it on the stack When you pop it off, a color might be available for it! Registers: -Colorable Briggs algorithm finds an available color For example, a node n might have k+ neighbors, but those neighbors might only use just one color (or any number < k ) Degree is a loose upper bound on colorability Comp, Fall 00 8 Chaitin-Briggs Algorithm. While vertices with < k neighbors in G I > Pick any vertex n such that n < k and put it on the stack > Remove that vertex and all edges incident to it from G I This action often creates vertices with fewer than k neighbors. If G I is non-empty (all vertices have k or more neighbors) then: > Pick a vertex n (using some heuristic condition), push n on the stack and remove n from G I, along with all edges incident to it > If this causes some vertex in G I to have fewer than k neighbors, then go to step ; otherwise, repeat step. Successively pop vertices off the stack and color them in the lowest color not used by some neighbor > If some vertex cannot be colored, then pick an uncolored vertex to spill, spill it, and restart at step Comp, Fall 00 9 0

Chaitin-Briggs in Practice Registers No node has degree < Chaitin would spill a node Briggs picks the same node & stacks it Comp, Fall 00 0 Chaitin-Briggs in Practice Registers Pick a node, say Comp, Fall 00

Chaitin-Briggs in Practice Registers Pick a node, say Comp, Fall 00 Chaitin-Briggs in Practice Registers Now, both & have degree < Pick one, say Comp, Fall 00

Chaitin-Briggs in Practice Registers Both & have degree <. Take them in order, then. Comp, Fall 00 Chaitin-Briggs in Practice Registers Comp, Fall 00 5

Chaitin-Briggs in Practice Registers Now, rebuild the graph Comp, Fall 00 6 Chaitin-Briggs in Practice Registers : : Comp, Fall 00 7

Chaitin-Briggs in Practice Registers : : Comp, Fall 00 8 Chaitin-Briggs in Practice Registers : : Comp, Fall 00 9 5

Chaitin-Briggs in Practice Registers : : Comp, Fall 00 0 Approximate Global Allocation Linear Scan Allocation Coloring allocators are often viewed as too expensive for use in JIT environments, where compile time occurs at runtime Linear scan allocators use an approximate interference graph Algorithm does allocation in a linear scan of the graph Linear scan produces faster, albeit less precise, allocations Linear scan allocators hit a different point on the curve of cost versus performance Comp, Fall 00 Sun s HotSpot server compiler uses a complete Chaitin-Briggs allocator. 6