Register allocation. instruction selection. machine code. register allocation. errors

Similar documents
Register allocation. Register allocation: ffl have value in a register when used. ffl limited resources. ffl changes instruction choices

Topic 12: Register Allocation

Fall Compiler Principles Lecture 12: Register Allocation. Roman Manevich Ben-Gurion University

Register allocation. CS Compiler Design. Liveness analysis. Register allocation. Liveness analysis and Register allocation. V.

Variables vs. Registers/Memory. Simple Approach. Register Allocation. Interference Graph. Register Allocation Algorithm CS412/CS413

Liveness Analysis and Register Allocation. Xiao Jia May 3 rd, 2013

Lecture 21 CIS 341: COMPILERS

Compiler Design. Register Allocation. Hwansoo Han

Liveness Analysis and Register Allocation

Register Allocation & Liveness Analysis

CSC D70: Compiler Optimization Register Allocation

Compilers and Code Optimization EDOARDO FUSELLA

Low-Level Issues. Register Allocation. Last lecture! Liveness analysis! Register allocation. ! More register allocation. ! Instruction scheduling

Global Register Allocation

CSE P 501 Compilers. Register Allocation Hal Perkins Autumn /22/ Hal Perkins & UW CSE P-1

Register allocation. Overview

Register Allocation. CS 502 Lecture 14 11/25/08

Compilation /17a Lecture 10. Register Allocation Noam Rinetzky

Register Allocation. Global Register Allocation Webs and Graph Coloring Node Splitting and Other Transformations

Global Register Allocation

register allocation saves energy register allocation reduces memory accesses.

Register allocation. TDT4205 Lecture 31

Compiler Architecture

Global Register Allocation via Graph Coloring

Outline. Register Allocation. Issues. Storing values between defs and uses. Issues. Issues P3 / 2006

Global Register Allocation - Part 3

Global Register Allocation - Part 2

Lecture 6. Register Allocation. I. Introduction. II. Abstraction and the Problem III. Algorithm

Redundant Computation Elimination Optimizations. Redundancy Elimination. Value Numbering CS2210

Iterated Register Coalescing

The compilation process is driven by the syntactic structure of the program as discovered by the parser

Register Allocation. Register Allocation. Local Register Allocation. Live range. Register Allocation for Loops

Introduction to optimizations. CS Compiler Design. Phases inside the compiler. Optimization. Introduction to Optimizations. V.

Register Allocation (wrapup) & Code Scheduling. Constructing and Representing the Interference Graph. Adjacency List CS2210

Runtime management. CS Compiler Design. The procedure abstraction. The procedure abstraction. Runtime management. V.

Register Allocation. Preston Briggs Reservoir Labs

Code generation for modern processors

Register Allocation. Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP 412 at Rice.

Code generation for modern processors

Rematerialization. Graph Coloring Register Allocation. Some expressions are especially simple to recompute: Last Time

Register Allocation 3/16/11. What a Smart Allocator Needs to Do. Global Register Allocation. Global Register Allocation. Outline.

Register Allocation, iii. Bringing in functions & using spilling & coalescing

CSc 553. Principles of Compilation. 23 : Register Allocation. Department of Computer Science University of Arizona

A Bad Name. CS 2210: Optimization. Register Allocation. Optimization. Reaching Definitions. Dataflow Analyses 4/10/2013

Alternatives for semantic processing

April 15, 2015 More Register Allocation 1. Problem Register values may change across procedure calls The allocator must be sensitive to this

COMP 303 Computer Architecture Lecture 3. Comp 303 Computer Architecture

Acknowledgement. CS Compiler Design. Semantic Processing. Alternatives for semantic processing. Intro to Semantic Analysis. V.

Translation. From ASTs to IR trees

The C2 Register Allocator. Niclas Adlertz

Code Generation. CS 540 George Mason University

CS 406/534 Compiler Construction Putting It All Together

Lecture Notes on Register Allocation

Rui Wang, Assistant professor Dept. of Information and Communication Tongji University.

Chapter 4: LR Parsing

Register Allocation. Lecture 38

Today More register allocation Clarifications from last time Finish improvements on basic graph coloring concept Procedure calls Interprocedural

Register Allocation via Hierarchical Graph Coloring

Register Allocation (via graph coloring) Lecture 25. CS 536 Spring

CS 406/534 Compiler Construction Instruction Selection and Global Register Allocation

Calvin Lin The University of Texas at Austin

Lecture 25: Register Allocation

CS 132 Compiler Construction

Register Allocation 1

Register Allocation. Lecture 16

How to efficiently use the address register? Address register = contains the address of the operand to fetch from memory.

Register Allocation. Michael O Boyle. February, 2014

Register Allocation. Stanford University CS243 Winter 2006 Wei Li 1

Experimental evaluation and improvements to linear scan register allocation

Concepts Introduced in Chapter 7

We can express this in dataflow equations using gen and kill sets, where the sets are now sets of expressions.

Interprocedural Variable Liveness Analysis for Function Signature Recovery

CHAPTER 3. Register allocation

Linear-Scan Register Allocation. CS 352 Lecture 12 11/28/07

Run-time Environment

CA Compiler Construction

CHAPTER 3. Register allocation

Lecture 3 Local Optimizations, Intro to SSA

Introduction to Optimization Local Value Numbering

CSCI565 Compiler Design

MIT Spring 2011 Quiz 3

Quality and Speed in Linear-Scan Register Allocation

CS 352 Compilers: Principles and Practice Final Examination, 12/11/05

Compiler Passes. Optimization. The Role of the Optimizer. Optimizations. The Optimizer (or Middle End) Traditional Three-pass Compiler

We can express this in dataflow equations using gen and kill sets, where the sets are now sets of expressions.

CS5363 Final Review. cs5363 1

Introduction to Optimization, Instruction Selection and Scheduling, and Register Allocation

Rethinking Code Generation in Compilers

Lecture 15 Register Allocation & Spilling

Compilation Lecture 11a. Register Allocation Noam Rinetzky. Text book: Modern compiler implementation in C Andrew A.

Acknowledgement. CS Compiler Design. Intermediate representations. Intermediate representations. Semantic Analysis - IR Generation

Reuse Optimization. LLVM Compiler Infrastructure. Local Value Numbering. Local Value Numbering (cont)

What Compilers Can and Cannot Do. Saman Amarasinghe Fall 2009

Chapter 9. Register Allocation

CIS 341 Final Examination 3 May 2011

Systems I. Machine-Level Programming V: Procedures

OPERATING SYSTEM. Chapter 9: Virtual Memory

Exploiting Hardware Resources: Register Assignment across Method Boundaries

Compiler Construction

CSCI Compiler Design

Transcription:

Register allocation IR instruction selection register allocation machine code errors Register allocation: have value in a register when used limited resources changes instruction choices can move loads and stores optimal allocation is difficult NP-complete for k 1 registers Copyright c 2012 by Antony L. Hosking. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and full citation on the first page. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or fee. Request permission to publish from hosking@cs.purdue.edu. 1

Register allocation by simplification Assume K registers 1. Build interference graph G: for each program point (a) compute set of temporaries simultaneously live (b) add edge to graph for each pair in set 2. Simplify: Color graph using a simple heuristic (a) suppose G has node m with degree < K (b) if G = G {m} can be colored then so can G, since nodes adjacent to m have at most K 1 colors (c) each such simplification will reduce degree of remaining nodes leading to more opportunity for simplification (d) leads to recursive coloring algorithm 3. Spill: suppose m of degree < K (a) target some node (temporary) for spilling (optimistically, spilling node will allow coloring of remaining nodes) (b) remove and continue simplifying 2

Register allocation by simplification (cont.) 4. Select: assign colors to nodes (a) start with empty graph (b) must be a color for non-spill nodes (basis for removal) (c) if adding spill node and no color available (neighbors already K-colored) then mark as an actual spill (d) repeat select 5. Start over: if select has no actual spills then finished, otherwise (a) rewrite program to fetch actual spills before each use and store after each definition (b) recalculate liveness and repeat 3

Coalescing Can delete a move instruction when source s and destination d do not interfere: coalesce them into a new node whose edges are the union of those of s and d In principle, any pair of non-interfering nodes can be coalesced unfortunately, the union is more constrained and new graph may no longer be K-colorable overly aggressive 4

Simplification with aggressive coalescing build aggressive coalesce done any coalesce simplify spill done spill select any 5

Conservative coalescing Apply tests for coalescing that preserve colorability. Suppose a and b are candidates for coalescing into node ab. Briggs: coalesce only if ab has < K neighbors of significant degree K simplify first removes all insignificant-degree neighbors ab will then be adjacent to < K neighbors simplify can then remove ab George: coalesce only if all significant-degree neighbors of a already interfere with b simplify removes all insignificant-degree neighbors of a remaining significant-degree neighbors of a already interfere with b so coalescing does not increase the degree of any node 6

Iterated register coalescing Interleave simplification with coalescing to eliminate most moves while guaranteeing not to introduce spills: 1. Build interference graph G and distinguish move-related from non-move-related nodes 2. Simplify: remove non-move-related nodes of low degree one at a time 3. Coalesce: conservatively coalesce move-related nodes remove associated move instruction if resulting node is non-move-related it can now be simplified repeat simplify and coalesce until only significant-degree or uncoalesced moves 4. Freeze: if unable to simplify or coalesce (a) look for move-related node of low-degree (b) freeze its associated moves (give up on coalescing) (c) now treat as non-move-related; resume iteration of simplify and coalesce 7

Iterated register coalescing (cont.) 5. Spill: if no low-degree nodes (a) select candidate for spilling (b) remove to stack and continue simplifying 6. Select: pop stack assigning colors (including actual spills) 7. Start over: if select has no actual spills then finished, otherwise (a) rewrite code to fetch actual spills before each use and store after each definition (b) recalculate liveness and repeat 8

Iterated register coalescing SSA constant propagation (optional) build simplify conservative coalesce freeze potential spill select actual spill done spills any 9

Spilling Spills require repeating build and simplify on the whole program To avoid increasing number of spills in future rounds of build can simply discard coalescences Alternatively, preserve coalescences from before first potential spill, discard those after that point Move-related spilled temporaries can be aggressively coalesced, since (unlike registers) there is no limit on the number of stack-frame locations 10

Precolored nodes Precolored nodes correspond to machine registers (e.g., stack pointer, arguments, return address, return value) select and coalesce can give an ordinary temporary the same color as a precolored register, if they don t interfere e.g., argument registers can be reused inside procedures for a temporary simplify, freeze and spill cannot be performed on them also, precolored nodes interfere with other precolored nodes So, treat precolored nodes as having infinite degree This also avoids needing to store large adjacency lists for precolored nodes; coalescing can use the George criterion 11

Temporary copies of machine registers Since precolored nodes don t spill, their live ranges must be kept short: 1. use move instructions 2. move callee-save registers to fresh temporaries on procedure entry, and back on exit, spilling between as necessary 3. register pressure will spill the fresh temporaries as necessary, otherwise they can be coalesced with their precolored counterpart and the moves deleted 12

Caller-save and callee-save registers Variables whose live ranges span calls should go to callee-save registers, otherwise to caller-save This is easy for graph coloring allocation with spilling calls interfere with caller-save registers a cross-call variable interferes with all precolored caller-save registers, as well as with the fresh temporaries created for callee-save copies, forcing a spill choose nodes with high degree but few uses, to spill the fresh callee-save temporary instead of the cross-call variable this makes the original callee-save register available for coloring the cross-call variable 13

Example enter: c := r3 a := r1 b := r2 d := 0 e := a loop: d := d + b e := e - 1 if e > 0 goto loop r1 := d r3 := c return [ r1, r3 live out ] Temporaries are a, b, c, d, e Assume target machine with K = 3 registers: r1, r2 (caller-save/argument/result), r3 (callee-save) The code generator has already made arrangements to save r3 explicitly by copying into temporary a and back again 14

Interference graph: r3 c r2 b e r1 a d 15

No opportunity for simplify or freeze (all non-precolored nodes have significant degree K) Any coalesce will produce a new node adjacent to K significant-degree nodes Must spill based on priorities: Node uses + defs outside loop uses + defs inside loop degree priority a ( 2 +10 0 )/ 4 = 0.50 b ( 1 +10 1 )/ 4 = 2.75 c ( 2 +10 0 )/ 6 = 0.33 d ( 2 +10 2 )/ 4 = 5.50 e ( 1 +10 3 )/ 3 = 10.30 Node c has lowest priority so spill it 16

Interference graph with c removed: r3 r2 b e r1 a d 17

Only possibility is to coalesce a and e: ae will have < K significant-degree neighbors (after coalescing d will be low-degree, though high-degree before) r3 r2 b r1 ae d 18

Can now coalesce b with r2 (or coalesce ae and r1): r3 r2b r1 ae d 19

Coalescing ae and r1 (could also coalesce d with r1): r3 r2b r1ae d 20

Cannot coalesce r1ae with d because the move is constrained: the nodes interfere. Must simplify d: r3 r2b r1ae 21

Graph now has only precolored nodes, so pop nodes from stack coloring along the way d r3 a, b, e have colors by coalescing c must spill since no color can be found for it Introduce new temporaries c1 and c2 for each use/def, add loads before each use and stores after each def 22

enter: c1 := r3 M[c_loc] := c1 a := r1 b := r2 d := 0 e := a loop: d := d + b e := e - 1 if e > 0 goto loop r1 := d c2 := M[c_loc] r3 := c2 return [ r1, r3 live out ] 23

New interference graph: r3 c1 r2 b e c2 r1 a d 24

Coalesce c1 with r3, then c2 with r3: r3c1c2 r2 b e r1 a d 25

As before, coalesce a with e, then b with r2: r3c1c2 r2b r1 ae d 26

As before, coalesce ae with r1 and simplify d: r3c1c2 r2b r1ae 27

Pop d from stack: select r3. All other nodes were coalesced or precolored. So, the coloring is: a r1 b r2 c r3 d r3 e r1 28

Rewrite the program with this assignment: enter: r3 := r3 M[c_loc] := r3 r1 := r1 r2 := r2 r3 := 0 r1 := r1 loop: r3 := r3 + r2 r1 := r1-1 if r1 > 0 goto loop r1 := r3 r3 := M[c_loc] r3 := r3 return [ r1, r3 live out ] 29

Delete moves with source and destination the same (coalesced): enter: M[c_loc] := r3 r3 := 0 loop: r2 := r3 + r2 r1 := r1-1 if r1 > 0 goto loop r1 := r3 r3 := M[c_loc] return [ r1, r3 live out ] One uncoalesced move remains 30