CMSC 430 Introduction to Compilers. Spring Register Allocation

CMSC 430 Introduction to Compilers, Spring 2016: Register Allocation

Introduction
Change code that uses an unbounded set of virtual registers to code that uses a finite set of actual regs
For bytecode targets, can let the JIT handle this
  - Even with finite set of bytecode regs, that finite set is probably large
But critical for compiling to real hardware
Critical properties
Produce correct code
Minimize added spill code
  - The code needed to move values between registers and memory, that wasn't needed when assuming unbounded set of registers
  - Memory operations are slow on modern processors
Minimize space for spilled registers
Operate efficiently
  - E.g., not exponential in size of code

Register allocation approaches
Local allocation (within basic blocks)
  - In single forward pass through block, spill and load regs as necessary
  - (Could also try to look at block as a whole to determine some better allocation)
Global allocation (across basic blocks)
  - Use graph coloring
Local allocation is simple to implement
  - But can introduce inefficiencies at block boundaries
Most compilers use graph-coloring based global allocation

Spill code
Where should spilled registers be stored?
  - Each instance of a function needs its own storage, so store on stack
  - Can allocate space for spilled regs in function prolog code
    - Refer to reg storage using frame pointer
  - Need to reserve feasible set of physical regs only for spilling
Inserted spill code
  - Definition of a spilled register rs
    - add rs, r2, r3: insert store n(%ebp), rs afterward
  - Use of spilled register rs
    - add rs, r2, r3: insert load rs, n(%ebp) before
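
The insertion rule above can be sketched in OCaml. This is a minimal illustration under stated assumptions, not the course's code generator: the `instr` type, the `SpillStore`/`SpillLoad` constructors and the function `insert_spills` are hypothetical names, and the spilled register `rs` is assumed to stand for one of the physical registers reserved for spilling.

```ocaml
type reg = int

(* Hypothetical three-address form plus explicit spill instructions. *)
type instr =
  | Add of reg * reg * reg   (* dst, src1, src2 *)
  | SpillStore of int * reg  (* frame offset, src *)
  | SpillLoad of reg * int   (* dst, frame offset *)

(* Rewrite a block so that spilled register rs lives at frame offset n:
   load it before every use, store it after every definition. *)
let insert_spills (rs : reg) (n : int) (code : instr list) : instr list =
  List.concat_map
    (fun i ->
      match i with
      | Add (d, a, b) ->
        let before = if a = rs || b = rs then [SpillLoad (rs, n)] else [] in
        let after = if d = rs then [SpillStore (n, rs)] else [] in
        before @ [i] @ after
      | SpillStore _ | SpillLoad _ -> [i])
    code
```

An instruction that both reads and writes `rs` gets a load before it and a store after it.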

Instruction set
For illustration purposes, we'll use the instruction set from codegen-*.ml
Will write rn for register n

    type instr =
      | ILoad of reg * src         (* dst, src *)
      | IStore of dst * reg        (* dst, src *)
      | IMov of reg * reg          (* dst, src *)
      | IAdd of reg * reg * reg    (* dst, src1, src2 *)
      | IMul of reg * reg * reg    (* dst, src1, src2 *)
      | IJmp of int                (* pc offset *)
      | IIfZero of reg * int       (* src, pc offset *)
      | IReturn

Live ranges
A register is live
  - Starting at its definition (x <- ...), inclusive
  - Ending at the point it becomes dead (y <- ... x ...), inclusive
    - Can represent as an interval [i,j] or live range within a block
    - Also need to know which regs live on exit

Source code          Live regs (end of instr)
ILoad r1, 42         r1
IMov r2, r1          r1 r2
IMul r3, r1, r2      r1 r2 r3
ILoad r4, 5          r1 r2 r3 r4
IAdd r5, r4, r2      r1 r3 r5
ILoad r6, 8          r1 r3 r5 r6
IMul r7, r5, r6      r1 r3 r7
IAdd r8, r7, r3      r1 r8
IAdd r1, r8, r1      r1
IStore &1234, r1     (none)
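
The live sets for a single block can be computed with one backward pass. The sketch below is an illustration under assumptions: it uses a cut-down, hypothetical `instr` type with registers as plain ints (not the types from codegen-*.ml), and the names `defs`, `uses` and `live_out` are made up for this example.

```ocaml
module RS = Set.Make (Int)

(* Hypothetical simplified instruction type: registers are ints. *)
type reg = int

type instr =
  | ILoad of reg * int          (* dst, constant *)
  | IStore of int * reg         (* address, src *)
  | IMov of reg * reg           (* dst, src *)
  | IAdd of reg * reg * reg     (* dst, src1, src2 *)
  | IMul of reg * reg * reg     (* dst, src1, src2 *)

(* Registers written by one instruction. *)
let defs = function
  | ILoad (d, _) | IMov (d, _) | IAdd (d, _, _) | IMul (d, _, _) -> [d]
  | IStore _ -> []

(* Registers read by one instruction. *)
let uses = function
  | ILoad _ -> []
  | IStore (_, s) | IMov (_, s) -> [s]
  | IAdd (_, a, b) | IMul (_, a, b) -> [a; b]

(* For each instruction, the set of registers live just after it.
   exit_live is the set of registers live on exit from the block. *)
let live_out (block : instr list) (exit_live : RS.t) : RS.t list =
  let step i (after, acc) =
    (* live before i = uses(i) + (live after i - defs(i)) *)
    let before =
      RS.union (RS.of_list (uses i)) (RS.diff after (RS.of_list (defs i)))
    in
    (before, after :: acc)
  in
  snd (List.fold_right step block (exit_live, []))
```

Applied to the ten-instruction example with nothing live on exit, this reproduces the live sets listed for that example.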

Local register allocation
Algorithm
  - Start with empty reg set
  - Load from memory into reg on demand
  - When no reg available, spill to free one
    - Need policy on which reg to spill
    - Common approach: one whose next use is farthest in the future
    - Keep values used soon in registers
    - Similar to cache line / page replacement
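
The "farthest next use" policy can be sketched as follows. This is only an illustration: `next_use` and `choose_spill` are hypothetical helper names, registers are ints, and the rest of the block is assumed to be given as the list of registers each remaining instruction reads.

```ocaml
(* Index of the next use of register r in the remaining instructions,
   each given as its list of used registers; None if never used again. *)
let next_use (r : int) (rest : int list list) : int option =
  let rec go i = function
    | [] -> None
    | us :: tl -> if List.mem r us then Some i else go (i + 1) tl
  in
  go 0 rest

(* Pick the victim among the registers currently held: one with no further
   use if there is one, otherwise the one whose next use is farthest away. *)
let choose_spill (held : int list) (rest : int list list) : int =
  let dist r = match next_use r rest with None -> max_int | Some i -> i in
  match held with
  | [] -> invalid_arg "choose_spill: no registers held"
  | r0 :: tl ->
    List.fold_left (fun best r -> if dist r > dist best then r else best) r0 tl
```

For the running example, once r1, r2 and r3 occupy all three registers, the next uses are at distances 5, 1 and 4, so r1 is chosen.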

Example
One possible bottom-up allocation to 3 regs (ra-rc)
Notice r1 spilled to memory after first IMul

                                      Reg alloc (at exit)
Source code          Live regs        ra     rb        rc
ILoad r1, 42         r1               r1
IMov r2, r1          r1 r2            r1     r2
IMul r3, r1, r2      r1 r2 r3         r1     r2        r3
(spill r1 to memory)
ILoad r4, 5          r1 r2 r3 r4      r4     r2        r3
IAdd r5, r4, r2      r1 r3 r5         r4     r2->r5    r3
ILoad r6, 8          r1 r3 r5 r6      r6     r5        r3
IMul r7, r5, r6      r1 r3 r7         r6     r5->r7    r3
IAdd r8, r7, r3      r1 r8            r6     r7->r8    r3
(load r1 from memory)
IAdd r1, r8, r1      r1               r1     r8        r3
IStore &1234, r1     (none)

-> means both needed in this instruction.

Example: generated code
One possible bottom-up allocation to 3 regs (ra-rc)
Notice r1 spilled to memory after first IMul

                                      Reg alloc (at exit)
Source code          Live regs        ra     rb        rc
ILoad ra, 42         r1               r1
IMov rb, ra          r1 r2            r1     r2
IMul rc, ra, rb      r1 r2 r3         r1     r2        r3
(spill ra to memory for r1)
ILoad ra, 5          r1 r2 r3 r4      r4     r2        r3
IAdd rb, ra, rb      r1 r3 r5         r4     r2->r5    r3
ILoad ra, 8          r1 r3 r5 r6      r6     r5        r3
IMul rb, rb, ra      r1 r3 r7         r6     r5->r7    r3
IAdd rb, rb, rc      r1 r8            r6     r8        r3
(load ra from memory for r1)
IAdd ra, rb, ra      r1               r1     r8        r3
IStore &1234, ra     (none)

Register reuse
Note that in some cases, can reuse the same register as source and target in single instruction
  - Namely, when one live range ends and another begins

Source code          Live regs
ILoad r1, 42         r1
ILoad r2, 43         r1 r2
IMul r3, r1, r2      (none)

  - Suppose r1 -> ra and r2 -> rb
  - Then could assign r3 to ra, rb, or some other register
In previous slide, wrote register reuse as r1->r2
  - r1 is assigned at beginning of instruction, r2 at end

Global register allocation [Chaitin et al 1981]
Definition: Graph coloring problem
  - Input: A graph G and an integer k
    - k is the number of colors
  - Output: an assignment of nodes of G to colors such that
    - No nodes that are connected by an edge have the same color
    - The assignment uses at most k colors
  - This problem is NP-hard for k > 2
Reduce register allocation to graph coloring
  - Data flow analysis to find live ranges of virtual registers
  - Build a color interference graph, where
    - Nodes represent live ranges
    - Edge between two nodes indicates both ranges live at some point
  - Find k coloring of graph, for k = # of physical regs
    - If unable to find coloring, spill virtual regs and repeat
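
Finding a k-coloring is hard, but checking a proposed one is easy, which makes the definition concrete. A minimal sketch, assuming nodes and colors are ints and the graph is given as an edge list; `valid_coloring` is a hypothetical name for this illustration.

```ocaml
(* edges:    undirected edges as pairs of nodes.
   color_of: the candidate assignment of nodes to colors 0 .. k-1.
   Checks both conditions of the definition on every edge; nodes that
   appear in no edge are not inspected. *)
let valid_coloring (k : int) (edges : (int * int) list)
    (color_of : int -> int) : bool =
  List.for_all
    (fun (x, y) ->
      let cx = color_of x and cy = color_of y in
      cx <> cy && 0 <= cx && cx < k && 0 <= cy && cy < k)
    edges
```

A triangle, for instance, passes with three distinct colors and fails for any assignment that uses only two.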

Live ranges
All nodes in CFG from definition to use, inclusive
Live ranges indicate when virtual registers should be in some physical reg to avoid spill code
  - (A single virtual register may comprise several live ranges)
[Figure: control-flow graph with definitions and uses of a and b, annotated with the regions where each is live]

Building the interference graph
At each point p in the program
  - Add edge (x,y) for all pairs of live ranges x, y live at p
[Figure: the same control-flow graph, annotated with the live regions of a and b used to add interference edges]
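
The rule "add an edge for every pair live at the same point" can be sketched directly from a list of live sets, one per program point. This is an illustration under assumptions: live ranges are identified by ints, edges are stored as ordered pairs (x, y) with x < y, and `pairs` and `interference` are hypothetical names.

```ocaml
module RS = Set.Make (Int)

module Edge = Set.Make (struct
  type t = int * int
  let compare = compare
end)

(* All unordered pairs (x, y) with x < y drawn from one live set. *)
let pairs (live : RS.t) : Edge.t =
  RS.fold
    (fun x acc ->
      RS.fold (fun y acc -> if x < y then Edge.add (x, y) acc else acc) live acc)
    live Edge.empty

(* Interference edges: union of the pairs over every program point. *)
let interference (live_sets : RS.t list) : Edge.t =
  List.fold_left (fun g live -> Edge.union g (pairs live)) Edge.empty live_sets
```

Storing each edge once with the smaller node first keeps the set free of duplicates such as (2,1) alongside (1,2).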

Graph coloring via simplification
Algorithm
  - Repeatedly remove nodes with degree < k from graph
    - Push nodes onto stack, removing from graph
  - If every remaining node is degree >= k
    - Spill node with lowest spill cost
    - Use some heuristic to guess which virtual reg best to spill
    - Remove node from graph
    - (Once spilled, no longer causes interference)
  - Reassemble graph with nodes popped from stack
    - Choose color differing from neighbors when added to graph
    - Always possible since node had degree < k
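
The simplify-then-reassemble procedure can be sketched as below. This is a sketch under stated assumptions rather than a production allocator: the graph is a map from each node to its neighbour list (assumed symmetric, with every neighbour also present as a key), `cost` is a caller-supplied spill-cost heuristic, and the name `color` is hypothetical. As in the algorithm above, a spilled node is simply dropped from the graph.

```ocaml
module IM = Map.Make (Int)

(* graph: node -> list of neighbours (symmetric adjacency).
   cost:  node -> spill cost, a heuristic supplied by the caller.
   Returns (coloring, spilled nodes); colors are 0 .. k-1. *)
let color (k : int) (cost : int -> int) (graph : int list IM.t)
  : int IM.t * int list =
  (* Remove node n and all edges touching it. *)
  let remove n g = IM.map (List.filter (fun m -> m <> n)) (IM.remove n g) in
  (* Simplify: push nodes of degree < k; spill when none is left. *)
  let rec simplify g stack spilled =
    if IM.is_empty g then (stack, spilled)
    else
      let low (_, ns) = List.length ns < k in
      match List.find_opt low (IM.bindings g) with
      | Some (n, _) -> simplify (remove n g) (n :: stack) spilled
      | None ->
        let cheapest (a, _) (b, _) = compare (cost a) (cost b) in
        let n, _ = List.hd (List.sort cheapest (IM.bindings g)) in
        simplify (remove n g) stack (n :: spilled)
  in
  let stack, spilled = simplify graph [] [] in
  (* Select: pop nodes, giving each the lowest color unused by its
     already-colored neighbours in the original graph. *)
  let assign coloring n =
    let taken =
      List.filter_map (fun m -> IM.find_opt m coloring) (IM.find n graph)
    in
    let rec first c = if List.mem c taken then first (c + 1) else c in
    IM.add n (first 0) coloring
  in
  (List.fold_left assign IM.empty stack, spilled)
```

Each popped node had fewer than k neighbours left when it was pushed, and only those neighbours can already be colored when it is popped, so the search in `first` stops below k.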

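Graph coloring example
Assume 3 physical registers
Simplify graph by removing nodes with < 3 neighbors
[Figure: interference graph shrinking as nodes are removed one at a time]
Reassemble by popping nodes from stack
  - Assigning colors not used by neighbors
[Figure: the same graph rebuilt node by node, each node colored as it is added back]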

Graph coloring w/spill
Assuming 2 physical registers
No node with < 2 neighbors
  - Must spill node with lowest spill cost
Remaining nodes can then be simplified and colored
[Figure: node a is spilled, then the remaining graph is simplified and colored]

Spill code
Here, we've assumed that spilling a removes it completely from live range
But of course, the spill code will need to load and store to register
Thus, we either need to
  - Recompute live ranges after we insert spill code
  - Reserve a set of registers that cannot be allocated to, but that we will use to load and store for spills

Discussion
Global register allocation is an old idea
  - Material presented in these slides is just the beginning; there's been lots of work coming up with better variants
Register pressure occurs when not enough physical registers available, requiring spills
Register allocation and optimization interact
  - If we optimize before register alloc, might increase register pressure
    - E.g., by moving a computation earlier than it was before, thereby increasing live ranges
  - If we register alloc before optimizing, might create false dependencies
    - E.g., reg alloc maps what are conceptually separate variables to the same physical register; could confuse optimizer