Register allocation. CS Compiler Design. Liveness analysis. Register allocation. Liveness analysis and Register allocation. V.

Similar documents
Register allocation. Register allocation: ffl have value in a register when used. ffl limited resources. ffl changes instruction choices

Compilation 2013 Liveness Analysis

Register allocation. instruction selection. machine code. register allocation. errors

We can express this in dataflow equations using gen and kill sets, where the sets are now sets of expressions.

Introduction to optimizations. CS Compiler Design. Phases inside the compiler. Optimization. Introduction to Optimizations. V.

Lecture 21 CIS 341: COMPILERS

We can express this in dataflow equations using gen and kill sets, where the sets are now sets of expressions.

Liveness Analysis and Register Allocation. Xiao Jia May 3 rd, 2013

Runtime management. CS Compiler Design. The procedure abstraction. The procedure abstraction. Runtime management. V.

Acknowledgement. CS Compiler Design. Semantic Processing. Alternatives for semantic processing. Intro to Semantic Analysis. V.

Fall Compiler Principles Lecture 12: Register Allocation. Roman Manevich Ben-Gurion University

Acknowledgement. CS Compiler Design. Intermediate representations. Intermediate representations. Semantic Analysis - IR Generation

Challenges in the back end. CS Compiler Design. Basic blocks. Compiler analysis

Liveness Analysis and Register Allocation

Compiler Design. Register Allocation. Hwansoo Han

Lecture 6. Register Allocation. I. Introduction. II. Abstraction and the Problem III. Algorithm

Register Allocation. CS 502 Lecture 14 11/25/08

Linear-Scan Register Allocation. CS 352 Lecture 12 11/28/07

Data Flow Analysis. Suman Jana. Adopted From U Penn CIS 570: Modern Programming Language Implementa=on (Autumn 2006)

Register Allocation. Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP 412 at Rice.

Outline. Register Allocation. Issues. Storing values between defs and uses. Issues. Issues P3 / 2006

Global Register Allocation

Register allocation. Register allocation: ffl have value in a register when used. ffl limited resources. ffl changes instruction choices

Variables vs. Registers/Memory. Simple Approach. Register Allocation. Interference Graph. Register Allocation Algorithm CS412/CS413

Register Allocation. Register Allocation. Local Register Allocation. Live range. Register Allocation for Loops

Plan for Today. Concepts. Next Time. Some slides are from Calvin Lin s grad compiler slides. CS553 Lecture 2 Optimizations and LLVM 1

Academic Formalities. CS Modern Compilers: Theory and Practise. Images of the day. What, When and Why of Compilers

CSE P 501 Compilers. Register Allocation Hal Perkins Autumn /22/ Hal Perkins & UW CSE P-1

Register allocation. TDT4205 Lecture 31

Register Allocation & Liveness Analysis

register allocation saves energy register allocation reduces memory accesses.

Academic Formalities. CS Language Translators. Compilers A Sangam. What, When and Why of Compilers

Linear Scan Register Allocation. Kevin Millikin

Code generation for modern processors

Code generation for modern processors

CS2210: Compiler Construction. Compiler Optimization

Global Register Allocation via Graph Coloring

Compiler Architecture

Global Register Allocation - 2

Topic 12: Register Allocation

Improvements to Linear Scan register allocation

Global Register Allocation - Part 2

CS577 Modern Language Processors. Spring 2018 Lecture Optimization

A Bad Name. CS 2210: Optimization. Register Allocation. Optimization. Reaching Definitions. Dataflow Analyses 4/10/2013

Register Allocation. Global Register Allocation Webs and Graph Coloring Node Splitting and Other Transformations

Compiler Design. Fall Data-Flow Analysis. Sample Exercises and Solutions. Prof. Pedro C. Diniz

Lecture Compiler Backend

Syntax-Directed Translation. CS Compiler Design. SDD and SDT scheme. Example: SDD vs SDT scheme infix to postfix trans

Code Generation. CS 540 George Mason University

Low-Level Issues. Register Allocation. Last lecture! Liveness analysis! Register allocation. ! More register allocation. ! Instruction scheduling

CS 132 Compiler Construction

CSC D70: Compiler Optimization Register Allocation

Alternatives for semantic processing

CS202 Compiler Construction

Register allocation. Overview

Lecture 20 CIS 341: COMPILERS

Register Allocation. Lecture 16

Programming Language Processor Theory

Chapter 4: LR Parsing

Register Allocation. Stanford University CS243 Winter 2006 Wei Li 1

MIT Introduction to Dataflow Analysis. Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology

Register Allocation in Just-in-Time Compilers: 15 Years of Linear Scan

Register Allocation. Lecture 38

Calvin Lin The University of Texas at Austin

4.1 Interval Scheduling

Compiler Construction 2010/2011 Loop Optimizations

CS 406/534 Compiler Construction Putting It All Together

The compilation process is driven by the syntactic structure of the program as discovered by the parser

Middle End. Code Improvement (or Optimization) Analyzes IR and rewrites (or transforms) IR Primary goal is to reduce running time of the compiled code

Register Allocation (via graph coloring) Lecture 25. CS 536 Spring

Lecture 25: Register Allocation

Compiler Construction 2009/2010 SSA Static Single Assignment Form

Global Register Allocation - Part 3

Lecture Overview Register Allocation

Outline: Finish uncapacitated simplex method Negative cost cycle algorithm The max-flow problem Max-flow min-cut theorem

EECS 583 Class 15 Register Allocation

Lecture 15 Register Allocation & Spilling

An Overview of GCC Architecture (source: wikipedia) Control-Flow Analysis and Loop Detection

Compiler Optimisation

The C2 Register Allocator. Niclas Adlertz

Compiler Construction 2016/2017 Loop Optimizations

Register Allocation 1

Abstract Interpretation Continued

Instruction Scheduling

Compilers and Code Optimization EDOARDO FUSELLA

IR Optimization. May 15th, Tuesday, May 14, 13

Thursday, December 23, The attack model: Static Program Analysis

Compilation Lecture 11a. Register Allocation Noam Rinetzky. Text book: Modern compiler implementation in C Andrew A.

Global Register Allocation

CHAPTER 3. Register allocation

Compiler Construction

Data-flow Analysis. Y.N. Srikant. Department of Computer Science and Automation Indian Institute of Science Bangalore

Compilation /17a Lecture 10. Register Allocation Noam Rinetzky

Introduction to Optimization, Instruction Selection and Scheduling, and Register Allocation

April 15, 2015 More Register Allocation 1. Problem Register values may change across procedure calls The allocator must be sensitive to this

Register Allocation. Introduction Local Register Allocators

CIS 341 Final Examination 30 April 2013

Last class. CS Principles of Programming Languages. Introduction. Outline

Live Variable Analysis. Work List Iterative Algorithm Rehashed

Leveraging Transitive Relations for Crowdsourced Joins*

Transcription:

Register allocation CS3300 - Compiler Design Liveness analysis and Register allocation V. Krishna Nandivada IIT Madras Copyright c 2014 by Antony L. Hosking. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and full citation on the first page. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or fee. Request permission to publish from hosking@cs.purdue.edu. V.Krishna Nandivada (IIT Madras) CS3300 - Aug 2014 2 / 26 Register allocation Liveness analysis IR Register allocation: instruction selection errors have value in a register when used limited resources can effect the instruction choices can move loads and stores optimal allocation is difficult NP-complete for k 1 registers register allocation machine code Problem: IR contains an unbounded number of temporaries machine has bounded number of registers Approach: temporaries with disjoint live ranges can map to same register if not enough registers then spill some temporaries (i.e., keep them in memory) The compiler must perform liveness analysis for each temporary: It is live if it holds a value that may be needed in future V.Krishna Nandivada (IIT Madras) CS3300 - Aug 2014 3 / 26 V.Krishna Nandivada (IIT Madras) CS3300 - Aug 2014 4 / 26

Example Liveness analysis a 0 L 1 : b a + 1 c c + b a b 2 if a < N goto L 1 return c Gathering liveness information is a form of data flow analysis operating over the CFG: We will treat each statement as a different basic block. liveness of variables flows around the edges of the graph assignments define a variable, v: def(v) = set of graph nodes that define v def[n] = set of variables defined by n occurrences of v in expressions use it: use(v) = set of nodes that use v use[n] = set of variables used in n V.Krishna Nandivada (IIT Madras) CS3300 - Aug 2014 5 / 26 V.Krishna Nandivada (IIT Madras) CS3300 - Aug 2014 6 / 26 Definitions Liveness analysis v is live on edge e if there is a directed path from e to a use of v that does not pass through any def(v) v is live-in at node n if live on any of n s in-edges v is live-out at n if live on any of n s out-edges v use[n] v live-in at n (For programs with statically established no uninitialized variables) v live-in at n v live-out at all m pred[n] v live-out at n,v def[n] v live-in at n Define: Then: Note: in[n] = variables live-in at n out[n] = variables live-out at n out[n] = in[s] s succ(n) succ[n] = φ out[n] = φ in[n] use[n] in[n] out[n] def[n] use[n] and def[n] are constant (indepent of control flow) Now, v in[n] iff. v use[n] or v out[n] def[n] Thus, in[n] = use[n] (out[n] def[n]) V.Krishna Nandivada (IIT Madras) CS3300 - Aug 2014 7 / 26 V.Krishna Nandivada (IIT Madras) CS3300 - Aug 2014 8 / 26

Iterative solution for liveness Notes N : Set of nodes of CFG; foreach n N do in[n] φ; out[n] φ; foreach n Nodes do in [n] in[n]; out [n] out[n]; in[n] use[n] (out[n] def [n]); out[n] s succ[n] in[s] ; until n,in [n] = in[n] out [n] = out[n] ; should order computation of inner loop to follow the flow liveness flows backward along control-flow arcs, from out to in nodes can just as easily be basic blocks to reduce CFG size could do one variable at a time, from uses back to defs, noting liveness along the way V.Krishna Nandivada (IIT Madras) CS3300 - Aug 2014 9 / 26 V.Krishna Nandivada (IIT Madras) CS3300 - Aug 2014 10 / 26 Iterative solution for liveness Least fixed points Complexity: for input program of size N N nodes in CFG N variables N elements per in/out O(N) time per set-union for loop performs constant number of set operations per node O(N 2 ) time for for loop each iteration of loop can only add to each set sets can contain at most every variable sizes of all in and out sets sum to 2N 2, bounding the number of iterations of the loop worst-case complexity of O(N 4 ) ordering can cut loop down to 2-3 iterations O(N) or O(N 2 ) in practice There is often more than one solution for a given dataflow problem (see example). Any solution to dataflow equations is a conservative approximation: v has some later use downstream from n v out(n) but not the converse Conservatively assuming a variable is live does not break the program; just means more registers may be needed. Assuming a variable is dead when really live will break things. Many possible solutions but we want the smallest : the least fixpoint. The iterative algorithm computes this least fixpoint. V.Krishna Nandivada (IIT Madras) CS3300 - Aug 2014 11 / 26 V.Krishna Nandivada (IIT Madras) CS3300 - Aug 2014 12 / 26

Register allocation - by Graph coloring Graph coloring - a simplistic approach Step 1: Select target machine instructions assuming infinite registers (temps). If a instruction requires a special register replace that temp with that register. Step 2: Construct an interference graph. Solve the register allocation problem by coloring the graph. A graph is said to be colored if each each pair of neighboring nodes have different colors. Input: G - the interference graph, K - number of colors Remove a node n and all its edges from G, such that degree of n is less than K; Push n onto a stack; until G has no node with degree less than K; // G is either empty or all of its nodes have degree K if G is not empty then Take one node m out of G, and mark it for spilling; Remove all the edges of m from G; until G is empty; Take one node at a time from the stack and assign a non conflicting color. V.Krishna Nandivada (IIT Madras) CS3300 - Aug 2014 13 / 26 V.Krishna Nandivada (IIT Madras) CS3300 - Aug 2014 14 / 26 Example 1, available colors = 2 Example 2 We have to spill. V.Krishna Nandivada (IIT Madras) CS3300 - Aug 2014 15 / 26 V.Krishna Nandivada (IIT Madras) CS3300 - Aug 2014 16 / 26

Graph coloring - Kempe s heuristic Example 2 (revisited) Algorithm dating back to 1879. Input: G - the interference graph, K - number of colors Remove a node n and all its edges from G, such that degree of n is less than K; Push n onto a stack; until G has no node with degree less than K; // G is either empty or all of its nodes have degree K if G is not empty then Take one node m out of G.; push m onto the stack; until G is empty; Take one node at a time from the stack and assign a non conflicting color (if possible, else spill). We don t have to spill. V.Krishna Nandivada (IIT Madras) CS3300 - Aug 2014 17 / 26 V.Krishna Nandivada (IIT Madras) CS3300 - Aug 2014 18 / 26 Example 3 Register allocation - Linear scan Register allocation is expensive. Many algorithms use heuristics for graph coloring. Allocation may take time quadratic in the number of live intervals. Not suitable Online compilers need to generate code quickly. e.g. JIT compilers. Sacrifice efficient register allocation for compilation speed. Linear scan register allocation - Massimiliano Poletto and Vivek Sarkar, ACM TOPLAS 1999 Don t have a choice. Have to spill. V.Krishna Nandivada (IIT Madras) CS3300 - Aug 2014 19 / 26 V.Krishna Nandivada (IIT Madras) CS3300 - Aug 2014 20 / 26

Linear Scan algorithm Example Say, available registers = 2 V.Krishna Nandivada (IIT Madras) CS3300 - Aug 2014 21 / 26 V.Krishna Nandivada (IIT Madras) CS3300 - Aug 2014 22 / 26 Linear Scan algorithm - analysis Spilling Each live range gets either a register or a spill location. Note: The number of overlapping intervals changes only at the start and points of an interval. Live intervals are stored in a list that is sorted in order of increasing start point. The active list is kept sorted in order of increasing point. Adv: need to scan only those intervals (+1 at most) that have to be removed. Complexity: O(V) if number of registers is assumed ot be a constant. Else? O(V logr) We need to generate extra instructions to load variables from the stack and store them back. The load and store may require registers again: Naive approach: Keep a separate register (wasteful). Rewrite the code - by introducing a temporary; rerun the liveness + ra. (Note: the new temp has much smaller live range). V.Krishna Nandivada (IIT Madras) CS3300 - Aug 2014 23 / 26 V.Krishna Nandivada (IIT Madras) CS3300 - Aug 2014 24 / 26

Example: rewrite code Criteria for spilling Consider: add t1 t2 Suppose t2 has to be spilled, say to [sp-4]. Invent a new temp t35, and rewrite: mov t35 [sp-4] add t1 t35 t35 has a very short live range and less likely to interfere. Now rerun the algo. During register allocation, we identify that one of the live ranges from a given set, has to be spilled. Criteria? Random! Adv? Disadv? One with maximum degree One that has the longest life One with the shortest life (take advantage of the cache). One with least cost. Cost = Dynamic (load cost + store cost) How to handle loops, conditionals? Cost of load, store V.Krishna Nandivada (IIT Madras) CS3300 - Aug 2014 25 / 26 V.Krishna Nandivada (IIT Madras) CS3300 - Aug 2014 26 / 26