Register Allocation, i. Overview & spilling

Similar documents
6.035 Project 3: Unoptimized Code Generation. Jason Ansel MIT - CSAIL

Machine Program: Procedure. Zhaoguo Wang

Assembly Language: Function Calls

CSE P 501 Exam 12/1/11

Functions. Ray Seyfarth. August 4, Bit Intel Assembly Language c 2011 Ray Seyfarth

Stack Frame Components. Using the Stack (4) Stack Structure. Updated Stack Structure. Caller Frame Arguments 7+ Return Addr Old %rbp

CS429: Computer Organization and Architecture

Princeton University Computer Science 217: Introduction to Programming Systems. Assembly Language: Function Calls

Register Allocation, iii. Bringing in functions & using spilling & coalescing

Computer Systems C S Cynthia Lee

18-600: Recitation #3

The Stack & Procedures

6.1. CS356 Unit 6. x86 Procedures Basic Stack Frames

Midterm 1 topics (in one slide) Bits and bitwise operations. Outline. Unsigned and signed integers. Floating point numbers. Number representation

Assembly I: Basic Operations. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

CS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2017 Lecture 12

The Stack & Procedures

Procedures and the Call Stack

Mechanisms in Procedures. CS429: Computer Organization and Architecture. x86-64 Stack. x86-64 Stack Pushing

CSE 401 Final Exam. March 14, 2017 Happy π Day! (3/14) This exam is closed book, closed notes, closed electronics, closed neighbors, open mind,...

Machine-level Programs Procedure

Assembly III: Procedures. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Machine-Level Programming III: Procedures

University of Washington

Machine-Level Programming V: Unions and Memory layout

RISC I from Berkeley. 44k Transistors 1Mhz 77mm^2

CS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2018 Lecture 4

Computer Architecture and System Programming Laboratory. TA Session 3

Unoptimized Code Generation

Machine/Assembler Language Putting It All Together

Lecture 21 CIS 341: COMPILERS

Compiling C Programs Into x86-64 Assembly Programs

CS356 Unit 12a. Logic Circuits. Combinational Logic Gates BASIC HW. Processor Hardware Organization Pipelining

Assembly Programming IV

Function Calls and Stack

Bryant and O Hallaron, Computer Systems: A Programmer s Perspective, Third Edition. Carnegie Mellon

18-600: Recitation #4 Exploits

CSE351 Spring 2018, Midterm Exam April 27, 2018

CSE P 501 Exam Sample Solution 12/1/11

P423/P523 Compilers Register Allocation

Data Representa/ons: IA32 + x86-64

Buffer Overflow Attack (AskCypert CLaaS)

_vectorcall and regcall Demystified

Register allocation. TDT4205 Lecture 31

Machine-Level Programming (2)

CS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2017 Lecture 7

Credits and Disclaimers

What Compilers Can and Cannot Do. Saman Amarasinghe Fall 2009

Binghamton University. CS-220 Spring x86 Assembler. Computer Systems: Sections

Recitation: Bomb Lab. September 17 th 2018

Lecture 4 CIS 341: COMPILERS

18-600: Recitation #4 Exploits (Attack Lab)

CSCI 2021: x86-64 Control Flow

x86 Programming I CSE 351 Winter

Instructions and Instruction Set Architecture

Where We Are. Optimizations. Assembly code. generation. Lexical, Syntax, and Semantic Analysis IR Generation. Low-level IR code.

x86-64 Programming III & The Stack

Introduction to Intel x86-64 Assembly, Architecture, Applications, & Alliteration. Xeno Kovah

Intel x86-64 and Y86-64 Instruction Set Architecture

Code Generation: An Example

Test I Solutions MASSACHUSETTS INSTITUTE OF TECHNOLOGY Fall Department of Electrical Engineering and Computer Science

Changelog. Assembly (part 1) logistics note: lab due times. last time: C hodgepodge

CS 107. Lecture 13: Assembly Part III. Friday, November 10, Stack "bottom".. Earlier Frames. Frame for calling function P. Increasing address

Instruction Set Architecture (ISA) Data Types

Corrections made in this version not seen in first lecture:

Computer Processors. Part 2. Components of a Processor. Execution Unit The ALU. Execution Unit. The Brains of the Box. Processors. Execution Unit (EU)

Spring 2 Spring Loop Optimizations

CSE 351 Midterm - Winter 2015

Assembly Language Programming 64-bit environments

CS429: Computer Organization and Architecture

CS 261 Fall Machine and Assembly Code. Data Movement and Arithmetic. Mike Lam, Professor

Registers. Ray Seyfarth. September 8, Bit Intel Assembly Language c 2011 Ray Seyfarth

CS356: Discussion #15 Review for Final Exam. Marco Paolieri Illustrations from CS:APP3e textbook

Computer Organization: A Programmer's Perspective

Machine- Level Representa2on: Procedure

x86-64 Programming III

CSE 351 Midterm - Winter 2017

12.1. CS356 Unit 12. Processor Hardware Organization Pipelining

C to Assembly SPEED LIMIT LECTURE Performance Engineering of Software Systems. I-Ting Angelina Lee. September 13, 2012

L09: Assembly Programming III. Assembly Programming III. CSE 351 Spring Guest Lecturer: Justin Hsia. Instructor: Ruth Anderson

CS Bootcamp x86-64 Autumn 2015

Areas for growth: I love feedback

x86-64 Programming II

Rev101. spritzers - CTF team. spritz.math.unipd.it/spritzers.html

CSC 252: Computer Organization Spring 2018: Lecture 5

Simple Machine Model. Lectures 14 & 15: Instruction Scheduling. Simple Execution Model. Simple Execution Model

Computer Architecture I: Outline and Instruction Set Architecture. CENG331 - Computer Organization. Murat Manguoglu

MACHINE LEVEL PROGRAMMING 1: BASICS

Machine-Level Programming II: Control

THE UNIVERSITY OF BRITISH COLUMBIA CPSC 261: MIDTERM 1 February 14, 2017

Assembly Programming III

x86 Programming II CSE 351 Winter

Machine-Level Programming II: Control

Topics of this Slideset. CS429: Computer Organization and Architecture. Instruction Set Architecture. Why Y86? Instruction Set Architecture

Hands-on Ethical Hacking: Preventing & Writing Buffer Overflow Exploits

Liveness Analysis and Register Allocation. Xiao Jia May 3 rd, 2013

CS-220 Spring 2018 Final Exam Version Practice May 10, Name:

Assembly Programming IV

CSE 351 Midterm Exam

Review addressing modes

Transcription:

Register Allocation, i Overview & spilling 1

L1 p ::=(label f...) f ::=(label nat nat i...) i ::=(w <- s) (w <- (mem x n8)) ((mem x n8) <- s) (w aop= t) (w sop= sx) (w sop= num) (w <- t cmp t) label (goto label) (cjump t cmp t label label) (call u nat) (call print 1) (call allocate 2) (call array-error 2) (tail-call u nat0-6) (return) aop= ::=+= -= *= &= sop= ::=<<= >>= cmp ::=< <= = u ::=x label t ::=x num s ::=x num label x ::=w rsp w ::=a rax rbx rbp r10 r11 r12 r13 r14 r15 a ::=rdi rsi rdx sx r8 r9 sx ::=rcx label ::=sequence of chars matching #rx"^:[a-za-z_][a-za-z_0-9]*$" 2

L2 p ::=(label f...) f ::=(label nat nat i...) i ::=(w <- s) (w <- (mem x n8)) ((mem x n8) <- s) (w aop= t) (w sop= sx) (w sop= num) (w <- t cmp t) label (goto label) (cjump t cmp t label label) (call u nat) (call print 1) (call allocate 2) (call array-error 2) (tail-call u nat0-6) (return) (w <- (stack-arg n8)) aop= ::=+= -= *= &= sop= ::=<<= >>= cmp ::=< <= = u ::=x label t ::=x num s ::=x num label x ::=w rsp w ::=a rax rbx rbp r10 r11 r12 r13 r14 r15 a ::=rdi rsi rdx sx r8 r9 sx ::=rcx var label ::=sequence of chars matching #rx"^:[a-za-z_][a-za-z_0-9]*$" var ::=sequence of chars matching #rx"^[a-za-z_][a-za-z_0-9-]*$" 3

L2 semantics: variables L2 behaves just like L1, except that non-reg variables are function local, e.g., (define (f x) (+ (g x) 1)) (define (g x) (+ x 2)) (f 10) (:main (:main (rdi <- 10) (call :f)) (:f (temp <- 1) (call :g) (rax += temp) (return)) (:g (temp <- 2) (rax += temp) (return))) The assignment to temp in g does not break f, but if temp were a register, it would. 4

L2 semantics: stack-arg L2 has a convenience for accessing stack-based arguments: (stack-arg n8). It is equivalent to (mem rsp?) where the? is the n8, plus enough space for the spills. That is, (stack-arg 0) is always the last stack argument, (stack-arg 8) is always the second to last argument, etc. This means that if the second natural number in the function header changes, then the stack-arg references don t have to change they will still be referring to the same arguments 5

From L2 to L1 Register allocation, in three parts; for each function body we do: Liveness analysis interference graph (nodes are variables; edges indicate cannot be in the same register ) Graph coloring register assignments Spilling: coping with too few registers Bonus part, coalescing eliminating redundant (x <- y) instructions 6

Example Function int f(int x) = 2x 2 + 3x + 4 (:f 1 0 (x2 <- rdi) (x2 *= x2) (dx2 <- x2) (dx2 *= 2) (tx <- rdi) (tx *= 3) (rax <- dx2) (rax += tx) (rax += 4) (return)) 7

Example Function: live ranges int f(int x) = 2x 2 + 3x + 4 (:f 1 0 (x2 <- rdi) (x2 *= x2) (dx2 <- x2) (dx2 *= 2) (tx <- rdi) (tx *= 3) (rax <- dx2) (rax += tx) (rax += 4) (return)) dx2 tx x2 8

Example Function: live ranges int f(int x) = 2x 2 + 3x + 4 (:f 1 0 (x2 <- rdi) (x2 *= x2) (dx2 <- x2) (dx2 *= 2) (tx <- rdi) (tx *= 3) (rax <- dx2) (rax += tx) (rax += 4) (return)) dx2 tx x2 r10 r11 r12 r13 r14 r15 r8 r9 rax rbp rbx rcx rdi rdx rsi 9

Example Function 2 int f(int x) = x+x+x+x+x+x+x+x (in a stupid compiler) (:f 1 0 (a <- rdi) (b <- rdi) (c <- rdi) (d <- rdi) (e <- rdi) (f <- rdi) (g <- rdi) (h <- rdi) (rax <- a) (rax += b) (rax += c) (rax += d) (rax += e) (rax += f) (rax += g) (rax += h) (return)) a b c d e f g h r10 r11 r12 r13 r14 r15 r8 r9 rax rbp rbx rcx rdi rdx rsi 10

No way to get all of a, b, c, d, e, f, g, and h into their own registers; so we need to spill one of them. 11

Spilling Spilling is a program rewrite to make it easier to allocate registers Pick a variable Make a new location on the stack (increment the second nat in the function definition) Replace all writes to the variable with writes to the new stack location Replace all reads from the variable with reads from the new stack location Sometimes that means introducing new temporaries 12

Spilling Example Say we want to spill a to first location on the stack, (mem rsp 0); two easy cases: (a <- 1) ((mem rsp 0) <- 1) (x <- a) (x <- (mem rsp 0)) 13

Example Function 2, need to spill int f(int x) = x+x+x+x+x+x+x+x (in a stupid compiler) (:f 1 0 (a <- rdi) (b <- rdi) (c <- rdi) (d <- rdi) (e <- rdi) (f <- rdi) (g <- rdi) (h <- rdi) (rax <- a) (rax += b) (rax += c) (rax += d) (rax += e) (rax += f) (rax += g) (rax += h) (return)) a b c d e f g h r10 r11 r12 r13 r14 r15 r8 r9 rax rbp rbx rcx rdi rdx rsi 14

Example Function 2, spilling a int f(int x) = x+x+x+x+x+x+x+x (in a stupid compiler) (:f 1 1 ((mem rsp 0) <- rdi) (b <- rdi) (c <- rdi) (d <- rdi) (e <- rdi) (f <- rdi) (g <- rdi) (h <- rdi) (rax <- (mem rsp 0)) (rax += b) (rax += c) (rax += d) (rax += e) (rax += f) (rax += g) (rax += h) (return)) b c d e f g h r10 r11 r12 r13 r14 r15 r8 r9 rax rbp rbx rcx rdi rdx rsi 15

Spilling Example A trickier case: (a *= a) (a_new <- (mem rsp 0)) (a_new *= a_new) ((mem rsp 0) <- a_new) In general, make up a new temporary for each instruction that uses the variable to be spilled This makes for very short live ranges. 16

Example Function 2, spilling b int f(int x) = x+x+x+x+x+x+x+x (in a stupid compiler) (:f 1 1 (a <- rdi) ((mem rsp 0) <- rdi) (c <- rdi) (d <- rdi) (e <- rdi) (f <- rdi) (g <- rdi) (h <- rdi) (rax <- a) (s0 <- (mem rsp 0)) (rax += s0) (rax += c) (rax += d) (rax += e) (rax += f) (rax += g) (rax += h) (return)) a c d e f g h s0 r10 r11 r12 r13 r14 r15 r8 r9 rax rbp rbx rcx rdi rdx rsi 17

Example Function 2, spilling b Even though we still have eight temporaries, we can still allocate them to our seven unused registers because the live ranges of s0 and a don t overlap and so they can go into the same register. 18

Your job Implement: spill : (label nat nat i...) ;; original func var ;; to spill var ;; prefix for temporaries -> (label nat nat i...) ;; spilled func 19

Here s how to two example spilled functions from the earlier slides would look like as calls to spill: (spill «the original program» 'a 's) (spill «the original program» 'b 's) See the assignment handout for more details on the precise spec for test cases and your spill function s interface 20