CS4410 : Spring 2013

Similar documents
CS153: Compilers Lecture 7: Simple Code Generation

CS153: Compilers Lecture 8: Compiling Calls

Chapter 2. Computer Abstractions and Technology. Lesson 4: MIPS (cont )

MIPS ISA and MIPS Assembly. CS301 Prof. Szajda

Control Instructions. Computer Organization Architectures for Embedded Computing. Thursday, 26 September Summary

Control Instructions

1/26/2014. Previously. CSE 2021: Computer Organization. The Load/Store Family (1) Memory Organization. The Load/Store Family (2)

Previously. CSE 2021: Computer Organization. Memory Organization. The Load/Store Family (1) 5/26/2011. Used to transfer data to/from DRAM.

Branch Addressing. Jump Addressing. Target Addressing Example. The University of Adelaide, School of Computer Science 28 September 2015

Chapter 2: Instructions:

COMP 303 Computer Architecture Lecture 3. Comp 303 Computer Architecture

Chapter 2. Instructions:

Assembler. Lecture 8 CS301

Reduced Instruction Set Computer (RISC)

CS 61c: Great Ideas in Computer Architecture

CS 61C: Great Ideas in Computer Architecture. MIPS Instruction Formats

Overview. Introduction to the MIPS ISA. MIPS ISA Overview. Overview (2)

comp 180 Lecture 10 Outline of Lecture Procedure calls Saving and restoring registers Summary of MIPS instructions

MIPS Assembly (Functions)

Chapter 2A Instructions: Language of the Computer

CS/COE1541: Introduction to Computer Architecture

ECE369. Chapter 2 ECE369

Reduced Instruction Set Computer (RISC)

Lectures 5. Announcements: Today: Oops in Strings/pointers (example from last time) Functions in MIPS

Computer Architecture

Lecture 5: Procedure Calls

EE 361 University of Hawaii Fall

Stored Program Concept. Instructions: Characteristics of Instruction Set. Architecture Specification. Example of multiple operands

Chapter 2. Instructions: Language of the Computer. Adapted by Paulo Lopes

ECE468 Computer Organization & Architecture. MIPS Instruction Set Architecture

Instruction Set Architecture

MIPS R-format Instructions. Representing Instructions. Hexadecimal. R-format Example. MIPS I-format Example. MIPS I-format Instructions

CENG3420 Lecture 03 Review

Instructions: MIPS ISA. Chapter 2 Instructions: Language of the Computer 1

Chapter 2. Instruction Set Architecture (ISA)

Lecture 4: Instruction Set Architecture

Machine Language Instructions Introduction. Instructions Words of a language understood by machine. Instruction set Vocabulary of the machine

Prof. Kavita Bala and Prof. Hakim Weatherspoon CS 3410, Spring 2014 Computer Science Cornell University. See P&H 2.8 and 2.12, and A.

Review of Last Lecture. CS 61C: Great Ideas in Computer Architecture. MIPS Instruction Representation II. Agenda. Dealing With Large Immediates

MIPS Instruction Set Architecture (2)

Lecture 4: MIPS Instruction Set

CSCI 402: Computer Architectures. Instructions: Language of the Computer (3) Fengguang Song Department of Computer & Information Science IUPUI.

Computer Science 324 Computer Architecture Mount Holyoke College Fall Topic Notes: MIPS Instruction Set Architecture

Topic Notes: MIPS Instruction Set Architecture

EE 109 Unit 15 Subroutines and Stacks

Architecture II. Computer Systems Laboratory Sungkyunkwan University

Stored Program Concept. Instructions: Characteristics of Instruction Set. Architecture Specification. Example of multiple operands

Lecture 5. Announcements: Today: Finish up functions in MIPS

Thomas Polzer Institut für Technische Informatik

ECE 486/586. Computer Architecture. Lecture # 7

CSE Lecture In Class Example Handout

CISC 662 Graduate Computer Architecture. Lecture 4 - ISA MIPS ISA. In a CPU. (vonneumann) Processor Organization

UCB CS61C : Machine Structures

MIPS Assembly Language. Today s Lecture

Patterson PII. Solutions

Levels of Programming. Registers

Instructions: Language of the Computer

Today s topics. MIPS operations and operands. MIPS arithmetic. CS/COE1541: Introduction to Computer Architecture. A Review of MIPS ISA.

Today s Lecture. MIPS Assembly Language. Review: What Must be Specified? Review: A Program. Review: MIPS Instruction Formats

Introduction to the MIPS. Lecture for CPSC 5155 Edward Bosworth, Ph.D. Computer Science Department Columbus State University

Instructor: Randy H. Katz hap://inst.eecs.berkeley.edu/~cs61c/fa13. Fall Lecture #7. Warehouse Scale Computer

Compilers CS S-08 Code Generation

MODULE 4 INSTRUCTIONS: LANGUAGE OF THE MACHINE

Instructions: MIPS arithmetic. MIPS arithmetic. Chapter 3 : MIPS Downloaded from:

101 Assembly. ENGR 3410 Computer Architecture Mark L. Chang Fall 2009

CS 4200/5200 Computer Architecture I

CS 110 Computer Architecture Lecture 6: More MIPS, MIPS Functions

Lecture 2: RISC V Instruction Set Architecture. Housekeeping

MIPS Assembly Programming

Compiler Architecture

Calling Conventions. Hakim Weatherspoon CS 3410, Spring 2012 Computer Science Cornell University. See P&H 2.8 and 2.12

CSE Lecture In Class Example Handout

ECE 154A Introduction to. Fall 2012

Code Generation. CS 540 George Mason University

The plot thickens. Some MIPS instructions you can write cannot be translated to a 32-bit number

Mark Redekopp, All rights reserved. EE 357 Unit 11 MIPS ISA

CS222: MIPS Instruction Set

Mips Code Examples Peter Rounce

CS61C : Machine Structures

COMPUTER ORGANIZATION AND DESIGN

Course Administration

Instruction Set Architecture. "Speaking with the computer"

Operands and Addressing Modes

5/17/2012. Recap from Last Time. CSE 2021: Computer Organization. The RISC Philosophy. Levels of Programming. Stored Program Computers

The plot thickens. Some MIPS instructions you can write cannot be translated to a 32-bit number

Recap from Last Time. CSE 2021: Computer Organization. Levels of Programming. The RISC Philosophy 5/19/2011

Compilers. Intermediate representations and code generation. Yannis Smaragdakis, U. Athens (original slides by Sam

Functions in MIPS. Functions in MIPS 1

MIPS ISA. 1. Data and Address Size 8-, 16-, 32-, 64-bit 2. Which instructions does the processor support

Operands and Addressing Modes

MIPS Functions and the Runtime Stack

CS 61C: Great Ideas in Computer Architecture (Machine Structures) More MIPS Machine Language

Chapter 4 The Processor 1. Chapter 4A. The Processor

Computer Organization and Structure. Bing-Yu Chen National Taiwan University

Character Is a byte quantity (00~FF or 0~255) ASCII (American Standard Code for Information Interchange) Page 91, Fig. 2.21

Lecture 2. Instructions: Language of the Computer (Chapter 2 of the textbook)

MIPS Assembly: More about Memory and Instructions CS 64: Computer Organization and Design Logic Lecture #7

EN164: Design of Computing Systems Lecture 11: Processor / ISA 4

Do-While Example. In C++ In assembly language. do { z--; while (a == b); z = b; loop: addi $s2, $s2, -1 beq $s0, $s1, loop or $s2, $s1, $zero

MIPS%Assembly% E155%

Transcription:

CS4410 : Spring 2013

Next PS: Map Fish code to MIPS code Issues: eliminating compound expressions eliminating variables encoding conditionals, short- circuits, loops

type exp = Var of var Int of int Binop of exp * binop * exp Not of exp Or of exp * exp And of exp * exp Assign of var * exp

type label = string type reg = R0 R1 R2 R31 type operand = Reg of reg Immed of word

type inst = Add of reg * reg * operand Li of reg * word Slt of reg * reg * operand Beq of reg * reg * label Bgez of reg * label J of label La of reg * label Lw of reg * reg * word Sw of reg * reg * word Label of label...

Fish only has global variables. These can be placed in the data segment..data.align 0 x:.word 0 y:.word 0 z:.word 0

To compile x = x+1 la $3, x ; load x's address lw $2, 0($3) ; load x's value addi $2,$2,1 ; add 1 sw $2, 0($3) ; store value in x

Source: Binop(Binop(x,Plus,y),Plus,Binop(w,Plus,z)) Target: add rd, rs, rt What should we do?

Given: Binop(A,Plus,B) translate A so that result ends up in a particular register (e.g., $3) translate B so that result ends up in a different register (e.g., $2) Generate: add $2, $3, $2 Problems?

Invariants: results always placed in $2 Given: Binop(A,Plus,B) translate A save $2 somewhere translate B Restore A's value into register $3 Generate: add $2, $3, $2

Binop(Binop(x,Plus,y),Plus,Binop(w,Plus,z)) 1. compute x+y, placing result in $2 2. store result in a temporary t1 3. compute w+z, placing result in $2 4. load temporary t1 into a register, say $3 5. add $2, $3, $2

let rec exp2mips(i:exp):inst list = match i with Int j -> [Li(R2, Word32.fromInt j)] Var x -> [La(R2,x), Lw(R2,R2,zero)] Binop(i1,b,i2) -> (let t = new_temp()in (exp2mips i1) @ [La(R1,t), Sw(R2,R1,zero)] @(exp2mips i2) @ [La(R1,t), Lw(R1,R1,zero)] @(match b with Plus -> [Add(R2,R2,Reg R1)]... ->...)) Assign(x,e) => [exp2mips e] @ [La(R1,x), Sw(R2,R1,zero)]

let rec stmt2mips(s:stmt):inst list = match s with Exp e -> exp2mips e Seq(s1,s2) -> (stmt2mips s1) @ (stmt2mips s2) If(e,s1,s2) -> (let else_l = new_label() in let end_l = new_label() in (exp2mips b) @ [Beq(R2,R0,else_l)] @ (stmt2mips s1) @ [J end_l,label else_l] @ (stmt2mips s2) @ [Label end_l])

While(e,s) -> (let test_l = new_label() in let top_l = new_label() in [J test_l, Label top_l] @ (stmt2mips s) @ [Label test_l] @ (exp2mips b) @ [Bne(R2,R0,top_l)]) For(e1,e2,e3,s) -> stmt2mips(seq(exp e1,while(e2,seq(s,exp e3))))

Compiler takes O(n 2 ) time due to @. No constant folding. For "if (x == y) s1 else s2" we do a seq followed by a beq when we could just do a beq. For "if (b1 && b2) s1 else s2" we could just jump to s2 when b1 is false. Lots of la/lw and la/sw for variables. For e1 + e2, we always write out e1's value to memory when we could just cache it in a register.

Naïve append: fun append nil y = y append (h::t) y = h::(append t y) Tail- recursive append: fun revapp nil y = y revapp (h::t) y = revapp t (h::y) fun rev x = revapp x [] fun append x y = revapp (rev x) y

let rec exp2mips'(i:exp) (a:inst list):inst list = match i with... let exp2mips (i:exp) = (exp2mips' i [])

let rec exp2mips'(i:f.iexp) (a:inst list) = match i with Int w -> Li(R2, Word32.fromInt w) :: a Binop(i1,Plus,Int 0) -> exp2mips' i1 a Binop(Int i1,plus,int i2) -> exp2mips' (Int (i1+i2)) a Binop(Int i1,minus,int i2) -> exp2mips' (Int (i1-i2)) a Binop(b,i1,i2) ->... Why isn't this great? How can we fix it?

Consider: if (x < y) then S1 else S2 ELSE: END: slt $2, $2, $3 beq $2, ELSE S1 j END S2

In most contexts, we want a value. But in a testing context, we jump to one place or another based on the value, and otherwise never use the value. In many situations, we can avoid materializing the value and thus produce better code.

let rec bexp2mips(e:exp) (t:label) (f:label) = match e with Int 0 -> [J f] Int _ -> [J t] Binop(e1,Eq,e2) => (let t = temp() in (exp2mips e1) @ [La(R1,t), Sw(R2,R1,zero)] @ (exp2mips e2) @ [La(R1,t), Lw(R1,R1,zero), Bne(R1,R2,f), J t]) _ -> (exp2mips e1) @ [ Beq(R2,R0,f), J t ]

We treated all variables (including temps) as if they were globals: set aside data with a label to read: load address of label, then load value stored at that address. to write: load address of label, then store value at that address. This is fairly inefficient: e.g., x+x involves loading x's address twice! lots of memory operations.

One option is to allocate a register to hold a variable's value: Eliminates the need to load an address or do memory operations. Will talk about more generally later. Of course, in general, we can't avoid it when code has more (live) variables than registers. Can we at least avoid loading addresses?

Set aside one block of memory for all of the variables. Dedicate $30 (aka $fp) to hold the base address of the block. x Assign each variable a y position within the block z (x 0, y 4, z 8, etc.) Now loads/stores can be done relative to $fp:

z = x+1 lda $1,x lw $2,0($1) addi $2,$2,1 lda $1,z sw $2,0($1) lw $2,0($fp) addi $2,$2,1 sw $2,8($fp)

Get rid of nested expressions before translating Introduce new variables to hold intermediate results Perhaps do things like constant folding For example, a = (x + y) + (z + w) might be translated to: t0 := x + y; t1 := z + w; a := t0 + t1;

t0 := x + y; lw $v0, <xoff>($fp) lw $v1, <yoff>($fp) add $v0, $v0, $v1 sw $v0, <t0off>($fp) t1 := z + w; lw $v0, <zoff>($fp) lw $v1, <woff>($fp) add $v0, $v0, $v1 sw $v0, <t1off>($fp) a := t0 + t1; lw $v0, <t0off>($fp) lw $v1, <t1off>($fp) add $v0, $v0, $v1 sw $v0, <aoff>($fp)

We're doing a lot of stupid loads and stores. We shouldn't need to load/store from temps! (Nor variables, but we'll deal with them later ) So another idea is to use registers to hold the intermediate values instead of variables. Of course, we can only use registers to hold some number of temps (say k). Maybe use registers to hold first k temps?

t0 := x; # load variable t1 := y; # load variable t2 := t0 + t1; # add t3 := z; # load variable t4 := w; # load variable t5 := t3 + t4; # add t6 := t2 + t5; # add a := t6; # store result

Notice that each little statement can be directly translated to MIPs instructions: t0 := x; --> lw $t0,<xoff>($fp) t1 := y; --> lw $t1,<yoff>($fp) t2 := t0 + t1; --> add $t2,$t0,$t1 t3 := z; --> lw $t3,<zoff>($fp) t4 := w; --> lw $t4,<woff>($fp) t5 := t3 + t4; --> add $t5,$t3,$t4 t6 := t2 + t5; --> add $t6,$t2,$t5 a := t6; --> sw $t6,<aoff>($fp)

Sometimes we can recycle a temp: t0 := x; t0 taken t1 := y; t0,t1 taken t2 := t0 + t1; t2 taken (t0,t1 free) t3 := z; t2,t3 taken t4 := w; t2,t3,t4 taken t5 := t3 + t4; t2,t5 taken (t3,t4 free) t6 := t2 + t5; t6 taken (t2,t5 free) a := t6; (t6 free)

Hmmm. Looks a lot like a stack t0 := x; t0 t1 := y; t1,t0 t0 := t0 + t1; t0 t1 := z; t1,t0 t2 := w; t2,t1,t0 t1 := t1 + t2; t1,t0 t1 := t0 + t1; t1 a := t1; <empty>

(x+y)*x t0 := x; # loads x t1 := y; t0 := x+y t1 := x; # loads x again! t0 := t0*t1;

Introduces temps as described earlier: It lowers the code to something close to assembly, where the number of resources (i.e., registers) is made explicit. Ideally, we have a 1- to- 1 mapping between the lowered intermediate code and assembly code. Performs an analysis to calculate the live range of each temp: A temp t is live at a program point if there is a subsequent read (use) of t along some control- flow path, without an intervening write (definition). The problem is simplified for functional code since variables are never re- defined.

From the live- range information for each temp, we calculate an interference graph. Temps t1 and t2 interfere if there is some program point where they are both live. We build a graph where the nodes are temps and the edges represent interference. If two temps interfere, then we cannot allocate them to the same register. Conversely, if t1 and t2 do not interfere, we can use the same register to hold their values.

Assign each node (temp) a register such that if t1 interferes with t2, then they are given distinct colors. Similar to trying to "color" a map so that adjacent countries have different colors. In general, this problem is NP complete, so we must use heuristics. Problem: given k registers and n > k nodes, the graph might not be colorable. Solution: spill a node to the stack. Reconstruct interference graph & try coloring again. Trick: spill temps that are used infrequently and/or have high interference degree.

a := (x+y)*(x+z) t1 t0 t5 t0 := x t1 := y t2 := z t3 := t0+t1 t4 := t0+t2 t5 := t3*t4 a := t5 live range for t1 live range for t0 live range for t2 live range for t3 live range for t4 live range for t5 t2 t3 t4

a := (x+y)*(x+z) t1 t0 t5 t0 := x t1 := y t2 := z t3 := t0+t1 t4 := t0+t2 t5 := t3*t4 a := t5 live range for t1 live range for t0 live range for t2 live range for t3 live range for t4 live range for t5 t2 t3 t4

a := (x+y)*(x+z) t1 t0 t5 t0 := x t1 := y t2 := z t3 := t0+t1 t4 := t0+t2 t5 := t3*t4 a := t5 live range for t1 live range for t0 live range for t2 live range for t3 live range for t4 live range for t5 t2 t3 t4

a := (x+y)*(x+z) t1 t0 t5 t0 := x t1 := y t2 := z t3 := t0+t1 t4 := t0+t2 t5 := t3*t4 a := t5 live range for t1 live range for t0 live range for t2 live range for t3 live range for t4 live range for t5 t2 t3 t4

a := (x+y)*(x+z) t1 t0 t5 t0 := x t1 := y t2 := z t3 := t0+t1 t4 := t0+t2 t5 := t3*t4 a := t5 t2 t3 t4 t0 t1 t2

a := (x+y)*(x+z) t1 t0 t5 t0 := x t1 := y t2 := z t3 := t0+t1 t0 := t0+t2 t0 := t3*t0 a := t0 t2 t3 t4 t0 t1 t2

a := (x+y)*(x+z) t0 := x --> lw $t0,<xoff>($fp) t1 := y --> lw $t1,<yoff>($fp) t2 := z --> lw $t2,<zoff>($fp) t3 := t0+t1 --> add $t3,$t0,$t1 t0 := t0+t2 --> add $t0,$t0,$t2 t0 := t3*t0 --> mul $t0,$t3,$t2 a := t0 --> sw $t0,<aoff>($fp)