Compilers and computer architecture: Code-generation (2): register machines


Compilers and computer architecture: Code-generation (2): register machines. Martin Berger, November 2018.

Recall the function of compilers

Plan for this week

[Slide diagram: the compiler pipeline: source program -> lexical analysis -> syntax analysis -> semantic analysis (e.g. type checking) -> intermediate code generation -> optimisation -> code generation -> translated program]

In the previous section we introduced the stack machine architecture, and then investigated a simple syntax-directed code generator for this architecture. This week we continue looking at code generation, but for register machines, a faster CPU architecture. If time permits, we'll also look at accumulator machines.

Recall: stack machine architecture

[Slide diagram: memory cells 15 down to 0 holding data values 66 and 22 and a program containing Jump 3, PushAbs 12, PushAbs 13 and Add, together with the SP and PC registers; current instruction = Jump, temporary register = 12]

Recall: stack machine language

Nop, PushAbs x, PushImm n, Pop x, CompGreaterThan, CompEq, Jump l, JumpTrue l, Plus, Minus, Times, Divide, Negate

Important: arguments (e.g. to Plus) are always on top of the stack, and are removed (by rearranging the stack pointer (SP)). The result of the command is placed on the top of the stack.

Register machines

The problem with the stack machine is that memory access (on modern CPUs) is very slow in comparison with CPU operations, approx. 20-100 times slower. The stack machine forces us constantly to access memory, even for the simplest operations. It would be nice if the CPU let us store, access and manipulate data directly, rather than only work on the top elements of the stack. Registers are fast (in comparison with memory), temporary, addressable storage in the CPU that lets us do this; whence register machines. But compilation for register machines is more complicated than compilation for stack machines. Can you guess why?

Compilation for register machines

Each CPU has only a small, finite number of registers (e.g. 16, 32, 128). That can be a problem. Why? Because for small expressions we can fit all the relevant parameters into the registers, but for the execution of larger expressions this is no longer the case. Moreover, which registers are available at each point in the computation depends on what other code is being executed. Hence a compiler must be able to do the following things:

- Generate code that has all parameters in registers.
- Generate code that has some (or most) parameters in main memory.
- Detect which of the above is the case, and be able to switch seamlessly between the two.

All of this makes compilers more difficult. Let's look at register machines.

A simple register machine

[Slide diagram: the same memory, SP and PC as before, now with eight general-purpose registers: R0 = 2, R1 = 42, R2 = 33, R3 = 12, R4 = 20, R5 = 25, R6 = 116, R7 = 123; current instruction = Jump]

Quite similar to the stack machine, but we have additional registers. Important: operations like add and multiply operate on registers, no longer on the top of the stack. How do we generate code for register machines?

Dealing with registers by divide-and-conquer

In order to explain the difficult problem of generating efficient code for register machines, we split the problem into three simpler sub-problems:

- Generate code assuming an unlimited supply of registers.
- Modify the translator to evaluate expressions in the order which minimises the number of registers needed, while still generating efficient code.
- Invent a scheme to handle cases where we run out of registers.

Let's start by looking at register machines with an unlimited number of registers.

Commands for register machines

We assume an unlimited supply of registers R0, R1, R2, ..., ranged over by r, r'. We call these general-purpose registers (as distinct from PC, SP).

  Nop: Does nothing.
  Pop r: Removes the top of the stack and stores it in register r.
  Push r: Pushes the content of register r onto the stack.
  Load r x: Loads the content of memory location x into register r.
  LoadImm r n: Loads integer n into register r.
  Store r x: Stores the content of register r in memory location x.
  CompGreaterThan r r': Compares the content of register r with the content of register r'. Stores 1 in r if the former is bigger than the latter, otherwise stores 0.

Commands for register machines (continued)

  CompEq r r': Compares the content of register r with the content of register r'. Stores 1 in r if both are equal, otherwise stores 0.
  Jump l: Jumps to l.
  JumpTrue r l: Jumps to address/label l if the content of register r is not 0.
  JumpFalse r l: Jumps to address/label l if the content of register r is 0.
  Plus r r': Adds the content of r and r', leaving the result in r.

The remaining arithmetic operations are similar. Some commands have arguments (called operands). They take two (if the command has one operand) or three units of storage; the others take only one. These operands need to be specified in the op-code, unlike with the stack machine. (Why?)
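To make this instruction set concrete, here is a minimal sketch of how the commands might be represented as a Scala datatype. The representation is an assumption for illustration (the lecture does not fix one); the I_ constructor names mirror the pseudocode on the following slides.

  // Hypothetical Scala representation of the register-machine commands.
  type Register = Int      // registers R0, R1, ... named by index
  type Location = String   // memory locations addressed by name
  type Label    = String   // jump targets

  sealed trait Instruction
  case object I_Nop extends Instruction
  case class I_Pop(r: Register) extends Instruction
  case class I_Push(r: Register) extends Instruction
  case class I_Load(r: Register, x: Location) extends Instruction
  case class I_LoadImm(r: Register, n: Int) extends Instruction
  case class I_Store(r: Register, x: Location) extends Instruction
  case class I_CompGreaterThan(r1: Register, r2: Register) extends Instruction
  case class I_CompEq(r1: Register, r2: Register) extends Instruction
  case class I_Jump(l: Label) extends Instruction
  case class I_JumpTrue(r: Register, l: Label) extends Instruction
  case class I_JumpFalse(r: Register, l: Label) extends Instruction
  case class I_Plus(r1: Register, r2: Register) extends Instruction    // r1 := r1 + r2
  case class I_Minus(r1: Register, r2: Register) extends Instruction   // r1 := r1 - r2
  case class I_Times(r1: Register, r2: Register) extends Instruction   // r1 := r1 * r2
  case class I_Divide(r1: Register, r2: Register) extends Instruction  // r1 := r1 / r2
  case class I_Negate(r: Register) extends Instruction                 // r := -r (one of the "similar" operations)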

Commands for register machines

Question: Why do we bother with stack operations at all? They are important for e.g. procedure/method invocation. We'll talk about that later.

Source language

Our source language is unchanged: a really simple imperative language.

  M ::= M; M | for x = E to E {M} | x := E
  E ::= n | x | E + E | E - E | E * E | E / E | -E

Everything that's difficult to compile, e.g. procedures and objects, is left out. We come to that later.
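As a concrete counterpart to this grammar, here is one possible Scala AST. The constructor names (Const, Ident, Binop, ...) are chosen to match the pseudocode on the next slides; the sketch itself is an assumption.

  // Hypothetical AST for the source language above.
  sealed trait Op
  case object Plus extends Op
  case object Minus extends Op
  case object Times extends Op
  case object Divide extends Op

  sealed trait Expr
  case class Const(n: Int) extends Expr                        // n
  case class Ident(x: String) extends Expr                     // x
  case class Binop(lhs: Expr, op: Op, rhs: Expr) extends Expr  // E + E, E - E, E * E, E / E
  case class Negate(e: Expr) extends Expr                      // -E

  sealed trait Stmt
  case class Seq(fst: Stmt, snd: Stmt) extends Stmt                        // M; M
  case class For(x: String, from: Expr, to: Expr, body: Stmt) extends Stmt // for x = E to E {M}
  case class Assign(x: String, e: Expr) extends Stmt                       // x := E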

Code generation for register machines

  def codegen ( s : AST, target : Register ) : List [ Instruction ]
  def codegenexpr ( exp : Expr, target : Register ) : List [ Instruction ]

Important convention 1: The code generated by codegenexpr( e, i ) can use registers Ri, Ri+1, ... upwards, but must leave the other registers R0, ..., Ri-1 unchanged!

Important convention 2: The result of evaluating the expression is returned in register target.

One way of thinking about this target is that it is used to track where the stack pointer would point. Similar conventions hold for codegen for statements.

Code generation for constants

  def codegenexpr ( exp : Expr, target : Register ) = {
    if exp is of shape
      Const ( n ) then List ( I_LoadImm ( target, n ) )
  }

Code generation for variables

  def codegenexpr ( exp : Expr, target : Register ) = {
    if exp is of shape
      Ident ( x ) then List ( I_Load ( target, x ) )
  }

Code generation for binary expressions

  def codegenexpr ( exp : Expr, target : Register ) = {
    if exp is of shape
      Binop ( lhs, op, rhs ) then {
        codegenexpr ( rhs, target ) ++
        codegenexpr ( lhs, target+1 ) ++
        codegenbinop ( op, target, target+1 ) } }

where

  def codegenbinop ( op : Op, r1 : Register, r2 : Register ) = {
    if op is of shape
      Plus then List ( I_Plus ( r1, r2 ) )
      Minus then List ( I_Minus ( r1, r2 ) )
      Times then List ( I_Times ( r1, r2 ) )
      Divide then List ( I_Divide ( r1, r2 ) ) }

Code generation for binary expressions

Note that the call codegenexpr(lhs, target+1) in

  Binop ( lhs, op, rhs ) then {
    codegenexpr ( rhs, target ) ++
    codegenexpr ( lhs, target+1 ) ++
    codegenbinop ( op, target, target+1 ) }

leaves the result of the first call codegenexpr(rhs, target) in the register target unchanged, by our assumption that codegenexpr never modifies registers below its second argument. Please convince yourself that each clause of codegenexpr really implements this guarantee!
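Putting the three clauses together, here is a runnable Scala sketch of codegenExpr, assuming the Expr and Instruction types sketched earlier (the treatment of Negate is an assumption, filled in analogously to the other cases):

  // Sketch: syntax-directed code generation with unlimited registers.
  def codegenExpr(exp: Expr, target: Register): List[Instruction] = exp match {
    case Const(n) => List(I_LoadImm(target, n))
    case Ident(x) => List(I_Load(target, x))
    case Binop(lhs, op, rhs) =>
      codegenExpr(rhs, target) ++            // rhs into target
        codegenExpr(lhs, target + 1) ++      // lhs into target+1; leaves target alone
        codegenBinop(op, target, target + 1) // combine, result in target
    case Negate(e) =>
      codegenExpr(e, target) ++ List(I_Negate(target))
  }

  def codegenBinop(op: Op, r1: Register, r2: Register): List[Instruction] = op match {
    case Plus   => List(I_Plus(r1, r2))
    case Minus  => List(I_Minus(r1, r2))
    case Times  => List(I_Times(r1, r2))
    case Divide => List(I_Divide(r1, r2))
  }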

Example: (x*3)+4

Compiling the expression (x*3)+4 (to target register r17, say) gives:

  LoadImm r17 4
  LoadImm r18 3
  Load r19 x
  Times r18 r19
  Plus r17 r18
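With the Scala sketch above, this output can be reproduced directly (register indices stand in for r17, r18, r19):

  val e = Binop(Binop(Ident("x"), Times, Const(3)), Plus, Const(4))  // (x*3)+4
  codegenExpr(e, 17)
  // = List(I_LoadImm(17, 4), I_LoadImm(18, 3), I_Load(19, "x"),
  //        I_Times(18, 19), I_Plus(17, 18))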

How can this be improved (1)?

Let's use commutativity of addition (a+b = b+a) and compile 4+(x*3) instead! When we compile it, we obtain:

  LoadImm r17 3
  Load r18 x
  Times r17 r18
  LoadImm r18 4
  Plus r17 r18

How is this better?

Side by side

Compilation of (x*3)+4:

  LoadImm r17 4
  LoadImm r18 3
  Load r19 x
  Times r18 r19
  Plus r17 r18

Compilation of 4+(x*3):

  LoadImm r17 3
  Load r18 x
  Times r17 r18
  LoadImm r18 4
  Plus r17 r18

The translation on the left uses three registers, while the one on the right uses only two. We are currently assuming an unbounded number of registers, so who cares? For realistic CPUs the number of registers is small, so smart translation strategies that save registers are better. More on this later!
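Choosing the cheaper of the two orders need not be done by hand. A standard technique (Sethi-Ullman numbering; a sketch under the AST assumptions above, not part of the lecture's code) first computes how many registers each subexpression needs, and then evaluates the more demanding operand first:

  // Sketch: registers needed to evaluate an expression (Sethi-Ullman labelling).
  def need(e: Expr): Int = e match {
    case Const(_) | Ident(_) => 1
    case Negate(e1)          => need(e1)
    case Binop(l, _, r) =>
      val (nl, nr) = (need(l), need(r))
      if (nl == nr) nl + 1   // equally demanding: one extra register is unavoidable
      else math.max(nl, nr)  // do the more demanding side first, then reuse its registers
  }

For (x*3)+4 this gives need(x*3) = 2 and need(4) = 1, so a generator that (for commutative operators like + and *) emits code for the operand with the larger need first compiles the expression as if it had been written 4+(x*3), with two registers instead of three.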

How can this be improved (2)? (Not exam relevant!)

We used two registers; can we get away with fewer? Probably not with this machine architecture. But some widely used CPUs can use constants directly in arithmetic operations, e.g.

  TimesImm r 3: Multiplies the content of register r by 3, storing the result in r.
  PlusImm r 4: Adds 4 to the content of register r, storing the result in r.

Then (x*3)+4 translates to

  Load r17 x
  TimesImm r17 3
  PlusImm r17 4

This is a different address(ing) mode.
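On such a machine the code generator can pattern-match for constant operands. A hedged sketch, building on the codegenExpr sketch above (I_TimesImm and I_PlusImm are the assumed immediate-mode instructions from this slide):

  // Assumed immediate-mode instructions.
  case class I_PlusImm(r: Register, n: Int) extends Instruction   // r := r + n
  case class I_TimesImm(r: Register, n: Int) extends Instruction  // r := r * n

  // Sketch: prefer immediate addressing when the right operand is a constant.
  def codegenExprImm(exp: Expr, target: Register): List[Instruction] = exp match {
    case Binop(lhs, Plus, Const(n)) =>
      codegenExprImm(lhs, target) ++ List(I_PlusImm(target, n))
    case Binop(lhs, Times, Const(n)) =>
      codegenExprImm(lhs, target) ++ List(I_TimesImm(target, n))
    case _ => codegenExpr(exp, target)  // otherwise fall back to the general scheme
  }

Applied to (x*3)+4 with target r17, this yields exactly the three-instruction, single-register sequence above.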

Translation of statements

Similar to the stack machine, except that arguments and results of expressions are held in registers. We'll see this in detail later.

Question: Does the codegenstatement method need to be passed a target register (as opposed to hard-coding one)?

Answer: Yes, because statements may contain expressions, e.g. x := x*y+3.

Now we do something more interesting.

Bounded register numbers

It's easy to compile to register machine code when the number of registers is unlimited. Now we look at compilation to register machines with a fixed number of registers. Let's go to the other extreme: just one register, called the accumulator. Operations take one of their arguments from the accumulator and store the result in the accumulator. Additional arguments are taken from the top of the stack. Why is this interesting? We can combine two strategies:

- While more than one free register remains, use the register machine strategy discussed above for compilation.
- When the limit is reached (i.e. when there is one register left), revert to the accumulator strategy, using the last register as the accumulator.

The effect is that most expressions get the full benefit of registers, while unusually large expressions are handled correctly.
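Here is a rough Scala sketch of that combination, under the conventions and types above. Our instruction set gives Plus etc. no way to read an operand from the stack directly, so this sketch instead borrows the register below the accumulator, keeping its old value safe on the stack; maxReg and the exact spill sequence are assumptions, not the lecture's code:

  // Sketch: code generation with a bounded register file R0 .. RmaxReg.
  val maxReg = 31  // assumed size of the register file

  def codegenExprBounded(exp: Expr, target: Register): List[Instruction] = exp match {
    case Const(n)  => List(I_LoadImm(target, n))
    case Ident(x)  => List(I_Load(target, x))
    case Negate(e) => codegenExprBounded(e, target) ++ List(I_Negate(target))
    case Binop(lhs, op, rhs) if target < maxReg =>
      // Enough registers left: exactly the unlimited-register scheme.
      codegenExprBounded(rhs, target) ++
        codegenExprBounded(lhs, target + 1) ++
        codegenBinop(op, target, target + 1)
    case Binop(lhs, op, rhs) =>  // target == maxReg: last register, spill via the stack
      List(I_Push(maxReg - 1)) ++                   // save the protected register below us
        codegenExprBounded(lhs, maxReg) ++          // lhs into the last register
        List(I_Push(maxReg), I_Pop(maxReg - 1)) ++  // "move" lhs into the borrowed register
        codegenExprBounded(rhs, maxReg) ++          // rhs into the last register
        codegenBinop(op, maxReg, maxReg - 1) ++     // same operand roles as before
        List(I_Pop(maxReg - 1))                     // restore the caller's register
  }

The pushes and pops are balanced around each spill, so the convention that codegenExprBounded(e, i) preserves R0, ..., Ri-1 still holds; only the evaluation order of the two operands differs from the unlimited case, which is harmless for our side-effect-free expressions.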