Principles of Compiler Design

Code Generation

Compiler phases:
  Source program -> Lexical analysis -> Token stream -> Syntax analysis -> Abstract syntax tree -> Semantic analysis -> Intermediate code -> Code generation -> Target program

Front end (language specific): lexical, syntax, and semantic analysis.
Back end: code generation.

Code generation and instruction selection

[Figure: input -> front end -> intermediate code generator -> code generator -> output, with a symbol table shared by the phases]

Requirements:
- output code must be correct
- output code must be of high quality
- the code generator itself should run efficiently

Design of code generator: issues

Input: intermediate representation together with the symbol table
- assume that the input has been validated by the front end

Target programs:
- absolute machine language: fast for small programs
- relocatable machine code: requires a linker and loader
- assembly code: requires an assembler, linker, and loader

More issues

Instruction selection
- uniformity
- completeness
- instruction speed, power consumption

Register allocation
- instructions with register operands are faster
- keep long-lived values and counters in registers
- temporary locations
- even-odd register pairs

Evaluation order

Instruction selection

Straightforward code, if efficiency is not an issue:

  a = b + c      Mov b, R0
                 Add c, R0
                 Mov R0, a

  d = a + e      Mov a, R0      (can be eliminated)
                 Add e, R0
                 Mov R0, d

  a = a + 1      Mov a, R0
                 Add #1, R0
                 Mov R0, a
                 or simply: Inc a
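
The load marked "can be eliminated" above can be removed mechanically. The following is a minimal sketch, assuming instructions are held as (opcode, source, destination) tuples; the function name remove_redundant_moves and this representation are illustrative assumptions, not part of the slides. It is only safe when the removed instruction is not the target of a jump.

```python
def remove_redundant_moves(code):
    """Drop a MOV that copies a value straight back to where it just came from.

    code: list of (opcode, source, destination) tuples (hypothetical representation).
    Assumes no instruction in `code` is the target of a jump.
    """
    out = []
    for op, src, dst in code:
        prev = out[-1] if out else None
        if (op == "MOV" and prev is not None and prev[0] == "MOV"
                and prev[1] == dst and prev[2] == src):
            continue  # dst already holds this value, so skip the move
        out.append((op, src, dst))
    return out


naive = [
    ("MOV", "b", "R0"), ("ADD", "c", "R0"), ("MOV", "R0", "a"),   # a = b + c
    ("MOV", "a", "R0"), ("ADD", "e", "R0"), ("MOV", "R0", "d"),   # d = a + e
]
improved = remove_redundant_moves(naive)
```

Here improved keeps five of the six instructions: the reload of a into R0 is dropped because R0 still holds a.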

Example target machine

- byte addressable, with 4 bytes per word
- n registers R0, R1, ..., Rn-1
- two-address instructions of the form: opcode source, destination
- usual opcodes such as move, add, sub

Addressing modes:

  MODE                FORM     ADDRESS
  absolute            M        M
  register            R        R
  index               c(R)     c + contents(R)
  indirect register   *R       contents(R)
  indirect index      *c(R)    contents(c + contents(R))
  literal             #c       c
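
The addressing-mode table can be read as a small function from operand text to a location. The sketch below is one way to write that down, assuming registers are a dict from names to integers and memory is a dict from integer addresses to integers; the function name operand_location and the returned ("reg" | "mem" | "lit", value) pairs are hypothetical conventions chosen for illustration.

```python
import re


def operand_location(operand, regs, mem):
    """Return what an operand refers to: ("reg", name), ("mem", address) or ("lit", value)."""
    m = re.fullmatch(r"#(-?\d+)", operand)
    if m:                                        # literal: #c
        return ("lit", int(m.group(1)))
    if re.fullmatch(r"R\d+", operand):           # register: R
        return ("reg", operand)
    m = re.fullmatch(r"\*(R\d+)", operand)
    if m:                                        # indirect register: *R
        return ("mem", regs[m.group(1)])
    m = re.fullmatch(r"(\*?)(-?\d+)\((R\d+)\)", operand)
    if m:                                        # index c(R) or indirect index *c(R)
        address = int(m.group(2)) + regs[m.group(3)]
        return ("mem", mem[address]) if m.group(1) else ("mem", address)
    return ("mem", operand)                      # absolute: symbolic address M


regs = {"R0": 100}
mem = {104: 200}
indexed = operand_location("4(R0)", regs, mem)      # address 4 + contents(R0)
indirect = operand_location("*4(R0)", regs, mem)    # address stored at 4 + contents(R0)
```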

Flow graph

- graph representation of three-address code
- useful for understanding code generation (and for optimization)
- nodes represent computation
- edges represent flow of control

Basic blocks

A basic block is a (maximal) sequence of consecutive statements in which flow of control enters at the beginning and leaves at the end.

Algorithm to identify basic blocks:
- determine the leaders
  - the first statement is a leader
  - any target of a goto statement is a leader
  - any statement that follows a goto statement is a leader
- for each leader, its basic block consists of the leader and all statements up to, but not including, the next leader
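
The leader rules above can be sketched in a few lines. This is an illustrative sketch, assuming each statement is a (text, goto_target) pair where goto_target is the 0-based index of the jump target, or None for a statement that is not a goto; the names find_leaders and basic_blocks are hypothetical.

```python
def find_leaders(stmts):
    """Return the sorted indices of all leaders in a list of (text, goto_target) pairs."""
    leaders = {0} if stmts else set()          # the first statement is a leader
    for i, (_, target) in enumerate(stmts):
        if target is not None:
            leaders.add(target)                # the target of a goto is a leader
            if i + 1 < len(stmts):
                leaders.add(i + 1)             # the statement after a goto is a leader
    return sorted(leaders)


def basic_blocks(stmts):
    """Split the statements into blocks, each running from a leader up to the next leader."""
    leaders = find_leaders(stmts)
    ends = leaders[1:] + [len(stmts)]
    return [stmts[start:end] for start, end in zip(leaders, ends)]


program = [
    ("prod = 0", None),
    ("i = 1", None),
    ("t1 = 4 * i", None),
    ("prod = prod + t1", None),
    ("i = i + 1", None),
    ("if i <= 20 goto 2", 2),
    ("x = prod", None),
]
blocks = basic_blocks(program)
```

For this program the leaders are statements 0, 2, and 6, so the loop body forms a block of its own.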

Flow graphs

- add control flow information to basic blocks
- nodes are the basic blocks
- there is a directed edge from B1 to B2 if B2 can follow B1 in some execution sequence, that is, if
  - there is a jump from the last statement of B1 to the first statement of B2, or
  - B2 follows B1 in the natural order of execution (and B1 does not end in an unconditional jump)
- initial node: the block whose leader is the first statement
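
The two edge rules can be sketched as follows. This is a minimal sketch, assuming each basic block is summarized as a (leader_index, jump_target, unconditional) triple, where jump_target is the leader index that the block's last statement jumps to (or None) and unconditional says whether that jump is unconditional; the function name flow_edges and this summary format are assumptions made for the example.

```python
def flow_edges(blocks):
    """Return the sorted list of edges (from_block, to_block) of the flow graph."""
    by_leader = {leader: n for n, (leader, _, _) in enumerate(blocks)}
    edges = set()
    for n, (_, target, unconditional) in enumerate(blocks):
        if target is not None:
            edges.add((n, by_leader[target]))      # jump edge
        if not unconditional and n + 1 < len(blocks):
            edges.add((n, n + 1))                  # fall-through edge
    return sorted(edges)


# Three blocks: initialization, a loop body ending in a conditional jump
# back to its own leader (statement 2), and the code after the loop.
summary = [(0, None, False), (2, 2, False), (6, None, False)]
edges = flow_edges(summary)
```

The loop block gets two outgoing edges: one back to itself for the jump and one to the following block for the fall-through case.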

Next use information

- used for register and temporary allocation
- remove variables from registers if they are not used again
- a statement X = Y op Z defines X and uses Y and Z
- scan each basic block backwards
- assume all temporaries are dead on exit and all user variables are live on exit

Computing next use information

Suppose the backward scan reaches statement i: X := Y op Z
1. attach to statement i the information currently in the symbol table about X, Y, and Z
2. set X to "not live" and "no next use" in the symbol table
3. set Y and Z to "live", with next use i, in the symbol table

Steps 2 and 3 must be done in this order, because X may be the same name as Y or Z.
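
The three steps can be sketched as a single backward loop. This is an illustrative sketch, assuming each statement is given as (target, operand_names) with constants left out of the operand tuple, and that liveness on exit is supplied as a set of names; the function name next_use_info and the (live, next_use) pairs are hypothetical choices for the example.

```python
def next_use_info(block, live_on_exit):
    """Backward scan over a basic block.

    block: list of (target, operand_names); statements are numbered from 1.
    Returns a list whose entry i-1 maps each name in statement i to the
    (live, next_use) pair that holds just after statement i.
    """
    table = {}                                   # the "symbol table" of the scan

    def lookup(name):
        return table.get(name, (name in live_on_exit, None))

    info = [None] * len(block)
    for i in range(len(block), 0, -1):
        target, operands = block[i - 1]
        info[i - 1] = {n: lookup(n) for n in (target, *operands)}   # step 1
        table[target] = (False, None)                                # step 2
        for name in operands:                                        # step 3
            table[name] = (True, i)
    return info


block = [
    ("t1", ("a", "a")), ("t2", ("a", "b")), ("t3", ("t2",)),
    ("t4", ("t1", "t3")), ("t5", ("b", "b")), ("t6", ("t4", "t5")),
    ("X", ("t6",)),
]
info = next_use_info(block, live_on_exit={"a", "b", "X"})
```

Because step 2 runs before step 3, a statement such as x = x + y correctly leaves x live with the current statement as its next use.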

Example

  1: t1 = a * a
  2: t2 = a * b
  3: t3 = 2 * t2
  4: t4 = t1 + t3
  5: t5 = b * b
  6: t6 = t4 + t5
  7: X = t6

Example (continued): backward scan

  STATEMENT            INFORMATION ATTACHED
  7: X = t6            no temporary is live
  6: t6 = t4 + t5      t6: use(7); t4, t5 not live
  5: t5 = b * b        t5: use(6)
  4: t4 = t1 + t3      t4: use(6); t1, t3 not live
  3: t3 = 2 * t2       t3: use(4); t2 not live
  2: t2 = a * b        t2: use(3)
  1: t1 = a * a        t1: use(4)

Symbol table:

  NAME   STATUS   NEXT USE
  t1     dead     use in 4
  t2     dead     use in 3
  t3     dead     use in 4
  t4     dead     use in 6
  t5     dead     use in 6
  t6     dead     use in 7

Example (continued): reusing temporaries

[Figure: live ranges of t1 to t6 across statements 1 to 7]

  ORIGINAL              REUSING t1 AND t2
  1: t1 = a * a         1: t1 = a * a
  2: t2 = a * b         2: t2 = a * b
  3: t3 = 2 * t2        3: t2 = 2 * t2
  4: t4 = t1 + t3       4: t1 = t1 + t2
  5: t5 = b * b         5: t2 = b * b
  6: t6 = t4 + t5       6: t1 = t1 + t2
  7: X = t6             7: X = t1

Code generator

- consider each statement in turn
- remember whether operands are already in registers
- register descriptor: keeps track of what is currently in each register; initially all registers are empty
- address descriptor: keeps track of the location(s) where the current value of a name can be found at run time; the location might be a register, a stack location, a memory address, or a set of these

Code generation algorithm

For each statement X = Y op Z do:
1. Invoke a function getreg to determine the location L where X must be stored. Usually L is a register.
2. Consult the address descriptor of Y to determine Y', a current location of Y. Prefer a register for Y'. If the value of Y is not already in L, generate
     Mov Y', L
3. Generate
     op Z', L
   where Z' is a current location of Z. Again prefer a register for Z'.
4. Update the address descriptor of X to indicate that X is in L. If L is a register, update its descriptor to indicate that it contains X, and remove X from all other register descriptors.
5. If the current values of Y and/or Z have no next use, are dead on exit from the block, and are in registers, change the register descriptors to indicate that they no longer contain Y and/or Z.

Function getreg

1. If Y is in a register that holds no other values, and Y is not live and has no next use after X = Y op Z, then return the register of Y for L.
2. Failing (1), return an empty register.
3. Failing (2), if X has a next use in the block or op requires a register, then pick an occupied register R, store its contents into memory location M (by Mov R, M), and use R.
4. Otherwise, select the memory location of X as L.

Example

Basic block: t1 = a - b; t2 = a - c; t3 = t1 + t2; d = t3 + t2

  STMT           CODE          REGISTER DESCRIPTOR    ADDRESS DESCRIPTOR
  t1 = a - b     mov a, R0     R0 contains t1         t1 in R0
                 sub b, R0
  t2 = a - c     mov a, R1     R0 contains t1         t1 in R0
                 sub c, R1     R1 contains t2         t2 in R1
  t3 = t1 + t2   add R1, R0    R0 contains t3         t3 in R0
                               R1 contains t2         t2 in R1
  d = t3 + t2    add R1, R0    R0 contains d          d in R0
                 mov R0, d                            d in R0 and memory
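
The algorithm and getreg can be sketched together as below. This is a simplified sketch under stated assumptions, not a definitive implementation: statements are (x, y, op, z) tuples, a name's memory location is written as the name itself, only + and - are supported, rule 3 of getreg always spills the first register, and rule 4 (using memory as L) is left out. The function name generate and its parameters are hypothetical.

```python
def generate(block, live_on_exit, num_regs=2):
    """Generate two-address code for a basic block of (x, y, op, z) statements."""
    opcodes = {"+": "ADD", "-": "SUB"}
    reg_desc = {f"R{i}": set() for i in range(num_regs)}   # register -> names held
    addr_desc = {}                                         # name -> set of locations
    code = []

    def locations(name):
        # Names not yet seen are assumed to be in their memory location.
        return addr_desc.setdefault(name, {name})

    def best_location(name):
        # Prefer a register over memory.
        for loc in sorted(locations(name)):
            if loc in reg_desc:
                return loc
        return name

    def needed_after(i, name):
        # Used later in the block before being redefined, or live on exit.
        for x, y, _, z in block[i + 1:]:
            if name in (y, z):
                return True
            if name == x:
                return False
        return name in live_on_exit

    def getreg(i, y):
        for reg, held in reg_desc.items():       # rule 1: reuse Y's register
            if held == {y} and not needed_after(i, y):
                return reg
        for reg, held in reg_desc.items():       # rule 2: an empty register
            if not held:
                return reg
        reg = next(iter(reg_desc))               # rule 3 (simplified): spill
        for name in reg_desc[reg]:
            if name not in locations(name):
                code.append(f"MOV {reg}, {name}")
                locations(name).add(name)
            locations(name).discard(reg)
        reg_desc[reg] = set()
        return reg

    for i, (x, y, op, z) in enumerate(block):
        L = getreg(i, y)
        y_loc = best_location(y)
        if y_loc != L:
            code.append(f"MOV {y_loc}, {L}")
        code.append(f"{opcodes[op]} {best_location(z)}, {L}")
        for name in reg_desc[L]:                 # L is about to hold only X
            locations(name).discard(L)
        for held in reg_desc.values():
            held.discard(x)
        reg_desc[L] = {x}
        addr_desc[x] = {L}
        for name in (y, z):                      # free operands no longer needed
            if name != x and not needed_after(i, name):
                for reg, held in reg_desc.items():
                    if name in held:
                        held.discard(name)
                        locations(name).discard(reg)

    for name in sorted(live_on_exit):            # store live names held only in registers
        locs = addr_desc.get(name, {name})
        if name not in locs:
            code.append(f"MOV {best_location(name)}, {name}")
            locs.add(name)
    return code


block = [("t1", "a", "-", "b"), ("t2", "a", "-", "c"),
         ("t3", "t1", "+", "t2"), ("d", "t3", "+", "t2")]
code = generate(block, live_on_exit={"d"})
```

With two registers and only d live on exit, the sketch emits the same seven instructions as the example table, ending with the store of R0 into d.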

Conditional statements

1. Branch if the value of register R meets one of six conditions: negative, zero, positive, non-negative, non-zero, non-positive.

     if X < Y goto Z      MOV X, R0
                          SUB Y, R0
                          CJ- Z

2. Condition codes: indicate whether the last quantity computed or loaded into a location is negative, zero, or positive.

     if X < Y goto Z      CMP X, Y
                          CJ< Z

Conditional statements (continued)

Maintain a condition code descriptor, which tells the name that last set the condition codes. If that name is the one being tested, no separate compare instruction is needed:

  X = Y + Z            Mov Y, R0
                       Add Z, R0
                       Mov R0, X
  if X < 0 goto L      CJ< L

DAG representation of basic blocks

- useful data structure for implementing transformations on basic blocks
- gives a picture of how the value computed by a statement is used in subsequent statements
- a good way of determining common subexpressions

A DAG for a basic block has the following labels on its nodes:
- leaves are labeled by unique identifiers, either variable names or constants
- interior nodes are labeled by an operator symbol
- nodes are also optionally given a sequence of identifiers as labels
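
A DAG with these labels can be built by reusing a node whenever the same operator is applied to the same operand nodes. The sketch below assumes every statement has the form x = y op z, given as an (x, op, y, z) tuple; copies, array accesses, and jumps are left out. The function name build_dag and the node encoding are hypothetical.

```python
def build_dag(block):
    """Build a DAG for a list of (x, op, y, z) statements.

    Returns (nodes, current): nodes[i] is ("leaf", name) or (op, left_id, right_id),
    and current maps each name to the node that holds its latest value.
    """
    nodes = []      # node id -> node
    index = {}      # node -> node id, so that an identical node is reused
    current = {}    # name -> node id

    def node_id(node):
        if node not in index:
            index[node] = len(nodes)
            nodes.append(node)
        return index[node]

    def value_of(name):
        # A name (or constant) not seen before is a leaf holding its initial value.
        if name not in current:
            current[name] = node_id(("leaf", name))
        return current[name]

    for x, op, y, z in block:
        current[x] = node_id((op, value_of(y), value_of(z)))
    return nodes, current


block = [
    ("t1", "*", "4", "i"),
    ("t3", "*", "4", "i"),     # same operator and operands as t1
    ("t5", "+", "t1", "t3"),
]
nodes, current = build_dag(block)
```

In this block t1 and t3 end up attached to the same node, which is how the DAG exposes 4 * i as a common subexpression.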

DAG representation: example

   1. t1 := 4 * i
   2. t2 := a[t1]
   3. t3 := 4 * i
   4. t4 := b[t3]
   5. t5 := t2 * t4
   6. t6 := prod + t5
   7. prod := t6
   8. t7 := i + 1
   9. i := t7
  10. if i <= 20 goto (1)

[Figure: DAG for the block. Leaves: prod0, a, b, 4, i0, 1, 20. Interior nodes: * labeled t1, t3; [] labeled t2; [] labeled t4; * labeled t5; + labeled t6, prod; + labeled t7, i; <= with jump target (1).]

Code generation from DAG

  ORIGINAL CODE               CODE FROM THE DAG
  S1 = 4 * i                  S1 = 4 * i
  S2 = addr(a) - 4            S2 = addr(a) - 4
  S3 = S2[S1]                 S3 = S2[S1]
  S4 = 4 * i
  S5 = addr(b) - 4            S5 = addr(b) - 4
  S6 = S5[S4]                 S6 = S5[S1]
  S7 = S3 * S6                S7 = S3 * S6
  S8 = prod + S7              prod = prod + S7
  prod = S8
  S9 = i + 1                  i = i + 1
  i = S9
  if i <= 20 goto (1)         if i <= 20 goto (1)

Since S4 duplicates S1, it is dropped and the access to b uses S1 instead.

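Rearranging order of the code

Consider the following basic block:

  t1 = a + b
  t2 = c + d
  t3 = e - t2
  X = t1 - t3

[Figure: DAG of the block. Root: - labeled X, with children + labeled t1 (over leaves a and b) and - labeled t3 (over leaf e and + labeled t2, which is over leaves c and d).]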

Rearranging order

Code for the three-address code of the DAG, assuming only two registers are available:

  MOV a, R0
  ADD b, R0
  MOV c, R1
  ADD d, R1
  MOV R0, t1      (register spilling)
  MOV e, R0
  SUB R1, R0
  MOV t1, R1      (register reloading)
  SUB R0, R1
  MOV R1, X

Rearranging the code as

  t2 = c + d
  t3 = e - t2
  t1 = a + b
  X = t1 - t3

gives

  MOV c, R0
  ADD d, R0
  MOV e, R1
  SUB R0, R1
  MOV a, R0
  ADD b, R0
  SUB R1, R0
  MOV R0, X

The rearranged order needs no spill or reload and is two instructions shorter. The result t1 - t3 is left in R0 by SUB R1, R0, so the final store must be from R0.
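
Both instruction sequences can be checked against each other with a tiny interpreter for the two-address machine. This is a minimal sketch, assuming instructions are (opcode, source, destination) tuples, registers are the names R0, R1, and so on, and every other name is a memory location; the function name run and the sample values are made up for the example. In the rearranged sequence the final store reads R0, because SUB R1, R0 leaves t1 - t3 there.

```python
def run(code, memory):
    """Execute MOV/ADD/SUB instructions and return the final memory contents."""
    mem = dict(memory)
    regs = {}

    def is_reg(name):
        return name[0] == "R" and name[1:].isdigit()

    def load(name):
        return regs[name] if is_reg(name) else mem[name]

    def store(name, value):
        (regs if is_reg(name) else mem)[name] = value

    for op, src, dst in code:
        if op == "MOV":
            store(dst, load(src))
        elif op == "ADD":
            store(dst, load(dst) + load(src))
        elif op == "SUB":
            store(dst, load(dst) - load(src))
        else:
            raise ValueError(f"unknown opcode: {op}")
    return mem


original = [
    ("MOV", "a", "R0"), ("ADD", "b", "R0"), ("MOV", "c", "R1"), ("ADD", "d", "R1"),
    ("MOV", "R0", "t1"),                       # spill t1
    ("MOV", "e", "R0"), ("SUB", "R1", "R0"),
    ("MOV", "t1", "R1"),                       # reload t1
    ("SUB", "R0", "R1"), ("MOV", "R1", "X"),
]
rearranged = [
    ("MOV", "c", "R0"), ("ADD", "d", "R0"), ("MOV", "e", "R1"), ("SUB", "R0", "R1"),
    ("MOV", "a", "R0"), ("ADD", "b", "R0"), ("SUB", "R1", "R0"), ("MOV", "R0", "X"),
]
values = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5}
x_original = run(original, values)["X"]
x_rearranged = run(rearranged, values)["X"]
```

Both runs compute X = (a + b) - (e - (c + d)); the rearranged sequence does so in eight instructions instead of ten.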