Intermediate representations

source code → front end → IR → optimizer → IR → back end → target code

- The front end produces an intermediate representation (IR) for the program.
- The optimizer transforms the code in IR form into an equivalent program that may run more efficiently.
- The back end transforms the code in IR form into native code for the target machine.

The IR encodes knowledge that the compiler has derived about the source program.

CPSC 434 Lecture 17

Intermediate representations

Why use an intermediate representation?

1. break the compiler into manageable pieces
   good software engineering technique
2. allow a complete pass before code is emitted
   lets the compiler consider more than one option
3. simplifies retargeting to a new host
   isolates back end from front end
4. simplifies handling of the "poly-architecture" problem
   m languages, n targets ⇒ m + n components (myth)
5. enables machine-independent optimization
   general techniques, multiple passes

An intermediate representation is a compile-time data structure.

Intermediate representations

Important IR properties:

- ease of generation
- ease of manipulation
- cost of manipulation
- level of abstraction
- freedom of expression
- size of typical procedure
- original or derivative

Subtle design decisions in the IR have far-reaching effects on the speed and effectiveness of the compiler.

Level of exposed detail is a crucial consideration.

Intermediate representations

Representations talked about in the literature include:

- abstract syntax trees (AST)
- linear (operator) form of tree
- directed acyclic graphs (DAG)
- control flow graphs (CFG)
- program dependence graphs (PDG)
- static single assignment form (SSA)
- stack code
- three address code
- hybrid combinations

Intermediate representations

Broadly speaking, IRs fall into three categories:

Structural
- structural IRs are graphically oriented
- examples: trees, directed acyclic graphs
- heavily used in source-to-source translators
- nodes, edges tend to be large

Linear
- pseudo-code for some abstract machine
- large variation in level of abstraction
- simple, compact data structures
- easier to rearrange

Hybrids
- combination of graphs and linear code
- attempt to take the best of each
- examples: control flow graph

Abstract syntax tree

An abstract syntax tree (AST) is the procedure's parse tree with the nodes for most non-terminal symbols removed.

          -
        /   \
   <id,x>     *
            /   \
      <num,2>   <id,y>

This represents "x - 2 * y".

For ease of manipulation, can use a linearized (operator) form of the tree:

    x 2 y * -    in postfix form.
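The linearization above is a post-order walk of the tree. A minimal sketch in Python, assuming a hypothetical two-class node layout (`Leaf` and `BinOp` are illustrative names, not from the slides):

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Leaf:
    kind: str   # "id" or "num"
    value: str


@dataclass(frozen=True)
class BinOp:
    op: str
    left: "Node"
    right: "Node"


Node = Union[Leaf, BinOp]


def postfix(node):
    """Linearize the tree with a post-order walk: operands first, then operator."""
    if isinstance(node, Leaf):
        return [node.value]
    return postfix(node.left) + postfix(node.right) + [node.op]


# AST for x - 2 * y
tree = BinOp("-", Leaf("id", "x"), BinOp("*", Leaf("num", "2"), Leaf("id", "y")))
print(" ".join(postfix(tree)))   # prints: x 2 y * -
```

The tree keeps the operator precedence in its shape, so the postfix form needs no parentheses.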

Directed acyclic graph

A directed acyclic graph (DAG) is an AST with a unique node for each value.

    x ← 2 * y + sin(2 * x)
    z ← x / 2

[Figure: DAG for the two statements above. Node labels recovered from the slide: <id,z>, /, +, <id,x>, *, sin, <id,y>, *, <num,2>, <id,x>.]
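One common way to get "a unique node for each value" is to look every node up in a table before creating it (hash-consing). The sketch below is an illustration under that assumption, not the construction used in the lecture; `DagBuilder` is a hypothetical name, and it handles expressions only, ignoring the effect of assignments on later uses of a name.

```python
class DagBuilder:
    """Builds expression nodes, reusing an existing node for a repeated value."""

    def __init__(self):
        self.table = {}   # (op, operands...) -> node id
        self.nodes = []   # node id -> (op, operands...)

    def node(self, op, *args):
        key = (op, *args)
        if key not in self.table:          # first time this value is seen
            self.table[key] = len(self.nodes)
            self.nodes.append(key)
        return self.table[key]


b = DagBuilder()
x = b.node("id", "x")
y = b.node("id", "y")

# 2 * y + sin(2 * x): the constant 2 is requested twice but stored once.
left = b.node("*", b.node("num", 2), y)
right = b.node("sin", b.node("*", b.node("num", 2), x))
root = b.node("+", left, right)

print(len(b.nodes))   # prints: 7  (the equivalent tree has 8 nodes)
```

Asking for `b.node("*", b.node("num", 2), x)` again returns the id of the existing node instead of allocating a new one.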

Control flow graph

The control flow graph (CFG) models the transfers of control in the procedure.

- nodes in the graph are basic blocks
  maximal-length straight-line blocks of code
- edges in the graph represent control flow
  loops, if-then-else, case, goto

Example

    if (x=y) then s1 else s2
    s3

becomes

        x=y
       /   \
     s1     s2
       \   /
        s3
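A rough sketch of how basic blocks and edges might be recovered from linear code, using the usual "leader" rule (the first instruction, every labelled instruction, and every instruction after a branch starts a block). The instruction tuple layout `(label, op, target, text)` is an assumption made for this example only.

```python
def build_cfg(code):
    """code: list of (label, op, target, text); op is "stmt", "goto" or "if".
    An "if" jumps to target when its test holds and falls through otherwise."""
    leaders = {0}
    for i, (label, op, target, _text) in enumerate(code):
        if label is not None:
            leaders.add(i)
        if op in ("goto", "if") and i + 1 < len(code):
            leaders.add(i + 1)

    starts = sorted(leaders)
    ends = starts[1:] + [len(code)]
    blocks = [list(range(s, e)) for s, e in zip(starts, ends)]

    block_of_label = {}
    for n, block in enumerate(blocks):
        label = code[block[0]][0]
        if label is not None:
            block_of_label[label] = n

    edges = set()
    for n, block in enumerate(blocks):
        _label, op, target, _text = code[block[-1]]
        if op in ("goto", "if"):
            edges.add((n, block_of_label[target]))   # taken branch
        if op != "goto" and n + 1 < len(blocks):
            edges.add((n, n + 1))                    # fall through
    return blocks, edges


# if (x=y) then s1 else s2; s3
code = [
    (None, "if",   "L1", "if x = y goto L1"),
    (None, "stmt", None, "s2"),
    (None, "goto", "L2", "goto L2"),
    ("L1", "stmt", None, "s1"),
    ("L2", "stmt", None, "s3"),
]
blocks, edges = build_cfg(code)
print(blocks)          # prints: [[0], [1, 2], [3], [4]]
print(sorted(edges))   # prints: [(0, 1), (0, 2), (1, 3), (2, 3)]
```

The four blocks and four edges form the same diamond as the slide's example: the test, the two arms, and the join at s3.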

Stack machine code

- Several stack-based computers have been built
- Compilers can directly generate stack code

Example

    x - 2 * y

becomes

    push x
    push 2
    push y
    multiply
    subtract

Advantages
- compact form
- introduced names are implicit, not explicit
- simple to generate and execute code

B5500, B1700, P-code, BCPL, RPN calculators

Bytecodes are becoming popular (again)
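To illustrate "simple to generate and execute", here is a small sketch that emits the stack code above from a nested-tuple expression and then runs it. The tuple encoding of the expression and the function names are assumptions for this example.

```python
OPCODES = {"+": "add", "-": "subtract", "*": "multiply"}


def gen_stack_code(node):
    """node is a name/number string, or a tuple (op, left, right)."""
    if isinstance(node, str):
        return [("push", node)]
    op, left, right = node
    # Operands first, operator last: the intermediate values stay unnamed.
    return gen_stack_code(left) + gen_stack_code(right) + [(OPCODES[op],)]


def run(code, env):
    """Execute stack code; env maps variable names to integer values."""
    stack = []
    for instr in code:
        if instr[0] == "push":
            operand = instr[1]
            stack.append(env[operand] if operand in env else int(operand))
        else:
            right = stack.pop()
            left = stack.pop()
            if instr[0] == "add":
                stack.append(left + right)
            elif instr[0] == "subtract":
                stack.append(left - right)
            else:
                stack.append(left * right)
    return stack.pop()


code = gen_stack_code(("-", "x", ("*", "2", "y")))
for instr in code:
    print(*instr)
print(run(code, {"x": 10, "y": 3}))   # prints: 4
```

Note that the generator is the same post-order walk that produces postfix notation.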

Three address code

Three address code is a term used to describe a variety of representations.

In general, they allow statements of the form:

    x ← y op z

with a single operator and, at most, three names.

Simpler form of expression

    x - 2 * y

becomes

    t1 ← 2 * y
    t2 ← x - t1

Advantages
- compact form (direct naming)
- names for intermediate values

Can include forms of prefix or postfix code
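A minimal sketch of how the temporaries t1, t2 might be introduced: walk the expression bottom-up and give each operator result a fresh name. The nested-tuple expression encoding and the `<-` spelling of the arrow are assumptions for this example.

```python
import itertools


def gen_tac(node, code, temps):
    """Append three address statements for node to code; return the name of its value."""
    if isinstance(node, str):          # a variable or constant names itself
        return node
    op, left, right = node
    left_name = gen_tac(left, code, temps)
    right_name = gen_tac(right, code, temps)
    temp = f"t{next(temps)}"           # fresh name for the intermediate value
    code.append(f"{temp} <- {left_name} {op} {right_name}")
    return temp


code = []
result = gen_tac(("-", "x", ("*", "2", "y")), code, itertools.count(1))
print("\n".join(code))
# prints:
# t1 <- 2 * y
# t2 <- x - t1
```

Unlike stack code, every intermediate value now has an explicit name that later passes can refer to.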

Three address code

Typical statement types include:

1. assignments x ← y op z
2. assignments x ← op y
3. assignments x ← y[i]
4. assignments x ← y
5. branches goto L
6. conditional branches if x relop y goto L
7. procedure calls param x and call p
8. address and pointer assignments

Three address code

Until recently, compile-time space was a serious issue
- machines had small memories
- compiler touches space it allocates

Compact forms of three address code
- quadruples
- triples
- indirect triples

Major tradeoff is compactness versus ease of manipulation

Today, speed (and locality) may be more important

Three address code

Quadruples

    x - 2 * y

    (1) load  t1 y
    (2) loadi t2 2
    (3) mult  t3 t2 t1
    (4) load  t4 x
    (5) sub   t5 t4 t3

- simple record structure with four fields
- easy to reorder
- explicit names

Three address code

Triples

    x - 2 * y

    (1) load  y
    (2) loadi 2
    (3) mult  (1) (2)
    (4) load  x
    (5) sub   (4) (3)

- use table index as implicit name
- require only three fields in record
- harder to reorder
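The relationship between the two forms can be sketched as a conversion: drop the destination field of each quadruple and replace every use of a temporary with the index of the triple that computes it. The tuple layouts and the `("ref", n)` marker are assumptions for this example. Operand order follows the quadruple listing, so the multiply comes out as (2) (1) rather than (1) (2); the two are equivalent because multiplication commutes.

```python
# Quadruples for x - 2 * y as (op, dest, arg1, arg2)
quads = [
    ("load",  "t1", "y",  None),
    ("loadi", "t2", "2",  None),
    ("mult",  "t3", "t2", "t1"),
    ("load",  "t4", "x",  None),
    ("sub",   "t5", "t4", "t3"),
]


def quads_to_triples(quads):
    defined_at = {}   # temporary name -> 1-based index of the triple defining it
    triples = []
    for index, (op, dest, arg1, arg2) in enumerate(quads, start=1):
        def operand(arg):
            # A use of a temporary becomes a reference to a table position.
            return ("ref", defined_at[arg]) if arg in defined_at else arg
        triples.append((op, operand(arg1), operand(arg2)))
        defined_at[dest] = index
    return triples


triples = quads_to_triples(quads)
for number, triple in enumerate(triples, start=1):
    print(number, triple)
```

Because a triple's position is its name, moving a triple means rewriting every `("ref", n)` that points at it, which is why triples are harder to reorder than quadruples.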

Three address code

Indirect Triples

    x - 2 * y

    stmt           op     arg1   arg2
    (1) (100)      (100)  load   y
    (2) (101)      (101)  loadi  2
    (3) (102)      (102)  mult   (100)  (101)
    (4) (103)      (103)  load   x
    (5) (104)      (104)  sub    (103)  (102)

- list of 1st triple in statement
- simplifies moving statements
- more space than triples
- implicit name space management
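A small sketch of why the extra statement list "simplifies moving statements": execution order lives in the list, so reordering touches only the list and leaves the triples, and the references between them, unchanged. The dictionary layout and the `move` helper are assumptions for this example.

```python
# Triple table: integer arguments are references to other triples.
triples = {
    100: ("load",  "y", None),
    101: ("loadi", "2", None),
    102: ("mult",  100, 101),
    103: ("load",  "x", None),
    104: ("sub",   103, 102),
}

# Statement list: the order in which the triples execute.
stmts = [100, 101, 102, 103, 104]


def move(stmts, src, dst):
    """Return a new statement list with the entry at position src moved to dst."""
    out = list(stmts)
    out.insert(dst, out.pop(src))
    return out


# Hoist "load x" to the front; it depends on no other triple, so this is safe.
reordered = move(stmts, 3, 0)
print(reordered)       # prints: [103, 100, 101, 102, 104]
print(triples[104])    # prints: ('sub', 103, 102)
```

The cost is the additional list, which is the "more space than triples" noted above.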

Other hybrids

An attempt to get the best of both worlds.
- graphs where they work
- linear codes where it pays off

Unfortunately, there appears to be little agreement about where to use each kind of IR to best advantage.

For example:
- PCC and F77 directly emit assembly code for control flow, but build and pass around expression trees for expressions.
- Many systems use a control flow graph with three address code for each basic block.
- Source-to-source translators typically use an AST and a dependence graph.

Intermediate representations

But, this isn't the whole story.

Symbol table:
- identifiers, procedures
- size, type, location
- lexical nesting depth

Constant table:
- representation, type
- storage class, offset(s)

Storage map:
- storage layout
- overlap information
- (virtual) register assignments
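As an illustration of the symbol table entries listed above, here is a minimal sketch of a scoped table that records type, size, and lexical nesting depth. The class and method names are hypothetical, and a production table would also track location and storage class.

```python
class SymbolTable:
    """A stack of scopes; the innermost scope is searched first."""

    def __init__(self):
        self.scopes = [{}]            # scopes[0] is the outermost scope

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        self.scopes.pop()

    def declare(self, name, type_, size):
        depth = len(self.scopes) - 1  # lexical nesting depth of this declaration
        self.scopes[-1][name] = {"type": type_, "size": size, "depth": depth}

    def lookup(self, name):
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None                   # undeclared name


table = SymbolTable()
table.declare("x", "int", 4)
table.enter_scope()
table.declare("x", "real", 8)         # shadows the outer x
inner = table.lookup("x")
table.exit_scope()
outer = table.lookup("x")
print(inner["type"], inner["depth"])  # prints: real 1
print(outer["type"], outer["depth"])  # prints: int 0
```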

Advice

- Many kinds of IR are used in practice.
- Best choice depends on application.
- There is no widespread agreement on this subject.
- A compiler may need several different IRs.
- Choose IR with right level of detail.
- Keep manipulation costs in mind.