Intermediate Representations

Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University have explicit permission to make copies of these materials for their personal use.

Intermediate Representations

Source Code -> Front End -> IR -> Middle End -> IR -> Back End -> Target Code

Front end - produces an intermediate representation (IR)
Middle end - transforms the IR into an equivalent IR that runs more efficiently
Back end - transforms the IR into native code

The IR encodes the compiler's knowledge of the program. The middle end usually consists of several passes.

Intermediate Representations

Decisions in IR design affect the speed and efficiency of the compiler. Some important IR properties:
- Ease of generation
- Ease of manipulation
- Procedure size
- Freedom of expression
- Level of abstraction

The importance of different properties varies between compilers. Selecting an appropriate IR for a compiler is critical.

Types of Intermediate Representations

Three major categories:

Structural
- Graphically oriented
- Heavily used in source-to-source translators
- Tend to be large
- Examples: trees, DAGs

Linear
- Pseudo-code for an abstract machine
- Level of abstraction varies
- Simple, compact data structures
- Easier to rearrange
- Examples: three-address code, stack-machine code

Hybrid
- Combination of graphs and linear code
- Example: control-flow graph

Level of Abstraction

The level of detail exposed in an IR influences the profitability and feasibility of different optimizations. Two different representations of an array reference A[i,j]:

High-level AST (good for memory disambiguation):

    subscript
    /   |   \
   A    i    j

Low-level linear code (good for address calculation):

    loadi 1       => r1
    sub   rj, r1  => r2
    loadi 10      => r3
    mult  r2, r3  => r4
    sub   ri, r1  => r5
    add   r4, r5  => r6
    loadi @A      => r7
    add   r7, r6  => r8
    load  r8      => rAij

Level of Abstraction

Structural IRs are usually considered high-level; linear IRs are usually considered low-level. Not necessarily true:

Low-level AST (the address computation spelled out as a tree):

    load
      |
      +
     / \
   @A    +
        / \
       *    -
      / \  / \
     -  10 i  1
    / \
   j   1

High-level linear code:

    loadArray A, i, j

Abstract Syntax Tree

An abstract syntax tree (AST) is the procedure's parse tree with the nodes for most non-terminal symbols removed.

For x - 2 * y:

      -
     / \
    x    *
        / \
       2   y

Can use a linearized form of the tree, which is easier to manipulate than pointers:

    x 2 y * -    in postfix form
    - x * 2 y    in prefix form

S-expressions are (essentially) ASTs.
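
A minimal sketch of an AST and its linearized forms in Python (the class and function names are illustrative, not from the slides):

```python
# A tiny AST node: an operator (or leaf name) plus zero or more children.
class Node:
    def __init__(self, op, *kids):
        self.op, self.kids = op, kids

def postfix(n):
    # Children first, then the operator.
    if not n.kids:
        return n.op
    return " ".join(postfix(k) for k in n.kids) + " " + n.op

def prefix(n):
    # Operator first, then the children.
    if not n.kids:
        return n.op
    return n.op + " " + " ".join(prefix(k) for k in n.kids)

# x - 2 * y
tree = Node("-", Node("x"), Node("*", Node("2"), Node("y")))
print(postfix(tree))  # x 2 y * -
print(prefix(tree))   # - x * 2 y
```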

Directed Acyclic Graph

A directed acyclic graph (DAG) is an AST with a unique node for each value.

For the pair of assignments

    z <- x - 2 * y
    w <- x / 2

the DAG has z pointing to -(x, *(2, y)) and w pointing to /(x, 2); the nodes for x and for 2 each appear once and are shared.

Makes sharing explicit; encodes redundancy. The same expression appearing twice means that the compiler might arrange to evaluate it just once!
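
One common way to build such a DAG is hash-consing: a constructor that memoizes nodes, so an identical subexpression always returns the same node. A sketch (simplified: because children are themselves memoized bottom-up, keying on child identity suffices):

```python
# Memoized node constructor: identical (op, children) requests yield one node.
nodes = {}

def node(op, *kids):
    key = (op,) + tuple(id(k) for k in kids)
    if key not in nodes:
        nodes[key] = (op, kids)
    return nodes[key]

# z <- x - 2 * y ; w <- x / 2 : the leaves x and 2 are built once and shared.
x, two, y = node("x"), node("2"), node("y")
z = node("-", x, node("*", two, y))
w = node("/", x, two)

assert node("x") is x  # asking again returns the same shared leaf
```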

Stack Machine Code

Originally used for stack-based computers, now Java.

Example: x - 2 * y becomes

    push x
    push 2
    push y
    multiply
    subtract

Advantages:
- Compact form
- Introduced names are implicit, not explicit (implicit names take up no space, where explicit ones do!)
- Simple to generate and execute code
- Useful where code is transmitted over slow communication links (the net)
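
A tiny interpreter for this code makes the implicit names visible as stack slots. A sketch (the instruction tuples and the env dictionary supplying values for x and y are illustrative):

```python
# Execute stack-machine code against an environment of variable values.
def run(code, env):
    stack = []
    for op, *args in code:
        if op == "push":
            v = args[0]
            stack.append(env.get(v, v))  # variable name or literal
        elif op == "multiply":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == "subtract":
            b, a = stack.pop(), stack.pop()
            stack.append(a - b)
    return stack.pop()

code = [("push", "x"), ("push", 2), ("push", "y"),
        ("multiply",), ("subtract",)]
print(run(code, {"x": 10, "y": 3}))  # 10 - 2*3 = 4
```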

Three-Address Code

There are several different representations of three-address code. In general, three-address code has statements of the form

    x <- y op z

with one operator (op) and, at most, three names (x, y, & z).

Example: z <- x - 2 * y becomes

    t <- 2 * y
    z <- x - t

Advantages:
- Resembles many machines
- Introduces a new set of names
- Compact form
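
The "new set of names" comes from generating a fresh temporary per interior operation. A sketch of lowering an expression tree to three-address code (tuple trees and the temp-naming scheme are illustrative):

```python
# Walk an expression tree bottom-up, emitting one TAC statement per operator.
count = 0

def fresh():
    global count
    count += 1
    return f"t{count}"

def lower(tree, out):
    if isinstance(tree, str):      # a leaf: variable name or literal
        return tree
    op, left, right = tree
    a = lower(left, out)
    b = lower(right, out)
    t = fresh()                    # new name for this intermediate value
    out.append(f"{t} <- {a} {op} {b}")
    return t

code = []
result = lower(("-", "x", ("*", "2", "y")), code)
code.append(f"z <- {result}")
# code == ["t1 <- 2 * y", "t2 <- x - t1", "z <- t2"]
```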

Three-Address Code: Quadruples

A naive representation of three-address code:
- Table of k * 4 small integers
- Simple record structure
- Easy to reorder
- Explicit names

The original FORTRAN compiler used quads.

RISC assembly code:          Quadruples:

    load  r1, y                  load   1  y
    loadi r2, 2                  loadi  2  2
    mult  r3, r2, r1             mult   3  2  1
    load  r4, x                  load   4  x
    sub   r5, r4, r3             sub    5  4  3
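
Because every result name is explicit, a quad is a self-contained record, and reordering is just moving records. A sketch (field layout is illustrative):

```python
# Quadruple table: each statement is one record of op + up to three fields.
quads = [
    ("load",  "r1", "y",  None),
    ("loadi", "r2", 2,    None),
    ("mult",  "r3", "r2", "r1"),
    ("load",  "r4", "x",  None),
    ("sub",   "r5", "r4", "r3"),
]

# Reordering independent quads needs no renumbering: the two leading loads
# name their results explicitly, so swapping them touches nothing else.
quads[0], quads[1] = quads[1], quads[0]
```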

Three-Address Code: Triples

- Index used as an implicit name
- 25% less space consumed than quads
- Much harder to reorder

    (1) load  y
    (2) loadi 2
    (3) mult  (1) (2)
    (4) load  x
    (5) sub   (4) (3)

Implicit names take no space!
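
The same code as a triple table, with the result column gone: a value is named by its row index (0-based here, where the slide numbers from 1). A sketch:

```python
# Triples: operands refer to earlier results by table position.
triples = [
    ("load",  "y",  None),   # (0)
    ("loadi", 2,    None),   # (1)
    ("mult",  0,    1),      # (2) multiplies results of (0) and (1)
    ("load",  "x",  None),   # (3)
    ("sub",   3,    2),      # (4)
]
# Moving an entry renumbers it, so every reference to it must be patched;
# that is why triples are much harder to reorder than quads.
```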

Three-Address Code: Indirect Triples

- List the first triple in each statement
- Implicit name space
- Uses more space than triples, but easier to reorder

    Statement list: (100) (105) ...

    (100) load  y
    (101) loadi 2
    (102) mult  (100) (101)
    (103) load  x
    (104) sub   (103) (102)

The major tradeoff between quads and triples is compactness versus ease of manipulation. In the past, compile-time space was critical; today, speed may be more important.
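
A sketch of the indirection: an execution-order list of triple numbers sits in front of the triple pool, so reordering permutes the list without renumbering any triple:

```python
# Indirect triples: statement order is a separate list of triple numbers.
pool = {
    100: ("load",  "y",  None),
    101: ("loadi", 2,    None),
    102: ("mult",  100,  101),
    103: ("load",  "x",  None),
    104: ("sub",   103,  102),
}
order = [100, 101, 102, 103, 104]

# Reorder the two independent loads by permuting the list only;
# the references inside the pool stay valid.
order[0], order[1] = order[1], order[0]
```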

Static Single Assignment Form

The main idea: each name is defined exactly once. Introduce φ-functions to make it work.

Original code:

    x <- ...
    y <- ...
    while (x < k)
        x <- x + 1
        y <- y + x

SSA-form:

    x0 <- ...
    y0 <- ...
    if (x0 >= k) goto next
    loop: x1 <- φ(x0, x2)
          y1 <- φ(y0, y2)
          x2 <- x1 + 1
          y2 <- y1 + x2
          if (x2 < k) goto loop
    next: ...

Strengths of SSA-form:
- Sharper analysis
- φ-functions give hints about placement
- (Sometimes) faster algorithms
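
The defining invariant can be checked mechanically: every name is the target of exactly one assignment. A sketch over the loop above (statements encoded as name/expression pairs; the φ right-hand sides are kept as strings since this only checks definitions):

```python
# Verify the SSA invariant: each name is defined exactly once.
from collections import Counter

ssa = [
    ("x0", "..."),
    ("y0", "..."),
    ("x1", "phi(x0, x2)"),
    ("y1", "phi(y0, y2)"),
    ("x2", "x1 + 1"),
    ("y2", "y1 + x2"),
]

defs = Counter(name for name, _ in ssa)
assert all(count == 1 for count in defs.values())  # single assignment holds
```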

Two-Address Code

Allows statements of the form x <- x op y, with one operator (op) and, at most, two names (x and y).

Example: z <- x - 2 * y becomes

    t1 <- 2
    t2 <- load y
    t2 <- t2 * t1
    z  <- load x
    z  <- z - t2

Can be very compact.

Problems:
- Machines no longer rely on destructive operations
- Difficult name space
- Destructive operations make reuse hard

A good model for machines with destructive ops (PDP-11).
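
A sketch of an interpreter for this form, which makes the destructive update explicit: each arithmetic op overwrites its first operand, so the value 2*y held in t2 is destroyed by the multiply's own result (instruction names and env are illustrative):

```python
# Two-address interpreter: dst <- dst op src, overwriting dst in place.
def run2addr(code, env):
    for dst, op, src in code:
        if op == "load":
            env[dst] = env[src] if isinstance(src, str) else src
        elif op == "mul":
            env[dst] = env[dst] * env[src]   # destroys the old value of dst
        elif op == "sub":
            env[dst] = env[dst] - env[src]
    return env

env = run2addr([("t1", "load", 2),
                ("t2", "load", "y"),
                ("t2", "mul", "t1"),
                ("z",  "load", "x"),
                ("z",  "sub", "t2")],
               {"x": 10, "y": 3})
print(env["z"])  # 10 - 2*3 = 4
```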

Control-Flow Graph

Models the transfer of control in the procedure:
- Nodes in the graph are basic blocks: maximal-length sequences of straight-line code
- Blocks can be represented with quads or any other linear representation
- Edges in the graph represent control flow

Example:

        a <- 2
        b <- 5
        if (x = y)
        /      \
    a <- 3    b <- 4
        \      /
       c <- a * b
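
A sketch of the example as a CFG data structure: blocks map to their straight-line code, edges record successors, and predecessors fall out by inverting the edge map (block names B0-B3 are illustrative labels):

```python
# Basic blocks as nodes (here holding source-level statements),
# directed edges as successor lists.
blocks = {
    "B0": ["a <- 2", "b <- 5", "if (x = y)"],
    "B1": ["a <- 3"],
    "B2": ["b <- 4"],
    "B3": ["c <- a * b"],
}
edges = {"B0": ["B1", "B2"], "B1": ["B3"], "B2": ["B3"], "B3": []}

# Predecessors come from inverting the successor map.
preds = {b: [] for b in blocks}
for src, dsts in edges.items():
    for d in dsts:
        preds[d].append(src)
# Both sides of the branch flow into B3, where a * b is computed.
```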

Using Multiple Representations

Source Code -> Front End -> IR 1 -> Middle End -> IR 2 -> Middle End -> IR 3 -> Back End -> Target Code

- Repeatedly lower the level of the intermediate representation
- Each intermediate representation is suited to certain optimizations

Example: the Open64 compiler's WHIRL intermediate format consists of 5 different IRs that are progressively more detailed.

Memory Models

Two major models:

Register-to-register model
- Keep all values that can legally be stored in a register in registers
- Ignore machine limitations on the number of registers
- The compiler back end must insert loads and stores

Memory-to-memory model
- Keep all values in memory
- Only promote values to registers directly before they are used
- The compiler back end can remove loads and stores

Compilers for RISC machines usually use register-to-register:
- Reflects the programming model
- Easier to determine when registers are used

The Rest of the Story

Representing the code is only part of an IR. There are other necessary components:
- Symbol table (already discussed)
- Constant table: representation, type, storage class, offset
- Storage map: overall storage layout, overlap information, virtual register assignments