Intermediate Representations

Where In The Course Are We?
Rest of the course: the compiler writer needs to choose among alternatives.
- Choices affect the quality of the compiled code and the time to compile
- There may be no best answer

Intermediate Representations
Source Code → Front End → IR → Middle End → IR → Back End → Target Code
- Front end: produces an intermediate representation (IR)
- Middle end: transforms the IR into an equivalent IR that runs more efficiently
- Back end: transforms the IR into native code
- The IR encodes the compiler's knowledge of the program
- The middle end usually consists of many passes

JikesRVM (IBM open-source Java JIT compiler)
- Middle end: HIR = High-level Intermediate Representation
- Middle end: LIR = Low-level Intermediate Representation
- Back end: MIR = Machine Intermediate Representation

Intermediate Representations
- Decisions in IR design affect the speed and efficiency of the compiler
- The importance of different properties varies between compilers
- Selecting the right IR for a compiler is critical

Some Important IR Properties
- Ease of generation → speed of compilation
- Ease of manipulation → improved passes
- Procedure size → compilation footprint
- Level of abstraction → improved passes

Types of Intermediate Representations
Three major categories:
- Structural
  - Graphically oriented
  - Heavily used in source-to-source translators
  - Tend to be large
  - Examples: trees, DAGs
- Linear
  - Pseudo-code for an abstract machine
  - Level of abstraction varies
  - Simple, compact data structures
  - Easier to rearrange
  - Examples: three address code, stack machine code
- Hybrid
  - Combination of graphs and linear code
  - Example: control-flow graph

Level of Abstraction
Two different representations of an array reference:
- High-level AST: good for memory disambiguation
    subscript
      A  i  j
- Low-level linear code: good for address calculation
    loadi 1      => r1
    sub   rj, r1 => r2
    loadi 10     => r3
    mult  r2, r3 => r4
    sub   ri, r1 => r5
    add   r4, r5 => r6
    loadi @A     => r7
    add   r7, r6 => r8
    load  r8     => rAij
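
The low-level sequence above can be traced one instruction at a time. The following is only a sketch: the function name `array_ref_address` and the parameter `rows` are made up for illustration, and, like the slide's code, it does not scale the offset by an element size.

```python
def array_ref_address(base, i, j, rows=10):
    """Mirror the slide's linear code; each local stands for one register."""
    r1 = 1            # loadi 1      => r1
    r2 = j - r1       # sub   rj, r1 => r2
    r3 = rows         # loadi 10     => r3  (the constant 10 on the slide)
    r4 = r2 * r3      # mult  r2, r3 => r4
    r5 = i - r1       # sub   ri, r1 => r5
    r6 = r4 + r5      # add   r4, r5 => r6
    r7 = base         # loadi @A     => r7
    r8 = r7 + r6      # add   r7, r6 => r8
    return r8         # the final load would read memory at this address


# Hypothetical base address 1000: element (1, 1) sits at the base itself.
first = array_ref_address(1000, 1, 1)
other = array_ref_address(1000, 3, 2)
```

The high-level `subscript` node hides all of this arithmetic, which is why it suits disambiguation, while the expanded form exposes each step to optimization.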

Level of Abstraction
- Structural IRs are usually considered high-level
- Linear IRs are usually considered low-level
- Not necessarily true:
  - Low-level AST: load(+(@A, +(*(-(j, 1), 10), -(i, 1))))
  - High-level linear code: loadarray A,i,j

Abstract Syntax Tree
An abstract syntax tree (AST) is a parse tree with the nodes for most nonterminals removed.
Example: x - 2 * y
[Figure: the parse tree (Goal, Expr, Term, and Fact. nodes above the leaves <id,x>, <num,2>, <id,y>) beside the abstract syntax tree, in which - has the children x and *, and * has the children 2 and y]
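
As a minimal sketch of what an AST for x - 2 * y might look like in code, the classes below keep only operators and leaves. The names `Id`, `Num`, `BinOp`, `count_nodes`, and `to_infix` are assumptions made for this example, not part of the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Id:
    name: str


@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class BinOp:
    op: str
    left: object
    right: object


def count_nodes(node):
    """Count the nodes in the tree (interior operators plus leaves)."""
    if isinstance(node, BinOp):
        return 1 + count_nodes(node.left) + count_nodes(node.right)
    return 1


def to_infix(node):
    """Print the tree back as a fully parenthesized expression."""
    if isinstance(node, BinOp):
        return f"({to_infix(node.left)} {node.op} {to_infix(node.right)})"
    return node.name if isinstance(node, Id) else str(node.value)


# x - 2 * y: the multiplication binds tighter, so it is the right child of '-'.
tree = BinOp("-", Id("x"), BinOp("*", Num(2), Id("y")))
```

The tree has five nodes, none of which corresponds to a grammar nonterminal such as Expr or Term.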

Directed Acyclic Graph
A directed acyclic graph (DAG) is an AST with a unique node for each value.
- Makes sharing explicit
- Encodes redundancy
Example:
    z ← x - 2 * y
    w ← x / 2
[Figure: DAG for the two assignments above]
With two copies of the same expression, the compiler might be able to arrange the code to evaluate it only once.
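
One common way to get "a unique node for each value" is to look every node up in a table before creating it. The sketch below assumes that approach; the class name `DagBuilder` is hypothetical, and the code ignores the possibility that a variable is reassigned between two uses, which a real compiler would have to handle.

```python
class DagBuilder:
    """Create each distinct (op, left, right) combination exactly once."""

    def __init__(self):
        self.nodes = []   # node id -> (op, left id, right id)
        self.index = {}   # (op, left id, right id) -> node id

    def node(self, op, left=None, right=None):
        key = (op, left, right)
        if key not in self.index:          # first time this value is seen
            self.index[key] = len(self.nodes)
            self.nodes.append(key)
        return self.index[key]             # otherwise reuse the old node


b = DagBuilder()
# z <- x - 2 * y
z = b.node("-", b.node("x"), b.node("*", b.node("2"), b.node("y")))
# w <- x / 2 reuses the existing nodes for x and 2
w = b.node("/", b.node("x"), b.node("2"))
```

Separate trees for the two expressions would need eight nodes; the DAG built here holds six, because `x` and `2` are shared.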

Stack Machine Code
Originally used for stack-based computers, now used for Java.
Example: x - 2 * y becomes
    push x
    push 2
    push y
    multiply
    subtract

Stack Machine Code
- Operations take operands from a stack
- Compact form
- A form of one-address code
- Introduced names are implicit, not explicit
- Simple to generate and execute code
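
To illustrate why this form is simple to generate and execute, here is a small sketch: a postorder walk emits the code and a short loop runs it. Expressions are written as nested tuples `(op, left, right)`; that encoding and the names `gen_stack`, `run`, and `OPS` are assumptions for the example. `subtract` is taken to compute the second-from-top value minus the top value.

```python
OPS = {"-": "subtract", "*": "multiply"}


def gen_stack(node):
    """Postorder walk: code for both operands, then the operator."""
    if isinstance(node, tuple):
        op, left, right = node
        return gen_stack(left) + gen_stack(right) + [OPS[op]]
    return [f"push {node}"]


def run(code, env):
    """Execute the stack code; env maps variable names to values."""
    stack = []
    for instr in code:
        if instr.startswith("push "):
            operand = instr.split()[1]
            stack.append(env[operand] if operand in env else int(operand))
        else:
            right = stack.pop()   # top of stack is the right operand
            left = stack.pop()
            stack.append(left - right if instr == "subtract" else left * right)
    return stack.pop()


code = gen_stack(("-", "x", ("*", 2, "y")))
result = run(code, {"x": 10, "y": 3})
```

No temporary names appear anywhere in the generated code; the stack positions stand in for them.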

Stack Machine Code Advantages
Example: x - 2 * y
    push x
    push 2
    push y
    multiply
    subtract
- With named temporaries, the result is stored in a temporary: an explicit name for the result
- Here, multiply pops two items off the stack and pushes the result: an implicit name for the result

Three Address Code
There are different representations of three address code. In general, three address code has statements of the form
    x ← y op z
with 1 operator (op) and (at most) 3 names (x, y, and z).

Three Address Code
Example: z ← x - 2 * y becomes
    t ← 2 * y
    z ← x - t
Explicit name for the result.
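
The translation in this example can be sketched as a tree walk that invents a fresh temporary for every operator. Expressions are nested tuples `(op, left, right)`; the function names and the `t1`, `t2`, ... naming scheme are assumptions for illustration (the slide simply calls its temporary `t`).

```python
import itertools


def gen_tac(node, code, temps):
    """Return the name holding node's value, appending statements to code."""
    if not isinstance(node, tuple):
        return str(node)                      # a leaf already has a name
    op, left, right = node
    l = gen_tac(left, code, temps)
    r = gen_tac(right, code, temps)
    t = f"t{next(temps)}"                     # explicit name for the result
    code.append((t, l, op, r))
    return t


def lower_assignment(target, expr):
    """Lower target <- expr into three address statements."""
    code = []
    result = gen_tac(expr, code, itertools.count(1))
    if not code:                              # plain copy, no operator
        return [f"{target} ← {result}"]
    _, l, op, r = code.pop()                  # write the last result to target
    code.append((target, l, op, r))
    return [f"{d} ← {l} {op} {r}" for d, l, op, r in code]


tac = lower_assignment("z", ("-", "x", ("*", 2, "y")))
```

Each statement has one operator and at most three names, matching the general form x ← y op z.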

Three Address Code Advantages
- Resembles many real (RISC) machines
- Introduces a new set of names
- Compact form

Three Address Code: Simple Array
- Naïve representation of three address code
- Table of k * 4 small integers: an opcode, a destination, and two operands

    RISC assembly code      Simple array
    load  r1, y             load   1  y
    loadi r2, 2             loadi  2  2
    mult  r3, r2, r1        mult   3  2  1
    load  r4, x             load   4  x
    sub   r5, r4, r3        sub    5  4  3
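
A rough sketch of the simple array follows, stored as one flat list with four fields per instruction. For readability it keeps opcodes and variable names as strings rather than small integers, and the helper names `instruction` and `swap` are hypothetical.

```python
FIELDS = 4  # opcode, destination, operand 1, operand 2

table = [
    "load",  1, "y", None,
    "loadi", 2, 2,   None,
    "mult",  3, 2,   1,
    "load",  4, "x", None,
    "sub",   5, 4,   3,
]


def instruction(tbl, k):
    """Return the k-th instruction as a 4-tuple."""
    return tuple(tbl[k * FIELDS:(k + 1) * FIELDS])


def swap(tbl, a, b):
    """Reorder two instructions; all four fields of each must be copied."""
    ia, ib = a * FIELDS, b * FIELDS
    tbl[ia:ia + FIELDS], tbl[ib:ib + FIELDS] = (
        tbl[ib:ib + FIELDS],
        tbl[ia:ia + FIELDS],
    )
```

Reordering is possible but touches every field of both rows, and inserting an instruction in the middle would shift everything after it.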

Three Address Code: Array of Pointers
- The index causes a level of indirection
- Easy (and cheap) to reorder
- Easy to add (delete) instructions

Three Address Code: Linked List
- No additional array of indirection
- Easier (and cheaper) to reorder than the simple table
- Easy to add (delete) instructions
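
A minimal sketch of the linked-list form, assuming a singly linked list; the class `Instr` and the helper names are made up for this example. Adding or deleting an instruction only rewires one pointer, with no shifting of later entries.

```python
class Instr:
    def __init__(self, text, next=None):
        self.text = text
        self.next = next


def from_list(texts):
    """Build a linked list of instructions from a Python list of strings."""
    head = None
    for text in reversed(texts):
        head = Instr(text, head)
    return head


def insert_after(node, text):
    """Add a new instruction directly after node."""
    node.next = Instr(text, node.next)
    return node.next


def delete_after(node):
    """Remove the instruction directly after node, if there is one."""
    if node.next is not None:
        node.next = node.next.next


def to_list(head):
    out = []
    while head is not None:
        out.append(head.text)
        head = head.next
    return out


head = from_list(["load r1, y", "mult r3, r2, r1"])
insert_after(head, "loadi r2, 2")
```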

Control-Flow Graphs
[Figure: a linear IR beside the corresponding hybrid IR, a CFG]

Control-Flow Graphs
- Node: an instruction or sequence of instructions (a basic block)
  - Two instructions i, j are in the same basic block iff execution of i guarantees execution of j
- Directed edge: potential flow of control
- Distinguished start node, Entry: the first instruction in the program

Control-Flow Graph
- Models the transfer of control in the procedure
- Nodes in the graph are basic blocks
  - Maximal-length sequences of straight-line code
  - Can be represented with quads or any other linear representation
- Edges in the graph represent control flow
Example
[Figure: CFG in which the block "if (x = y)" branches to the blocks "a ← 2; b ← 5" and "a ← 3; b ← 4", both of which flow into the block "c ← a * b"]
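
Basic blocks can be found in linear code with the usual "leader" rule: the first instruction, every branch target, and every instruction that follows a branch starts a new block. The sketch below assumes a hypothetical encoding in which each instruction is `(label, text, targets)` and a branch lists all of its successors; the labels `L1` to `L3`, the `cbr` and `jmp` mnemonics, and the choice of which arm is taken when x = y are illustrative only. The input must be non-empty.

```python
def build_cfg(code):
    """Split linear code into basic blocks and collect control-flow edges."""
    label_at = {label: i for i, (label, _, _) in enumerate(code) if label}
    leaders = {0}                                  # first instruction
    for i, (_, _, targets) in enumerate(code):
        if targets:
            leaders.update(label_at[t] for t in targets)   # branch targets
            if i + 1 < len(code):
                leaders.add(i + 1)                 # instruction after a branch
    starts = sorted(leaders)
    blocks = [code[s:e] for s, e in zip(starts, starts[1:] + [len(code)])]
    block_of = {s: n for n, s in enumerate(starts)}
    edges = set()
    for n, block in enumerate(blocks):
        _, _, targets = block[-1]
        for t in targets:
            edges.add((n, block_of[label_at[t]]))
        if not targets and n + 1 < len(blocks):    # falls into the next block
            edges.add((n, n + 1))
    return blocks, edges


code = [
    (None, "cbr x = y", ["L1", "L2"]),
    ("L1", "a ← 2", []),
    (None, "b ← 5", []),
    (None, "jmp", ["L3"]),
    ("L2", "a ← 3", []),
    (None, "b ← 4", []),
    ("L3", "c ← a * b", []),
]
blocks, edges = build_cfg(code)
```

For this input the result is four blocks with a diamond of edges: the test block leads to both arms, and both arms lead to the block computing `c`.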

Using Multiple Representations
Source Code → Front End → IR 1 → Middle End → IR 2 → Middle End → IR 3 → Back End → Target Code
- Repeatedly lower the level of the intermediate representation
- Each intermediate representation is suited to certain optimizations

Memory Models
Two major models:
- Register-to-register model
  - Keep all values that can legally be stored in a register in registers
  - Ignore machine limitations on the number of registers
  - Compiler back end must insert loads and stores
- Memory-to-memory model
  - Keep all values in memory
  - Only promote values to registers directly before they are used
  - Compiler back end can remove loads and stores
- Compilers usually use register-to-register
  - Reflects programming model
  - Easier to determine when registers are used
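
To make the contrast concrete, the sketch below emits code for a single addition `a ← b + c` under each model. The instruction text is simplified pseudo-assembly in the style of the earlier slides, not an exact instruction set, and the function name `lower_add` is hypothetical.

```python
def lower_add(dest, left, right, model):
    """Emit pseudo-assembly for dest <- left + right under one memory model."""
    if model == "register-to-register":
        # Each value is assumed to live in its own virtual register.
        return [f"add r{left}, r{right} => r{dest}"]
    if model == "memory-to-memory":
        # Values live in memory and are promoted to registers just before use.
        return [
            f"load @{left} => r1",
            f"load @{right} => r2",
            "add r1, r2 => r3",
            f"store r3 => @{dest}",
        ]
    raise ValueError(f"unknown model: {model}")


reg_code = lower_add("a", "b", "c", "register-to-register")
mem_code = lower_add("a", "b", "c", "memory-to-memory")
```

In the first form the back end has to add loads and stores wherever it runs out of physical registers; in the second it starts with them everywhere and can delete the ones it proves unnecessary.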

The Rest of the Story
Representing the code is only part of an IR. There are other necessary components:
- Symbol table
- Constant table
  - Representation, type
  - Storage class, offset
- Storage map
  - Overall storage layout
  - Overlap information
  - Virtual register assignments

Symbol Tables
The traditional approach to building a symbol table uses hashing.
- One-table scheme
- Lots of wasted space
[Figure: hash table in which h("fee"), h("fie"), h("foe"), and h("fum") select the slots holding the records fee (integer, scalar), fum (float, scalar), fie (char *, array), and foe]
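
A hashed symbol table along these lines might be sketched as follows. The hash function (sum of character codes modulo the table size) and the use of per-slot chains to resolve collisions are assumptions chosen for the example; the slide does not say which scheme it has in mind.

```python
class SymbolTable:
    def __init__(self, size=8):
        # Fixed number of slots; unused slots are the "wasted space".
        self.buckets = [[] for _ in range(size)]

    def _h(self, name):
        """Hypothetical hash: sum of character codes modulo the table size."""
        return sum(map(ord, name)) % len(self.buckets)

    def insert(self, name, **attrs):
        bucket = self.buckets[self._h(name)]
        for key, record in bucket:
            if key == name:            # already present: update the record
                record.update(attrs)
                return
        bucket.append((name, dict(attrs)))

    def lookup(self, name):
        for key, record in self.buckets[self._h(name)]:
            if key == name:
                return record
        return None


table = SymbolTable()
table.insert("fee", type="integer", kind="scalar")
table.insert("fum", type="float", kind="scalar")
table.insert("fie", type="char *", kind="array")
```

A lookup hashes the name once and then scans only the entries that share its slot.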