Compiler Construction

Similar documents
CS 132 Compiler Construction

CS 352: Compilers: Principles and Practice

Compiler Construction 1. Introduction. Oscar Nierstrasz

Compiler Construction 1. Introduction. Oscar Nierstrasz

Fundamentals of Compilation

CS415 Compilers Overview of the Course. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

PART ONE Fundamentals of Compilation

Compiling Techniques

CPSC 411: Introduction to Compiler Construction

The View from 35,000 Feet

Compilers and Interpreters

Overview of a Compiler

High-level View of a Compiler

Academic Formalities. CS Modern Compilers: Theory and Practise. Images of the day. What, When and Why of Compilers

Compilation 2014 Warm-up project

Compiler Design (40-414)

Introduction to Compiler

CS415 Compilers Overview of the Course. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

Academic Formalities. CS Language Translators. Compilers A Sangam. What, When and Why of Compilers

Overview of a Compiler

Overview of a Compiler

Modern Compiler Implementation in ML

Programming Language Processor Theory

Alternatives for semantic processing

COMP 181 Compilers. Administrative. Last time. Prelude. Compilation strategy. Translation strategy. Lecture 2 Overview

Front End. Hwansoo Han

CS Lecture 2. The Front End. Lecture 2 Lexical Analysis

Part 5 Program Analysis Principles and Techniques

Introduction to Compilers

Undergraduate Compilers in a Day

Topic 3: Syntax Analysis I

Compilers. Lecture 2 Overview. (original slides by Sam

CA Compiler Construction

Compiler Construction LECTURE # 1

CS 4201 Compilers 2014/2015 Handout: Lab 1

Compilers and Code Optimization EDOARDO FUSELLA

Introduction to Parsing

CSE 3302 Programming Languages Lecture 2: Syntax

CS415 Compilers. Syntax Analysis. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

COLLEGE OF ENGINEERING, NASHIK. LANGUAGE TRANSLATOR

Abstract Syntax Trees

JavaCC. JavaCC File Structure. JavaCC is a Lexical Analyser Generator and a Parser Generator. The structure of a JavaCC file is as follows:

A simple syntax-directed

Abstract Syntax. Mooly Sagiv. html://

Formal Languages and Compilers Lecture I: Introduction to Compilers

CS 314 Principles of Programming Languages

Compiling Regular Expressions COMP360

Goals for course What is a compiler and briefly how does it work? Review everything you should know from 330 and before

Lexical Scanning COMP360

Topic 5: Syntax Analysis III

Chapter 4: LR Parsing

Parsing II Top-down parsing. Comp 412

CSE450 Translation of Programming Languages. Lecture 4: Syntax Analysis

Acknowledgement. CS Compiler Design. Intermediate representations. Intermediate representations. Semantic Analysis - IR Generation

Intermediate Representations

Crafting a Compiler with C (II) Compiler V. S. Interpreter

Lexical Analysis. Introduction

CS 406/534 Compiler Construction Putting It All Together

A Simple Syntax-Directed Translator

6. Intermediate Representation

CS415 Compilers. Lexical Analysis

Compiler Theory Introduction and Course Outline Sandro Spina Department of Computer Science

CST-402(T): Language Processors

The Zephyr Abstract Syntax Description Language

CSE 582 Autumn 2002 Exam 11/26/02

Intermediate Representations

About the Tutorial. Audience. Prerequisites. Copyright & Disclaimer. Compiler Design

Introduction to Parsing. Comp 412

Theory and Compiling COMP360

CS 406/534 Compiler Construction Parsing Part I

What do Compilers Produce?

9/5/17. The Design and Implementation of Programming Languages. Compilation. Interpretation. Compilation vs. Interpretation. Hybrid Implementation

6. Intermediate Representation!

Formats of Translated Programs

CPS 506 Comparative Programming Languages. Syntax Specification

CS152 Programming Language Paradigms Prof. Tom Austin, Fall Syntax & Semantics, and Language Design Criteria

Acknowledgement. CS Compiler Design. Semantic Processing. Alternatives for semantic processing. Intro to Semantic Analysis. V.

Compiler Design. Computer Science & Information Technology (CS) Rank under AIR 100

COP 3402 Systems Software Syntax Analysis (Parser)

Compiler Code Generation COMP360

CPSC 411, 2015W Term 2 Midterm Exam Date: February 25, 2016; Instructor: Ron Garcia

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING Subject Name: CS2352 Principles of Compiler Design Year/Sem : III/VI

Introduction. Compilers and Interpreters

LECTURE 3. Compiler Phases

About the Authors... iii Introduction... xvii. Chapter 1: System Software... 1

Interpreter. Scanner. Parser. Tree Walker. read. request token. send token. send AST I/O. Console

Defining Program Syntax. Chapter Two Modern Programming Languages, 2nd ed. 1

The Structure of a Syntax-Directed Compiler

Chapter 3: CONTEXT-FREE GRAMMARS AND PARSING Part 1

Principles of Compiler Construction ( )

The compilation process is driven by the syntactic structure of the program as discovered by the parser

IR Optimization. May 15th, Tuesday, May 14, 13

3. Parsing. Oscar Nierstrasz

EECS 6083 Intro to Parsing Context Free Grammars

Appendix A The DL Language

Lecture Compiler Middle-End

Parsing III. CS434 Lecture 8 Spring 2005 Department of Computer Science University of Alabama Joel Jones

COSE312: Compilers. Lecture 1 Overview of Compilers

More on Syntax. Agenda for the Day. Administrative Stuff. More on Syntax In-Class Exercise Using parse trees

Stating the obvious, people and computers do not speak the same language.

Transcription:

Compiler Construction Important facts: Name: Prof. Dr. Peter Thiemann Email: thiemann@informatik.uni-freiburg.de Office: 079-00-015 Exercises: Name: Dipl.-Inform. Manuel Geffken Email: geffken@informatik.uni-freiburg.de Office: 079-00-014 Basis for grades: 50% homeworks 50% final exam Copyright c 2012 by Antony L. Hosking. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and full citation on the first page. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or fee. Request permission to publish from hosking@cs.purdue.edu. Compiler Construction Introduction 1

Things to do read Appel chapter 1 make sure you have a working account start brushing up on Java review Java development tools find the official course webpage http://proglang.informatik.uni-freiburg.de/teaching/compilerbau/2012ws/ Compiler Construction Introduction 2

Compilers What is a compiler? a program that reads an executable program in one language and translates it into an executable program in another language we expect the program produced by the compiler to be better, in some way, than the original What is an interpreter? a program that reads an executable program and produces the results of running that program usually, this involves executing the source program in some fashion This course deals mainly with compilers Many of the same issues arise in interpreters Compiler Construction Introduction 3

Motivation Why study compiler construction? Why build compilers? Why attend class? Compiler Construction Introduction 4

Interest Compiler construction is a microcosm of computer science artificial intelligence greedy algorithms, learning algorithms algorithms graph algorithms, union-find, dynamic programming theory DFAs for scanning, parser generators, lattice theory systems allocation and naming, locality, synchronization architecture pipeline management, hierarchy management, instruction set use Inside a compiler, all these things come together Compiler Construction Introduction 5

Isn t it a solved problem? Machines are constantly changing Changes in architecture changes in compilers new features pose new problems changing costs lead to different concerns old solutions need re-engineering Changes in compilers should prompt changes in architecture New languages and features Compiler Construction Introduction 6

Intrinsic Merit Compiler construction is challenging and fun interesting problems primary responsibility for performance (blame) new architectures new challenges real results extremely complex interactions Compilers have an impact on how computers are used Some of the most interesting problems in computing Compiler Construction Introduction 7

Experience You have used several compilers What qualities are important in a compiler? Compiler Construction Introduction 8

Experience You have used several compilers What qualities are important in a compiler? 1. Correct 2. Output runs fast 3. Compiler runs fast 4. Compile time proportional to program size 5. Support for separate compilation 6. Good diagnostics for syntax errors 7. Works well with the debugger 8. Good diagnostics for flow anomalies 9. Cross language calls 10. Consistent, predictable optimization Each of these shapes your expectations about this course Compiler Construction Introduction 9

Abstract view source compiler machine errors Implications: recognize legal (and illegal) programs generate correct manage storage of all variables and agreement on format for object (or assembly) Big step up from assembler higher level notations Compiler Construction Introduction 10

Traditional two pass compiler source front end IR back end machine errors Implications: intermediate representation (IR) front end maps legal into IR back end maps IR onto target machine simplify retargeting allows multiple front ends multiple passes better Compiler Construction Introduction 11

A fallacy FORTRAN front end C++ CLU Smalltalk front end front end front end back end back end back end target1 target2 target3 Can we build n m compilers with n + m components? must en all the knowledge in each front end must represent all the features in one IR must handle all the features in each back end Limited success with low-level IRs but LLVM Compiler Construction Introduction 12

Front end source tokens scanner parser IR errors Responsibilities: recognize legal procedure report errors produce IR preliminary storage map shape the for the back end Much of front end construction can be automated Compiler Construction Introduction 13

Front end source tokens scanner parser IR Scanner: errors maps characters into tokens the basic unit of syntax x = x + y; becomes <id, x> = <id, x> + <id, y> ; character string value for a token is a lexeme typical tokens: number, id, +, -, *, /, do, end eliminates white space (tabs, blanks, comments) a key issue is speed use specialized recognizer (as opposed to lex) Compiler Construction Introduction 14

Front end source tokens scanner parser IR Parser: errors recognize context-free syntax guide context-sensitive analysis construct IR(s) produce meaningful error messages attempt error correction Parser generators mechanize much of the work Compiler Construction Introduction 15

Front end Context-free syntax is specified with a grammar <sheep noise> ::= baa baa <sheep noise> The noises sheep make under normal circumstances This format is called Backus-Naur form (BNF) Formally, a grammar G = (S,N,T,P) where S is the start symbol N is a set of non-terminal symbols T is a set of terminal symbols P is a set of productions or rewrite rules (P : N N T ) Compiler Construction Introduction 16

Front end Context free syntax can be put to better use 1 <goal> ::= <expr> 2 <expr> ::= <expr> <op> <term> 3 <term> 4 <term> ::= number 5 id 6 <op> ::= + 7 - Simple expressions with addition and subtraction over tokens id and number S = <goal> T = number, id, +, - N = <goal>, <expr >, <term>, <op> P = 1, 2, 3, 4, 5, 6, 7 Compiler Construction Introduction 17

Front end Given a grammar, valid sentences can be derived by repeated substitution. Prod n. Result <goal> 1 <expr> 2 <expr> <op> <term> 5 <expr> <op> y 7 <expr> - y 2 <expr> <op> <term> - y 4 <expr> <op> 2 - y 6 <expr> + 2 - y 3 <term> + 2 - y 5 x + 2 - y To recognize a valid sentence in some CFG, we reverse this process and build up a parse Compiler Construction Introduction 18

Front end A parse can be represented by a parse, or syntax, tree goal expr expr op term expr op term - <id: y> term + <num:2> <id:x> A parse tree contains a lot of unnecessary information Compiler Construction Introduction 19

Front end So, compilers often use a more concise abstract syntax tree - + <id: y> <id:x> <num: 2> Abstract syntax trees (ASTs) are often used as an IR between front end and back end Compiler Construction Introduction 20

Back end IR instruction selection register allocation machine errors Responsibilities translate IR into target machine choose instructions for each IR operation decide what to keep in registers at each point ensure conformance with system interfaces Automation has been less successful here Compiler Construction Introduction 21

Back end IR instruction selection register allocation machine errors Instruction selection: produce compact, fast use available addressing modes pattern matching problem ad hoc techniques tree pattern matching string pattern matching dynamic programming Compiler Construction Introduction 22

Back end IR instruction selection register allocation machine errors Register Allocation: have value in a register when used limited resources changes instruction choices can move loads and stores optimal allocation is difficult Modern allocators often rely on graph coloring Compiler Construction Introduction 23

Traditional three pass compiler source front IR middle IR back end end end machine errors Code Improvement analyzes and changes IR goal is to reduce runtime must preserve values Compiler Construction Introduction 24

Optimizer (middle end) IR opt1 IR... IR opt n IR errors Modern optimizers are usually built as a set of passes Typical passes constant propagation and folding motion reduction of operator strength common subexpression elimination redundant store elimination dead elimination Compiler Construction Introduction 25

The MiniJava compiler Pass 2 Source Program Lex Tokens Canoncalize Environments Pass 1 Tables Pass 3 Pass 4 Parse Reductions Parsing Actions Abstract Syntax Semantic Analysis Translate Translate Frame IR Trees IR Trees Instruction Selection Assem Frame Layout Assem Control Flow Analysis Flow Graph Data Flow Analysis Interference Graph Register Allocation Register Assignment Code Emission Assembly Language Assembler Pass 5 Pass 6 Pass 7 Pass 8 Pass 9 Relocatable Object Code Linker Pass 10 Machine Language Compiler Construction Introduction 26

The MiniJava compiler phases Lex Parse Parsing Actions Semantic Analysis Frame Layout Translate Canonicalize Instruction Selection Control Flow Analysis Data Flow Analysis Register Allocation Code Emission Break source file into individual words, or tokens Analyse the phrase structure of program Build a piece of abstract syntax tree for each phrase Determine what each phrase means, relate uses of variables to their definitions, check types of expressions, request translation of each phrase Place variables, function parameters, etc., into activation records (stack frames) in a machine-dependent way Produce intermediate representation trees (IR trees), a notation that is not tied to any particular source language or target machine Hoist side effects out of expressions, and clean up conditional branches, for convenience of later phases Group IR-tree nodes into clumps that correspond to actions of targetmachine instructions Analyse sequence of instructions into control flow graph showing all possible flows of control program might follow when it runs Gather information about flow of data through variables of program; e.g., liveness analysis calculates places where each variable holds a still-needed (live) value Choose registers for variables and temporary values; variables not simultaneously live can share same register Replace temporary names in each machine instruction with registers Compiler Construction Introduction 27

A straight-line programming language Stm Stm ; Stm CompoundStm Stm id := Exp AssignStm Stm print ( ExpList ) PrintStm Exp id IdExp Exp num NumExp Exp Exp Binop Exp OpExp Exp ( Stm, Exp ) EseqExp ExpList Exp, ExpList PairExpList ExpList Exp LastExpList Binop + Plus Binop Minus Binop Times Binop / Div An example straight-line program: prints: 8 7 80 a := 5 + 3; b := (print(a,a 1),10 a); print(b) Compiler Construction Introduction 28

Tree representation a := 5 + 3; b := (print(a,a 1),10 a); print(b) CompoundStm AssignStm CompoundStm a OpExp AssignStm PrintStm NumExp Plus NumExp b EseqExp LastExpList 5 3 IdExp PrintStm OpExp b PairExpList NumExp Times IdExp IdExp LastExpList 10 a a OpExp IdExp Minus NumExp a 1 This convenient internal representation is convenient for a compiler to use. Compiler Construction Introduction 29

Java classes for trees abstract class Stm {} class CompoundStm extends Stm { Stm stm1, stm2; CompoundStm(Stm s1, Stm s2) { stm1=s1; stm2=s2; } } class AssignStm extends Stm { String id; Exp exp; AssignStm(String i, Exp e) { id=i; exp=e; } } class PrintStm extends Stm { ExpList exps; PrintStm(ExpList e) { exps=e; } } abstract class Exp {} class IdExp extends Exp { String id; IdExp(String i) {id=i;} } class NumExp extends Exp { int num; NumExp(int n) {num=n;} } class OpExp extends Exp { Exp left, right; int oper; final static int Plus=1,Minus=2,Times=3,Div=4; OpExp(Exp l, int o, Exp r) { left=l; oper=o; right=r; } } class EseqExp extends Exp { Stm stm; Exp exp; EseqExp(Stm s, Exp e) { stm=s; exp=e; } } abstract class ExpList {} class PairExpList extends ExpList { Exp head; ExpList tail; public PairExpList(Exp h, ExpList t) { head=h; tail=t; } } class LastExpList extends ExpList { Exp head; public LastExpList(Exp h) {head=h;} } Compiler Construction Introduction 30