Introduction to Compiler Construction

Similar documents
Introduction to Compiler Construction

Introduction to Compiler Construction

CS 4201 Compilers 2014/2015 Handout: Lab 1

COMPILER DESIGN LECTURE NOTES


Introduction to Compiler Design

PRINCIPLES OF COMPILER DESIGN UNIT I INTRODUCTION TO COMPILERS

PRINCIPLES OF COMPILER DESIGN UNIT I INTRODUCTION TO COMPILING

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

COMP455: COMPILER AND LANGUAGE DESIGN. Dr. Alaa Aljanaby University of Nizwa Spring 2013

Structure of a compiler. More detailed overview of compiler front end. Today we ll take a quick look at typical parts of a compiler.

COP 3402 Systems Software. Lecture 4: Compilers. Interpreters

A Simple Syntax-Directed Translator

Compiler Design (40-414)

Formal Languages and Compilers Lecture I: Introduction to Compilers

CS 321 IV. Overview of Compilation

Introduction to Compiler Construction

Compiler Design. Computer Science & Information Technology (CS) Rank under AIR 100

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

Introduction to Compiler Construction

Lexical Analyzer Scanner

Examples of attributes: values of evaluated subtrees, type information, source file coordinates,

Lexical Analyzer Scanner

A simple syntax-directed

Compilers. Prerequisites

A programming language requires two major definitions A simple one pass compiler

Context-Free Grammar. Concepts Introduced in Chapter 2. Parse Trees. Example Grammar and Derivation

Syntax-Directed Translation

LECTURE NOTES ON COMPILER DESIGN P a g e 2

Earlier edition Dragon book has been revised. Course Outline Contact Room 124, tel , rvvliet(at)liacs(dot)nl

Compiling Regular Expressions COMP360

COMPILER DESIGN. For COMPUTER SCIENCE

CST-402(T): Language Processors

Compilers and Interpreters

Defining Program Syntax. Chapter Two Modern Programming Languages, 2nd ed. 1

Introduction. Compilers and Interpreters

LECTURE 3. Compiler Phases


COMP 181 Compilers. Administrative. Last time. Prelude. Compilation strategy. Translation strategy. Lecture 2 Overview

Group A Assignment 3(2)

9/5/17. The Design and Implementation of Programming Languages. Compilation. Interpretation. Compilation vs. Interpretation. Hybrid Implementation

Introduction. Interpreters High-level language intermediate code which is interpreted. Translators. e.g.

Compiling and Interpreting Programming. Overview of Compilers and Interpreters

CSE450 Translation of Programming Languages. Lecture 4: Syntax Analysis

CPS 506 Comparative Programming Languages. Syntax Specification

CS606- compiler instruction Solved MCQS From Midterm Papers

Symbol Tables. ASU Textbook Chapter 7.6, 6.5 and 6.3. Tsan-sheng Hsu.

Working of the Compilers

Pioneering Compiler Design

CD Assignment I. 1. Explain the various phases of the compiler with a simple example.

COMPILER DESIGN UNIT I LEXICAL ANALYSIS. Translator: It is a program that translates one language to another Language.

Question Bank. 10CS63:Compiler Design

COSE312: Compilers. Lecture 1 Overview of Compilers

Stating the obvious, people and computers do not speak the same language.

Optimization. ASU Textbook Chapter 9. Tsan-sheng Hsu.

Compiler Code Generation COMP360

Semantic Analysis. Outline. The role of semantic analysis in a compiler. Scope. Types. Where we are. The Compiler Front-End

Why are there so many programming languages? Why do we have programming languages? What is a language for? What makes a language successful?

UNIT I- LEXICAL ANALYSIS. 1.Interpreter: It is one of the translators that translate high level language to low level language.

Undergraduate Compilers in a Day

CS Lecture 2. The Front End. Lecture 2 Lexical Analysis

Compiler Design Overview. Compiler Design 1

Software II: Principles of Programming Languages

Compiler Structure. Lexical. Scanning/ Screening. Analysis. Syntax. Parsing. Analysis. Semantic. Context Analysis. Analysis.

1. INTRODUCTION TO LANGUAGE PROCESSING The Language Processing System can be represented as shown figure below.

Life Cycle of Source Program - Compiler Design

The role of semantic analysis in a compiler

Compiling Techniques

Theoretical Part. Chapter one:- - What are the Phases of compiler? Answer:

INTRODUCTION TO COMPILER AND ITS PHASES

Theory and Compiling COMP360

Compilers. Lecture 2 Overview. (original slides by Sam

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING Subject Name: CS2352 Principles of Compiler Design Year/Sem : III/VI

CSE 401 Midterm Exam Sample Solution 11/4/11

CS131: Programming Languages and Compilers. Spring 2017

KINGS COLLEGE OF ENGINEERING DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING ACADEMIC YEAR / EVEN SEMESTER

About the Tutorial. Audience. Prerequisites. Copyright & Disclaimer. Compiler Design

2.2 Syntax Definition

Compilers and Code Optimization EDOARDO FUSELLA

This book is licensed under a Creative Commons Attribution 3.0 License

Program Analysis ( 软件源代码分析技术 ) ZHENG LI ( 李征 )

Introduction to Compiler

Syntax Analysis. Chapter 4

INF5110 Compiler Construction

LANGUAGE TRANSLATORS

COSC121: Computer Systems: Runtime Stack

Principles of Programming Languages COMP251: Syntax and Grammars

Early computers (1940s) cost millions of dollars and were programmed in machine language. less error-prone method needed

When do We Run a Compiler?

UNIT I Programming Language Syntax and semantics. Kainjan Sanghavi

Introduction to Syntax Analysis. Compiler Design Syntax Analysis s.l. dr. ing. Ciprian-Bogdan Chirila

COP4020 Programming Languages. Compilers and Interpreters Robert van Engelen & Chris Lacher

Time : 1 Hour Max Marks : 30

Lexical analysis. Syntactical analysis. Semantical analysis. Intermediate code generation. Optimization. Code generation. Target specific optimization

The Structure of a Syntax-Directed Compiler

COMPILERS BASIC COMPILER FUNCTIONS

Lexical Analysis. Introduction

PRINCIPLES OF COMPILER DESIGN

The Compiler So Far. CSC 4181 Compiler Construction. Semantic Analysis. Beyond Syntax. Goals of a Semantic Analyzer.

Concepts of Programming Languages

Transcription:

Introduction to Compiler Construction ALSU Textbook Chapter 1.1 1.5 Tsan-sheng Hsu tshsu@iis.sinica.edu.tw http://www.iis.sinica.edu.tw/~tshsu 1

What is a compiler? Definitions: a recognizer ; a translator. source program compiler target program Source and target must be equivalent! Compiler writing spans: programming languages; machine architecture; language theory; algorithms and data structures; software engineering. History: 1950: the first FORTRAN compiler took 18 man-years; now: using software tools, can be done in a few months as a student s project. Compiler notes #1, 20070226, Tsan-sheng Hsu 2

Applications High-level programming language compilers. Optimizations for computer architectures. Design of new computer architectures. Translator: from one format to another. query interpreter text formatter silicon compiler infix notation postfix notation: pretty printers Software productivity tools. 3 + 5 6 6 3 5 + 6 6 Compiler notes #1, 20070226, Tsan-sheng Hsu 3

Relations with computational theory a set of grammar rules the definition of a particular machine. also equivalent to a set of languages recognized by this machine. a type of machines: a family of machines with a given set of operations, or capabilities; power of a type of machines the set of languages that can be recognized by this type of machines. Compiler notes #1, 20070226, Tsan-sheng Hsu 4

Flow chart of a typical compiler symbol table source code sequence of characters lexical analyzer (scanner) sequence of tokens syntax analyzer (parser) abstract syntax tree semantic analyzer annoted abstract syntax tree intermediate code generator intermediate code code optimizer optimized intermediate code code generator error handler target code Compiler notes #1, 20070226, Tsan-sheng Hsu 5

Scanner Actions: Reads characters from the source program; Groups characters into go together, following a given pattern ; Each lexeme corresponds to a token. lexemes, i.e., sequences of characters that the scanner returns the next token, plus maybe some additional information, to the parser; The scanner may also discover lexical errors, i.e., erroneous characters. The definitions of what a lexeme, token or bad character is depend on the definition of the source language. Compiler notes #1, 20070226, Tsan-sheng Hsu 6

Lexeme: C sentence Scanner example for C L1: x = y2 + 12; (Lexeme) L1 : x = y2 + 12 ; (Token) ID COLON ID ASSIGN ID PLUS INT SEMI-COL Arbitrary number of blanks between lexemes. Erroneous sequence of characters, comments, for the C language: that are not parts of control characters @ 2abc Compiler notes #1, 20070226, Tsan-sheng Hsu 7

Actions: Parser Group tokens into grammatical phrases, to discover the underlying structure of the source Find syntax errors, e.g., the following C source line: (Lexeme) index = 12 * ; (Token) ID ASSIGN INT TIMES SEMI-COL Every token is legal, but the sequence is erroneous! May find some static semantic errors, e.g., use of undeclared variables or multiple declared variables. May generate code, or build some intermediate representation of the source program, such as an abstract-syntax tree. Compiler notes #1, 20070226, Tsan-sheng Hsu 8

Parser example for C Source code: position = initial + rate 60; Abstract-syntax tree: = position + initial * rate 60 interior nodes of the tree are OPERATORS; a node s children are its OPERANDS; each subtree forms a logical unit. the subtree with at its root shows that has higher precedence than +, the operation rate 60 must be performed as a unit, not initial + rate. Compiler notes #1, 20070226, Tsan-sheng Hsu 9

Semantic analyzer Actions: Check for more static semantic errors, e.g., type errors. May annotate and/or change the abstract syntax tree. = position + initial * rate 60 = position + (float) initial * (float) rate int to float() (float) 60 Compiler notes #1, 20070226, Tsan-sheng Hsu 10

Intermediate code generator Actions: translate from abstract-syntax trees to intermediate codes. One choice for intermediate code is 3-address code : Each statement contains at most 3 operands; in addition to :=, i.e., assignment, at most one operator. An easy and universal format that can be translated into most assembly languages. Example: = position + (float) initial * (float) rate int to float() (float) 60 temp1 := int-to-float(60) temp2 := rate * temp1 temp3 := initial + temp2 position := temp3 Compiler notes #1, 20070226, Tsan-sheng Hsu 11

Optimizer Improve the efficiency of intermediate code. Goal may be to make code run number of registers Example: temp1 := int-to-float(60) temp2 := rate * temp1 temp3 := initial + temp2 position := temp3 faster, and/or to use least temp2 := rate * 60.0 position := initial + temp2 Current trends: to obtain smaller, but maybe slower, equivalent code for embedded systems; to reduce power consumption; to enable parallelism; Compiler notes #1, 20070226, Tsan-sheng Hsu 12

Code generation A compiler may generate pure machine codes (machine dependent assembly language) directly, which is rare now ; virtual machine code. Example: PASCAL compiler P-code interpreter execution Speed is roughly 4 times slower than running directly generated machine codes. Advantages: simplify the job of a compiler; decrease the size of the generated code: 1/3 for P-code ; can be run easily on a variety of platforms P-machine is an ideal general machine whose interpreter can be written easily; divide and conquer; recent example: JAVA and Byte-code. Compiler notes #1, 20070226, Tsan-sheng Hsu 13

Code generation example temp2 := rate * 60.0 position := initial + temp2 = LOADF rate, R 1 MULF #60.0, R 1 LOADF initial, R 2 ADDF R 2, R 1 STOREF R 1, position Compiler notes #1, 20070226, Tsan-sheng Hsu 14

Practical considerations (1/2) Preprocessing phase: macro substitution: #define MAXC 10 rational preprocessing: add new features for old languages. BASIC C C + + compiler directives: #include <stdio.h> non-standard language extensions. adding parallel primitives Compiler notes #1, 20070226, Tsan-sheng Hsu 15

Practical considerations (2/2) Passes of compiling First pass reads the text file once. May need to read the text one more time for any forward addressed objects, i.e., anything that is used before its declaration. goto error handling; Example: C language error handling: Compiler notes #1, 20070226, Tsan-sheng Hsu 16

Each pass takes I/O time. Reduce number of passes Back-patching : leave a blank slot for missing information, and fill in the empty slot when the information becomes available. Example: C language when a label is used if it is not defined before, save a trace into the to-be-processed table label name corresponds to LABEL TABLE[i] code generated: GOTO LABEL TABLE[i] when a label is defined check known labels for redefined labels if it is not used before, save a trace into the to-be-processed table if it is used before, then find its trace and fill the current address into the trace Time and space trade-off! Compiler notes #1, 20070226, Tsan-sheng Hsu 17