Formal Languages and Compilers Lecture I: Introduction to Compilers

Similar documents
Compiler Design (40-414)

CS131: Programming Languages and Compilers. Spring 2017

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

PRINCIPLES OF COMPILER DESIGN UNIT I INTRODUCTION TO COMPILING

CST-402(T): Language Processors

Introduction. Compilers and Interpreters

CA Compiler Construction

Compiler Design Overview. Compiler Design 1

PRINCIPLES OF COMPILER DESIGN UNIT I INTRODUCTION TO COMPILERS

Introduction to Compiler Design

Compiler Construction LECTURE # 1

COP 3402 Systems Software. Lecture 4: Compilers. Interpreters

CS426 Compiler Construction Fall 2006

COMPILER DESIGN. For COMPUTER SCIENCE

LECTURE NOTES ON COMPILER DESIGN P a g e 2

CS 4201 Compilers 2014/2015 Handout: Lab 1

A Simple Syntax-Directed Translator

Philadelphia University Faculty of Information Technology Department of Computer Science --- Semester, 2007/2008. Course Syllabus

Compilers. Pierre Geurts

COMP455: COMPILER AND LANGUAGE DESIGN. Dr. Alaa Aljanaby University of Nizwa Spring 2013


Evaluation Scheme L T P Total Credit Theory Mid Sem Exam

Introduction to Compiler Construction

CS Compiler Construction West Virginia fall semester 2014 August 18, 2014 syllabus 1.0

COMPILER DESIGN LECTURE NOTES

Formal Languages and Compilers Lecture V: Parse Trees and Ambiguous Gr

Life Cycle of Source Program - Compiler Design

Formal Languages and Compilers Lecture VI: Lexical Analysis

Sardar Vallabhbhai Patel Institute of Technology (SVIT), Vasad M.C.A. Department COSMOS LECTURE SERIES ( ) (ODD) Code Optimization

SRM UNIVERSITY FACULTY OF ENGINEERING AND TECHNOLOGY SCHOOL OF COMPUTING DEPARTMENT OF CSE COURSE PLAN

Earlier edition Dragon book has been revised. Course Outline Contact Room 124, tel , rvvliet(at)liacs(dot)nl

LANGUAGE TRANSLATORS

Introduction to Compiler Construction

Introduction to Compilers

Introduction to Compiler Construction

Compiler Design. Computer Science & Information Technology (CS) Rank under AIR 100

SRM UNIVERSITY FACULTY OF ENGINEERING AND TECHNOLOGY

Introduction to Compiler Construction

Compiler, Assembler, and Linker

Introduction to Compiler Construction

About the Authors... iii Introduction... xvii. Chapter 1: System Software... 1

INF5110 Compiler Construction

Formal Languages and Compilers Lecture IX Semantic Analysis: Type Chec. Type Checking & Symbol Table


CSCI 565 Compiler Design and Implementation Spring 2014

CS415 Compilers Overview of the Course. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

Pioneering Compiler Design

UNIT I- LEXICAL ANALYSIS. 1.Interpreter: It is one of the translators that translate high level language to low level language.

Lexical Scanning COMP360

HOLY ANGEL UNIVERSITY COLLEGE OF INFORMATION AND COMMUNICATIONS TECHNOLOGY COMPILER THEORY COURSE SYLLABUS

CMPS Programming Languages. Dr. Chengwei Lei CEECS California State University, Bakersfield

Formal Languages and Compilers Lecture X Intermediate Code Generation

Formal languages and computation models

Formal Languages and Compilers Lecture VII Part 3: Syntactic A

1. The output of lexical analyser is a) A set of RE b) Syntax Tree c) Set of Tokens d) String Character

Undergraduate Compilers in a Day

Lecture 10 Parsing 10.1

Formal Languages and Compilers Lecture IV: Regular Languages and Finite. Finite Automata

Syntax. In Text: Chapter 3

Compiler construction

CS 415 Midterm Exam Spring 2002

SRM UNIVERSITY FACULTY OF ENGINEERING AND TECHNOLOGY SCHOOL OF COMPUTER SCIENCE AND ENGINEERING COURSE PLAN

Compilers and Interpreters

Chapter 3: Describing Syntax and Semantics. Introduction Formal methods of describing syntax (BNF)

COLLEGE OF ENGINEERING, NASHIK. LANGUAGE TRANSLATOR

PLAGIARISM. Administrivia. Compilers. CS143 11:00-12:15TT B03 Gates. Text. Staff. Instructor. TAs. Office hours, contact info on 143 web site

Compiler Theory Introduction and Course Outline Sandro Spina Department of Computer Science

CS 132 Compiler Construction

About the Tutorial. Audience. Prerequisites. Copyright & Disclaimer. Compiler Design

CSCE 314 Programming Languages

Working of the Compilers

Principle of Compilers Lecture VIII: Intermediate Code Generation. Alessandro Artale

Structure of a compiler. More detailed overview of compiler front end. Today we ll take a quick look at typical parts of a compiler.

Compiler Construction Lecture 1: Introduction Winter Semester 2018/19 Thomas Noll Software Modeling and Verification Group RWTH Aachen University

Administrivia. Compilers. CS143 10:30-11:50TT Gates B01. Text. Staff. Instructor. TAs. The Purple Dragon Book. Alex Aiken. Aho, Lam, Sethi & Ullman

G.PULLAIH COLLEGE OF ENGINEERING & TECHNOLOGY

Compiler Code Generation COMP360

A simple syntax-directed

CD Assignment I. 1. Explain the various phases of the compiler with a simple example.

Compilers and Code Optimization EDOARDO FUSELLA

Chapter 3. Describing Syntax and Semantics ISBN

1. INTRODUCTION TO LANGUAGE PROCESSING The Language Processing System can be represented as shown figure below.

Group A Assignment 3(2)

Introduction to Compiler

CSE450 Translation of Programming Languages. Lecture 4: Syntax Analysis

Compiler Design. Dr. Chengwei Lei CEECS California State University, Bakersfield

COMP 181 Compilers. Administrative. Last time. Prelude. Compilation strategy. Translation strategy. Lecture 2 Overview

CSCI 334: Principles of Programming Languages. Lecture 2: Lisp Wrapup & Fundamentals. Higher-Order Functions. Activity

CS 4120 and 5120 are really the same course. CS 4121 (5121) is required! Outline CS 4120 / 4121 CS 5120/ = 5 & 0 = 1. Course Information

Lexical Analysis. Introduction

LECTURE 3. Compiler Phases

Crafting a Compiler with C (II) Compiler V. S. Interpreter

Compilers for Modern Architectures Course Syllabus, Spring 2015

Compiler Construction

KALASALINGAM UNIVERSITY ANAND NAGAR, KRISHNAN KOIL DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING ODD SEMESTER COURSE PLAN

11. a b c d e. 12. a b c d e. 13. a b c d e. 14. a b c d e. 15. a b c d e

GUJARAT TECHNOLOGICAL UNIVERSITY

Compilers Crash Course

Gujarat Technological University Sankalchand Patel College of Engineering, Visnagar B.E. Semester VII (CE) July-Nov Compiler Design (170701)

afewadminnotes CSC324 Formal Language Theory Dealing with Ambiguity: Precedence Example Office Hours: (in BA 4237) Monday 3 4pm Wednesdays 1 2pm

Transcription:

Formal Languages and Compilers Lecture I: Introduction to Compilers Free University of Bozen-Bolzano Faculty of Computer Science POS Building, Room: 2.03 artale@inf.unibz.it http://www.inf.unibz.it/ artale/ Formal Languages and Compilers BSc course 2017/18 First Semester

Course Overview Introduction to the Notion of Compiler. Formal Language Theory: Chomsky Classification and notion of Formal Grammar. Theory of regular languages: deterministic and non-deterministic finite automata, Regular expressions and Regular grammars. Context-free languages and their grammars. Lexical Analysis and Automata. Syntax Analysis and Parsers: Top-Down Parser Bottom-Up Parser Syntax-Directed Translation to Translate Programming Language Constructs. Semantic Analysis: Type Checking. Code Generation and Principles of Code Optimization.

Final Exam Final Written Exam: 70% of the total mark Mid-Term Exam: Grants the possibility to skip the Formal Language part of the final exam. Compiler Project: 30% of the total mark Form teams of two/tree persons Decide and implement your little language developing a compiler for it. Two weeks after the end of the course you will present a demo of your project. You are free to develop your project either in C or Java.

Reading List Introduction to Automata Theory, Languages, and Computation (3rd edition), J.E. Hopcroft, R. Motwani, J.D. Ullman. Addison Wesley, 2007. Compilers: Principles, Techniques, and Tools, Alfred V. Aho, Ravi Sethi and Jeff Ullman. Publisher: Prentice Hall, 2003. Further reading material: Compiler Construction: Principles and Practice, Kenneth C. Louden. Publisher: Brooks Cole, 1997. Programming Language Processors in Java: Compilers and Interpreters, David Watt and Deryck Brown. Publisher: Prentice Hall, 2000. Advanced Compiler Design and Implementation, Steven Muchnick. Publisher: Morgan Kaufmann, 1997.

Summary of Lecture I Motivations and Brief History. The Architecture of a Compiler. The Analysis Phase. The Synthesis Phase. Towards Executable Code: Assembler, Loader and Linker.

How are Languages Implemented? Two major strategies: 1 Compilers. Translate programs to a machine executable code. They do extensive preprocessing. 2 Interpreters. Run programs as is without preliminary translation: Successive phases of translation (to machine/intermediate code) and execution.

History of High-Level Languages 1953 IBM develops the 701: All programming done in assembly. Problem: Software costs exceeded hardware costs! John Backus: Speedcoding: An interpreted language that ran 10-20 times slower than hand-written assembly! John Backus: Translate high-level code to assembly Many thought this impossible. Had already failed in other projects. 1954-7 FORTRAN I project: By 1958, > 50% of all software is in FORTRAN. Cut the development time dramatically (from weeks to hours).

Summary Motivations and Brief History. The Architecture of a Compiler. The Analysis Phase. The Synthesis Phase. Towards Executable Code: Assembler, Loader and Linker.

The Context of a Compiler A compiler is a program that reads a program written in one language the source language and translates it into an equivalent program in another language the target language. In addition to a compiler, other programs are needed to generate an executable code. Source Program COMPILER Target Assembly Program ASSEMBLER Relocatable Machine Code LOADER/LINKER Absolute Machine Code

The Architecture of a Compiler Compilation can be divided in two parts: Analysis and Synthesis. 1 Analysis. Breaks the source program into constituent pieces and creates intermediate representation. 2 Synthesis. Generates the target program from the intermediate representation. The analysis part can be divided along the following phases: 1 Lexical Analysis; 2 Syntax Analysis; 3 Semantic Analysis. The synthesis part can be divided along the following phases: 1 Intermediate Code Generator; 2 Code Optimizer; 3 Code Generator.

The Architecture of a Compiler (Cont.) Source Program Lexical Analyzer Syntax Analyzer Semantic Analyzer Intermediate Code Generator Code Optimizer Code Generator Target Program

Summary Motivations and Brief History. The Architecture of a Compiler. The Analysis Phase. The Synthesis Phase. Towards Executable Code: Assembler, Loader and Linker.

Lexical Analysis The program is considered as a unique sequence of characters. The Lexical Analyzer reads the program from left-to-right and sequence of characters are grouped into tokens lexical units with a collective meaning. The sequence of characters that gives rise to a token is called lexeme.

Lexical Analysis: An Example Let us consider the following assignment statement: position = initial + rate 60 Then, the lexical analyzer will group the characters in the following tokens: Lexeme Token position ID = = initial ID + + rate ID 60 NUM

Symbol Table An essential function of a compiler is to build the Symbol Table where the identifiers used in the program are recorded along with various properties: Storage allocated for the ID; its type; its scope (where in the program is valid); number and types of its arguments (in case the ID is a procedure name); etc. When an identifier is detected an ID token is generated, the corresponding lexeme is entered in the Symbol Table, and a pointer to the position in the Symbol Table is associated to the ID token.

Syntactic Analysis The Syntactic Analysis is also called Parsing. Tokens are grouped into grammatical phrases represented by a Parse Tree which gives a hierarchical structure to the source program. The hierarchical structure is expressed by recursive rules, called Grammar s Productions. Example. Grammar s Productions for assignment statements are: <assignment > ID = <expr > <expr > ID NUM <expr ><op ><expr > (<expr >) <op > + /

Parse Tree: An Example assignment ID 1 = expr position expr + expr ID 2 expr * expr initial ID 3 NUM rate 60

Grammars and Formal Language Theory The notion of Grammar is related to studies in natural languages. Linguists were concerned with: 1 Defining the valid sentences of a Language; 2 Providing a structural definition of such valid sentences. The Formal Language Theory considers a Language as a mathematical object. A Language is just a set of strings. To formally define a Language we need to formally define what are the strings admitted by the Language: A Grammar is a formalism that gives a finite representation of a Language and allows to generate the set of strings belonging to a given Language.

Semantic Analysis The Semantic Analysis phase checks the program for semantic errors (Type Checking) and gathers type information for the successive phases. Type Checking. Check types of operands (possibly imposing type coercions); No real number as index for array; etc.

Summary Motivations and Brief History. The Architecture of a Compiler. The Analysis Phase. The Synthesis Phase. Towards Executable Code: Assembler, Loader and Linker.

Intermediate Code Generation An intermediate code is generated as a program for an abstract machine. The intermediate code should be easy to translate into the target program. As intermediate code we consider the three-address code, similar to assembly: sequence of instructions with at most three operands such that: 1 There is at most one operator, in addition to the assignment. Thus, we make explicit the operators precedence. 2 Temporary names must be generated to compute intermediate operations. Example. The intermediate code for the assignment statement is: temp1 = inttoreal(60) temp2 = id3 temp1 temp3 = id2 + temp2 id1 = temp3

Code Optimization This phase attempts to improve the intermediate code so that faster-running machine code can be obtained. Different compilers adopt different optimization techniques. Example. A simple optimization of the intermediate code for the assignment statement could be: temp1 = inttoreal(60) temp2 = id3 temp1 temp3 = id2 + temp2 id1 = temp3 -> temp1 = id3 60.0 id1 = id2 + temp1

Code Generation This phase generates the target code consisting of assembly code. 1 Memory locations are selected for each variable; 2 Instructions are translated into a sequence of assembly instructions; 3 Variables and intermediate results are assigned to memory registers. Example. A target code generated from the optimized code of the assignment statement could be: MOVF id3, R2 MULF #60.0, R2 MOVF id2, R1 ADDF R2, R1 MOVF R1, id1 The F stands for floating-point instruction The # means that 60.0 is a constant The first and second operand of each instruction specify a source and a destination

Summing Up

Summary Motivations and Brief History. The Architecture of a Compiler. The Analysis Phase. The Synthesis Phase. Towards Executable Code: Assembler, Loader and Linker.

Assembler The Assembler is responsible for translating the target code usually assembly code into an executable machine code. The assembly code is a mnemonic version of machine code in which: 1 Names are used instead of binary codes for operations (Code Table). 2 Names are used for operands instead of memory locations (Symbol Tables).

Loader and Linker The machine code generated by the Assembler can be executed only if allocated in Main Memory starting from the address 0. Since this is not possible the Loader will alter the relocatable addresses of the code to place both instructions and data in the right place in Main Memory. The starting free address, L, in Main Memory to allocate the program is called the Relocation Factor. The Loader must: 1 Add to each relocatable address the relocation factor L; 2 Leave unaltered each absolute address e.g., address of I/O devices. The Linker links together the different files/modules of a single program and, possibly, adds library files.

Summary of Lecture I Motivations and Brief History. The Architecture of a Compiler. The Analysis Phase. The Synthesis Phase. Towards Executable Code: Assembler, Loader and Linker.