Structure of a compiler. More detailed overview of compiler front end. Today we ll take a quick look at typical parts of a compiler.

Similar documents
A Simple Syntax-Directed Translator

A programming language requires two major definitions A simple one pass compiler

Introduction to Compiler Design

Compiler Code Generation COMP360

Compiler Design (40-414)

Context-Free Grammar. Concepts Introduced in Chapter 2. Parse Trees. Example Grammar and Derivation

Examples of attributes: values of evaluated subtrees, type information, source file coordinates,

COMPILER CONSTRUCTION LAB 2 THE SYMBOL TABLE. Tutorial 2 LABS. PHASES OF A COMPILER Source Program. Lab 2 Symbol table

CSE450 Translation of Programming Languages. Lecture 4: Syntax Analysis

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

Yacc: A Syntactic Analysers Generator

Building Compilers with Phoenix

COP 3402 Systems Software Syntax Analysis (Parser)

2068 (I) Attempt all questions.

Lexical Analysis. Introduction

Compiling Regular Expressions COMP360

A simple syntax-directed

Introduction to Parsing. Lecture 8

CS 321 IV. Overview of Compilation

4. Semantic Processing and Attributed Grammars

CSE 401 Midterm Exam 11/5/10

Outline. Limitations of regular languages. Introduction to Parsing. Parser overview. Context-free grammars (CFG s)

Introduction to Compiler Construction

Languages and Compilers

Introduction to Compiler Construction

COLLEGE OF ENGINEERING, NASHIK. LANGUAGE TRANSLATOR

PRINCIPLES OF COMPILER DESIGN UNIT I INTRODUCTION TO COMPILING

Lexical Scanning COMP360

Compiling Techniques

CS Lecture 2. The Front End. Lecture 2 Lexical Analysis

Compiler Design. Computer Science & Information Technology (CS) Rank under AIR 100

COMP455: COMPILER AND LANGUAGE DESIGN. Dr. Alaa Aljanaby University of Nizwa Spring 2013

Introduction to Compiler Construction

CSCI312 Principles of Programming Languages!

RYERSON POLYTECHNIC UNIVERSITY DEPARTMENT OF MATH, PHYSICS, AND COMPUTER SCIENCE CPS 710 FINAL EXAM FALL 96 INSTRUCTIONS

Regular Expressions. Agenda for Today. Grammar for a Tiny Language. Programming Language Specifications

SYED AMMAL ENGINEERING COLLEGE (An ISO 9001:2008 Certified Institution) Dr. E.M. Abdullah Campus, Ramanathapuram

Theory and Compiling COMP360

TDDD55 - Compilers and Interpreters Lesson 3

COMPILER CONSTRUCTION Seminar 02 TDDB44

Syntactic Analysis. The Big Picture Again. Grammar. ICS312 Machine-Level and Systems Programming

Part 5 Program Analysis Principles and Techniques

Question Bank. 10CS63:Compiler Design

CS 4201 Compilers 2014/2015 Handout: Lab 1

How do LL(1) Parsers Build Syntax Trees?

TDDD55- Compilers and Interpreters Lesson 3

CD Assignment I. 1. Explain the various phases of the compiler with a simple example.

Undergraduate Compilers in a Day

Introduction to Parsing. Lecture 5

Introduction to Parsing Ambiguity and Syntax Errors

About the Tutorial. Audience. Prerequisites. Copyright & Disclaimer. Compiler Design

CSE 413 Programming Languages & Implementation. Hal Perkins Autumn 2012 Grammars, Scanners & Regular Expressions

Compiler Lab. Introduction to tools Lex and Yacc

Sardar Vallabhbhai Patel Institute of Technology (SVIT), Vasad M.C.A. Department COSMOS LECTURE SERIES ( ) (ODD) Code Optimization

QUESTIONS RELATED TO UNIT I, II And III

Introduction to Parsing Ambiguity and Syntax Errors

Pioneering Compiler Design

COMPILERS AND INTERPRETERS Lesson 4 TDDD16

LECTURE 3. Compiler Phases

CS415 Compilers. Lexical Analysis

Compiler Design Aug 1996

COP 3402 Systems Software. Lecture 4: Compilers. Interpreters

Front End. Hwansoo Han

CST-402(T): Language Processors

Syntax. A. Bellaachia Page: 1

Syntax-Directed Translation. Lecture 14

Lexical analysis. Syntactical analysis. Semantical analysis. Intermediate code generation. Optimization. Code generation. Target specific optimization

flex is not a bad tool to use for doing modest text transformations and for programs that collect statistics on input.

The Front End. The purpose of the front end is to deal with the input language. Perform a membership test: code source language?

Downloaded from Page 1. LR Parsing

Introduction to Compiler

Lexical and Syntax Analysis

Introduction to Parsing. Lecture 5

Lexical Analysis. Lexical analysis is the first phase of compilation: The file is converted from ASCII to tokens. It must be fast!

Using an LALR(1) Parser Generator

Crafting a Compiler with C (II) Compiler V. S. Interpreter

CS164: Midterm I. Fall 2003

Single-pass Static Semantic Check for Efficient Translation in YAPL

Section A. A grammar that produces more than one parse tree for some sentences is said to be ambiguous.

Compiler Design Prof. Y. N. Srikant Department of Computer Science and Automation Indian Institute of Science, Bangalore

Formats of Translated Programs

COMPILER DESIGN UNIT I LEXICAL ANALYSIS. Translator: It is a program that translates one language to another Language.

CS2383 Programming Assignment 3

Compilers Project 3: Semantic Analyzer

CSE 413 Programming Languages & Implementation. Hal Perkins Winter 2019 Grammars, Scanners & Regular Expressions

Semantic Analysis computes additional information related to the meaning of the program once the syntactic structure is known.

CS606- compiler instruction Solved MCQS From Midterm Papers

Time : 1 Hour Max Marks : 30

What is a compiler? var a var b mov 3 a mov 4 r1 cmpi a r1 jge l_e mov 2 b jmp l_d l_e: mov 3 b l_d: ;done

Working of the Compilers

G Compiler Construction Lecture 4: Lexical Analysis. Mohamed Zahran (aka Z)

CSE450. Translation of Programming Languages. Lecture 11: Semantic Analysis: Types & Type Checking

Syntax and Grammars 1 / 21

Earlier edition Dragon book has been revised. Course Outline Contact Room 124, tel , rvvliet(at)liacs(dot)nl

CA Compiler Construction

Programming Languages & Translators PARSING. Baishakhi Ray. Fall These slides are motivated from Prof. Alex Aiken: Compilers (Stanford)

1. Explain the input buffer scheme for scanning the source program. How the use of sentinels can improve its performance? Describe in detail.

Properties of Regular Expressions and Finite Automata

CSCE 531 Spring 2009 Final Exam

Syntax-Directed Translation Part I

Transcription:

More detailed overview of compiler front end Structure of a compiler Today we ll take a quick look at typical parts of a compiler. This is to give a feeling for the overall structure. source program lexical analyzer (scanner) We ll then begin to consider a very simple example of a compiler translates arithmetic expressions from infix to postfix notation. syntax analyzer (parser) semantic analyzer symbol table error handler Once we ve seen how such a compiler can be built by hand, we ll begin to study more powerful (automated) techniques, in detail. intermediate code generator code optimizer code generator target program We ll focus on the front end lexical analyzer through intermediate code generator. 1 2

Lexical Analysis (Scanning) A source program is typically stored in a file that is essentially a sequence of bytes or characters. The function of a lexical analyzer (scanner) is to translate this stream of bytes into a sequence of tokens. We ll use an example to begin to flesh out the idea of a token... Consider the following fragment of a program: { float tab[10]; int x2, y3, z; z x + y * 10; } Lexical analysis will eliminate white space, and turn character strings into tokens... { } float tab[10]; int x2, y3, z; z x + y * 10; Lexical analysis might produce something like the following sequence of tokens and token-value pairs: { FLOAT <IDENTIFIER,tab> [ <NUMBER,10> ] ; INT <IDENTIFIER,x> <NUMBER,2>, <IDENTIFIER,y> <NUMBER,3>, <IDENTIFIER,z> ; <IDENTIFIER,z> <IDENTIFIER,x> + <IDENTIFIER,y> * <NUMBER,10> ; } 3 4

Symbol Tables Syntactic Analysis (Parsing) The symbol table stores information about identifiers encountered in the source file during compilation. For example, it could be that a variable identifier has a name ( lexeme ) and a type; a type identifier has a name and a type; a function identifier has a name, a parameter list, and a result type. Typically, entries in the symbol table are partially constructed during lexical analysis, and they are used and elaborated in later steps. Essentially, the task of a syntactic analyzer (parser) is to convert a sequence of tokens into a syntax tree. For the previous example, we may have roughly the following syntactic structures: A block consists of declarations followed by statements. Each declaration consists of a type followed by a list of variable names. Each variable name has an optional pre-modifier, such as pointer (*), an identifier, and an optional post-modifier, which may be for array size or function type, or an initialization. A statement can be an assignment, which consists of a variable, followed by and an expression. An expression can be an addition, a multiplication,... { float tab[10]; int x2, y3, z; z x + y * 10; } 5 6

Below is a possible abstract syntax tree for the assignment Semantic Analysis z x + y * 10; <ID,z> + <ID,x> * <ID,y> <NUMBER,10> Semantic analysis includes activities like type checking. Associate a type with each variable and each expression, and check for type consistency in variables: array element access, structure component access, pointer object access; expressions: the types of operands must be consistent with the operator; function calls: types of arguments must be consistent with function declaration; assignments. 7 8

Error Handler Intermediate Code Generation At each stage in this process we may encounter errors in the input. Some guidelines: Detect errors when possible. Identify them as specifically as possible: where they occur, what they are, etc. Don t halt upon encountering an error. Instead try to continue in a reasonable fashion. (Involves some guessing.) Don t try to repair errors. <ID,z> + <ID,x> * <ID,y> <NUMBER,10> Intermediate code: temp1 : 10 temp2 : IDy * temp1 temp3 : IDx + temp2 IDz : temp3 Intermediate code optimization: temp1 : IDy * 10 IDz : IDx + temp1 9 10

Compiler Front End Construction Tools Simple Compiler Example parser generators: yacc, bison. Implement powerful parsing algorithms too complex to be reliably written by hand. Input, at its simplest, is a context-free grammar specifying the syntactic structure of the source language. scanner generators: lex, flex. Generate lexical analyzers, quite similar to finite automata. Input, at its simplest, is a list of regular expressions specifying which input strings correspond to tokens. syntax-directed translation engines We ll play around with lex and yacc. Translate infix expression to postfix expression. Example: 1+2 12+ 3-1+2 31-2+ (left associative operators) Rudimentary example of parsing, semantic analysis, intermediate code generation. Also, lexical analysis. We ll look this over quickly to get an overview of a very simple compiler front end. We ll also get a sense of how to do things by hand. Later we ll revisit scanning and parsing, in much greater detail. (And we ll use powerful, standard tools!) 11 12

Structure of Simple Example Compiler We begin with a grammar specifying the syntax of (our restricted notion of) infix expressions. infix expression lexical analyzer (scanner) syntax-directed translator postfix expression expr expr + digit expr digit digit digit 0 1 2 3 4 5 6 7 8 9 These rules are called productions. The tokens, or terminal symbols, are + 0 1 2 3 4 5 6 7 8 9 Syntax-directed translator combines parsing, semantic analysis and intermediate code generation. To begin we ll make things even simpler by assuming that the input has no whitespace and each token is represented by a single character and the nonterminals are The start symbol is expr. expr digit + 0 1 2 3 4 5 6 7 8 9 so that scanning is trivial just read the input character by character. The token strings that can be derived from the start symbol constitute the language defined by the grammar. (We ll make all this more precise later.) 13 14

Parse trees Translation scheme expr expr + digit expr digit digit digit 0 1 2 3 4 5 6 7 8 9 Let s draw a few parse trees for this grammar: Assuming that a parser could construct such parse trees using this grammar, it would not be difficult to synthesize a translation by embedding actions in the productions of the grammar, so that executing those actions during a depth-first traversal of the parse tree would produce the translation. expr expr expr / \ / \ / \ / \ digit expr + digit expr + digit / \ \ / \ \ 1 digit 2 expr - digit 2 1 digit 1 3 expr expr + digit { putchar( + ); } expr digit { putchar( - ); } digit digit 0 { putchar( 0 ); } 1 { putchar( 1 ); } 2 { putchar( 2 ); } 3 { putchar( 3 ); } 4 { putchar( 4 ); } 5 { putchar( 5 ); } 6 { putchar( 6 ); } 7 { putchar( 7 ); } 8 { putchar( 8 ); } 9 { putchar( 9 ); } 15 16

Modify the parse trees below to incorporate the actions we ve added to the productions of the grammar: expr expr expr / \ / \ / \ / \ digit expr + digit expr + digit / \ \ / \ \ 1 digit 2 expr - digit 2 1 digit 1 3 For next time... Continue reading Chapter 2, through 2.7. Notice that executing the actions during a depth-first traversal indeed yields the postfix translation. Next difficulty: How to implement a parser for this grammar? 17 18