ΕΠΛ323 - Θεωρία και Πρακτική Μεταγλωττιστών

Similar documents
ΕΠΛ323 - Θεωρία και Πρακτική Μεταγλωττιστών

CMPT 755 Compilers. Anoop Sarkar.

Compilerconstructie. najaar Rudy van Vliet kamer 140 Snellius, tel rvvliet(at)liacs(dot)nl. college 3, vrijdag 22 september 2017

Context-Free Grammars

Section A. A grammar that produces more than one parse tree for some sentences is said to be ambiguous.

Syntax Analysis. Prof. James L. Frankel Harvard University. Version of 6:43 PM 6-Feb-2018 Copyright 2018, 2015 James L. Frankel. All rights reserved.

CSE 3302 Programming Languages Lecture 2: Syntax

Principles of Programming Languages COMP251: Syntax and Grammars

Formal Languages and Compilers Lecture V: Parse Trees and Ambiguous Gr

CSE302: Compiler Design

Where We Are. CMSC 330: Organization of Programming Languages. This Lecture. Programming Languages. Motivation for Grammars

Sometimes an ambiguous grammar can be rewritten to eliminate the ambiguity.

Concepts Introduced in Chapter 4

Syntax Analysis Check syntax and construct abstract syntax tree

CS 314 Principles of Programming Languages

CMSC 330: Organization of Programming Languages. Architecture of Compilers, Interpreters

Architecture of Compilers, Interpreters. CMSC 330: Organization of Programming Languages. Front End Scanner and Parser. Implementing the Front End

PART 3 - SYNTAX ANALYSIS. F. Wotawa TU Graz) Compiler Construction Summer term / 309

CMSC 330: Organization of Programming Languages

Introduction to Syntax Analysis

CSE P 501 Compilers. Parsing & Context-Free Grammars Hal Perkins Spring UW CSE P 501 Spring 2018 C-1

CMSC 330: Organization of Programming Languages. Context-Free Grammars Ambiguity

3. Syntax Analysis. Andrea Polini. Formal Languages and Compilers Master in Computer Science University of Camerino

Parsing. source code. while (k<=n) {sum = sum+k; k=k+1;}

Chapter 3: Describing Syntax and Semantics. Introduction Formal methods of describing syntax (BNF)

Compiler Design Concepts. Syntax Analysis

3. Context-free grammars & parsing

Gujarat Technological University Sankalchand Patel College of Engineering, Visnagar B.E. Semester VII (CE) July-Nov Compiler Design (170701)

Parsing. Roadmap. > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing

Syntax Analysis. Martin Sulzmann. Martin Sulzmann Syntax Analysis 1 / 38

CMSC 330: Organization of Programming Languages

Defining syntax using CFGs

Introduction to Syntax Analysis. The Second Phase of Front-End

Syntax. In Text: Chapter 3

Syntax Analysis/Parsing. Context-free grammars (CFG s) Context-free grammars vs. Regular Expressions. BNF description of PL/0 syntax

CS 321 Programming Languages and Compilers. VI. Parsing

Principles of Programming Languages COMP251: Syntax and Grammars

CMSC 330: Organization of Programming Languages. Context Free Grammars

Context-free grammars (CFG s)

EDAN65: Compilers, Lecture 04 Grammar transformations: Eliminating ambiguities, adapting to LL parsing. Görel Hedin Revised:

CS415 Compilers. Syntax Analysis. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

CMSC 330: Organization of Programming Languages

CMSC 330: Organization of Programming Languages. Context Free Grammars

Building Compilers with Phoenix

COP 3402 Systems Software Syntax Analysis (Parser)

MIT Specifying Languages with Regular Expressions and Context-Free Grammars

Part 3. Syntax analysis. Syntax analysis 96

3. Parsing. Oscar Nierstrasz

Parsing Part II. (Ambiguity, Top-down parsing, Left-recursion Removal)

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING Subject Name: CS2352 Principles of Compiler Design Year/Sem : III/VI

MIT Specifying Languages with Regular Expressions and Context-Free Grammars. Martin Rinard Massachusetts Institute of Technology

BSCS Fall Mid Term Examination December 2012

CS 315 Programming Languages Syntax. Parser. (Alternatively hand-built) (Alternatively hand-built)

CS 406/534 Compiler Construction Parsing Part I

Introduction to Lexical Analysis

Parsing II Top-down parsing. Comp 412

Context-free grammars

CS Lecture 2. The Front End. Lecture 2 Lexical Analysis

Chapter 4: Syntax Analyzer

Introduction to parsers

CSE P 501 Compilers. Parsing & Context-Free Grammars Hal Perkins Winter /15/ Hal Perkins & UW CSE C-1

Syntax Analysis: Context-free Grammars, Pushdown Automata and Parsing Part - 4. Y.N. Srikant

Part 5 Program Analysis Principles and Techniques

Compiler Construction: Parsing

CA Compiler Construction


ECE251 Midterm practice questions, Fall 2010

Syntax Analysis Part I

EDAN65: Compilers, Lecture 06 A LR parsing. Görel Hedin Revised:

Types of parsing. CMSC 430 Lecture 4, Page 1

Chapter 3: CONTEXT-FREE GRAMMARS AND PARSING Part 1

Dr. D.M. Akbar Hussain

A programming language requires two major definitions A simple one pass compiler

Introduction to Parsing

Context Free Grammars. CS154 Chris Pollett Mar 1, 2006.

Syntax. A. Bellaachia Page: 1

CMPS Programming Languages. Dr. Chengwei Lei CEECS California State University, Bakersfield

Lexical Analysis. COMP 524, Spring 2014 Bryan Ward

Syntax Analysis. Chapter 4

Syntax/semantics. Program <> program execution Compiler/interpreter Syntax Grammars Syntax diagrams Automata/State Machines Scanning/Parsing

Introduction to Parsing. Lecture 5

Optimizing Finite Automata

General Overview of Compiler

EECS 6083 Intro to Parsing Context Free Grammars

Compiler Passes. Syntactic Analysis. Context-free Grammars. Syntactic Analysis / Parsing. EBNF Syntax of initial MiniJava.

CPS 506 Comparative Programming Languages. Syntax Specification

Wednesday, September 9, 15. Parsers

Parsers. What is a parser. Languages. Agenda. Terminology. Languages. A parser has two jobs:

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

A Simple Syntax-Directed Translator

Syntactic Analysis. The Big Picture Again. Grammar. ICS312 Machine-Level and Systems Programming

Introduction to Lexical Analysis

Introduction to Parsing. Lecture 5

Lexical Analysis. Dragon Book Chapter 3 Formal Languages Regular Expressions Finite Automata Theory Lexical Analysis using Automata

CSCE 314 Programming Languages

Compilers Course Lecture 4: Context Free Grammars

CONTEXT FREE GRAMMAR. presented by Mahender reddy

CS308 Compiler Principles Syntax Analyzer Li Jiang

Syntactic Analysis. Top-Down Parsing

Introduction to Parsing Ambiguity and Syntax Errors

Transcription:

ΕΠΛ323 - Θεωρία και Πρακτική Μεταγλωττιστών Lecture 5b Syntax Analysis Elias Athanasopoulos eliasathan@cs.ucy.ac.cy

Regular Expressions vs Context-Free Grammars Grammar for the regular expression (a b)*abb A 0 è aa 0 ba 0 aa 1 A 1 è ba 2 A 2 è ba 3 A 3 è ε

Construct a grammar from an NFA For each state i of the NFA, create a nonterminal symbol A i If state i has a transition to state j on symbol a, introduce the production: A i è aa j If state i has a transition to state j on symbol ε, introduce the production: A i è A j If state i is an accepting state: A i è ε If i is the start state, then make A i be the start symbol of the grammar Recall the NFA version: a start a 0 1 b 2 b 3 b

REs are useful 1. The lexical rules of a language are frequently quite simple, and to describe them we do not need a notation as powerful as grammars. 2. Regular expressions generally provide a more concise and easier to understand notation for tokens than grammars. 3. More efficient lexical analyzers can be construct automatically from regular expressions than from arbitrary grammars. 4. Separating the syntactic structure of the language into lexical and nonlexical parts provides a convenient way of modularizing the front end of a compiler into two mangeable-sized components.

Eliminating Ambiguity stmt è if expr then stmt if expr then stmt else stmt other Valid Sentence if E 1 then if E 2 then S 1 else S 2

if E 1 then if E 2 then S 1 else S 2 stmt if expr then stmt Parse Tree 1 (preferred) E 1 if expr then stmt else stmt E 2 S 1 S 2 stmt if expr then stmt else stmt Parse Tree 2 E 1 if expr then stmt S 2 E 2 S 1

Eliminating Ambiguity General rule Match each else with the closest previous unmatched then. Unambiguous Version stmt matched_stmt è matched_stmt unmatched_stmt è if expr then matched_stmt else matched_stmt other unmatched_stmt è if expr then stmt if expr then matched_stmt else unmatched_stmt The idea is that a statement appearing between a then/else must be matched, i.e., it must not end with an unmatched then followed by any statement

Left Recursion Expressions where the leftmost symbol on the right side is the same as the nonterminal in the left side of the production are called left recursive expr è expr + term These productions can cause the parser to loop forever expr() { expr(); match( + ); term(); }

Left Recursion Elimination A left-recursive production can be eliminated by re-writing. Consider: A è Aa b, where a, b are sequences of terminals and nonterminals that do not start with A E.g., expr è expr + term term A = expr, a = +term, b = term This production can be re-written as: A è br R è ar ε (R is right-recursive)

Left Recursion Elimination E è E + T T T è T * F F F è (E) id E è TE E è +TE ε T è FT T è *FT ε F è (E) id

Generic Rule No matter how many A-productions there are, we can eliminate immediate left recursion from them by the following technique. (1) We group the A-productions as: A è Aa 1 Aa 2... Aa m b 1 b 2... b m (where no b 1 begins with an A) (2) We replace the A-productions: A è b 1 A b 2 A... b m A A è a 1 A a 2 A... a m A ε

Non-immediate Left Recursion S è Aa b A è Ac Sd ε The nonterminal S is left recursive because S=>Aa=>Sda, but it is not immediately recursive

Eliminating left recursion (any kind) Input Grammar G with no cycles or ε-productions (cycle + is A => A, and ε-production is Aèε) Output An equivalent grammar with no left recursion

1. Arrange the nonterminals in some order A 1, A 2,..., A n 2. for i := 1 to n do begin for j := 1 to j-1 do begin replace each production of the form A i è A j γ by the productions A i è δ 1 γ δ 2 γ... δ k γ where A j è δ 1 δ 2... δ k are all the current A j -productions end eliminate the immediate left recursion among the A j -productions end

Example S è Aa b A è Ac Sd ε We order the nonterminals S, A. There is no immediate left recursion among the S-productions, so nothing happens during step (2) for the case i = 1. For i = 2, we substitute the S-productions in A è Sd to obtain the following A-productions: A è Ac Aad bd ε The final grammar S è Aa b A è bda A A è ca ada ε

Left Factoring When we have two productions stmt è if expr then stmt else stmt if expr then stmt on seeing the input token if, we cannot immediately tell which production to use to expand stmt

Left Factoring In general, if Aèab 1 ab 2 are two A-productions and the input begins with a nonempty string derived from a, we do not know whether to expand A to ab 1 or ab 2. However we may defer the decision by expanding A to aa. Then after seeing the input derived from a, we expand A to b 1 or to b 2 : A è aa A è b 1 b 2

Left Factoring a Grammar Input Grammar G Output An equivalent left-factored grammar Method For each nonterminal A find the longest prefix a common to two or more of its alternatives. If a<>ε, i.e., there is a nontrivial common prefix, replace all the A productions A è ab 1 ab 2... ab n γ, where γ represents all alternatives that do not begin with a by A è aa γ A è b 1 b 2... b n

Example If expression then statement, If expression then statement else statememt S è iets ietses a E è b S è ietss a S è es ε E è b

Non-Context-Free Grammars Γραμματικές με Συμφραζόμενα L 1 ={wcw όπου w Î(a b)*} This language abstracts the problem of checking that identifiers are declared before their use in the program. L 2 ={a n b m c n d m όπου n³1, m ³ 1} This language abstracts the problem of checking that the number of formal parameters in the declaration of a procedure agrees with the number of actual parameters in a use of the procedure Properties that cannot be expressed using a CFG are checked in Semantic Analysis