Derivations of a CFG. MACM 300 Formal Languages and Automata. Context-free Grammars. Derivations and parse trees

Similar documents
CMPT 755 Compilers. Anoop Sarkar.

COP 3402 Systems Software Syntax Analysis (Parser)

CS5371 Theory of Computation. Lecture 8: Automata Theory VI (PDA, PDA = CFG)

Syntax Analysis Part I

CT32 COMPUTER NETWORKS DEC 2015

Specifying Syntax COMP360

Natural Language Processing

Introduction to Lexing and Parsing

Limits of Computation p.1/?? Limits of Computation p.2/??

Context-Free Grammars


JNTUWORLD. Code No: R

Context-Free Grammars

Introduction to Syntax Analysis. The Second Phase of Front-End

R10 SET a) Construct a DFA that accepts an identifier of a C programming language. b) Differentiate between NFA and DFA?

Introduction to Syntax Analysis

Principles of Programming Languages COMP251: Syntax and Grammars

Context-Free Languages and Parse Trees

MIT Specifying Languages with Regular Expressions and Context-Free Grammars. Martin Rinard Massachusetts Institute of Technology

MIT Specifying Languages with Regular Expressions and Context-Free Grammars

PS3 - Comments. Describe precisely the language accepted by this nondeterministic PDA.

Skyup's Media. PART-B 2) Construct a Mealy machine which is equivalent to the Moore machine given in table.

Principles of Programming Languages COMP251: Syntax and Grammars

T.E. (Computer Engineering) (Semester I) Examination, 2013 THEORY OF COMPUTATION (2008 Course)

Languages and Compilers

QUESTION BANK. Formal Languages and Automata Theory(10CS56)

UNIT I PART A PART B

Compilation 2012 Context-Free Languages Parsers and Scanners. Jan Midtgaard Michael I. Schwartzbach Aarhus University

CSCE 314 Programming Languages

CSE P 501 Compilers. Parsing & Context-Free Grammars Hal Perkins Winter /15/ Hal Perkins & UW CSE C-1

Ambiguous Grammars and Compactification

Where We Are. CMSC 330: Organization of Programming Languages. This Lecture. Programming Languages. Motivation for Grammars

Introduction to Parsing. Lecture 5. Professor Alex Aiken Lecture #5 (Modified by Professor Vijay Ganesh)

Parsing. source code. while (k<=n) {sum = sum+k; k=k+1;}

CS 44 Exam #2 February 14, 2001

Multiple Choice Questions

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018

CS 315 Programming Languages Syntax. Parser. (Alternatively hand-built) (Alternatively hand-built)

Context-Free Languages & Grammars (CFLs & CFGs) Reading: Chapter 5

CMSC 330: Organization of Programming Languages. Architecture of Compilers, Interpreters

( ) i 0. Outline. Regular languages revisited. Introduction to Parsing. Parser overview. Context-free grammars (CFG s) Lecture 5.

Context Free Grammars. CS154 Chris Pollett Mar 1, 2006.

(a) R=01[((10)*+111)*+0]*1 (b) ((01+10)*00)*. [8+8] 4. (a) Find the left most and right most derivations for the word abba in the grammar

Definition: A context-free grammar (CFG) is a 4- tuple. variables = nonterminals, terminals, rules = productions,,

CMSC 330: Organization of Programming Languages. Context Free Grammars

Question Bank. 10CS63:Compiler Design

Outline. Regular languages revisited. Introduction to Parsing. Parser overview. Context-free grammars (CFG s) Lecture 5. Derivations.

Architecture of Compilers, Interpreters. CMSC 330: Organization of Programming Languages. Front End Scanner and Parser. Implementing the Front End

CMSC 330: Organization of Programming Languages. Context Free Grammars

CSE450 Translation of Programming Languages. Lecture 4: Syntax Analysis

Regular Languages. MACM 300 Formal Languages and Automata. Formal Languages: Recap. Regular Languages

Theory of Computation

Definition 2.8: A CFG is in Chomsky normal form if every rule. only appear on the left-hand side, we allow the rule S ǫ.

CMSC 330: Organization of Programming Languages. Context Free Grammars

Lecture 4: Syntax Specification

Parsing: Derivations, Ambiguity, Precedence, Associativity. Lecture 8. Professor Alex Aiken Lecture #5 (Modified by Professor Vijay Ganesh)

CMSC 330: Organization of Programming Languages

MA513: Formal Languages and Automata Theory Topic: Context-free Grammars (CFG) Lecture Number 18 Date: September 12, 2011

Part 5 Program Analysis Principles and Techniques

CS 314 Principles of Programming Languages. Lecture 3

CS 314 Principles of Programming Languages

Models of Computation II: Grammars and Pushdown Automata

Introduction to Lexical Analysis

CMSC 330: Organization of Programming Languages

PDA s. and Formal Languages. Automata Theory CS 573. Outline of equivalence of PDA s and CFG s. (see Theorem 5.3)

CMPS Programming Languages. Dr. Chengwei Lei CEECS California State University, Bakersfield

Lexical Scanning COMP360

CS415 Compilers. Syntax Analysis. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

Formal Languages. Formal Languages

CSCI312 Principles of Programming Languages!

Wednesday, September 9, 15. Parsers

Parsers. What is a parser. Languages. Agenda. Terminology. Languages. A parser has two jobs:

Syntax. In Text: Chapter 3

EDAN65: Compilers, Lecture 04 Grammar transformations: Eliminating ambiguities, adapting to LL parsing. Görel Hedin Revised:

QUESTION BANK. Unit 1. Introduction to Finite Automata

Wednesday, August 31, Parsers

LL(1) predictive parsing

programming languages need to be precise a regular expression is one of the following: tokens are the building blocks of programs

CSE 3302 Programming Languages Lecture 2: Syntax

1. [5 points each] True or False. If the question is currently open, write O or Open.

CS/B.Tech/CSE/IT/EVEN/SEM-4/CS-402/ ItIauIafIaAblll~AladUnrtel1ity

Introduction to Parsing. Lecture 5

Syntax Analysis. Prof. James L. Frankel Harvard University. Version of 6:43 PM 6-Feb-2018 Copyright 2018, 2015 James L. Frankel. All rights reserved.

Decision Properties for Context-free Languages

Chapter 3: CONTEXT-FREE GRAMMARS AND PARSING Part 1

Goals for course What is a compiler and briefly how does it work? Review everything you should know from 330 and before

Bottom-Up Parsing. Lecture 11-12

Final Course Review. Reading: Chapters 1-9

ΕΠΛ323 - Θεωρία και Πρακτική Μεταγλωττιστών

Context-Free Grammars

UNIT 1 SNS COLLEGE OF ENGINEERING

Compiler Construction Using

Homework. Context Free Languages. Before We Start. Announcements. Plan for today. Languages. Any questions? Recall. 1st half. 2nd half.

Closure Properties of CFLs; Introducing TMs. CS154 Chris Pollett Apr 9, 2007.

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2016

Course Project 2 Regular Expressions

CMSC 330: Organization of Programming Languages

Syntax. Syntax. We will study three levels of syntax Lexical Defines the rules for tokens: literals, identifiers, etc.

EECS 6083 Intro to Parsing Context Free Grammars

2068 (I) Attempt all questions.

Transcription:

Derivations of a CFG MACM 300 Formal Languages and Automata Anoop Sarkar http://www.cs.sfu.ca/~anoop strings grow on trees strings grow on Noun strings grow Object strings Verb Object Noun Verb Object Sentence 3 Context-free Grammars Derivations and parse trees Set of rules by which val sentences can be constructed. xample: Sentence! Noun Verb Object Noun! trees strings Verb! are grow Object! on Noun Adjective Adjective! slowly interesting What strings can Sentence derive? Syntax only no semantic checking 2 Sentence Noun Verb Object Noun strings grow on trees 4

CFG Notation Normal CFG notation! *! + Backus Naur notation ::= * + (an or-list of right hand ses) 5 CFGs and Languages For the regular expression (01)* we know that it corresponds to a language L = {', 01, 0101, 010101, } We know this because we know the definition of the Kleene closure operator * Similarly we define a relation between CFGs and a language as follows: 1. Write down the start variable 2. Find a variable written down so far and a rule that has that variable on the left-hand se of a rule in the CFG. Replace the variable with the right-hand se of the rule 3. Repeat step 2 until no variable remains 7 CFG: Formal Definition CFGs and Languages A CFG is a 4-tuple (V, ", R, S), where: V is a finite non-empty set of symbols (called variables, or non-terminals) " is a finite set of symbols (called set of terminal symbols, or the alphabet) We generally assume that V # " = $ R is a finite non-empty set of rules, where each rule is of the form: A! w, w % (V & ")* S % V is the start variable (or start non-terminal) We can formally specify the 3 rules for deriving a language from a CFG as follows: If u, v, w are strings from (V & ")* Then if we have a string uav written down And if we have a rule in the CFG, A! w, Then we can replace A with w, written as: uav ( wwv (called a derivation step) 6 8

CFGs and Languages Leftmost derivations for + * Conser CFG: A! 0A1 ', here are some strings derived from this grammar: A ( ' A ( 0A1 ( 0'1 = 01 (derived in 0 or more steps: A (* 01) A ( 0A1 ( 00A11 ( 00'11 = 0011 A ( 0A1 ( 00A11 ( 000A111 ( 000'111 = 000111 and so on The set of all strings that can be derived is the language of the CFG In this case the language is { 0 n 1 n n ) 0 } 9 ( + ( + ( + * ( + * ( + * (* lm + * + * 11! +! *! ( )! -! Arithmetic xpressions ( * ( + * Leftmost derivations for + * ( + * ( + * ( + * * + 10 12

! +! *! ( )! -! (* rm + * Rightmost derivation for + * ( * ( * ( + * ( + * ( + * Note that the parse tree is the same as one of the leftmost derivations. * 13 + A grammar G is ambiguous iff there are two parse trees or two distinct leftmost or two distinct rightmost derivations for any string in L(G) Ambiguity Alternatives Massage grammar to make it unambiguous Rely on default parser behavior Augment parser Conser the original ambiguous grammar:! +! *! ( )! -! How can we change the grammar to get only one tree for the input + * 15 Ambiguity Ambiguity Grammar is ambiguous if more than one parse tree is possible for some sentences xamples in nglish: Two sisters reunited after 18 years in checkout counter Ambiguity is not acceptable in PL Unfortunately, it s undecable to check whether a grammar is ambiguous 14 Original ambiguous grammar:! +! *! ( )! -! Unambiguous grammar:! + T T! T * F! T T! F F! ( ) F! - F! Input: + * + T T T * F F F 16

Other Ambiguous Grammars Conser the grammar R! R & R R R R * ( R ) a b What does this grammar generate? What s the parse tree for a&b*a Is this grammar ambiguous? First step, add S 0 and then remove epsilon rules S! ASA ab A! B S ' After adding S 0 and then '-removal (remove *): S 0! S S! ASA ab a SA AS S A! B S '* '* 17 19 Grammar Transformations G is converted to G s.t. L(G ) = L(G) Left Factoring Removing cycles: A ( + A Removing '-rules of the form A! ' liminating left recursion Conversion to normal forms:, A! B C and A! a Greibach Normal Form, A! a * Second step, remove unit rules or chain rules S 0! S S! ASA ab a SA AS S A! B S After removal of unit rules S 0! S and S! S: S 0! ASA ab a SA AS S! ASA ab a SA AS A! B S 18 20

After removal of unit rules S 0! S and S! S: S 0! ASA ab a SA AS S! ASA ab a SA AS A! B S After removal of unit rules A! B and A! S: S 0! ASA ab a SA AS S! ASA ab a SA AS A! b ASA ab a SA AS Converting the rhs of each rule to have two nonterminals A! B N 1 C N 2 N 1! a N 2! d After converting to binary form: A! B N 3 N 1! a N 3! N 1 N 4 N 2! d N 4! C N 2 21 23 Conser some simpler examples for next two steps Removing terminals from the rhs of rules A! B a C d After removal of terminals from the rhs: A! B N 1 C N 2 N 1! a N 2! d 22 Back to original example After removal of unit rules A! B and A! S: S 0! ASA ab a SA AS S! ASA ab a SA AS A! b ASA ab a SA AS After adding variables: S 0! AA 1 UB a SA AS S! AA 1 UB a SA AS A! b AA 1 UB a SA AS A 1! SA U! a 24

Non-CF Languages CF Languages The pumping lemma for CFLs [Bar-Hillel] is similar to the pumping lemma for RLs For a string wuxvy in a CFL for u,v! " and the string is long enough then wu n xv n y is also in the CFL for n # 0 Not strong enough to work for every non- CF language (cf. Ogden s Lemma) 25 27 Non-CF Languages Context-free languages and Pushdown Automata Recall that for each regular language there was an equivalent finite-state automaton The FSA was used as a recognizer of the regular language For each context-free language there is also an automaton that recognizes it: called a pushdown automaton (pda) 26 28

Context-free languages and Pushdown Automata Similar to FSAs there are non-deterministic pda and deterministic pda Unlike in the case of FSAs we cannot always convert a npda to a dpda Similar to the FSA case, a DFA construction proves us with the algorithm for lexical analysis, In this case the construction of a dpda will prove us with the algorithm for parsing (take in strings and prove the parse tree) Summary CFGs can be used describe PL Derivations correspond to parse trees Parse trees represent structure of programs Ambiguous CFGs exist CF languages can be recognized using Pushdown Automata 29 31 Pushdown Automata PDA has an alphabet (terminals) and stack symbols (like non-terminals), a finite-state automaton, and stack ', '! $ 1, A! ' 0, '! A e.g. PDA for language L = { 0 n 1 n : n >= 0 }! implies a push/pop of stack symbol(s) push stack symbol A ', $! ' 1, A! ' pop stack symbol A check that stack is empty 30