Context Free Grammars, Parsing, and Amibuity. CS154 Chris Pollett Feb 28, 2007.

Similar documents
Context Free Grammars. CS154 Chris Pollett Mar 1, 2006.

Normal Forms and Parsing. CS154 Chris Pollett Mar 14, 2007.

Closure Properties of CFLs; Introducing TMs. CS154 Chris Pollett Apr 9, 2007.

Formal Languages and Compilers Lecture V: Parse Trees and Ambiguous Gr

JNTUWORLD. Code No: R

Context-Free Grammars

Context-Free Languages & Grammars (CFLs & CFGs) Reading: Chapter 5

CS5371 Theory of Computation. Lecture 8: Automata Theory VI (PDA, PDA = CFG)

Program Abstractions, Language Paradigms. CS152. Chris Pollett. Aug. 27, 2008.

1. The output of lexical analyser is a) A set of RE b) Syntax Tree c) Set of Tokens d) String Character

Chapter 3: Describing Syntax and Semantics. Introduction Formal methods of describing syntax (BNF)

Ambiguous Grammars and Compactification

Syntax Analysis. Chapter 4

( ) i 0. Outline. Regular languages revisited. Introduction to Parsing. Parser overview. Context-free grammars (CFG s) Lecture 5.

Optimizing Finite Automata

Outline. Parser overview Context-free grammars (CFG s) Derivations Syntax-Directed Translation

CS525 Winter 2012 \ Class Assignment #2 Preparation

CS 2210 Sample Midterm. 1. Determine if each of the following claims is true (T) or false (F).

Yet More CFLs; Turing Machines. CS154 Chris Pollett Mar 8, 2006.

Outline. Regular languages revisited. Introduction to Parsing. Parser overview. Context-free grammars (CFG s) Lecture 5. Derivations.

Introduction to Parsing. Lecture 5. Professor Alex Aiken Lecture #5 (Modified by Professor Vijay Ganesh)

Introduction to Parsing. Lecture 8

Introduction to Parsing. Lecture 5

Homework. Context Free Languages. Before We Start. Announcements. Plan for today. Languages. Any questions? Recall. 1st half. 2nd half.

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2016

Derivations of a CFG. MACM 300 Formal Languages and Automata. Context-free Grammars. Derivations and parse trees

CS154 Midterm Examination. May 4, 2010, 2:15-3:30PM

Outline. Limitations of regular languages. Introduction to Parsing. Parser overview. Context-free grammars (CFG s)

CT32 COMPUTER NETWORKS DEC 2015

Parsing Techniques. CS152. Chris Pollett. Sep. 24, 2008.

Introduction to Parsing. Lecture 5

MA513: Formal Languages and Automata Theory Topic: Context-free Grammars (CFG) Lecture Number 18 Date: September 12, 2011

Compilers - Chapter 2: An introduction to syntax analysis (and a complete toy compiler)

Limitations of Algorithmic Solvability In this Chapter we investigate the power of algorithms to solve problems Some can be solved algorithmically and

CMPT 755 Compilers. Anoop Sarkar.

Introduction to Parsing


2.2 Syntax Definition

Natural Language Processing

COP 3402 Systems Software Syntax Analysis (Parser)

Grammars and ambiguity. CS164 3:30-5:00 TT 10 Evans. Prof. Bodik CS 164 Lecture 8 1

Parsing: Derivations, Ambiguity, Precedence, Associativity. Lecture 8. Professor Alex Aiken Lecture #5 (Modified by Professor Vijay Ganesh)

CS 44 Exam #2 February 14, 2001

Syntax Analysis. Prof. James L. Frankel Harvard University. Version of 6:43 PM 6-Feb-2018 Copyright 2018, 2015 James L. Frankel. All rights reserved.

EDAN65: Compilers, Lecture 06 A LR parsing. Görel Hedin Revised:

MIT Specifying Languages with Regular Expressions and Context-Free Grammars

MIT Specifying Languages with Regular Expressions and Context-Free Grammars. Martin Rinard Massachusetts Institute of Technology

Properties of Regular Expressions and Finite Automata

Section A. A grammar that produces more than one parse tree for some sentences is said to be ambiguous.

R10 SET a) Construct a DFA that accepts an identifier of a C programming language. b) Differentiate between NFA and DFA?

Context-Free Grammars

CS 314 Principles of Programming Languages

Intro To Parsing. Step By Step

Normal Forms for CFG s. Eliminating Useless Variables Removing Epsilon Removing Unit Productions Chomsky Normal Form

QUESTION BANK. Formal Languages and Automata Theory(10CS56)

Context Free Languages

([1-9] 1[0-2]):[0-5][0-9](AM PM)? What does the above match? Matches clock time, may or may not be told if it is AM or PM.

CS152: Programming Languages. Lecture 2 Syntax. Dan Grossman Spring 2011

CSCI 1260: Compilers and Program Analysis Steven Reiss Fall Lecture 4: Syntax Analysis I

Context-Free Languages. Wen-Guey Tzeng Department of Computer Science National Chiao Tung University

UNIT I PART A PART B

E E+E E E (E) id. id + id E E+E. id E + E id id + E id id + id. Overview. derivations and parse trees. Grammars and ambiguity. ambiguous grammars

The Turing Machine. Unsolvable Problems. Undecidability. The Church-Turing Thesis (1936) Decision Problem. Decision Problems

Introduction to Parsing Ambiguity and Syntax Errors

CSE P 501 Compilers. Parsing & Context-Free Grammars Hal Perkins Winter /15/ Hal Perkins & UW CSE C-1

Context-Free Languages. Wen-Guey Tzeng Department of Computer Science National Chiao Tung University

Context-Free Languages. Wen-Guey Tzeng Department of Computer Science National Chiao Tung University

programming languages need to be precise a regular expression is one of the following: tokens are the building blocks of programs

Compiler Construction

Introduction to Parsing Ambiguity and Syntax Errors

Chapter 4. Lexical and Syntax Analysis. Topics. Compilation. Language Implementation. Issues in Lexical and Syntax Analysis.

Context-Free Grammars and Languages (2015/11)

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

Review: Shift-Reduce Parsing. Bottom-up parsing uses two actions: Bottom-Up Parsing II. Shift ABC xyz ABCx yz. Lecture 8. Reduce Cbxy ijk CbA ijk

CS 321 Programming Languages and Compilers. VI. Parsing

CS143 Handout 20 Summer 2011 July 15 th, 2011 CS143 Practice Midterm and Solution

Bottom-Up Parsing. Lecture 11-12

CMPS Programming Languages. Dr. Chengwei Lei CEECS California State University, Bakersfield

Variants of Turing Machines

LL(k) Parsing. Predictive Parsers. LL(k) Parser Structure. Sample Parse Table. LL(1) Parsing Algorithm. Push RHS in Reverse Order 10/17/2012

Context-Free Languages. Wen-Guey Tzeng Department of Computer Science National Chiao Tung University

Introduction to Syntax Analysis

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018

NFAs and Myhill-Nerode. CS154 Chris Pollett Feb. 22, 2006.

CS 536 Midterm Exam Spring 2013

Defining syntax using CFGs

Context-Free Grammars

EDAN65: Compilers, Lecture 04 Grammar transformations: Eliminating ambiguities, adapting to LL parsing. Görel Hedin Revised:

CSE P 501 Compilers. Parsing & Context-Free Grammars Hal Perkins Spring UW CSE P 501 Spring 2018 C-1

Fall Compiler Principles Context-free Grammars Refresher. Roman Manevich Ben-Gurion University of the Negev

Syntax Analysis. Martin Sulzmann. Martin Sulzmann Syntax Analysis 1 / 38

Parsing. Top-Down Parsing. Laura Kallmeyer. Winter 2017/18. Heinrich-Heine-Universität Düsseldorf 1 / 19

Final Course Review. Reading: Chapters 1-9

Bottom-Up Parsing II (Different types of Shift-Reduce Conflicts) Lecture 10. Prof. Aiken (Modified by Professor Vijay Ganesh.

An Interactive Approach to Formal Languages and Automata with JFLAP

Non-deterministic Finite Automata (NFA)

Lecture 8: Deterministic Bottom-Up Parsing

Grammars and Parsing. Paul Klint. Grammars and Parsing

Context-Free Languages and Parse Trees

Introduction to Syntax Analysis. The Second Phase of Front-End

Transcription:

Context Free Grammars, Parsing, and Amibuity CS154 Chris Pollett Feb 28, 2007.

Outline JFLAP Intuitions about Compiler Ambiguity Left and Rightmost Derivations Brute Force Parsing

JFLAP and the Pumping Lemma JFLAP has a Regular Pumping Lemma button. Clicking on it gives a window:

More JFLAP Selecting one of these languages lets you play a game versus the computer. You choose number of states of machine. i.e., pumping length. It selects a string longer than this length. You get to choose how it is split The computer either tries to find a string that can be pumped but which is not in the language or it loses.

Yet More JFLAP JFLAP also has a grammar button. If you click on it you can then specify a grammar. If you like, it could be a regular grammar or a CFG. JFLAP supports conversions of the grammar to a number of normal forms as well as supports check if a string is in the language of a grammar.

Intuitions about Compilers Recall compilers take strings written in a high level language like C or Java and spit out code in machine language. So a compilers job is to assign machine code meaning to a string written in C. The C language might be specified using a CFG. For instance, we might have a rule like: <while_statement> ::= while <expression> <statement> Crudely, we could imagine writing a recursive program to handle this: int parsewhile(string program, int whereparsing, MachineCode whilecode) { MachineCode expressioncode = new MachineCode(); MachineCode statementcode = new MachineCode(); whereparsing += 5; //advance passed the keyword while whereparsing = parseexpression(program, whereparsing, expressioncode); whereparsing = parsestatement(program, whereparsing, statementcode); // some code to build whilecode from expressioncode and statementcode return whereparsing; } The underlying assumption for compilation to work is that there can be only one machine code meaning one can give to a given C string. Here we have to be careful

Ambiguity Sometime a grammar can generate string in more than one way. Such a string will have several different parse trees. As the parse tree is supposed to give us the meaning of the string, such a string would have more than one meaning. A string with more than one parse tree with respect to a grammar is said to be ambiguously derived in that grammar. For example, consider --> + x () a. Then a + a x a can be derived with two different parse trees. a + a x a a + a x a The left tree probably means a+(a x a); whereas, the right means (a+a) x a

Leftmost and Rightmost Derivations We want to formalize the notion of ambiguity in terms of derivations rather than parse trees as derivations are easier to work with syntactically. We say that a derivation of a string w in a grammar G is a leftmost derivation if at every step the leftmost remaining variable is the one replaced. We say that a derivation of a string w in a grammar G is a rightmost derivation if at every step the rightmost remaining variable is the one replaced. As an example, if our CFG was A-->BAC λ, B --> b, C-->c. Then A=>BAC=>bAC=>bBACC=>bbACC=>bbCC=>bbcC=>bbcc, is a leftmost derivation of bbcc. On the other hand, A=>BAC=>BAc=>BBACc=>BBAcc=>BBcc=>Bbcc=>bbcc would be a rightmost derivation of the same string. Intuitively, if we have two ways to expand the leftmost (respectively, rightmost) symbol then the derivation will be ambiguous.

Ambiguity and Leftmost Derivations A string w is derived ambiguously in G if it has two or more different leftmost derivations (resp. rightmost derivations). A CFG is called ambiguous if it generates some string ambiguously. In the case of a + a x a, the two left most derivations are: (1) => + => a + => a + x => a+ a x => a + a x a; (2) => x => + x => a + x => a+ a x => a + a x a; There are often many different CFGs for the same language. Even though one of these may be ambiguous some other may be unambiguous. We say a language is inherently ambiguous if one can never find an unambiguous CFG for it. The book says without proof that {a i b j c k i=j or j=k} is inherently ambiguous.

Brute Force Parsing One way to do parsing is by exhaustive search. We consider each one step derivation from the start variable, then each two step derivation, etc. in turn. If we ever see the string we want we accept. If all the active derivations involve strings of terminals and variables longer than the string w we are searching for, we halt and reject. To handle rules like A->B. Which can give derivations like A=>B=>A, we maintain a list of strings we have already seen. If we repeat, we prune that branch. This is an exponential time algorithm; whereas, if we use a normal form for our grammars we can speed things up to be either cubic or in some cases linear time.