Context Free Grammars. CS154 Chris Pollett Mar 1, 2006.

Similar documents
Normal Forms and Parsing. CS154 Chris Pollett Mar 14, 2007.

Context Free Grammars, Parsing, and Amibuity. CS154 Chris Pollett Feb 28, 2007.

Limits of Computation p.1/?? Limits of Computation p.2/??

Yet More CFLs; Turing Machines. CS154 Chris Pollett Mar 8, 2006.

JNTUWORLD. Code No: R

Ambiguous Grammars and Compactification

CS154 Midterm Examination. May 4, 2010, 2:15-3:30PM

CT32 COMPUTER NETWORKS DEC 2015

Closure Properties of CFLs; Introducing TMs. CS154 Chris Pollett Apr 9, 2007.

CS525 Winter 2012 \ Class Assignment #2 Preparation

Derivations of a CFG. MACM 300 Formal Languages and Automata. Context-free Grammars. Derivations and parse trees

Normal Forms for CFG s. Eliminating Useless Variables Removing Epsilon Removing Unit Productions Chomsky Normal Form

MIT Specifying Languages with Regular Expressions and Context-Free Grammars

Context-Free Grammars


Context-Free Grammars and Languages (2015/11)

Introduction to Syntax Analysis

CMPT 755 Compilers. Anoop Sarkar.

COP 3402 Systems Software Syntax Analysis (Parser)

Languages and Compilers

Context-Free Languages & Grammars (CFLs & CFGs) Reading: Chapter 5

Compiler Construction

UNIT I PART A PART B

Optimizing Finite Automata

MIT Specifying Languages with Regular Expressions and Context-Free Grammars. Martin Rinard Massachusetts Institute of Technology

Part 5 Program Analysis Principles and Techniques

Context Free Languages

CS 181 B&C EXAM #1 NAME. You have 90 minutes to complete this exam. You may assume without proof any statement proved in class.

CMSC 330: Organization of Programming Languages. Context Free Grammars

Architecture of Compilers, Interpreters. CMSC 330: Organization of Programming Languages. Front End Scanner and Parser. Implementing the Front End

CMSC 330: Organization of Programming Languages. Architecture of Compilers, Interpreters

Introduction to Syntax Analysis. The Second Phase of Front-End

Non-deterministic Finite Automata (NFA)

Final Course Review. Reading: Chapters 1-9

Definition 2.8: A CFG is in Chomsky normal form if every rule. only appear on the left-hand side, we allow the rule S ǫ.

1. Which of the following regular expressions over {0, 1} denotes the set of all strings not containing 100 as a sub-string?

Finite Automata. Dr. Nadeem Akhtar. Assistant Professor Department of Computer Science & IT The Islamia University of Bahawalpur

Homework. Context Free Languages. Before We Start. Announcements. Plan for today. Languages. Any questions? Recall. 1st half. 2nd half.

Formal Languages and Compilers Lecture V: Parse Trees and Ambiguous Gr

CMSC 330: Organization of Programming Languages

Multiple Choice Questions

MA513: Formal Languages and Automata Theory Topic: Context-free Grammars (CFG) Lecture Number 18 Date: September 12, 2011

CMSC 330: Organization of Programming Languages. Context Free Grammars

CMSC 330: Organization of Programming Languages

CMSC 330: Organization of Programming Languages

(a) R=01[((10)*+111)*+0]*1 (b) ((01+10)*00)*. [8+8] 4. (a) Find the left most and right most derivations for the word abba in the grammar

Context-free grammars (CFGs)

Syntax Analysis Part I

CMSC 330: Organization of Programming Languages. Context Free Grammars

([1-9] 1[0-2]):[0-5][0-9](AM PM)? What does the above match? Matches clock time, may or may not be told if it is AM or PM.

ΕΠΛ323 - Θεωρία και Πρακτική Μεταγλωττιστών

CS 321 Programming Languages and Compilers. VI. Parsing

Decision Properties for Context-free Languages

Context-Free Languages and Parse Trees

14.1 Encoding for different models of computation

NFAs and Myhill-Nerode. CS154 Chris Pollett Feb. 22, 2006.

1. (10 points) Draw the state diagram of the DFA that recognizes the language over Σ = {0, 1}

Parsing Techniques. CS152. Chris Pollett. Sep. 24, 2008.

Homework. Announcements. Before We Start. Languages. Plan for today. Chomsky Normal Form. Final Exam Dates have been announced

CSE P 501 Compilers. Parsing & Context-Free Grammars Hal Perkins Winter /15/ Hal Perkins & UW CSE C-1

Principles of Programming Languages COMP251: Syntax and Grammars

Chapter 4. Lexical and Syntax Analysis. Topics. Compilation. Language Implementation. Issues in Lexical and Syntax Analysis.

CS5371 Theory of Computation. Lecture 8: Automata Theory VI (PDA, PDA = CFG)

Parsing: Derivations, Ambiguity, Precedence, Associativity. Lecture 8. Professor Alex Aiken Lecture #5 (Modified by Professor Vijay Ganesh)

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2016

LR Parsing Techniques

CS415 Compilers. Syntax Analysis. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

Where We Are. CMSC 330: Organization of Programming Languages. This Lecture. Programming Languages. Motivation for Grammars

2.2 Syntax Definition

A Simple Syntax-Directed Translator

1. [5 points each] True or False. If the question is currently open, write O or Open.

Downloaded from Page 1. LR Parsing

Models of Computation II: Grammars and Pushdown Automata

Properties of Regular Expressions and Finite Automata

Natural Language Processing

Reflection in the Chomsky Hierarchy

CS 314 Principles of Programming Languages

Limitations of Algorithmic Solvability In this Chapter we investigate the power of algorithms to solve problems Some can be solved algorithmically and

Context-Free Grammars

Regular Expressions. Agenda for Today. Grammar for a Tiny Language. Programming Language Specifications

3. Syntax Analysis. Andrea Polini. Formal Languages and Compilers Master in Computer Science University of Camerino

Theory of Computation

DHANALAKSHMI SRINIVASAN INSTITUTE OF RESEARCH AND TECHNOLOGY SIRUVACHUR, PERAMBALUR DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

CSE302: Compiler Design

QUESTION BANK. Formal Languages and Automata Theory(10CS56)

Context Free Languages and Pushdown Automata

Outline. Parser overview Context-free grammars (CFG s) Derivations Syntax-Directed Translation

CSEP 501 Compilers. Languages, Automata, Regular Expressions & Scanners Hal Perkins Winter /8/ Hal Perkins & UW CSE B-1

Defining syntax using CFGs

Lecture 8: Context Free Grammars

CS Lecture 2. The Front End. Lecture 2 Lexical Analysis

programming languages need to be precise a regular expression is one of the following: tokens are the building blocks of programs

Introduction to Parsing. Lecture 8

Formal Languages and Grammars. Chapter 2: Sections 2.1 and 2.2

The CKY Parsing Algorithm and PCFGs. COMP-599 Oct 12, 2016

Skyup's Media. PART-B 2) Construct a Mealy machine which is equivalent to the Moore machine given in table.

CS 315 Programming Languages Syntax. Parser. (Alternatively hand-built) (Alternatively hand-built)

Chapter 3: Describing Syntax and Semantics. Introduction Formal methods of describing syntax (BNF)

Syntax Analysis. Martin Sulzmann. Martin Sulzmann Syntax Analysis 1 / 38

Transcription:

Context Free Grammars CS154 Chris Pollett Mar 1, 2006.

Outline Formal Definition Ambiguity Chomsky Normal Form

Formal Definitions A context free grammar is a 4-tuple (V, Σ, R, S) where 1. V is a finite set called the variables 2. Σ is a finite set, disjoint from V called the terminals. 3. R is a finite set of rules, with each rule being a pair consisting of a variable and a string of variables and terminals, and 4. S V is a start variable. For a rule A--> w where w is a string over (V Σ), and for other strings u and v, we say uav yields uwv, written uav => uwv. We say u derives v, written u=> * v, if there is a finite sequence: u => u 1 => u 2 => => u k => v.

Example Consider the grammar G= (V, Σ, R, <EXPR>) where V is {<EXPR>, <TERM>, <FACTOR>} and Σ is (a, +, x, (, )) and the rules are: <EXPR> --> <EXPR> + <TERM> <TERM> <TERM> --> <TERM> x <FACTOR> <FACTOR> <FACTOR> --> (<EXPR>) a One can verify that <EXPR> => * (a+a) x a.

Techniques for Designing CFGs Many CFLs are the union of simpler CFLs. So one can design a CFG for each in turn with start states S 1, S 2, S n.then take the union of the rules and add a new start variable with a rule S--> S 1 S 2 S n. For example, take the language {0 n 1 n n>=0} {1 n 0 n n>=0}. First we could make CFGs for each language separately. Say, S 1 --> 0 S 1 1 ε and S 2 --> 1 S 2 0 ε. Then add the rule S--> S 1 S 2. A CFG for a language that is regular can be had by first make a DFA for the language. For each state q i make a variable R i and for each transition δ(q i,a) = q j make a rule R i --> a R j. Have the start state variable be R 0. Add rules R i --> ε for each final state.

More Techniques for Designing CFGs For CFL which contain two substrings which are linked in the sense that a machine for such a language would need to remember information about one on the strings to verify information about the other substring, you might want to consider rules of the form R --> u R v. Here u and v should satisfy the property you are trying to verify.

Ambiguity Sometime a grammar can generate string in more than one way. Such a string will have several different parse trees. As the parse tree is supposed to give us the meaning, such a string would have more than one meaning. A string with more than one parse tree with respect to a grammar is said to be ambiguously derived in that grammar. For example, consider <EXPR> --> <EXPR>+ <EXPR> <EXPR> x <EXPR> (<EXPR>) a. Then a + a x a can be derived with two different parse trees.

Leftmost Derivations We want to formalize the notion of ambiguity in terms of derivations rather than parse trees as derivations are easier to work with syntactically. We say that a derivation of a string w in a grammar G is a leftmost derivation if at every step the leftmost remaining variable is the one replaced. A string w is derived ambiguously in G if it has two or more different leftmost derivations. A CFG is called ambiguous if it generates some string ambiguously. There are often many different CFGs for the same language. Even though one of these may be ambiguous some other may be unambiguous. We say a language is inherently ambiguous if one can never find an unambiguous CFG for it. One of the problems in the book asks you to prove that {a i b j c k i=j or j=k} is inherently ambiguous.

Chomsky Normal Form When working with CFGs it is convenient to have them in some kind of normal form in order to do proofs. Chomsky Normal Form is often used. A CFG is in Chomsky Normal Form if every rule is of the form A-->BC or of the form A-->a, where A,B,C are any variables and a is a terminal. In addition the rule S--> ε is permitted.

Conversion to Chomsky Normal Form Any CFL L can be generated by a CFG in Chomsky Normal Form Proof Let G be a CFG for L. First we add a new start variable and rule S 0 -->S. This guarentees the start variable does not occur on the RHS of any rule. Second we remove any ε-rules A--> ε where A is not the start variable. Then for each occurrence of A on the RHS of a rule, say R--> uav, we add a rule R--> uv. We do this for each occurrence of an A. So for R--> uavaw, we would add the rules R-->uvAw, R--> uavw, R--> uvw. If we had the rule R-->A, add the rule R--> ε unless we previously removed the rule R--> ε. Then we repeat the process with R. Next we handle unit rule A--> B. To do this, we delete this rule and then for each rule of the form B--> u, we add then rule A-->u, unless this is a unit rule that was previously removed. We repeat until we eliminate unit rules. Finally, we convert all the remaining rules to the proper form. For any rule A--> u 1 u 2 u k where k>=3 and each ui is a variable or a terminal symbol, we replace the rule with A --> u 1 A 1, A 1 --> u 2 A 2, A k-2 --> u k-1 u k. For any rule with k=2, we replace any nonterminal with a new variable U i and a rule U i --> u i.