Homework. Announcements. Before We Start. Languages. Plan for today. Chomsky Normal Form. Final Exam Dates have been announced

Similar documents
Homework. Context Free Languages. Before We Start. Announcements. Plan for today. Languages. Any questions? Recall. 1st half. 2nd half.

Normal Forms for CFG s. Eliminating Useless Variables Removing Epsilon Removing Unit Productions Chomsky Normal Form

Decision Properties for Context-free Languages

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018

Ambiguous Grammars and Compactification

JNTUWORLD. Code No: R

Chapter 18: Decidability

(a) R=01[((10)*+111)*+0]*1 (b) ((01+10)*00)*. [8+8] 4. (a) Find the left most and right most derivations for the word abba in the grammar

Normal Forms. Suradet Jitprapaikulsarn First semester 2005

Parsing. Cocke Younger Kasami (CYK) Laura Kallmeyer. Winter 2017/18. Heinrich-Heine-Universität Düsseldorf 1 / 35

CS5371 Theory of Computation. Lecture 8: Automata Theory VI (PDA, PDA = CFG)

Context-Free Languages and Parse Trees

Models of Computation II: Grammars and Pushdown Automata

Context Free Grammars. CS154 Chris Pollett Mar 1, 2006.

Context-Free Grammars

CT32 COMPUTER NETWORKS DEC 2015

Final Course Review. Reading: Chapters 1-9

Lecture 12: Cleaning up CFGs and Chomsky Normal

Limits of Computation p.1/?? Limits of Computation p.2/??

Context-Free Grammars

Lecture 8: Context Free Grammars

SLR parsers. LR(0) items

Introduction to Syntax Analysis

Compilation 2012 Context-Free Languages Parsers and Scanners. Jan Midtgaard Michael I. Schwartzbach Aarhus University

Compiler Construction

Actually talking about Turing machines this time

Introduction to Syntax Analysis. The Second Phase of Front-End

Multiple Choice Questions

Finite Automata. Dr. Nadeem Akhtar. Assistant Professor Department of Computer Science & IT The Islamia University of Bahawalpur

MA513: Formal Languages and Automata Theory Topic: Context-free Grammars (CFG) Lecture Number 18 Date: September 12, 2011

Context-Free Grammars

PDA s. and Formal Languages. Automata Theory CS 573. Outline of equivalence of PDA s and CFG s. (see Theorem 5.3)

3. Syntax Analysis. Andrea Polini. Formal Languages and Compilers Master in Computer Science University of Camerino

Skyup's Media. PART-B 2) Construct a Mealy machine which is equivalent to the Moore machine given in table.

Solving systems of regular expression equations

CS210 THEORY OF COMPUTATION QUESTION BANK PART -A UNIT- I

Context Free Languages and Pushdown Automata

Non-deterministic Finite Automata (NFA)

1. Which of the following regular expressions over {0, 1} denotes the set of all strings not containing 100 as a sub-string?

Derivations of a CFG. MACM 300 Formal Languages and Automata. Context-free Grammars. Derivations and parse trees

QUESTION BANK. Unit 1. Introduction to Finite Automata


A Characterization of the Chomsky Hierarchy by String Turing Machines

Formal Languages. Grammar. Ryan Stansifer. Department of Computer Sciences Florida Institute of Technology Melbourne, Florida USA 32901

CS 314 Principles of Programming Languages. Lecture 3

Compilation Lecture 3: Syntax Analysis: Top-Down parsing. Noam Rinetzky

Languages and Compilers

UNIT I PART A PART B

CS 44 Exam #2 February 14, 2001

T.E. (Computer Engineering) (Semester I) Examination, 2013 THEORY OF COMPUTATION (2008 Course)

I have read and understand all of the instructions below, and I will obey the Academic Honor Code.

CSE 105 THEORY OF COMPUTATION

Formal Grammars and Abstract Machines. Sahar Al Seesi

COP 3402 Systems Software Syntax Analysis (Parser)

Formal Languages and Automata

CS 314 Principles of Programming Languages

Definition 2.8: A CFG is in Chomsky normal form if every rule. only appear on the left-hand side, we allow the rule S ǫ.

CpSc 421 Final Solutions

Context-Free Languages & Grammars (CFLs & CFGs) Reading: Chapter 5

Top-Down Parsing and Intro to Bottom-Up Parsing. Lecture 7

TAFL 1 (ECS-403) Unit- V. 5.1 Turing Machine. 5.2 TM as computer of Integer Function

Closure Properties of CFLs; Introducing TMs. CS154 Chris Pollett Apr 9, 2007.

LR Parsing - The Items

Lexical and Syntax Analysis. Bottom-Up Parsing

In One Slide. Outline. LR Parsing. Table Construction

Syntax Analysis Part I

Plan for Today. Regular Expressions: repetition and choice. Syntax and Semantics. Context Free Grammars

Midterm Exam II CIS 341: Foundations of Computer Science II Spring 2006, day section Prof. Marvin K. Nakayama

Defining syntax using CFGs

CS375: Logic and Theory of Computing

Syntax Analysis: Context-free Grammars, Pushdown Automata and Parsing Part - 4. Y.N. Srikant

Ambiguity, Precedence, Associativity & Top-Down Parsing. Lecture 9-10

Parsing Expression Grammars and Packrat Parsing. Aaron Moss

Top-Down Parsing and Intro to Bottom-Up Parsing. Lecture 7

Syntax Analysis. Prof. James L. Frankel Harvard University. Version of 6:43 PM 6-Feb-2018 Copyright 2018, 2015 James L. Frankel. All rights reserved.

Context Free Languages

Homework. Context Free Languages III. Languages. Plan for today. Context Free Languages. CFLs and Regular Languages. Homework #5 (due 10/22)

Parsing Part II (Top-down parsing, left-recursion removal)

Formal Languages and Compilers Lecture IV: Regular Languages and Finite. Finite Automata

Formal Languages and Compilers Lecture VI: Lexical Analysis

Context-Free Languages and Pushdown Automata

Parsing. For a given CFG G, parsing a string w is to check if w L(G) and, if it is, to find a sequence of production rules which derive w.

Introduction to Lexing and Parsing

The CYK Parsing Algorithm

LR(0) Parsers. CSCI 3130 Formal Languages and Automata Theory. Siu On CHAN. Fall Chinese University of Hong Kong 1/31

Context-Free Grammars

Exercise 1: Balanced Parentheses

Definition: A context-free grammar (CFG) is a 4- tuple. variables = nonterminals, terminals, rules = productions,,

Reflection in the Chomsky Hierarchy

CS 314 Principles of Programming Languages

CSE P 501 Compilers. Parsing & Context-Free Grammars Hal Perkins Spring UW CSE P 501 Spring 2018 C-1

Week 2: Syntax Specification, Grammars

Finite Automata Part Three

Fall Compiler Principles Lecture 3: Parsing part 2. Roman Manevich Ben-Gurion University

Question Marks 1 /11 2 /4 3 /13 4 /5 5 /11 6 /14 7 /32 8 /6 9 /11 10 /10 Total /117

DVA337 HT17 - LECTURE 4. Languages and regular expressions

CSCI312 Principles of Programming Languages

CS2210: Compiler Construction Syntax Analysis Syntax Analysis

LING/C SC/PSYC 438/538. Lecture 20 Sandiway Fong

MIT Parse Table Construction. Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology

Transcription:

Homework Homework #3 returned Homework #4 due today Homework #5 Pg 169 -- Exercise 4 Pg 183 -- Exercise 4c,e,i Pg 184 -- Exercise 10 Pg 184 -- Exercise 12 Pg 185 -- Exercise 17 Due 10 / 17 Announcements Final Exam Dates have been announced Friday, November 17 10:15am -- 12:15pm Room 70-3435 Before We Start Exam 1 will be returned next class Any questions? Conflicts? Let me know. Plan for today 1st half 2nd half Pushdown Automata Languages Recall. What is a language? What is a class of languages? Note: minor change from original schedule 1

The Language Bubble Context Free Languages Regular Languages Finite Languages Context Free Languages Context Free Languages(CFL) is the next class of languages outside of Regular Languages: Language / grammar: Context Free Grammar Machine for accepting: Pushdown Automata Grammars Let s formalize this a bit: A grammar is a 4-tuple: (V, T, P, S) where V is a set of variables T is a set of terminals P is a set of production rules V and T are disjoint (I.e. V T = ) S V, is your start symbol General Grammars Production Rules Of the form A B A is a string of terminals and variables B is a string of terminals and variables To apply a rule, replace any occurrence of A with the string B. Grammars Context Free Grammars Let s formalize this a bit: Production rules We say that γ can be derived from α in one step: A β is a rule α = α 1 A α 2 γ = α 1 β α 2 α γ Production Rules Of the form A B A is a variable B is a string, combining terminals and variables To apply a rule, replace an occurrence of A with the string B. We write α * γ if γ can be derived from α in zero or more steps. We say that the grammar is context-free since this substitution can take place regardless of where A is. 2

Context Free Grammars The language generated by a grammar Let G = (V, T, P, S) The language generated by G, L(G) L(G) = { x T * S * x} A language L is a Context Free Language (CFL) iff there is a CFG G, such that L = L(G) A context free grammar is in Chomsky Normal Form (CNF) if every production is of the form: A BC A a Where A,B, and C are variables and a is a terminal. Theory Hall of Fame Noam Chomsky The Grammar Guy 1928 b. Philadelphia, PA PhD UPenn (1955) Linguistics Prof at MIT (Linguistics) (1955 - present) If we can put a CFG into CNF, then we can calculate the depth of the longest branch of a parse tree for the derivation of a string. B A C At most 2 branches at every node Probably more famous for his leftist political views. a 3 Step process: Remove λ- Productions Remove Unit Productions Remove Useless Symbols A λ-productions is a production of the form A λ Basic idea Find the set of all variables A such that A * λ (set of nullable variables) For all productions that contain a nullable variable on the right hand side, add a production that eliminates the nullable from the right hand side 3

We must be a bit careful here If λ is in a CFL, then the production S λ must be in the production set. The algorithm to be described will generate L {λ} Step 1: Find the set of nullable variables: Example: S AB A aaa λ B bbb λ All variables are nullable A and B are nullable since A ε and B ε S is nullable since S AB and A and B are nullable Step 2: Remove nullable variables For all productions A β where β contains nullable variables, add a new production with each nullable removed from β Step 2: Remove nullable variables Example: S AB A aaa λ B bbb λ All variables are nullable Step 2: Remove nullable variables Example: Consider: S AB Add to P: S A and S B Consider: A aaa Add to P: A aa and A a Consider: B bbb Add to P: B bb and B b Removing λ -Productions Step 2: Remove nullable variables Our grammar now looks like: S AB A B A aaa aa a λ B bbb bb b λ 4

Step 3: Remove your λ-productions Example: Remove A λ and B λ Our final grammar looks like: S AB A B A aaa aa a B bbb bb b Questions? A Unit Productions is a production of the form A B where A and B are variable Basic idea Very similar to removing λ productions For each variable A, find the set of all variables B such that A * B by just following unit productions (Aderivable) For all variables B that are A derivable and for all productions B α, add the production A α Step 0: Remove λ-productions using the previous algorithm. Step 1: For all variables A find the set of A-derivable variables: Recursive definition of A-derivable 1. If A B then B is A-derivable 2. If C is A derivable and C B (and B A), then B is A derivable 3. No other variables are A-derivable. Step 1: For all variables A find the set of A- derivable variables: Example: S S + T T T T * F F Step 1: For all variables A find the set of A- derivable variables: Example: S S + T T T T * F F Let s find the set of S-derivable variables: T is S derivable since S T F is S derivable since T F and T is S derivable S-derivable = {T, F} T-derivable = {F} F-derivable = 5

Step 2: For each variable A, if B is A- derivable, for each non-unit production B β, add the production A β Step 2: Example: S S + T T T T * F F S-derivable = {T, F} T-derivable = {F} Add to P: S T * F, S (S) a : T (S) a Step 2: Our new grammar now looks like: S S + T T * F (S) a T T T * F (S) a F Step 3: Remove Unit Productions Our final grammar looks like: Our new grammar now looks like: S S + T T * F (S) a T T * F (S) a Remove S T, T F Questions Removing Useless Symbols A symbol X is useful for a grammar G = (V, T, P, S) if S * αxβ * w where w L(G) In other words, a useful symbol will be used somewhere in the derivation of a string in the language. Any symbol that is not useful is useless. Useless symbols do not add to the language generated by a grammar, so it s okay to remove them. Removing Useless Symbols Definitions: We say a symbol X is generating if: X * w for some w L(G) We say a symbol X is reachable if: S * α Xβ for some α, β Symbols that are useful must be both generating and reachable. Such symbols (and assoc. productions) can be removed 6

Removing useless symbols Removing useless symbols Algorithm: Eliminate all non generating symbols Eliminate all non reachable symbols from resultant grammar. Finding generating symbols All symbols in T are generating If A α and all symbols in α are generating, then A is generating. No other symbols are generating. Removing useless symbols Removing Useless Symbols Finding reachable symbols S is reachable If A is reachable, and A α, then all variables in α are reachable. Example: S AB a A b B is useless since it is not generating Eliminate it Removing useless symbols Recall our goal Example: S a A b Now A is not reachable, eliminate it! S a A context free grammar is in Chomsky Normal Form (CNF) if every production is of the form: A BC A a Note that you must eliminate non-generating symbols before non-reachable symbols. Where A,B, and C are variables and a is a terminal. 7

Given a CFG G, there is an equivalent CFG, G in Chomsky Normal form such that L(G ) = L(G) {λ} Step 1: Remove λ-productions Step 2: Remove Unit Productions Step 3: Remove useless symbols After steps 1 3 : All productions are of the form: A a where A is a variable and a is a terminal A β where β 2 and β contains variables and/or terminals. Step 4: Derive terminals from new variables: For all productions of the 2 nd type: A β, for all terminals a in β, create a new variable X a Add a new production X a a Replace a in β with X a Step 4: Let s go back to our first example: S AB A B A aaa aa a B bbb bb b Removing unit transitions: S AB aaa aa a bbb bb b A aaa aa a B bbb bb b Note that S, A, and B are all useful. Step 4: Define new productions: X a a and X b b and replace instance of a with X a, similarly for b S AB aaa aa a bbb bb b A aaa aa a B bbb bb b New: S AB X a AA X a A a X b BB X b B b A X a AA X a A a B X b BB X b B b X a a X b b After steps 1 4 : All productions are of the form: A a where A is a variable and a is a terminal A β where β 2 and β contains only variables. Step 5: For all productions of type 2 where β > 2, replace the production with a series of new productions each having exactly 2 variables on the right Best illustrated with an example 8

Step 4: The production: A BCDBCE Would be replaced with A BY 1 Y 1 CY 2 Y 2 DY 3 Y 3 BY 4 Y 4 CE Step 4: Back to our example S AB X a AA X a A a X b BB X b B b A X a AA X a A a B X b BB X b B b X a a X b b Add productions Y 1 AA Y 2 BB Step 4: Our final grammar Questions S AB X a Y 1 X a A a X b Y 2 X b B b A X a Y 1 X a A a B X b Y 2 X b B b Y 1 AA Y 2 BB X a a X b b CNF Any grammar can be placed into CNF Why bother? Means to simplify grammars Gives upper limit on size of parse tree And we ll need this factoid next week. Questions? Break. 9