The CYK Algorithm. We present now an algorithm to decide if w ∈ L(G), assuming G to be in Chomsky Normal Form.


CFG [1] The CYK Algorithm. We present now an algorithm to decide if w ∈ L(G), assuming G to be in Chomsky Normal Form. This is an example of the technique of dynamic programming. Let n be |w|. The natural algorithm (trying all derivations of length < 2n) may be exponential. This technique gives an O(n^3) algorithm!

CFG [2] Dynamic programming.
    fib 0 = fib 1 = 1
    fib (n + 2) = fib n + fib (n + 1)
fib 5? It calls fib 4 and fib 3, and fib 4 calls fib 3 again. So in a top-down computation there is duplication of work (if one does not use memoization).
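The duplication can be made concrete with a small sketch. The names `fibNaive` and `fibBottomUp` are illustrative, not from the slides; the first follows the recursive definition literally, the second carries the last two values forward so that each value is computed once.

```haskell
-- Top-down, as in the definition: fibNaive (n - 1) recomputes fibNaive (n - 2).
fibNaive :: Int -> Integer
fibNaive 0 = 1
fibNaive 1 = 1
fibNaive n = fibNaive (n - 2) + fibNaive (n - 1)

-- Bottom-up: keep only the last two values, one addition per step.
fibBottomUp :: Int -> Integer
fibBottomUp n = go n 1 1
  where
    go 0 a _ = a
    go k a b = go (k - 1) b (a + b)
```

Both functions agree on every non-negative argument; the difference is only in the number of additions performed, which is exponential in n for the first and linear for the second.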

CFG [3] Dynamic programming. For a bottom-up computation: fib 2 = 2, fib 3 = 3, fib 4 = 5, fib 5 = 8. What is going on in the CYK algorithm or the Earley algorithm is similar. Consider the grammar
    S → AB | BC
    A → BA | a
    B → CC | b
    C → AB | a
Is bab ∈ L(G)? And is aba ∈ L(G)?

CFG [4] Dynamic programming. The idea is to represent bab as the collection of the facts b(0, 1), a(1, 2), b(2, 3). We then compute the facts X(i, k) for i < k by induction on k − i. There is only one rule: if we have a production C → AB, with A in X(i, j) and B in X(j, k), then C is in X(i, k).
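The rule above can be sketched as a table computation. This is a minimal illustration, not the slides' own code: the `Grammar` record, `cykTable` and `accepts` are assumed names, variables and terminals are single characters, and the table is a lazy list of lists (simple, but with slow indexing, so it does not itself reach the O(n^3) bound).

```haskell
import Data.List (nub)

-- Hypothetical representation of a grammar in Chomsky Normal Form.
data Grammar = Grammar
  { start     :: Char
  , termRules :: [(Char, Char)]        -- (X, a)    for X -> a
  , binRules  :: [(Char, Char, Char)]  -- (C, A, B) for C -> A B
  }

-- The grammar S -> AB | BC, A -> BA | a, B -> CC | b, C -> AB | a.
slideGrammar :: Grammar
slideGrammar = Grammar
  { start     = 'S'
  , termRules = [('A','a'), ('B','b'), ('C','a')]
  , binRules  = [ ('S','A','B'), ('S','B','C'), ('A','B','A')
                , ('B','C','C'), ('C','A','B') ]
  }

-- table !! i !! k is X(i, k): the variables deriving positions i..k of w.
-- Laziness evaluates each cell at most once, after the shorter spans it needs.
cykTable :: Grammar -> String -> [[String]]
cykTable g w = table
  where
    n = length w
    table = [ [ cell i k | k <- [0 .. n] ] | i <- [0 .. n] ]
    x i k = table !! i !! k
    cell i k
      | k <= i     = []
      | k == i + 1 = nub [ v | (v, c) <- termRules g, c == w !! i ]
      | otherwise  = nub [ c | j <- [i + 1 .. k - 1]
                             , (c, a, b) <- binRules g
                             , a `elem` x i j, b `elem` x j k ]

-- w is in L(G) iff the start variable is in X(0, n).
accepts :: Grammar -> String -> Bool
accepts g w = not (null w) && start g `elem` (cykTable g w !! 0 !! length w)
```

With this sketch, `accepts slideGrammar "bab"` is `True` (S is in X(0, 3) via S → BC with the split at j = 1), while `accepts slideGrammar "aba"` is `False` (X(0, 3) contains only B).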

CFG [5] The CYK Algorithm. The algorithm is best understood in terms of production systems. Example: the grammar
    S → AB | BA | SS | AC | BD
    A → a
    B → b
    C → SB
    D → SA
becomes the production system

CFG [6] The CYK Algorithm.
    A(x, y), B(y, z) → S(x, z)
    B(x, y), A(y, z) → S(x, z)
    S(x, y), S(y, z) → S(x, z)
    A(x, y), C(y, z) → S(x, z)
    B(x, y), D(y, z) → S(x, z)
    S(x, y), B(y, z) → C(x, z)
    S(x, y), A(y, z) → D(x, z)
    a(x, y) → A(x, y)
    b(x, y) → B(x, y)

CFG [7] The CYK Algorithm. The problem of whether one can derive S ⇒* aabbab is transformed into the problem: can one produce S(0, 6) in this production system, given the facts a(0, 1), a(1, 2), b(2, 3), b(3, 4), a(4, 5), b(5, 6)?

CFG [8] The CYK Algorithm. For this we apply a forward-chaining (bottom-up) sequence of productions:
    A(0, 1), A(1, 2), B(2, 3), B(3, 4), A(4, 5), B(5, 6)
    S(1, 3), S(3, 5), S(4, 6)
    S(1, 5), C(1, 4), C(3, 6)
    S(0, 4), ...
    S(0, 6)
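The forward-chaining reading can be sketched directly: keep a set of facts and apply the rules until nothing new appears. The names `Fact`, `step`, `saturate` and `derivable` are illustrative assumptions, and this naive fixpoint loop rescans all pairs of facts in each round, so it shows the inference but not the O(n^3) schedule.

```haskell
import Data.List (nub)

-- A fact P(x, y): symbol P spans positions x to y of the input.
type Fact = (Char, Int, Int)

-- Rules P(x, y), Q(y, z) -> H(x, z), stored as (P, Q, H).
binaryRules :: [(Char, Char, Char)]
binaryRules =
  [ ('A','B','S'), ('B','A','S'), ('S','S','S'), ('A','C','S')
  , ('B','D','S'), ('S','B','C'), ('S','A','D') ]

-- Rules p(x, y) -> H(x, y), stored as (p, H).
unaryRules :: [(Char, Char)]
unaryRules = [('a','A'), ('b','B')]

-- The word w as facts c(i, i + 1).
inputFacts :: String -> [Fact]
inputFacts w = [ (c, i, i + 1) | (c, i) <- zip w [0 ..] ]

-- One round: apply every rule to every matching combination of known facts.
step :: [Fact] -> [Fact]
step fs = nub (fs ++ unary ++ binary)
  where
    unary  = [ (h, x, y) | (p, x, y) <- fs, (q, h) <- unaryRules, p == q ]
    binary = [ (h, x, z) | (p, x, y) <- fs, (q, y', z) <- fs, y == y'
                         , (l, r, h) <- binaryRules, l == p, r == q ]

-- Repeat until no new fact appears; the set of possible facts is finite.
saturate :: [Fact] -> [Fact]
saturate fs
  | length fs' == length fs = fs
  | otherwise               = saturate fs'
  where fs' = step fs

derivable :: String -> Bool
derivable w = ('S', 0, length w) `elem` saturate (inputFacts w)
```

For the word aabbab the saturated set contains C(3, 6), S(0, 4) and finally S(0, 6), so `derivable "aabbab"` holds.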

CFG [9] The CYK Algorithm. For instance, the fact that C(3, 6) is produced corresponds to the derivation C ⇒ SB ⇒ BAB ⇒ bAB ⇒ baB ⇒ bab. In this way, we get a solution in O(n^3)!

CFG [10] Forward-chaining inference. This idea actually works for any grammar. For instance, S → SS | aSb | ɛ is represented by the production system
    → S(x, x)
    S(x, y), S(y, z) → S(x, z)
    a(x, y), S(y, z), b(z, t) → S(x, t)
and the problem of deciding S ⇒* aabb is replaced by the problem of deriving S(0, 4) from the facts a(0, 1), a(1, 2), b(2, 3), b(3, 4).
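A sketch of this second production system, under the same assumptions as before (facts as triples, naive saturation, illustrative names). The rule with an empty premise is handled by seeding S(x, x) for every position x, and the three-premise rule is matched directly without first converting the grammar to Chomsky Normal Form.

```haskell
import Data.List (nub)

type Fact = (Char, Int, Int)

-- One round of the two non-trivial rules for S -> SS | aSb | ɛ.
step :: [Fact] -> [Fact]
step fs = nub (fs ++ ss ++ asb)
  where
    -- S(x, y), S(y, z) -> S(x, z)
    ss  = [ ('S', x, z) | ('S', x, y) <- fs, ('S', y', z) <- fs, y == y' ]
    -- a(x, y), S(y, z), b(z, t) -> S(x, t)
    asb = [ ('S', x, t) | ('a', x, y) <- fs, ('S', y', z) <- fs, y == y'
                        , ('b', z', t) <- fs, z == z' ]

saturate :: [Fact] -> [Fact]
saturate fs
  | length fs' == length fs = fs
  | otherwise               = saturate fs'
  where fs' = step fs

-- Decide S =>* w by deriving S(0, n) from the input facts.
balanced :: String -> Bool
balanced w = ('S', 0, n) `elem` saturate (input ++ epsilon)
  where
    n       = length w
    input   = [ (c, i, i + 1) | (c, i) <- zip w [0 ..] ]
    epsilon = [ ('S', x, x) | x <- [0 .. n] ]  -- the rule with empty premise
```

For aabb the chain is S(2, 2), then S(1, 3) from a(1, 2) and b(2, 3), then S(0, 4) from a(0, 1) and b(3, 4).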

CFG [11] Forward-chaining inference. This is the main idea behind the Earley algorithm, which is mainly used for parsing in computational linguistics. Earley parsers are interesting because they can parse all context-free languages.

CFG [12] Pumping Lemma for CFL. We prove that {a^n b^n c^m | n ≠ m} is not context-free using the pumping lemma. Similar problem for {a^n b^m | m ≠ n}: one can show that it is not regular using the pumping lemma.

CFG [13] Complement of a CFL. We have seen that CFLs are not closed under intersection but are closed under union. It follows that they are not closed under complement. Here is an explicit example: we show that the complement of {a^n b^n c^n | n ≥ 0} is a CFL. For this we prove that the complement of L(a*b*c*) is regular. The remaining words of the complement, those a^i b^j c^k with i ≠ j or j ≠ k, form a CFL, and the complement is the union of the two.

CFG [14] Undecidable Problems. We have given algorithms to decide L(G) ≠ ∅ and w ∈ L(G). What is surprising is that it can be shown that there are no algorithms for the following problems:
    Given G_1 and G_2, do we have L(G_1) ⊆ L(G_2)? Do we have L(G_1) = L(G_2)?
    Given G and a regular expression R, do we have L(G) = L(R)? L(R) ⊆ L(G)?
    Do we have L(G) = T*, where T is the alphabet of G? (Compare to the case of regular languages.)
    Given G, is G ambiguous?

CFG [15] Undecidable Problems. One shows this by reducing the Post Correspondence Problem to these problems. Given u_1, ..., u_n and v_1, ..., v_n in {0, 1}*, is it possible to find i_1, ..., i_k such that u_i1 ... u_ik = v_i1 ... v_ik? Example: 1, 10, 011 and 101, 00, 11. Challenge example: 001, 01, 01, 10 and 0, 011, 101, 001.

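CFG [16] Haskell Program
    -- isprefix xs ys: is xs a prefix of ys?
    isprefix [] ys = True
    isprefix (x:xs) (y:ys) = x == y && isprefix xs ys
    isprefix xs ys = False

    -- a pair is compatible if one component is a prefix of the other
    iscomp (xs,ys) = isprefix xs ys || isprefix ys xs

    exists p [] = False
    exists p (x:xs) = p x || exists p xs

    -- first element satisfying p (assumes that one exists)
    exhibit p (x:xs) = if p x then x else exhibit p xs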

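CFG [17] Haskell Program
    -- number the elements of a list starting from k
    addnum k [] = []
    addnum k (x:xs) = (k,x):(addnum (k+1) xs)

    -- extend every partial solution in ys by every numbered pair in xs
    nextstep xs ys =
      concat (map (\ (n,(s,t)) ->
                map (\ (ns,(u,v)) -> (ns++[n],(u ++ s,v ++ t))) ys) xs)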

CFG [18] Haskell Program
    -- breadth-first search: keep the compatible candidates, stop on a match
    mainloop xs ys =
      let bs = filter (iscomp . snd) ys
          prop (_,(u,v)) = u == v
      in if exists prop bs then exhibit prop bs
         else if bs == [] then error "no SOLUTION"
         else mainloop xs (nextstep xs bs)

CFG [19] Haskell Program
    post xs =
      let as = addnum 1 xs
      in mainloop as (map (\ (n,z) -> ([n],z)) as)

    xs1 = [("1","101"),("10","00"),("011","11")]
    xs2 = [("001","0"),("01","011"),("01","101"),("10","001")]

CFG [20] Haskell Program
    Main> post xs1
    ([1,3,2,3],("101110011","101110011"))
    Main> post xs2
    ERROR - Garbage collection fails to reclaim sufficient space
    [2,2,2,3,2,2,2,3,3,4,4,6,8,8,15, 21,15,17,18,24,15,12,12,18,18,24,24,45, 63,66,84,91,140,182,201,346,418,324,330,321,423,459,780
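The solution reported for xs1 can be checked independently of the search. The following sketch uses illustrative names (`apply`, `isSolution`) that are not part of the program above; it concatenates the selected top and bottom words and compares them.

```haskell
-- Concatenate the top and bottom words selected by a list of 1-based indices.
apply :: [(String, String)] -> [Int] -> (String, String)
apply pairs is = (concatMap (fst . pick) is, concatMap (snd . pick) is)
  where pick i = pairs !! (i - 1)

-- A solution is a non-empty index sequence whose two concatenations agree.
isSolution :: [(String, String)] -> [Int] -> Bool
isSolution pairs is = not (null is) && uncurry (==) (apply pairs is)

xs1 :: [(String, String)]
xs1 = [("1","101"), ("10","00"), ("011","11")]
```

Here `apply xs1 [1,3,2,3]` gives `("101110011","101110011")`: the top row is 1, 011, 10, 011 and the bottom row is 101, 11, 00, 11.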

CFG [21] Post Correspondence Problem and CFL. To the sequence u_1, ..., u_n we associate the following grammar G_A. The alphabet is {0, 1, a_1, ..., a_n}. The productions are
    A → u_1 a_1 | ... | u_n a_n | u_1 A a_1 | ... | u_n A a_n
This grammar is unambiguous.

CFG [22] Post Correspondence Problem and CFL. To the sequence v_1, ..., v_n we associate the following grammar G_B. The alphabet is the same, {0, 1, a_1, ..., a_n}. The productions are
    B → v_1 a_1 | ... | v_n a_n | v_1 B a_1 | ... | v_n B a_n
This grammar is unambiguous.

CFG [23] Post Correspondence Problem and CFL. Theorem: We have L(G_A) ∩ L(G_B) ≠ ∅ iff the Post Correspondence Problem for u_1, ..., u_n and v_1, ..., v_n has a solution.
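To see why the theorem holds, note that a derivation in G_A that uses the indices i_1, ..., i_k produces the word u_i1 ... u_ik a_ik ... a_i1, and similarly for G_B. A sketch with assumed names (`Sym`, `wordOf`): a word common to both languages has the same markers, hence the same index sequence, and the same bit string, which is exactly a solution.

```haskell
-- Terminals are the bits '0', '1' and the markers a_1, ..., a_n.
data Sym = Bit Char | Marker Int deriving (Eq, Show)

-- The word derived using indices i1, ..., ik: u_i1 ... u_ik a_ik ... a_i1.
wordOf :: [String] -> [Int] -> [Sym]
wordOf ws is =
  map Bit (concatMap (\i -> ws !! (i - 1)) is) ++ map Marker (reverse is)

-- The first example instance: the u's give G_A, the v's give G_B.
us, vs :: [String]
us = ["1", "10", "011"]
vs = ["101", "00", "11"]
```

For the index sequence 1, 3, 2, 3 the two words coincide, so this word lies in both L(G_A) and L(G_B); for 1, 2 they differ.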

CFG [24] Post Correspondence Problem and CFL. Finally we have the grammar G with the additional productions S → A | B. Theorem: The grammar G is ambiguous iff the Post Correspondence Problem for u_1, ..., u_n and v_1, ..., v_n has a solution.

CFG [25] Post Correspondence Problem and CFL. The complement of L(G_A) is CF. We see this on one example: u_1 = 0, u_2 = 10. The complement of L(G_B) is CF. Hence we have a grammar G_C for the union of the complement of L(G_A) and the complement of L(G_B).