COMP 330 Autumn 2018 McGill University

Assignment 4 Solutions and Grading Guide

Remarks for the graders appear in sans serif font.

Question 1 [25 points]

A sequence of parentheses is a sequence of ( and ) symbols, or the empty sequence. Such a sequence is said to be balanced if it has an equal number of ( and ) symbols and, in every prefix of the sequence, the number of left parentheses is greater than or equal to the number of right parentheses. Thus ((())()) is balanced but, for example, ())( is not balanced even though it has an equal number of left and right parentheses. The empty sequence is considered to be balanced. Consider the grammar

    S → (S) | SS | ε

This grammar is claimed to generate balanced parentheses.

1. Prove that any string generated by this grammar is a properly balanced sequence of parentheses. [10]

2. Prove that every sequence of balanced parentheses can be generated by this grammar. [15]

Solution

1. We can prove this by induction on the length of the derivation. The base case is a derivation with a single step. There is only one such derivation, and it can only produce the empty string, which is balanced. For the inductive case we proceed as follows. We assume that the claim holds for all derivations of length up to n. Now consider a derivation of length n + 1. There are two cases: (a) the first rule used is S → (S), and (b) the first rule used is S → SS. In case (a) we get a string of the form (w), where w is generated from S by a derivation of length n. So w is a balanced string, and hence clearly (w) is a balanced string. In case (b) we have a string of the form w₁w₂, where each of w₁ and w₂ is generated by S by a derivation of length n or less. Thus each of w₁ and w₂ is a balanced string. Clearly every prefix of w₁ has either (i) more left parentheses than right parentheses or (ii) an equal number of left and right parentheses. If we consider a prefix of w₁w₂ that includes all of w₁ and part of w₂, we see that the portion coming from w₁ has an equal number of left and right parentheses, while the portion coming from w₂ has at least as many left parentheses as right parentheses. Taken together, such a prefix has at least as many left parentheses as right parentheses. Finally, the entire string w₁w₂ has an equal number of left and right parentheses. Thus the string w₁w₂ is balanced.

I am happy if they say that every time a right parenthesis is introduced a left parenthesis is also introduced, and the left parenthesis is always to the left of the right parenthesis. This means that the final string must have equal numbers of left and right parentheses, and every prefix of the string has at least as many left parentheses as right parentheses. The careful induction I have done is more for practice with induction, as the verbal argument just given is perfectly rigorous.
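
The prefix characterization used in part 1 is easy to check mechanically: scan left to right keeping a counter. A minimal sketch in Python (the function name is my own, not part of the handout):

    def is_balanced(s: str) -> bool:
        """Check the definition directly: no prefix has an excess of ')',
        and the totals of '(' and ')' are equal."""
        depth = 0  # number of '(' minus number of ')' seen so far
        for ch in s:
            depth += 1 if ch == '(' else -1
            if depth < 0:      # some prefix has more ')' than '('
                return False
        return depth == 0      # totals must match at the end

    assert is_balanced("((())())")
    assert not is_balanced("())(")
    assert is_balanced("")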

2. We prove this by induction on the length of the string of parentheses. The empty string is balanced and can be generated by the grammar, so the base case is done. For the inductive case we assume that any string of balanced parentheses of length up to and including n can be generated by the grammar. Now consider a string w of balanced parentheses of length n + 1. Since it is balanced, the first symbol must be a left parenthesis. Now define a function b(p) of the position p in the string: b(p) is the number of left parentheses up to position p minus the number of right parentheses up to position p. In a balanced string b(1) = 1, b(p) ≥ 0 for every p, and at the end of the string b(p) must be zero. As we move right, the value of b(p) increases or decreases by exactly 1 at each step. Let p₀ be the position where the function attains zero for the first time. This may happen only at the end of the string, as in (((()))), or it may happen before the end, as in ((()))(()). If p₀ is at the end of the word, then our string w has the form (w₁), where w₁ is a balanced string. Since w₁ has length n − 1, by the induction hypothesis it can be generated by the grammar, and thus w can be generated by starting with the rule S → (S) and then using the S produced on the right-hand side to generate w₁. If p₀ occurs inside the string w, then we break the string into two pieces w = w₁w₂, where p₀ is at the end of w₁. Now, clearly, both w₁ and w₂ have to be balanced strings, and they are both of length n or less, so they can be generated by the grammar. Thus w can be generated by starting with the rule S → SS and then using the two S's generated to produce w₁ and w₂.

This proof requires some argument to justify why one can always break a balanced string into balanced substrings as I have described above. They should lose 7 points if they do not justify this step.
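
The proof is constructive: find the first position where b(p) returns to zero and recurse. The sketch below (my own illustration; it assumes its input is balanced) turns a balanced string into a nested structure recording which rule is applied at each step:

    def build(w: str):
        """Parse w in the grammar S -> (S) | SS | eps, following the
        case split on the first zero of b(p).  Assumes w is balanced."""
        if w == "":
            return ("eps",)                     # rule S -> eps
        depth = 0
        for i, ch in enumerate(w):
            depth += 1 if ch == "(" else -1
            if depth == 0:
                p0 = i                          # first zero of b(p)
                break
        if p0 == len(w) - 1:                    # zero only at the end: w = (w1)
            return ("(S)", build(w[1:-1]))      # rule S -> (S)
        return ("SS", build(w[:p0 + 1]), build(w[p0 + 1:]))  # rule S -> SS

    print(build("((())())"))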

Question 2 [15 points]

Consider the following context-free grammar:

    S → aS | aSbS | ε

This grammar generates all strings where, in every prefix, the number of a's is greater than or equal to the number of b's. Show that this grammar is ambiguous by giving an example of a string and showing that it has two different derivations.

Solution

Consider the string aab. It can be generated by

    S ⇒ aS ⇒ aaSbS ⇒ aaSb ⇒ aab

but also by

    S ⇒ aSbS ⇒ aaSbS ⇒ aabS ⇒ aab.

They must give two distinct trees. Two different derivation sequences are not enough, as unambiguous grammars can give different derivations just by altering the order in which nonterminals are expanded. If they give two linearized derivations that correspond to different trees, give them full marks, but if they give two different derivation sequences that correspond to the same tree, give them 5.
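
Since distinct leftmost derivations correspond one-to-one with distinct parse trees, ambiguity can also be exhibited by enumerating leftmost derivations. A brute-force sketch in Python (my own test harness, not part of the solution); for the string aab it prints exactly two leftmost derivations:

    from collections import deque

    RULES = ["aS", "aSbS", ""]          # the alternatives for S

    def leftmost_derivations(target: str, max_steps: int = 8):
        """Enumerate leftmost derivations of target; finding two or more
        distinct ones witnesses ambiguity."""
        results = []
        queue = deque([("S", ["S"])])   # (sentential form, derivation so far)
        while queue:
            form, deriv = queue.popleft()
            if "S" not in form:
                if form == target:
                    results.append(deriv)
                continue
            # prune: too many steps, or too many terminals already placed
            if len(deriv) > max_steps or len(form.replace("S", "")) > len(target):
                continue
            i = form.index("S")         # leftmost nonterminal
            for rhs in RULES:
                new = form[:i] + rhs + form[i + 1:]
                queue.append((new, deriv + [new]))
        return results

    for d in leftmost_derivations("aab"):
        print(" => ".join(d))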

Alternate Question 2 [15 points]

Prove that the grammar in the previous question generates only strings with the stated property, and all strings with the stated property.

Solution

You are doing alternate questions? Clearly you don't need explanations as detailed as the ones I gave in question 1. To prove that all the generated strings have the stated property is an easy induction on the length of the derivation, along the lines described in the solution to question 1. The only slightly subtle point is that you need to explain what happens when your derivation starts with the rule S → aSbS. In this case you have a string of the form aw₁bw₂, with both w₁ and w₂ having the stated property. So aw₁ has at least one more a than it has b's; thus when we get to the prefix aw₁b we are assured that the number of a's is at least the number of b's. Since w₂ also satisfies the property by the induction hypothesis, we see that all prefixes have the stated property.

To prove that every string with the stated property can be generated, we proceed by induction on the length of strings. The first letter has to be a, so the word we are considering in the inductive case has the form aw. Now w may itself have the property; in this case we can generate w by the induction hypothesis and then generate aw by starting with the rule S → aS. If w does not have the property, we look for the first place where there is an excess of b's: w can be written as w₁bw₂, where w₁b ends at the place where the first excess of b's occurs. I claim that w₂ must have the stated property. Suppose it does not; locate the first place within w₂ where an excess of b's occurs. Then we can write w₂ as w₃bw₄, and the whole word is aw₁bw₃bw₄. We see that the prefix aw₁bw₃b has an excess of b's, violating the condition that the overall word is supposed to satisfy. Going back to w₂ and abandoning the mythical w₃ and w₄, we see that the overall word has the form aw₁bw₂ where both w₁ and w₂ satisfy the required condition. Now we can generate the word starting with the rule S → aSbS and relying on the induction hypothesis to ensure that w₁ and w₂ can be generated.

They must justify the decomposition of the string for the induction argument. If they do not, the induction proof is worthless and they should get at most 5 points.
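
The second half of this proof is again effectively an algorithm. A sketch in Python (my own illustration) that builds a parse tree in S → aS | aSbS | ε for any string with the prefix property, following the proof's case split:

    def has_property(w: str) -> bool:
        """Every prefix has at least as many a's as b's."""
        excess = 0
        for ch in w:
            excess += 1 if ch == 'a' else -1
            if excess < 0:
                return False
        return True

    def parse(w: str):
        """Parse tree for w; assumes has_property(w) holds."""
        if w == "":
            return ("eps",)                       # rule S -> eps
        rest = w[1:]                              # w must start with 'a'
        if has_property(rest):
            return ("aS", parse(rest))            # rule S -> aS
        excess = 0                                # find the first excess of b's
        for i, ch in enumerate(rest):
            excess += 1 if ch == 'a' else -1
            if excess < 0:
                break
        w1, w2 = rest[:i], rest[i + 1:]           # rest = w1 + 'b' + w2
        return ("aSbS", parse(w1), parse(w2))     # rule S -> aSbS

    print(parse("aab"))   # yields one of the two trees from the ambiguity example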

Question 3 [15 points]

We define the language PAREN₂ inductively as follows:

1. ε ∈ PAREN₂;
2. if x ∈ PAREN₂ then so are (x) and [x];
3. if x and y are both in PAREN₂ then so is xy.

Describe a PDA for this language which accepts by empty stack. Give all the transitions.

Solution

Please note that when you are recognizing by empty stack you do not need accept states. Note also that you must be at the end of the string in order to accept; this is true whether acceptance is by empty stack or by accept state. Here is a simple DPDA to recognise the language PAREN₂: a single state with the following self-loop transitions, written as "input symbol, stack symbol popped → stack symbol pushed":

    (, ε → (
    [, ε → [
    ), ( → ε
    ], [ → ε

It pushes open parentheses and brackets onto the stack and pops them off only if the matching type is seen. The machine will jam if an unexpected symbol is seen. Since it accepts by empty stack, a word is accepted only if every open parenthesis or bracket was closed. Note that it rejects strings like [(]).

I did not spend a lot of time in class explaining the difference between empty-stack acceptance and acceptance by accept state. Give them full marks for a correct machine regardless of how they did it.
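
Because the machine is deterministic, it is easy to simulate. A small Python sketch (the function name and encoding are my own) that runs the four transitions above with empty-stack acceptance:

    # For each input symbol: (stack symbol that must be popped or None,
    # stack symbol to push or None).
    TRANS = {
        '(': (None, '('),
        '[': (None, '['),
        ')': ('(', None),
        ']': ('[', None),
    }

    def accepts_by_empty_stack(w: str) -> bool:
        stack = []
        for ch in w:
            if ch not in TRANS:
                return False              # no transition: the machine jams
            pop, push = TRANS[ch]
            if pop is not None:
                if not stack or stack[-1] != pop:
                    return False          # wrong or missing stack symbol: jam
                stack.pop()
            if push is not None:
                stack.append(push)
        return not stack                  # accept iff the stack is empty

    assert accepts_by_empty_stack("([])[]")
    assert not accepts_by_empty_stack("[(])")
    assert accepts_by_empty_stack("")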

Question 4 [20 points]

Consider the language {a^n b^m c^p | n ≤ p or m ≤ p}. Show that this is context-free by giving a grammar. You need not give a formal proof that your grammar is correct, but you must explain, at least briefly, why it works.

Solution

The following grammar G generates the given language, which we will call L:

    S → N | AM | SC
    N → aNc | B
    M → bMc | ε
    A → aA | ε
    B → bB | ε
    C → cC | ε

To see that any string generated by G is in L, we analyse the productions. The start symbol S gives the choice between having no fewer c's than a's (n ≤ p in the question) and no fewer c's than b's (m ≤ p). The symbol N (respectively M) generates words with as many c's as there are a's (b's), while allowing any number of b's (a's). The production S → SC allows more c's to be added.

Now we need to show that any word in L can be generated by G. Given a word w = a^n b^m c^p in L, we can generate it with the above grammar as follows. Let q = min(n, m) (note q ≤ p, since n ≤ p or m ≤ p) and split w = w₁w₂ = a^n b^m c^q · c^(p−q). Then w₁ is in L, and c^(p−q) is generated by C in G, so if w₁ is generated by G, then w can be generated using the rule S → SC. Suppose n ≤ m; then w₁ = a^n b^m c^n. The symbol B generates b^m, and by using the production N → aNc n times, N can generate w₁. Using the rule S → N, this gives that w₁ is generated by G, as required. If instead m ≤ n, then w₁ = a^n b^m c^m. In this case the symbol A generates a^n, while the symbol M generates b^m c^m. Using the rule S → AM, this too gives that G generates w₁, and so G generates w.

This question is tricky, so mark it generously. Give them 5 marks for the explanation; i.e., if the explanation is completely missing, take off 5.
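
One direction of the correctness claim can be spot-checked mechanically: sample random derivations from G and verify that every generated string is in L. A sketch in Python (my own test harness; the depth cutoff and fallback rules are assumptions made to force termination):

    import random

    RULES = {'S': ['N', 'AM', 'SC'], 'N': ['aNc', 'B'], 'M': ['bMc', ''],
             'A': ['aA', ''], 'B': ['bB', ''], 'C': ['cC', '']}
    # A terminating alternative for each nonterminal, used to cut off recursion.
    FALLBACK = {'S': 'N', 'N': 'B', 'M': '', 'A': '', 'B': '', 'C': ''}

    def sample(sym: str = 'S', depth: int = 0) -> str:
        """Draw a random string derivable from sym."""
        if sym not in RULES:
            return sym                  # terminal symbol
        rhs = random.choice(RULES[sym]) if depth < 12 else FALLBACK[sym]
        return ''.join(sample(ch, depth + 1) for ch in rhs)

    def in_L(w: str) -> bool:
        """Direct test of w = a^n b^m c^p with n <= p or m <= p."""
        n = len(w) - len(w.lstrip('a'))
        m = len(w[n:]) - len(w[n:].lstrip('b'))
        p = len(w) - n - m
        return w[n + m:] == 'c' * p and (n <= p or m <= p)

    for _ in range(1000):
        w = sample()
        assert in_L(w), w     # G should generate only strings in L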

Question 5 [25 points]

A linear grammar is one in which every rule has exactly one terminal and one nonterminal on the right-hand side, or just a single terminal on the right-hand side, or the empty string. Here is an example:

    S → aS | a | bB
    B → bB | b

1. Show that any regular language can be generated by a linear grammar. I will be happy if you show me how to construct a grammar from a DFA; if your construction is clear enough you can skip the straightforward proof that the language generated by the grammar and the language recognized by the DFA are the same.

2. Is every language generated by a linear grammar regular? If your answer is yes you must give a proof. If your answer is no you must give an example.

Solution

1. Suppose we have a regular language L; there has to be a DFA that recognizes it. We fix the alphabet to be Σ. Let the DFA be M = (S, s₀, δ, F), with the usual meanings of these symbols. We define a linear grammar as follows: the nonterminals N are identified with the states S of the DFA, so for any state p ∈ S we introduce a nonterminal P in our grammar. The start state of the DFA is identified with the start symbol S of the grammar. Now for every transition δ(p, a) = q we have a rule P → aQ, and for every accept state q ∈ F of the DFA we have a rule Q → ε in the grammar. It is easy to see that every run through the automaton produces exactly the sequence of rules needed to generate the string.

Yes, I do know that most of you Googled it.

2. The answer is no. If we had defined a left-linear grammar, in which we insisted that the terminal symbol appears to the left of the nonterminal, and then asked whether left-linear grammars always produce regular languages, the answer would be yes. The same could be said of right-linear grammars. But in the definition above I have allowed you to mix left and right rules. Consider the grammar below:

    S → aA | ε
    A → Sb

This generates our old friend {a^n b^n | n ≥ 0}.

I think everyone will get the first part easily. I didn't ask for a proof, so if the construction is clear give them full marks. For part 2, people may have looked up random definitions on the web. Give them an IMMEDIATE ZERO if they did not use my definition. They need to learn that a definition I give is the one they have to use. If they come to complain, send them to me; don't waste time arguing with them about it. However, if they use a definition they found elsewhere and proved that it is equivalent to my definition then, of course, they should get full marks.
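
The construction in part 1 is mechanical, and writing it out makes the correspondence between runs and derivations concrete. A sketch in Python (the DFA encoding and names are my own choices):

    def dfa_to_linear_grammar(start, delta, finals):
        """Part 1 construction: a rule P -> aQ for every transition
        delta(p, a) = q, and P -> eps for every accept state p."""
        rules = [(p, a + q) for (p, a), q in delta.items()]
        rules += [(p, '') for p in finals]          # P -> eps for accept states
        return start, rules

    # Example: DFA over {a, b} accepting strings with an even number of a's.
    delta = {('E', 'a'): 'O', ('E', 'b'): 'E',
             ('O', 'a'): 'E', ('O', 'b'): 'O'}
    start, rules = dfa_to_linear_grammar('E', delta, {'E'})
    for lhs, rhs in rules:
        print(lhs, '->', rhs if rhs else 'ε')

The grammar this produces is right-linear, so it only ever yields regular languages; the point of part 2 is that the mixed definition above is strictly more powerful.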