First Midterm Exam CS164, Fall 2007 Oct 2, 2007

Similar documents
Midterm I - Solution CS164, Spring 2014

CS164: Midterm I. Fall 2003

Midterm I (Solutions) CS164, Spring 2002

CS143 Midterm Fall 2008

Final Exam CS164, Fall 2007 Dec 18, 2007

CS164 Second Midterm Exam Fall 2014

CS 164 Handout 11. Midterm Examination. There are seven questions on the exam, each worth between 10 and 20 points.

Midterm II CS164, Spring 2006

ECE251 Midterm practice questions, Fall 2010

Semantic Analysis. Lecture 9. February 7, 2018

Today. Assignments. Lecture Notes CPSC 326 (Spring 2019) Quiz 5. Exam 1 overview. Type checking basics. HW4 due. HW5 out, due in 2 Tuesdays

1. Consider the following program in a PCAT-like language.

BSCS Fall Mid Term Examination December 2012

MIDTERM EXAMINATION - CS130 - Spring 2003

Problem Score Max Score 1 Syntax directed translation & type

Question Marks 1 /20 2 /16 3 /7 4 /10 5 /16 6 /7 7 /24 Total /100

CS 115 Exam 1, Fall 2015 Thu. 09/24/2015

CS 2210 Sample Midterm. 1. Determine if each of the following claims is true (T) or false (F).

UVa ID: NAME (print): CS 4501 LDI Midterm 1

CSE 401 Midterm Exam Sample Solution 2/11/15

A simple syntax-directed

CS143 Midterm Sample Solution Fall 2010

CS 164 Programming Languages and Compilers Handout 9. Midterm I Solution

Syntax Analysis. Chapter 4

UMBC CMSC 331 Final Exam

CSCE 531 Spring 2009 Final Exam

Briefly describe the purpose of the lexical and syntax analysis phases in a compiler.

CMSC330 Fall 2016 Midterm #2 2:00pm/3:30pm

2068 (I) Attempt all questions.

CSE 401/M501 18au Midterm Exam 11/2/18. Name ID #

Midterm Exam II CIS 341: Foundations of Computer Science II Spring 2006, day section Prof. Marvin K. Nakayama

Building a Parser III. CS164 3:30-5:00 TT 10 Evans. Prof. Bodik CS 164 Lecture 6 1

COL728 Minor1 Exam Compiler Design Sem II, Answer all 5 questions Max. Marks: 20

CS164 First Midterm Exam Fall 2014

UNIVERSITY OF CALIFORNIA Department of Electrical Engineering and Computer Sciences Computer Science Division. P. N. Hilfinger

CMSC 330, Fall 2018 Midterm 2

MIDTERM EXAM (Solutions)

CS143 Midterm Spring 2014

Examples of attributes: values of evaluated subtrees, type information, source file coordinates,

CS 406/534 Compiler Construction Putting It All Together

CS 164 Handout 16. Final Examination. There are nine questions on the exam, some in multiple parts. You have 3 hours to work on the

UNIVERSITY OF CALIFORNIA

CS 415 Midterm Exam Spring SOLUTION

Compilers. Yannis Smaragdakis, U. Athens (original slides by Sam

Midterm 2 Solutions Many acceptable answers; one was the following: (defparameter g1

CS Lecture 2. The Front End. Lecture 2 Lexical Analysis

Administrativia. PA2 assigned today. WA1 assigned today. Building a Parser II. CS164 3:30-5:00 TT 10 Evans. First midterm. Grammars.

Midterm Exam. CSCI 3136: Principles of Programming Languages. February 20, Group 2

CSE 401 Midterm Exam 11/5/10

Question Bank. 10CS63:Compiler Design

CMSC330 Spring 2014 Midterm #2. Score 1 OCaml types & type

CS/ECE 374 Fall Homework 1. Due Tuesday, September 6, 2016 at 8pm

CSCI 1260: Compilers and Program Analysis Steven Reiss Fall Lecture 4: Syntax Analysis I

COP 3402 Systems Software Top Down Parsing (Recursive Descent)

I have read and understand all of the instructions below, and I will obey the Academic Honor Code.

CMSC330 Spring 2018 Midterm 2 9:30am/ 11:00am/ 3:30pm

CMSC 330, Fall 2018 Midterm 2

Midterm Exam. CS169 Fall November 17, 2009 LOGIN: NAME: SID: Problem Max Points Points TOTAL 100

Syntax and Grammars 1 / 21

Lexical Analysis. Dragon Book Chapter 3 Formal Languages Regular Expressions Finite Automata Theory Lexical Analysis using Automata

CS143 Handout 20 Summer 2012 July 18 th, 2012 Practice CS143 Midterm Exam. (signed)

Chapter 4. Lexical and Syntax Analysis

Lexical and Syntax Analysis. Top-Down Parsing

UNIVERSITY OF CALIFORNIA Department of Electrical Engineering and Computer Sciences Computer Science Division

Syntax-Directed Translation. Lecture 14

Question Marks 1 /11 2 /4 3 /13 4 /5 5 /11 6 /14 7 /32 8 /6 9 /11 10 /10 Total /117

CS 314 Principles of Programming Languages

CS 406/534 Compiler Construction Parsing Part I

EXAM. CS331 Compiler Design Spring Please read all instructions, including these, carefully

CSCI Compiler Design

In addition to the correct answer, you MUST show all your work in order to receive full credit.

CS 4240: Compilers and Interpreters Project Phase 1: Scanner and Parser Due Date: October 4 th 2015 (11:59 pm) (via T-square)

CS 44 Exam #2 February 14, 2001

CS 536 Midterm Exam Spring 2013

CIS 110 Introduction to Computer Programming 8 October 2013 Midterm

Context-Free Grammars

CSE 130 Programming Language Principles & Paradigms Lecture # 5. Chapter 4 Lexical and Syntax Analysis

Syntactic Analysis. The Big Picture Again. Grammar. ICS312 Machine-Level and Systems Programming

CS 415 Midterm Exam Spring 2002

CS164: Final Exam. Fall 2003

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING Subject Name: CS2352 Principles of Compiler Design Year/Sem : III/VI

CS103 Handout 42 Spring 2017 May 31, 2017 Practice Final Exam 1

Parsing III. (Top-down parsing: recursive descent & LL(1) )

Lexical and Syntax Analysis

2.2 Syntax Definition

Introduction to Parsing. Lecture 5

CMSC330 Spring 2016 Midterm #2 9:30am/12:30pm/3:30pm

Project Compiler. CS031 TA Help Session November 28, 2011

CS 164 Programming Languages and Compilers Handout 8. Midterm I

Prelude COMP 181 Tufts University Computer Science Last time Grammar issues Key structure meaning Tufts University Computer Science

Building a Parser II. CS164 3:30-5:00 TT 10 Evans. Prof. Bodik CS 164 Lecture 6 1

Comp 311: Sample Midterm Examination

Compilers and computer architecture From strings to ASTs (2): context free grammars

CS 411 Midterm Feb 2008

CSE 582 Autumn 2002 Exam 11/26/02

Part 5 Program Analysis Principles and Techniques

MidTerm Papers Solved MCQS with Reference (1 to 22 lectures)

Semantic Analysis. Outline. The role of semantic analysis in a compiler. Scope. Types. Where we are. The Compiler Front-End

CS143 Final Spring 2016

Introduction to Parsing. Lecture 8

Transcription:

P a g e 1 First Midterm Exam CS164, Fall 2007 Oct 2, 2007 Please read all instructions (including these) carefully. Write your name, login, and SID. No electronic devices are allowed, including cell phones used as watches. Silence your cell phones and place them in your bag. The exam is closed book, but you may refer to one (1) page of handwritten notes. Solutions will be graded on correctness and clarity. Each problem has a relatively simple and straightforward solution. Partial solutions will be graded for partial credit. There are 8 pages in this exam and 4 questions, each with multiple parts. If you get stuck on a question move on and come back to it later. You have 1 hour and 20 minutes to work on the exam. Please write your answers in the space provided on the exam, and clearly mark your solutions. You may use the backs of the exam pages as scratch paper. Do not use any additional scratch paper. LOGIN: NAME: SID: Problem Max points Points 1 20 2 30 3 25 4 25 TOTAL 100

P a g e 2 Problem 1: Miscellaneous [20 points] 1) [4 points] Circle pairs of regular expressions that are equivalent (in that they describe the same sets of strings): a. (ab)+ (ab)*ab b. ab* (ab)* c. (a b+) (a (b)+) d. a+* a+ 2) [4 points] Tokenize the following fragments of Java programs. Each fragment contains an error but tokenization is still possible. Indicate tokenization by drawing characters between lexemes. a. int j = a ++ b ; b. int j = a ++ ++ + b ; c. int $ foo ( int a ) { return 1 ; } 3) [4 points] CYK parser accepts arbitrary context-free grammars. This is because the CYK parser implicitly disambiguates these grammars. True or False 4) [4 points] A language is a set of strings. REGEX is the set of all languages that can be described with regular expressions and CFG is the set of all languages that can be described by context free grammars. Which relationship holds? Circle all applicable smileys. Answer: A B C D E F A REGEX is a strict subset of CFG B REGEX is a subset of CFG C REGEX is equal to CFG D REGEX is a superset of CFG E REGEX is a strict superset of CFG F None of the above 5) [4 points] There are languages that can be represented by NFAs but not by DFAs. True or False

P a g e 3 Problem 2: Earley Parser [30 points] This is a grammar for a simple programming language with a single function declaration (F). The body (B) of the function contains an expression (E) that is either a variable or a function call. P is a list of formal parameters and A is a list actual arguments. F int id ( P ) { B } P id P, P E id E ( A ) A B E E, A E This is a string from the language described by our grammar: int id ( id, id ) { id } This is a scratch area. Solution goes on the next page. See below how to draw edges. Fun int id ( P ) { B } P id P, P E id E ( A ) A E E, A B E int Fun int id ( P ) { B } P id P, P E id E ( A ) A E E, A B E id Fun int id ( P ) { B } P id P, P E id E ( A ) A E E, A B E ( Fun int id ( P ) { B } P id P, P E id E ( A ) A E E, A B E id Fun int id ( P ) { B } P id P, P E id E ( A ) A E E, A B E, Fun int id ( P ) { B } P id P, P E id E ( A ) A E E, A B E id Fun int id ( P ) { B } P id P, P E id E ( A ) A E E, A B E ) Fun int id ( P ) { B } P id P, P E id E ( A ) A E E, A B E { Fun int id ( P ) { B } P id P, P E id E ( A ) A E E, A B E id Fun int id ( P ) { B } P id P, P E id E ( A ) A E E, A B E } Fun int id ( P ) { B } P id P, P E id E ( A ) A E E, A B E

P a g e 4 Part 1. [15 points] Show the edges added for this string by the Earley parser: Fun int id ( P ) { B } P id P, P E id E ( A ) A E E, A B E int Fun int id ( P ) { B } P id P, P E id E ( A ) A E E, A B E id Fun int id ( P ) { B } P id P, P E id E ( A ) A E E, A B E ( Fun int id ( P ) { B } P id P, P E id E ( A ) A E E, A B E id Fun int id ( P ) { B } P id P, P E id E ( A ) A E E, A B E, Fun int id ( P ) { B } P id P, P E id E ( A ) A E E, A B E id Fun int id ( P ) { B } P id P, P E id E ( A ) A E E, A B E ) Fun int id ( P ) { B } P id P, P E id E ( A ) A E E, A B E { Fun int id ( P ) { B } P id P, P E id E ( A ) A E E, A B E id Fun int id ( P ) { B } P id P, P E id E ( A ) A E E, A B E } Fun int id ( P ) { B } P id P, P E id E ( A ) A E E, A B E Part 2. [5 points] Is the grammar ambiguous? Circle one. YES NO due to the PP P rule Part 3. [5 points] How many parse trees did the Earley parser discover? 1 Part 4. [5 points] Draw below three (3) edges that would be placed by the CYK parser but were not placed by the Earley parser. Label the edges in CYK style. int id ( id, id ) { id }

P a g e 5 Problem 3: Representation Conversions [25 points] Part 1. [8 points] Convert the following grammar to a regular expression, or concisely justify why this is not possible. Note: we do not care about the parse tree; the regular expression must represent the same set of strings as the grammar. Hint: write a few strings from the language and find a regular pattern, if possible. ARITHMETIC CFG: E A M 0 1 A E + E M E * E YOUR REGEX: ( 0 1 ) ( ( + * ) ( 0 1 ))* Justification: Part 2. [8 points] Same problem as above but for a different grammar: Hint: write a few strings from the language and find a regular pattern, if possible. Reverse Polish Notation CFG: E E E + E E * 0 1 YOUR REGEX: Justification: Due to a mistake on our part, we made this problem harder than originally intended. As a result, all of you got full credit. Some of you received an extra credit for making good observations about the wrong grammar. As some of you noticed, the grammar in the problem was not a correct RPM grammar. The correct RPN grammar is given above. The (correct) RPN grammar cannot be expressed as a regular language. Informally, it is because the grammar generates strings of the form 00+, 000++, 00+0+, etc, where the + s, viewed as binary operators, must always have zeros on the left; these zeros could be results of the evaluations. So there is a counting argument: one needs to keep track of the generated zeros in order to generate the right number of + s. Similar argument applied to the more complex grammar in the problem.

P a g e 6 Part 3. [9 points] Convert the following automaton into a regular expression. Show each step: first eliminate node 2, then node 3.

P a g e 7

P a g e 8 Problem 4: Grammars and Syntax Directed Translation [25 points] In this question we will design and implement (a tiny subset of) a language that will simplify development of HTML documents. Wikis already come with such a formatting language but we want something closer to a professional language like LaTeX. We focus on a single aspect of the formatting language: adding emphasis to text by using an italics font. The tricky part is nested emphasis: we want to emphasize text that is already within emphasized text. In the previous sentence, the text already within is an emphasis nested within a bigger enclosing emphasis. In HTML, the example sentence would need to be written as follows. The only tricky part is <em>nested</em> emphasis: <em>we want to emphasize text that is</em> already within <em>an emphasized text</em>. Note that HTML does not support nested emphasis. To support it, we had to turn off italics before the nested emphasis (before already within ). In our language we want to make things clean and readable; we ll indicate beginning and end of emphasis fragments: The only tricky part is \emph{nested} emphasis: \emph{we want to emphasize text that is \emph{already within} an emphasized text}. Part 1 [7 points]: Write regular expressions for the lexemes in the language. You want to tokenize the input as indicated below with. The only tricky part is \emph{ nested } emphasis: \emph{ we want to emphasize text that is \emph{ already within } an emphasized text }. Backslashes may appear in the text lexeme when followed by by } or \ characters. Backslashes may also be followed by emph{ ; this forms the open token. Anything else is a lexical error. Syntax of regular expressions: you can use Python, including raw strings, but you can also use JavaScript regexes. Please underline your choice. Lexeme regular expression Token \emph{ /\\emph{/ open } /}/ close Text /(\\[}\\] [^\\}] \n)*/ Text

P a g e 9 Part 2 [8 points]: Write a context-free grammar for parsing text with \emph directives. Make the grammar free of left recursion, as we ll use it to build a recursive descent parser. Hint: it is possible to write the grammar with three productions. S text S open S close S A common mistake was that the grammar did not allow strings of the form open text close... open text close Part 3 [10 points]: Write a recursive descent parser that translates text with \emph directives into HTML. When nested emphasis is deeper than two levels, emphasis fragments must alternate between italicized and non-italicized fonts. You can use any programming language. If you use Python, for convenience, you can assume that it supports the?: conditional operator and that assignments are expressions. Assume the availability of functions bool token(regex), int checkpoint(), and restore(int). Answer: Here is one of many possible solutions. The tokens in italics are regular expressions defined in Part 1. def S(em): var s if (s=token(text)): print s S(em) elif token(open): print em? <em> : </em> S(not em) if token(close): print em? </em> : <em> S(em) else: report error and exit elif token (EOF): return # successfully parsed else: report error and exit Start the translator by invoking S(true).