CS164: Midterm I. Fall 2003

Similar documents
Midterm I (Solutions) CS164, Spring 2002

CS164: Final Exam. Fall 2003

First Midterm Exam CS164, Fall 2007 Oct 2, 2007

CS164: Programming Assignment 2 Dlex Lexer Generator and Decaf Lexer

CS143 Midterm Fall 2008

Midterm I - Solution CS164, Spring 2014

CS 164 Programming Languages and Compilers Handout 9. Midterm I Solution

UVa ID: NAME (print): CS 4501 LDI Midterm 1

UNIVERSITY OF CALIFORNIA

CS164: Midterm Exam 2

CS143 Midterm Spring 2014

Introduction to Lexing and Parsing

CS143 Midterm Sample Solution Fall 2010

CSE 401 Midterm Exam Sample Solution 2/11/15

CS 164 Handout 11. Midterm Examination. There are seven questions on the exam, each worth between 10 and 20 points.

CS164 Second Midterm Exam Fall 2014

Lexical Analysis. Lecture 3. January 10, 2018

CS Lecture 2. The Front End. Lecture 2 Lexical Analysis

EXAM. CS331 Compiler Design Spring Please read all instructions, including these, carefully

Question Marks 1 /20 2 /16 3 /7 4 /10 5 /16 6 /7 7 /24 Total /100

Lexical Analysis. Lexical analysis is the first phase of compilation: The file is converted from ASCII to tokens. It must be fast!

Lexical Analysis. COMP 524, Spring 2014 Bryan Ward

CS 536 Midterm Exam Spring 2013

1 Parsing (25 pts, 5 each)

ECE468 Fall 2003, Final Exam

CMSC 330: Organization of Programming Languages

Compiler phases. Non-tokens

1. Consider the following program in a PCAT-like language.

Announcements! P1 part 1 due next Tuesday P1 part 2 due next Friday

CMSC 330: Organization of Programming Languages. Architecture of Compilers, Interpreters

Compiler Construction LECTURE # 3

Syntactic Analysis. CS345H: Programming Languages. Lecture 3: Lexical Analysis. Outline. Lexical Analysis. What is a Token? Tokens

Lexical Analysis. Finite Automata

A simple syntax-directed

CSE 401 Midterm Exam 11/5/10

2.2 Syntax Definition

CS 415 Midterm Exam Spring SOLUTION

CMSC 330: Organization of Programming Languages

CMSC 330: Organization of Programming Languages

CSE 401/M501 18au Midterm Exam 11/2/18. Name ID #

Theory of Computation Dr. Weiss Extra Practice Exam Solutions

Architecture of Compilers, Interpreters. CMSC 330: Organization of Programming Languages. Front End Scanner and Parser. Implementing the Front End

Semantic Analysis. Lecture 9. February 7, 2018

Midterm II CS164, Spring 2006

Structure of a compiler. More detailed overview of compiler front end. Today we ll take a quick look at typical parts of a compiler.

Formal Languages and Compilers Lecture VI: Lexical Analysis

CMSC 330, Fall 2018 Midterm 2

CSE 12 Abstract Syntax Trees

Scanners. Xiaokang Qiu Purdue University. August 24, ECE 468 Adapted from Kulkarni 2012

Lexical Analysis. Chapter 2

Grammars & Parsing. Lecture 12 CS 2112 Fall 2018

Question Marks 1 /16 2 /13 3 /12 4 /15 5 /8 6 /15 7 /8 8 /5 9 /8 Total /100

Final Exam 2, CS154. April 25, 2010

Introduction to Parsing. Lecture 5

COL728 Minor1 Exam Compiler Design Sem II, Answer all 5 questions Max. Marks: 20

CMSC 330: Organization of Programming Languages. Context Free Grammars

CS/ECE 374 Fall Homework 1. Due Tuesday, September 6, 2016 at 8pm

CSCE 531 Spring 2009 Final Exam

CMSC 330: Organization of Programming Languages. Context Free Grammars

Lexical Analysis. Lecture 2-4

Monday, August 26, 13. Scanners

Implementation of Lexical Analysis. Lecture 4

Lexical Analysis. Finite Automata

CS131 Compilers: Programming Assignment 2 Due Tuesday, April 4, 2017 at 11:59pm

Wednesday, September 3, 14. Scanners

CS143 Final Fall 2009

ECE251 Midterm practice questions, Fall 2010

Final Exam CS164, Fall 2007 Dec 18, 2007

Week 2: Syntax Specification, Grammars

Lexical Analysis. Lecture 3-4

CS 164 Handout 16. Final Examination. There are nine questions on the exam, some in multiple parts. You have 3 hours to work on the

Final Exam 1, CS154. April 21, 2010

CSE 582 Autumn 2002 Exam 11/26/02

UNIVERSITY OF CALIFORNIA Department of Electrical Engineering and Computer Sciences Computer Science Division. P. N. Hilfinger

Midterm Exam. CS169 Fall November 17, 2009 LOGIN: NAME: SID: Problem Max Points Points TOTAL 100

Introduction to Lexical Analysis

Introduction to Parsing. Lecture 8

Part 5 Program Analysis Principles and Techniques

University of Nevada, Las Vegas Computer Science 456/656 Fall 2016

Semantics via Syntax. f (4) = if define f (x) =2 x + 55.

CSE 582 Autumn 2002 Exam Sample Solution

About the Tutorial. Audience. Prerequisites. Copyright & Disclaimer. Compiler Design

CSCI312 Principles of Programming Languages!

Implementation of Lexical Analysis

Definition: A context-free grammar (CFG) is a 4- tuple. variables = nonterminals, terminals, rules = productions,,

Lexical Analysis. Introduction

Introduction to Lexical Analysis

Project 1: Scheme Pretty-Printer

Introduction to Parsing Ambiguity and Syntax Errors

UNIVERSITY OF CALIFORNIA Department of Electrical Engineering and Computer Sciences Computer Science Division. P. N. Hilfinger

Lexical Analysis. Dragon Book Chapter 3 Formal Languages Regular Expressions Finite Automata Theory Lexical Analysis using Automata

MIDTERM EXAMINATION - CS130 - Spring 2003

1. [5 points each] True or False. If the question is currently open, write O or Open.

CS Introduction to Data Structures How to Parse Arithmetic Expressions

CMSC330 Spring 2018 Midterm 2 9:30am/ 11:00am/ 3:30pm

Introduction to Parsing Ambiguity and Syntax Errors

Course Project 2 Regular Expressions

Extra Credit Question

CS 415 Midterm Exam Fall 2003

UNIT -2 LEXICAL ANALYSIS

Transcription:

CS164: Midterm I Fall 2003 Please read all instructions (including these) carefully. Write your name, login, and circle the time of your section. Read each question carefully and think about what s being asked. Ask the proctor if you don t understand anything about any of the questions. If you make any non-trivial assumptions, state them clearly. Some questions span multiple pages. Be sure to answer every part of each question. You will have 1 hour and 20 minutes to work on the exam. The exam is closed book, but you may refer to your two pages of handwritten notes. Solutions will be graded on correctness and clarity, so make sure your answers are neat and coherent. Remember that if we can t read it, it s wrong! Each question has a relatively simple and straightforward solution. We might deduct points if your solution is far more complicated than necessary. Partial answers will be graded for partial credit. Write all of your answers in the space provided on the exam, and clearly mark your answers. Use the backs of the exam pages for scratch work. Do not use any extra scratch paper. Do not unstaple the exam. Turn off your cell phone. Good luck! NAME and LOGIN: Your SSN or Student ID: Circle the time of your section: 9:00 10:00 11:00 2:00 3:00 Q1 (30 points) Q2 (30 points) Q3 (30 points) Q4 (10 points) Total (100 points) 1

1 Regular Expressions/Finite Automata Part (a) Consider a language over the alphabet {0, 1} consisting of the strings that meet the following conditions: The length of the strings is 6. There must be at least one 1 in an odd position (1, 3, or 5) that is immediately followed by a 1 in an even position (2, 4, or 6). For example, 110000, 000011, and 101111 are all in the language; 00000, 001010, and 100110 are not. Write a regular expression that defines this language: Part (b) Consider a language over the alphabet {0, 1} consisting of the strings that meet the following conditions: The length of the strings is 6. The last two characters must both be zero. For example, 110000, 001100, and 001100 are all in the language; 000011, 001010, and 111001 are not. Write a regular expression that defines this language: 2

Part (c) Draw two deterministic finite-state machines, one for the language defined in Part (a), and one for the language defined in Part (b). When drawing the state machines, use the appropriate notation for nodes and edges. Organize your node layouts neatly; the layout should reflect the logic of your solution. Part (d) Draw a non-deterministic finite-state machine for the language that is the union of the languages defined in Parts (a) and (b). (A string belongs to a language that is the union of languages A and B if the string belongs to A or B or both A and B.) 3

Part (e). Draw a deterministic finite-state machine for the language that is the union of the languages defined in Parts (a) and (b). Also, write a string that is not accepted by your finite-state machine. 4

2 First/Follow/LL(1) This question concerns the context-free grammar given below (where capital letters denote nonterminals, small letters denote terminals). S AB BAB A SA yx B x ɛ Part (a) Fill in the following table with First and Follow sets for the grammar. X FIRST(X) F OLLOW (X) S A B A B B A B y x x ɛ Part (b) In the LL(1) parsing table below, fill in the column headings, and the row corresponding to the non-terminal A: A Part (c) Is the grammar LL(1)? Justify your answer. 5

3 A Simple Real-World Problem Background: Recall that Java provides the ++ operator, which can be used either as a pre-increment or as a post-increment operator. That is, ++ can be used to increment the value of a variable either before or after the value of the variable is used. For example, if the value of x is initially 1, then the expression x++ evaluates to 1 and the expression ++x evaluates to 2. Also recall that x++ is a legal Java expression, but 3++ is not. That is, you can increment a memory location (e.g., a variable) but you cannot increment a value. Note that the error in 3++ is detected in the semantic phase of the compiler. Part (a) Draw an AST for the expression below. Label each AST node clearly with the meaning of the node (for example, addition, identifier, etc). Invent new types of AST nodes as necessary. x++ + ++x Part (b) It is interesting to observe that while x++ + ++x is a legal Java expression, the same expression without white spaces, namely x+++++x is not a legal Java expression. That is, the latter expression will cause a compile-time error. This question asks you to identify the phase of the compiler in which the error occurred. Depending on the compiler, the error can be flagged in different stages, and so there is more than one correct answer. (Continued on the next page.) 6

(i) Was the error found in the lexer? If yes, clearly explain the cause of the lexical error, and move on to Part (c) on the next page. If not, write the output of the lexer for this input expression and continue to (ii). (Assume that the lexer uses three kinds of tokens: the + operator, the ++ operator, and an identifier.) (ii) Was the error found in the parser? If yes, clearly explain the cause of the syntactic error, and move on to Part (c) on the next page. If not, write the output of the parser (i.e., an AST) for this input expression and continue to (iii). (iii) Was the error found in the semantic checker? If yes, clearly explain the cause of the semantic error. (Clearly, since you got all the way to this part, the answer must be yes.) To show the error, all you need to do is evaluate the AST from (ii) under the assumption that the initial value of x is 1, and show where and why the evaluation would fail. Evaluate the AST just like you did in your PA1 interpreter. 7

Part (c) Let s assume that when the programmer wrote the expression x+++++x, he really meant to compute the expression x++ + ++x. Unfortunately, the first Java expression is not equivalent to the second expression with the whitespaces (as discussed above, the first expression is not even a legal Java expression). Clearly, in Java, white spaces are sometimes very important. One solution how to make the compiler recognize x+++++x as equivalent to x++ + ++x without having to insert any whitespaces into the expression is to move part of the functionality of the lexer to the parser. Specifically, let us propose that the ++ operator won t be recognized in the lexer; instead, the lexer would simply turn each + character into a plus token. So, the lexer recognizes only two kinds of tokens: id and plus. As usual, the lexer discards any white space from the input. It would then be up to the parser to group plus tokens into ++ operators. Below is a grammar for such a parser. This grammar describes a tiny subset of the language of arithmetic expressions (this language includes additions, pre-increments and post-increments): E T plus E T T id id plus plus plus plus id (i) Show the parse tree for the input string x+++++x. (ii) Let s now show that the proposed solution unfortunately doesn t work because the grammar is ambiguous. Use the input x+++x to show that the grammar is ambiguous. Why does ambiguity in a grammar for a programming language pose a problem? 8

4 Lexer Generator The code below is a fragment of a solution for the second programming assignment (the lexer generator). The code shows a fragment of a visitor object that traverses a Java AST that represents a regular expression and creates an NFA that represents the regular expression. Specifically, the code shows a method that implements the regular expression + operator (i.e., the one-or-more repetition) as well as the * operator (i.e., the zero-or-more repetition). The code for the second operator is not shown. Your task is to fill in this missing code. public boolean visit(postfixexpression s) { // the traversal of the child s fills in the values of start, end // which are instance fields of the visitor object. s.getoperand().accept(this); NFAState childstart = start, childend = end; // Java ++ operator corresponds to regexp + operator if (s.getoperator() == PostfixExpression.Operator.INCREMENT) { start = new NFAState(); end = new NFAState(); start.addtransition(nfastate.epsilon, childstart); childend.addtransition(nfastate.epsilon, end); childend.addtransition(nfastate.epsilon, childstart); } // Java -- operator corresponds to regexp * operator else if (s.getoperator() == PostfixExpression.Operator.DECREMENT) { // ADD YOUR CODE HERE } } else assert false; return false; 9