CSE 401 Midterm Exam Sample Solution 2/11/15

Question 1. (10 points) Regular expression warmup. For regular expression questions, you must restrict yourself to the basic regular expression operations covered in class and on homework assignments: rs, r|s, r*, r+, r?, character classes like [a-cxy] and [^aeiou], abbreviations name=regexp, and parenthesized regular expressions. No additional operations that might be found in the regexp packages in various Unix programs, scanner generators like JFlex, or language libraries.

(a) (5 points) Write a regular expression that generates the set of strings recognized by this finite automaton:

[finite automaton diagram not reproduced; its edges are labeled a, a, b, c, c]

Solution: (abc+)+

(b) (5 points) Write a regular expression that generates all non-empty strings of a's, b's, and c's where the first a precedes the first b if both a's and b's are present in a string.

One possible solution: c*a(a|b|c)* | (b|c)+
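As a quick sanity check (not part of the exam), the two sample answers can be translated directly into java.util.regex syntax and tried on a few strings; the class name and the test strings below are made up for this illustration.

import java.util.regex.Pattern;

public class RegexCheck {
    public static void main(String[] args) {
        // part (a): strings accepted by the finite automaton
        Pattern partA = Pattern.compile("(abc+)+");
        for (String s : new String[] {"abc", "abccabc", "ab", "abcab"})
            System.out.println("(a) \"" + s + "\" -> " + partA.matcher(s).matches());

        // part (b): either an a appears (preceded only by c's), or there are no a's at all
        Pattern partB = Pattern.compile("c*a(a|b|c)*|(b|c)+");
        for (String s : new String[] {"cab", "bbcc", "bca", ""})
            System.out.println("(b) \"" + s + "\" -> " + partB.matcher(s).matches());
    }
}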

Question 2. (12 points) Regular expressions. Identifiers in the Ruby programming language are similar to those in many other languages, but a little different. An identifier:

- Contains any combination of letters, digits, and underscores (for this problem, assume that letters are only the ASCII characters a-z and A-Z, and digits are 0-9).
- May not begin with a digit.
- May optionally have a single $, a single @, or two @@ characters preceding the main part of the identifier.
- May optionally have a single !, a single ?, or a single = character at the end.

Examples of valid identifiers: abc123, @_Ab!, _ (a single underscore), @@GLOB, x2=.

(a) (6 points) Give a regular expression that generates the set of valid Ruby identifiers.

letter = [a-zA-Z]
digit = [0-9]

($ | @@?)? (letter | _) (letter | digit | _)* (! | ? | =)?

(b) (6 points) Draw a DFA that accepts the set of valid Ruby identifiers. You only need to draw a DFA that is correct; you do not have to formally derive it from your answer to part (a) (although you're free to use a formal derivation if you'd like).

[DFA diagram not reproduced; its transitions are labeled $, @, @, a-zA-Z_, a-zA-Z0-9_, and !?=]
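To make the DFA in part (b) concrete, here is a hand-coded version of it as a minimal sketch. The state names, helper methods, and test strings are invented for this illustration and are not taken from the exam diagram or any course code.

public class RubyIdentifierDFA {
    enum State { START, AT, PREFIX, BODY, DONE, REJECT }

    static boolean accepts(String s) {
        State state = State.START;
        for (char c : s.toCharArray()) {
            switch (state) {
                case START:                       // optional $, @, or @@ prefix
                    if (c == '$') state = State.PREFIX;
                    else if (c == '@') state = State.AT;
                    else if (isLetter(c) || c == '_') state = State.BODY;
                    else state = State.REJECT;
                    break;
                case AT:                          // saw one @; a second @ is allowed
                    if (c == '@') state = State.PREFIX;
                    else if (isLetter(c) || c == '_') state = State.BODY;
                    else state = State.REJECT;
                    break;
                case PREFIX:                      // prefix done; need a letter or _
                    state = (isLetter(c) || c == '_') ? State.BODY : State.REJECT;
                    break;
                case BODY:                        // main part: letters, digits, _
                    if (isLetter(c) || isDigit(c) || c == '_') state = State.BODY;
                    else if (c == '!' || c == '?' || c == '=') state = State.DONE;
                    else state = State.REJECT;
                    break;
                default:                          // DONE or REJECT: no further characters allowed
                    state = State.REJECT;
            }
        }
        return state == State.BODY || state == State.DONE;   // the two accepting states
    }

    static boolean isLetter(char c) { return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'); }
    static boolean isDigit(char c)  { return c >= '0' && c <= '9'; }

    public static void main(String[] args) {
        for (String s : new String[] {"abc123", "@_Ab!", "_", "@@GLOB", "x2=", "2x", "@@@x", "x!?"})
            System.out.println(s + " -> " + accepts(s));
    }
}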

Question 3. (10 points) Scanners and tokens. Here is a small fragment found in a file:

a[k]++=1,234.5 #Hashtag //comment
ret/*we're done*/urn 17===42;

Below, list in order the tokens that would be returned by a scanner for the full Java programming language (not just the MiniJava subset) as it reads this input. If there is a lexical error in the input, indicate where the error is in the token stream and what is wrong, and continue to list the tokens corresponding to the remaining input, as would be done by a normal scanner. Remember that a scanner just detects lexical errors, and does not diagnose parsing or semantic errors detected by later parts of the compiler. You can use any reasonable token names as long as your meaning is clear. The first three tokens are written for you:

ID(a) LBRACKET ID(k) RBRACKET PLUSPLUS ASSIGN NUMBER(1) COMMA NUMBER(234.5)
illegal character #
ID(Hashtag) ID(ret) ID(urn) NUMBER(17) EQUALEQUAL ASSIGN NUMBER(42) SEMICOLON

The most common errors on this question came from trouble with the principle of longest match, or from missing that the question was about tokens for full Java and not MiniJava. Many answers, for instance, listed two PLUS tokens for ++, but it is a single operator in Java.
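The longest-match (maximal munch) rule that tripped people up can be seen in a tiny hand-written scanning loop. This fragment is an illustrative sketch only, not the course scanner; it handles just the operators +, ++, =, and == to show why ++ is one token and === scans as == followed by =.

public class MunchDemo {
    public static void main(String[] args) {
        String src = "++ === =";
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (c == ' ') { i++; continue; }
            // try the longest operator first, then fall back to shorter ones
            if (c == '+' && i + 1 < src.length() && src.charAt(i + 1) == '+') {
                System.out.println("PLUSPLUS");   i += 2;
            } else if (c == '+') {
                System.out.println("PLUS");       i += 1;
            } else if (c == '=' && i + 1 < src.length() && src.charAt(i + 1) == '=') {
                System.out.println("EQUALEQUAL"); i += 2;
            } else if (c == '=') {
                System.out.println("ASSIGN");     i += 1;
            } else {
                System.out.println("illegal character " + c); i += 1;
            }
        }
    }
}

Running it prints PLUSPLUS, EQUALEQUAL, ASSIGN, ASSIGN: each step grabs the longest operator that matches at the current position.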

Question 4. (34 points) The oh noz, not again, parsing question. Here is a tiny grammar.

0. S' ::= S $    ($ represents end-of-file)
1. S ::= a B c
2. B ::= b
3. B ::= ε

(a) (12 points) Draw the LR(0) state machine for this grammar.

[The state machine is a diagram in the original; its states and transitions are:]

State 1 (start):  S' ::= . S $     S ::= . a B c        -- on S go to state 0; on a go to state 2
State 0:          S' ::= S . $
State 2:          S ::= a . B c    B ::= . b    B ::= .  -- on b go to state 3; on B go to state 4
State 3:          B ::= b .
State 4:          S ::= a B . c                          -- on c go to state 5
State 5:          S ::= a B c .

The most common error here was treating ε as an input character and having a shift transition on ε from state 2 to a new state. But ε is not an actual input character to be matched; it indicates that B can derive the empty string.

(b) (6 points) Compute nullable and the FIRST and FOLLOW sets for the nonterminals S and B in the above grammar:

Symbol   nullable   FIRST   FOLLOW
S        false      a       $
B        true       b       c
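Part (b) can be cross-checked mechanically with the usual fixed-point computation of nullable, FIRST, and FOLLOW. The sketch below is a minimal, self-contained version for this exact grammar; the class name and grammar encoding are invented for illustration and are not from the MiniJava project code.

import java.util.*;

public class FirstFollow {
    // Productions: S' ::= S $,  S ::= a B c,  B ::= b,  B ::= (empty).
    // Each row is {lhs, rhs symbols...}; an empty rhs encodes the epsilon production.
    static String[][] prods = {
        {"S'", "S", "$"},
        {"S", "a", "B", "c"},
        {"B", "b"},
        {"B"}
    };
    static Set<String> nonterms = new HashSet<>(Arrays.asList("S'", "S", "B"));

    public static void main(String[] args) {
        Set<String> nullable = new HashSet<>();
        Map<String, Set<String>> first = new HashMap<>(), follow = new HashMap<>();
        for (String nt : nonterms) { first.put(nt, new HashSet<>()); follow.put(nt, new HashSet<>()); }

        boolean changed = true;
        while (changed) {                               // iterate until nothing changes
            changed = false;
            for (String[] p : prods) {
                String lhs = p[0];

                // nullable(lhs) if every rhs symbol is a nullable nonterminal
                boolean allNullable = true;
                for (int i = 1; i < p.length; i++)
                    if (!nullable.contains(p[i])) { allNullable = false; break; }
                if (allNullable && nullable.add(lhs)) changed = true;

                // FIRST(lhs): scan the rhs left to right until a non-nullable symbol
                for (int i = 1; i < p.length; i++) {
                    String x = p[i];
                    if (nonterms.contains(x)) {
                        if (first.get(lhs).addAll(first.get(x))) changed = true;
                        if (!nullable.contains(x)) break;
                    } else {
                        if (first.get(lhs).add(x)) changed = true;
                        break;
                    }
                }

                // FOLLOW: whatever can appear after each nonterminal occurrence in the rhs
                for (int i = 1; i < p.length; i++) {
                    if (!nonterms.contains(p[i])) continue;
                    boolean restNullable = true;
                    for (int j = i + 1; j < p.length; j++) {
                        String y = p[j];
                        Set<String> add = nonterms.contains(y) ? first.get(y) : Set.of(y);
                        if (follow.get(p[i]).addAll(add)) changed = true;
                        if (!(nonterms.contains(y) && nullable.contains(y))) {
                            restNullable = false; break;
                        }
                    }
                    if (restNullable && follow.get(p[i]).addAll(follow.get(lhs))) changed = true;
                }
            }
        }
        // Expected results, matching the table above:
        // nullable = {B}; FIRST(S) = {a}, FIRST(B) = {b}; FOLLOW(S) = {$}, FOLLOW(B) = {c}
        System.out.println("nullable = " + nullable);
        System.out.println("FIRST    = " + first);
        System.out.println("FOLLOW   = " + follow);
    }
}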

Question 4. (cont.) Grammar repeated for reference.

0. S' ::= S $
1. S ::= a B c
2. B ::= b
3. B ::= ε

(c) (8 points) Write the LR(0) parse table for this grammar based on your LR(0) state machine in your answer to part (a).

State   a      b        c      $      S     B
0                              acc
1       s2                            g0
2       r3     r3,s3    r3     r3           g4
3       r2     r2       r2     r2
4                       s5
5       r1     r1       r1     r1

(d) (2 points) Is this grammar LR(0)? Why or why not?

No. Shift-reduce conflict in state 2 on input b.

(e) (2 points) Is this grammar SLR? Why or why not?

Yes. Input b is not in FOLLOW(B), so state 2 should do s3 on input b, and that eliminates the shift-reduce conflict.

(f) (2 points) Is this grammar LL(1)? Why or why not?

Yes. The only possible problem is choosing between the two B productions, and their predict sets do not intersect: B ::= b is chosen when the lookahead is b (FIRST(b) = {b}), and B ::= ε is chosen only on symbols in FOLLOW(B) = {c}.

(g) (2 points) Everyone got 2 free points here. We realized as we were grading that parts (a)-(f) only added up to 32 and not the 34 points listed at the beginning of the question, so we added a 2-point part (g) and gave everyone full credit for it.
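To see how the table is used, here is a small table-driven driver tracing the parse of abc$. This is a sketch only: it assumes the SLR-corrected actions from part (e) (reduces restricted to FOLLOW sets), uses the state numbering from the table above, and its class name and table encoding are made up for this illustration. No error handling is included.

import java.util.*;

public class TinyLRTrace {
    // action[state] maps a lookahead symbol to "sN" (shift), "rN" (reduce), or "acc".
    static Map<Integer, Map<Character, String>> action = Map.of(
        0, Map.of('$', "acc"),
        1, Map.of('a', "s2"),
        2, Map.of('b', "s3", 'c', "r3"),   // SLR: reduce B ::= ε only on FOLLOW(B) = {c}
        3, Map.of('c', "r2"),              // SLR: reduce B ::= b only on FOLLOW(B) = {c}
        4, Map.of('c', "s5"),
        5, Map.of('$', "r1"));             // SLR: reduce S ::= a B c only on FOLLOW(S) = {$}
    // goto table: after a reduce, where to go from the exposed state on the lhs nonterminal
    static Map<Integer, Map<Character, Integer>> gotoTab = Map.of(
        1, Map.of('S', 0),
        2, Map.of('B', 4));
    // productions 1-3: lhs nonterminal and length of the right-hand side
    static char[] lhs = {'?', 'S', 'B', 'B'};
    static int[] rhsLen = {2, 3, 1, 0};

    public static void main(String[] args) {
        String input = "abc$";
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(1);                              // start state
        int pos = 0;
        while (true) {
            int state = stack.peek();
            char look = input.charAt(pos);
            String act = action.get(state).get(look);
            System.out.println("state " + state + ", lookahead " + look + " -> " + act);
            if (act.equals("acc")) break;
            int n = Integer.parseInt(act.substring(1));
            if (act.charAt(0) == 's') {             // shift: push the next state, consume input
                stack.push(n);
                pos++;
            } else {                                // reduce: pop |rhs| states, then take the goto
                for (int i = 0; i < rhsLen[n]; i++) stack.pop();
                stack.push(gotoTab.get(stack.peek()).get(lhs[n]));
            }
        }
    }
}

The trace visits states 1, 2, 3 (shift a, b), reduces B ::= b back into state 2 and goes to 4, shifts c into 5, reduces S ::= a B c back into state 1 and goes to 0, and accepts on $.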

Question 5. (12 points) Concrete syntax. Full Java, as well as C and other languages, includes a three-operand conditional expression operator ?:. An example of its use is in this assignment statement:

max = x > y ? x : y;

which stores the larger value of x or y in variable max. More generally, e1 ? e2 : e3 evaluates e1; then if e1 is true, it evaluates e2 and that is the value of the entire conditional expression. If e1 is false, then e3 is evaluated and it becomes the value of the conditional expression.

Suppose we have a language with the usual arithmetic expression grammar:

exp ::= exp + term | exp - term | term
term ::= term * factor | term / factor | factor
factor ::= int | id | ( exp )

Now suppose we add the three-operand conditional expressions to this language by adding the following production to the ones above:

exp ::= exp ? exp : exp

Show that the resulting grammar is ambiguous.

Here are two leftmost derivations of x ? y : z + 1, omitting most of the obvious steps.

exp => exp ? exp : exp =>* x ? y : exp => x ? y : exp + term =>* x ? y : z + 1

exp => exp + term =>* x ? y : z + term =>* x ? y : z + 1

There are other solutions that show the grammar is ambiguous by generating other strings using two different leftmost or rightmost derivations. Another solution would be to show two different parse trees that generate the same string.
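As a side note (not part of the exam), the two parses really do mean different things. Java's own grammar groups the string as x ? y : (z + 1); the little program below, written just for illustration, shows the two groupings producing different values, which is why the ambiguity has to be resolved.

public class CondDemo {
    public static void main(String[] args) {
        boolean x = true;
        int y = 10, z = 20;
        System.out.println(x ? y : z + 1);    // Java groups this as x ? y : (z + 1); prints 10
        System.out.println((x ? y : z) + 1);  // the other possible grouping; prints 11
    }
}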

Question 6. (14 points) Abstract syntax and semantics. Let's assume that we've solved the ambiguity issues and have added conditional expressions from the previous problem to MiniJava. Suppose we encounter this statement in a program:

max = b < a ? a : b ;

(a) (6 points) Draw a tree below showing the abstract syntax for this statement. Don't worry about whether you match the AST classes in the MiniJava project code exactly (you're not expected to memorize that sort of thing). Just show nodes and edges of an AST that is appropriate for this expression.

(b) (8 points) After you've drawn the AST for this statement, annotate it by writing next to the appropriate nodes the checks or tests that need to be done in the static semantics / type-checking phase of the compiler to ensure that this statement does not contain any compile-time errors. You only need to indicate the necessary checks; you do not need to specify an attribute grammar, for example.

[The solution is a drawn, annotated AST; its structure (children indented under their parent) and annotations are:]

assign
  id:max
  ?:
    <
      id:b
      id:a
    id:a
    id:b

Checks written next to the corresponding nodes:
- assign: the expression's type is assignment compatible with the variable's type
- ?: : the first expression is boolean; the 2nd and 3rd expressions have the same type, which becomes the type of the ?: expression
- <: the operands can be compared with <
- all identifiers: check whether each is visible in the current scope
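The checks on the ?: node could be written as code roughly like the sketch below. The class names (Exp, LeafExp, ConditionalExp, SymbolTable, TypeCheckDemo) and the string-based type representation are invented for this illustration and do not match the actual MiniJava AST or semantic-analysis classes.

abstract class Exp {
    abstract String typeCheck(SymbolTable symtab);   // returns the expression's type
}

// A leaf node with a fixed type, standing in for identifiers, comparisons, etc.
class LeafExp extends Exp {
    final String type;
    LeafExp(String type) { this.type = type; }
    String typeCheck(SymbolTable symtab) { return type; }
}

class ConditionalExp extends Exp {
    final Exp cond, thenExp, elseExp;
    ConditionalExp(Exp c, Exp t, Exp e) { cond = c; thenExp = t; elseExp = e; }

    String typeCheck(SymbolTable symtab) {
        // check: the first expression must be boolean
        String condType = cond.typeCheck(symtab);
        if (!condType.equals("boolean"))
            throw new RuntimeException("condition of ?: must be boolean, found " + condType);
        // check: the 2nd and 3rd expressions have the same type,
        // which becomes the type of the whole ?: expression
        String thenType = thenExp.typeCheck(symtab);
        String elseType = elseExp.typeCheck(symtab);
        if (!thenType.equals(elseType))
            throw new RuntimeException("branches of ?: differ: " + thenType + " vs " + elseType);
        return thenType;
    }
}

class SymbolTable { }   // placeholder for whatever symbol table the compiler actually uses

public class TypeCheckDemo {
    public static void main(String[] args) {
        // models  b < a ? a : b  where a and b are ints and the comparison is boolean
        Exp e = new ConditionalExp(new LeafExp("boolean"), new LeafExp("int"), new LeafExp("int"));
        System.out.println("type of ?: expression = " + e.typeCheck(new SymbolTable()));  // int
    }
}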

Question 7. (8 points) The Ghost of LL Parsing Past. Consider the following grammar:

1. S ::= a a
2. S ::= a b

(a) (3 points) This grammar is not LL(1). Why not? (Give a concise technical reason.)

FIRST(a a) and FIRST(a b) both contain a. So the intersection of the FIRST sets for the two right-hand sides of the productions for S is not empty.

(b) (3 points) Write a grammar that generates the same language that is LL(1).

1. S ::= a S'
2. S' ::= a
3. S' ::= b

(c) (2 points) Would it be possible to write a hand-coded recursive descent parser to recognize programs generated by the original grammar without rewriting it? Why or why not?

Yes. In a hand-written recursive-descent parser we can look ahead a few extra symbols when we need to, or we can sometimes defer deciding which production is being processed until later (the dangling else problem in an if-else statement is often handled this way).
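To make part (c) concrete, here is a tiny hand-coded recursive-descent recognizer for the original grammar that simply peeks one extra symbol before committing to a production. The class name, method names, and end-of-input convention are made up for this sketch.

public class TwoTokenLookahead {
    private final String input;
    private int pos = 0;

    TwoTokenLookahead(String input) { this.input = input; }

    private char peek(int k) {                    // symbol k positions ahead of the current one
        int i = pos + k;
        return i < input.length() ? input.charAt(i) : '\0';   // '\0' plays the role of end-of-input
    }
    private void expect(char c) {
        if (peek(0) != c) throw new RuntimeException("expected " + c + " at position " + pos);
        pos++;
    }

    // S ::= a a | a b -- both alternatives start with a, so peek at the
    // second symbol to decide which production to use.
    void parseS() {
        if (peek(1) == 'a') { expect('a'); expect('a'); }      // S ::= a a
        else                { expect('a'); expect('b'); }      // S ::= a b
    }

    boolean recognizes() {
        try { parseS(); return pos == input.length(); }
        catch (RuntimeException e) { return false; }
    }

    public static void main(String[] args) {
        for (String s : new String[] {"aa", "ab", "ba", "a", "aab"})
            System.out.println(s + " -> " + new TwoTokenLookahead(s).recognizes());
    }
}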