Introduction; Parsing LL Grammars
- Archibald Blair
CS 440: Programming Languages and Translators
Due Fri Feb 2, 11:59 pm
1/29 pp.1, 2; 2/7 all updates incorporated, solved

Instructions

You can work together in groups of 4. Submit your work on Blackboard.* Submit one copy. Include the names and A-IDs of everyone in the group on that copy (in the pdf, for example). Submit under the name of one person in the group (doesn't matter who).

Questions [100 points total]

1. [10 = 5+5 points] For each question below, a paragraph should be enough.
   a. Exercise 1.3 (p.38)
   b. Exercise 1.9 (p.39)

For Questions 2-4, your regular expressions can use some basic egrep notations. (Try man re_format on unix for help.) Some simple examples of what you can use:

   [a-z_]   ("a through z or underscore")
   [0-9ab]  ("any digit or the letters a or b")
   [^xyz]   ("any character except for x, y, or z")
   x?       ("x or nothing")
   x+       ("one or more x's")
   .        (a period or dot means "any one character")
   \.       (backslash dot means literally a dot, as in the float "12\.34")

Don't use back references (such as "\3"); bounds (such as "{7}"); character classes (such as "[:cntrl:]" or "[[:<:]]"); or assertions (such as "\D"). (You won't need literals like \n (except for \.), and if you try things like \x{89abcdef}, we'll hunt you down :-))

2. [15 = 3*5 points] Translate each regular expression below into English. Don't just translate individual subexpressions; try to get at the essence of the expression. (E.g., "[1-9][0-9]" could be "a two-digit number without a leading zero".) [Hint: You can try an expression using egrep -e "expression" text_file, where each line of text_file has a candidate string to try to match. You may want to add "^" and "$" to the expression, in that case; again, see the man page.]
   a. [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]
   b. (19|20)[0-9][0-9]-(0[1-9]|1[012])-(0[1-9]|[12][0-9]|3[01])
   c. (0x)[1-9a-f][0-9a-f]*

* Using group submission is an experiment; let me know how it works.
CS 440: Programming Languages and Translators 1 James Sasaki, 2018
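The hint in Question 2 suggests trying expressions with egrep and anchoring them with "^" and "$". The same experiment can be sketched in Python, under the assumption that Python's re module treats these basic notations the same way egrep does; re.fullmatch plays the role of the two anchors. The alternation bars in 2b are written out as "|", and the names PATTERNS, matches, and CANDIDATES are made up for this illustration.

```python
import re

# The three expressions from Question 2.
PATTERNS = {
    "2a": r"[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]",
    "2b": r"(19|20)[0-9][0-9]-(0[1-9]|1[012])-(0[1-9]|[12][0-9]|3[01])",
    "2c": r"(0x)[1-9a-f][0-9a-f]*",
}


def matches(name, candidate):
    """True if the whole candidate string matches the named pattern."""
    return re.fullmatch(PATTERNS[name], candidate) is not None


# A few hypothetical candidate strings, one per line of an imagined text_file.
CANDIDATES = ["2018-02-02", "1234-56-78", "0x1f", "0x0f"]

for cand in CANDIDATES:
    print(cand, {name: matches(name, cand) for name in PATTERNS})
```

Comparing which candidates each pattern accepts and rejects is a quick way to check an English description before writing it down.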
3. [15 = 3*5 points] Give regular expressions that match each of the following kinds of (possibly empty) strings. There may be more than one answer; we just want one.
   a. Strings that alternate between vowels (a, e, i, o, u) and consonants (not vowels) and can start with either a vowel or a consonant.
   b. Strings of a's and b's where the number of b's is divisible by 2 or 3.
   c. Strings of lowercase letters that don't include abc. (Don't forget to include strings like aaa or xab.)

4. [12 points] Give a regular expression for numbers in the following made-up format: Integers are sequences of digits; leading zeros are allowed. Floats include a dot with digits before and/or after the dot. In addition, you may include a base as a leading b#, o#, d#, or x# (binary, octal, decimal, or hex). You may also have a leading + or - before the base (or the integer, if there's no base). In addition, you may include an exponent after the number, of the form e integer where integer is as described above. If specified, the base for the exponent doesn't have to match the base of the number. A single space can be included between each group of one or more digits, or after the base #, or before the e exponent, but no space is allowed between a leading sign and base or between a base and #. The letters (b, o, etc) can be in upper case. If you like, you can define parts of the expressions as grammar rules (like number → integer | float etc.)

Some random examples of numbers (with spaces as underscores to make them more visible): -b#_1._e-b#10 equals binary -1.0 / 2² = binary eb#10 equals binary = 2 10 cast as a float +3.e+1 equals e1 equals 30 o#072_031 equals 72031₈

But not b#_3 (because of the 3) or 12__34 (two spaces between 12 and 34) or -_56 (space after -).

5. [18 points] Here's a state transition table for an NFA that accepts the 3-character string abc. To (I hope) make things clearer, I've mostly given states names that are regular expressions describing the input that takes us to that state. The cells that are empty actually contain err. (I omitted them to make the non-err parts more visible.)

   State        ε            a            b            c
   Start        (Seen) ε
   (Seen) ε                  (Seen) a
   (Seen) a                               (Seen) ab
   (Seen) ab                                           (Seen) abc
   (Seen) abc   Accept
   Accept                    err          err          err
   err          err          err          err          err

Accept is underlined to indicate that it's (the only) accepting state. Note that once you get to the error state err, you stay there forever.

Now imagine gluing together four NFAs for abc, acc, bbc, and bca, merging their Start, Accept, and err states respectively, and ending up with an NFA with 3 + 4*4 = 19 states. For this problem, convert this NFA to a DFA; the most straightforward way to do this is to use the algorithm in the text. You'll need to use some different terminology to name the states. (Number them? More complicated regular expressions?) You can, but don't have to, give a DFA with the minimum number of states (I believe it's 6 states). Present the DFA using a transition table.

6. [20 = 4*5 points] (Modified Exercise 2.14, p.108) Consider the language consisting of all strings of properly-balanced parentheses and brackets. (I.e., "(", ")", "[", and "]".)
   a. Give an LL(1) grammar for this language. Surround each terminal parenthesis or bracket by double quotes to emphasize that they are terminal symbols.
   b. Give the corresponding LL(1) parsing table.
   c. Show the parse tree for ([]([]))[]. If you like, you may present the tree using an outline form: List the nodes in preorder with the children for each node indented one more level than their parent. E.g., a tree with root X, children Y and Z, with Y having children A and B, and Z having children C and D would be presented as

      X
      . Y
      .. A
      .. B
      . Z
      .. C
      .. D

   d. Give a trace of the parser action as it constructs the parse tree.

7. [10 = 5+5 points] (Modified Problem 2.26) Consider the grammar below. The start symbol is S, the other nonterminals are E, T, TL, F, and FL, and the terminal symbols are v and anything double-quoted.

      S  → E "$$"
      E  → v ":=" E
      E  → T TL
      TL → "+" T TL | ε
      T  → F FL
      FL → "*" F FL | ε
      F  → "(" E ")" | v

   a. For each rule A → α above, give the FIRST(α), FOLLOW(A), EPS(α), and PREDICT(A → α) sets. Omit duplicates (there's no reason to show EPS(ε) more than once, for example).
   b. What tells us that this grammar is not LL(1)?
Solution to Homework 1

1. (Compilation; Correctness)

a. Exercise 1.3, p.38 (Compilation vs interpretation) Some possible answers: Compilation can catch errors earlier; compiled code usually executes faster. An interpreter may take less time to rerun a program that's had a small change made to it (a compiler has to recompile and relink the whole program); an interpreter may produce better error messages; for language development, writing an interpreter can be faster than writing a compiler.

b. Exercise 1.9, p.39 (Program correctness) There are two parts to correctness: the specification and meeting the specification. Specifications can be vague, wrong, or not cover all possible inputs. For correctness, testing only reveals lack of bugs under the tested inputs; untested inputs may still encounter bugs, plus, determining what inputs to test on is hard. For complex software, it's hard to figure out what environments a program might run in (plus test in all of them). Blind spots can include things you know you don't know (like exact user behavior) and things you don't know you don't know (like unexpected user behavior).

2. (Translate reg expressions to English). There can be alternative answers.

2a. [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]
Three (natural) numbers separated by dashes; the first number has 4 digits and the other two have 2 digits.

2b. (19|20)[0-9][0-9]-(0[1-9]|1[012])-(0[1-9]|[12][0-9]|3[01])
Dates of the form year-month-day where the years are 1900-2099, months are 01-12, and days are 01-31.

2c. (0x)[1-9a-f][0-9a-f]*
Hex natural numbers: the tag 0x followed by one or more lower-case hex digits, with no leading zero.

3. (Regular expressions)

3a. Alternating vowels and consonants: [aeiou]?([^aeiou][aeiou])*[^aeiou]?
Note this allows the empty string.

3b. a's and b's where the number of b's is a multiple of 2 or 3: a*(ba*ba*)*a* | a*(ba*ba*ba*)*a*
(You should include the empty string.)

3c. Strings without substring abc: ([^a]|a(a|ba)*([^ab]|b[^ac]))*(a(a|ba)*b?)?
After an a, the loop (a|ba)* absorbs further a's and any ab that is followed by another a, so an a is never consumed as the "not c" character. (A simpler-looking attempt such as ([^a]|a[^b]|ab[^c])*(a|ab)? wrongly accepts aabc, by reading it as aa, b, c.)

4. (Integer and Float Numbers in bases 2, 8, 10, 16). First, let's break down the problem. The basic idea: We need

   sign? base? number exponent?

Including spaces gives us

   sign? base?\_? number_with_spaces exponent_with_spaces?

"\_" means backslash space, which means an actual space. To avoid having a number lead with spaces, I'm putting them between the base and number, hence \_?. This is the first time I've offered this problem, so if my solution has bugs, let me know.
Sign and base are straightforward: sign? is [+-]?, and base? could be ([bBoOdDxX]#\_?)?, except that the legal digits in the following number depend on the base, so we'll need to break the bases up into cases: base? can be ([bB]#\_?)?, ([oO]#\_?)? and so on. The exponent part is also straightforward: \_?[eE]integer, with integer as below.

The number part is the hard one, of course. It's either an integer or a float. If the number is an integer (natural number, technically), then we can use an expression like digit+(\_digit+)*, which is a sequence that alternates between runs of digits and one space, beginning and ending with digit(s). (Remember the Kleene + is "one or more of".) It's \_, not \_+, because we're only allowed one space between runs of digits. There are alternatives like digit ((digit \_)*digit)? that are perfectly fine too.

For a float, the thing to avoid is something like digit*\.digit*, which makes digits optional before or after the dot (which is good) but doesn't insist on having at least one digit somewhere (which is bad). We can follow a (non-empty) integer with a dot and (optionally) more digits and spaces ending with a digit (we don't want trailing spaces).

   digit+(\_digit+)*(\.((digit+\_)*digit+)?)?     // integer (dot integer?)?

Or, we can begin with a dot and follow with digits and spaces and end with a digit(s)

   \.(digit+\_)*digit+

(Note we don't allow a space before and/or after the dot; maybe that's a bug in the specification.)

Below, I'm using name → expression to give names to expressions to make things more readable (I hope). It's fine if you used a different symbol, like ::=. I took off the _with_spaces suffix and went with just number and exponent. The full expansion is pretty horrendous, so I'm skipping it. (Hope you did too.)

   value        → sign? base_and_nbr exponent?
   sign         → [+-]
   base_and_nbr → (base2? nbr2 | base8? nbr8 | base10? nbr10 | base16? nbr16)
   base2        → [bB]#\_?
   base8        → [oO]#\_?
   base10       → [dD]#\_?
   base16       → [xX]#\_?
   nbr2         → [01]+(\_[01]+)*(\.(([01]+\_)*[01]+)?)? | \.([01]+\_)*[01]+
   nbr8         → [0-7]+(\_[0-7]+)*(\.(([0-7]+\_)*[0-7]+)?)? | \.([0-7]+\_)*[0-7]+
   nbr10        → [0-9]+(\_[0-9]+)*(\.(([0-9]+\_)*[0-9]+)?)? | \.([0-9]+\_)*[0-9]+
   nbr16        → [0-9a-fA-F]+(\_[0-9a-fA-F]+)*(\.(([0-9a-fA-F]+\_)*[0-9a-fA-F]+)?)? | \.([0-9a-fA-F]+\_)*[0-9a-fA-F]+
   exponent     → \_?[eE] sign? base_and_nbr
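Since the full expansion is skipped, one way to sanity-check the named rules is to let a program do the expansion. The sketch below assembles them in Python, assuming Python's re module as a stand-in for egrep and a plain space for "\_"; the helper names nbr and is_number are invented here. It follows the rules as written, so the base tag is optional in every alternative, and the exponent reuses base_and_nbr.

```python
import re


def nbr(digits):
    """Integer or float over one digit class, following the nbr2..nbr16 rules."""
    run = f"[{digits}]+"
    # digits first: integer, optionally followed by a dot and more digit runs
    with_leading_digits = rf"{run}(?: {run})*(?:\.(?:(?:{run} )*{run})?)?"
    # dot first: at least one digit run must follow
    leading_dot = rf"\.(?:{run} )*{run}"
    return f"(?:{with_leading_digits}|{leading_dot})"


# (base letters, legal digits) for base2, base8, base10, base16
BASES = [("bB", "01"), ("oO", "0-7"), ("dD", "0-9"), ("xX", "0-9a-fA-F")]

SIGN = "[+-]"
BASE_AND_NBR = "(?:" + "|".join(
    f"(?:[{letters}]# ?)?{nbr(digits)}" for letters, digits in BASES
) + ")"
EXPONENT = f" ?[eE]{SIGN}?{BASE_AND_NBR}"
VALUE = re.compile(f"{SIGN}?{BASE_AND_NBR}(?:{EXPONENT})?")


def is_number(text):
    """True if the whole string is a number in the made-up format."""
    return VALUE.fullmatch(text) is not None


# Examples from the problem statement, with real spaces instead of underscores.
for text in ["-b# 1. e-b#10", "+3.e+1", "o#072 031", "b# 3", "12  34", "- 56"]:
    print(repr(text), is_number(text))
```

Generating the four number rules from one helper keeps the digit classes as the only thing that varies between bases, which is the same reason the solution names the pieces instead of expanding them.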
5. [18 points] (DFA that accepts abc, acc, bbc, and bca) Except for Start, Accept, and err, I named the states after the path you take to get there.

   State       a           b           c
   Start       a           b           err
   a           err         ab|ac|bb    ab|ac|bb
   ab|ac|bb    err         err         Accept
   b           err         ab|ac|bb    bc
   bc          Accept      err         err
   Accept      err         err         err
   err         err         err         err

[Not asked for: The DFA above is minimal. Rows with different (error vs. not error) patterns can't be joined, and Accept and err aren't both accepting or non-accepting states, so they can't be joined either. If you have separate rows for ab, ac, and bb, you'll see they behave identically (accept on c, err otherwise). That's why they can be joined. So the minimal automaton has seven states (when I said six I forgot about the error state).]

6. [20 = 4*5 points] (Modified Exercise 2.14, p.108: Balanced parentheses and brackets)

6a. The grammar has four rules, given below. The rule Start → S $$ lets the parser check for end-of-input.

   Rule #   Rule
   1        Start → S $$
   2        S → ( S ) S
   3        S → [ S ] S
   4        S → ε

6b. The parse table pairs the nonterminal at the top of the stack with the current input token and tells you which rule to apply to the nonterminal. err indicates a syntax error.

                Input Token
   Stack Top    (      )      [      ]      $$
   Start        1      err    1      err    1
   S            2      4      3      4      4
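The way a parse table drives the parser can be sketched as a short table-driven loop. This is only an illustration in Python, not the textbook's code: the names RULES, TABLE, and parse are made up, each character of the input is treated as one token, and the S row (rule 2 on "(", rule 3 on "[", rule 4 on ")", "]", and "$$") is derived from the grammar in 6a, since S → ε is predicted on anything that can follow S.

```python
# Rules from 6a: rule number -> (left-hand side, right-hand side).
RULES = {
    1: ("Start", ["S", "$$"]),
    2: ("S", ["(", "S", ")", "S"]),
    3: ("S", ["[", "S", "]", "S"]),
    4: ("S", []),  # S -> epsilon
}

# Parse table: (stack top, input token) -> rule number; a missing entry means err.
TABLE = {
    ("Start", "("): 1, ("Start", "["): 1, ("Start", "$$"): 1,
    ("S", "("): 2, ("S", "["): 3,
    ("S", ")"): 4, ("S", "]"): 4, ("S", "$$"): 4,
}

NONTERMINALS = {"Start", "S"}


def parse(text):
    """Return (accepted, list of rule numbers predicted, in order)."""
    tokens = list(text) + ["$$"]  # one token per character, then end marker
    stack = ["Start"]
    pos = 0
    predicted = []
    while stack:
        top = stack.pop()
        tok = tokens[pos] if pos < len(tokens) else None
        if top in NONTERMINALS:
            rule = TABLE.get((top, tok))
            if rule is None:
                return False, predicted  # err entry in the table
            predicted.append(rule)
            # Push the right-hand side so its first symbol ends up on top.
            stack.extend(reversed(RULES[rule][1]))
        elif top == tok:
            pos += 1  # match a terminal
        else:
            return False, predicted  # terminal mismatch
    return pos == len(tokens), predicted


ok, rules_used = parse("([]([]))[]")
print(ok, rules_used)
```

The explicit stack here stands in for the call stack of a recursive-descent parser; the sequence of predicted rules is a preorder listing of the parse tree's internal nodes.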
6c. Parse tree for ([]([]))[]. The outline-format tree is to the left; the terminal string on the right shows where each terminal symbol appears in the input (as the head of the string).

   Start
   . S
   .. (              ([]([]))[]
   .. S
   ... [             []([]))[]
   ... S
   .... ε
   ... ]             ]([]))[]
   ... S
   .... (            ([]))[]
   .... S
   ..... [           []))[]
   ..... S
   ...... ε
   ..... ]           ]))[]
   ..... S
   ...... ε
   .... )            ))[]
   .... S
   ..... ε
   .. )              )[]
   .. S
   ... [             []
   ... S
   .... ε
   ... ]             ]
   ... S
   .... ε
   . $$

6d. Trace of parser actions. Each row shows the stack and input after the action in its last column.

   Parser Stack          Input Stream              Action
   Start                 ( [ ] ( [ ] ) ) [ ] $$    (Initialize parser)
   S $$                  ( [ ] ( [ ] ) ) [ ] $$    (Predict) Rule 1: Start → S $$
   ( S ) S $$            ( [ ] ( [ ] ) ) [ ] $$    Rule 2: S → ( S ) S
   S ) S $$              [ ] ( [ ] ) ) [ ] $$      Match (
   [ S ] S ) S $$        [ ] ( [ ] ) ) [ ] $$      Rule 3: S → [ S ] S
   S ] S ) S $$          ] ( [ ] ) ) [ ] $$        Match [
   ] S ) S $$            ] ( [ ] ) ) [ ] $$        Rule 4: S → ε
   S ) S $$              ( [ ] ) ) [ ] $$          Match ]
   ( S ) S ) S $$        ( [ ] ) ) [ ] $$          Rule 2: S → ( S ) S
   S ) S ) S $$          [ ] ) ) [ ] $$            Match (
   [ S ] S ) S ) S $$    [ ] ) ) [ ] $$            Rule 3: S → [ S ] S
   S ] S ) S ) S $$      ] ) ) [ ] $$              Match [
   ] S ) S ) S $$        ] ) ) [ ] $$              Rule 4: S → ε
   S ) S ) S $$          ) ) [ ] $$                Match ]
   ) S ) S $$            ) ) [ ] $$                Rule 4: S → ε
   S ) S $$              ) [ ] $$                  Match )
   ) S $$                ) [ ] $$                  Rule 4: S → ε
   S $$                  [ ] $$                    Match )
   [ S ] S $$            [ ] $$                    Rule 3: S → [ S ] S
   S ] S $$              ] $$                      Match [
   ] S $$                ] $$                      Rule 4: S → ε
   S $$                  $$                        Match ]
   $$                    $$                        Rule 4: S → ε
   empty                 ε                         Match $$

Parse successful!

7. [10 = 5+5 points] (Modified Problem 2.26: First, Follow, etc.) The rules are

   S  → E $$
   E  → v ":=" E
   E  → T TL
   TL → + T TL | ε
   T  → F FL
   FL → * F FL | ε
   F  → ( E ) | v

7a. Here is a table that lists the inferences about FIRST, FOLLOW, and EPS that follow from each rule.

   Rule A → α        FIRST(α) includes   Other inferences from rule
   S → E $$          FIRST(E)            FIRST(E) ⊆ FIRST(S); $$ ∈ FOLLOW(E)
   E → v ":=" E      v                   v ∈ FIRST(E)
   E → T TL          FIRST(T)            FIRST(T) ⊆ FIRST(E); FIRST(TL) ⊆ FOLLOW(T); FOLLOW(E) ⊆ FOLLOW(TL); if EPS(TL) then FOLLOW(E) ⊆ FOLLOW(T)
   TL → + T TL       +                   + ∈ FIRST(TL); FIRST(TL) ⊆ FOLLOW(T); if EPS(TL) then FOLLOW(TL) ⊆ FOLLOW(T)
   TL → ε                                EPS(TL) = Y
   T → F FL          FIRST(F)            FIRST(F) ⊆ FIRST(T); FIRST(FL) ⊆ FOLLOW(F); FOLLOW(T) ⊆ FOLLOW(FL); if EPS(FL) then FOLLOW(T) ⊆ FOLLOW(F)
   FL → * F FL       *                   * ∈ FIRST(FL); FIRST(FL) ⊆ FOLLOW(F); if EPS(FL) then FOLLOW(FL) ⊆ FOLLOW(F)
   FL → ε                                EPS(FL) = Y
   F → ( E ) | v     (, v                (, v ∈ FIRST(F); ) ∈ FOLLOW(E)

Using the inferences, we can calculate the FIRST, FOLLOW, and EPS sets for each nonterminal. (FOLLOW(FL) contains all of FOLLOW(T) because of the inference FOLLOW(T) ⊆ FOLLOW(FL).)

   A      FIRST(A)    FOLLOW(A)        EPS(A)
   S      (, v                         N
   E      (, v        ), $$            N
   TL     +           ), $$            Y
   T      (, v        +, ), $$         N
   FL     *           +, ), $$         Y
   F      (, v        *, +, ), $$      N

From the FIRST, FOLLOW, and EPS sets, we can calculate the PREDICT sets for the rules:

   Rule A → α        PREDICT(A → α)
   S → E $$          (, v
   E → v ":=" E      v
   E → T TL          (, v
   TL → + T TL       +
   TL → ε            ), $$
   T → F FL          (, v
   FL → * F FL       *
   FL → ε            +, ), $$
   F → ( E )         (
   F → v             v

7b. The grammar is not LL(1) because v is in the PREDICT of two rules for the same nonterminal, E.
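The sets in 7a can be cross-checked mechanically by iterating the inferences until nothing changes. The sketch below is one way to do that in Python; it is a generic fixed-point computation rather than the textbook's exact algorithm, and the names GRAMMAR, analyze, and first_of are invented for this illustration. The start symbol is written S, and ε-rules are empty right-hand sides.

```python
# Grammar from Question 7 as (left-hand side, right-hand side) pairs.
GRAMMAR = [
    ("S", ["E", "$$"]),
    ("E", ["v", ":=", "E"]),
    ("E", ["T", "TL"]),
    ("TL", ["+", "T", "TL"]),
    ("TL", []),
    ("T", ["F", "FL"]),
    ("FL", ["*", "F", "FL"]),
    ("FL", []),
    ("F", ["(", "E", ")"]),
    ("F", ["v"]),
]


def analyze(grammar):
    """Compute EPS, FIRST, FOLLOW per nonterminal and PREDICT per rule."""
    nts = {lhs for lhs, _ in grammar}
    eps = {a: False for a in nts}
    first = {a: set() for a in nts}
    follow = {a: set() for a in nts}

    def first_of(seq):
        """Return (FIRST(seq), EPS(seq)) for a sequence of grammar symbols."""
        out = set()
        for sym in seq:
            if sym not in nts:  # terminal: it starts the sequence, stop here
                out.add(sym)
                return out, False
            out |= first[sym]
            if not eps[sym]:
                return out, False
        return out, True  # every symbol can vanish (or seq is empty)

    changed = True
    while changed:  # repeat until no set grows
        changed = False
        for lhs, rhs in grammar:
            f, e = first_of(rhs)
            if not f <= first[lhs]:
                first[lhs] |= f
                changed = True
            if e and not eps[lhs]:
                eps[lhs] = True
                changed = True
            for i, sym in enumerate(rhs):
                if sym in nts:
                    f_rest, e_rest = first_of(rhs[i + 1:])
                    new = f_rest | (follow[lhs] if e_rest else set())
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True

    predict = []
    for lhs, rhs in grammar:
        f, e = first_of(rhs)
        predict.append(f | (follow[lhs] if e else set()))
    return eps, first, follow, predict


eps, first, follow, predict = analyze(GRAMMAR)
for (lhs, rhs), p in zip(GRAMMAR, predict):
    print(lhs, "->", " ".join(rhs) or "ε", sorted(p))
```

Two rules for the same nonterminal whose PREDICT sets overlap signal that the grammar is not LL(1); here that overlap shows up between the two E rules.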
Syntax Analysis Björn B. Brandenburg The University of North Carolina at Chapel Hill Based on slides and notes by S. Olivier, A. Block, N. Fisher, F. Hernandez-Campos, and D. Stotts. The Big Picture Character
More informationTop-Down Parsing and Intro to Bottom-Up Parsing. Lecture 7
Top-Down Parsing and Intro to Bottom-Up Parsing Lecture 7 1 Predictive Parsers Like recursive-descent but parser can predict which production to use Predictive parsers are never wrong Always able to guess
More informationSyntax Analysis. The Big Picture. The Big Picture. COMP 524: Programming Languages Srinivas Krishnan January 25, 2011
Syntax Analysis COMP 524: Programming Languages Srinivas Krishnan January 25, 2011 Based in part on slides and notes by Bjoern Brandenburg, S. Olivier and A. Block. 1 The Big Picture Character Stream Token
More informationA Simple Syntax-Directed Translator
Chapter 2 A Simple Syntax-Directed Translator 1-1 Introduction The analysis phase of a compiler breaks up a source program into constituent pieces and produces an internal representation for it, called
More informationLexical Analyzer Scanner
Lexical Analyzer Scanner ASU Textbook Chapter 3.1, 3.3, 3.4, 3.6, 3.7, 3.5 Tsan-sheng Hsu tshsu@iis.sinica.edu.tw http://www.iis.sinica.edu.tw/~tshsu 1 Main tasks Read the input characters and produce
More information2.2 Syntax Definition
42 CHAPTER 2. A SIMPLE SYNTAX-DIRECTED TRANSLATOR sequence of "three-address" instructions; a more complete example appears in Fig. 2.2. This form of intermediate code takes its name from instructions
More informationDaMPL. Language Reference Manual. Henrique Grando
DaMPL Language Reference Manual Bernardo Abreu Felipe Rocha Henrique Grando Hugo Sousa bd2440 flt2107 hp2409 ha2398 Contents 1. Getting Started... 4 2. Syntax Notations... 4 3. Lexical Conventions... 4
More informationMore Examples. Lex/Flex/JLex
More Examples A FORTRAN-like real literal (which requires digits on either or both sides of a decimal point, or just a string of digits) can be defined as RealLit = (D + (λ. )) (D *. D + ) This corresponds
More informationLR Parsing Techniques
LR Parsing Techniques Introduction Bottom-Up Parsing LR Parsing as Handle Pruning Shift-Reduce Parser LR(k) Parsing Model Parsing Table Construction: SLR, LR, LALR 1 Bottom-UP Parsing A bottom-up parser
More informationProgram Syntax; Operational Semantics
9/5 Solved Program Syntax; Operational Semantics CS 536: Science of Programming, Fall 2018 A. Why Our simple programming language is a model for the kind of constructs seen in actual languages. Step-by-step
More informationIntroduction to Bottom-Up Parsing
Introduction to Bottom-Up Parsing Lecture 11 CS 536 Spring 2001 1 Outline he strategy: shift-reduce parsing Ambiguity and precedence declarations Next lecture: bottom-up parsing algorithms CS 536 Spring
More informationPart 5 Program Analysis Principles and Techniques
1 Part 5 Program Analysis Principles and Techniques Front end 2 source code scanner tokens parser il errors Responsibilities: Recognize legal programs Report errors Produce il Preliminary storage map Shape
More informationShared Variables and Interference
Solved Shared Variables and Interference CS 536: Science of Programming, Fall 2018 A. Why Parallel programs can coordinate their work using shared variables, but it s important for threads to not interfere
More informationAbstract Syntax Trees & Top-Down Parsing
Review of Parsing Abstract Syntax Trees & Top-Down Parsing Given a language L(G), a parser consumes a sequence of tokens s and produces a parse tree Issues: How do we recognize that s L(G)? A parse tree
More informationAbstract Syntax Trees & Top-Down Parsing
Abstract Syntax Trees & Top-Down Parsing Review of Parsing Given a language L(G), a parser consumes a sequence of tokens s and produces a parse tree Issues: How do we recognize that s L(G)? A parse tree
More informationAbstract Syntax Trees & Top-Down Parsing
Review of Parsing Abstract Syntax Trees & Top-Down Parsing Given a language L(G), a parser consumes a sequence of tokens s and produces a parse tree Issues: How do we recognize that s L(G)? A parse tree
More informationprintf( Please enter another number: ); scanf( %d, &num2);
CIT 593 Intro to Computer Systems Lecture #13 (11/1/12) Now that we've looked at how an assembly language program runs on a computer, we're ready to move up a level and start working with more powerful
More informationHaskell: Lists. CS F331 Programming Languages CSCE A331 Programming Language Concepts Lecture Slides Friday, February 24, Glenn G.
Haskell: Lists CS F331 Programming Languages CSCE A331 Programming Language Concepts Lecture Slides Friday, February 24, 2017 Glenn G. Chappell Department of Computer Science University of Alaska Fairbanks
More informationLast lecture CMSC330. This lecture. Finite Automata: States. Finite Automata. Implementing Regular Expressions. Languages. Regular expressions
Last lecture CMSC330 Finite Automata Languages Sets of strings Operations on languages Regular expressions Constants Operators Precedence 1 2 Finite automata States Transitions Examples Types This lecture
More informationCS 536 Midterm Exam Spring 2013
CS 536 Midterm Exam Spring 2013 ID: Exam Instructions: Write your student ID (not your name) in the space provided at the top of each page of the exam. Write all your answers on the exam itself. Feel free
More informationSlide 1 CS 170 Java Programming 1 The Switch Duration: 00:00:46 Advance mode: Auto
CS 170 Java Programming 1 The Switch Slide 1 CS 170 Java Programming 1 The Switch Duration: 00:00:46 Menu-Style Code With ladder-style if-else else-if, you might sometimes find yourself writing menu-style
More informationParsing. Handle, viable prefix, items, closures, goto s LR(k): SLR(1), LR(1), LALR(1)
TD parsing - LL(1) Parsing First and Follow sets Parse table construction BU Parsing Handle, viable prefix, items, closures, goto s LR(k): SLR(1), LR(1), LALR(1) Problems with SLR Aho, Sethi, Ullman, Compilers
More informationLecture Bottom-Up Parsing
Lecture 14+15 Bottom-Up Parsing CS 241: Foundations of Sequential Programs Winter 2018 Troy Vasiga et al University of Waterloo 1 Example CFG 1. S S 2. S AyB 3. A ab 4. A cd 5. B z 6. B wz 2 Stacks in
More informationMonday, August 26, 13. Scanners
Scanners Scanners Sometimes called lexers Recall: scanners break input stream up into a set of tokens Identifiers, reserved words, literals, etc. What do we need to know? How do we define tokens? How can
More informationChapter 3: Lexing and Parsing
Chapter 3: Lexing and Parsing Aarne Ranta Slides for the book Implementing Programming Languages. An Introduction to Compilers and Interpreters, College Publications, 2012. Lexing and Parsing* Deeper understanding
More informationWednesday, September 3, 14. Scanners
Scanners Scanners Sometimes called lexers Recall: scanners break input stream up into a set of tokens Identifiers, reserved words, literals, etc. What do we need to know? How do we define tokens? How can
More informationSFU CMPT 379 Compilers Spring 2018 Milestone 1. Milestone due Friday, January 26, by 11:59 pm.
SFU CMPT 379 Compilers Spring 2018 Milestone 1 Milestone due Friday, January 26, by 11:59 pm. For this assignment, you are to convert a compiler I have provided into a compiler that works for an expanded
More informationCS52 - Assignment 10
CS52 - Assignment 10 Due Wednesday 12/9 at 7:00pm https://xkcd.com/205/ Important Notice Assignments 9 and 10 are due at the same time. This is to give you maximum flexibility in scheduling during the
More informationStating the obvious, people and computers do not speak the same language.
3.4 SYSTEM SOFTWARE 3.4.3 TRANSLATION SOFTWARE INTRODUCTION Stating the obvious, people and computers do not speak the same language. People have to write programs in order to instruct a computer what
More informationMaciej Sobieraj. Lecture 1
Maciej Sobieraj Lecture 1 Outline 1. Introduction to computer programming 2. Advanced flow control and data aggregates Your first program First we need to define our expectations for the program. They
More informationMidterm I (Solutions) CS164, Spring 2002
Midterm I (Solutions) CS164, Spring 2002 February 28, 2002 Please read all instructions (including these) carefully. There are 9 pages in this exam and 5 questions, each with multiple parts. Some questions
More informationRegular Expressions. Regular Expression Syntax in Python. Achtung!
1 Regular Expressions Lab Objective: Cleaning and formatting data are fundamental problems in data science. Regular expressions are an important tool for working with text carefully and eciently, and are
More informationLanguage Processing note 12 CS
CS2 Language Processing note 12 Automatic generation of parsers In this note we describe how one may automatically generate a parse table from a given grammar (assuming the grammar is LL(1)). This and
More informationCompilers. Yannis Smaragdakis, U. Athens (original slides by Sam
Compilers Parsing Yannis Smaragdakis, U. Athens (original slides by Sam Guyer@Tufts) Next step text chars Lexical analyzer tokens Parser IR Errors Parsing: Organize tokens into sentences Do tokens conform
More informationCS502: Compilers & Programming Systems
CS502: Compilers & Programming Systems Top-down Parsing Zhiyuan Li Department of Computer Science Purdue University, USA There exist two well-known schemes to construct deterministic top-down parsers:
More information