COP4020 Programming Assignment 2 Spring 2011

Similar documents
COP4020 Programming Assignment 2 - Fall 2016

COP4020 Programming Assignment 6

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

LECTURE 7. Lex and Intro to Parsing

COP4020 Spring 2011 Midterm Exam

COP4020 Programming Languages. Syntax Prof. Robert van Engelen

A simple syntax-directed

COP4020 Programming Languages. Syntax Prof. Robert van Engelen

1 Lexical Considerations

Recursive Descent Parsers

Review main idea syntax-directed evaluation and translation. Recall syntax-directed interpretation in recursive descent parsers

A parser is some system capable of constructing the derivation of any sentence in some language L(G) based on a grammar G.

COMPILER CONSTRUCTION LAB 2 THE SYMBOL TABLE. Tutorial 2 LABS. PHASES OF A COMPILER Source Program. Lab 2 Symbol table

A Simple Syntax-Directed Translator

RYERSON POLYTECHNIC UNIVERSITY DEPARTMENT OF MATH, PHYSICS, AND COMPUTER SCIENCE CPS 710 FINAL EXAM FALL 96 INSTRUCTIONS

Lexical and Syntax Analysis. Top-Down Parsing

Compiler Construction Assignment 3 Spring 2018

CSE 130 Programming Language Principles & Paradigms Lecture # 5. Chapter 4 Lexical and Syntax Analysis

Question Points Score

Parsing III. CS434 Lecture 8 Spring 2005 Department of Computer Science University of Alabama Joel Jones

CSCI312 Principles of Programming Languages

Outline CS412/413. Administrivia. Review. Grammars. Left vs. Right Recursion. More tips forll(1) grammars Bottom-up parsing LR(0) parser construction

COP5621 Exam 3 - Spring 2005

Chapter 4. Lexical and Syntax Analysis

Introduction to Syntax Directed Translation and Top-Down Parsers

Syntax-Directed Translation. Lecture 14

Abstract Syntax Trees & Top-Down Parsing

Abstract Syntax Trees & Top-Down Parsing

Lexical and Syntax Analysis

Left to right design 1

About the Tutorial. Audience. Prerequisites. Copyright & Disclaimer. Compiler Design

Lexical and Syntax Analysis (2)

Abstract Syntax Trees & Top-Down Parsing

Action Table for CSX-Lite. LALR Parser Driver. Example of LALR(1) Parsing. GoTo Table for CSX-Lite

COP 3402 Systems Software Top Down Parsing (Recursive Descent)

Parsing III. (Top-down parsing: recursive descent & LL(1) )

Compiler construction in4303 lecture 3

Parsing. Lecture 11: Parsing. Recursive Descent Parser. Arithmetic grammar. - drops irrelevant details from parse tree

A programming language requires two major definitions A simple one pass compiler

Lexical Analysis - An Introduction. Lecture 4 Spring 2005 Department of Computer Science University of Alabama Joel Jones

4. Lexical and Syntax Analysis

CSE302: Compiler Design

Note that for recursive descent to work, if A ::= B1 B2 is a grammar rule we need First k (B1) disjoint from First k (B2).

Lexical Analysis. Introduction

Principles of Programming Languages

4. Lexical and Syntax Analysis

Lexical Considerations

LECTURE 3. Compiler Phases

Outline. Top Down Parsing. SLL(1) Parsing. Where We Are 1/24/2013

CSE431 Translation of Computer Languages

CSE 3302 Programming Languages Lecture 2: Syntax

Programming Languages (CS 550) Lecture 4 Summary Scanner and Parser Generators. Jeremy R. Johnson

Lexical and Syntax Analysis

CSCI312 Principles of Programming Languages!

LL(k) Compiler Construction. Top-down Parsing. LL(1) parsing engine. LL engine ID, $ S 0 E 1 T 2 3

Lexical Considerations

Software II: Principles of Programming Languages

Syntax. Syntax. We will study three levels of syntax Lexical Defines the rules for tokens: literals, identifiers, etc.

CS 4240: Compilers and Interpreters Project Phase 1: Scanner and Parser Due Date: October 4 th 2015 (11:59 pm) (via T-square)

LL(k) Parsing. Predictive Parsers. LL(k) Parser Structure. Sample Parse Table. LL(1) Parsing Algorithm. Push RHS in Reverse Order 10/17/2012

Chapter 3. Parsing #1

Configuration Sets for CSX- Lite. Parser Action Table

COP4020 Programming Assignment 1 - Spring 2011

10/4/18. Lexical and Syntactic Analysis. Lexical and Syntax Analysis. Tokenizing Source. Scanner. Reasons to Separate Lexical and Syntactic Analysis

CS 4120 Introduction to Compilers

10/5/17. Lexical and Syntactic Analysis. Lexical and Syntax Analysis. Tokenizing Source. Scanner. Reasons to Separate Lexical and Syntax Analysis

Homework. Lecture 7: Parsers & Lambda Calculus. Rewrite Grammar. Problems

CS 314 Principles of Programming Languages

LL(k) Compiler Construction. Choice points in EBNF grammar. Left recursive grammar

LL parsing Nullable, FIRST, and FOLLOW

The Parsing Problem (cont d) Recursive-Descent Parsing. Recursive-Descent Parsing (cont d) ICOM 4036 Programming Languages. The Complexity of Parsing

Topic 3: Syntax Analysis I

Syntactic Analysis. Top-Down Parsing

1 Introduction. 2 Recursive descent parsing. Predicative parsing. Computer Language Implementation Lecture Note 3 February 4, 2004

Lecture 14 Sections Mon, Mar 2, 2009

Parsers. Xiaokang Qiu Purdue University. August 31, 2018 ECE 468

Abstract Syntax Trees Synthetic and Inherited Attributes

RYERSON POLYTECHNIC UNIVERSITY DEPARTMENT OF MATH, PHYSICS, AND COMPUTER SCIENCE CPS 710 FINAL EXAM FALL 97 INSTRUCTIONS

Error Recovery during Top-Down Parsing: Acceptable-sets derived from continuation

UNIVERSITY OF CALIFORNIA Department of Electrical Engineering and Computer Sciences Computer Science Division. P. N. Hilfinger

Project Compiler. CS031 TA Help Session November 28, 2011

Wednesday, September 9, 15. Parsers

Parsers. What is a parser. Languages. Agenda. Terminology. Languages. A parser has two jobs:

flex is not a bad tool to use for doing modest text transformations and for programs that collect statistics on input.

Last time. What are compilers? Phases of a compiler. Scanner. Parser. Semantic Routines. Optimizer. Code Generation. Sunday, August 29, 2010

EXAM. CS331 Compiler Design Spring Please read all instructions, including these, carefully

Parsing #1. Leonidas Fegaras. CSE 5317/4305 L3: Parsing #1 1

CS 314 Principles of Programming Languages. Lecture 9

Anatomy of a Compiler. Overview of Semantic Analysis. The Compiler So Far. Why a Separate Semantic Analysis?

Project 2 Interpreter for Snail. 2 The Snail Programming Language

Syntax-Directed Translation Part I

3. Parsing. 3.1 Context-Free Grammars and Push-Down Automata 3.2 Recursive Descent Parsing 3.3 LL(1) Property 3.4 Error Handling

CSCE 531 Spring 2009 Final Exam

Syntax/semantics. Program <> program execution Compiler/interpreter Syntax Grammars Syntax diagrams Automata/State Machines Scanning/Parsing

The procedure attempts to "match" the right hand side of some production for a nonterminal.

Context-Free Grammar. Concepts Introduced in Chapter 2. Parse Trees. Example Grammar and Derivation

COP 3402 Systems Software Syntax Analysis (Parser)

CMSC 330: Organization of Programming Languages

Syntax-Directed Translation. Introduction

Jim Lambers ENERGY 211 / CME 211 Autumn Quarter Programming Project 2

Transcription:

COP4020 Programming Assignment 2 Spring 2011 Consider our familiar augmented LL(1) grammar for an expression language (see Syntax lecture notes on the LL(1) expression grammar): <expr> -> <term> <term_tail> term_tail.subtotal := term.value; expr.value := term_tail.value <term> -> <factor> <factor_tail> factor_tail.subtotal := factor.value; term.value := factor_tail.value <term_tail1> -> + <term> <term_tail2> term_tail2.subtotal := term_tail1.subtotal+term.value; term_tail1.value := term_tail2.value - <term> <term_tail2> term_tail2.subtotal := term_tail1.subtotal-term.value; term_tail1.value := term_tail2.value empty term_tail1.value := term_tail1.subtotal <factor1> -> ( <expr> ) factor1.value := expr.value - <factor2> factor1.value := -factor2.value num factor1.value := num.value id factor1.value := lookup(id.name) <factor_tail1> -> * <factor> <factor_tail2> factor_tail2.subtotal := factor_tail1.subtotal*factor.value; factor_tail1.value := factor_tail2.value / <factor> <factor_tail2> factor_tail2.subtotal := factor_tail1.subtotal/factor.value; factor_tail1.value := factor_tail2.value empty factor_tail1.value := factor_tail1.subtotal where lookup(id.name) returns the value of the identifier (assuming it is a variable with a value) from a name-value store, which could be an associative array or hashmap etc. in the implementation of an interpreter that uses this grammar to parse input. Note that we use nonterminals with indexes, so factor1 and factor2 are just the same factor nonterminal, but indexed so we can distinguish the nonterminals in the productions in the semantic rules. The empty in the productions shown above denotes ε. To goal of this project is to implement an interpreter that evaluates arithmetic expressions and that supports the use of variables in local scopes. The local scope is similar to the let construct in functional languages: let x = 2 in x*x which evaluates to 4, 3 * ( let x = 2 in x*x 1

which evaluates to 12, and 3 * ( let x = 2 in ( let y = x*x+1 in y/2 + 1 evaluates to 9 (integer division rounds towards zero). To implement the scoping of variable in our interpreter, we need to store name-value pairs. Note that when the interpreter executes a let it defines a new variable that is only used within the body of the let. Therefore, the natural choice for a name-value store would be a stack, where a new name-value pair is pushed on the stack when the interpreter enters the let and pops it off when it leaves the let. When the variable is used in the body of the let the interpreter searches the stack from the top (the last pair added) to the bottom to find the variable. This also allows for shadowing in scope rules when scopes are nested: let x = 1 in ( let x = 3 in 2*x + x which evaluates to 7. Here the scope of the x in the outer let is hidden in the inner body of the inner let. When the interpreter needs the value of a variable in an expression it looks it up in the stack from the top to the bottom to find the most recently added variable, which will be the one of the inner scope. To do this, we ext this grammar by adding a new production: <expr1> -> let id = <expr2> in <push> <expr3> <pop> push.name = id.name; push.value = expr2.value; expr1.value = expr3.value and we use dummy nonterminals that represent the push and pop operations that need to be performed before and after the evaluation of the let body expression expr3 as follows: <push> -> empty stack.push(pair(push.name, push.value)) <pop> -> empty stack.pop() Note that in this way the interpreter performs the push and pop operations cleverly while parsing the input, as these are embedded in the augmented grammar. More specifically, the stack object we use is a global stack that is updated with push and pop operations and can be searched top-down with a lookup function. 2

In this project we will implement an interpreter for this language using a recursivedescent parsing technique. The recursive descent parser for the let construct has the following outline: /* keywords recognized by the scanner */ #define LET 256 #define IN 257 #define END 258 /* the <expr> parser */ int expr() int result; if (lookahead() == LET) char *name; int value; match(let); // let name = id(); // id match( = ); // = value = expr(); // <expr> match(in); // in push(name, value); // <push> result = expr(); // <expr> match(end); // pop(); // <pop> int value = term(); // <term> result = term_tail(value); return result; // <term_tail> You need to complete the recursive descent parser implementation for the augmented grammar. For this assignment you can write the code in C or C++. You need to write a scanner first to tokenize the input. The scanner has two functions: lookahead() and getnext(). The lookahead() returns an integer that represents the current token. The getnext() moves to the next token on the input. The match() function can be written based on these: void match(int token) if (lookahead() == token)... // report syntax error and exit Note that if all tokens were single ASCII we can simply use getchar() for getnext() to read the next ASCII token: int current = 0; // getnext() gets the next token (simple version) void getnext() do current = getchar(); while (isspace(current)); // lookahead() returns current token int lookahead() if (current == 0) // first time around return current; 3 // skip space until next char

However, we also need to recognize the let, in, keywords, identifier names, and integer constants. For identifier names (variables), we need to store the current variable name and set current to ID or to NUM for integer constants: #define ID 259 #define NUM 260 int current = 0; int number; char name[80]; // max identifier name length is 79 // getnext() improved version void getnext() int c; do c = getchar(); while (isspace(c)); if (...) // identifier: current = ID; //... fill name[] with id name if (...) // integer: current = NUM; //... number = decoded value current = c; You should also check for the keywords let, in, and to set current to LET, IN, or END. Note that EOF is returned by lookahead() upon of file, which will be used to stop the expression parser and print the calculated value. Now implement the interpreter using the suggested scanner code shown above. You should test your scanner implementation first by calling getnext() in a loop and printing the lookahead() token: while (lookahead()!= EOF) switch (lookahead()) case ID: printf("id=%s\n", name); break; case NUM: printf("num=%d\n", number); break; default: printf("%c\n", lookahead()); This should tokenize the stdin stream and print the results to stdout. When you are confident that the scanner works properly, you can start working on the next phase, the parser which also performs interpreter operations to evaluate expressions. The parser should use recursive descent. Implement a stack to push and pop name-value pairs and search the stack top-down for the value of a name. The input to the interpreter is read from the stdin. The value should be printed to stdout. 4

BONUS ASSIGNMENT You can earn 10% extra credit as follows. Modify the grammar as follows to parse comma-separated expressions: <comm> -> <expr> <rest> rest.total := expr.total; comm.value := rest.value <rest> ->, <comm> rest.value := comm.value empty rest.value := rest.total The new start symbol of the grammar is comm (this is where the parser will start). Note that rest has two attributes: total (inherited) and value (synthesized). This allows parsing of comma-separated expressions, where the value of the list of expressions is the value of the last expression in this list. So the first values are simply ignored. This is similar to the comma operator in C. For example: 1, 2 evaluates to 2, and 3*(1, 2) evaluates to 6. Next, we add variable assignments to our language, similar to C. This way we can modify the value of a variable in a let construct: let x = 0 in (x = x + 1, x) which evaluates to 1, since x is assigned before the comma and the value of the comma expression is the value of x. Like in C, assignments can be used within expressions: let x = 0 in x = 2 * (x = x + 1) which evaluates to 2, since the value of the nested assignment is 1. The productions for factor should be modified to handle the new comma operator within parenthesis and the assignment operator: <factor1> -> ( <comm> ) factor1.value := comm.value - <factor2> factor1.value := -factor2.value num factor1.value := num.value id factor1.value := lookup(id.name) id = <expr> assign(id.name, expr.value); factor1.value = expr.value Note that the last two productions have common prefixes, which invalidates the LL(1) property of the grammar. You will have to resolve this in your recursive descent parser. The assign operation changes the value of the variable on the stack. That is, it searches top-down in the stack for the variable name and modifies its value stored there. End 5