COP4020 Programming Assignment 2 - Fall 2016

The goal of this project is to implement in C or C++ (your choice) an interpreter that evaluates arithmetic expressions with variables in local scopes. The local scope is similar to the let construct in functional languages:

  let x = 2 in x*x end

which evaluates to 4,

  3 * (let x = 2 in x*x end)

which evaluates to 12, and

  3 * (let x = 2 in (let y = x*x+1 in y/2 + 1 end) end)

which evaluates to 9 (integer division rounds towards zero).

Expression grammar

Consider our familiar augmented LL(1) grammar for an expression language (see the Syntax lecture notes on the LL(1) expression grammar):

  <expr> -> <term> <term_tail>
      term_tail.subtotal := term.value; expr.value := term_tail.value

  <term> -> <factor> <factor_tail>
      factor_tail.subtotal := factor.value; term.value := factor_tail.value

  <term_tail1> -> + <term> <term_tail2>
      term_tail2.subtotal := term_tail1.subtotal + term.value; term_tail1.value := term_tail2.value

  <term_tail1> -> - <term> <term_tail2>
      term_tail2.subtotal := term_tail1.subtotal - term.value; term_tail1.value := term_tail2.value

  <term_tail1> -> empty
      term_tail1.value := term_tail1.subtotal

  <factor1> -> ( <expr> )
      factor1.value := expr.value

  <factor1> -> - <factor2>
      factor1.value := -factor2.value

  <factor1> -> num
      factor1.value := num.value

  <factor1> -> id
      factor1.value := lookup(id.name)

  <factor_tail1> -> * <factor> <factor_tail2>
      factor_tail2.subtotal := factor_tail1.subtotal * factor.value; factor_tail1.value := factor_tail2.value

  <factor_tail1> -> / <factor> <factor_tail2>
      factor_tail2.subtotal := factor_tail1.subtotal / factor.value; factor_tail1.value := factor_tail2.value

  <factor_tail1> -> empty
      factor_tail1.value := factor_tail1.subtotal

where lookup(id.name) returns the value of the identifier (assuming it is a variable with a value) from a name-value store such as a binding stack. Note that the augmented grammar above uses indexed nonterminals, which is needed to distinguish the different uses of a nonterminal in the semantic rules; factor1 and factor2, for example, are just the same factor nonterminal. The meta-symbol empty in the productions denotes ε.
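
To make the connection between these semantic rules and the recursive descent code you will write concrete, here is a minimal sketch (an illustration only, not required code) of how the <term_tail> productions could be turned into a C function: the inherited subtotal attribute is passed as a parameter and the synthesized value attribute is returned. It assumes the match(), lookahead(), and term() helpers described later in this handout.

  /* sketch: <term_tail> with subtotal passed in and value returned */
  int term_tail(int subtotal)
  {
    if (lookahead() == '+')
    {
      match('+');
      int value = term();                  /* <term> */
      return term_tail(subtotal + value);  /* term_tail2.subtotal := term_tail1.subtotal + term.value */
    }
    else if (lookahead() == '-')
    {
      match('-');
      int value = term();                  /* <term> */
      return term_tail(subtotal - value);  /* term_tail2.subtotal := term_tail1.subtotal - term.value */
    }
    return subtotal;                       /* empty: term_tail1.value := term_tail1.subtotal */
  }

For the input 2+3-1, for example, the calls proceed as term_tail(2) -> term_tail(5) -> term_tail(4), which yields the left-to-right result 4.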

Let-rules

To facilitate scoping of variables with let, we need to store the name-value pairs on a binding stack. When our interpreter executes a let, the declared variable is only visible in the body of the let expression. Therefore, the obvious choice for a name-value store is a binding stack: a new name-value pair is pushed onto the stack when the interpreter enters the let and popped off when it leaves the let. When the variable is used in the body of the let, the interpreter searches the stack from the top (the last pair added) to the bottom to find the variable. This also allows for shadowing when scopes are nested:

  let x = 1 in (let x = 3 in 2*x end) + x end

which evaluates to 7. Here the x of the outer let is hidden (shadowed) inside the body of the inner let. When the interpreter needs the value of a variable in an expression, it looks it up in the binding stack from top to bottom to find the most recently pushed binding.

To add the let construct to our grammar we extend it with a new production:

  <expr1> -> let id = <expr2> in <push> <expr3> end <pop>
      push.name = id.name; push.value = expr2.value; expr1.value = expr3.value

We use the dummy nonterminals <push> and <pop> to represent the push and pop operations that we need to execute before and after the evaluation of the let body expression expr3:

  <push> -> empty
      stack.push(pair(push.name, push.value))

  <pop> -> empty
      stack.pop()

Note that in this way the interpreter performs the push and pop operations on the fly while parsing the input, since the operations are embedded in the augmented grammar. More specifically, the stack object we use is a global stack that is updated with push and pop operations and can be searched top-down with a lookup function. You have to decide on a stack implementation. The stack size should be at least 100.

Recursive descent parsing

In this project we will use the recursive descent parsing technique. The recursive descent parser uses a function for each nonterminal to parse the input that the nonterminal represents, for example parsing an <expr>:

  /* define keyword tokens recognized by the scanner */
  #define LET 256
  #define IN  257
  #define END 258

  /* the <expr> parser */
  int expr()
  {
    int result;
    if (lookahead() == LET)
    {
      char *name;
      int value;
      match(LET);          // let
      name = id();         // id
      match('=');          // =
      value = expr();      // <expr>
      match(IN);           // in
      push(name, value);   // <push>
      result = expr();     // <expr>
      match(END);          // end
      pop();               // <pop>
    }
    else
    {
      int value = term();          // <term>
      result = term_tail(value);   // <term_tail>
    }
    return result;
  }

You need to complete the recursive descent parser implementation for the augmented grammar. For this assignment you can write the code in C or C++. The match() function can be written along these lines:

  void match(int token)
  {
    if (lookahead() == token)
      ...
    else
      ...   // report syntax error and exit
  }
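
The push(), pop(), and lookup() operations are not given in this handout; the binding stack implementation is up to you. As a rough sketch only (assuming a fixed-size global array of name-value pairs and the name[80] length limit used by the scanner below), it could look like the following, with lookup() scanning from the top of the stack down so that the most recently pushed binding of a name shadows older ones:

  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>

  #define STACK_SIZE 100   /* required minimum stack size */

  struct binding { char name[80]; int value; };
  static struct binding stack[STACK_SIZE];
  static int top = 0;      /* number of bindings currently on the stack */

  void push(const char *name, int value)
  {
    if (top == STACK_SIZE) { fprintf(stderr, "binding stack overflow\n"); exit(1); }
    strcpy(stack[top].name, name);
    stack[top].value = value;
    top++;
  }

  void pop(void)
  {
    if (top > 0)
      top--;
  }

  /* search top-down for the most recent binding of name */
  int lookup(const char *name)
  {
    for (int i = top - 1; i >= 0; i--)
      if (strcmp(stack[i].name, name) == 0)
        return stack[i].value;
    fprintf(stderr, "undefined variable %s\n", name);
    exit(1);
  }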

Adding a scanner

You need to write a scanner first to tokenize the input. The scanner has two functions: lookahead() and getnext(). The lookahead() function returns an integer that represents the current token. The getnext() function moves to the next token on the input. Note that if all tokens were single ASCII characters, we could simply use getchar() in getnext() to read the next token:

  int current = 0;

  // getnext() gets the next token (simple version)
  void getnext()
  {
    do
      current = getchar();     // skip space until next char
    while (isspace(current));
  }

  // lookahead() returns current token
  int lookahead()
  {
    if (current == 0)          // first time around
      getnext();
    return current;
  }

However, we also need to recognize the let, in, and end keywords, identifier names, and integer constants. For identifier names (variables), we need to store the current variable name and set current to ID, or to NUM for integer constants:

  #define ID  259
  #define NUM 260

  int current = 0;
  int number;
  char name[80];   // max identifier name length is 79

  // getnext() improved version
  void getnext()
  {
    int c;
    do
      c = getchar();
    while (isspace(c));
    if (...check for keyword...)
      current = ...keyword token...
    else if (...check for identifier...)
    {
      current = ID;
      ...   // fill name[] with id name
    }
    else if (...check for number...)
    {
      current = NUM;
      ...   // number = decoded value
    }
    else
      current = c;
  }

Check for the keywords let, in, and end to set current to the value of LET, IN, or END (these are macros, see above). Note that EOF is returned by lookahead() upon end of file, which will be used to stop the expression parser and print the calculated value.
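
The checks elided with ... above are yours to fill in. As one possible (hypothetical) way to complete the skeleton, isalpha()/isalnum() can collect a keyword or identifier, isdigit() can collect an integer constant, and ungetc() can push back the one character that was read too far:

  #include <ctype.h>
  #include <stdio.h>
  #include <string.h>

  /* sketch of a completed getnext(); assumes the LET, IN, END, ID, NUM
     macros and the globals current, number, and name[] defined above */
  void getnext()
  {
    int c, i = 0;
    do
      c = getchar();                         // skip whitespace
    while (isspace(c));
    if (isalpha(c))                          // keyword or identifier
    {
      while (isalnum(c) && i < 79)
      {
        name[i++] = c;
        c = getchar();
      }
      name[i] = '\0';
      ungetc(c, stdin);                      // went one character too far
      if (strcmp(name, "let") == 0)      current = LET;
      else if (strcmp(name, "in") == 0)  current = IN;
      else if (strcmp(name, "end") == 0) current = END;
      else                               current = ID;
    }
    else if (isdigit(c))                     // integer constant
    {
      number = 0;
      while (isdigit(c))
      {
        number = 10 * number + (c - '0');
        c = getchar();
      }
      ungetc(c, stdin);
      current = NUM;
    }
    else
      current = c;                           // single-character token or EOF
  }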

You should test your scanner implementation first by calling getnext() in a loop and printing the lookahead() token:

  while (lookahead() != EOF)
  {
    switch (lookahead())
    {
      case ID:  printf("id=%s\n", name);    break;
      case NUM: printf("num=%d\n", number); break;
      default:  printf("%c\n", lookahead());
    }
    getnext();
  }

This should tokenize the stdin stream and print the results to stdout. When you are confident that the scanner works properly, you can start working on the next phase: the interpreter that evaluates expressions.

Interpreter

Implement the let-expression interpreter using your scanner and parser. The parser should use recursive descent. Implement a stack to push and pop name-value pairs and search the stack top-down for the value of a name. The input to the interpreter is read from stdin. The value should be printed to stdout.

BONUS

You can earn 10% extra credit as follows. Modify the grammar to include multiple variables in a let, such as

  let x = 1, y = 2 in x+y end

by starting with:

  <expr1> -> let <vars> in <expr2> end <popvars>
      expr1.value = expr2.value
      ???

  <vars> -> id = <expr> <restvars>
      vars.count = restvars.count + 1
      ???

  <restvars> -> , <vars>
      restvars.count = vars.count

  <restvars> -> empty
      restvars.count = 0

  <popvars> -> empty
      ???

The semantic rules are not complete; you will need to fill in the code for the ??? parts. The number of pops to execute in <popvars> should be the number of bindings that <vars> pushed. Use the count attribute for this as suggested above. Implement the new grammar and rules in your recursive descent parser.

End