CS2383 Programming Assignment 3

October 18, 2014
Due: November 4, at the end of our class period.

Due to the midterm and the holiday, the assignment will be accepted with a 10% penalty until the end of our class period on November 13. After this, it will not be accepted. You will have been assigned Program 4 before this, and it is probably unwise to still be working on Program 3 after the midterm.

1 Introduction

In this program, you will make a simple compiler for expressions. The key data structure is an expression tree. You will read in an expression, perform some optimization on it, and then generate a low-level Java program that can evaluate the expression.

The website contains a skeleton of the Java code for my solution. It leaves out all the interesting stuff, but some of the more boring code has already been written for you. You are to complete the skeleton. Testing with JUnit is not required for this program.

2 Details

Details of the input parser, optimizer, and code generator follow.

2.1 Input Parser

The expression to be compiled is given in reverse Polish notation (RPN). If you have not encountered this concept, an RPN expression is what you get from a postorder traversal of an expression tree. For instance, the expression we would normally write in our familiar infix notation as 2*4 + c*d becomes 2 4 * c d * + in RPN.

Our expressions involve 10 variables (lower-case a through j), the constants 0 through 9, and the operators +, -, * and /. These operators are for integer arithmetic; i.e., 5 6 / is zero, not the fraction 5/6. Since constants and variables are single characters, we do not allow spaces in our RPN expressions. So we would use 24*cd*+ to represent the infix expression 2*4 + c*d.

Why RPN? Well, if it is not a concept you know, it's a good one to learn. Also, it is easier to parse RPN than to parse ordinary infix expressions. RPN can be parsed using a stack of expression trees. In my solution, I use a LinkedBinaryTree<String> to represent an expression tree; leaves store strings such as "500" and "x", whereas internal nodes store strings such as "*". Thus, my parser uses a stack of these: well, actually I use a Deque<LinkedBinaryTree<String>>.

The classical RPN parsing algorithm is as follows. The characters ("tokens") in the input are processed from left to right. When you see a constant or a variable token, you construct a 1-node expression tree and push it onto the stack. When you see an operator token, you pop the top two expression trees from the stack. You make a new tree having the operator as its root, and the two popped trees as its left and right children. (You have to get the order of things right, because 13- means 1-3, not 3-1.) Having made this new tree, you push it onto the stack. Once the last token has been processed, the stack should have a single tree on it; this is the parser's output.

Input to your program is delivered via the command-line arguments, accessed by reading the parameter that main() takes. If you were developing your program with the command-line tools, then when you ran your program, you could specify the input expression. E.g.

    prompt> java Compiler "23-ab+*"

With Eclipse, under the Run menu, there is an option for Run Configurations. There is a tab for Arguments, and you could type your 23-ab+* into the Program arguments box.
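The stack-based parsing algorithm described in this section can be sketched roughly as follows. This is only an illustration, not the skeleton's code: the class RpnParseSketch, its tiny Node class (standing in for LinkedBinaryTree<String>), and the helper toInfix() are all hypothetical names introduced here, and the error handling is one possible choice.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch; Node stands in for the skeleton's LinkedBinaryTree<String>.
class RpnParseSketch {
    static final class Node {
        final String label;
        final Node left, right;
        Node(String label, Node left, Node right) {
            this.label = label;
            this.left = left;
            this.right = right;
        }
        boolean isLeaf() { return left == null && right == null; }
    }

    // Builds an expression tree from a space-free RPN string such as "24*cd*+".
    static Node parse(String rpn) {
        Deque<Node> stack = new ArrayDeque<>();
        for (char c : rpn.toCharArray()) {
            String token = String.valueOf(c);
            if ("+-*/".indexOf(c) >= 0) {
                if (stack.size() < 2) {
                    throw new IllegalArgumentException("operator " + token + " lacks operands");
                }
                Node right = stack.pop(); // pushed most recently, so it is the right operand
                Node left = stack.pop();
                stack.push(new Node(token, left, right));
            } else if ((c >= 'a' && c <= 'j') || (c >= '0' && c <= '9')) {
                stack.push(new Node(token, null, null)); // 1-node tree for a variable or constant
            } else {
                throw new IllegalArgumentException("unexpected token: " + token);
            }
        }
        if (stack.size() != 1) {
            throw new IllegalArgumentException("malformed RPN expression: " + rpn);
        }
        return stack.pop();
    }

    // Fully parenthesized infix rendering, handy for checking the operand order.
    static String toInfix(Node n) {
        if (n.isLeaf()) return n.label;
        return "(" + toInfix(n.left) + n.label + toInfix(n.right) + ")";
    }

    public static void main(String[] args) {
        String input = args.length > 0 ? args[0] : "23-ab+*";
        System.out.println(toInfix(parse(input))); // for 23-ab+* prints ((2-3)*(a+b))
    }
}
```

Popping the right operand first is what makes 13- come out as 1-3 rather than 3-1.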
IDEs other than Eclipse (at least NetBeans and JGrasp) have similar mechanisms.

2.2 Code Generation

Compilers usually generate either native machine code or assembly language. You'll learn about that in CS2253. The key idea is that the generated code consists of very simple statements that each do only one basic operation. We'll generate this kind of code, except as a bunch of Java statements that are in the body of a method whose parameters include our expression's variables. The course website shows a number of example outputs. So the RPN expression a2+b* should lead to a sequence of statements like the following:

    int temp1 = a;
    int temp2 = 2;
    int temp3 = temp1+temp2;
    int temp4 = b;
    int temp5 = temp3*temp4;

Your program should do the code generation from the expression tree, rather than trying to do something based directly on the RPN input. A (custom) postorder traversal of the expression tree will be required. Since it's a postorder traversal, it is not surprising that the generated code will follow the order of operations in the RPN. It does not make sense to attempt code generation until you can successfully parse the input. It should be possible to run the Java compiler on the code you have produced.

2.3 Optimization

A sophisticated compiler will do many optimizations to the given code. Your compiler will do constant folding and implement a few algebraic simplifications.

Constant folding occurs whenever the compiler can find an operator whose operands are constants. So, in the infix expression 2*a + (5*7 + 6) the compiler can first fold 5*7 into 35, then fold 35+6 into 41. Thus the expression now corresponds to 2*a + 41.

Every algebraic identity that a mathemagician can derive can lead to an opportunity for a compiler to do algebraic simplification. To keep things manageable, you need to implement simplifications based only on the following identities:

    X * 0 = 0 and 0 * X = 0, where X is any expression;
    X - X = 0, where X is any expression.

Note that one optimization can lead to an opportunity for another. For instance, in (5 / 6) * (a + b + c), the constant folding leads to 0 * (a + b + c), which can be further optimized to 0. After you have optimized the expression tree, there should be no remaining opportunities for constant folding or our limited set of algebraic simplifications.

The trickiest part of the optimizer probably comes from the last simplification, since we have to make sure that both the left and right subtrees of the - operator are identical expressions. So (a+b)-(a+b) needs to be detected. However, your optimizer does not need to take into account properties such as commutativity: it does not need to detect (a+b) - (b+a), for instance. The subexpressions a+b and b+a are different subexpressions.

The optimizer can be tackled as soon as the input parser works. The optimizer and the code generator are independent of one another.

3 Difficulty Levels

A real compiler requires dozens of person-years of effort by highly skilled developers. So you might be worried about this homework. However, a real compiler would have expressions as a fairly minor component. You should set aside at least 10 hours for this assignment. (I think CS students do not realize that programming assignments are inherently much more time consuming than the weekly homeworks in other courses. How much more time consuming depends on how badly you need additional programming practice.)

The skeleton omits about 15-20 (nonblank, noncomment) lines from my sample solution's parserpolish(). You've been told the basic algorithm, and so this should be the easiest part of the homework. Still, it is worth 50% of the homework.

The skeleton also omits code for code generation. Again, it is about 15 lines (mostly in a method invoked from codegenerate()). This code is trickier and involves a custom postorder traversal of the expression tree. It is worth 30% of the homework.

Finally, the skeleton omits optimization code. There are 15-20 lines in a method to determine whether two expressions are syntactically identical (needed for the subtract algebraic optimization). There are about 20 lines (some fairly long) that do the rest of the optimizations. You will probably find the full optimizer challenging to write, although basic constant folding is not too difficult. The full optimizer is worth 20% of the homework; i.e., one cannot get an A on the program without getting at least some of the optimizer working, but anyone who has to choose between getting code generation working and optimization should choose the former.
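For anyone starting with code generation, the postorder traversal from Section 2.2 might look roughly like the sketch below. It is an illustration under assumptions, not the sample solution: CodeGenSketch, Node, leaf(), op(), emit() and generate() are hypothetical names, Node stands in for LinkedBinaryTree<String>, and only the statement lines are produced (the enclosing method and its parameters are left out).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; Node stands in for the skeleton's LinkedBinaryTree<String>.
class CodeGenSketch {
    static final class Node {
        final String label;
        final Node left, right;
        Node(String label, Node left, Node right) {
            this.label = label;
            this.left = left;
            this.right = right;
        }
    }

    static Node leaf(String label) { return new Node(label, null, null); }
    static Node op(String label, Node left, Node right) { return new Node(label, left, right); }

    private final List<String> lines = new ArrayList<>();
    private int counter = 0; // number of temporaries handed out so far

    // Postorder: emit code for both children first, then one statement for this node.
    // Returns the name of the temporary that holds this subtree's value.
    private String emit(Node n) {
        String rhs;
        if (n.left == null && n.right == null) {
            rhs = n.label; // a constant or a variable
        } else {
            String l = emit(n.left);
            String r = emit(n.right);
            rhs = l + n.label + r;
        }
        String temp = "temp" + (++counter);
        lines.add("int " + temp + " = " + rhs + ";");
        return temp;
    }

    // Generates the statement sequence for a whole expression tree.
    static List<String> generate(Node root) {
        CodeGenSketch generator = new CodeGenSketch();
        generator.emit(root);
        return generator.lines;
    }

    public static void main(String[] args) {
        // Tree for the RPN expression a2+b*, i.e. (a+2)*b.
        Node tree = op("*", op("+", leaf("a"), leaf("2")), leaf("b"));
        for (String line : generate(tree)) {
            System.out.println(line);
        }
    }
}
```

Having emit() return the temporary's name is what lets a parent node refer to its children's results without any extra bookkeeping.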

4 What to Submit

Supply a printout of your source code. Electronically, I will use the final version checked into your Subversion repository as your submission.
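As a closing illustration of the optimizer from Section 2.3, constant folding and the identities X * 0 = 0, 0 * X = 0 and X - X = 0 could be applied bottom-up roughly as sketched here. Everything named in it (OptimizerSketch, Node, leaf(), op(), same(), optimize(), show()) is hypothetical and stands in for whatever the skeleton provides. Leaving a constant division by zero unfolded is an assumption of this sketch; the assignment does not say how that case should be handled.

```java
// Hypothetical sketch; Node stands in for the skeleton's LinkedBinaryTree<String>.
class OptimizerSketch {
    static final class Node {
        final String label;
        final Node left, right;
        Node(String label, Node left, Node right) {
            this.label = label;
            this.left = left;
            this.right = right;
        }
        boolean isLeaf() { return left == null && right == null; }
    }

    static Node leaf(String label) { return new Node(label, null, null); }
    static Node op(String label, Node left, Node right) { return new Node(label, left, right); }

    // A leaf holding an integer literal; folding can produce multi-digit or negative values.
    static boolean isConstant(Node n) { return n.isLeaf() && n.label.matches("-?\\d+"); }
    static boolean isZero(Node n) { return isConstant(n) && Integer.parseInt(n.label) == 0; }

    // Syntactic identity: same labels in the same shape (a+b and b+a are different).
    static boolean same(Node a, Node b) {
        if (a == null || b == null) return a == b;
        return a.label.equals(b.label) && same(a.left, b.left) && same(a.right, b.right);
    }

    // Optimizes the children first, so one simplification can enable another higher up.
    static Node optimize(Node n) {
        if (n.isLeaf()) return n;
        Node l = optimize(n.left);
        Node r = optimize(n.right);
        String oper = n.label;
        if (isConstant(l) && isConstant(r)) {
            int x = Integer.parseInt(l.label);
            int y = Integer.parseInt(r.label);
            switch (oper) {
                case "+": return leaf(Integer.toString(x + y));
                case "-": return leaf(Integer.toString(x - y));
                case "*": return leaf(Integer.toString(x * y));
                case "/":
                    if (y != 0) return leaf(Integer.toString(x / y));
                    break; // assumption: leave division by zero unfolded
                default:
                    break;
            }
        }
        if (oper.equals("*") && (isZero(l) || isZero(r))) return leaf("0"); // X*0 and 0*X
        if (oper.equals("-") && same(l, r)) return leaf("0");               // X-X
        return op(oper, l, r);
    }

    // Fully parenthesized infix rendering for inspection.
    static String show(Node n) {
        return n.isLeaf() ? n.label : "(" + show(n.left) + n.label + show(n.right) + ")";
    }

    public static void main(String[] args) {
        // (5/6) * ((a+b)+c) folds to 0 * (...) and then simplifies to 0.
        Node tree = op("*", op("/", leaf("5"), leaf("6")),
                            op("+", op("+", leaf("a"), leaf("b")), leaf("c")));
        System.out.println(show(optimize(tree))); // prints 0
    }
}
```

Because each node is examined only after its subtrees have been optimized, no folding or simplification opportunity from this limited set should remain in the returned tree.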