Garrett Hall CSC 173 C Programming Weeks 3-4

1. Overview

The program parses code from an input file and evaluates it. The evaluator can handle basic mathematical operations (+, -, *, /, ^) and trigonometric functions (Sin, Cos, Tan, Asin, Acos, Atan). In addition, the evaluator can handle variables and even solve simple algebraic equalities using a system of substitution rules (described next).

2. Evaluation

The evaluation system is described first because it provides examples of the capabilities and syntax of the program as a whole. To run these examples, run the program and type the filename at the prompt.

2.1 Substitution Rules

All algebra in the evaluation is based upon substitution rules, which are indicated by the -> token. If the left side of a rule is matched, the expression is replaced by the right side. For instance, the rule a-a -> 0 will replace the difference of any two identical expressions with 0.

example1.1

    a-a -> 0;
    x-x;

    x - x
    0

Note that substitution rule variables such as a can match any complex expression such as x*3.

example1.2

    a-a -> 0;
    x*3-x*3;

    (x * 3) - (x * 3)
    0

Note: Substitution rule variables must be a single character a-z and are unrelated to variables appearing outside of the rule. For instance, x-x -> 0 would have produced the same output as above.

2.2 Normal Variables

Unlike variables that appear in a substitution rule, normal variables can bind to numbers. The assignment operator = indicates a binding when the variable is on the left and the number is on the right.

example2.1

    x = 3;
    x - 2;

    x = 3
    x - 2
    3 - 2
    1

It is important to note that variables cannot be reassigned and there is no symbol table. Instead, the expression x=3 is similar to the rule x -> 3, where x is a text match. For instance:

example2.2

    x = 3;
    x = 5;

    x = 3
    x = 5
    3 = 5
    False

The program can also handle multi-variable equations.

example2.3

    z = y / 2;
    y = 4;

    z = (y / 2)
    y = 4
    z = (y / 2)
    z = (4 / 2)
    z = 2

2.3 Simple Algebra Rules

Rules and variables are combined to make basic algebra systems. The substitution rule algorithm is designed to simplify expressions as much as possible given a set of rules. A simple problem would be x · 2 = 4. The evaluator doesn't know anything about dealing with variable expressions until we add some rules. By adding the rule a*b=c -> a=c/b, the evaluator can now deal with a larger subset of problems.

example3.1

    a*b=c -> a=c/b;
    x * 2 = 4;

    (x * 2) = 4
    x = 2

To deal with multiplicands in reverse order (2 · x) we can add a commutative rule.

example3.2

    a*b -> b*a;
    2 * y = 4;

    (2 * y) = 4
    (y * 2) = 4
    y = 2

The example can be extended with identities.

example3.3

    a/a -> 1;
    z * w = z;

    (z * w) = z
    (w * z) = z
    w = 1

Or associative rules.

example3.4

    a*(b*c) -> (a*b)*c;
    2 * u * v = u;

    (2 * (u * v)) = u
    (2 * (v * u)) = u
    ((v * u) * 2) = u
    ((2 * v) * u) = u
    (v * 2) = 1
    v = 0.5

Note: The steps in red are omitted from output, but I've added them for clarity. This is because the evaluation algorithm is essentially looking two steps ahead.

2.4 Advanced Algebra

Here are some more advanced examples using a larger algebra base found in the rules file. The first solves the pair of equations 2x^2 = 9y and y/(2z) = 1/z.

example4.1

    2 * x^2 = 9 * y;
    y / 2 * z = 1 / z;

    (2 * (x ^ 2)) = (9 * y)
    (y / (2 * z)) = (1 / z)
    (2 * (x ^ 2)) = (9 * y)
    ((x ^ 2) * 2) = (9 * y)
    ((x ^ 2) * 2) = (y * 9)
    ((4.5 * y) ^ 0.5) = x
    (y / (2 * z)) = (1 / z)
    (y / (z * 2)) = (1 / z)
    (y / (2 * z)) = (1 / z)
    ((2 * z) / z) = y
    ((4.5 * y) ^ 0.5) = x
    ((y * 4.5) ^ 0.5) = x
    ((4.5 * y) ^ 0.5) = x
    ((x ^ 2) / y) = 4.5
    ((2 * z) / z) = y
    ((z * 2) / z) = y
    2 = y
    y = 2
    ((x ^ 2) / y) = 4.5
    ((x ^ 2) / 2) = 4.5
    ((x ^ 2) * 0.5) = 4.5
    x = 3

The output is a rather verbose series of manipulations, but the correct values are eventually reached.
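The rule matching described in section 2.1 can be illustrated with a small sketch. This is not the author's evaluate.c, which is not shown; the `Node` type, `same_tree`, and `match` are hypothetical names chosen for illustration. It assumes a rule variable is a single character a-z that captures whatever subtree it first meets, and that a repeated variable must capture structurally equal subtrees, which is what lets a-a match (x * 3) - (x * 3).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node type for this sketch; the real AST.c layout is not shown. */
typedef enum { N_NUM, N_VAR, N_OP } Kind;

typedef struct Node {
    Kind kind;
    double num;                /* value when kind == N_NUM */
    char name;                 /* variable name or operator character */
    const struct Node *left;   /* operands when kind == N_OP */
    const struct Node *right;
} Node;

/* Structural equality of two expression trees. */
bool same_tree(const Node *a, const Node *b) {
    if (a == NULL || b == NULL) return a == b;
    if (a->kind != b->kind) return false;
    if (a->kind == N_NUM) return a->num == b->num;
    if (a->kind == N_VAR) return a->name == b->name;
    return a->name == b->name &&
           same_tree(a->left, b->left) && same_tree(a->right, b->right);
}

/* Match the left side of a rule (pat) against an expression (expr).
 * bind[c - 'a'] records the subtree captured by rule variable c. */
bool match(const Node *pat, const Node *expr, const Node *bind[26]) {
    assert(pat != NULL && expr != NULL);
    if (pat->kind == N_VAR) {
        const Node **slot = &bind[pat->name - 'a'];
        if (*slot == NULL) { *slot = expr; return true; }
        return same_tree(*slot, expr);   /* repeated variable: must agree */
    }
    if (pat->kind == N_NUM)
        return expr->kind == N_NUM && expr->num == pat->num;
    return expr->kind == N_OP && expr->name == pat->name &&
           match(pat->left, expr->left, bind) &&
           match(pat->right, expr->right, bind);
}
```

A caller would clear the `bind` array before each attempt, since a failed match can leave partial captures behind. Replacing the matched node with the rule's right side, and choosing among several applicable rules, are separate steps not covered here.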

Note: The evaluator cannot handle systems of inequalities or equations, although it would be possible to extend it to do so. Multivariable equations are only solvable if assignments can be found for all variables except one.

Here we solve for the double-angle identities. Note that these identities are not rules themselves, but are found by expansion.

example4.2

    Cos(x)^2+Sin(x)^2;

    ((Cosx) ^ 2) + ((Sinx) ^ 2)
    ((Cosx) * (Cosx)) + ((Sinx) * (Sinx))
    Cos(x - x)
    Cos(0)
    1

The acos function in the gcc math library returns a single number, so only one solution is found.

example4.3

    0 = Cos(x)^2-Sin(x)^2;

    0 = (((Cosx) ^ 2) - ((Sinx) ^ 2))
    0 = (((Cosx) ^ 2) - ((Sinx) ^ 2))
    0 = (Cos(2 * x))
    (x * 2) = (Acos0)
    (x * 2) = (Acos0)
    (x * 2) = 1.5708
    x = 0.785398

2.5 Evaluation Algorithm

Because evaluation is not the main focus of this assignment I won't discuss it in much detail, although it is the most complex aspect of the program. The evaluator traverses the AST built by the parser (detailed later). At each node, the evaluator checks whether the node matches any substitution rule. When multiple substitutions are possible, the algorithm performs a breadth-first search of fixed depth on the substitution rules. The resulting node with the best heuristic value is chosen. This heuristic favors eliminating variables, simplifying expressions, and moving variables to the left side of an expression (i.e. x=2 is better than 2=x). If numbers are found on both sides of an operator node, the operator is replaced by the numeric result of the operation.

3. Implementation

The program does the following:

1. Reads input from the file specified in main.c
2. Tokenizes it in scanner.c using a DFA
3. Parses it in parser.c using a recursive-descent LL(1) parser
4. Builds abstract syntax trees (binary tree data structures described in AST.c)
5. Evaluates the code in evaluate.c

3.1. Language

Below, the language is described in terms of the regular expressions accepted by the scanner and the context-free grammar (CFG) accepted by the parser.
3.1.1 Character Classes

    digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
    lower → a | b | ... | z
    upper → A | B | ... | Z

3.1.2 Accepted Tokens

    token      → number | variable | function | -> | white | ) | * | + | / | - | ^ | ( | ; | eof
    number     → digit digit* ( . digit* )?
    variable   → lower ( lower | upper | digit )*
    function   → upper lower lower*
    white      → tab | newline | space
    comparison → < | <= | = | >= | >

Function keywords are: Cos, Sin, Tan, Acos, Asin, and Atan.

3.1.3 Context-free Grammar

    statement → rule ; statement | ε
    rule      → expression -> expression | expression

The -> token indicates a special substitution rule.

    expression → field comparison field | field

Comparisons will evaluate to true or false, although = will act as an assignment to an unbound variable. For example, x=2 will act as an assignment statement if x was previously unassigned. Otherwise it will be a comparison.

    field  → field1 + field  | field1 | ( field )
    field1 → field2 - field1 | field2 | ( field )
    field2 → field3 / field2 | field3 | ( field )
    field3 → field4 * field3 | field4 | ( field )
    field4 → field5 ^ field4 | field5 | ( field )
    field5 → - atom | atom
    atom   → number | variable | function ( field )

The field i are non-terminals which represent the order of operations. Specifically, i is the priority of the operation. The non-terminals are right-recursive, which allows LL(1) parsing (no infinite loops) and creates parse trees.

The - atom production is a signed atom and uses a flag so that it can only occur in the first atom. Otherwise the negative sign could be infinitely nested. Function keywords are Cos, Sin, Tan, Acos, Asin, and Atan.

3.2 Scanner DFA

[Figure: state diagram of the scanner DFA, with transitions labeled by the alphabet sets below and accepting states for the Function Token, Variable Token, Number Token, Equality Token, Comment Token, Rule Token, and Other Tokens (symbol ∪ white).]

Alphabet Sets

    lower  = {a, b, ..., z}
    upper  = {A, B, ..., Z}
    symbol = {+ - * / ^ ; ) ( < = > eof}
    digit  = {0, 1, ..., 9}
    F      = white ∪ { ( }
    V      = number ∪ upper ∪ lower
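Since the state diagram is hard to read in this form, the following hand-coded scanner gives a rough idea of how such a DFA might be written in C. It is an illustrative sketch, not the author's scanner.c: the `TokenKind` enum and `next_token` are hypothetical names, comments (# ... #) are not handled, function names are not checked against the keyword list, and the number and function shapes are assumptions based on the token definitions in section 3.1.2.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Token kinds named after the accepting states of the diagram;
 * the enum itself is a hypothetical stand-in for scanner.c. */
typedef enum {
    TOK_NUMBER, TOK_VARIABLE, TOK_FUNCTION, TOK_RULE,
    TOK_COMPARE, TOK_SYMBOL, TOK_EOF, TOK_ERROR
} TokenKind;

/* Scan one token starting at *src, skipping leading whitespace.
 * On return *src points just past the token and *len holds its length. */
TokenKind next_token(const char **src, size_t *len) {
    assert(src != NULL && *src != NULL && len != NULL);
    const char *p = *src;
    while (isspace((unsigned char)*p)) p++;
    const char *start = p;
    TokenKind kind;

    if (*p == '\0') {
        kind = TOK_EOF;
    } else if (isdigit((unsigned char)*p)) {            /* number */
        while (isdigit((unsigned char)*p)) p++;
        if (*p == '.') {
            p++;
            while (isdigit((unsigned char)*p)) p++;
        }
        kind = TOK_NUMBER;
    } else if (islower((unsigned char)*p)) {            /* variable */
        while (isalnum((unsigned char)*p)) p++;
        kind = TOK_VARIABLE;
    } else if (isupper((unsigned char)*p)) {            /* function */
        p++;
        while (islower((unsigned char)*p)) p++;
        kind = (p - start >= 2) ? TOK_FUNCTION : TOK_ERROR;
    } else if (p[0] == '-' && p[1] == '>') {            /* rule arrow */
        p += 2;
        kind = TOK_RULE;
    } else if (*p == '<' || *p == '>' || *p == '=') {   /* comparison */
        char first = *p++;
        if (first != '=' && *p == '=') p++;             /* <= or >= */
        kind = TOK_COMPARE;
    } else if (strchr("+-*/^();", *p) != NULL) {        /* other symbols */
        p++;
        kind = TOK_SYMBOL;
    } else {
        p++;
        kind = TOK_ERROR;
    }
    *len = (size_t)(p - start);
    *src = p;
    return kind;
}
```

Each branch corresponds to following one path of the automaton until no further transition applies, so the longest possible token is returned. The one-character lookahead for -> is what distinguishes the rule token from a minus sign.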

3.3. Abstract Syntax Tree (AST)

The parser, which implements the CFG, generates a binary tree. This is not the concrete parse tree because redundant non-terminals are not added to the tree, so it is an AST. For example, the derivation field3 → field4 → field5 → atom → number is reduced to simply number. Each node in the AST has an associated class:

    AST_BINARY:   a binary operation on two nodes          +
    AST_UNARY:    a unary operation on one node            Cos
    AST_COMPARE:  comparison between two nodes             <=
    AST_NUMBER:   single node that is a number             9.32
    AST_VARIABLE: single node that is a variable           myvar
    AST_BOUND:    single node that is an assigned variable myvar = 9.32

This is the parse tree for the statement 9*3+5; Notice the order of operations inherent in the tree structure:

[Figure: parse tree for 9*3+5; descending from statement through rule, expression, field, and field1 to field5 down to atom, with leaves number * number + number ;]
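As a rough sketch of how the node classes listed above might be laid out in C, the block below defines a node struct and a prefix-notation printer. Only the class names come from the text; the `Ast` struct fields and the `ast_prefix` function are assumptions made for illustration, since AST.c itself is not shown.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Node classes taken from the list above; the struct layout is hypothetical. */
typedef enum {
    AST_BINARY, AST_UNARY, AST_COMPARE,
    AST_NUMBER, AST_VARIABLE, AST_BOUND
} AstClass;

typedef struct Ast {
    AstClass cls;
    const char *text;          /* operator, function, or variable name */
    double value;              /* used by AST_NUMBER and AST_BOUND */
    const struct Ast *left;    /* NULL for leaves */
    const struct Ast *right;   /* NULL for leaves and unary nodes */
} Ast;

/* Append the tree to buf in prefix notation, separating items by spaces.
 * buf must already hold a NUL-terminated string (usually empty). */
void ast_prefix(const Ast *node, char *buf, size_t size) {
    assert(buf != NULL && size > 0);
    if (node == NULL) return;
    size_t used = strlen(buf);
    if (used > 0 && used + 1 < size) {
        buf[used++] = ' ';
        buf[used] = '\0';
    }
    if (node->cls == AST_NUMBER)
        snprintf(buf + used, size - used, "%g", node->value);
    else
        snprintf(buf + used, size - used, "%s", node->text);
    ast_prefix(node->left, buf, size);
    ast_prefix(node->right, buf, size);
}
```

For the statement 9*3+5; a tree built by hand with + at the root, * as its left child (over 9 and 3), and 5 as its right child prints as "+ * 9 3 5" with this function. Visiting the operator before its operands is what produces prefix order.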

The AST is simply:

        +
       / \
      *   5
     / \
    9   3

3.3.1 Generating the AST

To view the tree structure of the input, the flag in main.c must be changed by changing the line tree_flag(0); to tree_flag(1);

codefile

    1*2+3^4/(5+6);

    + * 1 2 / ^ 3 4 + 5 6

The output is in prefix notation. If the flag is turned off with tree_flag(0); the code will simply evaluate. In this case the result is in fully parenthesized infix notation:

    (1 * 2) + ((3 ^ 4) / (5 + 6))
    2 + ((3 ^ 4) / (5 + 6))
    2 + (81 / (5 + 6))
    2 + (81 / 11)
    2 + 7.36364
    9.36364

4. Error Handling

Below are a few examples of possible errors.

Comments must always begin and close with the pound sign.

    # Good #
    # Bad

    Error end of file in comment at line 2, col

Because testing for equality and assignment are interchangeable, only the token = should be used.

    x == 4;

    Comparison '==' should be '=' at line 1, col 2

Operators should not occur next to each other. Parentheses avoid this problem.

    - 2 / + 3;

    Atom expected '+' found at line 1, col 6

    -2 / (+ 3);

    (-2) / (+3)
    -2 / (+3)
    -2 / 3
    -0.666667

The format of functions must include parentheses, and the first letter must be capitalized.

    Cos;

    '' cannot be in function name at line 1, col

    cos();

    ';' expected '(' found at line 1, col 3

    Cos ;

    '' cannot be in function name at line 1, col

    Cos();

    Cos
    1
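The line and column numbers in these messages imply that the scanner tracks its position as it reads. A minimal sketch of how that might work for the unterminated-comment error is shown below. The `Position` struct and `skip_comment` function are hypothetical, and the choice of 1-based lines with 0-based columns is an assumption, since the text does not say how the program counts.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical position tracker: 1-based line, 0-based column (assumed). */
typedef struct { int line; int col; } Position;

/* Skip a comment that starts at text[0] == '#'.
 * Returns the number of characters consumed, or -1 if the input ends
 * before the closing '#'. pos is advanced past everything that was read. */
int skip_comment(const char *text, Position *pos) {
    assert(text != NULL && pos != NULL && text[0] == '#');
    int i = 1;
    pos->col++;                          /* the opening '#' */
    while (text[i] != '\0' && text[i] != '#') {
        if (text[i] == '\n') {
            pos->line++;
            pos->col = 0;
        } else {
            pos->col++;
        }
        i++;
    }
    if (text[i] == '\0') {
        fprintf(stderr, "Error end of file in comment at line %d, col %d\n",
                pos->line, pos->col);
        return -1;
    }
    pos->col++;                          /* the closing '#' */
    return i + 1;
}
```

Starting from line 1, column 0, the text "# Good #" is consumed in full (8 characters), while "# Bad" followed by a newline and end of input triggers the error message. The same position struct could be passed to the parser so that messages such as "Atom expected" report where the offending token began.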