Assignment 1. Due 08/28/17

Similar documents
Last time. What are compilers? Phases of a compiler. Scanner. Parser. Semantic Routines. Optimizer. Code Generation. Sunday, August 29, 2010

CS 4240: Compilers and Interpreters Project Phase 1: Scanner and Parser Due Date: October 4 th 2015 (11:59 pm) (via T-square)

CSE 401 Midterm Exam 11/5/10

The Structure of a Syntax-Directed Compiler

CS 536 Introduction to Programming Languages and Compilers Charles N. Fischer Lecture 2

The Structure of a Syntax-Directed Compiler

The Structure of a Syntax-Directed Compiler

The Structure of a Compiler

flex is not a bad tool to use for doing modest text transformations and for programs that collect statistics on input.

Reading Assignment. Scanner. Read Chapter 3 of Crafting a Compiler.

CS 536 Introduction to Programming Languages and Compilers Charles N. Fischer Lecture 3

10/5/17. Lexical and Syntactic Analysis. Lexical and Syntax Analysis. Tokenizing Source. Scanner. Reasons to Separate Lexical and Syntax Analysis

Last Time. What do we want? When do we want it? An AST. Now!

Alternation. Kleene Closure. Definition of Regular Expressions

A Simple Syntax-Directed Translator

10/4/18. Lexical and Syntactic Analysis. Lexical and Syntax Analysis. Tokenizing Source. Scanner. Reasons to Separate Lexical and Syntactic Analysis

Interpreter. Scanner. Parser. Tree Walker. read. request token. send token. send AST I/O. Console

Lexical and Syntax Analysis

When do We Run a Compiler?

CPS 506 Comparative Programming Languages. Syntax Specification

A lexical analyzer generator for Standard ML. Version 1.6.0, October 1994

Regular Expressions. Agenda for Today. Grammar for a Tiny Language. Programming Language Specifications

PLT 4115 LRM: JaTesté

CSE 401 Midterm Exam Sample Solution 2/11/15

ASTs, Objective CAML, and Ocamlyacc

Chapter 3: Describing Syntax and Semantics. Introduction Formal methods of describing syntax (BNF)

Figure 2.1: Role of Lexical Analyzer

The Front End. The purpose of the front end is to deal with the input language. Perform a membership test: code source language?

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

Crafting a Compiler with C (V) Scanner generator

Compilers. History of Compilers. A compiler allows programmers to ignore the machine-dependent details of programming.

Part 5 Program Analysis Principles and Techniques

Lexical Analysis. Lexical analysis is the first phase of compilation: The file is converted from ASCII to tokens. It must be fast!

CSE 401 Midterm Exam Sample Solution 11/4/11

COMPILER DESIGN LECTURE NOTES

Compiler Design. Subject Code: 6CS63/06IS662. Part A UNIT 1. Chapter Introduction. 1.1 Language Processors

Lexical Analysis. Introduction

CSE 413 Programming Languages & Implementation. Hal Perkins Winter 2019 Grammars, Scanners & Regular Expressions

Compilers & Translation Systems Engineering

Typescript on LLVM Language Reference Manual

2.2 Syntax Definition

Left to right design 1

COMPILERS AND INTERPRETERS Lesson 4 TDDD16

Defining syntax using CFGs

GAWK Language Reference Manual

n Parsing function takes a lexing function n Returns semantic attribute of 10/27/16 2 n Contains arbitrary Ocaml code n May be omitted 10/27/16 4

Programming Languages & Compilers. Programming Languages and Compilers (CS 421) Programming Languages & Compilers. Major Phases of a Compiler

Programming Languages and Compilers (CS 421)

CSE 413 Programming Languages & Implementation. Hal Perkins Autumn 2012 Grammars, Scanners & Regular Expressions

1. Lexical Analysis Phase

for (i=1; i<=100000; i++) { x = sqrt (y); // square root function cout << x+i << endl; }

Structure of a compiler. More detailed overview of compiler front end. Today we ll take a quick look at typical parts of a compiler.

Programming Languages & Compilers. Programming Languages and Compilers (CS 421) I. Major Phases of a Compiler. Programming Languages & Compilers

Lexical Analysis. COMP 524, Spring 2014 Bryan Ward

CSEP 501 Compilers. Languages, Automata, Regular Expressions & Scanners Hal Perkins Winter /8/ Hal Perkins & UW CSE B-1

Sticker. CS 241 Fall 2009 Final Exam. Draft of 6 December 01:43 hrs

IPCoreL. Phillip Duane Douglas, Jr. 11/3/2010

Chapter 3 Lexical Analysis

CS Lecture 2. The Front End. Lecture 2 Lexical Analysis

CD Assignment I. 1. Explain the various phases of the compiler with a simple example.

COLLEGE OF ENGINEERING, NASHIK. LANGUAGE TRANSLATOR

Compiler course. Chapter 3 Lexical Analysis

Chapter 3. Describing Syntax and Semantics ISBN

VHDL Lexical Elements

SMPL - A Simplified Modeling Language for Mathematical Programming

VENTURE. Section 1. Lexical Elements. 1.1 Identifiers. 1.2 Keywords. 1.3 Literals

LECTURE 17. Expressions and Assignment

Big Picture: Compilation Process. CSCI: 4500/6500 Programming Languages. Big Picture: Compilation Process. Big Picture: Compilation Process.

COMPILER CONSTRUCTION LAB 2 THE SYMBOL TABLE. Tutorial 2 LABS. PHASES OF A COMPILER Source Program. Lab 2 Symbol table

Using an LALR(1) Parser Generator

TML Language Reference Manual

CS 11 Ocaml track: lecture 6

Crayon (.cry) Language Reference Manual. Naman Agrawal (na2603) Vaidehi Dalmia (vd2302) Ganesh Ravichandran (gr2483) David Smart (ds3361)

1 Lexical Considerations

Software II: Principles of Programming Languages

A Pascal program. Input from the file is read to a buffer program buffer. program xyz(input, output) --- begin A := B + C * 2 end.

CA4003 Compiler Construction Assignment Language Definition

Defining Program Syntax. Chapter Two Modern Programming Languages, 2nd ed. 1

Topic 3: Syntax Analysis I

Lecture 2 Tao Wang 1

The PCAT Programming Language Reference Manual

A simple syntax-directed

TDDD55 - Compilers and Interpreters Lesson 3

Jim Lambers ENERGY 211 / CME 211 Autumn Quarter Programming Project 4

Formal Languages. Formal Languages

CUP. Lecture 18 CUP User s Manual (online) Robb T. Koether. Hampden-Sydney College. Fri, Feb 27, 2015

Lex & Yacc. By H. Altay Güvenir. A compiler or an interpreter performs its task in 3 stages:

Context-Free Grammar. Concepts Introduced in Chapter 2. Parse Trees. Example Grammar and Derivation

afewadminnotes CSC324 Formal Language Theory Dealing with Ambiguity: Precedence Example Office Hours: (in BA 4237) Monday 3 4pm Wednesdays 1 2pm

Operator Precedence a+b*c b*c E + T T * P ( E )

Full file at

Lex & Yacc. by H. Altay Güvenir. A compiler or an interpreter performs its task in 3 stages:

Parsing and Pattern Recognition

CS1622. Semantic Analysis. The Compiler So Far. Lecture 15 Semantic Analysis. How to build symbol tables How to use them to find

Compiler Design. Computer Science & Information Technology (CS) Rank under AIR 100

Parsing II Top-down parsing. Comp 412

TaML. Language Reference Manual. Adam Dossa (aid2112) Qiuzi Shangguan (qs2130) Maria Taku (mat2185) Le Chang (lc2879) Columbia University

MP 3 A Lexer for MiniJava

The pixelman Language Reference Manual. Anthony Chan, Teresa Choe, Gabriel Kramer-Garcia, Brian Tsau

4. Semantic Processing and Attributed Grammars

Transcription:

Assignment 1. Due 08/28/17 1. Implement Scanner for Micro language (Micro Scanner) employing routines considered in class. When you call Scanner in a the output must be a list of tokens. Scan the following program: BEGIN SOMETHING UNUSUAL READ(A1, New_A, D, B); C:= A1 +(New_A D) 75; New_C:=((B (7)+(C+D))) (3 A1); STUPID FORMULA WRITE (C, A1+New_C); WHAT ABOUT := B+D; END Also, scan another program of your choice written on Micro language. 2. Write an Extended Micro Grammar to include an equality operator, =, and an exponentiation operator, ** (B**2 means B 2 ). Equal should have a lower operator precedence than plus and minus, while exponentiation should have a higher precedence. That is, A+B**2 = C+D should be equivalent to (A+(B**2)) = C+D. Further, exponentiation should group from the right (so that A**B**C is equivalent to A**(B**C)), and = should not group at all (A = B = C is illegal). Be sure that your grammar is unambiguous. Remark. For credit you do not have to implement your Scanner for Extended Micro language. Do this just for Micro Language. If you do implement Extended Micro language this will be considered as extra credit.

Micro language 2 The only data type is integer. All identifiers are implicitly declared and are no longer than 32 characters. Identifiers must begin with a letter and are composed of letters, digits, and underscores. Literals are strings of digits. Comments begin with and end at the end of the current line. Statement types are as follows: Assignment: Id := Expression; Expression is an infix expression constructed from identifiers, literals, and the operators + and ; parentheses are also allowed. Input / Output: read ( List of Id s); write (List of Expressions); begin, end, read, and write are reserved words. Each statement is terminated by a semicolon (;). The body of a program is delimited by begin and end. A blank is appended to the right end of each source line; thus tokens may not extend across line boundaries. Example BEGIN SOMETHING UNUSUAL READ(A1, New_A, D, B); C:= A1 + (New_A D) 75; New_C:=((B (7)+(C+D))) (3 A1); STUPID FORMULA WRITE (C, A1+New_C); WHAT ABOUT := B+D; END

3 The Structure of Micro Compiler The scanner reads a source program from a text file and produces a stream of token representations. So that no actual stream need not exist at any time, the scanner is actually a function that produces token representations one at a time when called by the parser. The parser processes tokens until it recognizes a syntactic structure that requires semantic processing. It takes a direct call to a semantic routine. Some of these semantic routines use token representation information in their processing. The semantic routines produce output in assembly language for a simple virtual machine. Thus the compiler structure includes no optimizer, and code generation is done by direct calls to appropriate routines from the set of semantic routines. The symbol table is used only by the semantic routines. Parser Tokens Syntactic Structure Source Program (character stream) Scanner Semantic Routines Target Machine Code

while <expr> <statement> <statement>... <statement> end ; Some Pseudocode Constructs 4 case <expr> is when <expr> <expr>... => <list of statements>;... when <expr> <expr>... => <list of statements>; <list of statements>; if <expr> <statement>; <statement>; end if; function <name> (<formal param list>) return <type name> is Type, variable, constant, and subprogram declarations begin Statement list end;

A Micro Scanner The scanner will be a function of no arguments that returns Token values. 5 type Token is (BeginSym, EndSym, ReadSym, WriteSym, Id, IntLiteral, LParen, RParen, SemiColon, Comma, AssignOp, PlusOp, MinusOp, EofSym); TokenBuffer: String function Scanner return Token; The following routines will be used: Read(C) The next input character is read; error if Eof is true. func. Inspect The next input character is returned, but input is not advanced; error if Eof is true. Advance func. Eof BufferChar ClearBuffer The next input character is removed, but not returned; no effect at end of file. True at the end of file. Adds its argument to a character buffer called TokenBuffer. This buffer is visible to any part of the compiler, and always contains the text of the most recently scanned token. The content of TokenBuffer will be used particularly by semantic routines. Of course, the characters of this buffer also are used by CheckReserved to determine whether a token that looks like an identifier is actually a reserved word. Reset the buffer TokenBuffer to the empty string func. CheckReserved Takes the identifiers as they are recognized and returns the proper token class (either Id or some reserved word).

function Scanner return Token is begin ClearBuffer; if Eof while not Eof case CurrentChar is when... =>.................. end ; end if end Scanner function Scanner return Token is begin ClearBuffer; if Eof while not Eof case CurrentChar is... Code to recognize, Tab, Eol... Code to recognize identifiers and reserved words... Code to recognize integer literals... Code to recognize delimiters... Code to recognize operators... Code to recognize comments LexicalError(CurrentChar); end ; end if end Scanner 6

Scanner Loop to Recognize Identifiers, Reserved Words, and Literals 7 while not Eof case CurrentChar is when Tab Eol => null when A.. Z a.. z => case Inspect is when A.. Z a.. z 0.. 9 _ => Advance; return CheckReserved end ; when 0.. 9 => case Inspect is when 0.. 9 => Advance; return IntLiteral; end case end LexicalError(CurrentChar); end ;

Scanner Loop to Recognize Operators, Comments and Delimiters 8 type Token is (BeginSym, EndSym, ReadSym, WriteSym, Id, IntLiteral, LParen, RParen, SemiColon, Comma, AssignOp, PlusOp, MinusOp, EofSym); function Scanner return Token; while not Eof case CurrentChar is... Code to recognize, Tab, Eol... Code to recognize identifiers... Code to recognize integer literals when ( => return LParen; when ) => return RParen; when ; => return SemiColon; when, => return Comma; when + => return PlusOp; when : => if Inspect = = Advance; return AssignOp; LexicalError(Inspect); end if; when => if Inspect = ; while CurrentChar Eol end ; return MinusOp; end if LexicalError(CurrentChar); end ;

Complete Scanner Function for Micro with recognition of reserved words and Eof 9 function CheckReserved Takes the identifiers as they are recognized and returns the proper token class (either Id or some reserved word). BufferChar(CurrentChar) Adds its argument to a character buffer called TokenBuffer ClearBuffer Resets the buffer to the empty string type Token is (BeginSym, EndSym, ReadSym, WriteSym, Id, IntLiteral, LParen, RParen, SemiColon, Comma, AssignOp, PlusOp, MinusOp, EofSym); function Scanner return Token is begin ClearBuffer; if Eof while not Eof case CurrentChar is when Tab Eol => null when A.. Z a.. z => BufferChar(CurrentChar); case Inspect is when A.. Z a.. z 0.. 9 _ => BufferChar(CurrentChar); Advance; return CheckReserved; end case end when 0.. 9 => BufferChar(CurrentChar); case Inspect is when 0.. 9 => BufferChar(CurrentChar);

Advance; return IntLiteral; end case end when ( => return LParen; when ) => return RParen; when ; => return SemiColon; when, => return Comma; when + => return PlusOp; when : => if Inspect = = Advance; return AssignOp; LexicalError(Inspect); end if; when => if Inspect = ; while CurrentChar =/ Eol end ; return MinusOp; end if LexicalError(CurrentChar); end ; end if end Scanner 10