An Oz Subset. 1 Microsyntax for An Oz Subset. Notational Conventions. COP 4020 Programming Languages 1 January 17, 2012

Similar documents
Programming Language Concepts, cs2104 Lecture 04 ( )

On Academic Dishonesty. Declarative Computation Model. Single assignment store. Single assignment store (2) Single assignment store (3)

CPS 506 Comparative Programming Languages. Syntax Specification

Part 5 Program Analysis Principles and Techniques

Lecture 4: The Declarative Sequential Kernel Language. September 5th 2011

Special Directions for this Test

The PCAT Programming Language Reference Manual

MATVEC: MATRIX-VECTOR COMPUTATION LANGUAGE REFERENCE MANUAL. John C. Murphy jcm2105 Programming Languages and Translators Professor Stephen Edwards

A simple syntax-directed

Lecture 21: Relational Programming II. November 15th, 2011

The SPL Programming Language Reference Manual

IPCoreL. Phillip Duane Douglas, Jr. 11/3/2010

DaMPL. Language Reference Manual. Henrique Grando

CS 4240: Compilers and Interpreters Project Phase 1: Scanner and Parser Due Date: October 4 th 2015 (11:59 pm) (via T-square)

PL Revision overview

Sprite an animation manipulation language Language Reference Manual

Introduction to Lexical Analysis

CSCI312 Principles of Programming Languages!

VENTURE. Section 1. Lexical Elements. 1.1 Identifiers. 1.2 Keywords. 1.3 Literals

ML 4 A Lexer for OCaml s Type System

1 Lexical Considerations

BoredGames Language Reference Manual A Language for Board Games. Brandon Kessler (bpk2107) and Kristen Wise (kew2132)

Compiler Construction D7011E

10/4/18. Lexical and Syntactic Analysis. Lexical and Syntax Analysis. Tokenizing Source. Scanner. Reasons to Separate Lexical and Syntactic Analysis

Typescript on LLVM Language Reference Manual

GBIL: Generic Binary Instrumentation Language. Language Reference Manual. By: Andrew Calvano. COMS W4115 Fall 2015 CVN

Project 2 Interpreter for Snail. 2 The Snail Programming Language

Language Reference Manual simplicity

Project 1: Scheme Pretty-Printer

CS164: Programming Assignment 2 Dlex Lexer Generator and Decaf Lexer

CSC 467 Lecture 3: Regular Expressions

Declarative Computation Model

CS /534 Compiler Construction University of Massachusetts Lowell. NOTHING: A Language for Practice Implementation

Interpreter. Scanner. Parser. Tree Walker. read. request token. send token. send AST I/O. Console

CHAD Language Reference Manual

Lexical Analysis. Lexical analysis is the first phase of compilation: The file is converted from ASCII to tokens. It must be fast!

\n is used in a string to indicate the newline character. An expression produces data. The simplest expression

COMS W4115 Programming Languages & Translators GIRAPHE. Language Reference Manual

Syntax. A. Bellaachia Page: 1

Syntactic Analysis. CS345H: Programming Languages. Lecture 3: Lexical Analysis. Outline. Lexical Analysis. What is a Token? Tokens

Lexical Considerations

Lecture 5: Declarative Programming. The Declarative Kernel Language Machine. September 12th, 2011

Second release of the COMPASS Tool Tool Grammar Reference

FRAC: Language Reference Manual

for (i=1; i<=100000; i++) { x = sqrt (y); // square root function cout << x+i << endl; }

Programming Languages & Translators. XML Document Manipulation Language (XDML) Language Reference Manual

VHDL Lexical Elements

Lexical Considerations

Introduction to Lexical Analysis

Decaf Language Reference Manual

SMURF Language Reference Manual Serial MUsic Represented as Functions

CLIP - A Crytographic Language with Irritating Parentheses

Compiler Construction

for (i=1; i<=100000; i++) { x = sqrt (y); // square root function cout << x+i << endl; }

Group A Assignment 3(2)

CSE 130 Programming Language Principles & Paradigms Lecture # 5. Chapter 4 Lexical and Syntax Analysis

6.184 Lecture 4. Interpretation. Tweaked by Ben Vandiver Compiled by Mike Phillips Original material by Eric Grimson

MELODY. Language Reference Manual. Music Programming Language

Parser Tools: lex and yacc-style Parsing

Chapter 1 Summary. Chapter 2 Summary. end of a string, in which case the string can span multiple lines.

CS321 Languages and Compiler Design I. Winter 2012 Lecture 4

YOLOP Language Reference Manual

Defining Program Syntax. Chapter Two Modern Programming Languages, 2nd ed. 1

Contents. Jairo Pava COMS W4115 June 28, 2013 LEARN: Language Reference Manual

COP4020 Programming Assignment 1 - Spring 2011

Ordinary Differential Equation Solver Language (ODESL) Reference Manual

A First Look at ML. Chapter Five Modern Programming Languages, 2nd ed. 1

MR Language Reference Manual. Siyang Dai (sd2694) Jinxiong Tan (jt2649) Zhi Zhang (zz2219) Zeyang Yu (zy2156) Shuai Yuan (sy2420)

A Simple Syntax-Directed Translator

CSCE 531 Spring 2009 Final Exam

6.037 Lecture 4. Interpretation. What is an interpreter? Why do we need an interpreter? Stages of an interpreter. Role of each part of the interpreter

10/5/17. Lexical and Syntactic Analysis. Lexical and Syntax Analysis. Tokenizing Source. Scanner. Reasons to Separate Lexical and Syntax Analysis

DINO. Language Reference Manual. Author: Manu Jain

Lexical Analysis. Lecture 3. January 10, 2018

PYTHON- AN INNOVATION

Compiler Techniques MN1 The nano-c Language

Theoretical Part. Chapter one:- - What are the Phases of compiler? Answer:

Lexical and Syntax Analysis

JME Language Reference Manual

Introduction to Compiler Construction

Concepts Introduced in Chapter 3. Lexical Analysis. Lexical Analysis Terms. Attributes for Tokens

Programming Languages & Compilers. Programming Languages and Compilers (CS 421) Programming Languages & Compilers. Major Phases of a Compiler

VLC : Language Reference Manual

Parser Tools: lex and yacc-style Parsing

MP 3 A Lexer for MiniJava

The Warhol Language Reference Manual

d-file Language Reference Manual

A lexical analyzer generator for Standard ML. Version 1.6.0, October 1994

Programming Languages and Compilers (CS 421)

Programming Languages & Compilers. Programming Languages and Compilers (CS 421) I. Major Phases of a Compiler. Programming Languages & Compilers

6.096 Introduction to C++ January (IAP) 2009

flex is not a bad tool to use for doing modest text transformations and for programs that collect statistics on input.

A Pascal program. Input from the file is read to a buffer program buffer. program xyz(input, output) --- begin A := B + C * 2 end.

CS 11 Ocaml track: lecture 6

B E C Y. Reference Manual

SML A F unctional Functional Language Language Lecture 19

L-System Fractal Generator: Language Reference Manual

MIT Specifying Languages with Regular Expressions and Context-Free Grammars. Martin Rinard Massachusetts Institute of Technology

MIT Specifying Languages with Regular Expressions and Context-Free Grammars

Carlos Varela RPI September 19, Adapted with permission from: Seif Haridi KTH Peter Van Roy UCL

Transcription:

COP 4020 Programming Languages 1 January 17, 2012 An Oz Subset This document describes a subset of Oz [VH04] that is designed to be slightly larger than the declarative kernel language of chapter 2 (of [VH04]). The subset includes several syntactic sugars, but not so many that manipulation becomes unwieldy. The purpose of this subset, and the programs in the following directory: http://www.eecs.ucf.edu/~leavens/cop4020/reducer is to show how to define various manipulations of Oz programs. In particular, we provide definitions of Free and Bound Variable identifiers and desugaring into the declarative kernel language. Also provided is a definitional interpreter. All of these programs are written in Oz (although using a larger portion of the language). In what follows, I provide the lexical (Section 1) and context-free (Section 2 on page 3) grammars for what the tools use in lexical analysis and parsing. I also define the abstract syntax trees (Section 3 on page 3) that are used in other tools (Section 4 on page 7). Notational Conventions Throughout this document we use the following conventions for describing grammars. The square brackets [ and ] surround optional parts of a production. Thus A ::= example [ B ] C means the same thing as A ::= example C example B C Furthermore, the notation... following an optional construct means zero or more repetitions of the enclosed sentential form. Thus A ::= example [ B ]... C means the same thing as A ::= example Bs C Bs ::= B Bs Empty Empty ::= Single quotes ( and ) are used to indicate when notations such as [ and are to be taken literally as terminal symbols, as opposed to their default interpretation as meta-level grammatical symbols. 1 Microsyntax for An Oz Subset The lexical analyzer in OzLexer.oz, which uses the code in LexerTools.oz, does lexical analysis for the Oz subset. The lexer recognizes all the keywords (really reserved words) of the entire Oz language, as well as all the special keyword tokens. However, it currently does not handle the Oz \insert directive. The microsyntax recognized by the lexer is given in Figure 1 on the following page. This grammar which is to be understood lexically, that is, unlike a normal grammar, no spaces or comments may appear between the characters of a token. Thus, in particular, in a label, there can be no space or comment before the left parenthesis. The microsyntax and the lexer do not not consider unit to be a keyword, unlike the Oz documentation, but merely an Atom. 1

microsyntax ::= [ lexeme ]... lexeme ::= space comment token token ::= varid keyword literal label space ::= non-nl-space end-of-line non-nl-space ::= a blank, tab, or formfeed character end-of-line ::= newline carriage-return carriage-return newline newline ::= a newline character carriage-return ::= a carriage-return character comment ::=? % [ non-eol ]... end-of-line non-eol ::= any character except a newline or carriage return varid ::= CapitalLetter [ AlphaNumeric ]... underbar [ AlphaNumeric ]... CapitalLetter ::= An ISO capital letter, including the letters A through Z AlphaNumeric ::= CapitalLetter lowercaseletter digit underbar lowercaseletter ::= An ISO lower case letter, including the letters a through z underbar ::= _ keyword ::= andthen at attr case catch choice class declare define div do else elseif end export fail finally for from fun functor if in lazy local lock meth mod of orelse proc prop raise self skip then thread try ( ) [ ] { } # ::=... =. := ˆ [] $! ~ + - * / @ <-,!! <= == \= < =< > >= =: : <: =<: >: >=: :: ::: literal ::= Atom Bool Int Float String Atom ::= lowercaseletter [ AlphaNumeric ]... non-quote-eol [ non-quote-eol ]... non-quote-eol ::= any character that is a non-eol character except a Bool ::= true false Int ::= [ ~ ] digit [ digit ]... Float ::= Int. numeric exponent numeric ::= [ digit ]... exponent ::= exponentchar Int exponentchar ::= E e String ::= " [ non-dq-eol ]... " non-dq-eol ::= any character that is a non-eol character except a " label ::= Atom ( Bool ( varid ( Figure 1: The lexical grammar recognized by the tools in this directory, which is a subset of the Oz lexical grammar. Note that an Atom that has the same sequence of characters as a keyword or a Bool will be recognized as that keyword or a Bool, unless it is quoted. 2

2 Context Free Grammar for the Oz Subset The parser in MyOzParser.oz recognizes the grammar in Figure 2 on the next page, which builds on the lexical grammar as usual. In the figure, the names in the right column are not part of the grammar, but instead show what type of abstract syntax tree(s) record(s) is (are) made from the production(s) on the left. Compare these with the abstract syntax definitions in Section 3. 3 Abstract Syntax Trees for the Oz Subset The grammar in Figure 3 on page 5, describes the abstract syntax trees produced by the parser and used by other tools. These data structures are all records with distinguishing labels. Thus the grammar in the figure represents the abstract syntax of our Oz subset. For example the following Oz statement: A = B is represented by (parsed into) the following abstract syntax tree: unifystmt(varid('a') varid('b')) Note that the grammar uses Oz atoms to represent variable identifiers. A more complex example of an abstract syntax tree is given in Figure 4 on page 6. 3

Program ::= [ Sequence ] Queries1 program Queries1 ::= [ Query ]... Query ::= declare Sequence in Sequence declareinquery declare Sequence declarequery Sequence ::= Statement [ Statement ]... seqstmt Statement ::= skip skipstmt local varid in InStatement end localstmt varid = varid unifystmt varid = varid unifystmt if Expression then InStatement else InStatement end ifstmt case Expression of Pattern then InStatement else InStatement end casestmt { Expression [ Expression ]... } applystmt fun { Formals } Expression end namedfunstmt thread Statement end threadstmt InStatement ::= Sequence Pattern = Expression in Sequence instmt Expression ::= RelationalExp RelationalExp ::= VBarExp [ RelationalOperator VBarExp ] applyexp RelationalOperator ::= == \= < =< > >= VBarExp ::= PoundExp [ VBarExp ] recordexp PoundExp ::= AdditiveExp [ # AdditiveExp ]... recordexp AdditiveExp ::= MultiplicativeExp [ AdditiveOp MultiplicativeExp ]... applyexp AdditiveOp ::= + - MultiplicativeExp ::= NegateExp [ MultiplicativeOp NegateExp ]... applyexp MultiplicativeOp ::= * / div mod NegateExp ::= [ ~ ]... DotExp applyexp DotExp ::= TightOpExp [. TightOpExp ]... applyexp TightOpExp ::= [!! ]... PrimaryExp applyexp PrimaryExp ::= ( Expression ) varid atomexp varid, atomexp boolexp boolexp intlit floatlit intlit, floatlit String recordexp label [ Field ]... ) recordexp proc { $ Formals } Sequence end procexp if Expression then Expression else Expression end ifexp case Expression of Pattern then Expression else Expression end caseexp { Expression [ Expression ]... } applyexp thread Expression end threadexp Field ::= Feature : Expression colonfld Expression posfld Feature ::= atomexp intlit boolexp Pattern ::= VBarPat VBarPat ::= PoundPat [ VBarPat ] recordpat PoundPat ::= PrimitivePat [ # PrimitivePat ]... recordpat PrimitivePat ::= varid atomexp varidpat, atompat boolexp boolpat ( Pattern ) label [ PatField ]... ) recordpat PatField ::= Feature : Pattern colonfld Pattern posfld Formals ::= [ Pattern ]... 4 Figure 2: Concrete syntax recognized by the parser, which is a subset of Oz.

Program ::= program( List Query Position ) Query ::= seqquery( Statement Position ) declareinquery( Statement Statement Position ) declarequery( Statement Position ) Statement ::= skipstmt( Position ) seqstmt( List Statement Position ) localstmt( VarIdRef Statement Position ) unifystmt( VarIdRef Expression Position ) ifstmt( Expression Statement Statement Position ) casestmt( Expression Pattern Statement Statement Position ) applystmt( Expression List Expression Position ) namedfunstmt( Atom List Pattern Expression Position ) instmt( Pattern Expression Statement Position ) threadstmt( Statement Position ) Expression ::= VarIdRef atomexp( Atom Position ) boolexp( Bool Position ) intlit( Int Position ) floatlit( Float Position ) recordexp(atomexp( Atom List Field Position ) procexp( List Pattern Statement Position ) ifexp( Expression Expression Expression Position ) caseexp( Expression Pattern Expression Expression Position ) applyexp( Expression List Expression Position ) threadexp( Expression Position ) VarIdRef ::= varid( Atom Position ) Pattern ::= varidpat( Atom Position ) atompat( Atom Position ) boolpat( Bool Position ) recordpat(atomexp( Atom ) List PatField Position ) Field ::= colonfld( Feature Expression Position ) posfld( Expression Position ) PatField ::= colonfld( Feature Pattern Position ) posfld( Pattern Position ) Feature ::= atomexp( Atom Position ) intlit( Int Position ) boolexp( Bool Position ) Position ::= pos:pos( FileName Line Col ) pos:pos( FileName Line Col FileName Line Col ) Empty Empty ::= FileName ::= Atom Line ::= Int Col ::= Int Figure 3: Abstract syntax for an Oz Subset, based on the textbook s section 2.6 [VH04]. The optional Position field is used for error messages, but is often omitted in examples made by hand. The type String is the type of character strings in Oz, Atom is the type of atoms in Oz, Int is the type of Integers in Oz, etc. 5

The following Oz statement: fun {AddToEach A#B Ls} case Ls of (X#Y) T then ({Plus A X}#{Plus B Y}) {AddToEach A#B T} else nil end end is represented by the following record structure (with all positions omitted): namedfunstmt('addtoeach' [recordpat(atomexp('#') [posfld(varidpat('a')) posfld(varidpat('b'))]) varidpat('ls')] caseexp(varid('ls') recordpat(atomexp(' ') [posfld(recordexp(atomexp('#') [posfld(varidpat('x')) posfld(varidpat('y'))])) posfld(varidpat('t'))]) recordexp(atomexp(' ') [posfld(recordexp(atomexp('#') [posfld(applyexp(varid('plus') [varid('a') varid('x')])) posfld(applyexp(varid('plus') [varid('b') varid('y')]))])) posfld(applyexp(varid('addtoeach') [recordexp(atomexp('#') [posfld(varid('a')) posfld(varid('b'))]) varid('t')]))]) atomexp(nil))) Figure 4: Example showing how Oz is parsed into the record structures (abstract syntax trees) from Figure 3 on the preceding page. 6

4 The Main Functions The reducer provides the following main functions to users. 1. The function FreeVarIdsStmt: <fun {$ <Statement>}: <Set <String>> takes a Statement in the grammar of Figure 3 on page 5 and returns a set of all the variable identifiers that occur free in its argument. Its implementation is in FreeVarIds.oz. There are tests in the file FreeVarIdsTest.oz. See the file TestOutput.txt for results of testing. 2. The function BoundVarIdsStmt: <fun {$ <Statement>}: <Set <String>> takes a Statement in the grammar of Figure 3 on page 5 and returns a set of all the variable identifiers that occur bound in its argument. Its implementation is in BoundVarIds.oz. There are tests in the file BoundVarIdsTest.oz. See the file TestOutput.txt for results of testing. 3. The function DesugarStmt: <fun {$ <Statement>}: <Statement> takes a Statement in the grammar of Figure 3 on page 5 and returns a Statement in the subset of that grammar that represents Oz statements in the kernel language of the textbook [VH04, Section 2.3]. An implementation is in Desugar.oz. There are tests in the file DesugarTest.oz. See the file TestOutput.txt for results of testing. 4. The function Reduce1: <fun {$ <Config>}: <Pair String Config>> takes a non-terminal configuration and returns a new rule name and a configuration for the next step in the operational semantics of Oz, starting at the given configuration. The implementation is in Reducer.oz. There are tests in the file ReducerTest.oz. See Configuration.oz for details on configurations. References [VH04] Peter Van Roy and Seif Haridi. Concepts, Techniques, and Models of Computer Programming. The MIT Press, Cambridge, Mass., 2004. 7