The Lexical Structure of Verdi TR Mark Saaltink. Release date: July 1994
ORA Canada
267 Richmond Road, Suite 100
Ottawa, Ontario K1Z 6X3
CANADA
Verdi Compiler Project TR

This report formally defines the lexical structure of Verdi [1, 3], using the Z notation [4]. This formulation of lexical structure is similar to the definition in the formal definition of Turing [2].

The grammar of Verdi, like that of most programming languages, is most conveniently described using two distinct phases: lexical analysis and parsing. Verdi programs are composed of characters. Lexical analysis transforms this sequence of characters into a sequence of tokens. This sequence of tokens is then parsed according to a context-free grammar.

1 Changes to Verdi

The original description of Verdi had a lexical structure that was ambiguous; the string "\141" could be interpreted as "a" or as "141", depending on whether the escape was interpreted as a numeric representation or as a single character escape. This ambiguity has been removed by restricting the characters that may appear in a single character escape.

2 Characters

Verdi programs are written in the ASCII character set. The set $\mathit{CHAR}$ comprises some representation of these characters.

\[ [\mathit{CHAR}] \]

Function $\mathit{char\_code}$ is a bijection between $\mathit{CHAR}$ and small numbers; for a character $c$, $\mathit{char\_code}(c)$ is the ASCII code of $c$.

\[
\begin{aligned}
&\mathit{char\_code} : \mathit{CHAR} \to 0 \,..\, 127 \\
&\mathit{char\_val} : 0 \,..\, 127 \to \mathit{CHAR} \\
&\mathit{char\_val} = \mathit{char\_code}^{-1}
\end{aligned}
\]

We will write characters inside single quotes; for example, $\texttt{'a'}$ denotes the lower case letter "a", which is also the value of $\mathit{char\_val}(97)$. We use the usual names for the common formatting characters:

\[
\begin{aligned}
&\mathit{cr}, \mathit{lf}, \mathit{ff}, \mathit{sp}, \mathit{tab} : \mathit{CHAR} \\
&\mathit{cr} = \mathit{char\_val}(13) \\
&\mathit{lf} = \mathit{char\_val}(10) \\
&\mathit{ff} = \mathit{char\_val}(12) \\
&\mathit{sp} = \mathit{char\_val}(32) \\
&\mathit{tab} = \mathit{char\_val}(9)
\end{aligned}
\]

Several sets of characters are used in the definitions below. The visible characters, with codes between 33 and 126 inclusive, have associated glyphs; the graphic characters also include space. The blank characters are formatting characters that leave "white space". Carriage return and line feed characters are used to end lines.
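\[
\begin{aligned}
&\mathit{Visible}, \mathit{Graphic}, \mathit{Blank}, \mathit{EndLine} : \mathbb{P}\,\mathit{CHAR} \\
&\mathit{Visible} = \mathit{char\_val}(\!|\, 33 \,..\, 126 \,|\!) \\
&\mathit{Graphic} = \mathit{char\_val}(\!|\, 32 \,..\, 126 \,|\!) = \mathit{Visible} \cup \{\mathit{sp}\} \\
&\mathit{Blank} = \{\mathit{cr}, \mathit{lf}, \mathit{ff}, \mathit{sp}, \mathit{tab}\} \\
&\mathit{EndLine} = \{\mathit{cr}, \mathit{lf}\}
\end{aligned}
\]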
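The character sets above translate almost directly into executable form. The following is a minimal sketch, assuming Python strings of length one stand in for elements of CHAR and that the built-ins `chr` and `ord` play the roles of char_val and char_code; the upper-case constant names are choices made for this example, not names from the report.

```python
# Sketch of the character sets of Section 2.
# Assumption: chr/ord stand in for char_val/char_code on ASCII codes 0..127.

def char_val(code):
    """Character with the given ASCII code (the inverse of char_code)."""
    if not 0 <= code <= 127:
        raise ValueError("not an ASCII code")
    return chr(code)


def char_code(c):
    """ASCII code of a single character."""
    return ord(c)


# The usual names for the common formatting characters.
CR, LF, FF, SP, TAB = (char_val(n) for n in (13, 10, 12, 32, 9))

VISIBLE = frozenset(char_val(n) for n in range(33, 127))  # codes 33..126
GRAPHIC = VISIBLE | {SP}                                   # codes 32..126
BLANK = frozenset({CR, LF, FF, SP, TAB})
END_LINE = frozenset({CR, LF})
```

Building the sets from code ranges rather than listing characters keeps the sketch close to the relational-image form used in the definitions.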
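Several kinds of digits are used in the description of numerals:

\[
\begin{aligned}
&\mathit{BinaryDigit}, \mathit{OctalDigit}, \mathit{Digit}, \mathit{HexDigit} : \mathbb{P}\,\mathit{CHAR} \\
&\mathit{BinaryDigit} = \{\texttt{'0'}, \texttt{'1'}\} = \mathit{char\_val}(\!|\, 48 \,..\, 49 \,|\!) \\
&\mathit{OctalDigit} = \{\texttt{'0'}, \texttt{'1'}, \texttt{'2'}, \texttt{'3'}, \texttt{'4'}, \texttt{'5'}, \texttt{'6'}, \texttt{'7'}\} = \mathit{char\_val}(\!|\, 48 \,..\, 55 \,|\!) \\
&\mathit{Digit} = \mathit{OctalDigit} \cup \{\texttt{'8'}, \texttt{'9'}\} = \mathit{char\_val}(\!|\, 48 \,..\, 57 \,|\!) \\
&\mathit{HexDigit} = \mathit{Digit} \cup \{\texttt{'a'}, \texttt{'b'}, \texttt{'c'}, \texttt{'d'}, \texttt{'e'}, \texttt{'f'}, \texttt{'A'}, \texttt{'B'}, \texttt{'C'}, \texttt{'D'}, \texttt{'E'}, \texttt{'F'}\}
\end{aligned}
\]

The escape characters have special uses in character literals and strings:

\[
\begin{aligned}
&\mathit{EscapeChar} : \mathbb{P}\,\mathit{CHAR} \\
&\mathit{EscapeTable} : \mathit{CHAR} \rightharpoonup \mathit{CHAR} \\
&\mathit{EscapeChar} = \mathrm{dom}\,\mathit{EscapeTable} \\
&\mathit{EscapeTable} = \{\, \texttt{'b'} \mapsto \mathit{char\_val}(8),\ \texttt{'d'} \mapsto \mathit{char\_val}(127),\ \texttt{'l'} \mapsto \mathit{lf},\ \texttt{'n'} \mapsto \mathit{lf},\ \texttt{'p'} \mapsto \mathit{ff}, \\
&\qquad \texttt{'r'} \mapsto \mathit{cr},\ \texttt{'s'} \mapsto \mathit{sp},\ \texttt{'t'} \mapsto \mathit{tab},\ \texttt{'"'} \mapsto \texttt{'"'},\ \texttt{'\textbackslash'} \mapsto \texttt{'\textbackslash'} \,\}
\end{aligned}
\]

3 Lexical Units

Two kinds of lexical units are used: tokens and separators.

\[ \mathit{LexicalUnit} \mathrel{\widehat{=}} \mathit{Token} \cup \mathit{Separator} \]

3.1 Tokens

Tokens are certain sequences of characters. There are four classes of tokens in Verdi: numerals, identifiers, character literals, and strings; in addition, the left and right parentheses are special tokens.

\[
\begin{aligned}
&\mathit{Token} : \mathbb{P}(\mathrm{seq}\,\mathit{CHAR}) \\
&\mathit{Token} = \mathit{Numeral} \cup \mathit{Identifier} \cup \mathit{Character\_literal} \cup \mathit{String\_literal} \cup \{\langle\texttt{'('}\rangle, \langle\texttt{')'}\rangle\}
\end{aligned}
\]

3.2 Attributes

We will associate an "attribute" with each token. This attribute is used in the description of the semantics of Verdi, or else is used to define the abstract syntax corresponding to the concrete syntax. The attribute of a numeral is the number it represents. The attribute of an identifier is a name, which has been normalized (by converting upper case letters to lower case). The attribute of a character literal is a character. The attribute of a string is a sequence of characters. Parentheses have no attributes.

\[
\begin{aligned}
\mathit{Attribute} ::={} & \mathit{num}\langle\!\langle \mathbb{Z} \rangle\!\rangle \\
\mid{} & \mathit{ident}\langle\!\langle \mathrm{seq}\,\mathit{CHAR} \rangle\!\rangle \\
\mid{} & \mathit{char}\langle\!\langle \mathit{CHAR} \rangle\!\rangle \\
\mid{} & \mathit{string}\langle\!\langle \mathrm{seq}\,\mathit{CHAR} \rangle\!\rangle \\
\mid{} & \mathit{openparen} \\
\mid{} & \mathit{closeparen}
\end{aligned}
\]

Function $\mathit{attr}$ gives the attribute of a token. This function is composed in the obvious way from functions determining the attribute value for each type of token. These individual functions will be defined below.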
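\[
\begin{aligned}
&\mathit{attr} : \mathit{Token} \to \mathit{Attribute} \\
&\mathit{attr} = (\mathit{num} \circ \mathit{num\_attr}) \cup (\mathit{ident} \circ \mathit{ident\_attr}) \cup (\mathit{char} \circ \mathit{char\_attr}) \cup (\mathit{string} \circ \mathit{string\_attr}) \\
&\qquad \cup\ \{\langle\texttt{'('}\rangle \mapsto \mathit{openparen},\ \langle\texttt{')'}\rangle \mapsto \mathit{closeparen}\}
\end{aligned}
\]

3.3 Numerals

\[
\begin{aligned}
&\mathit{Numeral} : \mathbb{P}(\mathrm{seq}\,\mathit{CHAR}) \\
&\mathit{Numeral} = \mathrm{seq}_1\,\mathit{Digit} \\
&\qquad \cup\ \{\, s : \mathrm{seq}_1\,\mathit{Digit} \bullet \langle\texttt{'-'}\rangle \frown s \,\} \\
&\qquad \cup\ \{\, s : \mathrm{seq}_1\,\mathit{BinaryDigit};\ r : \{\texttt{'b'}, \texttt{'B'}\} \bullet \langle\texttt{'\#'}, r\rangle \frown s \,\} \\
&\qquad \cup\ \{\, s : \mathrm{seq}_1\,\mathit{OctalDigit};\ r : \{\texttt{'o'}, \texttt{'O'}\} \bullet \langle\texttt{'\#'}, r\rangle \frown s \,\} \\
&\qquad \cup\ \{\, s : \mathrm{seq}_1\,\mathit{HexDigit};\ r : \{\texttt{'h'}, \texttt{'H'}\} \bullet \langle\texttt{'\#'}, r\rangle \frown s \,\}
\end{aligned}
\]

The value of a numeral is calculated in the obvious way.

\[
\begin{aligned}
&\mathit{value\_in\_radix} : \mathbb{Z} \times \mathrm{seq}\,\mathit{HexDigit} \to \mathbb{Z} \\
&\mathit{digit\_value} : \mathit{HexDigit} \to \mathbb{Z} \\
&\mathit{value\_in\_radix}(r, \langle\rangle) = 0 \\
&\mathit{value\_in\_radix}(r, s \frown \langle d \rangle) = r * \mathit{value\_in\_radix}(r, s) + \mathit{digit\_value}(d) \\
&\mathit{digit\_value} = \{\texttt{'0'} \mapsto 0, \ldots, \texttt{'9'} \mapsto 9,\ \texttt{'a'} \mapsto 10,\ \texttt{'A'} \mapsto 10, \ldots, \texttt{'f'} \mapsto 15,\ \texttt{'F'} \mapsto 15\}
\end{aligned}
\]

\[
\begin{aligned}
&\mathit{num\_attr} : \mathit{Numeral} \to \mathbb{Z} \\
&\forall s : \mathrm{seq}_1\,\mathit{Digit} \bullet \mathit{num\_attr}(s) = \mathit{value\_in\_radix}(10, s) \\
&\forall s : \mathrm{seq}_1\,\mathit{Digit} \bullet \mathit{num\_attr}(\langle\texttt{'-'}\rangle \frown s) = -\mathit{value\_in\_radix}(10, s) \\
&\forall s : \mathrm{seq}_1\,\mathit{BinaryDigit};\ r : \{\texttt{'b'}, \texttt{'B'}\} \bullet \mathit{num\_attr}(\langle\texttt{'\#'}, r\rangle \frown s) = \mathit{value\_in\_radix}(2, s) \\
&\forall s : \mathrm{seq}_1\,\mathit{OctalDigit};\ r : \{\texttt{'o'}, \texttt{'O'}\} \bullet \mathit{num\_attr}(\langle\texttt{'\#'}, r\rangle \frown s) = \mathit{value\_in\_radix}(8, s) \\
&\forall s : \mathrm{seq}_1\,\mathit{HexDigit};\ r : \{\texttt{'h'}, \texttt{'H'}\} \bullet \mathit{num\_attr}(\langle\texttt{'\#'}, r\rangle \frown s) = \mathit{value\_in\_radix}(16, s)
\end{aligned}
\]

3.4 Identifiers

\[
\begin{aligned}
&\mathit{Identifier} : \mathbb{P}(\mathrm{seq}\,\mathit{CHAR}) \\
&\mathit{Identifier} = (\mathrm{seq}_1\,X) \setminus \mathit{Numeral} \\
&\text{where } X = \mathit{Visible} \setminus \{\texttt{'('}, \texttt{')'}, \texttt{'"'}, \texttt{'\textquotesingle'}, \texttt{'\textasciigrave'}, \texttt{';'}, \texttt{'\#'}, \text{[illegible]}\}
\end{aligned}
\]

It is not immediately obvious that this lexical class can be defined by a regular expression. Clearly, though, $\mathrm{seq}_1\,X$ is definable by a regular expression, as is $\mathit{Numeral}$. Furthermore, the set of regular languages is closed under complementation and intersection, and a set difference $A \setminus B$ is the intersection of $A$ with the complement of $B$. So, we can conclude that $\mathit{Identifier}$ can be specified by a regular expression. Once we think to look for it, it is easy to find:

\[
\begin{aligned}
&\mathit{Identifier} ::= \texttt{'-'} \mid X\,V^{*} \mid (D \mid \texttt{'-'})\,D^{*}\,N\,V^{*} \\
&\text{where } V = \mathit{Visible} \setminus \{\texttt{'('}, \texttt{')'}, \texttt{'"'}, \texttt{'\textquotesingle'}, \texttt{'\textasciigrave'}, \texttt{';'}, \texttt{'\#'}, \text{[illegible]}\} \\
&\phantom{\text{where }} D = \mathit{Digit} \\
&\phantom{\text{where }} N = V \setminus \mathit{Digit} \\
&\phantom{\text{where }} X = N \setminus \{\texttt{'-'}\}
\end{aligned}
\]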
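The claim that the regular expression describes the same set as the set difference can be checked mechanically on short strings. The sketch below is an illustration only: it uses Python's `re` module, takes the seven legible excluded characters as the full exclusion set (an assumption, since one entry is unreadable in the source), and the helper names `is_identifier`, `IDENT_RE` and `check` are hypothetical. It also evaluates numerals as `value_in_radix` and `num_attr` prescribe.

```python
import re
from itertools import product

DIGITS = set("0123456789")
# Assumption: only the seven legible exclusions are used; the source lists
# one more excluded character that cannot be read.
V = {chr(n) for n in range(33, 127)} - set("()\"'`;#")


def value_in_radix(radix, digits):
    """Left-to-right evaluation, matching the recursive definition."""
    value = 0
    for d in digits:
        value = radix * value + int(d, 16)  # int(d, 16) acts as digit_value
    return value


def num_attr(numeral):
    """Value of a numeral; assumes the argument is a well-formed Numeral."""
    if numeral[0] == "#":
        radix = {"b": 2, "o": 8, "h": 16}[numeral[1].lower()]
        return value_in_radix(radix, numeral[2:])
    if numeral[0] == "-":
        return -value_in_radix(10, numeral[1:])
    return value_in_radix(10, numeral)


def is_decimal_numeral(s):
    body = s[1:] if s.startswith("-") else s
    return body != "" and all(c in DIGITS for c in body)


def is_identifier(s):
    """(seq1 V) minus Numeral; radix numerals contain '#', which is not in V."""
    return s != "" and all(c in V for c in s) and not is_decimal_numeral(s)


def char_class(chars):
    """Regex character class matching exactly the given characters."""
    return "[" + "".join(re.escape(c) for c in sorted(chars)) + "]"


# '-' | X V* | (D | '-') D* N V*   with N = V \ Digit and X = N \ {'-'}
v, n, x = char_class(V), char_class(V - DIGITS), char_class(V - DIGITS - {"-"})
IDENT_RE = re.compile("-|" + x + v + "*|[0-9-][0-9]*" + n + v + "*")


def check(max_len=4, alphabet="-19a#("):
    """Compare both definitions on every string up to max_len over alphabet."""
    for k in range(max_len + 1):
        for chars in product(alphabet, repeat=k):
            s = "".join(chars)
            if is_identifier(s) != (IDENT_RE.fullmatch(s) is not None):
                return False
    return True
```

The exhaustive comparison is a sanity check on a small alphabet, not a proof; it exercises the boundary cases such as "-", "-1", "--" and "1-1" where the two descriptions could plausibly disagree.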
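The attribute associated with an identifier is derived by converting all letters in the identifier to lower case. (We could equally well convert them to upper case; all that matters is that case is normalized.)

\[
\begin{aligned}
&\mathit{ident\_attr} : \mathit{Identifier} \to \mathit{Identifier} \\
&\mathit{lower} : \mathit{CHAR} \to \mathit{CHAR} \\
&\mathit{ident\_attr} = (\lambda\, i : \mathit{Identifier} \bullet \mathit{lower} \circ i) \\
&\mathit{lower} = (\mathrm{id}\ \mathit{CHAR}) \oplus \{\texttt{'A'} \mapsto \texttt{'a'}, \ldots, \texttt{'Z'} \mapsto \texttt{'z'}\}
\end{aligned}
\]

3.5 Character Literals

\[
\begin{aligned}
&\mathit{Character\_literal} : \mathbb{P}(\mathrm{seq}\,\mathit{CHAR}) \\
&\mathit{Escape} : \mathbb{P}(\mathrm{seq}\,\mathit{CHAR}) \\
&\mathit{Character\_literal} = \{\, c : \mathit{Graphic} \mid c \neq \texttt{'\textbackslash'} \bullet \langle\texttt{'\textquotesingle'}, c, \texttt{'\textquotesingle'}\rangle \,\} \\
&\qquad \cup\ \{\, s : \mathit{Escape} \bullet \langle\texttt{'\textquotesingle'}\rangle \frown s \frown \langle\texttt{'\textquotesingle'}\rangle \,\} \\
&\mathit{Escape} = \{\, c : \mathit{EscapeChar} \bullet \langle\texttt{'\textbackslash'}, c\rangle \,\} \\
&\qquad \cup\ \{\, s : \mathrm{seq}\,\mathit{OctalDigit} \mid \#s = 3 \bullet \langle\texttt{'\textbackslash'}\rangle \frown s \,\}
\end{aligned}
\]

\[
\begin{aligned}
&\mathit{char\_attr} : \mathit{Character\_literal} \to \mathit{CHAR} \\
&\mathit{escape\_value} : \mathit{Escape} \to \mathit{CHAR} \\
&\forall c : \mathit{Graphic} \bullet \mathit{char\_attr}(\langle\texttt{'\textquotesingle'}, c, \texttt{'\textquotesingle'}\rangle) = c \\
&\forall s : \mathit{Escape} \bullet \mathit{char\_attr}(\langle\texttt{'\textquotesingle'}\rangle \frown s \frown \langle\texttt{'\textquotesingle'}\rangle) = \mathit{escape\_value}(s) \\
&\forall c : \mathit{EscapeChar} \bullet \mathit{escape\_value}(\langle\texttt{'\textbackslash'}, c\rangle) = \mathit{EscapeTable}(c) \\
&\forall s : \mathrm{seq}\,\mathit{OctalDigit} \mid \#s = 3 \bullet \mathit{escape\_value}(\langle\texttt{'\textbackslash'}\rangle \frown s) = \mathit{char\_val}(\mathit{value\_in\_radix}(8, s))
\end{aligned}
\]

3.6 String Literals

\[
\begin{aligned}
&\mathit{String\_literal} : \mathbb{P}(\mathrm{seq}\,\mathit{CHAR}) \\
&\mathit{String\_element} : \mathbb{P}(\mathrm{seq}\,\mathit{CHAR}) \\
&\mathit{String\_literal} = \{\, s : \mathrm{seq}\,\mathit{String\_element} \bullet \langle\texttt{'"'}\rangle \frown ({\frown}/\,s) \frown \langle\texttt{'"'}\rangle \,\} \\
&\mathit{String\_element} = \{\, c : (\mathit{Graphic} \setminus \{\texttt{'"'}, \texttt{'\textbackslash'}\}) \bullet \langle c \rangle \,\} \cup \mathit{Escape}
\end{aligned}
\]

A given string literal can be divided into elements in only one way:

Lemma 1 Suppose $e, e' : \mathrm{seq}\,\mathit{String\_element}$, and suppose ${\frown}/\,e = {\frown}/\,e'$. Then $e = e'$.

This property makes it easy to find the attribute of a string literal. Each element is interpreted as (the body of) a character literal:

\[
\begin{aligned}
&\mathit{string\_attr} : \mathit{String\_literal} \to \mathrm{seq}\,\mathit{CHAR} \\
&\mathit{element\_char} : \mathit{String\_element} \to \mathit{CHAR} \\
&\forall s : \mathit{String\_literal};\ x : \mathrm{seq}\,\mathit{String\_element} \mid s = \langle\texttt{'"'}\rangle \frown ({\frown}/\,x) \frown \langle\texttt{'"'}\rangle \bullet \mathit{string\_attr}(s) = \mathit{element\_char} \circ x \\
&\forall e : \mathit{String\_element} \bullet \mathit{element\_char}(e) = \mathit{char\_attr}(\langle\texttt{'\textquotesingle'}\rangle \frown e \frown \langle\texttt{'\textquotesingle'}\rangle)
\end{aligned}
\]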
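Lemma 1 is what lets a decoder walk a string body from left to right without backtracking: the first character of each element determines its length. A minimal sketch of string_attr along those lines follows, assuming Python strings stand in for seq CHAR; the function `split_elements` is a hypothetical helper introduced for this example, and the table mirrors EscapeTable.

```python
# Sketch of string_attr and element_char from Section 3.6.
ESCAPE_TABLE = {
    "b": chr(8), "d": chr(127), "l": "\n", "n": "\n", "p": "\f",
    "r": "\r", "s": " ", "t": "\t", '"': '"', "\\": "\\",
}
OCTAL = "01234567"
GRAPHIC = {chr(n) for n in range(32, 127)}


def split_elements(body):
    """Split the text between the double quotes into string elements.

    Raises ValueError if the body is not a concatenation of elements.
    """
    elements, i = [], 0
    while i < len(body):
        c = body[i]
        if c == "\\":
            if body[i + 1 : i + 2] in ESCAPE_TABLE:
                size = 2                      # single character escape
            elif len(body) - i >= 4 and all(d in OCTAL for d in body[i + 1 : i + 4]):
                size = 4                      # backslash plus three octal digits
            else:
                raise ValueError(f"bad escape at offset {i}")
        elif c in GRAPHIC and c != '"':
            size = 1                          # ordinary graphic character
        else:
            raise ValueError(f"illegal character at offset {i}")
        elements.append(body[i : i + size])
        i += size
    return elements


def element_char(element):
    """Character denoted by one string element."""
    if len(element) == 1:
        return element
    if len(element) == 2:
        return ESCAPE_TABLE[element[1]]
    # Octal escape. Codes above 127 are outside ASCII; this sketch does
    # not reject them.
    return chr(int(element[1:], 8))


def string_attr(literal):
    """Attribute of a string literal, including its surrounding quotes."""
    if len(literal) < 2 or literal[0] != '"' or literal[-1] != '"':
        raise ValueError("not a string literal")
    return "".join(element_char(e) for e in split_elements(literal[1:-1]))
```

Because the escape characters and the octal digits are disjoint, the two escape branches never both apply, which is the restriction described in Section 1.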
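3.7 Separators

Separators are either whitespace or comments.

\[
\begin{aligned}
&\mathit{Separator} : \mathbb{P}(\mathrm{seq}\,\mathit{CHAR}) \\
&\mathit{Whitespace} : \mathbb{P}(\mathrm{seq}\,\mathit{CHAR}) \\
&\mathit{Comment} : \mathbb{P}(\mathrm{seq}\,\mathit{CHAR}) \\
&\mathit{Separator} = \mathit{Whitespace} \cup \mathit{Comment} \\
&\mathit{Whitespace} = \{\, c : \mathit{Blank} \bullet \langle c \rangle \,\} \\
&\mathit{Comment} = \{\, c : \mathit{EndLine};\ s : \mathrm{seq}(\mathit{CHAR} \setminus \mathit{EndLine}) \bullet \langle\texttt{';'}\rangle \frown s \frown \langle c \rangle \,\}
\end{aligned}
\]

4 Tokenization

A sequence of characters is tokenized by first dividing it into tokens and separators, then throwing away the separators. The first stage introduces possible ambiguities: the sequence $\langle\texttt{'1'}, \texttt{'2'}\rangle$ can be divided into two one-digit numbers, or into a single number; similarly, the sequence $\langle\texttt{'a'}, \texttt{'b'}\rangle$ can be divided into a single identifier, or two identifiers. A simple principle (called maximum munch in [2]) resolves such problems: each lexical unit must be as long as possible. To put it another way, a sequence of characters is divided into two or more tokens only if necessary.

This principle can be formalized as a property of a sequence of lexical units: each unit in the sequence must be the longest possible lexical unit that is a prefix of the input. This is easily described formally (note that the input is obtained by concatenating the lexical units in the sequence):

\[
\begin{aligned}
&\mathit{Maximal} : \mathbb{P}(\mathrm{seq}\,\mathit{LexicalUnit}) \\
&\forall u : \mathrm{seq}\,\mathit{LexicalUnit} \bullet \mathit{Maximal}\ u \Leftrightarrow \\
&\qquad (\#u > 0 \Rightarrow \mathit{Maximal}(\mathit{tail}(u)) \land (\forall t : \mathit{LexicalUnit} \mid t \mathrel{\mathrm{prefix}} {\frown}/\,u \bullet t \mathrel{\mathrm{prefix}} \mathit{head}(u)))
\end{aligned}
\]

Tokenization is a relation between an input sequence of characters and an output sequence of tokens, defined according to the above description:

\[
\begin{aligned}
&\mathit{Tokenize} : \mathrm{seq}\,\mathit{CHAR} \leftrightarrow \mathrm{seq}\,\mathit{Token} \\
&\mathit{Tokenize} = \{\, u : \mathrm{seq}\,\mathit{LexicalUnit} \mid \mathit{Maximal}\ u \bullet ({\frown}/\,u,\ u \upharpoonright \mathit{Token}) \,\}
\end{aligned}
\]

Tokenize is, in fact, a partial function. We first show that the maximal munch principle uniquely determines the lexical tokens comprising a sequence (this is also proven in [2]):

Lemma 2 Suppose $u, u' : \mathrm{seq}\,\mathit{LexicalUnit}$ are both maximal, and ${\frown}/\,u = {\frown}/\,u'$. Then $u = u'$.

We can prove this by induction on the length of $u$. Suppose first that ${\frown}/\,u$ is empty. Since $\langle\rangle \notin \mathit{LexicalUnit}$, $u$ must be empty, and $u'$ must be empty as well, so $u = u'$.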
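Therefore, we may assume that both $u$ and $u'$ are nonempty. Put $h = \mathit{head}(u)$ and $h' = \mathit{head}(u')$. Then $h$ and $h'$ are lexical units, and

\[ h \mathrel{\mathrm{prefix}} {\frown}/\,u = {\frown}/\,u'. \]

By the maximality of $u'$,

\[ \forall t : \mathit{LexicalUnit} \mid t \mathrel{\mathrm{prefix}} {\frown}/\,u' \bullet t \mathrel{\mathrm{prefix}} \mathit{head}(u'). \]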
Therefore, instantiating $t$ to $h$, we have

\[ h \mathrel{\mathrm{prefix}} \mathit{head}(u') = h'. \]

Similarly, since $u$ is maximal, we can show $h' \mathrel{\mathrm{prefix}} h$. Therefore $h = h'$.

By the definition of maximality, both $\mathit{tail}(u)$ and $\mathit{tail}(u')$ are maximal. Moreover, since ${\frown}/\,u = {\frown}/\,u'$ and $\mathit{head}(u) = \mathit{head}(u')$, we have ${\frown}/\,\mathit{tail}(u) = {\frown}/\,\mathit{tail}(u')$. So by the induction hypothesis, $\mathit{tail}(u) = \mathit{tail}(u')$, and thus $u = u'$.

Lemma 3 Tokenize is a function.

Suppose $(s, t) \in \mathit{Tokenize}$ and $(s, t') \in \mathit{Tokenize}$. We must show $t = t'$. By the definition of Tokenize and the hypotheses, we have

\[ \exists u : \mathrm{seq}\,\mathit{LexicalUnit} \mid \mathit{Maximal}\ u \bullet s = {\frown}/\,u \land t = u \upharpoonright \mathit{Token} \]

and

\[ \exists u' : \mathrm{seq}\,\mathit{LexicalUnit} \mid \mathit{Maximal}\ u' \bullet s = {\frown}/\,u' \land t' = u' \upharpoonright \mathit{Token}. \]

But Lemma 2 shows that $u = u'$, so clearly $t = t'$.

We should make sure that the maximum munch principle does not resolve too much ambiguity. In fact, this is so: the only possible ambiguities it resolves are as in the above examples, where identifiers or numerals are arbitrarily divided:

Lemma 4 Suppose $t$ and $t'$ are tokens, $t \neq t'$, and $t \mathrel{\mathrm{prefix}} t'$. Then $t$ and $t'$ are both in the set $\mathit{Identifier} \cup \mathit{Numeral}$.

The proof is a rather tedious consideration of cases.

References

[1] Dan Craigen. Reference manual for the language Verdi. Technical Report TR , Odyssey Research Associates, February

[2] R. C. Holt, P. A. Matthews, J. A. Rosselet, and J. R. Cordy. The Turing Programming Language: Design and Definition. University of Toronto (Draft of June 1985).

[3] Mark Saaltink. A formal description of Verdi. Technical Report TR a, Odyssey Research Associates, November

[4] J. M. Spivey. The Z Notation: A Reference Manual. Prentice Hall, 1989.
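The tokenization relation of Section 4 can be prototyped directly from the definition of Maximal: at each position, take the longest prefix of the remaining input that is a lexical unit, and fail if there is none. The sketch below is an illustration under stated assumptions, not the Verdi compiler's scanner. It uses Python's `re` module, represents tokens as plain strings, uses only the seven legible excluded identifier characters, and the names `tokenize`, `char_class` and `union` are hypothetical.

```python
import re


def char_class(chars):
    """Regex character class matching exactly the given characters."""
    return "[" + "".join(re.escape(c) for c in sorted(chars)) + "]"


def union(*patterns):
    """Compiled regex matching any one of the given patterns."""
    return re.compile("|".join("(?:" + p + ")" for p in patterns))


VISIBLE = {chr(n) for n in range(33, 127)}
GRAPHIC = VISIBLE | {" "}
DIGIT = set("0123456789")
# Assumption: one further excluded character is unreadable in the source.
V = VISIBLE - set("()\"'`;#")
N = V - DIGIT
X = N - {"-"}

ESCAPE = r'\\(?:[bdlnprst"\\]|[0-7]{3})'
NUMERAL = r"-?[0-9]+|#[bB][01]+|#[oO][0-7]+|#[hH][0-9a-fA-F]+"
IDENTIFIER = ("-|" + char_class(X) + char_class(V) + "*|[0-9-][0-9]*"
              + char_class(N) + char_class(V) + "*")
CHAR_LIT = "'(?:" + char_class(GRAPHIC - {"\\"}) + "|" + ESCAPE + ")'"
STRING_LIT = '"(?:' + char_class(GRAPHIC - {'"', "\\"}) + "|" + ESCAPE + ')*"'
PAREN = r"[()]"
WHITESPACE = r"[\r\n\f \t]"          # a single blank character
COMMENT = r";[^\r\n]*[\r\n]"         # ';' up to and including the line end

TOKEN = union(NUMERAL, IDENTIFIER, CHAR_LIT, STRING_LIT, PAREN)
SEPARATOR = union(WHITESPACE, COMMENT)


def tokenize(text):
    """Return the tokens of text; raise ValueError if it has no tokenization."""
    tokens, pos = [], 0
    while pos < len(text):
        # Try prefixes of the remaining input from longest to shortest, so
        # the first lexical unit found is the maximal one.
        for end in range(len(text), pos, -1):
            unit = text[pos:end]
            if TOKEN.fullmatch(unit):
                tokens.append(unit)
                break
            if SEPARATOR.fullmatch(unit):
                break                 # separators are thrown away
        else:
            raise ValueError(f"no lexical unit at offset {pos}")
        pos = end
    return tokens
```

For example, `tokenize("12ab")` returns `['12ab']`, a single identifier, rather than a numeral followed by an identifier. Testing every prefix with `fullmatch` is quadratic in the input length; it is used here because it follows the specification literally, whereas a production scanner would normally use a finite automaton.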
More informationFinite Automata Theory and Formal Languages TMV027/DIT321 LP4 2016
Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2016 Lecture 15 Ana Bove May 23rd 2016 More on Turing machines; Summary of the course. Overview of today s lecture: Recap: PDA, TM Push-down
More informationII (Sorting and) Order Statistics
II (Sorting and) Order Statistics Heapsort Quicksort Sorting in Linear Time Medians and Order Statistics 8 Sorting in Linear Time The sorting algorithms introduced thus far are comparison sorts Any comparison
More informationCompiler Construction D7011E
Compiler Construction D7011E Lecture 2: Lexical analysis Viktor Leijon Slides largely by Johan Nordlander with material generously provided by Mark P. Jones. 1 Basics of Lexical Analysis: 2 Some definitions:
More informationGenerell Topologi. Richard Williamson. May 6, 2013
Generell Topologi Richard Williamson May 6, 2013 1 8 Thursday 7th February 8.1 Using connectedness to distinguish between topological spaces I Proposition 8.1. Let (, O ) and (Y, O Y ) be topological spaces.
More informationGrammars and Parsing, second week
Grammars and Parsing, second week Hayo Thielecke 17-18 October 2005 This is the material from the slides in a more printer-friendly layout. Contents 1 Overview 1 2 Recursive methods from grammar rules
More informationB.V. Patel Institute of BMC & IT, UTU 2014
BCA 3 rd Semester 030010301 - Java Programming Unit-1(Java Platform and Programming Elements) Q-1 Answer the following question in short. [1 Mark each] 1. Who is known as creator of JAVA? 2. Why do we
More informationHandout 9: Imperative Programs and State
06-02552 Princ. of Progr. Languages (and Extended ) The University of Birmingham Spring Semester 2016-17 School of Computer Science c Uday Reddy2016-17 Handout 9: Imperative Programs and State Imperative
More informationExamination in Compilers, EDAN65
Examination in Compilers, EDAN65 Department of Computer Science, Lund University 2016 10 28, 08.00-13.00 Note! Your exam will be marked only if you have completed all six programming lab assignments in
More informationChapter 3: Describing Syntax and Semantics. Introduction Formal methods of describing syntax (BNF)
Chapter 3: Describing Syntax and Semantics Introduction Formal methods of describing syntax (BNF) We can analyze syntax of a computer program on two levels: 1. Lexical level 2. Syntactic level Lexical
More informationVHDL Lexical Elements
1 Design File = Sequence of Lexical Elements && Separators (a) Separators: Any # of Separators Allowed Between Lexical Elements 1. Space character 2. Tab 3. Line Feed / Carriage Return (EOL) (b) Lexical
More informationUNIT -2 LEXICAL ANALYSIS
OVER VIEW OF LEXICAL ANALYSIS UNIT -2 LEXICAL ANALYSIS o To identify the tokens we need some method of describing the possible tokens that can appear in the input stream. For this purpose we introduce
More informationLecture Notes on Static and Dynamic Semantics
Lecture Notes on Static and Dynamic Semantics 15-312: Foundations of Programming Languages Frank Pfenning Lecture 4 September 9, 2004 In this lecture we illustrate the basic concepts underlying the static
More informationDefining Program Syntax. Chapter Two Modern Programming Languages, 2nd ed. 1
Defining Program Syntax Chapter Two Modern Programming Languages, 2nd ed. 1 Syntax And Semantics Programming language syntax: how programs look, their form and structure Syntax is defined using a kind
More information(Refer Slide Time: 0:19)
Theory of Computation. Professor somenath Biswas. Department of Computer Science & Engineering. Indian Institute of Technology, Kanpur. Lecture-15. Decision Problems for Regular Languages. (Refer Slide
More informationFormal Languages and Grammars. Chapter 2: Sections 2.1 and 2.2
Formal Languages and Grammars Chapter 2: Sections 2.1 and 2.2 Formal Languages Basis for the design and implementation of programming languages Alphabet: finite set Σ of symbols String: finite sequence
More informationThe Front End. The purpose of the front end is to deal with the input language. Perform a membership test: code source language?
The Front End Source code Front End IR Back End Machine code Errors The purpose of the front end is to deal with the input language Perform a membership test: code source language? Is the program well-formed
More informationEDAN65: Compilers, Lecture 04 Grammar transformations: Eliminating ambiguities, adapting to LL parsing. Görel Hedin Revised:
EDAN65: Compilers, Lecture 04 Grammar transformations: Eliminating ambiguities, adapting to LL parsing Görel Hedin Revised: 2017-09-04 This lecture Regular expressions Context-free grammar Attribute grammar
More informationData Types and Variables in C language
Data Types and Variables in C language Basic structure of C programming To write a C program, we first create functions and then put them together. A C program may contain one or more sections. They are
More informationLexical Analysis. Introduction
Lexical Analysis Introduction Copyright 2015, Pedro C. Diniz, all rights reserved. Students enrolled in the Compilers class at the University of Southern California have explicit permission to make copies
More informationCS 374 Fall 2014 Homework 2 Due Tuesday, September 16, 2014 at noon
CS 374 Fall 2014 Homework 2 Due Tuesday, September 16, 2014 at noon Groups of up to three students may submit common solutions for each problem in this homework and in all future homeworks You are responsible
More information