Parsing Combinators: Introduction & Tutorial


Mayer Goldberg
October 21, 2017

Contents

1 Synopsis
2 Backus-Naur Form (BNF)
3 Parsing Combinators
4 Simple constructors
5 The parser stack
6 Recursive parsers
7 Packing
8 Builtin Parsers & Parser Constructors
9 Using the parsers
10 The power of parsing combinators

1 Synopsis

Parsing combinators are a technique for compositionally embedding top-down, recursive-descent parsers into programming languages that support higher-order abstraction. Put simply, this means that if you formulate your grammar in a specific way, which we shall discuss later, you can encode it directly, in an almost one-to-one way, into any functional or object-oriented programming language. This means that you can encode and implement

your parsers quickly, incrementally, and directly in your programming language, without having to learn a special language for describing the grammar, and without having to translate the grammar into source code in some programming language. This technique lets you implement sophisticated parsers quickly and correctly, with minimal pain, and far more quickly than with other methods.

The drawback: The technique is an embedding of a grammar, rather than a translation of it, so no optimizations are performed on the grammar. This means that if your parsers are terribly inefficient, you will have to identify the causes and change your grammar accordingly.

This tutorial describes the theory and use of parsing combinators. To use parsing combinators, you do not really need to understand how they are implemented, though this will help you and is not very difficult. Try to read through the full text, including the examples. If you find an error in the text, or have suggestions, please write me an email.

2 Backus-Naur Form (BNF)

Our journey begins with the Backus-Naur form, named after John Backus of the IBM Corporation, and Peter Naur of the University of København, in København, Denmark. BNF is a notation used to specify context-free grammars, and has been in use since the late 1950s. BNF describes non-terminals using both terminals and non-terminals, together with the constructors catenation and disjunction. BNF has been extended with syntactic sugar that includes the Kleene star (denoting the concatenation of zero or more expressions), the Kleene plus (denoting the concatenation of one or more expressions), the question mark (denoting zero or one occurrence of an expression), and parentheses for grouping sub-expressions. All these extensions can be translated into straight BNF if we add additional non-terminals.
Years ago, it was quite common for books on specific languages to include a grammar for the syntax of the language, either in BNF or in some extended version of BNF. Today this is rarely seen. However, you should already be somewhat familiar with the notation of BNF. Here is an example of the definition of integers with no leading zeros:

    <digit-1-9>      ::= 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
    <digit-0-9>      ::= 0 | <digit-1-9>
    <natural-number> ::= <digit-1-9><digit-0-9>* | 0
    <integer>        ::= ( - | + )? <natural-number>

The ? means at most once, so we may replace the rule

    <integer> ::= ( - | + )? <natural-number>

with

    <integer> ::= - <natural-number>
                | + <natural-number>
                | <natural-number>

3 Parsing Combinators

The basic ideas behind parsing combinators are:

- Terminals are encoded as parsers that recognize only their respective terminals.
- Non-terminals are encoded as parsers.
- The operators of catenation and disjunction are encoded as higher-order abstractions (either higher-order functions or instances of a class of parsers), i.e.,
  - In the functional world, catenation and disjunction are higher-order procedures that take parsers for grammars and return a parser for the catenation or disjunction of these grammars.
  - In the object-oriented world, catenation and disjunction are static methods, factory methods, or factory classes that take parsers for grammars and construct new parsers for the catenation or disjunction of these grammars.
- Recursive non-terminals become either recursive functions or recursive methods.

So if you look at the abstract syntax tree for a grammar encoded in BNF (i.e., the AST of the BNF for that grammar), each node in that tree maps to a function or method call in the definition of a parser for that grammar that has been constructed using parsing combinators.

But this is far from all there is to say about parsing combinators: Because parsers are just functions or objects, and because the parsers that are constructed using parsing combinators are constructed on the fly, at run time, it is simple to use either functional or object-oriented abstraction to create additional parsing combinators, i.e., procedures that take parsers and construct new parsers. In this way, a single derived parsing combinator can be used to describe many rules in BNF. This is where parsing combinators are actually better at expressing grammars than BNF: It's like BNF with abstraction. We shall have more to say about this later on.
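To make the functional reading of these ideas concrete, here is a minimal sketch of terminals, catenation, and disjunction as higher-order procedures. It is a simplified stand-in and not the pc.scm implementation: the names my-const, my-caten, and my-disj are hypothetical, the combinators are binary, and a parser here simply returns a pair of the match and the remaining characters, or #f on failure (the real package uses a different interface, described in section 9).

```scheme
;; Simplified stand-in, NOT pc.scm. Assumed representation:
;; a parser maps a list of characters to (match . remaining), or #f on failure.

;; Terminal: recognize a single character that satisfies pred.
(define (my-const pred)
  (lambda (chars)
    (and (pair? chars)
         (pred (car chars))
         (cons (car chars) (cdr chars)))))

;; Catenation: run p1, then run p2 on whatever p1 left over.
(define (my-caten p1 p2)
  (lambda (chars)
    (let ((r1 (p1 chars)))
      (and r1
           (let ((r2 (p2 (cdr r1))))
             (and r2
                  (cons (list (car r1) (car r2)) (cdr r2))))))))

;; Disjunction: try p1; if it fails, try p2 on the same input.
(define (my-disj p1 p2)
  (lambda (chars)
    (or (p1 chars) (p2 chars))))

(define <letter>
  (my-const (lambda (ch) (and (char<=? #\a ch) (char<=? ch #\z)))))
(define <digit>
  (my-const (lambda (ch) (and (char<=? #\0 ch) (char<=? ch #\9)))))

;; <id> ::= <letter> ( <letter> | <digit> )
(define <id> (my-caten <letter> (my-disj <letter> <digit>)))

(write (<id> (string->list "x1y"))) (newline)  ; prints ((#\x #\1) #\y)
(write (<id> (string->list "1x")))  (newline)  ; prints #f
```

Each node of the BNF rule for <id> corresponds to one procedure call in its definition, which is the one-to-one mapping described above.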

4 Simple constructors

You are given the file pc.scm, which is the implementation of the parsing-combinator package. To begin using it, you must load the file into your Scheme session. You can either load it from the prompt or, if you're using it within a larger project (e.g., writing a compiler), place the call to the load procedure at the top of your file:

    > (load "pc.scm")
    >

If the file is in the directory in which the Scheme system is running, this should be enough. Otherwise, you need to specify the path to pc.scm. Please make sure you use a relative path when specifying the file pc.scm; otherwise, your code will not be portable between Linux and Windows. The file loads without any visible output.

The most elementary parsing combinators are const, caten, and disj, for creating terminals, catenations, and disjunctions:

    > (const (lambda (ch) (and (char<=? #\a ch) (char<=? ch #\z))))
    #<procedure at pc.scm:483>

The procedure const takes a predicate as its argument. The predicate is a procedure that takes a character (or token). const then returns a parser (which is itself a procedure) that matches a character satisfying the predicate.

How might we test such a parser? We define <alphabetic> to be such a parser. Notice that we are following a BNF-like notational convention, whereby non-terminals are enclosed in angle brackets. The parsing-combinator package comes with a builtin procedure test-string for testing parsers. This is not how you deploy a parser, and you should use this procedure only for testing. That said, you should build and test your parsers incrementally, rather than attempt to construct from scratch the entire grammar for a large language:

    > (define <alphabetic>
        (const (lambda (ch) (and (char<=? #\a ch) (char<=? ch #\z)))))
    > (test-string <alphabetic> "")
    (failed with report:)
    > (test-string <alphabetic> "a")
    ((match #\a) (remaining ""))
    > (test-string <alphabetic> "abc")
    ((match #\a) (remaining "bc"))

As you can see, test-string takes two arguments: a parser and an input string. It then attempts to parse the head of the string using the parser. It either fails, returning a report, or succeeds, returning the matched expression and the remaining characters. Notice that when recognizing an alphabetic character, we recognize only one such character. To recognize more, we would need to pass a different parser:

    > (test-string (caten <alphabetic> <alphabetic>) "abc")
    ((match (#\a #\b)) (remaining "c"))
    > (test-string (caten <alphabetic> <alphabetic>) "ab")
    ((match (#\a #\b)) (remaining ""))
    > (test-string (caten <alphabetic> <alphabetic>) "a")
    (failed with report:)

We have now introduced the catenation combinator, which takes any number of parsers for some grammars, and returns a parser for the catenation of these grammars. Notice that the parser that recognizes two alphabetic characters cannot match a string that contains only one such character, so the parser fails. The parsing-combinator package contains extensive tools for reporting errors, but we shall not be covering them just yet. This means that for the time being, when we fail to match the head of the input string, we fail with an empty report.

Consider the disjunction combinator:

    > (define <alphabetic>
        (const (lambda (ch) (and (char<=? #\a ch) (char<=? ch #\z)))))
    > (define <digit>
        (const (lambda (ch) (and (char<=? #\0 ch) (char<=? ch #\9)))))
    > (test-string (disj <alphabetic> <digit>) "a")
    ((match #\a) (remaining ""))
    > (test-string (disj <alphabetic> <digit>) "3")
    ((match #\3) (remaining ""))
    > (test-string (disj <alphabetic> <digit>) "*")
    (failed with report:)

The disjunction of an alphabetic character and a digit character can recognize both digits and alphabetic characters, but not punctuation. Hence we fail on *.

Writing parsers in this way can be very tedious. For example, recognizing the input HELLO would require a catenation of 5 different parsers! Luckily, the parsing-combinator package contains some more advanced combinators to help meet such common parsing needs:

- The procedure range takes two characters and returns a parser that recognizes characters in the given range, i.e., between the two characters.
- The procedure range-ci behaves like range, only in a case-insensitive manner, namely, it doesn't distinguish between uppercase and lowercase characters.
- The procedure word takes a string and returns a parser that matches that string.
- The procedure word-ci behaves like word, only in a case-insensitive manner.

    > (test-string (range #\a #\z) "a")
    ((match #\a) (remaining ""))
    > (test-string (range #\a #\z) "*")
    (failed with report:)
    > (test-string (range #\a #\z) "A")
    (failed with report:)
    > (test-string (range-ci #\a #\z) "A")
    ((match #\A) (remaining ""))
    > (test-string (range-ci #\a #\z) "c")
    ((match #\c) (remaining ""))
    > (test-string (word "HELLO") "hello")
    (failed with report:)
    > (test-string (word "HELLO") "HELL")
    (failed with report:)
    > (test-string (word "HELLO") "HELLO-WORLD!")
    ((match (#\H #\E #\L #\L #\O)) (remaining "-WORLD!"))
    > (test-string (word-ci "HELLO") "hello-world!")
    ((match (#\h #\e #\l #\l #\o)) (remaining "-world!"))

5 The parser stack

Writing complex parsers requires composing many smaller ones. The resulting expressions can be difficult to read and can nest deeply. To simplify this task, we use postfix notation to describe the construction of complex parsers using a parser stack: The procedure new starts a new stack. Commands for the parser stack are preceded by an asterisk character (*). The sequence of commands ends with the command done: If, by the time the done command is executed, the parser

stack contains one parser, then this parser is returned; otherwise an error message is generated.

Here's a simple example: Rather than write a complex parser such as:

    (define <base-10-integer>
      (disj (caten (range #\1 #\9)
                   (star (range #\0 #\9)))
            (not-followed-by (char #\0)
                             (range #\0 #\9))))

we can write it as a flat structure, using the parser stack, as follows:

    (define <base-10-integer>
      (new (*parser (range #\1 #\9))
           (*parser (range #\0 #\9)) *star
           (*caten 2)
           (*parser (char #\0))
           (*parser (range #\0 #\9))
           *not-followed-by
           (*disj 2)
           done))

The procedure star takes a parser and returns a parser that recognizes the Kleene star (i.e., zero or more occurrences) of any expression recognized by the original parser. The procedures *caten and *disj are the parser-stack equivalents of caten and disj. The argument they take is the number of parsers to take off the stack and catenate or disjoin. The procedure not-followed-by takes two parsers p1 and p2, and returns a parser that recognizes whatever p1 recognizes, provided that the input immediately following the match is not recognized by p2. In the example above, it accepts a 0 only when no further digit follows. The parser-stack equivalent of this procedure is *not-followed-by.

6 Recursive parsers

Regardless of whether you define your parsers by composing parsing combinators directly, or by using a parser stack to compose them, you define parsers through application. This means that the general form is always just a bunch of applications of various functions f1, f2, and so on, to various parsers <p1>, <p2>, <p3>, etc. For example, something like this:

    (define <parser>
      (f1 <p1>
          (f2 (f3 <p2> <p3>)
              (f3 <p4> (f4 <p5> <p6>))
              (f5 <p7>))))

If you consider how Scheme handles application, namely, how applicative order of evaluation works, you will realize that we are going to have a real problem defining recursive parsers. Suppose we wanted to define:

    (define <parser>
      (f1 <parser>))

This wouldn't work, since we need to have the value of <parser> in order to define <parser>. This problem is much the same as that of defining a recursive function. When we define recursive functions, all we need is the address of the function and not its value. Consider the ubiquitous example of a recursive function, the factorial function:

    (define fact
      (lambda (n)
        (if (zero? n)
            1
            (* n (fact (- n 1))))))

The occurrence of the variable name fact within the body of the procedure requires only the address of fact, rather than its value. The value will be needed only later, when we apply fact. But for the purpose of compiling fact, the address is sufficient.

Getting back to our parsing combinators, we need a mechanism that will let us define recursive parsers, and we do this by wrapping the name of the parser within a thunk, that is, a procedure of zero arguments: (lambda () <parser>). Of course, the interface to this thunk is completely different from what our parsing combinators expect, so we need a way to bridge this delayed value with the standard parsing-combinator interface. This bridge is called delayed.
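The following sketch shows one way such a bridge could work. It is not the pc.scm source: my-char, my-caten, my-disj, my-delayed, and run are hypothetical stand-ins, written against the three-argument interface that section 9 describes (a list of characters, a success continuation, and a failure continuation). The key point is that the thunk is called only when the parser is actually run, by which time the recursive name is already bound.

```scheme
;; Hypothetical stand-ins, NOT pc.scm. Assumed interface (see section 9):
;; (parser chars succeed fail), where succeed takes (match remaining)
;; and fail takes an error report.

(define (my-char c)
  (lambda (s succeed fail)
    (if (and (pair? s) (char=? (car s) c))
        (succeed (car s) (cdr s))
        (fail '()))))

(define (my-caten p1 p2)
  (lambda (s succeed fail)
    (p1 s
        (lambda (e1 s1)
          (p2 s1
              (lambda (e2 s2) (succeed (list e1 e2) s2))
              fail))
        fail)))

(define (my-disj p1 p2)
  (lambda (s succeed fail)
    (p1 s succeed
        (lambda (report) (p2 s succeed fail)))))

;; The bridge: nothing is looked up until the resulting parser is applied.
(define (my-delayed thunk)
  (lambda (s succeed fail)
    ((thunk) s succeed fail)))

;; <T> ::= 0 | 1 <T>
(define <T>
  (my-disj (my-char #\0)
           (my-caten (my-char #\1)
                     (my-delayed (lambda () <T>)))))

;; Small driver for trying the parser on a string.
(define (run p str)
  (p (string->list str)
     (lambda (e rest) (list 'match e 'remaining (list->string rest)))
     (lambda (report) 'failed)))

(write (run <T> "110!")) (newline)  ; prints (match (#\1 (#\1 #\0)) remaining "!")
(write (run <T> "12"))   (newline)  ; prints failed
```

Without my-delayed, the definition of <T> would have to evaluate <T> while it is still being defined, which is exactly the problem described above.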

Suppose we wish to parse the grammar: S ::= a | b S. The grammar is recursive, and we need to wrap the recursive production in a thunk and delay it:

    (define <S>
      (disj (char #\a)
            (caten (char #\b)
                   (delayed (lambda () <S>)))))

7 Packing

So far, parsing has involved splitting the input stream of characters into the characters that are recognized by our grammar, and the remaining characters. This is all that parsing theory is really interested in. However, for any practical use of parsing theory, we want to be able to do something with the matched characters: generally, to construct something, and in the context of programming-language tools, to create an abstract syntax tree.

The parsing-combinator package comes with procedures for performing post-processing on the matched input. These procedures are pack and pack-with, and their parser-stack equivalents *pack and *pack-with. The procedure pack takes a parser and a unary callback function, and returns a parser that recognizes exactly the same grammar as the original parser, the only difference being that the callback function is applied to the match for post-processing. To see how to use pack, let us return to our original parser for natural numbers:

    (define <base-10-integer>
      (disj (caten (range #\1 #\9)
                   (star (range #\0 #\9)))
            (not-followed-by (char #\0)
                             (range #\0 #\9))))

Testing this parser, we notice that the matched characters are returned in two parts: the first character, and a list of the remaining characters:

    > (test-string <base-10-integer> "12345")
    ((match (#\1 (#\2 #\3 #\4 #\5))) (remaining ""))

These are generated by the parser:

    (caten (range #\1 #\9)
           (star (range #\0 #\9)))

Can we combine these two parts into a single list? To do so, we can replace this parser with:

    (pack
     (caten (range #\1 #\9)
            (star (range #\0 #\9)))
     (lambda (first+rest)
       (cons (car first+rest) (cadr first+rest))))

Starting with the original parser, we pass it as an argument to pack. The callback is a unary function whose single parameter first+rest stands for a list of two things: the first character and the list of all the remaining characters. We simply apply cons to the elements of this list:

    > (test-string <base-10-integer> "12345")
    ((match (#\1 #\2 #\3 #\4 #\5)) (remaining ""))

So we now generate a single list. We are not yet quite satisfied. We still want to convert these characters into a number. To do this, we simply beef up the post-processing:

    (pack
     (caten (range #\1 #\9)
            (star (range #\0 #\9)))
     (lambda (first+rest)
       (string->number
        (list->string
         (cons (car first+rest) (cadr first+rest))))))

The resulting code behaves as follows:

    > (test-string <base-10-integer> "12345moshe")
    ((match 12345) (remaining "moshe"))
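To see why packing does not change the grammar being recognized, here is a sketch of how a pack-like combinator could be written. It is an illustration under assumptions, not the pc.scm source: my-pack is a hypothetical name, <digits> is a hand-written stand-in parser, and both follow the three-argument interface described in section 9. The wrapper only intercepts the success continuation and transforms the matched object; the remaining input and the failure path are passed through untouched.

```scheme
;; Hypothetical sketch, NOT pc.scm. Assumed interface (see section 9):
;; (parser chars succeed fail).

;; Hand-written stand-in parser: one or more decimal digits,
;; returned as a list of characters.
(define (<digits> s succeed fail)
  (let loop ((s s) (acc '()))
    (if (and (pair? s) (char-numeric? (car s)))
        (loop (cdr s) (cons (car s) acc))
        (if (null? acc)
            (fail '())
            (succeed (reverse acc) s)))))

;; A pack-like combinator: same grammar as p, but the match is
;; post-processed by the unary callback f.
(define (my-pack p f)
  (lambda (s succeed fail)
    (p s
       (lambda (e rest) (succeed (f e) rest))
       fail)))

(define <number>
  (my-pack <digits>
           (lambda (chars) (string->number (list->string chars)))))

(write (<number> (string->list "12345moshe")
                 (lambda (e rest) (list e (list->string rest)))
                 (lambda (report) 'failed)))
(newline)  ; prints (12345 "moshe")
```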

The procedure pack-with is similar to pack, but is intended to give names to the elements of the list. For example, when, as in this case, the list is created using catenation, we may want to name each element in the list. We can do this with pack (possibly using let inside the callback function), but it's simpler to use pack-with:

    (pack-with
     (caten (range #\1 #\9)
            (star (range #\0 #\9)))
     (lambda (first rest)
       (string->number
        (list->string (cons first rest)))))

The code behaves identically. We can also use *pack and *pack-with when writing parsers using the parser stack:

    (define <base-10-integer>
      (new (*parser (range #\1 #\9))
           (*parser (range #\0 #\9)) *star
           (*caten 2)
           (*pack-with
            (lambda (first rest)
              (string->number
               (list->string (cons first rest)))))
           (*parser (char #\0))
           (*parser (range #\0 #\9))
           *not-followed-by
           (*disj 2)
           done))

8 Builtin Parsers & Parser Constructors

8.1 Builtin parsers

<any-char>: Matches any character.
<any>: A synonym for <any-char>.

<end-of-input>: Matches the end of the input stream.
<epsilon>: Matches the empty input. This is the unit for catenation.
<fail>: Matches nothing. This is the unit for disjunction.

8.2 Parser constructors

^<separated-exprs>: Takes a parser <expr> for expressions and a parser <sep> for a separator, and returns a parser for a sequence of one or more expressions separated by the given separator.
caten: Takes any number of parsers, and returns their catenation.
char-ci: Takes a character, and returns a parser that matches that character in a case-insensitive manner.
char: Takes a character, and returns a parser that matches that character.
const: Takes a predicate, and returns a parser that matches anything that satisfies the given predicate.
delayed: Provides an interface to a thunk-wrapped parser. Used for embedding recursive production rules.
diff: Takes two parsers <p1>, <p2>, and returns a parser that matches anything matched by <p1>, provided <p2> does not match the head of the same input characters.
disj: Takes any number of parsers, and returns their disjunction.
fence
followed-by
maybe
not-followed-by
one-of-ci
one-of
otherwise

pack-with
pack
plus: Takes a parser <p>. (plus <p>) returns a parser that recognizes the catenation of one or more strings, each of which is recognized by <p>.
range-ci
range
star: Takes a parser <p>. (star <p>) returns a parser that recognizes the catenation of zero or more strings, each of which is recognized by <p>.
times: Takes a parser <p> and a natural number n. (times <p> n) returns a parser that recognizes the catenation of n strings, each of which is recognized by <p>.
word-ci
word-suffixes-ci
word-suffixes
word

8.3 Parser-Stack procedures

*caten
*diff
*disj
*dup
*fence
*followed-by
*guard
*maybe
*not-followed-by

*otherwise
*pack-with
*pack
*parser
*plus
*star
*swap
*times

9 Using the parsers

The procedure test-string is intended to help you test and develop your parsers interactively and incrementally. It is not how you should deploy your parsers. The purpose of this section is to show you how to invoke your parsers directly.

The parsers you write using the parsing-combinator package are procedures of 3 arguments:

- A list of input characters
- A success continuation
- A failure continuation

If the parser succeeds in matching the head of the input characters to some grammatical form, the success continuation is called. Otherwise, the failure continuation is called.

The success continuation is called with two arguments: the object matched and the remaining characters of the input stream. Initially, the object matched is the list of characters matched by the parser; however, the use of post-processing callback functions (e.g., through the use of pack) can result in other objects being returned. The remaining characters are precisely those left after the returned object has been "read" from the input stream.

The failure continuation is invoked with a single argument, an error report. This is going to be a list of strings. This tutorial does not [yet?] deal with error reporting, so we shall not explore this avenue at present.
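The protocol just described can be tried out without the package at all. The sketch below hand-writes a parser that obeys the three-argument interface, together with a small helper in the spirit of test-string. Both <letter> and my-test-string are hypothetical names introduced for illustration; the helper is not the test-string source from pc.scm, and its output format is only an approximation.

```scheme
;; Hand-written parser obeying the three-argument protocol:
;; matches a single alphabetic character.
(define (<letter> s succeed fail)
  (if (and (pair? s) (char-alphabetic? (car s)))
      (succeed (car s) (cdr s))            ; object matched + remaining chars
      (fail '("expected a letter"))))      ; error report: a list of strings

;; Hypothetical helper in the spirit of test-string (NOT the pc.scm code):
;; converts the string to a list of characters and supplies both continuations.
(define (my-test-string parser str)
  (parser (string->list str)
          (lambda (e remaining)
            (list (list 'match e)
                  (list 'remaining (list->string remaining))))
          (lambda (report)
            (list 'failed report))))

(write (my-test-string <letter> "abc")) (newline)
;; prints ((match #\a) (remaining "bc"))
(write (my-test-string <letter> "42")) (newline)
;; prints (failed ("expected a letter"))
```

Because the result of the parser call is whatever the chosen continuation returns, the caller decides what a successful or failed parse turns into.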

In summary, if <parser> is some parser, you call it in tail position as follows:

    (<parser> s                          ; some list of characters
              (lambda (e remaining-chars) ... )
              (lambda (errors) ... ))

Look at the source code for the test-string procedure in the pc.scm file, and see how it works.

10 The power of parsing combinators

Parsing combinators are a way to embed a grammar into a programming language, in a way that is both compositional and direct. Unlike parser generators, parsing combinators perform absolutely no processing on the grammar. The following must be done manually before the parser can be implemented using parsing combinators:

- Removal of left-recursive productions
- Any optimizations to the grammar

That said, parsing combinators offer unique advantages for encoding complex parsers quickly and precisely:

- The parsing-combinator package contains powerful constructors that allow us to define, with ease, grammars that are not context-free (!).
- The use of abstraction allows us to define meta-production-rules, each of which replaces many production rules, resulting in shorter, simpler, and more consistent implementations.
- Parsing combinators encourage an interactive, incremental, bottom-up development of parsers, which is very conducive to building large and complex parsers.
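The idea of a meta-production-rule can be illustrated with a derived combinator for separated lists, similar in spirit to ^<separated-exprs> from section 8.2. This is a sketch under assumptions and not the package's code: every my-* procedure and the run driver are hypothetical stand-ins written against the three-argument interface of section 9. One definition of my-separated stands in for a whole family of BNF rules of the form <expr> ( <sep> <expr> )*.

```scheme
;; Hypothetical stand-ins, NOT pc.scm: (parser chars succeed fail).

(define (my-const pred)
  (lambda (s succeed fail)
    (if (and (pair? s) (pred (car s)))
        (succeed (car s) (cdr s))
        (fail '()))))

(define (my-caten p1 p2)
  (lambda (s succeed fail)
    (p1 s
        (lambda (e1 s1)
          (p2 s1 (lambda (e2 s2) (succeed (list e1 e2) s2)) fail))
        fail)))

;; Zero or more occurrences of p (assumes p never matches the empty input).
(define (my-star p)
  (lambda (s succeed fail)
    (p s
       (lambda (e rest)
         ((my-star p) rest
                      (lambda (es rest2) (succeed (cons e es) rest2))
                      fail))
       (lambda (report) (succeed '() s)))))

(define (my-pack p f)
  (lambda (s succeed fail)
    (p s (lambda (e rest) (succeed (f e) rest)) fail)))

;; Derived combinator: <expr> ( <sep> <expr> )*, returning the list of
;; matched expressions with the separators dropped.
(define (my-separated <expr> <sep>)
  (my-pack
   (my-caten <expr>
             (my-star (my-pack (my-caten <sep> <expr>) cadr)))
   (lambda (first+rest) (cons (car first+rest) (cadr first+rest)))))

(define <digit>  (my-const char-numeric?))
(define <letter> (my-const char-alphabetic?))
(define <comma>  (my-const (lambda (ch) (char=? ch #\,))))

;; Two grammar rules obtained from the same meta-rule.
(define <digit-list>  (my-separated <digit> <comma>))
(define <letter-list> (my-separated <letter> <comma>))

(define (run p str)
  (p (string->list str)
     (lambda (e rest) (list e (list->string rest)))
     (lambda (report) 'failed)))

(write (run <digit-list> "1,2,3;")) (newline)  ; prints ((#\1 #\2 #\3) ";")
(write (run <letter-list> "a,b"))   (newline)  ; prints ((#\a #\b) "")
(write (run <digit-list> "x"))      (newline)  ; prints failed
```

In plain BNF, each separated list would need its own set of rules; here the shared structure is written once and instantiated with different parsers.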


Summer 2017 Discussion 10: July 25, Introduction. 2 Primitives and Define

Summer 2017 Discussion 10: July 25, Introduction. 2 Primitives and Define CS 6A Scheme Summer 207 Discussion 0: July 25, 207 Introduction In the next part of the course, we will be working with the Scheme programming language. In addition to learning how to write Scheme programs,

More information

B The SLLGEN Parsing System

B The SLLGEN Parsing System B The SLLGEN Parsing System Programs are just strings of characters. In order to process a program, we need to group these characters into meaningful units. This grouping is usually divided into two stages:

More information
