Languages and Compilers (SProg og Oversættere) Lecture 3 recap Bent Thomsen Department of Computer Science Aalborg University

1 Languages and Compilers (SProg og Oversættere) Lecture 3 recap Bent Thomsen Department of Computer Science Aalborg University With acknowledgement to Norm Hutchinson whose slides this lecture is based on. 1

2 Syntax Analysis: Parser
Dataflow chart: Source Program -> (stream of characters) -> Scanner -> (stream of tokens) -> Parser -> Abstract Syntax Tree
Both the Scanner and the Parser produce error reports.

3 Top-Down vs. Bottom-Up parsing
  LL analysis (top-down)          LR analysis (bottom-up)
  Left-to-right scan              Left-to-right scan
  Leftmost derivation             Rightmost derivation (in reverse)
  Derivation                      Reduction
  Look-ahead                      Look-ahead

4 Recursive Descent Parsing Recursive descent parsing is a straightforward top-down parsing algorithm. We will now look at how to develop a recursive descent parser from an EBNF specification. Idea: the parse tree structure corresponds to the call graph structure of parsing procedures that call each other recursively. 4

5 Recursive Descent Parsing
  Sentence ::= Subject Verb Object .
  Subject  ::= I | a Noun | the Noun
  Object   ::= me | a Noun | the Noun
  Noun     ::= cat | mat | rat
  Verb     ::= like | is | see | sees
Define a procedure parseN for each non-terminal N:
  private void parseSentence();
  private void parseSubject();
  private void parseObject();
  private void parseNoun();
  private void parseVerb();

6 Recursive Descent Parsing: Auxiliary Methods
  public class MicroEnglishParser {
    private TerminalSymbol currentTerminal;
    private void accept(TerminalSymbol expected) {
      if (currentTerminal matches expected)
        currentTerminal = next input terminal;
      else
        report a syntax error
    }
    ...
  }

7 Recursive Descent Parsing: Parsing Methods
  Sentence ::= Subject Verb Object .
  private void parseSentence() {
    parseSubject();
    parseVerb();
    parseObject();
    accept('.');
  }

8 Recursive Descent Parsing: Parsing Methods
  Subject ::= I | a Noun | the Noun
  private void parseSubject() {
    if (currentTerminal matches 'I')
      accept('I');
    else if (currentTerminal matches 'a') {
      accept('a');
      parseNoun();
    }
    else if (currentTerminal matches 'the') {
      accept('the');
      parseNoun();
    }
    else
      report a syntax error
  }

9 Recursive Descent Parsing: Parsing Methods
  Noun ::= cat | mat | rat
  private void parseNoun() {
    if (currentTerminal matches 'cat')
      accept('cat');
    else if (currentTerminal matches 'mat')
      accept('mat');
    else if (currentTerminal matches 'rat')
      accept('rat');
    else
      report a syntax error
  }
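The Micro-English parsing methods above can be assembled into a small runnable recognizer. The following is only a sketch under stated assumptions: terminals are plain strings separated by whitespace (so the final "." must be written as its own word), a syntax error is reported by throwing an exception, and the class name MicroEnglishSketch, the END marker and the recognizes helper are hypothetical names that do not come from the slides.

```java
import java.util.Arrays;
import java.util.List;

// Sketch of a recursive descent recognizer for Micro-English.
// Assumption: terminals are whitespace-separated words; errors are exceptions.
class MicroEnglishSketch {
    private static final String END = "<end>"; // hypothetical end-of-input marker
    private final List<String> input;
    private int pos = 0;
    private String currentTerminal;

    MicroEnglishSketch(String sentence) {
        input = Arrays.asList(sentence.trim().split("\\s+"));
        currentTerminal = input.get(0);
    }

    private boolean matches(String t) { return currentTerminal.equals(t); }

    private void syntaxError(String where) {
        throw new IllegalStateException("syntax error in " + where + " at '" + currentTerminal + "'");
    }

    // accept: check the current terminal, then advance to the next one
    private void accept(String expected) {
        if (!matches(expected)) syntaxError("accept(" + expected + ")");
        pos++;
        currentTerminal = pos < input.size() ? input.get(pos) : END;
    }

    // Sentence ::= Subject Verb Object .
    private void parseSentence() { parseSubject(); parseVerb(); parseObject(); accept("."); }

    // Subject ::= I | a Noun | the Noun
    private void parseSubject() {
        if (matches("I")) accept("I");
        else if (matches("a")) { accept("a"); parseNoun(); }
        else if (matches("the")) { accept("the"); parseNoun(); }
        else syntaxError("Subject");
    }

    // Object ::= me | a Noun | the Noun
    private void parseObject() {
        if (matches("me")) accept("me");
        else if (matches("a")) { accept("a"); parseNoun(); }
        else if (matches("the")) { accept("the"); parseNoun(); }
        else syntaxError("Object");
    }

    // Noun ::= cat | mat | rat
    private void parseNoun() {
        if (matches("cat")) accept("cat");
        else if (matches("mat")) accept("mat");
        else if (matches("rat")) accept("rat");
        else syntaxError("Noun");
    }

    // Verb ::= like | is | see | sees
    private void parseVerb() {
        if (matches("like")) accept("like");
        else if (matches("is")) accept("is");
        else if (matches("see")) accept("see");
        else if (matches("sees")) accept("sees");
        else syntaxError("Verb");
    }

    // Returns true iff the whole input is one Micro-English sentence.
    static boolean recognizes(String sentence) {
        try {
            MicroEnglishSketch p = new MicroEnglishSketch(sentence);
            p.parseSentence();
            return p.currentTerminal.equals(END);
        } catch (IllegalStateException e) {
            return false;
        }
    }
}
```

For instance, MicroEnglishSketch.recognizes("the cat sees a rat .") follows exactly the chain of calls parseSentence, parseSubject, parseNoun, parseVerb, parseObject, parseNoun, which mirrors the shape of the parse tree.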

10 Top-down parsing
The parse tree is constructed starting at the top (root). Parse tree for "the cat sees a rat .":
  Sentence
    Subject
      the
      Noun: cat
    Verb: sees
    Object
      a
      Noun: rat
    .

11 A little bit of useful theory We will now look at a few useful bits of theory. These will be necessary later when we implement parsers. Grammar transformations A grammar can be transformed in a number of ways without changing the meaning (i.e. the set of strings that it defines) The definition and computation of starter sets 11

12 1) Grammar Transformations: Left factorization
  X Y | X Z   is transformed into   X ( Y | Z )
Example (here X = if Expression then single-command, Y = ε, Z = else single-command):
  single-command ::= V-name := Expression
                   | if Expression then single-command
                   | if Expression then single-command else single-command
is transformed into
  single-command ::= V-name := Expression
                   | if Expression then single-command ( ε | else single-command )

13 1) Grammar Transformations (ctd): Elimination of Left Recursion
  N ::= X | N Y   is transformed into   N ::= X Y*
Example:
  Identifier ::= Letter | Identifier Letter | Identifier Digit
  Identifier ::= Letter | Identifier (Letter | Digit)
  Identifier ::= Letter (Letter | Digit)*
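One way to see why the transformed rule is convenient: the X Y* form maps directly onto "recognize X once, then loop over Y". The sketch below illustrates this for Identifier ::= Letter (Letter | Digit)*. The class and method names are hypothetical, and Java's Character.isLetter and Character.isLetterOrDigit are used as stand-ins for Letter and Digit (they accept more characters than a typical scanner would).

```java
// Sketch: the left-recursion-free rule Identifier ::= Letter (Letter | Digit)*
// turns into one test followed by a loop.
class IdentifierSketch {
    static boolean isIdentifier(String s) {
        int i = 0;
        // Letter
        if (i >= s.length() || !Character.isLetter(s.charAt(i))) return false;
        i++;
        // (Letter | Digit)*
        while (i < s.length() && Character.isLetterOrDigit(s.charAt(i))) i++;
        // the whole string must have been consumed
        return i == s.length();
    }
}
```

The left-recursive original (Identifier ::= Identifier Letter | ...) has no such direct translation, because the procedure for Identifier would have to call itself before consuming any input.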

14 1) Grammar Transformations (ctd): Substitution of non-terminal symbols
  N ::= X              is transformed into   N ::= X
  M ::= α N β                                M ::= α X β
Example:
  single-command ::= for contrVar := Expression to-or-dt Expression do single-command
  to-or-dt       ::= to | downto
is transformed into
  single-command ::= for contrVar := Expression (to | downto) Expression do single-command

15 Developing RD Parser for Mini Triangle
Before we begin: the following non-terminals are recognized by the scanner. They will be returned as tokens by the scanner.
  Identifier      ::= Letter (Letter | Digit)*
  Integer-Literal ::= Digit Digit*
  Operator        ::= + | - | * | / | < | > | =
  Comment         ::= ! Graphic* eol
Assume the scanner produces instances of:
  public class Token {
    byte kind;
    String spelling;
    final static byte IDENTIFIER = 0, INTLITERAL = 1; ...
  }

16 (1+2) Express grammar in EBNF and factorize...
  Program        ::= single-command
  Command        ::= single-command
                   | Command ; single-command            (left recursion elimination needed)
  single-command ::= V-name := Expression                 (left factorization needed)
                   | Identifier "(" Expression ")"
                   | if Expression then single-command else single-command
                   | while Expression do single-command
                   | let Declaration in single-command
                   | begin Command end
  V-name         ::= Identifier
  ...

17 (1+2) Express grammar in EBNF and factorize...
After factorization etc. we get:
  Program        ::= single-command
  Command        ::= single-command ( ; single-command )*
  single-command ::= Identifier ( := Expression | "(" Expression ")" )
                   | if Expression then single-command else single-command
                   | while Expression do single-command
                   | let Declaration in single-command
                   | begin Command end
  V-name         ::= Identifier
  ...

18 Developing RD Parser for Mini Triangle
  Expression         ::= primary-expression
                       | Expression Operator primary-expression    (left recursion elimination needed)
  primary-expression ::= Integer-Literal
                       | V-name
                       | Operator primary-expression
                       | "(" Expression ")"
  Declaration        ::= single-declaration
                       | Declaration ; single-declaration          (left recursion elimination needed)
  single-declaration ::= const Identifier ~ Expression
                       | var Identifier : Type-denoter
  Type-denoter       ::= Identifier

19 (1+2) Express grammar in EBNF and factorize...
After factorization and recursion elimination:
  Expression         ::= primary-expression ( Operator primary-expression )*
  primary-expression ::= Integer-Literal
                       | Identifier
                       | Operator primary-expression
                       | "(" Expression ")"
  Declaration        ::= single-declaration ( ; single-declaration )*
  single-declaration ::= const Identifier ~ Expression
                       | var Identifier : Type-denoter
  Type-denoter       ::= Identifier

20 (3) Create a parser class with...
  public class Parser {
    private Token currentToken;
    private void accept(byte expectedKind) {
      if (currentToken.kind == expectedKind)
        currentToken = scanner.scan();
      else
        report syntax error
    }
    private void acceptIt() {
      currentToken = scanner.scan();
    }
    public void parse() {
      acceptIt(); // Get the first token
      parseProgram();
      if (currentToken.kind != Token.EOT)
        report syntax error
    }
    ...
  }
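To make the parser skeleton concrete, here is a minimal runnable sketch that applies it to the Expression part of the Mini Triangle grammar only (Expression ::= primary-expression (Operator primary-expression)*). Several things are assumptions of this sketch rather than part of the slides: the class name ExprParserSketch, the token kinds OPERATOR, LPAREN and RPAREN and their numbering, the simple scan method standing in for the real scanner, and the use of exceptions for "report syntax error".

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: accept/acceptIt/parse skeleton applied to the Expression sub-grammar.
class ExprParserSketch {
    // Token kinds; names and numbering beyond IDENTIFIER/INTLITERAL are assumptions.
    static final byte IDENTIFIER = 0, INTLITERAL = 1, OPERATOR = 2, LPAREN = 3, RPAREN = 4, EOT = 5;

    static final class Token {
        final byte kind;
        final String spelling;
        Token(byte kind, String spelling) { this.kind = kind; this.spelling = spelling; }
    }

    // Hypothetical stand-in for the scanner: turns the source text into a token list.
    static List<Token> scan(String src) {
        List<Token> out = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) { i++; continue; }
            int start = i;
            if (Character.isLetter(c)) {
                while (i < src.length() && Character.isLetterOrDigit(src.charAt(i))) i++;
                out.add(new Token(IDENTIFIER, src.substring(start, i)));
            } else if (Character.isDigit(c)) {
                while (i < src.length() && Character.isDigit(src.charAt(i))) i++;
                out.add(new Token(INTLITERAL, src.substring(start, i)));
            } else {
                i++;
                byte kind;
                if (c == '(') kind = LPAREN;
                else if (c == ')') kind = RPAREN;
                else if ("+-*/<>=".indexOf(c) >= 0) kind = OPERATOR;
                else throw new IllegalArgumentException("illegal character: " + c);
                out.add(new Token(kind, String.valueOf(c)));
            }
        }
        out.add(new Token(EOT, ""));
        return out;
    }

    private final List<Token> tokens;
    private int next = 0;
    private Token currentToken;

    ExprParserSketch(String src) { tokens = scan(src); }

    private void syntaxError() {
        throw new IllegalStateException("syntax error at '" + currentToken.spelling + "'");
    }

    private void acceptIt() { currentToken = tokens.get(next++); }

    private void accept(byte expectedKind) {
        if (currentToken.kind == expectedKind) acceptIt();
        else syntaxError();
    }

    // Expression ::= primary-expression (Operator primary-expression)*
    private void parseExpression() {
        parsePrimaryExpression();
        while (currentToken.kind == OPERATOR) {
            acceptIt();
            parsePrimaryExpression();
        }
    }

    // primary-expression ::= Integer-Literal | Identifier
    //                      | Operator primary-expression | "(" Expression ")"
    private void parsePrimaryExpression() {
        switch (currentToken.kind) {
            case INTLITERAL:
            case IDENTIFIER:
                acceptIt();
                break;
            case OPERATOR:
                acceptIt();
                parsePrimaryExpression();
                break;
            case LPAREN:
                acceptIt();
                parseExpression();
                accept(RPAREN);
                break;
            default:
                syntaxError();
        }
    }

    void parse() {
        acceptIt();            // get the first token
        parseExpression();
        if (currentToken.kind != EOT) syntaxError();
    }

    // Returns true iff src is a syntactically valid Expression.
    static boolean isExpression(String src) {
        try { new ExprParserSketch(src).parse(); return true; }
        catch (IllegalStateException e) { return false; }
    }
}
```

For example, ExprParserSketch.isExpression("a + (b * 2)") succeeds, while "a +" fails in parsePrimaryExpression because the end-of-text token is not a starter of primary-expression.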

21 (4) Implement private parsing methods:
  Program ::= single-command
  private void parseProgram() {
    parseSingleCommand();
  }

22 (4) Implement private parsing methods:
  single-command ::= Identifier ( := Expression | "(" Expression ")" )
                   | if Expression then single-command else single-command
                   | ... more alternatives ...
  private void parseSingleCommand() {
    switch (currentToken.kind) {
    case Token.IDENTIFIER: ...
    case Token.IF: ...
    ... more cases ...
    default: report a syntax error
    }
  }

23 (4) Implement private parsing methods:
  single-command ::= Identifier ( := Expression | "(" Expression ")" )
                   | if Expression then single-command else single-command
                   | while Expression do single-command
                   | let Declaration in single-command
                   | begin Command end
From the above we can straightforwardly derive the entire implementation of parseSingleCommand (much as we did in the Micro-English example).

24 Algorithm to convert EBNF into a RD parser
The conversion of an EBNF specification into a Java implementation for a recursive descent parser is so mechanical that it can easily be automated! => JavaCC (Java Compiler Compiler)
We can describe the algorithm by a set of mechanical rewrite rules:
  N ::= X    =>    private void parseN() {
                     parse X
                   }

25 Algorithm to convert EBNF into a RD parser
  parse t      =>    accept(t);                where t is a terminal
  parse N      =>    parseN();                 where N is a non-terminal
  parse ε      =>    // a dummy statement
  parse X Y    =>    parse X
                     parse Y

26 Algorithm to convert EBNF into a RD parser
  parse X*     =>    while (currentToken.kind is in starters[X]) {
                       parse X
                     }
  parse X | Y  =>    switch (currentToken.kind) {
                     cases in starters[X]:
                       parse X
                       break;
                     cases in starters[Y]:
                       parse Y
                       break;
                     default: report syntax error
                     }

27 Example: Generation of parseCommand
  Command ::= single-command ( ; single-command )*
Applying the rewrite rules step by step:
  private void parseCommand() {
    parse single-command ( ; single-command )*
  }
becomes
  private void parseCommand() {
    parseSingleCommand();
    parse ( ; single-command )*
  }
and finally
  private void parseCommand() {
    parseSingleCommand();
    while (currentToken.kind == Token.SEMICOLON) {
      acceptIt();
      parseSingleCommand();
    }
  }

28 Example: Generation of parseSingleDeclaration
  single-declaration ::= const Identifier ~ Expression
                       | var Identifier : Type-denoter
Applying the rewrite rules step by step:
  private void parseSingleDeclaration() {
    switch (currentToken.kind) {
    case Token.CONST:
      parse const Identifier ~ Expression
    case Token.VAR:
      parse var Identifier : Type-denoter
    default: report syntax error
    }
  }
and finally
  private void parseSingleDeclaration() {
    switch (currentToken.kind) {
    case Token.CONST:
      acceptIt();
      parseIdentifier();
      accept(Token.IS);
      parseExpression();
      break;
    case Token.VAR:
      acceptIt();
      parseIdentifier();
      accept(Token.COLON);
      parseTypeDenoter();
      break;
    default: report syntax error
    }
  }

29 LL(1) Grammars The presented algorithm to convert EBNF into a parser does not work for all possible grammars. It only works for so called LL(1) grammars. What grammars are LL(1)? Basically, an LL(1) grammar is a grammar which can be parsed with a top-down parser with a lookahead (in the input stream of tokens) of one token. How can we recognize that a grammar is (or is not) LL(1)? There is a formal definition which we will skip for now We can deduce the necessary conditions from the parser generation algorithm. 29

30 LL(1) Grammars
  parse X*     =>    while (currentToken.kind is in starters[X]) {
                       parse X
                     }
  Condition: starters[X] must be disjoint from the set of tokens that can immediately follow X*.
  parse X | Y  =>    switch (currentToken.kind) {
                     cases in starters[X]:
                       parse X
                       break;
                     cases in starters[Y]:
                       parse Y
                       break;
                     default: report syntax error
                     }
  Condition: starters[X] and starters[Y] must be disjoint sets.

31 2) Starter Sets
Informal definition: the starter set of a regular expression (RE) X is the set of terminal symbols that can occur as the start of any string generated by X.
Example: starters[ (+ | - | ε)(0 | 1 | … | 9)* ] = {+, -, 0, 1, …, 9}
Formal definition:
  starters[ε]     = {}
  starters[t]     = {t}                           (where t is a terminal symbol)
  starters[X Y]   = starters[X] ∪ starters[Y]     (if X generates ε)
  starters[X Y]   = starters[X]                   (if X does not generate ε)
  starters[X | Y] = starters[X] ∪ starters[Y]
  starters[X*]    = starters[X]

32 2) Starter Sets (ctd)
Informal definition: the starter set of an RE can be generalized to extended BNF.
Formal definition:
  starters[N] = starters[X]     (for production rules N ::= X)
Example (considering only the Identifier and parenthesized alternatives of primary-expression):
  starters[Expression] = starters[PrimaryExp (Operator PrimaryExp)*]
                       = starters[PrimaryExp]
                       = starters[Identifier] ∪ starters[ "(" Expression ")" ]
                       = starters[a | b | c | ... | z] ∪ { ( }
                       = {a, b, c, …, z, (}
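The starter-set equations can be read as a recursive function over the structure of the regular expression. The sketch below shows one possible encoding in Java; the class name Ebnf, the factory methods (empty, t, seq, alt, star) and the helper generatesEmpty are hypothetical names chosen for this illustration. It covers plain regular expressions only; non-terminals (starters[N]) are not modelled.

```java
import java.util.Set;
import java.util.TreeSet;

// Sketch: starter sets computed by structural recursion over an RE.
abstract class Ebnf {
    abstract boolean generatesEmpty();   // can this expression generate ε?
    abstract Set<String> starters();     // returns a fresh set on every call

    // ε: starters[ε] = {}
    static Ebnf empty() {
        return new Ebnf() {
            boolean generatesEmpty() { return true; }
            Set<String> starters() { return new TreeSet<>(); }
        };
    }

    // terminal t: starters[t] = {t}
    static Ebnf t(final String terminal) {
        return new Ebnf() {
            boolean generatesEmpty() { return false; }
            Set<String> starters() {
                Set<String> s = new TreeSet<>();
                s.add(terminal);
                return s;
            }
        };
    }

    // X Y: starters[X], plus starters[Y] if X generates ε
    static Ebnf seq(final Ebnf x, final Ebnf y) {
        return new Ebnf() {
            boolean generatesEmpty() { return x.generatesEmpty() && y.generatesEmpty(); }
            Set<String> starters() {
                Set<String> s = x.starters();
                if (x.generatesEmpty()) s.addAll(y.starters());
                return s;
            }
        };
    }

    // X | Y: starters[X] ∪ starters[Y]
    static Ebnf alt(final Ebnf x, final Ebnf y) {
        return new Ebnf() {
            boolean generatesEmpty() { return x.generatesEmpty() || y.generatesEmpty(); }
            Set<String> starters() {
                Set<String> s = x.starters();
                s.addAll(y.starters());
                return s;
            }
        };
    }

    // X*: starters[X]; X* always generates ε
    static Ebnf star(final Ebnf x) {
        return new Ebnf() {
            boolean generatesEmpty() { return true; }
            Set<String> starters() { return x.starters(); }
        };
    }
}
```

Building the expression (+ | - | ε)(0 | 1 | … | 9)* as seq(alt(alt(t("+"), t("-")), empty()), star(digits)) and calling starters() gives the set containing +, - and the ten digits, because the sign part can generate ε and so the starters of the digit part are included.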

33 LL(1) grammars and left factorisation
The original Mini Triangle grammar is not LL(1). For example:
  single-command ::= V-name := Expression
                   | Identifier "(" Expression ")"
                   | ...
  V-name         ::= Identifier
  starters[V-name := Expression]           = starters[V-name] = starters[Identifier]
  starters[Identifier "(" Expression ")"]  = starters[Identifier]
NOT DISJOINT!

34 LL(1) grammars: left factorisation
What happens when we generate a RD parser from a non-LL(1) grammar?
  single-command ::= V-name := Expression
                   | Identifier "(" Expression ")"
                   | ...
  private void parseSingleCommand() {
    switch (currentToken.kind) {
    case Token.IDENTIFIER:
      parse V-name := Expression
    case Token.IDENTIFIER:
      parse Identifier "(" Expression ")"
    ... other cases ...
    default: report syntax error
    }
  }
Wrong: overlapping cases.

35 LL(1) grammars: left factorisation
  single-command ::= V-name := Expression
                   | Identifier "(" Expression ")"
                   | ...
Left factorisation (and substitution of V-name) gives:
  single-command ::= Identifier ( := Expression | "(" Expression ")" )
                   | ...

36 LL(1) Grammars: left recursion elimination
  Command ::= single-command
            | Command ; single-command
What happens if we don't perform left-recursion elimination?
  public void parseCommand() {
    switch (currentToken.kind) {
    case in starters[single-command]:
      parseSingleCommand();
    case in starters[Command]:
      parseCommand();
      accept(Token.SEMICOLON);
      parseSingleCommand();
    default: report syntax error
    }
  }
Wrong: overlapping cases. Moreover, the second case calls parseCommand() before consuming any token, so it would recurse forever.

37 LL(1) Grammars: left recursion elimination
  Command ::= single-command
            | Command ; single-command
Left recursion elimination gives:
  Command ::= single-command ( ; single-command )*

38 Systematic Development of RD Parser
(1) Express grammar in EBNF
(2) Grammar transformations: left factorization and left recursion elimination
(3) Create a parser class with
      a private variable currentToken
      methods to call the scanner: accept and acceptIt
(4) Implement private parsing methods:
      add a private parseN method for each non-terminal N
      add a public parse method that gets the first token from the scanner and calls parseS (S is the start symbol of the grammar)

39 Algorithm to convert EBNF into a RD parser
The conversion of an EBNF specification into a Java implementation for a recursive descent parser is straightforward. We can describe the algorithm by a set of mechanical rewrite rules (see slides 24 to 26):
  N ::= X    =>    private void parseN() {
                     parse X
                   }

42 LL(1) Grammars
The presented algorithm to convert EBNF into a parser does not work for all possible grammars. It only works for so-called LL(1) grammars.
What grammars are LL(1)? Basically, an LL(1) grammar is a grammar which can be parsed with a top-down parser with a lookahead (in the input stream of tokens) of one token.
How can we recognize that a grammar is (or is not) LL(1)? We can deduce the necessary conditions from the parser generation algorithm. There is also a formal definition we can use.

44 Formal definition of LL(1)
A grammar G is LL(1) iff for each set of productions M ::= X1 | X2 | … | Xn:
  1. starters[X1], starters[X2], …, starters[Xn] are all pairwise disjoint
  2. If Xi =>* ε then starters[Xj] ∩ follow[M] = ø, for 1 ≤ j ≤ n, i ≠ j
If G is ε-free then condition 1 is sufficient.

45 Derivation
What does Xi =>* ε mean? It means that there is a derivation from Xi leading to the empty string.
What is a derivation? A grammar has a derivation step
  αAβ => αγβ   iff   A → γ ∈ P     (sometimes written A ::= γ)
=>* is the reflexive, transitive closure of =>
Example: G = ({E}, {a, +, *, (, )}, P, E) where P = {E → E+E, E → E*E, E → a, E → (E)}
  E => E+E => E+E*E => a+E*E => a+E*a => a+a*a
  so E =>* a+a*a

46 Follow Sets
Follow(A) is the set of prefixes (of length one for LL(1)) of strings of terminals that can follow any derivation of A in G.
  $ ∈ follow(S), where S is the start symbol (sometimes written <eof> ∈ follow(S))
  if (B → αAβ) ∈ P, then every terminal in first(β) is in follow(A), and if β =>* ε then follow(B) ⊆ follow(A)
The definition of follow usually results in recursive set definitions. In order to solve them, you need to do several iterations on the equations.
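The "several iterations on the equations" can be sketched as a fixed-point loop: keep applying the rules to every production until no set grows any more. The code below is an illustration under stated assumptions, not a definitive implementation: the grammar is given in plain BNF (alternatives as space-separated symbol strings, with the empty string standing for ε), any symbol without a rule is treated as a terminal, and the names FollowSketch, rule, compute and firstOfRest are hypothetical. First sets (starter sets) and the set of non-terminals that can derive ε are computed in the same loop because follow depends on them.

```java
import java.util.*;

// Sketch: first and follow sets by iterating the set equations to a fixed point.
class FollowSketch {
    final Map<String, List<List<String>>> productions = new LinkedHashMap<>();
    final Map<String, Set<String>> first = new HashMap<>();
    final Map<String, Set<String>> follow = new HashMap<>();
    final Set<String> nullable = new HashSet<>();   // non-terminals N with N =>* ε

    // Adds the rule lhs ::= alt1 | alt2 | ...; "" denotes the ε alternative.
    void rule(String lhs, String... alternatives) {
        List<List<String>> alts = new ArrayList<>();
        for (String alt : alternatives)
            alts.add(alt.isEmpty() ? new ArrayList<String>() : Arrays.asList(alt.split(" ")));
        productions.put(lhs, alts);
    }

    boolean isNonTerminal(String s) { return productions.containsKey(s); }

    // Adds first(rhs[from..]) to target; returns true if rhs[from..] can derive ε.
    private boolean firstOfRest(List<String> rhs, int from, Set<String> target) {
        for (int i = from; i < rhs.size(); i++) {
            String sym = rhs.get(i);
            if (!isNonTerminal(sym)) { target.add(sym); return false; }
            target.addAll(first.get(sym));
            if (!nullable.contains(sym)) return false;
        }
        return true;
    }

    void compute(String start) {
        for (String n : productions.keySet()) {
            first.put(n, new TreeSet<String>());
            follow.put(n, new TreeSet<String>());
        }
        follow.get(start).add("$");                 // $ ∈ follow(S)
        boolean changed = true;
        while (changed) {                           // iterate until nothing grows
            changed = false;
            for (Map.Entry<String, List<List<String>>> e : productions.entrySet()) {
                String lhs = e.getKey();
                for (List<String> rhs : e.getValue()) {
                    // first(lhs) ⊇ first(rhs); lhs is nullable if rhs is
                    Set<String> f = new TreeSet<>();
                    boolean eps = firstOfRest(rhs, 0, f);
                    changed |= first.get(lhs).addAll(f);
                    if (eps) changed |= nullable.add(lhs);
                    // for B → α A β: first(β) ⊆ follow(A), and follow(B) ⊆ follow(A) if β =>* ε
                    for (int i = 0; i < rhs.size(); i++) {
                        String a = rhs.get(i);
                        if (!isNonTerminal(a)) continue;
                        Set<String> add = new TreeSet<>();
                        if (firstOfRest(rhs, i + 1, add)) add.addAll(follow.get(lhs));
                        changed |= follow.get(a).addAll(add);
                    }
                }
            }
        }
    }
}
```

As a usage example, for the grammar E ::= T E', E' ::= + T E' | ε, T ::= a | ( E ), entered with rule("E", "T E'"), rule("E'", "+ T E'", "") and rule("T", "a", "( E )") and followed by compute("E"), the loop yields follow(E) = follow(E') = {$, )} and follow(T) = {$, ), +}.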

47 A few provable facts about LL(1) grammars
  No left-recursive grammar is LL(1)
  No ambiguous grammar is LL(1)
  Some languages have no LL(1) grammar
  An ε-free grammar, where each alternative Xj of N ::= X1 | … | Xn begins with a distinct terminal, is a simple LL(1) grammar

48 Converting EBNF into RD parsers The conversion of an EBNF specification into a Java implementation for a recursive descent parser is so mechanical that it can easily be automated! => JavaCC Java Compiler Compiler 48

49 JavaCC
  JavaCC is a parser generator
  JavaCC can be thought of as Lex and Yacc for implementing parsers in Java
  JavaCC is based on LL(k) grammars
  JavaCC transforms an EBNF grammar into an LL(k) parser
  The lookahead can be changed by writing LOOKAHEAD(…)
  A JavaCC grammar can have action code written in Java embedded in it
  JavaCC has a companion called JJTree which can be used to generate an abstract syntax tree

50 JavaCC and JJTree
JavaCC is a parser generator
  Inputs a set of token definitions, grammar and actions
  Outputs a Java program which
    performs lexical analysis (finding tokens)
    parses the tokens according to the grammar
    executes actions
JJTree is a preprocessor for JavaCC
  Inputs a grammar file
  Inserts tree building actions
  Outputs a JavaCC grammar file
From this you can add code to traverse the tree to do static analysis, code generation or interpretation.

51 JavaCC and JJTree 51

52 JavaCC input format
One file with extension .jj containing
  Header
  Token specifications
  Grammar
Example:
  TOKEN:
  {
    <INTEGER_LITERAL: (["1"-"9"](["0"-"9"])* | "0")>
  }
  void StatementListReturn() :
  {}
  {
    (Statement())* "return" Expression() ";"
  }

53 JavaCC token specifications use regular expressions
Characters and strings must be quoted: ";", "int", "while"
Character lists [...] are shorthand for |
  ["a"-"z"] matches "a" | "b" | "c" | … | "z"
  ["a","e","i","o","u"] matches any vowel
  ["a"-"z","A"-"Z"] matches any letter
Repetition shorthand with * and +
  (["a"-"z","A"-"Z"])* matches zero or more letters
  (["a"-"z","A"-"Z"])+ matches one or more letters
Shorthand with ? provides for optionals:
  ("+"|"-")?(["0"-"9"])+ matches signed and unsigned integers
Tokens can be named
  TOKEN : { <IDENTIFIER: <LETTER>(<LETTER>|<DIGIT>)*> }
  TOKEN : { <LETTER: ["a"-"z","A"-"Z"]> | <DIGIT: ["0"-"9"]> }
Now <IDENTIFIER> can be used in defining syntax

54 A bigger example
  options { LOOKAHEAD=2; }
  PARSER_BEGIN(Arithmetic)
  public class Arithmetic { }
  PARSER_END(Arithmetic)
  SKIP : { " " | "\r" | "\t" }
  TOKEN:
  {
    < NUMBER: (<DIGIT>)+ ( "." (<DIGIT>)+ )? >
  | < DIGIT: ["0"-"9"] >
  }
  void expr(): {}
  {
    term() ( "+" expr() | "-" expr() )*
  }
  void term(): {}
  {
    unary() ( "*" term() | "/" term() )*
  }
  void unary(): {}
  {
    "-" element() | element()
  }
  void element(): {}
  {
    <NUMBER> | "(" expr() ")"
  }

55 Generating a parser with JavaCC
  javacc filename.jj     generates a parser with the specified name (lots of .java files)
  javac *.java           compiles all the .java files
Note that the parser doesn't do anything on its own. You have to either
  add actions to the grammar by hand,
  use JJTree to generate actions for building an AST, or
  use JTB to generate an AST and visitors

56 Adding Actions by hand
  options { LOOKAHEAD=2; }
  PARSER_BEGIN(Calculator)
  public class Calculator {
    public static void main(String args[]) throws ParseException {
      Calculator parser = new Calculator(System.in);
      while (true) {
        parser.parseOneLine();
      }
    }
  }
  PARSER_END(Calculator)
  SKIP : { " " | "\r" | "\t" }
  TOKEN:
  {
    < NUMBER: (<DIGIT>)+ ( "." (<DIGIT>)+ )? >
  | < DIGIT: ["0"-"9"] >
  | < EOL: "\n" >
  }
  void parseOneLine():
  { double a; }
  {
      a=expr() <EOL> { System.out.println(a); }
    | <EOL>
    | <EOF> { System.exit(-1); }
  }

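57 Adding Actions by hand (ctd.)
  double expr():
  { double a; double b; }
  {
    a=term()
    (   "+" b=expr() { a += b; }
      | "-" b=expr() { a -= b; }
    )*
    { return a; }
  }
  double term():
  { double a; double b; }
  {
    a=unary()
    (   "*" b=term() { a *= b; }
      | "/" b=term() { a /= b; }
    )*
    { return a; }
  }
  double unary():
  { double a; }
  {
      "-" a=element() { return -a; }
    | a=element() { return a; }
  }
  double element():
  { Token t; double a; }
  {
      t=<NUMBER> { return Double.parseDouble(t.toString()); }
    | "(" a=expr() ")" { return a; }
  }
Note that because the right operand is parsed by expr() (respectively term()), "-" and "/" associate to the right in this grammar: 1-2-3 is evaluated as 1-(2-3).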

58 Using JJTree
JJTree is a preprocessor for JavaCC
JJTree transforms a bare JavaCC grammar into a grammar with embedded Java code for building an AST
  The interface Node and the class SimpleNode are generated
  It can also generate classes for each type of node
All AST nodes implement the interface Node
  Useful methods provided include:
    public int jjtGetNumChildren()   returns the number of children
    public Node jjtGetChild(int i)   returns the i'th child
  The state is in a parser field called jjtree
  The root is at Node rootNode()
  You can display the tree with ((SimpleNode)parser.jjtree.rootNode()).dump(" ");
JJTree supports the building of abstract syntax trees which can be traversed using the visitor design pattern

59 JTB
JTB (Java Tree Builder) is an alternative to JJTree. It takes a plain JavaCC grammar file as input and automatically generates the following:
  A set of syntax tree classes based on the productions in the grammar, utilizing the Visitor design pattern.
  Two interfaces: Visitor and ObjectVisitor.
  Two depth-first visitors: DepthFirstVisitor and ObjectDepthFirst, whose default methods simply visit the children of the current node.
  A JavaCC grammar with the proper annotations to build the syntax tree during parsing.
New visitors, which subclass DepthFirstVisitor or ObjectDepthFirst, can then override the default methods and perform various operations on and manipulate the generated syntax tree.

60 The Visitor Pattern
For object-oriented programming, the visitor pattern enables the definition of a new operation on an object structure without changing the classes of the objects.
When using the visitor pattern:
  The set of classes must be fixed in advance
  Each class must have an accept method
  Each accept method takes a visitor as argument
  The purpose of the accept method is to invoke the visitor method which can handle the current object
  A visitor contains a visit method for each class (overloading)
  A visit method for class C takes an argument of type C
The advantage of visitors: new operations can be added without recompiling the visited classes!
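A minimal Java sketch of the pattern for a tiny expression tree may help here. The classes Expr, Num and Add and the two visitors are hypothetical names invented for this illustration; they are not the classes that JJTree or JTB generate.

```java
// Sketch of the visitor pattern on a two-class AST.
// One overloaded visit method per node class.
interface ExprVisitor<R> {
    R visit(Num n);
    R visit(Add a);
}

// The fixed set of node classes; each has an accept method taking a visitor.
abstract class Expr {
    abstract <R> R accept(ExprVisitor<R> v);
}

class Num extends Expr {
    final int value;
    Num(int value) { this.value = value; }
    // Invokes the visit method that handles Num (chosen by overloading on 'this').
    <R> R accept(ExprVisitor<R> v) { return v.visit(this); }
}

class Add extends Expr {
    final Expr left, right;
    Add(Expr left, Expr right) { this.left = left; this.right = right; }
    <R> R accept(ExprVisitor<R> v) { return v.visit(this); }
}

// A first operation: evaluation.
class EvalVisitor implements ExprVisitor<Integer> {
    public Integer visit(Num n) { return n.value; }
    public Integer visit(Add a) { return a.left.accept(this) + a.right.accept(this); }
}

// A second operation, added without touching Expr, Num or Add.
class PrintVisitor implements ExprVisitor<String> {
    public String visit(Num n) { return Integer.toString(n.value); }
    public String visit(Add a) {
        return "(" + a.left.accept(this) + " + " + a.right.accept(this) + ")";
    }
}
```

With Expr e = new Add(new Num(1), new Add(new Num(2), new Num(3))), the call e.accept(new EvalVisitor()) evaluates the tree and e.accept(new PrintVisitor()) renders it as text. The flip side of the design is that adding a new node class requires a new visit method in every visitor, which is why the set of classes must be fixed in advance.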

61 Summary: Parser Generator Tools JavaCC is a Parser Generator Tool It can be thought of as Lex and Yacc for Java There are several other Parser Generator tools for Java We shall look at some of them later Beware! To use Parser Generator Tools efficiently you need to understand what they do and what the code they produce does Note, there can be bugs in the tools and/or in the code they generate! 61

62 So what can I do with this in my project?
Language Design
  What type of language are you designing?
    C-like, Java-like, scripting language, functional, ...
    Does it fit with existing paradigms/generations?
  Which language design criteria to emphasise?
    Readability, writability, orthogonality, ...
  What is the syntax?
    Informal description
    CFG in EBNF (human readable, i.e. may be ambiguous)
    LL(1) or LALR(1) grammar
    Grammar transformations: left factorization, left-recursion elimination, substitution
Compiler Implementation
  Recursive Descent Parser
  (Tool-generated Parser)


More information

Monday, September 13, Parsers

Monday, September 13, Parsers Parsers Agenda Terminology LL(1) Parsers Overview of LR Parsing Terminology Grammar G = (Vt, Vn, S, P) Vt is the set of terminals Vn is the set of non-terminals S is the start symbol P is the set of productions

More information

Grammars and Parsing. Paul Klint. Grammars and Parsing

Grammars and Parsing. Paul Klint. Grammars and Parsing Paul Klint Grammars and Languages are one of the most established areas of Natural Language Processing and Computer Science 2 N. Chomsky, Aspects of the theory of syntax, 1965 3 A Language...... is a (possibly

More information

Chapter 4. Lexical and Syntax Analysis

Chapter 4. Lexical and Syntax Analysis Chapter 4 Lexical and Syntax Analysis Chapter 4 Topics Introduction Lexical Analysis The Parsing Problem Recursive-Descent Parsing Bottom-Up Parsing Copyright 2012 Addison-Wesley. All rights reserved.

More information

Syntax Analysis. The Big Picture. The Big Picture. COMP 524: Programming Languages Srinivas Krishnan January 25, 2011

Syntax Analysis. The Big Picture. The Big Picture. COMP 524: Programming Languages Srinivas Krishnan January 25, 2011 Syntax Analysis COMP 524: Programming Languages Srinivas Krishnan January 25, 2011 Based in part on slides and notes by Bjoern Brandenburg, S. Olivier and A. Block. 1 The Big Picture Character Stream Token

More information

Chapter 3. Parsing #1

Chapter 3. Parsing #1 Chapter 3 Parsing #1 Parser source file get next character scanner get token parser AST token A parser recognizes sequences of tokens according to some grammar and generates Abstract Syntax Trees (ASTs)

More information

COP 3402 Systems Software Top Down Parsing (Recursive Descent)

COP 3402 Systems Software Top Down Parsing (Recursive Descent) COP 3402 Systems Software Top Down Parsing (Recursive Descent) Top Down Parsing 1 Outline 1. Top down parsing and LL(k) parsing 2. Recursive descent parsing 3. Example of recursive descent parsing of arithmetic

More information

Compiler construction in4303 lecture 3

Compiler construction in4303 lecture 3 Compiler construction in4303 lecture 3 Top-down parsing Chapter 2.2-2.2.4 Overview syntax analysis: tokens AST program text lexical analysis language grammar parser generator tokens syntax analysis AST

More information

CSE 130 Programming Language Principles & Paradigms Lecture # 5. Chapter 4 Lexical and Syntax Analysis

CSE 130 Programming Language Principles & Paradigms Lecture # 5. Chapter 4 Lexical and Syntax Analysis Chapter 4 Lexical and Syntax Analysis Introduction - Language implementation systems must analyze source code, regardless of the specific implementation approach - Nearly all syntax analysis is based on

More information

LL(k) Compiler Construction. Top-down Parsing. LL(1) parsing engine. LL engine ID, $ S 0 E 1 T 2 3

LL(k) Compiler Construction. Top-down Parsing. LL(1) parsing engine. LL engine ID, $ S 0 E 1 T 2 3 LL(k) Compiler Construction More LL parsing Abstract syntax trees Lennart Andersson Revision 2011 01 31 2010 Related names top-down the parse tree is constructed top-down recursive descent if it is implemented

More information

Parsers. Xiaokang Qiu Purdue University. August 31, 2018 ECE 468

Parsers. Xiaokang Qiu Purdue University. August 31, 2018 ECE 468 Parsers Xiaokang Qiu Purdue University ECE 468 August 31, 2018 What is a parser A parser has two jobs: 1) Determine whether a string (program) is valid (think: grammatically correct) 2) Determine the structure

More information

CPS 506 Comparative Programming Languages. Syntax Specification

CPS 506 Comparative Programming Languages. Syntax Specification CPS 506 Comparative Programming Languages Syntax Specification Compiling Process Steps Program Lexical Analysis Convert characters into a stream of tokens Lexical Analysis Syntactic Analysis Send tokens

More information

EDAN65: Compilers, Lecture 04 Grammar transformations: Eliminating ambiguities, adapting to LL parsing. Görel Hedin Revised:

EDAN65: Compilers, Lecture 04 Grammar transformations: Eliminating ambiguities, adapting to LL parsing. Görel Hedin Revised: EDAN65: Compilers, Lecture 04 Grammar transformations: Eliminating ambiguities, adapting to LL parsing Görel Hedin Revised: 2017-09-04 This lecture Regular expressions Context-free grammar Attribute grammar

More information

CS 314 Principles of Programming Languages

CS 314 Principles of Programming Languages CS 314 Principles of Programming Languages Lecture 5: Syntax Analysis (Parsing) Zheng (Eddy) Zhang Rutgers University January 31, 2018 Class Information Homework 1 is being graded now. The sample solution

More information

Automated Tools. The Compilation Task. Automated? Automated? Easier ways to create parsers. The final stages of compilation are language dependant

Automated Tools. The Compilation Task. Automated? Automated? Easier ways to create parsers. The final stages of compilation are language dependant Automated Tools Easier ways to create parsers The Compilation Task Input character stream Lexer Token stream Parser Abstract Syntax Tree Analyser Annotated AST Code Generator Code CC&P 2003 1 CC&P 2003

More information

LL(k) Compiler Construction. Choice points in EBNF grammar. Left recursive grammar

LL(k) Compiler Construction. Choice points in EBNF grammar. Left recursive grammar LL(k) Compiler Construction More LL parsing Abstract syntax trees Lennart Andersson Revision 2012 01 31 2012 Related names top-down the parse tree is constructed top-down recursive descent if it is implemented

More information

EDA180: Compiler Construc6on. Top- down parsing. Görel Hedin Revised: a

EDA180: Compiler Construc6on. Top- down parsing. Görel Hedin Revised: a EDA180: Compiler Construc6on Top- down parsing Görel Hedin Revised: 2013-01- 30a Compiler phases and program representa6ons source code Lexical analysis (scanning) Intermediate code genera6on tokens intermediate

More information

Programming Language Specification and Translation. ICOM 4036 Fall Lecture 3

Programming Language Specification and Translation. ICOM 4036 Fall Lecture 3 Programming Language Specification and Translation ICOM 4036 Fall 2009 Lecture 3 Some parts are Copyright 2004 Pearson Addison-Wesley. All rights reserved. 3-1 Language Specification and Translation Topics

More information

ICOM 4036 Spring 2004

ICOM 4036 Spring 2004 Language Specification and Translation ICOM 4036 Spring 2004 Lecture 3 Copyright 2004 Pearson Addison-Wesley. All rights reserved. 3-1 Language Specification and Translation Topics Structure of a Compiler

More information

Defining Program Syntax. Chapter Two Modern Programming Languages, 2nd ed. 1

Defining Program Syntax. Chapter Two Modern Programming Languages, 2nd ed. 1 Defining Program Syntax Chapter Two Modern Programming Languages, 2nd ed. 1 Syntax And Semantics Programming language syntax: how programs look, their form and structure Syntax is defined using a kind

More information

Abstract Syntax Trees & Top-Down Parsing

Abstract Syntax Trees & Top-Down Parsing Review of Parsing Abstract Syntax Trees & Top-Down Parsing Given a language L(G), a parser consumes a sequence of tokens s and produces a parse tree Issues: How do we recognize that s L(G)? A parse tree

More information

Top down vs. bottom up parsing

Top down vs. bottom up parsing Parsing A grammar describes the strings that are syntactically legal A recogniser simply accepts or rejects strings A generator produces sentences in the language described by the grammar A parser constructs

More information

Abstract Syntax Trees & Top-Down Parsing

Abstract Syntax Trees & Top-Down Parsing Abstract Syntax Trees & Top-Down Parsing Review of Parsing Given a language L(G), a parser consumes a sequence of tokens s and produces a parse tree Issues: How do we recognize that s L(G)? A parse tree

More information

Abstract Syntax Trees & Top-Down Parsing

Abstract Syntax Trees & Top-Down Parsing Review of Parsing Abstract Syntax Trees & Top-Down Parsing Given a language L(G), a parser consumes a sequence of tokens s and produces a parse tree Issues: How do we recognize that s L(G)? A parse tree

More information

Compiler construction lecture 3

Compiler construction lecture 3 Compiler construction in4303 lecture 3 Top-down parsing Chapter 2.2-2.2.4 Overview syntax analysis: tokens AST language grammar parser generator program text lexical analysis tokens syntax analysis AST

More information

Syntax Analysis, III Comp 412

Syntax Analysis, III Comp 412 COMP 412 FALL 2017 Syntax Analysis, III Comp 412 source code IR Front End Optimizer Back End IR target code Copyright 2017, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp

More information

EDAN65: Compilers, Lecture 06 A LR parsing. Görel Hedin Revised:

EDAN65: Compilers, Lecture 06 A LR parsing. Görel Hedin Revised: EDAN65: Compilers, Lecture 06 A LR parsing Görel Hedin Revised: 2017-09-11 This lecture Regular expressions Context-free grammar Attribute grammar Lexical analyzer (scanner) Syntactic analyzer (parser)

More information

Chapter 4. Lexical and Syntax Analysis. Topics. Compilation. Language Implementation. Issues in Lexical and Syntax Analysis.

Chapter 4. Lexical and Syntax Analysis. Topics. Compilation. Language Implementation. Issues in Lexical and Syntax Analysis. Topics Chapter 4 Lexical and Syntax Analysis Introduction Lexical Analysis Syntax Analysis Recursive -Descent Parsing Bottom-Up parsing 2 Language Implementation Compilation There are three possible approaches

More information

Building a Parser III. CS164 3:30-5:00 TT 10 Evans. Prof. Bodik CS 164 Lecture 6 1

Building a Parser III. CS164 3:30-5:00 TT 10 Evans. Prof. Bodik CS 164 Lecture 6 1 Building a Parser III CS164 3:30-5:00 TT 10 Evans 1 Overview Finish recursive descent parser when it breaks down and how to fix it eliminating left recursion reordering productions Predictive parsers (aka

More information

CS Parsing 1

CS Parsing 1 CS414-20034-03 Parsing 1 03-0: Parsing Once we have broken an input file into a sequence of tokens, the next step is to determine if that sequence of tokens forms a syntactically correct program parsing

More information

Context-free grammars (CFG s)

Context-free grammars (CFG s) Syntax Analysis/Parsing Purpose: determine if tokens have the right form for the language (right syntactic structure) stream of tokens abstract syntax tree (AST) AST: captures hierarchical structure of

More information

CS502: Compilers & Programming Systems

CS502: Compilers & Programming Systems CS502: Compilers & Programming Systems Top-down Parsing Zhiyuan Li Department of Computer Science Purdue University, USA There exist two well-known schemes to construct deterministic top-down parsers:

More information

Question Bank. 10CS63:Compiler Design

Question Bank. 10CS63:Compiler Design Question Bank 10CS63:Compiler Design 1.Determine whether the following regular expressions define the same language? (ab)* and a*b* 2.List the properties of an operator grammar 3. Is macro processing a

More information

Prelude COMP 181 Tufts University Computer Science Last time Grammar issues Key structure meaning Tufts University Computer Science

Prelude COMP 181 Tufts University Computer Science Last time Grammar issues Key structure meaning Tufts University Computer Science Prelude COMP Lecture Topdown Parsing September, 00 What is the Tufts mascot? Jumbo the elephant Why? P. T. Barnum was an original trustee of Tufts : donated $0,000 for a natural museum on campus Barnum

More information

Object-oriented Compiler Construction

Object-oriented Compiler Construction 1 Object-oriented Compiler Construction Extended Abstract Axel-Tobias Schreiner, Bernd Kühl University of Osnabrück, Germany {axel,bekuehl}@uos.de, http://www.inf.uos.de/talks/hc2 A compiler takes a program

More information

CS 406/534 Compiler Construction Parsing Part I

CS 406/534 Compiler Construction Parsing Part I CS 406/534 Compiler Construction Parsing Part I Prof. Li Xu Dept. of Computer Science UMass Lowell Fall 2004 Part of the course lecture notes are based on Prof. Keith Cooper, Prof. Ken Kennedy and Dr.

More information

Syntax Analysis, III Comp 412

Syntax Analysis, III Comp 412 Updated algorithm for removal of indirect left recursion to match EaC3e (3/2018) COMP 412 FALL 2018 Midterm Exam: Thursday October 18, 7PM Herzstein Amphitheater Syntax Analysis, III Comp 412 source code

More information

Parser. Larissa von Witte. 11. Januar Institut für Softwaretechnik und Programmiersprachen. L. v. Witte 11. Januar /23

Parser. Larissa von Witte. 11. Januar Institut für Softwaretechnik und Programmiersprachen. L. v. Witte 11. Januar /23 Parser Larissa von Witte Institut für oftwaretechnik und Programmiersprachen 11. Januar 2016 L. v. Witte 11. Januar 2016 1/23 Contents Introduction Taxonomy Recursive Descent Parser hift Reduce Parser

More information

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou COMP-421 Compiler Design Presented by Dr Ioanna Dionysiou Administrative! Any questions about the syllabus?! Course Material available at www.cs.unic.ac.cy/ioanna! Next time reading assignment [ALSU07]

More information

Part III : Parsing. From Regular to Context-Free Grammars. Deriving a Parser from a Context-Free Grammar. Scanners and Parsers.

Part III : Parsing. From Regular to Context-Free Grammars. Deriving a Parser from a Context-Free Grammar. Scanners and Parsers. Part III : Parsing From Regular to Context-Free Grammars Deriving a Parser from a Context-Free Grammar Scanners and Parsers A Parser for EBNF Left-Parsable Grammars Martin Odersky, LAMP/DI 1 From Regular

More information

Parsing Techniques. CS152. Chris Pollett. Sep. 24, 2008.

Parsing Techniques. CS152. Chris Pollett. Sep. 24, 2008. Parsing Techniques. CS152. Chris Pollett. Sep. 24, 2008. Outline. Top-down versus Bottom-up Parsing. Recursive Descent Parsing. Left Recursion Removal. Left Factoring. Predictive Parsing. Introduction.

More information

Introduction to Programming Using Java (98-388)

Introduction to Programming Using Java (98-388) Introduction to Programming Using Java (98-388) Understand Java fundamentals Describe the use of main in a Java application Signature of main, why it is static; how to consume an instance of your own class;

More information

CMSC 330: Organization of Programming Languages

CMSC 330: Organization of Programming Languages CMSC 330: Organization of Programming Languages Context Free Grammars and Parsing 1 Recall: Architecture of Compilers, Interpreters Source Parser Static Analyzer Intermediate Representation Front End Back

More information

Configuration Sets for CSX- Lite. Parser Action Table

Configuration Sets for CSX- Lite. Parser Action Table Configuration Sets for CSX- Lite State s 6 s 7 Cofiguration Set Prog { Stmts } Eof Stmts Stmt Stmts State s s Cofiguration Set Prog { Stmts } Eof Prog { Stmts } Eof Stmts Stmt Stmts Stmts λ Stmt if ( Expr

More information

COP4020 Programming Languages. Syntax Prof. Robert van Engelen

COP4020 Programming Languages. Syntax Prof. Robert van Engelen COP4020 Programming Languages Syntax Prof. Robert van Engelen Overview Tokens and regular expressions Syntax and context-free grammars Grammar derivations More about parse trees Top-down and bottom-up

More information

The Parsing Problem (cont d) Recursive-Descent Parsing. Recursive-Descent Parsing (cont d) ICOM 4036 Programming Languages. The Complexity of Parsing

The Parsing Problem (cont d) Recursive-Descent Parsing. Recursive-Descent Parsing (cont d) ICOM 4036 Programming Languages. The Complexity of Parsing ICOM 4036 Programming Languages Lexical and Syntax Analysis Lexical Analysis The Parsing Problem Recursive-Descent Parsing Bottom-Up Parsing This lecture covers review questions 14-27 This lecture covers

More information

November Grammars and Parsing for SSC1. Hayo Thielecke. Introduction. Grammars. From grammars to code. Parser generators.

November Grammars and Parsing for SSC1. Hayo Thielecke. Introduction. Grammars. From grammars to code. Parser generators. and and November 2008 methods CC and Outline of the parsing part of the module 1 2 3 methods 4 CC and 5 and methods CC and Motivation: a challenge Write a program that reads strings like this and evaluates

More information

Parsing Part II (Top-down parsing, left-recursion removal)

Parsing Part II (Top-down parsing, left-recursion removal) Parsing Part II (Top-down parsing, left-recursion removal) Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University have explicit

More information

Simple LR (SLR) LR(0) Drawbacks LR(1) SLR Parse. LR(1) Start State and Reduce. LR(1) Items 10/3/2012

Simple LR (SLR) LR(0) Drawbacks LR(1) SLR Parse. LR(1) Start State and Reduce. LR(1) Items 10/3/2012 LR(0) Drawbacks Consider the unambiguous augmented grammar: 0.) S E $ 1.) E T + E 2.) E T 3.) T x If we build the LR(0) DFA table, we find that there is a shift-reduce conflict. This arises because the

More information

ECE251 Midterm practice questions, Fall 2010

ECE251 Midterm practice questions, Fall 2010 ECE251 Midterm practice questions, Fall 2010 Patrick Lam October 20, 2010 Bootstrapping In particular, say you have a compiler from C to Pascal which runs on x86, and you want to write a self-hosting Java

More information

Outline CS412/413. Administrivia. Review. Grammars. Left vs. Right Recursion. More tips forll(1) grammars Bottom-up parsing LR(0) parser construction

Outline CS412/413. Administrivia. Review. Grammars. Left vs. Right Recursion. More tips forll(1) grammars Bottom-up parsing LR(0) parser construction C12/1 Introduction to Compilers and Translators pring 00 Outline More tips forll1) grammars Bottom-up parsing LR0) parser construction Lecture 5: Bottom-up parsing Lecture 5 C 12/1 pring '00 Andrew Myers

More information

10/4/18. Lexical and Syntactic Analysis. Lexical and Syntax Analysis. Tokenizing Source. Scanner. Reasons to Separate Lexical and Syntactic Analysis

10/4/18. Lexical and Syntactic Analysis. Lexical and Syntax Analysis. Tokenizing Source. Scanner. Reasons to Separate Lexical and Syntactic Analysis Lexical and Syntactic Analysis Lexical and Syntax Analysis In Text: Chapter 4 Two steps to discover the syntactic structure of a program Lexical analysis (Scanner): to read the input characters and output

More information

COP4020 Programming Languages. Syntax Prof. Robert van Engelen

COP4020 Programming Languages. Syntax Prof. Robert van Engelen COP4020 Programming Languages Syntax Prof. Robert van Engelen Overview n Tokens and regular expressions n Syntax and context-free grammars n Grammar derivations n More about parse trees n Top-down and

More information

Lexical and Syntax Analysis. Top-Down Parsing

Lexical and Syntax Analysis. Top-Down Parsing Lexical and Syntax Analysis Top-Down Parsing Easy for humans to write and understand String of characters Lexemes identified String of tokens Easy for programs to transform Data structure Syntax A syntax

More information

Bottom-Up Parsing. Lecture 11-12

Bottom-Up Parsing. Lecture 11-12 Bottom-Up Parsing Lecture 11-12 (From slides by G. Necula & R. Bodik) 9/22/06 Prof. Hilfinger CS164 Lecture 11 1 Bottom-Up Parsing Bottom-up parsing is more general than topdown parsing And just as efficient

More information

Building Compilers with Phoenix

Building Compilers with Phoenix Building Compilers with Phoenix Syntax-Directed Translation Structure of a Compiler Character Stream Intermediate Representation Lexical Analyzer Machine-Independent Optimizer token stream Intermediate

More information

CMSC 330: Organization of Programming Languages

CMSC 330: Organization of Programming Languages CMSC 330: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Analyzer Optimizer Code Generator Abstract Syntax Tree Front End Back End Compiler

More information

COP 3402 Systems Software Syntax Analysis (Parser)

COP 3402 Systems Software Syntax Analysis (Parser) COP 3402 Systems Software Syntax Analysis (Parser) Syntax Analysis 1 Outline 1. Definition of Parsing 2. Context Free Grammars 3. Ambiguous/Unambiguous Grammars Syntax Analysis 2 Lexical and Syntax Analysis

More information

Computer Science 160 Translation of Programming Languages

Computer Science 160 Translation of Programming Languages Computer Science 160 Translation of Programming Languages Instructor: Christopher Kruegel Top-Down Parsing Parsing Techniques Top-down parsers (LL(1), recursive descent parsers) Start at the root of the

More information

Grammars and Parsing for SSC1

Grammars and Parsing for SSC1 Grammars and Parsing for SSC1 Hayo Thielecke http://www.cs.bham.ac.uk/~hxt/parsing.html 1 Introduction Outline of the parsing part of the module Contents 1 Introduction 1 2 Grammars and derivations 2 3

More information

CMSC 330: Organization of Programming Languages. Context Free Grammars

CMSC 330: Organization of Programming Languages. Context Free Grammars CMSC 330: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Analyzer Optimizer Code Generator Abstract Syntax Tree Front End Back End Compiler

More information

Parsing. Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP 412 at Rice.

Parsing. Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP 412 at Rice. Parsing Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP 412 at Rice. Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students

More information

CS415 Compilers. Syntax Analysis. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

CS415 Compilers. Syntax Analysis. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University CS415 Compilers Syntax Analysis These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University Limits of Regular Languages Advantages of Regular Expressions

More information

CSE 401 Midterm Exam Sample Solution 2/11/15

CSE 401 Midterm Exam Sample Solution 2/11/15 Question 1. (10 points) Regular expression warmup. For regular expression questions, you must restrict yourself to the basic regular expression operations covered in class and on homework assignments:

More information

Syntax and Grammars 1 / 21

Syntax and Grammars 1 / 21 Syntax and Grammars 1 / 21 Outline What is a language? Abstract syntax and grammars Abstract syntax vs. concrete syntax Encoding grammars as Haskell data types What is a language? 2 / 21 What is a language?

More information

CSE443 Compilers. Dr. Carl Alphonce 343 Davis Hall

CSE443 Compilers. Dr. Carl Alphonce 343 Davis Hall CSE443 Compilers Dr. Carl Alphonce alphonce@buffalo.edu 343 Davis Hall Phases of a compiler Syntactic structure Figure 1.6, page 5 of text Recap Lexical analysis: LEX/FLEX (regex -> lexer) Syntactic analysis:

More information

Administrativia. WA1 due on Thu PA2 in a week. Building a Parser III. Slides on the web site. CS164 3:30-5:00 TT 10 Evans.

Administrativia. WA1 due on Thu PA2 in a week. Building a Parser III. Slides on the web site. CS164 3:30-5:00 TT 10 Evans. Administrativia Building a Parser III CS164 3:30-5:00 10 vans WA1 due on hu PA2 in a week Slides on the web site I do my best to have slides ready and posted by the end of the preceding logical day yesterday,

More information

10/5/17. Lexical and Syntactic Analysis. Lexical and Syntax Analysis. Tokenizing Source. Scanner. Reasons to Separate Lexical and Syntax Analysis

10/5/17. Lexical and Syntactic Analysis. Lexical and Syntax Analysis. Tokenizing Source. Scanner. Reasons to Separate Lexical and Syntax Analysis Lexical and Syntactic Analysis Lexical and Syntax Analysis In Text: Chapter 4 Two steps to discover the syntactic structure of a program Lexical analysis (Scanner): to read the input characters and output

More information

Building Compilers with Phoenix

Building Compilers with Phoenix Building Compilers with Phoenix Parser Generators: ANTLR History of ANTLR ANother Tool for Language Recognition Terence Parr's dissertation: Obtaining Practical Variants of LL(k) and LR(k) for k > 1 PCCTS:

More information

LL parsing Nullable, FIRST, and FOLLOW

LL parsing Nullable, FIRST, and FOLLOW EDAN65: Compilers LL parsing Nullable, FIRST, and FOLLOW Görel Hedin Revised: 2014-09- 22 Regular expressions Context- free grammar ATribute grammar Lexical analyzer (scanner) SyntacKc analyzer (parser)

More information