CS 230 Programming Languages

CS 230 Programming Languages 10 / 16 / 2013 Instructor: Michael Eckmann

Today's Topics
Questions/comments?
Top Down / Recursive Descent Parsers

Top Down Parsers
We have a left sentential form xAα, where A is the leftmost nonterminal. Expand A using a production that has A as its LHS. How might we do this? We would call the lexical analyzer to get the next lexeme, then determine which production to expand based on the lexeme/token returned. For grammars that need only one token of lookahead, exactly one RHS of A will have as its first symbol the lexeme/token returned. If there is no RHS of A with its first symbol being the lexeme/token returned, what do we think occurred?

Top Down Parsers
Top down parsers can be implemented as recursive descent parsers, that is, as a program based directly on the BNF description of the language. Or they can be represented by a parsing table that contains all the information in the BNF rules in table form. These top down parsers are LL algorithms: the first L is for the left-to-right scan of the input and the second L is for leftmost derivation.

Top Down Parsers
Left recursion is problematic for top-down parsers, e.g.
    <A> -> <A> b
Each nonterminal is written as a function in the parser, so a recursive descent parser function for this production would call itself first, before consuming any input, and never get out of the recursion. The same holds for any top-down parser.
Left recursion is problematic for top-down parsers even if it is indirect, e.g.
    <A> -> <B> b
    <B> -> <A> a
Bottom up parsers do not have this problem with left recursion.
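A minimal sketch of the problem and one way around it. The slide gives only the left-recursive rule, so the base case c below is an assumption added to make the grammar generate something; the names A, parse_A, input and pos are hypothetical as well.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical grammar (the base case c is an assumption, not from the slide):
 *     <A> -> <A> b | c
 * Coding the first RHS directly would give
 *     void A(void) { A(); ... }
 * which calls itself before consuming any input and never returns.
 * The same language written in EBNF is
 *     <A> -> c {b}
 * and that form has no left recursion.
 */

static const char *input;  /* string being parsed */
static size_t pos;         /* index of the lookahead character */

/* Parses <A> -> c {b}; returns 1 on success, 0 on a syntax error. */
static int A(void) {
    if (input[pos] != 'c')
        return 0;
    pos++;
    while (input[pos] == 'b')   /* the {b} repetition becomes a loop */
        pos++;
    return 1;
}

/* Returns 1 if s is exactly one <A>, 0 otherwise. */
int parse_A(const char *s) {
    assert(s != NULL);
    input = s;
    pos = 0;
    return A() && input[pos] == '\0';
}
```

With this version, parse_A("cbbb") accepts and parse_A("b") rejects, and every call consumes at least one character before looping.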

Top Down Parsers
When a nonterminal has more than one RHS, the first terminal symbol that can be generated in a derivation for each of them must be unique to that RHS. If it is not unique, then top-down parsing with one lookahead is impossible for that grammar. Because there is only one lookahead symbol, which lex would return to the parser, that one symbol has to uniquely determine which RHS to choose. Does that make sense?

Top Down Parsers
Example: here are two productions in a possible grammar:
    <variable> -> identifier | identifier [ <expression> ]
Assume we have a left sentential form where the next nonterminal to be expanded is <variable>, and we call lex() to get the next token and it returns identifier. Which production will we pick?

Top Down Parsers
The pairwise disjointness test is used on grammars to determine if they have this problem. It determines the first symbol for every RHS of a production; for productions with the same LHS, the first symbols must be different for each.
    <A> -> a <B> | b <A> b | c <C> | d
The 4 RHSs of A here have as their first symbols {a}, {b}, {c}, {d} respectively. These sets are all disjoint, so these productions would pass the test. The ones on the previous slide would not.

Top Down Parsers
Note: if the first symbol on the RHS is a nonterminal, we would need to expand it to determine the first symbol, e.g.
    <A> -> <B> a | b <A> b | c <C> | d
    <B> -> <C> f
    <C> -> s <F>
Here, to determine the first symbol of the first RHS of A, we need to follow <B> then <C> to find out that it is s. So the 4 RHSs of A here have as their first symbols {s}, {b}, {c}, {d} respectively. These sets are all disjoint, so the productions for <A> would pass the test. How about for the <B> and <C> productions?
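The test can be sketched in a few lines of C. The encoding below is an assumption made for illustration: each nonterminal is a single upper-case letter, each terminal a single lower-case letter, and a set of terminals is a bitmask. The sketch also assumes the grammar has no left recursion and no empty RHS, which holds for the example grammar.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* One production: a single-letter LHS and its RHS as a string.
 * Upper-case letters are nonterminals, lower-case letters are terminals. */
struct rule { char lhs; const char *rhs; };

/* The example grammar from the slide; <F> has no rules there,
 * and none are needed because it is never the first symbol. */
static const struct rule grammar[] = {
    {'A', "Ba"}, {'A', "bAb"}, {'A', "cC"}, {'A', "d"},
    {'B', "Cf"},
    {'C', "sF"},
};
static const size_t nrules = sizeof grammar / sizeof grammar[0];

static unsigned long first_of_nonterminal(char nt);

/* Set of terminals that can start rhs, as a bitmask (bit 0 is 'a'). */
unsigned long first_of_rhs(const char *rhs) {
    assert(rhs != NULL && rhs[0] != '\0');
    if (islower((unsigned char)rhs[0]))
        return 1UL << (rhs[0] - 'a');
    return first_of_nonterminal(rhs[0]);   /* expand the nonterminal */
}

/* Union of the first-symbol sets of every RHS of nt. */
static unsigned long first_of_nonterminal(char nt) {
    unsigned long set = 0;
    for (size_t i = 0; i < nrules; i++)
        if (grammar[i].lhs == nt)
            set |= first_of_rhs(grammar[i].rhs);
    return set;
}

/* Returns 1 if the RHSs of nt have pairwise disjoint first-symbol sets. */
int pairwise_disjoint(char nt) {
    unsigned long seen = 0;
    for (size_t i = 0; i < nrules; i++) {
        if (grammar[i].lhs != nt)
            continue;
        unsigned long f = first_of_rhs(grammar[i].rhs);
        if (seen & f)
            return 0;            /* two RHSs can start with the same terminal */
        seen |= f;
    }
    return 1;
}
```

Calling pairwise_disjoint('A') follows <B> and then <C> to find s for the first RHS, just as the slide does by hand. A nonterminal with two RHSs that both start with the same terminal, like the <variable> example, would make the function return 0.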

Recursive descent example
So called because many of its subroutines (one for each nonterminal) are recursive, and descent because it is a top-down parser. EBNF is well suited for these parsers because EBNF reduces the number of nonterminals (as compared to plain BNF), which in turn reduces the number of subroutines.
A grammar for simple expressions:
    <expr> -> <term> {(+ | -) <term>}
    <term> -> <factor> {(* | /) <factor>}
    <factor> -> id | ( <expr> )
Anyone remember what the curly braces mean in EBNF?

Recursive descent example
Assume we have a lexical analyzer named lex, which puts the next token code in nexttoken.
The coding process when there is only one RHS:
  For each terminal symbol in the RHS, compare it with the next input token; if they match, continue, else there is an error.
  For each nonterminal symbol in the RHS, call its associated parsing subroutine.
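A small sketch of that coding process for a rule with a single RHS. The rule <assign> -> id = <value> ; and the rule <value> -> id are made up for illustration, as are the token names, the match helper, and the toy one-character-per-token lexer (which does not skip whitespace).

```c
#include <assert.h>
#include <stddef.h>

/* Token codes; the names are assumptions for this sketch. */
enum token { IDENT, ASSIGN_OP, SEMICOLON, END_OF_INPUT, UNKNOWN };

static const char *src;        /* remaining input text */
static enum token nexttoken;   /* token code set by lex() */
static int errors;             /* number of syntax errors seen */

/* Toy lexer: one character per token; a lower-case letter is an identifier. */
static void lex(void) {
    char c = *src;
    if (c == '\0') { nexttoken = END_OF_INPUT; return; }
    src++;
    if (c >= 'a' && c <= 'z') nexttoken = IDENT;
    else if (c == '=')        nexttoken = ASSIGN_OP;
    else if (c == ';')        nexttoken = SEMICOLON;
    else                      nexttoken = UNKNOWN;
}

/* Terminal in the RHS: compare it with the next token; on a match, move on. */
static void match(enum token expected) {
    if (nexttoken == expected)
        lex();
    else
        errors++;
}

/* <value> -> id */
static void value(void) {
    match(IDENT);
}

/* <assign> -> id = <value> ;
 * Each terminal becomes a match() call, each nonterminal a subroutine call. */
static void assign(void) {
    match(IDENT);
    match(ASSIGN_OP);
    value();
    match(SEMICOLON);
}

/* Returns 1 if text is exactly one <assign>, 0 otherwise. */
int parse_assign(const char *text) {
    assert(text != NULL);
    src = text;
    errors = 0;
    lex();                     /* load the first token into nexttoken */
    assign();
    return errors == 0 && nexttoken == END_OF_INPUT;
}
```

The body of assign() reads left to right exactly like the RHS of its rule, which is the point of the technique.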

Recursive descent example
    /* Function expr (written in the C language)
       Parses strings in the language generated by the rule:
       <expr> -> <term> {(+ | -) <term>} */
    void expr() {
        /* Parse the first term; term() is its own function
           representing the nonterminal <term> */
        term();
        /* As long as the next token is + or -, call lex to get
           the next token, and parse the next term */
        while (nexttoken == PLUS_OP || nexttoken == MINUS_OP) {
            lex();
            term();
        }
    }

Recursive descent example
Note: by convention, each subroutine leaves the most recent token in nexttoken.
A nonterminal that has more than one RHS requires an initial process to determine which RHS it is to parse:
  The correct RHS is chosen on the basis of the next token of input (the lookahead).
  The next token is compared with the first token that can be generated by each RHS until a match is found.
  If no match is found, it is a syntax error.
Let's now look at an example of how the parser would handle multiple RHSs.

Recursive descent example
    /* Function term
       Parses strings in the language generated by the rule:
       <term> -> <factor> {(* | /) <factor>} */
    void term() {
        /* Parse the first factor; factor() is its own function
           representing the nonterminal <factor> */
        factor();
        /* As long as the next token is * or /, call lex to get
           the next token, and parse the next factor */
        while (nexttoken == MULT_OP || nexttoken == DIV_OP) {
            lex();
            factor();
        }
    }

Recursive descent example
    /* Function factor
       Parses strings in the language generated by the rule:
       <factor> -> id | ( <expr> ) */
    void factor() {
        if (nexttoken == IDENT)
            /* For the RHS id, call lex to get the next token */
            lex();
        /* If the RHS is ( <expr> ), call lex to pass over the left
           parenthesis, call expr, and check for the right parenthesis */
        else if (nexttoken == LEFT_PAREN) {
            lex();
            expr();
            if (nexttoken == RIGHT_PAREN)
                lex();
            else
                error();
        }
        else
            error();   /* Neither RHS matches */
    }
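To try the three subroutines together they need a lexer, an error routine and a driver, none of which appear on the slides. The sketch below supplies simple stand-ins: the lexer, the END_OF_INPUT and UNKNOWN token codes, the error counter and the parse_expr driver are all assumptions, and only expr, term and factor follow the slides.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

enum token { IDENT, PLUS_OP, MINUS_OP, MULT_OP, DIV_OP,
             LEFT_PAREN, RIGHT_PAREN, END_OF_INPUT, UNKNOWN };

static const char *src;        /* remaining input text */
static enum token nexttoken;   /* token code set by lex() */
static int errors;             /* number of syntax errors seen */

static void expr(void);        /* needed because factor() calls expr() */

static void error(void) { errors++; }

/* Stand-in lexer: skips blanks; an identifier is a letter followed by
 * letters or digits; every other token is a single character. */
static void lex(void) {
    while (*src == ' ')
        src++;
    if (*src == '\0') { nexttoken = END_OF_INPUT; return; }
    if (isalpha((unsigned char)*src)) {
        while (isalnum((unsigned char)*src))
            src++;
        nexttoken = IDENT;
        return;
    }
    switch (*src++) {
    case '+': nexttoken = PLUS_OP;     break;
    case '-': nexttoken = MINUS_OP;    break;
    case '*': nexttoken = MULT_OP;     break;
    case '/': nexttoken = DIV_OP;      break;
    case '(': nexttoken = LEFT_PAREN;  break;
    case ')': nexttoken = RIGHT_PAREN; break;
    default:  nexttoken = UNKNOWN;     break;
    }
}

/* <factor> -> id | ( <expr> ) */
static void factor(void) {
    if (nexttoken == IDENT) {
        lex();
    } else if (nexttoken == LEFT_PAREN) {
        lex();
        expr();
        if (nexttoken == RIGHT_PAREN)
            lex();
        else
            error();
    } else {
        error();               /* neither RHS matches */
    }
}

/* <term> -> <factor> {(* | /) <factor>} */
static void term(void) {
    factor();
    while (nexttoken == MULT_OP || nexttoken == DIV_OP) {
        lex();
        factor();
    }
}

/* <expr> -> <term> {(+ | -) <term>} */
static void expr(void) {
    term();
    while (nexttoken == PLUS_OP || nexttoken == MINUS_OP) {
        lex();
        term();
    }
}

/* Returns 1 if text is exactly one <expr>, 0 otherwise. */
int parse_expr(const char *text) {
    assert(text != NULL);
    src = text;
    errors = 0;
    lex();                     /* load the first token into nexttoken */
    expr();
    return errors == 0 && nexttoken == END_OF_INPUT;
}
```

For example, parse_expr("a + b * (c - d)") accepts, while parse_expr("(a") is rejected because factor() never finds the right parenthesis. This driver only counts errors; a real parser would also report where they occurred.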