LL Parsing, LR Parsing, Complexity, and Automata

R. Gregory Taylor
Department of Mathematics and Computer Science
Manhattan College
Riverdale, New York USA

SIGCSE Bulletin, Vol 34, No. 4, 2002 December, pp. 71-75

Abstract

It is well known that pushdown-stack automata find application within the syntactic analysis phase of compilation. Nonetheless, in most compiler design textbooks the relation between popular parsing algorithms and the theory of deterministic pushdown-stack automata remains implicit. We show that it is not difficult to implement these algorithms as deterministic automata. These implementations in turn yield instructive time/space analyses of the implemented algorithms.

1. LL(1) Parsing and Deterministic Pushdown-Stack Automata

1.1 LL(1) Expression Grammar

Example 1. It will be helpful for our discussion to focus on a particular example, and so let us consider the expression grammar G whose productions are as follows.

    expr      → term expr_aux
    expr_aux  → add_op term expr_aux | ε
    term      → fac term_aux
    term_aux  → mult_op fac term_aux | ε
    fac       → primary | - primary
    primary   → ( expr ) | tok_id
    add_op    → +
    mult_op   → * | /

Note that G is a left-factored grammar. G has no direct left recursion, and it is easy to check that it has no indirect left recursion either. Table 1 presents part of an LL(1) parse table that is constructed, in accordance with a well-known technique, on the basis of grammar G (see [1], [2]).

1.2 Implementation of an LL(1) Parser as a Pushdown-Stack Automaton

It can be shown that an LL(1) parser is, at bottom, a deterministic pushdown-stack automaton that accepts by empty stack. Of course, a stack component does play a role in the table-driven LL(1) parsers that one standardly considers. However, it is presumably not obvious how we may view a parser that uses a stack and a table as a pushdown automaton, which, of course, appears to involve no table.

As the reader will have guessed, this table component is really just the tabular representation of the state diagram of a (deterministic) pushdown automaton. To see this, consider the automaton M of Figure 1 in conjunction with the following remarks. We shall use Table 1 as the basis upon which to construct M's state diagram as depicted (partially) in Figure 1. The result will be that M accepts (or recognizes) the class of expressions generated by G. We assume that the input stream has already been tokenized. The input alphabet of M then comprises the terminal symbols (token classes) of G together with end-of-input symbol #. The stack alphabet of M contains all terminals and nonterminals of G plus stack-initialization symbol #. As usual, we assume the latter symbol to be the only symbol on M's stack at the inception of execution.

An arc labeled a, b; c, d, e, say, is interpreted as follows: if symbol a is the current input symbol and symbol b is currently on top of the stack, then pop the stack and push symbols c, d, and e in that order. If symbol a is ε, then the input stream is in effect being ignored. If ε occurs after the semicolon, then no symbol is pushed onto the stack.

The states of M are two initialization states q_0 and q_1 and, in addition, eight states, one for each symbol of its input alphabet including #. It will be convenient to designate the latter states as q_tok_id, q_+, q_-, q_*, q_/, q_(, q_), and q_#. (State designation q_+ abbreviates q_tok_plus or the like.) The start state of M is q_0, and M has no accepting states.

Each of the transitions from state q_1 corresponds to M's looking ahead one token in the input stream and then incorporating the encountered lookahead symbol into its state. For example, if the lookahead symbol at state q_1 is *, then M is seen to enter state q_* and then to manipulate its stack without reading further input (see the two self-loops at state q_*). (For the sake of simplicity in presenting the diagram, we are using symbol α as a wildcard ranging over all members of M's stack alphabet.) Processing at state q_* continues until symbol * itself appears atop the stack, at which point it is popped and M re-enters state q_1. For simplicity again, we show only two self-loops for each of the states q_tok_id, q_+, and so forth. We may easily infer the omitted self-loops from (the completion of) Table 1.

It is rather easy to see that M is deterministic. This is primarily because the expression grammar on which it is based is left-factored.
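The behavior of M can be made concrete with a short Python sketch. It is one possible rendering under stated assumptions, not the paper's own code: the parse table is rebuilt by hand from the productions of G (Table 1 itself is only partially available), expr_aux and term_aux are assumed to have ε-alternatives, the step from q_0 to q_1 is assumed to push the start symbol, and the names TABLE and run_ll1 are hypothetical. The outer loop plays the role of central state q_1, and the inner loop plays the role of the peripheral state for the current lookahead.

```python
# Hand-built LL(1) table for grammar G of Example 1 (an assumption: it is
# derived from the productions, not copied from Table 1).
# Key: (stack top, lookahead) -> right-hand side to push; () means epsilon.
TABLE = {}
for tok in ("tok_id", "(", "-"):            # tokens that can begin an expr
    TABLE[("expr", tok)] = ("term", "expr_aux")
    TABLE[("term", tok)] = ("fac", "term_aux")
TABLE[("expr_aux", "+")] = ("add_op", "term", "expr_aux")
for tok in (")", "#"):                      # lookaheads that may follow expr_aux
    TABLE[("expr_aux", tok)] = ()
for tok in ("*", "/"):
    TABLE[("term_aux", tok)] = ("mult_op", "fac", "term_aux")
    TABLE[("mult_op", tok)] = (tok,)
for tok in ("+", ")", "#"):                 # lookaheads that may follow term_aux
    TABLE[("term_aux", tok)] = ()
TABLE[("fac", "tok_id")] = ("primary",)
TABLE[("fac", "(")] = ("primary",)
TABLE[("fac", "-")] = ("-", "primary")
TABLE[("primary", "(")] = ("(", "expr", ")")
TABLE[("primary", "tok_id")] = ("tok_id",)
TABLE[("add_op", "+")] = ("+",)


def run_ll1(tokens):
    """Simulate M on a token list; return (accepted, steps, max_stack_height)."""
    # Stack after the assumed q_0 -> q_1 initialization; top is the list's end.
    stack = ["#", "expr"]
    steps = 0
    max_height = len(stack)
    for lookahead in list(tokens) + ["#"]:  # central state q_1 picks a lookahead
        # Peripheral state: epsilon-moves until the lookahead is on top.
        while stack and stack[-1] != lookahead:
            rhs = TABLE.get((stack[-1], lookahead))
            if rhs is None:                 # no arc applies: M halts and rejects
                return False, steps, max_height
            stack.pop()
            stack.extend(reversed(rhs))     # leftmost RHS symbol ends up on top
            steps += 1
            max_height = max(max_height, len(stack))
        if not stack:
            return False, steps, max_height
        stack.pop()                         # lookahead is consumed; back to q_1
        steps += 1
    return not stack, steps, max_height     # acceptance by empty stack
```

Calling run_ll1(["tok_id", "*", "tok_id", "+", "tok_id"]) reports acceptance, whereas run_ll1(["(", "tok_id", "+"]) reports rejection. The returned step count and maximum stack height are included only so that the linear bounds discussed in the paper can be checked empirically for this grammar.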
Tracing the computation of M for input strings num1 * num2 + num3 and (num1 +, say, reveals that the former is accepted, whereas the latter is not. In fact, M's computation will be essentially that of an LL(1) parser as recorded in Tables 2 and 3 (both partial). The student should then have no trouble believing that the table-driven LL(1) parser is, at root, the implementation of a pushdown automaton that accepts by empty stack.

The state diagram of Figure 1 may be used to clarify the role of the (single) lookahead in LL(1) parsing. Namely, at central state q_1, machine M must decide which peripheral state to enter and, by implication, how to expand the leftmost parse tree node labeled by a nonterminal. It renders this decision based solely upon the lookahead: if the lookahead is (, then M enters state q_( and so forth. Once at peripheral state q_(, say, M proceeds to expand the tree downward until terminal ( appears atop the stack, at which point the stack is popped and M returns to state q_1. It is at this point, and not before, that input token ( is said to have been consumed. In other words, lookahead ( is used as the basis for tree expansion before it is consumed.

Incidentally, the topology of the state diagram of Figure 1 (a single central state surrounded by a ring of peripheral states with multiple self-loops) is characteristic of parsing automata that use a single lookahead. (How would we configure parsers using two lookaheads? Three lookaheads?)

Figure 1. Arc label α serves as a wildcard representing an arbitrary stack alphabet symbol.

1.3 The Complexity of LL(1) Parsing

Finally, reflection upon Figure 1 enables us to justify a certain claim regarding the efficiency of LL(1) parsing. Note that if token * is the current input symbol, then M reads that symbol, leaves state q_1, and enters state q_*. Once in state q_*, M will traverse ε-self-loops until symbol * appears atop its stack. It is easy to see that this will require, worst case, two steps. One more step will then bring M back to state q_1. The case of input token / is perfectly analogous. So are all the other possibilities: each peripheral state q has but finitely many ε-self-loops. Since left recursion has been eliminated from the grammar on which M is based, none of these ε-self-loops will ever be traversed twice during any one stint at q. Moreover, the number of these self-loops obviously depends in no way upon n, where we take n to be the number of tokens in the tokenized input stream. (What it does depend on is the grammar G of Example 1.) Finally, since the self-loops at q are all ε-moves, they never advance input.

It should now be apparent that, for each of n input tokens, automaton M enters some peripheral state and then computes there for O(1) steps, worst case, assuming the obvious notion of computation step. Moreover, each step increases the height of M's stack by O(1). Apparently, we have proved the following proposition.

Theorem 1. LL(1) parsing requires O(n) time and O(n) space, where n is the length of the token stream.

2. LR Parsing and Deterministic Pushdown-Stack Automata

In fact, we shall look only at so-called SLR(1) parsing, a particularly simple form of LR(1) parsing.

Example 2. Our discussion of SLR parsing will focus on the expression grammar G appearing below.

    (1.1)-(1.2)  expr    → term | expr add_op term
    (2.1)-(2.2)  term    → fac | term mult_op fac
    (3.1)-(3.4)  fac     → ( expr ) | tok_int_lit | - ( expr ) | - tok_int_lit
    (4.1)-(4.2)  add_op  → + | -
    (5.1)-(5.2)  mult_op → * | /

Note that G is left-recursive and not left-factored.

2.2 Implementation of an SLR Parser as a Pushdown-Stack Automaton

Once again we endeavor to show that a certain type of parser (this time an SLR parser for the expression grammar G of Example 2) is, in its essentials, the implementation of a deterministic pushdown automaton M that accepts by empty stack. A part of the transition diagram of M appears in Figure 2. Again, it is presumably not obvious that Tables 4 and 5 represent this machine. Consequently, some explanation will be required in order to make this plausible. This explanation will take the form of a step-by-step description of the construction of M based upon G or, rather, based upon the action and goto tables that are based, in turn, upon G.

(1) M will have ten states in total. There will be one for each of the seven terminal symbols of G and one more for end-of-input symbol #. In addition, there will be two initialization states q_0 and q_1. Again, it will be convenient to designate peripheral states as q_tok_int_lit, q_+, q_-, q_*, q_/, q_(, q_), and q_#. M's start state will be q_0.

(2) M's stack alphabet will include all terminals and nonterminals of G plus stack-initialization symbol #. In addition, the stack alphabet will contain a numeral for each of the 22 rows in the complete action table (Table 4). We shall think of the numeral 46, say, as a single stack alphabet symbol.

(3) The stack of M will contain symbols of G (terminals and nonterminals) alternating with numerals designating states of a certain finite-state automaton (not shown). This is potentially confusing, we admit. Our talk of states here has nothing whatever to do with the states of the pushdown automaton M now under construction, of which there are only ten.
(4) A single arc from state q_0 to state q_1 pushes numeral 0 onto M's stack, which was assumed to already contain stack-initialization symbol #.

(5) For each terminal a of G and every state numeral n, we add an arc labeled a, n; n from central state q_1 to state q_a. Similarly, for end-of-input symbol #, we add an arc labeled #, n; n from state q_1 to state q_#. (See Figure 2, where we indicated these arcs schematically.)

(6) Corresponding to each shift action within the action table, there will be an ε-move leading from a peripheral state back to state q_1. For instance, corresponding to the shift S5 in the upper left-hand corner of Table 4 there will be an arc labeled ε, 0; 0 tok_int_lit 5 from state q_tok_int_lit back to state q_1. There are 31 shift actions in the action table, but we have included only a very few of the corresponding arcs in the diagram of Figure 2, namely the ones that we shall need to cite later.

(7) Corresponding to each reduce action in Table 4, there will be an ε-self-loop on one of the peripheral states in Figure 2. For example, corresponding to the reduce action R3.2 in the sixth row and second column (not shown), there will be a self-loop labeled ε, tok_int_lit 5; fac at state q_+. (The reader will need to check production (3.2) of G in order to make sense of this.) Again, we have presented only a very few of these arcs in Figure 2. The entire goto table of Table 5 will be represented by ε-self-loops on each of the peripheral states including state q_#. Thus, corresponding to the three entries in the first row of that table, there will be arcs labeled ε, 0 expr; 0 expr 1 and ε, 0 term; 0 term 2 and ε, 0 fac; 0 fac 3 on state q_+ and on every other peripheral state.

(8) The single Accept action of the action table will be reflected in a self-loop labeled ε, # 0 expr 1; ε at state q_# (see Figure 2).

Table 4. Action Table (partial)

Table 5. Goto Table (partial)
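Steps (1) through (8) can be sketched in Python. Because Tables 4 and 5 are only partially reproduced here, the sketch does not use grammar G of Example 2. Instead it assumes a much smaller, hypothetical grammar (E → E + T | T, T → id) whose six-row SLR tables can be written out by hand. The names ACTION, GOTO, PRODUCTIONS, and slr_parse are likewise illustrative. What the sketch does preserve is the stack discipline described above: grammar symbols alternate with state numerals, reduce and goto moves happen without reading input, a shift returns control to the central state, and acceptance empties the stack.

```python
# Hypothetical toy grammar (NOT grammar G of Example 2):
#   (1) E -> E + T    (2) E -> T    (3) T -> id
PRODUCTIONS = {1: ("E", 3), 2: ("E", 1), 3: ("T", 1)}   # number -> (LHS, RHS length)

# Hand-built SLR(1) tables for the toy grammar; '#' is the end-of-input symbol.
ACTION = {
    (0, "id"): ("shift", 3),
    (1, "+"): ("shift", 4),
    (1, "#"): ("accept", None),
    (2, "+"): ("reduce", 2), (2, "#"): ("reduce", 2),
    (3, "+"): ("reduce", 3), (3, "#"): ("reduce", 3),
    (4, "id"): ("shift", 3),
    (5, "+"): ("reduce", 1), (5, "#"): ("reduce", 1),
}
GOTO = {(0, "E"): 1, (0, "T"): 2, (4, "T"): 5}


def slr_parse(tokens):
    """Simulate the automaton; return (accepted, steps, max_stack_height)."""
    stack = ["#", 0]                # after the q_0 -> q_1 arc of step (4)
    steps = 0
    max_height = len(stack)
    for lookahead in list(tokens) + ["#"]:
        # Stint at the peripheral state for this lookahead.
        while True:
            kind, arg = ACTION.get((stack[-1], lookahead), ("error", None))
            steps += 1
            if kind == "shift":     # step (6): push token and numeral, back to q_1
                stack += [lookahead, arg]
                max_height = max(max_height, len(stack))
                break
            if kind == "reduce":    # step (7): pop the handle, then apply goto
                lhs, size = PRODUCTIONS[arg]
                del stack[-2 * size:]            # each symbol is paired with a numeral
                exposed = stack[-1]              # numeral uncovered by the pop
                stack += [lhs, GOTO[(exposed, lhs)]]
                max_height = max(max_height, len(stack))
                continue
            if kind == "accept":    # step (8): pop '# 0 E 1', leaving the stack empty
                stack.clear()
                return True, steps, max_height
            return False, steps, max_height      # no arc applies: reject
    return False, steps, max_height
```

For example, slr_parse(["id", "+", "id"]) reports acceptance and slr_parse(["id", "+"]) reports rejection. In this sketch a reduce together with its goto is counted as a single step, which is a simplification of the paired ε-moves in the construction.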

2.3 The Complexity of SLR Parsing

As in the case of LL parsing, reflection upon deterministic pushdown automata such as machine M of Figure 2 will enable us to give worst-case time and space analyses for SLR parsing.

(1) First, note that, after entering central state q_1, machine M makes a single-step transition from q_1 to some peripheral state q_a for each token a within its input stream. That transition does not itself alter M's stack. However, M will eventually make a transition from q_a back to q_1, simultaneously pushing a together with some state numeral onto its stack.

(2) Since such periphery-to-center transitions are the only instructions that strictly increase the height of the stack, one can readily see that the height of the stack at any point during M's computation is O(n), where n is the length of the token stream. Thus M computes in O(n) space.

(3) Further, while at peripheral state q_a, machine M executes a number of ε-moves. These ε-moves occur in pairs: an ε-move implementing a reduce action, followed by an ε-move implementing a goto action. Again, the number of such ε-moves executed during any one stint at q_a is O(1): it depends upon G and not upon n.

(4) Putting (1) through (3) together, we have established the following proposition.

Theorem 2. SLR(1) parsing executes in time and space that are linear in the length of the token stream. The same is true of LR(1) parsing.

Of course, Theorem 2 does not take into account the cost of computations required to design the SLR parser itself. That is as it should be, since the cost of parser construction is a one-time cost that we should not charge to the parsing process itself. As for the generalization to LR(1) parsing, we remind the reader that the driver routines of SLR and LR parsing do not differ. Rather, the difference between LR and SLR parsing is a matter of the size of the respective parse tables. In the present context, this means that the general structure of the transition diagrams of implementing pushdown automata will be the same. What will change is the number of ε-moves at peripheral states as well as the number of center-to-periphery and periphery-to-center transitions. Consequently, the foregoing analyses for SLR parsing are applicable to LR parsing as well.

3. Summary

We have seen that both LL and LR parsing can be implemented by deterministic pushdown-stack automata. Moreover, both algorithms can be carried out in linear time. A more careful statement of the situation is the following. Given any LL(1) (respectively, LR(1)) grammar G for context-free language L, a deterministic pushdown-stack automaton M_G can be constructed such that M_G parses an arbitrary string of length n, over the set of terminal symbols of G, in O(n) steps. This having been said, there do exist context-free languages generated by no LR(1) grammar, and the class of LL(1) grammars is still more restrictive. Of course, these grammar classes have not been defined above (see [3]). Suffice it to say here that they are precisely the grammars for which our deterministic automaton construction techniques work.

Acknowledgements

The author wishes to thank Jane Stanton for editorial assistance. He also acknowledges a debt to the late Matthew Smosna, in whose superb course at New York University he acquired a first exposure to compiler design theory.

References

[1] Fischer, Charles N. and LeBlanc, Richard J., Jr. Crafting a Compiler. Benjamin/Cummings, Menlo Park, California.
[2] Parsons, Thomas W. Introduction to Compiler Construction. Computer Science Press, New York.
[3] Sorenson, Paul G. and Tremblay, Jean Paul. The Theory and Practice of Compiler Writing. McGraw Hill, New York.


More information

2.2 Syntax Definition

2.2 Syntax Definition 42 CHAPTER 2. A SIMPLE SYNTAX-DIRECTED TRANSLATOR sequence of "three-address" instructions; a more complete example appears in Fig. 2.2. This form of intermediate code takes its name from instructions

More information

LR Parsing. Leftmost and Rightmost Derivations. Compiler Design CSE 504. Derivations for id + id: T id = id+id. 1 Shift-Reduce Parsing.

LR Parsing. Leftmost and Rightmost Derivations. Compiler Design CSE 504. Derivations for id + id: T id = id+id. 1 Shift-Reduce Parsing. LR Parsing Compiler Design CSE 504 1 Shift-Reduce Parsing 2 LR Parsers 3 SLR and LR(1) Parsers Last modifled: Fri Mar 06 2015 at 13:50:06 EST Version: 1.7 16:58:46 2016/01/29 Compiled at 12:57 on 2016/02/26

More information

Chapter 14: Pushdown Automata

Chapter 14: Pushdown Automata Chapter 14: Pushdown Automata Peter Cappello Department of Computer Science University of California, Santa Barbara Santa Barbara, CA 93106 cappello@cs.ucsb.edu The corresponding textbook chapter should

More information

How do LL(1) Parsers Build Syntax Trees?

How do LL(1) Parsers Build Syntax Trees? How do LL(1) Parsers Build Syntax Trees? So far our LL(1) parser has acted like a recognizer. It verifies that input token are syntactically correct, but it produces no output. Building complete (concrete)

More information

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou COMP-421 Compiler Design Presented by Dr Ioanna Dionysiou Administrative! Any questions about the syllabus?! Course Material available at www.cs.unic.ac.cy/ioanna! Next time reading assignment [ALSU07]

More information

1. Lexical Analysis Phase

1. Lexical Analysis Phase 1. Lexical Analysis Phase The purpose of the lexical analyzer is to read the source program, one character at time, and to translate it into a sequence of primitive units called tokens. Keywords, identifiers,

More information

Section A. A grammar that produces more than one parse tree for some sentences is said to be ambiguous.

Section A. A grammar that produces more than one parse tree for some sentences is said to be ambiguous. Section A 1. What do you meant by parser and its types? A parser for grammar G is a program that takes as input a string w and produces as output either a parse tree for w, if w is a sentence of G, or

More information

Wednesday, August 31, Parsers

Wednesday, August 31, Parsers Parsers How do we combine tokens? Combine tokens ( words in a language) to form programs ( sentences in a language) Not all combinations of tokens are correct programs (not all sentences are grammatically

More information

Wednesday, September 9, 15. Parsers

Wednesday, September 9, 15. Parsers Parsers What is a parser A parser has two jobs: 1) Determine whether a string (program) is valid (think: grammatically correct) 2) Determine the structure of a program (think: diagramming a sentence) Agenda

More information

Parsers. What is a parser. Languages. Agenda. Terminology. Languages. A parser has two jobs:

Parsers. What is a parser. Languages. Agenda. Terminology. Languages. A parser has two jobs: What is a parser Parsers A parser has two jobs: 1) Determine whether a string (program) is valid (think: grammatically correct) 2) Determine the structure of a program (think: diagramming a sentence) Agenda

More information

CS 2210 Sample Midterm. 1. Determine if each of the following claims is true (T) or false (F).

CS 2210 Sample Midterm. 1. Determine if each of the following claims is true (T) or false (F). CS 2210 Sample Midterm 1. Determine if each of the following claims is true (T) or false (F). F A language consists of a set of strings, its grammar structure, and a set of operations. (Note: a language

More information

About the Tutorial. Audience. Prerequisites. Copyright & Disclaimer. Compiler Design

About the Tutorial. Audience. Prerequisites. Copyright & Disclaimer. Compiler Design i About the Tutorial A compiler translates the codes written in one language to some other language without changing the meaning of the program. It is also expected that a compiler should make the target

More information

Reflection in the Chomsky Hierarchy

Reflection in the Chomsky Hierarchy Reflection in the Chomsky Hierarchy Henk Barendregt Venanzio Capretta Dexter Kozen 1 Introduction We investigate which classes of formal languages in the Chomsky hierarchy are reflexive, that is, contain

More information

Programming Languages Third Edition

Programming Languages Third Edition Programming Languages Third Edition Chapter 12 Formal Semantics Objectives Become familiar with a sample small language for the purpose of semantic specification Understand operational semantics Understand

More information

LL(k) Parsing. Predictive Parsers. LL(k) Parser Structure. Sample Parse Table. LL(1) Parsing Algorithm. Push RHS in Reverse Order 10/17/2012

LL(k) Parsing. Predictive Parsers. LL(k) Parser Structure. Sample Parse Table. LL(1) Parsing Algorithm. Push RHS in Reverse Order 10/17/2012 Predictive Parsers LL(k) Parsing Can we avoid backtracking? es, if for a given input symbol and given nonterminal, we can choose the alternative appropriately. his is possible if the first terminal of

More information

flex is not a bad tool to use for doing modest text transformations and for programs that collect statistics on input.

flex is not a bad tool to use for doing modest text transformations and for programs that collect statistics on input. flex is not a bad tool to use for doing modest text transformations and for programs that collect statistics on input. More often than not, though, you ll want to use flex to generate a scanner that divides

More information

Introduction to Computers & Programming

Introduction to Computers & Programming 16.070 Introduction to Computers & Programming Theory of computation 5: Reducibility, Turing machines Prof. Kristina Lundqvist Dept. of Aero/Astro, MIT States and transition function State control A finite

More information

Compiler Construction 2016/2017 Syntax Analysis

Compiler Construction 2016/2017 Syntax Analysis Compiler Construction 2016/2017 Syntax Analysis Peter Thiemann November 2, 2016 Outline 1 Syntax Analysis Recursive top-down parsing Nonrecursive top-down parsing Bottom-up parsing Syntax Analysis tokens

More information

Chapter 3. Describing Syntax and Semantics ISBN

Chapter 3. Describing Syntax and Semantics ISBN Chapter 3 Describing Syntax and Semantics ISBN 0-321-49362-1 Chapter 3 Topics Introduction The General Problem of Describing Syntax Formal Methods of Describing Syntax Copyright 2009 Addison-Wesley. All

More information

The following deflniüons h:i e been establish for the tokens: LITERAL any group olcharacters surrounded by matching quotes.

The following deflniüons h:i e been establish for the tokens: LITERAL any group olcharacters surrounded by matching quotes. 15 colon. This calls the got_derlvatlon_synibol routine, and its purpose is to recognize obtained for the next available character in the source stream. A lexical error may Once all comments, blanks, and

More information

Chapter 3. Set Theory. 3.1 What is a Set?

Chapter 3. Set Theory. 3.1 What is a Set? Chapter 3 Set Theory 3.1 What is a Set? A set is a well-defined collection of objects called elements or members of the set. Here, well-defined means accurately and unambiguously stated or described. Any

More information

LR Parsing. The first L means the input string is processed from left to right.

LR Parsing. The first L means the input string is processed from left to right. LR Parsing 1 Introduction The LL Parsing that is provided in JFLAP is what is formally referred to as LL(1) parsing. Grammars that can be parsed using this algorithm are called LL grammars and they form

More information

CJT^jL rafting Cm ompiler

CJT^jL rafting Cm ompiler CJT^jL rafting Cm ompiler ij CHARLES N. FISCHER Computer Sciences University of Wisconsin Madison RON K. CYTRON Computer Science and Engineering Washington University RICHARD J. LeBLANC, Jr. Computer Science

More information

Semantics via Syntax. f (4) = if define f (x) =2 x + 55.

Semantics via Syntax. f (4) = if define f (x) =2 x + 55. 1 Semantics via Syntax The specification of a programming language starts with its syntax. As every programmer knows, the syntax of a language comes in the shape of a variant of a BNF (Backus-Naur Form)

More information

Compiler Design 1. Bottom-UP Parsing. Goutam Biswas. Lect 6

Compiler Design 1. Bottom-UP Parsing. Goutam Biswas. Lect 6 Compiler Design 1 Bottom-UP Parsing Compiler Design 2 The Process The parse tree is built starting from the leaf nodes labeled by the terminals (tokens). The parser tries to discover appropriate reductions,

More information

Visual PCYACC. Developing and Debugging with Visual Pcyacc. by Y. Jenny Luo. For more information, contact

Visual PCYACC. Developing and Debugging with Visual Pcyacc. by Y. Jenny Luo. For more information, contact 1 Visual PCYACC Developing and Debugging with Visual Pcyacc by Y. Jenny Luo PCYACC is a software product of ABRAXAS SOFTWARE INC. For more information, contact ABRAXAS SOFTWARE INC. Post Office Box 19586

More information

Bottom-Up Parsing II. Lecture 8

Bottom-Up Parsing II. Lecture 8 Bottom-Up Parsing II Lecture 8 1 Review: Shift-Reduce Parsing Bottom-up parsing uses two actions: Shift ABC xyz ABCx yz Reduce Cbxy ijk CbA ijk 2 Recall: he Stack Left string can be implemented by a stack

More information

LR Parsing, Part 2. Constructing Parse Tables. An NFA Recognizing Viable Prefixes. Computing the Closure. GOTO Function and DFA States

LR Parsing, Part 2. Constructing Parse Tables. An NFA Recognizing Viable Prefixes. Computing the Closure. GOTO Function and DFA States TDDD16 Compilers and Interpreters TDDB44 Compiler Construction LR Parsing, Part 2 Constructing Parse Tables Parse table construction Grammar conflict handling Categories of LR Grammars and Parsers An NFA

More information

WWW.STUDENTSFOCUS.COM UNIT -3 SYNTAX ANALYSIS 3.1 ROLE OF THE PARSER Parser obtains a string of tokens from the lexical analyzer and verifies that it can be generated by the language for the source program.

More information

Introduction. Introduction. Introduction. Lexical Analysis. Lexical Analysis 4/2/2019. Chapter 4. Lexical and Syntax Analysis.

Introduction. Introduction. Introduction. Lexical Analysis. Lexical Analysis 4/2/2019. Chapter 4. Lexical and Syntax Analysis. Chapter 4. Lexical and Syntax Analysis Introduction Introduction The Parsing Problem Three approaches to implementing programming languages Compilation Compiler translates programs written in a highlevel

More information

Parsing. Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP 412 at Rice.

Parsing. Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP 412 at Rice. Parsing Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP 412 at Rice. Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students

More information

MIT Specifying Languages with Regular Expressions and Context-Free Grammars. Martin Rinard Massachusetts Institute of Technology

MIT Specifying Languages with Regular Expressions and Context-Free Grammars. Martin Rinard Massachusetts Institute of Technology MIT 6.035 Specifying Languages with Regular essions and Context-Free Grammars Martin Rinard Massachusetts Institute of Technology Language Definition Problem How to precisely define language Layered structure

More information

MIT Specifying Languages with Regular Expressions and Context-Free Grammars

MIT Specifying Languages with Regular Expressions and Context-Free Grammars MIT 6.035 Specifying Languages with Regular essions and Context-Free Grammars Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology Language Definition Problem How to precisely

More information

Introduction to Automata Theory. BİL405 - Automata Theory and Formal Languages 1

Introduction to Automata Theory. BİL405 - Automata Theory and Formal Languages 1 Introduction to Automata Theory BİL405 - Automata Theory and Formal Languages 1 Automata, Computability and Complexity Automata, Computability and Complexity are linked by the question: What are the fundamental

More information

Context-Free Grammars and Parsers. Peter S. Housel January 2001

Context-Free Grammars and Parsers. Peter S. Housel January 2001 Context-Free Grammars and Parsers Peter S. Housel January 2001 Copyright This is the Monday grammar library, a set of facilities for representing context-free grammars and dynamically creating parser automata

More information

Faculty of Electrical Engineering, Mathematics, and Computer Science Delft University of Technology

Faculty of Electrical Engineering, Mathematics, and Computer Science Delft University of Technology Faculty of Electrical Engineering, Mathematics, and Computer Science Delft University of Technology exam Compiler Construction in4303 April 9, 2010 14.00-15.30 This exam (6 pages) consists of 52 True/False

More information

Pushdown Automata. A PDA is an FA together with a stack.

Pushdown Automata. A PDA is an FA together with a stack. Pushdown Automata A PDA is an FA together with a stack. Stacks A stack stores information on the last-in firstout principle. Items are added on top by pushing; items are removed from the top by popping.

More information

Context-Free Grammar. Concepts Introduced in Chapter 2. Parse Trees. Example Grammar and Derivation

Context-Free Grammar. Concepts Introduced in Chapter 2. Parse Trees. Example Grammar and Derivation Concepts Introduced in Chapter 2 A more detailed overview of the compilation process. Parsing Scanning Semantic Analysis Syntax-Directed Translation Intermediate Code Generation Context-Free Grammar A

More information

Regular Expressions. Agenda for Today. Grammar for a Tiny Language. Programming Language Specifications

Regular Expressions. Agenda for Today. Grammar for a Tiny Language. Programming Language Specifications Agenda for Today Regular Expressions CSE 413, Autumn 2005 Programming Languages Basic concepts of formal grammars Regular expressions Lexical specification of programming languages Using finite automata

More information

Assignment 4 CSE 517: Natural Language Processing

Assignment 4 CSE 517: Natural Language Processing Assignment 4 CSE 517: Natural Language Processing University of Washington Winter 2016 Due: March 2, 2016, 1:30 pm 1 HMMs and PCFGs Here s the definition of a PCFG given in class on 2/17: A finite set

More information

CSC 4181 Compiler Construction. Parsing. Outline. Introduction

CSC 4181 Compiler Construction. Parsing. Outline. Introduction CC 4181 Compiler Construction Parsing 1 Outline Top-down v.s. Bottom-up Top-down parsing Recursive-descent parsing LL1) parsing LL1) parsing algorithm First and follow sets Constructing LL1) parsing table

More information

LR Parsing Techniques

LR Parsing Techniques LR Parsing Techniques Introduction Bottom-Up Parsing LR Parsing as Handle Pruning Shift-Reduce Parser LR(k) Parsing Model Parsing Table Construction: SLR, LR, LALR 1 Bottom-UP Parsing A bottom-up parser

More information

Compiler Design Prof. Y. N. Srikant Department of Computer Science and Automation Indian Institute of Science, Bangalore

Compiler Design Prof. Y. N. Srikant Department of Computer Science and Automation Indian Institute of Science, Bangalore Compiler Design Prof. Y. N. Srikant Department of Computer Science and Automation Indian Institute of Science, Bangalore Module No. # 01 Lecture No. # 01 An Overview of a Compiler This is a lecture about

More information

Software II: Principles of Programming Languages

Software II: Principles of Programming Languages Software II: Principles of Programming Languages Lecture 4 Language Translation: Lexical and Syntactic Analysis Translation A translator transforms source code (a program written in one language) into

More information

CS 164 Programming Languages and Compilers Handout 9. Midterm I Solution

CS 164 Programming Languages and Compilers Handout 9. Midterm I Solution Midterm I Solution Please read all instructions (including these) carefully. There are 5 questions on the exam, some with multiple parts. You have 1 hour and 20 minutes to work on the exam. The exam is

More information

Review main idea syntax-directed evaluation and translation. Recall syntax-directed interpretation in recursive descent parsers

Review main idea syntax-directed evaluation and translation. Recall syntax-directed interpretation in recursive descent parsers Plan for Today Review main idea syntax-directed evaluation and translation Recall syntax-directed interpretation in recursive descent parsers Syntax-directed evaluation and translation in shift-reduce

More information

UNIVERSITY OF EDINBURGH COLLEGE OF SCIENCE AND ENGINEERING SCHOOL OF INFORMATICS INFR08008 INFORMATICS 2A: PROCESSING FORMAL AND NATURAL LANGUAGES

UNIVERSITY OF EDINBURGH COLLEGE OF SCIENCE AND ENGINEERING SCHOOL OF INFORMATICS INFR08008 INFORMATICS 2A: PROCESSING FORMAL AND NATURAL LANGUAGES UNIVERSITY OF EDINBURGH COLLEGE OF SCIENCE AND ENGINEERING SCHOOL OF INFORMATICS INFR08008 INFORMATICS 2A: PROCESSING FORMAL AND NATURAL LANGUAGES Saturday 10 th December 2016 09:30 to 11:30 INSTRUCTIONS

More information

Parsing II Top-down parsing. Comp 412

Parsing II Top-down parsing. Comp 412 COMP 412 FALL 2018 Parsing II Top-down parsing Comp 412 source code IR Front End Optimizer Back End IR target code Copyright 2018, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled

More information