MODULE 12 LEADING, TRAILING and Operator Precedence Table

Similar documents
Parsing - 1. What is parsing? Shift-reduce parsing. Operator precedence parsing. Shift-reduce conflict Reduce-reduce conflict

CSCI312 Principles of Programming Languages

CS 4120 Introduction to Compilers

Lexical and Syntax Analysis. Top-Down Parsing

Top-Down Parsing and Intro to Bottom-Up Parsing. Lecture 7

Context-free grammars

Compiler Construction: Parsing

Section A. A grammar that produces more than one parse tree for some sentences is said to be ambiguous.


Parsing Wrapup. Roadmap (Where are we?) Last lecture Shift-reduce parser LR(1) parsing. This lecture LR(1) parsing

Introduction to Bottom-Up Parsing

LL(1) predictive parsing

Top-Down Parsing and Intro to Bottom-Up Parsing. Lecture 7

Principles of Programming Languages

Syntax Analysis Part I

LR Parsing LALR Parser Generators

In One Slide. Outline. LR Parsing. Table Construction

Let us construct the LR(1) items for the grammar given below to construct the LALR parsing table.

Chapter 4: LR Parsing

Introduction to Syntax Analysis

Parsing III. CS434 Lecture 8 Spring 2005 Department of Computer Science University of Alabama Joel Jones

Note that for recursive descent to work, if A ::= B1 B2 is a grammar rule we need First k (B1) disjoint from First k (B2).

Syntax Analysis Check syntax and construct abstract syntax tree

S Y N T A X A N A L Y S I S LR

Syntactic Analysis. Top-Down Parsing

CA Compiler Construction

Types of parsing. CMSC 430 Lecture 4, Page 1

Lexical and Syntax Analysis

Introduction to Syntax Analysis. The Second Phase of Front-End

Lecture Bottom-Up Parsing

Bottom-Up Parsing. Lecture 11-12

Bottom Up Parsing. Shift and Reduce. Sentential Form. Handle. Parse Tree. Bottom Up Parsing 9/26/2012. Also known as Shift-Reduce parsing

Outline CS412/413. Administrivia. Review. Grammars. Left vs. Right Recursion. More tips forll(1) grammars Bottom-up parsing LR(0) parser construction

More Bottom-Up Parsing

Top down vs. bottom up parsing

UNIT-III BOTTOM-UP PARSING

Bottom-Up Parsing. Lecture 11-12

Parsing III. (Top-down parsing: recursive descent & LL(1) )

Parser Generation. Bottom-Up Parsing. Constructing LR Parser. LR Parsing. Construct parse tree bottom-up --- from leaves to the root

PART 3 - SYNTAX ANALYSIS. F. Wotawa TU Graz) Compiler Construction Summer term / 309

Bottom Up Parsing Handout. 1 Introduction. 2 Example illustrating bottom-up parsing

MIT Parse Table Construction. Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology

Compiler Design Concepts. Syntax Analysis

Concepts Introduced in Chapter 4

MODULE 14 SLR PARSER LR(0) ITEMS

CS502: Compilers & Programming Systems

LR Parsing LALR Parser Generators

4. Lexical and Syntax Analysis

SLR parsers. LR(0) items

Bottom up parsing. The sentential forms happen to be a right most derivation in the reverse order. S a A B e a A d e. a A d e a A B e S.

Syntax Analysis. Amitabha Sanyal. ( as) Department of Computer Science and Engineering, Indian Institute of Technology, Bombay

Compilers Course Lecture 4: Context Free Grammars

Chapter 4. Lexical and Syntax Analysis

Parsing Part II. (Ambiguity, Top-down parsing, Left-recursion Removal)

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

4. Lexical and Syntax Analysis

The Parsing Problem (cont d) Recursive-Descent Parsing. Recursive-Descent Parsing (cont d) ICOM 4036 Programming Languages. The Complexity of Parsing

Compiler Design 1. Bottom-UP Parsing. Goutam Biswas. Lect 6

Compilers. Bottom-up Parsing. (original slides by Sam

Outline. The strategy: shift-reduce parsing. Introduction to Bottom-Up Parsing. A key concept: handles

Context-Free Grammars

LR Parsing E T + E T 1 T

CS 406/534 Compiler Construction Parsing Part I

Parsing. Roadmap. > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing

MA513: Formal Languages and Automata Theory Topic: Context-free Grammars (CFG) Lecture Number 18 Date: September 12, 2011

Review: Shift-Reduce Parsing. Bottom-up parsing uses two actions: Bottom-Up Parsing II. Shift ABC xyz ABCx yz. Lecture 8. Reduce Cbxy ijk CbA ijk

Bottom-up parsing. Bottom-Up Parsing. Recall. Goal: For a grammar G, withstartsymbols, any string α such that S α is called a sentential form

LR Parsing, Part 2. Constructing Parse Tables. An NFA Recognizing Viable Prefixes. Computing the Closure. GOTO Function and DFA States

CS453 : JavaCUP and error recovery. CS453 Shift-reduce Parsing 1

컴파일러입문 제 6 장 구문분석

Example CFG. Lectures 16 & 17 Bottom-Up Parsing. LL(1) Predictor Table Review. Stacks in LR Parsing 1. Sʹ " S. 2. S " AyB. 3. A " ab. 4.

Bottom-up Parser. Jungsik Choi

15 212: Principles of Programming. Some Notes on Grammars and Parsing

3. Syntax Analysis. Andrea Polini. Formal Languages and Compilers Master in Computer Science University of Camerino

Computer Science 160 Translation of Programming Languages

LR Parsing. Table Construction

Review main idea syntax-directed evaluation and translation. Recall syntax-directed interpretation in recursive descent parsers

MIT Specifying Languages with Regular Expressions and Context-Free Grammars

Intro to Bottom-up Parsing. Lecture 9

LL(1) predictive parsing

A left-sentential form is a sentential form that occurs in the leftmost derivation of some sentence.

Chapter 4. Lexical and Syntax Analysis. Topics. Compilation. Language Implementation. Issues in Lexical and Syntax Analysis.

Bottom-Up Parsing II (Different types of Shift-Reduce Conflicts) Lecture 10. Prof. Aiken (Modified by Professor Vijay Ganesh.

Syntactic Analysis. Top-Down Parsing. Parsing Techniques. Top-Down Parsing. Remember the Expression Grammar? Example. Example

How do LL(1) Parsers Build Syntax Trees?

Parsing. Cocke Younger Kasami (CYK) Laura Kallmeyer. Winter 2017/18. Heinrich-Heine-Universität Düsseldorf 1 / 35

CS 314 Principles of Programming Languages

CSE 130 Programming Language Principles & Paradigms Lecture # 5. Chapter 4 Lexical and Syntax Analysis

Parsing II Top-down parsing. Comp 412

Formal Languages and Compilers Lecture VII Part 3: Syntactic A

3. Parsing. Oscar Nierstrasz

Monday, September 13, Parsers

Ambiguity, Precedence, Associativity & Top-Down Parsing. Lecture 9-10

Compiler Construction 2016/2017 Syntax Analysis

CS 321 Programming Languages and Compilers. VI. Parsing

COP4020 Programming Languages. Syntax Prof. Robert van Engelen

CS415 Compilers. Syntax Analysis. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

Compilation 2012 Context-Free Languages Parsers and Scanners. Jan Midtgaard Michael I. Schwartzbach Aarhus University

LR Parsing. The first L means the input string is processed from left to right.

LR Parsing. Leftmost and Rightmost Derivations. Compiler Design CSE 504. Derivations for id + id: T id = id+id. 1 Shift-Reduce Parsing.

Transcription:

MODULE 12 LEADING, TRAILING and Operator Precedence Table In this module, we will learn to construct the next type of Bottom up parser, which is the operator precedence parser The functions Leading and Trailing are computed to construct the Operator precedence parsing table This module discusses the construction of Operator precedence parsing table using the functions leading and trailing 121 Operator Precedence Grammar The class of grammar parsed by the Operator precedence parser is the Operator Precedence Grammar This is a small, but an important class of grammar An operator precedence parser can be a powerful shift-reduce parser for this small class of operator precedence grammar A grammar has to satisfy the following conditions for it to be parsed by an operator precedence parser: No -productions are permitted thus ensuring that the RHS of all production should be a combination of terminals and non-terminals The RHS of all productions should be in such a way that no two non-terminals should be adjacent Consider the following examples: 1 Grammar G1 is defined with the productions, E AB, A a, B b This grammar has the non-terminals A and B adjacent to each other and hence is not an operator precedence grammar 2 Grammar G2 is defined with productions, E EOE, E id, O + * / This grammar is also not operator precedence grammar as E, O, E are non-terminals and they are adjacent to each other 3 Grammar G3, is a modified version of G2 with productions, E E+E E*E E/E id which is a operator precedence grammar as it does not have ε productions and has no two nonterminals adjacent to each other in the RHS of the productions 122 Operator precedence parser rules Let G be an -free operator grammar (No -Production)For each terminal symbols a and b, the following conditions need to be satisfied We define three symbols to define the precedence relations between two terminals a and b namely, the,, to indicate same, lesser and greater precedence respectively The same, lesser and greater precedence between any two terminals is defined based on the following rules: 1 The first rule is for the same precedence relation between two terminals a and b We say, a b, if a production in RHS of the form αaβbγ, where β is either or a single non-terminal Consider the grammar with a production, S ictses It can be observed

that this grammar is an operator precedence grammar The symbols for α can be ε, C, S for three situations Thus for the scenario where α is ε, we have a is i' and b is t Thus we have the relation, i t Considering an alternate situation, where α is C, we have a as t and b as e which yields the precedence relation t e When α is S, we don t have a symbol for β and thus only these two precedence relations could be derived 2 The second rule is to design the lesser than relation between two terminals a and b We say, a b if for some non-terminal A a production in RHS of the form A αaaβ, and A + γbδ where γ is either or a single non-terminal Consider the productions, S icts and C + b and where β has bδ and hence, we deduce i b as non-terminal C derives b 3 The third rule is to design the greater than relation between two terminals a and b and is to look at the derivation from the right of the RHS of the production We say, a b if for some non-terminal A a production in RHS of the form A αabβ, and A + γaδ where δ is either or a single non-terminal Thus, for the production, S icts and C + b the relation derived will be b t Example 121 Consider the following ambiguous expression grammar E E+E E*E (E) id This grammar is not a Operator precedence Grammar since by rule no 3 we have + + & + + as the grammar is ambiguous However, we have the unambiguous expression grammar and is given below This grammar has clear definition of the precedence relation E E+T T, T T*F F, F (E) id 123 Operator precedence parsing The operator precedence parser is based on the precedence rules that are defined between the terminals of a grammar In operator-precedence parsing, we define three disjoint precedence relations between certain pairs of terminals a b implies b has higher precedence than a a = b implies b has same precedence as a a b implies b has lower precedence than a The challenge lies in the determination of correct precedence relations between terminals which are used for constructing the Operator precedence parsing table For now let us assume the existence of the operator precedence parsing table The determination of precedence relations are based on the traditional notions of associativity and precedence of operators In the expression grammar, the precedence relations could be identified between all pairs of operators based on associativity and precedence and unary minus alone cause s problem

The intention of the precedence relations is to find the handle of a right-sentential form The following conventions are used with marking the left end, = appearing in the interior of the handle, and marking the right hand The input string is prefixed and suffixed with $ which would look like, $a 1 a 2 a n $ and then we insert the precedence relation between the pairs of terminals (the precedence relation holds between the terminals in that pair) Example 112 Consider the following ambiguous grammar with productions E E+E E*E id Let us assume the precedence table as given in Table 121 The next section of this module will deal with the construction of this precedence table Table 121 Precedence table for ambiguous expression grammar Id + * $ id + * $ Then the input string id+id*id with the precedence relations inserted will look like $ id + id * id $ The precedence relation is inserted between two terminals, by considering the first terminal in the row and the second terminal in the column Thus, $ and id has Thus, the next pair is id and + which has a relation and is also inserted The operator precedence parsing algorithm is a two step process The stack is looked and scanned for the following 1 Scan the string from left end until the first is encountered 2 Then scan backwards (to the left) over any = until a is encountered This algorithm also uses a stack where the initial stack contents are the modified input string with precedence added The handle contains everything to left of the first and to the right of the is encountered The parsing action by the operator precedence parser is given in Table 122

Table 122 Parsing action for the ambiguous expression grammar Stack Rule Input Comments $ id + id * id $ E id $ id + id * id $ Here the first id is looked as the handle and since we were able to reduce, we reduce it in the input $ + id * id $ E id $ E + id * id $ The second handle is also id since that is available between a pair of lesser than and greater than precedences $ + * id $ E id $ E + E * id $ The third handle is also id $ + * $ E E*E $ E + E * E $ The fourth handle is E *E, and is popped in the stack and we push the greater than symbol $ + $ E E+E $ E + E $ The last handle is E+E and that is also reduced $$ The stack is empty and has only the $ symbol, we say the string is accepted After discussing the overview of the operator precedence parsers, we will discuss each step of the operator precedence parser in detail Steps involved in the construction of the parser are as follows 1 Ensure the Grammar satisfies the pre-requisite 2 Compute the functions Leading and Trailing 3 Using the computed leading and trailing, construct the Operator precedence parsing table 4 Parse the string based on the algorithm In this module, we will discuss the computation of Leading and Trailing followed by the construction of the parsing table 123 LEADING and TRAILING computation LEADING is defined for every non-terminal It is defined for each non-terminal such that, terminals that can be the first terminal in a string derived from that non-terminal Similarly, TRAILING for each non-terminal are those terminals that can be the last terminal in a string derived from that NT Formally, the functions LEADING and TRAILING are defined as follows:

LEADING(A) = { a A + γaδ, where γ is or a single non-terminal, where = indicates derivation, + indicates in one or more steps, A is a non-terminal Thus LEADING(A) can be interpreted as looking for the first terminal from the left, in the RHS of a production by applying all possible derivations for a production TRAILING(A) = { a A + γaδ, where δ is or a single non-terminal, where = indicates derivation, + indicates in one or more steps, A is a non-terminal Thus TRAILING(A) can be interpreted as looking for the first terminal from the right, in the RHS of a production by applying all possible derivations for a production The algorithm for finding LEADING(A) where A is a non-terminal is given in Algorithm 121 Algorithm 121 LEADING(A) { } 1 a is in Leading(A) if A γaδ where γ is ε or any Non-Terminal 2 If a is in Leading(B) and A Bα, then a in Leading(A) Step 1 of algorithm 121, indicates how to add the first terminal occurring in the RHS of every production directly Step 2 of the algorithm indicates to add the first terminal, through another non-terminal B to be included indirectly to the LEADING() of every non-terminal Similarly the algorithm to find TRAILING (A) is given in algorithm 122 Algorithm 122 TRAILING (A) { } 1 a is in Trailing(A) if A γaδ where δ is ε or any Non-Terminal 2 If a is in Trailing(B) and A αb, then a in Trailing(A) Algorithm 122 is similar to algorithm 121 and the only difference being, the symbol is looked from right to left as against left to right in algorithm 121 Step 1 of the algorithm 122, indicates looking for the first terminal occurring in the RHS of a production from right side and thus adds the direct first symbol The second step looks for adding the indirect first symbol from the right of the RHS of the production

Example 122 Consider the unambiguous version of the expression grammar 1 E E + T 2 E T 3 T T * F 4 T F 5 F (E) 6 F id Let us consider the productions of F from productions 5, 6 From production 6, there is only one symbol in the RHS and that is a terminal So, id will be in the LEADING(F) From production 5, the first symbol itself is a terminal ( and hence that will be added to LEADING(F) as well Further additions to LEADING(F) is not possible as the first symbol in the RHS of the productions involving F is a terminal Thus LEADING (F) = {(, id} Let us consider the production of T as defined in production 3, 4 From production 3, the first symbol on the RHS is a non-terminal and the first terminal is * and thus * will be in LEADING(T) Then from production 4, there is only one symbol in the RHS and that is a nonterminal So, the LEADING of the RHS non-terminal will be there in the LHS non-terminal also Thus, LEADING(T) will include LEADING(F) in addition to * LEADING (T) = { *, (, id } A similar analogy as explained for non-terminal T is applied to define the LEADING(E) Thus LEADING(E) will include LEADING(T) from production 2 and + from production 1 Thus LEADING (E) = { +, *, (, id} Let us consider computation of TRAILING () where it is similar to LEADING() but the productions are scanned from right to left Let us again start from F From production 6, id will be in TRAILING(F) as it is the only symbol in the RHS In addition, from production 5, ) will be in the TRAILING(F) and will not have any more symbols as the first symbol from the right is a terminal in both productions TRAILING (F) = {), id} Similarly from productions 3, 4, the TRAILING(T) would include * and TRAILING(F) from the two productions respectively Thus, TRAILING (T) = { *, ), id} Similarly from productions 1,2, the TRAILING(E) would include + and TRAILING(T) from these two productions Thus,

TRAILING (E) = { +, *, ), id} 124 Operator precedence parsing table construction After computing the two functions LEADING() and TRAILING(), the operator precedence table is constructed between all the terminals in the grammar including the $ symbol The algorithm for computing this parsing table is given in Algorithm 123 Algorithm 123 PARSINGTABLE(Grammar G, LEADING(), TRAILING() ) { For each production A X 1 X 2 X 3 X n for i = 1 to n-1 } Example 123 1 if X i and X i+1 are terminals set X i = X i+1 2 if i n-2 and X i and X i+2 are terminals and X i+1 is a non-terminal set X i = X i+2 3 if X i is a terminal and X i+1 is a non-terminal then for all a in Leading(X i+1 ) set X i a 4 if X i is a non-terminal and X i+1 is a terminal then for all a in Trailing(X i ) set a X i+1 Consider the unambiguous expression grammar involving as given in example 122 Step 1 is not to be encountered in the expression grammar where two terminals occur adjacent to each other Step 2 of the algorithm looks for a Non-terminal between a pair of terminals and the RHS should have only three symbols, then we go for the same precedence Using this step, the terminals ( and ) of the 5 th production of the Expression grammar gets the same precedence and is given in Table 123 Since, it is the only production obeying this rule, the same precedence situation will not arise for any other pair of terminals Consider step 3 of the algorithm, where we are looking for a terminal followed by a nonterminal Productions 1, 3, 5 will fall in this category We are looking at + and T in

production 1 LEADING(T) = {*, (, id} So we set + {*, (, id} and this is shown in row 1 of Table 123 Similarly, from production 3, we set * { (, id} and is given in row 2 of the table 123 So, is the case from production 5 we set ( {+, *, (, id} Consider step 4 of the algorithm, where we are looking for a terminal preceded by a nonterminal Similarly productions 1, 3, 5 will be used for this step of the algorithm also Consider production 1 so we set, TRAILING(E) X i+1 Therefore, { ), id} + and is shown in row 5, row 3 of the algorithm Similarly, {*, ), id} * From production 5, {+, *, ), id} ) This parsing table is later used by the operator precedence parser Table 123 Parsing table construction + + * id ( ) $ * id ( ) $ = Summary: This module detailed on the pre-requisite for a grammar to be parsed using operator precedence parser with a brief note on the operator precedence parsing algorithm This module also dealt with computing Leading, trailing and using which the operator parsing table is also constructed