Lecture 8: Deterministic Bottom-Up Parsing

Lecture 8: Deterministic Bottom-Up Parsing (From slides by G. Necula & R. Bodik) Last modified: Fri Feb 12 13:02:57 2010 CS164: Lecture #8 1

Avoiding nondeterministic choice: LR
We've been looking at general context-free parsing. It comes at a price, measured in overheads, so in practice we design programming languages to be parsed by less general but faster means, like top-down recursive descent.
Deterministic bottom-up parsing is more general than top-down parsing, and just as efficient.
The most common form is LR parsing:
  L means that tokens are read left to right.
  R means that it constructs a rightmost derivation.

An Introductory Example
LR parsers don't need left-factored grammars and can also handle left-recursive grammars.
Consider the following grammar:
  E : E + ( E ) | int
(Why is this not LL(1)?)
Consider the string: int + ( int ) + ( int ).

The Idea
LR parsing reduces a string to the start symbol by inverting productions. In the following, sent is a sentential form that starts as the input and is reduced to the start symbol, S:
  sent = input string of terminals
  while sent ≠ S:
    Identify the first β in sent such that A : β is a production and S ⇒* αAγ ⇒ αβγ = sent.
    Replace β by A in sent (so that αAγ becomes the new sent).
Such αβ's are called handles.
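The loop above can be sketched in a few lines of Python. This is only an illustration under a strong assumption: it treats the leftmost substring that matches any right-hand side as the handle, which happens to work for the grammar E : E + ( E ) | int but is not a correct handle finder in general. The names PRODUCTIONS, reduce_once and reduce_to_start are made up for this sketch.

```python
# Grammar from the slides: E : E + ( E ) | int
PRODUCTIONS = [
    ("E", ["E", "+", "(", "E", ")"]),
    ("E", ["int"]),
]

def reduce_once(sent):
    """Replace the leftmost substring that matches a right-hand side.

    Assumption: for this grammar that substring is always the handle.
    Returns None if no right-hand side occurs in sent.
    """
    for start in range(len(sent)):
        for lhs, rhs in PRODUCTIONS:
            if sent[start:start + len(rhs)] == rhs:
                return sent[:start] + [lhs] + sent[start + len(rhs):]
    return None

def reduce_to_start(tokens, start_symbol="E"):
    """Return every sentential form visited while reducing tokens."""
    sent = list(tokens)
    steps = [" ".join(sent)]
    while sent != [start_symbol]:
        sent = reduce_once(sent)
        if sent is None:
            raise ValueError("no reduction applies")
        steps.append(" ".join(sent))
    return steps

if __name__ == "__main__":
    for form in reduce_to_start("int + ( int ) + ( int )".split()):
        print(form)
```

Read top to bottom, the printed forms are a rightmost derivation of the input in reverse.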

A Bottom-up Parse in Detail
Grammar: E : E + ( E ) | int
A reverse rightmost derivation:
  int + (int) + (int)
  E + (int) + (int)
  E + (E) + (int)
  E + (int)
  E + (E)
  E
[Figure: the parse tree, built bottom-up over the leaves int + ( int ) + ( int ) as each reduction is made.]

Where Do Reductions Happen?
Because an LR parser produces a reverse rightmost derivation:
  If αβγ is one step of a bottom-up parse with handle αβ,
  and the next reduction is by A : β,
  then γ must be a string of terminals,
  because αAγ ⇒ αβγ is a step in a rightmost derivation.
Intuition: We make decisions about what reduction to use after seeing all symbols in the handle, rather than after seeing only the first (as for LL(1)).

Notation
Idea: Split the input string into two substrings.
  The right substring (a string of terminals) is as yet unprocessed by the parser.
  The left substring has terminals and nonterminals.
(In examples, we'll mark the dividing point with |.)
The dividing point marks the end of the next potential handle.
Initially, all input is unexamined: | x1 x2 ... xn

Shift-Reduce Parsing
Bottom-up parsing uses only two kinds of actions:
  Shift: Move | one place to the right, shifting a terminal to the left string. For example,
    E + ( | int ) ⇒ E + ( int | )
  Reduce: Apply an inverse production at the handle. For example, if E : E + ( E ) is a production, then we might reduce:
    E + ( E + ( E ) | ) ⇒ E + ( E | )

Accepting a String
The process ends when we reduce all the input to the start symbol.
For technical convenience, however, we usually add a new start symbol and a hidden production to handle the end-of-file:
  S′ : S ⊣
Having done this, we can now stop parsing and accept the string whenever we reduce the entire input to S, without bothering to do the final shift and reduce.
This will be the convention from now on.

Shift-Reduce Example
Grammar: E : E + ( E ) | int

  Sent. Form                  Actions
  | int + (int) + (int)       shift
  int | + (int) + (int)       reduce by E : int
  E | + (int) + (int)         shift 3 times
  E + (int | ) + (int)        reduce by E : int
  E + (E | ) + (int)          shift
  E + (E) | + (int)           reduce by E : E + (E)
  E | + (int)                 shift 3 times
  E + (int | )                reduce by E : int
  E + (E | )                  shift
  E + (E) |                   reduce by E : E + (E)
  E |                         accept

The Parsing Stack
The left string (left of the |) can be implemented as a stack:
  The top of the stack is just left of the |.
  Shift pushes a terminal on the stack.
  Reduce pops 0 or more symbols from the stack (corresponding to the production's right-hand side) and pushes a nonterminal on the stack (the production's left-hand side).

Key Issue: When to Shift or Reduce?
Decide based on the left string ("the stack") and some of the remaining input (lookahead tokens), typically one token at most.
Idea: use a DFA to decide when to shift or reduce:
  The DFA alphabet consists of terminals and nonterminals.
  The DFA input is the stack up to the potential handle.
  The DFA recognizes complete handles.
  In addition, the final states are labeled with particular productions that might apply, given the possible lookahead symbols.
We run the DFA on the stack and examine the resulting state, X, and the lookahead token τ after |.
  If X has a transition labeled τ, then shift.
  If X is labeled with "A : β on τ", then reduce.
So we scan the input from Left to right, producing a (reverse) Rightmost derivation, using 1 symbol of lookahead: giving LR(1) parsing.

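LR(1) Parsing. An Example
[Figure: the parsing DFA for E : E + ( E ) | int. Its transitions and final-state labels are listed here.]
  Transitions: 0 -int-> 1, 0 -E-> 2, 2 -+-> 3, 3 -(-> 4, 4 -int-> 5, 4 -E-> 6, 6 -)-> 7, 6 -+-> 8, 8 -(-> 9, 9 -int-> 5, 9 -E-> 10, 10 -)-> 11, 10 -+-> 8
  State 1: E : int on ⊣, +
  State 2: accept on ⊣
  State 5: E : int on ), +
  State 7: E : E + (E) on ⊣, +
  State 11: E : E + (E) on ), +

  0 | int + (int) + (int)         shift
  int 1 | + (int) + (int)         red. by E : int
  E 2 | + (int) + (int)           shift 3 times
  E + (int 5 | ) + (int)          red. by E : int
  E + (E 6 | ) + (int)            shift
  E + (E) 7 | + (int)             red. by E : E + (E)
  E 2 | + (int)                   shift 3 times
  E + (int 5 | )                  red. by E : int
  E + (E 6 | )                    shift
  E + (E) 7 |                     red. by E : E + (E)
  E 2 |                           accept
(The number just before | is the state that the DFA reaches by scanning the left string.)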

LR(1) Parsing. Another Example
[Figure: the same parsing DFA as in the previous example.]

  0 | int + (int + (int + (int)))            shift
  int 1 | + (int + (int + (int)))            red. by E : int
  E 2 | + (int + (int + (int)))              shift 3 times
  E + (int 5 | + (int + (int)))              red. by E : int
  E + (E 6 | + (int + (int)))                shift
  ...
  E + (E + (E + (int 5 | )))                 red. by E : int
  E + (E + (E + (E 10 | )))                  shift
  E + (E + (E + (E) 11 | ))                  red. by E : E + (E)
  E + (E + (E 10 | ))                        shift
  E + (E + (E) 11 | )                        red. by E : E + (E)
  E + (E 6 | )                               shift
  E + (E) 7 |                                red. by E : E + (E)
  E 2 |                                      accept

Representing the DFA
Parsers represent the DFA as a 2D table, as for table-driven lexical analysis.
  Rows correspond to DFA states.
  Columns correspond to terminals and nonterminals.
Classical treatments (like Aho, et al.) split the columns into:
  Those for terminals: the action table.
  Those for nonterminals: the goto table.
The goto table contains only shifts, but conceptually, the tables are very much alike as far as the DFA is concerned.
The classical division has some advantages when it comes to table compression.

Representing the DFA. Example
Here's the table for a fragment of our DFA (states 3 through 7: 3 -(-> 4, 4 -int-> 5, 4 -E-> 6, 6 -)-> 7, with state 5 labeled "E : int on ), +" and state 7 labeled "E : E+(E) on ⊣, +"):

  State  int   +              (     )            ⊣              E
  3                           s4
  4      s5                                                     s6
  5            r E : int            r E : int
  6                                 s7
  7            r E : E+(E)                       r E : E+(E)

Legend:
  sN means shift (or go to) state N.
  r P means reduce using production P.
  Blank entries indicate errors.

A Little Optimization
After a shift or reduce action we rerun the DFA on the entire stack.
This is wasteful, since most of the work is repeated, so
Memoize: instead of putting terminal and nonterminal symbols on the stack, put the DFA states you get to after reading those symbols.
For example, when we've reached this point:
  E + ( E + ( E + (int 5 | )))
store the part to the left of | as
  0 2 3 4 6 8 9 10 8 9 5
And don't throw any of these away until you reduce them.
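As a small check of the memoized stack, the sketch below runs the DFA over a left string and records every state it passes through. The transition table DFA is an assumption of this sketch: it is read off the state numbers in the two example traces, not copied from a table in the lecture, and state_stack is a hypothetical helper name.

```python
# Transitions of the example DFA, keyed by (state, symbol).
# Assumption: reconstructed from the state numbers in the example traces.
DFA = {
    (0, "int"): 1, (0, "E"): 2,
    (2, "+"): 3,
    (3, "("): 4,
    (4, "int"): 5, (4, "E"): 6,
    (6, ")"): 7, (6, "+"): 8,
    (8, "("): 9,
    (9, "int"): 5, (9, "E"): 10,
    (10, ")"): 11, (10, "+"): 8,
}

def state_stack(symbols):
    """Return the DFA states visited while scanning the left string."""
    states = [0]
    for sym in symbols:
        # A missing key means the left string is not a valid prefix.
        states.append(DFA[(states[-1], sym)])
    return states

if __name__ == "__main__":
    print(state_stack("E + ( E + ( E + ( int".split()))
```

Keeping this list on the stack means the next action only needs the last element, so the DFA never has to be rerun from state 0.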

The Actual LR Parsing Algorithm
  Let I = w1 w2 ... wn be the initial input
  Let j = 1
  Let stack = < 0 >
  repeat
    case table[top_state(stack), I[j]] of
      sk: push k on the stack; j += 1
      r X : α:
        pop len(α) states from the stack
        push k on the stack, where table[top_state(stack), X] is sk
      accept: return normally
      error: return parsing error indication
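A minimal Python sketch of this loop for the running grammar E : E + ( E ) | int follows. The ACTION and GOTO tables are assumptions of the sketch: they are filled in by hand from the example DFA and its final-state labels, split in the classical action/goto style, and "$" stands in for the end-of-input marker.

```python
END = "$"  # stand-in for the end-of-input marker

# Production name -> (left-hand side, length of right-hand side).
PRODUCTIONS = {
    "E : int": ("E", 1),
    "E : E + ( E )": ("E", 5),
}

R_INT = ("reduce", "E : int")
R_SUM = ("reduce", "E : E + ( E )")

# Action table (terminal columns), hand-built from the example DFA.
ACTION = {
    (0, "int"): ("shift", 1),
    (1, "+"): R_INT, (1, END): R_INT,
    (2, "+"): ("shift", 3), (2, END): ("accept", None),
    (3, "("): ("shift", 4),
    (4, "int"): ("shift", 5),
    (5, ")"): R_INT, (5, "+"): R_INT,
    (6, ")"): ("shift", 7), (6, "+"): ("shift", 8),
    (7, "+"): R_SUM, (7, END): R_SUM,
    (8, "("): ("shift", 9),
    (9, "int"): ("shift", 5),
    (10, ")"): ("shift", 11), (10, "+"): ("shift", 8),
    (11, ")"): R_SUM, (11, "+"): R_SUM,
}

# Goto table (nonterminal columns).
GOTO = {(0, "E"): 2, (4, "E"): 6, (9, "E"): 10}

def parse(tokens):
    """Return the list of reductions performed, or raise SyntaxError."""
    inp = list(tokens) + [END]
    stack = [0]          # stack of DFA states, as in the memoized version
    j = 0
    reductions = []
    while True:
        act = ACTION.get((stack[-1], inp[j]))
        if act is None:  # blank table entry
            raise SyntaxError(f"unexpected {inp[j]!r} at position {j}")
        kind, arg = act
        if kind == "shift":
            stack.append(arg)
            j += 1
        elif kind == "reduce":
            lhs, length = PRODUCTIONS[arg]
            del stack[len(stack) - length:]       # pop len(rhs) states
            stack.append(GOTO[(stack[-1], lhs)])  # then follow the goto
            reductions.append(arg)
        else:
            return reductions

if __name__ == "__main__":
    print(parse("int + ( int ) + ( int )".split()))
```

The returned list is the sequence of productions in the order they are reduced, which is the rightmost derivation read backwards.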

Parsing Contexts
Consider the state describing the situation at the | in the stack E + ( | int ) + ( int ), which tells us:
  We are looking to reduce E : E + ( E ), having already seen E + ( from the right-hand side.
  Therefore, we expect that the rest of the input starts with something that will eventually reduce to E:
    E : int or E : E + ( E )
  after which we expect to find a ),
  but we have as yet seen nothing from the right-hand sides of either of these two possible productions.
One DFA state captures a set of such contexts in the form of a set of LR(1) items, like this:
  [E : E + ( • E ), ...]
  [E : • int, +] (why?)
  [E : • int, )]
  [E : • E + ( E ), +] (why?)
  [E : • E + ( E ), )]
(Traditionally, use • in items to show where the | is.)

LR(1) Items
An LR(1) item is a pair: [X : α • β, a]
  X : αβ is a production.
  a is a terminal symbol (an expected lookahead).
It says we are trying to find an X followed by a, and that we have already accumulated α on top of the parsing stack.
Therefore, we need to see next a prefix of something derived from βa.
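The way one item drags others into the same state can be sketched with the standard closure operation on LR(1) items. The slides do not spell this operation out, so treat the code as an assumption-laden illustration: items are tuples (lhs, rhs, dot position, lookahead), "$" stands in for the end-of-input marker, and the FIRST computation assumes a grammar with no empty productions.

```python
# Grammar from the slides: E : E + ( E ) | int
GRAMMAR = {"E": [("E", "+", "(", "E", ")"), ("int",)]}

def first(symbol, seen=frozenset()):
    """FIRST set of a single symbol (assumes no empty productions)."""
    if symbol not in GRAMMAR:          # terminal
        return {symbol}
    if symbol in seen:                 # left recursion: nothing new here
        return set()
    result = set()
    for rhs in GRAMMAR[symbol]:
        result |= first(rhs[0], seen | {symbol})
    return result

def closure(items):
    """Add [Y : . gamma, b] for every item [X : alpha . Y beta, a]
    and every b in FIRST(beta a)."""
    items = set(items)
    work = list(items)
    while work:
        lhs, rhs, dot, look = work.pop()
        if dot == len(rhs) or rhs[dot] not in GRAMMAR:
            continue                   # dot at the end or before a terminal
        rest = rhs[dot + 1:] + (look,)
        for new_look in first(rest[0]):
            for prod in GRAMMAR[rhs[dot]]:
                item = (rhs[dot], prod, 0, new_look)
                if item not in items:
                    items.add(item)
                    work.append(item)
    return items

if __name__ == "__main__":
    kernel = ("E", ("E", "+", "(", "E", ")"), 3, "$")
    for item in sorted(closure({kernel})):
        print(item)
```

Starting from [E : E + ( • E ), $], the closure adds the items with lookahead ) directly, and those in turn add the items with lookahead +, which is one way to answer the "(why?)" questions on the Parsing Contexts slide.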

Constructing the Parsing DFA
The idea is to borrow from Earley's algorithm (where we've already seen this notation!).
We throw away a lot of the information that Earley's algorithm keeps around (notably where in the input each current item got introduced), because when we have a handle, there will only be one possible reduction to take based on what we've seen so far.
This allows the set of possible item sets to be finite.
Each state in the DFA has an item set that is derived from what Earley's algorithm would do, but collapsed because of the information we throw away.

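Constructing the Parsing DFA: Partial Example
[Figure: the first seven DFA states with their item sets, listed here.]
  State 0: [S : • E, ⊣], [E : • E + (E), ⊣/+], [E : • int, ⊣/+]
  State 1 (from 0 on int): [E : int •, ⊣/+]; reduce E : int on ⊣, +
  State 2 (from 0 on E): [S : E •, ⊣], [E : E • + (E), ⊣/+]; accept on ⊣
  State 3 (from 2 on +): [E : E + • (E), ⊣/+]
  State 4 (from 3 on "("): [E : E + ( • E), ⊣/+], [E : • int, )/+], [E : • E + (E), )/+]
  State 5 (from 4 on int): [E : int •, )/+]; reduce E : int on ), +
  State 6 (from 4 on E): [E : E + (E • ), ⊣/+], [E : E • + (E), )/+]; outgoing edges on ) and +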

LR Parsing Tables. Notes
We really want to construct parsing tables (i.e., the DFA) from CFGs automatically, since this construction is tedious.
But it is still good to understand the construction in order to work with parser generators, which report errors in terms of sets of items.
What kind of errors can we expect?

Shift/Reduce Conflicts
If a DFA state contains both [X : α • aβ, b] and [Y : γ •, a], then we have two choices when the parser gets into that state at the | and the next input symbol is a:
  Shift into the state containing [X : αa • β, b], or
  Reduce with Y : γ.
This is called a shift-reduce conflict.
Often due to ambiguities in the grammar. Classic example: the dangling else
  S : "if" E "then" S | "if" E "then" S "else" S | ...
This grammar gives rise to a DFA state containing
  [S : "if" E "then" S •, "else"] and [S : "if" E "then" S • "else" S, ...]
So if "else" is next, we can shift or reduce.
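The condition above is mechanical enough to check in code. The sketch below scans one item set for terminals that can be both shifted and used as a reduce lookahead. Items are tuples (lhs, rhs, dot position, lookahead); the function name and the "$" lookahead on the second dangling-else item are placeholders chosen for this sketch, since the slide elides that lookahead and it does not affect the result.

```python
def shift_reduce_conflicts(items, terminals):
    """Return the terminals on which this item set could shift or reduce."""
    # Terminals that appear immediately after the dot can be shifted.
    shiftable = {
        rhs[dot]
        for _, rhs, dot, _ in items
        if dot < len(rhs) and rhs[dot] in terminals
    }
    # A complete item calls for a reduction on its lookahead.
    return {
        look
        for _, rhs, dot, look in items
        if dot == len(rhs) and look in shiftable
    }

if __name__ == "__main__":
    terminals = {"if", "then", "else", "$"}
    dangling_else = {
        ("S", ("if", "E", "then", "S"), 4, "else"),
        # Placeholder lookahead "$": not given on the slide.
        ("S", ("if", "E", "then", "S", "else", "S"), 4, "$"),
    }
    print(shift_reduce_conflicts(dangling_else, terminals))
```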

More Shift/Reduce Conflicts
Consider the ambiguous grammar
  E : E + E | E * E | int
We will have the states containing
  [E : • E * E, *], [E : E + • E, *], ...
which on E leads to
  [E : E • * E, *], [E : E + E •, *], ...
Again we have a shift/reduce conflict on input * (in the second item set).
We probably want to shift (* is usually supposed to bind more tightly than +).
Solution: provide extra information (the precedence of * and +) that allows the parser generator to decide what to do.

Using Precedence in Bison
In Bison, you can declare precedence and associativity of both terminal symbols and rules.
For terminal symbols (tokens), there are precedence declarations, listed from lowest to highest precedence:
  %left '+'
  %left '*'
  %right "**"
Symbols on each such line have the same precedence.
For a rule, precedence = that of its last terminal. (You can override this with %prec if needed; cf. the Bison manual.)
Now, we resolve a shift/reduce conflict with a shift if:
  The next input token has higher precedence than the rule, or
  The next input token has the same precedence as the rule and the relevant precedence declaration was %right.
Otherwise, we choose to reduce the rule.
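The resolution rule just stated can be modeled as a small function. This is a sketch of the rule as described in these notes, not of Bison's implementation: the names build_table, rule_precedence and resolve are invented here, %prec overrides are not modeled, and the "unresolved" result for symbols without a declared precedence is an assumption of the sketch.

```python
def build_table(declarations):
    """Map each token to (level, associativity); level 1 is the lowest."""
    table = {}
    for level, (assoc, tokens) in enumerate(declarations, start=1):
        for tok in tokens:
            table[tok] = (level, assoc)
    return table

def rule_precedence(rhs, table, terminals):
    """A rule takes the precedence of its last terminal, if that has one."""
    for sym in reversed(rhs):
        if sym in terminals:
            return table.get(sym)
    return None

def resolve(rhs, lookahead, table, terminals):
    """Decide between shifting lookahead and reducing by the rule rhs."""
    rule = rule_precedence(rhs, table, terminals)
    token = table.get(lookahead)
    if rule is None or token is None:
        return "unresolved"            # assumption: leave the conflict alone
    if token[0] > rule[0]:
        return "shift"                 # token binds more tightly
    if token[0] == rule[0] and token[1] == "right":
        return "shift"                 # same level, right-associative
    return "reduce"

TERMINALS = {"+", "*", "**", "int"}
TABLE = build_table([("left", ["+"]), ("left", ["*"]), ("right", ["**"])])

if __name__ == "__main__":
    print(resolve(["E", "+", "E"], "*", TABLE, TERMINALS))
    print(resolve(["E", "+", "E"], "+", TABLE, TERMINALS))
```

With the three declarations above, a pending E + E followed by * shifts, while a pending E + E followed by + reduces, which gives + its left associativity.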

Example of Using Precedence to Solve S/R Conflict (1)
Assuming we've declared
  %left '+'
  %left '*'
the rule E : E + E will have precedence 1 (left-associative) and the rule E : E * E will have precedence 2. So, when the parser confronts the choice in state 6 with next token *,
  State 5: [E : E + • E, */+], [E : • E + E, */+], [E : • E * E, */+], etc.
  State 5 goes on E to state 6.
  State 6: [E : E + E •, */+], [E : E • + E, */+], [E : E • * E, */+]
it will choose to shift because the * has higher precedence than the rule E + E.
On the other hand, with input symbol +, it will choose to reduce, because the input token then has the same precedence as the rule to be reduced, and is left-associative.

Example of Using Precedence to Solve S/R Conflict (2)
Back to our dangling else example. We'll have the state
  State 10: [S : "if" E "then" S •, "else"], [S : "if" E "then" S • "else" S, "else"], etc.
We can eliminate the conflict by declaring the token "else" to have higher precedence than "then" (and thus, than the first rule above).
HOWEVER: it is best to limit use of precedence to these standard examples (expressions, dangling elses). If you simply throw precedence declarations in because you have a conflict you don't understand, you're likely to end up with unexpected parse trees or syntax errors.

Reduce/Reduce Conflicts
The lookahead symbols in LR(1) items are only considered for reductions in items that end in •.
If a DFA state contains both [X : α •, a] and [Y : β •, a], then on input a we don't know which production to reduce.
Such reduce/reduce conflicts are often due to a gross ambiguity in the grammar.
Example: defining a sequence of identifiers with
  S : ε | id | id S
There are two parse trees for the string id: S ⇒ id, or S ⇒ id S ⇒ id.

Reduce/Reduce Conflicts in DFA
For this example, you'll get states:
  State 0: [S′ : • S, ⊣], [S : •, ⊣], [S : • id, ⊣], [S : • id S, ⊣]
  State 1 (from 0 on id): [S : id •, ⊣], [S : id • S, ⊣], [S : •, ⊣], [S : • id, ⊣], [S : • id S, ⊣]
Reduce/reduce conflict on input ⊣ (in state 1).
Better rewrite the grammar: S : ε | id S.
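A reduce/reduce conflict can be found by grouping the complete items of a state by lookahead. The sketch below does this for the state reached on id, once for the ambiguous grammar and once for the rewritten one. Items are tuples (lhs, rhs, dot position, lookahead), "$" stands in for the end-of-input marker, and the item sets are written out by hand as assumptions of the sketch.

```python
from collections import defaultdict

def reduce_reduce_conflicts(items):
    """Map each lookahead to the productions competing to reduce on it."""
    by_lookahead = defaultdict(set)
    for lhs, rhs, dot, look in items:
        if dot == len(rhs):            # complete item: dot at the end
            by_lookahead[look].add((lhs, rhs))
    return {look: prods for look, prods in by_lookahead.items()
            if len(prods) > 1}

# State reached on id for S : empty | id | id S (hand-written).
AMBIGUOUS_STATE = {
    ("S", ("id",), 1, "$"),
    ("S", ("id", "S"), 1, "$"),
    ("S", (), 0, "$"),                 # the empty production is complete
    ("S", ("id",), 0, "$"),
    ("S", ("id", "S"), 0, "$"),
}

# State reached on id for the rewritten grammar S : empty | id S.
REWRITTEN_STATE = {
    ("S", ("id", "S"), 1, "$"),
    ("S", (), 0, "$"),
    ("S", ("id", "S"), 0, "$"),
}

if __name__ == "__main__":
    print(reduce_reduce_conflicts(AMBIGUOUS_STATE))
    print(reduce_reduce_conflicts(REWRITTEN_STATE))
```

In the ambiguous grammar, both S : id and the empty production are complete with the same lookahead; after the rewrite only the empty production is, so the conflict disappears.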

Relation to Bison
Bison builds this kind of machine.
However, for efficiency reasons, it collapses many of the states together, namely those that differ only in lookahead sets but otherwise have identical sets of items.
The result is called an LALR(1) parser (as opposed to LR(1)).
This causes some additional conflicts, but these are rare.
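The collapsing step can be illustrated by grouping LR(1) item sets by their core, that is, the items with lookaheads stripped. This sketch only merges item sets; a real generator also has to redirect the DFA transitions to the merged states, which is omitted here. The function names and the "$" end marker are assumptions of the sketch, and the two sample states are states 1 and 5 of the example DFA.

```python
def core(items):
    """The item set with lookaheads removed."""
    return frozenset((lhs, rhs, dot) for lhs, rhs, dot, _ in items)

def merge_by_core(states):
    """Union together all item sets that share the same core."""
    merged = {}
    for items in states:
        merged.setdefault(core(items), set()).update(items)
    return list(merged.values())

# States 1 and 5 of the example DFA: same item, different lookaheads.
STATE_1 = {("E", ("int",), 1, "$"), ("E", ("int",), 1, "+")}
STATE_5 = {("E", ("int",), 1, ")"), ("E", ("int",), 1, "+")}
STATE_3 = {("E", ("E", "+", "(", "E", ")"), 2, "$"),
           ("E", ("E", "+", "(", "E", ")"), 2, "+")}

if __name__ == "__main__":
    for state in merge_by_core([STATE_1, STATE_5, STATE_3]):
        print(sorted(state))
```

After merging, the single "E : int" state reduces on any of the three lookaheads, which is where the occasional extra LALR(1) conflict can come from.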

A Hierarchy of Grammar Classes
[Figure: nested grammar classes, from Andrew Appel, Modern Compiler Implementation in Java. Inside the unambiguous grammars: LR(0) within SLR within LALR(1) within LR(1) within LR(k), and LL(0) within LL(1) within LL(k). Ambiguous grammars lie outside all of these classes.]

Summary
Parsing provides a means of tying translation actions to syntax clearly.
A simple parser: LL(1), recursive descent.
A more powerful parser: LR(1).
An efficiency hack: LALR(1), as in Bison.
Earley's algorithm provides a complete algorithm for parsing all context-free languages.
We can get the same effect in Bison by other means (the %glr-parser option, for Generalized LR), as seen in one of the examples from lecture #5.