CS 4120 Introduction to Compilers

Similar documents
Outline CS412/413. Administrivia. Review. Grammars. Left vs. Right Recursion. More tips forll(1) grammars Bottom-up parsing LR(0) parser construction

MIT Parse Table Construction. Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology

Formal Languages and Compilers Lecture VII Part 3: Syntactic A

LR Parsing - The Items

Bottom-Up Parsing. Lecture 11-12

shift-reduce parsing

Bottom-Up Parsing. Lecture 11-12

CS453 : JavaCUP and error recovery. CS453 Shift-reduce Parsing 1

Bottom Up Parsing. Shift and Reduce. Sentential Form. Handle. Parse Tree. Bottom Up Parsing 9/26/2012. Also known as Shift-Reduce parsing

Lecture Bottom-Up Parsing

How do LL(1) Parsers Build Syntax Trees?

LR Parsing LALR Parser Generators

Lexical and Syntax Analysis. Bottom-Up Parsing

Syntax Analysis Part I

EDAN65: Compilers, Lecture 06 A LR parsing. Görel Hedin Revised:

S Y N T A X A N A L Y S I S LR

Bottom-up parsing. Bottom-Up Parsing. Recall. Goal: For a grammar G, withstartsymbols, any string α such that S α is called a sentential form

UNIT III & IV. Bottom up parsing

More Bottom-Up Parsing

In One Slide. Outline. LR Parsing. Table Construction

Context-free grammars

LR Parsing LALR Parser Generators

Let us construct the LR(1) items for the grammar given below to construct the LALR parsing table.

Compiler Construction: Parsing

LALR stands for look ahead left right. It is a technique for deciding when reductions have to be made in shift/reduce parsing. Often, it can make the

UNIT-III BOTTOM-UP PARSING

A left-sentential form is a sentential form that occurs in the leftmost derivation of some sentence.

Example CFG. Lectures 16 & 17 Bottom-Up Parsing. LL(1) Predictor Table Review. Stacks in LR Parsing 1. Sʹ " S. 2. S " AyB. 3. A " ab. 4.

Bottom up parsing. The sentential forms happen to be a right most derivation in the reverse order. S a A B e a A d e. a A d e a A B e S.

Review main idea syntax-directed evaluation and translation. Recall syntax-directed interpretation in recursive descent parsers

SLR parsers. LR(0) items

CS606- compiler instruction Solved MCQS From Midterm Papers

LR Parsing. Leftmost and Rightmost Derivations. Compiler Design CSE 504. Derivations for id + id: T id = id+id. 1 Shift-Reduce Parsing.

Compiler Design 1. Bottom-UP Parsing. Goutam Biswas. Lect 6

CS 2210 Sample Midterm. 1. Determine if each of the following claims is true (T) or false (F).

Bottom-Up Parsing II (Different types of Shift-Reduce Conflicts) Lecture 10. Prof. Aiken (Modified by Professor Vijay Ganesh.

Recursive Descent Parsers

LR(0) Parsing Summary. LR(0) Parsing Table. LR(0) Limitations. A Non-LR(0) Grammar. LR(0) Parsing Table CS412/CS413

Compilers. Bottom-up Parsing. (original slides by Sam

CS453 : Shift Reduce Parsing Unambiguous Grammars LR(0) and SLR Parse Tables by Wim Bohm and Michelle Strout. CS453 Shift-reduce Parsing 1

Parsing Wrapup. Roadmap (Where are we?) Last lecture Shift-reduce parser LR(1) parsing. This lecture LR(1) parsing

PART 3 - SYNTAX ANALYSIS. F. Wotawa TU Graz) Compiler Construction Summer term / 309

8 Parsing. Parsing. Top Down Parsing Methods. Parsing complexity. Top down vs. bottom up parsing. Top down vs. bottom up parsing

Bottom-up Parser. Jungsik Choi

LR Parsers. Aditi Raste, CCOEW

Bottom-Up Parsing II. Lecture 8

Downloaded from Page 1. LR Parsing

Top down vs. bottom up parsing

Chapter 4: LR Parsing

Principles of Programming Languages

Lecture 8: Deterministic Bottom-Up Parsing

Parsing III. CS434 Lecture 8 Spring 2005 Department of Computer Science University of Alabama Joel Jones

Talen en Compilers. Johan Jeuring , period 2. January 17, Department of Information and Computing Sciences Utrecht University

LR Parsing E T + E T 1 T

LR Parsing Techniques

CSCI312 Principles of Programming Languages

LR Parsing, Part 2. Constructing Parse Tables. An NFA Recognizing Viable Prefixes. Computing the Closure. GOTO Function and DFA States

CSE P 501 Compilers. LR Parsing Hal Perkins Spring UW CSE P 501 Spring 2018 D-1

Lecture 7: Deterministic Bottom-Up Parsing

Table-driven using an explicit stack (no recursion!). Stack can be viewed as containing both terminals and non-terminals.

Parsers. Xiaokang Qiu Purdue University. August 31, 2018 ECE 468

Review: Shift-Reduce Parsing. Bottom-up parsing uses two actions: Bottom-Up Parsing II. Shift ABC xyz ABCx yz. Lecture 8. Reduce Cbxy ijk CbA ijk

LR Parsing. Table Construction

3. Syntax Analysis. Andrea Polini. Formal Languages and Compilers Master in Computer Science University of Camerino

Chapter 4. Lexical and Syntax Analysis

CS 314 Principles of Programming Languages

Wednesday, September 9, 15. Parsers

Parsers. What is a parser. Languages. Agenda. Terminology. Languages. A parser has two jobs:

Review of CFGs and Parsing II Bottom-up Parsers. Lecture 5. Review slides 1

Monday, September 13, Parsers

Section A. A grammar that produces more than one parse tree for some sentences is said to be ambiguous.

Wednesday, August 31, Parsers

MODULE 14 SLR PARSER LR(0) ITEMS

CSE 401 Compilers. LR Parsing Hal Perkins Autumn /10/ Hal Perkins & UW CSE D-1

Introduction to Bottom-Up Parsing

The following deflniüons h:i e been establish for the tokens: LITERAL any group olcharacters surrounded by matching quotes.

Compiler Construction 2016/2017 Syntax Analysis

Syntax Analysis. Amitabha Sanyal. ( as) Department of Computer Science and Engineering, Indian Institute of Technology, Bombay

4. Lexical and Syntax Analysis

Top-Down Parsing and Intro to Bottom-Up Parsing. Lecture 7

Bottom-Up Parsing LR Parsing

4. Lexical and Syntax Analysis

Lecture Notes on Shift-Reduce Parsing

Top-Down Parsing and Intro to Bottom-Up Parsing. Lecture 7

Introduction to Parsing. Comp 412

Lecture 14: Parser Conflicts, Using Ambiguity, Error Recovery. Last modified: Mon Feb 23 10:05: CS164: Lecture #14 1

Syntax-Directed Translation. Lecture 14

CS143 Handout 20 Summer 2011 July 15 th, 2011 CS143 Practice Midterm and Solution

Parsing III. (Top-down parsing: recursive descent & LL(1) )

Intro to Bottom-up Parsing. Lecture 9

Parsing. Handle, viable prefix, items, closures, goto s LR(k): SLR(1), LR(1), LALR(1)


Syntax Analysis: Context-free Grammars, Pushdown Automata and Parsing Part - 4. Y.N. Srikant

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

Outline. The strategy: shift-reduce parsing. Introduction to Bottom-Up Parsing. A key concept: handles

Concepts Introduced in Chapter 4

Conflicts in LR Parsing and More LR Parsing Types

CSC 4181 Compiler Construction. Parsing. Outline. Introduction


CSX-lite Example. LL(1) Parse Tables. LL(1) Parser Driver. Example of LL(1) Parsing. An LL(1) parse table, T, is a twodimensional

Transcription:

CS 4120 Introduction to Compilers Andrew Myers Cornell University Lecture 6: Bottom-Up Parsing 9/9/09

Bottom-up parsing A more powerful parsing technology LR grammars -- more expressive than LL can handle left-recursive grammars, virtually all programming languages Easier to express programming language syntax Shift-reduce parsers construct right-most derivation of program automatic parser generators (e.g., yacc, CUP, ocamlyacc) detect errors as soon as possible allows better error recovery CS 4120 Introduction to Compilers 2

Top-down parsing (1+2+(3+4))+5 S S+E E+E (S)+E (S +E)+E (S+E+E)+E (E+E +E)+E (1+E+E)+E (1+2+E) +E... In left-most derivation, entire tree above a token (2) has to be expanded when encountered Must be able to predict productions! S S + E E E n ( S ) S S + E E ( S ) S + E 5 S + E ( S ) E 2 S + E 1 E 4 3 CS 4120 Introduction to Compilers 3

Bottom-up parsing Right-most derivation -- backward Start with the tokens End with the start symbol S S + E E E number ( S ) (1+2+(3+4))+5 (E+2+(3+4))+5 (S+2+(3+4))+5 (S+E+(3+4))+5 (S+(3+4))+5 (S+(E+4))+5 (S+(S+4))+5 (S+(S+E))+5 (S+(S))+5 (S +E)+5 (S)+5 E+5 S+E S CS 4120 Introduction to Compilers 4

Progress of bottom-up parsing right-most derivation (1+2+(3+4))+5 (1+2+(3+4))+5 (E+2+(3+4))+5 (1 +2+(3+4))+5 (S+2+(3+4))+5 (1 +2+(3+4))+5 (S+E+(3+4))+5 (1+2 +(3+4))+5 (S+(3+4))+5 (1+2+(3 +4))+5 (S+(E+4))+5 (1+2+(3 +4))+5 (S+(S+4))+5 (1+2+(3 +4))+5 (S+(S+E))+5 (1+2+(3+4 ))+5 (S+(S))+5 (1+2+(3+4 ))+5 (S+E)+5 (1+2+(3+4) )+5 (S)+5 (1+2+(3+4) )+5 E+5 (1+2+(3+4)) +5 S+E (1+2+(3+4))+5 S (1+2+(3+4))+5 CS 4120 Introduction to Compilers 5

Bottom-up parsing (1+2+(3+4))+5 (E+2+(3+4))+5 (S+2+(3+4))+5 (S+E+(3+4))+5 S S + E E E number ( S ) Advantage of bottom-up parsing: select productions using more information S S + E E ( S ) S + E 5 S + E ( S ) E 2 S + E 1 E 4 3 CS 4120 Introduction to Compilers 6

Top-down vs. Bottom-up Bottom-up: Don t need to figure out as much of the parse tree for a given amount of input scanned unscanned Top-down scanned unscanned Bottom-up CS 4120 Introduction to Compilers 7

Shift-reduce parsing Parsing is a sequence of shift and reduce operations Parser state is a stack of terminals and non-terminals (grows to the right) Unconsumed input is a string of terminals Current derivation step is always stack+input Derivation step stack unconsumed input (1+2+(3+4))+5 (1+2+(3+4))+5 (E+2+(3+4))+5 (E +2+(3+4))+5 (S+2+(3+4))+5 (S +2+(3+4))+5 (S+E+(3+4))+5 (S+E +(3+4))+5 CS 4120 Introduction to Compilers 8

Shift-reduce parsing Parsing is a sequence of shifts and reduces Shift : move lookahead token to stack. No effect on derivation. stack input action ( 1+2+(3+4))+5 shift 1 (1 +2+(3+4))+5 Reduce : Replace symbols γ in top of stack with nonterminal symbol X, corresponding to production X γ (pop γ, push X). Reduces rightmost nonterminal. stack input action (S+E +(3+4))+5 reduce S S+E (S +(3+4))+5 9

Shift-reduce parsing S S + E E E number ( S ) derivation stack input stream action (1+2+(3+4))+5 (1+2+(3+4))+5 ( (1+2+(3+4))+5 shift 1+2+(3+4))+5 shift (1+2+(3+4))+5 (1 +2+(3+4))+5 reduce E num (E+2+(3+4))+5 (E +2+(3+4))+5 reduce S E (S+2+(3+4))+5 (S +2+(3+4))+5 shift (S+2+(3+4))+5 (S+ 2+(3+4))+5 shift (S+2+(3+4))+5 (S+2 +(3+4))+5 reduce E num (S+E+(3+4))+5 (S+E +(3+4))+5 reduce S S+E (S+(3+4))+5 (S +(3+4))+5 shift (S+(3+4))+5 (S+ (3+4))+5 shift (S+(3+4))+5 (S+( 3+4))+5 shift (S+(3+4))+5 (S+(3 +4))+5 reduce E num 10

Problem How do we know which action to take -- whether to shift or reduce, and which production? Sometimes can reduce but shouldn t. e.g., X ε can always be reduced Sometimes can reduce in more than one way. CS 4120 Introduction to Compilers 11

Action Selection Problem Given stack σ and look-ahead symbol b, should parser: shift b onto the stack (making it σb) reduce some production X γ assuming that stack has the form α γ (making it αx) If stack has form α γ, should apply reduction X γ (or shift) depending on stack prefix α α is different for different possible reductions, since γ s have different length. How to keep track of possible reductions? CS 4120 Introduction to Compilers 12

Parser States Goal: know what reductions are legal at any given point. Idea: summarize all possible stacks σ (and prefixes α) as a finite parser state Parser state is computed by a DFA that reads in the stack σ Accept states of DFA: unique reduction! Summarizing discards information affects what grammars parser handles affects size of DFA (number of states) CS 4120 Introduction to Compilers 13

LR(0) parser Left-to-right scanning, Right-most derivation, zero look-ahead characters Too weak to handle most language grammars (e.g., sum grammar) But will help us understand shift-reduce parsing... CS 4120 Introduction to Compilers 14

LR(0) states A state is a set of items keeping track of progress on possible upcoming reductions An LR(0) item is a production from the language with a separator. somewhere in the RHS of the production E num. state E (. S ) item Stuff before. is already on stack (beginnings of possible γ s to be reduced) Stuff after. : what we might see next e prefixes α represented by state itself 15

An LR(0) grammar: non-empty lists S ( L ) id L S L, S x (x,y) (x, (y,z), w) ((((x)))) (x, (y, (z, w))) (x,(y,z),w) S ( L ) L, S L, S S x w ( L ) L, S S y z CS 4120 Introduction to Compilers 16

Start State & Closure S ( L ) id L S L, S DFA start state closure S. S $ S. S $ S. ( L ) S. id Constructing a DFA to read stack: First step: augment grammar with prod n S S $ Start state of DFA: empty stack = S. S $ Closure of a state adds items for all productions whose LHS occurs in an item in the state, just after. set of possible productions to be reduced next Added items have the. located at the beginning: no symbols for these items on the stack yet CS 4120 Introduction to Compilers 17

Applying terminal symbols S. S $ S. ( L ) S. id id ( S id. S (. L ) L. S L. L, S S. ( L ) S. id id ( S ( L ) id L S L, S In new state, include all items that have appropriate input symbol just after dot, advance dot in those items, and take closure. CS 4120 Introduction to Compilers 18

Applying non-terminals S. S $ S. ( L ) S. id id ( id S id. S (. L ) L. S L. L, S S. ( L ) S. id ( L S S ( L. ) L L., S L S. Non-terminals on stack treated just like terminals (but added by reductions) CS 4120 Introduction to Compilers 19

Applying reduce actions S. S $ S. ( L ) S. id id ( S id. S (. L ) L. S L. L, S S. ( L ) S. id id Pop RHS off stack, replace with LHS X (X γ), rerun DFA (e.g. (x)) ( L S states causing reductions S ( L. ) L L., S L S. CS 4120 Introduction to Compilers 20

Full DFA (Appel) S ( L ) id L S L, S 1 4 S. S $ S. ( L ) S. id S S S. $ $ final state ( ( id 3 2 S id. id S (. L ) L. S L. L, S S. ( L ) S. id S L S. 7 id L L L,. S S. ( L ) S. id, S ( L. ) L L., S ) S ( L ). 8 9 S L L, S. reduce-only state: reduce if shift transition for look-ahead: shift otherwise: syntax error current state: push stack through DFA 5 6 CS 4120 Introduction to Compilers 21

Parsing example: ((x),y) derivation stack input action ((x),y) 1 ((x),y) shift, goto 3 ((x),y) 1 ( 3 (x),y) shift, goto 3 ((x),y) 1 ( 3 ( 3 x),y) shift, goto 2 ((x),y) 1 ( 3 ( 3 x 2 ),y) reduce S id ((S),y) 1 ( 3 ( 3 S 7 ),y) reduce L S ((L),y) 1 ( 3 ( 3 L 5 ),y) shift, goto 6 ((L),y) 1 ( 3 ( 3 L 5 ) 6,y) reduce S (L) (S,y) 1 ( 3 S 7,y) reduce L S (L,y) 1 ( 3 L 5,y) shift, goto 8 (L,y) 1 ( 3 L 5, 8 y) shift, goto 9 (L,y) 1 ( 3 L 5, 8 y 2 ) reduce S id (L,S) 1 ( 3 L 5, 8 S 9 ) reduce L L, S (L) 1 ( 3 L 5 ) shift, goto 6 (L) 1 ( 3 L 5 ) 6 reduce S (L) S 1 S 4 $ done S ( L ) id L S L, S CS 4120 Introduction to Compilers 22

Optimization Don t need to rerun DFA from beginning on every reduction On reducing X γ with stack αγ: pop γ off stack, revealing prefix α and state take single step in DFA from top state push X onto stack with new DFA state ((L), y) state = 6 (S, y) state =? 23

Implementation: LR parsing table input (terminal) symbols non-terminal symbols state next action state next state Action table Used at every step to decide whether to shift or reduce Goto table Used only when reducing, to determine next state a X γ. X CS 4120 Introduction to Compilers 24

Shift-reduce parsing table terminal symbols non-terminal symbols action table 1. shift and goto state n 2. reduce using X γ pop symbols γ off stack using state label of top (end) of stack, look up X in goto table and go to that state DFA + stack = push-down automaton (PDA) state next actions next state on red n CS 4120 Introduction to Compilers 25

List grammar parsing table ( ) id, $ S L 1 s3 s2 g4 2 S id S id S id S id S id 3 s3 s2 g7 g5 4 accept 5 s6 s8 6 S (L) S (L) S (L) S (L) S (L) 7 L S L S L S L S L S 8 s3 s2 g9 9 L L,S L L,S L L,S L L,S L L,S 26

Shift-reduce parsing Grammars can be parsed bottom-up using a DFA + stack DFA processes stack σ to decide what reductions might be possible given shift-reduce parser or push-down automaton (PDA) Compactly represented as LR parsing table State construction converts grammar into states that decide action to take CS 4120 Introduction to Compilers 27