# Compiler Construction: Parsing

Size: px
Start display at page:

Transcription

1 Compiler Construction: Parsing Mandar Mitra Indian Statistical Institute M. Mitra (ISI) Parsing 1 / 33

2 Context-free grammars. Reference: Section 4.2 Formal way of specifying rules about the structure/syntax of a program terminals - tokens non-terminals - represent higher-level structures of a program start symbol, productions Example: E E op E (E) E id op + / % NOTE: recall token vs. lexeme difference Derivation: starting from the start symbol, use productions to generate a string (sequence of tokens) Parse tree: pictorial representation of a derivation M. Mitra (ISI) Parsing 2 / 33

3 Ambiguity Reference: Section 4.2, 4.3. Left-most derivation: at each step, replace left-most non-terminal Ambiguous grammar: G is ambiguous if a string has > 1 left-most (or right-most) derivation ALT: G is ambiguous if > 1 parse tree can be constructed for a string Examples: 1. E E + E E E E 2. stmt if expr then stmt if expr then stmt else stmt [ SOLUTION: stmt matched unmatched matched if E then matched else matched unmatched if E then stmt if E then matched else unmatched ] M. Mitra (ISI) Parsing 3 / 33

4 Top-down parsing - I Reference: Section 4.4. Recursive descent parsing: Corresponds to finding a leftmost derivation for an input string Equivalent to constructing parse tree in pre-order Example: Grammar: S cad Input: cad A ab a Problems: 1. backtracking involved ( buffering of tokens required) 2. left recursion will lead to infinite looping 3. left factors may cause several backtracking steps M. Mitra (ISI) Parsing 4 / 33

5 Top-down parsing - I Reference: Section 4.3. Left recursion: G istb left recursive if for some non-terminal A, A + Aα Simple case I: A Aα β A βa A αa ϵ Simple case II: A Aα 1 Aα 2... Aα m β 1... β n A β 1 A... β n A A α 1 A α 2 A... α m A ϵ M. Mitra (ISI) Parsing 5 / 33

6 Top-down parsing - I General case: Input: G without any cycles (A + A) or ε-productions Output: equivalent non-recursive grammar Algorithm: Let the non-terminals be A 1,... A n. for i = 1 to n do for j = 1 to i 1 do replace A i A j γ by A i δ 1 γ δ 2 γ... δ k γ where A j δ 1 δ 2... δ k are the current A j productions end for Eliminate immediate left-recursion for A i. end for M. Mitra (ISI) Parsing 6 / 33

7 Top-down parsing - I Left factoring: Example: stmt if ( expr ) stmt Algorithm: if ( expr ) stmt else stmt while left factors exist do for each non-terminal A do Find longest prefix α common to 2 rules Replace A αβ 1... αβ n... by A αa... A β 1... β n end for end while M. Mitra (ISI) Parsing 7 / 33

8 Top-down parsing - II Reference: Section 4.4. Predictive parsing: recursive descent parsing without backtracking Principle: Given current input symbol and non-terminal, we should be able to determine which production is to be used Example: stmt if ( expr )... while... for... M. Mitra (ISI) Parsing 8 / 33

9 Top-down parsing - II Implementation: use transition diagrams (1 per non-terminal) A X 1 X 2... X n s X 1 X 2 X... n F 1. If X i is a terminal, match with next input token and advance to next state. 2. If X i is a non-terminal, go to the transition diagram for X i, and continue. On reaching the final state of that transition diagram, advance to next state of current transition diagram. Example: E E + T T T T F F F (E) id M. Mitra (ISI) Parsing 9 / 33

10 Top-down parsing - II Non-recursive implementation: a Input top X Stack TABLE Table: 2-d array s.t. M[A, a] specifies A-production to be used if input symbol is a Algorithm: 0. Initially: stack contains EOF S, input pointer is at start of input 1. if X = a = EOF, done 2. if X = a EOF, pop stack and advance input pointer 3. if X is non-terminal, lookup M[X, a] X UV W pop X, push W, V, U M. Mitra (ISI) Parsing 10 / 33

11 FIRST and FOLLOW FIRST(α): set of terminals that begin strings derived from α if α ϵ, then ϵ FIRST (α) FOLLOW (A): set of terminals that can appear immediately to the right of A in some sentential form FOLLOW (A) = {a S αaaβ} if A is the rightmost symbol in any sentential form, then EOF FOLLOW (A) M. Mitra (ISI) Parsing 11 / 33

13 Table construction Algorithm: 1. For each production A α (a) for each terminal a FIRST (α), add A α to M[A, a] (b) if ϵ FIRST (α), add A α to M[A, b] for each terminal b FOLLOW (A) 2. Mark all other entries error Example: Grammar: E E + T T T T F F F (E) id Input: id + id * id M. Mitra (ISI) Parsing 13 / 33

14 LL(1) grammars If the table has no multiply defined entries, grammar istb LL(1) L - left-to-right L - leftmost 1 - lookahead If G is LL(1), then G cannot be left-recursive or ambiguous Example: S i E t S S a S e S ϵ E b M[S, e] = {S ϵ, S e S} Some non-ll(1) grammars may be transformed into equivalent LL(1) grammars M. Mitra (ISI) Parsing 14 / 33

15 LL(1) parsing Error recovery: 1. if X is a terminal, but X a, pop X 2. if M[X, a] is blank, skip a 3. if M[X, a] = synch, pop X, but do not advance input pointer Synch sets: use FOLLOW (A) add the FIRST set of a higher-level non-terminal to the synch set of a lower-level non-terminal M. Mitra (ISI) Parsing 15 / 33

16 Bottom-up parsing Reference: Section 4.5. Example: Grammar: E E + T T Input: id + id * id T T F F F (E) id Sentential form: any string α s.t. S α Handle: for a sentential form γ, handle is a production A β and a position of β in γ, s.t. β may be replaced by A to produce the previous right-sentential form in a rightmost derivation of γ Properties: 1. string to right of handle must consist of terminals only 2. if G is unambiguous, every right-sentential form has a unique handle Advantages: 1. No backtracking 2. More powerful than LL(1) / predictive parsing M. Mitra (ISI) Parsing 16 / 33

17 Bottom-up parsing Reference: Section 4.7. Implementation scheme: 0. Use input buffer, stack, and parsing table. 1. Shift 0 input symbols onto stack until a handle β is on top of stack. 2. Reduce β to A (i.e. pop symbols of β and push A). 3. Stop when stack = EOF, S, and input pointer is at EOF. Stack: s o X 1 s 1... X m s m, where each s i represents a state (current situation in the parsing process) Table: used to guide steps 2 and 3 2-d array indexed by state, input symbol pairs consists of two parts (action + goto) M. Mitra (ISI) Parsing 17 / 33

18 Bottom-up parsing Algorithm: 1. Initially, stack = s 0 (initial state) 2. Let s - state on top of stack a - current input symbol if action[s, a] = shift s push a, s on stack, advance input pointer if action[s, a] = reduce A β pop 2 β symbols let s be the new top of stack push A, goto[s, A] on stack if action[s, a] = accept, done else error M. Mitra (ISI) Parsing 18 / 33

19 SLR parsing Grammar augmentation: Create new start symbol S ; add S S to productions Item (LR(0) item): production of G with a dot at some position in the RHS, representing how much of the RHS has already been seen at a given point in the parse Example: A ϵ A Closure: Let I be a set of items closure(i) I repeat until no more changes for each A α Bβ in closure(i) for each production B γ s.t. B γ closure(i) add B γ to closure(i) Example: closure(e E) M. Mitra (ISI) Parsing 19 / 33

20 SLR parsing Goto construction: goto(i, X) = closure( {A αx β A α Xβ I} ) Example: Let I = {E E, E E +T } goto(i, +) = closure({e E + T }) Canonical collection construction: 1. C {closure({s S})} 2. repeat until no more changes: for each I C, for each grammar symbol X if goto(i, X) is not empty and not in C add goto(i, X) to C M. Mitra (ISI) Parsing 20 / 33

21 SLR parsing Table construction: 1. Let C = {I 0,..., I n } be the canonical collection of LR(0) items for G. 2. Create a state s i corresponding to each I i. The set containing S S corresponds to the initial state. 3. If A α aβ I i and goto(i i, a) = I j, then action(s i, a) = shift s j. 4. If A α I i (A S ), then action(s i, a) = reduce A α for all a FOLLOW (A). 5. If S S I i, then action(s i, EOF) = accept. 6. If goto(i i, a) = I j, then goto(s i, a) = s j. 7. Mark all blank entries error. M. Mitra (ISI) Parsing 21 / 33

22 Conflicts 1. Shift-reduce conflict: stmt if ( expr ) stmt if ( expr ) stmt else stmt 2. Reduce-reduce conflict: stmt id ( param list ) ; Example: expr id ( expr list ) param id expr id Grammar: S L = R S R L R L id R L Canonical collection: I 0 = {S S,...} I 2 = {S L = R, R L }... Table: action(2, =) = shift... action(2, =) = reduce... NOTE: SLR(1) grammars are unambiguous, but not vice versa.. M. Mitra (ISI) Parsing 22 / 33

23 Canonical LR parsers Motivation: Reduction by A α not necessarily proper even if a FOLLOW (A) explicitly indicate tokens for which reduction is acceptable LR(1) item: pair of the form A α β, a, where A αβ is a production, a is a terminal or EOF Properties: 1. A α β, a - lookahead has no effect A α, a - reduce only if input symbol is a 2. {a A α, a canonical collection} FOLLOW (A) M. Mitra (ISI) Parsing 23 / 33

24 Canonical LR parsers Closure: Let I be a set of items closure(i) I repeat until no more changes for each item A α Bβ, a in closure(i) for each production B γ and each terminal b FIRST (βa) if B γ, b closure(i) add B γ, b to closure(i) Goto: goto(i, X) = closure({ A αx β, a A α Xβ, a I}) Canonical collection construction: 1. C {closure({ S S, EOF })} 2. /* Similar to SLR algorithm */ M. Mitra (ISI) Parsing 24 / 33

25 Canonical LR parsers Table construction: 1. If A α aβ, b I i and goto(i i, a) = I j then action(i, a) = shift j. 2. If A α, b I i, then action(i, b) = reduce A α (A S ). If S S, EOF I i, then action(i, EOF) = accept. M. Mitra (ISI) Parsing 25 / 33

26 LALR parsers Motivation: try to combine efficiency of SLR parser with power of canonical method Core: set of LR(0) items corresponding to a set of LR(1) items Method: 1. Construct canonical collection of LR(1) items, C = {I 0,..., I n }. 2. Merge all sets with the same core. Let the new collection be C = {J 0,..., J m }. 3. Construct the action table as before. 4. If J = I 1... I k, then goto(j, X) = union of all sets with the same core as goto(i 1, X). M. Mitra (ISI) Parsing 26 / 33

27 SLR vs LR(1) vs LALR No. of states: SLR = LALR <= LR(1) Power: SLR < LALR < LR(1) (cf. Pascal) SLR vs. LALR: LALR items can be regarded as SLR items, with the core augmented by appropriate subsets of FOLLOW (A) explicitly specified LALR vs. LR(1): 1. If there were no shift-reduce conflicts in the LR(1) table, there will be no shift-reduce conflicts in the LALR table. 2. Step 2 may generate reduce-reduce conflicts. Example: I 1 = { A α, a, B β, b } I 2 = { A α, b, B β, a } 3. Correct inputs: LALR parser mimics LR(1) parser Incorrect inputs: incorrect reductions may occur on a lookahead a parser goes back to a state I i in which A has just been recognized. But a cannot follow A in this state error M. Mitra (ISI) Parsing 27 / 33

28 Error recovery Reference: Section 4.8. Detection: Canonical LR - errors are immediately detected (no unnecessary shift/reduce actions) SLR/LALR - no symbols are shifted onto stack, but reductions may occur before error is detected Panic mode recovery: 1. Scan down the stack until a state s with a goto on a significant non-terminal A (e.g. expr, stmt, etc.) is found. 2. Discard input until a symbol a which can follow A is found. 3. Push A, goto(s, A) and resume parsing. Expln: s α Aaβ α γ }{{} aβ location of error M. Mitra (ISI) Parsing 28 / 33

29 Error recovery Phrase-level error recovery: recovery by local correction on remaining input e.g. insertion, deletion, substitution, etc. Scheme: 1. Consider each blank entry, and decide what the error is likely to be. 2. Call appropriate recovery method should consume input (to avoid infinite loop) avoid popping a significant non-terminal from stack Examples: State: E E Input: + or Action: push id, goto appropriate state Message: missing operand State: E E + T Input: ) Action: skip ) from input Message: extra ) M. Mitra (ISI) Parsing 29 / 33

30 Yacc Reference: Section 4.9. Usage: \$ yacc myfile.y (generates y.tab.c) File format: declarations %% grammar rules (terminals, non-terminals, start symbol) %% auxiliary procedures (C functions) Semantic actions: \$\$ - attribute value associated with LHS non-terminal \$i - attribute value associated with i-th grammar symbol on RHS specified action is executed on reduction by corresponding production default action: \$\$ = \$1 M. Mitra (ISI) Parsing 30 / 33

31 M. Mitra (ISI) Parsing 31 / 33

32 Yacc Lexical analyzer: yylex() must be provided should return integer code for a token should set yylval to attribute value Usage with lex: % lex scanner.l % yacc parser.y % cc y.tab.c -ly -ll Add #include "lex.yy.c" to third section of file Declared tokens can be used as return values in scanner.l M. Mitra (ISI) Parsing 32 / 33

33 Yacc Implicit conflict resolution: 1. Shift-reduce: choose shift 2. Reduce-reduce: choose the production listed earlier Explicit conflict resolution: Precedence: tokens are assigned precedence according to the order in which they are declared (lowest first) Associativity: left, right, or nonassoc Precedence/assoc. of a production = precedence/assoc. of rightmost terminal or explicitly specified using %prec Given A α, a: if precedence of production is higher, reduce if precedence of production is same, and associativity of production is left, reduce else shift M. Mitra (ISI) Parsing 33 / 33

### Context-free grammars

Context-free grammars Section 4.2 Formal way of specifying rules about the structure/syntax of a program terminals - tokens non-terminals - represent higher-level structures of a program start symbol,

### PART 3 - SYNTAX ANALYSIS. F. Wotawa TU Graz) Compiler Construction Summer term / 309

PART 3 - SYNTAX ANALYSIS F. Wotawa (IST @ TU Graz) Compiler Construction Summer term 2016 64 / 309 Goals Definition of the syntax of a programming language using context free grammars Methods for parsing

### Formal Languages and Compilers Lecture VII Part 3: Syntactic A

Formal Languages and Compilers Lecture VII Part 3: Syntactic Analysis Free University of Bozen-Bolzano Faculty of Computer Science POS Building, Room: 2.03 artale@inf.unibz.it http://www.inf.unibz.it/

### Formal Languages and Compilers Lecture VII Part 4: Syntactic A

Formal Languages and Compilers Lecture VII Part 4: Syntactic Analysis Free University of Bozen-Bolzano Faculty of Computer Science POS Building, Room: 2.03 artale@inf.unibz.it http://www.inf.unibz.it/

### 3. Syntax Analysis. Andrea Polini. Formal Languages and Compilers Master in Computer Science University of Camerino

3. Syntax Analysis Andrea Polini Formal Languages and Compilers Master in Computer Science University of Camerino (Formal Languages and Compilers) 3. Syntax Analysis CS@UNICAM 1 / 54 Syntax Analysis: the

### Bottom-up parsing. Bottom-Up Parsing. Recall. Goal: For a grammar G, withstartsymbols, any string α such that S α is called a sentential form

Bottom-up parsing Bottom-up parsing Recall Goal: For a grammar G, withstartsymbols, any string α such that S α is called a sentential form If α V t,thenα is called a sentence in L(G) Otherwise it is just

### A left-sentential form is a sentential form that occurs in the leftmost derivation of some sentence.

Bottom-up parsing Recall For a grammar G, with start symbol S, any string α such that S α is a sentential form If α V t, then α is a sentence in L(G) A left-sentential form is a sentential form that occurs

### Compiler Construction 2016/2017 Syntax Analysis

Compiler Construction 2016/2017 Syntax Analysis Peter Thiemann November 2, 2016 Outline 1 Syntax Analysis Recursive top-down parsing Nonrecursive top-down parsing Bottom-up parsing Syntax Analysis tokens

### S Y N T A X A N A L Y S I S LR

LR parsing There are three commonly used algorithms to build tables for an LR parser: 1. SLR(1) = LR(0) plus use of FOLLOW set to select between actions smallest class of grammars smallest tables (number

### LR Parsing. Leftmost and Rightmost Derivations. Compiler Design CSE 504. Derivations for id + id: T id = id+id. 1 Shift-Reduce Parsing.

LR Parsing Compiler Design CSE 504 1 Shift-Reduce Parsing 2 LR Parsers 3 SLR and LR(1) Parsers Last modifled: Fri Mar 06 2015 at 13:50:06 EST Version: 1.7 16:58:46 2016/01/29 Compiled at 12:57 on 2016/02/26

### Parser Generation. Bottom-Up Parsing. Constructing LR Parser. LR Parsing. Construct parse tree bottom-up --- from leaves to the root

Parser Generation Main Problem: given a grammar G, how to build a top-down parser or a bottom-up parser for it? parser : a program that, given a sentence, reconstructs a derivation for that sentence ----

### Principles of Programming Languages

Principles of Programming Languages h"p://www.di.unipi.it/~andrea/dida2ca/plp- 14/ Prof. Andrea Corradini Department of Computer Science, Pisa Lesson 8! Bo;om- Up Parsing Shi?- Reduce LR(0) automata and

### Bottom up parsing. The sentential forms happen to be a right most derivation in the reverse order. S a A B e a A d e. a A d e a A B e S.

Bottom up parsing Construct a parse tree for an input string beginning at leaves and going towards root OR Reduce a string w of input to start symbol of grammar Consider a grammar S aabe A Abc b B d And

### Let us construct the LR(1) items for the grammar given below to construct the LALR parsing table.

MODULE 18 LALR parsing After understanding the most powerful CALR parser, in this module we will learn to construct the LALR parser. The CALR parser has a large set of items and hence the LALR parser is

### Concepts Introduced in Chapter 4

Concepts Introduced in Chapter 4 Grammars Context-Free Grammars Derivations and Parse Trees Ambiguity, Precedence, and Associativity Top Down Parsing Recursive Descent, LL Bottom Up Parsing SLR, LR, LALR

### Section A. A grammar that produces more than one parse tree for some sentences is said to be ambiguous.

Section A 1. What do you meant by parser and its types? A parser for grammar G is a program that takes as input a string w and produces as output either a parse tree for w, if w is a sentence of G, or

### MIT Parse Table Construction. Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology

MIT 6.035 Parse Table Construction Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology Parse Tables (Review) ACTION Goto State ( ) \$ X s0 shift to s2 error error goto s1

### Parsing Wrapup. Roadmap (Where are we?) Last lecture Shift-reduce parser LR(1) parsing. This lecture LR(1) parsing

Parsing Wrapup Roadmap (Where are we?) Last lecture Shift-reduce parser LR(1) parsing LR(1) items Computing closure Computing goto LR(1) canonical collection This lecture LR(1) parsing Building ACTION

### Compilers. Bottom-up Parsing. (original slides by Sam

Compilers Bottom-up Parsing Yannis Smaragdakis U Athens Yannis Smaragdakis, U. Athens (original slides by Sam Guyer@Tufts) Bottom-Up Parsing More general than top-down parsing And just as efficient Builds

### Syntax Analysis. Amitabha Sanyal. (www.cse.iitb.ac.in/ as) Department of Computer Science and Engineering, Indian Institute of Technology, Bombay

Syntax Analysis (www.cse.iitb.ac.in/ as) Department of Computer Science and Engineering, Indian Institute of Technology, Bombay September 2007 College of Engineering, Pune Syntax Analysis: 2/124 Syntax

SYNTAX ANALYSIS 1. Define parser. Hierarchical analysis is one in which the tokens are grouped hierarchically into nested collections with collective meaning. Also termed as Parsing. 2. Mention the basic

### UNIT-III BOTTOM-UP PARSING

UNIT-III BOTTOM-UP PARSING Constructing a parse tree for an input string beginning at the leaves and going towards the root is called bottom-up parsing. A general type of bottom-up parser is a shift-reduce

### SLR parsers. LR(0) items

SLR parsers LR(0) items As we have seen, in order to make shift-reduce parsing practical, we need a reasonable way to identify viable prefixes (and so, possible handles). Up to now, it has not been clear

WWW.STUDENTSFOCUS.COM UNIT -3 SYNTAX ANALYSIS 3.1 ROLE OF THE PARSER Parser obtains a string of tokens from the lexical analyzer and verifies that it can be generated by the language for the source program.

### Lexical and Syntax Analysis. Bottom-Up Parsing

Lexical and Syntax Analysis Bottom-Up Parsing Parsing There are two ways to construct derivation of a grammar. Top-Down: begin with start symbol; repeatedly replace an instance of a production s LHS with

### Monday, September 13, Parsers

Parsers Agenda Terminology LL(1) Parsers Overview of LR Parsing Terminology Grammar G = (Vt, Vn, S, P) Vt is the set of terminals Vn is the set of non-terminals S is the start symbol P is the set of productions

### LR Parsing Techniques

LR Parsing Techniques Introduction Bottom-Up Parsing LR Parsing as Handle Pruning Shift-Reduce Parser LR(k) Parsing Model Parsing Table Construction: SLR, LR, LALR 1 Bottom-UP Parsing A bottom-up parser

### Principle of Compilers Lecture IV Part 4: Syntactic Analysis. Alessandro Artale

Free University of Bolzano Principles of Compilers Lecture IV Part 4, 2003/2004 AArtale (1) Principle of Compilers Lecture IV Part 4: Syntactic Analysis Alessandro Artale Faculty of Computer Science Free

### Syntax Analyzer --- Parser

Syntax Analyzer --- Parser ASU Textbook Chapter 4.2--4.9 (w/o error handling) Tsan-sheng Hsu tshsu@iis.sinica.edu.tw http://www.iis.sinica.edu.tw/~tshsu 1 A program represented by a sequence of tokens

### Parsing. Roadmap. > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing

Roadmap > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing The role of the parser > performs context-free syntax analysis > guides

### Wednesday, August 31, Parsers

Parsers How do we combine tokens? Combine tokens ( words in a language) to form programs ( sentences in a language) Not all combinations of tokens are correct programs (not all sentences are grammatically

### Wednesday, September 9, 15. Parsers

Parsers What is a parser A parser has two jobs: 1) Determine whether a string (program) is valid (think: grammatically correct) 2) Determine the structure of a program (think: diagramming a sentence) Agenda

### Parsers. What is a parser. Languages. Agenda. Terminology. Languages. A parser has two jobs:

What is a parser Parsers A parser has two jobs: 1) Determine whether a string (program) is valid (think: grammatically correct) 2) Determine the structure of a program (think: diagramming a sentence) Agenda

### Parsers. Xiaokang Qiu Purdue University. August 31, 2018 ECE 468

Parsers Xiaokang Qiu Purdue University ECE 468 August 31, 2018 What is a parser A parser has two jobs: 1) Determine whether a string (program) is valid (think: grammatically correct) 2) Determine the structure

### LALR Parsing. What Yacc and most compilers employ.

LALR Parsing Canonical sets of LR(1) items Number of states much larger than in the SLR construction LR(1) = Order of thousands for a standard prog. Lang. SLR(1) = order of hundreds for a standard prog.

### LR Parsers. Aditi Raste, CCOEW

LR Parsers Aditi Raste, CCOEW 1 LR Parsers Most powerful shift-reduce parsers and yet efficient. LR(k) parsing L : left to right scanning of input R : constructing rightmost derivation in reverse k : number

### 4. Lexical and Syntax Analysis

4. Lexical and Syntax Analysis 4.1 Introduction Language implementation systems must analyze source code, regardless of the specific implementation approach Nearly all syntax analysis is based on a formal

### EDAN65: Compilers, Lecture 06 A LR parsing. Görel Hedin Revised:

EDAN65: Compilers, Lecture 06 A LR parsing Görel Hedin Revised: 2017-09-11 This lecture Regular expressions Context-free grammar Attribute grammar Lexical analyzer (scanner) Syntactic analyzer (parser)

### Compiler Design 1. Bottom-UP Parsing. Goutam Biswas. Lect 6

Compiler Design 1 Bottom-UP Parsing Compiler Design 2 The Process The parse tree is built starting from the leaf nodes labeled by the terminals (tokens). The parser tries to discover appropriate reductions,

### 4. Lexical and Syntax Analysis

4. Lexical and Syntax Analysis 4.1 Introduction Language implementation systems must analyze source code, regardless of the specific implementation approach Nearly all syntax analysis is based on a formal

### Bottom Up Parsing. Shift and Reduce. Sentential Form. Handle. Parse Tree. Bottom Up Parsing 9/26/2012. Also known as Shift-Reduce parsing

Also known as Shift-Reduce parsing More powerful than top down Don t need left factored grammars Can handle left recursion Attempt to construct parse tree from an input string eginning at leaves and working

### Bottom-Up Parsing. Lecture 11-12

Bottom-Up Parsing Lecture 11-12 (From slides by G. Necula & R. Bodik) 9/22/06 Prof. Hilfinger CS164 Lecture 11 1 Bottom-Up Parsing Bottom-up parsing is more general than topdown parsing And just as efficient

### UNIT III & IV. Bottom up parsing

UNIT III & IV Bottom up parsing 5.0 Introduction Given a grammar and a sentence belonging to that grammar, if we have to show that the given sentence belongs to the given grammar, there are two methods.

### In One Slide. Outline. LR Parsing. Table Construction

LR Parsing Table Construction #1 In One Slide An LR(1) parsing table can be constructed automatically from a CFG. An LR(1) item is a pair made up of a production and a lookahead token; it represents a

### Using an LALR(1) Parser Generator

Using an LALR(1) Parser Generator Yacc is an LALR(1) parser generator Developed by S.C. Johnson and others at AT&T Bell Labs Yacc is an acronym for Yet another compiler compiler Yacc generates an integrated

### 3. Parsing. Oscar Nierstrasz

3. Parsing Oscar Nierstrasz Thanks to Jens Palsberg and Tony Hosking for their kind permission to reuse and adapt the CS132 and CS502 lecture notes. http://www.cs.ucla.edu/~palsberg/ http://www.cs.purdue.edu/homes/hosking/

### CA Compiler Construction

CA4003 - Compiler Construction David Sinclair A top-down parser starts with the root of the parse tree, labelled with the goal symbol of the grammar, and repeats the following steps until the fringe of

### Syntax Analysis: Context-free Grammars, Pushdown Automata and Parsing Part - 4. Y.N. Srikant

Syntax Analysis: Context-free Grammars, Pushdown Automata and Part - 4 Department of Computer Science and Automation Indian Institute of Science Bangalore 560 012 NPTEL Course on Principles of Compiler

### Compilerconstructie. najaar Rudy van Vliet kamer 140 Snellius, tel rvvliet(at)liacs(dot)nl. college 3, vrijdag 22 september 2017

Compilerconstructie najaar 2017 http://www.liacs.leidenuniv.nl/~vlietrvan1/coco/ Rudy van Vliet kamer 140 Snellius, tel. 071-527 2876 rvvliet(at)liacs(dot)nl college 3, vrijdag 22 september 2017 + werkcollege

### Acknowledgements. The slides for this lecture are a modified versions of the offering by Prof. Sanjeev K Aggarwal

Acknowledgements The slides for this lecture are a modified versions of the offering by Prof. Sanjeev K Aggarwal Syntax Analysis Check syntax and construct abstract syntax tree if == = ; b 0 a b Error

### Table-driven using an explicit stack (no recursion!). Stack can be viewed as containing both terminals and non-terminals.

Bottom-up Parsing: Table-driven using an explicit stack (no recursion!). Stack can be viewed as containing both terminals and non-terminals. Basic operation is to shift terminals from the input to the

### Sometimes an ambiguous grammar can be rewritten to eliminate the ambiguity.

Eliminating Ambiguity Sometimes an ambiguous grammar can be rewritten to eliminate the ambiguity. Example: consider the following grammar stat if expr then stat if expr then stat else stat other One can

### Lecture 7: Deterministic Bottom-Up Parsing

Lecture 7: Deterministic Bottom-Up Parsing (From slides by G. Necula & R. Bodik) Last modified: Tue Sep 20 12:50:42 2011 CS164: Lecture #7 1 Avoiding nondeterministic choice: LR We ve been looking at general

### LR Parsing LALR Parser Generators

LR Parsing LALR Parser Generators Outline Review of bottom-up parsing Computing the parsing DFA Using parser generators 2 Bottom-up Parsing (Review) A bottom-up parser rewrites the input string to the

### Lecture 8: Deterministic Bottom-Up Parsing

Lecture 8: Deterministic Bottom-Up Parsing (From slides by G. Necula & R. Bodik) Last modified: Fri Feb 12 13:02:57 2010 CS164: Lecture #8 1 Avoiding nondeterministic choice: LR We ve been looking at general

### LR Parsing, Part 2. Constructing Parse Tables. An NFA Recognizing Viable Prefixes. Computing the Closure. GOTO Function and DFA States

TDDD16 Compilers and Interpreters TDDB44 Compiler Construction LR Parsing, Part 2 Constructing Parse Tables Parse table construction Grammar conflict handling Categories of LR Grammars and Parsers An NFA

### Chapter 4: LR Parsing

Chapter 4: LR Parsing 110 Some definitions Recall For a grammar G, with start symbol S, any string α such that S called a sentential form α is If α Vt, then α is called a sentence in L G Otherwise it is

### Bottom-Up Parsing. Parser Generation. LR Parsing. Constructing LR Parser

Parser Generation Main Problem: given a grammar G, how to build a top-down parser or a bottom-up parser for it? parser : a program that, given a sentence, reconstructs a derivation for that sentence ----

### CS 2210 Sample Midterm. 1. Determine if each of the following claims is true (T) or false (F).

CS 2210 Sample Midterm 1. Determine if each of the following claims is true (T) or false (F). F A language consists of a set of strings, its grammar structure, and a set of operations. (Note: a language

VIVA QUESTIONS WITH ANSWERS 1. What is a compiler? A compiler is a program that reads a program written in one language the source language and translates it into an equivalent program in another language-the

### Simple LR (SLR) LR(0) Drawbacks LR(1) SLR Parse. LR(1) Start State and Reduce. LR(1) Items 10/3/2012

LR(0) Drawbacks Consider the unambiguous augmented grammar: 0.) S E \$ 1.) E T + E 2.) E T 3.) T x If we build the LR(0) DFA table, we find that there is a shift-reduce conflict. This arises because the

### CS 4120 Introduction to Compilers

CS 4120 Introduction to Compilers Andrew Myers Cornell University Lecture 6: Bottom-Up Parsing 9/9/09 Bottom-up parsing A more powerful parsing technology LR grammars -- more expressive than LL can handle

### Syn S t yn a t x a Ana x lysi y s si 1

Syntax Analysis 1 Position of a Parser in the Compiler Model Source Program Lexical Analyzer Token, tokenval Get next token Parser and rest of front-end Intermediate representation Lexical error Syntax

### LR Parsing Techniques

LR Parsing Techniques Bottom-Up Parsing - LR: a special form of BU Parser LR Parsing as Handle Pruning Shift-Reduce Parser (LR Implementation) LR(k) Parsing Model - k lookaheads to determine next action

### Bottom-Up Parsing. Lecture 11-12

Bottom-Up Parsing Lecture 11-12 (From slides by G. Necula & R. Bodik) 2/20/08 Prof. Hilfinger CS164 Lecture 11 1 Administrivia Test I during class on 10 March. 2/20/08 Prof. Hilfinger CS164 Lecture 11

### How do LL(1) Parsers Build Syntax Trees?

How do LL(1) Parsers Build Syntax Trees? So far our LL(1) parser has acted like a recognizer. It verifies that input token are syntactically correct, but it produces no output. Building complete (concrete)

### LR Parsing LALR Parser Generators

Outline LR Parsing LALR Parser Generators Review of bottom-up parsing Computing the parsing DFA Using parser generators 2 Bottom-up Parsing (Review) A bottom-up parser rewrites the input string to the

### Syntactic Analysis. Top-Down Parsing

Syntactic Analysis Top-Down Parsing Copyright 2017, Pedro C. Diniz, all rights reserved. Students enrolled in Compilers class at University of Southern California (USC) have explicit permission to make

### MODULE 14 SLR PARSER LR(0) ITEMS

MODULE 14 SLR PARSER LR(0) ITEMS In this module we shall discuss one of the LR type parser namely SLR parser. The various steps involved in the SLR parser will be discussed with a focus on the construction

### CS453 : JavaCUP and error recovery. CS453 Shift-reduce Parsing 1

CS453 : JavaCUP and error recovery CS453 Shift-reduce Parsing 1 Shift-reduce parsing in an LR parser LR(k) parser Left-to-right parse Right-most derivation K-token look ahead LR parsing algorithm using

### Parsing. Handle, viable prefix, items, closures, goto s LR(k): SLR(1), LR(1), LALR(1)

TD parsing - LL(1) Parsing First and Follow sets Parse table construction BU Parsing Handle, viable prefix, items, closures, goto s LR(k): SLR(1), LR(1), LALR(1) Problems with SLR Aho, Sethi, Ullman, Compilers

### COMPILER CONSTRUCTION LAB 2 THE SYMBOL TABLE. Tutorial 2 LABS. PHASES OF A COMPILER Source Program. Lab 2 Symbol table

COMPILER CONSTRUCTION Lab 2 Symbol table LABS Lab 3 LR parsing and abstract syntax tree construction using ''bison' Lab 4 Semantic analysis (type checking) PHASES OF A COMPILER Source Program Lab 2 Symtab

### CSE P 501 Compilers. LR Parsing Hal Perkins Spring UW CSE P 501 Spring 2018 D-1

CSE P 501 Compilers LR Parsing Hal Perkins Spring 2018 UW CSE P 501 Spring 2018 D-1 Agenda LR Parsing Table-driven Parsers Parser States Shift-Reduce and Reduce-Reduce conflicts UW CSE P 501 Spring 2018

### CSE 130 Programming Language Principles & Paradigms Lecture # 5. Chapter 4 Lexical and Syntax Analysis

Chapter 4 Lexical and Syntax Analysis Introduction - Language implementation systems must analyze source code, regardless of the specific implementation approach - Nearly all syntax analysis is based on

### Lecture Bottom-Up Parsing

Lecture 14+15 Bottom-Up Parsing CS 241: Foundations of Sequential Programs Winter 2018 Troy Vasiga et al University of Waterloo 1 Example CFG 1. S S 2. S AyB 3. A ab 4. A cd 5. B z 6. B wz 2 Stacks in

### Parsing. Rupesh Nasre. CS3300 Compiler Design IIT Madras July 2018

Parsing Rupesh Nasre. CS3300 Compiler Design IIT Madras July 2018 Character stream Lexical Analyzer Machine-Independent Code Code Optimizer F r o n t e n d Token stream Syntax Analyzer Syntax tree Semantic

### The analysis part breaks up the source program into constituent pieces and creates an intermediate representation of the source program.

COMPILER DESIGN 1. What is a compiler? A compiler is a program that reads a program written in one language the source language and translates it into an equivalent program in another language-the target

### Syntax Analysis Part I

Syntax Analysis Part I Chapter 4: Context-Free Grammars Slides adapted from : Robert van Engelen, Florida State University Position of a Parser in the Compiler Model Source Program Lexical Analyzer Token,

### Conflicts in LR Parsing and More LR Parsing Types

Conflicts in LR Parsing and More LR Parsing Types Lecture 10 Dr. Sean Peisert ECS 142 Spring 2009 1 Status Project 2 Due Friday, Apr. 24, 11:55pm The usual lecture time is being replaced by a discussion

### Lecture 14: Parser Conflicts, Using Ambiguity, Error Recovery. Last modified: Mon Feb 23 10:05: CS164: Lecture #14 1

Lecture 14: Parser Conflicts, Using Ambiguity, Error Recovery Last modified: Mon Feb 23 10:05:56 2015 CS164: Lecture #14 1 Shift/Reduce Conflicts If a DFA state contains both [X: α aβ, b] and [Y: γ, a],

### Syntax Analysis. Martin Sulzmann. Martin Sulzmann Syntax Analysis 1 / 38

Syntax Analysis Martin Sulzmann Martin Sulzmann Syntax Analysis 1 / 38 Syntax Analysis Objective Recognize individual tokens as sentences of a language (beyond regular languages). Example 1 (OK) Program

### Syntactic Analysis. Chapter 4. Compiler Construction Syntactic Analysis 1

Syntactic Analysis Chapter 4 Compiler Construction Syntactic Analysis 1 Context-free Grammars The syntax of programming language constructs can be described by context-free grammars (CFGs) Relatively simple

### CSE 401 Compilers. LR Parsing Hal Perkins Autumn /10/ Hal Perkins & UW CSE D-1

CSE 401 Compilers LR Parsing Hal Perkins Autumn 2011 10/10/2011 2002-11 Hal Perkins & UW CSE D-1 Agenda LR Parsing Table-driven Parsers Parser States Shift-Reduce and Reduce-Reduce conflicts 10/10/2011

### CSE302: Compiler Design

CSE302: Compiler Design Instructor: Dr. Liang Cheng Department of Computer Science and Engineering P.C. Rossin College of Engineering & Applied Science Lehigh University March 20, 2007 Outline Recap LR(0)

### Syntax-Directed Translation

Syntax-Directed Translation ALSU Textbook Chapter 5.1 5.4, 4.8, 4.9 Tsan-sheng Hsu tshsu@iis.sinica.edu.tw http://www.iis.sinica.edu.tw/~tshsu 1 What is syntax-directed translation? Definition: The compilation

### shift-reduce parsing

Parsing #2 Bottom-up Parsing Rightmost derivations; use of rules from right to left Uses a stack to push symbols the concatenation of the stack symbols with the rest of the input forms a valid bottom-up

### Chapter 4. Lexical and Syntax Analysis. Topics. Compilation. Language Implementation. Issues in Lexical and Syntax Analysis.

Topics Chapter 4 Lexical and Syntax Analysis Introduction Lexical Analysis Syntax Analysis Recursive -Descent Parsing Bottom-Up parsing 2 Language Implementation Compilation There are three possible approaches

### Types of parsing. CMSC 430 Lecture 4, Page 1

Types of parsing Top-down parsers start at the root of derivation tree and fill in picks a production and tries to match the input may require backtracking some grammars are backtrack-free (predictive)

### General Overview of Compiler

General Overview of Compiler Compiler: - It is a complex program by which we convert any high level programming language (source code) into machine readable code. Interpreter: - It performs the same task

### LR Parsing. Table Construction

#1 LR Parsing Table Construction #2 Outline Review of bottom-up parsing Computing the parsing DFA Closures, LR(1) Items, States Transitions Using parser generators Handling Conflicts #3 In One Slide An

### 2068 (I) Attempt all questions.

2068 (I) 1. What do you mean by compiler? How source program analyzed? Explain in brief. 2. Discuss the role of symbol table in compiler design. 3. Convert the regular expression 0 + (1 + 0)* 00 first

### CS 314 Principles of Programming Languages

CS 314 Principles of Programming Languages Lecture 5: Syntax Analysis (Parsing) Zheng (Eddy) Zhang Rutgers University January 31, 2018 Class Information Homework 1 is being graded now. The sample solution

### A programming language requires two major definitions A simple one pass compiler

A programming language requires two major definitions A simple one pass compiler [Syntax: what the language looks like A context-free grammar written in BNF (Backus-Naur Form) usually suffices. [Semantics:

### Lexical and Syntax Analysis (2)

Lexical and Syntax Analysis (2) In Text: Chapter 4 N. Meng, F. Poursardar Motivating Example Consider the grammar S -> cad A -> ab a Input string: w = cad How to build a parse tree top-down? 2 Recursive-Descent

### CS 406/534 Compiler Construction Parsing Part I

CS 406/534 Compiler Construction Parsing Part I Prof. Li Xu Dept. of Computer Science UMass Lowell Fall 2004 Part of the course lecture notes are based on Prof. Keith Cooper, Prof. Ken Kennedy and Dr.

### Principles of Programming Languages

Principles of Programming Languages h"p://www.di.unipi.it/~andrea/dida2ca/plp- 15/ Prof. Andrea Corradini Department of Computer Science, Pisa Lesson 10! LR parsing with ambiguous grammars Error recovery

### CS415 Compilers. LR Parsing & Error Recovery

CS415 Compilers LR Parsing & Error Recovery These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University Review: LR(k) items The LR(1) table construction