PART 3 - SYNTAX ANALYSIS F. Wotawa (IST @ TU Graz) Compiler Construction Summer term 2016 64 / 309

Goals
- Definition of the syntax of a programming language using context-free grammars
- Methods for parsing programs: determine whether a program is syntactically correct
Advantages (of grammars):
- Precise, easily comprehensible language definition
- Automatic construction of parsers
- Declaration of the structure of a programming language (important for translation and error detection)
- Easy language extensions and modifications

Tasks
[Figure: source program → lexical analyser → (token / get next token) → parser → parse tree → rest of the front end → intermediate representation; a symbol table is shared between the phases]
Parser types:
- Universal parsers (inefficient)
- Top-down parsers
- Bottom-up parsers
- Top-down and bottom-up parsers handle only subclasses of grammars (LL, LR)
Further tasks:
- Collect token information
- Type checking
- Intermediate code generation

Syntax error handling
Error types:
- Lexical errors (misspelling of a keyword)
- Syntactic errors (a closing bracket is missing)
- Semantic errors (an operand is incompatible with its operator)
- Logic errors (infinite loop)
Tasks:
- Exact error description
- Error recovery: consecutive errors should be detectable
- Error correction should not slow down the processing of correct programs

Problems during error handling
- Spurious errors: consecutive errors created by error recovery
  Example: error recovery removes the declaration of pi; semantic analysis then reports "pi undefined"
- An error is detected late in the process, so the error message does not point to the correct position in the code
- Too many error messages are issued

Error recovery
- Panic mode: skip input symbols until the input can be synchronized to a token
- Phrase-level recovery: local error corrections, e.g. replacement of , by ;
- Error productions: extension of the grammar to handle common errors
- Global correction: minimal correction of the program in order to find a matching derivation (cost intensive)

Grammars
A grammar is a 4-tuple G = (V_N, V_T, S, Φ), where:
- V_N: set of nonterminal symbols
- V_T: set of terminal symbols
- S ∈ V_N: start symbol
- Φ ⊆ (V_N ∪ V_T)* V_N (V_N ∪ V_T)* × (V_N ∪ V_T)*: set of production rules (rewriting rules)
(α, β) ∈ Φ is written as α → β.
Example: ({S, A, Z}, {a, b, 1, 2}, S, {S → AZ, A → a, A → b, Z → ɛ, Z → 1, Z → 2, Z → ZZ})

Derivations in grammars
Direct derivation: Let σ, ψ ∈ (V_T ∪ V_N)*. σ can be directly derived from ψ (in one step; ψ ⇒ σ) if there are two strings φ_1, φ_2 such that σ = φ_1 β φ_2, ψ = φ_1 α φ_2 and α → β ∈ Φ.
Example:
ψ     σ     Rule used   φ_1   φ_2
S     AZ    S → AZ      ɛ     ɛ
aZ    a1    Z → 1       a     ɛ
AZZ   A2Z   Z → 2       A     Z

Derivation
Production: a string ψ produces σ (ψ ⇒+ σ) if there are strings φ_0, ..., φ_n (n > 0) such that ψ = φ_0 ⇒ φ_1, φ_1 ⇒ φ_2, ..., φ_{n−1} ⇒ φ_n = σ.
Example: S ⇒ AZ ⇒ AZZ ⇒ A2Z ⇒ a2Z ⇒ a21
Reflexive, transitive closure: ψ ⇒* σ ⟺ ψ ⇒+ σ or ψ = σ
Accepted language: a grammar G accepts the language L(G) = {σ | S ⇒* σ, σ ∈ (V_T)*}
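
The definitions of direct derivation and of L(G) can be tried out on the example grammar. The following is a minimal Python sketch, not part of the slides: symbols are single characters, the empty string stands for ɛ, and the names direct_derivations, derives and the bound max_len are hypothetical. The length bound only keeps the search finite; it is an assumption that it is large enough for the strings being tested.

```python
from collections import deque

# Productions of the example grammar; "" stands for the empty string (epsilon).
PRODUCTIONS = [("S", "AZ"), ("A", "a"), ("A", "b"),
               ("Z", ""), ("Z", "1"), ("Z", "2"), ("Z", "ZZ")]


def direct_derivations(psi):
    """Return every sigma with psi => sigma (exactly one rewriting step)."""
    results = set()
    for alpha, beta in PRODUCTIONS:
        start = psi.find(alpha)
        while start != -1:
            # psi = phi1 + alpha + phi2  becomes  sigma = phi1 + beta + phi2
            results.add(psi[:start] + beta + psi[start + len(alpha):])
            start = psi.find(alpha, start + 1)
    return results


def derives(sigma, max_len=6):
    """Breadth-first search for S =>* sigma over forms of length <= max_len."""
    seen, queue = {"S"}, deque(["S"])
    while queue:
        psi = queue.popleft()
        if psi == sigma:
            return True
        for nxt in direct_derivations(psi):
            if len(nxt) <= max_len and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False
```

With these definitions, direct_derivations("AZZ") contains "A2Z", matching the third row of the direct-derivation example, and derives("a21") follows the derivation S ⇒ AZ ⇒ AZZ ⇒ A2Z ⇒ a2Z ⇒ a21.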

Parse trees
Example: E → E + E | E * E | id
2 derivations (and parse trees) for id+id*id:

      E                    E
    / | \                / | \
   E  +  E              E  *  E
   |   / | \          / | \   |
  id  E  *  E        E  +  E  id
      |     |        |     |
      id    id       id    id

The grammar is ambiguous.
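
One way to observe the ambiguity is to count the parse trees of a token sequence. The following Python sketch is an illustration only and is specific to the grammar E → E + E | E * E | id; the function name count_parse_trees is hypothetical.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Count parse trees of `tokens` for E -> E + E | E * E | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of parse trees that derive tokens[i:j] from E.
        if j - i == 1:
            return 1 if tokens[i] == "id" else 0
        total = 0
        # Try every operator position k as the root production E -> E op E.
        for k in range(i + 1, j - 1):
            if tokens[k] in ("+", "*"):
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))
```

For the input id + id * id the function finds two trees, one for each choice of the root operator; an unambiguous grammar would yield at most one tree for every input.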

Classification of grammars
Chomsky hierarchy (restriction of the production rules α → β):
- Unrestricted grammar: no restrictions
- Context-sensitive grammar: |α| ≤ |β|
- Context-free grammar: |α| ≤ |β| and α ∈ V_N
- Regular grammar: |α| ≤ |β|, α ∈ V_N and β is of the form aB or a, where a ∈ V_T and B ∈ V_N

Grammar examples
Regular grammar for (a|b)*abb:
  A_0 → aA_0 | bA_0 | aA_1
  A_1 → bA_2
  A_2 → bA_3
  A_3 → ɛ
Context-sensitive grammars:
- L_1 = {wcw | w ∈ (a|b)*}
  But L_1' = {wcw^R | w ∈ (a|b)*} is context-free
- L_2 = {a^n b^m c^n d^m | n ≥ 1, m ≥ 1}
- L_3 = {a^n b^n c^n | n ≥ 1}

Conversions
Remove ambiguities:
  stmnt → if expr then stmnt | if expr then stmnt else stmnt | other
2 parse trees for if E_1 then if E_2 then S_1 else S_2:
- Left tree:  if E_1 then (if E_2 then S_1 else S_2)
- Right tree: if E_1 then (if E_2 then S_1) else S_2
Prefer the left tree: associate each else with the closest preceding then.

Removing left recursion
- A grammar is left-recursive if there is a nonterminal A and a derivation A ⇒+ Aα
- Top-down parsing cannot handle left recursion
Example: convert A → Aα | β to:
  A → βA_1
  A_1 → αA_1 | ɛ

Algorithm to eliminate left recursion
Input: grammar G without cycles and ɛ-productions
Output: grammar without left recursion
Arrange the nonterminals in some order A_1, A_2, ..., A_n
for i := 1 to n do
  for j := 1 to i − 1 do
    Replace each production A_i → A_j γ by the productions A_i → δ_1 γ | ... | δ_k γ,
    where A_j → δ_1 | ... | δ_k are all the current A_j-productions
  end
  Eliminate the immediate left recursion among the A_i-productions
end
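
The algorithm above can be sketched in Python as follows. This is an illustration under assumptions that are not in the slides: a grammar is a dict mapping each nonterminal to a list of bodies (lists of symbols, with [] for ɛ), the dict order is used as the order A_1, ..., A_n, and the new nonterminal for A is named A + "_1", which is assumed not to clash with existing names.

```python
def eliminate_immediate(nt, bodies, new_nt):
    """Rewrite A -> A alpha | beta as A -> beta A_1, A_1 -> alpha A_1 | epsilon."""
    recursive = [b[1:] for b in bodies if b and b[0] == nt]   # the alphas
    others = [b for b in bodies if not b or b[0] != nt]       # the betas
    if not recursive:
        return {nt: bodies}
    return {
        nt: [beta + [new_nt] for beta in others],
        new_nt: [alpha + [new_nt] for alpha in recursive] + [[]],
    }


def eliminate_left_recursion(grammar):
    """grammar: dict nonterminal -> list of bodies; returns a new grammar dict."""
    order = list(grammar)            # A_1, ..., A_n in dict order
    result = {}
    for i, a_i in enumerate(order):
        bodies = [list(b) for b in grammar[a_i]]
        for a_j in order[:i]:
            expanded = []
            for body in bodies:
                if body and body[0] == a_j:
                    # Replace A_i -> A_j gamma by A_i -> delta gamma
                    # for every current A_j-production A_j -> delta.
                    expanded += [delta + body[1:] for delta in result[a_j]]
                else:
                    expanded.append(body)
            bodies = expanded
        result.update(eliminate_immediate(a_i, bodies, a_i + "_1"))
    return result
```

For example, eliminate_left_recursion({"E": [["E", "+", "T"], ["T"]]}) yields E → T E_1 and E_1 → + T E_1 | ɛ, which is the pattern A → βA_1, A_1 → αA_1 | ɛ.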

Left factoring
- Important for predictive parsing
- Elimination of alternative productions with a common prefix
Example:
  stmnt → if expr then stmnt else stmnt
        | if expr then stmnt
Solution:
- For each nonterminal A find the longest prefix α common to two or more alternative productions
- If α ≠ ɛ then replace all A-productions A → αβ_1 | αβ_2 | ... | αβ_n | γ (γ does not start with α) with:
    A → αA_1 | γ
    A_1 → β_1 | β_2 | ... | β_n
- Apply the transformation until no prefixes α ≠ ɛ can be found
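
The transformation can be sketched in Python as follows. The grammar representation (dict of nonterminal to list of bodies, [] for ɛ) and the helper names common_prefix, fresh_name and left_factor are assumptions made for this illustration, not part of the slides.

```python
def common_prefix(a, b):
    """Longest common prefix of two symbol lists."""
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return a[:n]


def fresh_name(nt, grammar):
    """Pick an unused name of the form nt_1, nt_2, ... (hypothetical naming scheme)."""
    k = 1
    while f"{nt}_{k}" in grammar:
        k += 1
    return f"{nt}_{k}"


def left_factor(grammar):
    """Repeat the left-factoring step until no nonempty common prefix remains."""
    grammar = {nt: [list(b) for b in bodies] for nt, bodies in grammar.items()}
    changed = True
    while changed:
        changed = False
        for nt in list(grammar):
            bodies = grammar[nt]
            # Longest prefix alpha shared by at least two alternatives.
            alpha = []
            for i in range(len(bodies)):
                for j in range(i + 1, len(bodies)):
                    p = common_prefix(bodies[i], bodies[j])
                    if len(p) > len(alpha):
                        alpha = p
            if not alpha:
                continue
            new_nt = fresh_name(nt, grammar)
            with_prefix = [b for b in bodies if b[:len(alpha)] == alpha]
            rest = [b for b in bodies if b[:len(alpha)] != alpha]
            grammar[nt] = [alpha + [new_nt]] + rest                   # A -> alpha A_1 | gamma
            grammar[new_nt] = [b[len(alpha):] for b in with_prefix]   # A_1 -> beta_1 | ...
            changed = True
    return grammar
```

Applied to the stmnt example, the result is stmnt → if expr then stmnt stmnt_1 | other and stmnt_1 → else stmnt | ɛ.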

Top-down parsing
Idea: construct the parse tree for a given input, starting at the root node
- Recursive-descent parsing (with backtracking)
- Predictive parsing (without backtracking; special case of recursive-descent parsing)
- Left-recursive grammars can lead to infinite loops!
Example: S → cAd, A → ab | a; matching of the input cad
[Figure: three parse-tree snapshots: (1) S expanded to c A d; (2) A expanded to a b; (3) A expanded to a]

Predictive parsers
- Recursive-descent parser without backtracking
- Possible if the production which needs to be used is obvious for each input symbol
Transition diagrams:
1. Remove left recursion
2. Left factoring
3. For each nonterminal A:
   1. Create an initial state and an end state
   2. For each production A → X_1 X_2 ... X_n create a path leading from the initial state to the end state, labeling the edges X_1, ..., X_n

Predictive parsers (II)
Processing:
1. Start at the initial state of the start symbol
2. Suppose we are currently in state s, which has an edge labeled with a terminal a that leads to state t. If the next input symbol is a, then go to state t and read a new input symbol.
3. Suppose the edge from s to t is labeled with a nonterminal A. In that case go to the initial state of A (without reading a new input symbol). When the end state of A is reached, go to state t, the successor of s.
4. If the edge is labeled with ɛ, then go directly to t without reading the input.
Easily implemented by recursive procedures

Example - Predictive parser
Grammar: E → id E_1 | (E), E_1 → op E | ɛ

E():
  if nexttoken = id then getnexttoken; E1()
  if nexttoken = ( then getnexttoken; E(); if nexttoken = ) then accept
E1():
  if nexttoken = op then getnexttoken; E()
  else return

[Figure: transition diagrams for E (edges labeled id, E1, (, E, )) and E1 (edges labeled op, E, ɛ)]
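
The two procedures can be written as a runnable recursive-descent recognizer. This is a Python sketch, not the lecture's code: the class and method names, the token representation (a list of strings ending implicitly in "$") and the explicit error reporting via ParseError are assumptions of this illustration.

```python
class ParseError(Exception):
    """Raised when the input does not match the grammar."""


class Parser:
    # Grammar: E -> id E1 | ( E )      E1 -> op E | epsilon
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]   # "$" marks the end of the input
        self.pos = 0

    @property
    def next_token(self):
        return self.tokens[self.pos]

    def match(self, expected):
        if self.next_token != expected:
            raise ParseError(f"expected {expected!r}, found {self.next_token!r}")
        self.pos += 1

    def parse_E(self):
        if self.next_token == "id":          # E -> id E1
            self.match("id")
            self.parse_E1()
        elif self.next_token == "(":         # E -> ( E )
            self.match("(")
            self.parse_E()
            self.match(")")
        else:
            raise ParseError(f"unexpected {self.next_token!r}")

    def parse_E1(self):
        if self.next_token == "op":          # E1 -> op E
            self.match("op")
            self.parse_E()
        # otherwise E1 -> epsilon: consume nothing


def accepts(tokens):
    """True if the whole token list is derivable from E."""
    parser = Parser(tokens)
    try:
        parser.parse_E()
        parser.match("$")
        return True
    except ParseError:
        return False
```

Each nonterminal becomes one procedure, and the choice of production depends only on the next token, so no backtracking is needed.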

Non-recursive predictive parser
[Figure: model of a table-driven predictive parser: input buffer (a + b $), stack (X Y Z $), predictive parsing program, output, parsing table M]
- Input buffer: string to be parsed (terminated by a $)
- Stack: initialized with the start symbol; contains the grammar symbols which have not been derived yet (terminated by a $)
- Parsing table M(A, a): A is a nonterminal, a is a terminal or $

Top-down parsing with stack
Mode of operation: X is the top element of the stack, a the current input symbol
1. X is a terminal: If X = a = $, then the input was matched. If X = a ≠ $, pop X off the stack and read the next input symbol. Otherwise an error occurred.
2. X is a nonterminal: Fetch the entry M(X, a). If this entry is an error, skip to error recovery. Otherwise the entry is a production of the form X → UVW. Replace X on the stack with WVU (afterwards U is the topmost element on the stack).

Example
Grammar:
  E → id E_1 | (E)
  E_1 → op E | ɛ
Parsing table M(X, a):
NONTERMINAL   id            op            (          )          $
E             E → id E_1                  E → (E)
E_1                         E_1 → op E               E_1 → ɛ    E_1 → ɛ
Derivation of id op id.

Example (II)
STACK       INPUT         OUTPUT
$ E         id op id $
$ E_1 id    id op id $    E → id E_1
$ E_1       op id $
$ E op      op id $       E_1 → op E
$ E         id $
$ E_1 id    id $          E → id E_1
$ E_1       $
$           $             E_1 → ɛ
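
The stack-based mode of operation and the trace above can be reproduced with a short table-driven parser. This Python sketch hard-codes the parsing table of the example grammar; the names TABLE, NONTERMINALS and ll1_parse are hypothetical, E1 stands for E_1, and errors are reported with ValueError instead of entering error recovery.

```python
# Parsing table M(X, a) of the example grammar; a missing key means "error".
TABLE = {
    ("E", "id"): ["id", "E1"],
    ("E", "("): ["(", "E", ")"],
    ("E1", "op"): ["op", "E"],
    ("E1", ")"): [],            # E1 -> epsilon
    ("E1", "$"): [],            # E1 -> epsilon
}
NONTERMINALS = {"E", "E1"}


def ll1_parse(tokens):
    """Return the productions used (the OUTPUT column) or raise ValueError."""
    stack = ["$", "E"]                   # top of the stack is the end of the list
    tokens = list(tokens) + ["$"]
    pos = 0
    output = []
    while True:
        top, a = stack[-1], tokens[pos]
        if top in NONTERMINALS:
            body = TABLE.get((top, a))
            if body is None:
                raise ValueError(f"no entry M({top}, {a})")
            stack.pop()
            stack.extend(reversed(body))  # push W V U so that U ends up on top
            output.append((top, body))
        elif top == a:
            if top == "$":
                return output             # X = a = $: input matched
            stack.pop()                   # X = a != $: pop and advance
            pos += 1
        else:
            raise ValueError(f"expected {top!r}, found {a!r}")
```

For the input id op id the returned list contains E → id E_1, E_1 → op E, E → id E_1 and E_1 → ɛ, in that order, as in the OUTPUT column of the trace.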

FIRST and FOLLOW
Used when calculating the parsing table
- FIRST(α): set of terminals which begin the strings derivable from α (α is a string of grammar symbols); contains ɛ if α can derive ɛ
- FOLLOW(A): set of terminals which occur directly to the right of the nonterminal A in some sentential form of a derivation
- If A can be the rightmost symbol of a sentential form, then $ is contained in FOLLOW(A)

Calculation of FIRST
FIRST(X) for a grammar symbol X:
1. X is a terminal: FIRST(X) = {X}
2. X → ɛ is a production: add ɛ to FIRST(X)
3. X is a nonterminal and X → Y_1 Y_2 ... Y_k is a production: a is in FIRST(X) if
   1. there is an i such that a is in FIRST(Y_i) and ɛ is in every set FIRST(Y_1), ..., FIRST(Y_{i−1}), or
   2. a = ɛ and ɛ is in every set FIRST(Y_1), ..., FIRST(Y_k)
FIRST(X_1 X_2 ... X_n):
- Each non-ɛ symbol of FIRST(X_1) is in the result
- If ɛ ∈ FIRST(X_1), then each non-ɛ symbol of FIRST(X_2) is in the result, and so on
- If ɛ is in every FIRST set, then it is also contained in the result
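
The rules can be turned into a fixed-point computation: FIRST sets start empty and are enlarged until nothing changes. The following Python sketch is an illustration only; the grammar encoding (dict of nonterminal to list of bodies, [] for ɛ, E1 for E_1) and the names first_of_string and compute_first are assumptions.

```python
EPS = "ɛ"

# Example grammar: E -> id E1 | ( E ),  E1 -> op E | epsilon
GRAMMAR = {
    "E": [["id", "E1"], ["(", "E", ")"]],
    "E1": [["op", "E"], []],
}


def first_of_string(symbols, first):
    """FIRST(X_1 X_2 ... X_n), given the FIRST sets of the nonterminals."""
    result = set()
    for x in symbols:
        fx = first[x] if x in first else {x}   # terminal: FIRST(x) = {x}
        result |= fx - {EPS}
        if EPS not in fx:
            return result                      # X_i cannot vanish: stop here
    result.add(EPS)                            # every X_i can derive epsilon
    return result


def compute_first(grammar):
    """Iterate the FIRST rules until no set grows any more."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, bodies in grammar.items():
            for body in bodies:
                new = first_of_string(body, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first
```

An empty body makes first_of_string return {ɛ}, which covers rule 2 (X → ɛ) without a special case.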

Calculation of FOLLOW
In order to calculate FOLLOW(A) of a nonterminal A, use the following rules:
1. Add $ to FOLLOW(S), where S is the start symbol
2. For each production A → αBβ, add all elements of FIRST(β) except ɛ to FOLLOW(B)
3. For each production A → αB, and each production A → αBβ with ɛ ∈ FIRST(β), add each element of FOLLOW(A) to FOLLOW(B)

Example
Grammar:
  E → id E_1 | (E)
  E_1 → op E | ɛ
FIRST sets:
  FIRST(E) = {id, (}
  FIRST(E_1) = {op, ɛ}
FOLLOW sets:
  FOLLOW(E) = FOLLOW(E_1) = {$, )}
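
The FOLLOW rules can be applied mechanically to this grammar. The Python sketch below is an illustration under assumptions: the grammar encoding, the name E1 for E_1, and the function names are hypothetical, and a compact FIRST computation is included only to keep the example self-contained.

```python
EPS = "ɛ"
GRAMMAR = {
    "E": [["id", "E1"], ["(", "E", ")"]],
    "E1": [["op", "E"], []],               # [] is the epsilon production
}


def first_of_string(symbols, first):
    """FIRST of a symbol string; terminals have FIRST(x) = {x}."""
    result = set()
    for x in symbols:
        fx = first.get(x, {x})
        result |= fx - {EPS}
        if EPS not in fx:
            return result
    return result | {EPS}


def compute_first(grammar):
    first = {nt: set() for nt in grammar}
    while True:
        before = {nt: set(s) for nt, s in first.items()}
        for nt, bodies in grammar.items():
            for body in bodies:
                first[nt] |= first_of_string(body, first)
        if first == before:
            return first


def compute_follow(grammar, start, first):
    """Apply the three FOLLOW rules until no set grows any more."""
    follow = {nt: set() for nt in grammar}
    follow[start].add("$")                                   # rule 1
    changed = True
    while changed:
        changed = False
        for a, bodies in grammar.items():
            for body in bodies:
                for i, b in enumerate(body):
                    if b not in grammar:
                        continue                             # only nonterminals
                    beta_first = first_of_string(body[i + 1:], first)
                    new = beta_first - {EPS}                 # rule 2
                    if EPS in beta_first:                    # rule 3 (beta empty or nullable)
                        new |= follow[a]
                    if not new <= follow[b]:
                        follow[b] |= new
                        changed = True
    return follow
```

Calling compute_follow(GRAMMAR, "E", compute_first(GRAMMAR)) gives FOLLOW(E) = FOLLOW(E1) = {$, )}, in agreement with the sets listed above.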

Construction of parsing tables
Input: grammar G
Output: parsing table M
1. For each production A → α do steps 2 and 3.
2. For each terminal a in FIRST(α), add A → α to M(A, a).
3. If ɛ is in FIRST(α), add A → α to M(A, b) for each terminal b in FOLLOW(A). If ɛ is in FIRST(α) and $ is in FOLLOW(A), add A → α to M(A, $).
4. Make each undefined entry of M an error.
Example: see the table of the last example grammar
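
The construction can be sketched in Python for the example grammar. To keep the sketch short, the FIRST and FOLLOW sets are entered by hand from the previous example rather than computed; the names PRODUCTIONS, build_table and is_ll1 are hypothetical, and E1 stands for E_1.

```python
EPS = "ɛ"

PRODUCTIONS = [
    ("E", ["id", "E1"]),
    ("E", ["(", "E", ")"]),
    ("E1", ["op", "E"]),
    ("E1", []),                       # E1 -> epsilon
]
# Sets taken from the FIRST/FOLLOW example of this grammar.
FIRST = {"E": {"id", "("}, "E1": {"op", EPS}}
FOLLOW = {"E": {"$", ")"}, "E1": {"$", ")"}}


def first_of_string(symbols):
    """FIRST(alpha) for a production body alpha."""
    result = set()
    for x in symbols:
        fx = FIRST.get(x, {x})        # terminal: FIRST(x) = {x}
        result |= fx - {EPS}
        if EPS not in fx:
            return result
    return result | {EPS}


def build_table():
    """Map (nonterminal, terminal) to the list of production bodies entered there."""
    table = {}
    for head, body in PRODUCTIONS:
        f = first_of_string(body)
        columns = f - {EPS}                      # step 2
        if EPS in f:
            columns |= FOLLOW[head]              # step 3 ($ is an element of FOLLOW)
        for a in columns:
            table.setdefault((head, a), []).append(body)
    return table                                 # step 4: missing keys mean error


def is_ll1(table):
    """No table entry may hold more than one production."""
    return all(len(entries) == 1 for entries in table.values())
```

Because each entry is kept as a list, a grammar that is not LL(1) shows up as an entry with more than one production.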

LL(1) grammars
- The parsing table construction can be used with arbitrary grammars
- Multiple elements per entry may occur
- LL(1) grammar: grammar whose parsing table contains no multiple entries
  L ... scanning the input from LEFT to right
  L ... producing the LEFTMOST derivation
  1 ... using 1 input symbol of lookahead

Properties of LL(1)
- No ambiguous or left-recursive grammar is LL(1)
- G is LL(1) ⟺ for each two different productions A → α | β:
  1. α and β do not both derive strings which start with the same terminal a
  2. At most one of α and β can derive ɛ
  3. If β ⇒* ɛ, then α does not derive any string which starts with an element of FOLLOW(A)
- Multiple entries in the parsing table can occasionally be removed by hand (without changing the language recognized by the automaton)

Error recovery in predictive parsing
Heuristics in panic-mode error recovery:
1. Initially, all symbols in FOLLOW(A) can be used for synchronisation: skip all tokens until an element of FOLLOW(A) is read, then remove A from the stack.
2. If the FOLLOW sets do not suffice: use the hierarchical structure of program constructs, e.g. add keywords occurring at the beginning of a statement to the synchronisation set.
3. FIRST(A) can be used as well: if an element of FIRST(A) is read, continue parsing at A.
4. If a terminal which cannot be matched is at the top of the stack, remove it.

Bottom-up parsing
Shift-reduce parsing:
- Reduction of an input towards the start symbol of the grammar
- Reduction step: replace a substring which matches the right side of a production with the left side of that same production
Example:
  S → aABe
  A → Abc | b
  B → d
Reduction sequence: abbcde, aAbcde, aAde, aABe, S

Handles
Handle: substring which matches the right side of a production and whose reduction is one step of a rightmost derivation in reverse
Example (ambiguous grammar):
  E → E + E
  E → E * E
  E → (E)
  E → id
Rightmost derivation of id + id * id, in reverse:
Right-sentential form   Handle         Reducing production
id + id * id            id (first)     E → id
E + id * id             id (second)    E → id
E + E * id              id             E → id
E + E * E               E * E          E → E * E
E + E                   E + E          E → E + E
E

Stack implementation
- Initially: stack $, input w$
- Shift n ≥ 0 symbols from the input onto the stack until a handle can be found
- Reduce the handle (replace the handle with the left side of the production)

Example shift-reduce parsing
       Stack           Input             Action
(1)    $               id + id * id $    shift
(2)    $ id            + id * id $       reduce by E → id
(3)    $ E             + id * id $       shift
(4)    $ E +           id * id $         shift
(5)    $ E + id        * id $            reduce by E → id
(6)    $ E + E         * id $            shift
(7)    $ E + E *       id $              shift
(8)    $ E + E * id    $                 reduce by E → id
(9)    $ E + E * E     $                 reduce by E → E * E
(10)   $ E + E         $                 reduce by E → E + E
(11)   $ E             $                 accept

Viable prefixes, conflicts
Viable prefix: a prefix of a right-sentential form which can occur on the stack of a shift-reduce parser
Conflicts (ambiguous grammars):
  stmt → if expr then stmt | if expr then stmt else stmt | other
Configuration:
  Stack: ... if expr then stmt
  Input: else ...
No unambiguous handle (shift-reduce conflict)

LR parser
LR(k) parsing:
  L ... left-to-right scanning
  R ... rightmost derivation in reverse
Advantages:
- Can be used for (nearly) every programming language construct
- Most general backtrack-free shift-reduce parsing method
- The class of LR grammars is larger than that of LL grammars
- LR parsers identify errors as early as possible
Disadvantage:
- An LR parser is hard to build manually

LR-parsing algorithm
[Figure: model of an LR parser: input a_1 ... a_i ... a_n $, stack s_m X_m s_{m−1} X_{m−1} ... s_0, LR parsing program, output, parsing table with action and goto parts]
- The stack stores s_0 X_1 s_1 X_2 s_2 ... X_m s_m (X_i grammar symbol, s_i state)
- Parsing table = action table and goto table
- s_m current state, a_i current input symbol
- action[s_m, a_i] ∈ {shift, reduce, accept, error}
- goto[s, X]: transition function of a DFA (s state, X grammar symbol)

LR-parsing mode of operation
Configuration: (s_0 X_1 s_1 ... X_m s_m, a_i a_{i+1} ... a_n)
The next step (move) is determined by reading a_i and depends on action[s_m, a_i]:
1. action[s_m, a_i] = shift s
   New configuration: (s_0 X_1 s_1 ... X_m s_m a_i s, a_{i+1} ... a_n)
2. action[s_m, a_i] = reduce A → β
   New configuration: (s_0 X_1 s_1 ... X_{m−r} s_{m−r} A s, a_i a_{i+1} ... a_n), where s = goto[s_{m−r}, A] and r is the length of β
3. action[s_m, a_i] = accept
4. action[s_m, a_i] = error

Example
(1) E → E + T
(2) E → T
(3) T → T * F
(4) T → F
(5) F → (E)
(6) F → id

        action                                goto
State   id    +     *     (     )     $       E    T    F
0       s5                s4                  1    2    3
1             s6                      acc
2             r2    s7          r2    r2
3             r4    r4          r4    r4
4       s5                s4                  8    2    3
5             r6    r6          r6    r6
6       s5                s4                       9    3
7       s5                s4                            10
8             s6                s11
9             r1    s7          r1    r1
10            r3    r3          r3    r3
11            r5    r5          r5    r5
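
The table can be executed by a small driver that follows the LR mode of operation. This Python sketch makes two simplifying assumptions that are not in the slides: the stack holds only states (the grammar symbols X_i are implied by the states), and errors raise ValueError. The names PRODUCTIONS, ACTION, GOTO and lr_parse are hypothetical.

```python
# Production number -> (left side, length of right side).
PRODUCTIONS = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3),
               4: ("T", 1), 5: ("F", 3), 6: ("F", 1)}

ACTION = {
    0: {"id": "s5", "(": "s4"},
    1: {"+": "s6", "$": "acc"},
    2: {"+": "r2", "*": "s7", ")": "r2", "$": "r2"},
    3: {"+": "r4", "*": "r4", ")": "r4", "$": "r4"},
    4: {"id": "s5", "(": "s4"},
    5: {"+": "r6", "*": "r6", ")": "r6", "$": "r6"},
    6: {"id": "s5", "(": "s4"},
    7: {"id": "s5", "(": "s4"},
    8: {"+": "s6", ")": "s11"},
    9: {"+": "r1", "*": "s7", ")": "r1", "$": "r1"},
    10: {"+": "r3", "*": "r3", ")": "r3", "$": "r3"},
    11: {"+": "r5", "*": "r5", ")": "r5", "$": "r5"},
}
GOTO = {
    0: {"E": 1, "T": 2, "F": 3},
    4: {"E": 8, "T": 2, "F": 3},
    6: {"T": 9, "F": 3},
    7: {"F": 10},
}


def lr_parse(tokens):
    """Return the production numbers in the order they are reduced."""
    stack = [0]                              # state stack, s_0 at the bottom
    tokens = list(tokens) + ["$"]
    pos = 0
    reductions = []
    while True:
        act = ACTION[stack[-1]].get(tokens[pos])
        if act is None:
            raise ValueError(f"syntax error at token {pos}: {tokens[pos]!r}")
        if act == "acc":
            return reductions
        if act[0] == "s":                    # shift: push the new state
            stack.append(int(act[1:]))
            pos += 1
        else:                                # reduce A -> beta with r = |beta|
            number = int(act[1:])
            head, length = PRODUCTIONS[number]
            del stack[len(stack) - length:]  # pop r states
            stack.append(GOTO[stack[-1]][head])
            reductions.append(number)
```

For the input id * id + id the driver reduces by productions 6, 4, 6, 3, 2, 6, 4, 1, which is a rightmost derivation in reverse.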

Construction of SLR parsing tables
LR(0) items: production with a dot at one position of the right side
- Example: the production A → XYZ has 4 items: A → .XYZ, A → X.YZ, A → XY.Z and A → XYZ.
- Exception: the production A → ɛ only has the item A → .
Augmented grammar: grammar with a new start symbol S' and the production S' → S
Functions: closure and goto
closure(I) (I ... set of items):
1. All items of I are in the closure
2. If A → α.Bβ is in the closure and B → γ is a production, then add B → .γ to the closure

Construction, goto
goto(I, X), with I a set of items and X a symbol of the grammar:
  goto(I, X) = closure of the set of all items A → αX.β such that A → α.Xβ is in I
Example: I = {E' → E., E → E. + T}
  goto(I, +) = {E → E + .T, T → .T * F, T → .F, F → .(E), F → .id}
Sets-of-items construction (construction of all sets of LR(0) items):
items(G'):
  I_0 = closure({S' → .S})
  C = {I_0}
  repeat
    for each set of items I ∈ C and each grammar symbol X such that goto(I, X) is not empty and not in C do
      add goto(I, X) to C
  until no more sets of items can be added to C
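
The closure, goto and items functions can be sketched in Python for the augmented expression grammar. The encoding is an assumption of this illustration: an item is a tuple (left side, right side, dot position), item sets are frozensets, and the function names are hypothetical.

```python
# Augmented expression grammar; E' is the new start symbol.
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}


def closure(items):
    """Add B -> .gamma for every item A -> alpha . B beta, until nothing changes."""
    result = set(items)
    work = list(items)
    while work:
        head, body, dot = work.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            for gamma in GRAMMAR[body[dot]]:
                item = (body[dot], gamma, 0)
                if item not in result:
                    result.add(item)
                    work.append(item)
    return frozenset(result)


def goto(items, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {(h, b, d + 1) for (h, b, d) in items
             if d < len(b) and b[d] == symbol}
    return closure(moved)


def canonical_collection():
    """Sets-of-items construction: all reachable sets of LR(0) items."""
    collection = [closure({("E'", ("E",), 0)})]
    symbols = sorted({s for bodies in GRAMMAR.values() for b in bodies for s in b})
    for item_set in collection:          # the list grows while it is traversed
        for x in symbols:
            target = goto(item_set, x)
            if target and target not in collection:
                collection.append(target)
    return collection
```

For this grammar the construction yields 12 sets of items, one per state of the LR parsing table of the expression grammar.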

SLR parsing table
Input: augmented grammar G'
Output: SLR parsing table
1. Calculate C = {I_0, I_1, ..., I_n}, the collection of sets of LR(0) items of G'
2. State i is created from I_i as follows:
   1. If A → α.aβ is in I_i and goto(I_i, a) = I_j, then action[i, a] = shift j (a is a terminal symbol)
   2. If A → α. is in I_i, then action[i, a] = reduce A → α for all a ∈ FOLLOW(A), A ≠ S'
   3. If S' → S. is in I_i, then action[i, $] = accept
3. For all nonterminal symbols A: goto[i, A] = j if goto(I_i, A) = I_j
4. Every other table entry is set to error
5. The initial state is the one created from the item set containing S' → .S

SLR(1), conflicts, error handling
- If the SLR parsing table algorithm yields a table without multiple entries, then the grammar is SLR(1)
- Otherwise the algorithm fails and a more powerful method (such as canonical LR) needs to be used, which generally results in increased processing requirements
- Shift/reduce conflicts can be partially resolved; this usually involves specifying operator binding strength (precedence) and associativity
- Error handling can be directly incorporated into the parsing table