Talen en Compilers 2015-2016, period 2
Johan Jeuring
Department of Information and Computing Sciences, Utrecht University
January 17, 2016

13. LR parsing

This lecture: LR parsing
Basic idea
The LR(0) automaton
Conflicts

13.1 Basic idea

Setup

The LR parsing procedure is similar to that of LL parsing. An LR parser maintains a state consisting of a stack of symbols and the current (remaining) input. We again write (α, x) to denote the state consisting of the stack α and the input x.

Actions

We start with the state (ε, x), where x is the input string, i.e., with the empty stack. In each step, we do one of the following:

Shift: The first symbol of the input is moved to the top of the stack. We let the stack grow to the right this time.
Reduce: If the top symbols of the stack match the right hand side of a production, we can replace them with the corresponding left hand side.

The parser succeeds if we can reach the state (S, ε), where S is the start symbol. Otherwise, the parser fails.

Acceptance

As with LL, the LR approach is at first nondeterministic: it is not always clear when to shift and when to reduce. An LR machine accepts a word if at least one sequence of choices leads to success.

Example

S → aS | cS | b

Let us parse aab:

stack   input   action
ε       aab     initial state, we have to shift
a       ab      shift
aa      b       shift
aab     ε       reduce
aaS     ε       reduce
aS      ε       reduce
S       ε       parse successful

Another example

S → cA | b
A → cBC | bSA | a
B → cc | Cb
C → aS | ba

Let us parse ccccba.

stack   input    action
ε       ccccba   initial state, shift
c       cccba    shift
cc      ccba     let's try reducing
B       ccba     shift
Bc      cba      shift
Bcc     ba       let's try reducing
BB      ba       shift
BBb     a        let's try reducing
BBS     a        shift
BBSa    ε        reduce
BBSA    ε        we're stuck

Reducing as early as possible does not seem to be a good strategy in general.

Another example, another strategy

S → cA | b
A → cBC | bSA | a
B → cc | Cb
C → aS | ba

Let us parse ccccba.

stack   input    action
ε       ccccba   initial state, shift
c       cccba    shift
cc      ccba     let's try shifting
ccc     cba      let's try shifting
cccc    ba       let's try reducing
ccB     ba       shift
ccBb    a        let's try shifting
ccBba   ε        reduce, but how?
ccBC    ε        reduce
cA      ε        reduce
S       ε        parse successful
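The nondeterministic machine described so far can be sketched as an exhaustive search over all shift and reduce choices. This is only an illustration, not a practical parser: the function name accepts and the encoding (one character per symbol, the stack as a string growing to the right) are assumptions made for this sketch, and it assumes no production has an empty right hand side.

```python
def accepts(grammar, start, word):
    """Try every shift/reduce choice; True if some sequence reaches (start, empty input)."""
    seen = set()
    todo = [("", word)]  # configurations (stack, remaining input)
    while todo:
        stack, rest = todo.pop()
        if (stack, rest) in seen:
            continue
        seen.add((stack, rest))
        if stack == start and rest == "":
            return True
        if rest:
            # Shift: move the first input symbol onto the stack.
            todo.append((stack + rest[0], rest[1:]))
        for lhs, alternatives in grammar.items():
            for rhs in alternatives:
                # Reduce: replace a right hand side on top of the stack by its left hand side.
                if rhs and stack.endswith(rhs):
                    todo.append((stack[: len(stack) - len(rhs)] + lhs, rest))
    return False


FIRST = {"S": ["aS", "cS", "b"]}
SECOND = {"S": ["cA", "b"], "A": ["cBC", "bSA", "a"], "B": ["cc", "Cb"], "C": ["aS", "ba"]}

print(accepts(FIRST, "S", "aab"))      # True
print(accepts(SECOND, "S", "ccccba"))  # True
```

The search succeeds on ccccba even though reducing as early as possible gets stuck, because it also explores the configurations in which the reduction is postponed.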

LR vs. LL

LR                  LL
stack   input       stack   input
ε       ccccba      S       ccccba
c       cccba       cA      ccccba
cc      ccba        A       cccba
ccc     cba         cBC     cccba
cccc    ba          BC      ccba
ccB     ba          ccC     ccba
ccBb    a           cC      cba
ccBba   ε           C       ba
ccBC    ε           ba      ba
cA      ε           a       a
S       ε           ε       ε

LR vs. LL contd.

LR derivation

S ⇒ cA ⇒ ccBC ⇒ ccBba ⇒ ccccba

This is a rightmost derivation. The LR parser discovers its steps in reverse order, since every reduction undoes one derivation step.

LL derivation

S ⇒ cA ⇒ ccBC ⇒ ccccC ⇒ ccccba

This is a leftmost derivation.

The name LR reflects that the input is consumed starting from the left (the L), and a rightmost derivation is returned (the R).
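As a small check of the claim about the rightmost derivation, the sentential forms can be read off an LR trace: in every configuration, stack followed by remaining input is a sentential form, and it only changes on a reduction. The sketch below hard-codes the successful trace for ccccba; the names LR_TRACE and derivation are made up for this illustration.

```python
# Configurations (stack, remaining input) of the successful LR parse of ccccba.
LR_TRACE = [
    ("", "ccccba"), ("c", "cccba"), ("cc", "ccba"), ("ccc", "cba"),
    ("cccc", "ba"), ("ccB", "ba"), ("ccBb", "a"), ("ccBba", ""),
    ("ccBC", ""), ("cA", ""), ("S", ""),
]


def derivation(trace):
    """Collect the distinct sentential forms of a trace and return them in derivation order."""
    forms = []
    for stack, rest in trace:
        form = stack + rest
        # A shift leaves stack + input unchanged; only a reduction yields a new form.
        if not forms or forms[-1] != form:
            forms.append(form)
    return forms[::-1]


print(" ⇒ ".join(derivation(LR_TRACE)))  # S ⇒ cA ⇒ ccBC ⇒ ccBba ⇒ ccccba
```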

13.2 The LR(0) automaton

How to pick actions

During the parse process:

- we can only shift if there are symbols left in the input,
- we should only shift if there is an option of a later reduction,
- we can only reduce if the top of the stack matches a right hand side.

We thus have to keep track of where in the productions we might currently be while recognizing the input.

Items

Definition: An item (also called LR(0) item) is a production together with a marked position on the right hand side. The position can be at the beginning, between two symbols, or at the end.

For example, for the production A → cBC we obtain the following items:

A → •cBC
A → c•BC
A → cB•C
A → cBC•
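The definition of an item translates directly into a small data representation. In this sketch an item is a triple (left hand side, right hand side, marker position); the triple encoding and the names items and show are choices made for the illustration only.

```python
def items(lhs, rhs):
    """All LR(0) items of the production lhs → rhs: one per marker position."""
    return [(lhs, rhs, dot) for dot in range(len(rhs) + 1)]


def show(item):
    """Render an item with • at the marked position."""
    lhs, rhs, dot = item
    return f"{lhs} → {rhs[:dot]}•{rhs[dot:]}"


for item in items("A", "cBC"):
    print(show(item))
```

A production with n symbols on its right hand side therefore has n + 1 items.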

Explicit end of input

The following development is easier if we require the input grammar to have an explicit end of input marker (written $):

S' → S$
S → cA | b
A → cBC | bSA | a
B → cc | Cb
C → aS | ba

Start

Our example grammar:

S' → S$
S → cA | b
A → cBC | bSA | a
B → cc | Cb
C → aS | ba

At the beginning of the parse process, we are at the beginning of recognizing the right hand side of the start symbol. This corresponds to the item:

S' → •S$

Closure of an item (set)

If the marker in an item is in front of a nonterminal, we can in fact be at the beginning of recognizing the right hand side of that nonterminal, too. We add these items and repeat the procedure until we reach a fixed point, the closure of the item set:

S' → •S$
S → •cA
S → •b

These three items describe a state in an automaton augmenting the parser.
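The fixed point computation can be sketched with a worklist. The encoding is an assumption of this sketch: the grammar is a dictionary from nonterminal to right hand sides, every symbol on a right hand side is a single character, and an item is a triple (left hand side, right hand side, marker position).

```python
GRAMMAR = {
    "S'": ["S$"],
    "S": ["cA", "b"],
    "A": ["cBC", "bSA", "a"],
    "B": ["cc", "Cb"],
    "C": ["aS", "ba"],
}


def closure(items):
    """Add the initial items of every nonterminal that follows a marker, until nothing changes."""
    result, todo = set(items), list(items)
    while todo:
        _, rhs, dot = todo.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:  # marker in front of a nonterminal
            for alternative in GRAMMAR[rhs[dot]]:
                item = (rhs[dot], alternative, 0)
                if item not in result:
                    result.add(item)
                    todo.append(item)
    return frozenset(result)


for lhs, rhs, dot in sorted(closure({("S'", "S$", 0)})):
    print(f"{lhs} → {rhs[:dot]}•{rhs[dot:]}")
```

Returning a frozenset makes item sets usable as automaton states, since two states are the same exactly when they contain the same items.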

Building the automaton

A state such as

S' → •S$
S → •cA
S → •b

tells us a few things:

- We cannot reduce in this state, because no items in the current state have position markers at the end of a right hand side.
- We should shift on seeing a c or a b.
- There cannot be a valid parse if the next input symbol is an a, so we can signal an error.

Transitions

Transitions in the automaton are labelled with symbols, i.e., transitions can be labelled with both terminals and nonterminals. In either case, to determine the new state, we take the items that have the marker in front of the transition symbol, move the marker past that symbol, and compute the closure of the resulting items.

Transition example

From state 0,

S' → •S$
S → •cA
S → •b

on seeing a c we move to state 1:

S → c•A
A → •cBC
A → •bSA
A → •a

This is another shift state.

Transition example contd.

From state 1,

S → c•A
A → •cBC
A → •bSA
A → •a

on seeing another c we move to state 2:

A → c•BC
B → •cc
B → •Cb
C → •aS
C → •ba

Still a shift state.

Transition example contd.

From state 2,

A → c•BC
B → •cc
B → •Cb
C → •aS
C → •ba

on seeing yet another c we move to state 3:

B → c•c

Still a shift state.

Transition example contd.

From state 3,

B → c•c

on seeing yet another c we move to state 4:

B → cc•

A state with items that are marked in end position is called a reduce state. If there are no other items in the state (as here), the state has no outgoing transitions.
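The transition examples above can be replayed mechanically. The sketch below uses the same assumed encoding as before (items as triples, one character per symbol) and a hypothetical function goto that moves the marker over a symbol and closes the result.

```python
GRAMMAR = {
    "S'": ["S$"],
    "S": ["cA", "b"],
    "A": ["cBC", "bSA", "a"],
    "B": ["cc", "Cb"],
    "C": ["aS", "ba"],
}


def closure(items):
    """Close an item set under 'marker in front of a nonterminal'."""
    result, todo = set(items), list(items)
    while todo:
        _, rhs, dot = todo.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            for alternative in GRAMMAR[rhs[dot]]:
                item = (rhs[dot], alternative, 0)
                if item not in result:
                    result.add(item)
                    todo.append(item)
    return frozenset(result)


def goto(state, symbol):
    """Move the marker over symbol where applicable, then take the closure."""
    moved = {(lhs, rhs, dot + 1) for lhs, rhs, dot in state
             if dot < len(rhs) and rhs[dot] == symbol}
    return closure(moved)


state = closure({("S'", "S$", 0)})  # state 0
for number in range(1, 5):          # states 1 to 4, each reached on a c
    state = goto(state, "c")
    shown = ", ".join(f"{l} → {r[:d]}•{r[d:]}" for l, r, d in sorted(state))
    print(f"state {number}: {shown}")
```

An empty result from goto means that there is no transition on that symbol, which is how the error on a in state 0 shows up.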

How to reduce?

On the stack, we remember not only the symbols we push, but also the states of the automaton we went through. Since we reduce the two c's to a B, we remove them from the stack together with states 3 and 4. This takes us back to state 2,

A → c•BC
B → •cc
B → •Cb
C → •aS
C → •ba

where we continue as if we had just read a B.

Summary

The automaton tells us what to do:

- In a shift state, we shift the next symbol and the state it leads to if that symbol is among the transitions, or signal an error otherwise.
- In a reduce state, we remove the right hand side from the stack, look at the state that is now on top, follow its transition labelled with the left hand side, put the left hand side (and the new state) on the stack, and go on.
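The summary can be turned into a small driver loop. This is a sketch under several assumptions: symbols are single characters, no production has an empty right hand side, the states are computed on the fly instead of from a precomputed table, and only the stack of states is kept (the symbols are implied by it). The function name parse is hypothetical.

```python
GRAMMAR = {
    "S'": ["S$"],
    "S": ["cA", "b"],
    "A": ["cBC", "bSA", "a"],
    "B": ["cc", "Cb"],
    "C": ["aS", "ba"],
}


def closure(items):
    result, todo = set(items), list(items)
    while todo:
        _, rhs, dot = todo.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            for alternative in GRAMMAR[rhs[dot]]:
                item = (rhs[dot], alternative, 0)
                if item not in result:
                    result.add(item)
                    todo.append(item)
    return frozenset(result)


def goto(state, symbol):
    return closure({(l, r, d + 1) for l, r, d in state if d < len(r) and r[d] == symbol})


def parse(word):
    """Run the LR(0) machine on word; True if it is accepted."""
    rest = word + "$"                        # explicit end of input marker
    states = [closure({("S'", "S$", 0)})]    # stack of automaton states
    while True:
        state = states[-1]
        complete = [(l, r) for l, r, d in state if d == len(r)]
        if complete:
            if len(state) > 1:
                raise ValueError("conflict: this state is not a pure reduce state")
            lhs, rhs = complete[0]
            if lhs == "S'":
                return rest == ""            # S' → S$• reached: accept
            del states[-len(rhs):]           # remove the right hand side
            states.append(goto(states[-1], lhs))  # read the left hand side instead
        else:
            target = goto(state, rest[0]) if rest else frozenset()
            if not target:
                return False                 # no transition: signal an error
            states.append(target)            # shift
            rest = rest[1:]


print(parse("ccccba"))  # True
print(parse("ccccbb"))  # False
```

The driver needs no search: in every state the items alone decide between shifting and reducing, which is what the automaton was built for.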

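A part of the automaton

[Figure: part of the LR(0) automaton. The item sets and transitions recoverable from the diagram are listed below.]

Start state: S' → •S$, S → •cA, S → •b
  on S: S' → S•$
  on c: S → c•A, A → •cBC, A → •bSA, A → •a
          on A: S → cA•
          on c: A → c•BC, B → •cc, B → •Cb, C → •aS, C → •ba
                  on c: B → c•c
                          on c: B → cc•
                  on B: A → cB•C, C → •aS, C → •ba
                          on C: A → cBC•
                          on b: C → b•a
                                  on a: C → ba•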

The complete automaton

The automaton can be rather large (as it is composed of item sets). How many items are there? How many potential item sets? In our example case, the automaton turns out to have 20 states. Constructing it is a suitable task for a generator, but not really for doing by hand.
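The questions about size can be explored with a short script that builds all reachable item sets. The sketch rests on assumptions: symbols are single characters, and the state containing S' → S•$ is treated as accepting, so no transition on $ is followed. Under that reading the count comes out at the 20 states mentioned above; following $ as well would add one more state.

```python
GRAMMAR = {
    "S'": ["S$"],
    "S": ["cA", "b"],
    "A": ["cBC", "bSA", "a"],
    "B": ["cc", "Cb"],
    "C": ["aS", "ba"],
}


def closure(items):
    result, todo = set(items), list(items)
    while todo:
        _, rhs, dot = todo.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            for alternative in GRAMMAR[rhs[dot]]:
                item = (rhs[dot], alternative, 0)
                if item not in result:
                    result.add(item)
                    todo.append(item)
    return frozenset(result)


def goto(state, symbol):
    return closure({(l, r, d + 1) for l, r, d in state if d < len(r) and r[d] == symbol})


def build():
    """Collect all states reachable from the start state, plus the transitions between them."""
    start = closure({("S'", "S$", 0)})
    states, todo, edges = {start}, [start], {}
    while todo:
        state = todo.pop()
        symbols = {r[d] for _, r, d in state if d < len(r)} - {"$"}  # $ is not followed
        for symbol in sorted(symbols):
            target = goto(state, symbol)
            edges[(state, symbol)] = target
            if target not in states:
                states.add(target)
                todo.append(target)
    return states, edges


STATES, EDGES = build()
ITEM_COUNT = sum(len(rhs) + 1 for alternatives in GRAMMAR.values() for rhs in alternatives)
print(ITEM_COUNT, "items,", len(STATES), "states")  # 30 items, 20 states
```

With 30 items there are 2^30 potential item sets, but only a tiny fraction of them is reachable from the start state.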

13.3 Conflicts

Types of conflicts

There are two types of conflicts that can arise:

- If a state allows both shifting and reducing, there is a shift-reduce conflict.
- If a state allows reducing by more than one production, there is a reduce-reduce conflict.

Example of a shift-reduce conflict

Consider the following grammar:

S' → S$
S → aSa | a

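Example of a shift-reduce conflict contd.

The LR(0) automaton looks as follows:

[Figure: LR(0) automaton for the grammar. The item sets and transitions recoverable from the diagram are listed below.]

Start state: S' → •S$, S → •aSa, S → •a
  on S: S' → S•$
  on a: S → a•Sa, S → a•, S → •aSa, S → •a   (marked state)
          on a: back to the marked state
          on S: S → aS•a
                  on a: S → aSa•

The marked state is both a shift and a reduce state.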

Example of a reduce-reduce conflict

Consider the following grammar:

S' → S$
S → Aa | Bb
A → a
B → a

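Example of a reduce-reduce conflict contd.

The LR(0) automaton looks as follows:

[Figure: LR(0) automaton for the grammar. The item sets and transitions recoverable from the diagram are listed below.]

Start state: S' → •S$, S → •Aa, S → •Bb, A → •a, B → •a
  on a: A → a•, B → a•   (marked state)
  on S: S' → S•$
  on A: S → A•a
          on a: S → Aa•
  on B: S → B•b
          on b: S → Bb•

The marked state contains multiple reducible productions.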
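Both kinds of conflict can be detected mechanically by inspecting the items of each state. The sketch below is one possible reading of the definitions: a state "allows shifting" if some item has the marker in front of a terminal, and "allows reducing" if some item has the marker at the end. The function name conflicts and the grammar encoding (single-character symbols, start production S' → S$) are assumptions of the sketch.

```python
def closure(grammar, items):
    result, todo = set(items), list(items)
    while todo:
        _, rhs, dot = todo.pop()
        if dot < len(rhs) and rhs[dot] in grammar:
            for alternative in grammar[rhs[dot]]:
                item = (rhs[dot], alternative, 0)
                if item not in result:
                    result.add(item)
                    todo.append(item)
    return frozenset(result)


def goto(grammar, state, symbol):
    moved = {(l, r, d + 1) for l, r, d in state if d < len(r) and r[d] == symbol}
    return closure(grammar, moved)


def conflicts(grammar):
    """Build the LR(0) automaton and report the kind of conflict of every conflicting state."""
    start = closure(grammar, {("S'", "S$", 0)})
    states, todo, found = {start}, [start], []
    while todo:
        state = todo.pop()
        reducible = [item for item in state if item[2] == len(item[1])]
        shiftable = {r[d] for _, r, d in state if d < len(r) and r[d] not in grammar}
        if reducible and shiftable:
            found.append("shift-reduce")
        if len(reducible) > 1:
            found.append("reduce-reduce")
        for symbol in sorted({r[d] for _, r, d in state if d < len(r)}):
            target = goto(grammar, state, symbol)
            if target not in states:
                states.add(target)
                todo.append(target)
    return found


SHIFT_REDUCE = {"S'": ["S$"], "S": ["aSa", "a"]}
REDUCE_REDUCE = {"S'": ["S$"], "S": ["Aa", "Bb"], "A": ["a"], "B": ["a"]}

print(conflicts(SHIFT_REDUCE))   # ['shift-reduce']
print(conflicts(REDUCE_REDUCE))  # ['reduce-reduce']
```

Each of the two grammars has exactly one conflicting state, the one marked in the corresponding automaton. A grammar for which the list is empty can be parsed by the LR(0) machine without any guessing.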

Next lecture

How to resolve conflicts (sometimes).
Another example.
Discussion and summary.