Parsing. Earley Parsing. Laura Kallmeyer. Winter 2017/18. Heinrich-Heine-Universität Düsseldorf 1 / 39



Table of contents
1 Idea
2 Algorithm
3 Tabulation
4 Parsing
5 Lookaheads

Idea (1)
Goal: overcome problems with pure TD/BU approaches. Earley's algorithm can be seen as
- a bottom-up parser with top-down control, i.e., a bottom-up parser that performs only those reductions that can be predicted top-down from S, or
- a top-down parser with bottom-up recognition.

Idea (2)
At each point during parsing, one production A → X_1 … X_k is considered such that some prefix X_1 … X_i has already been recognized bottom-up (completed) while the remainder X_{i+1} … X_k has been predicted top-down.
As in the left-corner chart parser, this situation can be characterized by a dotted production (sometimes called an Earley item) A → X_1 … X_i • X_{i+1} … X_k.
Dotted productions are called active items. Productions of the form A → α • are called completed items.
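Such dotted productions translate directly into code; the following is a minimal Python sketch (the class and field names are my own, not from the slides):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DottedProd:
    """A dotted production A -> X1 ... Xi . Xi+1 ... Xk, encoded as the
    production plus the position of the dot."""
    lhs: str     # A
    rhs: tuple   # (X1, ..., Xk)
    dot: int     # number of rhs symbols already completed

    def is_complete(self):
        # completed item: A -> alpha .
        return self.dot == len(self.rhs)

    def next_symbol(self):
        # symbol right after the dot, or None for a completed item
        return self.rhs[self.dot] if self.dot < len(self.rhs) else None

# A -> X1 . X2 X3
p = DottedProd("A", ("X1", "X2", "X3"), 1)
```

Whether `next_symbol()` is a terminal, a non-terminal, or `None` is exactly what distinguishes the scan, predict, and complete cases below.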

Idea (3)
The Earley parser simulates a top-down left-to-right depth-first traversal of the parse tree while moving the dot such that, for each node,
- first, the dot is to its left (the node is predicted),
- then the dot traverses the tree below,
- then the dot is to its right (the subtree below the node is completed).
Each state of the parser can be characterized by a set of dotted productions A → X_1 … X_i • X_{i+1} … X_k. For each of these, one needs to keep track of the input span covered by the completed part X_1 … X_i of the rhs.

Algorithm (1)
The items describing partial results of the parser contain a dotted production and the start and end index of the input covered by the completed part of the rhs:
Item form: [A → α • β, i, j] with A → αβ ∈ P, 0 ≤ i ≤ j ≤ n.
Parsing starts with predicting all S-productions:
Axioms: [S → • α, 0, 0] for every S → α ∈ P

Algorithm (2)
If the dot of an item is followed by a non-terminal symbol B, a new B-production can be predicted. The completed part of the new item (still empty) starts at the index where the completed part of the first item ends.

Predict:
  [A → α • Bβ, i, j]
  ------------------  B → γ ∈ P
  [B → • γ, j, j]

If the dot of an item is followed by a terminal symbol a that is the next input symbol, the dot can be moved over this terminal (the terminal is scanned). The end position of the completed part is incremented.

Scan:
  [A → α • aβ, i, j]
  ----------------------  w_{j+1} = a
  [A → αa • β, i, j + 1]

Algorithm (3)
If the dot of an item is followed by a non-terminal symbol B, if there is a second item with a dotted B-production and a fully completed rhs, and if, furthermore, the completed part of the second item starts at the position where the completed part of the first ends, then the dot in the first item can be moved over the B while changing the end index to the end index of the completed B-production.

Complete:
  [A → α • Bβ, i, j]   [B → γ •, j, k]
  ------------------------------------
  [A → αB • β, i, k]

The parser is successful if a completed S-production spanning the entire input can be deduced:
Goal items: [S → α •, 0, n] for some S → α ∈ P.

Algorithm (4)
Note that
- this algorithm can deal with ε-productions;
- loops and left recursion are no problem since an active item is generated only once;
- the algorithm works for any type of CFG.
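The rule schemata (axioms, predict, scan, complete) can be turned into a small recognizer almost literally. The following Python sketch is my own encoding, not from the slides: items are tuples (lhs, rhs, dot, i, j), a symbol counts as a non-terminal iff it is a key of the grammar dict, and a worklist closes the item set under the rules:

```python
def earley_recognize(grammar, start, w):
    """Deduction-style Earley recognizer. grammar maps each non-terminal
    to a list of right-hand sides (tuples of symbols); w is the input."""
    n = len(w)
    chart = set()
    agenda = [(start, rhs, 0, 0, 0) for rhs in grammar[start]]  # axioms
    while agenda:
        item = agenda.pop()
        if item in chart:          # each item is processed only once
            continue
        chart.add(item)
        lhs, rhs, dot, i, j = item
        if dot < len(rhs):
            sym = rhs[dot]
            if sym in grammar:
                # predict: [B -> . gamma, j, j] for every B-production
                for gamma in grammar[sym]:
                    agenda.append((sym, gamma, 0, j, j))
                # complete against B-items that are already finished
                for (l2, r2, d2, i2, j2) in list(chart):
                    if l2 == sym and d2 == len(r2) and i2 == j:
                        agenda.append((lhs, rhs, dot + 1, i, j2))
            elif j < n and w[j] == sym:
                # scan: w_{j+1} is w[j] in 0-based indexing
                agenda.append((lhs, rhs, dot + 1, i, j + 1))
        else:
            # complete: move the dot over lhs in active items ending at i
            for (l2, r2, d2, i2, j2) in list(chart):
                if d2 < len(r2) and r2[d2] == lhs and j2 == i:
                    agenda.append((l2, r2, d2 + 1, i2, j))
    # goal: a completed start-production spanning the whole input
    return any(l == start and d == len(r) and i == 0 and j == n
               for (l, r, d, i, j) in chart)
```

On the example grammar used below (S → aB | bA, A → aS | bAA | a, B → bS | aBB | b) and w = abab, this yields acceptance; since each item enters the chart only once, left recursion and loops terminate.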

Algorithm (5)
Example. Productions: S → aB | bA, A → aS | bAA | a, B → bS | aBB | b. w = abab.
Set of deduced items:
1. [S → • aB, 0, 0]    axiom
2. [S → • bA, 0, 0]    axiom
3. [S → a • B, 0, 1]   scan with 1.
4. [B → • bS, 1, 1]    predict with 3.
5. [B → • b, 1, 1]     predict with 3.
6. [B → • aBB, 1, 1]   predict with 3.
7. [B → b • S, 1, 2]   scan with 4.
8. [B → b •, 1, 2]     scan with 5.
9. [S → aB •, 0, 2]    complete with 3. and 8.

Algorithm (6)
Example continued:
10. [S → • aB, 2, 2]   predict with 7.
11. [S → • bA, 2, 2]   predict with 7.
12. [S → a • B, 2, 3]  scan with 10.
13. [B → • bS, 3, 3]   predict with 12.
14. [B → • b, 3, 3]    predict with 12.
15. [B → • aBB, 3, 3]  predict with 12.
16. [B → b • S, 3, 4]  scan with 13.
17. [B → b •, 3, 4]    scan with 14.
18. [S → • aB, 4, 4]   predict with 16.
19. [S → • bA, 4, 4]   predict with 16.
20. [S → aB •, 2, 4]   complete with 12. and 17.
21. [B → bS •, 1, 4]   complete with 7. and 20.
22. [S → aB •, 0, 4]   complete with 3. and 21.

Algorithm (7)
Soundness and completeness. The following holds:
⊢ [A → α • β, i, j] iff S ⇒* w_1 … w_i Aγ ⇒ w_1 … w_i αβγ ⇒* w_1 … w_i w_{i+1} … w_j βγ for some γ ∈ (N ∪ T)*.
The algorithm is in particular prefix-valid: if there is an item with end position j, then there is a word in the language with prefix w_1 … w_j.

Algorithm (8)
In addition, one can use passive items [A, i, j] with A ∈ N, 0 ≤ i ≤ j ≤ n. Then we need an additional convert rule that turns a completed active item into a passive one:

Convert:
  [B → γ •, j, k]
  ---------------
  [B, j, k]

The goal item is then [S, 0, n].

Algorithm (9)
The Complete rule can now use passive items:

Complete:
  [A → α • Bβ, i, j]   [B, j, k]
  ------------------------------
  [A → αB • β, i, k]

The advantage is a higher degree of factorization: a B-subtree might have different analyses. Complete can now use the category B independently of the concrete analyses, i.e., there is only a single application of Complete for all of them.

Tabulation (1)
We can tabulate the dotted productions depending on the indices of the covered input, i.e., we adopt an (n + 1) × (n + 1) chart C with A → α • β ∈ C_{i,j} iff ⊢ [A → α • β, i, j].
The chart is initialized with C_{0,0} := {S → • α | S → α ∈ P} and C_{i,j} := ∅ for all i, j ∈ [0..n] with i ≠ 0 or j ≠ 0. It can then be filled in the following way:

Tabulation (2)
Let us consider the version without passive items. The chart is filled row by row: for every end-of-span index k,
- we first compute all applications of predict and complete that yield new items with end-of-span index k;
- then we compute all applications of scan, which yields items with end-of-span index k + 1.

Tabulation (3)
for all k ∈ [0..n]:
    repeat until the chart does not change any more:
        for all j ∈ [0..k] and all p ∈ C_{j,k}:
            if p = A → α • Bβ then                         (predict)
                add B → • γ to C_{k,k} for all B → γ ∈ P
            else if p = B → γ • then                       (complete)
                for all i ∈ [0..j]:
                    if there is an A → α • Bβ ∈ C_{i,j} then
                        add A → αB • β to C_{i,k}
    for all j ∈ [0..k] and all p ∈ C_{j,k}:                (scan)
        if p = A → α • w_{k+1} β then
            add A → α w_{k+1} • β to C_{j,k+1}

Tabulation (4)
Note that predict and complete do not increment the end of the spanned input, i.e., they add elements only to the cells C_{·,k} (the k-th row of the chart). Scan, however, adds elements to C_{·,k+1} (the (k+1)-th row).
This is why first all possible predict and complete operations are performed to generate new chart entries in the k-th row; then scan is applied, and one can move on to the next row k + 1.
Since predict and complete are applied as often as possible, ε-productions and left recursion are no problem for this algorithm.

Tabulation (5)
Implementation: besides the chart, for every k we keep an agenda A_k of those items from the chart that still need to be processed. Initially, for k = 0, this agenda contains all S → • α; the other agendas are empty.
We process the items in the agendas from k = 0 to k = n. For each k, we stop once the k-agenda is empty.

Tabulation (6)
Items x of the form A → α • Bβ trigger a predict operation. The newly created items, if they are not yet in the chart, are added to the chart and the k-agenda.
In addition, if ε-productions are allowed, x also triggers a complete where the chart is searched for a completed B-item ranging from k to k. The new items (if not yet in the chart) are added to the k-agenda and the chart.
x is removed from the k-agenda.

Tabulation (8)
Items x of the form B → γ • trigger a complete operation where the chart is searched for corresponding items A → α • Bβ. The newly created items are added to the chart and the k-agenda (if they are not yet in the chart); x is removed from the k-agenda.
Items x of the form A → α • aβ trigger a scan operation. The newly created items (if not yet in the chart) are added to the chart and the (k+1)-agenda; x is removed from the k-agenda.
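The chart-plus-agendas scheme of the last slides can be sketched as follows (Python, my own encoding, not from the slides): the chart maps (i, k) to a set of dotted productions (lhs, rhs, dot), and agendas[k] holds the unprocessed items with end-of-span index k:

```python
def earley_chart(grammar, start, w):
    """Row-by-row Earley chart recognizer. grammar maps non-terminals to
    lists of rhs tuples; returns the filled chart."""
    n = len(w)
    chart = {(i, k): set() for i in range(n + 1) for k in range(n + 1)}
    agendas = [[] for _ in range(n + 1)]

    def add(prod, i, k):
        # add to chart cell C_{i,k} and to the k-agenda if not yet known
        if prod not in chart[(i, k)]:
            chart[(i, k)].add(prod)
            agendas[k].append(prod + (i,))

    for rhs in grammar[start]:                      # initialize C_{0,0}
        add((start, rhs, 0), 0, 0)
    for k in range(n + 1):
        while agendas[k]:
            lhs, rhs, dot, i = agendas[k].pop()
            if dot < len(rhs):
                sym = rhs[dot]
                if sym in grammar:                  # predict
                    for gamma in grammar[sym]:
                        add((sym, gamma, 0), k, k)
                    # extra complete against B-items ranging from k to k
                    for (l2, r2, d2) in list(chart[(k, k)]):
                        if l2 == sym and d2 == len(r2):
                            add((lhs, rhs, dot + 1), i, k)
                elif k < n and w[k] == sym:         # scan into row k + 1
                    add((lhs, rhs, dot + 1), i, k + 1)
            else:                                   # complete: span (i, k)
                for j in range(i + 1):
                    for (l2, r2, d2) in list(chart[(j, i)]):
                        if d2 < len(r2) and r2[d2] == lhs:
                            add((l2, r2, d2 + 1), j, k)
    return chart
```

On the example grammar below (S → ASB | c, A → a, B → b) with w = acb, the goal item S → ASB • ends up in C_{0,3}; the extra complete step inside predict is what makes ε-productions work.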

Tabulation (9)
Example without ε-productions. Productions S → ASB | c, A → a, B → b; input w = acb.
A_0 = {[S → • ASB, 0, 0], [S → • c, 0, 0]}
Chart: C_{0,0} = {S → • ASB, S → • c}
S → • ASB triggers a predict:

Tabulation (10)
Example continued.
A_0 = {[S → • c, 0, 0], [A → • a, 0, 0]}
Chart: C_{0,0} = {S → • ASB, S → • c, A → • a}
S → • c triggers a scan that fails; A → • a triggers a successful scan:

Tabulation (11)
Example continued.
A_0 = {}, A_1 = {[A → a •, 0, 1]}
Chart: C_{0,0} = {S → • ASB, S → • c, A → • a}; C_{0,1} = {A → a •}
A → a • triggers a complete:

Tabulation (12)
Example continued.
A_0 = {}, A_1 = {[S → A • SB, 0, 1]}
Chart: C_{0,0} = {S → • ASB, S → • c, A → • a}; C_{0,1} = {A → a •, S → A • SB}
S → A • SB triggers a predict:

Tabulation (13)
Example continued.
A_0 = {}, A_1 = {[S → • ASB, 1, 1], [S → • c, 1, 1]}
Chart: C_{0,0} = {S → • ASB, S → • c, A → • a}; C_{0,1} = {S → A • SB, A → a •}; C_{1,1} = {S → • ASB, S → • c}
S → • ASB triggers a predict:

Tabulation (14)
Example continued.
A_0 = {}, A_1 = {[S → • c, 1, 1], [A → • a, 1, 1]}
Chart: C_{0,0} = {S → • ASB, S → • c, A → • a}; C_{0,1} = {S → A • SB, A → a •}; C_{1,1} = {S → • ASB, S → • c, A → • a}
S → • c triggers a scan:

Tabulation (15)
Example continued.
A_0 = {}, A_1 = {[A → • a, 1, 1]}, A_2 = {[S → c •, 1, 2]}
Chart: as before, plus C_{1,2} = {S → c •}
A → • a triggers a scan that fails; then S → c • triggers a complete:

Tabulation (16)
Example continued.
A_0 = {}, A_1 = {}, A_2 = {[S → AS • B, 0, 2]}
Chart: as before, plus C_{0,2} = {S → AS • B}
S → AS • B triggers the prediction of [B → • b, 2, 2], which triggers a successful scan:

Tabulation (17)
Example continued.
A_0 = {}, A_1 = {}, A_2 = {}, A_3 = {[B → b •, 2, 3]}
Chart: as before, plus C_{2,2} = {B → • b} and C_{2,3} = {B → b •}
B → b • triggers a complete which leads to a goal item:

Tabulation (18)
Example continued.
A_0 = {}, A_1 = {}, A_2 = {}, A_3 = {}
Final chart: C_{0,0} = {S → • ASB, S → • c, A → • a}; C_{0,1} = {S → A • SB, A → a •}; C_{1,1} = {S → • ASB, S → • c, A → • a}; C_{0,2} = {S → AS • B}; C_{1,2} = {S → c •}; C_{2,2} = {B → • b}; C_{2,3} = {B → b •}; C_{0,3} = {S → ASB •} (goal item)

Tabulation (19)
Example with ε-productions. Productions S → ABB, A → a, B → ε; input w = a.
Chart: C_{0,0} = {S → • ABB, A → • a}; C_{0,1} = {A → a •, S → A • BB, S → AB • B, S → ABB •}; C_{1,1} = {B → •}

Tabulation (20)
Overview: TD/BU and mixed approaches:

                TD/BU    item form                    chart parsing
  Top-down      TD       [Γ, i]                       no
  Bottom-up     BU       [Γ, i]                       no
  CYK           BU       [A, i, l]                    yes
  Left corner   mixed    [Γ_compl, Γ_td, Γ_lhs]       no
                         [A → α • β, i, l]            yes
  Earley        mixed    [A → α • β, i, j]            yes

Parsing (1)
So far, we have a recognizer. One way to extend it to a parser is to read off the parse tree in a top-down way from the chart.
Alternatively, in every completion step, we can record in the chart how the new item was obtained, by adding pointers to its pair of antecedents. Then, for constructing the parse tree, we only need to follow the pointers.

Parsing (2)
First possibility (initial call parse-tree(S, 0, n)):

parse-tree(X, i, j):
    trees := ∅
    if X = w_j and j = i + 1 then trees := {w_j}
    else
        for all X → X_1 … X_r • ∈ C_{i,j},
        all i_1, …, i_r with i ≤ i_1 ≤ … ≤ i_{r-1} ≤ i_r = j,
        and all t_1, …, t_r with t_1 ∈ parse-tree(X_1, i, i_1) and t_l ∈ parse-tree(X_l, i_{l-1}, i_l) for 1 < l ≤ r:
            trees := trees ∪ {X(t_1, …, t_r)}
    output trees
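A runnable version of this read-off might look as follows. This is a Python sketch under my own encoding: C maps a span (i, j) to the set of productions (lhs, rhs) completed over it, i.e., to the completed items of the chart. For simplicity it assumes a grammar without ε-productions, so all split points can be taken strictly increasing:

```python
import itertools

def parse_tree(C, w, X, i, j):
    """All parse trees for symbol X over the span w[i:j], read off the
    completed chart items in C. Trees are nested tuples (X, t1, ..., tr);
    terminal leaves are the input symbols themselves."""
    if j == i + 1 and X == w[i]:                 # X is the terminal w_{i+1}
        return {w[i]}
    trees = set()
    for (lhs, rhs) in C.get((i, j), ()):
        if lhs != X:
            continue
        r = len(rhs)
        # choose split points i < i_1 < ... < i_{r-1} < j
        for interior in itertools.combinations(range(i + 1, j), r - 1):
            splits = (i,) + interior + (j,)
            child_sets = [parse_tree(C, w, rhs[l], splits[l], splits[l + 1])
                          for l in range(r)]
            # one tree per combination of subtrees for the children
            for combo in itertools.product(*child_sets):
                trees.add((X,) + combo)
    return trees
```

For a hypothetical mini-grammar S → AB, A → a, B → b and w = ab, the completed items are A → a • over (0, 1), B → b • over (1, 2), and S → AB • over (0, 2), and parse_tree(C, "ab", "S", 0, 2) returns the single tree (S, (A, a), (B, b)); an ambiguous grammar yields several trees.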

Parsing (3)
Second possibility: we equip items with an additional set of pairs of pointers to other items in the item set.
Whenever an item [A → αB • β, i, k] is obtained in a complete operation from [A → α • Bβ, i, j] and [B → γ •, j, k], we add a pair of pointers to the two antecedent items to the pointer set of the consequent item.
Obviously, items might have more than one element in their pointer set if the grammar is ambiguous.
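This pointer bookkeeping can be bolted onto a worklist recognizer like this (a Python sketch with my own item encoding, not from the slides; every application of complete records its pair of antecedents for the consequent item):

```python
def earley_with_backpointers(grammar, start, w):
    """Earley recognizer over items (lhs, rhs, dot, i, j) that records,
    for each item produced by complete, the set of antecedent pairs
    (active item, completed item). Returns (chart, pointers)."""
    n = len(w)
    chart, pointers = set(), {}
    agenda = [(start, rhs, 0, 0, 0) for rhs in grammar[start]]

    def complete(active, passive):
        # move the dot of `active` over the lhs of the finished `passive`,
        # remembering the antecedent pair for the consequent item
        l, r, d, i, _ = active
        new = (l, r, d + 1, i, passive[4])
        pointers.setdefault(new, set()).add((active, passive))
        agenda.append(new)

    while agenda:
        item = agenda.pop()
        if item in chart:
            continue
        chart.add(item)
        lhs, rhs, dot, i, j = item
        if dot < len(rhs):
            sym = rhs[dot]
            if sym in grammar:                       # predict
                for gamma in grammar[sym]:
                    agenda.append((sym, gamma, 0, j, j))
                for p in list(chart):                # already-finished sym-items
                    if p[0] == sym and p[2] == len(p[1]) and p[3] == j:
                        complete(item, p)
            elif j < n and w[j] == sym:              # scan
                agenda.append((lhs, rhs, dot + 1, i, j + 1))
        else:                                        # this item is completed
            for a in list(chart):
                if a[2] < len(a[1]) and a[1][a[2]] == lhs and a[4] == i:
                    complete(a, item)
    return chart, pointers
```

On the grammar of the example below (S → AB, A → Ac | a, B → cB | b, w = acb), the goal item [S → AB •, 0, 3] ends up with two pointer pairs, one per analysis.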

Parsing (4)
Example: backpointers. Productions S → AB, A → Ac | a, B → cB | b; input w = acb.
Item set (with list of pointer pairs):
1  [S → • AB, 0, 0]
2  [A → • Ac, 0, 0]
3  [A → • a, 0, 0]
4  [A → a •, 0, 1]
5  [A → A • c, 0, 1], {⟨2, 4⟩}
6  [S → A • B, 0, 1], {⟨1, 4⟩}
7  [B → • cB, 1, 1]
8  [B → • b, 1, 1]
9  [A → Ac •, 0, 2]
10 [B → c • B, 1, 2]
11 [A → A • c, 0, 2], {⟨2, 9⟩}
12 [S → A • B, 0, 2], {⟨1, 9⟩}
13 [B → • cB, 2, 2]
14 [B → • b, 2, 2]
15 [B → b •, 2, 3]
16 [B → cB •, 1, 3], {⟨10, 15⟩}
17 [S → AB •, 0, 3], {⟨6, 16⟩, ⟨12, 15⟩}

Lookaheads
Two kinds of lookaheads: a prediction lookahead and a reduction lookahead.

Predict with lookahead:
  [A → α • Bβ, i, j]
  ------------------  B → γ ∈ P, and w_{j+1} ∈ First(γ) or ε ∈ First(γ)
  [B → • γ, j, j]

Complete with lookahead:
  [A → α • Bβ, i, j]   [B → γ •, j, k]
  ------------------------------------  w_{k+1} ∈ First(β), or ε ∈ First(β) and w_{k+1} ∈ Follow(A)
  [A → αB • β, i, k]

Instead of precomputing the Follow sets, one can compute the actual follows while predicting.
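The First sets in these side conditions can be precomputed by a standard fixpoint iteration; a Python sketch (my own encoding, with ε represented as the empty string ""):

```python
def first_sets(grammar):
    """Compute First(A) for every non-terminal A by fixpoint iteration.
    grammar maps non-terminals to lists of rhs tuples; epsilon is ""."""
    first = {A: set() for A in grammar}

    def first_of_seq(seq):
        # First of a symbol sequence, given the current approximation
        out = set()
        for sym in seq:
            f = first[sym] if sym in grammar else {sym}
            out |= f - {""}
            if "" not in f:
                return out
        out.add("")          # every symbol in seq can derive epsilon
        return out

    changed = True
    while changed:
        changed = False
        for A, rhss in grammar.items():
            for rhs in rhss:
                new = first_of_seq(rhs)
                if not new <= first[A]:
                    first[A] |= new
                    changed = True
    return first
```

For the ε-example above (S → ABB, A → a, B → ε), this yields First(S) = {a}, First(A) = {a}, First(B) = {ε}, which is exactly what the prediction lookahead consults.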

Conclusion
- Earley is a top-down restricted bottom-up parser.
- The three operations to compute new partial parsing results are predict, scan and complete.
- Earley is a chart parser.
- Earley can parse all CFGs.


Solving systems of regular expression equations

Solving systems of regular expression equations 1 of 27 Consider the following DFSM 1 0 0 0 A 1 A 2 A 3 1 1 For each transition from state Q to state R under input a, we create a production of the form

Computational Linguistics

Computational Linguistics CSC 2501 / 485 Fall 2018 3 3. Chart parsing Gerald Penn Department of Computer Science, University of Toronto Reading: Jurafsky & Martin: 13.3 4. Allen: 3.4, 3.6. Bird et al:

The CKY Parsing Algorithm and PCFGs. COMP-599 Oct 12, 2016

The CKY Parsing Algorithm and PCFGs COMP-599 Oct 12, 2016 Outline CYK parsing PCFGs Probabilistic CYK parsing 2 CFGs and Constituent Trees Rules/productions: S VP this VP V V is rules jumps rocks Trees:

Fall Compiler Principles Lecture 2: LL parsing. Roman Manevich Ben-Gurion University of the Negev

Fall 2017-2018 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion University of the Negev 1 Books Compilers Principles, Techniques, and Tools Alfred V. Aho, Ravi Sethi, Jeffrey D. Ullman

VIVA QUESTIONS WITH ANSWERS

VIVA QUESTIONS WITH ANSWERS 1. What is a compiler? A compiler is a program that reads a program written in one language the source language and translates it into an equivalent program in another language-the

Context-Free Languages & Grammars (CFLs & CFGs) Reading: Chapter 5

Context-Free Languages & Grammars (CFLs & CFGs) Reading: Chapter 5 1 Not all languages are regular So what happens to the languages which are not regular? Can we still come up with a language recognizer?

CMSC 330: Organization of Programming Languages

CMSC 330: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Analyzer Optimizer Code Generator Abstract Syntax Tree Front End Back End Compiler

LR(0) Parsing Summary. LR(0) Parsing Table. LR(0) Limitations. A Non-LR(0) Grammar. LR(0) Parsing Table CS412/CS413

LR(0) Parsing ummary C412/C41 Introduction to Compilers Tim Teitelbaum Lecture 10: LR Parsing February 12, 2007 LR(0) item = a production with a dot in RH LR(0) state = set of LR(0) items valid for viable

LR Parsing LALR Parser Generators

LR Parsing LALR Parser Generators Outline Review of bottom-up parsing Computing the parsing DFA Using parser generators 2 Bottom-up Parsing (Review) A bottom-up parser rewrites the input string to the

LR Parsing, Part 2. Constructing Parse Tables. An NFA Recognizing Viable Prefixes. Computing the Closure. GOTO Function and DFA States

TDDD16 Compilers and Interpreters TDDB44 Compiler Construction LR Parsing, Part 2 Constructing Parse Tables Parse table construction Grammar conflict handling Categories of LR Grammars and Parsers An NFA

shift-reduce parsing

Parsing #2 Bottom-up Parsing Rightmost derivations; use of rules from right to left Uses a stack to push symbols the concatenation of the stack symbols with the rest of the input forms a valid bottom-up

Fall Compiler Principles Lecture 2: LL parsing. Roman Manevich Ben-Gurion University of the Negev

Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion University of the Negev 1 Books Compilers Principles, Techniques, and Tools Alfred V. Aho, Ravi Sethi, Jeffrey D. Ullman

Principles of Programming Languages

Principles of Programming Languages h"p://www.di.unipi.it/~andrea/dida2ca/plp- 14/ Prof. Andrea Corradini Department of Computer Science, Pisa Lesson 8! Bo;om- Up Parsing Shi?- Reduce LR(0) automata and

Natural Language Processing

Natural Language Processing Anoop Sarkar http://www.cs.sfu.ca/~anoop 9/18/18 1 Ambiguity An input is ambiguous with respect to a CFG if it can be derived with two different parse trees A parser needs a

Where We Are. CMSC 330: Organization of Programming Languages. This Lecture. Programming Languages. Motivation for Grammars

CMSC 330: Organization of Programming Languages Context Free Grammars Where We Are Programming languages Ruby OCaml Implementing programming languages Scanner Uses regular expressions Finite automata Parser

CSE443 Compilers. Dr. Carl Alphonce 343 Davis Hall

CSE443 Compilers Dr. Carl Alphonce alphonce@buffalo.edu 343 Davis Hall Phases of a compiler Syntactic structure Figure 1.6, page 5 of text Recap Lexical analysis: LEX/FLEX (regex -> lexer) Syntactic analysis:

Compila(on (Semester A, 2013/14)

Compila(on 0368-3133 (Semester A, 2013/14) Lecture 4: Syntax Analysis (Top- Down Parsing) Modern Compiler Design: Chapter 2.2 Noam Rinetzky Slides credit: Roman Manevich, Mooly Sagiv, Jeff Ullman, Eran

Topdown parsing with backtracking

Top down parsing Types of parsers: Top down: repeatedly rewrite the start symbol; find a left-most derivation of the input string; easy to implement; not all context-free grammars are suitable. Bottom

LR Parsing. Table Construction

#1 LR Parsing Table Construction #2 Outline Review of bottom-up parsing Computing the parsing DFA Closures, LR(1) Items, States Transitions Using parser generators Handling Conflicts #3 In One Slide An

Syntax Analysis, III Comp 412

Updated algorithm for removal of indirect left recursion to match EaC3e (3/2018) COMP 412 FALL 2018 Midterm Exam: Thursday October 18, 7PM Herzstein Amphitheater Syntax Analysis, III Comp 412 source code