
11-711 Algorithms for NLP
Chart Parsing
Reading: James Allen, Natural Language Understanding, Section 3.4, pp. 53-61

Chart Parsing

General Principles:
- A bottom-up parsing method
- Construct a parse starting from the input symbols
- Build constituents from sub-constituents
- When all constituents on the RHS of a rule are matched, create a constituent for the LHS of the rule
- The Chart allows storing partial analyses, so that they can be shared

Data structures used by the algorithm:
- The Key: the current constituent we are attempting to match
- An Active Arc: a grammar rule that has a partially matched RHS
- The Agenda: keeps track of newly found unprocessed constituents
- The Chart: records processed constituents (non-terminals) that span substrings of the input
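The four data structures can be made concrete with a minimal Python sketch, assuming constituents are identified by a category plus a span of word positions (the tuple layouts and names are illustrative choices, not from the slides):

```python
key = ("N", 2, 3)        # the Key: a constituent N spanning positions 2-3

active_arc = ("NP",      # LHS of the rule being matched
              ("ART",),  # RHS symbols matched so far
              ("N",),    # RHS symbols still to be matched
              1, 2)      # span covered by the matched part

agenda = [key]           # newly found, not-yet-processed constituents
chart = set()            # processed constituents spanning input substrings
```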

Chart Parsing

Steps in the Process: the input is processed left-to-right, one word at a time
1. Find all POS of the word (terminal level)
2. Initialize the Agenda with all POS of the word
3. Pick a Key from the Agenda
4. Add all grammar rules that start with the Key as active arcs
5. Extend any existing active arcs with the Key
6. Add LHS constituents of newly completed rules to the Agenda
7. Add the Key to the Chart
8. If the Agenda is not empty, goto (3); else goto (1)

The Chart Parsing Algorithm

Extending Active Arcs with a Key:
- Each active arc has the form <X -> C1 ... Ci . Ci+1 ... Cn, [p0, p1]>: the constituents before the dot have been matched, spanning positions p0 to p1
- A Key constituent has the form <C, [p1, p2]>
- When processing the Key <C, [p1, p2]>, we search the active-arc list for an arc <X -> ... . C ..., [p0, p1]>, and then create a new active arc <X -> ... C . ..., [p0, p2]>
- If the new active arc is a completed rule <X -> C1 ... Cn ., [p0, p2]>, then we add <X, [p0, p2]> to the Agenda
- After using the Key to extend all relevant arcs, it is entered into the Chart
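The extension step can be sketched as a small function, assuming arcs are tuples (lhs, matched, remaining, start, end) and keys are tuples (cat, start, end); the representation is an assumption for illustration, not from the slides:

```python
def extend_arc(arc, key):
    """Extend <X -> a . C b, [p0, p1]> with key <C, [p1, p2]>, or return None."""
    lhs, matched, remaining, a_start, a_end = arc
    cat, k_start, k_end = key
    # The arc must be waiting for the key's category, and the key must start
    # exactly where the arc's matched portion ends.
    if remaining and remaining[0] == cat and a_end == k_start:
        return (lhs, matched + (cat,), remaining[1:], a_start, k_end)
    return None

# Example: NP -> ART . N spanning [1, 2], extended by N spanning [2, 3]:
arc = ("NP", ("ART",), ("N",), 1, 2)
key = ("N", 2, 3)
new_arc = extend_arc(arc, key)
# new_arc has an empty "remaining" part, i.e. a completed rule, so
# <NP, [1, 3]> would be added to the Agenda.
```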

The Chart Parsing Algorithm

The Main Algorithm: parsing input w1 ... wn
1. i <- 0
2. If the Agenda is empty and i < n, then set i <- i+1, find all POS of wi, and add them as constituents <POS, [i, i+1]> to the Agenda
3. Pick a Key constituent <C, [p1, p2]> from the Agenda
4. For each grammar rule of the form X -> C ..., add <X -> C . ..., [p1, p2]> to the list of active arcs
5. Use the Key to extend all relevant active arcs
6. Add the LHS of any completed active arcs to the Agenda
7. Insert the Key into the Chart
8. If the Key is <S, [1, n+1]>, then Accept the input; else goto (2)
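The main loop can be sketched end-to-end in Python, using the same category-plus-span view of constituents (here word i spans positions [i, i+1) with 0-based indexing); the rule and lexicon formats are illustrative assumptions, and the grammar below is the one reconstructed from the worked example that follows:

```python
def chart_parse(words, lexicon, rules):
    n = len(words)
    agenda, chart, arcs = [], set(), []
    i = 0
    while True:
        if not agenda:                                # step 2: refill from input
            if i >= n:
                break
            for pos in lexicon[words[i]]:
                agenda.append((pos, i, i + 1))
            i += 1
            continue
        key = agenda.pop()                            # step 3: pick a key
        if key in chart:
            continue
        cat, p1, p2 = key
        for lhs, rhs in rules:                        # step 4: new active arcs
            if rhs[0] == cat:
                if len(rhs) == 1:
                    agenda.append((lhs, p1, p2))      # rule completed at once
                else:
                    arcs.append((lhs, rhs[1:], p1, p2))
        extended = []
        for lhs, remaining, a1, a2 in arcs:           # step 5: extend arcs
            if remaining[0] == cat and a2 == p1:
                if len(remaining) == 1:
                    agenda.append((lhs, a1, p2))      # step 6: completed rule
                else:
                    extended.append((lhs, remaining[1:], a1, p2))
        arcs.extend(extended)
        chart.add(key)                                # step 7
    return chart                                      # accepted iff ("S", 0, n) in chart

rules = [("S", ("NP", "VP")), ("NP", ("ART", "ADJ", "N")),
         ("NP", ("ART", "N")), ("NP", ("ADJ", "N")),
         ("VP", ("AUX", "VP")), ("VP", ("V", "NP"))]
lexicon = {"the": ["ART"], "large": ["ADJ"],
           "can": ["N", "AUX", "V"], "hold": ["N", "V"], "water": ["N", "V"]}

chart = chart_parse("the large can can hold the water".split(), lexicon, rules)
# ("S", 0, 7) in chart -> the input is accepted
```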

Chart Parsing - Example

The Grammar:
1. S -> NP VP
2. NP -> ART ADJ N
3. NP -> ART N
4. NP -> ADJ N
5. VP -> AUX VP
6. VP -> V NP

Chart Parsing - Example

The input: "The large can can hold the water"

POS of the input words:
- the: ART
- large: ADJ
- can: N, AUX, V
- hold: N, V
- water: N, V


Chart Parsing - Example

The input: "The large can can hold the water"

[Figure 3.9: the chart after seeing an ADJ in position 2. The chart contains the constituents ART1 and ADJ1 over positions 1-3, and the active arcs NP -> ART . ADJ N, NP -> ART . N, and NP -> ADJ . N]


Chart Parsing - Example

The input: "The large can can hold the water"

[Figure 3.15: the final chart.
Sentence and phrase constituents: S1 (rule 1 with NP1 and VP2), S2 (rule 1 with NP2 and VP2), VP3 (rule 5 with AUX1 and VP2), NP2 (rule 4), VP2 (rule 5), NP1 (rule 2), VP1 (rule 6), NP3 (rule 3).
Lexical constituents: N1, N2, V1, V2, V3, V4, ART1, ADJ1, AUX1, AUX2, N3, ART2, N4.
Input positions: 1 the, 2 large, 3 can, 4 can, 5 hold, 6 the, 7 water]

Time Complexity of Chart Parsing

- The algorithm iterates once for each word of the input (i.e. n iterations)
- How many keys can be created in a single iteration? Each key in the Agenda has the form <C, [p1, p2]> with 1 <= p1 <= p2 — thus O(n) keys
- Each key <C, [p1, p2]> can extend active arcs of the form <X -> ... . C ..., [p0, p1]>, for 1 <= p0 <= p1
- Each key can thus extend at most O(n) active arcs
- The time required for each iteration is thus O(n^2)
- The time bound on the entire algorithm is therefore O(n^3)

Parsing with a Chart Parser

- We need to keep back-pointers to the constituents that we combine
- Each constituent added to the chart must have the form <C -> C1 ... Cn, [p1, p2]>, where the Ci are pointers to the RHS sub-constituents
- Pointers must thus also be recorded within the active arcs
- At the end, reconstruct a parse from the back-pointers
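Reconstructing a parse from back-pointers can be illustrated with a hand-built chart fragment; the entries, the dictionary layout, and the tuple-based tree format are purely illustrative assumptions:

```python
# entry: (cat, start, end) -> (rule or word, list of back-pointers)
chart = {
    ("ART", 0, 1): ("the", []),
    ("N",   1, 2): ("can", []),
    ("NP",  0, 2): ("NP -> ART N", [("ART", 0, 1), ("N", 1, 2)]),
}

def tree(entry):
    """Follow back-pointers recursively to build one parse tree."""
    label, subs = chart[entry]
    if not subs:                       # lexical entry: (category, word)
        return (entry[0], label)
    return (entry[0],) + tuple(tree(s) for s in subs)

print(tree(("NP", 0, 2)))
# ('NP', ('ART', 'the'), ('N', 'can'))
```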

Parsing with a Chart Parser

Efficient Representation of Ambiguities:
- Local Ambiguity: multiple ways to derive the same substring from a non-terminal
- What do local ambiguities look like with Chart Parsing? Multiple items in the chart of the form <C -> C1 ... Cn, [p1, p2]>, with the same C, p1 and p2
- Local Ambiguity Packing: create a single item in the Chart for <C, [p1, p2]>, with pointers to the various possible derivations
- <C, [p1, p2]> can then serve as a single back-pointer in the chart
- This makes it possible to represent a very large number of ambiguities efficiently (even exponentially many)
- Unpacking: producing one or more of the packed parse trees by following the back-pointers
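Packing can be sketched as one chart slot per (category, span), where the slot holds a list of alternative derivations; both derivations below are hypothetical, chosen only to illustrate the shape of a packed item:

```python
from collections import defaultdict

# One chart slot per (category, start, end); each slot holds a list of
# alternative derivations, each derivation a list of back-pointers.
packed = defaultdict(list)

def add_derivation(cat, start, end, subconstituents):
    packed[(cat, start, end)].append(subconstituents)

# Two hypothetical ways to derive the same NP over positions [0, 3):
add_derivation("NP", 0, 3, [("ART", 0, 1), ("ADJ", 1, 2), ("N", 2, 3)])
add_derivation("NP", 0, 3, [("ART", 0, 1), ("N", 1, 3)])

# ("NP", 0, 3) is now a single packed item holding 2 derivations, and can
# serve as one back-pointer for any larger constituent that contains it.
print(len(packed[("NP", 0, 3)]))  # 2
```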

Improved Chart Parser

Increasing the Efficiency of the Algorithm:
- Observation: the algorithm creates constituents that cannot fit into a global parse structure; these can be eliminated using predictions

Using Top-Down Predictions:
- For each non-terminal C, define First(C): the set of all leftmost non-terminals derivable from C
- Compute First in advance for all non-terminals
- When processing the key <C, [p1, p2]>, we look for grammar rules X -> C ... that form an active arc <X -> C . ..., [p1, p2]>
- Add the active arc only if X is predicted: there already exists an active arc <Y -> ... . Z ..., [p0, p1]> such that X is in First(Z)
- For p1 = 1 (arcs of the form <X -> C . ..., [1, p2]>), add the active arc only if X is in First(S)

Improved Chart Parser

Simple Procedure for Computing First:

(1) For every A: Cur-First[A] = { A }
(2) For every grammar rule A --> B ...: Cur-First[A] = Union(Cur-First[A], { B })
(3) Updated = T
    While Updated do
      Updated = F
      For every A do
        Old-First[A] = Cur-First[A]
        For each B in Old-First[A] do
          Cur-First[A] = Union(Cur-First[A], Cur-First[B])
        If Cur-First[A] != Old-First[A] then Updated = T
(4) For every A: First[A] = Cur-First[A]
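The procedure translates directly to Python; in this sketch, "every A" is taken to range over all categories appearing in the rules, including preterminals such as ART (an assumption about the pseudocode's intent), and the grammar is the one from the worked example:

```python
def compute_first(rules):
    """Fixpoint computation of First[A], following the slide's procedure."""
    cats = {lhs for lhs, _ in rules} | {s for _, rhs in rules for s in rhs}
    first = {a: {a} for a in cats}                 # step (1)
    for lhs, rhs in rules:                         # step (2): leftmost RHS symbol
        first[lhs].add(rhs[0])
    updated = True                                 # step (3): iterate to fixpoint
    while updated:
        updated = False
        for a in cats:
            old = set(first[a])
            for b in old:
                first[a] |= first[b]
            if first[a] != old:
                updated = True
    return first                                   # step (4)

rules = [("S", ("NP", "VP")), ("NP", ("ART", "ADJ", "N")),
         ("NP", ("ART", "N")), ("NP", ("ADJ", "N")),
         ("VP", ("AUX", "VP")), ("VP", ("V", "NP"))]
first = compute_first(rules)
# first["S"] == {"S", "NP", "ART", "ADJ"}
```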
