Algorithms for NLP. Chart Parsing. Reading: James Allen, Natural Language Understanding. Section 3.4, pp


Algorithms for NLP: Chart Parsing. Reading: James Allen, Natural Language Understanding, Section 3.4, pp. 53-6

Chart Parsing

General Principles: a bottom-up parsing method.
- Construct a parse starting from the input symbols.
- Build constituents from sub-constituents.
- When all constituents on the RHS of a rule are matched, create a constituent for the LHS of the rule.
- The Chart allows storing partial analyses, so that they can be shared.

Data structures used by the algorithm:
- The Key: the current constituent we are attempting to match.
- An Active Arc: a grammar rule that has a partially matched RHS.
- The Agenda: keeps track of newly found, unprocessed constituents.
- The Chart: records processed constituents (non-terminals) that span substrings of the input.
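The four data structures can be sketched in Python. This is an illustrative encoding, not the slides' notation: keys are (label, start, end) tuples over word positions, and the class and variable names are invented for this sketch.

```python
from collections import deque

# A Key is a completed constituent spanning input positions [start, end):
# ("NP", 0, 3) is an NP covering words 0, 1 and 2.

class ActiveArc:
    """A grammar rule with a partially matched RHS:
    lhs -> rhs[:dot] * rhs[dot:], matched so far over [start, end)."""
    def __init__(self, lhs, rhs, dot, start, end):
        self.lhs, self.rhs, self.dot = lhs, tuple(rhs), dot
        self.start, self.end = start, end

    def completed(self):
        return self.dot == len(self.rhs)   # whole RHS matched

    def needs(self):
        return self.rhs[self.dot]          # next constituent to match

agenda = deque()    # newly found, still unprocessed constituents
chart = set()       # processed constituents spanning input substrings
active_arcs = []    # partially matched rules

agenda.append(("ART", 0, 1))                  # a POS of the first word
arc = ActiveArc("NP", ("ART", "N"), 1, 0, 1)  # NP -> ART * N over [0, 1)
active_arcs.append(arc)
```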

Chart Parsing

Steps in the Process (input is processed left-to-right, one word at a time):
1. Find all POS of the word (terminal level).
2. Initialize the Agenda with all POS of the word.
3. Pick a Key from the Agenda.
4. Add all grammar rules that start with the Key as active arcs.
5. Extend any existing active arcs with the Key.
6. Add LHS constituents of newly completed rules to the Agenda.
7. Add the Key to the Chart.
8. If the Agenda is not empty, go to (3); else go to (1).

The Chart Parsing Algorithm

Extending Active Arcs with a Key:
- Each active arc has the form X -> C1 ... Ci * C(i+1) ... Cn [p0, p]: the constituents C1 ... Ci have been matched over positions p0 to p, and the dot (*) marks how far the RHS has been matched.
- A Key constituent has the form C(i+1) [p, q].
- When processing the Key, we search the list of active arcs for arcs whose next needed constituent is C(i+1) and whose right edge is p, and for each such arc we create a new active arc X -> C1 ... C(i+1) * C(i+2) ... Cn [p0, q].
- If the new active arc is a completed rule (the dot has reached the end of the RHS), then we add X [p0, q] to the Agenda.
- After using the Key to extend all relevant arcs, it is entered into the Chart.
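The extension step reduces to a small function. In this sketch, arcs are plain tuples (lhs, rhs, dot, start, end) and keys are (label, start, end); this encoding is an assumption for illustration, not the slides' notation.

```python
def extend(arc, key):
    """Try to extend an active arc with a key constituent.

    Returns ("constituent", (lhs, start, end)) if the extension completes
    the rule (the result goes to the Agenda), ("arc", new_arc) if it yields
    a longer active arc, or None if the key does not fit."""
    lhs, rhs, dot, start, end = arc
    label, k_start, k_end = key
    # The key must be the next needed constituent, adjacent on the right.
    if k_start != end or rhs[dot] != label:
        return None
    if dot + 1 == len(rhs):
        return ("constituent", (lhs, start, k_end))
    return ("arc", (lhs, rhs, dot + 1, start, k_end))
```

For instance, extending the arc NP -> ART * N over [0, 1) with an N key over [1, 2) completes an NP over [0, 2).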

The Chart Parsing Algorithm

The Main Algorithm, for parsing input w1 ... wn:
1. Set the current position to 0.
2. If the Agenda is empty and input remains, find all POS of the next word and add them as constituents to the Agenda.
3. Pick a Key constituent C [p, q] from the Agenda.
4. For each grammar rule of the form X -> C beta, add the active arc X -> C * beta [p, q] to the list of active arcs.
5. Use the Key to extend all relevant active arcs.
6. Add the LHS of any completed active arcs to the Agenda.
7. Insert the Key into the Chart.
8. If the Key is S [0, n], then accept the input; else go to (2).
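The whole loop can be sketched as a recognizer. This is a simplified rendering, not the slides' exact algorithm: it seeds the Agenda with the POS of every word up front instead of reading one word at a time, and it only accepts or rejects (no back-pointers). The lexicon is the one from the running example; the grammar is an assumption chosen to match the rule numbers cited in the final chart.

```python
# Grammar assumed for the running example (rule order matches the cited
# rule numbers: 1. S -> NP VP ... 6. VP -> V NP).
GRAMMAR = [
    ("S",  ("NP", "VP")),
    ("NP", ("ART", "ADJ", "N")),
    ("NP", ("ART", "N")),
    ("NP", ("ADJ", "N")),
    ("VP", ("AUX", "VP")),
    ("VP", ("V", "NP")),
]
LEXICON = {"the": ["ART"], "large": ["ADJ"], "can": ["N", "AUX", "V"],
           "hold": ["N", "V"], "water": ["N", "V"]}

def chart_parse(words):
    chart, arcs, agenda = set(), [], []
    for i, w in enumerate(words):           # steps 1-2, done up front here
        for pos in LEXICON[w]:
            agenda.append((pos, i, i + 1))

    def add_arc(lhs, rhs, dot, start, end):
        if dot == len(rhs):                 # step 6: completed rule
            agenda.append((lhs, start, end))
            return
        arcs.append((lhs, rhs, dot, start, end))
        for lbl, s, e in list(chart):       # also check already-charted keys
            if s == end and lbl == rhs[dot]:
                add_arc(lhs, rhs, dot + 1, start, e)

    while agenda:
        key = agenda.pop(0)                 # step 3: pick a key
        if key in chart:
            continue
        label, start, end = key
        for lhs, rhs in GRAMMAR:            # step 4: rules starting with key
            if rhs[0] == label:
                add_arc(lhs, rhs, 1, start, end)
        for lhs, rhs, dot, s, e in list(arcs):  # step 5: extend active arcs
            if e == start and rhs[dot] == label:
                add_arc(lhs, rhs, dot + 1, s, end)
        chart.add(key)                      # step 7
    return ("S", 0, len(words)) in chart    # step 8: S spanning the input?
```

Under these assumptions, chart_parse("the large can can hold the water".split()) accepts the example sentence.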

Chart Parsing - Example

The Grammar:
1. S -> NP VP
2. NP -> ART ADJ N
3. NP -> ART N
4. NP -> ADJ N
5. VP -> AUX VP
6. VP -> V NP

Chart Parsing - Example

The input: "The large can can hold the water"

POS of the input words:
the: ART
large: ADJ
can: N, AUX, V
hold: N, V
water: N, V


Chart Parsing - Example

The input: "The large can can hold the water"

Chart: ART (the) from position 1 to 2, ADJ (large) from position 2 to 3.

Active arcs:
NP -> ART * ADJ N
NP -> ART * N
NP -> ADJ * N

Figure 3.9: the chart after seeing an ADJ in position 2.


Chart Parsing - Example

The input: "The large can can hold the water"

The final chart (Figure 3.5):
S1 (rule 1, with NP1 and VP2)
S2 (rule 1, with NP2 and VP2)
VP3 (rule 5, with AUX1 and VP2)
NP2 (rule 4)
VP2 (rule 5)
NP1 (rule 2)
VP1 (rule 6)
N1  N2  NP3 (rule 3)
V1  V2  V3  V4
ART1  ADJ1  AUX1  AUX2  N3  ART2  N4
1 the 2 large 3 can 4 can 5 hold 6 the 7 water

Time Complexity of Chart Parsing

- The algorithm iterates once for each word of the input, i.e. n iterations.
- How many Keys can be created in a single iteration i? Each Key in the Agenda has the form C [p, i], thus O(n) Keys.
- Each Key can extend active arcs of the form X -> alpha * C beta [p0, p], for p0 <= p; each Key can thus extend at most O(n) active arcs.
- The time required for each iteration is thus O(n^2).
- The time bound on the entire algorithm is therefore O(n^3).

Parsing with a Chart Parser

Keeping back-pointers:
- To build parse trees, we need to keep back-pointers to the constituents that we combine.
- Each constituent added to the Chart must have the form C [p, q] -> (C1, ..., Ck), where the Ci are pointers to the RHS sub-constituents.
- The pointers must thus also be recorded within the active arcs.
- At the end, a parse is reconstructed by following the back-pointers.

Parsing with a Chart Parser

Efficient Representation of Ambiguities:
- A Local Ambiguity: multiple ways to derive the same substring from a non-terminal.
- What do local ambiguities look like in chart parsing? Multiple items in the Chart of the form C [p, q], with the same C, p and q but different derivations.
- Local Ambiguity Packing: create a single item in the Chart for C [p, q], with pointers to the various possible derivations. C [p, q] can then serve as a sufficient back-pointer in the chart.
- This allows a very large number of ambiguities (even exponentially many) to be represented efficiently.
- Unpacking: producing one or more of the packed parse trees by following the back-pointers.
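A minimal sketch of local ambiguity packing in Python. The class and function names are invented for illustration, and the two NP derivations below are hypothetical; the point is that both derivations share one chart entry keyed by (label, start, end).

```python
class PackedItem:
    """One chart entry per (label, start, end), packing all derivations."""
    def __init__(self, label, start, end):
        self.label, self.start, self.end = label, start, end
        self.derivations = []  # each entry: tuple of child (label, start, end)

chart = {}

def add_constituent(label, start, end, children):
    """Record a derivation; a repeated span is packed, not duplicated."""
    item = chart.setdefault((label, start, end),
                            PackedItem(label, start, end))
    if children not in item.derivations:
        item.derivations.append(children)
    return item

# Two hypothetical ways to build an NP over [0, 3) share one chart entry:
add_constituent("NP", 0, 3, (("ART", 0, 1), ("ADJ", 1, 2), ("N", 2, 3)))
add_constituent("NP", 0, 3, (("ART", 0, 1), ("N", 1, 3)))
```

Unpacking then walks the derivations lists recursively to enumerate trees.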

Improved Chart Parser

Increasing the Efficiency of the Algorithm:
- Observation: the algorithm creates constituents that cannot fit into any global parse structure. These can be eliminated using top-down predictions.
- Using top-down predictions: for each non-terminal A, compute in advance First(A), the set of all leftmost non-terminals derivable from A.
- When processing a Key C [p, q], we look for grammar rules of the form X -> C beta that would form a new active arc. Add the active arc only if X is predicted: there already exists an active arc that needs a constituent Z such that X is in First(Z).
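The top-down filter itself reduces to a membership test. A sketch, assuming active arcs are (lhs, rhs, dot, start, end) tuples and first maps each symbol to its set of leftmost derivable non-terminals; both encodings are assumptions for illustration.

```python
def predicted(lhs, active_arcs, first):
    """True iff some active arc needs a constituent Z with lhs in First(Z),
    i.e. an arc for a rule lhs -> Key ... could fit into a global parse."""
    return any(lhs in first.get(rhs[dot], {rhs[dot]})
               for (_lhs, rhs, dot, _start, _end) in active_arcs)
```

For example, an arc S -> NP * VP predicts a VP (and anything in First(VP)) but not a fresh NP at that position.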

Improved Chart Parser

A simple procedure for computing First(A):

(1) For every non-terminal A:
        Cur-First[A] = { A }
(2) For every grammar rule A -> B ...:
        Cur-First[A] = Union(Cur-First[A], { B })
(3) Updated = T
    While Updated do
        Updated = F
        For every A do
            Old-First[A] = Cur-First[A]
            For each B in Old-First[A] do
                Cur-First[A] = Union(Cur-First[A], Cur-First[B])
            If Cur-First[A] != Old-First[A] then Updated = T
(4) For every A: First[A] = Cur-First[A]
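The same fixpoint procedure, rendered in Python. Grammar rules are (lhs, rhs) pairs; the function name is mine.

```python
def leftmost_first(grammar, nonterminals):
    """First[A]: all leftmost symbols derivable from non-terminal A."""
    cur = {a: {a} for a in nonterminals}      # step (1)
    for lhs, rhs in grammar:                  # step (2): leftmost RHS symbol
        cur[lhs].add(rhs[0])
    updated = True                            # step (3): iterate to a fixpoint
    while updated:
        updated = False
        for a in nonterminals:
            old = set(cur[a])
            for b in old:
                cur[a] |= cur.get(b, {b})     # terminals contribute themselves
            if cur[a] != old:
                updated = True
    return cur                                # step (4)
```

For instance, with S -> NP VP and NP -> ART N, the fixpoint puts ART into First[S] via First[NP].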
