Transition-Based Dependency Parsing with MaltParser

Joakim Nivre
Uppsala University and Växjö University

Introduction: Outline
- Goals of the workshop
- Transition-based dependency parsing ...
  - Transition systems
  - Scoring functions
  - Search algorithms
- ... with MaltParser
  - Parsing algorithm = transition system + search algorithm
  - Guide = scoring function

Introduction: Goals of the Workshop
- Background:
  - OSDT meeting in Copenhagen
  - Subgroup interested in using MaltParser as a research platform
- Goals:
  - Enable participants to use MaltParser
  - Enable participants to modify MaltParser
  - Establish desiderata for future versions of MaltParser
- Expectations from participants?

Introduction: Program
- Thursday morning:
  - Introduction: Transition-based parsing with MaltParser (Nivre)
  - MaltParser: Architecture, components and interfaces (Hall)
- Thursday afternoon:
  - Using MaltParser with built-in options (Nivre)
  - Extending MaltParser with plugins (Hall)
- Friday morning:
  - Building applications with MaltParser (Hall)
  - Challenges in using parsers at Google (Ringgaard)
- Friday afternoon:
  - Free for discussions, planning, etc.

Transition-Based Dependency Parsing: Dependency Parsing
Task definition: Map a sentence $x = (w_1, \ldots, w_n)$ to a dependency graph $G = (V, A)$, where
1. $V = \{0, 1, \ldots, n\}$ is a set of nodes (one for each $w_i$, plus the root 0),
2. $A \subseteq V \times L \times V$ is a set of labeled arcs (over label set $L$).
We normally require $G$ to be a directed tree rooted at 0.
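
To make the task definition concrete, here is a minimal Python sketch of a dependency graph as a set of (head, label, dependent) triples, with a check for the "directed tree rooted at 0" requirement. The function name, the triple layout and the example labels are assumptions made for illustration; they do not come from the slides.

```python
from typing import Dict, Set, Tuple

Arc = Tuple[int, str, int]  # (head, label, dependent); layout assumed for this sketch


def is_tree_rooted_at_zero(n: int, arcs: Set[Arc]) -> bool:
    """Return True if the arcs over nodes 0..n form a directed tree rooted at 0."""
    heads: Dict[int, int] = {}
    for head, _label, dep in arcs:
        if not (0 <= head <= n and 1 <= dep <= n):
            return False  # node out of range, or an arc pointing into the root
        if dep in heads:
            return False  # every node may have at most one head
        heads[dep] = head
    if len(heads) != n:
        return False      # some word has no head at all
    # Every word must reach the root by following head links (no cycles).
    for node in range(1, n + 1):
        seen = set()
        while node != 0:
            if node in seen:
                return False
            seen.add(node)
            node = heads[node]
    return True


# Hypothetical analysis of a three-word sentence such as "She reads books".
example = {(0, "root", 2), (2, "subj", 1), (2, "obj", 3)}
tree_ok = is_tree_rooted_at_zero(3, example)
```

Dropping any arc from `example`, or adding a second head for a word, makes the check fail, which is the tree constraint stated above.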

Transition-Based Dependency Parsing: Transition-Based Dependency Parsing
- A transition system $S = (C, T, c_s, C_t)$, where
  1. $C$ is a set of configurations, each of which contains a buffer $\beta$ of (remaining) nodes and a set $A$ of dependency arcs,
  2. $T$ is a set of transitions, each of which is a (partial) function $t : C \to C$,
  3. $c_s$ is an initialization function, mapping a sentence $x = (w_1, \ldots, w_n)$ to a configuration with $\beta = [1, \ldots, n]$,
  4. $C_t \subseteq C$ is a set of terminal configurations.
- A scoring function $\lambda : C \times T \to \mathbb{R}$, which assigns a real-valued score $\lambda(c, t)$ to each transition $t$ out of a configuration $c$.
- A search algorithm $h(S, \lambda, x)$ for finding the optimal transition sequence $C_{0,m} = (c_0, \ldots, c_m)$, with $c_0 = c_s(x)$ and $c_m \in C_t$, for sentence $x$ in system $S$ relative to the scoring function $\lambda$.

Transition-Based Dependency Parsing: Example: Transition System
Arc-standard shift-reduce parsing:
- $C = \{(\sigma, \beta, A) \mid \sigma \text{ is a stack}, \beta \text{ is a buffer}, A \text{ is an arc set}\}$
- $T = \{\text{Shift}, \text{LeftArc}_l, \text{RightArc}_l\}$, where
  1. $\text{Shift}: (\sigma, i|\beta, A) \Rightarrow (\sigma|i, \beta, A)$
  2. $\text{LeftArc}_l: (\sigma|i, j|\beta, A) \Rightarrow (\sigma, j|\beta, A \cup \{(j, l, i)\})$
  3. $\text{RightArc}_l: (\sigma|i, j|\beta, A) \Rightarrow (\sigma, i|\beta, A \cup \{(i, l, j)\})$
- $c_s(x = w_1, \ldots, w_n) = ([0], [1, \ldots, n], \emptyset)$
- $C_t = \{(\sigma, \beta, A) \in C \mid \beta = [\,]\}$
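
The three transitions can be written almost literally as functions on configurations. The following Python sketch does so under stated assumptions: the `Config` class, the function names and the example labels are invented for illustration and are not MaltParser code.

```python
from typing import FrozenSet, NamedTuple, Tuple

Arc = Tuple[int, str, int]  # (head, label, dependent)


class Config(NamedTuple):
    stack: Tuple[int, ...]   # sigma; the top of the stack is the last element
    buffer: Tuple[int, ...]  # beta; the next input node is the first element
    arcs: FrozenSet[Arc]     # A


def initial(n: int) -> Config:
    """c_s(x): root on the stack, all words in the buffer, no arcs."""
    return Config((0,), tuple(range(1, n + 1)), frozenset())


def is_terminal(c: Config) -> bool:
    """c is in C_t when the buffer is empty."""
    return not c.buffer


def shift(c: Config) -> Config:
    """(sigma, i|beta, A) => (sigma|i, beta, A)"""
    return Config(c.stack + (c.buffer[0],), c.buffer[1:], c.arcs)


def left_arc(c: Config, label: str) -> Config:
    """(sigma|i, j|beta, A) => (sigma, j|beta, A + {(j, l, i)})"""
    i, j = c.stack[-1], c.buffer[0]
    return Config(c.stack[:-1], c.buffer, c.arcs | {(j, label, i)})


def right_arc(c: Config, label: str) -> Config:
    """(sigma|i, j|beta, A) => (sigma, i|beta, A + {(i, l, j)})"""
    i, j = c.stack[-1], c.buffer[0]
    return Config(c.stack[:-1], (i,) + c.buffer[1:], c.arcs | {(i, label, j)})


# Hypothetical derivation for a three-word sentence such as "She reads books".
c = initial(3)
c = shift(c)
c = left_arc(c, "subj")   # 2 -> 1
c = shift(c)
c = right_arc(c, "obj")   # 2 -> 3, node 2 returns to the buffer
c = right_arc(c, "root")  # 0 -> 2, node 0 returns to the buffer
c = shift(c)              # buffer is now empty: terminal configuration
```

Note that RightArc puts the head back on the buffer, so a final Shift is needed before the buffer is empty and the configuration is terminal.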

Transition-Based Dependency Parsing: Example: Scoring Function
Feature-based classification: $\lambda(c, t) = g(\Phi(c, t))$, where
1. $\Phi : C \times T \to \mathbb{R}^k$ is a feature model, which maps each pair $(c, t)$ to a $k$-dimensional feature vector $\Phi(c, t)$,
2. $g : \mathbb{R}^k \to \mathbb{R}$ is a (generalized) linear classifier, which maps a feature vector $\Phi(c, t)$ to a score in the interval $[-1, 1]$.
Classifier training:
- Training instances $(c, t)$ derived from treebank data.
- Supervised learning using support vector machines with kernels.
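
The composition $\lambda(c, t) = g(\Phi(c, t))$ can be sketched in a few lines of Python. Everything specific here is an assumption for illustration: the three feature templates (part-of-speech tag of the stack top, of the next buffer node, and their pair), the hand-set weights, and the use of `tanh` to squash a linear score into $[-1, 1]$. The slides train the classifier with kernel SVMs, which this sketch does not attempt.

```python
import math
from typing import Dict, List, Sequence, Tuple

Feature = Tuple[str, str, str]  # (template, value, transition)


def phi(stack: Sequence[int], buffer: Sequence[int], tags: Sequence[str],
        transition: str, index: Dict[Feature, int]) -> List[float]:
    """Feature model: a k-dimensional indicator vector, with k = len(index)."""
    top = tags[stack[-1]] if stack else "NONE"
    nxt = tags[buffer[0]] if buffer else "NONE"
    active = [("s0", top, transition),
              ("b0", nxt, transition),
              ("s0b0", top + "+" + nxt, transition)]
    vec = [0.0] * len(index)
    for feature in active:
        if feature in index:
            vec[index[feature]] = 1.0
    return vec


def g(weights: Sequence[float], vec: Sequence[float]) -> float:
    """Linear score squashed into [-1, 1]; tanh is an assumption of this sketch."""
    return math.tanh(sum(w * v for w, v in zip(weights, vec)))


def score(stack, buffer, tags, transition, index, weights) -> float:
    """lambda(c, t) = g(Phi(c, t))"""
    return g(weights, phi(stack, buffer, tags, transition, index))


# Toy setup: two known features with hand-set (not learned) weights.
index = {("s0", "PRON", "LeftArc"): 0, ("b0", "VERB", "Shift"): 1}
weights = [2.0, -1.0]
tags = ["ROOT", "PRON", "VERB"]  # tags[i] is the tag of node i
left_arc_score = score((0, 1), (2,), tags, "LeftArc", index, weights)
shift_score = score((0, 1), (2,), tags, "Shift", index, weights)
```

With these toy weights the LeftArc transition outscores Shift in the example configuration, so a greedy parser would attach the pronoun to the verb.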

Transition-Based Dependency Parsing: Example: Search Algorithm
Greedy, deterministic search:

  $h(S, \lambda, x)$
    $c \leftarrow c_s(x)$
    while $c \notin C_t$
      $t \leftarrow \arg\max_{t'} \lambda(c, t')$
      $c \leftarrow t(c)$
    return $G = (\{0, 1, \ldots, n\}, A_c)$
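
The greedy loop can be sketched as runnable Python over the arc-standard system. Two things are assumptions of this sketch rather than content of the slides: arcs into the root node 0 are excluded when listing permissible transitions, and the scoring function is a hand-written stand-in that rewards transitions consistent with a known gold tree, in place of a trained classifier. All names are hypothetical.

```python
from typing import Callable, FrozenSet, List, Sequence, Set, Tuple

Arc = Tuple[int, str, int]                                        # (head, label, dependent)
Config = Tuple[Tuple[int, ...], Tuple[int, ...], FrozenSet[Arc]]  # (stack, buffer, arcs)
Transition = Tuple[str, str]                                      # (name, label)


def permissible(c: Config, labels: Sequence[str]) -> List[Transition]:
    """Transitions defined in c; LeftArc with the root on top is excluded (assumption)."""
    stack, buffer, _ = c
    out: List[Transition] = []
    if buffer:
        out.append(("Shift", ""))
        if stack:
            out += [("RightArc", l) for l in labels]
            if stack[-1] != 0:
                out += [("LeftArc", l) for l in labels]
    return out


def apply(c: Config, t: Transition) -> Config:
    stack, buffer, arcs = c
    name, label = t
    if name == "Shift":
        return stack + (buffer[0],), buffer[1:], arcs
    i, j = stack[-1], buffer[0]
    if name == "LeftArc":
        return stack[:-1], buffer, arcs | {(j, label, i)}
    return stack[:-1], (i,) + buffer[1:], arcs | {(i, label, j)}


def greedy_parse(n: int, labels: Sequence[str],
                 score: Callable[[Config, Transition], float]) -> Tuple[Set[int], Set[Arc]]:
    c: Config = ((0,), tuple(range(1, n + 1)), frozenset())  # c <- c_s(x)
    while c[1]:                                              # while c not in C_t
        t = max(permissible(c, labels), key=lambda cand: score(c, cand))
        c = apply(c, t)                                      # c <- t(c)
    return set(range(n + 1)), set(c[2])                      # G = (V, A_c)


def gold_score(gold: Set[Arc]) -> Callable[[Config, Transition], float]:
    """Stand-in for lambda: +1 for arcs licensed by the gold tree, 0 for Shift."""
    def score(c: Config, t: Transition) -> float:
        stack, buffer, arcs = c
        name, label = t
        if name == "Shift":
            return 0.0
        i, j = stack[-1], buffer[0]
        if name == "LeftArc":
            return 1.0 if (j, label, i) in gold else -1.0
        # RightArc removes j, so wait until j has collected all its dependents.
        done = all(a in arcs for a in gold if a[0] == j)
        return 1.0 if (i, label, j) in gold and done else -1.0
    return score


gold = {(0, "root", 2), (2, "subj", 1), (2, "obj", 3)}
nodes, arcs = greedy_parse(3, ["root", "subj", "obj"], gold_score(gold))
```

Each transition either shortens the buffer or shortens the stack without lengthening the buffer, so the loop always terminates; the parser never revisits a decision, which is what makes the search greedy and deterministic.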

Transition-Based Dependency Parsing: Variations on Transition-Based Parsing
- Alternative transition systems:
  - Arc-eager shift-reduce parsing [Nivre 2003]
  - Arc-standard shift-reduce parsing [Yamada and Matsumoto 2003]
  - Restricted non-projective parsing [Attardi 2006]
  - Unrestricted non-projective parsing [Covington 2001, Nivre 2007]
- Alternative scoring functions:
  - Support vector machines [Kudo and Matsumoto 2002, Yamada and Matsumoto 2003, Isozaki et al. 2004, Cheng et al. 2004, Sagae and Lavie 2006]
  - Memory-based learning [Attardi 2006]
  - Maximum entropy [Cheng et al. 2005, Attardi 2006]
  - Perceptron learning [Ciaramita and Attardi 2007]
- Alternative search algorithms:
  - Greedy single-pass [Nivre et al. 2004]
  - Greedy iterative [Yamada and Matsumoto 2003]
  - Beam search [Johansson and Nugues 2006, Titov and Henderson 2007]

MaltParser: MaltParser as a Framework
- MaltParser: Framework for transition-based dependency parsing
- Orthogonal components:
  - Transition system
  - Scoring function
  - Search algorithm
- Designed for maximum flexibility:
  - Components can be varied independently.
  - Any combination of components should work (in principle).

MaltParser: Theory and Implementation 1
- Transition systems and search algorithms:
  - In MaltParser, a transition system is (currently) merged with a particular search algorithm into a parsing algorithm.
  - As a result, transition systems and search algorithms cannot be varied independently.
- Parsing algorithms:
  - Several parsing algorithms are built into the system.
  - New parsing algorithms can be added as plugins.

MaltParser: Theory and Implementation 2
- Scoring functions:
  - In MaltParser, a scoring function is currently split into a feature model and a learner.
  - As a result, feature models and learners can be varied independently.
- Feature models:
  - Feature models are defined using a specification language over built-in feature functions.
  - New feature functions can be added as plugins.
- Learners:
  - Learners are interfaces to machine learning packages.
  - New learners can be added as plugins.
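
The split of the scoring function into a feature model and a learner can be pictured as two small interfaces composed by a guide. The Python sketch below only illustrates that design idea. It is not MaltParser's actual (Java) API, and every class and method name in it is hypothetical.

```python
from typing import List, Protocol, Sequence, Tuple

Config = Tuple[Tuple[int, ...], Tuple[int, ...], frozenset]  # (stack, buffer, arcs)


class FeatureModel(Protocol):
    def extract(self, config: Config, transition: str) -> List[float]: ...


class Learner(Protocol):
    def predict(self, features: Sequence[float]) -> float: ...


class Guide:
    """Scoring function assembled from an independently chosen feature model and learner."""

    def __init__(self, feature_model: FeatureModel, learner: Learner) -> None:
        self.feature_model = feature_model
        self.learner = learner

    def score(self, config: Config, transition: str) -> float:
        return self.learner.predict(self.feature_model.extract(config, transition))


class BufferLengthFeatures:
    """Toy feature model: buffer length plus an indicator for Shift."""

    def extract(self, config: Config, transition: str) -> List[float]:
        return [float(len(config[1])), 1.0 if transition == "Shift" else 0.0]


class LinearLearner:
    """Toy learner: a fixed dot product, standing in for an external ML package."""

    def __init__(self, weights: Sequence[float]) -> None:
        self.weights = list(weights)

    def predict(self, features: Sequence[float]) -> float:
        return sum(w * f for w, f in zip(self.weights, features))


guide = Guide(BufferLengthFeatures(), LinearLearner([0.5, 2.0]))
config: Config = ((0,), (1, 2), frozenset())
shift_score = guide.score(config, "Shift")
left_arc_score = guide.score(config, "LeftArc")
```

Because `Guide` depends only on the two interfaces, either component can be swapped without touching the other, which is the independence property described above.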

References

Giuseppe Attardi. 2006. Experiments with a multilanguage non-projective dependency parser. In Proceedings of the 10th Conference on Computational Natural Language Learning (CoNLL), pages 166–170.

Yuchang Cheng, Masayuki Asahara, and Yuji Matsumoto. 2004. Deterministic dependency structure analyzer for Chinese. In Proceedings of the First International Joint Conference on Natural Language Processing (IJCNLP), pages 500–508.

Yuchang Cheng, Masayuki Asahara, and Yuji Matsumoto. 2005. Machine learning-based dependency analyzer for Chinese. In Proceedings of International Conference on Chinese Computing (ICCC), pages 66–73.

Massimiliano Ciaramita and Giuseppe Attardi. 2007. Dependency parsing with second-order feature maps and annotated semantic information. In Proceedings of the Tenth International Conference on Parsing Technologies, pages 133–143, June.

Michael A. Covington. 2001. A fundamental algorithm for dependency parsing. In Proceedings of the 39th Annual ACM Southeast Conference, pages 95–102.

Hideki Isozaki, Hideto Kazawa, and Tsutomu Hirao. 2004. A deterministic word dependency analyzer enhanced with preference learning. In Proceedings of the 20th International Conference on Computational Linguistics (COLING), pages 275–281.

Richard Johansson and Pierre Nugues. 2006. Investigating multilingual dependency parsing. In Proceedings of the 10th Conference on Computational Natural Language Learning (CoNLL), pages 206–210.

Taku Kudo and Yuji Matsumoto. 2002. Japanese dependency analysis using cascaded chunking. In Proceedings of the Sixth Workshop on Computational Language Learning (CoNLL), pages 63–69.

Joakim Nivre, Johan Hall, and Jens Nilsson. 2004. Memory-based dependency parsing. In Proceedings of the 8th Conference on Computational Natural Language Learning, pages 49–56.

Joakim Nivre. 2003. An efficient algorithm for projective dependency parsing. In Proceedings of the 8th International Workshop on Parsing Technologies (IWPT), pages 149–160.

Joakim Nivre. 2007. Incremental non-projective dependency parsing. In Proceedings of Human Language Technologies: The Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL HLT), pages 396–403.

Kenji Sagae and Alon Lavie. 2006. Parser combination by reparsing. In Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Short Papers, pages 129–132.

Ivan Titov and James Henderson. 2007. A latent variable model for generative dependency parsing. In Proceedings of the 10th International Conference on Parsing Technologies (IWPT), pages 144–155.

Hiroyasu Yamada and Yuji Matsumoto. 2003. Statistical dependency analysis with support vector machines. In Proceedings of the 8th International Workshop on Parsing Technologies (IWPT), pages 195–206.