Projective Dependency Parsing with Perceptron


Projective Dependency Parsing with Perceptron Xavier Carreras, Mihai Surdeanu, and Lluís Màrquez Technical University of Catalonia {carreras,surdeanu,lluism}@lsi.upc.edu 8th June 2006

Outline Introduction Parsing and Learning Parsing Model Parsing Algorithm Global Perceptron Learning Algorithm Features Experiments and Results Results Discussion


Introduction

Motivation:
- Blind treatment of multilingual data
- Use well-known components

Our dependency parsing learning architecture:
- Eisner dep-parsing algorithm, for projective structures
- Perceptron learning algorithm, run globally
- Features: state-of-the-art, with some new ones

On the CoNLL-X data we achieve moderate performance: 74.72% overall labeled attachment score (10th position in the ranking).

Outline Introduction Parsing and Learning Parsing Model Parsing Algorithm Global Perceptron Learning Algorithm Features Experiments and Results Results Discussion

Parsing Model

A dependency tree is decomposed into labeled dependencies, each of the form [h, m, l], where:
- h is the position of the head word
- m is the position of the modifier word
- l is the label of the dependency

Given a sentence x, the parser computes:

  dparser(x, w) = argmax_{y ∈ Y(x)} score(x, y, w)
                = argmax_{y ∈ Y(x)} Σ_{[h,m,l] ∈ y} score([h, m, l], x, y, w)
                = argmax_{y ∈ Y(x)} Σ_{[h,m,l] ∈ y} w_l · φ([h, m], x, y)

w = (w_1, ..., w_l, ..., w_L) is the learned weight vector; φ is the feature extraction function, given a priori.
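The decomposed score can be sketched with sparse feature vectors. Here phi is a hypothetical function returning the names of the active features for an arc, and the dict-based weight storage is only an illustration, not the authors' implementation:

```python
def tree_score(x, y, w, phi):
    """Score a dependency tree y = {(h, m, l), ...} as the sum over its
    labeled dependencies of w_l . phi([h, m], x, y).
    w is a sparse weight vector keyed by (label, feature-name)."""
    total = 0.0
    for h, m, l in y:
        for f in phi(h, m, x, y):      # active (binary) features of the arc
            total += w.get((l, f), 0.0)
    return total
```

The parser's argmax then ranges over candidate trees scored this way.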

The Parsing Algorithm of Eisner (1996)

Assumes that dependency structures are projective; in the CoNLL data, this only holds for Chinese.
A bottom-up dynamic programming algorithm. In a given span from word s to word e:
1. Look for the optimal split point r giving internal structures [s .. r] and [r+1 .. e]
2. Look for the best label to connect the structures

The Parsing Algorithm of Eisner (1996) (II)

A third step assembles two dependency structures without using learning: a structure spanning [s .. r] and a structure spanning [r .. e] are joined into a single structure spanning [s .. e].
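A minimal sketch of Eisner's bottom-up algorithm, in an unlabeled, score-only form: it returns only the best tree score, omitting the backpointers needed to recover the tree and the label search of step 2. The complete/incomplete table layout is one common formulation, not the authors' code:

```python
def eisner(score):
    """First-order projective parsing (Eisner 1996), best score only.
    score[h][m] is the score of arc h -> m; index 0 is the artificial root."""
    n = len(score) - 1        # number of real words
    NEG = float("-inf")
    size = n + 1
    # complete / incomplete spans, head on the left (_r) or on the right (_l)
    c_r = [[NEG] * size for _ in range(size)]
    c_l = [[NEG] * size for _ in range(size)]
    i_r = [[NEG] * size for _ in range(size)]
    i_l = [[NEG] * size for _ in range(size)]
    for i in range(size):
        c_r[i][i] = c_l[i][i] = 0.0
    for k in range(1, size):              # span length
        for i in range(size - k):
            j = i + k
            # incomplete spans: add arc i -> j (i_r) or j -> i (i_l)
            best = max(c_r[i][r] + c_l[r + 1][j] for r in range(i, j))
            i_r[i][j] = best + score[i][j]
            i_l[i][j] = best + score[j][i]
            # complete spans: the learning-free assembly step
            c_r[i][j] = max(i_r[i][r] + c_r[r][j] for r in range(i + 1, j + 1))
            c_l[i][j] = max(c_l[i][r] + i_l[r][j] for r in range(i, j))
    return c_r[0][n]
```

For a two-word sentence with arc scores favoring root -> 1 -> 2, the best score is the sum of those two arcs.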

Perceptron Learning

Global Perceptron (Collins 2002): trains the weight vector in interaction with the parsing algorithm. A very simple online learning algorithm: it corrects the mistakes seen after a training sentence is parsed.

  w = 0
  for t = 1 to T
    foreach training example (x, y) do
      ŷ = dparser(x, w)
      foreach [h, m, l] ∈ y \ ŷ do      /* missed deps */
        w_l = w_l + φ(h, m, x, y)
      foreach [h, m, l] ∈ ŷ \ y do      /* over-predicted deps */
        w_l = w_l - φ(h, m, x, ŷ)
  return w
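The update above can be sketched as follows, storing the per-label weight vectors w_l in one sparse dict keyed by (label, feature); dparser and phi are assumed to be supplied, and the set difference on (h, m, l) triples mirrors y \ ŷ and ŷ \ y:

```python
from collections import defaultdict

def perceptron_train(examples, dparser, phi, T=1):
    """Global perceptron (Collins 2002), sketched.
    examples: list of (x, y), where y is a set of (h, m, l) triples.
    dparser(x, w): returns the best tree (a set of triples) under weights w.
    phi(h, m, x, y): returns the active feature names of an arc."""
    w = defaultdict(float)
    for _ in range(T):
        for x, y in examples:
            y_hat = dparser(x, w)
            for h, m, l in y - y_hat:          # missed dependencies
                for f in phi(h, m, x, y):
                    w[(l, f)] += 1.0
            for h, m, l in y_hat - y:          # over-predicted dependencies
                for f in phi(h, m, x, y_hat):
                    w[(l, f)] -= 1.0
    return w
```

With a stub parser that predicts a wrong arc, one pass raises the weight of the missed arc's features and lowers the over-predicted one's.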

Outline Introduction Parsing and Learning Parsing Model Parsing Algorithm Global Perceptron Learning Algorithm Features Experiments and Results Results Discussion

Feature Extraction Function

φ(h, m, x, y): represents, as a feature vector, a dependency from word position m to h, in the context of a sentence x and a dependency tree y.

  φ(h, m, x, y) = φ_token(x, h, head) + φ_tctx(x, h, head)
                + φ_token(x, m, mod) + φ_tctx(x, m, mod)
                + φ_dep(x, mm_{h,m}, d_{h,m}) + φ_dctx(x, mm_{h,m}, d_{h,m})
                + φ_dist(x, mm_{h,m}, d_{h,m}) + φ_runtime(x, y, h, m, d_{h,m})

where mm_{h,m} is a shorthand for the tuple (min(h, m), max(h, m)), and d_{h,m} indicates the direction of the dependency.

Context-Independent Token Features

Represent a token x_i; type indicates the type of token being represented, i.e. head or mod. Some of these patterns are novel (highlighted in red on the original slide).

  φ_token(x, i, type):
    type · word(x_i)
    type · lemma(x_i)
    type · cpos(x_i)
    type · fpos(x_i)
    foreach f ∈ morphosynt(x_i): type · f
    type · word(x_i) · cpos(x_i)
    foreach f ∈ morphosynt(x_i): type · word(x_i) · f
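These templates can be sketched as string-valued feature generators; the conjunction operator · becomes string concatenation with a separator. The dict field names ('word', 'morph', etc.) are illustrative, not from the paper:

```python
def phi_token(x, i, typ):
    """Context-independent token features for token x[i].
    x is a list of dicts with 'word', 'lemma', 'cpos', 'fpos' and a
    'morph' list of morphosyntactic values (illustrative field names)."""
    if i < 0 or i >= len(x):
        return []                      # outside the sentence: no features
    t = x[i]
    feats = [
        f"{typ}|w={t['word']}",
        f"{typ}|l={t['lemma']}",
        f"{typ}|cp={t['cpos']}",
        f"{typ}|fp={t['fpos']}",
        f"{typ}|w+cp={t['word']}|{t['cpos']}",
    ]
    for m in t["morph"]:               # one feature per morphosyntactic value
        feats.append(f"{typ}|m={m}")
        feats.append(f"{typ}|w+m={t['word']}|{m}")
    return feats
```

The boundary check also covers the context windows below, where positions i-2 or i+2 can fall outside the sentence.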

Context-Dependent Token Features

Represent the context of a token x_i. The function extracts token features of surrounding tokens; it also conjoins some selected features along the window.

  φ_tctx(x, i, type):
    φ_token(x, i-1, type · string(-1))
    φ_token(x, i-2, type · string(-2))
    φ_token(x, i+1, type · string(+1))
    φ_token(x, i+2, type · string(+2))
    type · cpos(x_i) · cpos(x_{i-1})
    type · cpos(x_i) · cpos(x_{i-1}) · cpos(x_{i-2})
    type · cpos(x_i) · cpos(x_{i+1})
    type · cpos(x_i) · cpos(x_{i+1}) · cpos(x_{i+2})

Context-Independent Dependency Features

Features of the two tokens involved in a dependency relation; dir indicates whether the relation is left-to-right or right-to-left.

  φ_dep(x, i, j, dir):
    dir · word(x_i) · cpos(x_i) · word(x_j) · cpos(x_j)
    dir · cpos(x_i) · word(x_j) · cpos(x_j)
    dir · word(x_i) · word(x_j) · cpos(x_j)
    dir · word(x_i) · cpos(x_i) · cpos(x_j)
    dir · word(x_i) · cpos(x_i) · word(x_j)
    dir · word(x_i) · word(x_j)
    dir · cpos(x_i) · cpos(x_j)

Context-Dependent Dependency Features

Capture the context of the two tokens involved in a relation; dir indicates whether the relation is left-to-right or right-to-left.

  φ_dctx(x, i, j, dir):
    dir · cpos(x_i) · cpos(x_{i+1}) · cpos(x_{j-1}) · cpos(x_j)
    dir · cpos(x_{i-1}) · cpos(x_i) · cpos(x_{j-1}) · cpos(x_j)
    dir · cpos(x_i) · cpos(x_{i+1}) · cpos(x_j) · cpos(x_{j+1})
    dir · cpos(x_{i-1}) · cpos(x_i) · cpos(x_j) · cpos(x_{j+1})

Surface Distance Features

Features on the surface tokens found within a dependency relation. Numeric features are discretized, using binning, into a small number of intervals.

  φ_dist(x, i, j, dir):
    foreach k ∈ (i, j): dir · cpos(x_i) · cpos(x_k) · cpos(x_j)
    number of tokens between i and j
    number of verbs between i and j
    number of coordinations between i and j
    number of punctuation signs between i and j
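The discretization by binning can be sketched like this; the interval boundaries are illustrative, since the slides do not specify them:

```python
def bin_count(k, bounds=(1, 2, 3, 5, 10)):
    """Map a surface count (e.g. number of tokens between i and j) to a
    small set of intervals, so each interval becomes one binary feature."""
    for b in bounds:
        if k <= b:
            return f"<={b}"
    return f">{bounds[-1]}"
```

Each of the four counts above would be passed through such a binning function before being conjoined into a feature name.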

Runtime Features

Capture the labels l_1, ..., l_S of the dependencies that already attach to the head word. This information is available in the dynamic programming matrix of the parsing algorithm.

  φ_runtime(x, y, h, m, dir):
    foreach i, 1 ≤ i ≤ S: dir · cpos(x_h) · cpos(x_m) · l_i
    dir · cpos(x_h) · cpos(x_m) · l_1
    dir · cpos(x_h) · cpos(x_m) · l_1 · l_2
    dir · cpos(x_h) · cpos(x_m) · l_1 · l_2 · l_3
    dir · cpos(x_h) · cpos(x_m) · l_1 · l_2 · l_3 · l_4

Outline Introduction Parsing and Learning Parsing Model Parsing Algorithm Global Perceptron Learning Algorithm Features Experiments and Results Results Discussion

Results

  Language     GOLD    UAS     LAS
  Japanese     99.16   90.79   88.13
  Chinese      100.0   88.65   83.68
  Portuguese   98.54   87.76   83.37
  Bulgarian    99.56   88.81   83.30
  German       98.84   85.90   82.41
  Danish       99.18   85.67   79.74
  Swedish      99.64   85.54   78.65
  Spanish      99.96   80.77   77.16
  Czech        97.78   77.44   68.82
  Slovene      98.38   77.72   68.43
  Dutch        94.56   71.39   67.25
  Arabic       99.76   72.65   60.94
  Turkish      98.41   70.05   58.06
  Overall      98.68   81.19   74.72

Feature Analysis

LAS at increasing feature configurations, each column adding one family of feature patterns:

  Language     φ_token+φ_dep   +φ_tctx   +φ_dist   +φ_runtime   +φ_dctx
  Japanese     38.78           78.13     86.87     88.27        88.13
  Portuguese   47.10           64.74     80.89     82.89        83.37
  Spanish      12.80           53.80     68.18     74.27        77.16
  Turkish      33.02           48.00     55.33     57.16        58.06

All families of feature patterns help significantly.

Errors Caused by 4 Factors

1. Size of training sets: accuracy is below 70% for the languages with small training sets: Turkish, Arabic, and Slovene.

2. Modeling long-distance dependencies: our distance features (φ_dist) are insufficient to model long-distance dependencies well (accuracy by arc length):

     Language     to root   1       2       3-6     >=7
     Spanish      83.04     93.44   86.46   69.97   61.48
     Portuguese   90.81     96.49   90.79   74.76   69.01

3. Modeling context: our context features (φ_dctx, φ_tctx, and φ_runtime) do not capture complex dependencies. Top 5 focus words with most errors:
   Spanish: y, de, a, en, and que
   Portuguese: em, de, a, e, and para

4. Projectivity assumption: Dutch is the language with the most crossing dependencies in this evaluation, and the accuracy we obtain is below 70%.

Thanks!