Fully Delexicalized Contexts for Syntax-Based Word Embeddings

Jenna Kanerva¹, Sampo Pyysalo² and Filip Ginter¹
¹Dept. of IT, University of Turku, Finland; ²Language Technology Lab, University of Cambridge
turkunlp.github.io

Abstract
We propose fully delexicalized contexts derived from syntactic trees for training word embeddings. We demonstrate and evaluate our embeddings against vanilla word2vec on:
- Nearest neighbours
- Correlation with human judgement
- Dependency parsing

Outline
- Related work
- Our word embedding contexts
- Motivation
- Experiments and evaluation

Related work
- Word2vec: Mikolov et al. (2013)
- Word2vecf: Levy and Goldberg (2014), which uses syntactic contexts

Vanilla word2vec (Mikolov et al. 2013)
Skip-gram architecture: the input layer is the target word (enjoyed), followed by a hidden layer and an output layer predicting the surrounding words (Hope, you, your, weekend). The output layer size is the vocabulary.
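
To make the contrast with the syntactic variants below concrete, here is a minimal sketch of the (target, context) pairs that vanilla skip-gram word2vec trains on; the sentence and window size are illustrative, not from the paper.

```python
# Minimal sketch: vanilla skip-gram contexts are just the words in a
# symmetric linear window around the target (window size illustrative).
def window_contexts(sentence, window=2):
    for i, target in enumerate(sentence):
        lo, hi = max(0, i - window), min(len(sentence), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                yield target, sentence[j]

sentence = "Hope you enjoyed your weekend".split()
print([c for t, c in window_contexts(sentence) if t == "enjoyed"])
# -> ['Hope', 'you', 'your', 'weekend']
```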

Word2vecf (Levy & Goldberg 2014)
Same architecture, but the output layer predicts syntactic contexts for the input word enjoyed: Hope/ccomp_gov, you/nsubj, weekend/obj. The output layer size is still roughly the vocabulary.
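
A sketch of how such dependency contexts could be extracted from a parsed sentence; the Token structure below is a stand-in for a CoNLL-U token, not word2vecf's actual input format.

```python
from collections import namedtuple

# Stand-in for one parsed token: 1-based index, form, head index, relation.
Token = namedtuple("Token", "idx form head deprel")

def dependency_contexts(tokens):
    by_idx = {t.idx: t for t in tokens}
    for t in tokens:
        if t.head == 0:                              # root has no governor
            continue
        gov = by_idx[t.head]
        yield gov.form, f"{t.form}/{t.deprel}"       # governor sees dependent
        yield t.form, f"{gov.form}/{t.deprel}_gov"   # dependent sees governor

sent = [Token(1, "Hope", 0, "root"),
        Token(2, "you", 3, "nsubj"),
        Token(3, "enjoyed", 1, "ccomp"),
        Token(4, "your", 5, "nmod:poss"),
        Token(5, "weekend", 3, "obj")]
print([c for t, c in dependency_contexts(sent) if t == "enjoyed"])
# -> ['you/nsubj', 'Hope/ccomp_gov', 'weekend/obj']
```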

Our method
The output layer predicts fully delexicalized contexts for the input word enjoyed: ccomp_gov, nsubj, obj, VERB, Mood=Ind, Tense=Past, VerbForm=Fin. The output layer size is the number of POS tags + morphological features + dependency types.
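
The same sketch adapted to the delexicalized scheme, consistent with the contexts shown for "enjoyed" above: only relation labels, the UPOS tag and the morphological features survive, and all word forms are dropped from the contexts. The data structure is again an illustrative stand-in.

```python
from collections import namedtuple

# Stand-in token, now also carrying UPOS and CoNLL-U style FEATS.
Token = namedtuple("Token", "idx form head deprel upos feats")

def delexicalized_contexts(tokens):
    for t in tokens:
        if t.head != 0:
            yield t.form, f"{t.deprel}_gov"   # relation to the governor
        for d in tokens:
            if d.head == t.idx:
                yield t.form, d.deprel        # relations of the dependents
        yield t.form, t.upos                  # own part-of-speech tag
        if t.feats != "_":
            for feat in t.feats.split("|"):
                yield t.form, feat            # own morphological features

sent = [Token(1, "Hope", 0, "root", "VERB", "_"),
        Token(2, "you", 3, "nsubj", "PRON", "_"),
        Token(3, "enjoyed", 1, "ccomp", "VERB",
              "Mood=Ind|Tense=Past|VerbForm=Fin"),
        Token(4, "your", 5, "nmod:poss", "PRON", "_"),
        Token(5, "weekend", 3, "obj", "NOUN", "_")]
print([c for t, c in delexicalized_contexts(sent) if t == "enjoyed"])
# -> ['ccomp_gov', 'nsubj', 'obj', 'VERB', 'Mood=Ind', 'Tense=Past', 'VerbForm=Fin']
```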

Motivation
1) How much semantics can be learnt without the actual words?
2) Does task-specific training help?
3) Unified treebank annotations: universal/multilingual word embeddings?

Experiments
- Training our word embeddings for 45 languages
- Inspecting nearest neighbours
- Comparing our embeddings to vanilla word2vec on:
  - Correlation with human judgement
  - Dependency parsing (a closely related task)

Data
- Word embedding training: automatically parsed raw text collection (Ginter et al. 2017)
- Parser training and evaluation: Universal Dependencies v2.0 treebanks (Nivre et al. 2017)
- Word similarity evaluation: the evaluation service of Faruqui and Dyer (2014), covering 13 human judgement datasets

Evaluation: Nearest neighbours
[Two slides of example nearest-neighbour lists, comparing vanilla word2vec with our delexicalized vectors.]
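
A hypothetical way to reproduce such a nearest-neighbour listing with gensim, assuming the embeddings were saved in the standard word2vec text format; the file name and query word are illustrative.

```python
from gensim.models import KeyedVectors

# File name illustrative; any embeddings in word2vec text format work.
vectors = KeyedVectors.load_word2vec_format("delex_vectors.txt", binary=False)
for word, score in vectors.most_similar("weekend", topn=5):
    print(f"{word}\t{score:.3f}")
```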

Evaluation: human judgement
Our vectors are not as good as word2vec, but the correlation with human judgement is still positive.
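
A minimal sketch of how such a correlation is typically computed: Spearman's rho between embedding cosine similarities and human scores, reusing the `vectors` object loaded in the gensim sketch above. The word pairs and scores below are illustrative, not taken from any of the 13 datasets.

```python
from scipy.stats import spearmanr

# Illustrative (word1, word2, human score) triples.
pairs = [("car", "automobile", 9.0), ("cup", "coffee", 6.6), ("noon", "string", 0.5)]
model_scores = [vectors.similarity(a, b) for a, b, _ in pairs]  # cosine similarity
human_scores = [h for _, _, h in pairs]
rho, _ = spearmanr(model_scores, human_scores)
print(f"Spearman rho: {rho:.3f}")
```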

Evaluation: Dependency parsing
UDPipe parser with three different pre-trained word embeddings:
- Baseline: word2vec trained on treebank data
- word2vec trained on the raw text collection
- Ours, trained on the automatically analysed raw text collection

Evaluation: Dependency parsing
[Per-treebank results table; a cell is green if our embeddings beat word2vec and the difference to the baseline is positive.]

Evaluation: Dependency parsing
- word2vec: +0.16 over the baseline on average; difference to the baseline between -1.55% and +6.28%; 31 treebanks positive, 23 negative
- Our embeddings: +0.88 over the baseline; difference to the baseline between -0.80% and +7.30%; 45 treebanks positive, 9 negative

Evaluation: Dependency parsing
- Pre-trained embeddings do not automatically increase parsing performance across languages.
- Delexicalized syntactic embeddings lead to higher performance and generalize better across languages when evaluated on a closely related task.

Parser accuracy vs. quality of the embeddings
Our word embeddings are trained on automatically parsed data. How does the baseline parser accuracy affect the quality of the word embeddings?
Bootstrapping (sketched below): baseline parser -> parse raw text -> embeddings -> better parser -> parse raw text -> new embeddings -> even better parser?
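
The loop can be summarized as a structural sketch; the parser and embedding trainers are passed in as callables because the slides name no concrete API.

```python
def bootstrap(raw_text, treebank, train_parser, train_embeddings, iterations=2):
    """Bootstrap loop: each round parses the raw text with the current
    parser, trains delexicalized embeddings on those parses, and retrains
    the parser with the new embeddings."""
    parser = train_parser(treebank, embeddings=None)   # baseline parser
    for _ in range(iterations):
        trees = parser.parse(raw_text)                 # parse the raw text
        embeddings = train_embeddings(trees)           # delexicalized contexts
        parser = train_parser(treebank, embeddings)    # retrain the parser
    return parser
```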

Bootstrapping on Finnish

          Baseline   Iteration 1   Iteration 2
Finnish   75.70      77.35         77.57

Small improvement with the second-iteration model. UDPipe is not an optimal parser for this study, as POS tags and morphological features are not revised.

Bootstrapping vol. 2
The baseline UDPipe is not competitive with the state of the art on Finnish: 75.7% compared to 83-84%. What if we use raw data parsed with a (near) state-of-the-art parser?

Bootstrapping vol. 2
Raw data: the Finnish Internet Parsebank, a ~3.6 billion token collection of web-crawled data, parsed with Finnish-dep-parser:
- Omorfi rule-based morphological analyzer
- Marmot tagger
- Mate-tools graph-based dependency parser
Its UD v1.2 LAS is estimated to be ~82%.

Bootstrapping vol. 2
Warning: these numbers are not comparable to our main result table! Different UD version (v1.2 vs. v2.0), and the raw text collection is more than three times bigger.

Bootstrapping vol. 2
- UDPipe + our embeddings trained on the Finnish Internet Parsebank: 82.21%
- UDPipe + word2vec embeddings trained on the Finnish Internet Parsebank: 78.35%
- UDPipe baseline: ~76.5%

Conclusions
- Fully delexicalized contexts for word embedding training
- Somewhat surprisingly, these embeddings also capture semantic aspects
- They improve parsing accuracy and generalize better than standard word2vec embeddings

Thanks!
turkunlp.github.io
Acknowledgements: Academy of Finland, Kone Foundation, University of Turku Graduate School