Backpropagating through Structured Argmax using a SPIGOT

1 Backpropagating through Structured Argmax using a SPIGOT Hao Peng, Sam Thomson, Noah A. July 17, 2018

2 Overview. Parser → arg max → Downstream task → Loss $L$.

3 Overview. Parser → arg max → Downstream task → Loss $L$. Downstream task: head token (Yang and Mitchell, 2017), Tree-RNN (Tai et al., 2015), Graph CNN (Kipf and Welling, 2017).

4 Overview. Parser → arg max → Downstream task → Loss $L$. Is the arg max a layer in the computation graph?

5 Overview. Parser → arg max → Downstream task → Loss $L$. Is the arg max a layer in the computation graph? The arg max is non-differentiable.

6 Overview. Aim: structured prediction as a layer. Motivation: structures help (Ji and Smith, 2017; Oepen et al., 2017); linguistic structures may not be universally optimal (Williams, 2017). Diagram: Intermediate parser → arg max → Downstream task → Loss $L$; $\nabla L$ through the arg max?

7 Overview. Aim: structured prediction as a layer. Motivation: structures help (Ji and Smith, 2017; Oepen et al., 2017); linguistic structures may not be universally optimal (Williams, 2017). Challenges: arg max is non-differentiable. Diagram: Intermediate parser → arg max → Downstream task → Loss $L$; $\nabla L$ through the arg max?

8 Overview. Aim: structured prediction as a layer. Motivation: structures help (Ji and Smith, 2017; Oepen et al., 2017); linguistic structures may not be universally optimal (Williams, 2017). Challenges: arg max is non-differentiable. Method: SPIGOT (Structured Prediction Intermediate Gradients Optimization Technique), a proxy. Diagram: Intermediate parser → arg max → Downstream task → Loss $L$; $\nabla L$ through the arg max?

9 Outline. Background: structured prediction as linear programs. Method: SPIGOT algorithm. Experiments.

10 Structured Prediction Reviewed. Input → Output.

11 Structured Prediction Reviewed. Input → Score: $S(\text{tree}) = \sum_{\text{arcs}} s(\text{head} \rightarrow \text{mod})$.

12 Structured Prediction Reviewed. Input → Score: $s = [s(\cdot), s(\cdot), s(\cdot), \ldots, s(\cdot)]^\top$, $z = [1?, 0?, 1?, \ldots, 0?]^\top$ → Output: $\hat{z} = \arg\max_z z^\top s$ s.t. $z$ forms a tree.
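
The arc-factored arg max on slides 11 and 12 can be illustrated on a toy sentence. The sketch below is not the decoder used in the talk (slide 42 names the Eisner algorithm); it is a brute-force search over head assignments, written only to make the objective concrete. The names `is_tree`, `argmax_tree`, and the example `scores` matrix are hypothetical, and the search allows several words to attach to the root and does not enforce projectivity.

```python
from itertools import product


def is_tree(heads):
    """heads[m - 1] is the head of word m (words are 1..n); 0 is the root symbol.

    The assignment is a tree if every word reaches the root without a cycle.
    """
    for start in range(1, len(heads) + 1):
        seen = set()
        node = start
        while node != 0:
            if node in seen:
                return False  # cycle (this also rejects self-loops)
            seen.add(node)
            node = heads[node - 1]
    return True


def argmax_tree(scores):
    """Exhaustive arg max of the summed arc scores over all valid trees.

    scores[h][m] is s(h -> m) for head h in 0..n and modifier m in 1..n
    (column 0 is unused). Returns (best_heads, best_score).
    """
    n = len(scores) - 1
    best_heads, best_score = None, float("-inf")
    for heads in product(range(n + 1), repeat=n):
        if not is_tree(heads):
            continue
        total = sum(scores[h][m] for m, h in enumerate(heads, start=1))
        if total > best_score:
            best_heads, best_score = heads, total
    return best_heads, best_score


# Hypothetical scores for a three-word sentence (row = head, column = modifier).
scores = [
    [0.0, 1.0, 5.0, 1.0],  # root -> word 1, 2, 3
    [0.0, 0.0, 2.0, 1.0],  # word 1 -> ...
    [0.0, 4.0, 0.0, 3.0],  # word 2 -> ...
    [0.0, 1.0, 2.0, 0.0],  # word 3 -> ...
]
best_heads, best_score = argmax_tree(scores)
```

Each entry of `best_heads` selects one arc, so the tuple is a compact encoding of the indicator vector $\hat{z}$: the arcs it names have $z_i = 1$ and all others have $z_i = 0$. The search is exponential in sentence length and is meant only for tiny examples.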

13 Linear Programming Formulation. $\hat{z} = \arg\max_z z^\top s$ s.t. $z$ forms a tree, where $s$ is the column vector of arc scores $[s(\cdot), s(\cdot), s(\cdot), s(\cdot)]^\top$ and the tree constraint is written as $Az \le b$ (Roth and Yih, 2004; Martins et al., 2009).

14 Linear Programming Formulation. $\hat{z} = \arg\max_z z^\top s$ s.t. $z$ forms a tree, written as $Az \le b$. Relaxation: $z_i \in \{0, 1\}$ becomes $z_i \in [0, 1]$ (Roth and Yih, 2004; Martins et al., 2009).

15 Outline. Background: structured prediction as linear programs. Method: SPIGOT algorithm. Experiments.

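16 Backprop. Forward: scores $s$ → $\hat{z} = \arg\max_z z^\top s$ s.t. $z$ forms a tree → Downstream task → Loss $L$. Backward: $\nabla L$.

17 Backprop. Forward: scores $s$ → $\hat{z} = \arg\max_z z^\top s$ s.t. $z$ forms a tree → Downstream task → Loss $L$. Backward: backprop through the downstream task gives $\nabla_{\hat{z}} L$.

18 Backprop. Forward: scores $s$ → $\hat{z} = \arg\max_z z^\top s$ s.t. $z$ forms a tree → Downstream task → Loss $L$. Backward: backprop through the downstream task gives $\nabla_{\hat{z}} L$; backprop into the parser needs $\nabla_s L$.

19 Backprop. Forward: scores $s$ → $\hat{z} = \arg\max_z z^\top s$ s.t. $z$ forms a tree → Downstream task → Loss $L$. Backward: backprop through the downstream task gives $\nabla_{\hat{z}} L$; a proxy supplies $\nabla_s L$ for backprop into the parser.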

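20 Backprop. We have: $\nabla_{\hat{z}} L$. We need: $\nabla_s L$.

21 Backprop. We have: $\nabla_{\hat{z}} L$. We need: $\nabla_s L$. Chain rule (Leibniz, 1676): $\nabla_s L = J \nabla_{\hat{z}} L$.

22 Backprop. We have: $\nabla_{\hat{z}} L$. We need: $\nabla_s L$. Chain rule (Leibniz, 1676): $\nabla_s L = J \nabla_{\hat{z}} L$. For $\hat{z} = \arg\max_z z^\top s$ s.t. $z$ forms a tree, the Jacobian $J$ is not defined.

23 Backprop. We have: $\nabla_{\hat{z}} L$. We need: $\nabla_s L$. Chain rule (Leibniz, 1676): $\nabla_s L = J \nabla_{\hat{z}} L$. Straight-through Estimator (STE; Hinton, 2012; Bengio et al., 2013): $\nabla_s L \triangleq \nabla_{\hat{z}} L$.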
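
The straight-through rule on slide 23 can be traced by hand on a toy problem. The sketch below makes several assumptions that are not in the talk: the structured arg max is replaced by a plain one-hot arg max over three scores, the downstream loss is a hypothetical squared distance to a target vector, and the names `one_hot_argmax`, `loss_and_grad_z`, and `ste_grad_s` are made up for illustration.

```python
def one_hot_argmax(s):
    """Return the one-hot indicator of the highest-scoring entry."""
    k = max(range(len(s)), key=lambda i: s[i])
    return [1.0 if i == k else 0.0 for i in range(len(s))]


def loss_and_grad_z(z, target):
    """Toy downstream loss L = 0.5 * ||z - target||^2 and its gradient w.r.t. z."""
    diff = [zi - ti for zi, ti in zip(z, target)]
    return 0.5 * sum(d * d for d in diff), diff


def ste_grad_s(grad_z):
    """Straight-through estimator: the backward pass treats arg max as identity."""
    return list(grad_z)


s = [2.0, 1.5, 0.5]          # hypothetical scores
target = [0.0, 1.0, 0.0]     # hypothetical downstream preference

z_hat = one_hot_argmax(s)                       # forward: hard decision
loss, grad_z = loss_and_grad_z(z_hat, target)   # backprop down to z_hat
grad_s = ste_grad_s(grad_z)                     # copy the gradient onto s

lr = 0.5                                        # assumed learning rate
s_new = [si - lr * gi for si, gi in zip(s, grad_s)]
z_new = one_hot_argmax(s_new)                   # decision after one update
```

With these numbers the first score is pushed down and the second up, so the hard decision moves to the entry the downstream loss prefers. Nothing in the update keeps the implied step on $\hat{z}$ inside the feasible set.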

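24 Some Geometry. Straight-through Estimator (STE): $\nabla_s L \triangleq \nabla_{\hat{z}} L$. Polytope $Az \le b$ with vertex $\hat{z} = [1, 0, 1, \ldots, 0]^\top$.

25 Some Geometry. Straight-through Estimator (STE): $\nabla_s L \triangleq \nabla_{\hat{z}} L$. Polytope $Az \le b$ with vertex $\hat{z} = [1, 0, 1, \ldots, 0]^\top$ and $\nabla_{\hat{z}} L = [0.3, 0.5, 0.4, \ldots, 0.2]$.

26 Some Geometry. Straight-through Estimator (STE): $\nabla_s L \triangleq \nabla_{\hat{z}} L$. Polytope $Az \le b$ with vertex $\hat{z} = [1, 0, 1, \ldots, 0]^\top$ and $\nabla_{\hat{z}} L = [0.3, 0.5, 0.4, \ldots, 0.2]$; $p = \hat{z} - \eta \nabla_{\hat{z}} L$.

27 Some Geometry. SPIGOT. Polytope $Az \le b$ with vertex $\hat{z} = [1, 0, 1, \ldots, 0]^\top$ and $\nabla_{\hat{z}} L = [0.3, 0.5, 0.4, \ldots, 0.2]$; $p = \hat{z} - \eta \nabla_{\hat{z}} L$ and its projection $q$.

28 Some Geometry. SPIGOT. Polytope $Az \le b$ with vertex $\hat{z} = [1, 0, 1, \ldots, 0]^\top$ and $\nabla_{\hat{z}} L = [0.3, 0.5, 0.4, \ldots, 0.2]$. $p = \hat{z} - \eta \nabla_{\hat{z}} L$, $q = \operatorname{proj}(p)$, $\nabla_s L \triangleq \hat{z} - q$.

29 Some Geometry. SPIGOT. [Figure: two panels, each labelled with $\hat{z}$, $\nabla_{\hat{z}} L$, and $\nabla_s L$.]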

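30 Algorithm. Input → Parser → scores $s$ → $\hat{z} = \arg\max_z z^\top s$ s.t. $z$ forms a tree.

31 Algorithm. Input → Parser → scores $s$ → $\hat{z} = \arg\max_z z^\top s$ s.t. $z$ forms a tree → Downstream task → Loss $L$.

32 Algorithm. Input → Parser → scores $s$ → $\hat{z} = \arg\max_z z^\top s$ s.t. $z$ forms a tree → Downstream task → Loss $L$. Backprop through the downstream task: $\nabla_{\hat{z}} L$.

33 Algorithm. Input → Parser → scores $s$ → $\hat{z} = \arg\max_z z^\top s$ s.t. $z$ forms a tree → Downstream task → Loss $L$. Backprop through the downstream task: $\nabla_{\hat{z}} L$. Project onto the polytope $Az \le b$: $p = \hat{z} - \eta \nabla_{\hat{z}} L$, $q = \operatorname{proj}(p)$, $\nabla_s L \triangleq \hat{z} - q$.

34 Algorithm. Input → Parser → scores $s$ → $\hat{z} = \arg\max_z z^\top s$ s.t. $z$ forms a tree → Downstream task → Loss $L$. Backprop through the downstream task: $\nabla_{\hat{z}} L$. Project onto the polytope $Az \le b$: $p = \hat{z} - \eta \nabla_{\hat{z}} L$, $q = \operatorname{proj}(p)$, $\nabla_s L \triangleq \hat{z} - q$. Backprop $\nabla_s L$ into the parser.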
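
The backward step on slides 33 and 34 can be sketched in a few lines once a projection routine is available. The version below is an illustration under stated assumptions, not the implementation from the talk: the tree polytope $Az \le b$ is replaced by the probability simplex (the feasible set of a one-hot arg max), for which a simple sort-based Euclidean projection exists. The names `project_onto_simplex` and `spigot_grad_s`, the step size `eta`, and the example numbers are all hypothetical.

```python
def project_onto_simplex(p):
    """Euclidean projection of p onto {q : q_i >= 0, sum(q) = 1}.

    Sort-based method: find the threshold tau such that clipping p - tau
    at zero yields a vector that sums to one.
    """
    u = sorted(p, reverse=True)
    cumulative = 0.0
    tau = 0.0
    for j, u_j in enumerate(u, start=1):
        cumulative += u_j
        t = (cumulative - 1.0) / j
        if u_j - t > 0:
            tau = t  # the last index satisfying the condition fixes tau
    return [max(x - tau, 0.0) for x in p]


def spigot_grad_s(z_hat, grad_z, eta=1.0):
    """SPIGOT-style proxy gradient: step from z_hat, project back, take the difference."""
    p = [z - eta * g for z, g in zip(z_hat, grad_z)]  # p = z_hat - eta * grad_z
    q = project_onto_simplex(p)                       # q = proj(p)
    return [z - qi for z, qi in zip(z_hat, q)]        # grad_s := z_hat - q


z_hat = [1.0, 0.0, 0.0]      # hard decision from the forward pass
grad_z = [0.5, -0.5, 1.0]    # hypothetical gradient from the downstream task

q = project_onto_simplex([z - g for z, g in zip(z_hat, grad_z)])
grad_s = spigot_grad_s(z_hat, grad_z, eta=1.0)
ste_grad = list(grad_z)      # what the straight-through estimator would pass back
```

In this example the third coordinate of $\hat{z}$ is already 0 and the downstream gradient asks to lower it further. The projection clips that request, so the proxy gradient is zero there, whereas the straight-through estimator would pass the full value through. For the tree-structured case in the talk, the projection would have to be computed over $Az \le b$ instead, which needs a dedicated solver.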

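35 Connections to Related Work. SPIGOT vs. STE. [Figure: two panels, each labelled with $\hat{z}$, $\nabla_{\hat{z}} L$, and $\nabla_s L$.] [Table comparing Pipeline, STE, Structured Att., and SPIGOT; labels: hard decision on $\hat{z}$, backprop, marginal, projection; cell entries not recoverable.] Structured Attention: Kim et al., 2017.

36 Connections to Related Work. SPIGOT: $\hat{z} = \arg\max(\ldots)$, with $\nabla_{\hat{z}} L$ and $\nabla_s L$. Structured Attention: $\hat{z} = \operatorname{softmax}(\ldots)$. [Table comparing Pipeline, STE, Structured Att., and SPIGOT; labels: hard decision on $\hat{z}$, backprop, marginal, projection; cell entries not recoverable.] Structured Attention: Kim et al., 2017.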

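37 Applications. Joint learning (Swayamdipta et al., 2016): Training data → Parser → arg max, with parser loss $L_1$ and gradient $\nabla L_1$.

38 Applications. Joint learning (Swayamdipta et al., 2016): Training data → Parser → arg max → Downstream task → Loss $L_2$, with parser loss $L_1$; the parser receives $\nabla L_1$ and $\nabla L_2$, the downstream task receives $\nabla L_2$.

39 Applications. Joint learning (Swayamdipta et al., 2016): Training data → Parser → arg max → Downstream task → Loss $L_2$, with parser loss $L_1$; the parser receives $\nabla L_1$ and $\nabla L_2$, the downstream task receives $\nabla L_2$. Induce latent structures (Yogatama et al., 2017; Williams et al., 2017): Training data → Parser → arg max → Downstream task → Loss $L$; parser and downstream task both receive $\nabla L$.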

40 Outline. Background: structured prediction as linear programs. Method: SPIGOT algorithm. Experiments.

41 Experiments: Syntactic-then-semantic Parsing. Input → Syntactic Parser → arg max → Syntactic tree → Semantic Parser → Semantic graph (arc labels arg1, arg2, poss).

42 Experiments: Syntactic-then-semantic Parsing. Input → Syntactic Parser (BiLSTM + MLP; Kiperwasser and Goldberg, 2016) → arg max (Eisner algorithm; Eisner, 1996) → Syntactic tree → Semantic Parser → Semantic graph (arc labels arg1, arg2, poss).

43 Experiments: Syntactic-then-semantic Parsing. Input → Syntactic Parser (BiLSTM + MLP; Kiperwasser and Goldberg, 2016) → arg max (Eisner algorithm; Eisner, 1996) → Syntactic tree (with root) → concatenate head token embedding → Semantic Parser (NeurboParser; Peng et al., 2017) → Semantic graph (arc labels arg1, arg2, poss).

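44 SemEval 15. Micro-averaged labeled F1, in-domain and out-of-domain. [Bar chart: F1 (axis ticks 86, 88) for Neurbo, Pipeline, STE, Structured Att., and SPIGOT; bar values not recoverable.] [Table labels: syntax, backprop, hard decision, projection $\hat{z}$; three cells are N/A; other entries not recoverable.] Neurbo: Peng et al., 2017.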

48 Semantic Parsing for Sentiment Classification. Input → Semantic Parser → arg max → Semantic graph (arc labels arg1, arg2, poss) → Classifier → Positive? Negative?

49 Semantic Parsing for Sentiment Classification. Input → Semantic Parser (NeurboParser, Peng et al., 2017; BiLSTM + MLP) → arg max (AD³; Martins et al., 2011) → Semantic graph (arc labels :arg1, :arg2, :poss) → concatenate head token and role → Classifier → Positive? Negative?

50 Stanford Sentiment Treebank accuracy. [Bar chart: accuracy for BiLSTM, Pipeline, STE, and SPIGOT; bar values not recoverable.]

51 Conclusion Problem

52 Conclusion Problem Method SPIGOT

53 Conclusion Problem Method Results SPIGOT

54 Thank you!

Transition-Based Dependency Parsing with Stack Long Short-Term Memory

Transition-Based Dependency Parsing with Stack Long Short-Term Memory Transition-Based Dependency Parsing with Stack Long Short-Term Memory Chris Dyer, Miguel Ballesteros, Wang Ling, Austin Matthews, Noah A. Smith Association for Computational Linguistics (ACL), 2015 Presented

More information

Transition-based dependency parsing

Transition-based dependency parsing Transition-based dependency parsing Syntactic analysis (5LN455) 2014-12-18 Sara Stymne Department of Linguistics and Philology Based on slides from Marco Kuhlmann Overview Arc-factored dependency parsing

More information

Learning Latent Linguistic Structure to Optimize End Tasks. David A. Smith with Jason Naradowsky and Xiaoye Tiger Wu

Learning Latent Linguistic Structure to Optimize End Tasks. David A. Smith with Jason Naradowsky and Xiaoye Tiger Wu Learning Latent Linguistic Structure to Optimize End Tasks David A. Smith with Jason Naradowsky and Xiaoye Tiger Wu 12 October 2012 Learning Latent Linguistic Structure to Optimize End Tasks David A. Smith

More information

Context Encoding LSTM CS224N Course Project

Context Encoding LSTM CS224N Course Project Context Encoding LSTM CS224N Course Project Abhinav Rastogi arastogi@stanford.edu Supervised by - Samuel R. Bowman December 7, 2015 Abstract This project uses ideas from greedy transition based parsing

More information

Structured Attention Networks

Structured Attention Networks Structured Attention Networks Yoon Kim Carl Denton Luong Hoang Alexander M. Rush HarvardNLP 1 Deep Neural Networks for Text Processing and Generation 2 Attention Networks 3 Structured Attention Networks

More information

Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank text

Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank text Philosophische Fakultät Seminar für Sprachwissenschaft Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank text 06 July 2017, Patricia Fischer & Neele Witte Overview Sentiment

More information

Accurate Parsing and Beyond. Yoav Goldberg Bar Ilan University

Accurate Parsing and Beyond. Yoav Goldberg Bar Ilan University Accurate Parsing and Beyond Yoav Goldberg Bar Ilan University Syntactic Parsing subj root rcmod rel xcomp det subj aux acomp acomp The soup, which I expected to be good, was bad subj root rcmod rel xcomp

More information

Neural Symbolic Machines: Learning Semantic Parsers on Freebase with Weak Supervision

Neural Symbolic Machines: Learning Semantic Parsers on Freebase with Weak Supervision Neural Symbolic Machines: Learning Semantic Parsers on Freebase with Weak Supervision Anonymized for review Abstract Extending the success of deep neural networks to high level tasks like natural language

More information

Semantics as a Foreign Language. Gabriel Stanovsky and Ido Dagan EMNLP 2018

Semantics as a Foreign Language. Gabriel Stanovsky and Ido Dagan EMNLP 2018 Semantics as a Foreign Language Gabriel Stanovsky and Ido Dagan EMNLP 2018 Semantic Dependency Parsing (SDP) A collection of three semantic formalisms (Oepen et al., 2014;2015) Semantic Dependency Parsing

More information

Learning to Match. Jun Xu, Zhengdong Lu, Tianqi Chen, Hang Li

Learning to Match. Jun Xu, Zhengdong Lu, Tianqi Chen, Hang Li Learning to Match Jun Xu, Zhengdong Lu, Tianqi Chen, Hang Li 1. Introduction The main tasks in many applications can be formalized as matching between heterogeneous objects, including search, recommendation,

More information

Structured Attention Networks

Structured Attention Networks Structured Attention Networks Yoon Kim Carl Denton Luong Hoang Alexander M. Rush HarvardNLP ICLR, 2017 Presenter: Chao Jiang ICLR, 2017 Presenter: Chao Jiang 1 / Outline 1 Deep Neutral Networks for Text

More information

Incremental Integer Linear Programming for Non-projective Dependency Parsing

Incremental Integer Linear Programming for Non-projective Dependency Parsing Incremental Integer Linear Programming for Non-projective Dependency Parsing Sebastian Riedel James Clarke ICCS, University of Edinburgh 22. July 2006 EMNLP 2006 S. Riedel, J. Clarke (ICCS, Edinburgh)

More information

Let s get parsing! Each component processes the Doc object, then passes it on. doc.is_parsed attribute checks whether a Doc object has been parsed

Let s get parsing! Each component processes the Doc object, then passes it on. doc.is_parsed attribute checks whether a Doc object has been parsed Let s get parsing! SpaCy default model includes tagger, parser and entity recognizer nlp = spacy.load('en ) tells spacy to use "en" with ["tagger", "parser", "ner"] Each component processes the Doc object,

More information

Online Graph Planarisation for Synchronous Parsing of Semantic and Syntactic Dependencies

Online Graph Planarisation for Synchronous Parsing of Semantic and Syntactic Dependencies Online Graph Planarisation for Synchronous Parsing of Semantic and Syntactic Dependencies Ivan Titov University of Illinois at Urbana-Champaign James Henderson, Paola Merlo, Gabriele Musillo University

More information

27: Hybrid Graphical Models and Neural Networks

27: Hybrid Graphical Models and Neural Networks 10-708: Probabilistic Graphical Models 10-708 Spring 2016 27: Hybrid Graphical Models and Neural Networks Lecturer: Matt Gormley Scribes: Jakob Bauer Otilia Stretcu Rohan Varma 1 Motivation We first look

More information

Dependency grammar and dependency parsing

Dependency grammar and dependency parsing Dependency grammar and dependency parsing Syntactic analysis (5LN455) 2014-12-10 Sara Stymne Department of Linguistics and Philology Based on slides from Marco Kuhlmann Mid-course evaluation Mostly positive

More information

Dependency Parsing. Allan Jie. February 20, Slides: Allan Jie Dependency Parsing February 20, / 16

Dependency Parsing. Allan Jie. February 20, Slides:   Allan Jie Dependency Parsing February 20, / 16 Dependency Parsing Allan Jie February 20, 2016 Slides: http://www.statnlp.org/dp.html Allan Jie Dependency Parsing February 20, 2016 1 / 16 Table of Contents 1 Dependency Labeled/Unlabeled Dependency Projective/Non-projective

More information

EECS 496 Statistical Language Models. Winter 2018

EECS 496 Statistical Language Models. Winter 2018 EECS 496 Statistical Language Models Winter 2018 Introductions Professor: Doug Downey Course web site: www.cs.northwestern.edu/~ddowney/courses/496_winter2018 (linked off prof. home page) Logistics Grading

More information

TTIC 31190: Natural Language Processing

TTIC 31190: Natural Language Processing TTIC 31190: Natural Language Processing Kevin Gimpel Winter 2016 Lecture 2: Text Classification 1 Please email me (kgimpel@ttic.edu) with the following: your name your email address whether you taking

More information

S-MART: Novel Tree-based Structured Learning Algorithms Applied to Tweet Entity Linking

S-MART: Novel Tree-based Structured Learning Algorithms Applied to Tweet Entity Linking S-MART: Novel Tree-based Structured Learning Algorithms Applied to Tweet Entity Linking Yi Yang * and Ming-Wei Chang # * Georgia Institute of Technology, Atlanta # Microsoft Research, Redmond Traditional

More information

Grounded Compositional Semantics for Finding and Describing Images with Sentences

Grounded Compositional Semantics for Finding and Describing Images with Sentences Grounded Compositional Semantics for Finding and Describing Images with Sentences R. Socher, A. Karpathy, V. Le,D. Manning, A Y. Ng - 2013 Ali Gharaee 1 Alireza Keshavarzi 2 1 Department of Computational

More information

Parsing with Dynamic Programming

Parsing with Dynamic Programming CS11-747 Neural Networks for NLP Parsing with Dynamic Programming Graham Neubig Site https://phontron.com/class/nn4nlp2017/ Two Types of Linguistic Structure Dependency: focus on relations between words

More information

Dependency Parsing 2 CMSC 723 / LING 723 / INST 725. Marine Carpuat. Fig credits: Joakim Nivre, Dan Jurafsky & James Martin

Dependency Parsing 2 CMSC 723 / LING 723 / INST 725. Marine Carpuat. Fig credits: Joakim Nivre, Dan Jurafsky & James Martin Dependency Parsing 2 CMSC 723 / LING 723 / INST 725 Marine Carpuat Fig credits: Joakim Nivre, Dan Jurafsky & James Martin Dependency Parsing Formalizing dependency trees Transition-based dependency parsing

More information

AT&T: The Tag&Parse Approach to Semantic Parsing of Robot Spatial Commands

AT&T: The Tag&Parse Approach to Semantic Parsing of Robot Spatial Commands AT&T: The Tag&Parse Approach to Semantic Parsing of Robot Spatial Commands Svetlana Stoyanchev, Hyuckchul Jung, John Chen, Srinivas Bangalore AT&T Labs Research 1 AT&T Way Bedminster NJ 07921 {sveta,hjung,jchen,srini}@research.att.com

More information

Dependency Parsing CMSC 723 / LING 723 / INST 725. Marine Carpuat. Fig credits: Joakim Nivre, Dan Jurafsky & James Martin

Dependency Parsing CMSC 723 / LING 723 / INST 725. Marine Carpuat. Fig credits: Joakim Nivre, Dan Jurafsky & James Martin Dependency Parsing CMSC 723 / LING 723 / INST 725 Marine Carpuat Fig credits: Joakim Nivre, Dan Jurafsky & James Martin Dependency Parsing Formalizing dependency trees Transition-based dependency parsing

More information

Recurrent Neural Networks. Nand Kishore, Audrey Huang, Rohan Batra

Recurrent Neural Networks. Nand Kishore, Audrey Huang, Rohan Batra Recurrent Neural Networks Nand Kishore, Audrey Huang, Rohan Batra Roadmap Issues Motivation 1 Application 1: Sequence Level Training 2 Basic Structure 3 4 Variations 5 Application 3: Image Classification

More information

LSTM for Language Translation and Image Captioning. Tel Aviv University Deep Learning Seminar Oran Gafni & Noa Yedidia

LSTM for Language Translation and Image Captioning. Tel Aviv University Deep Learning Seminar Oran Gafni & Noa Yedidia 1 LSTM for Language Translation and Image Captioning Tel Aviv University Deep Learning Seminar Oran Gafni & Noa Yedidia 2 Part I LSTM for Language Translation Motivation Background (RNNs, LSTMs) Model

More information

LSTM and its variants for visual recognition. Xiaodan Liang Sun Yat-sen University

LSTM and its variants for visual recognition. Xiaodan Liang Sun Yat-sen University LSTM and its variants for visual recognition Xiaodan Liang xdliang328@gmail.com Sun Yat-sen University Outline Context Modelling with CNN LSTM and its Variants LSTM Architecture Variants Application in

More information

Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

Show, Attend and Tell: Neural Image Caption Generation with Visual Attention Show, Attend and Tell: Neural Image Caption Generation with Visual Attention Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhutdinov, Richard Zemel, Yoshua Bengio Presented

More information

DeepWalk: Online Learning of Social Representations

DeepWalk: Online Learning of Social Representations DeepWalk: Online Learning of Social Representations ACM SIG-KDD August 26, 2014, Rami Al-Rfou, Steven Skiena Stony Brook University Outline Introduction: Graphs as Features Language Modeling DeepWalk Evaluation:

More information

Dependency grammar and dependency parsing

Dependency grammar and dependency parsing Dependency grammar and dependency parsing Syntactic analysis (5LN455) 2015-12-09 Sara Stymne Department of Linguistics and Philology Based on slides from Marco Kuhlmann Activities - dependency parsing

More information

Unsupervised Learning

Unsupervised Learning Deep Learning for Graphics Unsupervised Learning Niloy Mitra Iasonas Kokkinos Paul Guerrero Vladimir Kim Kostas Rematas Tobias Ritschel UCL UCL/Facebook UCL Adobe Research U Washington UCL Timetable Niloy

More information

Guiding Semi-Supervision with Constraint-Driven Learning

Guiding Semi-Supervision with Constraint-Driven Learning Guiding Semi-Supervision with Constraint-Driven Learning Ming-Wei Chang 1 Lev Ratinov 2 Dan Roth 3 1 Department of Computer Science University of Illinois at Urbana-Champaign Paper presentation by: Drew

More information

Compiler Design (40-414)

Compiler Design (40-414) Compiler Design (40-414) Main Text Book: Compilers: Principles, Techniques & Tools, 2 nd ed., Aho, Lam, Sethi, and Ullman, 2007 Evaluation: Midterm Exam 35% Final Exam 35% Assignments and Quizzes 10% Project

More information

NLP in practice, an example: Semantic Role Labeling

NLP in practice, an example: Semantic Role Labeling NLP in practice, an example: Semantic Role Labeling Anders Björkelund Lund University, Dept. of Computer Science anders.bjorkelund@cs.lth.se October 15, 2010 Anders Björkelund NLP in practice, an example:

More information

Dependency grammar and dependency parsing

Dependency grammar and dependency parsing Dependency grammar and dependency parsing Syntactic analysis (5LN455) 2016-12-05 Sara Stymne Department of Linguistics and Philology Based on slides from Marco Kuhlmann Activities - dependency parsing

More information

Natural Language Processing with Deep Learning CS224N/Ling284

Natural Language Processing with Deep Learning CS224N/Ling284 Natural Language Processing with Deep Learning CS224N/Ling284 Lecture 8: Recurrent Neural Networks Christopher Manning and Richard Socher Organization Extra project office hour today after lecture Overview

More information

Conditional Random Fields as Recurrent Neural Networks

Conditional Random Fields as Recurrent Neural Networks BIL722 - Deep Learning for Computer Vision Conditional Random Fields as Recurrent Neural Networks S. Zheng, S. Jayasumana, B. Romera-Paredes V. Vineet, Z. Su, D. Du, C. Huang, P.H.S. Torr Introduction

More information

Fully Delexicalized Contexts for Syntax-Based Word Embeddings

Fully Delexicalized Contexts for Syntax-Based Word Embeddings Fully Delexicalized Contexts for Syntax-Based Word Embeddings Jenna Kanerva¹, Sampo Pyysalo² and Filip Ginter¹ ¹Dept of IT - University of Turku, Finland ²Lang. Tech. Lab - University of Cambridge turkunlp.github.io

More information

Collins and Eisner s algorithms

Collins and Eisner s algorithms Collins and Eisner s algorithms Syntactic analysis (5LN455) 2015-12-14 Sara Stymne Department of Linguistics and Philology Based on slides from Marco Kuhlmann Recap: Dependency trees dobj subj det pmod

More information

A Quick Guide to MaltParser Optimization

A Quick Guide to MaltParser Optimization A Quick Guide to MaltParser Optimization Joakim Nivre Johan Hall 1 Introduction MaltParser is a system for data-driven dependency parsing, which can be used to induce a parsing model from treebank data

More information

Transition-based Parsing with Neural Nets

Transition-based Parsing with Neural Nets CS11-747 Neural Networks for NLP Transition-based Parsing with Neural Nets Graham Neubig Site https://phontron.com/class/nn4nlp2017/ Two Types of Linguistic Structure Dependency: focus on relations between

More information

Transition-Based Dependency Parsing with MaltParser

Transition-Based Dependency Parsing with MaltParser Transition-Based Dependency Parsing with MaltParser Joakim Nivre Uppsala University and Växjö University Transition-Based Dependency Parsing 1(13) Introduction Outline Goals of the workshop Transition-based

More information

2068 (I) Attempt all questions.

2068 (I) Attempt all questions. 2068 (I) 1. What do you mean by compiler? How source program analyzed? Explain in brief. 2. Discuss the role of symbol table in compiler design. 3. Convert the regular expression 0 + (1 + 0)* 00 first

More information

Learning with Probabilistic Features for Improved Pipeline Models

Learning with Probabilistic Features for Improved Pipeline Models Learning with Probabilistic Features for Improved Pipeline Models Razvan C. Bunescu School of EECS Ohio University Athens, OH 45701 bunescu@ohio.edu Abstract We present a novel learning framework for pipeline

More information

Projective Dependency Parsing with Perceptron

Projective Dependency Parsing with Perceptron Projective Dependency Parsing with Perceptron Xavier Carreras, Mihai Surdeanu, and Lluís Màrquez Technical University of Catalonia {carreras,surdeanu,lluism}@lsi.upc.edu 8th June 2006 Outline Introduction

More information

Development in Object Detection. Junyuan Lin May 4th

Development in Object Detection. Junyuan Lin May 4th Development in Object Detection Junyuan Lin May 4th Line of Research [1] N. Dalal and B. Triggs. Histograms of oriented gradients for human detection, CVPR 2005. HOG Feature template [2] P. Felzenszwalb,

More information

CS395T Project 2: Shift-Reduce Parsing

CS395T Project 2: Shift-Reduce Parsing CS395T Project 2: Shift-Reduce Parsing Due date: Tuesday, October 17 at 9:30am In this project you ll implement a shift-reduce parser. First you ll implement a greedy model, then you ll extend that model

More information

Stack- propaga+on: Improved Representa+on Learning for Syntax

Stack- propaga+on: Improved Representa+on Learning for Syntax Stack- propaga+on: Improved Representa+on Learning for Syntax Yuan Zhang, David Weiss MIT, Google 1 Transi+on- based Neural Network Parser p(action configuration) So1max Hidden Embedding words labels POS

More information

Compiler Design Overview. Compiler Design 1

Compiler Design Overview. Compiler Design 1 Compiler Design Overview Compiler Design 1 Preliminaries Required Basic knowledge of programming languages. Basic knowledge of FSA and CFG. Knowledge of a high programming language for the programming

More information

SEMANTIC COMPUTING. Lecture 8: Introduction to Deep Learning. TU Dresden, 7 December Dagmar Gromann International Center For Computational Logic

SEMANTIC COMPUTING. Lecture 8: Introduction to Deep Learning. TU Dresden, 7 December Dagmar Gromann International Center For Computational Logic SEMANTIC COMPUTING Lecture 8: Introduction to Deep Learning Dagmar Gromann International Center For Computational Logic TU Dresden, 7 December 2018 Overview Introduction Deep Learning General Neural Networks

More information

Show, Discriminate, and Tell: A Discriminatory Image Captioning Model with Deep Neural Networks

Show, Discriminate, and Tell: A Discriminatory Image Captioning Model with Deep Neural Networks Show, Discriminate, and Tell: A Discriminatory Image Captioning Model with Deep Neural Networks Zelun Luo Department of Computer Science Stanford University zelunluo@stanford.edu Te-Lin Wu Department of

More information

Semantic Dependency Graph Parsing Using Tree Approximations

Semantic Dependency Graph Parsing Using Tree Approximations Semantic Dependency Graph Parsing Using Tree Approximations Željko Agić Alexander Koller Stephan Oepen Center for Language Technology, University of Copenhagen Department of Linguistics, University of

More information

Dependency Parsing. Ganesh Bhosale Neelamadhav G Nilesh Bhosale Pranav Jawale under the guidance of

Dependency Parsing. Ganesh Bhosale Neelamadhav G Nilesh Bhosale Pranav Jawale under the guidance of Dependency Parsing Ganesh Bhosale - 09305034 Neelamadhav G. - 09305045 Nilesh Bhosale - 09305070 Pranav Jawale - 09307606 under the guidance of Prof. Pushpak Bhattacharyya Department of Computer Science

More information

CSX-lite Example. LL(1) Parse Tables. LL(1) Parser Driver. Example of LL(1) Parsing. An LL(1) parse table, T, is a twodimensional

CSX-lite Example. LL(1) Parse Tables. LL(1) Parser Driver. Example of LL(1) Parsing. An LL(1) parse table, T, is a twodimensional LL(1) Parse Tables CSX-lite Example An LL(1) parse table, T, is a twodimensional array. Entries in T are production numbers or blank (error) entries. T is indexed by: A, a non-terminal. A is the nonterminal

More information

Annotating Spatio-Temporal Information in Documents

Annotating Spatio-Temporal Information in Documents Annotating Spatio-Temporal Information in Documents Jannik Strötgen University of Heidelberg Institute of Computer Science Database Systems Research Group http://dbs.ifi.uni-heidelberg.de stroetgen@uni-hd.de

More information

Neural Transition Based Parsing of Web Queries: An Entity Based Approach

Neural Transition Based Parsing of Web Queries: An Entity Based Approach Neural Transition Based Parsing of Web Queries: An Entity Based Approach Rivka Malca and Roi Reichart Technion, Israel Institute of Technology srikim@st.technion.ac.il, roiri@technion.ac.il Abstract Web

More information

Graph neural networks February Visin Francesco

Graph neural networks February Visin Francesco Graph neural networks February 2018 Visin Francesco Who I am Outline Motivation and examples Graph nets (Semi)-formal definition Interaction network Relation network Gated graph sequence neural network

More information

Computationally Efficient M-Estimation of Log-Linear Structure Models

Computationally Efficient M-Estimation of Log-Linear Structure Models Computationally Efficient M-Estimation of Log-Linear Structure Models Noah Smith, Doug Vail, and John Lafferty School of Computer Science Carnegie Mellon University {nasmith,dvail2,lafferty}@cs.cmu.edu

More information

Gradient of the lower bound

Gradient of the lower bound Weakly Supervised with Latent PhD advisor: Dr. Ambedkar Dukkipati Department of Computer Science and Automation gaurav.pandey@csa.iisc.ernet.in Objective Given a training set that comprises image and image-level

More information

Large-Scale Syntactic Processing: Parsing the Web. JHU 2009 Summer Research Workshop

Large-Scale Syntactic Processing: Parsing the Web. JHU 2009 Summer Research Workshop Large-Scale Syntactic Processing: JHU 2009 Summer Research Workshop Intro CCG parser Tasks 2 The Team Stephen Clark (Cambridge, UK) Ann Copestake (Cambridge, UK) James Curran (Sydney, Australia) Byung-Gyu

More information

End-To-End Spam Classification With Neural Networks

End-To-End Spam Classification With Neural Networks End-To-End Spam Classification With Neural Networks Christopher Lennan, Bastian Naber, Jan Reher, Leon Weber 1 Introduction A few years ago, the majority of the internet s network traffic was due to spam

More information

Deep Learning. Volker Tresp Summer 2014

Deep Learning. Volker Tresp Summer 2014 Deep Learning Volker Tresp Summer 2014 1 Neural Network Winter and Revival While Machine Learning was flourishing, there was a Neural Network winter (late 1990 s until late 2000 s) Around 2010 there

More information

Topics in Parsing: Context and Markovization; Dependency Parsing. COMP-599 Oct 17, 2016

Topics in Parsing: Context and Markovization; Dependency Parsing. COMP-599 Oct 17, 2016 Topics in Parsing: Context and Markovization; Dependency Parsing COMP-599 Oct 17, 2016 Outline Review Incorporating context Markovization Learning the context Dependency parsing Eisner s algorithm 2 Review

More information

Recurrent Convolutional Neural Networks for Scene Labeling

Recurrent Convolutional Neural Networks for Scene Labeling Recurrent Convolutional Neural Networks for Scene Labeling Pedro O. Pinheiro, Ronan Collobert Reviewed by Yizhe Zhang August 14, 2015 Scene labeling task Scene labeling: assign a class label to each pixel

More information

Slide credit from Hung-Yi Lee & Richard Socher

Slide credit from Hung-Yi Lee & Richard Socher Slide credit from Hung-Yi Lee & Richard Socher 1 Review Word Vector 2 Word2Vec Variants Skip-gram: predicting surrounding words given the target word (Mikolov+, 2013) CBOW (continuous bag-of-words): predicting

More information

CAP 6412 Advanced Computer Vision

CAP 6412 Advanced Computer Vision CAP 6412 Advanced Computer Vision http://www.cs.ucf.edu/~bgong/cap6412.html Boqing Gong April 5th, 2016 Today Administrivia LSTM Attribute in computer vision, by Abdullah and Samer Project II posted, due

More information

Bayesian model ensembling using meta-trained recurrent neural networks

Bayesian model ensembling using meta-trained recurrent neural networks Bayesian model ensembling using meta-trained recurrent neural networks Luca Ambrogioni l.ambrogioni@donders.ru.nl Umut Güçlü u.guclu@donders.ru.nl Yağmur Güçlütürk y.gucluturk@donders.ru.nl Julia Berezutskaya

More information

May I Have Your Attention Please? (said one neuron to another)

Ani Kembhavi Allen Institute for Artificial Intelligence The world of visual illustrations and many more Science Diagrams Maps 3d visualizations

DRAGNN: A TRANSITION-BASED FRAMEWORK FOR DYNAMICALLY CONNECTED NEURAL NETWORKS

Lingpeng Kong Carnegie Mellon University Pittsburgh, PA lingpenk@cs.cmu.edu Chris Alberti Daniel Andor Ivan Bogatyy David

Domain-specific Concept-based Information Retrieval System

L. Shen 1, Y. K. Lim 1, H. T. Loh 2 1 Design Technology Institute Ltd, National University of Singapore, Singapore 2 Department of Mechanical

Densely Connected Bidirectional LSTM with Applications to Sentence Classification

Zixiang Ding 1, Rui Xia 1(B), Jianfei Yu 2, Xiang Li 1, and Jian Yang 1 1 School of Computer Science and Engineering, Nanjing

3D Deep Learning on Geometric Forms. Hao Su

Many 3D representations are available Candidates: multi-view images depth map volumetric polygonal mesh point cloud primitive-based CAD models 3D representation

Alternatives to Direct Supervision

CreativeAI: Deep Learning for Graphics Niloy Mitra Iasonas Kokkinos Paul Guerrero Nils Thuerey Tobias Ritschel UCL UCL UCL TUM UCL Timetable Theory and Basics State of

Assignment 4 CSE 517: Natural Language Processing

University of Washington Winter 2016 Due: March 2, 2016, 1:30 pm 1 HMMs and PCFGs Here's the definition of a PCFG given in class on 2/17: A finite set

08 An Introduction to Dense Continuous Robotic Mapping

NAVARCH/EECS 568, ROB 530 - Winter 2018 Maani Ghaffari March 14, 2018 Previously: Occupancy Grid Maps Pose SLAM graph and its associated dense occupancy

CS5670: Computer Vision

Noah Snavely Lecture 33: Recognition Basics Slides from Andrej Karpathy and Fei-Fei Li http://vision.stanford.edu/teaching/cs231n/ Announcements Quiz moved to Tuesday Project 4

Encoding RNNs, 48 End of sentence (EOS) token, 207 Exploding gradient, 131 Exponential function, 42 Exponential Linear Unit (ELU), 44

A Activation potential, 40 Annotated corpus add padding, 162 check versions, 158 create checkpoints, 164, 166 create input, 160 create train and validation datasets, 163 dropout, 163 DRUG-AE.rel file,

Compiling Regular Expressions COMP360

Logic is the beginning of wisdom, not the end. Leonard Nimoy Compiler's Purpose The compiler converts the program source code into a form that can be executed by the

The Perceptron. Simon Šuster, University of Groningen. Course Learning from data November 18, 2013

References Hal Daumé III: A Course in Machine Learning http://ciml.info Tom M. Mitchell: Machine Learning

Sequence Modeling: Recurrent and Recursive Nets. By Pyry Takala 14 Oct 2015

Agenda Why Recurrent neural networks? Anatomy and basic training of an RNN (10.2, 10.2.1) Properties of RNNs (10.2.2, 8.2.6) Using

Introduction to Data-Driven Dependency Parsing

Introductory Course, ESSLLI 2007 Ryan McDonald 1 Joakim Nivre 2 1 Google Inc., New York, USA E-mail: ryanmcd@google.com 2 Uppsala University and Växjö University,

Empirical Evaluation of RNN Architectures on Sentence Classification Task

Lei Shen, Junlin Zhang Chanjet Information Technology lorashen@126.com, zhangjlh@chanjet.com Abstract. Recurrent Neural Networks

Large Scale Chinese News Categorization. Peng Wang. Joint work with H. Zhang, B. Xu, H.W. Hao

Large Scale Chinese News Categorization --based on Improved Feature Selection Method Computational-Brain Research Center Institute of Automation, Chinese

COP 3402 Systems Software Top Down Parsing (Recursive Descent)

Top Down Parsing 1 Outline 1. Top down parsing and LL(k) parsing 2. Recursive descent parsing 3. Example of recursive descent parsing of arithmetic

Structured Prediction Basics

CS11-747 Neural Networks for NLP Graham Neubig Site https://phontron.com/class/nn4nlp2017/ A Prediction Problem I hate this movie I love this movie very good good neutral bad

Part-Based Models for Object Class Recognition Part 3

High Level Computer Vision Bernt Schiele - schiele@mpi-inf.mpg.de Mario Fritz - mfritz@mpi-inf.mpg.de http://www.d2.mpi-inf.mpg.de/cv State-of-the-Art

Domain-Aware Sentiment Classification with GRUs and CNNs

Guangyuan Piao 1(B) and John G. Breslin 2 1 Insight Centre for Data Analytics, Data Science Institute, National University of Ireland Galway, Galway,

CS229 Final Project Sentiment Analysis of Tweets: Baselines and Neural Network Models

Kai Sheng Tai (advised by Richard Socher) (Dated: December 13, 2013) Social media sites such as Twitter are a rich

Refresher on Dependency Syntax and the Nivre Algorithm

Richard Johansson 1 Introduction This document gives more details about some important topics that are discussed very quickly during lecture: dependency

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

Administrative Any questions about the syllabus? Course Material available at www.cs.unic.ac.cy/ioanna Next time reading assignment [ALSU07]

FastText. Jon Koss, Abhishek Jindal

FastText is on par with state-of-the-art deep learning classifiers in terms of accuracy But it is way faster: FastText can train on more than one billion words

Final Project Discussion. Adam Meyers Montclair State University

Summary Project Timeline Project Format Details/Examples for Different Project Types Linguistic Resource Projects: Annotation, Lexicons,...

NLP Chain. Giuseppe Castellucci Web Mining & Retrieval a.a. 2013/2014

Giuseppe Castellucci castellucci@ing.uniroma2.it Outline NLP chains RevNLT Exercise NLP chain Automatic analysis of texts At different levels Token Morphological

Context-Free Grammar. Concepts Introduced in Chapter 2. Parse Trees. Example Grammar and Derivation

A more detailed overview of the compilation process. Parsing Scanning Semantic Analysis Syntax-Directed Translation Intermediate Code Generation Context-Free Grammar A

CS395T paper review. Indoor Segmentation and Support Inference from RGBD Images. Chao Jia Sep

Chao Jia Sep 28 2012 Introduction What do we want -- Indoor scene parsing Segmentation and labeling Support relationships

Sentiment Classification of Food Reviews

Hua Feng Department of Electrical Engineering Stanford University Stanford, CA 94305 fengh15@stanford.edu Ruixi Lin Department of Electrical Engineering Stanford

Fast(er) Exact Decoding and Global Training for Transition-Based Dependency Parsing via a Minimal Feature Set

Tianze Shi* Liang Huang Lillian Lee* * Cornell University Oregon State University O n 3 O n

A Dependency Parser for Tweets. Lingpeng Kong, Nathan Schneider, Swabha Swayamdipta, Archna Bhatia, Chris Dyer, and Noah A. Smith

NLP for Social Media Boom! Ya ur website suxx bro @SarahKSilverman michelle

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

Shaoqing Ren Kaiming He Ross Girshick Jian Sun Present by: Yixin Yang Mingdong Wang 1 Object Detection 2 1 Applications Basic