Non-Projective Dependency Parsing with Non-Local Transitions

Daniel Fernández-González and Carlos Gómez-Rodríguez
Universidade da Coruña
FASTPARSE Lab, LyS Research Group, Departamento de Computación
Campus de Elviña, s/n, A Coruña, Spain
d.fgonzalez@udc.es, carlos.gomez@udc.es

Abstract

We present a novel transition system, based on the Covington non-projective parser, introducing non-local transitions that can directly create arcs involving nodes to the left of the current focus positions. This avoids the need for long sequences of No-Arc transitions to create long-distance arcs, thus alleviating error propagation. The resulting parser outperforms the original version and achieves the best accuracy on the Stanford Dependencies conversion of the Penn Treebank among greedy transition-based parsers.

1 Introduction

Greedy transition-based parsers are popular in NLP, as they provide competitive accuracy with high efficiency. They syntactically analyze a sentence by greedily applying transitions, which read it from left to right and produce a dependency tree. However, this greedy process is prone to error propagation: one wrong choice of transition can lead the parser to an erroneous state, causing more incorrect decisions. This is especially crucial for long attachments requiring a larger number of transitions. In addition, transition-based parsers traditionally focus on only two words of the sentence and their local context to choose the next transition. The lack of a global perspective favors the presence of errors when creating arcs involving multiple transitions. As expected, transition-based parsers build short arcs more accurately than long ones (McDonald and Nivre, 2007).

Previous research such as (Fernández-González and Gómez-Rodríguez, 2012) and (Qi and Manning, 2017) proves that the widely-used projective arc-eager transition-based parser of Nivre (2003) benefits from shortening the length of transition sequences by creating non-local attachments.
In particular, they augmented the original transition system with new actions whose behavior entails more than one arc-eager transition and involves a context beyond the traditional two focus words. Attardi (2006) and Sartorio et al. (2013) also extended the arc-standard transition-based algorithm (Nivre, 2004) with the same success.

In the same vein, we present a novel unrestricted non-projective transition system based on the well-known algorithm by Covington (2001). It shortens the transition sequence that the original algorithm needs to parse a given sentence, whose length becomes linear instead of quadratic with respect to sentence length. To achieve that, we propose new transitions that affect non-local words and are equivalent to one or more Covington actions, in a similar way to the transitions defined by Qi and Manning (2017) based on the arc-eager parser. Experiments show that this novel variant significantly outperforms the original one in all datasets tested, and achieves the best reported accuracy for a greedy dependency parser on the Stanford Dependencies conversion of the WSJ Penn Treebank.

Proceedings of NAACL-HLT 2018, New Orleans, Louisiana, June 1-6. © 2018 Association for Computational Linguistics

2 Non-Projective Covington Parser

The original non-projective parser defined by Covington (2001) was modelled under the transition-based parsing framework by Nivre (2008). We only sketch this transition system briefly for space reasons, and refer to (Nivre, 2008) for details.

Parser configurations have the form $c = \langle \lambda_1, \lambda_2, B, A \rangle$, where $\lambda_1$ and $\lambda_2$ are lists of partially processed words, $B$ is a list (called buffer) of unprocessed words, and $A$ is the set of dependency arcs built so far. Given an input string $w_1 \cdots w_n$, the parser starts at the initial configuration $c_s(w_1 \ldots w_n) = \langle [\,], [\,], [1 \ldots n], \emptyset \rangle$ and runs transitions until a terminal configuration of the form $\langle \lambda_1, \lambda_2, [\,], A \rangle$ is reached: at that point, $A$ contains the dependency graph for the input. [1]

[1] Note that, in general, $A$ is a forest, but it can be converted to a tree by linking headless nodes as dependents of an artificial root node.

Covington:

Shift: $\langle \lambda_1, \lambda_2, j|B, A \rangle \Rightarrow \langle \lambda_1 \cdot \lambda_2 | j, [\,], B, A \rangle$

No-Arc: $\langle \lambda_1|i, \lambda_2, B, A \rangle \Rightarrow \langle \lambda_1, i|\lambda_2, B, A \rangle$

Left-Arc: $\langle \lambda_1|i, \lambda_2, j|B, A \rangle \Rightarrow \langle \lambda_1, i|\lambda_2, j|B, A \cup \{j \to i\} \rangle$
only if $\nexists x \mid x \to i \in A$ (single-head) and $i \to^{*} j \notin A$ (acyclicity).

Right-Arc: $\langle \lambda_1|i, \lambda_2, j|B, A \rangle \Rightarrow \langle \lambda_1, i|\lambda_2, j|B, A \cup \{i \to j\} \rangle$
only if $\nexists x \mid x \to j \in A$ (single-head) and $j \to^{*} i \notin A$ (acyclicity).

NL-Covington:

Shift: $\langle \lambda_1, \lambda_2, j|B, A \rangle \Rightarrow \langle \lambda_1 \cdot \lambda_2 | j, [\,], B, A \rangle$

Left-Arc$_k$: $\langle \lambda_1|i_k|\ldots|i_1, \lambda_2, j|B, A \rangle \Rightarrow \langle \lambda_1, i_k|\ldots|i_1|\lambda_2, j|B, A \cup \{j \to i_k\} \rangle$
only if $\nexists x \mid x \to i_k \in A$ (single-head) and $i_k \to^{*} j \notin A$ (acyclicity).

Right-Arc$_k$: $\langle \lambda_1|i_k|\ldots|i_1, \lambda_2, j|B, A \rangle \Rightarrow \langle \lambda_1, i_k|\ldots|i_1|\lambda_2, j|B, A \cup \{i_k \to j\} \rangle$
only if $\nexists x \mid x \to j \in A$ (single-head) and $j \to^{*} i_k \notin A$ (acyclicity).

Figure 1: Transitions of the non-projective Covington (top) and NL-Covington (bottom) dependency parsers. The notation $i \to^{*} j \in A$ means that there is a (possibly empty) directed path from $i$ to $j$ in $A$.

The set of transitions is shown in the top half of Figure 1. Their logic can be summarized as follows: when in a configuration of the form $\langle \lambda_1|i, \lambda_2, j|B, A \rangle$, the parser has the chance to create a dependency involving words $i$ and $j$, which we will call the left and right focus words of that configuration. The Left-Arc and Right-Arc transitions are used to create a leftward ($i \leftarrow j$) or rightward ($i \to j$) arc, respectively, between these words, and also move $i$ from $\lambda_1$ to the first position of $\lambda_2$, effectively moving the focus to $i - 1$ and $j$. If no dependency is desired between the focus words, the No-Arc transition makes the same modification of $\lambda_1$ and $\lambda_2$, but without building any arc. Finally, the Shift transition moves the whole content of the list $\lambda_2$ plus $j$ to $\lambda_1$ when no more attachments are pending between $j$ and the words of $\lambda_1$, thus reading a new input word and placing the focus on $j$ and $j + 1$.

Transitions that create arcs are disallowed in configurations where this would violate the single-head or acyclicity constraints (cycles and nodes with multiple heads are not allowed in the dependency graph). Figure 3 shows the transition sequence in the Covington transition system which derives the dependency graph in Figure 2. The resulting parser can generate arbitrary non-projective trees, and its complexity is $O(n^2)$.

Figure 2: Dependency tree for an input sentence.

Tran.  λ1                 λ2               Buffer               Arc
       [ ]                [ ]              [ 1, 2, 3, 4, 5 ]
SH     [ 1 ]              [ ]              [ 2, 3, 4, 5 ]
RA     [ ]                [ 1 ]            [ 2, 3, 4, 5 ]       1 → 2
SH     [ 1, 2 ]           [ ]              [ 3, 4, 5 ]
NA     [ 1 ]              [ 2 ]            [ 3, 4, 5 ]
RA     [ ]                [ 1, 2 ]         [ 3, 4, 5 ]          1 → 3
SH     [ 1, 2, 3 ]        [ ]              [ 4, 5 ]
SH     [ 1, 2, 3, 4 ]     [ ]              [ 5 ]
LA     [ 1, 2, 3 ]        [ 4 ]            [ 5 ]                4 ← 5
NA     [ 1, 2 ]           [ 3, 4 ]         [ 5 ]
NA     [ 1 ]              [ 2, 3, 4 ]      [ 5 ]
RA     [ ]                [ 1, 2, 3, 4 ]   [ 5 ]                1 → 5
SH     [ 1, 2, 3, 4, 5 ]  [ ]              [ ]

Figure 3: Transition sequence for parsing the sentence in Figure 2 using the Covington parser (LA=LEFT-ARC, RA=RIGHT-ARC, NA=NO-ARC, SH=SHIFT).

3 Non-Projective NL-Covington Parser

The original logic described by Covington (2001) parses a sentence by systematically traversing every pair of words. The Shift transition, introduced by Nivre (2008) in the transition-based version, is an optimization that avoids the need to apply a sequence of No-Arc transitions to empty the list $\lambda_1$ before reading a new input word.

However, there are still situations where sequences of No-Arc transitions are needed. For example, if we are in a configuration $C$ with focus words $i$ and $j$ and the next arc we need to create goes from $j$ to $i_k$ ($k > 1$), the $k$th word to the left of $j$ in $\lambda_1$ (with $i_1 = i$), then we will need $k - 1$ consecutive No-Arc transitions to move the left focus word to $i_k$ and then apply Left-Arc. This could be avoided if a non-local Left-Arc transition could be undertaken directly at $C$, creating the required arc and moving $k$ words to $\lambda_2$ at once. The advantage of such an approach would be twofold: (1) less risk of making a mistake at $C$ due to considering a limited local context, and (2) a shorter transition sequence, alleviating error propagation.

We present a novel transition system called NL-Covington (for "non-local Covington"), described in the bottom half of Figure 1. It modifies the non-projective Covington algorithm in two ways: (1) the Left-Arc and Right-Arc transitions are parameterized with $k$, allowing the immediate creation of any attachment between $j$ and the $k$th word of $\lambda_1$ counting leftward from its right end ($i_k$ in Figure 1) and moving $k$ words to $\lambda_2$ at once, and (2) the No-Arc transition is removed since it is no longer necessary.

This new transition system can use some restricted global information to build non-local dependencies and, consequently, reduce the number of transitions needed to parse the input. For instance, as presented in Figure 4, the NL-Covington parser will need 9 transitions, instead of 12 traditional Covington actions, to analyze the sentence in Figure 2.

In fact, in the standard Covington algorithm a transition sequence for a sentence of length $n$ has length $O(n^2)$ in the worst case (if all nodes are connected to the first node, then we need to traverse every node to the left of each right focus word). For NL-Covington the sequence length is always $O(n)$: one Shift transition for each of the $n$ words, plus one arc-building transition for each of the $n - 1$ arcs in the dependency tree. Note, however, that this does not affect the parser's time complexity, which is still quadratic, as in the original Covington parser.
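The transitions of Figure 1 can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the names `Config`, `run`, `to_covington` and `NL_FIG4` are assumptions introduced here, arcs are stored as (head, dependent) pairs, and the original Covington parser is modelled as the $k = 1$ case plus No-Arc.

```python
from dataclasses import dataclass, field


@dataclass
class Config:
    lam1: list   # partially processed words to the left of the focus
    lam2: list   # words already passed over for the current right focus word
    buf: list    # unread words; buf[0] is the right focus word j
    arcs: set = field(default_factory=set)   # (head, dependent) pairs


def has_path(arcs, src, dst):
    """True if there is a (possibly empty) directed path src ->* dst."""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(d for h, d in arcs if h == node)
    return False


def shift(c):
    c.lam1, c.lam2, c.buf = c.lam1 + c.lam2 + [c.buf[0]], [], c.buf[1:]


def no_arc(c):
    c.lam1, c.lam2 = c.lam1[:-1], c.lam1[-1:] + c.lam2


def arc(c, name, k=1):
    """Left-Arc_k ("LA") or Right-Arc_k ("RA"); k=1 is the local transition."""
    i_k, j = c.lam1[-k], c.buf[0]
    head, dep = (j, i_k) if name == "LA" else (i_k, j)
    assert all(d != dep for _, d in c.arcs), "single-head constraint"
    assert not has_path(c.arcs, dep, head), "acyclicity constraint"
    c.arcs.add((head, dep))
    c.lam1, c.lam2 = c.lam1[:-k], c.lam1[-k:] + c.lam2


def run(n, seq):
    """Apply a sequence of (name, k) transitions to a sentence of n words."""
    c = Config([], [], list(range(1, n + 1)))
    for name, k in seq:
        if name == "SH":
            shift(c)
        elif name == "NA":
            no_arc(c)
        else:
            arc(c, name, k)
    assert not c.buf, "sequence must end in a terminal configuration"
    return c.arcs


def to_covington(seq):
    """Expand each Left-Arc_k / Right-Arc_k into k-1 No-Arc plus a local arc."""
    out = []
    for name, k in seq:
        if name in ("LA", "RA"):
            out += [("NA", 1)] * (k - 1) + [(name, 1)]
        else:
            out.append((name, k))
    return out


# The NL-Covington sequence of Figure 4 (k is unused for Shift).
NL_FIG4 = [("SH", 0), ("RA", 1), ("SH", 0), ("RA", 2), ("SH", 0),
           ("SH", 0), ("LA", 1), ("RA", 3), ("SH", 0)]
```

Under these assumptions, `run(5, NL_FIG4)` and `run(5, to_covington(NL_FIG4))` build the same arc set, while the two sequences have 9 and 12 transitions respectively, matching the counts given for Figures 4 and 3.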
The time complexity remains quadratic because NL-Covington has $O(n)$ possible transitions to be scored at each configuration, while the original Covington parser has $O(1)$ transitions, as it is limited to creating local leftward or rightward arcs between the focus words.

The completeness and soundness of NL-Covington can easily be proved, as there is a mapping between transition sequences of both parsers: a sequence of $k - 1$ No-Arc transitions and one arc transition in Covington is equivalent to a Left-Arc$_k$ or Right-Arc$_k$ in NL-Covington.

Tran.  λ1                 λ2               Buffer               Arc
       [ ]                [ ]              [ 1, 2, 3, 4, 5 ]
SH     [ 1 ]              [ ]              [ 2, 3, 4, 5 ]
RA1    [ ]                [ 1 ]            [ 2, 3, 4, 5 ]       1 → 2
SH     [ 1, 2 ]           [ ]              [ 3, 4, 5 ]
RA2    [ ]                [ 1, 2 ]         [ 3, 4, 5 ]          1 → 3
SH     [ 1, 2, 3 ]        [ ]              [ 4, 5 ]
SH     [ 1, 2, 3, 4 ]     [ ]              [ 5 ]
LA1    [ 1, 2, 3 ]        [ 4 ]            [ 5 ]                4 ← 5
RA3    [ ]                [ 1, 2, 3, 4 ]   [ 5 ]                1 → 5
SH     [ 1, 2, 3, 4, 5 ]  [ ]              [ ]

Figure 4: Transition sequence for parsing the sentence in Figure 2 using the NL-Covington parser (LA=LEFT-ARC, RA=RIGHT-ARC, SH=SHIFT).

4 Experiments

4.1 Data and Evaluation

We use 9 datasets [2] from the CoNLL-X shared task (Buchholz and Marsi, 2006) and all datasets from the CoNLL-XI shared task (Nivre et al., 2007). To compare our system to the current state-of-the-art transition-based parsers, we also evaluate it on the Stanford Dependencies (de Marneffe and Manning, 2008) conversion (using the Stanford parser v3.3.0) [3] of the WSJ Penn Treebank (Marcus et al., 1993), hereinafter PT-SD, with standard splits. Labelled and Unlabelled Attachment Scores (LAS and UAS) are computed excluding punctuation only on the PT-SD, for comparability. We repeat each experiment with three independent random initializations and report the average accuracy. Statistical significance is assessed by a paired test with 10,000 bootstrap samples.
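The text does not spell out the exact form of the paired test, so the following is only a sketch of one common variant, a paired bootstrap over sentences. The function name `paired_bootstrap`, its arguments and the per-sentence input format are assumptions made for illustration.

```python
import random


def paired_bootstrap(correct_a, correct_b, samples=10_000, seed=0):
    """Paired bootstrap over sentences (illustrative sketch).

    correct_a[i] and correct_b[i] are the numbers of correctly attached
    tokens in sentence i for systems A and B. Both systems are scored on
    the same resampled sentences, so the token totals are identical and
    comparing correct counts is the same as comparing accuracies.
    Returns the fraction of resampled test sets on which B does NOT
    score higher than A; a small value suggests B is reliably better.
    """
    assert len(correct_a) == len(correct_b)
    rng = random.Random(seed)
    n = len(correct_a)
    not_better = 0
    for _ in range(samples):
        idx = [rng.randrange(n) for _ in range(n)]   # sample sentences with replacement
        if sum(correct_b[i] for i in idx) <= sum(correct_a[i] for i in idx):
            not_better += 1
    return not_better / samples
```

With a significance level of $\alpha = .05$, an improvement of B over A would be called significant in this sketch when the returned fraction is below 0.05.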
4.2 Model

To implement our approach we take advantage of the model architecture described in Qi and Manning (2017) for the arc-swift parser, which extends the architecture of Kiperwasser and Goldberg (2016) by applying a biaffine combination during the featurization process. We implement both the Covington and NL-Covington parsers under this architecture, adapt the featurization process with biaffine combination of Qi and Manning (2017) to these parsers, and use the same training setup. More details about these model parameters are provided in Appendix A.

[2] We excluded the languages from CoNLL-X that also appeared in CoNLL-XI, i.e., if a language was present in both shared tasks, we used the latest version.

[3] lex-parser.shtml

Table 1: Parsing accuracy (UAS and LAS, including punctuation) of the Covington and NL-Covington non-projective parsers on CoNLL-XI (first block) and CoNLL-X (second block) datasets. Best results for each language are shown in bold. All improvements in this table are statistically significant (α = .05).
Columns: Language; Covington UAS, LAS; NL-Covington UAS, LAS.
First block (CoNLL-XI): Arabic, Basque, Catalan, Chinese, Czech, English, Greek, Hungarian, Italian, Turkish.
Second block (CoNLL-X): Bulgarian, Danish, Dutch, German, Japanese, Portuguese, Slovene, Spanish, Swedish.
Final row: Average.
[The numeric values of this table are missing from the transcription.]

Since this architecture uses batch training, we train with a static oracle. The NL-Covington algorithm has no spurious ambiguity at all, so there is only one possible static oracle: canonical transition sequences are generated by choosing the transition that builds the shortest pending gold arc involving the current right focus word $j$, or Shift if there are no unbuilt gold arcs involving $j$.

We note that a dynamic oracle can be obtained for the NL-Covington parser by adapting the one for standard Covington of Gómez-Rodríguez and Fernández-González (2015). As NL-Covington transitions are concatenations of Covington ones, their loss calculation algorithm is compatible with NL-Covington. Apart from error exploration, this also opens the way to incorporating non-monotonicity (Fernández-González and Gómez-Rodríguez, 2017). While these approaches have been shown to improve accuracy under online training settings, here we prioritize homogeneous comparability to (Qi and Manning, 2017), so we use batch training and a static oracle, and still obtain state-of-the-art accuracy for a greedy parser.
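The canonical static oracle described in Section 4.2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `static_oracle` and the input format (a dict from each word to its gold head, with 0 for headless words) are hypothetical. Since the words of $\lambda_1$ are scanned from nearest to farthest from $j$, the first gold arc found is the shortest pending one.

```python
def static_oracle(heads):
    """Return the canonical NL-Covington sequence for a gold tree.

    heads maps each word 1..n to its gold head (0 = no head / root).
    Transitions are (name, k) pairs; k is unused (0) for Shift.
    """
    n = len(heads)
    gold = {(h, d) for d, h in heads.items() if h != 0}   # (head, dependent)
    lam1, lam2, buf, seq = [], [], list(range(1, n + 1)), []
    while buf:
        j = buf[0]
        # Scan lambda_1 from its right end: k = 1 is the closest word to j.
        for k in range(1, len(lam1) + 1):
            i = lam1[-k]
            if (j, i) in gold or (i, j) in gold:
                seq.append(("LA" if (j, i) in gold else "RA", k))
                lam1, lam2 = lam1[:-k], lam1[-k:] + lam2
                break
        else:
            # No unbuilt gold arc between j and any word in lambda_1: Shift.
            seq.append(("SH", 0))
            lam1, lam2, buf = lam1 + lam2 + [j], [], buf[1:]
    return seq
```

For the tree of Figure 2, written as `{1: 0, 2: 1, 3: 1, 4: 5, 5: 1}`, this sketch reproduces the sequence of Figure 4: SH, RA1, SH, RA2, SH, SH, LA1, RA3, SH.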
Parser                               Type
(Chen and Manning, 2014)             gs
(Dyer et al., 2015)                  gs
(Weiss et al., 2015) greedy          gs
(Ballesteros et al., 2016)           gd
(Kiperwasser and Goldberg, 2016)     gd
(Qi and Manning, 2017)               gs
This work                            gs

(Weiss et al., 2015) beam            b(8)
(Alberti et al., 2015)               b(32)
(Andor et al., 2016)                 b(32)
(Shi et al., 2017)                   dp

(Kuncoro et al., 2017) (constit.)    c

Table 2: Accuracy comparison of state-of-the-art transition-based dependency parsers on PT-SD. The Type column shows the type of parser: gs is a greedy parser trained with a static oracle, gd a greedy parser trained with a dynamic oracle, b(n) a beam search parser with beam size n, dp a parser that employs global training with dynamic programming, and c a constituent parser with conversion to dependencies.
[The UAS and LAS columns of this table are missing from the transcription.]

4.3 Results

Table 1 presents a comparison between the Covington parser and the novel variant developed here. The NL-Covington parser outperforms the original version in all datasets tested, with all improvements statistically significant (α = .05).

Table 2 compares our novel system with other state-of-the-art transition-based dependency parsers on the PT-SD. Greedy parsers are in the first block, beam-search and dynamic programming parsers in the second block. The third block shows the best result on this benchmark, obtained with constituent parsing with generative re-ranking and conversion to dependencies. Despite being the only non-projective parser tested on a practically projective dataset, [4] our parser achieves the highest score among greedy transition-based models (even above those trained with a dynamic oracle).

We even slightly outperform the arc-swift system of Qi and Manning (2017), with the same model architecture, implementation and training setup, but based on the projective arc-eager transition-based parser instead.
This may be because our system takes into consideration any permissible attachment between the focus word $j$ and any word in $\lambda_1$ at each configuration, while their approach is limited by the arc-eager logic: it allows all possible rightward arcs (possibly fewer than our approach, as the arc-eager stack usually contains a small number of words), but only one leftward arc is permitted per parser state. It is also worth noting that the arc-swift and NL-Covington parsers have the same worst-case time complexity, $O(n^2)$, as adding non-local arc transitions to the arc-eager parser increases its complexity from linear to quadratic, but it does not affect the complexity of the Covington algorithm. Thus, it can be argued that this technique is better suited to Covington than to arc-eager parsing.

[4] Only 41 out of 39,832 sentences of the PT-SD training dataset present some kind of non-projectivity.

Table 3: Parsing accuracy (UAS and LAS, with punctuation) of the arc-swift and NL-Covington parsers on CoNLL-XI (1st block) and CoNLL-X (2nd block) datasets. Best results for each language are in bold. * indicates statistically significant improvements (α = .05).
Columns: Language; Arc-swift UAS, LAS; NL-Covington UAS, LAS.
First block (CoNLL-XI): Arabic, Basque, Catalan, Chinese, Czech, English, Greek, Hungarian, Italian, Turkish.
Second block (CoNLL-X): Bulgarian, Danish, Dutch, German, Japanese, Portuguese, Slovene, Spanish, Swedish.
Final row: Average.
[The numeric values of this table are missing from the transcription.]

We also compare NL-Covington to the arc-swift parser on the CoNLL datasets (Table 3). For fairness of comparison, we projectivize (via MaltParser) all training datasets, instead of filtering non-projective sentences, as some of the languages are significantly non-projective. Even doing that, the NL-Covington parser improves over the arc-swift system in terms of UAS in 14 out of 19 datasets, obtaining statistically significant improvements in accuracy on 7 of them, and statistically significant decreases in just one.

Finally, we analyze how our approach reduces the length of the transition sequence consumed by the original Covington parser. In Table 4 we report the transition sequence length per sentence used by the Covington and the NL-Covington algorithms to analyze each dataset from the same benchmark used for evaluating parsing accuracy. As seen in the table, NL-Covington produces notably shorter transition sequences than Covington, with a reduction close to 50% on average.

Table 4: Average transitions executed per sentence (trans./sent.) when analyzing each dataset by the original Covington and NL-Covington algorithms.
Columns: Language; Covington trans./sent.; NL-Covington trans./sent.
Rows: Arabic, Basque, Catalan, Chinese, Czech, English, Greek, Hungarian, Italian, Turkish, Bulgarian, Danish, Dutch, German, Japanese, Portuguese, Slovene, Spanish, Swedish, PTB-SD, Average.
[The numeric values of this table are missing from the transcription.]

5 Conclusion

We present a novel variant of the non-projective Covington transition-based parser by incorporating non-local transitions, reducing the length of transition sequences from $O(n^2)$ to $O(n)$. This system clearly outperforms the original Covington parser and achieves the highest accuracy on the WSJ Penn Treebank (Stanford Dependencies) obtained to date with greedy dependency parsing.

Acknowledgments

This work has received funding from the European Research Council (ERC), under the European Union's Horizon 2020 research and innovation programme (FASTPARSE, grant agreement No ), from the TELEPARES-UDC (FFI C2-2-R) and ANSWER-ASAP (TIN C2-1-R) projects from MINECO, and from Xunta de Galicia (ED431B 2017/01).

References

Chris Alberti, David Weiss, Greg Coppola, and Slav Petrov. 2015. Improved transition-based parsing and tagging with neural networks. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, EMNLP 2015, Lisbon, Portugal, September 17-21.

Daniel Andor, Chris Alberti, David Weiss, Aliaksei Severyn, Alessandro Presta, Kuzman Ganchev, Slav Petrov, and Michael Collins. 2016. Globally normalized transition-based neural networks. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016, August 7-12, 2016, Berlin, Germany, Volume 1: Long Papers.

Giuseppe Attardi. 2006. Experiments with a multilanguage non-projective dependency parser. In Proceedings of the 10th Conference on Computational Natural Language Learning (CoNLL).

Miguel Ballesteros, Yoav Goldberg, Chris Dyer, and Noah A. Smith. 2016. Training with exploration improves a greedy stack-LSTM parser. CoRR abs/

Piotr Bojanowski, Edouard Grave, Armand Joulin, and Tomas Mikolov. 2016. Enriching word vectors with subword information. arXiv preprint arXiv:

Sabine Buchholz and Erwin Marsi. 2006. CoNLL-X shared task on multilingual dependency parsing. In Proceedings of the 10th Conference on Computational Natural Language Learning (CoNLL).

Danqi Chen and Christopher Manning. 2014. A fast and accurate dependency parser using neural networks. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Doha, Qatar.

Michael A. Covington. 2001. A fundamental algorithm for dependency parsing. In Proceedings of the 39th Annual ACM Southeast Conference. ACM, New York, NY, USA.

Marie-Catherine de Marneffe and Christopher D. Manning. 2008. The Stanford typed dependencies representation. In Coling 2008: Proceedings of the Workshop on Cross-Framework and Cross-Domain Parser Evaluation. Association for Computational Linguistics, Stroudsburg, PA, USA, CrossParser '08.

Chris Dyer, Miguel Ballesteros, Wang Ling, Austin Matthews, and Noah A. Smith. 2015. Transition-based dependency parsing with stack long short-term memory. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, ACL 2015, July 26-31, 2015, Beijing, China, Volume 1: Long Papers.

Daniel Fernández-González and Carlos Gómez-Rodríguez. 2012. Improving transition-based dependency parsing with buffer transitions. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning. Association for Computational Linguistics.

Daniel Fernández-González and Carlos Gómez-Rodríguez. 2017. A full non-monotonic transition system for unrestricted non-projective parsing. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics, Vancouver, Canada.

Carlos Gómez-Rodríguez and Daniel Fernández-González. 2015. An efficient dynamic oracle for unrestricted non-projective parsing. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (ACL-IJCNLP 2015). Volume 2: Short Papers. Association for Computational Linguistics, Beijing, China.

Alex Graves and Jürgen Schmidhuber. 2005. Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Networks.

Eliyahu Kiperwasser and Yoav Goldberg. 2016. Simple and accurate dependency parsing using bidirectional LSTM feature representations. TACL 4.

Adhiguna Kuncoro, Miguel Ballesteros, Lingpeng Kong, Chris Dyer, Graham Neubig, and Noah A. Smith. 2017. What do recurrent neural network grammars learn about syntax? In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers. Association for Computational Linguistics.

Mitchell P. Marcus, Beatrice Santorini, and Mary Ann Marcinkiewicz. 1993. Building a large annotated corpus of English: The Penn Treebank. Computational Linguistics 19.

Ryan McDonald and Joakim Nivre. 2007. Characterizing the errors of data-driven dependency parsing models. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL).

Joakim Nivre. 2003. An efficient algorithm for projective dependency parsing. In Proceedings of the 8th International Workshop on Parsing Technologies (IWPT 03). ACL/SIGPARSE.

Joakim Nivre. 2004. Incrementality in deterministic dependency parsing. In Proceedings of the Workshop on Incremental Parsing: Bringing Engineering and Cognition Together (ACL).

Joakim Nivre. 2008. Algorithms for Deterministic Incremental Dependency Parsing. Computational Linguistics 34(4).

Joakim Nivre, Johan Hall, Sandra Kübler, Ryan McDonald, Jens Nilsson, Sebastian Riedel, and Deniz Yuret. 2007. The CoNLL 2007 shared task on dependency parsing. In Proceedings of the CoNLL Shared Task Session of EMNLP-CoNLL 2007.

Jeffrey Pennington, Richard Socher, and Christopher D. Manning. 2014. GloVe: Global vectors for word representation. In Empirical Methods in Natural Language Processing (EMNLP).

Peng Qi and Christopher D. Manning. 2017. Arc-swift: A novel transition system for dependency parsing. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017, Vancouver, Canada, July 30 - August 4, Volume 2: Short Papers.

Francesco Sartorio, Giorgio Satta, and Joakim Nivre. 2013. A transition-based dependency parser using a dynamic parsing strategy. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics.

Tianze Shi, Liang Huang, and Lillian Lee. 2017. Fast(er) exact decoding and global training for transition-based dependency parsing via a minimal feature set. CoRR abs/

David Weiss, Chris Alberti, Michael Collins, and Slav Petrov. 2015. Structured training for neural network transition-based parsing. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, ACL 2015, July 26-31, 2015, Beijing, China, Volume 1: Long Papers.

A Model Details

We provide more details of the neural network architecture used in this paper, which is taken from Qi and Manning (2017).

The model consists of two blocks of 2-layered bidirectional long short-term memory (BiLSTM) networks (Graves and Schmidhuber, 2005) with 400 hidden units in each direction. The first block is used for POS tagging and the second one, for parsing. As the input of the tagging block, we use words represented as word embeddings, and BiLSTMs are employed to perform feature extraction. The resulting output is fed into a multi-layer perceptron (MLP), with a hidden layer of 100 rectified linear units (ReLU), that provides a POS tag for each input token in a 32-dimensional representation. Word embeddings concatenated to these POS tag embeddings serve as input of the second block of BiLSTMs to undertake the parsing stage. Then, the output of the parsing block is fed into an MLP with two separate ReLU hidden layers (one for deriving the representation of the head, and the other for the dependency label) that, after being merged and by means of a softmax function, score all the feasible transitions, allowing the parser to greedily choose and apply the highest-scoring one.

Moreover, we adapt the featurization process with biaffine combination described in Qi and Manning (2017) for the arc-swift system to be used on the original Covington and NL-Covington parsers.
In particular, arc transitions are featurized by the concatenation of the representations of the head and dependent words of the arc to be created, the No-Arc transition is featurized by the rightmost word in $\lambda_1$ and the leftmost word in the buffer $B$ and, finally, for the Shift transition only the leftmost word in $B$ is used. Unlike Qi and Manning (2017) do for baseline parsers, we do not use the featurization method detailed in Kiperwasser and Goldberg (2016) [6] for the original Covington parser, as we observed that this results in lower scores and then the comparison would be unfair in our case. We implement both systems under the same framework, with the original Covington parser represented as the NL-Covington system plus the No-Arc transition and with $k$ limited to 1. A thorough description of the model architecture and featurization mechanism can be found in Qi and Manning (2017).

[6] For instance, Kiperwasser and Goldberg (2016) featurize all transitions of the arc-eager parser in the same way by concatenating the representations of the top 3 words on the stack and the leftmost word in the buffer.

Our training setup is exactly the same as that used by Qi and Manning (2017), training the models for 10 epochs on large datasets and 30 on small ones. In addition, we initialize word embeddings with 100-dimensional GloVe vectors (Pennington et al., 2014) for English and use 300-dimensional Facebook vectors (Bojanowski et al., 2016) for other languages. The other parameters of the neural network keep the same values.

The parser's source code is freely available at Non-Local-Covington.
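The step of scoring all feasible transitions and greedily applying the best one can be illustrated with a small sketch. It is not the released implementation: the names `feasible` and `greedy_step`, the `local_only` flag and the `score` callable (standing in for the neural scorer) are assumptions. With `local_only=True` the candidate set mimics the original Covington parser as described above (NL-Covington plus No-Arc, with $k$ limited to 1), which also shows why it scores $O(1)$ candidates per configuration against $O(n)$ for NL-Covington.

```python
def feasible(lam1, buf, arcs, local_only=False):
    """List the (name, k) transitions allowed at a configuration.

    arcs is a set of (head, dependent) pairs forming a forest.
    """
    if not buf:
        return []                      # terminal configuration
    j = buf[0]
    head_of = {d: h for h, d in arcs}

    def reaches(src, dst):
        """True if src ->* dst, i.e. src is dst or one of its ancestors."""
        while dst is not None:
            if dst == src:
                return True
            dst = head_of.get(dst)
        return False

    out = [("SH", 0)]
    max_k = min(1, len(lam1)) if local_only else len(lam1)
    for k in range(1, max_k + 1):
        i = lam1[-k]
        if i not in head_of and not reaches(i, j):   # single-head, acyclicity
            out.append(("LA", k))
        if j not in head_of and not reaches(j, i):
            out.append(("RA", k))
    if local_only and lam1:
        out.append(("NA", 1))
    return out


def greedy_step(lam1, buf, arcs, score, local_only=False):
    """Pick the highest-scoring feasible transition (score is hypothetical)."""
    return max(feasible(lam1, buf, arcs, local_only), key=score)
```

For the configuration of Figure 4 just before LA1 (`lam1=[1, 2, 3, 4]`, `buf=[5]`, arcs `{(1, 2), (1, 3)}`), this sketch lists seven candidates for NL-Covington and four for the local variant; Left-Arc toward words 2 and 3 is excluded because they already have a head.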


More information

Parsing as Language Modeling

Parsing as Language Modeling Parsing as Language Modeling Do Kook Choe Brown University Providence, RI dc65@cs.brown.edu Eugene Charniak Brown University Providence, RI ec@cs.brown.edu Abstract We recast syntactic parsing as a language

More information

Statistical Dependency Parsing

Statistical Dependency Parsing Statistical Dependency Parsing The State of the Art Joakim Nivre Uppsala University Department of Linguistics and Philology joakim.nivre@lingfil.uu.se Statistical Dependency Parsing 1(29) Introduction

More information

Transition-based Dependency Parsing with Rich Non-local Features

Transition-based Dependency Parsing with Rich Non-local Features Transition-based Dependency Parsing with Rich Non-local Features Yue Zhang University of Cambridge Computer Laboratory yue.zhang@cl.cam.ac.uk Joakim Nivre Uppsala University Department of Linguistics and

More information

Let s get parsing! Each component processes the Doc object, then passes it on. doc.is_parsed attribute checks whether a Doc object has been parsed

Let s get parsing! Each component processes the Doc object, then passes it on. doc.is_parsed attribute checks whether a Doc object has been parsed Let s get parsing! SpaCy default model includes tagger, parser and entity recognizer nlp = spacy.load('en ) tells spacy to use "en" with ["tagger", "parser", "ner"] Each component processes the Doc object,

More information

MaltOptimizer: A System for MaltParser Optimization

MaltOptimizer: A System for MaltParser Optimization MaltOptimizer: A System for MaltParser Optimization Miguel Ballesteros Joakim Nivre Universidad Complutense de Madrid, Spain miballes@fdi.ucm.es Uppsala University, Sweden joakim.nivre@lingfil.uu.se Abstract

More information

Context Encoding LSTM CS224N Course Project

Context Encoding LSTM CS224N Course Project Context Encoding LSTM CS224N Course Project Abhinav Rastogi arastogi@stanford.edu Supervised by - Samuel R. Bowman December 7, 2015 Abstract This project uses ideas from greedy transition based parsing

More information

Improving Transition-Based Dependency Parsing with Buffer Transitions

Improving Transition-Based Dependency Parsing with Buffer Transitions Improving Transition-Based Dependency Parsing with Buffer Transitions Daniel Fernández-González Departamento de Informática Universidade de Vigo Campus As Lagoas, 32004 Ourense, Spain danifg@uvigo.es Carlos

More information

Dependency Parsing. Ganesh Bhosale Neelamadhav G Nilesh Bhosale Pranav Jawale under the guidance of

Dependency Parsing. Ganesh Bhosale Neelamadhav G Nilesh Bhosale Pranav Jawale under the guidance of Dependency Parsing Ganesh Bhosale - 09305034 Neelamadhav G. - 09305045 Nilesh Bhosale - 09305070 Pranav Jawale - 09307606 under the guidance of Prof. Pushpak Bhattacharyya Department of Computer Science

More information

In-Order Transition-based Constituent Parsing

In-Order Transition-based Constituent Parsing In-Order Transition-based Constituent Parsing Jiangming Liu and Yue Zhang Singapore University of Technology and Design, 8 Somapah Road, Singapore, 487372 jmliunlp@gmail.com, yue zhang@sutd.edu.sg Abstract

More information

Bidirectional Transition-Based Dependency Parsing

Bidirectional Transition-Based Dependency Parsing Bidirectional Transition-Based Dependency Parsing Yunzhe Yuan, Yong Jiang, Kewei Tu School of Information Science and Technology ShanghaiTech University {yuanyzh,jiangyong,tukw}@shanghaitech.edu.cn Abstract

More information

CS224n: Natural Language Processing with Deep Learning 1 Lecture Notes: Part IV Dependency Parsing 2 Winter 2019

CS224n: Natural Language Processing with Deep Learning 1 Lecture Notes: Part IV Dependency Parsing 2 Winter 2019 CS224n: Natural Language Processing with Deep Learning 1 Lecture Notes: Part IV Dependency Parsing 2 Winter 2019 1 Course Instructors: Christopher Manning, Richard Socher 2 Authors: Lisa Wang, Juhi Naik,

More information

Easy-First POS Tagging and Dependency Parsing with Beam Search

Easy-First POS Tagging and Dependency Parsing with Beam Search Easy-First POS Tagging and Dependency Parsing with Beam Search Ji Ma JingboZhu Tong Xiao Nan Yang Natrual Language Processing Lab., Northeastern University, Shenyang, China MOE-MS Key Lab of MCC, University

More information

Agenda for today. Homework questions, issues? Non-projective dependencies Spanning tree algorithm for non-projective parsing

Agenda for today. Homework questions, issues? Non-projective dependencies Spanning tree algorithm for non-projective parsing Agenda for today Homework questions, issues? Non-projective dependencies Spanning tree algorithm for non-projective parsing 1 Projective vs non-projective dependencies If we extract dependencies from trees,

More information

The Application of Constraint Rules to Data-driven Parsing

The Application of Constraint Rules to Data-driven Parsing The Application of Constraint Rules to Data-driven Parsing Sardar Jaf The University of Manchester jafs@cs.man.ac.uk Allan Ramsay The University of Manchester ramsaya@cs.man.ac.uk Abstract In this paper,

More information

Online Graph Planarisation for Synchronous Parsing of Semantic and Syntactic Dependencies

Online Graph Planarisation for Synchronous Parsing of Semantic and Syntactic Dependencies Online Graph Planarisation for Synchronous Parsing of Semantic and Syntactic Dependencies Ivan Titov University of Illinois at Urbana-Champaign James Henderson, Paola Merlo, Gabriele Musillo University

More information

Discriminative Training with Perceptron Algorithm for POS Tagging Task

Discriminative Training with Perceptron Algorithm for POS Tagging Task Discriminative Training with Perceptron Algorithm for POS Tagging Task Mahsa Yarmohammadi Center for Spoken Language Understanding Oregon Health & Science University Portland, Oregon yarmoham@ohsu.edu

More information

Dependency Parsing CMSC 723 / LING 723 / INST 725. Marine Carpuat. Fig credits: Joakim Nivre, Dan Jurafsky & James Martin

Dependency Parsing CMSC 723 / LING 723 / INST 725. Marine Carpuat. Fig credits: Joakim Nivre, Dan Jurafsky & James Martin Dependency Parsing CMSC 723 / LING 723 / INST 725 Marine Carpuat Fig credits: Joakim Nivre, Dan Jurafsky & James Martin Dependency Parsing Formalizing dependency trees Transition-based dependency parsing

More information

Encoder-Decoder Shift-Reduce Syntactic Parsing

Encoder-Decoder Shift-Reduce Syntactic Parsing Encoder-Decoder Shift-Reduce Syntactic Parsing Jiangming Liu and Yue Zhang Singapore University of Technology and Design, 8 Somapah Road, Singapore, 487372 {jiangming liu, yue zhang}@sutd.edu.sg encoder

More information

Sorting Out Dependency Parsing

Sorting Out Dependency Parsing Sorting Out Dependency Parsing Joakim Nivre Uppsala University and Växjö University Sorting Out Dependency Parsing 1(38) Introduction Introduction Syntactic parsing of natural language: Who does what to

More information

Stack- propaga+on: Improved Representa+on Learning for Syntax

Stack- propaga+on: Improved Representa+on Learning for Syntax Stack- propaga+on: Improved Representa+on Learning for Syntax Yuan Zhang, David Weiss MIT, Google 1 Transi+on- based Neural Network Parser p(action configuration) So1max Hidden Embedding words labels POS

More information

Automatic Discovery of Feature Sets for Dependency Parsing

Automatic Discovery of Feature Sets for Dependency Parsing Automatic Discovery of Feature Sets for Dependency Parsing Peter Nilsson Pierre Nugues Department of Computer Science Lund University peter.nilsson.lund@telia.com Pierre.Nugues@cs.lth.se Abstract This

More information

Using Search-Logs to Improve Query Tagging

Using Search-Logs to Improve Query Tagging Using Search-Logs to Improve Query Tagging Kuzman Ganchev Keith Hall Ryan McDonald Slav Petrov Google, Inc. {kuzman kbhall ryanmcd slav}@google.com Abstract Syntactic analysis of search queries is important

More information

Projective Dependency Parsing with Perceptron

Projective Dependency Parsing with Perceptron Projective Dependency Parsing with Perceptron Xavier Carreras, Mihai Surdeanu, and Lluís Màrquez Technical University of Catalonia {carreras,surdeanu,lluism}@lsi.upc.edu 8th June 2006 Outline Introduction

More information

Sorting Out Dependency Parsing

Sorting Out Dependency Parsing Sorting Out Dependency Parsing Joakim Nivre Uppsala University and Växjö University Sorting Out Dependency Parsing 1(38) Introduction Introduction Syntactic parsing of natural language: Who does what to

More information

DRAGNN: A TRANSITION-BASED FRAMEWORK FOR DYNAMICALLY CONNECTED NEURAL NETWORKS

DRAGNN: A TRANSITION-BASED FRAMEWORK FOR DYNAMICALLY CONNECTED NEURAL NETWORKS DRAGNN: A TRANSITION-BASED FRAMEWORK FOR DYNAMICALLY CONNECTED NEURAL NETWORKS Lingpeng Kong Carnegie Mellon University Pittsburgh, PA lingpenk@cs.cmu.edu Chris Alberti Daniel Andor Ivan Bogatyy David

More information

Span-Based Constituency Parsing with a Structure-Label System and Provably Optimal Dynamic Oracles

Span-Based Constituency Parsing with a Structure-Label System and Provably Optimal Dynamic Oracles Span-Based Constituency Parsing with a Structure-Label System and Provably Optimal Dynamic Oracles James Cross and Liang Huang School of EECS, Oregon State University, Corvallis, OR, USA {james.henry.cross.iii,

More information

arxiv: v1 [cs.cl] 13 Mar 2017

arxiv: v1 [cs.cl] 13 Mar 2017 DRAGNN: A Transition-based Framework for Dynamically Connected Neural Networks Lingpeng Kong Chris Alberti Daniel Andor Ivan Bogatyy David Weiss lingpengk@cs.cmu.edu, {chrisalberti,danielandor,bogatyy,djweiss}@google.com

More information

arxiv: v4 [cs.cl] 7 Jun 2016

arxiv: v4 [cs.cl] 7 Jun 2016 Edge-Linear First-Order Dependency Parsing with Undirected Minimum Spanning Tree Inference Effi Levi 1 Roi Reichart 2 Ari Rappoport 1 1 Institute of Computer Science, The Hebrew Univeristy 2 Faculty of

More information

Tekniker för storskalig parsning: Dependensparsning 2

Tekniker för storskalig parsning: Dependensparsning 2 Tekniker för storskalig parsning: Dependensparsning 2 Joakim Nivre Uppsala Universitet Institutionen för lingvistik och filologi joakim.nivre@lingfil.uu.se Dependensparsning 2 1(45) Data-Driven Dependency

More information

Graph-Based Parsing. Miguel Ballesteros. Algorithms for NLP Course. 7-11

Graph-Based Parsing. Miguel Ballesteros. Algorithms for NLP Course. 7-11 Graph-Based Parsing Miguel Ballesteros Algorithms for NLP Course. 7-11 By using some Joakim Nivre's materials from Uppsala University and Jason Eisner's material from Johns Hopkins University. Outline

More information

Comparing State-of-the-art Dependency Parsers for the EVALITA 2014 Dependency Parsing Task

Comparing State-of-the-art Dependency Parsers for the EVALITA 2014 Dependency Parsing Task 10.12871/clicit201423 Comparing State-of-the-art Dependency Parsers for the EVALITA 2014 Dependency Parsing Task Alberto Lavelli FBK-irst via Sommarive, 18 - Povo I-38123 Trento (TN) - ITALY lavelli@fbk.eu

More information

An Empirical Study of Semi-supervised Structured Conditional Models for Dependency Parsing

An Empirical Study of Semi-supervised Structured Conditional Models for Dependency Parsing An Empirical Study of Semi-supervised Structured Conditional Models for Dependency Parsing Jun Suzuki, Hideki Isozaki NTT CS Lab., NTT Corp. Kyoto, 619-0237, Japan jun@cslab.kecl.ntt.co.jp isozaki@cslab.kecl.ntt.co.jp

More information

Transition-based dependency parsing

Transition-based dependency parsing Transition-based dependency parsing Syntactic analysis (5LN455) 2014-12-18 Sara Stymne Department of Linguistics and Philology Based on slides from Marco Kuhlmann Overview Arc-factored dependency parsing

More information

Transition-based Parsing with Neural Nets

Transition-based Parsing with Neural Nets CS11-747 Neural Networks for NLP Transition-based Parsing with Neural Nets Graham Neubig Site https://phontron.com/class/nn4nlp2017/ Two Types of Linguistic Structure Dependency: focus on relations between

More information

Transition-Based Parsing of the Chinese Treebank using a Global Discriminative Model

Transition-Based Parsing of the Chinese Treebank using a Global Discriminative Model Transition-Based Parsing of the Chinese Treebank using a Global Discriminative Model Yue Zhang Oxford University Computing Laboratory yue.zhang@comlab.ox.ac.uk Stephen Clark Cambridge University Computer

More information

Learning Latent Linguistic Structure to Optimize End Tasks. David A. Smith with Jason Naradowsky and Xiaoye Tiger Wu

Learning Latent Linguistic Structure to Optimize End Tasks. David A. Smith with Jason Naradowsky and Xiaoye Tiger Wu Learning Latent Linguistic Structure to Optimize End Tasks David A. Smith with Jason Naradowsky and Xiaoye Tiger Wu 12 October 2012 Learning Latent Linguistic Structure to Optimize End Tasks David A. Smith

More information

Statistical parsing. Fei Xia Feb 27, 2009 CSE 590A

Statistical parsing. Fei Xia Feb 27, 2009 CSE 590A Statistical parsing Fei Xia Feb 27, 2009 CSE 590A Statistical parsing History-based models (1995-2000) Recent development (2000-present): Supervised learning: reranking and label splitting Semi-supervised

More information

Online Service for Polish Dependency Parsing and Results Visualisation

Online Service for Polish Dependency Parsing and Results Visualisation Online Service for Polish Dependency Parsing and Results Visualisation Alina Wróblewska and Piotr Sikora Institute of Computer Science, Polish Academy of Sciences, Warsaw, Poland alina@ipipan.waw.pl,piotr.sikora@student.uw.edu.pl

More information

Supplementary A. Built-in transition systems

Supplementary A. Built-in transition systems L. Aufrant, G. Wisniewski PanParser (Supplementary) Supplementary A. Built-in transition systems In the following, we document all transition systems built in PanParser, along with their cost. s and s

More information

Parsing with Dynamic Programming

Parsing with Dynamic Programming CS11-747 Neural Networks for NLP Parsing with Dynamic Programming Graham Neubig Site https://phontron.com/class/nn4nlp2017/ Two Types of Linguistic Structure Dependency: focus on relations between words

More information

Natural Language Processing with Deep Learning CS224N/Ling284

Natural Language Processing with Deep Learning CS224N/Ling284 Natural Language Processing with Deep Learning CS224N/Ling284 Lecture 8: Recurrent Neural Networks Christopher Manning and Richard Socher Organization Extra project office hour today after lecture Overview

More information

Generalized Higher-Order Dependency Parsing with Cube Pruning

Generalized Higher-Order Dependency Parsing with Cube Pruning Generalized Higher-Order Dependency Parsing with Cube Pruning Hao Zhang Ryan McDonald Google, Inc. {haozhang,ryanmcd}@google.com Abstract State-of-the-art graph-based parsers use features over higher-order

More information

arxiv: v2 [cs.cl] 24 Mar 2015

arxiv: v2 [cs.cl] 24 Mar 2015 Yara Parser: A Fast and Accurate Dependency Parser Mohammad Sadegh Rasooli 1 and Joel Tetreault 2 1 Department of Computer Science, Columbia University, New York, NY, rasooli@cs.columbia.edu 2 Yahoo Labs,

More information

Transition-Based Dependency Parsing with Stack Long Short-Term Memory

Transition-Based Dependency Parsing with Stack Long Short-Term Memory Transition-Based Dependency Parsing with Stack Long Short-Term Memory Chris Dyer Miguel Ballesteros Wang Ling Austin Matthews Noah A. Smith Marianas Labs NLP Group, Pompeu Fabra University Carnegie Mellon

More information

Transition-based Dependency Parsing Using Two Heterogeneous Gated Recursive Neural Networks

Transition-based Dependency Parsing Using Two Heterogeneous Gated Recursive Neural Networks Transition-based Dependency Parsing Using Two Heterogeneous Gated Recursive Neural Networks Xinchi Chen, Yaqian Zhou, Chenxi Zhu, Xipeng Qiu, Xuanjing Huang Shanghai Key Laboratory of Intelligent Information

More information

Refresher on Dependency Syntax and the Nivre Algorithm

Refresher on Dependency Syntax and the Nivre Algorithm Refresher on Dependency yntax and Nivre Algorithm Richard Johansson 1 Introduction This document gives more details about some important topics that re discussed very quickly during lecture: dependency

More information

Testing parsing improvements with combination and translation in Evalita 2014

Testing parsing improvements with combination and translation in Evalita 2014 10.12871/clicit201424 Testing parsing improvements with combination and translation in Evalita 2014 Alessandro Mazzei Dipartimento di Informatica Università degli Studi di Torino Corso Svizzera 185, 10149

More information

Advanced Search Algorithms

Advanced Search Algorithms CS11-747 Neural Networks for NLP Advanced Search Algorithms Daniel Clothiaux https://phontron.com/class/nn4nlp2017/ Why search? So far, decoding has mostly been greedy Chose the most likely output from

More information

Empirical Evaluation of RNN Architectures on Sentence Classification Task

Empirical Evaluation of RNN Architectures on Sentence Classification Task Empirical Evaluation of RNN Architectures on Sentence Classification Task Lei Shen, Junlin Zhang Chanjet Information Technology lorashen@126.com, zhangjlh@chanjet.com Abstract. Recurrent Neural Networks

More information

HadoopPerceptron: a Toolkit for Distributed Perceptron Training and Prediction with MapReduce

HadoopPerceptron: a Toolkit for Distributed Perceptron Training and Prediction with MapReduce HadoopPerceptron: a Toolkit for Distributed Perceptron Training and Prediction with MapReduce Andrea Gesmundo Computer Science Department University of Geneva Geneva, Switzerland andrea.gesmundo@unige.ch

More information

Neural Discontinuous Constituency Parsing

Neural Discontinuous Constituency Parsing Neural Discontinuous Constituency Parsing Miloš Stanojević and Raquel G. Alhama Institute for Logic, Language and Computation (ILLC) University of Amsterdam {m.stanojevic, rgalhama}@uva.nl Abstract One

More information

Incremental Integer Linear Programming for Non-projective Dependency Parsing

Incremental Integer Linear Programming for Non-projective Dependency Parsing Incremental Integer Linear Programming for Non-projective Dependency Parsing Sebastian Riedel James Clarke ICCS, University of Edinburgh 22. July 2006 EMNLP 2006 S. Riedel, J. Clarke (ICCS, Edinburgh)

More information

Exploring Automatic Feature Selection for Transition-Based Dependency Parsing

Exploring Automatic Feature Selection for Transition-Based Dependency Parsing Procesamiento del Lenguaje Natural, Revista nº 51, septiembre de 2013, pp 119-126 recibido 18-04-2013 revisado 16-06-2013 aceptado 21-06-2013 Exploring Automatic Feature Selection for Transition-Based

More information

Turning on the Turbo: Fast Third-Order Non- Projective Turbo Parsers

Turning on the Turbo: Fast Third-Order Non- Projective Turbo Parsers Carnegie Mellon University Research Showcase @ CMU Language Technologies Institute School of Computer Science 8-2013 Turning on the Turbo: Fast Third-Order Non- Projective Turbo Parsers Andre F.T. Martins

More information

NLP in practice, an example: Semantic Role Labeling

NLP in practice, an example: Semantic Role Labeling NLP in practice, an example: Semantic Role Labeling Anders Björkelund Lund University, Dept. of Computer Science anders.bjorkelund@cs.lth.se October 15, 2010 Anders Björkelund NLP in practice, an example:

More information

Basic Parsing with Context-Free Grammars. Some slides adapted from Karl Stratos and from Chris Manning

Basic Parsing with Context-Free Grammars. Some slides adapted from Karl Stratos and from Chris Manning Basic Parsing with Context-Free Grammars Some slides adapted from Karl Stratos and from Chris Manning 1 Announcements HW 2 out Midterm on 10/19 (see website). Sample ques>ons will be provided. Sign up

More information

Dynamic Feature Selection for Dependency Parsing

Dynamic Feature Selection for Dependency Parsing Dynamic Feature Selection for Dependency Parsing He He, Hal Daumé III and Jason Eisner EMNLP 2013, Seattle Structured Prediction in NLP Part-of-Speech Tagging Parsing N N V Det N Fruit flies like a banana

More information

HC-search for Incremental Parsing

HC-search for Incremental Parsing HC-search for Incremental Parsing Yijia Liu, Wanxiang Che, Bing Qin, Ting Liu Research Center for Social Computing and Information Retrieval Harbin Institute of Technology, China {yjliu,car,qinb,tliu}@ir.hit.edu.cn

More information

A Dynamic Confusion Score for Dependency Arc Labels

A Dynamic Confusion Score for Dependency Arc Labels A Dynamic Confusion Score for Dependency Arc Labels Sambhav Jain and Bhasha Agrawal Language Technologies Research Center IIIT-Hyderabad, India {sambhav.jain, bhasha.agrawal}@research.iiit.ac.in Abstract

More information

Dependency Parsing domain adaptation using transductive SVM

Dependency Parsing domain adaptation using transductive SVM Dependency Parsing domain adaptation using transductive SVM Antonio Valerio Miceli-Barone University of Pisa, Italy / Largo B. Pontecorvo, 3, Pisa, Italy miceli@di.unipi.it Giuseppe Attardi University

More information

A Dependency Parser for Tweets. Lingpeng Kong, Nathan Schneider, Swabha Swayamdipta, Archna Bhatia, Chris Dyer, and Noah A. Smith

A Dependency Parser for Tweets. Lingpeng Kong, Nathan Schneider, Swabha Swayamdipta, Archna Bhatia, Chris Dyer, and Noah A. Smith A Dependency Parser for Tweets Lingpeng Kong, Nathan Schneider, Swabha Swayamdipta, Archna Bhatia, Chris Dyer, and Noah A. Smith NLP for Social Media Boom! Ya ur website suxx bro @SarahKSilverman michelle

More information

End-To-End Spam Classification With Neural Networks

End-To-End Spam Classification With Neural Networks End-To-End Spam Classification With Neural Networks Christopher Lennan, Bastian Naber, Jan Reher, Leon Weber 1 Introduction A few years ago, the majority of the internet s network traffic was due to spam

More information

Backpropagating through Structured Argmax using a SPIGOT

Backpropagating through Structured Argmax using a SPIGOT Backpropagating through Structured Argmax using a SPIGOT Hao Peng, Sam Thomson, Noah A. Smith @ACL July 17, 2018 Overview arg max Parser Downstream task Loss L Overview arg max Parser Downstream task Head

More information

Neural Transition Based Parsing of Web Queries: An Entity Based Approach

Neural Transition Based Parsing of Web Queries: An Entity Based Approach Neural Transition Based Parsing of Web Queries: An Entity Based Approach Rivka Malca and Roi Reichart Technion, Israel Institute of Technology srikim@st.technion.ac.il, roiri@technion.ac.il Abstract Web

More information

Anchoring and Agreement in Syntactic Annotations

Anchoring and Agreement in Syntactic Annotations Anchoring and Agreement in Syntactic Annotations Yevgeni Berzak CSAIL MIT berzak@mit.edu Yan Huang Language Technology Lab DTAL Cambridge University yh358@cam.ac.uk Andrei Barbu CSAIL MIT andrei@0xab.com

More information

MainiwayAI at IJCNLP-2017 Task 2: Ensembles of Deep Architectures for Valence-Arousal Prediction

MainiwayAI at IJCNLP-2017 Task 2: Ensembles of Deep Architectures for Valence-Arousal Prediction MainiwayAI at IJCNLP-2017 Task 2: Ensembles of Deep Architectures for Valence-Arousal Prediction Yassine Benajiba Jin Sun Yong Zhang Zhiliang Weng Or Biran Mainiway AI Lab {yassine,jin.sun,yong.zhang,zhiliang.weng,or.biran}@mainiway.com

More information

Accurate Parsing and Beyond. Yoav Goldberg Bar Ilan University

Accurate Parsing and Beyond. Yoav Goldberg Bar Ilan University Accurate Parsing and Beyond Yoav Goldberg Bar Ilan University Syntactic Parsing subj root rcmod rel xcomp det subj aux acomp acomp The soup, which I expected to be good, was bad subj root rcmod rel xcomp

More information

Langforia: Language Pipelines for Annotating Large Collections of Documents

Langforia: Language Pipelines for Annotating Large Collections of Documents Langforia: Language Pipelines for Annotating Large Collections of Documents Marcus Klang Lund University Department of Computer Science Lund, Sweden Marcus.Klang@cs.lth.se Pierre Nugues Lund University

More information

CRFVoter: Chemical Entity Mention, Gene and Protein Related Object recognition using a conglomerate of CRF based tools

CRFVoter: Chemical Entity Mention, Gene and Protein Related Object recognition using a conglomerate of CRF based tools CRFVoter: Chemical Entity Mention, Gene and Protein Related Object recognition using a conglomerate of CRF based tools Wahed Hemati, Alexander Mehler, and Tolga Uslu Text Technology Lab, Goethe Universitt

More information

Probabilistic Graph-based Dependency Parsing with Convolutional Neural Network

Probabilistic Graph-based Dependency Parsing with Convolutional Neural Network Probabilistic Graph-based Dependency Parsing with Convolutional Neural Network Zhisong Zhang 1,2, Hai Zhao 1,2,, Lianhui Qin 1,2 1 Department of Computer Science and Engineering, Shanghai Jiao Tong University,

More information

CS 224N Assignment 2 Writeup

CS 224N Assignment 2 Writeup CS 224N Assignment 2 Writeup Angela Gong agong@stanford.edu Dept. of Computer Science Allen Nie anie@stanford.edu Symbolic Systems Program 1 Introduction 1.1 PCFG A probabilistic context-free grammar (PCFG)

More information

Utilizing Dependency Language Models for Graph-based Dependency Parsing Models

Utilizing Dependency Language Models for Graph-based Dependency Parsing Models Utilizing Dependency Language Models for Graph-based Dependency Parsing Models Wenliang Chen, Min Zhang, and Haizhou Li Human Language Technology, Institute for Infocomm Research, Singapore {wechen, mzhang,

More information

Optimal Incremental Parsing via Best-First Dynamic Programming

Optimal Incremental Parsing via Best-First Dynamic Programming Optimal Incremental Parsing via Best-First Dynamic Programming Kai Zhao 1 James Cross 1 1 Graduate Center City University of New York 365 Fifth Avenue, New York, NY 10016 {kzhao,jcross}@gc.cuny.edu Liang

More information

Efficient Parsing for Head-Split Dependency Trees

Efficient Parsing for Head-Split Dependency Trees Efficient Parsing for Head-Split Dependency Trees Giorgio Satta Dept. of Information Engineering University of Padua, Italy satta@dei.unipd.it Marco Kuhlmann Dept. of Linguistics and Philology Uppsala

More information

EDAN20 Language Technology Chapter 13: Dependency Parsing

EDAN20 Language Technology   Chapter 13: Dependency Parsing EDAN20 Language Technology http://cs.lth.se/edan20/ Pierre Nugues Lund University Pierre.Nugues@cs.lth.se http://cs.lth.se/pierre_nugues/ September 19, 2016 Pierre Nugues EDAN20 Language Technology http://cs.lth.se/edan20/

More information

Iterative CKY parsing for Probabilistic Context-Free Grammars

Iterative CKY parsing for Probabilistic Context-Free Grammars Iterative CKY parsing for Probabilistic Context-Free Grammars Yoshimasa Tsuruoka and Jun ichi Tsujii Department of Computer Science, University of Tokyo Hongo 7-3-1, Bunkyo-ku, Tokyo 113-0033 CREST, JST

More information

Stanford s System for Parsing the English Web

Stanford s System for Parsing the English Web Stanford s System for Parsing the English Web David McClosky s, Wanxiang Che h, Marta Recasens s, Mengqiu Wang s, Richard Socher s, and Christopher D. Manning s s Natural Language Processing Group, Stanford

More information

Neural Symbolic Machines: Learning Semantic Parsers on Freebase with Weak Supervision

Neural Symbolic Machines: Learning Semantic Parsers on Freebase with Weak Supervision Neural Symbolic Machines: Learning Semantic Parsers on Freebase with Weak Supervision Anonymized for review Abstract Extending the success of deep neural networks to high level tasks like natural language

More information

Optimizing Planar and 2-Planar Parsers with MaltOptimizer

Optimizing Planar and 2-Planar Parsers with MaltOptimizer Procesamiento del Lenguaje Natural, Revista nº 49 septiembre de 2012, pp 171-178 recibido 05-05-12 revisado 28-05-12 aceptado 04-06-12 Optimizing Planar and 2-Planar Parsers with MaltOptimizer Optimizando

More information

Simple and Effective Dimensionality Reduction for Word Embeddings

Simple and Effective Dimensionality Reduction for Word Embeddings Simple and Effective Dimensionality Reduction for Word Embeddings Vikas Raunak Microsoft India, Hyderabad viraun@microsoft.com Abstract Word embeddings have become the basic building blocks for several

More information

Training for Fast Sequential Prediction Using Dynamic Feature Selection

Training for Fast Sequential Prediction Using Dynamic Feature Selection Training for Fast Sequential Prediction Using Dynamic Feature Selection Emma Strubell Luke Vilnis Andrew McCallum School of Computer Science University of Massachusetts, Amherst Amherst, MA 01002 {strubell,

More information

Shared Task Introduction

Shared Task Introduction Shared Task Introduction Learning Machine Learning Nils Reiter September 26-27, 2018 Nils Reiter (CRETA) Shared Task Introduction September 26-27, 2018 1 / 28 Outline Shared Tasks Data and Annotations

More information

Predicting answer types for question-answering

Predicting answer types for question-answering Predicting answer types for question-answering Ivan Bogatyy Google Inc. 1600 Amphitheatre pkwy Mountain View, CA 94043 bogatyy@google.com Abstract We tackle the problem of answer type prediction, applying

More information