Machine Translation Zoo
1 Machine Translation Zoo Tree-to-tree transfer and Discriminative learning Martin Popel ÚFAL (Institute of Formal and Applied Linguistics) Charles University in Prague May 5th 2013, Seminar of Formal Linguistics, Prague
2-3 Today's Menu 1 MT Intro: Taxonomy, Hybrids 2 Online Learning: Perceptron, Structured Prediction 3 Guided Learning 4 Back to MT: Easy-First Decoding in MT, Guided Learning in MT
4 Phrase-based MT (Moses) Training: word alignment (GIZA++ & symmetrization); phrase extraction; parameter tuning (MERT). Decoding: get all matching rules; find one derivation with the maximum score (beam search).
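The decoding step (find one derivation with the maximum score via beam search) can be sketched in miniature. This is a toy monotone decoder with a made-up phrase table, no language model and no reordering, so a sketch of the idea rather than what Moses actually does:

```python
import heapq

# Hypothetical toy phrase table: source phrase -> [(target, log-prob)].
PT = {
    ("ta",): [("that", -0.2), ("the", -0.9)],
    ("kniha",): [("book", -0.1)],
    ("ta", "kniha"): [("that book", -0.25)],
}

def decode(src, beam=2):
    """Monotone beam search: beams[i] holds hypotheses covering src[:i]."""
    beams = [[] for _ in range(len(src) + 1)]
    beams[0] = [(0.0, "")]
    for i in range(len(src)):
        beams[i] = heapq.nlargest(beam, beams[i])     # prune to beam size
        for score, text in beams[i]:
            for j in range(i + 1, len(src) + 1):      # extend by one phrase
                for tgt, logprob in PT.get(tuple(src[i:j]), []):
                    beams[j].append((score + logprob, (text + " " + tgt).strip()))
    return max(beams[len(src)])                       # best complete hypothesis

score, text = decode(("ta", "kniha"))
# the single phrase "ta kniha" (-0.25) beats "ta" + "kniha" (-0.2 - 0.1 = -0.3)
```

Real phrase-based decoders additionally track coverage vectors for reordering, add language-model and distortion scores, and recombine equivalent hypotheses.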
5 TectoMT Training: analyze CzEng to the t-layer; t-node alignment; learn one MaxEnt model for each source lemma and formeme. Decoding: get all translation variants for each lemma and formeme; find a labeling with the maximum score (HMTM).
6 TectoMT MaxEnt Model (Diagram: analysis-transfer-synthesis of "He agreed with the unions to cut all overtime." → "Dohodl se s odbory na zrušení všech přesčasů."; t-nodes such as he/n:subj, union/n:with+X, agree/v:fin (tense=past, voice=active, negation=0, sempos=v), cut/v:inf, overtime/n:obj, all/adj:attr; the MaxEnt model scores translation variants for "cut" (chop, saw, trim, shorten, lumber, hew, lower, delete, crop, abolish, cancel, ...) using features like has_left_child=0, has_right_child=1, tag=VB, position=right, named_entity=0.)
7-13 Machine Translation Taxonomy
Dimensions: level of transfer; base translation unit (BTU); whether more segmentations are extracted in training, tried (searched) in decoding, and used in the output translation; and the context X in P(BTU_target | BTU_source, X), considering just the translation model.
- word-based (Brown et al., 1993): surface transfer; BTU = word; no / no / no; X = nothing
- phrase-based (Koehn et al., 2003): surface transfer; BTU = phrase; yes / yes / no; X = nothing
- hierarchical (Chiang, 2005): surface transfer; BTU = phrase with gaps; yes / yes / no; X = nothing
- dep. treelet to string (Quirk and Menezes, 2006): shallow-syntax transfer; BTU = treelet; no / no / no; X = neighboring treelets
- TectoMT (Mareček et al., 2010): tectogrammatical transfer; BTU = node; no / no / no; X = neighboring nodes
- Monte Carlo (Arun, 2011): surface transfer; BTU = phrase; yes / yes / yes; X = nothing
14-16 Hybrids: TectoMoses
Linearize source t-trees (two factors: lemma and formeme), translate with Moses, project dependencies, and use the TectoMT synthesis.
(Diagram: the TectoMT analysis-transfer-synthesis pipeline over the w-, m-, a- and t-layers; its rule-based and statistical blocks include segmentation, tokenization, the Morce tagger, lemmatization, McDonald's MST parser, t-tree building, dictionary query, HMTM, filling formemes and grammatemes, adding functional words, imposing agreement, filling morphological categories, generating wordforms, and concatenation.)
17-18 Hybrids: PhraseFix
Done for WMT 2013 by Petra Galuščáková: post-edit TectoMT output using Moses trained on cs-tectomt → cs-reference (whole CzEng).
How to post-edit only when confident? Filter the phrase table; add a confidence feature for MERT; improve alignment (monolingual); boost the phrase table (e.g. with identities).
Future work: use also the source (English) sentences; multi-source translation; project only content words (using TectoMT); factored translation with non-synchronous (overlapping) factors.
19-20 Even More Hybrids: DepFix, AddToTrain, Chimera
DepFix (Rosa et al., 2012): post-edit SMT output using syntactic analysis and rules; exploit also the source sentences; robust parsing.
AddToTrain (Bojar, Galuščáková): translate monolingual news (or WMT devsets) with TectoMT and add it to the Moses parallel training data.
Chimera: post-edit AddToTrain output with DepFix; sent to WMT 2013 in an attempt to beat Google.
21 Today's Menu 1 MT Intro: Taxonomy, Hybrids 2 Online Learning: Perceptron, Structured Prediction 3 Guided Learning 4 Back to MT: Easy-First Decoding in MT, Guided Learning in MT
22-31 General Algorithm for Online Learning
w := 0                                   (initialize all weights to zero)
while (x, y_gold) := get_new_data():     (for each instance: 1. get its features x, 3. get the correct label y_gold)
    y_pred := prediction(w, x)           (2. do the prediction)
    w += update(x, y_gold, y_pred)       (4. update the weights)
Output: w
Definition (conservative online learning): no error, no update; i.e., if y_pred = y_gold then update(x, y_gold, y_pred) = 0.
Definition (aggressive online learning): after the update, the instance would be classified correctly, i.e., prediction(w, x) = y_gold.
32-38 Perceptron
The same online-learning loop, instantiated as follows.
Binary perceptron:
    prediction(w, x) := [w · x > 0]
    update(x, y_gold, y_pred) := α (y_gold - y_pred) x
where w · x = Σ_i w_i x_i is the dot product (similarity score) of weights and features, [P] is the Iverson bracket (1 if P is true, 0 otherwise), and α > 0 is the learning rate (step size).
Multi-class perceptron:
    prediction(w, x) := argmax_y w · f(x, y)
    update(x, y_gold, y_pred) := α (f(x, y_gold) - f(x, y_pred)), i.e. w := w + α f(x, y_gold) - α f(x, y_pred)
Special case, multi-prototype features: f(x, y) := ([y = class_1] x, [y = class_2] x, ..., [y = class_C] x).
General case: any label-dependent features, e.g. f_101(x, y) := [(y = NNP or y = NNPS) and x capitalized].
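A minimal runnable version of the binary perceptron above; the toy dataset and the constant bias feature are made up for illustration:

```python
def perceptron(data, n_epochs=10, alpha=1.0):
    """Binary perceptron: labels y in {0, 1}, x a feature vector."""
    dim = len(data[0][0])
    w = [0.0] * dim                                  # w := 0
    for _ in range(n_epochs):
        for x, y_gold in data:
            score = sum(wi * xi for wi, xi in zip(w, x))
            y_pred = 1 if score > 0 else 0           # [w . x > 0]
            if y_pred != y_gold:                     # conservative: no error, no update
                for i in range(dim):
                    w[i] += alpha * (y_gold - y_pred) * x[i]
    return w

# Toy data: y = 1 iff x1 > x2; the last feature is a constant bias.
data = [([1.0, 2.0, 1.0], 0), ([2.0, 1.0, 1.0], 1),
        ([0.5, 3.0, 1.0], 0), ([3.0, 0.5, 1.0], 1)]
w = perceptron(data)
preds = [1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0 for x, _ in data]
```

Since the toy data are linearly separable, the loop stops changing w after it makes no more errors on a full pass.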
39 Structured Prediction
The number of possible labels is huge; labels y have a structure (graph, tree, sequence, ...) and can usually be decomposed (factorized) into subproblems.
Local features f_i(x, y, j) can use the whole x, but only those y_k where k is near j:
    f_101(x, y, j) := [(y_j = NNP or y_j = NNPS) and word x_j capitalized]
    f_102(x, y, j) := [y_j = NNP and y_{j-1} = NNP and |x| ≤ 6]
Global features F_i(x, y) := Σ_j f_i(x, y, j):
    F_101 = number of capitalized words with tag NNP or NNPS
    F_102 = number of NNP followed by NNP, or 0 if the sentence is longer than six words
We can also define features that cannot be decomposed.
40 Structured Prediction using Online Learning
1. Local approach: update after each local decision; the output of previous decisions is used in local features; e.g. structured perceptron (Collins, 2002):
    y_pred = argmax_y Σ_i w_i f_i(x, y_j, y_{j-1}, ...)
2. Global approach: generate an n-best list (lattice) of outputs y for the whole x, compute global features, and do the update once per x (sentence), i.e. re-rank the n-best list; e.g. MIRA (Crammer and Singer, 2003):
    y_pred = argmax_y Σ_i w_i F_i(x, y)
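The global update can be sketched as a tiny structured perceptron in the spirit of Collins (2002). For brevity the exact argmax enumerates all tag sequences instead of using Viterbi search, and the tagset, features and data are made up:

```python
from collections import Counter
from itertools import product

TAGS = ["D", "N"]                       # hypothetical toy tagset

def features(words, tags):
    """Global features F(x, y): emission and transition indicator counts."""
    f = Counter()
    prev = "<s>"
    for word, tag in zip(words, tags):
        f[("emit", word, tag)] += 1
        f[("trans", prev, tag)] += 1
        prev = tag
    return f

def predict(w, words):
    """argmax_y w . F(x, y), exhaustively (fine for toy inputs)."""
    return max(product(TAGS, repeat=len(words)),
               key=lambda tags: sum(w.get(k, 0.0) * v
                                    for k, v in features(words, tags).items()))

def train(data, n_epochs=5, alpha=1.0):
    w = {}
    for _ in range(n_epochs):
        for words, gold in data:
            pred = list(predict(w, words))
            if pred != gold:                         # conservative update
                for k, v in features(words, gold).items():
                    w[k] = w.get(k, 0.0) + alpha * v
                for k, v in features(words, pred).items():
                    w[k] = w.get(k, 0.0) - alpha * v
    return w

data = [(["the", "book"], ["D", "N"]), (["a", "fox"], ["D", "N"])]
w = train(data)
```

The transition weights learned from the training pairs let the model tag an unseen word ("fox" after "the") correctly via its context.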
41-45 Margin-based Online Learning
Definitions:
    score(y) = w · f(x, y)
    margin(y) = score(y_gold) - score(y)      (margin > 0 means no error; the margin measures confidence)
    hinge_loss(y) = max(0, 1 - margin(y))
Online prediction and update:
    y_pred := argmax_y w · f(x, y)
    w += α (f(x, y_gold) - f(x, y_pred))
Choices of the learning rate:
    Perceptron: α_Perc := 1 (or any fixed value > 0)
    Passive-Aggressive (PA): α_PA := hinge_loss(y_pred) / ||f(x, y_gold) - f(x, y_pred)||²
    Passive-Aggressive I: α_PA-I := min{C, α_PA}
    Passive-Aggressive II: α_PA-II := hinge_loss(y_pred) / (||f(x, y_gold) - f(x, y_pred)||² + 1/(2C))
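The four choices of the learning rate differ only in a small formula; a minimal sketch (the hinge loss and the feature difference f(x, y_gold) - f(x, y_pred) are assumed to be computed as defined above):

```python
def alpha(hinge_loss, diff, C=1.0, variant="PA"):
    """Learning rate for the update w += alpha * diff."""
    sq_norm = sum(d * d for d in diff)    # ||f(x, y_gold) - f(x, y_pred)||^2
    if variant == "Perc":
        return 1.0                        # or any fixed value > 0
    if sq_norm == 0.0:
        return 0.0                        # identical features: nothing to move
    if variant == "PA":
        return hinge_loss / sq_norm
    if variant == "PA-I":
        return min(C, hinge_loss / sq_norm)
    if variant == "PA-II":
        return hinge_loss / (sq_norm + 1.0 / (2.0 * C))
    raise ValueError(variant)
```

PA is fully aggressive (after the update the margin reaches 1); PA-I caps the step at C and PA-II dampens it smoothly, which makes both more robust to noisy labels.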
46-48 Cost-sensitive Online Learning
Definitions:
    cost(y) = external error metric (non-negative), e.g. 1 - similarity of y and y_gold
    hinge_loss(y) = max(0, cost(y) - margin(y))
Hope-and-fear update:
    w += α (f(x, y_hope) - f(x, y_fear))
Hypothesis selection:
    min-cost:        y_hope := argmin_y cost(y)
    cost-diminished: y_hope := argmax_y score(y) - cost(y)
    max-score:       y_fear := argmax_y score(y)      (= y_pred)
    cost-augmented:  y_fear := argmax_y score(y) + cost(y)
    max-cost:        y_fear := argmax_y cost(y)
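Hope/fear selection over an n-best list is just a few argmax calls; a sketch with made-up hypotheses, model scores and costs:

```python
def select_hope_fear(nbest, scores, costs,
                     hope="cost-diminished", fear="cost-augmented"):
    """Pick hope and fear hypotheses; scores[i] = w . f(x, y_i),
    costs[i] = external error metric (e.g. 1 - sentence BLEU)."""
    idx = range(len(nbest))
    hopes = {
        "min-cost":        min(idx, key=lambda i: costs[i]),
        "cost-diminished": max(idx, key=lambda i: scores[i] - costs[i]),
    }
    fears = {
        "max-score":       max(idx, key=lambda i: scores[i]),
        "cost-augmented":  max(idx, key=lambda i: scores[i] + costs[i]),
        "max-cost":        max(idx, key=lambda i: costs[i]),
    }
    return nbest[hopes[hope]], nbest[fears[fear]]

nbest = ["hyp_a", "hyp_b", "hyp_c"]     # hypothetical n-best list
scores = [3.0, 2.9, 1.0]                # model scores w . f(x, y)
costs = [0.9, 0.1, 0.2]                 # e.g. 1 - sentence BLEU
hope, fear = select_hope_fear(nbest, scores, costs)
```

With these numbers the cost-diminished hope is hyp_b (high score, low cost) and the cost-augmented fear is hyp_a (high score but high cost), so the update moves weight from hyp_a's features toward hyp_b's.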
49-55 Cost-sensitive Online Learning: Hypothesis Selection
(Scatter plot of an n-best list with axes "score" and "-cost", highlighting the min-cost (max-BLEU) hope, the cost-diminished hope, the max-score fear (y_pred), the cost-augmented fear, and the max-cost fear.)
56 Application to MT
x = source sentence; y_gold = its reference translation (sometimes more references are available; the reference may be unreachable). We score derivations (which include latent variables); one translation may have several derivations.
57 Today's Menu 1 MT Intro: Taxonomy, Hybrids 2 Online Learning: Perceptron, Structured Prediction 3 Guided Learning 4 Back to MT: Easy-First Decoding in MT, Guided Learning in MT
58-67 Easy-First Decoding and Guided Learning (PoS Tagging) (Shen, Satta and Joshi, 2007)
(Animated example for "Agatha found that book interesting": each word has scored candidate tags such as NNP, VBD, DT, NN, VBN, IN, RB, JJ.) Decoding fixes the easiest (highest-scoring) decision first; once a tag is fixed, neighboring candidates are re-scored with context features such as f_123 := [y_j = DT, y_{j+1} = NN] or f_124 := [y_j = RB, y_{j+1} = NN]. Guided learning: when the top-scoring remaining candidate (y_fear, e.g. VBN for "found") differs from the correct one (y_hope, VBD), the weights are updated.
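The easy-first loop (fix the most confident decision first and let the fixed context help the harder ones) can be sketched as follows; the scorer, weights and tagset are hypothetical:

```python
def easy_first_tag(words, score_fn, candidates):
    """Repeatedly fix the single highest-scoring (position, tag) decision;
    score_fn may look at already-fixed neighboring tags."""
    tags = [None] * len(words)
    while None in tags:
        _, j, best_tag = max((score_fn(words, tags, j, tag), j, tag)
                             for j in range(len(words)) if tags[j] is None
                             for tag in candidates)
        tags[j] = best_tag
    return tags

# Hypothetical weights: lexical preferences plus one context feature,
# f := [y_{j-1} = DT and y_j = NN], active only once the neighbor is fixed.
LEX = {"the": {"DT": 3.0}, "book": {"NN": 1.0, "VBD": 1.2}}

def score_fn(words, tags, j, tag):
    s = LEX.get(words[j], {}).get(tag, 0.0)
    if tag == "NN" and j > 0 and tags[j - 1] == "DT":
        s += 2.0
    return s

tagged = easy_first_tag(["the", "book"], score_fn, ["DT", "NN", "VBD"])
# a left-to-right greedy tagger would mistag "book" as VBD (1.2 > 1.0);
# easy-first fixes "the"=DT first, and the context feature lifts NN to 3.0
```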
68-71 Easy-First Decoding (Dependency Parsing) (Goldberg and Elhadad, 2010)
(Animated example for "a brown fox jumped with joy": scored ATTACHRIGHT/ATTACHLEFT actions are applied easiest-first, e.g. attaching "brown" and "a" to "fox" and "joy" to "with", until the full tree rooted in "jumped" is built.)
72 Today's Menu 1 MT Intro: Taxonomy, Hybrids 2 Online Learning: Perceptron, Structured Prediction 3 Guided Learning 4 Back to MT: Easy-First Decoding in MT, Guided Learning in MT
73-81 Easy-First Decoding (Phrase-Based MT)
(Animated example for "Agatha found that book interesting" with Czech candidate phrase translations, e.g. "found" → našel / zjistil, že / přišla; "that" → že / ta; "book" → kniha / rezervovat; "interesting" → zajímavý / zajímavá; "that book" → ta kniha; "book interesting" → kniha zajímavá.) The easiest (highest-scoring) segment is translated first; neighboring candidates are then re-scored with a language model over the already-fixed target context, e.g. preferring the agreeing form "zajímavá" over "zajímavý" next to "kniha".
82-83 Features for Guided Learning in MT
Source segment features: segment size (number of words); entropy of the translation distribution, H(target | source) = -Σ_i P(trg_i | src) log P(trg_i | src); log count(source); source language model log P(source); word identity, e.g. f_42 := [src = "found that"]; PoS identity, e.g. f_43 := [src_pos = "VBD IN"].
Target-dependent features: log P(trg | src); target language model log P(target | previous segment); log count(target)?; identity, e.g. f_142 := [src = "found that" & trg = "zjistil"].
Combinations and quantizations: e.g. [size(src) = 3], [size(src) = 3 & -3 < log P(trg | src) < -2], etc.
84-85 Application to Tecto Trees
(Two t-trees for "Agatha found this book interesting": root find/v:fin with children Agatha/n:subj, book/n:obj (child this/adj:attr) and interesting/adj:compl, shown before and after reordering.)
86 What have you seen in the Zoo?
88 Predictions? Hope and Fear
More informationFeatures for Syntax-based Moses: Project Report
Features for Syntax-based Moses: Project Report Eleftherios Avramidis, Arefeh Kazemi, Tom Vanallemeersch, Phil Williams, Rico Sennrich, Maria Nadejde, Matthias Huck 13 September 2014 Features for Syntax-based
More informationSemantics as a Foreign Language. Gabriel Stanovsky and Ido Dagan EMNLP 2018
Semantics as a Foreign Language Gabriel Stanovsky and Ido Dagan EMNLP 2018 Semantic Dependency Parsing (SDP) A collection of three semantic formalisms (Oepen et al., 2014;2015) Semantic Dependency Parsing
More informationDiscriminative Reranking for Machine Translation
Discriminative Reranking for Machine Translation Libin Shen Dept. of Comp. & Info. Science Univ. of Pennsylvania Philadelphia, PA 19104 libin@seas.upenn.edu Anoop Sarkar School of Comp. Science Simon Fraser
More informationSampleRank Training for Phrase-Based Machine Translation
SampleRank Training for Phrase-Based Machine Translation Barry Haddow School of Informatics University of Edinburgh bhaddow@inf.ed.ac.uk Abhishek Arun Microsoft UK abarun@microsoft.com Philipp Koehn School
More informationAdvanced Natural Language Processing and Information Retrieval
Advanced Natural Language Processing and Information Retrieval LAB3: Kernel Methods for Reranking Alessandro Moschitti Department of Computer Science and Information Engineering University of Trento Email:
More informationDynamic Feature Selection for Dependency Parsing
Dynamic Feature Selection for Dependency Parsing He He, Hal Daumé III and Jason Eisner EMNLP 2013, Seattle Structured Prediction in NLP Part-of-Speech Tagging Parsing N N V Det N Fruit flies like a banana
More informationAT&T: The Tag&Parse Approach to Semantic Parsing of Robot Spatial Commands
AT&T: The Tag&Parse Approach to Semantic Parsing of Robot Spatial Commands Svetlana Stoyanchev, Hyuckchul Jung, John Chen, Srinivas Bangalore AT&T Labs Research 1 AT&T Way Bedminster NJ 07921 {sveta,hjung,jchen,srini}@research.att.com
More informationExploiting Internal and External Semantics for the Clustering of Short Texts Using World Knowledge
Exploiting Internal and External Semantics for the Using World Knowledge, 1,2 Nan Sun, 1 Chao Zhang, 1 Tat-Seng Chua 1 1 School of Computing National University of Singapore 2 School of Computer Science
More informationAchieving Domain Specificity in SMT without Overt Siloing
Achieving Domain Specificity in SMT without Overt Siloing William D. Lewis, Chris Wendt, David Bullock Microsoft Research Redmond, WA 98052, USA wilewis@microsoft.com, christw@microsoft.com, a-davbul@microsoft.com
More informationMorpho-syntactic Analysis with the Stanford CoreNLP
Morpho-syntactic Analysis with the Stanford CoreNLP Danilo Croce croce@info.uniroma2.it WmIR 2015/2016 Objectives of this tutorial Use of a Natural Language Toolkit CoreNLP toolkit Morpho-syntactic analysis
More informationShallow Parsing Swapnil Chaudhari 11305R011 Ankur Aher Raj Dabre 11305R001
Shallow Parsing Swapnil Chaudhari 11305R011 Ankur Aher - 113059006 Raj Dabre 11305R001 Purpose of the Seminar To emphasize on the need for Shallow Parsing. To impart basic information about techniques
More informationTransition-based Parsing with Neural Nets
CS11-747 Neural Networks for NLP Transition-based Parsing with Neural Nets Graham Neubig Site https://phontron.com/class/nn4nlp2017/ Two Types of Linguistic Structure Dependency: focus on relations between
More informationThe HDU Discriminative SMT System for Constrained Data PatentMT at NTCIR10
The HDU Discriminative SMT System for Constrained Data PatentMT at NTCIR10 Patrick Simianer, Gesa Stupperich, Laura Jehl, Katharina Wäschle, Artem Sokolov, Stefan Riezler Institute for Computational Linguistics,
More informationDensity-Driven Cross-Lingual Transfer of Dependency Parsers
Density-Driven Cross-Lingual Transfer of Dependency Parsers Mohammad Sadegh Rasooli Michael Collins rasooli@cs.columbia.edu Presented by Owen Rambow EMNLP 2015 Motivation Availability of treebanks Accurate
More informationStack- propaga+on: Improved Representa+on Learning for Syntax
Stack- propaga+on: Improved Representa+on Learning for Syntax Yuan Zhang, David Weiss MIT, Google 1 Transi+on- based Neural Network Parser p(action configuration) So1max Hidden Embedding words labels POS
More informationUNIVERSITY OF EDINBURGH COLLEGE OF SCIENCE AND ENGINEERING SCHOOL OF INFORMATICS INFR08008 INFORMATICS 2A: PROCESSING FORMAL AND NATURAL LANGUAGES
UNIVERSITY OF EDINBURGH COLLEGE OF SCIENCE AND ENGINEERING SCHOOL OF INFORMATICS INFR08008 INFORMATICS 2A: PROCESSING FORMAL AND NATURAL LANGUAGES Saturday 10 th December 2016 09:30 to 11:30 INSTRUCTIONS
More informationVoting between Multiple Data Representations for Text Chunking
Voting between Multiple Data Representations for Text Chunking Hong Shen and Anoop Sarkar School of Computing Science Simon Fraser University Burnaby, BC V5A 1S6, Canada {hshen,anoop}@cs.sfu.ca Abstract.
More informationDiscriminative Training with Perceptron Algorithm for POS Tagging Task
Discriminative Training with Perceptron Algorithm for POS Tagging Task Mahsa Yarmohammadi Center for Spoken Language Understanding Oregon Health & Science University Portland, Oregon yarmoham@ohsu.edu
More informationAnnual Public Report - Project Year 2 November 2012
November 2012 Grant Agreement number: 247762 Project acronym: FAUST Project title: Feedback Analysis for User Adaptive Statistical Translation Funding Scheme: FP7-ICT-2009-4 STREP Period covered: from
More informationComplex Prediction Problems
Problems A novel approach to multiple Structured Output Prediction Max-Planck Institute ECML HLIE08 Information Extraction Extract structured information from unstructured data Typical subtasks Named Entity
More informationForest-based Algorithms in
Forest-based Algorithms in Natural Language Processing Liang Huang overview of Ph.D. work done at Penn (and ISI, ICT) INSTITUTE OF COMPUTING TECHNOLOGY includes joint work with David Chiang, Kevin Knight,
More informationTekniker för storskalig parsning: Dependensparsning 2
Tekniker för storskalig parsning: Dependensparsning 2 Joakim Nivre Uppsala Universitet Institutionen för lingvistik och filologi joakim.nivre@lingfil.uu.se Dependensparsning 2 1(45) Data-Driven Dependency
More informationI Know Your Name: Named Entity Recognition and Structural Parsing
I Know Your Name: Named Entity Recognition and Structural Parsing David Philipson and Nikil Viswanathan {pdavid2, nikil}@stanford.edu CS224N Fall 2011 Introduction In this project, we explore a Maximum
More informationBasic Parsing with Context-Free Grammars. Some slides adapted from Karl Stratos and from Chris Manning
Basic Parsing with Context-Free Grammars Some slides adapted from Karl Stratos and from Chris Manning 1 Announcements HW 2 out Midterm on 10/19 (see website). Sample ques>ons will be provided. Sign up
More informationNatural Language Processing
Natural Language Processing Classification III Dan Klein UC Berkeley 1 Classification 2 Linear Models: Perceptron The perceptron algorithm Iteratively processes the training set, reacting to training errors
More informationLexical Semantics. Regina Barzilay MIT. October, 5766
Lexical Semantics Regina Barzilay MIT October, 5766 Last Time: Vector-Based Similarity Measures man woman grape orange apple n Euclidian: x, y = x y = i=1 ( x i y i ) 2 n x y x i y i i=1 Cosine: cos( x,
More informationEDAN20 Language Technology Chapter 13: Dependency Parsing
EDAN20 Language Technology http://cs.lth.se/edan20/ Pierre Nugues Lund University Pierre.Nugues@cs.lth.se http://cs.lth.se/pierre_nugues/ September 19, 2016 Pierre Nugues EDAN20 Language Technology http://cs.lth.se/edan20/
More informationApache UIMA and Mayo ctakes
Apache and Mayo and how it is used in the clinical domain March 16, 2012 Apache and Mayo Outline 1 Apache and Mayo Outline 1 2 Introducing Pipeline Modules Apache and Mayo What is? (You - eee - muh) Unstructured
More informationNTT SMT System for IWSLT Katsuhito Sudoh, Taro Watanabe, Jun Suzuki, Hajime Tsukada, and Hideki Isozaki NTT Communication Science Labs.
NTT SMT System for IWSLT 2008 Katsuhito Sudoh, Taro Watanabe, Jun Suzuki, Hajime Tsukada, and Hideki Isozaki NTT Communication Science Labs., Japan Overview 2-stage translation system k-best translation
More informationSchool of Computing and Information Systems The University of Melbourne COMP90042 WEB SEARCH AND TEXT ANALYSIS (Semester 1, 2017)
Discussion School of Computing and Information Systems The University of Melbourne COMP9004 WEB SEARCH AND TEXT ANALYSIS (Semester, 07). What is a POS tag? Sample solutions for discussion exercises: Week
More informationComputationally Efficient M-Estimation of Log-Linear Structure Models
Computationally Efficient M-Estimation of Log-Linear Structure Models Noah Smith, Doug Vail, and John Lafferty School of Computer Science Carnegie Mellon University {nasmith,dvail2,lafferty}@cs.cmu.edu
More informationRandom Restarts in Minimum Error Rate Training for Statistical Machine Translation
Random Restarts in Minimum Error Rate Training for Statistical Machine Translation Robert C. Moore and Chris Quirk Microsoft Research Redmond, WA 98052, USA bobmoore@microsoft.com, chrisq@microsoft.com
More informationD1.6 Toolkit and Report for Translator Adaptation to New Languages (Final Version)
www.kconnect.eu D1.6 Toolkit and Report for Translator Adaptation to New Languages (Final Version) Deliverable number D1.6 Dissemination level Public Delivery date 16 September 2016 Status Author(s) Final
More informationTEXT MINING APPLICATION PROGRAMMING
TEXT MINING APPLICATION PROGRAMMING MANU KONCHADY CHARLES RIVER MEDIA Boston, Massachusetts Contents Preface Acknowledgments xv xix Introduction 1 Originsof Text Mining 4 Information Retrieval 4 Natural
More informationCS6200 Information Retrieval. David Smith College of Computer and Information Science Northeastern University
CS6200 Information Retrieval David Smith College of Computer and Information Science Northeastern University Indexing Process Processing Text Converting documents to index terms Why? Matching the exact
More informationFully Delexicalized Contexts for Syntax-Based Word Embeddings
Fully Delexicalized Contexts for Syntax-Based Word Embeddings Jenna Kanerva¹, Sampo Pyysalo² and Filip Ginter¹ ¹Dept of IT - University of Turku, Finland ²Lang. Tech. Lab - University of Cambridge turkunlp.github.io
More informationLSTM for Language Translation and Image Captioning. Tel Aviv University Deep Learning Seminar Oran Gafni & Noa Yedidia
1 LSTM for Language Translation and Image Captioning Tel Aviv University Deep Learning Seminar Oran Gafni & Noa Yedidia 2 Part I LSTM for Language Translation Motivation Background (RNNs, LSTMs) Model
More informationStructured Perceptron. Ye Qiu, Xinghui Lu, Yue Lu, Ruofei Shen
Structured Perceptron Ye Qiu, Xinghui Lu, Yue Lu, Ruofei Shen 1 Outline 1. 2. 3. 4. Brief review of perceptron Structured Perceptron Discriminative Training Methods for Hidden Markov Models: Theory and
More informationThe Prague Bulletin of Mathematical Linguistics NUMBER 91 JANUARY Memory-Based Machine Translation and Language Modeling
The Prague Bulletin of Mathematical Linguistics NUMBER 91 JANUARY 2009 17 26 Memory-Based Machine Translation and Language Modeling Antal van den Bosch, Peter Berck Abstract We describe a freely available
More informationPBML. logo. The Prague Bulletin of Mathematical Linguistics NUMBER??? FEBRUARY Improved Minimum Error Rate Training in Moses
PBML logo The Prague Bulletin of Mathematical Linguistics NUMBER??? FEBRUARY 2009 1 11 Improved Minimum Error Rate Training in Moses Nicola Bertoldi, Barry Haddow, Jean-Baptiste Fouet Abstract We describe
More informationStructural Text Features. Structural Features
Structural Text Features CISC489/689 010, Lecture #13 Monday, April 6 th Ben CartereGe Structural Features So far we have mainly focused on vanilla features of terms in documents Term frequency, document
More informationGraph-Based Parsing. Miguel Ballesteros. Algorithms for NLP Course. 7-11
Graph-Based Parsing Miguel Ballesteros Algorithms for NLP Course. 7-11 By using some Joakim Nivre's materials from Uppsala University and Jason Eisner's material from Johns Hopkins University. Outline
More informationRefactored Moses. Hieu Hoang MT Marathon Prague August 2013
Refactored Moses Hieu Hoang MT Marathon Prague August 2013 What did you Refactor? Decoder Feature Func?on Framework moses.ini format Simplify class structure Deleted func?onality NOT decoding algorithms
More informationCS246: Mining Massive Datasets Jure Leskovec, Stanford University
CS246: Mining Massive Datasets Jure Leskovec, Stanford University http://cs246.stanford.edu [Kumar et al. 99] 2/13/2013 Jure Leskovec, Stanford CS246: Mining Massive Datasets, http://cs246.stanford.edu
More informationTransition-based dependency parsing
Transition-based dependency parsing Syntactic analysis (5LN455) 2014-12-18 Sara Stymne Department of Linguistics and Philology Based on slides from Marco Kuhlmann Overview Arc-factored dependency parsing
More informationDependency grammar and dependency parsing
Dependency grammar and dependency parsing Syntactic analysis (5LN455) 2014-12-10 Sara Stymne Department of Linguistics and Philology Based on slides from Marco Kuhlmann Mid-course evaluation Mostly positive
More informationMarcello Federico MMT Srl / FBK Trento, Italy
Marcello Federico MMT Srl / FBK Trento, Italy Proceedings for AMTA 2018 Workshop: Translation Quality Estimation and Automatic Post-Editing Boston, March 21, 2018 Page 207 Symbiotic Human and Machine Translation
More informationPV211: Introduction to Information Retrieval
PV211: Introduction to Information Retrieval http://www.fi.muni.cz/~sojka/pv211 IIR 15-1: Support Vector Machines Handout version Petr Sojka, Hinrich Schütze et al. Faculty of Informatics, Masaryk University,
More informationA Space-Efficient Phrase Table Implementation Using Minimal Perfect Hash Functions
A Space-Efficient Phrase Table Implementation Using Minimal Perfect Hash Functions Marcin Junczys-Dowmunt Faculty of Mathematics and Computer Science Adam Mickiewicz University ul. Umultowska 87, 61-614
More informationImproved Models of Distortion Cost for Statistical Machine Translation
Improved Models of Distortion Cost for Statistical Machine Translation Spence Green, Michel Galley, and Christopher D. Manning Computer Science Department Stanford University Stanford, CA 9435 {spenceg,mgalley,manning}@stanford.edu
More informationMoses Version 3.0 Release Notes
Moses Version 3.0 Release Notes Overview The document contains the release notes for the Moses SMT toolkit, version 3.0. It describes the changes to the toolkit since version 2.0 from January 2014. In
More informationIntroducing XAIRA. Lou Burnard Tony Dodd. An XML aware tool for corpus indexing and searching. Research Technology Services, OUCS
Introducing XAIRA An XML aware tool for corpus indexing and searching Lou Burnard Tony Dodd Research Technology Services, OUCS What is XAIRA? XML Aware Indexing and Retrieval Architecture Developed from
More informationThe Prague Bulletin of Mathematical Linguistics NUMBER 94 SEPTEMBER An Experimental Management System. Philipp Koehn
The Prague Bulletin of Mathematical Linguistics NUMBER 94 SEPTEMBER 2010 87 96 An Experimental Management System Philipp Koehn University of Edinburgh Abstract We describe Experiment.perl, an experimental
More informationFinal Project Discussion. Adam Meyers Montclair State University
Final Project Discussion Adam Meyers Montclair State University Summary Project Timeline Project Format Details/Examples for Different Project Types Linguistic Resource Projects: Annotation, Lexicons,...
More informationAnnotation Procedure in Building the Prague Czech-English Dependency Treebank
Annotation Procedure in Building the Prague Czech-English Dependency Treebank Marie Mikulová and Jan Štěpánek Institute of Formal and Applied Linguistics, Charles University in Prague Abstract. In this
More informationINTERNATIONAL JOURNAL OF COMPUTER ENGINEERING & TECHNOLOGY (IJCET) CONTEXT SENSITIVE TEXT SUMMARIZATION USING HIERARCHICAL CLUSTERING ALGORITHM
INTERNATIONAL JOURNAL OF COMPUTER ENGINEERING & 6367(Print), ISSN 0976 6375(Online) Volume 3, Issue 1, January- June (2012), TECHNOLOGY (IJCET) IAEME ISSN 0976 6367(Print) ISSN 0976 6375(Online) Volume
More informationBatch Tuning Strategies for Statistical Machine Translation
Batch Tuning Strategies for Statistical Machine Translation Colin Cherry and George Foster National Research Council Canada {Colin.Cherry,George.Foster}@nrc-cnrc.gc.ca Abstract There has been a proliferation
More informationHidden Markov Models. Natural Language Processing: Jordan Boyd-Graber. University of Colorado Boulder LECTURE 20. Adapted from material by Ray Mooney
Hidden Markov Models Natural Language Processing: Jordan Boyd-Graber University of Colorado Boulder LECTURE 20 Adapted from material by Ray Mooney Natural Language Processing: Jordan Boyd-Graber Boulder
More informationStructured Perceptron with Inexact Search
Structured Perceptron with Inexact Search Liang Huang Suphan Fayong Yang Guo presented by Allan July 15, 2016 Slides: http://www.statnlp.org/sperceptron.html Table of Contents Structured Perceptron Exact
More informationCorpus Linguistics: corpus annotation
Corpus Linguistics: corpus annotation Karën Fort karen.fort@inist.fr November 30, 2010 Introduction Methodology Annotation Issues Annotation Formats From Formats to Schemes Sources Most of this course
More informationLING/C SC/PSYC 438/538. Lecture 3 Sandiway Fong
LING/C SC/PSYC 438/538 Lecture 3 Sandiway Fong Today s Topics Homework 4 out due next Tuesday by midnight Homework 3 should have been submitted yesterday Quick Homework 3 review Continue with Perl intro
More informationBackpropagating through Structured Argmax using a SPIGOT
Backpropagating through Structured Argmax using a SPIGOT Hao Peng, Sam Thomson, Noah A. Smith @ACL July 17, 2018 Overview arg max Parser Downstream task Loss L Overview arg max Parser Downstream task Head
More informationAn efficient and user-friendly tool for machine translation quality estimation
An efficient and user-friendly tool for machine translation quality estimation Kashif Shah, Marco Turchi, Lucia Specia Department of Computer Science, University of Sheffield, UK {kashif.shah,l.specia}@sheffield.ac.uk
More informationNatural Language Processing
Natural Language Processing Classification II Dan Klein UC Berkeley 1 Classification 2 Linear Models: Perceptron The perceptron algorithm Iteratively processes the training set, reacting to training errors
More informationThe Perceptron. Simon Šuster, University of Groningen. Course Learning from data November 18, 2013
The Perceptron Simon Šuster, University of Groningen Course Learning from data November 18, 2013 References Hal Daumé III: A Course in Machine Learning http://ciml.info Tom M. Mitchell: Machine Learning
More informationCS 224N Assignment 2 Writeup
CS 224N Assignment 2 Writeup Angela Gong agong@stanford.edu Dept. of Computer Science Allen Nie anie@stanford.edu Symbolic Systems Program 1 Introduction 1.1 PCFG A probabilistic context-free grammar (PCFG)
More informationStructure and Support Vector Machines. SPFLODD October 31, 2013
Structure and Support Vector Machines SPFLODD October 31, 2013 Outline SVMs for structured outputs Declara?ve view Procedural view Warning: Math Ahead Nota?on for Linear Models Training data: {(x 1, y
More information