Elements of Language Processing and Learning lecture 3


1 Elements of Language Processing and Learning lecture 3 Ivan Titov TA: Milos Stanojevic Institute for Logic, Language and Computation

2 Today Parsing algorithms for CFGs Recap, Chomsky Normal Form (CNF) A dynamic programming algorithm for parsing (CKY) Extension of CKY to support unary inner rules Parsing for PCFGs Extension of CKY for parsing with PCFGs Language Modeling Parser evaluation (if we have time) After this lecture you should be able to start working on the assignment step 1.2 2

3 Syntactic parsing Syntax: the study of the patterns of formation of sentences and phrases from words; the borders with semantics and morphology are sometimes blurred. Parsing: the process of predicting syntactic representations. Example of rich morphology: "Afyonkarahisarlılaştırabildiklerimizdenmişsinizcesine" in Turkish means "as if you are one of the people that we thought to be originating from Afyonkarahisar" [wikipedia]. Different types of syntactic representations are possible, for example a constituent (a.k.a. phrase-structure) tree such as (S (NP (PN My) (N dog)) (VP (V ate) (NP (D a) (N sausage)))), in which each internal node corresponds to a phrase, and a dependency tree for My dog ate a sausage, which links the words directly (root, poss, nsubj, dobj, det). 3

4 Constituent trees Internal nodes correspond to phrases: S - a sentence; NP (Noun Phrase): My dog, a sandwich, lakes, ...; VP (Verb Phrase): ate a sausage, barked, ...; PP (Prepositional Phrase): with a friend, in a car, ... Nodes immediately above words are PoS tags (aka preterminals): PN - pronoun, D - determiner, V - verb, N - noun, P - preposition. Example: (S (NP (PN My) (N dog)) (VP (V ate) (NP (D a) (N sausage)))). 4

5 Context-Free Grammar A context-free grammar is a tuple of 4 elements G = (V, Σ, R, S): V - the set of non-terminals, in our case phrase categories (S, NP, VP, PP, ...) and PoS tags (N, V, ... aka preterminals); Σ - the set of terminals (words); R - the set of rules of the form X → Y_1 Y_2 ... Y_n, where n ≥ 0, X ∈ V, Y_i ∈ V ∪ Σ; S ∈ V is a dedicated start symbol. Example rules: S → NP VP, NP → D N, NP → PN, NP → NP PP, PP → P NP, N → girl, N → telescope, V → saw, V → eat

6 An example grammar V = {S, VP, NP, PP, D, N, V, PN, P} Σ = {girl, telescope, sandwich, I, saw, ate, with, in, a, the} Start symbol: S R: Inner rules: S → NP VP, VP → V NP, VP → V, VP → VP PP, NP → NP PP, NP → D N, NP → PN, PP → P NP. Preterminal rules (correspond to the HMM emission model, aka the tagging model): N → girl, N → telescope, N → sandwich, PN → I, V → saw, V → ate, P → with, P → in, D → a, D → the. Example phrases: (NP A girl) (VP ate a sandwich), (V ate) (NP a sandwich), (VP saw a girl) (PP with a telescope), (NP a girl) (PP with a sandwich), (D a) (N sandwich), (P with) (NP a sandwich) 6

7 Why is parsing hard? Ambiguity, e.g., prepositional phrase attachment ambiguity: for I saw a girl with a telescope the grammar produces both (S (NP (PN I)) (VP (V saw) (NP (NP (D a) (N girl)) (PP (P with) (NP (D a) (N telescope)))))) and (S (NP (PN I)) (VP (VP (V saw) (NP (D a) (N girl))) (PP (P with) (NP (D a) (N telescope))))). Both are generated by the grammar 7

8 An example probabilistic CFG Inner rules with probabilities: S → NP VP 1.0, VP → V NP 0.2, VP → V, VP → VP PP, NP → NP PP, NP → D N, NP → PN, PP → P NP 1.0. Preterminal rules: N → girl, N → telescope, N → sandwich 0.1, PN → I, V → saw, V → ate, P → with, P → in, D → a 0.3, D → the 0.7. (For each left-hand side the rule probabilities sum to 1.) Example phrases as before: (NP A girl) (VP ate a sandwich), (V ate) (NP a sandwich), (VP saw a girl) (PP with a telescope), (NP a girl) (PP with a sandwich), (D a) (N sandwich), (P with) (NP a sandwich) 8

9 Parsing Parsing is search through the space of all possible parses: e.g., we may want either any parse, all parses, or the highest scoring parse (if a PCFG): arg max_{T ∈ G(x)} P(T), where P(T) is the probability under the PCFG model and G(x) is the set of all trees given by the grammar for the sentence x. Bottom-up: one starts from the words and attempts to construct the full tree. Top-down: start from the start symbol and attempt to expand it to get the sentence 9

10 CKY algorithm (aka CYK) Cocke-Kasami-Younger algorithm Independently discovered in the late 60s / early 70s An efficient bottom-up parsing algorithm for (P)CFGs; can be used both for the recognition and parsing problems Very important in NLP (and beyond) We will start with the non-probabilistic version 10

11 Constraints on the grammar The basic CKY algorithm supports only rules in the Chomsky Normal Form (CNF): C → x - unary preterminal rules (generation of words given PoS tags, e.g., N → telescope, D → the, ...); C → C1 C2 - binary inner rules (e.g., S → NP VP, NP → D N) 11

12 Constraints on the grammar The basic CKY algorithm supports only rules in the Chomsky Normal Form (CNF): C → x - unary preterminal rules; C → C1 C2 - binary inner rules. Any CFG can be converted to an equivalent CNF grammar. Equivalent means that they define the same language. However, the (syntactic) trees will look different, which makes linguists unhappy. It is possible to address this by defining transformations that allow for an easy reverse transformation 12

13 Transformation to CNF form What one needs to do to convert to CNF: Get rid of empty (aka epsilon) productions C → ε: generally not a problem, as there are no empty productions in the standard (postprocessed) treebanks. Get rid of unary rules C → C1: not a problem, as our CKY algorithm will support unary rules. Binarize n-ary rules C → C1 C2 ... Cn (n > 2): crucial to process them, as required for efficient parsing 13

14 Transformation to CNF form: binarization Consider NP → DT N VBG NN, as in (NP (DT the) (N Dutch) (VBG publishing) (NN group)). How do we get a set of binary rules which are equivalent? 14

15 Transformation to CNF form: binarization Consider NP → DT N VBG NN, as in (NP (DT the) (N Dutch) (VBG publishing) (NN group)). How do we get a set of binary rules which are equivalent? NP → DT X, X → N Y, Y → VBG NN 15

16 Transformation to CNF form: binarization Consider NP → DT N VBG NN, as in (NP (DT the) (N Dutch) (VBG publishing) (NN group)). How do we get a set of binary rules which are equivalent? NP → DT X, X → N Y, Y → VBG NN. A more systematic way to refer to the new non-terminals is to record the parent and the children consumed so far, e.g.: NP → DT @NP[DT], @NP[DT] → N @NP[DT,N], @NP[DT,N] → VBG NN 16

17 Transformation to CNF form: binarization Instead of binarizing rules we can binarize trees on preprocessing: the flat tree (NP (DT the) (N Dutch) (VBG publishing) (NN group)) becomes a right-branching binary tree whose intermediate nodes record the already-consumed children. Also known as lossless Markovization in the context of PCFGs. Can be easily reversed on postprocessing 17

18 Transformation to CNF form: binarization Instead of binarizing rules we can binarize trees on preprocessing, as on the previous slide; a sketch of such a transformation is given below. This is exactly the transformation used in the code of the assignment 18
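A minimal sketch of such a tree binarization (not the assignment's actual code): trees are assumed to be (label, children) tuples with plain strings as words, and the @-labels for the intermediate nodes are illustrative (a lossless implementation would record the consumed children in the label).

    # Sketch of right-branching tree binarization; real implementations
    # record the already-consumed children in the new labels so that the
    # transformation can be reversed on postprocessing.
    def binarize(tree):
        label, children = tree
        children = [c if isinstance(c, str) else binarize(c) for c in children]
        while len(children) > 2:
            # fold the last two children under a new intermediate node
            merged = ("@" + label, children[-2:])
            children = children[:-2] + [merged]
        return (label, children)

    flat = ("NP", [("DT", ["the"]), ("N", ["Dutch"]),
                   ("VBG", ["publishing"]), ("NN", ["group"])])
    print(binarize(flat))
    # ('NP', [('DT', ['the']), ('@NP', [('N', ['Dutch']),
    #         ('@NP', [('VBG', ['publishing']), ('NN', ['group'])])])])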

19 CKY: Parsing task We are given a grammar G = (V, Σ, R, S) with start symbol S and a sequence of words w = (w_1, w_2, ..., w_n) Our goal is to produce a parse tree for w 19

20 CKY: Parsing task [Some illustrations and slides in this lecture are from Marco Kuhlmann] We are given a grammar G = (V, Σ, R, S) with start symbol S and a sequence of words w = (w_1, w_2, ..., w_n) Our goal is to produce a parse tree for w We need an easy way to refer to substrings of w: indices refer to fenceposts; span (i, j) refers to the words between fenceposts i and j 20

21 Key problems Recognition problem: does the sentence belong to the language defined by the CFG? Is there a derivation which yields the sentence? Parsing problem: what is a derivation (tree) corresponding to the sentence? Probabilistic parsing: what is the most probable tree for the sentence? 21

22 Parsing one word C → w_i 22

23 Parsing one word C → w_i 23

24 Parsing one word C → w_i 24

25 Parsing longer spans C → C1 C2 Check through all C1, C2, mid 25

26 Parsing longer spans C → C1 C2 Check through all C1, C2, mid 26

27 Parsing longer spans 27

28 Signatures The application of a rule is independent of the inner structure of a parse tree We only need to know the corresponding span and the root label of the tree Its signature: [min, max, C] Also known as an edge 28

29 CKY idea Compute for every span a set of admissible labels (may be empty for some spans) Start from small trees (single words) and proceed to larger ones 29

30 CKY idea Compute for every span a set of admissible labels (may be empty for some spans) Start from small trees (single words) and proceed to larger ones When done, how do we detect that the sentence is grammatical? 30

31 CKY idea Compute for every span a set of admissible labels (may be empty for some spans) Start from small trees (single words) and proceed to larger ones When done, check if S is among the admissible labels for the whole sentence; if yes, the sentence belongs to the language That is, if a tree with signature [0, n, S] exists 31

32 CKY idea Compute for every span a set of admissible labels (may be empty for some spans) Start from small trees (single words) and proceed to larger ones When done, check if S is among the admissible labels for the whole sentence; if yes, the sentence belongs to the language That is, if a tree with signature [0, n, S] exists Unary rules? 32

33 CKY in action lead can poison Inner rules: S → NP VP, VP → M V, VP → V, NP → N, NP → N N Preterminal rules: N → can, N → lead, N → poison, M → can, M → must, V → poison, V → lead

34 CKY in action lead can poison Chart (aka parsing triangle): rows min = 0, 1, 2; columns max = 1, 2, 3; goal cell [0, 3]: S? (the same grammar as on the previous slide)

35 CKY in action lead can poison The words lead, can, poison are written along the chart; goal cell [0, 3]: S?

36 CKY in action lead can poison (chart with the words lead, can, poison; goal cell [0, 3]: S?)

37 CKY in action lead can poison (chart with the words lead, can, poison; goal cell [0, 3]: S?)

38 CKY in action lead can poison (chart as before; goal cell [0, 3]: S?)

39 CKY in action lead can poison (chart as before; goal cell [0, 3]: S?)

40 CKY in action lead can poison Single-word cells queried: 1 [0,1]: ?, 2 [1,2]: ?, 3 [2,3]: ?

41 CKY in action lead can poison Single-word cells queried: 1 [0,1]: ?, 2 [1,2]: ?, 3 [2,3]: ?

42 CKY in action lead can poison Preterminal rules applied: 1 [0,1] lead: N, V; 2 [1,2] can: N, M; 3 [2,3] poison: N, V

43 CKY in action lead can poison Unary rules applied (NP → N, VP → V): 1 [0,1]: N, V, NP, VP; 2 [1,2]: N, M, NP; 3 [2,3]: N, V, NP, VP

44 CKY in action lead can poison Next cell 4 [0,2]: ? (single-word cells as before)

45 CKY in action lead can poison Cell 4 [0,2]: ? (checking the binary rules over mid = 1)

46 CKY in action lead can poison Cell 4 [0,2]: NP (via NP → N N)

47 CKY in action lead can poison Cell 4 [0,2]: NP Check about unary rules: no unary rules apply here

48 CKY in action lead can poison Next cell 5 [1,3]: ?

49 CKY in action lead can poison Cell 5 [1,3]: S, VP, NP (S via S → NP VP, VP via VP → M V, NP via NP → N N)

50 CKY in action lead can poison Cell 5 [1,3]: S, VP, NP Check about unary rules: no unary rules apply here

51 CKY in action lead can poison Next cell 6 [0,3]: ?

52 CKY in action lead can poison Cell 6 [0,3]: ? (checking the midpoints)

53 CKY in action lead can poison mid = 1: NP [0,1] and VP [1,3] give S → cell 6 [0,3]: S

54 CKY in action lead can poison mid = 2: NP [0,2] and VP [2,3] give S again → cell 6 [0,3]: S, S(?!)

55 CKY in action lead can poison mid = 2: cell 6 [0,3]: S, S(?!) Apparently the sentence is ambiguous for the grammar (as the grammar overgenerates)

56 Ambiguity The two parses: (S (NP (N lead)) (VP (M can) (V poison))) and (S (NP (N lead) (N can)) (VP (V poison))) No subject-verb agreement, and poison used as an intransitive verb Apparently the sentence is ambiguous for the grammar (as the grammar overgenerates)

57 CKY more formally The chart can be represented by a Boolean array chart[min][max][label] Relevant entries have 0 ≤ min < max ≤ n Here we assume that labels C are integer indices chart[min][max][C] = true if the signature (min, max, C) has already been added to the chart; false otherwise. Final chart for lead can poison: [0,1]: N, V, NP, VP; [1,2]: N, M, NP; [2,3]: N, V, NP, VP; [0,2]: NP; [1,3]: S, VP, NP; [0,3]: S 57

58 CKY more formally The chart can be represented by a Boolean array chart[min][max][label] Relevant entries have 0 ≤ min < max ≤ n; labels C are integer indices chart[min][max][C] = true if the signature (min, max, C) has already been added to the chart; false otherwise. In the assignment code we use a class Chart, but its access methods are similar 58

59 Implementation: preterminal rules 59
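Since the slide's code is not reproduced in this transcription, here is a minimal sketch of the preterminal step under the representation above (labels as integer indices, chart as a Boolean array); the function names here and on the following slides are illustrative, not the assignment's API.

    # Sketch of the preterminal step: for every single-word span (i, i+1)
    # admit each tag C such that the grammar has a rule C -> w_i.
    def apply_preterminal_rules(words, preterminal_rules, num_labels):
        n = len(words)
        # chart[min][max][label], indexed by fenceposts 0..n
        chart = [[[False] * num_labels for _ in range(n + 1)]
                 for _ in range(n + 1)]
        for i, word in enumerate(words):
            for c, w in preterminal_rules:    # rules C -> w
                if w == word:
                    chart[i][i + 1][c] = True
        return chart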

60 Implementation: binary rules 60
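A matching sketch of the binary step for one span (again illustrative): try every midpoint and every rule C → C1 C2 whose children are admissible on the corresponding subspans.

    # Sketch of the binary step for the span (mn, mx).
    def apply_binary_rules(chart, binary_rules, mn, mx):
        for mid in range(mn + 1, mx):
            for c, c1, c2 in binary_rules:    # rules C -> C1 C2
                if chart[mn][mid][c1] and chart[mid][mx][c2]:
                    chart[mn][mx][c] = True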

61 Unary rules How to integrate unary rules C → C1? 61

62 Implementation: unary rules 62

63 Implementation: unary rules But we forgot something! 63
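A sketch of the unary pass (illustrative): after the preterminal or binary step for a span, admit C for every unary rule C → C1 whose child label is already in the cell. The thing we forgot is that a single pass like this misses chains such as A → B → C, which is what the closure on the next slides fixes.

    # Sketch of the unary pass for one span; a single pass handles C -> C1
    # but not chains of unary rules, hence the closure introduced below.
    def apply_unary_rules(chart, unary_rules, mn, mx):
        for c, c1 in unary_rules:             # rules C -> C1
            if chart[mn][mx][c1]:
                chart[mn][mx][c] = True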

64 Unary closure What if the grammar contained 2 rules: A → B and B → C? Then C can be derived from A by a chain of rules: A → B → C One could support chains in the algorithm, but it is easier to extend the grammar to get the transitive closure: {A → B, B → C} ⇒ {A → B, B → C, A → C} 64

65 Unary closure What if the grammar contained 2 rules: A → B and B → C? Then C can be derived from A by a chain of rules: A → B → C One could support chains in the algorithm, but it is easier to extend the grammar to get the reflexive transitive closure: {A → B, B → C} ⇒ {A → B, B → C, A → C, A → A, B → B, C → C} Convenient for programming reasons in the PCFG case 65
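A sketch of computing this closure (labels as integers 0..num_labels-1; illustrative code):

    # Turns {A -> B, B -> C} into {A -> B, B -> C, A -> C, A -> A,
    # B -> B, C -> C} via a Floyd-Warshall-style reachability closure.
    def unary_closure(unary_rules, num_labels):
        reach = [[False] * num_labels for _ in range(num_labels)]
        for c in range(num_labels):
            reach[c][c] = True                # reflexive rules C -> C
        for c, c1 in unary_rules:
            reach[c][c1] = True
        for k in range(num_labels):
            for i in range(num_labels):
                for j in range(num_labels):
                    if reach[i][k] and reach[k][j]:
                        reach[i][j] = True
        return [(i, j) for i in range(num_labels)
                for j in range(num_labels) if reach[i][j]]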

66 Implementation: skeleton 66
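Putting the helper sketches from the previous slides together, a possible skeleton of the recognizer (illustrative, not the assignment's code):

    # Fill single-word spans, then all longer spans from small to large,
    # applying the unary closure after every cell; returns True iff a
    # tree with signature [0, n, S] exists.
    def cky_recognize(words, preterminal_rules, binary_rules, unary_rules,
                      num_labels, start):
        n = len(words)
        chart = apply_preterminal_rules(words, preterminal_rules, num_labels)
        closed = unary_closure(unary_rules, num_labels)
        for i in range(n):
            apply_unary_rules(chart, closed, i, i + 1)
        for width in range(2, n + 1):
            for mn in range(n - width + 1):
                mx = mn + width
                apply_binary_rules(chart, binary_rules, mn, mx)
                apply_unary_rules(chart, closed, mn, mx)
        return chart[0][n][start]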

67 Algorithm analysis Time complexity? O(n^3 |R|), where |R| is the number of rules in the grammar There exist algorithms with better asymptotic time complexity, but the `constant' makes them slower in practice (in general) 67

68 Algorithm analysis Time complexity? O(n^3 |R|), where |R| is the number of rules in the grammar There exist algorithms with better asymptotic time complexity, but the `constant' makes them slower in practice (in general) 68

69 Algorithm analysis Time complexity? O(n^3 |R|), where |R| is the number of rules in the grammar A few seconds for sentences under 20 words for a non-optimized parser There exist algorithms with better asymptotic time complexity, but the `constant' makes them slower in practice (in general) 69

70 Algorithm analysis Time complexity? O(n^3 |R|), where |R| is the number of rules in the grammar A few seconds for sentences under 20 words for a non-optimized parser There exist algorithms with better asymptotic time complexity, but the `constant' makes them slower in practice (in general) 70

71 Practical time complexity (for the PCFG version) Empirically the running time grows roughly as n^3.6 [Plot by Dan Klein: running time vs. sentence length] 71

72 Today Parsing algorithms for CFGs Recap, Chomsky Normal Form (CNF) A dynamic programming algorithm for parsing (CKY) Extension of CKY to support unary inner rules Parsing for PCFGs Extension of CKY for parsing with PCFGs Language Modeling Parser evaluation (if we have time) 72

73 Probabilistic parsing We discussed the recognition problem: check if a sentence is parsable with a CFG Now we consider parsing with PCFGs Recognition with PCFGs: what is the probability of the most probable parse tree? Parsing with PCFGs: What is the most probable parse tree? 73

74 PCFGs A parse tree for I saw a girl with a telescope, with a probability attached to every rule used in it (e.g., S → NP VP 1.0, PP → P NP 1.0, V → saw 0.5); the rules are those of the example PCFG: S → NP VP, VP → V NP, VP → V, VP → VP PP, NP → NP PP, NP → D N, NP → PN, PP → P NP, plus the preterminal rules N → girl, N → telescope, N → sandwich, PN → I, V → saw, V → ate, P → with, P → in, D → a, D → the. p(T) is the product of the probabilities of all the rules used in the derivation, one factor per rule
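To make that product concrete, a sketch of scoring a tree under a PCFG (trees as (label, children) tuples as before; rule_prob is an assumed mapping from rules to probabilities, not a structure from the assignment):

    # Sketch: p(T) is the product of the probabilities of all rules used
    # in the derivation; `rule_prob` maps (parent, child_labels) to p(rule).
    def tree_prob(tree, rule_prob):
        label, children = tree
        child_labels = tuple(c if isinstance(c, str) else c[0]
                             for c in children)
        p = rule_prob[(label, child_labels)]
        for child in children:
            if not isinstance(child, str):    # recurse into subtrees
                p *= tree_prob(child, rule_prob)
        return p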

75 Distribution over trees Let us denote by G(x) the set of derivations for the sentence x The PCFG defines the probability distribution P(T) used for scoring the trees T ∈ G(x) Finding the best parse for the sentence according to the PCFG: arg max_{T ∈ G(x)} P(T) 75

76 CKY with PCFGs The chart is represented by an array of doubles chart[min][max][label] It stores the probability of the most probable subtree with a given signature chart[0][n][S] will store the probability of the most probable full parse tree 76

77 Intuition C → C1 C2 For every C choose C1, C2 and mid such that P(T1) × P(T2) × P(C → C1 C2) is maximal, where T1 and T2 are the left and right subtrees. 77

78 Implementation: preterminal rules 78

79 Implementation: binary rules 79
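A sketch of the probabilistic binary step (illustrative): the cells now hold probabilities (0.0 meaning "no subtree yet"), and the single-word cells are initialized with the preterminal rule probabilities p(C → w_i) instead of Booleans.

    # Sketch of the Viterbi (max-product) binary step: maximize
    # P(T1) * P(T2) * P(C -> C1 C2) over C1, C2 and the midpoint.
    def apply_binary_rules_pcfg(chart, binary_rules, mn, mx):
        for mid in range(mn + 1, mx):
            for c, c1, c2, p in binary_rules:   # rules C -> C1 C2 with prob p
                score = chart[mn][mid][c1] * chart[mid][mx][c2] * p
                if score > chart[mn][mx][c]:
                    chart[mn][mx][c] = score    # keep the best subtree's score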

80 Unary rules Similarly to CFGs: after producing scores for signatures (c, i, j), try to improve the scores by applying unary rules (and rule chains) If improved, update the scores 80

81 Unary (reflexive transitive) closure {A → B (p1), B → C (p2)} ⇒ {A → B (p1), B → C (p2), A → C (p1 × p2), A → A (1), B → B (1), C → C (1)} Note that this is not a PCFG anymore, as the rules do not sum to 1 for each parent

82 Unary (reflexive transitive) closure The fact that a rule is composite needs to be stored to recover the true tree {A → B (p1), B → C (p2)} ⇒ {A → B (p1), B → C (p2), A → C (p1 × p2), A → A (1), B → B (1), C → C (1)} Note that this is not a PCFG anymore, as the rules do not sum to 1 for each parent

83 Unary (reflexive transitive) closure The fact that a rule is composite needs to be stored to recover the true tree E.g., with p(A → B) = p1 and p(B → C) = p2, the composite rule A → C gets probability p1 × p2: {A → B, B → C} ⇒ {A → B, B → C, A → C, A → A, B → B, C → C} Note that this is not a PCFG anymore, as the rules do not sum to 1 for each parent

84 Unary (reflexive transitive) closure The fact that a rule is composite needs to be stored to recover the true tree {A → B, B → C} ⇒ {A → B, B → C, A → C, A → A, B → B, C → C} Note that this is not a PCFG anymore, as the rules do not sum to 1 for each parent What about loops, like A → B → A → C? 84

85 Recovery of the tree For each signature we store backpointers to the elements from which it was built (e.g., the rule and, for binary rules, the midpoint) Start recovering from [0, n, S] Be careful with unary rules Basically you can assume that you always used a unary rule from the closure (but it could be the trivial one C → C) 85
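A sketch of the recovery step, assuming that while filling the chart we also stored backpointers back[min][max][C] of the form ("word",), ("un", C1) for a closure unary rule, or ("bin", mid, C1, C2) for a binary rule; the trivial closure rule C → C is simply not stored, and composite closure rules would be expanded back into their chains here.

    # Sketch of recovering the best tree from backpointers,
    # starting from the signature [0, n, S].
    def build_tree(back, words, mn, mx, c):
        bp = back[mn][mx][c]
        if bp[0] == "word":                   # preterminal rule C -> w
            return (c, [words[mn]])
        if bp[0] == "un":                     # unary rule from the closure
            return (c, [build_tree(back, words, mn, mx, bp[1])])
        _, mid, c1, c2 = bp                   # binary rule C -> C1 C2
        return (c, [build_tree(back, words, mn, mid, c1),
                    build_tree(back, words, mid, mx, c2)])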

86 Speeding up the algorithm (approximate search) Any ideas? 86

87 Speeding up the algorithm (approximate search) Basic pruning (roughly): for every span (i, j) store only labels whose probability is at most N times smaller than the probability of the most probable label for this span; check not all rules but only the rules whose child labels have non-zero probability Coarse-to-fine pruning: parse with a smaller (simpler) grammar, precompute (posterior) probabilities for each span, and when parsing with the full grammar use only the spans and labels with non-negligible probability under the simpler grammar 87
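A sketch of the basic pruning idea for one cell (the threshold N appears here as a beam factor whose value is illustrative):

    # Drop every label whose probability is more than `beam` times
    # smaller than that of the best label in the cell.
    def prune_cell(chart, mn, mx, beam=1e4):
        best = max(chart[mn][mx])
        if best > 0.0:
            for c in range(len(chart[mn][mx])):
                if chart[mn][mx][c] * beam < best:
                    chart[mn][mx][c] = 0.0    # treat the label as absent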

88 Today Parsing algorithms for CFGs Recap, Chomsky Normal Form (CNF) A dynamic programming algorithm for parsing (CKY) Extension of CKY to support unary inner rules Parsing for PCFGs Extension of CKY for parsing with PCFGs Language Modeling Parser evaluation (if we have time) 88

89 Language modeling The goal of language modeling is to estimate how likely the next word is given the previous words (in a sentence) Many of you are familiar with ngram language models They are an important component in, for example, machine translation and speech recognition Estimate the probability of the next word: p(w_k | w_{k-1}, ..., w_1) Ngram language models make the nth-order Markovian assumption: p(w_k | w_{k-1}, ..., w_1) ≈ p(w_k | w_{k-1}, ..., w_{k-n}) Maximum likelihood estimates are just frequencies, but in practice various forms of smoothing are used The probability of a word sequence is p(w_1, ..., w_m) = ∏_{k=1}^{m} p(w_k | w_{k-1}, ..., w_{k-n}) 89
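A sketch of how the product above is computed in practice; ngram_prob is an assumed (smoothed) estimator of p(w_k | context), not a real library call.

    # Score a sentence under the nth-order Markov assumption:
    # each word is conditioned on at most n previous words.
    def sentence_prob(words, ngram_prob, n):
        p = 1.0
        for k, w in enumerate(words):
            context = tuple(words[max(0, k - n):k])
            p *= ngram_prob(w, context)       # smoothed p(w_k | w_{k-n..k-1})
        return p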

90 Language modeling Ngram models are not very good at capturing long dependencies Availability of Web-scale data means that they are not as bad as they used to be: long (up to 5 words) ngrams estimated from over a trillion tokens are available for English Still, they can easily lead to ungrammatical sentences: John has not even reached his/her destination Ankommen is a verb with a separable prefix: Er kam am Freitagabend nach einem harten Arbeitstag und dem üblichen Ärger, der ihn schon seit Jahren immer wieder an seinem Arbeitsplatz plagt, mit fraglicher Freude auf ein Mahl, das seine Frau ihm, wie er hoffte, bereits aufgetischt hatte, endlich zu Hause an. [Wikipedia] (the prefix an surfaces only at the very end of the sentence) Syntactic information should help us in modeling these long dependencies 90

91 Language modeling Note that, equivalently to modeling the probability of the next word, we can model the probability of a sentence prefix, as p(w_k | w_{k-1}, ..., w_1) = p(w_1, ..., w_{k-1}, w_k) / p(w_1, ..., w_{k-1}) Instead of looking into modeling the probability of any sentence prefix, we will consider probabilities of whole sentences It is simpler and also important Dynamic programming algorithms for sentence prefixes are considered in Jelinek and Lafferty [91] and Stolcke [94]. 91

92 Syntactic language models To get the probability P(x) of a word sequence x we need to sum the probabilities P(T) of all its syntactic derivations T ∈ G(x): P(x) = Σ_{T ∈ G(x)} P(T) For example, P(lead can poison) = P((S (NP (N lead)) (VP (M can) (V poison)))) + P((S (NP (N lead) (N can)) (VP (V poison)))) 92

93 Syntactic language models To get the probability P(x) of a word sequence x we need to sum the probabilities P(T) of all its syntactic derivations T ∈ G(x): P(x) = Σ_{T ∈ G(x)} P(T) 93

94 Syntactic language models To get the probability P(x) of a word sequence x we need to sum the probabilities P(T) of all its syntactic derivations T ∈ G(x): P(x) = Σ_{T ∈ G(x)} P(T) How can we do this efficiently (for PCFGs)? (Recall we know how to parse efficiently, i.e., compute arg max_{T ∈ G(x)} P(T)) 94

95 Modifying CKY In CKY we maintained the probability of the most probable subtree for each signature Here, we will maintain the sum of probabilities of all the subtrees for each signature The CKY algorithm is an instance of the Viterbi algorithm, also known as Max-Product We will look into the algorithm called Sum-Product 95

96 CKY (Max-Product) 96

97 Sum-Product The probability of the sentence can be read from chart[0][n][S] 97
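A sketch of the change relative to the Viterbi version of the binary step: the max is replaced by a sum, so each cell accumulates the total probability of all subtrees with that signature, and chart[0][n][S] ends up holding P(x).

    # Sketch of the Sum-Product binary step: sum, rather than maximize,
    # over all rules and midpoints producing the same signature.
    def apply_binary_rules_sum(chart, binary_rules, mn, mx):
        for mid in range(mn + 1, mx):
            for c, c1, c2, p in binary_rules:   # rules C -> C1 C2 with prob p
                chart[mn][mx][c] += chart[mn][mid][c1] * chart[mid][mx][c2] * p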

98 Sum-Product The probabilities stored in the chart are called "inside probabilities" and play an important role in the induction of trees from unlabeled data The probability of the sentence can be read from chart[0][n][S] 98

99 Language modeling Though these syntactic models encode non-local dependencies, their weakness as language models is that they are not rich enough to capture lexical dependencies In practice, syntactic language models use syntactic information to extract the syntactically relevant previous words (instead of just the previous words) See, e.g., Chelba (1997) 99

100 Where are we?... Today we learnt how to parse with (P)CFGs An efficient bottom-up dynamic programming algorithm The Earley algorithm is an effective top-down algorithm (but harder to understand + does not trivially generalize to PCFGs) Touched on language modeling Assignment 1.2: you are ready to start working on it Add support for unary rules to a CKY implementation 100

101 Today Parsing algorithms for CFGs Recap, Chomsky Normal Form (CNF) A dynamic programming algorithm for parsing (CKY) Extension of CKY to support unary inner rules Parsing for PCFGs Extension of CKY for parsing with PCFGs Language Modeling Parser evaluation (if we have time) 101

102 Parser evaluation Intrinsic evaluation: Automatic: evaluate against annotation provided by human experts (the gold standard) according to some predefined measure; though it has many drawbacks, it is easier and allows tracking the state of the art across many years Manual: according to human judgment Extrinsic evaluation: score the syntactic representation by comparing how well a system using this representation performs on some task E.g., use the syntactic representation as input for a semantic analyzer and compare the results of the analyzer using syntax predicted by different parsers. 102

103 Standard evaluation setting in parsing Automatic intrinsic evaluation is used: parsers are evaluated against a gold standard provided by linguists There is a standard split into three parts: training set: used for estimation of model parameters development set: used for tuning the model (initial experiments) test set: final experiments to compare against previous work 103

104 Automatic evaluation of constituent parsers Exact match: percentage of trees predicted correctly Bracket score: scores how well individual phrases (and their boundaries) are identified; the most standard measure, and we will focus on it Crossing brackets: percentage of predicted phrase boundaries that cross the gold-standard ones Dependency metrics: scores the dependency structure corresponding to the constituent tree (percentage of correctly identified heads) 104

105 Brackets scores The most standard score is the bracket score It regards a tree as a collection of brackets [min, max, C] (the subtree signatures familiar from CKY) The set of brackets predicted by a parser is compared against the set of brackets in the tree annotated by a linguist Precision, recall and F1 are used as scores 105

106 Bracketing notation The tree for My dog ate a sausage written as a bracketed sequence: (S (NP (PN My) (N dog)) (VP (V ate) (NP (D a) (N sausage)))) 106

107 Brackets scores Pr = (number of brackets the parser and the annotation agree on) / (number of brackets predicted by the parser) Re = (number of brackets the parser and the annotation agree on) / (number of brackets in the annotation) F1 = 2 × Pr × Re / (Pr + Re), the harmonic mean of precision and recall 107
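A sketch of computing these scores from two trees in the (label, children) representation used earlier; unlike the standard evalb tool, this simplified version also counts preterminal brackets and applies no further normalization.

    # Reduce each tree to a multiset of (min, max, label) brackets and
    # compare the two multisets.
    from collections import Counter

    def brackets(tree, start=0):
        label, children = tree
        spans, pos = [], start
        for child in children:
            if isinstance(child, str):
                pos += 1                      # a word advances one fencepost
            else:
                sub, width = brackets(child, pos)
                spans += sub
                pos += width
        spans.append((start, pos, label))
        return spans, pos - start

    def bracket_f1(gold_tree, predicted_tree):
        gold = Counter(brackets(gold_tree)[0])
        pred = Counter(brackets(predicted_tree)[0])
        agree = sum((gold & pred).values())
        pr = agree / sum(pred.values())
        re = agree / sum(gold.values())
        return 2 * pr * re / (pr + re) if pr + re > 0 else 0.0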

108 Preview: F1 bracket score Systems compared on the F1 bracket score: Treebank PCFG; Unlexicalized PCFG (Klein and Manning, 2003); Lexicalized PCFG (Collins, 1999); Automatically Induced PCFG (Petrov et al., 2006); and the best results reported (as of 2012) 108

109 Preview: F1 bracket score Systems compared on the F1 bracket score: Treebank PCFG; Unlexicalized PCFG (Klein and Manning, 2003); Lexicalized PCFG (Collins, 1999); Automatically Induced PCFG (Petrov et al., 2006); and the best results reported (as of 2012) We will introduce these models next time 109
