COMS W4705, Spring 2015: Problem Set 2 Total points: 140

Size: px
Start display at page:

Download "COMS W4705, Spring 2015: Problem Set 2 Total points: 140"

Transcription

1 COM W4705, pring 2015: Problem et 2 Total points: 140 Analytic Problems (due March 2nd) Question 1 (20 points) A probabilistic context-ree grammar G = (N, Σ, R,, q) in Chomsky Normal Form is deined as ollows: N is a set o non-terminal symbols (eg,,, etc) Σ is a set o terminal symbols (eg, cat, dog, the, etc) R is a set o rules which take one o two orms: X Y 1 Y 2 or X N, and Y 1, Y 2 N X Y or X N, and Y Σ N is a distinguished start symbol q is a unction that maps every rule in R to a probability, which satisies the ollowing conditions: r R, q(r) 0 X N, X α R q(x α) = 1 Now assume we have a probabilistic CFG G, which has a set o rules R which take one o the two ollowing orms: X Y 1 Y 2 Y n or X N, n 2, and i, Y i N X Y or X N, and Y Σ Note that this is a more permissive deinition than Chomsky normal orm, as some rules in the grammar may have more than 2 non-terminals on the right-hand side An example o a grammar that satisies this more permissive deinition is as ollows: Vt 08 Vt PP 02 DT NN NN 03 PP 07 PP IN 10 Vt saw 10 NN man 07 NN woman 02 NN telescope 01 DT the 10 IN with 05 IN in 05

2 Question 1(a): Describe how to transorm a PCFG G, in this more permissive orm, into an equivalent PCFG G in Chomsky normal orm By equivalent, we mean that there is a one-to-one unction between derivations in G and derivations in G, such that or any derivation T under G which has probability p, (T ) also has probability p (Note: one major motivation or this transormation is that we can then apply the dynamic programming parsing algorithm, described in lecture, to the transormed grammar) Hint: think about adding new rules with new non-terminals to the grammar Question 1(b): how the resulting grammar G ater applying your transormation to the example PCFG shown above Question 2 (20 points) Nathan L Pedant decides to build a treebank He inally produces a corpus which contains the ollowing three parse trees: John V1 BAR ally V1 BAR Fred V1 BAR said COMP declared COMP pronounced COMP that that that ally AD Bill AD Je AD V2 loudly V2 quickly V2 elegantly snored ran swam Clarissa Lexica then purchases the treebank, and decides to build a PCFG, and a parser, using Nathan s data Question 2(a): how the PCFG that Clarissa would derive rom this treebank Question 2(b): how two parse trees or the string Je pronounced that Fred snored loudly, and calculate their probabilities under the PCFG Question 2(c): Clarissa is shocked and dismayed, (see 2(b)), that Je pronounced that Fred snored loudly has two possible parses, and that one o them that Je is doing the pronouncing loudly has relatively high probability, in spite o it having the AD loudly modiiying the higher verb, pronounced This type o high attachment is never seen in the corpus, so the PCFG is clearly missing something Clarissa decides to ix the treebank, by altering some non-terminal labels in the corpus how one such transormation that results in a PCFG that gives zero probability to parse trees with high attachments (Your solution should systematically reine some non-terminals in the treebank, in a way that slightly increases the number o non-terminals in the grammar, but allows the grammar to capture the distinction between high and low attachment to s)

3 Question 3 (20 points) Consider the CKY algorithm or inding the maximum probability or any tree when given as input a sequence o words x 1, x 2,, x n As usual, we use N to denote the set o non-terminals in the grammar, and to denote the start symbol The base case in the recursive deinition is as ollows: or all i = 1 n, or all X N, π(i, i, X) = { q(x xi ) i X x i R 0 otherwise and the recursive deinition is as ollows: or all (i, j) such that 1 i < j n, or all X N, π(i, j, X) = max (q(x Y Z) π(i, s, Y ) π(s + 1, j, Z)) X Y Z R, s {i(j 1)} Finally, we return π(1, n, ) = max t T G (s) p(t) Now assume that we want to ind the maximum probability or any balanced tree or a sentence Here are some example balanced trees: B A D C B A e D C F G B A B A e g D C F G D C F G e g e g It can be seen that in balanced trees, whenever a rule o the orm X -> Y Z is seen in the tree, then the non-terminals Y and Z dominate the same number o words NOTE: we will assume that the length o the sentence, n, is n = 2 x or some integer x For example n = 2, or 4, or 8, or 16, etc

4 Question 3(a): Complete the recursive deinition below, so that the algorithm returns the maximum probability or any balanced tree underlying a sentence x 1, x 2,, x n Base case: or all i = 1 n, or all X N, π(i, i, X) = { q(x xi ) i X x i R 0 otherwise Recursive case: (Complete) Return: π(1, n, ) = max t T G (s) p(t) Programming Problems (due March 12th) Please read the submission instructions, policy and hints on the course webpage In this programming assignment, you will train a probabilistic context-ree grammar (PCFG) or syntactic parsing In class we deined a PCFG as a tuple G = (N, Σ,, R, q) where N is the set o non-terminals, Σ is the vocabulary, is a root symbol, R is a set o rules, and q is a probabilistic model We will assume that the grammar is in Chomsky normal orm (CNF) Recall rom class that this means all rules in R will be either binary rules X Y 1 Y 2 where X, Y 1, Y 2 N or unary rules X Z where X N and Z Σ You will estimate the parameters o the grammar rom a corpus o annotated sentences rom the Wall treet Journal Beore diving into the details o the corpus ormat, we will discuss the rule structure o the grammar You will not have to implement this section but it will be important to know when building your parser Rule tructure Consider a typical tagged sentence rom the Wall treet Journal There/ is/verb no/ asbestos/ in/adp our/pron products/ now/adv / The correct parse tree or the sentence (as annotated by linguists) is as ollows

5 VERB PP AD There is ADP ADV no asbestos in PRON now our products Note that nodes at the ringe o the tree are words, the nodes one level up are part-o-speech tags, and the internal nodes are syntactic tags For the deinition o the syntactic tags used in this corpus, see The Penn Treebank: An Overview The part-o-speech tags should be sel-explanatory, or urther reerence see A Universal Part-o-peech Tagset Both are linked to on the website Unortunately the annotated parse tree is not in Chomsky normal orm For instance, the rule has three children and the rule has a single non-terminal child We will need to work around this problem by applying a transomation to the grammar We will apply two transormations to the tree The irst is to collapse illegal unary rules o the orm X Y where X, Y N into new non-terminals X + Y, or example + There There The second is to split n-ary rules X Y 1 Y 2 Y 3 where X, Y 1, Y 2, Y 3 N into right branching binary rules o the orm X Y 1 X and X Y 2 Y 3 or example With these transormation to the grammar, the new parse correct parse or this sentence is

6 + There VERB is PP AD+ADV no asbestos ADP now in PRON our products Converting the grammar to CNF will greatly simpliy the algorithm or decoding; however, as you saw in Question 1, there are other transormations into CNF that better preserve properties o the original grammar All the parses we provide or the assignment will be in the transormed ormat, you will not need to implement this transormation, but you should understand how it works Corpus Format We provide a training data set with parses parse traindat and a development set o sentences parse devdat along with the correct parses parse devkey All the data comes or the Wall treet Journal corpus and consists o sentences o less than 15 words or eiciency In the training set, each line o the ile consists o a single parse tree in CNF The trees are represented as nested multi-dimensional arrays conorming to the JON standard JON is general data encoding standard (similar to XML) JON is supported in pretty much every language (see Binary rules are represented as arrays o length 3 DT ["", ["DT", ], ["", ]] Unary rules are represented as arrays o length 2 asbestos

7 ["", "asbestos"] For example, the ile treeexample has the tree given above ["", ["", ["", "There"]], ["", ["", ["VERB", "is"], ["", [" ", ["", "no"], ["", "asbestos"]], ["", ["PP", ["ADP", "in "], ["", ["PRON", "our"], ["", "products"]]], ["AD", ["ADV", "now"]]]]], ["", ""]]] The tree is represented as a recursive, multi-dimensional JON array I the array is length 3 it is a binary rule, i it is length 2 it is a unary rule as shown above As an example, assume we want to access the node or is by walking down the tree as in the ollowing diagram There 1 VERB 1 is PP AD+ADV no asbestos ADP now in PRON our products In Python, we can read and access this node using the ollowing code import json tree = jsonloads(open("treeexample")readline()) print tree[2][1][1][1] The irst line reads the tree into an array, and the second line walks down the tree to the node or is Other languages have similar (although possibly more verbose) libraries or reading in these arrays Consult the TAs i you are unable to ind a parser or your avorite language Tree Counts In order to estimate the parameters o the model you will need the counts o the rules used in the corpus The script count cg reqspy reads in a training ile and produces these counts Run the script on the

8 training data and pipe the output into some ile: python count cg reqspy parse traindat > cgcounts Each line in the output contains the count or one event There are three types o counts: Lines where the second token is NONTERMINAL contain counts o non-terminals Count(X) 17 NONTERMINAL indicates that the non-terminal was used 17 times Lines where the second token is BINARYRULE contain emission counts Count(X Y 1 Y 2 ), or example 13 BINARYRULE indicates that binary rule was used 13 times in the training data Lines where the second token is UNARYRULE contain unigram counts Count(X Y ) 14 UNARYRULE + products indicates that the rule + products was seen 14 times in the training data Question 4 (20 points) As in the tagging model, we need to predict emission probabilities or words in the test data that do not occur in the training data Recall our approach is to map inrequent words in the training data to a common class and to treat unseen words as members o this class Replace inrequent words (Count(x) < 5) in the original training data ile with a common symbol RARE Note that this is more diicult than the previous assignment because you will need to read the training ile, ind the words at the ringe o the tree, and rewrite the tree out in the correct ormat We provide a script python pretty print treepy parser devkey which you can use to view your trees in a more readable ormat The script will also throw an error i you output is not in the correct ormat Re-run count cg reqspy to produce new counts ubmit your code and the new training ile it produces Question 5 (40 points) Using the counts produced by count cg reqspy, write a unction that computes rule parameters q(x Y 1 Y 2 ) = Count(X Y 1 Y 2 ) Count(X) q(x w) = Count(X w) Count(X)

9 Using the maximum likelihood estimates or the rule parameters, implement the CKY algorithm to compute arg max t T G (s) p(t) Your parser should read in the counts ile (with rare words) and the ile parse devdat (which is just the sentence rom parse devkey without the parse) and produce output in the ollowing ormat: the tree or one sentence per line represented in the JON tree ormat speciied above Each line should correspond to a parse or a sentence rom parse devdat You can view your output using pretty print parsepy and check your perormance with eval parserpy by running python eval parserpy parser devkey prediction ile The evaluator takes two iles where each line is a matching parse in JON ormat and outputs an evaluation score Note: There is an slight change rom the class notes ome o these sentences are ragments, and they do not have as the root For the purpose o this assignment you should change the last line o the CKY algorithm to This should help perormance i π[1, n, ] 0 return π[1, n, ] else return max π[1, n, X] X N ubmit your program and report on the perormance o your model In addition, write up any observations you made Question 6 (20 Points) Consider the ollowing subtree o the original parse VERB PP AD The tree structure makes the implicit independence assumption that the rule DT is generated independently o the parent non-terminal, ie the probability o generating this subtree in context is p( VERB PP AD )p( ) However, oten this independence assumption is too strong It can be inormative or the lower rule to condition on the parent non-terminal as well, ie model generating this subtree as p( VERB PP AD )p(, parent=)

10 This is a similar idea to using a trigram language model, except with parent non-terminals instead o words We can implement this model without changing our parsing algorithm by applying a tree transormation known as vertical markovization We simply augment each non-terminal with the symbol o its parent VERB PP AD We apply vertical markovization beore binarization; the new non-terminals are augmented with their original parent Ater applying both transormations the inal tree will look like VERB PP AD There is a downside o vertical markovization The transormation greatly increases the number o rules N in the grammar potentially causing data sparsity issues when estimating the probability model q In practice though, using one level o vertical markovization (conditioning on the parent), has been shown to increase accuracy Note: This problem looks straightorward, but the diiculty is that we now have a much larger set o nonterminals N It is likely that a simple implementation o CKY that worked or the last question will not scale to this problem The challenge is to improve the basic ineiciencies o the implementation in order to handle the extra non-terminals The ile parse train vertdat contains the original training sentence with vertical markovization applied to the parse trees Train with the new ile and reparse the development set Run python eval parserpy parser devkey prediction ile to evaluate perormance Report on the perormance o your parser with the new trees In what ways did vertical markovization help or hurt parser perormance? Find an example where it improved your parser s output ubmit your program and any urther observations

Assignment 4 CSE 517: Natural Language Processing

Assignment 4 CSE 517: Natural Language Processing Assignment 4 CSE 517: Natural Language Processing University of Washington Winter 2016 Due: March 2, 2016, 1:30 pm 1 HMMs and PCFGs Here s the definition of a PCFG given in class on 2/17: A finite set

More information

The CKY Parsing Algorithm and PCFGs. COMP-550 Oct 12, 2017

The CKY Parsing Algorithm and PCFGs. COMP-550 Oct 12, 2017 The CKY Parsing Algorithm and PCFGs COMP-550 Oct 12, 2017 Announcements I m out of town next week: Tuesday lecture: Lexical semantics, by TA Jad Kabbara Thursday lecture: Guest lecture by Prof. Timothy

More information

The CKY algorithm part 1: Recognition

The CKY algorithm part 1: Recognition The CKY algorithm part 1: Recognition Syntactic analysis (5LN455) 2016-11-10 Sara Stymne Department of Linguistics and Philology Mostly based on slides from Marco Kuhlmann Phrase structure trees S root

More information

CS 224N Assignment 2 Writeup

CS 224N Assignment 2 Writeup CS 224N Assignment 2 Writeup Angela Gong agong@stanford.edu Dept. of Computer Science Allen Nie anie@stanford.edu Symbolic Systems Program 1 Introduction 1.1 PCFG A probabilistic context-free grammar (PCFG)

More information

Topics in Parsing: Context and Markovization; Dependency Parsing. COMP-599 Oct 17, 2016

Topics in Parsing: Context and Markovization; Dependency Parsing. COMP-599 Oct 17, 2016 Topics in Parsing: Context and Markovization; Dependency Parsing COMP-599 Oct 17, 2016 Outline Review Incorporating context Markovization Learning the context Dependency parsing Eisner s algorithm 2 Review

More information

CKY algorithm / PCFGs

CKY algorithm / PCFGs CKY algorithm / PCFGs CS 585, Fall 2018 Introduction to Natural Language Processing http://people.cs.umass.edu/~miyyer/cs585/ Mohit Iyyer College of Information and Computer Sciences University of Massachusetts

More information

LSA 354. Programming Assignment: A Treebank Parser

LSA 354. Programming Assignment: A Treebank Parser LSA 354 Programming Assignment: A Treebank Parser Due Tue, 24 July 2007, 5pm Thanks to Dan Klein for the original assignment. 1 Setup To do this assignment, you can use the Stanford teaching computer clusters.

More information

The CKY algorithm part 2: Probabilistic parsing

The CKY algorithm part 2: Probabilistic parsing The CKY algorithm part 2: Probabilistic parsing Syntactic analysis/parsing 2017-11-14 Sara Stymne Department of Linguistics and Philology Based on slides from Marco Kuhlmann Recap: The CKY algorithm The

More information

Principles of Programming Languages COMP251: Syntax and Grammars

Principles of Programming Languages COMP251: Syntax and Grammars Principles of Programming Languages COMP251: Syntax and Grammars Prof. Dekai Wu Department of Computer Science and Engineering The Hong Kong University of Science and Technology Hong Kong, China Fall 2006

More information

Admin PARSING. Backoff models: absolute discounting. Backoff models: absolute discounting 3/4/11. What is (xy)?

Admin PARSING. Backoff models: absolute discounting. Backoff models: absolute discounting 3/4/11. What is (xy)? Admin Updated slides/examples on backoff with absolute discounting (I ll review them again here today) Assignment 2 Watson vs. Humans (tonight-wednesday) PARING David Kauchak C159 pring 2011 some slides

More information

The CKY Parsing Algorithm and PCFGs. COMP-599 Oct 12, 2016

The CKY Parsing Algorithm and PCFGs. COMP-599 Oct 12, 2016 The CKY Parsing Algorithm and PCFGs COMP-599 Oct 12, 2016 Outline CYK parsing PCFGs Probabilistic CYK parsing 2 CFGs and Constituent Trees Rules/productions: S VP this VP V V is rules jumps rocks Trees:

More information

Parsing in Parallel on Multiple Cores and GPUs

Parsing in Parallel on Multiple Cores and GPUs 1/28 Parsing in Parallel on Multiple Cores and GPUs Mark Johnson Centre for Language Sciences and Department of Computing Macquarie University ALTA workshop December 2011 Why parse in parallel? 2/28 The

More information

Iterative CKY parsing for Probabilistic Context-Free Grammars

Iterative CKY parsing for Probabilistic Context-Free Grammars Iterative CKY parsing for Probabilistic Context-Free Grammars Yoshimasa Tsuruoka and Jun ichi Tsujii Department of Computer Science, University of Tokyo Hongo 7-3-1, Bunkyo-ku, Tokyo 113-0033 CREST, JST

More information

Elements of Language Processing and Learning lecture 3

Elements of Language Processing and Learning lecture 3 Elements of Language Processing and Learning lecture 3 Ivan Titov TA: Milos Stanojevic Institute for Logic, Language and Computation Today Parsing algorithms for CFGs Recap, Chomsky Normal Form (CNF) A

More information

Formalizing Cardinality-based Feature Models and their Staged Configuration

Formalizing Cardinality-based Feature Models and their Staged Configuration Formalizing Cardinality-based Feature Models and their Staged Coniguration Krzyszto Czarnecki, Simon Helsen, and Ulrich Eisenecker 2 University o Waterloo, Canada 2 University o Applied Sciences Kaiserslautern,

More information

1. Generalized CKY Parsing Treebank empties and unaries. Seven Lectures on Statistical Parsing. Unary rules: Efficient CKY parsing

1. Generalized CKY Parsing Treebank empties and unaries. Seven Lectures on Statistical Parsing. Unary rules: Efficient CKY parsing even Lectures on tatistical Parsing 1. Generalized CKY Parsing Treebank empties and unaries -HLN -UBJ Christopher Manning LA Linguistic Institute 27 LA 354 Lecture 3 -NONEε Atone PTB Tree -NONEε Atone

More information

The CKY algorithm part 1: Recognition

The CKY algorithm part 1: Recognition The CKY algorithm part 1: Recognition Syntactic analysis (5LN455) 2014-11-17 Sara Stymne Department of Linguistics and Philology Mostly based on slides from Marco Kuhlmann Recap: Parsing Parsing The automatic

More information

School of Computing and Information Systems The University of Melbourne COMP90042 WEB SEARCH AND TEXT ANALYSIS (Semester 1, 2017)

School of Computing and Information Systems The University of Melbourne COMP90042 WEB SEARCH AND TEXT ANALYSIS (Semester 1, 2017) Discussion School of Computing and Information Systems The University of Melbourne COMP9004 WEB SEARCH AND TEXT ANALYSIS (Semester, 07). What is a POS tag? Sample solutions for discussion exercises: Week

More information

Binary recursion. Unate functions. If a cover C(f) is unate in xj, x, then f is unate in xj. x

Binary recursion. Unate functions. If a cover C(f) is unate in xj, x, then f is unate in xj. x Binary recursion Unate unctions! Theorem I a cover C() is unate in,, then is unate in.! Theorem I is unate in,, then every prime implicant o is unate in. Why are unate unctions so special?! Special Boolean

More information

ADTS, GRAMMARS, PARSING, TREE TRAVERSALS. Regrades 10/6/15. Prelim 1. Prelim 1. Expression trees. Pointers to material

ADTS, GRAMMARS, PARSING, TREE TRAVERSALS. Regrades 10/6/15. Prelim 1. Prelim 1. Expression trees. Pointers to material 1 Prelim 1 2 Max: 99 Mean: 71.2 Median: 73 Std Dev: 1.6 ADS, GRAMMARS, PARSING, R RAVRSALS Lecture 12 CS2110 Spring 2015 3 Prelim 1 Score Grade % 90-99 A 82-89 A-/A 26% 70-82 B/B 62-69 B-/B 50% 50-59 C-/C

More information

Connecting Definition and Use? Tiger Semantic Analysis. Symbol Tables. Symbol Tables (cont d)

Connecting Definition and Use? Tiger Semantic Analysis. Symbol Tables. Symbol Tables (cont d) Tiger source program Tiger Semantic Analysis lexical analyzer report all lexical errors token get next token parser construct variable deinitions to their uses report all syntactic errors absyn checks

More information

Statistical parsing. Fei Xia Feb 27, 2009 CSE 590A

Statistical parsing. Fei Xia Feb 27, 2009 CSE 590A Statistical parsing Fei Xia Feb 27, 2009 CSE 590A Statistical parsing History-based models (1995-2000) Recent development (2000-present): Supervised learning: reranking and label splitting Semi-supervised

More information

Syntax and Grammars 1 / 21

Syntax and Grammars 1 / 21 Syntax and Grammars 1 / 21 Outline What is a language? Abstract syntax and grammars Abstract syntax vs. concrete syntax Encoding grammars as Haskell data types What is a language? 2 / 21 What is a language?

More information

MATRIX ALGORITHM OF SOLVING GRAPH CUTTING PROBLEM

MATRIX ALGORITHM OF SOLVING GRAPH CUTTING PROBLEM UDC 681.3.06 MATRIX ALGORITHM OF SOLVING GRAPH CUTTING PROBLEM V.K. Pogrebnoy TPU Institute «Cybernetic centre» E-mail: vk@ad.cctpu.edu.ru Matrix algorithm o solving graph cutting problem has been suggested.

More information

Parsing with Dynamic Programming

Parsing with Dynamic Programming CS11-747 Neural Networks for NLP Parsing with Dynamic Programming Graham Neubig Site https://phontron.com/class/nn4nlp2017/ Two Types of Linguistic Structure Dependency: focus on relations between words

More information

1. [1 pt] What is the solution to the recurrence T(n) = 2T(n-1) + 1, T(1) = 1

1. [1 pt] What is the solution to the recurrence T(n) = 2T(n-1) + 1, T(1) = 1 Asymptotics, Recurrence and Basic Algorithms 1. [1 pt] What is the solution to the recurrence T(n) = 2T(n-1) + 1, T(1) = 1 1. O(logn) 2. O(n) 3. O(nlogn) 4. O(n 2 ) 5. O(2 n ) 2. [1 pt] What is the solution

More information

Principles of Programming Languages COMP251: Syntax and Grammars

Principles of Programming Languages COMP251: Syntax and Grammars Principles of Programming Languages COMP251: Syntax and Grammars Prof. Dekai Wu Department of Computer Science and Engineering The Hong Kong University of Science and Technology Hong Kong, China Fall 2007

More information

Homework 2: Parsing and Machine Learning

Homework 2: Parsing and Machine Learning Homework 2: Parsing and Machine Learning COMS W4705_001: Natural Language Processing Prof. Kathleen McKeown, Fall 2017 Due: Saturday, October 14th, 2017, 2:00 PM This assignment will consist of tasks in

More information

CS485/685 Computer Vision Spring 2012 Dr. George Bebis Programming Assignment 2 Due Date: 3/27/2012

CS485/685 Computer Vision Spring 2012 Dr. George Bebis Programming Assignment 2 Due Date: 3/27/2012 CS8/68 Computer Vision Spring 0 Dr. George Bebis Programming Assignment Due Date: /7/0 In this assignment, you will implement an algorithm or normalizing ace image using SVD. Face normalization is a required

More information

I Know Your Name: Named Entity Recognition and Structural Parsing

I Know Your Name: Named Entity Recognition and Structural Parsing I Know Your Name: Named Entity Recognition and Structural Parsing David Philipson and Nikil Viswanathan {pdavid2, nikil}@stanford.edu CS224N Fall 2011 Introduction In this project, we explore a Maximum

More information

A Proposed Approach for Solving Rough Bi-Level. Programming Problems by Genetic Algorithm

A Proposed Approach for Solving Rough Bi-Level. Programming Problems by Genetic Algorithm Int J Contemp Math Sciences, Vol 6, 0, no 0, 45 465 A Proposed Approach or Solving Rough Bi-Level Programming Problems by Genetic Algorithm M S Osman Department o Basic Science, Higher Technological Institute

More information

Homework & NLTK. CS 181: Natural Language Processing Lecture 9: Context Free Grammars. Motivation. Formal Def of CFG. Uses of CFG.

Homework & NLTK. CS 181: Natural Language Processing Lecture 9: Context Free Grammars. Motivation. Formal Def of CFG. Uses of CFG. C 181: Natural Language Processing Lecture 9: Context Free Grammars Kim Bruce Pomona College pring 2008 Homework & NLTK Review MLE, Laplace, and Good-Turing in smoothing.py Disclaimer: lide contents borrowed

More information

ADTS, GRAMMARS, PARSING, TREE TRAVERSALS

ADTS, GRAMMARS, PARSING, TREE TRAVERSALS 3//15 1 AD: Abstract Data ype 2 Just like a type: Bunch of values together with operations on them. Used often in discussing data structures Important: he definition says ntthing about the implementation,

More information

ADTS, GRAMMARS, PARSING, TREE TRAVERSALS

ADTS, GRAMMARS, PARSING, TREE TRAVERSALS 1 Pointers to material ADS, GRAMMARS, PARSING, R RAVRSALS Lecture 13 CS110 all 016 Parse trees: text, section 3.36 Definition of Java Language, sometimes useful: docs.oracle.com/javase/specs/jls/se8/html/index.html

More information

Compilers Course Lecture 4: Context Free Grammars

Compilers Course Lecture 4: Context Free Grammars Compilers Course Lecture 4: Context Free Grammars Example: attempt to define simple arithmetic expressions using named regular expressions: num = [0-9]+ sum = expr "+" expr expr = "(" sum ")" num Appears

More information

COMP 250 Fall graph traversal Nov. 15/16, 2017

COMP 250 Fall graph traversal Nov. 15/16, 2017 Graph traversal One problem we oten need to solve when working with graphs is to decide i there is a sequence o edges (a path) rom one vertex to another, or to ind such a sequence. There are various versions

More information

EECS 6083 Intro to Parsing Context Free Grammars

EECS 6083 Intro to Parsing Context Free Grammars EECS 6083 Intro to Parsing Context Free Grammars Based on slides from text web site: Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. 1 Parsing sequence of tokens parser

More information

([1-9] 1[0-2]):[0-5][0-9](AM PM)? What does the above match? Matches clock time, may or may not be told if it is AM or PM.

([1-9] 1[0-2]):[0-5][0-9](AM PM)? What does the above match? Matches clock time, may or may not be told if it is AM or PM. What is the corresponding regex? [2-9]: ([1-9] 1[0-2]):[0-5][0-9](AM PM)? What does the above match? Matches clock time, may or may not be told if it is AM or PM. CS 230 - Spring 2018 4-1 More CFG Notation

More information

CS 161: Design and Analysis of Algorithms

CS 161: Design and Analysis of Algorithms CS 161: Design and Analysis o Algorithms Announcements Homework 3, problem 3 removed Greedy Algorithms 4: Human Encoding/Set Cover Human Encoding Set Cover Alphabets and Strings Alphabet = inite set o

More information

Natural Language Processing

Natural Language Processing Natural Language Processing Anoop Sarkar http://www.cs.sfu.ca/~anoop 9/18/18 1 Ambiguity An input is ambiguous with respect to a CFG if it can be derived with two different parse trees A parser needs a

More information

If you are going to form a group for A2, please do it before tomorrow (Friday) noon GRAMMARS & PARSING. Lecture 8 CS2110 Spring 2014

If you are going to form a group for A2, please do it before tomorrow (Friday) noon GRAMMARS & PARSING. Lecture 8 CS2110 Spring 2014 1 If you are going to form a group for A2, please do it before tomorrow (Friday) noon GRAMMARS & PARSING Lecture 8 CS2110 Spring 2014 Pointers. DO visit the java spec website 2 Parse trees: Text page 592

More information

ADTS, GRAMMARS, PARSING, TREE TRAVERSALS

ADTS, GRAMMARS, PARSING, TREE TRAVERSALS 1 Prelim 1 2 Where: Kennedy Auditorium When: A-Lib: 5:30-7 Lie-Z: 7:30-9 (unless we explicitly notified you otherwise) ADS, GRAMMARS, PARSING, R RAVRSALS Lecture 13 CS2110 Spring 2016 Pointers to material

More information

9.8 Graphing Rational Functions

9.8 Graphing Rational Functions 9. Graphing Rational Functions Lets begin with a deinition. Deinition: Rational Function A rational unction is a unction o the orm P where P and Q are polynomials. Q An eample o a simple rational unction

More information

Transition-based Parsing with Neural Nets

Transition-based Parsing with Neural Nets CS11-747 Neural Networks for NLP Transition-based Parsing with Neural Nets Graham Neubig Site https://phontron.com/class/nn4nlp2017/ Two Types of Linguistic Structure Dependency: focus on relations between

More information

Automated Planning for Feature Model Configuration based on Functional and Non-Functional Requirements

Automated Planning for Feature Model Configuration based on Functional and Non-Functional Requirements Automated Planning or Feature Model Coniguration based on Functional and Non-Functional Requirements Samaneh Soltani 1, Mohsen Asadi 1, Dragan Gašević 2, Marek Hatala 1, Ebrahim Bagheri 2 1 Simon Fraser

More information

9.3 Transform Graphs of Linear Functions Use this blank page to compile the most important things you want to remember for cycle 9.

9.3 Transform Graphs of Linear Functions Use this blank page to compile the most important things you want to remember for cycle 9. 9. Transorm Graphs o Linear Functions Use this blank page to compile the most important things you want to remember or cycle 9.: Sec Math In-Sync by Jordan School District, Utah is licensed under a 6 Function

More information

The Expectation Maximization (EM) Algorithm

The Expectation Maximization (EM) Algorithm The Expectation Maximization (EM) Algorithm continued! 600.465 - Intro to NLP - J. Eisner 1 General Idea Start by devising a noisy channel Any model that predicts the corpus observations via some hidden

More information

Lecture 8: Context Free Grammars

Lecture 8: Context Free Grammars Lecture 8: Context Free s Dr Kieran T. Herley Department of Computer Science University College Cork 2017-2018 KH (12/10/17) Lecture 8: Context Free s 2017-2018 1 / 1 Specifying Non-Regular Languages Recall

More information

CS415 Compilers. Syntax Analysis. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

CS415 Compilers. Syntax Analysis. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University CS415 Compilers Syntax Analysis These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University Limits of Regular Languages Advantages of Regular Expressions

More information

Announcements. Application of Recursion. Motivation. A Grammar. A Recursive Grammar. Grammars and Parsing

Announcements. Application of Recursion. Motivation. A Grammar. A Recursive Grammar. Grammars and Parsing Grammars and Lecture 5 CS211 Fall 2005 CMS is being cleaned If you did not turn in A1 and haven t already contacted David Schwartz, you need to contact him now Java Reminder Any significant class should

More information

Travaux Dirigés n 2 Modèles des langages de programmation Master Parisien de Recherche en Informatique Paul-André Melliès

Travaux Dirigés n 2 Modèles des langages de programmation Master Parisien de Recherche en Informatique Paul-André Melliès Travaux Dirigés n 2 Modèles des langages de programmation Master Parisien de Recherche en Inormatique Paul-André Melliès Problem. We saw in the course that every coherence space A induces a partial order

More information

A simple pattern-matching algorithm for recovering empty nodes and their antecedents

A simple pattern-matching algorithm for recovering empty nodes and their antecedents A simple pattern-matching algorithm for recovering empty nodes and their antecedents Mark Johnson Brown Laboratory for Linguistic Information Processing Brown University Mark Johnson@Brown.edu Abstract

More information

A Simple Syntax-Directed Translator

A Simple Syntax-Directed Translator Chapter 2 A Simple Syntax-Directed Translator 1-1 Introduction The analysis phase of a compiler breaks up a source program into constituent pieces and produces an internal representation for it, called

More information

COP 3402 Systems Software Syntax Analysis (Parser)

COP 3402 Systems Software Syntax Analysis (Parser) COP 3402 Systems Software Syntax Analysis (Parser) Syntax Analysis 1 Outline 1. Definition of Parsing 2. Context Free Grammars 3. Ambiguous/Unambiguous Grammars Syntax Analysis 2 Lexical and Syntax Analysis

More information

syntax tree - * * * - * * * * * 2 1 * * 2 * (2 * 1) - (1 + 0)

syntax tree - * * * - * * * * * 2 1 * * 2 * (2 * 1) - (1 + 0) 0//7 xpression rees rom last time: we can draw a syntax tree for the Java expression ( 0). 0 ASS, GRAMMARS, PARSING, R RAVRSALS Lecture 3 CS0 all 07 Preorder, Postorder, and Inorder Preorder, Postorder,

More information

Comp 411 Principles of Programming Languages Lecture 3 Parsing. Corky Cartwright January 11, 2019

Comp 411 Principles of Programming Languages Lecture 3 Parsing. Corky Cartwright January 11, 2019 Comp 411 Principles of Programming Languages Lecture 3 Parsing Corky Cartwright January 11, 2019 Top Down Parsing What is a context-free grammar (CFG)? A recursive definition of a set of strings; it is

More information

MEMMs (Log-Linear Tagging Models)

MEMMs (Log-Linear Tagging Models) Chapter 8 MEMMs (Log-Linear Tagging Models) 8.1 Introduction In this chapter we return to the problem of tagging. We previously described hidden Markov models (HMMs) for tagging problems. This chapter

More information

Discriminative Training with Perceptron Algorithm for POS Tagging Task

Discriminative Training with Perceptron Algorithm for POS Tagging Task Discriminative Training with Perceptron Algorithm for POS Tagging Task Mahsa Yarmohammadi Center for Spoken Language Understanding Oregon Health & Science University Portland, Oregon yarmoham@ohsu.edu

More information

Advanced PCFG Parsing

Advanced PCFG Parsing Advanced PCFG Parsing Computational Linguistics Alexander Koller 8 December 2017 Today Parsing schemata and agenda-based parsing. Semiring parsing. Pruning techniques for chart parsing. The CKY Algorithm

More information

Compiler construction

Compiler construction This lecture Compiler construction Lecture 5: Project etensions Magnus Mreen Spring 2018 Chalmers Universit o Technolog Gothenburg Universit Some project etensions: Arras Pointers and structures Object-oriented

More information

Ling 571: Deep Processing for Natural Language Processing

Ling 571: Deep Processing for Natural Language Processing Ling 571: Deep Processing for Natural Language Processing Julie Medero January 14, 2013 Today s Plan Motivation for Parsing Parsing as Search Search algorithms Search Strategies Issues Two Goal of Parsing

More information

The Graph of an Equation Graph the following by using a table of values and plotting points.

The Graph of an Equation Graph the following by using a table of values and plotting points. Calculus Preparation - Section 1 Graphs and Models Success in math as well as Calculus is to use a multiple perspective -- graphical, analytical, and numerical. Thanks to Rene Descartes we can represent

More information

Introduction to Parsing. Lecture 5

Introduction to Parsing. Lecture 5 Introduction to Parsing Lecture 5 1 Outline Regular languages revisited Parser overview Context-free grammars (CFG s) Derivations Ambiguity 2 Languages and Automata Formal languages are very important

More information

INF FALL NATURAL LANGUAGE PROCESSING. Jan Tore Lønning, Lecture 4, 10.9

INF FALL NATURAL LANGUAGE PROCESSING. Jan Tore Lønning, Lecture 4, 10.9 1 INF5830 2015 FALL NATURAL LANGUAGE PROCESSING Jan Tore Lønning, Lecture 4, 10.9 2 Working with texts From bits to meaningful units Today: 3 Reading in texts Character encodings and Unicode Word tokenization

More information

CS224n: Natural Language Processing with Deep Learning 1 Lecture Notes: Part IV Dependency Parsing 2 Winter 2019

CS224n: Natural Language Processing with Deep Learning 1 Lecture Notes: Part IV Dependency Parsing 2 Winter 2019 CS224n: Natural Language Processing with Deep Learning 1 Lecture Notes: Part IV Dependency Parsing 2 Winter 2019 1 Course Instructors: Christopher Manning, Richard Socher 2 Authors: Lisa Wang, Juhi Naik,

More information

GRAMMARS & PARSING. Lecture 7 CS2110 Fall 2013

GRAMMARS & PARSING. Lecture 7 CS2110 Fall 2013 1 GRAMMARS & PARSING Lecture 7 CS2110 Fall 2013 Pointers to the textbook 2 Parse trees: Text page 592 (23.34), Figure 23-31 Definition of Java Language, sometimes useful: http://docs.oracle.com/javase/specs/jls/se7/html/index.html

More information

Parsing Chart Extensions 1

Parsing Chart Extensions 1 Parsing Chart Extensions 1 Chart extensions Dynamic programming methods CKY Earley NLP parsing dynamic programming systems 1 Parsing Chart Extensions 2 Constraint propagation CSP (Quesada) Bidirectional

More information

CS122 Lecture 4 Winter Term,

CS122 Lecture 4 Winter Term, CS122 Lecture 4 Winter Term, 2014-2015 2 SQL Query Transla.on Last time, introduced query evaluation pipeline SQL query SQL parser abstract syntax tree SQL translator relational algebra plan query plan

More information

Composite functions. [Type the document subtitle] Composite functions, working them out.

Composite functions. [Type the document subtitle] Composite functions, working them out. Composite unctions [Type the document subtitle] Composite unctions, workin them out. luxvis 11/19/01 Composite Functions What are they? In the real world, it is not uncommon or the output o one thin to

More information

Message authentication

Message authentication Message authentication -- Reminder on hash unctions -- MAC unctions hash based block cipher based -- Digital signatures (c) Levente Buttyán (buttyan@crysys.hu) Hash unctions a hash unction is a unction

More information

Context-Free Languages & Grammars (CFLs & CFGs) Reading: Chapter 5

Context-Free Languages & Grammars (CFLs & CFGs) Reading: Chapter 5 Context-Free Languages & Grammars (CFLs & CFGs) Reading: Chapter 5 1 Not all languages are regular So what happens to the languages which are not regular? Can we still come up with a language recognizer?

More information

Introduction to Parsing

Introduction to Parsing Introduction to Parsing The Front End Source code Scanner tokens Parser IR Errors Parser Checks the stream of words and their parts of speech (produced by the scanner) for grammatical correctness Determines

More information

Extracting Synchronous Grammar Rules From Word-Level Alignments in Linear Time

Extracting Synchronous Grammar Rules From Word-Level Alignments in Linear Time Etracting Synchronous Grammar Rules From Word-Level Alignments in Linear Time Hao Zhang and Daniel Gildea Computer Science Department University o Rochester Rochester, NY, USA David Chiang Inormation Sciences

More information

CSE 3302 Programming Languages Lecture 2: Syntax

CSE 3302 Programming Languages Lecture 2: Syntax CSE 3302 Programming Languages Lecture 2: Syntax (based on slides by Chengkai Li) Leonidas Fegaras University of Texas at Arlington CSE 3302 L2 Spring 2011 1 How do we define a PL? Specifying a PL: Syntax:

More information

Natural Language Dependency Parsing. SPFLODD October 13, 2011

Natural Language Dependency Parsing. SPFLODD October 13, 2011 Natural Language Dependency Parsing SPFLODD October 13, 2011 Quick Review from Tuesday What are CFGs? PCFGs? Weighted CFGs? StaKsKcal parsing using the Penn Treebank What it looks like How to build a simple

More information

Specifying Syntax. An English Grammar. Components of a Grammar. Language Specification. Types of Grammars. 1. Terminal symbols or terminals, Σ

Specifying Syntax. An English Grammar. Components of a Grammar. Language Specification. Types of Grammars. 1. Terminal symbols or terminals, Σ Specifying Syntax Language Specification Components of a Grammar 1. Terminal symbols or terminals, Σ Syntax Form of phrases Physical arrangement of symbols 2. Nonterminal symbols or syntactic categories,

More information

Compiler Design Concepts. Syntax Analysis

Compiler Design Concepts. Syntax Analysis Compiler Design Concepts Syntax Analysis Introduction First task is to break up the text into meaningful words called tokens. newval=oldval+12 id = id + num Token Stream Lexical Analysis Source Code (High

More information

Introduction to Syntax Analysis. The Second Phase of Front-End

Introduction to Syntax Analysis. The Second Phase of Front-End Compiler Design IIIT Kalyani, WB 1 Introduction to Syntax Analysis The Second Phase of Front-End Compiler Design IIIT Kalyani, WB 2 Syntax Analysis The syntactic or the structural correctness of a program

More information

Context-Free Grammars and Languages (2015/11)

Context-Free Grammars and Languages (2015/11) Chapter 5 Context-Free Grammars and Languages (2015/11) Adriatic Sea shore at Opatija, Croatia Outline 5.0 Introduction 5.1 Context-Free Grammars (CFG s) 5.2 Parse Trees 5.3 Applications of CFG s 5.4 Ambiguity

More information

A new approach for ranking trapezoidal vague numbers by using SAW method

A new approach for ranking trapezoidal vague numbers by using SAW method Science Road Publishing Corporation Trends in Advanced Science and Engineering ISSN: 225-6557 TASE 2() 57-64, 20 Journal homepage: http://www.sciroad.com/ntase.html A new approach or ranking trapezoidal

More information

Chapter 3 Image Enhancement in the Spatial Domain

Chapter 3 Image Enhancement in the Spatial Domain Chapter 3 Image Enhancement in the Spatial Domain Yinghua He School o Computer Science and Technology Tianjin University Image enhancement approaches Spatial domain image plane itsel Spatial domain methods

More information

Application of recursion. Grammars and Parsing. Recursive grammar. Grammars

Application of recursion. Grammars and Parsing. Recursive grammar. Grammars Application of recursion Grammars and Parsing So far, we have written recursive programs on integers. Let us now consider a new application, grammars and parsing, that shows off the full power of recursion.

More information

Piecewise polynomial interpolation

Piecewise polynomial interpolation Chapter 2 Piecewise polynomial interpolation In ection.6., and in Lab, we learned that it is not a good idea to interpolate unctions by a highorder polynomials at equally spaced points. However, it transpires

More information

Section A. A grammar that produces more than one parse tree for some sentences is said to be ambiguous.

Section A. A grammar that produces more than one parse tree for some sentences is said to be ambiguous. Section A 1. What do you meant by parser and its types? A parser for grammar G is a program that takes as input a string w and produces as output either a parse tree for w, if w is a sentence of G, or

More information

Objective Function Designing Led by User Preferences Acquisition

Objective Function Designing Led by User Preferences Acquisition Objective Function Designing Led by User Preerences Acquisition Patrick Taillandier and Julien Gauri Abstract Many real world problems can be deined as optimisation problems in which the aim is to maximise

More information

CS395T Project 2: Shift-Reduce Parsing

CS395T Project 2: Shift-Reduce Parsing CS395T Project 2: Shift-Reduce Parsing Due date: Tuesday, October 17 at 9:30am In this project you ll implement a shift-reduce parser. First you ll implement a greedy model, then you ll extend that model

More information

Computational Linguistics

Computational Linguistics Computational Linguistics CSC 2501 / 485 Fall 2018 3 3. Chart parsing Gerald Penn Department of Computer Science, University of Toronto Reading: Jurafsky & Martin: 13.3 4. Allen: 3.4, 3.6. Bird et al:

More information

CSE 12 Abstract Syntax Trees

CSE 12 Abstract Syntax Trees CSE 12 Abstract Syntax Trees Compilers and Interpreters Parse Trees and Abstract Syntax Trees (AST's) Creating and Evaluating AST's The Table ADT and Symbol Tables 16 Using Algorithms and Data Structures

More information

Learning Implied Global Constraints

Learning Implied Global Constraints Learning Implied Global Constraints Christian Bessiere LIRMM-CNRS U. Montpellier, France bessiere@lirmm.r Remi Coletta LRD Montpellier, France coletta@l-rd.r Thierry Petit LINA-CNRS École des Mines de

More information

Introduction to Syntax Analysis

Introduction to Syntax Analysis Compiler Design 1 Introduction to Syntax Analysis Compiler Design 2 Syntax Analysis The syntactic or the structural correctness of a program is checked during the syntax analysis phase of compilation.

More information

10. SOPC Builder Component Development Walkthrough

10. SOPC Builder Component Development Walkthrough 10. SOPC Builder Component Development Walkthrough QII54007-9.0.0 Introduction This chapter describes the parts o a custom SOPC Builder component and guides you through the process o creating an example

More information

The λ-calculus. 1 Background on Computability. 2 Some Intuition for the λ-calculus. 1.1 Alan Turing. 1.2 Alonzo Church

The λ-calculus. 1 Background on Computability. 2 Some Intuition for the λ-calculus. 1.1 Alan Turing. 1.2 Alonzo Church The λ-calculus 1 Background on Computability The history o computability stretches back a long ways, but we ll start here with German mathematician David Hilbert in the 1920s. Hilbert proposed a grand

More information

Data Structure and Algorithm Homework #3 Due: 2:20pm, Tuesday, April 9, 2013 TA === Homework submission instructions ===

Data Structure and Algorithm Homework #3 Due: 2:20pm, Tuesday, April 9, 2013 TA   === Homework submission instructions === Data Structure and Algorithm Homework #3 Due: 2:20pm, Tuesday, April 9, 2013 TA email: dsa1@csientuedutw === Homework submission instructions === For Problem 1, submit your source code, a Makefile to compile

More information

Outline. Limitations of regular languages Parser overview Context-free grammars (CFG s) Derivations Syntax-Directed Translation

Outline. Limitations of regular languages Parser overview Context-free grammars (CFG s) Derivations Syntax-Directed Translation Outline Introduction to Parsing Lecture 8 Adapted from slides by G. Necula and R. Bodik Limitations of regular languages Parser overview Context-free grammars (CG s) Derivations Syntax-Directed ranslation

More information

Research Article Synthesis of Test Scenarios Using UML Sequence Diagrams

Research Article Synthesis of Test Scenarios Using UML Sequence Diagrams International Scholarly Research Network ISRN Sotware Engineering Volume 2012, Article ID 324054, 22 pages doi:10.5402/2012/324054 Research Article Synthesis o Test Scenarios Using UML Sequence Diagrams

More information

foldr CS 5010 Program Design Paradigms Lesson 5.4

foldr CS 5010 Program Design Paradigms Lesson 5.4 oldr CS 5010 Program Design Paradigms Lesson 5.4 Mitchell Wand, 2012-2014 This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License. 1 Introduction In this lesson,

More information

CSE 417 Dynamic Programming (pt 6) Parsing Algorithms

CSE 417 Dynamic Programming (pt 6) Parsing Algorithms CSE 417 Dynamic Programming (pt 6) Parsing Algorithms Reminders > HW9 due on Friday start early program will be slow, so debugging will be slow... should run in 2-4 minutes > Please fill out course evaluations

More information

Lexical Semantics. Regina Barzilay MIT. October, 5766

Lexical Semantics. Regina Barzilay MIT. October, 5766 Lexical Semantics Regina Barzilay MIT October, 5766 Last Time: Vector-Based Similarity Measures man woman grape orange apple n Euclidian: x, y = x y = i=1 ( x i y i ) 2 n x y x i y i i=1 Cosine: cos( x,

More information

AT&T: The Tag&Parse Approach to Semantic Parsing of Robot Spatial Commands

AT&T: The Tag&Parse Approach to Semantic Parsing of Robot Spatial Commands AT&T: The Tag&Parse Approach to Semantic Parsing of Robot Spatial Commands Svetlana Stoyanchev, Hyuckchul Jung, John Chen, Srinivas Bangalore AT&T Labs Research 1 AT&T Way Bedminster NJ 07921 {sveta,hjung,jchen,srini}@research.att.com

More information

OpenGL Rendering Pipeline and Programmable Shaders

OpenGL Rendering Pipeline and Programmable Shaders h gpup Topics OpenGL Rendering Pipeline and Programmable s sh gpup Rendering Pipeline Types OpenGL Language Basics h gpup EE 4702- Lecture Transparency. Formatted 8:06, 7 December 205 rom rendering-pipeline.

More information