Exam III March 17, 2010

CIS 4930 NLP, Exam III, March 17, 2010
Print Your Name:
Total Score:

Your work is to be done individually. The exam is worth 106 points (six points of extra credit are available throughout the exam) and it has twelve questions. Unless a problem directly instructs you differently, there are no known errors within this document. If you are instructed to use specific functionality to solve a problem, then follow the guidelines given. Otherwise, you are allowed to utilize anything from Python modules, provided you include all statements allowing access to such functionality.

Here is the simplified Brown Tag Set for your reference. Unless otherwise specified, all corpora will be tagged using the definitions of this set.

Tag   Meaning              Tag   Meaning           Tag   Meaning         Tag   Meaning
ADJ   Adjective            ADV   Adverb            CNJ   Conjunction     DET   Determiner
EX    Existential          FW    Foreign Word      MOD   Modal Verb      N     Noun
NP    Proper Noun          NUM   Number            PRO   Pronoun         P     Preposition
TO    The Word to          UH    Interjection      V     Verb            VD    Past Tense
VG    Present Participle   VN    Past Participle   WH    wh Determiner

1. [6 pts] Define and describe the following parts of speech.
(a) Noun - a person, place, or thing.
(b) Past Participle - the form of a verb used to make perfect tenses and passive forms of verbs; the verb form following some form of the verb "have".

2. [5 pts] Define and describe Bayes' Rule.
A theorem for finding the probability of a fact A being true given that fact B is true.

3. [5 pts] Define and describe the Null Hypothesis Test.
The technique of setting up a hypothesis to be nullified or refuted in order to support an alternative hypothesis.
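For reference, the theorem named in question 2 is usually written as below. This is the standard textbook statement rather than the wording of the answer key; A and B are the two facts from the answer, and P(B) is assumed to be nonzero.

```latex
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}, \qquad P(B) > 0
```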

4. [6 pts] State the formula for Pearson's Chi Square Test.

5. [6 pts] Using the values: a total of 5,000 tokens on the course schedule page, CIS occurring 48 times, 4930 occurring 11 times, and CIS 4930 occurring 10 times, create the table of data used by Pearson's Chi Square Test.

            CIS    !CIS
 4930        10       1
!4930        38    4951

6. [6 pts] We would like to calculate the mean differential between the tokens 4930 and 4905 when each is preceded by CIS. In the Spring 2010 course schedule, CIS occurs 48 times, CIS 4930 occurs 10 times, and CIS 4905 occurs 1 time. Resolve your calculation as much as you can by hand; you may leave your result in a fractional form.

$$\frac{C(w_1 w) - C(w_2 w)}{\sqrt{C(w_1 w) + C(w_2 w)}} = \frac{10 - 1}{\sqrt{10 + 1}} = \frac{9}{\sqrt{11}}$$
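The arithmetic in questions 4 through 6 can be checked with a short Python 3 sketch. It assumes the usual form of Pearson's statistic, the sum over all cells of (observed - expected)^2 / expected, and it treats the 5,000 tokens as the total for the 2x2 table. The function names are hypothetical and are not part of the exam.

```python
import math


def contingency_table(n_tokens, count_w1, count_w2, count_bigram):
    """Build the 2x2 observed-count table for the bigram (w1, w2).

    Rows: w2 present / w2 absent. Columns: w1 present / w1 absent.
    """
    o11 = count_bigram                # w1 followed by w2
    o12 = count_w2 - count_bigram     # w2 without w1 in front of it
    o21 = count_w1 - count_bigram     # w1 without w2 after it
    o22 = n_tokens - o11 - o12 - o21  # neither word
    return ((o11, o12), (o21, o22))


def chi_square(table):
    """Pearson's statistic: sum of (observed - expected)^2 / expected."""
    total = sum(sum(row) for row in table)
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_sums[i] * col_sums[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


def difference_score(count_a, count_b):
    """The expression from question 6: (a - b) / sqrt(a + b)."""
    return (count_a - count_b) / math.sqrt(count_a + count_b)


# Counts from questions 5 and 6: w1 = "CIS", w2 = "4930".
table = contingency_table(5000, 48, 11, 10)
chi2 = chi_square(table)
score = difference_score(10, 1)
```

A chi-square value far above the usual one-degree-of-freedom critical value (about 3.84 at the 0.05 level) suggests that CIS and 4930 do not co-occur by chance.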

7. [8 pts] A tagger exists within the file: Tagger.pkl. Show how to read this tagger into your program for re-use.

    from cPickle import load
    input = open('Tagger.pkl', 'rb')
    tagger = load(input)
    input.close()

8. [8 pts] You are given a list of tagged sentences called training. Show how to create a bigram tagger using this set of training data and the tagger you read in from the prior question as your backoff tagger.

    import nltk
    t1 = nltk.BigramTagger(training, backoff=tagger)

9. [4 pts] Given a list of untagged data called data, show how to tag this data using the tagger you created in the prior question.

    t1.tag(data)
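To make the backoff idea in questions 7 through 9 concrete, here is a toy Python 3 sketch. It is not NLTK's implementation: `LookupTagger` and `ToyBigramTagger` are hypothetical classes, and an in-memory pickle round trip of a small dictionary stands in for reading Tagger.pkl. The bigram tagger picks the tag seen most often for a (previous tag, word) context and defers to the backoff tagger when the context was never seen in training.

```python
import io
import pickle
from collections import Counter, defaultdict


class LookupTagger:
    """Hypothetical backoff tagger: dictionary lookup with a default tag."""

    def __init__(self, lexicon, default="N"):
        self.lexicon = lexicon
        self.default = default

    def tag(self, tokens):
        return [(tok, self.lexicon.get(tok, self.default)) for tok in tokens]


class ToyBigramTagger:
    """Hypothetical bigram tagger keyed on (previous tag, current word)."""

    def __init__(self, training, backoff):
        counts = defaultdict(Counter)
        for sentence in training:
            prev = None
            for word, tag in sentence:
                counts[(prev, word)][tag] += 1
                prev = tag
        # Keep only the most frequent tag for each context.
        self.table = {ctx: c.most_common(1)[0][0] for ctx, c in counts.items()}
        self.backoff = backoff

    def tag(self, tokens):
        fallback = self.backoff.tag(tokens)
        tagged, prev = [], None
        for i, word in enumerate(tokens):
            # Unseen context: use the backoff tagger's answer for this token.
            tag = self.table.get((prev, word), fallback[i][1])
            tagged.append((word, tag))
            prev = tag
        return tagged


# Pickle round trip in memory, standing in for open('Tagger.pkl', 'rb').
buffer = io.BytesIO()
pickle.dump({"the": "DET", "a": "DET"}, buffer)
buffer.seek(0)
backoff = LookupTagger(pickle.load(buffer))

training = [[("the", "DET"), ("drink", "N"), ("spilled", "VD")]]
t1 = ToyBigramTagger(training, backoff=backoff)
result = t1.tag(["the", "drink", "spilled", "a", "glass"])
```

In this run the first three tokens are tagged from the bigram table, while "a" and "glass" fall through to the backoff tagger because their contexts never occurred in the training data.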

10. [16 pts] Create a method that will receive a tagged corpus and a specific tag. The method will search the corpus for the tag that most commonly follows the tag received. Return a list composed of: the specified tag, the most commonly following tag, and the frequency with which the most common tag follows the specified tag.

    def findmostcommonfollower(tagged_corpus, specified):
        from collections import defaultdict
        words = tagged_corpus.tagged_words(simplify_tags=True)
        followers = defaultdict(int)
        length = len(words)
        totalcount = 0
        for i in range(length):
            if words[i][1] == specified:
                totalcount += 1
                if i != length - 1:
                    followers[words[i + 1][1]] += 1
        max = 0
        maxtag = 0
        for each in followers.keys():
            nextcount = followers[each]
            if nextcount > max:
                max = nextcount
                maxtag = each
        return [specified, maxtag, (max + 0.0) / totalcount]
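The same counting idea from question 10 can be sketched in Python 3 on a plain list of (word, tag) pairs, which avoids depending on any particular corpus reader. The function name `most_common_follower` and the sample sentence tags are assumptions made for illustration. Unlike the answer above, this version returns a frequency of 0.0 when the specified tag never has a follower, rather than dividing by zero.

```python
from collections import Counter


def most_common_follower(tagged_words, specified):
    """Return [specified, most common following tag, relative frequency]."""
    # Count the tag of every token that directly follows a `specified` token.
    followers = Counter(
        next_tag
        for (_, tag), (_, next_tag) in zip(tagged_words, tagged_words[1:])
        if tag == specified
    )
    total = sum(1 for _, tag in tagged_words if tag == specified)
    if not followers:
        return [specified, None, 0.0]
    best_tag, best_count = followers.most_common(1)[0]
    return [specified, best_tag, best_count / total]


sample = [
    ("the", "DET"), ("drink", "N"), ("spilled", "VD"), ("from", "P"),
    ("my", "PRO"), ("glass", "N"), ("and", "CNJ"), ("landed", "VD"),
    ("on", "P"), ("my", "PRO"), ("new", "ADJ"), ("shoes", "N"),
]
after_preposition = most_common_follower(sample, "P")
```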

11. [20 pts] Prepositional phrases are made up of a preposition and some set of following words, and are ended with a noun (the object of the preposition). Create a method that will receive a tagged corpus. The method will search the corpus for all prepositions and return a list of tuples (or sub-lists) containing: the preposition, the entire prepositional phrase (including preposition), and the number of tokens (words) within the prepositional phrase. Consider "the drink spilled from my glass and landed on my new shoes"; your method will return: [('from', 'from my glass', 3), ('on', 'on my new shoes', 4)].

    def findprepositions(tagged_corpus):
        words = tagged_corpus.tagged_words(simplify_tags=True)
        results = []
        lastprep = None
        count = 0
        phrase = ''
        for each in words:
            if each[1] == 'P':
                lastprep = each[0]
                count = 1
                phrase = lastprep
            elif lastprep != None:
                phrase += ' ' + each[0]
                count += 1
                if each[1] == 'N':
                    results.append([lastprep, phrase, count])
                    lastprep = None
        return results
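A minimal Python 3 sketch of the scan in question 11 follows, run on the example sentence from the question. It takes a plain list of (word, tag) pairs instead of a corpus object; the function name and the hand-assigned tags are assumptions. As in the answer above, a new preposition restarts the phrase and the first noun closes it.

```python
def find_prepositional_phrases(tagged_words):
    """Return (preposition, phrase, token count) for each phrase found."""
    results = []
    current = None  # words of the phrase being built, or None if outside one
    for word, tag in tagged_words:
        if tag == "P":
            current = [word]
        elif current is not None:
            current.append(word)
            if tag == "N":
                results.append((current[0], " ".join(current), len(current)))
                current = None
    return results


sentence = [
    ("the", "DET"), ("drink", "N"), ("spilled", "VD"), ("from", "P"),
    ("my", "PRO"), ("glass", "N"), ("and", "CNJ"), ("landed", "VD"),
    ("on", "P"), ("my", "PRO"), ("new", "ADJ"), ("shoes", "N"),
]
phrases = find_prepositional_phrases(sentence)
```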

12. [16 pts] The feminine pronouns are: she, her, herself, and hers and the masculine pronouns are: he, him, himself, and his. Create a method that will receive a tagged corpus and return the ratio of feminine pronouns to masculine pronouns.

    def findgenderratio(tagged_corpus):
        words = tagged_corpus.tagged_words(simplify_tags=True)
        feminine = 0
        masculine = 0
        for each in words:
            if each[1] == 'PRO':
                if each[0] == 'she' or each[0] == 'her' or each[0] == 'herself' or each[0] == 'hers':
                    feminine += 1
                elif each[0] == 'he' or each[0] == 'him' or each[0] == 'himself' or each[0] == 'his':
                    masculine += 1
        return (feminine + 0.0) / masculine
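A variant of the question 12 count can be sketched in Python 3 with set membership on a plain list of (word, tag) pairs. Two choices here are assumptions that go beyond the answer above: words are lowercased before matching, so a sentence-initial "She" is counted, and the function returns None when there are no masculine pronouns instead of raising a division error. The function name is hypothetical.

```python
FEMININE = {"she", "her", "herself", "hers"}
MASCULINE = {"he", "him", "himself", "his"}


def gender_ratio(tagged_words):
    """Ratio of feminine to masculine pronouns, or None if undefined."""
    feminine = 0
    masculine = 0
    for word, tag in tagged_words:
        if tag != "PRO":
            continue
        lowered = word.lower()
        if lowered in FEMININE:
            feminine += 1
        elif lowered in MASCULINE:
            masculine += 1
    if masculine == 0:
        return None
    return feminine / masculine


sample = [
    ("She", "PRO"), ("gave", "VD"), ("him", "PRO"), ("his", "PRO"),
    ("book", "N"), ("and", "CNJ"), ("her", "PRO"), ("pen", "N"),
]
ratio = gender_ratio(sample)
```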

ERROR CORRECTION USING NATURAL LANGUAGE PROCESSING. A Thesis NILESH KUMAR JAVAR

ERROR CORRECTION USING NATURAL LANGUAGE PROCESSING. A Thesis NILESH KUMAR JAVAR ERROR CORRECTION USING NATURAL LANGUAGE PROCESSING A Thesis by NILESH KUMAR JAVAR Submitted to the Office of Graduate and Professional Studies of Texas A&M University in partial fulfillment of the requirements

More information

Morpho-syntactic Analysis with the Stanford CoreNLP

Morpho-syntactic Analysis with the Stanford CoreNLP Morpho-syntactic Analysis with the Stanford CoreNLP Danilo Croce croce@info.uniroma2.it WmIR 2015/2016 Objectives of this tutorial Use of a Natural Language Toolkit CoreNLP toolkit Morpho-syntactic analysis

More information

Hidden Markov Models. Natural Language Processing: Jordan Boyd-Graber. University of Colorado Boulder LECTURE 20. Adapted from material by Ray Mooney

Hidden Markov Models. Natural Language Processing: Jordan Boyd-Graber. University of Colorado Boulder LECTURE 20. Adapted from material by Ray Mooney Hidden Markov Models Natural Language Processing: Jordan Boyd-Graber University of Colorado Boulder LECTURE 20 Adapted from material by Ray Mooney Natural Language Processing: Jordan Boyd-Graber Boulder

More information

INF FALL NATURAL LANGUAGE PROCESSING. Jan Tore Lønning, Lecture 4, 10.9

INF FALL NATURAL LANGUAGE PROCESSING. Jan Tore Lønning, Lecture 4, 10.9 1 INF5830 2015 FALL NATURAL LANGUAGE PROCESSING Jan Tore Lønning, Lecture 4, 10.9 2 Working with texts From bits to meaningful units Today: 3 Reading in texts Character encodings and Unicode Word tokenization

More information

CSC401 Natural Language Computing

CSC401 Natural Language Computing CSC401 Natural Language Computing Jan 19, 2018 TA: Willie Chang Varada Kolhatkar, Ka-Chun Won, and Aryan Arbabi) Mascots: r/sandersforpresident (left) and r/the_donald (right) To perform sentiment analysis

More information

Privacy and Security in Online Social Networks Department of Computer Science and Engineering Indian Institute of Technology, Madras

Privacy and Security in Online Social Networks Department of Computer Science and Engineering Indian Institute of Technology, Madras Privacy and Security in Online Social Networks Department of Computer Science and Engineering Indian Institute of Technology, Madras Lecture - 25 Tutorial 5: Analyzing text using Python NLTK Hi everyone,

More information

13.1 End Marks Using Periods Rule Use a period to end a declarative sentence a statement of fact or opinion.

13.1 End Marks Using Periods Rule Use a period to end a declarative sentence a statement of fact or opinion. 13.1 End Marks Using Periods Rule 13.1.1 Use a period to end a declarative sentence a statement of fact or opinion. Rule 13.1.2 Use a period to end most imperative sentences sentences that give directions

More information

Documentation and analysis of an. endangered language: aspects of. the grammar of Griko

Documentation and analysis of an. endangered language: aspects of. the grammar of Griko Documentation and analysis of an endangered language: aspects of the grammar of Griko Database and Website manual Antonis Anastasopoulos Marika Lekakou NTUA UOI December 12, 2013 Contents Introduction...............................

More information

A Multilingual Social Media Linguistic Corpus

A Multilingual Social Media Linguistic Corpus A Multilingual Social Media Linguistic Corpus Luis Rei 1,2 Dunja Mladenić 1,2 Simon Krek 1 1 Artificial Intelligence Laboratory Jožef Stefan Institute 2 Jožef Stefan International Postgraduate School 4th

More information

NLP Final Project Fall 2015, Due Friday, December 18

NLP Final Project Fall 2015, Due Friday, December 18 NLP Final Project Fall 2015, Due Friday, December 18 For the final project, everyone is required to do some sentiment classification and then choose one of the other three types of projects: annotation,

More information

CIS 660. Image Searching System using CNN-LSTM. Presented by. Mayur Rumalwala Sagar Dahiwala

CIS 660. Image Searching System using CNN-LSTM. Presented by. Mayur Rumalwala Sagar Dahiwala CIS 660 using CNN-LSTM Presented by Mayur Rumalwala Sagar Dahiwala AGENDA Problem in Image Searching? Proposed Solution Tools, Library and Dataset used Architecture of Proposed System Implementation of

More information

THE knowledge needed by software developers

THE knowledge needed by software developers SUBMITTED TO IEEE TRANSACTIONS ON SOFTWARE ENGINEERING 1 Extracting Development Tasks to Navigate Software Documentation Christoph Treude, Martin P. Robillard and Barthélémy Dagenais Abstract Knowledge

More information

A tool for Cross-Language Pair Annotations: CLPA

A tool for Cross-Language Pair Annotations: CLPA A tool for Cross-Language Pair Annotations: CLPA August 28, 2006 This document describes our tool called Cross-Language Pair Annotator (CLPA) that is capable to automatically annotate cognates and false

More information

Student Guide for Usage of Criterion

Student Guide for Usage of Criterion Student Guide for Usage of Criterion Criterion is an Online Writing Evaluation service offered by ETS. It is a computer-based scoring program designed to help you think about your writing process and communicate

More information

THE knowledge needed by software developers is captured

THE knowledge needed by software developers is captured IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, VOL. 41, NO. 6, JUNE 2015 565 Extracting Development Tasks to Navigate Software Documentation Christoph Treude, Martin P. Robillard, and Barthelemy Dagenais Abstract

More information

English Understanding: From Annotations to AMRs

English Understanding: From Annotations to AMRs English Understanding: From Annotations to AMRs Nathan Schneider August 28, 2012 :: ISI NLP Group :: Summer Internship Project Presentation 1 Current state of the art: syntax-based MT Hierarchical/syntactic

More information

Lab II - Product Specification Outline. CS 411W Lab II. Prototype Product Specification For CLASH. Professor Janet Brunelle Professor Hill Price

Lab II - Product Specification Outline. CS 411W Lab II. Prototype Product Specification For CLASH. Professor Janet Brunelle Professor Hill Price Lab II - Product Specification Outline CS 411W Lab II Prototype Product Specification For CLASH Professor Janet Brunelle Professor Hill Price Prepared by: Artem Fisan Date: 04/20/2015 Table of Contents

More information

Ortolang Tools : MarsaTag

Ortolang Tools : MarsaTag Ortolang Tools : MarsaTag Stéphane Rauzy, Philippe Blache, Grégoire de Montcheuil SECOND VARIAMU WORKSHOP LPL, Aix-en-Provence August 20th & 21st, 2014 ORTOLANG received a State aid under the «Investissements

More information

- Propositions describe relationship between different kinds

- Propositions describe relationship between different kinds SYNTAX OF THE TECTON LANGUAGE D. Kapur, D. R. Musser, A. A. Stepanov Genera1 Electric Research & Development Center ****THS S A WORKNG DOCUMENT. Although it is planned to submit papers based on this materia1

More information

View and Submit an Assignment in Criterion

View and Submit an Assignment in Criterion View and Submit an Assignment in Criterion Criterion is an Online Writing Evaluation service offered by ETS. It is a computer-based scoring program designed to help you think about your writing process

More information

Identifying Idioms of Source Code Identifier in Java Context

Identifying Idioms of Source Code Identifier in Java Context , pp.174-178 http://dx.doi.org/10.14257/astl.2013 Identifying Idioms of Source Code Identifier in Java Context Suntae Kim 1, Rhan Jung 1* 1 Department of Computer Engineering, Kangwon National University,

More information

Inter-Annotator Agreement for a German Newspaper Corpus

Inter-Annotator Agreement for a German Newspaper Corpus Inter-Annotator Agreement for a German Newspaper Corpus Thorsten Brants Saarland University, Computational Linguistics D-66041 Saarbrücken, Germany thorsten@coli.uni-sb.de Abstract This paper presents

More information

Sentiment Analysis using Support Vector Machine based on Feature Selection and Semantic Analysis

Sentiment Analysis using Support Vector Machine based on Feature Selection and Semantic Analysis Sentiment Analysis using Support Vector Machine based on Feature Selection and Semantic Analysis Bhumika M. Jadav M.E. Scholar, L. D. College of Engineering Ahmedabad, India Vimalkumar B. Vaghela, PhD

More information

TS Wikipedia Corpus. TS_Wikipedia_ tri_gram.xml

TS Wikipedia Corpus. TS_Wikipedia_ tri_gram.xml What is? Data Set is a collection of processed Turkish Wikipedia pages. The source of the data is Turkish wiki-dumps 1. The set is a collection of eight (8) separate files which are named as 2 : TS_Wikipedia_

More information

University of Bristol - Explore Bristol Research. Peer reviewed version. Link to published version (if available): / _20

University of Bristol - Explore Bristol Research. Peer reviewed version. Link to published version (if available): / _20 Jones, D. E., Xie, Y., McMahon, C. A., Dotter, M., Chanchevrier, N., & Hicks, B. J. (2016). Improving Enterprise Wide Search in Large Engineering Multinationals: A Linguistic Comparison of the Structures

More information

The Earley Parser

The Earley Parser The 6.863 Earley Parser 1 Introduction The 6.863 Earley parser will work with any well defined context free grammar, including recursive ones and those containing empty categories. It can use either no

More information

OR, you can download the file nltk_data.zip from the class web site, using a URL given in class.

OR, you can download the file nltk_data.zip from the class web site, using a URL given in class. NLP Lab Session Week 2 September 8, 2011 Frequency Distributions and Bigram Distributions Installing NLTK Data Reinstall nltk-2.0b7.win32.msi and Copy and Paste nltk_data from H:\ nltk_data to C:\ nltk_data,

More information

Unit 4 Voice. Answer Key. Objectives

Unit 4 Voice. Answer Key. Objectives English Two Unit 4 Voice Objectives After the completion of this unit, you would be able to explain the functions of active and passive voice transform active sentences into passive and passive sentences

More information

The CKY algorithm part 1: Recognition

The CKY algorithm part 1: Recognition The CKY algorithm part 1: Recognition Syntactic analysis (5LN455) 2016-11-10 Sara Stymne Department of Linguistics and Philology Mostly based on slides from Marco Kuhlmann Phrase structure trees S root

More information

&27L* /,1/D3 D /DQJXDJH,QGHSHQGHQW 1/3$UFKLWHFWXUH XVHGDV*UDPPDU&KHFNHU

&27L* /,1/D3 D /DQJXDJH,QGHSHQGHQW 1/3$UFKLWHFWXUH XVHGDV*UDPPDU&KHFNHU &27L* /,1/D3 D /DQJXDJH,QGHSHQGHQW 1/3$UFKLWHFWXUH XVHGDV*UDPPDU&KHFNHU )UDQFHVF%HQDYHQW */L&RP 83) 1/36HPLQDU 83& 1RYHPEHUWK, Introduction Architecture Data repr. Modules Discussion,QGH[,QWURGXFWLRQ $UFKLWHFWXUH

More information

Maximum Entropy based Natural Language Interface for Relational Database

Maximum Entropy based Natural Language Interface for Relational Database International Journal of Engineering Research and Technology. ISSN 0974-3154 Volume 7, Number 1 (2014), pp. 69-77 International Research Publication House http://www.irphouse.com Maximum Entropy based

More information

Semantic Pattern Classification

Semantic Pattern Classification PFL054 Term project 2011/2012 Semantic Pattern Classification Ema Krejčová 1 Introduction The aim of the project is to construct classifiers which should assign semantic patterns to six given verbs, as

More information

Download this zip file to your NLP class folder in the lab and unzip it there.

Download this zip file to your NLP class folder in the lab and unzip it there. NLP Lab Session Week 13, November 19, 2014 Text Processing and Twitter Sentiment for the Final Projects Getting Started In this lab, we will be doing some work in the Python IDLE window and also running

More information

Course introduction. Marco Kuhlmann Department of Computer and Information Science. Language Technology (2018)

Course introduction. Marco Kuhlmann Department of Computer and Information Science. Language Technology (2018) Language Technology (2018) Course introduction Marco Kuhlmann Department of Computer and Information Science This work is licensed under a Creative Commons Attribution 4.0 International License. What is

More information

Flow Control Statements

Flow Control Statements Flow Control Statements Figure 1: I drew the above segment of a Flowchart Algorithm in Microsoft Word. In programming, statements such as: if, which introduce a condition, are known as: flow-control statements.

More information

Narrative Text Classification for Automatic Key Phrase Extraction in Web Document Corpora

Narrative Text Classification for Automatic Key Phrase Extraction in Web Document Corpora Narrative Text Classification for Automatic Key Phrase Extraction in Web Document Corpora Yongzheng Zhang, Nur Zincir-Heywood, and Evangelos Milios Faculty of Computer Science, Dalhousie University 6050

More information

1. [3 pts] What is your section number, the period your discussion meets, and the name of your discussion leader?

1. [3 pts] What is your section number, the period your discussion meets, and the name of your discussion leader? CIS 3022 Prog for CIS Majors I September 30, 2008 Exam I Print Your Name Your Section # Total Score Your work is to be done individually. The exam is worth 105 points (five points of extra credit are available

More information

Restricted Use Case Modeling Approach

Restricted Use Case Modeling Approach RUCM TAO YUE tao@simula.no Simula Research Laboratory Restricted Use Case Modeling Approach User Manual April 2010 Preface Use case modeling is commonly applied to document requirements. Restricted Use

More information

Lecture 14: Annotation

Lecture 14: Annotation Lecture 14: Annotation Nathan Schneider (with material from Henry Thompson, Alex Lascarides) ENLP 23 October 2016 1/14 Annotation Why gold 6= perfect Quality Control 2/14 Factors in Annotation Suppose

More information

A Text to Image Story Teller Specially Challenged Children - Natural Language Processing Approach

A Text to Image Story Teller Specially Challenged Children - Natural Language Processing Approach A Text to Image Story Teller Specially hallenged hildren - atural Language rocessing Approach Kausar Taj ooja utta Sona Mary Francis inay. M Asst.rofessor achamai.m Asso.rofessor ABSTAT Every human relishes

More information

Language Arts State Performance Indicator Sequence Grade 7. Standard 1- Language

Language Arts State Performance Indicator Sequence Grade 7. Standard 1- Language Standard 1- Language SPI 0701.1.1 Identify the correct use of nouns (i.e., common/proper, singular/plural, possessives, direct/indirect objects, predicate) and pronouns (i.e., agreement, reflexive, interrogative,

More information

Advanced Topics in Information Retrieval Natural Language Processing for IR & IR Evaluation. ATIR April 28, 2016

Advanced Topics in Information Retrieval Natural Language Processing for IR & IR Evaluation. ATIR April 28, 2016 Advanced Topics in Information Retrieval Natural Language Processing for IR & IR Evaluation Vinay Setty vsetty@mpi-inf.mpg.de Jannik Strötgen jannik.stroetgen@mpi-inf.mpg.de ATIR April 28, 2016 Organizational

More information

A Comparison of Automatic Categorization Algorithms

A Comparison of Automatic  Categorization Algorithms A Comparison of Automatic Email Categorization Algorithms Abstract Email is more and more important in everybody s life. Large quantity of emails makes it difficult for users to efficiently organize and

More information

Complements?? who needs them?

Complements?? who needs them? Complements?? who needs them? Page 1. Complements: Direct and Indirect Objects Page 2. What is a Complement? Page 3. 1.Direct Objects Page 4. Direct Objects Page 5. Direct Objects Page 6. Direct Objects

More information

Maca a configurable tool to integrate Polish morphological data. Adam Radziszewski Tomasz Śniatowski Wrocław University of Technology

Maca a configurable tool to integrate Polish morphological data. Adam Radziszewski Tomasz Śniatowski Wrocław University of Technology Maca a configurable tool to integrate Polish morphological data Adam Radziszewski Tomasz Śniatowski Wrocław University of Technology Outline Morphological resources for Polish Tagset and segmentation differences

More information

10-1 Active sentences and passive sentences

10-1 Active sentences and passive sentences CONTENTS 10-1 Active sentences and passive sentences 10-2 Form of the passive 10-3 Transitive and intransitive verbs 10-4 Using the by-phrase 10-5 The passive forms of the present and past progressive

More information

VOCABULARY Starters Movers Flyers

VOCABULARY Starters Movers Flyers Tiger Time CE:YL Tests Mapping Jan. 2015 www.macmillanyounglearners.com/tigertime VOCABULARY VOCABULARY Starters Movers Flyers Animals L1: U5, L2: U2 L3: U2 L6: U3 The Body and the Face L1: U2 L4: U2 Clothes

More information

An evaluation of three POS taggers for the tagging of the Tswana Learner English Corpus

An evaluation of three POS taggers for the tagging of the Tswana Learner English Corpus An evaluation of three POS taggers for the tagging of the Tswana Learner English Corpus Bertus van Rooy & Lande Schäfer Potchefstroom University Private Bag X6001, Potchefstroom 2520, South Africa Phone:

More information

Technique For Clustering Uncertain Data Based On Probability Distribution Similarity

Technique For Clustering Uncertain Data Based On Probability Distribution Similarity Technique For Clustering Uncertain Data Based On Probability Distribution Similarity Vandana Dubey 1, Mrs A A Nikose 2 Vandana Dubey PBCOE, Nagpur,Maharashtra, India Mrs A A Nikose Assistant Professor

More information

Natural Language Processing Basics. Yingyu Liang University of Wisconsin-Madison

Natural Language Processing Basics. Yingyu Liang University of Wisconsin-Madison Natural Language Processing Basics Yingyu Liang University of Wisconsin-Madison Natural language Processing (NLP) The processing of the human languages by computers One of the oldest AI tasks One of the

More information

Christoph Treude. Bimodal Software Documentation

Christoph Treude. Bimodal Software Documentation Christoph Treude Bimodal Software Documentation Software Documentation [1985] 2 Software Documentation is everywhere [C Parnin and C Treude Measuring API Documentation on Web Web2SE 11: 2nd Int l Workshop

More information

Logical analysis of texts in a natural language and a sense representation

Logical analysis of texts in a natural language and a sense representation Bull. Nov. Comp. Center, Comp. Science, 26 (2007, 147 158 c 2007 NCC Publisher Logical analysis of texts in a natural language and a sense representation Tatyana Batura, Feodor Murzin Abstract. Methods

More information

Final Project Discussion. Adam Meyers Montclair State University

Final Project Discussion. Adam Meyers Montclair State University Final Project Discussion Adam Meyers Montclair State University Summary Project Timeline Project Format Details/Examples for Different Project Types Linguistic Resource Projects: Annotation, Lexicons,...

More information

NAME: 1a. (10 pts.) Describe the characteristics of numbers for which this floating-point data type is well-suited. Give an example.

NAME: 1a. (10 pts.) Describe the characteristics of numbers for which this floating-point data type is well-suited. Give an example. MSU CSC 285 Spring, 2007 Exam 2 (5 pgs.) NAME: 1. Suppose that a eight-bit floating-point data type is defined with the eight bits divided into fields as follows, where the bits are numbered with zero

More information

Alphabetical Index referenced by section numbers for PUNCTUATION FOR FICTION WRITERS by Rick Taubold, PhD and Scott Gamboe

Alphabetical Index referenced by section numbers for PUNCTUATION FOR FICTION WRITERS by Rick Taubold, PhD and Scott Gamboe Alphabetical Index referenced by section numbers for PUNCTUATION FOR FICTION WRITERS by Rick Taubold, PhD and Scott Gamboe?! 4.7 Abbreviations 4.1.2, 4.1.3 Abbreviations, plurals of 7.8.1 Accented letters

More information

Vision Plan. For KDD- Service based Numerical Entity Searcher (KSNES) Version 2.0

Vision Plan. For KDD- Service based Numerical Entity Searcher (KSNES) Version 2.0 Vision Plan For KDD- Service based Numerical Entity Searcher (KSNES) Version 2.0 Submitted in partial fulfillment of the Masters of Software Engineering Degree. Naga Sowjanya Karumuri CIS 895 MSE Project

More information

Ling/CSE 472: Introduction to Computational Linguistics. 5/9/17 Feature structures and unification

Ling/CSE 472: Introduction to Computational Linguistics. 5/9/17 Feature structures and unification Ling/CSE 472: Introduction to Computational Linguistics 5/9/17 Feature structures and unification Overview Problems with CFG Feature structures Unification Agreement Subcategorization Long-distance Dependencies

More information

CS 224N Assignment 2 Writeup

CS 224N Assignment 2 Writeup CS 224N Assignment 2 Writeup Angela Gong agong@stanford.edu Dept. of Computer Science Allen Nie anie@stanford.edu Symbolic Systems Program 1 Introduction 1.1 PCFG A probabilistic context-free grammar (PCFG)

More information

A. The following is a tentative list of parts of speech we will use to match an existing parser:

A. The following is a tentative list of parts of speech we will use to match an existing parser: API Functions available under technology owned by ACI A. The following is a tentative list of parts of speech we will use to match an existing parser: adjective adverb interjection noun verb auxiliary

More information

A Short Introduction to CATMA

A Short Introduction to CATMA A Short Introduction to CATMA Outline: I. Getting Started II. Analyzing Texts - Search Queries in CATMA III. Annotating Texts (collaboratively) with CATMA IV. Further Search Queries: Analyze Your Annotations

More information

Ling 571: Deep Processing for Natural Language Processing

Ling 571: Deep Processing for Natural Language Processing Ling 571: Deep Processing for Natural Language Processing Julie Medero February 4, 2013 Today s Plan Assignment Check-in Project 1 Wrap-up CKY Implementations HW2 FAQs: evalb Features & Unification Project

More information

Question Answering Using XML-Tagged Documents

Question Answering Using XML-Tagged Documents Question Answering Using XML-Tagged Documents Ken Litkowski ken@clres.com http://www.clres.com http://www.clres.com/trec11/index.html XML QA System P Full text processing of TREC top 20 documents Sentence

More information

AUTOMATIC LFG GENERATION

AUTOMATIC LFG GENERATION AUTOMATIC LFG GENERATION MS Thesis for the Degree of Submitted in Partial Fulfillment of the Requirements for the Degree of Master of Science (Computer Science) at the National University of Computer and

More information

Multiword deconstruction in AnCora dependencies and final release data

Multiword deconstruction in AnCora dependencies and final release data Multiword deconstruction in AnCora dependencies and final release data TECHNICAL REPORT GLICOM 2014-1 Benjamin Kolz, Toni Badia, Roser Saurí Universitat Pompeu Fabra {benjamin.kolz, toni.badia, roser.sauri}@upf.edu

More information

INTERNATIONAL JOURNAL OF COMPUTER ENGINEERING & TECHNOLOGY (IJCET) CONTEXT SENSITIVE TEXT SUMMARIZATION USING HIERARCHICAL CLUSTERING ALGORITHM

INTERNATIONAL JOURNAL OF COMPUTER ENGINEERING & TECHNOLOGY (IJCET) CONTEXT SENSITIVE TEXT SUMMARIZATION USING HIERARCHICAL CLUSTERING ALGORITHM INTERNATIONAL JOURNAL OF COMPUTER ENGINEERING & 6367(Print), ISSN 0976 6375(Online) Volume 3, Issue 1, January- June (2012), TECHNOLOGY (IJCET) IAEME ISSN 0976 6367(Print) ISSN 0976 6375(Online) Volume

More information

EDAN20 Language Technology Chapter 13: Dependency Parsing

EDAN20 Language Technology   Chapter 13: Dependency Parsing EDAN20 Language Technology http://cs.lth.se/edan20/ Pierre Nugues Lund University Pierre.Nugues@cs.lth.se http://cs.lth.se/pierre_nugues/ September 19, 2016 Pierre Nugues EDAN20 Language Technology http://cs.lth.se/edan20/

More information

ECTACO Partner E500T. English Spanish Talking Electronic Dictionary & Phrasebook USER MANUAL

ECTACO Partner E500T. English Spanish Talking Electronic Dictionary & Phrasebook USER MANUAL English Spanish Talking Electronic Dictionary & Phrasebook USER MANUAL ECTACO, Inc. assumes no responsibility for any damage or loss resulting from the use of this manual. ECTACO, Inc. assumes no responsibility

More information

Project Proposal. Spoke: a Language for Spoken Dialog Management

Project Proposal. Spoke: a Language for Spoken Dialog Management Programming Languages and Translators, Fall 2010 Project Proposal Spoke: a Language for Spoken Dialog Management William Yang Wang, Xin Chen, Chia-che Tsai, Zhou Yu (yw2347, xc2180, ct2459, zy2147)@columbia.edu

More information

CSI33 Data Structures

CSI33 Data Structures Outline Department of Mathematics and Computer Science Bronx Community College September 6, 2017 Outline Outline 1 Chapter 2: Data Abstraction Outline Chapter 2: Data Abstraction 1 Chapter 2: Data Abstraction

More information

Conceptual and Logical Design

Conceptual and Logical Design Conceptual and Logical Design Lecture 3 (Part 1) Akhtar Ali Building Conceptual Data Model To build a conceptual data model of the data requirements of the enterprise. Model comprises entity types, relationship

More information

1. He considers himself to be a genius. 2. He considered dieting to be unnecessary. 3. She considered that the waffle iron was broken. 4.

1. He considers himself to be a genius. 2. He considered dieting to be unnecessary. 3. She considered that the waffle iron was broken. 4. 1. He considers himself to be a genius. 2. He considered dieting to be unnecessary. 3. She considered that the waffle iron was broken. 4. He finally managed to get the bill paid. 5. I see you found the

More information

Rushin Shah Linguistic Data Consortium Under the guidance of Prof. Mark Liberman, Prof. Lyle Ungar and Mr. Mohamed Maamouri

Arabic corpus annotation currently uses the Standard Arabic Morphological Analyzer

Dependency grammar and dependency parsing

Syntactic analysis (5LN455), 2014-12-10. Sara Stymne, Department of Linguistics and Philology. Based on slides from Marco Kuhlmann. Mid-course evaluation: Mostly positive

UNIVERSITY OF EDINBURGH COLLEGE OF SCIENCE AND ENGINEERING SCHOOL OF INFORMATICS INFR08008 INFORMATICS 2A: PROCESSING FORMAL AND NATURAL LANGUAGES

Saturday 10th December 2016, 09:30 to 11:30 INSTRUCTIONS

Dependency grammar and dependency parsing

Syntactic analysis (5LN455), 2015-12-09. Sara Stymne, Department of Linguistics and Philology. Based on slides from Marco Kuhlmann. Activities - dependency parsing

CS6200 Information Retrieval. David Smith College of Computer and Information Science Northeastern University

Indexing Process Processing Text Converting documents to index terms Why? Matching the exact

Resilience Unit The Iliad & The Odyssey Subject to Change

Day 1 8 December Is It Better to Give than Receive? Unit Test Pass out Iliad binders God list God & Goddess project 4s 10 January 1s 11 January 2s 12 January 3s 13 January HW write reflections & study

Dependency grammar and dependency parsing

Syntactic analysis (5LN455), 2016-12-05. Sara Stymne, Department of Linguistics and Philology. Based on slides from Marco Kuhlmann. Activities - dependency parsing

Ranking in a Domain Specific Search Engine

CS6998-03 - NLP for the Web Spring 2008, Final Report Sara Stolbach, ss3067 [at] columbia.edu Abstract A search engine that runs over all domains must give equal

DR. H S 4 RULES. I have whittled the list down to 4 essential rules for college writing:

PUNCTUATION Since we will NOT be drafting letters or addressing envelopes Since most students understand the use of direct quotes Since some of these rules can be combined

Session Student Book Workbook Grammar Book Vocabulary Structures Functions. Unit 1 Present Simple p Unit 2 Present Progressive p.

UNIVERSIDAD AUTÓNOMA DE CHIHUAHUA DIRECCIÓN ACADÉMICA CENTRO DE APRENDIZAJE DE IDIOMAS Session Student Book Workbook Grammar Book Vocabulary Structures Functions 1 Unit 1 p.9-11 Unit 1 p. 4 Free time activities

SURVEY PAPER ON WEB PAGE CONTENT VISUALIZATION

Sushil Shrestha M.Tech IT, Department of Computer Science and Engineering, Kathmandu University Faculty, Department of Civil and Geomatics Engineering, Kathmandu

601.465/665 Natural Language Processing Assignment 6: Tagging with a Hidden Markov Model

Prof. Jason Eisner Fall 2018 Due date: Thursday 15 November, 9 pm In this assignment, you will build a Hidden Markov

LING/C SC/PSYC 438/538. Lecture 3 Sandiway Fong

Today s Topics Homework 4 out due next Tuesday by midnight Homework 3 should have been submitted yesterday Quick Homework 3 review Continue with Perl intro

Welcome to IEP Assistant Pro

www.psfe.com IEP Assistant Pro is designed to assist you in creating an IEP in any program used to create IEPs. It is a companion application. Open IEP Assistant Pro and then

ECTACO Partner EFa400T English Farsi Talking Electronic Dictionary & Phrasebook

User Manual ECTACO, Inc. assumes no responsibility for any damage

Is It Better To Give Than To Receive? Unit subject to change

19 October PSAT Day 1 20 October Simon Birch HW -- write reflections, study vocab 7 & hyphens & parentheses, & READ I can identify events from a story. I can identify information presented in diverse media

15-110: Principles of Computing, Spring 2018 Problem Set 2 (PS2) Due: Friday, February 2 by 2:30PM on Gradescope

HANDIN INSTRUCTIONS Download a copy of this PDF file. You have two ways to fill in your

Package phrasemachine

Type: Package. Title: Simple Phrase Extraction. Version: 1.1.2. Date: 2017-05-29. Author: Matthew J. Denny, Abram Handler, Brendan O'Connor. Maintainer: Matthew J. Denny

Keywords Text clustering, feature selection, part-of-speech, chunking, Standard K-means, Bisecting K-means.

Volume 4, Issue 4, April 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Semantic Feature

Compsci 101 Fall 2015 Exam 1 Rubric. Problem 1 (24 points)

Each answer is one point each. Spelling of int, integer doesn't matter. Similarly string, str doesn't matter. For the value column all answers

Corpus Linguistics. Seminar Resources for Computational Linguists SS 2007 Magdalena Wolska & Michaela Regneri

Armchair Linguists vs. Corpus Linguists Competence Performance 2 Motivation (for ) 3 Outline Corpora Annotation

Interactive Visualization for Computational Linguistics

ESSLII 2009 2 Interaction and animation References 3 Slides in this section are based on: Yi et al., Toward a Deeper Understanding of the Role of

Modeling Crisis Management System With the Restricted Use Case Modeling Approach

Gong Zhang 1, Tao Yue 2, and Shaukat Ali 3 1 School of Computer Science and Engineering, Beihang University, Beijing, China

601.465/665 Natural Language Processing Assignment 6: Tagging with a Hidden Markov Model

Prof. Jason Eisner Fall 2017 Due date: Sunday 19 November, 9 pm In this assignment, you will build a Hidden Markov

Information Extraction Techniques in Terrorism Surveillance

Roman Tekhov Abstract. The article gives a brief overview of what information extraction is and how it might be used for the purposes of counter-terrorism

Using NLP to Detect Requirements Defects: an Industrial Experience in the Railway Domain

Benedetta Rosadini 1, Alessio Ferrari 3, Gloria Gori 2, Alessandro Fantechi 2, Stefania Gnesi 3, Iacopo Trotta 1,

Frameworks for Natural Language Processing of Textual Requirements

Andres Arellano Government of Chile, Santiago, Chile Email: andres.arellano@gmail.com Edward Zontek-Carney Northrop Grumman Corporation,

Compiler Construction

Lecture 4 - Context-Free Grammars. 2003 Robert M. Siegfried. All rights reserved. A few necessary definitions: Parse - vt, to resolve (as a sentence) into component parts of speech and

J.A.R.V.I.S. Group #14: Yao Rao, # Dianchen Jiang, # Minghui Lin, # Jensen Zhang, #

J.A.R.V.I.S. Group #14: Yao Rao, #108788980 Dianchen Jiang, #108250990 Minghui Lin, #109557872 Jensen Zhang, #109561796 CSE 352 Artificial Intelligence Prof. Anita Wasilewska Nov. 9th, 2015 Resources http://www.boosharticles.com/2014/10/jarvis-ironman-ai-projectinvention-future/