How to.. What is the point of it?


Program's name: Linguistic Toolbox 3.0 α-version
Short name: LIT
Authors: Viatcheslav Yatsko, Mikhail Starikov
Platform: Windows
System requirements: 1 GB free disk space, 512 MB RAM, .NET Framework
Supported languages: English
Supported formats: .txt
Distribution: freeware, GPL
Web site:
E-mail: iatsko@gmail.com vetsky@yandex.ru

What is the point of it?

Linguistic Toolbox is a concordancer that differs from existing analogues in the following respects.

1. It has an integrated part-of-speech tagger, which allows the user to create his/her own annotated corpora. In-depth linguistic research is often based on a specific text genre (e.g. fiction, scientific text), a linguistic category (e.g. possession), or the works of a particular author (e.g. Maugham). Publicly available annotated national corpora with evenly distributed genres often fail to meet the demands of such research, and LIT has been designed to fill this gap. By means of LIT the user can run various searches on his/her own corpora and get statistical information on the distribution of words, patterns, and phrases.

2. LIT has an integrated WordNet module, by means of which the user can search not only for a given word but also for words semantically related to it.

How to..

How to open LIT

LIT doesn't require any installation procedure. Open the folder with LIT that you downloaded from the Internet, then find and double-click LToolboxGui.exe. The program will start. Note that you must have the .NET Framework installed on your computer.

How to upload texts

To upload a text, click the Add text button and use Windows Explorer to choose a file in a directory on your computer. LIT supports only files with a .txt extension; it won't process other formats, for example .doc files. If you have texts in .doc format, you will first have to save them as plain text. If you have texts in some other format, e.g. .pdf, you will have to convert them using existing converters.

While the file is being added you will see a progress bar indicating the three basic preprocessing procedures: tokenization, splitting, and POS-tagging. Since our tagger works much faster than existing analogues, the preprocessing won't take much time even if a large file is uploaded. As soon as the file has been added, its name appears in the directory of texts and you can start querying it.

Using the Delete button you can safely remove files from the directory.

How to run queries

Before running a query you should either select a text or open it. You may have several texts in the directory, so before running a query click on one of the texts to select it. Click slightly to the left of the text's name, not on the name itself: clicking the name signals your intention to rename the text. A double click opens the text together with statistical information that may be useful for its analysis: the number of paragraphs, sentences, total tokens (i.e. total words), total words (i.e. unique words), average sentence length, and average paragraph length.

To get from LIT what you want, press the Text query button to open the query box, where you can type your queries. The query syntax is similar to that of Mark Davies' BNC interface. The query that opens by default is *\*. If you press the Enter key on your keyboard, this query yields a list of all tokens in the text with a POS tag assigned to each token. Uncheck the Show tags menu option and repeat the query by pressing Enter again, and the POS tags disappear.
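The statistics LIT shows when a text is opened can be approximated outside the program. The following is a minimal Python sketch, not LIT's own code: the function name text_statistics, the regular expressions used for splitting, and the choice to measure paragraph length in sentences are all assumptions made for illustration, and LIT's tokenizer may count differently (for example, it treats 's as a separate token).

```python
import re


def text_statistics(text: str) -> dict:
    """Rough counterparts of the figures LIT displays for an opened text."""
    # Paragraphs: blocks separated by one or more blank lines (assumption).
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    # Sentences: naive split after '.', '!' or '?' followed by whitespace.
    sentences = []
    for p in paragraphs:
        sentences.extend(s for s in re.split(r"(?<=[.!?])\s+", p.strip()) if s)
    # Tokens: runs of letters, digits and apostrophes; punctuation is ignored.
    tokens = re.findall(r"[A-Za-z0-9']+", text.lower())
    return {
        "paragraphs": len(paragraphs),
        "sentences": len(sentences),
        "total_tokens": len(tokens),
        "unique_words": len(set(tokens)),
        # Average sentence length in tokens.
        "avg_sentence_length": len(tokens) / len(sentences) if sentences else 0.0,
        # Average paragraph length in sentences (assumed unit).
        "avg_paragraph_length": len(sentences) / len(paragraphs) if paragraphs else 0.0,
    }


# Invented sample text, used only to exercise the function.
sample = "Bill says hello. Bill's friend waves!\n\nSam says nothing."
stats = text_statistics(sample)
for name, value in stats.items():
    print(f"{name}: {value}")
```

Comparing such figures with the ones LIT reports is a quick way to see how much the tokenization rules affect the counts.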

Let us discuss the query syntax in detail.

* - stands for any word or any sequence of symbols. E.g. the query bill* stands for any word that opens with these symbols (prefix search); it will find bill, bills, billed, etc. The *en query finds all words that have this ending (suffix search): lighten, heighten, brighten, etc. If you separate the asterisk from the word with a white space, it acquires the meaning "any word": * * * Bill * * * will find all phrases in which Bill is preceded and followed by any three words. Be careful with white spaces: the program will not work if you type an extra one.

\ - introduces a POS tag. You can find words with specific tags. For example, the \nnp query finds all proper nouns, and \n* finds all words whose POS tags open with n. Since many English nouns have homonymous verb forms, you can use this syntax to find the words that belong to a specific part of speech, e.g. place\n* finds place used as a noun form while place\v* finds verb forms. POS-tag search can be combined with word search. The query * * * Bill \v* * * will find all phrases where Bill is preceded by any three words and followed by a verb form and any two other words.

# - highlights the keyword and positions it in the center of the concordance. This symbol (we call it an anchor) may be useful in complex queries. Try this one:

(*)* Bill (*)*

and this one:

(*)* # Bill (*)*

You will see that in the first case Bill is scattered all over the text, while in the second it is positioned in the center. Incidentally, the (*)* query stands for any word repeated any number of times and, when run on its own, yields a list of all sentences in the text. That is, (*)* Bill (*)* finds all the sentences with the word Bill, and Bill (*)* finds phrases with Bill followed by any number of words up to the end of the sentence in which it occurs. This is a specific feature of the current version of LIT: it provides context only within sentence boundaries. We are not sure whether a broader context is needed; tell us about it!

opt(c) - makes the search case-sensitive. By default the search is case-insensitive, i.e. in response to Bill the system will also find bill, and vice versa. opt(c) Bill finds only occurrences written with a capital B, and opt(c) bill finds only those written in lower case.

syns(word) - finds the word and its synonyms. E.g. the syns(place) query yields all cases with place and its synonyms. We haven't integrated search for hypernyms or hyponyms yet, but we can do it on request; tell us about it!

^ - indicates that the keyword is positioned at the beginning of the sentence. ^ bill (*)* results in a list of all sentences that open with bill.
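To make the word and tag wildcards described above more concrete, here is a minimal Python sketch that re-implements a small subset of the syntax over a list of (word, tag) pairs. It is an illustration only, not LIT's implementation: the function names parse_item, item_matches, and find_phrases are hypothetical, the sample sentence and its Penn Treebank style tags are invented, and the (*)*, #, ^, opt(c), and syns() operators are not covered. Note also that fnmatch supports ? and [...] in addition to *, which the description of LIT above does not mention.

```python
from fnmatch import fnmatchcase


def parse_item(item):
    # Split one query item at the backslash into a word pattern and a
    # tag pattern; a missing part means "anything".
    word, _, tag = item.partition("\\")
    return (word or "*", tag or "*")


def item_matches(item, word, tag):
    # Case-insensitive wildcard match of one query item against one token.
    word_pat, tag_pat = parse_item(item)
    return (fnmatchcase(word.lower(), word_pat.lower())
            and fnmatchcase(tag.lower(), tag_pat.lower()))


def find_phrases(query, tagged_sentence):
    # Query items are separated by exactly one space, as in LIT queries.
    items = query.split(" ")
    n = len(items)
    hits = []
    # Slide a window of n tokens over the sentence and test each position.
    for start in range(len(tagged_sentence) - n + 1):
        window = tagged_sentence[start:start + n]
        if all(item_matches(it, w, t) for it, (w, t) in zip(items, window)):
            hits.append(" ".join(w for w, _ in window))
    return hits


# Invented, hand-tagged sentence used only for demonstration.
sentence = [("Then", "RB"), ("old", "JJ"), ("Bill", "NNP"), ("placed", "VBD"),
            ("the", "DT"), ("kid", "NN"), ("in", "IN"), ("the", "DT"),
            ("place", "NN")]

print(find_phrases(r"bill*", sentence))         # prefix search
print(find_phrases(r"\nnp", sentence))          # tag search
print(find_phrases(r"place\n*", sentence))      # word restricted to noun tags
print(find_phrases(r"* Bill \v* *", sentence))  # phrase pattern
```

Because matching is done one sentence at a time, the sketch mirrors the behaviour noted above that context is provided only within sentence boundaries.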

You can try other search possibilities by combining various tags and syntax symbols. The lists of syntax symbols and POS tags are available in the Help directory that comes with LIT.

How to get collocates

To get collocates for a specific word, use asterisks separated by white spaces; their number equals the number of collocates to the left or to the right of the given word. If you run the * * * Bill * * * query, you will get three collocates to the left and three to the right of Bill. We use as a sample text 'The Ransom of Red Chief', available in the 'Tutorial' folder. By clicking the Collocates button on the panel bar you can open a window with statistical information about collocates. For our sample text you can see that the most frequent collocate in the first position to the left of Bill is says, while the most frequent one to the right is 's.

You can sort the table by clicking on the Count and Item fields. You can select collocates by clicking on the rightmost empty field of the table and copy the table with 'Control + C' so as to paste it into an external editor for further processing. Using the slider you can change the position of the collocates. If you check the 'Tags' box, you get POS tags rather than specific words as collocates.
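The counting behind a collocates table of this kind can be sketched in a few lines of Python. This is an assumption-laden illustration rather than LIT's algorithm: the function name collocates is hypothetical, the token list is invented (it is not taken from the sample text), and sentence boundaries are ignored for brevity.

```python
from collections import Counter


def collocates(tokens, keyword, offset):
    # Count the tokens found at a fixed offset from every occurrence of
    # keyword; a negative offset looks to the left, a positive one to the right.
    key = keyword.lower()
    counts = Counter()
    for i, tok in enumerate(tokens):
        j = i + offset
        if tok.lower() == key and 0 <= j < len(tokens):
            counts[tokens[j].lower()] += 1
    return counts


# Invented token list; 's is kept as a separate token.
tokens = "Bill says yes and Bill 's hat fell so Bill says no".split()

left = collocates(tokens, "Bill", -1)   # first position to the left
right = collocates(tokens, "Bill", 1)   # first position to the right

# Print the right-hand collocates sorted by count, as in a Count/Item table.
for item, count in right.most_common():
    print(count, item)
```

Changing the offset argument plays the role of the slider, and passing a list of POS tags instead of words would give tag collocates.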

Note that you can upload and select several texts and run queries on all of them. For example, you can first get information about patterns and distributions in texts of one genre and then in academic texts to perform contrastive analysis. The screenshot below illustrates the distribution of collocates for said in three texts.

Since we are constantly working on improving our products, these instructions may be outdated. Visit our Web page to get updated user instructions or updated versions of LIT. Feel free to contact us if you have questions or comments.

A Short Introduction to CATMA

A Short Introduction to CATMA A Short Introduction to CATMA Outline: I. Getting Started II. Analyzing Texts - Search Queries in CATMA III. Annotating Texts (collaboratively) with CATMA IV. Further Search Queries: Analyze Your Annotations

More information

Data for linguistics ALEXIS DIMITRIADIS. Contents First Last Prev Next Back Close Quit

Data for linguistics ALEXIS DIMITRIADIS. Contents First Last Prev Next Back Close Quit Data for linguistics ALEXIS DIMITRIADIS Text, corpora, and data in the wild 1. Where does language data come from? The usual: Introspection, questionnaires, etc. Corpora, suited to the domain of study:

More information

LING203: Corpus. March 9, 2009

LING203: Corpus. March 9, 2009 LING203: Corpus March 9, 2009 Corpus A collection of machine readable texts SJSU LLD have many corpora http://linguistics.sjsu.edu/bin/view/public/chltcorpora Each corpus has a link to a description page

More information

Module 1: Information Extraction

Module 1: Information Extraction Module 1: Information Extraction Introduction to GATE Developer The University of Sheffield, 1995-2014 This work is licenced under the Creative Commons Attribution-NonCommercial-ShareAlike Licence About

More information

Parallel Concordancing and Translation. Michael Barlow

Parallel Concordancing and Translation. Michael Barlow [Translating and the Computer 26, November 2004 [London: Aslib, 2004] Parallel Concordancing and Translation Michael Barlow Dept. of Applied Language Studies and Linguistics University of Auckland Auckland,

More information

Contents. List of Figures. List of Tables. Acknowledgements

Contents. List of Figures. List of Tables. Acknowledgements Contents List of Figures List of Tables Acknowledgements xiii xv xvii 1 Introduction 1 1.1 Linguistic Data Analysis 3 1.1.1 What's data? 3 1.1.2 Forms of data 3 1.1.3 Collecting and analysing data 7 1.2

More information

ANC2Go: A Web Application for Customized Corpus Creation

ANC2Go: A Web Application for Customized Corpus Creation ANC2Go: A Web Application for Customized Corpus Creation Nancy Ide, Keith Suderman, Brian Simms Department of Computer Science, Vassar College Poughkeepsie, New York 12604 USA {ide, suderman, brsimms}@cs.vassar.edu

More information

Research Tools: DIY Text Tools

Research Tools: DIY Text Tools As with the other Research Tools, the DIY Text Tools are primarily designed for small research projects at the undergraduate level. What are the DIY Text Tools for? These tools are designed to help you

More information

Introducing XAIRA. Lou Burnard Tony Dodd. An XML aware tool for corpus indexing and searching. Research Technology Services, OUCS

Introducing XAIRA. Lou Burnard Tony Dodd. An XML aware tool for corpus indexing and searching. Research Technology Services, OUCS Introducing XAIRA An XML aware tool for corpus indexing and searching Lou Burnard Tony Dodd Research Technology Services, OUCS What is XAIRA? XML Aware Indexing and Retrieval Architecture Developed from

More information

Ngram Search Engine with Patterns Combining Token, POS, Chunk and NE Information

Ngram Search Engine with Patterns Combining Token, POS, Chunk and NE Information Ngram Search Engine with Patterns Combining Token, POS, Chunk and NE Information Satoshi Sekine Computer Science Department New York University sekine@cs.nyu.edu Kapil Dalwani Computer Science Department

More information

Final Project Discussion. Adam Meyers Montclair State University

Final Project Discussion. Adam Meyers Montclair State University Final Project Discussion Adam Meyers Montclair State University Summary Project Timeline Project Format Details/Examples for Different Project Types Linguistic Resource Projects: Annotation, Lexicons,...

More information

Windows On Windows systems, simply double click the AntConc icon and this will launch the program.

Windows On Windows systems, simply double click the AntConc icon and this will launch the program. AntConc (Windows, Macintosh OS X, and Linux) Build 3.5.2 (February 8, 2018) Laurence Anthony, Ph.D. Center for English Language Education in Science and Engineering, School of Science and Engineering,

More information

Information Extraction Techniques in Terrorism Surveillance

Information Extraction Techniques in Terrorism Surveillance Information Extraction Techniques in Terrorism Surveillance Roman Tekhov Abstract. The article gives a brief overview of what information extraction is and how it might be used for the purposes of counter-terrorism

More information

Privacy and Security in Online Social Networks Department of Computer Science and Engineering Indian Institute of Technology, Madras

Privacy and Security in Online Social Networks Department of Computer Science and Engineering Indian Institute of Technology, Madras Privacy and Security in Online Social Networks Department of Computer Science and Engineering Indian Institute of Technology, Madras Lecture - 25 Tutorial 5: Analyzing text using Python NLTK Hi everyone,

More information

Shrey Patel B.E. Computer Engineering, Gujarat Technological University, Ahmedabad, Gujarat, India

Shrey Patel B.E. Computer Engineering, Gujarat Technological University, Ahmedabad, Gujarat, India International Journal of Scientific Research in Computer Science, Engineering and Information Technology 2018 IJSRCSEIT Volume 3 Issue 3 ISSN : 2456-3307 Some Issues in Application of NLP to Intelligent

More information

Hotmail Documentation Style Guide

Hotmail Documentation Style Guide Hotmail Documentation Style Guide Version 2.2 This Style Guide exists to ensure that there is a consistent voice among all Hotmail documents. It is an evolving document additions or changes may be made

More information

TectoMT: Modular NLP Framework

TectoMT: Modular NLP Framework : Modular NLP Framework Martin Popel, Zdeněk Žabokrtský ÚFAL, Charles University in Prague IceTAL, 7th International Conference on Natural Language Processing August 17, 2010, Reykjavik Outline Motivation

More information

Tutorial and Exercises with WordList in WordSmith Tools: Level I

Tutorial and Exercises with WordList in WordSmith Tools: Level I Tutorial and Exercises with WordList in WordSmith Tools: Level I WordSmith Tools, developed by Mike Scott, is a corpus analysis tool that integrates three text analysis tools: a monolingual concordancer

More information

A tool for Cross-Language Pair Annotations: CLPA

A tool for Cross-Language Pair Annotations: CLPA A tool for Cross-Language Pair Annotations: CLPA August 28, 2006 This document describes our tool called Cross-Language Pair Annotator (CLPA) that is capable to automatically annotate cognates and false

More information

Tutorial on Text Mining for the Going Digital initiative. Natural Language Processing (NLP), University of Essex

Tutorial on Text Mining for the Going Digital initiative. Natural Language Processing (NLP), University of Essex Tutorial on Text Mining for the Going Digital initiative Natural Language Processing (NLP), University of Essex 6 February, 2013 Topics of This Tutorial o Information Extraction (IE) o Examples of IE systems

More information

from Pavel Mihaylov and Dorothee Beermann Reviewed by Sc o t t Fa r r a r, University of Washington

from Pavel Mihaylov and Dorothee Beermann Reviewed by Sc o t t Fa r r a r, University of Washington Vol. 4 (2010), pp. 60-65 http://nflrc.hawaii.edu/ldc/ http://hdl.handle.net/10125/4467 TypeCraft from Pavel Mihaylov and Dorothee Beermann Reviewed by Sc o t t Fa r r a r, University of Washington 1. OVERVIEW.

More information

A Linguistic Approach for Semantic Web Service Discovery

A Linguistic Approach for Semantic Web Service Discovery A Linguistic Approach for Semantic Web Service Discovery Jordy Sangers 307370js jordysangers@hotmail.com Bachelor Thesis Economics and Informatics Erasmus School of Economics Erasmus University Rotterdam

More information

Story Workbench Quickstart Guide Version 1.2.0

Story Workbench Quickstart Guide Version 1.2.0 1 Basic Concepts Story Workbench Quickstart Guide Version 1.2.0 Mark A. Finlayson (markaf@mit.edu) Annotation An indivisible piece of data attached to a text is called an annotation. Annotations, also

More information

Making Sense Out of the Web

Making Sense Out of the Web Making Sense Out of the Web Rada Mihalcea University of North Texas Department of Computer Science rada@cs.unt.edu Abstract. In the past few years, we have witnessed a tremendous growth of the World Wide

More information

CHAPTER 5 SEARCH ENGINE USING SEMANTIC CONCEPTS

CHAPTER 5 SEARCH ENGINE USING SEMANTIC CONCEPTS 82 CHAPTER 5 SEARCH ENGINE USING SEMANTIC CONCEPTS In recent years, everybody is in thirst of getting information from the internet. Search engines are used to fulfill the need of them. Even though the

More information

Let s get parsing! Each component processes the Doc object, then passes it on. doc.is_parsed attribute checks whether a Doc object has been parsed

Let s get parsing! Each component processes the Doc object, then passes it on. doc.is_parsed attribute checks whether a Doc object has been parsed Let s get parsing! SpaCy default model includes tagger, parser and entity recognizer nlp = spacy.load('en ) tells spacy to use "en" with ["tagger", "parser", "ner"] Each component processes the Doc object,

More information

The American National Corpus First Release

The American National Corpus First Release The American National Corpus First Release Nancy Ide and Keith Suderman Department of Computer Science, Vassar College, Poughkeepsie, NY 12604-0520 USA ide@cs.vassar.edu, suderman@cs.vassar.edu Abstract

More information

Enabling Semantic Search in Large Open Source Communities

Enabling Semantic Search in Large Open Source Communities Enabling Semantic Search in Large Open Source Communities Gregor Leban, Lorand Dali, Inna Novalija Jožef Stefan Institute, Jamova cesta 39, 1000 Ljubljana {gregor.leban, lorand.dali, inna.koval}@ijs.si

More information

QDA Miner. Addendum v2.0

QDA Miner. Addendum v2.0 QDA Miner Addendum v2.0 QDA Miner is an easy-to-use qualitative analysis software for coding, annotating, retrieving and reviewing coded data and documents such as open-ended responses, customer comments,

More information

IBM Rational Rhapsody Gateway Add On. Customization Guide

IBM Rational Rhapsody Gateway Add On. Customization Guide Customization Guide Rhapsody IBM Rational Rhapsody Gateway Add On Customization Guide License Agreement No part of this publication may be reproduced, transmitted, stored in a retrieval system, nor translated

More information

Self Help Guide to SPIN. World's Largest Database of Sponsored Funding Opportunities.

Self Help Guide to SPIN. World's Largest Database of Sponsored Funding Opportunities. Self Help Guide to SPIN World's Largest Database of Sponsored Funding Opportunities http://www.geneseo.edu/sponsored_research SPIN SEARCHABLE DATABASE MANUAL SPIN is an extensive research funding opportunity

More information

Using the Multimedia On-line Dictionary

Using the Multimedia On-line Dictionary Educators' Orientation Edzo, Degaimaàdzêëzaà 24-25, 2011 xoò Using the Multimedia On the Main Dictionary page of the dictionary website, http://tlicho.ling.uvic.ca, there are short instructions. The instructions

More information

Using Search-Logs to Improve Query Tagging

Using Search-Logs to Improve Query Tagging Using Search-Logs to Improve Query Tagging Kuzman Ganchev Keith Hall Ryan McDonald Slav Petrov Google, Inc. {kuzman kbhall ryanmcd slav}@google.com Abstract Syntactic analysis of search queries is important

More information

Precise Medication Extraction using Agile Text Mining

Precise Medication Extraction using Agile Text Mining Precise Medication Extraction using Agile Text Mining Chaitanya Shivade *, James Cormack, David Milward * The Ohio State University, Columbus, Ohio, USA Linguamatics Ltd, Cambridge, UK shivade@cse.ohio-state.edu,

More information

Parsing partially bracketed input

Parsing partially bracketed input Parsing partially bracketed input Martijn Wieling, Mark-Jan Nederhof and Gertjan van Noord Humanities Computing, University of Groningen Abstract A method is proposed to convert a Context Free Grammar

More information

A Survey Of Different Text Mining Techniques Varsha C. Pande 1 and Dr. A.S. Khandelwal 2

A Survey Of Different Text Mining Techniques Varsha C. Pande 1 and Dr. A.S. Khandelwal 2 A Survey Of Different Text Mining Techniques Varsha C. Pande 1 and Dr. A.S. Khandelwal 2 1 Department of Electronics & Comp. Sc, RTMNU, Nagpur, India 2 Department of Computer Science, Hislop College, Nagpur,

More information

Frooition Implementation guide

Frooition Implementation guide Frooition Implementation guide Version: 2.0 Updated: 14/12/2016 Contents Account Setup: 1. Software Checklist 2. Accessing the Frooition Software 3. Completing your Account Profile 4. Updating your Frooition

More information

HW Label the following computer parts: E-Banking E-Government E-Commerce

HW Label the following computer parts: E-Banking E-Government E-Commerce HW 1 1. Label the following computer parts: (7 marks) 2. The Internet has provided the community with online services which are becoming more common in everyday life. Using the below terms give the name

More information

Orange3-Textable Documentation

Orange3-Textable Documentation Orange3-Textable Documentation Release 3.0a1 LangTech Sarl Dec 19, 2017 Contents 1 Getting Started 3 1.1 Orange Textable............................................. 3 1.2 Description................................................

More information

Correlation to Georgia Quality Core Curriculum

Correlation to Georgia Quality Core Curriculum 1. Strand: Oral Communication Topic: Listening/Speaking Standard: Adapts or changes oral language to fit the situation by following the rules of conversation with peers and adults. 2. Standard: Listens

More information

TUTORIAL FOR IMPORTING OTTAWA FIRE HYDRANT PARKING VIOLATION DATA INTO MYSQL

TUTORIAL FOR IMPORTING OTTAWA FIRE HYDRANT PARKING VIOLATION DATA INTO MYSQL TUTORIAL FOR IMPORTING OTTAWA FIRE HYDRANT PARKING VIOLATION DATA INTO MYSQL We have spent the first part of the course learning Excel: importing files, cleaning, sorting, filtering, pivot tables and exporting

More information

Information Retrieval

Information Retrieval Multimedia Computing: Algorithms, Systems, and Applications: Information Retrieval and Search Engine By Dr. Yu Cao Department of Computer Science The University of Massachusetts Lowell Lowell, MA 01854,

More information

The Goal of this Document. Where to Start?

The Goal of this Document. Where to Start? A QUICK INTRODUCTION TO THE SEMILAR APPLICATION Mihai Lintean, Rajendra Banjade, and Vasile Rus vrus@memphis.edu linteam@gmail.com rbanjade@memphis.edu The Goal of this Document This document introduce

More information

CSC 5930/9010: Text Mining GATE Developer Overview

CSC 5930/9010: Text Mining GATE Developer Overview 1 CSC 5930/9010: Text Mining GATE Developer Overview Dr. Paula Matuszek Paula.Matuszek@villanova.edu Paula.Matuszek@gmail.com (610) 647-9789 GATE Components 2 We will deal primarily with GATE Developer:

More information

User Manual Al Manhal. All rights reserved v 3.0

User Manual Al Manhal. All rights reserved v 3.0 User Manual 1 2010-2016 Al Manhal. All rights reserved v 3.0 Table of Contents Conduct a Search... 3 1. USING SIMPLE SEARCH... 3 2. USING ADVANCED SEARCH... 4 Search Results List... 5 Browse... 7 1. BROWSE

More information

WEIGHTING QUERY TERMS USING WORDNET ONTOLOGY

WEIGHTING QUERY TERMS USING WORDNET ONTOLOGY IJCSNS International Journal of Computer Science and Network Security, VOL.9 No.4, April 2009 349 WEIGHTING QUERY TERMS USING WORDNET ONTOLOGY Mohammed M. Sakre Mohammed M. Kouta Ali M. N. Allam Al Shorouk

More information

Maximum Entropy based Natural Language Interface for Relational Database

Maximum Entropy based Natural Language Interface for Relational Database International Journal of Engineering Research and Technology. ISSN 0974-3154 Volume 7, Number 1 (2014), pp. 69-77 International Research Publication House http://www.irphouse.com Maximum Entropy based

More information

Question Answering Using XML-Tagged Documents

Question Answering Using XML-Tagged Documents Question Answering Using XML-Tagged Documents Ken Litkowski ken@clres.com http://www.clres.com http://www.clres.com/trec11/index.html XML QA System P Full text processing of TREC top 20 documents Sentence

More information

FileSearchEX 1.1 Series

FileSearchEX 1.1 Series FileSearchEX 1.1 Series Instruction Manual document version: 1.1.0.5 Copyright 2010 2018 GOFF Concepts LLC. All rights reserved. GOFF Concepts assumes no responsibility for errors or omissions in this

More information

NLP Final Project Fall 2015, Due Friday, December 18

NLP Final Project Fall 2015, Due Friday, December 18 NLP Final Project Fall 2015, Due Friday, December 18 For the final project, everyone is required to do some sentiment classification and then choose one of the other three types of projects: annotation,

More information

Kurzweil 3000 User s Guide

Kurzweil 3000 User s Guide Kurzweil 3000 User s Guide With Kurzweil, students can: 1. hear, see and track reading material 2. correct what student is writing 3. organize lesson material 4. hear and respond to test material Toolbars

More information

S4B Split Movie Soft4Boost Help S4B Split Movie www.sorentioapps.com Sorentio Systems, Ltd. All rights reserved Contact Us If you have any comments, suggestions or questions regarding S4B Split Movie or

More information

Chapter IR:II. II. Architecture of a Search Engine. Indexing Process Search Process

Chapter IR:II. II. Architecture of a Search Engine. Indexing Process Search Process Chapter IR:II II. Architecture of a Search Engine Indexing Process Search Process IR:II-87 Introduction HAGEN/POTTHAST/STEIN 2017 Remarks: Software architecture refers to the high level structures of a

More information

BYTE / BOOL A BYTE is an unsigned 8 bit integer. ABOOL is a BYTE that is guaranteed to be either 0 (False) or 1 (True).

BYTE / BOOL A BYTE is an unsigned 8 bit integer. ABOOL is a BYTE that is guaranteed to be either 0 (False) or 1 (True). NAME CQi tutorial how to run a CQP query DESCRIPTION This tutorial gives an introduction to the Corpus Query Interface (CQi). After a short description of the data types used by the CQi, a simple application

More information

User Guide. Copyright Wordfast, LLC All rights reserved.

User Guide. Copyright Wordfast, LLC All rights reserved. User Guide All rights reserved. Table of Contents Release Notes Summary... 7 New Features and Improvements... 7 Fixed Issues... 7 Known Issues... 8 1 About this Guide... 9 Conventions...9 Typographical...

More information

Alphabetical Index referenced by section numbers for PUNCTUATION FOR FICTION WRITERS by Rick Taubold, PhD and Scott Gamboe

Alphabetical Index referenced by section numbers for PUNCTUATION FOR FICTION WRITERS by Rick Taubold, PhD and Scott Gamboe Alphabetical Index referenced by section numbers for PUNCTUATION FOR FICTION WRITERS by Rick Taubold, PhD and Scott Gamboe?! 4.7 Abbreviations 4.1.2, 4.1.3 Abbreviations, plurals of 7.8.1 Accented letters

More information

Profiling Medical Journal Articles Using a Gene Ontology Semantic Tagger. Mahmoud El-Haj Paul Rayson Scott Piao Jo Knight

Profiling Medical Journal Articles Using a Gene Ontology Semantic Tagger. Mahmoud El-Haj Paul Rayson Scott Piao Jo Knight Profiling Medical Journal Articles Using a Gene Ontology Semantic Tagger Mahmoud El-Haj Paul Rayson Scott Piao Jo Knight Origin and Outcomes Currently funded through a Wellcome Trust Seed award Collaboration

More information

ACS documents 1, 2, and 3. These documents are available in CH215. What is EMACS? ::::::::::::::::::::::::::::::::::::::::::::::::::::: 2

ACS documents 1, 2, and 3. These documents are available in CH215. What is EMACS? ::::::::::::::::::::::::::::::::::::::::::::::::::::: 2 1 4. Beginning to Use EMACS Academic Computing Support Tennessee Technological University Prerequisite Contents ACS documents 1, 2, and 3. These documents are available in CH215. What is EMACS? :::::::::::::::::::::::::::::::::::::::::::::::::::::

More information

Error annotation in adjective noun (AN) combinations

Error annotation in adjective noun (AN) combinations Error annotation in adjective noun (AN) combinations This document describes the annotation scheme devised for annotating errors in AN combinations and explains how the inter-annotator agreement has been

More information

Document No.: CD Duplicate Master. CD Duplicate Master. Jam Video Software Solution Inc. Page 1

Document No.: CD Duplicate Master. CD Duplicate Master. Jam Video Software Solution Inc.  Page 1 Document No.: CD Duplicate Master CD Duplicate Master Jam Video Software Solution Inc. http://www.jamvideosoftware.com Page 1 Pages Order Introduction...Pages 3 How to buy...pages 4 How to use...pages

More information

It is possible to create webpages without knowing anything about the HTML source behind the page.

It is possible to create webpages without knowing anything about the HTML source behind the page. What is HTML? HTML is the standard markup language for creating Web pages. HTML is a fairly simple language made up of elements, which can be applied to pieces of text to give them different meaning in

More information

Copyright 2018 Maxprograms

Copyright 2018 Maxprograms Copyright 2018 Maxprograms Table of Contents Introduction... 1 TMXEditor... 1 Features... 1 Getting Started... 2 Editing an existing file... 2 Create New TMX File... 3 Maintenance Tasks... 4 Sorting TM

More information

Non-deterministic Finite Automata (NFA)

Non-deterministic Finite Automata (NFA) Non-deterministic Finite Automata (NFA) CAN have transitions on the same input to different states Can include a ε or λ transition (i.e. move to new state without reading input) Often easier to design

More information
