Regular Expressions in Perl
1 Regular Expressions in Perl
Marco Baroni
Computational skills for text analysis
2 Outline
Practical advice
Regular expressions
3 Practical advice
4 The programming/testing loop
Reading output from a text file:
Keep source code open in Scite (or other editor)
Launch program, save output to file
Open output file in a different Scite tab
Edit code in Scite and save
Re-launch program
In Scite, refresh the output with the Revert command (or similar commands of other editors)
5 The Brown corpus
The Brown University Standard Corpus of Present-Day American English, assembled by Henry Kučera and Nelson Francis in the sixties
The first balanced corpus
About 1 million words from various kinds of written sources
Made available for teaching purposes by kind permission from Kučera
Download from the class website, unzip and place the brown.txt file in your working directory
6 Outline
Practical advice
Regular expressions
7 Regular expressions
I omit their precise mathematical characterization (but they do have one)
From our point of view, a regular expression is a string that respects a certain syntax, and can be used to look for patterns inside a text
Perl was one of the first languages to implement regexps, and it is still one of the languages where they are best integrated
This is part of the reason why Perl is popular among linguists
8 A general frame for the programs we will develop
The frame:
    while (<>) {
        $input = $_;
        if ($input =~ /REGEXP/) {    # regexp match
            print "$input";
        }
    }
Usage:
    perl -w prog.pl brown.txt > out.txt
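A minimal sketch of the same frame, wrapped in a subroutine so that it can be tried on a few in-memory lines before running it on brown.txt. The subroutine name filter_lines and the sample lines are illustrative assumptions, not part of the original slides.

```perl
use strict;
use warnings;

# Hypothetical helper: return the lines that match a compiled regexp.
# It plays the role of the while (<>) loop, but takes its input as a list.
sub filter_lines {
    my ($regexp, @lines) = @_;
    my @matched;
    foreach my $input (@lines) {
        if ($input =~ $regexp) {    # regexp match
            push @matched, $input;
        }
    }
    return @matched;
}

# Made-up sample lines standing in for the corpus.
my @sample = ("I love corpora\n", "nothing to see here\n");
my @hits   = filter_lines(qr/love/, @sample);
print @hits;
```

Only the first sample line contains love, so only that line is printed.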
9 First example
A simple string is a simple regexp:
    while (<>) {
        $input = $_;
        if ($input =~ /love/) {
            print "$input";
        }
    }
This also matches glove, clove and Pullover
And it does not match loving or Love
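The claims on this slide can be checked word by word. The following is a small sketch; the subroutine name matches_love is a made-up helper, and the word list is the one mentioned on the slide.

```perl
use strict;
use warnings;

# Hypothetical helper: 1 if the plain-string regexp /love/ matches, else 0.
sub matches_love {
    my ($word) = @_;
    return $word =~ /love/ ? 1 : 0;
}

# The regexp looks for the substring "love" anywhere in the word,
# and it is case sensitive.
foreach my $word (qw(love glove clove Pullover loving Love)) {
    printf "%-10s %s\n", $word, matches_love($word) ? "match" : "no match";
}
```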
10 Sets and alternatives
Use the square brackets [ ] to refer to a character set, such that any character from the set can occur in the target position
Use ^ as the first character inside the brackets to match the negation (complement) of the character set (i.e., any character not in the specified set)
    [aeiou]: any lower case vowel
    [^aeiou]: any character except lower case vowels (e.g., A, t, k, !)
With ( ) and |, you can look for sets of alternative character sequences
For example: lov(e|ing)
11 Character sets
Use the square brackets [ ] to refer to a character set, such that any single character from the set can occur in the target position
For example, [Ll]ove matches both love and Love:
    while (<>) {
        $input = $_;
        if ($input =~ /[Ll]ove/) {
            print "$input";
        }
    }
13 Character sets
What if we want to avoid glove and Lovelace? (let us ignore loves, loving, etc. for now)
A simple solution:
    while (<>) {
        $input = $_;
        if ($input =~ / [Ll]ove /) {
            print "$input";
        }
    }
Are we going to capture all instances of Love and love in this way?
Not quite: we need to express the constraint that the edges around love can contain anything except an alphabetic symbol
14 Character sets
Ranges and negations
We need to express the constraint that the edges around love can contain anything except an alphabetic symbol
All (English) alphabetic symbols: [a-zA-Z] (this expression denotes two ranges; for digits, you can use: [0-9])
For Italian, you'd also need the accented vowels: [a-zA-Zàèéìòù]
Now we negate the previous expression using ^: [^a-zA-Z] means: anything, except the alphabetic symbols
15 Character sets
Anything non-alphabetical at the edges:
    if ($input =~ /[^a-zA-Z][Ll]ove[^a-zA-Z]/) {
        print "$input";
    }
NB: there are no spaces in this regular expression: if you see some, they are due to my LaTeX issues
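A small sketch for trying the edge constraint on a few made-up lines. The subroutine name has_love_word and the test lines are assumptions for illustration. Each line ends in a newline, as lines read with <> would. Note the last case: a negated set still has to match some character, so love at the very start of a line is missed.

```perl
use strict;
use warnings;

# The pattern from the slide: love/Love with one non-alphabetic character on each side.
my $edges = qr/[^a-zA-Z][Ll]ove[^a-zA-Z]/;

# Hypothetical helper: 1 if the line contains love/Love as a separate word, else 0.
sub has_love_word {
    my ($line) = @_;
    return $line =~ $edges ? 1 : 0;
}

my @tests = (
    "some love...\n",     # match: space before, "." after
    "a glove\n",          # no match: "g" precedes "love"
    "Ada Lovelace\n",     # no match: "l" follows "Love"
    "in love\n",          # match: the trailing newline is non-alphabetic
    "love is blind\n",    # no match: no character precedes "love"
);
foreach my $line (@tests) {
    my $label = has_love_word($line) ? "MATCH" : "NO MATCH";
    print "$label\t$line";
}
```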
16 Practice time
Create a text file with lines to test the regular expression above, e.g.:
    some love... should match
    glove should not match
17 Alternative sequences
Square brackets only help when alternatives are limited to single characters
For alternative sequences, use ( ) and |, e.g.:
    if ($input =~ /[Ll]ov(e|e[sd]|ing)/) {
        print "$input";
    }
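As a sketch, the alternatives can be combined with the non-alphabetic edges from the earlier slide, and Perl's $1 variable then holds whichever alternative inside the parentheses matched. The subroutine name love_ending and the sample lines are illustrative assumptions.

```perl
use strict;
use warnings;

# Hypothetical helper: return the ending (e, es, ed or ing) found after
# lov/Lov when the word has non-alphabetic edges; return nothing otherwise.
sub love_ending {
    my ($line) = @_;
    if ($line =~ /[^a-zA-Z][Ll]ov(e|e[sd]|ing)[^a-zA-Z]/) {
        return $1;    # text matched by the parenthesized alternatives
    }
    return;
}

foreach my $line ("she loves it\n", "they loved it\n", "a loving cat\n", "a lovely cat\n") {
    my $ending = love_ending($line);
    my $label  = defined $ending ? "lov + $ending" : "no match";
    print "$label\t$line";
}
```

The line with lovely is not matched, because none of the alternatives is followed by a non-alphabetic character there.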
18 Modifiers
+, * and ? specify how many times the immediately preceding character must be repeated
+: at least once
    ehm+ matches ehm, ehmm, ehmmm..., but not, for example, eh
*: as many times as you want, including 0 times
    loves* matches love, loves, lovess...
    Note equivalence of ehm+ and ehmm*
?: once, optionally
    loves? matches love and loves, but not, e.g., lovess
NB: unlike with wildcards, the modifiers refer to the preceding character (lov* matches lo, lov, lovv...; it does not mean lov followed by any character, which is probably what you meant if you wrote that!)
We'll do more practice with modifiers after we tokenize the corpus
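The examples above describe whole words. A regexp inside the general frame would also succeed on a part of a longer string, so the sketch below anchors each pattern with ^ and $ (listed in the reference table) to check the whole word. The helper name whole_word_matches and the (?: ) grouping around the pattern are assumptions added for the demonstration.

```perl
use strict;
use warnings;

# Hypothetical helper: 1 if the whole word matches the pattern, else 0.
# ^ and $ anchor the match; (?: ) only groups the pattern between them.
sub whole_word_matches {
    my ($pattern, $word) = @_;
    return $word =~ /^(?:$pattern)$/ ? 1 : 0;
}

foreach my $case (
    ['ehm+',   [qw(ehm ehmmm eh)]],
    ['loves*', [qw(love loves lovess)]],
    ['loves?', [qw(love loves lovess)]],
    ['lov*',   [qw(lo lovv love)]],
) {
    my ($pattern, $words) = @$case;
    foreach my $word (@$words) {
        printf "%-7s %-7s %s\n", $pattern, $word,
            whole_word_matches($pattern, $word) ? "match" : "no match";
    }
}
```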
19 Solving problems with the modifiers
Looking for interesting collocations in a flexible manner...
    if ($input =~ / solve a problem/) {
        print "$input";
    }
Problems?
20 Solving problems: inflected forms
With (...|...):
    if ($input =~ / solv(e|e[ds]|ing) a problem/) {
        print "$input";
    }
This approach gets tedious soon
21 The + modifier
While it is less precise, we can simply say: search solv followed by one or more alphabetic characters:
    solv[a-z]+
Insert this pattern in the solving problems regexp
22 Solving problems and the/a/some problem
With the * and ? modifiers:
    if ($input =~ / solv[a-z]+ [a-z]* ?problem/) {
        print "$input";
    }
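A sketch for trying this pattern on a few made-up lines. It assumes the pattern is read with an optional space before problem, i.e. [a-z]* ?problem: an optional lower case word, an optional space, then problem. The helper name has_collocation and the test lines are illustrative and are not taken from the Brown corpus.

```perl
use strict;
use warnings;

# Assumed reading of the slide's pattern: an inflected form of "solve",
# an optional lower case word, an optional space, then "problem".
my $collocation = qr/ solv[a-z]+ [a-z]* ?problem/;

# Hypothetical helper: 1 if the line contains the collocation, else 0.
sub has_collocation {
    my ($line) = @_;
    return $line =~ $collocation ? 1 : 0;
}

foreach my $line (
    "we solved problems\n",
    "to solve a problem\n",
    "she is solving the problem\n",
    "they solve some problems\n",
    "he solved this hard problem\n",    # two words in between: not matched
) {
    my $label = has_collocation($line) ? "MATCH" : "NO MATCH";
    print "$label\t$line";
}
```

Because solv[a-z]+ accepts any letters after solv, the pattern is less precise than the explicit alternatives: a made-up line such as "a solvent problem" would also match.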
23 Practice time
Write a simple program that extracts lines that contain proper names from the Brown corpus
For the current purposes, a proper name is simply a sequence of two capitalized words
Problems? Sketches of solutions?
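One possible sketch of a solution, not the only one. It reads "two capitalized words" as an upper case letter followed by lower case letters, a space, and the same again. The helper name proper_name_candidates and the sample lines are made up, and the /g modifier (find all matches in the line) goes beyond what the slides introduce.

```perl
use strict;
use warnings;

# Assumed pattern for a sequence of two capitalized words.
my $two_caps = qr/[A-Z][a-z]+ [A-Z][a-z]+/;

# Hypothetical helper: return every candidate proper name found in a line.
sub proper_name_candidates {
    my ($line) = @_;
    my @found = $line =~ /($two_caps)/g;    # all matches, in list context
    return @found;
}

foreach my $line (
    "We met Nelson Francis yesterday\n",
    "The Brown corpus is balanced\n",
    "nothing capitalized here\n",
) {
    my @names = proper_name_candidates($line);
    my $label = @names ? join(", ", @names) : "(none)";
    print "$label\t$line";
}
```

The second line shows one of the problems the exercise asks about: a capitalized word at the start of a sentence followed by a name ("The Brown") is reported as a proper name.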
24 Regexp reference table
In part from Kenneth Church: Unix for Poets
Example          Explanation
a                letter a
[a-z]            any lower case English letter
[a-zàèéìòù]      any lower case Italian letter
[A-Z]            any upper case English letter
[A-ZÀÈÉÌÒÙ]      any upper case Italian letter
[0123456789]     any digit
[0-9]            any digit
[aeiouAEIOU]     any English vowel
[^aeiouAEIOU]    any character except the vowels
.                any character
^                beginning of string
$                end of string
x*               zero or more x
x+               one or more x
x?               zero or one x
(xz|yw)          xz or yw