Speech Recognition CSCI-GA Fall Homework Assignment #1: Automata Operations. Instructor: Eugene Weinstein. Due Date: October 17th
Speech Recognition CSCI-GA, Fall 2013
Homework Assignment #1: Automata Operations
Instructor: Eugene Weinstein
Due Date: October 17th

Note: It is advised, but not required, to use the OpenFST library to complete this assignment. In order to receive full credit, you must give the full sequence of shell commands and the code of any script or program used to complete each problem. Not every automata operation required to complete this assignment has been covered in class, so you may need to do some research on your own in the relevant documentation.

A. Regular expressions and automata.

Give epsilon-free, deterministic, and minimal finite-state acceptors accepting only the strings matching the following regular expressions:

    a(a|ba)*ba*
    (a|b)*(baab)*
    (a|b)*baab
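One hedged way to sanity-check an acceptor built for Part A is to enumerate, by brute force, every string over {a, b} up to a small length and record which ones each expression matches. The sketch below uses Python's re module for this. The patterns are written with explicit alternation (|), which is an assumption about how the expressions are meant to be read, and the names PATTERNS and matching_strings are illustrative only.

```python
import itertools
import re

# Assumed readings of the three expressions, with alternation written as "|".
PATTERNS = {
    "1": r"a(a|ba)*ba*",
    "2": r"(a|b)*(baab)*",
    "3": r"(a|b)*baab",
}


def matching_strings(pattern, max_len, alphabet="ab"):
    """Return all strings over `alphabet` of length <= max_len that fully match `pattern`."""
    compiled = re.compile(pattern)
    matches = []
    for n in range(max_len + 1):
        for letters in itertools.product(alphabet, repeat=n):
            s = "".join(letters)
            if compiled.fullmatch(s):
                matches.append(s)
    return matches


if __name__ == "__main__":
    for name, pattern in PATTERNS.items():
        print(name, pattern, matching_strings(pattern, 4))
```

The resulting string lists can be compared against the strings accepted by a compiled acceptor; any disagreement on short strings points to an error in either the acceptor or the assumed reading of the expression.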
Producing deterministic and minimal versions of the acceptor (for the first regular expression):

    $ cat > syms.txt
    a 1
    b 2
    $ cat > 1a1.txt
    0 1 a
    1 1 a
    1 2 b
    2 1 a
    1 3 b
    3 4 a
    4 4 a
    4
    $ fstcompile --acceptor --isymbols=syms.txt 1a1.txt > 1a1.fst
    $ cat 1a1.fst | fstdeterminize | fstminimize > 1a1-opt.fst
    $ fstdraw --isymbols=syms.txt --acceptor --portrait 1a1-opt.fst | dot -Tpng -o 1a1-opt.png

B. Weighted acceptors and transducers

1. Give a weighted finite-state acceptor in the tropical semiring producing strings of length four over the alphabet Σ = {a, b}, where a is three times as likely to be produced as b.

   -ln(0.25) ≈ 1.3863; -ln(0.75) ≈ 0.2877

2. Give a weighted finite-state transducer T which can be used to count the number of a and b symbols accepted by a (possibly weighted) finite-state acceptor A. State the semiring over which your method gives the desired result, and give the precise sequence of automata operations required (hint: the composition A ∘ T will come in handy here).

   The counting transducer T is over the log semiring. Note that T(x) = 1 for all x ∈ {a, b}*. Let C = Π_o(A ∘ T), where Π_o is the projection-to-output operation. Then e^{-C(x)} is the count of occurrences of the string x in A.
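The weights and the counting identity above can be checked numerically. The following is a minimal sketch, not part of the original solution: it assumes the weight of a symbol with probability p is -ln p, enumerates all 16 strings of length four, and computes the expected number of a and b symbols in a sample of 100 strings. The helper names (neg_log, log_add, string_weight, expected_counts) are hypothetical.

```python
import itertools
import math

# Assumed symbol probabilities: a is three times as likely as b.
P = {"a": 0.75, "b": 0.25}


def neg_log(p):
    """Weight of probability p in the tropical and log semirings."""
    return -math.log(p)


def log_add(x, y):
    """Log-semiring sum of two weights: -ln(e^-x + e^-y)."""
    return -math.log(math.exp(-x) + math.exp(-y))


def string_weight(s):
    """Semiring product along a path: arc weights add up."""
    return sum(neg_log(P[c]) for c in s)


def expected_counts(num_strings=100, length=4):
    """Expected number of each symbol over `num_strings` sampled strings."""
    counts = {"a": 0.0, "b": 0.0}
    for letters in itertools.product("ab", repeat=length):
        s = "".join(letters)
        prob = math.exp(-string_weight(s))
        for sym in counts:
            counts[sym] += num_strings * prob * s.count(sym)
    return counts


if __name__ == "__main__":
    print("weight(a) =", round(neg_log(P["a"]), 4))
    print("weight(b) =", round(neg_log(P["b"]), 4))
    total = None
    for letters in itertools.product("ab", repeat=4):
        w = string_weight("".join(letters))
        total = w if total is None else log_add(total, w)
    print("total probability =", math.exp(-total))
    print("expected counts =", expected_counts())
```

Under these assumptions a sample of 100 strings contains, in expectation, 300 a symbols and 100 b symbols, which gives a target to compare against when counting symbols in generated strings.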
3. Randomly generate 100 strings according to the constraints of part 1: the alphabet is Σ = {a, b}, the string is of length four, and a is more likely than b. You may do this using the OpenFST tools, or using a script of your own.

    # In order to generate the strings, we will sample from the acceptor of part 1 above.
    $ cat | fstcompile --acceptor --isymbols=syms.txt > 1b.fst
    0 1 a 0.2877
    0 1 b 1.3863
    1 2 a 0.2877
    1 2 b 1.3863
    2 3 a 0.2877
    2 3 b 1.3863
    3 4 a 0.2877
    3 4 b 1.3863
    4
    *ctrl-d*
    # Generate 4-letter strings and put them all into strings.txt, one per line
    $ for i in {1..100}; do fstrandgen --weighted --select=log_prob 1b.fst | fstprint --isymbols=syms.txt | awk '{print $3}' | tr '\n' ' '; echo; sleep 2; done > strings.txt

4. Apply the transducer T you created in part 2 to the strings from part 3. Does the count of a and b symbols match the distribution of symbols given in part 1?

    # Make an FST archive (far) of the generated strings:
    $ farcompilestrings --symbols=syms.txt strings.txt > strings.far
    $ farextract strings.far
    $ echo | fstcompile > union.fst   # Empty transducer
    # Union together all string transducers produced
    $ for file in `ls strings.txt-*`; do fstunion union.fst $file > foo.fst; mv foo.fst union.fst; done
    # Convert to log semiring and sort on output label for composition
    $ fstprint union.fst | fstcompile --arc_type=log | fstarcsort --sort_type=olabel > union-log.fst
    $ cat | fstcompile --arc_type=log --isymbols=syms.txt --osymbols=syms.txt > count.fst
    0 0 a <eps>
    0 0 b <eps>
    0 1 a a
    0 1 b b
    1 1 a <eps>
    1 1 b <eps>
    1
    *ctrl-d*
    $ fstcompose union-log.fst count.fst | fstproject --project_output | fstrmepsilon | fstdeterminize | fstminimize | fstprint --acceptor --isymbols=syms.txt
    # output looks like:
    0 1 a w_a
    0 1 b w_b

The final answer is obtained as e^{-w_a} and e^{-w_b}, where w_a and w_b are the weights printed on the a and b arcs. The counts add up to 400, and
match our expected distribution.

C. Camel casing with automata.

1. Download the list of 100 most common English words from
2. From this list, build a camel-casing transducer T, i.e., one which, when composed with an acceptor A accepting lowercase strings consisting of words in the list, can be used to produce a string acceptor A_c whose strings are camel cased. For example, if the input acceptor accepts the strings gowithme and dosomework, the output acceptor should accept the strings GoWithMe and DoSomeWork. Do not assume that the words are space separated.

    $ head words.txt
    the
    be
    to
    of
    and
    a
    in
    that
    have
    I
    # See appendix below for code listing of words.py
    $ cat words.txt | ./words.py | fstcompile --isymbols=syms.txt --osymbols=syms.txt | fstclosure > camelcase.fst

   The desired camel-cased acceptor is found as A_c = Π_o(A ∘ T), where Π_o is projection on the output labels.

3. Repeat the process for just the first five words in the list to construct transducer T. Show this transducer.
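Because the five-word transducer is small, its arcs can also be listed as text. The sketch below is an illustration under stated assumptions, not the original figure: it applies a per-word chain construction (every word starts at state 0, and only the first letter is upper-cased on output) to the first five words shown by head words.txt, before any closure is applied. The function name camel_arcs is hypothetical.

```python
def camel_arcs(words):
    """Return AT&T-style text lines for a camel-casing transducer (before closure)."""
    lines = []
    next_state = 1
    for word in words:
        word = word.strip()
        if not word:
            continue  # skip blank entries
        src = 0  # every word starts from the shared start state
        for i, ch in enumerate(word):
            in_sym = ch.lower()
            out_sym = ch.upper() if i == 0 else in_sym
            lines.append("%d %d %s %s" % (src, next_state, in_sym, out_sym))
            src = next_state
            next_state += 1
        lines.append("%d" % src)  # the last state of each word is final
    return lines


if __name__ == "__main__":
    first_five = ["the", "be", "to", "of", "and"]
    print("\n".join(camel_arcs(first_five)))
```

The printed lines are in the text format that fstcompile reads, so they could be piped into the same compile-and-closure pipeline used for the full list, provided the symbol table contains both lowercase and uppercase letters.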
4. Demonstrate that your transducer T works by showing the input acceptor A and the resulting transducer A ∘ T for both of the examples given in sub-question 2 above. Be sure to also verify that no outputs are produced if the input does not consist solely of words in the list of 100 words.

    $ cat | fstcompile --isymbols=syms.txt --acceptor > dosomework.fst
    0 1 d
    1 2 o
    2 3 s
    3 4 o
    4 5 m
    5 6 e
    6 7 w
    7 8 o
    8 9 r
    9 10 k
    10
    *ctrl-d*
    $ fstcompose dosomework.fst camelcase.fst | fstrmepsilon > dosomework-camel.fst
    $ cat dosomework-camel.fst | fstdraw --isymbols=syms.txt --osymbols=syms.txt --portrait | dot -Tpng -odosomework-camel.png

5. How many states and transitions does your transducer T have? Is it epsilon-free and/or deterministic? If not, what ideas do you have for determinizing it (you do not have to implement them)?

    $ fstinfo camelcase.fst
    # of states                340
    # of arcs                  439
    input deterministic        n
    output deterministic       n
    input/output epsilons      y
    input epsilons             y
    output epsilons            y

No, the transducer is not deterministic, and it has epsilons. The issue with determinizing such a transducer is that it is not functional, i.e., it does not map each input sequence to a unique output sequence. It is possible to determinize such a transducer by introducing special symbols at the end of each output string corresponding to more than one input path. With such symbols, the transducer becomes functional, and can then be determinized. The special symbols can then be replaced with epsilons. See the discussion about p-subsequential transducers in Mehryar Mohri. Finite-State Transducers in Language and Speech Processing. Computational Linguistics, 23:2, 1997.

Appendix: words.py

    #!/usr/bin/python
    # Reads one word per line on stdin and prints a camel-casing transducer
    # in the text format expected by fstcompile.
    import sys

    state_num = 1   # next unused state id
    last_state = 0  # source state of the next arc
    for line in sys.stdin:
        word = line.strip()
        if not word:
            continue  # ignore blank lines
        for i, c in enumerate(word):
            input_sym = c.lower()
            if i == 0:
                # Each word starts at state 0; its first letter is upper-cased.
                last_state = 0
                output_sym = c.upper()
            else:
                output_sym = input_sym
            print("%d %d %s %s" % (last_state, state_num, input_sym, output_sym))
            last_state = state_num
            state_num += 1
        # The last state of each word is final.
        print("%d" % last_state)
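The non-functionality discussed in the answer to question 5 can be seen without any FST tooling. The sketch below is illustrative only: it emulates Π_o(A ∘ T) for a single input string by enumerating every way to segment the string into list words and capitalizing each word. The word set is a small subset chosen for the example and assumed to be part of the 100-word list, and camel_case_outputs is a hypothetical helper.

```python
from functools import lru_cache

# Small illustrative subset; assumed to be contained in the 100-word list.
WORDS = frozenset(["go", "with", "me", "do", "so", "some", "work"])


def camel_case_outputs(text, words=WORDS):
    """Return every camel-cased segmentation of `text` into words from `words`."""

    @lru_cache(maxsize=None)
    def segment(start):
        if start == len(text):
            return ("",)  # exactly one way to segment the empty suffix
        results = []
        for end in range(start + 1, len(text) + 1):
            piece = text[start:end]
            if piece in words:
                cased = piece[0].upper() + piece[1:]
                results.extend(cased + rest for rest in segment(end))
        return tuple(results)

    return sorted(segment(0))


if __name__ == "__main__":
    for s in ["gowithme", "dosomework", "gohome"]:
        print(s, camel_case_outputs(s))
```

With so, me, and some all present, dosomework has two segmentations and therefore two outputs (DoSoMeWork and DoSomeWork), which illustrates why the transducer is not functional. An input that cannot be segmented into list words, such as gohome with this word set, yields no output at all.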