Weighted Finite-State Transducers in Computational Biology

Size: px
Start display at page:

Download "Weighted Finite-State Transducers in Computational Biology"

Transcription

1 Weighted Finite-State Transducers in Computational Biology Mehryar Mohri Courant Institute of Mathematical Sciences Joint work with Corinna Cortes (Google Research).

2 This Tutorial Weighted finite-state transducers and software library Pairwise alignments for bioinformatics Learning weighted transducers page 2 2

3 General Definitions Weighted finite-state transducers: b:a/.2 b:a/.6 a:b/. a:b/.5 a:a/.4 2 b:a/.3 3/. Kernels: similarity measures between vectors, sequences or other structures. Efficient computation of inner products in highdimensional feature spaces. Extensive use in modern machine learning. Must satisfy a mathematical requirement (Mercer s condition). page 3 3

4 Example Problem: find the n best pairwise alignments between two very large sets of sequences (e.g., set >,). Example: S S2 a b a a b b b b b a a a b a b b a a a a a b b a a b b a b b b a a b a a b b b a a a b b b a b a b a b b b a b b a a b b a a ε a ε b b a a a a a b b b a a a page 4 4

5 Software Libraries FSM Library: Finite-State Machine Library. General software utilities for building, combining, optimizing, and searching weighted automata and transducers (MM, Pereira, and Riley, 2). OpenFst Library: Open-source Finite-state transducer Library (Allauzen et al., 27). page 5 5

6 Software Libraries GRM Library: Grammar Library. General software collection for constructing and modifying weighted automata and transducers representing grammars and statistical language models (Allauzen, MM, and Roark, 25). DCD Library: Decoder Library. General software collection for speech recognition decoding and related functions (MM and Riley, 23). page 6 6

7 FSM Library The FSM utilities construct, combine, minimize, and search weighted finite-states transducers. User Program Level: Programs that read from and write to files or pipelines, fsm(): fsmintersect in.fsm in2.fsm >out.fsm C(++) Library Level: Library archive of C(++) functions that implements the user program level, fsm(3): Fsm in = FSMLoad("in.fsm"); Fsm in2 = FSMLoad("in2.fsm"); Fsm out = FSMIntersect(fsm, fsm2); FSMDump("out.fsm", out); page 7 7

8 Definition Level: Specification of labels, of costs, and of types of FSM representations. page 8 8

9 This Lecture Weighted automata and transducers Rational operations Elementary unary operations Fundamental binary operations Optimization algorithms Search algorithms 9 page 9

10 Textual format FSM File Types automata/acceptor files, transducer files, symbols files. Binary format: compiled representation used by all FSM utilities. page

11 Compiling, Printing, and Drawing Compiling fsmcompile -s tropical -ia.syms <A.txt >A.fsm fsmcompile -s log -ia.syms -oa.syms -t <T.txt >T.fsm Printing fsmprint -ia.syms <A.fsm >A.txt fsmprint -ia.syms -oa.syms <T.fsm dot -Tps >T.ps Drawing fsmdraw -ia.syms <A.fsm dot -Tps >A.ps fsmdraw -ia.syms -oa.syms <T.fsm dot -Tps >T.ps page

12 Weight Sets: Semirings A semiring negation. (K,,,, ) is a ring that may lack sum: to compute the weight of a sequence (sum of the weights of the paths labeled with that sequence). product: to compute the weight of a path (product of the weights of constituent transitions). page 2 2

13 Semirings - Examples Semiring Set Boolean {, } Probability R + + Log R {, + } log + + Tropical R {, + } min + + with log defined by: x log y = log(e x + e y ). page 3 3

14 Automata/Acceptors Graphical Representation ( A.ps) red/.5 Acceptor file ( A.txt) red.5 green.3 2 blue 2 yellow green/.3 Symbols file ( A.syms) red green 2 blue 3 yellow 4 blue/ yellow/.6 2 /.8 page 4 4

15 Transducers Graphical Representation ( T.ps) red:yellow/.5 Transducer file ( T.txt) Symbols file ( T.syms) red green 2 blue 3 yellow 4 green:blue/.3 red yellow.5 green blue.3 2 blue green 2 yellow red blue:green/ yellow:red/.6 2 /.8 page 5 5

16 Paths - Definitions and Notation Path π input label output label p[π] i[π]:o[π] n[π] previous state or source state next state or destination state Sets of paths P (R,R 2 ) : paths from R Q to R 2 Q. P (R,x,R 2 ): paths in P (R,R 2 ) with input label x. P (R,x,y,R 2 ): paths in P (R,x,R 2 ) with output label y. page 6 6

17 General Definitions Alphabets: input Σ, output. States: Q, initial states I, final states F. Transitions: E Q (Σ {ɛ}) ( {ɛ}) K Q. Weight functions: initial: λ : I K. final: ρ : F K. page 7 7

18 Automata and Transducers - Definitions Automaton A =(Σ,Q,I,F,E,λ,ρ) x Σ, [[A]](x) = π P (I,x,F) Transducer T =(Σ,,Q,I,F,E,λ,ρ) x Σ,y, [[T ]](x, y) = λ(p[π]) w[π] ρ(n[π]) π P (I,x,y,F) λ(p[π]) w[π] ρ(n[π]) page 8 8

19 Weighted Automata b/.6 b/.2 a/. a/.5 a/.4 2 b/.3 3/. [[A]](x) = Sum of the weights of all successful paths labeled with x [[A]](abb) = page 9 9

20 Weighted Transducers b:a/.6 b:a/.2 a:b/. a:b/.5 a:a/.4 2 b:a/.3 3/. [[T ]](x, y) = Sum of the weights of all successful paths with input x and output y. [[T ]](abb, baa) = page 2 2

21 This Lecture Weighted automata and transducers Rational operations Elementary unary operations Fundamental binary operations Optimization algorithms Search algorithms 2 page 2

22 Rational Operations Sum [[T T 2 ]](x, y) =[[T ]](x, y) [[T 2 ]](x, y) Product [[T T 2 ]](x, y) = Closure [[T ]](x, y) = [[T ]] n (x, y) n= [[T ]](x,y ) [[T 2 ]](x 2,y 2 ). x=x x 2 y=y y 2 page 22 22

23 Conditions (on the closure operation): condition on T: e.g., weight of ε-cycles = (regulated transducers), or semiring condition: e.g., x = as with the tropical semiring (more generally locally closed semirings). Complexity and implementation: linear-time complexity: O(( E + Q )+( E 2 + Q 2 )) O( Q + E ) or lazy implementation. page 23 23

24 Sum - Illustration Program: fsmunion A.fsm B.fsm >C.fsm Graphical representation: red/.5 green/.4 / green/.3 blue/ yellow/.6 2 /.8 blue/.2 2 /.3 red/.5 6 eps/ eps/ green/.3 blue/ 2 /.8 yellow/.6 3 green/.4 4 / blue/.2 5 /.3 page 24 24

25 Product - Illustration Program: fsmconcat A.fsm B.fsm >C.fsm Graphical representation: red/.5 green/.4 / green/.3 blue/ yellow/.6 2 /.8 blue/.2 2 /.3 red/.5 green/.4 4 / green/.3 blue/ yellow/.6 2 eps/.8 3 blue/.2 5 /.3 page 25 25

26 Closure - Illustration Program: fsmclosure B.fsm >C.fsm Graphical representation: green/.4 / green/.4 eps/ / blue/.2 3 / eps/ blue/.2 2 /.3 eps/.3 2 /.3 page 26 26

27 This Lecture Weighted automata and transducers Rational operations Elementary unary operations Fundamental binary operations Optimization algorithms Search algorithms 27 page 27

28 Elementary Unary Operations Reversal Inversion [[ T ]](x, y) =[[T ]]( x, ỹ) [[T ]](x, y) =[[T ]](y, x) Projection [[A]](x) = y [[T ]](x, y) Linear-time complexity, lazy implementation (not for reversal). page 28 28

29 Reversal - Illustration Program: fsmreverse A.fsm >C.fsm Graphical representation: red/.5 green/.3 blue/ yellow/.6 2 green/.2 blue/2 3 / 4 /.3 red/.5 eps/ eps/ green/.2 blue/2 3 blue/ yellow/.6 2 green/.3 / page 29 29

30 Program: fsminvert A.fsm >C.fsm Graphical representation: red:bird/.5 Inversion - Illustration green:pig/.3 blue:cat/ yellow:dog/.6 2 /.8 bird:red/.5 pig:green/.3 cat:blue/ dog:yellow/.6 2 /.8 page 3 3

31 Projection - Illustration Program: fsmproject - T.fsm >A.fsm Graphical representation: red:bird/.5 green:pig/.3 blue:cat/ yellow:dog/.6 2 /.8 red/.5 green/.3 blue/ yellow/.6 2 /.8 page 3 3

32 This Lecture Weighted automata and transducers Rational operations Elementary unary operations Fundamental binary operations Optimization algorithms Search algorithms 32 page 32

33 Some Fundamental Binary Operations Composition ( (K,,,, ) commutative) [[T T 2 ]](x, y) = [[T ]](x, z) [[T 2 ]](z,y) z Intersection ( (K,,,, ) commutative) [[A A 2 ]](x) =[[A ]](x) [[A 2 ]](x) Difference ( A 2 unweighted and deterministic) [[A A 2 ]](x) =[[A A 2 ]](x) (Pereira and Riley, 997; MM et al. 996) page 33 33

34 Complexity and implementation: quadratic complexity: O(( E + Q )( E 2 + Q 2 )) path multiplicity in presence of ε-transitions: ε- filter; lazy implementation. page 34 34

35 Composition - Illustration Program: fsmcompose A.fsm B.fsm >C.fsm Graphical representation: c:a/.3 a:b/.6 a:b/. b:a/.2 2 a:a/.4 b:b/.5 3/.6 b:c/.3 a:b/.4 2/.7 c:b/.9 c:b/.7 (, 2) a:b/ a:c/.4 (, ) (, ) a:b/.8 (3, 2)/.3 page 35 35

36 Multiplicity and ε-transitions - Problem a:d!:e,,,2 (x:x) (!:!) b:! (!2:!2) b:e (!2:!) b:! (!2:!2) a:a b:! 2 c:! 3 d:d 4 c:!!:e 2, 2,2 (!:!) c:! (!2:!2) (!2:!2) a:d!:e 2 d:a 3!:e 3, 3,2 (!:!) d:a (x:x) 4,3 page 36 36

37 Solution - Filter F for Composition!:!!2:! x:x!:! x:x!2:!2!2:!2 x:x 2 Replace T T 2 with T F T 2. page 37 37

38 Intersection - Illustration Program: fsmintersect A.fsm B.fsm >C.fsm Graphical representation: red/.5 green/.4 green/.3 blue/ yellow/.6 2 /.8 / red/.2 blue/.6 yellow/.3 2 /.5 blue/.6 3 /.8 red/.7 green/.7 2 yellow/.9 4 /.3 page 38 38

39 Difference - Illustration Program: fsmdifference A.fsm B.fsm >C.fsm Graphical representation: red/.5 green green/.3 blue/ yellow/.6 2 /.8 red blue yellow 2 red/.5 red/.5 red/.5 green/.3 2 green/.3 3 blue/ yellow/.6 4 /.8 page 39 39

40 This Lecture Weighted automata and transducers Rational operations Elementary unary operations Fundamental binary operations Optimization algorithms Search algorithms 4 page 4

41 Optimization Algorithms Connection: removes non-accessible/noncoaccessible states. ε-removal: removes ε-transitions. Determinization: creates equivalent deterministic machine. Pushing: creates equivalent pushed/stochastic machine. Minimization: creates equivalent minimal deterministic machine. page 4 4

42 Conditions: there are specific semiring conditions for the use of these algorithms, e.g., not all weighted automata or transducers can be determinized using the determinization algorithm. page 42 42

43 Connection - Illustration Program: fsmconnect A.fsm >C.fsm Graphical representation: green/ /.2 red/.5 blue/ 2 /.8 green/.3 yellow/.6 red/ 5 red/.5 green/.3 blue/ yellow/.6 2 /.8 page 43 43

44 Connection - Algorithm Definition: Input: weighted transducer T. Output: equivalent weighted transducer T 2 with all states connected. Description: 3. Depth-first search of from. T I 4. Mark accessible and coaccessible states. 5. Keep marked states and corresponding transitions. Complexity: linear O( Q + E ). page 44 44

45 ε-removal - Illustration Program: fsmrmepsilon T.fsm >TP.fsm Graphical representation: a:b/.!:!/.2 b:a/.4!:!/.3!:!/.6 b:a/.9 b:b/.5 3!:!/ 2 a:a/.8 b:a/.7 4/ 4/ a:a/.6 b:a/ b:a/.7 a:b/. b:b/.7 b:a/.5 b:b/.5 b:a/2.3 a:a/.4 b:a/.7 4/ b:a/.4 b:a/.9 a:a/ b:a/.7 page 45 45

46 Definition: Input: weighted transducer. Output: equivalent WFST Description: Computation of ε-closures. with no ε-transition. Removal of εs. Complexity: T ɛ : O( Q 2 + Q E (T + T )). Acyclic ε-removal - Algorithm T 2 T General case (tropical semiring): O( Q E + Q 2 log Q ) (MM, 2) page 46 46

47 Computation of ε-closures Definition: for p in Q, C[p] = { (q, w) :q ɛ[p], d[p, q] =w }, where d[p, q] = w[π]. π P (p,ɛ,q) Problem formulation: all-pairs shortest-distance problem in T ɛ (T reduced to its ε-transitions). closed semirings: generalization of Floyd- Warshall algorithm. k-closed semirings: generic sparse shortestdistance algorithm. page 47 47

48 Determinization - Algorithm Definition: Input: weighted automaton or transducer (MM,997) Output: equivalent subsequential or deterministic machine T 2 : has a unique initial state and no two transitions leaving the same state share the same input label. Description: 3. Generalization of subset construction: weighted subsets {(q,w ),...,(q n,w n )}, where w i s are remainder weights. 4. Computation of the weight of resulting transitions. T page 48 48

49 Determinization - Conditions Semiring: weakly left divisible semirings. Definition: T is determinizable when the determinization algorithm applies to T. All unweighted automata are determinizable. All acyclic machines are determinizable. Not all weighted automata or transducers are determinizable. Characterization based on the twins property. Complexity: exponential, but lazy implementation. page 49 49

50 Determinization of Weighted Automata - Illustration Program: fsmdeterminize A.fsm >D.fsm Graphical representation: a/ b/4 2 b/ b/3 a/3 b/3 3/ b/ / b/5 a/ {(,2),(2,)}/2 b/ {(,)} b/ b/3 {(3,)}/ {(,),(2,3)}/ page 5 5

51 Determinization of Weighted Transducers - Illustration Program: fsmdeterminize T.fsm >D.fsm Graphical representation: a:a/. a:b/.2 a:c/.5 2 b:b/.3 b:eps/.4 3 c:eps/.7 a:eps/.5 a:b/.6 4/.7 a:a/.2 {(, b, )} a:b/. {(, eps, )} a:eps/. {(, c,.4), (2, a, )} b:a/.3 {(3, b, ), (4, eps,.)} /(eps,.8) a:b/.5 a:b/.6 c:c/. {(4, eps, )} /(eps,.7) page 5 5

52 Pushing - Algorithm (MM,997; 24) Definition: Input: weighted automaton or transducer Output: equivalent automaton or transducer such that the longest common prefix of all T T 2 outgoing paths be ε or such that the sum of the weights of all outgoing transitions be modulo the string or weight at the initial state. page 52 52

53 Description:. Single-source shortest distance computation: for each state q, d[q] = w[π]. π P (q,f) 2. Reweighting: for each transition e such that d[p[e]], w[e] (d[p[e]]) (w[e] d[n[e]]) page 53 53

54 Conditions (automata case): weakly divisible semiring, zero-sum free semiring or automaton. Complexity: automata case acyclic case: linear general case (tropical semiring): transducer case: O( Q + E (T + T )). O( Q log Q + E ). O(( P max + ) E ). page 54 54

55 Weight Pushing - Illustration Program: fsmpush -ic A.fsm >P.fsm Graphical representation: Tropical semiring: a/ b/ c/4 d/ e/ f/ e/ 3/ a/ b/ c/4 d/ e/ f/ e/ 3/ e/ 2 f/ e/ 2 f/ page 55 55

56 Log semiring: a/ b/ c/4 d/ e/ f/ e/ 3/ a/.3266 b/.3266 c/ d/.326 e/.332 f/.332 e/.332 3/-.639 e/ 2 f/ e/ f/.332 page 56 56

57 Label Pushing - Illustration Program: fsmpush -il T.fsm >P.fsm Graphical representation: a:a c:! b:d a:! 2 b:! 3 d:! c:d a:d 4 a:! 5 f:d 6 a:a c:d b:! a:d 2 b:! 3 d:d c:! a:d 4 a:! 5 f:! 6 page 57 57

58 Definition: Minimization - Algorithm (MM,997) Input: deterministic weighted automaton or transducer T. Output: equivalent deterministic automaton or transducer T 2 with the minimal number of states and transitions. Description: Canonical representation: use pushing or other algorithm to standardize input automata. Automata minimization: encode pairs (label, weight) as labels and use classical unweighted minimization algorithm. page 58 58

59 Complexity: Automata case acyclic case: linear, O( Q + E (T + T )). general case (tropical semiring): Transducer case O( E log Q ). acyclic case: O(S + Q + E ( P max + )). general case (tropical semiring): O(S + Q + E (log Q + P max )). page 59 59

60 Minimization - Illustration Program: fsmminimize D.fsm >M.fsm Graphical representation: b/ 2 c/2 d/ a/ b/ a/2 d/3 3 e/2 d/4 e/3 5 c/3 c/ 6 e/ 7/ 4 b/ 2 c/ d/ a/ b/ a/3 d/6 3 e/ d/6 e/ 5 c/ c/ 6 e/ 7/6 d/ a/ b/ b/ a/3 4 c/ 2 c/ e/ d/6 3 6 e/ 7/6 page 6 6

61 Definition: Description Equivalence - Algorithm Input: deterministic weighted automata A and B. Output: TRUE iff A and B equivalent. (MM,997): Canonical representation: use pushing or other algorithm to standardize input automata. Automata minimization: encode pairs (label, weight) as labels and use classical algorithm for testing the equivalence of unweighted automata. Complexity: (second stage is quasi-linear) O( E + E 2 + Q log Q + Q 2 log Q 2 ). page 6 6

62 Equivalence - Illustration Program: fsmequiv [-v] D.fsm M.fsm Graphical representation: blue/.7 2 /.3 red/.3 yellow/.9 3 /.4 red/ blue/ yellow/.3 2 /.3 page 62 62

63 This Lecture Weighted automata and transducers Rational operations Elementary unary operations Fundamental binary operations Optimization algorithms Search algorithms 63 page 63

64 Single-Source Shortest-Distance Algorithms - Illustration Program: fsmbestpath [-n N] A.fsm >C.fsm Graphical representation: red/.5 red/.5 red/.5 green/.3 2 green/.3 3 blue/ yellow/.6 4 /.8 green/.3 blue/ 2 /.8 page 64 64

65 Pruning - Illustration Program: fsmprune -c. A.fsm >C.fsm Graphical representation: red/.5 red/.5 red/.5 green/.3 2 green/.3 3 blue/ yellow/.6 4 /.8 green/.3 blue/ yellow/.6 2 /.8 page 65 65

66 Summary FSM Library: weighted finite-state transducers (semirings); elementary unary operations (e.g., reversal); rational operations (sum, product, closure); fundamental binary operations (e.g., composition); optimization algorithms (e.g., ε-removal, determinization, minimization); search algorithms (e.g., shortest-distance algorithms, n-best paths algorithms, pruning). page 66 66

67 References Cyril Allauzen and Mehryar Mohri. Efficient Algorithms for Testing the Twins Property. Journal of Automata, Languages and Combinatorics, 8(2):7-44, 23. Cyril Allauzen, Mehryar Mohri, and Brian Roark. The Design Principles and Algorithms of a Weighted Grammar Library. International Journal of Foundations of Computer Science, 6(3): 43-42, 25. Cyril Allauzen, Michael Riley, Johan Schalkwyk, Wojciech Skut, and Mehryar Mohri. OpenFst: a general and efficient weighted finite-state transducer library. In Proceedings of the Ninth International Conference on Implementation and Application of Automata, (CIAA 27), volume 4783 of Lecture Notes in Computer Science, pages -23. Springer, Berlin, 27. Mehryar Mohri. Finite-State Transducers in Language and Speech Processing. Computational Linguistics, 23:2, 997. Mehryar Mohri. Weighted Grammar Tools: the GRM Library. In Robustness in Language and Speech Technology. pages Kluwer Academic Publishers, The Netherlands, 2. Mehryar Mohri. Statistical Natural Language Processing. In M. Lothaire, editor, Applied Combinatorics on Words. Cambridge University Press, 25. page 67 67

68 References Mehryar Mohri. Generic Epsilon-Removal and Input Epsilon-Normalization Algorithms for Weighted Transducers. International Journal of Foundations of Computer Science, 3(): 29-43, 22. Mehryar Mohri, Fernando C. N. Pereira, and Michael Riley. The Design Principles of a Weighted Finite-State Transducer Library. Theoretical Computer Science, 23:7-32, January 2. Mehryar Mohri, Fernando C. N. Pereira, and Michael Riley. Weighted Automata in Text and Speech Processing. In Proceedings of the 2th biennial European Conference on Artificial Intelligence (ECAI-96), Workshop on Extended finite state models of language. Budapest, Hungary, 996. John Wiley and Sons, Chichester. Mehryar Mohri and Michael Riley. A Weight Pushing Algorithm for Large Vocabulary Speech Recognition. In Proceedings of the 7th European Conference on Speech Communication and Technology (Eurospeech'). Aalborg, Denmark, September 2. Fernando Pereira and Michael Riley. Finite State Language Processing, chapter Speech Recognition by Composition of Weighted Finite Automata. The MIT Press, 997. page 68 68

Weighted Finite-State Transducers in Computational Biology

Weighted Finite-State Transducers in Computational Biology Weighted Finite-State Transducers in Computational Biology Mehryar Mohri Courant Institute of Mathematical Sciences mohri@cims.nyu.edu Joint work with Corinna Cortes (Google Research). 1 This Tutorial

More information

You will likely want to log into your assigned x-node from yesterday. Part 1: Shake-and-bake language generation

You will likely want to log into your assigned x-node from yesterday. Part 1: Shake-and-bake language generation FSM Tutorial I assume that you re using bash as your shell; if not, then type bash before you start (you can use csh-derivatives if you want, but your mileage may vary). You will likely want to log into

More information

Learning with Weighted Transducers

Learning with Weighted Transducers Learning with Weighted Transducers Corinna CORTES a and Mehryar MOHRI b,1 a Google Research, 76 Ninth Avenue, New York, NY 10011 b Courant Institute of Mathematical Sciences and Google Research, 251 Mercer

More information

Lexicographic Semirings for Exact Automata Encoding of Sequence Models

Lexicographic Semirings for Exact Automata Encoding of Sequence Models Lexicographic Semirings for Exact Automata Encoding of Sequence Models Brian Roark, Richard Sproat, and Izhak Shafran {roark,rws,zak}@cslu.ogi.edu Abstract In this paper we introduce a novel use of the

More information

Speech Recognition Lecture 12: Lattice Algorithms. Cyril Allauzen Google, NYU Courant Institute Slide Credit: Mehryar Mohri

Speech Recognition Lecture 12: Lattice Algorithms. Cyril Allauzen Google, NYU Courant Institute Slide Credit: Mehryar Mohri Speech Recognition Lecture 12: Lattice Algorithms. Cyril Allauzen Google, NYU Courant Institute allauzen@cs.nyu.edu Slide Credit: Mehryar Mohri This Lecture Speech recognition evaluation N-best strings

More information

Finite-State Transducers in Language and Speech Processing

Finite-State Transducers in Language and Speech Processing Finite-State Transducers in Language and Speech Processing Mehryar Mohri AT&T Labs-Research Finite-state machines have been used in various domains of natural language processing. We consider here the

More information

A General Weighted Grammar Library

A General Weighted Grammar Library A General Weighted Grammar Library Cyril Allauzen, Mehryar Mohri, and Brian Roark AT&T Labs Research, Shannon Laboratory 80 Park Avenue, Florham Park, NJ 0792-097 {allauzen, mohri, roark}@research.att.com

More information

A General Weighted Grammar Library

A General Weighted Grammar Library A General Weighted Grammar Library Cyril Allauzen, Mehryar Mohri 2, and Brian Roark 3 AT&T Labs Research 80 Park Avenue, Florham Park, NJ 07932-097 allauzen@research.att.com 2 Department of Computer Science

More information

OpenFst: a General and Efficient Weighted Finite-State Transducer Library. Part I. Library Design and Use

OpenFst: a General and Efficient Weighted Finite-State Transducer Library. Part I. Library Design and Use OpenFst: a General and Efficient Weighted Finite-State Transducer Library Part I. Library Design and Use Outline. Definitions Semirings Weighted Automata and Transducers 2. Library Overview FST Construction

More information

Weighted Finite State Transducers in Automatic Speech Recognition

Weighted Finite State Transducers in Automatic Speech Recognition Weighted Finite State Transducers in Automatic Speech Recognition ZRE lecture 10.04.2013 Mirko Hannemann Slides provided with permission, Daniel Povey some slides from T. Schultz, M. Mohri and M. Riley

More information

WFST: Weighted Finite State Transducer. September 12, 2017 Prof. Marie Meteer

WFST: Weighted Finite State Transducer. September 12, 2017 Prof. Marie Meteer + WFST: Weighted Finite State Transducer September 12, 217 Prof. Marie Meteer + FSAs: A recurring structure in speech 2 Phonetic HMM Viterbi trellis Language model Pronunciation modeling + The language

More information

A MACHINE LEARNING FRAMEWORK FOR SPOKEN-DIALOG CLASSIFICATION. Patrick Haffner Park Avenue Florham Park, NJ 07932

A MACHINE LEARNING FRAMEWORK FOR SPOKEN-DIALOG CLASSIFICATION. Patrick Haffner Park Avenue Florham Park, NJ 07932 Springer Handbook on Speech Processing and Speech Communication A MACHINE LEARNING FRAMEWORK FOR SPOKEN-DIALOG CLASSIFICATION Corinna Cortes Google Research 76 Ninth Avenue New York, NY corinna@google.com

More information

Weighted Finite State Transducers in Automatic Speech Recognition

Weighted Finite State Transducers in Automatic Speech Recognition Weighted Finite State Transducers in Automatic Speech Recognition ZRE lecture 15.04.2015 Mirko Hannemann Slides provided with permission, Daniel Povey some slides from T. Schultz, M. Mohri, M. Riley and

More information

Report for each of the weighted automata obtained ˆ the number of states; ˆ the number of ɛ-transitions;

Report for each of the weighted automata obtained ˆ the number of states; ˆ the number of ɛ-transitions; Mehryar Mohri Speech Recognition Courant Institute of Mathematical Sciences Homework assignment 3 (Solution) Part 2, 3 written by David Alvarez 1. For this question, it is recommended that you use the

More information

Speech Recognition CSCI-GA Fall Homework Assignment #1: Automata Operations. Instructor: Eugene Weinstein. Due Date: October 17th

Speech Recognition CSCI-GA Fall Homework Assignment #1: Automata Operations. Instructor: Eugene Weinstein. Due Date: October 17th Speech Recognition CSCI-GA.3033-015 Fall 2013 Homework Assignment #1: Automata Operations Instructor: Eugene Weinstein Due Date: October 17th Note: It is advised, but not required, to use the OpenFST library

More information

Improving the Performance of Text Categorization using N-gram Kernels

Improving the Performance of Text Categorization using N-gram Kernels Improving the Performance of Text Categorization using N-gram Kernels Varsha K. V *., Santhosh Kumar C., Reghu Raj P. C. * * Department of Computer Science and Engineering Govt. Engineering College, Palakkad,

More information

The Design Principles of a Weighted Finite-State Transducer Library

The Design Principles of a Weighted Finite-State Transducer Library The Design Principles of a Weighted Finite-State Transducer Library Mehryar Mohri 1, Fernando Pereira and Michael Riley AT&T Labs Research, 180 Park Avenue, Florham Park, NJ 0793-0971 Abstract We describe

More information

Handbook of Weighted Automata

Handbook of Weighted Automata Manfred Droste Werner Kuich Heiko Vogler Editors Handbook of Weighted Automata 4.1 Springer Contents Part I Foundations Chapter 1: Semirings and Formal Power Series Manfred Droste and Werner Kuich 3 1

More information

CS 314 Principles of Programming Languages. Lecture 3

CS 314 Principles of Programming Languages. Lecture 3 CS 314 Principles of Programming Languages Lecture 3 Zheng Zhang Department of Computer Science Rutgers University Wednesday 14 th September, 2016 Zheng Zhang 1 CS@Rutgers University Class Information

More information

CIS-331 Spring 2016 Exam 1 Name: Total of 109 Points Version 1

CIS-331 Spring 2016 Exam 1 Name: Total of 109 Points Version 1 Version 1 Instructions Write your name on the exam paper. Write your name and version number on the top of the yellow paper. Answer Question 1 on the exam paper. Answer Questions 2-4 on the yellow paper.

More information

The Design Principles and Algorithms of a Weighted Grammar Library. and. and. 1. Introduction

The Design Principles and Algorithms of a Weighted Grammar Library. and. and. 1. Introduction International Journal of Foundations of Computer Science c World Scientific Publishing Company The Design Principles and Algorithms of a Weighted Grammar Library CYRIL ALLAUZEN allauzen@research.att.com

More information

Ping-pong decoding Combining forward and backward search

Ping-pong decoding Combining forward and backward search Combining forward and backward search Research Internship 09/ - /0/0 Mirko Hannemann Microsoft Research, Speech Technology (Redmond) Supervisor: Daniel Povey /0/0 Mirko Hannemann / Beam Search Search Errors

More information

A Rational Design for a Weighted Finite-State Transducer Library

A Rational Design for a Weighted Finite-State Transducer Library A Rational Design for a Weighted Finite-State Transducer Library Mehryar Mohri, Fernando Pereira, and Michael Riley AT&T Labs -- Research 180 Park Avenue, Florham Park, NJ 07932-0971 1 Overview We describe

More information

JNTUWORLD. Code No: R

JNTUWORLD. Code No: R Code No: R09220504 R09 SET-1 B.Tech II Year - II Semester Examinations, April-May, 2012 FORMAL LANGUAGES AND AUTOMATA THEORY (Computer Science and Engineering) Time: 3 hours Max. Marks: 75 Answer any five

More information

Applications of Lexicographic Semirings to Problems in Speech and Language Processing

Applications of Lexicographic Semirings to Problems in Speech and Language Processing Applications of Lexicographic Semirings to Problems in Speech and Language Processing Richard Sproat Google, Inc. Izhak Shafran Oregon Health & Science University Mahsa Yarmohammadi Oregon Health & Science

More information

Finite Automata. Dr. Nadeem Akhtar. Assistant Professor Department of Computer Science & IT The Islamia University of Bahawalpur

Finite Automata. Dr. Nadeem Akhtar. Assistant Professor Department of Computer Science & IT The Islamia University of Bahawalpur Finite Automata Dr. Nadeem Akhtar Assistant Professor Department of Computer Science & IT The Islamia University of Bahawalpur PhD Laboratory IRISA-UBS University of South Brittany European University

More information

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou COMP-421 Compiler Design Presented by Dr Ioanna Dionysiou Administrative! [ALSU03] Chapter 3 - Lexical Analysis Sections 3.1-3.4, 3.6-3.7! Reading for next time [ALSU03] Chapter 3 Copyright (c) 2010 Ioanna

More information

Suggesting Edits to Explain Failing Traces

Suggesting Edits to Explain Failing Traces Suggesting Edits to Explain Failing Traces Giles Reger University of Manchester, UK Abstract. Runtime verification involves checking whether an execution trace produced by a running system satisfies a

More information

Multiple Choice Questions

Multiple Choice Questions Techno India Batanagar Computer Science and Engineering Model Questions Subject Name: Formal Language and Automata Theory Subject Code: CS 402 Multiple Choice Questions 1. The basic limitation of an FSM

More information

Regular Languages. MACM 300 Formal Languages and Automata. Formal Languages: Recap. Regular Languages

Regular Languages. MACM 300 Formal Languages and Automata. Formal Languages: Recap. Regular Languages Regular Languages MACM 3 Formal Languages and Automata Anoop Sarkar http://www.cs.sfu.ca/~anoop The set of regular languages: each element is a regular language Each regular language is an example of a

More information

CIS-331 Fall 2014 Exam 1 Name: Total of 109 Points Version 1

CIS-331 Fall 2014 Exam 1 Name: Total of 109 Points Version 1 Version 1 1. (24 Points) Show the routing tables for routers A, B, C, and D. Make sure you account for traffic to the Internet. Router A Router B Router C Router D Network Next Hop Next Hop Next Hop Next

More information

CIS-331 Fall 2013 Exam 1 Name: Total of 120 Points Version 1

CIS-331 Fall 2013 Exam 1 Name: Total of 120 Points Version 1 Version 1 1. (24 Points) Show the routing tables for routers A, B, C, and D. Make sure you account for traffic to the Internet. NOTE: Router E should only be used for Internet traffic. Router A Router

More information

An Ecient Compiler for Weighted Rewrite Rules

An Ecient Compiler for Weighted Rewrite Rules An Ecient Compiler for Weighted Rewrite Rules Mehryar Mohri AT&T Research 600 Mountain Avenue Murray Hill, 07974 NJ mohri@research.att.com Abstract Context-dependent rewrite rules are used in many areas

More information

Homework #1: CMPT-825 Reading: fsmtools/fsm/ Anoop Sarkar

Homework #1: CMPT-825 Reading:   fsmtools/fsm/ Anoop Sarkar Homework #: CMPT-825 Reading: http://www.research.att.com/ fsmtools/fsm/ Anoop Sarkar anoop@cs.sfu.ca () Machine (Back) Transliteration Languages have different sound inventories. When translating from

More information

CIS-331 Final Exam Spring 2015 Total of 115 Points. Version 1

CIS-331 Final Exam Spring 2015 Total of 115 Points. Version 1 Version 1 1. (25 Points) Given that a frame is formatted as follows: And given that a datagram is formatted as follows: And given that a TCP segment is formatted as follows: Assuming no options are present

More information

CIS-331 Final Exam Spring 2018 Total of 120 Points. Version 1

CIS-331 Final Exam Spring 2018 Total of 120 Points. Version 1 Version 1 Instructions 1. Write your name and version number on the top of the yellow paper and the routing tables sheet. 2. Answer Question 2 on the routing tables sheet. 3. Answer Questions 1, 3, 4,

More information

FSA: An Efficient and Flexible C++ Toolkit for Finite State Automata Using On-Demand Computation

FSA: An Efficient and Flexible C++ Toolkit for Finite State Automata Using On-Demand Computation FSA: An Efficient and Flexible C++ Toolkit for Finite State Automata Using On-Demand Computation Stephan Kanthak and Hermann Ney Lehrstuhl für Informatik VI, Computer Science Department RWTH Aachen University

More information

HKN CS 374 Midterm 1 Review. Tim Klem Noah Mathes Mahir Morshed

HKN CS 374 Midterm 1 Review. Tim Klem Noah Mathes Mahir Morshed HKN CS 374 Midterm 1 Review Tim Klem Noah Mathes Mahir Morshed Midterm topics It s all about recognizing sets of strings! 1. String Induction 2. Regular languages a. DFA b. NFA c. Regular expressions 3.

More information

CS 314 Principles of Programming Languages

CS 314 Principles of Programming Languages CS 314 Principles of Programming Languages Lecture 2: Syntax Analysis Zheng (Eddy) Zhang Rutgers University January 22, 2018 Announcement First recitation starts this Wednesday Homework 1 will be release

More information

Pynini: A Python library for weighted finite-state grammar compilation

Pynini: A Python library for weighted finite-state grammar compilation Pynini: A Python library for weighted finite-state grammar compilation Kyle Gorman Google, Inc. 111 8th Avenue, New York, NY 10011 Abstract We present Pynini, an open-source library for the compilation

More information

Course Project 2 Regular Expressions

Course Project 2 Regular Expressions Course Project 2 Regular Expressions CSE 30151 Spring 2017 Version of February 16, 2017 In this project, you ll write a regular expression matcher similar to grep, called mere (for match and echo using

More information

CS314: FORMAL LANGUAGES AND AUTOMATA THEORY L. NADA ALZABEN. Lecture 1: Introduction

CS314: FORMAL LANGUAGES AND AUTOMATA THEORY L. NADA ALZABEN. Lecture 1: Introduction CS314: FORMAL LANGUAGES AND AUTOMATA THEORY L. NADA ALZABEN Lecture 1: Introduction Introduction to the course 2 Required Text Book: INTRODUCTION TO THE THEORY OF COMPUTATION, SECOND EDITION, BY MICHAEL

More information

Ambiguous Grammars and Compactification

Ambiguous Grammars and Compactification Ambiguous Grammars and Compactification Mridul Aanjaneya Stanford University July 17, 2012 Mridul Aanjaneya Automata Theory 1/ 44 Midterm Review Mathematical Induction and Pigeonhole Principle Finite Automata

More information

Implementation of replace rules using preference operator

Implementation of replace rules using preference operator Implementation of replace rules using preference operator Senka Drobac, Miikka Silfverberg, and Anssi Yli-Jyrä University of Helsinki Department of Modern Languages Unioninkatu 40 A FI-00014 Helsingin

More information

CIS-331 Exam 2 Fall 2015 Total of 105 Points Version 1

CIS-331 Exam 2 Fall 2015 Total of 105 Points Version 1 Version 1 1. (20 Points) Given the class A network address 117.0.0.0 will be divided into multiple subnets. a. (5 Points) How many bits will be necessary to address 4,000 subnets? b. (5 Points) What is

More information

International Journal of Advance Research in Computer Science and Management Studies

International Journal of Advance Research in Computer Science and Management Studies Volume 3, Issue 12, December 2015 ISSN: 2321 7782 (Online) International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online

More information

Lexical Analysis. Implementation: Finite Automata

Lexical Analysis. Implementation: Finite Automata Lexical Analysis Implementation: Finite Automata Outline Specifying lexical structure using regular expressions Finite automata Deterministic Finite Automata (DFAs) Non-deterministic Finite Automata (NFAs)

More information

1. Draw the state graphs for the finite automata which accept sets of strings composed of zeros and ones which:

1. Draw the state graphs for the finite automata which accept sets of strings composed of zeros and ones which: P R O B L E M S Finite Autom ata. Draw the state graphs for the finite automata which accept sets of strings composed of zeros and ones which: a) Are a multiple of three in length. b) End with the string

More information

Talen en Compilers. Johan Jeuring , period 2. January 17, Department of Information and Computing Sciences Utrecht University

Talen en Compilers. Johan Jeuring , period 2. January 17, Department of Information and Computing Sciences Utrecht University Talen en Compilers 2015-2016, period 2 Johan Jeuring Department of Information and Computing Sciences Utrecht University January 17, 2016 13. LR parsing 13-1 This lecture LR parsing Basic idea The LR(0)

More information

CS 125 Section #10 Midterm 2 Review 11/5/14

CS 125 Section #10 Midterm 2 Review 11/5/14 CS 125 Section #10 Midterm 2 Review 11/5/14 1 Topics Covered This midterm covers up through NP-completeness; countability, decidability, and recognizability will not appear on this midterm. Disclaimer:

More information

Overview. Search and Decoding. HMM Speech Recognition. The Search Problem in ASR (1) Today s lecture. Steve Renals

Overview. Search and Decoding. HMM Speech Recognition. The Search Problem in ASR (1) Today s lecture. Steve Renals Overview Search and Decoding Steve Renals Automatic Speech Recognition ASR Lecture 10 January - March 2012 Today s lecture Search in (large vocabulary) speech recognition Viterbi decoding Approximate search

More information

Chapter 18: Decidability

Chapter 18: Decidability Chapter 18: Decidability Peter Cappello Department of Computer Science University of California, Santa Barbara Santa Barbara, CA 93106 cappello@cs.ucsb.edu Please read the corresponding chapter before

More information

Unit-IV Boolean Algebra

Unit-IV Boolean Algebra Unit-IV Boolean Algebra Boolean Algebra Chapter: 08 Truth table: Truth table is a table, which represents all the possible values of logical variables/statements along with all the possible results of

More information

Performance analysis in large C++ projects

Performance analysis in large C++ projects Performance analysis in large C++ projects Florent D HALLUIN EPITA Research and Development Laboratory May 13th, 2009 Florent D HALLUIN CSI Seminar 2009 - Performance analysis

More information

Lecture 9. LVCSR Decoding (cont d) and Robustness. Michael Picheny, Bhuvana Ramabhadran, Stanley F. Chen

Lecture 9. LVCSR Decoding (cont d) and Robustness. Michael Picheny, Bhuvana Ramabhadran, Stanley F. Chen Lecture 9 LVCSR Decoding (cont d) and Robustness Michael Picheny, huvana Ramabhadran, Stanley F. Chen IM T.J. Watson Research Center Yorktown Heights, New York, USA {picheny,bhuvana,stanchen}@us.ibm.com

More information

MIT Specifying Languages with Regular Expressions and Context-Free Grammars

MIT Specifying Languages with Regular Expressions and Context-Free Grammars MIT 6.035 Specifying Languages with Regular essions and Context-Free Grammars Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology Language Definition Problem How to precisely

More information

Automata Theory TEST 1 Answers Max points: 156 Grade basis: 150 Median grade: 81%

Automata Theory TEST 1 Answers Max points: 156 Grade basis: 150 Median grade: 81% Automata Theory TEST 1 Answers Max points: 156 Grade basis: 150 Median grade: 81% 1. (2 pts) See text. You can t be sloppy defining terms like this. You must show a bijection between the natural numbers

More information

QUESTION BANK. Unit 1. Introduction to Finite Automata

QUESTION BANK. Unit 1. Introduction to Finite Automata QUESTION BANK Unit 1 Introduction to Finite Automata 1. Obtain DFAs to accept strings of a s and b s having exactly one a.(5m )(Jun-Jul 10) 2. Obtain a DFA to accept strings of a s and b s having even

More information

2.6 BOOLEAN FUNCTIONS

2.6 BOOLEAN FUNCTIONS 2.6 BOOLEAN FUNCTIONS Binary variables have two values, either 0 or 1. A Boolean function is an expression formed with binary variables, the two binary operators AND and OR, one unary operator NOT, parentheses

More information

CS308 Compiler Principles Lexical Analyzer Li Jiang

CS308 Compiler Principles Lexical Analyzer Li Jiang CS308 Lexical Analyzer Li Jiang Department of Computer Science and Engineering Shanghai Jiao Tong University Content: Outline Basic concepts: pattern, lexeme, and token. Operations on languages, and regular

More information

On Generalizing Rough Set Theory

On Generalizing Rough Set Theory On Generalizing Rough Set Theory Y.Y. Yao Department of Computer Science, University of Regina Regina, Saskatchewan, Canada S4S 0A2 E-mail: yyao@cs.uregina.ca Abstract. This paper summarizes various formulations

More information

MIT Specifying Languages with Regular Expressions and Context-Free Grammars. Martin Rinard Massachusetts Institute of Technology

MIT Specifying Languages with Regular Expressions and Context-Free Grammars. Martin Rinard Massachusetts Institute of Technology MIT 6.035 Specifying Languages with Regular essions and Context-Free Grammars Martin Rinard Massachusetts Institute of Technology Language Definition Problem How to precisely define language Layered structure

More information

Formal Languages and Automata

Formal Languages and Automata Mobile Computing and Software Engineering p. 1/3 Formal Languages and Automata Chapter 3 Regular languages and Regular Grammars Chuan-Ming Liu cmliu@csie.ntut.edu.tw Department of Computer Science and

More information

X Y Z F=X+Y+Z

X Y Z F=X+Y+Z This circuit is used to obtain the compliment of a value. If X = 0, then X = 1. The truth table for NOT gate is : X X 0 1 1 0 2. OR gate : The OR gate has two or more input signals but only one output

More information

CIS-331 Final Exam Spring 2016 Total of 120 Points. Version 1

CIS-331 Final Exam Spring 2016 Total of 120 Points. Version 1 Version 1 1. (25 Points) Given that a frame is formatted as follows: And given that a datagram is formatted as follows: And given that a TCP segment is formatted as follows: Assuming no options are present

More information

A Flexible XML-based Regular Compiler for Creation and Conversion of Linguistic Resources

A Flexible XML-based Regular Compiler for Creation and Conversion of Linguistic Resources A Flexible XML-based Regular Compiler for Creation and Conversion of Linguistic Resources Jakub Piskorski,, Oliver Scherf, Feiyu Xu DFKI German Research Center for Artificial Intelligence Stuhlsatzenhausweg

More information

4. Specifications and Additional Information

4. Specifications and Additional Information 4. Specifications and Additional Information AGX52004-1.0 8B/10B Code This section provides information about the data and control codes for Arria GX devices. Code Notation The 8B/10B data and control

More information

Finite Math - J-term Homework. Section Inverse of a Square Matrix

Finite Math - J-term Homework. Section Inverse of a Square Matrix Section.5-77, 78, 79, 80 Finite Math - J-term 017 Lecture Notes - 1/19/017 Homework Section.6-9, 1, 1, 15, 17, 18, 1, 6, 9, 3, 37, 39, 1,, 5, 6, 55 Section 5.1-9, 11, 1, 13, 1, 17, 9, 30 Section.5 - Inverse

More information

CS Lecture 2. The Front End. Lecture 2 Lexical Analysis

CS Lecture 2. The Front End. Lecture 2 Lexical Analysis CS 1622 Lecture 2 Lexical Analysis CS 1622 Lecture 2 1 Lecture 2 Review of last lecture and finish up overview The first compiler phase: lexical analysis Reading: Chapter 2 in text (by 1/18) CS 1622 Lecture

More information

CS402 Theory of Automata Solved Subjective From Midterm Papers. MIDTERM SPRING 2012 CS402 Theory of Automata

CS402 Theory of Automata Solved Subjective From Midterm Papers. MIDTERM SPRING 2012 CS402 Theory of Automata Solved Subjective From Midterm Papers Dec 07,2012 MC100401285 Moaaz.pk@gmail.com Mc100401285@gmail.com PSMD01 MIDTERM SPRING 2012 Q. Point of Kleen Theory. Answer:- (Page 25) 1. If a language can be accepted

More information

Digital Logic Design (CEN-120) (3+1)

Digital Logic Design (CEN-120) (3+1) Digital Logic Design (CEN-120) (3+1) ASSISTANT PROFESSOR Engr. Syed Rizwan Ali, MS(CAAD)UK, PDG(CS)UK, PGD(PM)IR, BS(CE)PK HEC Certified Master Trainer (MT-FPDP) PEC Certified Professional Engineer (COM/2531)

More information

This document is to be used together with N2285 and N2281.

This document is to be used together with N2285 and N2281. ISO/IEC JTC1/SC2/WG2 N2291 2000-09-25 Universal Multiple-Octet Coded Character Set International Organization for Standardization Organisation internationale de normalisation еждународная организация по

More information

DKT 122/3 DIGITAL SYSTEM 1

DKT 122/3 DIGITAL SYSTEM 1 Company LOGO DKT 122/3 DIGITAL SYSTEM 1 BOOLEAN ALGEBRA (PART 2) Boolean Algebra Contents Boolean Operations & Expression Laws & Rules of Boolean algebra DeMorgan s Theorems Boolean analysis of logic circuits

More information

CS6160 Theory of Computation Problem Set 2 Department of Computer Science, University of Virginia

CS6160 Theory of Computation Problem Set 2 Department of Computer Science, University of Virginia CS6160 Theory of Computation Problem Set 2 Department of Computer Science, University of Virginia Gabriel Robins Please start solving these problems immediately, and work in study groups. Please prove

More information

Institut für Informatik D Augsburg

Institut für Informatik D Augsburg Universität Augsburg The Confidence-Probability Semiring implemented within OpenFst Markus Huber and Christian Kölbl Report 2011-16 March 2012 Institut für Informatik D-86135 Augsburg Copyright c Markus

More information

Formal Languages and Compilers Lecture IV: Regular Languages and Finite. Finite Automata

Formal Languages and Compilers Lecture IV: Regular Languages and Finite. Finite Automata Formal Languages and Compilers Lecture IV: Regular Languages and Finite Automata Free University of Bozen-Bolzano Faculty of Computer Science POS Building, Room: 2.03 artale@inf.unibz.it http://www.inf.unibz.it/

More information

Dr. D.M. Akbar Hussain

Dr. D.M. Akbar Hussain 1 2 Compiler Construction F6S Lecture - 2 1 3 4 Compiler Construction F6S Lecture - 2 2 5 #include.. #include main() { char in; in = getch ( ); if ( isalpha (in) ) in = getch ( ); else error (); while

More information

Chapter Seven: Regular Expressions. Formal Language, chapter 7, slide 1

Chapter Seven: Regular Expressions. Formal Language, chapter 7, slide 1 Chapter Seven: Regular Expressions Formal Language, chapter 7, slide The first time a young student sees the mathematical constant π, it looks like just one more school artifact: one more arbitrary symbol

More information

CIS-331 Exam 2 Fall 2014 Total of 105 Points. Version 1

CIS-331 Exam 2 Fall 2014 Total of 105 Points. Version 1 Version 1 1. (20 Points) Given the class A network address 119.0.0.0 will be divided into a maximum of 15,900 subnets. a. (5 Points) How many bits will be necessary to address the 15,900 subnets? b. (5

More information

1. [5 points each] True or False. If the question is currently open, write O or Open.

1. [5 points each] True or False. If the question is currently open, write O or Open. University of Nevada, Las Vegas Computer Science 456/656 Spring 2018 Practice for the Final on May 9, 2018 The entire examination is 775 points. The real final will be much shorter. Name: No books, notes,

More information

SYNERGY INSTITUTE OF ENGINEERING & TECHNOLOGY,DHENKANAL LECTURE NOTES ON DIGITAL ELECTRONICS CIRCUIT(SUBJECT CODE:PCEC4202)

SYNERGY INSTITUTE OF ENGINEERING & TECHNOLOGY,DHENKANAL LECTURE NOTES ON DIGITAL ELECTRONICS CIRCUIT(SUBJECT CODE:PCEC4202) Lecture No:5 Boolean Expressions and Definitions Boolean Algebra Boolean Algebra is used to analyze and simplify the digital (logic) circuits. It uses only the binary numbers i.e. 0 and 1. It is also called

More information

6.1 Combinational Circuits. George Boole ( ) Claude Shannon ( )

6.1 Combinational Circuits. George Boole ( ) Claude Shannon ( ) 6. Combinational Circuits George Boole (85 864) Claude Shannon (96 2) Signals and Wires Digital signals Binary (or logical ) values: or, on or off, high or low voltage Wires. Propagate digital signals

More information

Algebra 1 Review. Properties of Real Numbers. Algebraic Expressions

Algebra 1 Review. Properties of Real Numbers. Algebraic Expressions Algebra 1 Review Properties of Real Numbers Algebraic Expressions Real Numbers Natural Numbers: 1, 2, 3, 4,.. Numbers used for counting Whole Numbers: 0, 1, 2, 3, 4,.. Natural Numbers and 0 Integers:,

More information

Thompson groups, Cantor space, and foldings of de Bruijn graphs. Peter J. Cameron University of St Andrews

Thompson groups, Cantor space, and foldings of de Bruijn graphs. Peter J. Cameron University of St Andrews Thompson groups, Cantor space, and foldings of de Bruijn graphs Peter J Cameron University of St Andrews Breaking the boundaries University of Sussex April 25 The 97s Three groups I was born (in Paul Erdős

More information

Last lecture CMSC330. This lecture. Finite Automata: States. Finite Automata. Implementing Regular Expressions. Languages. Regular expressions

Last lecture CMSC330. This lecture. Finite Automata: States. Finite Automata. Implementing Regular Expressions. Languages. Regular expressions Last lecture CMSC330 Finite Automata Languages Sets of strings Operations on languages Regular expressions Constants Operators Precedence 1 2 Finite automata States Transitions Examples Types This lecture

More information

CS 61C Summer 2012 Discussion 11 State, Timing, and CPU (Solutions)

CS 61C Summer 2012 Discussion 11 State, Timing, and CPU (Solutions) State Elements State elements provide a means of storing values, and controlling the flow of information in the circuit. The most basic state element (we re concerned with) is a DQ Flip-Flop: D is a single

More information

Question Bank. 10CS63:Compiler Design

Question Bank. 10CS63:Compiler Design Question Bank 10CS63:Compiler Design 1.Determine whether the following regular expressions define the same language? (ab)* and a*b* 2.List the properties of an operator grammar 3. Is macro processing a

More information

CSE450. Translation of Programming Languages. Lecture 20: Automata and Regular Expressions

CSE450. Translation of Programming Languages. Lecture 20: Automata and Regular Expressions CSE45 Translation of Programming Languages Lecture 2: Automata and Regular Expressions Finite Automata Regular Expression = Specification Finite Automata = Implementation A finite automaton consists of:

More information

Front End: Lexical Analysis. The Structure of a Compiler

Front End: Lexical Analysis. The Structure of a Compiler Front End: Lexical Analysis The Structure of a Compiler Constructing a Lexical Analyser By hand: Identify lexemes in input and return tokens Automatically: Lexical-Analyser generator We will learn about

More information

Preferentially Annotated Regular Path Queries

Preferentially Annotated Regular Path Queries Preferentially Annotated Regular Path Queries Gosta Grahne Concordia U. Canada Alex Thomo U. of Victoria Canada Bill Wadge U. of Victoria Canada Regular Path Queries (RPQ s) Essentially regular expressions

More information

Theory of Computation Prof. Raghunath Tewari Department of Computer Science and Engineering Indian Institute of Technology, Kanpur

Theory of Computation Prof. Raghunath Tewari Department of Computer Science and Engineering Indian Institute of Technology, Kanpur Theory of Computation Prof. Raghunath Tewari Department of Computer Science and Engineering Indian Institute of Technology, Kanpur Lecture 01 Introduction to Finite Automata Welcome everybody. This is

More information

Deciding sequentiability of nite-state transducers by nite-state pattern-matching

Deciding sequentiability of nite-state transducers by nite-state pattern-matching Theoretical Computer Science 313 (2004) 105 117 www.elsevier.com/locate/tcs Deciding sequentiability of nite-state transducers by nite-state pattern-matching Tamas Gaal Xerox Research Centre Europe, Grenoble

More information

Implementation of transducers in Vaucanson

Implementation of transducers in Vaucanson Implementation of transducers in Vaucanson Sarah O Connor LRDE seminar, June 23, 2004 http://vaucanson.lrde.epita.fr/ Copying this document Copying this document Copyright

More information

NONDETERMINISTIC MOORE

NONDETERMINISTIC MOORE NONDETERMINISTIC MOORE AUTOMATA AND BRZOZOWSKI'S ALGORITHM G. Castiglione, A. Restivo, M. Sciortino University of Palermo Workshop PRIN Varese, 5-7 Settembre 2011 SUMMARY A class of nondeterministic Moore

More information

Discriminative Training of Decoding Graphs for Large Vocabulary Continuous Speech Recognition

Discriminative Training of Decoding Graphs for Large Vocabulary Continuous Speech Recognition Discriminative Training of Decoding Graphs for Large Vocabulary Continuous Speech Recognition by Hong-Kwang Jeff Kuo, Brian Kingsbury (IBM Research) and Geoffry Zweig (Microsoft Research) ICASSP 2007 Presented

More information

Lexical Analysis - 2

Lexical Analysis - 2 Lexical Analysis - 2 More regular expressions Finite Automata NFAs and DFAs Scanners JLex - a scanner generator 1 Regular Expressions in JLex Symbol - Meaning. Matches a single character (not newline)

More information

Zhizheng Zhang. Southeast University

Zhizheng Zhang. Southeast University Zhizheng Zhang Southeast University 2016/10/5 Lexical Analysis 1 1. The Role of Lexical Analyzer 2016/10/5 Lexical Analysis 2 2016/10/5 Lexical Analysis 3 Example. position = initial + rate * 60 2016/10/5

More information

Theory of Computations III. WMU CS-6800 Spring -2014

Theory of Computations III. WMU CS-6800 Spring -2014 Theory of Computations III WMU CS-6800 Spring -2014 Markov Algorithm (MA) By Ahmed Al-Gburi & Barzan Shekh 2014 Outline Introduction How to execute a MA Schema Examples Formal Definition Formal Algorithm

More information

Sets 1. The things in a set are called the elements of it. If x is an element of the set S, we say

Sets 1. The things in a set are called the elements of it. If x is an element of the set S, we say Sets 1 Where does mathematics start? What are the ideas which come first, in a logical sense, and form the foundation for everything else? Can we get a very small number of basic ideas? Can we reduce it

More information

String Analysis. Abstract

String Analysis. Abstract String Analysis Fang Yu Marco Cova Abstract String analysis is a static analysis technique that determines the string values that a variable can hold at specific points in a program. This information is

More information