Regular Expression Constrained Sequence Alignment

Size: px
Start display at page:

Download "Regular Expression Constrained Sequence Alignment"

Transcription

1 Regular Expression Constrained Sequence Alignment By Abdullah N. Arslan Department of Computer science University of Vermont Presented by Tomer Heber & Raz Nissim

2 otivation When comparing two proteins, it may be important to take into account a common specific structure. Families of similar protein sequences include a conserved region called a motif. The PROSITE database is a collection of these motifs, represented as regular expressions. Our problem concentrates on sequence alignment of proteins that contain these motifs and need to be aligned according to them.

3 Formalization of the problem The Optimal Sequence Alignment Problem: String Alignment: Given two strings s and t, a global alignment is obtained by inserting spaces into s and t so that the characters of the resulting strings can be put in one-to-one correspondence to each other. H - A S K E L L P A A - C A - L * Two characters correspond/match each other. The Optimal Sequence Alignment is a string alignment that has the maximum number of characters that correspond to each other.

4 Example: Given two strings: S1 = TGFPSVGKTKDDA S2 = TFSVAKDDDGKSA The Optimal Sequence Alignment for this two strings is 8. *By adding the gaps to both strings, 8 characters correspond to each other in both strings: T,F,S,V,K,D,D,A. There are no alignments of 9 or more corresponding characters, thus this is the optimal solution.

5 Regular Expression Constrained Sequence Alignment (RECSA) Definition: Given strings S1, S2 and the regular expression R, RECSA is the problem of finding the optimal global sequence alignment between S1 and S2 where there exists s1 subset of S1 and s2 subset of S2 such that the following constraints are satisfied: 1. Substring s1 is aligned with substring s2 (local alignment). 2. Both s1 and s2 match the expression R, s1, s2 L( R).

6 Example: Given two strings and a regular expression: S1 = TGFPSVGKTKDDA S2 = TFSVAKDDDGKSA The Optimal global sequence alignment Constrained by the regular expression R is 4: *s1 = GFPSVGKT is a substring of S1, and s2 = AKDDDGKS is a substring of S2. s1 and s2 are aligned with each other and both match the regular expression R. 4 characters correspond/match to each other in strings S1 and S2.

7 Needleman Wunsch algorithm This algorithm is used for computing the maximum global score given two strings of size n and m in O(nm).

8 Needleman Wunsch algorithm Continued i, j = The optimal global alignment score of the substrings S 1.. i], S [1.. ]. 1[ 2 j Insertion or Deletion are referred to as gaps. ( S 1 [ i] ) = The price of adding a gap to S2. ( S = The price of adding a gap to S1. 2[ j]) S [ i] S [ ]) = The price of a match or a mismatch of a character. ( 1 2 i Question: Can we improve the runtime for finding an optimal global alignment? Hint: LCS and optimal sequence alignment problems are equivalent.

9 Reminder Given two strings S1, S2 and a regular expression R, we want to find two substrings s1 and s2 (substrings of S1 and S2) that uphold these constraints: 1. s1 is aligned with s2. 2. Both s1 and s2 match the regular expression R. Previous example:

10 Review of Automata (State achines) An automaton is represented by the 5-tuple, where: Q - Set of states. - Finite set of symbols, that we will call the alphabet of the language the automaton accepts. δ - Transition function, that is q 0 - Start state, that is, the state in which the automaton is when no input has been processed yet (Obviously, q 0 Q). F - Set of states of Q (i.e. F Q), called accept states.

11 Review of Automata (cont.) There exists a strong equivalence: for every regular language, there is a finite state automaton (DFA or NFA), and vice versa. Example- This automaton accepts the regular language: A (C + G)* (S + T)

12 Example of N x N multiplication automaton :

13 Constructing our Weighted Finite Automaton We construct from a given regular expression R in several steps: (1) First, given a regular expression R we construct a nondeterministic finite automaton N=(Q,Σ,δ,q0,F), with no ε- moves, such that L(N)= L(R). N accepts only the set of strings described by the regular expression R.

14 Constructing our Weighted Finite Automaton (cont.) (2) We define a weighted N N automaton as the finite automaton = (Q,W,Σ,q0,F ) which we construct as follows: Q = Q Q is the set of states. Each state of corresponds to a pair of states in N. remembers in each state what part of the regular expression has been seen in S1 and S2. W : Q is a function that assigns real weights to each state in Q. Initially all weights are -.

15 Step 2 (cont.) Σ = (Σ x Σ) \ {ε ε}. The alphabet for is the set of edit operations which does not include ε ε. q0 = (q0, q0) is the start state whose weight is 0 and always stays 0. F = (F x F). Is the set of final states. If is in a final state then has processed an alignment that satisfies the regular expression constraint (there are substrings s1 of S1 and s2 of S2 that are aligned and both s1 and s2 take N to final states.

16 Step 2 The Transition Function δ : Σ x Q Q. moves on edit operations: Once an alignment satisfies the regular expression constraint, i.e. once a final state is reached in, the rest of the alignment does not alter the satisfaction of the constraint. ( has the option of staying in a final state on any input after that final state is reached)

17 The main idea

18 Denotations For any given weighted N N automaton we x y denote by for any X Y a copy of the automaton after making the move on X Y. Given two weighted N N automata 1 and 2, we define a commutative and associative operation max- such that max-{1, 2} is a weighted N N automaton with state weights calculated as follows: For all (p, q) Q W (p, q) = max{ 1 2 W (p, q), W (p, q) }.

19 Denotations (cont.) We denote by S[i..j] the substring of S from positions i to j, i j. Let S[i] denote the i-th symbol of string S. We denote by i, j the optimally weighted automaton for S1[1 i] and S2[1 j].

20 The Algorithm Let S1 = n, S2 = m with n m, and let N be a non-deterministic automaton with no - moves equivalent to regular expression R, and let be a weighted N N automaton constructed from N as we described. For all i, j, 0 i n, 0 j m,both i,0 and 0, j are identical weighted N N automaton whose state-weights are all (except for the weight of the start state (q0,q0) which is always 0).

21 Algorithm (cont.) Optimal automata formula: i, j are computed with this 1[ ] S i i is the copy of the optimally weighted 1, j automaton after making the move S [ i ] 1 i 1, j. (we ll see an example later on )

22 Algorithm (cont.) After we finish calculating for all i, j, we ll i, j focus on n, m (the optimally weighted automaton for substrings s1=s1, s2=s2) We output the weight: n m n, max{ W ( p, q) ( p, q) F, m (the maximum of the weights of the accepting states in.) n, m }

23 Calculating the weights Upon making a move, these two steps must be followed:

24 Calculating the weights (example) V=80 q0,q1 a q1,q3 V=? ( a) 20 q1,q1 a V=90 Step 1: (q1,q3) gets the value : max{ W ( q0, q1) ( a), W ( q1, q1) ( a)} In this case, max{80-20, 90-20} = 70.

25 Calculating the weights (example cont.) V= - q0,q1 q1,q1 V= - b a b a q1,q3 V=240 ->? Step 2: If (p,q) is reachable with the move b->a only from parents that have a weight of -, its weight is updated to - as well. It is imperative that this step is done only after step 1 is completed, since some updates could be missed.

26 Optimally Weighted - definition i,j is optimally weighted for S1[1..i], and S2[1..j], if the following two properties hold:

27 Proof of correctness Claim: For all i,j, i, j computed by is optimally weighted. Proof: By induction on nodes (i,j)

28 Proof of correctness (cont.) Base case: i = 0, for all j the weights of are - (or 0). 0, j The two properties hold, and the claim is true. Assuming the claim is true for, i 1, j i, j 1 and i 1, j 1, we will show that the following automata are optimal:

29 Proof of correctness (Intuition) s values are computed by taking the i, j maximal of its parents values for each state, and adding to them the score of one edit action. Optimality of i, j follows from its parents assumption of optimality and since an optimal constrained alignment at node (i,j) uses one of these arcs, and we compute maximum scores for all possible optimal alignments.

30 Run-Time Analysis, Computing each i j takes time O(r) where r is the size of the transition function of. 4 * We note that r = O( t ) where t is the number of states in automaton N accepting the language 2 L(R). This is because has O( t ) states and the transition function of a nondeterministic automata is of size O(states^2) We need to compute O(nm) automata therefore we have a running time of O(nmr). and

31 Optimal alignment path reconstruction In order to reconstruct the optimal path, we can store for each active state of each automaton, the neighboring automaton and the state in that automaton from which the optimal score is obtained. This is possible by altering the ax- function. starting at in reverse. n, m, we generate the alignment path

32 Questions?? A detailed example will be uploaded for further reference. Thanks

Last lecture CMSC330. This lecture. Finite Automata: States. Finite Automata. Implementing Regular Expressions. Languages. Regular expressions

Last lecture CMSC330. This lecture. Finite Automata: States. Finite Automata. Implementing Regular Expressions. Languages. Regular expressions Last lecture CMSC330 Finite Automata Languages Sets of strings Operations on languages Regular expressions Constants Operators Precedence 1 2 Finite automata States Transitions Examples Types This lecture

More information

1. (10 points) Draw the state diagram of the DFA that recognizes the language over Σ = {0, 1}

1. (10 points) Draw the state diagram of the DFA that recognizes the language over Σ = {0, 1} CSE 5 Homework 2 Due: Monday October 6, 27 Instructions Upload a single file to Gradescope for each group. should be on each page of the submission. All group members names and PIDs Your assignments in

More information

6 NFA and Regular Expressions

6 NFA and Regular Expressions Formal Language and Automata Theory: CS21004 6 NFA and Regular Expressions 6.1 Nondeterministic Finite Automata A nondeterministic finite automata (NFA) is a 5-tuple where 1. is a finite set of states

More information

lec3:nondeterministic finite state automata

lec3:nondeterministic finite state automata lec3:nondeterministic finite state automata 1 1.introduction Nondeterminism is a useful concept that has great impact on the theory of computation. When the machine is in a given state and reads the next

More information

CS5371 Theory of Computation. Lecture 8: Automata Theory VI (PDA, PDA = CFG)

CS5371 Theory of Computation. Lecture 8: Automata Theory VI (PDA, PDA = CFG) CS5371 Theory of Computation Lecture 8: Automata Theory VI (PDA, PDA = CFG) Objectives Introduce Pushdown Automaton (PDA) Show that PDA = CFG In terms of descriptive power Pushdown Automaton (PDA) Roughly

More information

Finite automata. We have looked at using Lex to build a scanner on the basis of regular expressions.

Finite automata. We have looked at using Lex to build a scanner on the basis of regular expressions. Finite automata We have looked at using Lex to build a scanner on the basis of regular expressions. Now we begin to consider the results from automata theory that make Lex possible. Recall: An alphabet

More information

CMPSCI 250: Introduction to Computation. Lecture 20: Deterministic and Nondeterministic Finite Automata David Mix Barrington 16 April 2013

CMPSCI 250: Introduction to Computation. Lecture 20: Deterministic and Nondeterministic Finite Automata David Mix Barrington 16 April 2013 CMPSCI 250: Introduction to Computation Lecture 20: Deterministic and Nondeterministic Finite Automata David Mix Barrington 16 April 2013 Deterministic and Nondeterministic Finite Automata Deterministic

More information

Formal Languages and Compilers Lecture IV: Regular Languages and Finite. Finite Automata

Formal Languages and Compilers Lecture IV: Regular Languages and Finite. Finite Automata Formal Languages and Compilers Lecture IV: Regular Languages and Finite Automata Free University of Bozen-Bolzano Faculty of Computer Science POS Building, Room: 2.03 artale@inf.unibz.it http://www.inf.unibz.it/

More information

Front End: Lexical Analysis. The Structure of a Compiler

Front End: Lexical Analysis. The Structure of a Compiler Front End: Lexical Analysis The Structure of a Compiler Constructing a Lexical Analyser By hand: Identify lexemes in input and return tokens Automatically: Lexical-Analyser generator We will learn about

More information

1. Draw the state graphs for the finite automata which accept sets of strings composed of zeros and ones which:

1. Draw the state graphs for the finite automata which accept sets of strings composed of zeros and ones which: P R O B L E M S Finite Autom ata. Draw the state graphs for the finite automata which accept sets of strings composed of zeros and ones which: a) Are a multiple of three in length. b) End with the string

More information

Regular Languages and Regular Expressions

Regular Languages and Regular Expressions Regular Languages and Regular Expressions According to our definition, a language is regular if there exists a finite state automaton that accepts it. Therefore every regular language can be described

More information

CT32 COMPUTER NETWORKS DEC 2015

CT32 COMPUTER NETWORKS DEC 2015 Q.2 a. Using the principle of mathematical induction, prove that (10 (2n-1) +1) is divisible by 11 for all n N (8) Let P(n): (10 (2n-1) +1) is divisible by 11 For n = 1, the given expression becomes (10

More information

Finite Automata. Dr. Nadeem Akhtar. Assistant Professor Department of Computer Science & IT The Islamia University of Bahawalpur

Finite Automata. Dr. Nadeem Akhtar. Assistant Professor Department of Computer Science & IT The Islamia University of Bahawalpur Finite Automata Dr. Nadeem Akhtar Assistant Professor Department of Computer Science & IT The Islamia University of Bahawalpur PhD Laboratory IRISA-UBS University of South Brittany European University

More information

Lexical Analysis. Implementation: Finite Automata

Lexical Analysis. Implementation: Finite Automata Lexical Analysis Implementation: Finite Automata Outline Specifying lexical structure using regular expressions Finite automata Deterministic Finite Automata (DFAs) Non-deterministic Finite Automata (NFAs)

More information

CSE450. Translation of Programming Languages. Lecture 20: Automata and Regular Expressions

CSE450. Translation of Programming Languages. Lecture 20: Automata and Regular Expressions CSE45 Translation of Programming Languages Lecture 2: Automata and Regular Expressions Finite Automata Regular Expression = Specification Finite Automata = Implementation A finite automaton consists of:

More information

CS 310: State Transition Diagrams

CS 310: State Transition Diagrams CS 30: State Transition Diagrams Stefan D. Bruda Winter 207 STATE TRANSITION DIAGRAMS Finite directed graph Edges (transitions) labeled with symbols from an alphabet Nodes (states) labeled only for convenience

More information

NFAs and Myhill-Nerode. CS154 Chris Pollett Feb. 22, 2006.

NFAs and Myhill-Nerode. CS154 Chris Pollett Feb. 22, 2006. NFAs and Myhill-Nerode CS154 Chris Pollett Feb. 22, 2006. Outline Bonus Questions Equivalence with Finite Automata Myhill-Nerode Theorem. Bonus Questions These questions are open to anybody. I will only

More information

Ambiguous Grammars and Compactification

Ambiguous Grammars and Compactification Ambiguous Grammars and Compactification Mridul Aanjaneya Stanford University July 17, 2012 Mridul Aanjaneya Automata Theory 1/ 44 Midterm Review Mathematical Induction and Pigeonhole Principle Finite Automata

More information

ECS 120 Lesson 16 Turing Machines, Pt. 2

ECS 120 Lesson 16 Turing Machines, Pt. 2 ECS 120 Lesson 16 Turing Machines, Pt. 2 Oliver Kreylos Friday, May 4th, 2001 In the last lesson, we looked at Turing Machines, their differences to finite state machines and pushdown automata, and their

More information

Implementation of Lexical Analysis

Implementation of Lexical Analysis Implementation of Lexical Analysis Lecture 4 (Modified by Professor Vijay Ganesh) Tips on Building Large Systems KISS (Keep It Simple, Stupid!) Don t optimize prematurely Design systems that can be tested

More information

Implementation of Lexical Analysis

Implementation of Lexical Analysis Implementation of Lexical Analysis Outline Specifying lexical structure using regular expressions Finite automata Deterministic Finite Automata (DFAs) Non-deterministic Finite Automata (NFAs) Implementation

More information

Implementation of Lexical Analysis

Implementation of Lexical Analysis Implementation of Lexical Analysis Outline Specifying lexical structure using regular expressions Finite automata Deterministic Finite Automata (DFAs) Non-deterministic Finite Automata (NFAs) Implementation

More information

Nondeterministic Finite Automata (NFA): Nondeterministic Finite Automata (NFA) states of an automaton of this kind may or may not have a transition for each symbol in the alphabet, or can even have multiple

More information

Special course in Computer Science: Advanced Text Algorithms

Special course in Computer Science: Advanced Text Algorithms Special course in Computer Science: Advanced Text Algorithms Lecture 6: Alignments Elena Czeizler and Ion Petre Department of IT, Abo Akademi Computational Biomodelling Laboratory http://www.users.abo.fi/ipetre/textalg

More information

CSE450. Translation of Programming Languages. Automata, Simple Language Design Principles

CSE450. Translation of Programming Languages. Automata, Simple Language Design Principles CSE45 Translation of Programming Languages Automata, Simple Language Design Principles Finite Automata State Graphs A state: The start state: An accepting state: A transition: a A Simple Example A finite

More information

An improved algorithm for the regular expression constrained multiple sequence alignment problem

An improved algorithm for the regular expression constrained multiple sequence alignment problem An improved algorithm for the regular expression constrained multiple sequence alignment problem Abdullah N. Arslan and Dan He Department of Computer Science University of Vermont Burlington, VT 05405,

More information

Languages and Compilers

Languages and Compilers Principles of Software Engineering and Operational Systems Languages and Compilers SDAGE: Level I 2012-13 3. Formal Languages, Grammars and Automata Dr Valery Adzhiev vadzhiev@bournemouth.ac.uk Office:

More information

Formal Definition of Computation. Formal Definition of Computation p.1/28

Formal Definition of Computation. Formal Definition of Computation p.1/28 Formal Definition of Computation Formal Definition of Computation p.1/28 Computation model The model of computation considered so far is the work performed by a finite automaton Formal Definition of Computation

More information

Lexical Analysis. Chapter 2

Lexical Analysis. Chapter 2 Lexical Analysis Chapter 2 1 Outline Informal sketch of lexical analysis Identifies tokens in input string Issues in lexical analysis Lookahead Ambiguities Specifying lexers Regular expressions Examples

More information

Formal Languages and Automata

Formal Languages and Automata Mobile Computing and Software Engineering p. 1/3 Formal Languages and Automata Chapter 3 Regular languages and Regular Grammars Chuan-Ming Liu cmliu@csie.ntut.edu.tw Department of Computer Science and

More information

Mouse, Human, Chimpanzee

Mouse, Human, Chimpanzee More Alignments 1 Mouse, Human, Chimpanzee Mouse to Human Chimpanzee to Human 2 Mouse v.s. Human Chromosome X of Mouse to Human 3 Local Alignment Given: two sequences S and T Find: substrings of S and

More information

CS2 Language Processing note 3

CS2 Language Processing note 3 CS2 Language Processing note 3 CS2Ah 5..4 CS2 Language Processing note 3 Nondeterministic finite automata In this lecture we look at nondeterministic finite automata and prove the Conversion Theorem, which

More information

Converting a DFA to a Regular Expression JP

Converting a DFA to a Regular Expression JP Converting a DFA to a Regular Expression JP Prerequisite knowledge: Regular Languages Deterministic Finite Automata Nondeterministic Finite Automata Regular Expressions Conversion of Regular Expression

More information

David Griol Barres Computer Science Department Carlos III University of Madrid Leganés (Spain)

David Griol Barres Computer Science Department Carlos III University of Madrid Leganés (Spain) David Griol Barres dgriol@inf.uc3m.es Computer Science Department Carlos III University of Madrid Leganés (Spain) OUTLINE Introduction: Definitions The role of the Lexical Analyzer Scanner Implementation

More information

Midterm Exam. CSCI 3136: Principles of Programming Languages. February 20, Group 2

Midterm Exam. CSCI 3136: Principles of Programming Languages. February 20, Group 2 Banner number: Name: Midterm Exam CSCI 336: Principles of Programming Languages February 2, 23 Group Group 2 Group 3 Question. Question 2. Question 3. Question.2 Question 2.2 Question 3.2 Question.3 Question

More information

Theory of Computation Dr. Weiss Extra Practice Exam Solutions

Theory of Computation Dr. Weiss Extra Practice Exam Solutions Name: of 7 Theory of Computation Dr. Weiss Extra Practice Exam Solutions Directions: Answer the questions as well as you can. Partial credit will be given, so show your work where appropriate. Try to be

More information

Lexical Analysis. Prof. James L. Frankel Harvard University

Lexical Analysis. Prof. James L. Frankel Harvard University Lexical Analysis Prof. James L. Frankel Harvard University Version of 5:37 PM 30-Jan-2018 Copyright 2018, 2016, 2015 James L. Frankel. All rights reserved. Regular Expression Notation We will develop a

More information

FAdo: Interactive Tools for Learning Formal Computational Models

FAdo: Interactive Tools for Learning Formal Computational Models FAdo: Interactive Tools for Learning Formal Computational Models Rogério Reis Nelma Moreira DCC-FC& LIACC, Universidade do Porto R. do Campo Alegre 823, 4150 Porto, Portugal {rvr,nam}@ncc.up.pt Abstract

More information

Theory of Computation

Theory of Computation Theory of Computation For Computer Science & Information Technology By www.thegateacademy.com Syllabus Syllabus for Theory of Computation Regular Expressions and Finite Automata, Context-Free Grammar s

More information

Midterm I (Solutions) CS164, Spring 2002

Midterm I (Solutions) CS164, Spring 2002 Midterm I (Solutions) CS164, Spring 2002 February 28, 2002 Please read all instructions (including these) carefully. There are 9 pages in this exam and 5 questions, each with multiple parts. Some questions

More information

CSE 105 THEORY OF COMPUTATION

CSE 105 THEORY OF COMPUTATION CSE 105 THEORY OF COMPUTATION Spring 2017 http://cseweb.ucsd.edu/classes/sp17/cse105-ab/ Today's learning goals Sipser Ch 1.2, 1.3 Design NFA recognizing a given language Convert an NFA (with or without

More information

Compiler Construction

Compiler Construction Compiler Construction Thomas Noll Software Modeling and Verification Group RWTH Aachen University https://moves.rwth-aachen.de/teaching/ss-16/cc/ Conceptual Structure of a Compiler Source code x1 := y2

More information

Lexical Analysis. Lecture 2-4

Lexical Analysis. Lecture 2-4 Lexical Analysis Lecture 2-4 Notes by G. Necula, with additions by P. Hilfinger Prof. Hilfinger CS 164 Lecture 2 1 Administrivia Moving to 60 Evans on Wednesday HW1 available Pyth manual available on line.

More information

Computer Science Department Carlos III University of Madrid Leganés (Spain) David Griol Barres

Computer Science Department Carlos III University of Madrid Leganés (Spain) David Griol Barres Computer Science Department Carlos III University of Madrid Leganés (Spain) David Griol Barres dgriol@inf.uc3m.es Introduction: Definitions Lexical analysis or scanning: To read from left-to-right a source

More information

Implementation of Lexical Analysis. Lecture 4

Implementation of Lexical Analysis. Lecture 4 Implementation of Lexical Analysis Lecture 4 1 Tips on Building Large Systems KISS (Keep It Simple, Stupid!) Don t optimize prematurely Design systems that can be tested It is easier to modify a working

More information

Lecture 3: February Local Alignment: The Smith-Waterman Algorithm

Lecture 3: February Local Alignment: The Smith-Waterman Algorithm CSCI1820: Sequence Alignment Spring 2017 Lecture 3: February 7 Lecturer: Sorin Istrail Scribe: Pranavan Chanthrakumar Note: LaTeX template courtesy of UC Berkeley EECS dept. Notes are also adapted from

More information

Midterm Exam II CIS 341: Foundations of Computer Science II Spring 2006, day section Prof. Marvin K. Nakayama

Midterm Exam II CIS 341: Foundations of Computer Science II Spring 2006, day section Prof. Marvin K. Nakayama Midterm Exam II CIS 341: Foundations of Computer Science II Spring 2006, day section Prof. Marvin K. Nakayama Print family (or last) name: Print given (or first) name: I have read and understand all of

More information

2. Lexical Analysis! Prof. O. Nierstrasz!

2. Lexical Analysis! Prof. O. Nierstrasz! 2. Lexical Analysis! Prof. O. Nierstrasz! Thanks to Jens Palsberg and Tony Hosking for their kind permission to reuse and adapt the CS132 and CS502 lecture notes.! http://www.cs.ucla.edu/~palsberg/! http://www.cs.purdue.edu/homes/hosking/!

More information

Chapter Seven: Regular Expressions

Chapter Seven: Regular Expressions Chapter Seven: Regular Expressions Regular Expressions We have seen that DFAs and NFAs have equal definitional power. It turns out that regular expressions also have exactly that same definitional power:

More information

Formal Languages and Compilers Lecture VI: Lexical Analysis

Formal Languages and Compilers Lecture VI: Lexical Analysis Formal Languages and Compilers Lecture VI: Lexical Analysis Free University of Bozen-Bolzano Faculty of Computer Science POS Building, Room: 2.03 artale@inf.unibz.it http://www.inf.unibz.it/ artale/ Formal

More information

(Refer Slide Time: 0:19)

(Refer Slide Time: 0:19) Theory of Computation. Professor somenath Biswas. Department of Computer Science & Engineering. Indian Institute of Technology, Kanpur. Lecture-15. Decision Problems for Regular Languages. (Refer Slide

More information

Computational Genomics and Molecular Biology, Fall

Computational Genomics and Molecular Biology, Fall Computational Genomics and Molecular Biology, Fall 2015 1 Sequence Alignment Dannie Durand Pairwise Sequence Alignment The goal of pairwise sequence alignment is to establish a correspondence between the

More information

14.1 Encoding for different models of computation

14.1 Encoding for different models of computation Lecture 14 Decidable languages In the previous lecture we discussed some examples of encoding schemes, through which various objects can be represented by strings over a given alphabet. We will begin this

More information

Turing Machines. A transducer is a finite state machine (FST) whose output is a string and not just accept or reject.

Turing Machines. A transducer is a finite state machine (FST) whose output is a string and not just accept or reject. Turing Machines Transducers: A transducer is a finite state machine (FST) whose output is a string and not just accept or reject. Each transition of an FST is labeled with two symbols, one designating

More information

I have read and understand all of the instructions below, and I will obey the Academic Honor Code.

I have read and understand all of the instructions below, and I will obey the Academic Honor Code. Midterm Exam CS 341-451: Foundations of Computer Science II Fall 2014, elearning section Prof. Marvin K. Nakayama Print family (or last) name: Print given (or first) name: I have read and understand all

More information

Actually talking about Turing machines this time

Actually talking about Turing machines this time Actually talking about Turing machines this time 10/25/17 (Using slides adapted from the book) Administrivia HW due now (Pumping lemma for context-free languages) HW due Friday (Building TMs) Exam 2 out

More information

To illustrate what is intended the following are three write ups by students. Diagonalization

To illustrate what is intended the following are three write ups by students. Diagonalization General guidelines: You may work with other people, as long as you write up your solution in your own words and understand everything you turn in. Make sure to justify your answers they should be clear

More information

Lexical Analysis - 2

Lexical Analysis - 2 Lexical Analysis - 2 More regular expressions Finite Automata NFAs and DFAs Scanners JLex - a scanner generator 1 Regular Expressions in JLex Symbol - Meaning. Matches a single character (not newline)

More information

Languages and Finite Automata

Languages and Finite Automata Languages and Finite Automata or how to talk to machines... Costas Busch - RPI 1 Languages A language is a set of strings String: A sequence of letters (a word) Examples: cat, dog, house, Defined over

More information

JNTUWORLD. Code No: R

JNTUWORLD. Code No: R Code No: R09220504 R09 SET-1 B.Tech II Year - II Semester Examinations, April-May, 2012 FORMAL LANGUAGES AND AUTOMATA THEORY (Computer Science and Engineering) Time: 3 hours Max. Marks: 75 Answer any five

More information

Lecture 6,

Lecture 6, Lecture 6, 4.16.2009 Today: Review: Basic Set Operation: Recall the basic set operator,!. From this operator come other set quantifiers and operations:!,!,!,! \ Set difference (sometimes denoted, a minus

More information

CSE 431S Scanning. Washington University Spring 2013

CSE 431S Scanning. Washington University Spring 2013 CSE 431S Scanning Washington University Spring 2013 Regular Languages Three ways to describe regular languages FSA Right-linear grammars Regular expressions Regular Expressions A regular expression is

More information

Lexical Analysis. Lecture 3-4

Lexical Analysis. Lecture 3-4 Lexical Analysis Lecture 3-4 Notes by G. Necula, with additions by P. Hilfinger Prof. Hilfinger CS 164 Lecture 3-4 1 Administrivia I suggest you start looking at Python (see link on class home page). Please

More information

Decision Properties of RLs & Automaton Minimization

Decision Properties of RLs & Automaton Minimization Decision Properties of RLs & Automaton Minimization Martin Fränzle formatics and Mathematical Modelling The Technical University of Denmark Languages and Parsing MF Fall p./ What you ll learn. Decidable

More information

ECS 120 Lesson 7 Regular Expressions, Pt. 1

ECS 120 Lesson 7 Regular Expressions, Pt. 1 ECS 120 Lesson 7 Regular Expressions, Pt. 1 Oliver Kreylos Friday, April 13th, 2001 1 Outline Thus far, we have been discussing one way to specify a (regular) language: Giving a machine that reads a word

More information

Finite Automata Part Three

Finite Automata Part Three Finite Automata Part Three Recap from Last Time A language L is called a regular language if there exists a DFA D such that L( D) = L. NFAs An NFA is a Nondeterministic Finite Automaton Can have missing

More information

CSE 105 THEORY OF COMPUTATION

CSE 105 THEORY OF COMPUTATION CSE 105 THEORY OF COMPUTATION Fall 2016 http://cseweb.ucsd.edu/classes/fa16/cse105-abc/ Today's learning goals Sipser sec 3.2 Describe several variants of Turing machines and informally explain why they

More information

Symbolic Automata Library for Fast Prototyping

Symbolic Automata Library for Fast Prototyping http://excel.fit.vutbr.cz Symbolic Automata Library for Fast Prototyping Michaela Bieliková not_in{@} in{e,x,c} in{e,l} F I T Abstract Finite state automata are widely used in the fields of computer science

More information

UML CS Algorithms Qualifying Exam Fall, 2003 ALGORITHMS QUALIFYING EXAM

UML CS Algorithms Qualifying Exam Fall, 2003 ALGORITHMS QUALIFYING EXAM NAME: This exam is open: - books - notes and closed: - neighbors - calculators ALGORITHMS QUALIFYING EXAM The upper bound on exam time is 3 hours. Please put all your work on the exam paper. (Partial credit

More information

Chapter Seven: Regular Expressions. Formal Language, chapter 7, slide 1

Chapter Seven: Regular Expressions. Formal Language, chapter 7, slide 1 Chapter Seven: Regular Expressions Formal Language, chapter 7, slide The first time a young student sees the mathematical constant π, it looks like just one more school artifact: one more arbitrary symbol

More information

HKN CS 374 Midterm 1 Review. Tim Klem Noah Mathes Mahir Morshed

HKN CS 374 Midterm 1 Review. Tim Klem Noah Mathes Mahir Morshed HKN CS 374 Midterm 1 Review Tim Klem Noah Mathes Mahir Morshed Midterm topics It s all about recognizing sets of strings! 1. String Induction 2. Regular languages a. DFA b. NFA c. Regular expressions 3.

More information

Compiler Construction

Compiler Construction Compiler Construction Lecture 2: Lexical Analysis I (Introduction) Thomas Noll Lehrstuhl für Informatik 2 (Software Modeling and Verification) noll@cs.rwth-aachen.de http://moves.rwth-aachen.de/teaching/ss-14/cc14/

More information

Compiler Construction

Compiler Construction Compiler Construction Exercises 1 Review of some Topics in Formal Languages 1. (a) Prove that two words x, y commute (i.e., satisfy xy = yx) if and only if there exists a word w such that x = w m, y =

More information

Midterm I review. Reading: Chapters 1-4

Midterm I review. Reading: Chapters 1-4 Midterm I review Reading: Chapters 1-4 1 Test Details In class, Wednesday, Feb. 25, 2015 3:10pm-4pm Comprehensive Closed book, closed notes 2 Syllabus Formal proofs Finite Automata NFA, DFA, -NFA Regular

More information

CSE 105 THEORY OF COMPUTATION

CSE 105 THEORY OF COMPUTATION CSE 105 THEORY OF COMPUTATION Spring 2016 http://cseweb.ucsd.edu/classes/sp16/cse105-ab/ Today's learning goals Sipser Ch 3.2, 3.3 Define variants of TMs Enumerators Multi-tape TMs Nondeterministic TMs

More information

CS 432 Fall Mike Lam, Professor. Finite Automata Conversions and Lexing

CS 432 Fall Mike Lam, Professor. Finite Automata Conversions and Lexing CS 432 Fall 2017 Mike Lam, Professor Finite Automata Conversions and Lexing Finite Automata Key result: all of the following have the same expressive power (i.e., they all describe regular languages):

More information

Quiz 1: Solutions J/18.400J: Automata, Computability and Complexity. Nati Srebro, Susan Hohenberger

Quiz 1: Solutions J/18.400J: Automata, Computability and Complexity. Nati Srebro, Susan Hohenberger 6.45J/8.4J: utomata, Computability and Complexity Quiz : Solutions Prof. Nancy Lynch Nati Srebro, Susan Hohenberger Please write your name in the upper corner of each page. (2 Points) Q- Problem : True

More information

CSE 105 THEORY OF COMPUTATION

CSE 105 THEORY OF COMPUTATION CSE 105 THEORY OF COMPUTATION Spring 2018 http://cseweb.ucsd.edu/classes/sp18/cse105-ab/ Today's learning goals Sipser Section 2.2 Define push-down automata informally and formally Trace the computation

More information

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou COMP-421 Compiler Design Presented by Dr Ioanna Dionysiou Administrative! [ALSU03] Chapter 3 - Lexical Analysis Sections 3.1-3.4, 3.6-3.7! Reading for next time [ALSU03] Chapter 3 Copyright (c) 2010 Ioanna

More information

Introduction to Lexical Analysis

Introduction to Lexical Analysis Introduction to Lexical Analysis Outline Informal sketch of lexical analysis Identifies tokens in input string Issues in lexical analysis Lookahead Ambiguities Specifying lexical analyzers (lexers) Regular

More information

Limited Automata and Unary Languages

Limited Automata and Unary Languages Limited Automata and Unary Languages Giovanni Pighizzini and Luca Prigioniero Dipartimento di Informatica, Università degli Studi di Milano, Italy {pighizzini,prigioniero}@di.unimi.it Abstract. Limited

More information

LR Parsing, Part 2. Constructing Parse Tables. An NFA Recognizing Viable Prefixes. Computing the Closure. GOTO Function and DFA States

LR Parsing, Part 2. Constructing Parse Tables. An NFA Recognizing Viable Prefixes. Computing the Closure. GOTO Function and DFA States TDDD16 Compilers and Interpreters TDDB44 Compiler Construction LR Parsing, Part 2 Constructing Parse Tables Parse table construction Grammar conflict handling Categories of LR Grammars and Parsers An NFA

More information

Finite Automata Part Three

Finite Automata Part Three Finite Automata Part Three Friday Four Square! Today at 4:15PM, Outside Gates. Announcements Problem Set 4 due right now. Problem Set 5 out, due next Friday, November 2. Play around with finite automata

More information

NONDETERMINISTIC MOORE

NONDETERMINISTIC MOORE NONDETERMINISTIC MOORE AUTOMATA AND BRZOZOWSKI'S ALGORITHM G. Castiglione, A. Restivo, M. Sciortino University of Palermo Workshop PRIN Varese, 5-7 Settembre 2011 SUMMARY A class of nondeterministic Moore

More information

CS 125 Section #10 Midterm 2 Review 11/5/14

CS 125 Section #10 Midterm 2 Review 11/5/14 CS 125 Section #10 Midterm 2 Review 11/5/14 1 Topics Covered This midterm covers up through NP-completeness; countability, decidability, and recognizability will not appear on this midterm. Disclaimer:

More information

CIT3130: Theory of Computation. Regular languages

CIT3130: Theory of Computation. Regular languages ƒ CIT3130: Theory of Computation Regular languages ( M refers to the first edition of Martin and H to IALC by Hopcroft et al.) Definitions of regular expressions and regular languages: A regular expression

More information

Timo Latvala. January 28, 2004

Timo Latvala. January 28, 2004 Reactive Systems: Kripke Structures and Automata Timo Latvala January 28, 2004 Reactive Systems: Kripke Structures and Automata 3-1 Properties of systems invariants: the system never reaches a bad state

More information

1 Parsing (25 pts, 5 each)

1 Parsing (25 pts, 5 each) CSC173 FLAT 2014 ANSWERS AND FFQ 30 September 2014 Please write your name on the bluebook. You may use two sides of handwritten notes. Perfect score is 75 points out of 85 possible. Stay cool and please

More information

Algorithmic Approaches for Biological Data, Lecture #20

Algorithmic Approaches for Biological Data, Lecture #20 Algorithmic Approaches for Biological Data, Lecture #20 Katherine St. John City University of New York American Museum of Natural History 20 April 2016 Outline Aligning with Gaps and Substitution Matrices

More information

TOPIC PAGE NO. UNIT-I FINITE AUTOMATA

TOPIC PAGE NO. UNIT-I FINITE AUTOMATA TABLE OF CONTENTS SNo DATE TOPIC PAGE NO UNIT-I FINITE AUTOMATA 1 Introduction 1 2 Basic Mathematical Notation Techniques 3 3 Finite State systems 4 4 Basic Definitions 6 5 Finite Automaton 7 6 DFA NDFA

More information

Assignment 4 CSE 517: Natural Language Processing

Assignment 4 CSE 517: Natural Language Processing Assignment 4 CSE 517: Natural Language Processing University of Washington Winter 2016 Due: March 2, 2016, 1:30 pm 1 HMMs and PCFGs Here s the definition of a PCFG given in class on 2/17: A finite set

More information

Lecture 10. Sequence alignments

Lecture 10. Sequence alignments Lecture 10 Sequence alignments Alignment algorithms: Overview Given a scoring system, we need to have an algorithm for finding an optimal alignment for a pair of sequences. We want to maximize the score

More information

Computer Sciences Department

Computer Sciences Department 1 Reference Book: INTRODUCTION TO THE THEORY OF COMPUTATION, SECOND EDITION, by: MICHAEL SIPSER 3 D E C I D A B I L I T Y 4 Objectives 5 Objectives investigate the power of algorithms to solve problems.

More information

Theory of Languages and Automata

Theory of Languages and Automata Theory of Languages and Automata Chapter 3- The Church-Turing Thesis Sharif University of Technology Turing Machine O Several models of computing devices Finite automata Pushdown automata O Tasks that

More information

Safra's Büchi determinization algorithm

Safra's Büchi determinization algorithm Safra's Büchi determinization algorithm Aditya Oak Seminar on Automata Theory 28 Jan 2016 Introduction Proposed by S. Safra in 1988 For determinization of non-deterministic Büchi automaton Gives equivalent

More information

Automating Construction of Lexers

Automating Construction of Lexers Automating Construction of Lexers Regular Expression to Programs Not all regular expressions are simple. How can we write a lexer for (a*b aaa)? Tokenizing aaaab Vs aaaaaa Regular Expression Finite state

More information

CSE 105 THEORY OF COMPUTATION

CSE 105 THEORY OF COMPUTATION CSE 105 THEORY OF COMPUTATION Spring 2017 http://cseweb.ucsd.edu/classes/sp17/cse105-ab/ Today's learning goals Sipser Ch 2, 3.1 State and use the Church-Turing thesis. Describe several variants of Turing

More information

[Lexical Analysis] Bikash Balami

[Lexical Analysis] Bikash Balami 1 [Lexical Analysis] Compiler Design and Construction (CSc 352) Compiled By Central Department of Computer Science and Information Technology (CDCSIT) Tribhuvan University, Kirtipur Kathmandu, Nepal 2

More information

T Reactive Systems: Kripke Structures and Automata

T Reactive Systems: Kripke Structures and Automata Tik-79.186 Reactive Systems 1 T-79.186 Reactive Systems: Kripke Structures and Automata Spring 2005, Lecture 3 January 31, 2005 Tik-79.186 Reactive Systems 2 Properties of systems invariants: the system

More information

Lecture 2 Finite Automata

Lecture 2 Finite Automata Lecture 2 Finite Automata August 31, 2007 This lecture is intended as a kind of road map to Chapter 1 of the text just the informal examples that I ll present to motivate the ideas. 1 Expressions without

More information