Lecture 5: Regular Expression and Finite Automata


Lecture 5: Regular Expression and Finite Automata Dr Kieran T. Herley Department of Computer Science University College Cork 2017-2018 KH (28/09/17) Lecture 5: Regular Expression and Finite Automata 2017-2018 1 / 1


Equivalence of FAs and REs

DFAs, NFAs and REs
Fact: DFAs, NFAs and REs have the same expressive power, i.e. they allow precisely the same patterns (sets of strings) to be specified.

Translating REs into NFAs

Theorem: For every regular expression R, there is a nondeterministic finite automaton M(R) that accepts the language specified by R.

Proof (sketch): By construction.
- Analyze the structure of R in terms of its subexpressions; reflect this structure in an expression tree.
- Build M(R) based on the structure of R.

Automaton Construction
Regular expression: a (i.e. a single symbol)
Corresponding automaton: [diagram not recovered]
Accepts the expression if f is an accept state.

Automaton Construction (cont'd)
Regular expression: ε (the empty string)
Corresponding automaton: [diagram not recovered; its single edge is labelled ε]
Accepts the expression if f is an accept state.

Automaton Construction (cont'd)
Regular expression: X · Y (concatenation)
Suppose M(X) and M(Y) are automata for expressions X and Y (with start states s, s' and accept states f, f').
Corresponding automaton: [diagram not recovered]
Lemma: Automaton M(X · Y) accepts precisely the strings in X · Y (if f' is an accept state).

Proof of Claim
Lemma: M(X · Y) accepts expression X · Y if f' is an accept state.
Why? (implicit induction)
- Clearly, $\alpha = \alpha_X \alpha_Y \in X \cdot Y$ implies that a path $s \leadsto f \leadsto s' \leadsto f'$ labelled $\alpha$ exists.
- Conversely, the existence of a path from $s$ to $f'$ labelled $\alpha$ implies $\alpha \in X \cdot Y$:
  - the path has structure $s \leadsto f \leadsto s' \leadsto f'$;
  - the subpath $s \leadsto f$ corresponds to a string matching X;
  - the subpath $s' \leadsto f'$ corresponds to a string matching Y.

Automaton Construction (cont'd)
Regular expression: X | Y (alternation)
Suppose M(X) and M(Y) are automata for expressions X and Y (with start states s, s' and accept states f, f').
Corresponding automaton: [diagram not recovered]
Lemma: Automaton M(X | Y) accepts expression X | Y if its final state is an accept state. (Proof similar to the previous lemma.)

Automaton Construction (cont'd)
Regular expression: X* (repetition)
Suppose M(X) is an automaton for expression X (with start state s and accept state f).
Corresponding automaton: [diagram not recovered]
Lemma: Automaton M(X*) accepts expression X* if its final state is an accept state.

Automaton Construction: Summary
[diagram summarizing the construction rules not recovered]

Notes
- The rules imply a recursive algorithm for translating an expression E into an automaton M(E) that recognizes the patterns matching E.
- Each rule adds at most two states, so #states = O(expression length).
- Accept states of the sub-automata employed in the construction become non-accept states in the composite; only the top-level automaton keeps an accept state.

Example
Expression: (a | b)* · a · b · b
Translation: all strings beginning with zero or more a's or b's, followed by abb.
[expression tree not recovered]
The tree captures the structure of the RE in terms of its subexpressions. Each non-leaf represents one of the operators |, · or *. (Note: explicit · for concatenation.)

Example (cont'd)
Automata built bottom-up for the subexpressions r1 = a, r2 = b, r3 = a | b, then (a | b)* and finally the complete expression: M(a), M(b), M(a | b), M((a | b)*), ... [diagrams not recovered]
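The recursive translation from the notes can be sketched in Python. This is a minimal sketch, assuming the usual Thompson-style rules (the slides' own diagrams are not available in this transcript); the tuple encoding of the expression tree and the names `build`, `TREE` and `EDGES` are illustrative choices, not taken from the lecture.

```python
from itertools import count

def build(tree, edges, fresh):
    """Return (start, final) of an NFA fragment for `tree`.

    `tree` is a nested tuple (assumed encoding): ("sym", "a"), ("eps",),
    ("cat", X, Y), ("alt", X, Y) or ("star", X).
    Edges are appended to `edges` as (source, label, target);
    label None marks an epsilon edge.
    """
    kind = tree[0]
    if kind in ("sym", "eps"):
        s, f = next(fresh), next(fresh)          # two new states
        edges.append((s, tree[1] if kind == "sym" else None, f))
        return s, f
    if kind == "cat":
        s1, f1 = build(tree[1], edges, fresh)
        s2, f2 = build(tree[2], edges, fresh)
        edges.append((f1, None, s2))             # glue the parts, no new states
        return s1, f2
    if kind == "alt":
        s1, f1 = build(tree[1], edges, fresh)
        s2, f2 = build(tree[2], edges, fresh)
        s, f = next(fresh), next(fresh)          # two new states
        edges += [(s, None, s1), (s, None, s2), (f1, None, f), (f2, None, f)]
        return s, f
    if kind == "star":
        s1, f1 = build(tree[1], edges, fresh)
        s, f = next(fresh), next(fresh)          # two new states
        edges += [(s, None, s1), (f1, None, s1), (s, None, f), (f1, None, f)]
        return s, f
    raise ValueError(f"unknown node kind: {kind!r}")

# Expression tree for (a | b)* . a . b . b
A, B = ("sym", "a"), ("sym", "b")
TREE = ("cat", ("cat", ("cat", ("star", ("alt", A, B)), A), B), B)

EDGES = []
START, FINAL = build(TREE, EDGES, count())
print(len({q for (p, _, r) in EDGES for q in (p, r)}), "states")
```

Every case creates at most two new states, so the state count stays linear in the size of the expression tree, in line with the bound stated in the notes. Only `FINAL`, the final state of the top-level fragment, is treated as an accept state.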

Applications of RE to FA Construction

Usefulness of RE to NFA Construction
Lexical analysis:
- Specify language tokens (identifiers, numerical constants, symbols etc.) as REs.
- Tools like lex automatically generate automaton-based code to decompose source code into its constituent tokens.
Pattern matching (e.g. text editors, grep):
- The pattern is specified as an RE.
- An automaton-based search locates occurrences.

grep
grep/egrep/fgrep search a file for a pattern (string or regular expression).
Examples:
- fgrep intro /man/man3/*.3* searches the files matching the second argument for the string intro, listing the occurrences found.
- egrep 'Fred (Smith|Jones)' telephone.txt searches telephone.txt for names with first name Fred and last name Smith or Jones.
grep/egrep/fgrep differ in the generality of the patterns they handle and in their efficiency: fgrep is the most efficient, egrep the most general.

grep E file
- Build an automaton M that recognizes occurrences of the regular expression E.
- Simulate M on each line in file.
- Every time an accept state is entered, an occurrence of the pattern E has been detected, so flag the current line.

Note
Automaton M(E) recognizes any string x that matches pattern E. (Recall that grep flags lines that contain a substring matching the pattern.)
To get an automaton that recognizes any string containing a substring y that matches E, modify M(E) as follows: [diagram not recovered]
Use NfaAccept to detect matches.
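The grep-style search can be illustrated with a small NFA simulation in Python. This is a sketch under stated assumptions: the slides' NfaAccept routine and the modification diagram are not shown in this transcript, so `nfa_accept` below is a hypothetical stand-in, and re-adding the start state after every symbol (equivalent to a self-loop on every symbol at the start state) is the usual way to find substrings, not necessarily the one drawn on the slide. The three-edge automaton for abb is hand-written for the example.

```python
def eps_closure(states, edges):
    """All states reachable from `states` using only epsilon (label None) edges."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for (p, label, r) in edges:
            if p == q and label is None and r not in seen:
                seen.add(r)
                stack.append(r)
    return seen

def step(states, ch, edges):
    """States reachable from `states` by one edge labelled `ch`."""
    return {r for (p, label, r) in edges if p in states and label == ch}

def nfa_accept(edges, start, final, text):
    """True if the automaton accepts the whole of `text` (hypothetical stand-in)."""
    current = eps_closure({start}, edges)
    for ch in text:
        current = eps_closure(step(current, ch, edges), edges)
    return final in current

def contains_match(edges, start, final, line):
    """True if some substring of `line` is accepted (grep-style flagging)."""
    current = eps_closure({start}, edges)
    if final in current:
        return True
    for ch in line:
        # Re-adding `start` lets a match begin at any position in the line.
        current = eps_closure(step(current, ch, edges) | {start}, edges)
        if final in current:          # accept state entered: flag the line
            return True
    return False

# Hand-written automaton for the pattern abb: 0 -a-> 1 -b-> 2 -b-> 3
ABB = [(0, "a", 1), (1, "b", 2), (2, "b", 3)]

for line in ["xabby", "abab"]:
    print(line, contains_match(ABB, 0, 3, line))
```

The simulation tracks the set of states the NFA could be in, so it never backtracks over the input line.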

Another Application
- The syntax of the building blocks (tokens) of most programming languages (identifiers, numerical literals, symbols, comments etc.) can be characterized using REs.
- Software tools can automatically generate code to read the source and chop it into tokens.
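As a rough illustration of tokens specified by REs, here is a small tokenizer in Python. It is only a sketch: the token names and patterns are invented for the example, and Python's `re` module (a backtracking matcher) stands in for the automaton-based code that a tool like lex would generate.

```python
import re

# Token classes and their regular expressions (illustrative, not from the lecture).
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT",  r"[A-Za-z_]\w*"),
    ("SYMBOL", r"<=|[-+*/=<>(){};]"),   # "<=" listed first so it wins over "<"
    ("SKIP",   r"\s+"),                 # whitespace, discarded
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Chop `source` into (token class, text) pairs, left to right."""
    pos, tokens = 0, []
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

for token in tokenize("sum = sum+k; k<=n"):
    print(token)
```

Each token class is one named alternative in a single combined pattern; `m.lastgroup` reports which alternative matched.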

From NFAs to DFAs

DFA vs NFA
Theorem: For every NFA, there is an equivalent DFA, i.e. one that accepts precisely the same language.
Proof not straightforward, but the idea is to construct a DFA where each …
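One standard way to obtain the equivalent DFA is the subset construction, in which each DFA state stands for a set of NFA states. The Python sketch below assumes that construction (the slide's own description is cut off in this transcript); the edge-list encoding and the function names are illustrative.

```python
def eps_closure(states, edges):
    """All states reachable from `states` using only epsilon (label None) edges."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for (p, label, r) in edges:
            if p == q and label is None and r not in seen:
                seen.add(r)
                stack.append(r)
    return seen

def subset_construction(edges, start, final, alphabet):
    """Return (start set, transition table, accepting sets) of an equivalent DFA."""
    start_set = frozenset(eps_closure({start}, edges))
    dfa = {}                       # frozenset of NFA states -> {symbol: frozenset}
    todo = [start_set]
    while todo:
        S = todo.pop()
        if S in dfa:
            continue               # this subset has already been expanded
        dfa[S] = {}
        for a in alphabet:
            moved = {r for (p, label, r) in edges if p in S and label == a}
            T = frozenset(eps_closure(moved, edges))
            dfa[S][a] = T
            todo.append(T)
    accepting = {S for S in dfa if final in S}
    return start_set, dfa, accepting

def dfa_accepts(start_set, dfa, accepting, text):
    """Run the DFA on `text` (symbols must come from the alphabet used above)."""
    S = start_set
    for ch in text:
        S = dfa[S][ch]
    return S in accepting

# Hand-written NFA for (a | b)* a b b: state 0 loops on a and b, then 0 -a-> 1 -b-> 2 -b-> 3
NFA = [(0, "a", 0), (0, "b", 0), (0, "a", 1), (1, "b", 2), (2, "b", 3)]
START_SET, DFA, ACCEPTING = subset_construction(NFA, 0, 3, "ab")
print(len(DFA), "DFA states")
```

Only subsets reachable from the start set are generated, so the DFA is often much smaller than the worst case of 2^n subsets for an n-state NFA.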

From DFAs to REs

DFA-to-RE Translation
Theorem: For every DFA there is an RE that captures the strings accepted by that DFA.

Define $R^k_{i,j}$ to be the set of strings that take the DFA from state $i$ to state $j$ without going through any state numbered higher than $k$.

Recurrence:
- $k = 0$: $R^0_{i,j}$ = labels on direct $i \to j$ edges, if any; add $\varepsilon$ if $i = j$.
- $k > 0$: $R^k_{i,j} = R^{k-1}_{i,k} \, (R^{k-1}_{k,k})^* \, R^{k-1}_{k,j} \mid R^{k-1}_{i,j}$

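DFA-to-RE Translation Example
[DFA diagram not recovered]

Applying the recurrence for $k = 0, 1, 2$ ($\emptyset$ marks an empty set):

$$
\begin{array}{l|lll}
 & k = 0 & k = 1 & k = 2 \\ \hline
r^k_{1,1} & \varepsilon & \varepsilon & (00)^* \\
r^k_{1,2} & 0 & 0 & 0(00)^* \\
r^k_{1,3} & 1 & 1 & 0^*1 \\
r^k_{2,1} & 0 & 0 & 0(00)^* \\
r^k_{2,2} & \varepsilon & \varepsilon \mid 00 & (00)^* \\
r^k_{2,3} & 1 & 1 \mid 01 & 0^*1 \\
r^k_{3,1} & \emptyset & \emptyset & (0 \mid 1)(00)^*0 \\
r^k_{3,2} & 0 \mid 1 & 0 \mid 1 & (0 \mid 1)(00)^* \\
r^k_{3,3} & \varepsilon & \varepsilon & \varepsilon \mid (0 \mid 1)0^*1
\end{array}
$$

The answer is the union ($\mid$) of the following two expressions:

$$r^3_{1,2} = r^2_{1,3} \, (r^2_{3,3})^* \, r^2_{3,2} \mid r^2_{1,2} = 0^*1 \, (\varepsilon \mid (0 \mid 1)0^*1)^* \, (0 \mid 1)(00)^* \mid 0(00)^*$$

$$r^3_{1,3} = r^2_{1,3} \, (r^2_{3,3})^* \, r^2_{3,3} \mid r^2_{1,3} = 0^*1 \, (\varepsilon \mid (0 \mid 1)0^*1)^* \, (\varepsilon \mid (0 \mid 1)0^*1) \mid 0^*1$$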
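The recurrence can be checked mechanically on this example. The Python sketch below works with sets of strings rather than expressions, truncated to a length bound `N`, and compares the result with a brute-force run of the DFA. The transition table `DELTA` and the choice of accept states {2, 3} are assumptions: they are inferred from the k = 0 column of the table and from the form of the answer, since the DFA diagram itself is not available here.

```python
from itertools import product

N = 6  # only strings of length <= N are tracked

def cat(A, B):
    """Concatenation of two string sets, truncated to length N."""
    return {x + y for x in A for y in B if len(x) + len(y) <= N}

def star(A):
    """Kleene star of a string set, truncated to length N."""
    result, frontier = {""}, {""}
    while frontier:
        frontier = cat(frontier, A) - result   # strings not seen so far
        result |= frontier
    return result

def kleene_sets(delta, states):
    """R[i, j] after applying the recurrence for every k in `states`."""
    # k = 0: labels on direct i -> j edges, plus the empty string if i == j
    R = {(i, j): {a for (p, a), q in delta.items() if p == i and q == j}
                 | ({""} if i == j else set())
         for i in states for j in states}
    for k in states:
        # The right-hand side reads the previous R (level k - 1) throughout.
        R = {(i, j): cat(cat(R[i, k], star(R[k, k])), R[k, j]) | R[i, j]
             for i in states for j in states}
    return R

def accepted(delta, start, accepting, alphabet="01"):
    """All strings of length <= N that the DFA accepts (brute force)."""
    out = set()
    for length in range(N + 1):
        for chars in product(alphabet, repeat=length):
            q = start
            for ch in chars:
                q = delta[q, ch]
            if q in accepting:
                out.add("".join(chars))
    return out

# Assumed DFA, inferred from the k = 0 column of the table
STATES = [1, 2, 3]
DELTA = {(1, "0"): 2, (1, "1"): 3, (2, "0"): 1,
         (2, "1"): 3, (3, "0"): 2, (3, "1"): 2}

R = kleene_sets(DELTA, STATES)
FROM_RECURRENCE = R[1, 2] | R[1, 3]
FROM_DFA = accepted(DELTA, 1, {2, 3})
print(FROM_RECURRENCE == FROM_DFA)
```

Working with truncated string sets avoids any regular-expression simplification: the set `R[1, 2] | R[1, 3]` plays the role of the union of the two final expressions.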