CS 310: State Transition Diagrams

Similar documents
CS 403: Scanning and Parsing

THE COMPILATION PROCESS EXAMPLE OF TOKENS AND ATTRIBUTES

Formal Languages and Compilers Lecture IV: Regular Languages and Finite. Finite Automata

lec3:nondeterministic finite state automata

6 NFA and Regular Expressions

Compiler Construction

Lexical Analysis. Implementation: Finite Automata

CT32 COMPUTER NETWORKS DEC 2015

Last lecture CMSC330. This lecture. Finite Automata: States. Finite Automata. Implementing Regular Expressions. Languages. Regular expressions

1. (10 points) Draw the state diagram of the DFA that recognizes the language over Σ = {0, 1}

Lexical Analysis - 2

Implementation of Lexical Analysis

Announcements! P1 part 1 due next Tuesday P1 part 2 due next Friday

CSE 105 THEORY OF COMPUTATION

CSE 105 THEORY OF COMPUTATION

Front End: Lexical Analysis. The Structure of a Compiler

Implementation of Lexical Analysis

Implementation of Lexical Analysis

Finite automata. We have looked at using Lex to build a scanner on the basis of regular expressions.

CSCE 5400 ASSIGNMENT #1 SOLUTION

Formal Languages and Compilers Lecture VI: Lexical Analysis

Automating Construction of Lexers

CS Lecture 2. The Front End. Lecture 2 Lexical Analysis

Finite Automata. Dr. Nadeem Akhtar. Assistant Professor Department of Computer Science & IT The Islamia University of Bahawalpur

Figure 2.1: Role of Lexical Analyzer

Regular Languages and Regular Expressions

Lexical Analysis. Finite Automata. (Part 2 of 2)


ECS 120 Lesson 7 Regular Expressions, Pt. 1

CS2 Language Processing note 3

CMPSCI 250: Introduction to Computation. Lecture 20: Deterministic and Nondeterministic Finite Automata David Mix Barrington 16 April 2013

Lexical Analysis. Introduction

The Language for Specifying Lexical Analyzer

Lexical Analysis. Lexical analysis is the first phase of compilation: The file is converted from ASCII to tokens. It must be fast!

Syntax and Type Analysis

Assignment 1 (Lexical Analyzer)

Lecture 3: Lexical Analysis

Assignment 1 (Lexical Analyzer)

Converting a DFA to a Regular Expression JP

2. Syntax and Type Analysis

Kinder, Gentler Nation

CS 181 B&C EXAM #1 NAME. You have 90 minutes to complete this exam. You may assume without proof any statement proved in class.

Compiler Construction

Regular Expression Constrained Sequence Alignment

CSc 453 Lexical Analysis (Scanning)

CSE450. Translation of Programming Languages. Lecture 20: Automata and Regular Expressions

Lexical Analysis. Chapter 2

Finite Automata and Scanners

Lexical Analysis. Lecture 2-4

CS 536 Introduction to Programming Languages and Compilers Charles N. Fischer Lecture 5

CMSC 350: COMPILER DESIGN

NFAs and Myhill-Nerode. CS154 Chris Pollett Feb. 22, 2006.

ECS 120 Lesson 16 Turing Machines, Pt. 2

Theory of Computation Dr. Weiss Extra Practice Exam Solutions

Midterm Exam II CIS 341: Foundations of Computer Science II Spring 2006, day section Prof. Marvin K. Nakayama

Introduction to Lexical Analysis

CSEP 501 Compilers. Languages, Automata, Regular Expressions & Scanners Hal Perkins Winter /8/ Hal Perkins & UW CSE B-1

CSE 413 Programming Languages & Implementation. Hal Perkins Autumn 2012 Grammars, Scanners & Regular Expressions

I have read and understand all of the instructions below, and I will obey the Academic Honor Code.

CS 314 Principles of Programming Languages

Lexical Analysis. Lecture 3-4

CS 314 Principles of Programming Languages. Lecture 3

Compiler course. Chapter 3 Lexical Analysis

David Griol Barres Computer Science Department Carlos III University of Madrid Leganés (Spain)

CSE450. Translation of Programming Languages. Automata, Simple Language Design Principles

UNIT -2 LEXICAL ANALYSIS

Lexical Error Recovery

The Front End. The purpose of the front end is to deal with the input language. Perform a membership test: code source language?

Implementation of Lexical Analysis. Lecture 4

Compiler Construction I

CSE 413 Programming Languages & Implementation. Hal Perkins Winter 2019 Grammars, Scanners & Regular Expressions

CSE302: Compiler Design

CS415 Compilers. Lexical Analysis

MIT Specifying Languages with Regular Expressions and Context-Free Grammars

Outline. 1 Scanning Tokens. 2 Regular Expresssions. 3 Finite State Automata

Chapter 3 Lexical Analysis

Lexical Analysis 1 / 52

Regular Expressions. Agenda for Today. Grammar for a Tiny Language. Programming Language Specifications

Concepts Introduced in Chapter 3. Lexical Analysis. Lexical Analysis Terms. Attributes for Tokens

Compiler Construction I

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

CS308 Compiler Principles Lexical Analyzer Li Jiang

CS5371 Theory of Computation. Lecture 8: Automata Theory VI (PDA, PDA = CFG)

CISC 4090 Theory of Computation

Computer Science Department Carlos III University of Madrid Leganés (Spain) David Griol Barres

Implementation of Lexical Analysis

Safra's Büchi determinization algorithm

Optimizing Finite Automata

[Lexical Analysis] Bikash Balami

CS 403 Compiler Construction Lecture 3 Lexical Analysis [Based on Chapter 1, 2, 3 of Aho2]

Implementation of Lexical Analysis

CSc 453 Compilers and Systems Software

Compiler phases. Non-tokens

2010: Compilers REVIEW: REGULAR EXPRESSIONS HOW TO USE REGULAR EXPRESSIONS

Definition 2.8: A CFG is in Chomsky normal form if every rule. only appear on the left-hand side, we allow the rule S ǫ.

CS402 Theory of Automata Solved Subjective From Midterm Papers. MIDTERM SPRING 2012 CS402 Theory of Automata

Question Bank. 10CS63:Compiler Design

Lexical Analysis. Lecture 3. January 10, 2018

Interpreter. Scanner. Parser. Tree Walker. read. request token. send token. send AST I/O. Console

Languages, Automata, Regular Expressions & Scanners. Winter /8/ Hal Perkins & UW CSE B-1

Transcription:

CS 30: State Transition Diagrams Stefan D. Bruda Winter 207

STATE TRANSITION DIAGRAMS Finite directed graph Edges (transitions) labeled with symbols from an alphabet Nodes (states) labeled only for convenience One initial state Several accepting states A string c c 2 c 3... c n is accepted by a state transition diagram if there exists a path from the starting state to an accepting state such that the sequence of labels along the path is c, c 2,..., c n c c 2 c 3 c n 0 s 0 s 0 Same state might be visited more than once Intermediate states might be final The set of exactly all the strings accepted by a state transition diagram is the language accepted (or recognized) by the state transition diagram CS 30: State Transition Diagrams (S. D. Bruda) Winter 207 / 9

DETERMINISTIC FINITE AUTOMATA A state diagram describes graphically a deterministic finite automaton (DFA), a machine that at any given time is in one of finitely many states, and whose state changes according to a predetermined way in response to a sequence of input symbols Formal definition: a deterministic finite automaton is a tuple M = (K, Σ, δ, s, F ) K finite set of states Σ input alphabet F K set of accepting states s K initial state δ : K Σ K transition function 0 q q 2 0 0 q 3 q 4 0 K = {q, q 2, q 3, q 4 } Σ = {0, } F = {q 3, q 4 } s = q δ(q, 0) = q δ(q, ) = q 2 δ(q 2, 0) = q 3 δ(q 2, ) = q 4 δ(q 3, 0) = q 4 δ(q 3, ) = q δ(q 4, 0) = q 4 δ(q 4, ) = q 3 CS 30: State Transition Diagrams (S. D. Bruda) Winter 207 2 / 9

SOFTWARE REALIZATION Big practical advantages of DFA: very easy to implement: Interface to define a vocabulary and a function to obtain the input tokens typename vocab; /* alphabet + end-of-string */ const vocab EOS; /* end-of-string pseudo-token */ vocab gettoken(void); /* returns next token */ Variable (state) changed by a simple switch statement as we go along int main (void) { typedef enum {S0, S,... } state; state s = S0; vocab t = gettoken(); while ( t!= EOS ) { switch (s) { case S0: if (t ==...) s =...; break; if (t ==...) s =...; break;... case S:...... } /* switch */ t = gettoken(); } /* while */ /* accept iff the current state s is final */ } CS 30: State Transition Diagrams (S. D. Bruda) Winter 207 3 / 9

SOFTWARE REALIZATION: EXAMPLE typedef enum {ZERO, ONE, EOS} vocab; vocab gettoken(void) { int c = getc(stdin); if (c == 0 ) return ZERO; if (c == ) return ONE; if (c == \n ) return EOS; perror("illegal character"); } int main (void) { typedef enum {S0, S } state; state s = S0; vocab t = gettoken(); while ( t!= EOS ) { switch (s) { case S0: if (t == ONE) s = S; break; /* if (t == ZERO) s = S0; break */ case S: if (t == ONE) s = S0; break; /* if (t == ZERO) s = S; break */ } /* switch */ t = gettoken(); } /* while */ if (s!= S0) printf("string not accepted.\n"); } CS 30: State Transition Diagrams (S. D. Bruda) Winter 207 4 / 9

NONDETERMINISM So far the state diagrams are deterministic = for any pair (state, input symbol) there can be at most one outgoing transition A nondeterministic diagram allows for the following situation: The acceptance condition remains unchanged: a q a q 2 q 3 Why nondeterminism? Simplifies the construction of the diagram Σ m a n q q 0 q 2 q 3 A nondeterministic diagram can be much smaller than the smallest possible deterministic state diagram that recognizes the same language Also known as nondeterministic finite automata (NFA) CS 30: State Transition Diagrams (S. D. Bruda) Winter 207 5 / 9

NONDETERMINISM So far the state diagrams are deterministic = for any pair (state, input symbol) there can be at most one outgoing transition A nondeterministic diagram allows for the following situation: The acceptance condition remains unchanged: A string c c 2 c 3... c n is accepted by a state transition diagram if there exists some path from the starting state to an accepting state such that the sequence of labels along the path is c, c 2,..., c n Why nondeterminism? Simplifies the construction of the diagram Σ m a n q q 0 q 2 A nondeterministic diagram can be much smaller than the smallest possible deterministic state diagram that recognizes the same language Also known as nondeterministic finite automata (NFA) q 3 q a a q 2 q 3 CS 30: State Transition Diagrams (S. D. Bruda) Winter 207 5 / 9

SOFTWARE REALIZATION As above, except that we have to keep track of a set of states at any given time typedef enum { Q0, Q, Q2, Q3 } state; int main (void) { vocab t = gettoken(); StateSet A; A.include(Q0); while (t!= EOS) { StateSet NewA; for (state s in A) { switch (s) { case Q0: NewA.include(Q0); if (t == m ) NewA.include(Q); break; case Q: if (t == a ) NewA.include(Q2); break; case Q2: if (t == n ) NewA.include(Q3); break; case Q3: break; } } } A = NewA; t = gettoken(); } /* accept iff (Q3 in A) */ CS 30: State Transition Diagrams (S. D. Bruda) Winter 207 6 / 9

SOFTWARE REALIZATION (CONT D) This kind of implementation is fine for throw-away automata Text editor search function searches for a pattern in the text The next search is likely to be different so a brand new automaton needs to be created Some times the automaton is created once and then used multiple times The lexical structure of a programming language is well established Lexical analysis in a compiler is accomplished by an automaton that never changes In such a case it is more efficient to precalculate the set of states Exactly as in the previous program Except that we no longer have an input to guide us, so we calculate the sets NewA for all possible inputs We obtain a DFA that is equivalent to the given NFA (i.e., recognizes the same language) CS 30: State Transition Diagrams (S. D. Bruda) Winter 207 7 / 9

ε-transitions Useful at times to have spontaneous transitions = transitions that change the state without any input being read = ε-transitions Only available for nondeterministic state transition diagrams! Example of usefulness: Construct the state transition diagram for the language {0, } 0{0, } + {w {0, } : w has an even number of s} Even better ε-transitions can be eliminated afterward CS 30: State Transition Diagrams (S. D. Bruda) Winter 207 8 / 9

ELIMINATING ε-transitions For every diagram M with ε-transitions a new diagram without ε-transitions can be constructed as follows: Make a copy M of M where the ε-transitions have been removed. Remove states that have only ε-transitions coming in except for the starting state 2 Add transitions to M as follows: whenever M has a chain of ε-transitions followed by a real transition on x: q ε ε ε p x add to M a transition from state q to state p labeled by x: q x p Note that q and p may be any states In particular this step is also used in the case where q = p 3 If M has a chain of ε-transitions from a state r to an accepting state, then r is made to be an accepting state of M. CS 30: State Transition Diagrams (S. D. Bruda) Winter 207 9 / 9