Reducing a DFA to a Minimal DFA

Lexical Analysis - Part 4
Harry H. Porter, 2005

Input: DFA_IN
    Assume DFA_IN never gets stuck (add a dead state if necessary).
Output: DFA_MIN
    An equivalent DFA with the minimum number of states.

[Figure: example DFA with states 1-5 and an added dead state.]

Approach: Merge two states if they effectively do the same thing.
"Do the same thing?" At EOF, is DFA_IN in an accepting state or not?

Sufficiently Different States

Merge states, if at all possible.
Are two states sufficiently different... that they cannot be merged?

State s is distinguished from state t by some string w iff:
    starting at s, given characters w, the DFA ends up accepting,
    ... but starting at t, the DFA does not accept.

Example (state diagram not reproduced): one input string does not distinguish s and t, but a string ending in c does. Therefore, s and t cannot be merged.
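The definition can be checked mechanically. The following is a minimal sketch in C, not taken from the slides: the `Dfa` layout, the symbol encoding ('a' through 'c'), and the four-state `EXAMPLE` automaton are all assumptions made for illustration. It uses the symmetric form of the test, in which w distinguishes s and t when exactly one of the two runs accepts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical table-driven DFA over the symbols 'a'..'c'. */
enum { NUM_STATES = 4, NUM_SYMBOLS = 3 };

typedef struct {
    int  next[NUM_STATES][NUM_SYMBOLS];  /* next[state][symbol - 'a'] */
    bool accepting[NUM_STATES];
} Dfa;

/* Illustrative automaton: 0 = s, 1 = t, 2 = accepting, 3 = dead state.
   Only s has an edge (on 'c') into the accepting state. */
static const Dfa EXAMPLE = {
    .next = { {3, 3, 2}, {3, 3, 3}, {3, 3, 3}, {3, 3, 3} },
    .accepting = { false, false, true, false }
};

/* Run the DFA from `state` over the string w; report whether it accepts. */
static bool accepts_from(const Dfa *dfa, int state, const char *w)
{
    for (size_t i = 0; w[i] != '\0'; i++) {
        assert(w[i] >= 'a' && w[i] < 'a' + NUM_SYMBOLS);
        state = dfa->next[state][w[i] - 'a'];
    }
    return dfa->accepting[state];
}

/* w distinguishes s and t when exactly one of the two runs accepts. */
static bool distinguishes(const Dfa *dfa, int s, int t, const char *w)
{
    return accepts_from(dfa, s, w) != accepts_from(dfa, t, w);
}
```

With this automaton, `distinguishes(&EXAMPLE, 0, 1, "c")` is true while the same call with "a" or the empty string is false, so states 0 and 1 could not be merged.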

Partitioning a Set

A partitioning of a set...
    ...breaks the set into non-overlapping subsets.
    (The partition breaks the set into groups.)

Example:
    S = {A, B, C, D, E, F, G}
    Π = { (A B) (C D E F) (G) }
    Π_2 = { (A) (B C) (D E F G) }

We can refine a partition...
    Π_i = { (A B C) (D E) (F G) }
    Π_i+1 = { (A C) (B) (D) (E) (F G) }

Note: { (...) (...) (...) } means { {...}, {...}, {...} }

Hopcroft's Algorithm

Consider the set of states.
Initially, we will partition it into two groups...
    Final States
    All Other States

[Figure: example DFA with states A-E; its initial partition is (A B D) (C E).]

Repeatedly refine the partitioning.
    Two states must be placed in different groups... if they can be distinguished.
Repeat until no group contains states that can be distinguished.

Each group in the partitioning becomes one state in a newly constructed DFA:
    DFA_MIN = the minimal DFA

How to Refine a Partitioning?

Π_i = { (A B D) (C E) }, with groups P_1 = (A B D) and P_2 = (C E).

Consider one group... (A B D)
Look at output edges on some symbol (e.g., x).
    On x, all states in P_1 go to states belonging to the same group, so x does not split P_1.
Now consider another symbol (e.g., y).
    On y, D goes to a state in a different group than A and B do.
    D is distinguished from A and B!
So refine the partition!
    Π_i+1 = { (A B) (D) (C E) }, with groups P_3 = (A B), P_4 = (D), and P_2 = (C E).

Example

DFA_IN has states A, B, C, D, and E over the symbols a and b. (The state diagram is not reproduced here.)

Initial Partitioning: Π_1 = (A B C D) (E)
    Consider (A B C D)
        Consider a. Break apart? No.
        Consider b. Break apart? Yes: (A B C) (D)
    Consider (E)
        Not possible to break apart.
New Partitioning: Π_2 = (A B C) (D) (E)
    Consider a. Break apart? No.
    Consider b. Break apart? Yes: (A C) (B)
New Partitioning: Π_3 = (A C) (B) (D) (E)
    Consider a. Break apart? No.
    Consider b. Break apart? No.

DFA_MIN has four states: AC, B, D, and E.

Hopcroft's Algorithm

Add a dead state and transitions to it if necessary.
(Now, every state has an outgoing edge on every symbol.)

Π = initial partitioning
loop
    Π_NEW = Refine(Π)
    if (Π_NEW = Π) then break
    Π = Π_NEW
endloop

Construct DFA_MIN
    Each group in Π becomes a state.
    Choose one state in each group (throw all other states away).
    Preserve the edges out of the chosen state, redirecting each edge to the group that contains its target.
    Deal with start state and final states: the group containing the old start state becomes the start state, and a group of final states becomes a final state.

If desired...
    Remove the dead state.
    Remove any state unreachable from the start state.
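The construction step can be sketched in C as follows. This is an illustration under stated assumptions, not the slides' own code: the transition table `NEXT` is assumed to be the textbook DFA for (a|b)*abb (the diagram itself did not survive), and `GROUP` hard-codes the final partition (A C) (B) (D) (E) from the worked example. The names `MinDfa` and `build_min_dfa` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

enum { N_STATES = 5, N_SYMBOLS = 2, N_GROUPS = 4, START = 0 };

/* Assumed DFA_IN: states A..E are 0..4; column 0 is 'a', column 1 is 'b'. */
static const int NEXT[N_STATES][N_SYMBOLS] = {
    /* A */ {1, 2},
    /* B */ {1, 3},
    /* C */ {1, 2},
    /* D */ {1, 4},
    /* E */ {1, 2},
};
static const bool FINAL[N_STATES] = { false, false, false, false, true };

/* Final partition from the worked example: (A C) (B) (D) (E). */
static const int GROUP[N_STATES] = { 0, 1, 0, 2, 3 };

typedef struct {
    int  next[N_GROUPS][N_SYMBOLS];
    bool final[N_GROUPS];
    int  start;
} MinDfa;

static MinDfa build_min_dfa(void)
{
    MinDfa m;
    bool chosen[N_GROUPS] = { false };

    for (int s = 0; s < N_STATES; s++) {
        int g = GROUP[s];
        assert(g >= 0 && g < N_GROUPS);
        if (chosen[g]) continue;          /* throw the other states away */
        chosen[g] = true;                 /* s represents its group */
        for (int c = 0; c < N_SYMBOLS; c++)
            m.next[g][c] = GROUP[NEXT[s][c]];  /* keep edges, retarget to groups */
        m.final[g] = FINAL[s];            /* all states in a group agree on this */
    }
    m.start = GROUP[START];               /* group holding the old start state */
    return m;
}
```

Picking the first state encountered in each group is arbitrary; any member would give the same result, because states in one group agree on finality and on the group reached by every symbol.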

Π_NEW = Refine(Π)

Π_NEW = {}
for each group G in Π do
    Break G into sub-groups as follows:
        Put S and T into different subgroups if...
            for any symbol a ∈ Σ, S and T go to states in two different groups in Π.
    Add the sub-groups to Π_NEW
endfor
return Π_NEW

Example: Π = (A B C E) (D F)
    The group (A B C E) breaks into (A C) (B E).
    [Figure: on symbol x, A and B go to D and C, which lie in different groups of Π, so A and B must be split into different groups.]
    Π_NEW grows as each group is processed:
        Π_NEW = { }
        Π_NEW = { (A C) (B E) }
        Π_NEW = { (A C) (B E) (D F) }
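A compact way to try out Refine and the surrounding loop is to store a partition as an array that maps each state to a group number. The sketch below makes several assumptions that are not in the slides: the array representation, the names `stay_together`, `refine`, and `minimize`, and the transition table, which is taken to be the textbook DFA for (a|b)*abb with states A..E numbered 0..4. It is the simple quadratic refinement described above, not an optimized worklist version.

```c
#include <assert.h>
#include <stdbool.h>

enum { N_STATES = 5, N_SYMBOLS = 2 };  /* states A..E; symbols a, b */

/* Assumed DFA_IN; E is the only final state. */
static const int NEXT[N_STATES][N_SYMBOLS] = {
    /* A */ {1, 2},
    /* B */ {1, 3},
    /* C */ {1, 2},
    /* D */ {1, 4},
    /* E */ {1, 2},
};
static const bool FINAL[N_STATES] = { false, false, false, false, true };

/* True if s and t share a group and, on every symbol, move into the same group. */
static bool stay_together(const int group[], int s, int t)
{
    if (group[s] != group[t]) return false;
    for (int c = 0; c < N_SYMBOLS; c++)
        if (group[NEXT[s][c]] != group[NEXT[t][c]]) return false;
    return true;
}

/* One Refine step: fills new_group[] and returns the number of groups. */
static int refine(const int group[], int new_group[])
{
    int count = 0;
    for (int s = 0; s < N_STATES; s++) {
        new_group[s] = -1;
        for (int t = 0; t < s && new_group[s] < 0; t++)
            if (stay_together(group, s, t)) new_group[s] = new_group[t];
        if (new_group[s] < 0) new_group[s] = count++;  /* start a new sub-group */
    }
    return count;
}

/* Refine until nothing changes; returns the number of groups left in group[]. */
static int minimize(int group[])
{
    int new_group[N_STATES];
    int count = 0;

    for (int s = 0; s < N_STATES; s++)
        group[s] = FINAL[s] ? 1 : 0;      /* final states vs. all other states */
    for (;;) {
        int new_count = refine(group, new_group);
        assert(new_count >= count);       /* refinement only ever splits groups */
        for (int s = 0; s < N_STATES; s++) group[s] = new_group[s];
        if (new_count == count) break;    /* same group count: partition unchanged */
        count = new_count;
    }
    return count;
}
```

Because Refine can only split groups, comparing group counts is enough to detect that Π_NEW equals Π. On the assumed table the passes produce 3 groups, then 4, then 4 again, matching Π_2 and Π_3 in the worked example, with A and C ending in the same group.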

Summarizing...

Regular Expressions to Describe Tokens
Algorithm: Regular Expression -> NFA
Algorithm for Simulating NFA
Algorithm: NFA -> DFA
Algorithm: DFA -> Minimal DFA
Algorithm for Simulating DFA
    Fast:
        Get Next Char
        Evaluate Move Function (e.g., Array Lookup)
        Change State Variable
        Test for Accepting State
        Test for EOF
        Repeat
Scanner Generators
    Create an efficient Lexer from regular expressions!
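The fast DFA simulation loop can be sketched in a few lines of C. The table below is an assumption for illustration: a four-state minimal DFA for (a|b)*abb, with the string's terminating NUL standing in for EOF. The names `MOVE`, `ACCEPTING`, and `dfa_accepts` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { N_STATES = 4, N_SYMBOLS = 2, START = 0 };

/* Assumed move function as an array: column 0 is 'a', column 1 is 'b'. */
static const int MOVE[N_STATES][N_SYMBOLS] = {
    {1, 0}, {1, 2}, {1, 3}, {1, 0}
};
static const bool ACCEPTING[N_STATES] = { false, false, false, true };

/* Whole-string acceptance: the accepting test happens once, at end of input. */
static bool dfa_accepts(const char *input)
{
    int state = START;

    assert(input != NULL);
    for (const char *p = input; *p != '\0'; p++) {  /* get next char; test for end */
        if (*p != 'a' && *p != 'b')
            return false;                  /* unknown symbol: acts like a dead state */
        state = MOVE[state][*p - 'a'];     /* array lookup; change state variable */
    }
    return ACCEPTING[state];
}
```

A scanner would instead test for an accepting state after every character so that it can remember the longest match; this sketch only decides whether the whole string is accepted.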

Scanner Generator: LEX

Input:
    r_1    { action_1 }
    r_2    { action_2 }
    ...
    r_N    { action_N }

Requirements:
    Choose the largest lexeme that matches.
    If more than one r_i matches, choose the first one.

Generated scanner:
    DFA Simulator (C code): canned code added by the lex tool
    Transition Tables (initialized arrays): computed by the lex tool
    Input Buffers: scanned with two pointers, lex-begin-ptr and forward-ptr

Input:
    a       { Action-1 }
    abb     { Action-2 }
    a*b+    { Action-3 }

Create NFA:
    State 0 has ε edges to states 1, 3, and 7.
    Pattern-1 (a):      1 -a-> 2
    Pattern-2 (abb):    3 -a-> 4 -b-> 5 -b-> 6
    Pattern-3 (a*b+):   7 -a-> 7,  7 -b-> 8,  8 -b-> 8

Example Input: aabc...
    Start:       { 0, 1, 3, 7 }
    Input: a     { 2, 4, 7 }     Match! Pattern: 1  Length: 1
    Input: a     { 7 }
    Input: b     { 8 }           Match! Pattern: 3  Length: 3
    Input: c     { }             Done!

Identify the last match.
Execute the corresponding action and adjust pointers.
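The trace above can be reproduced by simulating the NFA directly on sets of states. The sketch below is an illustration, not the lex implementation: it assumes the nine-state NFA for the patterns a, abb, and a*b+ (the pattern text is reconstructed from the state sets in the trace) and stores each set as a bit mask. The names `StateSet`, `step`, `move`, `pattern_of`, and `longest_match` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { N_NFA = 9 };
typedef unsigned StateSet;             /* bit i set <=> NFA state i is in the set */
#define BIT(i) (1u << (i))

/* Epsilon edges: only state 0 has any, to the start of each pattern's NFA. */
static const StateSet EPS[N_NFA] = { [0] = BIT(1) | BIT(3) | BIT(7) };

/* Targets of state s on character ch (empty set if there is no edge). */
static StateSet step(int s, char ch)
{
    switch (s) {
    case 1: return ch == 'a' ? BIT(2) : 0;                       /* a    */
    case 3: return ch == 'a' ? BIT(4) : 0;                       /* abb  */
    case 4: return ch == 'b' ? BIT(5) : 0;
    case 5: return ch == 'b' ? BIT(6) : 0;
    case 7: return ch == 'a' ? BIT(7) : ch == 'b' ? BIT(8) : 0;  /* a*b+ */
    case 8: return ch == 'b' ? BIT(8) : 0;
    default: return 0;
    }
}

static StateSet eps_closure(StateSet set)
{
    StateSet prev;
    do {
        prev = set;
        for (int s = 0; s < N_NFA; s++)
            if (set & BIT(s)) set |= EPS[s];
    } while (set != prev);
    return set;
}

static StateSet move(StateSet set, char ch)
{
    StateSet out = 0;
    for (int s = 0; s < N_NFA; s++)
        if (set & BIT(s)) out |= step(s, ch);
    return eps_closure(out);
}

/* Lowest-numbered pattern whose final state is in the set, or 0 if none. */
static int pattern_of(StateSet set)
{
    if (set & BIT(2)) return 1;
    if (set & BIT(6)) return 2;
    if (set & BIT(8)) return 3;
    return 0;
}

/* Returns the length of the longest match and stores its pattern in *pattern. */
static size_t longest_match(const char *input, int *pattern)
{
    StateSet set = eps_closure(BIT(0));
    size_t best_len = 0;

    assert(input != NULL && pattern != NULL);
    *pattern = 0;
    for (size_t i = 0; input[i] != '\0' && set != 0; i++) {
        set = move(set, input[i]);
        int p = pattern_of(set);
        if (p != 0) { *pattern = p; best_len = i + 1; }  /* remember the last match */
    }
    return best_len;
}
```

On the input aabc the sets visited are {0,1,3,7}, {2,4,7}, {7}, {8}, and the empty set, so the function reports pattern 3 with length 3, as in the trace.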

Approach

Find the NFA for r_1 | r_2 | ... | r_N.
Convert to DFA.
    Each state of the DFA corresponds to a set of NFA states.
    A state is final if any NFA state in it was a final state.
    If several, choose the lowest numbered pattern to be the one accepted.
During simulation, keep following edges until you get stuck.
As the scanning proceeds...
    Every time you enter a final state...
    Remember:
        The current value of the buffer pointers
        Which pattern was recognized
Upon termination...
    Use that information to...
        Adjust the buffer pointers
        Execute the desired action
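The bookkeeping described above (remember the last final state, then back the pointers up to it) might look like the following in C. This is a sketch under assumptions: the six-state DFA is the one obtained for the patterns a, abb, and a*b+, numbered here 0..5 for the state sets {0,1,3,7}, {2,4,7}, {8}, {7}, {5,8}, {6,8}, and the `Scanner` struct with its two pointers is a hypothetical stand-in for lex's input buffers.

```c
#include <assert.h>
#include <stddef.h>

enum { N_DFA = 6, STUCK = -1 };

/* MOVE[state][0] is the edge on 'a', MOVE[state][1] the edge on 'b'. */
static const int MOVE[N_DFA][2] = {
    {1, 2}, {3, 4}, {STUCK, 2}, {3, 2}, {STUCK, 5}, {STUCK, 2}
};
/* Pattern accepted in each state (0 = not final). State 5 matches both
   abb and a*b+; the lowest-numbered pattern wins. */
static const int PATTERN[N_DFA] = { 0, 1, 3, 0, 3, 2 };

typedef struct {
    const char *lexeme_begin;  /* start of the lexeme being scanned */
    const char *forward;       /* next character to read */
} Scanner;

/* Scans one token: returns its pattern number (0 if nothing matched) and
   stores the lexeme length in *len. Both pointers end just past the lexeme. */
static int next_token(Scanner *sc, size_t *len)
{
    int state = 0;
    int last_pattern = 0;
    const char *last_end;

    assert(sc != NULL && len != NULL);
    last_end = sc->lexeme_begin;
    sc->forward = sc->lexeme_begin;
    while (*sc->forward == 'a' || *sc->forward == 'b') {
        state = MOVE[state][*sc->forward - 'a'];
        if (state == STUCK) break;          /* no edge: stop following edges */
        sc->forward++;
        if (PATTERN[state] != 0) {          /* entered a final state: remember it */
            last_pattern = PATTERN[state];
            last_end = sc->forward;
        }
    }
    *len = (size_t)(last_end - sc->lexeme_begin);
    sc->forward = last_end;                 /* adjust the buffer pointers */
    sc->lexeme_begin = last_end;
    return last_pattern;
}
```

Calling `next_token` repeatedly on the text abbaab yields pattern 2 (abb) and then pattern 3 (aab); a caller would run the matching action after each call. A return value of 0 with length 0 means no pattern matched at the current position.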

Example

Input:
    a       { Action-1 }
    abb     { Action-2 }
    a*b+    { Action-3 }

Create NFA: states 0-8, with ε edges from state 0 into the sub-NFAs for the three patterns.

Construct Minimal DFA: its states are the NFA state sets
    {0,1,3,7}   {2,4,7}   {7}   {8}   {5,8}   {6,8}

Attach Actions:
    {2,4,7}    a
    {8}        a*b+
    {5,8}      a*b+
    {6,8}      abb (a*b+ also matches here; accept only the first pattern)

The Lex Tool

Oldest, most well-known scanner generator.
For the Unix/C environment.

In UNIX:
    % lex lex.l
    % cc lex.yy.c

File lex.l (contains several regular expressions) -> Lex Tool -> File lex.yy.c
    lex.yy.c is a program in C... ready to compile and link with the parser (e.g., YACC output).

Input file format: C code enclosed in %{ and %}, then regular definitions, then %%, then regular expressions with actions, then %%, then any further C code.

Regular Expressions in Lex

    abc        Concatenation; most characters stand for themselves
    | * ( )    Meta characters with their usual meanings. Example: (a|b)*c*
    +          One or more, e.g., ab+c
    ?          Optional, e.g., ab?c
    [x-y]      Character classes, e.g., [a-z][a-zA-Z0-9]*
    [^x-y]     Anything but [x-y]
    \x         The usual escape sequences, e.g., \n
    .          Any character except \n
    ^          Beginning of line
    $          End of line
    "..."      To use the meta characters literally. Example: PCAT comments: "(*".*"*)"
    {...}      Defined names, e.g., {letter}
    /          Look-ahead. Example: ab/cd (matches ab, but only when followed by cd)

Look-Ahead Operator, "/"

ab/cd matches ab, but only if followed by cd.

Add a special ε edge for "/":
    -a-> -b-> -ε-> -c-> -d->

Mark the following state to make a note of...
    The pattern in question
    The current value of the buffer pointers
...whenever this state is encountered during scanning.
    ("/" encountered: save the buffer pointers.)

When a pattern is finally matched, check these notes.
If we passed through a "/" state for the pattern accepted,
    use the stored buffer positions, instead of the final positions, to describe the lexeme matched.
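For a single fixed pattern, the save-and-restore idea can be shown in a few lines of C. The sketch assumes the pattern is ab/cd (the letters a and b are reconstructed here) and hard-codes it as two strings; the function name `match_with_lookahead` is hypothetical, and a real lex-generated scanner would do this bookkeeping inside its DFA tables instead.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the length of the lexeme matched by ab/cd at the start of input,
   or 0 if the pattern (including its trailing context) does not match. */
static size_t match_with_lookahead(const char *input)
{
    static const char LEXEME[]  = "ab";  /* part before the '/' */
    static const char CONTEXT[] = "cd";  /* required trailing context */
    const char *forward = input;
    const char *saved;

    assert(input != NULL);
    for (size_t i = 0; LEXEME[i] != '\0'; i++)
        if (*forward++ != LEXEME[i]) return 0;
    saved = forward;                     /* '/' state reached: save the pointer */
    for (size_t i = 0; CONTEXT[i] != '\0'; i++)
        if (*forward++ != CONTEXT[i]) return 0;
    return (size_t)(saved - input);      /* lexeme ends at the saved position */
}
```

For the input abcd the scan reads all four characters, but the reported lexeme length is 2, so the characters cd remain available for the next token.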

Lex: Input File Format

%{
    ...Any C Code...
%}
    ...Regular Definitions...
%%
    ...Regular Expressions with Actions...
%%
    ...Any C Code...

Lex: Input File Format - C Code

%{
    #define ID      13
    #define NUM     14
    #define PLUS    15
    #define MINUS   16
    ...
    #define WHILE   37
    #define IF      38
    ...
%}
    ...Regular Definitions...
%%
    ...Regular Expressions with Actions...
%%
    ...
    int lookup (char * p) {...}
    int enter (char * p, int i) {...}
    ...

The code between %{ and %} is any C code; it is copied without changes to the beginning of the output file.
The code after the second %% is any C code; it is added to the end of the output file (typically, auxiliary support routines).

Lex: Input File Format - Defined Names

%{
    ...Any C Code...
%}
    delim     [ \t\n]
    white     {delim}+
    letter    [a-zA-Z]
    digit     [0-9]
    id        {letter}({letter}|{digit})*
    num       {digit}+(\.{digit}+)?
%%
    ...Regular Expressions with Actions...
%%
    ...Any C Code...

The left column holds the defined names. Defined names can be used in regular expressions.
The blank in [ \t\n] is a literal blank: every character is itself, literally.

Lex: Input File Format - Regular Expressions with Actions

%{
    ...Any C Code...
%}
    ...Regular Definitions...
%%
    "+"        {return PLUS;}
    "-"        {return MINUS;}
    ...
    while      {return WHILE;}
    if         {return IF;}
    ...
    {white}    {}
    {num}      {yylval = ...; return NUM;}
    {id}       {yylval = ...lookup(...)...; return ID;}
%%
    ...Any C Code...

The left column holds the regular expressions; the right column holds the actions.
Each action is any C code. Include a return to give the token to the parser.
No return means do nothing. (The token is recognized but not returned to the parser.)
yylval is where token attribute info is stored.
You may use these variables to access the lexeme:
    char * yytext;
    int yyleng;