Compiler Construction Lecture 9
DFA Minimization
The generated DFA may have a large number of states.
Hopcroft's algorithm minimizes the number of DFA states.
DFA Minimization
Idea: find groups of equivalent states.
For each input symbol, all transitions from the states in one group G1 go to states in the same group G2.
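The grouping idea can be sketched as a simple partition refinement. This is not Hopcroft's worklist algorithm; it is the slower textbook variant that repeatedly splits groups until nothing changes, which is easier to read. The `Dfa` struct, the function names, and the state numbering are assumptions made for this sketch, and the transition table in `exampleDfa` is assumed to be the usual five-state DFA for (a|b)*abb with states A to E.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical DFA representation: states 0..n-1, symbols 0..k-1.
struct Dfa {
    std::vector<std::vector<int>> next;  // next[state][symbol]
    std::vector<bool> accepting;         // accepting[state]
    int start;
};

// Assumed table for (a|b)*abb: A=0, B=1, C=2, D=3, E=4; a=0, b=1.
Dfa exampleDfa() {
    Dfa d;
    d.next = {{1, 2}, {1, 3}, {1, 2}, {1, 4}, {1, 2}};
    d.accepting = {false, false, false, false, true};  // only E accepts
    d.start = 0;
    return d;
}

// Returns group[state]; two states share a group number exactly when
// they are equivalent. Group numbers run from 0 upward.
std::vector<int> equivalentGroups(const Dfa& dfa) {
    const std::size_t n = dfa.next.size();
    assert(dfa.accepting.size() == n);

    // Initial partition: accepting versus non-accepting states.
    std::vector<int> group(n);
    for (std::size_t s = 0; s < n; ++s) group[s] = dfa.accepting[s] ? 1 : 0;

    std::size_t count = 0;  // number of groups after the previous round
    while (true) {
        // A state's signature is its own group followed by the group
        // reached on each symbol. Equal signatures stay together.
        std::map<std::vector<int>, int> ids;
        std::vector<int> refined(n);
        for (std::size_t s = 0; s < n; ++s) {
            std::vector<int> sig{group[s]};
            for (int t : dfa.next[s]) sig.push_back(group[t]);
            refined[s] =
                ids.emplace(sig, static_cast<int>(ids.size())).first->second;
        }
        // Refinement only ever splits groups, so an unchanged group
        // count means the partition is stable.
        if (ids.size() == count) return group;
        group = refined;
        count = ids.size();
    }
}
```

Calling `equivalentGroups(exampleDfa())` places A and C in one group and leaves B, D, and E in groups of their own.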
DFA Minimization
Construct the minimized DFA such that there is one state for each group of states from the initial DFA.
DFA Minimization
[Figure: DFA for (a|b)*abb, with states A, B, C, D, and E.]
DFA Minimization
[Figure: minimized DFA for (a|b)*abb, with states {A,C}, B, D, and E.]
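Once the groups are known, building the minimized DFA is mechanical: each group becomes one state, and each group inherits the transitions of its members. The sketch below assumes the grouping is already available as a vector of group numbers; the `Dfa` struct and `buildQuotient` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical DFA representation: states 0..n-1, symbols 0..k-1.
struct Dfa {
    std::vector<std::vector<int>> next;  // next[state][symbol]
    std::vector<bool> accepting;         // accepting[state]
    int start;
};

// group[s] is the group number (0..groupCount-1) of state s.
// Assumes the groups really are equivalence classes, so every member
// of a group agrees on the target group for each symbol.
Dfa buildQuotient(const Dfa& dfa, const std::vector<int>& group,
                  int groupCount) {
    assert(group.size() == dfa.next.size());
    const std::size_t symbols = dfa.next.empty() ? 0 : dfa.next[0].size();

    Dfa result;
    result.next.assign(groupCount, std::vector<int>(symbols, -1));
    result.accepting.assign(groupCount, false);
    result.start = group[dfa.start];  // group containing the old start state

    for (std::size_t s = 0; s < dfa.next.size(); ++s) {
        const int g = group[s];
        for (std::size_t c = 0; c < symbols; ++c)
            result.next[g][c] = group[dfa.next[s][c]];
        if (dfa.accepting[s]) result.accepting[g] = true;
    }
    return result;
}
```

For the (a|b)*abb example, passing the grouping {A,C}, {B}, {D}, {E} produces a four-state DFA whose merged state loops to itself on b and moves to B on a.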
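Optimized Acceptor
[Figure: pipeline. The regular expression R passes through RE=>NFA, NFA=>DFA, and DFA minimization. The minimized DFA is then simulated on the input string w. Output: yes, if w ∈ L(R); no, if w ∉ L(R).]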
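The final "Simulate DFA" stage is a short table-driven loop. As a minimal sketch, the function below hard-codes the minimized DFA for (a|b)*abb; the state numbering (0 = {A,C}, 1 = B, 2 = D, 3 = E) and the function name are assumptions chosen for this example.

```cpp
#include <cassert>
#include <string>

// Table-driven acceptor for (a|b)*abb over the alphabet {a, b}.
// Assumed numbering: 0 = {A,C}, 1 = B, 2 = D, 3 = E (accepting).
bool acceptsAbb(const std::string& w) {
    static const int next[4][2] = {
        {1, 0},  // {A,C}: a -> B, b -> {A,C}
        {1, 2},  // B:     a -> B, b -> D
        {1, 3},  // D:     a -> B, b -> E
        {1, 0},  // E:     a -> B, b -> {A,C}
    };
    int state = 0;  // start state
    for (char c : w) {
        if (c != 'a' && c != 'b') return false;  // symbol outside the alphabet
        state = next[state][c == 'a' ? 0 : 1];
        assert(state >= 0 && state < 4);  // table entries stay in range
    }
    return state == 3;  // yes exactly when we stop in the accepting state
}
```

The loop does one table lookup per input character, so the running time is linear in the length of w regardless of how complex the regular expression was.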
Lexical Analyzers
Lexical analyzers (scanners) use the same mechanism, but they:
Have multiple RE descriptions for multiple tokens.
Have a character stream at the input.
Lexical Analyzers
Return a sequence of matching tokens at the output (or an error).
Always return the longest matching token.
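Lexical Analyzers
[Figure: pipeline. The regular expressions R1 and R2 pass through RE=>NFA, NFA=>DFA, and DFA minimization. Simulating the minimized DFA on the character stream produces the token stream.]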
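The longest-match rule can be illustrated without a DFA at all. The sketch below is a hand-written scanner, not what a generator emits: each rule reports how long a prefix it can match, the longest match wins, and ties go to the rule listed first (the same tie-break Flex applies). The token codes, the `Lexeme` struct, and all function names are hypothetical.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token codes for this sketch only.
enum Token { T_ERROR = 0, T_IF, T_INT, T_ID, T_NUM };

struct Lexeme {
    Token token;
    std::string text;
};

// Each matcher returns the length of the longest prefix of s starting
// at pos that it accepts, or 0 if it does not match.
static std::size_t matchWord(const std::string& s, std::size_t pos,
                             const std::string& word) {
    return s.compare(pos, word.size(), word) == 0 ? word.size() : 0;
}

static std::size_t matchId(const std::string& s, std::size_t pos) {
    std::size_t i = pos;
    if (i < s.size() &&
        (std::isalpha(static_cast<unsigned char>(s[i])) || s[i] == '_')) {
        ++i;
        while (i < s.size() &&
               (std::isalnum(static_cast<unsigned char>(s[i])) || s[i] == '_'))
            ++i;
    }
    return i - pos;
}

static std::size_t matchNum(const std::string& s, std::size_t pos) {
    std::size_t i = pos;
    while (i < s.size() && std::isdigit(static_cast<unsigned char>(s[i]))) ++i;
    return i - pos;
}

std::vector<Lexeme> scan(const std::string& s) {
    std::vector<Lexeme> out;
    std::size_t pos = 0;
    while (pos < s.size()) {
        if (std::isspace(static_cast<unsigned char>(s[pos]))) { ++pos; continue; }

        // Rules in priority order: keywords first, then identifiers, numbers.
        const std::size_t len[] = {matchWord(s, pos, "if"),
                                   matchWord(s, pos, "int"),
                                   matchId(s, pos), matchNum(s, pos)};
        const Token tok[] = {T_IF, T_INT, T_ID, T_NUM};

        std::size_t best = 0;
        Token bestTok = T_ERROR;
        for (int r = 0; r < 4; ++r) {
            // Strict '>' keeps the earlier rule when lengths are equal.
            if (len[r] > best) { best = len[r]; bestTok = tok[r]; }
        }
        if (best == 0) best = 1;  // no rule matched: emit one error character
        out.push_back({bestTok, s.substr(pos, best)});
        pos += best;
    }
    return out;
}
```

Scanning `if iffy int integer` shows both rules at work: `if` and `int` match the keyword and identifier rules at equal length, so the keyword wins on priority, while `iffy` and `integer` are longer as identifiers, so the identifier rule wins on length.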
Lexical Analyzer Generators
The lexical analysis process can be automated.
We only need to specify:
Regular expressions for tokens.
Rule priorities for cases where multiple rules produce the same longest match.
Lexical Analyzer Generators
Flex generates a lexical analyzer in C or C++.
JLex is written in Java and generates a lexical analyzer in Java.
Using Flex
Provide a specification file.
Flex reads this file and produces a C or C++ output file that contains the scanner.
The specification file consists of three sections.
Flex Specification File
1. C or C++ and flex definitions
%%
2. Token definitions and actions
%%
3. User code
Specification File lex.l

%{
#include "tokdefs.h"
%}
D   [0-9]
L   [a-zA-Z_]
id  {L}({L}|{D})*
%%
"void"    {return(TOK_VOID);}
"int"     {return(TOK_INT);}
"if"      {return(TOK_IF);}
"else"    {return(TOK_ELSE);}
"while"   {return(TOK_WHILE);}
"<="      {return(TOK_LE);}
">="      {return(TOK_GE);}
"=="      {return(TOK_EQ);}
"!="      {return(TOK_NE);}
{D}+      {return(TOK_INTLIT);}
{id}      {return(TOK_ID);}
[\n]|[\t]|[ ]   ;
%%

File tokdefs.h

#define TOK_VOID   1
#define TOK_INT    2   /* keyword int */
#define TOK_IF     3
#define TOK_ELSE   4
#define TOK_WHILE  5
#define TOK_LE     6
#define TOK_GE     7
#define TOK_EQ     8
#define TOK_NE     9
#define TOK_INTLIT 10  /* integer literal */
#define TOK_ID     11
Invoking Flex
lex.l -> flex -> lex.cpp
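Using Generated Scanner

#include <iostream>
#include <FlexLexer.h>
using namespace std;
int main() {
    yyFlexLexer lex;  // concrete scanner class; FlexLexer itself is abstract
    int tc = lex.yylex();
    while (tc != 0) {
        cout << tc << "," << lex.YYText() << endl;
        tc = lex.yylex();
    }
}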
Creating Scanner EXE

flex -+ -o lex.cpp lex.l
g++ -c lex.cpp
g++ -c main.cpp
g++ -o lex.exe lex.o main.o
lex < main.cpp