Lexical Analysis. Role, Specification & Recognition Tool: LEX Construction: - RE to NFA to DFA to min-state DFA - RE to DFA


1 Lexical Analysis: Role, Specification & Recognition. Tool: LEX. Construction: RE to NFA to DFA to min-state DFA; RE to DFA

2 Conducting Lexical Analysis. Techniques for specifying and implementing lexical analyzers. Hand-written: a state transition diagram that reveals the structure of the tokens, plus a hand-translated driver program. Tools: pattern-triggered actions in a pattern-action language (LEX). Other applications: query languages, information retrieval, AWK, shell commands, PCB inspection.

3 Lexical Analyzer and Parser [figure: source program -> lexical analyzer -> (token & attributes) -> parser; the parser issues "get next token"; both use the symbol table]. Token: the smallest logically cohesive sequence of characters of interest in the source program (Aho, Sethi, Ullman, p. 84)

4 Lexical Analysis: convert input lexemes to a stream of tokens. Lexeme: a sequence of characters that comprises a single token. Typical functions:
  Removal of white space and comments, instead of writing productions that include spaces and comments.
  Keeping a line count, for associating an error message with a line number.
  Digits into token ID + value/attributes, instead of writing productions for integer constants: <num, 31> <+, > <num, 28> <+, > <num, 59>
  Recognizing identifiers and keywords. Identifiers: count = count + increment => id = id + id. Keywords: begin, end, if, else => begin, end, if, else.
  Operators/punctuation: >, <=, <>

5 Why Separate Lexical Analysis from Parsing? Simpler design: no production rules and translations for white space and comments. Improved efficiency: the lexical analyzer can be optimized separately (e.g., using specialized buffering techniques). Enhanced compiler portability: input alphabet peculiarities and device-specific anomalies can be restricted to the lexical analyzer.

6 Tokens, Patterns, Lexemes. Token: a terminal symbol or lexical unit of the parser, representing a set of strings of a particular type; e.g., pi, count, ... => id; e.g., ..., 6.02e23, ... => num. Typical tokens: keywords, operators, identifiers, constants, literal strings, punctuation symbols. Representation: an integer (e.g., #define ID 258) with associated attributes. Pattern: a specification of the set of strings, i.e., a rule describing the set of lexemes that can represent a particular token; e.g., id => a letter followed by letters and digits. Lexeme: a sequence of characters in the source program that is matched by the pattern for a token. Examples: Fig. 3.2

7 Attributes for Tokens. Attributes: additional information for a particular lexeme when matching multiple patterns; used for parsing decisions and translation. Implementation: a pointer to the symbol table entry in which the token information is kept. Example: E = M * C ** 2
  E (or M, C): <id, ptr to symbol-table entry for E (M, C)>
  =: <assign_op, >
  *: <multi_op, >
  **: <exp_op, >
  2: <num, integer value 2>

8 Lexical Errors. Matched but ambiguous: left to the other phases (e.g., the parser); e.g., in fi ( a == f(x) ), is fi an identifier or a misspelling of if? Unmatched: Panic mode recovery: delete successive characters from the remaining input until a well-formed token is found. Repair input (single error): deleting an extraneous character, inserting a missing character, replacing an incorrect character with a correct one, or transposing two adjacent characters. Minimum-distance error correction (multiple errors).

9 Specification of Tokens. A Formal Specification for Tokens or Patterns: Strings and Languages; Regular Expressions & Definitions; Recognition of Tokens

10 Strings and Languages. Alphabet (or character class, character set): any finite set of symbols. String over some alphabet: a finite sequence of symbols drawn from that alphabet. Length of string s, |s|: the number of symbols in s. Empty string ε: the special string of length zero. Related notions, each illustrated on the string abcdef: (proper) prefix, (proper) suffix, (proper) substring, subsequence.

11 Strings and Languages. Language: any set of strings over some alphabet. Empty set: ∅ = {}, the language containing no strings at all. It differs from {ε}, the set containing only the empty string.

12 Operations on Strings. Concatenation: xy, with εs = sε = s; e.g., x = Dog, y = House => xy = DogHouse. Exponentiation: s^i = s^(i-1) s, with s^0 = ε.

13 Operations on Languages. Union: L ∪ M = {s | s is in L or s is in M}. Concatenation: LM = {st | s is in L and t is in M}. Kleene closure (zero or more concatenations): L* = union of L^i (i = 0 ... infinity), where L^0 = {ε} and L^i = L^(i-1) L. Positive closure (one or more concatenations): L+ = union of L^i (i = 1 ... infinity).

14 Examples. L = {A, B, ..., Z, a, b, ..., z}, D = {0, 1, ..., 9}. Union: L ∪ D = {letters and digits, i.e., strings of length 1}. Concatenation: LD = {a letter followed by a digit} (= {A0, A1, B0, ...}); L^4 = {4-letter strings} (= {aaaa, AABC, BBBB, ...}). Kleene closure (zero or more concatenations): L* = {all strings of letters of length zero (i.e., ε) or more}; L(L ∪ D)* = {all strings of letters and digits, starting with a letter}. Positive closure (one or more concatenations): D+ = {strings of one or more digits}.
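The language operations above can be tried on small finite sets. The following is a minimal Python sketch, not part of the original slides: the helper names (`concat`, `power`, `kleene_upto`) are hypothetical, the two-element sets stand in for the full L and D, and the Kleene closure is only approximated up to a bound because L* is infinite.

```python
from itertools import product

def concat(L, M):
    """Concatenation LM = {st | s in L and t in M}."""
    return {s + t for s, t in product(L, M)}

def power(L, i):
    """L^0 = {''} (only the empty string), L^i = L^(i-1) L."""
    result = {""}
    for _ in range(i):
        result = concat(result, L)
    return result

def kleene_upto(L, n):
    """Finite approximation of L*: the union of L^i for i = 0..n."""
    out = set()
    for i in range(n + 1):
        out |= power(L, i)
    return out

letters = {"a", "b"}   # small stand-in for the slide's L (assumption)
digits = {"0", "1"}    # small stand-in for the slide's D (assumption)

union = letters | digits               # L U D
letter_digit = concat(letters, digits) # LD
four_letter = power(letters, 4)        # L^4
```

Here `letter_digit` contains the four strings a0, a1, b0, b1, and `four_letter` contains all 16 four-letter strings over {a, b}.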

15 Regular Expression (R.E.): A Formal Specification for Tokens

16 Regular Expression: Syntax for Specifying String Patterns. A regular expression r over alphabet Σ defines the language L(r) corresponding to r. Regular set: a language denoted by a regular expression. Basic symbols: the empty string ε; any symbol a in the input symbol set Σ. Basic operators: disjunction (OR, union): r | s; concatenation (AND): r · s (or simply rs); closure (repetition): r*; identity (parenthesized): (r).

17 Regular Expression: Syntax for Specifying String Patterns. Extended operators:
  ? : optional operator
  + : positive closure operator
  . : any character but newline
  [a-z] : character class
  [^a-z] : complement (any character NOT in [a-z])
  {m,n} : number of occurrences
  ^ : start of line
  $ : end of line
  registers: the n-th part of a match: \1, \2; e.g., sed s/.*<img src=\([^ >]*\).*/\1/g
  escape, meta-symbols: \c (character c literally); [a\-z]: a, - or z (NOT: a, b, ..., z)
  r/s : r which is followed by s ( / : lookahead operator)

18 Notational Shorthands. One or more instances: (r)+ denotes (L(r))+; r* = r+ | ε; r+ = r r*. Zero or one instance: r? = r | ε. Character classes: [abc] = a | b | c; [a-z] = a | b | ... | z.

19 Regular Expression Examples: Σ = {a, b}
  r = a | b : {a, b}
  r = (a | b)(a | b) : {aa, ab, ba, bb} = aa | ab | ba | bb (another equivalent regular expression)
  r = a* : {ε, a, aa, aaa, ...}
  r = (a | b)* : {all strings of a's and b's} = (a*b*)*

20 Equivalence. A language may be represented by two or more equivalent regular expressions. Equivalence: L(r) = L(s) => r = s. Algebraic properties of regular expressions: Commutative: r | s = s | r. Associative: r | (s | t) = (r | s) | t; (rs)t = r(st). Distributive: r(s | t) = rs | rt and (s | t)r = sr | tr. Identity element (ε): εr = r and rε = r. Application of the properties (proof of equivalence): r* = (r | ε)*; r** = r*.

21 Regular Definition: A CFG-like Notation for Regular Expressions. Similar to a CFG: define regular expressions in terms of named regular expressions: d1 -> r1, d2 -> r2, ..., dn -> rn

22 Regular Definition. Example of a regular definition:
  letter -> A | B | C | ... | z
  digit -> 0 | 1 | ... | 9
  id -> letter (letter | digit)*
Another example: unsigned numbers (Ex. 3.5), such as 512, 3.14, 6.33E4, 1.89E-5:
  digit -> 0 | 1 | ... | 9
  digits -> digit digit*
  optional_fraction -> . digits | ε
  optional_exponent -> (E (+ | - | ε) digits) | ε
  num -> digits optional_fraction optional_exponent
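The unsigned-number definition above can be mirrored with named sub-patterns. This is a small Python sketch using the standard `re` module; the variable names follow the slide, while `is_num` is a hypothetical helper added for illustration.

```python
import re

# Each named piece corresponds to one line of the regular definition.
digit = r"[0-9]"
digits = rf"{digit}+"                        # digit digit*
optional_fraction = rf"(?:\.{digits})?"      # . digits | empty
optional_exponent = rf"(?:E[+-]?{digits})?"  # (E (+ | - | empty) digits) | empty
num = re.compile(rf"{digits}{optional_fraction}{optional_exponent}")

def is_num(text):
    """True if the whole string is an unsigned number per the definition."""
    return num.fullmatch(text) is not None
```

With this definition, `is_num("6.33E4")` and `is_num("1.89E-5")` hold, while strings such as `"3."` or `".5"` are rejected because both `digits` parts are mandatory.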

23 Nonregular Sets. Some languages cannot be described by any regular expression. Examples: Balanced and nested constructs; these can, however, be specified by a CFG. Repeating strings {wcw | w is a string of a's and b's} = {aca, bcb, abcab, ...}; these cannot be expressed by a CFG either. Context-dependent strings: nHa1a2...an

24 Regular Expression: Syntax for Specifying String Patterns. Chomsky hierarchy: regular sets (R.E.) ⊂ context-free ⊂ context-sensitive ⊂ recursively enumerable (Turing machine)

25 Regular Expression: Syntax for Specifying String Patterns. Applications: matching wildcard characters (shell commands, filename expansion); string pattern matching (grep, awk); search engines (keyword matching, fuzzy match); string pattern editing/processing (sed, vi, tr)

26 Recognition of Tokens

27 Example Task. Grammar:
  stmt -> if expr then stmt | if expr then stmt else stmt | ε
  expr -> term relop term | term
  term -> id | num

28 Example Task. Terminal symbols:
  if -> if
  then -> then
  else -> else
  relop -> < | <= | = | <> | > | >=
  id -> letter (letter | digit)*
  num -> digit+ (. digit+)? (E (+ | -)? digit+)?
White space delimiters:
  delim -> blank | tab | newline
  ws -> delim+

29 Example Task. Goal: construct a lexical analyzer that isolates the lexeme for the next token and produces the token and its associated attribute values. Methods: FA / FSA (Finite (State) Automata). By hand: construct the FAs and a simulator for the FAs; the simulator (scanner) depends on the FAs. By tools: write regular definitions for a scanner generator, which builds the FAs for the scanner; the scanner is a driver program that is independent of the form of the FAs.

30 FA and Transition Diagrams. r = (abc)+ [figure: transition diagram with edges labeled a, b, c; callouts mark a state, a transition, the start state, and the final state]

31 FA/FSA and Transition Tables. NextState = Move(CurrentState, Input)
  state   a    b    c
  q0      q1   -    -
  q1      -    q2   -
  q2      -    -    q3
  q3      q1   -    -
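A transition table like the one above can drive a generic recognizer. The sketch below, in Python, encodes the table for r = (abc)+ as a dictionary; the names `TABLE` and `accepts` are hypothetical, and a missing table entry is treated as immediate rejection (an assumption, since the slide does not show an error state).

```python
# NextState = Move(CurrentState, Input), stored as a sparse table.
TABLE = {
    ("q0", "a"): "q1",
    ("q1", "b"): "q2",
    ("q2", "c"): "q3",
    ("q3", "a"): "q1",
}

def accepts(text, start="q0", finals=frozenset({"q3"})):
    """Run the table-driven automaton over text; reject on a missing entry."""
    state = start
    for ch in text:
        state = TABLE.get((state, ch))
        if state is None:        # no transition defined: reject
            return False
    return state in finals
```

The driver never mentions a, b, or c directly, so a different token pattern only needs a different table.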

32 Recognition
  state = 0;
  while ((c = next_char()) != EOF) {
    switch (state) {
      case 0: if (c == 'a') state = 1; break;
      case 1: if (c == 'b') state = 2; break;
      case 2: if (c == 'c') state = 3; break;
      case 3: if (c == 'a') state = 1;
              else { ungetchar(); return (TRUE); }
              break;
      default: error();
    }
  }
  if (state == 3) return (TRUE); else return (FALSE);

33 Finite Automata for the Lexical Tokens [figure: separate automata for IF, ID, NUM, REAL, white space (and comments starting with "--"), and error] (Appel, p. 21)

34 Regular expressions for tokens (Appel, p. 20)
  if                                      {return IF;}
  [a-z][a-z0-9]*                          {return ID;}
  [0-9]+                                  {return NUM;}
  ([0-9]+"."[0-9]*)|("."[0-9]+)           {return REAL;}
  ("--"[a-z]*"\n")|(" "|"\n"|"\t")+       {/* do nothing */}
  .                                       {error();}

35 Recognition of the Lexical Tokens Given the FAs (Naïve Pattern Matching). Traverse the transition diagrams in sequence, trying each of the above state transition diagrams until one matches. Give unique state numbers to the initial states (and other states) of the individual diagrams before writing a program that simulates the traversal. If two state transition diagrams have a super-/sub-string relationship, match the longest expression first; e.g., match REAL before INTEGER. On failure, next_state = init_state of the next FA. Example program: [Aho 86]

36 Finite State Automata

37 How to Construct FA Systematically? You can construct a single complicated state transition diagram directly to recognize all token types, if you are smart enough (e.g., next page); or you can do it systematically by constructing simpler transition diagrams and composing them into larger networks. The systematic way is preferred for automatic construction, and its correctness is easy to verify.

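38 A DFA for Recognizing Common Token Types [figure: combined DFA whose states are sets of NFA states: start {1,4,9,14}; ID {2,5,6,8,15}; ID {5,6,7,8,15}; NUM {10,11,12,13,15}; IF (or ID) {3,6,7,8}; NUM {11,12,13}; ID {6,7,8}; error on other input; edges labeled i, f, a-h, j-z, a-e, g-z, a-z, 0-9]. When a state matches both IF and ID, the 1st pattern or reserved word in the LEX spec wins. Longest match. (Appel, p. 29)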

39 Finite (State) Automata. A set of states: S. A set of input symbols: Σ (the input symbol alphabet). A transition (move) function: δ(s, a) = s'. Initial (start) state: s0. A set of final (accepting) states: F.

40 Finite (State) Automata. Graphical representation: state transition diagram. Implementation: state transition table. Deterministic (DFA): a single transition for every state on every input symbol. Non-deterministic (NFA): more than one transition for at least one state on some input symbol.

41 NFA: Nondeterministic Finite Automata. An NFA consists of: S: a finite set of states. Σ: a finite set of input symbols. δ: a transition function that maps (state, symbol) pairs to sets of states. s0: a state distinguished as the start state. F: a set of states distinguished as final states.

42 NFA: An Example. RE: (a | b)*abb. States: {0, 1, 2, 3}. Input symbols: {a, b}. Transition function: δ(0,a) = {0,1}, δ(0,b) = {0}, δ(1,b) = {2}, δ(2,b) = {3}. Start state: 0. Final states: {3}.

43 Transition Diagram (NFA) for (a | b)*abb [figure: start -> 0; state 0 loops on a and b; 0 -a-> 1 -b-> 2 -b-> 3]. States: {0/start/init., 1, 2, 3/final}. Input symbols: {a, b}. NFA transition function: δ(0,a) = {0,1}, δ(0,b) = {0}, δ(1,b) = {2}, δ(2,b) = {3}.

44 Acceptance of NFA. An NFA accepts an input string s iff there is some path in the transition diagram from the start state to some final state such that the edge labels along this path spell out s. Example: bbababb is accepted by (a | b)*abb; bbabab is NOT.

45 NFA: Example with ε-transitions. RE: aa* | bb*. States: {0, 1, 2, 3, 4}. Input symbols: {a, b}. Transition function: δ(0, ε) = {1, 3}, δ(1, a) = {2}, δ(2, a) = {2}, δ(3, b) = {4}, δ(4, b) = {4}. Start state: 0. Final states: {2, 4}.

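46 Transition Diagram (NFA) for aa* | bb* [figure: start -> 0; ε-edges from 0 to 1 and 3; 1 -a-> 2 with a loop on a; 3 -b-> 4 with a loop on b]. NFA transition function: δ(0, ε) = {1, 3}, δ(1, a) = {2}, δ(2, a) = {2}, δ(3, b) = {4}, δ(4, b) = {4}.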

47 Deterministic Finite Automata. A DFA is a special case of an NFA in which no state has an ε-transition and, for each state s and input symbol a, there is at most one edge labeled a leaving s.

48 DFA: An Example. RE: (a | b)*abb. States: {0, 1, 2, 3}. Input symbols: {a, b}. Transition function: δ(0,a) = {1}, δ(1,a) = {1}, δ(2,a) = {1}, δ(3,a) = {1}; δ(0,b) = {0}, δ(1,b) = {2}, δ(2,b) = {3}, δ(3,b) = {0}. Start state: 0. Final states: {3}.

49 Transition Diagram: a DFA for (a | b)*abb [figure: states 0 to 3, start state 0, final state 3, edges labeled a and b]

50 Transition Diagram [figure: the NFA with states 0, 1, 2, 3 shown above a DFA for (a | b)*abb whose states are the NFA state sets {0} (start), {0,1}, {0,2}, {0,3}]

51 Recognition of Regular Expression Using DFA: Simulating Deterministic Finite Automata (DFA)
  initialization: current_state = s0; input_symbol = 1st symbol
  while (current_state is not fail_state && input_symbol != EOF)
    next_state = δ(current_state, input_symbol)
    current_state = next_state
    input_symbol = next_input_symbol
  if (current_state in final states) accept() else fail()

52 Simulating a DFA. Input. An input string ended with eof and a DFA with start state s0 and final states F. Output. The answer "yes" if the DFA accepts the string, "no" otherwise.
  begin
    s := s0;
    c := nextchar;
    while c <> eof do begin
      s := move(s, c);   // transition function
      c := nextchar
    end;
    if s is in F then return "yes" else return "no"
  end.
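The simulation loop above translates almost line for line into Python. This sketch uses the DFA for (a | b)*abb from the earlier "DFA: An Example" slide; the names `MOVE` and `simulate_dfa` are hypothetical, and input outside {a, b} is assumed not to occur (it would raise `KeyError`).

```python
# Transition function of the DFA for (a|b)*abb: MOVE[state][symbol] -> state.
MOVE = {
    0: {"a": 1, "b": 0},
    1: {"a": 1, "b": 2},
    2: {"a": 1, "b": 3},
    3: {"a": 1, "b": 0},
}

def simulate_dfa(text, s0=0, finals=frozenset({3})):
    """Return True ("yes") if the DFA accepts text, False ("no") otherwise."""
    s = s0
    for c in text:          # reaching the end of text plays the role of eof
        s = MOVE[s][c]      # s := move(s, c)
    return s in finals
```

Running it on bbababb ends in state 3 and answers yes; on bbabab it ends in state 2 and answers no.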

53 DFA: An Example (a | b)*abb [figure: transition diagram of the DFA, edges labeled a and b]

54 An Example
  Input bbababb:             Input bbabab:
  s = 0                      s = 0
  s = move(0, b) = 0         s = move(0, b) = 0
  s = move(0, b) = 0         s = move(0, b) = 0
  s = move(0, a) = 1         s = move(0, a) = 1
  s = move(1, b) = 2         s = move(1, b) = 2
  s = move(2, a) = 1         s = move(2, a) = 1
  s = move(1, b) = 2         s = move(1, b) = 2
  s = move(2, b) = 3         s is not in {3}
  s is in {3}

55 Recognition of Regular Expression Using NFA: Simulating Non-Deterministic Finite Automata (NFA). Backtrack/backup (sequential traversal): remember the next alternative configuration (current input & next alternative state) when alternative choices are possible. Parallelism (parallel traversal): trace every possible alternative in parallel. Look-ahead: look at more input symbols to make the choice deterministic.

56 Simulating an NFA. Input. An input string ended with eof and an NFA with start state s0 and final states F. Output. The answer "yes" if the NFA accepts the string, "no" otherwise.
  begin
    S := ε-closure({s0});            // s0 =ε=> S
    c := nextchar;
    while c <> eof do begin
      S := ε-closure(move(S, c));    // S =c=> M =ε=> S'
      c := nextchar
    end;
    if S ∩ F <> ∅ then return "yes" else return "no"
  end.

57 Operations on NFA states. ε-closure: the set of states reachable without consuming any input symbol. ε-closure(s): the set of NFA states reachable from NFA state s on ε-transitions alone. ε-closure(S): the set of NFA states reachable from some NFA state s in S on ε-transitions alone. move(S, c): the set of NFA states to which there is a transition on input symbol c from some NFA state s in S.

58 Computation of ε-closure. Input. An NFA and a set of NFA states S. Output. T = ε-closure(S).
  begin
    push all states in S onto stack;
    initialize T := S;
    while stack is not empty do begin
      pop t, the top element, off of stack;
      for each state u with an edge from t to u labeled ε do
        if u is not in T [i.e., the current ε-closure(S)] do begin
          add u to T;
          push u onto stack
        end
    end;
    return T
  end.
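The stack-based closure computation above can be sketched in Python as follows. The ε-edges are those of the Thompson NFA for (a | b)*abb with states 0 to 10 that the following slides use (as given in Aho, Sethi, Ullman); the names `EPS` and `eps_closure` are hypothetical.

```python
# Epsilon edges of the Thompson NFA for (a|b)*abb: state -> set of successors.
EPS = {0: {1, 7}, 1: {2, 4}, 3: {6}, 5: {6}, 6: {1, 7}}

def eps_closure(states):
    """All NFA states reachable from `states` on epsilon-transitions alone."""
    closure = set(states)        # T := S
    stack = list(states)         # push all states in S onto the stack
    while stack:
        t = stack.pop()
        for u in EPS.get(t, ()):
            if u not in closure: # only unseen states are added and pushed
                closure.add(u)
                stack.append(u)
    return frozenset(closure)
```

For example, `eps_closure({0})` yields {0, 1, 2, 4, 7}, the set called A in the slides that follow.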

59 An Example: (a | b)*abb [figure: Thompson NFA with states 0 to 10]. Computing T = ε-closure(0), where S is the stack:
  01: S={0}, T={0}
  02: S={}; t=0; T={0}
  03: S={1,7}; T={0,1,7}
  04: S={1}; t=7; T={0,1,7}
  05: S={1}; T={0,1,7}
  06: S={}; t=1; T={0,1,7}
  07: S={2,4}; T={0,1,2,4,7}
  08: S={2}; t=4; T={0,1,2,4,7}
  09: S={2}; T={0,1,2,4,7}
  10: S={}; t=2; T={0,1,2,4,7}
  **: S={}; T={0,1,2,4,7}

60 An Example: (a | b)*abb [figure: Thompson NFA with states 0 to 10]. A = ε-closure({0}) = {0,1,2,4,7}

61 An Example: (a | b)*abb [figure: Thompson NFA]. move(A, a) = {3,8}; move(A, b) = {5}

62 An Example: (a | b)*abb [figure: Thompson NFA]. move(A, b) = {5}; C = ε-closure(move(A, b)) = {1,2,4,5,6,7}

63 An Example: (a | b)*abb [figure: Thompson NFA]. move(C, b) = {5}

64 An Example: (a | b)*abb [figure: Thompson NFA]. move(C, b) = {5}; C = ε-closure(move(C, b)) = {1,2,4,5,6,7}

65 An Example: input bbabb
  S = ε-closure({0}) = {0,1,2,4,7} = A
  S = ε-closure(move({0,1,2,4,7}, b)) = ε-closure({5}) = {1,2,4,5,6,7} = C
  S = ε-closure(move({1,2,4,5,6,7}, b)) = ε-closure({5}) = {1,2,4,5,6,7} = C
  S = ε-closure(move({1,2,4,5,6,7}, a)) = ε-closure({3,8}) = {1,2,3,4,6,7,8}
  S = ε-closure(move({1,2,3,4,6,7,8}, b)) = ε-closure({5,9}) = {1,2,4,5,6,7,9}
  S = ε-closure(move({1,2,4,5,6,7,9}, b)) = ε-closure({5,10}) = {1,2,4,5,6,7,10}
  S ∩ {10} <> ∅, so the input is accepted
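The parallel NFA simulation traced above can be reproduced with a short Python sketch. It assumes the Thompson NFA for (a | b)*abb with states 0 to 10 and final state 10; `EPS`, `SYM`, `eps_closure`, `move`, and `simulate_nfa` are hypothetical names chosen for illustration.

```python
# Thompson NFA for (a|b)*abb: epsilon edges and symbol edges.
EPS = {0: {1, 7}, 1: {2, 4}, 3: {6}, 5: {6}, 6: {1, 7}}
SYM = {(2, "a"): {3}, (4, "b"): {5}, (7, "a"): {8}, (8, "b"): {9}, (9, "b"): {10}}

def eps_closure(states):
    """States reachable from `states` on epsilon-transitions alone."""
    closure, stack = set(states), list(states)
    while stack:
        t = stack.pop()
        for u in EPS.get(t, ()):
            if u not in closure:
                closure.add(u)
                stack.append(u)
    return frozenset(closure)

def move(states, c):
    """States reachable from some state in `states` on input symbol c."""
    out = set()
    for s in states:
        out |= SYM.get((s, c), set())
    return out

def simulate_nfa(text, s0=0, finals=frozenset({10})):
    """Track the whole set of possible NFA states while reading text."""
    S = eps_closure({s0})
    for c in text:
        S = eps_closure(move(S, c))
    return bool(S & finals)      # accept iff S and F intersect
```

On bbabb the state set ends as {1, 2, 4, 5, 6, 7, 10}, which contains the final state 10, so the sketch answers yes.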

66 Recognition of Regular Expression. Simulating an NFA is harder than simulating a DFA; constructing an NFA is easier than constructing a DFA. So: construct the NFA => construct the equivalent DFA, by pre-defining the sets of NFA states that can be reached in parallel as states of the DFA and pre-computing all possible transitions (instead of simulating the parallel transitions at run time) => (optional) state minimization => simulate the DFA.

67 Constructing Automata from R.E. (1) R.E. -> NFA (Thompson's construction) -> DFA (subset construction) -> state minimization. R.E. to NFA: decompose the R.E. into basic alphabet symbols and operators; construct an FA for each basic symbol; merge the FAs according to the operators.

68 Constructing Automata from R.E. (2) R.E. -> DFA -> state minimization, where a state transition corresponds to a position transition in the pattern. Steps: annotate the RE symbols with position labels; get the syntax tree of the annotated pattern; compute {nullable, firstpos, lastpos} of the subexpressions; compute follow(i); set s0 = firstpos(root); construct the transition function according to follow(i).

69 Regular Expression to NFA. R.E. -> NFA (Thompson's construction)

70 Constructing NFA. How do we define an NFA that accepts a regular expression? It is very simple. Remember that a regular expression is formed by the use of alternation, concatenation, and repetition. Thus all we need to know is how to build the NFA for a single symbol and how to compose NFAs.

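71 Composing NFAs with Alternation. The NFA for a symbol a (or ε) is: start -> i -a-> f. Given two NFAs N(s) and N(t), the NFA N(s | t) is [figure: a new start state i with ε-edges into N(s) and N(t), and ε-edges from their final states to a new final state f] (Aho, Sethi, Ullman, p. 122)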

72 Composing NFAs with Concatenation. Given two NFAs N(s) and N(t), the NFA N(st) is [figure: start -> i, N(s) followed by N(t), ending in f] (Aho, Sethi, Ullman, p. 123)

73 Composing NFAs with Repetition. The NFA N(s*) is [figure: a new start state i and a new final state f around N(s), connected by ε-edges] (Aho, Sethi, Ullman, p. 123)

74 Properties of the NFA via Thompson's Construction. Following the construction rules, we obtain an NFA N(r) that: has at most twice as many states as the number of symbols and operators in r; has exactly one start state and one accepting state; has, for each state, at most one outgoing transition on a symbol of the alphabet or at most two outgoing ε-transitions. All nondeterministic transitions are introduced by ε-transitions that connect to/from the new/old initial/final states.
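The three composition rules and the state-count property can be checked with a small Python sketch. It is an illustration only: the `NFA` class and the helpers `symbol`, `alt`, `cat`, `star`, and `states` are hypothetical names, ε is represented by `None`, and concatenation merges the final state of N(s) with the start state of N(t), as in the Aho, Sethi, Ullman version of the construction.

```python
import itertools

_ids = itertools.count()          # fresh state numbers

class NFA:
    def __init__(self, start, final, edges):
        self.start, self.final = start, final
        self.edges = edges        # list of (src, label, dst); label None = epsilon

def symbol(a):
    i, f = next(_ids), next(_ids)
    return NFA(i, f, [(i, a, f)])

def alt(s, t):
    """N(s|t): new start and final states joined by epsilon edges."""
    i, f = next(_ids), next(_ids)
    extra = [(i, None, s.start), (i, None, t.start),
             (s.final, None, f), (t.final, None, f)]
    return NFA(i, f, s.edges + t.edges + extra)

def cat(s, t):
    """N(st): the final state of N(s) is merged with the start state of N(t)."""
    def rename(q):
        return s.final if q == t.start else q
    merged = [(rename(u), lab, rename(v)) for u, lab, v in t.edges]
    return NFA(s.start, rename(t.final), s.edges + merged)

def star(s):
    """N(s*): new start and final states, with skip and repeat epsilon edges."""
    i, f = next(_ids), next(_ids)
    extra = [(i, None, s.start), (s.final, None, f),
             (s.final, None, s.start), (i, None, f)]
    return NFA(i, f, s.edges + extra)

def states(n):
    return {q for u, _, v in n.edges for q in (u, v)}

# (a|b)*abb: 5 symbols and 5 operators (one |, one *, three concatenations)
nfa = cat(cat(cat(star(alt(symbol("a"), symbol("b"))), symbol("a")),
              symbol("b")), symbol("b"))
```

The resulting automaton has 11 states, matching the 0 to 10 numbering used in the example, and stays within the bound of twice the 10 symbols and operators.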

75 An Example: (a | b)*abb [figure: Thompson NFA with states 0 to 10]

76 Comparison: NFA (by Heuristics) for (a | b)*abb [figure: 4-state NFA], NOT constructed using Thompson's construction. States: {0/start/init., 1, 2, 3/final}. Input symbols: {a, b}. NFA transition function: δ(0,a) = {0,1}, δ(0,b) = {0}, δ(1,b) = {2}, δ(2,b) = {3}.

77 NFA to DFA. NFA -> DFA (Subset Construction)

78 Translating NFA into DFA. Each state of the DFA (D) corresponds to a set of states of the NFA (N); transforming N to D is done by subset construction. D will be in state {x,y,z} after reading a given input string if and only if N could be in any of the states x, y, or z, depending on the transitions it chooses. D keeps track of all the possible routes N might take and runs them in parallel.

79 Simulating an NFA (recall). Input. An input string ended with eof and an NFA with start state s0 and final states F. Output. The answer "yes" if the NFA accepts the string, "no" otherwise.
  begin
    S := ε-closure({s0});            // s0 =ε=> S
    c := nextchar;
    while c <> eof do begin
      S := ε-closure(move(S, c));    // S =c=> M =ε=> S'
      c := nextchar
    end;
    if S ∩ F <> ∅ then return "yes" else return "no"
  end.

80 NFA to DFA: generalizing the NFA simulation. In the simulation step S := ε-closure(move(S, c)), let c range over all symbols in the alphabet (not only the input symbols that occur in some particular file). Starting from the initial state ε-closure({s0}), each step takes a previous state T to a next state U on symbol c. S: all states generated during the NFA parallel traversal over all possible input prefixes (NOT a particular input). δ: all transitions made during the traversal.

81 From an NFA to a DFA: Subset Construction Algorithm. Input. An NFA N. Output. A DFA D with states Dstates and transition table Dtran.
  begin
    add ε-closure(s0) as an unmarked state to Dstates;
    while there is an unmarked state T in Dstates do begin
      mark T;
      for each input symbol a do begin
        U := ε-closure(move(T, a));
        if U is not in Dstates then
          add U as an unmarked state to Dstates;
        mark U as final if U contains the original final state;
        Dtran[T, a] := U
      end
    end
  end.

82 An Example: (a | b)*abb [figure: Thompson NFA with states 0 to 10]

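83 An Example: ε-closure(s) & move(s,x) [table with columns s, ε-closure(s), move(s,a), move(s,b), important state?; only the following closure rows are recoverable]
  s    ε-closure(s)
  0    {0,1,2,4,7}
  1    {1,2,4}
  3    {1,2,3,4,6,7}
  5    {1,2,4,5,6,7}
  6    {1,2,4,6,7}
Several of the remaining states are marked "Yes" as important states, and final states are marked "Fin".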

84 An Example
  ε-closure({0}) = {0,1,2,4,7} = A
  A: ε-closure(move({0,1,2,4,7}, a)) = ε-closure({3,8}) = {1,2,3,4,6,7,8} = B
  A: ε-closure(move({0,1,2,4,7}, b)) = ε-closure({5}) = {1,2,4,5,6,7} = C
  B: ε-closure(move({1,2,3,4,6,7,8}, a)) = ε-closure({3,8}) = B
  B: ε-closure(move({1,2,3,4,6,7,8}, b)) = ε-closure({5,9}) = {1,2,4,5,6,7,9} = D
  C: ε-closure(move({1,2,4,5,6,7}, a)) = ε-closure({3,8}) = B
  C: ε-closure(move({1,2,4,5,6,7}, b)) = ε-closure({5}) = C
  D: ε-closure(move({1,2,4,5,6,7,9}, a)) = ε-closure({3,8}) = B
  D: ε-closure(move({1,2,4,5,6,7,9}, b)) = ε-closure({5,10}) = {1,2,4,5,6,7,10} = E
  E: ε-closure(move({1,2,4,5,6,7,10}, a)) = ε-closure({3,8}) = B
  E: ε-closure(move({1,2,4,5,6,7,10}, b)) = ε-closure({5}) = C
Notes: when computing move, ignore the ε-transitions (out of 0, 1, ...). a-transitions: (2,a) -> 3, (7,a) -> 8. b-transitions: (4,b) -> 5, (8,b) -> 9, (9,b) -> 10. It is good to label states sequentially, such that δ(s,x) = s+1.

85 An Example. Ignore ε-transitions (out of 0, 1, ...). a-transitions: (2,a) -> 3, (7,a) -> 8. b-transitions: (4,b) -> 5, (8,b) -> 9, (9,b) -> 10. It is good to label states sequentially, such that δ(s,x) = s+1. Here X* denotes ε-closure(X).
  State                                 a    b
  A = {0}*    = {0,1,2,4,7}             B    C
  B = {3,8}*  = {1,2,3,4,6,7,8}         B    D
  C = {5}*    = {1,2,4,5,6,7}           B    C
  D = {5,9}*  = {1,2,4,5,6,7,9}         B    E
  E = {5,10}* = {1,2,4,5,6,7,10}        B    C

86 An Example
  State                       a    b
  A = {0,1,2,4,7}             B    C
  B = {1,2,3,4,6,7,8}         B    D
  C = {1,2,4,5,6,7}           B    C
  D = {1,2,4,5,6,7,9}         B    E
  E = {1,2,4,5,6,7,10}        B    C
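The subset construction that produces this table can be sketched in Python. The sketch assumes the Thompson NFA for (a | b)*abb with states 0 to 10 and final state 10; all function and variable names are hypothetical.

```python
# Thompson NFA for (a|b)*abb: epsilon edges and symbol edges.
EPS = {0: {1, 7}, 1: {2, 4}, 3: {6}, 5: {6}, 6: {1, 7}}
SYM = {(2, "a"): {3}, (4, "b"): {5}, (7, "a"): {8}, (8, "b"): {9}, (9, "b"): {10}}

def eps_closure(states):
    closure, stack = set(states), list(states)
    while stack:
        t = stack.pop()
        for u in EPS.get(t, ()):
            if u not in closure:
                closure.add(u)
                stack.append(u)
    return frozenset(closure)

def move(states, c):
    out = set()
    for s in states:
        out |= SYM.get((s, c), set())
    return out

def subset_construction(s0=0, nfa_final=10, alphabet=("a", "b")):
    """Return (start, Dstates, Dtran, finals) of the equivalent DFA."""
    start = eps_closure({s0})
    dstates = [start]             # all DFA states found so far
    unmarked = [start]            # states whose transitions are not yet computed
    dtran = {}
    while unmarked:
        T = unmarked.pop()        # "mark T"
        for a in alphabet:
            U = eps_closure(move(T, a))
            if U not in dstates:
                dstates.append(U)
                unmarked.append(U)
            dtran[T, a] = U
    finals = {T for T in dstates if nfa_final in T}
    return start, dstates, dtran, finals

start, dstates, dtran, finals = subset_construction()
```

The result has the five states A to E of the table, with E = {1, 2, 4, 5, 6, 7, 10} as the only final state.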

87 An Example: Result of Subset Construction [figure: DFA with states A = {0,1,2,4,7} (start), B = {1,2,3,4,6,7,8}, C = {1,2,4,5,6,7}, D = {1,2,4,5,6,7,9}, E = {1,2,4,5,6,7,10}, edges labeled a and b]

88 Minimizing the Number of States. Every DFA has a unique smallest equivalent DFA. Given a DFA M, we use splitting to construct the equivalent minimal DFA. Normally, we actually merge individual states into larger sets of states, instead of splitting wildly.

89 DFA to Minimum-State DFA. Input. A DFA M = (S, Σ, δ, s0, F). Output. An equivalent DFA M' = (S', Σ, δ', s0', F') with fewer states.
  begin
    initialize partition Π with two groups of states: {F (final states), S-F (non-final states)};
    repeat /* until Π_new is unchanged */
      for each group G of Π do begin
        partition G into subgroups such that any two states s and t of G are in the same subgroup iff, for all input symbols a, states s and t have transitions on a to states in the same group of Π;
        /* at worst, a state will be in a subgroup by itself */
        update Π_new by replacing G by the set of all subgroups formed
      end
    until Π_new = Π;
    s0' = r(s0), the representative of s0;
    S' = {representatives of subgroups};
    F' = {representatives of states in F};
    δ(s, a) = t => δ'(r(s), a) = r(t)
  end.

90 Splitting into Equivalent States. Algorithm: Initially, there are two sets, one consisting of all accepting states of M, the other containing the remaining states.
  repeat {
    choose a set A = {s1, s2, ..., sn};
    split A into A1, A2, ..., Am so that, for all Ai and all symbols a: if sj, sk ∈ Ai and, on input a, sj -> tj and sk -> tk (source -> target), then tj and tk are in the same set
  } until no more change.
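The splitting procedure can be sketched in Python on the five-state DFA produced by the subset construction. This is a plain partition-refinement sketch under the assumption that every state has a transition on every symbol; `DTRAN`, `FINALS`, and `minimize` are hypothetical names.

```python
# DFA from the subset construction for (a|b)*abb.
DTRAN = {
    "A": {"a": "B", "b": "C"},
    "B": {"a": "B", "b": "D"},
    "C": {"a": "B", "b": "C"},
    "D": {"a": "B", "b": "E"},
    "E": {"a": "B", "b": "C"},
}
FINALS = {"E"}

def minimize(dtran, finals, alphabet=("a", "b")):
    """Refine {non-final, final} until no group can be split any further."""
    partition = [g for g in (set(dtran) - set(finals), set(finals)) if g]
    while True:
        def group_of(state):
            return next(i for i, g in enumerate(partition) if state in g)
        new_partition = []
        for group in partition:
            buckets = {}          # target-group signature -> subgroup
            for s in sorted(group):
                key = tuple(group_of(dtran[s][a]) for a in alphabet)
                buckets.setdefault(key, set()).add(s)
            new_partition.extend(buckets.values())
        if len(new_partition) == len(partition):   # nothing was split
            return new_partition
        partition = new_partition

groups = minimize(DTRAN, FINALS)
```

Refinement only ever splits groups, so an unchanged group count means the partition is stable. Here the groups end as {A, C}, {B}, {D}, {E}: states A and C are merged.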

91 An Example
  State                       a    b
  A = {0,1,2,4,7}             B    C
  B = {1,2,3,4,6,7,8}         B    D
  C = {1,2,4,5,6,7}           B    C
  D = {1,2,4,5,6,7,9}         B    E
  E = {1,2,4,5,6,7,10}        B    C

92 An Example. Initial partition: non-final (-Fin) states A, B, C, D; final (+Fin) state E.
  State                       a    b
  A = {0,1,2,4,7}             B    C     -Fin
  B = {1,2,3,4,6,7,8}         B    D     -Fin
  C = {1,2,4,5,6,7}           B    C     -Fin
  D = {1,2,4,5,6,7,9}         B    E     -Fin
  E = {1,2,4,5,6,7,10}        B    C     +Fin

94 An Example: after merging, state C is replaced by its representative A
  State                       a    b
  A = {0,1,2,4,7}             B    A
  B = {1,2,3,4,6,7,8}         B    D
  A = {1,2,4,5,6,7}           B    A
  D = {1,2,4,5,6,7,9}         B    E
  E = {1,2,4,5,6,7,10}        B    A

95 Transition Diagram (after State Reduction). We said: a DFA for (a | b)*abb [figure: DFA with states {0} (start), {0,1}, {0,2}, {0,3}, edges labeled a and b]

96 Transition Diagram (after State Reduction). It really is: a DFA for (a | b)*abb [figure: DFA with states A (start), B, D, E, edges labeled a and b]

97 RE to DFA. Construct a DFA from the RE directly, without an intermediate NFA

98 Review of Thompson's Transition Diagram: An Example, (a | b)*abb [figure: Thompson NFA with states 0 to 10]. A = ε-closure({0}) = {0,1,2,4,7}

99 Review of Thompson's Transition Diagram: An Example, (a | b)*abb [figure: Thompson NFA with states 0 to 10]. move(A, b) = {5}

100 Review of Thompson's Transition Diagram: An Example, (a | b)*abb [figure: Thompson NFA with states 0 to 10]. C = ε-closure(move(A, b)) = {1,2,4,5,6,7}

101 Constructing DFA from R.E. "Important states": ε-transitions have no effect on determining the next state, since they do not really make a transition on a visible input symbol; ε-transitions determine equivalent states in a loose sense. Important states are related to a non-null symbol at a particular position in the RE, e.g., b at position 2 of (a | b)*abb#. Re-definition of "states": in Thompson's transition diagram, nodes are states (the status before & after matching a symbol); in the alternative method, arcs are states (the position (in the RE) of a match). #: simulates the last node, for checking the final state. Only states that consume symbols matter.

102 DFA directly from R.E.: underlying NFA for (a1 | b2)* a3 b4 b5 #6 [figure: NFA with nodes A to F whose symbol-labeled arcs are numbered 1 to 6]. Important states ({1 ... 6}): those with non-null transitions

103 DFA directly from R.E.: underlying NFA for (a1 | b2)* a3 b4 b5 #6 [figure: NFA with nodes A to F whose symbol-labeled arcs are numbered 1 to 6]. followpos(1) = {1,2,3}

104 Constructing Automata from R.E. Example: RE = (a | b)*abb#, annotated as (a1 | b2)* a3 b4 b5 #6. Syntax tree for the RE: (Fig. 3.41). Directed graph for followpos() [figure]:
  Node       followpos
  1 (on a)   {1,2,3}
  2 (on b)   {1,2,3}
  3 (on a)   {4}
  4 (on b)   {5}
  5 (on b)   {6}
  6 (on #)   -
followpos(1): (a1 | b2)* a3 b4 b5 #6 ~ ((a1 | b2)(a1 | b2)...) a3 b4 b5 #6, so after matching position 1 the automaton is ready to match a at 1, b at 2, or a at 3.

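105 DFA directly from R.E. DFA for (a | b)*abb from (a1 | b2)* a3 b4 b5 #6 [figure: DFA with states {1,2,3} (start), {1,2,3,4}, {1,2,3,5}, {1,2,3,6}, edges labeled a and b]. Each state is a set of possible matching positions; each transition leads to the next possible matching positions.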

106 Constructing DFA from RE: firstpos, lastpos, nullable. Matching an RE has 3 possible cases: x(c1 | c2)y, x(c1.c2)y, x(c*)y. followpos: which position(s)/symbol(s) can be matched after matching lastpos of x? This requires firstpos of c, c1, c2, y. We also need to know whether c1, c2 can be passed through (nullable); c* is always nullable.

107 Constructing DFA from R.E. R.E. -> DFA: a state corresponds to a (set of) position(s) (and respective symbols) in the RE where an input character is being matched; a state transition corresponds to an allowed position transition for the RE; a set of positions corresponds to a set of important states of the NFA (those that consume input symbols). DFA construction:
  Augment the RE: (r)# [#: end-of-pattern mark]
  Annotate the RE symbols (excluding ε) with position labels
  Get the syntax tree T of the annotated pattern
  Compute {nullable, firstpos, lastpos} of the nodes [sub-REs]
  Compute follow(i) [by making a depth-first traversal over the tree T]
  Initial state: s0 = firstpos(root) [the complete RE]
  Construct the transition function according to follow(i) [δ(i, a) = i']

108 Constructing DFA from R.E. DFA construction:
  initial state: s0 = firstpos(root); S = {s0}
  while there is an unmarked state Q in S do begin
    mark Q
    for each input symbol a do begin
      let U = the union of followpos(p) over all positions p in Q such that symbol(p) = a
      if U is not ∅ (empty) and U ∉ S, then S += U    // new state
      δ(Q, a) = U                                     // new transition
    end /* for */
  end /* while */
Compare NFA -> DFA: there, U = ε-closure(move(T, a)); here, Q: {p} with symbol(p) = a, and U = ∪{followpos(p)}.
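The followpos-driven construction can be sketched in Python for the running example (a1 | b2)* a3 b4 b5 #6. The followpos table and the initial state firstpos(root) = {1, 2, 3} are taken from the preceding slides rather than computed from a syntax tree; `SYMBOL`, `FOLLOWPOS`, and `build_dfa` are hypothetical names.

```python
# Position labels and followpos table for (a1|b2)* a3 b4 b5 #6.
SYMBOL = {1: "a", 2: "b", 3: "a", 4: "b", 5: "b", 6: "#"}
FOLLOWPOS = {1: {1, 2, 3}, 2: {1, 2, 3}, 3: {4}, 4: {5}, 5: {6}, 6: set()}
FIRSTPOS_ROOT = frozenset({1, 2, 3})
END_POSITION = 6                  # position of the end marker '#'

def build_dfa(alphabet=("a", "b")):
    """Return (start, states, delta, finals); each state is a set of positions."""
    start = FIRSTPOS_ROOT
    states = [start]
    unmarked = [start]
    delta = {}
    while unmarked:
        Q = unmarked.pop()        # mark Q
        for a in alphabet:
            U = set()
            for p in Q:
                if SYMBOL[p] == a:         # position p can match symbol a
                    U |= FOLLOWPOS[p]      # take the union over all such p
            U = frozenset(U)
            if not U:
                continue                   # no transition on a from Q
            if U not in states:
                states.append(U)           # new state
                unmarked.append(U)
            delta[Q, a] = U                # new transition
    finals = {Q for Q in states if END_POSITION in Q}
    return start, states, delta, finals

start, states, delta, finals = build_dfa()
```

The sketch yields the four states {1,2,3}, {1,2,3,4}, {1,2,3,5}, and {1,2,3,6}, with the last one final because it contains the position of #.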

109 Lexical Analyzer Generator. RE -> (Thompson's construction) -> NFA -> (subset construction) -> DFA

110 Time-Space Tradeoffs. RE (r) to NFA, then simulate the NFA on input x: time O(|r| * |x|), space O(|r|) [max. 2|r| states]. RE to NFA, NFA to DFA, then simulate the DFA: time O(|x|), space O(2^|r|). Lazy transition evaluation: transitions are computed as needed at run time; computed transitions are stored in a cache for later use.

111 LEX: A Language for Specifying Lexical Analyzers

112 Lex: a language for specifying lexical analyzers (for any language, say, X). [figure: lex.l (lexical analyzer spec.) -> lex compiler -> lex.yy.c (lexical analyzer in C); lex.yy.c -> C compiler -> a.out (lexical analyzer executable); source code in X -> a.out -> tokens (for the parser)]. next_token = yylex();

113 Using a Scanner Generator: Lex. Lex is a lexical analyzer generator developed by Lesk and Schmidt of AT&T Bell Lab, written in C, running under UNIX. Lex produces an entire scanner module that can be compiled and linked with other compiler modules. Lex associates regular expressions with arbitrary code fragments: when an expression is matched, the code segment is executed. A typical lex program contains three sections separated by %% delimiters.

114 Lex Programs
  %{
  auxiliary declarations
  %}
  regular definitions
  %%
  translation rules
  %%
  auxiliary procedures

115 First Section of Lex. The first section defines character classes and auxiliary regular expressions. (Fig. 3.5 on p. 67)
  [] delimits character classes
  - denotes ranges: [xyz] == [x-z]
  \ denotes the escape character, as in C
  ^ complements a character class (NOT): [^xy] denotes all characters except x and y
  |, *, and + (alternation, Kleene closure, and positive closure) are provided
  () can be used to control grouping of subexpressions
  (expr)? == (expr) | ε, i.e., matches expr zero times or once
  {} signals the macro-expansion of a symbol defined in the first section

116 First Section of Lex, cont. Catenation is specified by the juxtaposition of two expressions; no explicit operator is used. [ab][cd] will match any of ad, ac, bc, and bd. begin == "begin" == [b][e][g][i][n]

117 Second Section of Lex. The second section of lex defines a table of regular expressions and corresponding commands. When an expression is matched, its associated command is executed. Auxiliary functions may be defined in the third section. Input that is matched is stored in the string variable yytext, whose length is yyleng. Lex creates an integer function yylex() that may be called from the parser. The value returned is usually the token code of the token scanned by Lex. When yylex() encounters end of file, it calls a user-supplied integer function named yywrap() to wrap up input processing.

118 Translation Rules
  P1 {action1}
  P2 {action2}
  ...
  Pn {actionn}
where the Pi are regular expressions and the actioni are program segments to be executed on matching Pi

119 Dealing with Multiple Input Files. yylex() uses three user-defined functions to handle character I/O: input(): retrieve a single character, 0 on EOF. output(c): write a single character to the output. unput(c): put a single character back on the input to be re-read.

120 An Example
  %{
  // auxiliary declarations (in C)
  #define LT 24
  #define LE 25
  #define EQ ...
  %}
  // regular definitions
  delim   [ \t\n]
  ws      {delim}+
  letter  [A-Za-z]
  digit   [0-9]
  id      {letter}({letter}|{digit})*
  number  {digit}+(\.{digit}+)?(e[+\-]?{digit}+)?
  %%

121 An Example
  // translation rules (actions are in C)
  {ws}      { /* no action and no return */ }
  if        {return (IF);}
  then      {return (THEN);}
  else      {return (ELSE);}
  {id}      {yylval = install_id(); return (ID);}
  {number}  {yylval = install_num(); return (NUMBER);}
  "<"       {yylval = LT; return (RELOP);}
  "<="      {yylval = LE; return (RELOP);}
  ...
  %%
  // auxiliary procedures (in C)
  install_id() { /* yytext to symbol table */ }
  install_num() { ... /* yytext to symbol table */ }

122 Functions and Variables. yylex(): a function implementing the lexical analyzer and returning the token matched. yytext: a global pointer variable pointing to the lexeme matched. yyleng: a global variable giving the length of the lexeme matched. yylval: an external global variable storing the attribute of the token.

123 NFA from Lex Programs. P1 | P2 | ... | Pn [figure: a new start state s0 with ε-edges to N(P1), N(P2), ..., N(Pn)]

124 Rules. Look for the longest lexeme (e.g., for numbers): match until there is no transition, then retract to the longest match. Look for the first-listed pattern that matches the longest lexeme (keywords and identifiers). List frequently occurring patterns first (white space).
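The longest-match and first-listed rules can be illustrated with a small Python sketch. It is not how Lex is implemented (Lex compiles all patterns into one automaton); it simply tries every pattern at the current position with the standard `re` module. The rule list and the name `tokenize` are hypothetical.

```python
import re

# Rules in priority order: the keyword is listed before the identifier.
RULES = [
    ("IF", r"if"),
    ("ID", r"[a-z][a-z0-9]*"),
    ("NUM", r"[0-9]+"),
    ("WS", r"[ \t\n]+"),
]
COMPILED = [(name, re.compile(pattern)) for name, pattern in RULES]

def tokenize(text):
    """Split text into (token, lexeme) pairs using longest match, first listed."""
    pos, tokens = 0, []
    while pos < len(text):
        best_name, best_len = None, 0
        for name, rx in COMPILED:
            m = rx.match(text, pos)
            # strictly longer wins; on a tie the earlier rule is kept
            if m and len(m.group()) > best_len:
                best_name, best_len = name, len(m.group())
        if best_name is None:
            raise ValueError(f"no pattern matches at position {pos}")
        if best_name != "WS":          # white space: no action, no token
            tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens
```

For the input `if ifx 42`, the lexeme `if` matches both IF and ID with the same length, so the first-listed IF wins, while `ifx` is longer as an ID and is returned as an identifier.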

125 Rules. View keywords as exceptions to the rule of identifiers: construct a keyword table to distinguish them from ids. Lookahead operator: r1/r2 matches a string in r1 only if it is followed by a string in r2. Example: DO 5 I = ... versus DO 5 I = 1, 25, handled by the pattern DO/({letter}|{digit})* = ({letter}|{digit})*,

126 Lexical Error Recovery. Error: none of the patterns matches a prefix of the remaining input. Panic mode error recovery: delete successive characters from the remaining input until the pattern matching can continue. Error repair: delete an extraneous character; insert a missing character; replace an incorrect character; transpose two adjacent characters.

127 Appendix: Regular Expression and Pattern Matching - KMP algorithm - AC algorithm

128 R.E. and Pattern Matching
Naïve Pattern Matching:
- Specify the pattern with a regular expression: one R.E. for each keyword
- Construct an FA for each such R.E., and conduct left-to-right matching:
  DFA := State_Transition_Table := Construct_DFA(R.E.)
  while (input_pointer != EOF)
    stop_state = recognize(input_pointer, DFA)
    if fail (stop_state not in final_states): move the input pointer by one character
    if success (stop_state in final_states): output the matching status & skip over the matched pattern
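The loop above can be made concrete as follows. This is a sketch rather than the slide's exact procedure: `str.startswith` stands in for running a per-keyword DFA, the function name is hypothetical, and the longest keyword is preferred when several match at the same position.

```python
def naive_search(text, keywords):
    """Report (start, keyword) for each match, restarting after every attempt."""
    hits = []
    start = 0
    while start < len(text):
        matched = None
        for kw in keywords:                      # one matching session per keyword
            if kw and text.startswith(kw, start):
                # Keep the longest keyword that matches at this position.
                if matched is None or len(kw) > len(matched):
                    matched = kw
        if matched is None:
            start += 1                           # failure: advance by one character
        else:
            hits.append((start, matched))
            start += len(matched)                # success: skip the matched pattern
    return hits
```

Because a successful match is skipped over, occurrences that overlap an earlier match are not reported, and every failed attempt rescans characters that were already examined.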

129 R.E. and Pattern Matching
Why Is It Slow?
- It matches multiple keywords, multiple times: for each keyword, it moves the input pointer backward to the character next to the last start of matching and resets to the initial state on failure, even though some repeated pattern might appear in the recently matched partial string
- The probability of failure is significantly larger than the probability of a successful match in most applications (success occurs only a few times), so most of the time the next matching session starts with the input pointer one character past the starting position of the previous match

130 R.E. and Pattern Matching
RE vs. Pattern Matching
- R.E. <=> FA: recognize one of a set of keywords/patterns in the input string; say yes if the input string is in Lang(R.E.) (the regular language for the expression)
- Pattern Matching (PM): recognize all the occurrences of any keyword/pattern, specified as a regular expression, within a text document; specify each pattern/keyword with an RE; output all occurrences, in addition to saying yes/no

131 R.E. and Pattern Matching
Formal Method for Pattern Matching (PM)
- Constructing an FA for (single/multi-keyword) PM is equivalent to constructing an FA that recognizes the regular expression PM = (.* RE)*, and outputting the keyword upon visiting a final state of the original FA for recognizing RE
- RE = K1 | K2 | K3 | ... | Kn (the regular expression for all specified keywords)
- . : any character that is not one of the first characters of K1 ~ Kn
- .* : unspecified patterns (or unknown keywords)
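The formulation PM = (.* RE)* with RE = K1 | K2 | ... | Kn can be tried out with an off-the-shelf regular expression engine. The sketch below uses Python's `re.finditer`, which skips unmatched text and reports each non-overlapping keyword occurrence; the function name is an assumption, and the engine is a backtracking matcher rather than the FA construction described above.

```python
import re


def find_keywords(text, keywords):
    """Return (start, keyword) for each non-overlapping occurrence, left to right."""
    # Longest alternatives first: Python's '|' takes the first alternative that
    # matches, not the longest one.
    ordered = sorted(keywords, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in ordered))
    return [(m.start(), m.group()) for m in pattern.finditer(text)]
```

For example, `find_keywords("if x then y else z", ["if", "then", "else"])` reports the three keywords with their start positions and ignores everything in between, which plays the role of the `.*` part.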

132 R.E. and Pattern Matching
- Constructing FA1 for recognizing RE = K1 | K2 | ... | Kn: equivalent to merging the prefixes of the keywords to avoid redundant forward matching => TRIE lexicon tree = DFA for RE
- Constructing FA2 for recognizing PM = (.* RE)*: extend FA1 by (1) including unknown keywords and (2) introducing epsilon-moves from the original final states to the original initial states
- On matching failure, redundant backward matching can be avoided if the substring preceding the current input pointer is the prefix of another keyword
- Failure function: the state (in the TRIE) to back off to on failure (!= initial state if the above-mentioned substring exists and is non-null)
- Epsilon-moves & the failure function make FA2 an NFA, whose DFA counterpart can be simulated by backtracking
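The first step, FA1, is just a trie over the keywords. A minimal sketch, assuming a list-of-dictionaries transition table and hypothetical function names, shows how shared prefixes collapse into shared states and how the trie then acts as a DFA for RE:

```python
def build_trie(keywords):
    """Return (goto, final): goto[state][char] -> state, final = accepting states."""
    goto = [{}]            # state 0 is the initial state
    final = {}             # accepting state -> keyword it spells
    for kw in keywords:
        state = 0
        for ch in kw:
            if ch not in goto[state]:
                goto.append({})
                goto[state][ch] = len(goto) - 1   # new state for an unseen prefix
            state = goto[state][ch]
        final[state] = kw
    return goto, final


def recognizes(goto, final, word):
    """Run the trie as a DFA for RE = K1 | K2 | ... | Kn."""
    state = 0
    for ch in word:
        if ch not in goto[state]:
            return False   # a missing edge plays the role of a dead state
        state = goto[state][ch]
    return state in final
```

For the keywords he, she, his and hers the trie has 10 states instead of the 16 that four separate automata would need, because he, his and hers share the state reached on h.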

133 R.E. and Fast Methods for Pattern Matching
Fast Single Keyword Matching [KMP - Knuth, Morris & Pratt 1977]
- Reference: [Aho et al. 1986, Ex ]
- keyword => state_transition_table
- reduce the repeated matching suggested by the keyword pattern
- failure function: where to back off on failure

134 R.E. and Fast Methods for Pattern Matching
Fast Multiple Keyword Matching [AC, Cherry 1982]
- Reference: [Aho, Ex ]
- keywords => TRIE (state_transition_table)
- reduce the repeated matching suggested by the TRIE of the keywords
- TRIE failure function
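A compact version of multi-keyword matching with a trie and a failure function might look like the sketch below. It follows the usual Aho-Corasick construction (breadth-first computation of failure links, outputs inherited along them); the table layout and function names are assumptions made for this illustration, and keywords are assumed to be non-empty.

```python
from collections import deque


def build_automaton(keywords):
    """Build goto, failure and output tables for a set of non-empty keywords."""
    goto, output = [{}], [[]]
    for kw in keywords:
        state = 0
        for ch in kw:
            if ch not in goto[state]:
                goto.append({})
                output.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        output[state].append(kw)

    failure = [0] * len(goto)
    queue = deque(goto[0].values())          # depth-1 states fail to the root
    while queue:                             # breadth-first over the trie
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = failure[state]
            while f and ch not in goto[f]:   # follow failure links until ch fits
                f = failure[f]
            failure[nxt] = goto[f].get(ch, 0)
            # Keywords ending at the failure state also end here.
            output[nxt] = output[nxt] + output[failure[nxt]]
    return goto, failure, output


def search(text, keywords):
    """Return (start, keyword) for every occurrence, overlaps included."""
    goto, failure, output = build_automaton(keywords)
    hits, state = [], 0
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = failure[state]           # back off; the input pointer stays put
        state = goto[state].get(ch, 0)
        for kw in output[state]:
            hits.append((i - len(kw) + 1, kw))
    return hits
```

The input pointer never moves backward: each character of the text is read once, and only the state is adjusted on failure.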

135 R.E. and Fast Methods for Pattern Matching
- Boyer & Moore [1977]
- Harrison [1971]: Hashing Method

136 KMP: Failure Function
[Figure: state diagram start -> 0 -a-> 1 -b-> 2 -a-> 3 -b-> ...]
- If failed at state 5 on x => input = ababax (input pointer => x)
- Need to re-try babax, abax, bax, ax from state 0
  - babax: fail again (does not start with the prefix ab)
  - abax: success until state 3, pointing at x
- Look back from s5 & see the longest match (s3) to a prefix
- Choose the longest one so we can re-try the least
- Do you need to go back and try all these? No. Simply set s := 3 and keep the input pointer at x
- State 3 is the failure state of state 5

137 KMP: Failure Function
[Figure: state diagram start -> 0 -a-> 1 -b-> 2 -a-> 3 -b-> ..., with a table of s and f(s)]

138 KMP: Re-Matching on Failure
[Figure: state diagram start -> 0 -a-> 1 -b-> 2 -a-> 3 -b-> ..., with a table of s and f(s)]
If failed at state 5 on x => (5,x) = fail (ababax does not match a prefix)
- f(5)=3 => (3,x)=??; if fail (abax unmatched),
- f(3)=1 => (1,x)=??; if fail (ax unmatched),
- f(1)=0 => (0,x)=??; try x from the initial state (since no partial match in the failed prefixes is observed)
If (.,x) is a legal transition, just go ahead to (.,x)

139 KMP
[Figure: state diagram start -> 0 -a-> 1 -b-> 2 -a-> 3 -b-> ...]
Recursively compute f(s) based on f(.) of previous states
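Both the recursive computation of f(s) and the re-matching chain f(5)=3, f(3)=1, f(1)=0 can be checked with a short program. The sketch below assumes the keyword ababaa, which is consistent with the failure values quoted on the slides but is not stated there, and the function names are hypothetical. States are numbered 0..m as in the diagram, so state s means that s characters of the keyword have been matched.

```python
def failure_function(keyword):
    """f[s] for states s = 1..len(keyword); f[0] is unused padding."""
    m = len(keyword)
    f = [0] * (m + 1)
    t = 0
    for s in range(1, m):
        # keyword[s] is the character that moves state s to state s + 1.
        while t > 0 and keyword[s] != keyword[t]:
            t = f[t]                 # reuse f of a shorter, already computed prefix
        if keyword[s] == keyword[t]:
            t += 1
        f[s + 1] = t
    return f


def kmp_search(text, keyword):
    """Return the start index of every occurrence of keyword in text."""
    if not keyword:
        raise ValueError("keyword must be non-empty")
    f = failure_function(keyword)
    m, s, hits = len(keyword), 0, []
    for i, ch in enumerate(text):
        while s > 0 and ch != keyword[s]:
            s = f[s]                 # back off; the input pointer stays on ch
        if ch == keyword[s]:
            s += 1
        if s == m:
            hits.append(i - m + 1)
            s = f[s]                 # keep going to find overlapping matches
    return hits
```

For the assumed keyword, `failure_function("ababaa")` yields f(1..6) = 0, 0, 1, 2, 3, 1, so a failure at state 5 falls back to state 3, then 1, then 0, exactly the chain traced above.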


Unit #9 : Definite Integral Properties, Fundamental Theorem of Calculus Unit #9 : Definite Integrl Properties, Fundmentl Theorem of Clculus Gols: Identify properties of definite integrls Define odd nd even functions, nd reltionship to integrl vlues Introduce the Fundmentl

More information

ASTs, Regex, Parsing, and Pretty Printing

ASTs, Regex, Parsing, and Pretty Printing ASTs, Regex, Prsing, nd Pretty Printing CS 2112 Fll 2016 1 Algeric Expressions To strt, consider integer rithmetic. Suppose we hve the following 1. The lphet we will use is the digits {0, 1, 2, 3, 4, 5,

More information

An introduction to model checking

An introduction to model checking An introduction to model checking Slide 1 University of Albert Edmonton July 3rd, 2002 Guy Trembly Dépt d informtique UQAM Outline Wht re forml specifiction nd verifiction methods? Slide 2 Wht is model

More information

Today. Search Problems. Uninformed Search Methods. Depth-First Search Breadth-First Search Uniform-Cost Search

Today. Search Problems. Uninformed Search Methods. Depth-First Search Breadth-First Search Uniform-Cost Search Uninformed Serch [These slides were creted by Dn Klein nd Pieter Abbeel for CS188 Intro to AI t UC Berkeley. All CS188 mterils re vilble t http://i.berkeley.edu.] Tody Serch Problems Uninformed Serch Methods

More information

Control-Flow Analysis and Loop Detection

Control-Flow Analysis and Loop Detection ! Control-Flow Anlysis nd Loop Detection!Lst time! PRE!Tody! Control-flow nlysis! Loops! Identifying loops using domintors! Reducibility! Using loop identifiction to identify induction vribles CS553 Lecture

More information

Lab 1 - Counter. Create a project. Add files to the project. Compile design files. Run simulation. Debug results

Lab 1 - Counter. Create a project. Add files to the project. Compile design files. Run simulation. Debug results 1 L 1 - Counter A project is collection mechnism for n HDL design under specifiction or test. Projects in ModelSim ese interction nd re useful for orgnizing files nd specifying simultion settings. The

More information

CSCI 446: Artificial Intelligence

CSCI 446: Artificial Intelligence CSCI 446: Artificil Intelligence Serch Instructor: Michele Vn Dyne [These slides were creted by Dn Klein nd Pieter Abbeel for CS188 Intro to AI t UC Berkeley. All CS188 mterils re vilble t http://i.berkeley.edu.]

More information

MIPS I/O and Interrupt

MIPS I/O and Interrupt MIPS I/O nd Interrupt Review Floting point instructions re crried out on seprte chip clled coprocessor 1 You hve to move dt to/from coprocessor 1 to do most common opertions such s printing, clling functions,

More information

stack of states and grammar symbols Stack-Bottom marker C. Kessler, IDA, Linköpings universitet. 1. <list> -> <list>, <element> 2.

stack of states and grammar symbols Stack-Bottom marker C. Kessler, IDA, Linköpings universitet. 1. <list> -> <list>, <element> 2. TDDB9 Compilers nd Interpreters TDDB44 Compiler Construction LR Prsing Updted/New slide mteril 007: Pushdown Automton for LR-Prsing Finite-stte pushdown utomton contins lterntingly sttes nd symols in NUΣ

More information
