Compilation


1 Compilation, Lecture 2: Lexical Analysis. Noam Rinetzky


3 Lexical Analysis. Modern Compiler Design: Chapter 2.1

4-5 Conceptual Structure of a Compiler
  Source text (txt) → Compiler → Executable code (exe)
  Compiler: Frontend → Semantic Representation → Backend
  Stages: Lexical Analysis (words) → Syntax Analysis / Parsing (sentences) → Semantic Analysis → Intermediate Representation (IR) → Code Generation

6-12 What does Lexical Analysis do?
  Language: fully parenthesized expressions
  Context-free language:
    Expr → Num | LP Expr Op Expr RP
  Regular languages:
    Num → Dig | Dig Num
    Dig → 0 | 1 | ... | 9
    LP → (
    RP → )
    Op → + | *
  Input (token values): ( ( 23 + 7 ) * 19 )
  Token kinds:          LP LP Num Op Num RP Op Num RP
  Each token pairs a kind with a value.

13 What does Lexical Analysis do?
  Partitions the input into a stream of tokens: numbers, identifiers, keywords, punctuation
  A token is a "word" in the source language that is meaningful to the syntactical analysis
  Usually represented as (kind, value) pairs: (Num, 23) (Op, *)
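The partitioning described above can be sketched in a few lines of Python. This is only an illustration for the fully parenthesized expression language: the token kinds (LP, RP, Num, Op) come from the slides, while the regular expression, the name `tokenize`, and the error behaviour are assumptions made for the example.

```python
import re

# One named group per token kind; leading whitespace is skipped.
TOKEN_RE = re.compile(r"\s*(?:(?P<Num>\d+)|(?P<LP>\()|(?P<RP>\))|(?P<Op>[+*]))")

def tokenize(text):
    """Partition `text` into a list of (kind, value) pairs."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"no token starts at position {pos}")
        kind = m.lastgroup          # name of the group that matched
        tokens.append((kind, m.group(kind)))
        pos = m.end()
    return tokens

print(tokenize("((23 + 7) * 19)"))
```

The kinds in the result follow the sequence LP LP Num Op Num RP Op Num RP shown on the earlier slide.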

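14 From scanning to parsing
  Program text: ((23 + 7) * x)
  Lexical Analyzer → token stream: ( LP, ( LP, 23 Num, + OP, 7 Num, ) RP, * OP, x Id, ) RP
  Parser (Grammar: Expr → ... | Id; Id → 'a' | ... | 'z') → either a syntax error, or valid input and an Abstract Syntax Tree
  Abstract Syntax Tree: Op(*) with children Op(+) and Id(x); Op(+) with children Num(23) and Num(7)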

15 Why Lexical Analysis?
  Well, not strictly necessary, but:
  Regular languages ⊆ Context-free languages
  Simplifies the syntax analysis (parsing) and the language definition
  Modularity, reusability, efficiency

16 Lecture goals: understand the role and place of lexical analysis; lexical analysis theory; using program-generating tools

17 Lecture Outline
  ✓ Role and place of lexical analysis
  What is a token?
  Regular languages
  Lexical analysis
  Error handling
  Automatic creation of lexical analyzers

18 What is a token?
  (Intuitively) a "word" in the source language: anything that should appear in the input to syntax analysis
  Identifiers, values, language keywords
  Usually represented as a pair of (kind, value)

19 Example Tokens
  Type     Examples
  ID       foo, n_14, last
  NUM      73, 00, 517, 082
  REAL     66.1, .5, 5.5e-10
  IF       if
  COMMA    ,
  NOTEQ    !=
  LPAREN   (
  RPAREN   )

20 Example Non Tokens
  Type                     Examples
  comment                  /* ignored */
  preprocessor directive   #include <foo.h>
                           #define NUMS 5.6
  macro                    NUMS
  whitespace               \t, \n, \b, ' '

21 Some basic terminology
  Lexeme (aka symbol): a series of letters separated from the rest of the program according to a convention (space, semicolon, comma, etc.)
  Pattern: a rule specifying a set of strings. Example: "an identifier is a string that starts with a letter and continues with letters and digits". (Usually) a regular expression.
  Token: a pair of (pattern, attributes)

22 Example
  void match0(char *s) /* find a zero */
  {
    if (!strncmp(s, "0.0", 3))
      return 0.0 ;
  }
  Token stream: VOID ID(match0) LPAREN CHAR DEREF ID(s) RPAREN LBRACE IF LPAREN NOT ID(strncmp) LPAREN ID(s) COMMA STRING(0.0) COMMA NUM(3) RPAREN RPAREN RETURN REAL(0.0) SEMI RBRACE EOF

23 Example Non Tokens (continued)
  Comments, preprocessor directives, macros and whitespace are lexemes that are recognized but get consumed rather than transmitted to the parser.
  Compare: if versus i/*comment*/f

24 Lecture Outline
  ✓ Role and place of lexical analysis
  ✓ What is a token?
  Regular languages
  Lexical analysis
  Error handling
  Automatic creation of lexical analyzers

25 How can we define tokens?
  Keywords: easy! if, then, else, for, while, ...
  Identifiers? Numerical values? Strings?
  Can we characterize unbounded sets of values using a bounded description?

26 Regular languages
  Formal languages: Σ = finite set of letters; word = sequence of letters; language = set of words
  Regular languages are defined equivalently by regular expressions and by finite-state automata

27 Common format for reg-exps
  Basic patterns          Matching
  x                       The character x
  .                       Any character, usually except a new line
  [xyz]                   Any of the characters x, y, z
  ^x                      Any character except x
  Repetition operators
  R?                      An R or nothing (= optionally an R)
  R*                      Zero or more occurrences of R
  R+                      One or more occurrences of R
  Composition operators
  R1R2                    An R1 followed by R2
  R1|R2                   Either an R1 or R2
  Grouping
  (R)                     R itself
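The operators in the table can be tried out with Python's `re` module, used here as a stand-in for the slide's notation. Note one difference that is specific to Python: "any character except x" is written `[^x]`. The helper name `matches` and the sample strings are assumptions chosen for the illustration.

```python
import re

def matches(pattern, word):
    """True if the whole word belongs to the language of the pattern."""
    return re.fullmatch(pattern, word) is not None

# (pattern, a word in the language, a word outside the language)
cases = [
    (r"x",     "x",    "y"),     # the character x
    (r".",     "q",    "\n"),    # any character except a new line
    (r"[xyz]", "y",    "a"),     # any of x, y, z
    (r"[^x]",  "a",    "x"),     # any character except x
    (r"ab?",   "a",    "abb"),   # R?: optionally a b
    (r"ab*",   "abbb", "ba"),    # R*: zero or more b
    (r"ab+",   "ab",   "a"),     # R+: one or more b
    (r"ab|cd", "cd",   "ad"),    # R1|R2: either ab or cd
    (r"(ab)+", "abab", "aba"),   # (R): grouping
]
for pattern, inside, outside in cases:
    assert matches(pattern, inside) and not matches(pattern, outside)
```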

28 Examples: which languages do these expressions denote?
  ab*|cd?
  (a|b)*
  (0|1|2|3|4|5|6|7|8|9)*

29 Escape characters
  What is the expression for one or more + symbols? (+)+ won't work; (\+)+ will
  A backslash \ before an operator turns it into a standard character: \*, \?, \+, a\(b\+\*, (a\(b\+\*)+, ...
  Double quotes surrounding text have the same effect as backslashes: "a(b+*", "a(b+*"+

30 Shorthands
  Use names for expressions:
    letter = a | b | ... | z | A | B | ... | Z
    letter_ = letter | _
    digit = 0 | 1 | 2 | ... | 9
    id = letter_ (letter_ | digit)*
  Use a hyphen to denote a range:
    letter = a-z | A-Z
    digit = 0-9

31 Examples
  if = if
  then = then
  relop = < | > | <= | >= | = | <>
  digit = 0-9
  digits = digit+

32 Example
  A number is
    number = (0|1|2|3|4|5|6|7|8|9)+ (ε | \. (0|1|2|3|4|5|6|7|8|9)+) (ε | E (0|1|2|3|4|5|6|7|8|9)+)
  Using shorthands it can be written as
    number = digits (ε | \.digits (ε | E (ε | + | -) digits))
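As a rough check of the shorthand form, the definition `number = digits (ε | \.digits (ε | E (ε | + | -) digits))` can be transcribed into Python's `re` syntax, where "ε | X" becomes an optional group `(?:X)?`. The names `NUMBER_RE` and `is_number` are assumptions for this sketch. In this form the exponent is nested inside the fraction, so a word such as 1E5 is not in the language.

```python
import re

digits = r"[0-9]+"
# number = digits (ε | \.digits (ε | E (ε | + | -) digits))
number = rf"{digits}(?:\.{digits}(?:E[+-]?{digits})?)?"
NUMBER_RE = re.compile(number)

def is_number(word):
    """True if the whole word matches the `number` definition."""
    return NUMBER_RE.fullmatch(word) is not None

for word in ["42", "3.14", "6.02E+23", "3.", ".5", "1E5"]:
    print(word, is_number(word))
```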

33 Exercise 1 - Question
  Language of rational numbers in decimal representation (no leading or ending zeros)
  For example, 007 is not in the language

34 Exercise 1 - Answer
  Language of rational numbers in decimal representation (no leading or ending zeros)
    Digit = 1 | 2 | ... | 9
    Digit0 = 0 | Digit
    Num = Digit Digit0*
    Frac = Digit0* Digit
    Pos = Num | \.Frac | 0\.Frac | Num\.Frac
    PosOrNeg = (ε | -) Pos
    R = 0 | PosOrNeg
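The answer can be checked mechanically by transcribing each definition into Python's `re` syntax. This is a sketch: the constant names mirror the slide, `in_language` is a helper name assumed for the example, and the test words are illustrative.

```python
import re

DIGIT = r"[1-9]"                      # Digit  = 1 | ... | 9
DIGIT0 = r"[0-9]"                     # Digit0 = 0 | Digit
NUM = rf"{DIGIT}{DIGIT0}*"            # Num    = Digit Digit0*
FRAC = rf"{DIGIT0}*{DIGIT}"           # Frac   = Digit0* Digit
# Pos = Num | \.Frac | 0\.Frac | Num\.Frac
POS = rf"(?:{NUM}\.{FRAC}|{NUM}|0?\.{FRAC})"
R = re.compile(rf"0|-?{POS}")         # R = 0 | (ε | -) Pos

def in_language(word):
    """True if the whole word is a number without leading or ending zeros."""
    return R.fullmatch(word) is not None

for word in ["0", "123.757", ".5", "0.7", "-12.05", "007", "0.30", "1.0"]:
    print(word, in_language(word))
```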

35 Exercise 2 - Question
  Equal number of opening and closing parentheses: [^n ]^n = [], [[]], [[[]]], ...

36 Exercise 2 - Answer
  Equal number of opening and closing parentheses: [^n ]^n = [], [[]], [[[]]], ...
  Not regular
  Context-free
  Grammar: S ::= [] | [S]

37 Challenge: Ambiguity
  If = if
  Id = Letter (Letter | Digit)*
  "if" is a valid identifier: what should it be?
  "iffy" is also a valid identifier
  Solution:
    Longest matching token
    Break ties using the order of definitions: keywords should appear before identifiers
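The two disambiguation rules can be sketched as follows. The rule list and the name `next_token` are assumptions for this example; each pattern is tried at the current position, a strictly longer match wins, and on a tie the rule listed first is kept.

```python
import re

# Rules in priority order: the keyword is defined before the identifier.
RULES = [
    ("If", re.compile(r"if")),
    ("Id", re.compile(r"[A-Za-z][A-Za-z0-9]*")),
]

def next_token(text, pos=0):
    """Return (kind, lexeme) of the longest match at `pos`, or None."""
    best = None
    for kind, pattern in RULES:
        m = pattern.match(text, pos)
        # Only a strictly longer lexeme replaces the current best,
        # so earlier rules win ties.
        if m and (best is None or len(m.group()) > len(best[1])):
            best = (kind, m.group())
    return best

print(next_token("if"))    # tie between If and Id, the earlier rule wins
print(next_token("iffy"))  # Id gives the longer match
```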

38 Creating a lexical analyzer
  Given a list of token definitions (pattern name, regex), write a program such that
    Input: string to be analyzed
    Output: list of tokens
  How do we build an analyzer?

39 Building a Scanner, Take I
  Input: string
  Output: sequence of tokens

40 Building a Scanner, Take I
  Token nextToken() {
    int c;                      /* getchar() returns an int */
  loop:
    c = getchar();
    switch (c) {
      case ' ': goto loop;
      case ';': return SemiColumn;
      case '+':
        c = getchar();
        switch (c) {
          case '+': return PlusPlus;
          case '=': return PlusEqual;
          default: ungetc(c, stdin); return Plus;
        };
      case '<':
      case 'w':
    }
  }

41 There must be a better way!

42 A better way
  Automatically generate a scanner
  Define tokens using regular expressions
  Use finite-state automata for detection

43 Reg-exp vs. automata
  Regular expressions are declarative: good for humans, not executable
  Automata are operative: they define an algorithm for deciding whether a given word is in a regular language, but are not a natural notation for humans

44 Overview
  Define tokens using regular expressions
  Construct a nondeterministic finite-state automaton (NFA) from the regular expression
  Determinize the NFA into a deterministic finite-state automaton (DFA)
  The DFA can be directly used to identify tokens

45 Automata theory: a bird's-eye view

46 Deterministic Automata (DFA)
  M = (Σ, Q, δ, q0, F)
    Σ: alphabet
    Q: finite set of states
    q0 ∈ Q: initial state
    F ⊆ Q: final states
    δ : Q × Σ → Q: transition function
  For a word w, M reaches some state x
  M accepts w if x ∈ F
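A DFA in the sense of the definition above can be run with a dictionary for δ. The automaton below (states q0, q1, q2 over the letters a, b, c, accepting words of the form a b* c) is an assumption made for illustration, as is the name `dfa_accepts`. A missing entry in the dictionary is treated as a stuck state and therefore as rejection.

```python
def dfa_accepts(delta, start, finals, word):
    """Run the DFA on `word`, reading letters left to right."""
    state = start
    for letter in word:
        if (state, letter) not in delta:
            return False          # missing transition: stuck, so reject
        state = delta[(state, letter)]
    return state in finals        # accept iff the reached state is final

# δ for the hypothetical language a b* c
DELTA = {
    ("q0", "a"): "q1",
    ("q1", "b"): "q1",
    ("q1", "c"): "q2",
}

for word in ["ac", "abbc", "ab", "abca"]:
    print(word, dfa_accepts(DELTA, "q0", {"q2"}, word))
```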

47 DFA in pictures
  An automaton is defined by states and transitions
  [figure: a DFA over the letters a, b, c, with its start state, transitions and accepting state marked]

48-53 Accepting and Rejecting Words
  Words are read left-to-right
  Missing transition = non-acceptance ("stuck state")
  [figures: step-by-step runs of the DFA on an accepted word and on a rejected word]

54 Non-deterministic Automata (NFA)
  M = (Σ, Q, δ, q0, F)
    Σ: alphabet
    Q: finite set of states
    q0 ∈ Q: initial state
    F ⊆ Q: final states
    δ : Q × (Σ ∪ {ε}) → 2^Q: transition function (DFA: δ : Q × Σ → Q)
  For a word w, M can reach a number of states X
  M accepts w if X ∩ F ≠ {}
  Possible: X = {}
  Possible: ε-transitions

55 NFA
  Allows multiple transitions from a given state labeled by the same letter
  [figure: an NFA with two transitions on the same letter leaving one state]

56-59 Accepting words
  Maintain a set of states
  Accept the word if an accepting state is reached
  [figures: step-by-step run of the NFA on an example word]

60 NFA+ε automata
  ε-transitions can fire without reading the input
  [figure: an NFA with an ε-transition]

61-66 NFA+ε run example
  An ε-transition can non-deterministically take place at any point
  ε-transitions can fire without reading the input
  The word is accepted
  [figures: step-by-step run of the NFA+ε automaton on an example word]
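The "maintain a set of states" idea, including ε-transitions, can be sketched as below. The example automaton is hypothetical (it accepts words over a, b that end in ab) and is chosen only because it has both two transitions on the same letter and one ε-transition; the names `eps_closure` and `nfa_accepts` are likewise assumptions.

```python
EPS = ""   # label used for ε-transitions in this sketch

def eps_closure(delta, states):
    """All states reachable from `states` using only ε-transitions."""
    stack, closure = list(states), set(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, EPS), ()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure

def nfa_accepts(delta, start, finals, word):
    """Track the set X of reachable states; accept if X meets the final states."""
    current = eps_closure(delta, {start})
    for letter in word:
        moved = {r for q in current for r in delta.get((q, letter), ())}
        current = eps_closure(delta, moved)
    return bool(current & finals)

# Hypothetical NFA: two transitions on 'a' from state 0, one ε-transition.
DELTA = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, EPS): {2},
    (2, "b"): {3},
}

for word in ["ab", "aab", "ba", "abb"]:
    print(word, nfa_accepts(DELTA, 0, {3}, word))
```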

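67 From regular expressions to NFA
  Step 1: assign expression names and obtain pure regular expressions R1 ... Rm
  Step 2: construct an NFA Mi for each regular expression Ri
  Step 3: combine all Mi into a single NFA
  Ambiguity resolution: prefer the longest accepting word

68 From reg. exp. to automata
  Theorem: there is an algorithm to build an NFA+ε automaton for any regular expression
  Proof: by induction on the structure of the regular expression

69 Basic constructs
  R = ε; R = a; R = ∅
  [figure: one two-state automaton fragment for each case]

70 Composition
  R = R1 | R2: a new start state with ε-transitions into M1 and M2, and ε-transitions from M1 and M2 into a new accepting state
  R = R1R2: an ε-transition into M1, an ε-transition from M1 into M2, and an ε-transition from M2 into the accepting state

71 Repetition
  R = R1*
  [figure: M1 wrapped in ε-transitions that allow it to be skipped or repeated]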
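The inductive construction above is commonly known as Thompson's construction. A minimal sketch follows, assuming regular expressions are given as nested tuples such as `("cat", r1, r2)`; this tuple encoding, the class name `NFA`, and the choice to give every fragment its own fresh start and accept state are assumptions of the example, not details taken from the slides.

```python
from collections import defaultdict
from itertools import count

EPS = None   # label of ε-transitions in this sketch

class NFA:
    def __init__(self):
        self.delta = defaultdict(set)   # (state, label) -> set of states
        self._ids = count()

    def build(self, r):
        """Return (start, accept) of an NFA fragment for the regex tree r."""
        s, f = next(self._ids), next(self._ids)
        op = r[0]
        if op == "eps":                 # R = ε
            self.delta[(s, EPS)].add(f)
        elif op == "sym":               # R = a
            self.delta[(s, r[1])].add(f)
        elif op == "empty":             # R = ∅: no path from s to f
            pass
        elif op == "cat":               # R = R1 R2
            s1, f1 = self.build(r[1])
            s2, f2 = self.build(r[2])
            self.delta[(s, EPS)].add(s1)
            self.delta[(f1, EPS)].add(s2)
            self.delta[(f2, EPS)].add(f)
        elif op == "alt":               # R = R1 | R2
            for sub in r[1:]:
                si, fi = self.build(sub)
                self.delta[(s, EPS)].add(si)
                self.delta[(fi, EPS)].add(f)
        elif op == "star":              # R = R1*
            s1, f1 = self.build(r[1])
            self.delta[(s, EPS)] |= {s1, f}    # enter M1 or skip it
            self.delta[(f1, EPS)] |= {s1, f}   # repeat M1 or leave
        else:
            raise ValueError(f"unknown operator {op!r}")
        return s, f

    def closure(self, states):
        stack, seen = list(states), set(states)
        while stack:
            q = stack.pop()
            for t in self.delta.get((q, EPS), ()):
                if t not in seen:
                    seen.add(t)
                    stack.append(t)
        return seen

    def accepts(self, start, accept, word):
        current = self.closure({start})
        for ch in word:
            step = {t for q in current for t in self.delta.get((q, ch), ())}
            current = self.closure(step)
        return accept in current

# (a|b)*abb written as a regex tree
a, b = ("sym", "a"), ("sym", "b")
regex = ("cat", ("cat", ("cat", ("star", ("alt", a, b)), a), b), b)
nfa = NFA()
start, accept = nfa.build(regex)
print(nfa.accepts(start, accept, "babb"), nfa.accepts(start, accept, "ab"))
```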


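73 Naïve approach
  Try each automaton separately
  Given a word w: try M1(w), try M2(w), ..., try Mn(w)
  Requires resetting after every attempt

74 Actually, we combine automata
  The automata for the patterns a, abb, a*b+ and abab are combined through ε-transitions from a new start state
  [figure: the combined NFA, states 0 to 13]

75 Corresponding DFA
  [figure: the DFA obtained from the combined NFA for a, abb, a*b+ and abab; each DFA state is a set of NFA states]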

76 Scanning with DFA
  Run until stuck
  Remember the last accepting state
  Go back to the accepting state
  Return the token

77 Ambiguity resolution
  Longest word
  Tie-breaker based on the order of rules when words have the same length
  [figure: the combined NFA for a, abb, a*b+ and abab, states 0 to 13]

78 Examples
  abaa: gets stuck after aba in state 12, backs up to state (5 8 11); the pattern is a*b+, the token is ab
  Tokens: <a*b+, ab> <a, a> <a, a>

79 Examples
  abba: stops after the second b in (6 8); the token is abb because it comes first in the spec
  Tokens: <abb, abb> <a, a>
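The two runs above can be replayed with a small simulation that follows the "run until stuck, go back to the last accepting state" procedure. The transition table is a reconstruction of the combined NFA for a, abb, a*b+ and abab; the state numbers follow the slides as far as the transcription allows and should be read as an assumption, as should the function names.

```python
EPS = ""
DELTA = {
    (0, EPS): {1, 3, 7, 9},
    (1, "a"): {2},                                       # a
    (3, "a"): {4}, (4, "b"): {5}, (5, "b"): {6},         # abb
    (7, "a"): {7}, (7, "b"): {8}, (8, "b"): {8},         # a*b+
    (9, "a"): {10}, (10, "b"): {11},
    (11, "a"): {12}, (12, "b"): {13},                    # abab
}
# Accepting NFA states in rule order: earlier rules win ties.
ACCEPTING = [(2, "a"), (6, "abb"), (8, "a*b+"), (13, "abab")]

def closure(states):
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for t in DELTA.get((q, EPS), ()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen

def pattern_of(states):
    """Highest-priority pattern accepted by this set of states, if any."""
    for q, name in ACCEPTING:
        if q in states:
            return name
    return None

def scan(text):
    tokens, pos = [], 0
    while pos < len(text):
        current, last, i = closure({0}), None, pos
        while i < len(text) and current:       # run until stuck
            current = closure({t for q in current
                               for t in DELTA.get((q, text[i]), ())})
            i += 1
            name = pattern_of(current)
            if name:
                last = (name, i)               # remember last accepting point
        if last is None:
            raise ValueError(f"no token at position {pos}")
        name, end = last
        tokens.append((name, text[pos:end]))   # back up and emit the token
        pos = end
    return tokens

print(scan("abaa"))
print(scan("abba"))
```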

80 Summary of Construction
  Describe tokens as regular expressions
  Decide which attributes (values) to save for each token
  The regular expressions are turned into a DFA, which also records which attributes (values) to keep
  The lexical analyzer simulates the run of an automaton with the given transition table on any input string

81 A Few Remarks
  Turning an NFA into a DFA is expensive, but:
    Exponential in the worst case
    In practice, works fine
  The construction is done once per language:
    At compiler construction time
    Not at compilation time
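The determinization step mentioned in these remarks can be sketched with the subset construction, in which every DFA state is a set of NFA states (which is where the worst-case exponential blow-up comes from). The small NFA used here is hypothetical, and `determinize` is a name assumed for this example.

```python
from collections import deque

EPS = ""

def closure(delta, states):
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for t in delta.get((q, EPS), ()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen

def determinize(delta, start, alphabet):
    """Subset construction: return (start set, DFA transitions, DFA states)."""
    start_set = frozenset(closure(delta, {start}))
    dfa, seen, queue = {}, {start_set}, deque([start_set])
    while queue:
        S = queue.popleft()
        for a in alphabet:
            moved = {t for q in S for t in delta.get((q, a), ())}
            T = frozenset(closure(delta, moved))
            if not T:
                continue                 # no transition: the DFA is stuck
            dfa[(S, a)] = T
            if T not in seen:
                seen.add(T)
                queue.append(T)
    return start_set, dfa, seen

# Hypothetical NFA accepting words over {a, b} that end in ab.
NFA_DELTA = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, EPS): {2}, (2, "b"): {3}}
start_set, dfa, states = determinize(NFA_DELTA, 0, "ab")
for (S, a), T in sorted(dfa.items(), key=lambda kv: (sorted(kv[0][0]), kv[0][1])):
    print(sorted(S), a, "->", sorted(T))
```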

82 Implementation

83 Implementation by Example
  Pattern                                 Action              Examples
  if                                      { return IF; }      if
  [a-z][a-z0-9]*                          { return ID; }      xy, i, zs98
  [0-9]+                                  { return NUM; }     3, 32
  ([0-9]+"."[0-9]*)|([0-9]*"."[0-9]+)     { return REAL; }
  (\-\-[a-z]*\n)|(" "|\n|\t)              { ; }               --comm\n, \n, \t
  .                                       { error(); }
  [figure: the combined DFA, with accepting states labeled ID, IF, NUM, REAL, w.s. (whitespace) and error]

84 Transition table for the DFA
  int edges[][256] = {
                    /* ..., 0, 1, 2, 3, ..., -, e, f, g, h, i, j, ... */
    /* state 0 */   {0, ..., 0, 0, ..., 0, 0, 0, 0, 0, ..., 0, 0, 0, 0, 0, 0},
    /* state 1 */   {13, ..., 7, 7, 7, 7, ..., 9, 4, 4, 4, 4, 2, 4, ..., 13, 13},
    /* state 2 */   {0, ..., 4, 4, 4, 4, ..., 0, 4, 3, 4, 4, 4, 4, ..., 0, 0},
    /* state 3 */   {0, ..., 4, 4, 4, 4, ..., 0, 4, 4, 4, 4, 4, 4, ..., 0, 0},
    /* state 4 */   {0, ..., 4, 4, 4, 4, ..., 0, 4, 4, 4, 4, 4, 4, ..., 0, 0},
    /* state 5 */   {0, ..., 6, 6, 6, 6, ..., 0, 0, 0, 0, 0, 0, 0, ..., 0, 0},
    /* state 6 */   {0, ..., 6, 6, 6, 6, ..., 0, 0, 0, 0, 0, 0, 0, ..., 0, 0},
    /* state 7 */   ...
    /* state 13 */  {0, ..., 0, 0, 0, 0, ..., 0, 0, 0, 0, 0, 0, 0, ..., 0, 0}
  };

85 Pseudo Code for Scanner
  char* input = ... ;
  Token nextToken() {
    lastFinal = 0;
    currentState = 1;
    inputPositionAtLastFinal = input;
    currentPosition = input;
    while (not(isDead(currentState))) {
      nextState = edges[currentState][*currentPosition];
      if (isFinal(nextState)) {
        lastFinal = nextState;
        inputPositionAtLastFinal = currentPosition;
      }
      currentState = nextState;
      advance currentPosition;
    }
    input = inputPositionAtLastFinal + 1;
    return action[lastFinal];
  }

86-90 Example
  Input: if --not-a-com (the "if" is followed by 2 blanks)
  [trace table with columns: last final state, current state, current input; recorded actions in order: return IF, found whitespace, error, error]

91 Concluding remarks
  Efficient scanner
  Minimization
  Error handling
  Automatic creation of lexical analyzers

92 Efficient Scanners
  Efficient state representation
  Input buffering
  Using switch and gotos instead of tables

93 Minimization
  Create a non-deterministic automaton (NDFA) from every regular expression
  Merge all the automata using epsilon moves (like the | construction)
  Construct a deterministic finite automaton (DFA), with state priority
  Minimize the automaton: separate accepting states by token kinds

94-97 Example
  if                { return IF; }
  [a-z][a-z0-9]*    { return ID; }
  [0-9]+            { return NUM; }
  [figures: the automata for these rules at successive stages of the construction, with states labeled IF, ID, NUM and error]
  Modern Compiler Implementation in ML, Andrew Appel, (c)1998, Figures 2.7, 2.8

98 Error Handling
  Many errors cannot be identified at this stage
    Example: "fi (a==f(x))". Should "fi" be "if"? Or is it a routine name?
    We will discover this later in the analysis; at this point, we just create an identifier token
  Sometimes the lexeme does not match any pattern
    Easiest: eliminate letters until the beginning of a legitimate lexeme
    Alternatives: eliminate/add/replace one letter, swap the order of two adjacent letters, etc.
  Goal: allow the compilation to continue
  Problem: errors that spread all over

99 Automatically generated scanners
  Use of Program-Generating Tools: Specification → Compiler-Compiler → Part of compiler
  regular expressions → JFlex → scanner
  input program → scanner → stream of tokens

100 Use of Program-Generating Tools
  Input: regular expressions and actions (action = Java code)
  Output: a scanner program that produces a stream of tokens and invokes the actions when a pattern is matched
  regular expressions → JFlex → scanner
  input program → scanner → stream of tokens

101 Line Counting Example
  Create a program that counts the number of lines in a given input text file

102 Creating a Scanner using Flex
%option noyywrap
%{
#include <stdio.h>
int num_lines = 0;
%}
%%
\n      ++num_lines;
.       ;
%%
int main(void) {
    yylex();
    printf("# of lines = %d\n", num_lines);
    return 0;
}

103 Creating a Scanner using Flex (continued)
  [figure: the automaton for the line-counting specification: from the initial state, \n leads to the state "newline" and any other character (^\n) leads to the state "other"]

104 JFlex Spec File
  User code: copied directly to the Java file; a possible source of javac errors down the road
  %%
  JFlex directives: macros, state names
    DIGIT= [0-9]
    LETTER= [a-zA-Z]
  %%
  Lexical analysis rules: optional state, regular expression, action
    How to break the input into tokens; the action to run when a token is matched
    <YYINITIAL> {LETTER}({LETTER}|{DIGIT})*

105 Creating a Scanner using JFlex
import java_cup.runtime.*;
%%
%cup
%{
  private int lineCounter = 0;
%}
%eofval{
  System.out.println("line number=" + lineCounter);
  return new Symbol(sym.EOF);
%eofval}
NEWLINE=\n
%%
{NEWLINE} { lineCounter++; }
[^{NEWLINE}] { }

106 Catching errors
  What if the input doesn't match any token definition?
  Trick: add a "catch-all" rule that matches any character and reports an error
  Add it after all other rules

107 A JFlex specification of a C Scanner
import java_cup.runtime.*;
%%
%cup
%{
  private int lineCounter = 0;
%}
Letter= [a-zA-Z_]
Digit= [0-9]
%%
\t { }
\n { lineCounter++; }
";" { return new Symbol(sym.SemiColumn); }
"++" { return new Symbol(sym.PlusPlus); }
"+=" { return new Symbol(sym.PlusEq); }
"+" { return new Symbol(sym.Plus); }
"while" { return new Symbol(sym.While); }
{Letter}({Letter}|{Digit})* { return new Symbol(sym.Id, yytext()); }
"<=" { return new Symbol(sym.LessOrEqual); }
"<" { return new Symbol(sym.LessThan); }

108 Missing
  Creating a lexical analyzer by hand
  Table compression
  Symbol tables
  Nested comments
  Handling macros

109 Lexical Analysis: What
  Input: program text (file)
  Output: sequence of tokens

110 Lexical Analysis: How
  Define tokens using regular expressions
  Construct a nondeterministic finite-state automaton (NFA) from the regular expression
  Determinize the NFA into a deterministic finite-state automaton (DFA)
  The DFA can be directly used to identify tokens

111 Lexical Analysis: Why
  Read the input file
  Identify language keywords and standard identifiers
  Handle include files and macros
  Count line numbers
  Remove whitespace
  Report illegal symbols
  [Produce a symbol table]

112 The Real Anatomy of a Compiler
  Source text (txt) → Process text input → characters → Lexical Analysis → tokens → Syntax Analysis → AST → Sem. Analysis → Annotated AST → Intermediate code generation → IR → Intermediate code optimization → IR → Code generation → Symbolic Instructions → Target code optimization → SI → Machine code generation → MI → Write executable output → Executable code (exe)


Applied Databases. Sebastian Maneth. Lecture 13 Online Pattern Matching on Strings. University of Edinburgh - February 29th, 2016 Applied Dtses Lecture 13 Online Pttern Mtching on Strings Sestin Mneth University of Edinurgh - Ferury 29th, 2016 2 Outline 1. Nive Method 2. Automton Method 3. Knuth-Morris-Prtt Algorithm 4. Boyer-Moore

More information

COMP 423 lecture 11 Jan. 28, 2008

COMP 423 lecture 11 Jan. 28, 2008 COMP 423 lecture 11 Jn. 28, 2008 Up to now, we hve looked t how some symols in n lphet occur more frequently thn others nd how we cn sve its y using code such tht the codewords for more frequently occuring

More information

TO REGULAR EXPRESSIONS

TO REGULAR EXPRESSIONS Suject :- Computer Science Course Nme :- Theory Of Computtion DA TO REGULAR EXPRESSIONS Report Sumitted y:- Ajy Singh Meen 07000505 jysmeen@cse.iit.c.in BASIC DEINITIONS DA:- A finite stte mchine where

More information

Scanner Termination. Multi Character Lookahead

Scanner Termination. Multi Character Lookahead If d.doublevlue() represents vlid integer, (int) d.doublevlue() will crete the pproprite integer vlue. If string representtion of n integer begins with ~ we cn strip the ~, convert to double nd then negte

More information

Quiz2 45mins. Personal Number: Problem 1. (20pts) Here is an Table of Perl Regular Ex

Quiz2 45mins. Personal Number: Problem 1. (20pts) Here is an Table of Perl Regular Ex Long Quiz2 45mins Nme: Personl Numer: Prolem. (20pts) Here is n Tle of Perl Regulr Ex Chrcter Description. single chrcter \s whitespce chrcter (spce, t, newline) \S non-whitespce chrcter \d digit (0-9)

More information

Deterministic. Finite Automata. And Regular Languages. Fall 2018 Costas Busch - RPI 1

Deterministic. Finite Automata. And Regular Languages. Fall 2018 Costas Busch - RPI 1 Deterministic Finite Automt And Regulr Lnguges Fll 2018 Costs Busch - RPI 1 Deterministic Finite Automton (DFA) Input Tpe String Finite Automton Output Accept or Reject Fll 2018 Costs Busch - RPI 2 Trnsition

More information

Operator Precedence. Java CUP. E E + T T T * P P P id id id. Does a+b*c mean (a+b)*c or

Operator Precedence. Java CUP. E E + T T T * P P P id id id. Does a+b*c mean (a+b)*c or Opertor Precedence Most progrmming lnguges hve opertor precedence rules tht stte the order in which opertors re pplied (in the sence of explicit prentheses). Thus in C nd Jv nd CSX, +*c mens compute *c,

More information

12 <= rm <digit> 2 <= rm <no> 2 <= rm <no> <digit> <= rm <no> <= rm <number>

12 <= rm <digit> 2 <= rm <no> 2 <= rm <no> <digit> <= rm <no> <= rm <number> DDD16 Compilers nd Interpreters DDB44 Compiler Construction R Prsing Prt 1 R prsing concept Using prser genertor Prse ree Genertion Wht is R-prsing? eft-to-right scnning R Rigthmost derivtion in reverse

More information

CS 340, Fall 2014 Dec 11 th /13 th Final Exam Note: in all questions, the special symbol ɛ (epsilon) is used to indicate the empty string.

CS 340, Fall 2014 Dec 11 th /13 th Final Exam Note: in all questions, the special symbol ɛ (epsilon) is used to indicate the empty string. CS 340, Fll 2014 Dec 11 th /13 th Finl Exm Nme: Note: in ll questions, the specil symol ɛ (epsilon) is used to indicte the empty string. Question 1. [5 points] Consider the following regulr expression;

More information

Theory of Computation CSE 105

Theory of Computation CSE 105 $ $ $ Theory of Computtion CSE 105 Regulr Lnguges Study Guide nd Homework I Homework I: Solutions to the following problems should be turned in clss on July 1, 1999. Instructions: Write your nswers clerly

More information

Context-Free Grammars

Context-Free Grammars Context-Free Grmmrs Descriing Lnguges We've seen two models for the regulr lnguges: Finite utomt ccept precisely the strings in the lnguge. Regulr expressions descrie precisely the strings in the lnguge.

More information

CS 241. Fall 2017 Midterm Review Solutions. October 24, Bits and Bytes 1. 3 MIPS Assembler 6. 4 Regular Languages 7.

CS 241. Fall 2017 Midterm Review Solutions. October 24, Bits and Bytes 1. 3 MIPS Assembler 6. 4 Regular Languages 7. CS 241 Fll 2017 Midterm Review Solutions Octoer 24, 2017 Contents 1 Bits nd Bytes 1 2 MIPS Assemly Lnguge Progrmming 2 3 MIPS Assemler 6 4 Regulr Lnguges 7 5 Scnning 9 1 Bits nd Bytes 1. Give two s complement

More information

CS 340, Fall 2016 Sep 29th Exam 1 Note: in all questions, the special symbol ɛ (epsilon) is used to indicate the empty string.

CS 340, Fall 2016 Sep 29th Exam 1 Note: in all questions, the special symbol ɛ (epsilon) is used to indicate the empty string. CS 340, Fll 2016 Sep 29th Exm 1 Nme: Note: in ll questions, the speil symol ɛ (epsilon) is used to indite the empty string. Question 1. [10 points] Speify regulr expression tht genertes the lnguge over

More information

Compilers Spring 2013 PRACTICE Midterm Exam

Compilers Spring 2013 PRACTICE Midterm Exam Compilers Spring 2013 PRACTICE Midterm Exm This is full length prctice midterm exm. If you wnt to tke it t exm pce, give yourself 7 minutes to tke the entire test. Just like the rel exm, ech question hs

More information

10/12/17. Motivating Example. Lexical and Syntax Analysis (2) Recursive-Descent Parsing. Recursive-Descent Parsing. Recursive-Descent Parsing

10/12/17. Motivating Example. Lexical and Syntax Analysis (2) Recursive-Descent Parsing. Recursive-Descent Parsing. Recursive-Descent Parsing Motivting Exmple Lexicl nd yntx Anlysis (2) In Text: Chpter 4 Consider the grmmr -> cad A -> b Input string: w = cd How to build prse tree top-down? 2 Initilly crete tree contining single node (the strt

More information

CS 321 Programming Languages and Compilers. Bottom Up Parsing

CS 321 Programming Languages and Compilers. Bottom Up Parsing CS 321 Progrmming nguges nd Compilers Bottom Up Prsing Bottom-up Prsing: Shift-reduce prsing Grmmr H: fi ; fi b Input: ;;b hs prse tree ; ; b 2 Dt for Shift-reduce Prser Input string: sequence of tokens

More information

What are suffix trees?

What are suffix trees? Suffix Trees 1 Wht re suffix trees? Allow lgorithm designers to store very lrge mount of informtion out strings while still keeping within liner spce Allow users to serch for new strings in the originl

More information

LR Parsing, Part 2. Constructing Parse Tables. Need to Automatically Construct LR Parse Tables: Action and GOTO Table

LR Parsing, Part 2. Constructing Parse Tables. Need to Automatically Construct LR Parse Tables: Action and GOTO Table TDDD55 Compilers nd Interpreters TDDB44 Compiler Construction LR Prsing, Prt 2 Constructing Prse Tles Prse tle construction Grmmr conflict hndling Ctegories of LR Grmmrs nd Prsers Peter Fritzson, Christoph

More information

COS 333: Advanced Programming Techniques

COS 333: Advanced Programming Techniques COS 333: Advnced Progrmming Techniques Brin Kernighn wk@cs, www.cs.princeton.edu/~wk 311 CS Building 609-258-2089 (ut emil is lwys etter) TA's: Junwen Li, li@cs, CS 217,258-0451 Yong Wng,yongwng@cs, CS

More information

Java CUP. Java CUP Specifications. User Code Additions. Package and Import Specifications

Java CUP. Java CUP Specifications. User Code Additions. Package and Import Specifications Jv CUP Jv CUP is prser-genertion tool, similr to Ycc. CUP uilds Jv prser for LALR(1) grmmrs from production rules nd ssocited Jv code frgments. When prticulr production is recognized, its ssocited code

More information

Scanning Theory and Practice

Scanning Theory and Practice CHAPTER 3 Scnning Theory nd Prctice 3.1 Overview The primry function of scnner is to red in chrcters from source file nd group them into tokens. A scnner is sometimes clled lexicl nlyzer or lexer. The

More information

stack of states and grammar symbols Stack-Bottom marker C. Kessler, IDA, Linköpings universitet. 1. <list> -> <list>, <element> 2.

stack of states and grammar symbols Stack-Bottom marker C. Kessler, IDA, Linköpings universitet. 1. <list> -> <list>, <element> 2. TDDB9 Compilers nd Interpreters TDDB44 Compiler Construction LR Prsing Updted/New slide mteril 007: Pushdown Automton for LR-Prsing Finite-stte pushdown utomton contins lterntingly sttes nd symols in NUΣ

More information

Lecture 10 Evolutionary Computation: Evolution strategies and genetic programming

Lecture 10 Evolutionary Computation: Evolution strategies and genetic programming Lecture 10 Evolutionry Computtion: Evolution strtegies nd genetic progrmming Evolution strtegies Genetic progrmming Summry Negnevitsky, Person Eduction, 2011 1 Evolution Strtegies Another pproch to simulting

More information

Context-Free Grammars

Context-Free Grammars Context-Free Grmmrs Descriing Lnguges We've seen two models for the regulr lnguges: Finite utomt ccept precisely the strings in the lnguge. Regulr expressions descrie precisely the strings in the lnguge.

More information

CMSC 331 First Midterm Exam

CMSC 331 First Midterm Exam 0 00/ 1 20/ 2 05/ 3 15/ 4 15/ 5 15/ 6 20/ 7 30/ 8 30/ 150/ 331 First Midterm Exm 7 October 2003 CMC 331 First Midterm Exm Nme: mple Answers tudent ID#: You will hve seventy-five (75) minutes to complete

More information

Lecture T4: Pattern Matching

Lecture T4: Pattern Matching Introduction to Theoreticl CS Lecture T4: Pttern Mtching Two fundmentl questions. Wht cn computer do? How fst cn it do it? Generl pproch. Don t tlk bout specific mchines or problems. Consider miniml bstrct

More information

COS 333: Advanced Programming Techniques

COS 333: Advanced Programming Techniques COS 333: Advnced Progrmming Techniques How to find me wk@cs, www.cs.princeton.edu/~wk 311 CS Building 609-258-2089 (ut emil is lwys etter) TA's: Mtvey Arye (rye), Tom Jlin (tjlin), Nick Johnson (npjohnso)

More information

2014 Haskell January Test Regular Expressions and Finite Automata

2014 Haskell January Test Regular Expressions and Finite Automata 0 Hskell Jnury Test Regulr Expressions nd Finite Automt This test comprises four prts nd the mximum mrk is 5. Prts I, II nd III re worth 3 of the 5 mrks vilble. The 0 Hskell Progrmming Prize will be wrded

More information

CS481: Bioinformatics Algorithms

CS481: Bioinformatics Algorithms CS481: Bioinformtics Algorithms Cn Alkn EA509 clkn@cs.ilkent.edu.tr http://www.cs.ilkent.edu.tr/~clkn/teching/cs481/ EXACT STRING MATCHING Fingerprint ide Assume: We cn compute fingerprint f(p) of P in

More information

Midterm I Solutions CS164, Spring 2006

Midterm I Solutions CS164, Spring 2006 Midterm I Solutions CS164, Spring 2006 Februry 23, 2006 Plese red ll instructions (including these) crefully. Write your nme, login, SID, nd circle the section time. There re 8 pges in this exm nd 4 questions,

More information

Stack. A list whose end points are pointed by top and bottom

Stack. A list whose end points are pointed by top and bottom 4. Stck Stck A list whose end points re pointed by top nd bottom Insertion nd deletion tke plce t the top (cf: Wht is the difference between Stck nd Arry?) Bottom is constnt, but top grows nd shrinks!

More information

CS 241 Week 4 Tutorial Solutions

CS 241 Week 4 Tutorial Solutions CS 4 Week 4 Tutoril Solutions Writing n Assemler, Prt & Regulr Lnguges Prt Winter 8 Assemling instrutions utomtilly. slt $d, $s, $t. Solution: $d, $s, nd $t ll fit in -it signed integers sine they re 5-it

More information

Lexical Analysis. Role, Specification & Recognition Tool: LEX Construction: - RE to NFA to DFA to min-state DFA - RE to DFA

Lexical Analysis. Role, Specification & Recognition Tool: LEX Construction: - RE to NFA to DFA to min-state DFA - RE to DFA Lexicl Anlysis Role, Specifiction & Recognition Tool: LEX Construction: - RE to NFA to DFA to min-stte DFA - RE to DFA Conducting Lexicl Anlysis Techniques for specifying nd implementing lexicl nlyzers

More information

Lecture T1: Pattern Matching

Lecture T1: Pattern Matching Introduction to Theoreticl CS Lecture T: Pttern Mtchin Two fundmentl questions. Wht cn computer do? Wht cn computer do with limited resources? Generl pproch. Don t tlk out specific mchines or prolems.

More information

Regular Expressions and Automata using Miranda

Regular Expressions and Automata using Miranda Regulr Expressions nd Automt using Mirnd Simon Thompson Computing Lortory Univerisity of Kent t Cnterury My 1995 Contents 1 Introduction ::::::::::::::::::::::::::::::::: 1 2 Regulr Expressions :::::::::::::::::::::::::::::

More information

Compilation

Compilation Compilation 0368-3133 Lecture 1: Introduction Lexical Analysis Noam Rinetzky 1 2 Admin Lecturer: Noam Rinetzky maon@tau.ac.il http://www.cs.tau.ac.il/~maon T.A.: Oren Ish Shalom Textbooks: Modern Compiler

More information

Regular Expression Matching with Multi-Strings and Intervals. Philip Bille Mikkel Thorup

Regular Expression Matching with Multi-Strings and Intervals. Philip Bille Mikkel Thorup Regulr Expression Mtching with Multi-Strings nd Intervls Philip Bille Mikkel Thorup Outline Definition Applictions Previous work Two new problems: Multi-strings nd chrcter clss intervls Algorithms Thompson

More information

UNIVERSITY OF EDINBURGH COLLEGE OF SCIENCE AND ENGINEERING SCHOOL OF INFORMATICS INFORMATICS 1 COMPUTATION & LOGIC INSTRUCTIONS TO CANDIDATES

UNIVERSITY OF EDINBURGH COLLEGE OF SCIENCE AND ENGINEERING SCHOOL OF INFORMATICS INFORMATICS 1 COMPUTATION & LOGIC INSTRUCTIONS TO CANDIDATES UNIVERSITY OF EDINBURGH COLLEGE OF SCIENCE AND ENGINEERING SCHOOL OF INFORMATICS INFORMATICS COMPUTATION & LOGIC Sturdy st April 7 : to : INSTRUCTIONS TO CANDIDATES This is tke-home exercise. It will not

More information

Algorithm Design (5) Text Search

Algorithm Design (5) Text Search Algorithm Design (5) Text Serch Tkshi Chikym School of Engineering The University of Tokyo Text Serch Find sustring tht mtches the given key string in text dt of lrge mount Key string: chr x[m] Text Dt:

More information

COMPUTER SCIENCE 123. Foundations of Computer Science. 6. Tuples

COMPUTER SCIENCE 123. Foundations of Computer Science. 6. Tuples COMPUTER SCIENCE 123 Foundtions of Computer Science 6. Tuples Summry: This lecture introduces tuples in Hskell. Reference: Thompson Sections 5.1 2 R.L. While, 2000 3 Tuples Most dt comes with structure

More information

CS311H: Discrete Mathematics. Graph Theory IV. A Non-planar Graph. Regions of a Planar Graph. Euler s Formula. Instructor: Işıl Dillig

CS311H: Discrete Mathematics. Graph Theory IV. A Non-planar Graph. Regions of a Planar Graph. Euler s Formula. Instructor: Işıl Dillig CS311H: Discrete Mthemtics Grph Theory IV Instructor: Işıl Dillig Instructor: Işıl Dillig, CS311H: Discrete Mthemtics Grph Theory IV 1/25 A Non-plnr Grph Regions of Plnr Grph The plnr representtion of

More information

2 Computing all Intersections of a Set of Segments Line Segment Intersection

2 Computing all Intersections of a Set of Segments Line Segment Intersection 15-451/651: Design & Anlysis of Algorithms Novemer 14, 2016 Lecture #21 Sweep-Line nd Segment Intersection lst chnged: Novemer 8, 2017 1 Preliminries The sweep-line prdigm is very powerful lgorithmic design

More information

Homework. Context Free Languages III. Languages. Plan for today. Context Free Languages. CFLs and Regular Languages. Homework #5 (due 10/22)

Homework. Context Free Languages III. Languages. Plan for today. Context Free Languages. CFLs and Regular Languages. Homework #5 (due 10/22) Homework Context Free Lnguges III Prse Trees nd Homework #5 (due 10/22) From textbook 6.4,b 6.5b 6.9b,c 6.13 6.22 Pln for tody Context Free Lnguges Next clss of lnguges in our quest! Lnguges Recll. Wht

More information

acronyms possibly used in this test: CFG :acontext free grammar CFSM :acharacteristic finite state machine DFA :adeterministic finite automata

acronyms possibly used in this test: CFG :acontext free grammar CFSM :acharacteristic finite state machine DFA :adeterministic finite automata EE573 Fll 2002, Exm open book, if question seems mbiguous, sk me to clrify the question. If my nswer doesn t stisfy you, plese stte your ssumptions. cronyms possibly used in this test: CFG :context free

More information

Discussion 1 Recap. COP4600 Discussion 2 OS concepts, System call, and Assignment 1. Questions. Questions. Outline. Outline 10/24/2010

Discussion 1 Recap. COP4600 Discussion 2 OS concepts, System call, and Assignment 1. Questions. Questions. Outline. Outline 10/24/2010 COP4600 Discussion 2 OS concepts, System cll, nd Assignment 1 TA: Hufeng Jin hj0@cise.ufl.edu Discussion 1 Recp Introduction to C C Bsic Types (chr, int, long, flot, doule, ) C Preprocessors (#include,

More information

COMBINATORIAL PATTERN MATCHING

COMBINATORIAL PATTERN MATCHING COMBINATORIAL PATTERN MATCHING Genomic Repets Exmple of repets: ATGGTCTAGGTCCTAGTGGTC Motivtion to find them: Genomic rerrngements re often ssocited with repets Trce evolutionry secrets Mny tumors re chrcterized

More information

Slides for Data Mining by I. H. Witten and E. Frank

Slides for Data Mining by I. H. Witten and E. Frank Slides for Dt Mining y I. H. Witten nd E. Frnk Simplicity first Simple lgorithms often work very well! There re mny kinds of simple structure, eg: One ttriute does ll the work All ttriutes contriute eqully

More information

A Tautology Checker loosely related to Stålmarck s Algorithm by Martin Richards

A Tautology Checker loosely related to Stålmarck s Algorithm by Martin Richards A Tutology Checker loosely relted to Stålmrck s Algorithm y Mrtin Richrds mr@cl.cm.c.uk http://www.cl.cm.c.uk/users/mr/ University Computer Lortory New Museum Site Pemroke Street Cmridge, CB2 3QG Mrtin

More information

Systems I. Logic Design I. Topics Digital logic Logic gates Simple combinational logic circuits

Systems I. Logic Design I. Topics Digital logic Logic gates Simple combinational logic circuits Systems I Logic Design I Topics Digitl logic Logic gtes Simple comintionl logic circuits Simple C sttement.. C = + ; Wht pieces of hrdwre do you think you might need? Storge - for vlues,, C Computtion

More information

Presentation Martin Randers

Presentation Martin Randers Presenttion Mrtin Rnders Outline Introduction Algorithms Implementtion nd experiments Memory consumption Summry Introduction Introduction Evolution of species cn e modelled in trees Trees consist of nodes

More information

PPS: User Manual. Krishnendu Chatterjee, Martin Chmelik, Raghav Gupta, and Ayush Kanodia

PPS: User Manual. Krishnendu Chatterjee, Martin Chmelik, Raghav Gupta, and Ayush Kanodia PPS: User Mnul Krishnendu Chtterjee, Mrtin Chmelik, Rghv Gupt, nd Ayush Knodi IST Austri (Institute of Science nd Technology Austri), Klosterneuurg, Austri In this section we descrie the tool fetures,

More information

Virtual Machine (Part I)

Virtual Machine (Part I) Hrvrd University CS Fll 2, Shimon Schocken Virtul Mchine (Prt I) Elements of Computing Systems Virtul Mchine I (Ch. 7) Motivtion clss clss Min Min sttic sttic x; x; function function void void min() min()

More information

Sample Midterm Solutions COMS W4115 Programming Languages and Translators Monday, October 12, 2009

Sample Midterm Solutions COMS W4115 Programming Languages and Translators Monday, October 12, 2009 Deprtment of Computer cience Columbi University mple Midterm olutions COM W4115 Progrmming Lnguges nd Trnsltors Mondy, October 12, 2009 Closed book, no ids. ch question is worth 20 points. Question 5(c)

More information

Mid-term exam. Scores. Fall term 2012 KAIST EE209 Programming Structures for EE. Thursday Oct 25, Student's name: Student ID:

Mid-term exam. Scores. Fall term 2012 KAIST EE209 Programming Structures for EE. Thursday Oct 25, Student's name: Student ID: Fll term 2012 KAIST EE209 Progrmming Structures for EE Mid-term exm Thursdy Oct 25, 2012 Student's nme: Student ID: The exm is closed book nd notes. Red the questions crefully nd focus your nswers on wht

More information

Today. CS 188: Artificial Intelligence Fall Recap: Search. Example: Pancake Problem. Example: Pancake Problem. General Tree Search.

Today. CS 188: Artificial Intelligence Fall Recap: Search. Example: Pancake Problem. Example: Pancake Problem. General Tree Search. CS 88: Artificil Intelligence Fll 00 Lecture : A* Serch 9//00 A* Serch rph Serch Tody Heuristic Design Dn Klein UC Berkeley Multiple slides from Sturt Russell or Andrew Moore Recp: Serch Exmple: Pncke

More information

Recognition of Tokens

Recognition of Tokens 42 Recognton o Tokens The queston s how to recognze the tokens? Exmple: ssume the ollowng grmmr rgment to generte specc lnguge: stmt expr expr then stmt expr then stmt else stmt term relop term term term

More information

Lecture 18: Theory of Computation

Lecture 18: Theory of Computation Introduction to Theoreticl CS ecture 18: Theory of Computtion Two fundmentl questions. Wht cn computer do? Wht cn computer do with limited resources? Generl pproch. Pentium IV running inux kernel.4. Don't

More information

Tries. Yufei Tao KAIST. April 9, Y. Tao, April 9, 2013 Tries

Tries. Yufei Tao KAIST. April 9, Y. Tao, April 9, 2013 Tries Tries Yufei To KAIST April 9, 2013 Y. To, April 9, 2013 Tries In this lecture, we will discuss the following exct mtching prolem on strings. Prolem Let S e set of strings, ech of which hs unique integer

More information