Introduction to Compilers and Language Design Copyright (C) 2017 Douglas Thain. All rights reserved.


Anyone is free to download and print the PDF edition of this book for personal use. Commercial distribution, printing, or reproduction without the author's consent is expressly prohibited. You can find the latest version of the PDF edition, and purchase inexpensive hardcover copies, at the book's website.

Draft version: December 12, 2017

Chapter 3: Scanning

3.1 Kinds of Tokens

Scanning is the process of identifying tokens from the raw text source code of a program. At first glance, scanning might seem trivial; after all, identifying words in a natural language is as simple as looking for spaces between letters. However, identifying tokens in source code requires the language designer to clarify many fine details, so that it is clear what is permitted and what is not.

Most languages will have tokens in these categories:

Keywords are words in the language structure itself, like while or class or true. Keywords must be chosen carefully to reflect the natural structure of the language, without interfering with the likely names of variables and other identifiers.

Identifiers are the names of variables, functions, classes, and other code elements chosen by the programmer. Typically, identifiers are arbitrary sequences of letters and possibly numbers. Some languages require identifiers to be marked with a sentinel (like the dollar sign in Perl) to clearly distinguish identifiers from keywords.

Numbers could be formatted as integers, or floating point values, or fractions, or in alternate bases such as binary, octal or hexadecimal. Each format should be clearly distinguished, so that the programmer does not confuse one with the other.

Strings are literal character sequences that must be clearly distinguished from keywords or identifiers. Strings are typically quoted with single or double quotes, but also must have some facility for containing quotations, newlines, and unprintable characters.

Comments and whitespace are used to format a program to make it visually clear, and in some cases (like Python) are significant to the structure of a program.

When designing a new language, or designing a compiler for an existing language, the first job is to state precisely what characters are permitted in each type of token. Initially, this could be done informally by stating, for example, "An identifier consists of a letter followed by any number of letters and numerals.", and then assigning a symbolic constant (TOKEN_IDENTIFIER) for that kind of token. As we will see, an informal approach is often ambiguous, and a more rigorous approach is needed.

3.2 A Hand-Made Scanner

Figure 3.1 shows how one might write a scanner by hand, using simple coding techniques. To keep things simple, we consider just a few tokens: * for multiplication, ! for logical-not, != for not-equal, and sequences of letters and numbers for identifiers.

    token_t scan_token( FILE *fp ) {
        int c = fgetc(fp);   /* int, so that EOF can be represented */
        if(c=='*') {
            return TOKEN_MULTIPLY;
        } else if(c=='!') {
            int d = fgetc(fp);
            if(d=='=') {
                return TOKEN_NOT_EQUAL;
            } else {
                ungetc(d,fp);   /* not part of this token: put it back */
                return TOKEN_NOT;
            }
        } else if(isalpha(c)) {
            int d;   /* declared outside the loop so the condition can see it */
            do {
                d = fgetc(fp);
            } while(isalnum(d));
            ungetc(d,fp);
            return TOKEN_IDENTIFIER;
        } else if ( ... ) {
            ...
        }
    }

    Figure 3.1: A Simple Hand Made Scanner

The basic approach is to read one character at a time from the input stream (fgetc(fp)) and then classify it. Some single-character tokens are easy: if the scanner reads a * character, it immediately returns TOKEN_MULTIPLY, and the same would be true for addition, subtraction, and so forth.

However, some characters are part of multiple tokens. If the scanner encounters !, that could represent a logical-not operation by itself, or it could be the first character in the != sequence representing not-equal-to. Upon reading !, the scanner must immediately read the next character. If the next character is =, then it has matched the sequence != and returns TOKEN_NOT_EQUAL.

But, if the character following ! is something else, then the non-matching character needs to be put back on the input stream using ungetc, because it is not part of the current token. The scanner returns TOKEN_NOT and will consume the put-back character on the next call to scan_token.

In a similar way, once a letter has been identified by isalpha(), the scanner keeps reading letters or numbers until a non-matching character is found. The non-matching character is put back, and the scanner returns TOKEN_IDENTIFIER.

(We will see this pattern come up in every stage of the compiler: an unexpected item doesn't match the current objective, so it must be put back for later. This is known more generally as backtracking.)

As you can see, a hand-made scanner is rather verbose. As more token types are added, the code can become quite convoluted, particularly if tokens share common sequences of characters. It can also be difficult for a developer to be certain that the scanner code corresponds to the desired definition of each token, which can result in unexpected behavior on complex inputs. That said, for a small language with a limited number of tokens, a hand-made scanner can be an appropriate solution.

For a complex language with a large number of tokens, we need a more formalized approach to defining and scanning tokens. A formal approach will allow us to have greater confidence that token definitions do not conflict and that the scanner is implemented correctly. Further, a formalized approach will allow us to make the scanner compact and high performance. Surprisingly, the scanner itself can be the performance bottleneck in a compiler, since every single character must be individually considered.

The formal tools of regular expressions and finite automata allow us to state very precisely what may appear in a given token type. Then, automated tools can process these definitions, find errors or ambiguities, and produce compact, high performance code.

3.3 Regular Expressions

Regular expressions (REs) are a language for expressing patterns.
They were first described in the 1950s by Stephen Kleene [8] as an element of his foundational work in automata theory and computability. Today, REs are found in slightly different forms in programming languages (Perl), standard libraries (PCRE), text editors (vi), command-line tools (grep), and many other places. We can use regular expressions as a compact and formal way of specifying the tokens accepted by the scanner of a compiler, and then automatically translate those expressions into working code. While easily explained, REs can be a bit tricky to use, and require some practice in order to achieve the desired results.

Let us define regular expressions precisely:

A regular expression s is a string which denotes L(s), a set of strings drawn from an alphabet Σ. L(s) is known as the language of s.

L(s) is defined inductively with the following base cases:

    If a ∈ Σ then a is a regular expression and L(a) = { a }.
    ε is a regular expression and L(ε) contains only the empty string.

Then, for any regular expressions s and t:

    1. s|t is an RE such that L(s|t) = L(s) ∪ L(t).
    2. st is an RE such that L(st) contains all strings formed by the concatenation of a string in L(s) followed by a string in L(t).
    3. s* is an RE such that L(s*) = L(s) concatenated zero or more times.

Rule #3 is known as the Kleene closure and has the highest precedence. Rule #2 is known as concatenation. Rule #1 has the lowest precedence and is known as alternation. Parentheses can be added to adjust the order of operations in the usual way.

Here are a few examples using just the basic rules. (Note that a finite RE can indicate an infinite set.)

    Regular Expression s    Language L(s)
    hello                   { hello }
    d(o|i)g                 { dog, dig }
    moo*                    { mo, moo, mooo, ... }
    (moo)*                  { ε, moo, moomoo, moomoomoo, ... }
    a(b|a)*a                { aa, aaa, aba, aaaa, aaba, abaa, ... }

The syntax described above is entirely sufficient to write any regular expression. But it is also handy to have a few helper operations built on top of the basic syntax:

    s?      indicates that s is optional.                      s? can be written as (s|ε)
    s+      indicates that s is repeated one or more times.    s+ can be written as ss*
    [a-z]   indicates any character in that range.             [a-z] can be written as (a|b|...|z)
    [^x]    indicates any character except one.                [^x] can be written as Σ - x
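To experiment with the examples in the table above, one option is the POSIX regular expression library. The following is a minimal sketch, assuming a POSIX system that provides <regex.h>; the helper name full_match is hypothetical and chosen for this illustration. It wraps the pattern in ^( )$ so that the whole string must belong to L(s), rather than merely containing a match.

```c
#include <regex.h>
#include <stdio.h>

/* Returns 1 if the entire text is in the language of the pattern,
   0 if it is not, and -1 if the pattern is too long or fails to compile.
   Patterns use POSIX extended syntax, which supports |, *, +, ? and []. */
int full_match(const char *pattern, const char *text) {
    char anchored[256];
    regex_t re;
    int n, matched;

    /* Anchor the pattern so that partial matches do not count. */
    n = snprintf(anchored, sizeof anchored, "^(%s)$", pattern);
    if (n < 0 || (size_t)n >= sizeof anchored) return -1;

    if (regcomp(&re, anchored, REG_EXTENDED | REG_NOSUB) != 0) return -1;
    matched = (regexec(&re, text, 0, NULL, 0) == 0);
    regfree(&re);
    return matched;
}
```

For instance, full_match("d(o|i)g", "dig") reports a match while full_match("d(o|i)g", "dug") does not, and full_match("(moo)*", "") matches because the Kleene closure admits zero repetitions.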

Regular expressions also obey several algebraic properties, which make it possible to re-arrange them as needed for efficiency or clarity:

    Associativity:   a|(b|c) = (a|b)|c
    Commutativity:   a|b = b|a
    Distribution:    a(b|c) = ab|ac
    Idempotency:     a** = a*

Using regular expressions, we can precisely state what is permitted in a given token. Suppose we have a hypothetical programming language with the following informal definitions and regular expressions. For each token type, we show examples of strings that match (and do not match) the regular expression.

    Informal definition:  An identifier is a sequence of capital letters and numbers, but a number must not come first.
    Regular expression:   [A-Z]+([A-Z]|[0-9])*
    Matches strings:      PRINT MODE5
    Does not match:       hello 4YOU

    Informal definition:  A number is a sequence of digits with an optional decimal point. For clarity, the decimal point must have digits on both left and right sides.
    Regular expression:   [0-9]+(.[0-9]+)?

    Informal definition:  A comment is any text (except a right angle bracket) surrounded by angle brackets.
    Regular expression:   <[^>]*>
    Matches strings:      <tricky part> <<<<look left>
    Does not match:       <this is an <illegal> comment>

3.4 Finite Automata

A finite automaton (FA) is an abstract machine that can be used to represent certain forms of computation. Graphically, an FA consists of a number of states (represented by numbered circles) and a number of edges (represented by labelled arrows) between those states. Each edge is labelled with one or more symbols drawn from an alphabet Σ.

The machine begins in a start state S0. For each input symbol presented to the FA, it moves to the state indicated by the edge with the same label as the input symbol. Some states of the FA are known as accepting states and are indicated by a double circle. If the FA is in an accepting state after all input is consumed, then we say that the FA accepts the input. We say that the FA rejects the input string if it ends in a non-accepting state, or if there is no edge corresponding to the current input symbol.

Every RE can be written as an FA, and vice versa. For a simple regular expression, one can construct an FA by hand. For example, here is an FA for the keyword for:

    [Figure: FA with a chain of states joined by edges labelled f, o, r]

Here is an FA for identifiers of the form [a-z][a-z0-9]+

    [Figure: FA with edges labelled a-z and 0-9]

And here is an FA for numbers of the form ([1-9][0-9]*)

    [Figure: FA with edges labelled 1-9 and 0-9]

3.4.1 Deterministic Finite Automata

Each of these three examples is a deterministic finite automaton (DFA). A DFA is a special case of an FA where every state has no more than one outgoing edge for a given symbol. Put another way, a DFA has no ambiguity: for every combination of state and input symbol, there is exactly one choice of what to do next.

Because of this property, a DFA is very easy to implement in software or hardware. One integer (c) is needed to keep track of the current state.

The transitions between states are represented by a matrix (M[s,i]) which encodes the next state, given the current state and input symbol. (If the transition is not allowed, we mark it with E to indicate an error.) For each symbol, we compute c = M[s,i] until all the input is consumed, or an error state is reached.

3.4.2 Nondeterministic Finite Automata

The alternative to a DFA is a nondeterministic finite automaton (NFA). An NFA is a perfectly valid FA, but it has an ambiguity that makes it somewhat more difficult to work with.

Consider the regular expression [a-z]*ing, which represents all lowercase words ending in the suffix ing. It can be represented with the following automaton:

    [Figure: NFA with a loop on state 0 labelled [a-z], then edges i, n, g leading through states 1, 2, and 3]

Now consider how this automaton would consume the word sing. It could proceed in two different ways. One would be to move to state 0 on s, state 1 on i, state 2 on n, and state 3 on g. But the other, equally valid way would be to stay in state 0 the whole time, matching each letter to the [a-z] transition. Both ways obey the transition rules, but one results in acceptance, while the other results in rejection.

The problem here is that state 0 allows for two different transitions on the symbol i. One is to stay in state 0 matching [a-z] and the other is to move to state 1 matching i.

Moreover, there is no simple rule by which we can pick one path or another. If the input is sing, the right solution is to proceed immediately from state zero to state one on i. But if the input is singing, then we should stay in state zero for the first ing and proceed to state one for the second ing.

An NFA can also have an ε (epsilon) transition, which represents the empty string. This transition can be taken without consuming any input symbols at all. For example, we could represent the regular expression a*(ab|ac) with this NFA:

This particular NFA presents a variety of ambiguous choices. From state zero, it could consume a and stay in state zero. Or, it could take an ε to state one or state four, and then consume an a either way.

There are two common ways to interpret this ambiguity:

The crystal ball interpretation suggests that the NFA somehow "knows" what the best choice is, by some means external to the NFA itself. In the example above, the NFA would choose whether to proceed to state zero, one, or four before consuming the first character, and it would always make the right choice. Needless to say, this isn't possible in a real implementation.

The many-worlds interpretation suggests that the NFA exists in all allowable states simultaneously. When the input is complete, if any of those states are accepting states, then the NFA has accepted the input. This interpretation is more useful for constructing a working NFA, or converting it to a DFA.

Let us use the many-worlds interpretation on the example above. Suppose that the input string is aaac. Initially the NFA is in state zero. Without consuming any input, it could take an epsilon transition to states one or four. So, we can consider its initial state to be all of those states simultaneously. Continuing on, the NFA would traverse these states until accepting the complete string:

    States           Action
    0, 1, 4          consume a
    0, 1, 2, 4, 5    consume a
    0, 1, 2, 4, 5    consume a
    0, 1, 2, 4, 5    consume c
    6                accept

In principle, one can implement an NFA in software or hardware by simply keeping track of all of the possible states. But this is inefficient. In the worst case, we would need to evaluate all states for all characters on each input transition. A better approach is to convert the NFA into an equivalent DFA, as we show below.
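The many-worlds interpretation can be sketched directly in code by keeping the set of active states in the bits of one unsigned integer. The following is a minimal sketch for the four-state [a-z]*ing automaton discussed earlier, assuming the state numbering given in the text (state 0 loops on [a-z], then i, n, g lead to states 1, 2, and 3); the function name nfa_ing_accepts is hypothetical.

```c
/* Simulate the NFA for [a-z]*ing by tracking every state it could be in.
   Bit k of 'states' is set when the NFA could currently be in state k. */
int nfa_ing_accepts(const char *s) {
    unsigned states = 1u << 0;              /* begin in state 0 only */

    for (; *s != '\0'; s++) {
        char ch = *s;
        unsigned next = 0;

        if (states & (1u << 0)) {
            if (ch >= 'a' && ch <= 'z') next |= 1u << 0;  /* stay in state 0 */
            if (ch == 'i')              next |= 1u << 1;  /* or move to state 1 */
        }
        if ((states & (1u << 1)) && ch == 'n') next |= 1u << 2;
        if ((states & (1u << 2)) && ch == 'g') next |= 1u << 3;

        states = next;
        if (states == 0) return 0;          /* no world survives: reject */
    }
    return (states & (1u << 3)) != 0;       /* accept if state 3 is among them */
}
```

Because both choices on the symbol i are followed at once, the simulation accepts sing and singing without needing to guess which i begins the suffix. The cost is that every active state is examined for every character, which is the inefficiency that motivates converting to a DFA.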

3.5 Conversion Algorithms

Regular expressions and finite automata are all equally powerful. For every RE, there is an FA, and vice versa. However, a DFA is by far the most straightforward of the three to implement in software. In this section, we will show how to convert an RE into an NFA, then an NFA into a DFA, and then to optimize the size of the DFA.

    [Figure: Regular Expression -> (Thompson's Construction) -> Nondeterministic Finite Automaton -> (Subset Construction) -> Deterministic Finite Automaton -> Transition Matrix -> Code]

    Figure 3.2: Relationship Between REs, NFAs, and DFAs

3.5.1 Converting REs to NFAs

To convert a regular expression to a nondeterministic finite automaton, we can follow an algorithm given first by McNaughton and Yamada [9], and then by Ken Thompson [10].

We follow the same inductive definition of regular expressions as given earlier. First, we define automata corresponding to the base cases of REs:

    The NFA for any character a is:    [Figure: two states joined by an edge labelled a]
    The NFA for an ε transition is:    [Figure: two states joined by an edge labelled ε]

Now, suppose that we have already constructed NFAs for the regular expressions A and B, indicated below by rectangles. Both A and B have a single start state (on the left) and accepting state (on the right). If we write the concatenation of A and B as AB, then the corresponding NFA is simply A and B connected by an ε transition. The start state of A becomes the start state of the combination, and the accepting state of B becomes the accepting state of the combination:

    The NFA for the concatenation AB is:    [Figure: box A joined to box B by an ε transition]

In a similar fashion, the alternation of A and B, written as A|B, can be expressed as two automata joined by common starting and accepting nodes, all connected by ε transitions:

    The NFA for the alternation A|B is:    [Figure: boxes A and B in parallel between a new start node and a new accepting node, joined by ε transitions]

Finally, the Kleene closure A* is constructed by taking the automaton for A, adding starting and accepting nodes, then adding ε transitions to allow zero or more repetitions:

    The NFA for the Kleene closure A* is:    [Figure: box A between a new start node and a new accepting node, with ε transitions for skipping and repeating A]

Example. Let's consider the process for an example regular expression a(cat|cow)*. First, we start with the innermost expression cat and assemble it into three transitions resulting in an accepting state. Then, do the same thing for cow, yielding these two FAs:

    [Figure: one FA with edges c, a, t and one FA with edges c, o, w]

The alternation of the two expressions cat|cow is accomplished by adding a new starting and accepting node, with epsilon transitions. (The boxes are not part of the graph, but simply highlight the previous graph components carried forward.)

    [Figure: NFA for cat|cow]

Then, the Kleene closure (cat|cow)* is accomplished by adding another starting and accepting state around the previous FA, with epsilon transitions between:

    [Figure: NFA for (cat|cow)*]

Finally, the concatenation of a(cat|cow)* is achieved by adding a single state at the beginning for a:

    [Figure: NFA for a(cat|cow)*]

You can easily see that the NFA resulting from the construction algorithm, while correct, is quite complex and contains a large number of epsilon transitions. An NFA representing the tokens for a complete language could end up having thousands of states, which would be very impractical to implement. Instead, we can convert this NFA into an equivalent DFA.
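The construction rules above map naturally onto a handful of small functions, each returning a fragment with one start state and one accepting state. The following is a minimal sketch, not the book's implementation: all names (struct frag, nfa_char, nfa_concat, nfa_alt, nfa_star, build_example) are hypothetical, and every character gets its own two-state fragment, so this version produces more states than the hand-drawn automata.

```c
#define EPSILON '\0'      /* edge label used for epsilon transitions */
#define MAX_EDGES 256

struct edge { int from, to; char label; };
struct frag { int start, accept; };   /* NFA fragment: one start, one accepting state */

struct edge edges[MAX_EDGES];
int nedges = 0;
int nstates = 0;

static int new_state(void) { return nstates++; }

static void add_edge(int from, int to, char label) {
    if (nedges < MAX_EDGES) {
        edges[nedges].from = from;
        edges[nedges].to = to;
        edges[nedges].label = label;
        nedges++;
    }
}

/* Base case: a single character c. */
struct frag nfa_char(char c) {
    struct frag f;
    f.start = new_state();
    f.accept = new_state();
    add_edge(f.start, f.accept, c);
    return f;
}

/* Concatenation AB: connect A's accepting state to B's start state. */
struct frag nfa_concat(struct frag a, struct frag b) {
    struct frag f;
    add_edge(a.accept, b.start, EPSILON);
    f.start = a.start;
    f.accept = b.accept;
    return f;
}

/* Alternation A|B: common start and accepting nodes around both fragments. */
struct frag nfa_alt(struct frag a, struct frag b) {
    struct frag f;
    f.start = new_state();
    f.accept = new_state();
    add_edge(f.start, a.start, EPSILON);
    add_edge(f.start, b.start, EPSILON);
    add_edge(a.accept, f.accept, EPSILON);
    add_edge(b.accept, f.accept, EPSILON);
    return f;
}

/* Kleene closure A*: allow skipping A entirely or repeating it. */
struct frag nfa_star(struct frag a) {
    struct frag f;
    f.start = new_state();
    f.accept = new_state();
    add_edge(f.start, a.start, EPSILON);
    add_edge(a.accept, f.accept, EPSILON);
    add_edge(f.start, f.accept, EPSILON);   /* zero repetitions */
    add_edge(a.accept, a.start, EPSILON);   /* one more repetition */
    return f;
}

/* Build a(cat|cow)* from the innermost expressions outward. */
struct frag build_example(void) {
    struct frag cat, cow, body;
    cat = nfa_char('c');
    cat = nfa_concat(cat, nfa_char('a'));
    cat = nfa_concat(cat, nfa_char('t'));
    cow = nfa_char('c');
    cow = nfa_concat(cow, nfa_char('o'));
    cow = nfa_concat(cow, nfa_char('w'));
    body = nfa_star(nfa_alt(cat, cow));
    return nfa_concat(nfa_char('a'), body);
}
```

Calling build_example() once fills the edges array with the complete automaton. Even for this short expression the result is dominated by epsilon edges, which illustrates why the NFA is usually treated as an intermediate form rather than executed directly.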

3.5.2 Converting NFAs to DFAs

We can convert any NFA into an equivalent DFA using the technique of subset construction. The basic idea is to create a DFA such that each state in the DFA corresponds to multiple states in the NFA, according to the "many-worlds" interpretation.

Suppose that we begin with an NFA consisting of states N and start state N0. We wish to construct an equivalent DFA consisting of states D and start state D0. Each D state will correspond to multiple N states. First, we define a helper function known as the epsilon closure:

    Epsilon closure. ε-closure(n) is the set of NFA states reachable from NFA state n by zero or more ε transitions.

Now we define the subset construction algorithm. First, we create a start state D0 corresponding to ε-closure(N0). Then, for each outgoing character c from the states in D0, we create a new state containing the epsilon closure of the states reachable by c. More precisely:

    Subset Construction Algorithm.
    Given an NFA with states N and start state N0, create an equivalent DFA with states D and start state D0.

    Let D0 = ε-closure(N0).
    Add D0 to a list.
    While items remain on the list:
        Let d be the next DFA state removed from the list.
        For each character c in Σ:
            Let T contain all NFA states Nk such that: Nj ∈ d and Nj --c--> Nk
            Create new DFA state Di = ε-closure(T)
            If Di is not already in the list, add it to the end.

    Figure 3.3: Subset Construction Algorithm
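The algorithm in Figure 3.3 can be sketched compactly when each set of NFA states is stored as a bitmask. The code below is one possible sketch, applied to the earlier a*(ab|ac) NFA. The transition tables are an assumption reconstructed from the state trace given in the text (ε edges from state 0 to states 1 and 4, accepting states 3 and 6), and all identifiers are hypothetical.

```c
#define NSTATES 7          /* NFA states 0..6 */
#define NSYMS   3          /* alphabet: a, b, c */
#define MAXD    128        /* at most 2^7 distinct subsets */

typedef unsigned set_t;    /* bit n set <=> NFA state n is in the set */

/* Assumed NFA for a*(ab|ac): epsilon edges and labelled edges per state. */
static const set_t eps[NSTATES] = { (1u << 1) | (1u << 4), 0, 0, 0, 0, 0, 0 };
static const set_t delta[NSTATES][NSYMS] = {
    /* state 0 */ { 1u << 0, 0,       0       },
    /* state 1 */ { 1u << 2, 0,       0       },
    /* state 2 */ { 0,       1u << 3, 0       },
    /* state 3 */ { 0,       0,       0       },
    /* state 4 */ { 1u << 5, 0,       0       },
    /* state 5 */ { 0,       0,       1u << 6 },
    /* state 6 */ { 0,       0,       0       },
};
static const set_t nfa_accepting = (1u << 3) | (1u << 6);

/* Epsilon closure: add epsilon-reachable states until nothing changes. */
static set_t eps_closure(set_t s) {
    set_t prev;
    do {
        prev = s;
        for (int n = 0; n < NSTATES; n++)
            if (s & (1u << n)) s |= eps[n];
    } while (s != prev);
    return s;
}

set_t dstate[MAXD];         /* dstate[i] = NFA states making up DFA state i */
int   dtrans[MAXD][NSYMS];  /* DFA transition matrix; -1 marks an error */
int   ndstates;

/* Subset construction; dstate[] doubles as the work list. */
void subset_construction(void) {
    ndstates = 0;
    dstate[ndstates++] = eps_closure(1u << 0);
    for (int d = 0; d < ndstates; d++) {            /* next item on the list */
        for (int c = 0; c < NSYMS; c++) {
            set_t t = 0;
            int i;
            for (int n = 0; n < NSTATES; n++)
                if (dstate[d] & (1u << n)) t |= delta[n][c];
            t = eps_closure(t);
            if (t == 0) { dtrans[d][c] = -1; continue; }
            for (i = 0; i < ndstates; i++)
                if (dstate[i] == t) break;           /* already known? */
            if (i == ndstates) dstate[ndstates++] = t;
            dtrans[d][c] = i;
        }
    }
}

/* Run the resulting DFA over a string made of the letters a, b, c. */
int dfa_accepts(const char *s) {
    int d = 0;
    for (; *s != '\0'; s++) {
        if (*s < 'a' || *s > 'c') return 0;
        d = dtrans[d][*s - 'a'];
        if (d < 0) return 0;
    }
    return (dstate[d] & nfa_accepting) != 0;
}
```

After subset_construction() has run, dfa_accepts walks the transition matrix with a single integer as the current state, which is the matrix-driven DFA execution described in the section on deterministic automata. A DFA state is treated as accepting when its set contains any accepting NFA state.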

    [Figure: the NFA for a(cat|cow)* with states N0 through N13, and the resulting DFA with these states:
        D0: N0
        D1: N1, N2, N3, N4, N8, N13
        D2: N5, N9
        D3: N6
        D4: N7, N12, N13, N2, N3, N4, N8
        D5: N10
        D6: N11, N12, N13, N2, N3, N4, N8]

    Figure 3.4: Converting an NFA to a DFA via Subset Construction

Example. Let's work out the algorithm on the NFA in Figure 3.4. This is the same NFA corresponding to the RE a(cat|cow)* with each of the states numbered for clarity.

1. Compute D0 which is ε-closure(N0). N0 has no ε transitions, so D0 = {N0}. Add D0 to the work list.

2. Remove D0 from the work list. The character a is an outgoing transition from N0 to N1. ε-closure(N1) = {N1, N2, N3, N4, N8, N13} so add all of those to new state D1 and add D1 to the work list.

3. Remove D1 from the work list. We can see that N4 --c--> N5 and N8 --c--> N9, so we create a new state D2 = {N5, N9} and add it to the work list.

4. Remove D2 from the work list. Both a and o are possible transitions because of N5 --o--> N6 and N9 --a--> N10. So, create a new state D3 for the o transition to N6 and new state D5 for the a transition to N10. Add both D3 and D5 to the work list.

5. Remove D3 from the work list. The only possible transition is N6 --w--> N7 so create a new state D4 containing the ε-closure(N7) and add it to the work list.

6. Remove D5 from the work list. The only possible transition is N10 --t--> N11 so create a new state D6 containing ε-closure(N11) and add it to the work list.

7. Remove D4 from the work list, and observe that the only outgoing transition c leads to states N5 and N9 which already exist as state D2, so simply add a transition D4 --c--> D2.

8. Remove D6 from the work list and, in a similar way, add D6 --c--> D2.

9. The work list is empty, so we are done.

3.5.3 Minimizing DFAs

The subset construction algorithm will definitely generate a valid DFA, but the DFA may possibly be very large (especially if we began with a complex NFA generated from an RE). A large DFA will have a large transition matrix that will consume a lot of memory. If it doesn't fit in L1 cache, the scanner could run very slowly. To address this problem, we can apply Hopcroft's algorithm to shrink a DFA into a smaller (but equivalent) DFA.

The general approach of the algorithm is to optimistically group together all possibly-equivalent states S into super-states T. Initially, we place all non-accepting S states into super-state T0 and accepting states into super-state T1. Then, we examine the outgoing edges in each state s ∈ Ti. If a given character c has edges that begin in Ti and end in different super-states, then we consider the super-state to be inconsistent with respect to c. (Consider an impermissible transition as if it were a transition to TE, a super-state for errors.) The super-state must then be split into multiple states that are consistent with respect to c. Repeat this process for all super-states and all characters c ∈ Σ until no more splits are required.

    DFA Minimization Algorithm.
    Given a DFA with states S, create an equivalent DFA with an equal or fewer number of states T.

    First partition S into T such that:
        T0 = non-accepting states of S.
        T1 = accepting states of S.
    Repeat:
        For each Ti ∈ T:
            For each c ∈ Σ:
                if Ti --c--> { more than one T state },
                then split Ti into multiple T states such that c has the same action in each.
    Until no more states are split.

    Figure 3.5: Hopcroft's DFA Minimization Algorithm
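The splitting procedure in Figure 3.5 can be sketched as a simple partition refinement. The code below is a sketch under stated assumptions, not an optimized implementation: the five-state DFA over the inputs a and b is an illustrative example invented for this sketch (states are numbered from 0), every transition is defined so no error super-state is needed, and the names M, accepting, group, and minimize are hypothetical. Rather than splitting out one state at a time, each pass regroups all states at once by comparing which super-states their edges lead to.

```c
#define NSTATES 5
#define NSYMS   2    /* column 0 = input a, column 1 = input b */

/* Illustrative DFA (an assumption): M[s][c] is the next state. */
const int M[NSTATES][NSYMS] = {
    {1, 2},   /* state 0 */
    {1, 3},   /* state 1 */
    {1, 2},   /* state 2 */
    {1, 4},   /* state 3 */
    {1, 2},   /* state 4 */
};
const int accepting[NSTATES] = {0, 0, 0, 0, 1};

int group[NSTATES];  /* group[s] = index of the super-state containing s */

/* Refine the partition until it stops changing.
   Returns the number of super-states in the minimized DFA. */
int minimize(void) {
    int ngroups = 0;

    /* Initial partition: non-accepting states versus accepting states. */
    for (int s = 0; s < NSTATES; s++) group[s] = accepting[s];

    for (;;) {
        int next[NSTATES];
        int count = 0;

        for (int s = 0; s < NSTATES; s++) {
            int match = -1;
            /* Look for an earlier state in the same super-state whose
               edges lead to the same super-states on every input. */
            for (int t = 0; t < s && match < 0; t++) {
                int same = 1;
                if (group[t] != group[s]) continue;
                for (int c = 0; c < NSYMS; c++)
                    if (group[M[t][c]] != group[M[s][c]]) same = 0;
                if (same) match = t;
            }
            next[s] = (match >= 0) ? next[match] : count++;
        }

        for (int s = 0; s < NSTATES; s++) group[s] = next[s];
        if (count == ngroups) break;   /* no super-state was split */
        ngroups = count;
    }
    return ngroups;
}
```

On this DFA the refinement settles on four super-states, with states 0 and 2 sharing one because they behave identically on both inputs. Since each pass can only split super-states, an unchanged count means the partition itself is unchanged.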

Example. Suppose we have the following non-optimized DFA and wish to reduce it to a smaller DFA:

    [Figure: DFA with states 1 through 5 over the inputs a and b, where state 5 is accepting]

We begin by grouping all of the non-accepting states 1, 2, 3, 4 into one super-state and the accepting state 5 into another super-state, like this:

    [Figure: super-states (1,2,3,4) and (5)]

Now, we ask whether this graph is consistent with respect to all possible inputs, by referring back to the original DFA. For example, we observe that, if we are in super-state (1,2,3,4) then an input of a always goes to state 2, which keeps us within the super-state. So, this DFA is consistent with respect to a. However, from super-state (1,2,3,4) an input of b can either stay within the super-state or go to super-state (5). So, the DFA is inconsistent with respect to b.

To fix this, we try splitting out one of the inconsistent states (4) into a new super-state, taking the transitions with it:

    [Figure: super-states (1,2,3), (4), and (5)]

Again, we examine each super-state for consistency with respect to each input character. Again, we observe that super-state (1,2,3) is consistent with respect to a, but not consistent with respect to b because it can either lead to state 3 or state 4. We attempt to fix this by splitting out state 2 into its own super-state, yielding this DFA.

    [Figure: super-states (1,3), (2), (4), and (5)]

Again, we examine each super-state and observe that each possible input is consistent with respect to the super-state, and therefore we have the minimal DFA.

3.6 Limits of Finite Automata

Regular expressions and finite automata are powerful and effective at recognizing simple patterns in individual words or tokens, but they are not sufficient to analyze all of the structures in a problem. For example, could you use a finite automaton to match an arbitrary number of nested parentheses?

It's not hard to write out an FA that could match, say, up to three pairs of nested parentheses, like this:

    [Figure: FA with states 0 through 3, where each ( moves one state forward and each ) moves one state back]

But the key word is arbitrary! To match any number of parentheses would require an infinite automaton, which is obviously impractical. Even if we were to apply some practical upper limit (say, 100 pairs) the automaton would still be impractically large when combined with all the other elements of a language that must be supported.

For example, a language like Python permits the nesting of parentheses () for precedence, curly brackets {} to represent object types, and square brackets [] to represent lists. An automaton to match up to 100 nested pairs of each in arbitrary order would have 1,000,000 states!

So, we limit ourselves to using regular expressions and finite automata for the narrow purpose of identifying the words and symbols within a problem. To understand the higher level structure of a program, we will instead use parsing techniques introduced in Chapter 4.

3.7 Using a Scanner Generator

Because a regular expression precisely describes all the allowable forms of a token, we can use a program to automatically transform a set of regular expressions into code for a scanner. Such a program is known as a scanner generator. The program Lex, developed at AT&T, was one of the earliest examples of a scanner generator. Flex is a free, open-source replacement for Lex and is widely used in Unix-like operating systems today to generate scanners implemented in C or C++.

To use Flex, we write a specification of the scanner that is a mixture of regular expressions, fragments of C code, and some specialized directives. The Flex program itself consumes the specification and produces regular C code that can then be compiled in the normal way.

    %{
    (C Preamble Code)
    %}
    (Character Classes)
    %%
    (Regular Expression Rules)
    %%
    (Additional Code)

    Figure 3.6: Structure of a Flex File

Figure 3.6 gives the overall structure of a Flex file. The first section consists of arbitrary C code that will be placed at the beginning of scanner.c, like include files, type definitions, and similar things. Typically, this is used to include a file that contains the symbolic constants for tokens.

The second section states character classes, which are a symbolic shorthand for commonly used regular expressions. For example, you might declare DIGIT [0-9]. This class can be referred to later as {DIGIT}.

The third section is the most important part. It states a regular expression for each type of token that you wish to match, followed by a fragment of C code that will be executed whenever the expression is matched. In the simplest case, this code returns the type of the token, but it can also be used to extract token values, display errors, or anything else appropriate.

The fourth section is arbitrary C code that will go at the end of the scanner, typically for additional helper functions. A peculiar requirement of Flex is that we must define a function yywrap() which returns one to indicate that the input is complete at the end of the file. If we wanted to continue scanning in another file, then yywrap() would open the next file and return zero.

The regular expression language accepted by Flex is very similar to that of formal regular expressions discussed above.
The main difference is that characters that have special meaning within a regular expression (like parentheses, square brackets, and asterisks) must be escaped with a backslash or surrounded with double quotes. Also, a period (.) can be used to match any single character except a newline, which is helpful for catching error conditions.

    Contents of File: scanner.flex

    %{
    #include "token.h"
    %}
    DIGIT  [0-9]
    LETTER [a-zA-Z]
    %%
    (" "|\t|\n)  /* skip whitespace */
    \+           { return TOKEN_ADD; }
    while        { return TOKEN_WHILE; }
    {LETTER}+    { return TOKEN_IDENT; }
    {DIGIT}+     { return TOKEN_NUMBER; }
    .            { return TOKEN_ERROR; }
    %%
    int yywrap() { return 1; }

    Figure 3.7: Example Flex Specification

    Contents of File: main.c

    #include "token.h"
    #include <stdio.h>

    extern FILE *yyin;
    extern int yylex();
    extern char *yytext;

    int main() {
        yyin = fopen("program.c","r");
        if(!yyin) {
            printf("could not open program.c!\n");
            return 1;
        }

        while(1) {
            token_t t = yylex();
            if(t==TOKEN_EOF) break;
            printf("token: %d text: %s\n",t,yytext);
        }
    }

    Figure 3.8: Example Main Program

    Contents of File: token.h

    typedef enum {
        TOKEN_EOF=0,
        TOKEN_WHILE,
        TOKEN_ADD,
        TOKEN_IDENT,
        TOKEN_NUMBER,
        TOKEN_ERROR
    } token_t;

    Figure 3.9: Example Token Enumeration

Figure 3.7 shows a simple but complete example to get you started. This specification describes just a few tokens: a single character addition (which must be escaped with a backslash), the while keyword, an identifier consisting of one or more letters, and a number consisting of one or more digits. As is typical in a scanner, any other type of character is an error, and returns an explicit token type for that purpose.

Flex generates the scanner code, but not a complete program, so you must write a main function to go with it. Figure 3.8 shows a simple driver program that uses this scanner. First, the main program must declare as extern the symbols it expects to use in the generated scanner code: yyin is the file from which text will be read, yylex is the function that implements the scanner, and the array yytext contains the actual text of each token discovered.

Finally, we must have a consistent definition of the token types across the parts of the program, so into token.h we put an enumeration describing the new type token_t. This file is included in both scanner.flex and main.c.

Figure 3.10 shows how all the pieces come together. scanner.flex is converted into scanner.c by invoking flex scanner.flex -oscanner.c. Then, both main.c and scanner.c are compiled to produce object files, which are linked together to produce the complete program.

    [Figure: token.h and scanner.flex are processed by Flex to produce scanner.c; the compiler turns scanner.c into scanner.o and main.c into main.o; the linker combines the object files into scanner.exe]

    Figure 3.10: Build Procedure for a Flex Program

3.8 Practical Considerations

Handling keywords. In many languages, keywords (such as while or if) would otherwise match the definitions of identifiers, unless specially handled. There are several solutions to this problem. One is to enter a regular expression for every single keyword into the Flex specification. (These must precede the definition of identifiers, since Flex will accept the first expression that matches.) Another is to maintain a single regular expression that matches all identifiers and keywords. The action associated with that rule can compare the token text with a separate list of keywords and return the appropriate type. Yet another approach is to treat all keywords and identifiers as a single token type, and allow the problem to be sorted out by the parser. (This is necessary in languages like PL/1, where identifiers can have the same names as keywords, and are distinguished by context.)

Tracking source locations. In later stages of the compiler, it is useful for the parser or typechecker to know exactly what line and column number a token was located at, usually to print out a helpful error message. ("Undefined symbol spider at line 153.") This is easily done by having the scanner match newline characters, and increase the line count (but not return a token) each time one is found.

Cleaning tokens. Strings, characters, and similar token types need to be cleaned up after they are matched. For example, "hello\n" needs to have its quotes removed and the backslash-n sequence converted to a literal newline character. Internally, the compiler only cares about the actual contents of the string. Typically, this is accomplished by writing a function string_clean in the postamble of the Flex specification. The function is invoked by the matching rule before returning the desired token type.

Constraining tokens. Although regular expressions can match tokens of arbitrary length, it does not follow that a compiler must be prepared to accept them. There would be little point to accepting a 1000-letter identifier, or an integer larger than the machine's word size. The typical approach is to set the maximum token length (YYLMAX in flex) to a very large value, then examine the token to see if it exceeds a logical limit in the action that matches the token. This allows you to emit an error message that describes the offending token as needed.

Error Handling. The easiest approach to handling errors or invalid input is simply to print a message and exit the program.
However, this is unhelpful to users of your compiler: if there are multiple errors, it's (usually) better to see them all at once. A good approach is to match the minimum amount of invalid text (using the dot rule) and return an explicit token type indicating an error. The code that invokes the scanner can then emit a suitable message, and then ask for the next token.

3.9 Exercises

1. Write regular expressions for the following entities. You may find it necessary to justify what is and is not allowed within each expression:

    (a) English days of the week: Monday, Tuesday, ...
    (b) All integers where every three digits are separated by commas for clarity, such as: 78  1,[...]  [...],098,000
    (c) Internet email addresses like "John Doe" <john.doe@gmail.com>
    (d) HTTP Uniform Resource Locators (URLs) as described by the corresponding RFC.

2. Write a regular expression for a string containing any number of X and single pairs of < > and { } which may be nested but not interleaved. For example these strings are allowed:

    XXX<XX{X}XXX>X
    X{X}X<X>X{X}X<X>X

But these are not allowed:

    XXX<X<XX>>XX
    XX<XX{XX>XX}XX

3. Test the regular expressions you wrote in the previous two problems by translating them into your favorite programming language that has native support for regular expressions. (Perl and Python are two good choices.) Evaluate the correctness of your program by writing test cases that should (and should not) match.

4. Convert these REs into NFAs using Thompson's construction:

    (a) for | [a-z]+ | [xb]?[0-9]+
    (b) a ( bc*d | ed ) d*
    (c) ( a*b | b*a | ba )*

5. Convert the NFAs in the previous problem into DFAs using the subset construction method.

6. Minimize the DFAs in the previous problem by using Hopcroft's algorithm.

7. Write a hand-made scanner for JavaScript Object Notation (JSON), which is described in the JSON specification. The program should read JSON on the input, and then print out the sequence of tokens observed: LBRACKET, STRING, COLON, etc. Find some large JSON documents online and test your scanner to see if it works.

8. Using Flex, write a scanner for the Java programming language. As above, read in Java source on the input and output token types. Test it out by applying it to a large open source project written in Java.

CS 241 Week 4 Tutorial Solutions

CS 241 Week 4 Tutorial Solutions CS 4 Week 4 Tutoril Solutions Writing n Assemler, Prt & Regulr Lnguges Prt Winter 8 Assemling instrutions utomtilly. slt $d, $s, $t. Solution: $d, $s, nd $t ll fit in -it signed integers sine they re 5-it

More information

CS 340, Fall 2016 Sep 29th Exam 1 Note: in all questions, the special symbol ɛ (epsilon) is used to indicate the empty string.

CS 340, Fall 2016 Sep 29th Exam 1 Note: in all questions, the special symbol ɛ (epsilon) is used to indicate the empty string. CS 340, Fll 2016 Sep 29th Exm 1 Nme: Note: in ll questions, the speil symol ɛ (epsilon) is used to indite the empty string. Question 1. [10 points] Speify regulr expression tht genertes the lnguge over

More information

In the last lecture, we discussed how valid tokens may be specified by regular expressions.

In the last lecture, we discussed how valid tokens may be specified by regular expressions. LECTURE 5 Scnning SYNTAX ANALYSIS We know from our previous lectures tht the process of verifying the syntx of the progrm is performed in two stges: Scnning: Identifying nd verifying tokens in progrm.

More information

Pattern Matching. Pattern Matching. Pattern Matching. Review of Regular Expressions

Pattern Matching. Pattern Matching. Pattern Matching. Review of Regular Expressions Pttern Mthing Pttern Mthing Some of these leture slides hve een dpted from: lgorithms in C, Roert Sedgewik. Gol. Generlize string serhing to inompletely speified ptterns. pplitions. Test if string or its

More information

CS321 Languages and Compiler Design I. Winter 2012 Lecture 5

CS321 Languages and Compiler Design I. Winter 2012 Lecture 5 CS321 Lnguges nd Compiler Design I Winter 2012 Lecture 5 1 FINITE AUTOMATA A non-deterministic finite utomton (NFA) consists of: An input lphet Σ, e.g. Σ =,. A set of sttes S, e.g. S = {1, 3, 5, 7, 11,

More information

Midterm Exam CSC October 2001

Midterm Exam CSC October 2001 Midterm Exm CSC 173 23 Otoer 2001 Diretions This exm hs 8 questions, severl of whih hve suprts. Eh question indites its point vlue. The totl is 100 points. Questions 5() nd 6() re optionl; they re not

More information

Outline. Motivation Background ARCH. Experiment Additional usages for Input-Depth. Regular Expression Matching DPI over Compressed HTTP

Outline. Motivation Background ARCH. Experiment Additional usages for Input-Depth. Regular Expression Matching DPI over Compressed HTTP ARCH This work ws supported y: The Europen Reserh Counil, The Isreli Centers of Reserh Exellene, The Neptune Consortium, nd Ntionl Siene Foundtion wrd CNS-119748 Outline Motivtion Bkground Regulr Expression

More information

Lesson 4.4. Euler Circuits and Paths. Explore This

Lesson 4.4. Euler Circuits and Paths. Explore This Lesson 4.4 Euler Ciruits nd Pths Now tht you re fmilir with some of the onepts of grphs nd the wy grphs onvey onnetions nd reltionships, it s time to egin exploring how they n e used to model mny different

More information

6.045J/18.400J: Automata, Computability and Complexity. Quiz 2: Solutions. Please write your name in the upper corner of each page.

6.045J/18.400J: Automata, Computability and Complexity. Quiz 2: Solutions. Please write your name in the upper corner of each page. 6045J/18400J: Automt, Computbility nd Complexity Mrh 30, 2005 Quiz 2: Solutions Prof Nny Lynh Vinod Vikuntnthn Plese write your nme in the upper orner of eh pge Problem Sore 1 2 3 4 5 6 Totl Q2-1 Problem

More information

Introduction to Algebra

Introduction to Algebra INTRODUCTORY ALGEBRA Mini-Leture 1.1 Introdution to Alger Evlute lgeri expressions y sustitution. Trnslte phrses to lgeri expressions. 1. Evlute the expressions when =, =, nd = 6. ) d) 5 10. Trnslte eh

More information

Paradigm 5. Data Structure. Suffix trees. What is a suffix tree? Suffix tree. Simple applications. Simple applications. Algorithms

Paradigm 5. Data Structure. Suffix trees. What is a suffix tree? Suffix tree. Simple applications. Simple applications. Algorithms Prdigm. Dt Struture Known exmples: link tble, hep, Our leture: suffix tree Will involve mortize method tht will be stressed shortly in this ourse Suffix trees Wht is suffix tree? Simple pplitions History

More information

Fig.25: the Role of LEX

Fig.25: the Role of LEX The Lnguge for Specifying Lexicl Anlyzer We shll now study how to uild lexicl nlyzer from specifiction of tokens in the form of list of regulr expressions The discussion centers round the design of n existing

More information

Dr. D.M. Akbar Hussain

Dr. D.M. Akbar Hussain Dr. D.M. Akr Hussin Lexicl Anlysis. Bsic Ide: Red the source code nd generte tokens, it is similr wht humns will do to red in; just tking on the input nd reking it down in pieces. Ech token is sequence

More information

CMPUT101 Introduction to Computing - Summer 2002

CMPUT101 Introduction to Computing - Summer 2002 CMPUT Introdution to Computing - Summer 22 %XLOGLQJ&RPSXWHU&LUFXLWV Chpter 4.4 3XUSRVH We hve looked t so fr how to uild logi gtes from trnsistors. Next we will look t how to uild iruits from logi gtes,

More information

COMP 423 lecture 11 Jan. 28, 2008

COMP 423 lecture 11 Jan. 28, 2008 COMP 423 lecture 11 Jn. 28, 2008 Up to now, we hve looked t how some symols in n lphet occur more frequently thn others nd how we cn sve its y using code such tht the codewords for more frequently occuring

More information

CS412/413. Introduction to Compilers Tim Teitelbaum. Lecture 4: Lexical Analyzers 28 Jan 08

CS412/413. Introduction to Compilers Tim Teitelbaum. Lecture 4: Lexical Analyzers 28 Jan 08 CS412/413 Introduction to Compilers Tim Teitelum Lecture 4: Lexicl Anlyzers 28 Jn 08 Outline DFA stte minimiztion Lexicl nlyzers Automting lexicl nlysis Jlex lexicl nlyzer genertor CS 412/413 Spring 2008

More information

Lecture 8: Graph-theoretic problems (again)

Lecture 8: Graph-theoretic problems (again) COMP36111: Advned Algorithms I Leture 8: Grph-theoreti prolems (gin) In Prtt-Hrtmnn Room KB2.38: emil: iprtt@s.mn..uk 2017 18 Reding for this leture: Sipser: Chpter 7. A grph is pir G = (V, E), where V

More information

Lexical Analysis: Constructing a Scanner from Regular Expressions

Lexical Analysis: Constructing a Scanner from Regular Expressions Lexicl Anlysis: Constructing Scnner from Regulr Expressions Gol Show how to construct FA to recognize ny RE This Lecture Convert RE to n nondeterministic finite utomton (NFA) Use Thompson s construction

More information

Greedy Algorithm. Algorithm Fall Semester

Greedy Algorithm. Algorithm Fall Semester Greey Algorithm Algorithm 0 Fll Semester Optimiztion prolems An optimiztion prolem is one in whih you wnt to fin, not just solution, ut the est solution A greey lgorithm sometimes works well for optimiztion

More information

ΕΠΛ323 - Θεωρία και Πρακτική Μεταγλωττιστών

ΕΠΛ323 - Θεωρία και Πρακτική Μεταγλωττιστών ΕΠΛ323 - Θωρία και Πρακτική Μταγλωττιστών Lecture 3 Lexicl Anlysis Elis Athnsopoulos elisthn@cs.ucy.c.cy Recognition of Tokens if expressions nd reltionl opertors if è if then è then else è else relop

More information

Definition of Regular Expression

Definition of Regular Expression Definition of Regulr Expression After the definition of the string nd lnguges, we re redy to descrie regulr expressions, the nottion we shll use to define the clss of lnguges known s regulr sets. Recll

More information

Reducing a DFA to a Minimal DFA

Reducing a DFA to a Minimal DFA Lexicl Anlysis - Prt 4 Reducing DFA to Miniml DFA Input: DFA IN Assume DFA IN never gets stuck (dd ded stte if necessry) Output: DFA MIN An equivlent DFA with the minimum numer of sttes. Hrry H. Porter,

More information

CSE 401 Compilers. Agenda. Lecture 4: Implemen:ng Scanners Michael Ringenburg Winter 2013

CSE 401 Compilers. Agenda. Lecture 4: Implemen:ng Scanners Michael Ringenburg Winter 2013 CSE 401 Compilers Leture 4: Implemen:ng Snners Mihel Ringenurg Winter 013 Winter 013 UW CSE 401 (Mihel Ringenurg) Agend Lst week we overed regulr expressions nd finite utomt. Tody, we ll finish our finl

More information

Topic 2: Lexing and Flexing

Topic 2: Lexing and Flexing Topic 2: Lexing nd Flexing COS 320 Compiling Techniques Princeton University Spring 2016 Lennrt Beringer 1 2 The Compiler Lexicl Anlysis Gol: rek strem of ASCII chrcters (source/input) into sequence of

More information

COMP108 Algorithmic Foundations

COMP108 Algorithmic Foundations Grph Theory Prudene Wong http://www.s.liv..uk/~pwong/tehing/omp108/201617 How to Mesure 4L? 3L 5L 3L ontiner & 5L ontiner (without mrk) infinite supply of wter You n pour wter from one ontiner to nother

More information

Lexical Analysis. Amitabha Sanyal. (www.cse.iitb.ac.in/ as) Department of Computer Science and Engineering, Indian Institute of Technology, Bombay

Lexical Analysis. Amitabha Sanyal. (www.cse.iitb.ac.in/ as) Department of Computer Science and Engineering, Indian Institute of Technology, Bombay Lexicl Anlysis Amith Snyl (www.cse.iit.c.in/ s) Deprtment of Computer Science nd Engineering, Indin Institute of Technology, Bomy Septemer 27 College of Engineering, Pune Lexicl Anlysis: 2/6 Recp The input

More information

McAfee Web Gateway

McAfee Web Gateway Relese Notes Revision C MAfee We Gtewy 7.6.2.11 Contents Aout this relese Enhnement Resolved issues Instlltion instrutions Known issues Additionl informtion Find produt doumenttion Aout this relese This

More information

CMPSC 470: Compiler Construction

CMPSC 470: Compiler Construction CMPSC 47: Compiler Construction Plese complete the following: Midterm (Type A) Nme Instruction: Mke sure you hve ll pges including this cover nd lnk pge t the end. Answer ech question in the spce provided.

More information

Distributed Systems Principles and Paradigms

Distributed Systems Principles and Paradigms Distriuted Systems Priniples nd Prdigms Christoph Dorn Distriuted Systems Group, Vienn University of Tehnology.dorn@infosys.tuwien..t http://www.infosys.tuwien..t/stff/dorn Slides dpted from Mrten vn Steen,

More information

Distributed Systems Principles and Paradigms. Chapter 11: Distributed File Systems

Distributed Systems Principles and Paradigms. Chapter 11: Distributed File Systems Distriuted Systems Priniples nd Prdigms Mrten vn Steen VU Amsterdm, Dept. Computer Siene steen@s.vu.nl Chpter 11: Distriuted File Systems Version: Deemer 10, 2012 2 / 14 Distriuted File Systems Distriuted

More information

Table-driven look-ahead lexical analysis

Table-driven look-ahead lexical analysis Tle-riven look-he lexil nlysis WUU YANG Computer n Informtion Siene Deprtment Ntionl Chio-Tung University, HsinChu, Tiwn, R.O.C. Astrt. Moern progrmming lnguges use regulr expressions to efine vli tokens.

More information

CSCI 3130: Formal Languages and Automata Theory Lecture 12 The Chinese University of Hong Kong, Fall 2011

CSCI 3130: Formal Languages and Automata Theory Lecture 12 The Chinese University of Hong Kong, Fall 2011 CSCI 3130: Forml Lnguges nd utomt Theory Lecture 12 The Chinese University of Hong Kong, Fll 2011 ndrej Bogdnov In progrmming lnguges, uilding prse trees is significnt tsk ecuse prse trees tell us the

More information

Duality in linear interval equations

Duality in linear interval equations Aville online t http://ijim.sriu..ir Int. J. Industril Mthemtis Vol. 1, No. 1 (2009) 41-45 Dulity in liner intervl equtions M. Movhedin, S. Slhshour, S. Hji Ghsemi, S. Khezerloo, M. Khezerloo, S. M. Khorsny

More information

If you are at the university, either physically or via the VPN, you can download the chapters of this book as PDFs.

If you are at the university, either physically or via the VPN, you can download the chapters of this book as PDFs. Lecture 5 Wlks, Trils, Pths nd Connectedness Reding: Some of the mteril in this lecture comes from Section 1.2 of Dieter Jungnickel (2008), Grphs, Networks nd Algorithms, 3rd edition, which is ville online

More information

Implementing Automata. CSc 453. Compilers and Systems Software. 4 : Lexical Analysis II. Department of Computer Science University of Arizona

Implementing Automata. CSc 453. Compilers and Systems Software. 4 : Lexical Analysis II. Department of Computer Science University of Arizona Implementing utomt Sc 5 ompilers nd Systems Softwre : Lexicl nlysis II Deprtment of omputer Science University of rizon collerg@gmil.com opyright c 009 hristin ollerg NFs nd DFs cn e hrd-coded using this

More information

LR Parsing, Part 2. Constructing Parse Tables. Need to Automatically Construct LR Parse Tables: Action and GOTO Table

LR Parsing, Part 2. Constructing Parse Tables. Need to Automatically Construct LR Parse Tables: Action and GOTO Table TDDD55 Compilers nd Interpreters TDDB44 Compiler Construction LR Prsing, Prt 2 Constructing Prse Tles Prse tle construction Grmmr conflict hndling Ctegories of LR Grmmrs nd Prsers Peter Fritzson, Christoph

More information

10.2 Graph Terminology and Special Types of Graphs

10.2 Graph Terminology and Special Types of Graphs 10.2 Grph Terminology n Speil Types of Grphs Definition 1. Two verties u n v in n unirete grph G re lle jent (or neighors) in G iff u n v re enpoints of n ege e of G. Suh n ege e is lle inient with the

More information

Lexical analysis, scanners. Construction of a scanner

Lexical analysis, scanners. Construction of a scanner Lexicl nlysis scnners (NB. Pges 4-5 re for those who need to refresh their knowledge of DFAs nd NFAs. These re not presented during the lectures) Construction of scnner Tools: stte utomt nd trnsition digrms.

More information

Chapter 9. Greedy Technique. Copyright 2007 Pearson Addison-Wesley. All rights reserved.

Chapter 9. Greedy Technique. Copyright 2007 Pearson Addison-Wesley. All rights reserved. Chpter 9 Greey Tehnique Copyright 2007 Person Aison-Wesley. All rights reserve. Greey Tehnique Construts solution to n optimiztion prolem piee y piee through sequene of hoies tht re: fesile lolly optiml

More information

CS553 Lecture Introduction to Data-flow Analysis 1

CS553 Lecture Introduction to Data-flow Analysis 1 ! Ide Introdution to Dt-flow nlysis!lst Time! Implementing Mrk nd Sweep GC!Tody! Control flow grphs! Liveness nlysis! Register llotion CS553 Leture Introdution to Dt-flow Anlysis 1 Dt-flow Anlysis! Dt-flow

More information

Error Numbers of the Standard Function Block

Error Numbers of the Standard Function Block A.2.2 Numers of the Stndrd Funtion Blok evlution The result of the logi opertion RLO is set if n error ours while the stndrd funtion lok is eing proessed. This llows you to rnh to your own error evlution

More information

V = set of vertices (vertex / node) E = set of edges (v, w) (v, w in V)

V = set of vertices (vertex / node) E = set of edges (v, w) (v, w in V) Definitions G = (V, E) V = set of verties (vertex / noe) E = set of eges (v, w) (v, w in V) (v, w) orere => irete grph (igrph) (v, w) non-orere => unirete grph igrph: w is jent to v if there is n ege from

More information

CS 432 Fall Mike Lam, Professor a (bc)* Regular Expressions and Finite Automata

CS 432 Fall Mike Lam, Professor a (bc)* Regular Expressions and Finite Automata CS 432 Fll 2017 Mike Lm, Professor (c)* Regulr Expressions nd Finite Automt Compiltion Current focus "Bck end" Source code Tokens Syntx tree Mchine code chr dt[20]; int min() { flot x = 42.0; return 7;

More information

CS143 Handout 07 Summer 2011 June 24 th, 2011 Written Set 1: Lexical Analysis

CS143 Handout 07 Summer 2011 June 24 th, 2011 Written Set 1: Lexical Analysis CS143 Hndout 07 Summer 2011 June 24 th, 2011 Written Set 1: Lexicl Anlysis In this first written ssignment, you'll get the chnce to ply round with the vrious constructions tht come up when doing lexicl

More information

ASTs, Regex, Parsing, and Pretty Printing

ASTs, Regex, Parsing, and Pretty Printing ASTs, Regex, Prsing, nd Pretty Printing CS 2112 Fll 2016 1 Algeric Expressions To strt, consider integer rithmetic. Suppose we hve the following 1. The lphet we will use is the digits {0, 1, 2, 3, 4, 5,

More information

Final Exam Review F 06 M 236 Be sure to look over all of your tests, as well as over the activities you did in the activity book

Final Exam Review F 06 M 236 Be sure to look over all of your tests, as well as over the activities you did in the activity book inl xm Review 06 M 236 e sure to loo over ll of your tests, s well s over the tivities you did in the tivity oo 1 1. ind the mesures of the numered ngles nd justify your wor. Line j is prllel to line.

More information

CSc 453. Compilers and Systems Software. 4 : Lexical Analysis II. Department of Computer Science University of Arizona

CSc 453. Compilers and Systems Software. 4 : Lexical Analysis II. Department of Computer Science University of Arizona CSc 453 Compilers nd Systems Softwre 4 : Lexicl Anlysis II Deprtment of Computer Science University of Arizon collerg@gmil.com Copyright c 2009 Christin Collerg Implementing Automt NFAs nd DFAs cn e hrd-coded

More information

10.5 Graphing Quadratic Functions

10.5 Graphing Quadratic Functions 0.5 Grphing Qudrtic Functions Now tht we cn solve qudrtic equtions, we wnt to lern how to grph the function ssocited with the qudrtic eqution. We cll this the qudrtic function. Grphs of Qudrtic Functions

More information

COSC 6374 Parallel Computation. Non-blocking Collective Operations. Edgar Gabriel Fall Overview

COSC 6374 Parallel Computation. Non-blocking Collective Operations. Edgar Gabriel Fall Overview COSC 6374 Prllel Computtion Non-loking Colletive Opertions Edgr Griel Fll 2014 Overview Impt of olletive ommunition opertions Impt of ommunition osts on Speedup Crtesin stenil ommunition All-to-ll ommunition

More information

INTEGRATED WORKFLOW ART DIRECTOR

INTEGRATED WORKFLOW ART DIRECTOR ART DIRECTOR Progrm Resoures INTEGRATED WORKFLOW PROGRAM PLANNING PHASE In this workflow phse proess, you ollorte with the Progrm Mnger, the Projet Mnger, nd the Art Speilist/ Imge Led to updte the resoures

More information

Lecture 12 : Topological Spaces

Lecture 12 : Topological Spaces Leture 12 : Topologil Spes 1 Topologil Spes Topology generlizes notion of distne nd loseness et. Definition 1.1. A topology on set X is olletion T of susets of X hving the following properties. 1. nd X

More information

ΕΠΛ323 - Θεωρία και Πρακτική Μεταγλωττιστών. Lecture 3b Lexical Analysis Elias Athanasopoulos

ΕΠΛ323 - Θεωρία και Πρακτική Μεταγλωττιστών. Lecture 3b Lexical Analysis Elias Athanasopoulos ΕΠΛ323 - Θωρία και Πρακτική Μταγλωττιστών Lecture 3 Lexicl Anlysis Elis Athnsopoulos elisthn@cs.ucy.c.cy RecogniNon of Tokens if expressions nd relnonl opertors if è if then è then else è else relop è

More information

Scanner Termination. Multi Character Lookahead. to its physical end. Most parsers require an end of file token. Lex and Jlex automatically create an

Scanner Termination. Multi Character Lookahead. to its physical end. Most parsers require an end of file token. Lex and Jlex automatically create an Scnner Termintion A scnner reds input chrcters nd prtitions them into tokens. Wht hppens when the end of the input file is reched? It my be useful to crete n Eof pseudo-chrcter when this occurs. In Jv,

More information

4.3 Balanced Trees. let us assume that we can manipulate them conveniently and see how they can be put together to form trees.

4.3 Balanced Trees. let us assume that we can manipulate them conveniently and see how they can be put together to form trees. 428 T FOU 4.3 Blned Trees T BT GOIT IN T VIOU setion work well for wide vriety of pplitions, ut they hve poor worst-se performne. s we hve noted, files lredy in order, files in reverse order, files with

More information

LINX MATRIX SWITCHERS FIRMWARE UPDATE INSTRUCTIONS FIRMWARE VERSION

LINX MATRIX SWITCHERS FIRMWARE UPDATE INSTRUCTIONS FIRMWARE VERSION Overview LINX MATRIX SWITCHERS FIRMWARE UPDATE INSTRUCTIONS FIRMWARE VERSION 4.4.1.0 Due to the omplex nture of this updte, plese fmilirize yourself with these instrutions nd then ontt RGB Spetrum Tehnil

More information

COSC 6374 Parallel Computation. Communication Performance Modeling (II) Edgar Gabriel Fall Overview. Impact of communication costs on Speedup

COSC 6374 Parallel Computation. Communication Performance Modeling (II) Edgar Gabriel Fall Overview. Impact of communication costs on Speedup COSC 6374 Prllel Computtion Communition Performne Modeling (II) Edgr Griel Fll 2015 Overview Impt of ommunition osts on Speedup Crtesin stenil ommunition All-to-ll ommunition Impt of olletive ommunition

More information

Languages. L((a (b)(c))*) = { ε,a,bc,aa,abc,bca,... } εw = wε = w. εabba = abbaε = abba. (a (b)(c)) *

Languages. L((a (b)(c))*) = { ε,a,bc,aa,abc,bca,... } εw = wε = w. εabba = abbaε = abba. (a (b)(c)) * Pln for Tody nd Beginning Next week Interpreter nd Compiler Structure, or Softwre Architecture Overview of Progrmming Assignments The MeggyJv compiler we will e uilding. Regulr Expressions Finite Stte

More information

CS453 INTRODUCTION TO DATAFLOW ANALYSIS

CS453 INTRODUCTION TO DATAFLOW ANALYSIS CS453 INTRODUCTION TO DATAFLOW ANALYSIS CS453 Leture Register llotion using liveness nlysis 1 Introdution to Dt-flow nlysis Lst Time Register llotion for expression trees nd lol nd prm vrs Tody Register

More information

To access your mailbox from inside your organization. For assistance, call:

To access your mailbox from inside your organization. For assistance, call: 2001 Ative Voie, In. All rights reserved. First edition 2001. Proteted y one or more of the following United Sttes ptents:,070,2;,3,90;,88,0;,33,102;,8,0;,81,0;,2,7;,1,0;,90,88;,01,11. Additionl U.S. nd

More information

Parallelization Optimization of System-Level Specification

Parallelization Optimization of System-Level Specification Prlleliztion Optimiztion of System-Level Speifition Luki i niel. Gjski enter for Emedded omputer Systems University of liforni Irvine, 92697, US {li, gjski} @es.ui.edu strt This pper introdues the prlleliztion

More information

Geometrical reasoning 1

Geometrical reasoning 1 MODULE 5 Geometril resoning 1 OBJECTIVES This module is for study y n individul teher or group of tehers. It: looks t pprohes to developing pupils visulistion nd geometril resoning skills; onsiders progression

More information

Problem Final Exam Set 2 Solutions

Problem Final Exam Set 2 Solutions CSE 5 5 Algoritms nd nd Progrms Prolem Finl Exm Set Solutions Jontn Turner Exm - //05 0/8/0. (5 points) Suppose you re implementing grp lgoritm tt uses ep s one of its primry dt strutures. Te lgoritm does

More information

Section 3.1: Sequences and Series

Section 3.1: Sequences and Series Section.: Sequences d Series Sequences Let s strt out with the definition of sequence: sequence: ordered list of numbers, often with definite pttern Recll tht in set, order doesn t mtter so this is one

More information

TO REGULAR EXPRESSIONS

TO REGULAR EXPRESSIONS Suject :- Computer Science Course Nme :- Theory Of Computtion DA TO REGULAR EXPRESSIONS Report Sumitted y:- Ajy Singh Meen 07000505 jysmeen@cse.iit.c.in BASIC DEINITIONS DA:- A finite stte mchine where

More information

[SYLWAN., 158(6)]. ISI

[SYLWAN., 158(6)]. ISI The proposl of Improved Inext Isomorphi Grph Algorithm to Detet Design Ptterns Afnn Slem B-Brhem, M. Rizwn Jmeel Qureshi Fulty of Computing nd Informtion Tehnology, King Adulziz University, Jeddh, SAUDI

More information

A Tautology Checker loosely related to Stålmarck s Algorithm by Martin Richards

A Tautology Checker loosely related to Stålmarck s Algorithm by Martin Richards A Tutology Checker loosely relted to Stålmrck s Algorithm y Mrtin Richrds mr@cl.cm.c.uk http://www.cl.cm.c.uk/users/mr/ University Computer Lortory New Museum Site Pemroke Street Cmridge, CB2 3QG Mrtin

More information

CS 430 Spring Mike Lam, Professor. Parsing

CS 430 Spring Mike Lam, Professor. Parsing CS 430 Spring 2015 Mike Lm, Professor Prsing Syntx Anlysis We cn now formlly descrie lnguge's syntx Using regulr expressions nd BNF grmmrs How does tht help us? Syntx Anlysis We cn now formlly descrie

More information

Honors Thesis: Investigating the Algebraic Properties of Cayley Digraphs

Honors Thesis: Investigating the Algebraic Properties of Cayley Digraphs Honors Thesis: Investigting the Algebri Properties of Cyley Digrphs Alexis Byers, Wittenberg University Mthemtis Deprtment April 30, 2014 This pper utilizes Grph Theory to gin insight into the lgebri struture

More information

Tiling Triangular Meshes

Tiling Triangular Meshes Tiling Tringulr Meshes Ming-Yee Iu EPFL I&C 1 Introdution Astrt When modelling lrge grphis senes, rtists re not epeted to model minute nd repetitive fetures suh s grss or snd with individul piees of geometry

More information

Fundamentals of Engineering Analysis ENGR Matrix Multiplication, Types

Fundamentals of Engineering Analysis ENGR Matrix Multiplication, Types Fundmentls of Engineering Anlysis ENGR - Mtri Multiplition, Types Spring Slide Mtri Multiplition Define Conformle To multiply A * B, the mtries must e onformle. Given mtries: A m n nd B n p The numer of

More information

this grammar generates the following language: Because this symbol will also be used in a later step, it receives the

this grammar generates the following language: Because this symbol will also be used in a later step, it receives the LR() nlysis Drwcks of LR(). Look-hed symols s eplined efore, concerning LR(), it is possile to consult the net set to determine, in the reduction sttes, for which symols it would e possile to perform reductions.

More information

Distance vector protocol

Distance vector protocol istne vetor protool Irene Finohi finohi@i.unirom.it Routing Routing protool Gol: etermine goo pth (sequene of routers) thru network from soure to Grph strtion for routing lgorithms: grph noes re routers

More information

Example: Source Code. Lexical Analysis. The Lexical Structure. Tokens. What do we really care here? A Sample Toy Program:

Example: Source Code. Lexical Analysis. The Lexical Structure. Tokens. What do we really care here? A Sample Toy Program: Lexicl Anlysis Red source progrm nd produce list of tokens ( liner nlysis) source progrm The lexicl structure is specified using regulr expressions Other secondry tsks: (1) get rid of white spces (e.g.,

More information

Can Pythagoras Swim?

Can Pythagoras Swim? Overview Ativity ID: 8939 Mth Conepts Mterils Students will investigte reltionships etween sides of right tringles to understnd the Pythgoren theorem nd then use it to solve prolems. Students will simplify

More information

Enterprise Digital Signage Create a New Sign

Enterprise Digital Signage Create a New Sign Enterprise Digitl Signge Crete New Sign Intended Audiene: Content dministrtors of Enterprise Digitl Signge inluding stff with remote ess to sign.pitt.edu nd the Content Mnger softwre pplition for their

More information

P(r)dr = probability of generating a random number in the interval dr near r. For this probability idea to make sense we must have

P(r)dr = probability of generating a random number in the interval dr near r. For this probability idea to make sense we must have Rndom Numers nd Monte Crlo Methods Rndom Numer Methods The integrtion methods discussed so fr ll re sed upon mking polynomil pproximtions to the integrnd. Another clss of numericl methods relies upon using

More information

COMMON FRACTIONS. or a / b = a b. , a is called the numerator, and b is called the denominator.

COMMON FRACTIONS. or a / b = a b. , a is called the numerator, and b is called the denominator. COMMON FRACTIONS BASIC DEFINITIONS * A frtion is n inite ivision. or / * In the frtion is lle the numertor n is lle the enomintor. * The whole is seprte into "" equl prts n we re onsiering "" of those

More information

Slides for Data Mining by I. H. Witten and E. Frank

Slides for Data Mining by I. H. Witten and E. Frank Slides for Dt Mining y I. H. Witten nd E. Frnk Simplicity first Simple lgorithms often work very well! There re mny kinds of simple structure, eg: One ttriute does ll the work All ttriutes contriute eqully

More information

Type Checking. Roadmap (Where are we?) Last lecture Context-sensitive analysis. This lecture Type checking. Symbol tables

Type Checking. Roadmap (Where are we?) Last lecture Context-sensitive analysis. This lecture Type checking. Symbol tables Type Cheking Rodmp (Where re we?) Lst leture Contet-sensitie nlysis Motition Attriute grmmrs Ad ho Synt-direted trnsltion This leture Type heking Type systems Using synt direted trnsltion Symol tles Leil

More information

c s ha2 c s Half Adder Figure 2: Full Adder Block Diagram

c s ha2 c s Half Adder Figure 2: Full Adder Block Diagram Adder Tk: Implement 2-it dder uing 1-it full dder nd 1-it hlf dder omponent (Figure 1) tht re onneted together in top-level module. Derie oth omponent in VHDL. Prepre two implementtion where VHDL omponent

More information

2014 Haskell January Test Regular Expressions and Finite Automata

2014 Haskell January Test Regular Expressions and Finite Automata 0 Hskell Jnury Test Regulr Expressions nd Finite Automt This test comprises four prts nd the mximum mrk is 5. Prts I, II nd III re worth 3 of the 5 mrks vilble. The 0 Hskell Progrmming Prize will be wrded

More information

Internet Routing. IP Packet Format. IP Fragmentation & Reassembly. Principles of Internet Routing. Computer Networks 9/29/2014.

Internet Routing. IP Packet Format. IP Fragmentation & Reassembly. Principles of Internet Routing. Computer Networks 9/29/2014. omputer Networks 9/29/2014 IP Pket Formt Internet Routing Ki Shen IP protool version numer heder length (words) for qulity of servie mx numer remining hops (deremented t eh router) upper lyer protool to

More information

Efficient Subscription Management in Content-based Networks

Efficient Subscription Management in Content-based Networks Effiient Susription Mngement in Content-sed Networks Rphël Chnd, Psl A. Feler Institut EURECOM 06904 Sophi Antipolis, Frne {hnd feler}@eureom.fr Astrt Content-sed pulish/susrie systems offer onvenient

More information

Compiler Construction D7011E

Compiler Construction D7011E Compiler Construction D7011E Lecture 3: Lexer genertors Viktor Leijon Slides lrgely y John Nordlnder with mteril generously provided y Mrk P. Jones. 1 Recp: Hndwritten Lexers: Don t require sophisticted

More information

Photovoltaic Panel Modelling Using a Stochastic Approach in MATLAB &Simulink

Photovoltaic Panel Modelling Using a Stochastic Approach in MATLAB &Simulink hotovolti nel Modelling Using Stohsti Approh in MATLAB &Simulink KAREL ZALATILEK, JAN LEUCHTER eprtment of Eletril Engineering University of efene Kouniov 65, 61 City of Brno CZECH REUBLIC krelzpltilek@unoz,

More information

CS 551 Computer Graphics. Hidden Surface Elimination. Z-Buffering. Basic idea: Hidden Surface Removal

CS 551 Computer Graphics. Hidden Surface Elimination. Z-Buffering. Basic idea: Hidden Surface Removal CS 55 Computer Grphis Hidden Surfe Removl Hidden Surfe Elimintion Ojet preision lgorithms: determine whih ojets re in front of others Uses the Pinter s lgorithm drw visile surfes from k (frthest) to front

More information

Compilation

Compilation Compiltion 0368-3133 Lecture 2: Lexicl Anlysis Nom Rinetzky 1 2 Lexicl Anlysis Modern Compiler Design: Chpter 2.1 3 Conceptul Structure of Compiler Compiler Source text txt Frontend Semntic Representtion

More information

CSCE 531, Spring 2017, Midterm Exam Answer Key

CSCE 531, Spring 2017, Midterm Exam Answer Key CCE 531, pring 2017, Midterm Exm Answer Key 1. (15 points) Using the method descried in the ook or in clss, convert the following regulr expression into n equivlent (nondeterministic) finite utomton: (

More information
