Special lecture on IKN. Information Knowledge Network -Information retrieval and pattern matching-
1 Special lecture on Information Knowledge Network
-Information retrieval and pattern matching-
The 5th: Regular expression matching
Takuya Kida
IKN Laboratory, Division of Computer Science and Information Technology
Special lecture on IKN, 2017/11/22
2 Today's contents
- About regular expressions
- Flow of matching processing
- Construction of a parse tree for an RE
- Construction of an NFA for RE matching
- How to simulate the NFA?
3 What is a regular expression?
A notation for flexible and powerful pattern matching.
Console command examples:
- rm *.txt : matches any filename ending in .txt
- cp Important[0-9].doc : matches Important0.doc through Important9.doc
Grep search example:
- grep -E "for.+(256|CHR_SIZE)" *.c
Matching script example in Perl:
- m ^… : matches strings that start with … followed by .jp/
A regular expression can express a regular set (regular language); that is, it can express a language (set of strings) $L$ that a finite automaton can accept.
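The console and grep examples above can be tried out with Python's standard library. This is only a small sketch: the file names and source lines are made-up sample data, and `fnmatchcase` handles shell-style wildcards (a restricted notation), while `re` handles the extended RE from the grep example.

```python
import re
from fnmatch import fnmatchcase

# Shell-style wildcard from the console example (hypothetical file names).
names = ["Important0.doc", "Important9.doc", "Important10.doc", "notes.txt"]
glob_hits = [n for n in names if fnmatchcase(n, "Important[0-9].doc")]

# Extended RE from the grep example (hypothetical lines of C code).
pattern = re.compile(r"for.+(256|CHR_SIZE)")
lines = [
    "for (i = 0; i < 256; i++)",
    "for (c = 0; c < CHR_SIZE; c++)",
    "while (n < 256)",
]
grep_hits = [line for line in lines if pattern.search(line)]

print(glob_hits)
print(grep_hits)
```

Like grep, `pattern.search` looks for an occurrence anywhere in the line, which is the "find occurrences in a text" setting of this lecture rather than whole-string acceptance.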
4 Definition of regular expression
A regular expression (RE) is a string over $\Sigma \cup \{\varepsilon, |, \cdot, *, (, )\}$ that is recursively defined by the following rules:
(1) $\varepsilon$ and any element of $\Sigma$ are REs.
(2) If $\alpha$ and $\beta$ are REs, then $(\alpha \cdot \beta)$ is an RE.
(3) If $\alpha$ and $\beta$ are REs, then $(\alpha | \beta)$ is an RE.
(4) If $\alpha$ is an RE, then $\alpha^*$ is an RE.
(5) Only those derived from the above are REs.
Example: $(A \cdot ((A \cdot T) | (C \cdot G))^*) = A(AT|CG)^*$
The symbols $|$, $\cdot$, and $*$ are called operators.
The symbol $+$ is often used as $\alpha^+ = \alpha \cdot \alpha^*$ for an RE $\alpha$.
$\alpha \cdot \beta$ is abbreviated as $\alpha\beta$ for convenience.
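5 Semantics of regular expressions
An RE is mapped to a subset of $\Sigma^*$ (a language $L$):
(i) $\|\varepsilon\| = \{\varepsilon\}$
(ii) For any $a \in \Sigma$, $\|a\| = \{a\}$
(iii) For any REs $\alpha$ and $\beta$, $\|(\alpha \cdot \beta)\| = \|\alpha\| \cdot \|\beta\|$
(iv) For any REs $\alpha$ and $\beta$, $\|(\alpha | \beta)\| = \|\alpha\| \cup \|\beta\|$
(v) For any RE $\alpha$, $\|\alpha^*\| = \|\alpha\|^*$
For example, for $(a \cdot (a|b)^*)$:
$\|(a \cdot (a|b)^*)\| = \|a\| \cdot \|(a|b)^*\| = \{a\} \cdot \|(a|b)\|^* = \{a\} \cdot (\|a\| \cup \|b\|)^* = \{ax \mid x \in \{a, b\}^*\}$
[Figure: a DFA equivalent to the example above, with states $q_0$, $q_1$, $q_2$ and transitions labeled a, b, and a,b]
Exercise: how about (AT|GA)(TT)*?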
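The semantic rules above can be checked on small cases with a sketch like the following. It assumes a hypothetical tuple encoding of REs (`("eps",)`, `("sym", a)`, `("cat", l, r)`, `("alt", l, r)`, `("star", c)`) that is not part of the lecture, and because the language of a starred RE is usually infinite it only enumerates the strings of length at most `n`.

```python
def lang(re_node, n):
    """Return the strings of length <= n in the language of re_node."""
    kind = re_node[0]
    if kind == "eps":                       # the empty word
        return {""}
    if kind == "sym":                       # a single symbol
        return {re_node[1]} if n >= 1 else set()
    if kind == "cat":                       # product of the two languages
        left, right = lang(re_node[1], n), lang(re_node[2], n)
        return {x + y for x in left for y in right if len(x + y) <= n}
    if kind == "alt":                       # union of the two languages
        return lang(re_node[1], n) | lang(re_node[2], n)
    if kind == "star":                      # closure, truncated at length n
        base = lang(re_node[1], n)
        result, frontier = {""}, {""}
        while frontier:
            frontier = {x + y for x in frontier for y in base
                        if y and len(x + y) <= n} - result
            result |= frontier
        return result
    raise ValueError(kind)

# The slide's example (a . (a|b)*)
EXAMPLE = ("cat", ("sym", "a"), ("star", ("alt", ("sym", "a"), ("sym", "b"))))
print(sorted(lang(EXAMPLE, 2)))
```

For the example, every enumerated string starts with `a`, in line with $\{ax \mid x \in \{a, b\}^*\}$.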
6 What is the RE matching problem?
The regular expression matching problem is the problem of finding, in a text, any strings in $L(\alpha) = \|\alpha\|$ for a given RE $\alpha$.
REs and finite automata have the same ability to define languages:
- We can construct an FA $M$ that accepts the language $L(\alpha)$ for an RE $\alpha$.
- We can also describe an RE $\alpha$ that derives the language $L(M)$ for an FA $M$.
- Refer to "Automaton and computability" (Sec. 2.5) by Setsuo Arikawa and Satoru Miyano.
Approach: create a DFA/NFA corresponding to the given RE and simulate its moves.
- It is easier to convert an RE to an NFA than to a DFA.
- Pattern occurrences are found when the FA reaches one of its final states while reading the text.
7 Flow of matching process
General flow:
RE -> (parsing) -> parse tree -> (NFA construction by the Thompson method or by the Glushkov method) -> NFA [-> DFA] -> (text scan) -> report the occurrences
Flow with filtering technique:
RE -> (extraction) -> set of factors -> (multiple pattern matching) -> find candidates -> (verify) -> report the occurrences
8 Construction of parse tree
Parse tree: a tree structure used in preparation for making an NFA.
- Each leaf is labeled by a symbol $a \in \Sigma$ or the empty word $\varepsilon$.
- Each internal node is labeled by an operator $x \in \{|, \cdot, *\}$.
Ex) Parse tree $T_{RE}$ for RE = (AT|GA)((AG|AAA)*)
[Figure: parse tree of (AT|GA)((AG|AAA)*), annotated with the depth of each operator]
9 Pseudo code
Parse(p = p_1 p_2 … p_m, last)
  v ← θ;
  while p_last ≠ $ do
    if p_last ∈ Σ or p_last = ε then /* normal character */
      v_r ← create a node with p_last;
      if v ≠ θ then v ← [·](v, v_r);
      else v ← v_r;
      last ← last + 1;
    else if p_last = '|' then /* union operator */
      (v_r, last) ← Parse(p, last + 1);
      v ← [|](v, v_r);
    else if p_last = '*' then /* star operator */
      v ← [*](v);
      last ← last + 1;
    else if p_last = '(' then /* open parenthesis */
      (v_r, last) ← Parse(p, last + 1);
      last ← last + 1;
      if v ≠ θ then v ← [·](v, v_r);
      else v ← v_r;
    else if p_last = ')' then /* close parenthesis */
      return (v, last);
    end of if
  end of while
  return (v, last);
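The Parse pseudo code can be transcribed into Python roughly as follows. This is a sketch under a few assumptions that are not in the slides: the `Node` class and the `leaves` helper are hypothetical, indices are 0-based, `.` stands for the concatenation operator, and any character other than `|`, `*`, `(`, `)`, `$` is treated as a symbol. As in the pseudo code, `*` is applied to everything parsed so far at the current nesting level, so a starred sub-expression should be parenthesized as in the lecture's example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Node:
    label: str                      # a symbol, or one of '.', '|', '*'
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def parse(p: str, last: int = 0) -> Tuple[Optional[Node], int]:
    """Parse p (terminated by '$') from index last; return (tree, index)."""
    v = None
    while p[last] != "$":
        c = p[last]
        if c == "|":                # union operator
            vr, last = parse(p, last + 1)
            v = Node("|", v, vr)
        elif c == "*":              # star operator
            v = Node("*", v)
            last += 1
        elif c == "(":              # open parenthesis
            vr, last = parse(p, last + 1)
            last += 1               # skip the matching ')'
            v = vr if v is None else Node(".", v, vr)
        elif c == ")":              # close parenthesis
            return v, last
        else:                       # normal character
            vr = Node(c)
            v = vr if v is None else Node(".", v, vr)
            last += 1
    return v, last

def leaves(v: Node) -> list:
    """Leaf labels from left to right."""
    if v.left is None and v.right is None:
        return [v.label]
    out = leaves(v.left)
    if v.right is not None:
        out += leaves(v.right)
    return out

root, end = parse("(AT|GA)((AG|AAA)*)$")
print(root.label, root.left.label, root.right.label, "".join(leaves(root)))
```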
10 Thompson's NFA construction method
K. Thompson. Regular expression search algorithm. Communications of the ACM, 11.
Idea:
- Construct the NFA $Th(v)$ that accepts the language $L(RE_v)$ corresponding to the subtree rooted at $v$, while traversing the parse tree $T_{RE}$ in post-order.
- Each $Th(v)$ is obtained by connecting the automata for the children of $v$ with ε-transitions.
Properties of the Thompson NFA:
- #states < 2m and #transitions < 4m, i.e. O(m).
- It contains many ε-transitions.
- Transitions other than ε-transitions always go from state i to state i + 1.
Ex) Thompson NFA for RE = (AT|GA)((AG|AAA)*)
[Figure: Thompson NFA for the example]
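11 NFA construction algorithm
For the parse tree $T_{RE}$, traverse it in post-order and construct an NFA $Th(v)$ for each node $v$ as follows:
(i) When $v$ is the empty word ε: [Figure: I -ε-> F]
(ii) When $v$ is a symbol $a \in \Sigma$: [Figure: I -a-> F]
(iii) When $v$ is the concatenation operator · with children $v_L$, $v_R$: [Figure: $I_L$, $Th(v_L)$, $Th(v_R)$, $F_R$ in sequence]
(iv) When $v$ is the union operator | with children $v_L$, $v_R$: [Figure: I, then $I_L$, $Th(v_L)$, $F_L$ and $I_R$, $Th(v_R)$, $F_R$ in parallel, then F]
(v) When $v$ is the star operator * with child $v_C$: [Figure: I, $Th(v_C)$, F]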
12 Move of the NFA construction algorithm
Ex) Parse tree $T_{RE}$ for RE = (AT|GA)((AG|AAA)*)
[Figure: parse tree of the example, annotated with state numbers]
Ex) Thompson NFA for RE = (AT|GA)((AG|AAA)*)
[Figure: the resulting Thompson NFA]
13 Pseudo code
Thompson_recur(v)
  if v = [|](v_L, v_R) or v = [·](v_L, v_R) then
    Th(v_L) ← Thompson_recur(v_L);
    Th(v_R) ← Thompson_recur(v_R);
  else if v = [*](v_C) then Th(v_C) ← Thompson_recur(v_C);
  /* Recursive post-order traversal so far */
  if v = (ε) then return construction (i);
  if v = (α), α ∈ Σ then return construction (ii);
  if v = [·](v_L, v_R) then return construction (iii);
  if v = [|](v_L, v_R) then return construction (iv);
  if v = [*](v_C) then return construction (v);

Thompson(RE)
  v_RE ← Parse(RE$, 1); /* construct parse tree */
  Th(v_RE) ← Thompson_recur(v_RE);
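Constructions (i) to (v) can be sketched in Python as below. Several things here are assumptions rather than the lecture's method: the parse tree is given directly as nested tuples (hypothetical encoding `("sym", a)`, `("cat", l, r)`, `("alt", l, r)`, `("star", c)`, `("eps",)`), states are allocated while descending the tree instead of in a strict post-order pass, and concatenation reuses the final state of the left part as the initial state of the right part. The numbering was chosen so that every character transition goes from state i to state i + 1, which is the property stated for the Thompson NFA.

```python
def word(s):
    """Left-nested concatenation of the characters of s."""
    node = ("sym", s[0])
    for c in s[1:]:
        node = ("cat", node, ("sym", c))
    return node

# (AT|GA)((AG|AAA)*)
RE = ("cat", ("alt", word("AT"), word("GA")),
             ("star", ("alt", word("AG"), word("AAA"))))

class NFA:
    def __init__(self):
        self.n_states = 0
        self.sym = []               # (source, character, target)
        self.eps = []               # (source, target)

    def new_state(self):
        self.n_states += 1
        return self.n_states - 1

def build(node, nfa, start):
    """Attach the automaton for node at state start; return its final state."""
    kind = node[0]
    if kind == "eps":                                   # construction (i)
        f = nfa.new_state()
        nfa.eps.append((start, f))
        return f
    if kind == "sym":                                   # construction (ii)
        f = nfa.new_state()
        nfa.sym.append((start, node[1], f))
        return f
    if kind == "cat":                                   # construction (iii)
        return build(node[2], nfa, build(node[1], nfa, start))
    if kind == "alt":                                   # construction (iv)
        finals = []
        for child in (node[1], node[2]):
            i = nfa.new_state()
            nfa.eps.append((start, i))
            finals.append(build(child, nfa, i))
        f = nfa.new_state()
        nfa.eps += [(x, f) for x in finals]
        return f
    if kind == "star":                                  # construction (v)
        i = nfa.new_state()
        nfa.eps.append((start, i))
        fc = build(node[1], nfa, i)
        f = nfa.new_state()
        nfa.eps += [(fc, i), (fc, f), (start, f)]
        return f
    raise ValueError(kind)

nfa = NFA()
initial = nfa.new_state()
final = build(RE, nfa, initial)
print(nfa.n_states, final, sorted(nfa.sym))
```

With this numbering the example yields 18 states (0 to 17), 9 character transitions and 12 ε-transitions.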
14 Glushkov's NFA construction method
V. M. Glushkov. The abstract theory of automata. Russian Mathematical Surveys, 16:1-53.
Idea:
- Make a new expression $\overline{RE}$ by numbering each symbol $a \in \Sigma$ of RE in order from left to right (let $\overline{\Sigma}$ be the alphabet with subscripts).
  Ex) RE = (AT|GA)((AG|AAA)*) gives $\overline{RE} = (A_1 T_2 | G_3 A_4)((A_5 G_6 | A_7 A_8 A_9)^*)$
- Create an NFA that accepts $L(\overline{RE})$, then convert it to the final NFA by eliminating the subscripts of the symbols.
Properties of the Glushkov NFA:
- #states is exactly m + 1, but #transitions is $O(m^2)$.
- There are no ε-transitions.
- For any node $v$, all transitions entering $v$ have the same label.
Ex) NFA for $\overline{RE} = (A_1 T_2 | G_3 A_4)((A_5 G_6 | A_7 A_8 A_9)^*)$
[Figure: NFA over the subscripted alphabet]
Ex) Glushkov NFA
[Figure: Glushkov NFA after eliminating the subscripts]
15 NFA construction algorithm (1)
Let $\overline{RE}$ be the numbered expression for RE, $Pos(\overline{RE}) = \{1, \ldots, m\}$, and $\overline{\Sigma}$ the alphabet with subscripts.
Traversing the parse tree $T_{\overline{RE}}$ in post-order, for each language $\overline{RE}_v$ corresponding to the subtree rooted at node $v$, calculate the sets $First(\overline{RE}_v)$ and $Last(\overline{RE}_v)$, and the functions $Empty_v$ and $Follow(\overline{RE}, x)$, defined as follows:
$First(\overline{RE}) = \{x \in Pos(\overline{RE}) \mid \exists u \in \overline{\Sigma}^*,\ \alpha_x u \in L(\overline{RE})\}$
$Last(\overline{RE}) = \{x \in Pos(\overline{RE}) \mid \exists u \in \overline{\Sigma}^*,\ u \alpha_x \in L(\overline{RE})\}$
$Follow(\overline{RE}, x) = \{y \in Pos(\overline{RE}) \mid \exists u, v \in \overline{\Sigma}^*,\ u \alpha_x \alpha_y v \in L(\overline{RE})\}$
$Empty_{RE}$ returns $\{\varepsilon\}$ if $\varepsilon \in L(RE)$, or $\emptyset$ otherwise. It can be calculated recursively as follows:
$Empty_{\varepsilon} = \{\varepsilon\}$, $Empty_{a \in \Sigma} = \emptyset$, $Empty_{RE_1 | RE_2} = Empty_{RE_1} \cup Empty_{RE_2}$, $Empty_{(RE_1 \cdot RE_2)} = Empty_{RE_1} \cap Empty_{RE_2}$, $Empty_{RE^*} = \{\varepsilon\}$.
The NFA is constructed from the values obtained above:
- First: initial state of the NFA
- Last: final states of the NFA
- Follow: transition function
- Empty: is the initial state of the NFA also a final state?
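16 NFA construction algorithm (2)
The Glushkov NFA $\overline{GL} = (S, \overline{\Sigma}, I, F, \overline{\delta})$ that accepts the language $L(\overline{RE})$:
- $S$: the set of states, $S = \{0, 1, \ldots, m\}$
- $\overline{\Sigma}$: the alphabet with subscripts
- $I$: the initial state, i.e., $I = 0$
- $F$: the set of final states, $F = Last(\overline{RE}) \cup (Empty_{RE} \cdot \{0\})$
- $\overline{\delta}$: the transition function, defined as follows:
  $\forall x \in Pos(\overline{RE}),\ \forall y \in Follow(\overline{RE}, x):\ \overline{\delta}(x, \alpha_y) = y$
  Transitions from the initial state are as follows:
  $\forall y \in First(\overline{RE}):\ \overline{\delta}(0, \alpha_y) = y$
Ex) NFA for $\overline{RE} = (A_1 T_2 | G_3 A_4)((A_5 G_6 | A_7 A_8 A_9)^*)$
[Figure: Glushkov NFA for the example]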
17 Pseudo code
Glushkov_variables(v_RE, lpos)
  if v = [|](v_l, v_r) or v = [·](v_l, v_r) then
    lpos ← Glushkov_variables(v_l, lpos);
    lpos ← Glushkov_variables(v_r, lpos);
  else if v = [*](v_*) then lpos ← Glushkov_variables(v_*, lpos);
  end of if
  if v = (ε) then
    First(v) ← ∅, Last(v) ← ∅, Empty_v ← {ε};
  else if v = (a), a ∈ Σ then
    lpos ← lpos + 1;
    First(v) ← {lpos}, Last(v) ← {lpos}, Empty_v ← ∅, Follow(lpos) ← ∅;
  else if v = [|](v_l, v_r) then
    First(v) ← First(v_l) ∪ First(v_r);
    Last(v) ← Last(v_l) ∪ Last(v_r);
    Empty_v ← Empty_vl ∪ Empty_vr;
  else if v = [·](v_l, v_r) then
    First(v) ← First(v_l) ∪ (Empty_vl · First(v_r));
    Last(v) ← (Empty_vr · Last(v_l)) ∪ Last(v_r);
    Empty_v ← Empty_vl ∩ Empty_vr;
    for x ∈ Last(v_l) do Follow(x) ← Follow(x) ∪ First(v_r); /* O(m^2) time */
  else if v = [*](v_*) then
    First(v) ← First(v_*), Last(v) ← Last(v_*), Empty_v ← {ε};
    for x ∈ Last(v_*) do Follow(x) ← Follow(x) ∪ First(v_*); /* O(m^2) time */
  end of if
  return lpos;
In total: O(m^3) time.
18 Pseudo code (cont.)
Glushkov(RE)
  /* make a parse tree by parsing RE */
  v_RE ← Parse(RE$, 1);

  /* calculate each variable by using the parse tree */
  m ← Glushkov_variables(v_RE, 0);

  /* construct NFA GL(S, Σ, I, F, δ) by the variables */
  Δ ← ∅;
  for i ∈ 0…m do create state i;
  for x ∈ First(v_RE) do Δ ← Δ ∪ {(0, α_x, x)};
  for i ∈ 0…m do
    for x ∈ Follow(i) do Δ ← Δ ∪ {(i, α_x, x)};
  end of for
  for x ∈ Last(v_RE) ∪ (Empty_vRE · {0}) do mark x as terminal;
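The two Glushkov procedures can be combined into one short Python sketch. It assumes the parse tree is supplied as nested tuples (a hypothetical encoding, `("sym", a)`, `("cat", l, r)`, `("alt", l, r)`, `("star", c)`, `("eps",)`) and represents $Empty_v$ as a boolean instead of the sets $\{\varepsilon\}$ and $\emptyset$; the names `glushkov`, `visit` and `word` are illustrative only.

```python
def word(s):
    """Left-nested concatenation of the characters of s."""
    node = ("sym", s[0])
    for c in s[1:]:
        node = ("cat", node, ("sym", c))
    return node

# (AT|GA)((AG|AAA)*)
RE = ("cat", ("alt", word("AT"), word("GA")),
             ("star", ("alt", word("AG"), word("AAA"))))

def glushkov(re_node):
    """Return (m, transitions, final states) of the Glushkov NFA."""
    symbol = {}                     # position -> character at that position
    follow = {}                     # position -> set of following positions

    def visit(node):
        """Post-order pass returning (First, Last, nullable) of node."""
        kind = node[0]
        if kind == "eps":
            return set(), set(), True
        if kind == "sym":
            pos = len(symbol) + 1   # positions are numbered left to right
            symbol[pos] = node[1]
            follow[pos] = set()
            return {pos}, {pos}, False
        if kind == "star":
            first, last, _ = visit(node[1])
            for x in last:
                follow[x] |= first
            return first, last, True
        f1, l1, e1 = visit(node[1])
        f2, l2, e2 = visit(node[2])
        if kind == "alt":
            return f1 | f2, l1 | l2, e1 or e2
        if kind == "cat":
            for x in l1:
                follow[x] |= f2
            return (f1 | (f2 if e1 else set()),
                    l2 | (l1 if e2 else set()),
                    e1 and e2)
        raise ValueError(kind)

    first, last, nullable = visit(re_node)
    delta = {(0, symbol[y], y) for y in first}
    delta |= {(x, symbol[y], y) for x in follow for y in follow[x]}
    finals = set(last) | ({0} if nullable else set())
    return len(symbol), delta, finals

m, delta, finals = glushkov(RE)
print(m, sorted(finals), len(delta))
```

For the example this gives m = 9 (so 10 states), no ε-transitions, and every transition into a state y carries the label of position y.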
19 Flow of matching process (reprint)
General flow:
RE -> (parsing) -> parse tree -> (NFA construction by the Thompson method or by the Glushkov method) -> NFA [-> DFA] -> (text scan) -> report the occurrences
- The NFA is simulated in O(mn) time.
- $O(2^m)$ time and space are needed for translating the NFA into a DFA.
- There exists a method of converting an RE directly into a DFA. Refer to Sec. 3.9 of Compilers: Principles, Techniques, and Tools by A. V. Aho, R. Sethi, and J. D. Ullman, Addison-Wesley (Japanese translation: コンパイラ 原理・技法・ツール).
20 Methods of simulating an NFA
Simulating a Thompson NFA directly
- The most naive method.
- Store the current active states in a list of size O(m) and update them in O(m) time per text character.
- It takes O(mn) time in total.
Simulating a Thompson NFA by converting it into an equivalent DFA
- Based on the classical conversion technique.
- It takes $O(2^m)$ time and space for preprocessing.
- There is a method that dynamically converts only the necessary parts of the DFA during the text scan.
  A. V. Aho, R. Sethi, and J. D. Ullman. Compilers: Principles, Techniques, and Tools. Addison-Wesley.
Efficient hybrid technique
- Divide the Thompson NFA into modules consisting of O(k) nodes, and convert each module into a DFA.
- The transitions between modules are simulated in an NFA manner.
  E. W. Myers. A four Russians algorithm for regular expression pattern matching. Journal of the ACM, 39(2).
High-speed NFA simulation by bit-parallel technique
- Simulating a Thompson NFA: S. Wu and U. Manber [1992]
- Simulating a Glushkov NFA: G. Navarro and M. Raffinot [1999]
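The direct simulation (the first method above) can be sketched as follows. The NFA is hard-coded and is an assumption: it is one possible Thompson NFA for (AT|GA)((AG|AAA)*) with states 0 to 17, whose numbering need not agree with the figure on the slides. To search inside a text rather than test the whole string, the initial state is re-activated at every position.

```python
INITIAL, FINAL = 0, 17
# Character transitions (source, character, target); assumed numbering.
SYM_LIST = [(1, "A", 2), (2, "T", 3), (4, "G", 5), (5, "A", 6), (9, "A", 10),
            (10, "G", 11), (12, "A", 13), (13, "A", 14), (14, "A", 15)]
# Epsilon transitions (source, target).
EPS_LIST = [(0, 1), (0, 4), (3, 7), (6, 7), (7, 8), (7, 17), (8, 9), (8, 12),
            (11, 16), (15, 16), (16, 8), (16, 17)]

SYM = {}
for s, ch, t in SYM_LIST:
    SYM.setdefault((s, ch), []).append(t)
EPS = {}
for s, t in EPS_LIST:
    EPS.setdefault(s, []).append(t)

def eps_closure(states):
    """All states reachable from states through epsilon transitions."""
    seen, stack = set(states), list(states)
    while stack:
        s = stack.pop()
        for t in EPS.get(s, ()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen

def search(text):
    """Return the 1-based text positions at which an occurrence ends."""
    ends = []
    active = eps_closure({INITIAL})
    for pos, ch in enumerate(text, 1):
        moved = {t for s in active for t in SYM.get((s, ch), ())}
        active = eps_closure(moved | {INITIAL})   # O(m) work per character
        if FINAL in active:
            ends.append(pos)
    return ends

print(search("CATAGA"))
```

In "CATAGA" the occurrences AT, ATAG and GA end at positions 3, 5 and 6 respectively.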
21 Bit-parallel Thompson
S. Wu and U. Manber. Fast text searching allowing errors. Communications of the ACM, 35(10):83-91, 1992.
Simulating a Thompson NFA by bit-parallel technique:
- In a Thompson NFA, the state following the i-th state is always the (i+1)-th, except for ε-transitions, so a bit-parallel technique similar to the Shift-And method is applicable.
- ε-transitions are simulated separately; a mask table of size $2^L$ is needed ($L$ is the number of states of the NFA).
- It takes $O(2^L + m|\Sigma|)$ time for preprocessing.
- It scans the text in O(n) time when $L$ is small enough.
Mask tables for a Thompson NFA $(Q = \{s_0, \ldots, s_{|Q|-1}\}, \Sigma, I = s_0, F, \Delta)$, where the large $|$ denotes bitwise OR:
$Q_n = \{0, \ldots, |Q|-1\}$, $I_n = 0^{|Q|-1}1$, $F_n = \big|_{s_j \in F}\ 0^{|Q|-1-j}10^j$,
$B_n[i, \sigma] = \big|_{(s_i, \sigma, s_j) \in \Delta}\ 0^{|Q|-1-j}10^j$,
$E_n[i] = \big|_{s_j \in E(i)}\ 0^{|Q|-1-j}10^j$ (where $E(i)$ is the ε-closure of $s_i$),
$E_d[D] = \big|_{i:\ i = 0 \text{ or } D \,\&\, 0^{L-i-1}10^i \ne 0^L}\ E_n[i]$,
$B[\sigma] = \big|_{i \in 0 \ldots m}\ B_n[i, \sigma]$
22 Pseudo code
BuildEps(N = (Q_n, Σ, I_n, F_n, B_n, E_n))
  for σ ∈ Σ do
    B[σ] ← 0^L;
    for i ∈ 0…L−1 do B[σ] ← B[σ] | B_n[i, σ];
  end of for
  E_d[0] ← E_n[0];
  for i ∈ 0…L−1 do
    for j ∈ 0…2^i − 1 do
      E_d[2^i + j] ← E_n[i] | E_d[j];
    end of for
  end of for
  return (B, E_d);

BPThompson(N = (Q_n, Σ, I_n, F_n, B_n, E_n), T = t_1 t_2 … t_n)
  Preprocessing:
    (B, E_d) ← BuildEps(N);
  Searching:
    D ← E_d[I_n]; /* initial state */
    for pos ∈ 1…n do
      if D & F_n ≠ 0^L then report an occurrence ending at pos − 1;
      D ← E_d[(D << 1) & B[t_pos]];
    end of for
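BuildEps and BPThompson can be sketched in Python with integers as bit masks (bit j stands for state j). The hard-coded NFA is an assumption: one possible Thompson NFA for (AT|GA)((AG|AAA)*) with L = 18 states, numbered so that every character transition goes from i to i + 1. Unlike the pseudo code, this sketch tests for a final state right after each update and reports the number of characters read so far, so an occurrence ending at the last text character is also reported.

```python
L = 18                              # number of NFA states (assumed numbering)
FINAL = 17
SYM = [(1, "A", 2), (2, "T", 3), (4, "G", 5), (5, "A", 6), (9, "A", 10),
       (10, "G", 11), (12, "A", 13), (13, "A", 14), (14, "A", 15)]
EPS = [(0, 1), (0, 4), (3, 7), (6, 7), (7, 8), (7, 17), (8, 9), (8, 12),
       (11, 16), (15, 16), (16, 8), (16, 17)]

def closure_mask(i):
    """E_n[i]: bit mask of the epsilon-closure of state i."""
    seen, stack = {i}, [i]
    while stack:
        s = stack.pop()
        for src, dst in EPS:
            if src == s and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return sum(1 << j for j in seen)

def build_eps():
    """Return B (target masks per character) and E_d (closure per state set)."""
    B = {}
    for src, ch, dst in SYM:
        assert dst == src + 1       # the property the shift relies on
        B[ch] = B.get(ch, 0) | (1 << dst)
    E_n = [closure_mask(i) for i in range(L)]
    E_d = [0] * (1 << L)            # table of size 2^L
    E_d[0] = E_n[0]                 # the initial state is always kept active
    for i in range(L):
        for j in range(1 << i):
            E_d[(1 << i) + j] = E_n[i] | E_d[j]
    return B, E_d

def bp_thompson(text, B, E_d):
    """Return the 1-based text positions at which an occurrence ends."""
    F_n = 1 << FINAL
    D = E_d[1]                      # I_n = 0...01
    ends = []
    for pos, ch in enumerate(text, 1):
        D = E_d[(D << 1) & B.get(ch, 0)]
        if D & F_n:
            ends.append(pos)
    return ends

B, E_d = build_eps()
ends = bp_thompson("CATAGA", B, E_d)
print(ends)
```

Each text character costs one shift, one AND and one table lookup, at the price of the $2^L$-entry table built during preprocessing.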
23 Summary
REs and finite automata have the same ability to define languages.
Flow of regular expression matching
- Construct an NFA via a parse tree for the RE, then simulate the NFA to scan the text.
- Filtering + multiple pattern matching + verification + NFA simulation.
How to construct an NFA
- Thompson NFA: #states < 2m, #transitions < 4m, O(m) space. It contains many ε-transitions. Transitions other than ε-transitions always go from state i to state i + 1.
- Glushkov NFA: #states is exactly m + 1, but #transitions is $O(m^2)$. There are no ε-transitions. For any node $v$, all transitions entering $v$ have the same label.
How to simulate an NFA
- Simulating Thompson NFAs directly: O(mn) time.
- Converting into a DFA: scans in O(n) time, but takes $O(2^m)$ time and space for preprocessing.
- Speeding up by bit-parallel techniques: Bit-parallel Thompson, Bit-parallel Glushkov.
The next theme: Compressed Pattern Matching
24 Appendix
About the definitions of terms that I didn't explain in the first lecture:
- A subset of $\Sigma^*$ is called a formal language, or a language for short.
- For languages $L_1, L_2 \subseteq \Sigma^*$, the set $\{xy \mid x \in L_1 \text{ and } y \in L_2\}$ is called the product of $L_1$ and $L_2$, and is denoted by $L_1 \cdot L_2$, or simply $L_1 L_2$.
- For a language $L \subseteq \Sigma^*$, we define $L^0 = \{\varepsilon\}$ and $L^n = L^{n-1} \cdot L$ $(n \ge 1)$. Moreover, we define $L^* = \bigcup_{n=0}^{\infty} L^n$ and call it the closure of $L$. We also write $L^+ = \bigcup_{n=1}^{\infty} L^n$.
About back-reference (look-behind) notations:
- Handbook of Theoretical Computer Science, Volume A: Algorithms and Complexity, The MIT Press, Elsevier. (Japanese translation: コンピュータ基礎理論ハンドブック Ⅰ: アルゴリズムと複雑さ, 丸善, 1994.) Chapter 5, Sec. 2.3 and Sec. 6.1.
- According to this, it seems that the notion had appeared in 1964.
- It exceeds the frame of context-free grammars (and of course goes beyond REs)!
- Its matching problem is proved to be NP-complete!
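The product and power definitions in the appendix can be illustrated on finite languages with a few lines of Python. The function names are hypothetical, and since $L^*$ is infinite whenever $L$ contains a non-empty string, `closure_up_to` only takes the union of $L^0, \ldots, L^k$.

```python
def product(l1, l2):
    """L1 . L2 = { xy | x in L1 and y in L2 }."""
    return {x + y for x in l1 for y in l2}

def power(l, n):
    """L^0 = {epsilon}, L^n = L^(n-1) . L."""
    result = {""}
    for _ in range(n):
        result = product(result, l)
    return result

def closure_up_to(l, k):
    """Finite approximation of L*: the union of L^0, ..., L^k."""
    result = set()
    for n in range(k + 1):
        result |= power(l, n)
    return result

print(sorted(product({"a", "ab"}, {"", "b"})))
print(sorted(power({"a", "b"}, 2)))
print(sorted(closure_up_to({"ab"}, 2)))
```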
Finite Automata. Dr. Nadeem Akhtar. Assistant Professor Department of Computer Science & IT The Islamia University of Bahawalpur
Finite Automata Dr. Nadeem Akhtar Assistant Professor Department of Computer Science & IT The Islamia University of Bahawalpur PhD Laboratory IRISA-UBS University of South Brittany European University
More informationDr. D.M. Akbar Hussain
1 2 Compiler Construction F6S Lecture - 2 1 3 4 Compiler Construction F6S Lecture - 2 2 5 #include.. #include main() { char in; in = getch ( ); if ( isalpha (in) ) in = getch ( ); else error (); while
More informationECS 120 Lesson 7 Regular Expressions, Pt. 1
ECS 120 Lesson 7 Regular Expressions, Pt. 1 Oliver Kreylos Friday, April 13th, 2001 1 Outline Thus far, we have been discussing one way to specify a (regular) language: Giving a machine that reads a word
More informationFormal Languages and Compilers Lecture IV: Regular Languages and Finite. Finite Automata
Formal Languages and Compilers Lecture IV: Regular Languages and Finite Automata Free University of Bozen-Bolzano Faculty of Computer Science POS Building, Room: 2.03 artale@inf.unibz.it http://www.inf.unibz.it/
More informationFinite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018
Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018 Lecture 11 Ana Bove April 26th 2018 Recap: Regular Languages Decision properties of RL: Is it empty? Does it contain this word? Contains
More informationAmbiguous Grammars and Compactification
Ambiguous Grammars and Compactification Mridul Aanjaneya Stanford University July 17, 2012 Mridul Aanjaneya Automata Theory 1/ 44 Midterm Review Mathematical Induction and Pigeonhole Principle Finite Automata
More informationCOMP Logic for Computer Scientists. Lecture 23
COMP 1002 Logic for Computer cientists Lecture 23 B 5 2 J Admin stuff Assignment 3 extension Because of the power outage, assignment 3 now due on Tuesday, March 14 (also 7pm) Assignment 4 to be posted
More informationAbout the Tutorial. Audience. Prerequisites. Copyright & Disclaimer. Compiler Design
i About the Tutorial A compiler translates the codes written in one language to some other language without changing the meaning of the program. It is also expected that a compiler should make the target
More informationR10 SET a) Construct a DFA that accepts an identifier of a C programming language. b) Differentiate between NFA and DFA?
R1 SET - 1 1. a) Construct a DFA that accepts an identifier of a C programming language. b) Differentiate between NFA and DFA? 2. a) Design a DFA that accepts the language over = {, 1} of all strings that
More informationSyntax Analysis Top Down Parsing
Syntax Analysis Top Down Parsing CMPSC 470 Lecture 05 Topics: Overview Recursive-descent parser First and Follow A. Overview Top-down parsing constructs parse tree for input string from root and creating
More informationIntroduction to Parsing. Lecture 8
Introduction to Parsing Lecture 8 Adapted from slides by G. Necula Outline Limitations of regular languages Parser overview Context-free grammars (CFG s) Derivations Languages and Automata Formal languages
More informationCompiler Design 1. Bottom-UP Parsing. Goutam Biswas. Lect 6
Compiler Design 1 Bottom-UP Parsing Compiler Design 2 The Process The parse tree is built starting from the leaf nodes labeled by the terminals (tokens). The parser tries to discover appropriate reductions,
More informationCS 4120 Introduction to Compilers
CS 4120 Introduction to Compilers Andrew Myers Cornell University Lecture 6: Bottom-Up Parsing 9/9/09 Bottom-up parsing A more powerful parsing technology LR grammars -- more expressive than LL can handle
More informationCourse Project 2 Regular Expressions
Course Project 2 Regular Expressions CSE 30151 Spring 2017 Version of February 16, 2017 In this project, you ll write a regular expression matcher similar to grep, called mere (for match and echo using
More informationOutline. Limitations of regular languages. Introduction to Parsing. Parser overview. Context-free grammars (CFG s)
Outline Limitations of regular languages Introduction to Parsing Parser overview Lecture 8 Adapted from slides by G. Necula Context-free grammars (CFG s) Derivations Languages and Automata Formal languages
More informationFormal Languages and Automata
Mobile Computing and Software Engineering p. 1/3 Formal Languages and Automata Chapter 3 Regular languages and Regular Grammars Chuan-Ming Liu cmliu@csie.ntut.edu.tw Department of Computer Science and
More informationCOMP Logic for Computer Scientists. Lecture 25
COMP 1002 Logic for Computer Scientists Lecture 25 B 5 2 J Admin stuff Assignment 4 is posted. Due March 23 rd. Monday March 20 th office hours From 2:30pm to 3:30pm I need to attend something 2-2:30pm.
More informationLanguages and Compilers
Principles of Software Engineering and Operational Systems Languages and Compilers SDAGE: Level I 2012-13 3. Formal Languages, Grammars and Automata Dr Valery Adzhiev vadzhiev@bournemouth.ac.uk Office:
More informationLexical Analysis - 1. A. Overview A.a) Role of Lexical Analyzer
CMPSC 470 Lecture 02 Topics: Regular Expression Transition Diagram Lexical Analyzer Implementation A. Overview A.a) Role of Lexical Analyzer Lexical Analysis - 1 Lexical analyzer does: read input character
More informationCSE 105 THEORY OF COMPUTATION
CSE 105 THEORY OF COMPUTATION Spring 2017 http://cseweb.ucsd.edu/classes/sp17/cse105-ab/ Today's learning goals Sipser Ch 1.2, 1.3 Design NFA recognizing a given language Convert an NFA (with or without
More informationrecruitment Logo Typography Colourways Mechanism Usage Pip Recruitment Brand Toolkit
Logo Typography Colourways Mechanism Usage Primary; Secondary; Silhouette; Favicon; Additional Notes; Where possible, use the logo with the striped mechanism behind. Only when it is required to be stripped
More informationImplementation of Lexical Analysis
Outline Implementation of Lexical nalysis Specifying lexical structure using regular expressions Finite automata Deterministic Finite utomata (DFs) Non-deterministic Finite utomata (NFs) Implementation
More informationLexical Analysis. Introduction
Lexical Analysis Introduction Copyright 2015, Pedro C. Diniz, all rights reserved. Students enrolled in the Compilers class at the University of Southern California have explicit permission to make copies
More informationFormal languages and computation models
Formal languages and computation models Guy Perrier Bibliography John E. Hopcroft, Rajeev Motwani, Jeffrey D. Ullman - Introduction to Automata Theory, Languages, and Computation - Addison Wesley, 2006.
More informationSimilarity and Model Testing
Similarity and Model Testing 11. 5. 014 Hyunse Yoon, Ph.D. Assistant Research Scientist IIHR-Hydroscience & Engineering e-mail: hyun-se-yoon@uiowa.edu Modeling Model: A representation of a physical system
More informationFormal Languages and Compilers Lecture VI: Lexical Analysis
Formal Languages and Compilers Lecture VI: Lexical Analysis Free University of Bozen-Bolzano Faculty of Computer Science POS Building, Room: 2.03 artale@inf.unibz.it http://www.inf.unibz.it/ artale/ Formal
More informationAdministrivia. Lexical Analysis. Lecture 2-4. Outline. The Structure of a Compiler. Informal sketch of lexical analysis. Issues in lexical analysis
dministrivia Lexical nalysis Lecture 2-4 Notes by G. Necula, with additions by P. Hilfinger Moving to 6 Evans on Wednesday HW available Pyth manual available on line. Please log into your account and electronically
More informationNondeterministic Finite Automata (NFA): Nondeterministic Finite Automata (NFA) states of an automaton of this kind may or may not have a transition for each symbol in the alphabet, or can even have multiple
More informationIntroduction to Automata Theory. BİL405 - Automata Theory and Formal Languages 1
Introduction to Automata Theory BİL405 - Automata Theory and Formal Languages 1 Automata, Computability and Complexity Automata, Computability and Complexity are linked by the question: What are the fundamental
More informationRegular Expressions. Agenda for Today. Grammar for a Tiny Language. Programming Language Specifications
Agenda for Today Regular Expressions CSE 413, Autumn 2005 Programming Languages Basic concepts of formal grammars Regular expressions Lexical specification of programming languages Using finite automata
More informationNeha 1, Abhishek Sharma 2 1 M.Tech, 2 Assistant Professor. Department of Cse, Shri Balwant College of Engineering &Technology, Dcrust University
Methods of Regular Expression Neha 1, Abhishek Sharma 2 1 M.Tech, 2 Assistant Professor Department of Cse, Shri Balwant College of Engineering &Technology, Dcrust University Abstract - Regular expressions
More informationBRAND STANDARD GUIDELINES 2014
BRAND STANDARD GUIDELINES 2014 LOGO USAGE & TYPEFACES Logo Usage The Lackawanna College School of Petroleum & Natural Gas logo utilizes typography, two simple rule lines and the Lackawanna College graphic
More informationOn Strongly *-Graphs
Proceedings of the Pakistan Academy of Sciences: A. Physical and Computational Sciences 54 (2): 179 195 (2017) Copyright Pakistan Academy of Sciences ISSN: 2518-4245 (print), 2518-4253 (online) Pakistan
More informationLexical Analysis. Lecture 3-4
Lexical Analysis Lecture 3-4 Notes by G. Necula, with additions by P. Hilfinger Prof. Hilfinger CS 164 Lecture 3-4 1 Administrivia I suggest you start looking at Python (see link on class home page). Please
More informationContext-Free Languages & Grammars (CFLs & CFGs) Reading: Chapter 5
Context-Free Languages & Grammars (CFLs & CFGs) Reading: Chapter 5 1 Not all languages are regular So what happens to the languages which are not regular? Can we still come up with a language recognizer?
More informationCMSC 330: Organization of Programming Languages
CMSC 330: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Analyzer Optimizer Code Generator Abstract Syntax Tree Front End Back End Compiler
More informationParsing. Handle, viable prefix, items, closures, goto s LR(k): SLR(1), LR(1), LALR(1)
TD parsing - LL(1) Parsing First and Follow sets Parse table construction BU Parsing Handle, viable prefix, items, closures, goto s LR(k): SLR(1), LR(1), LALR(1) Problems with SLR Aho, Sethi, Ullman, Compilers
More informationIn One Slide. Outline. LR Parsing. Table Construction
LR Parsing Table Construction #1 In One Slide An LR(1) parsing table can be constructed automatically from a CFG. An LR(1) item is a pair made up of a production and a lookahead token; it represents a
More informationRegular Languages and Regular Expressions
Regular Languages and Regular Expressions According to our definition, a language is regular if there exists a finite state automaton that accepts it. Therefore every regular language can be described
More informationLast lecture CMSC330. This lecture. Finite Automata: States. Finite Automata. Implementing Regular Expressions. Languages. Regular expressions
Last lecture CMSC330 Finite Automata Languages Sets of strings Operations on languages Regular expressions Constants Operators Precedence 1 2 Finite automata States Transitions Examples Types This lecture
More informationThe Front End. The purpose of the front end is to deal with the input language. Perform a membership test: code source language?
The Front End Source code Front End IR Back End Machine code Errors The purpose of the front end is to deal with the input language Perform a membership test: code source language? Is the program well-formed
More informationVisual Identity Guidelines. Abbreviated for Constituent Leagues
Visual Identity Guidelines Abbreviated for Constituent Leagues 1 Constituent League Logo The logo is available in a horizontal and vertical format. Either can be used depending on the best fit for a particular
More informationCOP4020 Programming Languages. Syntax Prof. Robert van Engelen
COP4020 Programming Languages Syntax Prof. Robert van Engelen Overview n Tokens and regular expressions n Syntax and context-free grammars n Grammar derivations n More about parse trees n Top-down and
More informationCOP4020 Programming Languages. Syntax Prof. Robert van Engelen
COP4020 Programming Languages Syntax Prof. Robert van Engelen Overview Tokens and regular expressions Syntax and context-free grammars Grammar derivations More about parse trees Top-down and bottom-up
More informationMidterm Exam. CSCI 3136: Principles of Programming Languages. February 20, Group 2
Banner number: Name: Midterm Exam CSCI 336: Principles of Programming Languages February 2, 23 Group Group 2 Group 3 Question. Question 2. Question 3. Question.2 Question 2.2 Question 3.2 Question.3 Question
More informationUNIT -2 LEXICAL ANALYSIS
OVER VIEW OF LEXICAL ANALYSIS UNIT -2 LEXICAL ANALYSIS o To identify the tokens we need some method of describing the possible tokens that can appear in the input stream. For this purpose we introduce
More informationCMSC 330: Organization of Programming Languages
CMSC 330: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Analyzer Optimizer Code Generator Abstract Syntax Tree Front End Back End Compiler
More informationImplementation of Lexical Analysis
Implementation of Lexical Analysis Outline Specifying lexical structure using regular expressions Finite automata Deterministic Finite Automata (DFAs) Non-deterministic Finite Automata (NFAs) Implementation
More informationTheoretical Part. Chapter one:- - What are the Phases of compiler? Answer:
Theoretical Part Chapter one:- - What are the Phases of compiler? Six phases Scanner Parser Semantic Analyzer Source code optimizer Code generator Target Code Optimizer Three auxiliary components Literal
More informationIntroduction to Lexing and Parsing
Introduction to Lexing and Parsing ECE 351: Compilers Jon Eyolfson University of Waterloo June 18, 2012 1 Riddle Me This, Riddle Me That What is a compiler? 1 Riddle Me This, Riddle Me That What is a compiler?
More informationMIT Specifying Languages with Regular Expressions and Context-Free Grammars
MIT 6.035 Specifying Languages with Regular essions and Context-Free Grammars Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology Language Definition Problem How to precisely
More informationMA513: Formal Languages and Automata Theory Topic: Context-free Grammars (CFG) Lecture Number 18 Date: September 12, 2011
MA53: Formal Languages and Automata Theory Topic: Context-free Grammars (CFG) Lecture Number 8 Date: September 2, 20 xercise: Define a context-free grammar that represents (a simplification of) expressions
More informationVisit MathNation.com or search "Math Nation" in your phone or tablet's app store to watch the videos that go along with this workbook!
Topic 1: Introduction to Angles - Part 1... 47 Topic 2: Introduction to Angles Part 2... 50 Topic 3: Angle Pairs Part 1... 53 Topic 4: Angle Pairs Part 2... 56 Topic 5: Special Types of Angle Pairs Formed
More informationshift-reduce parsing
Parsing #2 Bottom-up Parsing Rightmost derivations; use of rules from right to left Uses a stack to push symbols the concatenation of the stack symbols with the rest of the input forms a valid bottom-up
More informationLexical Analysis. Chapter 2
Lexical Analysis Chapter 2 1 Outline Informal sketch of lexical analysis Identifies tokens in input string Issues in lexical analysis Lookahead Ambiguities Specifying lexers Regular expressions Examples
More informationAssignment 4 CSE 517: Natural Language Processing
Assignment 4 CSE 517: Natural Language Processing University of Washington Winter 2016 Due: March 2, 2016, 1:30 pm 1 HMMs and PCFGs Here s the definition of a PCFG given in class on 2/17: A finite set
More information1. [5 points each] True or False. If the question is currently open, write O or Open.
University of Nevada, Las Vegas Computer Science 456/656 Spring 2018 Practice for the Final on May 9, 2018 The entire examination is 775 points. The real final will be much shorter. Name: No books, notes,
More informationBottom-Up Parsing. Lecture 11-12
Bottom-Up Parsing Lecture 11-12 (From slides by G. Necula & R. Bodik) 9/22/06 Prof. Hilfinger CS164 Lecture 11 1 Bottom-Up Parsing Bottom-up parsing is more general than topdown parsing And just as efficient
More informationWhere We Are. CMSC 330: Organization of Programming Languages. This Lecture. Programming Languages. Motivation for Grammars
CMSC 330: Organization of Programming Languages Context Free Grammars Where We Are Programming languages Ruby OCaml Implementing programming languages Scanner Uses regular expressions Finite automata Parser
More informationLexical Analysis. Dragon Book Chapter 3 Formal Languages Regular Expressions Finite Automata Theory Lexical Analysis using Automata
Lexical Analysis Dragon Book Chapter 3 Formal Languages Regular Expressions Finite Automata Theory Lexical Analysis using Automata Phase Ordering of Front-Ends Lexical analysis (lexer) Break input string
More informationUniversity of Nevada, Las Vegas Computer Science 456/656 Fall 2016
University of Nevada, Las Vegas Computer Science 456/656 Fall 2016 The entire examination is 925 points. The real final will be much shorter. Name: No books, notes, scratch paper, or calculators. Use pen
More informationMore Bottom-Up Parsing
More Bottom-Up Parsing Lecture 7 Dr. Sean Peisert ECS 142 Spring 2009 1 Status Project 1 Back By Wednesday (ish) savior lexer in ~cs142/s09/bin Project 2 Due Friday, Apr. 24, 11:55pm My office hours 3pm
More informationImplementation of Lexical Analysis
Written ssignments W assigned today Implementation of Lexical nalysis Lecture 4 Due in one week :59pm Electronic hand-in Prof. iken CS 43 Lecture 4 Prof. iken CS 43 Lecture 4 2 Tips on uilding Large Systems