- Madlyn Patrick
- 6 years ago
We use the term language to mean any set of strings formed from some specific alphabet. The notion of concatenation can also be applied to languages. If L and M are languages, then L.M is the language consisting of all strings xy which can be formed by selecting a string x from L and a string y from M, and concatenating them in that order. That is,

LM = {xy | x is in L and y is in M}

We call LM the concatenation of L and M.

Example: Let L be {0, 01, 110}, and let M be {10, 110}. Then LM = {010, 0110, 01110, 11010, 110110}.

For strings, ∘ is the concatenation operator:
w1 = fire, w2 = truck
w1 ∘ w2 = firetruck
w2 ∘ w1 = truckfire
w2 ∘ w2 = trucktruck
We often drop the ∘ and write w1w2 = firetruck. For any string w, wε = w.

We can concatenate languages as well as strings:
L1L2 = {wv : w ∈ L1 and v ∈ L2}
{a, ab}{ε, b} = {a, ab, abb}
{a, b}{a, b} = {aa, ab, ba, bb}
{a, aa}{a, aa} = {aa, aaa, aaaa}

We use L^i to stand for LL...L (i times). It is logical to define L^0 to be {ε}. The union of languages L and M is given by

L ∪ M = {x | x is in L or x is in M}

The empty set, ∅, is the identity under union, since ∅ ∪ L = L ∪ ∅ = L. Under concatenation, ∅L = L∅ = ∅.
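The definitions of concatenation, L^i and union can be tried out on small finite languages. The following is only a minimal sketch: it models a language as a Python set of strings, and the helper names concat and power are assumptions made for this illustration, not part of the notes.

```python
def concat(L, M):
    """Concatenation LM = {xy | x is in L and y is in M}."""
    return {x + y for x in L for y in M}


def power(L, i):
    """L^i: L concatenated with itself i times, with L^0 = {""} (the empty string)."""
    result = {""}
    for _ in range(i):
        result = concat(result, L)
    return result


L = {"0", "01", "110"}
M = {"10", "110"}

print(sorted(concat(L, M)))   # the five strings of LM from the example
print(sorted(L | M))          # union of L and M
print(power({"aa"}, 2))       # L^2 for L = {aa}
```

Note that LM has five elements rather than six, because 0 followed by 110 and 01 followed by 10 both give 0110. Concatenating with the empty set gives the empty set, while concatenating with {ε} (here {""}) leaves the language unchanged.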
There is another operation on languages which plays an important role in specifying tokens. This is the Kleene closure operator. We use L* to denote the concatenation of language L with itself any number of times:

L* = ∪ (i = 0 to ∞) L^i

Example: Let D be the language consisting of the strings 0, 1, ..., 9, that is, each string is a single decimal digit. Then D* is all strings of digits, including the empty string. As another example, if L = {aa}, then L* is all strings of an even number of a's, since L^0 = {ε}, L^1 = {aa}, L^2 = {aaaa}, .... If we wished to exclude ε, we could write L.(L*) to denote that language. That is:

L.(L*) = L . ∪ (i = 0 to ∞) L^i = ∪ (i = 0 to ∞) L^(i+1) = ∪ (i = 1 to ∞) L^i

We shall often use L+ for L.(L*). The unary postfix operator + is called positive closure, and denotes "one or more instances of".

A Simple Approach to the Design of Lexical Analyzers

There are two primary methods for implementing a scanner. The first is a program that is hard-coded to perform the scanning tasks. The second uses regular expressions and finite automata theory to model the scanning process.

One way to begin the design of any program is to describe the behavior of the program by a flowchart. This approach is particularly useful when the program is a lexical analyzer, because the action taken is highly dependent on what characters have been seen recently. Remembering previous characters by the position in a flowchart is a valuable tool, so much so that a specialized kind of flowchart for lexical analyzers, called a transition diagram, has evolved. In a transition diagram, the boxes of the flowchart are drawn as circles and called states. The states are connected by arrows, called edges. The labels on the various edges leaving a state indicate the input characters that can appear after that state.

Identifier = letter (letter | digit)*
digit = [0-9]
letter = [A-Za-z]
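Since L* is an infinite union, a program can only enumerate a finite part of it. The sketch below approximates the Kleene closure and the positive closure defined earlier in this section by stopping at L^n; the function names and the cut-off parameter n are assumptions introduced for this illustration.

```python
def concat(L, M):
    """Concatenation LM = {xy | x is in L and y is in M}."""
    return {x + y for x in L for y in M}


def closure_up_to(L, n):
    """Union of L^0, L^1, ..., L^n: a finite approximation of L*."""
    result = set()
    power = {""}                   # L^0 = {empty string}
    for _ in range(n + 1):
        result |= power
        power = concat(power, L)   # next power: L^(i+1) = L^i . L
    return result


def positive_closure_up_to(L, n):
    """Union of L^1, ..., L^n: a finite approximation of L+ = L.(L*)."""
    if n < 1:
        return set()
    return concat(L, closure_up_to(L, n - 1))


L = {"aa"}
print(sorted(closure_up_to(L, 3), key=len))           # even numbers of a's, including ""
print(sorted(positive_closure_up_to(L, 3), key=len))  # the same strings without ""
```

For L = {aa} the two results differ only in the empty string, which matches the identity L.(L*) = ∪ (i = 1 to ∞) L^i.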
Fig. 6: Transition diagram for identifier

Fig. 6 shows a transition diagram for an identifier, defined to be a letter followed by any number of letters or digits. The starting state of the transition diagram is state 0, the edge from which indicates that the first input character must be a letter. If this is the case, we enter state 1 and look at the next input character. If this is a letter or a digit, we continue this way, reading letters and digits and making transitions from state 1 to itself, until the next input character is a delimiter for an identifier, which we have assumed is any character that is not a letter or a digit. On reading the delimiter, we enter state 2.

To turn a collection of transition diagrams into a program, we construct a segment of code for each state. The first step to be done in the code for any state is to obtain the next character from the input buffer. For this purpose we use a function GETCHAR, which returns the next character, advancing the lookahead pointer at each call. The next step is to determine which edge, if any, out of the state is labeled by a character or class of characters that includes the character just read. If such an edge is found, control is transferred to the state pointed to by that edge. If no such edge is found, and the state is not one which indicates that a token has been found (indicated by a double circle), we have failed to find this token. The lookahead pointer must be retracted to where the beginning pointer is, and another token must be searched for, using another transition diagram. If all transition diagrams have been tried without success, a lexical error has been detected, and an error correction routine must be called.

Consider the transition diagram in Fig. 6. The code for state 0 might be:

State 0: C := GETCHAR ();
         if LETTER (C) then goto state 1
         else FAIL ()

Here, LETTER is a procedure which returns true if and only if C is a letter.
FAIL () is a routine which retracts the lookahead pointer and starts up the next transition diagram, if there is one, or calls the error routine. The code for state 1 is:

State 1: C := GETCHAR ();
         if LETTER (C) or DIGIT (C) then goto state 1
         else if DELIMITER (C) then goto state 2
         else FAIL ()
DIGIT is a procedure which returns true if and only if C is one of the digits 0, 1, ..., 9. DELIMITER is a procedure which returns true whenever C is a character that could follow an identifier. If we define a delimiter to be any character that is not a letter or digit, then the clause "if DELIMITER (C) then" need not be present in state 1. To detect errors more effectively we might define a delimiter precisely (e.g., blank, arithmetic or logical operator, left or right parenthesis, equal sign, colon, semicolon, or comma), depending on the language being compiled.

State 2 indicates that an identifier has been found. Since the delimiter is not part of the identifier, we must retract the lookahead pointer one character, for which we use a procedure RETRACT. We use '*' to indicate states on which input retraction must take place. We must also install the newly found identifier in the symbol table if it is not already there, using the procedure INSTALL. In state 2 we return a pair consisting of the integer code for an identifier, which we denote by id, and a value that is a pointer to the symbol table returned by INSTALL. The code for state 2 is:

State 2: RETRACT ();
         return (id, INSTALL ())

If blanks must be skipped in the language at hand, we should include in the code for state 2 a step that moves the beginning pointer to the next non-blank.

Fig. 7 shows a list of tokens that we want to recognize, using a token recognizer based on the transition diagrams given in Fig. 8.

Token        Code   Value
begin        1      -
end          2      -
if           3      -
then         4      -
else         5      -
identifier   6      Pointer to symbol table
constant     7      Pointer to symbol table
<            8      1
<=           8      2
=            8      3
<>           8      4
>            8      5
>=           8      6
Fig. 7: Token recognizer
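The code fragments for states 0, 1 and 2 of the identifier diagram (Fig. 6) can be collected into one routine. What follows is a hedged sketch rather than the notes' own implementation: the Scanner class, its attribute names, and the use of a Python list as the symbol table are assumptions made for illustration. The token code 6 for identifiers is taken from Fig. 7, and a delimiter is taken to be any character that is not a letter or digit.

```python
ID = 6  # integer code for the identifier token, as listed in Fig. 7


def is_letter(c):
    """LETTER: true only for A-Z and a-z."""
    return c.isascii() and c.isalpha()


def is_digit(c):
    """DIGIT: true only for 0-9."""
    return c.isascii() and c.isdigit()


class Scanner:
    def __init__(self, text):
        self.text = text
        self.begin = 0       # beginning pointer: start of the current lexeme
        self.lookahead = 0   # lookahead pointer
        self.symbols = []    # symbol table (hypothetical: a plain list)

    def getchar(self):
        """GETCHAR: return the next character and advance the lookahead pointer."""
        c = self.text[self.lookahead] if self.lookahead < len(self.text) else ""
        self.lookahead += 1
        return c

    def install(self):
        """INSTALL: add the lexeme if it is new and return its table index."""
        lexeme = self.text[self.begin:self.lookahead]
        if lexeme not in self.symbols:
            self.symbols.append(lexeme)
        return self.symbols.index(lexeme)

    def identifier(self):
        # State 0: the first character must be a letter.
        if not is_letter(self.getchar()):
            self.lookahead = self.begin   # FAIL: retract to the beginning pointer
            return None
        # State 1: loop on letters and digits until a delimiter is read.
        c = self.getchar()
        while is_letter(c) or is_digit(c):
            c = self.getchar()
        # State 2: RETRACT one character, then return (id, INSTALL ()).
        self.lookahead -= 1
        token = (ID, self.install())
        self.begin = self.lookahead
        return token


scanner = Scanner("IFA ")
print(scanner.identifier(), scanner.symbols)
```

A caller that needs the other diagrams of Fig. 8 would try them in turn whenever a routine such as identifier returns None, which plays the role of FAIL () starting up the next diagram.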
[Transition diagrams of Fig. 8; '*' marks states on which the input is retracted]

Keywords:
  B E G I N, then blank or newline:  * return (1, )
  E N D, then blank or newline:      * return (2, )
  I F, then blank or newline:        * return (3, )
  T H E N, then blank or newline:    * return (4, )
  E L S E, then blank or newline:    * return (5, )

Identifier:
  Start, letter, then letters or digits, then not letter or digit:  * return (6, INSTALL ())

Constant:
  Start, digit, then digits, then not digit:  * return (7, INSTALL ())
Relops:
  <, then not = or >:  * return (8, 1)
  <, then =:           return (8, 2)
  <, then >:           return (8, 4)
  =:                   return (8, 3)
  >, then not =:       * return (8, 5)
  >, then =:           return (8, 6)

Fig. 8: Transition diagram

A more efficient program can be constructed from a single transition diagram than from a collection of diagrams, since there is no need to backtrack and rescan using a second transition diagram. In Fig. 8, we have combined all keywords into one transition diagram. However, if we attempt to combine the diagram for identifiers with that for keywords, difficulties arise. For example, on seeing the three letters BEG, we could not tell whether to be in state 3 or state 24.

In Fig. 8, each keyword is treated as a separate token, whereas all relops are combined into one token class, with the associated token value distinguishing one relop from another.

Let us now consider an example of the action of the lexical analyzer constructed from the transition diagram of Fig. 8. On seeing IFA followed by a blank, the lexical analyzer would traverse states 0, 15, and 16, then fail and retract the input to I. It would then start up the second transition diagram at state 23, traverse state 24 three times, go to state 25 on the blank, retract the input one position, and install IFA in the symbol table.

Definition of Regular Expression

After the definition of strings and languages, we are ready to describe regular expressions, the notation we shall use to define the class of languages known as regular sets. Recall that a token is either a single string (such as a punctuation symbol) or one of a collection of strings of a certain type (such as an identifier). If we view the set of strings in each token class as a language, we can use the regular-expression notation to describe tokens. In regular expression notation we could write the definition for identifier as:

Identifier = letter (letter | digit)*

The vertical bar means "or", that is, union; the parentheses are used to group subexpressions; and the star is the closure operator meaning "zero or more instances".

What we call the regular expressions over alphabet Σ are exactly those expressions that can be constructed from the following rules. Each regular expression denotes a language, and we give the rules for construction of the denoted languages along with the regular-expression construction rules.
1- ε is a regular expression denoting {ε}, that is, the language consisting only of the empty string.
2- For each a in Σ, a is a regular expression denoting {a}, the language with only one string, that string consisting of the single symbol a.
3- If R and S are regular expressions denoting languages L_R and L_S, respectively, then:
   i) (R)|(S) is a regular expression denoting L_R ∪ L_S
   ii) (R).(S) is a regular expression denoting L_R . L_S
   iii) (R)* is a regular expression denoting L_R*

We have shown regular expressions formed with parentheses whenever possible.
In fact, we eliminate them when we can, using the precedence rules that * has highest precedence, then comes ., and | has lowest precedence.
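The three construction rules can be read as a recursive definition of the language denoted by an expression. The sketch below is one way to make that concrete, under stated assumptions: a regular expression is represented as a nested tuple (the tags "eps", "sym", "or", "cat" and "star" are invented for this example), and because the denoted language may be infinite, only strings of length at most n are produced.

```python
def language(r, n):
    """Strings of length <= n in the language denoted by regular expression r."""
    kind = r[0]
    if kind == "eps":                    # rule 1: ε denotes {ε}
        return {""}
    if kind == "sym":                    # rule 2: a denotes {a}
        return {r[1]} if n >= 1 else set()
    if kind == "or":                     # rule 3 i): union
        return language(r[1], n) | language(r[2], n)
    if kind == "cat":                    # rule 3 ii): concatenation
        left, right = language(r[1], n), language(r[2], n)
        return {x + y for x in left for y in right if len(x + y) <= n}
    if kind == "star":                   # rule 3 iii): closure
        base = language(r[1], n)
        result = {""}
        frontier = {""}
        while frontier:                  # keep appending until nothing new fits
            frontier = {x + y for x in frontier for y in base
                        if len(x + y) <= n} - result
            result |= frontier
        return result
    raise ValueError(f"unknown expression kind: {kind}")


a, b = ("sym", "a"), ("sym", "b")

# By the precedence rules, a|ba* is grouped as a|(b(a)*).
expr = ("or", a, ("cat", b, ("star", a)))
print(sorted(language(expr, 3)))
```

Writing the expression as nested tuples makes the grouping explicit, which is exactly the information the precedence rules supply when parentheses are omitted.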
Let us assume that our alphabet is {a, b}. The regular expression a denotes {a}, which is different from just the string a.
1- The regular expression a* denotes the closure of the language {a}, that is

   a* = ∪ (i = 0 to ∞) {a^i}

   the set of all strings of zero or more a's. The regular expression aa*, which by our precedence rules is parsed a(a)*, denotes the strings of one or more a's. We may use a+ for aa*.
2- What does the regular expression (a|b)* denote? We see that a|b denotes {a, b}, the language with two strings a and b. Thus (a|b)* denotes

   ∪ (i = 0 to ∞) {a, b}^i

   which is just the set of all strings of a's and b's, including the empty string. The regular expression (a*b*)* denotes the same set.
3- The expression a|ba* is grouped a|(b(a)*), and denotes the set of strings consisting of either a single a, or b followed by zero or more a's.
4- The expression aa|ab|ba|bb denotes all strings of length two, so (aa|ab|ba|bb)* denotes all strings of even length. Note that ε is a string of length zero.
5- ε|a|b denotes strings of length zero or one.

Example: The tokens discussed in Fig. 7 can be described by regular expressions as follows:

Keyword = BEGIN | END | IF | THEN | ELSE
Identifier = letter (letter | digit)*
Constant = digit+
Relop = < | <= | = | <> | > | >=

where letter stands for A | B | ... | Z, and digit stands for 0 | 1 | ... | 9.

If two regular expressions R and S denote the same language, we write R = S, and say that R and S are equivalent. For example, we previously observed that (a|b)* = (a*b*)*. For any regular expressions R, S and T, the following axioms hold:
1- R|S = S|R (| is commutative)
2- R|(S|T) = (R|S)|T (| is associative)
3- R(ST) = (RS)T (. is associative)
4- R(S|T) = RS|RT and (S|T)R = SR|TR (. distributes over |)
5- εR = Rε = R (ε is the identity for concatenation)

Finite Automata

A recognizer for a language L is a program that takes as input a string x and answers "yes" if x is a sentence of L and "no" otherwise. Clearly, the part of a lexical analyzer that identifies the presence of a token on the input is a recognizer for the language defining that token.

Suppose we have specified a language by a regular expression R, and we are given some string x. We want to know whether x is in the language L denoted by R. One way to attempt this test is to check that x can be decomposed into a sequence of substrings denoted by the primitive subexpressions in R. Suppose R is (a|b)*abb, the set of all strings ending in abb, and x is the string aabb. We see that R = R1R2, where R1 = (a|b)* and R2 = abb. We can verify that a is an element of the language denoted by R1 and that abb similarly matches R2. In this way, we show that aabb is in the language denoted by R.

Nondeterministic Finite Automata (NFA)

A better way to convert a regular expression to a recognizer is to construct a generalized transition diagram from the expression. This diagram is called a nondeterministic finite automaton. A nondeterministic finite automaton recognizing the language (a|b)*abb is shown in Fig. 9.

Fig. 9: Nondeterministic finite automaton

The NFA is a labeled directed graph. The nodes are called states, and the labeled edges are called transitions. The NFA looks almost like a transition diagram, but edges can be labeled by ε as well as by characters, and the same character can label two or more transitions out of one state. One state (0 in Fig. 9) is distinguished as the start state, and one or more states may be distinguished as accepting states (or final states). State 3 in Fig. 9 is the final state.

The transitions of an NFA can be conveniently represented in tabular form by means of a transition table. The transition table for the NFA of Fig. 9 is shown in Fig. 10. In the transition table there is a row for each state and a column for each input symbol. The entry for row i and symbol a is the set of possible next states for state i on input a.

State   Input symbol
        a        b
0       {0,1}    {0}
1       -        {2}
2       -        {3}
Fig. 10: Transition table

The NFA accepts an input string x if and only if there is a path from the start state to some accepting state such that the labels along that path spell out x. If the input string is aabb, then we can show this sequence of moves:

State   Remaining input
0       aabb
0       abb
1       bb
2       b
3       ε

In Fig. 11 below we can see an NFA that recognizes aa*|bb*. The string aaa is accepted by going through states 0, 1, 2, 2, and 2. The labels of these edges are ε, a, a and a, whose concatenation is aaa.

Fig. 11: NFA accepting aa*|bb*
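The acceptance condition (some path from the start state to an accepting state spells out x) can be checked without trying paths one at a time, by tracking the set of all states the NFA could be in after each input symbol. The sketch below uses that set-of-states simulation, which is a substitute technique chosen for this example rather than something the notes describe. FIG10 encodes the transition table of Fig. 10. FIG11 encodes the NFA for aa*|bb*, where the numbering of the b branch (states 3 and 4) is an assumption, since the text only names states 0, 1 and 2.

```python
EPS = ""  # label used in this sketch for ε-transitions


def eps_closure(states, delta):
    """All states reachable from `states` using only ε-transitions."""
    closure = set(states)
    stack = list(states)
    while stack:
        s = stack.pop()
        for t in delta.get((s, EPS), ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return closure


def accepts(delta, start, finals, x):
    """True if some path from `start` to a state in `finals` spells out x."""
    current = eps_closure({start}, delta)
    for c in x:
        moved = set()
        for s in current:
            moved |= set(delta.get((s, c), ()))
        current = eps_closure(moved, delta)
    return bool(current & finals)


# Transition table of Fig. 10: the NFA for (a|b)*abb, final state 3.
FIG10 = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}, (2, "b"): {3}}

# NFA for aa*|bb*; states 3 and 4 for the b branch are assumed numbering.
FIG11 = {(0, EPS): {1, 3}, (1, "a"): {2}, (2, "a"): {2},
         (3, "b"): {4}, (4, "b"): {4}}

print(accepts(FIG10, 0, {3}, "aabb"))    # the sequence of moves shown in the text
print(accepts(FIG11, 0, {2, 4}, "aaa"))  # path 0, 1, 2, 2, 2
```

Missing table entries are treated as the empty set of next states, so a string for which every path gets stuck is rejected.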
More informationLexical Analysis. Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP 412 at Rice.
Lexical Analysis Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP 412 at Rice. Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved.
More informationTENTAMEN / EXAM. General instructions
Linköpings universitet IDA Department of Computer and Information Sciences Prof. Peter Fritzson and Doc. Christoph Kessler TENTAMEN / EXAM TDDB29 Kompilatorer och interpretatorer / Compilers and interpreters
More informationFormal Languages and Compilers Lecture IV: Regular Languages and Finite. Finite Automata
Formal Languages and Compilers Lecture IV: Regular Languages and Finite Automata Free University of Bozen-Bolzano Faculty of Computer Science POS Building, Room: 2.03 artale@inf.unibz.it http://www.inf.unibz.it/
More informationUNIT III & IV. Bottom up parsing
UNIT III & IV Bottom up parsing 5.0 Introduction Given a grammar and a sentence belonging to that grammar, if we have to show that the given sentence belongs to the given grammar, there are two methods.
More informationCMPSCI 250: Introduction to Computation. Lecture #7: Quantifiers and Languages 6 February 2012
CMPSCI 250: Introduction to Computation Lecture #7: Quantifiers and Languages 6 February 2012 Quantifiers and Languages Quantifier Definitions Translating Quantifiers Types and the Universe of Discourse
More information1. INTRODUCTION TO LANGUAGE PROCESSING The Language Processing System can be represented as shown figure below.
UNIT I Translator: It is a program that translates one language to another Language. Examples of translator are compiler, assembler, interpreter, linker, loader and preprocessor. Source Code Translator
More informationLast lecture CMSC330. This lecture. Finite Automata: States. Finite Automata. Implementing Regular Expressions. Languages. Regular expressions
Last lecture CMSC330 Finite Automata Languages Sets of strings Operations on languages Regular expressions Constants Operators Precedence 1 2 Finite automata States Transitions Examples Types This lecture
More informationCS321 Languages and Compiler Design I. Winter 2012 Lecture 4
CS321 Languages and Compiler Design I Winter 2012 Lecture 4 1 LEXICAL ANALYSIS Convert source file characters into token stream. Remove content-free characters (comments, whitespace,...) Detect lexical
More informationLexical Analysis. Chapter 1, Section Chapter 3, Section 3.1, 3.3, 3.4, 3.5 JFlex Manual
Lexical Analysis Chapter 1, Section 1.2.1 Chapter 3, Section 3.1, 3.3, 3.4, 3.5 JFlex Manual Inside the Compiler: Front End Lexical analyzer (aka scanner) Converts ASCII or Unicode to a stream of tokens
More informationTHE COMPILATION PROCESS EXAMPLE OF TOKENS AND ATTRIBUTES
THE COMPILATION PROCESS Character stream CS 403: Scanning and Parsing Stefan D. Bruda Fall 207 Token stream Parse tree Abstract syntax tree Modified intermediate form Target language Modified target language
More informationLecture 3: Lexical Analysis
Lecture 3: Lexical Analysis COMP 524 Programming Language Concepts tephen Olivier January 2, 29 Based on notes by A. Block, N. Fisher, F. Hernandez-Campos, J. Prins and D. totts Goal of Lecture Character
More informationCS415 Compilers. Lexical Analysis
CS415 Compilers Lexical Analysis These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University Lecture 7 1 Announcements First project and second homework
More informationLecture 9 CIS 341: COMPILERS
Lecture 9 CIS 341: COMPILERS Announcements HW3: LLVM lite Available on the course web pages. Due: Monday, Feb. 26th at 11:59:59pm Only one group member needs to submit Three submissions per group START
More informationT Parallel and Distributed Systems (4 ECTS)
T 79.4301 Parallel and Distriuted Systems (4 ECTS) T 79.4301 Rinnakkaiset ja hajautetut järjestelmät (4 op) Lecture 4 11th of Feruary 2008 Keijo Heljanko Keijo.Heljanko@tkk.fi T 79.4301 Parallel and Distriuted
More informationNFAs and Myhill-Nerode. CS154 Chris Pollett Feb. 22, 2006.
NFAs and Myhill-Nerode CS154 Chris Pollett Feb. 22, 2006. Outline Bonus Questions Equivalence with Finite Automata Myhill-Nerode Theorem. Bonus Questions These questions are open to anybody. I will only
More informationFinite Automata. Dr. Nadeem Akhtar. Assistant Professor Department of Computer Science & IT The Islamia University of Bahawalpur
Finite Automata Dr. Nadeem Akhtar Assistant Professor Department of Computer Science & IT The Islamia University of Bahawalpur PhD Laboratory IRISA-UBS University of South Brittany European University
More informationT.E. (Computer Engineering) (Semester I) Examination, 2013 THEORY OF COMPUTATION (2008 Course)
*4459255* [4459] 255 Seat No. T.E. (Computer Engineering) (Semester I) Examination, 2013 THEY OF COMPUTATION (2008 Course) Time : 3 Hours Max. Marks : 100 Instructions : 1) Answers to the two Sections
More informationCSEP 501 Compilers. Languages, Automata, Regular Expressions & Scanners Hal Perkins Winter /8/ Hal Perkins & UW CSE B-1
CSEP 501 Compilers Languages, Automata, Regular Expressions & Scanners Hal Perkins Winter 2008 1/8/2008 2002-08 Hal Perkins & UW CSE B-1 Agenda Basic concepts of formal grammars (review) Regular expressions
More informationLexical Analysis. Lecture 2-4
Lexical Analysis Lecture 2-4 Notes by G. Necula, with additions by P. Hilfinger Prof. Hilfinger CS 164 Lecture 2 1 Administrivia Moving to 60 Evans on Wednesday HW1 available Pyth manual available on line.
More informationInterpreter. Scanner. Parser. Tree Walker. read. request token. send token. send AST I/O. Console
Scanning 1 read Interpreter Scanner request token Parser send token Console I/O send AST Tree Walker 2 Scanner This process is known as: Scanning, lexing (lexical analysis), and tokenizing This is the
More informationStructure of Programming Languages Lecture 3
Structure of Programming Languages Lecture 3 CSCI 6636 4536 Spring 2017 CSCI 6636 4536 Lecture 3... 1/25 Spring 2017 1 / 25 Outline 1 Finite Languages Deterministic Finite State Machines Lexical Analysis
More informationCS 536 Introduction to Programming Languages and Compilers Charles N. Fischer Lecture 3
CS 536 Introduction to Programming Languages and Compilers Charles N. Fischer Lecture 3 CS 536 Spring 2015 1 Scanning A scanner transforms a character stream into a token stream. A scanner is sometimes
More informationCS 403: Scanning and Parsing
CS 403: Scanning and Parsing Stefan D. Bruda Fall 2017 THE COMPILATION PROCESS Character stream Scanner (lexical analysis) Token stream Parser (syntax analysis) Parse tree Semantic analysis Abstract syntax
More informationSyntactic Analysis. CS345H: Programming Languages. Lecture 3: Lexical Analysis. Outline. Lexical Analysis. What is a Token? Tokens
Syntactic Analysis CS45H: Programming Languages Lecture : Lexical Analysis Thomas Dillig Main Question: How to give structure to strings Analogy: Understanding an English sentence First, we separate a
More informationLexical Analysis. Lecture 3-4
Lexical Analysis Lecture 3-4 Notes by G. Necula, with additions by P. Hilfinger Prof. Hilfinger CS 164 Lecture 3-4 1 Administrivia I suggest you start looking at Python (see link on class home page). Please
More informationCS 314 Principles of Programming Languages. Lecture 3
CS 314 Principles of Programming Languages Lecture 3 Zheng Zhang Department of Computer Science Rutgers University Wednesday 14 th September, 2016 Zheng Zhang 1 CS@Rutgers University Class Information
More informationLexical Analysis - 2
Lexical Analysis - 2 More regular expressions Finite Automata NFAs and DFAs Scanners JLex - a scanner generator 1 Regular Expressions in JLex Symbol - Meaning. Matches a single character (not newline)
More informationOutline. 1 Scanning Tokens. 2 Regular Expresssions. 3 Finite State Automata
Outline 1 2 Regular Expresssions Lexical Analysis 3 Finite State Automata 4 Non-deterministic (NFA) Versus Deterministic Finite State Automata (DFA) 5 Regular Expresssions to NFA 6 NFA to DFA 7 8 JavaCC:
More informationBottom Up Parsing. Shift and Reduce. Sentential Form. Handle. Parse Tree. Bottom Up Parsing 9/26/2012. Also known as Shift-Reduce parsing
Also known as Shift-Reduce parsing More powerful than top down Don t need left factored grammars Can handle left recursion Attempt to construct parse tree from an input string eginning at leaves and working
More informationLecture 4: Syntax Specification
The University of North Carolina at Chapel Hill Spring 2002 Lecture 4: Syntax Specification Jan 16 1 Phases of Compilation 2 1 Syntax Analysis Syntax: Webster s definition: 1 a : the way in which linguistic
More informationSection A. A grammar that produces more than one parse tree for some sentences is said to be ambiguous.
Section A 1. What do you meant by parser and its types? A parser for grammar G is a program that takes as input a string w and produces as output either a parse tree for w, if w is a sentence of G, or
More information