DFA: Automata where the next state is uniquely given by the current state and the current input character.

Size: px
Start display at page:

Download "DFA: Automata where the next state is uniquely given by the current state and the current input character."

Transcription

1 Chapter : SCANNING (Lexical Analysis).3 Finite Automata Introduction to Finite Automata Finite automata (finite-state machines) are a mathematical way of descriing particular kinds of algorithms. A strong relationship etween finite automata and regular expression Identifier letter (letter digit)* Transition: Record a change from one state to another upon a match of the character or characters y which they are laeled. Start state: The recognition process egin Drawing an unlaeled arrowed line to it coming from nowhere Accepting states: Represent the end of the recognition process. Drawing a doule-line order around the state in the diagram.3.1 Definition of Deterministic Finite Automata 1. The Concept of DFA DFA: Automata where the next state is uniquely given y the current state and the current input character. Definition of a DFA: A DFA (Deterministic Finite Automation) M consist of (1) an alphaet, () A set of states S, (3) a transition function T : S S, (4) a start state s0 S, (5)And a set of accepting states A S The language accepted y a DFA M, written L(M), is defined to e the set of strings of characters c1cc3.cn with each ci such that there exist states s1 t(s0,c1),s t(s1,c), sn T(sn-1,cn) with sn an element of A (i.e. an accepting state). Accepting state sn means the same thing as the diagram: c1 c cn s0 s1 s sn-1 sn. Some differences etween definition of DFA and the diagram: 1) The definition does not restrict the set of states to numers ) We have not laeled the transitions with characters ut with names representing a set of characters 1

2 3) definitions T: S S, T(s, c) must have a value for every s and c. In the diagram, T (start, c) defined only if c is a letter, T(in_id, c) is defined only if c is a letter or a digit. Error transitions are not drawn in the diagram ut are simply assumed to always exist. 3. Examples of DFA Example.6: exactly accept one Not Not Example.7: at most one not not Example.8:digit [0-9] nat digit + signednat (+ -)? nat Numer singednat(. nat)?(e signednat)? A DFA of Numer:.3. Lookahead, Backtracking, and Nondeterministic Automata 1. A Typical Action of DFA Algorithm Making a transition: move the character from the input string to a string that accumulates the characters elonging to a single token (the token string value or lexeme of the token) Reaching an accepting state: return the token just recognized, along with any associated attriutes. Reaching an error state: either ack up in the input (acktracking) or to generate an error token.

3 . Finite automaton for an identifier with delimiter and return value The error state represents the fact that either an identifier is not to e recognized (if came from the start state) or a delimiter has een seen and we should now accept and generate an identifier-token. [other]: indicate that the delimiting character should e considered look-ahead, it should e returned to the input string and not consumed. This diagram also expresses the principle of longest su-string : the DFA continues to match letters and digits (in state in_id) until a delimiter is found. By contrast the old diagram allowed the DFA to accept at any point while reading an identifier string. 3. How to arrive at the start state in the first place (comine all the tokens into one DFA) a) Each of these tokens egins with a different character Consider the tokens given y the strings :,, and Each of these is a fixed string, and DFAs for them can e written as right : return ASSIGN return LE return EQ Uniting all of their start states into a single start state to get the DFA return ASSIGN : return EQ return LE ) Several tokens eginning with the same character They cannot e simply written as the right diagram, since it is not a DFA return LE > 3 return NE return LT

4 The diagram can e rearranged into a DFA return LE > return NE [other] return LT 4. Expand the Definition of a Finite Automaton One solution for the prolem is to expand the definition of a finite automaton More than one transition from a state may exist for a particular character (NFA: non-deterministic finite automaton,) Developing an algorithm for systematically turning these NFA into DFAs 5. ε-transition A transition that may occur without consulting the input string (and without consuming any characters) It may e viewed as a "match" of the empty string. ( This should not e confused with a match of the characterεin the input) 6. ε-transitions Used in Two Ways. First: to express a choice of alternatives in a way without comining states Advantage: keeping the original automata intact and only adding a new start state to connect them : Second: to explicitly descrie a match of the empty string. 4

5 7. Definition of NFA An NFA (non-deterministic finite automaton) M consists of an alphaet, a set of states S, a transition function T: S x ( U{ε}) (S), a start state s0 from S, and a set of accepting states A from S The language accepted y M, written L(M), is defined to e the set of strings of characters c1c. cn with each ci from U{ε}such that there exist states s1 in T(s0,c1), s in (s1, c),..., sn in T(sn-1, cn) with sn an element of A. 8. Some Notes An NFA does not represent an algorithm. However, it can e simulated y an algorithm that acktracks through every non-deterministic choice. 9. Examples of NFAs Example.10 The string a can e accepted y either of the following sequences of transitions: a ε a ε ε ε This NFA accepts the languages as follows: regular expression: (a ε)* a+ a* * a a A DFA accepts the same language. a 5

6 .3.3 Implementation of Finite Automata in Code 1. Ways to Translate a DFA or NFA into Code a) The code for the DFA accepting identifiers: letter { starting in state 1 } if the next character is a letter then advance the input; { now in state } while the next character is a letter or a digit do advance the input; { stay in state } end while; { go to state 3 without advancing the input} accept; else { error or other cases } end if; 1 letter [other] digit 3 Two drawacks: It is ad hoc that is, each DFA has to e treated slightly differently, and it is difficult to state an algorithm that will translate every DFA to code in this way. The complexity of the code increases dramatically as the numer of states rises or, more specifically, as the numer of different states along aritrary paths rises. ) A etter method: Using a variale to maintain the current state and writing the transitions as a douly nested case statement inside a loop, where the first case statement tests the current state and the nested second level tests the input character. The code of the DFA for identifier: state : 1; { start } while state 1 or do case state of 1: case input character of letter: advance the input : state : ; else state :.{ error or other }; end case; : case input character of letter, digit: advance the input; state : ; { actually unnecessary } else state:3; end case; end case; end while; 6

7 if state3 then accept else error; c) Generic code: Express the DFA as a data structure and then write "generic" code; A transition tale, or two-dimensional array, indexed y state and input character that expresses the values of the transition function T Ways to Translate a Characters DFA in the or alphaet NFA c into Code States s States representing transitions T (s, c) The transition tale of the DFA for identifier: The transition tale of the DFA for identifier: state Input char letter digit other Accepting 1 No [3] no 3 yes Brackets indicate noninputconsuming transitions This column indicates accepting states Assume :the first state listed is the start state Features of Tale-Driven Method Tale driven: use tales to direct the progress of the algorithm. The advantage: The size of the code is reduced, the same code will work for many different prolems, and the code is easier to change (maintain). The disadvantage: The tales can ecome very large, causing a significant increase in the space used y the program. Indeed, much of the space in the arrays we have just descried is wasted. Tale-driven methods often rely on tale-compression methods such as sparse-array representations, although there is usually a time penalty to e paid for such compression, since tale lookup ecomes slower. Since scanners must e efficient, these methods are rarely used for them. NFAs can e implemented in similar ways to DFAs, except NFAs are nondeterministic, there are potentially many different sequences of transitions that must e tried. A program that simulates an NFA must store up transitions that have not yet een tried and acktrack to them on failure. 7

8 .4 From Regular Expression To DFAs Main Purpose Translating a regular expression into a DFA via NFA. Regular Expression NFA DFA Program 8

We use L i to stand for LL L (i times). It is logical to define L 0 to be { }. The union of languages L and M is given by

We use L i to stand for LL L (i times). It is logical to define L 0 to be { }. The union of languages L and M is given by The term languages to mean any set of string formed from some specific alphaet. The notation of concatenation can also e applied to languages. If L and M are languages, then L.M is the language consisting

More information

1.0 Languages, Expressions, Automata

1.0 Languages, Expressions, Automata .0 Languages, Expressions, Automata Alphaet: Language: a finite set, typically a set of symols. a particular suset of the strings that can e made from the alphaet. ex: an alphaet of digits = {-,0,,2,3,4,5,6,7,8,9}

More information

Lexical Analysis. Implementation: Finite Automata

Lexical Analysis. Implementation: Finite Automata Lexical Analysis Implementation: Finite Automata Outline Specifying lexical structure using regular expressions Finite automata Deterministic Finite Automata (DFAs) Non-deterministic Finite Automata (NFAs)

More information

Implementation of Lexical Analysis

Implementation of Lexical Analysis Implementation of Lexical Analysis Lecture 4 (Modified by Professor Vijay Ganesh) Tips on Building Large Systems KISS (Keep It Simple, Stupid!) Don t optimize prematurely Design systems that can be tested

More information

Finite automata. III. Finite automata: language recognizers. Nondeterministic Finite Automata. Nondeterministic Finite Automata with λ-moves

Finite automata. III. Finite automata: language recognizers. Nondeterministic Finite Automata. Nondeterministic Finite Automata with λ-moves . Finite automata: language recognizers n F can e descried y a laeled directed graph, where the nodes, called states, are laeled with a (unimportant) name edges, called transitions, are laeled with symols

More information

Lexical Analysis. Chapter 2

Lexical Analysis. Chapter 2 Lexical Analysis Chapter 2 1 Outline Informal sketch of lexical analysis Identifies tokens in input string Issues in lexical analysis Lookahead Ambiguities Specifying lexers Regular expressions Examples

More information

CD Assignment I. 1. Explain the various phases of the compiler with a simple example.

CD Assignment I. 1. Explain the various phases of the compiler with a simple example. CD Assignment I 1. Explain the various phases of the compiler with a simple example. The compilation process is a sequence of various phases. Each phase takes input from the previous, and passes the output

More information

David Griol Barres Computer Science Department Carlos III University of Madrid Leganés (Spain)

David Griol Barres Computer Science Department Carlos III University of Madrid Leganés (Spain) David Griol Barres dgriol@inf.uc3m.es Computer Science Department Carlos III University of Madrid Leganés (Spain) OUTLINE Introduction: Definitions The role of the Lexical Analyzer Scanner Implementation

More information

Lexical Analysis. Lecture 2-4

Lexical Analysis. Lecture 2-4 Lexical Analysis Lecture 2-4 Notes by G. Necula, with additions by P. Hilfinger Prof. Hilfinger CS 164 Lecture 2 1 Administrivia Moving to 60 Evans on Wednesday HW1 available Pyth manual available on line.

More information

Computer Science Department Carlos III University of Madrid Leganés (Spain) David Griol Barres

Computer Science Department Carlos III University of Madrid Leganés (Spain) David Griol Barres Computer Science Department Carlos III University of Madrid Leganés (Spain) David Griol Barres dgriol@inf.uc3m.es Introduction: Definitions Lexical analysis or scanning: To read from left-to-right a source

More information

Alternation. Kleene Closure. Definition of Regular Expressions

Alternation. Kleene Closure. Definition of Regular Expressions Alternation Small finite sets are conveniently represented by listing their elements. Parentheses delimit expressions, and, the alternation operator, separates alternatives. For example, D, the set of

More information

Introduction to Lexical Analysis

Introduction to Lexical Analysis Introduction to Lexical Analysis Outline Informal sketch of lexical analysis Identifies tokens in input string Issues in lexical analysis Lookahead Ambiguities Specifying lexical analyzers (lexers) Regular

More information

CSE450. Translation of Programming Languages. Lecture 20: Automata and Regular Expressions

CSE450. Translation of Programming Languages. Lecture 20: Automata and Regular Expressions CSE45 Translation of Programming Languages Lecture 2: Automata and Regular Expressions Finite Automata Regular Expression = Specification Finite Automata = Implementation A finite automaton consists of:

More information

Front End: Lexical Analysis. The Structure of a Compiler

Front End: Lexical Analysis. The Structure of a Compiler Front End: Lexical Analysis The Structure of a Compiler Constructing a Lexical Analyser By hand: Identify lexemes in input and return tokens Automatically: Lexical-Analyser generator We will learn about

More information

Formal Languages and Compilers Lecture VI: Lexical Analysis

Formal Languages and Compilers Lecture VI: Lexical Analysis Formal Languages and Compilers Lecture VI: Lexical Analysis Free University of Bozen-Bolzano Faculty of Computer Science POS Building, Room: 2.03 artale@inf.unibz.it http://www.inf.unibz.it/ artale/ Formal

More information

Finite automata. We have looked at using Lex to build a scanner on the basis of regular expressions.

Finite automata. We have looked at using Lex to build a scanner on the basis of regular expressions. Finite automata We have looked at using Lex to build a scanner on the basis of regular expressions. Now we begin to consider the results from automata theory that make Lex possible. Recall: An alphabet

More information

Lexical Analysis - 2

Lexical Analysis - 2 Lexical Analysis - 2 More regular expressions Finite Automata NFAs and DFAs Scanners JLex - a scanner generator 1 Regular Expressions in JLex Symbol - Meaning. Matches a single character (not newline)

More information

Prof. Mohamed Hamada Software Engineering Lab. The University of Aizu Japan

Prof. Mohamed Hamada Software Engineering Lab. The University of Aizu Japan Compilers Prof. Mohamed Hamada Software Engineering Lab. The University of Aizu Japan Lexical Analyzer (Scanner) 1. Uses Regular Expressions to define tokens 2. Uses Finite Automata to recognize tokens

More information

Lexical Analysis. Lecture 3-4

Lexical Analysis. Lecture 3-4 Lexical Analysis Lecture 3-4 Notes by G. Necula, with additions by P. Hilfinger Prof. Hilfinger CS 164 Lecture 3-4 1 Administrivia I suggest you start looking at Python (see link on class home page). Please

More information

Implementation of Lexical Analysis

Implementation of Lexical Analysis Implementation of Lexical Analysis Outline Specifying lexical structure using regular expressions Finite automata Deterministic Finite Automata (DFAs) Non-deterministic Finite Automata (NFAs) Implementation

More information

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou COMP-421 Compiler Design Presented by Dr Ioanna Dionysiou Administrative! [ALSU03] Chapter 3 - Lexical Analysis Sections 3.1-3.4, 3.6-3.7! Reading for next time [ALSU03] Chapter 3 Copyright (c) 2010 Ioanna

More information

Implementation of Lexical Analysis

Implementation of Lexical Analysis Implementation of Lexical Analysis Outline Specifying lexical structure using regular expressions Finite automata Deterministic Finite Automata (DFAs) Non-deterministic Finite Automata (NFAs) Implementation

More information

CSc 453 Lexical Analysis (Scanning)

CSc 453 Lexical Analysis (Scanning) CSc 453 Lexical Analysis (Scanning) Saumya Debray The University of Arizona Tucson Overview source program lexical analyzer (scanner) tokens syntax analyzer (parser) symbol table manager Main task: to

More information

Implementation of Lexical Analysis. Lecture 4

Implementation of Lexical Analysis. Lecture 4 Implementation of Lexical Analysis Lecture 4 1 Tips on Building Large Systems KISS (Keep It Simple, Stupid!) Don t optimize prematurely Design systems that can be tested It is easier to modify a working

More information

CSE302: Compiler Design

CSE302: Compiler Design CSE302: Compiler Design Instructor: Dr. Liang Cheng Department of Computer Science and Engineering P.C. Rossin College of Engineering & Applied Science Lehigh University February 13, 2007 Outline Recap

More information

Announcements! P1 part 1 due next Tuesday P1 part 2 due next Friday

Announcements! P1 part 1 due next Tuesday P1 part 2 due next Friday Announcements! P1 part 1 due next Tuesday P1 part 2 due next Friday 1 Finite-state machines CS 536 Last time! A compiler is a recognizer of language S (Source) a translator from S to T (Target) a program

More information

UNIT -2 LEXICAL ANALYSIS

UNIT -2 LEXICAL ANALYSIS OVER VIEW OF LEXICAL ANALYSIS UNIT -2 LEXICAL ANALYSIS o To identify the tokens we need some method of describing the possible tokens that can appear in the input stream. For this purpose we introduce

More information

Writing a Lexical Analyzer in Haskell (part II)

Writing a Lexical Analyzer in Haskell (part II) Writing a Lexical Analyzer in Haskell (part II) Today Regular languages and lexicographical analysis part II Some of the slides today are from Dr. Saumya Debray and Dr. Christian Colberg This week PA1:

More information

Lexical Analysis. Lecture 3. January 10, 2018

Lexical Analysis. Lecture 3. January 10, 2018 Lexical Analysis Lecture 3 January 10, 2018 Announcements PA1c due tonight at 11:50pm! Don t forget about PA1, the Cool implementation! Use Monday s lecture, the video guides and Cool examples if you re

More information

Last lecture CMSC330. This lecture. Finite Automata: States. Finite Automata. Implementing Regular Expressions. Languages. Regular expressions

Last lecture CMSC330. This lecture. Finite Automata: States. Finite Automata. Implementing Regular Expressions. Languages. Regular expressions Last lecture CMSC330 Finite Automata Languages Sets of strings Operations on languages Regular expressions Constants Operators Precedence 1 2 Finite automata States Transitions Examples Types This lecture

More information

Finite Automata and Scanners

Finite Automata and Scanners Finite Automata and Scanners A finite automaton (FA) can be used to recognize the tokens specified by a regular expression. FAs are simple, idealized computers that recognize strings belonging to regular

More information

Figure 2.1: Role of Lexical Analyzer

Figure 2.1: Role of Lexical Analyzer Chapter 2 Lexical Analysis Lexical analysis or scanning is the process which reads the stream of characters making up the source program from left-to-right and groups them into tokens. The lexical analyzer

More information

CMSC 350: COMPILER DESIGN

CMSC 350: COMPILER DESIGN Lecture 11 CMSC 350: COMPILER DESIGN see HW3 LLVMLITE SPECIFICATION Eisenberg CMSC 350: Compilers 2 Discussion: Defining a Language Premise: programming languages are purely formal objects We (as language

More information

Lexical Analysis. Lexical analysis is the first phase of compilation: The file is converted from ASCII to tokens. It must be fast!

Lexical Analysis. Lexical analysis is the first phase of compilation: The file is converted from ASCII to tokens. It must be fast! Lexical Analysis Lexical analysis is the first phase of compilation: The file is converted from ASCII to tokens. It must be fast! Compiler Passes Analysis of input program (front-end) character stream

More information

Scanners. Xiaokang Qiu Purdue University. August 24, ECE 468 Adapted from Kulkarni 2012

Scanners. Xiaokang Qiu Purdue University. August 24, ECE 468 Adapted from Kulkarni 2012 Scanners Xiaokang Qiu Purdue University ECE 468 Adapted from Kulkarni 2012 August 24, 2016 Scanners Sometimes called lexers Recall: scanners break input stream up into a set of tokens Identifiers, reserved

More information

Compiler phases. Non-tokens

Compiler phases. Non-tokens Compiler phases Compiler Construction Scanning Lexical Analysis source code scanner tokens regular expressions lexical analysis Lennart Andersson parser context free grammar Revision 2011 01 21 parse tree

More information

Wednesday, September 3, 14. Scanners

Wednesday, September 3, 14. Scanners Scanners Scanners Sometimes called lexers Recall: scanners break input stream up into a set of tokens Identifiers, reserved words, literals, etc. What do we need to know? How do we define tokens? How can

More information

CS415 Compilers. Lexical Analysis

CS415 Compilers. Lexical Analysis CS415 Compilers Lexical Analysis These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University Lecture 7 1 Announcements First project and second homework

More information

CSE450. Translation of Programming Languages. Automata, Simple Language Design Principles

CSE450. Translation of Programming Languages. Automata, Simple Language Design Principles CSE45 Translation of Programming Languages Automata, Simple Language Design Principles Finite Automata State Graphs A state: The start state: An accepting state: A transition: a A Simple Example A finite

More information

Implementation of Lexical Analysis

Implementation of Lexical Analysis Written ssignments W assigned today Implementation of Lexical nalysis Lecture 4 Due in one week y 5pm Turn in In class In box outside 4 Gates Electronically Prof. iken CS 43 Lecture 4 Prof. iken CS 43

More information

Chapter 3 Lexical Analysis

Chapter 3 Lexical Analysis Chapter 3 Lexical Analysis Outline Role of lexical analyzer Specification of tokens Recognition of tokens Lexical analyzer generator Finite automata Design of lexical analyzer generator The role of lexical

More information

Implementation of Lexical Analysis

Implementation of Lexical Analysis Written ssignments W assigned today Implementation of Lexical nalysis Lecture 4 Due in one week :59pm Electronic hand-in Prof. iken CS 43 Lecture 4 Prof. iken CS 43 Lecture 4 2 Tips on uilding Large Systems

More information

Lexical Analysis. COMP 524, Spring 2014 Bryan Ward

Lexical Analysis. COMP 524, Spring 2014 Bryan Ward Lexical Analysis COMP 524, Spring 2014 Bryan Ward Based in part on slides and notes by J. Erickson, S. Krishnan, B. Brandenburg, S. Olivier, A. Block and others The Big Picture Character Stream Scanner

More information

Compiler course. Chapter 3 Lexical Analysis

Compiler course. Chapter 3 Lexical Analysis Compiler course Chapter 3 Lexical Analysis 1 A. A. Pourhaji Kazem, Spring 2009 Outline Role of lexical analyzer Specification of tokens Recognition of tokens Lexical analyzer generator Finite automata

More information

T Parallel and Distributed Systems (4 ECTS)

T Parallel and Distributed Systems (4 ECTS) T 79.4301 Parallel and Distriuted Systems (4 ECTS) T 79.4301 Rinnakkaiset ja hajautetut järjestelmät (4 op) Lecture 4 11th of Feruary 2008 Keijo Heljanko Keijo.Heljanko@tkk.fi T 79.4301 Parallel and Distriuted

More information

CSE302: Compiler Design

CSE302: Compiler Design CSE302: Compiler Design Instructor: Dr. Liang Cheng Department of Computer Science and Engineering P.C. Rossin College of Engineering & Applied Science Lehigh University February 01, 2007 Outline Recap

More information

Lexical Analysis. Chapter 1, Section Chapter 3, Section 3.1, 3.3, 3.4, 3.5 JFlex Manual

Lexical Analysis. Chapter 1, Section Chapter 3, Section 3.1, 3.3, 3.4, 3.5 JFlex Manual Lexical Analysis Chapter 1, Section 1.2.1 Chapter 3, Section 3.1, 3.3, 3.4, 3.5 JFlex Manual Inside the Compiler: Front End Lexical analyzer (aka scanner) Converts ASCII or Unicode to a stream of tokens

More information

Monday, August 26, 13. Scanners

Monday, August 26, 13. Scanners Scanners Scanners Sometimes called lexers Recall: scanners break input stream up into a set of tokens Identifiers, reserved words, literals, etc. What do we need to know? How do we define tokens? How can

More information

T.E. (Computer Engineering) (Semester I) Examination, 2013 THEORY OF COMPUTATION (2008 Course)

T.E. (Computer Engineering) (Semester I) Examination, 2013 THEORY OF COMPUTATION (2008 Course) *4459255* [4459] 255 Seat No. T.E. (Computer Engineering) (Semester I) Examination, 2013 THEY OF COMPUTATION (2008 Course) Time : 3 Hours Max. Marks : 100 Instructions : 1) Answers to the two Sections

More information

Lexical Analysis. Introduction

Lexical Analysis. Introduction Lexical Analysis Introduction Copyright 2015, Pedro C. Diniz, all rights reserved. Students enrolled in the Compilers class at the University of Southern California have explicit permission to make copies

More information

1. Lexical Analysis Phase

1. Lexical Analysis Phase 1. Lexical Analysis Phase The purpose of the lexical analyzer is to read the source program, one character at time, and to translate it into a sequence of primitive units called tokens. Keywords, identifiers,

More information

1. (10 points) Draw the state diagram of the DFA that recognizes the language over Σ = {0, 1}

1. (10 points) Draw the state diagram of the DFA that recognizes the language over Σ = {0, 1} CSE 5 Homework 2 Due: Monday October 6, 27 Instructions Upload a single file to Gradescope for each group. should be on each page of the submission. All group members names and PIDs Your assignments in

More information

CSE 105 THEORY OF COMPUTATION

CSE 105 THEORY OF COMPUTATION CSE 105 THEORY OF COMPUTATION Spring 2018 http://cseweb.ucsd.edu/classes/sp18/cse105-ab/ Today's learning goals Sipser Section 2.2 Define push-down automata informally and formally Trace the computation

More information

Compiler Construction

Compiler Construction Compiler Construction Lecture 2: Lexical Analysis I (Introduction) Thomas Noll Lehrstuhl für Informatik 2 (Software Modeling and Verification) noll@cs.rwth-aachen.de http://moves.rwth-aachen.de/teaching/ss-14/cc14/

More information

CS 432 Fall Mike Lam, Professor. Finite Automata Conversions and Lexing

CS 432 Fall Mike Lam, Professor. Finite Automata Conversions and Lexing CS 432 Fall 2017 Mike Lam, Professor Finite Automata Conversions and Lexing Finite Automata Key result: all of the following have the same expressive power (i.e., they all describe regular languages):

More information

Concepts Introduced in Chapter 3. Lexical Analysis. Lexical Analysis Terms. Attributes for Tokens

Concepts Introduced in Chapter 3. Lexical Analysis. Lexical Analysis Terms. Attributes for Tokens Concepts Introduced in Chapter 3 Lexical Analysis Regular Expressions (REs) Nondeterministic Finite Automata (NFA) Converting an RE to an NFA Deterministic Finite Automatic (DFA) Lexical Analysis Why separate

More information

Lecture 9 CIS 341: COMPILERS

Lecture 9 CIS 341: COMPILERS Lecture 9 CIS 341: COMPILERS Announcements HW3: LLVM lite Available on the course web pages. Due: Monday, Feb. 26th at 11:59:59pm Only one group member needs to submit Three submissions per group START

More information

TENTAMEN / EXAM. General instructions

TENTAMEN / EXAM. General instructions Linköpings universitet IDA Department of Computer and Information Sciences Prof. Peter Fritzson and Doc. Christoph Kessler TENTAMEN / EXAM TDDB29 Kompilatorer och interpretatorer / Compilers and interpreters

More information

Lexical Error Recovery

Lexical Error Recovery Lexical Error Recovery A character sequence that can t be scanned into any valid token is a lexical error. Lexical errors are uncommon, but they still must be handled by a scanner. We won t stop compilation

More information

Lecture 2 Finite Automata

Lecture 2 Finite Automata Lecture 2 Finite Automata August 31, 2007 This lecture is intended as a kind of road map to Chapter 1 of the text just the informal examples that I ll present to motivate the ideas. 1 Expressions without

More information

Lexical Error Recovery

Lexical Error Recovery Lexical Error Recovery A character sequence that can t be scanned into any valid token is a lexical error. Lexical errors are uncommon, but they still must be handled by a scanner. We won t stop compilation

More information

Chapter 3: Lexing and Parsing

Chapter 3: Lexing and Parsing Chapter 3: Lexing and Parsing Aarne Ranta Slides for the book Implementing Programming Languages. An Introduction to Compilers and Interpreters, College Publications, 2012. Lexing and Parsing* Deeper understanding

More information

CS308 Compiler Principles Lexical Analyzer Li Jiang

CS308 Compiler Principles Lexical Analyzer Li Jiang CS308 Lexical Analyzer Li Jiang Department of Computer Science and Engineering Shanghai Jiao Tong University Content: Outline Basic concepts: pattern, lexeme, and token. Operations on languages, and regular

More information

Administrivia. Lexical Analysis. Lecture 2-4. Outline. The Structure of a Compiler. Informal sketch of lexical analysis. Issues in lexical analysis

Administrivia. Lexical Analysis. Lecture 2-4. Outline. The Structure of a Compiler. Informal sketch of lexical analysis. Issues in lexical analysis dministrivia Lexical nalysis Lecture 2-4 Notes by G. Necula, with additions by P. Hilfinger Moving to 6 Evans on Wednesday HW available Pyth manual available on line. Please log into your account and electronically

More information

THE COMPILATION PROCESS EXAMPLE OF TOKENS AND ATTRIBUTES

THE COMPILATION PROCESS EXAMPLE OF TOKENS AND ATTRIBUTES THE COMPILATION PROCESS Character stream CS 403: Scanning and Parsing Stefan D. Bruda Fall 207 Token stream Parse tree Abstract syntax tree Modified intermediate form Target language Modified target language

More information

Compiler Construction

Compiler Construction Compiler Construction Thomas Noll Software Modeling and Verification Group RWTH Aachen University https://moves.rwth-aachen.de/teaching/ss-16/cc/ Conceptual Structure of a Compiler Source code x1 := y2

More information

CS 403 Compiler Construction Lecture 3 Lexical Analysis [Based on Chapter 1, 2, 3 of Aho2]

CS 403 Compiler Construction Lecture 3 Lexical Analysis [Based on Chapter 1, 2, 3 of Aho2] CS 403 Compiler Construction Lecture 3 Lexical Analysis [Based on Chapter 1, 2, 3 of Aho2] 1 What is Lexical Analysis? First step of a compiler. Reads/scans/identify the characters in the program and groups

More information

CS 536 Introduction to Programming Languages and Compilers Charles N. Fischer Lecture 3

CS 536 Introduction to Programming Languages and Compilers Charles N. Fischer Lecture 3 CS 536 Introduction to Programming Languages and Compilers Charles N. Fischer Lecture 3 CS 536 Spring 2015 1 Scanning A scanner transforms a character stream into a token stream. A scanner is sometimes

More information

lec3:nondeterministic finite state automata

lec3:nondeterministic finite state automata lec3:nondeterministic finite state automata 1 1.introduction Nondeterminism is a useful concept that has great impact on the theory of computation. When the machine is in a given state and reads the next

More information

Bottom Up Parsing. Shift and Reduce. Sentential Form. Handle. Parse Tree. Bottom Up Parsing 9/26/2012. Also known as Shift-Reduce parsing

Bottom Up Parsing. Shift and Reduce. Sentential Form. Handle. Parse Tree. Bottom Up Parsing 9/26/2012. Also known as Shift-Reduce parsing Also known as Shift-Reduce parsing More powerful than top down Don t need left factored grammars Can handle left recursion Attempt to construct parse tree from an input string eginning at leaves and working

More information

Lexical Analysis. Finite Automata. (Part 2 of 2)

Lexical Analysis. Finite Automata. (Part 2 of 2) # Lexical Analysis Finite Automata (Part 2 of 2) PA0, PA Although we have included the tricky file ends without a newline testcases in previous years, students made good cases against them (e.g., they

More information

Finite Automata. Dr. Nadeem Akhtar. Assistant Professor Department of Computer Science & IT The Islamia University of Bahawalpur

Finite Automata. Dr. Nadeem Akhtar. Assistant Professor Department of Computer Science & IT The Islamia University of Bahawalpur Finite Automata Dr. Nadeem Akhtar Assistant Professor Department of Computer Science & IT The Islamia University of Bahawalpur PhD Laboratory IRISA-UBS University of South Brittany European University

More information

Lexical Analyzer Scanner

Lexical Analyzer Scanner Lexical Analyzer Scanner ASU Textbook Chapter 3.1, 3.3, 3.4, 3.6, 3.7, 3.5 Tsan-sheng Hsu tshsu@iis.sinica.edu.tw http://www.iis.sinica.edu.tw/~tshsu 1 Main tasks Read the input characters and produce

More information

G52LAC Languages and Computation Lecture 6

G52LAC Languages and Computation Lecture 6 G52LAC Languages and Computation Lecture 6 Equivalence of Regular Expression and Finite Automata Henrik Nilsson University of Nottingham G52LACLanguages and ComputationLecture 6 p.1/28 This Lecture (1)

More information

CS143 Handout 20 Summer 2011 July 15 th, 2011 CS143 Practice Midterm and Solution

CS143 Handout 20 Summer 2011 July 15 th, 2011 CS143 Practice Midterm and Solution CS143 Handout 20 Summer 2011 July 15 th, 2011 CS143 Practice Midterm and Solution Exam Facts Format Wednesday, July 20 th from 11:00 a.m. 1:00 p.m. in Gates B01 The exam is designed to take roughly 90

More information

WARNING for Autumn 2004:

WARNING for Autumn 2004: CSE 413 Programming Languages Autumn 2003 Max Points 50 Closed book, closed notes, no electronics. Do your own work! WARNING for Autumn 2004 Last year s exam did not cover Scheme and Java, but this year

More information

CSE 413 Programming Languages & Implementation. Hal Perkins Autumn 2012 Grammars, Scanners & Regular Expressions

CSE 413 Programming Languages & Implementation. Hal Perkins Autumn 2012 Grammars, Scanners & Regular Expressions CSE 413 Programming Languages & Implementation Hal Perkins Autumn 2012 Grammars, Scanners & Regular Expressions 1 Agenda Overview of language recognizers Basic concepts of formal grammars Scanner Theory

More information

Zhizheng Zhang. Southeast University

Zhizheng Zhang. Southeast University Zhizheng Zhang Southeast University 2016/10/5 Lexical Analysis 1 1. The Role of Lexical Analyzer 2016/10/5 Lexical Analysis 2 2016/10/5 Lexical Analysis 3 Example. position = initial + rate * 60 2016/10/5

More information

Theoretical Part. Chapter one:- - What are the Phases of compiler? Answer:

Theoretical Part. Chapter one:- - What are the Phases of compiler? Answer: Theoretical Part Chapter one:- - What are the Phases of compiler? Six phases Scanner Parser Semantic Analyzer Source code optimizer Code generator Target Code Optimizer Three auxiliary components Literal

More information

PRINCIPLES OF COMPILER DESIGN UNIT II LEXICAL ANALYSIS 2.1 Lexical Analysis - The Role of the Lexical Analyzer

PRINCIPLES OF COMPILER DESIGN UNIT II LEXICAL ANALYSIS 2.1 Lexical Analysis - The Role of the Lexical Analyzer PRINCIPLES OF COMPILER DESIGN UNIT II LEXICAL ANALYSIS 2.1 Lexical Analysis - The Role of the Lexical Analyzer As the first phase of a compiler, the main task of the lexical analyzer is to read the input

More information

Lexical Analysis. Finite Automata

Lexical Analysis. Finite Automata #1 Lexical Analysis Finite Automata Cool Demo? (Part 1 of 2) #2 Cunning Plan Informal Sketch of Lexical Analysis LA identifies tokens from input string lexer : (char list) (token list) Issues in Lexical

More information

Recognition of Tokens

Recognition of Tokens Recognition of Tokens Lecture 3 Section 3.4 Robb T. Koether Hampden-Sydney College Mon, Jan 19, 2015 Robb T. Koether (Hampden-Sydney College) Recognition of Tokens Mon, Jan 19, 2015 1 / 21 1 A Class of

More information

CS 314 Principles of Programming Languages. Lecture 3

CS 314 Principles of Programming Languages. Lecture 3 CS 314 Principles of Programming Languages Lecture 3 Zheng Zhang Department of Computer Science Rutgers University Wednesday 14 th September, 2016 Zheng Zhang 1 CS@Rutgers University Class Information

More information

CS Lecture 2. The Front End. Lecture 2 Lexical Analysis

CS Lecture 2. The Front End. Lecture 2 Lexical Analysis CS 1622 Lecture 2 Lexical Analysis CS 1622 Lecture 2 1 Lecture 2 Review of last lecture and finish up overview The first compiler phase: lexical analysis Reading: Chapter 2 in text (by 1/18) CS 1622 Lecture

More information

Languages, Automata, Regular Expressions & Scanners. Winter /8/ Hal Perkins & UW CSE B-1

Languages, Automata, Regular Expressions & Scanners. Winter /8/ Hal Perkins & UW CSE B-1 CSE 401 Compilers Languages, Automata, Regular Expressions & Scanners Hal Perkins Winter 2010 1/8/2010 2002-10 Hal Perkins & UW CSE B-1 Agenda Quick review of basic concepts of formal grammars Regular

More information

CSc 453 Compilers and Systems Software

CSc 453 Compilers and Systems Software CSc 453 Compilers and Systems Software 3 : Lexical Analysis I Christian Collberg Department of Computer Science University of Arizona collberg@gmail.com Copyright c 2009 Christian Collberg August 23, 2009

More information

CS2 Language Processing note 3

CS2 Language Processing note 3 CS2 Language Processing note 3 CS2Ah 5..4 CS2 Language Processing note 3 Nondeterministic finite automata In this lecture we look at nondeterministic finite automata and prove the Conversion Theorem, which

More information

A Scanner should create a token stream from the source code. Here are 4 ways to do this:

A Scanner should create a token stream from the source code. Here are 4 ways to do this: Writing A Scanner A Scanner should create a token stream from the source code. Here are 4 ways to do this: A. Hack something together till it works. (Blech!) B. Use the language specs to produce a Deterministic

More information

Lexical and Syntax Analysis

Lexical and Syntax Analysis Lexical and Syntax Analysis In Text: Chapter 4 N. Meng, F. Poursardar Lexical and Syntactic Analysis Two steps to discover the syntactic structure of a program Lexical analysis (Scanner): to read the input

More information

UNIT II LEXICAL ANALYSIS

UNIT II LEXICAL ANALYSIS UNIT II LEXICAL ANALYSIS 2 Marks 1. What are the issues in lexical analysis? Simpler design Compiler efficiency is improved Compiler portability is enhanced. 2. Define patterns/lexeme/tokens? This set

More information

The Front End. The purpose of the front end is to deal with the input language. Perform a membership test: code source language?

The Front End. The purpose of the front end is to deal with the input language. Perform a membership test: code source language? The Front End Source code Front End IR Back End Machine code Errors The purpose of the front end is to deal with the input language Perform a membership test: code source language? Is the program well-formed

More information

2010: Compilers REVIEW: REGULAR EXPRESSIONS HOW TO USE REGULAR EXPRESSIONS

2010: Compilers REVIEW: REGULAR EXPRESSIONS HOW TO USE REGULAR EXPRESSIONS 2010: Compilers Lexical Analysis: Finite State Automata Dr. Licia Capra UCL/CS REVIEW: REGULAR EXPRESSIONS a Character in A Empty string R S Alternation (either R or S) RS Concatenation (R followed by

More information

CS 403: Scanning and Parsing

CS 403: Scanning and Parsing CS 403: Scanning and Parsing Stefan D. Bruda Fall 2017 THE COMPILATION PROCESS Character stream Scanner (lexical analysis) Token stream Parser (syntax analysis) Parse tree Semantic analysis Abstract syntax

More information

Interpreter. Scanner. Parser. Tree Walker. read. request token. send token. send AST I/O. Console

Interpreter. Scanner. Parser. Tree Walker. read. request token. send token. send AST I/O. Console Scanning 1 read Interpreter Scanner request token Parser send token Console I/O send AST Tree Walker 2 Scanner This process is known as: Scanning, lexing (lexical analysis), and tokenizing This is the

More information

CSEP 501 Compilers. Languages, Automata, Regular Expressions & Scanners Hal Perkins Winter /8/ Hal Perkins & UW CSE B-1

CSEP 501 Compilers. Languages, Automata, Regular Expressions & Scanners Hal Perkins Winter /8/ Hal Perkins & UW CSE B-1 CSEP 501 Compilers Languages, Automata, Regular Expressions & Scanners Hal Perkins Winter 2008 1/8/2008 2002-08 Hal Perkins & UW CSE B-1 Agenda Basic concepts of formal grammars (review) Regular expressions

More information

Lexical Analyzer Scanner

Lexical Analyzer Scanner Lexical Analyzer Scanner ASU Textbook Chapter 3.1, 3.3, 3.4, 3.6, 3.7, 3.5 Tsan-sheng Hsu tshsu@iis.sinica.edu.tw http://www.iis.sinica.edu.tw/~tshsu 1 Main tasks Read the input characters and produce

More information

Compiler Construction

Compiler Construction Compiler Construction Thomas Noll Software Modeling and Verification Group RWTH Aachen University https://moves.rwth-aachen.de/teaching/ss-16/cc/ Recap: First-Longest-Match Analysis Outline of Lecture

More information

CS 310: State Transition Diagrams

CS 310: State Transition Diagrams CS 30: State Transition Diagrams Stefan D. Bruda Winter 207 STATE TRANSITION DIAGRAMS Finite directed graph Edges (transitions) labeled with symbols from an alphabet Nodes (states) labeled only for convenience

More information

Lexical Analysis. Dragon Book Chapter 3 Formal Languages Regular Expressions Finite Automata Theory Lexical Analysis using Automata

Lexical Analysis. Dragon Book Chapter 3 Formal Languages Regular Expressions Finite Automata Theory Lexical Analysis using Automata Lexical Analysis Dragon Book Chapter 3 Formal Languages Regular Expressions Finite Automata Theory Lexical Analysis using Automata Phase Ordering of Front-Ends Lexical analysis (lexer) Break input string

More information

Compiler Construction

Compiler Construction Compiler Construction Thomas Noll Software Modeling and Verification Group RWTH Aachen University https://moves.rwth-aachen.de/teaching/ss-17/cc/ Recap: First-Longest-Match Analysis The Extended Matching

More information