CSE450. Translation of Programming Languages. Lecture 20: Automata and Regular Expressions

Similar documents
CSE450. Translation of Programming Languages. Automata, Simple Language Design Principles

Implementation of Lexical Analysis

Implementation of Lexical Analysis

Implementation of Lexical Analysis

Kinder, Gentler Nation

Lexical Analysis. Finite Automata. (Part 2 of 2)

Lexical Analysis. Chapter 2

Lexical Analysis. Lecture 2-4

Introduction to Lexical Analysis

Chapter 3 Lexical Analysis

Compiler course. Chapter 3 Lexical Analysis

Implementation of Lexical Analysis. Lecture 4

Implementation of Lexical Analysis

Lexical Analysis. Lecture 3-4

Implementation of Lexical Analysis

Implementation of Lexical Analysis

Lexical Analysis. Implementation: Finite Automata

Administrivia. Lexical Analysis. Lecture 2-4. Outline. The Structure of a Compiler. Informal sketch of lexical analysis. Issues in lexical analysis

CS 432 Fall Mike Lam, Professor. Finite Automata Conversions and Lexing

Last lecture CMSC330. This lecture. Finite Automata: States. Finite Automata. Implementing Regular Expressions. Languages. Regular expressions

Finite automata. We have looked at using Lex to build a scanner on the basis of regular expressions.

Formal Languages and Compilers Lecture IV: Regular Languages and Finite. Finite Automata

6 NFA and Regular Expressions

Automating Construction of Lexers

CS Lecture 2. The Front End. Lecture 2 Lexical Analysis

Formal Languages and Compilers Lecture VI: Lexical Analysis

Regular Expressions. Agenda for Today. Grammar for a Tiny Language. Programming Language Specifications

Regular Languages. MACM 300 Formal Languages and Automata. Formal Languages: Recap. Regular Languages

Dr. D.M. Akbar Hussain

Front End: Lexical Analysis. The Structure of a Compiler

Writing a Lexical Analyzer in Haskell (part II)

CSEP 501 Compilers. Languages, Automata, Regular Expressions & Scanners Hal Perkins Winter /8/ Hal Perkins & UW CSE B-1

(Refer Slide Time: 0:19)

Monday, August 26, 13. Scanners

Wednesday, September 3, 14. Scanners

Lexical Analysis. COMP 524, Spring 2014 Bryan Ward

UNIT -2 LEXICAL ANALYSIS

2010: Compilers REVIEW: REGULAR EXPRESSIONS HOW TO USE REGULAR EXPRESSIONS

Lexical Analysis - 2

Compiler Construction

Finite Automata. Dr. Nadeem Akhtar. Assistant Professor Department of Computer Science & IT The Islamia University of Bahawalpur

MIT Specifying Languages with Regular Expressions and Context-Free Grammars. Martin Rinard Massachusetts Institute of Technology

MIT Specifying Languages with Regular Expressions and Context-Free Grammars

CS2 Language Processing note 3

lec3:nondeterministic finite state automata

Languages, Automata, Regular Expressions & Scanners. Winter /8/ Hal Perkins & UW CSE B-1

CS415 Compilers. Lexical Analysis

Structure of Programming Languages Lecture 3

ECS 120 Lesson 7 Regular Expressions, Pt. 1

Scanners. Xiaokang Qiu Purdue University. August 24, ECE 468 Adapted from Kulkarni 2012

1. (10 points) Draw the state diagram of the DFA that recognizes the language over Σ = {0, 1}

Compiler Construction

CSE 413 Programming Languages & Implementation. Hal Perkins Autumn 2012 Grammars, Scanners & Regular Expressions

Formal Definition of Computation. Formal Definition of Computation p.1/28

Administrativia. Extra credit for bugs in project assignments. Building a Scanner. CS164, Fall Recall: The Structure of a Compiler

Lexical Analysis. Prof. James L. Frankel Harvard University

Compiler phases. Non-tokens

Computer Science Department Carlos III University of Madrid Leganés (Spain) David Griol Barres

Converting a DFA to a Regular Expression JP

Lexical Analyzer Scanner

Lexical Error Recovery

Behaviour Diagrams UML

Chapter 3: Lexing and Parsing

Finite Automata and Scanners

David Griol Barres Computer Science Department Carlos III University of Madrid Leganés (Spain)

2. Lexical Analysis! Prof. O. Nierstrasz!

CS5371 Theory of Computation. Lecture 8: Automata Theory VI (PDA, PDA = CFG)

Concepts Introduced in Chapter 3. Lexical Analysis. Lexical Analysis Terms. Attributes for Tokens

Regular Languages and Regular Expressions

Lexical Analyzer Scanner

CSE302: Compiler Design

CSE 413 Programming Languages & Implementation. Hal Perkins Winter 2019 Grammars, Scanners & Regular Expressions

CS 403 Compiler Construction Lecture 3 Lexical Analysis [Based on Chapter 1, 2, 3 of Aho2]


CMSC 350: COMPILER DESIGN

Decision Properties of RLs & Automaton Minimization

Lexical Analysis. Dragon Book Chapter 3 Formal Languages Regular Expressions Finite Automata Theory Lexical Analysis using Automata

Derivations of a CFG. MACM 300 Formal Languages and Automata. Context-free Grammars. Derivations and parse trees

Lecture 3: Lexical Analysis

CSc 453 Lexical Analysis (Scanning)

CS308 Compiler Principles Lexical Analyzer Li Jiang

Announcements! P1 part 1 due next Tuesday P1 part 2 due next Friday

Lexical Analysis. Lexical analysis is the first phase of compilation: The file is converted from ASCII to tokens. It must be fast!

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

Decidable Problems. We examine the problems for which there is an algorithm.

Computer Sciences Department

FSASIM: A Simulator for Finite-State Automata

Interpreter. Scanner. Parser. Tree Walker. read. request token. send token. send AST I/O. Console

CPSC 434 Lecture 3, Page 1

Lecture 2 Finite Automata

ECS 120 Lesson 16 Turing Machines, Pt. 2

G52LAC Languages and Computation Lecture 6

Figure 2.1: Role of Lexical Analyzer

Concepts. Lexical scanning Regular expressions DFAs and FSAs Lex. Lexical analysis in perspective

1 Parsing (25 pts, 5 each)

Chapter 4. Lexical analysis. Concepts. Lexical scanning Regular expressions DFAs and FSAs Lex. Lexical analysis in perspective

CSE 401/M501 Compilers

CT32 COMPUTER NETWORKS DEC 2015

CSE 105 THEORY OF COMPUTATION

Introduction to Lexing and Parsing

Transcription:

CSE45 Translation of Programming Languages Lecture 2: Automata and Regular Expressions

Finite Automata Regular Expression = Specification Finite Automata = Implementation A finite automaton consists of: An input alphabet A set of states S A start state n in S A set of accepting states F S A set of transitions: (state, input) state Execution starts in the "start state" and then inputs are read in, changing state based on the transition rules. Accept if the state at the end of input is an accepting state, else reject.

Finite Automata State Graphs A state: The start state: An accepting state: A transition: a

A Simple Example A finite automaton that accepts only "":

Another Simple Example Alphabet: {, } A finite automaton accepting any number of 's followed by a single

And Another Example Alphabet: {, } What language does this recognize? Warning: Very Hard. *?* ( )* (* ) ( *)+

Another Simple Example Alphabet still {, } The operation of the automaton is not completely defined by the input On input "" the automaton could be in either state

Epsilon Moves Another kind of transition: -moves

Deterministic and Non-deterministic Automata Deterministic Finite Automata (DFA) One transition per input per state No -moves Non-deterministic Finite Automata (NFA) Can have multiple transitions for one input in a given state Can have -moves Finite automata have finite memory Need only to encode the current state

Execution of Finite Automata A DFA can take only one path through the state graph Completely determined by the input NFSs can choose Whether to make -moves Which of multiple transitions for a single input to take

Acceptance of NFAs An NFA can get into multiple states Input: Rule: NFA accepts if it can get in a final state.

NFA vs. DFA () NFAs and DFAs recognize the same set of languages (regular languages) NFAs are easier to design You can just list all the rules you want to allow DFAs are easier to implement There are no choices to consider

NFA vs. DFA (2) For a given language the NFA can be simpler than the DFA NFA DFA DFA can be exponentially larger than the NFA

Automata to Regular Expression ( )*. Add new start and single end state 2. Connected new states with epsilon transitions 3. Remove one state at a time A. Replace each incoming transition (to the removed state) with a regex to possible outputs 4. Simplify regex if possible

Lexical Analysis Overview Non-deterministic Finite Automata Regular Expressions Deterministic Finite Automata Lexical Specification Table-driven Implementation of DFA

Regular Expressions to NFA () For each kind of regular expression, we can define an NFA. We wil do this by using recursive definition. Notation: NFA for regular expression A: A Base cases: For For input 'a' a

Regular Expressions to NFA (2) For AB A B For A B A B

Regular Expressions to NFA (3) For A* A How do we do "A+"? What about "A?"?

Example of RE to NFA Conversion Consider the regular expression ( )* The NFA is: A B C E D F G H I J

Lexical Analysis Overview Non-deterministic Finite Automata Regular Expressions Deterministic Finite Automata Lexical Specification Table-driven Implementation of DFA

NFA to DFA: The Trick Simulate the NFA Each state of DFA = a non-empty subset of states of the NFA Start state = the set of NFA states reachable through -moves from the NFA start state Add a transition (S, a) S' to DFA if and only if: S' is the set of NFA states reachable from the states in S after seeing the input a (remember to consider -moves as well!)

NFA to DFA Example A B C E D F G H I J FGABCDHI ABCDHI EJGABCDHI

NFA to DFA Remark An NFA may be in many states at any time How many different possible states are there? If there are n states, the NFA must be in some subset of those n states How many non-empty subsets are there? 2 n - = finitely many

Lexical Analysis Overview Non-deterministic Finite Automata Regular Expressions Deterministic Finite Automata Lexical Specification Table-driven Implementation of DFA

Table Implementation A DFA can be implemented by a 2-dimensional table T One dimension is "states" Other dimension is "input symbols" For every transition (Si, a) Sk define T[i, a] = k DFA "execution" If in state Si and input a, read T[i, a] = k and go to state Sk Very efficient

Table Implementation of DFA T S U S T U T T U U T U

Final notes about Implementation NFA to DFA conversion is at the heart of tools such as Flex But, DFAs can be huge In practice, lex-like tools trade off speed for space in the choice of NFA and DFA representations