# 2010: Compilers

Lexical Analysis: Finite State Automata. Dr. Licia Capra, UCL/CS

REVIEW: REGULAR EXPRESSIONS

- `a`: a character in the alphabet A
- `ε`: the empty string
- `R|S`: alternation (either R or S)
- `RS`: concatenation (R followed by S)
- `R*`: repetition (zero or more R)
- `R+` = `RR*` (one or more R)
- `R?` = `(R|ε)` (zero or one R)
- `[abcd]` = `a|b|c|d` (any of the listed)
- `[a-z]` = `a|b|c|...|y|z` (character range)
- `[^ab]` = `c|d|...|y|z` (anything but the listed)
- `[^a-x]` = `y|z` (anything but the range)

HOW TO USE REGULAR EXPRESSIONS

We need a mechanism to determine whether an input string s belongs to the language L denoted by a regular expression R. An acceptor takes the input string s and the language L and answers:

- Yes, if s is in L
- No, if s is not in L

OUTLINE

The lexical analyser turns the source program (a stream of characters) into a list of tokens:

`i f ( b = = 0 ) a = b ;` → Lexical Analyser → `if` `(` `b` `==` `0` `)` `a` `=` `b` `;`

Topics: Regular Expressions, Finite State Automata, Lexer Generator.

FINITE STATE AUTOMATA

A finite state automaton consists of:

- A finite set of states
- Edges (transitions) between states, each labelled with a symbol
- A start state
- A set of final (accepting) states

There are two kinds of finite automata:

- Deterministic finite automata (DFA): the transition from the current state is uniquely determined by the current input character
- Non-deterministic finite automata (NFA): there may be multiple possible choices, or some transitions do not depend on the input character

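DFA EXAMPLE

A DFA that accepts the strings in the language denoted by the regular expression `ab*a`.

[Graph: state 0 goes to state 1 on a; state 1 loops on b and goes to state 2 on a; state 2 is final.]

Examples: abba (accepted), aaa (rejected), ab (rejected)

Transition table:

| State | a | b |
|---|---|---|
| 0 | 1 | error |
| 1 | 2 | 1 |
| 2 | error | error |

```c
/* nextchar() is the slide's input routine; <token> is the slide's placeholder. */
int state = 0;
int error = 0;
int c;
while (!error && (c = nextchar()) != EOF) {
    switch (state) {
    case 0:
        if (c == 'a') state = 1;
        else error = 1;
        break;
    case 1:
        if (c == 'b') state = 1;
        else if (c == 'a') state = 2;
        else error = 1;
        break;
    case 2:
        error = 1;   /* no transitions leave the final state */
        break;
    }
}
if (state == 2 && !error) return <token>;
```

MORE DFA EXAMPLES

[Figure: three DFAs. IF token: edges labelled i, then f. NUM token. ID token: an edge labelled _, a-z, A-Z from state 1 to state 2, and a loop on state 2 labelled _, a-z, A-Z, 0-9.]

NFA DEFINITION

A non-deterministic finite state automaton (NFA) is an automaton whose state transitions are such that:

- There may be ε-transitions (i.e., transitions that do not consume input characters)
- There may be multiple transitions from the same state on the same input character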
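The transition table for `ab*a` can also drive the automaton directly, instead of hard-coding the transitions in a `switch`. The following is a minimal sketch under stated assumptions: the function name `accepts_ab_star_a`, the `ERR` marker, and reading the input from a NUL-terminated string (rather than the slide's `nextchar()`) are illustrative choices, not part of the slides.

```c
#include <stddef.h>

/* Transition table for the ab*a DFA: rows are states 0..2, columns are
   the symbols a and b; ERR stands for the "error" entries of the table. */
enum { ERR = -1 };
static const int delta[3][2] = {
    /* state 0 */ { 1,   ERR },
    /* state 1 */ { 2,   1   },
    /* state 2 */ { ERR, ERR },
};

/* Hypothetical helper (not from the slides): returns 1 if the
   NUL-terminated string s is accepted by the DFA, 0 otherwise. */
int accepts_ab_star_a(const char *s)
{
    int state = 0;                       /* start state */
    for (size_t i = 0; s[i] != '\0'; i++) {
        int col;
        if (s[i] == 'a') col = 0;
        else if (s[i] == 'b') col = 1;
        else return 0;                   /* symbol outside the alphabet */
        state = delta[state][col];
        if (state == ERR) return 0;      /* no transition: reject */
    }
    return state == 2;                   /* state 2 is the only final state */
}
```

Calling `accepts_ab_star_a("abba")` walks states 0, 1, 1, 1, 2 and accepts; changing the language only requires changing the table, not the loop.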

NFA EXAMPLE

An NFA that accepts the strings in the language denoted by the regular expression `ab*a`.

[Graph: NFA with edges labelled a, b, and a; diagram not preserved.]

Example: abba

DFA vs NFA

DFA:

- The action is fully determined on each input symbol
- A string is accepted if we can go from the initial state to a final state while reading the string
- Obvious table-driven implementation

NFA:

- There may be a choice at each step (which path should I take?)
- A string is accepted if there is any path that leads to acceptance (the automaton must guess correctly)
- Difficult to implement

BUILDING A LEXER IN STEPS

- Programming language L
- Regular expression describing all valid tokens: `(while|if|for|else|int|char|([a-zA-Z_][a-zA-Z0-9_]*)|(-?[0-9]+))*`
- DFA recognising tokens in L: ???

There is no obvious way to write by hand a DFA that accepts programs in L. A 3-step solution builds the DFA automatically:

- Step 1: from RE to NFA
- Step 2: from NFA to DFA
- Step 3: from DFA to minimised DFA

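BUILDING A LEXER IN STEPS

- STEP 1: FROM RE TO NFA
- STEP 2: FROM NFA TO DFA
- STEP 3: FROM DFA TO min DFA

STEP 1: FROM RE TO NFA

Strategy: build the finite automaton inductively, based on the definition of RE.

- `ε` (empty string) [diagram not preserved]
- `a` (character): an edge labelled a [diagram not preserved]
- Alternation `R|S`: built from the R automaton and the S automaton [diagram not preserved]
- Concatenation `RS`: built from the R automaton and the S automaton [diagram not preserved]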

STEP 1: FROM RE TO NFA

- Kleene star `R*`: built from the R automaton [diagram not preserved]

Note: an NFA only needs one final state (why?)

STEP 1: FROM RE TO NFA - EXAMPLE

A = {a, b}, R = `(ab|ba)*`

EXERCISE

Write the NFA that recognises the strings described by the following RE: `(a*|b*)*`. Simulate its execution on input ababbab.
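The inductive strategy of Step 1 can be sketched in code: one small function per RE operator, each returning an NFA fragment with a single start state and a single final state. The diagrams on the slides were not preserved, so the wiring below follows the widely used Thompson-style construction, which may differ in detail from the slides. All names (`Frag`, `re_char`, `re_cat`, `re_alt`, `re_star`, `nfa_accepts`, `build_ab_or_ba_star`) are assumptions made for illustration. The small simulator at the end is included only to check the constructed NFA.

```c
#include <stdint.h>

enum { MAX_EDGES = 64, EPS = 0 };        /* EPS (0) labels an ε-transition */

typedef struct { int from, sym, to; } Edge;
typedef struct { int start, final; } Frag;   /* one start, one final state */

static Edge edges[MAX_EDGES];
static int n_edges = 0, n_states = 0;        /* at most 64 states (bitmask) */

static int new_state(void) { return n_states++; }
static void add_edge(int from, int sym, int to)
{
    edges[n_edges++] = (Edge){ from, sym, to };
}

/* a : start --a--> final */
Frag re_char(int c)
{
    Frag f;
    f.start = new_state();
    f.final = new_state();
    add_edge(f.start, c, f.final);
    return f;
}

/* RS : the final state of R is linked to the start of S by ε */
Frag re_cat(Frag r, Frag s)
{
    add_edge(r.final, EPS, s.start);
    return (Frag){ r.start, s.final };
}

/* R|S : a new start branches by ε into R and S; both join a new final */
Frag re_alt(Frag r, Frag s)
{
    Frag f;
    f.start = new_state();
    f.final = new_state();
    add_edge(f.start, EPS, r.start);
    add_edge(f.start, EPS, s.start);
    add_edge(r.final, EPS, f.final);
    add_edge(s.final, EPS, f.final);
    return f;
}

/* R* : R may be skipped entirely or repeated any number of times */
Frag re_star(Frag r)
{
    Frag f;
    f.start = new_state();
    f.final = new_state();
    add_edge(f.start, EPS, r.start);
    add_edge(f.start, EPS, f.final);
    add_edge(r.final, EPS, r.start);
    add_edge(r.final, EPS, f.final);
    return f;
}

/* Add every state reachable through ε-edges to the bitmask `set`. */
static uint64_t closure(uint64_t set)
{
    int changed = 1;
    while (changed) {
        changed = 0;
        for (int i = 0; i < n_edges; i++)
            if (edges[i].sym == EPS && (set >> edges[i].from & 1)
                && !(set >> edges[i].to & 1)) {
                set |= (uint64_t)1 << edges[i].to;
                changed = 1;
            }
    }
    return set;
}

/* Checker: follow all paths in parallel, accept if the final state is reached. */
int nfa_accepts(Frag f, const char *s)
{
    uint64_t cur = closure((uint64_t)1 << f.start);
    for (; *s; s++) {
        uint64_t next = 0;
        for (int i = 0; i < n_edges; i++)
            if (edges[i].sym == *s && (cur >> edges[i].from & 1))
                next |= (uint64_t)1 << edges[i].to;
        cur = closure(next);
    }
    return (int)(cur >> f.final & 1);
}

/* The slide's example R = (ab|ba)* built bottom-up. */
Frag build_ab_or_ba_star(void)
{
    Frag ab = re_cat(re_char('a'), re_char('b'));
    Frag ba = re_cat(re_char('b'), re_char('a'));
    return re_star(re_alt(ab, ba));
}
```

Because every fragment has exactly one final state, the operators can be composed without special cases, which is one way to read the slide's note that an NFA only needs one final state.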

STEP 2: FROM NFA TO DFA

Problem: how do we execute an NFA? A string is accepted if there is any path that leads to acceptance. How can the automaton guess correctly?

Solution: search all paths consistent with the string. If there is any path that accepts the string, we will find it.

Idea: search the paths in parallel.

- Keep track of the set of NFA states we could be in after seeing some string prefix
- Compute the set of possible states we could move to when reading the next input character

STEP 2: FROM NFA TO DFA

For a state s, a set of states T, and a character a:

- ε-closure(s) = the set of states reachable from state s using only ε-transitions (including s itself)
- $\varepsilon\text{-closure}(T) = \bigcup_{s \in T} \varepsilon\text{-closure}(s)$
- edge(T, a) = the set of states reachable with a transition labelled a from any state in T
- DFAedge(T, a) = ε-closure(edge(T, a))

STEP 2: FROM NFA TO DFA

- DFA initial state = ε-closure({NFA initial state})
- For each DFA state S
  - For each character x in A
    - S' = DFAedge(S, x)
    - Add an edge (S, S') labelled with character x to the DFA
- For each DFA state S
  - If S contains an NFA final state, mark S as a DFA final state

STEP 2: FROM NFA TO DFA - EXAMPLE

A = {a, b}, R = `(ab|ba)*`

EXERCISES

Given the following regular expressions R, build the NFAs that recognise L(R) and then convert them to DFAs:

- R = `(a|b)*`
- R = `b*(ab|ba)b*`
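The Step 2 construction can be sketched with sets of NFA states stored as bitmasks. The NFA below is a hand-written five-state automaton for `(ab|ba)*`, chosen for illustration; it is an assumption and not the automaton drawn on the slides. The function names `closure`, `edge`, and `dfa_edge` mirror ε-closure, edge, and DFAedge; everything else (`find_or_add`, `subset_construction`, `dfa_accepts`, the array names) is hypothetical.

```c
#include <stdint.h>

enum { N_NFA = 5, N_SYM = 2, MAX_DFA = 32 };   /* symbols: 0 = a, 1 = b */

/* Hypothetical NFA for (ab|ba)*: 0 -a-> 1 -b-> 2 -ε-> 0 and
   0 -b-> 3 -a-> 4 -ε-> 0; state 0 is both initial and final.
   A set of NFA states is a bitmask: bit i set means state i is in the set. */
static const uint32_t eps[N_NFA] = { 0, 0, 1u << 0, 0, 1u << 0 };
static const uint32_t step[N_NFA][N_SYM] = {
    /* 0 */ { 1u << 1, 1u << 3 },
    /* 1 */ { 0,       1u << 2 },
    /* 2 */ { 0,       0       },
    /* 3 */ { 1u << 4, 0       },
    /* 4 */ { 0,       0       },
};
static const uint32_t nfa_final = 1u << 0;

/* ε-closure(T): keep adding ε-successors until nothing changes. */
static uint32_t closure(uint32_t T)
{
    uint32_t prev;
    do {
        prev = T;
        for (int s = 0; s < N_NFA; s++)
            if (T >> s & 1) T |= eps[s];
    } while (T != prev);
    return T;
}

/* edge(T, a): states reachable on symbol a from any state in T. */
static uint32_t edge(uint32_t T, int a)
{
    uint32_t out = 0;
    for (int s = 0; s < N_NFA; s++)
        if (T >> s & 1) out |= step[s][a];
    return out;
}

static uint32_t dfa_edge(uint32_t T, int a) { return closure(edge(T, a)); }

/* Result of the construction. */
uint32_t dfa_set[MAX_DFA];          /* NFA-state set behind each DFA state */
int dfa_trans[MAX_DFA][N_SYM];
int dfa_is_final[MAX_DFA];
int n_dfa;

/* Return the index of the DFA state for set T, creating it if needed. */
static int find_or_add(uint32_t T)
{
    for (int i = 0; i < n_dfa; i++)
        if (dfa_set[i] == T) return i;
    dfa_set[n_dfa] = T;
    return n_dfa++;
}

void subset_construction(void)
{
    n_dfa = 0;
    find_or_add(closure(1u << 0));            /* DFA initial state */
    for (int S = 0; S < n_dfa; S++)           /* n_dfa grows as states appear */
        for (int x = 0; x < N_SYM; x++)
            dfa_trans[S][x] = find_or_add(dfa_edge(dfa_set[S], x));
    for (int S = 0; S < n_dfa; S++)
        dfa_is_final[S] = (dfa_set[S] & nfa_final) != 0;
}

/* Run the generated DFA; call subset_construction() first. */
int dfa_accepts(const char *s)
{
    int S = 0;
    for (; *s; s++) {
        if (*s != 'a' && *s != 'b') return 0;
        S = dfa_trans[S][*s - 'a'];
    }
    return dfa_is_final[S];
}
```

In this sketch the empty set of NFA states becomes an ordinary DFA state (a dead state that loops to itself), so the resulting transition table has no missing entries.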

STEP 3: MINIMISATION ALGORITHM

The DFA built automatically from the NFA is not necessarily minimal, i.e. it may contain more states than necessary.

Minimisation algorithm: converts a DFA into another DFA that recognises the same language and has the minimum number of states.

STEP 3: STATE MINIMISATION

Idea: find groups of equivalent states.

- For each input symbol, all transitions from the states in one group G1 go to states in the same group G2
- Construct the minimised DFA so that there is one state for each group of states of the initial DFA

STEP 3: DFA MINIMISATION ALGORITHM

STEP 1: Construct a partition P of the set S of states of the original DFA, with 2 groups:

- F = the set of final states
- S - F = the set of non-final states

STEP 2: Repeat

- Let P = G1 ∪ G2 ∪ ... ∪ Gn be the current partition
- Partition each group Gi into subgroups such that s and t are in the same subgroup if, for each symbol a in A, there are transitions s →a s' and t →a t', and s' and t' are in the same group Gj
- Combine the computed subgroups into a new partition P'

until P' == P (otherwise continue with P' as the current partition).

STEP 3: Construct a DFA with one state for each group of states in the final partition.

STEP 3: FROM DFA TO MIN DFA - EXAMPLE

R = `(ab|ba)*`

EXERCISE

Given the regular expression R = `b*(ab|ba)b*`, build a minimal DFA that recognises L(R):

- Step 1: from R to NFA
- Step 2: from NFA to DFA
- Step 3: from DFA to minimal DFA
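The partition refinement above can be sketched as follows. The input is a hand-written six-state DFA for `(ab|ba)*` with an explicit dead state; it is an assumption made for illustration, not the DFA from the slides. The names `group`, `same_subgroup`, `minimise`, and `min_trans` are likewise hypothetical. Since refinement only ever splits groups, the sketch detects P' == P by checking that the number of groups stopped growing.

```c
enum { N = 6, N_SYM = 2 };               /* symbols: 0 = a, 1 = b */

/* Hypothetical DFA for (ab|ba)*: states 0, 4, 5 are final, 3 is dead. */
static const int trans[N][N_SYM] = {
    {1, 2}, {3, 4}, {5, 3}, {3, 3}, {1, 2}, {1, 2}
};
static const int is_final[N] = { 1, 0, 0, 0, 1, 1 };

int group[N];               /* group[s] = index of the group containing s */
int min_trans[N][N_SYM];    /* transitions of the minimised DFA */

/* s and t stay together if they are in the same group now and, for every
   symbol, their successors lie in the same group as each other. */
static int same_subgroup(int s, int t)
{
    if (group[s] != group[t]) return 0;
    for (int a = 0; a < N_SYM; a++)
        if (group[trans[s][a]] != group[trans[t][a]]) return 0;
    return 1;
}

/* Returns the number of states of the minimised DFA. */
int minimise(void)
{
    int n_groups = 0;

    /* Step 1: initial partition {F, S - F}. */
    for (int s = 0; s < N; s++)
        group[s] = is_final[s] ? 0 : 1;

    /* Step 2: refine until the partition no longer changes. */
    for (;;) {
        int next[N];
        int n_new = 0;
        for (int s = 0; s < N; s++) {
            next[s] = -1;
            for (int t = 0; t < s; t++)
                if (same_subgroup(s, t)) { next[s] = next[t]; break; }
            if (next[s] < 0) next[s] = n_new++;   /* s opens a new subgroup */
        }
        for (int s = 0; s < N; s++) group[s] = next[s];
        if (n_new == n_groups) break;             /* P' == P */
        n_groups = n_new;
    }

    /* Step 3: one state per group; any member gives the group's edges. */
    for (int s = 0; s < N; s++)
        for (int a = 0; a < N_SYM; a++)
            min_trans[group[s]][a] = group[trans[s][a]];

    return n_groups;
}
```

On this input the three final states collapse into one group, so `minimise()` reduces six states to four.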

PUTTING THE PIECES TOGETHER

1. Regular expression R → RE to NFA conversion → NFA
2. NFA → NFA to DFA conversion → DFA
3. DFA → DFA optimisation → minimised DFA
4. Minimised DFA + input string s → DFA simulation → Yes, if s is in L(R); No, if s is not in L(R)

LEXICAL ANALYSERS VS ACCEPTORS

Lexical analysers use the same mechanism, but they:

- Have multiple REs describing multiple tokens

LEXICAL ANALYSERS

Handling multiple REs: the NFAs of all the regular expressions R1, ..., Rn must be combined into a single finite automaton.

[Figure: Keywords, Numbers, Identifiers, and Whitespaces combined into one minimised DFA.]

LEXICAL ANALYSERS VS ACCEPTORS

Lexical analysers use the same mechanism, but they:

- Have multiple REs describing multiple tokens
- Have a character stream in input
- Return a sequence of matching tokens or an error
- Always return the longest matching token
- For multiple longest matching tokens, use rule priorities

LEXICAL ANALYSERS

Input/output stream:

- Associate tokens with final states
- Output the corresponding token when reaching a final state

[Figure: Keywords, Numbers, Identifiers, and Whitespaces combined into one minimised DFA.]

LEXICAL ANALYSERS

Longest match

- When in a final state, check whether there are further transitions; if not, return the token for the current final state

Rule priority

- Applies when a final state corresponds to multiple tokens, all matching the same length of input
- Associate the final state with the token that has the highest priority

AUTOMATING LEXICAL ANALYSIS

The whole lexical analysis process can be automated! We only need to specify:

- Regular expressions for the tokens
- Rule priorities for the cases with multiple longest matches

JLex/JFlex = Lexical Analyser Generators
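Longest match and rule priority can be sketched with a small hand-written DFA for four token kinds: the keyword `if`, identifiers, numbers, and whitespace. This is an illustrative assumption, not the output of JLex/JFlex, and the names `Token`, `token_of`, `next_state`, and `next_token` are hypothetical. The sketch uses the common variant of the longest-match rule: it remembers the last final state seen and keeps reading until no transition is possible, which also copes with final states that do have further transitions.

```c
#include <ctype.h>

typedef enum { T_NONE, T_IF, T_ID, T_NUM, T_WS } Token;

/* Token attached to each DFA state (T_NONE = not a final state).
   State 2 ("if" has been read) matches both the keyword rule and the
   identifier rule; rule priority gives it to the keyword. */
static const Token token_of[6] = { T_NONE, T_ID, T_IF, T_ID, T_NUM, T_WS };

static int is_id_char(int c) { return isalnum(c) || c == '_'; }

/* Transition function of the combined DFA; -1 means "no transition". */
static int next_state(int state, int c)
{
    switch (state) {
    case 0:                                  /* start */
        if (c == 'i') return 1;
        if (isalpha(c) || c == '_') return 3;
        if (isdigit(c)) return 4;
        if (isspace(c)) return 5;
        return -1;
    case 1:                                  /* read "i" */
        if (c == 'f') return 2;
        return is_id_char(c) ? 3 : -1;
    case 2:                                  /* read "if" */
    case 3:                                  /* inside an identifier */
        return is_id_char(c) ? 3 : -1;
    case 4:                                  /* inside a number */
        return isdigit(c) ? 4 : -1;
    case 5:                                  /* inside whitespace */
        return isspace(c) ? 5 : -1;
    }
    return -1;
}

/* Scan one token at the start of `in`. Returns its kind and stores the
   lexeme length in *len; returns T_NONE with *len == 0 if nothing matches. */
Token next_token(const char *in, int *len)
{
    int state = 0, i = 0;
    Token last = T_NONE;
    *len = 0;
    while (in[i] != '\0') {
        state = next_state(state, (unsigned char)in[i]);
        if (state < 0) break;                /* stuck: stop reading */
        i++;
        if (token_of[state] != T_NONE) {     /* remember the longest match */
            last = token_of[state];
            *len = i;
        }
    }
    return last;
}
```

For example, on the input `iffy = 1` the scanner passes through the `if` state but continues, so the longest match wins and the result is an identifier of length 4; on `if(` it stops after two characters and the keyword rule takes priority over the identifier rule.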
