Recognition of Tokens

Similar documents
Scope and Parameter Passing

Chapter 3 Lexical Analysis

Compiler course. Chapter 3 Lexical Analysis

2. λ is a regular expression and denotes the set {λ} 4. If r and s are regular expressions denoting the languages R and S, respectively

The Critical-Path Algorithm

Minimal Spanning Trees

Scheduling and Digraphs

The Pairwise-Comparison Method

JFlex Regular Expressions

Total Orders. Lecture 41 Section 8.5. Robb T. Koether. Hampden-Sydney College. Mon, Apr 8, 2013

Introduction to Compiler Design

Scope and Parameter Passing

LR Parsing - Conflicts

CD Assignment I. 1. Explain the various phases of the compiler with a simple example.

Boolean Expressions. Lecture 31 Sections 6.6, 6.7. Robb T. Koether. Hampden-Sydney College. Wed, Apr 8, 2015

Solving Recursive Sequences by Iteration

Rotations and Translations

Recursive Sequences. Lecture 24 Section 5.6. Robb T. Koether. Hampden-Sydney College. Wed, Feb 26, 2014

Recursive Descent Parsers

The Decreasing-Time Algorithm

LR Parsing - The Items

Pointers. Lecture 2 Sections Robb T. Koether. Hampden-Sydney College. Mon, Jan 20, 2014

Stack Applications. Lecture 27 Sections Robb T. Koether. Hampden-Sydney College. Wed, Mar 29, 2017

Compilers: table driven scanning Spring 2008

JFlex. Lecture 16 Section 3.5, JFlex Manual. Robb T. Koether. Hampden-Sydney College. Mon, Feb 23, 2015

Integer Overflow. Lecture 8 Section 2.5. Robb T. Koether. Hampden-Sydney College. Mon, Jan 27, 2014

The Traveling Salesman Problem Brute Force Method

Implementing Linked Lists

while Loops Lecture 13 Sections Robb T. Koether Wed, Sep 26, 2018 Hampden-Sydney College

Recursive Linked Lists

Chapter 3: Lexical Analysis

Density Curves Sections

Recursive Sequences. Lecture 24 Section 5.6. Robb T. Koether. Hampden-Sydney College. Wed, Feb 27, 2013

Sampling Distribution Examples Sections 15.4, 15.5

Lexical Analysis (ASU Ch 3, Fig 3.1)

Programming Languages

Lexical Analysis - 1. A. Overview A.a) Role of Lexical Analyzer

Pointers. Lecture 1 Sections Robb T. Koether. Hampden-Sydney College. Wed, Jan 14, 2015

CSE302: Compiler Design

Nondeterministic Programming in C++

Pointers. Lecture 2 Sections Robb T. Koether. Hampden-Sydney College. Fri, Jan 18, 2013

The Traveling Salesman Problem Nearest-Neighbor Algorithm

Recursion. Lecture 26 Sections , Robb T. Koether. Hampden-Sydney College. Mon, Apr 6, 2015

Stack Applications. Lecture 25 Sections Robb T. Koether. Hampden-Sydney College. Mon, Mar 30, 2015

Pointer Arithmetic. Lecture 4 Chapter 10. Robb T. Koether. Hampden-Sydney College. Wed, Jan 25, 2017

Fundamental Data Types

Array Lists. Lecture 15. Robb T. Koether. Hampden-Sydney College. Mon, Feb 22, 2016

Recursion. Lecture 2 Sections Robb T. Koether. Hampden-Sydney College. Wed, Jan 17, 2018

Friends and Unary Operators

The string Class. Lecture 21 Sections 2.9, 3.9, Robb T. Koether. Wed, Oct 17, Hampden-Sydney College

Projects for Compilers

PRINCIPLES OF COMPILER DESIGN UNIT II LEXICAL ANALYSIS 2.1 Lexical Analysis - The Role of the Lexical Analyzer

Boxplots. Lecture 17 Section Robb T. Koether. Hampden-Sydney College. Wed, Feb 10, 2010

Linked Lists. Lecture 16 Sections Robb T. Koether. Hampden-Sydney College. Wed, Feb 22, 2017

Programming Languages

The Plurality-with-Elimination Method

Array Lists. Lecture 15. Robb T. Koether. Hampden-Sydney College. Fri, Feb 16, 2018

Street-Routing Problems

Lexical Analysis. Chapter 1, Section Chapter 3, Section 3.1, 3.3, 3.4, 3.5 JFlex Manual

The Class Construct Part 1

Webpage Navigation. Lecture 27. Robb T. Koether. Hampden-Sydney College. Mon, Apr 2, 2018

Ambient and Diffuse Light

Operators. Lecture 12 Section Robb T. Koether. Hampden-Sydney College. Fri, Feb 9, 2018

Magnification and Minification

Function Usage. Lecture 15 Sections 6.3, 6.4. Robb T. Koether. Hampden-Sydney College. Mon, Oct 1, 2018

XQuery FLOWR Expressions Lecture 35

Prof. Mohamed Hamada Software Engineering Lab. The University of Aizu Japan

Displaying Distributions - Quantitative Variables

Stacks and their Applications

Abstract Syntax Trees Synthetic and Inherited Attributes

MySQL Creating a Database Lecture 3

Dynamic Allocation of Memory

Binary Tree Implementation

The Class Construct Part 2

The x86 Instruction Set

Binary Tree Implementation

DTDs and XML Attributes

Program Analysis ( 软件源代码分析技术 ) ZHENG LI ( 李征 )

XPath Lecture 34. Robb T. Koether. Hampden-Sydney College. Wed, Apr 11, 2012

Regular Expressions. Lecture 10 Sections Robb T. Koether. Hampden-Sydney College. Wed, Sep 14, 2016

UNIT -2 LEXICAL ANALYSIS

The Normal Distribution

1. Lexical Analysis Phase

The Constructors. Lecture 6 Sections Robb T. Koether. Hampden-Sydney College. Fri, Jan 26, 2018

The Constructors. Lecture 7 Sections Robb T. Koether. Hampden-Sydney College. Wed, Feb 1, 2017

The Coefficient of Determination

Assignment 1 (Lexical Analyzer)

Basic PHP. Lecture 19. Robb T. Koether. Hampden-Sydney College. Mon, Feb 26, 2108

The Graphics Pipeline

Assignment 1 (Lexical Analyzer)

Lexical Analysis. Dragon Book Chapter 3 Formal Languages Regular Expressions Finite Automata Theory Lexical Analysis using Automata

Aggregation. Lecture 7 Section Robb T. Koether. Hampden-Sydney College. Wed, Jan 29, 2014

List Iterator Implementation

Inheritance: The Fundamental Functions

The Graphics Pipeline

Formal Languages and Compilers Lecture VI: Lexical Analysis

Introduction to Databases

The CYK Parsing Algorithm

2010: Compilers REVIEW: REGULAR EXPRESSIONS HOW TO USE REGULAR EXPRESSIONS

Binary Tree Applications

Transcription:

Recognition of Tokens Lecture 3 Section 3.4 Robb T. Koether Hampden-Sydney College Mon, Jan 19, 2015 Robb T. Koether (Hampden-Sydney College) Recognition of Tokens Mon, Jan 19, 2015 1 / 21

1 A Class of Tokens 2 The Input Buffer 3 Transition Diagrams 4 Writing the Lexer 5 Assignment Robb T. Koether (Hampden-Sydney College) Recognition of Tokens Mon, Jan 19, 2015 2 / 21

Outline 1 A Class of Tokens 2 The Input Buffer 3 Transition Diagrams 4 Writing the Lexer 5 Assignment Robb T. Koether (Hampden-Sydney College) Recognition of Tokens Mon, Jan 19, 2015 3 / 21

A Class of Tokens We will explore and demonstrate the concepts of a lexer by using a simple class of tokens. digit [0-9] digits digit + number digits (. digits)? (E [+-]? digits)? letter [A-Za-z] id letter (letter digit) if if then then else else relop < > <= >= = <> Robb T. Koether (Hampden-Sydney College) Recognition of Tokens Mon, Jan 19, 2015 4 / 21

Whitespace In addition to recognizing tokens, the lexer must strip whitespace from the input. Whitespace is not a token, but it must be recognized by the lexer. ws (blank tab newline) + Robb T. Koether (Hampden-Sydney College) Recognition of Tokens Mon, Jan 19, 2015 5 / 21

Outline 1 A Class of Tokens 2 The Input Buffer 3 Transition Diagrams 4 Writing the Lexer 5 Assignment Robb T. Koether (Hampden-Sydney College) Recognition of Tokens Mon, Jan 19, 2015 6 / 21

The Input Buffer The input to the lexer is a stream of characters. We may consider the characters to be residing in a buffer. We mark two positions in the buffer. lexemebegin forward The pointer lexemebegin holds the starting position of the current token. The pointer forward points to the current symbol. Robb T. Koether (Hampden-Sydney College) Recognition of Tokens Mon, Jan 19, 2015 7 / 21

The Input Buffer The lexer begins in the start state with the current symbol (pointed to by both lexemebegin and forward). The process moves from state to state by following the transitions whose labels match the current symbol (forward). This continues until no further moves are possible. Robb T. Koether (Hampden-Sydney College) Recognition of Tokens Mon, Jan 19, 2015 8 / 21

The Input Buffer Example (Processing a Statement) forward i n t c o u n t = 0 ; lexemebegin The input buffer Robb T. Koether (Hampden-Sydney College) Recognition of Tokens Mon, Jan 19, 2015 9 / 21

The Input Buffer Example (Processing a Statement) forward i n t c o u n t = 0 ; lexemebegin Advance one symbol Robb T. Koether (Hampden-Sydney College) Recognition of Tokens Mon, Jan 19, 2015 9 / 21

The Input Buffer Example (Processing a Statement) forward i n t c o u n t = 0 ; lexemebegin Could be an identifier; could be a keyword Robb T. Koether (Hampden-Sydney College) Recognition of Tokens Mon, Jan 19, 2015 9 / 21

The Input Buffer Example (Processing a Statement) forward i n t c o u n t = 0 ; lexemebegin It is the keyword int Robb T. Koether (Hampden-Sydney College) Recognition of Tokens Mon, Jan 19, 2015 9 / 21

The Input Buffer Example (Processing a Statement) forward i n t c o u n t = 0 ; lexemebegin Skip whitespace Robb T. Koether (Hampden-Sydney College) Recognition of Tokens Mon, Jan 19, 2015 9 / 21

The Input Buffer Example (Processing a Statement) forward i n t c o u n t = 0 ; lexemebegin Could be an identifier; could be a keyword Robb T. Koether (Hampden-Sydney College) Recognition of Tokens Mon, Jan 19, 2015 9 / 21

The Input Buffer Example (Processing a Statement) forward i n t c o u n t = 0 ; lexemebegin It is an identifier Robb T. Koether (Hampden-Sydney College) Recognition of Tokens Mon, Jan 19, 2015 9 / 21

The Input Buffer Example (Processing a Statement) forward i n t c o u n t = 0 ; lexemebegin This is an operator, but which one? Robb T. Koether (Hampden-Sydney College) Recognition of Tokens Mon, Jan 19, 2015 9 / 21

The Input Buffer Example (Processing a Statement) forward i n t c o u n t = 0 ; lexemebegin It is the assignment operator Robb T. Koether (Hampden-Sydney College) Recognition of Tokens Mon, Jan 19, 2015 9 / 21

Outline 1 A Class of Tokens 2 The Input Buffer 3 Transition Diagrams 4 Writing the Lexer 5 Assignment Robb T. Koether (Hampden-Sydney College) Recognition of Tokens Mon, Jan 19, 2015 10 / 21

Transition Diagrams Definition (Transition Diagram) A transition diagram is a directed graph. It consists of a finite set of nodes, called states. One state is designated the start state. The directed edges between states represent transitions. Each transition is labeled with a symbol (or possibly a regular expression). A subset of the set of states is designated the accepting states. The remaining states are rejecting states. Robb T. Koether (Hampden-Sydney College) Recognition of Tokens Mon, Jan 19, 2015 11 / 21

Transition Diagrams Example (Relational Operators) Consider the relational operators <, >, <=, >=, =, and <>. The first symbol may be <, >, =, or something else. If the first symbol is <, then the next symbol may be =, >, or something else. If the first symbol is >, then the next symbol may be = or something else. If the first symbol is =, then the next symbol is something else. Robb T. Koether (Hampden-Sydney College) Recognition of Tokens Mon, Jan 19, 2015 12 / 21

Transition Diagrams Example (Relational Operators) = 4 LE 1 > 5 NE < other 6 LT and retract 0 = 2 EQ > 3 = 7 GE other 8 GT and retract Robb T. Koether (Hampden-Sydney College) Recognition of Tokens Mon, Jan 19, 2015 13 / 21

Transition Diagrams Example (Identifiers) letter digit 9 letter other 10 11 identifier and retract Robb T. Koether (Hampden-Sydney College) Recognition of Tokens Mon, Jan 19, 2015 14 / 21

Transition Diagrams Example (Keywords) i f other keyword and retract t h e n other keyword and retract e l s e other keyword and retract Robb T. Koether (Hampden-Sydney College) Recognition of Tokens Mon, Jan 19, 2015 15 / 21

Transition Diagrams Draw the transition diagram for numbers digit [0-9] digits digit + number digits (. digits)? (E [+-]? digits)? Robb T. Koether (Hampden-Sydney College) Recognition of Tokens Mon, Jan 19, 2015 16 / 21

Outline 1 A Class of Tokens 2 The Input Buffer 3 Transition Diagrams 4 Writing the Lexer 5 Assignment Robb T. Koether (Hampden-Sydney College) Recognition of Tokens Mon, Jan 19, 2015 17 / 21

Writing the Lexer The lexer is the program that implements the transition diagram. We could use A switch statement, and/or An if-else structure. Robb T. Koether (Hampden-Sydney College) Recognition of Tokens Mon, Jan 19, 2015 18 / 21

Writing the Lexer The Lexer for Relational Operators Token getrelop() { int state = 0; char c = get_next_symbol(); while (c == < c == = c == > ) { switch (state) { case 0: if (c == < ) state = 1; else if (c == = ) state = 2; else if (c == > ) state = 3; else fail(); break;... case 8: retract(); return Token(GT); } } } Robb T. Koether (Hampden-Sydney College) Recognition of Tokens Mon, Jan 19, 2015 19 / 21

Outline 1 A Class of Tokens 2 The Input Buffer 3 Transition Diagrams 4 Writing the Lexer 5 Assignment Robb T. Koether (Hampden-Sydney College) Recognition of Tokens Mon, Jan 19, 2015 20 / 21

Assignment Assignment Read Section 3.4. Exercises 1, 2(c)(i). Robb T. Koether (Hampden-Sydney College) Recognition of Tokens Mon, Jan 19, 2015 21 / 21