JFlex. Lecture 16 Section 3.5, JFlex Manual. Robb T. Koether. Hampden-Sydney College. Mon, Feb 23, 2015


Outline
1 Introduction
2 JFlex User Code
3 JFlex Directives
4 JFlex Rules
5 Running JFlex
6 Assignment


JFlex
JFlex is a lexical analyzer generator in Java. It is based on the lexical analyzer generator JLex, which in turn is based on the lexical analyzer generator lex. The GNU lexical analyzer generator flex is also based on lex. JFlex reads a description of a set of tokens and outputs a Java program that will process those tokens.

JFlex Overview
JFlex will create a Java program Yylex.java, equivalent to our program MyLexer.java. The Yylex class contains a function yylex(), which will return the next token from the input file, equivalent to our function next_token(). Each token is described by a regular expression. We provide actions to be taken when Yylex matches a token.

JFlex Overview
In fact, we can tell JFlex to rename Yylex.java as MyLexer.java (with the %class directive) and to rename yylex() as next_token() (with the %function directive).

The JFlex Input File
We will use the extension .flex for the input files to JFlex. The file is divided into three parts: user code, directives, and rules. These three sections are separated by %%.


JFlex User Code
Any code written in the user-code section is copied directly into the Java source file created by JFlex. This code is included before the definition of the Yylex class. Typically, this section is used for import statements.


JFlex Directives
Any code bracketed within %{ and %} is copied directly into the Yylex class, at the beginning of the class. Although this code is incorporated into the Yylex class, it is not incorporated into any Yylex member function. Thus, we may define Yylex class variables or additional member functions.

The init and eof Directives
Code bracketed within %init{ and %init} is copied into the Yylex constructor. Code bracketed within %eof{ and %eof} is copied into the Yylex function yy_do_eof(), which is called exactly once upon end of file.

JFlex Token Types
Unless we specify otherwise, the data type of the returned tokens is Yytoken. This class is not created automatically. We may change the return type to int by typing one of the directives %integer or %int. We may change the return type to Integer by typing the directive %intwrap. We may set the return type to any other type by using the directive %type type.

JFlex Token Types and EOF
If the return type is Yytoken or Integer, then the EOF token is null. If the return type is int, then the EOF token is -1. For any other type, we need to specify the EOF value.

JFlex Token Types and EOF
By using the %eofval directive, we may indicate what value to return upon EOF. We write

    %eofval{
      return new type(eof-value);
    %eofval}

JFlex debug Directive
If we include the %debug directive, then JFlex will create a main() function in the Yylex class, allowing us to run the lexer without an attached parser. The main() function will receive an input filename from the command line. It will process the tokens in that file and report information about each token.


JFlex Rules
Each JFlex rule consists of a regular expression and an action to be taken when the expression is matched. The associated action is a segment of Java code, enclosed in braces { }. Typically, the action will be to return the appropriate token.
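The pairing of regular expressions with actions can be sketched in plain Java. This is only an illustration under stated assumptions: the rule names, the RuleTableDemo class, and the scan() method are hypothetical, java.util.regex stands in for JFlex's own pattern syntax, and the "action" is reduced to recording the token name. The sketch picks the longest match and, on a tie, the rule listed first, which is also how JFlex resolves competing rules; the code JFlex actually generates works differently (it runs a table-driven automaton).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RuleTableDemo {
    // Hypothetical rule table: token name -> regular expression, kept in rule order.
    static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("IF", Pattern.compile("if"));
        RULES.put("ID", Pattern.compile("[A-Za-z][A-Za-z0-9]*"));
        RULES.put("NUM", Pattern.compile("[0-9]+"));
        RULES.put("WS", Pattern.compile("[ \\t\\r\\n]+"));
    }

    // Splits the input into tokens; whitespace is matched but not reported.
    static List<String> scan(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestName = null;
            int bestLen = 0;
            for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
                Matcher m = rule.getValue().matcher(input);
                m.region(pos, input.length());
                // Strictly longer matches win, so an earlier rule wins a tie.
                if (m.lookingAt() && m.end() - pos > bestLen) {
                    bestLen = m.end() - pos;
                    bestName = rule.getKey();
                }
            }
            if (bestName == null) {
                throw new IllegalArgumentException("no rule matches at position " + pos);
            }
            if (!bestName.equals("WS")) {
                // The "action": record the token name and the matched text.
                tokens.add(bestName + ":" + input.substring(pos, pos + bestLen));
            }
            pos += bestLen;
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(scan("if iffy 42")); // prints [IF:if, ID:iffy, NUM:42]
    }
}
```

Note that "if" is reported as IF because it is listed before ID, while "iffy" is reported as ID because the longer match wins.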

JFlex Regular Expressions
Regular expressions are expressed using ASCII characters (32-127). The following characters are metacharacters:

    ? * + | ( ) ^ $ . [ ] { } " \

Metacharacters have special meaning; they do not represent themselves. All other characters represent themselves.

JFlex Regular Expressions

    Regular Expression    Matches
    r                     One occurrence of r
    r?                    Zero or one occurrence of r
    r*                    Zero or more occurrences of r
    r+                    One or more occurrences of r
    r | s                 r or s
    rs                    r concatenated with s

Here r and s are regular expressions.
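The operators in this table behave the same way in Java's java.util.regex package, so they can be tried out without generating a lexer. The following is a minimal sketch, not JFlex code: the OperatorDemo class and its matches() helper are hypothetical names, and java.util.regex is used only as a stand-in because these particular operators mean the same thing there.

```java
import java.util.regex.Pattern;

public class OperatorDemo {
    // True when the whole input string matches the regular expression.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("ab?", "a"));   // true:  r? allows zero occurrences of b
        System.out.println(matches("ab?", "abb")); // false: r? allows at most one b
        System.out.println(matches("a*", ""));     // true:  r* allows zero occurrences
        System.out.println(matches("a+", ""));     // false: r+ needs at least one
        System.out.println(matches("a|b", "b"));   // true:  r | s matches either one
        System.out.println(matches("ab", "ab"));   // true:  rs is r followed by s
    }
}
```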

JFlex Regular Expressions
Parentheses are used for grouping. The expression ("+" | "-")? represents an optional plus or minus sign. If a regular expression begins with ^, then it is matched only at the beginning of a line. If a regular expression ends with $, then it is matched only at the end of a line. The dot . matches any non-newline character.
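Grouping, anchors, and the dot can be checked in the same way with java.util.regex. This is a sketch under assumptions: GroupingDemo and its method names are hypothetical, the plus sign is escaped with a backslash because Java has no double-quote literal syntax inside a pattern, and Pattern.MULTILINE is needed in Java to make ^ and $ anchor at line boundaries rather than only at the ends of the input.

```java
import java.util.regex.Pattern;

public class GroupingDemo {
    // Optional sign followed by digits; the parentheses group the alternation
    // so that ? applies to the whole "plus or minus" choice.
    static final Pattern SIGNED_INT = Pattern.compile("(\\+|-)?[0-9]+");

    // With MULTILINE, ^ and $ match at the start and end of each line.
    static final Pattern END_LINE = Pattern.compile("^end$", Pattern.MULTILINE);

    static boolean isSignedInt(String s) {
        return SIGNED_INT.matcher(s).matches();
    }

    // True when some line of the text consists of exactly "end".
    static boolean hasEndLine(String text) {
        return END_LINE.matcher(text).find();
    }

    // True when the one-character pattern "." matches the whole string.
    static boolean dotMatches(String s) {
        return Pattern.matches(".", s);
    }

    public static void main(String[] args) {
        System.out.println(isSignedInt("-42"));              // true
        System.out.println(isSignedInt("+-1"));              // false: only one sign allowed
        System.out.println(hasEndLine("begin\nend\n"));      // true
        System.out.println(hasEndLine("begin\nthe end\n"));  // false: "end" is not a whole line
        System.out.println(dotMatches("x"));                 // true
        System.out.println(dotMatches("\n"));                // false: dot skips newline
    }
}
```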

JFlex Regular Expressions
Brackets [ ] match any single character listed within the brackets. For example, [abc] matches a or b or c, and [A-Za-z] matches any letter. If the first character after [ is ^, then the brackets match any character except those listed. For example, [^A-Za-z] matches any nonletter.
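The bracket expressions above have the same meaning in java.util.regex, which makes them easy to experiment with. The sketch below is an illustration only; the CharClassDemo class and its helper names are hypothetical.

```java
import java.util.regex.Pattern;

public class CharClassDemo {
    // True when the single character c belongs to the given bracket expression.
    static boolean inClass(String bracketExpr, char c) {
        return Pattern.matches(bracketExpr, String.valueOf(c));
    }

    static boolean isLetter(char c) {
        return inClass("[A-Za-z]", c);
    }

    static boolean isNonLetter(char c) {
        return inClass("[^A-Za-z]", c);
    }

    public static void main(String[] args) {
        System.out.println(inClass("[abc]", 'b')); // true
        System.out.println(inClass("[abc]", 'd')); // false
        System.out.println(isLetter('q'));         // true
        System.out.println(isLetter('7'));         // false
        System.out.println(isNonLetter('7'));      // true
    }
}
```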

JFlex Regular Expressions
A single character within double quotes " " represents itself. A single character after \ also represents itself, except for n, r, b, t, and f, which form escape sequences. Metacharacters lose their special meaning and represent themselves when they stand alone within double quotes or follow \. For example, "?" and \? both match ?.
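The effect of quoting a metacharacter can be seen with java.util.regex, with one difference that should be kept in mind: Java patterns have no double-quote syntax, so this sketch uses Pattern.quote() as the nearest equivalent of writing "a+b" in a JFlex rule. The QuotingDemo class and its method names are hypothetical.

```java
import java.util.regex.Pattern;

public class QuotingDemo {
    // Treats every character of the text literally, roughly what
    // enclosing the text in double quotes does in a JFlex rule.
    static boolean matchesLiteral(String literal, String input) {
        return Pattern.matches(Pattern.quote(literal), input);
    }

    // Interprets the text as a regular expression, metacharacters included.
    static boolean matchesRegex(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matchesRegex("\\?", "?"));      // true: backslash removes the special meaning
        System.out.println(matchesLiteral("?", "?"));      // true: quoted ? is an ordinary character
        System.out.println(matchesLiteral("a+b", "a+b"));  // true: + is literal when quoted
        System.out.println(matchesRegex("a+b", "a+b"));    // false: unquoted + means "one or more a"
        System.out.println(matchesRegex("a+b", "aab"));    // true
    }
}
```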

JFlex Escape Sequences

    Escape Sequence    Matches
    \n                 newline (LF)
    \r                 carriage return (CR)
    \b                 backspace (BS)
    \t                 tab (HT)
    \f                 form feed (FF)

If a character c is not a special escape-sequence character, then \c matches c.


The JAR Files
JFlex uses a number of Java classes. These classes have been compiled and are stored in the Java archive file flex-1.6.0.jar. Assignment 6 will have instructions on how to download and install this file.

Running JFlex
The lexical analyzer generator is the Main class in the jflex package. To create a lexical analyzer from the file filename.flex, type

    java jflex.Main filename.flex

This produces a file Yylex.java (or whatever we named it), which must be compiled to create the lexical analyzer.

Running the Lexical Analyzer
Example (Using the Yylex Class)

    import java.io.BufferedReader;
    import java.io.InputStreamReader;

    // Wrap standard input (an InputStream) in a buffered Reader.
    InputStreamReader isr = new InputStreamReader(System.in);
    BufferedReader br = new BufferedReader(isr);
    Yylex lexer = new Yylex(br);
    Yytoken token = lexer.yylex();   // next token; may throw IOException

To run the lexical analyzer, a Yylex object must first be created. The Yylex constructor has one parameter, specifying a Reader. We will convert standard input, which is an InputStream, to a buffered reader. Then call the function yylex() to get the next token.
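In practice yylex() is called in a loop until it reports end of file. The sketch below shows that loop without needing JFlex installed. Everything in it is an assumption made for illustration: StubLexer is a hand-written stand-in for a generated Yylex class, its token codes are invented, and it follows the convention described earlier for the Integer return type, where yylex() returns null at EOF.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class LexerLoopDemo {

    // Hypothetical stand-in for a generated lexer: the constructor takes a
    // Reader, and yylex() returns an Integer token code, or null at EOF.
    static class StubLexer {
        static final int NUMBER = 1, IDENT = 2, OTHER = 3;
        private final Reader in;
        private int next = -2;   // lookahead character; -2 = not yet read, -1 = EOF

        StubLexer(Reader in) {
            this.in = in;
        }

        Integer yylex() throws IOException {
            if (next == -2) next = in.read();
            while (next != -1 && Character.isWhitespace(next)) {
                next = in.read();            // skip whitespace between tokens
            }
            if (next == -1) return null;     // end of file
            if (Character.isDigit(next)) {
                while (next != -1 && Character.isDigit(next)) next = in.read();
                return NUMBER;
            }
            if (Character.isLetter(next)) {
                while (next != -1 && Character.isLetterOrDigit(next)) next = in.read();
                return IDENT;
            }
            next = in.read();                // any other single character
            return OTHER;
        }
    }

    // Collects every token code until the lexer signals EOF with null.
    static List<Integer> tokenize(Reader reader) {
        StubLexer lexer = new StubLexer(reader);
        List<Integer> tokens = new ArrayList<>();
        try {
            Integer token;
            while ((token = lexer.yylex()) != null) {
                tokens.add(token);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize(new StringReader("x = 42"))); // prints [2, 3, 1]
    }
}
```

With a real generated lexer, only the loop in tokenize() would remain; if the return type were int instead of Integer, the loop would test for -1 rather than null.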


Assignment
Read Section 3.5, which is about lex, not JFlex, but the two are very similar.