Lexical Analysis - Flex

CMPSC 470 Lecture 03
Topics: Flex / JFlex

A. Lex/Flex

Lex and flex (fast lex) are programs that
1. Take, as input, a specification containing regular expressions (which describe the patterns of the lexemes of tokens) and their actions.
2. Transform the input regular expressions into a transition diagram (using a table-driven implementation of a DFA), and
3. Generate a C program (lex.yy.c) that simulates this transition diagram.

How to use lex or flex
1.
2.
3.

Format of lex input file
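The table-driven DFA simulation mentioned in steps 2 and 3 of section A can be sketched in Java as follows. This is only an illustration of the idea, not the code that lex or flex emits: the class name DfaSketch, the state numbering, the character classes, and the token names NUM and ID are all assumptions made for this example.

```java
// Hypothetical sketch of a table-driven DFA, in the spirit of what a
// lexer generator produces. All names and tables here are made up.
class DfaSketch {
    // Character classes: these index the columns of the transition table.
    static final int DIGIT = 0, LETTER = 1, OTHER = 2;

    // Rows are states: 0 = start, 1 = inside a number, 2 = inside an identifier.
    // An entry of -1 means "no transition" (dead state).
    static final int[][] NEXT = {
        /* state 0 */ {  1,  2, -1 },
        /* state 1 */ {  1, -1, -1 },
        /* state 2 */ {  2,  2, -1 },
    };

    // Token recognized in each state; null marks a non-accepting state.
    static final String[] ACCEPT = { null, "NUM", "ID" };

    static int classOf(char c) {
        if (c >= '0' && c <= '9') return DIGIT;
        if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')) return LETTER;
        return OTHER;
    }

    /** Returns the token name if the whole input is a single lexeme, else null. */
    static String classify(String input) {
        int state = 0;
        for (int i = 0; i < input.length(); i++) {
            state = NEXT[state][classOf(input.charAt(i))];
            if (state < 0) return null; // fell off the diagram
        }
        return ACCEPT[state];
    }

    public static void main(String[] args) {
        System.out.println(classify("123")); // digits only
        System.out.println(classify("x1"));  // letter followed by digit
        System.out.println(classify("1x"));  // no transition from state 1 on a letter
    }
}
```

A generated scanner differs mainly in scale: its tables are computed from the regular expressions rather than written by hand, and it keeps reading until no further transition is possible instead of classifying a whole string at once.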

B. jflex

jflex is a lexical analyzer generator for Java, written in Java. Its input format differs slightly from that of lex or flex. jflex 1.6.1 is available at http://jflex.de

Steps to use jflex
1. Go to http://jflex.de
2. Go to download
3. Download jflex-1.6.1.zip (or jflex-1.6.1.tar.gz)
4. Unzip it in your working directory
5. Find jflex-1.6.1/lib/jflex-1.6.1.jar
6. Compile your input file as follows:
7. It generates Java source code containing a lexical analyzer built from the regular expressions and actions described in your input file.

C. jflex input format and output format

Format of jflex input file

    User code before class definition (such as import statements)
    %%
    Options and macros
    %{
    User code inside of class
    %}
    Declarations
    %%
    Transition rules, such that
    pattern1    { action 1 }
    pattern2    { action 2 }

Format of output java file

D. Example: TestLexer.flex

    import static java.lang.Math.*;

    %%

    %class TestLexer
    %byaccj
    %int

    %{
      Object obj;
      public TestLexer(java.io.Reader r, Object obj) {
        this(r);
        this.obj = obj;
      }
      // "public TestLexer(java.io.Reader in)" will be generated as default
    %}

    digit   = [0-9]
    number  = {digit}+
    real    = {number}(\.{number})?(e[+-]?{number})?
    letter  = [A-Za-z]
    newline = \n

    %%

    "+"                          { String lexeme = yytext(); return TestMain.PLUS; }
    "if"                         { String lexeme = yytext(); return TestMain.IF;   }
    {number}                     { String lexeme = yytext(); return TestMain.NUM;  }
    {real}                       { String lexeme = yytext(); return TestMain.REAL; }
    {letter}({letter}|{digit})+  { String lexeme = yytext(); return TestMain.WORD; }
    {newline}                    { String lexeme = yytext(); System.out.print("((newline)), \n"); /* skip */ }
    [ \t\r]+                     { String lexeme = yytext(); System.out.print("((whitespace "+lexeme+")), "); /* skip */ }

    /* error fallback */
    [^]                          { System.err.println("Error: unexpected character '"+yytext()+"'"); return -1; }

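Notes

    %class TestLexer   Names the generated scanner class TestLexer (written to TestLexer.java).
    %byaccj            Makes the scanner compatible with BYacc/J; yylex() returns 0 at end of input.
    %int               Makes yylex() return int.
    yytext()           Returns the text matched by the current pattern (the lexeme).

Regular definitions are defined in the declaration part. In the pattern part, write each token pattern as a regular expression; to use a regular definition there, enclose its name in braces, for example {number}.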
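In TestLexer.flex the patterns for number and real overlap, and "if" also fits the word pattern. JFlex resolves such conflicts by taking the longest match and, on a tie, the rule listed first. The sketch below imitates that policy with java.util.regex so the effect can be tried without running JFlex. It is an approximation under stated assumptions: the class MunchSketch and its tokenize method are hypothetical, the patterns are hand-translated from the example, and greedy regex matching happens to yield the longest match for these particular patterns, which is not true of regexes in general.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of "longest match, then first rule wins".
// This is not how JFlex is implemented; JFlex compiles all rules into one DFA.
class MunchSketch {
    // Rules in the same order as in TestLexer.flex.
    static final String[] NAMES = { "PLUS", "IF", "NUM", "REAL", "WORD" };
    static final Pattern[] RULES = {
        Pattern.compile("\\+"),
        Pattern.compile("if"),
        Pattern.compile("[0-9]+"),
        Pattern.compile("[0-9]+(\\.[0-9]+)?(e[+-]?[0-9]+)?"),
        Pattern.compile("[A-Za-z][A-Za-z0-9]+"),
    };

    static List<String> tokenize(String input) {
        List<String> out = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            int best = -1, bestLen = 0;
            for (int i = 0; i < RULES.length; i++) {
                Matcher m = RULES[i].matcher(input).region(pos, input.length());
                // Strict '>' keeps the earlier rule when two matches have equal length.
                if (m.lookingAt() && m.end() - pos > bestLen) {
                    best = i;
                    bestLen = m.end() - pos;
                }
            }
            if (best < 0) { pos++; continue; } // skip whitespace and unknown characters
            out.add("<" + NAMES[best] + ", " + input.substring(pos, pos + bestLen) + ">");
            pos += bestLen;
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("if iffy 123 1.23 123e1 +"));
    }
}
```

With this policy "123" is reported as NUM because number is listed before real and both match three characters, while "1.23" is reported as REAL because real matches more characters.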

Generate lexer code

Lexer code will be generated using the following command:

It generates TestLexer.java, which contains the following code:

    /* The following code was generated by JFlex 1.6.1 */
    import static java.lang.Math.*;

    /**
     * This class is a scanner generated by
     * <a href="http://www.jflex.de/">JFlex</a> 1.6.1
     * from the specification file <tt>TestLexer.flex</tt>
     */
    class TestLexer ... {
      ...
      /* user code: */
      Object obj;
      public TestLexer(java.io.Reader r, Object obj) {
        this(r);
        this.obj = obj;
      }
      // "public TestLexer(java.io.Reader in)" will be generated as default
      ...
          if (zzInput == YYEOF && zzStartRead == zzCurrentPos) {
            zzAtEOF = true;
            zzDoEOF();
            { return 0; }
          }
          else {
            switch (zzAction < 0 ? zzAction : ZZ_ACTION[zzAction]) {
              case 1:
                { System.err.println("Error: unexpected character '"+yytext()+"'"); return -1; }
              case 9: break;
              case 2:
                { String lexeme = yytext(); return TestMain.NUM; }
              case 10: break;
              case 3:
                { String lexeme = yytext(); System.out.print("((newline)), \n"); /* skip */ }
              case 11: break;
              case 4:
                { String lexeme = yytext(); System.out.print("((whitespace "+lexeme+")), "); /* skip */ }
              ...

TestMain.java

    class TestMain {
        // Token codes returned by TestLexer.yylex()
        public static final int NUM  = 10;
        public static final int REAL = 11;
        public static final int WORD = 12;
        public static final int PLUS = 13;
        public static final int IF   = 14;

        public static void main(String[] args) throws Exception {
            java.io.Reader r = new java.io.StringReader(
                  "main\n"
                + "123\n"
                + "1.23 123e1\n"
            );
            //if(args.length < 1)
            //    return;
            //java.io.Reader r = new java.io.FileReader(args[0]);

            Object o = new Object();
            TestLexer lex = new TestLexer(r, o);
            while (true) {
                int token = lex.yylex();
                if (token == 0)   // end of input
                    break;
                if (token == -1)  // error
                    break;
                switch (token) {
                    case NUM : System.out.print("<NUM, "  + lex.yytext() + ">"); break;
                    case REAL: System.out.print("<REAL, " + lex.yytext() + ">"); break;
                    case WORD: System.out.print("<WORD, " + lex.yytext() + ">"); break;
                }
            }
        }
    }

E. Extensions of regular expressions in lex/flex

In an expression, c represents a character, s represents a string, and r represents a regular expression.

    Expression   Matches
    ----------   -------------------------------------------
    c            The one non-operator character c
    \c           Character c literally
    "s"          String s literally
    .            Any character but newline
    ^            Beginning of a line
    $            End of a line
    [s]          Any one of the characters in string s
    [^s]         Any one character not in string s
    r*           Kleene closure: zero or more strings matching r
    r+           Positive closure: one or more strings matching r
    r?           Zero or one r
    r{m,n}       Between m and n occurrences of r
    r1r2         An r1 followed by an r2
    r1|r2        An r1 or an r2
    (r)          Same as r
    r1/r2        r1 when followed by r2

Example)
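Several of these operators can be tried out quickly with java.util.regex, whose syntax overlaps with lex/flex for closure, optional, bounded repetition, alternation, and character classes. The sketch below is an illustration under assumptions, not part of the lecture: the class RegexOpsSketch and its helper prefixLen are hypothetical, the flex form "s" is imitated with Pattern.quote, and the trailing-context form r1/r2 is imitated with a lookahead (?=r2), which is the closest Java equivalent rather than the same operator.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for experimenting with regular-expression operators.
class RegexOpsSketch {
    /** Length of the prefix of input matched by regex, or -1 if no prefix matches. */
    static int prefixLen(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.lookingAt() ? m.end() : -1;
    }

    public static void main(String[] args) {
        System.out.println(prefixLen("a*", "aaab"));             // r*     : zero or more
        System.out.println(prefixLen("a+", "b"));                // r+     : needs at least one a
        System.out.println(prefixLen("a?b", "b"));               // r?     : a is optional
        System.out.println(prefixLen("a{2,3}", "aaaa"));         // r{m,n} : at most three a's
        System.out.println(prefixLen("[^abc]", "d"));            // [^s]   : any character not in s
        System.out.println(prefixLen("a|b", "b"));               // r1|r2  : alternation
        System.out.println(prefixLen(Pattern.quote("**"), "**x")); // "s"  : literal string
        System.out.println(prefixLen("abc(?=123)", "abc123"));   // r1/r2  : abc only if 123 follows
        System.out.println(prefixLen("abc(?=123)", "abc456"));   // lookahead fails, no match
    }
}
```

As with trailing context in lex, the lookahead text is checked but not consumed, so the matched lexeme in the last two cases is at most "abc".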