File I/O in Flex Scanners

Similar documents
Lecture Outline. COMP-421 Compiler Design. What is Lex? Lex Specification. ! Lexical Analyzer Lex. ! Lex Examples. Presented by Dr Ioanna Dionysiou

EXPERIMENT NO : M/C Lenovo Think center M700 Ci3,6100,6th Gen. H81, 4GB RAM,500GB HDD

flex is not a bad tool to use for doing modest text transformations and for programs that collect statistics on input.

EXPERIMENT NO : M/C Lenovo Think center M700 Ci3,6100,6th Gen. H81, 4GB RAM,500GB HDD

Using Lex or Flex. Prof. James L. Frankel Harvard University

The structure of a compiler

Marcello Bersani Ed. 22, via Golgi 42, 3 piano 3769

CSC 467 Lecture 3: Regular Expressions

Big Picture: Compilation Process. CSCI: 4500/6500 Programming Languages. Big Picture: Compilation Process. Big Picture: Compilation Process.

COMPILER CONSTRUCTION LAB 2 THE SYMBOL TABLE. Tutorial 2 LABS. PHASES OF A COMPILER Source Program. Lab 2 Symbol table

LEX/Flex Scanner Generator

TDDD55- Compilers and Interpreters Lesson 2

CS4850 SummerII Lex Primer. Usage Paradigm of Lex. Lex is a tool for creating lexical analyzers. Lexical analyzers tokenize input streams.

FLEX(1) FLEX(1) FLEX(1) FLEX(1) directive. flex fast lexical analyzer generator

A Bison Manual. You build a text file of the production (format in the next section); traditionally this file ends in.y, although bison doesn t care.

Big Picture: Compilation Process. CSCI: 4500/6500 Programming Languages. Big Picture: Compilation Process. Big Picture: Compilation Process

Yacc: A Syntactic Analysers Generator

Lex (Lesk & Schmidt[Lesk75]) was developed in the early to mid- 70 s.

Output with printf Input. from a file from a command arguments from the command read

A program that performs lexical analysis may be termed a lexer, tokenizer, or scanner, though scanner is also a term for the first stage of a lexer.

I. OVERVIEW 1 II. INTRODUCTION 3 III. OPERATING PROCEDURE 5 IV. PCLEX 10 V. PCYACC 21. Table of Contents

COMPILER CONSTRUCTION Seminar 02 TDDB44

An introduction to Flex

Version 2.4 November

Module 8 - Lexical Analyzer Generator. 8.1 Need for a Tool. 8.2 Lexical Analyzer Generator Tool

Flex, version 2.5. A fast scanner generator Edition 2.5, March Vern Paxson

cmps104a 2002q4 Assignment 2 Lexical Analyzer page 1

Lexical and Syntax Analysis

Compil M1 : Front-End

Flex, version A fast scanner generator Edition , 27 March Vern Paxson W. L. Estes John Millaway

Handout 7, Lex (5/30/2001)

Gechstudentszone.wordpress.com

Lexical Analysis. Implementing Scanners & LEX: A Lexical Analyzer Tool

LECTURE 7. Lex and Intro to Parsing

Lesson 10. CDT301 Compiler Theory, Spring 2011 Teacher: Linus Källberg

Compiler Design Prof. Y. N. Srikant Department of Computer Science and Automation Indian Institute of Science, Bangalore

UNIT - 7 LEX AND YACC - 1

Flex, version A fast scanner generator Edition , 27 March Vern Paxson W. L. Estes John Millaway

LECTURE 11. Semantic Analysis and Yacc

CS 426 Fall Machine Problem 1. Machine Problem 1. CS 426 Compiler Construction Fall Semester 2017

Honu. Version November 6, 2010

Lexical Analysis with Flex

SFU CMPT 361 Computer Graphics Fall 2017 Assignment 2. Assignment due Thursday, October 19, 11:59pm

More Examples. Lex/Flex/JLex

CS143 Handout 04 Summer 2011 June 22, 2011 flex In A Nutshell

JFlex. Lecture 16 Section 3.5, JFlex Manual. Robb T. Koether. Hampden-Sydney College. Mon, Feb 23, 2015

The Structure of a Syntax-Directed Compiler

Project 2 Interpreter for Snail. 2 The Snail Programming Language

DINO. Language Reference Manual. Author: Manu Jain

POLITECNICO DI TORINO. Formal Languages and Compilers. Laboratory N 1. Laboratory N 1. Languages?

Formal Languages and Compilers

Using an LALR(1) Parser Generator

CSCI 200 Lab 4 Evaluating infix arithmetic expressions

TDDD55 - Compilers and Interpreters Lesson 3

CS664 Compiler Theory and Design LIU 1 of 16 ANTLR. Christopher League* 17 February Figure 1: ANTLR plugin installer

CS401 - Computer Architecture and Assembly Language Programming Glossary By

COMP 250 Winter stacks Feb. 2, 2016

COMPILERS AND INTERPRETERS Lesson 4 TDDD16

Flex, version 2.5. A fast scanner generator. Edition 2.5, March Vern Paxson

regsim.scm ~/umb/cs450/ch5.base/ 1 11/11/13

TDDD55- Compilers and Interpreters Lesson 3

Standard File Pointers

CMSC445 Compiler design Blaheta. Project 2: Lexer. Due: 15 February 2012

CS Introduction to Data Structures How to Parse Arithmetic Expressions

Edited by Himanshu Mittal. Lexical Analysis Phase

PRACTICAL CLASS: Flex & Bison

Syntax-Directed Translation

CSE302: Compiler Design

Advanced C Programming Topics

Figure 2.1: Role of Lexical Analyzer

UNIT III & IV. Bottom up parsing

Lexical and Parser Tools

DATA STRUCTURES USING C

Title:[ Variables Comparison Operators If Else Statements ]

More Scripting Todd Kelley CST8207 Todd Kelley 1

LP/LispLite: Trivial Lisp Org Mode Conversion

Parser Tools: lex and yacc-style Parsing

Generating a Lexical Analyzer Program

Department : Computer Science & Engineering

Compiler Design 1. Yacc/Bison. Goutam Biswas. Lect 8

Flex, version 2.5. A fast scanner generator. Edition 2.5, March Vern Paxson

Ray Pereda Unicon Technical Report UTR-02. February 25, Abstract

Lex & Yacc. By H. Altay Güvenir. A compiler or an interpreter performs its task in 3 stages:

Java How to Program, 10/e. Copyright by Pearson Education, Inc. All Rights Reserved.

Essential Linux Shell Commands

Alternation. Kleene Closure. Definition of Regular Expressions

Lecture 05 I/O statements Printf, Scanf Simple statements, Compound statements

Programming Assignment II

Introduction to Regular Expressions Version 1.3. Tom Sgouros

Computer Systems Lecture 9

Principles of Programming Languages

Lexical Analyzer Scanner

Programming Project 1: Lexical Analyzer (Scanner)

Languages and Compilers

Lexical Analyzer Scanner

Character Classes. Our specification of the reserved word if, as shown earlier, is incomplete. We don t (yet) handle upper or mixedcase.

They grow as needed, and may be made to shrink. Officially, a Perl array is a variable whose value is a list.

Parser Tools: lex and yacc-style Parsing

LECTURE 6 Scanning Part 2

CS102: Standard I/O. %<flag(s)><width><precision><size>conversion-code

Transcription:

File I/O in Flex Scanners Unless you make other arrangements, a scanner reads from the stdio FILE called yyin, so to read a single file, you need only set it before the first call to yylex. The main routine opens a filename passed on the command line, if the user specified one, and assigns the FILE to yyin. Otherwise, yyin is left unset, in which case yylex automatically sets it to stdin. 3071 1

When a lex scanner reached the end of yyin, it called yywrap(). The idea was that if there was another input file, yywrap could adjust yyin and return 0 to resume scanning. If that was really the end of the input, it returned 1 to the scanner to say that it was done. Alternative %option noyywrap 3071 2

For each file, it opens the file, uses yyrestart() to make it the input to the scanner, and calls yylex() to scan it. 3071 3

To handle its input, a flex scanner uses a structure known as a YY_BUFFER_STATE, which describes a single input source. It contains a string buffer and a bunch of variables and flags. Usually it contains a FILE* for the file it s reading from, but it s also possible to create a YY_BUFFER_STATE not connected a file to scan a string already in memory. 3071 4

The default input behavior of a flex scanner is approximately this: YY_BUFFER_STATE bp; extern FILE* yyin; //... whatever the program does before the first call to the scanner if(!yyin) yyin = stdin; //default input is stdin bp = yy_create_buffer(yyin,yy_buf_size ); //YY_BUF_SIZE defined by flex, typically 16K yy_switch_to_buffer(bp); //tell it to use the buffer we just made yylex(); //or yyparse() or whatever calls the scanner There are several other functions to create scanner buffers, including yy_scan_string("string") to scan a null-terminated string and yy_scan_buffer(char *base, size) to scan a buffer of known size. 3071 5

We ll try out our knowledge of flex I/O with a simple program that handles nested include files and prints them out. To make it a little more interesting, it prints the input files with the line number of each line in its file. To do that, the program keeps a stack of nested input files and line numbers, pushing an entry each time it encounters a #include and popping an entry off the stack when it gets to the end of a file. We also use a very powerful flex feature called start states that let us control which patterns can be matched when. The %x line near the top of the file defines IFILE as a start state that we ll use when we re looking for the filename in a #include statement. 3071 6

At any point, the scanner is in one start state and can match patterns active in that start state only. In effect, the state defines a different scanner, with its own rules. You can define as many start states as needed, but in this program we need only one in addition to the INITIAL state that flex always defines. Patterns are tagged with start state names in angle brackets to indicate in which state(s) the pattern is active. The %x marks IFILE as an exclusive start state, which means that when that state is active, only patterns specifically marked with the state can match. (There are also inclusive start states declared with %s, in which patterns not marked with any state can also match. Exclusive states are usually more useful.) In action code, the macro BEGIN switches to a different start state. 3071 7

In the patterns, the first pattern matches a #include statement up through the double quote that precedes the filename. The pattern permits optional whitespace in the usual places. It switches to IFILE state to read the next input filename. 3071 8

In IFILE state, the second pattern matches a filename, characters up to a closing quote, whitespace, or end-of-line. The filename is passed to newfile to stack the current input file and set up the next level of input, but first there s the matter of dealing with whatever remains of the #include line. The next pattern deals with the case of an Ill-formed #include line that doesn t have a filename after the double quote. 3071 9

The last four patterns do the actual work of printing out each line with a preceding line number. Flex provides a variable called yylineno that is intended to track line numbers, so we might as well use it. The pattern ^. matches any character at the beginning of a line, so the action prints the current line number and the character. Since a dot doesn t match a newline, ^\n matches a newline at the beginning of a line, that is, an empty line, so the code prints out the line number and the new line and increments the line number. A newline or other character not at the beginning of the line is just printed out with ECHO, incrementing the line number for a new line. 3071 10

For the first file, don t need to save a line number 3071 11

Nothing on stack, end of first file 3071 12