PRACTICAL CLASS: Flex & Bison

Master's Degree Course in Computer Engineering
Formal Languages
FORMAL LANGUAGES AND COMPILERS
PRACTICAL CLASS: Flex & Bison
Eliana Bove, eliana.bove@poliba.it

Install
On Linux: install Flex and Bison with the package manager of your distribution.
On Windows:
  - Install flex.exe [download from http://gnuwin32.sourceforge.net/packages/flex.htm]
  - Install bison.exe [download from http://gnuwin32.sourceforge.net/packages/bison.htm]
Warning 1: on Windows it is better to change the installation path from the default (C:\Program Files (x86)\gnuwin32) to C:\GnuWin32, as Bison has issues with spaces in directory names.
Warning 2: a C compiler is required, for example Dev-C++ installed in C:\Dev-Cpp.
Add to the PATH environment variable the bin subdirectories of the compiler, Flex and Bison (;C:\Dev-Cpp\bin;C:\GnuWin32\bin).

Lexical analysis: Flex

    Flex source program (lex.l)  ->  Flex compiler  ->  lex.yy.c
    lex.yy.c                     ->  C compiler     ->  a.out
    Input stream                 ->  a.out          ->  Sequence of tokens

Lexical analysis: input file
A LEX/Flex input file is composed of three sections, separated by the %% symbol.

Section 1

    %{
    #include directives, constant definitions, scanner macros
    %}
    basic definitions

It may be empty. Between %{ and %} it contains library #include directives and customized constant and/or macro definitions for the user C program; this text is copied literally into the generated C program. Basic definitions give names to the regular expressions used in the second section.

Section 2

    %%
    token definitions and actions

It contains the patterns with the associated actions to execute, as pattern-action pairs. An action must start on the same line where its pattern ends, separated from it by spaces or tabs.

Section 3

    %%
    support procedures (C user code)

It may be empty; if it is, the second %% separator may be omitted. It contains the support routines that the programmer intends to call from the actions described in the second section.

Lexical analysis: exercise 1
Exercise 1: Create a scanner to recognize the following tokens:

    Lexemes          Token name   Attribute value
    any whitespace   -            -
    if               if           -
    then             then         -
    else             else         -
    any id           id           pointer
    any number       number       pointer
    <                relop        LT
    <=               relop        LE
    =                relop        EQ
    <>               relop        NE
    >                relop        GT
    >=               relop        GE
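The relop rows of the table can be sketched in plain C. This is a hand-written illustration, not what Flex generates; the function name `match_relop` and the `relop_attr` enum are assumptions made for this example. It shows the longest-match choice a scanner has to make so that `<=` is read as one lexeme rather than `<` followed by `=`.

```c
#include <assert.h>
#include <stddef.h>

/* Attribute values for the relop token, mirroring the table above. */
enum relop_attr { NO_RELOP = 0, LT, LE, EQ, NE, GT, GE };

/*
 * Hypothetical helper: inspect the start of `s` and report which
 * relational operator begins there, preferring the longest lexeme.
 * The number of characters consumed is stored in *len (0 if none).
 */
enum relop_attr match_relop(const char *s, size_t *len)
{
    assert(s != NULL && len != NULL);
    *len = 0;
    switch (s[0]) {
    case '<':
        if (s[1] == '=') { *len = 2; return LE; }   /* "<=" */
        if (s[1] == '>') { *len = 2; return NE; }   /* "<>" */
        *len = 1;
        return LT;                                  /* "<"  */
    case '=':
        *len = 1;
        return EQ;
    case '>':
        if (s[1] == '=') { *len = 2; return GE; }   /* ">=" */
        *len = 1;
        return GT;                                  /* ">"  */
    default:
        return NO_RELOP;
    }
}
```

For instance, calling `match_relop("<= 3", &n)` yields `LE` with `n` set to 2, while `match_relop("< 3", &n)` yields `LT` with `n` set to 1.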

Lexical analysis: exercise 1
Exercise 1: Flex source program ex1.l

    %{
    #include <stdio.h>
    /* definitions of manifest constants */
    #define YYSTYPE int
    YYSTYPE yylval;
    #define LT 1
    #define LE 2
    #define EQ 3
    #define NE 4
    #define GT 5
    #define GE 6
    #define IF 7
    #define THEN 8
    #define ELSE 9
    #define ID 10
    #define NUMBER 11
    #define RELOP 12
    /* support routines defined in the third section */
    int installid();
    int installnum();
    %}

Lexical analysis: exercise 1
Exercise 1: Flex source program ex1.l

    /* regular definitions */
    delim    [ \t\n]
    ws       {delim}+
    letter   [A-Za-z]
    digit    [0-9]
    id       {letter}({letter}|{digit})*
    number   {digit}+(\.{digit}+)?(e[+-]?{digit}+)?
    %%
    {ws}      {/* no action and no return */}
    if        {return(IF);}
    then      {return(THEN);}
    else      {return(ELSE);}
    {id}      {yylval = (int) installid(); return(ID);}
    {number}  {yylval = (int) installnum(); return(NUMBER);}
    "<"       {yylval = LT; return(RELOP);}
    "<="      {yylval = LE; return(RELOP);}

Lexical analysis: exercise 1
Exercise 1: Flex source program ex1.l

    "="       {yylval = EQ; return(RELOP);}
    "<>"      {yylval = NE; return(RELOP);}
    ">"       {yylval = GT; return(RELOP);}
    ">="      {yylval = GE; return(RELOP);}
    %%
    int installid() {
        /* function to install the lexeme, whose first character is
           pointed to by yytext, and whose length is yyleng, into the
           symbol table and return a pointer thereto */
        printf("Installing %s of length %d as id\n", yytext, yyleng);
        return 1;
    }

    int installnum() {
        /* similar to installid, but puts numerical constants into a
           separate table */
        printf("Installing %s of length %d as num\n", yytext, yyleng);
        return 1;
    }
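In the exercise, installid() is only a stub that prints the lexeme and returns 1. As a minimal sketch of what "installing into the symbol table" could look like, the following C function stores each new lexeme in a fixed-size table and returns its index. The names `install_lexeme`, `symbol_table`, `MAX_SYMBOLS` and `MAX_LEXEME`, the capacity limits and the linear search are all assumptions made for this illustration; the slides do not prescribe a table layout.

```c
#include <assert.h>
#include <string.h>

#define MAX_SYMBOLS 64   /* assumed capacity for this sketch */
#define MAX_LEXEME  32   /* assumed maximum stored lexeme size */

static char symbol_table[MAX_SYMBOLS][MAX_LEXEME];
static int symbol_count = 0;

/*
 * Hypothetical replacement for the installid() stub: store the lexeme
 * (given as pointer + length, like yytext and yyleng) if it is new,
 * and return its table index. Returns -1 if the lexeme does not fit
 * or the table is full.
 */
int install_lexeme(const char *text, int length)
{
    assert(text != NULL);
    if (length <= 0 || length >= MAX_LEXEME)
        return -1;

    /* Reuse the existing entry when the lexeme was seen before. */
    for (int i = 0; i < symbol_count; i++) {
        if (strlen(symbol_table[i]) == (size_t)length &&
            memcmp(symbol_table[i], text, (size_t)length) == 0)
            return i;
    }

    if (symbol_count == MAX_SYMBOLS)
        return -1;

    /* Copy exactly `length` characters and terminate the entry. */
    memcpy(symbol_table[symbol_count], text, (size_t)length);
    symbol_table[symbol_count][length] = '\0';
    return symbol_count++;
}
```

Inside a Flex action, such a function would be called as `install_lexeme(yytext, yyleng)`, so that repeated occurrences of the same identifier share one entry.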

Lexical analysis: exercise 1
1. Open a shell.
2. Go to the directory where the .l Flex input file is stored.
3. Run:

    flex ex1.l           (produces lex.yy.c)
    gcc lex.yy.c -lfl    (generates the scanner a.exe, or a.out on Linux)
    a.exe < t1.txt       (runs on the t1.txt input file; on Linux: ./a.out < t1.txt)

The library libfl.a is needed to link the scanner. Its path depends on the Flex install directory, for example:

    gcc lex.yy.c -L C:\GnuWin32\lib -lfl

Lexical analysis: exercise 2
Exercise 2: Write a Flex program which, given a C program as input, produces as output an equivalent program without comments.
Exercise 2: Flex specification ex2.l

    %{
    %}
    /* define comment state */
    %x comm
    %%
    "/*"                  BEGIN(comm);
    <comm>[^*\n]*         /* eat anything that's not a '*' */
    <comm>"*"+[^*/\n]*    /* eat up '*'s not followed by '/'s */
    <comm>\n              /* possible new lines */
    <comm>"*"+"/"         BEGIN(INITIAL);
    %%

Text outside comments matches no rule, so Flex's default action copies it to the output unchanged.

Lexical analysis: exercise 2
1. Open a shell.
2. Go to the directory where the Flex input file is located.
3. Run:

    flex ex2.l           (produces lex.yy.c)
    gcc lex.yy.c -lfl    (generates the scanner a.exe, or a.out on Linux)
    a.exe < t2.c         (runs on the t2.c input file; on Linux: ./a.out < t2.c)
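The exclusive start condition used in exercise 2 behaves like a two-state machine. The following hand-written C function is a sketch of that idea, not the code Flex generates; the name `strip_comments` and the in-memory string interface are assumptions made for this example (the Flex scanner reads from standard input instead).

```c
#include <assert.h>
#include <stddef.h>

enum state { INITIAL, COMM };   /* the same two states as in ex2.l */

/*
 * Copy `src` into `dst`, dropping C comments. `dst` must have room
 * for strlen(src) + 1 bytes. Returns the number of characters
 * written, not counting the terminating NUL.
 */
size_t strip_comments(const char *src, char *dst)
{
    assert(src != NULL && dst != NULL);
    enum state st = INITIAL;
    size_t out = 0;

    for (size_t i = 0; src[i] != '\0'; i++) {
        if (st == INITIAL) {
            if (src[i] == '/' && src[i + 1] == '*') {
                st = COMM;              /* like BEGIN(comm) */
                i++;                    /* skip the '*' of the opener */
            } else {
                dst[out++] = src[i];    /* default action: echo */
            }
        } else {
            if (src[i] == '*' && src[i + 1] == '/') {
                st = INITIAL;           /* like BEGIN(INITIAL) */
                i++;                    /* skip the '/' of the closer */
            }
            /* any other character inside a comment is discarded */
        }
    }
    dst[out] = '\0';
    return out;
}
```

With the input `a/*x*/b` the function writes `ab` and returns 2. Like the Flex specification, it does not treat string literals specially, so a `/*` inside a C string would also be taken as a comment opener.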

Syntax analysis: Bison

    YACC specification (translate.y)  ->  Bison compiler  ->  y.tab.c
    y.tab.c                           ->  C compiler      ->  a.out
    input                             ->  a.out           ->  output

Syntax analysis: input file
A Bison input file is composed of three sections, separated by the %% symbol.

Prologue

    %{
    #include directives, constant definitions
    %}
    basic declarations

Optional. Between the %{ and %} symbols it contains the library #include directives and the definitions of any entity used in the rules of the second section or in the routines of the third section. These contents are copied to the beginning of the parser. After %} it contains the Bison declarations, i.e. the names of the terminal and nonterminal symbols of the grammar and the precedence/associativity rules between symbols. Precedence/associativity rules are expressed with the %left, %right or %nonassoc declarations.

Grammar symbols can be denoted in three ways:
  - named symbols: by convention, names are in upper case for terminals and in lower case for nonterminals; every named terminal (token) must be defined with a %token declaration
  - literal token: refers to a single character ('+')
  - string token: refers to a sequence of characters ("<-")

Syntax analysis: input file
Rules

    %%
    translation rules

This section contains the grammar rules, written in a BNF-derived form. Here the whole grammar is described, and the actions to be executed are defined and associated with the grammar productions:

    <head> : <body>1 { <semantic action>1 }
           | <body>2 { <semantic action>2 }
           ...
           | <body>n { <semantic action>n }
           ;

  - a semantic action is a sequence of C statements;
  - actions can appear at any place within the production body and are executed at the point where they appear;
  - actions exchange values with the parser through pseudo-variables introduced by the $ symbol: $$ refers to the value of the left-hand side of the production, while $n refers to the value of the symbol in position n on the right-hand side;
  - if unspecified, the default action is { $$ = $1; }

Epilogue

    %%
    support C routines

Optional. It contains any useful code, including the definitions of functions declared in the prologue. All its contents are copied to the end of the parser file.
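The pseudo-variables can be pictured as slots on the parser's value stack. The sketch below is a simplified illustration in C, not Bison's generated code; the names `value_stack`, `push_value` and `reduce_add` and the fixed capacity are assumptions made for this example. It shows what happens to the stack when a rule with three right-hand-side symbols and the action { $$ = $1 + $3; } is reduced.

```c
#include <assert.h>

#define STACK_MAX 32            /* assumed capacity for this sketch */

static int value_stack[STACK_MAX];
static int top = 0;             /* number of values currently stacked */

/* Push the semantic value of a symbol that was just shifted. */
void push_value(int v)
{
    assert(top < STACK_MAX);
    value_stack[top++] = v;
}

/*
 * Reduce by a rule with three right-hand-side symbols whose action is
 * { $$ = $1 + $3; }. The top three slots play the roles of $1, $2 and
 * $3; they are replaced by the single value $$.
 */
int reduce_add(void)
{
    assert(top >= 3);
    int d1 = value_stack[top - 3];   /* $1: left operand   */
    int d3 = value_stack[top - 1];   /* $3: right operand  */
    int result = d1 + d3;            /* $$ = $1 + $3       */
    top -= 3;                        /* pop $1, $2 and $3  */
    value_stack[top++] = result;     /* push $$            */
    return result;
}
```

For example, after pushing 2, the code of '+' and 5, a call to `reduce_add()` leaves the single value 7 on the stack. The slot for $2 (the operator) is present but unused by this action.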

Syntax analysis: exercise 3
Exercise 3: Bison specification ex3.y
Build a calculator starting from the following grammar:

    E -> E + T | T
    T -> T * F | F
    F -> (E) | digit

where digit is a single digit between 0 and 9.

Exercise 3: Bison specification ex3.y

    %{
    #include <stdio.h>
    #include <ctype.h>
    int yylex(void);
    int yyerror(const char *s);
    %}
    %token DIGIT
    %%

Syntax analysis: exercise 3
Exercise 3: Bison specification ex3.y

    input:  /* empty string */
          | input line   /* with this left-recursive rule, we can
                            parse consecutive lines */
          ;
    line:   '\n'
          | expr '\n'           { printf("%d\n", $1); }
          ;
    expr:   expr '+' term       { $$ = $1 + $3; }
          | term
          ;
    term:   term '*' factor     { $$ = $1 * $3; }
          | factor
          ;
    factor: '(' expr ')'        { $$ = $2; }
          | DIGIT
          ;
    %%
    int main(void) {
        return yyparse();
    }

    int yyerror(const char *s) {
        printf("%s\n", s);
        return 0;
    }

Syntax analysis: exercise 3
Exercise 3: Bison specification ex3.y

    int yylex(void) {
        int c;
        c = getchar();
        if (isdigit(c)) {
            yylval = c - '0';
            return DIGIT;
        }
        return c;
    }

1. Open a shell and go to the directory where the Bison specification file is located.
2. Run:

    bison ex3.y      (produces ex3.tab.c)
    gcc ex3.tab.c    (generates the parser a.exe, or a.out on Linux)
    a.exe            (on Linux: ./a.out)

Syntax analysis: exercise 4
Exercise 4: Bison specification ex4.y
Create a calculator supporting more complicated expressions (sum, multiplication, subtraction, division, exponentiation). Watch out for operator precedence!

Exercise 4: Bison specification ex4.y

    %{
    #define YYSTYPE double
    #include <math.h>
    #include <stdio.h>
    #include <ctype.h>
    int yylex(void);
    int yyerror(const char *s);
    %}
    /* BISON Declarations */
    %token NUM
    %left '-' '+'
    %left '*' '/'
    %left NEG     /* negation--unary minus */
    %right '^'    /* exponentiation */

Syntax analysis: exercise 4
Exercise 4: Bison specification ex4.y

    %%
    input:  /* empty string */
          | input line
          ;
    line:   '\n'
          | exp '\n'            { printf("\t%.10g\n", $1); }
          ;
    exp:    NUM                 { $$ = $1; }
          | exp '+' exp         { $$ = $1 + $3; }
          | exp '-' exp         { $$ = $1 - $3; }
          | exp '*' exp         { $$ = $1 * $3; }
          | exp '/' exp         { $$ = $1 / $3; }
          | '-' exp %prec NEG   { $$ = -$2; }
            /* %prec tells the parser to use the precedence of the NEG
               token, not that of the literal '-' token declared before */
          | exp '^' exp         { $$ = pow($1, $3); }
          | '(' exp ')'         { $$ = $2; }
          ;

Syntax analysis: exercise 4
Exercise 4: Bison specification ex4.y

    %%
    int yylex(void) {
        int c;
        /* Skip white space. */
        while ((c = getchar()) == ' ' || c == '\t') {
            continue;
        }
        /* Process numbers. */
        if (c == '.' || isdigit(c)) {
            ungetc(c, stdin);
            scanf("%lf", &yylval);
            return NUM;
        }
        /* Return end-of-input. */
        if (c == EOF) {
            return 0;
        }
        /* Return a single char. */
        return c;
    }

Syntax analysis: exercise 4
Exercise 4: Bison specification ex4.y

    int yyerror(const char *s) {
        printf("%s\n", s);
        return 0;
    }

    int main(void) {
        return yyparse();
    }

1. Open a shell and go to the directory where the Bison specification file is located.
2. Run:

    bison ex4.y          (produces ex4.tab.c)
    gcc ex4.tab.c -lm    (generates the parser a.exe, or a.out on Linux; -lm links the C math library libm)
    a.exe                (on Linux: ./a.out)
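To see what the precedence declarations of exercise 4 mean in practice, the same operator table can be tried out with a small hand-written evaluator. The sketch below uses precedence climbing, which is a different technique from the table-driven parser that Bison generates, and it works on integers with a helper `ipow` instead of double and pow() so that it needs no math library. All function names (`evaluate`, `parse_expr`, `parse_primary`, `prec`, `ipow`) are assumptions made for this example, and it performs no error reporting.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

static const char *p;   /* cursor into the expression being evaluated */

static long parse_expr(int min_prec);

static void skip_ws(void) { while (*p == ' ' || *p == '\t') p++; }

/* Integer power; a negative exponent is treated like 0 in this sketch. */
static long ipow(long base, long exp)
{
    long r = 1;
    while (exp-- > 0) r *= base;
    return r;
}

/* Levels follow ex4.y: + - lowest, then * /, then NEG (3), then ^. */
static int prec(char op)
{
    switch (op) {
    case '+': case '-': return 1;
    case '*': case '/': return 2;
    case '^':           return 4;
    default:            return 0;   /* not a binary operator */
    }
}

static long parse_primary(void)
{
    skip_ws();
    if (*p == '-') {                 /* unary minus at the NEG level:  */
        p++;                         /* its operand may only absorb    */
        return -parse_expr(3);       /* operators binding tighter (^)  */
    }
    if (*p == '(') {
        p++;
        long v = parse_expr(1);
        skip_ws();
        if (*p == ')') p++;
        return v;
    }
    long v = 0;
    while (isdigit((unsigned char)*p)) v = v * 10 + (*p++ - '0');
    return v;
}

static long parse_expr(int min_prec)
{
    long lhs = parse_primary();
    for (;;) {
        skip_ws();
        char op = *p;
        int pr = prec(op);
        if (pr == 0 || pr < min_prec) return lhs;
        p++;
        /* %left: the right operand must bind tighter (pr + 1);
           %right ('^'): it may contain the same operator again (pr). */
        long rhs = parse_expr(op == '^' ? pr : pr + 1);
        switch (op) {
        case '+': lhs += rhs; break;
        case '-': lhs -= rhs; break;
        case '*': lhs *= rhs; break;
        case '/': lhs = (rhs != 0) ? lhs / rhs : 0; break; /* guard only */
        case '^': lhs = ipow(lhs, rhs); break;
        }
    }
}

/* Evaluate an integer expression using the ex4.y precedence table. */
long evaluate(const char *text)
{
    assert(text != NULL);
    p = text;
    return parse_expr(1);
}
```

With these levels, `evaluate("2^3^2")` gives 512 because '^' is right-associative, `evaluate("8-3-2")` gives 3 because '-' is left-associative, and `evaluate("-2^2")` gives -4 because '^' binds tighter than the unary minus.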

Flex + Bison

    bas.y                 ->  Bison compiler  ->  y.tab.c, y.tab.h
    bas.l                 ->  Lex compiler    ->  lex.yy.c
    y.tab.c + lex.yy.c    ->  C compiler      ->  bas.exe
    source                ->  bas.exe         ->  compiled output

Lexical + syntax analysis: exercise 5
Exercise 5: Solve exercise 4 generating the lexical analyzer with Flex. (Combined Flex + Bison use)

Exercise 5: Bison specification ex5.y

    %{
    #define YYSTYPE double
    #include <math.h>
    #include <stdio.h>
    int yylex(void);
    int yyerror(char *s);
    %}
    /* BISON Declarations */
    %token NUM
    %token PLUS MINUS TIMES DIVIDE POWER
    %token LEFT RIGHT
    %token END
    %left MINUS PLUS
    %left TIMES DIVIDE
    %left NEG
    %right POWER

Lexical + syntax analysis: exercise 5
Exercise 5: Bison specification ex5.y

    %%
    input:  /* empty string */
          | input line
          ;
    line:   END
          | exp END               { printf("\t%.10g\n", $1); }
          ;
    exp:    NUM                   { $$ = $1; }
          | exp PLUS exp          { $$ = $1 + $3; }
          | exp MINUS exp         { $$ = $1 - $3; }
          | exp TIMES exp         { $$ = $1 * $3; }
          | exp DIVIDE exp        { $$ = $1 / $3; }
          | MINUS exp %prec NEG   { $$ = -$2; }
          | exp POWER exp         { $$ = pow($1, $3); }
          | LEFT exp RIGHT        { $$ = $2; }
          ;
    %%
    int yyerror(char *s) {
        printf("%s\n", s);
        return 0;
    }

    int main(void) {
        return yyparse();
    }

Lexical + syntax analysis: exercise 5
Exercise 5: Flex specification ex5.l

    %{
    #define YYSTYPE double
    #include <stdlib.h>
    #include "ex5.tab.h"
    %}
    /* regular definitions */
    delim    [ \t]
    ws       {delim}+
    digit    [0-9]
    number   {digit}+(\.{digit}+)?(e[+-]?{digit}+)?
    %%
    {ws}      {/* no action and no return */}
    {number}  {yylval = atof(yytext); return NUM;}
    "+"       {return PLUS;}
    "-"       {return MINUS;}
    "*"       {return TIMES;}
    "/"       {return DIVIDE;}
    "^"       {return POWER;}
    "("       {return LEFT;}
    ")"       {return RIGHT;}
    "\n"      {return END;}
    %%

Lexical + syntax analysis: exercise 5
1. Open a shell and go to the directory where the Flex and Bison specification files are located.
2. Run:

    bison -d ex5.y                     (produces ex5.tab.c and ex5.tab.h)
    flex ex5.l                         (produces lex.yy.c)
    gcc ex5.tab.c lex.yy.c -lfl -lm    (generates the parser a.exe, or a.out on Linux)
    a.exe                              (on Linux: ./a.out)

Notice: the Bison specification file is compiled with the -d flag in order to generate a header file (ex5.tab.h) containing the macro definitions for the token names defined in the grammar. We must also link the libfl Flex library, which defines the yywrap function.