(F)lex & Bison/Yacc. Language Tools for C/C++ CS 550 Programming Languages. Alexander Gutierrez

Lex and Flex Overview
Lex/Flex is a scanner generator for C/C++. It reads pairs of regular expressions and code and produces a lexical analyzer (scanner) written in C/C++. Lex was the original generator, released under a proprietary license; Flex is a separate project that reimplemented Lex as open source software. Lex was once the standard program, but Flex is now the preferred version. The two are practically the same and Lex is harder to obtain, so we will refer to Flex.
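To make "pairs of regular expressions and code" concrete, here is a minimal sketch in plain C using the POSIX regex functions. It is not what Flex generates (Flex compiles the patterns into a table-driven automaton), and the token kinds, the rules table and count_tokens are hypothetical names invented for this illustration. It does follow the same matching policy as Lex/Flex: at each position the longest match wins, and ties go to the earliest rule.

```c
#include <regex.h>
#include <string.h>

/* Hypothetical token kinds for this sketch only. */
enum kind { K_NUMBER, K_WORD, K_SPACE, K_COUNT };

/* One rule: an anchored extended regex plus the token kind it produces. */
struct rule {
    const char *pattern;
    enum kind kind;
};

static const struct rule rules[] = {
    { "^[0-9]+",     K_NUMBER },
    { "^[A-Za-z_]+", K_WORD   },
    { "^[ \t]+",     K_SPACE  },
};

/* Tokenizes `input`, counting how many tokens of each kind were seen.
   Returns 0 on success, -1 if some character matches no rule
   (or if a pattern fails to compile). */
int count_tokens(const char *input, int counts[K_COUNT])
{
    enum { NRULES = sizeof rules / sizeof rules[0] };
    regex_t compiled[NRULES];
    int ncompiled = 0;
    int status = 0;

    memset(counts, 0, K_COUNT * sizeof counts[0]);

    /* Compile every pattern once, up front. */
    for (; ncompiled < NRULES; ncompiled++) {
        if (regcomp(&compiled[ncompiled], rules[ncompiled].pattern,
                    REG_EXTENDED) != 0) {
            status = -1;
            break;
        }
    }

    while (status == 0 && *input != '\0') {
        regoff_t best_len = 0;
        int best_rule = -1;

        /* Try every rule at the current position; keep the longest match.
           The strict '>' means an earlier rule wins a tie. */
        for (int i = 0; i < NRULES; i++) {
            regmatch_t m;
            if (regexec(&compiled[i], input, 1, &m, 0) == 0 &&
                m.rm_eo > best_len) {
                best_len = m.rm_eo;
                best_rule = i;
            }
        }
        if (best_rule < 0) {
            status = -1;            /* no rule matches here */
            break;
        }
        counts[rules[best_rule].kind]++;   /* the "action" for this rule */
        input += best_len;
    }

    for (int i = 0; i < ncompiled; i++)
        regfree(&compiled[i]);
    return status;
}
```

Calling count_tokens("abc 123 de", counts) should record two words, one number and two runs of spaces. In a real Flex file the action would be arbitrary C code in braces rather than a counter increment.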

Yacc and Bison Overview
Yacc/Bison is a parser generator (a "compiler-compiler") for C/C++. It reads an LALR grammar and creates a parser. This parser can be used as a component of a compiler by feeding it tokens generated by a lexical analyzer. In this case, we will use Flex to generate the tokens for Bison. As with Lex and Flex, Bison was created as an open source version of Yacc. We will refer to Bison for this presentation.

Which to use?
We will use Flex & Bison:
http://flex.sourceforge.net/
http://www.gnu.org/software/bison/
These are freely available (Flex under a BSD license, Bison under the GNU GPL); Lex & Yacc are not (AT&T proprietary). Lex & Yacc were formerly standard on Unix machines but have now largely been superseded by Flex & Bison. Since they're basically the same, we only really care about Flex & Bison. Both are installed on tux.

Lex/Flex on tux.cs.drexel.edu
Only Flex is available. Command name: flex. Why does typing lex seem to work? On tux, lex is a symlink to flex (lex -> flex).

Yacc/Bison on tux.cs.drexel.edu
Only Bison is available. Command name: bison. Typing yacc seems to work, so is it just a symlink too? Mostly, yes: on tux, invoking yacc calls a script that runs bison in yacc-compatibility mode. Why is there a yacc-compatibility mode for bison if they are basically the same? To account for some POSIX differences and minor quirks that we don't really care about. Just use bison.

The Bigger Picture
We can use Flex and Bison to (relatively) easily implement our own programming language. To do this, we need to write the input specifications for Flex and Bison. For Flex, we need to determine what tokens our language consists of and how each token can be described using a regular expression. For Bison, we need to create an (LALR) grammar over these tokens, with an action attached to each rule that says what to do when the rule is recognized (for example, print a message or emit code). Flex and Bison each produce a piece of C/C++ code, which we can compile using an appropriate C/C++ compiler.

Balanced Parentheses Example
The code for this example can be found at: https://www.cs.drexel.edu/~jjohnson/2012-13/spring/cs550/programs/grammars/
Files: paren.l, paren.y
This example looks at the language of balanced parentheses. First, we will look at the regular expression file we give to Flex. Next, we will look at the grammar we give to Bison. Finally, we will compile and test our compiler.

paren.l

    %{
    #include "paren.tab.h"
    %}
    %%
    \(      { return LEFTPAREN; }
    \)      { return RIGHTPAREN; }
    .
    \n      { return 0; }
    %%
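To see what these rules ask the scanner to do, here is a hand-written C function with roughly the same behavior. It is only a sketch: the real scanner is the generated yylex(), which reads from an input stream rather than a string. The name next_token and the string-cursor interface are hypothetical, and the token values 258 and 259 are an assumption about what bison typically writes into paren.tab.h.

```c
/* Token codes as `bison -d` would typically define them in paren.tab.h.
   The exact numeric values are an assumption made for this sketch. */
enum { LEFTPAREN = 258, RIGHTPAREN = 259 };

/* Hypothetical stand-in for the generated yylex(): returns the next token
   found at *cursor and advances the cursor past it. */
int next_token(const char **cursor)
{
    while (**cursor != '\0') {
        char c = *(*cursor)++;
        switch (c) {
        case '(':
            return LEFTPAREN;
        case ')':
            return RIGHTPAREN;
        case '\n':
            return 0;   /* 0 tells the parser "end of input" */
        default:
            break;      /* like the `.` rule with no action: discard */
        }
    }
    return 0;           /* running out of characters also ends the stream */
}
```

Because a newline returns 0, the parser treats each line as one complete program, which is why a.out reports its result as soon as Enter is pressed.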

paren.y

    %{
    #include <string.h>
    #include <stdio.h>
    %}
    %token LEFTPAREN RIGHTPAREN
    %%
    S0: S1 S0                        { printf("S0 => S1 S0\n"); }
      | S1                           { printf("S0 => S1\n"); }
      ;
    S1: LEFTPAREN S2 RIGHTPAREN      { printf("S1 => (S2)\n"); }
      | LEFTPAREN RIGHTPAREN         { printf("S1 => ()\n"); }
      ;
    S2: S1 S2                        { printf("S2 => S1 S2\n"); }
      | S1                           { printf("S2 => S1\n"); }
      ;
    %%
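One way to get a feel for this grammar is to write a recognizer for it by hand. The sketch below uses recursive descent, a top-down technique, and not the table-driven LALR parser that Bison actually generates. Each function corresponds to one nonterminal and records its rule only after its children have been parsed, so the trace appears in the same bottom-up order as the printf actions. All names here (struct parser, emit, parse_parens) are hypothetical, and the input is a plain string of parentheses instead of a token stream.

```c
#include <stdio.h>
#include <string.h>

struct parser {
    const char *pos;   /* next unread character */
    char *out;         /* NUL-terminated trace buffer */
    size_t cap;        /* capacity of out; must be at least 1 */
};

/* Appends one line to the trace, truncating if the buffer is full. */
static void emit(struct parser *p, const char *line)
{
    size_t used = strlen(p->out);
    if (used < p->cap)
        snprintf(p->out + used, p->cap - used, "%s\n", line);
}

static int parse_s1(struct parser *p);

/* S2: S1 S2 | S1 */
static int parse_s2(struct parser *p)
{
    if (!parse_s1(p))
        return 0;
    if (*p->pos == '(') {              /* another S1 follows */
        if (!parse_s2(p))
            return 0;
        emit(p, "S2 => S1 S2");
    } else {
        emit(p, "S2 => S1");
    }
    return 1;
}

/* S1: LEFTPAREN S2 RIGHTPAREN | LEFTPAREN RIGHTPAREN */
static int parse_s1(struct parser *p)
{
    if (*p->pos != '(')
        return 0;
    p->pos++;
    if (*p->pos == ')') {              /* the "()" alternative */
        p->pos++;
        emit(p, "S1 => ()");
        return 1;
    }
    if (!parse_s2(p) || *p->pos != ')')
        return 0;
    p->pos++;
    emit(p, "S1 => (S2)");
    return 1;
}

/* S0: S1 S0 | S1 */
static int parse_s0(struct parser *p)
{
    if (!parse_s1(p))
        return 0;
    if (*p->pos == '(') {
        if (!parse_s0(p))
            return 0;
        emit(p, "S0 => S1 S0");
    } else {
        emit(p, "S0 => S1");
    }
    return 1;
}

/* Returns 1 if `input` is in the language, 0 otherwise.
   The rules applied are written to `trace`, one per line. */
int parse_parens(const char *input, char *trace, size_t cap)
{
    struct parser p = { input, trace, cap };
    trace[0] = '\0';
    if (parse_s0(&p) && *p.pos == '\0')
        return 1;
    emit(&p, "syntax error");
    return 0;
}
```

With a buffer such as char buf[256], parse_parens("(())", buf, sizeof buf) returns 1 and leaves the four lines S1 => (), S2 => S1, S1 => (S2), S0 => S1 in buf.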

Compiling on tux
All we need are these two files, paren.l and paren.y, in our directory:

    $ ls
    paren.l  paren.y

We can compile using the following sequence of commands (NOTE: ORDER IS VERY IMPORTANT):

    $ bison -d paren.y
    $ flex paren.l
    $ gcc paren.tab.c lex.yy.c -ly -lfl

Further explanation follows...

Running Bison
We run bison first because it produces the definitions of the tokens the parser accepts, which the lexical analyzer needs:

    $ bison -d paren.y

The -d option makes bison write a header file alongside the parser source. Remember this line in paren.l: #include "paren.tab.h". paren.tab.h is the header file that bison creates with this option; it defines LEFTPAREN and RIGHTPAREN, so it must exist before the scanner code is compiled. Our directory now looks like:

    $ ls
    paren.l  paren.tab.c  paren.tab.h  paren.y

Running Flex
Now we can simply run flex to produce our lexical analyzer:

    $ flex paren.l

This produces another piece of code, lex.yy.c:

    $ ls
    lex.yy.c  paren.l  paren.tab.c  paren.tab.h  paren.y

Next we can compile the whole thing and try it out.

Compiling the compiler
Now, we use the last command mentioned earlier:

    $ gcc paren.tab.c lex.yy.c -ly -lfl

Here, we are using gcc to compile the generated code and link it against the bison (yacc) and flex libraries. The order of the arguments is important for the resulting compiler to work: the libraries are listed after the source files that use them. As usual with the GNU C/C++ compilers, the result is an executable named a.out by default.
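It may help to know what the two libraries contribute. As far as the generated code is concerned, they mainly supply a few small default functions that paren.l and paren.y never define: a main() that starts the parser, an error reporter, and an end-of-input hook. The sketch below shows user-written equivalents of the latter two, assuming the conventional names and signatures; the exact library implementations may differ.

```c
#include <stdio.h>

/* Called by the scanner when it reaches end of input.
   Returning 1 means "there is no further input file to switch to". */
int yywrap(void)
{
    return 1;
}

/* Called by the parser when it detects an error.
   For this grammar, msg is typically the string "syntax error". */
void yyerror(const char *msg)
{
    fprintf(stderr, "%s\n", msg);
}
```

With these two functions and a main() that simply calls yyparse() added to the third section of paren.y, it should be possible to build without -ly -lfl, although that variation has not been tried on tux here.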

Using Our New Language
We can test to make sure it works by running the executable and giving it input.

    $ ./a.out
    (())
    S1 => ()
    S2 => S1
    S1 => (S2)
    S0 => S1

I entered a string that is in the language, (()), and the parser executed the associated code. In this case, the code that the language instructs it to run is the set of printf statements we saw earlier in the grammar. In other words, the function of this interpreter is to display its own parsing via its grammar rules.

Using Our New Language (cont.)
Another example input:

    $ ./a.out
    (()()(
    S1 => ()
    S1 => ()
    syntax error

In this case, I gave it a malformed program. The input is not in the recognized language because its parentheses are unbalanced, so the parser reported a syntax error. The grammar that we gave it is being enforced.
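As an independent cross-check on which inputs the grammar should accept, membership in this particular language can also be decided with a simple depth counter. This is a sketch of a different technique, not anything Flex or Bison produces, and is_balanced is a hypothetical name. Like the scanner, it ignores characters other than parentheses, and like the grammar, it rejects input that contains no parentheses at all.

```c
#include <stddef.h>

/* Returns 1 if the parentheses in `s` are balanced and there is at least
   one pair, otherwise 0. Other characters are ignored. */
int is_balanced(const char *s)
{
    long depth = 0;      /* number of currently open parentheses */
    size_t seen = 0;     /* how many parentheses we have read */

    for (; *s != '\0'; s++) {
        if (*s == '(') {
            depth++;
            seen++;
        } else if (*s == ')') {
            seen++;
            if (--depth < 0)
                return 0;    /* a ')' with nothing to close */
        }
    }
    return seen > 0 && depth == 0;
}
```

For the malformed input (()()( the counter ends at depth 2 instead of 0, which matches the syntax error reported by a.out.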

Summary
Use flex and bison on tux (already installed). Design your own language by writing tokenization rules as regular expressions for Flex and a grammar for Bison. Implement the language by giving these specifications to Flex and Bison, which generate a lexical analyzer and a parser respectively. Compile with a C/C++ compiler to realize your very own programming language.

Reference
John R. Levine, flex & bison, O'Reilly & Associates. This book can be found through Drexel's library website for free. flex & bison is basically an updated version of the old lex & yacc book because they are practically the same utilities.