Syntax Analysis Part IV

Similar documents
Principles of Programming Languages

Syn S t yn a t x a Ana x lysi y s si 1

CSE302: Compiler Design

CSCI Compiler Design

Introduction to Yacc. General Description Input file Output files Parsing conflicts Pseudovariables Examples. Principles of Compilers - 16/03/2006

PRACTICAL CLASS: Flex & Bison

Lex & Yacc. by H. Altay Güvenir. A compiler or an interpreter performs its task in 3 stages:

Lex & Yacc. By H. Altay Güvenir. A compiler or an interpreter performs its task in 3 stages:

Using an LALR(1) Parser Generator

Prof. Mohamed Hamada Software Engineering Lab. The University of Aizu Japan

Syntax-Directed Translation

COMPILER CONSTRUCTION LAB 2 THE SYMBOL TABLE. Tutorial 2 LABS. PHASES OF A COMPILER Source Program. Lab 2 Symbol table

Parsing How parser works?

Yacc Yet Another Compiler Compiler

Lex & Yacc (GNU distribution - flex & bison) Jeonghwan Park

Compiler Design 1. Yacc/Bison. Goutam Biswas. Lect 8

Yacc. Generator of LALR(1) parsers. YACC = Yet Another Compiler Compiler symptom of two facts: Compiler. Compiler. Parser

LECTURE 11. Semantic Analysis and Yacc

As we have seen, token attribute values are supplied via yylval, as in. More on Yacc s value stack

An Introduction to LEX and YACC. SYSC Programming Languages

Compiler Lab. Introduction to tools Lex and Yacc

Hyacc comes under the GNU General Public License (Except the hyaccpar file, which comes under BSD License)

Lexical and Syntax Analysis

Introduction to Lex & Yacc. (flex & bison)

Big Picture: Compilation Process. CSCI: 4500/6500 Programming Languages. Big Picture: Compilation Process. Big Picture: Compilation Process.

Gechstudentszone.wordpress.com

TDDD55 - Compilers and Interpreters Lesson 3

COMPILERS AND INTERPRETERS Lesson 4 TDDD16

Compiler Construction

COMPILER CONSTRUCTION Seminar 02 TDDB44

CS143 Handout 12 Summer 2011 July 1 st, 2011 Introduction to bison

Yacc: A Syntactic Analysers Generator

Compiler Construction

Syntax Analysis Part VIII

Compiler Construction Assignment 3 Spring 2018

Automatic Scanning and Parsing using LEX and YACC

TDDD55- Compilers and Interpreters Lesson 3

Lab 2. Lexing and Parsing with Flex and Bison - 2 labs

Lesson 10. CDT301 Compiler Theory, Spring 2011 Teacher: Linus Källberg

Compiler construction in4020 lecture 5

Big Picture: Compilation Process. CSCI: 4500/6500 Programming Languages. Big Picture: Compilation Process. Big Picture: Compilation Process

A Bison Manual. You build a text file of the production (format in the next section); traditionally this file ends in.y, although bison doesn t care.

Context-free grammars

Compiler Construction

Compiler construction 2002 week 5

Syntax Analysis The Parser Generator (BYacc/J)

COMPILER (CSE 4120) (Lecture 6: Parsing 4 Bottom-up Parsing )

Compiler Construction: Parsing

Parsing. COMP 520: Compiler Design (4 credits) Professor Laurie Hendren.

Compiler construction 2005 lecture 5

Syntax-Directed Translation Part I

Lexical and Parser Tools

Preparing for the ACW Languages & Compilers

Yacc: Yet Another Compiler-Compiler

TDDD55- Compilers and Interpreters Lesson 2

HW8 Use Lex/Yacc to Turn this: Into this: Lex and Yacc. Lex / Yacc History. A Quick Tour. if myvar == 6.02e23**2 then f(..!

Compilation 2013 Parser Generators, Conflict Management, and ML-Yacc

Lex and Yacc. A Quick Tour

Parser Tools: lex and yacc-style Parsing

Prof. Mohamed Hamada Software Engineering Lab. The University of Aizu Japan

CS 426 Fall Machine Problem 1. Machine Problem 1. CS 426 Compiler Construction Fall Semester 2017

CSC 467 Lecture 3: Regular Expressions

Etienne Bernard eb/textes/minimanlexyacc-english.html

Parser Tools: lex and yacc-style Parsing

Building a Parser Part III

CSE302: Compiler Design

Chapter 2: Syntax Directed Translation and YACC

Lex and Yacc. More Details

Lexical Analysis. Implementing Scanners & LEX: A Lexical Analyzer Tool

THE COMPILATION PROCESS EXAMPLE OF TOKENS AND ATTRIBUTES

Chapter 3 -- Scanner (Lexical Analyzer)

Chapter 3 Lexical Analysis

CS 403: Scanning and Parsing

(F)lex & Bison/Yacc. Language Tools for C/C++ CS 550 Programming Languages. Alexander Gutierrez

UNIT III & IV. Bottom up parsing

I. OVERVIEW 1 II. INTRODUCTION 3 III. OPERATING PROCEDURE 5 IV. PCLEX 10 V. PCYACC 21. Table of Contents

CS415 Compilers. LR Parsing & Error Recovery

Compiler course. Chapter 3 Lexical Analysis

Programming Project II

EDAN65: Compilers, Lecture 06 A LR parsing. Görel Hedin Revised:

A Pragmatic Introduction to Abstract Language Analysis with Flex and Bison under Windows DRAFT

Applications of Context-Free Grammars (CFG)

I 1 : {E E, E E +E, E E E}

Compil M1 : Front-End

G53CMP: Lecture 4. Syntactic Analysis: Parser Generators. Henrik Nilsson. University of Nottingham, UK. G53CMP: Lecture 4 p.1/32

flex is not a bad tool to use for doing modest text transformations and for programs that collect statistics on input.

LR Parsing LALR Parser Generators

IN4305 Engineering project Compiler construction

CS 11 Ocaml track: lecture 6

Exercise 1 Codify in Yacc the generator of the abstract trees of the language defined by the following grammar:

LR Parsing. Leftmost and Rightmost Derivations. Compiler Design CSE 504. Derivations for id + id: T id = id+id. 1 Shift-Reduce Parsing.

Typical tradeoffs in compiler design are: speed of compilation size of the generated code speed of the generated code Speed of Execution Foundations

Conflicts in LR Parsing and More LR Parsing Types

Marcello Bersani Ed. 22, via Golgi 42, 3 piano 3769

DEMO A Language for Practice Implementation Comp 506, Spring 2018

Compiler Design Prof. Y. N. Srikant Department of Computer Science and Automation Indian Institute of Science, Bangalore

Syntax Directed Translation

Over-simplified history of programming languages

CS143 Handout 04 Summer 2011 June 22, 2011 flex In A Nutshell

Bottom-Up Parsing. Lecture 11-12

Transcription:

Syntax Analysis Part IV Chapter 4: Bison Slides adapted from : Robert van Engelen, Florida State University

Yacc and Bison Yacc (Yet Another Compiler Compiler) Generates LALR(1) parsers Bison Improved version of Yacc

Creating an LALR(1) Parser with Yacc/Bison yacc specification yacc.y Bison compiler yacc.tab.c yacc.tab.c C compiler a.out input stream a.out output stream

Bison Specification A Bison specification consists of three parts: Bison declarations, and C declarations within %{ %} %% translation rules %% user-defined auxiliary procedures The translation rules are CFG productions with actions: production 1 { semantic action 1 } production 2 { semantic action 2 } production n { semantic action n }

Writing a Grammar in Yacc Productions in Yacc are of the form Nonterminal: tokens/nonterminals { action } tokens/nonterminals { action } ; Tokens that are single characters can be used directly within productions, e.g. + Named tokens must be declared first in the declaration part using %token TokenName

Synthesized Attributes Semantic actions may refer to attributes of terminals and nonterminals in a production: X : Y 1 Y 2 Y 3 Y n { action } $$ refers to the value of the attribute of X $i refers to the value of the attribute of Y i Example : factor : ( expr ) { $$=$2; } factor.val=x ( expr.val=x ) $$=$2

Example I : Incomplete Code %{ #include <ctype.h> %} %token DIGIT %% line : expr \n { printf( %d\n, $1); } ; expr : expr + term { $$ = $1 + $3; } term { $$ = $1; } ; term : term * factor { $$ = $1 * $3; } factor { $$ = $1; } ; factor : ( expr ) { $$ = $2; } DIGIT { $$ = $1; } ; %% int yylex() { int c = getchar(); if (isdigit(c)) { yylval = c- 0 ; return DIGIT; } return c; } Attribute of token (stored in yylval) Attribute of expr (parent) Attribute of expr (child) Attribute of term (child) Example of a very crude lexical analyzer invoked by the parser

Example I : Complete Code %{ #include <ctype.h> #include <stdio.h> int yylex(); void yyerror(char *s); %} %token DIGIT %% line : expr '\n' { printf("%d\n", $1); } ; expr : expr '+' term { $$ = $1 + $3; } term { $$ = $1; } ; term : term '*' factor { $$ = $1 * $3; } factor { $$ = $1; } ; factor : '(' expr ')' { $$ = $2; } DIGIT { $$ = $1; } ;

Example I : Complete Code %% int yylex() { int c = getchar(); if (isdigit(c)) { yylval = c-'0'; return DIGIT; } return c; } int main() { if (yyparse()!= 0 ) fprintf(stderr, "Abnormal exit\n"); return 0; } void yyerror(char *s) { fprintf(stderr, "Error: %s\n", s); } bison calculator.y gcc calculator.tab.c./a.out 5+8*(9+2)*3 269 ^D

Tokens Two types of tokens: literal and symbolic Literal tokens represented using the corresponding C character constant (ASCII code) Symbolic tokens represented as numbers higher than any possible character s code, so they will not conflict with any literal tokens Numbers can be forced by declaration %token NUMBER 621

Tokens When using symbolic tokens, run Bison with d option to create a C header file with definitions If Bison is combined with Flex, add #include xxx.tab.h in lexer file, where xxx.y is the source file

Symbol Values Both tokens and nonterminals have an associated semantic value For token, the value is stored in the C variable yylval For nonterminals, the value is stored in the $$, $1, pseudo-variables The associated semantic value is of type YYSTYPE, declared as int by default YYSTYPE can be redefined using the C instruction #define

Dealing with Ambiguous Grammars A description of parsing action conflicts can be obtained using the -v option, which produces an additional file y.output Reduce/reduce conflicts solved by using the conflicting production listed first Shift/reduce conflicts resolved in favor of shift

Dealing with Ambiguous Grammars We can also deal with ambiguous grammars by defining operator precedence levels and left/right associativity of the operators Example: %left + - %left * / %right UMINUS

Dealing with Ambiguous Grammars Productions are also given precedence and associativity, inherited from their rightmost nonterminal Example: item [E E + E ] and lookahead + resolved with reduction (+ left associative) Example: item [E E + E ] and lookahead * resolved with shift (* higher precedence)

Dealing with Ambiguous Grammars Can force different precedence by attaching to a production %prec terminal Symbol terminal can be a placeholder: this terminal is never used by lexical analyzer, but indicates a precedence (see later examples)

Combining Bison with Flex Yacc or Bison specification yacc.y Lex or Flex specification lex.l and token definitions y.tab.h lex.yy.c y.tab.c Yacc or Bison compiler Lex or Flex compiler C compiler y.tab.c y.tab.h lex.yy.c a.out input stream a.out output stream

Example II : Bison %{ #include <ctype.h> #include <stdio.h> #define YYSTYPE double %} %token NUMBER %left + - %left * / %right UMINUS double type for nonterminal attributes and yylval terminal placeholder

Example II : Bison %% lines : lines expr \n { printf( %g\n, $2); } lines \n /* empty */ ; expr : expr + expr { $$ = $1 + $3; } expr - expr { $$ = $1 - $3; } expr * expr { $$ = $1 * $3; } expr / expr { $$ = $1 / $3; } ( expr ) { $$ = $2; } - expr %prec UMINUS { $$ = -$2; } NUMBER ;

Example II : Bison %% int main() { if (yyparse()!= 0) fprintf(stderr, Abnormal exit\n ); return 0; } void yyerror(char *s) { fprintf(stderr, Error: %s\n, s); } Run the parser Invoked by parser to report parse errors

Example II : Flex %option noyywrap %{ #include example.tab.h extern YYSTYPE yylval; %} number [0-9]+\.? [0-9]*\.[0-9]+ Generated by Bison, contains #define NUMBER xxx Defined in example.tab.c

Example II : Flex %% [ \t] { /* skip blanks */ } {number} { sscanf(yytext, %lf, &yylval); return NUMBER; } \n. { return yytext[0]; } bison d example.y flex example.l gcc example.tab.c lex.yy.c./a.out 2.4*(6+13) 45.6 ^D

Error Recovery in Yacc %{ %} %% lines : lines expr \n { printf( %g\n, $2; } lines \n /* empty */ error \n { yyerror( reenter last line: ); yyerrok; } ; Error production: set error mode and skip input until newline Reset parser to normal mode

Symbol Values If several types are needed for grammar symbols, a union type must be defined The %union directive identifies all of the possible C types that a symbol value can have The field declarations are copied verbatim into a C union declaration of the type YYSTYPE In the absence of a %union declaration, Bison defines YYSTYPE to be int

Symbol Values Associate the types declared in %union with specific grammar symbols using the %type declaration for nonterminal the %token declaration for tokens

Example %union { double dval; char *sval; }... %token <dval> REAL %token <sval> STRING %type <dval> expr // token // token // nonterminal

Exercise III Write an interpreter for combination of arithmetic expressions and Boolean expressions Use %union directive to deal with different data types

Exercise III : Flex %option noyywrap %{ %} #include <stdlib.h> #include "boolean.tab.h" fract [0-9]+\.? [0-9]*\.[0-9]+

Exercise III : Flex %% [ \t] { /* skip blanks */ } "&&" { return AND; } " " { return OR; } "!" { return NOT; } "<" { return LT; } "<=" { return LE; } ">" { return GT; } ">=" { return GE; } "==" { return EQ; } {fract} { yylval.val = atof(yytext); return FRACT; } \n. { return yytext[0]; }

Exercise III : Bison %{ #include <ctype.h> #include <stdio.h> int yylex(); void yyerror(char *s); %} %union { double val; int bool; // 1 == true, 0 == false }

Exercise III : Bison %token <val> FRACT %type <val> expr %type <bool> comp %type <bool> bexpr %left %left %right %left %left %right '+' '-' '*' '/' UMINUS OR AND NOT %nonassoc EQ LT GT LE GE

Exercise III : Bison %% lines : lines bexpr '\n' { printf("%d\n", $2); } lines '\n' /* empty */ ; bexpr : bexpr OR bexpr { if ($1 == 1 $3 == 1) $$ = 1; else $$ = 0; } bexpr AND bexpr { if ($1 == 1 && $3 == 1) $$ = 1; else $$ = 0; } NOT bexpr { if ($2 == 1 ) $$ = 0; else $$ = 1; } '(' bexpr ')' { $$ = $2; } comp { $$ = $1; } ;

Exercise III : Bison comp : expr LT expr { if ($1 < $3) $$ = 1; else $$ = 0; } expr LE expr { if ($1 <= $3) $$ = 1; else $$ = 0; } expr GE expr { if ($1 >= $3) $$ = 1; else $$ = 0; } expr GT expr { if ($1 > $3) $$ = 1; else $$ = 0; } expr EQ expr { if ($1 == $3) $$ = 1; else $$ = 0; } // '(' comp ')' { $$ = $2; } ;

Exercise III : Bison expr : expr '+' expr { $$ = $1 + $3; } expr '-' expr { $$ = $1 - $3; } expr '*' expr { $$ = $1 * $3; } expr '/' expr { $$ = $1 / $3; } '(' expr ')' { $$ = $2; } '-' expr %prec UMINUS { $$ = -$2; } FRACT ;

Exercise III : Bison %% int main() { if (yyparse()!= 0) fprintf(stderr, "Abnormal exit\n"); return 0; } void yyerror(char *s) { fprintf(stderr, "Error: %s\n", s); }

Summary of Commands To run Bison + Flex : bison d o parser.c parser.y flex o scanner.c scanner.l gcc o parser parser.c scanner.c./parser < input > output