Applications of Context-Free Grammars (CFG)

Parsers. The YACC Parser-Generator.
by: Saleh Al-shomrani

(1) Parsers

Parsers are programs that create parse trees from source programs.

Many aspects of a programming language have a structure that may be described by regular expressions (REs). For example, identifiers can be described by an RE and recognized by a lex-generated analyzer.

However, some very important aspects of programming languages cannot be represented by REs. Typical languages use parentheses and/or brackets in a nested and balanced fashion, and no RE can describe nesting of arbitrary depth.

Example #1: A grammar for balanced parentheses is G_bal = ({B}, {(, )}, P, B), where P consists of:

    B -> BB | (B) | ε

Example #2: A grammar that generates the possible sequences of if and else in C (represented as i and e, respectively) is:

    S -> ε | SS | iS | iSeS

Q: Can we generate the following strings using the above grammar, and why: ieie, iie, ei, iei, ieeii? How about iieie?
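The two example grammars can also be explored mechanically. The following is a minimal sketch in C, assuming the grammars are read as B -> BB | (B) | ε and S -> ε | SS | iS | iSeS (the ε alternatives are an assumption; without them neither grammar could derive a finite string). The function names is_balanced and is_if_else_sequence are made up for this illustration. Both functions rely on a counting argument rather than on building a parse tree: a string of parentheses is balanced when no prefix closes more than it opens and the counts agree at the end, and a string of i's and e's is derivable when every e has an unmatched i somewhere to its left.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if s is generated by  B -> BB | (B) | epsilon,  else 0. */
int is_balanced(const char *s)
{
    int depth = 0;                  /* '(' seen that still wait for a ')' */

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (*s == '(') {
            depth++;
        } else if (*s == ')') {
            if (depth == 0)
                return 0;           /* ')' with no matching '(' */
            depth--;
        } else {
            return 0;               /* symbol outside the alphabet */
        }
    }
    return depth == 0;              /* every '(' must have been closed */
}

/* Returns 1 if s is generated by  S -> epsilon | SS | iS | iSeS,  else 0. */
int is_if_else_sequence(const char *s)
{
    int open_ifs = 0;               /* ifs that could still take an else */

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (*s == 'i') {
            open_ifs++;
        } else if (*s == 'e') {
            if (open_ifs == 0)
                return 0;           /* else with no if to attach to */
            open_ifs--;
        } else {
            return 0;               /* symbol outside the alphabet */
        }
    }
    return 1;                       /* an if without an else is allowed */
}
```

Calling is_if_else_sequence on each string from the question shows which ones the grammar accepts; for instance, "ei" is rejected because its else has no preceding if. A counter is enough here because each grammar has a single kind of bracket; a parser generator such as yacc is needed once the nesting involves several interacting constructs.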

The answer for the last string, iieie, is yes, because it corresponds to a C program whose structure is like:

    if (Condition) {
        if (Condition) Statement;
        else Statement;
        if (Condition) Statement;
        else Statement;
    }

Compilation Sequence

    Source code:
        a = b + c * d

    Lexical Analyzer -> tokens:
        id1 = id2 + id3 * id4

    Syntax Analyzer -> syntax tree:
              =
             / \
          id1   +
               / \
            id2   *
                 / \
              id3   id4

    Code Generator -> generated code:
        load  id3
        mul   id4
        add   id2
        store id1
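The last step of the compilation sequence, from syntax tree to generated code, can be sketched in a few lines of C. This is only an illustration under stated assumptions: the target is taken to be a one-accumulator machine with the load/mul/add/store instructions shown on the slide, only '+' and '*' are handled, and struct node, gen_expr and gen_assign are hypothetical names. The sketch gives up (returns -1) when both operands of an operator are subtrees, since that case would need a temporary location.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* A syntax-tree node: a leaf holds an identifier name; an interior
   node holds an operator ('+' or '*') and two children. */
struct node {
    char op;                        /* 0 for a leaf */
    const char *name;               /* identifier, used only by leaves */
    const struct node *left, *right;
};

/* Appends one "instr arg" line to the NUL-terminated buffer out. */
static void emit(char *out, size_t cap, const char *instr, const char *arg)
{
    size_t used = strlen(out);
    snprintf(out + used, cap - used, "%s %s\n", instr, arg);
}

static const char *mnemonic(char op)
{
    assert(op == '+' || op == '*');
    return op == '+' ? "add" : "mul";
}

/* Emits code that leaves the value of e in the accumulator.
   Returns 0 on success, -1 if the tree would need a temporary. */
static int gen_expr(const struct node *e, char *out, size_t cap)
{
    assert(e != NULL && out != NULL);
    if (e->op == 0) {                       /* leaf: just load it */
        emit(out, cap, "load", e->name);
        return 0;
    }
    if (e->right->op == 0) {                /* right operand is a leaf */
        if (gen_expr(e->left, out, cap) != 0)
            return -1;
        emit(out, cap, mnemonic(e->op), e->right->name);
        return 0;
    }
    if (e->left->op == 0) {                 /* '+' and '*' commute, so */
        if (gen_expr(e->right, out, cap) != 0)  /* compute the subtree first */
            return -1;
        emit(out, cap, mnemonic(e->op), e->left->name);
        return 0;
    }
    return -1;                              /* both operands are subtrees */
}

/* Emits code for  target = e  into out (capacity cap bytes). */
int gen_assign(const char *target, const struct node *e, char *out, size_t cap)
{
    assert(target != NULL && out != NULL && cap > 0);
    out[0] = '\0';
    if (gen_expr(e, out, cap) != 0)
        return -1;
    emit(out, cap, "store", target);
    return 0;
}
```

Building the tree for id1 = id2 + id3 * id4 and passing it to gen_assign yields the same four instructions as the slide, because the multiplication subtree is evaluated before the addition that uses it.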

(2) The YACC Parser-Generator

Yacc and lex are very closely related, so it should not be surprising that the two program generators are often used in combination. The structure of a yacc program closely resembles the structure of a lex program. A yacc program has the following structure:

    <Declarations>
    %%
    <Rules>
    %%
    <Programs>

A yacc program describes the production rules for a context-free grammar. (A yacc program usually has a ".y" suffix.)

Yacc generates a procedure yyparse() that processes a stream of tokens generated by yylex() and attempts to match a sentence in the specified language. Notice that yyparse() calls the scanner whenever it needs the next token. The scanner is called yylex(); it may or may not be generated by lex. The output of yacc is placed in a file called y.tab.c, unless otherwise specified.

Building a compiler with Lex/Yacc

    bas.y  -> yacc -> y.tab.c (yyparse), y.tab.h
    bas.l  -> lex  -> lex.yy.c (yylex)
    y.tab.c + lex.yy.c -> cc -> bas.exe
    source -> bas.exe -> compiled output

Commands to create our compiler, bas.exe, are:

    yacc -d bas.y                     # create y.tab.h, y.tab.c
    lex bas.l                         # create lex.yy.c
    cc lex.yy.c y.tab.c -o bas.exe    # compile/link
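Since yylex() does not have to come from lex, it may help to see what a hand-written one could look like. The sketch below is an assumption-laden illustration, not the scanner used in these slides: the token codes TK_ID and TK_NUM and the helper set_input are hypothetical (yacc would normally write the token codes to y.tab.h), and yylval is declared as a plain int, which is yacc's default attribute type. What it does follow is the contract yyparse() relies on: each call returns the code of the next token, single-character tokens are returned as their own character value, and 0 signals the end of the input.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token codes; yacc would normally generate these. */
enum { TK_ID = 258, TK_NUM = 259 };

static const char *input_text = "";     /* text still to be scanned */
int yylval;                             /* attribute of the last token */

/* Points the scanner at a new NUL-terminated input string. */
void set_input(const char *text)
{
    assert(text != NULL);
    input_text = text;
}

/* Returns the next token code, or 0 at end of input. */
int yylex(void)
{
    /* Skip white space between tokens. */
    while (isspace((unsigned char)*input_text))
        input_text++;

    if (*input_text == '\0')
        return 0;                       /* end of input */

    if (isdigit((unsigned char)*input_text)) {
        yylval = 0;                     /* the number's value is its attribute */
        while (isdigit((unsigned char)*input_text))
            yylval = yylval * 10 + (*input_text++ - '0');
        return TK_NUM;
    }

    if (isalpha((unsigned char)*input_text) || *input_text == '_') {
        while (isalnum((unsigned char)*input_text) || *input_text == '_')
            input_text++;
        return TK_ID;
    }

    /* Anything else is a single-character token such as '=' or '+'. */
    return (unsigned char)*input_text++;
}
```

After set_input("a = 42 + b"), successive calls to yylex() return TK_ID, '=', TK_NUM (with yylval set to 42), '+', TK_ID, and finally 0. A yacc-generated yyparse() would make exactly these calls, one each time it needs another token.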

You may use a lex-generated version of yylex() by simply including the statement:

    #include "lex.yy.c"

in the program section of the definition file.

The declaration section contains declaration statements such as:

    %token TK_ID
    %start set

The heart of a yacc program is the rules section. This section describes the grammar productions and the actions to perform once those productions are recognized. For example, a typical grammar rule might be the following:

    set : '{' list_of_ids '}' ;

Here set and list_of_ids are variables (nonterminals), and '{' and '}' are terminals. (The semicolon in the above rule definition denotes the end of a sequence of production rules.) Similarly, we might define:

    list_of_ids : TK_ID
                | TK_ID ',' list_of_ids
                ;

where list_of_ids is a variable and TK_ID and ',' are tokens. Notice that alternation is denoted by | in our grammar rules.

Yacc works with attribute grammars, i.e., those grammars in which every nonterminal and terminal may have an associated attribute or value. In yacc actions, these attributes may be read and/or set when needed. The attribute of the variable (nonterminal) on the left-hand side is denoted by $$. The attributes of the other elements in a production may be accessed by their position: $1 is the first symbol on the right-hand side, $2 the second, and so on. For example, if the variable EXPR has an integer attribute, then the following production rule and action are appropriate:

    EXPR : EXPR '+' EXPR   { $$ = $1 + $3; }
         ;

Example 1: Here is a lex program (exam-l.l) that removes comments, tabs, newlines, etc. and returns {, }, ',', ;, =, TKID, and TK_COLORS as tokens, together with the yacc program (exam-y.y) that uses it.
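Before turning to the example files, the $$ / $1 / $3 notation described above can be pictured as operations on a stack of attribute values. The following C sketch is a simplified model and not yacc's actual generated code: value_stack, push_value, stack_depth and reduce_add are hypothetical names. It assumes that when the parser reduces by EXPR : EXPR '+' EXPR, the attributes of the three right-hand-side symbols are the top three stack entries, and that after the action runs they are replaced by the single value assigned to $$.

```c
#include <assert.h>

#define STACK_MAX 64

static int value_stack[STACK_MAX];  /* attributes of the symbols parsed so far */
static int top = 0;                 /* number of values currently on the stack */

/* Pushes the attribute of a newly shifted (or newly reduced) symbol. */
void push_value(int v)
{
    assert(top < STACK_MAX);
    value_stack[top++] = v;
}

/* Number of attribute values currently on the stack. */
int stack_depth(void)
{
    return top;
}

/* Models the reduction  EXPR : EXPR '+' EXPR  { $$ = $1 + $3; }
   The right-hand side has three symbols, so $1, $2 and $3 are the
   top three stack entries, oldest first. Returns the value of $$. */
int reduce_add(void)
{
    int *rhs;
    int result;

    assert(top >= 3);
    rhs = &value_stack[top - 3];    /* rhs[0] is $1, rhs[1] is $2, rhs[2] is $3 */
    result = rhs[0] + rhs[2];       /* $$ = $1 + $3 */
    top -= 3;                       /* pop the right-hand side */
    push_value(result);             /* push $$ for the left-hand side */
    return result;
}
```

Pushing 2, then a placeholder for '+', then 5, and calling reduce_add() leaves a single entry, 7, on the stack. The '+' token occupies position $2 even though its attribute is never used, which is why the second operand is $3 and not $2.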

    /* exam-y.y: Use strings and sets as yacc types */
    %{
    #include <stdlib.h>   /* malloc(), NULL */

    struct color_list {
        char *my_color;
        struct color_list *next;
    };
    %}

    %union {
        char *t_val;
        struct color_list *color_set;
    }

    /* These are the token types that exam-l.l returns */
    %token TKID
    %token TK_COLORS

    /* The token TKID returns the type t_val. */
    %type <t_val> TKID
    %type <color_set> color_def list_of_ids

    %start color_def

Here are the production rules:

    color_def
        : TK_COLORS '=' '{' list_of_ids '}' ';'
            {
                /* print_set() is not shown on the slides */
                print_set($4);
            }
        ;

    list_of_ids
        : TKID
            {
                /* Last identifier: start a one-element list. */
                struct color_list *set1;
                set1 = (struct color_list *)malloc(sizeof(struct color_list));
                set1->next = NULL;
                set1->my_color = $1;
                $$ = set1;
            }
        | TKID ',' list_of_ids
            {
                /* Put this identifier in front of the rest of the list. */
                struct color_list *set1;
                set1 = (struct color_list *)malloc(sizeof(struct color_list));
                set1->next = $3;
                set1->my_color = $1;
                $$ = set1;
            }
        ;
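The two list_of_ids actions above build a linked list back to front: because the rule is right-recursive, the innermost (rightmost) identifier is reduced first, and each outer reduction puts its identifier in front of the list built so far. The C sketch below replays those steps outside of yacc. The struct is the one from exam-y.y, but cons_color, format_set and free_set are hypothetical helpers written for this illustration; in particular, format_set is only a guess at the kind of work print_set() does, since print_set() is not shown in the slides.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct color_list {
    char *my_color;
    struct color_list *next;
};

/* Same steps as the list_of_ids actions: allocate a cell, store the
   color ($1), and link it in front of the rest ($3, or NULL). */
struct color_list *cons_color(char *color, struct color_list *rest)
{
    struct color_list *set1 = malloc(sizeof *set1);

    assert(set1 != NULL);           /* sketch only: no recovery on failure */
    set1->my_color = color;
    set1->next = rest;
    return set1;
}

/* Writes the colors into buf as "a, b, c" (truncated if buf is too small). */
void format_set(const struct color_list *set, char *buf, size_t cap)
{
    assert(buf != NULL && cap > 0);
    buf[0] = '\0';
    for (; set != NULL; set = set->next) {
        size_t used = strlen(buf);
        snprintf(buf + used, cap - used, "%s%s",
                 set->my_color, set->next != NULL ? ", " : "");
    }
}

/* Releases the list cells (the color strings are owned by the caller). */
void free_set(struct color_list *set)
{
    while (set != NULL) {
        struct color_list *next = set->next;
        free(set);
        set = next;
    }
}
```

For the input red, green, blue, white the reductions happen in the order white, blue, green, red, so the calls are cons_color(white, NULL), then cons_color(blue, ...), and so on; format_set on the final list still gives "red, green, blue, white", matching the source order.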

    /* exam-l.l: Here is a lex program that removes comments, tabs, newlines, etc. */
    /* It returns {, }, ',', ;, =, TKID, and TK_COLORS as tokens. */
    /* ACC() and linecount are defined elsewhere in the file (not shown on the slides). */

    [ \t\n\f]   { ACC(yytext[0]);   /* Remove tabs/spaces/newlines */ }

    \/\*        {
                    int c;          /* int, so that EOF can be told apart from a character */
                    int line_cur;

                    line_cur = linecount;
                    while (1) {
                        if ((c = input()) == EOF) {
                            /* If this is the case, there is an error,
                               an unterminated comment. */
                            printf("Detected unterminated comment starting on line %d \n",
                                   line_cur);
                            return(0);
                        }
                        ACC(c);
                        if (c == '*') {
                            if ((c = input()) == '/') {
                                break;
                            } else {
                                unput(c);
                            }
                        }
                    }
                }

    colors      { printf("%s\n", yytext); return TK_COLORS; }

    [a-zA-Z_.][a-zA-Z_.0-9]*  {
                    /* Copy yytext to yylval.t_val */
                    yylval.t_val = strdup(yytext);
                    return TKID;
                }

    [{},;=]     { return yytext[0]; }

    .           {
                    printf("Illegal character %s on line %d\n", yytext, linecount);
                    printf("ignored \n");
                    ACC(yytext[0]);
                }
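The comment rule above is the least obvious part of the scanner, so here is the same loop written as a stand-alone C function that works on a string instead of on lex's input()/unput(). It is a sketch for illustration only: skip_comment and its parameters are hypothetical, and peeking at the next character replaces the input()-then-unput() pair used in the lex action. The logic is otherwise the same: consume characters until a '*' is immediately followed by '/', and report an error if the text ends first.

```c
#include <assert.h>
#include <stddef.h>

/* Skips a C-style comment whose opening slash-star has just been consumed.
   *pos is the index of the next unread character in text, and *line is the
   current line number; both are updated as characters are consumed.
   Returns 1 once the closing star-slash has been consumed,
   or 0 if the text ends first (an unterminated comment). */
int skip_comment(const char *text, size_t *pos, int *line)
{
    assert(text != NULL && pos != NULL && line != NULL);

    while (text[*pos] != '\0') {
        char c = text[(*pos)++];

        if (c == '\n')
            (*line)++;                  /* keep the line count accurate */

        if (c == '*' && text[*pos] == '/') {
            (*pos)++;                   /* consume the '/' as well */
            return 1;
        }
        /* Otherwise the character after '*' stays unread, which is what
           unput(c) achieves in the lex action. */
    }
    return 0;                           /* unterminated comment */
}
```

Leaving the character after a lone '*' unread matters for input such as "**/": the second '*' must still be available to pair with the '/'.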

Input:

    /* This is a test of a colors file. */
    colors = {red, green, blue, white};
    /* End of test. */

Output:

    Reading a colors definition
    Equals
    Beginning of set
    Color = red
    Separator
    Color = green
    Separator
    Color = blue
    Separator
    Color = white
    Semicolon

Example 2: This is a yacc program that acts as an interpreter for a simple language called SSET that manipulates STRINGS and SETS of STRINGS. It has two classes:

(1) set_list: represents sets and their functions, such as searching, storing, set union, set intersection, set difference, and printing the contents of a set.
(2) symtab: represents a symbol table that stores information about each variable, such as its name, type, and values.

Here are the production rules from the YACC file without their actions (they are too long to fit here).

    Sset        : declaration program               {$$ = NULL;}
                ;

    declaration : TK_SET TK_ID setdeclar ';'        { }
                | TK_STRING TK_ID strdeclar ';'     { }
                ;

    setdeclar   : ',' TK_ID setdeclar               { }
                |                                   {$$ = NULL;}
                ;

    program     : declaration program               {$$ = $1;}
                | statement program                 { }
                |                                   {/* if lamda */ $$ = NULL;}
                ;

    statement   : TK_ID '=' simp_exp                { }
                | TK_DISPLAY TK_STR_CONST ';'       { }
                | TK_DISPLAY TK_ID ';'              { }
                ;

    simp_exp    : TK_ID ';'                         { }
                | TK_STR_CONST ';'                  { }
                | set_def                           {$$ = $1;}
                | bin_exp                           {$$ = $1;}
                ;

    set_def     : '{' list_of_ids '}' ';'           {$$ = $2;}
                ;

    list_of_ids : TK_ID                             { }
                | TK_STR_CONST                      { }
                | TK_ID ',' list_of_ids             { }
                | TK_STR_CONST ',' list_of_ids      { }
                |                                   {/* if lamda */ $$ = NULL;}
                ;

    bin_exp     : bin1 ';'                          {$$ = $1;}
                | bin2 ';'                          {$$ = $1;}
                | bin3 ';'                          {$$ = $1;}
                | bin4 ';'                          {$$ = $1;}
                ;

    bin1        : TK_ID '+' TK_ID                   { }
                | TK_ID '+'                         { }
                ;

    bin2        : TK_ID '*' TK_ID                   { }
                | TK_ID '*' '{' TK_ID '}'           { }
                ;

    bin3        : TK_ID '-' TK_ID                   { }
                | TK_ID '-' '{' TK_ID '}'           { }
                ;

    bin4        : '(' bin1 ')' TK_ID                { }
                ;

Input:

    SET s1;
    STRING s2;
    s2 = "John";
    s1 = {s2, "Paul", "Ringo", "George"};
    DISPLAY "The Beatles ---- ";
    DISPLAY s1;
    s1 = s1 - {s2};
    DISPLAY s1;
    s1 = {};
    DISPLAY s1;

Output:

    The Beatles ----
    {John, Paul, Ringo, George}
    {Paul, Ringo, George}
    {}
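The statement s1 = s1 - {s2}; in the sample input exercises the set-difference operation of the set_list class. The slides do not show that class, so the following C sketch is only a stand-in: struct str_set, set_contains, set_add and set_difference are hypothetical names, the set is a small fixed-size array instead of whatever structure set_list really uses, and the strings are not copied.

```c
#include <assert.h>
#include <string.h>

#define SET_MAX 16

/* A small fixed-capacity set of strings; a stand-in for set_list. */
struct str_set {
    const char *items[SET_MAX];
    int count;
};

/* Searching: returns 1 if item is in s, else 0. */
int set_contains(const struct str_set *s, const char *item)
{
    int i;

    assert(s != NULL && item != NULL);
    for (i = 0; i < s->count; i++)
        if (strcmp(s->items[i], item) == 0)
            return 1;
    return 0;
}

/* Storing: adds item unless it is already present or the set is full.
   Returns 1 if the item was stored, else 0. */
int set_add(struct str_set *s, const char *item)
{
    assert(s != NULL && item != NULL);
    if (set_contains(s, item) || s->count == SET_MAX)
        return 0;
    s->items[s->count++] = item;
    return 1;
}

/* Set difference: result = a - b, i.e. every element of a that is not
   in b, kept in a's order. result must not be the same object as a or b. */
void set_difference(const struct str_set *a, const struct str_set *b,
                    struct str_set *result)
{
    int i;

    assert(a != NULL && b != NULL && result != NULL);
    assert(result != a && result != b);
    result->count = 0;
    for (i = 0; i < a->count; i++)
        if (!set_contains(b, a->items[i]))
            set_add(result, a->items[i]);
}
```

With a = {John, Paul, Ringo, George} and b = {John}, set_difference leaves {Paul, Ringo, George} in result, which corresponds to the second set displayed in the output above. Union and intersection can be written the same way by changing the membership test in the loop.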

Compilation (Makefile):

    CFLAG = -g

    sset: y.tab.o
            g++ -g -o sset y.tab.o

    y.tab.o: y.tab.c lex.yy.c
            g++ -c $(CFLAG) y.tab.c

    y.tab.c: start.y
            yacc start.y

    lex.yy.c: start.l
            flex start.l

(In the actual Makefile, each command line must begin with a tab character.)

Other References:

    http://www.combo.org/lex_yacc_page/lex.html
    http://www.epaperpress.com
    http://www.gnu.org
    http://www.cygnus.com