LLGEN. Generator of syntax analyzers (parsers)

GENERATOR LLGEN
- The main task of the LLgen generator is to generate a parser (in C) that uses the recursive descent method without backtracking.
- LLgen generates the source code from a file containing the specification.
- The specification file may use an extended notation for simple LL(1) grammars.
- Because LLgen includes a built-in mechanism for static and dynamic conflict resolution, it also allows the use of ambiguous grammars.
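To illustrate what recursive descent without backtracking means, here is a minimal hand-written sketch in C. It is not LLgen output: the grammar S : 'a' S 'b' | 'c' and the names `input`, `pos`, `parse_S`, and `accepts` are assumptions chosen only for illustration. Each nonterminal becomes one C function, and the next input character alone decides which alternative is taken.

```c
#include <stddef.h>

/* Hypothetical parser state: the input text and the read position. */
static const char *input;
static size_t pos;

/* S : 'a' S 'b' | 'c' ;
   The current character selects the alternative, so the parser never
   has to go back and try another one. */
static int parse_S(void)
{
    if (input[pos] == 'a') {
        pos++;                      /* consume 'a' */
        if (!parse_S())
            return 0;
        if (input[pos] != 'b')
            return 0;
        pos++;                      /* consume 'b' */
        return 1;
    }
    if (input[pos] == 'c') {
        pos++;                      /* consume 'c' */
        return 1;
    }
    return 0;                       /* no alternative matches */
}

/* Returns 1 if the whole string is derived from S, 0 otherwise. */
int accepts(const char *text)
{
    input = text;
    pos = 0;
    return parse_S() && input[pos] == '\0';
}
```

Calling `accepts("aacbb")` succeeds, while `accepts("aacb")` fails as soon as the missing `b` is detected.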

GENERATOR LLGEN
[Diagram of the organization of LLgen, showing LEX (scan.l, scan.c), LLgen (gram.g, gram.c, Lpars.c, Lpars.h), GCC (scane.exe), and the run on file.txt that produces the RESULT.]

GENERATOR LLGEN
    flex -l scan.l                 run the LEX generator; result: lex.yy.c
    LLgen gram.g                   run LLgen on the specification gram.g; result: Lpars.c and Lpars.h
    gcc lex.yy.c Lpars.c gram.c    compile the C sources
    ./a.out < file.in              analyze the input file

GENERATOR LLGEN
- By default, LLgen uses an external lexical analyzer (generated by Lex), which it calls through the function yylex().
- The file Lpars.h, generated when LLgen runs, contains definitions that assign numeric constants to the declared token names.

GENERATOR LLGEN
Another lexical analyzer can be used as follows:
- Put the implementation of the scanner directly in the grammar specification (inside { }) or in an external file.
- In the specification, indicate the name of the function that LLgen should use:
    %lexical name_function ;
- If necessary, include the file Lpars.h in the lexical analyzer.

GENERATOR LLGEN
- LLgen is a command-line tool.
- The specification for LLgen is written as a plain text file, sometimes split across several files.
- Each specification file contains productions, LLgen directives, and declarations and code in C.

CREATE SPECIFICATION
- Each production in an LLgen specification consists of a nonterminal, the character ":", and the right-hand side of the production, and ends with a semicolon.
- Alternative right-hand sides of a production are separated by "|".
- The right-hand side of a production may consist of terminals, nonterminals, and semantic actions.
    nonterminal : right-hand side of the production ;

CREATE SPECIFICATION
Rules for creating specifications:
- White space is ignored, but it cannot occur within a name.
- Comments are introduced by the characters //.
- Comments cannot be nested.
- Comments may occur anywhere a name is allowed to occur.

CREATE SPECIFICATION
Rules for creating specifications:
- The names of terminal and nonterminal symbols can be of any length and follow the syntax of C identifiers.
- Symbol names must not conflict with C keywords.
- Names are case-sensitive.

CREATE SPECIFICATION
Rules for creating specifications:
- The names of symbols can be of any length, but LLgen treats only the first 50 characters as significant.
- All names generated and used by LLgen begin with the prefix LL.

DECLARATION OF TERMINAL
Terminals that are not single characters are declared with:
    %token ken ;
Several terminals can be declared at once:
    %token name1, name2, name3 ;
Every terminal must be declared before it is used.

DECLARATION OF TERMINAL
Terminals that are single characters are written in quotes. As in C, LLgen also recognizes a set of special literals:
    newline            \n
    tab                \t
    carriage return    \r
    apostrophe         \'
    backspace          \b
    backslash          \\
    octal number       \xxx

DECLARATION OF TERMINAL
REMEMBER! If LLgen encounters in the specification file a name that has not been declared as a token, it treats that name as a nonterminal symbol.

DECLARATION OF TERMINAL
- Each nonterminal is implemented as a function in C, so LLgen lets us use local variables.
- Local variables can be declared, in braces, only on the left-hand side of a production, directly after the nonterminal symbol, e.g.:
    A { int ken; } : S ken T ;

DECLARATION OF TERMINAL
- A semantic action is any single C statement (or group of statements) enclosed in braces.
- In LLgen, semantic actions can be placed only on the right-hand side of a production, e.g.:
    A { int counter; } : S ken { counter = 1; } T ;

START NONTERMINAL
- Parsers generated by LLgen may have several start nonterminals.
- A start nonterminal (also called the axiom) is declared as follows:
    %start function, name_of_nonterminal ;
  Example:
    %start parse, S ;

COMMANDS OF COMPILATION
The generator is started with the command LLgen, which is invoked on a specification file (extension .g), for example:
    LLgen gram.g
LLgen produces three output files:
- gram.c: a C file containing the implementation of the parser
- Lpars.h: a file containing the interface of the syntax analyzer
- Lpars.c: the parser skeleton and the control tables

OPTION -v
- The -v option is sometimes helpful when starting up and testing a parser.
- With -v, LLgen generates the file LL.output, which contains information about the unresolved conflicts that have arisen in the grammar.

EXTENSION OF GRAMMAR OF SYNTAX
Extensions to the syntax of context-free grammars:
    *  (* quantity)    closure (zero or more repetitions)
    +  (+ quantity)    positive closure (one or more repetitions)
    ?                  optionality operator
    [ ... ]            grouping of symbols

Example
Let Σ = {a, b} and consider the regular language L = L(b*a). Then:
Ordinary grammar:
    S : B A ;
    B : b B | ε ;
    A : a ;
With the closure operator:
    S : B A ;
    B : b * ;
    A : a ;
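A closure in the grammar corresponds to a loop in a hand-written recognizer. The following sketch in C is an illustration only, not code generated by LLgen; the function name `accepts_b_star_a` is an assumption.

```c
/* Recognizer for L(b*a), mirroring  S : B A ;  B : b * ;  A : a ;  */
int accepts_b_star_a(const char *text)
{
    const char *p = text;

    while (*p == 'b')       /* B : b *   zero or more iterations */
        p++;

    if (*p != 'a')          /* A : a */
        return 0;
    p++;

    return *p == '\0';      /* nothing may follow the final a */
}
```

The recursive production for B disappears: the `while` loop plays the role of the `*` operator.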

Example
Let Σ = {a, b} and consider the language L = {b, ab, aab, aaab}. Then:
Ordinary grammar:
    S : A B ;
    A : a C ;
    C : a ;
    B : b ;
With a bounded closure:
    S : A B ;
    A : a *3 ;
    B : b ;

Example
Let Σ = {a, b} and consider the language L = {ab, aab, aaab}. Then:
Ordinary grammar:
    S : A B ;
    A : a C ;
    C : a ;
    B : b ;
With a bounded positive closure:
    S : A B ;
    A : a +3 ;
    B : b ;
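The bounded repetitions a *3 and a +3 from the two examples above can be read as "at most three" and "one to three" occurrences, which is the reading implied by the languages listed in those examples. The sketch below, in C, follows that reading; the helper `repeat_at_most` and both function names are assumptions made for illustration.

```c
/* Consumes at most max copies of c starting at *p;
   returns how many were consumed. */
static int repeat_at_most(const char **p, char c, int max)
{
    int n = 0;
    while (n < max && **p == c) {
        (*p)++;
        n++;
    }
    return n;
}

/* S : A B ;  A : a *3 ;  B : b ;    accepts b, ab, aab, aaab */
int accepts_star3(const char *text)
{
    const char *p = text;
    repeat_at_most(&p, 'a', 3);              /* zero to three a's */
    return p[0] == 'b' && p[1] == '\0';
}

/* S : A B ;  A : a +3 ;  B : b ;    accepts ab, aab, aaab */
int accepts_plus3(const char *text)
{
    const char *p = text;
    if (repeat_at_most(&p, 'a', 3) < 1)      /* at least one a is required */
        return 0;
    return p[0] == 'b' && p[1] == '\0';
}
```

The only difference between the two functions is the lower bound, just as the only difference between the two grammars is the operator.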

Example
Let Σ = {a, b} and consider the language L = {b, ab}. Then:
Ordinary grammar:
    S : A B ;
    A : a | ε ;
    B : b ;
With the optionality operator:
    S : A B ;
    A : a ? ;
    B : b ;

Example
Let Σ = {a, b} and consider the language L = {A ∈ Σ* : |A| = 2}. Then:
Ordinary grammar:
    S : a B | b B ;
    B : a | b ;
With grouping and a bounded repetition:
    S : [ a | b ] +2 ;

COMPARISON
Consider a grammar that is not a simple LL(1) grammar, and compare the effort of adjusting the grammar by hand with the effort of implementing it in the LLgen generator.
Let Σ = {a, b}. We will write a program that accepts the context-free language L = {A ∈ Σ* : A = a^n b^n, n ∈ N}.

COMPARISON
    { int quan_a, quan_b; }
    %start parse, S ;
    S : A B { if (quan_a == quan_b) puts("OK."); else puts("Error"); } ;

COMPARISON
We remove the left recursion.
Before:
    A : a { quan_a = 1; } | A a { quan_a++; } ;
    B : b { quan_b = 1; } | B b { quan_b++; } ;
After:
    A : a { quan_a = 1; } | a A { quan_a++; } ;
    B : b { quan_b = 1; } | b B { quan_b++; } ;

COMPARISON
    A : a C { quan_a++; } ;
    C : { quan_a = 0; } | a C { quan_a++; } ;
    B : b D { quan_b++; } ;
    D : { quan_b = 0; } | b D { quan_b++; } ;

COMPARISON
    S : { quan_a = quan_b = 0; } A B { if (quan_a == quan_b) puts("OK."); else puts("Error"); } ;
    A : [ a { quan_a++; } ] + ;
    B : [ b { quan_b++; } ] + ;
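The counting idea behind this specification can be sketched in plain C. This is a hand-written illustration under the assumption that the input is available as a string; it is not what LLgen generates, and the function name `check_anbn` is hypothetical. The counters keep the names quan_a and quan_b used in the specification.

```c
/* Mirrors  S : { quan_a = quan_b = 0 } A B { compare the counters } ;
            A : [ a { quan_a++ } ] + ;
            B : [ b { quan_b++ } ] + ;
   Returns 1 for a^n b^n with n >= 1, and 0 otherwise. */
int check_anbn(const char *text)
{
    int quan_a = 0, quan_b = 0;
    const char *p = text;

    while (*p == 'a') { quan_a++; p++; }    /* A : [ a ] + */
    while (*p == 'b') { quan_b++; p++; }    /* B : [ b ] + */

    /* A positive closure needs at least one symbol; nothing may follow. */
    if (quan_a < 1 || quan_b < 1 || *p != '\0')
        return 0;

    return quan_a == quan_b;                /* the check done in the action of S */
}
```

The grammar itself only enforces the shape a+ b+; the equality of the two counts is checked by the semantic action, exactly as in the specification.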

LLSYMB
LLsymb is a global integer variable. Its value depends on the position of the read head on the right-hand side of the production. Possible values:
- When the parser has read a token, LLsymb holds that token.
- After a grouping or an alternative, LLsymb holds the remembered token.

CREATE SPECIFICATION
The specification file for LLgen should include the implementation of the main function:
    %start parse, S ;
    int main() { parse(); return 0; }

CREATE SPECIFICATION
The specification file for LLgen should also include the function LLmessage, which the parser calls automatically when a syntax error occurs:
    void LLmessage(int tk)
- It does not return a value.
- It has one integer parameter.

CREATE SPECIFICATION
The parameter tk takes the following values:
    tk > 0     the token tk was expected
    tk = 0     an unexpected token was read and has been removed
    tk = -1    the expected end of file was not encountered; the remaining input will be ignored
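A possible shape for LLmessage, following the three cases listed above, is sketched below in C. The signature `void LLmessage(int tk)` comes from the slides; the enum `error_kind`, the helper `classify`, and the message texts are assumptions added for illustration.

```c
#include <stdio.h>

/* Hypothetical classification of the value passed to LLmessage. */
enum error_kind { TOKEN_EXPECTED, TOKEN_REMOVED, EOF_EXPECTED };

/* Maps tk to one of the three cases described above. */
enum error_kind classify(int tk)
{
    if (tk > 0)
        return TOKEN_EXPECTED;      /* tk is the token that was expected */
    if (tk == 0)
        return TOKEN_REMOVED;       /* an unexpected token was removed */
    return EOF_EXPECTED;            /* tk == -1: end of file was expected */
}

/* Called by the parser when a syntax error occurs. */
void LLmessage(int tk)
{
    switch (classify(tk)) {
    case TOKEN_EXPECTED:
        fprintf(stderr, "syntax error: token %d expected\n", tk);
        break;
    case TOKEN_REMOVED:
        fprintf(stderr, "syntax error: unexpected token removed\n");
        break;
    case EOF_EXPECTED:
        fprintf(stderr, "syntax error: end of file expected, rest of input ignored\n");
        break;
    }
}
```

Separating the classification from the printing is only a design choice made here so that the three cases can be checked without capturing the error output.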

Example
The operation of the LLgen generator is best seen in an example. The input is a sequence of words over the natural alphabet; the words end with a colon and are separated by commas. The input contains at least one word...

CONFLICTS
While the parser generator is working, the following conflicts can occur:
- Conflict of alternatives: it cannot be determined which of the right-hand sides should be expanded.
- Conflict of repetitions: the construct currently being processed contains a closure, and it is hard to determine whether the input continues it or starts another construct.

CONFLICTS
A conflict of alternatives can be resolved in two ways:
- dynamic resolution:
    %if (condition)
- static resolution:
    %prefer    equivalent to %if (1)
    %avoid     equivalent to %if (0)

Example
Consider the task of testing whether a binary number is even. The lexical analyzer recognizes and returns the binary digits:
    %%
    [01]    { return yytext[0]; }

Example
    { int read_number; }
    %start parse, S ;
    S : '0' { read_number = 0; } R
      | '1' { read_number = 1; } R
      ;
    R : %if (read_number == 0) { puts("even number"); }
      | { puts("odd number"); }
      | S
      ;
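The idea of this specification is that only the last digit read decides the parity. A hand-written sketch of the same logic in C, assuming the number is given as a string, could look as follows; the function name `binary_is_even` and its return convention are assumptions, and the variable read_number is taken from the specification.

```c
/* Returns 1 if the binary numeral is even, 0 if it is odd, and -1 if it is
   empty or contains a character other than '0' or '1'. */
int binary_is_even(const char *digits)
{
    int read_number = -1;               /* last digit seen; -1 means none yet */

    for (const char *p = digits; *p != '\0'; p++) {
        if (*p == '0')
            read_number = 0;            /* S : '0' { read_number = 0 } R */
        else if (*p == '1')
            read_number = 1;            /* S : '1' { read_number = 1 } R */
        else
            return -1;                  /* not a binary digit */
    }

    if (read_number < 0)
        return -1;                      /* empty input */

    return read_number == 0;            /* the test made by %if (read_number == 0) */
}
```

For example, `binary_is_even("110")` reports an even number and `binary_is_even("101")` an odd one.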

SOLVING OF THE CONFLICT
- An example use of the static resolution mechanism for a conflict of alternatives is the so-called "dangling else" problem.
- This issue will be discussed in detail in the lecture devoted to the YACC generator.

SOLVING OF THE CONFLICT
To resolve a conflict of repetitions we can use the keyword %while:
    %while (condition)

THE END END OF THE SIXTH LECTURE