TDDD55- Compilers and Interpreters Lesson 3

Similar documents
COMPILERS AND INTERPRETERS Lesson 4 TDDD16

TDDD55 - Compilers and Interpreters Lesson 3

COMPILER CONSTRUCTION LAB 2 THE SYMBOL TABLE. Tutorial 2 LABS. PHASES OF A COMPILER Source Program. Lab 2 Symbol table

COMPILER CONSTRUCTION Seminar 02 TDDB44

Yacc: A Syntactic Analysers Generator

PRACTICAL CLASS: Flex & Bison

Syntax Analysis Part IV

TDDD55- Compilers and Interpreters Lesson 2

Using an LALR(1) Parser Generator

CSE302: Compiler Design

Syntax-Directed Translation

LECTURE 11. Semantic Analysis and Yacc

Yacc Yet Another Compiler Compiler

Compiler Design 1. Yacc/Bison. Goutam Biswas. Lect 8

Structure of a compiler. More detailed overview of compiler front end. Today we ll take a quick look at typical parts of a compiler.

Lex & Yacc (GNU distribution - flex & bison) Jeonghwan Park

Compiler Lab. Introduction to tools Lex and Yacc

Lex & Yacc. By H. Altay Güvenir. A compiler or an interpreter performs its task in 3 stages:

An Introduction to LEX and YACC. SYSC Programming Languages

Compiler Construction

Compiler Design. Computer Science & Information Technology (CS) Rank under AIR 100

Introduction to Lex & Yacc. (flex & bison)

Principles of Programming Languages

LECTURE 7. Lex and Intro to Parsing

Compiler Construction

Lex & Yacc. by H. Altay Güvenir. A compiler or an interpreter performs its task in 3 stages:

A Bison Manual. You build a text file of the production (format in the next section); traditionally this file ends in.y, although bison doesn t care.

THE COMPILATION PROCESS EXAMPLE OF TOKENS AND ATTRIBUTES

flex is not a bad tool to use for doing modest text transformations and for programs that collect statistics on input.

CS 403: Scanning and Parsing

COMPILER CONSTRUCTION Seminar 03 TDDB

Programming Project II

Context-free grammars

CST-402(T): Language Processors

10/4/18. Lexical and Syntactic Analysis. Lexical and Syntax Analysis. Tokenizing Source. Scanner. Reasons to Separate Lexical and Syntactic Analysis

Big Picture: Compilation Process. CSCI: 4500/6500 Programming Languages. Big Picture: Compilation Process. Big Picture: Compilation Process.

Compiler Construction: Parsing

As we have seen, token attribute values are supplied via yylval, as in. More on Yacc s value stack

Automatic Scanning and Parsing using LEX and YACC

Syntax Analysis The Parser Generator (BYacc/J)

Prof. Mohamed Hamada Software Engineering Lab. The University of Aizu Japan

Preparing for the ACW Languages & Compilers

CSCI Compiler Design

What is a compiler? var a var b mov 3 a mov 4 r1 cmpi a r1 jge l_e mov 2 b jmp l_d l_e: mov 3 b l_d: ;done

Compiler construction in4020 lecture 5

Time : 1 Hour Max Marks : 30

Lexical and Syntax Analysis

Lab 2. Lexing and Parsing with Flex and Bison - 2 labs

A simple syntax-directed

Hyacc comes under the GNU General Public License (Except the hyaccpar file, which comes under BSD License)

Introduction to Compiler Design

COMPILER CONSTRUCTION Seminar 01 TDDB

Compiler construction 2002 week 5

RYERSON POLYTECHNIC UNIVERSITY DEPARTMENT OF MATH, PHYSICS, AND COMPUTER SCIENCE CPS 710 FINAL EXAM FALL 96 INSTRUCTIONS

Parsing How parser works?

Lexical and Parser Tools

CS606- compiler instruction Solved MCQS From Midterm Papers

Compil M1 : Front-End

CS143 Handout 12 Summer 2011 July 1 st, 2011 Introduction to bison

Decaf PP2: Syntax Analysis

10/5/17. Lexical and Syntactic Analysis. Lexical and Syntax Analysis. Tokenizing Source. Scanner. Reasons to Separate Lexical and Syntax Analysis

Lexical and Syntax Analysis

Crafting a Compiler with C (II) Compiler V. S. Interpreter

Compiler Design (40-414)

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

A programming language requires two major definitions A simple one pass compiler

Examples of attributes: values of evaluated subtrees, type information, source file coordinates,

What is a compiler? Xiaokang Qiu Purdue University. August 21, 2017 ECE 573

Introduction to Yacc. General Description Input file Output files Parsing conflicts Pseudovariables Examples. Principles of Compilers - 16/03/2006

1. INTRODUCTION TO LANGUAGE PROCESSING The Language Processing System can be represented as shown figure below.

Compiler Front-End. Compiler Back-End. Specific Examples

Chapter 2: Syntax Directed Translation and YACC

The Structure of a Syntax-Directed Compiler

Syntax-Directed Translation. Lecture 14

Ray Pereda Unicon Technical Report UTR-03. February 25, Abstract

A Simple Syntax-Directed Translator

Bison. The YACC-compatible Parser Generator November 1995, Bison Version by Charles Donnelly and Richard Stallman

COMPILER DESIGN UNIT I LEXICAL ANALYSIS. Translator: It is a program that translates one language to another Language.

What do Compilers Produce?

Programming Language Syntax and Analysis

Prof. Mohamed Hamada Software Engineering Lab. The University of Aizu Japan

Chapter 2 A Quick Tour

CS143 Handout 04 Summer 2011 June 22, 2011 flex In A Nutshell

LECTURE 3. Compiler Phases

Formats of Translated Programs

Compiler Construction

More Assigned Reading and Exercises on Syntax (for Exam 2)

Languages and Compilers

Syn S t yn a t x a Ana x lysi y s si 1

Chapter 3: Describing Syntax and Semantics. Introduction Formal methods of describing syntax (BNF)

The Structure of a Syntax-Directed Compiler

Yacc. Generator of LALR(1) parsers. YACC = Yet Another Compiler Compiler symptom of two facts: Compiler. Compiler. Parser

Lecture Outline. COMP-421 Compiler Design. What is Lex? Lex Specification. ! Lexical Analyzer Lex. ! Lex Examples. Presented by Dr Ioanna Dionysiou

Compilers and Interpreters

UNIVERSITY OF CALIFORNIA

Compiler construction 2005 lecture 5

COP 3402 Systems Software Syntax Analysis (Parser)

Figure 2.1: Role of Lexical Analyzer

Chapter 3 Lexical Analysis

Front End. Hwansoo Han

Transcription:

TDDD55- Compilers and Interpreters Lesson 3 Zeinab Ganjei (zeinab.ganjei@liu.se) Department of Computer and Information Science Linköping University

1. Grammars and Top-Down Parsing Some grammar rules are given Your task: Rewrite the grammar (eliminate left recursion, etc.) Add attributes and attribute rules to the grammar Implement your grammar in a C++ class named Parser. The Parser class should contain a method named Parse that returns the value of a single statement in the language.

2. Scanner Specification Finish a scanner specification given in a scanner.l flex file, by adding rules for C and C++ style comments, identifiers, integers, and floating point numbers.

3. Parser Generators Finish a parser specification given in a parser.y bison file, by adding rules for expressions, conditions and function definitions,... You also need to augment the grammar with error productions.

4. Intermediate Code Generation The purpose of this assignment to learn about how abstract syntax trees can be translated into intermediate code. You are to finish a generator for intermediate code by adding rules for some language statements.

Laboratory Skeleton ~TDDD55 /lab /doc /lab1 /lab2 /lab3-4 Documentation for the assignments. Contains all the necessary files to complete the first assignment Contains all the necessary files to complete the second assignment Contains all the necessary files to complete assignment three and four

Bison Parser Generator

Purpose of a Parser The parser accepts tokens from the scanner and verifies the syntactic correctness of the program. Syntactic correctness is judged by verification against a formal grammar which specifies the language to be recognized. Along the way, it also derives information about the program and builds a fundamental data structure known as parse tree or abstract syntax tree (ast). The abstract syntax tree is an internal representation of the program and augments the symbol table.

Bottom-Up Parsing Recognize the components of a program and then combine them to form more complex constructs until a whole program is recognized. The parse tree is then built from the bottom and up, hence the name.

Bottom-Up Parsing(2) X := ( a + b ) * c; := x * + c a b

LR Parsing A Specific bottom-up technique LR stands for Left->right scan, Rightmost derivation. Probably the most common & popular parsing technique. yacc, bison, and many other parser generation tools utilize LR parsing. Great for a hi es, ot so great for hu a s

Pros and Cons of LR parsing Advantages of LR: Accept a wide range of grammars/languages Well suited for automatic parser generation Very fast Generally easy to maintain Disadvantages of LR: Error handling can be tricky Difficult to use manually

Bison Bison is a general-purpose parser generator that converts a grammar description of a context-free grammar into a C program to parse that grammar

Bison (2) Input: a specification file containing mainly the grammar definition Output: a C source file containing the parser The entry point is the function int yyparse(); yyparse reads tokens by calling yylex and parses until end of file to be parsed, or unrecoverable syntax error occurs returns 0 for success and 1 for failure

Bison Usage Bison source program parser.y Bison Compiler y.tab.c y.tab.c C Compiler a.out Token stream a.out Parse tree

Bison Specification File A Bison specification is composed of 4 parts. %{ %} /* C declarations */ /* Bison declarations */ %% /* Grammar rules */ %% /* Additional C code */

1.1. C Declarations Contains macro definitions and declarations of functions and variables that are used in the actions in the grammar rules Copied to the beginning of the parser file so that they precede the definition of yyparse Use #include to get the declarations from a header file. If C de laratio s is t eeded, the the %{ a d %} deli iters that bracket this section can be omitted

1.2. Bison Declarations Contains : declarations that define terminal and non-terminal symbols Data types of semantic values of various symbols specify precedence

2. Grammar Rules Contains one or more Bison grammar rule, and nothing else. Example: expressio : expressio + ter { $$ = $1 + $3; } ; There must always be at least one grammar rule, and the first %% (which precedes the grammar rules) may never be omitted even if it is the first thing in the file.

Bison Specification File A Bison specification is composed of 4 parts. %{ %} /* C declarations */ /* Bison declarations */ %% /* Grammar rules */ %% /* Additional C code */

3. Additional C Code Copied verbatim to the end of the parser file, just as the C declarations section is copied to the beginning. This is the most convenient place to put anything that should e i the parser file ut is t eeded efore the defi itio of yyparse. The definitions of yylex and yyerror often go here.

Bison Example 1 (1/2) %{ #include <ctype.h> /* standard C declarations here */ // extern int yylex(); }% %token DIGIT /* bison declarations */ %% /* Grammar rules */ line : expr \n { printf { %d\n, $1 }; } ; expr : expr + term { $$ = $1 + $3; } term ; term : term * factor { $$ = $1 * $3; } factor ;

Bison Example 1 (2/2) factor : ( expr ) { $$ = $2; } DIGIT ; %% /* Additional C code */ int yylex () { /* A really simple lexical analyzer */ int c; c = getchar (); if ( isdigit (c) ) { yylval = c - 0 ; return DIGIT; } return c; }

Bison Example 2 Mid-Rules thing: A { printf( seen an A ); } B ; The same as: thing: A fakename B ; fakename: /* empty */ { printf( seen an A ); } ;

Bison Example 3 (1/2) /* Infix notation calculator--calc */ %{ #define YYSTYPE double #include <math.h> %} /* BISON Declarations */ %token NUM %left '-' '+' %left '*' '/ %right '^' /* exponentiation */ /* Grammar follows */ %%

Bison Example 3 (2/2) input: /* empty string */ input line ; line: '\n' exp '\n' { printf ("\t%.10g\n", $1); }; exp: NUM { $$ = $1; } exp '+' exp { $$ = $1 + $3; } exp '-' exp { $$ = $1 - $3; } exp '*' exp { $$ = $1 * $3; } exp '/' exp { $$ = $1 / $3; } exp '^' exp { $$ = pow ($1, $3); } '(' exp ')' { $$ = $2; } ; %%

Syntax Errors Error productions can be added to the specification They help the compiler to recover from syntax errors and to continue to parse In order for the error productions to work we need at least one valid token after the error symbol Example: fu tio Call : ID para List ID error

Using Bison With Flex Bison and flex are obviously designed to work together Bison produces a driver program called yylex() (actually its ( ll - included in the lex library #i lude lex.yy.c i the last part of iso spe ifi atio this gives the program yylex access to bisons toke names

Using Bison with Flex (2) Thus do the following: % flex scanner.l % bison parser.y % cc y.tab.c -ly -ll This will produce an a.out which is a parser with an integrated scanner included

Laboratory Assignment 3

Parser Generations Finnish a parser specification given in a parser.y bison file, by adding rules for expressions, conditions and function definitions,...

Functions function : funcnamedecl parameters : type variables functions block ; { Outline: // Set the return type of the function // Set the function body // Set current function to point to the parent again } ; funcnamedecl : FUNCTION id { // Check if the function is already defined, report error if so // Create a new function information and set its parent to current function // Link the newly created function information to the current function // Set the new function information to be current function } ;

Expressions For precedence and associativity you can factorize the rules for expressio s or you can specify precedence and associativy at the top of the Bison specification file, in the Bison Declarations section. Read more about this in the Bison reference(s).

Expressions (2) Example with factoring: expression : expression + term { // If any of the sub-expressions is NULL, set $$ to NULL // Create a new Plus node but IntegerToReal casting might be needed }...

Conditions For precedence and associativity you can factorize the rules for o ditio s or you can specify precedence and associativy at the top of the Bison specification file, in the Bison Declarations section. Read more about this in the Bison reference(s).

Laboratory Assignment 4

Intermediate Code Is closer to machine code without being machine dependent. Can handle temporary variables. Means higher portability, intermediate code can easier be expanded to assembly code. Offers the possibility of performing code optimizations such as register allocation.

Intermediate Code (2) Why use intermediate languages? Retargeting - build a compiler for a new machine by attaching a new code generator to an existing front-end and middle-part Optimization - reuse intermediate code optimizers in compilers for different languages and different machines Code generation - for different source languages can be combined

Intermediate Languages Various types of intermediate code are: Infix notation Postfix notation Three address code Triples Quadruples

Quadruples You will use quadruples as intermediate language where an instruction has four fields: operator operand1 operand2 result

Generation of Intermediate Code <instr_list> program example; const PI = 3.14159; var a : real; b : real; begin b := a + PI; end. b := + NULL q_rplus A PI $1 q_rassign $1 - B q_labl 4 - - a PI

Quadruples (A + B) * (C + D) - E operator operand1 operand2 result + A B T1 + C D T2 * T1 T2 T3 - T3 E T4

Intermediate Code Generation The purpose of this assignment is to learn how abstract syntax trees can be translated into intermediate code. You are to finish a generator for intermediate code (quadruples) by adding rules for some language constructs. You will work in the file codegen.cc.

Binary Operations In function BinaryGenerateCode: Generate code for left expression and right expression. Generate either a realop or intop quadruple Type of the result is the same as the type of the operands You can use currentfunction->temporaryvariable

Array References The absolute address is computed as follows: absadr = baseadr + arraytypesize * index

Array References (2) Generate code for the index expression You must then compute the absolute memory address You will have to create several temporary variables (of integer type) for intermediate storage Generate a quadruple iaddr with id variable as input for getting the base address Create a quadruple for loading the size of the type in question to a temporary variable Then generate imul and iadd quadruples Finally generate either a istore or rstore quadruple

IF Statement S if E then S 1 S if E then S 1 else S 2

WHILE Statement S while E do S 1

Questions? TDDD55 Compilers and Interpreters 2013