Lesson 10. CDT301 Compiler Theory, Spring 2011 Teacher: Linus Källberg



Outline
  Flex
  Bison
  Abstract syntax trees

FLEX

Flex
  Tool for automatic generation of scanners
  Open-source version of Lex
  Takes regular expressions as input
  Outputs a C (or C++) file for the scanner

Flex
  [Figure: mylexer.l (regexps) -> Flex -> mylexer.c (defines int yylex()) -> C compiler -> mylexer.obj (machine code)]

The input file to Flex
    Definitions
    %%
    Rules
    %%
    User code

The definitions section
  Macro definitions:
    Specify a letter:       letter     [A-Za-z]
    Specify a delimiter:    delimiter  [,:;.]
    Specify a digit:        digit      [0-9]
    Specify an identifier:  id         {letter}({letter}|{digit})*

The definitions section
  User code:
    %{
    #include <stdio.h>
    int a_nice_global_variable = 0;
    int my_favourite_function(void) { return 42; }
    %}

The rules section
  Rule = regexp + C code
  The longest matching pattern is used
  If two equally long patterns match, the first one in the file is used
  Examples:
    =|>=?|<(=|>)?   { return RELOP; }
    {id}            { return ID; }
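The two disambiguation rules above (longest match first, then rule order) can be sketched in plain C. This is only an illustration of the policy, not the table-driven code Flex actually generates; the token codes, the `while` keyword rule, and the function names are assumptions made for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token codes; a real scanner would get these from Bison. */
enum { NO_MATCH = 0, RELOP = 1, WHILE_KW = 2, ID = 3 };

/* Each matcher returns the length of the longest prefix of s it accepts, or 0. */
static size_t match_relop(const char *s) {
    if (s[0] == '=') return 1;
    if (s[0] == '>') return s[1] == '=' ? 2 : 1;
    if (s[0] == '<') return (s[1] == '=' || s[1] == '>') ? 2 : 1;
    return 0;
}

static size_t match_while(const char *s) {
    return strncmp(s, "while", 5) == 0 ? 5 : 0;
}

static size_t match_id(const char *s) {
    size_t n = 0;
    if (!isalpha((unsigned char)s[0])) return 0;
    while (isalnum((unsigned char)s[n])) n++;
    return n;
}

/* Rules in file order: earlier entries win ties. */
static const struct { size_t (*match)(const char *); int token; } rules[] = {
    { match_relop, RELOP }, { match_while, WHILE_KW }, { match_id, ID },
};

/* Returns the token for the longest match at s and stores its length in *len. */
int next_token(const char *s, size_t *len) {
    int token = NO_MATCH;
    assert(s != NULL && len != NULL);
    *len = 0;
    for (size_t i = 0; i < sizeof rules / sizeof rules[0]; i++) {
        size_t n = rules[i].match(s);
        if (n > *len) {          /* strictly longer only: a tie keeps the earlier rule */
            *len = n;
            token = rules[i].token;
        }
    }
    return token;
}
```

With this ordering, the input `while` is reported as the keyword (both rules match five characters, and the keyword rule comes first), while `whilex` is reported as an identifier because the identifier rule matches one character more.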

The regexp language of Flex
  ?    The previous regexp is optional
  {}   Macro expansion (defined in the definitions section)
  .    Matches any character that is not end of line
  $    Matches the end of a line
  ^    Matches the beginning of a line
  []   Matches any enclosed character

The [] syntax
  Similar to | but more powerful
  Example:
    digit [0123456789]
  is the same as
    digit 0|1|2|3|4|5|6|7|8|9
  Special characters inside the brackets: - and ^
    digit      [0-9]
    letter     [A-Za-z]
    non_digit  [^0-9]

The user code section
  Only C code is valid here
  Will be copied unchanged to the generated C file

The generated scanner
  By default, a function called yylex() is defined
    Works similarly to your GetNextToken() from lab 1
    The name can be changed with options
  Some globals are defined as well (can be changed into local variables with options):
    yyin      The file to read from
    yytext    The matched lexeme (char*)
    yyleng    The length of yytext
    yylineno  Line number of the match

The yywrap() function
  Called upon end-of-file
  Should be supplied by the user
  Suppressed with %option noyywrap or --noyywrap
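For a scanner that reads a single input, the user-supplied function can be as small as the sketch below. A nonzero return value tells the scanner there is no more input; returning 0 would mean that the function has arranged for more input (for example by pointing yyin at another file) and that scanning should continue.

```c
/* Minimal user-supplied yywrap(): report that no further input follows,
   so that yylex() returns its end-of-file value. */
int yywrap(void) {
    return 1;
}
```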

Scanner states in Flex
  Affect which tokens are recognized
  Example from the language ALF:
    { fref 32 DEADC0DE }    <- Identifier
    { hex_val DEADC0DE }    <- Hex constant

Scanner states in Flex
  Declare the state:
    %x READ_HEX
  Use the state to make rules conditional:
    hex_val                  { BEGIN(READ_HEX); return HEX_VAL_KW; }
    [a-zA-Z_][a-zA-Z0-9_]*   { return ID; }
    <READ_HEX>[0-9a-fA-F]+   { BEGIN(INITIAL); return NUM; }
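What the state buys you can be seen in a small hand-written scanner that keeps the same two states. This is a sketch of the idea only, not the code Flex generates. The `struct scanner` type, the `scan` function, the skipping of braces and blanks, and the decimal-number rule in the INITIAL state are assumptions added to make the example complete.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token codes and state names, mirroring the rules above. */
enum token { END = 0, HEX_VAL_KW, ID, NUM };
enum state { INITIAL, READ_HEX };

struct scanner {
    const char *pos;     /* next unread character */
    enum state state;    /* current scanner state */
    char text[64];       /* matched lexeme, playing the role of yytext */
};

/* Can character c begin a token in state st? */
static int starts_token(enum state st, int c) {
    return st == READ_HEX ? isxdigit(c) : (isalnum(c) || c == '_');
}

enum token scan(struct scanner *sc) {
    const unsigned char *p;
    size_t n = 0, copy;
    enum token tok;

    assert(sc != NULL && sc->pos != NULL);
    p = (const unsigned char *)sc->pos;
    while (*p != '\0' && !starts_token(sc->state, *p)) p++;  /* skip braces, blanks */
    if (*p == '\0') { sc->pos = (const char *)p; return END; }

    if (sc->state == READ_HEX) {             /* <READ_HEX>[0-9a-fA-F]+ */
        while (isxdigit(p[n])) n++;
        tok = NUM;
        sc->state = INITIAL;                 /* BEGIN(INITIAL) */
    } else if (isdigit(p[0])) {              /* assumed rule: decimal number */
        while (isdigit(p[n])) n++;
        tok = NUM;
    } else {                                 /* [a-zA-Z_][a-zA-Z0-9_]* */
        while (isalnum(p[n]) || p[n] == '_') n++;
        if (n == 7 && memcmp(p, "hex_val", 7) == 0) {
            tok = HEX_VAL_KW;
            sc->state = READ_HEX;            /* BEGIN(READ_HEX) */
        } else {
            tok = ID;
        }
    }

    copy = n < sizeof sc->text - 1 ? n : sizeof sc->text - 1;
    memcpy(sc->text, p, copy);
    sc->text[copy] = '\0';
    sc->pos = (const char *)(p + n);
    return tok;
}
```

Scanning `{ fref 32 DEADC0DE }` yields ID, NUM, ID, because DEADC0DE is read in the INITIAL state. Scanning `{ hex_val DEADC0DE }` yields HEX_VAL_KW followed by NUM, because the keyword switches the scanner to READ_HEX for exactly one token.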

Online resources
  http://flex.sourceforge.net/manual/index.html

BISON

Bison
  Tool for automatic generation of parsers
  Open-source alternative to Yacc
  Takes an SDT scheme as input
  Outputs C (or C++) source code for an LALR parser
  Commonly used together with Flex

Bison
  [Figure: myparser.y (SDT scheme) -> Bison -> myparser.c (defines int yyparse()) and myparser.h (token definitions); myparser.c -> C compiler -> myparser.obj (machine code)]

The input file to Bison
    Definitions
    %%
    SDT scheme
    %%
    User code

Definitions section
  Define tokens
  Define operator precedence
  Define operator associativity
  Define the types of grammar symbol attributes
  Write C code between %{ and %}
  Issue certain commands to Bison

Token definition
  Normal case:
    %token IDENTIFIER
    %token WHILE
  Token, precedence, associativity, and type:
    %left <Operator> RELOP
    %left <Operator> MINUSOP PLUSOP
    %right <Operator> NOTOP
  Enables use of ambiguous grammars!

Defining types
  Just enter the type inside <> before the list of tokens:
    %left <Operator> RELOP
    %left <Operator> MULOP
    %right <Operator> NOTOP UNOP
    %token <String> ID STRING
  Or the same for non-terminals:
    %type <Node> stmnt expr actuals exprs

The variable yylval
  Used by the lexical analyzer to store token attributes
  Default type is int
  May be given other types using %union:
    %union {
        int Operator;
        char *String;
        NODE_TYPE Node;
    }
  The type (member name) is then used like this:
    %token <String> ID STRING
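How the scanner fills in yylval can be sketched without running Bison by writing out a stand-in for the union type by hand. In a real project the union type and the yylval variable come from the generated parser and its header. The token codes, the operator codes, and the `classify` function below are hypothetical, and the Node member is left out to keep the sketch short.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hand-written stand-in for the type Bison derives from %union. */
typedef union {
    int Operator;
    char *String;
} YYSTYPE;

YYSTYPE yylval;                      /* normally defined by the generated parser */

enum { RELOP = 258, ID = 259 };      /* hypothetical token codes */
enum { OP_LT = 1, OP_GT = 2 };       /* hypothetical operator codes */

/* Plays the role of a scanner action: classify one lexeme and store its
   attribute in the union member that matches the token's declared type. */
int classify(const char *lexeme) {
    size_t n;
    assert(lexeme != NULL);
    if (strcmp(lexeme, "<") == 0) { yylval.Operator = OP_LT; return RELOP; }
    if (strcmp(lexeme, ">") == 0) { yylval.Operator = OP_GT; return RELOP; }
    /* Copy the lexeme: in a Flex scanner, yytext is overwritten by the next match. */
    n = strlen(lexeme) + 1;
    yylval.String = malloc(n);
    if (yylval.String != NULL) memcpy(yylval.String, lexeme, n);
    return ID;
}
```

The member written by the scanner has to agree with the type given in the token declaration, since the parser reads the attribute back through that same member.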

Code provided by the user
  yyerror(char* msg)   Function called on syntax errors
  yylex()              Function called to get the next token
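A minimal yyerror might look like the sketch below. The error counter is a hypothetical addition, handy for deciding after parsing whether to continue with later compiler phases; it is not something Bison provides.

```c
#include <stdio.h>

int syntax_error_count = 0;   /* hypothetical counter, not part of Bison */

/* Called by the generated parser when it detects a syntax error. */
void yyerror(const char *msg) {
    syntax_error_count++;
    fprintf(stderr, "error: %s\n", msg);
}
```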

Options to Bison
  Given on the command line or in the grammar file
  --defines or %defines: Output a C header file with definitions useful to a scanner
    Tokens (#defines) and the type of yylval
  %error-verbose: More detailed error messages
  --name-prefix or %name-prefix: Change the default yy prefix on all names
  %define api.pure: Do not use globals
  --verbose or %verbose: Write detailed information to an extra output file

Translation scheme section
    decl   : BASIC_TYPE idents ';' ;
    idents : idents ',' ident
           | ident ;
    ident  : ID ;

Semantic actions
  Written in C
  Executed when the production is used in a reduction
  $$, $1, $2, etc. refer to the attributes of the grammar symbols
    Can be used as regular C variables
    $$ refers to the attribute of the head, $1 to the attribute of the first symbol in the body, etc.
    E : E '+' T { $$ = $1 + $3; } ;
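One way to picture $$ and $n is as slots on a value stack that runs in parallel with the parse stack. The sketch below imitates a single reduction by E : E '+' T on a toy stack of int attributes. The stack layout and the function names are assumptions for illustration; a generated parser manages its stacks internally.

```c
#include <assert.h>

#define STACK_MAX 64

/* A toy value stack: one attribute per grammar symbol on the parse stack. */
static int values[STACK_MAX];
static int top = 0;                   /* number of values currently stacked */

void push_value(int v) {
    assert(top < STACK_MAX);
    values[top++] = v;
}

/* Reduction by E : E '+' T with the action { $$ = $1 + $3; }.
   The body has three symbols, so $1..$3 are the top three stack slots. */
int reduce_add(void) {
    int *body;
    int result;
    assert(top >= 3);
    body = &values[top - 3];          /* body[0] is $1, body[2] is $3 */
    result = body[0] + body[2];       /* $$ = $1 + $3 */
    top -= 3;                         /* pop the three body symbols ... */
    push_value(result);               /* ... and push the attribute of the head */
    return result;
}
```

The attribute of '+' occupies the slot for $2 even though the action never reads it, which is why the operands are $1 and $3 rather than $1 and $2.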

Using ambiguous grammars in Bison
  Default actions:
    Reduce/reduce: choose the first rule in the file
    Shift/reduce: always shift
  With explicit precedence and associativity:
    Shift/reduce: compare the precedence/associativity of the rule with that of the lookahead token

The %expect declaration
  To suppress shift/reduce warnings:
    %expect n
  where n is the exact number of conflicts

Contextual precedence
  The same token might have different precedence depending on context:
    expr -> expr - expr
          | expr * expr
          | - expr
          | id
  Stack: - expr    Input: * expr

Contextual precedence
  Define a dummy token:
    %left '-'
    %left '*'
    %left UMINUS
  Use the %prec modifier:
    expr : '-' expr %prec UMINUS ;

Examples of parser configurations
  Stack             Input   Action
  if (cond) stmt    else    shift
  expr + expr       *       shift
  expr * expr       +       reduce expr -> expr * expr
  expr * expr       *       reduce expr -> expr * expr
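The decisions in these configurations follow from one comparison between the precedence of the rule on the stack and that of the lookahead token. The sketch below spells that comparison out in C. The operator table (including the right-associative '^') and the function names are assumptions for illustration; Bison performs this resolution when it builds the parse tables, not at parse time.

```c
#include <stddef.h>

enum assoc { ASSOC_LEFT, ASSOC_RIGHT };
enum action { SHIFT, REDUCE };

struct op_info { char op; int prec; enum assoc assoc; };

/* Hypothetical table, as if declared with
   %left '+' '-'   then   %left '*' '/'   then   %right '^'
   (later declarations bind tighter). */
static const struct op_info ops[] = {
    { '+', 1, ASSOC_LEFT }, { '-', 1, ASSOC_LEFT },
    { '*', 2, ASSOC_LEFT }, { '/', 2, ASSOC_LEFT },
    { '^', 3, ASSOC_RIGHT },
};

static const struct op_info *lookup(char op) {
    for (size_t i = 0; i < sizeof ops / sizeof ops[0]; i++)
        if (ops[i].op == op) return &ops[i];
    return NULL;
}

/* rule_op:   operator of the rule that could be reduced (gives the rule's precedence)
   lookahead: the next input token */
enum action resolve(char rule_op, char lookahead) {
    const struct op_info *r = lookup(rule_op);
    const struct op_info *t = lookup(lookahead);
    if (r == NULL || t == NULL) return SHIFT;    /* no precedence: default is shift */
    if (r->prec > t->prec) return REDUCE;        /* rule binds tighter */
    if (r->prec < t->prec) return SHIFT;         /* lookahead binds tighter */
    return r->assoc == ASSOC_LEFT ? REDUCE : SHIFT;  /* equal: use associativity */
}
```

For example, `resolve('+', '*')` gives SHIFT and `resolve('*', '+')` gives REDUCE, matching the table. Tokens without declared precedence, such as else in the dangling-else case, fall back to the default shift.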

Online resources
  http://www.gnu.org/software/bison/manual/html_node/index.html

ABSTRACT SYNTAX TREES

Abstract syntax trees
  AST, or just syntax tree
  [Figure: parse tree and syntax tree for the expression a + 5 * b]

Syntax trees vs. parse trees
  Parse trees:
    Interior nodes are nonterminals, leaves are terminals
    Rarely constructed as an explicit data structure
    Represent the concrete syntax
  Syntax trees:
    Interior nodes are operators, leaves are operands
    Commonly constructed as an explicit data structure
    Represent the abstract syntax

Why syntax trees?
  Simplify subsequent analyses
  Independent of the parsing strategy
  Make it easier to add new analysis passes without having to modify the parser
  More compact representation than parse trees

Syntax tree example
    if (a < 1)
        b = 2 + 3;
    else {
        c = d * 4;
        e(f, 5);
    }
  [Figure: syntax tree for the statement above, with the if node at the root; unused child slots are null]

Exercise (1)
  Draw an abstract syntax tree for the statement
    while (i < 100) { x = 2 * x; i = i + 1; }

Constructing a syntax tree in Bison
    expr : expr '+' expr { $$ = createopnode($1, '+', $3); }
         | expr '*' expr { $$ = createopnode($1, '*', $3); }
         | ID            { $$ = createidnode($1.name); }
         ;
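The slides do not show the helpers called from these actions. The sketch below is one possible shape for them; the Node layout, the function bodies, and the extra helpers `tree_to_string` and `free_tree` are assumptions, not the course's implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct Node {
    char op;                    /* operator character, or 0 for an identifier leaf */
    char name[32];              /* identifier name (leaves only) */
    struct Node *left, *right;  /* operands (interior nodes only) */
} Node;

Node *createopnode(Node *left, char op, Node *right) {
    Node *n = calloc(1, sizeof *n);
    if (n == NULL) return NULL;
    n->op = op;
    n->left = left;
    n->right = right;
    return n;
}

Node *createidnode(const char *name) {
    Node *n = calloc(1, sizeof *n);
    if (n == NULL) return NULL;
    strncpy(n->name, name, sizeof n->name - 1);  /* calloc left the final byte 0 */
    return n;
}

/* Appends s to buf without writing past size bytes in total. */
static void append(char *buf, size_t size, const char *s) {
    size_t used = strlen(buf);
    if (used < size - 1) strncat(buf, s, size - 1 - used);
}

/* Appends the tree in prefix form, e.g. (+ a (* b c)), to the string in buf. */
void tree_to_string(const Node *n, char *buf, size_t size) {
    char head[4] = { '(', 0, ' ', '\0' };
    assert(buf != NULL && size > 0);
    if (n == NULL) return;
    if (n->op == 0) { append(buf, size, n->name); return; }
    head[1] = n->op;
    append(buf, size, head);
    tree_to_string(n->left, buf, size);
    append(buf, size, " ");
    tree_to_string(n->right, buf, size);
    append(buf, size, ")");
}

void free_tree(Node *n) {
    if (n == NULL) return;
    free_tree(n->left);
    free_tree(n->right);
    free(n);
}
```

Building `createopnode(createidnode("a"), '+', createopnode(createidnode("b"), '*', createidnode("c")))` is what the actions would do bottom-up for a + b * c, given suitable precedence declarations. Note that the tree keeps only operators and operands; no nonterminals appear in it.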

Constructing a syntax tree in Bison
    stmt  : RETURN expr ';' { $$ = mreturn($2, $1); }
          ;
    stmts : stmts stmt      { $$ = connectstmts($1, $2); }
          | /* empty */     { $$ = NULL; }
          ;
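Because the stmts rule is left-recursive, its action runs once per statement, in source order, each time with the sequence built so far as $1. A linked-list version of connectstmts could look like the sketch below. The Stmt layout and the list representation are assumptions; the course's helper may build a different structure.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Stmt {
    int id;                 /* stand-in for the real contents of a statement */
    struct Stmt *next;      /* next statement in the sequence, or NULL */
} Stmt;

/* Appends stmt to the end of list and returns the head of the sequence.
   list is NULL when the preceding stmts was derived from the empty production. */
Stmt *connectstmts(Stmt *list, Stmt *stmt) {
    Stmt *tail;
    assert(stmt != NULL);
    if (list == NULL) return stmt;
    tail = list;
    while (tail->next != NULL) tail = tail->next;
    tail->next = stmt;
    return list;
}
```

Walking to the tail on every call makes building a sequence of n statements quadratic in n; keeping a tail pointer alongside the head would avoid that, at the cost of a slightly larger attribute.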

Conclusion
  Flex generates C source code for a scanner given a set of regular expressions
  Bison generates C source code for a bottom-up parser given a syntax-directed translation scheme
  Building syntax trees simplifies subsequent analyses of the program
  Syntax trees can be built in semantic actions

Next time
  Syntax-directed definitions and translation schemes
  Semantic analysis and type analysis