Compiler Construction

Similar documents
Compiler Construction

Compiler Construction

Syntax Analysis Part IV

CSE302: Compiler Design

Syntax-Directed Translation

COMPILER CONSTRUCTION LAB 2 THE SYMBOL TABLE. Tutorial 2 LABS. PHASES OF A COMPILER Source Program. Lab 2 Symbol table

Context-free grammars

Compiler Construction: Parsing

Conflicts in LR Parsing and More LR Parsing Types

Yacc: A Syntactic Analysers Generator

Principles of Programming Languages

EDAN65: Compilers, Lecture 06 A LR parsing. Görel Hedin Revised:

TDDD55 - Compilers and Interpreters Lesson 3

Using an LALR(1) Parser Generator

Lecture 14: Parser Conflicts, Using Ambiguity, Error Recovery. Last modified: Mon Feb 23 10:05: CS164: Lecture #14 1

Bottom-Up Parsing. Lecture 11-12

G53CMP: Lecture 4. Syntactic Analysis: Parser Generators. Henrik Nilsson. University of Nottingham, UK. G53CMP: Lecture 4 p.1/32

CS 2210 Sample Midterm. 1. Determine if each of the following claims is true (T) or false (F).

COMPILER CONSTRUCTION Seminar 02 TDDB44

COMPILERS AND INTERPRETERS Lesson 4 TDDD16

CSCI Compiler Design

TDDD55- Compilers and Interpreters Lesson 3

Prof. Mohamed Hamada Software Engineering Lab. The University of Aizu Japan

Lecture 8: Deterministic Bottom-Up Parsing

In One Slide. Outline. LR Parsing. Table Construction

Lecture 7: Deterministic Bottom-Up Parsing

Compiler construction in4020 lecture 5

LR Parsing LALR Parser Generators

Compiler construction 2002 week 5

Compilers. Bottom-up Parsing. (original slides by Sam

Parsing How parser works?

Lexical and Syntax Analysis

Bottom-Up Parsing. Lecture 11-12

Compiler Lab. Introduction to tools Lex and Yacc

Yacc Yet Another Compiler Compiler

LR Parsing LALR Parser Generators

Compilation 2013 Parser Generators, Conflict Management, and ML-Yacc

COMP 181. Prelude. Prelude. Summary of parsing. A Hierarchy of Grammar Classes. More power? Syntax-directed translation. Analysis

Compiler construction 2005 lecture 5

Lexical and Syntax Analysis. Bottom-Up Parsing

Parsing Wrapup. Roadmap (Where are we?) Last lecture Shift-reduce parser LR(1) parsing. This lecture LR(1) parsing

Simple LR (SLR) LR(0) Drawbacks LR(1) SLR Parse. LR(1) Start State and Reduce. LR(1) Items 10/3/2012

Specifying Syntax. An English Grammar. Components of a Grammar. Language Specification. Types of Grammars. 1. Terminal symbols or terminals, Σ

A Bison Manual. You build a text file of the production (format in the next section); traditionally this file ends in.y, although bison doesn t care.

JavaCC Parser. The Compilation Task. Automated? JavaCC Parser

LR Parsing. Leftmost and Rightmost Derivations. Compiler Design CSE 504. Derivations for id + id: T id = id+id. 1 Shift-Reduce Parsing.

Wednesday, September 9, 15. Parsers

Parsers. What is a parser. Languages. Agenda. Terminology. Languages. A parser has two jobs:

Compiler Construction

Compiler Construction

Programming Languages (CS 550) Lecture 4 Summary Scanner and Parser Generators. Jeremy R. Johnson

Compiler Construction

Section A. A grammar that produces more than one parse tree for some sentences is said to be ambiguous.

Compiler Construction

Syn S t yn a t x a Ana x lysi y s si 1

IN4305 Engineering project Compiler construction

4. Lexical and Syntax Analysis

Parser. Larissa von Witte. 11. Januar Institut für Softwaretechnik und Programmiersprachen. L. v. Witte 11. Januar /23

Let us construct the LR(1) items for the grammar given below to construct the LALR parsing table.

Fall Compiler Principles Lecture 4: Parsing part 3. Roman Manevich Ben-Gurion University of the Negev

LR Parsing. Table Construction

As we have seen, token attribute values are supplied via yylval, as in. More on Yacc s value stack

Parsers. Xiaokang Qiu Purdue University. August 31, 2018 ECE 468

COMPILER (CSE 4120) (Lecture 6: Parsing 4 Bottom-up Parsing )

Configuration Sets for CSX- Lite. Parser Action Table

CPS 506 Comparative Programming Languages. Syntax Specification

Syntax Analysis. Martin Sulzmann. Martin Sulzmann Syntax Analysis 1 / 38

Chapter 4. Lexical and Syntax Analysis

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

How do LL(1) Parsers Build Syntax Trees?

Syntax Analysis The Parser Generator (BYacc/J)

Formal Languages and Compilers Lecture VII Part 4: Syntactic A

Fall Compiler Principles Lecture 5: Parsing part 4. Roman Manevich Ben-Gurion University

PRACTICAL CLASS: Flex & Bison

LECTURE 3. Compiler Phases

A Simple Syntax-Directed Translator

Compiler Construction

Syntax Analysis. Amitabha Sanyal. ( as) Department of Computer Science and Engineering, Indian Institute of Technology, Bombay

Hyacc comes under the GNU General Public License (Except the hyaccpar file, which comes under BSD License)

Compiler Construction

4. Lexical and Syntax Analysis

Compiler Construction

Introduction to Yacc. General Description Input file Output files Parsing conflicts Pseudovariables Examples. Principles of Compilers - 16/03/2006

Syntax and Parsing COMS W4115. Prof. Stephen A. Edwards Fall 2003 Columbia University Department of Computer Science

A programming language requires two major definitions A simple one pass compiler

Gujarat Technological University Sankalchand Patel College of Engineering, Visnagar B.E. Semester VII (CE) July-Nov Compiler Design (170701)

Monday, September 13, Parsers

Automatic Scanning and Parsing using LEX and YACC

shift-reduce parsing

Compiler Construction

Part III : Parsing. From Regular to Context-Free Grammars. Deriving a Parser from a Context-Free Grammar. Scanners and Parsers.

LR(0) Parsing Summary. LR(0) Parsing Table. LR(0) Limitations. A Non-LR(0) Grammar. LR(0) Parsing Table CS412/CS413

CS664 Compiler Theory and Design LIU 1 of 16 ANTLR. Christopher League* 17 February Figure 1: ANTLR plugin installer

MIT Parse Table Construction. Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology

Compiler Construction Lecture 1: Introduction Winter Semester 2018/19 Thomas Noll Software Modeling and Verification Group RWTH Aachen University

Compiler Construction

Introduction to Compiler Design

The Parsing Problem (cont d) Recursive-Descent Parsing. Recursive-Descent Parsing (cont d) ICOM 4036 Programming Languages. The Complexity of Parsing

3. Parsing. Oscar Nierstrasz

Syntax Analysis: Context-free Grammars, Pushdown Automata and Parsing Part - 4. Y.N. Srikant

Transcription:

Compiler Construction Thomas Noll Software Modeling and Verification Group RWTH Aachen University https://moves.rwth-aachen.de/teaching/ss-17/cc/

Recap: LR(1) Parsing Outline of Lecture 11 Recap: LR(1) Parsing Generating Top-Down Parsers Using ANTLR Generating Bottom-Up Parsers Using yacc and bison LL and LR Parsing in Practice Overview Semantic Analysis Attribute Grammars 2 of 34 Compiler Construction

Recap: LR(1) Parsing LR(1) Items and Sets Observation: not every element of fo(a) can follow every occurrence of A = refinement of LR(0) items by adding possible lookahead symbols Definition (LR(1) items and sets) Let G = N, Σ, P, S CFG Σ be start separated by S S. If S r αaaw r αβ 1 β 2 aw, then [A β 1 β 2, a] is called an LR(1) item for αβ 1. If S r αa r αβ 1 β 2, then [A β 1 β 2, ε] is called an LR(1) item for αβ 1. Given γ X, LR(1)(γ) denotes the set of all LR(1) items for γ, called the LR(1) set (or: LR(1) information) of γ. LR(1)(G) := {LR(1)(γ) γ X }. 3 of 34 Compiler Construction

Recap: LR(1) Parsing LR(1) Conflicts Definition (LR(1) conflicts) Let G = N, Σ, P, S CFG Σ and I LR(1)(G). I has a shift/reduce conflict if there exist A α 1 aα 2, B β P and x Σ ε such that [A α 1 aα 2, x], [B β, a] I. I has a reduce/reduce conflict if there exist x Σ ε and A α, B β P with A B or α β such that [A α, x], [B β, x] I. Lemma G LR(1) iff no I LR(1)(G) contains conflicting items. 4 of 34 Compiler Construction

Recap: LR(1) Parsing The LR(1) Action Function Definition (LR(1) action function) The LR(1) action function act : LR(1)(G) Σ ε {red i i [p]} {shift, accept, error} is defined by act(i, x) := red i if i 0, π i = A α and [A α, x] I shift if [A α 1 xα 2, y] I and x Σ accept if [S S, ε] I and x = ε error otherwise Corollary For every G CFG Σ, G LR(1) iff its LR(1) action function is well defined. 5 of 34 Compiler Construction

Generating Top-Down Parsers Using ANTLR Outline of Lecture 11 Recap: LR(1) Parsing Generating Top-Down Parsers Using ANTLR Generating Bottom-Up Parsers Using yacc and bison LL and LR Parsing in Practice Overview Semantic Analysis Attribute Grammars 6 of 34 Compiler Construction

Generating Top-Down Parsers Using ANTLR Overview of ANTLR ANother Tool for Language Recognition Input: language description using EBNF grammars Output: recogniser for the language Supports recognisers for three kinds of input: character streams (generation of scanner) token streams (generation of parser) node streams (generation of tree walker) Current version: ANTLR 4.7 generates LL( ) recognisers: flexible choice of lookahead length applies longest match principle supports ambiguous grammars by using first match principle for rules supports direct left recursion targets Java, C++, C#, Python, JavaScript, Go, Swift Details: http://www.antlr.org/ T. Parr: The Definitive ANTLR 4 Reference, Pragmatic Bookshelf, 2013 7 of 34 Compiler Construction

Generating Top-Down Parsers Using ANTLR Example: Infix Postfix Translator (Simple.g4) grammar Simple; // productions for syntax analysis program returns [String s]: e=expr EOF {$s = $e.s;}; expr returns [String s]: t=term r=rest {$s = $t.s + $r.s;}; rest returns [String s]: PLUS t=term r=rest {$s = $t.s + "+" + $r.s;} MINUS t=term r=rest {$s = $t.s + "-" + $r.s;} /* empty */ {$s = "";}; term returns [String s]: DIGIT {$s = $DIGIT.text;}; // productions for lexical analysis PLUS : + ; MINUS : - ; DIGIT : [0-9]; 8 of 34 Compiler Construction

Generating Top-Down Parsers Using ANTLR Java Code for Using Translator import java.io.*; import org.antlr.v4.runtime.*; public class SimpleMain { public static void main(final String[] args) throws IOException { String printsource = null, printsymtab = null, printir = null, printasm = null; SimpleLexer lexer = /* Create instance of lexer */ new SimpleLexer(new ANTLRInputStream(args[0])); SimpleParser parser = /* Create instance of parser */ new SimpleParser(new CommonTokenStream(lexer)); String postfix = parser.program().s; /* Run translator */ System.out.println(postfix); } } 9 of 34 Compiler Construction

Generating Top-Down Parsers Using ANTLR An Example Run 1. After installation, invoke ANTLR: $ java -jar /usr/local/lib/antlr-4.7-complete.jar Simple.g4 (will generate SimpleLexer.java, SimpleParser.java, Simple.tokens, and SimpleLexer.tokens) 2. Use Java compiler: $ javac -cp /usr/local/lib/antlr-4.7-complete.jar Simple*.java 3. Run translator: $ java -cp.:/usr/local/lib/antlr-4.7-complete.jar SimpleMain 9-5+2 95-2+ 10 of 34 Compiler Construction

Generating Top-Down Parsers Using ANTLR Advantages of ANTLR Advantages of ANTLR Generated (Java) code is similar to hand-written code possible (and easy) to read and debug generated code Syntax for specifying scanners, parsers and tree walkers is identical Support for many target programming languages ANTLR is well supported and has an active user community 11 of 34 Compiler Construction

Generating Bottom-Up Parsers Using yacc and bison Outline of Lecture 11 Recap: LR(1) Parsing Generating Top-Down Parsers Using ANTLR Generating Bottom-Up Parsers Using yacc and bison LL and LR Parsing in Practice Overview Semantic Analysis Attribute Grammars 12 of 34 Compiler Construction

Generating Bottom-Up Parsers Using yacc and bison The yacc and bison Tools Usage of yacc ( yet another compiler compiler ): yacc [f]lex spec.y y.tab.c lex.yy.c spec.l yacc specification Parser source Scanner source [f]lex specification cc a.out Executable LALR(1) parser 13 of 34 Compiler Construction

Generating Bottom-Up Parsers Using yacc and bison The yacc and bison Tools Usage of yacc ( yet another compiler compiler ): yacc [f]lex spec.y y.tab.c lex.yy.c spec.l yacc specification Parser source Scanner source [f]lex specification cc a.out Executable LALR(1) parser Like for [f]lex, a yacc specification is of the form Declarations (optional) %% Rules %% Auxiliary procedures (optional) 13 of 34 Compiler Construction

Generating Bottom-Up Parsers Using yacc and bison The yacc and bison Tools Usage of yacc ( yet another compiler compiler ): yacc [f]lex spec.y y.tab.c lex.yy.c spec.l yacc specification Parser source Scanner source [f]lex specification cc a.out Executable LALR(1) parser Like for [f]lex, a yacc specification is of the form Declarations (optional) %% Rules %% Auxiliary procedures (optional) bison : upward-compatible GNU implementation of yacc (more flexible w.r.t. file names,...) 13 of 34 Compiler Construction

Generating Bottom-Up Parsers Using yacc and bison yacc Specifications Declarations: Token definitions: %token Tokens Not every token needs to be declared ( +, =,...) Start symbol: %start Symbol (optional) C code for declarations etc.: %{ Code %} 14 of 34 Compiler Construction

Generating Bottom-Up Parsers Using yacc and bison yacc Specifications Declarations: Token definitions: %token Tokens Not every token needs to be declared ( +, =,...) Start symbol: %start Symbol (optional) C code for declarations etc.: %{ Code %} Rules: context-free productions and semantic actions A α 1 α 2... α n represented as A : α 1 {Action 1 } α 2. {Action 2 } α n {Action n }; Semantic actions = C statements for computing attribute values $$ = attribute value of A $i = attribute value of i-th symbol on right-hand side Default action: $$ = $1 14 of 34 Compiler Construction

Generating Bottom-Up Parsers Using yacc and bison yacc Specifications Declarations: Token definitions: %token Tokens Not every token needs to be declared ( +, =,...) Start symbol: %start Symbol (optional) C code for declarations etc.: %{ Code %} Rules: context-free productions and semantic actions A α 1 α 2... α n represented as A : α 1 {Action 1 } α 2. {Action 2 } α n {Action n }; Semantic actions = C statements for computing attribute values $$ = attribute value of A $i = attribute value of i-th symbol on right-hand side Default action: $$ = $1 Auxiliary procedures: scanner (if not generated by [f]lex), error routines,... 14 of 34 Compiler Construction

Generating Bottom-Up Parsers Using yacc and bison Example: Simple Desk Calculator I %{/* SLR(1) grammar for arithmetic expressions (Example 9.9) */ #include <stdio.h> #include <ctype.h> %} %token DIGIT %% line : expr \n { printf("%d\n", $1); }; expr : expr + term { $$ = $1 + $3; } term { $$ = $1; }; term : term * factor { $$ = $1 * $3; } factor { $$ = $1; }; factor : ( expr ) { $$ = $2; } DIGIT { $$ = $1; }; %% yylex() { int c; c = getchar(); if (isdigit(c)) yylval = c - 0 ; return DIGIT; return c; } 15 of 34 Compiler Construction

Generating Bottom-Up Parsers Using yacc and bison Example: Simple Desk Calculator II $ yacc calc.y $ cc y.tab.c -ly $ a.out 2+3 5 $ a.out 2+3*5 17 16 of 34 Compiler Construction

Generating Bottom-Up Parsers Using yacc and bison An Ambiguous Grammar I %{/* Ambiguous grammar for arithmetic expressions (Example 10.17) */ #include <stdio.h> #include <ctype.h> %} %token DIGIT %% line : expr \n { printf("%d\n", $1); }; expr : expr + expr { $$ = $1 + $3; } expr * expr { $$ = $1 * $3; } DIGIT { $$ = $1; }; %% yylex() { int c; c = getchar(); if (isdigit(c)) {yylval = c - 0 ; return DIGIT;} return c; } 17 of 34 Compiler Construction

Generating Bottom-Up Parsers Using yacc and bison An Ambiguous Grammar II Invoking yacc with the option -v produces a report y.output: State 8 2 expr: expr. + expr 2 expr + expr. 3 expr. * expr + shift and goto state 6 * shift and goto state 7 + [reduce with rule 2 (expr)] * [reduce with rule 2 (expr)] State 9 2 expr: expr. + expr 3 expr. * expr 3 expr * expr. + shift and goto state 6 * shift and goto state 7 + [reduce with rule 3 (expr)] * [reduce with rule 3 (expr)] 18 of 34 Compiler Construction

Generating Bottom-Up Parsers Using yacc and bison Conflict Handling in yacc Default conflict resolving strategy in yacc: reduce/reduce: choose first conflicting production in specification 19 of 34 Compiler Construction

Generating Bottom-Up Parsers Using yacc and bison Conflict Handling in yacc Default conflict resolving strategy in yacc: reduce/reduce: choose first conflicting production in specification shift/reduce: prefer shift resolves dangling-else ambiguity (Example 10.18) correctly also adequate for strong following weak operator (* after +; Example 10.17) and for right-associative operators not appropriate for weak following strong operator and for left-associative operators ( = reduce; see Example 10.17) 19 of 34 Compiler Construction

Generating Bottom-Up Parsers Using yacc and bison Conflict Handling in yacc Default conflict resolving strategy in yacc: reduce/reduce: choose first conflicting production in specification shift/reduce: prefer shift resolves dangling-else ambiguity (Example 10.18) correctly also adequate for strong following weak operator (* after +; Example 10.17) and for right-associative operators not appropriate for weak following strong operator and for left-associative operators ( = reduce; see Example 10.17) For ambiguous grammar: $ yacc ambig.y conflicts: 4 shift/reduce $ cc y.tab.c -ly $ a.out 2+3*5 17 $ a.out 2*3+5 16 19 of 34 Compiler Construction

Generating Bottom-Up Parsers Using yacc and bison Precedences and Associativities in yacc I General mechanism for resolving conflicts: %[left right] Operators 1. %[left right] Operators n operators in one line have given associativity and same precedence precedence increases over lines 20 of 34 Compiler Construction

Generating Bottom-Up Parsers Using yacc and bison Precedences and Associativities in yacc I General mechanism for resolving conflicts: %[left right] Operators 1. %[left right] Operators n operators in one line have given associativity and same precedence precedence increases over lines Example 11.1 %left + - %left * / %right ^ ^ (right associative) binds stronger than * and / (left associative), which in turn bind stronger than + and - (left associative) 20 of 34 Compiler Construction

Generating Bottom-Up Parsers Using yacc and bison Precedences and Associativities in yacc II %{/* Ambiguous grammar for arithmetic expressions with precedences and associativities */ #include <stdio.h> #include <ctype.h> %} %token DIGIT %left + %left * %% line : expr \n { printf("%d\n", $1); }; expr : expr + expr { $$ = $1 + $3; } expr * expr { $$ = $1 * $3; } DIGIT { $$ = $1; }; %% yylex() { int c; c = getchar(); if (isdigit(c)) {yylval = c - 0 ; return DIGIT;} return c; } 21 of 34 Compiler Construction

Generating Bottom-Up Parsers Using yacc and bison Precedences and Associativities in yacc III $ yacc ambig-prio.y $ cc y.tab.c -ly $ a.out 2*3+5 11 $ a.out 2+3*5 17 22 of 34 Compiler Construction

LL and LR Parsing in Practice Outline of Lecture 11 Recap: LR(1) Parsing Generating Top-Down Parsers Using ANTLR Generating Bottom-Up Parsers Using yacc and bison LL and LR Parsing in Practice Overview Semantic Analysis Attribute Grammars 23 of 34 Compiler Construction

LL and LR Parsing in Practice LL and LR Parsing in Practice In practice: use of LL(1)/LL( ) or LALR(1) 24 of 34 Compiler Construction

LL and LR Parsing in Practice LL and LR Parsing in Practice In practice: use of LL(1)/LL( ) or LALR(1) Detailed comparison (cf. Fischer/LeBlanc: Crafting a Compiler, Benjamin/Cummings, 1988): Simplicity: LL wins LL parsing technique easier to understand recursive-descent parser easier to debug than LALR action tables 24 of 34 Compiler Construction

LL and LR Parsing in Practice LL and LR Parsing in Practice In practice: use of LL(1)/LL( ) or LALR(1) Detailed comparison (cf. Fischer/LeBlanc: Crafting a Compiler, Benjamin/Cummings, 1988): Simplicity: LL wins Generality: LALR wins almost LL(1) LALR(1) (only pathological counterexamples) LL requires elimination of left recursion and left factorization 24 of 34 Compiler Construction

LL and LR Parsing in Practice LL and LR Parsing in Practice In practice: use of LL(1)/LL( ) or LALR(1) Detailed comparison (cf. Fischer/LeBlanc: Crafting a Compiler, Benjamin/Cummings, 1988): Simplicity: LL wins Generality: LALR wins Semantic actions: (see semantic analysis) LL wins actions can be placed anywhere in LL parsers without causing conflicts in LALR: implicit ε-productions = may generate conflicts 24 of 34 Compiler Construction

LL and LR Parsing in Practice LL and LR Parsing in Practice In practice: use of LL(1)/LL( ) or LALR(1) Detailed comparison (cf. Fischer/LeBlanc: Crafting a Compiler, Benjamin/Cummings, 1988): Simplicity: LL wins Generality: LALR wins Semantic actions: (see semantic analysis) LL wins Error handling: LL wins top-down approach provides context information = better basis for reporting and/or repairing errors 24 of 34 Compiler Construction

LL and LR Parsing in Practice LL and LR Parsing in Practice In practice: use of LL(1)/LL( ) or LALR(1) Detailed comparison (cf. Fischer/LeBlanc: Crafting a Compiler, Benjamin/Cummings, 1988): Simplicity: LL wins Generality: LALR wins Semantic actions: (see semantic analysis) LL wins Error handling: LL wins Parser size: comparable LL: action table LALR: action/goto table 24 of 34 Compiler Construction

LL and LR Parsing in Practice LL and LR Parsing in Practice In practice: use of LL(1)/LL( ) or LALR(1) Detailed comparison (cf. Fischer/LeBlanc: Crafting a Compiler, Benjamin/Cummings, 1988): Simplicity: LL wins Generality: LALR wins Semantic actions: (see semantic analysis) LL wins Error handling: LL wins Parser size: comparable Parsing speed: comparable both linear in length of input program (LL(1): see Lemma 7.15 for ε-free case) concrete figures tool dependent 24 of 34 Compiler Construction

LL and LR Parsing in Practice LL and LR Parsing in Practice In practice: use of LL(1)/LL( ) or LALR(1) Detailed comparison (cf. Fischer/LeBlanc: Crafting a Compiler, Benjamin/Cummings, 1988): Simplicity: LL wins Generality: LALR wins Semantic actions: (see semantic analysis) LL wins Error handling: LL wins Parser size: comparable Parsing speed: comparable Conclusion: choose LL when possible (depending on available grammars and tools) 24 of 34 Compiler Construction

Overview Outline of Lecture 11 Recap: LR(1) Parsing Generating Top-Down Parsers Using ANTLR Generating Bottom-Up Parsers Using yacc and bison LL and LR Parsing in Practice Overview Semantic Analysis Attribute Grammars 25 of 34 Compiler Construction

Overview Conceptual Structure of a Compiler Source code Lexical analysis (Scanner) Asg Syntax analysis (Parser) Var Exp Sum Semantic analysis Var Con Asg ok attribute grammars Generation of intermediate code int Var Exp int Sum int Code optimisation int Var Con int Generation of target code Target code 26 of 34 Compiler Construction

Semantic Analysis Outline of Lecture 11 Recap: LR(1) Parsing Generating Top-Down Parsers Using ANTLR Generating Bottom-Up Parsers Using yacc and bison LL and LR Parsing in Practice Overview Semantic Analysis Attribute Grammars 27 of 34 Compiler Construction

Semantic Analysis Beyond Syntax To generate (efficient) code, the compiler needs to answer many questions: Are there identifiers that are not declared? Declared but not used? 28 of 34 Compiler Construction

Semantic Analysis Beyond Syntax To generate (efficient) code, the compiler needs to answer many questions: Are there identifiers that are not declared? Declared but not used? Is x a scalar, an array, or a procedure? Of which type? 28 of 34 Compiler Construction

Semantic Analysis Beyond Syntax To generate (efficient) code, the compiler needs to answer many questions: Are there identifiers that are not declared? Declared but not used? Is x a scalar, an array, or a procedure? Of which type? Which declaration of x is used by each reference? 28 of 34 Compiler Construction

Semantic Analysis Beyond Syntax To generate (efficient) code, the compiler needs to answer many questions: Are there identifiers that are not declared? Declared but not used? Is x a scalar, an array, or a procedure? Of which type? Which declaration of x is used by each reference? Is x defined before it is used? 28 of 34 Compiler Construction

Semantic Analysis Beyond Syntax To generate (efficient) code, the compiler needs to answer many questions: Are there identifiers that are not declared? Declared but not used? Is x a scalar, an array, or a procedure? Of which type? Which declaration of x is used by each reference? Is x defined before it is used? Is the expression 3 * x + y type consistent? 28 of 34 Compiler Construction

Semantic Analysis Beyond Syntax To generate (efficient) code, the compiler needs to answer many questions: Are there identifiers that are not declared? Declared but not used? Is x a scalar, an array, or a procedure? Of which type? Which declaration of x is used by each reference? Is x defined before it is used? Is the expression 3 * x + y type consistent? Where should the value of x be stored (register/stack/heap)? 28 of 34 Compiler Construction

Semantic Analysis Beyond Syntax To generate (efficient) code, the compiler needs to answer many questions: Are there identifiers that are not declared? Declared but not used? Is x a scalar, an array, or a procedure? Of which type? Which declaration of x is used by each reference? Is x defined before it is used? Is the expression 3 * x + y type consistent? Where should the value of x be stored (register/stack/heap)? Do p and q refer to the same memory location (aliasing)?... 28 of 34 Compiler Construction

Semantic Analysis Beyond Syntax To generate (efficient) code, the compiler needs to answer many questions: Are there identifiers that are not declared? Declared but not used? Is x a scalar, an array, or a procedure? Of which type? Which declaration of x is used by each reference? Is x defined before it is used? Is the expression 3 * x + y type consistent? Where should the value of x be stored (register/stack/heap)? Do p and q refer to the same memory location (aliasing)?... These cannot be expressed using context-free grammars! 28 of 34 Compiler Construction

Semantic Analysis Beyond Syntax To generate (efficient) code, the compiler needs to answer many questions: Are there identifiers that are not declared? Declared but not used? Is x a scalar, an array, or a procedure? Of which type? Which declaration of x is used by each reference? Is x defined before it is used? Is the expression 3 * x + y type consistent? Where should the value of x be stored (register/stack/heap)? Do p and q refer to the same memory location (aliasing)?... These cannot be expressed using context-free grammars! (For example, {ww w Σ + } / CFL Σ ) 28 of 34 Compiler Construction

Semantic Analysis Static Semantics Static semantics Refers to properties of program constructs which are true for every occurrence of this construct in every program execution and can be decided at compile time ( static ) but are context-sensitive and thus not expressible using context-free grammars ( semantics ). 29 of 34 Compiler Construction

Semantic Analysis Static Semantics Static semantics Refers to properties of program constructs which are true for every occurrence of this construct in every program execution and can be decided at compile time ( static ) but are context-sensitive and thus not expressible using context-free grammars ( semantics ). Example properties Static: type or declaredness of an identifier, number of registers required to evaluate an expression,... Dynamic: value of an expression, size of procedure stack,... 29 of 34 Compiler Construction

Attribute Grammars Outline of Lecture 11 Recap: LR(1) Parsing Generating Top-Down Parsers Using ANTLR Generating Bottom-Up Parsers Using yacc and bison LL and LR Parsing in Practice Overview Semantic Analysis Attribute Grammars 30 of 34 Compiler Construction

Attribute Grammars Attribute Grammars I Goal: compute context-dependent but runtime-independent properties of a given program Idea: enrich context-free grammar by semantic rules which annotate syntax tree with attribute values = Semantic analysis = attribute evaluation Result: attributed syntax tree 31 of 34 Compiler Construction

Attribute Grammars Attribute Grammars I Goal: compute context-dependent but runtime-independent properties of a given program Idea: enrich context-free grammar by semantic rules which annotate syntax tree with attribute values = Semantic analysis = attribute evaluation Result: attributed syntax tree In greater detail: With every grammar symbol a set of attributes is associated. Two types of attributes are distinguished: Synthesized: bottom-up computation (from the leaves to the root) Inherited: top-down computation (from the root to the leaves) With every production a set of semantic rules is associated. 31 of 34 Compiler Construction

Attribute Grammars Attribute Grammars II Advantage: attribute grammars provide a very flexible and broadly applicable mechanism for transporting information through the syntax tree ( syntax-directed translation ) Attribute values: symbol tables, data types, code, error flags,... Application in Compiler Construction: static semantics program analysis for optimisation code generation error handling Automatic attribute evaluation by compiler generators (cf. ANTLR s and yacc s synthesized attributes) Originally designed by D. Knuth for defining the semantics of context-free languages (Math. Syst. Theory 2 (1968), pp. 127 145) 32 of 34 Compiler Construction

Attribute Grammars Example: Type Checking I Example 11.1 (Attribute grammar for type checking) Pgm Dcl Cmd Dcl ε Typ var;dcl Typ int Typ bool Cmd ε var := Exp;Cmd Exp num var Exp + Exp Exp < Exp Exp && Exp 33 of 34 Compiler Construction

Attribute Grammars Example: Type Checking I Example 11.1 (Attribute grammar for type checking) Pgm Dcl Cmd ok.0 = ok.2 Dcl ε st.0 = [id err id Id] Typ var;dcl st.0 = st.4[id.2 typ.1] Typ int typ.0 = int Typ bool typ.0 = bool Cmd ε ok.0 = true var := Exp;Cmd ok.0 = (env.0(id.1) = typ.3 ok.5) Exp num typ.0 = int var typ.0 = env.0(id.1) Exp + Exp typ.0 = (typ.1 = typ.3 = int? int:err) Exp < Exp typ.0 = (typ.1 = typ.3 = int? bool:err) Exp && Exp typ.0 = (typ.1 = typ.3 = bool? bool:err) Synthesized attributes: id (identifier name), ok (Boolean result), st (symbol table, mapping identifiers to types), typ (data type in {bool, int, err}) 33 of 34 Compiler Construction

Attribute Grammars Example: Type Checking I Example 11.1 (Attribute grammar for type checking) Pgm Dcl Cmd ok.0 = ok.2 env.2 = st.1 Dcl ε st.0 = [id err id Id] Typ var;dcl st.0 = st.4[id.2 typ.1] Typ int typ.0 = int Typ bool typ.0 = bool Cmd ε ok.0 = true var := Exp;Cmd ok.0 = (env.0(id.1) = typ.3 ok.5) env.3 = env.0 env.5 = env.0 Exp num typ.0 = int var typ.0 = env.0(id.1) Exp + Exp typ.0 = (typ.1 = typ.3 = int? int:err) env.1 = env.0 env.3 = env.0 Exp < Exp typ.0 = (typ.1 = typ.3 = int? bool:err) env.1 = env.0 env.3 = env.0 Exp && Exp typ.0 = (typ.1 = typ.3 = bool? bool:err) env.1 = env.0 env.3 = env.0 Synthesized attributes: id (identifier name), ok (Boolean result), st (symbol table, mapping identifiers to types), typ (data type in {bool, int, err}) Inherited attributes: env (environment same type as symbol table) 33 of 34 Compiler Construction

Attribute Grammars Example: Type Checking II Example 11.2 (Attributed syntax tree) For program int x; bool y; x := x+1; y := x && y: Pgm o Dcl s e Cmd o Typ t var i ; Dcl s var i := e Exp t ; e Cmd o int Typ t var i ; Dcl s e Exp t + e Exp t var i := Exp e t ; e Cmd o bool ε var i num e Exp t && e Exp t ε ( e = env, i = id, o = ok, s = st, t = typ) var i var i 34 of 34 Compiler Construction