More On Syntax Directed Translation

Types of Attributes

We have productions of the form

    A → X1 X2 X3 ... Xn

with semantic rules of the form

    b := f(c1, c2, c3, ..., cn)

where b and the c's are attributes of the grammar symbols.

b is called a synthesized attribute if:
- b is an attribute of A (i.e. the LHS), and
- the c's are all attributes of the X's (symbols on the RHS)

Synthesized Attributes

    b := f(c1, c2, c3)

[Figure: parse tree with A at the root carrying b, and children X1, X2, X3 carrying c1, c2, c3.]

Information to compute b is passed up the parse tree.

Inherited Attributes

    A → X1 X2 X3 ... Xn
    b := f(c1, c2, c3, ..., cn)

b is an inherited attribute if:
- b is an attribute of one of the X's (i.e. the RHS), and
- the c's are attributes of A and/or one or more of the other X's

which means they are beside or above where b is needed.

Inherited Attributes

    b := f(c1, c2, c3)

[Figure: parse tree with A at the root carrying c2, and children X1, X2, X3, where X2 carries b and X1 and X3 carry c1 and c3.]

Information to compute b must be passed down the parse tree.

Note that X2 is only associated with X1 and X3 via their appearance on the RHS of a production for A, so information must flow through A in the tree.

S-Attributed Definitions

A syntax directed definition is S-attributed if it uses only synthesized attributes.

This implies that the definition can be annotated by evaluating semantic rules for nodes bottom-up
- which fits naturally with bottom-up parsers
- and can also be evaluated with top-down parsers, since both types perform depth-first traversals

But the presence of inherited attributes poses a problem.

Inherited Attributes Example

A definition for C-style declarations:

    Production      Semantic Rules
    D → T L         L.type := T.type    (inherited attribute passed down to L)
    T → int         T.type := INTEGER
    T → float       T.type := FLOAT
    L → L1 , id     L1.type := L.type   (inherited from LHS)
                    settype(id.entry, L.type)
    L → id          settype(id.entry, L.type)

Inherited Attributes in C-Declarations

Annotated parse tree for the declaration int x, y:

    D
      T    T.type = INT
        int
      L    L.type = INT, settype(y.entry, INT)
        L    L.type = INT, settype(x.entry, INT)
          id (x)
        ,
        id (y)

Computing Synthesized and Inherited Attributes

Synthesized attributes
- Natural fit for bottom-up parsers
  Yacc: $$ = $1 + $3; etc.
- Can use the parsing function's return value in recursive descent

Inherited attributes
- Natural for top-down parsers
  Recursive descent: parameters in the parsing function call
- Quite troublesome for bottom-up parsers
  - Especially getting at the attribute of the left-hand side
  - Tricks such as reaching under the stack
  - Sometimes can be done with embedded actions
  - But usually dealt with later, during traversal of the AST
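The recursive-descent half of this slide can be sketched in C for the C-style declaration grammar. This is only an illustration under stated assumptions: the token stream is a NULL-terminated array of strings rather than scanner output, the left-recursive rule L → L1 , id is rewritten as the loop L → id { , id } so that recursive descent can handle it, and the array-based symbol table with settype and lookup_type is hypothetical. The synthesized T.type comes back as a return value, and the inherited L.type goes in as a parameter.

```c
#include <assert.h>
#include <string.h>

typedef enum { TYPE_NONE, TYPE_INTEGER, TYPE_FLOAT } Type;

/* Tiny symbol table (hypothetical): parallel arrays, enough for a demo. */
#define MAXSYMS 16
static const char *sym_name[MAXSYMS];
static Type sym_type[MAXSYMS];
static int nsyms = 0;

static void settype(const char *name, Type t) {
    assert(nsyms < MAXSYMS);
    sym_name[nsyms] = name;
    sym_type[nsyms] = t;
    nsyms++;
}

Type lookup_type(const char *name) {
    for (int i = 0; i < nsyms; i++)
        if (strcmp(sym_name[i], name) == 0) return sym_type[i];
    return TYPE_NONE;
}

/* Token stream: a NULL-terminated array of strings (assumption). */
static const char **tok;

/* T -> int | float ; T.type is synthesized, so it is the return value. */
static Type parse_T(void) {
    Type t = TYPE_NONE;
    if (*tok != NULL && strcmp(*tok, "int") == 0) t = TYPE_INTEGER;
    else if (*tok != NULL && strcmp(*tok, "float") == 0) t = TYPE_FLOAT;
    if (t != TYPE_NONE) tok++;
    return t;
}

/* L -> id { , id } ; L.type is inherited, so it arrives as a parameter. */
static int parse_L(Type inherited) {
    for (;;) {
        if (*tok == NULL || strcmp(*tok, ",") == 0) return 0; /* id expected */
        settype(*tok++, inherited);
        if (*tok == NULL) return 1;            /* end of the declaration */
        if (strcmp(*tok, ",") != 0) return 0;  /* only a comma may follow */
        tok++;
    }
}

/* D -> T L ; returns 1 on success, 0 on a syntax error. */
int parse_D(const char **tokens) {
    tok = tokens;
    Type t = parse_T();
    if (t == TYPE_NONE) return 0;
    return parse_L(t);   /* L.type := T.type, passed down as an argument */
}
```

Calling parse_D on the tokens float x , y enters both x and y with TYPE_FLOAT. No stack tricks are needed because the type is already known when the identifiers are reached.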

L-Attributed Definitions

A syntax directed definition is L-attributed if every inherited attribute of some symbol Xi on the RHS of some production

    A → X1 .. Xi-1 Xi .. Xn

depends only on
- the attributes of A (the LHS), and
- the attributes of the symbols X1 .. Xi-1 to the left of symbol Xi in the production

This implies that the definition can be annotated by evaluating semantic rules for nodes in a depth-first, left-to-right traversal of the tree.

Evaluation of L-Attributed Definitions

An L-attributed definition can be evaluated in a depth-first tree traversal as follows:

    procedure dfvisit(n: node);
    begin
        for each child m of n, from left to right do
        begin
            evaluate inherited attributes of m;
            dfvisit(m)
        end;
        evaluate synthesized attributes of n
    end

which means it can be evaluated on the fly, driven by a parser.

Note that every S-attributed definition is also L-attributed.
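As a concrete instance of dfvisit, the sketch below uses two attributes that are not from the slides and are chosen only for illustration: depth, inherited from the parent, and size, the number of nodes in the subtree, synthesized from the children. The Node layout and MAXKIDS are assumptions as well.

```c
#include <assert.h>
#include <stddef.h>

#define MAXKIDS 3

typedef struct Node {
    struct Node *child[MAXKIDS]; /* unused slots are NULL */
    int depth;  /* inherited: set from the parent before the node is visited */
    int size;   /* synthesized: set from the children after they are visited */
} Node;

/* Depth-first, left-to-right evaluation in the order dfvisit prescribes. */
void dfvisit(Node *n) {
    assert(n != NULL);
    for (int i = 0; i < MAXKIDS; i++) {
        Node *m = n->child[i];
        if (m == NULL) continue;
        m->depth = n->depth + 1;  /* evaluate inherited attributes of m */
        dfvisit(m);
    }
    /* evaluate synthesized attributes of n: every child is complete now */
    n->size = 1;
    for (int i = 0; i < MAXKIDS; i++)
        if (n->child[i] != NULL) n->size += n->child[i]->size;
}
```

The caller sets the root's depth (0, say) before the first call. Everything else is filled in by a single pass.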

Translation Schemes

A translation scheme is a CFG with
- attributes associated with grammar symbols, and
- semantic actions enclosed in braces { } embedded within the RHSs, to indicate the time during processing when the action should be executed to evaluate its attribute

Translation Schemes

An action can only be executed when all the attributes it refers to have already been evaluated.

For a synthesized attribute:
- the action can simply be placed at the end of the production

For an inherited attribute:
- an inherited attribute for a symbol on the RHS must be evaluated in an action before that symbol (and then passed down), and
- an action cannot refer to a synthesized attribute to the right of the action
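To show why the position of an action matters, here is a small sketch of a translation scheme that turns infix into postfix. The grammar is not from these slides. It is the familiar textbook scheme E → T R, R → + T { emit('+') } R | - T { emit('-') } R | ε, T → digit { emit(digit) }, assumed here for illustration, with the tail recursion in R written as a loop. The names to_postfix and emit are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

static const char *in;  /* next unread input character */
static char *out;       /* next free slot in the output buffer */
static char *out_end;   /* last slot, reserved for the terminator */

static void emit(char c) {
    assert(out < out_end);
    *out++ = c;
}

/* T -> digit { emit(digit) } */
static int parse_T(void) {
    if (!isdigit((unsigned char)*in)) return 0;
    emit(*in++);
    return 1;
}

/* R -> + T { emit('+') } R | - T { emit('-') } R | epsilon */
static int parse_R(void) {
    while (*in == '+' || *in == '-') {
        char op = *in++;
        if (!parse_T()) return 0;
        emit(op);  /* embedded action: runs after T, before the rest of R */
    }
    return 1;
}

/* E -> T R ; writes the postfix form of src into dst, returns 1 on success. */
int to_postfix(const char *src, char *dst, size_t dstsize) {
    assert(dstsize > 0);
    in = src;
    out = dst;
    out_end = dst + dstsize - 1;
    int ok = parse_T() && parse_R() && *in == '\0';
    *out = '\0';
    return ok;
}
```

For the input 9-5+2 the buffer ends up holding 95-2+. Moving emit(op) to the end of the production, after the recursive R, would print the operators in the wrong order.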

Pascal-Style Declarations

The C declaration example was L-attributed, but the following obvious grammar is not:

    D      → var id IDLIST : T
    IDLIST → , id IDLIST | ε
    T      → integer | real

(Symbol T is to the right of IDLIST in the first production, so the type cannot be passed down the tree during a left-to-right traversal.)

Rewriting Grammars to Facilitate Translation

The Pascal declaration grammar can be rewritten to permit the use of only synthesized attributes:

    D   → var id LIS     { settype(id.entry, LIS.type) }
    LIS → , id LIS1      { settype(id.entry, LIS1.type)
                           LIS.type := LIS1.type }
        | : T            { LIS.type := T.type }
    T   → integer        { T.type := INTEGER }
        | real           { T.type := FLOAT }

Attribute Evaluation with Revised Grammar

Annotated parse tree for var a, b, c : integer:

    D    settype(a.entry, INTEGER)
      var
      ID (a)
      LIS    settype(b.entry, INTEGER), LIS.type := INTEGER
        ,
        ID (b)
        LIS    settype(c.entry, INTEGER), LIS.type := INTEGER
          ,
          ID (c)
          LIS    LIS.type := INTEGER
            :
            T    T.type := INTEGER
              integer
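The revised grammar maps directly onto recursive descent with return values only. The sketch below is an illustration, not the course's code. It assumes a NULL-terminated array of string tokens in place of a scanner, a hypothetical array-based symbol table, and a helper named match. Each parse_LIS call returns LIS.type, and settype runs only after the recursive call has returned, so the identifiers are entered from right to left, as in the annotated tree.

```c
#include <assert.h>
#include <string.h>

typedef enum { TYPE_ERROR, TYPE_INTEGER, TYPE_FLOAT } Type;

/* Tiny symbol table (hypothetical): parallel arrays, enough for a demo. */
#define MAXSYMS 16
static const char *sym_name[MAXSYMS];
static Type sym_type[MAXSYMS];
static int nsyms = 0;

static void settype(const char *name, Type t) {
    assert(nsyms < MAXSYMS);
    sym_name[nsyms] = name;
    sym_type[nsyms] = t;
    nsyms++;
}

Type lookup_type(const char *name) {
    for (int i = 0; i < nsyms; i++)
        if (strcmp(sym_name[i], name) == 0) return sym_type[i];
    return TYPE_ERROR;
}

static const char **tok;  /* NULL-terminated token array (assumption) */

/* Consume the current token if it equals text. */
static int match(const char *text) {
    if (*tok != NULL && strcmp(*tok, text) == 0) { tok++; return 1; }
    return 0;
}

/* T -> integer | real ; returns T.type */
static Type parse_T(void) {
    if (match("integer")) return TYPE_INTEGER;
    if (match("real")) return TYPE_FLOAT;
    return TYPE_ERROR;
}

/* LIS -> , id LIS1 | : T ; returns LIS.type (synthesized) */
static Type parse_LIS(void) {
    if (match(":")) return parse_T();      /* LIS.type := T.type */
    if (!match(",") || *tok == NULL) return TYPE_ERROR;
    const char *id = *tok++;
    Type t = parse_LIS();                  /* LIS.type := LIS1.type */
    if (t != TYPE_ERROR) settype(id, t);   /* action at end of production */
    return t;
}

/* D -> var id LIS ; returns 1 on success, 0 on a syntax error. */
int parse_D(const char **tokens) {
    tok = tokens;
    if (!match("var") || *tok == NULL) return 0;
    const char *id = *tok++;
    Type t = parse_LIS();
    if (t == TYPE_ERROR) return 0;
    settype(id, t);
    return 1;
}
```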

Symbol Tables & Abstract Syntax Trees

Symbol Tables

Symbol tables can take many forms.
- We have seen the simple linked list form, as used in the type checking example
- For a language like Java there are typically several tables, such as:
  - A linked list of CLASS descriptors
  - For each CLASS, a linked list of METHOD and FIELD descriptors
  - For each method, a list of formal parameters, local variables, and a pointer to an AST structure for the actual code
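The simple linked-list form might look like the following sketch. The struct name VAR echoes the vartable field used later in the AST node, but the fields and the functions lookup and declare are assumptions made for illustration, not the course's actual code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef enum { typeNONE, typeINT, typeBOOL } Type;

typedef struct VAR {
    char *name;
    Type type;
    struct VAR *next;
} VAR;

/* Returns the entry for name, or NULL if it has not been declared. */
VAR *lookup(VAR *table, const char *name) {
    for (VAR *v = table; v != NULL; v = v->next)
        if (strcmp(v->name, name) == 0) return v;
    return NULL;
}

/* Prepends a new entry and returns the new head of the list.
   A name that is already present leaves the table unchanged. */
VAR *declare(VAR *table, const char *name, Type type) {
    if (lookup(table, name) != NULL) return table;
    VAR *v = malloc(sizeof *v);
    assert(v != NULL);
    v->name = malloc(strlen(name) + 1);
    assert(v->name != NULL);
    strcpy(v->name, name);
    v->type = type;
    v->next = table;
    return v;
}
```

Lookup is linear in the number of declarations, which is fine for small test programs. A production compiler would more likely use a hash table per scope.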

Tree Structures as Intermediate Code Representation

- Production compilers usually build some form of intermediate code during parsing, postponing target code generation until after optimizations can be performed
- The intermediate language is generally quite independent of the nitty-gritty details of any particular ISA of real machines
- A common form of intermediate code is a tree structure
- This cleanly separates the front end (source language analysis) from the back end (code generation for a specific target machine)

Parse Trees and Abstract Syntax Trees

Parse trees could be an intermediate form, but are cumbersome.

    E → E op E
      | ( E )
      | - E
      | ID

String: a * ( -b ) + c

[Figure: parse tree for a * ( -b ) + c.]

Only 3 operations, but 7 interior nodes!

Abstract Syntax Trees

Abstract Syntax Trees eliminate the clutter and capture meaning in a minimal form.

    E → E op E | ( E ) | - E | ID

String: a * ( -b ) + c

          +
         / \
        *   c
       / \
      a   -
          |
          b

3 interior nodes

Translation Scheme for Abstract Syntax Trees

    NODE *MakeNODE(op, left, right)
    NODE *MakeUNARY(op, arg)
    NODE *MakeLEAF(id)

    E → E1 op E2    { E.n = MakeNODE(op, E1.n, E2.n) }
      | ( E1 )      { E.n = E1.n }
      | - E1        { E.n = MakeUNARY(-, E1.n) }
      | ID          { E.n = MakeLEAF(id) }

- The AST is a very convenient representation for machine-independent optimizations
- Generate target code later via post-order tree traversal

[Figure: the AST for a * ( -b ) + c, with + at the root, * and c below it, and a and -b below *.]
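The three constructors and the post-order walk can be sketched in C as follows. The slide gives only the constructor names, so the NODE layout, the parameter types, and the pseudo-instructions printed by gen (push, unary, op) are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct NODE {
    char op;             /* operator character, or 0 for a leaf */
    const char *id;      /* identifier name for a leaf, otherwise NULL */
    struct NODE *left;   /* left operand, or the sole operand of a unary op */
    struct NODE *right;  /* right operand; NULL for unary ops and leaves */
} NODE;

static NODE *alloc_node(char op, const char *id, NODE *left, NODE *right) {
    NODE *n = malloc(sizeof *n);
    assert(n != NULL);
    n->op = op;
    n->id = id;
    n->left = left;
    n->right = right;
    return n;
}

NODE *MakeNODE(char op, NODE *left, NODE *right) {
    return alloc_node(op, NULL, left, right);
}

NODE *MakeUNARY(char op, NODE *arg) {
    return alloc_node(op, NULL, arg, NULL);
}

NODE *MakeLEAF(const char *id) {
    return alloc_node(0, id, NULL, NULL);
}

/* Counts operator nodes, i.e. the interior nodes of the AST. */
int interior_nodes(const NODE *n) {
    if (n == NULL || n->op == 0) return 0;
    return 1 + interior_nodes(n->left) + interior_nodes(n->right);
}

/* Post-order walk: operands first, then the operator that consumes them. */
void gen(const NODE *n, FILE *fp) {
    if (n == NULL) return;
    gen(n->left, fp);
    gen(n->right, fp);
    if (n->op == 0)            fprintf(fp, "push %s\n", n->id);
    else if (n->right == NULL) fprintf(fp, "unary %c\n", n->op);
    else                       fprintf(fp, "op %c\n", n->op);
}
```

Building the tree for a * ( -b ) + c by hand is one nested call: MakeNODE('+', MakeNODE('*', MakeLEAF("a"), MakeUNARY('-', MakeLEAF("b"))), MakeLEAF("c")). In a parser, the semantic actions would make the same calls, one per reduction.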

Abstract Syntax Trees

- The AST is simply a tree structure that is a simplification of a parse tree, containing only the significant information without the syntactic sugar.
- The symbol table is actually just another form of AST, which captures the relevant information about classes, attributes, and methods, while ignoring the syntactic details of how these are declared.
- It usually also contains other information fields, not necessarily filled in at parse time, for use in semantic analysis or code generation.
- ASTs are actually very straightforward to construct in a parser.

Expression Tree (AST)

(Examples are from an AST-based type checker.)

Binary operator case:

    struct AST {
        int opr;
        struct AST *left;
        struct AST *right;
    };

An operator, and pointers to AST nodes for the left and right operands.

Creating an AST node in a parser

    exp : exp '+' exp   { $$ = make_binary('+', $1, $3); }

with similar semantic actions creating variants of the AST node for different syntactic structures.

We pass the pointer up the tree via $$, so when the final reduction to the start symbol occurs, it gets a pointer to the root of the whole tree.

AST Node types

There are different operators (binary, unary) and different language structures (if, while), so we actually need many different kinds of nodes.
- We can create a unique data structure for each node type
  - This would lead to a large number of unique data structures to keep track of
- So we use a more general purpose structure with variants
  - If coding in an object-oriented language, we can use subclasses of a generic node class
  - In C, we have to use a "general purpose" structure, perhaps using a C union to deal with special cases

General Purpose AST Node

    #define MAXCHILDREN 2

    typedef enum { binary_exp, int_const, bool_const,
                   var_exp, assign_ast, prog_ast } NodeKind;

    typedef struct AST {
        struct AST *child[MAXCHILDREN];
        struct AST *next;
        int lineno;
        NodeKind nodekind;
        union {
            int op;
            int val;
            char *name;
            struct VAR *vartable;
        } attr;
        Type type;   /* for type checking of exps */
    } AST;

Example: if we have an AST pointer a and know that the nodekind is binary_exp, we can refer to the left operand of the operator as a->child[0], the right operand as a->child[1], and the operator will be a->attr.op, etc.

Node Creation

    /* calloc is declared in <stdlib.h> */
    #define NEW(type) ((type *) calloc(1, sizeof(type)))

    AST *make_binary(int opt, AST *left, AST *right)
    {
        AST *e = NEW(AST);
        e->nodekind = binary_exp;
        e->attr.op = opt;
        e->child[0] = left;
        e->child[1] = right;
        e->lineno = lineno;   /* from global variable in scanner */
        return e;
    }

...and one of these for each AST subtype.

Type Checking by AST Traversal

Assume a two-pass type checker:
1. First it parses the input and builds an AST and a symbol table as it goes
2. Then it traverses the AST recursively, computing the types of expressions and checking all semantic rules

Expression Productions

    e : e '=' e      { $$ = make_binary('=', $1, $3); }
      | e '+' e      { $$ = make_binary('+', $1, $3); }
      | e tor e      { $$ = make_binary(tor, $1, $3); }
      | '(' e ')'    { $$ = $2; }
      | NUM          { $$ = make_int($1); }
      | ttrue        { $$ = make_bool(1); }
      | tfalse       { $$ = make_bool(0); }
      | ID           { $$ = make_var($1); }
      ;

Tree Traversal to Decorate Tree and Check Type Rules

    Type type_check(AST *a)
    {
        Type t1, t2;
        int op;

        switch (a->nodekind) {
        case binary_exp:
            t1 = type_check(a->child[0]);
            t2 = type_check(a->child[1]);
            op = a->attr.op;
            if (op == '+') {   /* arithmetic */
                ASSERT(t1 == typeINT, "Left operand not Int", a->lineno);
                ASSERT(t2 == typeINT, "Right operand not Int", a->lineno);
                a->type = typeINT;
            }
            /* else if (op == xxx) ... for each of the other operators */
            break;
        /* case ... for all other AST variants */
        } /* end of switch */
        return a->type;
    }
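A cut-down version of this traversal that compiles on its own might look like the sketch below. Several things are assumptions rather than the course's code: the check function stands in for the ASSERT macro and just counts and reports errors, the token code tOR and the rule for '=' are guesses at what the elided operator cases do, the constructors take the line number as an argument instead of reading the scanner's global, and variables and assignments are left out.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

typedef enum { typeERROR, typeINT, typeBOOL } Type;
typedef enum { binary_exp, int_const, bool_const } NodeKind;
enum { tOR = 256 };  /* hypothetical token code for "or" */

typedef struct AST {
    struct AST *child[2];
    int lineno;
    NodeKind nodekind;
    union { int op; int val; } attr;
    Type type;
} AST;

static int error_count = 0;

/* Stand-in for the ASSERT macro: report the error and keep going. */
static void check(int ok, const char *msg, int lineno) {
    if (!ok) { fprintf(stderr, "line %d: %s\n", lineno, msg); error_count++; }
}

static AST *new_node(NodeKind kind, int lineno) {
    AST *e = calloc(1, sizeof *e);
    assert(e != NULL);
    e->nodekind = kind;
    e->lineno = lineno;
    return e;
}

AST *make_binary(int op, AST *left, AST *right, int lineno) {
    AST *e = new_node(binary_exp, lineno);
    e->attr.op = op;
    e->child[0] = left;
    e->child[1] = right;
    return e;
}

AST *make_int(int val, int lineno) {
    AST *e = new_node(int_const, lineno);
    e->attr.val = val;
    return e;
}

AST *make_bool(int val, int lineno) {
    AST *e = new_node(bool_const, lineno);
    e->attr.val = val;
    return e;
}

/* Decorates every node with its type and returns the type of the root. */
Type type_check(AST *a) {
    Type t1, t2;
    switch (a->nodekind) {
    case int_const:  a->type = typeINT;  break;
    case bool_const: a->type = typeBOOL; break;
    case binary_exp:
        t1 = type_check(a->child[0]);
        t2 = type_check(a->child[1]);
        switch (a->attr.op) {
        case '+':  /* arithmetic */
            check(t1 == typeINT, "Left operand not Int", a->lineno);
            check(t2 == typeINT, "Right operand not Int", a->lineno);
            a->type = typeINT;
            break;
        case tOR:  /* logical */
            check(t1 == typeBOOL, "Left operand not boolean", a->lineno);
            check(t2 == typeBOOL, "Right operand not boolean", a->lineno);
            a->type = typeBOOL;
            break;
        case '=':  /* comparison: operands must agree, result is boolean */
            check(t1 == t2, "Operand types differ", a->lineno);
            a->type = typeBOOL;
            break;
        default:
            check(0, "Unknown operator", a->lineno);
            a->type = typeERROR;
        }
        break;
    }
    return a->type;
}
```

As in the slide, a failed check does not stop the traversal. The node still receives the operator's result type, so one bad operand does not automatically make every enclosing expression fail as well.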

AST Typecheck Demo

     1: program test;
     2: var a: integer;
     3:     b: integer;
     4:     c: boolean;
     5: begin
     6:   a := c;
     7:   c := a=b;
     8:   a := true;
     9:   x := y+3;
    10:   b := a or c;
    11: end.

    % typecheck < test.p
    line 6: Assignment type mismatch
    line 8: Assignment type mismatch
    line 9: Undeclared identifier
    line 9: LHS of assignment not declared
    line 9: Undeclared identifier
    line 9: Left operand not Int
    line 9: Assignment type mismatch
    line 10: Left operand not boolean
    line 10: Assignment type mismatch
    %