CSCI565 Compiler Design

CSCI565 Compiler Design Spring 2015 Homework 2 - Solution

Problem 1: Attributive Grammar and Syntax-Directed Translation [25 points]

Consider the grammar fragment below. It allows for the declaration of scalar variables or single-dimensional arrays in a style similar to PASCAL, where the type specifier and its storage size indication precede the variable identifiers. To simplify the implementation we only allow some basic types. Also, integer stands for a terminal symbol with an integer value attribute named ival.

    VarDeclaration → TypeSpecifier Dimensions IdentifierList ';'
    TypeSpecifier  → void | char | float | int | double
    Dimensions     → '[' integer ']' | ε
    IdentifierList → id ',' IdentifierList | id

a. [15 points] Without rewriting the grammar, develop an attributive grammar and syntax-directed definition (SDD) that inserts all the symbols associated with the variable declaration in a symbol table as part of the semantic action associated with the VarDeclaration production. Your solution should also check for, and report, duplicate variable symbols, in which case only the first occurrence should be inserted in the symbol table.

b. [10 points] Show the values of your attributes as well as the order in which they need to be evaluated for the code fragment "int [6] a, b;". Can the evaluation of your attributes be carried out in a single pass over the parse tree? Why or why not?

a. There are many possible solutions to this simple problem. Here we present an L-attributed grammar solution with the list of attributes and corresponding types as follows:

    Dimensions:     baseSize (integer, inherited); size (integer, synthesized)
    TypeSpecifier:  baseSize (integer, synthesized)
    IdentifierList: list (list of symbols, synthesized)
    VarDeclaration: list (list of symbols, synthesized); size (integer, synthesized)

Using these attributes we can define the semantic rules as follows:

    VarDeclaration → TypeSpecifier Dimensions IdentifierList ';'
        { Dimensions.baseSize = TypeSpecifier.baseSize;   /* copy rule for the inherited attribute */
          for each id in IdentifierList.list do
              currentSymbolTable.insert(id, Dimensions.size);
          end for }

    TypeSpecifier → void     { TypeSpecifier.baseSize = 0; }
                  | char     { TypeSpecifier.baseSize = 1; }
                  | float    { TypeSpecifier.baseSize = 4; }
                  | int      { TypeSpecifier.baseSize = 4; }
                  | double   { TypeSpecifier.baseSize = 8; }

    Dimensions → '[' integer ']'   { Dimensions.size = Dimensions.baseSize * integer.ival; }   /* array case */
               | ε                 { Dimensions.size = Dimensions.baseSize; }                  /* scalar case */

    IdentifierList0 → id ',' IdentifierList1   { IdentifierList0.list = appendAndCheck(IdentifierList1.list, id); }
    IdentifierList  → id                       { IdentifierList.list = { id }; }

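b. Using these rules, one possible evaluation order is a left-to-right depth-first traversal of the parse tree, with inherited attributes computed on the way down and synthesized attributes on the way back up. Because the grammar is L-attributed, every attribute depends only on the parent and on siblings to its left, so the evaluation can be carried out in a single pass over the parse tree. The annotated parse tree for "int [6] a, b;" is:

    VarDeclaration                  list = {a, b}, size = 24
        TypeSpecifier               baseSize = 4
            int
        Dimensions                  baseSize = 4, size = 6 x 4
            '['  integer (6)  ']'
        IdentifierList              list = {a, b}
            id (a)                  list = {a}
            ','
            IdentifierList          list = {b}
                id (b)              list = {b}
        ';'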
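The rules of part (a) and the evaluation of part (b) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names declare and append_and_check, the dictionary used as the symbol table, and the error message format are hypothetical and not part of the solution above.

```python
# Base sizes from the TypeSpecifier rules (synthesized baseSize).
BASE_SIZE = {"void": 0, "char": 1, "float": 4, "int": 4, "double": 8}


def append_and_check(rest, name, errors):
    """Add `name` in front of `rest`, reporting a duplicate at most once.

    The list is built bottom-up from a right-recursive grammar, so `name`
    occurs textually before everything in `rest`; on a duplicate the later
    copy is dropped and the first occurrence is kept.
    """
    if name in rest:
        errors.append(f"duplicate symbol: {name}")
        rest = [n for n in rest if n != name]
    return [name] + rest


def declare(type_name, dim, ids, table, errors):
    """Evaluate the attributes for one VarDeclaration (dim=None means scalar)."""
    base_size = BASE_SIZE[type_name]                           # TypeSpecifier.baseSize
    size = base_size if dim is None else base_size * dim       # Dimensions.size
    names = []
    for name in reversed(ids):                                 # innermost IdentifierList first
        names = append_and_check(names, name, errors)
    for name in names:                                         # action of VarDeclaration
        if name in table:
            errors.append(f"duplicate symbol: {name}")
        else:
            table[name] = size
    return names, size


table, errors = {}, []
result = declare("int", 6, ["a", "b"], table, errors)          # "int [6] a, b;"
```

For the fragment "int [6] a, b;" the call yields the list {a, b} and the size 24, matching the annotated tree.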

Problem 2: Static-Single Assignment Representation [15 points]

For the sequence of instructions shown below depict an SSA-form representation (there could be more than one). Comment on the need to save all the values at the end of the loop and how the SSA representation helps you in your evaluation of the code. Do not forget to include the φ-functions.

            b = 0;
            d = ...;
            a = ...;
            i = ...;
    L1:     if (i > 0) {
                if (a < 0) {
                    d = 0;
                    b = 0;
                } else {
                    b = b + 1;
                    d = 1;
                }
                i = i - 1;
                if (i < 0) goto Lbreak;
                goto L1;
            }
    Lbreak: x = a;
            y = b;

A possible representation in SSA is shown below, where each value associated with a variable is denoted by a subscripted index (written here as a suffix). Notice that there is in fact no structured loop in this code; the repetition comes from the goto statements.

            a1 = ...
            b1 = 0
            i1 = ...
            d1 = ...
    L1:     i2 = φ(i1, i3)
            b2 = φ(b1, b5)
            if (i2 <= 0) then goto Lbreak
            if (a1 >= 0) then goto X
            d2 = 0
            b3 = 0
            goto Y
    X:      b4 = b2 + 1
            d3 = 1
    Y:      b5 = φ(b3, b4)
            i3 = i2 - 1
            if (i3 < 0) then goto Lbreak
            goto L1
    Lbreak: b6 = φ(b2, b5)
            x1 = a1
            y1 = b6

As can be observed by inspection, each use has a single definition point that reaches it and each value is defined only once. This is particularly tricky for variable 'b': the merge at Lbreak needs its own name (b6) because control can arrive there either from the test at L1 (carrying b2) or from the test after Y (carrying b5).
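The single-definition property claimed above can be checked mechanically. The sketch below is a hypothetical helper, not part of the solution: it encodes each SSA instruction as a (defined name, used names) pair and assumes the φ-function at Lbreak is given a fresh name, b6, so that no name is defined twice.

```python
# Each entry is (name defined, names used). Branches define nothing and are
# omitted. The name b6 for the merge at Lbreak is an assumption of this sketch.
ssa = [
    ("a1", []), ("b1", []), ("i1", []), ("d1", []),
    ("i2", ["i1", "i3"]), ("b2", ["b1", "b5"]),     # phi-functions at the loop head
    ("d2", []), ("b3", []),                          # then-branch
    ("b4", ["b2"]), ("d3", []),                      # else-branch (label X)
    ("b5", ["b3", "b4"]), ("i3", ["i2"]),            # join point (label Y)
    ("b6", ["b2", "b5"]), ("x1", ["a1"]), ("y1", ["b6"]),  # Lbreak
]


def check_ssa(instrs):
    """Return (names defined more than once, names used but never defined)."""
    defs = [target for target, _ in instrs]
    duplicated = sorted({t for t in defs if defs.count(t) > 1})
    undefined = sorted({u for _, uses in instrs for u in uses if u not in defs})
    return duplicated, undefined


report = check_ssa(ssa)
```

Both lists come back empty for this listing. Reusing an existing name for the φ-function at Lbreak would be reported as a duplicated definition.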

Problem 3: Symbol Table Organization [10 points]

For the PASCAL code below answer the following questions:

    01: procedure main
    02:   integer a, b, c;
    03:   procedure f1(w,x);
    04:     integer w, x;
    05:     f2(w,x);
    06:   end;
    07:   procedure f2(y,z);
    08:     integer a, y, z;
    09:     procedure f3(m,n);
    10:       integer b, m, n;
    11:       c = a * b * m + f3(y,z);
    12:       b = a * (x + 1);
    13:     end;
    14:     f3(c,z);
    15:   end;
    16:   function f4(k) : integer;
    17:     integer k;
    18:     f4 := (k + 1);
    19:   end;
    20:   ...
    21:   f1(a,b);
    22: end;

a) [05 points] Draw the symbol tables for each of the procedures in this code (including main) and show their nesting relationship by linking them via a pointer reference in the structure (or record) used to implement them in memory. Include the entries or fields for the local variables, arguments and any other information you find relevant for the purposes of code generation, such as type and location at run-time.

b) [05 points] For the statement in line 12, what are the specific instances of the variables used in this statement that the compiler needs to locate? Explain how the compiler obtains the data corresponding to each of these variables at compile time.

a) The tables below depict the hierarchical structure of the procedures in this PASCAL program. The tables for f1, f2 and f4 link to the table for main; the table for f3 links to the table for f2.

    main
        kind    symbol   type      size
        var     a        integer   4
        var     b        integer   4
        var     c        integer   4

    f1 (parent: main)
        kind    symbol   type      size
        param   w        integer   4
        param   x        integer   4

    f2 (parent: main)
        kind    symbol   type      size
        var     a        integer   4
        param   y        integer   4
        param   z        integer   4

    f4 (parent: main)
        kind    symbol   type      size
        param   k        integer   4

    f3 (parent: f2)
        kind    symbol   type      size
        var     b        integer   4
        param   m        integer   4
        param   n        integer   4

b) For the statement in line 12 we simply follow the symbol table entries to find the specific instance of each of the symbols. Given that the statement is located lexically inside the body of procedure f3, the search for symbols always begins in the symbol table for f3 and continues outward through f2 and then main. In the statement "b = a * (x + 1);" the symbol b refers to the local variable of procedure f3, the symbol a refers to the local variable of procedure f2, and the symbol x is undefined. This is a semantic error, which the compiler needs to report, as this symbol is out of scope.
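The lookup described in part (b) can be sketched as a chain of tables linked by parent pointers. The class name SymbolTable and its methods are hypothetical; the entries mirror the tables of part (a), with every symbol an integer of size 4.

```python
class SymbolTable:
    """One table per procedure, linked to the lexically enclosing table."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.entries = {}          # symbol -> (kind, type, size)

    def insert(self, symbol, kind, type_="integer", size=4):
        self.entries[symbol] = (kind, type_, size)

    def lookup(self, symbol):
        """Return the name of the innermost enclosing scope that declares
        `symbol`, or None when the symbol is out of scope."""
        scope = self
        while scope is not None:
            if symbol in scope.entries:
                return scope.name
            scope = scope.parent
        return None


main = SymbolTable("main")
f1 = SymbolTable("f1", parent=main)
f2 = SymbolTable("f2", parent=main)
f3 = SymbolTable("f3", parent=f2)
f4 = SymbolTable("f4", parent=main)

for sym in ("a", "b", "c"):
    main.insert(sym, "var")
for sym in ("w", "x"):
    f1.insert(sym, "param")
f2.insert("a", "var")
for sym in ("y", "z"):
    f2.insert(sym, "param")
f3.insert("b", "var")
for sym in ("m", "n"):
    f3.insert(sym, "param")
f4.insert("k", "param")

# Resolve the symbols of line 12, "b = a * (x + 1);", starting from f3.
resolved = {sym: f3.lookup(sym) for sym in ("b", "a", "x")}
```

Starting the search in f3, b resolves to f3, a resolves to f2, and x is not found in f3, f2 or main, which is the out-of-scope error reported above.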

Problem 4: Intermediate Code Generation [30 points]

Consider the code generation scheme for expressions described in class. Assume that the grammar allows field-access expressions on a simple identifier, such as a->f1 or b.f2, where the symbol a holds a reference to (the address of) a C struct and b names the location of a C struct. Assume for the purpose of this exercise that during a semantic analysis phase you have computed the offset of the first byte of each of the fields f1 and f2 referenced in these expressions. In this context answer the following:

a) [20 points] Derive an SDT code generation scheme that handles simple pointer references such as the ones above as well as more complicated ones with multiple pointer indirections, e.g., a->f1->f2. Note that you can combine the "->" and the "." operators. You need to include as part of your answer the relevant productions of the grammar.

b) [10 points] Show your code generation scheme for the simple expression a = b->f1 + c.f2, assuming that b and c are declared in the C programming language as shown below.

    typedef struct { int f1; } B;
    typedef struct { int y; int f2; } C;
    int a;
    B* b;
    C* c;

a) A parse tree helps us structure the grammar and the corresponding productions, attributes and semantic rules for this problem.

[Figure: parse tree of an assignment whose operands are field references on identifiers]

All the attributes should be synthesized, as that eases the integration with a bottom-up parser or, failing that, allows evaluation in a single bottom-up traversal of the tree. The attributes are:

    type:     symbol
    pointer:  boolean
    place:    temporary name
    code:     list of instructions

Using these attributes we also define a set of simple auxiliary functions, namely:

    type :: getfieldoffset(user-defined type, field_name);
    symboltable :: gettype(symbol_name);
    symboltable :: gettypeoffield(type_name, field_name);

As to the semantic rules, we can define them as follows:

    assign → id '=' exp1
        { t4 = newtemp();
          assign.place = t4;
          assign.code = append(exp1.code, gen('t4 = exp1.place'));
          assign.pointer = exp1.pointer;
          assign.type = gettype(id.symbol); }

    exp1 → exp2 '+' exp3
        { t3 = newtemp();
          exp1.place = t3;
          exp1.code = append(exp2.code, exp3.code, gen('t3 = exp2.place + exp3.place'));
          exp1.pointer = exp2.pointer;
          exp1.type = exp2.type; }

    exp → field
        { exp.code = field.code;
          exp.type = field.type;
          exp.pointer = field.pointer;
          exp.place = field.place; }

    field1 → field2 '->' id
        { offset = getfieldoffset(field2.type, id.symbol);
          t2 = newtemp();
          field1.place = t2;
          field1.type = gettypeoffield(field2.type, id.symbol);
          if (ispointertype(field2.type)) {
              field1.code = append(field2.code, gen('t2 = field2.place + offset'; 't2 = *t2'));
              field1.pointer = true;
          } else {
              error;
          } }

    field1 → field2 '.' id
        { offset = getfieldoffset(field2.type, id.symbol);
          t2 = newtemp();
          field1.place = t2;
          field1.type = gettypeoffield(field2.type, id.symbol);
          if (ispointertype(field2.type)) {
              error;
          } else {
              field1.code = append(field2.code, gen('t2 = field2.place + offset'; 't2 = *t2'));
              field1.pointer = ispointertype(field1.type);
          } }

    field1 → id
        { field1.type = gettype(id.symbol);
          field1.pointer = ispointertype(id.symbol);
          t1 = newtemp();
          field1.place = t1;
          field1.code = gen('t1 = id.symbol'); }

b) Using the attribute grammar outlined above, the annotated parse tree for the expression carries the following attribute values. Field f1 is the first field of B (offset 0) and field f2 follows the int field y in C (offset 4).

    '='                       place = a
                              code  = t1 = b; t2 = t1 + 0; t2 = *t2; t3 = c; t4 = t3 + 4; t4 = *t4; t5 = t2 + t4; a = t5;
    a                         place = a, code = null
    '+'                       place = t5
                              code  = t1 = b; t2 = t1 + 0; t2 = *t2; t3 = c; t4 = t3 + 4; t4 = *t4; t5 = t2 + t4;
    exp (left, field f1)      place = t2
                              code  = t1 = b; t2 = t1 + 0; t2 = *t2;
    exp (right, field f2)     place = t4
                              code  = t3 = c; t4 = t3 + 4; t4 = *t4;
    b                         type = B*, place = t1, pointer = true, code = 't1 = b'
    c                         type = C*, place = t3, pointer = true, code = 't3 = c'
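The scheme can be exercised with a small sketch that emits three-address code for the example. Everything named here is an assumption made for illustration: the tables STRUCTS and VARS stand in for the symbol table, the class Gen and its methods are hypothetical, the right operand is written with "->" because c is declared as a pointer (C*), and the assignment emits 'a = t5' directly rather than going through an extra temporary.

```python
import itertools

# Hypothetical stand-ins for the symbol table: field -> (byte offset, type).
STRUCTS = {"B": {"f1": (0, "int")}, "C": {"y": (0, "int"), "f2": (4, "int")}}
VARS = {"a": "int", "b": "B*", "c": "C*"}


class Gen:
    """Emits three-address code; each method returns a (place, type) pair."""

    def __init__(self):
        self.code = []
        self._counter = itertools.count(1)

    def newtemp(self):
        return f"t{next(self._counter)}"

    def ident(self, name):                       # field -> id
        t = self.newtemp()
        self.code.append(f"{t} = {name}")
        return t, VARS[name]

    def arrow(self, base, field_name):           # field -> field '->' id
        place, type_ = base
        if not type_.endswith("*"):
            raise TypeError("'->' requires a pointer operand")
        offset, field_type = STRUCTS[type_[:-1]][field_name]
        t = self.newtemp()
        self.code += [f"{t} = {place} + {offset}", f"{t} = *{t}"]
        return t, field_type

    def add(self, left, right):                  # exp -> exp '+' exp
        t = self.newtemp()
        self.code.append(f"{t} = {left[0]} + {right[0]}")
        return t, left[1]

    def assign(self, name, value):               # assign -> id '=' exp
        self.code.append(f"{name} = {value[0]}")


g = Gen()
lhs = g.arrow(g.ident("b"), "f1")                # b->f1
rhs = g.arrow(g.ident("c"), "f2")                # c->f2
g.assign("a", g.add(lhs, rhs))                   # a = b->f1 + c->f2
```

Because each method returns the place of its result, chained indirections such as a->f1->f2 are handled by nesting calls to arrow, provided the intermediate field has a pointer type.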

Problem 5: Back-patching of Loop Constructs [20 points]

We have covered in class an SDT scheme to generate code using the back-patching technique for a while loop construct. In this exercise you will develop a similar scheme for the repeat-while construct using the productions below, also taking into account continue and break statements. Argue that your solution works for the case of nested loops and for break and continue statements at different nesting levels.

    (1) S → repeat L while E ;
    (2) S → continue ;
    (3) S → break ;
    (4) L → S ; L
    (5) L → S

Do not forget to show the augmented productions with the marker non-terminal symbols, M and possibly N, along with the corresponding rules for the additional symbols and productions. Argue for the correctness of your solution without necessarily having to show an example.

As seen in class, a possible approach to this SDT scheme is to have three synthesized attributes for the statements: a nextlist, a skiplist and a breaklist. The skiplist holds the addresses of unresolved goto instructions that correspond to continue statements, whereas the breaklist holds the addresses of unresolved goto instructions that correspond to break statements. The nextlist holds the addresses of unresolved goto instructions that follow the regular control flow, as is the case for regular instructions or if-then-else constructs. The skiplist needs to be patched with the address of the first instruction that evaluates the control predicate of the loop at the current nesting level, while the breaklist needs to be patched with the first address following the current S construct. That address is not yet known at this level of the back-patching, and thus the addresses of the goto instructions in the breaklist are passed up as part of the synthesized attribute nextlist of S.
    (1) S → repeat M1 L while M2 E ;
        { backpatch(L.nextlist, M2.quad);
          backpatch(L.skiplist, M2.quad);
          backpatch(E.truelist, M1.quad);
          S.nextlist = merge(E.falselist, L.breaklist);
          S.breaklist = nil; }

    (2) S → continue ;
        { S.skiplist = newlist(nextaddr());
          emit('goto _');
          S.breaklist = nil;
          S.nextlist = nil; }

    (3) S → break ;
        { S.breaklist = newlist(nextaddr());
          emit('goto _');
          S.nextlist = nil;
          S.skiplist = nil; }

    (4) L1 → S ; M2 L2
        { backpatch(S.nextlist, M2.quad);
          L1.nextlist = L2.nextlist;
          L1.breaklist = merge(S.breaklist, L2.breaklist);
          L1.skiplist = merge(S.skiplist, L2.skiplist); }

    (5) L → S
        { L.breaklist = S.breaklist;
          L.nextlist = S.nextlist;
          L.skiplist = S.skiplist; }

    (6) M1 → ε   { M1.quad = nextaddr; }
    (7) M2 → ε   { M2.quad = nextaddr; }

Regarding the first production, the first two back-patching commands fill in the places where control in L is transferred to the evaluation of the conditional E, whose address is given by M2.quad: the regular exits of L and the continue statements. The third back-patching command links the places where the evaluation of E is true to M1.quad, that is, to the top of the loop. Next we merge L.breaklist, the addresses of the goto instructions generated by break statements, with E.falselist, as both hold addresses of goto instructions that will transfer control to the first instruction following the loop.

The continue statement generates a single entry in the skiplist, whereas the break statement generates a single entry in the breaklist of the corresponding S symbol.

Regarding the sequencing of statements in production (4), we have to link the addresses of continue instructions in S with the first instruction of the predicate of the repeat-while construct in which the continue is nested. This is accomplished by propagating the skiplist attribute. The addresses that correspond to break instructions in either S or L2 need to be merged, whereas L1.nextlist is simply the list of locations that need to be filled in with the address after L2, which is only known at the next level up.

Note that in nested loops a break instruction transfers control just one nesting level up: L.breaklist from (5) becomes part of S.nextlist in (1), and that S.nextlist is then patched to M2.quad in (4).
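The argument for nested loops can be checked with a small executable sketch of the scheme. The tuple encoding of statements, the function translate and the final halt instruction are hypothetical; the conditional E is reduced to one conditional jump (its truelist) followed by one goto (its falselist), and a loop statement returns empty skip and break lists so that nothing leaks to the enclosing loop.

```python
def translate(program):
    """Generate quads [op, target] for a statement list using back-patching."""
    code = []

    def emit(op):
        code.append([op, None])            # target None = unresolved or unused
        return len(code) - 1

    def backpatch(addresses, target):
        for addr in addresses:
            code[addr][1] = target

    def gen_stmt(s):
        """Return (nextlist, skiplist, breaklist) for one statement S."""
        if s[0] == "break":                # production (3)
            return [], [], [emit("goto")]
        if s[0] == "continue":             # production (2)
            return [], [emit("goto")], []
        if s[0] == "repeat":               # production (1)
            _, body, cond = s
            m1 = len(code)                 # M1.quad: top of the loop
            nextlist, skiplist, breaklist = gen_list(body)
            m2 = len(code)                 # M2.quad: start of the predicate
            backpatch(nextlist + skiplist, m2)
            truelist = [emit(f"if {cond} goto")]
            falselist = [emit("goto")]
            backpatch(truelist, m1)
            return falselist + breaklist, [], []
        emit(s[1])                         # ordinary statement
        return [], [], []

    def gen_list(stmts):
        """Productions (4) and (5), written as a loop over the list."""
        nextlist, skiplist, breaklist = [], [], []
        for s in stmts:
            backpatch(nextlist, len(code))  # previous S.nextlist -> M.quad
            nextlist, s_skip, s_break = gen_stmt(s)
            skiplist += s_skip
            breaklist += s_break
        return nextlist, skiplist, breaklist

    nextlist, _, _ = gen_list(program)
    backpatch(nextlist, len(code))
    emit("halt")
    return code


inner = ("repeat", [("break",), ("stmt", "b")], "e2")
outer = ("repeat", [("stmt", "a"), inner, ("continue",), ("stmt", "c")], "e1")
quads = translate([outer])
```

In the resulting code the break inside the inner loop jumps to the statement that follows the inner loop (one level up only), while the continue in the outer loop jumps to the evaluation of e1.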