Examination in Compilers, EDAN65

Department of Computer Science, Lund University
2016-10-28, 08.00-13.00

Note! Your exam will be marked only if you have completed all six programming lab assignments in advance.

- Start each solution (1, 2, 3, 4) on a separate sheet of paper.
- Write your personal identifier [1] on every sheet of paper.
- Write clearly and legibly.
- Try to find clear, readable solutions with meaningful names. Unnecessary complexity will result in point reduction.

The following documents may be used during the exam:

- Reference manual for JastAdd2
- x86 Cheat Sheet

You may also use a dictionary from English to your native language.

Max points: 60
For grade 3: Min 30
For grade 4: Min 40
For grade 5: Min 50

[1] The personal identifier is a short phrase, a code or a brief sentence of your choice. It can be anything, but not something that can reveal your identity. The purpose of this identifier is to make it possible for you to identify your exam in case something goes wrong with the anonymous code on the exam cover (such as if it is confused with another code due to sloppy writing).

1 Lexical analysis

The following token definitions are all part of a compiler for a programming language:

    ARROW = "->"
    MINUS = "-"
    GT = ">"
    DECREMENT = "--"

a) Consider the following string:

    "-->"

Suppose no disambiguation rules are used. List all token sequences that this string could be interpreted as. (3p)

b) There are two common rules for disambiguating lexical rules. Which of them is relevant for disambiguating the string "-->", and which token sequence will be the result when using that rule? (2p)

c) Whitespace in the language consists of arbitrarily long non-empty sequences of blanks, tabs, newlines and return characters. Give a regular expression WHITESPACE for such whitespace sequences. (2p)

d) Draw a combined DFA covering ARROW, MINUS, GT, DECREMENT, and WHITESPACE that has as few states as possible. Mark the final states with the appropriate tokens. (5p)

2 Context-Free Grammars

Consider the following program in a language that has a simple form of higher-order functions, where a function can take another function as a parameter. The integrate function is an example of such a higher-order function. It has a function parameter f, and computes an approximation of the integral of f over an interval low..hi. The main function shows an example of calling integrate with the function g as the third argument.

The type of f is declared as (float) -> float, meaning that f takes a float parameter and returns a float. As we can see, this matches the definition of g. In general, a function type in this language is written as (t1, t2, ..., tn) -> t for n ≥ 0, where t1, t2, ..., tn are the parameter types and t is the return type of the function.

The function example shows other examples of function parameters:

- The function parameter f1 takes an int and a float and returns nothing (void).
- The function parameter f2 takes no parameters and returns an int.
- The function parameter f3 takes an (int)->int function as its parameter and returns an int.

    float integrate(float low, float hi, (float) -> float f) {
      return (f(low) + 4*(f((low+hi)/2)) + f(hi)) * (hi-low) / 6;
    }
    float g(float x) {
      return 3*x*x*x + 2*x + 7;
    }
    void main() {
      print(integrate(0, 10, g));
    }
    void example((int, float) -> void f1, () -> int f2, ((int)->int) -> int f3) {
      ...
    }

Below, parts of the abstract grammar for the language are shown.

    Program ::= FunDecl*;
    FunDecl ::= Type IdDecl Param* Body;
    abstract Type;
    FloatType : Type;
    IntType : Type;
    VoidType : Type;
    FunType : Type ::= ParamType:Type* ReturnType:Type;
    IdDecl ::= <ID>;
    Param ::= Type IdDecl;
    Body ::= Stmt*;
    abstract Stmt;
    ...
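As a sanity check on the example program above, here is a minimal Python sketch of integrate and g. Python is used only because the exam language has no runnable implementation here; the translation is an assumption, with Python floats standing in for the language's float type. The formula in integrate is Simpson's rule, which is exact for polynomials of degree up to three, so for this g the result equals the true integral.

```python
def integrate(low, hi, f):
    """Approximate the integral of f over low..hi, as in the example program."""
    return (f(low) + 4 * f((low + hi) / 2) + f(hi)) * (hi - low) / 6


def g(x):
    """The cubic from the example program."""
    return 3 * x * x * x + 2 * x + 7


# Mirrors main(): the function g itself is passed as the third argument.
result = integrate(0, 10, g)
print(result)  # 7670.0
```

The call passes g without parentheses, which is the Python counterpart of passing a value of type (float) -> float.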

a) Construct an unambiguous context-free grammar for the part of the language described by the abstract grammar above. The grammar should be in EBNF form (Extended Backus-Naur Form), i.e., allowing optionals and lists. Your EBNF grammar should be as similar as possible to the abstract grammar: for each class in the abstract grammar, there should be a corresponding nonterminal with the same name, and you should avoid using additional nonterminals. Your grammar should cover the example program above, except for the statements inside functions. You may assume there is a predefined token ID for identifiers. (6p)

b) Below, some more parts of the abstract grammar are shown.

    ReturnStmt : Stmt ::= Expr;
    CallStmt : Stmt ::= Call;
    abstract Expr;
    Call : Expr ::= IdUse Arg:Expr*;
    IdUse : Expr ::= <ID>;
    abstract BinExpr : Expr ::= Left:Expr Right:Expr;
    Add : BinExpr;
    Sub : BinExpr;
    Mul : BinExpr;
    Div : BinExpr;
    FloatLit : Expr ::= <FLOAT>;
    IntLit : Expr ::= <INT>;

Construct an unambiguous context-free grammar for this part of the language, in BNF or canonical form, i.e., without using optionals or lists. The parse trees should reflect the usual associativity and precedence rules: binary operators are left-associative, and multiplication and division have higher precedence than addition and subtraction. Make sure your grammar covers the statements used in the example program. You may assume there are predefined tokens INT and FLOAT for integer and float literals. If you construct an ambiguous grammar that is otherwise correct, you will get half of the points. (8p)

c) Prove that the following expression can be derived from the nonterminal Expr of your BNF grammar, by drawing a parse tree for it. Make sure to include all nonterminals and terminals in the tree so that it matches the grammar exactly. The root of the tree should be an Expr nonterminal. It is fine to abbreviate the nonterminals in the tree, e.g., to write E instead of Expr, as long as the abbreviations are obvious.

    x + f(x) * 2

If you did not solve the previous task completely, and your grammar is ambiguous for this expression, provide two different parse trees for the expression to get full points on this task. (4p)

3 Program analysis

We will continue to work with the language with higher-order functions introduced in problem 2. The language does not permit the use of the void type for parameters, but since this is not prohibited by the abstract grammar, we will instead check this using attributes. Whereas the example program in problem 2 shows legal uses of void types, the example below shows a program with illegal uses:

    int f(void x) { // Line 1, col 7: Illegal use of "void".
      return 0;
    }
    int g((void) -> int p) { // Line 4, col 8: Illegal use of "void".
      return 0;
    }

We would like to compute a set of error messages for illegal uses of void types. For the above example program, the set should contain the two error message strings above. For the example program in problem 2, the set should be empty. To compute the line and column numbers, you can assume that each ASTNode has int attributes getline() and getcol().

a) Solve this problem by using attribute grammars, and without using Java's instanceof keyword. The result of the computation should be an attribute of type Set in the Program node. Hint! Use a collection attribute. (8p)

b) Solve the problem by using a visitor. The visitor should have a static method

    static Set result(Program node) { ... }

that computes the result. No attributes or inter-type methods may be used in this solution. You may assume that there is an interface Visitor, with a method

    void visit(C node);

for each concrete class C in the abstract grammar. Assume also that there is an abstract method

    void accept(Visitor v);

for the general class ASTNode, and an implementation of accept for each concrete class C in the abstract grammar. Each such implementation forwards the call to the appropriate visit method in v as follows:

    void accept(Visitor v) { v.visit(this); }

Hint! You don't have to visit all nodes. (7p)
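The accept/visit protocol described in b) can be illustrated on a toy tree. This is a minimal sketch in Python and not a solution to the task: the classes Leaf and Pair and the visitor LeafCounter are hypothetical and not part of the exam's abstract grammar. Since Python has no method overloading, the per-class visit methods get distinct names (visit_leaf, visit_pair), which is an adaptation of the interface in the exam text.

```python
class Visitor:
    """One visit method per concrete node class (hypothetical classes)."""

    def visit_leaf(self, node):
        pass

    def visit_pair(self, node):
        pass


class Leaf:
    def accept(self, v):
        # Forward to the visit method that matches this concrete class.
        v.visit_leaf(self)


class Pair:
    def __init__(self, left, right):
        self.left = left
        self.right = right

    def accept(self, v):
        v.visit_pair(self)


class LeafCounter(Visitor):
    """Counts Leaf nodes; the traversal is driven by the visitor itself."""

    def __init__(self):
        self.count = 0

    def visit_leaf(self, node):
        self.count += 1

    def visit_pair(self, node):
        # The visitor decides which children to descend into.
        node.left.accept(self)
        node.right.accept(self)

    @staticmethod
    def result(root):
        v = LeafCounter()
        root.accept(v)
        return v.count


tree = Pair(Leaf(), Pair(Leaf(), Leaf()))
print(LeafCounter.result(tree))  # 3
```

Because each visit method chooses which children to call accept on, a visitor can skip whole subtrees that cannot affect its result.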

4 Code generation and run-time systems

We will continue to work with the language with higher-order functions. Consider the following program:

    int h(int a, (int) -> int f) {
      return 3*(a + f(2*a));
    }
    int g(int x) {
      (** PC **)
      return 4 * x + 5;
    }
    void print(int x) {
      ...
    }
    void main() {
      print(h(1, g)); // Will print 42
    }

The same calling convention is to be used as in the labs, i.e., parameters are passed on the stack, and the return value is passed in rax. For a function parameter, it is the address of the function that should be passed. For example, when main calls h with the parameter g, it is the address of g that should be passed. This can be done with the instruction leaq, which moves an address into a register. This is in contrast to the instruction movq, which moves the content at an address into a register.

In subproblem a) you will make a drawing, and in subproblems b) and c) you will write x86 code. The code should be consistent with your drawing. Use only the instructions on the x86 Cheat Sheet. Use rbp as frame pointer and rsp as stack pointer. You are encouraged to comment your code to help us understand your intention. For simplicity and readability, you may leave out the characters q, $, % and the comma in the code. For example, you may write add 8 rax instead of addq $8, %rax.

a) Draw the situation on the stack at the location (** PC **). Your drawing should include stack pointer, frame pointer, dynamic links, and parameters. You may leave out possible temporaries from the drawing. Include the actual values as far as possible, including dynamic links, and mark which frame is which. (5p)

b) Translate the statement print(h(1, g)) to unoptimized x86 code. (5p)

c) Translate the function h to unoptimized x86 code. Also, draw a table showing the addresses of the parameters. (5p)
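The comment "Will print 42" in the source program can be checked by mirroring h and g in Python. This is a minimal sketch, assuming Python integers behave like the language's int for these small values. It says nothing about the stack layout or the x86 code asked for above.

```python
def h(a, f):
    # f is a function value, like the (int) -> int parameter in the exam program.
    return 3 * (a + f(2 * a))


def g(x):
    return 4 * x + 5


# h(1, g) = 3 * (1 + g(2)) = 3 * (1 + 13) = 42
print(h(1, g))  # 42
```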