CIS 341 Midterm February 28, Name (printed): Pennkey (login id): SOLUTIONS

Similar documents
CIS 341 Midterm March 2, 2017 SOLUTIONS

CIS 341 Midterm March 2, Name (printed): Pennkey (login id): Do not begin the exam until you are told to do so.

CIS 341 Final Examination 4 May 2017

CS 31: Intro to Systems Functions and the Stack. Martin Gagne Swarthmore College February 23, 2016

Assembly Language: Function Calls

Assembly Language: Function Calls" Goals of this Lecture"

CSE P 501 Exam 8/5/04 Sample Solution. 1. (10 points) Write a regular expression or regular expressions that generate the following sets of strings.

Assembly Language: Function Calls" Goals of this Lecture"

Assembly Language: Function Calls. Goals of this Lecture. Function Call Problems

x86 assembly CS449 Fall 2017

Lecture 15 CIS 341: COMPILERS

What the CPU Sees Basic Flow Control Conditional Flow Control Structured Flow Control Functions and Scope. C Flow Control.

Intel assembly language using gcc

Introduction to Compilers Professor Jeremy Siek

CPEG421/621 Tutorial

Procedure Calls. Young W. Lim Mon. Young W. Lim Procedure Calls Mon 1 / 29

AS08-C++ and Assembly Calling and Returning. CS220 Logic Design AS08-C++ and Assembly. AS08-C++ and Assembly Calling Conventions

Intermediate Code Generation

Function Calls COS 217. Reading: Chapter 4 of Programming From the Ground Up (available online from the course Web site)

CIS 341 Final Examination 3 May 2011

4/1/15 LLVM AND SSA. Low-Level Virtual Machine (LLVM) LLVM Compiler Infrastructure. LL: A Subset of LLVM. Basic Blocks

4) C = 96 * B 5) 1 and 3 only 6) 2 and 4 only

CSE P 501 Exam 8/5/04

CS241 Computer Organization Spring 2015 IA

Assignment 11: functions, calling conventions, and the stack

CMSC 313 Fall2009 Midterm Exam 2 Section 01 Nov 11, 2009

Procedure Calls. Young W. Lim Sat. Young W. Lim Procedure Calls Sat 1 / 27

How Software Executes

Question 4.2 2: (Solution, p 5) Suppose that the HYMN CPU begins with the following in memory. addr data (translation) LOAD 11110

CIT Week13 Lecture

W4118: PC Hardware and x86. Junfeng Yang

Introduction to Computer Systems. Exam 1. February 22, This is an open-book exam. Notes are permitted, but not computers.

Lecture #16: Introduction to Runtime Organization. Last modified: Fri Mar 19 00:17: CS164: Lecture #16 1

You may work with a partner on this quiz; both of you must submit your answers.

Assembly level Programming. 198:211 Computer Architecture. (recall) Von Neumann Architecture. Simplified hardware view. Lecture 10 Fall 2012

Introduction to Computer Systems. Exam 1. February 22, Model Solution fp

What is a Compiler? Compiler Construction SMD163. Why Translation is Needed: Know your Target: Lecture 8: Introduction to code generation

X86 Review Process Layout, ISA, etc. CS642: Computer Security. Drew Davidson

CSC 2400: Computing Systems. X86 Assembly: Function Calls"

Homework. In-line Assembly Code Machine Language Program Efficiency Tricks Reading PAL, pp 3-6, Practice Exam 1

UW CSE 351, Winter 2013 Midterm Exam

Compiler Construction D7011E

Lecture 9 CIS 341: COMPILERS

CMSC 313 Lecture 12. Project 3 Questions. How C functions pass parameters. UMBC, CMSC313, Richard Chang

CIS 341 Final Examination 30 April 2013

A Simple Syntax-Directed Translator

CIS 341 Final Examination 4 May 2018

CSE P 501 Exam 11/17/05 Sample Solution

CMSC 313 Lecture 12 [draft] How C functions pass parameters

CMSC 330: Organization of Programming Languages. Formal Semantics of a Prog. Lang. Specifying Syntax, Semantics

Winter Compiler Construction T11 Activation records + Introduction to x86 assembly. Today. Tips for PA4. Today:

2.2 Syntax Definition

CSE 582 Autumn 2002 Exam Sample Solution

CS 164 Handout 11. Midterm Examination. There are seven questions on the exam, each worth between 10 and 20 points.

Process Layout and Function Calls

A simple syntax-directed

CSE 413 Winter 2001 Final Exam Sample Solution

CMSC 350: COMPILER DESIGN

Summary: Direct Code Generation

ASSEMBLY III: PROCEDURES. Jo, Heeseung

CS642: Computer Security

Assembly III: Procedures. Jo, Heeseung

COMP 210 Example Question Exam 2 (Solutions at the bottom)

1 Lexical Considerations

CSE 401 Midterm Exam Sample Solution 11/4/11

CSE2421 FINAL EXAM SPRING Name KEY. Instructions: Signature

Machine-level Programming (3)

CS412/CS413. Introduction to Compilers Tim Teitelbaum. Lecture 21: Generating Pentium Code 10 March 08

CSE 401 Final Exam. December 16, 2010

CIS 120 Midterm II November 8, Name (printed): Pennkey (login id):

CS 31: Intro to Systems ISAs and Assembly. Martin Gagné Swarthmore College February 7, 2017

Compiler Construction: LLVMlite

Lecture 7 CIS 341: COMPILERS

CS 33: Week 3 Discussion. x86 Assembly (v1.0) Section 1G

Systems I. Machine-Level Programming V: Procedures

Machine Programming 3: Procedures

Syntax and Grammars 1 / 21

Assembly Programmer s View Lecture 4A Machine-Level Programming I: Introduction

THE COMPILATION PROCESS EXAMPLE OF TOKENS AND ATTRIBUTES

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

Code Generation. Lecture 30

Assembly III: Procedures. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

CS 403: Scanning and Parsing

CSC 2400: Computing Systems. X86 Assembly: Function Calls

CPSC 313, Summer 2013 Final Exam Date: June 24, 2013; Instructor: Mike Feeley

Computer Systems Organization V Fall 2009

CSCI 2121 Computer Organization and Assembly Language PRACTICE QUESTION BANK

CSE P 501 Exam Sample Solution 12/1/11

Optimizing Finite Automata

Time : 1 Hour Max Marks : 30

CPS104 Recitation: Assembly Programming

Integers and variables

Process Layout, Function Calls, and the Heap

CS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2016 Lecture 12

Stacks and Frames Demystified. CSCI 3753 Operating Systems Spring 2005 Prof. Rick Han

Stack -- Memory which holds register contents. Will keep the EIP of the next address after the call

CPSC 313, Winter 2016 Term 1 Sample Date: October 2016; Instructor: Mike Feeley

Lexical Considerations

CS5363 Final Review. cs5363 1

CSE 582 Autumn 2002 Exam 11/26/02

Transcription:

CIS 341 Midterm February 28, 2013 Name (printed): Pennkey (login id): My signature below certifies that I have complied with the University of Pennsylvania s Code of Academic Integrity in completing this examination. Signature: Date: SOLUTIONS 1

1. True or False (10 points) a. T F In X86 assembly, the instruction Jmp eax will cause the program counter regiser (eip) to be set to the contents of the eax register. b. T F We could remove the Call and Ret instructions from the X86lite subset we ve been using for class projects without changing the expressiveness of the language. c. T F It is possible to write a regular expression that matches that set of strings consisting of well-balanced parentheses. d. T F The regular expression (ɛ + b + bb)(a + ab + abb)* accepts all strings of a s and b s with at most two consecutive occurrences of b. (Here + indicates alternative choice.) e. T F LR(k) grammars cannot be right recursive. f. T F There is no such thing as a shift/shift conflict for a LR parser. g. T F In an LR(0) parser state, the item S (.L) indicates that nonterminal L is at the top of the stack because L is to the right of the dot. h. T F Choosing integer tags as a representation for OCaml datatype variants or C and Java-style enums to be sequential and dense (for example just the integers 0... N) allows match and switch compilation to use efficient jump tables. i. T F In C, the compiler lays out a 2D array with statically known dimensions (like int M[3][4]) so that all of the elements are contiguous in memory. j. T F When compiling C for 32-bit x86 and word aligned fields, returns 6 (bytes). sizeof(struct{char a; char b; int c}) 2

2. Lexing, Parsing, and Grammars (20 points) a. Consider the following ambiguous context-free grammar for the language of regular expressions, where R is the only nonterminal and the terminals are taken from the set { c, *, +, (, ), ɛ}. Here c stands for a collection of character-carrying tokens, where the character is c. (We use + rather than for alternation to avoid confusion with the symbol used on the grammar.) R ::= ɛ c R+R R* RR (R) We might implement the datatype of abstract syntax trees for this grammar using the following OCaml code: type rexp = Eps (* ɛ *) Char of char (* c *) Alt of rexp * rexp (* R+R *) Star of rexp (* R* *) Seq of rexp * rexp (* RR *) i. (6 points) Demonstrate that this grammar is ambiguous by giving two different abstract syntax trees (OCaml values of type rexp) that might be generated by parsing the input sequence: a +( b c d ) Alt(Char a, Seq(Char b, Seq(Char c, Char d ))) or Alt(Char a, Seq(Seq(Char b, Char c ), Char d )) ii. (8 points) Write down the context-free grammar obtained by disambiguating the language above so that all operators associate to the left and * has higher precedence than sequence, which has higher precedence than +. R ::= R+S S S ::= ST T T ::= ɛ c (R) T * 3

b. (6 points) Recall that a palindrome is a sequence that reads the same forward and backward: A and ABBA and BABAB are all palindromes but AB is not. Write down a context free grammar that recognizes all (and only) the palindromes over the two tokens {A, B}. Your grammar is allowed to be ambiguous. Use the nonterminal S as the start symbol. S ::= ɛ A B ASA BSB 4

3. LLVM IR & Intermediate code generation (25 points) In this problem we consider the LLVM intermediate representation. For your reference, the Appendix at the end of the exam defines the grammar for the subset of the LLVM IR we have been using (so far) in the course. a. (4 points) A collection of LLVM Lite IR blocks with an entry point entry must satisfy a number of invariants for it to be well formed (i.e. make sense ) as a program. For example, each %uid that is used as an operand in an instruction must be defined before it is used. What is the static single assignment invariant relative to the LLVM IR subset given in the appendix? Each %uid must appear on the left hand side of a defining = exactly once in the entire control-flow graph. b. (6 points) What value will be returned by running the following LLVM code, starting from the label entry? entry: %t1 = alloca store 3, %t1 %t2 = load %t1 %t3 = load %t1 %t4 = mul %t2, %t3 store %t4, %t1 %t5 = icmp eq %t2, %t3 br %t5, label %then, label %else 81 then: %t6 = alloca %t7 = load %t1 %t8 = mul %t7, %t7 store %t8, %t6 %t9 = load %t6 store %t9, %t1 br label %else else: %t10 = load %t1 ret %t10 5

c. (15 points) Some languages have a do { stmt } while(exp) construct that executes stmt once and then repeats it until the guard expression exp becomes false. In this problem, we investigate how to compile this statement form to LLVM IR. Suppose we have (most of the) implementations of compilation judgments: C [exp] = (operand stream) C [stmt] = stream Here, a stream is a sequence of labels, LLVM instructions, and terminators that can be broken up into basic blocks. (You need not distinguish these cases with constructors.) How do you translate the do while construct to LLVM? That is, give (OCaml-like) pseudo code that says how you would implement the rule below. Indicate which labels and LLVM uids should be freshly generated. C [do { stmt } while (exp)] = let (op, exp_insns) = C [exp] in let stmt_insns = C [stmt] in let tmp = mk_uid () in let lpre = mk_label() in let lpost = mk_label() in lpre: stmt_insns exp_insns %tmp = icmp eq op, 0 br %tmp label %lpost, label %lpre lpost: 6

4. X86 Assembly Programming (20 points) Consider the following C function: void foo(int x, int* y) { int a = x + (*y); *y = a; } Recall that int* y declares y to be an int pointer and that *y accesses the contents pointed to by y. The gcc compiler (in 32-bit only mode and without optimizations) produces the following X86 assembly code, which is in our X86lite subset and follows cdecl calling conventions: _foo: pushl %ebp movl %esp, %ebp ; %ebp set here subl $12, %esp movl 12(%ebp), %eax movl 8(%ebp), %ecx movl %ecx, -4(%ebp) movl %eax, -8(%ebp) movl -8(%ebp), %eax movl (%eax), %eax movl -4(%ebp), %ecx addl %ecx, %eax movl %eax, -12(%ebp) movl -8(%ebp), %eax movl -12(%ebp), %ecx movl %ecx, (%eax) addl $12, %esp popl %ebp ret a. (4 points) The caller of this function places argument x at which (indirect offset) memory location? (Relative to the value in %ebp after the line marked above.) a. -12(%ebp) b. -8(%ebp) c. -4(%ebp) d. 8(%ebp) e. 12(%ebp) b. (4 points) The local variable a resides at which (indirect offset) memory location? (Relative to the value in %ebp after the line marked above.) a. -12(%ebp) b. -8(%ebp) c. -4(%ebp) d. 8(%ebp) e. 12(%ebp) c. (4 points) How much memory does the stack frame used by _foo in this code take up in bytes? Include the saved return address and base pointer, and any stack space allocated for local storage, but not the space allocated by the caller for the function arguments. a. 8 bytes b. 16 bytes c. 20 bytes d. 24 bytes e. 28 bytes f. 32 bytes 7

d. (8 points) Which of the following optimized versions could replace the body _foo: and still be correct with respect to the C program and cdecl calling conventions? Mark all that are correct there may be more than one. i. THIS ONE _foo: pushl movl movl movl addl popl ret %ebp %esp, %ebp 12(%ebp), %eax 8(%ebp), %ecx %ecx, (%eax) %ebp ii. _foo: pushl movl movl addl movl popl ret %ebp %esp, %ebp 12(%ebp), %eax 8(%ebp), %eax %ebp, %esp %ebp iii. _foo: movl 4(%esp), %eax addl 8(%esp), %eax ret iv. _foo: movl 4(%esp), %ebx movl 8(%esp), %eax addl %ebx, (%eax) ret 8

5. Scope Checking (25 points) In this problem we will consider scope checking (a simple subset of) OCaml programs. The grammar for this subset of OCaml is given by the syntactic categories below: exp ::= Expressions x f variables and function names int integer constants exp 1 + exp 2 arithmetic exp 2 exp 2 function application let x = exp 1 in exp 2 local lets (exp) prog ::= Programs ;;exp answer expression let x = exp prog top-level declarations let f x = exp prog top-level, one-argument function declarations Scoping contexts for this language consist of comma-separated lists of variable and function names: G ::= Scoping Contexts empty context x, G add x to the context f, G add f to the context Scope checking for expressions is defined by the following inference rules, which use judgments of the form G exp. Recall that the notation x G means that x occurs in the context list G. x G G x [VARX] f G G f [VARF] G int [INT] G exp 1 G exp 2 G exp 1 + exp 2 [ADD] G exp 1 G exp 2 G exp 1 exp 2 [APP] G exp 1 x, G exp 2 G let x = exp 1 in exp 2 [LET] Scope checking for programs is defined by these three inference rules, which use judgments of the form G prog. G exp G ;; exp [EXP] G exp x, G prog G let x = exp prog [LETX] x, G exp f, G prog G let f x = exp prog [LETF] A program prog is considered to be well scoped exactly when it is possible to derive a judgment in the empty context: prog 9

a. Consider the following program: let x = 3 let f y = let z = 3 + y in z + x ;; f x It is well-scoped, as shown by the following (partial) derivation that has some missing pieces as marked by names G?, D1?, and D2?. D1? z G? y, x, 3 [INT] y y, x, y, x, y [VARX] [VARX] G? z [ADD] y, x, 3 + y G? z + x y, x, let z = 3 + y in z + x x G? G? x [LET] [VARX] [ADD] x, let f y = let z = 3 + y in z + x ;; f x let x = 3 let f y = let z = 3 + y in z + x ;; f x D2? [LETX] [LETF] i. (3 points) Which of the following derivation trees belongs in the spot marked D1? a. x x, x, x [VARX] b. 3 [INT] c. x, 3 [INT] d. ii. (3 points) What is the missing context G? a. z, x, b. y, x, c. x, y, z, d. z, y, x, x x [VARX] iii. (8 points) In the space below, draw the missing derivation tree marked by D2?. Be sure to include the names of the rules as in the tree above. f f, x, f, x, f [VARF] x f, x, f, x, x [VARX] [APP] f, x, f x f, x, ;; f x [EXP] 10

b. (5 points) The following program is syntactically well-formed but ill-scoped one of the x G checks fails for some variable x and some context G. What are they? (Be careful with the ordering of variables in the context!) let a = 3 let f1 b = b + a let f2 c = (let d = f1 3 in f1 a) + (f1 d) ;; f2 (f1 a) x = d G = c, f1,a, c. (6 points) The LETF rule above does not allow the function to be recursive. Write down the premises of the rule needed to allow a well-scoped function body to mention the function name itself: f, x, G exp f, G prog G let f x = exp prog [LETRECF] 11

Appendix LLVM Lite IR A subset of the LLVM IR that we have used in Project 3: op ::= %uid constant bop ::= add sub mul shl... cmpop ::= eq ne slt sle... insn ::= %uid = bop op1, op2 %uid = alloca %uid = load op1 store op1, op2 %uid = icmp cmpop op1, op2 terminator ::= ret op br op label %lbl1, label %lbl2 br label %lbl block ::= lbl: insn 1... insn n terminator An LLVM Lite IR program is a collection of blocks such that all labels mentioned in the terminators of those blocks are included among the labels of the blocks themselves. One label called entry is the designated entry point of the program. A %uid is an operand identifier and %lbl is a label identifier. 12