CSCI565 Compiler Design


CSCI565 Compiler Design, Spring 2011
Homework 4 Solution
Due Date: April 6, 2011 in class

Problem 1: Activation Records and Stack Layout [50 points]

Consider the following C source program.

    01: #include <stdio.h>
    02: #include <stdlib.h>
    03:
    04: int num;
    05: int values[4];
    06:
    07: void procd(int d){
    08:   values[num] = d;
    09:   num = num + 1;
    10:   printf("%d\n",d);
    11: }
    12:
    13: int proca(int a){
    14:   procd(a);
    15:   return 0;
    16: }
    17:
    18: int procb(int b){
    19:   if(b <= 0){
    20:     return -b;
    21:   } else {
    22:     procd(b);
    23:     procc(b);
    24:   }
    25:   return 0;
    26: }
    27:
    28: int procc(int c){
    29:   return procb(c-1);
    30: }
    31:
    32: int main(int argc, char **argv){
    33:   num = 0;
    34:   proca(atoi(argv[1]));
    35:   procb(atoi(argv[2]));
    36:   printf("num: %d\n",num);
    37:   return 0;
    38: }

Questions:

(a) [05 points] Show the call graph, i.e., the caller-callee relationship for user-defined procedures/functions.

(b) [05 points] Show the call tree and its execution history, i.e., the argument values and the output produced by the program's execution when the value of argv[1] is 3 and argv[2] is 2.

(c) [05 points] Discuss, for this particular code, whether the Activation Records (ARs) for each of the functions proca through procd can be allocated statically. Explain why or why not.

(d) [15 points] Draw the contents of the stack when control reaches the line in the source code labeled 20, i.e., just before the program executes the return statement on this line. To indicate return addresses, use the designation N+ for a call instruction on line N. For instance, when procedure procb invokes procedure procd in line 22, the corresponding return address can be labeled 22+ to indicate that the return address is immediately after line 22. Indicate the contents of the global variables and of the local variables of each procedure, as well as the links in the AR. Use the AR organization described in class, indicating the location of each procedure's local variables in the corresponding AR.

(e) [10 points] The specific calling sequence of proca invoking procd could be shortened by inlining a copy of procd in the body of procedure proca. The structure of the AR for procedure proca would have to be modified to account for any local variables of procd. Discuss the impact this transformation would have on the size of the AR and on the access to local variables.

(f) [10 points] For this particular code, do you need to rely on the Access Links in the AR to access non-local variables? Why or why not?

Solution:

a) The call graph is a graph whose nodes represent the functions/procedures in the code and whose directed edges denote the caller-callee relationship. For this code the call graph has the nodes main, atoi, proca, procb, procc, procd and printf, and the following edges:

    main  -> atoi, proca, procb, printf
    proca -> procd
    procb -> procd, procc
    procc -> procb
    procd -> printf

b) The call tree is distinct from the call graph in that the individual invocations are explicitly represented in the tree (it is no longer a graph). The call tree for the inputs 3 and 2 is shown first, followed by the output of the program's execution.

    main(3,"3","2")
      atoi("3")
      proca(3)
        procd(3)
          printf("%d\n",3)
      atoi("2")
      procb(2)
        procd(2)
          printf("%d\n",2)
        procc(2)
          procb(1)
            procd(1)
              printf("%d\n",1)
            procc(1)
              procb(0)
      printf("num: %d\n",3)

Active section of the call tree when control reaches line 20:

    main -> procb(2) -> procc(2) -> procb(1) -> procc(1) -> procb(0)

Program output:

    prompt> ./main 3 2
    3
    2
    1
    num: 3

c) As can be seen in the call graph in a) above, a section of the call graph contains a cycle. This means that the procedures procb and procc are mutually recursive. The implication is that there can be multiple active instances of each of these procedures at a given point during the execution. There must therefore be a distinct AR for each active invocation, and as a consequence these ARs need to be allocated on the stack and not statically. Unlike these two procedures, procedures proca and procd can have their ARs allocated statically, since at any point during the execution each has at most one active invocation.

d) The organization of the stack with the ARs when control reaches the execution point labeled 20 is illustrated below. Values for arguments and global variables reflect the state of the execution at this point. Note that no AR includes storage for local variables, as no procedure declares any. Instead, the ARs include storage for arguments and the other fields discussed in class. In this depiction the stack grows downwards with each procedure invocation. Also depicted is a frame for the globally addressed data, in this case the variable num and the array values.

    global data
        num = 3
        values[0] = 3
        values[1] = 2
        values[2] = 1
        values[3] = 0

    main
        Ret. Address: -----
        Arg 1: argc = 3
        Arg 2: argv[1] = "3", argv[2] = "2"
    procb
        Ret. Address: 35+
        Arg 1: b = 2
    procc
        Ret. Address: 23+
        Arg 1: c = 2
    procb
        Ret. Address: 29+
        Arg 1: b = 1
    procc
        Ret. Address: 23+
        Arg 1: c = 1
    procb                          <- SP
        Ret. Address: 29+
        Arg 1: b = 0

e) Procedure procd has a single input argument and only manipulates global variables. Inlining it in procedure proca would have no effect on the AR of proca: the single argument of procd is never written to, so no value needs to be retained across invocations of procd, and the argument a of proca can be used directly in its place. The resulting code for procedure proca after inlining procd is shown below.

    int proca(int a){
      values[num] = a;
      num = num + 1;
      printf("%d\n",a);
      return 0;
    }

f) For C code such as this, there is no reason to include the Access Link in the procedures' ARs. The C language does not support nested scopes at the procedure level (although it does support nested static scopes inside each procedure). The only variables a procedure may access are its own local variables and the global or file-level variables, for which there is a single copy active at any given point during the execution. There is never a need to access the local variables of another procedure that is currently active. No Access Link is required, and neither is the Display mechanism.

Problem 2: Register Allocation [50 points]

Consider the following 3-address representation of a computation. In this assembly code, a procedure's arguments and local variables, as well as its return value, are accessed using the Activation Record Pointer (ARP) and the offsets listed in the AR layout that follows the code. Global variables are accessed via a global data pointer (GP), as is the case when accessing the global array of integer values A[ ]. Finally, control returns to the calling context via the return instruction on line 34.

    Source Line  Assembly                  Comment
                 t1 = ARP - 4              t1 = param a
                 t1 = *t1
                 t2 = ARP - 8
                 t2 = *t2
                 if (t1 < 0) goto L3       if (a < 0) return -1
                 t3 = GP + 12              global x = 1
                 *t3 = 1
                 t4 = GP + 16              global y = 0
                 *t4 = 0
                 t6 = ARP + 16             i = 0
                 t7 = 0
                 *t6 = t7
         L1:     t8 = ARP + 16
                 if (t8 == 32) goto L3     if (i < 31) return -1
                 t9 = GP + 20              t12 = A[i]
                 t11 = t8 * 4
                 t12 = t9 + t11
                 t12 = *t12
                 if (t12 = t2) goto L2     if (A[i] == b) return i
                 t5 = ARP + 16             return value = i
                 t5 = *t5
                 t13 = ARP + 4
                 *t13 = t5
                 t8 = ARP + 16             i = i + 1
                 t9 = t8 + 1
                 t8 = ARP + 16
                 *t8 = t9
         L3:     t10 = ARP + 4             return value = -1
                 *t10 = -1
    34:  L4:     return

    AR field          Offset
    arg: b              -8
    arg: a              -4
    return address       0
    return value         4
    Access Link          8
    (unlabeled)         12
    local: i            16

Questions:

(a) [10 points] Derive the set of basic blocks in this assembly code. Recall that a basic block is a maximal sequence of instructions with a single entry point and a single exit point. For each basic block, determine which variables (such as the temporary variables) are live when control leaves the basic block (this information will later be used by the register allocator), as well as the worst-case frequency of execution of each basic block when the procedure to which this code corresponds is executed. Present your results in tabular form.

(b) [15 points] Using the top-down register allocation algorithm described in class, determine which variables should be assigned to registers under the assumption that you have only 3 physical registers, r0, r1 and r2. Rewrite the code using these three physical registers, assuming that for temporary computations you can use three other scratch registers, r11, r12 and r13.

(c) [25 points] Using the graph-coloring register allocation algorithm, rewrite the 3-address code using 3 physical registers for the two definitions of instruction interference described in class. Present the interference graphs (or matrices) along with a possible color assignment. You do not have to show the intermediate steps of the coloring algorithm, but show the live ranges of each variable. Again assume you have the three additional registers r11, r12 and r13 for temporary computations.

Solution:

(a) The basic blocks are shown below, followed by a table of the variables that are live on exit from each basic block and a table of live ranges. As can be seen, the live ranges of these variables are extremely short. In fact, only variables t2 and t8 have live ranges that span multiple basic blocks.

    ENTRY

    BB1:         t1 = ARP - 4
                 t1 = *t1
                 t2 = ARP - 8
                 t2 = *t2
                 if (t1 < 0) goto L3

    BB2:         t3 = GP + 12
                 *t3 = 1
                 t4 = GP + 16
                 *t4 = 0
                 t6 = ARP + 16
                 t7 = 0
                 *t6 = t7

    BB3:    L1:  t8 = ARP + 16
                 if (t8 == 32) goto L3

    BB4:         t9 = GP + 20
                 t11 = t8 * 4
                 t12 = t9 + t11
                 t12 = *t12
                 if (t12 = t2) goto L2

    BB5:         t5 = ARP + 16
                 t5 = *t5
                 t13 = ARP + 4
                 *t13 = t5

    BB6:         t8 = ARP + 16
                 t9 = t8 + 1
                 t8 = ARP + 16
                 *t8 = t9

    BB7:    L3:  t10 = ARP + 4
                 *t10 = -1

    BB8: 34: L4: return

    EXIT

    Live Out
    Variable  01  02  03  04  05  06  07  08
    t1         0   0   0   0   0   0   0   0
    t2         1   1   1   1   0   1   0   0
    t3         0   0   0   0   0   0   0   0
    t4         0   0   0   0   0   0   0   0
    t5         0   0   0   0   0   0   0   0
    t6         0   0   0   0   0   0   0   0
    t7         0   0   0   0   0   0   0   0
    t8         0   0   1   0   0   0   0   0
    t9         0   0   0   0   0   0   0   0
    t10        0   0   0   0   0   0   0   0
    t11        0   0   0   0   0   0   0   0
    t12        0   0   0   0   0   0   0   0
    t13        0   0   0   0   0   0   0   0

    Live Ranges
    Variable  Instruction Ranges
    t1        [0-5]
    t2        [3-20] [26-31]
    t3        [6-7]
    t4        [8-9]
    t5        [21-24]
    t6        [10-12]
    t7        [11-12]
    t8        [13-17] [26-30]
    t9        [16-17] [28-30]
    t10       [32-33]
    t11       [17-18]
    t12       [18-20]
    t13       [23-24]

(b) The figure below gives the frequency of execution of each block and the corresponding metric for each variable. Based on the metric developed for this top-down allocator, we select the temporary variables t8, t9 and t12 to be mapped to the registers r0, r1 and r2, leaving all the other variables to use the scratch registers r11, r12 and r13. The resulting code using the physical registers r0, r1 and r2 is also shown in the figure below. Notice that the modified code has no spill code. Instead, for the conditional statement on line 19, the value of t2 (an input parameter of the function) has to be reloaded a second time into register r12, hence the two additional lines of code with addresses labeled 00.

    Source Line  Assembly                  Comment
                 r11 = ARP - 4             t1 = param a
                 r11 = *r11
                 r12 = ARP - 8
                 r12 = *r12
                 if (r11 < 0) goto L3      if (a < 0) return -1
                 r11 = GP + 12             global x = 1
                 *r11 = 1
                 r12 = GP + 16             global y = 0
                 *r12 = 0
                 r11 = ARP + 16            i = 0
                 r12 = 0
                 *r11 = r12
         L1:     if (r0 == 32) goto L3     if (i < 31) return -1
                 r1 = GP + 20              t12 = A[i]
                 r11 = r0 * 4
                 r2 = r1 + r11
                 r2 = *r2
    00:          r12 = ARP - 8
    00:          r12 = *r12
                 if (r2 = r12) goto L2     if (A[i] == b) return i
                 r11 = ARP + 16            return value = i
                 r11 = *r11
                 r12 = ARP + 4
                 *r12 = r11
                 r1 = r0 + 1               i = i + 1
                 *r0 = r1
         L3:     r11 = ARP + 4             return value = -1
                 *r11 = -1
    34:  L4:     return

    BB ID      01  02  03  04  05  06  07  08
    Frequency   1   1  32  32   1  32   1   1

    Number of references per basic block
    Variable  01  02  03  04  05  06  07  08
    t1         4   0   0   0   0   0   0   0
    t2         3   0   0   1   0   0   0   0
    t3         0   2   0   0   0   0   0   0
    t4         0   2   0   0   0   0   0   0
    t5         0   0   0   0   4   0   0   0
    t6         0   2   0   0   0   0   0   0
    t7         0   2   0   0   0   0   0   0
    t8         0   0   4   1   0   6   0   0
    t9         0   0   0   2   0   2   0   0
    t10        0   0   0   0   0   0   2   0
    t11        0   0   0   2   0   0   0   0
    t12        0   0   0   4   0   0   0   0
    t13        0   0   0   0   2   0   0   0

    Variable  Metric
    t1           4
    t2          35
    t3           2
    t4           2
    t5           4
    t6           2
    t7           2
    t8         352
    t9         128
    t10          2
    t11         64
    t12        128
    t13          2

(c) Two interference graphs over the nodes t1 through t13 are considered: the first for the simpler notion of interference, and the second for the notion of interference that takes into account the use of the value in each register when the RHS is evaluated. The live ranges for the second definition carry an R superscript when a Read is the last operation of the range, or a W superscript when a Write is the first operation of the range, on the instruction at that specific line.

    Live Ranges (first definition)
    Variable  Instruction Ranges
    t1        [0-5]
    t2        [3-20] [26-31]
    t3        [6-7]
    t4        [8-9]
    t5        [21-24]
    t6        [10-12]
    t7        [11-12]
    t8        [13-17] [26-30]
    t9        [16-17] [28-30]
    t10       [32-33]
    t11       [17-18]
    t12       [18-20]
    t13       [23-24]

    Live Ranges (second definition)
    Variable  Instruction Ranges
    t1        [0-5]
    t2        [3-20] [26-31]
    t3        [6-7]
    t4        [8-9]
    t5        [21-24]
    t6        [10-12]
    t7        [11-12]
    t8        [13-17] [26-30]
    t9        [16-17] [28-30]
    t10       [32-33]
    t11       [17-18R]
    t12       [18W-20]
    t13       [23-24]

As can be seen, there is a slight but important difference between the two interference graphs. For the first definition of interference, the graph contains a clique of size 4, which means that it cannot be colored with fewer than 4 colors. For the second definition, we can easily color the graph with 3 colors and hence use 3 registers. For the first definition, given that only 3 registers are available, we need to select one of the nodes to spill. The coloring algorithm suggests dropping one of the higher-degree nodes. In this case all the nodes in the 4-clique have the same degree. Looking at the live ranges, t11 is the node with the shortest range, so we choose not to assign it a register and instead use one of the scratch registers to temporarily hold its value for the computations in lines 17 and 18.

The figure below depicts the resulting code with the assignment of registers for the second definition of interference, hence using the three registers r0, r1 and r2.

    Source Line  Assembly                  Comment
                 r0 = ARP - 4              t1 = param a
                 r1 = ARP - 8
                 r1 = *r1
                 if (r0 < 0) goto L3       if (a < 0) return -1
                 r0 = GP + 12              global x = 1
                 *r0 = 1
                 r0 = GP + 16              global y = 0
                 *r0 = 0
                 r2 = 0                    i = 0
                 *r0 = r2
         L1:     if (r0 == 32) goto L3     if (i < 31) return -1
                 r2 = GP + 20              t12 = A[i]
                 r11 = r0 * 4
                 r0 = r2 + r11
                 if (r0 = r1) goto L2      if (A[i] == b) return i
                 r1 = ARP + 4              return value = i
                 *r1 = r0
                 r2 = r0 + 1               i = i + 1
                 *r0 = r2
         L3:     r1 = ARP + 4              return value = -1
                 *r1 = -1
    34:  L4:     return