RECURSIVE FUNCTIONS ON STACK

Debugging with Visual Studio & GDB OCTOBER 31, 2015 BY CSC 342 FALL 2015 Prof. IZIDOR GERTNER

Table of contents

1. Objective
2. Overview
3. Microsoft's Visual Studio Debugger
   3.1. Iterative Method
   3.2. Recursive Method
4. GDB Debugger for Linux
   4.1. Iterative Method
   4.2. Recursive Method
5. Analysis
6. Conclusion
7. Appendix

1. Objective

The objective of this laboratory is to understand how the stack works with recursive functions. We do this by debugging a function in two different environments: Visual Studio and GDB. We consider a simple function that calculates the factorial of a number. Comparing its recursive form with a regular iterative form is a useful way to see how the stack behaves for recursive functions. The results of this laboratory are summarized in a plot of the running time of each method versus the number whose factorial is calculated.

2. Overview

Recursive functions are routines that call themselves. We use this type of function to solve a large problem by combining the solutions of smaller sub-problems. Since such a routine works by calling itself, it must eventually reach a sub-problem that it can solve without calling itself. This case is called the Base Case of the routine. In our case the base case is reached when the number whose factorial we are calculating reaches 0, for which the function returns 1.

The function used to achieve the goal of this laboratory is the factorial function. The factorial of a number is denoted by the symbol !, i.e. the exclamation mark. Whenever we see a notation like x! we understand that we are talking about the factorial of the number x. The factorial is the product of all the integers between x and 1. We do not include zero as a factor, because a multiplication by zero would result in zero no matter how large x is; by convention, 0! = 1. The factorial is not defined for negative integers, so we only consider non-negative values of x.

We use a simple property of the factorial function to put it into a recursive form. Since the factorial of x is the product of the integers between x and 1, we can write:

$x! = x \cdot (x - 1) \cdot (x - 2) \cdots 1$

It follows directly from this definition that x! can be rewritten as:

$x! = x \cdot (x - 1)!$ and $(x - 1)! = (x - 1) \cdot (x - 2)!$, and so on.

Hence, we can use a recursive routine to solve sub-problems until we reach the base case. Recursive functions use the call stack.
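As a worked instance of this recursive form, the expansion below unrolls the case x = 5 used throughout the laboratory. It assumes the base case 0! = 1, which is the one tested by recfactorial in the Appendix.

```latex
\begin{align*}
5! &= 5 \cdot 4! \\
   &= 5 \cdot 4 \cdot 3! \\
   &= 5 \cdot 4 \cdot 3 \cdot 2! \\
   &= 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1! \\
   &= 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1 \cdot 0! \\
   &= 120, \qquad \text{since } 0! = 1 .
\end{align*}
```

Each line corresponds to one pending call that is waiting for the result of the next one.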

A call stack is composed of stack frames, and a new frame is created each time a routine is called, including each time the recursive function calls itself. The stack frame stores the variables for one invocation of a routine. For the recursive factorial function we expect a new stack frame for each call to the function. For very large numbers we expect a very large number of stack frames to be created, which means using a lot of memory. The following picture illustrates this:

Figure 1 Stack Frames for the Recursive Factorial Function
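The claim that every invocation gets its own frame can also be checked without a debugger. The following is a minimal sketch, not part of the original laboratory: traced_factorial is a hypothetical variant of recfactorial that records the address of its own copy of n on every call. It assumes the compiler keeps the recursion as real calls, which is the case in an unoptimized debug build.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not in the original lab): computes n! recursively and
// records where each invocation's own copy of n is stored.
int traced_factorial(int n, std::vector<std::uintptr_t>& frame_addresses) {
    assert(n >= 0);  // the factorial is only defined for non-negative n
    // Every invocation has a separate parameter n; its address marks the frame.
    frame_addresses.push_back(reinterpret_cast<std::uintptr_t>(&n));
    if (n == 0) return 1;  // base case: 0! = 1
    return n * traced_factorial(n - 1, frame_addresses);
}
```

Calling traced_factorial(5, addresses) should leave one address per invocation in the vector, and on a typical platform each address is lower than the previous one because the stack grows toward lower addresses.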

3. Microsoft's Visual Studio Debugger

3.1. Iterative Method

In this part of the project we consider the iterative method of the factorial function and debug it with the Visual Studio Debugger. Create a new project in Visual Studio and set it up as an Empty Project. Next, create two new files: the first contains the main function, and the second contains the factorial function that main calls. The following figure shows the two functions to be created:

Figure 2 Iterative Method Functions

We compile and link these two functions together to make sure that we do not have any errors in our code. Then we start debugging. The following figure shows the state of the stack just before calling the factorial function from the main function:

Figure 3 Stack Frame of main

We can see that the stack frame for the main function has been created and that the base pointer contains the address 0x0043F874. Let's continue our debugging session by entering the factorial function. We expect a new stack frame to be created for the called function, so the base pointer should point to the base of the new frame, which is a different address from that of the main function. Consider the following screenshot:

Figure 4 Iterative Factorial Function Stack

We can see that our expectations are met and the address in the base pointer has changed. We now have a new stack frame with the necessary variables stored. In the above figure, the red square contains the EBP register, which now has a new value of 0x004F790. The blue square shows the location where the variable i is stored, which currently has a value of 0x01, and the green circle shows the location where the variable j is stored, which also has the value 0x01 at this point. Our final result will be stored in j. Let's observe the stack after the loop has completed all five iterations.

Figure 5 End of Iterative Function

As we can see, the value of i at this point is 5, so the loop has finished iterating and the function will return the value of j, which is stored at memory location 0x0043F788 and is 0x78. If we read this value in little-endian byte order and convert it from hexadecimal to decimal, we get 120. After this the function returns to the address in main that follows the call to the factorial function. We have achieved our goal and have the expected result on the stack. Only one frame was created for the factorial function, and everything was done inside this frame by using the iterative method. Let's see how the stack behaves when we are dealing with a recursive function.

3.2. Recursive Method

Consider the following functions, which are used in Visual Studio to run the recursive method of the factorial function:

Figure 6 Recursive Method Functions

As we can see, the factorial function is coded differently this time. We compile each of these two files separately and afterwards link them together by clicking on the Build Solution option of Visual Studio. After we make sure that our program is correctly coded, we proceed to start a debugging session. We click on the Start with Debugging option and step over some instructions until the stack frame for the main function has been created and we are ready to jump into the recursive factorial function. The following screenshot shows the stack frame of the main function just before calling the recfactorial function:

Figure 7 Stack Frame of main Function

We now have a stack frame for our main function. The base pointer, marked by the red square in the figure, points to the address 0x001AFA84, which is the base of the main function frame. The variable n, shown inside the green square with value 0x05, is stored at location 0x001AFA7C, and the variable result, initialized to zero, is at location 0x001AFA70, inside the blue square. The purple square contains the address of the instruction to which the recursive factorial function will return after it has completed its task. We are ready to jump into the recursive factorial function. The following picture shows the stack frame for the recursive factorial function with parameter n = 5:

Figure 8 Recursive Stack Frame 1

We notice that for this frame we have a new base pointer, meaning that a new stack frame has been created. This new base pointer is indicated by the red square. The green square shows the location of the variable n, which is below the base pointer. This is what we expected, since this variable was pushed onto the stack before calling the recursive factorial function. The purple box contains the return address into main, which was also pushed onto the stack just before entering the function and therefore is below the base pointer. Let's call this Recursive Stack Frame 1, since it is the first one created after the main function stack frame. Since the value of n in this function is not zero, the function will call itself again with n = 4. Let's consider the following screenshots, which show the stack for each call:

Figure 9 Recursive Stack Frame 2

Figure 10 Recursive Stack Frame 3

Figure 11 Recursive Stack Frame 4

Figure 12 Recursive Stack Frame 5

In the above screenshots we can see that a different stack frame is created for each function call. This can be seen by checking EBP each time. In stack frame 2, the base pointer contains the address 0x001AF8BC. In the next frame, where the parameter n is 3, the base pointer contains the address 0x001AF7E4. Stack frame 4 starts at address 0x001AF70C, and finally the last stack frame, where n is 1, has the base pointer pointing to address 0x001AF634. These different base pointer addresses show that each call to the function creates a new stack frame. This can be very costly when we want to find the factorial of a large number, so we must be careful about when to use the recursive method.

The next call to the recursive factorial function takes the parameter n equal to zero, so it stops at the comparison instruction. At this point, the number 1 is returned to the caller, which multiplies this number by its current value of n. Each stack frame returns its value to the frame that called it, starting from the innermost frame, until the final value is returned to the main function, which assigns it to the variable result. The following diagram shows how the stack frames work with each other:

Graph 1 Stack Frames of the Recursive Factorial

After these operations are completed and all the stack frames that are no longer needed have been released, we return to the main function to assign the value of 120 to the variable result. Let's see the main stack frame after the return:

Figure 13 Stack Frame of Main Function

As we can see in the red square, the base pointer is again pointing to the starting address, 0x001AFA84. At location 0x001AFA7C, shown by the green square, we still have the value of n, which is 5. Finally, we can see that the result we were expecting is now set to 0x78 at location 0x001AFA70, which is marked by the blue square. Reading this value in little-endian byte order and converting it from hexadecimal gives the decimal value 120, so we can conclude that the function has worked correctly. In terms of efficiency, five different stack frames had to be created, one for each call to the recursive routine, and this can become very costly when dealing with very large integers. We have finished studying the behavior of the stack when dealing with recursive functions in Visual Studio. Now we are going to do the same analysis on a Linux platform.
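The order in which the frames hand their values back, as described in Section 3.2, can be made visible with a small variation of the recursive function. This is only an illustrative sketch: unwinding_factorial and the returned vector are hypothetical names that do not appear in the original code.

```cpp
#include <cassert>
#include <vector>

// Hypothetical variant of recfactorial: logs the value each invocation
// returns, in the order in which the invocations finish.
int unwinding_factorial(int n, std::vector<int>& returned) {
    assert(n >= 0);  // the factorial is only defined for non-negative n
    // The recursive call must finish before this frame can compute its value.
    int value = (n == 0) ? 1 : n * unwinding_factorial(n - 1, returned);
    returned.push_back(value);  // executed just before this frame is released
    return value;
}
```

For n = 5 the innermost call (n = 0) logs its value first and the outermost call (n = 5) logs last, so the vector reads 1, 1, 2, 6, 24, 120.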

4. GDB Debugger for Linux

4.1. Iterative Method

We repeat the same experiment on another platform. This time we analyze the stack using a Linux operating system with a 64-bit architecture. For this part we use the GCC compiler and the GDB debugger. The following figure shows the stack frame of the main function as soon as we run the program:

Figure 14 Stack Frame of Main Function

This is the disassembly window of our function. The red square shows the base pointer, the stack pointer and the current frames. The current frames are shown by using the command info stack from GDB. This is a very useful command because it lets us see how many stack frames exist at this point of the program. The next thing to do is to step into the iterative factorial function, disassemble it and check the stack. The following screenshot shows the stack at the moment of the first iteration of the loop:

Figure 15 Stack Frame of Factorial Function (Part 1)

As we can see, the base pointer has changed and now points to a new address, 0x7FFFFFFFDF60, which is different from the base pointer of the main function. This means that a new frame has been created for the factorial function. Also, from the info stack command we can see that we now have two frames for our program. The frame that we are currently in is labeled #0, and the frame to which we will return after this function has completed its task is frame #1. The stack works as a LIFO (Last In First Out) data structure. This means that the last frame to be created will be the first one to return. The next screenshot shows the state of the stack after the last iteration of the loop:

Figure 16 Stack Frame of Factorial Function (Part 2)

At this point of the program we have finished iterating through the loop. The value of the variable j is the final value that we are looking for. We can check the value of j in two ways: one is to look at the contents of the memory address where j is stored, and the other is to simply print the value of j using the print command of GDB. As we can see from the red squares in the screenshot, the value of j is now 120. Notice that we still have two stack frames. This is because we have not returned to the main function yet, but since we are done with the iterations we can return. The following figure shows the state of the stack for the main function when we return from the factorial function:

Figure 17 Stack Frame of Main Function after Returning

As we can see, the base pointer points to address 0x7FFFFFFFDF80 again, and the value of the result is located on the stack at memory address 0x7FFFFFFFDF78. We can say that our function has worked correctly. We can now continue our study of the stack by using the recursive method.
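The LIFO behavior mentioned in Section 4.1 can be imitated with an explicit stack. The sketch below is an illustration only, under the assumption that one std::stack entry stands in for one pending call; explicit_stack_factorial is a hypothetical name and is not how the compiler implements calls.

```cpp
#include <cassert>
#include <stack>

// Hypothetical illustration: an explicit std::stack plays the role of the
// call stack, with one entry per pending invocation.
int explicit_stack_factorial(int n) {
    assert(n >= 0);  // the factorial is only defined for non-negative n
    std::stack<int> frames;
    // "Call" phase: push one entry per pending call, from n down to 1.
    for (int k = n; k >= 1; --k) frames.push(k);
    // "Return" phase: the entry pushed last (k == 1) is popped first.
    int result = 1;  // value produced by the base case
    while (!frames.empty()) {
        result *= frames.top();
        frames.pop();
    }
    return result;
}
```

For n = 5 the entries are pushed in the order 5, 4, 3, 2, 1 and popped in the order 1, 2, 3, 4, 5, which mirrors the way the last frame created is the first one to return.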

4.2. Recursive Method

For this method, we use the same function that we used in Visual Studio. The following screenshot shows the stack of the main function as soon as we enter the program:

Figure 18 Stack Frame of Main Function

In this case the base pointer points to address 0x7FFFFFFFDF40, and from the info stack command we can see that only one stack frame has been created so far. Let's step into the recursive factorial function to see how the stack behaves:

Figure 19 Recursive Factorial Stack Frame 5

We can see from the figure that the base pointer now points to another address, so we know that a new stack frame has been created. We can confirm this with the info stack command, which now lists two stack frames. The innermost frame is the frame for the recfactorial function, which takes n = 5 as its argument. Let's now consider all the stack frames created for the recursive function. The following pictures show the frames created each time the recursive function calls itself:

Figure 20 Recursive Factorial Stack Frame 4

Figure 21 Recursive Factorial Stack Frame 3

Figure 22 Recursive Factorial Stack Frame 2

Figure 23 Recursive Factorial Stack Frame 1

We can see that the stack frames are different for each call of the recursive factorial. In total we have six frames at the end of the recursion. Five of these frames are created by the recursive factorial function and the last one is the main function stack frame. In each of the above pictures the stack frames are inside the red square. The last recursive call takes n = 0 as an argument, so it returns 1 to the previous stack frame, which multiplies it by its current n. The same procedure is executed as when we were working with Visual Studio. The frames are popped one by one until we get back to the main function and assign the final value to the variable result. The following two screenshots show how the frames disappear as we keep stepping through our program:

Figure 24 Returning Stack Frames (Part 1)

Figure 24 Returning Stack Frames (Part 2)

As we can see, at the last step we are left with only one stack frame, the one that we started with, i.e. the main function stack frame. At every step the current frame returns its partial result, which the calling frame then multiplies by its own value of n. As we said before, the LIFO property of the stack is preserved. Let's take one more step and check the state of the main function stack frame after we return from all the recursive calls. The following screenshot shows the result:

Figure 25 Stack Frame of Main Function after Returning

As we can see, we are back where we started, at the main function frame with the base pointer at memory address 0x7FFFFFFFDF40. The value of the variable result is stored at memory location 0x7FFFFFFFDF38 and is equal to 120. This means that the recursive factorial function has finished correctly and we have the value that we were looking for.

5. Analysis

To compare the performance of the recursive method with that of the iterative method, we use the QueryPerformanceCounter function to measure the running time of each function. The following table shows the running time of each function for different values of n, i.e. the number whose factorial we want to calculate:

n            Iterative (seconds)       Recursive (seconds)
10           0.000345603542265217      0.000372550353110154
100          0.000376827624672842      0.000418317158830919
1,000        0.000399924891111359      0.000477771233552287
10,000       0.000400352618267628      0.000753227522189416
100,000      0.000478626687864825      0
1,000,000    0.000602667563182786      0

Table 1 Iterative vs Recursive Running Time

As we can see, the recursive method shows a running time of zero for very large values of n. A zero here means that the run did not complete: there is not enough stack memory to create the necessary stack frames for 100,000 recursive calls, and the program is terminated with the error: Process is terminated due to StackOverflowException. To better visualize the performance of these two functions, we plot the running times for each value of n to see how they behave. The following graph shows the plot of the running time vs. the size of n:

Graph 1 Iterative vs Recursive Running Time (x-axis: n, from 10^1 to 10^6; y-axis: time in seconds, from 0 to 0.0008; series: Iterative, Recursive)

The measurements show that the recursive method is slower than the iterative method for every value of n at which both complete, and the gap widens as n grows. Beyond a certain value of n we cannot use the recursive method at all, because it results in a stack overflow that terminates our program immediately.
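A measurement of this kind can be sketched portably as follows. This is not the code used for Table 1: it swaps QueryPerformanceCounter for std::chrono::steady_clock from the C++ standard library, and factorial_iter, factorial_rec and average_seconds are hypothetical names. Since 13! already exceeds the range of a 32-bit int, the sketch uses 64-bit unsigned arithmetic, which wraps around for large n without undefined behavior; the timing still reflects the amount of work done.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Iterative factorial; unsigned arithmetic wraps modulo 2^64 for large n.
std::uint64_t factorial_iter(std::uint64_t n) {
    std::uint64_t j = 1;
    for (std::uint64_t i = 1; i <= n; ++i) j *= i;
    return j;
}

// Recursive factorial; needs one stack frame per pending call.
std::uint64_t factorial_rec(std::uint64_t n) {
    if (n == 0) return 1;  // base case: 0! = 1
    return n * factorial_rec(n - 1);
}

// Hypothetical timing helper: average wall-clock seconds per call of f.
template <typename F>
double average_seconds(F f, int runs) {
    assert(runs > 0);
    volatile std::uint64_t sink = 0;  // keeps the calls from being optimized away
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < runs; ++r) sink = f();
    const auto stop = std::chrono::steady_clock::now();
    (void)sink;
    return std::chrono::duration<double>(stop - start).count() / runs;
}
```

A call such as average_seconds([] { return factorial_rec(1000); }, 100) returns the mean time of one recursive run. Averaging over several runs is a design choice made here to reduce timer noise at these sub-millisecond scales. Very large n should only be passed to the iterative version, for the stack overflow reason given above.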

6. Conclusion

In this laboratory we studied the behavior of the stack when working with recursive functions and compared it with the iterative method. The function chosen for the analysis was the factorial, and we calculated the factorial of the integer 5. The iterative method created only one stack frame for the function and worked within that frame. The variables were stored in the same frame and were overwritten with each new value until the final result was reached. The recursive function, on the other hand, requires a new frame every time it is called. This method can become very expensive as the number whose factorial is calculated gets bigger. Therefore we need to be careful about when to use recursive methods, because we might be better off using a simple iterative loop.

7. Appendix

main.cpp

    int factorial(int n);

    int main() {
        int n = 5;
        int result = factorial(n);  // result holds 120 after the call
        return 0;
    }

factorial.cpp

    int factorial(int n) {
        int j = 1;
        for (int i = 1; i <= n; i++)
            j *= i;  // j accumulates the product 1 * 2 * ... * n
        return j;
    }

recfactorial.cpp

    int recfactorial(int n) {
        if (n == 0)
            return 1;  // base case: 0! = 1
        return n * recfactorial(n - 1);
    }