
This assignment refers to concepts discussed in the course notes on gdb and the book The Art of Debugging by Matloff & Salzman. The questions are definitely "hands-on" and will require some reading beyond the course notes.

Download the file HW09.tar and unpack it on a Linux system. It contains files you will need for this assignment.

You may work in pairs for this assignment. If you choose to work with a partner, make sure only one of you submits a solution and that the file lists names and PIDs for both of you.

Prepare your answers to the following questions in a single plain ASCII text file. Submit your file to the Curator System (www.cs.vt.edu/curator) under the heading HW09 by the posted deadline for this assignment. No late submissions will be accepted.

1. A student is testing an implementation of the following C function:

    /**
     * Computes and returns sum of A[0]:A[Sz-1].
     * Pre:
     *    A points to an array of dimension at least Sz
     *    A[0:Sz-1] are initialized
     * Returns:
     *    sum of A[0] through A[Sz-1]
     */
    int32_t AddEm(const int32_t* const A, uint32_t Sz);

Unfortunately, the student is not going to show you the C source code for the implementation; instead, she has written a driver and compiled it with AddEm.c with the following command:

    gcc -c -O0 -m32 -std=c99 -Wall -W -g Q1main.c AddEm.c

As you can see, the code is compiled to 32-bit instructions, with no optimizations, and with debugging information added (since the build did include the switch -g). So, the student can use gdb to analyze the execution of the function; here's the driver she wrote:

    #include <stdio.h>      // printf
    #include <stdint.h>     // int32_t, uint32_t
    #include <inttypes.h>   // PRId32

    // Declaration of the function under test, as given above.
    int32_t AddEm(const int32_t* const A, uint32_t Sz);

    #define BUFFSZ 20
    #define LISTSZ 10

    int main() {

       int32_t Buffer[BUFFSZ] = {-1, -1, -1, 1, -1,
                                  1, 2, 4, 8, 16, 32, 64, 128, 256, 512,
                                 -1, -1, -1, -1, -1};
       int32_t* A = &Buffer[5];

       int32_t Sum = AddEm(A, LISTSZ);
       printf("sum is %"PRId32"\n", Sum);

       return 0;
    }

Here's my driver code. I created a buffer with extra space around the array I passed to the function, and set known values into the extra space.

Note that AddEm() should have returned the sum 1023. But, I ran it and it returned 16843010!

A short session with gdb verifies a few facts; the student has made notes of some observations:

    [jillhokie@vmcentos65 Q1]$ gdb Q1
    ...

There is a loop in AddEm() that is supposed to compute the sum; the loop test is on line 17, so I'm setting a breakpoint there.

    (gdb) break AddEm.c:17
    Breakpoint 1 at 0x80483e3: file AddEm.c, line 17.

Let's run to the breakpoint in AddEm().

    (gdb) run
    Starting program: /home/jillhokie/2505/addem
    Breakpoint 1,

Let's check the parameters:

    (gdb) print/x A
    $1 = 0xffffd1cc
    (gdb) print/x Curr
    $2 = 0xffffd1cc
    (gdb) print *(int32_t*) Curr
    $3 = 1
    (gdb) print/x Stop
    $4 = 0xffffd1f4                                    (a)

So the array A is at address 0xffffd1cc. Curr is pointing to A[0], so that's OK. *Curr is 1, which is correct too. What about Stop?

    (gdb) print (uint32_t) Stop - (uint32_t) Curr
    $5 = 40                                            (b)

Is that right? Maybe the upper bound for the loop is wrong...

    (gdb) next
    18        Sum += *(int32_t*) Curr;
    (gdb) print Sum
    $6 = 0

OK, Sum has been initialized to 0; remember the code in line 18.

    (gdb) next
    19        Curr++;
    (gdb) print Sum
    $7 = 1

Well, Sum has been updated correctly.

    (gdb) next
    (gdb) print/x Curr
    $8 = 0xffffd1cd                                    (c)

Is that right?

    (gdb) print *(int32_t*) Curr
    $9 = 33554432

Well, that's NOT the next value in my array! Let's see that in hex... does that tell us anything?

    (gdb) print/x *(int32_t*) Curr
    $10 = 0x2000000

Let's look at the assembly code for AddEm().

    (gdb) disassem
    Dump of assembler code for function AddEm:
       0x080483c4 <+0>:     push   %ebp
       0x080483c5 <+1>:     mov    %esp,%ebp
       0x080483c7 <+3>:     sub    $0x10,%esp
       0x080483ca <+6>:     movl   $0x0,-0xc(%ebp)
       0x080483d1 <+13>:    mov    0x8(%ebp),%eax
       0x080483d4 <+16>:    mov    %eax,-0x8(%ebp)
       0x080483d7 <+19>:    mov    0xc(%ebp),%eax
       0x080483da <+22>:    shl    $0x2,%eax
       0x080483dd <+25>:    add    -0x8(%ebp),%eax
       0x080483e0 <+28>:    mov    %eax,-0x4(%ebp)
       0x080483e3 <+31>:    jmp    0x80483f1 <AddEm+45>
    => 0x080483e5 <+33>:    mov    -0x8(%ebp),%eax
       0x080483e8 <+36>:    mov    (%eax),%eax
       0x080483ea <+38>:    add    %eax,-0xc(%ebp)
       0x080483ed <+41>:    addl   $0x1,-0x8(%ebp)
       0x080483f1 <+45>:    mov    -0x8(%ebp),%eax
       0x080483f4 <+48>:    cmp    -0x4(%ebp),%eax
       0x080483f7 <+51>:    jb     0x80483e5 <AddEm+33>
       0x080483f9 <+53>:    mov    -0xc(%ebp),%eax
       0x080483fc <+56>:    leave
       0x080483fd <+57>:    ret
    End of assembler dump.

The addresses are shown relative to the beginning of the function. Remember that the function name, AddEm, becomes a label representing an address in the assembly code. The expression <+45> in the disassembly above means the instruction is at an address 45 bytes after the beginning of the function.

There's a loop test at <+45/51>, and it jumps to <+33>. Aha! Now I see what's going on... look at <+41>! Let me explain...

Let's figure out where things are in the stack frame for AddEm():

    (gdb) p/x $ebp
    $20 = 0xffffd198

That's the address where the frame for AddEm() begins.

    (gdb) p/x &Sum
    $21 = 0xffffd18c                                   (d)
    (gdb) p/x &Curr
    $22 = 0xffffd190
    (gdb) p/x &Stop
    $23 = 0xffffd194

... and the address of Sum; that's %ebp - 12
... and the address of Curr; that's %ebp - 8
... and the address of Stop; that's %ebp - 4

Just for fun, here are the details of the current stack:

    (gdb) where full
    #0  AddEm (A=0xffffd1cc, Sz=10) at AddEm.c:18
            Sum = 33554433
            Curr = 0xffffd1ce
            Stop = 0xffffd1f4
    ...
    #1  0x08048441 in main () at Q1main.c:14
            Buffer = {-1, -1, -1, 1, -1, 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, -1, -1, -1, -1, -1}
            A = 0xffffd1cc
            Sum = 13504500

OK, enough fun with the stack... now that we know where things are, we can try reconstructing the code:

       0x080483c4 <+0>:     push   %ebp                    # set up a frame
       0x080483c5 <+1>:     mov    %esp,%ebp
       0x080483c7 <+3>:     sub    $0x10,%esp
       0x080483ca <+6>:     movl   $0x0,-0xc(%ebp)         # Sum = 0
       0x080483d1 <+13>:    mov    0x8(%ebp),%eax          # eax = A
       0x080483d4 <+16>:    mov    %eax,-0x8(%ebp)         # Curr = A
       0x080483d7 <+19>:    mov    0xc(%ebp),%eax          # eax = Sz
       0x080483da <+22>:    shl    $0x2,%eax               # eax = 4 * Sz
       0x080483dd <+25>:    add    -0x8(%ebp),%eax         # eax = A + 4 * Sz
       0x080483e0 <+28>:    mov    %eax,-0x4(%ebp)         # Stop = A + 4 * Sz
       0x080483e3 <+31>:    jmp    0x80483f1 <AddEm+45>    # goto <+45>
       0x080483e5 <+33>:    mov    -0x8(%ebp),%eax         # eax = Curr
       0x080483e8 <+36>:    mov    (%eax),%eax             # eax = *Curr
       0x080483ea <+38>:    add    %eax,-0xc(%ebp)         # Sum += *Curr
       0x080483ed <+41>:    addl   $0x1,-0x8(%ebp)         # Curr++
    => 0x080483f1 <+45>:    mov    -0x8(%ebp),%eax         # eax = Curr
       0x080483f4 <+48>:    cmp    -0x4(%ebp),%eax         # compare Curr to Stop
       0x080483f7 <+51>:    jb     0x80483e5 <AddEm+33>    # repeat loop if less
       0x080483f9 <+53>:    mov    -0xc(%ebp),%eax         # return value is Sum
       0x080483fc <+56>:    leave                          # exit function
       0x080483fd <+57>:    ret

We are currently at the instruction marked =>, just after the increment of Curr. Let's step through the machine code for a bit:

    ...
    0x080483f4
    0x080483f7
    18        Sum += *(int32_t*) Curr;
    (gdb) p/x Curr
    $18 = 0xffffd1ce
    0x080483e8      18        Sum += *(int32_t*) Curr;
    0x080483ea      18        Sum += *(int32_t*) Curr;
    19        Curr++;
    (gdb) p/x Curr
    $19 = 0xffffd1cf

Again, Curr is behaving strangely...

OK, that's not very revealing, but it does show how to step through machine code. Let's restart the program and see what's really going on with Curr.

    (gdb) run
    The program being debugged has been started already.
    Start it from the beginning? (y or n) y
    Starting program: /home/jillhokie/2505/addem
    Breakpoint 3,

We can display the contents of a block of memory; let's see what A looks like:

    (gdb) x/10w A
    0xffffd1cc:     0x00000001      0x00000002      0x00000004      0x00000008
    0xffffd1dc:     0x00000010      0x00000020      0x00000040      0x00000080
    0xffffd1ec:     0x00000100      0x00000200

And check Curr and its target; that all looks OK to start with.

    (gdb) p/x Curr
    $21 = 0xffffd1cc
    0xffffd1cc:     0x00000001

Let's remove the breakpoint. I'm interested in what Curr is doing, not in the loop.

    (gdb) delete 3

And set a watchpoint on Curr... then continue execution...

    (gdb) watch Curr
    Old value = (const uint8_t *) 0xffffd1cc "\001"
    New value = (const uint8_t *) 0xffffd1cd ""

Watchpoints cause a pause when the value of the watched expression changes.

    0xffffd1cd:     0x02000000

Now, the value of *Curr is odd... compare this to the display of A above.

    Old value = (const uint8_t *) 0xffffd1cd ""
    New value = (const uint8_t *) 0xffffd1ce ""
    0xffffd1ce:     0x00020000

    Old value = (const uint8_t *) 0xffffd1ce ""
    New value = (const uint8_t *) 0xffffd1cf ""
    0xffffd1cf:     0x00000200

    Old value = (const uint8_t *) 0xffffd1cf ""
    New value = (const uint8_t *) 0xffffd1d0 "\002"
    0xffffd1d0:     0x00000002

    Old value = (const uint8_t *) 0xffffd1d0 "\002"
    New value = (const uint8_t *) 0xffffd1d1 ""
    0xffffd1d1:     0x04000000                         (e)

Now, do you see what Curr is doing?

OK, what about setting breakpoints in machine code? There are no line numbers, like in C code. But machine instructions are stored at addresses (which are shown in the disassembly). Can I set a breakpoint at an address in code?

Where should I set the breakpoint? Let's set one at the beginning of the loop body. That instruction is at address 0x080483e5. I bet that if I dereference that address, gdb will interpret that and set a breakpoint at that instruction...

    (gdb) delete 4
    (gdb) break *0x080483e5
    Breakpoint 5 at 0x80483e5: file AddEm.c, line 18.  (f)

Yes! Now I can issue a continue and the program should run until it reaches the beginning of the loop again.

OK, that's enough for now... I know what's wrong with the C code. (Actually, I've known for some time.) Do you?

Let's see if you can answer some questions about the program and the debug session... look for the highlighted labels like this in the gdb session: (y)

You may find it useful to look around in the gdb session for clues, not just near the labels.

a) [5 points] Given what you see about the loop test a few lines further down, does the value for Stop make sense? Explain.

b) [5 points] Does the difference between Stop and Curr make sense? Why?

c) [5 points] Is the value of Curr shown here what it should be? Explain.

d) [5 points] Why does this tell us that Sum is stored at the address %ebp - 12?

e) [5 points] Explain the results of these displays of *Curr, taking into account the values displayed for A. That is, given what we know about Curr and A, why do these values make sense (even though they are not what we wanted)?

f) [5 points] Now that you see how Curr is being modified (especially the assembly code instruction we saw earlier), explain what could be in the C code that would make gcc translate the update of Curr to this assembly code.

2. Another student is testing an implementation of the following C function:

    /**
     * Fills array A of dimension Sz with integer squares.
     * Pre:
     *    A points to an array of dimension Sz (or larger)
     * Post:
     *    A[k] = (k + 1)^2, for k = 0:Sz-1
     */
    void WriteSquares(int* const A, int Sz);

This time, the student only has an object file for the function implementation. In this case it appears that the implementation does what is required. However, the student suspects his testing may be missing an array-bounds error within the implementation of WriteSquares(). So, the student writes some clever code to see if his hunch about the implementation is correct:

    #include <stdlib.h>
    #define CANARY 0XDEADBEEF
    #include "WriteSquares.h"

    int main() {

       int Sz = 100;
       int* MemoryBlock = malloc(Sz * sizeof(int) + 16);
       if ( MemoryBlock == NULL ) return 1;

       *MemoryBlock = CANARY;
       *(MemoryBlock + 1) = CANARY;
       *(MemoryBlock + Sz + 2) = CANARY;
       *(MemoryBlock + Sz + 3) = CANARY;

       int* A = MemoryBlock + 2;

       // We suspect this function may contain a bug (or two),
       // and it may write outside the proper boundaries of the
       // array A of dimension Sz:
       WriteSquares(A, Sz);

       free(MemoryBlock);
       return 0;
    }

[Figure: the memory block, laid out as DEADBEEF DEADBEEF | A[0] ... A[Sz-1] | DEADBEEF DEADBEEF]

The cleverness in this code is that the student has guaranteed that there is a known value (a canary value) just before the first element of the array, and just after the last element of the array. If the implementation of WriteSquares() does violate the array bounds, we should see a change in one or both of the canary values after WriteSquares() returns.

a) [10 points] Use gdb to examine the results of the call to WriteSquares(). You should determine what memory values are set correctly, and which are set incorrectly or modified when they should not be. Hint: setting an appropriate watchpoint in gdb can yield a very fast resolution of the question.

b) [10 points] Use gdb or objdump to analyze what's wrong with the implementation of WriteSquares(). Note this is not the same as the previous question; that was concerned with effects, not causes. You must show your gdb session, or your output from objdump and reverse-engineering to support your conclusion.

3. The directory Q3 (created when you unpacked the tar file referred to above) contains three files: Q3main.c, Q3.h and Q3.o. The object file contains the compiled code for the function Q3() declared in the header file. Q3main.c contains a main() function designed to call Q3(); read the comments in Q3main.c.

Experiment a bit with the code; you will discover that running Q3main results in a runtime error, unless you get very lucky and use a parameter to Q3() that satisfies a particular constraint. You must determine what constraint the parameter to Q3() must satisfy in order to avoid the runtime error. Brute force attacks can answer the question, but will receive no credit.

There are several ways to analyze this situation, and you have a number of tools available to aid you. You can use gdb to examine the execution of the code. You can also use objdump, with the -d switch, to display the assembly code for an object or executable file.

You must state the constraint the parameter to Q3() must satisfy and show what rational analysis you performed to determine the constraint. You can justify your conclusion by showing a transcript of a gdb session and/or showing objdump output with an analysis of the x86 assembly code. (It's easy to copy text from a Linux shell window and paste it into a text editor.)

a) [5 points] Identify the exact assembly/machine instruction within Q3() at which the runtime error occurs. Either list the instruction, or give its address. Show exactly how you determined your answer.

b) [10 points] Analyze the instruction you identified in part a), and explain exactly why executing this instruction would cause a runtime error.

c) [10 points] For what parameter value(s) will Q3() not trigger a segmentation fault? Show exactly how you determined your answer(s) to this question; guessing is not a valid technique, nor is experimentation with different parameter values.

4. Repeat question 3, but with the files Q4main.c, Q4.h and Q4.o.

a) [5 points] Identify the exact assembly/machine instruction within Q4() at which the runtime error occurs. Show exactly how you determined your answer.

b) [10 points] Analyze the instruction you identified in part a), and explain exactly why executing this instruction would cause a runtime error.

c) [10 points] For what parameter value(s) will Q4() not trigger a segmentation fault? Show exactly how you determined your answer(s) to this question; guessing is not a valid technique, nor is experimentation with different parameter values.

The last two questions bear some relationship to the sorts of things you'll have to figure out when you defuse your binary bomb. So, this makes a good warmup for that assignment.