Intro x86 Part 3: Linux Tools & Analysis

Similar documents
Binghamton University. CS-220 Spring X86 Debug. Computer Systems Section 3.11

Program Exploitation Intro

Link 2. Object Files

Buffer Overflow Attack

CMSC Lecture 03. UMBC, CMSC313, Richard Chang

Buffer-Overflow Attacks on the Stack

Buffer-Overflow Attacks on the Stack

Link 2. Object Files

Lab 3. The Art of Assembly Language (II)

CNIT 127: Exploit Development. Ch 2: Stack Overflows in Linux

Mitchell Adair January, 2014

CSC 405 Computer Security Reverse Engineering Part 1

Assembly Programmer s View Lecture 4A Machine-Level Programming I: Introduction

Instruction Set Architectures

Using the GNU Debugger

Using the GNU Debugger

Machine Programming 1: Introduction

buffer overflow exploitation

U Reverse Engineering

Turning C into Object Code Code in files p1.c p2.c Compile with command: gcc -O p1.c p2.c -o p Use optimizations (-O) Put resulting binary in file p

Instruction Set Architectures

CMSC 313 Lecture 12. Project 3 Questions. How C functions pass parameters. UMBC, CMSC313, Richard Chang

CS 161 Computer Security. Week of January 22, 2018: GDB and x86 assembly

x86 assembly CS449 Fall 2017

Simple C Program. Assembly Ouput. Using GCC to produce Assembly. Assembly produced by GCC is easy to recognize:

1. A student is testing an implementation of a C function; when compiled with gcc, the following x86-32 assembly code is produced:

CMSC 313 Lecture 08 Project 2 Questions Recap Indexed Addressing Examples Some i386 string instructions A Bigger Example: Escape Sequence Project

18-600: Recitation #3

CS 270 Systems Programming. Debugging Tools. CS 270: Systems Programming. Instructor: Raphael Finkel

ANITA S SUPER AWESOME RECITATION SLIDES

CSCE 212H, Spring 2008 Lab Assignment 3: Assembly Language Assigned: Feb. 7, Due: Feb. 14, 11:59PM

CS412/CS413. Introduction to Compilers Tim Teitelbaum. Lecture 21: Generating Pentium Code 10 March 08

Binghamton University. CS-220 Spring X86 Debug. Computer Systems Section 3.11

CSC 591 Systems Attacks and Defenses Reverse Engineering Part 1

Lab 10: Introduction to x86 Assembly

Machine Language, Assemblers and Linkers"

Machine-level Representation of Programs. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Lecture 2 Assembly Language

A short session with gdb verifies a few facts; the student has made notes of some observations:

CMSC 313 Lecture 12 [draft] How C functions pass parameters

GDB Tutorial. Young W. Lim Tue. Young W. Lim GDB Tutorial Tue 1 / 32

MACHINE-LEVEL PROGRAMMING I: BASICS COMPUTER ARCHITECTURE AND ORGANIZATION

BUFFER OVERFLOW DEFENSES & COUNTERMEASURES

CSC 591 Systems Attacks and Defenses Return-into-libc & ROP

Introduction to Intel x86-64 Assembly, Architecture, Applications, & Alliteration. Xeno Kovah 2014 xkovah at gmail

The geometry of innocent flesh on the bone: Return-into-libc without function calls (on the x86) Hovav Shacham presented by: Fabian Fäßler

CPS104 Recitation: Assembly Programming

CSE 351 Section 4 GDB and x86-64 Assembly Hi there! Welcome back to section, we re happy that you re here

Homework. In-line Assembly Code Machine Language Program Efficiency Tricks Reading PAL, pp 3-6, Practice Exam 1

Practical Malware Analysis

Systems I. Machine-Level Programming I: Introduction

Assembly Language Programming Debugging programs

Machine-Level Programming I: Introduction Jan. 30, 2001

Link 4. Relocation. Young W. Lim Wed. Young W. Lim Link 4. Relocation Wed 1 / 22

238P: Operating Systems. Lecture 7: Basic Architecture of a Program. Anton Burtsev January, 2018

Function Call Convention

x86 assembly CS449 Spring 2016

GDB Tutorial. Young W. Lim Fri. Young W. Lim GDB Tutorial Fri 1 / 24

CS , Fall 2004 Exam 1

238P: Operating Systems. Lecture 4: Linking and Loading (Basic architecture of a program) Anton Burtsev October, 2018

THEORY OF COMPILATION

GDB Tutorial. Young W. Lim Thr. Young W. Lim GDB Tutorial Thr 1 / 24

Stack -- Memory which holds register contents. Will keep the EIP of the next address after the call

Link 4. Relocation. Young W. Lim Thr. Young W. Lim Link 4. Relocation Thr 1 / 26

Islamic University Gaza Engineering Faculty Department of Computer Engineering ECOM 2125: Assembly Language LAB. Lab # 7. Procedures and the Stack

CS 31: Intro to Systems ISAs and Assembly. Martin Gagné Swarthmore College February 7, 2017

CMSC 313 COMPUTER ORGANIZATION & ASSEMBLY LANGUAGE PROGRAMMING PREVIEW SLIDES 16, SPRING 2013

The X86 Assembly Language Instruction Nop Means

Labeling Library Functions in Stripped Binaries

Control flow. Condition codes Conditional and unconditional jumps Loops Switch statements

War Industries Presents: An Introduction to Programming for Hackers Part V - Functions. By Lovepump, Visit:

Exercise Session 6 Computer Architecture and Systems Programming

+ Machine Level Programming: x86-64 History

CISC 360. Machine-Level Programming I: Introduction Sept. 18, 2008

mp2 Warmup Instructions (Updated 1/25/2016 by Ron Cheung for using VMs)

Introduction Presentation A

CS , Fall 2002 Exam 1

CSC 405 Computer Security Reverse Engineering

CS 31: Intro to Systems Functions and the Stack. Martin Gagne Swarthmore College February 23, 2016

CS 2505 Computer Organization I Test 2. Do not start the test until instructed to do so! printed

Intro to x86 Binaries. From ASM to exploit

CSE2421 FINAL EXAM SPRING Name KEY. Instructions: Signature

Compila(on, Disassembly, and Profiling

Source level debugging. October 18, 2016

MACHINE-LEVEL PROGRAMMING I: BASICS

Winter Compiler Construction T11 Activation records + Introduction to x86 assembly. Today. Tips for PA4. Today:

Buffer Overflow. An Introduction

Procedure Calls. Young W. Lim Sat. Young W. Lim Procedure Calls Sat 1 / 27

Project 1 Notes and Demo

The IA-32 Stack and Function Calls. CS4379/5375 Software Reverse Engineering Dr. Jaime C. Acosta

W4118: PC Hardware and x86. Junfeng Yang

143A: Principles of Operating Systems. Lecture 4: Linking and Loading (Basic architecture of a program) Anton Burtsev October, 2018

int32_t Buffer[BUFFSZ] = {-1, -1, -1, 1, -1, 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, -1, -1, -1, -1, -1}; int32_t* A = &Buffer[5];

Computer Systems Architecture I. CSE 560M Lecture 3 Prof. Patrick Crowley

Machine-Level Programming Introduction

UMBC. A register, an immediate or a memory address holding the values on. Stores a symbolic name for the memory location that it represents.

Carnegie Mellon. 5 th Lecture, Jan. 31, Instructors: Todd C. Mowry & Anthony Rowe

Binghamton University. CS-220 Spring Loading Code. Computer Systems Chapter 7.5, 7.8, 7.9

CS/COE 0449 term 2174 Lab 5: gdb

CS165 Computer Security. Understanding low-level program execution Oct 1 st, 2015

Transcription:

Intro x86 Part 3: Linux Tools & Analysis Xeno Kovah 2009/2010 xkovah at gmail Approved for Public Release: 10-3348. Distribution Unlimited

All materials is licensed under a Creative Commons Share Alike license. http://creativecommons.org/licenses/by-sa/3.0/ 2

Intel vs. AT&T Syntax Intel: Destination <- Source(s) Windows. Think algebra or C: y = 2x + 1; mov ebp, esp add esp, 0x14 ; (esp = esp + 0x14) AT&T: Source(s) -> Destination *nix/gnu. Think elementary school: 1 + 2 = 3 mov %esp, %ebp add $0x14,%esp So registers get a % prefix and immediates get a $ My classes will use Intel syntax except in this section But it s important to know both, so you can read documents in either format. 3

Intel vs AT&T Syntax 2 In my opinion the hardest-to-read difference is for r/m32 values For intel it s expressed as [base + index*scale + disp]! For AT&T it s expressed as! disp(base, index, scale)! Examples: call DWORD PTR [ebx+esi*4-0xe8] call *-0xe8(%ebx,%esi,4) mov eax, DWORD PTR [ebp+0x8] mov 0x8(%ebp), %eax lea eax, [ebx-0xe8] lea -0xe8(%ebx), %eax 4

Intel vs AT&T Syntax 3 For instructions which can operate on different sizes, the mnemonic will have an indicator of the size. movb - operates on bytes mov/movw - operates on word (2 bytes) movl - operates on long (dword) (4 bytes) Intel does indicate size with things like mov dword ptr [eax], but it s just not in the actual mnemonic of the instruction 5

gcc - GNU project C and C++ compiler Available for many *nix systems (Linux/BSD/OSX/Solaris) Supports many other architectures besides x86 Some C/C++ options, some architecture-specific options Main option we care about is building debug symbols. Use -ggdb command line argument. Basically all of the VisualStudio options in the project properties page are just fancy wrappers around giving their compiler command line arguments. The equivalent on *nix is for to developers create makefile s which are a configuration or configurations which describes which options will be used for compilation, how files will be linked together, etc. We won t get that complicated in this class, so we can just specify command line arguments manually. Book p. 53 6

gcc basic usage gcc -o <output filename> <input file name> gcc -o hello hello.c If -o and output filename are unspecified, default output filename is a.out (for legacy reasons) So we will be using: gcc -ggdb -o <filename> <filename>.c gcc -ggdb -o Example1 Example1.c 7

objdump - display information from object files Where object file can be an intermediate file created during compilation but before linking, or a fully linked executable For our purposes means any ELF file - the executable format standard for Linux The main thing we care about is -d to disassemble a file. Can override the output syntax with -M intel Good for getting an alternative perspective on what an instruction is doing, while learning AT&T syntax Book p. 63 8

objdump -d hello hello: file format elf32-i386 Disassembly of section.init: 08048274 <_init>: 8048274: 55 push %ebp 8048275: 89 e5 mov %esp,%ebp 8048277: 53 push %ebx 8048278: 83 ec 04 sub $0x4,%esp 804827b: e8 00 00 00 00 call 8048280 <_init+0xc> 08048374 <main>: 8048374: 8d 4c 24 04 lea 0x4(%esp),%ecx 8048378: 83 e4 f0 and $0xfffffff0,%esp 804837b: ff 71 fc pushl -0x4(%ecx) 804837e: 55 push %ebp 804837f: 89 e5 mov %esp,%ebp 8048381: 51 push %ecx 9

objdump -d -M intel hello hello: file format elf32-i386 Disassembly of section.init: 08048274 <_init>: 8048274: 55 push ebp 8048275: 89 e5 mov ebp,esp 8048277: 53 push ebx 8048278: 83 ec 04 sub esp,0x4 804827b: e8 00 00 00 00 call 8048280 <_init+0xc> 08048374 <main>: 8048374: 8d 4c 24 04 lea ecx,[esp+0x4] 8048378: 83 e4 f0 and esp,0xfffffff0 804837b: ff 71 fc push DWORD PTR [ecx-0x4] 804837e: 55 push ebp 804837f: 89 e5 mov ebp,esp 8048381: 51 push ecx 10

hexdump & xxd Sometimes useful to look at a hexdump to see opcodes/operands or raw file format info hexdump, hd - ASCII, decimal, hexadecimal, octal dump hexdump -C for canonical hex & ASCII view Use for a quick peek at the hex xxd - make a hexdump or do the reverse Use as a quick and dirty hex editor xxd hello > hello.dump Edit hello.dump xxd -r hello.dump > hello 11

GDB - the GNU debugger A command line debugger - quite a bit less userfriendly for beginners. There are wrappers such as ddd but I tried them back when I was learning asm and didn t find them to be helpful. YMMV Syntax for starting a program in GDB in this class: gdb <program name> -x <command file> gdb Example1 -x mycmds Book p. 57 12

About GDB -x <command file> Somewhat more memorable long form is --command=<command file> <command file> is a plaintext file with a list of commands that GDB should execute upon starting up. Sort of like scripting the debugger. Absolutely essential to making GDB reasonable to work with for extended periods of time (I used GDB for many years copying and pasting my command list every time I started GDB, so I was super ultra happy when I found this option) 13

GDB commands help - internal navigation of available commands run or r - run the program r <argv> - run the program passing the arguments in <argv> I.e. for Example 2 r 1 2 would be what we used in windows 14

GDB commands 2 help display display prints out a statement every time the debugger stops display/fmt EXP FMT can be a combination of the following: i - display as asm instruction x or d - display as hex or decimal b or h or w - display as byte, halfword (2 bytes), word (4 bytes - as opposed to intel calling that a double word. Confusing!) s - character string (will just keep reading till it hits a null character) <number> - display <number> worth of things (instructions, bytes, words, strings, etc) info display to see all outstanding display statements and their numbers undisplay <num> to remove a display statement by number 15

GDB commands 3 x/fmt EXP - x for Examine memory at expression Always assumes the given value is a memory address, and it dereferences it to look at the value at that memory address print/fmt EXP - print the value of an expression Doesn t try to dereference memory Both commands take the same type of format specifier as display Example: (gdb) x/x $ebp 0xbffbcb78: 0xbffbcbe8 (gdb) print/x $ebp $2 = 0xbffbcb78 (gdb) x/x $eax 0x1: Cannot access memory at address 0x1 (gdb) print/x $eax $3 = 0x1 16

GDB commands 4 For all breakpoint-related commands see help breakpoints break or b - set a breakpoint With debugging symbols you can do things like b main. Without them you can do things like b *<address> to break at a given memory address. Note: gdb s interpretation of where a function begins may exclude the function prolog like push ebp info breakpoints or info b - show currently set breakpoints "delete <num> - deletes breakpoint number <num>, where <num> came from "info breakpoints" 17

GDB 7 commands New for GDB 7, released Sept 2009 Thanks to Dave Keppler for notifying me of the availability of these new commands reverse-step ('rs') -- Step program backward until it reaches the beginning of a previous source line reverse-stepi -- Step backward exactly one instruction reverse-continue ('rc') -- Continue program being debugged but run it in reverse reverse-finish -- Execute backward until just before the selected stack frame is called 18

GDB 7 commands 2 reverse-next ('rn') -- Step program backward, proceeding through subroutine calls. reverse-nexti ('rni') -- Step backward one instruction, but proceed through called subroutines. set exec-direction (forward/reverse) -- Set direction of execution. All subsequent execution commands (continue, step, until etc.) will run the program being debugged in the selected direction. The "disassemble" command now supports an optional / m modifier to print mixed source+assembly. "disassemble" command with a /r modifier, print the raw instructions in hex as well as in symbolic form. See help disassemble for full syntax 19

initial GDB commands file display/10i $eip display/x $eax display/x $ebx display/x $ecx display/x $edx display/x $edi display/x $esi display/x $ebp display/32xw $esp break main 20

(gdb) r Starting program: /home/user/hello Example run with here commands if source is Breakpoint 1, main () at hello.c:4 4 printf("hello\n"); available 9: x/32xw $esp 0xbf9a2550: 0xb7f46db0 0xbf9a2570 0xbf9a25c8 0xb7df2450 file 0xbf9a2560: 0xb7f53ce0 0x080483b0 0xbf9a25c8 0xb7df2450 <snip> 8: /x $ebp = 0xbf9a2558 7: /x $esi = 0xb7f53ce0 6: /x $edi = 0x0 5: /x $edx = 0xbf9a2590 4: /x $ecx = 0xbf9a2570 3: /x $ebx = 0xb7f26ff4 2: /x $eax = 0x1 1: x/10i $eip 0x8048385 <main+17>: movl $0x8048460,(%esp) 0x804838c <main+24>: call 0x80482d4 <puts@plt> 0x8048391 <main+29>: mov $0x1234,%eax 0x8048396 <main+34>: add $0x4,%esp 0x8048399 <main+37>: pop %ecx 0x804839a <main+38>: pop %ebp 0x804839b <main+39>: lea -0x4(%ecx),%esp 0x804839e <main+42>: ret 0x804839f: nop 0x80483a0 < libc_csu_fini>: push %ebp Source code line printed 21

Stepping stepi or si - steps one asm instruction at a time Will always step into subroutines step or s - steps one source line at a time (if no source is available, works like stepi) until or u - steps until the next source line, not stepping into subroutines If no source available, this will work like a stepi that will step over subroutines 22

GDB misc commands set disassembly-flavor intel - use intel syntax rather than AT&T Again, not using now, just good to know continue or c - run until you hit another breakpoint or the program ends backtrace or bt - print a trace of the call stack, showing all the functions which were called before the current function 23

Lab time: Running examples with GDB 24