Assembly Language Programming Debugging programs November 18, 2017
Debugging programs
Various tools are used while developing system programs and investigating their behavior. Some utilities are simple, like strace, which lets us passively follow a program's actions. Debuggers are more complicated but offer many more possibilities: for example, we can stop the program and analyze the current contents of memory.
Tracing programs
In Linux we trace system calls with the strace program (other Unixes have different but similar tools, e.g. truss). Some of the information presented by strace is also available through the /proc subsystem. Taking the canonical C program

    #include <stdio.h>

    int main (int argc, char **argv)
    {
      printf("Witaj świecie\n");
    }

we will try to observe it using strace.
Tracing programs

    katastrofa:~/akpn$ gcc -static -o hello hello.c
    katastrofa:~/akpn$ strace ./hello
    execve("./hello", ["./hello"], [/* 54 vars */]) = 0
    fcntl64(0, F_GETFD) = 0
    fcntl64(1, F_GETFD) = 0
    fcntl64(2, F_GETFD) = 0
    geteuid32() = 1000
    getuid32() = 1000
    getegid32() = 100
    getgid32() = 100
    brk(0) = 0x80ad8f4
    brk(0x80ae8f4) = 0x80ae8f4
    brk(0x80af000) = 0x80af000
    fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 3), ...}) = 0
    old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, ...
    write(1, "Witaj \266wiecie\n", 14Witaj świecie
    ) = 14
    munmap(0x40000000, 4096) = 0
    exit_group(14) = ?
    katastrofa:~/akpn$
Tracing programs
Attention: the -static option makes the trace shorter, but it will probably fail on most systems unless the static libraries have been installed. strace shows every executed system call together with its arguments and return value.
Tracing programs
Let us look at the assembler version:

    section .text
        global _start               ;for linker (ld)

    msg db  'Witaj świecie!',0xa    ;message to display
    len equ $ - msg                 ;its length

    _start:                         ;entry point
        mov edx,len                 ;length
        mov ecx,msg                 ;message address
        mov ebx,1                   ;stdout file descriptor
        mov eax,4                   ;write system call
        int 0x80
        mov eax,1                   ;exit system call
        int 0x80
Tracing programs
Now the trace is much shorter:

    katastrofa:~/akpn$ nasm -f elf hello.asm
    katastrofa:~/akpn$ ld -o hello hello.o
    katastrofa:~/akpn$ strace ./hello
    execve("./hello", ["./hello"], [/* 54 vars */]) = 0
    write(1, "Witaj świecie!\n", 15Witaj świecie!
    ) = 15
    _exit(1) = ?
    katastrofa:~/akpn$
Tracing programs
For dessert, the chef proposes the following dish:

    katastrofa:~/akpn$ strace strace ./hello
Using a debugger
A debugger is a tool that lets us stop an executing program and examine or change its state (among other things). A typical debugger permits us to:
- specify the places (called breakpoints) where the program should stop and give control back to the debugger;
- execute the program step by step, that is, stop it after each single executed instruction;
- examine and change the values of variables of a stopped program (and also change instructions), and then continue its execution.
Additionally, some debuggers let us specify conditions that stop the execution when they are satisfied (breakpoints described by a logical formula rather than by a particular address). But do not count on having 200 of them.
Using the debugger
Any modifications made to the program by the debugger are applied directly to the program image in memory and do not change the binary file on disk (unless you use the save directive). NASM cooperated well with the debugger ald, developed mainly to find errors in system programs. Unfortunately ald has not been actively developed for a long time (since 2004), so there is nothing to do but learn gdb, which at least has already learned NASM syntax (actually Intel syntax). Several variants of gdb with a windowing interface are around: try them and use one if you like it.
Using the debugger
Let us come back to our file hello.asm. We assemble and link it with only one additional option, and we are ready to observe and modify our program.

    $ nasm -g -f elf hello.asm
    $ ld -o hello hello.o
    ...

First we have to start the debugger:

    $ gdb hello
    ...
    (gdb)

The debugger waits for our commands, signaling its readiness with the prompt (gdb).
Using the debugger
We will start by looking at the information about our file. Information about the sections of our program is available from the command info files:

    (gdb) info files
    Symbols from "/home/zbyszek/cb/examples/bin/d4ascii".
    Local exec file:
      /home/zbyszek/cb/examples/bin/d4ascii, file type elf64-x86-64.
      Entry point: 0x4009d0
      0x0000000000400238-0x0000000000400254 is .interp
      ...
      0x00000000004006f8-0x0000000000400890 is .rela.plt
      0x0000000000400890-0x00000000004008aa is .init
      0x00000000004008b0-0x00000000004009d0 is .plt
      0x00000000004009d0-0x0000000000400d24 is .text
      0x0000000000400d24-0x0000000000400d2d is .fini
      0x0000000000400d30-0x0000000000400d77 is .rodata
      ...
      0x0000000000601e08-0x0000000000601ff8 is .dynamic
      0x0000000000601ff8-0x0000000000602000 is .got
      0x0000000000602000-0x00000000006020a0 is .got.plt
      0x00000000006020a0-0x0000000000602120 is .data
      0x0000000000602120-0x00000000006026b0 is .bss
Using the debugger
To examine a region of memory use the command x:

    (gdb) x/20cb 0x6020c0
    0x6020c0 <inventory_fields>:     64 '@'  13 '\r'  64 '@'   0 '\000'  0 '\000'  0 ...
    0x6020c8 <inventory_fields+8>:   67 'C'  0 '\000' 4 '\004' 0 '\000'  0 '\0...
    0x6020d0 <inventory_fields+16>:  69 'E'  13 '\r'  64 '@'   0 '\000'

The argument is the address in memory at which the display starts. The control parameters given after the slash are the number of units, the format and the unit size, in this case 20 units in character format, one byte each.
Using the debugger
To set a variable in the program we can use the set command, for example

    set variable licznik = 3

Alternatively, an assignment expression can be given as an argument to other commands, for example

    print licznik=3

To store a value at an arbitrary place in memory, use a cast:

    (gdb) set {int}0x8049094 = 3
    (gdb) print {int}0x8049094
    $1 = 3
Using the debugger
At any time we can leave the debugger with the command quit (many commands may be shortened as long as the abbreviation is unique; in this case it is enough to type q):

    (gdb) q
    $ _

My favorite and most used command is help. I really recommend it (the gdb manual has a few hundred pages!).
Using the debugger
There is a command to look at the contents of the registers:

    (gdb) info registers
    eax            0x22c8e0     2279648
    ecx            0x226070     2252912
    edx            0x21bd10     2211088
    ebx            0x22bfc4     2277316
    esp            0xbffff2a0   0xbffff2a0
    ebp            0x0          0x0
    esi            0xbffff2ac   -1073745236
    edi            0x80481c0    134513088
    eip            0x80481c4    0x80481c4 <_start+4>
    eflags         0x200282     [ SF IF ID ]
    cs             0x73         115
    ss             0x7b         123
    ds             0x7b         123
    es             0x7b         123
    fs             0x0          0
    gs             0x33         51
Using the debugger
The info command has many more useful arguments. The contents of a register can be changed with the command

    set $eax=1

Using the command p (print)

    (gdb) print $xmm0

we can look more precisely at the contents of a single register (especially if it contains packed values). The command x (from examine) has parameters that control the display format (and applies not only to registers but also to memory cells, a.k.a. variables). The command where displays the backtrace, that is, the stack of function calls.
Programming
The main purpose of using a debugger (for us) is testing small programs in machine language. Some debuggers have an assembly command, which allows entering short programs, for example to test unclear points of the processor documentation. In gdb this is not possible, unless you enter the instructions in binary, save the memory image to a file and reload the program. The command list shows the code of (a part of) the program: its parameter should be the number of the middle line of the displayed sequence of instructions.
Programming
Using set disassembly-flavor intel we obtain disassembled code in a syntax close to that of NASM.

    (gdb) list 8
    3       section .data
    4       output:
    5           db 'The processor Vendor ID is xxxxxxxxxxxx \n'
    6       output_len equ $-output
    7
    8       section .text
    9       global _start
    10      _start:
    11          mov eax,0
    12          cpuid
Starting execution
The following commands are used to start programs. The command

    (gdb) run

executes the program from the beginning. The command

    (gdb) continue

continues the execution of a stopped program, beginning from the current instruction (according to the current contents of the EIP register).
Breakpoints
The program runs to the end unless breakpoints have been defined. Breakpoints mark (the addresses of) instructions at which the program should stop and pass control back to the debugger. They are set with the command break, giving e.g. the line number of the instruction in the code (the numbers are displayed by the list command):

    (gdb) break 25
Stepping mode
There are additional commands to run the program in step mode. The simplest form of the command step executes a single instruction.

    (gdb) step
    26          mov ebx, 0x8048231

The optional argument lets us specify the number of instructions to be executed.

    (gdb) step 5
Stepping mode
Warning: if the current instruction is int or call, then the command step makes the debugger enter the interrupt handler or library procedure (which is sometimes loooooong). The command next lets us avoid stepping inside procedures by executing their whole bodies at once.

    (gdb) next 5
    27          test eax, eax

Note: if a procedure has been compiled without the debugging option, then the two commands are usually equivalent (both behave like the second one).
Additional information
Breakpoints are usually implemented by putting the following single-byte instruction into the appropriate location in the program:

    int 3

The debugger preserves the previous contents of that location and restores them at every stop (not necessarily a stop at this breakpoint), so do not count on seeing breakpoint instructions in memory unless you are doing something nasty. It is sometimes useful to put this instruction at the end of a tested program (to catch premature explosions). Step tracing is usually implemented by setting the TF flag in the processor status word.